Bummer. I love and use Instapaper, gathering articles for a few weeks to read at altitude. It's a great product, and I've paid for a subscription these last few years in the hope that I could keep enjoying it.

Now it's sold to Pinterest, one of the two sites I don't bother following links to, because I know Pinterest and Quora will require me to sign in rather than show me what they showed a search engine.

What else operates in this space? Pocket, I remember. ReadItLater used to exist, maybe still? Does Pinboard do this somehow, maybe with an RSS reader? Or do I have to pay for Paperback?

Wallabag (https://www.wallabag.org/) if you want self-hosted.

Pinboard (https://pinboard.in) offers archiving for (I believe) $25 a year.

Or Pocket (https://getpocket.com/), which used to be Read-It-Later.

Thanks for the link to Wallabag, I had never heard of it. Looks very interesting.

I use Pinboard, and pay for the archiving option. I even periodically request a tarball of the archive for my own backup. Pinboard archives the entire page, not just a readable version of the content. For archival and reference purposes, I like this. It would be nice if Pinboard also provided a readable option. In fact, a number of the apps that work with Pinboard add support for readable versions.

I will look into adding Wallabag to my workflow.

I'm cool with the idea of providing a readable option in Pinboard, since I already do something similar to get the text out of the page for indexing. Any library for this you particularly like?

Readability (https://github.com/luin/readability) is a classic, and included as part of Firefox (I think; maybe that's been discontinued). It's essentially a bag of hand-written heuristics, but they're pretty good heuristics.

Some interesting reading is Christian Kohlschütter's thesis on this problem, which academia frames as "how do we assemble good text corpora from web pages for data analysis, which means removing junk (boilerplate) from our HTML crawls" (https://code.google.com/archive/p/boilerpipe/wikis/WSDM2010P...). Boilerpipe would probably be the right way to go, but if you're not using Java it could be harder to integrate.
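
To give a feel for what "a bag of heuristics" means here, below is a minimal sketch in plain Python (standard library only). It is not Readability's or Boilerpipe's actual rule set. It just splits a page into blocks and keeps the ones with enough words and a low share of linked words, two shallow features of the kind these tools rely on. The names `BlockCollector` and `extract_main_text`, the tag lists, and both thresholds are my own assumptions for illustration.

```python
from html.parser import HTMLParser

# Assumed, illustrative tag sets; real extractors use much richer rules.
BLOCK_TAGS = {"p", "div", "li", "td", "h1", "h2", "h3", "article", "section"}
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Split a page into text blocks, counting words and linked words."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # (text, word_count, linked_word_count)
        self._words = []
        self._linked = 0
        self._in_link = 0
        self._skip = 0

    def flush(self):
        # Close the current block, if it holds any text.
        if self._words:
            text = " ".join(self._words)
            self.blocks.append((text, len(self._words), self._linked))
        self._words, self._linked = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return  # ignore script and style contents
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._linked += len(words)


def extract_main_text(html, min_words=10, max_link_density=0.33):
    """Keep blocks that are long enough and not dominated by link text."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    parser.flush()
    kept = [
        text
        for text, n, linked in parser.blocks
        if n >= min_words and linked / n <= max_link_density
    ]
    return "\n\n".join(kept)
```

Navigation bars fail the link-density test and short footers fail the word-count test, so what's left is mostly article text. Real pages need a lot more care than this (nested containers, comments, captions), which is exactly why the hand-tuned libraries exist.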

Mozilla maintains a fork of Readability for Firefox's reader mode here:

https://github.com/mozilla/readability