I’ve been looking for something exactly like this. I keep the complete text of articles I’ve enjoyed but, to date, effective searching has meant spinning up an ES instance, which is painful. This is a specific use case that isn’t necessarily well served by something like grep or ripgrep. I’ll definitely try this, thanks - it looks very elegant.
Can you say more? I'm curious.
Is it automated in some way during web browsing, do you remember to copy an article to a folder when you've enjoyed it enough, or do you use a reading app/e-reader so the articles are already downloaded?
Not OP so I can't speak for them. There are a bunch of ways to do this, ranging from turnkey solutions to collections of scripts and extensions. On the turnkey side, there are programs like ArchiveBox[1] which take links and store them as WARC files. You can import your browsing history into ArchiveBox and set up a script to do it automatically. If you'd like to set something up yourself, you can extract your browsing history (e.g., Firefox stores its history in an SQLite database) and manually wget those URLs. For a reference to the more "bootstrapped" version, I'll link to Gwern's post on their archiving setup [2]. It's fairly long, so I advise skipping to the parts you're interested in first.
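A rough sketch of the DIY route, assuming a copy of Firefox's places.sqlite (the profile path varies per machine, so the path below is just a placeholder) and wget on your PATH:

    # Pull recently visited URLs from Firefox's history DB and fetch each one.
    # Copy places.sqlite first so you don't hit the lock on the live database.
    import sqlite3, subprocess, shutil

    shutil.copy("/path/to/profile/places.sqlite", "places_copy.sqlite")
    con = sqlite3.connect("places_copy.sqlite")
    rows = con.execute(
        "SELECT url FROM moz_places WHERE visit_count > 0 "
        "ORDER BY last_visit_date DESC LIMIT 100"
    )
    for (url,) in rows:
        # -p grabs page requisites (CSS, images), -k rewrites links for local viewing
        subprocess.run(["wget", "--no-clobber", "-p", "-k", "-P", "archive/", url])
    con.close()

You'd probably want to filter the query down to the pages you actually care about rather than everything you've visited, but that's the general shape of it.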