What does HackerNews think of scantailor-advanced?

ScanTailor Advanced merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, and adds new features and fixes.

Language: C++

There's also https://scantailor.org/ (and a maintained fork at https://github.com/4lex4/scantailor-advanced ) which semi-automates unwarping and other corrective tasks in scanned books.
I use a £15 arm with a vice grip for my phone from Amazon, copy the files to my laptop and then run a bash for-loop of the tesseract CLI over the resultant files.

I use https://github.com/4lex4/scantailor-advanced to deskew the images and generate the PDF.

It isn't perfect but my purposes are more around research than publication, so, YMMV!
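
For anyone who would rather not write the for-loop in bash, here is a minimal Python sketch of the same idea. It assumes the tesseract binary is on your PATH; the `scans` directory, the `*.jpg` pattern, and the helper names `tesseract_command` and `ocr_directory` are made up for illustration and are not from the comment above.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path, lang: str = "eng") -> list[str]:
    """Build the CLI call: tesseract <image> <outputbase> -l <lang>.

    Tesseract appends the extension itself, so page01.jpg yields page01.txt.
    """
    output_base = image.with_suffix("")  # same path without the extension
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_directory(directory: Path, pattern: str = "*.jpg") -> None:
    """Run tesseract over every matching file, in sorted order."""
    for image in sorted(directory.glob(pattern)):
        # check=True stops the loop at the first page that fails
        subprocess.run(tesseract_command(image), check=True)


# Hypothetical usage, assuming the phone photos were copied into ./scans:
# ocr_directory(Path("scans"))
```

Keeping the command construction separate from the loop makes it easy to print the commands first and check them before running anything.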

This has the latest developments, but it also seems to have been unmaintained for over a year: https://github.com/4lex4/scantailor-advanced

Scan Tailor forum: https://forum.diybookscanner.org/viewforum.php?f=21

Interesting question, but I suspect that there isn't such a convincing use case for this as forged currency is for tracking color printers. Does government use paper as a security boundary, and if that's the issue, wouldn't it suffice to have the scanners in the relevant offices log everything they are scanning (which I think they already do)?

Also, almost all such watermarks would be easily destroyed by bitonalization and despeckling, which is usually done anyway to reduce the file size (e.g. it's part of the default operation of https://github.com/4lex4/scantailor-advanced ). Arguably the same is true for yellow dots if one is leaking scanned printouts, but identifying leakers is not the main purpose of the dots...

If one really wanted to, one could try embedding little holes into certain letters and hope they survive the smoothing algorithms and don't stand out too much. Not sure how successful that would be, though.
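
To illustrate why faint marks tend not to survive, here is a toy Python sketch of bitonalization followed by despeckling. It is not ScanTailor's actual algorithm: the fixed threshold of 128, the "no black 4-neighbour" rule, and the tiny 5x5 page are all assumptions chosen to keep the example short.

```python
def bitonalize(gray, threshold=128):
    """Map each grayscale pixel (0 = black, 255 = white) to pure black or white."""
    return [[0 if px < threshold else 255 for px in row] for row in gray]


def despeckle(bw):
    """Turn a black pixel white when none of its 4 neighbours is black."""
    height, width = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    for y in range(height):
        for x in range(width):
            if bw[y][x] != 0:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            has_black_neighbour = any(
                0 <= ny < height and 0 <= nx < width and bw[ny][nx] == 0
                for ny, nx in neighbours
            )
            if not has_black_neighbour:
                out[y][x] = 255
    return out


page = [
    [255, 255, 255, 255, 255],
    [255,  30,  30, 255, 255],  # 30: part of a letter stroke
    [255,  30,  30, 255, 235],  # 235: faint watermark-like mark
    [255, 255, 255, 255, 255],
    [ 40, 255, 255, 255, 255],  # 40: isolated dark dot
]
cleaned = despeckle(bitonalize(page))
```

In this toy page the faint mark is lost at the threshold step and the isolated dark dot is lost at the despeckle step, while the 2x2 letter stroke stays black.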

Scantailor will get you most of the way there. The original project is dead, but there are a few forks on GitHub. It has been a while since I did any serious scanning, so I can't remember which version I used. https://github.com/4lex4/scantailor-advanced https://github.com/trufanov-nok/scantailor-universal

There are a couple of folks who forked scantailor. I'm not sure of the status of those. Here are a couple: https://github.com/4lex4/scantailor-advanced https://github.com/trufanov-nok/scantailor-universal

I've used Scan Tailor in the past to convert an outboard motor manual to PDF; it's pretty powerful. I didn't have a proper setup, but my results still came out decently.

https://github.com/4lex4/scantailor-advanced

This doesn't completely fix your issue, but since you mentioned deskewing, I clean up my scanned documents using ScanTailor Advanced:

https://github.com/4lex4/scantailor-advanced

I find the autodeskewing algorithm works well, but it allows hand adjustment as well, which I like. As I've gotten better at using it, I've been able to get the size of my scanned documents down considerably by cleaning up the scans. This includes some old manuals.

As for the PDF encoding itself, I use both mutool, from mupdf, and qpdf. I just checked, and it looks like while they both compress their streams, they may not offer the same flexibility as Acrobat Pro. For me, I'll decompress, edit, and recompress streams on the files, and that's been fine for my use.

That said, if someone knows of a better tool for compressing streams in a PDF, I'd be interested to hear about it as well.
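
For reference, the decompress and recompress steps mentioned above can be sketched in Python roughly like this. It assumes qpdf and mutool are installed and on your PATH; the file names and the helper names `stream_command` and `run_stream_command` are hypothetical, and the options shown (`--stream-data=uncompress`/`compress` for qpdf, `clean -d`/`-z` for mutool) are only one way of doing it, not necessarily the one the commenter uses.

```python
import subprocess

# Base command for each (tool, action) pair; input and output paths are appended.
COMMANDS = {
    ("qpdf", "decompress"): ["qpdf", "--stream-data=uncompress"],
    ("qpdf", "compress"): ["qpdf", "--stream-data=compress"],
    ("mutool", "decompress"): ["mutool", "clean", "-d"],
    ("mutool", "compress"): ["mutool", "clean", "-z"],
}


def stream_command(tool, action, src, dst):
    """Return the argument list that rewrites src to dst with streams (de)compressed."""
    return COMMANDS[(tool, action)] + [src, dst]


def run_stream_command(tool, action, src, dst):
    """Run the command; check=True raises if the tool reports failure."""
    subprocess.run(stream_command(tool, action, src, dst), check=True)


# Hypothetical round trip:
# run_stream_command("qpdf", "decompress", "in.pdf", "plain.pdf")
# ... edit plain.pdf ...
# run_stream_command("qpdf", "compress", "plain.pdf", "out.pdf")
```

Both tools write a new file rather than editing in place, so the original stays untouched if an edit goes wrong.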