What is a good OCR package that is open source? Perhaps I'm using it in the wrong way, but my experience with Tesseract is not so great.
OCR processing typically consists of two major steps: detecting (locating) words or lines of text on the page, and then recognizing the text in each detected line.
Tesseract's text recognition uses modern methods, but its text detection phase is still based on classical methods that rely on a lot of heuristics, so you may need to experiment with various configuration variables to get the best results. As a result, it can fail to detect text if you give it anything other than a reasonably clean document image.
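One setting worth trying early is the page segmentation mode (`--psm`), which tells Tesseract what kind of layout to expect. Here is a rough sketch using the pytesseract wrapper, assuming pytesseract, Pillow and the Tesseract binary are installed. The helper names `build_tesseract_config` and `ocr_with_layout_hint` are made up for this example, and which mode works best depends on your images.

```python
def build_tesseract_config(psm: int = 3, oem: int = 1) -> str:
    """Build a Tesseract command-line config string.

    psm: page segmentation mode, e.g. 3 = fully automatic (the default),
         6 = assume a single uniform block of text, 11 = sparse text.
    oem: OCR engine mode, 1 = LSTM-based recognizer only.
    """
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    if not 0 <= oem <= 3:
        raise ValueError("oem must be between 0 and 3")
    return f"--oem {oem} --psm {psm}"


def ocr_with_layout_hint(image_path: str, psm: int = 3) -> str:
    """Run Tesseract on one image with an explicit page segmentation mode."""
    # Imported here so the config helper above works without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, config=build_tesseract_config(psm=psm)
        )
```

For example, `ocr_with_layout_hint("scan.png", psm=6)` treats the image as one block of text, where `scan.png` is a placeholder file name. Trying a few modes on a handful of representative images is usually the quickest way to see whether detection is the part that is failing.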
docTR (https://github.com/mindee/doctr) is a newer package that uses modern methods for both text detection and recognition. Because it is still fairly young, I expect it will need more time and effort to mature.
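If you want to give it a quick try, something along these lines should work. It is a sketch that assumes docTR is installed with one of its deep learning backends and that the pretrained weights can be downloaded on first use. The helper names `run_doctr` and `lines_from_export` are made up for this example, and the nested pages/blocks/lines/words layout is my understanding of what `export()` returns, so check it against the version you install.

```python
def run_doctr(image_path: str) -> dict:
    """Run docTR's detection + recognition pipeline on a single image."""
    # Imported here so the pure-Python helper below works without docTR.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    model = ocr_predictor(pretrained=True)  # default detector and recognizer
    pages = DocumentFile.from_images(image_path)
    return model(pages).export()  # nested dict: pages > blocks > lines > words


def lines_from_export(exported: dict) -> list[str]:
    """Flatten the exported result into a list of text lines."""
    lines = []
    for page in exported["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                # Each word entry carries its recognized text under "value".
                lines.append(" ".join(word["value"] for word in line["words"]))
    return lines
```

Calling `lines_from_export(run_doctr("photo.jpg"))` (with `photo.jpg` as a placeholder) gives one string per detected line, which makes it easy to compare against Tesseract's output on the same image.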