If you want to OCR a document image, modern versions of Tesseract can work well. If you last used it a few years ago, the recognition has improved since due to a new text recognition algorithm that uses modern (deep learning) techniques. Browser demo using a modern version: https://robertknight.github.io/tesseract-wasm/.

OCR processing typically consist of two major steps: detecting/locating words or lines of text on the page, and recognizing lines of text.

Tesseract's text recognition uses modern methods, but the text detection phase is still based on classical methods involving a lot of heuristics, and you may need to experiment with various configuration variables to get the best results. As a result it can fail to detect text if you present it with something other than a reasonably clean document image.

Doctr (https://github.com/mindee/doctr) is a new package that uses modern methods for both text detection and recognition. It is pretty new however and I expect will take more time and effort to mature.