Yes, PDFMiner in Python https://github.com/euske/pdfminer
Apache PDFBox in Java https://pdfbox.apache.org
Previous discussion https://news.ycombinator.com/item?id=11327493
For a list of others, see http://okfnlabs.org/blog/2016/04/19/pdf-tools-extract-text-a...
According to the PDFMiner site, pdf2txt.py cannot recognize text drawn as images, since that would require optical character recognition (OCR). I'm interested in software that combines OCR with some sort of math notation rendering engine.
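One way to work around that limitation is to try the text layer first and fall back to OCR only when it comes back empty. A minimal sketch of that idea follows. The function name, the `min_chars` threshold, and the two callables are hypothetical: `extract_text` stands in for whatever wraps PDFMiner, and `run_ocr` for whatever wraps an OCR engine.

```python
from typing import Callable, Tuple


def extract_with_ocr_fallback(
    path: str,
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
    min_chars: int = 20,  # assumed threshold, tune for your documents
) -> Tuple[str, str]:
    """Return (method, text), where method is "text-layer" or "ocr"."""
    # Try the embedded text layer first; it is cheap and exact when present.
    text = extract_text(path)
    if len(text.strip()) >= min_chars:
        return "text-layer", text
    # Little or no text came back, so the page is probably a scanned image.
    return "ocr", run_ocr(path)
```

This only decides which path to take. It does nothing for math notation, which would still need a recognizer trained on formulas.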
https://www.tensorflow.org/tutorials/mnist/beginners/ (also google "tensorflow ocr")
http://yann.lecun.com/exdb/mnist/
CROHME: Competition on Recognition of Online Handwritten Mathematical Expressions http://www.isical.ac.in/~crohme/
Closed-source APIs: http://mathpix.com https://photomath.net/en/
Best off-the-shelf OCR (originally developed at HP, later sponsored by Google):
https://github.com/tesseract-ocr/tesseract
https://github.com/tesseract-ocr/tesseract/wiki
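For anyone who wants to try Tesseract from a script, here is a rough sketch that shells out to the command-line tool. It assumes the `tesseract` binary is installed and on your PATH; the helper names and the image file name are made up for illustration. Passing `stdout` as the output base makes Tesseract print the recognized text instead of writing a file.

```python
import subprocess
from typing import List


def tesseract_command(image_path: str, lang: str = "eng") -> List[str]:
    # Build the argv list: tesseract <image> <outputbase> -l <lang>.
    # "stdout" as the output base sends the recognized text to standard output.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, lang: str = "eng") -> str:
    # Runs the external tesseract binary; raises CalledProcessError on failure.
    result = subprocess.run(
        tesseract_command(image_path, lang),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "page.png" is a placeholder; point this at a real scanned page.
    print(ocr_image("page.png"))
```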
Two Clojure talks:
Machine Learning Live - Mike Anderson https://www.youtube.com/watch?v=QJ1qgCr09j8
Adventures in Understanding Documents - Scott Tuddenham https://www.youtube.com/watch?v=94NjRg8zoCA