What does HackerNews think of PaddleOCR?
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Language: Python
When I was evaluating options a few months ago I found https://github.com/PaddlePaddle/PaddleOCR to be a very strong contender for my use case (reading product labels), but you'll definitely want to put together some representative docs/images and test a bunch of solutions to see what works for you.
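One way to run that kind of comparison is a small harness that feeds the same labelled images to each candidate and scores the output by character error rate. This is only a minimal sketch: the engine callables, file names, and ground-truth strings are hypothetical placeholders, and each real engine (PaddleOCR, EasyOCR, Tesseract) would need to be wrapped in a function that takes an image path and returns text.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(predicted: str, expected: str) -> float:
    """Edit distance normalised by the length of the expected text."""
    if not expected:
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, expected) / len(expected)


def compare_engines(
    engines: Dict[str, Callable[[str], str]],
    samples: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Mean character error rate per engine over (image_path, truth) pairs."""
    scores = {}
    for name, recognise in engines.items():
        rates = [character_error_rate(recognise(path), truth)
                 for path, truth in samples]
        scores[name] = sum(rates) / len(rates)
    return scores


if __name__ == "__main__":
    # Hypothetical sample set: replace with your own images and transcriptions.
    samples = [("label_01.png", "NET WT 500g"),
               ("label_02.png", "BEST BEFORE 2025")]

    # Stand-in engines returning canned text; swap in real OCR wrappers here.
    canned_a = {"label_01.png": "NET WT 500g", "label_02.png": "BEST BEF0RE 2025"}
    canned_b = {"label_01.png": "NET W1 5OOg", "label_02.png": "BEST BEFORE 2025"}
    engines = {"engine_a": canned_a.__getitem__, "engine_b": canned_b.__getitem__}

    for name, cer in sorted(compare_engines(engines, samples).items(),
                            key=lambda item: item[1]):
        print(f"{name}: mean CER {cer:.3f}")
```

A lower score is better. Character error rate is just one reasonable choice of metric; for product labels you might care more about whether specific fields such as weights or dates come out exactly right.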
I’ve had good results from PaddleOCR.
Thanks for sharing! For context this is a demo of PaddleOCR V2 [0] which was released yesterday. You can find their original repo here [1]. We built this demo using Gradio [2] and deployed it on HuggingFace's Spaces [3].
[0]: https://arxiv.org/abs/2109.03144
[1]: https://github.com/PaddlePaddle/PaddleOCR
[2]: https://gradio.app/
[3]: https://huggingface.co/spaces
Demo is cool, but it tells us nothing about this particular OCR.
Two alternatives, which are designed for OCR from photos:
https://github.com/PaddlePaddle/PaddleOCR/
https://github.com/JaidedAI/EasyOCR/
It's worth trying them if Tesseract isn't giving you good accuracy.
I recently came across CRAFT, which appears to have come out of the ICDAR 2017 Robust Reading Challenge.
It performed better than expected. I only tested a few images so please don't take my word for it.
That led me to PaddleOCR. There is still plenty of room for improvement but I found it way more convenient to use for my purposes than messing with Tesseract.
I would recommend https://github.com/PaddlePaddle/PaddleOCR over the default Tesseract. It seems to do a better job these days and uses more modern approaches.