
Numerical Character Recognition In Pytesser

I am working on a project that requires me to get prices from a commodity exchange. Unfortunately the exchange has no webservice or other plugin available that allows me to get the

Solution 1:

Is there a way to limit the set of characters to only digits and a dot for decimals?

Yes! Using the package pyslibtesseract:

from pyslibtesseract import TesseractConfig, PageSegMode
# Treat the image as a single line of text
config_line = TesseractConfig(psm=PageSegMode.PSM_SINGLE_LINE)
# Restrict recognition to digits and the decimal point
config_line.add_variable('tessedit_char_whitelist', '0123456789.')
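In case stray characters still show up in the OCR output, a small post-processing step can serve as a safety net. The following is a minimal sketch, not part of pyslibtesseract: the function name `extract_prices` and the sample string are hypothetical, and it assumes prices are written with a dot as the decimal separator.

```python
import re

# Matches an integer or a decimal number such as "17" or "1234.50".
PRICE_PATTERN = re.compile(r"\d+(?:\.\d+)?")

def extract_prices(ocr_text):
    """Return every number found in the OCR output as a float."""
    return [float(match) for match in PRICE_PATTERN.findall(ocr_text)]

# Hypothetical OCR output containing labels next to the prices.
sample = "Gold 1234.50 Silver 17.2"
prices = extract_prices(sample)
```

This does not replace the whitelist; it only filters whatever text the OCR engine returns.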

And how can the quality of the conversion be improved?

Preprocess the image with OpenCV before running OCR. Steps such as converting to grayscale, upscaling, and thresholding usually help recognition.

