OCR from Image?

Posts: 1
Joined: Thu Nov 15, 2018 8:25 am

Post by mhswa »


We upload a lot of PDF documents that are technical drawings with dimensions on them, and it doesn't seem that Tesseract processes them at all. See the example below. Is there any way we can get this to work? Sometimes we need to search with very little information, such as a single dimension.


Posts: 494
Joined: Tue Aug 21, 2018 3:28 am
Location: Puerto Rico

Re: OCR from Image?

Post by rosarior »

Tesseract can't recognize rotated text (https://github.com/tesseract-ocr/tesser ... -deskewing).

My recommendation would be to add a rotation transformation that brings as many of the numbers as possible upright.

Off the top of my head I can't think of any OCR engine that can recognize text with multiple rotations in the same page. I think you would need a trained neural network or a custom computer vision implementation to pull that off.
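
As a rough workaround for drawings that mix a few fixed orientations, one could OCR the same page at several rotations and pool whatever tokens come back. The sketch below is only an illustration of that idea, not something built into any product: the function name `ocr_at_rotations` is made up, the `ocr` argument is any callable that turns an image into text, and the image is assumed to offer a Pillow-style `rotate(angle, expand=True)` method. Text at angles other than the ones tried would still be missed.

```python
from typing import Callable, Iterable, Set


def ocr_at_rotations(
    image,
    ocr: Callable[[object], str],
    angles: Iterable[int] = (0, 90, 180, 270),
) -> Set[str]:
    """Run OCR on several rotated copies of a page and pool the tokens found.

    Hypothetical helper: `image` is assumed to behave like a Pillow image,
    and `ocr` is any function that returns the recognized text as a string.
    """
    tokens: Set[str] = set()
    for angle in angles:
        # expand=True asks for a canvas large enough that a rotated page
        # is not cropped (this is how Pillow's Image.rotate behaves).
        rotated = image.rotate(angle, expand=True) if angle else image
        # Split on whitespace so each dimension becomes a searchable token.
        tokens.update(ocr(rotated).split())
    return tokens
```

With Pillow and pytesseract installed, this could be called as `ocr_at_rotations(Image.open("drawing.png"), pytesseract.image_to_string)`, where `drawing.png` is a placeholder file name. Each extra angle costs one more OCR pass per page, and passes at the wrong orientation may add some junk tokens to the pool.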
