Serbian language

When things don't work as they should.
kolenmi
Posts: 1
Joined: Sat Sep 14, 2019 10:33 am

Serbian language

Post by kolenmi » Sat Sep 14, 2019 10:45 am

Hello,
I cannot submit documents for OCR in Serbian.
I get the following message:
"Exception calling Tesseract with language option: hbs; RAN: /usr/bin/tesseract - - -l hbs
STDOUT:
STDERR: Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/hbs.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'hbs'
Tesseract couldn't load any languages!
Could not initialize tesseract.
The requested OCR language "hbs" is not available and needs to be installed."
I installed the tesseract-ocr-srp and tesseract-ocr-srp-latn packages.
There is no tesseract-ocr-hbs package.
How can I solve this problem?
I need both Latin and Cyrillic scripts for OCR in production.
Could you also include the Serbian language in Mayan EDMS?
Thank you

rosarior
Posts: 393
Joined: Tue Aug 21, 2018 3:28 am

Re: Serbian language

Post by rosarior » Tue Oct 01, 2019 6:46 pm

Hi,

As you noted, Tesseract doesn't provide an HBS (Serbo-Croatian) language pack, but it does support SRP and SRP-LATN. Install the supported ones by adding this option to your docker run command:

Code:

-e MAYAN_APT_INSTALLS="tesseract-ocr-srp tesseract-ocr-srp-latn"
and when uploading your documents, select that language instead of "HBS".
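
To confirm which language data Tesseract can actually load after installing the packages, a small check along these lines may help. It is only a sketch: it assumes the `tesseract` binary is on the PATH of wherever you run it (for example, inside the Mayan container), that `tesseract --list-langs` prints a header line ending in a colon followed by one language code per line, and that the Latin-script Serbian data is listed as `srp_latn`. The function names are made up for illustration and are not part of Mayan.

```python
import shutil
import subprocess


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ...:" header.
        if not line or line.endswith(":"):
            continue
        langs.add(line)
    return langs


def missing_languages(required, available):
    """Return the required codes that are not installed, sorted."""
    return sorted(set(required) - set(available))


def installed_languages(binary="tesseract"):
    """Ask the local Tesseract binary which languages it can load."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    result = subprocess.run(
        [binary, "--list-langs"], capture_output=True, text=True, check=True
    )
    # Some older releases print the list on stderr, so fall back to it.
    return parse_list_langs(result.stdout or result.stderr)


if __name__ == "__main__":
    # Assumed codes for Serbian Cyrillic and Serbian Latin.
    required = ["srp", "srp_latn"]
    missing = missing_languages(required, installed_languages())
    print("Missing languages:", missing if missing else "none")
```

If a required code is reported as missing, the corresponding `tesseract-ocr-*` package was probably not installed in the environment where the OCR actually runs.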

Translating Mayan into other languages is simple, but we rely on native-speaking volunteers:

1- Open an account on Transifex (https://www.transifex.com); it is free.
2- Request the creation of the Serbian language.
3- Translate the strings using the web interface; no programming knowledge is needed. We include a Google Translate API key and have enabled automatic translations to help you out.

Once the translation is more than about 15 to 20% complete, we will enable the language and include it in Mayan.

Thanks!
Attachments
2019-10-01_14-39.png
