I've been trying to add Danish to my list of languages for OCR support, but have been unsuccessful so far. By default it supports unusual languages such as Ancient Greek, but not Danish.
I am using the Docker version, set up manually with MySQL instead of PostgreSQL. I also spun up a completely fresh instance using the one-line installer on your website, with no luck.
Danish is supported by Tesseract, as shown here: https://github.com/tesseract-ocr/tesser ... Data-Files.
What I've tried so far:
root@db34fa5eaea5:/opt/mayan-edms# apt-cache search tesseract-ocr
tesseract-ocr-dan-frak - tesseract-ocr language files for Danish (Fraktur)
tesseract-ocr-dan - tesseract-ocr language files for Danish
tesseract-ocr-osd - tesseract-ocr language files for script and orientation
tesseract-ocr-eng - tesseract-ocr language files for English
tesseract-ocr - Tesseract command line OCR tool
tesseract-ocr-equ - tesseract-ocr language files for equations
- Reinstalled the Docker containers with different databases (although that should not make a difference).
- Installed OCR packages using the -e MAYA_APT_INSTALL parameter
- Installed the packages manually inside the container, using apt install tesseract-ocr-dan tesseract-ocr-dan-frak
- Tried changing the OCR tool from the default one to ocr.backends.tesseract.Tesseract, but the container crashed, stating that no such module exists.
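One further check that might narrow this down is whether Tesseract itself can see the Danish data after the steps above, independent of the web interface. The following is only a minimal sketch, assuming the tesseract binary is on the PATH inside the container and that tesseract --list-langs prints one header line followed by one language code per line. The helper names parse_langs and installed_langs are hypothetical and not part of any existing tool.

```python
import shutil
import subprocess


def parse_langs(output):
    """Return the set of language codes found in `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Assumption: the first non-empty line is a header such as
    # "List of available languages (3):", so it is skipped.
    return set(lines[1:])


def installed_langs():
    """Ask the local tesseract binary which languages it can load."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: depending on the release, the list may arrive on
    # stdout or on stderr, so fall back to stderr if stdout is empty.
    return parse_langs(result.stdout or result.stderr)
```

Calling installed_langs() inside the container and checking whether "dan" is in the result would show whether the language data is missing at the Tesseract level or only missing from the application's language list.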