(OCR error) Failed loading language / Tesseract couldn't load any languages / Could not initialize tesseract (possible solution)

If you are running into OCR language errors like the ones above, try this step-by-step guide.
As the error states, you must install the language pack you need; in my case that is the “Portuguese-BR” language.
Open “Docker Desktop”, click the three dots next to the “app-1” container, and then click “Open in terminal”.

A terminal inside the container will open. Enter the following commands in order:

sudo apt update
apt list --upgradable
sudo apt upgrade
tesseract --version
sudo apt install tesseract-ocr-por

You can replace “por” (Portuguese) with the language you want.
I don't believe you need to restart any container in Docker; just upload a new file to test the OCR. Hope this helps.
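
To make the “replace por” step concrete, here is a minimal sketch of how a language code maps to a package name. The helper name tesseract_packages is made up for illustration, and it assumes the usual Debian/Ubuntu naming (tesseract-ocr-<code>, with an underscore in codes such as chi_sim written as a hyphen):

```shell
# Hypothetical helper: turn Tesseract language codes into apt package names.
# Assumes the Debian/Ubuntu convention tesseract-ocr-<code>, where an
# underscore in the code (e.g. chi_sim) becomes a hyphen in the package name.
tesseract_packages() {
    for code in "$@"; do
        printf 'tesseract-ocr-%s\n' "$(printf '%s' "$code" | tr '_' '-')"
    done
}

# Example: package names for Portuguese and German.
tesseract_packages por deu
# prints:
# tesseract-ocr-por
# tesseract-ocr-deu
```

After installing, running tesseract --list-langs inside the container should show whether the new language was picked up.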

You can also set the MAYAN_APT_INSTALLS environment variable in your compose file so that the containers install the additional packages automatically. This is especially useful in a multi-container setup:

MAYAN_APT_INSTALLS="tesseract-ocr-deu tesseract-ocr-eng"
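
One thing worth checking if the variable seems to be ignored: forum software tends to turn straight quotes into typographic ones, and a value pasted with “ ” would most likely be passed through literally and break the package names. A small sketch of a check follows; the function name has_curly_quotes and the .env path are assumptions for illustration:

```shell
# Hypothetical helper: succeeds (exit status 0) if the given file contains
# typographic double quotes, which are not valid quoting in an env file.
has_curly_quotes() {
    grep -q -e '“' -e '”' "$1"
}

# Usage (the .env path is an assumption, adjust it to your setup):
#   if has_curly_quotes .env; then echo "replace the curly quotes in .env"; fi
```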


It seems that the MAYAN_APT_INSTALLS variable in the .env file is no longer functioning as expected. I am currently using version 4.8.2 of Mayan EDMS, and the workaround I’ve been using is:

docker exec -it your-container-name /bin/bash
apt-get update && apt-get install -y tesseract-ocr-ara

Could you advise if there is a more efficient or recommended approach for handling this?

I’m still on 4.7.2, but I cannot find anything in the release notes about the MAYAN_APT_INSTALLS variable being removed.

Your approach seems OK, but you have to run the install again every time you recreate the container. With MAYAN_APT_INSTALLS, the container installs the missing packages automatically on startup.
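
If you stay with the manual workaround, it could be wrapped in a small function so it is easy to rerun after each recreation. This is only a sketch: install_ocr_langs and the DOCKER override are names made up here, and it assumes the container's default user is allowed to run apt-get, as in the docker exec commands above.

```shell
# Sketch: reinstall Tesseract language packages inside a running container.
# install_ocr_langs and the DOCKER variable are hypothetical names.
install_ocr_langs() {
    container="$1"
    shift
    # ${DOCKER:-docker} allows a dry run, e.g. DOCKER=echo prints the command
    # instead of executing it. The package names are passed as arguments to
    # the inner shell and expanded there via "$@".
    ${DOCKER:-docker} exec "$container" \
        sh -c 'apt-get update && apt-get install -y "$@"' sh "$@"
}

# Usage (container name is a placeholder, as in the post above):
#   install_ocr_langs your-container-name tesseract-ocr-ara
```

Unlike MAYAN_APT_INSTALLS, this still has to be run by hand (or from your own deployment script) after the container is recreated.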
