Cannot display Chinese character and cannot identify Excel files

When things don't work as they should.
leoliu
Posts: 2
Joined: Tue Sep 10, 2019 3:24 am

Cannot display Chinese character and cannot identify Excel files

Post by leoliu » Tue Sep 10, 2019 3:52 am

Hi There,

I encountered two issues when using Mayan EDMS:
1. Cannot display Chinese characters
When I upload files that contain Chinese characters, the characters are not displayed correctly; instead, they appear as blank boxes. Please check the screenshot below:
The attachment BB.png is no longer available
Here is the error message in OCR errors:
Exception calling Tesseract with language option: cmn; RAN: /usr/bin/tesseract - - -l cmn STDOUT: STDERR: Error opening data file /usr/share/tesseract-ocr/tessdata/cmn.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'cmn' Tesseract couldn't load any languages! Could not initialize tesseract. The requested OCR language "cmn" is not available and needs to be installed.

2. Cannot identify Excel files
Some Excel files are identified correctly, but others are not. Please check the screenshot below:
BB.png (110.1 KiB)
Mayan EDMS is a good system and I am fascinated by it. I hope someone can give me a hand. Thanks in advance.

rosarior
Posts: 345
Joined: Tue Aug 21, 2018 3:28 am

Re: Cannot display Chinese character and cannot identify Excel files

Post by rosarior » Tue Sep 10, 2019 5:21 am

Hi, thanks for the reports.

It appears that the OCR engine, Tesseract, does not yet have support for Mandarin. The error message says that it cannot find the language data for Mandarin. The error is not fatal: only the OCR will fail, and all other features will remain available.
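
Since the error names the file Tesseract tried to open, one quick check is whether that file exists on the server. Here is a minimal sketch, assuming the tessdata directory shown in the error message (/usr/share/tesseract-ocr/tessdata). The helper name missing_traineddata is hypothetical and is not part of Mayan EDMS or Tesseract:

```python
from pathlib import Path

# Directory taken from the error message; it may differ on other installs.
TESSDATA_DIR = Path("/usr/share/tesseract-ocr/tessdata")


def missing_traineddata(languages, tessdata_dir=TESSDATA_DIR):
    """Return the language codes that have no <code>.traineddata file."""
    tessdata_dir = Path(tessdata_dir)
    return [
        code
        for code in languages
        if not (tessdata_dir / f"{code}.traineddata").is_file()
    ]


if __name__ == "__main__":
    # "cmn" is the code from the error; "eng" is included for comparison.
    print(missing_traineddata(["cmn", "eng"]))
```

If the printed list includes cmn, the language data is not in that directory. Tesseract's published language data uses the names chi_sim and chi_tra for Chinese, so it may be worth checking for those as well.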

The character rendering could be as simple as missing fonts. Can you provide some files that trigger the issue, so that it can be reproduced locally? They can be files with random content that you create, as long as they behave in the same way as the issue you describe. They should not contain any real information.
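
To test the missing-fonts idea in the meantime, here is a rough sketch that asks fontconfig which installed fonts cover Chinese. It assumes a Linux server with the fc-list tool available, and the function name fonts_for_language is hypothetical. It only reports what fontconfig sees system-wide, which may or may not be what the document conversion actually uses:

```python
import shutil
import subprocess


def fonts_for_language(lang="zh"):
    """Return the font family names fontconfig reports for a language.

    Returns None when the fc-list tool is not installed.
    """
    if shutil.which("fc-list") is None:
        return None
    # "fc-list :lang=zh family" prints one family entry per matching font.
    result = subprocess.run(
        ["fc-list", f":lang={lang}", "family"],
        capture_output=True,
        text=True,
        check=False,
    )
    families = {line.strip() for line in result.stdout.splitlines()}
    families.discard("")
    return sorted(families)


if __name__ == "__main__":
    print(fonts_for_language("zh"))
```

An empty list would suggest that no installed font covers Chinese, which would be consistent with the blank boxes.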

Thank you.

leoliu
Posts: 2
Joined: Tue Sep 10, 2019 3:24 am

Re: Cannot display Chinese character and cannot identify Excel files

Post by leoliu » Tue Sep 10, 2019 5:47 am

Hi,
Thanks for your quick reply.
A sample file is attached.
教师节快乐.zip
(139.01 KiB)
