Broken preview of a PDF document

Posted: Mon May 31, 2021 7:43 pm
by hhz
Hi there,

on my Docker-based 4.0.1 setup, I have two very large PDF files that Mayan won't create a preview for. There is no error in the events log. How can I debug the reason for this? These are private documents, but I'd share these files with the developers if it helps.

Re: Broken preview of a PDF document

Posted: Wed Jun 02, 2021 2:39 pm
by michael
Hello,

It could mean that the PDF is malformed or deviates from the standard PDF file format. PDFs are a family of formats and not all of them are open. Implementations vary, and some software produces PDFs that are not valid or not readable by other software => What does “Couldn't read xref table” mean? => https://tex.stackexchange.com/questions ... table-mean

It could also mean you've encountered an issue with the PDF previewer we use, pdftoppm => https://gitlab.com/mayan-edms/mayan-edms/-/issues/991

The version of pdftoppm shipped with Mayan 4.0 seems to have a bug. This tool is packaged by Debian, so there is not much we can do at the moment but wait to see if a new version is released and packaged. We are actively trying to find a way to ship our own package of pdftoppm; however, attempts end up getting us into a dependency hell of conflicting libraries => https://en.wikipedia.org/wiki/Dependency_hell

This is an ongoing issue but sadly falls outside of our control.
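
One way to tell the two cases apart is to run pdftoppm by hand on the affected file inside the container and look at its exit status and error output. A minimal sketch, assuming the exit codes listed in the poppler-utils man page (0, 1, 2, 3, 99); the function names describe_pdftoppm_status and check_pdf are made up for this example and are not part of Mayan:

```bash
#!/bin/bash

# Translate a pdftoppm exit status into the wording used by its man page.
describe_pdftoppm_status() {
  case "$1" in
    0)  echo "no error" ;;
    1)  echo "error opening a PDF file" ;;
    2)  echo "error opening an output file" ;;
    3)  echo "error related to PDF permissions" ;;
    99) echo "other error" ;;
    *)  echo "unexpected status $1" ;;
  esac
}

# Hypothetical helper: render only page 1 of the given PDF, throw the image
# away, and report how pdftoppm exited. stderr is left visible so messages
# such as "Couldn't read xref table" still show up.
check_pdf() {
  local status
  /usr/bin/pdftoppm -jpeg -f 1 -l 1 "$1" >/dev/null
  status=$?
  echo "pdftoppm exited with $status: $(describe_pdftoppm_status "$status")"
}
```

After sourcing the file, something like `check_pdf /path/to/document.pdf` would print the status for one file (the path is a placeholder).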

Re: Broken preview of a PDF document

Posted: Tue Jun 08, 2021 7:15 pm
by lsmoker
I might implement a bash shell wrapper to work around the bug in pdftoppm. I created a script /usr/local/bin/pdftoppm.sh...

Code:

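#!/bin/bash

# Shell wrapper script: process a PDF with ImageMagick's "convert" command
# if the poppler-utils "pdftoppm" command cannot handle the PDF.

# Pass all arguments through unchanged, preserving spaces in file names
/usr/bin/pdftoppm "$@"
status=$?

# 99 is pdftoppm's "other error" exit status
if [ "$status" -eq 99 ]; then

  # Assumes the caller's argument order: $4 = PDF path, $6 = first page,
  # $8 = last page. ImageMagick page indexes start at 0, hence the -1.
  fp="$(($6-1))"
  lp="$(($8-1))"

  /usr/bin/convert -quality 100 -density 150 -flatten "${4}[${fp}-${lp}]" jpg:-
fi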
It uses the "convert" command from the ImageMagick package, so that needs to be installed. Also, the path to the pdftoppm binary must be changed to "/usr/local/bin/pdftoppm.sh" as above.

Certainly not perfect and could be improved, but it does catch the odd PDF that triggers the XREF bug.
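
For anyone adapting the script: the page arithmetic is there because pdftoppm's -f/-l page numbers start at 1, while ImageMagick's bracketed page index starts at 0. A small sketch of just that conversion, pulled out into a function so it can be tried on its own (im_page_selector is a made-up name for this example, not something from the wrapper above):

```bash
#!/bin/bash

# Build ImageMagick's "file[first-last]" page selector from a PDF path and
# 1-based first/last page numbers (the numbering pdftoppm's -f/-l use).
im_page_selector() {
  local file="$1"
  local fp="$(($2-1))"   # ImageMagick indexes pages from 0
  local lp="$(($3-1))"
  echo "${file}[${fp}-${lp}]"
}

# Example: pages 1 to 3 of doc.pdf
im_page_selector doc.pdf 1 3
```

Quoting the result when passing it to convert keeps the shell from treating the brackets as a glob pattern.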