Broken preview of a PDF document

hhz
Posts: 10
Joined: Mon Oct 26, 2020 9:31 pm

Broken preview of a PDF document

Post by hhz »

Hi there,

On my Docker-based 4.0.1 setup, I have two very large PDF files that Mayan won't create a preview for. There is no error in the events log. How can I debug the reason for this? These are private documents, but I'd share the files with the developers if it helps.
michael
Developer
Posts: 135
Joined: Sun Apr 19, 2020 6:21 am

Re: Broken preview of a PDF document

Post by michael »

Hello,

It could mean that the PDF is malformed or deviates from the standard PDF file format. PDF is a family of formats and not all of them are open. Implementations vary, and some software produces PDFs that are invalid or unreadable by other software. See: What does “Couldn't read xref table” mean? => https://tex.stackexchange.com/questions ... table-mean

It could also mean you've encountered an issue with the PDF previewer we use, pdftoppm => https://gitlab.com/mayan-edms/mayan-edms/-/issues/991

The version of pdftoppm shipped with Mayan 4.0 seems to have a bug. This library is packaged by Debian, so there is not much we can do at the moment but wait to see if a new version is released and packaged. We are actively trying to find a way to ship our own package of pdftoppm, but our attempts so far have ended in a dependency hell of conflicting libraries => https://en.wikipedia.org/wiki/Dependency_hell

This is an ongoing issue but sadly falls outside of our control.
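
One way to narrow this down outside of Mayan is to run pdftoppm by hand on the problem file and look at its exit status. The following is only a minimal sketch, assuming the exit codes listed in the pdftoppm man page. The function name describe_pdftoppm_status and the file names in the usage comment are made up for illustration.

```shell
#!/bin/bash

# Translate a pdftoppm exit status into the meaning given in its man page.
# describe_pdftoppm_status is a hypothetical helper name.
describe_pdftoppm_status() {
  case "$1" in
    0)  echo "no error" ;;
    1)  echo "error opening a PDF file" ;;
    2)  echo "error opening an output file" ;;
    3)  echo "error related to PDF permissions" ;;
    99) echo "other error" ;;
    *)  echo "unknown status $1" ;;
  esac
}

# Usage (problem.pdf and /tmp/preview are placeholders):
#   pdftoppm -jpeg -r 150 -f 1 -l 1 problem.pdf /tmp/preview
#   describe_pdftoppm_status "$?"
```

A status of 99 together with an xref message on stderr would point towards the pdftoppm issue linked above rather than at Mayan itself.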
lsmoker
Posts: 28
Joined: Wed Sep 05, 2018 3:52 pm

Re: Broken preview of a PDF document

Post by lsmoker »

I implemented a bash shell wrapper to work around the bug in pdftoppm. I created a script, /usr/local/bin/pdftoppm.sh:

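Code:

#!/bin/bash

# Shell wrapper script that processes a PDF with ImageMagick's "convert"
# command if the poppler-utils "pdftoppm" command cannot handle the PDF.

# Pass all arguments through unchanged; "$@" keeps arguments with spaces intact.
/usr/bin/pdftoppm "$@"
status=$?

# pdftoppm exits with 99 for "other error", which is what the XREF bug triggers.
if [ "$status" -eq 99 ]; then

  # Assumes the PDF path is argument 4 and the first and last page numbers
  # are arguments 6 and 8. "convert" counts pages from 0, hence the -1.
  fp="$(($6-1))"
  lp="$(($8-1))"

  /usr/bin/convert -quality 100 -density 150 -flatten "$4[$fp-$lp]" jpg:-
fi

It uses the "convert" command from the ImageMagick package, so that needs to be installed. Also, the path to the pdftoppm binary must be changed to "/usr/local/bin/pdftoppm.sh", as above.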

Certainly not perfect and could be improved, but it does catch the odd PDF that triggers the XREF bug.
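
To see what the wrapper would hand to "convert", the argument handling can be tried on its own. This is a small sketch, assuming the wrapper is called with the PDF path as argument 4 and the page numbers as arguments 6 and 8 (for example: -jpeg -r 300 FILE -f N -l N). That call shape is inferred from the script's positional parameters and is not confirmed against Mayan's code; convert_page_spec is a made-up name.

```shell
#!/bin/bash

# Build the "file[first-last]" page selector that the fallback passes to convert.
# pdftoppm numbers pages from 1, ImageMagick from 0, hence the -1.
# convert_page_spec is a hypothetical helper name.
convert_page_spec() {
  local file="$4"
  local fp="$(($6-1))"
  local lp="$(($8-1))"
  printf '%s\n' "${file}[${fp}-${lp}]"
}

# Example call with placeholder arguments in the assumed order.
convert_page_spec -jpeg -r 300 /tmp/example.pdf -f 3 -l 3
```

If the arguments arrive in a different order on a given setup, the positions in the wrapper need adjusting to match.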
---
LeVon Smoker