Export all Documents

Crayiii
Posts: 9
Joined: Fri Aug 24, 2018 12:25 am

Export all Documents

Post by Crayiii »

I'm looking to get Mayan-EDMS installed as a Docker container on my unRAID server. I currently have it running in a VM, but some weird issues have cropped up with ACLs, etc.

Is there a way to export all of the documents in a batch from the database to a folder and have the original file names? Then I can import that folder and start from scratch.
rosarior
Developer
Posts: 546
Joined: Tue Aug 21, 2018 3:28 am
Location: Puerto Rico

Re: Export all Documents

Post by rosarior »

There is no official method yet to export documents. I've seen some scripts that dump them using the API or a Python shell.

Maybe we can add a simple document-only exporter. The only issue is filename collisions.
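
One common way to deal with the filename collisions mentioned here is to append a counter before the extension whenever a name is already taken. A minimal sketch in Python; `unique_name` and the ` (n)` suffix format are illustrative choices, not something Mayan provides:

```python
import os


def unique_name(name, taken):
    """Return `name`, or `stem (n).ext` if `name` is already in `taken`.

    `taken` is a set of names already used; the chosen name is added to it.
    """
    stem, ext = os.path.splitext(name)
    candidate = name
    counter = 1
    while candidate in taken:
        candidate = f"{stem} ({counter}){ext}"
        counter += 1
    taken.add(candidate)
    return candidate
```

Calling it twice with `"invoice.pdf"` and the same set returns `"invoice.pdf"` and then `"invoice (1).pdf"`.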
theintelligentmouse
Posts: 5
Joined: Thu Nov 01, 2018 7:16 am

Re: Export all Documents

Post by theintelligentmouse »

Yes, I was also going to suggest using the API to iterate through the documents and download them.

It would follow something like...

GET /api/documents/

<LOOP THROUGH RESULTS>

GET /api/documents/{id}/

<USE META INFO FOR FILENAME>

GET /documents/{id}/download/

<IF NOT EXISTS>
<SAVE FILE USING NAME FROM META>

<ELSE>
<SAVE FILE USING NAME FROM META + RANDOM>


It would be fairly straightforward to write in PHP or Python; I'd use PHP and cURL to simply go through the above steps.
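
For anyone who prefers Python, the steps above could look roughly like the sketch below. It is untested against a real Mayan install and rests on several assumptions that the thread does not confirm: token authentication, a paginated JSON list with `results` and `next` keys, and a `label` field holding the original file name. The endpoint paths are copied from the steps above; `BASE_URL`, `TOKEN`, `http_get`, and `export_all` are placeholder names.

```python
import json
import os
import secrets
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical server address
TOKEN = "replace-me"                # hypothetical API token


def http_get(url):
    """Fetch a URL and return the raw response body (assumes token auth)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Token {TOKEN}"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def export_all(target_dir, get=http_get, base_url=BASE_URL):
    """Download every document into target_dir and return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    url = f"{base_url}/api/documents/"
    while url:  # follow pagination until there is no next page
        page = json.loads(get(url))  # assumed shape: {"results": [...], "next": ...}
        for document in page["results"]:
            doc_id = document["id"]
            meta = json.loads(get(f"{base_url}/api/documents/{doc_id}/"))
            # "label" is an assumed field name; basename() drops any path parts
            name = os.path.basename(meta["label"]) or f"document-{doc_id}"
            path = os.path.join(target_dir, name)
            if os.path.exists(path):
                # Name already used: add a random suffix before the extension
                stem, ext = os.path.splitext(name)
                path = os.path.join(
                    target_dir, f"{stem}-{secrets.token_hex(4)}{ext}"
                )
            content = get(f"{base_url}/documents/{doc_id}/download/")
            with open(path, "wb") as handle:
                handle.write(content)
            saved.append(path)
        url = page.get("next")
    return saved
```

Running `export_all("export")` would then write everything into an `export` folder. The `get` parameter exists only so the loop can be tried against canned responses before pointing it at a live server.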
bernd
Posts: 5
Joined: Tue Aug 28, 2018 9:08 pm

Re: Export all Documents

Post by bernd »

Has either of you finished such a script?

bernd
KevinPawsey
50 Posts
Posts: 87
Joined: Wed Aug 22, 2018 2:52 pm

Re: Export all Documents

Post by KevinPawsey »

Silly question, but can’t you just go to Documents/All Documents?

Once that page has loaded, on the top right there is a drop down with a tick box next to it. Tick the box, that selects all documents, then go to “Advanced download” in the drop down. From there you can download all documents in a zip file.

Maybe that will work? I haven’t tried it myself, just noticed the option the other day.

Hope it helps

Kevin
Running Mayan-EDMS on: OpenMediaVault, (Docker plugin), on x86 dual-core
wesss
Posts: 6
Joined: Wed Feb 13, 2019 8:35 pm

Re: Export all Documents

Post by wesss »

I'd really appreciate such a script or option as well. Any update on this?
rssfed23
Moderator
Posts: 213
Joined: Mon Oct 14, 2019 1:18 pm
Location: United Kingdom

Re: Export all Documents

Post by rssfed23 »

wesss wrote: Wed Jan 08, 2020 2:50 pm I'd really appreciate such a script or option as well. Any update on this?
The project team isn't doing anything on this currently. If it's going to get done it will be via a community contribution from someone here on the forum.

We have received a feature request for a "select all" button (https://gitlab.com/mayan-edms/mayan-edms/issues/744). If implemented, you could use it on the documents page to select all documents, rather than just those visible on the page at the time, achieving the same effect as an "export all" button.
However, this feature request hasn't yet been accepted by the project, so to set expectations: there are no promises that you'll see this any time soon, and I can't give any timescales.

If a user here produces their own download script and shares it with the forum, I'll happily move it to the guides section and put a link (and credit) to it in the official documentation.
Please bear with us during the current global situation. The team all have families and local communities to look after as well as the community here. Responses may be delayed during this time, but rest assured we will get to your query eventually.
manux
Posts: 3
Joined: Sun Feb 16, 2020 8:57 am

Re: Export all Documents

Post by manux »

Hi,

I found that the files are not stored in the database but as real files on disk; each one just gets a number that references its database entry.

You can just copy the files and rename them after the title in the PDF.

Most PDFs have a title, and I wrote a script to do this.

However, there are a few issues with duplicate file names.

Code:

#!/usr/bin/bash

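# Renames files to their PDF title names
# TODO
# Use md5 to find duplicates and rename the files to their md5 sum
shopt -s nullglob
for f in *
do
# Skip anything that is not a regular file
[ -f "$f" ] || continue
T=$(md5sum "$f" | cut -d' ' -f1)
# Name each file after its md5 sum, so identical files get the same name
T="$T.pdf"
echo "$f $T"
mv -- "$f" "$T"
done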
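# Step 2: extract the names
shopt -s nullglob
for f in *
do
# Read the Title field from the pdfinfo output
T=$(pdfinfo "$f" | sed -n 's/^Title: *//p')
echo "$f $T"
# Skip files without a title
[ -n "$T" ] || continue
# GNU mv: keep a numbered backup when the target name already exists
mv --backup=numbered -- "$f" "$T"
done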

# Convert umlauts in file names for automatic upload
for file in ./*
do
  # Strip the leading "./"
  infile="${file:2}"
  # Transliterate umlauts first, then replace any remaining special characters
  outfile=$(printf '%s' "$infile" | sed \
         -e 's|ö|oe|g' -e 's|ä|ae|g' -e 's|ü|ue|g' \
         -e 's|Ö|OE|g' -e 's|Ä|AE|g' -e 's|Ü|UE|g' \
         -e 's|ß|ss|g' \
         -e 's|[^A-Za-z0-9._-]|_|g')
  if [ "$infile" != "$outfile" ]
  then
        echo "filename changed for $infile in $outfile"
        mv -- "$infile" "$outfile"
  fi
done


####
####
# Write the file name back into the PDF title tag
#shopt -s nullglob
for f in *
do
#exiftool -Title="$f" "$f"
pdfinfo "$f" | head -n 1 | cut -d' ' -f11-250
done
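
The TODO near the top of the script (finding duplicates by MD5 checksum) could be handled along these lines. This is a rough Python sketch and not part of the original script; `file_md5` and `find_duplicates` are illustrative names, and it only reports duplicates without renaming or deleting anything:

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Map each checksum shared by two or more files to their sorted names."""
    by_hash = defaultdict(list)
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            by_hash[file_md5(path)].append(name)
    return {h: names for h, names in by_hash.items() if len(names) > 1}
```

Running `find_duplicates(".")` in the copied document folder before renaming would show which files are byte-for-byte identical.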
zbig
Posts: 1
Joined: Wed Sep 02, 2020 11:46 am

Re: Export all Documents

Post by zbig »

KevinPawsey wrote: Tue Nov 27, 2018 12:16 am Silly question, but can’t you just go to Documents/All Documents?

Once that page has loaded, on the top right there is a drop down with a tick box next to it. Tick the box, that selects all documents, then go to “Advanced download” in the drop down. From there you can download all documents in a zip file.

Maybe that will work? I haven’t tried it myself, just noticed the option the other day.
This is great advice: a simple solution that actually works, thank you! No idea why it got ignored. I've done just that and got a handy ZIP file containing all the documents with their original file names. The only slight downside is that Mayan's "Select all" only works on the current page, but that's nothing a quick adjustment of "COMMON_PAGINATE_BY" can't fix. You could potentially run into memory problems during ZIP compression with very large archives, but then you can just split the export into chunks of, say, 500 documents and you're good; not a big deal.
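
If the export is ever scripted rather than clicked through, the chunking idea is easy to express in code. A small Python sketch; `chunked` is an illustrative helper, and 500 is just the figure mentioned above:

```python
from itertools import islice


def chunked(items, size=500):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

With 1,200 document IDs this yields batches of 500, 500, and 200, each of which could be exported as its own ZIP file.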