Use Shard based/content addressable storage in Mayan EDMS

Posted: Wed Jan 29, 2020 3:52 pm
by hercovandyk
Hi all.

We have a project which will eventually store over 80 million documents.

One of our main concerns was that Mayan stores all its documents in the same directory. We would prefer the documents to be stored in a sharded directory structure. I eventually found a Django project that helps with this: django-castor.

https://github.com/zacharyvoase/django-castor
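
For readers unfamiliar with the idea, here is a minimal Python sketch of how content-addressable, sharded storage can derive a file path. It is not django-castor's actual code: the function name sharded_path, the choice of SHA-1, and the width and depth defaults of 2 are all assumptions chosen for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(content: bytes, width: int = 2, depth: int = 2) -> str:
    """Return a relative path derived from the SHA-1 digest of ``content``.

    Hypothetical helper: the first ``depth`` chunks of ``width`` hex
    characters become nested directory names, and the full digest is
    used as the file name.
    """
    digest = hashlib.sha1(content).hexdigest()
    # Slice the leading hex characters of the digest into shard directories.
    shards = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return str(PurePosixPath(*shards, digest))


if __name__ == "__main__":
    # Identical content always maps to the same path.
    print(sharded_path(b"example document bytes"))
```

With these assumed defaults there are 256 x 256 = 65,536 leaf directories, so 80 million documents would average roughly 1,200 files per directory rather than all sitting in one.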

Unfortunately, django-castor was last updated in 2015, so two small modifications are needed for it to work with Python 3.8.

Here's the how-to for direct installs. Docker installations will need to adapt the procedure to match.

1. Activate your virtual environment for Mayan and install django-castor:

source /opt/mayan-edms/bin/activate

pip install django-castor==0.2.1
2. Edit the following files in django-castor:

djcastor/storage.py:
Change line 80 to the following:

    def get_available_name(self, name, max_length=None):
djcastor/utils.py:
Change line 78 to the following:

    for i in range(depth):
3. Update your Mayan settings file.
Open config.yml in your Mayan media directory.
Set the DOCUMENTS_STORAGE_BACKEND setting to the following:

DOCUMENTS_STORAGE_BACKEND: djcastor.storage.CAStorage
Restart your services and Mayan should now store files using sharding.
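
Some background on why the two edits in step 2 are needed: newer Django versions pass a max_length keyword argument to get_available_name() when saving a file, and Python 3 removed xrange() in favour of range(). The sketch below mimics the first problem without importing Django. OldStyleStorage, PatchedStorage and save_name are hypothetical names used only for this illustration, not part of Django or django-castor.

```python
class OldStyleStorage:
    """Stand-in for a storage class written before max_length existed."""

    def get_available_name(self, name):
        return name


class PatchedStorage:
    """Stand-in for the same class after the one-line signature change."""

    def get_available_name(self, name, max_length=None):
        # Content-addressed names are derived from the file contents,
        # so the name is returned unchanged and max_length is ignored here.
        return name


def save_name(storage, name, max_length=None):
    """Mimic the way newer Django calls the storage backend on save."""
    return storage.get_available_name(name, max_length=max_length)


if __name__ == "__main__":
    print(save_name(PatchedStorage(), "doc.pdf", max_length=255))
    try:
        save_name(OldStyleStorage(), "doc.pdf", max_length=255)
    except TypeError as exc:
        # The old signature rejects the unexpected keyword argument.
        print("old signature fails:", exc)
```

If editing files inside site-packages is a concern, the same signature change could presumably be made in a small subclass instead, but that is untested here.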

Re: HOW TO: Shard based storage in Mayan.

Posted: Wed Jan 29, 2020 4:29 pm
by rssfed23
Thanks a bunch for sharing this! I’ll give it a try later today and then move this to the guides section for everyone :)

Also good to see you’ve got up to 80 million documents! I remember your initial post talking about 60 million so pleased to hear the project is going well.

If you ever want to share any other modifications you’ve made to help Mayan scale to that kind of level beyond sharded storage, don’t hesitate to post :)

Re: HOW TO: Shard based storage in Mayan.

Posted: Tue Feb 04, 2020 9:48 pm
by hercovandyk
Thanks for moving this to the guide section.

We have a few other modifications and integrations that we are currently working on for future projects. When we are done with them, I will do another write-up on how we use Mayan as a repository for our smart scanning software.