Mayan Stack on Docker Swarm

When things don't work as they should.
gareththered
Posts: 5
Joined: Mon Feb 11, 2019 10:48 am

Mayan Stack on Docker Swarm

Post by gareththered » Tue Apr 02, 2019 12:30 pm

I've installed Mayan on a two-node Docker Swarm lab setup with the following definition:

Code:

version: "3.5"

services:
  db:
    image: postgres
    environment:
      POSTGRES_DB: mayan
      POSTGRES_USER: mayan
      POSTGRES_PASSWORD: mayanuserpass
    volumes:
      - type: volume
        source: db
        target: /var/lib/postgresql/data
    networks:
      - backend
  app:
    image: mayanedms/mayanedms
    restart: always
    environment:
      MAYAN_DATABASE_ENGINE: django.db.backends.postgresql
      MAYAN_DATABASE_HOST: db
      MAYAN_DATABASE_NAME: mayan
      MAYAN_DATABASE_USER: mayan
      MAYAN_DATABASE_PASSWORD: mayanuserpass
      MAYAN_DATABASE_CONN_MAX_AGE: 60
      MAYAN_PIP_INSTALLS: 'django-storages boto3'
      MAYAN_DOCUMENTS_STORAGE_BACKEND: storages.backends.s3boto3.S3Boto3Storage
      MAYAN_DOCUMENTS_STORAGE_BACKEND_ARGUMENTS: '{access_key: mayanKey, secret_key: mayanSecret, bucket_name: mayan, endpoint_url: "https://<my internal NAS FQDN>:9000", verify: False}'
    ports:
      - "80:8000"
    networks:
      - backend
    volumes:
      - type: volume
        source: app
        target: /var/lib/mayan

networks:
  backend:

volumes:
  db:
    driver: local
    driver_opts:
      type: nfs
      o: addr=<my internal NAS FQDN>,rw
      device: ":/srv/DockerVols/mayan-db"
  app:
    driver: local
    driver_opts:
      type: nfs
      o: addr=<my internal NAS FQDN>,rw
      device: ":/srv/DockerVols/mayan-app"
The idea is that documents are stored in the S3 object store (Minio), and both the PostgreSQL database and Mayan's own files are stored on an NFS share that I host. In theory, I should be able to move the services between the hosts effortlessly.

When I initially start the stack, it takes about 10-15 minutes to build Mayan - most of this is

Code:

collectstatic --noinput --clear
caching all the static elements. I believe it's slowed down by the fact that this is on an NFS share - I'm sure a local disk would be much quicker.

If this were a one-off, I could live with it. However, it seems that this process of collecting static files takes place every time an instance of Mayan is started. So if I drain the current Docker node, Mayan moves to the other one, but takes the same amount of time to come up because it rebuilds the cache of static elements over the previously created, perfectly valid, cache. Nothing should have changed, but the cache is always rebuilt.

There's logic in the entrypoint.sh script to check if this is an initial build or not, but nothing to check whether the cache genuinely needs rebuilding. My belief is that the culprit is:

Code:

upgrade() {
    echo "mayan: upgrade()"
    su mayan -c "${MAYAN_BIN} performupgrade"
    su mayan -c "${MAYAN_BIN} collectstatic --noinput --clear"
}
and that it needs some form of test before collectstatic is run.
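For example, something along these lines - the static path and the emptiness test are only placeholders for whatever the image really uses, not a patch I've tested:

Code:

upgrade() {
    echo "mayan: upgrade()"
    su mayan -c "${MAYAN_BIN} performupgrade"

    # Placeholder path - wherever the image actually collects static files to.
    STATIC_ROOT=/var/lib/mayan/static

    # Only rebuild the static cache when the directory looks empty;
    # otherwise trust the copy already sitting on the NFS volume.
    if [ -z "$(ls -A "${STATIC_ROOT}" 2>/dev/null)" ]; then
        su mayan -c "${MAYAN_BIN} collectstatic --noinput --clear"
    else
        echo "mayan: static files already present, skipping collectstatic"
    fi
}
A test this crude wouldn't be enough for a real upgrade, where the static files do change, but for a plain restart of a pinned image it would save the 10-15 minutes.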

Am I on the right track? Should the Docker image have a little more logic in it?

Thanks.

Gareth

rosarior
Posts: 274
Joined: Tue Aug 21, 2018 3:28 am

Re: Mayan Stack on Docker Swarm

Post by rosarior » Sat Apr 06, 2019 9:22 pm

Hello,

Neither Django nor Docker provides a way for the binary executing in the container to know whether the current execution is the first run or a restart of a previous install. Here is an open patch trying to solve the same issue (https://gitlab.com/mayan-edms/mayan-edm ... equests/39).

Since a few versions back we've started adding more version and build information (https://gitlab.com/mayan-edms/mayan-edm ... _init__.py) as well as version markers (https://gitlab.com/mayan-edms/mayan-edm ... er/version).

Since we are implementing in the app functionality that is the responsibility of the platform, we run the risk of shipping a feature over which we don't have complete control, and whose failure could be interpreted as poor code on our side, as happened with the database conversion support (viewtopic.php?f=7&t=58).

We are testing code that will update the version marker only after an upgrade, as a way to let Docker's entrypoint know that the execution is a restart and skip the collectstatic step. The problem is that we don't have a 100% reliable way of knowing when an upgrade completed successfully; we can only detect that the command finished. This could lead to broken upgrades. The current method is slower, but it guarantees 100% upgrade success.
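In rough, simplified form the idea being tested looks something like this (the marker path and variable names are illustrative only, not the final entrypoint code):

Code:

# Illustrative sketch only, not the final entrypoint code.
VERSION_MARKER=/var/lib/mayan/media/version   # assumed marker location
IMAGE_VERSION="${MAYAN_VERSION}"              # version baked into the image

if [ -f "${VERSION_MARKER}" ] && \
   [ "$(cat "${VERSION_MARKER}")" = "${IMAGE_VERSION}" ]; then
    # Marker matches the running code: treat this as a plain restart.
    echo "mayan: restart of the same version detected, skipping upgrade steps"
else
    su mayan -c "${MAYAN_BIN} performupgrade" && \
        su mayan -c "${MAYAN_BIN} collectstatic --noinput --clear" && \
        echo "${IMAGE_VERSION}" > "${VERSION_MARKER}"
    # Weak point: a zero exit status only tells us the commands finished,
    # not that the upgrade actually left the data in a good state.
fi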

We are attempting to implement this in the app and not as part of the Docker scripts, to have more control and to avoid adding platform-specific code, which is harder to maintain. This is an ongoing development with incremental changes in each release, and we can't promise when the complete code will ship.

gareththered
Posts: 5
Joined: Mon Feb 11, 2019 10:48 am

Re: Mayan Stack on Docker Swarm

Post by gareththered » Sun Apr 07, 2019 12:20 pm

Hi,

Thanks for the in-depth reply.

If the container is running the 'latest' branch, I can see how it could upgrade on a (re)start. However, if the container is pinned to a specific tag, there shouldn't be any upgrades, should there?

For example, if I chose to use mayanedms/mayanedms:3.1.9 as an image, I should be able to restart the image without any chance of an upgrade happening.

Would a simple flag (maybe in the form of an environment variable) resolve this? If, as the responsible owner of this Docker installation, I choose to restrict it to a specific version, I could set this flag to disable any upgrades on restart. If, at a later date, I decide to move to the next version, I could edit the tag, remove the flag, and let the installation upgrade. Once upgraded, I'd reset the flag ready for the next time the service restarts.
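In entrypoint.sh terms, I'm imagining something like this (MAYAN_SKIP_UPGRADE is a name I've made up for illustration, not an existing setting of the image):

Code:

# Hypothetical opt-out flag, set via the stack file's environment section.
if [ "${MAYAN_SKIP_UPGRADE:-false}" = "true" ]; then
    echo "mayan: MAYAN_SKIP_UPGRADE set, skipping performupgrade and collectstatic"
else
    su mayan -c "${MAYAN_BIN} performupgrade"
    su mayan -c "${MAYAN_BIN} collectstatic --noinput --clear"
fi
In the stack definition it would just be one more line under environment:, left set to true while I stay pinned to 3.1.9 and removed for the single start where I actually move to a newer tag.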

Kind regards.

rosarior
Posts: 274
Joined: Tue Aug 21, 2018 3:28 am

Re: Mayan Stack on Docker Swarm

Post by rosarior » Sun May 19, 2019 7:30 pm

In order to know whether the start is a new one or a restart, you need two pieces of information: the version of the existing data and the version of the running code.

An environment variable would only let the running code know the version it is running (which it can already do by importing the mayan/__init__.py module, which contains the version and build number); it doesn't provide any information about the version of the existing data.
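To put it another way, one half of the information is already easy to get from inside the container, while the other half has nowhere to come from yet (both commands are illustrative; this assumes the attribute is exposed as __version__ and that Mayan's Python environment is active):

Code:

# The running code's own version, available without any new environment variable.
python -c "import mayan; print(mayan.__version__)"

# The version of the existing data is the missing half; the marker path
# here is purely illustrative.
cat /var/lib/mayan/media/version 2>/dev/null || echo "unknown"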

In the past, users had to manually issue the ``performupgrade`` command when upgrading their Docker containers. We found that the majority of users would not read the upgrade instructions and would then report problems with missing database tables; this is why the Docker image now always tries to do an upgrade if it finds a database. Until Django provides an official way to do this, we are at the mercy of experiments until we find a solution that works for the 95th percentile of cases.

As for the static media download, version 3.2 beta 1 (released last week, viewtopic.php?f=2&t=569) includes detection of static media and will not access the internet to download or compress media files if it finds them on disk. That covers the dependency manager. On the Docker image side of things, we also figured out a way to include the static media files as part of the Docker image and serve them from inside the container's filesystem rather than from the user's media volume, which avoids downloading them at the start of a new installation or on any restart, eliminating this issue completely.

Give 3.2 beta 1 a try.
