How to Load Test Mayan EDMS (and some impressive results)

rssfed23

Post by rssfed23 »

If you want to skip straight to my load test results, which show Mayan EDMS comfortably scaling to over 1000 simultaneously active users on a single virtual machine (2 vCPU, 4GB RAM) with no errors, scroll down to "Mayan Load Test Results" at the bottom.

I've been working on building a new demo environment for the Mayan Project so that users can explore a read-only Mayan EDMS without any Docker or Linux experience. As part of this I've been running some load tests, and I wanted to share both the results and how to run them so that users here can benefit.


Why would I want to Load Test Mayan EDMS?
There are many reasons to load test. If you're running Mayan EDMS in a production environment (and especially within the Enterprise) you need to prove that it can handle the number of users and workloads you require on the hardware you can provide to it. Failure to do this during a Proof of Concept phase can lead to some serious headaches further down the line when deploying.

Side note: this is one area where a Mayan EDMS Support plan or professional services agreement can save you a lot of time: they have the skills and experience to provide expert guidance on the type of architecture and requirements Mayan environments of all shapes and sizes need.

Additionally, load testing can:
- Help identify the root cause of bugs/issues
- Help you optimise your infrastructure architecture
- Help with resource and capacity planning
- Help maintain Mayan EDMS availability and speed
- Help you test the impact of future updates
- Give you reassurance that a configuration change you're about to make isn't going to slow things down for your users
In short, you really should consider doing one :)

How can I load test Mayan EDMS?
There are a lot of different load testing tools out there. A lot! They come in all shapes and sizes: for different operating systems, graphical and command-line, free and paid.

Today's walkthrough uses Locust. Locust is a Python-based load testing tool that's super simple to get up and running. Although the initial setup/configuration is done on the command line, it also has a Web UI which makes running tests and viewing results a breeze!

Locust runs on Windows, Linux and macOS (as it's Python based). This is important because you won't want to run your load test from the Mayan server itself: establishing and maintaining network connections can generate a fair amount of load on some systems, so the test wouldn't be representative. It's handy to be able to run it from your Windows desktop instead.

These instructions were run on Ubuntu 18.04, but the process is the same on Windows/Mac once you've got Python 3 installed (it needs to be 3.5+). On Windows you'll get an error during the install if you've not installed the Visual Studio C++ compiler (but that's free and easy to install too).

To get started, fire up your shell and run:

pip install locustio
That's it! Locust is now installed on the machine you're on.

Locust can be run entirely locally or in a master/slave setup. In a master/slave load test you can scale up the number of slaves so that you don't max out the capabilities of the machines running the test.

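Before we start Locust, we need to tell it which addresses/commands to run during the test. Create a new file called locust.py (if your Locust version refuses to load a file with that exact name because it clashes with the locust package itself, pick another name such as locustfile.py and use it in the commands that follow):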

vi locust.py
In here we write our test script. Here's an example for Mayan with comments inline describing what each section does:

from locust import HttpLocust, TaskSet, task, between


class UserBehavior(TaskSet):
    def on_start(self):
        # on_start is called when a simulated user starts, before any task
        # is scheduled. We use it to log in to Mayan EDMS.
        self.login()

    def login(self):
        # Load a page first so that we receive the CSRF cookie.
        response = self.client.get("/")
        csrftoken = response.cookies['csrftoken']
        # Put your own Mayan username and password here so Locust can log in.
        # The field names must match the inputs of your login form.
        self.client.post(
            '/authentication/login/',
            {'id_username': 'rob', 'id_password': 'Zknigh1121$'},
            headers={'X-CSRFToken': csrftoken},
        )

    # The number passed to @task is a weight: a task with weight 8 is picked
    # roughly eight times as often as a task with weight 1.
    @task(1)
    def documents(self):
        self.client.get("/documents/documents/")

    @task(2)
    def documentpreview(self):
        self.client.get("/documents/documents/16/preview/")

    @task(3)
    def documentpages(self):
        self.client.get("/documents/documents/16/pages/")

    @task(4)
    def documentcontent(self):
        self.client.get("/parsing/documents/16/content/")

    @task(5)
    def documentocrcontent(self):
        self.client.get("/ocr/documents/16/content/")

    @task(6)
    def attributes(self):
        self.client.get("/file_metadata/document_version_driver/12/attributes/")

    @task(7)
    def documenttypes(self):
        self.client.get("/documents/document_types/")

    @task(8)
    def tags(self):
        self.client.get("/tags/tags/1/documents/")


class WebsiteUser(HttpLocust):
    task_set = UserBehavior
    # Wait a random time between 3 and 30 seconds between each task execution.
    # This helps simulate genuine user activity and more realistic requests.
    wait_time = between(3.00, 30.00)

The above script will log in to Mayan and then load all of the URLs you see listed. Common document tasks are covered (viewing all documents with a specific tag, viewing the "all documents" page, viewing the OCR content of a document, viewing the parsed content of a document, viewing the file metadata of a document, etc.).
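Since Locust picks tasks in proportion to the number passed to @task, it can help to see what mix of requests the script above actually produces. Here's a minimal sketch that works it out; the dictionary simply copies the weights from the script, and request_share is a helper name I made up for illustration:

```python
# Weights copied from the @task(...) decorators in the script above.
TASK_WEIGHTS = {
    "documents": 1,
    "documentpreview": 2,
    "documentpages": 3,
    "documentcontent": 4,
    "documentocrcontent": 5,
    "attributes": 6,
    "documenttypes": 7,
    "tags": 8,
}


def request_share(weights):
    """Return each task's expected share of task executions (0.0 to 1.0)."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


for name, share in request_share(TASK_WEIGHTS).items():
    print("{:20s} {:6.1%}".format(name, share))
```

With these weights the tag listing makes up roughly 22% of the task executions and the "all documents" page under 3%, so adjust the numbers if your real users behave differently.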

Important: Some of the URLs point at a specific tag, document or document version. You must make sure those IDs actually exist within Mayan to avoid 404s. Change the numbers to match an existing document/version/tag.
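One way to catch wrong IDs before starting a long test is to request each path once and look for 404s. The sketch below is only an illustration: find_missing is a hypothetical helper, mayan.example is a placeholder host, and the fetch argument is assumed to be something that returns the HTTP status code for a URL from an already logged-in session (an anonymous request would typically just be redirected to the login page). A fake fetch is used here so the example runs without a server.

```python
# Paths copied from the test script; change the IDs to match your own data.
PATHS = [
    "/documents/documents/16/preview/",
    "/documents/documents/16/pages/",
    "/parsing/documents/16/content/",
    "/ocr/documents/16/content/",
    "/file_metadata/document_version_driver/12/attributes/",
    "/tags/tags/1/documents/",
]


def find_missing(host, paths, fetch):
    """Return the paths for which fetch(full_url) reports HTTP 404."""
    base = host.rstrip("/")
    return [path for path in paths if fetch(base + path) == 404]


# Stand-in for a real, authenticated fetch: pretend tag 1 does not exist.
fake_statuses = {"http://mayan.example:8000/tags/tags/1/documents/": 404}
missing = find_missing(
    "http://mayan.example:8000", PATHS, lambda url: fake_statuses.get(url, 200)
)
print(missing)
```

Against a real server you could pass something like `lambda url: session.get(url).status_code`, where session is a requests.Session that has already logged in to Mayan.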

To launch Locust we run:

locust -f locust.py
This will run Locust in the foreground. For long-running tests you might want to run:

nohup locust -f locust.py &
This will allow you to close your console window and keep Locust running.

You can now go to http://127.0.0.1:8089 in your browser (replace 127.0.0.1 with the IP address of the system running Locust if that's a different machine).

The first screen asks how many clients to simulate. The number you choose is up to you!
It also asks how many clients to spawn per second. I recommend not choosing a huge number here, as most operating systems (especially Windows) have a limit on half-open outgoing connections and you will get false errors when those connections time out. Anywhere between 1 and 50 is okay for most systems. If the purpose of your test is to see how Mayan copes with a huge surge in users then by all means set this higher, but make sure that you're running in distributed mode with 2 or more slaves so that you don't hit the limit of an individual slave machine.
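The spawn rate also decides how long the test takes to reach full load, which is worth knowing before you start reading the charts. A rough sketch of that arithmetic (ramp_up_seconds is just an illustrative helper, and it assumes Locust spawns clients at a steady rate):

```python
import math


def ramp_up_seconds(total_users, hatch_rate):
    """Approximate seconds needed to start total_users at hatch_rate users/sec."""
    if hatch_rate <= 0:
        raise ValueError("hatch_rate must be positive")
    return math.ceil(total_users / hatch_rate)


# How long 1050 simulated users take to start at different spawn rates.
for rate in (1, 10, 50):
    print("{} users/sec -> {} seconds".format(rate, ramp_up_seconds(1050, rate)))
```

So at 1 client per second a 1050 user test needs about 17.5 minutes before it is at full load, while at 50 per second it gets there in about 21 seconds.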

Finally, the host. This is where you enter the address of Mayan EDMS. As you can see, the URLs in the script above are relative to the host you enter in this box. "mayan.rssfed23.com:8000" may be a good value. If you're running Mayan on a different context path don't forget to include that too (e.g. "mayan.rssfed23.com:8000/mayan").

Then, click the button to begin!

Your load test will now run until you either click "STOP" at the top right or exit the Locust process on the command line.

Running in distributed mode
Running in master/slave mode is also super simple.
First, make sure you've copied your locust.py file to all the slave nodes that will generate the load.
Then, on your master run:

locust -f locust.py --master
The master is now running and the web UI will be visible.
If you start a test now nothing will happen, as a master does not execute tests itself (it uses the slaves for that).

To start a slave, run:

locust --slave --master-host=127.0.0.1 -f /opt/locust.py
Replace 127.0.0.1 with the address of your master if the slave runs on a different machine. Again, you can run either of these with "nohup" to avoid it shutting down when you leave your shell.

That's it! Once you start a test from the web UI you'll see the number of requests, the URLs, statistics about how long each request has taken, and any errors that happen on the "Failures" and "Exceptions" tabs at the top. The Charts tab is also handy.
If you stop a test, you can start a new one at any time by clicking "New test" at the top right.

How do I monitor the test from the Mayan EDMS perspective?
Although you can see errors/failures and how long each request takes in Locust, you might also want to monitor Mayan itself to see how much load the system is under during the test!
There are a million and one ways this can be done, from standard top/htop tools to expensive monitoring solutions. Anything that can monitor system load will do.
I'll be making another tutorial on monitoring using Prometheus/Grafana later, but one quick and easy (and also VERY powerful) way to get real-time metrics is NetData.
NetData is great at real-time monitoring. It is not designed as a permanent metrics archive (but it can ship metrics to Prometheus/InfluxDB for that purpose). This makes it ideal for getting a snapshot of the load a system is under at any point in time.
To install NetData run:

bash <(curl -Ss https://my-netdata.io/kickstart.sh)
Then head to http://mayanserverIP:19999 in your browser (19999 is NetData's default port; port 8000 is Mayan itself).
NetData also has the benefit of automatically monitoring all running containers as well as other services (such as Redis/RabbitMQ/PostgreSQL) that are running natively on the same server. It picks all of this up by itself!
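If you'd rather pull numbers out of NetData than watch the dashboard, it also exposes its charts over HTTP. The sketch below is based on my understanding of NetData's v1 data API (the /api/v1/data endpoint, the system.cpu chart and the default port 19999), so treat those as assumptions and check them against your install. netdata_data_url and average_total are illustrative helper names, and the sample payload is made up to show the shape I expect rather than real measurements.

```python
from urllib.parse import urlencode


def netdata_data_url(host, chart="system.cpu", seconds=60, port=19999):
    """Build a URL asking NetData for the last `seconds` of one chart as JSON."""
    query = urlencode([("chart", chart), ("after", -seconds), ("format", "json")])
    return "http://{}:{}/api/v1/data?{}".format(host, port, query)


def average_total(payload):
    """Average, over all rows, the sum of every column except the timestamp."""
    rows = payload["data"]
    if not rows:
        return 0.0
    totals = [sum(v for v in row[1:] if v is not None) for row in rows]
    return sum(totals) / len(totals)


# Made-up payload in the assumed {"labels": [...], "data": [[time, ...], ...]} shape.
sample = {
    "labels": ["time", "user", "system"],
    "data": [[1, 10.0, 5.0], [2, 20.0, 5.0]],
}
print(netdata_data_url("mayanserverIP"))
print(average_total(sample))
```

You would fetch the URL with your HTTP client of choice while the load test is running, decode the JSON and pass it to average_total to get a single CPU figure for the test window.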

That's it as far as how to run a load test. Leave feedback below.

Mayan Load Test Results
I wanted to share some load test results from a Mayan EDMS instance that I plan to open up as a public demo.
The environment:
- AWS Lightsail Instance (Plan: $20 a month)
- 4GB RAM
- 2 vCPUs
- PostgreSQL is running on a separate Lightsail RDS instance ($15 a month plan)
- No tweaks/improvements to the default docker-compose setup (Gunicorn + Redis)
- Traefik running as reverse SSL proxy
- Cache disabled to try and generate some load
- 25,000 documents

Why is PostgreSQL separate? It's best practice for production environments to separate the database from the application server. I would recommend that anyone running Mayan at scale in production at least moves the database to its own server to improve performance.

Using a similar test script to the one above, here are the results with 1050 users:

I am frankly amazed at these results. We're talking about 1050 concurrent users doing random tasks on the system at once and not a single failure. That's around 60-150 HTTP requests per second (RPS). I was expecting 100-200 based on my past experience with other dynamic web platforms such as Java/PHP.

For a dynamic web application, on such a small virtual machine, without any kind of caching or other optimisations, that is VERY impressive!
To help put it in perspective, I work at a company with over 3,000 employees and it's rare that I see our Wiki (which we all use for everything, all day) go above 10-20 RPS.

Image

The average response time for most requests is about 2 seconds. That may seem like a lot, but please remember that this is a VM running with 2 vCPUs and only 4GB of Memory.
We must also remember that waiting 2 seconds for a page/document to load is a world apart from getting timeouts, errors and not loading at all.
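As a sanity check on those numbers, you can estimate the request rate you'd expect from the test settings alone. This is a back-of-the-envelope sketch, assuming every simulated user loops through "wait, send one request, get the response", with the wait spread evenly between 3 and 30 seconds as in the example script and a response time of about 2 seconds. expected_rps is just an illustrative helper.

```python
def expected_rps(users, min_wait, max_wait, avg_response):
    """Estimate requests per second for users who wait, request, then repeat."""
    mean_wait = (min_wait + max_wait) / 2.0
    # Each user completes one request per (wait + response time) seconds.
    return users / (mean_wait + avg_response)


# 1050 users, 3-30 second waits, ~2 second responses.
print(round(expected_rps(1050, 3.0, 30.0, 2.0), 1))  # 56.8
```

That comes out close to the lower end of the 60-150 RPS I saw, which is reassuring. The same sum works in reverse for capacity planning: halve the wait times and the same number of users generates nearly twice the load.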
Mayan itself is only using half of that RAM:

Image

So you could easily get away with the smallest Lightsail plan ($3.50 a month) serving around 200 users on an environment similar to mine (as long as you were not using that same VM for OCR tasks). The 2 vCPU cores really help here.

And here's the specific Mayan container specs:

Image

And Traefik is floating around 10%:
Image

I hope you enjoyed following this tutorial. Feel free to leave feedback/ask questions below. I look forward to letting everyone loose on the public demo environment soon :)

Rob