MySQL Error 1235 on clicking Recently Added documents

When things don't work as they should.
Post Reply
maathieu
Posts: 15
Joined: Tue Dec 24, 2019 10:17 am

MySQL Error 1235 on clicking Recently Added documents

Post by maathieu »

Hello,

On the documentation it is written:
PostgreSQL vs. MySQL

Since Django abstracts database operations from a functional point of view Mayan EDMS will behave exactly the same either way. The only concern would be that MySQL doesn’t support transactions for schema modifying commands. The only moment this could cause problems is when running South migrations during upgrades, if a migration fails the database structure is left in a transitory state and has to be reverted manually before trying again.
So I decided to run Mayan with the MySQL 8.0.16 instance I have on my box.

Everything runs well indeed except I get this error when clicking on "Recently Added documents":

Code: Select all

Error retrieving document list: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'").
Is it possible to update the documentation to state that Mayan will not fully work with MySQL, then?

Many thanks in advance,
User avatar
rssfed23
Moderator
Moderator
Posts: 213
Joined: Mon Oct 14, 2019 1:18 pm
Location: United Kingdom
Contact:

Re: MySQL Error 1235 on clicking Recently Added documents

Post by rssfed23 »

Mayan *should* work okay with mysql even though Postgresql is preferred (we do test mysql as a supported backend and explicitly test on mysql:8.0.3). Mayan is using a generic database connector that handles the DB side of things and should make no difference to the end user. However, recent documents is known to not work with the mysql backend: https://gitlab.com/mayan-edms/mayan-edms/issues/627

For my own personal curiosity; you're using the docker version if I recall. Is there a specific reason your using mysql over postgres?
Last edited by rssfed23 on Sat Dec 28, 2019 9:31 pm, edited 1 time in total.
Reason: Changed link to the non-duplicate bug report
Please bear with us during the current global situation. The team all have families and local communities to look after as well as the community here. Responses may be delayed during this time, but rest assured we will get to your query eventually.
maathieu
Posts: 15
Joined: Tue Dec 24, 2019 10:17 am

Re: MySQL Error 1235 on clicking Recently Added documents

Post by maathieu »

Hi,

I just had a MySQL instance already installed on my machine for other purposes. I thought I wouldn't burden the poor thing with 2 database engines running at the same time :D
michelw
Posts: 4
Joined: Mon Jun 01, 2020 9:34 am

Re: MySQL Error 1235 on clicking Recently Added documents

Post by michelw »

I encountered the same problem, for the same reason: I already had MySQL installed, and the server isn't that up-to-date wrt specs, so decided to re-use it. I created a PR with the fix, seemed simple enough: https://github.com/mayan-edms/Mayan-EDMS/pull/9
Please be gentle :D it's my first PR for Mayan.
User avatar
franco
Developer
Developer
Posts: 29
Joined: Sun Apr 05, 2020 2:30 am

Re: MySQL Error 1235 on clicking Recently Added documents

Post by franco »

Thank you for patch, it is very appreciated.

Sadly I don't think it will be merged, and here is the explanation:

The statement as it is, generates a single (although large) query. The Django ORM is smart enough to detect that when passing the primary key result of a query to another query, both can be merged as a subquery of the other. Splitting and collapsing of them using a loop (or similar like list comprehension) causes the following degradation:

1- The query is now split into two.
2- There is a loop whose size is the configured recent documents count.
3- Part of the query processing now runs on Python instead of the database.

Code: Select all

    # Current code (unrolled for clarity) 
        return Document.objects.filter(pk__in=Document.objects.order_by('-date_added')[:setting_recent_added_count.value].values('pk')).order_by('-date_added')

Code: Select all

    # Patch (unrolled for clarity)
        pks = [doc['pk'] for doc in Document.objects.order_by('-date_added')[:setting_recent_added_count.value].values('pk')]
        return Document.objects.filter(pk__in=pks).order_by('-date_added')
Here is an excerpt of the version 3.2 release notes to better explain the reasoning:
The permission inherited computation was improved and now operates mostly at the
database layer instead of being mostly a Python computation. A similar
improvement was done to the ACL calculation system. The ACL calculation now
operates mostly on the database layer freeing many resources and scaling better.
Moving query manipulation to the database lowered memory used dramatically for that release and allowed Mayan to run again on devices like single board computers and small VMs with 2GB or less of RAM. Also speeds things up because the computation task of processing the query result is now offloaded to the database instead of running it on the frontend.

Granted, the default value for recent documents is just 400 and this is not something that should be expected to be setup to a ridiculous high number. But, like almost every feature in Mayan, we much rather make it as open ended as possible because we don't know how many use cases there might be for this feature.

Like every patch and change it will be queued for review and we will discuss it further, just wanted to share some of the reasoning we use for code changes and show my appreciation for your contribution.

Thank you.
michelw
Posts: 4
Joined: Mon Jun 01, 2020 9:34 am

Re: MySQL Error 1235 on clicking Recently Added documents

Post by michelw »

Thanks for that detailed explanation. I also understand and appreciate the choice of the original code, and I agree that this is much better. The only issue of course is that it just doesn't work when using MySQL. Which is also fine, but I believe that this should be clearly noted somewhere, or even checked when starting/installing. Another option would be a check to see what database is used, and then to pick one or the other solution, but this entirely defeats the purpose of having an ORM of course. Ideally, the ORM would handle this transparently, not generating this subquery when MySQL is in use and taking the performance hit for granted, but my Python/Django-fu is definitely not up to the task of making a PR for _that_ solution!
Anyway, feel free to reject the PR, the explanation makes sense. If you believe that having a MySQL/not-MySQL check at this point is a good solution (I doubt this, but it would work of course ...) I can amend my patch to include that.
Post Reply