New to Mayan - use case questions

stevemgrimes
Posts: 7
Joined: Sat Jun 01, 2019 10:43 pm
Location: San Bernardino, CA

Post by stevemgrimes » Sat Jun 01, 2019 11:02 pm

Hello:

I am currently working on a project to convert my organization away from a very old (actually, EOL'd) version of ApplicationXtender, which has been in use since 2000. We have a fairly narrow set of requirements for a document management system, and it appears that something open source like Mayan may be suitable for our needs.

What I've not been able to discern from the documentation / website is whether the software system is capable of dealing with large numbers of documents and clients on a network. Our document count is in the neighborhood of 35,000 documents, and we have a dedicated scanning department (approx 10 users) and a client-consumer base of about 600 users.

I am writing the routine to extract and migrate the documents and metadata out of ApplicationXtender, and can have them readied for a new system in pretty much any format.

Some initial questions:
* will Mayan work at an enterprise level with the user and document base I've described?
* can Mayan use MS SQL, or is it PostgreSQL / MySQL only?
* what would a file/data migration to Mayan look like?
* I don't believe I saw anything in the documentation / website regarding barcode / QR reading - is that being considered for future development?
* Are there any articles / documents that could help me jump into the code base? I've been doing quite a bit of Python work lately, and would be very interested in seeing how your system works under the hood.


Thank you!

Steve Grimes
Business Systems Analyst
Inland Regional Center, San Bernardino, CA (SoCal)

rosarior
Posts: 340
Joined: Tue Aug 21, 2018 3:28 am

Re: New to Mayan - use case questions

Post by rosarior » Mon Jun 03, 2019 1:12 am

The reason there are no performance numbers in the documentation or on the website is that they have been misinterpreted as guaranteed performance metrics in the past. In one example, an organization was able to host several million documents using an Intel NUC style device and connection pooling with asymmetric replication and a SAN. Some users read only the part that said "millions of documents" and complained when they didn't get the same scalability from their Raspberry Pis.

Instead we now discuss some performance details here in the forums where the medium allows more context.
From experience I would expect that with an entry level Intel or AMD server (SuperMicro for example), 35,000 documents should be easy to host, even with just the stock Docker image. If the 600 users are concurrent, some performance tuning might be needed though.

As an example of performance, I personally run Mayan EDMS at home on a solar-powered Odroid C2 (costs $50, ARM CPU with 2GB of RAM). The single board computer runs the entire stack: Mayan, PostgreSQL, OCR. It hosts 1652 documents with a total of 26972 pages.

- https://magazine.odroid.com/article/sol ... croserver/
- https://medium.com/@siloraptor/building ... aafdeb2f17
- https://twitter.com/MayanEDMS/status/11 ... 3833618437

Even old versions scale well. For example, the state of California hosts 4945 documents online (https://cpucadviceletters.org/documents/list/recent/) using a very old version, version 1.0 rc3 (https://cpucadviceletters.org/about/). Version 1.0 was released in 2014 (https://docs.mayan-edms.com/releases/1.0.html). We are at 3.2 beta 1 now, which includes many memory and CPU optimizations (viewtopic.php?f=2&p=1226#p1226).

The average size of the installations we manage via professional services, on adequate servers or cloud deployments, is between 200,000 and 500,000 documents.

Migrations from other systems are done via professional services. It would be impossible to document the migration process in writing, as it is unique for every client and for every legacy system. We haven't done a migration from ApplicationXtender yet, but we do have one scheduled for the end of this month: we will travel to a client in South Carolina to do an 8-million-document migration from ApplicationXtender. Besides documents, ApplicationXtender applications will be migrated too, as Mayan cabinets. We plan to learn as much as possible from this migration, as it seems Mayan EDMS is becoming a popular transition choice for ApplicationXtender users.

* will Mayan work at an enterprise level with the user and document base I've described?
Yes, given the hardware caveats described above.

* can Mayan use MS SQL, or is it PostgreSQL / MySQL only?
It can work with SQLite (for development, testing, single user), MySQL, MS SQL, PostgreSQL, IBM DB2, Firebird and even ODBC. The best tested database manager and the one we recommend is PostgreSQL.
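As a rough illustration of what choosing a backend involves: Mayan is a Django project, so the database choice comes down to a Django-style `DATABASES` setting. The sketch below assumes a standard Django settings layout. The helper name `database_settings` and every value passed to it (database name, user, password, host) are placeholders, and how the setting is actually supplied to a given Mayan release should be checked against the documentation for that version.

```python
def database_settings(engine, name, user, password, host, port):
    """Build a Django-style DATABASES dict (hypothetical helper, placeholder values)."""
    return {
        "default": {
            "ENGINE": engine,      # dotted path of the Django database backend
            "NAME": name,          # database name
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
    }


# PostgreSQL, the recommended backend, uses Django's built-in engine.
DATABASES = database_settings(
    engine="django.db.backends.postgresql",
    name="mayan",              # placeholder
    user="mayan",              # placeholder
    password="change-me",      # placeholder
    host="db.example.internal",  # placeholder
    port="5432",
)
```

Django does not ship an MS SQL backend, so that option would additionally need a third-party backend package installed and its engine path used in `ENGINE`.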

* what would a file/data migration to Mayan look like?
For migrations we do on-site consultation and migration jobs, working with customers to build custom migration code suited to the data being migrated and to the source database holding the documents and metadata. We also train the staff on best practices, scalability tune-ups, operations, customizations and API integrations.
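To give a feel for the kind of preparation step such custom migration code might start from, here is a minimal sketch that turns a legacy export into one record per document. It is not Mayan's import tooling: the CSV layout, the column names `file_path` and `document_type`, and the function `build_manifest` are all assumptions made for illustration.

```python
import csv
import io

# Columns that describe the document itself (hypothetical names);
# every other column is treated as a metadata field.
RESERVED_COLUMNS = ("file_path", "document_type")


def build_manifest(csv_text):
    """Parse exported CSV text into a list of per-document records."""
    manifest = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        manifest.append({
            "file_path": row["file_path"],
            "document_type": row["document_type"],
            # Keep only metadata columns that actually have a value.
            "metadata": {
                key: value
                for key, value in row.items()
                if key not in RESERVED_COLUMNS and value
            },
        })
    return manifest


# Made-up sample export, for illustration only.
sample_export = (
    "file_path,document_type,client_id,scan_date\n"
    "/export/0001.pdf,Intake,1001,2001-03-04\n"
    "/export/0002.tif,Invoice,,2002-01-01\n"
)
manifest = build_manifest(sample_export)
```

A neutral manifest like this keeps the extraction side independent of however the documents end up being loaded into the new system.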

* I don't believe I saw anything in the documentation / website regarding barcode / QR reading - is that being considered for future development?
Yes, it is planned for general use in an upcoming version, possibly 4.0 or 4.1. We have some beta implementations already being tested. Along with QR code and barcode reading we will ship Zone OCR, as both features depend on the same underlying services.

* Are there any articles / documents that could help me jump into the code base? I've been doing quite a bit of Python work lately, and would be very interested in seeing how your system works under the hood.
For custom apps and development work check out the chapter: https://docs.mayan-edms.com/topics/development.html
For API and integration: https://docs.mayan-edms.com/topics/integration.html
Every once in a while we publish technical blog posts: https://www.mayan-edms.com/post/mayan-converter/
Depending on the feedback, a self-published book covering all aspects of Mayan is a possibility, as well as a professional certification program and a series of hosted talks at venues.

daniel1113
Posts: 24
Joined: Tue Aug 21, 2018 2:32 pm

Re: New to Mayan - use case questions

Post by daniel1113 » Mon Jun 03, 2019 1:51 pm

Roberto:

I too am working on a large enterprise deployment of Mayan to replace an antiquated ApplicationXtender setup. We're hitting a volume of data where we may be interested in paying for your professional services to bring Mayan up to scale. How can we get in touch to discuss? Thanks.

stevemgrimes
Posts: 7
Joined: Sat Jun 01, 2019 10:43 pm
Location: San Bernardino, CA

Re: New to Mayan - use case questions

Post by stevemgrimes » Wed Jun 05, 2019 3:59 pm

@Roberto - thank you so much for the detailed reply. I would very much like to continue this conversation - are you available for a phone consultation?

We're running a fairly robust data center here, so I'm not too concerned about computing and storage needs. We're fully virtualized with a SAN and a stout network. No Raspberry Pis, save for my own playing/home dev work :D

Our user base is in the 600-700 realm, but that's total count - concurrent would likely be under 100 at any given time.

I looked at the CPUC links you sent. Very nice look and feel. I am concerned a bit about the performance, as pulling up docs seems to take quite a bit of time. Is that site heavily used, or am I just seeing a poor implementation for today's needs, and the newer software on a good back-end would run much faster?

Good to see I've got company regarding migration from AppXtender. I am very interested in hearing how things go in South Carolina.

Once I get Mayan running here and start discussing more with my IT manager and our systems/network guy, we'll talk more about database (PostgreSQL vs MS SQL). We've got two large production MS SQL servers running with multiple databases as well as cloud replication happening, and I'm pretty sure they'll want me to focus on using that as the back-end, rather than bringing something new in that would require alternative capacity and DR.

Interested in discussing consultation. I've done a lot of work getting migration software together from AX, so not sure I'd need a lot of help with that, but I would likely need help understanding the Mayan platform and what would need to be done to migrate the docs, metadata, application/cabinet info, etc. in - whether it be an import utility that already exists or if moving doc files and writing data to the new database is the way to go.

If we were to use Mayan as our new EDMS, I would very much like to see the book and training you mention. We'd also want paid support.

I look forward to hearing from you soon.

Thanks!
Steve Grimes
Business Systems Analyst
Inland Regional Center
San Bernardino, CA

stevemgrimes
Posts: 7
Joined: Sat Jun 01, 2019 10:43 pm
Location: San Bernardino, CA

Re: New to Mayan - use case questions

Post by stevemgrimes » Wed Jun 05, 2019 4:00 pm

daniel1113 wrote:
Mon Jun 03, 2019 1:51 pm
Roberto:

I too am working on a large enterprise deployment of Mayan to replace an antiquated ApplicationXtender setup. We're hitting a volume of data where we may be interested in paying for your professional services to bring Mayan up to scale. How can we get in touch to discuss? Thanks.

Daniel - I would be interested in talking with you about your plans and thoughts on this project. Please let me know if you're available sometime soon.

My email address is sgrimes@inlandrc.org.

Thank you
Steve Grimes
Business Systems Analyst
Inland Regional Center
San Bernardino, CA

rosarior
Posts: 340
Joined: Tue Aug 21, 2018 3:28 am

Re: New to Mayan - use case questions

Post by rosarior » Wed Jun 05, 2019 5:56 pm

That is an impressive setup!

100 concurrent users is a much more workable load.

To say that the CPUC setup is ancient is not an exaggeration. They are not using distributed tasks and, as you noted, there is a response time penalty. I found out that the CPUC was using Mayan from a user. We didn't do that installation. The more recent versions are light years ahead of the pre-1.0 version they are using. We've approached them about upgrading to a more recent version but have not received a reply yet.

We've done a lot of research into scalability of task distribution, including locking, race conditions, failure recovery and even graceful decay in case of infrastructure failures. Some of it can be found in this PyCon talk I did a few years ago: https://www.youtube.com/watch?v=0UJTG5QU7Ss

The Docker image we distribute also makes compromises in favor of ease of use. It is meant as an introduction to Mayan with the fewest deployment steps. The recommended starting point for scalability is the advanced deployment (https://docs.mayan-edms.com/chapters/de ... deployment), with further adjustments for scalability (https://docs.mayan-edms.com/topics/admi ... scaling-up).

This is our first foray into a migration from ApplicationXtender and one of the goals is to end up with a migration utility that can be reused.

I'll send you further details about the consultation work we are doing.

Thank you.
