Hello there,
I have trouble accessing the content of a document to determine whether to add a tag.
I tried it with the document.latest_version.content variable, but I guess this is not accessible from a workflow.
I have a workflow which enters a state when OCRing is finished. Then I would like to look for certain words and add a tag if they are in the content.
Unfortunately I have no clue how I can access the OCR content from within the workflow state action.
Any help is greatly appreciated.
Best regards
Workflows - Access OCR content for condition for adding tags
Re: Workflows - Access OCR content for condition for adding tags
I just realized that I posted the question in the wrong forum. Sorry about that!
Re: Workflows - Access OCR content for condition for adding tags
And now I feel pretty stupid.
Of course I can access everything from workflow_instance.document.latest_version.ocr_content.
I didn't realize I was not using the workflow_instance entrypoint there.
Re: Workflows - Access OCR content for condition for adding tags
Hello,
I'm new here and I'm starting to use Mayan EDMS just now, so please forgive me if I'm asking something that sounds obvious, but I am trying to do this same thing: a workflow for adding tags to documents depending on the presence of certain words in their OCR content.
I really don't understand how the action condition should be written.
I've tried things like:
{% if "my word" in workflow_instance.document.latest_version.ocr_content %}True{% endif %}
and many variations of this, but I don't seem to find a way. The action is not performed.
Could anyone please give me a hint of what I'm doing wrong?
Thanks
Luigi
Re: Workflows - Access OCR content for condition for adding tags
lonestar wrote: ↑Fri Jun 12, 2020 10:06 pm
Hello,
I'm new here and I'm starting to use Mayan EDMS just now, so please forgive me if I'm asking something that sounds obvious, but I am trying to do this same thing: a workflow for adding tags to documents depending on the presence of certain words in their OCR content.
I really don't understand how the action condition should be written.
I've tried things like:
{% if "my word" in workflow_instance.document.latest_version.ocr_content %}True{% endif %}
and many variations of this, but I don't seem to find a way. The action is not performed.
Could anyone please give me a hint of what I'm doing wrong?
Thanks
Luigi

Stuck here too. Can anyone help please?
Tried things like:
{% if "try" in workflow_instance.document.latest_version.ocr_content %}
{% endif %}
and:
{% if "try" in workflow_instance.document.latest_version.ocr_content %}True{% endif %}
Re: Workflows - Access OCR content for condition for adding tags
I found that applying the Django filter join to workflow_instance.document.latest_version.ocr_content worked for me:
{% if "my word" in workflow_instance.document.latest_version.ocr_content|join:" " %}True{% endif %}
P.S.: The Template Sandbox (in the document view) really helps with figuring out templates.
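A plain-Python sketch of why the join seems to make the difference. It assumes, as a later reply in this thread states, that ocr_content yields one string per page, and that the template "in" operator behaves like Python's "in". The function name and the sample pages are made up for illustration; this is not Mayan EDMS code.

```python
def ocr_content():
    """Hypothetical stand-in for the OCR content: yields one string per page."""
    yield "Invoice number 123"
    yield "Please pay within 30 days"


# Against a generator, `in` compares whole items (here: whole pages),
# so a word that merely appears inside a page is not found.
word_in_pages = "Invoice" in ocr_content()

# After joining the pages into one string, `in` becomes a substring test.
word_in_text = "Invoice" in " ".join(ocr_content())

print(word_in_pages, word_in_text)
```

Under these assumptions, the unjoined condition would only match when a whole page is exactly equal to the search word, which would explain why the action never fired.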
Re: Workflows - Access OCR content for condition for adding tags
Expanding on fgdutoit's solution:

fgdutoit wrote: ↑Fri Aug 14, 2020 4:27 am
I found that aplying the Django filter join to workflow_instance.document.latest_version.ocr_content worked for me
{% if "my word" in workflow_instance.document.latest_version.ocr_content|join:" " %}True{% endif %}
P.S:The Template Sandbox (in document view) really help with figuring out templates

Since "ocr_content" is a generator that returns the OCR content of each page, you can save a bit of memory and get a potential speed boost by using:
{% for page_ocr in workflow_instance.document.latest_version.ocr_content %}
{% if "my word" in page_ocr %}True{% endif %}
{% endfor %}
This iterates over the OCR content of each page instead of joining all the OCR content in a single string, which can be more efficient for documents with a large number of pages.
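The page-by-page idea can be sketched in plain Python as follows. The generator, the helper name and the sample pages are hypothetical and only stand in for the per-page OCR content. Unlike the template loop, which visits every page, any() stops at the first page that matches.

```python
def ocr_pages(log):
    """Hypothetical page generator; records which pages were actually read."""
    sample_pages = ["cover sheet", "my word appears here", "appendix"]
    for number, text in enumerate(sample_pages, start=1):
        log.append(number)
        yield text


def contains_word(pages, word):
    # Checks one page at a time; no joined copy of the whole text is built.
    return any(word in page for page in pages)


read_pages = []
found = contains_word(ocr_pages(read_pages), "my word")
print(found, read_pages)
```

In this sketch the third page is never read, because the word is already found on the second one.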
Re: Workflows - Access OCR content for condition for adding tags
2 years later, fresh install, new shot. Same problem.
The workflow exists and works until tagging.
The "add tag" action won't add a tag with any condition that includes OCR.
The document's OCR content looks fine in the UI.
Tried:
{% if "word" in document.latest_version.ocr_content|join:" " %}True{% endif %}
{% for page_ocr in workflow_instance.document.latest_version.ocr_content %}
{% if "my word" in page_ocr %}True{% endif %}
{% endfor %}
{% if "my word" in workflow_instance.document.latest_version.ocr_content|join:" " %}True{% endif %}
What's the point I'm missing?
The sandbox result for all these conditions is empty.
EDIT: Shouldn't "{{ document.versions__version_pages__ocr_content__content }}" give some result in the sandbox? The result is empty.
Re: Workflows - Access OCR content for condition for adding tags
Hello,
what worked for me was:
{% if "word" in workflow_instance.document.ocr_content|join:" " %}True{% endif %}
You can even filter with "and" for multiple strings (each string needs its own "in" test):
{% if "word1" in workflow_instance.document.ocr_content|join:" " and "word2" in workflow_instance.document.ocr_content|join:" " %}True{% endif %}
To test these in the sandbox of a document, just remove the "workflow_instance." part; you only need it in the workflow.
But it was important to add a "Transition" in the workflow config for this to work.
So try creating a transition with the same origin and destination state (if you only have one state, like me).
In the transition you can enable a trigger, e.g. "Document version OCR finished".
Hope this helps.
Kind regards
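A note on checking several words in one condition, sketched in plain Python. As far as I can tell, the template "if" tag follows the same operator precedence as Python here, so "in" binds tighter than "and". The sample text is made up for illustration.

```python
text = "this text mentions word2 only"

# Parsed as: "word1" and ("word2" in text). A non-empty string literal is
# truthy, so only the second word is actually tested.
chained = bool("word1" and "word2" in text)

# Each word gets its own membership test.
explicit = "word1" in text and "word2" in text

# Equivalent form that scales to a longer list of words.
all_words = all(word in text for word in ("word1", "word2"))

print(chained, explicit, all_words)
```

The chained form reports a match even though "word1" is missing from the text, while the two explicit forms do not.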