[SOLVED] Weird date issue

Tim
Posts: 2
Joined: Wed Feb 03, 2021 12:08 pm

Post by Tim »

I have a number of documents that routinely come to me with the date in yyyy-mm-dd format in the file name, and thus, once imported, in the document label. I created a workflow which conducts a regex search for this date and adds it as metadata in a receipt_date metadata field. That part seems to work great. The problem comes when I try to index from this field.

In the sandbox, {{ document.metadata_value_of.receipt_date }} returns the expected yyyy-mm-dd with no other visible characters. But in an index, {{ document.metadata_value_of.receipt_date|slice:"0:4" }} returns only 20, the first two characters of the date. If I manually click edit metadata on a file, change nothing, tick the check box for this metadata field, and click edit, it works as expected in the indexes, with the above returning the four-digit year.

Any ideas? I have about 3000 documents formatted this way, so manually opening each and “editing” the metadata is less than ideal.
Last edited by Tim on Wed Feb 03, 2021 8:50 pm, edited 1 time in total.
Re: Weird date issue

Post by Tim »

Ok. I figured it out.

My Workflow action adding the metadata was as follows:

{% regex_match "([0-9]{4}[-/]?((0[13-9]|1[012])[-/]?(0[1-9]|[12][0-9]|30)|(0[13578]|1[02])[-/]?31|02[-/]?(0[1-9]|1[0-9]|2[0-8]))|([0-9]{2}(([2468][048]|[02468][48])|[13579][26])|([13579][26]|[02468][048]|0[0-9]|1[0-6])00)[-/]?02[-/]?29)" document.label as m %}
{{ m.0 }}
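What was happening is that this essentially adds a hidden \r\n to the beginning of the stored value: the line break between the {% regex_match %} tag and {{ m.0 }} gets rendered as part of the output. That would explain why slice:"0:4" showed only 20, since the first two of its four characters were the \r\n. I'm not sure why this happens; maybe I'm too new and that's expected.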

{% regex_match "([0-9]{4}[-/]?((0[13-9]|1[012])[-/]?(0[1-9]|[12][0-9]|30)|(0[13578]|1[02])[-/]?31|02[-/]?(0[1-9]|1[0-9]|2[0-8]))|([0-9]{2}(([2468][048]|[02468][48])|[13579][26])|([13579][26]|[02468][048]|0[0-9]|1[0-6])00)[-/]?02[-/]?29)" document.label as m %}{{ m.0 }}
Putting the tag and {{ m.0 }} on the same line, as above, causes the date to be filled in correctly and solves the issue.
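
For anyone hitting the same thing, here is a minimal Python sketch of the symptom. It is not Mayan's code, only plain string slicing, which is what the slice filter boils down to. The sample date and the variable names are made up for illustration, and it assumes the stored value really did begin with \r\n.

```python
# Hypothetical stored value: the template's line break landed in front of the date.
stored = "\r\n" + "2021-02-03"

# Equivalent of |slice:"0:4": the first four characters, whitespace included.
sliced = stored[0:4]
print(repr(sliced))          # the two line-break characters plus "20"
print(repr(sliced.strip()))  # only "20" is visible once whitespace is dropped

# With the leading line break removed, the same slice yields the year.
clean = stored.strip()
print(clean[0:4])
```

The line break is invisible when the value is displayed, which is probably why the sandbox output looked correct.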

Also, I know the regex is probably overly complicated, in that it will pick up dates clearly not intended for this application. I spent a few days assuming my regex inexperience was the issue and tried as many permutations of the yyyy-mm-dd search pattern as I could. The regex itself doesn't seem broken, so I'm not going to fix it.