
Question

Quick Fields Pattern Matching works, but value in field does not match

asked on July 3, 2015

I am using Pattern Matching with OmniPage Zone OCR to get an Invoice Date out of a detailed section on the invoice, and I am not getting the expected results.

  1. My pattern works in testing (it is designed to target the Invoice date only, drop the leading "Invoice date" words, and leave only the date), but the resulting field value includes the words. Can anyone tell me what I'm doing wrong?
  2. It is supposed to pick out "Invoice date" in the Zone area, but the captured data includes all four lines in the Zone, and the pattern matching token leaves everything it captured from the first line in the field value (my target is on the fourth line).

 

This screen capture (below) shows where the pattern matching is in the page processing, shows the area of the sample page it is looking at, and shows the Pattern Matching window open after the successful Test run. 

This second screen capture shows the output results for the Zone OCR, and what the results are in the field, which has the token for the pattern matching in it:

Thx, Connie Prendergast, Flagstaff County

Replies

replied on July 6, 2015

Hi Connie,

From the screenshot in the original post, it looks like the OCR is picking up the lines on the page and adding them to the text as pipe symbols ("|").

Try adding a local enhancement to remove vertical lines and see if it removes them from your OCR text.

Alternatively, you can try moving your Zone OCR box so that it does not capture the titles of the fields, as it looks like there is pretty good spacing there. Then just extract each line as the actual field and perform pattern matching for the dates.

replied on July 6, 2015

Thanks, John, that is exactly what I had to do.  I actually had them in there, but the one that was intended for vertical was accidentally left on the Horizontal setting.  Once I changed it, it was no longer collecting those lines.

Any ideas why my Pattern Match activity is coming up empty, even though the Test (as in the above printscreen) is coming out correctly?

Thx, Connie

replied on July 6, 2015

It's likely that your pattern match didn't capture the data you wanted, so it defaulted to the invoice date token.

When you are testing the pattern matches, I recommend using the text captured by your Zone OCR process as your test value. You can retrieve this by testing the Zone OCR process and copying the text captured in the output pane.

Hopefully that will give you an idea of where it may be going wrong. Also, it's always best to use "\s" for all space characters, and only one set of round brackets is required for your pattern: (\w{3,10}\s*\d{1,2}\s*\d{4}).
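As a rough illustration of the single-group pattern, here is a minimal sketch in Python, with the range quantifier written as {3,10}. Python's `re` module is only a stand-in for the Quick Fields pattern engine, which may behave differently, and the sample OCR text is hypothetical rather than taken from the invoice in this thread.

```python
import re

# Hypothetical Zone OCR output; the real text would be copied
# from the Zone OCR output pane.
ocr_text = (
    "Invoice No: A-1001\n"
    "Terms: Net 30\n"
    "Due: On receipt\n"
    "Invoice date: July 3 2015"
)

# One capture group around the date: month name, day, four-digit year.
date_pattern = re.compile(r"(\w{3,10}\s*\d{1,2}\s*\d{4})")

match = date_pattern.search(ocr_text)
invoice_date = match.group(1) if match else None
print(invoice_date)

# In Python, "{3-10}" is not a quantifier and is read as literal
# characters, so this variant finds nothing in the sample text.
broken = re.search(r"(\w{3-10}\s*\d{1,2}\s*\d{4})", ocr_text)
print(broken)
```

Testing against the full multi-line OCR text, rather than a hand-typed sample line, is what shows whether the pattern picks the date out of the fourth line.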

 

Good luck!

replied on July 6, 2015

Like Cathy said, your pattern did not match anything. The checkbox labeled "Use the input value as the token value" (when no matches are found) at the bottom of the Pattern Matching properties dialog is why you're seeing the entire Zone OCR value in the field.
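The effect of that checkbox can be sketched roughly as follows. This is a hypothetical Python model of the behavior described above, not the Quick Fields implementation; the function name `token_value`, its parameter names, and the sample text are all made up for illustration.

```python
import re

def token_value(pattern, input_text, use_input_when_no_match=True):
    """Hypothetical model of the token logic; not the Quick Fields API."""
    match = re.search(pattern, input_text)
    if match:
        # A match was found: the token holds only the captured group.
        return match.group(1)
    # No match: either fall back to the whole input or return nothing.
    return input_text if use_input_when_no_match else ""

# Hypothetical Zone OCR text and a pattern that does not match it.
zone_text = "Invoice No: A-1001\nInvoice date: July 3 2015"
no_match_pattern = r"Invoice Date:(\d{4}-\d{2}-\d{2})"

with_fallback = token_value(no_match_pattern, zone_text)
without_fallback = token_value(no_match_pattern, zone_text, False)
print(repr(with_fallback))
print(repr(without_fallback))
```

With the fallback on, a failed match is easy to mistake for a match that captured too much, because the field still fills with text.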

replied on July 6, 2015

Okay, so I used Cathy's suggestion for the pattern and for copying the Zone OCR results into the pattern matching test area. I've done that and can show that the pattern is, in fact, returning the right result. However, when I close that window and then run Test current process on the Pattern Matching set, it still comes up blank. Why?

 

replied on July 6, 2015

Try adding a \s* everywhere there can be a space, including before and after the colon as well as inside the title text, since from the most recent screenshot it looks like it is grabbing a space. The pattern would start like Invoice\s*Date\s*:\s*. Also turn off match case. The rest of your pattern matching looks good.
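To show what the looser label prefix tolerates, here is a minimal sketch in Python. It assumes `re.IGNORECASE` is a fair stand-in for turning off match case, combines the prefix with the date group suggested earlier in the thread, and uses hypothetical OCR variants rather than real output.

```python
import re

# Flexible whitespace around and inside the label, case-insensitive,
# with a single capture group for the date itself.
pattern = re.compile(
    r"Invoice\s*Date\s*:\s*(\w{3,10}\s*\d{1,2}\s*\d{4})",
    re.IGNORECASE,
)

# Hypothetical ways the OCR engine might render the same line.
variants = [
    "Invoice date: July 3 2015",
    "Invoice date : July 3 2015",
    "InvoiceDate:July 3 2015",
    "INVOICE  DATE:  July 3 2015",
]

results = []
for line in variants:
    found = pattern.search(line)
    results.append(found.group(1) if found else None)

print(results)
```

Because the label sits outside the round brackets, only the date ends up in the captured value.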

 
