Pattern match test differs from process result

asked on May 26, 2016



I'm trying to extract a section of OCR that could be either 3 or 4 characters long, and either letters, numbers, or a combination of the two.


You'll notice in the screenshot of the pattern match configuration that the test shows it should prefer the 4-digit number. This test text is pulled from the OCR result of an actual batch.


Here is the result from the actual batch itself.  You'll see my test data from PropNumZone, and you'll see that the pattern match for PropNum preferred 3 digits instead of 4.


Why is the test showing a different value than production, and is there a way to fix this issue?



replied on May 26, 2016

Hi Eric, 


I notice that there's a line break between "PropNum:" and "208" in your output pane screenshot. It looks like that line break is getting captured as part of the .{4} part of your pattern.

From this, I believe that the discrepancy you're seeing between test and runtime results is caused by a known issue (which will be fixed in Quick Fields 10) that causes the regex to run in multi-line mode during testing even though it runs in single-line mode during runtime. In multi-line mode (during testing), "." does NOT match the \n character. However, in single-line mode (and thus at runtime), the "." character matches everything, INCLUDING the \n character. 
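The difference in how "." treats a line break can be reproduced outside Quick Fields. The following is only a rough sketch in Python, whose `re` module is not the engine Quick Fields uses; the sample text and the `PropNum:(.{4})` pattern are hypothetical stand-ins, since the real pattern appears only in the screenshot. Python's `re.DOTALL` flag plays the role of what is described above as single-line mode.

```python
import re

# Hypothetical OCR text: a line break separates the label from the value.
ocr_text = "PropNum:\n2084"

# Hypothetical stand-in for the original pattern (assumption, not the real one).
pattern = r"PropNum:(.{4})"

# Default behaviour: "." does not match "\n", so the pattern cannot
# step over the line break and nothing is found in this text.
default_match = re.search(pattern, ocr_text)
default_value = default_match.group(1) if default_match else None

# With DOTALL: "." also matches "\n", so the line break uses up one of
# the four positions and only three characters of the value are captured.
dotall_match = re.search(pattern, ocr_text, re.DOTALL)
dotall_value = dotall_match.group(1) if dotall_match else None

print(repr(default_value))
print(repr(dotall_value))
```

In this sketch the second result holds the line break plus "208", which looks like the 3-digit value seen in the batch.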


So now the question is - how can you change the regex to work like you want it to at runtime? 

First, I recommend using \w instead of "." since, as you said, you're only looking to capture "letters, numbers, or a combination of the two". I'd also double-check the \n?\r? part. Most of the time you'll see \r?\n? instead. Perhaps something like Prop\r?\n?(\w{3,4}) will accomplish what you're trying to do.
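To see why \w avoids the problem, here is a small sketch, again in Python rather than the Quick Fields engine. The "PropNum:" label and the sample strings are assumptions made up for illustration. Because \w never matches a line break, the captured value should be the same whether or not "." is allowed to match "\n".

```python
import re

# Pattern in the shape suggested above, with a hypothetical "PropNum:" label.
suggested = r"PropNum:\r?\n?(\w{3,4})"

# Compile it twice: once with default flags, once with DOTALL
# (the flag only changes ".", which this pattern does not use).
plain = re.compile(suggested)
dotall = re.compile(suggested, re.DOTALL)

# Hypothetical OCR fragments: no break, LF, CRLF, and trailing text.
samples = [
    "PropNum:2084",
    "PropNum:\n2084",
    "PropNum:\r\nA12",
    "PropNum:\n20B4 more",
]

plain_values = [plain.search(s).group(1) for s in samples]
dotall_values = [dotall.search(s).group(1) for s in samples]

print(plain_values)
print(dotall_values)
```

Both lists come out identical here, and the optional \r?\n? absorbs the line break so it never ends up inside the captured group.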

replied on June 2, 2016

That worked!  Thanks Tessa


