Question

Capturing one line from a multi-line OCR zone

asked on September 20, 2017

Hello all,

I have run into this problem a couple of times when scanning checks through Quick Fields. The problem is the way that regex matches new lines. Here is the example:

The pattern I have been using starts with ORDER OF, and I'm trying to capture only the first line. What I'm noticing is that \s also matches new lines, so if my pattern is "ORDER OF\s(\w+\s\w*\s?\w*\s?\w*)" it will jump to the next line. If I add /r/n it sometimes works and sometimes doesn't, because the OCR output will sometimes have /r/n and sometimes only /r. I have to account for vendors having multiple words in their name, which can be as short as AT&T or as long as Some Other Company That Has Long Name.
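The spill-over described above can be reproduced with a small sketch. It assumes Python's `re` module as a stand-in for the engine Quick Fields uses (which may differ in detail), and the OCR text and the helper name `capture_payee_naive` are made up for illustration.

```python
import re

# The pattern from the question. \s matches spaces, tabs, \r and \n alike,
# so the optional \s? pieces are free to step over a line break.
NAIVE = re.compile(r"ORDER OF\s(\w+\s\w*\s?\w*\s?\w*)")

def capture_payee_naive(text):
    """Return the first capture group, or None if nothing matched."""
    match = NAIVE.search(text)
    return match.group(1) if match else None

# Hypothetical OCR output: payee on one line, amount in words on the next.
ocr_text = "PAY TO THE ORDER OF ACME SUPPLY\r\nONE HUNDRED DOLLARS\r\n"

# repr() makes the captured \r\n visible in the output.
print(repr(capture_payee_naive(ocr_text)))
```

In this sketch the capture runs past the line break and picks up the first word of the following line, which is the behaviour the question describes.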

Does anyone have a pattern that works for this scenario? Thanks. 

Replies

replied on September 21, 2017

Hi Lucas,

Check your "/r/n". New line characters are normally written with backslashes: "\r\n".

 

I put the regex "ORDER OF\s(.*)\R\N" into https://regex101.com/ and it seems to bring back what you require. Laserfiche may require the lowercase \r\n to work, though.

 

Let me know how it goes.

 

Kind Regards,

Aaron

replied on February 26, 2018

I have had this issue come up again, and your pattern doesn't work anymore. What's really crazy is that I can get it to work with this:

PAY TO THE ORDER OF\s(.+?)\n

The question mark makes the + lazy, so the capture group stops at the first \n and stays on one line.
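For anyone comparing options, here is a minimal sketch of how that pattern behaves with different line endings, next to one possible alternative. It assumes Python's `re` module rather than the Quick Fields engine, so results there may differ. The `ONE_LINE` pattern and the `first_line_payee` helper are illustrative names, not something from this thread or from Laserfiche.

```python
import re

# Pattern from the post above: lazy capture that must be followed by \n.
LAZY = re.compile(r"PAY TO THE ORDER OF\s(.+?)\n")

# Alternative sketch (an assumption, not from the thread): capture every
# character that is neither \r nor \n, so the match ends at whichever
# line ending the OCR happened to produce.
ONE_LINE = re.compile(r"PAY TO THE ORDER OF[ \t]+([^\r\n]+)")

def first_line_payee(text):
    """Return the payee from the first line only, or None if not found."""
    match = ONE_LINE.search(text)
    return match.group(1).rstrip() if match else None

# Hypothetical OCR outputs covering the three line-ending styles.
samples = [
    "PAY TO THE ORDER OF AT&T\r\n123 MAIN ST\r\n",
    "PAY TO THE ORDER OF AT&T\r123 MAIN ST\r",
    "PAY TO THE ORDER OF Some Other Company\n123 MAIN ST\n",
]

for text in samples:
    lazy = LAZY.search(text)
    print(repr(lazy.group(1) if lazy else None), repr(first_line_payee(text)))
```

In Python, `.` matches `\r` but not `\n`, so the lazy pattern keeps a trailing `\r` when the line ends in `\r\n` and finds nothing when the text contains only `\r`. The negated character class avoids both cases and also accepts names containing characters such as `&`.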
