Question

Pattern matching: using spaces as breaks to create tokens

asked on January 27, 2016

I have a line of data that came from OCRing a page in QF, and I need to break the line apart. I want to take the 1st set of numbers for one token, the 2nd set for another token, and so forth. How do I use the spaces (the number of which can vary) to break the line up?

The input line looks something like this:

 

1233    131911  11111333   222222

I've tried the following patterns without success, and have read everything I could find on Answers without luck.

(\d+) \d+ \d+ \d+     1st set

\d+ \d+ (\d+) \d+     3rd set

 

What I need is a token for 1233, one for 11111333, and one for the 4th set.

The spaces between the numbers are a natural break, but the number of spaces can vary.

 

Any help would be appreciated.   

 

 

 


Replies

replied on January 28, 2016

Hi Rebecca,

If you want all the numbers, you could just use \d+ and set "If multiple matches found" to "All matches as a multi-value token", which will give you every group of digits as a multi-value token.
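As a rough illustration of the "all matches" idea outside the product, here is a minimal sketch using Python's re module as a stand-in for the pattern matching engine (an assumption; the actual engine may differ in details). The variable names are made up for the example.

```python
import re

# Sample input line from the question; gaps between numbers vary in width.
line = "1233    131911  11111333   222222"

# \d+ matches each run of digits; findall collects every match in order,
# similar in spirit to an "all matches" multi-value result.
all_sets = re.findall(r"\d+", line)

print(all_sets)
```

Each element of the resulting list corresponds to one set of digits, in the order it appears on the line.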

Otherwise, I think you need to use the whitespace character class (\s) in your pattern rather than literal spaces. A literal space matches exactly one space, so a pattern written with single spaces fails whenever the gap is wider. You can quantify \s the same way you have done with the digits: \s+ will match one or more whitespace characters.
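To make the \s+ suggestion concrete, here is a small sketch that rewrites the patterns from the question with \s+ in place of the single spaces. It again uses Python's re module as a stand-in (an assumption, not the product's own engine), and the variable names are hypothetical.

```python
import re

# Sample input line from the question; gaps between numbers vary in width.
line = "1233    131911  11111333   222222"

# Same shape as the original attempts, but \s+ replaces each literal space
# so that any run of whitespace is accepted between the digit groups.
# The parentheses mark the one set to capture in each pattern.
first_set = re.search(r"(\d+)\s+\d+\s+\d+\s+\d+", line).group(1)
third_set = re.search(r"\d+\s+\d+\s+(\d+)\s+\d+", line).group(1)
fourth_set = re.search(r"\d+\s+\d+\s+\d+\s+(\d+)", line).group(1)

# The original single-space version finds nothing on this line,
# because no two digit groups are separated by exactly one space.
original_attempt = re.search(r"(\d+) \d+ \d+ \d+", line)

print(first_set, third_set, fourth_set, original_attempt)
```

Only the position of the capturing parentheses changes between the three patterns, so the same approach extends to any of the sets.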

Cheers, Dan
