Question

Pattern Matching: Need help with an expression

asked on April 23, 2019

Hi,

I need help with a pattern matching expression. I don't know how to write it.

This is my text:

"some text1

123456 - some text2\r

789101 - some text3\r

some text4"

 

I need to capture "some text2" and "some text3".

I tried \d{6}(.*)\r, but it sometimes returns the wrong result because "some text2" can itself contain "\r".

How can I do this?

Thanks in advance.

 

Regards

 

 


Replies

replied on April 24, 2019

Do I understand correctly that your first capture item may or may not contain a line break?

But the first capture item is always between two six-digit numbers?

And the second capture item is always after the second six-digit number?

 

Sometimes the source looks like:

some text1
123456 - some text2
789101 - some text3
some text4

Sometimes the source looks like:

some text1
123456 - some
text2
789101 - some text3
some text4

You can use a multivalue token with 2 separate capture expressions:

Capture Expression 1:

\d{6}\s+-\s+(\S[\s\S]*)\r?\n?\d{6}

Capture Expression 2:

\d{6}\s+-\s+\S[\s\S]*\r?\n?\d{6}\s+-\s+(\S[^\r\n]+)

And used as shown below:
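To see what the two capture expressions return outside the product, here is a minimal Python sketch. It assumes Python's `re` module treats these particular constructs the same way the product's pattern matching does, and the helper name `capture_both` is made up for illustration. In Python the greedy `[\s\S]*` in Capture Expression 1 keeps the line break that precedes the second number, so the sketch strips it; whether the product trims it for you is not confirmed here.

```python
import re

# The two capture expressions from the reply, unchanged.
EXPR_1 = re.compile(r"\d{6}\s+-\s+(\S[\s\S]*)\r?\n?\d{6}")
EXPR_2 = re.compile(r"\d{6}\s+-\s+\S[\s\S]*\r?\n?\d{6}\s+-\s+(\S[^\r\n]+)")


def capture_both(source):
    """Hypothetical helper: return [first item, second item] for one source text."""
    first = EXPR_1.search(source)
    second = EXPR_2.search(source)
    return [
        # Greedy [\s\S]* also swallows the trailing line break, so strip it.
        first.group(1).rstrip("\r\n") if first else None,
        second.group(1) if second else None,
    ]


single_line = "some text1\r\n123456 - some text2\r\n789101 - some text3\r\nsome text4"
multi_line = "some text1\r\n123456 - some\r\ntext2\r\n789101 - some text3\r\nsome text4"

print(capture_both(single_line))
print(capture_both(multi_line))
```

Both sample sources yield two values, and in the second one the first value keeps its embedded line break.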

replied on April 23, 2019

\r is a special escape in pattern matching: it stands for the carriage return character, which means "return to the start of the line". Since there is a hidden carriage return after every line, the expression was picking up everything to the end of the line of text.

You will want to use \d{6}(.*)\\r for your pattern, where \\ matches the literal character "\" and r matches the literal character r.
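The difference between \r and \\r can be checked with a small Python sketch. It assumes Python's `re` module interprets these two escapes the same way the product's pattern matching does; the sample strings are made up for illustration.

```python
import re

# Text with real carriage returns, as a multi-line field would usually hold.
real_cr = "123456 - some text2\r\n789101 - some text3\r\n"
# Text that contains a literal backslash followed by the letter r.
literal = "123456 - some text2\\r"

cr_pattern = re.compile(r"\d{6}(.*)\r")        # \r  = carriage return character
literal_pattern = re.compile(r"\d{6}(.*)\\r")  # \\r = backslash, then the letter r

print(cr_pattern.findall(real_cr))       # matches on both lines
print(literal_pattern.findall(real_cr))  # no backslash in the text, so no match
print(literal_pattern.findall(literal))  # matches the literal "\r" text
```

So \\r only helps when the text really contains the two characters "\" and "r"; against real carriage returns it finds nothing.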

replied on April 23, 2019

Hi Chad,

 

Thank you for your help. Yes, I know about the special character "\r".

I used it in my example to show you where the carriage returns are. I don't want to capture "\r"; I just want to know how to capture the whole value when it already contains a carriage return.

 

Example :

Expression : \d{6}(.*)\r

 

Values :

123456 - some\r text2\r

789101 - some text3\r

 

Results :

some

some text3

 

What I want :

some\r text2

some text3

replied on April 23, 2019

Oh, I see. By default, . does not match \n, so (.*) stops at the end of the line. I think this means that if you want to capture a variable number of lines, you have to repeat the pattern many times, like this:

(.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?.*\n?)

\n means new line (line feed). In Windows-style line endings each \n is preceded by \r, which just means carriage return (return to the start of the line).

See this post

https://answers.laserfiche.com/questions/132269/Pattern-Matching-On-Multiple-Lines-of-Text
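As a possible alternative to repeating .*\n? many times, a character class such as [\s\S] matches any character including line breaks, and many regex engines offer a "dot matches newline" option. The sketch below uses Python's `re.DOTALL`; the .NET counterpart is `RegexOptions.Singleline`, but whether the product exposes such an option is not confirmed here. The sample text and variable names are made up for illustration.

```python
import re

text = "123456 - some\r\ntext2\r\n789101 - some text3"

# Default behaviour: "." does not match "\n", so the capture ends on line 1.
default_match = re.search(r"-\s+(.*)", text).group(1)

# [\s\S] matches any character at all, so the capture can cross line breaks.
# The lazy *? stops at the first line break followed by six digits.
class_match = re.search(r"-\s+([\s\S]*?)\r?\n\d{6}", text).group(1)

# With DOTALL, "." itself also matches "\n".
dotall_match = re.search(r"-\s+(.*?)\r?\n\d{6}", text, re.DOTALL).group(1)

print(repr(default_match))
print(repr(class_match))
print(repr(dotall_match))
```

Both multi-line variants rely on the next six-digit number to know where the value ends, which is the same idea as the two capture expressions in the accepted reply.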

replied on April 23, 2019

I tried your solution but it doesn't work.

I think I didn't express myself well.

 

I want to get the results as multiple values:

value 1 : some\r text2

value 2 : some text3

 

I can't use "\r\n" to end the expression because "some text2" can itself contain a carriage return. Example: "some \r\ntext2" returns only "some ".

replied on April 23, 2019

Hi Oliver, 

 

Try this 

(?:\d{6} ?[-] ?)(.*)(.*\n)*(?:\d{6} ?[-] ?)(.*)

 

?: marks a non-capturing group.

replied on April 23, 2019

Hi Oliver,

Try this:

(?:\d{6} ?[-] ?)(.*)(?=\\r)

 

Make sure you have 'All matches (as multi-value token)' selected.

replied on April 24, 2019

Hi Aaron,

 

Thank you for your help.

Sorry, your solution returns nothing for me.

I don't understand what this expression means.

replied on April 25, 2019

?: marks a non-capturing group, so (?:\d{6} ?[-] ?) finds six digits "\d{6}", followed by a space which may or may not be there " ?", followed by a literal dash [-], followed by another space which may or may not be there " ?". (You don't actually need the square brackets in this instance. - only acts as a range operator inside a character class; I put them in out of habit.)

 

The capture group (.*) means any number of any characters (except newline).

 

Lastly, "?=" is a positive lookahead, meaning it looks further into the string to check for a match without including it in the result. In this case we use "\\r" (the extra \ at the start is an escape character, so \r is taken literally as a backslash followed by r, instead of as a carriage return).
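The pieces described above can be tried out in a short Python sketch, assuming Python's `re` module handles non-capturing groups and lookaheads the same way the product's pattern matching does. The two sample strings are made up: one contains the literal characters "\r", the other contains real carriage returns.

```python
import re

# Non-capturing prefix, one capture group, lookahead for a literal backslash + r.
pattern = re.compile(r"(?:\d{6} ?[-] ?)(.*)(?=\\r)")

# Each line ends with the two literal characters "\" and "r".
literal_text = "123456 - some text2\\r\n789101 - some text3\\r\n"
# Each line ends with a real carriage return and line feed.
real_text = "123456 - some text2\r\n789101 - some text3\r\n"

print(pattern.findall(literal_text))  # only the capture group is returned
print(pattern.findall(real_text))     # nothing: there is no literal "\r" to find
```

The second call finding nothing is consistent with the report that the expression returned nothing when the text held real carriage returns.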

 
