Question

Need help with a pattern matching expression

asked on March 27, 2018

Hi all,

 

I have a string that can take one of two forms:

1°) N° RCS 123456A N° TAHITI AB123456 CODE APE ANCIEN 1234X CODE APE NOUVEAU 1234Y N° CPS (employeur)

 

2°) N° RCS 123456A N° TAHITI 234AB CODE APE 1234X N° CPS (employeur)

 

I want to extract only the string highlighted in bold (AB123456).

The length and structure of this string can vary.

The string is always preceded by TAHITI, so the beginning is easy, but the end can take two forms:

1°) "CODE APE",

or

2°) "CODE APE ANCIEN ..... CODE APE NOUVEAU"

 

In the first case, I can use the expression "TAHITI\s?(.*[^\s])\s?CODE".

But it doesn't work in the second case, because "CODE" appears twice in the string.

My question is: what expression would work for both cases?

Thanks in advance.

Regards

0 0

Answer

SELECTED ANSWER
replied on March 28, 2018

You can try a non-greedy expression like this:

TAHITI\s*(.*?)\s*CODE

If that does not work, you should be able to use two expressions ORed together, like this:

TAHITI\s*(.*\S)\s*CODE\s*APE\s*ANCIEN|TAHITI\s*(.*\S)\s*CODE

Put the "CODE APE ANCIEN" alternative first: when you OR regular expressions together, you always want to order them from most restrictive to least restrictive.
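To see how the expressions above behave on the two sample strings from the question, here is a minimal sketch. It uses Python's re module purely for illustration, on the assumption that the engine in your product treats greedy and non-greedy quantifiers the same way. The helper first_group is a hypothetical name, needed only because the ORed expression has two capture groups and just one of them takes part in any given match.

```python
import re

line_1 = ("N° RCS 123456A N° TAHITI AB123456 CODE APE ANCIEN 1234X "
          "CODE APE NOUVEAU 1234Y N° CPS (employeur)")
line_2 = "N° RCS 123456A N° TAHITI 234AB CODE APE 1234X N° CPS (employeur)"

# Expression from the question: greedy, so it runs up to the LAST "CODE".
greedy = re.compile(r"TAHITI\s?(.*[^\s])\s?CODE")
# Non-greedy expression: stops at the FIRST "CODE".
lazy = re.compile(r"TAHITI\s*(.*?)\s*CODE")
# ORed expression: most restrictive alternative first.
either = re.compile(
    r"TAHITI\s*(.*\S)\s*CODE\s*APE\s*ANCIEN|TAHITI\s*(.*\S)\s*CODE"
)


def first_group(match):
    """Return the capture group that actually took part in the match."""
    return next(g for g in match.groups() if g is not None)


for line in (line_1, line_2):
    print("greedy:", greedy.search(line).group(1))
    print("lazy:  ", lazy.search(line).group(1))
    print("either:", first_group(either.search(line)))
```

On the first string the greedy expression captures everything up to the last "CODE", while the other two stop right after the value that follows TAHITI.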

2 0
replied on March 28, 2018

Hi Bert

 

I was thinking of something like that but my expression was wrong XD

Thank you very much.

 

1 0

Replies

replied on March 27, 2018

Olivier,

N° RCS 123456A N° TAHITI AB123456 CODE APE ANCIEN 1234X CODE APE NOUVEAU 1234Y N° CPS (employeur)

These regexes work for me, as long as there are no whitespace characters in your bolded text string:

TAHITI\s*(\w+)

TAHITI\s*(\S+)
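A small sketch of where these two expressions differ, again using Python's re module only as an illustration. The strings with_space and with_hyphen are hypothetical variants that do not come from the thread, and capture is a helper name made up for this example.

```python
import re

line = ("N° RCS 123456A N° TAHITI AB123456 CODE APE ANCIEN 1234X "
        "CODE APE NOUVEAU 1234Y N° CPS (employeur)")
# Hypothetical variants, not taken from the thread:
with_space = "N° RCS 123456A N° TAHITI AB 123456 CODE APE 1234X N° CPS (employeur)"
with_hyphen = "N° RCS 123456A N° TAHITI AB-123456 CODE APE 1234X N° CPS (employeur)"

word_re = re.compile(r"TAHITI\s*(\w+)")   # letters, digits, underscore
token_re = re.compile(r"TAHITI\s*(\S+)")  # anything that is not whitespace


def capture(regex, text):
    """Return group 1 of the first match, or None if there is no match."""
    m = regex.search(text)
    return m.group(1) if m else None


for sample in (line, with_space, with_hyphen):
    print(capture(word_re, sample), "|", capture(token_re, sample))
```

Both expressions stop at the first space, so a value containing a space would be cut short. Only \S+ keeps punctuation such as a hyphen.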

 

1 0
replied on March 28, 2018

Hi Cliff, thank you for your reply.

This is what I did, but I'm not sure the bolded text string never contains a space.

Neither is my customer XD

But for the moment, I'm using this expression.

 

Thanks anyway.

0 0
replied on March 28, 2018

TAHITI\s(\w+\s*\w+)\sCODE\sAPE

 

I got this expression to work on the line that contains "CODE APE" twice, and it still matches if there is a space in your bolded text.
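A quick check of what this expression accepts, sketched with Python's re module for illustration only. The strings one_space and two_spaces are hypothetical variants that do not come from the thread, and extract is a helper name made up for this example.

```python
import re

line = ("N° RCS 123456A N° TAHITI AB123456 CODE APE ANCIEN 1234X "
        "CODE APE NOUVEAU 1234Y N° CPS (employeur)")
# Hypothetical variants, not taken from the thread:
one_space = "N° RCS 123456A N° TAHITI AB 123456 CODE APE 1234X N° CPS (employeur)"
two_spaces = "N° RCS 123456A N° TAHITI AB 12 34 CODE APE 1234X N° CPS (employeur)"

pattern = re.compile(r"TAHITI\s(\w+\s*\w+)\sCODE\sAPE")


def extract(text):
    """Return the captured value, or None when the expression does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None


print(extract(line))        # value without spaces
print(extract(one_space))   # value with a single space
print(extract(two_spaces))  # value with two spaces: no match
```

The group \w+\s*\w+ allows at most one run of whitespace inside the value, so a value made of three or more space-separated parts is not matched.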

1 0
replied on March 28, 2018

Hi Jennifer,

 

Thank you for your reply.

But as I said to Cliff, I don't know whether my bolded text string contains spaces, or how many. Bert's solution looks like it works.

 

Thank you anyway ^^

Regards

0 0
replied on April 4, 2018

Hi all,

 

I found a way to do it.

This is my expression:

 

(?s)RENSEIGNEMENTS RELATIFS AUX ETABLISSEMENTS.*Adresse\s?(.*[^\s\r\n])\s?\r?\n?<End>

 

Where <End> stands for the next line.
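The same idea can be sketched on the generic "Paragraph / Field" layout from the second question, using Python's re module for illustration only. This is a variation on the expression above, not the expression itself: it uses non-greedy quantifiers so the capture stops at the first following label, and it assumes the next label is "Field 2". The sample text, including the second line "continued", is hypothetical.

```python
import re

# Hypothetical text modeled on the "Paragraph / Field" layout;
# the Field 1 value of Paragraph 2 spans two lines here.
text = (
    "Paragraph 1\r\n"
    "Field 1 : value 1\r\n"
    "Field 2 : value 2\r\n"
    "Paragraph 2\r\n"
    "Field 1 : value 3\r\n"
    "continued\r\n"
    "Field 2 : value 4\r\n"
)

# (?s) lets "." match line breaks, so the value may span several lines.
# "Field 2" plays the role of <End>: the label that starts the next line.
pattern = re.compile(r"(?s)Paragraph 2.*?Field 1 : (.*?)\s*Field 2")

value = pattern.search(text).group(1)
print(repr(value))
```

The captured value keeps its internal line break and drops the trailing one, because \s* before the end label absorbs the final \r\n.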

1 0
replied on April 5, 2018

@████████, could you mark the correct answer for the original question, please? It looks like you have 2 separate questions going here and the post marked as an answer makes no sense to the original question. This is confusing to users who may find this thread through searches.

0 0
replied on April 5, 2018

Hi Miruna,

 

You're right, sorry.

0 0
replied on April 5, 2018

Thank you!

0 0
replied on April 3, 2018

Hi all!

 

I need your help with a new pattern matching expression.

I have two paragraphs with the same field names.

1. Paragraph 1

      Field 1 : value 1

      Field 2 : value 2

2. Paragraph 2

      Field 1 : value 3

      Field 2 : value 4

 

What expression would return the value of Field 1 in Paragraph 2?

I mean something like this: "Paragraph 2\s?\r?\n?Field 1 : (.*[^\s])\s?\r?\n?"

The part in black is variable: sometimes it spans more than 4 lines and sometimes only 1 line.

 

Thank you for your help.

Regards

Olivier

 

Hope this picture can help you to understand.

 

0 0
replied on April 3, 2018

Do the paragraphs always begin with RENSEIGNEMENTS? If so, then you could use something like this:

RENSEIGNEMENTS.*?RENSEIGNEMENTS.*?Adresse\s*(\S.*?\S)\s*\r?\n?

 

0 0
replied on April 4, 2018

Hi Bert

 

I have 2 problems with your expression.

1°) The paragraphs don't always start with "RENSEIGNEMENTS", but this is not really important because I can use "RENSEIGNEMENTS RELATIFS AUX ETABLISSEMENTS.*?Adresse\s*...." directly.

2°) The expression doesn't really work because of the carriage return and line feed (\r\n).

I would have to write something like this:

RENSEIGNEMENTS RELATIFS AUX ETABLISSEMENTS\r\n.*\r\n.*\r\nAdresse\s(.*[^\s])\s\r\n

The position of what I want is not fixed.

Sometimes what I want is on the first line, sometimes on the last line.

 

Example 1

Paragraph(\r\n)

     What I want(\r\n)

 

Example 2

Paragraph(\r\n)

     Line 1(\r\n)

     What I want(\r\n)
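The line-break problem described above can be reproduced with a short sketch in Python's re module, used here for illustration only. In Python, "." does not match \n unless the (?s) flag is set, and the sketch assumes your engine behaves similarly. The two samples and the address value are hypothetical, and find_address is a helper name made up for this example.

```python
import re

HEADING = "RENSEIGNEMENTS RELATIFS AUX ETABLISSEMENTS"
# Hypothetical samples: the "Adresse" line sits at a different offset in each.
sample_1 = HEADING + "\r\nAdresse 12 rue Exemple\r\n"
sample_2 = HEADING + "\r\nLine 1\r\nAdresse 12 rue Exemple\r\n"

# Without (?s), "." stops at "\n", so ".*?" cannot reach a later line.
single_line = re.compile(HEADING + r".*?Adresse\s*(\S.*?\S)\s*\r?\n")
# With (?s), "." also matches line breaks and the offset no longer matters.
dot_all = re.compile(r"(?s)" + HEADING + r".*?Adresse\s*(\S.*?\S)\s*\r?\n")


def find_address(regex, text):
    """Return the captured value, or None when the expression does not match."""
    m = regex.search(text)
    return m.group(1) if m else None


for sample in (sample_1, sample_2):
    print(find_address(single_line, sample), "|", find_address(dot_all, sample))
```

With (?s) there is no need to spell out one \r\n.* per intermediate line, so the same expression covers both layouts.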

0 0