Question

pattern matching to stop at an underscore

asked on September 15, 2021

I am using pattern matching to apply a template to documents with specific naming conventions.

My document name is: PIN_10_FC_Docket#N-00066-05_Tompkins County Family Court_Judge Rowley_

I only want to capture the text between the word Docket and the next underscore.

In the example above, I want to capture #N-00066-05

The pattern I am using is ....\d+.\D\D.......(.*)(?:[__]) but it returns:

#N-00066-05_Tompkins County Family Court_Judge Rowley_

 


Answer

SELECTED ANSWER
replied on September 15, 2021

Hi Susan,

The issue is that the (.*) portion of your pattern is greedy: the '.' matches any character, including underscores, so the group captures everything up to the last underscore in the name.

A better way to do it would be to use a negated set, [^...], which matches any character NOT in the set, and specify the underscore as the character that you want to exclude. So the pattern would be:

\d+.\D\D.......([^_]+)

 

Additionally, if the word Docket is always there, you could simplify this even more by just using:

Docket([^_]+)
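
For anyone who wants to check the three patterns outside the product, here is a minimal sketch using Python's re module. This assumes Python's regex engine behaves the same way as the pattern-matching engine in question for these simple constructs, which has not been verified here; the variable names are just for illustration.

```python
import re

# Document name from the question.
name = "PIN_10_FC_Docket#N-00066-05_Tompkins County Family Court_Judge Rowley_"

# Original pattern: (.*) is greedy, so it runs past the first underscore.
greedy = re.search(r"....\d+.\D\D.......(.*)(?:[__])", name)

# Negated set: [^_]+ stops at the first underscore it meets.
negated = re.search(r"\d+.\D\D.......([^_]+)", name)

# Simplified form, relying on the literal word "Docket".
simple = re.search(r"Docket([^_]+)", name)

print(greedy.group(1))   # still contains "_Tompkins County Family Court..."
print(negated.group(1))  # #N-00066-05
print(simple.group(1))   # #N-00066-05
```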

 

Hope that helps,

Mike

 

 

replied on September 16, 2021

That worked!

I found that sometimes the word "Docket" is spelled incorrectly, but the symbol "#" is always there, so I used that instead.

thanks Mike!
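
The exact pattern used with "#" is not given in the thread. One plausible form, sketched in Python, is (#[^_]+), which keeps the "#" in the captured value; the helper name extract_docket and the misspelled sample name are hypothetical.

```python
import re

# Assumed pattern: a "#" followed by everything up to the next underscore.
DOCKET_PATTERN = re.compile(r"(#[^_]+)")

def extract_docket(name):
    """Return the docket number (including '#'), or None if there is no '#'."""
    match = DOCKET_PATTERN.search(name)
    return match.group(1) if match else None

names = [
    "PIN_10_FC_Docket#N-00066-05_Tompkins County Family Court_Judge Rowley_",
    # Hypothetical name with "Docket" misspelled.
    "PIN_10_FC_Dokcet#N-00066-05_Tompkins County Family Court_Judge Rowley_",
]

for n in names:
    print(extract_docket(n))  # #N-00066-05 for both names
```

Anchoring on "#" only works if no earlier part of the name can contain a "#".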

 

 


Replies

replied on September 15, 2021

I would first load the string (document name) into a multivalue token using the Split function, splitting on "_". Then I would get the item at index 4 of the multivalue token and strip out the word "Docket".
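
The same split-and-strip idea can be sketched in Python to see what it produces. Note that Python lists count from 0, so the fourth item is parts[3] here; the function name docket_from_split is hypothetical and this is not the product's Split function.

```python
def docket_from_split(name):
    """Split on '_' and strip the word 'Docket' from the fourth piece."""
    parts = name.split("_")
    if len(parts) < 4:
        return None  # name does not follow the expected convention
    # Remove only the first occurrence of the literal word "Docket".
    return parts[3].replace("Docket", "", 1)

name = "PIN_10_FC_Docket#N-00066-05_Tompkins County Family Court_Judge Rowley_"
print(docket_from_split(name))  # #N-00066-05
```

This approach depends on the docket always being the fourth underscore-separated piece and on "Docket" being spelled correctly.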
