
Question


Use Pattern Matching to exclude excessive whitespace

asked on March 28, 2018

Hello,

We have a customer who exports documents from a third-party application. They can take metadata from that application and put it in the name of each exported document, with a special character (^, in this case) separating the field values so we can parse them out with Workflow and assign them to fields in Laserfiche. However, the application pads the strings we're trying to capture with excessive trailing whitespace. Example below:


Test Insurance Providers                                              ^ TestBank ^ 0 ^ Nov 30 2007 12_00AM

 

We want to capture just "Test Insurance Providers" in this case. This particular string could be any number of words, each separated by a single whitespace character. We use [^\^]+ to capture everything before the first ^, but I can't figure out how to exclude the trailing whitespace.

Any help is appreciated. Thanks!


Answer

SELECTED ANSWER
replied on March 28, 2018

(.+?)\s+\^ should work for you. It captures all the text up to the first ^, leaving out the whitespace that precedes it.
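For anyone who wants to check the pattern outside Workflow, here is a minimal sketch in Python. Python's re module is used only for illustration, on the assumption that it treats these constructs the same way Workflow's pattern matching does. The padding width of 46 spaces and the variable names are illustrative, not taken from the customer's data.

```python
import re

# Sample file name modeled on the question; the padding width is illustrative.
name = "Test Insurance Providers" + " " * 46 + "^ TestBank ^ 0 ^ Nov 30 2007 12_00AM"

# Lazy capture group, then the run of whitespace, then the first literal ^.
match = re.search(r"(.+?)\s+\^", name)

# Group 1 holds the first field without the trailing padding.
first_field = match.group(1) if match else None
print(repr(first_field))
```

Note that \s+ requires at least one whitespace character before the ^. If some file names might have none, \s* would be the more forgiving choice.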

replied on March 28, 2018

This works. Can you explain what the ? does in this case? How does the + work with the ? to make this work?

 

Thanks!

replied on March 28, 2018

The (.+) will match as much as it possibly can; it's greedy. Adding the ?, as in (.+?), makes the match non-greedy, so it matches as little as it possibly can.

Try the match with and without the question mark. Without it (greedy), the group matches as much as it can, up to the whitespace before the very last ^ in the line, and returns
"Test Insurance Providers                                 ^ TestBank ^ 0"

Adding the ? (non-greedy) tells it to match only up to the first occurrence of whitespace followed by a ^. It stops as soon as the rest of the pattern, \s+\^, can match, instead of extending to the last place where it can match.

This is often easier to see than to explain. If you experiment with the pattern, removing and adding the ? (and other parts), you can see what it is doing.
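One way to run that experiment is a short Python sketch that applies both forms of the pattern to the same input. Python's re module stands in for Workflow's pattern matching here, on the assumption that greedy and non-greedy quantifiers behave the same way in both; the sample string and its padding width are illustrative.

```python
import re

# Sample file name modeled on the question; the padding width is illustrative.
name = "Test Insurance Providers" + " " * 46 + "^ TestBank ^ 0 ^ Nov 30 2007 12_00AM"

# Greedy: .+ takes as much as it can, then backs off only as far as needed
# for \s+\^ to match, which lands on the last ^ in the line.
greedy = re.search(r"(.+)\s+\^", name).group(1)

# Non-greedy: .+? takes as little as it can, growing one character at a time
# until \s+\^ can match, which lands on the first ^.
lazy = re.search(r"(.+?)\s+\^", name).group(1)

print(repr(greedy))
print(repr(lazy))
```

The greedy result still contains the padding and the first two ^ separators, while the non-greedy result is just the first field.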

Hope that helped.   

~ Andrew


 

