Question

Regular expression to exclude all instances of something

asked on July 11, 2019

I've been trying to build a regex that can exclude all instances of something. I'm finding solutions online, but none work inside Workflow. Would someone be able to assist?

 

For example, let's say I have the following string:

 

I'm going to the store to look at the products.

 

I want a regex that will remove "the" and return

 

I'm going to  store to look at  products.

 

Assume that we do not know what the sentences involved will be ahead of time (the one above is just an example) so it can't be hardcoded in. Thank you for any assistance you can provide.


Answer

SELECTED ANSWER
replied on July 12, 2019

I just tested it real quick, and the pattern I gave actually might not be a good option.

The reason it wasn't working with the spaces is that the pattern removes each of those characters individually, not as a group.

If you have any numbers in the string that don't have % it would end up removing those as well, which I don't think you want.

 

I think the actual solution might be a two-part process (I've done something similar with complex strings).

  1. Use Pattern Matching with \d{2}%\s
  2. Set it to return all values as a multi-value token.
  3. Add a For Each Value loop for the results
  4. Inside the loop
    1. add a Token Calculator
    2. use Substitute to remove the current iteration value
    3. Update your token with the result

The problem with the Token Calculator was that you didn't have a static value to target for the substitution.

The problem with the Pattern Matching is that it is extremely difficult to exclude an entire pattern in a single step.

This would use pattern matching to identify what you need to replace so that the substitute option then becomes viable.
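
To make the two-part process concrete outside Workflow, here is a minimal Python sketch of the same idea: collect every match first, then remove each matched value by plain substitution. The function name `remove_percents` and the sample sentence are made up for illustration, and `re.findall` and `str.replace` only stand in for the Pattern Matching, For Each Value, and Token Calculator activities.

```python
import re

def remove_percents(text: str) -> str:
    # Steps 1-2: collect every match of \d{2}%\s as a list of values.
    matches = re.findall(r"\d{2}%\s", text)
    # Steps 3-4: loop over the values and substitute each one away,
    # updating the working string on every pass.
    for value in matches:
        text = text.replace(value, "")
    return text

sample = "Take 50% off shirts, 15% off hats, and aisle 12 stays as is."
print(remove_percents(sample))
# Take off shirts, off hats, and aisle 12 stays as is.
```

Because each substitution targets a whole matched value such as "50% ", the standalone number 12 is left alone. A percent at the very end of the string has no trailing whitespace, so this pattern would not pick it up.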


Replies

replied on July 11, 2019

Use Token Calculator's Substitute function to replace text.
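
As an illustration outside Workflow, plain substring replacement on the example sentence looks like this in Python. `remove_text` is a hypothetical helper, and Python's `str.replace` only stands in for Substitute here; it is not the Token Calculator itself.

```python
def remove_text(text: str, target: str) -> str:
    # Replace every literal occurrence of target with an empty string.
    return text.replace(target, "")

sentence = "I'm going to the store to look at the products."
print(remove_text(sentence, "the"))
# I'm going to  store to look at  products.
```

The spaces on either side of each removed word stay behind, which is why the expected result in the question has double spaces. A literal replacement like this would also hit "the" inside longer words such as "there".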

replied on July 11, 2019

Thank you for your quick response. While this does work, I was actually asking for a simpler version of what I really wanted so I could extrapolate it myself. What I actually want is not to remove a specific string, but this:

 

\d{2}%

 

In other words, every time there is a percent with 2 digits (50%, 30%, 15%) I want to remove that. That's why I was trying to use a regex, because I don't think the Token Calculator functions can handle character types like that.
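
For reference, this is what the goal looks like in an environment that has a regex replace, sketched in Python. The helper name `strip_percents` and the sample text are assumptions for illustration; Workflow itself is not involved.

```python
import re

# Two digits immediately followed by a percent sign.
TWO_DIGIT_PERCENT = re.compile(r"\d{2}%")

def strip_percents(text: str) -> str:
    # Remove every whole match of the pattern.
    return TWO_DIGIT_PERCENT.sub("", text)

print(strip_percents("Save 50% on shoes and 15% on hats."))
# Save  on shoes and  on hats.
```

Note that `\d{2}%` also matches the last two digits of a longer number, so "100%" would leave the leading "1" behind.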

replied on July 11, 2019

Something like the following might do it if you use Pattern Matching set to return all matches (for things like this, Pattern Matching is more useful than regex in the Token Editor).

[^(\d{2}%)]

UPDATE: I don't think this is actually an ideal solution. I didn't notice that the pattern looks at the characters individually, so it would affect any numbers even if they're not followed by a %.
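
A short Python sketch shows why the bracketed pattern works on individual characters. Inside `[^...]` the parentheses, braces, `2`, and `%` are just literal members of a negated character class, so the pattern matches any single character that is not a digit, a parenthesis, a brace, or a percent sign. The helper `combined_matches` and the sample sentence are made up, and joining the matches is only an approximation of returning all matches combined.

```python
import re

# A negated character class, not a negated group.
NOT_IN_CLASS = re.compile(r"[^(\d{2}%)]")

def combined_matches(text: str) -> str:
    # Keep every character the class allows and glue them together.
    return "".join(NOT_IN_CLASS.findall(text))

print(combined_matches("Room 12 is 50% full (approx)."))
# Room  is  full approx.
```

The "12" disappears along with "50%", and so do the parentheses, even though none of them is a two-digit percent.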

replied on July 11, 2019

That works! Thank you very much, Jason. The only thing left to handle is that there are now blank lines where the percents were and that there's a certain hardcoded string (let's just say it's "how are you") at the beginning of the passage that I also want to exclude.

 

Thank you!

replied on July 11, 2019

Actually, there probably aren't blank lines from the percents. More likely you're seeing the space that was before each percent AND the space that was after it.

You could add \s to the left or right of the pattern to remove one of the two spaces.

replied on July 12, 2019

Actually, with "All matches (combined with no spaces)" selected, adding \s anywhere in the pattern completely messes up the formatting of the resulting string.
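
One plausible reading of this behavior, sketched in Python: because the bracketed pattern is a character class, `\s` either joins the excluded set or forces every match to include a whitespace character. Exactly where the `\s` was placed in Workflow is not stated, so both placements below are assumptions, as are the helper `combined` and the sample text.

```python
import re

def combined(pattern: str, text: str) -> str:
    # Join all matches with no separator.
    return "".join(re.findall(pattern, text))

sample = "Sale 50% today"

# \s inside the class: whitespace becomes excluded too, so every space is lost.
print(repr(combined(r"[^(\d{2}%)\s]", sample)))   # 'Saletoday'

# \s before the class: only "whitespace + allowed character" pairs match.
print(repr(combined(r"\s[^(\d{2}%)]", sample)))   # ' t'
```

Either way the result no longer resembles the original string with the percents removed.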

replied on July 12, 2019

Thank you Jason! It was a bit more complex than I was hoping since I also needed to exclude everything at the beginning of the string up to a certain point, but I was able to build something that gave me the desired result at the end.
