Question

Pattern Matching Last Name

asked on April 7, 2017

I'm looking to identify the last name from a field that generally shows only a person's first and last name.  (Ex. Evan Willie)

 

However sometimes the names will have middle initials or suffixes. (Ex: Evan J. Willie or Evan Willie MLO#1111111 or Evan Willie, AAP)

 

Would anyone have suggestions on how to go about building a pattern matching expression to work with this?

 

Thanks for any help.


Answer

SELECTED ANSWER
replied on April 7, 2017

Ex: Evan J. Willie or Evan Willie MLO#1111111 or Evan Willie, AAP

Are these the only edge cases? If so, chaining alternatives together with the pipe symbol (|) should cover your exceptions:

(\w+),|(\w+)\sMLO#|(\w+)$
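To see how the alternation behaves on the sample names, here is a minimal sketch in Python. Python's re module is an assumption standing in for the pattern matching tool used in the thread, and the helper name last_name is hypothetical. Each alternative has its own capture group, so the code returns whichever group actually matched.

```python
import re

# Pattern from the answer above: three alternatives, each with its own group.
LAST_NAME = re.compile(r"(\w+),|(\w+)\sMLO#|(\w+)$")

def last_name(full_name):
    """Return the captured last name, or None if no alternative matches."""
    match = LAST_NAME.search(full_name)
    if match is None:
        return None
    # Only one of the three groups takes part in any given match.
    return next(g for g in match.groups() if g is not None)

samples = [
    "Evan Willie",
    "Evan J. Willie",
    "Evan Willie MLO#1111111",
    "Evan Willie, AAP",
]
for name in samples:
    print(name, "->", last_name(name))
```

In this sketch all four sample names yield Willie. A suffix that follows neither a comma nor the literal MLO# would still fall through to the last alternative and return the suffix itself.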

 

replied on April 7, 2017

Essentially yes, those are the only other edge cases.  (The others not mentioned are varieties of letter characters similar to the AAP example, but they worked too in the testing area).

 

That worked great, thanks!


Replies

replied on April 7, 2017

Use a regular expression to match from the end.  

(\w+)$

 

replied on April 7, 2017

That was my initial thought; however, that doesn't account for suffixes, unfortunately.
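A short Python sketch of the problem, assuming Python's re module behaves like the pattern matching tool for this pattern (the helper name last_word is hypothetical). The end-anchored pattern simply returns the final run of word characters, whatever it happens to be.

```python
import re

# End-anchored pattern from the reply above.
END_WORD = re.compile(r"(\w+)$")

def last_word(text):
    """Return the final run of word characters, or None if there is none."""
    match = END_WORD.search(text)
    return match.group(1) if match else None

for name in ["Evan J. Willie", "Evan Willie MLO#1111111", "Evan Willie, AAP"]:
    print(name, "->", last_word(name))
```

The middle-initial case still gives Willie, but the two suffix cases give 1111111 and AAP instead of the last name.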

 

 

replied on April 7, 2017

Provided that all of the possible scenarios are those you've included (no prefixes), I would do this in two parts:

a) First, trim the first name and the space after it: ^\w+\s(.+)

b) Then run a second regex on the captured result of the first to get the last name: ^(?:\w\.)\s(\w+)|(\w+)

 

This worked for me with those test values.

 

replied on April 7, 2017

Edgar, when I tried your answer, I was essentially getting reversed results.  I'm not sure why.

 

^\w+\s(.+)

 

^(?:\w\.)\s(\w+)|(\w+)

replied on April 7, 2017

Hi Evan,

 

The idea was to run the first regex in one pattern matching activity and pass the resulting token to a second pattern matching activity, which would use the second regex. So the second regex would receive only "Willie, MLO#1111" and evaluate that string.
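The chaining described here can be sketched in Python as follows. This is only an illustration under assumptions: Python's re module stands in for the two pattern matching activities, and the function name last_name_two_step is hypothetical.

```python
import re

# Step a: drop the first name and the space after it, keep the rest.
STRIP_FIRST = re.compile(r"^\w+\s(.+)")
# Step b: skip an optional middle initial, then take the next word.
PICK_LAST = re.compile(r"^(?:\w\.)\s(\w+)|(\w+)")

def last_name_two_step(full_name):
    """Run step a, then feed its captured token into step b."""
    first = STRIP_FIRST.search(full_name)
    if first is None:
        return None
    remainder = first.group(1)  # e.g. "J. Willie" or "Willie, AAP"
    second = PICK_LAST.search(remainder)
    if second is None:
        return None
    # Group 1 is set when a middle initial was present, group 2 otherwise.
    return second.group(1) or second.group(2)

for name in ["Evan J. Willie", "Evan Willie MLO#1111111", "Evan Willie, AAP"]:
    print(name, "->", last_name_two_step(name))
```

In this sketch, applying the second pattern directly to the full name "Evan J. Willie" matches Evan rather than Willie, which may be one explanation for the unexpected results reported above when the two expressions are not chained.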
