Question

Regular Expression To Remove Duplicate

asked on February 2

Hello, I have a folder from which I've already removed some duplicates, but I'm left with a ton of filenames that have an appended (2) or (3), such as:

Juan Angulo - XYZ (2)
Juan Angulo - XYZ (3) 

 

What regular expression could I use to remove the (2), (3), (4), or whatever it may be from the file name?

Answer

SELECTED ANSWER
replied on February 2

Are there any parentheses in the document names other than the ones that enclose the duplicate number? If not, you could use:

(.*)\s\(

Basically, the capture group grabs everything up to the (, not including the space in front of it.
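
As a quick check outside of Workflow, here is a minimal Python sketch of that pattern applied to the sample names from this thread. The function name `strip_duplicate_suffix` is made up for illustration, and Python's `re` module is only standing in for whatever regex engine you actually use, so treat the behavior as an assumption rather than a guarantee of what Workflow does.

```python
import re

# Pattern from the answer: capture everything before the space and "(".
PATTERN = re.compile(r"(.*)\s\(")

def strip_duplicate_suffix(name: str) -> str:
    """Return the captured base name, or the name unchanged when nothing matches."""
    match = PATTERN.match(name)
    return match.group(1) if match else name

# Sample names taken from the thread; the last one has no duplicate suffix.
for name in [
    "Juan Angulo - XYZ (2)",
    "Juan Angulo XYZ 10-13-2011 (2)",
    "Doe, Jane - 04-23-1980",
]:
    print(repr(strip_duplicate_suffix(name)))
```

Because `.*` is greedy, the capture runs up to the last space-plus-parenthesis in the name, which is why this only works when there are no other parentheses you want to keep.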

Replies

replied on February 2

What do you plan on using to remove the duplicates? What do you expect to happen to them? Do they need to be merged into the original or just deleted?

replied on February 2

I plan on doing this with Workflow.

The duplicates have already been moved out of this folder to their final storage destination.

The problem I'm having is that I'm running a query against a database where the file names don't have the appended (2) or (3) or whatever, so it's not finding a match. I realize that once Workflow moves these files to their final storage place, they will keep the appended (2). But first I need the workflow to assign metadata to these files, and the database query it uses looks for the file names without the appended string.

To be clear, the file names look like this:

 

Angulo, Juan - 10-31-2011 (2)
Doe, John - 12-11-1990 (2)
Doe, Jane - 04-23-1980 (2) 

 

And in the database they are just named:
Angulo, Juan - 10-31-2011 
Doe, John - 12-11-1990 
Doe, Jane - 04-23-1980 

 

And that's why I need the (2) removed. Hope I haven't confused you all. I appreciate the help.

replied on February 2

You could use something like

[^\(\d*\)]*

This will get you a run of characters that are not parentheses or digits (inside the brackets each character is excluded on its own, so it also excludes a literal *). You would just have to make sure there are no digits or parentheses you want to keep before the part you want to delete.
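
To see what that character class does in practice, here is a small Python sketch. Python's `re` module is an assumption standing in for your actual regex engine, and `leading_match` is a made-up helper name. The match stops at the first digit or parenthesis it meets, so a date in the name cuts it short.

```python
import re

# Negated character class from the reply: a run of characters that are
# not "(", a digit, "*", or ")".
CHAR_CLASS = re.compile(r"[^\(\d*\)]*")

def leading_match(name: str) -> str:
    """Return what the pattern matches at the start of the name."""
    # The "*" quantifier allows an empty match, so match() never returns None here.
    return CHAR_CLASS.match(name).group(0)

# Without digits in the name, the match runs up to the "(" (trailing space included).
print(repr(leading_match("Juan Angulo - XYZ (2)")))
# With a date in the name, the match stops at the first digit of the date.
print(repr(leading_match("Juan Angulo XYZ 10-13-2011 (2)")))
```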

replied on February 2

Actually, the file name does have digits, such as:

Juan Angulo XYZ 10-13-2011 (2)

I should have been clearer, lol.

replied on February 2

Nope, shouldn't be any other () within the file name other than the dupe stuff at the end. I'll try this out! 

 

Thanks again for the help! 
