Hi guys,
I need to extract individual entries from a comma delimited file via Workflow. I've got a pattern matching activity that splits the rows into individual tokens and leaves me with a string of column delimited text e.g.
Employee,12345,Smith,Fred,01/05/2018,29/05/2018,Department,Location
I can run a second pattern matching activity splitting into individual tokens per delimiter e.g. to get the Employee Type token:
(.*),.*,.*,.*,.*,.*,.*,.*
... to get the Employee Number token:
.*,(.*),.*,.*,.*,.*,.*,.*
... etc
And that works really well, right up until there's a department with a comma in the name, so it's in quotes:
Employee,12345,Smith,Fred,01/05/2018,29/05/2018,"Other, Department",Location
Testing with various regex's I found online, the following will work to get the Employee token when there are quotes around the Department:
(.*),.*,.*,.*,.*,.*,\"(?:[^\"]+|\"\")*\"|[^,],.*
And that successfully extracts all the tokens from that string, i.e. with a department with the quotes around the name and one or more commas, but then all the other lines without quotes fail.
Any suggestions on how to successfully populate the pattern matching tokens regardless of whether the department entry has quotes or not?
Here's the Workflow:
And here's the structure of the pattern matching activity for CSV split:
Thanks very much,
Mike