Question

Split a field into tokens every 8 characters

asked on February 10, 2021

I have a field in a SQL table that lists the filenames of documents, but if there are multiple pages in the doc then it runs the filenames together like this:

I'm trying to figure out a way to take documents where the pages are greater than one and split the FILELIST value into the separate filenames.  They're all 8 characters long so I'm sure there's a simple regex or something I can do, but I've never been very good at it.  I can use \d{8} to capture the first set of eight, but I don't know how to capture every set and split them up.

I've also got scenarios like this where the value sometimes includes the file path. I tested the same "\d{8}" and it matches the eight-digit names I need, but again I don't know how to assign the different sets to the token.


Answer

SELECTED ANSWER
replied on February 11, 2021

I would just go with Pattern Matching instead of the Repeat loop; it's faster.

Dennis, note the change under "if multiple matches are found". The default is "first match", so that is why you were only getting one. If you change it to "all values as a multi-value token", you'll get all matches as individual values.
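
For anyone curious what the "first match" versus "all values" difference looks like outside of Workflow, here is a minimal C# sketch using plain .NET regular expressions. It is not how the Pattern Matching activity is implemented; the class and method names (FileNameMatcher, ExtractFirst, ExtractAll) and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Workflow.
public static class FileNameMatcher
{
    // Returns only the first group of eight digits, or "" when there is none.
    public static string ExtractFirst(string fileList)
    {
        Match m = Regex.Match(fileList, @"\d{8}");
        return m.Success ? m.Value : "";
    }

    // Returns every non-overlapping group of eight digits, in order.
    public static List<string> ExtractAll(string fileList)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(fileList, @"\d{8}"))
        {
            names.Add(m.Value);
        }
        return names;
    }
}
```

With a made-up value such as "000000010000000200000003", ExtractFirst gives back one name while ExtractAll gives back three. Because the pattern only looks for digits, the same call also picks the names out of a value that includes paths, as long as nothing else in the path is eight or more digits long.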


Replies

replied on February 10, 2021

One way to do it without using RegEx is to look at each value, grab the first 8 characters, and then return the remainder back to the token - then loop back and do it again.

It looks like this in Workflow:

Here's what "File List Test" looks like (both multi-value tokens):

The "For Each Value" activity works on the %(File List) token.

Here's what "Current Working File" looks like:

Here's what the "Repeat" Activity looks like:

Here's what "Split File Name" looks like:

Here's what "Update Tokens" looks like:

"Track Tokens" is set to work on the %(New File List) Token.

After running it, "Track Tokens" reports that the %(New File List) Token looks like this:

replied on February 10, 2021

You could likely use RegEx in the tokens in place of the MID formula on the token calculator activity - but I think you'd still need to work the same way looping through the values.

I don't believe you can split a single value into multiple values using just RegEx in Workflow.

The SPLIT function can split a single-value token into a multi-value token, but it requires a delimiter character like a comma or pipe; it can't really do something like "split every nth character".
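
If a delimiter is the only thing missing, one possible workaround is to insert one every 8 characters first and then split on it. The sketch below shows the idea in plain C#; whether the same replacement can be done inside a Workflow token expression is an assumption I haven't verified, and the names (DelimiterWorkaround, InsertPipes, SplitEveryEight) are hypothetical. It also assumes the value contains no pipe characters or line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Workflow.
public static class DelimiterWorkaround
{
    // Adds a '|' after each group of 8 characters that is followed by more text,
    // so no trailing delimiter is left at the end of the value.
    public static string InsertPipes(string value)
    {
        return Regex.Replace(value, ".{8}(?=.)", "$0|");
    }

    // Splits on the inserted delimiter, the way a delimiter-based split would.
    public static string[] SplitEveryEight(string value)
    {
        return InsertPipes(value).Split('|');
    }
}
```

The lookahead (?=.) is what keeps the last group from getting a pipe after it, which would otherwise produce an empty final value when splitting.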

If you Google splitting a string every nth character, there are a lot of coding examples; you could potentially find C# or VB.NET examples that could be implemented in Workflow.  That being said, the example I provided with loops took me 5 minutes to put together, and coding a solution would take longer.
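
For reference, the kind of "split every nth character" example you'd typically find looks roughly like the sketch below. It is plain C# with no Workflow-specific calls, and the names (StringChunker, ChunkEvery) are made up; wiring it to tokens inside a Script activity is a separate step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Workflow.
public static class StringChunker
{
    // Splits value into pieces of `size` characters; the last piece may be shorter.
    public static List<string> ChunkEvery(string value, int size)
    {
        if (size <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(size));
        }

        var chunks = new List<string>();
        for (int i = 0; i < value.Length; i += size)
        {
            // Math.Min keeps the final Substring call inside the string.
            chunks.Add(value.Substring(i, Math.Min(size, value.Length - i)));
        }
        return chunks;
    }
}
```

Calling ChunkEvery(value, 8) covers the 8-character filenames here; an empty string produces an empty list rather than a single empty value.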

replied on February 10, 2021

Thanks for the help!  I haven't tested this out yet, but I've got something in place using what you posted up there that should work once I iron out the kinks.

replied on February 10, 2021

Because I'm a big nerd and think this stuff is fun, here is a script to do it (using the "Script" activity).

This works exactly like my original comment does using Workflow activities - it just does it as a Script instead.  It loops through each value, grabbing the first 8 characters and leaving the remainder and looping back.

To test this, you just need the %(Test File) token, then the Script activity, and the Track Tokens activity.

Here's the Script: 

namespace WorkflowActivity.Scripting.Script
{
    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Data.SqlClient;
    using System.Text;

    /// <summary>
    /// Provides one or more methods that can be run when the workflow scripting activity is performed.
    /// </summary>
    public class Script1 : ScriptClass90
    {
        /// <summary>
        /// This method is run when the activity is performed.
        /// </summary>
        protected override void Execute()
        {
            // Assumes %(File List) is a multi-value token whose values are all strings.
            IEnumerable<object> array = (IEnumerable<object>)this.GetTokenValue("File List");
            List<string> newFileList = new List<string>();
            foreach(string value in array)
            {
                string remaining = value;
                // Peel off the first 8 characters for as long as more than 8 remain.
                while (remaining.Length > 8)
                {
                    newFileList.Add(remaining.Substring(0, 8));
                    remaining = remaining.Substring(8);
                }
                // Whatever is left is the last (or only) name. Values of 8 characters or
                // fewer pass through unchanged, and a short trailing piece is kept.
                newFileList.Add(remaining);
            }
            // Write the results to the multi-value token %(New File List).
            SetMultiValueToken("New File List", newFileList, true);
        }
    }
}

 

replied on February 11, 2021

I completely forgot that the Pattern Matching activity existed!  🤦‍♂️

The regular Assign Token Values activity has no way to convert the object into a multi-value token, but the Pattern Matching activity totally does.  I feel so sheepish that I missed that.

0 0
replied on February 11, 2021

Oh yeah, that looks like a good solution Miruna - thanks!

And thanks for the script info Matthew, this is one of my first workflows from scratch so it's interesting to see how many ways you can skin a cat!
