
Question

Feature Request: Conditional Workflow Schedule

asked on August 17, 2017

Hello,

Currently we have the option to trigger a Workflow either based on Conditions or based on a Schedule. However, I think it could prove valuable to have a Conditional Schedule that combines the two rule types.

For example,

Instead of always automatically running, I'd like to be able to schedule a workflow to evaluate conditions every X minutes/hours, and at that point decide whether or not to initiate the process based on the starting conditions.

Basically, instead of a subscriber event kicking off the condition evaluation, I'd like to be able to trigger that evaluation with a timer. (Maybe just add "Timer" as an Event Type in the starting condition properties?)

 

To provide an example use case,

We have a workflow that runs every 1 minute for 16 hours each day because documents must be processed in batches, and must be processed as soon as possible. However, the workflow instances can take anywhere between a few hundred milliseconds and several minutes to complete.

To prevent too many overlapping processes, I created conditions within the workflow itself that can identify the number of active processes and abort the current instance if too many are already in progress.

The configuration works, however, the big downside here is that each day we are generating hundreds of instances for this workflow when the vast majority are not doing any work.

Instead, I'd like to do something like this:

  1. Every 1 minute for 16 hours
  2. If Conditions are satisfied,
    • Start a new instance
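The desired behavior above could be sketched roughly as follows (a minimal illustration only; `check_conditions`, `start_instance`, and the injectable clock/sleep are placeholders, not actual Workflow APIs):

```python
import time

def conditional_schedule(check_conditions, start_instance,
                         interval_s=60, duration_s=16 * 3600,
                         clock=time.monotonic, sleep=time.sleep):
    """Every `interval_s` seconds, evaluate the starting conditions and
    only start an instance when they pass. Ticks where the conditions
    fail produce no instance record at all."""
    started = 0
    end = clock() + duration_s
    while clock() < end:
        if check_conditions():
            start_instance()
            started += 1
        sleep(interval_s)
    return started
```

The key difference from the current schedule trigger is that the "inactive" ticks never create an instance; they just fall through to the next sleep.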

 

EDIT: A secondary component of this would be the ability to designate a specific entry in the conditions. For example, to check the document count within a specific folder every X minutes.

 


Replies

replied on August 17, 2017

There's an option in the Server Options to only allow one instance at a time. See "single instance mode"; it can be set per workflow definition.

replied on August 17, 2017

Thank you, however, that does not address our circumstances for a couple of reasons:

  • To keep up with our volume we allow up to 3 instances to run at a time as single instance mode leads to backlog during peak processing times.
  • Even if we were to limit the workflow to one instance at a time, that would not address the large number of inactive instances that accumulate.
replied on August 17, 2017

There wouldn't be inactive instances; if one is already running, new instances are discarded, not queued.

replied on August 17, 2017

I should clarify. The documents tend to come in waves, so the "inactive" instances occur when there is nothing in the target folder to be processed in addition to when the instance "limit" is reached.

Each "active" instance can run for anywhere between 30 seconds to 10 minutes depending on the total page count and combined file size.

Additionally, each instance is limited to 100 documents for optimization (too few documents increase the per-document processing time too much, while too many introduce too much delay when large documents come in).

For example,

  1. 8:00 - "inactive" no documents
  2. 8:01 - "inactive" no documents
  3. 8:02 - 125 new documents found
    • Instance 1 - 100 documents
  4. 8:03 - 75 new + 25 "old" found
    • Instance 2 - 100 documents
  5. 8:04 - 50 new documents found
    • Instance 3 - 50 documents
  6. 8:05 - "inactive" instance limit reached
  7. 8:06 - "inactive" instance limit reached
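The per-tick decision in this timeline can be modeled as a small function (illustrative only; the function and parameter names are placeholders, with the 100-document batch and 3-instance cap taken from the description above):

```python
BATCH_SIZE = 100   # documents per instance, tuned through testing
MAX_INSTANCES = 3  # concurrent instance cap

def plan_tick(pending_docs, active_instances):
    """Decide what a single scheduled tick should do.

    Returns the batch of documents to process, or None for an
    "inactive" tick (nothing pending, or instance limit reached)."""
    if not pending_docs or active_instances >= MAX_INSTANCES:
        return None
    return pending_docs[:BATCH_SIZE]
```

Under the proposed conditional schedule, a `None` result would simply mean no instance is started, instead of an instance being created and aborted.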
replied on August 17, 2017

What do you do to the documents you find between 8:02 and 8:03 to ensure the search doesn't find the same documents?

replied on August 17, 2017

For each "active" instance, I create a new subfolder named with the workflow instance ID, route the 100 documents into that folder first, then process everything in the subfolder.

Each successive instance only looks at the "root" folder so it ignores anything already in progress.
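As a rough model of this claim-by-move pattern (folders modeled here as plain lists in a dict, purely for illustration; `claim_batch` is not a Workflow activity):

```python
def claim_batch(folders, instance_id, batch_size=100):
    """Claim up to `batch_size` documents by moving them out of the
    root folder into a subfolder named for the instance, so later
    instances scanning only the root never see in-progress work."""
    docs = folders["root"][:batch_size]
    folders["root"] = folders["root"][batch_size:]
    folders[instance_id] = docs
    return docs
```

Because the move and the claim are the same operation, two instances scanning the root at different ticks can never pick up the same document.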

replied on August 17, 2017

Are you running each document through For Each Entry? Have you tried moving everything inside For Each Entry to its own workflow and invoking it? That should speed up processing quite a bit since you'd be running 4 of those instances per CPU concurrently.

replied on August 17, 2017

Yes, but the duration is not really an issue of per-document times; it is caused by the relative slowness of the Replicate Entries activity (whether or not it is set to run as a Task).

The general process when the workflow runs its course is as follows:

  1. Create subfolder named after the workflow instance ID
  2. Collect the newest documents in the target folder (up to 100)
  3. For each entry
    • Validate the document (ensure a specific field matches a regular expression)
    • If valid move to the subfolder, else reroute to another folder
  4. Replicate Entries (longest part of the process)
    • copy all documents to the permanent repository.
  5. Find the folder/documents in the destination repository
  6. For each entry
    • Invoke secondary workflow
      • additional processing activities (routing, versioning, updating external systems/applications, etc.)
      • route "original" document to backup folder
  7. Delete empty subfolder
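Those seven steps could be sketched roughly like this (every callable and the sample field pattern are hypothetical stand-ins for the actual Workflow activities, not real APIs):

```python
import re

# Example validation rule only; the real field pattern is not shown
# in the thread.
FIELD_PATTERN = re.compile(r"^\d{6}$")

def run_batch(instance_id, collect, move, replicate, invoke,
              delete_folder):
    """Sketch of the seven steps; each callable stands in for the
    corresponding Workflow activity."""
    subfolder = f"staging/{instance_id}"      # 1. subfolder per instance
    docs = collect(limit=100)                 # 2. newest docs, up to 100
    valid = []
    for doc in docs:                          # 3. validate each entry
        if FIELD_PATTERN.match(doc["field"]):
            move(doc, subfolder)
            valid.append(doc)
        else:
            move(doc, "rejected")
    replicate(subfolder)                      # 4. batch copy (slowest step)
    # 5-6. find replicated entries and run per-entry follow-up
    for doc in valid:
        invoke(doc)                           # secondary workflow
        move(doc, "backup")                   # route original to backup
    delete_folder(subfolder)                  # 7. clean up
    return len(valid)
```

The batch replicate in step 4 amortizes the activity's overhead across all 100 documents, which is why the per-entry variant described below proved slower.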

 

Originally, I configured it to process/replicate each individual entry in its own instance triggered when it was moved into the folder, but that proved to be highly inefficient compared to batch replication.

After extensive testing, 100-at-a-time proved to be a happy medium based on our document processing volume, resource efficiency, and time requirements.

However, single instance mode creates too much potential for backlog, and running every minute for 16 hours means we have a lot of extra instances that didn't really do anything, hence the desire for a "conditional" schedule.
