
Question

Workflow for finding duplicate files

asked on June 21, 2023

I've been attempting to get a workflow to find duplicate files and am having partial success with my test files. I tried to base my workflow on the one found here:

https://answers.laserfiche.com/questions/180998/What-step-could-I-add-to-a-workflow-to-stop-duplicate-entries-in-a-repository#197544

The problem I am having is that the condition is being evaluated against the search results as a whole, not against each entry individually. It will find an address that matches and move that file correctly, but the next address doesn't "match" and fails the condition.

 

Here is my workflow and below I will post the workflow results:

[workflow screenshot]

First condition - success:

[workflow results screenshot]

Second condition fails (because the matching address doesn't update to 1110 E Arkansas Lane):

[workflow results screenshot]

How do I get the condition to evaluate against each entry that is returned? It seems like it is stuck on the first entry.

Answer

SELECTED ANSWER
replied on June 21, 2023

I believe your "Compare Files" condition should be inside a "For Each Entry" activity that loops through each entry returned by the "Search All Permits" activity. Then your condition should evaluate against the current iteration's entry.
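
In rough pseudocode, the fix looks like this (a minimal sketch with made-up data; these helpers are illustrative stand-ins, not the actual Workflow activities):

    # Illustrative stand-in for the "Search All Permits" result set.
    search_results = [
        {"id": 101, "address": "1108 E Arkansas Lane"},
        {"id": 102, "address": "1110 E Arkansas Lane"},
    ]
    incoming = {"id": 200, "address": "1110 E Arkansas Lane"}

    # "For Each Entry": the condition runs once per entry, so the second
    # address gets compared too instead of the workflow stopping at entry one.
    for entry in search_results:
        if entry["address"] == incoming["address"]:   # "Compare Files" check
            print(f"Duplicate found: entry {entry['id']}")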

replied on June 21, 2023

Matthew, you have once again come to my rescue. Thank you for your help as always. Got it to move the second test file. I'll start more intensive testing.

replied on June 21, 2023

That's good.  I'm glad that worked.

replied on June 21, 2023

Unless we're talking about a handful of entries in your zzImports folder, this workflow is very inefficient, because a loop inside another loop has the potential to run very large numbers of activities.

You should look at eliminating the first search and running the workflow on individual documents one at a time, either when the doc enters the zzImports folder or when it's modified to have all the fields needed.

Returning a potentially large result set and sifting through it to check whether each entry is the duplicate is also inefficient. The criteria in your "Duplicate" branch should be moved into the search as part of the search criteria. Then your conditional decision only needs to check that the search got results, because it will either find the duplicate or return nothing.
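
To put rough numbers on it, here's a sketch (made-up data and helper names, not actual Workflow activities):

    # Nested-loop shape: every import is compared against every search
    # result, so the condition runs len(imports) * len(repo) times.
    imports = [{"address": f"{n} Main St"} for n in range(80)]
    repo = [{"address": f"{n} Main St"} for n in range(1000)]

    dupes = [(d, e) for d in imports for e in repo
             if d["address"] == e["address"]]      # ~80,000 comparisons

    # Suggested shape: one search per incoming doc, with the duplicate
    # criteria pushed into the search itself.
    def search_with_criteria(doc, repo):
        return [e for e in repo if e["address"] == doc["address"]]

    incoming = {"address": "42 Main St"}
    if search_with_criteria(incoming, repo):   # just check for results
        print("Duplicate found; take the Duplicate branch")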

replied on June 21, 2023

I am seeing this now. It took about 10 minutes to run on just about 80 documents.

How would I trigger the workflow to run on single documents when they enter the import directory? Import Agent will be handling the monitoring of a network file path.

Additionally, I am trying to figure out how to put the duplicate criteria into the search to reduce the load.

replied on June 21, 2023

You would create a starting rule for document creation in the folder Import Agent sends documents to.

For the search, you would add search criteria on the field values. So when a doc comes in, you read the "address" field from the workflow's starting entry, and then in your search you would add an &([]:[field name]="%(field token goes here)") clause to your search syntax. The best way to fine-tune the search syntax is to run it with fixed values in the desktop or web client until you're happy with the results, and then copy it into WF and replace the hardcoded values with tokens.
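
For example, assuming the field is named "Address" and the token comes from a Retrieve Field Values activity (both names are placeholders; pick the real token from the token menu in Workflow), the finished criteria might look something like:

    {LF:LOOKIN="\zzImports"} & ([]:[Address]="%(RetrieveFieldValues_Address)")

The {LF:LOOKIN=...} part just scopes the search to a folder; adjust or drop it to match wherever your imports land.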

