I am migrating a large number of records within a repository, and trying to process over 19,000 records at once would prevent other users in our organization from storing their scanned files. Is there a way to identify a range of folders when searching?
Question
Defining a range of numbers in a Workflow Search Repository activity.
Replies
What do you mean by "Identify a range of folders"?
What is your current search criteria?
What options do you have for breaking this into smaller batches?
Right now our records are broken down by school, and each school folder is broken down by student ID. If I can't specify a range (which, thinking about it, would be hard to do since the numbers are not contiguous), can I include multiple school numbers separated by a special character?
It is really hard to say without a better understanding of what you're trying to accomplish. Are you moving them to a new folder structure? Are you only moving them to a new volume? Are you moving them to a new repository? etc.
You don't actually need to have the search criteria handle the "batching" of the workload; there are a lot of options here.
As one example, you could have a For Each Value loop in the workflow. The list of values would be your folder names or search criteria. Inside the loop, you would have a Search Repository activity that uses the current value token in its search syntax, so the search is dynamic.
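To illustrate the loop-and-token idea outside of the Workflow designer, here is a rough Python sketch. The school numbers, the `SEARCH_TEMPLATE` string, the `\Student Records` path, and the `build_searches` helper are all made up for illustration. The template only imitates Laserfiche search syntax, so check the real syntax and token name in your own Search Repository activity.

```python
# Hypothetical school folder names; in Workflow these would be the values
# supplied to the For Each Value activity.
SCHOOL_NUMBERS = ["0101", "0102", "0215"]

# Assumed search template. "{school}" stands in for the current-value token
# that gets inserted into the search syntax on each pass through the loop.
SEARCH_TEMPLATE = '{{LF:LookIn="\\Student Records\\{school}"}}'


def build_searches(schools, template=SEARCH_TEMPLATE):
    """Return one search string per school, one per loop iteration."""
    return [template.format(school=s) for s in schools]


for query in build_searches(SCHOOL_NUMBERS):
    print(query)
```

Each iteration produces a small, separate search instead of one search across every school, which is the point of the loop.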
Alternatively, if all of the schools are inside the same parent folder, you could target the parent folder as the starting entry, use a Find Entries activity to get the subfolders, loop over them with For Each Entry, and then use a Search Repository activity with the current entry name dynamically added to the search criteria.
Better yet, if the folder structure is consistent, you don't need a search at all. You could just use nested Find Entries activities that recursively go down the folder structure and migrate the documents.
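As a sketch of what "walking the structure instead of searching" means, here is a small Python example. It uses an in-memory dictionary as a stand-in for the repository, so the folder names, the file names, and the `find_documents` and `migrate` helpers are all hypothetical. It is not Laserfiche code, only the shape of the traversal.

```python
# In-memory stand-in for the repository: school -> student ID -> documents.
# All names here are hypothetical.
REPOSITORY = {
    "0101": {
        "S1001": ["transcript.tif", "enrollment.tif"],
        "S1002": ["transcript.tif"],
    },
    "0102": {"S2001": ["iep.tif"]},
}


def find_documents(folder, path=()):
    """Yield (path, document) pairs by walking nested folders depth-first."""
    for name, child in folder.items():
        if isinstance(child, dict):
            # A subfolder: descend one level, like a nested Find Entries.
            yield from find_documents(child, path + (name,))
        else:
            # A list of documents inside the folder called `name`.
            for doc in child:
                yield path + (name,), doc


def migrate(repository, move):
    """Call move(path, doc) for every document; return how many were handled."""
    count = 0
    for path, doc in find_documents(repository):
        move(path, doc)
        count += 1
    return count


moved = []
total = migrate(REPOSITORY, lambda path, doc: moved.append("/".join(path) + "/" + doc))
print(total, "documents visited")
```

Because the walk only ever holds one folder's contents at a time, there is no large result set to build up front.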
The big thing is that you should not do this all in one workflow. The more activities a workflow instance generates, the more it will start to impact performance, so if you try to run a single workflow on a large batch, it could end up being fairly slow toward the end. Instead, you should break it into at least two workflows (i.e., a parent and child process).
The parent process would handle the top-level, then it would call (and wait for) the child process to handle the more repetitive tasks in smaller batches; this has the advantage of starting a fresh instance for each "batch" to maintain consistent performance levels.
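The parent/child split can be pictured with the following Python sketch. The function names, the `student-N` folder names, and the batch size of 500 are arbitrary assumptions for illustration; in Workflow the child would be a separate workflow that the parent invokes and waits on.

```python
def batches(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def child_process(student_folders):
    """Stand-in for the child workflow: handle one small batch, then finish."""
    return len(student_folders)


def parent_process(student_folders, batch_size):
    """Stand-in for the parent workflow: start one child per batch and wait."""
    processed = 0
    for batch in batches(student_folders, batch_size):
        # The blocking call models "invoke the child and wait for it".
        processed += child_process(batch)
    return processed


# Hypothetical workload sized like the one in the question.
folders = [f"student-{n}" for n in range(19000)]
print(len(batches(folders, 500)), "child runs,", parent_process(folders, 500), "folders")
```

Each child run starts with a clean slate, so the last batch should behave much like the first one.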
We're moving files to a new record series in the same repository, since DOE mandates that a retention policy be applied to all existing and future student records.
Since it sounds like you're dealing with a predictable folder structure, I don't think I would use a search activity at all.
I would just use the Find Entries approach and iterate through the folder contents; that way you can start or stop it as needed, and you won't have the overhead of large search results.