We have been using Laserfiche on premises for six months now, but we only set up DCC this week. Suppose we upload PDF files today: once DCC starts operating, will it go through every PDF file already in Laserfiche, or will it only handle files saved from that day on?
Distributed Computing Cluster (DCC)
DCC does not go hunting for documents; it only accepts requests for work. You can use Workflow to search for documents and send them to DCC. Or you can set up the web client to use DCC so users can also send documents manually when they need to.
It is entirely up to you. You must use Workflow to create a DCC job and pass the entries you want to OCR.
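To make the "it is up to you" part concrete, here is a rough sketch in plain Python of the selection step only. This is not Laserfiche code: `Entry`, `has_text`, and `select_for_ocr` are made-up names for illustration, and in practice a Workflow search would do this filtering and hand the results to the DCC job. It just shows the two choices from the question: the whole backlog, or only entries created from a given day on.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a repository entry (not a Laserfiche type)."""
    entry_id: int
    extension: str
    created: date
    has_text: bool  # True once the document already has OCR text


def select_for_ocr(entries: List[Entry],
                   created_on_or_after: Optional[date] = None) -> List[int]:
    """Return IDs of PDF entries that still lack text.

    With no date, the whole backlog qualifies; with a date, only
    entries created on or after that day are selected.
    """
    selected = []
    for e in entries:
        # Skip non-PDF entries and anything that already has text.
        if e.extension.lower() != "pdf" or e.has_text:
            continue
        # Apply the optional creation-date cutoff.
        if created_on_or_after is not None and e.created < created_on_or_after:
            continue
        selected.append(e.entry_id)
    return selected


# Made-up sample data for illustration only.
entries = [
    Entry(101, "pdf", date(2024, 1, 10), False),  # old, no text yet
    Entry(102, "pdf", date(2024, 6, 3), False),   # new, no text yet
    Entry(103, "pdf", date(2024, 6, 3), True),    # new, already has text
    Entry(104, "tif", date(2024, 6, 3), False),   # not a PDF
]

backlog = select_for_ocr(entries)                    # everything still missing text
todays = select_for_ocr(entries, date(2024, 6, 3))   # only entries from that day on
```

Either list is what you would then pass along as the entries to OCR; nothing gets processed unless something submits it.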
Make sure the workers are not running on CPUs that you need for other purposes, as OCR jobs will max out your processors.
After fighting for a while to get DCC to behave reasonably, I would suggest using the following settings on the "Schedule OCR" activity in Workflow.
The accuracy setting can cause DCC to sit on a page with a photo for a long time and then fail, leaving the rest of the pages un-OCRed.
I have also built a two-server system around DCC, where one machine that was not maxed out yet served as the DCC scheduler, while the workhorse was a dedicated server for OCR processing.
That let me provision the workhorse machine much higher than the rest of the system and blow through the OCR jobs (hence Chad's suggestion above).
This is not in use all the time though.