Question

Workflow to OCR docs that are not OCRed in LF Rep

asked on June 18, 2014

Is it possible to have a workflow that runs on a schedule and OCRs any documents in the repository that have not been OCRed yet? (Using the newest version of Laserfiche Workflow.)

If so, are there any examples or documentation for this configuration?

Thanks

 


Answer

SELECTED ANSWER
replied on June 18, 2014

Sure. Schedule a workflow to run (I'd suggest during non-working hours) and have it search the repository using advanced search syntax. The main search type you'd want for this is Pages: check the box for documents that contain text on no/some pages, then add any additional search criteria you need, such as a folder location or a specific document type.

Correction: Note that Distributed Computing Cluster (DCC) must be installed on a server, and Workflow must be configured to connect to the DCC.

replied on June 18, 2014

One quick clarification here: You must have a Distributed Computing Cluster installed somewhere, and have your Workflow server configured to connect to that cluster. DCC does not need to be installed on the Workflow server itself. DCC is only compatible with Workflow 9.1 or later.

replied on June 18, 2014

Thanks for the quick responses. I will try this out.


Replies

replied on June 18, 2014

You can use the Search Repository activity to find all documents that have not been OCRed. Example search syntax would be:

({LF:AssociatedPages="Y"} & {LF:OCR=none}) & {LF:pagecount > 0}

 

Then use the Schedule OCR activity where the documents to OCR would be the output entries from your Search Repository activity. Note that Schedule OCR works in conjunction with the Distributed Computing Cluster module.
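
If you end up maintaining a few variations of that search (for example, one per folder or document type), a small helper can keep the base clauses in one place. This is only a sketch in Python, not anything Workflow itself provides: `build_search` and `BASE_CLAUSES` are hypothetical names, the base clauses are copied from the search above, and any extra clauses are passed through as-is without being checked against Laserfiche search syntax. It joins everything as a flat `&` chain, which should match the parenthesized form above since every clause is ANDed together.

```python
# Base clauses copied verbatim from the search syntax in the reply above.
BASE_CLAUSES = [
    '{LF:AssociatedPages="Y"}',
    '{LF:OCR=none}',
    '{LF:pagecount > 0}',
]


def build_search(extra_clauses=()):
    """Join the base clauses and any caller-supplied clauses with '&'.

    extra_clauses are treated as opaque strings (hypothetical input);
    this helper does not validate them as Laserfiche search syntax.
    Blank entries are skipped.
    """
    extras = [c.strip() for c in extra_clauses if c.strip()]
    return " & ".join(BASE_CLAUSES + extras)


if __name__ == "__main__":
    # Print the base search so it can be pasted into the
    # Search Repository activity.
    print(build_search())
```

You would paste the resulting string into the Search Repository activity; the scheduling and the Schedule OCR step are still configured in the Workflow Designer as described above.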
