Question

Will the Schedule OCR process in Workflow process PDF files?

asked on June 27, 2014

I've set up the Schedule OCR process within Workflow using the Laserfiche Distributed Cluster (on the production server). The workflow is running and searching the repository each night for documents that have not been OCR'd. The problem is that it does not find (and therefore does not OCR) PDF files.

Replies

replied on June 27, 2014

The OCR process only applies to image pages.

replied on July 1, 2014

Hey Jeff,

You may want to consider generating images for your PDF documents to make sure that the workflow OCRs the entries. Depending on the volume of PDFs, you could run a pretty simple search and generate images en masse.

replied on July 2, 2014

Hey Rob,

Thanks for the info; we were talking about that option. I know you can automate the search in Workflow, but is there a way to automate generating the LF pages, maybe with a script? The documents are being inserted directly into LF by a 3rd-party application; otherwise we would just set the option in the client to generate the pages.

replied on July 2, 2014

Maybe I'm missing something, but is there a reason that we're not extracting text from the PDF upon import? Check out the options menu in Tools>Options>Generate Text>Advanced Settings for PDFs. There you'll find radio buttons that let you generate text using text extraction or OCR as soon as a PDF is imported to the repository.

I'll assume that you want to move forward with the original plan, but if you'd like more information regarding the previous paragraph, just let me know!

OK, so the original problem is about automating page generation for PDFs. You'll find an option in the Tools>Options>New Documents>Settings menu that allows you to "Generate Laserfiche pages...when importing PDFs". Once you have your PDF/TIFF document available in the repository, your nightly OCR workflow should be able to find the PDF/TIFF file and OCR it appropriately.
