Question

Can DCC OCR extract text from PDF with a text stream?

asked on July 1, 2014

I'm setting up DCC to OCR documents in a repository. It works great for TIFF documents but does not extract text from PDF files that have a text stream. Is this even possible?

Answer

APPROVED ANSWER SELECTED ANSWER
replied on July 1, 2014

The Schedule OCR activity in Workflow specifically OCRs TIFF image documents. It does not extract text from PDFs or any other electronic documents.

Replies

replied on July 1, 2014

This question has come up a lot in various forms, so maybe an explanation of terms would help. The help files have more information on this topic as well.

 

Text extraction is a process performed on electronic files such as Word documents and PDF files to extract the text already associated with the file. If you have the appropriate iFilters, and there is text associated with the document, no OCR needs to be performed to extract this text. You can extract text from the electronic file using the Client or Web Access applications.

 

OCR stands for Optical Character Recognition, and is performed on images (such as TIFFs) to turn the image of text into text that you can edit. Laserfiche can only OCR TIFF image pages. If you have a PDF that does not have any associated text, you must first create TIFF image pages from the electronic file, then OCR those image pages.
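
As a rough illustration of the distinction above, here is a minimal Python sketch of the decision. It is not Laserfiche code and does not call any Laserfiche API; the names `Action` and `choose_action` and both parameters are hypothetical, made up only to show which operation applies to which kind of document.

```python
from enum import Enum


class Action(Enum):
    """The three outcomes described in this reply (names are hypothetical)."""
    OCR = "OCR the TIFF image pages"
    EXTRACT_TEXT = "extract the text already in the electronic file"
    GENERATE_PAGES_THEN_OCR = "create TIFF image pages, then OCR them"


def choose_action(is_tiff_pages: bool, has_embedded_text: bool) -> Action:
    """Pick the operation for a document.

    is_tiff_pages: the document consists of TIFF image pages.
    has_embedded_text: the electronic file (Word, PDF, ...) already
        carries text that an iFilter could read.
    """
    if is_tiff_pages:
        # Images hold no text of their own, so OCR is the only option.
        return Action.OCR
    if has_embedded_text:
        # Text is already associated with the file; no OCR is needed.
        return Action.EXTRACT_TEXT
    # Electronic file with no text: image pages must be created first.
    return Action.GENERATE_PAGES_THEN_OCR


if __name__ == "__main__":
    # A PDF with a text stream, as in the original question.
    print(choose_action(is_tiff_pages=False, has_embedded_text=True).value)
```

For the PDF in the original question, the sketch lands on text extraction rather than OCR, which is why a scheduled OCR job leaves it untouched.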

 

Workflow uses the Distributed Computing Cluster to schedule OCR, so the Schedule OCR activity can only be used on TIFF image pages.

 

In the desktop Client, the user action for extracting text from the associated electronic file is the same as the one for OCRing the image pages, so the end user does not need to understand the difference between these two operations.

replied on July 2, 2014

Hi Jesse, 

 

If your question has been answered, please click the "This answered my question" button on the response.

 

If you still need assistance with this matter, just update this thread. Thanks!
