Question

Import Agent 9 - OCR image files extracted from PDFs?

Version 9 Import Agent

Updated June 21, 2016

asked on August 22, 2014

I set up Import Agent to extract images from PDFs and then remove the original PDF files. However, I have noticed that none of the extracted pages are OCR'd. (When I keep the original PDFs, there is no extracted text either.)

Is this a known issue, or is it by design? I understand not being able to OCR PDFs, but I know that Snapshot gives you the option to OCR the extracted images. I'm guessing I could then kick off a DCC session to OCR these, but I'd rather not introduce another process if I don't have to.

Answer

SELECTED ANSWER

replied on August 26, 2014

Currently, it isn't possible to OCR the images generated from PDFs when importing via Import Agent 9. An enhancement request has been filed for this. For now, the workaround is to use Workflow to monitor for these types of images imported by Import Agent and then run a DCC job on them.

replied on May 22, 2015

Yes please, would love an update on this as well!

replied on May 22, 2015

Eric and John,

There haven't been any feature updates to Import Agent since this was posted. Be aware, though, that Import Agent WILL generate text from PDFs if text streams are present. Explicit OCR is only relevant if the text streams are not present and it has to read the generated image to determine the text.
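
To make the distinction concrete, here is a minimal Python sketch of the decision described above: use the embedded text stream when the page has one, and fall back to OCR of the generated image only when it does not. This is not Import Agent code or its API; `page_text`, `text_stream`, `ocr`, and `use_ocr_if_no_text` are hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def page_text(
    text_stream: Optional[str],
    ocr: Callable[[], str],
    use_ocr_if_no_text: bool = True,
) -> str:
    """Return searchable text for one PDF page (illustrative only).

    text_stream: text embedded in the PDF page, or None if absent.
    ocr: hypothetical callback that OCRs the generated page image.
    use_ocr_if_no_text: whether to fall back to OCR at all.
    """
    # A usable text stream makes OCR unnecessary.
    if text_stream is not None and text_stream.strip():
        return text_stream
    # No usable text stream: the image is the only source of text.
    if use_ocr_if_no_text:
        return ocr()
    # Without a text stream or OCR, the page has no searchable text.
    return ""
```

With the fallback disabled, a scanned (image-only) PDF page yields no text at all, which matches the behavior reported in the original question.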

replied on January 7, 2016

This same request came from one of our clients today. It would be a nice feature to have, since most people expect the PDFs to be text-searchable when TIF pages are generated, as they would be if TIF files were imported directly.

Replies

replied on June 21, 2016

Hi Chris,

Import Agent 10, which has now been released, supports OCR for PDF pages that do not have a text stream. You can upgrade to the latest version, edit the profile, and check the "Retrieve text" and "Use OCR if no text is available" options.

Thanks,

Qinmei
