Generate Pages without OCR'd text on a PDF

replied on October 6, 2022

The options for PDF text extraction can be found under Tools\Options\Generate text\Advanced PDF options. You can set it to "use native extraction method", which generates text pages only if the PDF has a text layer.

There is no way to completely turn off text page generation if you're generating images.

"Index (make text searchable)" does not control page generation.

Out of curiosity, what is the reason you need images but not text?

replied on October 6, 2022

The PDFs have information that is redacted on the PDF layer, but once the text layer is OCR'd, users can highlight and copy/paste the redacted text. So they want to burn the PDF redactions into the LF pages to keep people from getting at that data.

Is there an easy way to delete OCR'd text for an entire document or set of documents?

replied on October 6, 2022

I see. Then your best bet might be to use Workflow and Distributed Computing Cluster to generate pages from these PDFs. The Schedule PDF Page Generation activity gives you the option not to extract text at all.
