Question

OCR does not capture fillable PDF content

asked two days ago

Hello,

I just ran into an issue with a document that was created with Adobe Acrobat Sign. We searched for it using information that had been entered into the fillable fields (it was a contract), and it would not come up. After getting more information from the person who was searching, I was able to find the document, but only because that information was also included in the Adobe Acrobat Sign audit report, not because of the contract fields themselves. The document is now static and no longer fillable. I ran Generate Text again, and it still leaves out the information from where the fillable fields were.

Has anyone else run into this issue? Is there a fix?

Replies

replied two days ago

Is it actually running OCR or is it just extracting text from the PDF?

If you haven't generated pages, then my best guess is that it is just extracting the text component of the PDF, and Adobe Sign is not updating that with the fillable fields.

One thing I've done in the past when I only want the PDF but need OCR is to generate pages, run OCR, then delete the pages after I have the text.

Something else worth noting is that the Schedule PDF Page Generation activity in workflow runs through the Distributed Computing Cluster and uses a newer PDF conversion library that is substantially more reliable than the older library used in the client and elsewhere.
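
To test the guess above that the values live in form fields rather than in the PDF's text component, a rough check like the following can help. This is only a sketch in Python and has nothing to do with Laserfiche itself: it scans the raw bytes of an exported copy of the PDF for the name tokens that normally accompany form content. The function names are made up for illustration, and a negative result is not conclusive, because newer PDFs can store these objects inside compressed object streams where a plain byte search will not see them.

```python
from pathlib import Path

# Raw PDF name tokens that usually accompany interactive form content.
# This is a heuristic: tokens hidden inside compressed object streams
# will not be found by a plain byte search.
FORM_MARKERS = {
    "acroform": b"/AcroForm",  # document-level form dictionary
    "widget": b"/Widget",      # field widget annotations placed on pages
    "xfa": b"/XFA",            # XML-based (XFA) form data
}


def find_form_markers(pdf_bytes: bytes) -> dict[str, bool]:
    """Report which form-related tokens appear in the raw PDF bytes."""
    return {name: token in pdf_bytes for name, token in FORM_MARKERS.items()}


def scan_pdf(path: str) -> dict[str, bool]:
    """Convenience wrapper (hypothetical helper): read a file and scan it."""
    return find_form_markers(Path(path).read_bytes())
```

Calling `scan_pdf("contract.pdf")` on an exported copy (the file name is a placeholder) returns a dictionary of booleans. If any marker is present, the "static" PDF may still carry its values in fields or annotations instead of page text, which would fit the text extraction skipping them.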

replied two days ago

Hi Jason,

It may be extracting text instead of running OCR. From what I can see, the document already had pages generated. I also tried generating the pages and then running the OCR/extract text process, and it did not change anything. Is the Distributed Computing Cluster an additional component that we would need to purchase?

replied two days ago

That's unfortunate. I would have expected it to work if pages were generated.

I think DCC is included with most, if not all, self-hosted packages, but it does take a bit of work to get it installed and configured.

Do you notice any difference between the PDF and the generated pages? One of the reasons we switched to the DCC for page generation and OCR is that content was getting lost when the PDF had fillable fields, annotations, etc.

replied one day ago

What version of Laserfiche? This is expected to work. Please have your reseller open a case with Laserfiche Support and attach a sample PDF. 
