Scanned documents and OCR in Cloud

asked on November 15, 2024

Hi all. We are using Laserfiche Cloud. Our current process is to batch scan from a copier and use Import Agent to import the documents into Laserfiche Cloud in TIFF format. The issue we've discovered is that if pages are not scanned in the correct orientation, the text is not generated properly. For example, if an 11x17 page is scanned sideways to fit through the copier feeder, unreadable characters are generated for the text. If we rotate the page in Laserfiche Cloud after the import, the unreadable characters remain, and Laserfiche Cloud does not OCR the TIFF file again.

For documents whose pages were rotated after import into Laserfiche Cloud, I can't figure out how to trigger OCR on the TIFF file again so that the text is generated properly. Aside from scanning everything in the proper orientation (which is not always practical when batch scanning pages of different sizes), has anyone else come across this problem, and what workaround did you use? Is it possible to fix this automatically through some sort of workflow in Cloud? I'm also wondering how many files were imported without searchable text being generated, and how to fix those as well.

Example of scanned document that was rotated after the document was imported using Import Agent:

replied on November 15, 2024

In the Import Agent OCR Settings, there is an option to rotate pages automatically. See if that may address the issue. Refer to the product documentation for more information.

replied on November 15, 2024

I will try this - thank you for the quick response!

For existing documents in Laserfiche Cloud, is there a workflow or something we can use to trigger the TIFF to be OCRed again?

replied on November 15, 2024

Does this also apply to Laserfiche Cloud?

replied on November 15, 2024

@████████ - I just tested it in Laserfiche Cloud and it worked.

replied on November 15, 2024

For existing documents in the repository, one option may be to use the Laserfiche Windows Client to connect and then manually re-OCR the affected documents. The OCR settings in the Windows Client can be configured in the same way to autorotate pages.
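
To narrow down which existing pages actually need re-OCR, a rough triage sketch follows. It assumes you have already exported each page's generated text by some means. The tuple layout, function names, and threshold are hypothetical and are not Laserfiche APIs. The script only flags pages whose text is empty or mostly non-word tokens; it does not trigger OCR itself.

```python
import re

# Hypothetical heuristic, not part of Laserfiche: a token counts as
# "word-like" if it is two or more ASCII letters after trimming punctuation.
WORD_RE = re.compile(r"[A-Za-z]{2,}")
PUNCTUATION = ".,;:!?()\"'"


def looks_garbled(text, min_word_ratio=0.5):
    """Return True when the text is empty or mostly non-word tokens."""
    tokens = text.split()
    if not tokens:
        # No text at all: the page was never OCRed or OCR produced nothing.
        return True
    wordlike = sum(
        1 for token in tokens if WORD_RE.fullmatch(token.strip(PUNCTUATION))
    )
    return wordlike / len(tokens) < min_word_ratio


def flag_for_reocr(pages):
    """pages: iterable of (entry_id, page_number, text) tuples (assumed format)."""
    return [
        (entry_id, page_number)
        for entry_id, page_number, text in pages
        if looks_garbled(text)
    ]


if __name__ == "__main__":
    sample_pages = [
        (101, 1, "Invoice total due on receipt."),
        (101, 2, "~| ]/ .:' |-- ^^ 1l|"),
        (102, 1, ""),
    ]
    for entry_id, page_number in flag_for_reocr(sample_pages):
        print(f"entry {entry_id}, page {page_number}: candidate for re-OCR")
```

The heuristic is crude: text from a sideways scan can still contain letter-like fragments, so the threshold would need tuning against a sample of known-bad pages before relying on the counts.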

replied on November 15, 2024

Thank you @████████ - is there an automated solution or workflow for the Cloud we can utilize instead to trigger the OCR process? Manually re-OCRing each document is not really an option due to the number of documents in our repository.

replied on November 18, 2024

I don't think it's an option for me either, as I don't have a distributed cluster to use.
