We have a customer who is importing PDFs that have no text layer, and the documents are not getting OCRed. Is there an option to designate an alternative method so that the resulting images get OCRed? They do not want to have to install and use the desktop client.
Question
Replies
Hi Bert,
Do they have both "Generate Text" and "Generate Pages" selected on import?
I believe PDFs without a text layer require page generation for OCR.
Yes, they have both selected, and the pages are generated (the PDF is not kept), but the text is never generated.
I discussed this with the team, and we ask that you file a support case that includes several sample PDFs.
Based on the descriptions thus far, we're unsure whether this is a bug, a suboptimal design that is working as expected, or some kind of edge case potentially involving the customer's specific PDFs, and we would like to investigate further.
While you specifically mentioned they don't want to have to use the desktop Laserfiche Client, if there is an immediate need to have the text available, we are fairly confident that regenerating the document text in the client would work. The desktop client generates the text locally and sends it to the server, so unless something goes wrong with the local OCR process, you're guaranteed to get it.
In contrast, text generation/OCR in LF Cloud is a background asynchronous process where documents get sent to a queue to be processed by a pool of OCR workers, conceptually similar to Laserfiche Distributed Computing Cluster (DCC) for self-hosted systems.
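To make the queue idea a bit more concrete, here is a rough Python sketch of the general "job queue plus worker pool" pattern. It is not Laserfiche's implementation and uses no Laserfiche API: `fake_ocr`, `ocr_worker`, `run_pool`, and the document IDs are all hypothetical names made up for illustration. The point is only that the import step just enqueues work, and the text shows up later, whenever a worker gets to it.

```python
import queue
import threading


def fake_ocr(document_id: str) -> str:
    """Hypothetical stand-in for an OCR engine; returns placeholder text."""
    return f"text for {document_id}"


def ocr_worker(jobs: queue.Queue, results: dict, lock: threading.Lock) -> None:
    """Pull document IDs off the queue until a None sentinel arrives."""
    while True:
        document_id = jobs.get()
        if document_id is None:
            jobs.task_done()
            break
        text = fake_ocr(document_id)
        with lock:
            results[document_id] = text  # text becomes available only now
        jobs.task_done()


def run_pool(document_ids, worker_count=3):
    """Enqueue documents and let a small pool of workers process them."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=ocr_worker, args=(jobs, results, lock))
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for document_id in document_ids:
        jobs.put(document_id)  # "import" only queues the job and moves on
    for _ in workers:
        jobs.put(None)  # one sentinel per worker so each one stops
    jobs.join()  # wait until every queued item has been processed
    for worker in workers:
        worker.join()
    return results


results = run_pool(["doc-1", "doc-2", "doc-3", "doc-4"])
```

In a real queued system the caller would not block on `jobs.join()`; the sketch waits only so the results can be inspected. That gap between queuing and completion is why text may appear some time after import.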
Bert, can you check again now? The documents may have been OCRed overnight by the asynchronous process that Sam mentioned.
In contrast, text generation/OCR in LF Cloud is a background asynchronous process where documents get sent to a queue to be processed by a pool of OCR workers, conceptually similar to Laserfiche Distributed Computing Cluster (DCC)
This is awesome!