I am using SDK version 10.4 and cannot get OCR to work on image pages or PDF files. The documents have already been created and have pages/PDFs assigned to them.
I run the following code against a valid page with text:
    try
    {
        OcrEngine ocr = OcrEngine.LoadEngine();
        ocr.Run(lfDocNew, new PageSet(pageNum.ToString()));
    }
    catch (Exception ex)
    {
        result.ErrorMessage = "OCR Error on page " + pageNum + ": " + ex.Message;
    }
It throws the following error:
Error:
"Interface not registered"
Stack Trace:
at BPIClientLib.IBPProcessEx.Process(IBPPageEx pUnkPage)
at Laserfiche.DocumentServices.OcrEngine.ProcessImage(String imagePath, DocumentInfo document, PageInfo page)
at Laserfiche.DocumentServices.OcrEngine.ProcessPage(DocumentInfo document, PageInfo page)
at Laserfiche.DocumentServices.OcrEngine.Run(DocumentInfo document, PageSet pages)
at mviLFPost.LFRAPostEngine.AssignPages(LFDocumentInfo DocInfo, DocumentInfo lfDocNew, DocumentImporter importer, PostResult& result) in C:\vs\mvidev\mvi\Dev\Laserfiche Integration\SVR\mviLFPost\mviLFPost\LFRAPostEngine.cs:line 1381
When I run the following code against a PDF electronic document, it neither extracts any text nor throws an error:
    try
    {
        using (TextExtractor te = TextExtractor.LoadExtractor())
        {
            te.ExtractFrom(lfDoc);
        }
    }
    catch (Exception ex)
    {
        result.ErrorMessage = "Error extracting text for electronic document: " + ex.Message;
        // Don't fail on entire import if text is not extracted.
    }
Am I missing a required component to run OCR when importing with the 10.4 SDK? Relying on Workflow after the fact really slows down the server when importing many documents.