I am using SDK version 10.4 and cannot get OCR to work with pages or PDF files. The documents have already been created and have pages/PDFs assigned to them.
I run the following code against a valid page with text:
    try
    {
        OcrEngine ocr = OcrEngine.LoadEngine();
        ocr.Run(lfDocNew, new PageSet(pageNum.ToString()));
    }
    catch (Exception ex)
    {
        result.ErrorMessage = "OCR Error on page " + pageNum + ": " + ex.Message;
    }
This throws the following error:
Error:
"Interface not registered"
Stack Trace:
at BPIClientLib.IBPProcessEx.Process(IBPPageEx pUnkPage)
at Laserfiche.DocumentServices.OcrEngine.ProcessImage(String imagePath, DocumentInfo document, PageInfo page)
at Laserfiche.DocumentServices.OcrEngine.ProcessPage(DocumentInfo document, PageInfo page)
at Laserfiche.DocumentServices.OcrEngine.Run(DocumentInfo document, PageSet pages)
at mviLFPost.LFRAPostEngine.AssignPages(LFDocumentInfo DocInfo, DocumentInfo lfDocNew, DocumentImporter importer, PostResult& result) in C:\vs\mvidev\mvi\Dev\Laserfiche Integration\SVR\mviLFPost\mviLFPost\LFRAPostEngine.cs:line 1381
When I run the following code against a PDF electronic document, it neither extracts any text nor throws an error.
    try
    {
        using (TextExtractor te = TextExtractor.LoadExtractor())
        {
            te.ExtractFrom(lfDoc);
        }
    }
    catch (Exception ex)
    {
        // Don't fail on entire import if text is not extracted.
        result.ErrorMessage = "Error extracting text for electronic document: " + ex.Message;
    }
Am I missing a required component to run OCR when importing with the 10.4 SDK? Relying on Workflow after the fact really slows down the server when importing many documents.