Question

How can I extract text from a PDF with scanned images via the Laserfiche SDK (RepositoryAccess) or a Workflow script? Is it possible?

SDK Snapshot Version 9 Laserfiche Workflow

Updated March 6, 2016

asked on January 20, 2014

I have a PDF that contains text and scanned images. To obtain its text I use the OCR tool in the Laserfiche Client, and that works fine.

Via the Laserfiche SDK, however, OcrEngine does not work: it does not obtain any text from the images in the PDF.

As a workaround, I first generate pages with Snapshot through the SDK and then run OCR through the SDK. That works fine, but only while the Laserfiche Client is open.

Here is the code:

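// C_AU is presumably an alias for the Laserfiche.ClientAutomation namespace.
// Requires a running Laserfiche Client with at least one main window open.
using (C_AU.ClientManager clienteLaserFiche = new C_AU.ClientManager())
{
    // Get the main windows of the open client and take the first one.
    IList<C_AU.ClientWindow> carpetasAbiertas = clienteLaserFiche.GetAllClientWindows(C_AU.ClientWindowType.Main).ToList();
    C_AU.MainWindow carpeta = carpetasAbiertas[0] as C_AU.MainWindow;

    // Generate image pages without showing any UI.
    C_AU.GeneratePagesOptions opcionesCat = new C_AU.GeneratePagesOptions();
    opcionesCat.ShowUI = false;

    // Entry ID of the PDF document to generate pages for.
    List<int> documentos = new List<int>();
    documentos.Add(915);
    carpeta.GeneratePages(documentos, opcionesCat);
}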

Next, OCR is run on document ID 915:

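// C_DS and C_RA are presumably aliases for Laserfiche.DocumentServices and Laserfiche.RepositoryAccess.
motorOcr = C_DS.OcrEngine.LoadEngine();
motorOcr.AutoOrient = true;
motorOcr.Decolumnize = true;
motorOcr.OptimizationMode = C_DS.OcrOptimizationMode.Accuracy;

// Run OCR on the image pages of document 915 using the existing session.
motorOcr.Run(C_RA.Document.GetDocumentInfo(915, varConexion.varSession));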

So how can text be extracted from a PDF with scanned images via the Laserfiche SDK, a Workflow script, or some other means?

Is it possible to do this without calling Snapshot through the Laserfiche Client?

Jquery.pdf (160.91 KB)


Answer

APPROVED ANSWER SELECTED ANSWER

replied on January 21, 2014

The OCR engine only works on images. The OCR engine itself does not create image pages. Due to licensing agreements for third-party components, the ability to generate image pages from PDFs is not available in the SDK.


Replies

replied on January 21, 2014

OK, thanks.


replied on March 6, 2016

One approach to share that we've used elsewhere:

If Laserfiche Import Agent (LFIA) and Laserfiche Workflow are both available, create a Workflow rule that exports the PDF to a folder monitored by an LFIA import rule, with the Advanced options set to generate a Laserfiche native document (TIFF images) as it imports the PDF. Then use a second Workflow rule to merge the incoming LFIA images with the original PDF, or simply keep the incoming images if you don't need the PDF.
