Question

Import multi-page tif file into one-page OCR'd Laserfiche documents

SDK
asked on February 4, 2015

I've been using the COM interop SDK functions for a few years, but I'm new to the .NET components.

What is the best way to write a function that saves a non-Laserfiche multi-page TIFF file as several separate one-page Laserfiche documents?
Are all of these steps necessary?

1 - Save as a multi-page LF document with Interop.DocumentProcessor90.DocumentImporter
2 - For each page, ExportPage(IDocumentContents, Int32, Stream) into a stream
3 - For each page, ImportEdoc(String, Stream) into a new LF document
4 - For each page, OCR that document with Run(DocumentInfo, PageSet)

Thank you!

Replies

replied on February 5, 2015

Use the LaserficheImaging library to extract each page from the multi-page TIFF (pages are called frames in LaserficheImaging), then import each page separately using RepositoryAccess. If these are TIFF-G4 or TIFF-JPEG images, you don't even need DocumentServices for this.
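
To make the "frames" idea concrete, here is a minimal sketch in plain Python (not the LaserficheImaging API, which is not shown in this thread). It assumes a classic, non-BigTIFF file and only counts the frames by walking the chain of image file directories (IFDs) that the TIFF format uses to store one page after another. The names `count_tiff_frames` and `make_demo_tiff` are hypothetical helpers made up for this illustration.

```python
import struct


def count_tiff_frames(data: bytes) -> int:
    """Count the frames (IFDs) in a classic TIFF held in memory."""
    if data[:2] == b"II":
        endian = "<"  # little-endian file
    elif data[:2] == b"MM":
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF: bad byte-order mark")

    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")

    frames = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        # Each IFD entry is 12 bytes; the 4-byte offset of the next IFD follows.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * entry_count)
        frames += 1
    return frames


def make_demo_tiff(n_frames: int) -> bytes:
    """Build a tiny little-endian TIFF skeleton (hypothetical test data, no pixels)."""
    out = bytearray(b"II" + struct.pack("<HI", 42, 8 if n_frames else 0))
    for i in range(n_frames):
        # Every demo IFD is 2 + 12 + 4 = 18 bytes long.
        next_offset = len(out) + 18 if i < n_frames - 1 else 0
        out += struct.pack("<H", 1)                # one entry in this IFD
        out += struct.pack("<HHII", 256, 3, 1, 1)  # ImageWidth tag, SHORT, value 1
        out += struct.pack("<I", next_offset)
    return bytes(out)


if __name__ == "__main__":
    print(count_tiff_frames(make_demo_tiff(3)))
```

Each frame found this way corresponds to one page that would be imported as its own document; the actual extraction and import would still go through LaserficheImaging and RepositoryAccess as described above.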

replied on February 9, 2015

Hi Howard,

If you still need help with your question, let us know by updating the thread! If however it has been answered, please click the "Mark this reply as the answer" button on the appropriate response - Thank you!

replied on February 9, 2015

Thanks Michael.

But due to the lack of documentation for LaserficheImaging, I'm writing this with DocumentServices and RepositoryAccess.

replied on February 11, 2015

Hi Howard,

I'm not familiar enough with the SDK to give you a function-based answer, but I believe the SDK may be unnecessary in this case. Using Workflow, you are able to separate the pages of a document into single pages. The basic components of the workflow include a Find Entry or Search Repository activity to locate the document, then a Repeat loop to separate its pages. I show an example (with further explanation) below.

Within the Find Entry or Search Repository step, be sure to specify "Page Count" under Additional Properties - this pulls the number of pages in the document for use in the Repeat loop.

In the Repeat activity, the condition determines how many times the loop iterates. Configure the condition so that the loop runs as many times as there are pages in the document.

Within the loop, each page of the document will go to the destination and will be named as specified in the Create Entry activity. I suggest using a token in the Entry Name to give each page a unique name. The %(Repeat_Iteration) token will name the new document according to the page number.

In the Move Pages activity, you will separate the document one page at a time. If you'd like to keep the original document, select "Copy pages" under Action. Otherwise, select "Move pages" and "Delete document if all its pages have been moved." Set Move Pages From to the document found in the Search Repository (or Find Entry) step, and Move Pages To to the entry created by the Create Entry activity. If you are copying pages, the page to pull out is the one matching the current iteration, so Page Range will be %(Repeat_Iteration). If you are moving pages, Page Range will be 1, because after each move the next remaining page becomes the new first page.
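
The difference between the two Page Range settings can be checked with a small simulation. This is only a sketch in plain Python that models the document as a list of pages; it does not call Workflow or the SDK, and `split_pages` is a hypothetical name used for illustration.

```python
def split_pages(pages, mode):
    """Simulate the Repeat loop that splits a document into one-page documents.

    mode="copy": the source keeps its pages, so Page Range is the iteration number.
    mode="move": the source shrinks on every pass, so Page Range is always 1.
    Returns (new_documents, remaining_source_pages).
    """
    source = list(pages)
    page_count = len(source)  # "Page Count" is read once, before the loop starts
    new_docs = []
    for iteration in range(1, page_count + 1):
        if mode == "copy":
            page_range = iteration
            page = source[page_range - 1]
        elif mode == "move":
            page_range = 1
            page = source.pop(page_range - 1)
        else:
            raise ValueError("mode must be 'copy' or 'move'")
        # The iteration number gives each new one-page document a unique name.
        new_docs.append((f"Page {iteration}", page))
    return new_docs, source


if __name__ == "__main__":
    print(split_pages(["a", "b", "c"], "copy"))
    print(split_pages(["a", "b", "c"], "move"))
```

Both modes produce the same three one-page documents; only the source differs afterwards (intact when copying, empty when moving). Using the iteration number as the Page Range while moving would skip pages, since on the second pass the original third page has already shifted into position 2.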

Another Answers post that is useful can be found here. I hope this will be useful!
