Ability to Split the page in SDK Script in Workflow

replied on March 31, 2015

The Laserfiche SDK will give you the tools you need to iterate through the pages in a document, retrieve their image data and decode them into bitmaps, to create new images from rectangular regions in the source image, and to import these new images as pages in a document. However, Laserfiche does not ship with code to identify horizontal lines on a page and to determine their coordinates. You would need to supply code to do that. If you know how to write code to identify the presence of horizontal lines on a bitmap, or can find an image processing library which can do this, then we can outline which Laserfiche classes and methods to call to perform the other tasks. However, I wanted to clarify first that you know how to identify these lines in an image using code since it's key to solving your problem.
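
To give a sense of the kind of code you would need to supply, here is a minimal sketch of one naive way to find horizontal lines. It is written in Python for brevity and is not Laserfiche code. It assumes the page has already been decoded and thresholded into rows of 0 (light) and 1 (dark) pixels. The function name `find_horizontal_lines` and the `min_coverage` parameter are hypothetical, and a real scan with skew or noise would likely need a proper image processing library.

```python
from typing import List, Tuple


def find_horizontal_lines(bitmap: List[List[int]],
                          min_coverage: float = 0.8) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) spans of rows that are mostly dark pixels.

    bitmap is a list of rows; each pixel is 1 (dark) or 0 (light).
    """
    lines = []
    start = None  # first row of the line currently being tracked, if any
    for y, row in enumerate(bitmap):
        is_dark_row = bool(row) and sum(row) / len(row) >= min_coverage
        if is_dark_row and start is None:
            start = y                      # a new line begins here
        elif not is_dark_row and start is not None:
            lines.append((start, y - 1))   # the line ended on the previous row
            start = None
    if start is not None:                  # the line runs to the bottom edge
        lines.append((start, len(bitmap) - 1))
    return lines
```

Each returned pair gives the top and bottom row of one detected line, which is the y-coordinate information you would need in order to decide where to cut the page.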

replied on April 1, 2015

Thanks Michael. Even if I can identify the word 'TEST' in the page, that will work. It need not be a horizontal line. Do you have any sample code for this please?

replied on April 1, 2015

I don't have sample code, as a fair amount of code would be required to do this. I can point you to the API functions you should be calling though, and in what order.

1. First you need a DocumentInfo instance that represents the document you'll be working with. If you are using Workflow, then an EntryInfo reference representing the entry should be passed to the script. Cast this to a DocumentInfo.
2. You need to iterate through the pages so you can retrieve the text and image data. Call DocumentInfo.GetPageInfos, which returns a PageInfoReader instance. PageInfoReader implements IEnumerable, so use a loop to retrieve a PageInfo instance for each page. You will be processing one page at a time.
3. Call PageInfo.ReadTextPagePart and read out the text part of the page. Store the text in a local MemoryStream instance that you allocate.
4. Instantiate a WordsReader instance from the MemoryStream that contains the page text and call WordsReader.Read in a loop until you find a PageTextWord value that represents the key word you're looking for, such as "TEST".
5. Set the MemoryStream.Position property for the text page back to 0 to rewind the stream, and create a new instance of WordsReader from it.
6. Call PageInfo.ReadLocationsPagePart to retrieve a WordLocationsReader instance. The locations data for the page describes where on the image each word is located. This will be used to find the rectangle on the image, and its coordinates, that is the bounding box for the key word.
7. Instantiate a TextLinker instance, passing the WordsReader instance from step 5 and the WordLocationsReader from step 6 to the constructor.
8. Call TextLinker.GetRectanglesInRange, passing as the start and end of the range the starting and ending positions of the key word "TEST", which are available as properties of the PageTextWord value you retrieved in step 4. You now have a rectangle listing the coordinates of the word "TEST" on the image portion of the page.
9. Call PageInfo.ReadPagePart, specifying PagePart.Image, to retrieve a Stream instance representing the image portion of the page.
10. Reference the LaserficheImaging.dll assembly. Create an instance of Laserfiche.Imaging.Core.LfWriteableBitmap, passing the Stream instance from step 9 to the constructor.
11. Query the properties of LfWriteableBitmap to determine the image's width and height.
12. Now you can split the source image into several smaller images. For each smaller image you want to create, create a new instance of LfWriteableBitmap, specifying the desired width, height, DPI and pixel format. You should copy the DPI and pixel format values from the properties of the source (original) LfWriteableBitmap.
13. Call LfWriteableBitmap.WritePixels on each of the destination (new) LfWriteableBitmap instances, specifying the coordinates in the source LfWriteableBitmap to copy over.
14. Create an instance of Laserfiche.Imaging.LfiBitmapEncoder, specifying the desired image format. For each new bitmap, set the element of LfiBitmapEncoder.Frames to reference the LfWriteableBitmap instance representing the new bitmap and call LfiBitmapEncoder.Save to save a copy of the image to disk. You will later delete this temporary file.
15. Call DocumentInfo.InsertPage to create a new page and store the returned PageInfo reference representing the new page. You might want to do this in a new document so as not to disturb the source document.
16. Call PageInfo.WritePagePart, specifying PagePart.Image, to retrieve a Stream instance that you will use to upload the image data. The size argument to WritePagePart should be the length of the image file that was saved to disk in step 14.
17. In a loop, call Stream.Write to upload the data from the image file you created in step 14 to the page you created in step 15. When you're done, delete the temporary file.
18. You're done. Note that steps 12-17 will have to be repeated for each sub-image you create from the original image.
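
The part of this procedure that decides which regions of the source image to copy is plain rectangle arithmetic and does not depend on the Laserfiche API. As a minimal sketch (in Python for brevity; the same logic should port directly to a C# or VB.NET Workflow script), the following cuts a page into full-width bands at the top edge of each key word bounding box. The `Rect` type and the `split_regions` function are hypothetical names introduced for illustration and are not part of the Laserfiche SDK. The sketch also assumes the rectangles are in image pixel coordinates with the origin at the top-left, so if the locations data uses different units you would need to convert first.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in image pixel coordinates (origin top-left)."""
    left: int
    top: int
    width: int
    height: int


def split_regions(page_width: int, page_height: int,
                  keyword_boxes: List[Rect]) -> List[Rect]:
    """Return full-width bands, cutting the page at the top of each keyword box."""
    # Collect cut positions, clamp them to the page, drop duplicates, and sort.
    cuts = sorted({min(max(box.top, 0), page_height) for box in keyword_boxes})
    # Band boundaries: top of page, every interior cut, bottom of page.
    edges = [0] + [c for c in cuts if 0 < c < page_height] + [page_height]
    # Pair consecutive edges into (top, bottom) bands.
    return [Rect(0, top, page_width, bottom - top)
            for top, bottom in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Hypothetical 1700x2200 page with "TEST" found at y=500 and y=1400.
    boxes = [Rect(120, 500, 90, 30), Rect(120, 1400, 90, 30)]
    for region in split_regions(1700, 2200, boxes):
        print(region)
```

Each resulting rectangle gives the width and height for one destination bitmap and the source coordinates to copy from. A key word at the very top of the page produces no extra cut, so the page comes back as a single band.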

replied on April 1, 2015

Thank you so much. We will try this.

Priya
