Question

SDK OCR Location Data

asked on November 10, 2015

We're looking into using a third-party OCR engine. The last piece is writing the location data to Laserfiche.

I can't find any information on the location data format that Laserfiche uses or expects, and I'm not sure how to decode the binary data coming from Laserfiche so I can see what it is expecting.

Using SDK 9.x

Anyone have any experience with this or examples?

Thanks.

Replies

replied on November 10, 2015

It's a series of rectangles, one per word. Each rectangle consists of four unsigned 16-bit integers, denoting the X coordinate of the upper left-hand corner, the Y coordinate, the width in pixels, and the height in pixels, in that order. The upper-left corner of the image is coordinate (0, 0). There is no padding at all in the locations stream.

As for what constitutes a word, it's a bit complex to explain. Instead, use the WordsReader class in RepositoryAccess. Given a System.IO.Stream or TextReader representing your text for the page, use it to break up the text part of the page into a sequence of words. Each word needs exactly one rectangle, and the order of the rectangles in the locations page part must match the order of the words in the text.
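
To make the layout concrete, here is a minimal sketch of building the locations stream. It is written in Python only to illustrate the byte layout, not as SDK code. The byte order is not stated in this thread, so little-endian is an assumption, and the names `pack_locations`, `words`, and `rects` are hypothetical. The word list stands in for whatever WordsReader produces for the page text.

```python
import struct

# Assumption: little-endian byte order; the thread does not say which is used.
# Four unsigned 16-bit integers per rectangle: x, y, width, height.
RECT = struct.Struct("<4H")

def pack_locations(rects):
    """Serialize (x, y, width, height) tuples back to back, with no padding."""
    return b"".join(RECT.pack(*r) for r in rects)

# Hypothetical OCR output: one rectangle per word, in the same order as the text.
words = ["Hello", "world"]
rects = [(10, 20, 50, 12), (70, 20, 55, 12)]

# Every word needs exactly one rectangle.
assert len(words) == len(rects)

data = pack_locations(rects)  # 8 bytes per word
```

Values outside the range 0 to 65535 raise `struct.error` here, which is a useful early check that the OCR engine's coordinates fit the 16-bit fields.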

replied on November 11, 2015

Is the location stream an array, CSV, or something else?

replied on November 11, 2015

An array of binary data. Each rectangle is 8 bytes long: four 16-bit integers per rectangle.
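
For inspecting an existing locations stream, a rough decoding sketch follows. It is in Python purely to show the 8-byte record structure. Little-endian byte order is an assumption (it is not confirmed above), and `unpack_locations` is a hypothetical helper name, not part of the SDK.

```python
import struct

RECT_SIZE = 8  # four 16-bit integers per rectangle

def unpack_locations(data):
    """Split a locations stream into (x, y, width, height) tuples."""
    if len(data) % RECT_SIZE != 0:
        raise ValueError("stream length is not a multiple of 8 bytes")
    # Assumption: little-endian unsigned 16-bit values.
    return list(struct.iter_unpack("<4H", data))

# Example with one hand-built rectangle: x=10, y=20, width=50, height=12.
sample = bytes([10, 0, 20, 0, 50, 0, 12, 0])
rects = unpack_locations(sample)
```

If the decoded rectangles look implausible (for example, huge coordinates on a small page), try the big-endian format string `">4H"` instead and compare.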
