Question

I want to OCR pages that fall within a specific range of page size (KB)

Workflow SDK Scanning

Updated July 14, 2015

asked on July 2, 2015

Hi,

I am using DCC to OCR all documents created in a specific folder. The volume of pages that need to be OCR'd is quite large. A single document has a minimum of 60 pages.

OCR'ing these pages becomes extremely slow when the page being processed has a lot of printed data on it. The less data on the page, the quicker the OCR process for that page.

For example, the OCR of a page that is scanned at 300 dpi in black and white and is about 200 KB in size can take up to a minute. However, a page of 20 KB takes less than 5 seconds. These scanned documents contain a lot of handwritten and poor-quality pages, which results in longer OCR times.

My design only requires the OCR of the small pages, not the large ones. The smaller pages are simply cover sheets that I use to identify document types. I'm not interested in the OCR of all the other pages.

Is there a way to pick out the pages of a document based on their size and then OCR only those pages? If it must be done via script, I will do so. If it can be done via DCC, I will do that too, but I don't see any options in DCC for this.
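
To make the idea concrete, here is a rough sketch of the size filter in Python. It does not use the Laserfiche SDK or DCC. It assumes the pages have already been exported as individual TIFF files in a folder, and the function name `select_pages_by_size`, the `*.tif` pattern and the 50 KB cutoff are all hypothetical, chosen only to separate the ~20 KB cover sheets from the ~200 KB pages mentioned above.

```python
from pathlib import Path


def select_pages_by_size(folder, min_kb=0, max_kb=50, pattern="*.tif"):
    """Return page image files whose size on disk is within [min_kb, max_kb].

    Hypothetical helper: assumes one image file per page in `folder`.
    """
    selected = []
    for page in sorted(Path(folder).glob(pattern)):
        size_kb = page.stat().st_size / 1024  # bytes to kilobytes
        if min_kb <= size_kb <= max_kb:
            selected.append(page)
    return selected
```

Only the files returned by this filter would then be handed to the OCR step; the larger pages would be skipped.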

Can anyone assist?

Thanks

Sheldon

Replies

replied on July 2, 2015

Are you using Workflow alongside DCC? Did you use Workflow to generate the list of documents that are OCR'd by DCC?

replied on July 9, 2015

Hi Ryan,

Yes, that is correct. I use WF to generate the list and pass it over to DCC.

Regards,

Sheldon

replied on July 14, 2015

Hi Ryan,

Any thoughts?

Thanks

Sheldon
