Non-uniform documents

replied on May 19, 2017

Can you use slip sheets to separate the documents?

Can you use the same pattern to find the information on all documents? With a Pattern Match activity, you can match on the whole text of the page rather than trying to limit it to a zone.

replied on May 19, 2017

The documents are already scanned and in TIFF format; there are about 1300 of them. I could potentially edit the documents, but with 1300 that seems a bit unreasonable now.

I'm unfamiliar with Pattern Match at the moment. It sounds good, but my problem seems to be separating one document from the next.

replied on May 19, 2017

Do you have one multi-page TIFF for each document? If that's the case, then you don't need identification processes. Universal Capture can just keep their current structure.

Your original description sounded like you were describing a stack of papers with no pattern to where one document ends and the next starts, so both Tessa and Bert were trying to figure out how you would tell when to separate them.

replied on May 19, 2017

Hi Miruna,

What I have is a set of TIFF images (about 1300), each with a varying number of pages (one has 67, another has 24, etc.).

I want to scan for a permit # and then store each TIFF as one document: the 67-page TIFF as one document and the 24-page TIFF as a separate document. The permit number should be on a page that exists in every TIFF image/document, but not necessarily on the same page number in each instance.

My apologies if I wasn't being clear. Let me know if this makes sense. Thanks for your help.

replied on May 19, 2017

Ah I see. Yeah in your case, for the identification, since your documents are already split up properly you don't need to configure any identification processes - just check this box instead:

And then for the permit number, you'll want to OCR all pages and use pattern matching to search for the permit number on all pages of the document.
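
To illustrate the idea of searching every page rather than a fixed one, here is a rough Python sketch of the logic. It is not Quick Fields configuration. It assumes the OCR text for each page is already available as a list of strings, and the permit number format in `PERMIT_PATTERN` is a guess, since the format isn't given in this thread. Adjust the expression to whatever your permits actually look like.

```python
import re
from typing import Optional, Sequence, Tuple

# Hypothetical pattern: a "Permit #", "Permit No." or "Permit Number" label
# followed by a token of letters, digits, and hyphens containing at least
# one digit. The real permit format is an assumption here.
PERMIT_PATTERN = re.compile(
    r"Permit\s*(?:#|No\.?|Number)\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
    re.IGNORECASE,
)


def find_permit_number(page_texts: Sequence[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based page number, permit number) for the first page
    whose text matches, or None if no page matches."""
    for page_number, text in enumerate(page_texts, start=1):
        match = PERMIT_PATTERN.search(text)
        if match:
            return page_number, match.group(1)
    return None


# Example: the permit number sits on page 2 of a three-page document.
pages = [
    "Cover sheet",
    "Site plan\nPermit # B-2017-0042\nIssued",
    "Appendix",
]
result = find_permit_number(pages)
```

The same kind of regular expression is what you would adapt for the pattern matching step, applied to all pages instead of a single zone.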

replied on May 19, 2017

How are the TIFF files that you already have named? Is the permit number already part of the document name for each TIFF? If so, you can grab that without OCR and Pattern Matching.
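
As a rough sketch of what pulling the number from the file name could look like, here is a small Python example. The naming scheme (`permit_2017-0042.tif`) and the pattern are purely hypothetical, since the actual file names haven't been posted, so treat this as an illustration of the idea rather than something to use as-is.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme: "permit_2017-0042.tif" or "Permit-2017-0042.tiff".
FILENAME_PATTERN = re.compile(r"permit[_-]?(\d{4}-\d{4})", re.IGNORECASE)


def permit_from_filename(path: str) -> Optional[str]:
    """Return the permit number embedded in the file name, or None
    if the name does not follow the assumed scheme."""
    stem = Path(path).stem  # file name without directory or extension
    match = FILENAME_PATTERN.search(stem)
    return match.group(1) if match else None


named = permit_from_filename("scans/permit_2017-0042.tif")
unnamed = permit_from_filename("scans/scan0001.tif")
```

Files that return `None` here would still need the OCR and pattern matching route.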
