Question

Zone OCR token changing data into random characters

asked on January 26, 2016

I have a customer that has a QF session with a couple of Zone OCR spots: one for vendor name and another for check number and amount. The check number and amount work perfectly. The vendor name is giving me trouble. What's really confusing is that the Processing Information pane shows that it read the vendor name perfectly. When I put that token in the vendor name field, it gets changed into random characters. A lot of them. This is a screen grab from her workstation that shows what I'm talking about.

If the picture is hard to see, let me know and I'll upload it instead. Any ideas on what to do would be great. I feel like this might be a glitch or bug, but let me know what you guys think. Thanks.

0 0

Replies

replied on January 27, 2016

That looks like the page has shifted up or down during scanning and Zone OCR read through the characters on the first line rather than seeing whole characters. Or possibly the sample image has a different size or resolution than the one you're scanning, so the data is not in the same place. The data in the Processing Info pane looks like it's from testing processes rather than scanning, so I think that might be throwing you off.
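The "read through the characters" effect can be illustrated with a small sketch. This is a toy model, not anything taken from Quick Fields: the function name `visible_fraction` and all pixel values are made up for illustration. It only shows how a fixed zone that fits a line of text on the sample page can end up covering half of that line when the scanned page shifts vertically.

```python
def visible_fraction(zone_top, zone_bottom, text_top, text_bottom, shift=0):
    """Fraction of a text line's height that falls inside a fixed zone.

    All values are pixel rows with top < bottom. `shift` moves the text
    down (positive) or up (negative), as a page shifted during scanning
    would. Hypothetical helper for illustration only.
    """
    top = text_top + shift
    bottom = text_bottom + shift
    overlap = min(zone_bottom, bottom) - max(zone_top, top)
    return max(0, overlap) / (text_bottom - text_top)


# Zone covers rows 100-140; the vendor-name line sits at rows 105-135.
print(visible_fraction(100, 140, 105, 135))            # 1.0, whole characters
print(visible_fraction(100, 140, 105, 135, shift=-20))  # 0.5, characters cut in half
```

A tall zone tolerates more shift than a tight one, which is one possible reason some zones on the same page keep working while another fails.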

1 0
replied on January 27, 2016

The documents shifting would definitely give me trouble, but it's not shifting enough to affect the other Zone OCR regions I have. Check number and check amount work, even date works. There is something about the text for the vendor name that gives my OCR an issue. And even crazier is that some of the vendor names come through fine. I have worked with the client to make sure the documents are the same size as the sample doc because I have run into that one before too.

0 0
replied on January 28, 2016

The way I usually check these types of issues is by taking the problem page and setting it as a sample page because that gives me a better idea of where exactly the zones are on this particular image and I can test the processes directly.

If you select the page in the document so it's displayed in the image pane and then double-click on the status bar in the bottom right corner on the image information (where it says "Page 1 of 6...."), Quick Fields will open the session's temp folder in Windows Explorer and select the image for you. You can copy it out of that folder and then use it to replace the sample image for that page.

1 0
replied on January 28, 2016

I was thinking about trying that, I'll give it a shot and see what happens. Thanks.

0 0
replied on January 29, 2016

OK Miruna, your tip was very helpful. By taking the problem image and making it a sample image, I was able to see that the alignment was off. It makes me want a feature in Quick Fields to show the OCR boxes on a scanned document so that we can check alignment without having to do the extra work. Is this a feature we can expect to see?

P.S. I also could not get the QF session's temp folder to open with your instructions. I copied the image to the clipboard, pasted it into Paint, and saved it as a new image. Do you have any documentation on your technique? It sounds easier than what I did.

Thanks for the help. 

0 0
SELECTED ANSWER
replied on January 29, 2016

Pasting the image in Paint and saving it will definitely change the resolution down to 96 dpi, so that might explain why the boxes were off.

For the temp folder, you clicked here:

0 0
replied on January 29, 2016

You are correct, Paint made it 96 DPI, but it kept the height and width correct, so it worked for me. I would rather access the temp folder just like you showed me. Thanks for the picture. That helped a ton. I got it now. But what about the idea of QF showing the OCR boxes on an actual scan? The process information in the output pane shows the text OCR extracts, but I would love to see the boxes on the scanned docs. They don't even need to be adjustable at that point, but an easy way to make sure the alignment is on would be great.
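The relationship between DPI and pixel dimensions mentioned here is plain arithmetic, sketched below. The thread does not say whether Quick Fields stores zone coordinates in pixels or in physical units, so that part is an assumption; the helper names and the 2550 x 3300 page size are hypothetical.

```python
def physical_size_inches(width_px, height_px, dpi):
    """Physical page size implied by pixel dimensions and a DPI tag."""
    return (width_px / dpi, height_px / dpi)


def inches_to_pixels(value_in, dpi):
    """Pixel offset corresponding to a physical distance at a given DPI."""
    return round(value_in * dpi)


# Hypothetical letter-size scan: 2550 x 3300 pixels.
print(physical_size_inches(2550, 3300, 300))  # (8.5, 11.0)
print(physical_size_inches(2550, 3300, 96))   # (26.5625, 34.375)

# A point 1 inch from the top lands on a different pixel row per DPI.
print(inches_to_pixels(1.0, 300))  # 300
print(inches_to_pixels(1.0, 96))   # 96
```

If zones were positioned by pixel, an image re-saved at 96 DPI with unchanged pixel width and height would keep them in the same place, which is consistent with what is described above. If they were positioned by physical distance, the same file would put them somewhere else.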

1 0
replied on January 29, 2016

That's not scheduled for any time soon, but we'll look into whether there's a way to indicate the regions on scanned documents.

1 0
replied on January 26, 2016

Since I see you have a 6-page document shown there, my first thought would be to look at the 'Page Range' to check which pages that 'Vendor Name' Zone OCR process is running on.

If it's running on all pages, it could be capturing the correct information from one of the pages but then overwriting the value with gibberish captured from the same spot on a different page. 
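The overwriting scenario can be pictured with a toy model. This is only a sketch of the idea, not Quick Fields' actual behavior or API: `run_zone_ocr`, the page texts, and the vendor name are all invented for illustration, and it assumes the last page processed wins.

```python
def run_zone_ocr(page_texts, page_range=None):
    """Toy model: one zone read on each page, all feeding a single field.

    `page_texts` holds whatever the zone would read on each page.
    `page_range` is a set of 1-based page numbers, or None for all pages.
    Each page in range overwrites the field, so the last one wins.
    """
    field_value = None
    for number, text in enumerate(page_texts, start=1):
        if page_range is None or number in page_range:
            field_value = text
    return field_value


# Made-up reads: page 1 has the vendor name, later pages have noise.
reads = ["ACME SUPPLY", "l|.,~'", "_;:I1|"]

print(run_zone_ocr(reads))                  # _;:I1|  (all pages, last read wins)
print(run_zone_ocr(reads, page_range={1}))  # ACME SUPPLY
```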

0 0
replied on January 27, 2016

Good thought, that was one of the first things I looked at. There are actually 2 separate OCR processes, both set up to run on page 1. That is what is so crazy. One of the processes runs fine and this one doesn't. I'm going to try to combine both processes to see if for some reason that fixes it. I'll let you guys know.

0 0