Question

Zone OCR not picking up all the text

asked on March 17, 2023

I am setting up a new session and copied a previously working session to get there faster.  Unfortunately, the Last Page Identification is not working.  I have rebuilt the OmniPage Zone OCR activity numerous times and tried all kinds of changes, but the tests still do not separate the two test pages.

There is another Zone OCR in Page Processing that collects all the text on the page. I copied that collected text from the Output pane into a Word document to see why the almost identical Zone OCR in Last Page Identification might not find the targeted word.  The word is there, so it should be able to find it.

All the Last Page Identification Zone OCR can find is this:  

And in a full test run of two sheets, it finds the above on the first sheet and this on the second sheet (which does have the first words of OWNERS).  It does not appear to be collecting the rest of the words on the page:  

and further down on the page is...

And when I copy the text collected by the Page Processing Zone OCR, it does include this:  *END OF SHEET*

Replies

replied on March 17, 2023

Your advanced options are set to 'true'. Have you tried setting that to 'false'?

replied on March 17, 2023

Thanks, Tegan.  Yes, I have tried numerous variations including that one.  I've also tried setting Character preference to None, and Use Existing Text to both True and False.

replied on March 17, 2023

Unfortunately, when this happened to me, the only solution I found that worked was burning it down and rebuilding it from scratch. Not the best response, but I have found that sometimes (not all the time) when I copy files over they get really buggy.

replied on March 17, 2023

That's what I decided to do too.  I'm in the middle of the build, but I won't finish today and then I'm off for two weeks.  :(   Hate leaving something like this unfinished.

replied on March 20, 2023

Looks like the word SHEET may have some white space between the letters that the OCR engine might be picking up as actual space characters. If so, the target word would no longer match as one unbroken string, so you could use a regex that allows for those potential white spaces.
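To illustrate the idea, here is a minimal Python sketch of a whitespace-tolerant pattern. It is only an assumption about how the match could be written, not how the Zone OCR or Last Page Identification activity works internally. The helper names `spaced_pattern` and `is_last_page` are made up for this example, and ignoring case is my own choice.

```python
import re


def spaced_pattern(phrase: str) -> re.Pattern:
    """Build a regex matching `phrase` even when OCR puts whitespace between letters.

    Hypothetical helper for illustration only.
    """
    # Drop the spaces already in the phrase, then allow any amount of
    # whitespace (including none) between each remaining character.
    letters = [re.escape(ch) for ch in phrase if not ch.isspace()]
    return re.compile(r"\s*".join(letters), re.IGNORECASE)


# Target phrase taken from the thread: *END OF SHEET*
END_OF_SHEET = spaced_pattern("END OF SHEET")


def is_last_page(ocr_text: str) -> bool:
    """Return True if the OCR text contains the end-of-sheet marker."""
    return END_OF_SHEET.search(ocr_text) is not None


if __name__ == "__main__":
    print(END_OF_SHEET.pattern)
    print(is_last_page("*END OF S H E E T*"))
    print(is_last_page("OWNERS"))
```

The generated pattern is `E\s*N\s*D\s*O\s*F\s*S\s*H\s*E\s*E\s*T`, which could be adapted to whatever pattern-matching step follows the Zone OCR. Note that it can only help if the Zone OCR actually captures the letters; it will not recover text the zone never collected.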

replied on April 5, 2023

Rebuilding from scratch, plus some more tweaking, and I got it working.
