Question

Background Shade Removal Before Zone OCR'g

asked on February 19, 2015

I am trying to lift the Amount Due and Due Date off of a dirty scanned document where the background shading has been turned into a corduroy pattern.  I have tried all the image enhancements available in LF / QF Scanning but continue to pick up garbage values with the Zone OCR.  I have also tried toying with the various advanced options in Zone OCR.

Has anyone overcome a similar issue using LF or a third-party product to clean up the background?

 


Replies

replied on February 19, 2015

I got perfect OCR results for your sample image by running Color Removal and then Despeckle. Here are the settings I used:

 

replied on February 19, 2015

Tessa,

I tried similar combinations of QF processes and have since retried with your configuration, but it still does not capture the Zone OCR values accurately.

The image in the email is a copy/paste from a screenshot taken with Snipit. I believe you would need the full-page TIF of the original to run an equivalent test. Thanks for trying.

Eric.

replied on February 19, 2015

Feel free to attach the original file (with any personal/confidential info redacted), and I'll take a look. 

replied on February 20, 2015

Eric,

For completeness' sake, please make sure the OCR is set to the accuracy option together with Tessa's image enhancement options.

replied on February 20, 2015

Do you have the original documents, or a faxed/copied version of the originals to scan?

I'm just guessing here, but to me this looks like a colored background field. This is one of those areas that is best handled on the device or with the scanning software that came with your device.

For example, on Canon Advanced copiers (most of the models released in the past few years) you can set a color to be dropped out when the scan is converted to B&W. Black is all colors mixed together, so you can drop 80% of the color off the page and still get a good B&W document. I haven't seen this feature on all copiers, though, and when it is present it is almost always available only when the mode is set to B&W (not grayscale).
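To illustrate the dropout idea, here is a minimal Python sketch. It is not how any Canon device implements the feature; it assumes a common approach in which the B&W decision is made from the single channel that matches the dropout color, so ink of that color reads as bright paper. The function names, the pixel values, and the cutoff of 128 are all made up for the example.

```python
# Pixels are (r, g, b) tuples with values 0-255. Output: 1 = black, 0 = white.
# Hypothetical helper names; this is an illustration, not a device algorithm.
CHANNEL = {"red": 0, "green": 1, "blue": 2}


def dropout_to_bw(pixels, drop="red", cutoff=128):
    """Threshold using only the channel of the dropout color.

    A red-shaded background has a high red value, so it falls above the
    cutoff and comes out white, while black text stays black.
    """
    c = CHANNEL[drop]
    return [[1 if px[c] < cutoff else 0 for px in row] for row in pixels]


def plain_to_bw(pixels, cutoff=128):
    """Baseline for comparison: threshold on the average of all three channels."""
    return [[1 if sum(px) / 3 < cutoff else 0 for px in row] for row in pixels]


if __name__ == "__main__":
    shade, text = (220, 60, 60), (10, 10, 10)  # assumed sample colors
    page = [[shade, text], [shade, shade]]
    print("dropout:", dropout_to_bw(page, drop="red"))
    print("plain:  ", plain_to_bw(page))
```

With the assumed sample colors, the plain conversion turns the shading black along with the text, while the red dropout leaves only the text pixel black.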

 

 

If you have a Canon scanner with CapturePerfect software you can even scan in

replied on February 19, 2015

If the input is bitonal, you can try the Grow and Erode options.

Academically speaking, these options perform mathematical morphology operations. A tutorial on morphological operations can be found at http://homepages.inf.ed.ac.uk/rbf/HIPR2/morops.htm

The choice between Grow or Erode depends on whether the text is lighter or darker than the shaded background. You may need to run a trial to find out the correct one. Alternatively, as Tessa suggests, please attach a full-size test image so that we can do some testing here.

The random dots in the shaded area are the result of dithering, as described at http://homepages.inf.ed.ac.uk/rbf/HIPR2/dither.htm and http://en.wikipedia.org/wiki/Dither .

The optimum filter size for Grow and Erode depends on the dithering parameters. Try to use the smallest size that enables OCR to be performed.
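For anyone who wants to see what these operations do to a dithered image, here is a rough pure-Python sketch. It assumes Grow corresponds to morphological dilation and Erode to erosion with a square filter; that is my reading of the options, not a description of the Laserfiche implementation. The function names and the sample page are made up, and the filter size is assumed to be odd.

```python
# Bitonal image as a list of rows: 1 = black (ink), 0 = white (paper).
# Hypothetical helpers for illustration; filter size is assumed to be odd.

def _square_filter(image, size, combine):
    """Apply `combine` over each size x size neighbourhood.

    Pixels outside the image are treated as white (0).
    """
    h, w = len(image), len(image[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1)
            ]
            row.append(combine(window))
        out.append(row)
    return out


def erode(image, size=3):
    """Keep a black pixel only if its whole neighbourhood is black."""
    return _square_filter(image, size, min)


def grow(image, size=3):
    """Turn a pixel black if any pixel in its neighbourhood is black."""
    return _square_filter(image, size, max)


def remove_specks(image, size=3):
    """Erode, then grow (a morphological opening).

    Black features smaller than the filter disappear; larger ones are
    restored to roughly their original extent.
    """
    return grow(erode(image, size), size)


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0, 0, 1, 0],  # thick stroke plus a dither dot
        [0, 1, 1, 1, 1, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0, 1, 0],  # more dither dots
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
    ]
    for row in remove_specks(page, size=3):
        print("".join("#" if v else "." for v in row))
```

Note that any stroke thinner than the filter is removed along with the dots, which is why the smallest workable filter size matters.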

As mentioned in the articles linked above, dithering is only necessary because the images are being converted to bitonal format. One way to avoid the need for shade removal is to scan the original documents in grayscale, run OCR in Quick Fields, and only after OCR convert the images down to bitonal for storage.
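As a small illustration of why the bitonal conversion creates the dots, the Python sketch below compares ordered dithering with a plain global threshold on a uniformly shaded area. The 2x2 Bayer index matrix, the gray value of 180, and the cutoff of 128 are assumptions chosen for the example; the scanner's actual dithering method is unknown.

```python
# Grayscale values: 0 = black, 255 = white. Output: 1 = black, 0 = white.
# Illustrative only; the scanner's real dithering parameters are not known.
BAYER_2X2 = [[0, 2], [3, 1]]  # standard 2x2 ordered-dither index matrix


def ordered_dither(gray):
    """Convert to bitonal with a position-dependent threshold.

    Mid-gray shading becomes a regular pattern of black dots.
    """
    out = []
    for y, row in enumerate(gray):
        out.append([
            1 if v < (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4 else 0
            for x, v in enumerate(row)
        ])
    return out


def global_threshold(gray, cutoff=128):
    """Convert to bitonal with one fixed threshold for the whole image."""
    return [[1 if v < cutoff else 0 for v in row] for row in gray]


if __name__ == "__main__":
    shade = [[180] * 4 for _ in range(4)]  # assumed light background shading
    print("dithered: ", ordered_dither(shade))
    print("threshold:", global_threshold(shade))
```

On the sample shading, the dithered output carries a dot in every 2x2 cell while the fixed threshold leaves the area white. Dark text (for example a value of 20) comes out black under both, so keeping the grayscale data until after OCR sidesteps the dot pattern entirely.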

replied on February 20, 2015

Redact secure information (or every part of the image except the section from your snipit). Then export the page as a TIFF (including the redactions) and upload/attach the file to your post here so that others can assist you with the enhancement configuration.

replied on September 17, 2020

I have a similar image, but color removal will not work as it is a Black and White image?
