Question

DCC unable to OCR Greyscale Images

asked on May 23, 2018

Hi All,

I'm getting a lot of errors when using DCC to OCR greyscale images. I can OCR them perfectly fine through the client, but DCC simply fails to OCR them.

I've tried enabling image enhancements etc., but nothing seems to work. Has anyone else seen this, or does anyone know what we can do to improve the OCR capabilities of DCC when dealing with greyscale images? B&W images OCR without any issue.

Cheers!

Answer

SELECTED ANSWER
replied on July 12, 2018

Hey All,

This is resolved in DCC 10.2.x; it can OCR greyscale images with no problem.

Cheers!

Replies

replied on May 23, 2018

Hi Chris,

What errors are you getting exactly? I know that when we first started testing our DCC process we ran into problems with OCR on some images because of how much "noise" they contained.

As a follow-up question, what settings are you using? I've found that some Image Cleanup Options can actually cause more errors with certain images.

As an example, a large percentage of our images ran into errors, particularly with the Decolumnize and Despeckle settings enabled, so we don't use those at all.

We have a retry process set up so that if OCR fails twice, it retries a third time with all of the cleanup options disabled (decolumnize/auto-rotate unchecked and all additional options set to "no").
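
A minimal sketch of that retry policy in Python, for illustration only. It assumes a hypothetical `run_ocr(image, options)` callable that returns the recognized text and raises an exception on failure; that callable, the option names, and the `ocr_with_fallback` helper are placeholders, not part of DCC.

```python
def ocr_with_fallback(run_ocr, image, cleanup_options, attempts_with_cleanup=2):
    """Try OCR with the configured cleanup options, then once with all of them off.

    run_ocr is a hypothetical callable (image, options) -> text that raises
    an exception when OCR fails. cleanup_options maps option names to booleans.
    """
    # First two attempts: use the cleanup options as configured.
    for _ in range(attempts_with_cleanup):
        try:
            return run_ocr(image, cleanup_options)
        except Exception:
            continue

    # Final attempt: every cleanup option disabled.
    # If this also fails, the exception propagates to the caller.
    all_disabled = {name: False for name in cleanup_options}
    return run_ocr(image, all_disabled)
```

The last attempt is deliberately left unguarded so that a document which fails even with cleanup disabled is still reported as a failure.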

Grayscale images are inherently noisier than black and white ones, and in my experience they cause more problems with OCR than color images because it is hard for the system to analyze an image with relatively little variation between the background and foreground.

replied on May 23, 2018

Hey Jason,

Unfortunately, from what I can see, DCC doesn't offer much feedback as to why it fails; it just shows that the job failed in the job log. Greyscale scanning is a business decision, so that isn't going to change. What's very odd is that, as I said, it OCRs absolutely fine through the client, so it's not an image issue; it's something specific to DCC.

Cheers!

replied on May 23, 2018

That sounds pretty similar to what I encountered. I wasn't suggesting you stop using grayscale, just that it might require some adjustments to your OCR settings.

Something to consider is that OCR on a local machine is very different from using OCR on the DCC. Local OCR is just one process at a time, but the DCC can be juggling multiple actions and could easily overwhelm your server's resources.

I would try the following steps:

  1. Check your server's memory and CPU usage to see how it is coping
  2. Set the "max" threads/tasks to one fewer than the number of processors (for example, our DCC workers have 12 processors, so we limit them to 11 tasks to make sure they shouldn't go beyond about 85% utilization)
  3. Test again to see how things run
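
The rule in step 2 can be sketched as a small Python helper. The function name and the `reserve` parameter are illustrative assumptions; the resulting number would still be entered by hand wherever the DCC worker's maximum task count is configured.

```python
import os


def max_dcc_tasks(processor_count=None, reserve=1):
    """Return a task limit that leaves `reserve` processors free.

    If processor_count is omitted, the local machine's logical CPU count
    is used. The result is never lower than 1.
    """
    if processor_count is None:
        processor_count = os.cpu_count() or 1
    return max(1, processor_count - reserve)


# Worker with 12 processors, as in the example above.
print(max_dcc_tasks(12))  # 11
```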

If you continue having the same problem:

  1. Disable OCR options one at a time until you get better results
  2. Try re-enabling them one at a time to see which ones may be causing the problems.
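
If the test can be scripted, the two steps above amount to the following sketch. It assumes a hypothetical `run_ocr(image, options)` callable that raises an exception on failure; neither that callable nor the option names are part of DCC, and in practice each trial may simply be a manual test run.

```python
def isolate_failing_options(run_ocr, image, options):
    """Find which enabled options make OCR fail.

    run_ocr is a hypothetical callable (image, options) that raises on failure.
    Returns a list of option names, or None if OCR fails even with
    every option disabled.
    """
    def succeeds(opts):
        try:
            run_ocr(image, opts)
            return True
        except Exception:
            return False

    # Step 1: disable options one at a time until OCR succeeds.
    current = dict(options)
    disabled = []
    for name, enabled in options.items():
        if succeeds(current):
            break
        if enabled:
            current[name] = False
            disabled.append(name)
    if not succeeds(current):
        return None  # the options are not the cause

    # Step 2: re-enable each disabled option on its own and see which break OCR.
    culprits = []
    for name in disabled:
        trial = dict(current)
        trial[name] = True
        if not succeeds(trial):
            culprits.append(name)
    return culprits
```

An empty list means OCR already succeeded with the original settings; `None` points away from the cleanup options and toward something else, such as server resources.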

My guess would be that you're either looking at resource issues, problems with some of the settings, or a combination of both. Also check which version of DCC you are on, because an update improved performance when we first started using DCC.

replied on May 23, 2018

Thanks Jason. I’ve already tried various combinations of image enhancement settings without much improvement. The customer is on DCC 9 but will be upgrading to v10 this year. We will have to hold out until then, as I agree it’s better in 10. Cheers!

replied on July 11, 2018

Just an FYI - I've done some more internal testing on this using the latest version, DCC 10.0.0.111, and it is also unable to OCR greyscale images; the job just fails without giving a reason. I will raise a support case.

replied on July 11, 2018

Does it work on that image if you turn off auto-rotate?
