Question

DC and OCR jobs

asked on August 21, 2018

I have set up a Workflow to check for documents with no electronic file and containing text on "some" pages, and then OCR them.

The OCR jobs don't seem to be generating text and/or marking pages properly, and so the Workflow keeps OCRing the same documents over and over again. 

See attached image.

Now, it IS updating SOMETHING, as the last modified date on the documents keeps changing... but it never changes "pages with text" from "some" to "all".

FWIW: Entry ID 237295 is an 8-page document. I was able to OCR it from the Windows Client with no problem, and it's now marked as having text on "all" pages.

Any thoughts/suggestions?

[Workflow 10.2.0.202]

 

2018-08-21 09_43_37-Laserfiche Web Admin.png

Answer

SELECTED ANSWER
replied on August 21, 2018

OCR settings are configurable in the Workflow activity:

What version of Workflow are you running?

As for the blank pages, there is a patch to DCC that allows it to set the status properly. See KB 1013860.


Replies

replied on August 21, 2018

Are you using the same OCR settings in the Client and in DCC? What version of DCC are you using? Are any of the pages in the document blank?

replied on August 21, 2018

Well, AFAIK, I was using the default settings in both, but those defaults are definitely different.

The client is set to decolumnize, despeckle, deskew, and remove lines.

The DCC service (10.2.0.828) is not set to do any of those...

 

So... is there a way to change the default settings on the OCR service in the DCC?

There doesn't appear to be one when building the Workflow.

 

 

replied on August 21, 2018

Regarding blank pages:

Checking some of the smaller examples... it looks like that's a yes, and those are the pages marked as "no text".

replied on August 21, 2018

?

 

Oh, I see... It's the VM interface messing with me. 

I wasn't getting focus on the "Image Cleanup Options" box until I clicked it once.

If I click it again, I actually get a popup to change the settings.

 

<Sigh>

 

Thanks!

 

 

replied on August 22, 2018

Update: It was definitely the KB issue.

I updated the OCR settings and got the same result until I installed the patch.

 

Thanks again.
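
For anyone who finds this thread later, here is a toy Python model of the loop described above. It is not Laserfiche's API: every name is hypothetical, and the `flag_blank_pages` switch is only an assumed stand-in for whatever the patch in KB 1013860 changes. It just shows why a blank page can keep "pages with text" stuck on "some", so that a Workflow selecting on "some" picks up the same document again.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    is_blank: bool           # page image contains nothing to recognize
    has_text: bool = False   # page is flagged as having text


def pages_with_text(pages: List[Page]) -> str:
    """Summarize the per-page flags as "none", "some", or "all"."""
    flagged = sum(1 for p in pages if p.has_text)
    if flagged == 0:
        return "none"
    return "all" if flagged == len(pages) else "some"


def ocr(pages: List[Page], flag_blank_pages: bool) -> None:
    """Toy OCR pass. flag_blank_pages=True is an ASSUMED model of the patched behavior."""
    for page in pages:
        if not page.is_blank or flag_blank_pages:
            page.has_text = True


def run_workflow(pages: List[Page], flag_blank_pages: bool, max_runs: int = 5) -> int:
    """Re-OCR while the document still matches the "some" condition; return the run count."""
    runs = 0
    while pages_with_text(pages) == "some" and runs < max_runs:
        ocr(pages, flag_blank_pages)
        runs += 1
    return runs


def sample_document() -> List[Page]:
    # Hypothetical 8-page document: three pages already have text, the last page is blank.
    pages = [Page(is_blank=False, has_text=(i < 3)) for i in range(7)]
    pages.append(Page(is_blank=True))
    return pages


unpatched = sample_document()
print(run_workflow(unpatched, flag_blank_pages=False), pages_with_text(unpatched))  # 5 some

patched = sample_document()
print(run_workflow(patched, flag_blank_pages=True), pages_with_text(patched))  # 1 all
```

In the first case the loop only stops because of the `max_runs` cap, which matches the "OCRing the same documents over and over" symptom; in the second the document drops out of the condition after one pass.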
