Question

How can I get information on failures from the "Schedule OCR" activity?

asked on November 2, 2022

We have a "Schedule OCR" activity that runs several times each night to generate text for scanned documents. On the server where we manage the DCC that runs the OCR activity, we can see that many of these documents fail, but Workflow happily reports each "Schedule OCR" activity that we run as successfully completed. I've tried both scheduling the OCR for whole batches of documents and scheduling the documents individually through a For Each loop.

 

We would really like to get details on entries that failed so that we can automate applying some tags and evaluating them.  I was very hopeful when I learned about the Callback Options in the advanced settings of the "Schedule OCR" activity, but I'm getting nowhere.  At first, I just tried using a failure workflow, but when I could get nothing back from that I also added a success workflow.  That should mean something happens, right?  No such luck.  Neither the success nor failure workflows ever trigger.

 

Is there a setting on the server somewhere that we need to adjust, or a firewall rule that needs to be added, or... something else?  Please comment if you know anything about this or have any ideas!  If the callback function doesn't work I'm open to other solutions to get failure data back to Workflow.

 

My installed Workflow Designer version is 11.0.2206.826, and the DCC installation on the server is version 11.0.

Replies

replied on November 3, 2022

The OCR scheduling activity most likely isn't failing because the scheduling itself isn't failing. If you see the jobs in your DCC, the scheduling was successful. If the DCC jobs themselves are failing, that's a completely separate process.

You might want to try monitoring the Laserfiche Web Admin Console event viewer or the Windows event viewer to see what is happening. I don't know of a built-in way to receive emails about errors/warnings in DCC jobs, but this is doable using a PowerShell script (or some other scripting language that has event viewer access) on the DCC server.
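As a rough illustration of the scripting idea (in Python rather than PowerShell), the sketch below shells out to the built-in Windows `wevtutil` tool to pull the most recent errors and warnings from one event log channel. The channel name is an assumption: I don't know the exact name DCC logs under, so you would need to look it up on the DCC server first (for example with `wevtutil el`) and pass it in.

```python
import subprocess

# Windows event schema levels: 2 = Error, 3 = Warning.
ERROR_AND_WARNING_QUERY = "*[System[(Level=2 or Level=3)]]"


def build_wevtutil_command(channel, max_events=50):
    """Build a wevtutil query for the newest errors and warnings in a channel.

    `channel` is whatever log name DCC writes to on your server; it is not
    hard-coded here because the exact name is an assumption to verify.
    """
    return [
        "wevtutil", "qe", channel,
        f"/q:{ERROR_AND_WARNING_QUERY}",  # XPath filter on event level
        f"/c:{max_events}",               # maximum number of events to read
        "/rd:true",                       # read newest events first
        "/f:text",                        # render events as plain text
    ]


def read_recent_problems(channel, max_events=50):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_wevtutil_command(channel, max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `read_recent_problems("<your DCC channel name>")` from a scheduled task on the DCC server would give you text you could then filter or email. What you do with the output is up to you.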

replied on November 3, 2022

Thanks for the response, Kevin!  I had assumed the same thing originally, but if you look at the documentation for the OCR scheduler under "Advanced: Callback Options" it states:

Callbacks allow the activity to start a workflow for either additional processing once the OCR has been completed on a file, or to handle information on files that were unable to be processed.

 

It goes on to explain more about the callbacks and certainly seems to indicate that the workflows are triggered based on the end result of the DCC job.  I would think that either the documentation is very misleading, something on our end is configured incorrectly, or the feature is just plain broken.

 

That being said, I'll talk to our Ops guys about checking the event viewer logs to see if they provide any information that we could use to automate failed entry identification.  I'm hoping that @████████ or someone will be able to get on and just say "Oh, you need to make sure this port is open in your firewall" or something along those lines, but scripting an export based on event viewer logs could be a backup plan.

replied on November 9, 2022

What Kevin said about scheduling OCR (and PDF page generation, for that matter). The activity does not wait for the results of the OCR process. It queues up the documents on DCC and then it's done.

The callback workflow is triggered at the end of each task in the OCR job. Jobs involving multiple documents are broken down into one task per document.

DCC's service user needs to have rights to connect to the WF server and start the workflow you want invoked.

There should be more information in the DCC event logs if it attempted to start a workflow and failed.

replied on November 10, 2022

Hello Miruna,

I am working with Sean on this.

They looked at the logs, and the DCC Admin logs show:

"Admin operation completed successfully." with a list of Entry IDs that were OCR'd successfully.

In the Operational Logs it shows "Job Completed", but there were the following warnings:

The job encountered the following non-terminating errors.

Task ID: 33270.33270.18, Task Type: OCR.OCR, Host Name: (Removed customer's Host Name)

Context Message: Entry locked. [9014]

Entry locked. [9014]

Laserfiche.RepositoryAccess.LockedObjectException

   at Laserfiche.RepositoryAccess.EntryLock.LockInternal(HttpUrl url, LockType type, LockExtent lockExtent, Dictionary`2 additionalHeaders, String etag)

   at Laserfiche.RepositoryAccess.EntryLock.LockWithCheck(LockType type, LockExtent extent, String etag)

   at Laserfiche.RepositoryAccess.EntryInfo.LockWithCheck(LockType type, Nullable`1 duration, String comment, LockExtent extent)

   at Laserfiche.DistributedComputingCluster.Modules.OcrLfEntryModule.OcrTask.DownloadDriver.DownloadDocImages(DocOcrWorkItem docOcrWorkItem)

   at Laserfiche.DistributedComputingCluster.Modules.OcrLfEntryModule.OcrTask.DownloadDriver.ProcessDocOcrWorkItems()

 

Task ID: 33270.33270.26, Task Type: OCR.OCR, Host Name: (Removed customer's Host Name)

Context Message: Entry locked. [9014]

Entry locked. [9014]

Laserfiche.RepositoryAccess.LockedObjectException

   at Laserfiche.RepositoryAccess.EntryLock.LockInternal(HttpUrl url, LockType type, LockExtent lockExtent, Dictionary`2 additionalHeaders, String etag)

   at Laserfiche.RepositoryAccess.EntryLock.LockWithCheck(LockType type, LockExtent extent, String etag)

   at Laserfiche.RepositoryAccess.EntryInfo.LockWithCheck(LockType type, Nullable`1 duration, String comment, LockExtent extent)

   at Laserfiche.DistributedComputingCluster.Modules.OcrLfEntryModule.OcrTask.DownloadDriver.DownloadDocImages(DocOcrWorkItem docOcrWorkItem)

   at Laserfiche.DistributedComputingCluster.Modules.OcrLfEntryModule.OcrTask.DownloadDriver.ProcessDocOcrWorkItems()

The warning does not indicate which Entry ID(s) are in the locked state.
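For anyone who wants to process warnings like the one above automatically, here is a minimal Python sketch that pulls the Task ID, task type, message, and error code out of the warning text. It assumes the text always follows the "Task ID: ..., Task Type: ..., Host Name: ..." and "Context Message: ... [code]" layout shown in this post. The function name and the dictionary keys are my own. It only recovers task IDs, since the warning itself contains no Entry IDs.

```python
import re

# Patterns based only on the warning text quoted in this post.
TASK_HEADER = re.compile(
    r"Task ID:\s*(?P<task_id>[\d.]+),\s*Task Type:\s*(?P<task_type>[^,]+),"
)
CONTEXT_LINE = re.compile(
    r"Context Message:\s*(?P<message>.*?)\s*\[(?P<code>\d+)\]"
)


def parse_task_errors(log_text):
    """Return one dict per 'Task ID' block found in the warning text."""
    errors = []
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        header = TASK_HEADER.match(line)
        if header:
            # Start a new record for each task header line.
            current = {
                "task_id": header["task_id"],
                "task_type": header["task_type"].strip(),
                "message": None,
                "code": None,
            }
            errors.append(current)
            continue
        context = CONTEXT_LINE.match(line)
        if context and current is not None:
            # Attach the context message and numeric code to the open record.
            current["message"] = context["message"]
            current["code"] = int(context["code"])
    return errors
```

Filtering the result for `code == 9014` would give the tasks that hit "Entry locked." Mapping those task IDs back to entries would still need another source of information.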

replied on November 11, 2022

@████████-Any ideas as to the issue based on the error message above?

Thanks,

Jeff Curtis
