Question

OcrEngine spawns multiple lfomniocr19.exe processes

SDK
asked on February 12, 2019

Each time I execute OcrEngine.Run on a document, a new lfomniocr19.exe process spawns that remains persistent for some time on the Laserfiche server.  I created a small function to test this out:

        public bool ocrDoc(int entryId)
        {
            DocumentInfo doc = Document.GetDocumentInfo(entryId, _lfSession);

            using (OcrEngine ocrEngine = OcrEngine.LoadEngine())
            {
                ocrEngine.Run(doc);
            }

            return true;
        }

Each time it runs, a little memory is chewed up dealing with the OCR.  But even when the function is done, the memory is still used and the process persists.  The app I'm having issues with runs multiple documents through OCR via a search result listing which keeps it from spawning thousands of processes.  However, the issue is still the same.  While the listing only generates one lfomniocr19.exe for thousands of documents, it uses far more memory and the memory remains stuck for hours or even days.  Once another round of OCR commences, that memory gets used up and eventually the system runs out of resources.

Is there a way to immediately get back that memory?  Am I missing a call that shuts down lfomniocr19.exe and returns all of its resources?


Replies

replied on February 14, 2019

One workaround is to use a helper class that manages a pool of available OcrEngine objects. This causes one or more instances of LFOmniOCR19.exe to be reused:

// Wrapper class that puts an OcrEngine back in the pool when it is no longer needed
class OcrEngineWrapper : IDisposable
{
    public OcrEngine OcrEngine { get; private set; }

    public OcrEngineWrapper(OcrEngine ocrEngine)
    {
        this.OcrEngine = ocrEngine;
    }

    public void Dispose()
    {
        if (this.OcrEngine != null)
            OcrEngineManager.ReleaseEngine(this.OcrEngine);
    }
}

// Manager class that keeps a pool of available OcrEngines
class OcrEngineManager
{
    // Shared pool; the queue object also serves as the lock for all access
    static readonly Queue<OcrEngine> _ocrEnginePool = new Queue<OcrEngine>();

    public static OcrEngineWrapper GetOcrEngine()
    {
        lock (_ocrEnginePool)
        {
            if (_ocrEnginePool.Count > 0)
                return new OcrEngineWrapper(_ocrEnginePool.Dequeue());
        }

        return new OcrEngineWrapper(OcrEngine.LoadEngine());
    }

    public static void ReleaseEngine(OcrEngine ocrEngine)
    {
        lock (_ocrEnginePool)
        {
            _ocrEnginePool.Enqueue(ocrEngine);
        }
    }
}

To OCR a document, do this:

using (OcrEngineWrapper ocrEngine = OcrEngineManager.GetOcrEngine())
{
    ocrEngine.OcrEngine.Run(newDoc);
}
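
The pool above never disposes the engines it holds, so they live until the host process exits. If it is useful to release them at the end of a batch, the same idea can be written as a small generic pool that disposes whatever is idle when the pool itself is disposed. This is only a sketch: `DisposablePool` and `Lease` are hypothetical names, not SDK types, and whether disposing an engine actually ends its helper process is an assumption this thread does not confirm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Laserfiche SDK): a pool that reuses
// idle instances and disposes them when the pool itself is disposed.
public sealed class DisposablePool<T> : IDisposable where T : class, IDisposable
{
    private readonly Stack<T> _idle = new Stack<T>();
    private readonly object _gate = new object();
    private readonly Func<T> _factory;
    private bool _disposed;

    public DisposablePool(Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _factory = factory;
    }

    // Borrow an instance; disposing the returned lease hands it back.
    public Lease Rent()
    {
        T item = null;
        lock (_gate)
        {
            if (_disposed) throw new ObjectDisposedException("DisposablePool");
            if (_idle.Count > 0) item = _idle.Pop();
        }
        // Create outside the lock so a slow factory does not block other threads.
        return new Lease(this, item ?? _factory());
    }

    private void Return(T item)
    {
        lock (_gate)
        {
            if (!_disposed) { _idle.Push(item); return; }
        }
        item.Dispose(); // the pool was shut down while this item was checked out
    }

    // Dispose every idle instance and refuse further rentals.
    public void Dispose()
    {
        T[] toDispose;
        lock (_gate)
        {
            _disposed = true;
            toDispose = _idle.ToArray();
            _idle.Clear();
        }
        foreach (T item in toDispose) item.Dispose();
    }

    public sealed class Lease : IDisposable
    {
        private readonly DisposablePool<T> _owner;
        private T _item;

        internal Lease(DisposablePool<T> owner, T item) { _owner = owner; _item = item; }

        // The borrowed instance; null once the lease has been disposed.
        public T Value { get { return _item; } }

        public void Dispose()
        {
            T item = _item;
            _item = null;
            if (item != null) _owner.Return(item);
        }
    }
}
```

With the SDK this would presumably be created as `new DisposablePool<OcrEngine>(() => OcrEngine.LoadEngine())`, used as `using (var lease = pool.Rent()) { lease.Value.Run(doc); }`, and disposed once all documents are done.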

 

replied on February 21, 2019

I still seem to end up with a bunch of lfomniocr19 processes. It's even more than one for each document on which I ran OCR.

For example, I ran a batch of 50 documents and ended up with 7 instances of OcrEngine in the pool, but 54 instances of lfomniocr19. I even tried calling OcrEngine.Close() on each engine instance at the end, but that didn't seem to have any effect.

replied on February 21, 2019

Are you launching a new process for each document being OCR'd? The sample code assumes all OCRing happens within the same process. When I run the following sample code on a batch of 100 documents, only one lfomniocr19.exe is spawned:

using (Session session = LogIn())
{
    string localFile = @"C:\Program Files (x86)\Laserfiche\Client\SAMPLE 8.tif";

    List<int> docIds = new List<int>();

    for (int i = 0; i < 100; i++)
    {
        Console.WriteLine("Creating document " + (i + 1));
        using (DocumentInfo newDoc = new DocumentInfo(session))
        {
            FolderInfo parentFolder = GetOrCreateFolder(session);
            newDoc.Create(parentFolder, "ImportPages", EntryNameOption.AutoRename);

            DocumentImporter docImporter = new DocumentImporter();

            docImporter.Document = newDoc;
            docImporter.ImportImages(localFile);
            docIds.Add(newDoc.Id);
        }
    }

    for (int i = 0; i < 100; i++)
    {
        Console.WriteLine("OCRing document " + (i + 1));
        DocumentInfo doc = Document.GetDocumentInfo(docIds[i], session);
        using (OcrEngineWrapper ocrEngine = OcrEngineManager.GetOcrEngine())
        {
            ocrEngine.OcrEngine.Run(doc);
        }
    }
}

 

replied on February 25, 2019

Ok. I figured it out. I'm running my import process using Parallel.ForEach. All of the processes that are being spun up and then destroyed seem to be causing the OcrEngine instances to lose track of the lfomniocr processes. I'll play around with it and see if I can come up with a happy medium.

Thanks!
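
For anyone else combining the pool with Parallel.ForEach, one possible middle ground is to cap the degree of parallelism. The pool only creates an engine when none is idle, so it cannot hold more engines than there are simultaneous workers. A rough sketch using the standard `ParallelOptions.MaxDegreeOfParallelism` setting follows; `BoundedBatch` is a made-up helper name, and the peak counter is only there to show that the cap holds.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: runs a batch with a fixed upper bound on parallelism.
public static class BoundedBatch
{
    // Calls `work` once per item, with at most `maxWorkers` calls in flight.
    // Returns the highest number of simultaneous calls that was observed.
    public static int Run<T>(IEnumerable<T> items, int maxWorkers, Action<T> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };
        object gate = new object();
        int active = 0;
        int peak = 0;

        Parallel.ForEach(items, options, item =>
        {
            lock (gate)
            {
                active++;
                if (active > peak) peak = active;
            }
            try
            {
                work(item);
            }
            finally
            {
                lock (gate) { active--; }
            }
        });

        return peak;
    }
}
```

The OCR call from the earlier sample would go inside `work`, for example `BoundedBatch.Run(docIds, 2, id => { ... })`. Whether a single Session can safely be shared across those worker threads is something I have not verified.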

replied on February 27, 2019

Robert,

While this seemed to work OK in my internal testing, it did not for the end user.  Not sure if it makes a difference, but they are using a 32-bit version of lfomniocr19.

Peter

replied on February 13, 2019

I also experience this. The only way that I'm dealing with it is to have my script run pskill after the OCR is complete.

replied on February 13, 2019

Thanks Devin.  It occurred to me to do something similar but it is possible for my app to be running simultaneous OCR jobs.  Killing all of them could end up shutting down something important.  If there was a way to get the process id when the job starts, that might solve the problem for now.

replied on February 13, 2019

If you start the process, you can get the PID, but I have never been able to find a way to figure out specifically which instances are started by the SDK. I learned of this when my batch machine started running funny, and there were thousands of instances running. They weren't consuming any CPU and only negligible memory, but eventually it added up and brought the system to its knees.
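
One rough approximation is to snapshot the PIDs with that process name before the OCR call and again afterwards, then treat only the difference as yours. The sketch below uses the standard System.Diagnostics.Process calls; `ProcessSnapshot` is a hypothetical helper name. It assumes nothing else on the machine starts a process with the same name between the two snapshots, so it cannot reliably tell simultaneous OCR jobs apart.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: tracks which processes with a given name appeared
// between two points in time, so that only those can be terminated.
public static class ProcessSnapshot
{
    // IDs of all running processes with this name (without the ".exe" suffix).
    public static HashSet<int> Capture(string processName)
    {
        var ids = new HashSet<int>();
        foreach (Process p in Process.GetProcessesByName(processName))
        {
            using (p) { ids.Add(p.Id); }
        }
        return ids;
    }

    // IDs that are in `after` but were not in `before`.
    public static HashSet<int> NewSince(HashSet<int> before, HashSet<int> after)
    {
        var added = new HashSet<int>(after);
        added.ExceptWith(before);
        return added;
    }

    // Kill the given processes, skipping any that have already exited.
    public static void Kill(IEnumerable<int> ids)
    {
        foreach (int id in ids)
        {
            try
            {
                using (Process p = Process.GetProcessById(id)) { p.Kill(); }
            }
            catch (ArgumentException) { }         // no process with this ID
            catch (InvalidOperationException) { } // exited in the meantime
        }
    }
}
```

Usage would be along the lines of `var before = ProcessSnapshot.Capture("lfomniocr19");`, then the OCR call, then `ProcessSnapshot.Kill(ProcessSnapshot.NewSince(before, ProcessSnapshot.Capture("lfomniocr19")));`.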

replied on February 15, 2019

Robert,

Thanks for the helper class.  I put it in my code and tests went well.  I'm hoping it doesn't end up trading lots of processes that each take up a small amount of memory for one process that takes up lots of memory.  The end user should be implementing the update over the weekend and will let me know how it goes.

Peter

replied on June 12, 2019

@Robert, what causes "lfomniocr19.exe" to stay in memory even after the workflow has finished?  We have some issues with it, and since scanning/OCR can happen at any time, the "batch" run doesn't apply. The wrapper class doesn't resolve the issue of getting it out of memory after OCR has apparently finished.

Recording the process ID and killing it programmatically isn't a neat solution at all, and it requires much more programming knowledge.

 

 

replied on June 12, 2019

The root cause should be fixed in 10.4.1 (bug# 155383). I believe installing the 10.4.1 Windows client should fix the problem.

replied on June 12, 2019

Luke, are you seeing more than 4 instances of LFOmniOCR19.exe piling up when your workflow runs? A few leftover processes are normal, but if you end up seeing more than 4 processes that stay around indefinitely, then you are seeing the bug I mentioned.

replied on June 12, 2019

@Robert, yes. I am testing with a very simple workflow that searches a folder and then, for each document, runs OCR with the following SDK script:

 

    // Wrapper class that puts an OcrEngine back in the pool when it is no longer needed
    class OcrEngineWrapper : IDisposable
    {
        public OcrEngine OcrEngine { get; private set; }

        public OcrEngineWrapper(OcrEngine ocrEngine)
        {
            this.OcrEngine = ocrEngine;
        }

        public void Dispose()
        {
            if (this.OcrEngine != null)
                OcrEngineManager.ReleaseEngine(this.OcrEngine);
        }
    }

    // Manager class that keeps a pool of available OcrEngines
    class OcrEngineManager
    {
        static Queue<OcrEngine> _ocrEnginePool = new Queue<OcrEngine>();
        public static OcrEngineWrapper GetOcrEngine()
        {
            lock (_ocrEnginePool)
            {
                if (_ocrEnginePool.Count > 0)
                    return new OcrEngineWrapper(_ocrEnginePool.Dequeue());
            }

            return new OcrEngineWrapper(OcrEngine.LoadEngine());
        }

        public static void ReleaseEngine(OcrEngine ocrEngine)
        {
            lock (_ocrEnginePool)
            {
                _ocrEnginePool.Enqueue(ocrEngine);
            }
        }
    }


    /// <summary>
    /// Provides one or more methods that can be run when the workflow scripting activity is performed.
    /// </summary>
    public class Script1 : RAScriptClass102
    {
        /// <summary>
        /// This method is run when the activity is performed.
        /// </summary>
        protected override void Execute()
        {
            SetActivityTokenValue("SDK_Warning", "");

            var docinfo = this.BoundEntryInfo as DocumentInfo;

            try
            {
                using (OcrEngineWrapper ocrEngine = OcrEngineManager.GetOcrEngine())
                {
                    ocrEngine.OcrEngine.Run(docinfo);
                }
            }
            catch (Exception ex)
            {
                // Report the failure to the workflow through the SDK_Warning token
                SetActivityTokenValue("SDK_Warning", ex.Message);
            }
        }
    }

After running that workflow again and again (two or three times), I see more than 4 instances staying in memory.

 

When you said it has been fixed in Client 10.4.1, has it also been fixed for SDK calls? Thanks.

 

 

replied on December 9, 2019

Hi,

 

Our customer sees a related problem: multiple NuanceLS.exe processes are spawned with OmniOCR18.5, also when using the SDK to store documents in Laserfiche.

 

Is there a fix for version 10.3.1?

Many thanks.

replied on December 15, 2019

Hi team,

Would appreciate a reply to my post.

Thank you
