
Question

Quick Fields Parallel Processing Sessions

asked on July 16, 2024

Hi everyone,

Just wanted to ask this question.  I've got a large job (35,000 records, 175,000-ish pages) that is currently running through Quick Fields, and I've got a server with lots of horsepower.

Today I bumped up the number of sessions in Parallel processing from 1 to 2.  But that is all the system allows me to add.  My server has 6 CPUs, and from the documentation I understand I should be able to have up to 12 sessions.  I'd be happy with 4 sessions.

The Quick Fields server is a Windows 2019 VMware VM with 6 CPUs available to it and 16GB of RAM. It is running at 36% CPU with the two sessions.

What do I need to do to enable more sessions?


Replies

replied on July 16, 2024

Parallel processing refers to running multiple different Quick Fields sessions concurrently if their schedules overlap in Quick Fields Agent.

From your description, it sounds like all these documents are processed by a single Quick Fields session. Is that correct?

replied on July 16, 2024

The documents are split up into about 80 different files that are then processed separately. I have moved them to different folders and given each folder its own session.

I am currently running 2 Quick Fields sessions concurrently, and it is working well. My question is specifically about why I can't increase past 2 sessions, despite having 6 CPUs available to the server. From the documentation, I should be able to have 2 Quick Fields sessions per CPU, and this server has 6 CPUs (2 cores per socket, 3 sockets, if that helps).

I was using a single Quick Fields session before, but I'm trying to reduce the time it takes to complete the process, so 2 sessions should reduce the overall time by about half.  If I could get it down to 1/4, I'd be a happy camper.

replied on July 16, 2024

Like I said before, you'd have to have them scheduled at roughly the same time to start in parallel.

Quick Fields is likely going to be I/O- and network-bound while processing, so you might not see linear scaling with the number of sessions.

replied on July 16, 2024

Fair point, but right now with 2 sessions I'm not seeing a reduction in throughput. This is a VM on a very robust ESXi host with a direct connection to high-speed storage, so I'm sure at some point I could overload it, but it won't be today.

You seem to be focusing on whether I should do this or not, but my question is more about why I can't change the setting at all. According to the documentation, I should be able to have up to 12 sessions because the VM has been allocated 6 CPUs. I don't want 12 sessions; 4 seems like the sweet spot. But I would like to know why the option isn't available.

replied on July 17, 2024

Oh, I see what you mean. I was reading your post as saying you already bumped it up to 12 but you only ever see 2 running at the same time. But you're saying you can't change the number to anything higher than 2.

If you go to C:\ProgramData\Laserfiche\Quick Fields Agent and open Settings.xml in a text editor, what does it say for "MaxAgents"? (You may have to turn on showing hidden files in File Explorer to see ProgramData).
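For reference, the relevant entry might look something like the fragment below. This is an illustrative sketch only: the `MaxAgents` element name and file location come from this thread, but the surrounding structure of Settings.xml is an assumption.

```xml
<!-- Hypothetical sketch of C:\ProgramData\Laserfiche\Quick Fields Agent\Settings.xml.
     Only the MaxAgents element is confirmed here; the rest of the layout may differ. -->
<Settings>
  <MaxAgents>2</MaxAgents>
</Settings>
```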

replied on July 17, 2024

Thanks for that - MaxAgents is set to 2.  Can I increase that in the file?

replied on July 17, 2024

I tried increasing that setting to 4, and now I can increase the sessions to 4 as well.
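In case anyone wants to script this change, here is a minimal sketch that edits the value with Python's standard-library XML module. The file path and the assumption that `MaxAgents` is a plain XML element are based only on this thread; back up the file (and stop the Quick Fields Agent service) before editing it for real.

```python
# Hypothetical helper: bump the MaxAgents value in Quick Fields Agent's
# Settings.xml. Path and XML layout are assumptions from this thread.
import xml.etree.ElementTree as ET

def set_max_agents(settings_path, new_max):
    tree = ET.parse(settings_path)
    # Find the MaxAgents element anywhere under the document root.
    node = tree.getroot().find(".//MaxAgents")
    if node is None:
        raise ValueError("No MaxAgents element found in " + settings_path)
    node.text = str(new_max)
    tree.write(settings_path, encoding="utf-8", xml_declaration=True)

# Example (real path would be
# C:\ProgramData\Laserfiche\Quick Fields Agent\Settings.xml):
# set_max_agents(r"C:\ProgramData\Laserfiche\Quick Fields Agent\Settings.xml", 4)
```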

Thank you!

replied on July 17, 2024

No problem. Any chance this is a VM that started out with 2 CPUs and then had more added after Quick Fields was installed?

The max value is supposed to default to the number of CPUs on the machine. We'll take a look at the code to see why it might not do that.

replied on July 17, 2024

That is possible, but it would have had at least 2 CPUs.  I think I increased the number of CPUs last year when I was looking at this last time.

replied on July 18, 2024

Ok, so I tried running with 4 sessions overnight, and the results far exceeded my expectations. I had understood that each session would be a unique instance: each session would get its own files, OCR them, and process them.

But I had files that were ingested faster than I've ever seen before, and the only way I can explain it is that all 4 sessions were working on processing the same input file. Does that make any sense?

replied on July 18, 2024

That would depend on your settings. If you are using Laserfiche Capture Engine to pull documents from the repository and your settings match across 2 or more sessions, then they would be pulling the same documents if they run concurrently.

What are you actually seeing?

replied on July 18, 2024

I'm seeing jobs that would normally take hours being done in minutes.  However, it isn't consistent, so I'm wondering if the documents were processed and then simply written to the repository as a batch.  I'm not sure why they weren't written one at a time like they normally would have been.

I am using Quick Fields Agent to monitor a set of folders on a server; it pulls the files in, OCRs them, and then processes them. The files tend to be multiple records together, and they get divided into unique records. I'm not sure if that is the Laserfiche Capture Engine, as that is terminology I don't see anywhere. I am not bringing the files in from the repository and re-processing them.

replied on July 19, 2024

By default, Quick Fields Agent sends documents back when the session is done processing them, so they should show modified dates within seconds of each other. With a batch containing multiple records, Quick Fields slices and dices it into documents, then sends those documents back only after it processes the last page of the incoming batch.

You can set the session to send immediately and then they'll come in as they're finished instead of waiting for a full batch to be done.

Capture Engine is for pulling docs from the repository, Universal Capture is what you'd be using if you're pulling files from a folder on disk in Windows. But the same processing rules apply once images are in Quick Fields.

replied on July 19, 2024

Yes, I had to divide my files into separate folders, and have each agent only monitor one folder.  

This worked well for two or four agents and smaller files. When I got into some of the large files that I'm dealing with now, the system choked on the file inputs and kicked the whole files out as "Unidentified". It seems to be fine now with 2 agents and 2 files at a time. For the first time, I saw this server maxed out on CPU.

I did try to stagger the agents' start times by a few minutes so they wouldn't both try to grab all the Universal Capture resources at once, but that clearly wasn't enough. Is there a way to force an agent to queue until Universal Capture is free? The "Batch Processor Container" and the "LFOmniOCR Engine" seem to play nicely together.

replied on July 19, 2024

Ok, I just found something interesting.  Quick Fields Agents are running as expected, and I opened up Quick Fields to look at some settings.  I know any changes I make (particularly if I don't save them) would not be applied to the jobs the agents are working on.

I'm getting an error that "Sign In failed because the number of sessions has reached the licensed limit, or the user account has reached its session limit, or no named user license has been allocated to the user". Could there be a limit on the user that is stopping the multiple agents from succeeding? There are minimal sessions running on the server, and this user accounts for only 2 of them.

replied on July 19, 2024

If you are opening the session from Quick Fields Server, then any changes you make will be saved to the server, but you'll have to manually download and update the session file on the Quick Fields Agent machine in the location specified in the schedule's setup.

If you are opening the session directly from disk on the Quick Fields Agent machine, your changes should apply on the next run. The agent won't run the session while you have it open.

As for the error: "number of sessions has reached the licensed limit" applies to the older licensing scheme with concurrent connections, so that's not your issue.

Take a look at the list of sessions in the Laserfiche Administration Console. There is a limit of 8 active connections per user, so if you see 8 already connected for the user, then "the user account has reached its session limit". This is most likely the case, since I'm guessing other sessions are running at the same time. You should be able to see which machines and applications the logins are coming from. The user in question would be the one specified in the document class properties.

If that's not it, then the last one, "no named user license has been allocated to the user", likely means the user specified in the document class does not have a license but has the Manage Trustees privilege and was getting in through the repository's administrative connection. There is a limit of 2 administrative connections per repository, regardless of the user, so if your login attempt is the 3rd one, you'd get that error.

replied on July 22, 2024

Thank you - that makes sense.

Back to my other question, then: is there any way to get Universal Capture to play nicely? The jobs that had issues were the large files with thousands of pages that took a long time for Universal Capture to process. These jobs repeatedly kicked out the entire file as "Unidentified" without processing it. If I let them run by themselves, the files were fine and processed correctly.
