Question

Import Speed for Large Quantity

asked on April 22

Hello! I am wondering if there is a way to increase the speed at which Import Agent imports files into LF.

We have a batch job that imports roughly 350k files (about 26 GB in total) and takes around 24 hours to complete. Other systems I have used in the past could handle this workload in 2-3 hours. The server hosting Import Agent does not go above 80% RAM utilization at any point during this process either.

I know there are a few threads on this, but most that would fit this are from ~10 years ago. Thank you in advance for any insight you may have!

Replies

replied on April 22

Whenever I have bulk imports, I split them up between multiple folders/profiles.

Putting everything in one folder effectively bottlenecks the process because you'll have one thread importing the files sequentially. Basically, everything in the folder is waiting in one long line.

If you split them up, and configure Import Agent to use multiple threads (Import Thread Count under Profile > Options > Advanced), it should be able to run more parallel imports.
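If it helps, here is a rough Python sketch of the "split them up" step. It is not anything built into Import Agent, just a standalone script that moves files from one folder into several batch folders in round-robin order, so you could point a separate profile at each one. The function name and the `batch_01`, `batch_02`, ... folder names are made up for illustration, and it assumes the files all sit directly in one source folder.

```python
import shutil
from pathlib import Path


def split_into_batches(source: Path, dest_root: Path, batch_count: int) -> list[int]:
    """Move the files in `source` into `batch_count` subfolders of `dest_root`.

    Files are dealt out round-robin so each batch folder ends up with a
    similar number of files. Returns the file count per batch folder.
    """
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")

    # Hypothetical folder naming: batch_01, batch_02, ...
    batch_dirs = [dest_root / f"batch_{i + 1:02d}" for i in range(batch_count)]
    for folder in batch_dirs:
        folder.mkdir(parents=True, exist_ok=True)

    counts = [0] * batch_count
    # Only top-level files are moved; subfolders are left alone.
    files = sorted(p for p in source.iterdir() if p.is_file())
    for index, path in enumerate(files):
        slot = index % batch_count
        shutil.move(str(path), str(batch_dirs[slot] / path.name))
        counts[slot] += 1
    return counts


# Example call (placeholder paths, adjust to your environment):
# split_into_batches(Path(r"D:\Import\Incoming"), Path(r"D:\Import\Batches"), 6)
```

Round-robin keeps the folders roughly even by file count, not by size, so if your files vary a lot in size you may want to balance on bytes instead.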

Your import settings will also affect the processing times. For example, if you're generating pages on import, that is going to take substantially longer than just importing a file as-is.

There will be other factors as well, like drive speeds, network bandwidth, etc.
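To put rough numbers on it, here is a quick back-of-the-envelope calculation in Python using the figures from the question (350k files, about 26 GB, 24 hours). Treating 1 GB as 1024 MB and using 2.5 hours as the midpoint of the "2-3 hours" target are my own assumptions.

```python
def throughput(file_count: int, total_gb: float, hours: float) -> tuple[float, float]:
    """Return (files per second, MB per second) for a job of the given size."""
    seconds = hours * 3600
    files_per_second = file_count / seconds
    mb_per_second = total_gb * 1024 / seconds  # assumes 1 GB = 1024 MB
    return files_per_second, mb_per_second


FILE_COUNT = 350_000   # from the question
TOTAL_GB = 26          # from the question
CURRENT_HOURS = 24     # from the question
TARGET_HOURS = 2.5     # assumed midpoint of the 2-3 hour target

current = throughput(FILE_COUNT, TOTAL_GB, CURRENT_HOURS)
target = throughput(FILE_COUNT, TOTAL_GB, TARGET_HOURS)

print(f"current: {current[0]:.1f} files/s, {current[1]:.2f} MB/s")
print(f"target:  {target[0]:.1f} files/s, {target[1]:.2f} MB/s")
```

That works out to roughly 4 files per second and about 0.3 MB/s for the current job. Those are low data rates for most disks and networks, so my guess is that per-file overhead matters more here than raw bandwidth, but you would need to measure on your own setup to confirm.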

replied on April 22

Hey Seth!

Do you know the specs of the machine you're running ImportAgent off of? Depending on how powerful it is, you could look at:

  • as Jason said, increasing the threads allocated for the task (which allows the machine to perform more tasks simultaneously, but can bog down performance if you don't find the right balance)
  • doing your OCR scheduling after import, since OCR also tends to be the most load-heavy part of an import job in my experience
  • checking whether you have a Distributed Computing Cluster (DCC) set up, which can help distribute the processing load and speed things up if you direct ImportAgent to it.

 

To increase the number of threads, go to Profile > Options in ImportAgent and, under the "Advanced" tab, increase the thread count. I usually run our ImportAgent batches with 6 threads, and it cuts the processing time dramatically.

To remove OCR from the ImportAgent process, go into the specific ImportAgent profile you're using, and under "Processing", uncheck "Use OCR if no text is available". You can run OCR afterwards using QuickFields or a workflow.

 

 
