Question

LFFTS 10.3 Parallel IO mode

asked on January 24, 2018

I recently upgraded a customer from 10.2.x to 10.3.0.  They reported back to me that ever since, the search has been painfully slow.  I looked at the indexing status in the LF Admin console and found that it had a huge backlog of queued entries to index and the queue was growing still larger.

I then went to the LFFTS server (a separate box from LFS) and looked at the resources; it was not overly taxed.  I then looked at the registry and found that the "ScalableCatalog" key was gone.  I deleted the search catalog, stopped the LFFTS service, added the key back in, and restarted the LFFTS service.  Upon restarting LFFTS, the "ScalableCatalog" key was removed again.

Have the registry keys changed for the LFFTS Parallel IO mode?  What can we do to speed up the indexing of the search catalog?

Replies

replied on January 25, 2018

Any info from LF on this?

Bump!!!!!!!!!!!

replied on January 26, 2018

Can no one from Laserfiche shed light on this?  I have a customer whose LFFTS indexing as well as user searching has become very slow.

Bump

replied on January 30, 2018

Hi,

In 10.3, the setting is "DataNodeNum" under HKLM\SOFTWARE\LASERFICHE\LFFTS\DataBase\[CatalogUUID].
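
For anyone who wants to check the value without opening regedit, here is a minimal Python sketch using the standard `winreg` module. It assumes the script runs on the LFFTS machine with enough rights to read HKLM, and that the subkey name is exactly the catalog UUID you see in the registry. The function names are made up for illustration; only the key path and the "DataNodeNum" value name come from the reply above.

```python
import sys

# Key path quoted in the reply above; the catalog UUID is supplied by the caller.
LFFTS_DATABASE_KEY = r"SOFTWARE\LASERFICHE\LFFTS\DataBase"


def catalog_key_path(catalog_uuid: str) -> str:
    """Build the per-catalog subkey path (relative to HKLM)."""
    return LFFTS_DATABASE_KEY + "\\" + catalog_uuid


def read_data_node_num(catalog_uuid: str):
    """Return the DataNodeNum value for a catalog, or None if it is missing.

    Windows only. winreg is imported lazily so the module loads elsewhere too.
    """
    if sys.platform != "win32":
        raise OSError("The Windows registry is only available on Windows")
    import winreg

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE, catalog_key_path(catalog_uuid)
        ) as key:
            # QueryValueEx returns (value, registry_type); only the value is needed.
            value, _registry_type = winreg.QueryValueEx(key, "DataNodeNum")
            return value
    except FileNotFoundError:
        # Raised when either the subkey or the value does not exist.
        return None
```

Calling `read_data_node_num("<your catalog UUID>")` should return 1 for a standalone catalog, going by the replies in this thread. The sketch only reads the value and never writes to the registry.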

In our test, it takes about 1 hour to index a 20 GB repository in a normal catalog. Our machine has 8 vCPUs, 8 GB RAM, and an SSD. What are your index/search speed, repository size, LFFTS machine specifications (CPU, memory, disk, IOPS), and load (CPU usage, memory usage, disk usage, disk read bytes/sec, disk write bytes/sec)?

We need more information for troubleshooting. My guesses are:

1. There may be many partial optimizations. You can check this in the LFFTS event log messages (the LFFTS event log is under Application and Services Logs/Laserfiche/SearchEngine/Service/Admin in Event Viewer). The message looks like "Optimization ID: .... Initiating index file partial optimization since dump has reached the optimization threshold. Search catalog: ...". If you see these, please try disabling the optimizations by setting HKLM\SOFTWARE\LASERFICHE\LFFTS\Config\OptionalOptimizationType to 0. Restarting LFFTS is required.
2. TextProvider is extracting text from electronic documents, which may be slow for some extensions or some documents.
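
To check the first guess, one option is to export the LFFTS event log messages to text and count how many contain the partial-optimization phrase quoted above. This is a minimal Python sketch. The function name is hypothetical, and how you export the messages (for example, saving the log from Event Viewer as text) is left to you.

```python
# Phrase taken from the sample message quoted in the reply above.
PARTIAL_OPTIMIZATION_MARKER = "Initiating index file partial optimization"


def count_partial_optimizations(messages):
    """Count event log messages that report a partial optimization.

    `messages` is any iterable of message strings, e.g. lines of an exported log.
    """
    return sum(1 for message in messages if PARTIAL_OPTIMIZATION_MARKER in message)
```

A high count relative to the time span of the log would point toward the first guess. A count of zero suggests looking elsewhere, such as the TextProvider.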

replied on January 30, 2018

So if "DataNodeNum" = 1, then we need to increase it?  4 was the max with the old "ScalableCatalog"; what is the max with the new "DataNodeNum" setting? And if we increase it from 1, do we need to re-index the catalog like before?  Are there any papers/instructions for the new options in 10.3?

replied on January 31, 2018

You may refer to the "Parallel IO Mode LFFTS" section at https://support.laserfiche.com/devnotes/viewer?name=LFFTS9.2%2FListOfChanges921. The recommended hardware requirements section there is still valid. Just replace ScalableCatalog with the data node number. In Laserfiche 10.0 and higher, when you create a new search catalog in the Admin Console, you can specify a DataNodeNum value between 1 and 4. When it is 1, it's a standalone catalog.

Parallel IO (PIO) mode helps only customers whose repository is large (> 50 GB) and whose LFFTS machine is powerful (CPU > 16). For other customers, PIO may not help much. I suspect a PIO catalog would not help in your case. Could you please provide the information requested in the previous reply so we can make suggestions on performance?
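
The rule of thumb above can be written down as a quick check. This is only a sketch of the two thresholds stated in this reply. It assumes "CPU > 16" means more than 16 CPU cores (or vCPUs) on the LFFTS machine and that repository size is measured in GB. The function name is made up.

```python
def pio_likely_to_help(repository_gb: float, cpu_count: int) -> bool:
    """Apply the rule of thumb from this reply: large repository AND many CPUs."""
    return repository_gb > 50 and cpu_count > 16


# Hypothetical examples: a large repository alone does not meet the rule.
print(pio_likely_to_help(repository_gb=200, cpu_count=8))   # False
print(pio_likely_to_help(repository_gb=200, cpu_count=24))  # True
```

Both conditions have to hold. Treat the result as a starting point for discussion, not a measurement.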

replied on January 31, 2018

In LF 10.2, the LFFTS was running in parallel mode (ScalableCatalog = 4).  Since the switch to LF 10.3, the users have all been complaining that it is very slow to return search results.  Now that I know what/where the new registry setting is, I can see that the upgrade did not carry the ScalableCatalog value over into the new DataNodeNum setting when it removed the ScalableCatalog entry.  So the search engine defaulted back to serial mode (DataNodeNum = 1), because I failed to notice any option on the catalog creation dialog to set the IO mode.

We will have to change this back to Parallel IO mode.  When switching, do we still need to delete the catalog and recreate it setting the DataNodeNum value in the creation dialog, or can we just change the 1 to a 4 in the registry and restart the service?

The LFFTS is installed on its own VM with a total of 6 virtual processors and 64 GB RAM.  The search was working very well before the upgrade, so I am fairly certain that once we get it put back into Parallel IO, it will be back to the performance that they are used to and expect.

replied on January 31, 2018

As for answers to your questions above, I did not find errors pertaining to "Optimization ID", but the log is full of errors that start with:

"The TextProvider process was terminated due to an unhandled exception when extracting text from"

The error is showing up for Office as well as PDF files.  This is odd since we had the iFilters in place and working with LF 10.2.  Not sure why LF 10.3 would not like the iFilters.

replied on February 1, 2018

According to the info you provided, PIO is probably not helpful. To create a PIO catalog, you need to delete the catalog and recreate it setting the DataNodeNum value in the creation dialog, and re-index the whole repository. 

If the concern is search performance, we recommend starting the troubleshooting with Laserfiche Server / database profiling. We also recommend filing a support ticket.

Each "The TextProvider process was terminated ..." message contains the document entry ID. Could you please find some of those documents (by entry ID) and tell us the size of each electronic document?

replied on February 1, 2018

I have not yet provided any information about the repository for you to be able to make an informed decision about whether PIO is needed.  In this case, it is a very large organization with a very large repository.  They have over 120 Full Named users actively in the system each day, and over 70 of those users spend a good amount of their day in Laserfiche. The repository volume data is over 1 TB of storage, and the repository DB is over 65 GB.  And they have not brought all departments on board yet, so yes, the PIO is needed (in my customer's opinion as well as mine).  It was configured in LF 10.2, and there was a noticeable performance hit to searching when we upgraded to 10.3 and the PIO was inadvertently switched off.

We will have to look further into the TextProvider failing.

replied on February 2, 2018

TextProvider (TP) errors don't affect search performance. If search performance is the concern, as mentioned in the previous reply, we recommend doing database profiling first, and it is better to create a support case to handle this.

If index performance is the concern, then please provide the sizes of the problematic electronic document files. We just need the sizes of around five electronic documents. Also, please take a screenshot of the index files of the catalog, with the file size column displayed. If it is a PIO catalog, take a screenshot of the index files under each data node folder ("datanode0", "datanode1", "datanode2", "datanode3") under the search folder.
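
If a text listing is easier to share than screenshots, a short Python sketch like the one below can collect the same information. It assumes a PIO catalog keeps its index files directly inside the "datanode*" folders named above, and that a standalone catalog keeps them directly in the search folder. The function name and the path you pass in are hypothetical.

```python
from pathlib import Path


def index_file_sizes(search_folder):
    """Map each index file (path relative to search_folder) to its size in bytes.

    If datanode* subfolders exist (PIO catalog), list the files inside them;
    otherwise list the files directly in search_folder (assumed standalone layout).
    """
    root = Path(search_folder)
    node_dirs = sorted(p for p in root.glob("datanode*") if p.is_dir())
    sizes = {}
    for folder in node_dirs or [root]:
        for path in sorted(folder.iterdir()):
            if path.is_file():
                sizes[path.relative_to(root).as_posix()] = path.stat().st_size
    return sizes
```

Printing the returned dictionary, one entry per line, gives roughly what the file size column in Explorer would show, grouped by data node folder.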

Considering your hardware specification and repository size, most likely, PIO won't affect the index and search performance.

replied on January 31, 2018

Bump!
