Question

Information on how to cluster/improve efficiency of LFFTS

asked on June 29, 2023

A few of our customers have been having issues reindexing their repositories in a timely fashion. LFFTS is already on its own high-spec server with an SSD storage backend, but a full re-index is taking 24+ hours, sometimes days.

Is there a way to cluster LFFTS to improve reindexing speeds? We are not interested in "failover" clustering.


Replies

replied on July 3, 2023

Hi Jesse,

Can we start with why the customer is regularly fully re-indexing their repository?

In general, that's not supposed to be a routine operation.

In any event, are you doing the re-indexing through the Administration Console or the Laserfiche QuickReindex utility? If you're not using the Quick Reindex utility, I'd give that a try. Make sure to read its documentation page thoroughly.

The Laserfiche QuickReindex Utility is a command-line utility installed with the Laserfiche Full-Text Indexing and Search Service that performs a reindex of the entire repository. The QuickReindex Utility is significantly faster than reindexing in the Laserfiche Web Administration Console. It can be run while the Search service is running.

Note: The Laserfiche Quick Reindex Utility must be installed on the same computer as the Laserfiche Server.

[...]

The Quick Reindex utility bypasses the Laserfiche Server and directly accesses the DBMS as well as the local hard disk. The main benefit to using this utility is the elimination of the resource overhead from needing to retrieve document information through the Laserfiche Server. Performance for the Quick Reindex utility can be significantly faster than choosing to reindex from the Laserfiche Administration Console.
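
For reference, an invocation looks roughly like the sketch below. The catalog path, database, and repository names are placeholders, and the full flag list is on the utility's documentation page (the actual invocation from this case appears later in the thread):

  # Hypothetical example; run on the Laserfiche Server machine where qrcmd.exe is installed.
  # The first argument is the search catalog directory; -s names the DBMS database,
  # -r names the repository, and -a N limits the run to documents already marked as indexed.
  qrcmd.exe D:\SEARCH -s RepositoryDB -r Repository1 -a N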

replied on July 3, 2023

Hi Sam,

See LF Case #232428

They are concerned that searching is not 100% reliable and the only way to *ensure* that the catalog is complete is by reindexing regularly. The index would say it is Ready and spit out no errors, but searches would be incomplete.

We tried reindexing via the LF Admin Console to no meaningful effect and using the QRcmd tool also gave us very slow speeds, especially given their environment (SSD backend, dedicated LFFTS server with 12 vCPUs, 32GB RAM, Gigabit+ network speeds, etc.).

We understand from the LF tech's response that the recommended storage space requirement for LFFTS is 4 * (size of text files + electronic file size); as you can see in the case notes, the final size of the SEARCH folder is nowhere near what was suggested.

The customer is aware that constant reindexing is not normal or needed, but their concern is the following:

the challenge will be knowing when the corruption occurs, what copy to restore, and then how to fix going forward which seems to be running a reminder to reindex

replied on July 3, 2023

Quick Answers note: it's helpful if you reply to the reply rather than to the top-level post so the conversation gets threaded correctly. Otherwise you lose the reply order and the discussion becomes difficult to follow.

------

They are concerned that searching is not 100% reliable and the only way to *ensure* that the catalog is complete is by reindexing regularly.

I would not personally jump to that conclusion. Let's change a few words:

They are concerned that their car is not 100% reliable and the only way to *ensure* that the car doesn't have hidden issues is by rebuilding the engine regularly.

No mechanic would tell you that's a reasonable way to address the concern.

I bring this up to highlight the huge jump from their legitimate concern to the questionable proposed remedy, not to diminish the concern.

Looking at that case, your last update is "the index is complete (though it took 4.5 days to finish), and searches are back to normal".
Here's my understanding of the chain of events, with commentary:

  1. There was a single identified point in time after which indexed searches stopped returning all the results they should have.
    1. To be clear, I'm not downplaying the business impact of this, just stating frequency.
  2. Suspecting corruption (or other issues) in the LFFTS search catalogs, the customer triggered a reindex of their repository in hopes that rebuilding the catalogs would resolve the issue.
  3. The reindexing job was going "slow in general" at "around 2500 pages/min". At this point you opened a support case.
  4. You noted: "When we reindexed Wed. @ 3 PM and checked the status the next morning at around 7 AM, the index queue seemed to have gone through ~4-4.5 million entries, but it's been at or around 7-7.5 million for over a day. Initial queue started at around 11.x million."
    1. I have a recollection that indexing is automatically (and temporarily) paused when LFFTS receives and processes an indexed search request. I can't find a source for that at the moment, though. It could help explain why the job got through millions of entries overnight and "stalled" during the day.
  5. Customer noted: "Of the 11+ million docs, only 118k have actual text pages."
    1. A document does not need to have "actual text pages" to be indexed - the criterion is having a file extension on the TextProviders list (Admin Console -> Indexing -> Properties -> Electronic Text Extraction).
    2. The observed queue size of "11.x million" means nearly every document in the repository was sent for re-indexing. That means millions of PDFs and such going through text extraction again.
  6. The re-indexing job through the Admin Console going at 2,500 pages per minute with that many edocs in the mix seems like a decent rate to me.
    1. Why LFFTS wasn't necessarily using more/all of the hardware resources available to it is a fair and open question, but I'm not sure this is objectively "slow".
    2. With Admin Console re-indexing, LFS has to send every page and edoc to LFFTS for extraction (QuickReindex bypasses this). That's 4 TB worth of small files. 
      1. I've done my fair share of Laserfiche server migrations where robocopying 4 TB worth of volume files over a gigabit network with the /MT (multi-threading) switch set to 8 or 16 threads still took a few days (see the robocopy sketch after this list).
  7. You then installed and ran QuickReindex with params:
    qrcmd.exe \\server\D$\SEARCH -s $Database -r $Repository -a N
    (-a N = only index documents that were previously marked as indexed).
  8. QuickReindex started generating PDF iFilter warnings, so you installed what seemed like the right iFilters ("MS Office Filter Pack and the Adobe PDF Filter 11"; it's not explicitly stated whether on the LFS or LFFTS instance, and it needed to be the former) and restarted the QuickReindex utility. A few hours later you provided an update: "The QRcmd indexing wasn't going anywhere fast and was still spitting out PDF iFilter warnings. [...] They have stopped the QRcmd tool and reattached the "old" incomplete search catalog, but searches are still obviously incomplete and indexing is still incredibly slow."
    1. The case notes don't specify if you stopped/paused the original Admin Console re-indexing job before kicking off the QuickReindex one. If it wasn't stopped/paused and the LFS instance was potentially already slammed with small file I/O, that could have contributed to slower than expected QuickReindex performance.
    2. QuickReindex also likely wasn't going anywhere fast because it kept running into iFilter errors. I couldn't speak to why that was happening without first seeing the full event logs, which I understand you're in the process of obtaining.
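
Regarding point 6.2.1 above, here's roughly what those migration copies looked like - a sketch with placeholder server and path names (robocopy's switches are real; everything else is illustrative). For context, 4 TB at a gigabit line rate (~125 MB/s) is about nine hours in theory; per-file overhead on millions of small files is what stretches that into days.

  # Multi-threaded volume copy sketch; placeholder source and destination paths.
  # /MIR mirrors the source tree, /MT:16 uses 16 copy threads,
  # /R:2 /W:5 caps retries so problem files don't stall the whole job.
  robocopy '\\OLDLFS\D$\Volumes' 'D:\Volumes' /MIR /MT:16 /R:2 /W:5 /LOG:C:\Temp\volume-copy.log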

 

We understand from the LF tech's response that the recommended storage space requirement for LFFTS is 4 * (size of text files + electronic file size); as you can see in the case notes, the final size of the SEARCH folder is nowhere near what was suggested.

That recommendation comes from comments by the LFFTS dev team in the old Answers thread How much space needed for indexing?, which was linked in the support case. They did update the recommendation in 2017 for LFFTS 10.2 to be:

3 (instead of 4) * size of text documents + size of electronic documents whose extensions are supported in LFFTS (See Admin Console => Index Properties => 'Electronic Text Extraction' tab)

However, given that the overwhelming majority of this customer's repository size comes from the 4.1 TB of edocs rather than the 118k text pages, the result of the calculation is about the same.
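
To make that concrete: assuming a page of extracted text runs a few KB (my assumption), 118k text pages is well under 1 GB, so the text term is a rounding error either way:

  3 × (text size, <1 GB) + 4.1 TB of supported edocs ≈ 4.1+ TB recommended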

Further down in the thread, they note that for a ~1.8 TB repo:

The index files will not grow larger and larger and become 1.8TB. Most likely it will be 100~500GB (depends on the documents in your repository) after several hours or days (depends on the hardware specification of the machine). During the index, the index files will be optimized several times. *Extra disk space is required for optimization. It may take 100GB to 1TB.* It is possible that optimization only takes several minutes to complete.

Emphasis mine. The extra disk space beyond the index file size is used for shorter temporary operations (that can take much longer if space constrained). This is why you won't necessarily see them show up on spot-checked storage utilization metrics. They also say:

After re-indexing the repository / volume, when the documents are modified, there will be index requests, and index files may still [temporarily] take a lot of space (3 * index file size). So it is recommended to keep the large drive you have after re-indexing, and not to move the index files to a smaller drive.

Personal Speculation:

It's possible that the LFFTS re-indexing storage calculation is specific to Admin Console-initiated re-indexing operations, where LFS has to send a copy of every in-scope page/edoc to LFFTS, and LFFTS theoretically needs to be able to hold all those copies while it works through the indexing queue. QuickReindex works by accessing the original volume files directly and doesn't need to store copies, so you might not need nearly as much space when reindexing with that method.

It's also entirely possible the re-indexing storage formula starts returning numbers that are increasingly overkill as repositories get larger into the multi-TB range (I agree that 4 TB for a 4 TB repo seems ridiculous). But I'm not aware of any updated official or unofficial guidance from Dev explicitly saying that and offering an alternative calculation, ceiling, etc. 

Given the age of that Answers post, I would invite you to specifically ask Support to ask the LFFTS Dev team for updated recommendations for reindexing with LFFTS 11, both normally through the Admin Console and with QuickReindex. I'd also ask for both "Ideal" and "Feasible" storage amounts for your specific scenario.

For example, it may be that LFFTS really could use all 4 TB of space during active index optimization operations, but that the practical impact of having 1 TB vs 4 TB available is that index optimization takes 10 minutes instead of 5. While that is a performance enhancement, and you're here asking for ways to increase indexing performance, that's a lot of storage, and saving 5 minutes a few times doesn't meaningfully affect the overall duration of the re-indexing job.

The customer is aware that constant reindexing is not normal or needed, but their concern is the following:

the challenge will be knowing when the corruption occurs, what copy to restore, and then how to fix going forward which seems to be running a reminder to reindex

Practical recommendations:

First, I'm not aware of a method to detect partial "corruption" in search index files. I imagine any such method would involve performing a re-index of all previously indexed documents and then comparing the resulting index files against the source; nothing does this out of the box, and it would necessarily be no faster than a full re-index in any event.

Second, this is a scenario where as soon as it became clear that a full re-index wasn't going to finish fast enough to quickly mitigate the business impact, I probably would have tried restoring the last known good backup of the search catalogs. They might have been missing a day or two of the most recent content, but from what I understand of the use case, that would have been a small price to pay to buy time. 

Something critically important here is to set the Laserfiche Volume Shadow Copy Service (VSS) Writer service to Automatic start (default is Manual) and ensure your backup solution has VSS (sometimes called "application consistent") backups enabled. See: Backing Up Your Search Index Files. If you take a backup of the search index files while they're in use, you'll have invalid index files in the backup. If you can't take VSS backups for whatever reason, you should look into scripting the LFFTS service to stop during scheduled backup windows (which are hopefully outside of business hours).
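
A rough PowerShell sketch of both suggestions; the service names below are placeholders (confirm the real names in services.msc before using anything like this), and the Task Scheduler wiring is left out:

  # Set the Laserfiche VSS Writer service to start automatically and start it now.
  # 'LfVSSWriter' is a placeholder; check the actual service name first.
  Set-Service -Name 'LfVSSWriter' -StartupType Automatic
  Start-Service -Name 'LfVSSWriter'

  # Fallback if VSS-consistent backups aren't possible: stop LFFTS for the backup
  # window (scheduled off-hours), then bring it back up. Again, a placeholder name.
  Stop-Service -Name 'LfFullTextSearch'
  # ... backup job runs here ...
  Start-Service -Name 'LfFullTextSearch'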

Third, get QuickReindex working on the LFS instance (sort out the iFilter errors, etc.). I think the best strategy is having a pre-tested game plan for a QuickReindex, with approximate timings, in case evidence of a similar issue arises again.

  • If the QuickReindex is fast enough to just do that and swap in the new catalog, great.
  • If it's not, plan to restore the last good catalog backup as an immediate mitigation, then QuickReindex and swap in the rebuilt catalog once it's complete.

Conduct QuickReindex performance tests by running it off-hours and specifying a different search catalog directory so you don't affect the current active catalog. If you want to first test with subsets of the repository, you can try providing specific volumes using the -v/-vn flags. If the Laserfiche Server instance has resources similar to the LFFTS instance, this should, in theory, go fast(er).
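
A hedged sketch of such a test run, reusing the parameter style from the earlier invocation; the test catalog path and volume name are placeholders, and the -v behavior should be verified against the utility's documentation:

  # Off-hours test run against a separate catalog directory so the live catalog
  # is untouched; -v limits the run to the named volume (placeholder name).
  qrcmd.exe D:\SEARCH_TEST -s $Database -r $Repository -v VOLUME1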

Note: If you choose to perform the quick reindex in place by specifying your current search catalog directory, you will need to stop your Laserfiche Search service before proceeding. If you choose another directory, full-text searches can continue to be run on your repository while the reindex process occurs. However, in the latter case, when the reindex is done you will need to stop the Laserfiche Search service, swap the existing index files for the new index files, and then restart the service.
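
If you go the separate-directory route, the swap itself is mechanically simple. A sketch with placeholder paths and a placeholder service name:

  # Placeholder names throughout; adjust to your environment before use.
  Stop-Service -Name 'LfFullTextSearch'         # placeholder Search service name
  Rename-Item 'D:\SEARCH' 'D:\SEARCH_old'       # keep the old catalog as a fallback
  Rename-Item 'D:\SEARCH_rebuilt' 'D:\SEARCH'   # move the rebuilt catalog into place
  Start-Service -Name 'LfFullTextSearch'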

Write the procedure/runbook up and add it to the Disaster Recovery plan for the Laserfiche system. I believe the best outcome here is that having a solid, validated plan in place to handle potential future events addresses most of the customer's anxiety (as well as yours), and that you then never have to use the plan because spontaneous search catalog corruption is really not a common event.

Hopefully some of that is helpful.

Cheers,
Sam
