I have a client that just upgraded from version 8 to 9, and they are trying to index 400 GB worth of documents and data. The drive they are using to store and run the index is 500 GB, so they only have 100 GB of free space to run the index. How much more room will be needed to run this index? Since they cannot take away storage once they allocate it to that drive, they would like a rough estimate of how much they should ask their IT to give to that drive. I knew 100 GB of space wasn't enough for a 400 GB repository, but I don't know how much should be enough. Any suggestions?
Replies
It is safe to reserve 1.5 TB for the index. The required space can be calculated this way:
4 * size of text documents + size of electronic documents
Please see https://support.laserfiche.com/ow.aspx?LFFTS9%2E2%2FTroubleshooting921 for details.
In 10.2, less disk space is required. It can be calculated as:
3 (instead of 4) * size of text documents + size of electronic documents whose extensions are supported in LFFTS (see Admin Console => Index Properties => 'Electronic Text Extraction' tab)
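As a rough sketch, the two formulas above can be put into a small Python helper. The function name and the 100 GB / 300 GB split between text and electronic documents are made up for illustration; use the real sizes from your own repository.

```python
def peak_index_space_gb(text_gb, edoc_gb, multiplier=4):
    """Estimate the disk space (in GB) to reserve for the index.

    multiplier=4 follows the formula quoted for version 9;
    multiplier=3 follows the formula quoted for 10.2.
    edoc_gb should only count electronic documents whose
    extensions are supported for text extraction.
    """
    return multiplier * text_gb + edoc_gb


# Hypothetical split of a 400 GB repository (not real measurements).
text_gb = 100
edoc_gb = 300

v9_estimate = peak_index_space_gb(text_gb, edoc_gb, multiplier=4)
v102_estimate = peak_index_space_gb(text_gb, edoc_gb, multiplier=3)

print(f"Version 9 estimate:  {v9_estimate} GB")
print(f"Version 10.2 estimate: {v102_estimate} GB")
```

With this assumed split, the version 9 formula gives 700 GB and the 10.2 formula gives 600 GB. The result depends heavily on how much of the repository is actually text.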
Any changes for 10.4.1 on calculating this size?
It's impossible to give estimates. Although it's worth noting that the only thing that really matters is the size of the .txt files generated as a result of OCR and text extraction.
In any case, disk space is super cheap these days. I'd recommend going with a 2 TB hard drive. That way you don't have to worry about silly stuff like this.
Is it possible to reindex specific folders and not the entire repository?
You can index a folder in the desktop client. In the client, list the documents in a folder, select all of them, and then index them. Please see https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/UserGuide/Laserfiche_Client.htm#Extraction_and_Indexing/Indexing.htm for details.
Note that re-indexing specific folders does not necessarily use much less space than re-indexing the entire repository. It depends on the size of the index files you have now.
The index size may go up to the following number during indexing:
3 * existing index files size + 4 * documents size in the folders
In most cases, the index does not grow that large. However, it is highly recommended to reserve sufficient free space.
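To check a planned folder re-index against the free space on the index drive, the formula above can be sketched in Python as follows. The function names and the sample numbers (a 5 GB existing index, a 20 GB folder, 72 GB free) are hypothetical, and the result is a worst-case upper bound rather than a prediction.

```python
def peak_folder_reindex_gb(existing_index_gb, folder_docs_gb):
    """Worst-case index size (GB) while re-indexing some folders:
    3 * existing index files size + 4 * documents size in the folders."""
    return 3 * existing_index_gb + 4 * folder_docs_gb


def has_enough_space(free_gb, existing_index_gb, folder_docs_gb):
    """True if the free space covers the worst-case estimate."""
    return free_gb >= peak_folder_reindex_gb(existing_index_gb, folder_docs_gb)


# Hypothetical numbers for illustration only.
peak = peak_folder_reindex_gb(existing_index_gb=5, folder_docs_gb=20)
print(f"Worst-case peak: {peak} GB")
print("Enough space:", has_enough_space(72, 5, 20))
```

With these assumed inputs the worst case is 95 GB, so 72 GB of free space would not cover it if every document in the folder were text.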
Hey Yiping,
Thank you for the responses. We were thinking about creating new volumes and migrating documents into more manageable portions, so that we don't have to index everything all at once. If we make our volumes, let's say, 20 GB each, 72 GB of free space should be enough to index one volume and move on to the next without failing, correct? (Assuming the 20 GB isn't all text documents.) Also, will migrating the documents cause any system downtime?
Are there instructions on how to move the repository and any necessary files over to another drive and perform the indexing there? We were just granted 1.8 TB of space on another drive that they want us to use in order to index the repository. Once done, they will delete that drive. I understand that I must move certain things over, and then move them back to the original drive once the indexing is done.
How do I perform this indexing of the repository on the other drive, what files or volumes are needed, and what precautions should I take in terms of downtime?
I essentially need a step-by-step process for performing this action.
Thanks!
Hello, Kyle,
After you re-index the repository or volume, modifying documents will still generate index requests, and the index files may still need a lot of space (3 * index file size). So it is recommended to keep the large drive after the re-index, and not to move the index files to a smaller drive. For the very same reason, indexing a repository volume by volume is not a solution for the index file space issue. As for volume management itself, you may want to have a look at https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/AdminGuide/LFAdmin.htm#Overview_Volumes.htm%3FTocPath%3DLaserfiche%2520Administration%2520Guide%7CVolumes%7C_____0
The index file location is independent of the repository and volume locations. This means you can have one volume on drive C, one volume on drive D, and the search catalog on drive E. To change the search catalog location, you can follow this page: https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/AdminGuide/LFAdmin.htm#Attach_Detach_Index_Catalog.htm%3FTocPath%3DLaserfiche%2520Administration%2520Guide%7CSearch%2520and%2520Indexing%2520Administration%7CIndexed%2520Searches%2520and%2520Indexing%7CManaging%2520the%2520Search%2520Catalog%7C_____3 You need to detach the catalog first, then move the search files to the new location, and then attach the catalog again.
Alternatively, if you don't have any existing index yet, or you simply want to re-create everything from scratch, you can delete and create the catalog: https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/AdminGuide/LFAdmin.htm#Create_Delete_Index_Catalog.htm%3FTocPath%3DLaserfiche%2520Administration%2520Guide%7CSearch%2520and%2520Indexing%2520Administration%7CIndexed%2520Searches%2520and%2520Indexing%7CManaging%2520the%2520Search%2520Catalog%7C_____4
It may take several minutes to perform those operations. During this time, full-text search will not be available. However, users can still do database searches, and can create, view, modify, and delete documents in the repository.
I went ahead and started the indexing on the new volume. I created a new folder for the index to be created. It has been running for 10 minutes and we have not seen any space taken away from the new drive. It still says that it has 1.79tb free of 1.79tb. How quickly should we see a portion of space lost to the new index files?
The index files will not keep growing until they fill the 1.8 TB. Most likely they will end up at 100~500 GB (depending on the documents in your repository) after several hours or days (depending on the hardware specification of the machine). During indexing, the index files will be optimized several times. Optimization requires extra disk space, anywhere from 100 GB to 1 TB. It is possible that an optimization pass takes only several minutes to complete.
You can use Admin Console to track index progress, though it doesn't tell the index file size. https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/AdminGuide/LFAdmin.htm#Indexing_Status.htm%3FTocPath%3DLaserfiche%2520Administration%2520Guide%7CSearch%2520and%2520Indexing%2520Administration%7CIndexed%2520Searches%2520and%2520Indexing%7CManaging%2520the%2520Search%2520Catalog%7CIndexing%2520Status%7C_____0
Thank you. I received a final file size of 14 GB, and after attaching the catalog it shrank down to 3 GB. The idx file stayed the same size, so I believe it's fine. Is it abnormal for a 400 GB repository to have a 3 GB index file?
It looks good. A 3 GB index would be too small for 400 GB of text pages, so I suppose there are many image pages in the repository. An image page is not necessarily a picture. It can be an image copy of a text page, usually generated by a scanner. Image pages are much larger than text pages. It is common for a 400 GB text/image repository to have only 4 GB of text pages. In that case, a 3 GB index makes sense. The repository could also contain some huge electronic documents, like videos. I suggest picking some documents and searching for words that appear in them. If they are returned in the search results, then it should be fine.
You can use the desktop client to get the text size and image size. Right-click the root folder of the repository, open the properties dialog, and go to the Folder tab. Please see https://www.laserfiche.com/support/webhelp/Laserfiche/9.2/en-US/UserGuide/Laserfiche_Client.htm#cshid=Metadata/Document_and_Folder_Properties_Dialog.htm for details.