
Question

Fulltext entire repo

asked on February 6, 2019

We had a client who uploaded about 40 GB of data. The full-text catalog wasn't run on the storage. Is there a way to run full-text indexing on the entire repository? The documents were imported, and Workflow created the folder structure based on the metadata.


Answer

SELECTED ANSWER
replied on February 18, 2019

I added a For Each Entry module and put the OCR inside it. It worked after I did that.


Replies

replied on February 6, 2019

Are you referring to this?

replied on February 6, 2019

Do you mean the documents were not OCRed or that they were not indexed by the Full Text Search Engine?

If they are just not OCRed (and therefore cannot be text searched), you can run a search to find them, select the results, and generate text. It will take a while to OCR that many, so it may be better to use Distributed Computing Cluster (DCC).

replied on February 12, 2019

This is what I have set up in Workflow:

 

but it doesn't seem to work at all.

replied on February 12, 2019

OCR is the process of extracting text from the image and creating the text layer.  This text layer is then used to index the words in the Search Catalog.  Only after the document has been indexed by the LFFTS is it text searchable.

There are several reasons a document may be OCRed but not indexed. The most common is that the Search Catalog sometimes gets corrupted and either goes offline or becomes read-only. In that condition nothing new gets indexed, and few if any text results are returned. To fix this, delete the old Search Catalog, create a new one, and reindex the whole repository.

Another common reason is that the LFFTS service has stopped. Restart the service and then try to manually trigger indexing of the document.
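
To illustrate the relationship between OCR, the text layer, and the search index described in this reply, here is a toy Python sketch. It is not Laserfiche code and does not call any Laserfiche API; `Document`, `build_index`, and `search` are hypothetical names made up for the example. It only shows why a document with an empty text layer contributes nothing to a full-text index, even after a complete reindex.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a repository entry (not a Laserfiche type)."""
    doc_id: int
    name: str
    text_layer: str = ""  # stays empty until OCR or text extraction has run


def build_index(documents):
    """Map each lowercase word to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc in documents:
        for word in re.findall(r"[a-z0-9]+", doc.text_layer.lower()):
            index[word].add(doc.doc_id)
    return index


def search(index, word):
    """Return the sorted IDs of documents whose text layer contains the word."""
    return sorted(index.get(word.lower(), set()))


docs = [
    Document(1, "invoice.tif", text_layer="Invoice total due"),
    Document(2, "scan.tif"),  # imported image that was never OCRed
]

# Indexing runs to completion, but document 2 has no words to contribute.
index = build_index(docs)
hits_before = search(index, "contract")

# Assumed stand-in for OCR output; a real OCR engine would produce this text.
docs[1].text_layer = "Signed contract"
index = build_index(docs)
hits_after = search(index, "contract")
```

In this sketch, reindexing alone never makes `scan.tif` findable; it only becomes searchable once its text layer is populated and the index is rebuilt.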

replied on February 12, 2019

Sometimes it's as simple as just seeing if you can restart indexing. If that doesn't work, you can do the above.

replied on February 12, 2019

No, you have really confused me. The documents ARE indexed, but we still can't search and find document text. Why would that be? I created a new search catalog and reindexed the repo, and it all completed. Why would I not be able to search the text on the document? Do I also have to use DCC and OCR each document?

replied on February 12, 2019

So basically you are saying I have to OCR before it can index? If I don't OCR, I would assume the index would have nothing to catalog? If this is the case, wow.

replied on February 12, 2019

The documents must have text before they can be indexed. If the documents have text in the text pane, then it should be possible to index them. If there is no text, then you'll have to OCR them. Are the documents native Laserfiche documents, with pages, or are they some other format?

Workflow does not perform the OCR on its own. It needs DCC to do the work, so you'll have to set up DCC first, and then it will work.
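
As a rough sketch of the per-entry approach discussed in this thread (visit each entry, and OCR only the ones that have no text), the logic might look like the Python below. This is an illustration only: it is not Workflow or DCC code, the entries are plain dictionaries, and `needs_ocr`, `process_entries`, and `fake_ocr` are hypothetical names introduced for the example.

```python
def needs_ocr(entry):
    """An entry needs OCR when its text is missing or only whitespace."""
    return not entry.get("text", "").strip()


def process_entries(entries, ocr):
    """Visit each entry in turn, OCR those without text, and return their names."""
    processed = []
    for entry in entries:
        if needs_ocr(entry):
            entry["text"] = ocr(entry)
            processed.append(entry["name"])
    return processed


def fake_ocr(entry):
    """Assumed placeholder for a real OCR engine; returns canned text."""
    return "text recognized from " + entry["name"]


entries = [
    {"name": "letter.tif", "text": "Dear customer"},  # already has text
    {"name": "scan-001.tif", "text": ""},             # no text yet
    {"name": "scan-002.tif"},                         # text field missing entirely
]

ocred = process_entries(entries, fake_ocr)
```

Entries that already have text are skipped, so only the documents that actually lack a text layer incur the OCR cost.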

replied on February 12, 2019

OK, so I am really confused here. What is the difference between full-text search and OCR? What is the point of full-text indexing a document when I can't search in the client by the information in the pages? So if the user wants to search by words, it isn't full-text searching but OCR searching?
