Question

Retrieve document text limit

asked on March 7, 2018

I need to search for text in an MS Word document using Workflow.

I can see the text in the OCR'ed text that is shown when I view the document in the Laserfiche Client.

I can find text when using "Retrieve Document text" in Workflow, but it only returns about 250 characters.

The Word documents have 11 pages and quite a bit of text, e.g. more than 1000 characters.

Is there a limit to the number of characters returned by "Retrieve Document text"?

I have got around the problem by using "Search Repository" instead, and this finds the required text quite quickly.

Answer

SELECTED ANSWER
replied on March 25, 2018

I found the issue. The tick-box "For each page, create a separate value in the multi-token Text token" was checked, so my workflow was only testing the first value of this multi-value token, i.e. the text of the first page.

Un-checking this tick-box resolved the problem. My workflow now checks the text as one whole token.

I tried to get the individual values from the multi-value token but was unable to build a workflow that does this. That is not an issue, as the other workflow is now working.
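For anyone who wants to see why the per-page setting made the text look truncated, here is a minimal sketch in plain Python. It is not Laserfiche or Workflow code: the list `pages` stands in for the multi-value token (one value per page), and the function names and sample text are made up for illustration.

```python
def find_in_first_value(page_texts, needle):
    """Mimics a condition that only tests the first value of a multi-value token."""
    return bool(page_texts) and needle in page_texts[0]


def find_in_any_value(page_texts, needle):
    """Tests every value (page) in turn, like looping over the token's values."""
    return any(needle in text for text in page_texts)


def find_in_joined_text(page_texts, needle):
    """Tests the pages joined into one string, like the single whole-text token."""
    return needle in "\n".join(page_texts)


# Hypothetical two-page document: the text we want is on page 2.
pages = [
    "Cover page with a short introduction.",
    "Section 2 mentions the invoice number INV-1001.",
]

print(find_in_first_value(pages, "INV-1001"))
print(find_in_any_value(pages, "INV-1001"))
print(find_in_joined_text(pages, "INV-1001"))
```

Only the first check misses the text, which matches what I saw: the workflow was looking at page 1 only. One caveat with the joined version is that a phrase split across a page break is separated by the joining newline, so it would not match as a single phrase.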

 


Replies

replied on March 8, 2018

In your Retrieve Text activity, did you set it to retrieve all pages or a specific page?

replied on March 8, 2018

I set it to retrieve all pages.

The text retrieved does not even include all of the first page. I mentioned above that the document was 1000 characters; it is actually 1000 words in MS Word.

replied on March 9, 2018

It sounds like your document had its text extracted when it was put in Laserfiche, was then edited, and did not have its text pages regenerated after that. If you right-click it in the Client and run "Generate Searchable Text" again, it should update its text pages so Workflow can get the current text.
