Question

Document Searchable text size vs Retrieve Document Text Token Size

asked on January 24, 2020

Hi, I am working on a workflow that searches documents' searchable text using pattern matching. With only a few documents, however, we have run into an issue when trying to retrieve the text. In the LF Client, when you view the searchable text, the whole page has been OCR'd. Within Workflow, however, when I retrieve this text, not all of it is extracted. There seems to be a character limit on Workflow's Retrieve Document Text token. This causes issues for me because some of the items I need to test with pattern matching are at the end of the document (1 page), and the text is not found because it is not within the Retrieve Document Text token.

I was wondering if anyone knows a solution for this, as we will be running this process on a daily basis.

Answer

SELECTED ANSWER
replied on January 24, 2020

Yes, both Track Tokens and the testing area in Pattern Matching are limited. The actual running workflow is not.

Replies

replied on January 24, 2020

How are you determining that not all the text was extracted? There is a character limit to the value recorded through Track Tokens, but the token at runtime has the entire content of the document.
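
To illustrate the distinction, here is a minimal Python sketch. It is not Laserfiche code and does not use any Laserfiche API; the `PREVIEW_LIMIT` value, the `preview` helper, and the sample text are all hypothetical. It only shows how a pattern located at the end of a page is found in the full text but missed when the same pattern is tested against a truncated copy, such as a logged or pasted value.

```python
import re

# Hypothetical display cap for illustration only; this is NOT the actual
# limit used by Track Tokens or the Pattern Matching test area.
PREVIEW_LIMIT = 100


def preview(text, limit=PREVIEW_LIMIT):
    """Return the truncated value a logging or preview tool might show."""
    return text[:limit]


# Simulated one-page OCR text with the item of interest at the very end.
full_text = "Lorem ipsum filler line. " * 20 + "Invoice Total: 1,234.56"

# Hypothetical pattern for a value that sits at the end of the page.
pattern = re.compile(r"Invoice Total:\s*([\d,]+\.\d{2})")

# Matching against the full text (what the running process holds) succeeds.
match_full = pattern.search(full_text)

# Matching against the truncated copy (what a preview displays) fails.
match_preview = pattern.search(preview(full_text))

print("full text length:", len(full_text))
print("preview length:", len(preview(full_text)))
print("found in full text:", match_full is not None)
print("found in preview:", match_preview is not None)
```

If a pattern works here only against the full text, the truncation is in the copy being inspected and not in the text being matched.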

replied on January 24, 2020

I track the tokens in the workflow, and I can see differences between the tracked value and the document's text. I can also see that when I copy text from the document in the LF Client into a pattern matching test, only a certain number of characters are copied.

0 0
replied on January 27, 2020

I can confirm this as I created an email task and emailed myself the text captured from the Retrieve Document Text. Thank you for pointing this out. 

2 0