Question

Search and redact

asked on May 29, 2020

Good day,

I'm looking for a way to search a document for a specific word or phrase and have it redacted. Depending on the document, the term could appear dozens of times [Legal Department]. An example would be a 100-page document where the name "Smith" appears 25 times, anywhere on the page, and sometimes more than once per page. Ideally, I would like to search the document and have the name redacted throughout, all at once.

Is this something the base client install can do, or are we going to have to rely on something like Quick Fields to accomplish this?

Any suggestions are welcome!

Thanks,

Replies

replied on May 29, 2020

You can use Forms in conjunction with workflow to collect a word or phrase and a document ID. A workflow can then be called to redact the word or phrase each time it is found in the document with the provided ID.

replied on May 29, 2020

As Justin mentioned, this can be done with a combination of Forms and Workflow. We are working on a very similar process for redacting documents.

However, it is worth noting that automated redaction was only added in Workflow 10.4, and I think the 10.4.1 update is the minimum version needed to find and redact based on a token/variable.

You don't need Quick Fields, but the documents would need to be stored as native Laserfiche documents (TIFF pages) and fully text-searchable for the Workflow activity to locate matches.

That is, if your documents do not already have text for every page, they would need to be OCR'ed first.

The Workflow activity you're looking for is called "Apply Text Annotation" and below is an example of how we have ours configured for testing.

Based on my experience, you might still want to have a human "review" process afterward because OCR is not 100% accurate and as a result the automated redaction might miss something that is handwritten or had some other issue like print/image quality.

replied on June 2, 2022

How do I stop the "fuzzy match" when applying an annotation using Workflow?
With this, if I want to redact Tom, it redacts both "Tom" and "Tomorrow".

Is there any way to apply the automatic annotation only when it finds an exact match (not a fuzzy match)?

replied on June 2, 2022

Are you using the Search Repository activity in Workflow? There's an option to use fuzzy searches or not in its properties. Uncheck the box and republish the workflow.

replied on June 2, 2022

I'm not using the "Search Repository" activity.

I have a multi-value field in a template called "search and redact".
Staff put all the words they need redacted in that multi-value field and save the file to run the workflow.

See the attached screen capture of the workflow.

Capture.PNG (38.98 KB)

replied on June 2, 2022

In that case, you're going to have to tweak the pattern to account for complete words. A regular expression matches its pattern anywhere in the text, including inside longer words, so "tom" will match inside both "atom" and "tomorrow". If you want to match only the whole word, you could require whitespace before and after the value.

Something like \s%(token here)\s
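
To see the difference outside of Workflow, here is a rough Python sketch. It is only an illustration: the sample sentence and variable names are made up, the search term is hard-coded where the Workflow token would go, and case-insensitive matching is an assumption. Python's regex engine is also not necessarily the one Workflow uses.

```python
import re

# Made-up sample text; the literal "tom" stands in for the Workflow token.
text = "Tom met an atom expert. Tomorrow Tom leaves."

# Unanchored pattern: matches the letters anywhere, even inside longer words.
unanchored = re.compile("tom", re.IGNORECASE)
unanchored_hits = [m.group(0) for m in unanchored.finditer(text)]

# Whitespace required on both sides: skips "atom" and "Tomorrow", but the
# match now includes the surrounding spaces and misses the "Tom" at the
# very start of the text, because nothing precedes it.
spaced = re.compile(r"\stom\s", re.IGNORECASE)
spaced_hits = [m.group(0) for m in spaced.finditer(text)]

print(unanchored_hits)
print(spaced_hits)
```

So the \s version filters out partial words, but it also swallows the neighboring whitespace and skips a word that sits at the start or end of the text or right next to punctuation.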

replied on June 2, 2022

Thanks. I'm not an expert in RegEx but have one last question.

\s%(token here)\s kind of worked, but not exactly.

For example, the regex above redacts "are Tom" from the sentence "You are Tom", and if I remove the \s before the token, it redacts "atom", as you mentioned.

Is there any better way of doing this?
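
One common option for whole-word matching is the \b word-boundary anchor in place of \s, since \b matches a position and does not consume the neighboring character. The Python sketch below shows the idea under a few assumptions: `whole_word_pattern` and `redact` are hypothetical helper names, the "X" mask just stands in for a redaction, matching is case-insensitive, and whether the Workflow pattern field accepts \b in your version is something to verify.

```python
import re

def whole_word_pattern(term):
    """Build a case-insensitive pattern that matches `term` only as a whole word."""
    # re.escape keeps characters such as "." in the term from acting as regex syntax.
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

def redact(text, term, mask="X"):
    """Replace each whole-word occurrence of `term` with a mask of equal length."""
    pattern = whole_word_pattern(term)
    return pattern.sub(lambda m: mask * len(m.group(0)), text)

sample = "You are Tom. Tomorrow Tom studies the atom."
redacted = redact(sample, "Tom")
print(redacted)  # You are XXX. Tomorrow XXX studies the atom.
```

One caveat: \b only marks a boundary between a word character and a non-word character, so a term that starts or ends with punctuation may need a different anchor.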

replied on June 3, 2022

Never mind. I was able to figure out the regex to make it work.
Thanks!
