Question

Import Agent and password-protected PDFs

asked on May 6, 2015

I do not have access to an Import Agent session I can test with at the moment, so I'm asking here.

 

Say you have a standard secured PDF that prompts for a password when you try to print it from Acrobat Reader. What happens if that PDF is placed into a monitored Import Agent folder, where it would normally be pulled into Laserfiche and converted to TIFF?

 

 

Edit: I was able to confirm that the file does not import; it goes into Import Agent's IAerror folder.

 

 

Answer

SELECTED ANSWER
replied on May 7, 2015

"Is there any way to determine if there is a password on a document? "

One alternative is to run that check outside of Import Agent and Laserfiche, before the file is placed in the folder monitored by Import Agent.
http://stackoverflow.com/questions/11298651/checking-if-pdf-is-password-protected-using-itextsharp
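
For anyone who would rather not pull in iTextSharp just for this check, here is a minimal Python sketch of the same idea using only the standard library. It assumes that an encrypted PDF carries an `/Encrypt` entry in its trailer (or cross-reference stream dictionary) and that this entry is stored uncompressed, so a raw byte search can find it. The name `looks_encrypted` and the `chunk_size` parameter are made up for illustration. Treat it as a heuristic: a file that merely contains the text `/Encrypt` somewhere would be flagged too.

```python
from pathlib import Path

# Key that marks an encryption dictionary in a PDF trailer.
ENCRYPT_KEY = b"/Encrypt"


def looks_encrypted(pdf_path, chunk_size=1 << 20):
    """Return True if the file's raw bytes contain an /Encrypt key.

    Heuristic only (assumption): it does not parse the PDF, it just
    scans the bytes in chunks so large files are not loaded at once.
    """
    overlap = len(ENCRYPT_KEY) - 1
    tail = b""  # last few bytes of the previous chunk, to catch split matches
    with Path(pdf_path).open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if ENCRYPT_KEY in window:
                return True
            tail = window[-overlap:]
```

A pre-check like this could run on each file before it is copied into the monitored folder, with anything that returns `True` set aside for a person to handle.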

replied on May 8, 2015

Raymond,

 

Thank you. This might be something we consider doing!

 

Replies

replied on May 7, 2015

Is the PDF you are using encrypted with 128-bit or 256-bit AES? The former should import as long as you do not create LF pages on import; the latter will not, which is a bug (126208).
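
Since the outcome depends on whether the file uses 128-bit or 256-bit AES, a rough way to tell the two apart without opening the file in Acrobat is to look for the crypt filter method names in the raw bytes. In the PDF format, `/AESV2` denotes 128-bit AES and `/AESV3` denotes 256-bit AES. The sketch below is Python with a hypothetical function name, `guess_pdf_cipher`. It is a byte-level heuristic rather than a parser, and it has not been checked against Import Agent itself.

```python
from pathlib import Path


def guess_pdf_cipher(pdf_path):
    """Guess the encryption type of a PDF from marker names in its bytes.

    Returns one of "none", "aes-128", "aes-256", or "other".
    Assumption: the encryption dictionary is stored as plain text,
    so its crypt filter method name can be found by a byte search.
    """
    data = Path(pdf_path).read_bytes()
    if b"/Encrypt" not in data:
        return "none"
    if b"/AESV3" in data:  # crypt filter method for 256-bit AES
        return "aes-256"
    if b"/AESV2" in data:  # crypt filter method for 128-bit AES
        return "aes-128"
    return "other"  # e.g. RC4, or a layout this heuristic does not recognise
```

Files reported as `"aes-256"` would be the ones expected to hit the bug described above.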

replied on September 14, 2015

Raymond,

Is there anywhere on the support site where we can check on the status of a fix for this bug?

Thanks.

replied on September 14, 2015

Unfortunately, this SCR is not resolved yet. 

Laserfiche depends on third-party libraries that do not yet fully support encryption. Until they do, we cannot address it on our end.

replied on May 7, 2015

Once it is imported into the Client, you can OCR or extract text from it by right-clicking the document and selecting that action. At that point you will be prompted for the password; once you enter it, the OCR or text extraction goes through. That applies if the PDF uses 128-bit encryption. If it does not, you will run into the bug. The alternative at that point is to use Snapshot to import the document into LF.

replied on May 7, 2015

Unfortunately, extracting pages is exactly what we are trying to do. The customer also has limited control over the type of encryption used, since the third-party companies sending the documents choose it.

replied on May 7, 2015

Raymond,

 

Is there any way to determine if there is a password on a document? 

 

Currently the system is set up like this:

1. The customer receives emails with PDFs attached. Occasionally the lawyers who send these PDFs have encrypted them; most are not encrypted.

2. The customer drags these PDFs (generally without opening them) to a Windows share that is monitored by Import Agent. Import Agent extracts the pages from these documents, but it cannot extract pages and OCR them at the same time, so we OCR in a later step.

3. A process OCRs all of the documents. Right now we use a Quick Fields Agent session for this, but we may transition to Distributed OCR.

4. A second Quick Fields session runs on the now-OCR'd documents to identify forms. We keep these steps separate because we also keep a copy of the original packet, so that users can refer back to it if not all forms are separated or identified. Occasionally some of these form identification and extraction processes require zone OCR, which works better if the document is converted to TIFF.
 

 

So our big issue is what happens with PDFs that are encrypted or that have redaction annotations applied over portions of pages. If the entire PDF is locked, it generally goes into the IAError folder. If an annotation is blocking part of a page (say, something placed over an SSN), page generation often works (including what's underneath that annotation!), but the generated page has some corruption on it that causes OCR to fail on that page. The text for that page will either be blank or cover only a very small section of the page.

This causes problems because form identification then fails.

 

So in the end, having some way to tell that a document is one of these special cases, and to set it aside for someone to manually reprint or OCR by hand (so they get the password prompt), would be very helpful.
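
One way to get that kind of separation, sketched in Python under some assumptions: users drop files into a staging folder instead of the monitored share, and a small script moves each PDF either on to the Import Agent folder or into a manual-review folder. The folder arguments, the `triage_pdfs` name, and the `needs_manual` hook are all hypothetical. The example predicate `has_encrypt_marker` only catches encryption through a plain-text `/Encrypt` marker and does nothing about annotation overlays.

```python
import shutil
from pathlib import Path


def triage_pdfs(staging, monitored, manual_review, needs_manual):
    """Move each PDF in `staging` to `manual_review` or `monitored`.

    `needs_manual` is any callable that takes a Path and returns True
    when a person should handle the file (hypothetical hook).
    Returns a dict listing the file names sent to each destination.
    """
    destinations = {
        "monitored": Path(monitored),
        "manual_review": Path(manual_review),
    }
    for folder in destinations.values():
        folder.mkdir(parents=True, exist_ok=True)

    routed = {"monitored": [], "manual_review": []}
    for pdf in sorted(Path(staging).iterdir()):
        # Skip subfolders and anything that is not a .pdf file.
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        key = "manual_review" if needs_manual(pdf) else "monitored"
        shutil.move(str(pdf), str(destinations[key] / pdf.name))
        routed[key].append(pdf.name)
    return routed


def has_encrypt_marker(pdf):
    # Heuristic assumption: encrypted PDFs contain a plain-text /Encrypt key.
    return b"/Encrypt" in Path(pdf).read_bytes()
```

Run on a schedule with `has_encrypt_marker` as the predicate, this would keep locked PDFs out of the IAError folder and hand them to a person instead. A different predicate could be swapped in if a reliable test for the annotation case turns up.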

 

replied on December 6, 2017

Is there any progress on this?

replied on December 6, 2017

Is there any progress on this issue? As security becomes a bigger concern, companies are starting to use 256-bit encryption, and Laserfiche is unable to process or OCR those documents.

 

This is a big concern for offshore companies now.
