Import Agent imports PDF incorrectly but Snapshot works

asked on June 10, 2016

When using Import Agent to import PDFs into our repository, some PDFs end up with bits of text garbled, while the Snapshot utility imports the same files perfectly. Since this will be a large volume of documents, the import needs to be automated, which necessitates the use of Import Agent.

As an example, an original document may look like this:

But once imported by Import Agent it looks like this:

Import Agent has limited settings for document processing. The error seems to affect only certain text characters in the affected documents (in this document, bold capital A's and lowercase r's). I tried importing both with and without OCR, but it made no difference.

Any ideas?

replied on June 10, 2016

Hi Chris,

We just deployed a new version of Import Agent on the support site which, among other things, should increase PDF quality. If you haven't already tried that out, I'd suggest giving that a try and seeing if that performs better for your situation. If that's not an option for you right now, in the meantime I'd suggest trying the approach discussed in the following KB article: https://support.laserfiche.com/KB/1013751.

replied on June 13, 2016

Hi Justin

This worked for the black documents, but as for the embedded documents... no joy. Do you have any other suggestions? We have some users keen on resolving this.

replied on June 13, 2016

Hi Norman,

Can you provide examples of the documents that still don't work to Laserfiche Support, so we can take a look at what's special about them?

replied on June 14, 2016

Hi Justin,

I can provide a document that is experiencing this issue. Due to the fact that it contains confidential student information, I cannot upload it through this interface. Is there an email address I can send the document to?

Additionally, it is my understanding that our VAR has attempted the suggestion in the knowledgebase article, but we have not tried upgrading to version 10 of the software. I am still curious to see if this would solve our issue.

replied on June 14, 2016

Your VAR will be able to submit the document directly to Laserfiche Support.

As for IA 10, I think it uses the same version of the 3rd party library that the workaround does. We'll have an update to the 10.1 Client out very shortly (later this week ideally) that has a newer version of that library (4.7.4). Once that's up, you can use the same file for the workaround and just try using the newer version in it.

That said, IA 10 has some major performance improvements, specifically taking advantage of multi-core machines to run imports multi-threaded, so I'd definitely recommend trying it out regardless.

replied on June 20, 2016

Justin

We will try IA 10; however, the fix you suggested fixed some things and broke others. I am sure it is because we are still on 9.0.0.464. The fix refers to a nonexistent registry entry.

replied on June 20, 2016

Which registry entry are you referring to? I don't see any referenced in the KB.
