asked on June 21, 2019

Documents are being scanned at 600 DPI as 17" x 11" (sometimes 8.5" x 11"). The PDFs seem to report the correct page size when viewed before Import Agent processes them. After Import Agent imports them into LF, the resulting documents report a wildly incorrect page size and DPI (such as 141.67" x 91.67" at 72 DPI, or 70.83" x 91.67" at 72 DPI). The reported DPI is consistently 72. I imagine the low DPI number is a result of comparing pixels to page size, which in this case seems to be artificially inflated by the Import Agent process.
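As a sanity check on the numbers above: the reported sizes are exactly what you get when the pixel dimensions of a 600 DPI scan are divided by 72 instead of 600. Below is a minimal sketch of that arithmetic. The function name is made up for illustration, and the idea that the import step falls back to 72 DPI is only an assumption based on the numbers, not confirmed Import Agent behavior.

```python
def reported_size_inches(width_in, height_in, scan_dpi, assumed_dpi=72):
    """Page size (inches) that results if a scan's pixel dimensions
    are interpreted at assumed_dpi instead of the real scan_dpi.

    Hypothetical helper for illustration only; assumed_dpi=72 is a guess
    at what the import step might be doing.
    """
    # Pixel dimensions produced by the scanner
    width_px = width_in * scan_dpi
    height_px = height_in * scan_dpi
    # Physical size implied when those pixels are read at the wrong DPI
    return (round(width_px / assumed_dpi, 2), round(height_px / assumed_dpi, 2))


print(reported_size_inches(17, 11, 600))   # (141.67, 91.67)
print(reported_size_inches(8.5, 11, 600))  # (70.83, 91.67)
```

Both results match the sizes LF reports, which suggests the pixel data is intact and only the resolution information is being lost or ignored somewhere in the import.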

This behavior appears to have started immediately after the 10.4 upgrade in April.  Looking back at older documents, we can find issues where OCR may not have been ideal, but page size and DPI numbers were correct, and the OCR process never failed completely.

The main consequence of the incorrect page size is that OCR often fails completely (with errors) on documents where the page size is artificially large:
    Error reading file. /  [6408]
    Error preparing page for OCR /  [404]

In cases where it doesn’t fail as above, OCR does not work as well as it does on a document with a properly-reported page size.

We haven't been able to find a combination of scanner, Import Agent, and OCR settings that mitigates this issue.


 
