Hi Everyone,
I don't know what is happening, but we have a PDF file that I can't read after importing it into Laserfiche. It looks as if it were encrypted or something like that. I have attached an example. Is this an error?
Thanks
Best Regards,
Vitor
Hi Vitor,
Are there any errors in the event logs (C:\programData\Laserfiche\WebAccess\EventLog\)?
It looks like you're viewing image pages in the screenshot you attached. If you go to Display Options -> view PDF, does it display correctly? Can you open the original PDF in Adobe?
Hi Ryan,
I checked the log file you pointed me to and found this error: "2016-05-03 14:49:06.215 [00062] | CRITICAL | Session: 3reu1knbribicviumovrx5zl | OP: StartImportEntry | Message: Erro: The file format is not compatible with text extraction."
If you want to take a look, I will attach the file. To answer your questions: if I choose the view PDF option, I can see the PDF file correctly. And yes, I can open the original PDF in Adobe without a problem.
I also tested in the Client, and the same thing happens (I have attached a screenshot).
I tried to replicate the error with other files and may have found a hint, though I am not sure. Some of the files I imported worked fine and others did not. I tracked down the origin of one of the failing files and found that it was a PDF generated by a browser. So I generated a PDF from a browser (Chrome), using the print option and selecting save to PDF. When I imported that file into Laserfiche, it had the same problem. When I used the same browser with a PDF printer (CutePDF Writer) instead, the resulting file imported and worked fine.
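For anyone checking their own event logs for the same failure, here is a minimal Python sketch that filters import errors out of a log file. The pattern is modelled only on the single log line quoted above, so other entries in a real log may not match it. The names `LogEntry`, `parse_entries` and `import_failures` are made up for this illustration, and the meaning of the bracketed number (here called `tag`) is not known from the thread.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Pattern modelled on the one log line quoted in this thread (assumption:
# other entries in the same log follow the same pipe-separated layout).
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"\[(?P<tag>\d+)\]\s*\|\s*"
    r"(?P<level>\w+)\s*\|\s*"
    r"Session:\s*(?P<session>\S+)\s*\|\s*"
    r"OP:\s*(?P<operation>\S+)\s*\|\s*"
    r"Message:\s*(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    session: str
    operation: str
    message: str


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed layout."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            yield LogEntry(
                match["timestamp"],
                match["level"],
                match["session"],
                match["operation"],
                match["message"],
            )


def import_failures(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only CRITICAL entries logged by the StartImportEntry operation."""
    return [
        entry
        for entry in parse_entries(lines)
        if entry.level == "CRITICAL" and entry.operation == "StartImportEntry"
    ]
```

To use it, open a log file from the event log folder in text mode and pass the file object to `import_failures`; lines that do not fit the pattern are silently skipped.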
What font was used when the PDF was created? If the font is not embedded, text extraction can run into issues. Usually there is a way to embed the font when the PDF is originally created.
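A rough way to check for embedded fonts without extra tools is to scan the raw PDF bytes for font descriptor keys. The sketch below is only a heuristic: `/FontFile`, `/FontFile2` and `/FontFile3` are the PDF keys that point to an embedded font program, but fonts stored inside compressed object streams will not be visible to a byte scan, so a count of zero does not prove that nothing is embedded. The function name `font_embedding_hints` is made up for this example.

```python
import re

# "/Type /Font" marks a font dictionary; the lookahead keeps
# "/Type /FontDescriptor" from being counted as well.
FONT_DICT = re.compile(rb"/Type\s*/Font(?![A-Za-z])")

# /FontFile, /FontFile2 and /FontFile3 reference an embedded font program.
FONT_FILE = re.compile(rb"/FontFile[23]?(?![A-Za-z0-9])")


def font_embedding_hints(pdf_bytes: bytes) -> dict:
    """Count font dictionaries and embedded font programs visible in the raw bytes."""
    return {
        "font_dictionaries": len(FONT_DICT.findall(pdf_bytes)),
        "embedded_font_programs": len(FONT_FILE.findall(pdf_bytes)),
    }
```

If `font_dictionaries` is greater than zero while `embedded_font_programs` is zero, that is a hint (not proof) that the file relies on fonts that are not embedded.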
Hi Vitor,
It sounds like the problem isn't with viewing/displaying the PDF itself, but with generating TIFF pages from that PDF.
You mentioned you could reproduce the issue using PDFs generated by Chrome's 'Print to PDF' feature. I believe this is due to a bug in Chrome, where Chrome isn't correctly encoding the text in the PDFs it generates. This issue has been reported to the Chrome team here (see comment 58 in particular): https://bugs.chromium.org/p/chromium/issues/detail?id=108937
The Desktop Client and Web Access use a third-party component to generate these pages. It seems that library isn't properly handling PDFs with incorrect text encoding, or isn't handling them as well as it could. We'll follow up with the developers of the library to see if this behavior can be improved in a future release.
Do you know if all these PDFs with garbled text are generated from Chrome? If so, I recommend using a third-party PDF printing utility, like CutePDF, in the meantime.
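If it is unclear which program produced a given PDF, the document information often records it in the `/Producer` and `/Creator` entries. The following Python sketch reads those entries straight from the raw bytes. It is a heuristic under several assumptions: it only finds uncompressed literal strings, it does not decode hex or UTF-16 strings, and the marker substrings in `SUSPECT_MARKERS` are guesses at what Chrome and Microsoft Print to PDF write, not values confirmed in this thread. The function names are made up for this example.

```python
import re

# Matches "/Producer (...)" or "/Creator (...)" literal strings, allowing
# escaped characters and one level of nested parentheses.
INFO_STRING = re.compile(
    rb"/(Producer|Creator)\s*\(((?:\\.|[^\\()]|\([^()]*\))*)\)"
)

# Assumed marker substrings (lowercase); adjust after inspecting real files.
SUSPECT_MARKERS = ("skia", "chrom", "microsoft: print to pdf")


def pdf_generators(pdf_bytes: bytes) -> dict:
    """Return the first /Producer and /Creator strings found in the raw bytes."""
    found = {}
    for key, value in INFO_STRING.findall(pdf_bytes):
        found.setdefault(key.decode("ascii"), value.decode("latin-1"))
    return found


def looks_browser_generated(pdf_bytes: bytes, markers=SUSPECT_MARKERS) -> bool:
    """True if any assumed marker appears in the producer or creator string."""
    values = " ".join(pdf_generators(pdf_bytes).values()).lower()
    return any(marker in values for marker in markers)
```

Reading each file with `open(path, "rb").read()` and passing the bytes to `pdf_generators` gives a quick way to sort a batch of problem files by origin before deciding which ones to reprint through another PDF printer.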
Hi Ryan,
Sorry for taking so long to answer, but yes, all of them were generated from Chrome or from "Microsoft Print to PDF", which is natively available on Windows 10. There was only one case I could not trace back to Chrome, because the file was sent by e-mail and I haven't received an answer so far.
Thanks for the tip, we are currently using CutePDF.
Have a good day!