I'm Doing Something Wrong When Converting PDFs to TIFFs...

asked on September 14, 2022

Hello! We have a process (albeit a currently broken one) for our new hire paperwork: we take 10+ PDF documents, each with a unique file name based on the name of the form, and drag and drop them into a specific folder within our repository.

I've then built a workflow to run as soon as documents are placed into that folder, and if the document name matches one of the several conditional decision branches (one for each potential form being imported), then it will convert that PDF into a TIFF document, rename it, and then delete the original PDF. Two of those branches are pictured below:

This process used to work fine, with one exception: for any document longer than one page, the workflow would create not only the correct TIFF file but also an erroneous, empty document with zero pages and the same name.

Without changing anything, the process has broken even further since. Now, no documents are created correctly. The TIFF files are created, and claim to have pages, but when opening one of those documents, this is all that appears (last name has been redacted):

TL;DR I have no idea what I've done wrong and would appreciate any assistance. If seeing additional details about how certain activities are configured would be helpful, I can happily provide additional screenshots of those. Thank you very much for your time, and have a wonderful rest of your day!


replied on September 14, 2022

I only see Create, Move, Assign, Rename, and Delete activities. What activity converts a PDF into a TIFF? I have not heard of an activity that does this, and there are many feature requests for it.

Maybe the users uploading were doing the conversion on their workstation and they stopped doing this?

Laserfiche offers a conversion utility but only for the workstation, not for the server.


replied on September 14, 2022

Chad,

To my knowledge, so long as the user logged into the repository has the "Generate pages for PDFs" setting toggled on, pages will be created for all PDFs imported by that user. If doing this through Forms, the user whose credentials are associated with the Save to Repository step in the Business Process would need to have that setting enabled.

So in this workflow, I'm creating an electronic entry (TIFF file), then moving the pages from that PDF to the TIFF file, renaming it, and then deleting the original PDF.


replied on September 14, 2022

In that case you're moving pages from a TIFF file (already created at the workstation) to another TIFF file. There is no option to move pages from PDF to TIFF (that would be a conversion).

Do the pages become corrupted after moving them or are they already corrupted before you move them?


replied on September 14, 2022

I think you should check to make sure the option is still enabled for the user(s).

That being said, I'd recommend moving away from that method in favor of using the DCC's new PDF Page Generation functionality.

The flaw in this process is that you're depending on individual user settings that cannot be easily controlled; if they uncheck that box it breaks your process.

If you use the DCC method, you can avoid that dependency. You can schedule PDF page generation as the first step, then use the callback options to trigger your subsequent workflow processes.

Based on your screenshot, it seems like the erroneous empty document would just be the leftover file after moving all the pages, unless you have it set to delete documents when all pages are removed.
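To illustrate the leftover-document point, here is a rough sketch in plain Python. It is not Laserfiche code and does not call any Laserfiche API; `Entry`, `convert`, and `delete_source` are made-up names that only model the create, move pages, and delete steps described in this thread.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare entries by identity, not by contents
class Entry:
    """Hypothetical stand-in for a repository document."""
    name: str
    pages: list = field(default_factory=list)


def convert(folder, source, new_name, delete_source):
    """Create a new entry, move all pages into it, optionally delete the source."""
    target = Entry(new_name)
    target.pages.extend(source.pages)  # move every page to the new entry
    source.pages.clear()               # the source now has zero pages
    folder.append(target)
    if delete_source:
        folder.remove(source)          # without this, an empty entry is left behind
    return target


# Case 1: the source is not deleted, so a zero-page document remains.
folder_a = [Entry("W-4", ["page 1", "page 2"])]
source_a = folder_a[0]
target_a = convert(folder_a, source_a, "W-4 (converted)", delete_source=False)

# Case 2: the source is deleted after the move, so only the new entry remains.
folder_b = [Entry("W-4", ["page 1", "page 2"])]
target_b = convert(folder_b, folder_b[0], "W-4 (converted)", delete_source=True)
```

In the first case the folder ends up with two entries, one of them empty, which matches the symptom described in the question. Whether that is what actually happens in this workflow depends on how the Move Pages and Delete activities are configured.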


replied on September 14, 2022

I can't tell at what point the pages are corrupted as they are converted the second they land in the repository, but I can only assume it's a result of the workflow.

And Jason, can you please provide more information on the DCC method you mentioned? I'm unfamiliar with what that even stands for, much less what that is, haha. Thank you!


replied on September 14, 2022

I wouldn't assume the pages are being "corrupted" unless you first verify that the pages were initially fine before the workflow ran.

Some options would be to use a Track Tokens activity to track the page count at the start of the workflow and/or temporarily preserve an unaltered copy of the original until you get the issues resolved.
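As a rough sketch of that check (plain Python, not Workflow or Laserfiche code; `snapshot` and `diagnose` are hypothetical helper names), the idea is to record the page state before the workflow touches the document and compare it afterwards:

```python
def snapshot(pages):
    """Record the page count and a copy of the pages before the workflow runs."""
    return {"page_count": len(pages), "pages": list(pages)}


def diagnose(before, after_pages):
    """Compare the pre-workflow snapshot with the pages found afterwards."""
    if before["page_count"] == 0:
        # Nothing was there to begin with, so the workflow cannot be at fault.
        return "no pages before workflow: check page generation on import"
    if len(after_pages) != before["page_count"]:
        return "page count changed during workflow"
    if list(after_pages) != before["pages"]:
        return "page content changed during workflow"
    return "pages unchanged by workflow"


before = snapshot(["page 1", "page 2"])
result_same = diagnose(before, ["page 1", "page 2"])
result_missing = diagnose(before, ["page 1"])
result_empty = diagnose(snapshot([]), [])
```

If the "before" state already shows zero or bad pages, the problem is upstream of the workflow (for example, in how pages are generated on import) rather than in the move or rename steps.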

DCC is short for Distributed Computing Cluster, and it is a separate system that works in conjunction with Workflow to offload tasks.

Originally the DCC could only be used for the Schedule OCR activity in Workflow; however, in version 11 they added Schedule PDF Page Generation as well as options to trigger a specific workflow when the task/job completes.

