Question

Workflow: Move Pages activity dramatically increases size of document

asked on November 1, 2022

I have a workflow with a Move Pages activity that combines a PDF with a two-page form saved to the repository in TIFF format. 

  • Forms process used to upload PDF to form
  • Form is saved in TIFF format to repository along with PDF in the same folder
  • Workflow combines the form with the PDF using Move Pages

 

The PDF started out as a 670 KB file with 31 pages. After the above process has run, the resulting document is over 6 MB, and I'm trying to understand how that happens. Is there a way to prevent this type of bloat from occurring? (Additional note: When I use the Email activity in WF to send the document as an attachment, the resulting PDF attachment is over 11 MB.)

Here are the page sizes from the resulting document's properties page:


Replies

replied on November 1, 2022

I just ran a test process and it looks like the PDF file size grows before it even gets to WF. As soon as Forms stores the attachment in the repository, it grows from 670 KB to 6.71 MB. Does anyone know if there is a way to keep the original file size, perhaps a setting in LF somewhere?

 

 

replied on November 1, 2022

Mike,

The PDF isn't growing. The total document size is increasing because TIFF pages are being generated for it; you haven't removed the PDF (electronic file), so you're still seeing that icon, but the added size comes from the pages.

If you right click and look at the properties, you'll see that the PDF is still the same (electronic file size) but the total size of the Laserfiche document is higher because of the pages.

By default, TIFF pages use LZW lossless compression, and a separate image is generated for each page of the document; the resulting files will be inherently larger than the source PDF because they're flattened image data.
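As a rough check on the figures quoted in this thread (a 670 KB PDF with 31 pages, and a 6.71 MB total after Forms stored it), the sketch below works out the average size per page. It assumes 1 MB = 1024 KB and that everything beyond the electronic file is image page data; neither assumption is confirmed by the repository, and `per_page_kb` is just an illustrative helper name, so treat the result as an estimate.

```python
# Figures quoted in this thread; KB/MB treated as 1024-based (an assumption).
PDF_KB = 670       # original electronic file
TOTAL_MB = 6.71    # total Laserfiche document size after Forms stored it
PAGES = 31         # pages in the PDF


def per_page_kb(pdf_kb, total_mb, pages):
    """Return (average KB per PDF page, average KB per generated image page)."""
    total_kb = total_mb * 1024
    # Assumption: whatever is not the electronic file is image page data.
    image_kb = total_kb - pdf_kb
    return pdf_kb / pages, image_kb / pages


pdf_avg, tiff_avg = per_page_kb(PDF_KB, TOTAL_MB, PAGES)
print(f"PDF: {pdf_avg:.1f} KB/page, generated pages: {tiff_avg:.1f} KB/page")
```

Under those assumptions, each generated page averages roughly nine times the size of a page in the source PDF, which is consistent with the pages being stored as image data.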

Laserfiche cannot add pages to a PDF document, so the only way to do that is to convert to TIFF pages and then make the modifications. As a result, you'll also need to make sure you remove that original electronic file (PDF) as it will no longer match once you add pages to the LF document.

replied on November 1, 2022

That makes sense.  Is there a WF activity to remove the electronic document from the new document that was created with the Move Pages activity? It doesn't seem like there should be an electronic document within the new LF document, but if there is, I would want to remove it using a WF activity if it's possible to do that.

 

replied on November 1, 2022

Hi Mike,

The electronic document is not related to the Move Pages activity, and Move Pages cannot create a new document; it can only modify existing documents.

It sounds like the account used to save documents in Forms is configured to generate pages automatically for PDF uploads.

As a result, pages are being generated, which is what allows you to use Move Pages. However, Forms profiles cannot be configured to discard the electronic file.

There's no out-of-the-box workflow activity to remove an electronic file. However, you have two options:

  1. Move the pages from the "source" document into another document so you can just delete the copy with the PDF
  2. Use a workflow script to remove the electronic file

 

Since you're already moving pages from the saved form anyway, I think the first option would be the easiest; just reverse your approach.

Instead of moving the form pages to the uploaded document, you can move the pages of your uploaded document to the beginning of the form document.

That way you can just discard the entry with the electronic file.

replied on November 1, 2022

What I'm doing is saving the form pages as a TIFF file in one folder and the uploaded PDF in a separate folder.  When the Forms TIFF is saved it triggers the WF where I grab the submission ID which is stored in the starting entry's fields, then I use that to search the attachments folder for the uploaded PDFs that have a matching submission ID.  Then I use Move Pages to move the PDF into the TIFF file and then I delete the PDF.  The end result is a TIFF file that is much larger than the original PDF, so I guess we just chalk that up to the TIFF format.

 

replied on November 9, 2022

Workflow does not generate image pages out of PDFs. You'll want to look at the PDF page generation settings for the user specified in the Save to Repository task in Forms since that's where the page generation happens.

As for deleting the original file's e-doc, is there a point in doing that rather than deleting the whole document? Once you've moved its image pages, if you delete the e-doc component too, you'll have an empty doc (with metadata). Does that serve a purpose at that point, or can you just delete the whole document?

replied on November 9, 2022

Miruna,

The Forms user is storing the form as TIFF and the attached PDF to separate repository folders.  I then use Workflow to Move Pages from the PDF into the Forms TIFF.  I think all that does is add pages to the TIFF and it doesn't add the electronic document itself so there's no need to delete it from the TIFF.  I then delete the PDF after the pages have been moved.  The end result is a TIFF that is very much larger than the original PDF which was the reason for my question.

replied on November 9, 2022

Right. If you have pages to move with Move Pages, that's because the PDF had image pages generated at some point.

That could happen in a few ways:

  • at import time in Forms
  • after import with a tool like Quick Fields or Distributed Computing Cluster
  • after import, manually by a user

 

So you'd want to look at image quality and color settings at that step to control size. Since you didn't mention any processing after import, Jason and I were guessing that Forms makes pages when it imports the PDF.
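To get a feel for why resolution and color depth matter so much, here is a minimal sketch that estimates the raw, uncompressed size of one rasterized page. The page dimensions, DPI values, and the `raw_page_bytes` helper are illustrative assumptions, not actual Laserfiche setting names or defaults, and real page files will be smaller once compression is applied.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size of one rasterized page, in bytes.

    Ignores per-row padding and file headers, so this is only an estimate.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical combinations for a US Letter page (8.5 x 11 inches).
for dpi in (200, 300):
    for label, bits in (("black and white", 1), ("grayscale", 8), ("color", 24)):
        size_kb = raw_page_bytes(8.5, 11, dpi, bits) / 1024
        print(f"{dpi} dpi, {label}: about {size_kb:,.0f} KB before compression")
```

The point is the ratio rather than the absolute numbers: going from 1-bit black and white to 24-bit color multiplies the raw data by 24, and going from 200 to 300 dpi multiplies it by 2.25.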

replied on November 10, 2022

Here are the settings for the Forms user:
