Question

Can Laserfiche remove a cover page from a PDF? If not, can it remove a cover page from a TIFF and keep the file at a normal size?

asked on November 28, 2018

I am looking at options for removing a cover page from a document, and it seems that whether I use Quick Fields or Workflow, I have to generate the document as TIFF pages. The problem is that the file is huge when I export it from the repository. I have looked at several options for dealing with the size issue, and I'd like to know whether there is any way to retain the original file size or file format (PDF), or whether my only option is to convert to pages and find a way to deal with the big file later. I am currently on Laserfiche 10.3.

 

Thanks!

Replies

replied on November 28, 2018

To answer your first question: no, Laserfiche cannot edit PDF files, so you'll need to generate pages and determine the best approach for minimizing file size in your specific circumstances.

 

The short version is that the "best" solution really depends on the nature of the source document and your import/processing settings.

If you generate color (TIFF-LZW) pages, they will generally be much larger than the source PDF, because each page is stored as a losslessly compressed full-page raster image, whereas a PDF is a complex collection of layered and optimized components.

If you convert those images back to PDF with default compression, the resulting PDF will be much larger than the original for a variety of reasons. In our environment, we apply variable compression based on the average page image size, and it makes a significant difference with no noticeable impact on quality.

On the other hand, importing the PDF in monochrome (TIFF Group IV), or importing it in color and removing the color in Quick Fields, will yield much smaller image files, and in many cases exporting those with the right compression will even give you a PDF smaller than the original.

Another option for color is to use TIFF-JPEG instead of TIFF-LZW; unlike LZW, JPEG compression is lossy, but the resulting images are much smaller.
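To put rough numbers on the color versus monochrome difference, here is a small arithmetic sketch. The page size (US Letter) and resolution (300 dpi) are assumptions chosen for illustration, not values from this thread, and `PageSizeEstimate` and `RawBytes` are hypothetical names. It only estimates uncompressed raster size; actual TIFF-LZW, Group IV, or JPEG pages will be smaller and depend heavily on page content.

```csharp
using System;

public static class PageSizeEstimate
{
    // Approximate uncompressed size in bytes of one page image
    // at the given dimensions, resolution, and bit depth.
    // Row padding and file headers are ignored.
    public static long RawBytes(double widthInches, double heightInches, int dpi, int bitsPerPixel)
    {
        long widthPx = (long)Math.Round(widthInches * dpi);
        long heightPx = (long)Math.Round(heightInches * dpi);
        long bits = widthPx * heightPx * bitsPerPixel;
        return (bits + 7) / 8; // round up to whole bytes
    }

    public static void Main()
    {
        // Assumed example: US Letter page at 300 dpi.
        long color = RawBytes(8.5, 11, 300, 24); // 24-bit color
        long mono = RawBytes(8.5, 11, 300, 1);   // 1-bit monochrome

        Console.WriteLine("Color bytes:      " + color);
        Console.WriteLine("Monochrome bytes: " + mono);
        Console.WriteLine("Ratio:            " + (color / mono));
    }
}
```

Under those assumptions, a color page starts at roughly 25 MB of raw data and a monochrome page at roughly 1 MB, a factor of 24 before any compression is applied. That gap is a large part of why the monochrome route tends to export so much smaller.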

 

Again, it all depends on your circumstances.

replied on November 28, 2018

Thanks! That is helpful information. What do you do to apply variable compression? I've never heard of this being done before.

replied on November 28, 2018

We use an SDK script within Workflow to generate the PDF and attach it to the document; the code evaluates average page size, sets the compression value, generates the PDF, then adds it back to the entry as an electronic file.

I have another process that does something similar and deletes the pages after the PDF is rendered so all that remains is the revised PDF.

The following is the code I wrote for the PDF generation. Deleting the pages is easy to add if all you want is a PDF.

        protected override void Execute()
        {
            // Initialize totals for image size and text size, average bytes per page, and compression quality
            long imageSize = 0;
            long textSize = 0;
            long bytesPerPage = 0;
            int compression = 0;

            // Get and lock source document
            DocumentInfo doc = (DocumentInfo)this.BoundEntryInfo;
            doc.Lock(LockType.Exclusive);

            // Try to generate and attach PDF
            try{
                // Retrieve page count
                int pageCount = doc.PageCount;

                // Read values of each page
                PageInfoReader pageReader = doc.GetPageInfos();
                foreach (PageInfo page in pageReader){
                    // Track cumulative page and text sizes
                    // to determine what level of compression to apply
                    imageSize += page.ImageDataSize;
                    textSize += page.TextDataSize;
                }

                // Get average bytes per page
                bytesPerPage = pageCount > 0 ? imageSize / pageCount : 0; // avoid dividing by zero when there are no pages

                // Initialize compression variables
                int minSize = 62500;
                int midSize = 250000;
                int maxSize = 1000000;

                // Evaluate average page size
                if (bytesPerPage < minSize){
                    // default for low-quality images
                    compression = 90;
                }
                else {
                    // low-to-medium quality images
                    if (bytesPerPage < midSize){
                        compression = 75;
                    }
                    else {
                        // medium-to-high quality images
                        if (bytesPerPage < maxSize){
                            compression = 50;
                        }
                        // high-quality images
                        else {
                            compression = 25;
                        }
                    }
                }

                // Build a list of watermarks and add tag watermarks associated with the document
                List<WatermarkSpecification> watermarks = new List<WatermarkSpecification>();
                foreach(TagWatermark w in doc.GetTagWatermarks()){
                    watermarks.Add(new WatermarkSpecification(w.WatermarkText,w.WatermarkTextSize,w.WatermarkRotation,w.WatermarkPosition,String.Empty,0,String.Empty,0,w.WatermarkIntensity));
                }

                // Initialize document exporter
                DocumentExporter dExp = new DocumentExporter();

                // Configure document exporter settings
                dExp.CompressionQuality = compression;
                dExp.IncludeAnnotations = true;
                dExp.BlackoutRedactions = true;
                dExp.PageFormat = DocumentPageFormat.Jpeg;
                dExp.Watermarks = watermarks;

                // Set PDF export options to flatten annotations and include searchable text
                PdfExportOptions ExportOptions = PdfExportOptions.RenderAnnotationsAsImage | PdfExportOptions.IncludeText;

                // Initialize memory stream and export PDF
                MemoryStream ms = new MemoryStream();
                dExp.ExportPdf(doc, doc.AllPages, ExportOptions, ms);

                // Write PDF to document
                using(Stream eDocStream = doc.WriteEdoc("application/pdf",ms.Length)){
                    ms.WriteTo(eDocStream); // copy the buffer once instead of calling ToArray() repeatedly
                }

                // Update document extension
                doc.Extension = ".pdf";

                // Save changes
                doc.Save();
            }

            // Ensure tokens are always updated and document is always unlocked
            finally{
                // Confirm eDoc attached and get eDoc Size
                SetTokenValue("eDoc",doc.IsElectronicDocument);
                SetTokenValue("eDocSize",doc.ElecDocumentSize);
                SetTokenValue("imageSize",imageSize);
                SetTokenValue("textSize",textSize);
                SetTokenValue("Quality",compression);

                // Release document
                doc.Unlock();
                doc.Dispose();
            }
        }

The script outputs several token values, which I use to determine whether or not the process succeeds, and to track how much compression was applied.

Also note that a lower number means heavier compression, because the value is the compression quality percentage.
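For anyone who wants to test the threshold logic outside of Workflow, here is a minimal standalone sketch of just the tier selection, using the same thresholds and quality values as the script above. It does not touch the Laserfiche SDK. The class and method names (`CompressionTiers`, `QualityFor`) and the zero-page guard are assumptions added for illustration.

```csharp
using System;

public static class CompressionTiers
{
    // Thresholds in average image bytes per page, as used in the script.
    private const long MinSize = 62500;
    private const long MidSize = 250000;
    private const long MaxSize = 1000000;

    // Returns the quality percentage to assign to CompressionQuality.
    // A lower number means heavier compression.
    public static int QualityFor(long totalImageBytes, int pageCount)
    {
        // Assumed guard: treat a document with no pages as the smallest tier.
        if (pageCount <= 0) return 90;

        long bytesPerPage = totalImageBytes / pageCount;

        if (bytesPerPage < MinSize) return 90; // small page images: compress lightly
        if (bytesPerPage < MidSize) return 75;
        if (bytesPerPage < MaxSize) return 50;
        return 25;                             // large page images: compress heavily
    }

    public static void Main()
    {
        // Hypothetical document: 10 pages totalling 4,000,000 bytes of image data,
        // which averages 400,000 bytes per page.
        Console.WriteLine(QualityFor(4000000, 10)); // prints 50
    }
}
```

Flattening the nested if/else into early returns makes it easier to add or adjust tiers, and the function can be exercised with sample sizes before the values are wired into a script.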

replied on May 3, 2019

Jason, I'd like to borrow that SDK script, but I can't figure out how to get it to work. This is the very first SDK script I've used with a Laserfiche workflow. It produces a bunch of syntax errors, and it is not as easy as copying and pasting into the SDK Script activity. Any pointers for getting it to work?

EDIT: I forgot to switch it from VB to C#, but I still have an error. Thoughts?

Capture.JPG (46.31 KB)

replied on May 3, 2019

Looks like you have some extra brackets based on the screenshot. If you're copying and pasting into something with existing code, only copy lines 3-104.

replied on May 3, 2019

Shoot, deleting one of those brackets created another error, lol

replied on May 3, 2019

I don't think you have extras at the end; you have more opening brackets at the top than closing brackets at the bottom. It's really hard to say exactly since I can't see the entire code block.

replied on May 3, 2019

Alright, yes, I am a beginner, so thanks for your patience! When I opened up the script, I pasted the code where the highlighting starts.

Capture.JPG (90.27 KB)

replied on May 3, 2019

If that's where you pasted it, then you should have four closing brackets in total at the end:

1 - closes the finally block

2 - closes the Execute method

3 - closes the class

4 - closes the namespace

 

You might also want to indent the pasted code so that it shows the correct hierarchy.
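As a rough picture of that hierarchy, here is a stripped-down skeleton with the four closing brackets labeled. The namespace and class names are hypothetical placeholders, and the real Workflow script class derives from a Laserfiche base class with Execute declared as an override, which is left out here so the sketch compiles on its own.

```csharp
namespace ExampleScript                  // hypothetical namespace
{
    using System;

    public class Script                  // hypothetical class name
    {
        public static void Main()
        {
            new Script().Execute();
        }

        protected void Execute()
        {
            try
            {
                Console.WriteLine("work goes here");
            }
            finally
            {
                Console.WriteLine("cleanup goes here");
            }                            // 1 - closes the finally block
        }                                // 2 - closes the Execute method
    }                                    // 3 - closes the class
}                                        // 4 - closes the namespace
```

With each level indented like this, a missing or extra bracket usually shows up as a closing bracket that does not line up with the statement that opened it.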

 

You'll also need to add:

  • A reference to Laserfiche.DocumentServices
  • using Laserfiche.DocumentServices;
  • using System.IO;

replied on May 3, 2019

Bingo,  no errors :) Time for testing! Thank you very much!
