Question

Workflow to remove blank pages from a multipage TIFF

asked on January 16, 2014

Does anyone have a workflow that will remove blank pages from a multipage TIFF and reassemble the document without them? Quick Fields is not an option.

Answer

SELECTED ANSWER
replied on January 17, 2014

William,

 

Here is some code for an SDK script that deletes 'blank' pages from the workflow's starting entry.  NOTE: this script looks at each page's text and deletes the page if there is no text.  The obvious downside to this approach is that if the document has not been OCR'd, then every page in the document will be deleted!

 

A more robust solution, I think, would be to look at the image size and delete pages (images) that are smaller than a set threshold.

 

Later edit:  Checking the image size instead of the text size is only a single line of code, so I added it to the code below, commented out.  I arbitrarily set the limit at 3000 bytes...

        Protected Overrides Sub Execute()
            Try
                'Get the workflow starting entry as a document...
                Dim document As LFDocument = Me.Entry
                'Get the pages collection of that document...
                Dim docPages As LFDocumentPages = document.Pages

                'Since we will be stepping through the document and marking the 'blank'
                'pages, let's make sure all pages are unmarked first...
                docPages.UnmarkAllPages()

                'Now step through the document pages and look at the text of each page.
                'If the text has a length of 0 then mark the page for deletion...
                For Each page As LFPage In docPages

                    'To test the image size instead of the page text size, replace the
                    'TextSize check with ImageSize, i.e.  If page.ImageSize < 3000 Then
                    If page.TextSize = 0 Then
                        docPages.MarkPage(page)
                    End If

                Next

                'Lock the object in preparation to delete the pages...
                docPages.LockObject(Lock_Type.LOCK_TYPE_WRITE)
                'Delete the pages...
                docPages.DeleteMarkedPages()

                'Save the updated page object and unlock...
                docPages.Update()
                docPages.UnlockObject()

                'Cleanup...
                document = Nothing
                docPages = Nothing

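            Catch ex As Exception
                'Report the failure instead of silently swallowing it...
                Me.WorkflowApi.TrackError(ex.Message)

            End Try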
            
        End Sub
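
One way to soften the downside noted above (a document that was never OCR'd loses every page) is to refuse to delete anything when no page has text at all. The following is only a sketch of that guard as a standalone VB.NET module. It does not call the Laserfiche SDK; the module name, the function name, and the list of text lengths are hypothetical stand-ins for values the script would read from each page.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module BlankPageGuard

    ' textLengths(i) holds the text length of page i + 1 (hypothetical input;
    ' a real script would read these values from the document's pages).
    ' Returns the 1-based numbers of the pages that have no text, or an empty
    ' list when NO page has text, since that suggests the document was never
    ' OCR'd rather than being entirely blank.
    Public Function PagesSafeToDelete(textLengths As IList(Of Integer)) As List(Of Integer)
        Dim candidates As New List(Of Integer)()

        For i As Integer = 0 To textLengths.Count - 1
            If textLengths(i) = 0 Then
                candidates.Add(i + 1)
            End If
        Next

        ' Every page looks blank: treat this as "not OCR'd" and delete nothing.
        If candidates.Count = textLengths.Count Then
            Return New List(Of Integer)()
        End If

        Return candidates
    End Function

    Public Sub Main()
        ' Page 2 is the only page without text, so only page 2 is reported.
        Dim mixed As List(Of Integer) = PagesSafeToDelete(New Integer() {250, 0, 310})
        Console.WriteLine("Mixed document, pages to delete: " & mixed.Count)

        ' No page has text, so nothing is reported for deletion.
        Dim noOcr As List(Of Integer) = PagesSafeToDelete(New Integer() {0, 0, 0})
        Console.WriteLine("Un-OCR'd document, pages to delete: " & noOcr.Count)
    End Sub

End Module
```

In the script itself, the same check would amount to comparing the number of marked pages with the total page count before the delete call.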

 

replied on January 17, 2014

Thanks Cliff, this looks like just what the doctor ordered. Much appreciated.

 

Cheers,

Bill

Replies

replied on January 16, 2014

William,

 

Interesting question; if I were going to do this in Workflow, it would be in an SDK script.  The immediate issue is determining which pages are 'blank'.  My first thought is that if the document was OCR'd, I would look at the text object for each page and delete the page if the text is empty.  My second thought is to look at the image size (in bytes) and delete the page if the image is smaller than some threshold (perhaps 3K?).

 

In either case I would probably step through the document pages first and build a PageSet object of the pages to delete and then make a single call to the DocumentInfo.DeletePage(PageSet) method.
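
To make the idea concrete, here is a rough sketch of the "collect first, delete once" approach as a standalone VB.NET module. It deliberately avoids the Laserfiche SDK: PageInfo, HasNoText, IsSmallImage and CollectBlankPages are hypothetical names, and the returned list of page numbers only stands in for the PageSet that would be handed to the single delete call.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical stand-in for the per-page values a script would read from
' the repository; this is not a Laserfiche SDK type.
Public Class PageInfo
    Public PageNumber As Integer
    Public TextLength As Integer
    Public ImageBytes As Long

    Public Sub New(pageNumber As Integer, textLength As Integer, imageBytes As Long)
        Me.PageNumber = pageNumber
        Me.TextLength = textLength
        Me.ImageBytes = imageBytes
    End Sub
End Class

Public Module BlankPageSelection

    ' First idea: a page is blank when OCR produced no text for it.
    Public Function HasNoText(p As PageInfo) As Boolean
        Return p.TextLength = 0
    End Function

    ' Second idea: a page is blank when its image is under a byte threshold.
    Public Function IsSmallImage(p As PageInfo, minBytes As Long) As Boolean
        Return p.ImageBytes < minBytes
    End Function

    ' Step through the pages once and collect the page numbers to delete.
    ' Nothing is deleted here; the caller makes one delete call afterwards.
    Public Function CollectBlankPages(pages As IEnumerable(Of PageInfo),
                                      isBlank As Func(Of PageInfo, Boolean)) As List(Of Integer)
        Dim toDelete As New List(Of Integer)()
        For Each p As PageInfo In pages
            If isBlank(p) Then
                toDelete.Add(p.PageNumber)
            End If
        Next
        Return toDelete
    End Function

    Public Sub Main()
        Dim pages As New List(Of PageInfo)()
        pages.Add(New PageInfo(1, 120, 45000)) ' normal page with text
        pages.Add(New PageInfo(2, 0, 1800))    ' no text, tiny image
        pages.Add(New PageInfo(3, 0, 52000))   ' no text, large image (e.g. a photo)

        Dim byText As List(Of Integer) = CollectBlankPages(pages, AddressOf HasNoText)
        Dim bySize As List(Of Integer) =
            CollectBlankPages(pages, Function(p) IsSmallImage(p, 3000))

        For Each n As Integer In byText
            Console.WriteLine("No text on page " & n)
        Next
        For Each n As Integer In bySize
            Console.WriteLine("Small image on page " & n)
        Next
    End Sub

End Module
```

With this made-up data, the text test flags pages 2 and 3 while the size test flags only page 2, which shows how the two criteria can disagree on a page that holds an image but no recognizable text.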

 

If that would satisfy your needs then let me know and I can mock up some code snippets. 

 

(Then again, I might have totally over-thought your question and someone else can provide an easier way to accomplish this!)

replied on January 16, 2014

Hi Cliff,

You have it correct. I have been trying to do this within the bounds of the Workflow Designer, using retrieve doc text, etc.

I do not have experience with SDK scripts; however, I have some basic knowledge of .NET programming and would be able to follow code snippets.

 

Much appreciated if you could provide some code snippets.

 

Cheers,

Bill

replied on January 17, 2014

Workflow does not have any image processing capabilities. Like Cliff said, you could do it with a script, but a better tool for this type of job is Quick Fields.

replied on January 17, 2014

Thanks for the reply, Miruna; however, as I said in the question, Quick Fields is not an option.

 

Using Workflow, I was able to create a workflow that removed blank pages from a collection of single-page TIFFs. I just used retrieve doc text and checked whether the token was empty. However, I ran into problems with multipage TIFFs: I couldn't find a way to make Workflow look at each individual page.

replied on February 4, 2014

 

You can look at the individual pages of an entry using a Repeat activity with the condition: Page count (of the entry) is greater than or equal to %(Repeat iteration). The %(Repeat iteration) token is the current page number inside the Repeat loop. Make sure you start the iteration token at 1.
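
For anyone who finds it easier to read as code, the Repeat condition above behaves roughly like the loop below. This is only an illustration in plain VB.NET, not Workflow Designer configuration; the module and function names are made up, and startAt plays the role of the iteration token's starting value.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module RepeatLoopSketch

    ' Mimics a Repeat activity whose condition is
    ' "page count >= current iteration" and returns the page numbers
    ' the loop body would be asked to process.
    Public Function PagesVisited(pageCount As Integer, startAt As Integer) As List(Of Integer)
        Dim visited As New List(Of Integer)()
        Dim iteration As Integer = startAt

        While pageCount >= iteration
            visited.Add(iteration)
            iteration += 1
        End While

        Return visited
    End Function

    Public Sub Main()
        ' Starting at 1 visits pages 1, 2 and 3 of a three-page entry.
        For Each n As Integer In PagesVisited(3, 1)
            Console.WriteLine("Start at 1, page " & n)
        Next

        ' Starting at 0 would also ask for page 0, which does not exist.
        For Each n As Integer In PagesVisited(3, 0)
            Console.WriteLine("Start at 0, page " & n)
        Next
    End Sub

End Module
```

This is why the starting value matters: with the same condition, a start of 0 adds one extra pass for a page number that no entry has.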

 
