Question

Determine whether document has image pages

asked on September 1, 2021

Is there a way in the repository's search capabilities to find documents with no image pages? If a document has no image pages but is OCRed (for example, importing a PDF and checking the box for generating text but unchecking the one to generate pages), it shows a page count without actually having image pages. I don't see a way to specifically search for ones with no image pages, just pages in general.

I was able to do it in the SDK by using di.GetThumbnails(di.AllPages).Read(), which returns true if the document has image pages and false if it doesn't (where di is the DocumentInfo), but I didn't see a way to determine this outside the SDK using either search or repository columns.

Answer

SELECTED ANSWER
replied on September 1, 2021

The Pages section of the search options might do it.

Or try the Img advanced search syntax.

replied on September 2, 2021

Thanks, Miruna

I do want to note that the search option does allow finding documents with no image pages by using "all pages with image size exactly 0KB", but that will not include documents with no pages at all.

Also, the search syntax {LF:Img=N} will return documents with text pages but no image pages, but it will not return documents with no pages at all. For that you would need {LF:Img=N}|{LF:PageCount=0}.
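For anyone building this search string in a script, here is a minimal Python sketch of the combination described above. It only assembles the text of the query; the helper name any_of is hypothetical, and how the string is submitted to the repository is outside the sketch.

```python
def any_of(*clauses: str) -> str:
    """Join advanced-search clauses with the | (OR) operator used in this thread.

    Hypothetical helper for illustration only; it does plain string joining
    and does not validate the clauses or contact a repository.
    """
    return "|".join(clauses)


# Clauses quoted from the reply above.
NO_IMAGE_PAGES = "{LF:Img=N}"          # text pages but no image pages
NO_PAGES_AT_ALL = "{LF:PageCount=0}"   # documents with no pages at all

query = any_of(NO_IMAGE_PAGES, NO_PAGES_AT_ALL)
print(query)
```

The resulting string can be pasted into the advanced search box as-is.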

Also, it doesn't seem like there's a column for this.

replied on September 2, 2021

Going in a slightly different direction with the advanced syntax, I'm looking for PDF files that have no image pages. Using Miruna's option above, I changed the logic to exclude documents with an image size greater than 0 KB. This will return PDFs both with and without text extracted.

({LF:Ext="pdf"}) - {LF:imagesize > 0}

Note that you could make this more general and match any document without images, regardless of whether it has a text record:

{LF:Name="*", Type="D"} - {LF:imagesize > 0}
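Both searches above follow the same pattern: take a base search and subtract everything with an image size greater than 0. A small Python sketch of that pattern, assuming you only need the query text (the helper name without_images is hypothetical, and nothing here talks to a repository):

```python
# Clause quoted from the searches above: documents whose image size is > 0.
HAS_IMAGES = "{LF:imagesize > 0}"


def without_images(base: str) -> str:
    """Return the base search minus documents that have image data.

    Hypothetical helper for illustration; it only formats the string
    using the " - " subtraction shown in the searches above.
    """
    return f"{base} - {HAS_IMAGES}"


# PDFs with no image pages, with or without extracted text.
pdf_query = without_images('({LF:Ext="pdf"})')

# Any document with no image pages, regardless of text.
all_docs_query = without_images('{LF:Name="*", Type="D"}')

print(pdf_query)
print(all_docs_query)
```

Swapping in a different base search (a folder, a template, a date range) keeps the "no images" part unchanged.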
