One thing to keep in mind is that no two documents will have the same entry ID. Entry IDs are assigned when a document is created and are unique within a repository.
A good way to start is a workflow that walks through a search of the documents, compares metadata, and routes suspected duplicates someplace for review.
As far as comparing pages: are the contents of the documents only rendered Forms submissions, or can more be attached? If it's only from Forms, you have a fairly consistent base for comparing pages. Documents with matching metadata but different page counts need further investigation, since one copy may be incomplete or have extra pages attached. Documents with matching metadata and the same page counts also need investigation, but of a different kind: those are the likely true duplicates, and you just need enough page-level comparison to confirm it. A sketch of that split is below.
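As a minimal sketch of that first split, assuming you've already flattened the field values and page counts you care about into a working view (all table and column names here are hypothetical, not the actual Laserfiche schema):

```sql
-- Hypothetical working view: one row per document, with the metadata
-- fields you care about plus a page count pulled from the repository.
-- Groups that share metadata but disagree on page count need a closer look.
SELECT form_id,
       submission_date,
       COUNT(*)        AS doc_count,
       MIN(page_count) AS min_pages,
       MAX(page_count) AS max_pages
FROM   doc_metadata
GROUP  BY form_id, submission_date
HAVING COUNT(*) > 1
   AND MIN(page_count) <> MAX(page_count);  -- flip <> to = for the same-length bucket
```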
Are the results of the Forms submissions stored in a database somewhere? If so, you can use that data to weed out duplicates and then get rid of the repository documents you don't need. Otherwise, you might have to OCR the documents, pick out differentiating values, and use those.
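If that Forms database exists, a first pass can be as simple as grouping on whatever business key should be unique per submission. Again, this is a hedged sketch; the table and column names are assumptions about your data, not a real Forms schema:

```sql
-- Hypothetical Forms-side table: each row is one submission, and
-- (account_no, submitted_on) is the business key that should be unique
-- if there were no duplicate submissions.
SELECT account_no,
       submitted_on,
       COUNT(*) AS copies
FROM   form_submissions
GROUP  BY account_no, submitted_on
HAVING COUNT(*) > 1;
```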
If you are allowed to query the Laserfiche repository database directly, you can run queries that look at the metadata and spot duplicates there as a first cut. Then you can save those entry IDs off into a different database, or even just a spreadsheet. I often have tables with just a single column of entry IDs. I populate them using Laserfiche searches or database queries, and then I have a list of entry IDs I can act on in Workflow.
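Here's what that single-column pattern looks like, populated from a duplicate query. The real repository schema stores field values in tall key/value rows, so in practice you'd pivot those into a flattened view first; the names below are illustrative:

```sql
-- Scratchpad pattern: one table, one column, just the entry IDs to act on.
CREATE TABLE dup_entries (entry_id INT PRIMARY KEY);

-- Within each group of documents sharing the same metadata, keep the
-- lowest entry ID (the original) and collect the rest as candidates.
-- repo_metadata is the same kind of hypothetical flattened view as above.
INSERT INTO dup_entries (entry_id)
SELECT entry_id
FROM (
    SELECT entry_id,
           ROW_NUMBER() OVER (PARTITION BY form_id, submission_date
                              ORDER BY entry_id) AS rn
    FROM   repo_metadata
) AS ranked
WHERE rn > 1;  -- everything after the first copy in each group
```

Keeping the lowest entry ID is just one reasonable tiebreaker, since lower IDs were created earlier; you could order by page count or modification date instead if that better defines the "keeper."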
You certainly don't need to hit the database to make this happen; you can do it all by collecting multi-value tokens within a workflow. I prefer not to, because I often want to explore the documents in several different ways and add to my "to delete/do operation x" list of entry IDs in several batches over time. That's much easier when the list lives in a separate database table I can write to. We have a scratchpad database that we use for just this kind of thing.
There's no catch-all method for finding duplicates. Just think of it as an iterative process, and use various criteria to whittle the list down one chunk at a time. I'm sorry some of this is vague and rambling; that's the way my process tends to be when I'm faced with these kinds of tasks. Let me know if anything needs clarifying.