Capturing Non-Contiguous Data

replied on June 30, 2024

This is difficult to answer without seeing more of the document--and may be a difficult problem in general.

One idea is to use extra zones (each anchored to its respective field)... maybe lots of extra zones... placed in roughly the places that the label/value pairs can appear. You can use multiple sample documents or create a fake sample document that has lots of the label/value pairs. You'd then be relying on the proximity logic of anchors to snap to the nearest anchor when the label/value isn't in quite the right place. And you'd be relying on having lots of them to make sure you always have an anchor near to one and that you capture them all. However...
1) it may be difficult to guarantee that you've captured them all for every document
2) you'll need to use Workflow to concatenate the zones into a single token, then remove empty zone values (use a token function)
3) you'll need to use Workflow to remove duplicate values because multiple zones might capture the same data. To remove duplicates, you might be able to use a token function, but you might also need to use a For Each Value activity in combination with a Conditional Sequence activity and multiple Assign Token Value activities.
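
To make points 2 and 3 concrete, here is a rough Python sketch of the clean-up logic only. It is not Workflow syntax: in Workflow you'd build the same thing with token functions or the activities mentioned above. The function name `merge_zone_values`, the separator, and the sample lot numbers are all made up for illustration.

```python
def merge_zone_values(zone_values, separator="; "):
    """Join zone values into one string, skipping blanks and repeats.

    Hypothetical helper: mirrors the "concatenate, drop empties,
    drop duplicates" steps; the order of first appearance is kept.
    """
    seen = set()
    merged = []
    for raw in zone_values:
        # A zone that found nothing may come back empty or missing.
        value = (raw or "").strip()
        if not value:
            continue
        # Two overlapping zones may have captured the same text.
        if value in seen:
            continue
        seen.add(value)
        merged.append(value)
    return separator.join(merged)


# Made-up example: six zones, two empty, one duplicate.
zones = ["LOT-1001", "", "LOT-1002", "LOT-1001 ", None, "LOT-1003"]
print(merge_zone_values(zones))
```

Note that this treats two values as duplicates only if they match exactly after trimming whitespace, so two different line items that happen to share a value would also be collapsed. Whether that is acceptable depends on the document.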

Here's the gist of what I mean by creating a fake manifest with lots of the zones:
(but make sure they are all anchored!)

For the multiple page issue, you're going to do something similar: add multiple pages to the fake document, and on each page, configure a bunch of these zones.
(Hint: you can make a multipage document from several single page documents by importing the documents into your LF repository, merging them, exporting the merged document as a multipage TIFF, and then creating a sample document from the multipage TIFF in the Capture Profile Designer)



replied on July 1, 2024

So there's no way to set up the zones and then have them repeat in relative positions down the page?


replied on July 1, 2024

I'm not exactly sure what you're asking, but I think the answer is no. There's also currently no way to specify that a zone should be captured from more than one page.

That said, creating zones is fast and easy (anchoring even auto-renames them for you), and the capture profile will still run very fast even if you're capturing dozens of zones.

Also, feel free to tell us more about your use case (including providing sample documents) and we'll consider your feedback when we're planning additional features.


replied on July 1, 2024

Is there a non-public place where I can send that?


replied on July 1, 2024

Currently, in Laserfiche, I know that it is possible to pull data from individual fields as well as table-like areas with contiguous data. For example, in the sample below I have a table where I can capture all of the data in the individual columns.

This becomes difficult in two scenarios.

The first is when you have one document with multiple pages. I don’t know of a way to access information in tables that continue onto other pages, like the one below.

The next issue comes when the data that needs to be captured isn’t in a contiguous format such as a table. In the case below, I have an Item, Lot Number, Pallet #, and Qty that I need to capture, but the lines are not in a table. Instead, they’re separated out, and I can’t make it repeat the same capture fields an indeterminate number of times. They’re all there in the same relative positions, but I can’t make it repeat that set of fields as a lookup.

Then even so, I’d still run into the page break issue.

It may be easier if there were a simple way, like an icon or an option, to make them all one long page. That way I could capture the whole area and then use a workflow to parse out the values, but that’s still a rather difficult process to go through.
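
Just to illustrate what I mean by parsing the values out of one big captured block, here is a rough Python sketch. It is not Workflow or Laserfiche syntax, and it assumes the captured text comes out as `Item: ... Lot Number: ... Pallet #: ... Qty: ...` with single-word values, which may not match what OCR actually produces. The names `RECORD_PATTERN` and `parse_records` and the sample values are invented for the example.

```python
import re

# Assumed text layout (hypothetical): each record reads
# "Item: <x> Lot Number: <y> Pallet #: <z> Qty: <n>", possibly
# split across a line or page break.
RECORD_PATTERN = re.compile(
    r"Item:\s*(?P<item>\S+)\s+"
    r"Lot Number:\s*(?P<lot>\S+)\s+"
    r"Pallet #:\s*(?P<pallet>\S+)\s+"
    r"Qty:\s*(?P<qty>\d+)"
)


def parse_records(page_texts):
    """Treat all pages as one long text and pull out every record."""
    full_text = "\n".join(page_texts)
    return [
        {
            "item": m["item"],
            "lot": m["lot"],
            "pallet": m["pallet"],
            "qty": int(m["qty"]),
        }
        for m in RECORD_PATTERN.finditer(full_text)
    ]


# Made-up sample: the second record is split across a page break.
pages = [
    "Item: A100 Lot Number: L-01 Pallet #: P1 Qty: 40\n"
    "Item: B200 Lot Number: L-02",
    "Pallet #: P2 Qty: 15",
]
records = parse_records(pages)
for record in records:
    print(record)
```

Joining the pages before matching is what lets a record that straddles a page break still be picked up, which is the part I can’t do with zones today.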


replied on July 1, 2024

Thanks, that makes a lot of sense, but unfortunately I don't think there's an easy way to do what you want currently.

I still think my original idea might work, but it's definitely more of a workaround and has some extra complexity.

