Question

Feature Request - Metadata bulk update tool

asked on August 2, 2021

Hi All,

 

For customers who have had Laserfiche for a long time, document sets and repository sizes grow over time, as you'd expect. Some of our customers now have repositories over 20TB in size.

 

As part of cleanup exercises, some customers may wish to bulk update metadata; this is also a pretty common request we receive.

 

As we know, it's possible to automate this through Laserfiche Workflow, but there is a bottleneck of around 5,000-10,000 documents per hour. If you have to update millions of documents, that isn't really a feasible option, especially when you need to coordinate it with backup and recovery should anything go wrong.

 

A better option would be a tool to bulk update metadata fields directly within SQL. At the end of the day it's just data in a database, so this should be possible with the right query or tool.

 

Is this something which could be developed and put on the Solution Exchange?

 

Cheers!


Replies

replied on August 2, 2021

Hi Chris,

Could you share more details and an example of some of those scenarios where customers want to bulk update millions of entries?

replied on August 3, 2021

Hey Sam,

 

Sure, this is really very simple on the face of it.

 

Let's say a financial institution has a template for client information with three key fields: client name, document type, and document date.

 

Now let's say they are making tweaks or alterations to their document type list and want to bulk update certain values, say something simple like changing 'Invoice' to 'Invoices'. They might have 4 million invoices throughout the system that need that metadata updated. You could use Workflow: search the repository, then loop over each entry and update the metadata via Assign Field Values. However, as previously pointed out, the Workflow ceiling seems to be around 5-10k documents per hour, regardless of server and SQL spec. Even at 10k per hour, that simple change would take 400 hours to complete via Workflow.

 

This same query might take less than 1 hour to complete in SQL.
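
For illustration, a change like that could look something like the sketch below. To be clear, this is a hypothetical sketch rather than a supported approach: the table and column names (propdef, propval, prop_id, prop_name, str_val) are my assumptions about a typical field-value schema, not the documented Laserfiche schema, and you'd want a verified backup and vendor sign-off before touching the database directly.

-- Hypothetical schema: propdef holds field definitions, propval holds
-- field values keyed by prop_id. Real table names may differ by version.
BEGIN TRANSACTION;

UPDATE pv
SET    pv.str_val = 'Invoices'
FROM   propval AS pv
JOIN   propdef AS pd ON pd.prop_id = pv.prop_id
WHERE  pd.prop_name = 'Document Type'
  AND  pv.str_val = 'Invoice';

-- Sanity-check the affected row count before committing.
SELECT @@ROWCOUNT AS rows_updated;

COMMIT TRANSACTION;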

 

Cheers!

replied on January 29

I wanted to upvote this request too. We've been doing file imports that sometimes have non-real file extensions due to naming conventions or file exports from other systems (e.g. files with names like "cb=gapi(1).loaded_0", which Windows interprets as having an extension of "loaded_0").

Having already migrated in hundreds of thousands of documents, I would love to clean up the file extensions assigned to files in our LF repository. Narrowing a search by file type becomes very onerous when you have to tick every variation of Windows file types in the Type filter's checklist just to exclude the invalid extensions.
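
Purely as a sketch of what a sanctioned tool could do under the hood, this kind of cleanup might reduce to a single statement like the one below. The toc table and extension column are assumptions on my part about how entry names and extensions are stored; I don't know the actual repository schema, so treat this as illustrative only.

-- Hypothetical: assumes entries live in a toc table with an extension column.
UPDATE toc
SET    extension = ''               -- or a corrected extension
WHERE  extension IN ('loaded_0');   -- add other junk extensions as needed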

If this has already been done and I've missed it, please disregard my comments.
