We have discovered a problem with our backup process. When it runs on a repository server, it seems to cause a communication problem between the workflow/subscriber server and the repository server.
The repository server is not shut down during backup, but I am seeing workflows terminate with errors indicating network communication problems. Unfortunately, the catch clause in my workflow also fails to send me an email notification.
I am also finding that documents imported into the repository via Laserfiche Import Agent around the time the backup runs are being skipped by the workflow that is meant to monitor the ‘create’ event for the given folder.
While researching how Workflow works, so that I can get a better idea of how to improve our backup procedure, I read in the Workflow administration help documentation that ‘The Workflow Subscriber receives notification from a Laserfiche Server when a change is made to a Laserfiche entry’. Can someone confirm that it ‘receives’ the event notifications? If so, surely the Laserfiche repository server will know to reissue the notification once communication between the servers has resumed, so that no events are missed? Can someone shed some light on what might be causing the events to be lost?
We are planning to stop the Workflow service on our Workflow server and the Import Agent service on our Import Agent server before commencing backup. The thinking behind this approach is that stopping the Workflow service keeps workflows from running, so they won’t terminate due to communication problems between the servers, and stopping Import Agent keeps documents from being put in the repository during backup, so no events are mysteriously lost. Can someone advise whether this is the right approach, or suggest a better solution?
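To make the plan concrete, here is a rough sketch of the stop/start sequence we have in mind, written as a small Python wrapper around the Windows `sc.exe` command so it could be run before and after the backup job. The host names and service names below are placeholders I made up for illustration, not the real Laserfiche service names; the real ones would need to be looked up in the Services console on each server. The ordering (stop Import Agent first, start it last) is only my assumption about what would minimise missed events.

```python
import subprocess

# Hypothetical (host, service name) pairs. These are placeholders, not the
# real Laserfiche service names; look them up on each server before use.
# Listed in the order they should be stopped: Import Agent first, so no new
# documents arrive while the Workflow service is going down.
SERVICES = [
    ("IMPORT-SERVER", "ExampleImportAgentService"),
    ("WORKFLOW-SERVER", "ExampleWorkflowService"),
]


def sc_command(action, host, service):
    """Build an sc.exe command line targeting a service on a remote host."""
    return ["sc.exe", f"\\\\{host}", action, service]


def run_all(action, services, runner=subprocess.run):
    """Run the given sc action for each service in order; return the commands."""
    commands = [sc_command(action, host, service) for host, service in services]
    for cmd in commands:
        # check=True raises CalledProcessError if sc.exe reports a failure.
        # Note: "sc stop" only requests the stop; it does not wait for the
        # service to finish stopping.
        runner(cmd, check=True)
    return commands


def pre_backup(services=SERVICES, runner=subprocess.run):
    """Stop the services before the backup starts."""
    return run_all("stop", services, runner)


def post_backup(services=SERVICES, runner=subprocess.run):
    """Start the services again in reverse order once the backup is done."""
    return run_all("start", list(reversed(services)), runner)
```

Calling `pre_backup()` before the job and `post_backup()` after it would cover the whole backup window, assuming the account running the script has permission to control services on both servers.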
As background on our backup system, we use Veeam B&R to replicate our production servers to our DR site. As part of this replication, Veeam takes a VMware snapshot when the backup starts and deletes it when the backup finishes. VMs go into a stun state when the snapshot is removed; normally this stun, or freeze, lasts at most a couple of seconds. We believe this brief outage is causing the issue with Laserfiche Workflow, as the Workflow server loses connectivity to the Laserfiche repository.
FYI we are running Workflow 9.2 and Laserfiche 9.2.