You are viewing limited content. For full access, please sign in.

Discussion

Discussion

Workflow SQL Query Failing every hour on the hour (Potential VSS Locking)

posted on November 2, 2023 Show version history

I have several Workflows running that are processing thousands of rows of Data and it's not uncommon for them take several hours to complete.  Most of the time it works just fine running queries and inserting data into SQL tables but every hour around the 42 minute mark, every workflow that is running records this error:

"Unable to complete login process due to delay in opening server connection".

It seems like it only lasts a few seconds but when it recovers, I find that data wasn't written properly (or at all).  It's to the point where I don't feel like I can trust the data and will have to clear it all out and start from scratch.

Looking at event viewer, the only thing I see consistently around that time is VSS running and from what I've researched, VSS can lock down databases and prevent actions while it's backing up.  WF should just wait for the database to be unlocked and try the activity again but for some reason that doesn't seem to be happening and I can tell that because I have some views that try to report incomplete data sets and every time this error occurs, another series of rows get added to that view.

Has anyone else experienced this before?  I found a similar forum post from 2022 but there wasn't any resolution posted.  

I have an idea for a workaround involving Repeat loop and Try/Catch but I don't want to have to build it into every SQL activity for every Workflow going forward so I really need to fix the actual problem.

 

Screenshot 2023-11-02 161123.png
0 0
replied on November 2, 2023 Show version history

Grant, I noticed that the error messages are surfaced from "SQL Server Native Client 11.0" (aka SNAC), which is a legacy driver Microsoft is trying to kill off. See: https://learn.microsoft.com/en-us/sql/relational-databases/native-client/applications/support-policies-for-sql-server-native-client?view=sql-server-ver16

The SQL Server Native Client (often abbreviated SNAC) has been removed from SQL Server 2022 (16.x) and SQL Server Management Studio 19 (SSMS). The SQL Server Native Client (SQLNCLI or SQLNCLI11) and the legacy Microsoft OLE DB Provider for SQL Server (SQLOLEDB) are not recommended for new application development. Switch to the new Microsoft OLE DB Driver (MSOLEDBSQL) for SQL Server or the latest Microsoft ODBC Driver for SQL Server going forward.

You mention:

WF should just wait for the database to be unlocked and try the activity again but for some reason that doesn't seem to be happening

WF is just invoking the SNAC driver here, and it may be that SNAC has sub-par built-in error/retry handling for those possible VSS disruptions. Also, Workflow is waiting but not for an unlimited amount of time. That threshold is clearly being exceeded in these cases.

I'd suggest re-configuring the data source in question as an ODBC one so it uses the ODBC Driver for SQL Server that Microsoft recommends. See if the issue persists after that change and provide an update.

--------

Edit: Also consider adding it to the Workflow Task Error Handlers to make the queries retriable.

0 0
replied on November 3, 2023

Thanks Sam, I've changed the ODBC driver and restarted my Workflows so now I just have to wait for the lockdown to reoccur and see how WF handles it.

 

I will try the task error handlers next if this doesn't work.

 

Any ideas why the SQL server or database is locking down on routine?  It seems like VSS writer does lock down input/outputs when it's running but I've built hundreds of workflows that run for hours performing SQL activities for dozens of customers and I have never once seen this.  The only thing this customer has that others don't is massive .ldf log files (one is almost 400 GB in size).  I'm assuming that is causing VSS to run much slower than other customers but I'm not completely convinced this is the culprit

0 0
replied on November 3, 2023

I changed the driver to ODBC Driver 17 for SQL Server (this is the latest driver we have listed on this machine).  It does seem to have made Workflow faster at processing and inserting SQL data which is awesome but once the lock down occurred, all my workflows encountered the error at the same time and the data got skewed.

I went ahead and added Error code 08001 to Task Error Handler in WFA and set it to retry after 3 minutes.  I'm hoping this will work because no matter where I look I can't seem to find what is causing this lock up of SQL every hour.  There are no events in event viewer OR the SQL Server Logs around the same time WF is recording the error.

1 0
replied on November 3, 2023

A few thoughts:

0 0
You are not allowed to follow up in this post.

Sign in to reply to this post.