I've seen several posts on issues with Workflow service tasks in Forms. The issue we're having, especially since the 10.3 upgrade, is that some instances of a particular process aren't receiving a signal back from Workflow when it has finished running. Most instances of this process work without any issues. The workflow in question runs without errors: there is nothing in the event logs on the Workflow server, the Forms server, or the database server, and nothing under Failed Instance Completions in the Advanced Diagnostics node of the Workflow Administration Console. In other words, everything suggests the instance should continue without issues.
I have 3 major issues with how this is handled in Forms:
1) This results in a "silent failure". We have hundreds of instances that run daily and can't check each one to make sure it didn't silently fail, and each of these instances is the difference between paying on time and incurring late payment fees.
2) Rather than having these instances go into a state we can't recover from, we really need to be able to retry them, just like we can with suspended tasks. Otherwise, we potentially have to redo a significant amount of work. One form can take over an hour just to re-enter, let alone to re-process.
3) As far as I can see, there's no way to create a search/report to find the instances that are waiting for these workflows to return. The workflow service task that I mention here is in a large subprocess, so putting it in its own stage won't work without redesigning that part of the process. Without such a search, I can't monitor when an instance is stuck or has died because of this issue.