Question

Wait condition met but workflow does not move to next step

asked on March 7, 2014

This has been happening to me for years now with a small percentage of a particular workflow's instances.  I worked with support a few months ago (we are a VAR) and after much haggling, they advised us to upgrade from 8.3 to 9. 

 

As an aside: the experience with support was nothing short of unpleasant, but that is not the topic of this thread.  Suffice it to say, I have sought help from Laserfiche Support with this in the past and would now like to pursue answers from the community.  (Not that I would not welcome answers from Laserfiche support personnel or any other Laserfiche representatives; I am just hoping to receive better results from this forum.)

 

Upgrading to 9 seemed to resolve the issue for a while, but I'm starting to see it again.  Affected workflow instances show that the wait condition has been met, but the entry does not move.  This happens only about 3% of the time.

 

 

Any input would be appreciated.

Replies

replied on March 28, 2014

John, thank you for sending a copy of your database to Tech Support. We were able to narrow down your issue. Tech Support will be emailing the same information to you shortly, but I am posting it here anyway since this thread has been getting so much attention.

 

The problem seems to happen when the email or export entry tasks can't notify their corresponding activities that they have completed. This seems to be an issue with how Windows Workflow Foundation handles tasks with longer names. Since it is not in our code, confirming our assumptions and implementing a fix is taking a bit longer. We are still investigating the best way to handle the issue.

 

We won't be able to get the stuck instance to move on, since there is no way to signal to the instance that the task is completed from outside of the task itself. However, you can side-step the issue for now in future instances by turning off the option to run these 2 activities as tasks. To do so, open the WF Admin Console, go to the Server Configuration node and open the Advanced Server Options dialog. In the Activity Performance tab, uncheck the boxes labeled "Email" and "Export Entry".

replied on March 7, 2014

Sure:

This is a requisition and purchasing workflow, so the entry flows from the initiator to their department director for approval.  We have a list field 'DirectorApproval' with options for Approved, Denied, Pending, and Not Required.  I have a wait condition on the routing activity for the Dept Director that waits for DirectorApproval to equal Approved or Denied.

 

Every once in a while the Director will mark the field Approved, and Workflow apparently recognizes the condition is met - according to the Conditions tab in the Workflow details - but does not move on from the Routing Activity to the next activity.  It's as if it's still waiting, even though the condition clearly shows it has been met.

 

 

replied on March 7, 2014

What is the condition being evaluated on? The routed entry?

replied on March 7, 2014

I have Input Entry selected in the 'Wait until these conditions are satisfied for' field under Wait Conditions tab.  The routed entry and input entry should be one and the same I believe - we're only passing around this requisition to get various approvals.

 

Thanks for any help you can offer, Kenneth

replied on March 7, 2014

Because this is a production environment and the issue has been ongoing for some time, management has asked me to terminate and restart these instances as soon as I discover them.

I manually terminated and restarted the workflows to keep them operating.

 

That in itself should be telling:  I built the workflow so that it checks the various approval fields before routing to the user for approval.  This way, if a field is already Approved, Denied, or Not Required, Workflow bypasses routing to them.  I am able to restart these workflows without altering any fields, and in the subsequent workflow instances the field is recognized as being Approved, Denied, or Not Required and the entry is routed accordingly.

 

 

replied on March 7, 2014

It's pretty hard to tell what is going on without looking at the log files or knowing which version we're talking about. Each subsequent screenshot gives a bit more information, but not enough. The last one shows the routing activity was canceled, which usually indicates a deadline or escalation activity, but the main activity that got canceled is not visible.

 

Please open a case with tech support and attach the workflow logs. You can request escalation if your case is not handled in a timely manner.

replied on March 7, 2014

We have also had issues for years where simple workflows should 'move document(s) to a folder'.  However, they did not move the document every single time.  If we had 20 docs to move at once, it would move 18.  Or it could be just one or two documents and it would still not move them.  Just as you have done, we look at Workflow and it shows they all completed.  So every 3-4 months staff has to search on their 'trigger field' values and check the 'location' of each document.  If it did not move, they type a space in a field to retrigger it.

 

I have also asked for help on this for years.  I have asked whether a workflow could include a confirmation step, so that when you tell it to move an entry the system verifies that it did, and if not it notifies me.  So if you find out why, it would be great to hear what may be causing it.  I would look for similarities with my workflows.  These are critical moves in many situations and we need better accuracy.

replied on March 8, 2014

I am keen to see the outcome of this thread. I also have a customer with exactly the same issue, which comes up time and again and persists across WF upgrades. Since it is a very busy system, with up to 15,000 running/waiting instances at any given time and between 15,000 and 20,000 new instances triggered per day, this issue tends to come up often, albeit at random.

 

Usually only spotted once users complain that files are not moving after updating them.

 

I am actually troubleshooting a few issues in an open case at the moment and I can only come up with some wild theories at this point as to why WF may be behaving this way.

 

I'll be keeping an eye on this thread and will be sure to share if I discover anything concrete. ;-)

replied on March 9, 2014

Sheldon, with that many running/waiting instances, have you looked at the feasibility of ending that workflow, starting a new one, and incorporating what was in the wait condition into your starting rule for the new workflow? I just thought that would be more efficient and would not keep so many instances open at a given point in time.

replied on March 30, 2014

Hi

 

It is not random. In my case it can be triggered by updating a running instance that is waiting and sending emails. See the details of my case elsewhere in this thread.

replied on March 11, 2014

Also, for everyone's information, I have opened a case with Support and will report any significant findings.

 

Thanks for everyone's input and interest in this!

replied on March 18, 2014

Support did a remote session to our Workflow server.  After poking around a good bit, they recommended we upgrade to the latest version.  We were on 9.1.0 and have upgraded to 9.1.1.486.  The upgrade was just done, so I will keep this thread updated on whether the issue recurs.

replied on March 24, 2014

Thinking I have made significant progress towards discovering the cause - and possibly an actual resolution to this issue that has plagued me for so long, I reported my above findings immediately to Support. 

 

Disappointingly, I have yet to receive a meaningful response. 

replied on March 30, 2014

I had a similar problem before: the WF hung in a waiting state.

I think you were using "Route Entry to" and it was waiting, and then you updated the WF and also updated the running instance.

Check your WF admin console Error Log: Current Error Log, and look for this error: Object reference not set to an instance of an object.

if the error is there, then take a look at this page:

https://support.laserfiche.com/ForumsFrames.aspx?Link=viewtopic.php%3ft%3d19945%26amp

 

These are the steps to recreate the problem:

 

1- Publish a WF with Route Entry to Folder that waits on a condition (for example, <tag> is set) and is set to Send Email every 5 minutes, for 60 emails - the idea is to have "Route Entry to Folder" in a waiting state while it is sending emails.

2- Run it on entry

Definition: Version 1
Designer : Version 1 - Waiting
 

Diagnostic: Version 1 - False - Next Timer: (Date/Time + 5 minutes)

2- Wait for the first email to be sent so you have 5 minutes to do the next step

3- Publish: Overwrite: Yes - Update running WF: Yes

Definition: Version 2
Designer : Version 1 - Waiting
Diagnostic: Version 2 - True - Next Timer: (Date/Time + 5 minutes)

4- Wait for the next email, then notice values changed

Definition: Version 2
Designer : Version 2 - Waiting
Diagnostic: Version 1 - False - Next Timer: (Date/Time + 5 minutes)

5- When the Next Timer fires, the WF will get stuck in Running with these values

Definition: Version 2
Designer : Version 2 - Running
Diagnostic: Version 1 - False - Next Timer: Empty

6- An error appears in the Activity log, but the last modified date remains the same

7- Set the tag to true to end the wait and you will see an activity (wait for entry change) highlighted in blue in the instance details, but the conditions tab shows the conditions for it have evaluated successfully.

 

replied on March 31, 2014

This is a known issue. I've filed it in our bug tracker and there is no resolution yet. It only happens when intermediate versions between the running instance's version and the current version are deleted. It has nothing to do with the issue that started this thread.

replied on March 31, 2014

Hi

 

In the above 7 steps to reproduce the problem, I didn't delete any version.

 

Step 1 is a brand-new WF, and up to step 7 I didn't delete any version.

 

 

replied on April 8, 2014

Hi

 

In the above 7 steps to reproduce the problem, I didn't delete any version.

Step 1 is a brand-new WF, and up to step 7 I didn't delete any version.

 

---------------------------------------------------

 

Update:

 

I want to note that when I updated the running (waiting) workflow, I saw this message: “Successfully update workflow “Test” to version 2 …………”
 

I used your recommendation in this thread: “ you can side-step the issue for now in future instances by turning off the option to run these 2 activities as tasks. To do so, open the WF Admin Console, go to the Server Configuration node and open the Advanced Server Options dialog. In the Activity Performance tab, uncheck the boxes labeled "Email" and "Export Entry".”

 

The result: the WF didn’t get stuck when I followed my 7 steps – in fact, I can’t reach step 5.

 

However:

 

Step 4:

Definition: Version 2
Designer : Version 2 - Waiting
Diagnostic: Version 1 - False - Next Timer: (Date/Time + 5 minutes)

 

So I don’t think the logic of the running instance changed; it still shows version 1.

replied on March 31, 2014

Ali,

That is quite interesting.  What had you done to work around this problem in your environment?

replied on March 31, 2014

 

Create WFs that are as small as possible, with each WF calling another. That way it is unlikely that the one whose logic needs updating is running.

Never update a running WF instance. If users ask for a change in logic, it applies only to future WF instances, which is a little confusing for the users and for me as well.

 

I am interested to know whether your WF admin console Error Log (Current Error Log) shows this error: Object reference not set to an instance of an object. If it does, we both have the same problem.

 

Note that I didn't delete any version from Definition History for the above 7 steps.

 

If you delete versions you may trigger the bug below, although I think it had been fixed the last time I checked.

 

https://support.laserfiche.com/ForumsFrames.aspx?Link=viewtopic.php%3ft%3d19972%26amp%3bhighlight%3dorphaned%26amp

 

 

 

replied on March 31, 2014

 

Hi.

 

New Update:

 

I am using the approach from this thread to find out that a WF is stuck before users inform me:

 

https://support.laserfiche.com/ForumsFrames.aspx?Link=viewtopic.php%3ft%3d19951%26amp%3bhighlight%3d%26amp

 

I am monitoring event ID 100 in the LFWorkflow Windows event log.
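
For anyone who wants to script that check, here is a minimal sketch in Python that shells out to the standard Windows `wevtutil` tool. The log name and event ID are taken from the post above; what event 100 actually signifies, the function names, and the default count of 20 are assumptions for illustration and are not confirmed by Laserfiche.

```python
import subprocess
from typing import List

LOG_NAME = "LFWorkflow"  # event log name as given in the post above
EVENT_ID = 100           # event ID as given in the post above


def build_query_command(log_name: str = LOG_NAME, event_id: int = EVENT_ID,
                        count: int = 20) -> List[str]:
    """Build a wevtutil command that lists the newest matching events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",   # XPath filter on the event ID
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # read newest events first
        "/f:text",       # plain-text output
    ]


def recent_events() -> str:
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(build_query_command(), capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Running `recent_events()` from a scheduled task and emailing any new output would give roughly the same early warning; that wiring is left out here.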

 

Also, if you uncheck the box labeled "Email" in Advanced Server Options, you can't reproduce my 7 steps; you will not be able to reach step 5.

 

 

However:

 

Step 4:

Definition: Version 2
Designer : Version 2 - Waiting
Diagnostic: Version 1 - False - Next Timer: (Date/Time + 5 minutes)

 

So I don’t think the logic of the running instance changed; it still shows version 1.

I am still interested to know whether you have this error, "Object reference not set to an instance of an object", because for some reason, both on this Answers site and on the old Laserfiche Forums site, whenever I posted my 7 steps the answer I got was that I had deleted versions.

 

------------------------------------

 

My current workaround is:

 

Create WFs that are as small as possible, with each WF calling another. That way it is unlikely that the one whose logic needs updating is running.

Never update a running WF instance. If users ask for a change in logic, it applies only to future WF instances, which is a little confusing for the users and for me as well.

 

I am interested to know whether your WF admin console Error Log (Current Error Log plus the rolled-over logs) shows this error: Object reference not set to an instance of an object. If it does, we both have the same problem.

 

Note that I didn't delete any version from Definition History for the above 7 steps.

 

If you delete versions you may trigger the bug below, although I think it had been fixed the last time I checked.

 

https://support.laserfiche.com/ForumsFrames.aspx?Link=viewtopic.php%3ft%3d19972%26amp%3bhighlight%3dorphaned%26amp

replied on April 9, 2014

Ali, I understand your frustration, but, like John's original issue, this is not a problem that can be resolved here. Please have your reseller open a case with Tech Support. I'm locking this thread since the original question has been investigated and answered.

replied on March 11, 2014

Bonnie & Sheldon:  Thank you very much for validating that I'm not crazy or making a careless mistake somewhere!  As a technical-, analytical-, scientific-minded guy who believes that the computer universe is governed strictly by cause & effect rules, this is the kind of problem that keeps me up at night.  The more of us we can find, the more data we'll have to establish a pattern or some kind of link. 

 

John Geist:  Thanks for your input; I have not tried separating out the wait condition, but I can see how that would at least give better tracking info, and maybe even give me better results.  That may be the next thing I try.

 

Currently, I do have a 'Check' workflow, running on a 15-minute schedule from 7:00a - 6:00p, that goes through each folder where entries wait.  For each entry, if the appropriate wait condition is found to be satisfied, it sends me an email.  Not ideal, but at least I don't have to wait to hear it 'through the grapevine'.

 

I've noticed 2 problems with this 'Check' workflow - the first of which, I feel I can address; the second, I'm not so sure.

 

1.  If I don't address a given entry within 15 minutes of it being reported, the Check workflow will trigger another email on the next check.  I'm going to use a 'Stuck' tag in an attempt to prevent these repeats.

 

2.  If the Check WF happens to catch an entry where the wait condition has been met, but workflow hasn't had reasonable time to act, a false positive will be reported.  I was hoping to somehow check when the wait condition was met and compare that time to current time.  A difference of more than five minutes surely indicates the workflow is stuck, I figure. 
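
The logic for both fixes can be sketched outside of Workflow, here in Python. This is only an illustration: `WaitingEntry`, `condition_met_at`, and `has_stuck_tag` are hypothetical names, and it assumes the time the condition was met can be obtained somehow (for example, by stamping a date field when the approval field changes), which is exactly the open question in point 2.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Grace period from point 2: a condition met more than five minutes ago
# with no movement is treated as stuck.
GRACE_PERIOD = timedelta(minutes=5)


@dataclass
class WaitingEntry:
    """Hypothetical record for one entry sitting in a waiting folder."""
    name: str
    condition_met_at: Optional[datetime]  # None if the condition is not met
    has_stuck_tag: bool = False           # set once an alert has been sent


def should_alert(entry: WaitingEntry, now: datetime) -> bool:
    """Return True only for entries that look stuck and are not yet reported."""
    if entry.condition_met_at is None:
        return False  # still legitimately waiting
    if entry.has_stuck_tag:
        return False  # problem 1: already reported, so no repeat email
    # Problem 2: give Workflow reasonable time to act before reporting.
    return now - entry.condition_met_at > GRACE_PERIOD


def check(entries: List[WaitingEntry], now: datetime) -> List[str]:
    """Tag newly detected stuck entries and return their names."""
    stuck = []
    for entry in entries:
        if should_alert(entry, now):
            entry.has_stuck_tag = True
            stuck.append(entry.name)
    return stuck
```

Each scheduled run would call `check` once and send one email for the returned names; clearing the tag after the instance is restarted makes the entry eligible for reporting again.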

replied on March 7, 2014

EDIT: Disregard what I wrote previously. (now deleted)

 

What do you mean by the wait condition has been met but the entries have not moved? Can you give us a better idea of the actual process that is supposed to occur?

replied on March 7, 2014

This workflow is showing as terminated in your screenshot, so it's past that condition step or it terminated in that activity.

replied on March 9, 2014

My two cents... I have found that separating the wait condition out into its own activity, such as putting it right before the routing activity, seems to be more consistent and gives better visibility for tracking. Have you tried that by chance?

replied on March 19, 2014

Folks,

I think I'm on to something:

 

I noted yesterday that some instances show a status of Running in the Search Results pane, while the majority have Waiting status.  Those running instances seem to stay running for longer than any activity should run. So, I went to digging.

 

Those instances that were sitting with Running status showed an email activity that had been running for more than a few minutes - days in most cases.  See, my WFs are set up so that if they are not acted on by designated users, an email is sent to a manager every day, for 90 days.  Eventually that repeating email child activity - its parent activity is a Route to User activity - hangs and runs indefinitely.  So when the user does change their approval field, the wait condition is met, but the instance is still stuck on running that email activity.  See below, a normal instance's activities tab and a 'hung' instance's activities tab.

 

Normal...

 

 

Hanging....
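
The pattern described here (an email activity that stays in Running far longer than sending an email should take) could be checked for along these lines. This is a rough Python sketch over hypothetical rows copied from the activities tab; `ActivityRow`, its fields, and the 10-minute limit are assumptions, not anything Workflow exposes under these names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ActivityRow:
    """Hypothetical row copied from an instance's activities tab."""
    instance_id: int
    activity: str      # activity name, e.g. "Email"
    status: str        # "Running", "Completed", ...
    started: datetime


def find_hung(rows: List[ActivityRow], now: datetime,
              activity: str = "Email",
              limit: timedelta = timedelta(minutes=10)) -> List[int]:
    """Return IDs of instances whose given activity has run longer than limit."""
    hung = {
        row.instance_id
        for row in rows
        if row.activity == activity
        and row.status == "Running"
        and now - row.started > limit
    }
    return sorted(hung)
```

Filtering on the activity name matters because the parent routing activity is expected to stay in Running for days while it waits.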

replied on March 26, 2014

We are experiencing the same problems as those described. We have, however, 2 variations of this problem.

1. The Entry Condition is met and Workflow recognizes that the condition is met (as shown on the Conditions tab), but the workflow still does not progress. The only solution is to manually complete or terminate the workflow and restart it on the document. For this one we were also advised to upgrade, which we did a couple of weeks ago.

 

2. The Entry Condition is met and Workflow does not seem to recognize that the change in the entry has taken place. In this case you can find the event in the Subscriber, but the event does not seem to be passed on to the Workflow service. In some cases we found you can do one of the following to make Workflow aware of the change:

  a. Change the condition being evaluated back to a "false" state, save the document, and then change it back to a "true" state. The second time around, on some entries, Workflow will become aware of the condition being met and carry on.

  b. In some instances we found that if the entry is manually moved to a different folder, Workflow becomes aware of the change that should initially have satisfied the wait condition.

  c. In some instances neither of the 2 approaches mentioned above worked, and the only way to get the workflow back on track was to complete or terminate it and start it again on the document.

 

The second variation is still happening on 9.1.1.365, and just this morning we had 22 entries affected by this.

replied on March 28, 2014

It's most likely not the same issue as John's. His issue involves the Email activity and not wait conditions. Please open a support case.

replied on March 28, 2014

Hey Miruna,

To clarify, I do have the same symptoms that Vincent Kelly mentioned in #1.  This is often how I get notified of the issue - a user will change an approval field and expect the entry to move and it does not.  On viewing the instance in WF Designer, Conditions tab, I do see that even though a wait condition is shown to be met, the entry does not move.  It seems in my case that the situation in #1 is caused by the email activity hanging.

 

I do not have any issues like #2 though.

 

ALSO:  Thank you Vincent for the helpful input.

replied on March 28, 2014

Right, I didn't say it very well. In your case, from the other screenshots, it is the Email activity causing problems. The condition evaluation happens on the Subscriber, so it would get done anyway. However, because the Email task is stuck, the activity never knows it can move on.

 

In Vincent's case, we would need to look at the instances in question to see if it is the same problem or something different. Vincent's second problem would be entirely different because we would have to look at the Subscriber to see if the event ever came in to be evaluated or if the LFServer never sent the notification.

replied on March 28, 2014

Gotcha...the wait condition being met and the instance not moving is a symptom of the issue.  Vincent and I could very well have different causes though.

 

Am I getting it?


 

replied on March 28, 2014

Right, the wait condition being ignored in your case is a side-effect of the issue.

replied on March 28, 2014

Thanks very much for the post Miruna.  Your explanation, along with Support's is very helpful.  I feel you all are definitely on to the root of the problem.

 

I have made the above changes you all recommended and will monitor for more occurrences.

 

Thanks again!
