Question

I have a customer with a 25-user public portal license for WebLink. The WebLink site is set up to allow public access to a repository using auto-login with a generic Laserfiche WEBUSER account. In Laserfiche we've set the default timeout to 20 minutes, and in the web.config file for the site we've set the session state timeout to 5 minutes.

The issue is that some of these sessions never expire. Some are over 3 days old, and the customer has to manually monitor and kill the old sessions in order to keep public access open. I've seen the posts about setting the session state timeout, which we've done, and about making sure the user's browser accepts cookies or that users log off, which we can't enforce because the site is public. They are running Laserfiche 8.3.2 and WebLink 8.2.1 on a Windows Server 2008 R2 box with IIS 7.5.

Is there someplace on the server I am missing where I can force the timeout?
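For reference, this is roughly what we have in the site's web.config (a minimal sketch only; the 5-minute value is the one described above, and the surrounding configuration in the actual file may differ):

    <system.web>
        <!-- ASP.NET session state: idle sessions should be abandoned after 5 minutes -->
        <sessionState timeout="5" />
    </system.web>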
Replies
Hi Beau,
Have you taken a look at this presentation from the conference, which includes information on troubleshooting timeout issues in WebLink?
If you're still having issues with sessions never expiring after looking at that, please contact Support.
While bots can cause issues with sessions, the situation Beau described is more likely caused by the KeepAlive timer in the WebLink document viewer, kept running by real users leaving a document open, than by bots. Both handling the bots and preventing the KeepAlive timer from auto-refreshing sessions are covered briefly in the presentation Kelsey linked.
The previous Answers thread that Alexander linked to contains several old forum posts with more information about bots, including a case where bots can hit the KeepAlive timer and cause problems. However, since there does not appear to be an excessive number of sessions being consumed, bots are the less likely cause here.
The KeepAliveTimer.Stop() call was previously added to the DocView.aspx page by another technician and did not resolve the issue. I will look into the bot possibility, as I am seeing activity from bots that are not in the web.config.
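For context, that kind of change would look roughly like the following (a sketch only; KeepAliveTimer is assumed to be the client-side timer object defined by the WebLink document viewer scripts, and the exact code that was added may have differed):

    <script type="text/javascript">
        // Assumption: this runs after the WebLink scripts that create KeepAliveTimer have loaded.
        // Stopping the timer means an open document no longer refreshes the session indefinitely.
        if (typeof KeepAliveTimer !== "undefined" && KeepAliveTimer.Stop) {
            KeepAliveTimer.Stop();
        }
    </script>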
I've added the appropriate entries to the web.config for these bots (see the sketch after this list) and will monitor to see whether they were the issue:
- AhrefsBot
- MJ12bot
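For reference, a minimal sketch of what those entries might look like, assuming the same <browserCaps>/<filter>/<case> format as the snippet quoted below and assuming the section sits under <system.web> (the exact placement in your web.config may differ):

    <system.web>
        <browserCaps>
            <filter>
                <!-- Treat these specific bots as crawlers so they don't hold normal sessions open -->
                <case match="AhrefsBot">
                    crawler=true
                    tagwriter=System.Web.UI.HtmlTextWriter
                </case>
                <case match="MJ12bot">
                    crawler=true
                    tagwriter=System.Web.UI.HtmlTextWriter
                </case>
            </filter>
        </browserCaps>
    </system.web>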
While this may resolve the current issue, the customer is looking for a longer-term solution so they are not constantly having to update this list. I saw a previous post in which a Laserfiche engineer suggested putting this in the web.config but advised it was untested:

    <case match="(bot|spider)">
        crawler=true
        tagwriter=System.Web.UI.HtmlTextWriter
    </case>
Has this been tested and/or is there a generic case we can put in that will encompass a large majority of the bots?
That rule has been tested in the sense that it won't break WebLink. But there is no standard for bot user-agent strings, so there is no pattern you can put there that will catch every bot that will ever exist. That said, from the data I have, it is a good start, though not complete.
In one week last month, webdemo.laserfiche.com saw 140 distinct user-agent strings. Filtering on "bot|spider|crawl" removes most of what look like bots. The two big ones it misses are Wada.vn and Ezooms, so you'll need separate blocks for those. With just those three rules in place ("bot|spider|crawl", "Wada", and "Ezooms"), there are only a handful of strings left that look suspicious.
For instance, there were 12 requests using the string "ZmEu" that appear to be hacking attempts (an untargeted attack against a PHP application). Making a rule for that isn't worth the time; it will be a different string next time.
It might be worth starting a separate question where we can crowdsource new filter strings as people start to see them.
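Putting those three rules together, the filter section might look something like this (a sketch only; "Wada" and "Ezooms" are matched by name, and the placement of <browserCaps> under <system.web> is assumed to follow the snippet quoted above):

    <browserCaps>
        <filter>
            <!-- Generic catch-all for user agents that identify themselves as bots, spiders, or crawlers -->
            <case match="(bot|spider|crawl)">
                crawler=true
                tagwriter=System.Web.UI.HtmlTextWriter
            </case>
            <!-- Bots observed in the logs that don't match the generic pattern -->
            <case match="Wada">
                crawler=true
                tagwriter=System.Web.UI.HtmlTextWriter
            </case>
            <case match="Ezooms">
                crawler=true
                tagwriter=System.Web.UI.HtmlTextWriter
            </case>
        </filter>
    </browserCaps>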
That works for me. I didn't expect anything to be 100%, but anything that can help is always appreciated. I've added it to my customer's web.config, and hopefully this will reduce the maintenance. Thanks for the help!
I'll also add that "TestSession.aspx" and "SessionKeepAlive.aspx" entries appear in the JavaScript that WebLink generates. I have been able to bring down a WebLink site in seconds by hammering those two HTTP handlers with cookies disabled.
Years ago we ran into an issue where GoogleBot found this HTTP handler and kept refreshing the SessionKeepAlive.aspx URL.
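One partial mitigation (my suggestion, not something covered above) is to tell well-behaved crawlers to stay away from those handlers via robots.txt. The /WebLink8/ path below is only a placeholder for whatever the site's actual virtual directory is, and this will not stop deliberate abuse with cookies disabled:

    User-agent: *
    Disallow: /WebLink8/SessionKeepAlive.aspx
    Disallow: /WebLink8/TestSession.aspx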
This issue is related to bots hitting the site. Please refer to this previous answers thread for ways to address the matter.