Tomcat Persistence Manager Kills Session Logins - apache

For my web app, I use Tomcat declarative security to tie login credentials to the company Active Directory. On two of our servers, logins were timing out after one minute of inactivity. On the other two servers, there is a thirty minute timeout (which is what I want).
Yesterday, I found the cause of the problem. The two servers with one minute timeouts have a Tomcat Persistence Manager enabled to write session information to disk. Our IT guy is out this week, so I don't know the exact details of what he was trying to accomplish with this, but he had set PersistentManager up like this in context.xml:
<Manager sessionIdLength="64" className="org.apache.catalina.session.PersistentManager"
maxIdleBackup="10" maxIdleSwap="30">
<Store className="org.apache.catalina.session.JDBCStore" dataSourceName="jdbc/Auth"
sessionTable="sessions" sessionAppCol="app_name" sessionDataCol="session_data" sessionIdCol="session_id"
sessionLastAccessedCol="last_access" sessionMaxInactiveCol="max_inactive" sessionValidCol="valid_session" />
</Manager>
I did some research and discovered that the Idle numbers are in seconds. Thinking that might be the culprit I changed the Manager portion to:
<Manager sessionIdLength="16" className="org.apache.catalina.session.PersistentManager"
maxIdleBackup="600" maxIdleSwap="3600" minIdleSwap="1800">
This fixed my problem. So it appears that forcing the Persistence Manager to write sessions out to disk after thirty seconds of inactivity was killing my session logins. I tracked the JSESSIONID cookie and found that the cookie remained the same even after the user is forced back to the login screen. It only changes when you re-login. This is what you would expect, because persisting the session to disk couldn't possibly change the session id. However, it does cause my declarative security model to force the user to log in again.
I did find in the manual that the maxIdleSwap variable not only controls persisting sessions to disk, but also causes the "passivating of the session out of server memory". This sounds a bit suspicious to me.
Does anyone have any experience with this issue? Why does the Persistence Manager kill my web app logins when it persists sessions to disk? Is there any way around this without changing the swap control variables like I did?
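To make the difference between the two configurations concrete, here is a minimal Python sketch of how the idle thresholds might classify a session. It is a simplified model based only on the manual's description (maxIdleBackup copies an idle session to the Store, maxIdleSwap passivates it out of memory), not Tomcat's actual code. The function name session_state is made up for illustration, and minIdleSwap is left out on the assumption that it only matters when there are too many active sessions.

```python
def session_state(idle_seconds, max_idle_backup, max_idle_swap):
    """Classify a session by idle time (simplified model, not Tomcat code).

    A threshold of -1 is treated as "feature disabled".
    """
    if max_idle_swap >= 0 and idle_seconds >= max_idle_swap:
        return "swapped out"   # written to the Store and removed from memory
    if max_idle_backup >= 0 and idle_seconds >= max_idle_backup:
        return "backed up"     # written to the Store, still in memory
    return "in memory"


# Original settings: maxIdleBackup=10, maxIdleSwap=30
print(session_state(45, 10, 30))      # swapped out
# Revised settings: maxIdleBackup=600, maxIdleSwap=3600
print(session_state(45, 600, 3600))   # in memory
```

Under this model, a user who pauses for 45 seconds has already been passivated with the original settings, but is still untouched in memory with the revised ones.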

Related

Unstable cluster with hosts.xml resets to default

While performing Management API operations, such as removing app servers from a MarkLogic cluster, the cluster becomes unstable and hosts.xml is reset to its default/localhost setting.
The logs show something like:
MarkLogic: Slow send xx.xx.34.113:57692-xx.xx.34.170:7999, 4.605 KB in 1.529 sec; check host xxxx
Whether or not the infrastructure is slow, automatic recovery does not happen.
How can we overcome this situation?
Can anyone provide more information on how the Management API works under the hood?
Adding further details:
DELETE http://${BootstrapHost}:8002${serveruri}
POST http://${BootstrapHost}:8002${forest}?state=detach
DELETE http://${BootstrapHost}:8002${forest}?replicas=delete&level=full
DELETE http://${BootstrapHost}:8001/admin/v1/host-config?remote-host=${hostname}
When removing servers (the 1st request) or removing the host (the 4th request), a few nodes in the cluster restart, and we check for node availability. However, the outcome is unpredictable, and sometimes hosts.xml is reset to a default XML that says the host is not part of any cluster.
To fix it, we copy hosts.xml from another host to the faulty host, and it starts working again.
We found that this was much less likely to happen in MarkLogic 8; with MarkLogic 9 the problem is frequent, and on AWS it is even more frequent.
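For anyone trying to reproduce this, the four requests can be written out as a small Python sketch that only builds the (method, URL) pairs in order and sends nothing. The function name removal_requests and the placeholder argument values are made up for illustration; the ports and paths are copied from the requests listed in the question.

```python
def removal_requests(bootstrap_host, server_uri, forest_uri, hostname):
    """Return the (method, URL) pairs from the question, in the order they are sent."""
    manage = f"http://{bootstrap_host}:8002"
    admin = f"http://{bootstrap_host}:8001"
    return [
        ("DELETE", f"{manage}{server_uri}"),                               # 1st: remove app server
        ("POST", f"{manage}{forest_uri}?state=detach"),                    # 2nd: detach forest
        ("DELETE", f"{manage}{forest_uri}?replicas=delete&level=full"),    # 3rd: delete forest
        ("DELETE", f"{admin}/admin/v1/host-config?remote-host={hostname}"),  # 4th: remove host
    ]


# Placeholder values only; substitute the real ${serveruri}, ${forest} and ${hostname}.
for method, url in removal_requests("BOOTSTRAP_HOST", "/SERVER_URI", "/FOREST_URI", "HOSTNAME"):
    print(method, url)
```

Keeping the sequence in one place makes it easier to add an availability check between the 1st and 4th requests, which are the two steps reported to trigger restarts.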

WSO2 login screen timeouts?

Back when we were running the regular Apereo CAS, there was a setting for login session timeouts, so that if someone went to the CAS login screen and just let it sit, the login session would time out after a certain period of time (5-10 minutes IIRC).
I was curious whether there is a similar configuration setting in WSO2, and if so, which parameter it is.
The reason I'm asking is because on Saturday we did our first round of incoming student registrations, and apparently the Admissions folks logged in all of the lab computers and got them to the login screen about an hour before the students went to use them, and no one could log in until they refreshed their browsers. So I'm expecting that there is a setting for that somehow, I'm just not sure which setting it would be. Just looking at the identity.xml file, there are quite a few configurable timeout settings, and I'm not sure if it's even one of these:
...../repository/conf/identity # cat identity.xml | grep -i timeout
<CleanUpTimeout>720</CleanUpTimeout>
<CleanUpTimeout>2</CleanUpTimeout>
<SessionIdleTimeout>720</SessionIdleTimeout>
<RememberMeTimeout>10080</RememberMeTimeout>
<AppInfoCacheTimeout>-1</AppInfoCacheTimeout>
<AuthorizationGrantCacheTimeout>-1</AuthorizationGrantCacheTimeout>
<SessionDataCacheTimeout>-1</SessionDataCacheTimeout>
<ClaimCacheTimeout>-1</ClaimCacheTimeout>
<PersistanceCacheTimeout>157680000</PersistanceCacheTimeout>
<SessionIndexCacheTimeout>157680000</SessionIndexCacheTimeout>
<ClientTimeout>10000</ClientTimeout>
<!--<Cache name="AppAuthFrameworkSessionContextCache" enable="false" timeout="1" capacity="5000"/>-->
<CacheTimeout>120</CacheTimeout>
The global configuration can be found in the <IS_HOME>/repository/conf/identity/identity.xml file under the <TimeConfig> element.
<TimeConfig>
<SessionIdleTimeout>15</SessionIdleTimeout>
<RememberMeTimeout>20160</RememberMeTimeout>
</TimeConfig>
More information can be found here.
Management console session timeout: open repository/conf/tomcat/carbon/WEB-INF/web.xml and increase the session-timeout value.
<session-config>
<session-timeout>240</session-timeout>
<cookie-config>
<secure>true</secure>
</cookie-config>
</session-config>
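Rather than grepping, the timeout elements can also be listed with their values by parsing the file. The following is a minimal Python sketch, assuming the file is well-formed XML; the function name timeout_settings and the <Server> wrapper in the sample are assumptions for illustration, while the element names inside <TimeConfig> are the ones shown above.

```python
import xml.etree.ElementTree as ET


def timeout_settings(xml_text):
    """Return (element name, text) for every element whose name ends in 'Timeout'."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        # Drop a '{namespace}' prefix if the file declares a default namespace.
        name = element.tag.rsplit("}", 1)[-1]
        if name.endswith("Timeout"):
            found.append((name, (element.text or "").strip()))
    return found


# Made-up sample; the real file would be read from identity.xml instead.
sample = """<Server>
  <TimeConfig>
    <SessionIdleTimeout>15</SessionIdleTimeout>
    <RememberMeTimeout>20160</RememberMeTimeout>
  </TimeConfig>
</Server>"""

for name, value in timeout_settings(sample):
    print(name, value)
```

Unlike grep, this skips commented-out entries, so only the settings that are actually in effect are listed.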

How to track down long running calls to IIS?

Our users are restless. They keep complaining about woolly, unmeasurable stuff, particularly slowness, without giving specifics, which of course makes it very difficult to track down.
Nonetheless, it is quite possible that they are right, that there are server calls that are taking way too long to come back. So I want to put some kind of sniffer on the web site (we're using ASP.NET MVC 4 on IIS7) that will log any call that takes more than n seconds to turn around, or that returns more than x megabytes of data, along with all request parameters, the response size, and maybe a certain amount of response data.
I haven't a clue how to do this, though. Any suggestions?
Here is my take on this:
FRT
While you can use failed request tracing to log slow requests, in my experience it is more useful for finding out why a request fails before it hits your application than why it's running slowly. 9 times out of 10 it will simply show you that the slowdown is somewhere in your code.
Log Parser
Yes, you can download and analyze IIS logs. I use Log Parser Lizard to do the analysis; it's a great GUI over Log Parser. Here's a sample of how you might query for requests that take longer than 1000 ms:
SELECT
To_String(To_timestamp(date, time), 'dd/MM/yyyy hh:mm:ss') As Time,
cs-uri-stem, cs-uri-query, cs-method, time-taken, cs-bytes, sc-status
FROM
'C:\inetpub\logs\LogFiles\W3SVC1\u_ex140721.log'
WHERE
time-taken > 1000
ORDER BY time-taken desc
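If Log Parser is not available, the same filter can be approximated with a short script. This is a minimal Python sketch, assuming the W3C extended log format with a "#Fields:" header line and with time-taken enabled in the logging fields; the function name slow_requests and the sample log lines are made up for illustration.

```python
def slow_requests(lines, threshold_ms=1000):
    """Yield one dict per logged request whose time-taken exceeds threshold_ms."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if int(row.get("time-taken", 0)) > threshold_ms:
            yield row


# Made-up sample lines; in practice, pass an open log file instead.
sample_log = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2014-07-21 10:00:01 GET /home 200 120",
    "2014-07-21 10:00:02 POST /reports/export 200 4821",
]

slowest_first = sorted(slow_requests(sample_log),
                       key=lambda r: int(r["time-taken"]), reverse=True)
for row in slowest_first:
    print(row["date"], row["time"], row["cs-uri-stem"], row["time-taken"])
```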
New Relic
My recommendation: go easy on yourself and sign up for a free trial. No, I don't work for them, but I've used their APM product a lot. Install the agent on the server and set it up. In 10 minutes you will be amazed at the data you see about the site. Trust me.
It's designed to work in production environments and gives you amazing depth of info on what's running slowly, down to the database query and stack traces. It's pure awesome. Once it's set up, wait for the next user complaint, log in, and look at the traces for that time frame.
When your pro trial ends, you can still get valuable data on the free tier, but it will only keep the last 24 hours. We purchased licenses: expensive, yes, but worth every cent. Why? The time taken to identify root causes was reduced by an order of magnitude; we can get proactive by looking at what is number 2, 3 and 4 on the slow requests list and working on those before they become big problems; and finally, the alerting makes us much more responsive when things go wrong.
Code it
You could roll your own. This blog uses MVC ActionFilters to do the logging. You could also use an HttpModule similar to this post. The nice thing about the HttpModule approach is that you can compile and implement the module separately from your application, and then just drop in the DLL and update web.config to wire up the module. I would be wary of these approaches for a very busy site. Also, getting the right level of detail to fully identify the root cause is challenging.
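The idea behind both approaches is the same: time each call and log only the ones that exceed a threshold. Here is a minimal, language-neutral sketch of that idea in Python rather than as an ASP.NET ActionFilter or HttpModule; the names log_if_slow and handle_request are made up for illustration.

```python
import functools
import time


def log_if_slow(threshold_seconds, log=print, clock=time.perf_counter):
    """Decorator: report calls that take longer than threshold_seconds."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = clock() - start
                if elapsed > threshold_seconds:
                    log(f"SLOW {func.__name__} args={args} kwargs={kwargs} "
                        f"took {elapsed:.3f}s")
        return wrapper
    return decorate


@log_if_slow(threshold_seconds=0.05)
def handle_request(delay):
    time.sleep(delay)   # stands in for real work
    return "done"


handle_request(0.0)   # under the threshold, nothing is logged
handle_request(0.1)   # over the threshold, one SLOW line is logged
```

Logging only the slow calls keeps the overhead and log volume down, which matters for the busy-site concern mentioned above.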
View Requests
As touched on by Appleman1234, IIS has a little-known feature for looking at the requests that are currently executing. It's handy for the 'hey, it's running slow right now' situation. You can use appcmd.exe or the IIS GUI to do it. You will need to install the 'Request Monitor' IIS feature for this to work. This approach is OK for rudimentary narrowing of the problem, but it does not show you what's running slowly in your controller.
There are various ways you can do this:
Failed Request Tracing (FRT), formerly known as Failed Request Event Buffering (FREB), with a custom failure condition such as a request taking longer than a certain time to load or run
Logging request information with IIS logging functionality and then using a tool like LogParserStudio
Using tools like Fiddler or IISMonitor on the IIS server to capture request information
For FRT, the official documentation is available here, and information on how to capture dumps for a long-running process is available here
For logging request information in IIS, information about log file analysis is located here
For information on configuring Fiddler to capture IIS requests, see here
A summary of the steps in the linked resources is provided below.
For FRT
From IIS Manager for a given site, in the Actions pane, under Configure, click Failed Request Tracing and enter the desired values in the dialog box to enable Failed Request Tracing.
From IIS Manager for a given site, under IIS, click Failed Request Tracing Rules in order to define the rules of failure for a given request. In the Actions pane, click Add and follow the wizard.
The logs will go in the directory you specify and are viewable in a web browser.
For IIS logging
Logging is enabled by default on IIS
From IIS Manager for a given site, under IIS, click Logging, and in the Actions pane, click Enable to enable logging if it isn't already enabled.
From IIS Manager for a given site, under IIS, click Logging, then configure as desired and click Apply.
Install LogParser, .NET 4.x and LogParserStudio (if you need additional steps, see here).
Open LogParserStudio and add logs to it, you then can use SQL queries to get information from the log files.
For Fiddler
You need to change the user that IIS runs as to a user that can launch applications, like Fiddler (instead of Network Service), and then launch Fiddler with that user.
Also see Monitor Activity on a Web Server (IIS 7) for further information.

What can cause IIS app pool to recycle?

I am currently experiencing some instability in my session variables and believe the app pool is where the error is coming from. What I cannot find is a list of possible culprits for the issue. What can cause the app pool to recycle on its own, other than a scheduled recycle?
Common reasons why your application pool may unexpectedly recycle
EDIT: Full Text in the event that the link goes 404:
If your application crashes, hangs and deadlocks it will cause/require the application pool to recycle in order to be resolved, but sometimes your application pool inexplicably recycles for no obvious reason. This is usually a configuration issue or due to the fact that you're performing file system operations in the application directory.
For the sake of elimination I thought I'd list the most common reasons.
Application pool settings
If you check the properties for the application pool you'll see a number of settings for recycling the application pool. In IIS6 they are:
Recycle worker processes (in minutes)
Recycle worker process (in requests)
Recycle worker processes at the following times
Maximum virtual memory
Maximum used memory
These settings should be pretty self explanatory, but if you want to read more, please take a look at this MSDN article
The processModel element of machine.config
If you're running IIS5 or the IIS5 isolation mode you'll have to look at the processModel element. The Properties you should pay the closest attention to are:
memoryLimit
requestLimit
timeout
memoryLimit
The default value of memoryLimit is 60. This value is only of interest if you have fairly little memory on a 32 bit machine. 60 stands for 60% of total system memory. So if you have 1 GB of memory the worker process will automatically restart once it reaches a memory usage of 600 MB. If you have 8 GB, on the other hand, the process would theoretically restart when it reaches 4.8 GB, but since it is a 32 bit process it will never grow that big. See my post on 32 bit processes for more information on why.
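The arithmetic can be checked with a couple of lines of Python. This is only a worked example of the percentage calculation, with 1 GB rounded to 1000 MB as in the text; the function name restart_threshold is made up for illustration.

```python
def restart_threshold(total_memory, memory_limit_percent=60):
    """Memory usage at which the worker process would restart, in the units given."""
    return total_memory * memory_limit_percent / 100


print(restart_threshold(1000))  # 1 GB taken as 1000 MB -> 600.0 (MB)
print(restart_threshold(8))     # 8 GB -> 4.8 (GB)
```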
requestLimit
This setting is "infinite" by default, but if it is set to 5000 for example, then ASP.NET will launch a new worker process once it's served 5000 requests.
timeout
The default timeout is "infinite", but here you can set the lifetime of the worker process. Once the timeout is reached ASP.NET will launch a new worker process, so setting this to "00:05:00" would recycle the application every five minutes.
Other properties
There are other properties within the processModel element that will cause your application pool to recycle, like responseDeadlockInterval. But these other settings usually depend on something going wrong or being out of the ordinary to trigger. If you have a deadlock then that's your main concern. Changing the responseDeadlockInterval setting wouldn't do much to resolve the situation. You'd need to deal with the deadlock itself.
Editing and updating
ASP.NET 2.0 depends on File Change Notifications (FCN) to see if the application has been updated. Depending on the change the application pool will recycle. If you or your application is adding and removing directories to the application folder, then you will be restarting your application pool every time, so be careful with those temporary files.
Altering the following files will also trigger an immediate restart of the application pool:
web.config
machine.config
global.asax
Anything in the bin directory or its sub-directories
Updating the .aspx files, etc. causing a recompile will eventually trigger a restart of the application pool as well. There is a property of the compilation element under system.web that is called numRecompilesBeforeAppRestart. The default value is 20. This means that after 20 recompiles the application pool will recycle.
A workaround to the sub-directory issue
If your application really depends on adding and removing sub-directories you can use linkd to create a directory junction. Here's how:
Create a directory you'd like to exclude from the FCN, e.g. c:\inetpub\wwwroot\WebApp\MyDir
Create a separate folder somewhere outside the wwwroot, e.g. c:\MyExcludedDir
Use linkd to link the two: linkd c:\inetpub\wwwroot\WebApp\MyDir c:\MyExcludedDir
Any changes made in c:\inetpub\wwwroot\WebApp\MyDir will actually occur in c:\MyExcludedDir, so they will go unnoticed by the FCN.
Is recycling the application pool really that bad?
You really shouldn't have to recycle the application pool, but if you're dealing with a memory leak in your application and need to buy time to fix it, then by all means recycling the application pool could be a good idea.
What about session state?
Well, if you're running in-process session state, then obviously it's going to be reset each and every time the application pool is recycled. If you need to brush up on your state server options, then I recommend taking a look at this entry.

WCF receive timeout

When attempting to connect to or communicate with my service, I have to wait almost exactly 20 seconds each time before the exception is fired. Since this is all going to be running on a local network, I would like to decrease that timeout period to 5 seconds. I tried decreasing the receiveTimeout on my client, but it didn't work. I looked all over my code for a 20 second timeout setting, but couldn't find one. What should I be changing?
There are several different timeout settings; see http://msdn.microsoft.com/en-us/library/ms731078.aspx. They can be set, for example, in a config file (web.config or app.config); see http://msdn.microsoft.com/en-us/library/ms731343.aspx for an example. Under http://msdn.microsoft.com/en-us/library/ms731399.aspx you can choose the binding you use and set the corresponding setting.
UPDATED: You probably have the timeout set at the TCP level. Try reducing the TcpMaxConnectRetransmissions (default value 2) or TcpInitialRTT (default value 3; on NT 4.0 the parameter is named InitialRTT) parameters in the registry, reboot your computer, and try your experiments one more time. You can read about the 21-second effect in http://support.microsoft.com/kb/223450, http://support.microsoft.com/kb/175523, http://support.microsoft.com/kb/170359 or http://www.boyce.us/windows/tipcontent.asp?ID=189. You can read a description of the TCP/IP default configuration values at http://support.microsoft.com/kb/314053 (for Windows XP) and http://technet.microsoft.com/en-us/library/cc739819(WS.10).aspx (for Windows Server 2003 with SP2).
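One way to see where a figure of about 21 seconds could come from: assuming the first connection attempt waits TcpInitialRTT seconds and the wait doubles on each of the TcpMaxConnectRetransmissions retries, the default values quoted above add up to 3 + 6 + 12 = 21 seconds. The following Python sketch does that arithmetic; the doubling rule is an assumption about the retransmission behaviour, and the function name connect_timeout is made up for illustration.

```python
def connect_timeout(initial_rtt=3, max_retransmissions=2):
    """Total seconds before a connect attempt gives up, assuming the wait doubles after each try."""
    # One wait for the initial attempt plus one for each retransmission.
    waits = [initial_rtt * 2 ** attempt for attempt in range(max_retransmissions + 1)]
    return sum(waits)


print(connect_timeout())       # 3 + 6 + 12 = 21
print(connect_timeout(3, 1))   # 3 + 6 = 9
print(connect_timeout(1, 2))   # 1 + 2 + 4 = 7
```

Under this model, neither single change reaches exactly 5 seconds, but lowering either parameter brings the wait much closer to it.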
What you may actually be seeing is the cold start of your web app. The Service Not Found exception would fire back pretty quickly unless you had hit it pretty hard and started queueing service requests beyond what WCF was configured to handle.
However, if your website was unloaded (AppDomain and worker process), it could take 20 seconds to reach the code that builds the channel to your service. So the real cause may be masked.
If your website and service are in different application pools, this is magnified, because it has to cold start the website and then cold start the service, which happen in succession instead of simultaneously.
To somewhat alleviate this you can use a keepalive/ping service: something that just constantly hits the URL to keep the AppDomain in memory and the worker process alive (if not shared). By default IIS 6 will shut down the worker process after 20 minutes of inactivity, so when the first request comes in, http.sys starts up a new worker process, which loads the framework, which loads your app, which starts the pipeline, which executes your code, which delivers to your user. :)
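A keepalive of this kind can be very small. Below is a minimal Python sketch using only the standard library; the function names and the example URL are made up for illustration, and the interval simply needs to be comfortably shorter than the 20 minute idle timeout.

```python
import time
import urllib.request


def fetch_status(url, timeout=10):
    """Request the URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def keep_alive(url, interval_seconds, rounds, fetch=fetch_status, sleep=time.sleep):
    """Hit url `rounds` times, sleeping interval_seconds between requests."""
    results = []
    for i in range(rounds):
        try:
            results.append(fetch(url))
        except OSError as error:   # URLError and HTTPError are subclasses of OSError
            results.append(error)  # record the failure and keep pinging
        if i < rounds - 1:
            sleep(interval_seconds)
    return results


# Example call (hypothetical URL), pinging every 5 minutes:
# keep_alive("http://localhost/myservice/ping", 300, rounds=12)
```

In practice this would run from a scheduled task or a small service on a machine that is always up, so that the first real user never pays the cold start cost.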