While checking SystemOut.log during a reported slowdown in the application, I found StaleConnectionException occurring frequently. This exception was not observed earlier, and I suspect it may be the reason for the slowness and needs to be resolved.
StaleConnectionException usually happens when WebSphere has been disconnected from the database. It can be caused by a database restart or by a network issue, e.g. a firewall that drops connections after some time. If it happens frequently, make sure that the Purge policy for that data source is set to EntirePool, not FailingConnectionOnly. If you have a firewall between WAS and the DB, set Aged timeout to a lower value than the timeout on the firewall (try 1200 seconds, for example).
Can this be a reason for slowness?
It can, to some extent: when the application gets a StaleConnectionException, that request fails, and either the application has retry logic for it or the end user gets an error and retries the same request.
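The retry logic mentioned above could look roughly like the following. This is only a minimal sketch, not WebSphere's own API: the nested StaleConnectionException is a hypothetical stand-in declared so the example compiles without WebSphere libraries, and the StaleRetry class, the withRetry helper and its maxAttempts parameter are illustrative names.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

class StaleRetry {
    /** Hypothetical stand-in for the WebSphere exception, so the sketch compiles on its own. */
    static class StaleConnectionException extends SQLException {
        StaleConnectionException(String message) {
            super(message);
        }
    }

    /**
     * Runs the unit of work and repeats it when a stale connection is reported.
     * The work is expected to obtain a fresh connection from the data source on
     * every attempt; any other exception is passed through without a retry.
     */
    static <T> T withRetry(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StaleConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (StaleConnectionException e) {
                last = e; // remember the failure and try again with a new connection
            }
        }
        throw last; // every attempt hit a stale connection
    }
}
```

Each retry repeats the whole request, so a request that hits a stale connection takes noticeably longer than a normal one, which is consistent with the slowness described above.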
Related
InvalidOperationException: Timeout expired. The timeout period elapsed
prior to obtaining a connection from the pool. This may have occurred
because all pooled connections were in use and max pool size was
reached.
For some reason I get this error message once in a while when testing locally with this connection string:
"DefaultConnection": "Server=(localdb)\\mssqllocaldb;Database=myDatabase;Trusted_Connection=True;"
It happens right when I press log in and the check against the database runs: it takes forever until it throws the exception I just posted. Why does this happen? I don't think it has happened on the production server yet, or I just missed it, but it would be nice to know why this happens.
I am having trouble getting the recovery-on-failure feature to work for my Windows Service application. I set it up to restart the application on first failure. Then, to test it, I use this line of code:
System.Environment.Exit(-1)
This causes the application to end okay but it doesn't restart.
It is reasonable to suppose that a service process exiting without setting the service status to stopped would constitute a failure. However, that isn't the case. (Perhaps for backwards compatibility; there might be too many third-party services that such a change would break.)
However, if the process exits as the result of an unhandled exception, that is considered a service failure and triggers the recovery options. So if you want to cause the service to fail, raise an exception (and don't catch it).
In our production environment, we have a WCF service that is called very frequently.
We noticed that sometimes calls to this service (only this one) fail with a timeout for a period of time, after which everything falls back into place and the service responds correctly again.
I used Dynatrace to try to understand what's happening. I noticed that for the calls resulting in a timeout, the service method is never called! At the same time, the server throws this error:
A blocking operation was interrupted by a call to
WSACancelBlockingCall
and the client throws a Timeout Exception.
I want to understand the cause of these errors. Is the server error caused by the client's TimeoutException (when the client closes its connection)? Otherwise, why does the server throw this error?
Can you attach a screenshot of that PurePath?
The TimeoutException is simply thrown by the caller of a service when the called web service doesn't return within the default timeout, typically something like 60s. Once the client aborts its network connection, it causes the exception in the server that accepted that connection.
There can be multiple reasons for this slow behavior, e.g. you are maxing out the number of connections you have in your client, or the server implementation is overloaded and can't handle incoming requests. Definitely look at the number of worker threads/connections configured on both sides.
If you want specific help with Dynatrace, feel free to send over the PurePaths. Check out http://bit.ly/sharepurepath
Hope this helps.
I am running a load test to see how the WCF services perform at peak time (heavy load). I am using the step load pattern, where virtual users are added step by step. For the first few minutes the test runs smoothly, but as the load increases over time, the error below suddenly starts to occur:
"Test method "XYZ" threw exception:
System.ServiceModel.CommunicationException: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host".
I tried a lot of solutions that I found online, but none of them worked for me. I tried changing the default timeouts, maxConnections, max concurrent connections, etc. in the config files. I would really appreciate any help on this.
I had similar problems a while ago when I moved to WCF. It probably has something to do with the way WCF handles connections.
The solution for me was to move each test into a separate application domain (AppDomain).
My application has 50 service endpoints (such as /mysite/myService.svc). It's hosted in IIS. Intermittently (once every two or three days) a service stops responding. It's never the same service that hangs. While a service is hung, some of the other services work fine and some others are also hung.
All clients (from different computers) get this error:
ServiceModel.CommunicationException
Message: An error occurred while receiving the HTTP response to
https://server/mysite/myservice1.svc.
This could be due to the service endpoint binding not using the HTTP
protocol. This could also be due to an HTTP request context being
aborted by the server (possibly due to the service shutting down).
See server logs for more details.
No exceptions are raised by the server when the client attempts to call the service that is hung. All I have is that error on the client side.
I have to manually recycle the application pool to fix the problem.
Do you know what could be the cause? How can I investigate this issue? I'm willing to take a memory dump of the worker process when a service is hung but I would not know what to search for in the dump.
Update (Aug 13 2009): I have almost ruled out the idea that the server runs out of connections (see comment in Shiraz Bhaiji's answer). I might have a new lead: I log all server-side exceptions in a log file. So in theory, when this occurs on the client, no exceptions are raised on the server; otherwise I'd have proof of that in my logs. But what if an error does occur on the server but is happening at a low level where exceptions are not routed to my exception handling code? I have posted this question about scenarios where low level exceptions cannot be handled. I'll keep you informed of the progress of my investigation.
Sounds like you are running out of connections.
By default WCF has a 10-minute timeout and therefore holds a connection open for 10 minutes.
When you recycle the app pool, all connections are closed, and therefore things work again.
To fix it, check your code to make sure that you close connections and dispose of proxies.
To resolve this, we set establishSecurityContext to False on the binding.
I have not come across this particular issue, but I would suggest turning on tracing/message logging for the WCF service in the config for the service and/or the client app (if you have control over that). I've done this in the last few days for a service that I needed to troubleshoot.
The MSDN link here is a good starting point.
Also see the table in this post for the varying levels of trace detail you can configure. There are several levels which can go from exception only logging to full message details. It is quite quick to set this up in the app.config file.
To parse the log file output use the SvcTraceViewer.exe that comes with the Windows SDK, which if you have it installed should be located in this folder: C:\Program Files\Microsoft SDKs\Windows\v6.0\Bin
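As a rough illustration of the tracing setup described above, a config fragment along these lines enables WCF tracing to a file that SvcTraceViewer.exe can open. It is a sketch rather than the exact configuration used in the answer: the listener name and the log path c:\logs\Traces.svclog are assumptions, and switchValue can be raised or lowered to change the level of trace detail.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Trace source used by WCF; "Warning" keeps the log small, "Verbose" logs far more -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <!-- Log path is an assumption; pick a folder the process identity can write to -->
          <add name="xmlTraceListener"
               type="System.Diagnostics.XmlWriterTraceListener"
               initializeData="c:\logs\Traces.svclog" />
        </listeners>
      </source>
    </sources>
  </system.diagnostics>
</configuration>
```

For an IIS-hosted service this would go in web.config rather than app.config, and the application pool identity needs write access to the log folder.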