I have a WCF application with a couple thousand clients connecting to a pair of services running under IIS. What I've noticed is that some of these clients get into a hung state, and I'm trying to reproduce this.
When this problem was first noticed, I had not modified the throttling configuration and the services were set to ConcurrencyMode.Single. One thing I noticed was that an IISReset on the server caused many clients to hang. Yet pulling this same stunt on the client running against IIS on my local machine doesn't seem to cause the problem.
I caught this only once in the wild, but didn't have debugging enabled at the time. The symptom I witnessed was that the client appeared to be trying to open a connection to the web server, but did not succeed. While monitoring with Fiddler, I saw no attempt to reach the service endpoint. Obviously that makes me suspect the client proxy.
I have a very solid hunch as to what's happening -- namely I've been using "Close()" instead of "Abort()" when the service throws an exception, which I believe is causing the channels to become corrupted. But considering the effort to get a new version out there, I need to reproduce this problem by causing a client on my own machine to hang before I can start making changes to the code.
Where should I start?
Thanks in advance,
roufamatic
Have you got any logging turned on? This could help in diagnosing the problem. It can be done completely in config, so no need to build a new version. Use the Service Configuration Editor tool to set it all up. The Visual Studio 2008 Training Kit has a good tutorial on how to use logging and the log viewer.
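As a rough sketch of what config-only logging can look like, the fragment below enables WCF tracing in the client's app.config (or the service's web.config). The listener name `traceListener` and the log path are placeholders chosen for illustration, not values from the original setup, and the trace level will likely need tuning for a production machine.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- System.ServiceModel is the trace source WCF itself writes to. -->
      <source name="System.ServiceModel"
              switchValue="Information, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <!-- Placeholder listener name and path; the process identity
               needs write access to the target folder. -->
          <add name="traceListener"
               type="System.Diagnostics.XmlWriterTraceListener"
               initializeData="c:\logs\Traces.svclog" />
        </listeners>
      </source>
    </sources>
    <!-- Flush after each write so a hung process still leaves usable traces. -->
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```

The resulting .svclog file can be opened in the Service Trace Viewer (SvcTraceViewer.exe). Tracing at this level is verbose, so it is probably best enabled only on the handful of clients being investigated.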
I suppose this was too vague a question, though I was mostly curious what people might suggest. As it turns out, there was a nontrivial difference between my workstation and the production environment that, once resolved, allowed me to see the problem. In this case, using Fiddler to watch the traffic somehow prevented the error from occurring! Now to ask another question.
Related
We have an issue with a Windows service which uses NServiceBus. At some random moment, NServiceBus stops processing messages and sends them directly to the error queue, and I have to restart the service. After the restart, the messages that arrive in the input queue are handled, and everything gets back to normal. If we re-drop the messages that went to the error queue, they are processed successfully without any issue.
We use log4net to audit the message flow and store the logs in a database. When the problem occurs, the NServiceBus handler stops logging to log4net. After we restart the Windows service (NServiceBus), it starts logging again. We are NOT able to reproduce this issue in our development environment. We suspect this could be an NServiceBus memory leak, but we don't know how to confirm or resolve it.
We are planning to move this Windows service (NServiceBus) to a different server on a trial-and-error basis. Has anyone faced this issue and resolved it? Please help us resolve it, as it is causing a lot of trouble in our production environment.
NServiceBus Version that we are using : 2.0.0.1329
Message queue and windows service are in the same machine.
I believe you're running on a version of NServiceBus that is about 5 years old and is no longer supported. While I could give you the standard recommendation of upgrading to a more current release, it could very well be that some of the configuration APIs that you're using have been made obsolete so you may need to make some modifications there and/or in the app.configs.
I'm sorry to say that there probably isn't a better solution for you at this time.
In general, I'd suggest trying to track the NServiceBus releases somewhat more closely. If you're within 6-12 months of the current release, you should generally be in good shape.
For a few weeks now I've been having a really weird problem. I have a couple of services which work just fine when self-hosted in a command-line app. However, in IIS+AppFabric I cannot access one of the services: I get a TimeoutException and am pretty sure that the call doesn't even make it to the service (all services have an aspect that logs every call before doing anything). Note that both services are configured identically in code with regard to bindings and behaviors. I tried many things, like putting them on different app pools and disabling some of the transports. What is really strange is that if both services are in one app pool, one of the services works, but if I put them on separate threads, the other service times out. It really drives me nuts...
I also frequently see events like this in the system event log: "A process serving application pool 'Authorization Management' suffered a fatal communication error with the Windows Process Activation Service. The process id was '11852'. The data field contains the error number." The error number is 0x80070218. After the event, the service host initializes without problems (I can see my own info log messages), but the service is unreachable.
Does this ring a bell to anyone?
Thanks!
It turned out that I had a bug in the initialization of the services' hosts. I was trying something, and when I removed the try code, apparently I didn't delete the first line which was locking some resource.
Anyway, it is a good lesson: if your services do not work, your initialization might be buggy.
Sorry about the noise.
We have some WCF services implemented in an IIS application, communicating over net.tcp on the default port (808) using the Microsoft Net.Tcp Port Sharing Service. They throw an error on our production servers. When I open a connection to the first of the services, I get back an exception:
The server at <URL> rejected the session-establishment request.
All the other services respond fine, and everything runs fine on our test servers.
I initially thought there was something wrong with the particular service that was failing, but I tried rearranging the list of services into a different order, and it SEEMS to always be the first service that I hit that fails. (I say SEEMS because I think that once, in the early iterations of testing, I saw it happen on the second service that it hit. But I haven't been able to reproduce that.)
I've looked at application startup delays, and that doesn't seem to be the problem, because I can come back and run the test again as soon as it finishes - a delay of only a minute or two - and get the same error. Also, in the lower level environments, there is a start up delay of probably 30 seconds to a minute, but the result still comes back as expected.
I've tried accessing the services over http from INetManager, and I get intermittent failures on all the services: a particular service will return a yellow screen of death on one invocation, then come up with the expected link to the WSDL on the next one, seconds later.
I'm completely at a loss to explain this behavior, or how to resolve it. I've googled the error message and not found anything helpful. It may be a configuration issue: the production servers are newly provisioned VMs, and we may not have the config exactly right (whereas all the lower-level environments have been running this and other similar apps for some time), but I have no idea what to look for. I've looked at the properties of the app pool that the app is running on and compared them to the lower-level environments without finding any differences.
If somebody can point me in the right direction, you would have my undying gratitude.
Things I can find:
http://go4answers.webhost4life.com/Example/connect-busy-wcf-service-host-while-725.aspx:
"MaxConcurrentSessions (default = 10) [Per-channel] The maximum number of sessions that a service can accept at one time. Only comes into play with session-based bindings (wsHttp or netTcp)"
http://blogs.infosupport.com/unable-to-generate-a-wcf-proxy-using-svcutil-but-retreiving-the-wsdl-works/
So in the end the trick is to add the additional right on the c:\windows\temp folder for your App Pool Identity [for the service to be able to generate metadata] to solve the problem.
Also, are timeouts or other limits configured and being hit? Give tracing a look and access the service using WcfTestClient and see if you can find underlying errors.
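If the session limit quoted above turns out to be relevant, it can be raised in configuration without a rebuild. The fragment below is only a sketch: the behavior name `ThrottledBehavior` is hypothetical and the numbers are illustrative, not recommendations. Note also that the default of 10 in the quote matches older framework versions; as far as I know, .NET 4 raised the defaults and scales them by processor count, so it is worth checking which defaults apply on the production servers.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <!-- "ThrottledBehavior" is a hypothetical name; reference it from the
           behaviorConfiguration attribute of the relevant <service> element. -->
      <behavior name="ThrottledBehavior">
        <!-- Illustrative values only; tune them against observed load. -->
        <serviceThrottling maxConcurrentCalls="64"
                           maxConcurrentSessions="200"
                           maxConcurrentInstances="264" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```

With a session-based binding such as netTcp, sessions held by clients that never close their channels count against maxConcurrentSessions until they time out, which is one way a service can start rejecting new connections while existing ones keep working.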
Ok, can I vent?? I am so sick and tired of this. I'm working away most of the day and the WCF services are working great. Next time I run my app and make a WCF call, bam! the tcp socket is no longer available. I have searched high and low to solve this and there is no real solution. The only solution I can find is to reboot the machine which is a huge time-waste and burden. Restarting WPA service, net.tcp service, IIS, etc. does not do a thing. Logging off and back on does not fix it. Only a reboot fixes this issue. I do nothing except run my app again making a WCF call, and this crap happens. There are no configuration issues with anything. I have been dealing with this for months and cannot find any specific reason or solution as to why this happens. It happens with my firewall on or off, does not matter.
Any insight from anyone? I think there is truly a bug in the WCF / net.tcp layer that is causing this. I even get it on a production 2008 R2 server when sometimes making a Web.config change, so I have learned to stop the IIS, WPA, net.tcp, etc. services prior to the change then restart them. What a pain.
I'm using .NET4 all around, VS2010, all service packs, etc. applied. Everything is the most current.
Excuse me while I reboot.....
Can anyone help with this?
Open a command prompt
Navigate to c:\windows\microsoft.net\framework64\v4.0.30319
Register the service model using the command "ServiceModelReg.exe -r"
Credit goes to http://kumaranbose.blogspot.be/2010/08/cryptic-wcf-nettcp-errors.html
This issue has haunted me for almost 3 years now, but it only happens sporadically. TCPView helped.
I killed the SMSvcHost.exe process and then restarted the Net.Tcp Listener Adapter service. That cleared the issue. Not really a solution, but at least I don't have to resort to rebooting the server anymore.
I had this issue. It would happen after each IIS reset (which happens as part of our deployment). The issue was resolved after restarting NetTcpPortSharing service (which also restarts Net.Tcp Listener Adapter service)
I am not sure I have an answer, but you could identify the process that has the port open, which can help narrow the scope of the problem. I have used the Sysinternals suite, which includes TCPView. This proggy was helpful to me.
TCPView - http://technet.microsoft.com/en-us/sysinternals/bb897437
It sounds like the Net.Tcp Listener Adapter service is being killed by some process, or an exception thrown by the web service is putting the channel into a faulted state.
Have you tried setting the startup type of the service to automatic and recovery to restart service on first and second failure?
I doubt very much that there is a bug in the WCF net.tcp channel layer. If the listener is running and the TCP socket is no longer available, I would suggest you look into the code, especially the exception handling strategy, and have a peek at the IIS request logs.
I have a problem that confuses me, and I have no idea what could be causing it.
The system I'm using is Windows Vista, IIS 7.0, VS2008, Windows Software Factory, Entity Framework, WCF. The binding for all web services is wsHttpBinding.
I'm using a web service hosted in IIS. This web service calls another web service (also installed in IIS). If I use a client to call the first web service (which calls the second one), it works fine for about 4-10 calls. Then the service and IIS get stuck. The problem is repeatable: sometimes it happens after 4 calls, sometimes after 10, but it always happens.
Stuck means that this web service isn't callable anymore and generates a timeout after 1 minute.
Even increasing Timeout doesn't change anything.
If I try to restart IIS, I get a timeout error (and this really confuses me; it seems that the web service has "crashed" somehow and blocks the restart of IIS). So IIS is also "stuck" (it is not really stuck, but I can't restart it). Only if I kill w3wp.exe can IIS be restarted, and then the web service works again (until I call the service several times again).
The log files, such as HTTP logging, Event Viewer or WCF message logging, don't show any hints about the source of the problem. (I'm no expert in logging or in where to find and enable such logs; I'm a newbie.)
I don't have this problem when I'm using a Webservice which doesn't call another Service.
Calling a Webservice is done by Service Reference (I'm using no Proxy-Classes), but I think this should be no Problem.
I have no idea of what is happening, nor how to solve this Problem.
Regards
Rene
Edit. : I hope my posting is more readable now :-)
Insert System.Diagnostics.Debugger.Break() into your web service code. When that point is reached with a debugger attached, you will be able to step through the service logic. This may help you diagnose the cause of the deadlock.
Another alternative is to turn on WCF Tracing, and diagnose that way.