The identity check failed for the outgoing message - wcf

We have a WCF service that runs on a domain server. We have a couple of websites (WCF clients) that are not on the domain, and we use a username and password to authenticate. Everything works fine.
Some days, when the service app pool recycles, the website fails to connect and starts throwing lots of "identity check failed" error messages. (The expected identity is 'identity(http://schemas.xmlsoap.org/ws/2005/05/identity/right/possessproperty: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/thumbprint)' for the 'http://xxx.com:8004/sts.svc/username' target endpoint.) But most days it works fine.
What could be wrong, and how can we resolve this?
Note: the clocks on the server and client are in sync.
thanks
Ravi

Check the clocks on the servers and clients. WS-Security fails if the clock skew between the client and server is greater than a threshold, 5 minutes by default.
The automatic clock sync on Windows Server doesn't always do its job. Clients may not sync at all. If everyone syncs to a reputable time source (NIST, for example), your problems may go away entirely.

Related

The server rejected the session-establishment request: WCF hosted on IIS

We have some WCF services implemented in an IIS application, communicating over net.tcp on the default port (808) using the Microsoft Net.Tcp Port Sharing Service. They are throwing an error on production servers. When I instantiate a connection to the first of the services, I get back an exception:
The server at <URL> rejected the session-establishment request. All the other services respond fine.
But it runs fine on our test servers.
I initially thought there was something wrong with the particular service that was failing, but I tried rearranging the list of services into a different order, and it SEEMS to always be the first service that I hit that fails. (I say SEEMS because I think that once, in the early iterations of testing, I saw it happen on the second service that it hit. But I haven't been able to reproduce that.)
I've looked at application startup delays, and that doesn't seem to be the problem, because I can come back and run the test again as soon as it finishes - a delay of only a minute or two - and get the same error. Also, in the lower level environments, there is a start up delay of probably 30 seconds to a minute, but the result still comes back as expected.
I've tried accessing the services over http from INetManager, and I get intermittent failures on all the services - a particular service will return a yellow screen of death on one invocation, then come up with the expected link to the WSDL on the next one seconds later.
I'm completely at a loss to explain this behavior, or how to resolve it. I've googled the error message and not found anything helpful. It may be a configuration issue - the production servers are newly provisioned VMs, and we may not have the config exactly right (whereas all the lower level environments have been running this and other similar apps for some time), but I have no idea what to look for. I've looked at the properties of the app pool that the app is running on and compared them to the lower level environments without finding any differences.
If somebody can point me in the right direction, you would have my undying gratitude.
Things I can find:
http://go4answers.webhost4life.com/Example/connect-busy-wcf-service-host-while-725.aspx:
"MaxConcurrentSessions (default = 10) [Per-channel] The maximum number of sessions that a service can accept at one time. Only comes into play with session-based bindings (wsHttp or netTcp)"
http://blogs.infosupport.com/unable-to-generate-a-wcf-proxy-using-svcutil-but-retreiving-the-wsdl-works/
So in the end the trick is to add the additional right on the c:\windows\temp folder for your App Pool Identity [for the service to be able to generate metadata] to solve the problem.
Also, are timeouts or other limits configured and being hit? Give tracing a look and access the service using WcfTestClient and see if you can find underlying errors.
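For the tracing suggestion above, a minimal sketch of enabling WCF diagnostics tracing in the service's web.config might look like the following. The listener name and log path are placeholders (assumptions, not taken from the original post), and the trace level is only a starting point. The resulting .svclog file can be opened with the Service Trace Viewer (SvcTraceViewer.exe).

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Capture warnings and errors raised inside the WCF runtime -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <!-- Hypothetical listener name and path; the app pool identity
               needs write access to the target folder -->
          <add name="xmlTrace"
               type="System.Diagnostics.XmlWriterTraceListener"
               initializeData="C:\logs\wcf-service.svclog" />
        </listeners>
      </source>
    </sources>
    <!-- Flush after every write so entries survive an abrupt recycle -->
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```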

WCF Service: Status 200 with sc-win32-status of 64

We observed the following behavior on one of the servers hosting a WCF service on IIS 6.0:
The IIS log shows a high value for time-taken (> 100000)
The HTTP status code is 200
sc-win32-status code shows a value of 64
I found out that an sc-win32-status code of 64 indicates "The specified network name is no longer available"
Initially I suspected that it could be because of limits set on MinFileBytesPerSecond, which sets the minimum throughput rate that HTTP.sys enforces when sending data from the client to the server, and back from the server to the client.
But the values for sc-bytes and cs-bytes indicate that the amount of data sent is within the range generally observed for the service.
Also note that the WCF service is hosted on four boxes and is load-balanced, but the problem occurs on only one of the servers at a time (though not necessarily on the same server). The problem is also intermittent.
Has anybody else encountered this error? Any clues about what could be wrong?
Update
Note: Observation on IIS 7.5 (IIS version does not really matter)
I was able to replicate the issue. The issue occurs if:
1. The WCF service takes a long time to respond
2. The client proxy times out before it receives a response from the server. In this case it leads to TimeoutException on the client.
3. The server keeps waiting for a TCP ACK from the client, which it will never receive.
Hence the long timeout (TCP socket timeout, default value: 4 minutes) and the sc-win32-status of 64.
So essentially it appears that the WCF code is taking a long time to respond and the client is timing out. What I observe in the IIS log is just a symptom, not the problem itself.
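The client-side timeout in step 2 is normally governed by the binding's sendTimeout, which defaults to one minute. As a rough sketch, assuming the client uses basicHttpBinding (the original post does not say which binding is in use) and a hypothetical binding name, the limit could be raised in the client config like this. Raising it only hides the symptom if the service itself is slow.

```xml
<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- Hypothetical binding name and value. On the client, sendTimeout
           bounds how long a request/reply call waits before a
           TimeoutException is thrown -->
      <binding name="slowServiceBinding" sendTimeout="00:05:00" />
    </basicHttpBinding>
  </bindings>
</system.serviceModel>
```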
The behavior you are describing will also occur if you exceed a WCF service's max sessions, calls or instances (depending on how you have your service's InstanceContextMode configured). If you observe the System.ServiceModel performance counters for %max concurrent sessions and/or %max concurrent calls (again depending on your service's instance context mode), you may see a correlation with the IIS log entries.
Note that these maxes can be configured in the service throttling behavior.
https://msdn.microsoft.com/en-us/library/vstudio/system.servicemodel.description.servicethrottlingbehavior(v=vs.100).aspx
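As an illustration of the throttling behavior mentioned above, the limits can be set in the service's config roughly as follows. The behavior name and the numbers are arbitrary placeholders, not recommendations; suitable values depend on the instance context mode and the load on the service.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <!-- Hypothetical behavior name; reference it from the <service>
           element via behaviorConfiguration -->
      <behavior name="throttledBehavior">
        <!-- Illustrative limits only -->
        <serviceThrottling maxConcurrentCalls="64"
                           maxConcurrentSessions="200"
                           maxConcurrentInstances="264" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```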
I saw your question again and wanted to point out that I found a solution for this. It turned out to be this piece of code in the web.config:
<pages smartNavigation="true">
After turning this off I stopped receiving the same time-out errors. See also the answer here
IIS puts the services to sleep to save resources.
Copied from here (WCF REST Service goes to sleep after inactivity)
The application pool hosting your service defines an Idle Time-out property (advanced settings of the app pool in the IIS management console) which defaults to 20 minutes. If no request is received by the app pool within the idle time-out, the worker process serving the pool is terminated. After receiving a new request, IIS must start the process again, and the process must load the application domain and all related assemblies, compile the .svc file, run the service host and process the request. The solution can be increasing the idle time-out, but the purpose of this time-out is correct handling of server resources: if the process is not needed, it should be stopped. Another ugly workaround is using some ping process (for example a cron job or scheduled task on the server) which will regularly call some method on the service or a page in the same application.
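For reference, the Idle Time-out setting described above is stored in applicationHost.config on IIS 7 and later. A sketch of what the entry might look like, with a made-up pool name, is shown below; a value of 00:00:00 disables idle shutdown, which trades the cold-start delay for a permanently running worker process.

```xml
<system.applicationHost>
  <applicationPools>
    <!-- "MyWcfAppPool" is a hypothetical pool name -->
    <add name="MyWcfAppPool">
      <!-- idleTimeout is a time span; 00:00:00 means never shut down
           the worker process for being idle -->
      <processModel idleTimeout="00:00:00" />
    </add>
  </applicationPools>
</system.applicationHost>
```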

NT Hosted WCF Service With MSMQ fails to stop cleanly and Locks Up

This is a problem which has had me baffled for weeks now on a client's Live environment.
The WCF service is hosted on Windows Server 2003, and has both HTTP and MSMQ endpoints.
When placing the service in the test environment, the service cleanly starts and stops, and messages are passed without problems. However on the Live environment, the service starts fine, but does not exit cleanly.
When attempting to stop the service, the machine takes a long time to respond and eventually displays an error saying that the service could not be stopped. Inspecting the error in the event log, it says that it was unable to write to the MSMQ queue (access denied); however, the service is able to read and remove messages from the queue. If one then refreshes the service manager, the service is in fact stopped.
The MSMQ queue is hosted on a different physical machine, and we have been unable to reproduce the error on the test environment.
We are not sure if it is related or not, but the service will also occasionally stop pulling messages from the queue. This has been solved by restarting the service. Again, we have not been able to reproduce the error.
Recently we experienced another error with the HTTP based client where upon midnight one night, the service suddenly started rejecting connections with the following exception:
The HTTP request is unauthorized with client authentication scheme 'Anonymous'. The authentication header received from the server was 'Negotiate,NTLM'. ---> System.Net.WebException: The remote server returned an error: (401) Unauthorized.
Even more curious, is that simply restarting the service seems to correct the problem.
If anyone has seen anything like this before or has any comments, it would be much appreciated!
Speaking to a colleague, apparently setting the ServiceModelEx throttling options all to "1" helps with the lock-ups on MSMQ-based WCF services.

IIS 7 Restarts Automatically

I have a WCF Service Deployed on IIS. (BasicHTTPBinding with [AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)])
I have built custom in-memory session management, and now I am facing a strange problem: IIS 7 restarts automatically without throwing any kind of warning or error, not even in the Event Log. This destroys all existing sessions.
I discovered this issue after logging the Application_Start and Application_End methods using log4net. I also put a breakpoint in Application_Start, and it paused there in the middle of test execution.
This happens rarely, but I need to know why it happens and whether it is normal and acceptable. If not, what are the possible reasons for it?
Regards
Mubashar Ahmad
Could it be the app pool being recycled? IIS 6 has this set on by default to 1740 minutes. As for IIS 7, I guess you would have the same kind of setting? I know that in IIS 6 this "event" is not logged as an error.
IIS recycles worker processes either when it detects an "unhealthy" process, or after certain operator-configurable limits are reached.
Among the limits are:
memory threshold
after a configured number of requests
elapsed time
time of day
more info
The session timeout (which is separate from app pool recycling) is set to 20 minutes by default and is configured at the application level. This also means anything being held in Session will be discarded once that timeout elapses. You can set it via the properties of the virtual directory/application in IIS 6, and via SessionState->Open Feature in IIS 7 (when you have the application selected).
Also note that session timeout can be set via the web.config of an ASP.Net application, should your web services be hosted in one of those.
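As a small illustration of the web.config route, the session timeout is set on the sessionState element, with the value given in minutes. The 60 below is an arbitrary example value. Note that with InProc mode the session data lives in the worker process, so it is lost on an app pool recycle no matter what the timeout is.

```xml
<configuration>
  <system.web>
    <!-- timeout is in minutes; 60 is only an example value -->
    <sessionState mode="InProc" timeout="60" />
  </system.web>
</configuration>
```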

WCF Service polling hangs

I have two WCF services, one of which polls the other at a regular interval. Service2 is hosted on a number of machines with the same configuration.
My problem is that whenever the poller service gets restarted, even though Service2 on the other machines runs fine, I do not get a response from those services (basically the call times out with a System.TimeoutException). If I try to access the same service (Service2) from some temp application (without restarting Service2), it responds.
If I restart Service2, then it works fine, and Service1 (the poller service) gets responses from all hosted instances of Service2.
I don't know what is causing the problem.
Regards,
Chirag
Attach Visual Studio to the WCF service that hangs and find out whether your connection is successful.
Do it with both services, so that you can debug them at runtime.
If you're using a sessionful binding (netTcpBinding, wsHttpBinding), it's more than likely that you're not explicitly closing your client channel when you're done with it. This would cause the behavior you see, because the session takes a minute or so to time out if you don't explicitly close it, and the default max number of sessions is low (10). Once that limit is reached, the server makes new sessions wait until old ones close. You can also adjust the service throttle on the server side to increase the max number of open sessions allowed, but you really should make sure your clients are getting cleaned up properly first.