I have a WCF service with a NetTcpBinding running with about 100 clients. The clients regularly poll information from the server, and after a while the service stops responding.
Looking at netstat, I can see many connections that are in the CLOSE_WAIT state.
This is my binding:
<netTcpBinding>
  <binding name="default" maxReceivedMessageSize="2147483647" maxBufferPoolSize="2147483647" maxConnections="10000">
    <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" maxArrayLength="2147483647" maxBytesPerRead="2147483647" maxNameTableCharCount="2147483647" />
  </binding>
</netTcpBinding>
I have also tried changing the value of closeTimeout from the default of 00:01:00 to 00:00:10, but with no effect.
The machine is a Windows Server 2008 R2 64bit.
Update:
I have added a ServiceThrottlingBehavior now, but the result is still the same.
new ServiceThrottlingBehavior
{
    MaxConcurrentCalls = 1000,
    MaxConcurrentInstances = 1000,
    MaxConcurrentSessions = 1000
};
Update 2:
I have set the SessionMode to NotAllowed and changed the binding to streamed.
Any ideas what I could do to improve performance or to figure out the problem?
From your description, it seems:
1. Initially the clients were able to connect to your server without problems, which rules out a configuration problem.
2. After a while the server stopped responding, but you didn't say after how long, how high the request rate is, or whether the server stopped responding entirely or only intermittently.
Based on this, one possibility is that something is wrong on the server side. Did you notice anything unusual there? Things to look for:
Thread count -- was the thread pool being depleted (some settings cap the number of thread pool threads)? Try a fresh launch of the server and watch the thread count until it stops responding; is there any pattern? You may have deadlocks, long blocking operations, etc. that hold threads for too long.
Memory -- is there a problem with memory leaks?
Is it a self-hosted service? Do you have code to catch the ServiceHost.Faulted event (and restart the service)? If a ServiceHost is faulted, it will not respond to any requests.
See what the WCF performance counters tell you, especially the queue size and the number of active connections. From the performance counters you'll know whether the service is accepting requests at all, and whether your throttling configuration is necessary.
The ultimate diagnostic tool: turn on server-side WCF tracing. Opening the trace file will tell you exactly what happened with a request; if you see an exception in the trace, you've found your root cause.
It sounds like your clients are never disconnecting.
Are you sure your clients are correctly closing the channel? Note that you should call ChannelFactory.Close, not just Dispose.
Set the receiveTimeout to something low to verify this is the problem.
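If the clients are proxies generated by svcutil or Add Service Reference, a defensive close pattern on the client side looks roughly like this (a sketch; `SafeClose` is a hypothetical helper, not from the question -- the point is that Close() itself can throw on a faulted channel, leaving the socket open unless you Abort()):

```csharp
using System;
using System.ServiceModel;

static class ChannelHelper
{
    // Sketch of a defensive close: Close() performs a graceful shutdown
    // but throws if the channel is faulted, in which case Abort() is the
    // only way to release the underlying connection.
    public static void SafeClose(ICommunicationObject channel)
    {
        try
        {
            if (channel.State != CommunicationState.Faulted)
                channel.Close();   // graceful: sends the session close message
            else
                channel.Abort();   // faulted channels can only be aborted
        }
        catch (CommunicationException) { channel.Abort(); }
        catch (TimeoutException)       { channel.Abort(); }
    }
}
```

Calling this after every poll (instead of relying on Dispose) ensures the server side is told the session is over, rather than waiting for a timeout.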
Your client closed the connection by calling close(), which sent a FIN to the server socket; the server ACKed the FIN and its socket moved to CLOSE_WAIT, where it stays until the server issues its own close() on that socket.
Your server program needs to detect when the client has closed the connection, and then close() its side immediately to free the socket. How? Look at read(): when it returns zero, end-of-file has been reached, meaning a FIN was received.
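The same behavior is easy to demonstrate with raw sockets, outside of WCF. This loopback sketch shows the server's Receive() returning 0 once the client's FIN arrives; until the server closes its socket, netstat would show it in CLOSE_WAIT:

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class FinDemo
{
    static void Main()
    {
        // Listen on an ephemeral loopback port.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        using (Socket server = listener.AcceptSocket())
        {
            client.Close();                        // client sends FIN; server side enters CLOSE_WAIT
            int n = server.Receive(new byte[16]);  // 0 bytes == end-of-file, FIN received
            Console.WriteLine(n);                  // prints 0
        }   // disposing the server socket sends our FIN, leaving CLOSE_WAIT

        listener.Stop();
    }
}
```
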
You can detect whether the client has disconnected.
Every WCF channel implements ICommunicationObject, which provides events for the channel lifetime.
You should listen to the Faulted event.
The sessionId can be accessed as always from the OperationContext.Current property.
When your client opens the channel (on the first operation), register for the relevant events:
OperationContext.Current.Channel.Faulted += new EventHandler(Channel_Faulted);
OperationContext.Current.Channel.Closed += new EventHandler(Channel_Faulted);
and
void Channel_Faulted(object sender, EventArgs e)
{
    Logout((IContextChannel)sender);
}

protected void Logout(IContextChannel channel)
{
    string sessionId = null;
    if (channel != null)
    {
        sessionId = channel.SessionId;
    }
}
If the socket is disconnected, you should get a channel Faulted event. The Closed event is raised when the client shuts down gracefully, the Faulted event when the disconnect is unexpected (as in the case of a network failure).
Check out the following link; it's a similar case, and it helped me:
TCP Socket Server Builds Up CLOSE_WAITs Occasionally Over Time Until Inoperable
Verify that the router isn't the problem, since some consumer grade routers have a cap on the number of allowed open sockets/connections.
Related
I host a WCF Service on IIS and have the following binding in web.config:
<bindings>
<wsHttpBinding>
<binding name="transactionalBinding"
transactionFlow = "true"
sendTimeout = "00:00:01"
receiveTimeout = "00:00:01"
openTimeout = "00:00:01"
closeTimeout = "00:00:01">
<security mode="Transport">
<transport clientCredentialType="None" proxyCredentialType="None" realm=""/>
</security>
</binding>
</wsHttpBinding>
</bindings>
In my service method I sleep for 10 seconds. I do not get a timeout exception when calling my service method from a client.
Is there any meaning in defining timeouts in server side bindings?
I do not get a timeout exception when calling my service method from a client.
TL;DR: WCF timeouts default to one minute, so a server operation that takes only 10 seconds naturally isn't going to time out. The timeouts you have specified on the server only affect transmission, not the execution of your method (and you aren't calling anything else from the server).
You are specifying the timeouts in the server config. What you need to do is specify the timeouts in the client's config file, specifically SendTimeout. Essentially, whichever end is making the call needs to specify the operation timeout. Probably not relevant in your case, but if your "server" in turn made another WCF call to another service, you would want your own timeout there too.
MSDN:
SendTimeout – used to initialize the OperationTimeout, which governs the whole process of sending a message, including receiving a reply message for a request/reply service operation. This timeout also applies when sending reply messages from a callback contract method.
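So to actually see a timeout with a 10-second service method, move the short sendTimeout into the client's config. A minimal sketch (binding name copied from the question, value illustrative):

```
<!-- client app.config: the caller's sendTimeout governs the whole call -->
<bindings>
  <wsHttpBinding>
    <binding name="transactionalBinding" sendTimeout="00:00:01">
      <security mode="Transport">
        <transport clientCredentialType="None" />
      </security>
    </binding>
  </wsHttpBinding>
</bindings>
```

With this in place, the 10-second Sleep on the server surfaces as a TimeoutException on the client.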
Generally, WCF client and server configs should match one another, and unless you run Add Service Reference/Refresh Service Reference every time the server contracts or config change, the client won't know about the changes. By the way, avoid that approach: it duplicates your model and can lead to runtime errors when the two sides get out of sync.
A passing thought
And this brings up one of the problems of WCF configuration via config files, they are subject to runtime errors impossible to find at compile time.
A better practice is to do away with config files completely and do programmatic configuration via a common assembly that both your client and server use. Specify bindings in code, along with your timeouts.
That way both server and client are always in sync with regards to WCF configuration.
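A minimal sketch of that idea (the class and method names here are illustrative, not from the question): a binding factory in an assembly referenced by both sides, so a timeout change recompiles into client and server together.

```csharp
using System;
using System.ServiceModel;

// Shared assembly referenced by both client and server, so the
// binding (and its timeouts) can never drift out of sync.
public static class SharedBindings
{
    public static NetTcpBinding Default() => new NetTcpBinding(SecurityMode.Transport)
    {
        OpenTimeout    = TimeSpan.FromSeconds(10),
        CloseTimeout   = TimeSpan.FromSeconds(10),
        SendTimeout    = TimeSpan.FromSeconds(30),
        ReceiveTimeout = TimeSpan.FromMinutes(10),
    };
}
```

The server passes `SharedBindings.Default()` to `AddServiceEndpoint`, and the client passes it to its `ChannelFactory<T>` constructor.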
Having both client and server agree on timeouts would have addressed some of these issues.
Tell me more
WCF the Manual Way… the Right Way
I am using Visual Studio 2012 to generate a web service to be used by a WinForms client. I created the client side by using "Add Service Reference". This WinForms client is a .NET C# replacement of an old VB6 app. Previously, in the VB app, there were external settings for timeout values, including the following:
DNS timeout
Connect timeout
Request timeout
The DNS timeout would work when the endpoint host address is a FQDN forcing a DNS lookup. The timeout value here would place a limit on the amount of time to wait for DNS resolution.
The connect timeout would place a limit on the amount of time the winforms client would wait to establish an http connection to the server. DNS lookup would have been successful.
The request timeout would place a limit on the amount of time to wait for the request to return after an http connection was successful. This would come into play if a long running query took too long after the web service call was initiated.
Is there something similar to the above in .NET 4.0? I would like to be able to configure this in the app.config. I do know about the below.
<bindings>
<basicHttpBinding>
<binding name="IncreasedTimeout"
openTimeout="12:00:00"
receiveTimeout="12:00:00" closeTimeout="12:00:00"
sendTimeout="12:00:00">
</binding>
</basicHttpBinding>
</bindings>
Could these map to the ones I need or does it really not matter?
thanks
The OpenTimeout setting for the WCF binding is for the length of time to wait when opening the channel, so I believe this will be analogous to your old Connect timeout. This should be fast so you normally would only want to specify a few seconds to wait (30 or less), not 12 hours.
The WCF CloseTimeout is for when a Close Channel message is sent, and this is how long to wait for an acknowledgement. This may not have an equivalent in your old architecture. Again, this should be fast and should only need a few seconds.
The WCF SendTimeout (for the client config) essentially covers the time for the Client to send the message to the service, and to receive back the response (if any). This would correspond to your old Request timeout. This may need to be for several minutes if your server takes a while to process things.
The WCF SendTimeout (for the server config) is for when you want callbacks, so that the Server knows how long to wait for acknowledgement that its callback was received.
The WCF ReceiveTimeout does not apply to client-side configuration. For server-side config, ReceiveTimeout is used by the ServiceFramework layer to initialize the session-idle timeout: if no message arrives on a session within that interval, the server drops the connection.
This MSDN discussion may be helpful http://social.msdn.microsoft.com/Forums/vstudio/en-US/84551e45-19a2-4d0d-bcc0-516a4041943d/explaination-of-different-timeout-types?forum=wcf
As a final note, having really big timeout values isn't a good idea unless you definitely have long running requests. This is because you can run out of available resources on your server if the client isn't closing the connections properly.
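Putting the mapping together, a client-side binding with proportionate values might look like this (values are illustrative, not 12 hours across the board):

```
<!-- openTimeout ~ old Connect timeout, sendTimeout ~ old Request timeout -->
<basicHttpBinding>
  <binding name="SensibleTimeouts"
           openTimeout="00:00:30"
           closeTimeout="00:00:30"
           sendTimeout="00:05:00"
           receiveTimeout="00:10:00" />
</basicHttpBinding>
```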
Server: Windows Server 2008 R2
.NET Version: 4.5
I'm using WCF to connect two servers - app and queue. I want app to be able to send/receive messages from queue. For some reason, app can send messages, but CANNOT receive them.
The netMsmq binding looks like:
<binding name="JobExecutionerBinding" receiveErrorHandling="Move">
<security>
<transport msmqAuthenticationMode="None" msmqProtectionLevel="None" />
</security>
</binding>
And the service binding looks like:
Now, the client binding looks like:
<endpoint address="net.msmq://queue/private/jobs"
binding="netMsmqBinding"
bindingConfiguration="JobExecutionerBinding"
contract="JobExecution.Common.IJobExecutionService"
name="SimpleEmailService"
kind=""
endpointConfiguration=""/>
I changed a few names for security's sake.
So, the WCF client can send to the remote queue without a problem. It even properly queues the outgoing message and forwards it on later in the event that the remote queue server is down. But every time I start up the WCF service, I get this:
There was an error opening the queue. Ensure that MSMQ is installed and running, the queue exists and has proper authorization to be read from. The inner exception may contain additional information. ---> System.ServiceModel.MsmqException: An error occurred while opening the queue: The queue does not exist or you do not have sufficient permissions to perform the operation. (-1072824317, 0xc00e0003). The message cannot be sent or received from the queue. Ensure that MSMQ is installed and running. Also ensure that the queue is available to open with the required access mode and authorization.
   at System.ServiceModel.Channels.MsmqQueue.OpenQueue()
   at System.ServiceModel.Channels.MsmqQueue.GetHandle()
   at System.ServiceModel.Channels.MsmqQueue.SupportsAccessMode(String formatName, Int32 accessType, MsmqException& msmqException)
   --- End of inner exception stack trace ---
   at System.ServiceModel.Channels.MsmqVerifier.VerifyReceiver(MsmqReceiveParameters receiveParameters, Uri listenUri)
   at System.ServiceModel.Channels.MsmqTransportBindingElement.BuildChannelListener[TChannel](BindingContext context)
   at System.ServiceModel.Channels.Binding.BuildChannelListener[TChannel](Uri listenUriBaseAddress, String listenUriRelativeAddress, ListenUriMode listenUriMode, BindingParameterCollection parameters)
   at System.ServiceModel.Description.DispatcherBuilder.MaybeCreateListener(Boolean actuallyCreate, Type[] supportedChannels, Binding binding, BindingParameterCollection parameters, Uri listenUriBaseAddress, String listenUriRelativeAddress, ListenUriMode listenUriMode, ServiceThrottle throttle, IChannelListener& result, Boolean supportContextSession)
   at System.ServiceModel.Description.DispatcherBuilder.BuildChannelListener(StuffPerListenUriInfo stuff, ServiceHostBase serviceHost, Uri listenUri, ListenUriMode listenUriMode, Boolean supportContextSession, IChannelListener& result)
   at System.ServiceModel.Description.DispatcherBuilder.InitializeServiceHost(ServiceDescription description, ServiceHostBase serviceHost)
   at System.ServiceModel.ServiceHostBase.InitializeRuntime()
   at ...
I've been all over StackOverflow and the internet for 8 hours. Here's what I've done:
Ensured that ANONYMOUS LOGIN, Everyone, Network, Network Service, and Local Service have full control
Stopped the remote MSMQ server and observed what the WCF service does, and I get a different error - so I'm sure that the WCF service when starting up is speaking to the MSMQ server
Disabled Windows Firewall on both boxes and opened all ports via EC2 security groups
Set AllowNonauthenticatedRpc and NewRemoteReadServerAllowNoneSecurityClient to 1 in the registry
Configured MS DTC on both servers (the queue is transactional, but I get the same error regardless as to whether the queue is transactional or not)
Confirmed that the WCF server starts up fine if I use the local queue, and receives without a problem
Help!!! I can't scale my app without a remote queueing solution.
It's not clear from your post which tier cannot read and, more importantly, which queue.
However, reading remote queues transactionally is not supported:
Message Queuing supports sending transactional messages to remote queues, but does not support reading messages from a remote queue within a transaction. This means that reliable, exactly-once reception is not available from remote queues. See
Reading Messages from Remote Queues
I suspect that somewhere your system is still performing transactional remote reads even though you mentioned you disabled it.
From a best practice point of view, even if you got it to work, your design will not scale which is a shame as it is something you mentioned you wanted.
Remote reading is a high-overhead and therefore inefficient process. Including remote read operations in an application limits scaling.
You should always remote write, not remote read.
A better way is to insert a message broker or router service that acts as the central point for messaging. Your app and queue services (confusing names by the way) should merely read transactionally from their local queues.
i.e.
app should transactionally read its local queue
app should transactionally send to the remote broker
broker transactionally reads local queue
broker transactionally sends to remote queue
Similarly, if your queue tier wanted to reply, the reverse of the above process would occur.
Later if you wish to improve performance you can introduce a Dynamic router which redirects a message to a different remote queue on another machine based on dynamic rulesets or environmental conditions such as stress levels.
Remote transactional reads are supported as of MSMQ 4.0 (Windows Server 2008). If you are facing this issue, be sure to check out https://msdn.microsoft.com/en-us/library/ms700128(v=vs.85).aspx
I want to test out the possibility of queuing messages on remote clients which may or may not be connected; when connected, those clients will push the messages to an MSMQ over the internet that is hosted in IIS 6.
Now, I set up MSMQ on the Windows Server 2003 box hosting IIS. After I did this, "MSMQ" shows up in the IIS default web site.
Ok, then I added a new transactional private queue through Computer Management -> Message Queuing.
From there all I want to do is see messages stack up, I'll deal with those after this works.
Now, I made a client app that has the following code:
using (var contract = new HttpMsmqBridgeProxy())
{
var valueToSend = 2456;
contract.TestFunction(valueToSend);
Console.WriteLine("value sent: " + valueToSend + "\r\n");
}
Here's the app.config of this client:
<configuration>
<system.serviceModel>
<client>
<endpoint
address="net.msmq://**.**.***.228/private/MarksTestHttpQueue"
binding="netMsmqBinding"
bindingConfiguration="srmpBinding"
contract="HttpMsmqBridgeLibrary.IHttpMsmqBridgeContract">
</endpoint>
</client>
<bindings>
<netMsmqBinding>
<binding name="srmpBinding"
queueTransferProtocol="Srmp">
<security mode="None"/>
</binding>
</netMsmqBinding>
</bindings>
</system.serviceModel>
</configuration>
The IP is my public facing IP that works, I can host a wcf service or webpage just fine. I followed this guide somewhat for using srmpBinding.
http://msdn.microsoft.com/en-us/library/aa395217.aspx
So, in short, what happens when I run the app is that it succeeds and tells me the message was sent. I go into Message Queuing on my client and see that a new queue has shown up in the Outgoing folder, called:
Direct:http://**.**.***.228/msmq/private$/MarksTestHttpQueue
There are no outgoing messages waiting in this queue, so I assume the message was sent.
When I look at my msmq now on the winserver2003 there are no arrived queued messages waiting.
ETA: I can send messages to a non-transactional queue using the classic MessageQueue implementation:
var queue = new MessageQueue("FormatName:DIRECT=http://**.**.***.228/msmq/private$/nonTransQueue");
var msg = new System.Messaging.Message();
msg.Formatter = new ActiveXMessageFormatter();
msg.Body = "Testing";
queue.Send(msg);
The messages show up (after altering the mapping file in the system32/msmq/mapping directory) just fine. I'm wondering if because it's IIS6 I won't be able to send using the net.msmq binding.
You are correct in that your WCF service hosted in IIS6 won't be able to process the messages. This is because IIS6 doesn't use WAS which instantiates processes for non-http requests. But I think that this comes after everything you're doing in the workflow. I would expect
you run your client, pushing the message to the remote queue
the message appears in the remote queue
your WCF service does not pick up the message because it's hosted in IIS6, so you are left with a message in the remote queue.
I don't believe that IIS is involved at all up until the point where it wouldn't be working anyway.
A simple test for this is to self host your service on the server, e.g. run it in a console app. It will be able to accept MSMQ messages just as IIS7 would, and will remove that as a potential problem from your rig.
You might also want to test whether you can push a message directly from the client to a transactional queue on the server. If you're having problems sending messages to transactional queues on other machines then you can possibly check the MSDTC log. I don't envy having to delve into there.
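For that direct-to-transactional-queue test, the send itself must be transactional. A sketch with System.Messaging (the queue path is the poster's placeholder address):

```csharp
using System.Messaging;

class TransactionalSendDemo
{
    static void Main()
    {
        // Sending to a transactional queue requires an explicit transaction
        // (or MessageQueueTransactionType.Single); a non-transactional Send()
        // to a transactional queue is typically discarded rather than delivered.
        var queue = new MessageQueue(
            "FormatName:DIRECT=http://**.**.***.228/msmq/private$/MarksTestHttpQueue");
        using (var tx = new MessageQueueTransaction())
        {
            tx.Begin();
            queue.Send("Testing transactional send", tx);
            tx.Commit();   // the message is only delivered once the transaction commits
        }
    }
}
```

This requires the MSMQ feature installed on the machine running it, so it is a sketch to run on your test boxes rather than anywhere.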
I am working to speed up a large number of integration tests in our environment.
The problem I am facing right now is that during teardown between tests one WCF service using msmq binding takes about 1 minute to close down.
In the teardown process we loop over our service hosts calling the Close() method with a very short timeout, overriding the closeTimeout value in the WCF configuration. This works well for net.tcp bindings, but the one service that uses msmq still takes 1 minute to close down; the closeTimeout doesn't seem to have any effect.
The config look like this for the test service:
<netMsmqBinding>
<binding name="NoMSMQSecurity" closeTimeout="00:00:01" timeToLive="00:00:05"
receiveErrorHandling="Drop" maxRetryCycles="2" retryCycleDelay="00:00:01" receiveRetryCount="2">
<security mode="None" />
</binding>
</netMsmqBinding>
And the closing call I use is straight forward like this:
service.Close(new TimeSpan(0, 0, 0, 0, 10));
Is there another approach I can take to close down the servicehost faster?
As this is an automated test that at this point has succeded or failed I don't want to wait for any other unprocessed messages or similar.
Best regards,
Per Salmi
I found the cause of the delayed closing down of the service host using Msmq.
The reason for the long close times seems to be that the service host also uses another net.tcp based service with reliableSession enabled. The reliableSession settings had an inactivityTimeout of 5 minutes, which causes keep-alive infrastructure messages to be sent every half of that interval (2.5 minutes). This keep-alive interval seems to make the msmq-based service host hang around for 1-2 minutes, probably waiting for some of the keep-alive messages to arrive.
When I set the inactivityTimeout down to 5 seconds the shutdown of the msmq service completes in about 2.5 seconds. This makes the automatic integration tests pass a lot faster!
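For reference, the fix is just the inactivityTimeout attribute on the net.tcp binding's reliableSession element (the binding name here is illustrative):

```
<netTcpBinding>
  <binding name="FastTestShutdown">
    <reliableSession enabled="true" inactivityTimeout="00:00:05" />
  </binding>
</netTcpBinding>
```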
Could there be some transaction that is blocking the close?
Say, for example, there is an open transaction: if you close without committing it, the host will wait up to 1 minute for the transaction to time out before it can close.