Client closing TCP connection before receiving async WCF response

We have a Silverlight 4.0 application that makes many calls to our WCF services. We have lots of users and mostly it’s fine, but one user is having a problem that seems to be confined to a particular asynchronous call on a single (Windows 7) machine. It is always reproducible, but only for that call from that machine. The WCF request is sent but then after a pause the Silverlight app throws this:
Message: [HttpWebRequest_WebException_RemoteServer]
Arguments: NotFound
Stack Trace: at System.ServiceModel.AsyncResult.End[TAsyncResult](IAsyncResult result)
at System.ServiceModel.Channels.ServiceChannel.EndCall(String action, Object[] outs, IAsyncResult result)
at System.ServiceModel.ClientBase`1.ChannelBase`1.EndInvoke(String methodName, Object[] args, IAsyncResult result)
at MyContractClientChannel.EndApplicationSave(IAsyncResult result)
at MyServiceContract.EndApplicationSave(IAsyncResult result)
at MyServiceContractClient.OnEndApplicationSave(IAsyncResult result)
at System.ServiceModel.ClientBase`1.OnAsyncCallCompleted(IAsyncResult result)
The initiating call looks like this:
using (new OperationContextScope((IContextChannel)Client.InnerChannel))
{
    Client.ApplicationSaveAsync(request, callback);
}
Client.ApplicationSaveAsync is the usual auto-generated code that calls System.ServiceModel.ClientBase.InvokeAsync.
When we watch the conversation with Wireshark we see the request packets being sent (this takes about two seconds), followed by a delay of about ten seconds. Then comes the weird part. On a machine that works we see the response packets come down from the server, but on the problem machine the client suddenly sends an empty packet with the TCP FIN flag set (before any response is received). We can't understand why it would do this. When we try Fiddler instead of Wireshark, Fiddler reports "Session State: Aborted", which looks like just another interpretation of the same underlying problem (that the client has unexpectedly pulled the plug on the connection).
We would greatly appreciate any suggestions as to why this might be happening or what we can do to investigate it further.

The problem seemed to go away when we created a new Windows profile for the user.

Related

ASP.NET Core middleware with unbounded output

I have a piece of ASP.NET Core middleware that produces an unbounded amount of data when it takes over processing a request. It goes into a loop that calls await context.Response.Body.WriteAsync for as long as it is allowed to. If the caller stays connected, it expects to keep receiving data. This service is being hosted with Kestrel, and it seems to be working properly as described so far. What I am finding, though, is that when the caller disconnects, Kestrel doesn't seem to notice and continues pumping output from the middleware. That output isn't going anywhere: the memory usage of the process isn't going up, and netstat -an no longer shows the connection. But the middleware just keeps on chugging away.
For typical HTTP requests this wouldn't be a terribly serious issue, because most of the time the client doesn't disconnect when it has only read part of the response, and in those cases where it does, the response is finite in size anyway. But the pattern with this endpoint is that the data is conceptually infinite in length, and the caller stays connected for as long as it wants and then signals that it no longer wants further data by disconnecting.
These images illustrate the problem:
https://imgur.com/a/9Qp7VV3
How can I make it so that the middleware notices when the client disconnects?
I also posted this as an issue on the aspnetcore GitHub repo, and I got a reply that explained the problem and provided a solution:
https://github.com/dotnet/aspnetcore/issues/22156
Basically, for better or worse, the ASP.NET Core infrastructure suppresses all errors from write operations, and so if the request has been aborted, calling context.Response.Body.WriteAsync fails silently. I personally think there's a mistake in there somewhere, but the rationale given is that this reduces exception/log spam from behaviours over which the server has no control.
Because of this, if you're writing a loop like mine, you have to explicitly check whether the request has been aborted. The context provides a CancellationToken for this, context.RequestAborted, which is signalled when the client aborts the request. You can also use this CancellationToken for other work the handler is doing that isn't itself within the scope of the request context.
My data pump looks like this now:
while (true)
{
    // ReadAsync observes the abort token, so a client disconnect cancels a pending read.
    int bytesRead = await responseStream.ReadAsync(buffer, 0, buffer.Length, context.RequestAborted);

    if (bytesRead < 0)
        throw new Exception("I/O error");
    if (bytesRead == 0)
        break;

    // The write itself won't throw if the client has gone away (errors are suppressed),
    // so the abort token is checked explicitly after each write.
    await context.Response.Body.WriteAsync(buffer, 0, bytesRead);

    if (context.RequestAborted.IsCancellationRequested)
        break;

    _statusConsoleSender.NotifyRequestProgress(requestID, bytesRead);
}
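As a footnote to the last point above (not from the original answer): if the handler owns other work that should stop when the caller disconnects, the same token can also be hooked directly with a registration. A minimal sketch; the cleanup callback and NotifyRequestAborted are hypothetical names, not part of the original code:

// Hypothetical cleanup hook: runs once if/when the client aborts the request.
// CancellationToken.Register returns a registration that should be disposed
// when the request finishes normally, hence the using block.
using (context.RequestAborted.Register(() =>
{
    // Tear down anything the handler owns outside the request itself,
    // e.g. tell a backing producer to stop (illustrative only).
    _statusConsoleSender.NotifyRequestAborted(requestID);
}))
{
    // ... run the read/write pump shown above inside this scope ...
}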

Using the Asynchronous Programming Model (APM) in a WCF operation, what's going on "under the hood"?

Given an operation such as this:
public void DoSomething()
{
    IWcfServiceAgentAsync agent = new WcfServiceAgentProxy();
    var request = new DoSomethingRequest();

    agent.BeginDoSomething(request,
        iar =>
        {
            var response = agent.EndDoSomething(iar);
            /*
             * Marshal back on to UI thread with results
             */
        }, null);
}
What is really going on under the hood between the moment the operation is started and the moment the callback is executed? Is there a socket being polled while waiting for completion? Is there an underlying OS thread that stays blocked until it returns?
What happens is that BeginDoSomething ends up calling base.Channel.BeginDoSomething(request, callback, asyncState) on the WCF proxy. The proxy then passes the call through each part of the "binding stack" you have set up for your WCF communication.
One of the main parts of the binding stack your request passes through is the message encoder. The message encoder packages your request up into something that can be represented as a byte[] (this step is serialization).
Once through the message encoder, your request is handed to the transport (be it HTTP, TCP, or something else). The transport takes the byte[] and sends it to your target endpoint, and then tells the OS, via the I/O completion port (IOCP) system, "when you receive a response directed to me, call this function". (I will assume either the TCP or the HTTP binding for the rest of this answer.) (EDIT: I/O completion ports don't have to be used; that decision is up to the transport layer, it's just that most of the implementations built into the framework use them.)
In the time between your message being sent and the response being received, there are no threads, no polling, nothing. When the network card receives a response it raises an interrupt telling the OS it has new information; that interrupt is processed, and eventually the OS sees that it is a few bytes of data intended for your application. The OS then schedules an IOCP thread-pool thread in your application and passes it the few bytes of data that were received.
(See "There Is No Thread" by Stephen Cleary for more about this process. It discusses the Task-based pattern (async/await) rather than APM as in your example, but the underlying layers are exactly the same.)
When the bytes from the other computer are received, they go back up the stack in the opposite direction. The IOCP thread runs a function from the transport, which takes the bytes that were passed to it and hands them off to the message encoder. This can happen several times for a single response!
The message encoder receives the bytes from the transport and tries to build up a message; if not enough bytes have been received yet, it just queues them and waits for the next set to be passed in. Once it has enough bytes to deserialize the message, it creates a new "Response" object and sets it as the result of the IAsyncResult. Then something (I'm not sure exactly what; it may be the WCF call stack, it may be somewhere else in .NET) sees that your IAsyncResult has a callback delegate, starts up another IOCP thread, and that thread is the one your delegate runs on.
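As an aside (not part of the original answer): if you'd rather not manage the AsyncCallback yourself, the same Begin/End pair can be wrapped into a Task with TaskFactory.FromAsync. A minimal sketch, reusing the IWcfServiceAgentAsync proxy from the question; the completion still arrives on an IOCP thread exactly as described above, FromAsync just adapts the callback into task completion:

using System.Threading.Tasks;

public static class WcfServiceAgentExtensions
{
    // Wraps the APM Begin/End pair so callers can await the WCF operation.
    public static Task<DoSomethingResponse> DoSomethingAsync(
        this IWcfServiceAgentAsync agent, DoSomethingRequest request)
    {
        return Task<DoSomethingResponse>.Factory.FromAsync(
            (callback, state) => agent.BeginDoSomething(request, callback, state),
            agent.EndDoSomething,
            state: null);
    }
}

DoSomethingResponse is assumed to be whatever EndDoSomething returns in your generated proxy; substitute the actual type.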

WCF client causes server to hang until connection fault

The below text is an effort to expand and add color to this question:
How do I prevent a misbehaving client from taking down the entire service?
I have essentially this scenario: a WCF service is up and running with a client callback using straightforward, simple one-way communication, not very different from this one:
public interface IMyClientContract
{
    [OperationContract(IsOneWay = true)]
    void SomethingChanged(simpleObject myObj);
}
I'm calling this method potentially thousands of times a second from the service to what will eventually be about 50 concurrently connected clients, with as low latency as possible (<15 ms would be nice). This works fine until I set a breakpoint in one of the client apps connected to the server: after maybe 2-5 seconds the service hangs, and none of the other clients receive any data for about 30 seconds or so, until the service registers a connection fault event and disconnects the offending client. After this all the other clients continue on their merry way receiving messages.
I've done research on serviceThrottling, concurrency tweaking, setting thread-pool minimum threads, WCF secret sauces and the whole nine yards (the kind of throttling tweak I mean is sketched below), but at the end of the day the MSDN article "WCF Essentials: One-Way Calls, Callbacks and Events" describes exactly the issue I'm having without really making a recommendation.
The third solution that allows the service to safely call back to the client is to have the callback contract operations configured as one-way operations. Doing so enables the service to call back even when concurrency is set to single-threaded, because there will not be any reply message to contend for the lock.
but earlier in the article it describes the issue I'm seeing, only from a client perspective
When one-way calls reach the service, they may not be dispatched all at once and may be queued up on the service side to be dispatched one at a time, all according to the service configured concurrency mode behavior and session mode. How many messages (whether one-way or request-reply) the service is willing to queue up is a product of the configured channel and the reliability mode. If the number of queued messages has exceeded the queue's capacity, then the client will block, even when issuing a one-way call
I can only assume that the reverse is also true: the number of messages queued up for the client has exceeded the queue's capacity, and the thread pool is now filled with threads attempting to call this client, all of which are blocked.
What is the right way to handle this? Should I research a way to check how many messages are queued at the service communication layer per client and abort their connections after a certain limit is reached?
It almost seems that if the WCF service itself is blocking on a queue filling up then all the async / oneway / fire-and-forget strategies I could ever implement inside the service will still get blocked whenever one client's queue gets full.
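For reference, this is the kind of service-throttling tweak referred to above; a sketch only, with placeholder limits, and as noted it did not fix the underlying blocking:

using System.ServiceModel;
using System.ServiceModel.Description;

// Raise the service throttle programmatically before opening the host.
// The limits here are illustrative placeholders, not a recommendation.
var host = new ServiceHost(typeof(MyService));

var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
if (throttle == null)
{
    throttle = new ServiceThrottlingBehavior();
    host.Description.Behaviors.Add(throttle);
}

throttle.MaxConcurrentCalls = 256;
throttle.MaxConcurrentSessions = 256;
throttle.MaxConcurrentInstances = 256;

host.Open();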
I don't know much about the client callbacks, but it sounds similar to generic WCF blocking issues. I often solve these problems by spawning a BackgroundWorker and performing the client call in that thread. Meanwhile, the main thread tracks how long the child thread is taking; if the child has not finished within a few milliseconds, the main thread just moves on and abandons the thread (it eventually dies by itself, so there is no memory leak). This is basically what Mr. Graves suggests with the phrase "fire-and-forget".
Update:
I implemented a fire-and-forget setup for calling the clients' callback channels, and the server no longer blocks once a client's buffer fills up.
MyEvent is an event whose delegate matches one of the methods defined in the WCF callback contract. When clients connect, I'm essentially adding their callback to the event:
MyEvent += OperationContext.Current.GetCallbackChannel<IMyClientContract>().SomethingChanged;
etc. Then, to send this data to all clients, I'm doing the following:
// Serialize once using protobuf-net, then fan out to every subscribed callback.
using (var ms = new MemoryStream())
{
    ProtoBuf.Serializer.Serialize(ms, new SpecialDataTransferObject(inputData));
    byte[] data = ms.ToArray(); // ToArray(), not GetBuffer(): GetBuffer() can return unused trailing bytes
    Parallel.ForEach(MyEvent.GetInvocationList(), p => ThreadUtil.FireAndForget(p, data));
}
In the ThreadUtil class I made essentially the following change to the code defined in the fire-and-forget article:
static void InvokeWrappedDelegate(Delegate d, object[] args)
{
    try
    {
        d.DynamicInvoke(args);
    }
    catch (Exception ex)
    {
        // This will eventually throw once the client's WCF callback channel has filled up and
        // timed out, and it will throw once for every payload you ever tried to send them,
        // so do some smarter logging here!
        Console.WriteLine("Error calling client, attempting to disconnect.");
        try
        {
            // d.Target is an IContextChannel, kept in a dictionary of active connections,
            // cross-referenced by hash code just for this exact occasion.
            MyService.SingletonServiceController.TerminateClientChannelByHashcode(d.Target.GetHashCode());
        }
        catch (Exception ex2)
        {
            Console.WriteLine("Attempt to disconnect client failed: " + ex2.ToString());
        }
    }
}
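The FireAndForget entry point itself isn't shown here; it comes from the referenced article. As a rough sketch of its shape (an assumption on my part, not the article's exact code), it just queues the wrapped invocation onto the thread pool so the publishing thread never waits on a slow client:

using System;
using System.Threading;

public static class ThreadUtil
{
    // Queue the call onto the thread pool and return immediately,
    // so the caller (the Parallel.ForEach above) never blocks on WCF.
    public static void FireAndForget(Delegate d, params object[] args)
    {
        ThreadPool.QueueUserWorkItem(_ => InvokeWrappedDelegate(d, args));
    }

    static void InvokeWrappedDelegate(Delegate d, object[] args)
    {
        // ... the exception-handling wrapper shown above ...
    }
}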
I don't have any good ideas for how to go and kill all the pending calls the server is still waiting to deliver. Once I get the first exception I should in theory be able to go and terminate all the other requests queued up for that client somewhere, but this setup is functional and meets the objectives.

How to be notified if WCF Duplex session is prematurely closed

I have a publish/subscribe scenario in WCF using net.tcp and duplex callbacks. I have a number of clients that subscribe to the service, and this works fine. However, sometimes a client will close without unsubscribing (the client computer goes to sleep, the computer crashes, the network connection is aborted, etc.), and this causes an exception to be thrown when I call back via my callback list.
Now, I can certainly catch the exception and remove the offending callback, but this seems less like an exceptional scenario to me and more along the lines of "expected behavior".
Is there an event that gets fired on connection close that will notify me so that I can remove the callback from my list? Consider that this is net.tcp and not HTTP, so connection state should be known.
Clearly the framework knows the connection has been closed and disposed because the exception is something along the lines of "attempt to call a disposed object".
EDIT:
I should point out, that this is not a long running transaction. It's a long running connection in a publish/subscribe scenario. Basically, the callback is used to notify transient subscribers of various events as they happen. Each event is isolated and not long running.
It has been a while and this is from memory, so I could be wrong, but I think that if you make an IEndpointBehavior that goes and pokes at the DispatchRuntime to add an IInputSessionShutdown handler, you can get notified when the session channel ends.
http://msdn.microsoft.com/en-us/library/system.servicemodel.dispatcher.dispatchruntime.inputsessionshutdownhandlers.aspx
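Again from memory, so very much a sketch rather than tested code, the shape of that behavior would be roughly the following; the handler names and what you do inside them are assumptions:

using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

// Called by WCF when a sessionful input channel finishes or faults.
public class SubscriberShutdownHandler : IInputSessionShutdown
{
    public void ChannelFaulted(IDuplexContextChannel channel)
    {
        // Connection died without a clean close: remove this channel's callback from your list.
    }

    public void DoneReceiving(IDuplexContextChannel channel)
    {
        // Client closed the session cleanly.
    }
}

// Endpoint behavior that plugs the handler into the dispatch runtime.
public class SubscriberShutdownBehavior : IEndpointBehavior
{
    public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher endpointDispatcher)
    {
        endpointDispatcher.DispatchRuntime.InputSessionShutdownHandlers.Add(new SubscriberShutdownHandler());
    }

    public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection bindingParameters) { }
    public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime) { }
    public void Validate(ServiceEndpoint endpoint) { }
}

The behavior would then be added to the service endpoint before opening the host, e.g. endpoint.Behaviors.Add(new SubscriberShutdownBehavior()).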

Is there a way to tell WCF to use security in the request, but ignore it on the response?

We have to connect to a third party SOAP service and we are using WCF to do so. The service was developed using Apache AXIS, and we have no control over it, and have no influence to change how it works.
The problem we are seeing is that it expects the requests to be formatted using Web Services Security, so we are doing all the correct signing, etc. The response from the 3rd party however, is not secured. If we sniff the wire, we see the response coming back fine (albeit without any timestamp, signature etc.).
The underlying .NET components treat this as an error because they see it as a security issue, so we don't actually receive the SOAP response as such. Is there any way to configure the WCF framework to send secure requests but not to expect security fields in the response? Looking at the OASIS specs, they don't appear to mandate that responses must be secured.
For information, here's the exception we see:
System.ServiceModel.Security.MessageSecurityException was caught
Message="Security processor was unable to find a security header in the message. This might be because the message is an unsecured fault or because there is a binding mismatch between the communicating parties. This can occur if the service is configured for security and the client is not using security."
Source="mscorlib"
StackTrace:
Server stack trace:
at System.ServiceModel.Security.TransportSecurityProtocol.VerifyIncomingMessageCore(Message& message, TimeSpan timeout)
at System.ServiceModel.Security.TransportSecurityProtocol.VerifyIncomingMessage(Message& message, TimeSpan timeout)
at System.ServiceModel.Security.SecurityProtocol.VerifyIncomingMessage(Message& message, TimeSpan timeout, SecurityProtocolCorrelationState[] correlationStates)
at System.ServiceModel.Channels.SecurityChannelFactory`1.SecurityRequestChannel.ProcessReply(Message reply, SecurityProtocolCorrelationState correlationState, TimeSpan timeout)
at System.ServiceModel.Channels.SecurityChannelFactory`1.SecurityRequestChannel.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Dispatcher.RequestChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
Incidentally, I've seen plenty of posts stating that if you leave the timestamp out, then the security fields will not be expected. This is not an option - The service we are communicating with mandates timestamps.
Microsoft has a hotfix for this functionality now.
http://support.microsoft.com/kb/971493
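If I remember the hotfix correctly (please verify against the KB article; this is a hedged sketch, not something I have re-tested), what it adds is an EnableUnsecuredResponse switch on the security binding element of a custom binding, i.e. the knob that tells WCF to secure the request but accept an unsecured response:

using System.ServiceModel.Channels;

// Sketch only: a custom binding whose security element signs the request (with a timestamp,
// since the service mandates one) but does not require WS-Security headers on the response.
// EnableUnsecuredResponse is the setting the hotfix / later framework versions expose;
// check the exact name and availability against the KB for your framework version.
var security = SecurityBindingElement.CreateMutualCertificateBindingElement();
security.IncludeTimestamp = true;
security.EnableUnsecuredResponse = true;

var binding = new CustomBinding(
    security,
    new TextMessageEncodingBindingElement(),
    new HttpTransportBindingElement());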
Funny you should ask this question. I asked Microsoft how to do this about a year ago. At the time, using .NET 3.0, it was not possible; I'm not sure whether that changed in the 3.5 world. But no, there was no supported way of securing the request while leaving the response unsecured.
At my previous employer we used a model that required a WS-Security header using certificates on the request but the response was left unsecured.
You can do this with ASMX web services and WSE, but not with WCF v3.0.
There is a good chance you will not be able to get away with configuration alone. I had to do some integration work with Axis (our end was WSE3, WCF's ancestor), and I had to write some code and stick it into WSE3's pipeline to massage the response from Axis before passing it on to WSE3. The good news is that adding these handlers to the pipeline is fairly straightforward, and once in the handler you just get an instance of a SoapMessage and can do anything you want with it (like removing the timestamp, for example).