WCF service hosted on Azure App Service never seems to finish threads opened for processing

I have deployed a WCF service to Azure App Service that performs just one task: send a message to a Service Bus topic. Although the app works fine under normal load, its thread count climbs as soon as the load increases.
The app instance becomes unhealthy once the thread count limit is reached.
Those threads stay in a waiting state forever. We tried the scale-out option on the thread count metric, but the app just keeps adding more instances, because the earlier instances still have almost all of their threads waiting and remain unhealthy forever.
The service performs the following sequence:
Accept a request.
Initialize a Service Bus topic client.
Send the requested message to the topic.
Close the topic client.
While sending a burst of 1000 requests, the app works, but the threads it spins up stay in the waiting state. While these threads are waiting, CPU stays at 0%. The average response time from the service is also under 100 ms.
After sending 1000 requests to this service, I see a similar number of threads left open.
What could be the potential root cause of this issue? Is there any issue with my code to send the message to the topic?
public async Task SendAsync(Message message)
{
    try
    {
        await _topicClient.SendAsync(message);
    }
    catch (Exception exc)
    {
        throw new Exception(exc.Message);
    }
    finally
    {
        await _topicClient.CloseAsync();
    }
}
(screenshot of the thread count metric omitted)

The code sample you provided does not really tell us much. We do not know how SendAsync(Message message) is being invoked. Is your image the queue count that drops to 0 before accepting more messages? I'm assuming a client calls your WCF app service, which then sends the message to Service Bus?
It does sound like you are hitting the 1000-connection maximum. Your _topicClient should be a singleton for your app domain that all requests share. You should also only need one App Service instance if all you're doing is message forwarding; there is no need for scaling unless there's more processing that you haven't alluded to.
Have a look at the Service Bus messaging best practices doc for more suggestions.
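A minimal sketch of a shared client, assuming the Microsoft.Azure.ServiceBus TopicClient that the question's code appears to use (the connection string, topic name, and holder class are placeholders):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public static class TopicClientHolder
{
    // One client per app domain; the underlying AMQP connection is reused across requests.
    private static readonly Lazy<TopicClient> _client =
        new Lazy<TopicClient>(() => new TopicClient("<connection-string>", "<topic-name>"));

    public static TopicClient Instance => _client.Value;
}

public class MessageSender
{
    public Task SendAsync(Message message)
    {
        // Reuse the shared client; do not call CloseAsync per request.
        // Close it once, on application shutdown, if at all.
        return TopicClientHolder.Instance.SendAsync(message);
    }
}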

Thanks for responding. These are good suggestions and I will review my implementation in line with them.
The good news is that I was able to resolve the issue. It wasn't related to the topic client as I first thought; it was due to how I was registering dependency injection.
I am implementing a WCF service on .NET Framework 4.8. Initially we did not include a Global.asax but registered DI in the service constructor. The implementation worked until performance testing showed that it added extra threads once we added an ILogger dependency. Those extra threads never wound down; they kept accumulating as the service received more requests.
To resolve, I moved DI registration into Application_Start in global.asax.
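A rough sketch of that shape, assuming Microsoft.Extensions.DependencyInjection is the container (ITopicSender and TopicSender are illustrative placeholders, not the actual types):

using System;
using Microsoft.Extensions.DependencyInjection;

public class Global : System.Web.HttpApplication
{
    public static IServiceProvider ServiceProvider { get; private set; }

    protected void Application_Start(object sender, EventArgs e)
    {
        // Build the container once per app domain instead of once per service instantiation.
        var services = new ServiceCollection();
        services.AddSingleton<ITopicSender, TopicSender>();
        ServiceProvider = services.BuildServiceProvider();
    }
}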

Related

Testing with in-memory NServiceBus

I'm attempting to create a high-level test in my solution, and I want to 'catch' messages sent to the bus.
Here's what I do:
nUnit [SetUp] spins up the WebAPI project in IISExpress
SetUp also creates the bus
Send an HTTP request to the API
Verify whatever I want to verify
The WebAPI part of the whole test works fine. The creation of the bus and kicking it off seems great too. It even finds my fake message handler. The problem is that the handler never receives the commands from the queue; they just stay in the RabbitMQ queue forever.
Here's how the bus is being configured:
var bus = Configure.With()
    .DefineEndpointName("Local")
    .Log4Net()
    .UseTransport<global::NServiceBus.RabbitMQ>()
    .UseInMemoryTimeoutPersister()
    .RijndaelEncryptionService()
    .UnicastBus()
    .CreateBus();
In the log from NServiceBus starting up, I see that my fake handler is being associated with the command:
2014-09-24 15:29:59,007 [Runner thread] DEBUG NServiceBus.Unicast.MessageHandlerRegistry
[(null)] <(null)> - Associated 'Bloo.MyCommand' message with 'Blah.FakeMyCommandHandler' handler
So seeing as the message lands in the correct RabbitMQ queue, I'm assuming everything up until the handler point is working fine.
I've tried putting waits in my [TearDown] so that the bus lives a little longer, hoping to give the handler time to receive the message. I've also tried spinning the in-memory bus for the consumer part of the interaction off onto a new thread, with no luck.
Has anyone else tried this?
This is only the first step, what I would love to do is create a fake bus that records messages being sent to it. The need for RabbitMQ is just to get myself going (the bounds of my solution are WebAPI on the front and the bus at the back).
Cheers
You forgot to call .Start() on the bus; that's why it isn't listening for messages.
See here for more info: http://docs.particular.net/nservicebus/hosting-nservicebus-in-your-own-process-v4.x
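For illustration, a sketch of the fix against the configuration shown in the question (in NServiceBus 4.x, CreateBus() returns an IStartableBus and Start() is what actually begins listening):

var bus = Configure.With()
    .DefineEndpointName("Local")
    .UseTransport<global::NServiceBus.RabbitMQ>()
    .UnicastBus()
    .CreateBus()
    .Start();   // without Start(), the endpoint never begins consuming messages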
Also, consider using NServiceBus.Testing for unit testing your handlers and sagas:
https://www.nuget.org/packages/NServiceBus.Testing
I'm guessing your messages are just sitting in the queue forever because your endpoint is listening on the "Local.MachineName" queue instead of "Local".
If you set the ScaleOut to use a single broker queue, this should sort the issue:

Configure.ScaleOut(s => s.UseSingleBrokerQueue());
var bus = Configure.With()
    .DefineEndpointName("Local")
    ...
If you are attempting to do full integration tests, using actual queues, then this answer won't help you.
If you are doing more focused tests, i.e. testing individual components that rely on the bus, I would recommend that you use a mocking framework (I like Moq) and mock out IBus. You can then verify that messages you expected to be sent to the bus were indeed sent.
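A minimal sketch of that approach, assuming Moq; MessagePublishingComponent and MyCommand are placeholder names for your own types, and the exact Send overload to verify depends on the IBus version you compile against:

using Moq;
using NServiceBus;

var busMock = new Mock<IBus>();
var componentUnderTest = new MessagePublishingComponent(busMock.Object);

componentUnderTest.DoWork();

// Verify the expected command was handed to the bus rather than inspecting a real queue.
busMock.Verify(b => b.Send(It.IsAny<MyCommand>()), Times.Once());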

WCF client causes server to hang until connection fault

The below text is an effort to expand and add color to this question:
How do I prevent a misbehaving client from taking down the entire service?
I have essentially this scenario: a WCF service is up and running with a client callback using straightforward, simple one-way communication, not very different from this one:
public interface IMyClientContract
{
    [OperationContract(IsOneWay = true)]
    void SomethingChanged(simpleObject myObj);
}
I'm calling this method potentially thousands of times a second from the service to what will eventually be about 50 concurrently connected clients, with as low latency as possible (<15 ms would be nice). This works fine until I set a breakpoint on one of the client apps connected to the server. Then, after maybe 2-5 seconds, the service hangs and none of the other clients receive any data for about 30 seconds or so, until the service registers a connection fault event and disconnects the offending client. After that, all the other clients continue on their merry way receiving messages.
I've done research on serviceThrottling, concurrency tweaking, setting thread pool minimum threads, WCF secret sauces, and the whole nine yards, but at the end of the day the MSDN article "WCF Essentials: One-Way Calls, Callbacks and Events" describes exactly the issue I'm having without really making a recommendation:
The third solution that allows the service to safely call back to the client is to have the callback contract operations configured as one-way operations. Doing so enables the service to call back even when concurrency is set to single-threaded, because there will not be any reply message to contend for the lock.
but earlier in the article it describes the issue I'm seeing, only from the client's perspective:
When one-way calls reach the service, they may not be dispatched all at once and may be queued up on the service side to be dispatched one at a time, all according to the service configured concurrency mode behavior and session mode. How many messages (whether one-way or request-reply) the service is willing to queue up is a product of the configured channel and the reliability mode. If the number of queued messages has exceeded the queue's capacity, then the client will block, even when issuing a one-way call
I can only assume that the reverse is also true: the number of messages queued to the client has exceeded the queue's capacity, and the thread pool is now filled with blocked threads attempting to call that client.
What is the right way to handle this? Should I research a way to check how many messages are queued at the service communication layer per client and abort their connections after a certain limit is reached?
It almost seems that if the WCF service itself is blocking on a queue filling up then all the async / oneway / fire-and-forget strategies I could ever implement inside the service will still get blocked whenever one client's queue gets full.
I don't know much about client callbacks, but this sounds similar to generic WCF blocking issues. I often solve these problems by spawning a BackgroundWorker and performing the client call on that thread. Meanwhile, the main thread counts how long the child thread is taking. If the child has not finished within a few milliseconds, the main thread just moves on and abandons the thread (it eventually dies by itself, so there's no memory leak). This is basically what Mr. Graves suggests with the phrase "fire-and-forget".
Update:
I implemented a fire-and-forget setup for calling the clients' callback channels, and the server no longer blocks when a client's buffer fills up.
MyEvent is an event whose delegate matches one of the methods defined in the WCF client contract. When clients connect, I essentially add their callback to the event:
MyEvent += OperationContext.Current.GetCallbackChannel<IFancyClientContract>().SomethingChanged;
etc. Then, to send the data to all clients, I do the following:
// serialize using protobuf-net
using (var ms = new MemoryStream())
{
    ProtoBuf.Serializer.Serialize(ms, new SpecialDataTransferObject(inputData));
    byte[] data = ms.ToArray(); // ToArray() copies only the written bytes; GetBuffer() can include unused trailing capacity
    Parallel.ForEach(MyEvent.GetInvocationList(), p => ThreadUtil.FireAndForget(p, data));
}
In the ThreadUtil class, I made essentially the following change to the code from the fire-and-forget article:
static void InvokeWrappedDelegate(Delegate d, object[] args)
{
    try
    {
        d.DynamicInvoke(args);
    }
    catch (Exception ex)
    {
        // This will eventually throw once the client's WCF callback channel has filled up
        // and timed out, and it will throw once for every payload you ever tried to send
        // them, so do some smarter logging here!
        Console.WriteLine("Error calling client, attempting to disconnect.");
        try
        {
            // The target is an IContextChannel kept in a dictionary of active connections,
            // cross-referenced by hash code just for this exact occasion.
            MyService.SingletonServiceController.TerminateClientChannelByHashcode(d.Target.GetHashCode());
        }
        catch (Exception ex2)
        {
            Console.WriteLine("Attempt to disconnect client failed: " + ex2.ToString());
        }
    }
}
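The FireAndForget helper itself isn't shown here; a purely hypothetical sketch of what it might look like, following the article's idea of queuing the wrapped invoke on the thread pool so the broadcasting thread returns immediately:

using System;
using System.Threading;

public static class ThreadUtil
{
    // Hypothetical wrapper: hand the call off to the thread pool and return at once.
    public static void FireAndForget(Delegate d, params object[] args)
    {
        ThreadPool.QueueUserWorkItem(_ => InvokeWrappedDelegate(d, args));
    }

    static void InvokeWrappedDelegate(Delegate d, object[] args)
    {
        // body as shown above
    }
}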
I don't have any good ideas for how to kill all the pending messages the server is still waiting to deliver. Once I get the first exception, I should in theory be able to terminate all the other requests queued up somewhere, but this setup is functional and meets the objectives.

WCF Service hangs on the 14th call

I'm having a problem where the WCF service hangs after 13-14 asynchronous calls from the client. This happens every time. The client is a mobile JavaFX app. There is no specific error output on the server or on the client. Someone suggested that it might be a throttling issue.
I've raised the service-side .config parameter maxConcurrentCalls from 10 to 500:
<serviceThrottling maxConcurrentCalls="500" maxConcurrentSessions="500" />
So this means it should be able to accept more than 10 calls, right? However, it didn't resolve the issue; the service still hangs on the 13th or 14th call.
Only one client is connecting to this web service.
What do you think is wrong?
Do you close the client after doing your call?
When I encountered this problem, I did not close it, and the open requests blocked the service after a short time.
Edit: Ok, I know nothing about JavaFX =) The code below is C#, sorry. But you can surely do something similar.
Use either

WcfClient client = new WcfClient();
// ...
client.Close();

or

using (WcfClient client = new WcfClient())
{
    // ...
}
Similar problem here: I have an app calling from one process to another, locally, over named pipes.
The calls are really light in code: basically it takes an array of serializable objects and queues them on the other side. Occasionally it hangs and restarts after a timeout. No data is lost, but since the data is financial and the receiving app is an automated trading system, that may result in very bad financial issues. I have not been able to reproduce it yet.
This could very easily be caused by a deadlock condition in your code. If your service locks up and starts eating 100% of the CPU, you have a deadlock. Create a dump file and see where your code was at.
I ran into the same issue in my first WCF app; it was a dictionary in my logging code that I wasn't synchronizing.
SvcTraceViewer is super helpful in figuring out tough WCF issues.

WCF Server Push connectivity test. Ping()?

Using techniques as hinted at in:
http://msdn.microsoft.com/en-us/library/system.servicemodel.servicecontractattribute.callbackcontract.aspx
I am implementing a ServerPush setup for my API to get real-time notifications of events from a server (no polling). Basically, the server has RegisterMe() and UnregisterMe() methods, and the client has a callback method called Announcement(string message) that the server can call through WCF's CallbackContract mechanism. This seems to work well.
Unfortunately, in this setup, if the Server were to crash or is otherwise unavailable, the Client won't know since it is only listening for messages. Silence on the line could mean no Announcements or it could mean that the server is not available.
Since my goal is to reduce polling rather than to gain immediacy, I don't mind adding a void Ping() method on the server alongside RegisterMe() and UnregisterMe() that merely exists to test connectivity to the server. Periodically calling this method would, I believe, ensure that we're still connected (and also that no Announcements have been dropped by the transport, since this is TCP).
But is the Ping() method necessary, or is this connectivity test otherwise available as part of WCF by default, like a serverProxy.IsStillConnected() or something? As I understand it, the channel's State would only return Faulted or Closed AFTER a failed Ping(), not instead of one.
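For reference, a minimal sketch of the contract shape described above, using the method names from the question (the interface names are illustrative):

using System.ServiceModel;

[ServiceContract(CallbackContract = typeof(IAnnouncementCallback))]
public interface IAnnouncementService
{
    [OperationContract] void RegisterMe();
    [OperationContract] void UnregisterMe();
    [OperationContract] void Ping();   // no-op whose only job is to prove the channel is still alive
}

public interface IAnnouncementCallback
{
    [OperationContract(IsOneWay = true)]
    void Announcement(string message);
}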
2) From a broader perspective, is this callback approach solid? This is not for HTTP or AJAX; the number of connected clients will be small (tens of clients, max). Are there serious problems with this approach? Since this seems to be a mild risk, how can I prevent a slow or malicious client from blocking the server by not processing its callback queue fast enough? Is there a timeout specific to the callback that I can set without affecting other operations?
Your approach sounds reasonable; here are some links that may or may not help (they are not exactly related):
Detecting Client Death in WCF Duplex Contracts
http://tomasz.janczuk.org/2009/08/performance-of-http-polling-duplex.html
Having some health check built into your application protocol makes sense.
If you are worried about malicious clients, then add authorization.
The second link I shared above has a sample pub/sub server; you might be able to use that code. A couple of things to watch out for: consider pushing notifications via async calls or on a separate thread, and set the sendTimeout on the TCP binding.
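For example, a sketch of setting that timeout in code, assuming a NetTcpBinding (the value is illustrative):

// A short send timeout makes a slow or dead callback client fault quickly
// instead of tying up the thread that is pushing notifications.
var binding = new NetTcpBinding();
binding.SendTimeout = TimeSpan.FromSeconds(5);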
HTH
I wrote a WCF application and encountered a similar problem. My server checked that clients had not 'pulled the plug' by periodically sending them a ping. The actual send method (asynchronous, since this was a server) had a timeout of 30 seconds. The client simply checked that it received the data every 30 seconds, while the server would catch an exception if the timeout was reached.
Authorisation was required to connect to the server (using the built-in WCF feature that forces the connecting party to call a particular method first), so from a malicious-client perspective you could easily add code to check for and ban their account if they do something suspicious, while disconnecting users who do not authenticate.
As the server I wrote was asynchronous, there wasn't any way to really block it. I guess that addresses your last point, as the asynchronous send method fires off the ping (and any other sending of data) and returns immediately. In the SendEnd method it would catch the timeout exception (sometimes multiple for the client) and disconnect them, without any blocking or freezing of the server.
Hope that helps.
You could use a publisher / subscriber service similar to the one suggested by Juval:
http://msdn.microsoft.com/en-us/magazine/cc163537.aspx
This would allow you to persist the subscribers if losing the server is a typical scenario. The publish method in this example also calls each subscriber on a separate thread, so a few dead subscribers will not block the others...

WCF polling, background processing, and resource starvation

I have a web service, implemented with WCF and hosted in IIS7, with a submit-poll communication pattern. An initial request is made, which returns quickly and kicks off a background process. The client polls for the status of the background process. This interface is set and can't be changed (it's a simulation of an external service we depend on).
I implemented the background processing by adding another service contract to the existing service with a one-way message contract that starts the long-running process. The "background" service keeps a database updated with the status in order to communicate with the main service. This avoids creating any new web services or items to deploy.
The problem is that the background process is very CPU intensive, and it seems to be starving out the other service calls. It will take up an entire processor, and while a single instance of the background process is running, status-polling calls to the main service can take over a minute. I don't care how long the background process takes.
Is there any way to throttle the resource usage of the background method? Or an obvious way to do long running async processes in WCF without changing my submit/poll service contract? Would separating them into different web services help if the two services were still running on the same server?
The first thing I would try would be to lower the priority.
If you're actually spinning off a separate process for the background work, then you can do it like this:
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.BelowNormal;
If it's really just a background thread, use this instead (from within the thread):
Thread.CurrentThread.Priority = ThreadPriority.BelowNormal;
(Actually, it's better to start the thread suspended and change the priority at the caller before running it, but it's generally OK to lower your own priority.)
At the very least it should help determine whether or not it's really a CPU issue. If you still have problems after lowering the priority then it might be something else that's getting starved, like file or network I/O.
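As a concrete sketch of the thread-based option, the worker could be created with its priority lowered before it is started (RunLongSimulation is a placeholder for the CPU-intensive work):

using System.Threading;

// Run the long-running work on a dedicated, below-normal-priority thread so
// the status-polling WCF calls keep getting CPU time.
var worker = new Thread(() => RunLongSimulation())
{
    IsBackground = true,
    Priority = ThreadPriority.BelowNormal
};
worker.Start();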