I have an app which tracks vehicles. Vehicles can change location, appear, or disappear at any time. In order to stay up to date, every 3 seconds the app sends the server the region currently visible on the map, and the server responds with a list of the vehicles in that area.
Problem: What happens when I have a database of 1000 vehicles and 10000 requests being sent to the server every 3 seconds? How would you solve this scalability issue with WCF?
There are a couple of things to do.
On Client-Side
As Joachim said, try to limit requests from the client side. I am not sure that a vehicle will move significantly every 3 seconds. Also, try to combine positions and other information into a single batched request.
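For instance, a minimal sketch of that client-side limiting: back off the polling interval while nothing is changing, and reset it as soon as something does. (All names here are hypothetical and only illustrate the idea, not a real API.)

    using System;
    using System.Threading.Tasks;

    public class AdaptivePoller
    {
        private static readonly TimeSpan BaseInterval = TimeSpan.FromSeconds(3);
        private static readonly TimeSpan MaxInterval = TimeSpan.FromSeconds(15);

        // 'fetch' performs the actual request and returns the raw response body.
        public async Task RunAsync(Func<Task<string>> fetch)
        {
            TimeSpan interval = BaseInterval;
            string lastResponse = null;

            while (true)
            {
                string response = await fetch();
                if (response == lastResponse)
                {
                    // Nothing moved since last time: double the interval, capped at 15 s.
                    interval = TimeSpan.FromSeconds(
                        Math.Min(interval.TotalSeconds * 2, MaxInterval.TotalSeconds));
                }
                else
                {
                    // Something changed: go back to the fast 3-second polling.
                    interval = BaseInterval;
                    lastResponse = response;
                    // ... update the map markers here ...
                }
                await Task.Delay(interval);
            }
        }
    }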
On Server-Side
Problem: What happens when I have a database of 1000 vehicles and 10000 requests being sent to the server every 3 seconds? How would you solve this scalability issue with WCF?
The best way to answer this question is to run a load test. The results depend heavily on your service implementation. If a request takes more than 1 second, you will certainly have performance problems.
You can also add a queue behind your service to handle requests, and even deploy the service on several servers and dispatch requests between them.
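As a sketch of the "queue behind your service" idea, here is a bounded in-memory queue with a fixed pool of workers, so bursts are absorbed instead of spawning unbounded concurrent work (hypothetical names; a production system would more likely use MSMQ or another dedicated queue):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public class RequestQueue
    {
        // Bounded: once 1000 items are pending, Enqueue blocks instead of
        // letting requests pile up without limit.
        private readonly BlockingCollection<Action> _work =
            new BlockingCollection<Action>(boundedCapacity: 1000);

        public RequestQueue(int workerCount)
        {
            for (int i = 0; i < workerCount; i++)
            {
                Task.Factory.StartNew(() =>
                {
                    // Each worker drains the queue one job at a time.
                    foreach (Action job in _work.GetConsumingEnumerable())
                        job();
                }, TaskCreationOptions.LongRunning);
            }
        }

        public void Enqueue(Action job)
        {
            _work.Add(job);
        }
    }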
I built a set of 3 APIs using ASP.NET Web API 2, self-hosted using OWIN in an Azure Cloud Service worker role.
The Worker Role is exposed to the internet with a custom domain.
Each API has a single controller, doing some normal dictionary operations, table calls and Azure Redis calls. One request in two just does a single Redis call and returns in around 10ms.
The average call that goes through all the API code takes around 150ms.
The response is a JSON object of around 10 KB in size.
Everything works fine, but I have a problem.
I'm seeing around 25 peak connections per second and no more than 2 million requests per day, and I can barely get the CPU below 40% with 3 Azure D2_V2 instances (2 cores, 8 GB RAM each) running.
I'm in trouble because I'm spending almost $1.5k a month on an API serving just 15-25 calls per second.
If I remove or scale down an instance, the CPU goes up to 55-60%, Redis and Azure Table calls slow down a lot, and an API request takes 3-5 seconds to come back.
I tried everything to the best of my abilities. I thought it could be bots or a DDoS attack, so I installed the WebApiThrottle NuGet package and set a maximum of 1 request per IP per second.
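The registration was essentially the stock setup from the WebApiThrottle README (reproduced here from memory, so treat the exact names as approximate):

    config.MessageHandlers.Add(new ThrottlingHandler()
    {
        // At most 1 request per second per client IP.
        Policy = new ThrottlePolicy(perSecond: 1)
        {
            IpThrottling = true
        },
        // In-memory store, since the APIs are self-hosted under OWIN.
        Repository = new MemoryCacheRepository()
    });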
Nothing changed.
I reviewed all the code and configuration to cut unoptimized parts, but one call in two just calls Redis and returns, and the others are very clean and simple C#, returning in 150ms with 2 Azure Table calls plus 1 Azure Queue operation.
The API controllers are async; everything is async.
I enabled profiling: the CPU time is concentrated in the main Azure process and in the Redis Get method; nothing else relevant there, no bottlenecks.
I enabled diagnostics: no errors.
I installed Application Insights, and there I see something strange that I cannot tell is normal or not.
I see this IP, 13.88.23.0, making thousands of requests to the APIs with query-string values generally used in normal requests. A lot of them fail.
This IP is Azure itself; why is it calling the API?
Some of these requests are stuck for minutes; I can see that from the Application Insights panel, and it's always the same IP.
Then I see the remaining logs, dependencies, etc.; nothing relevant.
Apart from that, what could I do to understand the problem?
I can't believe it is normal to consume so many CPU resources for an API with just 2 million calls a day, or is it?
Is there an additional profiling technique I could use?
Based on your experience, how many API calls should I be able to serve with 3 dual-core 8 GB RAM servers in normal conditions? (I'm assuming there is something wrong in my configuration.)
Thanks
UPDATE
I separated the APIs into two cloud services, 2 in one and 1 in the other.
I still see calls in Application Insights from another IP belonging to Microsoft.
I suppose this is normal: probably Application Insights cannot detect the real IP of the client, since this is a Worker Role, and shows the internal one instead.
But the problem of having to use so much power for so few calls remains.
Any thoughts on that?
On a penny auction site, there are a few fundamental requests that happen over time, namely:
Bidding request (when someone places a bid)
Timer updates
Leading bidder updates
I am trying to understand long polling a bit better and I'm stuck on this. As far as I know, long polling is there to reduce the number of Ajax requests, i.e. by only having ONE for visual updates and ONE for actions. So, therefore:
the bidding request (to place bids) will remain as is, but all the visual update requests will be combined into one "long poll" request, right?
When the user connects to the site for the first time, he will request the current state of the page, also passing in what he was last told the state was. The server will compare that with what the state should be and, if they differ, pass the new state back to the user, correct?
When the state is passed back, the LONG POLL effectively stops, the screen is updated, and a new LONG POLL is started, correct?
Is this understanding correct so far?
If so, how does this in any way decrease the number of requests to the backend, if the server still has to compare the state?
How will this help if the page is opened in 50 different windows by one user?
Long polling is used to simulate a connection in which the server pushes data to the client (rather than what is actually happening, which is the client requesting the information from the server). Basically, the client requests data from the server, but rather than returning data to the client immediately, the server 'holds' the request open; it can then return data at a later point in time, which can be used to simulate the server updating the client in 'real time'.
So in your example of an auction site, the client might long-poll the server for an item's bid amount; the server holds this request open, and when the bid value on that item changes, it returns the updated amount to the client. This gives the impression of the server updating the client as the bid amount changes.
As far as requests to the server go, this depends very much on how it is implemented. Obviously long polling will reduce the number of requests compared with, say, having the client issue a new 'standard' request every second to check for updates. Multiple instances of the client will still result in multiple requests to the server, and the server still has to deal with the overhead of holding the long-polling requests open and responding to them when appropriate. Different servers and server architectures deal with this more effectively than others.
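To make the 'holding' part concrete, here is a minimal server-side sketch in C#, assuming an in-memory version counter for one page's state (all names are hypothetical, and a real implementation would need per-item state and cleanup):

    using System;
    using System.Threading.Tasks;

    public class AuctionPageState
    {
        private readonly object _lock = new object();
        private TaskCompletionSource<int> _changed = NewTcs();
        public int Version { get; private set; }

        private static TaskCompletionSource<int> NewTcs()
        {
            // RunContinuationsAsynchronously keeps waiters off the publisher's thread.
            return new TaskCompletionSource<int>(
                TaskCreationOptions.RunContinuationsAsynchronously);
        }

        // Called whenever a bid, timer tick, or leader change updates the page.
        public void Publish()
        {
            TaskCompletionSource<int> old;
            lock (_lock)
            {
                Version++;
                old = _changed;
                _changed = NewTcs();
            }
            old.SetResult(Version); // releases every request currently held open
        }

        // A long-poll handler awaits this: it completes as soon as the state
        // moves past the version the client already has, or after the timeout
        // (the client then simply starts a new long poll).
        public async Task<int> WaitForChangeAsync(int knownVersion, TimeSpan timeout)
        {
            Task<int> changed;
            lock (_lock)
            {
                if (Version > knownVersion)
                    return Version; // already newer: answer immediately
                changed = _changed.Task;
            }
            Task finished = await Task.WhenAny(changed, Task.Delay(timeout));
            return finished == changed ? changed.Result : knownVersion;
        }
    }

Note that the comparison happens once per change, when Publish fires, not once per client per second; in between, the held requests just sit on an awaited Task. That is why this is cheaper than having every client re-ask the server to compare state every second.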
I'm trying to nail down a performance issue under load in an application which I didn't build, but have become very familiar with the workings of.
The architecture is: mobile apps call an ASP.NET MVC 3 website to get data to display. The ASP.NET site calls a third-party SOAP API using WCF clients (basicHttpBinding), caching results as much as it can to minimize load on that third party.
The load from the mobile apps is in the order of 200+ requests per second at peak times, which translates to something in the order of 20 SOAP requests per second to the third-party, after caching.
Normally it runs fine but we get periods of cascading slowness where every request to the API starts taking 5 seconds.. then 10.. 15.. 20.. 25.. 30.. at which point they time out (we set the WCF client timeout to 30 seconds). Clearly there is a bottleneck somewhere which is causing an increasingly long queue until requests can't be serviced inside 30 seconds.
Now, the third-party API is out of my control but they swear that it should not be having any issues whatsoever with 20 requests per second. So I've been looking into the possibility of a bottleneck at my end.
I've read questions on StackOverflow about ServicePointManager.DefaultConnectionLimit and connectionManagement, but digging through the source, I think the problem is somewhat more fundamental. It seems that our WCF client object (which is a standard System.ServiceModel.ClientBase<T> auto-generated by "Add Service Reference") is being stored in the cache, and thus when multiple requests come in to the ASP.NET site simultaneously, they will share a single Client object.
From a quick experiment with a couple of console apps and spawning multiple threads to call a deliberately slow WCF service with a shared Client object, it seems to me that only one call will occur at a time when multiple threads use a single ClientBase. This would explain a bottleneck when e.g. 20 calls need to be made per second and each one takes more than 50ms to complete.
Can anyone confirm that this is indeed the case?
And if so, if I switched to every request creating its own WCF client object, would I just need to alter ServicePointManager.DefaultConnectionLimit to something greater than the default (which I believe is 2?) before creating the client objects, in order to increase my maximum number of simultaneous connections?
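In code terms, what I'm considering is roughly this (ThirdPartyClient stands in for the proxy generated by "Add Service Reference"; just a sketch to illustrate the idea):

    using System;
    using System.Net;

    public static class SoapCalls
    {
        static SoapCalls()
        {
            // Default is 2 concurrent connections per endpoint for non-web clients;
            // raise it before the first client/ServicePoint is created.
            ServicePointManager.DefaultConnectionLimit = 48;
        }

        public static TResult Call<TResult>(Func<ThirdPartyClient, TResult> operation)
        {
            var client = new ThirdPartyClient(); // one short-lived client per request
            try
            {
                TResult result = operation(client);
                client.Close();
                return result;
            }
            catch
            {
                client.Abort(); // a faulted channel must be aborted, not closed
                throw;
            }
        }
    }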
(sorry for the verbose question, I figured too much information was better than too little)
I know that ZMQ offers all the flexibility to do your own load balancing. However, I would expect the out-of-the-box broker, about 4 lines of code using the line
zmq_device (ZMQ_QUEUE, frontend, backend);
to load balance quite well, since the documentation says it does:
ZMQ_QUEUE creates a shared queue that collects requests from a set of clients, and distributes these fairly among a set of services. Requests are fair-queued from frontend connections and load-balanced between backend connections. Replies automatically return to the client that made the original request.
I have an army of back-end services, and yet I find that my front-end clients often have to wait several seconds for something that takes less than 1/10 of a second in a 1:1 setting (with the same number of client and service machines). I suspect that ZMQ is not load balancing properly out of the box: it's sending too many requests to the same service even though that service doesn't have the bandwidth for them.
I think this is partly because the services are multithreaded in a way that lets them take up to 10 concurrent requests, yet they slow down greatly near the 10th request even though they can still accept more. Random distribution would be ideal. Is there an out-of-the-box way to do this, can it be done in a few lines of code, or do I have to write my own broker from scratch?
UPDATE: FWIW, the issue turned out to be that the workers were taking on work when they didn't have room for it; it was not in the ZMQ layer per se.
I'm using perfmon to examine my service's behaviour. What I do is launch 6 instances of the client application on separate machines and send requests to the server from 120 threads (20 threads per client application).
I have examined the counters, and the maximum number of service instances (I use the PerSession model and set the number of instances to 100) is 12, which I find strange, as my response times from the service hover around 120 seconds... I thought that increasing the number of instances would cause WCF to create more of them and, as a result, that response times would be quicker.
Any idea why WCF doesn't create even more instances of the service?
Thanks Pawel
WCF services are throttled by default - it's a service behavior, which you can tweak easily.
See the MSDN docs on ServiceThrottling.
Here are the defaults:
<serviceThrottling
maxConcurrentCalls="16"
maxConcurrentInstances="Int.MaxValue"
maxConcurrentSessions="10" />
With these settings, you can control how many sessions or concurrent calls will be handled, and make sure your server isn't overwhelmed by (fraudulent) requests and brought to its knees. In your case, since you use PerSession instancing with far more than 10 clients, the default limit of 10 concurrent sessions is the most likely cap on what you're seeing.
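If you'd rather set the throttle in code than in config, the same knobs are exposed on ServiceThrottlingBehavior (sketch; MyService is a placeholder for your service type):

    using System.ServiceModel;
    using System.ServiceModel.Description;

    var host = new ServiceHost(typeof(MyService));
    var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
    if (throttle == null)
    {
        throttle = new ServiceThrottlingBehavior();
        host.Description.Behaviors.Add(throttle);
    }
    throttle.MaxConcurrentCalls = 16;        // requests processed in parallel
    throttle.MaxConcurrentSessions = 100;    // open sessions (the relevant one for PerSession)
    throttle.MaxConcurrentInstances = int.MaxValue;
    host.Open();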
Ufff, last attempt to understand that silly WCF.
What I did now is:
create a client that starts 20 threads, each thread sending requests to the service in a loop. The performance counter on the server claims that only 2 instances of the service object ever exist. The average request time is about 40 seconds (I start measuring before the proxy call and finish after the call returns).
modify that client to start 5 threads, and launch 4 instances of that client (to simulate the 20-thread behaviour of the previous example). The performance monitor shows that 8 instances of the service object exist. The average request time is 20 seconds.
Could somebody tell me what is going on? I thought there was a problem with the server not wanting to handle more requests concurrently, but apparently it is the client that is causing the stir and won't send more requests concurrently... Maybe there is some kind of configuration option that limits the client from sending more than two requests at a time (buffer, throttling, etc.)?
A channel factory is created in every thread.
You might want to refer to this article and make adjustments to your WCF configuration (specifically maxConnections) to get the number of connections you want.
Consider using something like http://www.codeplex.com/WCFLoadTest to hit the service.
Also, perfmon will only get you so far. If you want to debug a WCF service, you should look at SvcTraceViewer and SvcConfigEditor in the Windows SDK.
On your service binding, what have you set maxConnections to? Calls to connect will block once the limit is reached.
Default is 10 I think.
http://msdn.microsoft.com/en-us/library/ms731379.aspx
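For example, assuming netTcpBinding, the limit can be raised either in config (maxConnections="50" on the binding element) or in code (sketch; IMyService and the address are placeholders):

    using System.ServiceModel;

    var binding = new NetTcpBinding();
    binding.MaxConnections = 50; // default is 10

    var factory = new ChannelFactory<IMyService>(
        binding, "net.tcp://localhost:8080/MyService");
    IMyService proxy = factory.CreateChannel();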