I have a WCF SOAP service that receives a large number of synchronous requests from other systems.
When too many requests arrive at once, the IIS queue fills up, server memory and CPU usage climb, and requests are either discarded or time out.
I ran a basic load test (100 requests with 100 concurrent users). IIS started discarding requests once the maximum queue length was reached, and responses to all incoming requests were delayed: later requests waited until the first one either timed out or responded.
Below is server configuration
The WCF application code has been checked with ReSharper, and it shows no object disposal or memory issues.
Are there any application pool or worker process settings for managing the queue?
Can I apply a web garden to the application?
Please help me solve this issue.
Thanks in advance.
Related
I have Spring Cloud Gateway (Greenwich) running with Netty. This application receives requests and forwards them to downstream applications depending on the route configuration.
Randomly, a few requests take a very long time (> 70 s). Even though the downstream server responds within 5 s, the Netty threads (reactor-http-epoll-*) do not pick up the response. I have enabled debug logs to see what those threads are doing. From a preliminary analysis, it looks like those threads are processing something else and are always in the runnable state. When this happens, the traffic to the server is not unusual; it is the same as before.
My question here is:
Why was the response not processed by the reactor threads once it had been received? (According to the downstream app's logs, it sent the response; however, the spring-cloud app logged receiving it much later.) Is it possible that all the threads were busy doing other things?
Is there any runbook on how such issues should be investigated?
In some places in the logs I see a high number of inactive connections, but I am not sure whether that has any impact. (Channel cleaned, now 56 active connections and 1400 inactive connections)
Any general guidance on how to investigate why this random slowness occurs would really help. Thanks in advance.
Okay, so I ended up doing the things below, and after a lot of investigation it started working fine for me.
Enable logging. Look at how many connections are being created. In my case, a lot of new connections were being created and they were not being reused.
io.netty.leakDetectionLevel=paranoid
logging.level.reactor.netty=DEBUG
logging.level.reactor.netty.channel.FluxReceive=DEBUG
spring.cloud.gateway.httpclient.wiretap=true
spring.cloud.gateway.httpserver.wiretap=true
Make sure there is no blocking code running on reactor-http-epoll-* threads.
I upgraded the Spring Cloud dependencies from the Greenwich release train to the latest version of the Hoxton release train.
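To illustrate why blocking code on the event-loop threads matters, here is a minimal sketch. It is not Reactor or Netty code: it uses Python's asyncio event loop as a stand-in for a reactor-http-epoll-* thread, and the names blocking_call, count_ticks and run are made up for this example. A background "ticker" task plays the role of other responses waiting to be processed; it can only make progress while the loop is free.

```python
import asyncio
import time


def blocking_call(seconds: float) -> str:
    """Stand-in for blocking work (hypothetical), e.g. a synchronous I/O call."""
    time.sleep(seconds)
    return "done"


async def count_ticks(stop: asyncio.Event, interval: float = 0.01) -> int:
    """Count how often the event loop gets a chance to run this task."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(interval)
        ticks += 1
    return ticks


async def run(offload: bool, seconds: float = 0.2) -> int:
    """Run the blocking call on the loop thread or on a worker thread."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    if offload:
        # The loop stays free while a worker thread does the blocking work.
        await asyncio.to_thread(blocking_call, seconds)
    else:
        # The loop thread itself is blocked; nothing else can be processed.
        blocking_call(seconds)
    stop.set()
    return await ticker


if __name__ == "__main__":
    print("ticks while blocking the loop:", asyncio.run(run(offload=False)))
    print("ticks while offloading:", asyncio.run(run(offload=True)))
```

When the call runs directly on the loop thread, the ticker barely advances, which is the same pattern as responses that have arrived but are not picked up. In a Reactor-based application the analogous move would be to shift blocking work off the event-loop threads onto a separate scheduler.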
We observed the following behavior on one of the servers hosting a WCF service on IIS 6.0:
The IIS log shows a high value for time-taken (> 100000)
The HTTP status code is 200
sc-win32-status code shows a value of 64
I found out that sc-win32-status code of 64 indicates "The specified network is no longer available"
Initially I suspected that it could be because of limits set on MinFileBytesPerSecond, which sets the minimum throughput rate that HTTP.sys enforces when sending data from the client to the server, and back from the server to the client.
But the values for sc-bytes and cs-bytes indicate that the amount of data sent is within the range generally observed for the service.
Also note that the WCF service is hosted on four boxes and is load-balanced, but the problem occurs on only one of the servers at a time (not necessarily always the same server). The problem is also intermittent.
Has anybody else encountered this error? Any clues about what could be wrong?
Update
Note: Observation on IIS 7.5 (IIS version does not really matter)
I was able to replicate the issue. The issue occurs if:
1. The WCF service takes a long time to respond
2. The client proxy times out before it receives a response from the server. This leads to a TimeoutException on the client.
3. The server keeps waiting for a TCP ACK from the client, which it will never receive.
Hence the long time-taken (the TCP socket timeout, default value: 4 minutes) and the sc-win32-status of 64.
So essentially it appears that the WCF code is taking a long time to respond and the client is timing out. What I observe in the IIS log is just a symptom, not the problem itself.
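To check how often this symptom shows up, the IIS log can be filtered for entries that combine sc-win32-status 64 with a long time-taken. The following is a minimal sketch that assumes the W3C extended log format with a #Fields: header; the function names, the threshold default and the sample log lines are made up for illustration and are not taken from a real server.

```python
def parse_w3c(lines):
    """Yield one dict per log entry, keyed by the names in the #Fields: line."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or data before any #Fields: header
        values = line.split()
        if len(values) == len(fields):
            yield dict(zip(fields, values))


def aborted_slow_requests(lines, min_time_taken_ms=100000):
    """Return entries with sc-win32-status 64 and time-taken above the threshold."""
    return [
        entry
        for entry in parse_w3c(lines)
        if entry.get("sc-win32-status") == "64"
        and int(entry.get("time-taken", "0")) > min_time_taken_ms
    ]


# Hypothetical sample data, invented for this example.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 7.5
#Fields: date time cs-method cs-uri-stem sc-status sc-win32-status sc-bytes cs-bytes time-taken
2015-01-01 10:00:00 POST /Service.svc 200 0 1024 512 350
2015-01-01 10:00:05 POST /Service.svc 200 64 1024 512 240125
2015-01-01 10:00:09 POST /Service.svc 200 64 1024 512 900
"""

if __name__ == "__main__":
    for entry in aborted_slow_requests(SAMPLE_LOG.splitlines()):
        print(entry["date"], entry["time"], entry["cs-uri-stem"], entry["time-taken"])
```

Correlating the timestamps of the matching entries with client-side TimeoutException logs would show whether the two really line up.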
The behavior you are describing will also occur if you exceed a WCF service's maximum sessions, calls or instances (depending on how your service's InstanceContextMode is configured). If you observe the System.ServiceModel performance counters for %max concurrent sessions and/or %max concurrent calls (again depending on your service's instance context mode), you may see a correlation with the IIS log entries.
Note that these maxes can be configured in the service throttling behavior.
https://msdn.microsoft.com/en-us/library/vstudio/system.servicemodel.description.servicethrottlingbehavior(v=vs.100).aspx
I saw your question again and wanted to point out that I found a solution for this. It turned out to be this piece of code in the web.config:
<pages smartNavigation="true">
After turning this off I stopped receiving the same time-out errors. See also the answer here
IIS puts the service to sleep to save resources.
Copied from here (WCF REST Service goes to sleep after inactivity)
The application pool hosting your service defines an Idle Time-out property (in the advanced settings of the app pool in the IIS management console), which defaults to 20 minutes. If no request is received by the app pool within the idle time-out, the worker process serving the pool is terminated. After receiving a new request, IIS must start the process again, and the process must load the application domain and all related assemblies, compile the .svc file, run the service host and process the request. The solution can be to increase the idle time-out, but the purpose of this time-out is correct handling of server resources: if the process is not needed, it should be stopped. Another ugly workaround is to use some ping process (for example a cron job or scheduled task on the server) which regularly calls some method on the service or requests a page in the same application.
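The ping workaround could look roughly like the sketch below. It is only an illustration: the function names, the URL and the interval are assumptions, and in practice a scheduled task would usually run a single ping per invocation rather than a loop.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout_s=10.0):
    """Issue one GET request; return the HTTP status, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, just not with a 2xx status
    except OSError:
        return None  # connection refused, DNS failure, timeout, ...


def keep_warm(url, interval_s, rounds):
    """Ping the URL a fixed number of times, sleeping between attempts."""
    results = []
    for attempt in range(rounds):
        results.append(ping(url))
        if attempt < rounds - 1:
            time.sleep(interval_s)
    return results


if __name__ == "__main__":
    # Hypothetical address; the interval must stay below the idle time-out.
    print(keep_warm("http://localhost/MyService.svc", interval_s=600, rounds=3))
```

Any request that reaches the application pool resets the idle timer, so the target only needs to be a cheap page or operation in the same application.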
We noticed that CPU usage went up from 5% to 50% after adding NServiceBus to our ASP.NET MVC app. This was on a server that was not under any load. We noticed the same behavior on another server that hosted a WCF app. After trying out different things, we figured out that if we configured the bus as send-only, the CPU usage dropped back to 5%. Does anybody know why the CPU usage was so high when the bus is not configured as send-only?
I've experienced this before.
What happened to me was I set up an application pool, and it started out running as Network Service. Before I had the chance to set the application pool identity to a domain-level user (for access to file shares, etc.) the pages had already been hit, and so the NServiceBus installers had already created a queue with the Network Service credentials.
When I set the application pool user, all of a sudden it didn't have the proper permissions to the queue.
Normally NServiceBus checks for messages with a timeout if none are available to be received, but in this instance, it goes into a very tight loop of "Are there messages? I don't have permission. Are there messages? I don't have permission." and so you get the very high CPU.
I fixed the problem by deleting the queue and allowing NServiceBus to recreate it with the proper permissions.
It's possible that the cause of the high CPU was the NServiceBus code that looks for a message in the queue, though I find that a bit hard to believe. Send-only mode prevents NServiceBus from looking for messages in the queue.
I'm working on a self-hosted WCF application which runs just fine on my PC; however, when I try running it on a VM hosted locally using VMware Player, the service takes some two minutes to return data, whereas the original request took only a few seconds.
The VM is using 2 GB RAM and dual CPUs running Windows Server 2008 R2 (on an 8 GB/quad-core host running Windows 7).
Looking at the WCF service trace, I have the following log entries (time/description):
15:41:26.771 From: Processing message 1.
15:41:26.771 Activity boundary.
15:41:26.820 Received a message over a channel.
15:41:26.844 ServiceChannel information.
15:41:26.848 Incoming HTTP request to URI 'http://localhost:8000/Sql/Database' matched operation 'GetDatabase'
15:41:26.944 Message Log Trace
15:43:25.775 To: Execute 'MyProject.ISqlService.GetDatabase'
15:43:25.775 Activity boundary.
15:43:25.947 From: Execute 'MyProject.ISqlService.GetDatabase'
15:43:25.947 Activity boundary.
15:43:25.947 Message Log Trace
15:43:26.134 Throwing an exception.
15:43:26.134 RequestContext aborted
15:43:26.134 Activity boundary.
So the two-minute delay occurs between receiving the incoming HTTP request and the dispatch to the service implementation. This delay occurs whether the request is the first (thus incurring the usual WCF warm-up penalty) or a subsequent request.
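For longer traces, the same reading can be automated. The sketch below takes the trace entries listed above and reports the largest gap between consecutive entries; the function name largest_gap is made up, and the code assumes all timestamps fall on the same day.

```python
from datetime import datetime

TRACE = """\
15:41:26.771 From: Processing message 1.
15:41:26.771 Activity boundary.
15:41:26.820 Received a message over a channel.
15:41:26.844 ServiceChannel information.
15:41:26.848 Incoming HTTP request to URI 'http://localhost:8000/Sql/Database' matched operation 'GetDatabase'
15:41:26.944 Message Log Trace
15:43:25.775 To: Execute 'MyProject.ISqlService.GetDatabase'
15:43:25.775 Activity boundary.
15:43:25.947 From: Execute 'MyProject.ISqlService.GetDatabase'
15:43:25.947 Activity boundary.
15:43:25.947 Message Log Trace
15:43:26.134 Throwing an exception.
15:43:26.134 RequestContext aborted
15:43:26.134 Activity boundary.
"""


def largest_gap(trace_text):
    """Return (gap_seconds, entry_before, entry_after) for the biggest pause."""
    entries = []
    for line in trace_text.splitlines():
        line = line.strip()
        if not line:
            continue
        stamp, _, description = line.partition(" ")
        entries.append((datetime.strptime(stamp, "%H:%M:%S.%f"), description))

    best = None
    for (t0, before), (t1, after) in zip(entries, entries[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


if __name__ == "__main__":
    gap, before, after = largest_gap(TRACE)
    print(f"{gap:.3f} s between {before!r} and {after!r}")
```

On this trace the largest gap is about 118.8 seconds, between the "Message Log Trace" entry and the "To: Execute" entry, which matches the observation that the time is lost before the service implementation is invoked.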
While I appreciate that I'm not going to get bare-metal performance from a VM, I'm still concerned about the dire performance, especially as the client tends to timeout before the end of the two minutes. Is there anything I can do to improve matters? It's making testing very difficult.
Maybe your processor does not support the VT-x/AMD-V extensions, so virtualization is not hardware-accelerated. Check your hardware using CPU-Z.
I have already asked a similar question here: WCF Service calling an external web service results in timeouts in heavy load environment but I've got a better idea now as to what's happening so posting a new question.
This is what is happening:
A .NET client sends multiple requests at the same time to a WCF service (if it helps, I'm replicating this scenario using Visual Studio Load Tests)
The client has a "sendTimeout" of 5 seconds
The WCF service receives each request and starts processing it. The processing involves sending a request to an external service, which can take about 1 second to come back with a response
This is where I think the problem is: the client has sent many requests to the service, and since the service is still busy processing the concurrent requests, some of the requests from the client time out after 5 seconds
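A rough back-of-the-envelope model of that suspicion, as a sketch only: it assumes all requests arrive at once, every request takes the same time, and the service handles a fixed number of requests in parallel. The function name and the concurrency figures are hypothetical and are not measured values from my setup.

```python
def count_timeouts(requests, concurrency, service_time_s, client_timeout_s):
    """Count requests whose response arrives after the client timeout.

    Simplified model: requests are served in batches of `concurrency`,
    and each batch takes `service_time_s` seconds.
    """
    timed_out = 0
    for index in range(requests):
        batch = index // concurrency  # number of batches served before this one
        completes_at = (batch + 1) * service_time_s
        if completes_at > client_timeout_s:
            timed_out += 1
    return timed_out


if __name__ == "__main__":
    for concurrency in (10, 16, 20):
        lost = count_timeouts(100, concurrency, service_time_s=1.0, client_timeout_s=5.0)
        print(f"concurrency={concurrency}: {lost} of 100 requests time out")
```

Under these assumptions, 100 simultaneous requests at 1 second each need an effective concurrency of at least 20 to all finish within the 5-second sendTimeout; at 10 in parallel, half of them would time out. If raising the throttling values makes no difference, the effective concurrency is probably being capped somewhere else.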
I have tried the following:
Changed the InstanceContextMode to PerCall
Increased the values of maxConcurrentCalls & maxConcurrentInstances
Increased the value of connectionManagement.maxconnection in machine.config
But none of that seems to make any difference. Does anyone have any idea how I can ensure that I don't run into this timeout issue?
OK, you say WCF, but that is not enough information. What binding are you using and where are you hosting it? If you are using IIS, there could be a different underlying problem than with self-hosting.
The likely cause is a ThreadPool that is too small. You can use ThreadPool.SetMaxThreads() to change this, but beware: this is a sensitive value. Have a look here.
Check out the following link:
http://weblogs.asp.net/paolopia/archive/2008/03/23/wcf-configuration-default-limits-concurrency-and-scalability.aspx
I'm not sure what you're trying to achieve. Since the WCF service is doing a time-consuming operation, you can't overload it and expect it to function. You can do the following (check the link above for how to set each of these):
Increase the receiving capacity of the wcf service
Increase the send timeout of the service
Increase the send timeout of the client
Increase the receive timeout of the client
Limit the outgoing connections to the wcf service
The best and most robust option would be to configure and use MSMQ with the WCF service.