WCF cannot reach 100% CPU. What is the bottleneck?

I am benchmarking a self-hosted net.tcp WCF service, making requests from 50 threads to a service located on the same computer. The problem is that CPU utilization never exceeds 35% on a Xeon E3-1270. When I run the same test on a two-core laptop, it does reach 100%.
The WCF method does nothing, so it should not be limited by I/O. I tried increasing the number of threads, but that does not help. Each thread creates a service channel and performs thousands of calls, reusing that channel instance.
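For illustration, the client side of that setup looks roughly like this (a minimal sketch; the endpoint address, security mode, and call count are placeholders rather than the actual benchmark code):
using System;
using System.Linq;
using System.ServiceModel;
using System.Threading;

class BenchmarkClient
{
    static void Main()
    {
        var binding = new NetTcpBinding(SecurityMode.None);
        var address = new EndpointAddress("net.tcp://localhost:8000/TestService");

        var threads = Enumerable.Range(0, 50).Select(_ => new Thread(() =>
        {
            // One channel per thread, reused for every call.
            var factory = new ChannelFactory<ITestService>(binding, address);
            ITestService channel = factory.CreateChannel();
            for (int i = 0; i < 10000; i++)
                channel.Void();
            ((IClientChannel)channel).Close();
            factory.Close();
        })).ToList();

        threads.ForEach(t => t.Start());
        threads.ForEach(t => t.Join());
        Console.WriteLine("All threads finished.");
    }
}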
Here is the service class I am using:
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)]
public class TestService : ITestService
{
    public void Void()
    {
        // DO NOTHING
    }
}
Configs:
ServiceThrottlingBehavior:
    MaxConcurrentCalls = 1000
    MaxConcurrentInstances = 1000
    MaxConcurrentSessions = 1000
NetTcpBinding:
    ListenBacklog = 2000
    MaxConnections = 2000

I would try changing your InstanceContextMode to PerCall. I'm pretty sure your current throttling settings are being ignored, because with InstanceContextMode.Single WCF only ever creates one instance of your class. With PerCall, a new instance is created for each request until the maximum number of threads or your configured limit is reached. You shouldn't need the NetTcpBinding settings either. Keep your throttling behaviour, but make sure you get the proportions right, otherwise it might have adverse effects. The proportions below are the WCF 4 defaults:
MaxConcurrentCalls: 16 * processor count
MaxConcurrentSessions: 100 * processor count
MaxConcurrentInstances: 116 * processor count (the sum of the two above)
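A minimal self-hosted sketch wiring those proportions up in code, assuming the ITestService contract from the question (the address and security mode are placeholders):
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class TestService : ITestService
{
    public void Void()
    {
        // DO NOTHING
    }
}

class Host
{
    static void Main()
    {
        var host = new ServiceHost(typeof(TestService),
                                   new Uri("net.tcp://localhost:8000/TestService"));
        host.AddServiceEndpoint(typeof(ITestService),
                                new NetTcpBinding(SecurityMode.None), "");

        // Throttle proportionally to the machine, per the defaults above.
        int cpu = Environment.ProcessorCount;
        host.Description.Behaviors.Add(new ServiceThrottlingBehavior
        {
            MaxConcurrentCalls = 16 * cpu,
            MaxConcurrentSessions = 100 * cpu,
            MaxConcurrentInstances = 116 * cpu
        });

        host.Open();
        Console.WriteLine("Listening; press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}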

Related

Bursts of RedisTimeoutException using StackExchange.Redis

I'm trying to track down intermittent "bursts" of timeouts using the StackExchange.Redis library. Here's a bit about our setup: our API is written in C# and runs on Windows 2008 and IIS. We have 4 API servers in production, and we have 4 Redis machines (running the latest Linux LTS), each with 2 instances of Redis (one master on port 7000, one slave on port 7001). I've looked at pretty much every aspect of the Redis servers and they look fantastic: no errors in the logs, CPU and network are fine, and everything on the server side seems healthy. I can tail -f the Redis logs while this is happening and don't see anything out of the ordinary (such as AOF rewrites). I don't think the problem is with Redis.
Here's what I know so far:
We see these timeout exceptions several times an hour. Usually between 40-50 timeouts in a minute, sometimes up to 80-90. Then, they'll go away for several minutes. There were about 5,000 of these events in the past 24 hours, and they happen in bursts from a single API client.
These timeouts only happen against Redis master nodes, never against slave nodes. However, they happen with various Redis commands such as GETs and SETs.
When a burst of these timeouts happens, the calls come from a single API server but go to various Redis nodes. For example, API3 might have a bunch of timeouts trying to call Cache1, Cache2 and Cache3. This is strong evidence that the issue is related to the API servers, not the Redis servers.
The Redis master nodes have 108 connected clients. I log current connections, and this number remains stable. There are no big spikes in connections, and it doesn't look like there's any bad code creating too many connections or failing to share ConnectionMultiplexer instances (I have one, and it's static).
The Redis slave nodes have 58 connected clients, and this also looks completely stable.
We're using StackExchange.Redis version 1.2.6
Redis is using AOF mode, and size on disk is about 195MB
Here's an example timeout exception. Most look pretty much the same as this:
Type=StackExchange.Redis.RedisTimeoutException,
Message=Timeout performing GET limeade:allActivities, inst: 1, mgr: ExecuteSelect,
err: never, queue: 0, qu: 0, qs: 0, qc: 0, wr: 0, wq: 0, in: 0, ar: 0,
clientName: LIMEADEAPI4, serverEndpoint: 10.xx.xx.11:7000, keyHashSlot: 1295,
IOCP: (Busy=0,Free=1000,Min=2,Max=1000),
WORKER: (Busy=9,Free=32758,Min=2,Max=32767)
(Please take a look at this article for some common client-side issues that can cause timeouts:
http://stackexchange.github.io/StackExchange.Redis/Timeouts),
StackTrace=
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server)
   at StackExchange.Redis.RedisDatabase.StringGet(RedisKey key, CommandFlags flags)
   at Limeade.Caching.Providers.RedisCacheProvider`1.Get[T](K cacheKey, CacheItemVersion& cacheItemVersion) in ...
I've done a bit of research on tracking down these timeout exceptions, and what's rather surprising is that the numbers are all zeros: nothing in the queue, nothing waiting to be processed, and I have tons of threads free and not doing anything. Everything looks great.
Anyone have any ideas on how to fix this? The problem is these bursts of cache timeouts cause our database to be hit more, and in certain circumstances this is a bad thing. I'm happy to add any more info that anyone would find helpful.
Update: Connection Code
The code to connect to Redis is part of a fairly complex system that supports various cache environments and configurations, but I can boil it down to the basics. First, there's a CacheFactory class:
public class CacheFactory : ICacheFactory
{
    private static readonly ILogger log = LoggerManager.GetLogger(typeof(CacheFactory));
    private static readonly ICacheProvider<CacheKey> cache;

    static CacheFactory()
    {
        ICacheFactory<CacheKey> configuredFactory = CacheFactorySection.Current?.CreateConfiguredFactory<CacheKey>();
        if (configuredFactory == null)
        {
            // Some error handling, not important
        }
        cache = configuredFactory.GetDefaultCache();
    }
    // ...
}
The ICacheProvider is what implements a way to talk to a certain cache system, which can be configured. In this case, the configuredFactory is a RedisCacheFactory which looks like this:
public class RedisCacheFactory<T> : ICacheFactory<T> where T : CacheKey, ICacheKeyRepository
{
    private RedisCacheProvider<T> provider;
    private readonly RedisConfiguration configuration;

    public RedisCacheFactory(RedisConfiguration config)
    {
        this.configuration = config;
    }

    public ICacheProvider<T> GetDefaultCache()
    {
        return provider ?? (provider = new RedisCacheProvider<T>(configuration));
    }
}
The GetDefaultCache method is called once, in the static constructor, and returns a RedisCacheProvider. This class is what actually connects to Redis:
public class RedisCacheProvider<K> : ICacheProvider<K> where K : CacheKey, ICacheKeyRepository
{
    private readonly ConnectionMultiplexer redisConnection;
    private readonly IDatabase db;
    private readonly RedisCacheSerializer serializer;
    private static readonly ILog log = Logging.RedisCacheProviderLog<K>();
    private readonly CacheMonitor<K> cacheMonitor;
    private readonly TimeSpan defaultTTL;
    private int connectionErrors;

    public RedisCacheProvider(RedisConfiguration options)
    {
        redisConnection = ConnectionMultiplexer.Connect(options.EnvironmentOverride ?? options.Connection);
        db = redisConnection.GetDatabase();
        serializer = new RedisCacheSerializer(options.SerializationBinding);
        cacheMonitor = new CacheMonitor<K>();
        defaultTTL = options.DefaultTTL;
        IEnumerable<string> hosts = options.Connection.EndPoints.Select(e => (e as DnsEndPoint)?.Host);
        log.InfoFormat("Created Redis ConnectionMultiplexer connection. Hosts=({0})", String.Join(",", hosts));
    }
    // ...
}
The constructor creates a ConnectionMultiplexer based on the configured Redis endpoints (which are in some config file). I also log every time I create a connection. We don't see any excessive number of these log statements, and the connections to Redis remain stable.
In Global.asax, try adding:
protected void Application_Start(object sender, EventArgs e)
{
    ThreadPool.SetMinThreads(200, 200);
}
For us, this reduced the errors from ~50-100 daily to zero. I don't believe there is a general rule for what numbers to set, as it's system-dependent (200 works for us), so it might require some experimenting on your end.
I also believe this has improved the performance of the site.
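While experimenting with the right minimum, it can help to log the pool state the same way the exception message reports it (a minimal sketch using only standard ThreadPool APIs; the class name is made up):
using System;
using System.Threading;

static class ThreadPoolDiagnostics
{
    // Prints min/max/available thread-pool threads, matching the WORKER/IOCP
    // numbers embedded in RedisTimeoutException messages.
    public static void Log()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIocp);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIocp);

        Console.WriteLine($"WORKER: Busy={maxWorker - freeWorker}, Free={freeWorker}, Min={minWorker}, Max={maxWorker}");
        Console.WriteLine($"IOCP:   Busy={maxIocp - freeIocp}, Free={freeIocp}, Min={minIocp}, Max={maxIocp}");
    }
}
Note that the exception in the question already reports Min=2 for both IOCP and WORKER. A burst of requests can outrun the thread pool's deliberately slow thread-injection rate until the configured minimum is reached, which is why raising the minimum helps.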

TCP connection issue: connection refused

Even though I have huge values of ListenBacklog=10000 and MaxConnections=10000, I am still getting "TCP error code 10061: target machine actively refused it". It doesn't happen every time, but when the load increases the errors start appearing and one service is not able to communicate with the other.
Here are the other binding values:
ListenBacklog = 10000;
MaxBufferPoolSize = 500000;
MaxBufferSize = 2000000000;
MaxConnections = 10000;
MaxReceivedMessageSize = 2000000000;
We already have MaxConcurrentCalls, MaxConcurrentSessions and MaxConcurrentInstances set to 10000.
Under normal conditions it works fine, but when the load increases the services are not able to communicate with each other.
Looking at the performance counters, Calls Outstanding lies between 150 and 250. The services are hosted as Windows services.
Any suggestions or thoughts?

Jedis getResource() is taking a lot of time

I am trying to use Redis with Sentinel to get/set keys, and I was stress testing my setup with about 2,000 concurrent requests.
I used Sentinel to put a single key on Redis and then executed 1,000 concurrent GET requests.
But the underlying Jedis call used by my Sentinel setup blocks on getResource() (pool size is 500), and the overall average response time I am achieving is around 500 ms; my target was about 10 ms.
I am attaching a sample of the jvisualvm snapshot here:
Method                                                                                  Time %        Total time (ms)   Invocations
redis.clients.jedis.JedisSentinelPool.getResource()                                     98.02227      40,845,232.6      4,779
redis.clients.jedis.BinaryJedis.get()                                                   1.6894469     703,981.4         141
org.apache.catalina.core.ApplicationFilterChain.doFilter()                              0.12820946    53,424.0          6,875
org.springframework.core.serializer.support.DeserializingConverter.convert()           0.046286926   19,287.5          4
redis.clients.jedis.JedisSentinelPool.returnResource()                                  0.04444578    18,520.3          4
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept()  0.035538      14,808.5          11,430
Can anyone help me debug this issue further?
Here is the getResource() implementation from JedisSentinelPool in the Jedis sources (2.6.2):
@Override
public Jedis getResource() {
  while (true) {
    Jedis jedis = super.getResource();
    jedis.setDataSource(this);

    // get a reference because it can change concurrently
    final HostAndPort master = currentHostMaster;
    final HostAndPort connection = new HostAndPort(jedis.getClient().getHost(),
        jedis.getClient().getPort());

    if (master.equals(connection)) {
      // connected to the correct master
      return jedis;
    } else {
      returnBrokenResource(jedis);
    }
  }
}
Note the while (true) and the returnBrokenResource(jedis): it means the pool hands back an arbitrary Jedis resource, the code checks whether that resource is connected to the correct master, and it retries if it is not. It is a dirty check and also a blocking call.
The super.getResource() call refers to the traditional JedisPool implementation, which is based on Apache Commons Pool (2.0). It does a lot of work to get an object from the pool, and I think it even repairs failed connections. With a lot of contention on your pool, as is likely in your stress test, it can take a long time to get a resource, only to find it is not connected to the correct master, so you call it again, adding more contention, further slowing resource acquisition, and so on.
You should check all the Jedis instances in your pool to see if there are a lot of 'bad' connections.
Maybe you should give up using a common pool for your stress test (create Jedis instances manually connected to the correct node, and close them cleanly), or set up multiple pools to mitigate the cost of checking 'dirty' Jedis resources.
Also, with a pool of 500 Jedis instances you can't emulate 1,000 concurrent queries; you need at least 1,000.

How to enforce message queue sequence with multiple WCF service instances

I want to create a WCF service which uses an MSMQ binding as I have a high volume of notifications the service is to process. It is important that clients are not held up by the service and that the notifications are processed in the order they are raised, hence the queue implementation.
Another consideration is resilience. I know I could cluster MSMQ itself to make the queue more robust, but I want to be able to run an instance of my service on different servers, so if a server crashes notifications do not build up in the queue but another server carries on processing.
I have experimented with the MSMQ binding and found that you can have multiple instances of a service listening on the same queue, and left to themselves they end up doing a sort of round-robin with the load spread across the available services. This is great, but I end up losing the sequencing of the queue as different instances take a different amount of time to process the request.
I've been using a simple console app to experiment, which is the epic code dump below. When it's run I get an output like this:
host1 open
host2 open
S1: 01
S1: 03
S1: 05
S2: 02
S1: 06
S1: 08
S1: 09
S2: 04
S1: 10
host1 closed
S2: 07
host2 closed
What I want to happen is:
host1 open
host2 open
S1: 01
<pause while S2 completes>
S2: 02
S1: 03
<pause while S2 completes>
S2: 04
S1: 05
S1: 06
etc.
I would have thought that as S2 has not completed, it might still fail and return the message it was processing to the queue. Therefore S1 should not be allowed to pull another message off the queue. My queue is transactional and I have tried setting TransactionScopeRequired = true on the service, but to no avail.
Is this even possible? Am I going about it the wrong way? Is there some other way to build a failover service without some kind of central synchronisation mechanism?
class WcfMsmqProgram
{
    private const string QueueName = "testq1";

    static void Main()
    {
        // Create a transactional queue
        string qPath = ".\\private$\\" + QueueName;
        if (!MessageQueue.Exists(qPath))
            MessageQueue.Create(qPath, true);
        else
            new MessageQueue(qPath).Purge();

        // S1 processes as fast as it can
        IService s1 = new ServiceImpl("S1");
        // S2 is slow
        IService s2 = new ServiceImpl("S2", 2000);

        // MSMQ binding
        NetMsmqBinding binding = new NetMsmqBinding(NetMsmqSecurityMode.None);

        // Host S1
        ServiceHost host1 = new ServiceHost(s1, new Uri("net.msmq://localhost/private"));
        ConfigureService(host1, binding);
        host1.Open();
        Console.WriteLine("host1 open");

        // Host S2
        ServiceHost host2 = new ServiceHost(s2, new Uri("net.msmq://localhost/private"));
        ConfigureService(host2, binding);
        host2.Open();
        Console.WriteLine("host2 open");

        // Create a client
        ChannelFactory<IService> factory = new ChannelFactory<IService>(binding, new EndpointAddress("net.msmq://localhost/private/" + QueueName));
        IService client = factory.CreateChannel();

        // Periodically call the service with a new number
        int counter = 1;
        using (Timer t = new Timer(o => client.EchoNumber(counter++), null, 0, 500))
        {
            // Enter to stop
            Console.ReadLine();
        }

        host1.Close();
        Console.WriteLine("host1 closed");
        host2.Close();
        Console.WriteLine("host2 closed");

        // Wait for exit
        Console.ReadLine();
    }

    static void ConfigureService(ServiceHost host, NetMsmqBinding binding)
    {
        var endpoint = host.AddServiceEndpoint(typeof(IService), binding, QueueName);
    }

    [ServiceContract]
    interface IService
    {
        [OperationContract(IsOneWay = true)]
        void EchoNumber(int number);
    }

    [ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]
    class ServiceImpl : IService
    {
        public ServiceImpl(string name, int sleep = 0)
        {
            this.name = name;
            this.sleep = sleep;
        }

        private string name;
        private int sleep;

        public void EchoNumber(int number)
        {
            Thread.Sleep(this.sleep);
            Console.WriteLine("{0}: {1:00}", this.name, number);
        }
    }
}
batwad,
You are trying to manually create a service bus. Why don't you try to use an existing one?
NServiceBus, MassTransit, ServiceStack
At least 2 of those work with MSMQ.
Furthermore, if you absolutely need ordering, it may actually be for another reason: you want to send a message and not have dependent messages processed before that first message. In that case you are looking for the Saga pattern. NServiceBus and MassTransit both allow you to manage sagas easily; they let you trigger the initial message and then trigger the remaining messages based on conditions. They make implementing the plumbing of your distributed application a snap.
You can then even scale up to thousands of clients, queue servers and message processors without having to write a single line of code or run into any issues.
We tried to implement our own service bus over MSMQ here; we gave up because one issue after another kept creeping up. We went with NServiceBus, but MassTransit is also an excellent product (it's 100% open source; NServiceBus isn't). ServiceStack is awesome at making APIs and using message queues - I'm sure you could use it to make services that act as queue front-ends in minutes.
Oh, did I mention that both NSB and MT require under 10 lines of code to fully implement queues, senders and handlers?
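To give a feel for that claim, here is a minimal sketch of an NServiceBus message and handler (the names are hypothetical, the signature shown is the classic pre-v6 one, and endpoint/transport configuration is omitted since it varies by version):
using NServiceBus;

// Hypothetical message contract; IMessage is NServiceBus's marker interface.
public class NotificationReceived : IMessage
{
    public int SequenceNumber { get; set; }
    public string Payload { get; set; }
}

// NServiceBus discovers IHandleMessages<T> implementations and dispatches
// messages from the queue to them; no queue-handling code is needed here.
public class NotificationHandler : IHandleMessages<NotificationReceived>
{
    public void Handle(NotificationReceived message)
    {
        // Process the notification.
    }
}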
----- ADDED -----
Udi Dahan (one of the main contributors to NServiceBus) talks about this in:
"In-Order Messaging a Myth" by Udi Dahan
"Message Ordering: Is it Cost Effective?" with Udi Dahan
Chris Patterson (one of the main contributors to MassTransit):
"Using Sagas to ensure proper sequential message order" question
StackOverflow questions/answers:
"Preserve message order when consuming MSMQ messages in a WCF application"
----- QUESTION -----
I must say that I'm baffled as to why you need to guarantee message order - would you be in the same position if you were using an HTTP/SOAP protocol? My guess is no, so why is it a problem with MSMQ?
Good luck, hope this helps,
Ensuring in-order delivery of messages is one of the classic sticky issues in high-volume messaging.
In an ideal world, your message destinations should be able to handle out-of-order messaging. This can be achieved by ensuring that your message source includes sequencing information, ideally in the form of an x-of-n batch stamp (message 1 of 10, 2 of 10, etc.). Your message destination is then required to reassemble the data into order once it has been delivered.
However, in the real world there often is no scope for changing downstream systems to handle messages arriving out of order. In this instance you have two choices:
Go entirely single threaded - actually you can usually find some kind of 'grouping id' which means you can go single-threaded in a for-each-group sense, meaning you still have concurrency across different message groups.
Implement a re-sequencer wrapper around each consumer system that needs to receive messages in order (a sketch follows below).
Neither solution is very nice, but that's the only way I think you can have concurrency and in-order message delivery.
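To make option 2 concrete, here is a minimal sketch of a re-sequencer, assuming every message carries a contiguous, 1-based sequence number as described above (the class and its names are illustrative, not part of WCF or MSMQ):
using System;
using System.Collections.Generic;

// Buffers out-of-order messages and releases them strictly in sequence order.
public class Resequencer<T>
{
    private readonly Action<T> downstream;   // the consumer that needs in-order delivery
    private readonly SortedDictionary<int, T> buffer = new SortedDictionary<int, T>();
    private readonly object gate = new object();
    private int nextExpected = 1;

    public Resequencer(Action<T> downstream)
    {
        this.downstream = downstream;
    }

    public void Accept(int sequenceNumber, T message)
    {
        lock (gate)
        {
            buffer[sequenceNumber] = message;

            // Flush every consecutive message starting at the next expected number.
            while (buffer.TryGetValue(nextExpected, out T ready))
            {
                buffer.Remove(nextExpected);
                downstream(ready);
                nextExpected++;
            }
        }
    }
}
In the single-process sample from the question, both ServiceImpl instances would push into a shared Resequencer<int> instead of writing directly to the console, so S1's later numbers would wait in the buffer until S2's slow message had been released.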

WCF queue behavior for MaxConcurrentCalls

I have a WCF service with the following settings:
Binding = WebHttpBinding
InstanceContextMode = Single
ConcurrencyMode = Multiple
MaxConcurrentSessions = a high value
The documentation says this about MaxConcurrentCalls: "The MaxConcurrentCalls property specifies the maximum number of messages actively processing across a ServiceHost object. Each channel can have one pending message that does not count against the value of MaxConcurrentCalls until the service begins to process it."
Several questions:
What exactly does the sentence "Each channel can have one pending message that does not count against the value of MaxConcurrentCalls until the service begins to process it" mean?
If the MaxConcurrentCalls threshold is reached, are new TCP connections queued?
If the MaxConcurrentCalls threshold is reached, are new requests on existing TCP connections queued (during pipelining)?
How do I specify the length of those queues?
Thanks!
Rene