Which WCF exceptions should I retry on failure? (such as the bogus 'xxx host did not receive a reply within 00:01:00')

I have a WCF client that threw this common error, which was resolved simply by retrying the HTTP call to the server. For what it's worth, this exception was not generated after 1 minute; it was generated after 3 seconds.
The request operation sent to xxxxxx
did not receive a reply within the
configured timeout (00:01:00). The
time allotted to this operation may
have been a portion of a longer
timeout. This may be because the
service is still processing the
operation or because the service was
unable to send a reply message. Please
consider increasing the operation
timeout (by casting the channel/proxy
to IContextChannel and setting the
OperationTimeout property) and ensure
that the service is able to connect to
the client
How are professionals handling these common WCF errors? What other bogus errors should I handle?
For example, I'm considering timing the WCF call, and if the above (bogus) error is thrown in under 55 seconds, I retry the entire operation (using a while() loop). I believe I have to reset the entire channel, but I'm hoping you guys will tell me what's right to do.

I make all of my WCF calls from a custom "using" statement which handles exceptions and potential retries. My code optionally allows me to pass a policy object to the statement so I can easily change the behavior, for example if I don't want to retry on error.
The gist of the code is as follows:
[MethodImpl(MethodImplOptions.NoInlining)]
public static void ProxyUsing<T>(ClientBase<T> proxy, Action action)
    where T : class
{
    try
    {
        proxy.Open();
        using (OperationContextScope context = new OperationContextScope(proxy.InnerChannel))
        {
            // Add some headers here, or whatever you want
            action();
        }
    }
    catch (FaultException fe)
    {
        // Handle stuff here
    }
    finally
    {
        try
        {
            if (proxy != null
                && proxy.State != CommunicationState.Faulted)
            {
                proxy.Close();
            }
            else
            {
                proxy.Abort();
            }
        }
        catch
        {
            if (proxy != null)
            {
                proxy.Abort();
            }
        }
    }
}
You can then make a call as follows:
ProxyUsing<IMyService>(myService = GetServiceInstance(), () =>
{
    myService.SomeMethod(...);
});
The NoInlining call probably isn't important for you. I need it because I have some custom logging code that logs the call stack after an exception, so it's important to preserve that method hierarchy in that case.
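On the retry question itself: TimeoutException and CommunicationException are the exceptions a WCF client is expected to see for transport-level failures, and FaultException (which derives from CommunicationException) carries an error reported by the service, so it is usually not worth retrying blindly. The following is a minimal sketch of a retry loop built on that idea, not part of the code above. The Retry class and its parameters are hypothetical names, and the helper is deliberately independent of WCF so the transient test is supplied by the caller.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs `attempt` up to `maxAttempts` times. `isTransient` decides whether a
    // failure is worth another try; any other exception propagates immediately,
    // as does the last transient failure once the attempts are used up.
    public static T Execute<T>(Func<T> attempt, Func<Exception, bool> isTransient,
                               int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int i = 1; ; i++)
        {
            try
            {
                return attempt();
            }
            catch (Exception ex) when (i < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts left: wait, then loop again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

In a WCF client, `attempt` would create a fresh proxy and make the call through ProxyUsing (a faulted channel cannot be reused, so each attempt needs its own proxy), and `isTransient` could be `ex => ex is TimeoutException || (ex is CommunicationException && !(ex is FaultException))`.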

Related

Need Elastic APM support for Ktor backend server

We are trying to monitor the performance of our Ktor backend application and have been able to attach the Elastic APM agent to it. The server is visible in the Kibana dashboard as a service, but it does not create transactions automatically for each incoming request. A request's performance is recorded only when we manually start and end a transaction in a specific route. Is there another way to solve this?
We tried the following approaches:
Intercepting each request in the setup phase and starting a transaction there. We could not end the transaction, because we ran into issues intercepting the same call at the end.
Adding the code below to each request in the controller/route. This works.
get("/api/path") {
    val transaction: Transaction = ElasticApm.startTransaction()
    try {
        transaction.setName("MyTransaction#getApi")
        transaction.setType(Transaction.TYPE_REQUEST)
        // do business logic and response
    } catch (e: java.lang.Exception) {
        transaction.captureException(e)
        throw e
    } finally {
        transaction.end()
    }
}
Adding the line below so that other developers can find this more easily in search results.
How to add an interceptor at the start and end of each request in Ktor. Example of ApplicationCallPipeline.Monitoring and proceed()
You can use the proceed method, which executes the rest of the pipeline, to catch any exceptions that occur and finish the transaction:
intercept(ApplicationCallPipeline.Monitoring) {
    val transaction: Transaction = ElasticApm.startTransaction()
    try {
        transaction.setName("MyTransaction#getApi")
        transaction.setType(Transaction.TYPE_REQUEST)
        proceed() // This will call the rest of the pipeline
    } catch (e: Exception) {
        transaction.captureException(e)
        throw e
    } finally {
        transaction.end()
    }
}
Also, you can use attributes to store a transaction for a call duration (begins when the request has started and ends when the response has been sent).

NServiceBus 6: want some errors to bypass the error queue

As per Customizing Error Handling "Throwing the exception in the catch block will forward the message to the error queue. If that's not desired, remove the throw from the catch block to indicate that the message has been successfully processed." That's not true for me even if I simply swallow any kind of exception in a behavior:
public override async Task Invoke(IInvokeHandlerContext context, Func<Task> next)
{
    try
    {
        await next().ConfigureAwait(false);
    }
    catch (Exception ex)
    {
    }
}
I put a breakpoint there and made sure execution hit the catch block. Nevertheless, after immediate and delayed retries, messages inevitably end up in the error queue. And I have no other behaviors in the pipeline besides this one.
Only if I run context.DoNotContinueDispatchingCurrentMessageToHandlers(); inside the catch block does it prevent the message from being sent to the error queue, but it also prevents any further immediate and delayed retries.
Any idea why this works contrary to the Particular NServiceBus documentation would be much appreciated.
NServiceBus version used: 6.4.3
UPDATE:
I want only certain types of exceptions not to be sent to the error queue in NServiceBus 6. However, to make the test case clearer and narrow down the root cause of the issue, I catch just the type Exception. After the exception is thrown, execution certainly hits the empty catch block. Here is more of the code:
public class EndpointConfig : IConfigureThisEndpoint
{
    public void Customize(EndpointConfiguration endpointConfiguration)
    {
        endpointConfiguration.DefineEndpointName("testEndpoint");
        endpointConfiguration.UseSerialization<XmlSerializer>();
        endpointConfiguration.DisableFeature<AutoSubscribe>();
        configure
            .Conventions()
            .DefiningCommandsAs(t => t.IsMatched("Command"))
            .DefiningEventsAs(t => t.IsMatched("Event"))
            .DefiningMessagesAs(t => t.IsMatched("Message"));
        var transport = endpointConfiguration.UseTransport<MsmqTransport>();
        var routing = transport.Routing();
        var rountingConfigurator = container.GetInstance<IRountingConfiguration>();
        rountingConfigurator.ApplyRountingConfig(routing);
        var instanceMappingFile = routing.InstanceMappingFile();
        instanceMappingFile.FilePath("routing.xml");
        transport.Transactions(TransportTransactionMode.TransactionScope);
        endpointConfiguration.Pipeline.Register(
            new CustomFaultMechanismBehavior(),
            "Behavior to add custom handling logic for certain type of exceptions");
        endpointConfiguration.UseContainer<StructureMapBuilder>(c => c.ExistingContainer(container));
        var recoverability = endpointConfiguration.Recoverability();
        recoverability.Immediate(immediate =>
        {
            immediate.NumberOfRetries(2);
        });
        endpointConfiguration.LimitMessageProcessingConcurrencyTo(16);
        recoverability.Delayed(delayed =>
        {
            delayed.NumberOfRetries(2);
        });
        endpointConfiguration.SendFailedMessagesTo("errorQueue");
        ...
    }
}
public class CustomFaultMechanismBehavior : Behavior<IInvokeHandlerContext>
{
    public override async Task Invoke(IInvokeHandlerContext context, Func<Task> next)
    {
        try
        {
            await next().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
        }
    }
}
UPDATE 2
I think I know what's going on: the message is handled by the first handler, which throws an exception that is caught by the Behavior catch block, but then the NServiceBus runtime tries to instantiate a second handler class which is also supposed to handle the message (it handles the class the message is derived from). That's where another exception is thrown, in the constructor of one of its dependent classes. StructureMap tries to instantiate the handler and all the dependent services declared in its constructor, and in the process runs into the exception. This exception is not caught by CustomFaultMechanismBehavior.
So may I rephrase my question now: is there any way to suppress errors (bypass the error queue) that occur inside a constructor or simply during StructureMap class initialization? It seems the approach described above does not cover this kind of situation.
Your behavior is activated on handler invocation. This means you are catching exceptions that happen inside the Handle method, so any other exception, e.g. one thrown in the constructor of the handler, will not be caught.
To change the way you 'capture' the exceptions, you can change the stage at which the behavior is activated, e.g. change it from Behavior<IInvokeHandlerContext> to Behavior<ITransportReceiveContext>, which is activated when the transport receives a message. You can look into the different pipeline stages and behaviors to see which one suits your purpose best.
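A rough sketch of what such a behavior could look like, assuming the NServiceBus 6 package is referenced. IgnorableException, ExceptionChain, and IgnoreSelectedErrorsBehavior are hypothetical names introduced for illustration, and walking the InnerException chain is an assumption about how a container might wrap constructor failures.

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus.Pipeline;

// Hypothetical marker type for failures that should not reach the error queue.
public class IgnorableException : Exception { }

public static class ExceptionChain
{
    // True if the exception, or anything in its InnerException chain, is a T.
    public static bool Contains<T>(Exception ex) where T : Exception
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            if (e is T) return true;
        }
        return false;
    }
}

public class IgnoreSelectedErrorsBehavior : Behavior<ITransportReceiveContext>
{
    public override async Task Invoke(ITransportReceiveContext context, Func<Task> next)
    {
        try
        {
            // Runs the rest of the pipeline, including handler construction.
            await next().ConfigureAwait(false);
        }
        catch (Exception ex) when (ExceptionChain.Contains<IgnorableException>(ex))
        {
            // Swallowed on purpose: the message is treated as processed,
            // so it is neither retried nor moved to the error queue.
        }
    }
}
```

It would be registered the same way as the behavior in the question, via endpointConfiguration.Pipeline.Register. Keep in mind that swallowing at this stage lets the receive operation complete, so any work already done inside the same transaction would presumably be committed as well.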

How can my WCF service recover from unavailable message queue?

I have a WCF service that receives messages from the Microsoft Message Queue (netMsmqBinding).
I want my service to recover if the message queue is unavailable. My code should fail to open the service, but then try again after a delay.
I have code to recognize the error when the queue is unavailable:
static bool ExceptionIsBecauseMsmqNotStarted(TypeInitializationException ex)
{
    MsmqException msmqException = ex.InnerException as MsmqException;
    return ((msmqException != null) && msmqException.HResult == (unchecked((int)(0xc00e000b))));
}
So this should be straightforward: I call ServiceHost.Open(), catch this exception, wait for a second or two, then repeat until my Open call is successful.
The problem is, if this exception gets thrown once, it continues to be thrown. The message queue might have become available, but my running process is in a bad state and I continue to get the TypeInitializationException until I shut down my process and restart it.
Is there a way around this problem? Can I make WCF forgive the queue and genuinely try to listen to it again?
Here is my service opening code:
public async void Start()
{
    try
    {
        _log.Debug("Starting the data warehouse service");
        while (!_cancellationTokenSource.IsCancellationRequested)
        {
            try
            {
                _serviceHost = new ServiceHost(_dataWarehouseWriter);
                _serviceHost.Open();
                return;
            }
            catch (TypeInitializationException ex)
            {
                _serviceHost.Abort();
                if (!ExceptionIsBecauseMsmqNotStarted(ex))
                {
                    throw;
                }
            }
            await Task.Delay(1000, _cancellationTokenSource.Token);
        }
    }
    catch (Exception ex)
    {
        _log.Error("Failed to start the service host", ex);
    }
}
And here is the stack information. The first time it is thrown the stack trace of the inner exception is:
at System.ServiceModel.Channels.MsmqQueue.GetMsmqInformation(Version& version, Boolean& activeDirectoryEnabled)
at System.ServiceModel.Channels.Msmq..cctor()
And the top entries of the outer exception stack:
at System.ServiceModel.Channels.MsmqChannelListenerBase`1.get_TransportManagerTable()
at System.ServiceModel.Channels.TransportManagerContainer..ctor(TransportChannelListener listener)
Microsoft have made the source code to WCF visible, so now we can work out exactly what's going on.
The bad news: WCF is implemented in such a way that if the initial call to ServiceHost.Open() triggers a queueing error, there is no way to recover.
The WCF framework includes an internal class called Msmq. This class has a static constructor, which invokes MsmqQueue.GetMsmqInformation, and that method can throw an exception (this is the Msmq..cctor() frame in the stack trace above).
Reading the C# Programming Guide on static constructors:
If a static constructor throws an exception, the runtime will not invoke it a second time, and the type will remain uninitialized for the lifetime of the application domain in which your program is running.
There is a programming lesson here: Don't put exception throwing code in a static constructor!
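The behavior is easy to reproduce outside WCF. The following small demonstration uses hypothetical types (Counter, Fragile, Demo) that merely stand in for the WCF internals: the static constructor runs once, fails, and every later access to the type fails again without the constructor being re-run.

```csharp
using System;

// Kept in a separate type so the count survives Fragile's failed initialization.
public static class Counter
{
    public static int StaticCtorRuns;
}

public class Fragile
{
    static Fragile()
    {
        Counter.StaticCtorRuns++;
        // Stands in for a call that fails while the queue service is down.
        throw new InvalidOperationException("queue unavailable");
    }

    public static void Touch() { }
}

public static class Demo
{
    // Touches the type `attempts` times and counts how often that fails.
    public static int CountFailures(int attempts)
    {
        int failures = 0;
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                Fragile.Touch();
            }
            catch (TypeInitializationException)
            {
                failures++;
            }
        }
        return failures;
    }
}
```

Calling Demo.CountFailures(3) reports three failures while Counter.StaticCtorRuns stays at 1, which is why retrying ServiceHost.Open() in the same process cannot succeed once the first attempt has failed this way.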
The obvious solution lies outside of the code. When I create my hosting service, I could add a service dependency on the message queue service. However, I would rather fix this problem with code than with configuration.
Another solution is to manually check that the queue is available using non-WCF code.
The method System.Messaging.MessageQueue.Exists returns false if the message queue service is unavailable. Knowing this, the following works:
private const string KNOWN_QUEUE_PATH = @".\Private$\datawarehouse";
private static string GetMessageQueuePath()
{
    // We can improve this by extracting the queue path from the configuration file
    return KNOWN_QUEUE_PATH;
}
public async void Start()
{
    try
    {
        _log.Debug("Starting the data warehouse service");
        string queuePath = GetMessageQueuePath();
        while (!_cancellationTokenSource.IsCancellationRequested)
        {
            if (!(System.Messaging.MessageQueue.Exists(queuePath)))
            {
                _log.Warn($"Unable to find the queue {queuePath}. Will try again shortly");
                await Task.Delay(60000, _cancellationTokenSource.Token);
            }
            else
            {
                _serviceHost = new ServiceHost(_dataWarehouseWriter);
                _serviceHost.Open();
                return;
            }
        }
    }
    catch (System.OperationCanceledException)
    {
        _log.Debug("The service start operation was cancelled");
    }
    catch (Exception ex)
    {
        _log.Error("Failed to start the service host", ex);
    }
}

Redis Timeout Expired message on GetClient call

I hate the questions that have "Not Enough Info". So I will try to give detailed information. And in this case it is code.
Server:
64 bit of https://github.com/MSOpenTech/redis/tree/2.6/bin/release
There are three classes:
DbOperationContext.cs: https://gist.github.com/glikoz/7119628
PerRequestLifeTimeManager.cs: https://gist.github.com/glikoz/7119699
RedisRepository.cs https://gist.github.com/glikoz/7119769
We are using Redis with Unity.
In this case we are getting this strange message:
"Redis Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use.";
We checked these:
Is the problem a configuration issue?
Are we using the wrong RedisServer.exe?
Is there an architectural problem?
Any idea? Any similar story?
Thanks.
Extra Info 1
There is no rejected connection issue on server stats (I've checked it via redis-cli.exe info command)
I have continued to debug this problem, and have fixed numerous things on my platform to avoid this exception. Here is what I have done to solve the issue:
Executive summary:
People encountering this exception should check:
That the PooledRedisClientManager (IRedisClientsManager) is registered in a singleton scope
That the RedisMqServer (IMessageService) is registered in a singleton scope
That any RedisClient obtained from either of the above is properly disposed of, so that pooled clients are returned to the pool instead of being left checked out.
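To illustrate why the third point matters, here is a toy model of a bounded pool. ToyPool and Lease are hypothetical types, not the ServiceStack implementation. Clients that are never disposed keep their slot, so once the pool is drained the next caller waits for the pool timeout and then fails.

```csharp
using System;
using System.Threading;

// Toy stand-in for a bounded connection pool.
public sealed class ToyPool
{
    private readonly SemaphoreSlim _slots;

    public ToyPool(int size)
    {
        _slots = new SemaphoreSlim(size, size);
    }

    // Waits up to poolTimeout for a free slot, then gives up.
    public Lease GetClient(TimeSpan poolTimeout)
    {
        if (!_slots.Wait(poolTimeout))
        {
            throw new TimeoutException("Timeout expired obtaining a connection from the pool.");
        }
        return new Lease(_slots);
    }

    public sealed class Lease : IDisposable
    {
        private SemaphoreSlim _slots;

        internal Lease(SemaphoreSlim slots)
        {
            _slots = slots;
        }

        public void Dispose()
        {
            // Interlocked.Exchange makes a second Dispose call harmless.
            Interlocked.Exchange(ref _slots, null)?.Release();
        }
    }
}
```

With a pool of size 2, any number of `using (pool.GetClient(timeout)) { ... }` blocks succeed one after another, whereas two leases that are never disposed make the third GetClient call throw the TimeoutException.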
The solution to my problem:
First of all, this exception is thrown by the PooledRedisClientManager because it has no more pooled connections available.
I'm registering all the required Redis stuff in the StructureMap IoC container (not Unity as in the author's case). Thanks to this post I was reminded that the PooledRedisClientManager should be a singleton. I also decided to register the RedisMqServer as a singleton:
ObjectFactory.Configure(x =>
{
    // register the message queue stuff as Singletons in this AppDomain
    x.For<IRedisClientsManager>()
        .Singleton()
        .Use(BuildRedisClientsManager);
    x.For<IMessageService>()
        .Singleton()
        .Use<RedisMqServer>()
        .Ctor<IRedisClientsManager>().Is(i => i.GetInstance<IRedisClientsManager>())
        .Ctor<int>("retryCount").Is(2)
        .Ctor<TimeSpan?>().Is(TimeSpan.FromSeconds(5));
    // Retrieve a new message factory from the singleton IMessageService
    x.For<IMessageFactory>()
        .Use(i => i.GetInstance<IMessageService>().MessageFactory);
});
My "BuildRedisClientsManager" function looks like this:
private static IRedisClientsManager BuildRedisClientsManager()
{
    var appSettings = new AppSettings();
    var redisClients = appSettings.Get("redis-servers", "redis.local:6379").Split(',');
    var redisFactory = new PooledRedisClientManager(redisClients);
    redisFactory.ConnectTimeout = 5;
    redisFactory.IdleTimeOutSecs = 30;
    redisFactory.PoolTimeout = 3;
    return redisFactory;
}
Then, when it comes to producing messages, it's very important that the RedisClient in use is properly disposed of, otherwise we run into the dreaded "Timeout Expired" (thanks to this post). I have the following helper code to send a message to the queue:
public static void PublishMessage<T>(T msg)
{
    try
    {
        using (var producer = GetMessageProducer())
        {
            producer.Publish<T>(msg);
        }
    }
    catch (Exception ex)
    {
        // TODO: Log or whatever... I'm not throwing to avoid showing users that we have a broken MQ
    }
}
private static IMessageQueueClient GetMessageProducer()
{
    var producer = ObjectFactory.GetInstance<IMessageService>() as RedisMqServer;
    var client = producer.CreateMessageQueueClient();
    return client;
}
I hope this helps solve your issue too.

WCF nested Callback

The background: I am trying to forward the server-side ApplyChangeFailed event that is fired by a Sync Services for ADO 1.0 DBServerSyncProvider to the client. None of the code examples for Sync Services conflict resolution use WCF, and when the client connects to the server database directly, this problem does not exist. My DBServerSyncProvider is wrapped by a headless WCF service, however, and I cannot show the user a dialog with the offending data for review.
So, the obvious solution seemed to be to convert the HTTP WCF service that Sync Services generated to TCP, make it a duplex connection, and define a callback handler on the client that receives the SyncConflict object and sets the Action property of the event.
When I did that, I got a runtime error (before the callback was attempted):
System.InvalidOperationException: This operation would deadlock because the
reply cannot be received until the current Message completes processing. If
you want to allow out-of-order message processing, specify ConcurrencyMode of
Reentrant or Multiple on CallbackBehaviorAttribute.
So I did what the message suggested and decorated both the service and the callback behavior with the Multiple attribute. Then the runtime error went away, but the call results in a "deadlock" and never returns. What do I do to get around this? Is it not possible to have a WCF service that calls back the client before the original service call returns?
Edit: I think this could be the explanation of the issue, but I am still not sure what the correct solution should be.
After updating the ConcurrencyMode, have you tried firing the callback in a separate thread?
This answer to another question has some example code that starts another thread and passes the callback through. You might be able to adapt that design for your purpose.
By starting the sync agent in a separate thread on the client, the callback works just fine:
private int kickOffSyncInSeparateThread()
{
    SyncRunner syncRunner = new SyncRunner();
    Thread syncThread = new Thread(
        new ThreadStart(syncRunner.RunSyncInThread));
    try
    {
        syncThread.Start();
    }
    catch (ThreadStateException ex)
    {
        Console.WriteLine(ex);
        return 1;
    }
    catch (ThreadInterruptedException ex)
    {
        Console.WriteLine(ex);
        return 2;
    }
    return 0;
}
And this is my SyncRunner:
class SyncRunner
{
    public void RunSyncInThread()
    {
        MySyncAgent syncAgent = new MySyncAgent();
        syncAgent.addUserIdParameter("56623239-d855-de11-8e97-0016cfe25fa3");
        Microsoft.Synchronization.Data.SyncStatistics syncStats =
            syncAgent.Synchronize();
    }
}