NServiceBus 6 delayed recovery not delaying

Without getting too detailed about the problem we are trying to solve: I need to make NServiceBus do one of five things, but for now I'm just trying to get the first one to work. Given a reply from a web API call, we want to do a delayed retry, an immediate retry, give up, cancel, or start over. The delayed retry looks like it is best done with a custom recoverability policy, so I followed the Custom Recoverability Policy documentation and came up with this:
public static class UpdateEndpointConfiguration
{
public static void ConfigureEndpointForUpdateVocxoSurveyApi(this EndpointConfiguration configuration)
{
var recoverabilitySettings = configuration.Recoverability();
recoverabilitySettings.CustomPolicy(SetCustomPolicy);
}
private static RecoverabilityAction SetCustomPolicy(RecoverabilityConfig config, ErrorContext context)
{
var action = DefaultRecoverabilityPolicy.Invoke(config, context);
if (context.Exception is DelayedRetryException delayedRetryException)
{
return RecoverabilityAction.DelayedRetry(TimeSpan.FromSeconds(delayedRetryException.DelayRetryTimeoutSeconds));
}
return action;
}
}
Then, as a test, I made a simple message so I don't have to force the web API to do silly things:
public class ForceDelayRetry : ICommand
{
public int DelayInSeconds { get; set; }
}
and then "handle it"
public class TestRequestHandler : IHandleMessages<ForceDelayRetry>
{
private static readonly ILog Log = LogManager.GetLogger(typeof(TestRequestHandler));
public async Task Handle(ForceDelayRetry message, IMessageHandlerContext context)
{
Log.Info($"Start processing {nameof(ForceDelayRetry)}");
var handleUpdateRequestFailure = IoC.Get<HandleUpdateRequestFailure>();
await handleUpdateRequestFailure.HandleFailedRequest(new UpdateRequestFailed
{
DelayRetryTimeoutSeconds = message.DelayInSeconds,
Message = $"For testing purposes I am forcing a delayed retry of {message.DelayInSeconds} second(s)",
RecoveryAction = RecoveryAction.DelayRetry
}, context, 12345);
Log.Info($"Finished processing {nameof(ForceDelayRetry)}");
}
}
I started the service, and in the span of about 1.5 minutes the two test messages were processed approximately 5,400 times. The log output looks similar to this (stack traces omitted for brevity):
20180601 15:28:47 :INFO [14] TestRequestHandler Start processing ForceDelayRetry
20180601 15:28:47 :WARN [22] NServiceBus.RecoverabilityExecutor Delayed Retry will reschedule message '690f317e-5be0-4511-88b9-a8f2013ac219' after a delay of 00:00:01 because of an exception:
20180601 15:28:47 :INFO [14] TestRequestHandler Start processing ForceDelayRetry
20180601 15:28:47 :WARN [14] NServiceBus.RecoverabilityExecutor Delayed Retry will reschedule message '7443e553-b558-486d-b7e9-a8f2014088d5' after a delay of 00:00:01 because of an exception:
20180601 15:28:47 :INFO [4] TestRequestHandler Start processing ForceDelayRetry
20180601 15:28:47 :WARN [14] NServiceBus.RecoverabilityExecutor Delayed Retry will reschedule message '690f317e-5be0-4511-88b9-a8f2013ac219' after a delay of 00:00:01 because of an exception:
20180601 15:28:47 :INFO [14] TestRequestHandler Start processing ForceDelayRetry
20180601 15:28:47 :WARN [14] NServiceBus.RecoverabilityExecutor Delayed Retry will reschedule message '7443e553-b558-486d-b7e9-a8f2014088d5' after a delay of 00:00:01 because of an exception:
So either I'm doing something wrong or there is a bug, but I don't know which. Can anyone see what the problem is?
Edit:
Here is the method handleUpdateRequestFailure.HandleFailedRequest:
public async Task HandleFailedRequest(UpdateRequestFailed failure, IMessageHandlerContext context, long messageSurveyId)
{
switch (failure.RecoveryAction)
{
case RecoveryAction.DelayRetry:
Log.InfoFormat("Recovery action is {0} because {1}. Retrying in {2} seconds", failure.RecoveryAction, failure.Message, failure.DelayRetryTimeoutSeconds);
await context.Send(_auditLogEntryCreator.Create(_logger.MessageIsBeingDelayRetried, messageSurveyId));
throw new DelayedRetryException(failure.DelayRetryTimeoutSeconds);
case RecoveryAction.EndPipelineRequest:
case RecoveryAction.RestartPipelineRequest:
case RecoveryAction.RetryImmediate:
case RecoveryAction.RouteToErrorQueue:
break;
}
}
As a comment pointed out, and as I also found out myself, this gave my message infinite retries. Here is the updated logic:
private static RecoverabilityAction SetCustomPolicy(RecoverabilityConfig config, ErrorContext context)
{
var action = DefaultRecoverabilityPolicy.Invoke(config, context);
if (context.Exception is DelayedRetryException delayedRetryException)
{
if (config.Delayed.MaxNumberOfRetries > context.DelayedDeliveriesPerformed)
return RecoverabilityAction.DelayedRetry(TimeSpan.FromSeconds(delayedRetryException.DelayRetryTimeoutSeconds));
}
return action;
}
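For anyone who wants to reason about the updated policy separately from NServiceBus, the decision it makes can be sketched as a plain function. This is a library-independent Java sketch, not NServiceBus API: `RetryDecision`, `decide`, and all parameter names are made up for illustration, and the fallback branch stands in for whatever the default policy would return.

```java
import java.time.Duration;

// Hypothetical model of the decision in the updated policy; not NServiceBus API.
final class RetryDecision {
    enum Kind { DELAYED_RETRY, FALL_BACK_TO_DEFAULT }

    final Kind kind;
    final Duration delay; // null when falling back to the default action

    private RetryDecision(Kind kind, Duration delay) {
        this.kind = kind;
        this.delay = delay;
    }

    // Mirrors "MaxNumberOfRetries > DelayedDeliveriesPerformed" from the policy:
    // retry with the requested delay only while the delayed-retry budget remains.
    static RetryDecision decide(boolean isDelayedRetryFailure,
                                int requestedDelaySeconds,
                                int maxDelayedRetries,
                                int delayedDeliveriesPerformed) {
        if (isDelayedRetryFailure && delayedDeliveriesPerformed < maxDelayedRetries) {
            return new RetryDecision(Kind.DELAYED_RETRY,
                    Duration.ofSeconds(requestedDelaySeconds));
        }
        return new RetryDecision(Kind.FALL_BACK_TO_DEFAULT, null);
    }
}
```

For example, `RetryDecision.decide(true, 30, 3, 3)` falls back, because the three allowed delayed deliveries have already been used.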

That is that given a reply back from a web API call we want to have a delayed retry, immediate retry, give up, cancel or start over. The delayed retry looks like it is best done using the custom recoverability
I'm not sure I understand what you're trying to achieve beyond what NServiceBus already offers. Let immediate and delayed retries do what they are best at: the actual retrying.
And if you want more functionality, use a saga. Let the saga orchestrate the process and have a separate handler do the actual call to the external service. The saga can then, based on the replies of this handler, decide if it should stop, continue, take an alternate path, etc.
If you want to discuss this further I suggest you contact us at support@particular.net and we can set up a conference call and show you how we'd do this.

Related

MassTransit / RabbitMQ - why so many messages get skipped?

I'm working with 2 .NET Core console applications in a producer/consumer scenario with MassTransit/RabbitMQ. I need to ensure that even if NO consumers are up-and-running, the messages from the producer are still queued up successfully. That didn't seem to work with Publish() - the messages just disappeared, so I'm using Send() instead. The messages at least get queued up, but without any consumers running the messages all end up in the "_skipped" queue.
So that's my first question: is this the right approach based on the requirement (even if NO consumers are up-and-running, the messages from the producer are still queued up successfully)?
With Send(), my consumer does indeed work, but many messages are still falling through the cracks and getting dumped into the "_skipped" queue. The consumer's logic is minimal (just logging the message at the moment), so it's not a long-running process.
So that's my second question: why are so many messages still getting dumped into the "_skipped" queue?
And that leads into my third question: does this mean my consumer needs to listen to the "_skipped" queue as well?
I am unsure what code you need to see for this question, but here's a screenshot from the RabbitMQ management UI:
Producer configuration:
static IHostBuilder CreateHostBuilder(string[] args)
{
return Host.CreateDefaultBuilder()
.ConfigureServices((hostContext, services) =>
{
services.Configure<ApplicationConfiguration>(hostContext.Configuration.GetSection(nameof(ApplicationConfiguration)));
services.AddMassTransit(cfg =>
{
cfg.AddBus(ConfigureBus);
});
services.AddHostedService<CardMessageProducer>();
})
.UseConsoleLifetime()
.UseSerilog();
}
static IBusControl ConfigureBus(IServiceProvider provider)
{
var options = provider.GetRequiredService<IOptions<ApplicationConfiguration>>().Value;
return Bus.Factory.CreateUsingRabbitMq(cfg =>
{
var host = cfg.Host(new Uri(options.RabbitMQ_ConnectionString), h =>
{
h.Username(options.RabbitMQ_Username);
h.Password(options.RabbitMQ_Password);
});
cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName, e =>
{
EndpointConvention.Map<CardMessage>(e.InputAddress);
});
});
}
Producer code:
Bus.Send(message);
Consumer configuration:
static IHostBuilder CreateHostBuilder(string[] args)
{
return Host.CreateDefaultBuilder()
.ConfigureServices((hostContext, services) =>
{
services.AddSingleton<CardMessageConsumer>();
services.Configure<ApplicationConfiguration>(hostContext.Configuration.GetSection(nameof(ApplicationConfiguration)));
services.AddMassTransit(cfg =>
{
cfg.AddBus(ConfigureBus);
});
services.AddHostedService<MassTransitHostedService>();
})
.UseConsoleLifetime()
.UseSerilog();
}
static IBusControl ConfigureBus(IServiceProvider provider)
{
var options = provider.GetRequiredService<IOptions<ApplicationConfiguration>>().Value;
return Bus.Factory.CreateUsingRabbitMq(cfg =>
{
var host = cfg.Host(new Uri(options.RabbitMQ_ConnectionString), h =>
{
h.Username(options.RabbitMQ_Username);
h.Password(options.RabbitMQ_Password);
});
cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName, e =>
{
e.Consumer<CardMessageConsumer>(provider);
});
//cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName + "_skipped", e =>
//{
// e.Consumer<CardMessageConsumer>(provider);
//});
});
}
Consumer code:
class CardMessageConsumer : IConsumer<CardMessage>
{
private readonly ILogger<CardMessageConsumer> logger;
private readonly ApplicationConfiguration configuration;
private long counter;
public CardMessageConsumer(ILogger<CardMessageConsumer> logger, IOptions<ApplicationConfiguration> options)
{
this.logger = logger;
this.configuration = options.Value;
}
public async Task Consume(ConsumeContext<CardMessage> context)
{
this.counter++;
this.logger.LogTrace($"Message #{this.counter} consumed: {context.Message}");
}
}
In MassTransit, the _skipped queue is the implementation of the dead letter queue concept. Messages get there because they don't get consumed.
MassTransit with RabbitMQ always delivers a message to an exchange, not to a queue. By default, each MassTransit endpoint creates a queue with the endpoint name (if it doesn't already exist), creates an exchange with the same name, and binds them together. When the application has a configured consumer (or handler), an exchange for that message type (using the message type as the exchange name) also gets created, and the endpoint exchange gets bound to the message type exchange. So when you use Publish, the message is published to the message type exchange and gets delivered accordingly, using the endpoint binding (or multiple bindings). When you use Send, the message type exchange is not used, so the message goes directly to the destination exchange.
And, as @maldworth correctly stated, every MassTransit endpoint only expects to get messages that it can consume. If it doesn't know how to consume the message, the message is moved to the dead letter queue. This, as well as the poison message queue, is a fundamental messaging pattern.
If you need messages to queue up to be consumed later, the best way is to have the wiring set up, but the endpoint itself (I mean the application) should not be running. As soon as the application starts, it will consume all queued messages.
When the consumer starts the bus with bus.Start(), one of the things it does is create all exchanges and queues for the transport. If you have a requirement that publish/send happens before the consumer starts, your only option is to run DeployTopologyOnly. Unfortunately this feature is not documented in the official docs, but the unit tests are here: https://github.com/MassTransit/MassTransit/blob/develop/src/MassTransit.RabbitMqTransport.Tests/BuildTopology_Specs.cs
Messages end up in the skipped queue when they are sent to a consumer that doesn't know how to process them.
For example, say you have a consumer implementing IConsumer<MyMessageA> on the receive endpoint named "my-queue-a", but your message producer does Send<MyMessageB>(Uri("my-queue-a")...). This is a problem: the consumer only understands A and doesn't know how to process B, so it just moves the message to a skipped queue and continues on.
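The skipping behaviour described above can be modelled in a few lines. This is a hypothetical in-memory Java sketch, not MassTransit API: `ReceiveEndpoint`, `registerConsumer`, and `deliver` are invented names, and real endpoints match on message type metadata rather than on the Java class.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of a receive endpoint; not MassTransit API.
final class ReceiveEndpoint {
    private final Set<Class<?>> consumableTypes = new HashSet<>();
    final List<Object> consumed = new ArrayList<>();
    final List<Object> skipped = new ArrayList<>();

    // Declares that this endpoint has a consumer for the given message type.
    void registerConsumer(Class<?> messageType) {
        consumableTypes.add(messageType);
    }

    // A message whose type has no registered consumer is moved aside, not retried.
    void deliver(Object message) {
        if (consumableTypes.contains(message.getClass())) {
            consumed.add(message);
        } else {
            skipped.add(message);
        }
    }
}

final class MyMessageA {}
final class MyMessageB {}
```

An endpoint with no registered consumers at all skips everything it receives. If this model is right, it would be consistent with the symptom in the question: the producer configuration declares a receive endpoint on the same queue without registering a consumer, so the producer process may itself be pulling messages and skipping them. That is an inference from the posted configuration, not something confirmed in the thread.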
In my case, multiple consumers were listening to the same queue at the same time.

Message remains Unack'd in the rabbit broker despite DefaultRequeueRejected=false

My Scenario: I publish two messages to my Rabbit broker, and an unhandled exception occurs while processing the first message.
My Question: Why does the message remain Unack'd in the broker and, as a consequence, why is the second message not dequeued and processed?
Some info:
I am using Spring AMQP 1.5.4 with Spring Integration 4.2.4. (See code below)
I have a Dead Letter Exchange set up and it is working as expected (i.e. When I Nack a message, it is forwarded to the DLX where it expires. It is then forwarded to the main Exchange).
What I want:
I would like unhandled exceptions (i.e. exceptions that are caught by the SimpleMessageListenerContainer) to result in the amqp-message being Nack'd rather than remaining Unack'd.
What I see:
There are 3 retry attempts to process the message which of course fail because of my forced exception (see code below in ErrorHandler).
The consumer tag of the BlockingQueueConsumer is the same so I'm guessing that the BlockingQueueConsumer is not restarted. However, the logs below show that it does continue to wait for messages.
I would like to know why the BlockingQueueConsumer does not nack the message and why subsequent messages are not consumed, despite the evidence in the logs that the consumer is waiting for messages.
Any suggestions or background info would be very welcome!
@Bean
public SimpleMessageListenerContainer simpleMessageListenerContainer(ConnectionFactory connectionFactory, Queue mainQueue, RetryOperationsInterceptor retryOperationsInterceptor) {
SimpleMessageListenerContainer retVal = new SimpleMessageListenerContainer(connectionFactory);
retVal.addQueues(mainQueue);
retVal.setAcknowledgeMode(AcknowledgeMode.MANUAL);
retVal.setDefaultRequeueRejected(false);
retVal.setAdviceChain(new Advice[]{retryOperationsInterceptor});
return retVal;
}
@Bean
public RetryOperationsInterceptor retryOperationsInterceptor () {
return stateless().recoverer(new RejectAndDontRequeueRecoverer()).build();
}
<int-amqp:inbound-channel-adapter
channel="fromRabbitChannel"
error-channel="errorChannel"
listener-container="simpleMessageListenerContainer"
/>
<int:service-activator ref="errorHandler" input-channel="errorChannel" method="handleError"/>
@MessageEndpoint
public class ErrorHandler {
public void handleError(Message<MessagingException> message) throws IOException {
throw new IllegalStateException("FORCED EXCEPTION");
}
}
09:49:38.219 [SimpleAsyncTaskExecutor-1] INFO c.p.a.f.ErrorHandler - Throwing an exception!!
09:49:38.219 [SimpleAsyncTaskExecutor-1] DEBUG o.s.retry.support.RetryTemplate - Checking for rethrow: count=3
09:49:38.219 [SimpleAsyncTaskExecutor-1] DEBUG o.s.retry.support.RetryTemplate - Retry failed last attempt: count=3
09:49:38.220 [SimpleAsyncTaskExecutor-1] WARN o.s.a.r.r.RejectAndDontRequeueRecoverer - Retries exhausted for message (Body:'[B@c78ef32(byte[97])'MessageProperties [blah blah])
org.springframework.amqp.rabbit.listener.exception.ListenerExecutionFailedException: Listener threw exception
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.wrapToListenerExecutionFailedExceptionIfNeeded(AbstractMessageListenerContainer.java:865) [spring-rabbit-1.5.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:760) [spring-rabbit-1.5.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:680) [spring-rabbit-1.5.2.RELEASE.jar:na]
....
....
09:49:38.221 [SimpleAsyncTaskExecutor-1] WARN o.s.a.r.l.ConditionalRejectingErrorHandler - Execution of Rabbit message listener failed.
org.springframework.amqp.rabbit.listener.exception.ListenerExecutionFailedException: Retry Policy Exhausted
at org.springframework.amqp.rabbit.retry.RejectAndDontRequeueRecoverer.recover(RejectAndDontRequeueRecoverer.java:44) ~[spring-rabbit-1.5.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.config.StatelessRetryOperationsInterceptorFactoryBean$1.recover(StatelessRetryOperationsInterceptorFactoryBean.java:59) ~[spring-rabbit-1.5.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.config.StatelessRetryOperationsInterceptorFactoryBean$1.recover(StatelessRetryOperationsInterceptorFactoryBean.java:53) ~[spring-rabbit-1.5.2.RELEASE.jar:na]
at org.springframework.retry.interceptor.RetryOperationsInterceptor$ItemRecovererCallback.recover(RetryOperationsInterceptor.java:124) ~[spring-retry-1.1.2.RELEASE.jar:na]
at org.springframework.retry.support.RetryTemplate.handleRetryExhausted(RetryTemplate.java:458) ~[spring-retry-1.1.2.RELEASE.jar:na]
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:320) ~[spring-retry-1.1.2.RELEASE.jar:na]
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:168) ~[spring-retry-1.1.2.RELEASE.jar:na]
....
....
09:49:38.222 [SimpleAsyncTaskExecutor-1] DEBUG o.s.a.r.l.BlockingQueueConsumer - Retrieving delivery for Consumer: tags=[{amq.ctag-XVCBQNXxCMFERaF1kbeI3Q=debitCardStatusQueue}], channel=Cached Rabbit Channel: AMQChannel(amqp://guest@127.0.0.1:5671/,1), acknowledgeMode=MANUAL local queue size=0
09:49:39.222 [SimpleAsyncTaskExecutor-1] DEBUG o.s.a.r.l.BlockingQueueConsumer - Retrieving delivery for Consumer: tags=[{amq.ctag-XVCBQNXxCMFERaF1kbeI3Q=debitCardStatusQueue}], channel=Cached Rabbit Channel: AMQChannel(amqp://guest@127.0.0.1:5671/,1), acknowledgeMode=MANUAL local queue size=0
retVal.setAcknowledgeMode(AcknowledgeMode.MANUAL);
With manual acks, you are responsible for acking or rejecting the message yourself; the container will only ack/nack if you set the mode to AUTO, in which case it will do exactly what you require.
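To make "you are responsible" concrete, here is a minimal sketch of what a listener has to do itself in MANUAL mode. It is self-contained Java: `AckChannel` is a stand-in interface that only mimics the shape of the RabbitMQ client's `basicAck` and `basicNack` calls, and `ManualAckListener` and `RecordingChannel` are hypothetical names, so treat it as an illustration rather than Spring AMQP code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the two acknowledgement calls; shaped like the RabbitMQ
// client's Channel methods of the same names (assumption for illustration).
interface AckChannel {
    void basicAck(long deliveryTag, boolean multiple);
    void basicNack(long deliveryTag, boolean multiple, boolean requeue);
}

final class ManualAckListener {
    // In MANUAL mode every delivery must be settled by the listener itself;
    // if neither call is made, the message stays unacknowledged on the broker.
    static void onMessage(long deliveryTag, Runnable handler, AckChannel channel) {
        try {
            handler.run();
            channel.basicAck(deliveryTag, false);
        } catch (RuntimeException e) {
            // requeue=false so a configured dead letter exchange can take over
            channel.basicNack(deliveryTag, false, false);
        }
    }
}

// Records calls so the behaviour can be observed without a broker.
final class RecordingChannel implements AckChannel {
    final List<String> calls = new ArrayList<>();

    public void basicAck(long deliveryTag, boolean multiple) {
        calls.add("ack:" + deliveryTag);
    }

    public void basicNack(long deliveryTag, boolean multiple, boolean requeue) {
        calls.add("nack:" + deliveryTag + ":requeue=" + requeue);
    }
}
```

Switching the container to AUTO, as suggested above, removes the need for this code entirely, which is usually the simpler option.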

Web API 2 return OK response but continue processing in the background

I have created an MVC Web API 2 webhook for Shopify:
public class ShopifyController : ApiController
{
// PUT: api/Afilliate/SaveOrder
[ResponseType(typeof(string))]
public IHttpActionResult WebHook(ShopifyOrder order)
{
// need to return 202 response otherwise webhook is deleted
return Ok(ProcessOrder(order));
}
}
Where ProcessOrder loops through the order and saves the details to our internal database.
However if the process takes too long then the webhook calls the api again as it thinks it has failed. Is there any way to return the ok response first but then do the processing after?
Kind of like when you return a redirect in an mvc controller and have the option of continuing with processing the rest of the action after the redirect.
Please note that I will always need to return the OK response, as Shopify in all its wisdom has decided to delete the webhook if it fails 19 times (and taking too long to process counts as a failure).
I have managed to solve my problem by running the processing asynchronously by using Task:
// PUT: api/Afilliate/SaveOrder
public IHttpActionResult WebHook(ShopifyOrder order)
{
// this should process the order asynchronously
var tasks = new[]
{
Task.Run(() => ProcessOrder(order))
};
// without the await here, this should be hit before the order processing is complete
return Ok("ok");
}
There are a few options to accomplish this:
Let a task runner like Hangfire or Quartz run the actual processing, where your web request just kicks off the task.
Use queues, like RabbitMQ, to run the actual process, and the web request just adds a message to the queue... be careful this one is probably the best but can require some significant know-how to setup.
This one may not be applicable to your specific situation, since you have another process waiting for the request to return. If you did not, you could use JavaScript AJAX to kick off the process in the background, and perhaps turn retries off on that request. That still keeps the request going in the background, though, so it may not be your cup of tea.
I used Response.CompleteAsync(); like below. I also added a neat middleware and attribute to indicate no post-request processing.
[SkipMiddlewareAfterwards]
[HttpPost]
[Route("/test")]
public async Task Test()
{
/*
let them know you've 202 (Accepted) the request
instead of 200 (Ok), because you don't know that yet.
*/
HttpContext.Response.StatusCode = 202;
await HttpContext.Response.CompleteAsync();
await SomeExpensiveMethod();
//Don't return, because default middleware will kick in. (e.g. error page middleware)
}
public class SkipMiddlewareAfterwards : ActionFilterAttribute
{
//ILB
}
public class SomeMiddleware
{
private readonly RequestDelegate next;
public SomeMiddleware(RequestDelegate next)
{
this.next = next;
}
public async Task Invoke(HttpContext context)
{
await next(context);
if (context.Features.Get<IEndpointFeature>().Endpoint.Metadata
.Any(m => m is SkipMiddlewareAfterwards)) return;
//post-request actions here
}
}
Task.Run(() => ImportantThing()) is not an appropriate solution, as it exposes you to a number of potential problems, some of which have already been explained above. In my opinion, the most nefarious of these is an unhandled exception in the background task, which can kill your worker process outright, leaving no trace of the error beyond event logs or whatever is captured at the OS level, if that is even available. Not good.
There are many more appropriate ways to handle this scenario, such as handing off to a service bus or implementing a HostedService.
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-6.0&tabs=visual-studio
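The hand-off idea behind both the queue and the hosted-service suggestions can be sketched without any framework. The following is a rough Java illustration, not ASP.NET code: `OrderIntake` and its members are hypothetical, the queue is in-memory only (so queued orders are lost if the process dies, which is why a durable queue is the sturdier choice), and the status code is returned as a plain `int`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: the request path only enqueues; one worker does the work.
final class OrderIntake {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    OrderIntake(Consumer<String> processor) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String order = queue.take(); // blocks until an order arrives
                    try {
                        processor.accept(order);
                    } catch (RuntimeException e) {
                        // A failing order is logged and must not kill the worker.
                        System.err.println("Processing failed for " + order + ": " + e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly on shutdown
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Called from the request handler: returns immediately with 202 (Accepted).
    int accept(String order) {
        queue.add(order);
        return 202;
    }
}
```

The point of the design is that exceptions are caught and logged inside the worker loop, which addresses the unhandled-exception concern raised against a bare fire-and-forget task.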

NServiceBus Send() vs SendLocal() and exceptions

We are implementing a saga that calls out to other services with NServiceBus. I'm not quite clear about how NServiceBus deals with exceptions inside a saga.
Inside the saga we have a handler, and that handler calls an external service that should only be called once the original message handler completes successfully. Is it okay to do:
public void Handle(IFooMessage message)
{
// named barMessage so it does not clash with the handler parameter "message"
var barMessage = Bus.CreateInstance<ExternalService.IBarMessage>();
Bus.Send(barMessage);
// something bad happens here, exception is thrown
}
or will the message be sent to ExternalService multiple times? Someone here has suggested changing it to:
// handler in the saga
public void Handle(IFooMessage message)
{
// Do something
// named sendBarMessage so it does not clash with the handler parameter "message"
var sendBarMessage = Bus.CreateInstance<ISendBarMessage>();
Bus.SendLocal(sendBarMessage);
// something bad happens, exception is thrown
}
// a service-level handler
public void Handle(ISendBarMessage message)
{
var barMessage = Bus.CreateInstance<ExternalService.IBarMessage>();
Bus.Send(barMessage);
}
I've done an experiment and from what I can tell the first method seems fine, but I can't find any documentation other than http://docs.particular.net/nservicebus/errors/ which says:
When an exception bubbles through to the NServiceBus infrastructure, it rolls back the transaction on a transactional endpoint, causing the message to be returned to the queue, and any messages that user code tried to send or publish to be undone as well.
Any help to clarify this point would be much appreciated.
As long as you're doing messaging from your saga and not doing any web service calls, then you're safe - no need to do SendLocal.
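The behaviour quoted from the documentation, where sends are undone when the handler throws, can be pictured as outgoing messages being held back until the handler finishes. The Java sketch below models only that idea; it is not how NServiceBus is implemented, and `TransactionalDispatcher` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of "sends are undone if the handler throws".
final class TransactionalDispatcher {
    // Messages that actually left the endpoint.
    final List<String> transport = new ArrayList<>();

    // Runs the handler; anything it "sends" is buffered and released only
    // if the handler completes without throwing.
    boolean process(Consumer<List<String>> handler) {
        List<String> pending = new ArrayList<>();
        try {
            handler.accept(pending);
        } catch (RuntimeException e) {
            return false; // pending sends are discarded; the incoming message would be retried
        }
        transport.addAll(pending);
        return true;
    }
}
```

Under this model the first version of the handler is safe for messaging, since a failed attempt never releases the message. A direct web service call is different because it cannot be held back or undone, which matches the caveat in the answer.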

What WCF Exceptions should I retry on failure for? (such as the bogus 'xxx host did not receive a reply within 00:01:00')

I have a WCF client that has thrown this common error, only for it to be resolved by retrying the HTTP call to the server. For what it's worth, this exception was not generated after 1 minute. It was generated after 3 seconds.
The request operation sent to xxxxxx did not receive a reply within the configured timeout (00:01:00). The time allotted to this operation may have been a portion of a longer timeout. This may be because the service is still processing the operation or because the service was unable to send a reply message. Please consider increasing the operation timeout (by casting the channel/proxy to IContextChannel and setting the OperationTimeout property) and ensure that the service is able to connect to the client
How are professionals handling these common WCF errors? What other bogus errors should I handle?
For example, I'm considering timing the WCF call and if that above (bogus) error is thrown in under 55 seconds, I retry the entire operation (using a while() loop). I believe I have to reset the entire channel, but I'm hoping you guys will tell me what's right to do.
I make all of my WCF calls from a custom "using" statement which handles exceptions and potential retries. My code optionally allows me to pass a policy object to the statement so I can easily change the behavior, for example if I don't want to retry on error.
The gist of the code is as follows:
[MethodImpl(MethodImplOptions.NoInlining)]
public static void ProxyUsing<T>(ClientBase<T> proxy, Action action)
where T : class
{
try
{
proxy.Open();
using(OperationContextScope context = new OperationContextScope(proxy.InnerChannel))
{
//Add some headers here, or whatever you want
action();
}
}
catch(FaultException fe)
{
//Handle stuff here
}
finally
{
try
{
if(proxy != null
&& proxy.State != CommunicationState.Faulted)
{
proxy.Close();
}
else
{
proxy.Abort();
}
}
catch
{
if(proxy != null)
{
proxy.Abort();
}
}
}
}
You can then use the call like follows:
ProxyUsing<IMyService>(myService = GetServiceInstance(), () =>
{
myService.SomeMethod(...);
});
The NoInlining call probably isn't important for you. I need it because I have some custom logging code that logs the call stack after an exception, so it's important to preserve that method hierarchy in that case.
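The gist above does not show the retry part, so here is one way such a policy could look. It is a generic Java sketch rather than WCF code, and every name in it (`Retry`, `withFreshResource`, `isTransient`) is hypothetical. It assumes, as the question suspects, that each attempt should start from a fresh proxy or channel, and it leaves the decision about which exceptions are worth retrying to the caller.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical retry helper: each attempt gets a freshly created resource.
final class Retry {
    static <R, T> T withFreshResource(Supplier<R> factory,
                                      Function<R, T> action,
                                      Predicate<RuntimeException> isTransient,
                                      int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            R resource = factory.get(); // never reuse a possibly faulted resource
            try {
                return action.apply(resource);
            } catch (RuntimeException e) {
                if (!isTransient.test(e)) {
                    throw e; // not worth retrying; surface it immediately
                }
                last = e;
            }
        }
        throw last; // every attempt failed with a transient error
    }
}
```

A bounded attempt count avoids the unbounded while() loop mentioned in the question, and adding a pause between attempts would be a natural extension.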