Handling messages along with long-term retry policy - rabbitmq

Greetings to MT experts.
In my app I have a default retry policy that retries a message every 3 minutes for 30 minutes in total. If many failed messages are affected by this policy (more than 16), other messages are not handled (even successful ones). This is a huge problem, because if there are 16 broken messages, the whole queue is blocked for 30 minutes.
I'm sure there is a solution for this, but I haven't found any.

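The solution is redelivery, also known as second-level retries. Instead of holding the message in the consumer while it waits for the next attempt, the message is scheduled to be delivered to the queue again later, so the consumer stays free to handle other messages.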
Here is the documentation.
There are two ways to use it:
Explicit redelivery from a consumer, called on exception:
public async Task Consume(ConsumeContext<ScheduleNotification> context)
{
    try
    {
        // try to update the database
    }
    catch (CustomerNotFoundException exception)
    {
        // schedule redelivery in one minute (Redeliver returns a Task, so await it)
        await context.Redeliver(TimeSpan.FromMinutes(1));
    }
}
Or using configuration and policy (part of the endpoint configuration delegate):
ep.Consumer<CSomeConsumer>(c => c.Message<SomeMessage>(
    x => x.UseDelayedRedelivery(
        p =>
        {
            p.Handle<SqlException>(e => e.Message.Contains("Timeout"));
            p.Exponential(40, TimeSpan.FromSeconds(10), TimeSpan.FromHours(1),
                TimeSpan.FromSeconds(4));
        })));
Remember that you must have scheduling configured to use this feature. It can be done using Quartz or RabbitMQ/AzureSB integrated scheduling features.
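As a rough sketch of how the pieces can fit together, assuming MassTransit 7.1 or later with container-based configuration and the RabbitMQ delayed message exchange plugin enabled on the broker: a short in-memory retry deals with transient failures, and delayed redelivery takes over for the long waits. The UpdateCustomer message, the UpdateCustomerConsumer class, the host, the credentials and the intervals are all made up for illustration.

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical message and consumer, for illustration only.
public class UpdateCustomer
{
    public Guid CustomerId { get; set; }
}

public class UpdateCustomerConsumer : IConsumer<UpdateCustomer>
{
    public Task Consume(ConsumeContext<UpdateCustomer> context)
    {
        // Try to update the database here; a thrown exception
        // triggers the retry and redelivery policies configured below.
        return Task.CompletedTask;
    }
}

public static class MessagingSetup
{
    public static void AddMessaging(IServiceCollection services)
    {
        services.AddMassTransit(x =>
        {
            x.AddConsumer<UpdateCustomerConsumer>();

            x.UsingRabbitMq((context, cfg) =>
            {
                // Assumed local broker with default credentials.
                cfg.Host("localhost", "/", h =>
                {
                    h.Username("guest");
                    h.Password("guest");
                });

                // Scheduling through the RabbitMQ delayed exchange plugin.
                cfg.UseDelayedMessageScheduler();

                cfg.ReceiveEndpoint("customer_update_queue", e =>
                {
                    // Second-level retries: the message goes back to the broker
                    // and returns after each interval, so it does not occupy
                    // the consumer while waiting.
                    e.UseDelayedRedelivery(r => r.Intervals(
                        TimeSpan.FromMinutes(3),
                        TimeSpan.FromMinutes(10),
                        TimeSpan.FromMinutes(30)));

                    // First-level retries: a few immediate in-memory attempts.
                    e.UseMessageRetry(r => r.Immediate(3));

                    e.ConfigureConsumer<UpdateCustomerConsumer>(context);
                });
            });
        });
    }
}
```

Redelivery is configured before the in-memory retry so that a message is only rescheduled once the immediate attempts are exhausted. On MassTransit 7 the hosted service also has to be registered, as in the AddMassTransitHostedService() call used elsewhere on this page.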

I think you're looking for the circuit breaker pattern, which can be applied in MassTransit as follows:
cfg.ReceiveEndpoint(host, "customer_update_queue", e =>
{
    e.UseCircuitBreaker(cb =>
    {
        cb.TrackingPeriod = TimeSpan.FromMinutes(1);
        cb.TripThreshold = 15;
        cb.ActiveThreshold = 10;
        cb.ResetInterval = TimeSpan.FromMinutes(5);
    });
    // other configuration
});
More information can be found in the docs:
http://masstransit-project.com/MassTransit/advanced/middleware/circuit-breaker.html

Related

MassTransit RMQ scheduling scheduled but not sent

I'm trying to implement a scheduling mechanism with MassTransit/RabbitMQ.
I've added the configuration as stated in the docs:
Uri schedulerEndpoint = new (Constants.MassTransit.SchedulerEndpoint);
services.AddMassTransit(mtConfiguration =>
{
    mtConfiguration.AddMessageScheduler(schedulerEndpoint);
    mtConfiguration.AddSagaStateMachine<ArcStateMachine, ArcProcess>(typeof(ArcSagaDefinition))
        .Endpoint(e => e.Name = massTransitConfiguration.SagaQueueName)
        .MongoDbRepository(mongoDbConfiguration.ConnectionString, r =>
        {
            r.DatabaseName = mongoDbConfiguration.DbName;
            r.CollectionName = mongoDbConfiguration.CollectionName;
        });
    mtConfiguration.UsingRabbitMq((context, cfg) =>
    {
        cfg.UseMessageScheduler(schedulerEndpoint);
        cfg.Host(new Uri(rabbitMqConfiguration.Host), hst =>
        {
            hst.Username(rabbitMqConfiguration.Username);
            hst.Password(rabbitMqConfiguration.Password);
        });
        cfg.ConfigureEndpoints(context);
    });
});
Then I'm sending a scheduled message using the Bus:
DateTime messageScheduleTime = DateTime.UtcNow + TimeSpan.FromMinutes(1);
await _MessageScheduler.SchedulePublish<ScheduledMessage>(messageScheduleTime, new
{
    ActivationId = context.Data.ActivationId
});
_MessageScheduler is the IMessageScheduler instance.
I do see the Scheduler queue receive the scheduled message and I see the correct scheduledTime property in it but the message does not reach the state machine whenever its schedule should fire. Seems like I'm missing something in the configuration or some MassTransit service that is not started.
Please, assist.
If you actually read the documentation, you would see that UseDelayedMessageScheduler is the proper configuration for using RabbitMQ for scheduling, and that AddDelayedMessageScheduler is the matching container-based IMessageScheduler registration.
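A minimal sketch of that pair of calls, assuming MassTransit 7.1 or later and the RabbitMQ delayed message exchange plugin enabled on the broker. The ReminderDue message, the ReminderService class, the host and the credentials are hypothetical; the saga and MongoDB registration from the question is left out to keep the example short.

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical message type, for illustration only.
public class ReminderDue
{
    public Guid ActivationId { get; set; }
}

public static class SchedulingSetup
{
    public static void AddScheduling(IServiceCollection services)
    {
        services.AddMassTransit(x =>
        {
            // Registers IMessageScheduler in the container, backed by
            // the transport's delayed delivery.
            x.AddDelayedMessageScheduler();

            x.UsingRabbitMq((context, cfg) =>
            {
                // Assumed local broker with default credentials.
                cfg.Host("localhost", "/", h =>
                {
                    h.Username("guest");
                    h.Password("guest");
                });

                // Tells the bus to schedule through the delayed exchange.
                cfg.UseDelayedMessageScheduler();

                cfg.ConfigureEndpoints(context);
            });
        });
    }
}

// Hypothetical service that schedules a message one minute ahead.
public class ReminderService
{
    private readonly IMessageScheduler _scheduler;

    public ReminderService(IMessageScheduler scheduler)
    {
        _scheduler = scheduler;
    }

    public Task ScheduleAsync(Guid activationId)
    {
        DateTime when = DateTime.UtcNow + TimeSpan.FromMinutes(1);
        return _scheduler.SchedulePublish(when, new ReminderDue { ActivationId = activationId });
    }
}
```

Unlike AddMessageScheduler(Uri) and UseMessageScheduler(Uri), these calls take no scheduler endpoint address: the delay is handled by the broker, so no separate scheduler service needs to consume from a scheduler queue.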

How to add a health check in .NET 5 for MassTransit Amazon SQS?

I have a working and running .NET 5 application. There is a REST API to POST data. When data is posted, a MassTransit message is published. Using the AWS Explorer, I can clearly see that if the topic named customerinfo.fifo does not exist, it is created in Amazon by my application. Apparently, my application does something useful with Amazon SQS/SNS. I like that.
What I do not like, is that when adding a health check, an error appears causing my health check to be "Unhealthy" (after calling http://localhost:5000/health/ready ). This should not happen as my application is working fine and being able to publish messages. Logically, an "Unhealthy" status should only occur when there is something wrong. Here is my code responsible for the health check. This is part of the ConfigureServices method.
services.AddHealthChecks();
services.Configure<HealthCheckPublisherOptions>(options =>
{
    options.Delay = TimeSpan.FromSeconds(2);
    options.Predicate = (check) => check.Tags.Contains("ready");
});
To use the added health check, the following code is added to the Configure method.
app.UseEndpoints(endpoints =>
{
    endpoints.MapHealthChecks("/health/ready", new HealthCheckOptions()
    {
        Predicate = (check) => check.Tags.Contains("ready"),
    });
    endpoints.MapHealthChecks("/health/live", new HealthCheckOptions());
});
This code shown above is basically how it is documented in the MassTransit documentation. Moreover, I also have code to add MassTransit SQS itself. I have specific extension method for this:
public static void UseMassTransit(this IServiceCollection services, MassTransitConfiguration massTransitConfiguration)
{
    services.AddMassTransit(x =>
    {
        x.AddConsumer<CustomerChangeConsumer>();
        x.UsingAmazonSqs((context, cfg) =>
        {
            cfg.Host(massTransitConfiguration.Host, h =>
            {
                h.AccessKey(massTransitConfiguration.AccessKey);
                h.SecretKey(massTransitConfiguration.SecretKey);
                h.EnableScopedTopics();
            });
            cfg.ReceiveEndpoint("CustomerChangeConsumer",
                configurator =>
                {
                    configurator.ConfigureConsumer<CustomerChangeConsumer>(context);
                });
            cfg.Message<CustomerUpdate>(x =>
            {
                x.SetEntityName("customerupdate.fifo");
            });
            cfg.Publish<CustomerUpdate>(x =>
            {
                x.TopicAttributes["FifoTopic"] = "true";
            });
        });
    });
    services.AddMassTransitHostedService();
}
To solve this problem, two types of solution are possible:
Just remove the MassTransit health check. It cannot fail if it is not there any more. With this solution, I would need to find a way to implement an SQS/SNS health check myself.
Fix (my?) code in order to make the health check work properly.
Obviously, the second one is preferred, but I have no idea how to implement either of them.
To further clarify my problem, the logged error message when doing a health check is shown here.
fail:
Microsoft.Extensions.Diagnostics.HealthChecks.DefaultHealthCheckService[103]
Health check masstransit-bus completed after 13.2985ms with status Unhealthy and description 'Not ready: not started'
MassTransit.AmazonSqsTransport.Exceptions.AmazonSqsConnectionException: ReceiveTransport faulted: https://eu-west-2/
 ---> Amazon.SQS.AmazonSQSException: Access to the resource https://sqs.eu-west-2.amazonaws.com/ is denied.
 ---> Amazon.Runtime.Internal.HttpErrorResponseException: Exception of type 'Amazon.Runtime.Internal.HttpErrorResponseException' was thrown.
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.SQS.Internal.ValidationResponseHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   --- End of inner exception stack trace ---
   at GreenPipes.Caching.Internals.NodeValueFactory`1.CreateValue()
   at GreenPipes.Caching.Internals.NodeTracker`1.AddNode(INodeValueFactory`1 nodeValueFactory)
   --- End of inner exception stack trace ---
   at MassTransit.Transports.ReceiveTransport`1.ReceiveTransportAgent.RunTransport()
   at MassTransit.Transports.ReceiveTransport`1.ReceiveTransportAgent.Run()
I really hope someone can help me with this. It is so strange. MassTransit and SQS/SNS are working fine together. The only problem is that the health check denies that, which is really frustrating.
I found a link that says you need to define a custom health check, as described here:
https://github.com/MassTransit/MassTransit/issues/1513#issuecomment-554943164
Have you tried setting AutoStart = true? It helped me with RabbitMQ, so it may help you as well.
mt.UsingAmazonSqs((context, cfg) =>
{
    cfg.Host(massTransitConfiguration.Host, h =>
    {
        h.AccessKey(massTransitConfiguration.AccessKey);
        h.SecretKey(massTransitConfiguration.SecretKey);
        h.EnableScopedTopics();
    });
    cfg.AutoStart = true;
    cfg.ReceiveEndpoint("CustomerChangeConsumer",
        configurator =>
        {
            configurator.ConfigureConsumer<CustomerChangeConsumer>(context);
        });
    cfg.Message<CustomerUpdate>(x =>
    {
        x.SetEntityName("customerupdate.fifo");
    });
    cfg.Publish<CustomerUpdate>(x =>
    {
        x.TopicAttributes["FifoTopic"] = "true";
    });
});

MassTransit / RabbitMQ - why so many messages get skipped?

I'm working with 2 .NET Core console applications in a producer/consumer scenario with MassTransit/RabbitMQ. I need to ensure that even if NO consumers are up-and-running, the messages from the producer are still queued up successfully. That didn't seem to work with Publish() - the messages just disappeared, so I'm using Send() instead. The messages at least get queued up, but without any consumers running the messages all end up in the "_skipped" queue.
So that's my first question: is this the right approach based on the requirement (even if NO consumers are up-and-running, the messages from the producer are still queued up successfully)?
With Send(), my consumer does indeed work, but many messages are still falling through the cracks and getting dumped into the "_skipped" queue. The consumer's logic is minimal (just logging the message at the moment), so it's not a long-running process.
So that's my second question: why are so many messages still getting dumped into the "_skipped" queue?
And that leads into my third question: does this mean my consumer needs to listen to the "_skipped" queue as well?
I am unsure what code you need to see for this question, but here's a screenshot from the RabbitMQ management UI:
Producer configuration:
static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder()
        .ConfigureServices((hostContext, services) =>
        {
            services.Configure<ApplicationConfiguration>(hostContext.Configuration.GetSection(nameof(ApplicationConfiguration)));
            services.AddMassTransit(cfg =>
            {
                cfg.AddBus(ConfigureBus);
            });
            services.AddHostedService<CardMessageProducer>();
        })
        .UseConsoleLifetime()
        .UseSerilog();
}
static IBusControl ConfigureBus(IServiceProvider provider)
{
    var options = provider.GetRequiredService<IOptions<ApplicationConfiguration>>().Value;
    return Bus.Factory.CreateUsingRabbitMq(cfg =>
    {
        var host = cfg.Host(new Uri(options.RabbitMQ_ConnectionString), h =>
        {
            h.Username(options.RabbitMQ_Username);
            h.Password(options.RabbitMQ_Password);
        });
        cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName, e =>
        {
            EndpointConvention.Map<CardMessage>(e.InputAddress);
        });
    });
}
Producer code:
Bus.Send(message);
Consumer configuration:
static IHostBuilder CreateHostBuilder(string[] args)
{
    return Host.CreateDefaultBuilder()
        .ConfigureServices((hostContext, services) =>
        {
            services.AddSingleton<CardMessageConsumer>();
            services.Configure<ApplicationConfiguration>(hostContext.Configuration.GetSection(nameof(ApplicationConfiguration)));
            services.AddMassTransit(cfg =>
            {
                cfg.AddBus(ConfigureBus);
            });
            services.AddHostedService<MassTransitHostedService>();
        })
        .UseConsoleLifetime()
        .UseSerilog();
}
static IBusControl ConfigureBus(IServiceProvider provider)
{
    var options = provider.GetRequiredService<IOptions<ApplicationConfiguration>>().Value;
    return Bus.Factory.CreateUsingRabbitMq(cfg =>
    {
        var host = cfg.Host(new Uri(options.RabbitMQ_ConnectionString), h =>
        {
            h.Username(options.RabbitMQ_Username);
            h.Password(options.RabbitMQ_Password);
        });
        cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName, e =>
        {
            e.Consumer<CardMessageConsumer>(provider);
        });
        //cfg.ReceiveEndpoint(host, typeof(CardMessage).FullName + "_skipped", e =>
        //{
        //    e.Consumer<CardMessageConsumer>(provider);
        //});
    });
}
Consumer code:
class CardMessageConsumer : IConsumer<CardMessage>
{
    private readonly ILogger<CardMessageConsumer> logger;
    private readonly ApplicationConfiguration configuration;
    private long counter;
    public CardMessageConsumer(ILogger<CardMessageConsumer> logger, IOptions<ApplicationConfiguration> options)
    {
        this.logger = logger;
        this.configuration = options.Value;
    }
    public async Task Consume(ConsumeContext<CardMessage> context)
    {
        this.counter++;
        this.logger.LogTrace($"Message #{this.counter} consumed: {context.Message}");
    }
}
In MassTransit, the _skipped queue is the implementation of the dead letter queue concept. Messages end up there because they don't get consumed.
MassTransit with RMQ always delivers a message to an exchange, not to a queue. By default, each MassTransit endpoint creates a queue with the endpoint name (if no such queue exists) and an exchange with the same name, and binds them together. When the application has a configured consumer (or handler), an exchange for that message type (using the message type as the exchange name) also gets created, and the endpoint exchange gets bound to the message type exchange.
So, when you use Publish, the message is published to the message type exchange and gets delivered accordingly, using the endpoint binding (or multiple bindings). When you use Send, the message type exchange is not used, so the message goes directly to the destination exchange. And, as @maldworth correctly stated, every MassTransit endpoint only expects to get messages that it can consume. If it doesn't know how to consume the message, the message is moved to the dead letter queue. This, as well as the poison message queue, is a fundamental pattern of messaging.
If you need messages to queue up to be consumed later, the best way is to have the wiring set up while the endpoint itself (I mean the application) is not running. As soon as the application starts, it will consume all queued messages.
When the consumer starts the bus (bus.Start()), one of the things it does is create all the exchanges and queues for the transport. If you have a requirement that publish/send happens before the consumer starts, your only option is to run DeployTopologyOnly. Unfortunately, this feature is not documented in the official docs, but the unit tests are here: https://github.com/MassTransit/MassTransit/blob/develop/src/MassTransit.RabbitMqTransport.Tests/BuildTopology_Specs.cs
Messages end up in the skipped queue when they are sent to an endpoint whose consumer doesn't know how to process them.
For example, suppose you have a consumer implementing IConsumer<MyMessageA> on the receive endpoint named "my-queue-a", but your message producer does Send<MyMessageB>(Uri("my-queue-a")...). This is a problem: the consumer only understands A and doesn't know how to process B, so the endpoint just moves the message to the skipped queue and continues on.
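For contrast, a send that matches what the endpoint consumes might look like the sketch below. The MyMessageA class body is a hypothetical stand-in, and the short "queue:" address form assumes MassTransit 6 or later; older versions need the full rabbitmq:// URI of the queue.

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;

// Hypothetical message contract; it must be the same type (same namespace
// and name) that the consumer on "my-queue-a" implements IConsumer<> for.
public class MyMessageA
{
    public string Text { get; set; }
}

public static class Sender
{
    // 'bus' is assumed to be an already started bus instance.
    public static async Task SendToQueueA(IBus bus)
    {
        // Resolve the send endpoint for the queue the consumer listens on.
        ISendEndpoint endpoint = await bus.GetSendEndpoint(new Uri("queue:my-queue-a"));

        // The endpoint has a consumer for MyMessageA, so this message is consumed.
        // Sending any type the endpoint has no consumer for would land in
        // my-queue-a_skipped instead.
        await endpoint.Send(new MyMessageA { Text = "hello" });
    }
}
```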
In my case, multiple consumers listen to the same queue at the same time.

Keep messages in queue while consumer is offline

I use MassTransit in a C# project.
I have a publisher and consumer services, and when both of them are up, then there are no problems. But if the consumer goes offline, published messages don't go to the queue. They just disappear.
The expected behavior is to keep messages in the queue until the consumer is started, and then send them to it. I've found several topics in google groups with same questions, but it wasn't clear for me how to solve that problem.
It seems strange to me that this functionality isn't provided out of the box because, in my understanding, it is the main purpose of RabbitMQ and MT.
The way I create publisher bus:
public static IBusControl CreateBus()
{
    return Bus.Factory.CreateUsingRabbitMq(sbc =>
    {
        var host = sbc.Host(new Uri("rabbitmq://RMQ-TEST"), h =>
        {
            h.Username("test");
            h.Password("test");
        });
        sbc.ReceiveEndpoint(host, "test_queue", ep =>
        {
            ep.Handler<IProductDescriptionChangedEvent>(
                content => content.CompleteTask);
        });
    });
}
And the consumer:
public static void StartRmqBus()
{
    var bus = Bus.Factory.CreateUsingRabbitMq(cfg =>
    {
        var host = cfg.Host(new Uri("rabbitmq://RMQ-TEST"), h =>
        {
            h.Username("test");
            h.Password("test");
        });
        cfg.ReceiveEndpoint(host, "test_queue", ep =>
        {
            ep.Consumer<ProductChangedConsumer>();
        });
    });
    bus.Start();
}
EDIT:
Here is one more interesting observation: if I stop both services and manually put a message into the queue via the RabbitMQ admin interface, the message waits in test_queue. But when I start the publisher or the consumer service, it is moved to the test_queue_error queue.
You use the same queue for the publisher and the consumer, and the publisher also has a consumer for this message type, as you pointed out in your own answer.
If your publisher does not consume messages, it is better to remove the receive endpoint from it altogether, so that the service becomes send-only.
If you have several services that each need their own consumers for the same message type, this is how pub-sub works, and you must have a different queue per service. This is described in the Common Gotchas section of the documentation. In such a scenario, each service will get its own copy of the published message.
If you have one queue, you get competing consumers. This scenario is only valid for horizontal scalability, where you run several instances of the same service to increase the number of processed messages when processing is too slow. In that case, all these instances consume messages from the same queue, and only one instance gets each message.
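The three arrangements described above can be sketched roughly as follows, assuming MassTransit 6 or later (where ReceiveEndpoint no longer takes the host argument). The ProductId property, the two consumer classes and the queue names are invented for illustration.

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;

// Message contract named in the question; the property is hypothetical.
public interface IProductDescriptionChangedEvent
{
    Guid ProductId { get; }
}

// Hypothetical consumers living in two different services.
public class SearchIndexConsumer : IConsumer<IProductDescriptionChangedEvent>
{
    public Task Consume(ConsumeContext<IProductDescriptionChangedEvent> context) => Task.CompletedTask;
}

public class AuditConsumer : IConsumer<IProductDescriptionChangedEvent>
{
    public Task Consume(ConsumeContext<IProductDescriptionChangedEvent> context) => Task.CompletedTask;
}

public static class BusFactory
{
    private static void ConfigureHost(IRabbitMqBusFactoryConfigurator cfg)
    {
        cfg.Host(new Uri("rabbitmq://RMQ-TEST"), h =>
        {
            h.Username("test");
            h.Password("test");
        });
    }

    // Publisher: no receive endpoint, so it never consumes its own messages.
    public static IBusControl CreatePublisherBus()
    {
        return Bus.Factory.CreateUsingRabbitMq(cfg => ConfigureHost(cfg));
    }

    // Service A: its own queue, so it gets its own copy of each published event.
    public static IBusControl CreateSearchServiceBus()
    {
        return Bus.Factory.CreateUsingRabbitMq(cfg =>
        {
            ConfigureHost(cfg);
            cfg.ReceiveEndpoint("search_service_queue", ep => ep.Consumer<SearchIndexConsumer>());
        });
    }

    // Service B: a different queue name, so it also gets every event.
    // Running several instances of this same service gives competing
    // consumers on "audit_service_queue".
    public static IBusControl CreateAuditServiceBus()
    {
        return Bus.Factory.CreateUsingRabbitMq(cfg =>
        {
            ConfigureHost(cfg);
            cfg.ReceiveEndpoint("audit_service_queue", ep => ep.Consumer<AuditConsumer>());
        });
    }
}
```

Because the queues are durable by default, events published while a service is down stay in that service's queue, provided the service has run at least once so that its queue and bindings exist.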
It seems like my publisher was set up incorrectly. After removing this part:
sbc.ReceiveEndpoint(host, "test_queue", ep =>
{
    ep.Handler<IProductDescriptionChangedEvent>(
        content => content.CompleteTask);
});
it started to work as expected. Looks like it consumed its own messages, that's why I didn't see messages in the queue when the consumer was down.

Redis Timeout Expired message on GetClient call

I hate the questions that have "Not Enough Info". So I will try to give detailed information. And in this case it is code.
Server:
64 bit of https://github.com/MSOpenTech/redis/tree/2.6/bin/release
There are three classes:
DbOperationContext.cs: https://gist.github.com/glikoz/7119628
PerRequestLifeTimeManager.cs: https://gist.github.com/glikoz/7119699
RedisRepository.cs https://gist.github.com/glikoz/7119769
We are using Redis with Unity.
In this case we are getting this strange message:
"Redis Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use."
We checked the following:
Is the problem a configuration issue?
Are we using the wrong RedisServer.exe?
Is there an architectural problem?
Any idea? Any similar story?
Thanks.
Extra Info 1
There is no rejected connection issue on server stats (I've checked it via redis-cli.exe info command)
I have continued to debug this problem, and have fixed numerous things on my platform to avoid this exception. Here is what I have done to solve the issue:
Executive summary:
People encountering this exception should check:
That the PooledRedisClientManager (IRedisClientsManager) is registered in a singleton scope
That the RedisMqServer (IMessageService) is registered in a singleton scope
That any RedisClient obtained from either of the above is properly disposed of, so that pooled clients are returned to the pool and not left stale.
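As a container-free sketch of the first and third points, assuming ServiceStack.Redis and a Redis server on localhost:6379: one manager is shared by the whole process, and every client taken from it is wrapped in a using block. The RedisAccess class and its method names are hypothetical.

```csharp
using System;
using ServiceStack.Redis;

public static class RedisAccess
{
    // One pool for the whole process. Creating a new manager per request
    // would create a new pool each time and defeat the pooling.
    private static readonly Lazy<IRedisClientsManager> Manager =
        new Lazy<IRedisClientsManager>(() => new PooledRedisClientManager("localhost:6379"));

    public static void Save(string key, string value)
    {
        // Disposing the client hands it back to the pool; without the using
        // block the pool eventually runs dry and GetClient() times out.
        using (IRedisClient redis = Manager.Value.GetClient())
        {
            redis.Set(key, value);
        }
    }

    public static string Load(string key)
    {
        using (IRedisClient redis = Manager.Value.GetClient())
        {
            return redis.Get<string>(key);
        }
    }
}
```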
The solution to my problem:
First of all, this exception is thrown by the PooledRedisClientManager because it has no more pooled connections available.
I'm registering all the required Redis stuff in the StructureMap IoC container (not Unity, as in the asker's case). Thanks to this post I was reminded that the PooledRedisClientManager should be a singleton. I also decided to register the RedisMqServer as a singleton:
ObjectFactory.Configure(x =>
{
    // register the message queue stuff as Singletons in this AppDomain
    x.For<IRedisClientsManager>()
        .Singleton()
        .Use(BuildRedisClientsManager);
    x.For<IMessageService>()
        .Singleton()
        .Use<RedisMqServer>()
        .Ctor<IRedisClientsManager>().Is(i => i.GetInstance<IRedisClientsManager>())
        .Ctor<int>("retryCount").Is(2)
        .Ctor<TimeSpan?>().Is(TimeSpan.FromSeconds(5));
    // Retrieve a new message factory from the singleton IMessageService
    x.For<IMessageFactory>()
        .Use(i => i.GetInstance<IMessageService>().MessageFactory);
});
My "BuildRedisClientsManager" function looks like this:
private static IRedisClientsManager BuildRedisClientsManager()
{
    var appSettings = new AppSettings();
    var redisClients = appSettings.Get("redis-servers", "redis.local:6379").Split(',');
    var redisFactory = new PooledRedisClientManager(redisClients);
    redisFactory.ConnectTimeout = 5;
    redisFactory.IdleTimeOutSecs = 30;
    redisFactory.PoolTimeout = 3;
    return redisFactory;
}
Then, when it comes to producing messages it's very important that the utilized RedisClient is properly disposed of, otherwise we run into the dreaded "Timeout Expired" (thanks to this post). I have the following helper code to send a message to the queue:
public static void PublishMessage<T>(T msg)
{
    try
    {
        using (var producer = GetMessageProducer())
        {
            producer.Publish<T>(msg);
        }
    }
    catch (Exception ex)
    {
        // TODO: Log or whatever... I'm not throwing to avoid showing users that we have a broken MQ
    }
}
private static IMessageQueueClient GetMessageProducer()
{
    var producer = ObjectFactory.GetInstance<IMessageService>() as RedisMqServer;
    var client = producer.CreateMessageQueueClient();
    return client;
}
I hope this helps solve your issue too.