I've got a consumer that rejects messages and knows exactly why each message was rejected. I'd like it to provide the "why" as well as the "what" to the producer when rejecting a message.
What's a good queue architecture for nack'ing messages but also sending back metadata describing why the message failed?
(At a higher level, if the producer isn't doing anything with the nack reason codes, I'm thinking that logging the reason codes from the consumer would suffice for visibility, so the question becomes moot. Still, it seems like an interesting question assuming otherwise.)
You can use the RPC model as described here:
https://www.rabbitmq.com/tutorials/tutorial-six-java.html
In this way you can send a message with the reason back to the publisher.
You can also consider the Dead Letter Exchanges extension, but you can't change the message, so you are only informed that your message has been rejected.
With a little work, you can create an exchange to which you redirect the nacked messages, and use the message's header property to carry the reason, like this:
import java.util.HashMap;
import java.util.Map;
import com.rabbitmq.client.AMQP;
Map<String, Object> myHeader = new HashMap<String, Object>();
myHeader.put("reason", "can't access the database"); // <-- just an example
AMQP.BasicProperties.Builder bob = new AMQP.BasicProperties.Builder();
bob.headers(myHeader);
In this way you keep the original message body and modify only the headers (similar to a dead-lettered message).
hope it helps
I ran into a similar issue. My solution was to assign a unique ID to each message on sending (using the message properties) and then, on rejection, save the error (associated with the assigned ID) into redis / memcached (I also used key expiration in redis so as not to overload the storage). This is possible in my case because I handle all these dead messages quickly, so errors do not need to be kept for a long time.
Probably not so elegant, but I didn't want to publish anything manually, I preferred to rely on native rabbit functionality, and I didn't need many changes to the code.
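To make the idea above concrete, here is a minimal sketch of such a store. It is an in-memory stand-in, not the Redis/memcached setup the answer actually used; the class name RejectionReasonStore, its methods, and the injectable clock are all hypothetical and only there for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical in-memory stand-in for the Redis/memcached store described above.
class RejectionReasonStore {
    private static final class Entry {
        final String reason;
        final long expiresAtMillis;

        Entry(String reason, long expiresAtMillis) {
            this.reason = reason;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested

    RejectionReasonStore(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Called by the consumer just before it rejects the message.
    void recordRejection(String messageId, String reason) {
        entries.put(messageId, new Entry(reason, clock.getAsLong() + ttlMillis));
    }

    // Called by whoever handles the dead message; expired entries are dropped.
    Optional<String> reasonFor(String messageId) {
        Entry entry = entries.get(messageId);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(messageId);
            return Optional.empty();
        }
        return Optional.of(entry.reason);
    }
}
```

The consumer would call recordRejection with the ID it read from the message properties right before rejecting, and the side that processes dead messages would call reasonFor with the same ID. With a real Redis you would presumably let the server handle expiry instead of checking timestamps yourself.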
I'm following one of the RabbitMQ RPC tutorials (https://www.rabbitmq.com/tutorials/tutorial-six-dotnet.html) and got a little confused around Correlation ID matching.
The tutorial states:
That's when the CorrelationId property is used. We're going to set it to a unique value for every request. Later, when we receive a message in the callback queue we'll look at this property, and based on that we'll be able to match a response with a request. If we see an unknown CorrelationId value, we may safely discard the message - it doesn't belong to our requests.
But why is it "safe" to discard the message after we've already consumed it from the queue? What about the client that is expecting that message? Shouldn't the message be re-queued to prevent loss?
Sounds reasonable. But after a requeue it is not guaranteed that the sender of the original message will receive it. Without more knowledge about the setup of exchanges, bindings and queues, it is hard to tell whether requeuing makes sense or not.
The tutorial linked above seems to leave out this complex problem intentionally. I think it would be out of scope for a tutorial that shows the reader how to technically use the RPC feature of RabbitMQ.
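For what it's worth, the matching step the tutorial describes can be sketched roughly like this. It assumes each client consumes from its own callback queue, so a reply with an unknown ID cannot belong to some other client; that is an assumption about the setup, not something the question confirms. The class PendingReplies and its methods are made up for this sketch and are not part of any RabbitMQ client.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that tracks outstanding RPC requests by correlation ID.
class PendingReplies {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Register a request before publishing it; the returned value is the
    // correlation ID to set on the outgoing message.
    String register(CompletableFuture<String> reply) {
        String correlationId = UUID.randomUUID().toString();
        pending.put(correlationId, reply);
        return correlationId;
    }

    // Called for every message that arrives on the callback queue.
    // Returns false when the ID is unknown, i.e. the message is discarded.
    boolean onReply(String correlationId, String body) {
        if (correlationId == null) {
            return false;
        }
        CompletableFuture<String> reply = pending.remove(correlationId);
        if (reply == null) {
            return false; // not one of our requests (or already answered)
        }
        reply.complete(body);
        return true;
    }
}
```

Under that assumption, an unknown ID most likely means a duplicate or stale reply, which is why dropping it loses nothing. If several clients shared one callback queue, the concern raised in the question would apply and discarding would not be safe.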
Before a consumer nacks a message, is there any way the consumer can modify the message's state so that when it consumes the message again upon redelivery, it sees that changed state? I'd rather not reject + reenqueue a new message, but please let me know if that's the only way to accomplish this.
My goal is to determine how many times specific messages are being redelivered. I see two ways of doing this:
(1) On the message itself as described above. The message would be a container of basic stats and the application payload message.
(2) In some external storage. We would uniquely identify the message by the message id that we set.
I know 2 is possible, but my question is if 1 is possible.
There is no way to do (1) the way you want. You would need to change the message, and the message would thus become another message. If you want to do something like that (and it's possible that this is what you meant by "I'd rather not reject + reenqueue new message"), you should ACK the message, increment a field in it and publish it again. So your message would carry some ID, a counter, and the actual payload that is the content.
Definitely the much better way is (2), for multiple reasons:
it does not interfere with business logic, that is, this diagnostic part is isolated
you are leaving re-queueing to rabbitmq (as you are supposed to do), meaning that you are not worrying about losing messages or handling message meta info which has no use for your business logic
this is what acknowledgements and rejections are for, which is why they are part of the protocol
since you do need the number of times specific messages have been redelivered, you have it somewhere externally, meaning that it's independent of (rabbitmq's) message persistence, lifetime, queue durability, mirroring, etc.
Even though this question was marked as solved some time ago, I want to mention that there is a way, at least for counting redeliveries. It may have been added after the original answer was written. There is a different type of queue in RabbitMQ called a quorum queue.
Quorum queues offer the option to set redelivery limit:
Quorum queues support poison message handling via a redelivery limit. This feature is currently unique to Quorum queues.
In order to achieve this, RabbitMQ counts the number of deliveries in a message header. The header attribute is called: x-delivery-count
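As a rough illustration, a consumer could read that header like this. With the RabbitMQ Java client the headers come from properties.getHeaders(), which returns a Map<String, Object> and may be null. The helper name DeliveryCount is made up, and treating a missing header as zero is my assumption; I'm also not relying on a particular numeric type for the value, hence the Number check.

```java
import java.util.Map;

// Hypothetical helper; only the header name x-delivery-count comes from RabbitMQ.
class DeliveryCount {
    // Returns the value of the x-delivery-count header, or 0 when the headers
    // or the header itself are missing or not numeric (an assumption of this sketch).
    static long fromHeaders(Map<String, Object> headers) {
        if (headers == null) {
            return 0L;
        }
        Object value = headers.get("x-delivery-count");
        return (value instanceof Number) ? ((Number) value).longValue() : 0L;
    }
}
```

In a consumer callback this would be called as DeliveryCount.fromHeaders(properties.getHeaders()), which answers the "how many times was this redelivered" question without changing the message or keeping external state.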
We want to use Akka to implement a scenario where messages are fetched from a message queue (RabbitMQ) and then processed by a chain of actors. The queue is durable and messages must not be lost. So we need to send an acknowledgement (BasicAck in RabbitMQ) back to the queue in order to finalize the dequeued message. Because of that, the very last actor in the processing chain needs to do the acknowledgement. This seems to be a rather common need, and I wonder if there is a known pattern for this. Vaughn Vernon in his book writes about using Return Address, so all messages sent along the chain will have the return address (of the MQ channel actor) and the correlation identifier that specifies the queue message tag. Is this the proper way to do it?
An alternative is to ack the message right after receiving it and then use persistent actors to guarantee its delivery, but I was advised against this approach because the use of AMQP eliminates the need for actor persistence in this particular scenario.
I'm not really familiar with Akka, but I think I get the gist of what it does (very similar to a "process" in Erlang, I think, which is what RMQ is built on).
In general, your first suggestion from Vaughn Vernon's book is the way to go.
In my specific scenarios, I have taken a "middleware" approach to what you are suggesting. My specific middleware implementation forwards the message itself through a chain of commands that process the message. Each command calls an action.next() method to continue forwarding to the next command.
Prior to sending the message through the middleware, I create a default last-command-in-the-chain. This default command simply calls actions.ack(), which, behind the scenes, acknowledges the message.
I do things this way so that the commands never have to know anything about how to actually implement the mechanics of completing and moving on to the next thing. They have an API specific to themselves, being commands in a chain.
This allows me to change the implementation of acknowledging the message, or how I handle messages from RMQ, etc., without changing the commands directly.
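A stripped-down sketch of that kind of chain might look like the following. It is not the actual middleware from this answer: the names Actions, Command and CommandChain are invented here, and the final acknowledgement is reduced to a Runnable that the chain runs once the last command has called next().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical API handed to each command; commands only know about next().
interface Actions {
    void next();
}

// Hypothetical command in the chain.
interface Command {
    void handle(String message, Actions actions);
}

class CommandChain {
    private final List<Command> commands = new ArrayList<>();
    private final Runnable ack; // how the message is acknowledged, hidden from commands

    CommandChain(Runnable ack) {
        this.ack = ack;
    }

    CommandChain add(Command command) {
        commands.add(command);
        return this;
    }

    void process(String message) {
        run(0, message);
    }

    private void run(int index, String message) {
        if (index == commands.size()) {
            ack.run(); // default last step: acknowledge the message
            return;
        }
        // Each command decides whether to continue by calling actions.next().
        commands.get(index).handle(message, () -> run(index + 1, message));
    }
}
```

If any command never calls next(), the ack step is never reached, so the message stays unacknowledged. Swapping the Runnable is all it takes to change how acknowledging works, which is the decoupling described above.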
Ack'ing the message immediately introduces danger, as your actor could crash, Akka itself could crash, and a host of other problems can (and will) occur, and you'll be more likely to lose the message.
Remember, though: there is no 100% perfect setup. You will, at some point, lose a message or process the same message twice. Your system needs to handle these scenarios in some way, at some point. Everything you're doing is heading down the right path to make this less likely, but nothing will ever prevent crashes and message loss 100% of the time.
What I'm really trying to do is leave the message on the queue in the case where it is rejected by the current consumer. In RabbitMQ I could send a NACK to accomplish this. Is NACK supported in EasyNetQ? Is there another way to achieve the behavior I'm looking for?
Update: not a lot of responses, so I'm wondering how people are generally handling the lack of NACK in EasyNetQ. Not having the equivalent of basic.reject limits consumers to "I can always process every message" scenarios. I suppose consumers could throw a specific "rejected" exception to cause EasyNetQ to dequeue the message to the error queue, and I could requeue messages with those errors. Anyone else have other workarounds in place?
I used EasyNetQ for almost a year, but no matter how we tweaked it (amongst other things added our own implementation of IConsumerErrorStrategy) I never really got it to work the way I wanted. The fact that it is single threaded gave us some unexpected behaviour (sometimes deadlocks) when performing RequestAsync while in a SubscribeAsync handler.
The solution for us was to move from EasyNetQ. After working with the official RabbitMq Client for a while, I spent a few days writing a super thin client on top of that. It is influenced by EasyNetQ and supports most of the concepts that EasyNetQ has. However, I added some neat features like pluggable message contexts. I think that the Nack feature of IAdvancedMessageContext that I just added can be something for you:
var client = service.GetService<IBusClient<AdvancedMessageContext>>();
client.RespondAsync<BasicRequest, BasicResponse>((req, ctx) =>
{
    ctx?.Nack(); // the context implements IAdvancedMessageContext.
    return Task.FromResult<BasicResponse>(null);
}, cfg => cfg.WithNoAck(false));
If you're interested, you can read more about it on the GitHub page (especially the NackTests.cs).
I think you can change the behavior by implementing your own IConsumerErrorStrategy:
https://github.com/EasyNetQ/EasyNetQ/blob/master/Source/EasyNetQ/Consumer/DefaultConsumerErrorStrategy.cs
But if you need that kind of control you might consider just using the RabbitMQ client directly?
It sounds like you are trying to handle failures. You can NACK a message, but that means it sits at the head of the queue. Great, but then it means that you could end up with a bunch of messages that are truthfully unable to be processed, and you will be unable to actually process real messages.
The solution that I have always used when using RabbitMQ is to utilize the default error handling of EasyNetQ, and have a separate application to resend messages. That is, when an exception is thrown while handling a message, EasyNetQ routes the message to a queue called "EasyNetQ_Default_Error_Queue". You are able to override this name and have different queues go to different error queues, but for now let's stick with the default. You can then have a Windows Service/Azure Worker role reading these messages, and working out what to do. That may include having a "RetryCount" on your message envelope/wrapper to make sure that it only loops around so many times. All in all, it's going to be a bit of work.
What you are finding is what many people run into when using RabbitMQ/EasyNetQ. It's pretty raw.
I am trying to set up broadcast messaging to all nodes in the system. When a new node joins the system, it publishes a message to everyone else to announce its entry. The way I have designed it, there is an exchange to which every node binds its own queue. Whenever a new node joins the system, it binds its queue to the exchange as well and publishes a message to the exchange. All nodes receive this message (including the new node itself), and all other nodes (everyone except the new node) send an "ack" message so that the new node gets to know the available nodes in the system. But somehow I haven't been able to get this working. My broadcast messages don't propagate to every node in the system. A simple setup with one node publishing and the rest consuming works, but having the same node both publish and consume goes wrong somewhere.
Is there any other efficient way of doing this apart from the logic mentioned above? Or is there any restriction on the rabbitmq side that prevents this, or is my code buggy and do I have to take a closer look at it?
The way you described it, your solution should work. However, without more detailed code examples (of the consume/publish logic in the "announcer" and the consume/acknowledge-publish logic in the other peers) it's difficult to debug.
A couple common problems could be tripping you up, though:
If you're considering "did I get responses back from all the other nodes" as the authority for "did the other nodes get my announce message?", you might need to acknowledge (basic.ack in AMQP) the messages your announcer is receiving as it gets them. Otherwise, it's possible you're not seeing subsequent messages due to consumer prefetch, though in most client libraries you'd have to be explicitly turning that on somewhere first.
Make sure your other peers (the ones receiving the "announce" and sending a message back) are acknowledging the message as well, or are consuming in "no-ack" mode. Otherwise, if they get blocked (via flow, rate-limiting, or prefetch), they will probably receive announces for a while and then stop.
Make sure you're using a "fanout" type exchange. It sounds like you want unconditional-fanout behavior, so you don't need to muck about with topic routing. If you're using a topic or direct exchange, you may have a bug in your routing logic, in which case switching to fanout will work. I suspect you're already doing this though.
This is likely not the issue, but: you mention that your peers (not the announcer) are "acknowledging" the announce. Make sure that they acknowledge the announce by publishing a new message back to the announcer's queue directly (with no exchange, just a routing key), not by sending a basic.ack to RabbitMQ (that doesn't notify the sender of anything), and not by publishing an announce-received to the fanout exchange.
As an aside, I don't know why you're doing declare-queue/bind/publish as opposed to publish/declare-queue/bind; is there a good reason you need an announcing node to receive its own announce message? If you're after a "self-test" behavior, I think it's probably better to just implement a periodic "can things announce successfully?" health-check somewhere instead, though that's entirely subjective.
Have you tried an RPC-style message, with a callback queue that you identify in the broadcast message's properties? Like in the rabbitmq tutorial.