How to safely consume from RabbitMQ - rabbitmq

I'm pretty new to rabbit but looking at how consumption works. In the basic example you consume a message, and that by it's very nature removes the message from the queue.
What happens if I blow up. Or my server is shut down once that message is consumed, but before I've done anything with it?
In the Kafka world I maintain an offset, the message remains on the queue, and I can re-consume it in another instance.
Obviously case by case consideration is needed - if I've blown up, then chances are next time I consume that message i'll blow up too. But I can manage that myself - what I really want to know is whether there is a way to take a message, no one else gets it, then 'ack' the message to remove it later on?

Related

RabbitMQ how to only have one message at a time and don't requeue on failure

Our system has a bunch of consumers that use rabbit to consume messages for long running tasks. Currently we ack at the end of processing, so that if the consumer crashes, the message gets requeued. What we want is that a consumer only works on one message at a time and does not prefetch so that another consumer can work on the next message, and if a crash occurs we do not requeue, but we'll have our own monitor that will decide whether we need to re-run on a larger EC2 instance or whatever. It looks like we can get CLOSE to this by acking at start of processing with a prefetch of 1, but that is still 1 message in the queue that could have been handled by another consumer. Apparently setting prefetch to 0 makes no sense
according to rabbit devs (I don't understand why), so another option would be to still ack only on completion so that a prefetch doesn't occur, but somehow DON'T requeue on crash.
If we are swimming upstream so to speak then I know we'll have to come up with another plan, but I don't understand why the desire for a consumer to only work on one thing at a time (and not prefetch the next item of work) and to not requeue on crash is so odd
Consider using one of the RabbitTemplate receive() or receiveAndConvert methods instead; that's a better model for this type of workload - fetching records as needed instead of them being pushed into your app.

ActiveMQ: How do I limit the number of messages being dispatched?

Let's say I have one ActiveMQ Broker and an undefined numbers of consumers.
Problem:
To process a message, consumers need an external service which is either "DATA1" or "DATA2" (specified in the message)
Each server, "DATA1" and "DATA2", can only handle 20 connections
So at most 20 "DATA1" and 20 "DATA2" messages must be dispatched at any time
Because of priorization, the messages must be enqueued in the same queue
Even if message A has a higher prio than message B, if A can't be processed because the external service has no free slots, message B needs to be processed instead
How can this be solved? As long as I was using message pulling (prefetch of 0), I was able to do this by using a BrokerPlugin that, on messagePull, achieved this by using semaphores and selectors. If the limits were reached, the pull returned null.
However, due to performance issues I had to set prefetch to 1 and use push instead. Therefore, my messagePull hack no longer works (it's never called).
So far I'm considering implementing a custom Cursor but I was wondering if someone knows a better solution.
Update the custom cursor worked but broke features like message removal. I tried with a custom Queue and QueueDispatchSelector (which is a pain to configure since there isn't a proper API to do so) and it mostly works but I still have synchronisation issues.
Also, a very suitable API seems to be DispatchPolicy, however, while it is referenced by Queue, it's never used.
Queues give you buffering for system processing time for free. Messages are delivered on demand. With prefetch=0 or prefetch=1, should effectively get you there. Messages will only be delivered to a consumer when the consumer is ready (ie.. during the consumer.receive() method).
consumer.receive() is a blocking call, so you should not need any custom plugin or other to delay delivery until the consumer process (and its required downstream services) are ready to handle it.
The behavior should work out-of-the-box, or there are some details to your use case that are not provided to shed more light on the scenario.

RabbitMQ: how to handle unwanted duplicate un-ack message after connection lost?

In my app(multiple instances), we occasionally see the case where connection is lost between my app and rabbitmq due to network issues(my app and rabbitmq are both alive), then after connection is recovered(re-established) we will receive messages that are unacked.
This creates an issue for us, because my app wasn't dead, and it is still processing the same message it received before, but now the message is redeivered, and it causes the app to process the message again (which can be fatal to us).
Since the app has multiple instances, it is not easy for an instance to check if another instance is processing the same message at the same time. We can't simply filter out redelivered message, because we need this feature to handle instance/app crashes/re-deployments.
It doesn't seem that there is an api to tell rabbitmq when to not redeliver unacked messages.
So what is the recommended practice to handle this situation ?
Thanks,
The general solution for such scenario is to make the consumers handle the messages in an idempotent manner . Generally what I do is from the producer side ( in case there is no unique identifier in the message body ) I add an attribute idempotencyId to the message body which is a guid and on the consumer side for each message this id is validated against the stored value in database , any duplicates are rejected.
This approach also works for messages which might be shoveled from another cluster or if in a same cluster multiple instances of consumers are listening then too this approach guarantee one time processing.
Would suggest to go over the RabbitMQ Reliability Guide here
Yeah, exactly-once delivery is not something RabbitMQ is good at. In fact, I'd say you should probably not be using it for these kinds of problems. Honestly, the only way to truly fix this is to use distributed transactions or locking.
Anyway, you could turn the problem on its head by ack'ing the message as soon as the consumer gets it, before it starts working on it. That would avoid the RabbitMQ-related duplication issue at least. This is at-most-once delivery.
Of course, it means that if the consumer crashes, the message is lost forever. So you need to persist the message right before you ack it so you can recover it later and also the consumer should remove it once it's complete.
Considering that crashes are rare, you can then have a single dedicated process that just works on those persisted messages. Or for that matter, handle them manually.
Just be aware that you are pushing the duplication problem in front of you, because the consumer might fail to remove the persisted message after it's done working with it anyway, but at least you have the option to implement it however you want.
Storage in this case could be anything from files, a RDBMS or something like ZooKeeper or Redis to lock/unlock in-flight messages.

How to guarantee message order in RabbitMQ (or any other asynchronous message queue service)

I have a Java application which publishes events to RabbitMQ. It has one very important characteristic: message order must be preserved at all times. The consumer can handle duplicates, but it cannot handle when message 2 is enqueued before message 1, so to say.
I have been reading a lot about RabbitMQ lately, and I feel there is only solution to do this: set the channel in confirm mode (https://www.rabbitmq.com/confirms.html - basically, it forces the broker to acknowledge the publication) and publish one by one. With one by one I mean that the message 2 is only published after RabbitMQ confirmed (via an asynchronous ACK response) that message 1 is actually well received and persisted.
I tried this in a conceptual implementation, and while this works fine, it's uber slow, without exaggerating. Which makes sense: after all, we are now limiting our message rate to 1 message at a time.
So this leads me to my question: are there other, more performant, ways to ensure that message ordering is always preserved (either in RabbitMQ or via different approaches)?
Although my concern is RabbitMQ, I believe this question might be applied to any kind of asynchronous message queue service.
RabbitMQ's clients enqueue in the same order that you sent. It's when subscribers go down, you get network splits or the subscriber NACKs messages that they can get re-ordered; and even then RMQ tries to keep them in the same approximate order by re-queueing at the same position, or as close to the same position.
You can do it like you suggest; take one message at a time, because if you take a message, but crash before you've ACKed it from the broker, it will pop up when your service comes back up, at the same position.
This assumes you only have a single service instance at any given time, consuming from the queue. Which in turn is a distributed systems problem on its own, if you have a scheduler like Kubernetes or Mesos, spawning your service instances.
Another solution would be to ensure ordering of processing in the receiving service, by "resequencing" the messages based on their logical timestamps/sequence numbers.
I've written a much more thorough guide as annotated code here https://github.com/haf/rmq-publisher-confirms-hopac/blob/master/src/Server/Shared/RabbitMQ.fs — with batching you can resequence. Furthermore, if your idempotence builds the consecutive sequence numbers into its logic, you can start taking batches and each event will be idempotent, despite being re-consumed.

NServiceBus pub/sub - where have my messages gone?

Well I've been doing this NServiceBus project for a while and once I got it working for PubSub I then spent the rest of the time on the actual workflow logic. However, I can see a serious issue which I want to get around (or rather learn how to handle correctly).
A publisher publishes a message to the storage queues of any subscribers as far as I understand. Great. But what happens when the subscriber isn't running (I've read other posts about this and they don't seem to be asking the same question).
Scenario - I get the publisher to Publish a message when no subscribers are running (attached/requested messages to be relayed to them).. I then find that.. the message is "gone" just simply isn't there! where did it go? Did the publisher say "hey, no one's subscribing to this, so I wont bother publishing it?", shouldn't it NOT do that and require at least one subscriber?
Can anyone shed any light on this? (nservicenewbie)
You should publish an event that has happened - a statement of fact, that other handler may or may not be interested in. It's perfectly valid to have zero subscribers! If this is not the case then maybe you should be Send()ing a command instead of Publish()ing an event.
If you are using a persistent subscription storage, start the subscriber up once and it will always be subscribed. If the subscriber is offline, messages for it will pile up in its Input Queue, ready to be processed when the subscriber comes back online.
If you're just testing with NServiceBus, the NServiceBus.Host.exe is running in the Lite profile, which uses in-memory (non-persistant) subscription storage, which would result in what you are seeing.
Ah ha! Well though it's not always an error to have no subscriber for a message type, there is a way to handle it.
In your publisher simply modify the:
IBus Bus
To use (you will need NServiceBus.Core.dll and the NS NServiceBus.Unicast):
IUnicastBus Bus
Then you can attach an handler to:
Bus.NoSubscribersForMessage += .......
This can then put the message in an error queue.. or perhaps retry forever.. or publish something else etc.. etc.. what ever you want. Thus ensuring there is nothing lost where your particular system (from a business perspective) requires an outcome