I have installed RabbitMQ on windows server 2012 64 Bit.
I Tested Publishing And Consuming Parts with Huge Data Everything is fine, the only problem i am facing is the messages in a queue are getting lost after RabbitMQServer restart.
I am using VB.Net SDK of RabbitMQ.
I am setting "Durable" property of Queue Declare to true, and DeliveryMode BasicQueueProperties to "2" to make the Messages persistent. But still the messages are getting lost after my server restart.
How can I overcome this?
https://www.rabbitmq.com/tutorials/tutorial-two-dotnet.html
In this page Message durability on RabbitMQ, it's explained good:
At this point we're sure that the task_queue queue won't be lost even if RabbitMQ restarts. Now we need to mark our messages as persistent - by setting IBasicProperties.Persistent to true.
var properties = channel.CreateBasicProperties();
properties.Persistent = true;
Note on message persistence
Marking messages as persistent doesn't fully guarantee that a message
won't be lost. Although it tells RabbitMQ to save the message to disk,
there is still a short time window when RabbitMQ has accepted a
message and hasn't saved it yet. Also, RabbitMQ doesn't do fsync(2)
for every message -- it may be just saved to cache and not really
written to the disk. The persistence guarantees aren't strong, but
it's more than enough for our simple task queue. If you need a
stronger guarantee then you can use publisher confirms.
Related
In my app(multiple instances), we occasionally see the case where connection is lost between my app and rabbitmq due to network issues(my app and rabbitmq are both alive), then after connection is recovered(re-established) we will receive messages that are unacked.
This creates an issue for us, because my app wasn't dead, and it is still processing the same message it received before, but now the message is redeivered, and it causes the app to process the message again (which can be fatal to us).
Since the app has multiple instances, it is not easy for an instance to check if another instance is processing the same message at the same time. We can't simply filter out redelivered message, because we need this feature to handle instance/app crashes/re-deployments.
It doesn't seem that there is an api to tell rabbitmq when to not redeliver unacked messages.
So what is the recommended practice to handle this situation ?
Thanks,
The general solution for such scenario is to make the consumers handle the messages in an idempotent manner . Generally what I do is from the producer side ( in case there is no unique identifier in the message body ) I add an attribute idempotencyId to the message body which is a guid and on the consumer side for each message this id is validated against the stored value in database , any duplicates are rejected.
This approach also works for messages which might be shoveled from another cluster or if in a same cluster multiple instances of consumers are listening then too this approach guarantee one time processing.
Would suggest to go over the RabbitMQ Reliability Guide here
Yeah, exactly-once delivery is not something RabbitMQ is good at. In fact, I'd say you should probably not be using it for these kinds of problems. Honestly, the only way to truly fix this is to use distributed transactions or locking.
Anyway, you could turn the problem on its head by ack'ing the message as soon as the consumer gets it, before it starts working on it. That would avoid the RabbitMQ-related duplication issue at least. This is at-most-once delivery.
Of course, it means that if the consumer crashes, the message is lost forever. So you need to persist the message right before you ack it so you can recover it later and also the consumer should remove it once it's complete.
Considering that crashes are rare, you can then have a single dedicated process that just works on those persisted messages. Or for that matter, handle them manually.
Just be aware that you are pushing the duplication problem in front of you, because the consumer might fail to remove the persisted message after it's done working with it anyway, but at least you have the option to implement it however you want.
Storage in this case could be anything from files, a RDBMS or something like ZooKeeper or Redis to lock/unlock in-flight messages.
I am using celery and rabbitmq , but due to pushing several task in queue my server memory utilization becomes more than 40% , so that rabbit further will not accepting any task . so i want to delete those message which are already executed , but due to durable behavior of rabbitmq those message not automatically delete, so i want to set some configuration like autoAck=True , so that if message is consumed from celery ,it will delete from rabbitmq queues and also from my server memory. please explain how can we do that .
OK, so while I don't fully understand why you have the problem you have, it is clear what is going on.
A publisher puts a message task in the queue
Your worker process pulls the message and processes it
The message is never actually removed from the queue
This behavior happens when a consumer fails to acknowledge the processing of a message. To confirm, if you look at the RabbitMQ management plug-in, you'll see a whole bunch of unacknowledged messages. These will be unavailable for consumption, but will continue to be held on the server and taking up disk space and memory.
Further, if you do a Basic.Recover, all of these messages will then get dumped back into the queue to be processed again.
This problem is due to incorrect configuration of your consumer. There are two ways to address this:
You can configure the consumer to auto-ack (i.e. acknowledge the message automatically upon receipt). This is done when you declare the consumer (using Basic.Consume). Edit: It looks like this may be the default behavior of Celery.
You can configure your worker process to submit an acknowledgement (using Basic.Ack). Edit: this is done via the acks_late property in Celery.
I want to ensure that certain kind of messages couldn't be lost, hence I should use Confirms (aka Publisher Acknowledgements).
The broker loses persistent messages if it crashes before said
messages are written to disk. Under certain conditions, this causes
the broker to behave in surprising ways.
For instance, consider this scenario:
a client publishes a persistent message to a durable queue
a client consumes the message from the queue (noting that the message is persistent and the queue durable), but doesn't yet ack it,
the broker dies and is restarted, and
the client reconnects and starts consuming messages.
At this point, the client could reasonably assume that the message
will be delivered again. This is not the case: the restart has caused
the broker to lose the message. In order to guarantee persistence, a
client should use confirms.
But what if, using confirms, the Publisher goes down before receive the ack and the message wasn't delivery to the queue for some reason (i.e. network failure).
Suppose we have a simple REST endpoint where we can POST new COMMENTS and, when a new COMMENT is created we want to publish a message in a queue. (Note: it doesn't matter if I send a message of a new COMMENT that at the end isn't created due to a rollback for example).
CommentEndpoint {
Channel channel;
post(String comment) {
channel.publish("comments-queue",comment) // is a persistent queue
Comment aNewComment = new Comment(comment)
repository.save(comment)
// what happens if the server where this publisher is running terminates here ?
channel.waitConfirmations()
}
}
When the server restarts the channel is gone and the message could never be delivered.
One solution that comes to my mind is that after a restart, query the recent comments (¿something like the comments created between the last 3 min before the crash?) in the repository and send one message for each one and await confirmations.
What you are worried about is really no longer RabbitMQ only issue, it is a distributed transaction issue. This discussion gives one reasonable lightweight solution. And there are more strict solutions, for instance, two-phase commit, three-phase commit, etc, to ensure data consistent when it is really necessary.
I am using ActiveMQ 5.4 with KahaDB as message store.
While Publishing Messages (with Persistence true) to a Topic, which has Durable subscriber, the persistence store is increasing even the messages are dispatched to Subscriber. So this is causing an issue as the message store is getting full and not accepting any more messages.
So my question is why the Persistence store is not discarding the messages in the KahaDB, even the messages are getting dispatched?
Regards,
Srinivas
What you are seeing is an interaction between the ActiveMQ message store behaviour and that for durable subscriptions on topics.
When you have durable subscriptions, a topic is treated like a queue for each subscriber's clientId (set on the Connection). The logic being that the client doesn't want to miss any messages when they disconnect. So if they disconnect, the durable subscription hangs around and keeps the messages alive.
The AMQ message store uses data logs for it's message journal. These are written sequentially, and never actually removed from (that would require random access). There is a second file which keeps track of which messages have been consumed. Once all the messages in a data file have been consumed, that file is deleted.
So what you're seeing is that some of the messages in the data file are not being consumed by these durable subscriptions and just hang around. ClientIds for durable subscribers not being consistently used would cause this issue. It's likely that there is something wrong with the way the feature is being used, if you use JMX to inspect the subscriptions on the broker that should help you track down the root cause.
As a general rule, whenever you think that you might want to use a durable subscription, use virtual topics instead - they are much easier to reason about, inspect and load balance. On the other hand if you just want to get the last couple of messages when you reconnect a topic subscriber rather than all the messages you may have missed, use retroactive consumers.
An easy way to get around this issue is to always use a time to live when you send a message - pretty much every use case has a time limit of when a message ought to be consumed by anyway. ActiveMQ will expire messages beyond this point, and free up the messages in the data files for deletion.
We've been using Rabbit successfully for about a year. Recently have upgraded to v2.6.1, because we want to use clusters with replicated message queues.
My testing has hit a puzzling behavior that smells like a Rabbit bug to me. The test that uncovers this is working with a two-node cluster. Both nodes are running v2.6.1. Both nodes have disk. Both nodes are running on Mac OS, though I doubt this is pertinent.
I'm also running Alice on the node that runs the test. The test uses it to programmatically do a stop_app on one of the nodes, because the test is trying to validate that if the cluster master fails, and a slave is elevated to take its place, that we don't lose messages.
So, the test has a small thread pool, which is given tasks that periodically 1) publish messages, and 2) toggle the state of the Rabbit master node (stopped if running; started if stopped). Other threads are consuming messages from queues.
I'm using publisher confirms, and I'm also acknowledging the messages in the consumers (using autoAck=false for channel.basicConsume()).
When the master node is stopped, I see both the producers and consumers catching ShutdownSignalException. They handle this by attempting to reconnect to the cluster. This works fine. When reconnected, they continue with their business.
Sometimes, what I see is that a consumer has successfully fetched a message from the broker, and is calling channel.basicAck() when it gets that ShutdownSignalException.
Later, when the consumer has reconnected, it again pulls down the same message. (The message bodies are tagged with a UUID, so I know it is the same one.) This time, when the consumer attempts to basicAck() the message, it again gets ShutdownSignalException, but this one has the following text in it: "reply-text=PRECONDITION_FAILED - unknown delivery tag 7".
In fact, that is the same delivery tag that was offered to the consumer by the broker before the master went down and the consumer reconnected.
Googling suggests that this event means that the consumer is attempting to ack the same message more than once.
But, how can this be so? If the first ack succeeded, then the message should have been removed from the broker's queues, and the consumer shouldn't see the same message again.
Yet, if the first ack did not succeed, then the consumer shouldn't be dinged for attempting to re-ack the message.
Anyone seen this before? It smells like a bug in Rabbit's replicated queues to me, but I've still new to Rabbit, and so am willing to believe there's a subtlety here in consuming from a clustered broker that I haven't yet grokked!
Thanks, --Steve
I'm not sure if my case matching yours, but I have seen similar "unknown delivery tag" on attempts to ack after reconnect and then the same message arrived again. Initially it looked like a bug to me, but in fact this is expected behavior. Consumer with QOS>1 may have in it's local buffer some messages and delivery tag will be invalid for all o them after reconnect. From another hand, attempt to ack even the current message after reconnect doesn't make any sense, because that message already nacked automatically on connection lost and this is why I got it again.