The strange behavior of `delete-after` attribute of dynamic shovel - rabbitmq

I was exploring the shovel plugin to move messages from source queues to temporary queues as part of a bigger use case. For each queue I created a dynamic shovel that moved the messages to the temporary queue and then deleted itself via the attribute "delete-after": "queue-length". The RabbitMQ Management console (Admin -> Shovel status) showed that the dynamic shovels had been deleted successfully, while the source and temporary queues remained in the running state.
The problem was that when new messages arrived in the source queues, they were still moved automatically to the temporary queues, even though the source queues had no consumers.
Note:
Both the source and the temporary queues are durable.
Messages are persistent (delivery mode: 2).
The operation was performed in parallel because there are hundreds of queues: I created a dynamic shovel for each queue and let it delete itself.
When I remove the dynamic shovel with the DELETE HTTP API instead of the approach above, everything works perfectly. However, I want to avoid the extra HTTP call because there are hundreds of source queues.

The delete-after attribute was deprecated and renamed to src-delete-after a long time ago. RabbitMQ v3.7.x supports the delete-after attribute, but support was removed in v3.8.x (up to v3.8.3) and brought back in v3.8.4:
https://github.com/rabbitmq/rabbitmq-shovel/issues/72
Thanks to Michael
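For reference, a per-queue shovel like the one in the question can be declared through the management HTTP API roughly as follows. This is only a sketch: the host, credentials, queue names and the helper names (shovel_definition, shovel_url, put_shovel) are assumptions for illustration, and it uses src-delete-after, the current name of the attribute.

```python
import base64
import json
import urllib.parse
import urllib.request


def shovel_definition(uri, src_queue, dest_queue):
    """Build the parameter body for a one-shot dynamic shovel."""
    return {
        "value": {
            "src-protocol": "amqp091",
            "src-uri": uri,
            "src-queue": src_queue,
            "dest-protocol": "amqp091",
            "dest-uri": uri,
            "dest-queue": dest_queue,
            # Move the messages present at start-up, then remove the shovel.
            "src-delete-after": "queue-length",
        }
    }


def shovel_url(api_base, vhost, name):
    """URL of the shovel parameter; the vhost "/" must be encoded as %2F."""
    quote = lambda s: urllib.parse.quote(s, safe="")
    return f"{api_base}/api/parameters/shovel/{quote(vhost)}/{quote(name)}"


def put_shovel(api_base, vhost, name, body, user, password):
    """Send the PUT request (needs a reachable management API)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        shovel_url(api_base, vhost, name),
        data=json.dumps(body).encode(),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A call such as put_shovel("http://localhost:15672", "/", "move-orders", shovel_definition("amqp://localhost", "orders", "orders.tmp"), "guest", "guest") would create one shovel; the values shown are placeholders.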

Related

How to have more than 50 000 messages in a RabbitMQ Queue

We are currently using a service bus in Azure and, for various reasons, we are switching to RabbitMQ.
Under heavy load, and when specific tasks on the backend are having problems, one of our queues can have up to 1 million messages waiting to be processed.
RabbitMQ can have a maximum of 50 000 messages per queue.
The question is: how can we design the RabbitMQ infrastructure so that it continues to work when messages temporarily accumulate?
Note: we want to host our RabbitMQ server in a Docker image inside a Kubernetes cluster.
We imagine an exchange that would load-balance messages between queues on the nodes behind it.
But what is unclear to us is how to dynamically add new queues on demand if we detect that queues are getting full.
"RabbitMQ can have a maximum of 50 000 messages per queue."
There is no such limit.
RabbitMQ can hold more messages using quorum queues or classic queues in lazy mode.
With stream queues, RabbitMQ can handle millions of messages per second.
"we imagine an exchange that would load balance messages between queues in nodes behind."
You can do that using different bindings.
"kubernetes cluster."
I would suggest using the Kubernetes Operator.
"But what is unclear to us is how to dynamically add new queues on demand if we detect that queues are getting full."
There is no concept of "full" in RabbitMQ. There are limits that you can set using max-length or TTL.
A RabbitMQ queue will never be "full" (no such limitation exists in the software). A queue's maximum length rather depends on:
Queue settings (e.g max-length/max-length-bytes)
Message expiration settings such as x-message-ttl
Underlying hardware & cluster setup (available RAM and disk space).
Unless you are using Streams (a new feature in v3.9), you should always try to keep your queues short if possible. The entire idea of a message queue (in its classical sense) is that a message should be passed along as soon as possible.
Therefore, if you find yourself with long queues you should rather try to match the load of your producers by adding more consumers.
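The queue settings listed above are passed as optional arguments when the queue is declared. A minimal sketch, assuming a pika-style channel object (anything that exposes queue_declare(queue=..., durable=..., arguments=...)); the helper names and the example values are made up for illustration.

```python
def queue_arguments(max_length=None, message_ttl_ms=None, queue_type=None):
    """Collect optional queue arguments, skipping anything left unset."""
    args = {}
    if max_length is not None:
        args["x-max-length"] = max_length        # cap on the number of messages
    if message_ttl_ms is not None:
        args["x-message-ttl"] = message_ttl_ms   # per-queue message TTL in ms
    if queue_type is not None:
        args["x-queue-type"] = queue_type        # e.g. "quorum" or "stream"
    return args


def declare_bounded_queue(channel, name, **limits):
    """Declare a durable queue with the given limits on a pika-style channel."""
    args = queue_arguments(**limits)
    channel.queue_declare(queue=name, durable=True, arguments=args)
    return args
```

For example, declare_bounded_queue(channel, "tasks", max_length=1_000_000, queue_type="quorum") would ask the broker for a quorum queue capped at one million messages.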

Handling RabbitMQ node failures in a cluster in order to continue publishing and consuming

I would like to create a cluster for high availability and put a load balancer in front of it. In our configuration, we would like to create exchanges and queues manually, so once the exchanges and queues are created, no client should have to redeclare them. I am using a direct exchange with a routing key, so it is possible to route messages into different queues on different nodes. However, I have some issues with clustering and queues.
As far as I understand from the RabbitMQ documentation, a queue is specific to the node it was created on. Moreover, there can be only one queue with a given name in a cluster, and it must be alive at the time of publish/consume operations. If the node dies, the queue on that node is gone and its messages may not be recoverable (depending on the configuration, of course). So, even if I route the same message to different queues on different nodes, I still have to figure out how to use them in order to continue consuming messages.
I wonder if it is possible to handle this failover scenario without using mirrored queues. Say I would like to switch to a new node in case of a failure and continue to consume from the same queue. The publisher just uses a routing key, so its messages can go into more than one queue, but the same is not possible for the consumers.
In short, what can I do to cope with failures in the environment described in the first paragraph? Is queue mirroring, with its performance penalty in the cluster, the best approach, or does a more practical solution exist?
Data replication (mirrored queues in RabbitMQ) is a standard approach to achieving high availability. I suggest using those. If you don't replicate your data, you will lose it.
If you are worried about performance: RabbitMQ does not scale well.
The only ways I know to improve performance are to make your nodes bigger or to create a second cluster. Adding nodes to a cluster does not really improve things. Also, if you are planning to use TLS, it will decrease throughput significantly as well. If you have a high-throughput requirement plus HA, I'd consider Apache Kafka.
If your use case allows you not to care about HA, then just re-declare queues/exchanges whenever your consumers/publishers connect to the broker, which is absolutely fine. When you declare a queue that already exists, nothing bad happens (the queue won't be purged, etc.); the same goes for exchanges.
Also, check out the RabbitMQ sharding plugin; maybe that will do for your use case.
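The re-declare-on-connect approach could look roughly like this. It is a sketch that assumes a pika-style channel; the function name and the exchange, queue and routing-key values are placeholders. Note that re-declaring is only harmless when the properties match those of the existing entity.

```python
def ensure_topology(channel, exchange, queue, routing_key):
    """Declare exchange, queue and binding; safe to call on every (re)connect."""
    # Declaring an existing exchange/queue with identical properties is a no-op.
    channel.exchange_declare(exchange=exchange, exchange_type="direct", durable=True)
    channel.queue_declare(queue=queue, durable=True)
    # Binding the same queue/exchange/key twice is also a no-op.
    channel.queue_bind(queue=queue, exchange=exchange, routing_key=routing_key)
```

Calling ensure_topology right after opening a channel, in both publishers and consumers, means neither side depends on the other having started first.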

Could someone help me to understand difference between static queue and dynamic queue?

While I was working on the Message-Queue, I encountered the terms static queue and dynamic queue.
Can anyone tell me the difference?
A static queue is one that is defined ahead of time and the queue definition persists in the environment.
A dynamic queue is created on demand. Of these there are two varieties in IBM MQ. A temporary dynamic queue is created on demand and is deleted when the program that created it disconnects. A permanent dynamic queue is one that is created on demand but persists in the environment after the program which created it disconnects.
For example, a temporary dynamic queue is useful for catching replies in a request/reply scenario. The queue exists only so long as the application making requests is connected. When the program disconnects, the queue goes away so there is no need for the administrator to manually clean it up.
A permanent dynamic queue is useful for things like durable subscriptions. When a subscription is created, the queue needs to be unique and the overhead of having to define it ahead of time is excessive. So we let the application create it dynamically but also let the queue hang around when the program is offline in order to collect publications. Normally, the application deletes the queue when it is no longer needed so that the administrator doesn't need to.

RabbitMQ change queue parameters on a production system

I'm using RabbitMQ as a message queue in a service-oriented architecture, where many separate web services publish messages bound for RabbitMQ queues. Those queues are in turn subscribed to by various consumers, which perform background work; a pretty vanilla use-case for RabbitMQ.
Now I'd like to change some of the queue parameters (specifically, I'd like to bind queues to a new dead-letter exchange with a certain routing key). My problem is that making this change in place on a production system is problematic for a couple reasons.
What's the best way for me to transition to these new queues without losing messages in a production system?
I've considered everything from versioning queue names to making a new vhost with the new settings to doing all the changes in place.
Here are some of the problems I'm facing:
Because RabbitMQ queue declarations are idempotent, the disparate web services have been declaring the queues before publishing to them (in case they don't already exist). Once you change the queue parameters (but maintain the same routing key), the queue declare fails and RabbitMQ closes the channel.
I'd like to not lose messages when changing a queue (here I'm planning on subscribing an exclusive consumer that saves the messages and then republishes to the new queue).
General coordination between disparate publishers and the consumer base (or, even better, a way to avoid needing to coordinate them).
Queue bindings can be added and removed at runtime without any impact on clients, unless the clients manage bindings themselves. So if your question is only about bindings, just change them via the CLI or the web management panel and skip what is written below.
Making backward-incompatible changes is a common problem, especially in a heterogeneous environment where multiple applications attempt to declare the same entity in their own way (with their own settings). There is no easy way to change a queue declaration in multiple applications at the same time; it depends heavily on how the whole working process is organized, how critical your apps are, what your infrastructure is, and so on.
Fast and dirty way:
As long as the publishers don't deal with queue declarations and bindings (at least they should not), you can focus on consumers. Wrapping the queue declaration in a try-except block may be the fast and dirty choice. Also, most projects, even large ones, can survive a short downtime, so you can block the RabbitMQ user in one shell, alter the queue as you wish (create a new one and make your consumers use it instead of the old one), then unblock the user and let the consumers work as before (your workers are under supervisor or monit, right?). Then manually migrate the messages from the old queue to the new one.
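The try-except wrapping mentioned above might be sketched like this. It assumes pika-style connection and channel objects; declare_or_reuse is a made-up helper, and declare_errors stands for whatever exception your client raises on a mismatched declare (with pika that would typically be pika.exceptions.ChannelClosedByBroker, stated here as an assumption).

```python
def declare_or_reuse(connection, name, arguments, declare_errors=(Exception,)):
    """Try to declare the queue with new arguments; fall back to the existing one."""
    channel = connection.channel()
    try:
        channel.queue_declare(queue=name, durable=True, arguments=arguments)
    except declare_errors:
        # The broker closes the channel when the arguments do not match the
        # existing queue, so a fresh channel is needed before continuing.
        channel = connection.channel()
        # passive=True only checks that the queue exists; it changes nothing.
        channel.queue_declare(queue=name, passive=True)
    return channel
```

This keeps old and new code running side by side during the transition, at the price of silently using the old queue settings until the queue is actually replaced.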
Fast and safe solution:
It is a bit tricky and relies on a hack for migrating messages from one queue to another inside a single vhost. The whole solution works inside a single vhost but requires an extra queue for every queue you want to modify. Set up a Dead Letter Exchange on the source queue and point it to route expired messages to your new target queue. Then apply a Per-Queue Message TTL to the source queue with x-message-ttl=0 (its minimal value; see the "No Queueing at all" note about immediate delivery). Both actions can be done via the CLI or the management panel and can be applied to an already declared queue. This way your publishers can publish messages as usual and even old consumers can keep working as expected at first, while in parallel new consumers consume from the new queue, which can be pre-declared with the new arguments manually or in some other way.
Note that on queues with a large number of messages and a heavy message flow there is some risk of hitting flow control limits, especially if your server already uses almost all of its resources.
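On an already declared queue, the dead-letter and TTL settings are applied as a policy rather than as queue arguments. Below is a sketch of the policy body and its management API URL; the helper names, the exchange name and the queue names are assumptions for illustration, and the body would be sent with an HTTP PUT.

```python
import re
import urllib.parse


def drain_policy(source_queue, dead_letter_exchange, routing_key):
    """Policy body that expires every message at once and dead-letters it."""
    return {
        # Match exactly one queue by name.
        "pattern": "^" + re.escape(source_queue) + "$",
        "apply-to": "queues",
        "definition": {
            "message-ttl": 0,
            "dead-letter-exchange": dead_letter_exchange,
            "dead-letter-routing-key": routing_key,
        },
    }


def policy_url(api_base, vhost, policy_name):
    """URL of the policy; the vhost "/" must be encoded as %2F."""
    quote = lambda s: urllib.parse.quote(s, safe="")
    return f"{api_base}/api/policies/{quote(vhost)}/{quote(policy_name)}"
```

Here dead_letter_exchange is expected to be an exchange that routes routing_key to the new target queue; removing the policy afterwards restores the source queue's normal behavior.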
Much more complicated but safer approach (for cases when the whole message workflow logic has changed):
Make all necessary changes to the applications and run the new codebase in parallel with the existing one, but on a different RabbitMQ vhost (or even a separate server, depending on your application load and hardware). It may actually be possible to run on the same vhost with different exchange and queue names, but that doesn't sound good and smells even in written form. After you set up the new apps, swap them in for the old ones and migrate the messages from the old queues to the new ones (or just let the old system empty the queues). This guarantees a seamless migration with minimal downtime. If your deployment is automated, the whole process will not take much effort.
P.S.: in any of the cases above, if you can, let the old consumers empty the queues so you don't need to migrate messages manually.
Update:
You may find the Shovel plugin very useful, especially Dynamic Shovels, for moving messages between exchanges and queues, even across different vhosts and servers. It's the fastest and safest way to migrate messages between queues/exchanges.

Temporary queue made in Celery

I am using Celery with RabbitMQ. Lately, I have noticed that a large number of temporary queues are getting made.
So, I experimented and found that when a task fails (that is, a task raises an Exception), a temporary queue with a random name (like c76861943b0a4f3aaa6a99a6db06952c) is created and remains.
Some properties of the temporary queue as found in rabbitmqadmin are as follows -
auto_delete : True
consumers : 0
durable : False
messages : 1
messages_ready : 1
One such temporary queue is created every time a task fails (that is, raises an Exception). How can I avoid this? In my production environment a large number of such queues are created.
It sounds like you're using the amqp as the results backend. From the docs here are the pitfalls of using that particular setup:
Every new task creates a new queue on the server, with thousands of
tasks the broker may be overloaded with queues and this will affect
performance in negative ways. If you’re using RabbitMQ then each
queue will be a separate Erlang process, so if you’re planning to
keep many results simultaneously you may have to increase the Erlang
process limit, and the maximum number of file descriptors your OS
allows
Old results will not be cleaned automatically, so you must make
sure to consume the results or else the number of queues will
eventually go out of control. If you’re running RabbitMQ 2.1.1 or
higher you can take advantage of the x-expires argument to queues,
which will expire queues after a certain time limit after they are
unused. The queue expiry can be set (in seconds) by the
CELERY_AMQP_TASK_RESULT_EXPIRES setting (not enabled by default).
From what I've read in the changelog, this is no longer the default backend in versions >=2.3.0 because users were getting bitten by this behavior. I'd suggest changing the results backend if this is not the functionality you need.
Well, Philip is right there. The following is a description of how I solved it. It is a configuration in celeryconfig.py.
I am still using CELERY_BACKEND = "amqp" as Philip had said. But in addition to that, I am now using CELERY_IGNORE_RESULT = True. This configuration will ensure that the extra queues are not formed for every task.
I was already using this configuration, but an extra queue was still created whenever a task failed. Then I noticed that I had another setting that needed to be removed: CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True. With this setting, results were not stored for tasks in general but were still stored for errors (failed tasks), hence one extra queue for each failed task.
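Put together, the settings described in this answer would look roughly like this in celeryconfig.py. The names are the old-style ones used in this thread; setting CELERY_STORE_ERRORS_EVEN_IF_IGNORED explicitly to False is equivalent to removing it, assuming False is the default in your Celery version.

```python
# celeryconfig.py (old-style setting names, as used in this thread)

CELERY_BACKEND = "amqp"        # results backend, unchanged from before
CELERY_IGNORE_RESULT = True    # do not store task results at all

# Must not be True, otherwise failed tasks still get a result queue each.
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = False
```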
The CELERY_TASK_RESULT_EXPIRES dictates the time to live of the temp queues. The default is 1 day. You can modify this value.
The reason this is happening is that the Celery workers' remote control is enabled (it is enabled by default).
You can disable it by setting CELERY_ENABLE_REMOTE_CONTROL to False.
However, note that you will lose the ability to do things like add_consumer, cancel_consumer etc using the celery command
The amqp backend creates a new queue for each task. If you want to avoid that, you can use the rpc backend, which keeps results in a single queue.
In your config, set
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
You can read more about this on celery docs.