If I declare a queue with x-max-length, messages will be dropped or dead-lettered once the limit is reached.
I'm wondering whether, instead of dropping or dead-lettering, RabbitMQ could activate the flow control mechanism, as it does for the memory/disk watermarks. The reason is that I want to preserve message order (FIFO behaviour as submitted), and slowing down the producers would be much more convenient.
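For reference, a minimal pika sketch of the setup described above (connection details, queue name, and limit are placeholders); with only x-max-length set, RabbitMQ drops or dead-letters messages once the cap is hit instead of throttling the publisher:

```python
import pika

# Placeholder connection and queue name, for illustration only.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Cap the queue at 10,000 messages; once the limit is reached, RabbitMQ
# drops (or dead-letters, if a DLX is configured) messages rather than
# applying flow control to the publisher.
channel.queue_declare(
    queue="orders",
    durable=True,
    arguments={"x-max-length": 10000},
)
```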
Try implementing the queue length limit at the application level. Say, increment/decrement a Redis key and check it against a maximum value (see the sketch below). It may not be as accurate as the native RabbitMQ mechanism, but it works pretty well for a single queue/exchange without affecting the other ones on the same broker.
P.S. Alternatively, for some tasks RabbitMQ is not the best choice and old-school relational databases (MySQL, PostgreSQL, or whatever you like) work best, while RabbitMQ can still be used as an event bus.
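A minimal sketch of that Redis-counter idea (the key name, limit, and queue name are made up for illustration): increment the key before publishing, decrement it after a message has been handled, and have the publisher back off while the counter sits at the limit.

```python
import time

import redis

r = redis.Redis()

DEPTH_KEY = "myqueue:depth"  # hypothetical application-level counter
MAX_DEPTH = 10_000           # hypothetical queue length limit


def publish(channel, body):
    # Back off while the application-level counter says the queue is full.
    while int(r.get(DEPTH_KEY) or 0) >= MAX_DEPTH:
        time.sleep(0.1)
    r.incr(DEPTH_KEY)
    channel.basic_publish(exchange="", routing_key="myqueue", body=body)


def on_message(channel, method, properties, body):
    # ... process the message, then release one slot and ack ...
    r.decr(DEPTH_KEY)
    channel.basic_ack(delivery_tag=method.delivery_tag)
```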
There are two open issues related to this topic on the rabbitmq-server GitHub repo. I recommend expressing your interest there:
Block publishers when queue length limit is reached
Nack messages that cannot be deposited to all queues due to max length reached
Am I missing something, or is there no way to generate backpressure with Redis streams? If a producer is pushing data to a stream faster than consumers can consume it, there's no obvious way to signal to the producer that it should stop or slow down.
I expected that there would be a blocking version of XADD, that would block the client until room became available in a capped stream (similar to the blocking version of XREAD that allows consumers to wait until data becomes available), but this doesn't seem to be the case.
How do people deal with the above scenario — signaling to a producer that it should hold off on adding more items to a stream?
I understand that some data stream systems such as Kafka do not require backpressure, but Redis doesn't appear to have a comparable solution, and it seems like this would be a relatively common problem for many Redis streams use cases.
If you have persistence (either RDB or AOF) turned on, your stream messages will be persisted, hence there's no need for backpressure.
And if you use replicas, you have another level of redundancy.
Backpressure is needed only when Redis does not have enough memory (or enough network bandwidth to the replicas) to hold the messages.
And, honestly, I have never seen this scenario.
Why would you want to? Unless you run out of memory it is not an issue, and each consumer, slow or fast, can read at its leisure.
Note that we are not using consumer groups, just publishing via XADD; readers read via XRANGE from a position stored in a key, which is closer to Kafka. We use one stream per partition.
The producer can check every 1K messages whether the stream is getting too big (via XLEN) and slow down if that is an issue, and you can also throw hardware at it: 5 nodes with 20 GB each is pretty easy, with the streams spread across the cluster. I don't see why this should be hard, so I'm probably missing something.
There is also an XADD variant that trims the stream so you can't overfill it even without the check above, but hitting that cap would require some pretty extreme volume. For us the retention is 2 days' worth for the frequent streams, which send the latest state, and 9 months for the others. (A sketch of both approaches follows below.)
Another thing: don't store large messages in the stream; use a blob or a separate key/value store.
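A rough redis-py sketch of the producer-side check described above (stream name, limits, and check interval are arbitrary): poll XLEN every 1K messages and back off above a soft limit, and let XADD's MAXLEN trimming act as the hard cap.

```python
import time
import redis

r = redis.Redis()

STREAM = "events:partition-0"   # hypothetical stream name
SOFT_LIMIT = 1_000_000          # back off while the stream is above this
HARD_CAP = 2_000_000            # XADD trims (approximately) to this length
CHECK_EVERY = 1_000             # how often to poll XLEN

sent = 0


def produce(fields: dict) -> None:
    global sent
    # Application-level backpressure: every CHECK_EVERY messages, poll the
    # stream length and wait while it is above the soft limit.
    if sent % CHECK_EVERY == 0:
        while r.xlen(STREAM) > SOFT_LIMIT:
            time.sleep(0.5)
    # MAXLEN with approximate trimming ("~") keeps the stream from growing
    # without bound even if the soft limit is ignored.
    r.xadd(STREAM, fields, maxlen=HARD_CAP, approximate=True)
    sent += 1
```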
I am new to ActiveMQ, but sometimes the queues are not being processed and keep piling up. Is it good practice to purge? Isn't there any other solution that might save me from keeping all my messages for reprocessing, apart from purging? I really don't want to lose the queues. Is this possible?
The correct way to deal with this is to set an expiration on messages so that after a given time the broker can discard them. Letting messages pile up in queues without regard to their lifetime will lead you into all sorts of problems, most notably storage.
You need to develop a strategy for how long the messages should live so that the broker can start getting rid of them once they are no longer of use. If you don't do that, then purging the queue is your only option.
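As a sketch of what per-message expiration can look like from a client, assuming STOMP via stomp.py (host, credentials, and destination are placeholders): ActiveMQ's STOMP adapter takes an expires header holding an absolute epoch-milliseconds timestamp, which plays the same role as setting a time-to-live on a JMS producer.

```python
import time
import stomp

# Placeholder broker address and credentials; 61613 is the default STOMP port.
conn = stomp.Connection([("localhost", 61613)])
conn.connect("admin", "admin", wait=True)

ttl_ms = 24 * 60 * 60 * 1000  # keep messages for at most one day
conn.send(
    destination="/queue/orders",
    body="payload",
    # "expires" is an absolute timestamp in epoch milliseconds; the broker
    # can discard the message once this time has passed.
    headers={"expires": str(int(time.time() * 1000) + ttl_ms)},
)
conn.disconnect()
```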
I would like to create a cluster for high availability and put a load balancer in front of it. In our configuration we would like to create exchanges and queues manually, so once exchanges and queues are created, no client should make a call to redeclare them. I am using a direct exchange with a routing key, so it is possible to route the messages into different queues on different nodes. However, I have some issues with clustering and queues.
As far as I have read in the RabbitMQ documentation, a queue is specific to the node it was created on. Moreover, we can have only one queue with a given name in a cluster, and it has to be alive at the time of publish/consume operations. If the node dies, the queue on that node is gone and messages may not be recoverable (depending on the configuration, of course). So, even if I route the same message to different queues on different nodes, I still have to figure out how to use them in order to continue consuming messages.
I wonder if it is possible to handle this failover scenario without using mirrored queues. Say I would like to switch to a new node in case of a failure and continue to consume from the same queue. Because the publisher is just using a routing key and these messages can go into more than one queue, the same is not possible for the consumers.
In short, what can I do to cope with failures in the environment described in the first paragraph? Is queue mirroring the best approach, despite the performance penalty in the cluster, or does a more practical solution exist?
Data replication (mirrored queues in RabbitMQ) is the standard approach to achieve high availability. I suggest using them. If you don't replicate your data, you will lose it.
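Classic mirrored queues are driven by a policy rather than per-queue arguments; as a sketch, one way to set such a policy is through the management HTTP API (host, credentials, and the default vhost here are assumptions):

```python
import requests

# Mirror every queue to all nodes via an "ha-all" policy. The host,
# credentials, and vhost ("%2F" is the default "/") are placeholders.
resp = requests.put(
    "http://localhost:15672/api/policies/%2F/ha-all",
    auth=("guest", "guest"),
    json={
        "pattern": ".*",
        "definition": {"ha-mode": "all"},
        "apply-to": "queues",
    },
)
resp.raise_for_status()
```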
If you are worried about performance: RabbitMQ does not scale well.
The only way I know to improve performance is to make your nodes bigger or to create a second cluster; adding nodes to a cluster does not really improve things. Also, if you are planning to use TLS, it will decrease throughput significantly as well. If you have a high-throughput requirement plus HA, I'd consider Apache Kafka.
If your use case allows you not to care about HA, then just re-declare queues/exchanges whenever your consumers/publishers connect to the broker; that is absolutely fine. When you declare a queue that already exists, nothing bad happens (the queue won't be purged, etc.), and the same goes for exchanges. A sketch of this follows below.
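A minimal pika sketch of that redeclare-on-connect approach (exchange, queue, and routing-key names are placeholders); declaring an existing durable exchange or queue with the same arguments is effectively a no-op:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Safe to run on every connect: if these already exist with the same
# arguments, the declarations succeed without purging or changing anything.
channel.exchange_declare(exchange="events", exchange_type="direct", durable=True)
channel.queue_declare(queue="events.audit", durable=True)
channel.queue_bind(queue="events.audit", exchange="events", routing_key="audit")
```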
Also, check out the RabbitMQ sharding plugin; maybe that will do for your use case.
When my system's data changes, I publish every single change to at least 4 different consumers (around 3,000 messages a second), so I want to use a message broker.
Most of the consumers are responsible for updating their database tables with the change.
(The DBs are different: Couch, MySQL, etc. Therefore, solutions such as using their own replication mechanisms or DB triggers are not possible.)
Questions:
1. Does anyone have experience with data replication between DBs using a message broker? Is it a good practice?
2. What do I do in case of failures?
Let's say, using RabbitMQ, the client removed 10,000 messages from the queue, acked them, and threw an exception each time before handling them. Now they are lost. Is there a way to go back in the queue?
(Re-queueing them would mess up their order.)
Is using RabbitMQ a good practice here? Isn't the ability to go back in the queue, as in Kafka, important for failure scenarios?
Thanks.
I don't have experience with DB replication using message brokers, but maybe this can help put you on the right track:
2. What do I do in case of failures?
Let's say, using RabbitMQ, the client removed 10,000 messages from the queue, acked them, and threw an exception each time before handling them. Now they are lost. Is there a way to go back in the queue?
You can use dead-lettering to avoid losing messages. I'd suggest not acking until you are sure the consumer has processed a message successfully, unless it is a long-running task. In case of failure, use basic.reject instead of basic.ack to send the message to a dead-letter queue (see the sketch below). You have medium throughput, so be careful with that.
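A rough pika sketch of that dead-lettering pattern (the exchange/queue names and the handle() function are hypothetical): the work queue dead-letters rejected messages, and the consumer acks only after successful processing.

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Dead-letter exchange and queue for messages that fail processing.
channel.exchange_declare(exchange="dlx", exchange_type="direct", durable=True)
channel.queue_declare(queue="work.dead", durable=True)
channel.queue_bind(queue="work.dead", exchange="dlx", routing_key="work")

# Work queue: messages rejected with requeue=False are routed to the DLX.
channel.queue_declare(
    queue="work",
    durable=True,
    arguments={"x-dead-letter-exchange": "dlx", "x-dead-letter-routing-key": "work"},
)


def on_message(ch, method, properties, body):
    try:
        handle(body)  # hypothetical processing function
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        # Don't requeue: send the message to the dead-letter queue so it can
        # be inspected and re-processed later.
        ch.basic_reject(delivery_tag=method.delivery_tag, requeue=False)


channel.basic_consume(queue="work", on_message_callback=on_message)
channel.start_consuming()
```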
However, the order is not guaranteed. You'll need to implement a manual mechanism to recover them in the order they were published, maybe by using message headers with some sort of timestamp or ID, so you can re-process them in the correct order.
Currently we are getting tons of new messages, and our workers can't handle them as fast as they come in. The message queue index gets bigger and bigger until the vm_memory_high_watermark is reached and the broker stops accepting connections.
We could increase the memory, but that only scales up to a point. Instead, I would like to add more servers and distribute the message queue index over several RabbitMQ nodes; if we need more memory, we just add more servers.
How would I set this up? Is this possible, or are there other ways to solve this problem?
Yes, you can use distributed RabbitMQ brokers; choose either Federation or the Shovel plugin.
You can also store messages on disk if that is an option for you, drop the oldest ones (with per-message or per-queue TTL), or set a max queue length.
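A small pika sketch of those per-queue options (queue name and values are arbitrary), using a classic queue in lazy mode as one way to keep messages on disk instead of in memory:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.queue_declare(
    queue="ingest",  # placeholder queue name
    durable=True,
    arguments={
        "x-queue-mode": "lazy",        # page messages to disk instead of RAM
        "x-message-ttl": 3600 * 1000,  # drop messages older than 1 hour (ms)
        "x-max-length": 1_000_000,     # cap the queue; oldest messages go first
    },
)
```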