I know about the size limit for the message (4MB), but is there a limit to the queue size? Best practice limit maybe?
"Best practice limit maybe?"
One good test is to see how long it takes for a system to restart and become operational again.
So you may want to have {{picks number out of air}} 100Gbytes of messages, for example, as a maximum limit but 100Gbytes could take {{picks another number out of air}} 30 minutes to reload. That may be way outside your SLA.
So:
Decide what your down-time SLA for the system is
Work out how many messages in storage would be needed to break SLA on restart
Work out how many messages the system can actually store
Choose the lower of the two.
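As a rough worked example of that calculation (all numbers are placeholder assumptions, not recommendations):

```java
public class QueueSizeBudget {
    public static void main(String[] args) {
        // All figures below are illustrative assumptions, not measurements
        double restartSlaSeconds   = 5 * 60;      // allowed downtime on restart: 5 minutes
        double reloadRateMsgPerSec = 5_000;       // messages the broker can reload per second
        double storageCapacityMsgs = 50_000_000;  // messages the system can physically store

        // Messages that could be reloaded within the SLA window
        double slaLimitMsgs = restartSlaSeconds * reloadRateMsgPerSec;

        // Practical maximum queue depth: the lower of the two limits
        double maxQueueDepth = Math.min(slaLimitMsgs, storageCapacityMsgs);
        System.out.printf("Max safe queue depth: %.0f messages%n", maxQueueDepth);
    }
}
```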
Yes, of course there is a limit.
You can find details here.
However, if your consumers consume messages at the same rate as producers produce them, then you don't have to worry about the memory limit, etc.
In my experience, the queue capacity depends on the machine and is not that large compared to ActiveMQ or RabbitMQ.
On our RabbitMQ installed in production, we have a performance issue.
To explain the context, we have an initialization batch that creates around ~60k messages. For business reasons, those messages must be treated in strict order and we can't lose any. As such, we have only one queue which is durable and lazy and one consumer (SpringBoot AMQP) with a prefetch of 10. Both are on the same virtual machine.
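For context, the consumer setup looks roughly like this simplified Spring AMQP sketch (the queue name and bean wiring are illustrative, not our actual code):

```java
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.amqp.rabbit.config.SimpleRabbitListenerContainerFactory;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class QueueConfig {

    // Durable, lazy queue holding the ~60k batch messages (name is illustrative)
    @Bean
    public Queue initQueue() {
        return QueueBuilder.durable("init-batch-queue")
                .withArgument("x-queue-mode", "lazy") // page messages to disk
                .build();
    }

    // Single consumer with a prefetch of 10; one consumer preserves strict ordering
    @Bean
    public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(ConnectionFactory cf) {
        SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
        factory.setConnectionFactory(cf);
        factory.setPrefetchCount(10);
        factory.setConcurrentConsumers(1);
        return factory;
    }
}
```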
At first, the processing is fast enough, around 5 to 10 messages per second. But it progressively slows down until it reaches a cap of fewer than 20 messages per hour. It takes approximately 1 hour to reach this point.
After some investigations, we found out that the problem comes from RabbitMQ. When we simply stop and restart it, the performance goes back to normal and then drops slowly again. Doing the same on just the consumer doesn't change anything.
I'm thinking about some resource bottleneck, but I can't manage to find which one, as RAM, CPU, and disk all look fine. I am not really familiar with the Erlang virtual machine or with managing RabbitMQ itself, so I may have missed something.
Does someone have an idea of the source of the problem, or where I could look for more information on what is happening?
RabbitMQ characteristics:
Erlang 23.3.2
RabbitMQ 3.8.14
I have RabbitMQ running on a server and there's some script which inserts data into it. I know the approximate frequency at which the data is inserted, but it's not only approximate; it can also vary quite a lot.
How can I know how often another script has to take the data out of RabbitMQ?
What will happen if the 2nd script takes the data out of RabbitMQ slower than needed?
How can I measure whether or not the frequency is good enough?
How can I know how often another script has to take the data out of RabbitMQ?
You should consume messages from the queue at a rate greater than or equal to the rate they are published. RabbitMQ reports publish rates; however, you will want to get a reasonable estimate from load testing your application.
What will happen if the 2nd script takes the data out of RabbitMQ slower than needed?
In the short term, the number of messages in the queue will increase, as will processing time (think about what happens when more people get in line for Space Mountain at Disney). In the long term, the system will be unstable because the queue will increase without bound, eventually resulting in a failure of the queue, as well as other practical consequences (think of this as the case where Space Mountain is broken down, but people are still allowed to enter the queue line).
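To put rough numbers on it (the rates below are purely made-up for illustration):

```java
public class BacklogGrowth {
    public static void main(String[] args) {
        // Made-up rates, purely for illustration
        double publishPerSec = 120;  // producers
        double consumePerSec = 100;  // consumers
        double hours = 8;

        // The backlog grows by the difference, without bound, for as long as the imbalance lasts
        double backlog = (publishPerSec - consumePerSec) * hours * 3600;
        System.out.printf("Backlog after %.0f hours: %.0f messages%n", hours, backlog);
    }
}
```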
How can I measure whether or not the frequency is good enough?
From an information only perspective, you can monitor the queue yourself using the RabbitMQ management plugin. If you need automated processes to spawn up additional workers, you'll have to integrate those processes into the RabbitMQ management API. How to do this is the subject of a number of how-to articles.
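For example, a minimal sketch of polling the management plugin's HTTP API for a queue's depth might look like this (host, credentials, and queue name are placeholder assumptions; the management API listens on port 15672 by default):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class QueueDepthCheck {
    public static void main(String[] args) throws Exception {
        // Host, credentials and queue name are assumptions for illustration
        String url = "http://localhost:15672/api/queues/%2F/my-queue"; // %2F = default vhost "/"
        String auth = Base64.getEncoder().encodeToString("guest:guest".getBytes());

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON body contains "messages" (queue depth) and "message_stats"
        // with publish/deliver rates; parse it with the JSON library of your choice.
        System.out.println(response.body());
    }
}
```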
I want to know how many consumers/producers should be on a single connection.
At what number should I distribute my consumers over separate connections?
I have multiple DMLCs with concurrent consumers, but they are using a single connection.
I have around 100 consumers using the same connection. Should I distribute them?
I haven't faced any problems, but I'm just asking whether this is a good thing.
What is costlier for ActiveMQ to handle - two connections with 50 consumers each, or 100 consumers on a single connection?
I have asked this on the ActiveMQ forum as well, but nobody has replied - link.
Thanks,
Abhi
I think 100 consumers using a single connection is better than multiple connections for the same number of consumers, as connections are costlier than consumers. But it is difficult to say what the maximum number of concurrent consumers per connection is.
Also, please make sure that you have a very low prefetch size set for these concurrent consumers; otherwise the total prefetch buffer size increases.
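As a rough sketch of that setup with Spring's DefaultMessageListenerContainer on an ActiveMQ connection factory (the queue name and prefetch value are just illustrative):

```java
import javax.jms.MessageListener;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.ActiveMQPrefetchPolicy;
import org.springframework.jms.listener.DefaultMessageListenerContainer;

public class SharedConnectionListener {

    public static DefaultMessageListenerContainer container(String brokerUrl) {
        // Low prefetch so many concurrent consumers don't each buffer a large backlog
        ActiveMQPrefetchPolicy prefetch = new ActiveMQPrefetchPolicy();
        prefetch.setQueuePrefetch(1);

        ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory(brokerUrl);
        cf.setPrefetchPolicy(prefetch);

        // A DMLC typically reuses one shared connection across its concurrent consumers,
        // so 100 concurrent consumers here still means a single broker connection.
        DefaultMessageListenerContainer dmlc = new DefaultMessageListenerContainer();
        dmlc.setConnectionFactory(cf);
        dmlc.setDestinationName("my.queue"); // illustrative queue name
        dmlc.setConcurrentConsumers(100);
        dmlc.setMessageListener((MessageListener) message -> {
            // handle the message here
        });
        return dmlc;
    }
}
```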
There is likely no hard and fast rule; it will depend on so many things.
I suggest you try it both ways and use whichever is best for you. But bear in mind, the results might be different in your test vs. production environment.
I haven't faced any problems...
This is often called "premature optimization".
If I declare a queue with x-max-length, all messages will be dropped or dead-lettered once the limit is reached.
I'm wondering if, instead of dropping or dead-lettering messages, RabbitMQ could activate the Flow Control mechanism, like the Memory/Disk watermarks do. The reason is that I want to preserve message order (FIFO behaviour when submitting), and it would be much more convenient to slow down the producers.
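For reference, the queue I'm describing is declared roughly like this (a simplified Spring AMQP sketch; the name, limit, and dead-letter exchange are placeholders):

```java
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;

public class BoundedQueueConfig {

    // Queue capped with x-max-length; overflowing messages are dropped or,
    // if a dead-letter exchange is configured, dead-lettered.
    public static Queue boundedQueue() {
        return QueueBuilder.durable("bounded-queue")               // illustrative name
                .withArgument("x-max-length", 100_000)             // illustrative cap
                .withArgument("x-dead-letter-exchange", "my-dlx")  // illustrative DLX
                .build();
    }
}
```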
Try to implement the queue length limit at the application level. Say, increment/decrement a Redis key and check its value against a maximum. It might not be as accurate as a native RabbitMQ mechanism, but it works pretty well on a separate queue/exchange without affecting other ones on the same broker.
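A rough sketch of that idea with a Jedis client (the key name and limit are illustrative assumptions):

```java
import redis.clients.jedis.Jedis;

public class QueueLengthGuard {
    private static final String COUNTER_KEY = "queue:length"; // illustrative key
    private static final long MAX_LENGTH = 100_000;           // illustrative cap

    private final Jedis redis = new Jedis("localhost", 6379);

    // Call before publishing; returns false when the producer should back off.
    public boolean tryReserveSlot() {
        long length = redis.incr(COUNTER_KEY);
        if (length > MAX_LENGTH) {
            redis.decr(COUNTER_KEY); // roll back the reservation
            return false;
        }
        return true;
    }

    // Call from the consumer after a message has been processed and acked.
    public void releaseSlot() {
        redis.decr(COUNTER_KEY);
    }
}
```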
P.S. Alternatively, for some tasks RabbitMQ is not the best choice and old-school relational databases (MySQL, PostgreSQL, or whatever you like) work best, while RabbitMQ can still be used as an event bus.
There are two open issues related to this topic on the rabbitmq-server GitHub repo. I recommend expressing your interest there:
Block publishers when queue length limit is reached
Nack messages that cannot be deposited to all queues due to max length reached
Currently we are getting tons of new messages and our workers can't handle them as fast as they are coming in. The message queue index gets bigger and bigger until the vm_memory_high_watermark is reached and it stops accepting connections.
So what we could do is increase the memory, but this is only scalable up to a certain point. Instead, I would like to add more servers and distribute the message queue index over several RabbitMQ nodes, and if we need more memory we just add more servers.
How would I set this up and is this possible or are there any other ways to solve this problem?
Yes, you can use distributed RabbitMQ brokers; choose Federation or the Shovel plugin.
You can store messages on disk if that is an option for you, drop the oldest ones (with per-message or per-queue TTL), or set the max queue length.
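For example, a queue combining those options could be declared roughly like this with Spring AMQP (all names and values are illustrative):

```java
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;

public class OverflowControlConfig {

    // A queue that pages messages to disk, expires old messages, and caps its length.
    public static Queue controlledQueue() {
        return QueueBuilder.durable("controlled-queue")       // illustrative name
                .withArgument("x-queue-mode", "lazy")         // keep messages on disk
                .withArgument("x-message-ttl", 86_400_000)    // per-queue TTL: 24 h in ms
                .withArgument("x-max-length", 1_000_000)      // illustrative cap
                .build();
    }
}
```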