All,
I have a performance problem with RabbitMQ when consuming a large number of messages, e.g. 280,000: the consumption rate seems to go up and down. The graph taken from the management console demonstrates this, with a consumer averaging around 40 messages per second but then jumping up to around 120 messages per second:
The pattern then repeats itself: back down to around 40, up to 120 again, and so on.
Also, if I run the same test an hour later, the same up-and-down effect occurs, but the range can vary widely, e.g. from 140 to 400 messages per second.
Note: The consumer does nothing with the messages
Note: Single consumer and ConsumerMessagePrefetchCount = 500
In relation to performance I have the following questions:
Is this up-and-down behaviour normal and expected, or should the consumption rate be steady?
Are the numbers I am quoting expected, or should they be much better/worse?
Any help appreciated
Billy
This behavior is quite normal; queues are designed to stay as close to zero messages as possible. 280,000 is a high number: it means the producer is faster than the consumer(s), so you have to increase the number of consumers.
If it is a load spike, 280,000 might not be a high number, because you have time to consume the messages afterwards.
There are many techniques to increase performance, for example (see the sketch after this list):
Increase the number of consumer threads (how many threads do you use to consume the messages?)
Consume messages with noAck
Tune the prefetch count; a very high value is not necessarily the right solution.
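As a rough sketch of the first and third points, here is what multiple consumers with an explicit prefetch could look like with the plain RabbitMQ Java client; the queue name, consumer count, and prefetch value are assumptions for illustration, not taken from the question:

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.DefaultConsumer;
    import com.rabbitmq.client.Envelope;
    import java.io.IOException;

    public class MultiThreadedConsumer {
        private static final String QUEUE = "task_queue"; // assumed queue name
        private static final int CONSUMERS = 4;           // assumed number of consumers
        private static final int PREFETCH = 500;          // as in the question above

        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");
            Connection connection = factory.newConnection();

            for (int i = 0; i < CONSUMERS; i++) {
                final Channel channel = connection.createChannel(); // one channel per consumer
                channel.basicQos(PREFETCH); // per-consumer prefetch (only applies with manual acks)
                boolean autoAck = false;    // true would be the "noAck" option: no acks, no prefetch limit
                channel.basicConsume(QUEUE, autoAck, new DefaultConsumer(channel) {
                    @Override
                    public void handleDelivery(String consumerTag, Envelope envelope,
                                               AMQP.BasicProperties properties, byte[] body)
                            throws IOException {
                        // ...process the message...
                        channel.basicAck(envelope.getDeliveryTag(), false);
                    }
                });
            }
        }
    }

The client library dispatches deliveries for each channel on its own worker thread, so one channel per consumer gives parallel consumption.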
The consumers should be steady, but so should the producers; in a load-spike situation you need either more time or more resources.
A few questions:
What rate do you have?
Do you consume the messages from the same queue?
Do you need the ACK?
Why do you have 280,000 messages in the queue? Is it just a test or a real situation?
I hope this is useful.
As Alexis Richardson (RabbitMQ) said:
The easiest way to increase performance is to change what you are measuring ;-)
Related
We have a setup where we have many consumers on a queue.
The problem is that it seems like only a subset of those consumers are actually doing work.
Example
One queue has 120 consumers and about 1,000 messages.
It only seems to process 20 messages at a time, though.
Any ideas?
It sounds like you're running into the prefetch count limit. I believe the default is 20.
From https://rabbitmq.docs.pivotal.io/36/rabbit-web-docs/consumer-prefetch.html
channel.basicQos(10, false); // Per consumer limit
channel.basicQos(15, true); // Per channel limit
Just remember that there are design complexities to working with large numbers of concurrent operations. (It can be done, but be careful that you're maintaining data integrity.)
We have a design challenge where the situation is as follows:
There are multiple producers and multiple consumers (on the same queue).
Each message represents a task, with parameters, that a consumer needs to handle.
The problem is that certain tasks take a lot of memory (and CPU power) which we know a given consumer has no capacity to handle. The good thing is that we know in advance approximately how much memory (and CPU power) a task will take, so we could prevent a consumer from taking that task and give another consumer with enough memory a chance to handle it.
There is the prefetch setting, but I can't see how it can be configured to meet this requirement.
Finally, I found an option to roll back a transaction, so the consumer can check whether it has enough hardware resources to handle the task and, if not, roll back, which returns the message to the queue and lets the next consumer take it, and so forth.
I'm not sure if that's the right approach, or whether there is a better way?
The messages could have properties set which indicate whether or not they will require high CPU and/or memory, and consumers could then use selectors to receive only the messages which fit their hardware constraints.
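RabbitMQ itself has no JMS-style selectors, so in practice the check usually happens in the consumer (or by routing heavy tasks to a dedicated queue). A minimal sketch of the consumer-side check, assuming a hypothetical "estimated_memory_mb" header set by the producer and a hypothetical per-worker budget:

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.DefaultConsumer;
    import com.rabbitmq.client.Envelope;
    import java.io.IOException;
    import java.util.Map;

    public class CapacityAwareConsumer extends DefaultConsumer {
        private static final long MEMORY_BUDGET_MB = 2048; // assumed capacity of this worker

        public CapacityAwareConsumer(Channel channel) {
            super(channel);
        }

        @Override
        public void handleDelivery(String consumerTag, Envelope envelope,
                                   AMQP.BasicProperties properties, byte[] body) throws IOException {
            Map<String, Object> headers = properties.getHeaders();
            long required = (headers != null && headers.get("estimated_memory_mb") != null)
                    ? ((Number) headers.get("estimated_memory_mb")).longValue()
                    : 0L;

            if (required > MEMORY_BUDGET_MB) {
                // Requeue so a consumer with more capacity can pick it up (like the rollback idea above).
                getChannel().basicReject(envelope.getDeliveryTag(), true);
                return;
            }
            // ...handle the task...
            getChannel().basicAck(envelope.getDeliveryTag(), false);
        }
    }

Be aware that a requeued message may be redelivered to the same consumer; routing heavy tasks to a separate queue consumed only by large workers avoids that ping-pong.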
We have a Java application that gets messages from RabbitMQ using Spring AMQP.
For some of the queues, the number of consumers does not increase, resulting in a slower message delivery rate.
E.g. even though max consumers is set to 50, the number of consumers stayed at 6 most of the time under a load of 9,000 messages.
However, this is not the case with other queues, i.e. the consumer count reached 35 for those.
We are using SimpleMessageListenerContainer's setMaxConcurrentConsumers API for setting max consumers.
Can someone please help me to understand this?
Configuration:
number of concurrent consumers: 4
number of max concurrent consumers: 50
When asking questions like this, you must always show configuration. Edit your question with complete details.
It depends on your configuration. By default, a new consumer is only added once every 10 seconds, and only if an existing consumer receives 10 messages without any gaps.
If that still doesn't answer your question, turn on DEBUG logging. If you can't figure it out from that, post the log (covering at least startConsumerMinInterval milliseconds) someplace like pastebin or dropbox.
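For reference, a minimal sketch of the container settings being discussed (Spring AMQP); the queue name and the specific values are assumptions for illustration:

    import org.springframework.amqp.core.MessageListener;
    import org.springframework.amqp.rabbit.connection.ConnectionFactory;
    import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

    public class ListenerConfig {

        public SimpleMessageListenerContainer container(ConnectionFactory connectionFactory) {
            SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
            container.setConnectionFactory(connectionFactory);
            container.setQueueNames("work.queue");          // assumed queue name
            container.setConcurrentConsumers(4);            // initial consumers
            container.setMaxConcurrentConsumers(50);        // upper bound
            container.setConsecutiveActiveTrigger(10);      // default: 10 gap-free deliveries needed...
            container.setStartConsumerMinInterval(10000);   // ...and a new consumer at most every 10 s
            container.setMessageListener((MessageListener) message -> {
                // ...handle the message...
            });
            return container;
        }
    }

Lowering consecutiveActiveTrigger and startConsumerMinInterval makes the container scale up more aggressively.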
Where the question came from:
We are using RabbitMQ as a task queue. One of the specific tasks is sending notices to the Vkontakte social network. Their API has a requests-per-second limit, and the limit depends on your application size: just 3 calls per second for an app with fewer than 100k users, and so on. So we need to artificially limit our requests to their service. Right now this logic is application-based. It is simple as long as you can use just one worker per such queue: set something like sleep(300ms) and be calm. But when you have to use N workers, this synchronization becomes non-trivial.
How to limit throughput with RabbitMQ?
Based on the story above: if it were possible to set the prefetch not only by message count but also by time, this logic could be much simpler. For example, "qos of 1 message per fetch, no more often than once per second", or similar.
Is there something like this?
Or maybe there is another strategy for this?
This is not possible out of the box with RabbitMQ.
You're right: with distributed consumers this throttling becomes a difficult exercise. I would suggest having a look at ZooKeeper, which would allow you to synchronize all consumers and throttle message processing by leveraging its znodes / watches, for a throttled yet scalable solution.
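For a single worker, a simple client-side limiter is enough; coordinating the limit across N workers is where ZooKeeper (or simply giving each worker a 1/N share of the rate) comes in. A sketch using Guava's RateLimiter, taking the 3-requests-per-second figure from the question as an assumption:

    import com.google.common.util.concurrent.RateLimiter;
    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.DefaultConsumer;
    import com.rabbitmq.client.Envelope;
    import java.io.IOException;

    public class ThrottledConsumer extends DefaultConsumer {
        // 3 calls per second allowed by the external API (per the question); divide by the
        // number of workers if several consumers share the same limit.
        private final RateLimiter limiter = RateLimiter.create(3.0);

        public ThrottledConsumer(Channel channel) {
            super(channel);
        }

        @Override
        public void handleDelivery(String consumerTag, Envelope envelope,
                                   AMQP.BasicProperties properties, byte[] body) throws IOException {
            limiter.acquire(); // blocks until a permit is available
            // ...call the rate-limited external service here...
            getChannel().basicAck(envelope.getDeliveryTag(), false);
        }
    }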
In our project, we want to use RabbitMQ in the "Task Queues" pattern to pass data.
On the producer side, we built a few TCP servers (in node.js) to receive highly concurrent data and send it to MQ without doing anything else.
On the consumer side, we use the Java client to get the task data from MQ, handle it, and then ack.
So the question is:
To get the maximum message-passing throughput/performance (for example, 400,000 msg/second), how many queues is best? Do more queues mean better throughput/performance? And is there anything else I should pay attention to?
Is there any known best-practices guide for using RabbitMQ in such a scenario?
Any comments are highly appreciated!!
For best performance in RabbitMQ, follow the advice of its creators. From the RabbitMQ blog:
RabbitMQ's queues are fastest when they're empty. When a queue is empty, and it has consumers ready to receive messages, then as soon as a message is received by the queue, it goes straight out to the consumer. In the case of a persistent message in a durable queue, yes, it will also go to disk, but that's done in an asynchronous manner and is buffered heavily. The main point is that very little book-keeping needs to be done, very few data structures are modified, and very little additional memory needs allocating.
If you really want to dig deep into the performance of RabbitMQ queues, this other blog entry of theirs goes into the data much further.
According to a response I once got from the rabbitmq-discuss mailing group there are other things that you can try to increase throughput and reduce latency:
Use a larger prefetch count. Small values hurt performance.
A topic exchange is slower than a direct or a fanout exchange.
Make sure queues stay short. Longer queues impose more processing overhead.
If you care about latency and message rates then use smaller messages.
Use an efficient format (e.g. avoid XML) or compress the payload.
Experiment with HiPE, which helps performance.
Avoid transactions and persistence. Also avoid publishing in immediate or mandatory mode. Avoid HA. Clustering can also impact performance.
You will achieve better throughput on a multi-core system if you have multiple queues and consumers.
Use at least v2.8.1, which introduces flow control. Make sure the memory and disk space alarms never trigger.
Virtualisation can impose a small performance penalty.
Tune your OS and network stack. Make sure you provide more than enough RAM. Provide fast cores and RAM.
You will increase throughput with a larger prefetch count AND by ACKing multiple messages at a time (instead of sending an ACK for each message) from your consumer.
But, of course, ACKing with the multiple flag on (http://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.ack) requires extra logic in your consumer application (http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2013-August/029600.html). You will have to keep track of the delivery tags of the messages delivered by the broker and their status (whether your application has handled them or not), and ACK every N-th delivery tag (NDTAG) once all of the messages with a delivery tag less than or equal to NDTAG have been handled.
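A minimal sketch of batched acks for a single-threaded consumer, where deliveries arrive in order on one channel (the batch size below is an assumption and should stay below the prefetch count):

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.DefaultConsumer;
    import com.rabbitmq.client.Envelope;
    import java.io.IOException;

    public class BatchAckConsumer extends DefaultConsumer {
        private static final int BATCH_SIZE = 50; // assumed batch size
        private int unacked = 0;

        public BatchAckConsumer(Channel channel) {
            super(channel);
        }

        @Override
        public void handleDelivery(String consumerTag, Envelope envelope,
                                   AMQP.BasicProperties properties, byte[] body) throws IOException {
            // ...handle the message...
            if (++unacked >= BATCH_SIZE) {
                // multiple=true acks this delivery and every earlier unacked delivery on this channel
                getChannel().basicAck(envelope.getDeliveryTag(), true);
                unacked = 0;
            }
        }
    }

If the flow of messages can stop mid-batch, you also need a timer or shutdown hook that acks whatever is still outstanding.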