While viewing the RabbitMQ Grafana plot for consumer utilisation, I see an inverted graph, while all the other graphs (such as rabbitmq_queue_messages and rabbitmq_queue_messages_ram)
indicate that performance in the left-hand graph is better (2x). How can that be?
According to articles and the RabbitMQ docs, higher consumer utilisation is better. Why does the consumer utilisation graph look different?
Please suggest.
Edit:
The definition of consumer utilization is the proportion of time that a queue's consumers could take new messages.
For the same message push rate (e.g. 1 per second), does
low consumer utilization mean the queue is more efficient?
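To make the definition quoted in the edit concrete, here is a minimal sketch that computes the proportion of an observation window during which consumers could take a new message. The function name and the interval representation are assumptions made for illustration only; this is not how the broker itself computes the metric.

```python
def consumer_utilisation(window_seconds, ready_intervals):
    """Fraction of the window in which consumers could accept a delivery.

    ready_intervals: non-overlapping (start, end) pairs, in seconds from the
    start of the window, during which at least one consumer was able to take
    a message (hypothetical input, chosen for illustration).
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    ready = 0.0
    for start, end in ready_intervals:
        # Clip each interval to the observation window.
        start = max(0.0, start)
        end = min(float(window_seconds), end)
        if end > start:
            ready += end - start
    return ready / window_seconds


# Consumers were able to take messages for 4 s + 3 s of a 10 s window.
print(consumer_utilisation(10, [(0, 4), (6, 9)]))
```

Under this definition the value depends only on how often consumers were ready, not on how many messages actually flowed, so it is read separately from the queue-depth graphs.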
We have a design challenge where the situation is as follows:
There are multiple producers and multiple consumers (on the same queue).
Each message represents a task, with parameters, that a consumer needs to handle.
The problem is that certain tasks take a lot of memory (and CPU power), more than some consumers have the capacity to handle. The good thing is that we know in advance approximately how much memory (and CPU power) a task will need, so we could prevent a consumer from taking that task and give a chance to another consumer with enough memory to handle it.
There is the prefetch setting, but I can't see how it can be configured to meet this requirement.
Finally, I found an option to roll back a transaction, so the consumer can check whether it has enough hardware resources to handle the task and, if not, roll back, which returns the message to the queue and lets the next consumer take it, and so forth.
I'm not sure whether that's the right approach or whether there is a better way.
The messages could have properties set which indicate whether or not they will require high CPU and/or memory and then consumers could use selectors to only receive the messages which fit their hardware constraints.
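The property check described above might look like the following sketch. The header names `required_memory_mb` and `required_cpu_cores`, the `Capacity` type and the `decide` helper are all hypothetical; how the decision is wired to the broker (a selector, or rejecting the message so it is requeued) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Resources this consumer is willing to commit to one task."""
    memory_mb: int
    cpu_cores: int


def can_handle(headers, capacity):
    """True if the task's declared needs fit within this consumer's capacity.

    Missing headers are treated as "no special requirement" (an assumption).
    """
    need_mem = int(headers.get("required_memory_mb", 0))
    need_cpu = int(headers.get("required_cpu_cores", 0))
    return need_mem <= capacity.memory_mb and need_cpu <= capacity.cpu_cores


def decide(headers, capacity):
    """Return "process" to handle the task here, or "requeue" to hand it back."""
    return "process" if can_handle(headers, capacity) else "requeue"


small_box = Capacity(memory_mb=2048, cpu_cores=2)
print(decide({"required_memory_mb": 512, "required_cpu_cores": 1}, small_box))
print(decide({"required_memory_mb": 8192, "required_cpu_cores": 4}, small_box))
```

Checking declared requirements up front avoids starting a task and only then discovering that the machine cannot finish it.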
I was trying to perform a load test on RabbitMQ messaging to see how many messages it can take into the queue and transfer to the target machine over a shovel.
Steps I followed:
The producer has 20 threads. Each thread sends messages to a dedicated queue (say ProducerQueue1 to ProducerQueue20). Each message is 51Mb in size. The messages are sent at random intervals using java.util.Random(50) seconds.
After each message is sent at a random second (between 1 and 50),
there is a sleep of 2 minutes. Therefore each producer thread sleeps for
2 minutes after every send.
The messages are sent in an infinite while loop.
There are shovels from each dedicated queue to the consumer-side dedicated queues (say ConsumerQueue1 to ConsumerQueue20).
The link speed is 100 Mbps.
Issue observed:
Initially the messages are transferred with no issues, but after some time the NETWORK AT CONSUMER SIDE IS CHOKED.
The reason for the choking is that, after a certain period of time, if the random seconds of even 4 or 5 of the 20 threads coincide, the consumer receives close to 250Mb of messages in one shot. Since the network speed is 100 Mbps as mentioned above, the network gets choked.
Due to this, the shovels are not able to exchange heartbeats to stay in the "running" state. This leads the shovels to move from the "running" to the "terminated" state. The shovels then try to re-establish a connection depending on the "reconnect delay".
Because the shovels break, messages start accumulating in the queues at the producer side.
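The choking described above can be sanity-checked with rough arithmetic. The sketch below assumes that "51Mb" means 51 megabytes per message and that the link carries 100 megabits per second with no protocol overhead; both are assumptions, and the function name is made up for illustration.

```python
MESSAGE_MEGABYTES = 51      # assumed: megabytes per message
LINK_MEGABITS_PER_SEC = 100  # link speed in megabits per second


def transfer_seconds(n_messages,
                     message_megabytes=MESSAGE_MEGABYTES,
                     link_megabits_per_sec=LINK_MEGABITS_PER_SEC):
    """Ideal time to push n messages through the link, ignoring overhead."""
    total_megabits = n_messages * message_megabytes * 8
    return total_megabits / link_megabits_per_sec


for n in (1, 4, 5):
    print(f"{n} message(s): {transfer_seconds(n):.2f} s")
```

Under these assumptions a single message occupies the link for roughly 4 seconds and five coinciding messages for roughly 20 seconds, during which heartbeat frames have to compete with the payload traffic.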
My Question:
The consumer's RabbitMQ memory starts increasing as the queues accumulate more messages. The memory crosses the watermark, so the purpose of the watermark is not served. I have 16 GB of RAM and I have set the watermark to 40% (i.e. 6.4 GB). But the memory still shoots up to 10 GB and doesn't recover, and the producer system hangs.
Can anyone please answer my question, and also tell me whether there could be any other reason for the network choking I mentioned above?
Thanks in advance.
As far as I know, RabbitMQ has internal flow control that blocks a producer that publishes messages faster than consumers can keep up. (It does not require any configuration.)
I'd like to know whether I can configure a quota (MB/sec) for each producer and client so that they do not burden the broker too much.
For example, a producer with a quota of 2 MB/sec cannot publish messages at a higher rate than 2 MB/sec.
There is no way to limit each single producer.
Flow control is there so that the broker is not burdened too much.
If needed, you can tune the memory threshold and the paging threshold:
https://www.rabbitmq.com/memory.html
About flow control, I suggest reading:
http://www.rabbitmq.com/blog/2014/04/14/finding-bottlenecks-with-rabbitmq-3-3/
and
https://www.rabbitmq.com/blog/2015/10/06/new-credit-flow-settings-on-rabbitmq-3-5-5/
I'd add that, from my side, it doesn't make much sense to limit a single producer: what happens if, for example, you have thousands of producers?
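If a per-producer cap such as the 2 MB/sec in the question is still wanted, it would have to be enforced in the producer's own code rather than by the broker. One possible approach is a token bucket; the sketch below is illustrative only, the class and method names are made up, and 1 MB is taken to be 1,000,000 bytes.

```python
import time


class TokenBucket:
    """Client-side byte-rate limiter (illustrative, not a RabbitMQ feature)."""

    def __init__(self, rate_bytes_per_sec, capacity_bytes=None,
                 clock=time.monotonic):
        self.rate = float(rate_bytes_per_sec)
        # By default allow a burst of up to one second's worth of bytes.
        self.capacity = float(rate_bytes_per_sec if capacity_bytes is None
                              else capacity_bytes)
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def delay_for(self, n_bytes):
        """Charge n_bytes and return how long to wait before publishing.

        The balance may go negative, so a message larger than the bucket
        is still allowed; it just has to wait longer.
        """
        self._refill()
        self.tokens -= n_bytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate


bucket = TokenBucket(rate_bytes_per_sec=2_000_000)
body = b"x" * 500_000
time.sleep(bucket.delay_for(len(body)))  # then publish `body` with your client
```

The producer calls `delay_for` with the size of each message body and sleeps for the returned time before publishing, which keeps its long-run rate at or below the configured value.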
All,
I have a problem with RabbitMQ performance when consuming a large number of messages, e.g. 280,000. Throughput seems to go up and down. The graph taken from the management console demonstrates this: a consumer averages around 40 messages per second but then jumps up to around 120 messages per second:
The pattern then repeats: it goes back to 40, up to 120 again, and so on.
Also, if I run the same test 1 hour later, the same up-and-down effect occurs but the range can vary vastly, e.g. from 140 to 400 messages per second.
Note: The consumer does nothing with the messages
Note: Single consumer and ConsumerMessagePrefetchCount = 500
In relation to performance I have the following questions:
Is this up and down behaviour normal and expected or should the consumption speed of messages be steady?
Are the numbers that I am quoting expected or should they be much better/worse?
Any help appreciated
Billy
This behavior is quite normal. Queues are designed to stay close to zero messages. 280,000 is a high number; it means that the producer is faster than the consumer(s), so you have to increase the number of consumers.
If you have a load spike, 280,000 might not be a high number, because you have time to consume the messages.
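For a sense of scale, the time needed to drain such a backlog at the rates quoted in the question can be estimated with a small helper. This is a back-of-the-envelope sketch; the function name is made up, and it assumes constant rates, which the question shows is not quite the case.

```python
def drain_seconds(backlog, consume_rate, publish_rate=0.0):
    """Seconds until the queue is empty, given constant rates in msg/s.

    Returns infinity when consumers are not faster than producers,
    because the backlog then never shrinks.
    """
    net_rate = consume_rate - publish_rate
    if net_rate <= 0:
        return float("inf")
    return backlog / net_rate


for rate in (40, 120):
    minutes = drain_seconds(280_000, rate) / 60
    print(f"{rate} msg/s: about {minutes:.0f} minutes")
```

With no new messages arriving, 280,000 messages take about 117 minutes to drain at 40 msg/s and about 39 minutes at 120 msg/s.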
There are lots of techniques to increase performance, for example:
Increase the number of consumer threads (how many threads do you use to consume the messages?)
Consume messages with noAck
PrefetchCount is very important; a high value might not be the right solution.
The consumers should be steady, but the producers should be steady too; in a load spike situation you need more time or more resources.
A few questions:
What rate do you have?
Do you consume the messages from the same queue?
Do you need the ACK?
Why do you have 280,000 messages in the queue? Is it just a test or a real situation?
I hope this is useful.
As Alexis Richardson (RabbitMQ) said:
The easiest way to increase performance is to change what you are measuring ;-)
In our project, we want to use RabbitMQ in the "Task Queues" pattern to pass data.
On the producer side, we built a few TCP servers (in node.js) to receive highly concurrent data and send it to the MQ without doing anything else.
On the consumer side, we use the Java client to get the task data from the MQ, handle it and then ack.
So the question is:
To get the maximum message-passing throughput (for example, 400,000 msg/second), how many queues is best? Do more queues mean better throughput? And is there anything else I should be aware of?
Are there any known best-practice guides for using RabbitMQ in such a scenario?
Any comments are highly appreciated!!
For best performance in RabbitMQ, follow the advice of its creators. From the RabbitMQ blog:
RabbitMQ's queues are fastest when they're empty. When a queue is
empty, and it has consumers ready to receive messages, then as soon as
a message is received by the queue, it goes straight out to the
consumer. In the case of a persistent message in a durable queue, yes,
it will also go to disk, but that's done in an asynchronous manner and
is buffered heavily. The main point is that very little book-keeping
needs to be done, very few data structures are modified, and very
little additional memory needs allocating.
If you really want to dig deep into the performance of RabbitMQ queues, this other blog entry of theirs goes into the data much further.
According to a response I once got from the rabbitmq-discuss mailing list, there are other things that you can try to increase throughput and reduce latency:
Use a larger prefetch count. Small values hurt performance.
A topic exchange is slower than a direct or a fanout exchange.
Make sure queues stay short. Longer queues impose more processing
overhead.
If you care about latency and message rates then use smaller messages.
Use an efficient format (e.g. avoid XML) or compress the payload.
Experiment with HiPE, which helps performance.
Avoid transactions and persistence. Also avoid publishing in immediate
or mandatory mode. Avoid HA. Clustering can also impact performance.
You will achieve better throughput on a multi-core system if you have
multiple queues and consumers.
Use at least v2.8.1, which introduces flow control. Make sure the
memory and disk space alarms never trigger.
Virtualisation can impose a small performance penalty.
Tune your OS and network stack. Make sure you provide more than enough
RAM. Provide fast cores and RAM.
You will increase the throughput with a larger prefetch count AND at the same time ACK multiple messages (instead of sending ACK for each message) from your consumer.
But, of course, ACK with multiple flag on (http://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.ack) requires extra logic on your consumer application (http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2013-August/029600.html). You will have to keep a list of delivery-tags of the messages delivered from the broker, their status (whether your application has handled them or not) and ACK every N-th delivery-tag (NDTAG) when all of the messages with delivery-tag less than or equal to NDTAG have been handled.
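The bookkeeping described above can be sketched as a small tracker that is independent of any client library. The class and method names are made up for illustration, and the sketch assumes delivery tags on a channel are integers that start at 1 and increase by one per delivery.

```python
class BatchAckTracker:
    """Decides when an ACK with the multiple flag is safe to send."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = set()      # delivered but not yet handled
        self.last_delivered = 0   # highest delivery tag seen so far
        self.last_acked = 0       # highest tag already covered by an ACK

    def _safe_tag(self):
        # Highest tag such that every tag <= it has been handled.
        if self.pending:
            return min(self.pending) - 1
        return self.last_delivered

    def delivered(self, tag):
        """Record a message handed to the application by the broker."""
        self.pending.add(tag)
        self.last_delivered = max(self.last_delivered, tag)

    def handled(self, tag):
        """Mark a message as processed.

        Returns the tag to ACK with multiple=True once at least
        batch_size messages can be covered, otherwise None.
        """
        self.pending.discard(tag)
        safe = self._safe_tag()
        if safe - self.last_acked >= self.batch_size:
            self.last_acked = safe
            return safe
        return None

    def flush(self):
        """Return a tag covering any handled-but-unacked remainder, or None."""
        safe = self._safe_tag()
        if safe > self.last_acked:
            self.last_acked = safe
            return safe
        return None


tracker = BatchAckTracker(batch_size=3)
for tag in range(1, 6):
    tracker.delivered(tag)
for tag in (2, 1, 3, 5, 4):          # messages may finish out of order
    ack_up_to = tracker.handled(tag)
    if ack_up_to is not None:
        print("ACK multiple up to", ack_up_to)
print("flush:", tracker.flush())
```

Whenever `handled` or `flush` returns a tag, the consumer sends one basic.ack for that tag with the multiple flag set; a message that is still being processed holds back the acknowledgement of every later message until it finishes.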