Difference between maxPageSize and prefetch limit in ActiveMQ

Can anyone please tell me the difference between maxPageSize and prefetch limit?

maxPageSize: according to the Apache website, this is the "maximum number of persistent messages to page from store at a time". It applies when messages are kept in a persistent store (like KahaDB). The broker holds references to the messages stored in the database, and maxPageSize limits how many of those references it pages in at once. Holding these references gives faster access to the stored messages (much as an index in a database improves performance).
Prefetch limit: controls how many messages the broker sends to a consumer in advance, to improve performance. If you set the prefetch limit to 0, the consumer has to keep polling the queue for messages. If you set it to 100, ActiveMQ sends up to 100 messages ahead of time (prefetched to the consumer), which saves the consumer the extra effort of checking the queue for each message.

Related

activemq memory limit configuration Vs maxPageSize

I am unable to understand how these two attributes differ: 'memoryLimit' and 'maxPageSize'.
As per documentation:
'maxPageSize' = 'maximum number of persistent messages to page from store at a time'
'memoryLimit' - corresponds to the amount of memory that's assigned to the in-memory store for the queue
Here is a sample configuration for a queue :
<policyEntry queue="Consumer.normal.queue" producerFlowControl="true" memoryLimit="3200" maxPageSize="4"
maxBrowsePageSize="1000" prioritizedMessages="true" useCache="false" expireMessagesPeriod="0" queuePrefetch="1">
What I have observed is that if maxPageSize="1" and memoryLimit="3200", I can see 2 messages loaded into memory, and they can be browsed via a JMS client (the rest of the messages get stored in KahaDB).
However, if maxPageSize="4" and memoryLimit="3200", I can see 4 messages loaded into memory, and they can be browsed via a JMS client.
So are the two values meant to serve the same purpose?
And does it mean that whichever of these two attributes allows the greater number of messages will be used by ActiveMQ?
maxPageSize determines how many messages ActiveMQ loads from the store (in your case, KahaDB) to hand to consumers. The memoryLimit indicates how much memory to allocate to keep messages in memory.
In short, keep (message size x maxPageSize <= memoryLimit) so that you do not hit producer flow control.
You want your page size to be much higher than 1 or 2 for ActiveMQ to perform (200 to 1000 to start). Numbers that low will have higher latency.
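The sizing rule above can be checked with a little arithmetic. The following is only a rough sketch: it assumes the memoryLimit of 3200 from the sample configuration is read as bytes, it uses a made-up average message size of 800 bytes, and it ignores any per-message overhead the broker adds. The class and method names are hypothetical.

```java
final class PageSizing {
    /**
     * Largest page size n such that avgMessageBytes * n <= memoryLimitBytes.
     * Rough estimate only: broker-side overhead per message is ignored.
     */
    static long maxPageSizeFor(long memoryLimitBytes, long avgMessageBytes) {
        if (memoryLimitBytes < 0 || avgMessageBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return memoryLimitBytes / avgMessageBytes;
    }

    /** True when a full page of messages fits inside the memory limit. */
    static boolean fits(long avgMessageBytes, long maxPageSize, long memoryLimitBytes) {
        return avgMessageBytes * maxPageSize <= memoryLimitBytes;
    }

    public static void main(String[] args) {
        long memoryLimit = 3200; // value from the sample policyEntry, assumed to be bytes
        long messageSize = 800;  // hypothetical average message size in bytes

        System.out.println(maxPageSizeFor(memoryLimit, messageSize)); // 4
        System.out.println(fits(messageSize, 4, memoryLimit));        // true
        System.out.println(fits(messageSize, 200, memoryLimit));      // false
    }
}
```

Under these assumptions, a page size of 200 only fits if memoryLimit is raised to at least 160000 bytes, so the two attributes have to be tuned together.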
Note: Priority is an anti-pattern in distributed messaging at significant load (over 1M messages per day). It works well in a local embedded broker within your Java VM process. ActiveMQ disables it by default.
To enable priority support, update the <policyEntry queue=".."> element inside your <destinationPolicy> and add this attribute: prioritizedMessages="true"

Camel RabbitMQ connector reads thousands of message before using them

In my app, we are using a Camel route to read messages from a RabbitMQ queue.
The configuration looks like this:
from("rabbitmq:myexchange?routingKey=mykey&queue=q")
The producer can send 50k messages within a few minutes, and each message can take 1 second or more to process.
What I can see is that ALL messages are consumed very fast, but the processing of these messages can take many hours. Many hours of processing is expected, but does that mean that the 50k messages are stored in memory? If so, I would like to disable this behavior because I don't want to lose messages when the process goes down... Actually, we are losing most of the messages even when the process stays up, which is even worse. It looks like the connector is not designed to handle so many messages at once, but I cannot say whether that is because of the connector itself or because we did not configure it properly.
I tried the autoAck option:
from("rabbitmq:myexchange?routingKey=mykey&queue=q&autoAck=false")
This way the messages are rolled back when something goes wrong, but keeping 50k messages unacknowledged at the same time does not seem to be a good idea anyway...
There are a couple of things that I would like to share.
AutoAck: Yes, when you want to process the message after receiving it, you should set autoAck to false and explicitly acknowledge the message once it has been processed.
Setting the consumer prefetch: You need to fine-tune the prefetch size. The prefetch size is the maximum number of messages that RabbitMQ will hand to the consumer at once, i.e. your total unacknowledged message count will be at most the prefetch size. Depending on your system, if every message is critical you can set the prefetch size to 1; if you have a multi-threaded model of processing messages, you can set the prefetch size to match the number of threads, so that each thread processes one message, and so on.
Architecturally, it acts like a buffer. If your process goes down while processing those messages, any message that was unacked at that point will still be in the queue, and the consumer will get it again for processing.
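Applied to the Camel route from the question, this could look roughly like the sketch below. It assumes the prefetchEnabled, prefetchSize, prefetchCount and prefetchGlobal options of the camel-rabbitmq component, which are expected to be set together; check them against the documentation of your Camel version. The helper class is hypothetical and only assembles the endpoint URI string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class RabbitEndpointUri {
    /**
     * Builds the consumer URI from the question with manual acks and a
     * bounded prefetch. Option names are assumed from camel-rabbitmq.
     */
    static String boundedConsumerUri(int prefetchCount) {
        if (prefetchCount < 1) {
            throw new IllegalArgumentException("prefetchCount must be >= 1");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("routingKey", "mykey");
        options.put("queue", "q");
        options.put("autoAck", "false");          // ack only after processing
        options.put("prefetchEnabled", "true");   // turn QoS on for the consumer
        options.put("prefetchSize", "0");         // 0 = no limit in bytes
        options.put("prefetchCount", String.valueOf(prefetchCount));
        options.put("prefetchGlobal", "false");   // limit per consumer, not per channel

        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "rabbitmq:myexchange?" + query;
    }

    public static void main(String[] args) {
        System.out.println(boundedConsumerUri(10));
    }
}
```

In a RouteBuilder the result would be used as from(RabbitEndpointUri.boundedConsumerUri(10)). With a prefetch count of 10, at most 10 of the 50k messages should be held unacknowledged by the consumer at any time, and the rest stay in the queue on the broker.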

RabbitMq: Disabling prefetching (prefetch_count=0) with auto-ack=false

Is it possible to disable prefetching with auto-ack=false? I just want to avoid reading a message (prefetching) from the queue every time I acknowledge one. I want to read a message only when I call 'consume_message'. Setting prefetch_count=0 doesn't seem to work; it's treated as 'no specific limit'.
UPDATED:
As I understand 'prefetch_count' is the number of messages cached on the client side (read locally into buffers). For example there is a use case:
(let's assume there is a queue we connect to and it has messages)
1. Create a connection.
2. Set Basic.Qos (prefetch_count=1)
3. Start consuming Basic.Consume
4. Due to the prefetch_count=1 one message is already transferred to the client and ready to be read and marked as not-ack'd.
5. Reading message and then processing it.
6. Then the message is ack'd. And everything starts from step 4.
I thought that setting prefetch_count to 0 would avoid the step 4 and a message is transferred only when you read it - no caching on the client side.
Prefetch and auto-acknowledgment are not related like that. Prefetch count is simply a number of unacknowledged messages prepared to be delivered to a specific consumer.
Let's say you set prefetch count to N. If you set auto-ack to true, this means that these N messages are ACKed upon receiving. If you set it to false, this means that you still get the N messages but they're not ACKed until you manually ACK them.
For the last part - try setting prefetch_count to 1.
Also check this question and both answers.
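To make the prefetch window concrete, here is a small single-threaded model of it. This is not the RabbitMQ client and makes no network calls; it is a toy sketch with hypothetical names that only mirrors the behavior described in this thread: the broker keeps delivering while the number of unacknowledged messages is below prefetch_count, each ack frees one slot, and a prefetch_count of 0 is treated as "no limit".

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PrefetchWindow {
    private final Deque<Integer> ready = new ArrayDeque<>();   // still on the broker
    private final Deque<Integer> unacked = new ArrayDeque<>(); // delivered, not yet acked
    private final int prefetchCount;                           // 0 = unlimited
    private int maxUnacked;

    PrefetchWindow(int prefetchCount, int queuedMessages) {
        this.prefetchCount = prefetchCount;
        for (int id = 1; id <= queuedMessages; id++) {
            ready.addLast(id);
        }
        deliver(); // the broker pushes as soon as consuming starts
    }

    /** Broker side: push messages until the window is full. */
    private void deliver() {
        while (!ready.isEmpty()
                && (prefetchCount == 0 || unacked.size() < prefetchCount)) {
            unacked.addLast(ready.pollFirst());
        }
        maxUnacked = Math.max(maxUnacked, unacked.size());
    }

    /** Consumer side: ack the oldest delivered message; returns its id or -1. */
    int ackOldest() {
        Integer id = unacked.pollFirst();
        if (id == null) {
            return -1;
        }
        deliver(); // the freed slot is refilled immediately
        return id;
    }

    int readyCount() { return ready.size(); }
    int unackedCount() { return unacked.size(); }
    int maxUnacked() { return maxUnacked; }

    public static void main(String[] args) {
        PrefetchWindow window = new PrefetchWindow(1, 5);
        System.out.println("ready=" + window.readyCount()
                + " unacked=" + window.unackedCount());
        while (window.ackOldest() != -1) {
            // process and ack one message at a time
        }
        System.out.println("max unacked seen=" + window.maxUnacked());
    }
}
```

In this model, a window of 1 never holds more than one message on the client, but the next message is still pushed as soon as the previous one is acked. That is the behavior the question describes and wants to avoid, and the model suggests it cannot be switched off through prefetch_count alone.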

Custom plugin development for RabbitMQ

I need to implement sequential message processing with multiple consumers, but with only one message at a time per queue. I have a lot of queues, but all of them are sequential, and I need multiple-consumer support for load balancing and redundancy. Can anybody tell me whether it is possible to limit the number of unacknowledged messages to 1 per queue?
Can anybody tell me whether it is possible to limit the number of unacknowledged messages to 1 per queue?
this isn't possible with multiple consumers. you can limit the number of unacknowledged messages using prefetch limit for a single channel, but not across multiple channels / consumers. it is tied to the channel of the consumer, not the queue.
the only way you can achieve this is with a single consumer and a single queue, using prefetch.
even then, you have no guarantee that the messages will arrive in the queue in the correct order.
(this is a fundamental difficulty with distributed systems of any kind, not a rabbitmq limitation)
look at the Message Sequencer and Resequencer patterns to try and put the messages back in order.
but even then, you're going to run into difficulty.
you'll also want to read up on idempotency so you don't re-process a message that has already been processed.
You should be able to configure your consumer to consume only X message(s) at a time, and the same for your channel. Take a look at QoS or Consumer Prefetch:
https://www.rabbitmq.com/consumer-prefetch.html
Here is an example where each consumer can have only one unacknowledged message, and the channel as a whole also allows only one unacknowledged message (however many consumers are attached to it):
Channel channel = ...;
Consumer consumer1 = ...;
Consumer consumer2 = ...;
channel.basicQos(1, false); // Per consumer limit
channel.basicQos(1, true); // Per channel limit
channel.basicConsume("my-queue1", false, consumer1);
channel.basicConsume("my-queue2", false, consumer2);
Here, a consumer can have only one unacknowledged message at a time, and the channel can only have one unacknowledged message. You didn't mention which language you use, so you'll probably have to adapt this example.

Get visibility on number of rabbitmq messages in flight when autoAck=true

I have a RabbitMQ setup where a (java) producer sends messages to a fanout exchange, which are handled by a consumer. It's no problem if messages get lost when the consumer dies, so for performance I set autoAck=true at the consumer side.
Now I'm investigating a situation in which the rate the consumer can handle messages, is lower than the rate at which they are sent.
After a while, a (huge) backlog of messages must queue up somewhere. Is there a way to get visibility on this backlog?
Using the RabbitMQ management interface does not work: the queue appears empty
Ready: 0
Unacknowledged: 0
Total: 0
I assume the queue is empty because the messages are prefetched without limit by the RabbitMQ client used by the consumer. But limiting the prefetch by e.g.
channel.basicQos(10)
does not help either, probably because this only limits unacknowledged messages, and with autoAck=true, messages are ack'ed from the moment they are prefetched by the client.
Setting autoAck=false (and explicit ack'ing on delivery) is a solution (the Unacknowledged counter keeps on rising), but I was wondering whether this is the only way?
Preferably I'd like to limit the amount of cached messages at the client side irrespective of acknowledgements, such that the backlog eventually becomes visible through the rabbitmqmanagement interface.
Alternatively, is there a way to query the number of messages sitting somewhere in the client's prefetch queue waiting to be delivered?
I suggest using a combination of basicQos and autoAck=false. This will make everything show up in the queues both through the admin website and the REST APIs. Having an unlimited number of messages sent to each consumer seems to defeat the point of a queue.
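If your queues are time sensitive, you can also add a TTL on the queues so that messages which wait too long (as an example, 60 minutes) are automatically expired. Expired messages are dropped, or dead-lettered if a dead-letter exchange is configured; they are not Nacked back to the producer.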
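A per-queue TTL like the one suggested above is set through the x-message-ttl queue argument, given in milliseconds. The sketch below only builds the argument map; the class and method names are hypothetical, and the 60-minute value is just the example figure from the answer.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

final class QueueTtlArgs {
    /** Queue arguments that expire messages after the given time in the queue. */
    static Map<String, Object> withMessageTtl(Duration ttl) {
        if (ttl.isNegative()) {
            throw new IllegalArgumentException("ttl must not be negative");
        }
        Map<String, Object> args = new HashMap<>();
        // x-message-ttl is expressed in milliseconds
        args.put("x-message-ttl", Math.toIntExact(ttl.toMillis()));
        return args;
    }

    public static void main(String[] args) {
        System.out.println(withMessageTtl(Duration.ofMinutes(60)));
    }
}
```

With the RabbitMQ Java client, the map would be passed as the last parameter of channel.queueDeclare(queue, durable, exclusive, autoDelete, arguments). Note that a queue that already exists without this argument cannot simply be redeclared with it.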