I'm evaluating using RabbitMQ as a message queue broker/framework to replace an internally built message queue (using C# if that matters).
I will have a service with N threads, each thread being a consumer for a specific queue. More than one thread may share the same queue. I believe I would set a prefetch of 1 for each consumer so that each consumer thread receives one message at a time.
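For context, here is roughly what I picture each consumer thread doing (a minimal sketch using the RabbitMQ Java client for illustration; I'd use the equivalent BasicQos call in the .NET client):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class ReportConsumer {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // assumed broker host
            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            channel.queueDeclare("reports", true, false, false, null);
            // prefetch = 1: the broker sends this consumer at most one
            // unacknowledged message at a time
            channel.basicQos(1);

            channel.basicConsume("reports", false, (consumerTag, delivery) -> {
                // ... run the report ...
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            }, consumerTag -> { });
        }
    }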
Let's use an example where I have 4 consumer threads that all consume from a queue named "reports". These reports can be run for different "customers". I want to avoid letting one customer monopolize the queue, so let's say I don't want any single customer to use more than 2 consumers at any given time (if there are other queued messages waiting). If there are no other queued messages waiting for other customers, then I'd like all consumers to be eligible.
With my limited RabbitMQ understanding so far, I believe that I could either use a topic pattern to indicate the customer or I could use a custom header.
I'd like to know if there's a design pattern to support defining a limit per unique header/customer value.
I'm not asking for anyone to write the code for me, I just want to know if anyone can tell me "no that won't work" in advance before I waste a bunch of time getting ramped up.
If my question doesn't make sense please let me know and I'll update it with more information. Thanks in advance.
Related
Let's say I have one ActiveMQ broker and an undefined number of consumers.
Problem:
To process a message, consumers need an external service which is either "DATA1" or "DATA2" (specified in the message)
Each server, "DATA1" and "DATA2", can only handle 20 connections
So at most 20 "DATA1" and 20 "DATA2" messages must be dispatched at any time
Because of prioritization, the messages must be enqueued in the same queue
Even if message A has a higher priority than message B, if A can't be processed because the external service has no free slots, message B needs to be processed instead
How can this be solved? As long as I was using message pulling (a prefetch of 0), I was able to do this with a BrokerPlugin that, on messagePull, enforced the limits using semaphores and selectors. If the limits were reached, the pull returned null.
However, due to performance issues I had to set prefetch to 1 and use push instead. Therefore, my messagePull hack no longer works (it's never called).
So far I'm considering implementing a custom Cursor but I was wondering if someone knows a better solution.
Update: the custom cursor worked but broke features like message removal. I tried a custom Queue and QueueDispatchSelector (which is a pain to configure since there isn't a proper API for it), and it mostly works, but I still have synchronisation issues.
Also, a very suitable API seems to be DispatchPolicy; however, while it is referenced by Queue, it is never used.
Queues give you buffering for system processing time for free, and messages are delivered on demand. A prefetch of 0 or 1 should effectively get you there: messages will only be delivered to a consumer when the consumer is ready (i.e., during the consumer.receive() method).
consumer.receive() is a blocking call, so you should not need any custom plugin or other mechanism to delay delivery until the consumer process (and its required downstream services) is ready to handle it.
This behavior should work out of the box; if it doesn't, there are probably details of your use case that haven't been provided that would shed more light on the scenario.
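For example, here is a minimal pull-style consumer sketch using the plain JMS API (the broker URL, queue name, and DATA1/DATA2 slot handling are assumptions):

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class SlotAwareConsumer {
        public static void main(String[] args) throws Exception {
            // Keep the prefetch minimal so the broker dispatches on demand
            ConnectionFactory factory = new ActiveMQConnectionFactory(
                    "tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=1");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createQueue("jobs"));

            while (true) {
                // Blocks until a message is available; nothing is dispatched
                // to this consumer until it is ready to receive
                Message message = consumer.receive();
                // acquire a DATA1/DATA2 connection slot here, process, then:
                message.acknowledge();
            }
        }
    }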
I am trying to use Spring JMS and ActiveMQ to process a large number of messages. The context of the problem is the following:
Each customer produces a set of messages that are added to the queue, with the customer id as a parameter.
In one case, customer A may add 10k messages to the queue while customer B adds only 100 messages to that same queue. My issue is that customer B needs to wait until all 10k messages have finished processing before its 100 messages are processed.
Is there a way to process some of customer A's messages and some of customer B's messages at the same time? I know there is the option to set a higher priority on customer B's messages, but that does not solve the issue when there are multiple customers. The customer with the most messages will fill up the queue while the others have to wait.
I would appreciate if you could provide some help or advice.
The basic semantic of a queue is first-in-first-out (i.e. FIFO). There's no real way to escape that. I recommend you redesign your application to use multiple queues - one for each "type" of message or independent application you have.
I would say you can fine-tune the number of messages ActiveMQ processes in batches. There are also ways to fine-tune a given broker and queue. For more details, refer to these links:
http://activemq.apache.org/performance-tuning
https://access.redhat.com/documentation/en-US/Fuse_ESB/4.4.1/html-single/ActiveMQ_Tuning_Guide/index.html
I think I have found the solution to the issue. It involves using message groups: for each message, I set the property JMSXGroupID to an identifier for the customer.
Since there are multiple message groups, the queue takes care of assigning messages from different groups to different consumers. That way, documents from customer B can be processed while those from customer A are still being processed.
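In case it helps, here is roughly how the property can be set (a sketch using Spring's JmsTemplate; the queue name "documents" and the customer-id scheme are just examples):

    import javax.jms.Message;
    import org.springframework.jms.core.JmsTemplate;

    public class DocumentSender {
        private final JmsTemplate jmsTemplate;

        public DocumentSender(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        public void send(Object payload, String customerId) {
            jmsTemplate.convertAndSend("documents", payload, (Message message) -> {
                // Messages with the same group id always go to the same consumer;
                // different groups are balanced across the available consumers
                message.setStringProperty("JMSXGroupID", "customer-" + customerId);
                return message;
            });
        }
    }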
I have a question about multi-consumer concurrency.
I want to send work items that come in from web requests to distributed RabbitMQ queues.
I just want to be sure about the ordering of work items across multiple queues (FIFO).
Because these requests come from different users, each user's requests/work items must be ordered.
I have found this feature under different names on Azure Service Bus and in ActiveMQ's message grouping.
Is there any way to do this in plain RabbitMQ?
I want to guarantee that each customer's requests are ordered relative to each other.
Each customer may have multiple requests, but the requests for a given customer must be processed in order.
I want to process incoming requests quickly, using multiple consumers on different nodes.
For example, different customers, 1 to 1000, send over 1 million requests in total.
If I put this huge number of requests into only one queue, it takes a lot of time to consume. So I want to share the processing load between n (say 5) nodes, while customer X's requests are still processed in sequence.
When working with event-based systems, and especially when using multiple producers and/or consumers, it is important to come to terms with the fact that there is usually no such thing as a guaranteed order of events. To get a robust system, it is also wise to design it so the message handlers are idempotent; they should tolerate getting the same message twice (or more).
There are way too many things that may (and actually should be allowed to) interfere with the order:
The producers may deliver the messages at a slightly different pace
One producer might miss an ack (due to a lost packet) and will resend the message
One consumer may get and process a message, but the ack is lost on the way back, so the message is delivered twice (to another consumer).
Some other service that your handlers depend on might be down, so that you have to reject the message.
That being said, there is one pattern that service-bus systems like NServiceBus use to enforce the order in which messages are consumed. There are some requirements:
You will need centralized storage (like an SQL server or document store) that allows for conditional updates; for instance, you want to be able to store the sequence number of the last processed message (or how far you have come in the process), but only if the already-stored sequence/progress is the right/expected one. Storing the user id and the progress, even for millions of customers, should be a very easy operation for most databases.
You make sure the queue is configured with a dead-letter queue/exchange for retries, and then set your original queue as the dead-letter queue for that one in turn (see the sketch at the end of this answer).
You set a TTL (for instance 30 seconds) on the retry/dead-letter queue. This way, messages that appear on the dead-letter queue are automatically pushed back to your original queue after the timeout.
When processing your messages, you check your storage/database to see whether you are in the right state to handle the message (i.e., whether the needed previous steps are already done).
If you are OK to handle it, you do so and update the storage (conditionally!).
If not, you nack the message so that it is thrown onto the dead-letter queue. Basically you are saying: "nah, I can't handle this message; there are probably some other messages in the queue that should be handled first".
This way, the happy path is to process a great number of messages in the right order.
But if something happens and you get a message out of order, you throw it on the retry queue (the dead-letter queue) and Rabbit makes sure it gets back into the original queue to be retried at a later stage, but only after a delay.
The beauty of this is that you are able to handle most of the situations that may interfere with processing the message (out-of-order messages, dependent services being down, your handler being shut down in the middle of handling a message) in exactly the same way: by rejecting the message and letting your infrastructure (Rabbit) take care of retrying it after a while.
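Here is a minimal sketch of the two-queue topology described above (the queue names and the 30-second TTL are example values only):

    import com.rabbitmq.client.Channel;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    public class RetryTopology {
        public static void declare(Channel channel) throws IOException {
            // Work queue: rejected (nacked, requeue=false) messages are
            // dead-lettered to the retry queue via the default exchange
            Map<String, Object> workArgs = new HashMap<>();
            workArgs.put("x-dead-letter-exchange", "");
            workArgs.put("x-dead-letter-routing-key", "work.retry");
            channel.queueDeclare("work", true, false, false, workArgs);

            // Retry queue: no consumers; when the TTL expires, messages are
            // dead-lettered straight back to the work queue
            Map<String, Object> retryArgs = new HashMap<>();
            retryArgs.put("x-message-ttl", 30000);
            retryArgs.put("x-dead-letter-exchange", "");
            retryArgs.put("x-dead-letter-routing-key", "work");
            channel.queueDeclare("work.retry", true, false, false, retryArgs);
        }
    }

With this in place, a channel.basicNack(deliveryTag, false, false) (requeue=false) in the handler is what sends an out-of-order message to the retry queue.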
(Assuming the OP is asking about things like ActiveMQ's "message grouping".)
This isn't currently built into RabbitMQ AFAIK (it wasn't as of 2013, per this answer) and I'm not aware of it now (though I haven't kept up lately).
However, RabbitMQ's model of exchanges and queues is very flexible: exchanges and queues can easily be created dynamically. (This can be done in other messaging systems too, but, for example, if you read the ActiveMQ or Red Hat AMQ documentation, you'll find that all of the examples in the user guides use pre-declared queues in configuration files loaded at system startup, except for RPC-like request/response communication.)
Also it is very easy in RabbitMQ for a consumer (i.e., message consuming thread) to consume from multiple queues.
So you could build, on top of RabbitMQ, a system that gives you your desired grouping semantics.
One way would be to create dynamic queues: the first time a customer order (or a new group of customer orders) is seen, a queue is created with a unique name for all messages in that group. That queue name is communicated (via another queue) to a consumer whose sole purpose is to load-balance among the consumers responsible for handling customer-order groups. I.e., the load-balancer pulls a message off its queue saying "new group with queue name XYZ", finds a consumer in the pool of order-group consumers that can take the load, and passes it a message saying "start listening to XYZ".
Another way to do it is with pub/sub and topic routing: each customer-order group gets a unique topic, and you proceed as above.
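To make the first variant concrete, here is a rough sketch (the exchange name, queue-naming scheme, and control queue are all hypothetical):

    import com.rabbitmq.client.Channel;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class GroupRegistrar {
        // Called the first time a customer-order group is seen
        public static void registerGroup(Channel channel, String groupId) throws IOException {
            String groupQueue = "orders.group." + groupId; // hypothetical naming scheme
            channel.exchangeDeclare("orders", "topic", true);
            channel.queueDeclare(groupQueue, true, false, false, null);
            channel.queueBind(groupQueue, "orders", "orders." + groupId);

            // Announce the new group's queue to the load-balancer consumer,
            // which assigns it to a worker from the pool
            channel.queueDeclare("group-announcements", true, false, false, null);
            channel.basicPublish("", "group-announcements", null,
                    groupQueue.getBytes(StandardCharsets.UTF_8));
        }
    }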
RabbitMQ Consistent Hash Exchange Type
We are using RabbitMQ and we have found a plugin. It uses a consistent-hashing algorithm to distribute messages to queues based on their routing keys.
For more information about consistent hashing:
https://en.wikipedia.org/wiki/Consistent_hashing
https://www.youtube.com/watch?v=viaNG1zyx1g
You can find this plugin on the RabbitMQ plugins page:
plugin : rabbitmq_consistent_hash_exchange
https://www.rabbitmq.com/plugins.html
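To sketch how the exchange is used (the exchange name, queue names, and five-way split are just examples mirroring the scenario above):

    import com.rabbitmq.client.Channel;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class HashTopology {
        public static void declareAndPublish(Channel channel, String customerId, String body)
                throws IOException {
            // Requires: rabbitmq-plugins enable rabbitmq_consistent_hash_exchange
            channel.exchangeDeclare("orders-hash", "x-consistent-hash", true);

            // The binding's routing key is a weight; "1" gives each queue an equal share
            for (int i = 1; i <= 5; i++) {
                String queue = "orders-node-" + i;
                channel.queueDeclare(queue, true, false, false, null);
                channel.queueBind(queue, "orders-hash", "1");
            }

            // Publishing with the customer id as the routing key sends all of one
            // customer's messages to the same queue, preserving their relative order
            channel.basicPublish("orders-hash", customerId, null,
                    body.getBytes(StandardCharsets.UTF_8));
        }
    }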
Before a consumer nacks a message, is there any way it can modify the message's state so that, when it consumes the message again upon redelivery, it sees that changed state? I'd rather not reject and re-enqueue a new message, but please let me know if that's the only way to accomplish this.
My goal is to determine how many times specific messages are being redelivered. I see two ways of doing this:
(1) On the message itself as described above. The message would be a container of basic stats and the application payload message.
(2) In some external storage. We would uniquely identify the message by the message id that we set.
I know (2) is possible, but my question is whether (1) is possible.
There is no way to do (1) the way you want: you would need to change the message, and thus the message would become a different message. If you want to do something like that (and it's possible that this is what you meant by "I'd rather not reject + reenqueue new message"), you should ACK the message, increment one field in it, and publish it again. So your message payload would have some ID, a counter, and the (obviously different each time) actual content.
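To illustrate, a minimal sketch of that ack-and-republish approach (the "x-attempts" header name and the "work" queue are our own choices, not anything standard):

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Delivery;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    public class Republisher {
        // Inside the consumer callback: ack the original, then republish a
        // copy with an incremented attempt counter in the headers
        public static void ackAndRepublish(Channel channel, Delivery delivery)
                throws IOException {
            Map<String, Object> headers = new HashMap<>();
            if (delivery.getProperties().getHeaders() != null) {
                headers.putAll(delivery.getProperties().getHeaders());
            }
            Object current = headers.get("x-attempts");
            int attempts = current == null ? 0 : ((Number) current).intValue();
            headers.put("x-attempts", attempts + 1);

            AMQP.BasicProperties props =
                    delivery.getProperties().builder().headers(headers).build();
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            channel.basicPublish("", "work", props, delivery.getBody()); // "work" is assumed
        }
    }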
A definitely much better way is (2), for multiple reasons:
it does not interfere with business logic; the diagnostic part is isolated
you leave re-queueing to RabbitMQ (as you are supposed to), meaning you don't have to worry about losing messages or handling message meta-information that has no use for your business logic
ACKing and NACKing are actually meant to be used this way; that's why they're in the AMQP specification
since you do need the number of times specific messages have been redelivered, you have it somewhere external, meaning it is independent of (RabbitMQ's) message persistence, lifetime, and potentially queue durability, mirroring, etc.
Even though this question was marked as solved some time ago, I want to mention that there is now a way, at least for counting redeliveries; it may have been added after the original answer. RabbitMQ has a different type of queue called quorum queues.
Quorum queues offer the option to set a redelivery limit:
Quorum queues support poison message handling via a redelivery limit. This feature is currently unique to Quorum queues.
To achieve this, RabbitMQ counts the number of deliveries in a header. The header attribute is called: x-delivery-count
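For reference, a minimal sketch of declaring such a queue (the delivery limit of 3 is just an example):

    import com.rabbitmq.client.Channel;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    public class QuorumQueueExample {
        public static void declare(Channel channel) throws IOException {
            Map<String, Object> args = new HashMap<>();
            args.put("x-queue-type", "quorum");
            // After 3 redeliveries the message is dropped (or dead-lettered,
            // if a dead-letter exchange is configured)
            args.put("x-delivery-limit", 3);
            channel.queueDeclare("work.quorum", true, false, false, args);
            // On redelivered messages, consumers can read the current count
            // from the "x-delivery-count" header
        }
    }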
I need to know: what is the difference between prefetch count and no-ack in RabbitMQ?
Also
What is the difference between the following statements?
If I set a prefetch count of, say, 10, are 10 consumer threads created?
Or:
If I register 10 consumers, will that create 10 threads?
Which of the above is more efficient?
To answer this specifically for spring-amqp:
prefetchCount=10 means the broker allows up to 10 unacked messages for each consumer; it does not affect the number of threads.
Use concurrentConsumers to create multiple consumers, which will have one thread each.
auto ack means the broker doesn't require acks (so you can lose messages). Spring AMQP also blocks deliveries (up to the prefetch count) if the listener can't keep up.
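For example, a minimal sketch (the queue name and counts are placeholders):

    import org.springframework.amqp.core.MessageListener;
    import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
    import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;

    public class ListenerConfig {
        public static SimpleMessageListenerContainer container() {
            CachingConnectionFactory cf = new CachingConnectionFactory("localhost");
            SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(cf);
            container.setQueueNames("work");
            container.setPrefetchCount(10);      // up to 10 unacked messages per consumer
            container.setConcurrentConsumers(5); // 5 consumers, one thread each
            container.setMessageListener((MessageListener) message -> {
                // handle the message
                System.out.println(new String(message.getBody()));
            });
            return container;
        }
    }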
Prefetch count: how many messages the consumer should read from the queue and keep internally, rather than picking up one message at a time.
No-ack: do not acknowledge back that the consumer is done consuming the message.
Both are used to fine-tune your setup.
To address the second part of your question:
If you set the prefetch count to 10, 10 consumers won't be created; instead, your single consumer will fetch up to 10 messages at a time.
And if you create 10 consumers, that will most likely create 10 threads (or processes); it all depends on how you configure it. Most likely, though, you will want to use a thread pool.
I know this question is old, but part of it was never specifically answered, so for anyone who comes here later looking for answers:
If you don't want new messages sent to you as soon as you acknowledge the previous ones, but instead want a message sent only when you explicitly request one, then you don't want to set up a "consumer" (in RabbitMQ terminology) at all; specifically, you'll want to use AMQP's basic.get operation (which just fetches a single message without creating a consumer) rather than the more common basic.consume operation (which registers a consumer that is sent messages as they become available).
Different libraries and frameworks will have different ways of accomplishing this; for example, in Ruby, using the Bunny client, you can call message = queue.get instead of queue.subscribe do .... In Java, you'd do something like GetResponse response = channel.basicGet("some.queue", false);.
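For completeness, here is a slightly fuller sketch of the basic.get flow using the RabbitMQ Java client (manual acks assumed; "some.queue" is taken from the example above):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.GetResponse;
    import java.io.IOException;

    public class Puller {
        // Pull a single message on demand instead of registering a consumer
        public static void pullOne(Channel channel) throws IOException {
            GetResponse response = channel.basicGet("some.queue", false); // autoAck=false
            if (response == null) {
                // The queue was empty at the time of the call
                return;
            }
            byte[] body = response.getBody();
            // ... process the body ...
            channel.basicAck(response.getEnvelope().getDeliveryTag(), false);
        }
    }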