I want to implement a priority work queue in which the priority of a group of messages can change once they are in the queue. Since it is a work queue with variable processing time, the messages are not assigned using a round-robin algorithm, but are pulled from the queue when a resource is free (using a per-consumer limit).
I came up with 2 ideas for implementation:
Use a priority queue from RabbitMQ, and when a request for a priority change comes in, read the messages with this priority from the queue and re-send them with a different priority. (I am not sure this is a good approach, given the O(n) complexity.)
Use several queues with distinct names, one for each group of messages, and use a separate queue to communicate the current priority list (an ordered list of queue names) to the workers. (Using this approach, I am not sure how to make the list of priorities "persistent", so that a newly joined worker knows what the current priority list is.)
How would you implement it? Is RabbitMQ a viable option for this use case?
your idea "priority of a message can change once they are in the queue" IMO is not possible with rabbitmq because rabbitmq only allows you to get messages from the head of a queue.
for example:
you have N queues each used for a different priority
each queue has 100+ messages
your idea requires you to reach into the middle of a queue to get a specific message, but this is not possible with rabbitmq, so the thought experiment stops here: you can only get messages at the head of a queue
your idea IMO would require using something else besides rabbitmq.
a quick and dirty idea that would work with rabbitmq now and is similar to your idea:
create one rabbitmq queue with N priorities
submit a message with priority x
if you need to change the priority to a higher priority y, you could send the same message again with the new, higher priority y
this would ensure the new message is processed faster
the side effect is that you may process the same request twice
you could fix the side effect in your design by having some sort of database for synchronization that keeps track of which jobs are completed, which would avoid processing a job twice
there are many other details that would need to be addressed, like keeping the original message around somehow outside of rabbitmq, concurrency, etc.
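The re-send idea above can be sketched without a broker. This is only an in-memory illustration: the `PriorityWorkQueue` class, its `heapq`-based queue and its `_completed` set are hypothetical stand-ins for a RabbitMQ priority queue and the external "completed jobs" database, not RabbitMQ APIs.

```python
import heapq
import itertools


class PriorityWorkQueue:
    """In-memory stand-in for one queue with priorities (assumption, not RabbitMQ)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: FIFO within one priority
        self._completed = set()            # stand-in for the external job database

    def publish(self, job_id, priority):
        # heapq is a min-heap, so the priority is negated: higher numbers come out first.
        heapq.heappush(self._heap, (-priority, next(self._counter), job_id))

    def consume(self):
        """Return the next job id that has not been handled yet, or None if empty."""
        while self._heap:
            _, _, job_id = heapq.heappop(self._heap)
            if job_id in self._completed:
                continue  # stale low-priority copy of a job that was already handled
            # Simplification: a real worker would record completion only after
            # the job has actually finished.
            self._completed.add(job_id)
            return job_id
        return None


work_queue = PriorityWorkQueue()
work_queue.publish("job-a", priority=1)
work_queue.publish("job-b", priority=1)
work_queue.publish("job-b", priority=5)  # "change" the priority by re-sending

processed = []
while (job := work_queue.consume()) is not None:
    processed.append(job)

print(processed)  # ['job-b', 'job-a']
```

The old low-priority copy of job-b stays in the queue until it reaches the head, where it is skipped because the job is already recorded as handled.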
Related
I'm thinking of using RabbitMQ for a new project (with little own RabbitMQ experience) to solve the following problem:
Upon an event, a long-running computation has to be performed. The "work queue" pattern as described in https://www.rabbitmq.com/tutorials/tutorial-two-python.html seems to be perfect, but I want an additional twist: I want no two jobs with the same routing key (or some part of the payload or metadata, however that would be implemented) running on the workers at the same time. In other words: when one worker is processing job XY, and another job XY is queued, the message XY must not be delivered to a new idle worker until the running worker has completed the job.
What would be the best strategy to implement that? The only real solution I came up with was that when a worker gets a job, it has to check with all other workers if they are currently processing a similar job, and if so, reject the message (for requeueing).
Depending on your architecture there are two approaches to your problem.
The consumers share a cache of tasks under process and if a job of the same type shows up, they reject or requeue it.
This requires a shared cache to be maintained and a bit of logic on the consumers side.
The side effect is that duplicated jobs will keep returning to the consumers in case of rejection while in case of requeueing they will be processed with unpredictable delay (depending on how big the queue is).
You use the deduplication plugin on the queue.
You won't need any additional cache, only a few lines of code on the publisher side.
The downside of this approach is that duplicated messages will be dropped. If you want them to be delivered, you will need to instruct the publisher to retry when it receives a negative acknowledgment.
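A minimal sketch of the first approach (a shared cache of tasks in progress), assuming the cache is an in-process set guarded by a lock. In a real deployment the workers are separate processes, so the cache would have to live in a shared store; `InProgressRegistry` and `handle` are hypothetical names used only for illustration.

```python
import threading


class InProgressRegistry:
    """Tracks which job keys are currently being processed (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = set()

    def try_claim(self, key):
        """Atomically claim a key; return False if another worker already holds it."""
        with self._lock:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True

    def release(self, key):
        with self._lock:
            self._keys.discard(key)


def handle(registry, key, work):
    """Run `work` unless a job with the same key is in progress."""
    if not registry.try_claim(key):
        return "requeue"  # the consumer would reject/requeue the message here
    try:
        work()
        return "ack"
    finally:
        registry.release(key)


registry = InProgressRegistry()
first_claim = registry.try_claim("XY")   # worker 1 starts job XY
second_claim = registry.try_claim("XY")  # worker 2 gets another XY meanwhile
registry.release("XY")                   # worker 1 finishes
third_claim = registry.try_claim("XY")   # now XY can be picked up again
registry.release("XY")
outcome = handle(registry, "XY", lambda: None)
```

The check and the insert happen under one lock, so two workers cannot both claim the same key.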
I have a question about multi consumer concurrency.
I want to send work items that come from web requests to distributed RabbitMQ queues.
I just want to be sure about the order of work items across multiple queues (FIFO).
Because these requests come from different users, each user's requests/work items must be ordered.
I have found this feature under different names in Azure Service Bus and in ActiveMQ (message grouping).
Is there any way to do this in RabbitMQ?
I want to guarantee that each customer's requests are ordered relative to each other.
Each customer may have multiple requests but those requests for that customer must be processed in order.
I want to process incoming requests quickly by using multiple consumers on different nodes.
For example, customers 1 to 1000 send over 1 million requests.
If I put all of these requests in a single queue, it takes a long time to consume them, so I want to share the processing load between n (5) nodes. Customer X's requests must still be processed in the same sequence.
When working with event-based systems, and especially when using multiple producers and/or consumers, it is important to come to terms with the fact that there usually is no such thing as a guaranteed order of events. And to get a robust system, it is also wise to design the system so the message handlers are idempotent; they should tolerate receiving the same message twice (or more).
There are way too many things that may (and actually should be allowed to) interfere with the order:
The producers may deliver the messages at a slightly different pace
One producer might miss an ack (due to a lost packet) and will resend the message
One consumer may get and process a message, but the ack is lost on the way back, so the message is delivered twice (to another consumer).
Some other service that your handlers depend on might be down, so that you have to reject the message.
That being said, there is one pattern that service bus systems like NServiceBus use to enforce the order in which messages are consumed. There are some requirements:
You will need a centralized storage (like a sql-server or document store) that allows for conditional updates; for instance you want to be able to store the sequence number of the last processed message (or how far you have come in the process), but only if the already stored sequence/progress is the right/expected one. Storing the user-id and the progress even for millions of customers should be a very easy operation for most databases.
You make sure the queue is configured with a dead-letter-queue/exchange for retries, and then set your original queue as a dead-letter-queue for that one again.
You set a TTL (for instance 30 seconds) on the retry/dead-letter-queue. This way the messages that appear on the dead-letter-queue will automatically be pushed back to your original queue after some timeout.
When processing your messages you check your storage/database if you are in the right state to handle the message (i.e. the needed previous steps are already done).
If you are ok to handle it you do and update the storage (conditionally!).
If not - you nack the message, so that it is thrown on the dead-letter queue. Basically you are saying "nah - I can't handle this message, there are probably some other messages in the queue that should be handled first".
This way the happy-path is to process a great number of messages in the right order.
But if something happens and you get a message out of order, you will throw it on the retry-queue (the dead-letter-queue) and Rabbit will make sure it gets back in the queue to be retried at a later stage. But only after a delay.
The beauty of this is that you are able to handle most of the situations that may interfere with processing the message (out of order messages, dependent services being down, your handler being shut down in the middle of handling the message) in exactly the same way: by rejecting the message and letting your infrastructure (Rabbit) take care of it being retried after a while.
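The conditional update at the heart of this pattern can be sketched with an in-memory SQLite table. Everything named here is an assumption made for illustration: the `progress` table, the per-customer sequence numbers starting at 1, and the `try_handle` function. The retry queue with its TTL is simulated by putting the message back at the end of a `deque`.

```python
import sqlite3
from collections import deque


def make_store():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE progress (customer TEXT PRIMARY KEY, last_seq INTEGER NOT NULL)"
    )
    return db


def try_handle(db, customer, seq):
    """Return 'ack', 'ack-duplicate' or 'nack' for message number `seq`."""
    db.execute(
        "INSERT OR IGNORE INTO progress (customer, last_seq) VALUES (?, 0)",
        (customer,),
    )
    # Conditional update: only succeeds if the previous message was the last one handled.
    # Simplification: the slot is claimed here; a real handler would do its work
    # and then update the storage conditionally.
    cur = db.execute(
        "UPDATE progress SET last_seq = ? WHERE customer = ? AND last_seq = ?",
        (seq, customer, seq - 1),
    )
    db.commit()
    if cur.rowcount == 1:
        return "ack"
    (last_seq,) = db.execute(
        "SELECT last_seq FROM progress WHERE customer = ?", (customer,)
    ).fetchone()
    # Already handled: drop it (idempotence). Too early: send it to the retry queue.
    return "ack-duplicate" if seq <= last_seq else "nack"


db = make_store()
main_queue = deque([("alice", 2), ("alice", 1), ("alice", 3)])  # arrives out of order
handled = []
while main_queue:
    customer, seq = main_queue.popleft()
    result = try_handle(db, customer, seq)
    if result == "ack":
        handled.append(seq)
    elif result == "nack":
        main_queue.append((customer, seq))  # dead-letter + TTL would do this after a delay

print(handled)  # [1, 2, 3]
```

Message 2 is rejected once, comes back after message 1 has been handled, and is then accepted.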
(Assuming the OP is asking about things like ActiveMQ's "message grouping":)
This isn't currently built in to RabbitMQ AFAIK (it wasn't as of 2013 as per this answer) and I'm not aware of it now (though I haven't kept up lately).
However, RabbitMQ's model of exchanges and queues is very flexible - exchanges and queues can be easily created dynamically (this can be done in other messaging systems but, for example, if you read ActiveMQ documentation or Red Hat AMQ documentation you'll find all of the examples in the user guides are using pre-declared queues in configuration files loaded at system startup - except for RPC-like request/response communication).
Also it is very easy in RabbitMQ for a consumer (i.e., message consuming thread) to consume from multiple queues.
So you could build, on top of RabbitMQ, a system where you got your desired grouping semantics.
One way would be to create dynamic queues: The first time a customer order (or a new group of customer orders) was seen, a queue would be created with a unique name for all messages for that group - that queue name would be communicated (via another queue) to a consumer whose sole purpose was to load-balance among other consumers that were responsible for handling customer order groups. I.e., the load-balancer would pull off of its queue a message saying "new group with queue name XYZ" and it would find in a pool of order group consumers a consumer which could take this load and pass it a message saying "start listening to XYZ".
Another way to do it is with pub/sub and topic routing - each customer order group would get a unique topic - and proceed as above.
RabbitMQ Consistent Hash Exchange Type
We are using RabbitMQ and we have found a plugin. It uses a consistent hashing algorithm to distribute messages across queues based on their routing keys, so messages with the same key go to the same queue.
For more information about consistent hashing:
https://en.wikipedia.org/wiki/Consistent_hashing
https://www.youtube.com/watch?v=viaNG1zyx1g
You can find this plugin on the RabbitMQ web page
plugin : rabbitmq_consistent_hash_exchange
https://www.rabbitmq.com/plugins.html
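To illustrate why consistent hashing keeps one customer's messages together, here is a small hash ring in plain Python. It is a sketch of the general technique, not the plugin's actual implementation; `HashRing`, `points_per_queue` and the queue names are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, queues, points_per_queue=100):
        # Each queue gets several points on the ring to even out the distribution.
        self._ring = sorted(
            (_hash(f"{queue}#{i}"), queue)
            for queue in queues
            for i in range(points_per_queue)
        )
        self._hashes = [h for h, _ in self._ring]

    def queue_for(self, routing_key):
        # Walk clockwise to the first point at or after the key's hash, wrapping around.
        index = bisect.bisect(self._hashes, _hash(routing_key)) % len(self._ring)
        return self._ring[index][1]


queues = [f"orders-{n}" for n in range(5)]
ring = HashRing(queues)

# Every message for one customer lands on the same queue, so a single
# consumer on that queue sees that customer's messages in FIFO order.
targets = {ring.queue_for("customer-42") for _ in range(10)}
used = {ring.queue_for(f"customer-{i}") for i in range(1000)}
```

With one consumer per queue, ordering per customer is preserved while different customers are spread over the five queues.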
We're seeing an issue where consumers of our message queues are picking up messages from queues at the top of the alphabetical range. We have two applications: a producer, and a subscriber. We're using RabbitMQ 3.6.1.
Let's say that the message queues are set up like so:
Our first application, the producer, puts say 100 messages/second onto each queue:
Our second application, the subscriber, has five unique consumer methods that can deal with messages on each respective queue. Each method binds to its respective queue. A subscriber has a prefetch of 1, meaning it can only hold one message at a time, regardless of queue. We may run numerous instances of the subscriber like so:
So the situation is thus: each queue is receiving 100 msg/sec, and we have four instances of subscriber consuming these messages, so each queue has four consumers. Let's say that the consumer methods can deal with 25 msg/sec each.
What happens is that instead of all the queues being consumed equally, the alphabetically higher queues get priority. It seems as though when the subscriber becomes ready, RabbitMQ looks down the list of queues that this particular ready channel is bound to, and picks the first queue with pending messages.
In our situation, A_QUEUE will have every message consumed. B_QUEUE may have some consumed in certain race conditions, but C_QUEUE/D_QUEUE and especially E_QUEUE will rarely get touched.
If we turn off the publisher, the queues will eventually drain, top to bottom.
Is it possible to configure either RabbitMQ itself or possibly even the channel to use some sort of round robin distribution policy or maybe even random policy so that when a channel has numerous bound queues, all with messages pending, the distribution is even?
To clarify: you have a single subscriber application with multiple consumers in it, right?
I'm guessing you're using a single RabbitMQ Connection within the subscriber app.
Are you also re-using a single RabbitMQ Channel for all of your consumers? If so, that would be a problem. Be sure to use a new Channel for each consumer you start.
Maybe the picture is wrong, but if it's not then your setup is wrong. You don't need 4 queues if you are going to have subscribers that listen to each and every queue. You'd just need one queue, that has multiple instances of the same subscriber consuming from it.
Now to answer: yes (and there is no need to configure anything, as long as prefetch is 1), RabbitMQ does distribute messages evenly. You can find out about that here, and in the same place you can see how your setup should look. Here is a quote from the link.
RabbitMQ just dispatches a message when the message enters the queue.
It doesn't look at the number of unacknowledged messages for a
consumer. It just blindly dispatches every n-th message to the n-th
consumer.
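The dispatch behaviour described in the quote (every n-th message goes to the n-th consumer, regardless of how busy it is) can be illustrated in a few lines. This is a toy model, not broker code; `round_robin_dispatch` and the consumer names are made up for the example.

```python
from itertools import cycle


def round_robin_dispatch(messages, consumers):
    """Assign messages to consumers in strict rotation, ignoring their workload."""
    assignments = {consumer: [] for consumer in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        assignments[consumer].append(message)
    return assignments


assignments = round_robin_dispatch(range(6), ["c1", "c2", "c3"])
print(assignments)  # {'c1': [0, 3], 'c2': [1, 4], 'c3': [2, 5]}
```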
I'm new to RabbitMQ and I'm wondering how to implement the following: a producer creates tasks for multiple sites, and there's a bunch of consumers that should process these tasks one by one, but only talking to one site with a concurrency of 1, without starting a new task for this site before the previous one has ended. This way a slow site would be processed slowly, and the fast ones fast (as opposed to slow sites taking up all the worker capacity).
Ideally a site would be processed only by one worker at a time, being replaced by another worker if it dies. This seems like a task for exclusive queues, but apparently there's no easy way to list and subscribe to new queues. What is the proper way to achieve such results with RabbitMQ?
I think you may have things the wrong way round. For workers you have 1 or more producers sending to 1 exchange. The exchange has 1 queue (you can send directly to the queue, but all that is really doing is going via a default exchange; I prefer to be explicit). All consumers connect to the single queue and read off tasks in turn. You should set the queue to require messages to be ACKed before removing them. That way, if a process dies, its message should be returned to the queue and picked up by the next consumer/worker.
I have a somewhat unique use case with RabbitMQ and I'm not sure how to go about solving the problem. I want to have one queue with multiple consumers bound to it and then have RabbitMQ send out one message to only one consumer at a time and wait for an ACK before sending out another message to any other consumer.
I realize this kills throughput and can essentially starve the other consumers, but for me that's OK. The reason for this odd use case is that the service that the consumers talk to can only handle one concurrent request at a time, so I need a way to limit this, but consumers can also die unexpectedly and I need another consumer to pick up processing the messages if this happens. I know there is the prefetch option, but that still allows multiple consumers to get a message each, and there are exclusive queues, but I'm not sure those accomplish what I want. Is it possible to configure RabbitMQ to do this?
No; there is no way to limit competing consumers on the same queue such that there is one and only one message in process across all consumers until the ack is received.
A similar question came up some time ago; I don't remember if it was here or in the Spring forums, but I believe the solution was to have the consumers acquire a global lock of some kind, using something like Hazelcast, or even a simple database table row lock (with prefetch=1, so each consumer had only one "in process" message, which was processed as and when that consumer got the lock).
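The global-lock idea can be sketched with threads standing in for consumers. This is only an illustration under stated assumptions: `service_lock` is a `threading.Lock` playing the role of the distributed lock (Hazelcast or a database row lock), and the `queue.Queue` stands in for the RabbitMQ queue with prefetch=1.

```python
import queue
import threading

service_lock = threading.Lock()  # stand-in for the global (distributed) lock
state_lock = threading.Lock()    # protects the counters below
in_flight = 0
max_in_flight = 0
processed = []


def consumer(work_queue):
    """Take one message at a time and process it only while holding the global lock."""
    global in_flight, max_in_flight
    while True:
        try:
            message = work_queue.get_nowait()  # like prefetch=1: one message held
        except queue.Empty:
            return
        with service_lock:
            with state_lock:
                in_flight += 1
                max_in_flight = max(max_in_flight, in_flight)
            processed.append(message)  # the call to the one-at-a-time service goes here
            with state_lock:
                in_flight -= 1


work_queue = queue.Queue()
for i in range(20):
    work_queue.put(i)

threads = [threading.Thread(target=consumer, args=(work_queue,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(max_in_flight)  # 1
```

Each consumer may hold a message, but only the one holding the lock talks to the service, so at most one request is in process at any time.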