How to know the last job in a queue - rabbitmq

I have a group of jobs that need to be processed.
Some may take 10 minutes, some may take 1 hour.
Now I need to know which job executes last, because at the end of that group of jobs I need to fire another message.
The message queue in this case is RabbitMQ.
Is there a way I can accomplish this with only RabbitMQ?
What would be a good strategy for this task?

That's a strategy you can use with any messaging system.
I assume you have a group of workers listening to a single queue of jobs (the "jobs queue") to be processed. You can add a service, let's call it the Manager, which duplicates this queue and keeps all unfinished messages. When a worker finishes a job, it sends an acknowledgment message to the Manager. The Manager, for example, discards all finished jobs and stores only the running ones. (If you want to take possible failures into account, it can track those too.)
When the Manager has no more messages, it publishes a message to an "all messages in the group done" topic. Publishers can listen to that topic and fire new job messages into the "jobs queue".
Of course, in the simple case you can have a single producer, which can act as the Manager at the same time.
Example RabbitMQ implementation:
To implement this in RabbitMQ you can, for example, create one FanoutExchange (for the producer to send messages to) and two queues: jobsQueue (to deliver jobs to the workers) and jobTrackingQueue (to deliver the same jobs to the Manager for tracking). Then you create a second FanoutExchange (for the Manager to send "task done" messages to) and one unnamed queue per producer that wants to know when all messages are done.
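For illustration, here is a minimal topology sketch using pika (Python); the exchange and queue names ("jobs", "jobs.done", "jobsQueue", "jobTrackingQueue") are assumptions, not prescribed by RabbitMQ:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Fanout exchange the producer sends jobs to; every bound queue gets a copy.
channel.exchange_declare(exchange="jobs", exchange_type="fanout")
channel.queue_declare(queue="jobsQueue")         # consumed by the workers
channel.queue_declare(queue="jobTrackingQueue")  # consumed by the Manager
channel.queue_bind(queue="jobsQueue", exchange="jobs")
channel.queue_bind(queue="jobTrackingQueue", exchange="jobs")

# Second fanout exchange the Manager publishes the "all done" message to.
channel.exchange_declare(exchange="jobs.done", exchange_type="fanout")

# Each producer that wants the "done" signal binds its own unnamed
# (server-named, exclusive) queue to that exchange.
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(queue=result.method.queue, exchange="jobs.done")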

Related

RabbitMQ pause a queue

I am using a RabbitMQ Server (v3.8.9) with Java clients.
Use case is:
Our Backend creates messages for different clients. We send them out to their respective Endpoints.
1 Producer -> Outbound Queue -> 1 Consumer
The producer creates messages for n clients
Which the consumer should send out to the clients' endpoints
Messages must be kept in the correct order regarding each client
This works fine as long as all clients are up and running. The problem: if one client becomes unavailable, we need a bulletproof retry mechanism for it.
Say:
Wait 1 Minute and try again
All following messages must NOT be delivered before the first failed one, and must be kept in the correct order
If a retry succeeds, then ALL other messages should be sent to the client immediately
As you can see, it is not a solution to just "suspend" the consumer, because it should still deliver messages to the other (alive) clients. Due to application limitations and a dynamic number of clients, we cannot spawn one consumer per client queue.
My best approach right now is to dynamically create one queue per client, all routed to a single outbound queue. If one message to a client cannot be delivered by the consumer, I would like to "pause" that client's queue for x minutes. An API call like "queue_pause('client_q1', '5 Minutes')" would help. But even then I have to deal with the other messages already routed to that particular client and keep them in the correct order...
Any better ideas?
I think the key here is that a single consumer script can consume from multiple queues. So if I'm understanding correctly, you could model this as:
Each client has its own queue. These could be created by the consumer script when it starts up, or by a back-end process when a new client is created.
The consumer script subscribes to each queue separately
When a message is received, the consumer tries to send it immediately to the client; if it succeeds, it is manually acknowledged with basic.ack, and the consumer is ready to send the next message to that client.
When a message cannot be delivered to the client, it is requeued (basic.nack or basic.reject with requeue=1), retaining its position in the client's queue.
The consumer then needs to pause consuming from that particular queue. Depending on how it's written, that could be as simple as a sleep in that particular thread, but if that's not practical, you can effectively "pause" the subscription to the queue:
Cancel the subscription to that queue, leaving the other subscriptions intact
Store the queue name and the retry time in an appropriate variable
If the consumer script is implemented with an event/polling loop, check the list of "paused" subscriptions each time around that loop; if the retry time has been reached, re-subscribe.
Alternatively, if the library / framework supports it, register a delayed event that will fire at the appropriate time and re-subscribe the queue. The exact mechanics of this depend on the technologies you're using.
All the other subscriptions will continue, so messages to other clients will be delivered. The queue with no subscribers will retain the messages for the offline client in order until the consumer script starts consuming them again.
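As a concrete illustration, here is a sketch of that cancel/re-subscribe loop using pika's BlockingConnection; the retry delay, the queue names, and the send_to_client helper are hypothetical placeholders:

import functools
import time
import pika

RETRY_DELAY = 60    # seconds to pause a client's queue (assumption)
paused = {}         # queue name -> time at which to re-subscribe
consumer_tags = {}  # queue name -> consumer tag, needed for basic_cancel
client_queues = ["client_q1", "client_q2"]  # illustrative

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.basic_qos(prefetch_count=1)  # one unacked message per consumer

def send_to_client(queue, body):
    """Placeholder: deliver body to the client's endpoint, return True on success."""
    raise NotImplementedError

def handle(ch, method, properties, body, queue):
    if send_to_client(queue, body):
        ch.basic_ack(delivery_tag=method.delivery_tag)
    else:
        # Requeue so the message keeps its place, then pause this queue.
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
        ch.basic_cancel(consumer_tags.pop(queue))
        paused[queue] = time.time() + RETRY_DELAY

def subscribe(queue):
    callback = functools.partial(handle, queue=queue)
    consumer_tags[queue] = channel.basic_consume(
        queue=queue, on_message_callback=callback)

for q in client_queues:
    subscribe(q)

# Event loop: process I/O in short slices and re-subscribe expired pauses.
while True:
    connection.process_data_events(time_limit=1)
    now = time.time()
    for queue, retry_at in list(paused.items()):
        if now >= retry_at:
            del paused[queue]
            subscribe(queue)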

How to consume from queues with custom names (defined on the fly) with RabbitMQ

I have tasks with IDs, and each task has some number of jobs to do:
a job for each user;
for each task, the number of users and jobs differs.
I want to put all the jobs of a task into one queue named task{id}, so I can tell when the task is done (the task{id} queue is empty), have RabbitMQ delete the queue automatically, and control the number of consumers working on one task{id}.
I also want my consumers to run all the time like daemons and automatically choose the queues with jobs to work on.
The main question here is: how does a consumer get the name of a task so it can bind to it?
Or maybe there is another trick with RabbitMQ to do this without knowing the name of a queue?
You need to use the Event Exchange Plugin. This plugin consumes internal events and re-publishes them to a topic exchange, thus exposing the events to clients (applications).
You can bind to the queue.created event, which gives you the name of the queue in the message header; you can then use that name to bind your consumer to the specific queue.
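For example, with pika this could look roughly like the following. It assumes the rabbitmq_event_exchange plugin is enabled and that the queue name arrives in the "name" header (verify against your RabbitMQ version's documentation); the task{id} prefix check is illustrative:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Server-named queue bound only to queue-creation events.
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(queue=result.method.queue,
                   exchange="amq.rabbitmq.event",
                   routing_key="queue.created")

def handle_job(ch, method, properties, body):
    ...  # process the job, then:
    ch.basic_ack(delivery_tag=method.delivery_tag)

def on_queue_created(ch, method, properties, body):
    queue_name = properties.headers.get("name")
    if queue_name and queue_name.startswith("task"):
        # Start consuming jobs from the newly created task{id} queue.
        ch.basic_consume(queue=queue_name, on_message_callback=handle_job)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue=result.method.queue,
                      on_message_callback=on_queue_created)
channel.start_consuming()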

Distribute rabbitmq messages evenly

At the moment we have a number of publishers (micro-services) which publish their messages to an exchange. Each message has a serviceId attribute. The queue is connected to a single subscriber (micro-service) which processes the queue's messages; processing a single message is a costly operation (it takes about 20-30 secs).
Currently we have the following situation: service A publishes ~200 messages, and a few seconds later service B publishes 2 messages. So the subscriber will process those 2 messages only after the first 200 have been processed.
We want to process the messages in the order they came into the queue, but with respect to the source serviceId.
The obvious solution is to split the queue into separate queues (one per publisher) and subscribe to each queue separately, but the number of publishers can change, so we would need to discover them dynamically and subscribe to (and unsubscribe from) them.
Another approach is to replicate our subscriber app to have a one-to-one relationship between publishers and subscribers, but this would require more system resources.
What would be the best approach to handle this situation?
Thanks!
/!\ Be careful, publishers publish to an exchange, not to a queue.
We want to process the messages in the order they came to the queue,
but with respect to the source serviceId.
If I understand correctly, you want to load-balance your messages according to a serviceId, and the serviceIds are not known in advance.
The solution I would suggest here is to use a direct exchange, with routing keys such as xxxxx.<serviceId>. Then you can bind one queue per serviceId (that is: one queue for service A, one for service B, ...), with each consumer consuming from all the queues.
Then you have to handle publisher subscription: I would have each publisher publish a "hello" message, which is consumed by each consumer; each consumer in turn binds a new queue for that service (using xxxxx.<newServiceId>) and finally publishes a response back (so that the publisher can start sending messages).
Note: each service queue is shared by all consumers, which gives you the standard worker configuration (see this tutorial).
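A rough consumer-side sketch of this topology with pika; the exchange name "services" and the routing-key pattern are illustrative:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="services", exchange_type="direct")
channel.basic_qos(prefetch_count=1)  # fair dispatch: one unacked message each

def handle(ch, method, properties, body):
    ...  # the costly 20-30 s processing
    ch.basic_ack(delivery_tag=method.delivery_tag)

def bind_service_queue(service_id):
    queue = "service." + service_id
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(queue=queue, exchange="services",
                       routing_key="jobs." + service_id)
    channel.basic_consume(queue=queue, on_message_callback=handle)

# Known services at startup; on a "hello" from a new publisher, call
# bind_service_queue(new_service_id) and publish the response back.
for service_id in ("A", "B"):
    bind_service_queue(service_id)
channel.start_consuming()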
Hope this helps.

RabbitMQ workers with unique key

I'm thinking of using RabbitMQ for a new project (with little RabbitMQ experience of my own) to solve the following problem:
Upon an event, a long-running computation has to be performed. The "work queue" pattern as described in https://www.rabbitmq.com/tutorials/tutorial-two-python.html seems to be perfect, but I want an additional twist: no two jobs with the same routing key (or some part of the payload or metadata, however that is implemented) should run on the workers at the same time. In other words: when one worker is processing job XY, and another job XY is queued, the message XY must not be delivered to a new idle worker until the running worker has completed the job.
What would be the best strategy to implement that? The only real solution I came up with is that when a worker gets a job, it has to check with all other workers whether they are currently processing a similar job, and if so, reject the message (for requeueing).
Depending on your architecture, there are two approaches to your problem.
1. The consumers share a cache of tasks in process, and if a job of the same type shows up, they reject or requeue it.
This requires a shared cache to be maintained and a bit of logic on the consumer side.
The side effect is that, in case of rejection, duplicated jobs keep returning to the consumers, while in case of requeueing they are processed with an unpredictable delay (depending on how long the queue is).
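A sketch of this first approach; the answer doesn't prescribe a cache technology, so Redis and the "job-key" message header are assumptions for illustration:

import pika
import redis

cache = redis.Redis()  # shared between all workers
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.basic_qos(prefetch_count=1)

def handle(ch, method, properties, body):
    key = "running:" + properties.headers["job-key"]  # hypothetical header
    # SET NX claims the key only if no worker is processing this job type.
    if not cache.set(key, "1", nx=True, ex=3600):
        # The same job is already running somewhere: requeue for later.
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
        return
    try:
        ...  # the long-running computation
        ch.basic_ack(delivery_tag=method.delivery_tag)
    finally:
        cache.delete(key)

channel.basic_consume(queue="jobs", on_message_callback=handle)
channel.start_consuming()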
2. You use the deduplication plugin on the queue.
You won't need any additional cache, only a few lines of code on the publisher side.
The downside of this approach is that duplicated messages will be dropped. If you want them to be delivered eventually, you will need to instruct the publisher to retry in case of a negative acknowledgment.
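A publisher-side sketch for the second approach, assuming the rabbitmq-message-deduplication plugin; the x-message-deduplication queue argument and the x-deduplication-header message header follow that plugin's documentation, but verify them against the version you install:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# The queue is declared with deduplication enabled...
channel.queue_declare(queue="jobs",
                      arguments={"x-message-deduplication": True})

# ...and each message carries a header identifying duplicates. While a
# message with the same header value sits in the queue, later copies are dropped.
channel.basic_publish(
    exchange="",
    routing_key="jobs",
    body=b"job XY payload",
    properties=pika.BasicProperties(
        headers={"x-deduplication-header": "XY"}))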

RabbitMQ - subscribe to message type as it gets created

I'm new to RabbitMQ and I'm wondering how to implement the following: a producer creates tasks for multiple sites; there is a bunch of consumers that should process these tasks one by one, each talking to only 1 site with a concurrency of 1, without starting a new task for that site before the previous one has ended. This way slow sites would be processed slowly, and the fast ones fast (as opposed to slow sites taking up all the worker capacity).
Ideally a site would be processed by only one worker at a time, being replaced by another worker if it dies. This seems like a task for exclusive queues, but apparently there's no easy way to list and subscribe to new queues. What is the proper way to achieve such results with RabbitMQ?
I think you may have things the wrong way round. For workers, you have 1 or more producers sending to 1 exchange. The exchange has 1 queue (you can send directly to the queue, but all that really does is go via the default exchange; I prefer to be explicit). All consumers connect to the single queue and read off tasks in turn. You should set the queue to require messages to be acked before they are removed. That way, if a process dies, the message will be returned to the queue and picked up by the next consumer/worker.
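A minimal sketch of that layout with pika, using manual acks so an unacknowledged message returns to the queue if its worker dies; the exchange and queue names are illustrative:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="tasks", exchange_type="direct")
channel.queue_declare(queue="task_queue", durable=True)
channel.queue_bind(queue="task_queue", exchange="tasks", routing_key="task")
channel.basic_qos(prefetch_count=1)  # hand each worker one task at a time

def handle(ch, method, properties, body):
    ...  # process the task
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after success

channel.basic_consume(queue="task_queue", on_message_callback=handle)
channel.start_consuming()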