How to track parallel progress and completion in RabbitMQ

The scenario is that I want my code to iterate over all line items on an order and insert them separately into queues for parallel processing. Products would pass through multiple queues before finally landing in a data store.
I know I can attach the order ID to the products so that they all come together in the end, but my question is:
Is there a way to tag products so that RabbitMQ understands they are part of a group and can report on the group's progress or completion?
I know I could use code to set up an array to track progress/completion, or use the data store. I'm just wondering if there is some facility in RabbitMQ I can use before reinventing the wheel.

No, there is no logical grouping of items in RabbitMQ beyond the queue they are in.
You need to implement that logic in your application.
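For example, one common application-side approach is to stamp every message with the order ID and seed a shared atomic counter with the item count; whichever worker decrements it to zero knows the group is complete. A minimal sketch in Python using pika and redis-py (the queue name, key names, and message fields are illustrative, not anything RabbitMQ provides):

    import json
    import pika
    import redis

    r = redis.Redis()
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    channel.queue_declare(queue="line_items")

    def publish_order(order_id, items):
        # Seed a counter with the number of items in the group.
        r.set(f"order:{order_id}:remaining", len(items))
        for item in items:
            body = json.dumps({"order_id": order_id, "item": item})
            channel.basic_publish(exchange="", routing_key="line_items", body=body)

    def on_item_done(order_id):
        # DECR is atomic, so concurrent workers can't double-complete a group.
        if r.decr(f"order:{order_id}:remaining") == 0:
            print(f"order {order_id} fully processed")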

Related

How do I clear up Documentum rendition queue?

We have around 300k items in dmi_queue_item.
If I right-click and select "destroy queue item", I see that the row no longer appears when I query by r_object_id.
Does that mean the file will no longer be processed by the CTS service? I need to know whether this is the right way to clear up the queue for the rendition process (converting to PDF), or what the best way to clear up the queue would be.
Also, for some items/rows I get this message when doing the right-click "destroy": what does it mean, and how can I avoid it? I'm not sure whether the item was already processed and the row no longer exists, or it's something else.
The dmi_queue_item table is used as a queue for all sorts of events at the Content Server.
Content Transformation Services uses it to read at least two types of events, as far as I know.
According to the Content Transformation Services Administration Guide (ver. 7.1, page 18), it reads dm_register_assets events and performs the configured content actions for those specific objects.
I was using CTS to generate content renditions for some objects via the dm_transcode_content event.
However, be careful when cleaning up dmi_queue_item, since it can hold many different event types. It is up to system administrators to keep this queue clean by configuring system components to use events or not, so that it doesn't fill up with events that are never consumed.
As for cleaning the queue, the advised route is the destroy API command, though you can also try deleting rows with a DELETE DQL query. Of course, try this in a dev environment first.
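For reference, destroying a single queue item through the API (IAPI) looks like this, with the ID taken from your dmi_queue_item query (the placeholder is hypothetical):

    destroy,c,<r_object_id of the dmi_queue_item row>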
You would need to look at 2 queues:
dm_autorender_win31 and dm_mediaserver. To delete their items, you would run this query:
delete dmi_queue_item objects where name = 'dm_mediaserver' or name = 'dm_autorender_win31'

Azure app function: best approach for this scenario

I'm developing a small game where the player owns droids used to perform automated actions. The simplest example is giving an order to send a droid to a specific position: the user gives it a position and the droid goes there. I'm already using Azure Functions a lot, and I'd like to use them to make the droids move.
Off the top of my head, I thought about making one function that triggers every minute, fetches all the droids that need to move, then makes them move.
The issue with this approach is that if the game becomes popular there could be hundreds of droids, and I have to ensure that the function's execution time stays below the minute.
I also thought about retrieving all the droids that need to move and then, for each of them, calling an Azure Function via its URL to execute the move for that particular droid. In my head this would parallelize the execution a bit, but I'm not sure that's correct.
I also have to think about whether to use SQL transactions, to be sure not to create deadlocks.
The final question is: how do I handle recurring processing of a potentially large amount of data and ensure that it stays below the minute?
Thanks for your advice.
Typically, you handle such scenarios with queues. Each order becomes a queue message, an Azure Function is triggered by it, and it processes that single order. It can and will scale based on the number of messages in the queue.
If your logic still requires timer-based processing, the timer function should be as lean as possible, e.g. it should just send messages to a queue whose consumer does the real work.
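A rough sketch of that shape using the Azure Functions Python programming model; the queue name droid-moves, the schedule, and find_droids_to_move are assumptions for illustration:

    import typing
    import azure.functions as func

    app = func.FunctionApp()

    def find_droids_to_move() -> typing.List[str]:
        # Hypothetical placeholder: query your store for droids with pending moves.
        return ["droid-1", "droid-2"]

    @app.timer_trigger(schedule="0 * * * * *", arg_name="timer")
    @app.queue_output(arg_name="orders", queue_name="droid-moves",
                      connection="AzureWebJobsStorage")
    def enqueue_moves(timer: func.TimerRequest,
                      orders: func.Out[typing.List[str]]) -> None:
        # Keep the timer lean: no real work here, just one queue message per droid.
        orders.set(find_droids_to_move())

    @app.queue_trigger(arg_name="msg", queue_name="droid-moves",
                       connection="AzureWebJobsStorage")
    def move_droid(msg: func.QueueMessage) -> None:
        # Process a single droid's move; the platform scales this out as
        # the queue depth grows, instead of one function racing the clock.
        droid_id = msg.get_body().decode("utf-8")
        print(f"moving {droid_id}")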

How to handle a single publisher clogging up my RabbitMQ's queue

In my current project, I am using MassTransit (2.10.1) with RabbitMQ.
In some scenarios, a producer is allowed to send a large batch of messages to the queue.
For example, a user sends a bulk notification to his list of contacts, and the list could be as large as 100,000 contacts in some cases. This sends one message per contact to the queue (I need to keep track of each message). Since, as I understand it, messages are processed in order of arrival, that user clogs up the queue for a long time while another user, who may have done something simple such as sending a test message to himself, waits for the processing to end.
I have considered separate queues for regular vs. bulk operations, but this still doesn't solve the problem for small bulks (a user with dozens of contacts waiting behind users with hundreds of thousands) and adds extra maintenance.
The ideal solution for me, I think, would involve manipulating the routing in such a way that the consumer handles x messages from one user, then x messages from the next user, and so on, then moves back to the beginning of the queue, until all messages are processed.
Is that possible? Is there a better solution?
Thanks in advance.
You will have to write code to manage this yourself. RabbitMQ doesn't really have any built-in mechanism to handle a scenario like this without your code getting involved.
If you want to process a few at a time from bulk, then back to normal, then back to bulk, you'll need two queues and code to manage which one is being pulled from, and when.
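A rough sketch of that two-queue rotation (the question's stack is MassTransit/.NET; this uses Python with pika purely to show the mechanics, and the queue names and batch size are made up):

    import pika

    BATCH = 10  # the "x": messages to take from one queue before switching

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    for q in ("notify.normal", "notify.bulk"):
        ch.queue_declare(queue=q)

    def handle(body: bytes) -> None:
        print(body)  # placeholder for the real notification send

    while True:
        for q in ("notify.normal", "notify.bulk"):
            for _ in range(BATCH):
                method, _props, body = ch.basic_get(queue=q)
                if method is None:
                    break  # this queue is empty, rotate to the other one
                handle(body)
                ch.basic_ack(method.delivery_tag)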
Just my opinion, since there is no built-in way that I know of: have you considered using whatever storage you already use to store the notifications? Publish one message with a list of notifications, store it in your DB, and then have a "retrieve notifications for user" consumer. The response would be one message; it may have a massive payload, but even if that bogs down, add skip and take properties to the message and force them to be between 0 and 50 (or whatever). In what scenario would you want to show a user 100,000 notifications at once?
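Sketched in Python, the clamp described above might look like this (fetch_notifications and the message fields are hypothetical):

    MAX_TAKE = 50

    def retrieve_notifications(db, message: dict) -> list:
        # Clamp paging so a single consumer call never materializes 100,000 rows.
        skip = max(0, int(message.get("skip", 0)))
        take = min(MAX_TAKE, max(1, int(message.get("take", MAX_TAKE))))
        return db.fetch_notifications(user_id=message["user_id"],
                                      offset=skip, limit=take)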

"Archiving" publish/subscribe message in Redis

I am using Redis' publish/subscribe feature. So the server publishes 10 items, and the client gets those 10 items.
Now, however, a new client subscribes to the feed. I would like them to get the previous 10 items as well as any new items.
Does Redis have a way of doing this with the publish/subscribe functionality? Is a feed history stored anywhere in the database? Is there an easy way to do this? Or is the best way to also store the messages in a list and have the client do an LRANGE my_list 0 9 on the list?
I'd keep a separate archive of the data and have events added to both. New clients can subscribe and buffer the real-time events, read the archive until it's up to date with the first published event, then catch up with the published events. That way you shouldn't miss any published events while switching between the archive and real-time events.
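A minimal sketch of that ordering with redis-py (the channel and key names are illustrative; the publisher is assumed to PUBLISH each event and also LPUSH it onto feed:archive):

    import redis

    def handle(item) -> None:
        print(item)  # placeholder for real processing

    r = redis.Redis()
    p = r.pubsub()
    p.subscribe("feed")  # subscribe first, so live events buffer while we read the archive

    # Replay the archive (an LPUSHed list is newest-first, so reverse it).
    for item in reversed(r.lrange("feed:archive", 0, 9)):
        handle(item)

    # Then drain the buffered and ongoing live events; a real client would
    # de-duplicate anything that appeared in both the archive and the channel.
    for message in p.listen():
        if message["type"] == "message":
            handle(message["data"])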
Stumbled on this during some research. I know it is old but I wanted to add that with the Redis Streams data structure it is not overly complex to implement persistent messaging.
The publisher publishes messages to a stream, and a subscriber can just get the latest message if that is all it cares about. You can also create consumer groups, so that each message is delivered to only one consumer in the group, and mark messages as acknowledged to avoid duplicate processing. This is good when you want a message to be handled only once and need a way to confirm that.
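A small redis-py sketch of both reads (the stream, group, and consumer names are illustrative):

    import redis

    r = redis.Redis()

    # Publisher side: XADD persists each message in the stream.
    r.xadd("feed", {"body": "item 1"})

    # A late subscriber can replay history...
    for msg_id, fields in r.xrange("feed", count=10):
        print(msg_id, fields)

    # ...or join a consumer group for acknowledged, process-once delivery.
    try:
        r.xgroup_create("feed", "workers", id="0")
    except redis.ResponseError:
        pass  # group already exists

    for stream, messages in r.xreadgroup("workers", "consumer-1", {"feed": ">"}, count=10):
        for msg_id, fields in messages:
            # ... handle fields, then acknowledge to prevent redelivery
            r.xack("feed", "workers", msg_id)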
I ended up creating a Node.js app for this sort of purpose. In my case, user data was published to the Redis server and I wanted to store it, so I subscribed to the Redis channel with a Node.js app and then saved the details to a database; I've played around with MySQL and Mongo so far. Let me know if this is of any interest and I'll paste some code; there are some similarities in trying to store a publish history.
Cheers

How to know when a set of RabbitMQ tasks are complete?

I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think of to do this is with a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
Have your parent process send out requests and keep track of how many it sent
Make the parent process also wait on a specific response queue (that the children know about)
Whenever a child finishes something (or can't finish for some reason), send a message to the response queue
Whenever numSent == numResponded, you're done
Something to keep in mind is a timeout -- What happens if a child process dies? You have to do slightly more work, but basically:
With every sent message, include some sort of ID, and add that ID and the current time to a hash table.
For every response, remove that ID from the hash table
Periodically walk the hash table and remove anything that has timed out
This is called the Request Reply Pattern.
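In RabbitMQ terms, a minimal sketch of that counting parent in Python with pika (the queue names are made up, and the inactivity_timeout stands in for the dead-worker timeout described above):

    import uuid
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    ch.queue_declare(queue="encode_jobs")
    # An exclusive, server-named queue acts as the private response queue.
    reply_queue = ch.queue_declare(queue="", exclusive=True).method.queue

    files = ["a.wmv", "b.wmv", "c.wmv"]
    for f in files:
        ch.basic_publish(
            exchange="",
            routing_key="encode_jobs",
            properties=pika.BasicProperties(reply_to=reply_queue,
                                            correlation_id=str(uuid.uuid4())),
            body=f,
        )

    responded = 0
    for method, _props, _body in ch.consume(reply_queue, inactivity_timeout=300):
        if method is None:
            break  # no reply for 5 minutes: assume a worker died and recover
        ch.basic_ack(method.delivery_tag)
        responded += 1
        if responded == len(files):
            break  # numSent == numResponded: the whole batch is done
    ch.cancel()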
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram, which may be helpful to some.
I have implemented a workflow where the workflow state machine is implemented as a series of queues. A worker receives a message on one queue, processes the work, and then publishes the same message onto another queue. Then another type of worker process picks up that message, etc.
In your case, it sounds like you need to implement one of the patterns from Enterprise Integration Patterns (the pattern catalog is available free online), specifically an Aggregator: a simple worker that collects messages until a set of work is done, and then publishes a single message to a queue representing the next step in the workflow.
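A bare-bones aggregator along those lines, in Python with pika (the queue names and in-memory counter are illustrative; production code would keep the counts somewhere durable):

    import json
    from collections import defaultdict
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    ch.queue_declare(queue="step.completed")
    ch.queue_declare(queue="step.next")

    progress = defaultdict(int)  # group_id -> completed count

    def on_message(channel, method, props, body):
        msg = json.loads(body)  # expects {"group_id": ..., "total": ...}
        group = msg["group_id"]
        progress[group] += 1
        if progress[group] == msg["total"]:
            # The whole set is done: emit one message for the next workflow step.
            channel.basic_publish(exchange="", routing_key="step.next",
                                  body=json.dumps({"group_id": group}))
            del progress[group]
        channel.basic_ack(method.delivery_tag)

    ch.basic_consume(queue="step.completed", on_message_callback=on_message)
    ch.start_consuming()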