Is there a way to implement a FIFO queue keyed by a group (like the SQS message group ID) using RabbitMQ?
I have a system with multiple consumers that processes messages from the same group sequentially. This is currently done with SQS FIFO and MessageGroupId, but now I need to move to a RabbitMQ solution and couldn't find out how.
I could use a direct exchange, which routes to queues by routing key, but I need this to be dynamic: the routing keys are derived from the message content.
Ex. I have 4 messages: A1, B1, A2, C1. Messages A1, B1, and C1 should be processed by the consumers concurrently, but A2 should be processed only after A1.
AMQP is flexible enough to let you create queues and bind them to an exchange with their own routing keys at runtime. So, what you are looking for is indeed a dynamic queue per group, with a binding for that group's routing key.
Please, learn more about this feature in the official RabbitMQ docs:
https://www.rabbitmq.com/tutorials/tutorial-four-java.html
https://www.rabbitmq.com/tutorials/tutorial-four-spring-amqp.html
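Here is a rough sketch of that idea in Python with pika (the exchange name, the queue naming scheme, and the group_id field are just placeholders I made up, not part of your system): the publisher declares a queue per group on the fly, binds it with the group id as routing key, and publishes to it.

```python
import json
import pika

EXCHANGE = "groups"   # placeholder name

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange=EXCHANGE, exchange_type="direct", durable=True)

def publish(message: dict) -> None:
    """Route a message to a per-group queue, creating the queue/binding on demand."""
    group_id = message["group_id"]          # e.g. "A", "B", "C" (derived from content)
    queue = f"group.{group_id}"
    # Declaring and binding are idempotent, so it is safe to do this on every publish.
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(queue=queue, exchange=EXCHANGE, routing_key=group_id)
    channel.basic_publish(exchange=EXCHANGE, routing_key=group_id,
                          body=json.dumps(message))

publish({"group_id": "A", "payload": "A1"})
publish({"group_id": "B", "payload": "B1"})
publish({"group_id": "A", "payload": "A2"})   # queued behind A1 in "group.A"
```

Ordering within a group then holds as long as each group queue is consumed by a single consumer (or a single active consumer) with prefetch_count=1, since RabbitMQ only guarantees ordering within one queue.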
I want to create a consumer that processes messages from a variable number of sources, which are connected and disconnected dynamically.
What I need is for each consumer to prioritize the first N messages of each source, and then to run multiple consumers to improve the speed.
I have been reading the docs for Work Queues, Routing and Topics, and a lot of other docs, without identifying how to implement this. I also made some tests without luck.
Can someone point me to how to do it, or to where to read about it?
--EDIT--
QueueA-----A3--A2--A1-┐
QueueB-----B3--B2--B1-┼------ Consumer
QueueC-----C3--C2--C1-┘
The desired effect is that each consumer gets the first messages of each queue. For example: A1, B1, C1, A2, B2, C2, A3, B3, C3, and so on. If a new queue is created (QueueD), the consumer would start receiving messages from it in the same fashion.
Thanks in advance
What I need is for each consumer to prioritize the first N messages of each source, and then to run multiple consumers to improve the speed.
All message queues that I know of only provide ordering guarantees within a single queue (Kafka, for example, guarantees ordering not for a whole topic but within each partition). Here, however, you are asking to serialize consumption across multiple queues, which is not possible to guarantee in a distributed-system context.
Why? Because if you have more than one consumer on these queues, messages from a queue will be delivered to its connected consumers in a round-robin fashion.
Assuming prefetch_count=1 and two connected consumers, say the first set of messages is delivered as follows:
A1, B1 & C1 delivered to consumer 1 (X)
A2, B2 & C2 delivered to consumer 2 (Y)
Now, in a distributed system, everything is async, and things could go wrong. For example:
If X acks A1, A3 will be delivered to X. But if Y acks A2 before X, A3 will be delivered to Y.
Who acks first is not within your control in a distributed system. Consider the following scenarios:
X might have to wait on an I/O- or CPU-bound task, while Y might get lucky and not have to wait at all. Then Y will advance through the messages in the queue.
Or Y gets killed (a partition) or its network gets slow; then X will continue consuming the queue.
I strongly advise you to re-think your requirements, and to consider what guarantees you can expect in an async context (you wouldn't be considering a MoM otherwise, would you?).
PS: it is possible to implement what you are asking for with some consumer-side logic (with a penalty on performance/throughput); a rough sketch follows below.
A single consumer has to connect to all the queues,
wait for a message from every queue before acking any of the messages.
Once a message from every queue has been received, group them into a single message and publish it to another queue (P).
Now many consumers can subscribe to P to process the ordered groups of messages.
I do not advise it, but hey, it is your system, who is going to stop you ;)
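Something like this sketch in Python with pika (the queue names and the grouped-message format are my own assumptions): it polls each source queue, holds the unacked messages until it has one from every queue, then publishes the group to P and acks the originals.

```python
import json
import time
import pika

SOURCE_QUEUES = ["QueueA", "QueueB", "QueueC"]   # assumed names
GROUP_QUEUE = "P"

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
for q in SOURCE_QUEUES + [GROUP_QUEUE]:
    channel.queue_declare(queue=q, durable=True)

while True:
    pending = {}   # queue name -> (delivery_tag, body)
    # Collect exactly one message from every source queue before acking anything.
    while len(pending) < len(SOURCE_QUEUES):
        for q in SOURCE_QUEUES:
            if q in pending:
                continue
            method, _props, body = channel.basic_get(queue=q, auto_ack=False)
            if method is not None:
                pending[q] = (method.delivery_tag, body)
        if len(pending) < len(SOURCE_QUEUES):
            time.sleep(0.1)   # some queue is still empty; poll again

    # Publish the ordered group as a single message, then ack the originals.
    group = {q: pending[q][1].decode() for q in SOURCE_QUEUES}
    channel.basic_publish(exchange="", routing_key=GROUP_QUEUE,
                          body=json.dumps(group))
    for tag, _body in pending.values():
        channel.basic_ack(delivery_tag=tag)
```

Many workers can then consume from P concurrently; note that the aggregator stalls whenever any one source queue stays empty, which is part of the throughput penalty mentioned above.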
I'm seeking some advice on how best to configure my RabbitMQ exchanges.
I'm trying to use a topic exchange in a round-robin fashion. Each consumer has its own (uniquely) named queue attached to a topic exchange. I would like the exchange to round-robin messages to each consumer queue for the "same" topic, let's say *.log for example.
I have tried multiple combinations but only seem to be able to deliver messages to all the consumer queues simultaneously, which effectively means I'm processing each message twice, once in each consumer.
For clarity, I also have a fanout exchange, which I use to "control" the consumers (start, stop, etc.). This should remain in place in any outcome.
Any guidance on how best to achieve the stated outcome would be great.
Each consumer has its own (uniquely) named queue attached to a topic exchange
The trick is to have every worker/consumer that you want to round-robin between set up a named queue and all use the same queue instead of creating their own.
So you could create a named queue called "log" for all of the "log" workers. You would create a different named queue for say "foo" for all of the "foo" workers. Requests will be delivered round-robin to all consumers looking at the same queue.
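A minimal sketch of that shared-queue setup in Python with pika (the exchange name "events" is my assumption; the "log" queue and *.log binding come from your example): every "log" worker runs this same code, so they all consume from the one "log" queue and RabbitMQ round-robins deliveries between them.

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# All "log" workers declare/bind the SAME named queue, then consume from it.
channel.exchange_declare(exchange="events", exchange_type="topic", durable=True)
channel.queue_declare(queue="log", durable=True)
channel.queue_bind(queue="log", exchange="events", routing_key="*.log")

channel.basic_qos(prefetch_count=1)   # fair dispatch between the workers

def handle(ch, method, properties, body):
    print("worker got:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="log", on_message_callback=handle)
channel.start_consuming()
```

Start several copies of this worker and messages matching *.log are load-balanced across them, while your separate fanout "control" exchange can keep its per-consumer queues.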
To use RabbitMQ in a round-robin fashion, it is better to use a direct exchange instead of a topic exchange.
A direct exchange is ideal for the unicast routing of messages (although they can be used for multicast routing as well).
A queue binds to the exchange with a routing key K.
When a new message with routing key R arrives at the direct exchange, the exchange routes it to the queue if K = R.
Direct exchanges are often used to distribute tasks between multiple workers in a round-robin manner. When doing so, it is important to understand that messages are load-balanced between consumers and not between queues.
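As a small sketch of that K = R rule (exchange and queue names are placeholders): a queue bound to a direct exchange with key "task" receives exactly the messages published with routing key "task", and however many consumers you attach to that one queue share them round-robin.

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="work", exchange_type="direct")
channel.queue_declare(queue="task")
channel.queue_bind(queue="task", exchange="work", routing_key="task")   # binding key K

# Routed because R ("task") equals the binding key K; a message published with
# routing_key="other" would be dropped, since no binding matches it.
channel.basic_publish(exchange="work", routing_key="task", body=b"job 1")
```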
I have trouble understanding the routing in RabbitMQ. Consider that I have several producers (let's call them clients) that produce messages to a queue. E.g., clients A, B, and C send messages to queue X1.
Let the consumer respond to all messages by sending responses back to the queue. E.g., the consumer gets a message from queue X1, does something, and sends the response back to queue X1.
How can client A determine which messages in queue X1 are addressed to it and which are addressed to clients B or C?
I can't declare one queue per connection because of the large number of connections expected (~10^6). So I'm in trouble here. Any suggestions? Thanks.
I think you need to look at the RPC tutorial. From your description it sounds like that is what you want to do. However that would probably require you to declare more queues than you want.
Approaching this a different way: I cannot understand why you would send a reply back to the producer not only through the same exchange but on the same queue that the consumers are consuming from.
Would it not make sense to have producers P1, P2 and P3 send to exchange X1 with routing keys "abc.aaa.xyz" / "abc.bbb.xyz" / "abc.ccc.xyz"? Then have queues Q1, Q2 and Q3 bound to X1 with binding keys "*.aaa.*" / "*.bbb.*" / "*.ccc.*", or just Q1 with binding key "abc.*.xyz" (I am unclear on exactly what you want, so I'm just making some suggestions), which are consumed by consumers C1, C2 and C3.
When the consumer has finished processing the message, it will send a message to X2 with a routing key that identifies which producer the reply is for. The producers will consume from queues bound to X2.
The point I am trying to make is that you do not want more than one consumer reading from a queue. There is only one case in which you want that, and that is a task queue. I am not clear on your use case, so you may want a task queue. If you do, then you should still not have your producers reading from the same task queue as your consumers. Aside from task queues, you should have one consumer read from each queue. You may have many queues bound to one exchange and even many bindings from one queue to one exchange.
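One way to sketch that reply path in Python with pika (the "producer_id" header, the "replies.A" queue name, and the assumption that requests land on X1 via the default exchange are mine, not yours):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Requests arrive on X1 (as in the question); replies go out through exchange X2.
channel.queue_declare(queue="X1")
channel.exchange_declare(exchange="X2", exchange_type="direct")

# Producer "A" binds its own reply queue with its id as the binding key,
# so it only ever sees replies routed with key "A".
channel.queue_declare(queue="replies.A")
channel.queue_bind(queue="replies.A", exchange="X2", routing_key="A")

def on_request(ch, method, properties, body):
    # The requester's id is assumed to travel in a "producer_id" header.
    producer_id = properties.headers["producer_id"]
    ch.basic_publish(exchange="X2", routing_key=producer_id,
                     body=b"reply for: " + body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="X1", on_message_callback=on_request)
channel.start_consuming()
```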
I hope this helps
I have to implement this scenario:
An external application publishes messages to RabbitMQ.
Each message has a client_id property. We can place this id in the routing key, in a message header, or in some other property.
I have to implement sharding in the exchange routing logic: the message should be delivered to a specific queue based on the client_id range.
Is it possible to implement this with the standard exchanges?
If not, what exchange should I take as the base?
How can I dynamically change the client_id ranges?
Take a look at the RabbitMQ sharding plugin. It's included in the RabbitMQ distribution from v3.6.0 onwards.
Just have your producer put enough info into the routing key that causes the message to go into the right queue on the other side of the Exchange.
So, for example, create two queues called 1 and 2 and bind them with routing keys matching the names. Then have your producer decide which routing key to use when producing the event message: customers with names starting with letters a-m go to 1, n-z go to 2, you get the idea. It pushes the sharding to the producer, but that might be OK for your application.
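A small sketch of that producer-side decision in Python with pika (queue names "1" and "2" come from the suggestion above; the exchange name and the publish helper are placeholders):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="customers", exchange_type="direct")
for shard in ("1", "2"):
    channel.queue_declare(queue=shard)
    channel.queue_bind(queue=shard, exchange="customers", routing_key=shard)

def publish(customer_name: str, payload: bytes) -> None:
    # Names starting a-m go to shard "1", n-z to shard "2": the producer decides.
    shard = "1" if customer_name[0].lower() <= "m" else "2"
    channel.basic_publish(exchange="customers", routing_key=shard, body=payload)

publish("alice", b"event 1")   # -> queue "1"
publish("nina", b"event 2")    # -> queue "2"
```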
AMQP doesn't have any explicit notion of sharding, but its architecture should help you do it.
Spreading messages across several queues is a standard RabbitMQ task (and part of the AMQP specification): with routing, you can attach heterogeneous consumers to handle specific messages routed via the same exchange. Therefore, the producer should publish with a specific key so the message is consumed by a specific queue/consumer...
You can decide to do static sharding: for example, have 10 queues with one consumer per queue, and use a hash function such that the key is CLIENT_ID % 10.
Other, non-static solutions could also be proposed, and you can build on top of this architecture.
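For the static CLIENT_ID % 10 idea, a minimal sketch in Python with pika (the exchange name and the "clients.N" queue naming are my assumptions):

```python
import pika

SHARDS = 10

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="clients", exchange_type="direct")

# One queue (and one dedicated consumer) per shard, bound by the shard number.
for i in range(SHARDS):
    channel.queue_declare(queue=f"clients.{i}")
    channel.queue_bind(queue=f"clients.{i}", exchange="clients", routing_key=str(i))

def publish(client_id: int, payload: bytes) -> None:
    shard = str(client_id % SHARDS)   # all messages of one client land in one queue
    channel.basic_publish(exchange="clients", routing_key=shard, body=payload)

publish(12345, b"hello")   # -> queue "clients.5"
```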
I have a topic exchange from which I'd like to distribute messages to two queues on two servers that are part of a cluster, in order to reduce memory pressure on any particular server. My consumers are periodically slow, and I sometimes run into the high memory watermark.
The way I tried to resolve this is by routing messages using an intermediate direct exchange, with two queues bound to the exchange:
a (topic) -> a1 (direct) -> q1/q2 (bound to routing key "a")
But the messages were routed to both queues, as AMQP intends. Does anyone have ideas? What I need is an exchange that routes each message to one and only one queue, even if the routing key matches many queues. I'd prefer not to change my routing keys, but that could be arranged.
I found Selective routing with RabbitMQ, which may mean I'll need to implement my own routing logic. Hopefully, this already exists somewhere else.
You could perhaps use the Shovel plugin - http://www.rabbitmq.com/shovel.html - to move messages from your intermediate exchange to the two queues.
If you set up two shovels, both consuming from a single queue on the direct intermediate exchange, they should be able to fight over the messages coming in (I'm assuming that you don't care too much if the two recipient queues don't get the incoming messages in a strict round robin fashion). The shovels then each publish to one of the two end queues, and can send through the ACKs from the end consumer.
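If dynamic shovels are an option (RabbitMQ 3.3+ with the shovel and management plugins enabled), something like this sketch could declare the two shovels through the management HTTP API; the URIs, the "intermediate" queue name, and the credentials are placeholders, while q1/q2 and the intermediate exchange a1 come from the question.

```python
import requests

API = "http://localhost:15672/api/parameters/shovel/%2F"   # default vhost "/"
AUTH = ("guest", "guest")                                   # placeholder credentials

# Two shovels competing for messages on the intermediate queue (the one bound to
# the direct exchange a1), each publishing to one of the two end queues.
for name, dest in (("shovel-q1", "q1"), ("shovel-q2", "q2")):
    definition = {
        "vhost": "/",
        "component": "shovel",
        "name": name,
        "value": {
            "src-uri": "amqp://localhost",
            "src-queue": "intermediate",
            "dest-uri": "amqp://localhost",
            "dest-queue": dest,
            "ack-mode": "on-confirm",
        },
    }
    response = requests.put(f"{API}/{name}", json=definition, auth=AUTH)
    response.raise_for_status()
```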