I have been learning AMQP using rabbitMQ and I came across this concept called fanout exchanges. From the illustration diagram, all I could see is that it's some kind of load balancer. Could anyone please explain what is it's actual purpose?
I assume that you mean that only one queue will get a message once it arrives to fanout exchange. So from that point of view:
No, I don't think its a load-balancer (I admit that terminology can be confusing).
In Rabbit MQ there are different types of exchanges, its true and fanout exchange is only one type of them. The basic model of Rabbit MQ assumes that you can connect as many queues as you want to the same exchange. Now, all the queues that are connected to the exchange will get the message (Rabbit MQ just replicates the message) - so exchange can't act as a load balancer.
The only difference between the exchange types is an algorithm of matching routing key. It's like a "to" field in a regular envelope. When a message arrives to exchange, it checks the routing key (a.k.a. binding) and depending on type of exchange "finds" to which queue the message should be routed.
When queue gets registered to exchange it always uses this binding. It like queue says to the binding "hey, all messages which are supposed to arrive to John Smith (its a routing key), please pass them to me". Then when the message arrives, it always has a "to" field in the envelope - so exchange checks whether the message is intended to be sent to John Smith, and if so - just routes it to the queue.
It's possible that there will be many queues interested to get a message from John Smith, in this case the message will be replicated. As for fanout exchange - it just doesn't pay any attention to the routing key and instead just sends the message to all the connected queues.
Now, there is another abstraction called consumer. Consumers can be connected to the single queue (again, many consumers can be connected to the queue).
The trick is that only one consumer can get the message for processing at a time.
So if you want a load balancer - you can use a single queue, connected to your exchange (it can be fanout of course), but then connect many consumers to that queue, and rabbit will send the message to the first consumer (it uses round robin internally to pick the first consumer) - if the consumer can't handle it, the message will be re-queued and rabbit will attempt to send it to another consumer.
Related
Using RabbitMQ 3.7.16, with spring-amqp 2.2.3.RELEASE.
Multiple clients publish messages to the DataExchange topic exchange in our RabbitMQ server, using a unique routing key. In the absence of any bindings, the exchange will route all the messaged to the data.queue.generic through the AE.
When a certain client (client ID 1 and 2 in the diagram) publishes lots of messages, in order to scale the consumption of their messages independently from other clients, we are starting consumers and assign them to only handle a their client ID. To achieve this, each client-consumer is defining a new queue, and it binds it to the topic exchange with the routing key events.<clientID>.
So scaling up is covered and works well.
Now when the messages rate for this client goes down, we would like to also scale down its consumers, up to the point of removing all of them. The intention is to then have all those messages being routed to the GenericExchange, where there's a pool of generic consumers taking care of them.
The problem is that if I delete data.queue.2 (in order to remove its binding which will lead to new messages being routed to the GenericExchange) all its pending messages will be lost.
Here's a simplified architecture view:
It would be an acceptable solution to let the messages expire with a TTL in the client queue, and then dead letter them to the generic exchange, but then I also need to stop the topic exchange from routing new messages to this "dying" queue.
So what options do I have to stop the topic exchange from routing messages to the client queue where now there's no consumer connected to it?
Or to explore another path - how to dead letter messages in a deleted/expired queue?
If the client queue is the only one with a matching binding as your explanation seems to suggest, you can just remove the binding between the exchange and the queue.
From then on, all new messages for the client will go through the alternate exchange, your "generic exchange", to be processed by your generic consumers.
As for the messages left over in the client queue, you could use a shovel to send them back to the topic exchange, for them to be routed to the generic exchange.
This based on the assumption the alternate exchange is internal. If it's not internal, you can target it directly with the shovel.
As discussed with Bogdan, another option to resolve this while ensuring no message loss is occuring is to perform multiple steps:
remove the binding between the specific queue and the exchange
have some logic to have the remaining messages be either consumed or rerouted to the generic queue
if the binding removal occurs prior to the consumer(s) disconnect, have the last consumer disconnect only once the queue is empty
if the binding removal occurs after the last consumer disconnect, then have a TTL on messages with alternate exchange as the generic exchange
depending on the options selected before, have some cleanup mecanism to remove the lingering empty queues
I have implemented the example from the RabbitMQ website:
RabbitMQ Example
I have expanded it to have an application with a button to send a message.
Now I started two consumer on two different computers.
When I send the message the first message is sent to computer1, then the second message is sent to computer2, the thrid to computer1 and so on.
Why is this, and how can I change the behavior to send each message to each consumer?
Why is this
As noted by Yazan, messages are consumed from a single queue in a round-robin manner. The behavior your are seeing is by design, making it easy to scale up the number of consumers for a given queue.
how can I change the behavior to send each message to each consumer?
To have each consumer receive the same message, you need to create a queue for each consumer and deliver the same message to each queue.
The easiest way to do this is to use a fanout exchange. This will send every message to every queue that is bound to the exchange, completely ignoring the routing key.
If you need more control over the routing, you can use a topic or direct exchange and manage the routing keys.
Whatever type of exchange you choose, though, you will need to have a queue per consumer and have each message routed to each queue.
you can't it's controlled by the server check Round-robin dispatching section
It decides which consumer turn is. i'm not sure if there is a set of algorithms you can pick from, but at the end server will control this (i think round robin algorithm is default)
unless you want to use routing keys and exchanges
I would see this more as a design question. Ideally, producers should create the exchanges and the consumers create the queues and each consumer can create its own queue and hook it up to an exchange. This makes sure every consumer gets its message with its private queue.
What youre doing is essentially 'worker queues' model which is used to distribute tasks among worker nodes. Since each task needs to be performed only once, the message is sent to only one node. If you want to send a message to all the nodes, you need a different model called 'pub-sub' where each message is broadcasted to all the subscribers. The following link shows a simple pub-sub tutorial
https://www.rabbitmq.com/tutorials/tutorial-three-python.html
Why do we need routing key to route messages from exchange to queue? Can't we simply use the queue name to route the message? Also, in case of publishing to multiple queues, we can use multiple queue names. Can anyone point out the scenario where we actually need routing key and queue name won't be suffice?
There are several types of exchanges. The fanout exchange ignores the routing key and sends messages to all queues. But pretty much all other exchange types use the routing key to determine which queue, if any, will receive a message.
The tutorials on the RabbitMQ website describes several usecases where different exchange types are useful and where the routing key is relevant.
For instance, tutorial 5 demonstrates how to use a topic exchange to route log messages to different queues depending on the log level of each message.
If you want to target multiple queues, you need to bind them to a fanout exchange and use that exchange in your publisher.
You can't specify multiple queue names in your publisher. In AMQP, you do not publish a message to queues, you publish a message to an exchange. It's the exchange responsability to determine the relevant queues. It's possible that a message is routed to no queue at all and just dropped.
Decoupling queue names from applications is useful for flexibility.
You could establish multiple queues to consume the same message, but queues can't have the same name.
In some cases, message's originator doesn't know the names of queues. (like when you have randomly generated queue names when horizontally scaling a server)
An exchange may be routing messages for more than just one type of consumer. Then you would need some wildcards in your routing keys to route messages to concerned consumers.
I've been trying to find a way to CC messages from a Qpid Exchange to another Queue for testing/monitoring purposes. I noticed that a RabbitMQ user out there was having a similar problem, and the solution seemed to be RabbitMQ's Firehose feature. Is there a similar solution in Qpid?
Here's some more details for the curious.
Let's call the Exchange "App.Ex", and through it are flowing messages for a single other intended recipient (let's call him "Bob")
I connect to App.Ex, initiate a session, start a receiver, and start fetching (using code adapted from the QPID Book's "A Simple Messaging Program in Python")
I start seeing the messages I want to see. HOWEVER, in doing so I've robbed Bob of the messages he needs!
So there's the rub, how can I get the messages CC'd to me, but in a way that Bob still receives his messages?
I have permission to modify the messaging configuration, so I can create my own Queues and Exchanges if need be. Thoughts appreciated!
A direct exchange is probably most appropriate because you can have some queues with CC like behavior and some without, and you can change it anytime on a live exchange.
You can have two queues bound to the same subject/routing key. When a message is sent to the exchange with that particular subject/routing key, both bound queues will receives copies of the same message.
Both queues bar1 and bar2 are bound to routing_key foo. When producer B posts messages to the exchange with routing_key = foo, both bar1 and bar1 receive copies of all messages.
Ask if you need commands for creating the exchange and appropriate bindings.
However there are more ways to do the same thing:
You can also achieve the similar behavior using a topic queue, with exact matches on topic name
Lastly, you can also use a fanout exchange, where any message you send to the queue, a copy is routed to all the queues bound to the exchange.
Note that all of these exchange types are from the AMQP spec, so they are not qpid specific, you could do the same thing or something very similar in any AMQP implementation.
As far as I can tell, there is no proper use case for a direct exchange, as anything you can do with it you can do with a fanout exchange, only more expandably.
More specifically, in reading RabbitMQ in Action, the authors numerously refer to the use case that goes something like - "Suppose when a user uploads a picture you need to generate a thumbnail. But then later marketing also tells you to award points for uploading a photo. With RabbitMQ you just have to create another queue and do no work on the producer side!"
But that's only true if you've had the foresight to create a fanout exchange on the producer side. To my understanding a direct exchange cannot accomplish this and is only appropriate when you actually want tight coupling between exchange and queue, (which you don't, because that's the point of messaging systems.)
Is this correct or is there an actual use case?
Compared to the fanout exchange, the direct exchange allows some filtering based on the message's routing key to determine which queue(s) receive(s) the message. With a fanout exchange, there is no such filtering and all messages go to all bound queues.
So if you have a direct exchange with several queues bound with the same routing key, and all messages have this key, then you have the same behavior as the fanout exchange. This is better explained in tutorial 4 on the RabbitMQ website.
In the image upload use case, you can use:
a fanout exchange with two queues (one for the thumbnail worker, one for the score computation worker). The routing key is ignored.
fanout-exchange
|--> queue --> thumbnail-worker
`--> queue --> score-worker
a direct exchange with again two queues. Queues are bound with the image-processing key for instance, and messages with this key will be queued to both queues.
direct-exchange
|--["image-processing"]--> queue --> thumbnail-worker
`--["image-processing"]--> queue --> score-worker
Of course, in this situation, if the message's routing key doesn't match the binding key, none of the queues will receive the message.
You can't put the two workers on the same queue, because messages will be load balanced between them: one worker will see half of the messages.
Do you mean a fanout exchange or a topic exchange? a fanout exchange is very different from a direct exchange. I presume that sending the photo to the exchange is sent with a routing key that specifies that there is a photo. In which case you have a consumer that generates the thumbnail and when you want to add a new consumer you can just add it and get the same message but do something different with it, ie award points.
The use case holds up. I think the point is that the exchange is originally created as a direct exchange.
This answer echoes the previousone and if you refer to this page, I believe you'll that one particular use case described is:
Direct exchanges are often used to distribute tasks between multiple
workers (instances of the same application) in a round robin manner.