How to implement a single-consumer-multi-queue model for RabbitMQ

I have found that this image is very similar to my business model. I need to split messages across several queues.
For the heavy work, I can add more worker threads. For the lighter work, I can
let a single consumer subscribe to the messages. But how do I do that in RabbitMQ?
In the documentation, I only found the single-queue-multi-consumer model.

You can add multiple workers to a queue.
There can be multiple queues bound to an exchange.
In RabbitMQ, the producer always sends the message to an exchange. So, in your case, I think one exchange is enough. If you want to load balance on the consumer side, you have the two options above.
You can also read my article:
https://techietweak.wordpress.com/2015/08/14/rabbitmq-a-cloud-based-message-oriented-middleware/

RabbitMQ has a very flexible model, which enables a wide variety of routing scenarios to take place.
"I need to split message to some queue. for some heavy work. I can add more worker thread for them."
Yes, this is supported via a direct exchange. Publish a message using a routing key that is the same as the name of the queue. For convenience, let's say you use the fully-qualified object name (e.g. MyApp.Objects.DataTypeOne). All you need to do is subscribe multiple consuming processes to this queue, and RabbitMQ will load-balance using a round-robin approach.
"But for some no much heavy work. I can let single consumer to subscribe their message."
Yes, you can do this also. Same process as in the paragraph above. Just don't attach multiple consuming processes.
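As a rough sketch of the two cases above, assuming the Python client pika (1.x): the exchange name `work`, the queue names `heavy_tasks` and `light_tasks`, and the helper functions are all made up for illustration. Both functions take an already-open channel, so nothing here connects to a broker by itself.

```python
def declare_topology(channel, exchange="work",
                     queues=("heavy_tasks", "light_tasks")):
    """Declare one direct exchange and bind each queue under its own name."""
    channel.exchange_declare(exchange=exchange, exchange_type="direct")
    for name in queues:
        channel.queue_declare(queue=name, durable=True)
        # Routing key == queue name, as suggested in the answer above.
        channel.queue_bind(queue=name, exchange=exchange, routing_key=name)


def start_worker(channel, queue, handler):
    """Register one consumer on `queue`.

    Run this in several processes (or threads, each with its own channel)
    to add workers; run it once for a queue that needs a single consumer.
    """
    def on_message(ch, method, properties, body):
        handler(body)
        # Acknowledge only after the handler has finished.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    return channel.basic_consume(queue=queue, on_message_callback=on_message)
```

In a real program the channel would come from something like `pika.BlockingConnection(...).channel()`, and `channel.start_consuming()` would start delivery. Several processes calling `start_worker(channel, "heavy_tasks", ...)` share the heavy queue, while a single process consumes `light_tasks`.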
"I have found this image is very similar to my business model."
The diagram isn't very useful, because it lacks information about the type of messages being published. In that sense, it is only an interconnect diagram. The interesting lines are the ones connecting the queues to the exchange, as that is what you specify within RabbitMQ via Queue Bindings. You can also bind exchanges to one another, but that's a bit further than we probably need to go.
Everything else on the diagram is fully under your control as the user of the RabbitMQ/AMQP system. You can create an arbitrary number of publishers and have an arbitrary number of consuming processes each consuming from an arbitrary number of queues. There are no hard and fast limits, though there are some practical aspects you probably will want to think about to ensure your system is maintainable.

Related

RabbitMQ with pika: different callback for different queue on same channel?

I have a hard time understanding the basic concepts of RabbitMQ. I find the online documentation not perfectly clear.
So far, I understand what a channel, a queue, a binding, etc. are.
But how would the following use case be implemented:
Use Case: Sender posts to one exchange with different topics. On the receiver side, depending on the topic, different receivers should be notified.
So the following should somehow be feasible with a topic exchange:
create a channel
within this channel, create a topic exchange
for each topic to be subscribed to, create a queue and a queue binding with this topic as property
My difficulty is that the callback would be related to the channel, not to the queue or the queue binding. I am not 100 % sure if I am right here.
So that's my question: in order to have multiple callbacks, IOW: different message handlers, depending on the subscribed topic - do you have to create multiple channels, one for each "different message handling"? All these channels should grab the same exchange and define their own queue + queue binding for that specific topic?
Please confirm if this is correct or if I am straying from the canonical path of AMQP ... "queue" sounds so light-weight, so I intuitively thought of a queue or a queue binding as the right point to attach a consuming event handler to, but it seems that, instead, channel is my friend in this. Right?
Another aspect of my question:
If I really have to use multiple channels for this, do I have to declare the same exchange (exchange name and exchange type of "topic") for each channel? I hoped there was something like:
define the exchange with this name and the type of "topic" once
for each channel, "grab" this predefined exchange and use it by adding queues and queue bindings to this exchange
I find it helpful to think about the roles of the broker (RabbitMQ) and the clients (your applications) separately.
The broker, RabbitMQ, will receive messages from your publishers, route them to queues, and eventually send them to consumers. The message routing can be simple or complex. In your case, the routing is topic based with a few different queues.
You haven't said much about the publishers, likely because their job is simple. They send messages with a routing key to RabbitMQ.
The consumer side is where things can get interesting. At the simplest level, a consumer subscribes to a queue, receives messages from RabbitMQ, and processes them. The consumer opens a connection to RabbitMQ and will use a channel for a particular use (e.g., subscribing to a queue). The power of message brokers is that they allow designers to break up processes into separate apps if desired.
You don't give much insight into your application, other than the presence of different message topics. An important design choice for you to make is how to define the application(s). Are the different topics suitable for separate applications, or will a single application handle all types of messages?
For the former case, you would have one application for each queue. A single channel that subscribes to the queue is probably the most sensible decision unless your application needs to be threaded. For threaded applications, each thread would have its own channel and all threads can be subscribed to the same queue. Each application would have its own callback function for processing that type of message.
For the latter case (single application with multiple queues), the best approach would be to have at least one channel per queue. It sounds like each queue would require its own callback function, and you would assign the functions to the channels according to their subscriptions. You might have multiple channels per queue if your application can process multiple messages (of each topic) simultaneously.
Regarding your question about declaring exchanges, queues, and bindings, these items only need to be created once. But it is reasonable practice to have your clients declare them at connection time. Advantages of declaring them are that they will be created again if they were deleted and that any discrepancies between your declaration and what is on the broker will trigger errors.
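To make the single-application case concrete, here is a minimal sketch assuming pika 1.x. In pika the callback is passed to each `basic_consume` call, so it is attached per consumer (per queue), and one channel can carry several consumers with different callbacks; separate channels mainly matter once threads are involved. The helper name, the queue naming scheme, and the binding keys are hypothetical.

```python
def subscribe_topics(channel, exchange, handlers):
    """Bind one queue per binding key and attach a separate callback to each.

    `handlers` maps a topic binding key (e.g. "user.*") to a function with
    pika's callback signature: (channel, method, properties, body).
    """
    # Re-declaring is harmless as long as the arguments match what exists.
    channel.exchange_declare(exchange=exchange, exchange_type="topic")
    queue_names = {}
    for binding_key, callback in handlers.items():
        # Hypothetical naming scheme: one named queue per binding key.
        queue = f"{exchange}.{binding_key}"
        channel.queue_declare(queue=queue)
        channel.queue_bind(queue=queue, exchange=exchange,
                           routing_key=binding_key)
        channel.basic_consume(queue=queue, on_message_callback=callback,
                              auto_ack=True)
        queue_names[binding_key] = queue
    return queue_names
```

The exchange is declared once per call here; calling the function again from another channel or process with the same exchange name and type would simply confirm that the exchange exists.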

Is it possible to buffer messages in exchange until at least one queue is available?

I'm looking for a way to buffer messages received by an exchange until at least one queue is bound to that exchange.
Is it supported by RabbitMQ?
Maybe there are some workarounds (I didn't find any).
EDIT
My use case:
1. I've got one data producer (which reads real-time data from an external system)
2. I've got one fanout exchange which receives data from the producer
3. On system startup, there might be no consumer, but after a few moments, there should be at least one which creates its own queue and binds it to the exchange from 2.
The problem is this short time between step 2. and 3. where there are no queues bound to the exchange created in step 1.
Of course, it's an edge case and after system initialization queues and exchanges are bound and everything works as expected.
Why do queues and bindings have to be created by consumers (not by the producer)? Because I need a flexible setup where I can add consumers without any changes in the code of other components (e.g. the producer).
EDIT 2
I'm processing the output from another system which stores both real-time and historical data. There are the cases where I want to read historical data first (on initialization) and then continue to handle real-time data.
I may have misled you by saying that there are multiple consumers. In the case where I need a buffer on the exchange, there is only one consumer (which writes everything to a time series DB as it appears in the queue).
The RabbitMQ team monitors this mailing list and only sometimes answers questions on StackOverflow.
"Why queues and bindings has to be created by consumers (not by the producer)?"
Queues and bindings can be created by producers or consumers or both. The requirement is that the exact same arguments are used when creating them if a client application tries to "re-create" a queue or binding. If different arguments are used, a channel-level error will happen.
As you have found, if a producer publishes to an exchange that can't route messages, they will be lost. Olivier's suggestion to use an alternate exchange is a good one, but I recommend you have your producers create queues and bindings as well.
If you mean to avoid throwing away messages because there is no destination configured for it, yes.
You should look at alternate exchange.
This assumes that before (or when) you start publishing, the alternate exchange has been created (I would typically go for a fanout) and a queue is bound to it (let's call it notroutedq).
That way the messages are not lost; they will be stored in notroutedq.
From there you can set up a mechanism that reprocesses the messages in that queue, most likely by reinjecting them into the main exchange, once a given time has passed or when a binding has been added to your main exchange.
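A minimal sketch of that setup, assuming pika 1.x and an already-open channel. The exchange names `events` and `events.unrouted` are made up, as is the choice of durable entities; `notroutedq` is the queue name used above. The alternate exchange is attached through the `alternate-exchange` argument when the main exchange is declared.

```python
def declare_with_alternate(channel, main="events", alternate="events.unrouted",
                           holding_queue="notroutedq"):
    """Declare a fanout exchange whose unroutable messages are kept in a queue."""
    # The alternate exchange and its holding queue must exist before
    # the producer starts publishing.
    channel.exchange_declare(exchange=alternate, exchange_type="fanout",
                             durable=True)
    channel.queue_declare(queue=holding_queue, durable=True)
    channel.queue_bind(queue=holding_queue, exchange=alternate)
    # Messages that the main exchange cannot route to any queue are
    # handed to the alternate exchange instead of being dropped.
    channel.exchange_declare(
        exchange=main,
        exchange_type="fanout",
        durable=True,
        arguments={"alternate-exchange": alternate},
    )
```

One caveat: if the main exchange already exists without this argument, declaring it again with the argument counts as a mismatch and raises a channel-level error, so the exchange would have to be deleted and recreated (or the alternate exchange configured through a policy instead).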
-- EDIT --
Thanks for the updated info.
Could you indicate how long typically you'd expect the past messages to be useful to the consumers?
In your description, you mention real-time data and possibly multiple consumers coming and going. Based on that, I'm not sure how much of the data kept in notroutedq would be of value, or how frequently you'd expect to resend it to the consumers.
The cases I had with alternate exchanges were mostly focused on identifying missing bindings, so that one could easily correct the bindings and reprocess the messages without loss.
If the number of consumers varies through time and the data content is real-time, I'd wonder a bit about the benefit of keeping the data.

In RabbitMQ, which is more expensive: multiple queues per exchange, or multiple exchanges with fewer queues each?

We decided to go with RabbitMQ as a message/event bus in our migration to a microservices architecture, but we couldn't find a definitive answer on the best way to lay out our queues. We have two options:
1. One main exchange, which will be a fanout exchange, which in turn fans messages out to a main queue (for logging and other purposes) and to a sub-exchange, which will be a topic exchange and route the messages to each desired queue using the message routing key. We expect the number of queues behind the sub-exchange to be fairly large. This can be explained by this graph:
2. One main exchange, which will be a topic exchange, with one main queue still bound to that exchange using the "#" routing key. That main exchange will also handle the main routing to other sub-exchanges, so routing keys might be "agreements.#", "assignments.#", "messages.#", which are then used to bind multiple topic sub-exchanges. Each sub-exchange handles sub-routing, so one sub-exchange might handle all "assignments", and queues bound to it could use routing keys like "assignments.accepted", "assignments.deleted", and so on. In this scenario, we feel the number of queues per exchange will be smaller, because the queues will be distributed among the exchanges.
So, which of these scenarios is the better approach, i.e. faster on RabbitMQ and with less overhead?
Keep in mind that all queues, exchanges, and bindings will be declared on the fly by the service that is either publishing or subscribing.
You can find some explanation in this topic: RabbitMQ Topic exchanges: 1 Exchange vs Many Exchanges
I am using RabbitMQ in a way very similar to your case 2, as I found the same benefits as described in this article: https://skillachie.com/2014/06/27/rabbitmq-exchange-to-exchange-bindings-ampq/
Exchange-to-exchange bindings are much more flexible in terms of the topology that you can design, promotes decoupling & reduce binding churn
Exchange-to-exchange bindings are said to be very light weight and as a result help to increase performance *
Based on my own experience with exchange-to-exchange bindings, case 2 works well and lets you create or change message flow topologies very quickly.
I'm going to first re-summarize what I think is your question, since I'm sure it's buried somewhere in your post.
It is desirable to have a tracer/logging queue, in addition to a series of work-specific queues for actual message processing. What exchange topology is best for this scenario?
First off, neither option makes much sense given your application. Option 1 will create an exchange that will publish a message to every queue bound to it, regardless. This is clearly not what you want. Option 2 will give you a rather complex routing topology for which the benefit is unclear, and the drawback is painful maintenance and a steep learning curve. (Just because you can do something does not mean you should do it.)
What should be done?
It is important to remember that in RabbitMQ, it is the queues which consume the resources of the broker. Exchanges merely connect queues with publishers. The exchange is a means to an end, while the queue is the end itself.
What instead I think you should do is set up a single topic exchange. Bind your tracing queue to routing key # so that you receive all messages. Then, bind your work queues appropriately so that they receive only the messages that need to flow into them. For example, it is common to route messages by message type, where each queue holds exactly one type of message. This is both simple and effective.
The advantage of a single topic exchange is that you get the benefits of both a Direct Exchange and a Fanout Exchange depending on the binding key used. Further, configuration changes are easy to achieve and can often be done without disrupting any system processing at all (let's say that you want to stop tracing certain messages - this can be done with ease using a topic exchange, assuming your routing keys are rational).
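To illustrate how the bindings on such a single topic exchange behave, here is a small broker-free simulation of topic matching in plain Python (`*` stands for exactly one dot-separated word, `#` for zero or more). The queue names and binding keys are invented for the example, and the real matching is of course done inside RabbitMQ.

```python
def topic_matches(binding_key, routing_key):
    """Return True if an AMQP topic binding key matches a routing key."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may absorb any number of words, including none.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        return (head == "*" or head == words[0]) and match(rest, words[1:])

    return match(binding_key.split("."), routing_key.split("."))


def route(bindings, routing_key):
    """List the queues whose binding key matches, given {queue: binding_key}."""
    return sorted(q for q, key in bindings.items()
                  if topic_matches(key, routing_key))


bindings = {
    "trace": "#",                    # tracing queue sees every message
    "assignments": "assignments.*",  # one work queue per message family
    "messages": "messages.*",
}
print(route(bindings, "assignments.accepted"))  # ['assignments', 'trace']
```

Dropping a message family from tracing would then only require replacing the `#` binding on the tracing queue with narrower ones; the publishers and work queues stay untouched.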
Exchange-to-exchange bindings are semantically identical to exchange-to-queue bindings.
https://www.rabbitmq.com/e2e.html
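For reference, an exchange-to-exchange binding is declared much like a queue binding. A minimal sketch of the layout from option 2, assuming pika 1.x and an already-open channel (the exchange name `main` and the helper itself are hypothetical; the area names come from the question's routing keys):

```python
def bind_sub_exchanges(channel, main="main",
                       areas=("agreements", "assignments", "messages")):
    """Bind one topic sub-exchange per area to a main topic exchange."""
    channel.exchange_declare(exchange=main, exchange_type="topic")
    for area in areas:
        channel.exchange_declare(exchange=area, exchange_type="topic")
        # Messages published to `main` with keys such as
        # "assignments.accepted" flow on to the "assignments" exchange.
        channel.exchange_bind(destination=area, source=main,
                              routing_key=f"{area}.#")
```

Queues would then be bound to the sub-exchanges with `queue_bind` as usual.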

How to use routing_key and queues

I'm setting up a consumer that will listen for messages from two different sources. I want to have a different callback for messages from these two sources (other solutions are welcome though).
I'm very new to RabbitMQ and pika and I haven't grasped the nitty-gritty details yet. But what I want to know is:
Should I use different queues and set up two
channel.basic_consume(callback_1, ...)
channel.basic_consume(callback_2, ...)
for my callbacks or should i do some tricks with routing keys instead?
That depends a little on your needs. It's really about processing. I am most familiar with Java, so I will tell you how I handle things, and you can make a decision based on that.
If I need different threads to process different data or do different things with the data, I create two different queues, and each thread consumes a different queue. I use topic exchanges to make sure the queues get the correct messages. If the data is only slightly different, then I can use the routing key to handle the data differently within the same thread. The decision is purely based on the parallelism I require, i.e. how many queues I want processing the data.
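For the second variant (one queue, one thread, different handling per routing key), a possible sketch in Python for pika: a single callback looks at `method.routing_key` and hands the body to the matching handler. The helper name and the routing keys in the usage note are hypothetical.

```python
def make_dispatcher(handlers, default=None):
    """Build one pika-style callback that picks a handler by routing key.

    `handlers` maps a routing key to a function taking the message body.
    """
    def on_message(ch, method, properties, body):
        handler = handlers.get(method.routing_key, default)
        if handler is not None:
            handler(body)

    return on_message
```

With pika 1.x this could be registered once, e.g. `channel.basic_consume(queue="inbox", on_message_callback=make_dispatcher({"source.a": handle_a, "source.b": handle_b}), auto_ack=True)`, where the queue and handler names are again only placeholders.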

RabbitMQ fan out on a topic exchange

We're pretty new to RabbitMQ and still in the investigation stage to see if it's a good fit for our use cases.
We've readily come to the conclusion that our desired topology would have us deploying a few topic-based exchanges, and then filtering from there to specific queues. For example, let's say we have a user and an upload exchange, where the user exchange might receive messages where the topic is "new-registration" or "friend-request" and the upload exchange might receive messages like "video-upload" or "picture-upload".
Creating the queues, getting messages routed to the appropriate queue, and then building listeners to handle the messages for the various queues has been quite straightforward.
What's unclear to me however is if it's possible to do a fanout on a topic exchange?
I.e. I have named queues that are bound to my topic exchange, but I'd like to be able to just throw tons of instances of my listeners at those queues to prevent single points of failure. But to the best of my knowledge, RabbitMQ treats these listeners in a straightforward round-robin fashion, e.g. every Nth message always goes to the same Nth listener rather than being dispatched to the first available consumer. This is generally acceptable to us, but given the load we anticipate, we'd like to avoid the possibility of hot spots developing amongst our consumer farm.
So, is there some way, either in the queue or exchange configuration or in the consumer code, where we can point our listeners to a topic queue but have the listeners treated in a fanout fashion?
Yes, by having the listeners bind using different queue names, they will be treated in a fanout fashion.
Fanout is 1:N though, i.e. each task can be delivered to multiple listeners like pub-sub. Note that this isn't restricted to a fanout exchange, but also applies if you bind multiple queues to a direct or topic exchange with the same binding key. (Installing the management plugin and looking at the exchanges there may be useful to visualize the bindings in effect.)
Your current setup is a task queue. Each task/message is delivered to exactly one worker/listener. Throw more listeners at the same queue name, and they will process the tasks round-robin as you say. With "fanout" (separate queues for a topic) you will process a task multiple times.
Depending on your platform there may be existing work queue solutions that meet your requirements, such as Resque or DelayedJob for Ruby, Celery for Python or perhaps Octobot or Akka for the JVM.
I don't know for a fact, but I strongly suspect that RabbitMQ will skip consumers with unacknowledged messages, so it should never bottleneck on a single stuck consumer. The comments on their FAQ seem to suggest that RabbitMQ will make an effort to keep things chugging along even in the presence of troublesome consumers.
This is a late answer, but in case others come across this question...
It sounds like what you want is fair dispatch rather than a fan out model (which would publish a given message to every queue).
Fair dispatch will give a message to the next available worker rather than using a simple round-robin approach. This should avoid the "hotspots" you are concerned about, without delivering the same message to multiple consumers.
If this is what you are looking for, then see the "Fair dispatch" section of the Work Queues tutorial in the RabbitMQ docs. A prefetch count of 1 is the key here.
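To see why this helps, here is a toy simulation in plain Python (no broker involved) that compares strict round-robin with handing each message to whichever worker frees up first. It assumes all messages are already queued at time zero and uses made-up processing times. With pika, the fair variant roughly corresponds to calling `channel.basic_qos(prefetch_count=1)` before `basic_consume` and acknowledging each message manually.

```python
import heapq


def round_robin_makespan(durations, workers):
    """Message i goes to worker i % workers, however busy that worker is."""
    busy_until = [0] * workers
    for i, d in enumerate(durations):
        busy_until[i % workers] += d
    return max(busy_until)


def fair_dispatch_makespan(durations, workers):
    """Each message goes to the worker that becomes free first."""
    free_at = [0] * workers
    heapq.heapify(free_at)
    for d in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + d)
    return max(free_at)


durations = [9, 1, 9, 1, 9, 1]  # alternating slow and fast messages
print(round_robin_makespan(durations, 2))    # 27
print(fair_dispatch_makespan(durations, 2))  # 19
```

In the round-robin case one worker ends up with all three slow messages, which is exactly the hot spot described in the question.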