I have a RabbitMQ setup with the following configuration:
Each exchange is of type FANOUT.
Multiple queues are attached to each exchange.
The consumer uses a BlockingConnection.
A single consumer handles all callbacks.
Problem -
Some payloads take longer to process than others, which leaves the consumer unable to pick up new work even when there are payloads waiting in other queues.
Question -
How should I implement the consumer to avoid long waits? Should I
run a separate consumer for each module? Does anyone have experience with this?
Can I configure RabbitMQ to handle these situations? If so, how?
First, it would be nice to know why you have more than one fanout exchange. Do you really need this? A fanout exchange sends messages to all queues bound to it...
Just have more consumers. Check this example from the RabbitMQ tutorial.
You don't really need to configure RabbitMQ explicitly; everything can be done from the clients (publishers and subscribers). You just need to figure out how many exchanges you need, which type they should be, and so on.
First, what programming language are you using? Most common languages, such as Python, Java, and C#, support creating additional threads for parallel processing.
Let's say you consume the queue like below (pseudocode):
def callback(ch, method, properties, body) ...
def threaded_function(ch, method, properties, body) ...
channel.basic_qos(prefetch_count=3)
channel.basic_consume(callback, queue='task_queue')
channel.start_consuming()
First, setting "prefetch_count=3" allows your consumer to hold at most 3 unacknowledged messages at a time.
In the callback method, you should start a thread for executing each message with threaded_function. At the end of the threaded_function method body, do:
ch.basic_ack(delivery_tag = method.delivery_tag)
That way, at most 3 messages are processed concurrently. Even if one or two of the threads take longer to run, the others can still process the next messages.
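A slightly fuller sketch of the threaded approach described above, assuming Python with the pika client in its 1.x form (where basic_consume takes queue= and on_message_callback= keywords). One caveat: pika's BlockingConnection is not thread-safe, so the worker thread should not call basic_ack directly. Here the ack is handed back to the connection thread with add_callback_threadsafe. The names ack_message, make_callback, run and handler are mine, and the "localhost" broker address is an assumption.

```python
import functools
import threading


def ack_message(ch, delivery_tag):
    """Runs on the connection thread; skip the ack if the channel is gone."""
    if ch.is_open:
        ch.basic_ack(delivery_tag=delivery_tag)


def threaded_function(conn, ch, method, properties, body, handler):
    """Do the slow work in a worker thread, then schedule the ack."""
    handler(body)
    # Hand the ack back to the thread that owns the connection.
    conn.add_callback_threadsafe(
        functools.partial(ack_message, ch, method.delivery_tag))


def make_callback(conn, handler, threads):
    """Build the consumer callback; it only starts a thread and returns."""
    def callback(ch, method, properties, body):
        t = threading.Thread(
            target=threaded_function,
            args=(conn, ch, method, properties, body, handler))
        t.start()
        threads.append(t)
    return callback


def run(handler, queue="task_queue", prefetch=3):
    """Consume from a live broker (assumed to be on localhost)."""
    import pika  # imported here so the helpers above work without pika
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    channel.basic_qos(prefetch_count=prefetch)  # at most `prefetch` unacked
    threads = []
    channel.basic_consume(
        queue=queue,
        on_message_callback=make_callback(conn, handler, threads))
    try:
        channel.start_consuming()
    finally:
        for t in threads:
            t.join()
        conn.close()
```

With prefetch=3 the broker stops pushing once three messages are unacknowledged, so the number of worker threads alive at once is bounded by the prefetch count.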
Related
I have a hard time understanding the basic concepts of RabbitMQ. I find the online documentation not perfectly clear.
So far I understand, what a channel, a queue, a binding etc. is.
But how would the following use case be implemented:
Use Case: Sender posts to one exchange with different topics. On the receiver side, depending on the topic, different receivers should be notified.
So the following should somehow be feasible with a topic exchange:
create a channel
within this channel, create a topic exchange
for each topic to be subscribed to, create a queue and a queue binding with this topic as property
My difficulty is that the callback would be related to the channel, not to the queue or the queue binding. I am not 100 % sure if I am right here.
So that's my question: in order to have multiple callbacks, IOW: different message handlers, depending on the subscribed topic - do you have to create multiple channels, one for each "different message handling"? All these channels should grab the same exchange and define their own queue + queue binding for that specific topic?
Please confirm if this is correct or if I am straying from the canonical path of AMQP ... "queue" sounds so light-weight, so I intuitively thought of a queue or a queue binding as the right point to attach a consuming event handler to, but it seems that, instead, channel is my friend in this. Right?
Another aspect of my question:
If I really have to use multiple channels for this, do I have to declare the same exchange (exchange name and exchange type of "topic") for each channel? I hoped there was something like:
define the exchange with this name and the type of "topic" once
for each channel, "grab" this predefined exchange and use it by adding queues and queue bindings to this exchange
I find it helpful to think about the roles of the broker (RabbitMQ) and the clients (your applications) separately.
The broker, RabbitMQ, will receive messages from your publishers, route them to queues, and eventually send them to consumers. The message routing can be simple or complex. In your case, the routing is topic based with a few different queues.
You haven't said much about the publishers, likely because their job is simple. They send messages with a routing key to RabbitMQ.
The consumer side is where things can get interesting. At the simplest level, a consumer subscribes to a queue, receives messages from RabbitMQ, and processes them. The consumer opens a connection to RabbitMQ and will use a channel for a particular use (e.g., subscribing to a queue). The power of message brokers is that they allow designers to break up processes into separate apps if desired.
You don't give much insight into your application, other than the presence of different message topics. An important design choice for you to make is how to define the application(s). Are the different topics suitable for separate applications, or will a single application handle all types of messages?
For the former case, you would have one application for each queue. A single channel that subscribes to the queue is probably the most sensible decision unless your application needs to be threaded. For threaded applications, each thread would have its own channel and all threads can be subscribed to the same queue. Each application would have its own callback function for processing that type of message.
For the latter case (single application with multiple queues), the best approach would be to have at least one channel per queue. It sounds like each queue would require its own callback function, and you would assign the functions to the channels according to their subscriptions. You might have multiple channels per queue if your application can process multiple messages (of each topic) simultaneously.
Regarding your question about declaring exchanges, queues, and bindings, these items only need to be created once. But it is reasonable practice to have your clients declare them at connection time. Advantages of declaring them are that they will be created again if they were deleted and that any discrepancies between your declaration and what is on the broker will trigger errors.
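To make the single-application case concrete, here is a rough sketch assuming Python with the pika client. It opens one channel per queue, re-declares the same topic exchange on each channel (declaring is idempotent as long as the arguments match what the broker already has), binds a queue, and attaches that queue's own callback. Note that in pika the callback is passed to basic_consume, so it belongs to the queue subscription rather than to the channel as a whole. The function name subscribe_topics and the shape of the subscriptions mapping are my own.

```python
def subscribe_topics(connection, exchange, subscriptions):
    """Set up one channel, queue and callback per subscription.

    subscriptions maps a queue name to a (binding_key, callback) pair,
    e.g. {"user_events": ("user.*", on_user_event)}.
    Returns {queue_name: channel}.
    """
    channels = {}
    for queue_name, (binding_key, callback) in subscriptions.items():
        channel = connection.channel()  # one channel per queue
        # Safe to repeat on every channel: the exchange is created once and
        # later declarations only verify that the type matches.
        channel.exchange_declare(exchange=exchange, exchange_type="topic")
        channel.queue_declare(queue=queue_name)
        channel.queue_bind(exchange=exchange, queue=queue_name,
                           routing_key=binding_key)
        # The handler is attached to this queue's subscription.
        channel.basic_consume(queue=queue_name,
                              on_message_callback=callback)
        channels[queue_name] = channel
    return channels
```

How the consumers are then driven (one thread per channel, or a single event loop) depends on the client library and on whether the application is threaded.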
When using RabbitMQ as Message Broker, I have a scenario where multiple concurrent consumers pull messages from a Queue using the basic.get AMQP method and use explicit acknowledgement for deleting the message from the Queue. Assuming the following setup
Q has messages M1, M2, M3 and has consumers C1, C2 and C3 (each having its own connection and channel) connected to it.
How is concurrency handled in the basic.get method? Is the call to basic.get method synchronized to handle concurrent consumers each using its own connection and channel? C1, C2 and C3 issue a basic.get call to receive a message at the same time (assume the server receives all 3 requests simultaneously).
C1 requests a message using basic.get and gets M1. When C2 requests a message, since it's using a different connection, does it get M1 again?
How can consumers pull messages in batches of a predefined size?
Your questions really hit at the heart of queuing and process theory, so I will answer from that standpoint (RabbitMQ is really a generic message broker as far as my answers are concerned, as this applies to any message broker).
How is concurrency handled in the basic.get method? Is the call to
basic.get method synchronized to handle concurrent consumers each
using its own connection and channel? C1, C2 and C3 issue a basic.get
call to receive a message at the same time (assume the server receives
all 3 requests simultaneously).
Answer 1: RabbitMQ is designed to be a reliable message broker. It contains internal processes and controls to ensure that the same message does not get passed out multiple times to different consumers. Now, due to the impracticality of testing the scenario that you describe, does it work perfectly? Who knows. That is why properly-designed applications using message-based architecture will use idempotent transactions, such that if the same transaction is processed multiple times, the result will be the same as if the transaction was processed once.
Takeaway: Design your application so that the answer to this question is unimportant.
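As a minimal illustration of that takeaway, here is a Python sketch of an idempotent handler. It assumes every message carries a unique ID set by the publisher (that ID, the class name IdempotentHandler and the in-memory set are my assumptions; a real system would keep the seen IDs in a durable store).

```python
class IdempotentHandler:
    """Applies each message at most once, keyed by a publisher-set ID."""

    def __init__(self, apply):
        self._apply = apply
        self._seen = set()  # in-memory only; assumed durable in real use

    def handle(self, message_id, payload):
        """Return True if the payload was applied, False if it was a repeat."""
        if message_id in self._seen:
            return False
        self._apply(payload)
        # Record the ID only after the work succeeded, so a failure
        # leaves the message eligible for redelivery.
        self._seen.add(message_id)
        return True
```

If the broker ever did hand the same message to two consumers, or redelivered it after a crash, the second call to handle would be a no-op.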
C1 requests a message using basic.get and gets M1. When C2 requests
for a message, since it's using a different connection, does it get M1
again?
Answer 2: No. Subject to the assumptions of my previous answer, the RabbitMQ broker will not serve the same message back once it has been delivered. Depending on the settings of the channel and queue, the message may be automatically acknowledged upon delivery and will never be redelivered. Other settings will have the message requeue automatically upon the "death" of the processing thread/channel or a negative acknowledgment from your processing thread. This is important functionality, since a "poison" message could repeatedly wreak havoc in your application if it could be served up to multiple consumers. Takeaway: you may safely rely on this assumption in designing your application.
How can consumers pull messages in batches of a predefined size?
Answer 3: They can't, nor would it make sense for them to. In any queuing system, the fundamental assumption is that items are removed from the queue in single file. Attempts to violate this assumption result in unpredictable behavior; furthermore, single-piece flow is commonly the most efficient method of processing. However, in the real world, there are cases where batch sizes > 1 are necessary. In such cases, it makes sense to load the batch into its own single message. This may require a separate processing thread that pulls messages from the queue and batches them together, or it may mean publishing the messages in batches in the first place. Keep in mind that once you have multiple consumers, there is no way to guarantee that individual messages will be processed in order. Takeaway: Batching should be avoided wherever possible, but where it is not practical to avoid, you may not assume that batches will contain individual messages in any particular order.
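The "separate thread that pulls messages and batches them together" idea can be sketched in Python like this. It is client-side only: fetch_one is a hypothetical callable that returns the next message or None when the queue is currently empty (for example a thin wrapper around basic.get), and collect_batch is my own name.

```python
def collect_batch(fetch_one, batch_size):
    """Pull up to batch_size items, stopping early if the source runs dry."""
    batch = []
    while len(batch) < batch_size:
        item = fetch_one()
        if item is None:  # nothing available right now
            break
        batch.append(item)
    return batch
```

The batch can come back short, and with several consumers pulling at once nothing here says which messages end up together, which matches the ordering caveat above.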
You might want to read the RabbitMQ API guide and the introduction to AMQP.
First of all, avoid consuming messages using basicGet in your consumers. Rather, use the Consumer interface with basicConsume. This allows RabbitMQ to push messages to you as they arrive on the queue. Anything else is a waste of resources here, as it boils down to busy polling.
When using basicConsume, RabbitMQ will even push more messages to you in the background, up to a certain prefetch count. This allows you to process multiple messages concurrently and minimizes the time you wait for the next message to process (if one is available).
Concurrency is not an issue at all, that's what you're using a queue for!
When having multiple consumers on one queue, a message will always only be delivered to one consumer (as long as the message is ACKed). Otherwise you need private queues for each consumer and route your messages accordingly.
Btw, if you're able to share the connection among your consumers, you should do so.
Just make sure to use one channel per thread.
There is no special configuration required for that scenario. Each client will atomically fetch and receive one message from the queue, just as you would like to happen.
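An in-process analogy in Python, for intuition only: it uses the standard library's thread-safe queue.Queue rather than RabbitMQ, and the function name run_consumers is mine. Several consumers pull from one queue at the same time, and each message is handed to exactly one of them.

```python
import queue
import threading


def run_consumers(messages, n_consumers):
    """Return {consumer_index: [messages that consumer received]}."""
    q = queue.Queue()
    for m in messages:
        q.put(m)
    received = {i: [] for i in range(n_consumers)}

    def consume(i):
        while True:
            try:
                m = q.get_nowait()  # atomic: no two threads get the same item
            except queue.Empty:
                return
            received[i].append(m)

    threads = [threading.Thread(target=consume, args=(i,))
               for i in range(n_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Which consumer gets which message varies from run to run, but the union of what they received is always the original set with no duplicates.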
I am starting with ActiveMQ and I have a use case. I have n producers sending messages into a queue Q1. I want to stop the delivery of messages (i.e. I do not want consumers to consume those messages). I want to store the messages for some time without them being consumed.
I was looking at ways this can be achieved. These ideas came to mind based on what I browsed through.
Using Mirrored queues, so that I can wiretap the messages and save into a virtual queue.
Possibly stop consumers from doing a PULL on the queue.
Another dirty way of doing this is to have consumers not send an ack once they have consumed a message from the queue.
We are currently not happy with any of these.
Can you suggest any other way?
Thanks in advance.
If you always want message delivery to be delayed you can use the scheduler feature of ActiveMQ to delay delivery until a set time or a fixed delay etc.
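For the scheduler route, the delay is requested by setting the AMQ_SCHEDULED_DELAY message property (in milliseconds) on each message, and the broker must have the scheduler enabled (schedulerSupport="true" on the broker element). A small Python sketch of building such properties; the helper name with_delay is mine, and how the properties are attached to a message depends on the client you use.

```python
def with_delay(properties, delay_ms):
    """Return a copy of the message properties with a delivery delay set."""
    if delay_ms < 0:
        raise ValueError("delay_ms must be non-negative")
    delayed = dict(properties)
    # Property read by the ActiveMQ scheduler; value is in milliseconds.
    delayed["AMQ_SCHEDULED_DELAY"] = int(delay_ms)
    return delayed
```

For example, with_delay({"type": "order"}, 60000) asks the broker to hold the message for one minute before it becomes visible to consumers.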
Other strategies might also work, but it is really up to you to design something that fits your use case. You can try to use Apache Camel to define a route that implements the logic of your use case to either dispatch a message to a queue or send it to the scheduler for delayed processing. It all really depends on your use case and requirements.
I have two questions about RabbitMQ Work Queues:
As I understand it from the RabbitMQ tutorials, if I have a basic queue consumer client (just a basic "Hello, World!" consumer) and then add a second consumer client for the same queue, RabbitMQ will automatically dispatch the messages between those two consumers in a round-robin manner. Is that true (without adding any extra configuration)?
My consumer clients are configured to only ever receive one message at a time, using GetResponse response = channel.basicGet("my_queue", false). Since I am only ever receiving one message at a time, is it still necessary to set a prefetchCount (channel.basicQos(1)) for fair dispatch?
Answers to your questions:
Yes
No
However, your two questions 1 and 2 are not compatible. If you are using a consumer, it is designed to have messages pushed to it, and you don't use Basic.Get. When you use a consumer, you will need to use Basic.QoS to specify that the consumer can only "own" one unacknowledged message at a time. RabbitMQ will not push additional messages beyond the QoS limit.
Your alternative is to "pull" from the queue using Basic.Get, and you will control your own destiny as far as how many messages you run at a time.
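The pull style looks roughly like this. The question uses the Java client; this sketch uses Python with pika instead, where basic_get returns a (method, properties, body) triple whose elements are all None when the queue is empty. The function name pull_one and the handler argument are mine.

```python
def pull_one(channel, queue, handler):
    """Fetch, process and ack a single message. Returns False if none."""
    method, properties, body = channel.basic_get(queue=queue, auto_ack=False)
    if method is None:
        return False  # queue was empty
    handler(body)
    # Explicit ack only after the work is done; no QoS setting is involved
    # because the client decides when to ask for the next message.
    channel.basic_ack(delivery_tag=method.delivery_tag)
    return True
```

Since the client asks for each message itself, it holds at most one unacknowledged message without any prefetch setting.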
Does this make sense?
Pretty new to RabbitMQ and we're still in the investigation stage to see if it's a good fit for our use cases--
We've readily come to the conclusion that our desired topology would have us deploying a few topic based exchanges, and then filtering from there to specific queues. For example, let's say we have a user and an upload exchange, where the user queue might receive messages where the topic is "new-registration" or "friend-request" and the upload exchange might receive messages like "video-upload" or "picture-upload".
Creating the queues, getting messages routed to the appropriate queue, and then building listeners to handle the messages for the various queues has been quite straightforward.
What's unclear to me however is if it's possible to do a fanout on a topic exchange?
I.e. I have named queues that are bound to my topic exchange, but I'd like to be able to just throw tons of instances of my listeners at those queues to prevent single points of failure. But to the best of my knowledge, RabbitMQ treats these listeners in a straightforward round-robin fashion--e.g. every Nth message always goes to the same Nth listener rather than being dispatched to the first available consumer. This is generally acceptable to us, but given the load we anticipate, we'd like to avoid the possibility of hot spots developing amongst our consumer farm.
So, is there some way, either in the queue or exchange configuration or in the consumer code, where we can point our listeners to a topic queue but have the listeners treated in a fanout fashion?
Yes, by having the listeners bind using different queue names, they will be treated in a fanout fashion.
Fanout is 1:N though, i.e. each task can be delivered to multiple listeners like pub-sub. Note that this isn't restricted to a fanout exchange, but also applies if you bind multiple queues to a direct or topic exchange with the same binding key. (Installing the management plugin and looking at the exchanges there may be useful to visualize the bindings in effect.)
Your current setup is a task queue. Each task/message is delivered to exactly one worker/listener. Throw more listeners at the same queue name, and they will process the tasks round-robin as you say. With "fanout" (separate queues for a topic) you will process a task multiple times.
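The difference between the two setups comes down to how many queues a message is routed to. A toy Python model, not RabbitMQ itself: bindings are (queue_name, binding_key) pairs, matching is exact (no topic wildcards), and the function name route is mine.

```python
def route(bindings, routing_key):
    """Return the names of the queues that receive a copy of the message."""
    return sorted({queue for queue, key in bindings if key == routing_key})


# Task queue: both listeners consume from the same named queue.
shared = [("video_tasks", "video-upload")]

# "Fanout" style: each listener binds its own queue with the same key.
separate = [("listener_a", "video-upload"), ("listener_b", "video-upload")]
```

route(shared, "video-upload") yields one queue, so the task is processed once however many listeners share that queue. route(separate, "video-upload") yields two queues, so the task is processed twice.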
Depending on your platform there may be existing work queue solutions that meet your requirements, such as Resque or DelayedJob for Ruby, Celery for Python or perhaps Octobot or Akka for the JVM.
I don't know for a fact, but I strongly suspect that RabbitMQ will skip consumers with unacknowledged messages, so it should never bottleneck on a single stuck consumer. The comments on their FAQ seem to suggest that RabbitMQ will make an effort to keep things chugging along even in the presence of troublesome consumers.
This is a late answer, but in case others come across this question...
It sounds like what you want is fair dispatch rather than a fan out model (which would publish a given message to every queue).
Fair dispatch will give a message to the next available worker rather than using a simple round-robin approach. This should avoid the "hotspots" you are concerned about, without delivering the same message to multiple consumers.
If this is what you are looking for, then see the "Fair Dispatch" section on this page in the Rabbit docs. A prefetch count of 1 is the key here.
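A small Python simulation of why this helps, under simplifying assumptions: job durations are known up front, and "fair dispatch" is modeled as giving each job to whichever worker becomes free first, which is roughly what a prefetch count of 1 achieves. Both function names are mine.

```python
import heapq


def round_robin_finish(durations, workers):
    """Time until all work is done when job i always goes to worker i % n."""
    busy = [0] * workers
    for i, d in enumerate(durations):
        busy[i % workers] += d
    return max(busy)


def fair_dispatch_finish(durations, workers):
    """Time until all work is done when each job goes to the first free worker."""
    free_at = [0] * workers
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + d)
    return max(free_at)
```

With alternating slow and fast jobs, durations [10, 1, 10, 1, 10, 1] and two workers, round-robin piles every slow job onto the same worker and finishes at time 30, while fair dispatch finishes at time 21.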