I come from a RabbitMQ background, and with RabbitMQ, you can set up exchanges that route messages to different queues based on a routing key.
In Kafka, my current understanding is that topics can be thought of as queues (that never get emptied). However, I am interested in putting different messages into different topics based on certain criteria, and I would like to avoid doing that logic on the producer side.
Are there Kafka equivalent(s) to RabbitMQ's exchanges?
There is no equivalent. The only way to route different messages to
different topics is to put that logic on the producer side. Even deciding which partition of a topic to send an individual message to is left up to the producer.
Kafka's great strength is that it's really simple. That's part of why Kafka can scale really, really well. The downside is that Kafka doesn't have the feature set of a conventional message queue.
Kafka does have the concept of a message key: when a message carries a key, the default partitioner hashes that key to choose a partition, so every message with the same key is pushed to the same partition of the topic.
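For illustration, here is a minimal producer-side sketch using the kafka-python client; the topic names and the selection rule are made up. Both the "which topic" decision and the key live in the producer:

```python
# Sketch only: assumes kafka-python and a broker on localhost:9092;
# topic names and the routing rule are invented for illustration.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

def publish(event_type: str, client_id: str, payload: bytes) -> None:
    # The "exchange" logic lives in the producer: pick the topic here.
    topic = "orders" if event_type == "order" else "other-events"
    # Messages with the same key are hashed to the same partition.
    producer.send(topic, key=client_id.encode("utf-8"), value=payload)

publish("order", "client-42", b'{"total": 10}')
producer.flush()
```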
Related
I have a hard time understanding the basic concepts of RabbitMQ. I find the online documentation not perfectly clear.
So far I understand, what a channel, a queue, a binding etc. is.
But how would the following use case be implemented:
Use Case: Sender posts to one exchange with different topics. On the receiver side, depending on the topic, different receivers should be notified.
So the following should somehow be feasible with a topic exchange:
create a channel
within this channel, create a topic exchange
for each topic to be subscribed to, create a queue and a queue binding with this topic as property
My difficulty is that the callback would be related to the channel, not to the queue or the queue binding. I am not 100 % sure if I am right here.
So that's my question: in order to have multiple callbacks, IOW: different message handlers, depending on the subscribed topic - do you have to create multiple channels, one for each "different message handling"? All these channels should grab the same exchange and define their own queue + queue binding for that specific topic?
Please confirm if this is correct or if I am straying from the canonical path of AMQP ... "queue" sounds so light-weight, so I intuitively thought of a queue or a queue binding as the right point to attach a consuming event handler to, but it seems that, instead, channel is my friend in this. Right?
Another aspect of my question:
If I really have to use multiple channels for this, do I have to declare the same exchange (exchange name and exchange type of "topic") for each channel? I hoped there was something like:
define the exchange with this name and the type of "topic" once
for each channel, "grab" this predefined exchange and use it by adding queues and queue bindings to this exchange
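To make the sender side of the use case concrete, here is roughly what I mean, sketched with pika (exchange and topic names are placeholders). Declaring the exchange is idempotent, so "grabbing" it from another channel is just re-declaring it with the same name and type:

```python
# Sender-side sketch of the use case (pika; names are placeholders).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare the topic exchange once; re-declaring it elsewhere with the
# same name and type is idempotent and simply "grabs" the existing one.
channel.exchange_declare(exchange="events", exchange_type="topic")

# The sender only chooses a routing key (topic); it knows nothing
# about the queues bound on the receiver side.
channel.basic_publish(exchange="events",
                      routing_key="user.friend-request",
                      body=b"friend request payload")
connection.close()
```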
I find it helpful to think about the roles of the broker (RabbitMQ) and the clients (your applications) separately.
The broker, RabbitMQ, will receive messages from your publishers, route them to queues, and eventually send them to consumers. The message routing can be simple or complex. In your case, the routing is topic based with a few different queues.
You haven't said much about the publishers, likely because their job is simple. They send messages with a routing key to RabbitMQ.
The consumer side is where things can get interesting. At the simplest level, a consumer subscribes to a queue, receives messages from RabbitMQ, and processes them. The consumer opens a connection to RabbitMQ and will use a channel for a particular use (e.g., subscribing to a queue). The power of message brokers is that they allow designers to break up processes into separate apps if desired.
You don't give much insight into your application, other than the presence of different message topics. An important design choice for you to make is how to define the application(s). Are the different topics suitable for separate applications, or will a single application handle all types of messages?
For the former case, you would have one application for each queue. A single channel that subscribes to the queue is probably the most sensible decision unless your application needs to be threaded. For threaded applications, each thread would have its own channel and all threads can be subscribed to the same queue. Each application would have its own callback function for processing that type of message.
For the latter case (single application with multiple queues), the best approach would be to have at least one channel per queue. It sounds like each queue would require its own callback function, and you would assign the functions to the channels according to their subscriptions. You might have multiple channels per queue if your application can process multiple messages (of each topic) simultaneously.
Regarding your question about declaring exchanges, queues, and bindings, these items only need to be created once. But it is reasonable practice to have your clients declare them at connection time. Advantages of declaring them are that they will be created again if they were deleted and that any discrepancies between your declaration and what is on the broker will trigger errors.
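To make the callback question concrete, here is a minimal consumer sketch with pika (queue and exchange names are invented). Note that in pika the callback is attached per basic_consume call, i.e. per queue subscription, not per channel; for the one-channel-per-queue layout described above, each channel would simply carry one of these subscriptions:

```python
# Minimal consumer sketch with pika (names are invented).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="events", exchange_type="topic")
for queue, key in (("registrations", "user.registered"),
                   ("friend_requests", "user.friend-request")):
    channel.queue_declare(queue=queue)
    channel.queue_bind(exchange="events", queue=queue, routing_key=key)

def handle_registration(ch, method, properties, body):
    print("registration:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

def handle_friend_request(ch, method, properties, body):
    print("friend request:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

# The callback is tied to the subscription (queue), not to the channel.
channel.basic_consume(queue="registrations",
                      on_message_callback=handle_registration)
channel.basic_consume(queue="friend_requests",
                      on_message_callback=handle_friend_request)
channel.start_consuming()
```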
Is it possible to bind a single queue to many topics using RabbitMQ STOMP client?
Each time a client sends a SUBSCRIBE frame, the server creates a new queue for it. That makes the "prefetch-count" header useless for me, because it applies to each subscription individually.
I am just looking for any way to get messages with many topics in the single queue via RabbitMQ Web-STOMP. Any ideas?
See the documentation: User generated queue names for Topic and Exchange destinations.
The x-queue-name header lets you specify the queue name; subscriptions that specify the same name should bind to the same queue if it exists, but the client will still end up with multiple subscriptions.
The AMQP and STOMP models differ, and they are not compatible in some ways.
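If it helps, the kind of subscription the documentation describes might look like this with the stomp.py client. This is a sketch only: the port, credentials, and names are assumptions, it uses plain STOMP over TCP rather than Web-STOMP, and I have not verified the prefetch behaviour across the two subscriptions:

```python
# Sketch with stomp.py: both topic subscriptions name the same backing
# queue via the x-queue-name header (names/credentials are assumptions).
import stomp

conn = stomp.Connection([("localhost", 61613)])
conn.connect("guest", "guest", wait=True)
# A listener would normally be registered with conn.set_listener()
# to actually receive the MESSAGE frames.

common_headers = {"x-queue-name": "my-shared-queue", "prefetch-count": 10}
conn.subscribe(destination="/topic/orders.created", id=1, ack="client",
               headers=common_headers)
conn.subscribe(destination="/topic/orders.cancelled", id=2, ack="client",
               headers=common_headers)
```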
I have to implement this scenario:
An external application publishes messages to RabbitMQ.
Each message has a client_id property. We can place this id in the routing key, a message header, or some other property.
I have to implement sharding in the exchange routing logic: each message should be delivered to a specific queue based on a client_id range.
Is it possible to implement this with the standard exchange types?
If not, which exchange should I take as the base?
How can I dynamically change the client_id ranges?
Take a look at the RabbitMQ sharding plugin. It's included in the RabbitMQ distribution from v3.6.0 onwards.
Just have your producer put enough info into the routing key that causes the message to go into the right queue on the other side of the Exchange.
So for example, create two queues called 1 and 2 and bind them with routing keys matching the names. Then have your producer decide which routing key to use when producing the event message. Customers with names starting with letters a-m go to 1, n-z go to 2, you get the idea. It pushes the sharding to the producer but that might be OK for your application.
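A sketch of that producer-side decision with pika, using the queue names "1" and "2" and the a-m / n-z split from the example above purely as illustration:

```python
# Producer-side sharding sketch (pika); the a-m / n-z split is just
# the illustrative rule from the answer above.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="clients", exchange_type="direct")
for queue in ("1", "2"):
    channel.queue_declare(queue=queue)
    channel.queue_bind(exchange="clients", queue=queue, routing_key=queue)

def routing_key_for(customer_name: str) -> str:
    # Names a-m go to queue "1", n-z to queue "2".
    return "1" if customer_name[0].lower() <= "m" else "2"

channel.basic_publish(exchange="clients",
                      routing_key=routing_key_for("alice"),
                      body=b"message for alice")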
AMQP doesn't have any explicit implementation of sharding, but its architecture should help you to do that.
Spreading messages across several queues is exactly what RabbitMQ is for (and it is part of the AMQP specification): with routing, you can attach heterogeneous consumers to handle specific messages delivered via the same exchange. The producer therefore publishes with a specific key so the message is consumed by a specific queue/consumer.
You can decide to use static sharding: say you have 10 queues with one consumer per queue, you could use a hashing function such that the routing key is CLIENT_ID % 10.
Other, non-static solutions could be proposed, and you can build on top of this architecture.
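As a sketch of that static layout, assuming 10 queues named shard.0 through shard.9 are bound to a direct exchange by those same routing keys:

```python
# Static sharding sketch: client_id % 10 picks one of 10 routing keys.
# Assumes queues shard.0 .. shard.9 are bound to a direct exchange with
# matching routing keys.
def shard_routing_key(client_id: int, shards: int = 10) -> str:
    return f"shard.{client_id % shards}"

# e.g. client_id 42 -> "shard.2"
print(shard_routing_key(42))
```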
I have a Topic exchange from which I'd like to distribute messages to two queues on two servers part of a cluster, in order to reduce memory pressure on any particular server. My consumers are periodically slow, and I sometimes run into the high memory watermark.
The way I tried to resolve this is by routing messages using an intermediate direct exchange, with two queues bound to the exchange:
a (topic) -> a1 (direct) -> q1/q2 (bound to routing key "a")
But the messages were routed to both queues, as AMQP intends. Does anyone have ideas? What I need is an exchange that routes to one and only one queue, even if the routing key matches many queues. I'd prefer not to change my routing keys, but that could be arranged.
I found Selective routing with RabbitMQ, which may mean I'll need to implement my own routing logic. Hopefully, this already exists somewhere else.
You could perhaps use the Shovel plugin - http://www.rabbitmq.com/shovel.html - to move messages from your intermediate exchange to the two queues.
If you set up two shovels, both consuming from a single queue on the direct intermediate exchange, they should be able to fight over the messages coming in (I'm assuming that you don't care too much if the two recipient queues don't get the incoming messages in a strict round robin fashion). The shovels then each publish to one of the two end queues, and can send through the ACKs from the end consumer.
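If the Shovel plugin isn't an option, the same effect can be hand-rolled: competing consumers on one intermediate queue that re-publish to the end queues. Below is a sketch with pika (all names are made up, and this is a stand-in for a shovel, not the plugin itself):

```python
# Hand-rolled stand-in for one of the two shovels (pika; names made up).
# Run one copy per destination queue; both copies compete on the
# "intermediate" queue, so each message ends up in exactly one of q1/q2.
import sys
import pika

dest_queue = sys.argv[1]  # "q1" or "q2"

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="intermediate")
channel.queue_declare(queue=dest_queue)

def relay(ch, method, properties, body):
    # Re-publish to the destination queue, then ack the original.
    ch.basic_publish(exchange="", routing_key=dest_queue, body=body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="intermediate", on_message_callback=relay)
channel.start_consuming()
```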
Pretty new to RabbitMQ and we're still in the investigation stage to see if it's a good fit for our use cases--
We've readily come to the conclusion that our desired topology would have us deploying a few topic-based exchanges, and then filtering from there to specific queues. For example, let's say we have a user exchange and an upload exchange, where the user exchange might receive messages where the topic is "new-registration" or "friend-request" and the upload exchange might receive messages like "video-upload" or "picture-upload".
Creating the queues, getting messages routed to the appropriate queue, and then building listeners to handle the messages for the various queues has been quite straightforward.
What's unclear to me, however, is whether it's possible to do a fanout on a topic exchange.
I.e. I have named queues that are bound to my topic exchange, but I'd like to be able to just throw tons of instances of my listeners at those queues to prevent single points of failure. But to the best of my knowledge, RabbitMQ treats these listeners in a straightforward round-robin fashion--e.g. every Nth message always goes to the same Nth listener rather than dispatching messages to the first available consumer. This is generally acceptable to us but given the load we anticipate, we'd like to avoid the possibility of hot spots developing amongst our consumer farm.
So, is there some way, either in the queue or exchange configuration or in the consumer code, where we can point our listeners to a topic queue but have the listeners treated in a fanout fashion?
Yes, by having the listeners bind using different queue names, they will be treated in a fanout fashion.
Fanout is 1:N though, i.e. each task can be delivered to multiple listeners like pub-sub. Note that this isn't restricted to a fanout exchange, but also applies if you bind multiple queues to a direct or topic exchange with the same binding key. (Installing the management plugin and looking at the exchanges there may be useful to visualize the bindings in effect.)
Your current setup is a task queue. Each task/message is delivered to exactly one worker/listener. Throw more listeners at the same queue name, and they will process the tasks round-robin as you say. With "fanout" (separate queues for a topic) you will process a task multiple times.
Depending on your platform there may be existing work queue solutions that meet your requirements, such as Resque or DelayedJob for Ruby, Celery for Python or perhaps Octobot or Akka for the JVM.
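To illustrate the distinction in code, here is a pika sketch reusing the names from the question (queue names are invented): competing consumers share one queue name for task-queue behaviour, while fanout-style duplication comes from binding a separate queue with the same key:

```python
# Work-queue vs fanout sketch (pika; queue names are invented).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="user", exchange_type="topic")

# Task queue: every listener consumes from the SAME queue name, so each
# message goes to exactly one of them (round-robin).
channel.queue_declare(queue="new-registration")
channel.queue_bind(exchange="user", queue="new-registration",
                   routing_key="new-registration")

# Fanout-style: a SECOND queue bound with the same key receives its own
# copy of every message (e.g. for an audit consumer).
channel.queue_declare(queue="new-registration.audit")
channel.queue_bind(exchange="user", queue="new-registration.audit",
                   routing_key="new-registration")
```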
I don't know for a fact, but I strongly suspect that RabbitMQ will skip consumers with unacknowledged messages, so it should never bottleneck on a single stuck consumer. The comments on their FAQ seem to suggest that RabbitMQ will make an effort to keep things chugging along even in the presence of troublesome consumers.
This is a late answer, but in case others come across this question...
It sounds like what you want is fair dispatch rather than a fan out model (which would publish a given message to every queue).
Fair dispatch will give a message to the next available worker rather than using a simple round-robin approach. This should avoid the "hotspots" you are concerned about, without delivering the same message to multiple consumers.
If this is what you are looking for, then see the "Fair Dispatch" section on this page in the Rabbit docs. A prefetch count of 1 is the key here.
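A minimal pika sketch of that setting (the queue name is invented):

```python
# Fair dispatch sketch (pika; queue name is invented): with
# prefetch_count=1, RabbitMQ won't give a worker a new message until it
# has acked the previous one, so slow workers don't accumulate a backlog.
import time
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="new-registration")
channel.basic_qos(prefetch_count=1)

def handle(ch, method, properties, body):
    time.sleep(1)  # simulate slow, uneven work
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="new-registration", on_message_callback=handle)
channel.start_consuming()
```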