RabbitMQ queue sharding - rabbitmq

I have to implement this scenario:
An external application publishes messages to RabbitMQ.
Each message has a client_id property. We can place this ID in the routing key, a message header, or some other property.
I have to implement sharding in the exchange routing logic: each message should be delivered to a specific queue based on its client_id range.
Is it possible to implement this with the standard exchanges?
If not, which exchange should I take as the base?
How can I change the client_id ranges dynamically?

Take a look at the RabbitMQ sharding plugin (rabbitmq_sharding). It's included in the RabbitMQ distribution from v3.6.0 onwards.

Just have your producer put enough info into the routing key for the message to land in the right queue on the other side of the exchange.
So for example, create two queues called 1 and 2 and bind them with routing keys matching the names. Then have your producer decide which routing key to use when producing the event message. Customers with names starting with letters a-m go to 1, n-z go to 2, you get the idea. It pushes the sharding to the producer but that might be OK for your application.
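A minimal sketch of that producer-side decision, in Python. The function name routing_key_for and the fallback for names that do not start with a letter are assumptions for illustration; the queue names 1 and 2 and the a-m / n-z split come from the example above.

```python
def routing_key_for(customer_name: str) -> str:
    """Pick the routing key (same as the queue name here) for a customer.

    Names starting with a-m go to queue "1", n-z go to queue "2".
    Anything else (digits, empty names) falls back to "1"; that
    fallback is an assumption, not part of the original example.
    """
    first = customer_name[:1].lower()
    if "n" <= first <= "z":
        return "2"
    return "1"
```

The producer would then publish each message with routing_key_for(name) as its routing key, to an exchange whose bindings match the queue names.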

AMQP doesn't have any explicit implementation of sharding, but its architecture should help you build one.
Spreading messages across several queues is an ordinary routing task for RabbitMQ (and part of the AMQP specification), and with routing you can attach heterogeneous consumers that handle specific messages routed via the same exchange. The producer therefore publishes with a specific routing key so that the message is consumed by a specific queue/consumer.
You can opt for static sharding: perhaps you have 10 queues with one consumer per queue. You could implement a hashing function such that the routing key is CLIENT_ID % 10. (Note that plain modulo hashing is not consistent hashing in the strict sense: changing the number of queues remaps most client IDs.)
Other, non-static solutions could be proposed, and you can try to build them on top of this architecture.
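Here is a rough sketch of both ideas in Python: the static CLIENT_ID % 10 key, and a range table that the producer can replace at runtime. The class name RangeSharder, the boundary convention, and the shard names used below are all assumptions made for illustration.

```python
from bisect import bisect_right

NUM_QUEUES = 10  # static shard count from the answer above


def modulo_shard(client_id: int) -> str:
    """Static sharding: the routing key is CLIENT_ID % 10."""
    return str(client_id % NUM_QUEUES)


class RangeSharder:
    """Maps client_id ranges to queue names; ranges can be swapped at runtime."""

    def __init__(self, boundaries, queues):
        self.update(boundaries, queues)

    def update(self, boundaries, queues):
        # Boundaries are sorted split points: N boundaries define N + 1
        # ranges, so exactly N + 1 queue names are required.
        if len(queues) != len(boundaries) + 1 or list(boundaries) != sorted(boundaries):
            raise ValueError("need sorted boundaries and len(boundaries) + 1 queues")
        self._boundaries = list(boundaries)
        self._queues = list(queues)

    def routing_key(self, client_id: int) -> str:
        # IDs below the first boundary land in the first queue; an ID equal
        # to a boundary belongs to the range that starts at that boundary.
        return self._queues[bisect_right(self._boundaries, client_id)]
```

For example, RangeSharder([1000, 5000], ["shard.a", "shard.b", "shard.c"]) sends IDs below 1000 to shard.a, 1000 to 4999 to shard.b, and the rest to shard.c. Calling update() changes where new messages go; messages already sitting in a queue stay where they are.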

Related

RabbitMQ - Can I publish an event from a different exchange

Here's an example:
TYPE : TOPIC
exchange.v1 -> queue.order
exchange.v2 -> queue.log
So when the app is running, it must configure the exchange first, right? And can a single service have only one exchange?
I have one service for logging and one service for ordering. All processes send to the logging service, which then forwards another event, in this case to queue.order.
So is it possible to publish an event from a different exchange? Or am I missing something? Please let me know :(
Exchanges are not tied to “services”, much less in a 1:1 manner.
Exchanges in RabbitMQ are message sinks. Any existing exchanges can be published to by any number of applications (“services”) with adequate permissions.
Exchanges can either be pre-deployed or created automatically by an application. Pre-deployment is usually more common. This may or may not be outside the lifecycle of a single “service”.
Exchanges (depending on type) may also route to any number of queues on the same vhost.
Now, with all of that out of the way...
It is very possible to forward a message from a queue to another exchange: read from queues (stores), publish to exchanges (sinks). This can be done in code or even with a tool like the Shovel plugin. The “correct” approach depends significantly on the semantics you need, just as the choice of routing does.
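As a sketch of the "in code" variant, the consumer callback below republishes whatever it reads and then acknowledges it. It follows the four-argument callback convention of the Python pika client; the target exchange name is taken from the question's example, and reusing the original routing key is an assumption.

```python
TARGET_EXCHANGE = "exchange.v1"  # from the question's example; adjust as needed


def forward_to_exchange(channel, method, properties, body):
    """Consumer callback: republish a consumed message, then acknowledge it.

    Any channel object that offers basic_publish and basic_ack will do.
    """
    channel.basic_publish(
        exchange=TARGET_EXCHANGE,
        routing_key=method.routing_key,  # assumption: keep the original key
        body=body,
        properties=properties,
    )
    # Acknowledge only after the publish call, so a failure before this
    # point leaves the message unacknowledged on the source queue
    # (this assumes manual acknowledgements are in use).
    channel.basic_ack(delivery_tag=method.delivery_tag)
```

With pika this would be registered through channel.basic_consume(queue=..., on_message_callback=forward_to_exchange) on the logging service's queue.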
Personally, I recommend keeping RabbitMQ processing chains to as limited a scope as allowed by the application domain.

In RabbitMQ, which is more expensive: multiple queues per exchange, or multiple exchanges with fewer queues each?

So we decided to go with RabbitMQ as a message/event bus in our migration to a microservices architecture, but we couldn't find a definitive answer on the best way to lay out our queues. We have two options:
1. One main exchange, which will be a fanout exchange. It fans messages out to a main queue for logging and other purposes, and to a sub-exchange, which will be a topic exchange that routes the messages to each desired queue using the message routing key. We expect the number of queues behind the sub-exchange to be fairly large. This can be explained by this graph:
2. One main exchange, which will be a topic exchange, with still one main queue bound to that exchange using the "#" routing key. That main exchange will also handle the main routing to other sub-exchanges, so routing keys might be "agreements.#", "assignments.#", "messages.#", which are then used to bind multiple topic sub-exchanges. Each of those handles sub-routing, so one sub-exchange might handle all "assignments", and queues bound to that exchange could be bound by routing keys like "assignments.accepted", "assignments.deleted"... In this scenario, we feel that the number of queues per exchange will be smaller, since they will be distributed between exchanges.
So, which of these scenarios is the better approach, that is, faster on RabbitMQ and with less overhead?
Keep in mind that all queues, exchanges and bindings will be declared on the fly by whichever service is publishing or subscribing.
You can find some explanation in this topic: RabbitMQ Topic exchanges: 1 Exchange vs Many Exchanges
I am using RabbitMQ in a way very similar to your case 2, as I found the same benefits as described in this article: https://skillachie.com/2014/06/27/rabbitmq-exchange-to-exchange-bindings-ampq/
Exchange-to-exchange bindings are much more flexible in terms of the topology that you can design, promote decoupling, and reduce binding churn.
Exchange-to-exchange bindings are said to be very lightweight and, as a result, help to increase performance.
Based on my own experience with exchange-to-exchange bindings, case 2 works well and lets you create or change message flow topologies very quickly.
I'm going to first re-summarize what I think is your question, since I'm sure it's buried somewhere in your post.
It is desirable to have a tracer/logging queue, in addition to a series of work-specific queues for actual message processing. What exchange topology is best for this scenario?
First off, neither option makes much sense given your application. Option 1 will create an exchange that will publish a message to every queue bound to it, regardless. This is clearly not what you want. Option 2 will give you a rather complex routing topology for which the benefit is unclear, and the drawback is painful maintenance and a steep learning curve. (Just because you can do something does not mean you should do it.)
What should be done?
It is important to remember that in RabbitMQ, it is the queues which consume the resources of the broker. Exchanges merely connect queues with publishers. The exchange is a means to an end, while the queue is the end itself.
What I think you should do instead is set up a single topic exchange. Bind your tracing queue with the binding key # so that it receives all messages. Then, bind your work queues appropriately so that they receive only the messages that need to flow into them. For example, it is common to route messages by message type, where each queue holds exactly one type of message. This is both simple and effective.
The advantage of a single topic exchange is that you get the benefits of both a Direct Exchange and a Fanout Exchange depending on the binding key used. Further, configuration changes are easy to achieve and can often be done without disrupting any system processing at all (let's say that you want to stop tracing certain messages - this can be done with ease using a topic exchange, assuming your routing keys are rational).
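To make the binding behaviour concrete, here is a small in-memory model of topic matching in Python. It only mimics how binding keys select queues and does not talk to a broker; the queue names and the binding table are invented for illustration, while the routing keys are borrowed from the question.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    def match(p, k):
        if not p:
            return not k  # pattern exhausted: key must be exhausted too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may consume anywhere from 0 to len(k) words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])
    return match(pattern.split("."), routing_key.split("."))


# Hypothetical bindings on one topic exchange: queue name -> binding keys.
BINDINGS = {
    "trace": ["#"],                                  # tracing queue sees everything
    "assignments_accepted": ["assignments.accepted"],  # one message type per queue
    "messages_all": ["messages.#"],
}


def route(routing_key: str):
    """Return the queues that would receive a message with this routing key."""
    return sorted(
        queue
        for queue, patterns in BINDINGS.items()
        if any(topic_matches(p, routing_key) for p in patterns)
    )
```

In this model route("assignments.accepted") reaches both the work queue and the trace queue, while a key with no specific binding reaches the trace queue only. Dropping a type from tracing would mean replacing the single # binding with narrower ones.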
Exchange-to-exchange bindings are semantically identical to exchange-to-queue bindings.
https://www.rabbitmq.com/e2e.html

RabbitMQ same message to each consumer

I have implemented the example from the RabbitMQ website:
RabbitMQ Example
I have expanded it to have an application with a button to send a message.
Now I started two consumers on two different computers.
When I send messages, the first message is sent to computer1, the second to computer2, the third to computer1, and so on.
Why is this, and how can I change the behavior to send each message to each consumer?
Why is this
As noted by Yazan, messages are consumed from a single queue in a round-robin manner. The behavior you are seeing is by design, making it easy to scale up the number of consumers for a given queue.
how can I change the behavior to send each message to each consumer?
To have each consumer receive the same message, you need to create a queue for each consumer and deliver the same message to each queue.
The easiest way to do this is to use a fanout exchange. This will send every message to every queue that is bound to the exchange, completely ignoring the routing key.
If you need more control over the routing, you can use a topic or direct exchange and manage the routing keys.
Whatever type of exchange you choose, though, you will need to have a queue per consumer and have each message routed to each queue.
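The difference between the two setups can be illustrated with a toy in-memory model in Python. It does not use a broker, and the strict alternation is an idealisation of round-robin dispatch; the function names are made up for this sketch.

```python
from itertools import cycle


def work_queue_dispatch(messages, consumers):
    """One shared queue: each message goes to exactly one consumer, in turn."""
    received = {c: [] for c in consumers}
    for msg, consumer in zip(messages, cycle(consumers)):
        received[consumer].append(msg)
    return received


def fanout_dispatch(messages, consumers):
    """Fanout exchange with one queue per consumer: every queue gets a copy."""
    return {c: list(messages) for c in consumers}
```

With messages ["m1", "m2", "m3"] and consumers ["computer1", "computer2"], the first function gives computer1 the first and third message and computer2 the second, which is the behaviour described in the question; the second gives both consumers all three.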
You can't; it's controlled by the server. Check the Round-robin dispatching section.
The server decides whose turn it is. I'm not sure whether there is a set of algorithms you can pick from, but in the end the server controls this (I think round-robin is the default),
unless you want to use routing keys and exchanges.
I would see this more as a design question. Ideally, producers should create the exchanges and the consumers create the queues and each consumer can create its own queue and hook it up to an exchange. This makes sure every consumer gets its message with its private queue.
What you're doing is essentially the 'worker queues' model, which is used to distribute tasks among worker nodes. Since each task needs to be performed only once, the message is sent to only one node. If you want to send a message to all the nodes, you need a different model called 'pub-sub', where each message is broadcast to all the subscribers. The following link shows a simple pub-sub tutorial
https://www.rabbitmq.com/tutorials/tutorial-three-python.html

How to get delivery path in rabbitmq to become message property?

The underlying use case
It is a typical pubsub use case: consider that we have M news sources and N subscribers who subscribe to the desired news sources and want to get news updates. However, we want these updates to end up in mongodb, essentially maintaining the most recent 'k' updates (which can be indexed, searched, etc.). We want to design for M to scale up to a million publishers and N to scale to a few million.
Subscribers' updates are finally received and stored in more than one hosts and their native mongodbs.
Modeling in rabbitmq
Rabbitmq will be used to persist the mappings (who subscribes to which news source).
I have setup a pubsub system in this way: We create publisher exchanges (each mapping to one news source) and of type 'fanout'.
For modelling subscribers, there are two options.
In the first option, we have one queue for each subscriber, bound to the relevant publisher exchanges. The client process opens connections to all these subscriber queues and receives the updates (and persists them to mongodb). Note that in this option, when the client is restarted, it has to manage the list of all subscribers and open connections to all the subscriber queues it is responsible for.
In the second option, we want to remove the overhead of having to explicitly open each user queue upon startup. Instead, we want to listen to only one queue, representative of all the subscribers whose updates go to this client host.
To achieve this, we first create one exchange for each subscriber and bind it to the publisher exchange(s) that it follows. We create a single queue for each client, and bind the subscriber exchange (type=direct) to this queue if the subscriber belongs to that client.
Once the client receives the update message, it needs to know which subscriber exchange it came from. Only then can we add it to mongodb for the relevant subscriber. Presumably the subscriber exchange should add this information as a new header on the message.
As per the rabbitmq docs, I believe there is no way to achieve this (or, more specifically, to get a 'delivery path' property from the delivered message, from which we could get this information).
My questions:
Is it possible to add a new header to message as it passes through exchange?
If this is not possible, then can we achieve it through custom exchange and relevant plugin? Any plugin that I can readily use for this purpose?
I am curious as to why rabbitmq is not providing delivery path property as an optional configuration?
Is there any other way I can achieve the same? (See pubsubhubbub note below)
PubSubHubBub
The use case is very similar to what the pubsubhubbub protocol provides for, and there is a rabbitmq plugin for it too, called rabbithub. However, our system will be a closed system, and I believe that the webhook approach of the protocol is going to be too much overhead compared to listening on a single queue (also from a performance perspective).
The producer (RMQ Client) of the message should add all the required headers (including the originator's identity) before producing (publishing) it on RMQ. These headers are used for routing.
If, while in transit, the message (including headers) needs to be transformed (e.g. adding new headers), it needs to be sent to the transformer (another RMQ Client). This transformer will essentially become the new publisher.
The actual consumer should receive its intended messages (for which it has subscribed to) through single queue. The routing of all its subscribed messages should be arranged on the RMQ Exchange.
Managing the last 'K' updates should neither be the responsibility of the producer nor the consumer. So, it should be done in the transformer. Producers' messages should be routed to this transformer (for storage) before further re-routing to exchange(s) from where consumers consume.
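A rough Python sketch of what such a transformer might do with each message, leaving out the actual consuming and republishing. The header names source and subscriber, the SUBSCRIPTIONS table, and the value of K are all assumptions for illustration, not something RabbitMQ provides.

```python
from collections import defaultdict, deque

K = 3  # how many recent updates to keep per subscriber (illustrative value)

# Hypothetical mapping: news source -> subscribers following it.
SUBSCRIPTIONS = {
    "news.a": ["alice", "bob"],
    "news.b": ["alice"],
}

# A bounded deque drops the oldest entry once it holds K items.
recent_updates = defaultdict(lambda: deque(maxlen=K))


def transform(headers: dict, body: str):
    """Return one (headers, body) copy per subscriber of the message's source.

    The producer is assumed to have set headers["source"]; the transformer
    adds a "subscriber" header and records the update for that subscriber.
    """
    out = []
    for subscriber in SUBSCRIPTIONS.get(headers["source"], []):
        recent_updates[subscriber].append(body)
        out.append(({**headers, "subscriber": subscriber}, body))
    return out
```

Each returned pair would then be published again by the transformer, acting as the new publisher, so that the consumer can read the subscriber header from the message it receives on its single queue.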

How to consume messages selectively from Spring AMQP?

I have pushed 10K objects into the queue. A timestamp is one of the attributes of each object. How can I write consumer code using Spring AMQP that consumes these messages selectively?
Can anyone help me with this?
AMQP, unlike JMS, has no notion of message selection for consumers. One solution is to use a topic exchange and set the routing key: let's say consumer 1 binds its queue to the exchange with foo.bar, a second one binds with foo.baz, and a third binds with foo.*. The third will get all messages whose routing key is foo. followed by exactly one word (bind with foo.# to match any number of words); the others will only get messages with their respective keys.
A direct exchange could also be used; it requires a complete match on the routing key.
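The matching rules for these three bindings can be sketched in a few lines of Python. This is a simplified model that only understands the * wildcard, not #, and the consumer names are placeholders.

```python
def direct_matches(binding_key: str, routing_key: str) -> bool:
    """Direct exchange: the routing key must equal the binding key."""
    return binding_key == routing_key


def star_matches(binding_key: str, routing_key: str) -> bool:
    """Topic match limited to '*' (exactly one word); '#' is not handled."""
    pattern, words = binding_key.split("."), routing_key.split(".")
    return len(pattern) == len(words) and all(
        p in ("*", w) for p, w in zip(pattern, words)
    )


# Bindings from the answer above; consumer names are placeholders.
CONSUMER_BINDINGS = {"consumer1": "foo.bar", "consumer2": "foo.baz", "consumer3": "foo.*"}


def receivers(routing_key: str):
    """List the consumers whose queue would get a message with this key."""
    return [c for c, key in CONSUMER_BINDINGS.items() if star_matches(key, routing_key)]
```

Here receivers("foo.bar") returns consumer1 and consumer3, while a key such as foo.bar.baz reaches nobody, because foo.* covers exactly one word after foo.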
You should probably work through all the RabbitMQ tutorials to understand the different exchange types before asking more questions here.