RabbitMQ - How to Federate / Mirror Messages

I set up two nodes, A and B. Both have RabbitMQ with the federation plugin installed.
From the Web UI, I can see the "Federation Status" > "State" is "running" on A and B.
On A, I created a queue called "test1".
On B, I can see the "test1" queue (replicated from A).
On A, I added a message.
However, the message does not appear in the replicated queue on B - the message stays on A.
This is the policy I used on A and B:
rabbitmqctl set_policy --apply-to exchanges my-queue "test1" \
'{"federation-upstream-set":"all"}'
So, it's like this: A (upstream) -> B (downstream) and B (upstream) -> A (downstream)
Am I supposed to see messages replicated to both A and B? Did I misconfigure the directions?

However, the message does not appear in the replicated queue on B - the message stays on A.
TL;DR: federated exchange != federated queue.
References:
https://www.rabbitmq.com/federated-exchanges.html
https://www.rabbitmq.com/federated-queues.html
The "How it works" section on federated queues explains:
" The federated queue will only retrieve messages when it has run out of messages locally, it has consumers that need messages, and the upstream queue has "spare" messages that are not being consumed ... "
Whereas the "What does a federated exchange do?" section explains:
" ... messages published to the upstream exchanges are copied to the federated exchange, as though they were published directly to it ... "
Recap:
If you use a federated queue, you need a consumer on the B side that is asking for messages (a pull model, roughly).
If you use a federated exchange, messages are copied over directly (a push model, roughly).
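For illustration, here is a rough sketch of what the two kinds of policy could look like (the policy names and the "^test" pattern are placeholders, and both assume the federation upstream(s) are already defined):

# federated exchange: messages published upstream are copied downstream (push)
rabbitmqctl set_policy --apply-to exchanges federate-exchanges "^test" \
'{"federation-upstream-set":"all"}'

# federated queue: messages are only pulled downstream when a local consumer is asking for them (pull)
rabbitmqctl set_policy --apply-to queues federate-queues "^test" \
'{"federation-upstream-set":"all"}'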
Use cases
Redundancy / Backups
Federated exchanges copy messages (max-hops copies) so they can be used for redundancy.
E.g.
here is my data, back it up.
Content distribution network
Federated exchanges copy messages (max-hops copies) so they can be used to distribute content across regions (that's also redundancy btw) provided you configure the topology correctly.
E.g.
hey everybody, please apply this security patch, which you can find at your nearest broker.
Load balancing
Federated queues can be used for load balancing: if a message is available upstream and there is no consumer there to process it, a free consumer downstream is able to receive the message and work on it. Rock on.
E.g.
I'm a computer, and I feel bored, can I help you? Any job you need me to do?
Double whammy
Federated exchange + federated queues = you can distribute the same set of tasks to multiple regions (clusters), and one worker in each cluster can perform the job.
E.g.
It's the end of the quarter; I need performance metrics for each region (cluster), each store manager (one node in the cluster) will aggregate metrics (inside the cluster), and we'll give gift cards to the top 3.
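As a rough sketch of that exchange + queue combination (everything here is illustrative: the "hq" upstream, its URI, and the "tasks" exchange name are made up), each regional cluster points a federation upstream at HQ and federates the exchange, then binds a local queue for its own workers to compete on:

# run on each regional cluster
rabbitmqctl set_parameter federation-upstream hq '{"uri":"amqp://hq.example.com"}'
rabbitmqctl set_policy --apply-to exchanges federate-tasks "^tasks$" \
'{"federation-upstream":"hq"}'
# then declare a local queue bound to the "tasks" exchange in each cluster;
# every cluster receives a copy of each task, and one local worker handles it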

Related

In a publish/subscribe model in microservices, how to receive/consume a message only once per service type

We are designing for a microservices architecture model where service A publishes a message and services B and C would like to receive/consume the message. However, for high availability, multiple instances of services B and C are running at the same time. Now the question is how do we design it such that only one service instance of B and one service instance of C receive the message, and not all the other service instances.
As far as I know about RabbitMQ, it is not easy to achieve this behavior. I wonder if Kafka or any other messaging framework has a built-in support for this scenario, which I believe should be very common in a microservices architecture.
Kafka has a feature called Consumer Groups that does exactly what you describe.
Every identical instance of B can declare its group.id to be the same string (say "serviceB") and Kafka will ensure that each instance gets assigned a mutually exclusive set of topic partitions for all the topics it subscribes to.
Since all instances of C will have a different group.id (say "serviceC"), they will also get the same messages as the instances of B, but because they are in an independent Consumer Group each message goes to only 1 of the N instances of C, up to a maximum number of instances equal to the total number of topic partitions.
You can dynamically and independently scale up or down the number of instances of B and C. If any instance dies, the remaining instances will automatically rebalance their assigned topic partitions and take over processing of the messages for the instance that died.
Data never has to be stored more than once so there is still one single commit log or "source of truth" for all these service instances.
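A minimal sketch of this with the kafka-python client (the topic name, broker address, and processing logic are just placeholders):

from kafka import KafkaConsumer

# every instance of service B uses the same group.id, so Kafka assigns each
# instance a disjoint subset of the topic's partitions
consumer = KafkaConsumer(
    "serviceA-events",
    bootstrap_servers="localhost:9092",
    group_id="serviceB",          # instances of C would use group_id="serviceC"
)

for record in consumer:
    # only one member of the "serviceB" group receives this record;
    # the "serviceC" group independently receives its own copy
    print("B instance processing:", record.value)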
Kafka has built-in support for this scenario.
You can create two Consumer Groups, one for B and the other for C. Both Consumer Groups subscribe to messages from A.
Any message published by A will be sent to both groups. However, only one member of each group can receive the message.
These are the changes you need to make to achieve the same with RabbitMQ:
Create 2 separate queues, one each for services B and C, and bind both to the exchange that A publishes to.
Change your logic so that all instances of a service read from that service's queue; RabbitMQ will deliver each message on a queue to only one of the competing consumers (e.g. using a blocking connection in your RabbitMQ client).
This way, when multiple instances of B and C are running, each service still gets the message once, and it remains scalable.
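A minimal pika sketch of that layout (exchange and queue names are invented, and a local broker is assumed): A publishes to a fanout exchange, each service has one shared queue bound to it, and the instances of a service compete on their own queue:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# one fanout exchange, one shared queue per service type
channel.exchange_declare(exchange="serviceA-events", exchange_type="fanout")
for queue in ("svcB", "svcC"):
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(exchange="serviceA-events", queue=queue)

# every instance of service B runs this; RabbitMQ delivers each message
# on "svcB" to exactly one of the competing instances
channel.basic_qos(prefetch_count=1)

def handle(ch, method, properties, body):
    print("instance of B got:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="svcB", on_message_callback=handle)
channel.start_consuming()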
You can also test this use case in kafka with the command line tools.
You create a producer with
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Then, you can create two different Consumer Groups (cgB, cgC) with
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --consumer-property group.id=cgB
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --consumer-property group.id=cgC
As soon as you send a message to the topic, both groups (B, C) will receive the message, but each will track which messages it has processed independently.
Better explained here: Kafka quickstart

ActiveMQ, Network of brokers, offline durable subscriber dedupe

Scenario: Two ActiveMQ nodes A, B. No master slave, but peers, with network connectors between them.
A durable topic subscriber is registered with both (as it uses failover and at one point connects to A and at another point connects to B).
Issue: If the subscriber is online against A, a copy of each message is also placed in the offline durable subscription on B.
Question: Is this by design? Can this be configured so that a message is deduped and only sent to the subscriber in one of subscriptions?
Apparently by-design: http://activemq.apache.org/how-do-distributed-queues-work.html
See "Distributed Topics in Store/Forward" where it says:
For topics the above algorithm is followed except, every interested client receives a copy of the message - plus ActiveMQ will check for loops (to avoid a message flowing infinitely around a ring of brokers).

Select consumers before publishing a message rabbitmq

I am trying to build a system where I need to select the next available and suitable consumer to send a message to from a queue (or maybe another solution that doesn't use a queue).
Requirements
We have multiple publishers/clients who send objects (images) to process on one side, and multiple Analysts who process them; once an object is processed, the publisher should get the corresponding response.
The publishers do not care which Analyst is going to process the data.
Users have a web app where they can map each client/publisher to one, several, or all agents. For instance, if Publisher P1 is mapped to Agents A & B, all objects coming from P1 can be processed by Agent A or Agent B. Note: an object can be processed by only one agent.
Depending on the mapping, I should have a middleware which consumes the messages from all publishers and distributes them to the agents.
Solution 1
My initial thoughts were to have a queue where all publishers post their messages, and another queue where agents publish a message saying they are waiting to process an object.
A middleware picks up a message, gets the list of agents it can send the message to (from a cached database), goes through the agents queue to find the next suitable and available agent, and publishes the message to that agent.
The issue with this solution: if I have an agents queue like a, b, c, d and the message I receive can only be processed by agent b, I will be rejecting agents d & c and they will end up at the tail of the queue. I have around 180 agents, so they might never be picked; or, if the next message can only be processed by agent d (for example), we have to reject all the other agents to get there.
Solution 2
The first bit, from publishers to middleware, is still the same.
Have a scaled, fast nosql database where agents add a record to signal their availability. Basically a key-value pair.
The middleware gets the config from cache, gets the next available + suitable agent from the nosql database, sends the message to that agent's queue (through a direct exchange), updates the nosql record to set isavailable to false, and gets the next message.
The issue with this solution is that the db and the middleware can become a bottleneck. Also, if I scale the middleware I will end up with database concurrency issues: for example, if I have two copies of the middleware running, each receives a message which can be processed by Agents A & B, and both agents are available.
The two middleware copies would query the db, might both get A as available, and end up sending both messages to A while B is still waiting for a message to process.
I will have around 100 publishers and 180 agents to start with.
Any ideas on how to improve these solutions, or any other feasible solution, would be highly appreciated.
Depending on this I also need to figure out how the Agent would send response back to the publisher.
Thank you
I'll answer this from the perspective of my open-source service bus: Shuttle.Esb.
Typically one would ignore any content-based routing and simply have a distributor pattern. All messages go to the primary endpoint and it will distribute them. However, if you decide to stick to these logical groupings, you could have a primary endpoint for each logical grouping (per agent group). You would still have the overall primary endpoint, but instead of having worker endpoints mapped to agents, you would have agent groupings map to the logical primary endpoint, with workers backing that.
Then, in the primary endpoint, you would forward the message, based on your content (the agent identifier), to the relevant logical primary endpoint, all the while keeping track of the original sender. In the worker you would then send a message back to the queue of the original sender.
I'm sure you could do pretty much the same using any service bus.
I see several requirements in here, that can be boiled down to a few things, I think:
publisher does not care which agent processes the image
publisher needs to know when the image processing is done
agent can only process 1 image at a time
agent can only process certain images
are these assumptions correct? did I miss anything important?
if not, then your solution is pretty much built into RabbitMQ with routing and queues. there should be no need to build a custom middle-tier service to manage this.
With RabbitMQ, you can have a consumer set to only process 1 message at a time. The consumer sets its "prefetch" limit to 1, and retrieves a message from the queue with "no ack" set to false - meaning, it must acknowledge the message when it is done processing it.
To consume only messages that a particular agent can handle, use RabbitMQ's routing capabilities with multiple queues. The queues would be created based on the type of image or some other criteria by which the consumers can select images.
For example, if there are two types of images: TypeA and TypeB, you would have 2 queues - one for TypeA and one for TypeB.
Then, if Agent1 can only handle TypeA images, it would only consume from the TypeA queue. If Agent2 can handle both types of images, it would consume from both queues.
To put the right images in the right queue, the publisher would need to use the right routing key. If you know the image type (or whatever the selection criteria is), you would change the routing key on the publisher side to match that selection criteria. The routing in RabbitMQ would be set up to move messages for TypeA into the TypeA queue, etc.
The last part is getting a response when the image is done processing. That can be accomplished through RabbitMQ's "reply to" field and related code. The gist of it is that the publisher has its own exclusive queue. When it publishes a message, it includes the name of its exclusive queue in the "reply to" header of the message. When the agent finishes processing the image, it sends a status update message back through the queue found in the "reply to" header. That status update message tells the producer the status of the request.
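A rough pika sketch of the agent side under those assumptions (the "typeA" queue name is illustrative, and it assumes the publisher sets the "reply to" and correlation id properties on its messages):

import pika

def process_image(data):
    return b"processed"   # placeholder for the real image-processing logic

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# only one unacknowledged message at a time for this agent
channel.basic_qos(prefetch_count=1)

def on_image(ch, method, properties, body):
    result = process_image(body)
    # send the status update back to the publisher's exclusive queue
    ch.basic_publish(
        exchange="",
        routing_key=properties.reply_to,
        properties=pika.BasicProperties(correlation_id=properties.correlation_id),
        body=result,
    )
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Agent1 only handles TypeA images, so it only consumes from the "typeA" queue
channel.basic_consume(queue="typeA", on_message_callback=on_image)
channel.start_consuming()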
From a RabbitMQ perspective, these pieces can be put together using the examples and documentation found here:
http://www.rabbitmq.com/getstarted.html
Look at these specifically:
Work Queues: http://www.rabbitmq.com/tutorials/tutorial-two-python.html
Topics: http://www.rabbitmq.com/tutorials/tutorial-five-python.html
RPC (aka Request/Response): http://www.rabbitmq.com/tutorials/tutorial-six-python.html
You'll find examples in many languages, in these docs.
I also cover most of these scenarios (and others) in my RabbitMQ Patterns eBook
Since the total number of senders and receivers is only in the hundreds, how about creating one queue for each of your senders? Based on your sender-receiver mapping, receivers subscribe to the sender queues (and update the subscriptions when the mapping changes). You could configure each receiver to only take the next message from the queues it subscribes to (in a random order) when it finishes processing one message.
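A small pika sketch of that idea (the per-publisher queue names and the mapping are invented, and the queues are assumed to already exist): the receiver polls only the sender queues it is mapped to, in random order, and takes one message at a time:

import random
import time
import pika

# the sender queues this receiver is mapped to (from the web-app mapping)
my_queues = ["publisher.P1", "publisher.P7"]

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

while True:
    random.shuffle(my_queues)
    got_one = False
    for queue in my_queues:
        method, properties, body = channel.basic_get(queue=queue, auto_ack=False)
        if method is None:
            continue                      # this queue is empty, try the next one
        print("processing", body)         # placeholder for the real work
        channel.basic_ack(delivery_tag=method.delivery_tag)
        got_one = True
        break                             # finish one message before taking the next
    if not got_one:
        time.sleep(1)                     # nothing pending anywhere; back off briefly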

"Mirror" rabbitmq queue on the same node

I have a RabbitMQ queue.
The consumer of this queue is an application that pulls messages off the queue, and inserts them into a database (after some processing).
I want to also be able to use these messages for something else (to send them to another application for storage and other, unrelated processing).
The consumer application is closed source, so I can't open it up and change its behaviour.
I think the best way of achieving my goal would be to mirror the rabbit queue and consume it independently of (and without interfering with) the original message flow.
I've looked at RabbitMQ mirroring, but this seems to be designed to operate on two or more nodes in a master/slave configuration.
What I think I want is:
Pre-processor application > rabbit_queue_1 > Normal DB consumer
\
> rabbit_queue_2 > New independent consumer.
I need both consumers to get all the same messages, so I don't want two applications reading from the same queue, or a new consumer that reads off the queue and then puts messages back onto it again.
Mirroring is a high-availability solution and is inappropriate for what you are asking.
Instead, consider that RabbitMQ splits the publishing and consuming functions. If the existing program is publishing to RabbitMQ, simply figure out the routing key used for the current application's queue, and use that routing key when binding your own queue.
Messages published with a matching routing key will flow to every queue bound with that key. Special cases include fanout exchanges (which ignore routing keys and deliver to every bound queue) and topic exchanges (which additionally permit wildcards in binding keys).
Using a direct exchange, your topology is actually:
Pre-processor application > Direct exchange > rabbit_queue_1 > Normal DB consumer
(via routing key) \
> rabbit_queue_2 > New independent consumer.
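A small pika sketch of that topology (the "preprocessor" exchange name, the "images" routing key, and the queue names are assumptions about your setup; the point is only that a second queue bound with the same routing key gets its own copy of every message):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="preprocessor", exchange_type="direct", durable=True)

# the existing queue read by the closed-source DB consumer
channel.queue_declare(queue="rabbit_queue_1", durable=True)
channel.queue_bind(exchange="preprocessor", queue="rabbit_queue_1", routing_key="images")

# the new, independent queue: bound with the same routing key, it receives
# a separate copy of every message published to the exchange
channel.queue_declare(queue="rabbit_queue_2", durable=True)
channel.queue_bind(exchange="preprocessor", queue="rabbit_queue_2", routing_key="images")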

ActiveMQ network of brokers doesn't forward messages

I had two ActiveMQ brokers (A and B) that were configured as a store-and-forward network. They work perfectly for forwarding messages from A to B when there is a consumer connected to broker B and the producer sends messages to A. The problem is that when the consumer is killed and reconnected to A, the queued messages on B (which were forwarded from A) are not forwarded back to A, where the consumer is now connected. Even if I send new messages to B, all messages are stuck on B until I restart the brokers. I have tried setting networkTTL="4" and duplex="true" on the broker network connector, but it doesn't work.
Late answer, but hopefully this will help someone else in the future.
Messages are getting stuck in B because by default AMQ doesn't allow messages to be sent back to a broker to which they have previously been delivered. In the normal case, this prevents messages from going in cycles around mesh-like network topologies without getting delivered, but in the failover case it results in messages stuck on one broker and unable to get to the broker where all the consumers are.
To allow messages to go back to a broker if the current broker is a dead-end because there are no consumers connected to it, you should use replayWhenNoConsumers=true to allow forwarding messages that got stuck on B back to A.
That configuration option, some settings you might want to use in conjunction with it, and some considerations when using it, are described in the "Stuck Messages (version 5.6)" section of http://activemq.apache.org/networks-of-brokers.html, http://tmielke.blogspot.de/2012/03/i-have-messages-on-queue-but-they-dont.html, and https://issues.apache.org/jira/browse/AMQ-4465. Be sure that you can live with the side effects of these changes (e.g. the potential for duplicate message delivery of other messages across your broker-to-broker network connections).
Can you give more information on the configuration of broker A and B, as well as what you are trying to achieve?
It seems to me you could achieve what you want by setting up a network of brokers (with A and B), with the producer only connecting to one and the consumer to the other.
The messages will automatically be transmitted to the other broker as long as the other broker has an active subscription to the destination the message was sent to.
I would not recommend changing the networkTTL if you are not sure of the consequences (it tends to lead to unwanted message loops).