We have an architecture in which 3 instances of RabbitMQ (with multiple clusters) are set up in 3 different data centers; these are the federation upstreams.
There's one instance of RabbitMQ at a different data center, acting as the downstream, to which messages from the other 3 upstreams are federated.
Clients connect to our STOMP service, which is set up to connect to this single RabbitMQ instance and so receives the messages from all the upstreams.
But this single downstream can potentially go down, and the clients would then not receive any messages. So my questions are:
1. Is it possible to have a redundant downstream setup?
2. Can we set up multiple downstreams, for example, also a downstream in one of the 3 data centers?
3. If so, how can we make sure that the messages are not duplicated among the 2 (or more) downstreams?
4. Finally, are there any other ways to tackle this problem?
I have around 300 different consumers / 300 message types / 300 queues, with wildly varied functionality behind them.
From the extreme side:
Is the best choice to make 1 Windows service (easier to deploy) with 300 consumers listening, or 300 Windows services (easier to split between devs), each with a single independent consumer but impossible for support to maintain?
update: from 1 to 300 queues
RabbitMQ can support hundreds of queues simultaneously, and each queue should be responsible for one specific type of message, e.g. a response status, an online order, or a stack trace to be processed further by some other unit of work. These three are not the same, and if you are keeping them all in one queue, please segregate them into different queues.
Keeping all the data in one queue will also affect your application's performance, because each queue delivers its messages sequentially; since you have 300 consumers waiting for 300 types of messages, almost all of them could be sitting in a waiting state. It also forces a complex decision-making algorithm, if you are using one to figure out the correct consumer.
What could also go wrong with a single queue is that it becomes a bottleneck that can obstruct the functioning of the whole application if that queue fails, because every consumer listens to it. With different queues, the rest of the system can still process messages if one particular queue has an issue.
Instead of going for 1 consumer per service, check whether there is anything in common and whether the services can take on more than one consumer each after increasing the number of queues from 1 to many.
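As a rough illustration of splitting one catch-all queue into per-message-type queues (the queue names here are invented, and a local broker plus the stock RabbitMQ Java client are assumed):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class DeclareQueues {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker location

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            // One queue per message type instead of one shared queue.
            String[] queues = {"response.status", "online.order", "stack.trace"};
            for (String queue : queues) {
                // durable=true, exclusive=false, autoDelete=false, no extra arguments
                channel.queueDeclare(queue, true, false, false, null);
            }
        }
    }
}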
I'm having problems finding any information on how to scale RabbitMQ consumers, specifically, how to work with multiple instances of the same component.
Say I have two components; A and B. I have three instances of each component set up as an HA cluster. Let's say A.1 sends a message with a key which matches B. I only want one instance of B to consume this message not all 3 of them.
Can you point me to some documentation which explains how this can be done? Ideally, some information about the load balancing approach adopted would be appreciated.
This should not be a problem, as RabbitMQ uses a variety of asynchronous architectural patterns to decouple applications, and one of them is round-robin dispatching.
Round-Robin
By default, RabbitMQ dispatches (or pre-assigns) each message to the next consumer in sequence as soon as it enters the queue. It dispatches messages evenly, so on average every consumer will get the same number of messages.
A shortcoming of this approach is when messages use uneven resources. In a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other will do little work.
As shown in the sketch below, both consumers will get messages in a round-robin manner. So in your case, if the three instances bind to the same queue, each message will go to only one of the consumers; the KEY is that they should bind to a common queue.
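A minimal sketch of this with the RabbitMQ Java client (queue and consumer names are illustrative, and a local broker is assumed):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class RoundRobinConsumers {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker location

        Connection connection = factory.newConnection();
        // Three consumers bound to the same queue: each message is delivered
        // to exactly one of them, in round-robin order.
        for (String name : new String[] {"B.1", "B.2", "B.3"}) {
            Channel channel = connection.createChannel();
            channel.queueDeclare("component-b", true, false, false, null);
            channel.basicConsume("component-b", true,
                    (consumerTag, delivery) ->
                            System.out.println(name + " received: " + new String(delivery.getBody())),
                    consumerTag -> { });
        }
    }
}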
Let's say I have 3 subscribers A, B, C for one topic, and I want A and B to be treated as the "same subscriber", meaning that between them they get only one copy of each message, while C gets another copy.
I find that http://activemq.apache.org/virtual-destinations.html is one way. But what if I can't change the activemq broker's config?
I wonder if there is some "id" property that can make two subscribers be treated as one, like the group id in Kafka?
I presume you are using JMS from your client. ActiveMQ 5.x only supports JMS 1.1, which does not allow load-balanced topics. JMS 2.0 does, and is implemented in ActiveMQ Artemis, but that's another product.
However, you can use Virtual Topics without config changes, but you have to change your naming of topics and queues.
Publish to topic: VirtualTopic.[TopicName] and consume from queues: Consumer.[LogicalConsumerId].VirtualTopic.[TopicName].
I.e.
Publish orders to VirtualTopic.Orders
Consume from:
Consumer.ManufacturingSystem.VirtualTopic.Orders
Consumer.CRMSystem.VirtualTopic.Orders
etc.
As the consumers consume from a queue, they can use different (or no) ClientId values and still get load balancing among the nodes in each system. I.e. the CRMSystem may have two nodes and will still receive each message only once in total.
This naming convention can be customized if you change the ActiveMQ config, but it works OOTB.
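For illustration, a minimal JMS sketch of this naming convention (the broker URL and the CRMSystem consumer id are assumptions; the consumer is created before publishing so the backing queue already exists):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;

public class VirtualTopicSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Consumer for one logical system; a second node of CRMSystem would
        // consume from the same queue and share the load.
        MessageConsumer crm = session.createConsumer(
                session.createQueue("Consumer.CRMSystem.VirtualTopic.Orders"));

        // Publisher sends to the virtual topic; the broker copies the message
        // into each Consumer.*.VirtualTopic.Orders queue.
        MessageProducer producer = session.createProducer(
                session.createTopic("VirtualTopic.Orders"));
        producer.send(session.createTextMessage("new order"));

        TextMessage received = (TextMessage) crm.receive(5000);
        System.out.println("CRMSystem received: " + received.getText());

        connection.close();
    }
}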
In our ecosystem we want to use two RabbitMQ brokers for the NServiceBus transport.
I observed in a spike that I am able to get messages from both brokers by instantiating two bus instances (on the worker). No other change was required for the handlers.
Is this approach ok for integration scenarios (with other systems) in case other systems are using a different RabbitMQ broker? Or in case we want to use an additional RabbitMQ broker for failover (with some custom code to switch the publish/send to the available node)?
I have a middleware based on Apache Camel which does a transaction like this:
from("amq:job-input")
to("inOut:businessInvoker-one") // Into business processor
to("inOut:businessInvoker-two")
to("amq:job-out");
Currently it works perfectly, but I can't scale it up, say from 100 TPS to 500 TPS. To speed up the transaction I have already:
1. raised the concurrent consumers settings and used an empty businessProcessor
2. configured JAVA_XMX and PERMGEN
According to the ActiveMQ web console, there are a lot of messages waiting to be processed in the 500 TPS scenario. I guess one of the solutions is to scale ActiveMQ up, so I want to use multiple brokers in a cluster.
According to http://fuse.fusesource.org/mq/docs/mq-fabric.html (section "Topologies"), configuring ActiveMQ in clustering mode is only suitable for non-persistent messages. IMHO it is true that it's not suitable, because all running brokers use the same store file. But what about separating the store files? That is possible now, right?
Could anybody explain this? If it's not possible, what is the best way to load balance persistent messages?
Thanks
You can share the load of persistent messages by creating 2 master/slave pairs. The master and slave share their state either through a database or a shared filesystem, so you need to duplicate that setup.
Create 2 master/slave pairs and configure so-called "network connectors" between the 2 pairs. This will double your performance without the risk of losing messages.
See http://activemq.apache.org/networks-of-brokers.html
This answer relates to a version of the question before the Camel details were added.
It is not immediately clear what exactly it is that you want to load balance and why. Messages across consumers? Producers across brokers? What sort of concern are you trying to address?
In general you should avoid using networks of brokers unless you are trying to address some sort of geographical use case, have too many connections for a single broker to handle, or a single broker (which could be a pair of brokers configured in HA) is not giving you the throughput that you require (in 90% of cases it will).
In a broker network, each node has its own store and passes messages around by way of a mechanism called store-and-forward. Have a read of Understanding broker networks for an explanation of how this works.
ActiveMQ already works as a kind of load balancer by distributing messages evenly in a round-robin fashion among the subscribers on a queue. So if you have 2 subscribers on a queue and send it a stream of messages A, B, C, D, one subscriber will receive A & C, while the other receives B & D.
If you want to take this a step further and group related messages on a queue so that they are processed consistently by only one subscriber, you should consider Message Groups.
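A minimal sketch of how a producer could tag related messages with a group id (the queue name and group value are made up; JMSXGroupID is the property ActiveMQ uses for Message Groups):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;

public class MessageGroupSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("ORDERS"));

        // All messages carrying the same JMSXGroupID are pinned to one consumer,
        // so related messages are processed by a single subscriber, in order.
        TextMessage message = session.createTextMessage("order line 1");
        message.setStringProperty("JMSXGroupID", "order-12345");
        producer.send(message);

        connection.close();
    }
}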
Adding consumers might help up to a point (it depends on the number of cores/CPUs your server has). Adding threads beyond the point where your "Camel server" is using all available CPU for the business processing makes no sense and can be counterproductive.
Adding more ActiveMQ machines is probably needed. You can use an ActiveMQ "network" to communicate between instances that have separate persistence files. It should be straightforward to add more brokers and put them into a network.
Make sure you performance test along the way so you know what kind of load the broker can handle and what load the Camel processor can handle (if on different machines).
When you do persistent messaging, you likely also want transactions. Make sure you are using them.
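As a rough sketch, one way to consume and forward inside a JMS local transaction with the Camel JMS/ActiveMQ component (the endpoint names mirror the question; whether transacted=true fits depends on your transaction setup):

import org.apache.camel.builder.RouteBuilder;

public class TransactedJobRoute extends RouteBuilder {
    @Override
    public void configure() {
        // A failure in either invoker rolls the message back onto job-input
        // instead of losing it.
        from("amq:job-input?transacted=true")
            .to("inOut:businessInvoker-one")
            .to("inOut:businessInvoker-two")
            .to("amq:job-out?transacted=true");
    }
}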
If all running brokers use the same store file or tx-supported database for persistence, then only the first broker to start will be active, while others are in standby mode until the first one loses its lock.
If you want to load balance your persistence, there are two ways you could try:
1. Configure several brokers in network-bridge mode, then send messages to any one of them and consume messages from more than one of them. This load balances both the brokers and the persistence.
2. Override the persistenceAdapter and use a database-sharding middleware (such as tddl: https://github.com/alibaba/tb_tddl) to store the messages by partitions.
Your first step is to increase the number of workers that are processing from ActiveMQ. The way to do this is to add the ?concurrentConsumers=10 attribute to the starting URI. The default behaviour is that only one thread consumes from that endpoint, leading to a pile up of messages in ActiveMQ. Adding more brokers won't help.
Secondly, what you appear to be doing could benefit from a Staged Event-Driven Architecture (SEDA). In a SEDA, processing is broken down into a number of stages which can have different numbers of consumers on them to even out throughput. Your threads consuming from ActiveMQ only do one step of the process, hand the Exchange off to the next phase and go back to pulling messages from the input queue.
Your route can therefore be rewritten as 2 smaller routes:
from("activemq:input?concurrentConsumers=10").id("FirstPhase")
.process(businessInvokerOne)
.to("seda:invokeSecondProcess");
from("seda:invokeSecondProcess?concurentConsumers=20").id("SecondPhase")
.process(businessInvokerTwo)
.to("activemq:output");
The two stages can have different numbers of concurrent consumers so that the rate of message consumption from the input queue matches the rate of output. This is useful if one of the invokers is much slower than another.
The seda: endpoint can be replaced with another intermediate activemq: endpoint if you want message persistence.
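For example, the same two stages with an assumed intermediate ActiveMQ queue instead of seda: (the queue name is illustrative):

from("activemq:input?concurrentConsumers=10").id("FirstPhase")
    .process(businessInvokerOne)
    .to("activemq:invokeSecondProcess");

from("activemq:invokeSecondProcess?concurrentConsumers=20").id("SecondPhase")
    .process(businessInvokerTwo)
    .to("activemq:output");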
Finally to increase throughput, you can focus on making the processing itself faster, by profiling the invokers themselves and optimising that code.