Apache ActiveMQ 5.5+ Broker networks - activemq

I'm trying to create a network of brokers with two 'domains': a core and a distribution. The difference between them is that no topics published to in the distribution domain will be allowed to flow into core.
broker_core_primary
broker_core_secondary
broker_dist_primary
broker_dist_secondary
The message flow would be as follows:
broker_core_primary <---> broker_core_secondary
broker_dist_primary <---> broker_dist_secondary
core (broker_core_primary,broker_core_secondary) ----> dist(broker_dist_primary,broker_dist_secondary)
I've got this working with the below configs but it does not gracefully recover from the loss of any one broker and reintroduction of that broker makes things even worse.
Any help would be greatly appreciated. I'm open to changing topologies as well, so long as I can keep the concept of a separate publication domain so that topics published in dist can be blocked from consumption on core brokers.
broker_core_primary
broker_core_secondary
broker_core_primary" duplex="true" networkTTL="5" uri="static:(tcp://broker_core_primary:61616)?maxReconnectDelay=5000,useExponentialBackOff=false,randomize=false,trace=true" userName="brokerBridge" password="REMOVED" />
broker_dist_primary
broker_core" duplex="false" networkTTL="5" uri="static:(tcp://broker_core_primary:61616,tcp://broker_core_secondary:61616)?maxReconnectDelay=5000,useExponentialBackOff=false,randomize=false,trace=true" userName="brokerBridge" password="REMOVED" />
broker_dist_secondary
broker_core" duplex="false" networkTTL="5" uri="static:(tcp://broker_core_primary:61616,tcp://broker_core_secondary:61616)?maxReconnectDelay=5000,useExponentialBackOff=false,randomize=false,trace=true" userName="brokerBridge" password="REMOVED" />
broker_dist_primary" networkTTL="5" duplex="true" uri="static:(tcp://broker_dist_primary:62626)?maxReconnectDelay=5000,useExponentialBackOff=false,randomize=false,trace=true"
userName="brokerBridge" password="REMOVED">

Please define "not gracefully recover".
Just some possible mistakes:
Your producers and consumers need to be aware of all cluster nodes.
For a network of 4 brokers in (basically) a hypercube, a networkTTL of 5 is overkill and could have unwanted effects.
A different way of blocking destinations from being published to connected network nodes is to exclude them in the config. Excluded (or included) destinations are configured on the network connectors (see the documentation).
PS: please format your questions better and use real XML from your config; it's really hard to read.

Related

RabbitMQ: Shovel vs Federation for Microservice Communication

I've spent quite a bit of time trying to figure out whether I should use the RabbitMQ federation plugin or shovel.
Basically I have two microservices. I want one of them to send a message to another. Each microservice has a different rabbitMQ cluster, so I need to use Federation/shovel.
I read the post "When to use RabbitMQ shovels and when Federation plugin?" and still couldn't figure it out or make a decision.
I want to satisfy the following:
Loose coupling
Microservices don't know about each other, i.e. the first microservice emits a message saying "I'm done doing x", and the second microservice just listens for that 'event' and acts accordingly.
In the future I 'might' want to add more microservices, each with their own rabbitMQ cluster / vhost.
Based on this information - what do you recommend, shovel or federation?
Why not just have one cluster for everything? RabbitMQ is built for handling 10k+ exchanges and queues; actually there is no upper limit except memory or disk space. Setting up a cluster for each microservice is too much work and creates unnecessary overhead. Vhosts should not be used for this either; use one per business area.
I'm only using shovels, and I use them to transfer messages from my production environment to test, so I can test with real data. It's very easy to set up with scripts. And yes, you should only do this with scripts. Using the UI is too slow.
I know this doesn't answer your question directly, but I hope it has given you some food for thought.
Happy messaging!

Correct way to set up ActiveMQ network of brokers

As explained in this question, we have a network of brokers consisting of three brokers on different servers.
The network connectors are configured as follows:
<networkConnectors>
<networkConnector uri="static:(ssl://broker2:61616,ssl://broker3:61616)" networkTTL="5"/>
</networkConnectors>
We are also considering adding the following parameters to the network connector, as we think this might improve the behavior (following advice in this blog post):
dynamicOnly="true"
decreaseNetworkConsumerPriority="true"
suppressDuplicateQueueSubscriptions="true"
However, it is also scary to do as we feel we do not fully understand what is happening right now and so cannot really be sure of the effect these settings will have on the behavior. The official documentation is not really clear on this (neither on this point nor many others by the way).
UPDATE:
What we want to achieve is that messages are as much as possible handled on the broker where they first arrive. Clients (as shown in the other post) are connected via Wifi, but have a fallback to 4G. In practice, we see that they regularly switch network and therefore connect to a different broker. We want to limit the traffic over the network connectors.
These settings should get you that 'prefer local' behavior you want:
decreaseNetworkConsumerPriority="true"
suppressDuplicateQueueSubscriptions="true"
Also, add messageTTL="4" and consumerTTL="1". This lets a message make up to n + 1 hops, where n is the number of brokers in your cluster. consumerTTL="1" means each broker only sees consumers on its immediate peers, not across multiple hops.
In your use case, drop the networkTTL setting: messageTTL and consumerTTL replace it and give you more control over message hops and consumer awareness.

How to scale out apache atlas

The Atlas documentation gives no information on how to scale it.
Apache Atlas is backed by Cassandra or HBase, which can scale out, but I don't know how the Atlas engine itself (the REST web service and request processor) can scale out.
I can install multiple instances of it on different machines and put a load balancer in front to fan out requests. But would this model help? Does Atlas do any kind of locking or DB transactions that would stop this model from working?
Does anyone know how Apache Atlas scales out?
Thanks.
So Apache Atlas runs Kafka as the message queue under the covers, and in my experience, the way they have designed the Kafka queue (consumer group that says you should ONLY have ONE consumer) is the choke point.
Not only that, when you look at the code, the consumer has a poll time for the broker of 1 sec hard coded into the consumer. Put these two together, and that means that if the consumer can't process the messages from the various producers (HIVE, Spark, etc) within that second, the broker then disengages the ONLY consumer, and waits for a non-existent consumer to pick up messages...
I need to design something similar, but this is as far as I have got...
Hope that helps somewhat...
Please refer to this page. http://atlas.apache.org/#/HighAvailability
Atlas does not support actual horizontal scale-out.
All requests are handled by the 'Active instance'. The 'Passive instances' just forward all requests to the 'Active instance'.

Publish/subscribe listen for all topic #

I am using Apache ActiveMQ as a broker, and sensor devices continuously publish data to it.
I want to write a subscriber to # which collects all the data that the broker receives.
Can a client implementation really scale for this kind of operation? What are the things to consider when setting up such a client (subscriber)?
Thanks and regards
Whether a separate client subscribed to # will work depends very much on the throughput/load levels involved and how well that client is written.
Some brokers (e.g. HiveMQ) have plugins that will do DB persistence directly at the broker level which for very high throughput situations may be better.

How to load balance ActiveMQ with persistent messages

I have a middleware based on Apache Camel which does a transaction like this:
from("amq:job-input")
    .to("inOut:businessInvoker-one") // Into business processor
    .to("inOut:businessInvoker-two")
    .to("amq:job-out");
Currently it works perfectly, but I can't scale it up, let's say from 100 TPS to 500 TPS. To speed up the transaction, I have already:
raised the concurrent consumers setting and used an empty businessProcessor
configured JAVA_XMX and PERMGEN
According to the ActiveMQ web console, many messages are waiting to be processed in the 500 TPS scenario. I guess one solution is to scale ActiveMQ up, so I want to use multiple brokers in a cluster.
According to http://fuse.fusesource.org/mq/docs/mq-fabric.html (section "Topologies"), configuring ActiveMQ in clustering mode is suitable for non-persistent messages. IMHO, it is true that it's not suitable for persistent messages, because all running brokers use the same store file. But what about separating the store files? That is possible now, right?
Could anybody explain this? If it's not possible, what is the best way to load balance persistent messages?
Thanks
You can share the load of persistent messages by creating 2 master/slave pairs. The master and slave in each pair share their state either through a database or a shared filesystem, so you need to duplicate that setup for each pair.
Create 2 master/slave pairs, and configure so-called "network connectors" between the 2 pairs. This will double your performance without risk of losing messages.
See http://activemq.apache.org/networks-of-brokers.html
This answer relates to a version of the question from before the Camel details were added.
It is not immediately clear what exactly it is that you want to load balance and why. Messages across consumers? Producers across brokers? What sort of concern are you trying to address?
In general you should avoid using networks of brokers unless you are trying to address some sort of geographical use case, have too many connections for a single broker to handle, or a single broker (which could be a pair of brokers configured in HA) is not giving you the throughput that you require (in 90% of cases it will).
In a broker network, each node has its own store and passes messages around by way of a mechanism called store-and-forward. Have a read of Understanding broker networks for an explanation of how this works.
ActiveMQ already works as a kind of load balancer by distributing messages evenly in a round-robin fashion among the subscribers on a queue. So if you have 2 subscribers on a queue and send it a stream of messages A, B, C, D, one subscriber will receive A and C, while the other receives B and D.
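To make the round-robin behaviour concrete, here is a minimal plain-Java sketch. It does not use ActiveMQ at all; the class name RoundRobinDispatch and the method dispatch are made up for illustration. A real broker also takes things like prefetch limits into account, so treat this only as a model of the even distribution described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of round-robin queue dispatch; not ActiveMQ code.
class RoundRobinDispatch {

    // Returns one list per subscriber holding the messages it would receive
    // if messages were handed out strictly in turn, in arrival order.
    static List<List<String>> dispatch(List<String> messages, int subscriberCount) {
        if (subscriberCount <= 0) {
            throw new IllegalArgumentException("need at least one subscriber");
        }
        List<List<String>> received = new ArrayList<>();
        for (int i = 0; i < subscriberCount; i++) {
            received.add(new ArrayList<>());
        }
        for (int i = 0; i < messages.size(); i++) {
            // Message i goes to subscriber (i mod subscriberCount).
            received.get(i % subscriberCount).add(messages.get(i));
        }
        return received;
    }

    public static void main(String[] args) {
        // prints [[A, C], [B, D]]
        System.out.println(dispatch(List.of("A", "B", "C", "D"), 2));
    }
}
```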
If you want to take this a step further and group related messages on a queue so that they are processed consistently by only one subscriber, you should consider Message Groups.
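With Message Groups, the producer sets the JMSXGroupID header and the broker keeps every message of a group on the same consumer. The sketch below only models that "sticky" ownership in plain Java; it is not how ActiveMQ implements it, and the names StickyGroupDispatch and assign are hypothetical. It assumes a new group is simply given to the next consumer in turn.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of message-group ownership; not ActiveMQ code.
class StickyGroupDispatch {

    // Takes the group id of each message in arrival order and returns which
    // consumer index owns each group. A group is assigned once, on its first
    // message, and every later message with that id goes to the same consumer.
    static Map<String, Integer> assign(List<String> groupIds, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        Map<String, Integer> owner = new LinkedHashMap<>();
        int next = 0;
        for (String groupId : groupIds) {
            if (!owner.containsKey(groupId)) {
                // Assumption: new groups are handed out in turn.
                owner.put(groupId, next % consumerCount);
                next++;
            }
        }
        return owner;
    }

    public static void main(String[] args) {
        // prints {order-1=0, order-2=1, order-3=0}
        System.out.println(assign(List.of("order-1", "order-2", "order-1", "order-3"), 2));
    }
}
```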
Adding consumers might help up to a point (depending on the number of cores/CPUs your server has). Adding threads beyond the point where your "Camel server" is already using all available CPU for the business processing makes no sense and can be counterproductive.
Adding more ActiveMQ machines is probably needed. You can use an ActiveMQ "network" to communicate between instances that have separate persistence files. It should be straightforward to add more brokers and put them into a network.
Performance test as you go to find out what load the broker can handle and what load the Camel processor can handle (if they are on different machines).
When you do persistent messaging - you likely also want transactions. Make sure you are using them.
If all running brokers use the same store file or tx-supported database for persistence, then only the first broker to start will be active, while others are in standby mode until the first one loses its lock.
If you want to load balance your persistence, there are two approaches you could try:
Configure several brokers in network-bridge mode, then send messages to any one of them and consume messages from more than one of them. This load balances both the brokers and the persistence.
Override the persistenceAdapter and use database-sharding middleware (such as tddl: https://github.com/alibaba/tb_tddl) to store the messages by partition.
Your first step is to increase the number of workers that are processing from ActiveMQ. The way to do this is to add the ?concurrentConsumers=10 attribute to the starting URI. The default behaviour is that only one thread consumes from that endpoint, leading to a pile up of messages in ActiveMQ. Adding more brokers won't help.
Secondly, what you appear to be doing could benefit from a Staged Event-Driven Architecture (SEDA). In a SEDA, processing is broken down into a number of stages, which can have different numbers of consumers on them to even out throughput. Your threads consuming from ActiveMQ only do one step of the process, hand off the Exchange to the next phase and go back to pulling messages from the input queue.
Your route can therefore be rewritten as 2 smaller routes:
from("activemq:input?concurrentConsumers=10").id("FirstPhase")
.process(businessInvokerOne)
.to("seda:invokeSecondProcess");
from("seda:invokeSecondProcess?concurrentConsumers=20").id("SecondPhase")
.process(businessInvokerTwo)
.to("activemq:output");
The two stages can have different numbers of concurrent consumers so that the rate of message consumption from the input queue matches the rate of output. This is useful if one of the invokers is much slower than another.
The seda: endpoint can be replaced with another intermediate activemq: endpoint if you want message persistence.
Finally to increase throughput, you can focus on making the processing itself faster, by profiling the invokers themselves and optimising that code.
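If it helps to see the staged idea outside of Camel, here is a rough plain-Java sketch of two stages with independently sized worker pools and an in-memory hand-off queue, which is roughly the role the seda: endpoint plays above. Everything in it (TwoStagePipeline, run, the worker counts, the two operator arguments standing in for the invokers) is made up for illustration, and it is not how Camel implements SEDA.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.UnaryOperator;

// Hypothetical two-stage pipeline; a model of the idea, not Camel code.
class TwoStagePipeline {

    // Runs every input through invokerOne, then invokerTwo, using a separate
    // fixed-size thread pool per stage. Returns the results sorted, because
    // completion order across threads is not deterministic.
    static List<String> run(List<String> input, int stageOneWorkers, int stageTwoWorkers,
                            UnaryOperator<String> invokerOne, UnaryOperator<String> invokerTwo) {
        Queue<String> inputQueue = new ConcurrentLinkedQueue<>(input);
        BlockingQueue<String> handoff = new LinkedBlockingQueue<>();
        Queue<String> output = new ConcurrentLinkedQueue<>();
        CountDownLatch done = new CountDownLatch(input.size());

        ExecutorService stageOne = Executors.newFixedThreadPool(stageOneWorkers);
        ExecutorService stageTwo = Executors.newFixedThreadPool(stageTwoWorkers);

        for (int i = 0; i < stageOneWorkers; i++) {
            stageOne.execute(() -> {
                String message;
                // Stage one: do the first step, hand off, go back for more input.
                while ((message = inputQueue.poll()) != null) {
                    handoff.add(invokerOne.apply(message));
                }
            });
        }
        for (int i = 0; i < stageTwoWorkers; i++) {
            stageTwo.execute(() -> {
                try {
                    while (true) {
                        // Stage two: block until stage one hands something over.
                        String message = handoff.take();
                        output.add(invokerTwo.apply(message));
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdown signal
                }
            });
        }

        try {
            done.await(); // wait until every message has left stage two
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            stageOne.shutdown();
            stageTwo.shutdownNow(); // interrupts workers blocked in take()
        }
        List<String> result = new ArrayList<>(output);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // prints [a-one-two, b-one-two, c-one-two]
        System.out.println(run(List.of("a", "b", "c"), 2, 4, s -> s + "-one", s -> s + "-two"));
    }
}
```

Giving the slower stage more workers than the faster one is the whole point; the numbers 2 and 4 here are arbitrary and would need tuning against real measurements.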