Does using ActiveMQ in Master/Slave mode with JDBC preclude use of journaling? - activemq

My group is looking to distribute our ActiveMQ queues across multiple brokers to achieve high availability. Of the three supported master-slave setups (pure, shared filesystem, JDBC) we are considering shared file system and JDBC.
I am seeing conflicting statements within the ActiveMQ documentation. Can a JDBC master-slave setup use ActiveMQ's high-performance journal or not?
On this page, ActiveMQ claims that it cannot use the high performance journal.
On this page, ActiveMQ suggests that the two can, in fact, be used together:
For long term persistence we recommend using JDBC coupled with our high performance journal.
Can anyone shed light on this apparent conflict?

You should not use journaling with JDBC master/slave because the journal is not replicated. Any messages in the master's journal that have not yet been batch-submitted to the JDBC store will be stranded until the master restarts; that is, the journal is not visible to the slave.
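The failure mode described above can be sketched as a toy simulation. This is not ActiveMQ code: `SharedJdbcStore`, `JournaledMaster`, and the batch size of 3 are hypothetical names and values chosen for illustration, under the assumption that the master writes to a local journal first and checkpoints to the shared database in batches.

```python
class SharedJdbcStore:
    """Stands in for the shared database that both brokers can read."""

    def __init__(self):
        self.rows = []


class JournaledMaster:
    """Hypothetical master that journals locally and checkpoints in batches."""

    def __init__(self, store, batch_size=3):
        self.store = store
        self.batch_size = batch_size
        self.journal = []  # local to the master, never seen by the slave

    def send(self, message):
        self.journal.append(message)
        if len(self.journal) >= self.batch_size:
            self.checkpoint()

    def checkpoint(self):
        # Batch-submit the journaled messages to the shared JDBC store.
        self.store.rows.extend(self.journal)
        self.journal.clear()


def slave_visible_messages(store):
    """On failover the slave can only read what reached the shared store."""
    return list(store.rows)


store = SharedJdbcStore()
master = JournaledMaster(store, batch_size=3)
for i in range(5):
    master.send(f"msg-{i}")

# Assume the master crashes here, before the next checkpoint.
recovered = slave_visible_messages(store)
stranded = list(master.journal)
print(recovered)  # ['msg-0', 'msg-1', 'msg-2']
print(stranded)   # ['msg-3', 'msg-4']
```

In this sketch the last two messages exist only in the master's journal, so a slave taking over from the shared store would not deliver them until the master comes back.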

Related

Keep ActiveMQ running when losing connection to database

I have an instance of ActiveMQ 5.16.4 running that is using MySQL as a persistent data storage. Recently the MySQL server had some issues, and ActiveMQ lost its connection to MySQL. That caused multiple Spring microservices to throw errors because ActiveMQ wasn't working.
Is it possible to run master/slave ActiveMQ where the master and the slave use separate persistence storage?
I have done some research and found "pure master slave", but the documentation says it is deprecated, not recommended, and will be removed in 5.8. It says to use shared storage instead, which I am trying to avoid (because my problem is what happens if the storage itself is down).
What are my options to keep ActiveMQ running if it loses its connection to the database?
If you're using ActiveMQ "Classic" (i.e. 5.x) then your only option is to use shared storage between the master and the slave. This could be a shared file system or a relational database. This, of course, is a single point of failure.
However, there are both file system and database technologies that can mitigate this risk. For example you could use a replicated file system (e.g. Ceph or GlusterFS) or a replicated database (e.g. MySQL).
You might also consider using ActiveMQ Artemis (i.e. the next-generation broker from ActiveMQ) which supports replication natively.

Failover with Spring AMQP and RabbitMQ HA

There are multiple articles suggesting that load-balancer should be used in front of RabbitMQ cluster.
However, there are also multiple references saying that Spring AMQP has some failover implementation, such as resetting the connection when the broker comes back to life.
I have several questions regarding this topic (given that those articles are more or less old and it's 2018 today):
When using Spring AMQP, is load balancing still required?
If load balancing is still suggested, how would I handle the affinity of a queue to its primary node? There would be a lot of interconnect traffic between cluster nodes, because a round-robin load balancer would hit the correct node only 1/n of the time and miss it 1-(1/n) of the time.
Does Spring AMQP support some kind of topology awareness, which would allow it to consume from the correct node?
There were some articles suggesting that clients should publish to and consume from nodes respecting the locality of queues. Does this still apply? How does this all fit together given load balancing, Spring AMQP failover and CachingConnectionFactory?
Can anybody please provide answers to those topics and also provide relevant references, which would provide additional information for verification?
Thanks a lot
For each of your bullets:
A load balancer makes little sense with the default configuration of Spring AMQP, since it opens a single, long-lived connection that is shared across all consumers. In 2.0, you can configure the RabbitTemplate to use a separate connection; this is because using different connections for publishers and consumers is a recommended configuration. This will be the default in 2.1.
It might make sense to use a load balancer if you configure the connection factory to cache connections (instead of just channels) since, then, each component gets its own connection.
See next bullet.
See Queue Affinity and the LocalizedQueueConnectionFactory. It uses the management plugin to determine which node currently hosts the queue and connects to that. It will not work with a load balancer since it needs to connect to the actual node.
It is my understanding from several discussions that queue affinity is only needed in the most extreme environments and that, in most environments, the difference is immeasurable. However, environments/networks differ so much, YMMV so you may want to test. My general rule of thumb is to avoid premature optimization since the added complexity of the configuration may simply not be worth the benefit (and you may not have a problem in the first place).
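The locality concern raised in the question comes down to simple arithmetic. A minimal sketch, assuming the queue lives on exactly one of n nodes and a round-robin balancer spreads connections evenly (the node names are hypothetical):

```python
from fractions import Fraction
from itertools import cycle, islice


def round_robin_hit_rate(n_nodes):
    """Chance that a round-robin pick lands on the one node hosting the queue."""
    return Fraction(1, n_nodes)


def round_robin_miss_rate(n_nodes):
    """Chance that the connection lands elsewhere and traffic crosses nodes."""
    return 1 - round_robin_hit_rate(n_nodes)


# Check the formula against an actual round-robin sequence.
nodes = ["rabbit-1", "rabbit-2", "rabbit-3"]  # hypothetical node names
queue_host = "rabbit-2"
picks = list(islice(cycle(nodes), 300))
hits = sum(1 for node in picks if node == queue_host)

print(round_robin_hit_rate(3), round_robin_miss_rate(3))  # 1/3 2/3
print(hits, len(picks) - hits)  # 100 200
```

Whether those cross-node hops matter in practice is exactly what the answer suggests measuring before adding configuration complexity.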

RabbitMQ federation vs. ActiveMQ Master/Slave

I am trying to set up a cluster of brokers that has the same features as a RabbitMQ cluster, but over a WAN (my machines are in different locations), so RabbitMQ clustering does not work.
I am looking at alternatives. RabbitMQ federation only backs up messages to the downstream; it cannot guarantee that both sides have exactly the same messages available at any time (the downstream still keeps old messages that were already consumed in the upstream).
How about ActiveMQ Master/Slave? I have found:
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My question is whether it updates automatically so that the master and slaves always hold the same messages, meaning that messages consumed on the master also disappear from the slaves.
Thanks :)
ActiveMQ has various clustering features.
First there is High Availability: "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes its place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you could set up a replicated cluster, which replicates reads/writes on the master down to all slaves (you need three or more servers). ActiveMQ uses LevelDB and Apache ZooKeeper to achieve this.
The other form of clustering available in ActiveMQ distributes load and separates security across several logical brokers. The brokers are then connected in a network of brokers. By default, messages are passed to a broker that has available consumers for that message. However, ActiveMQ has a rich toolbox of features for tweaking a network of brokers, for example to always send a copy of a message to a specific broker. It takes some work with the more advanced features, though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to meet your requirements, which are not really specified in the question?

Redis PUBLISH/SUBSCRIBE limits

I'm considering Redis for a section of the architecture of a new project. It will consist of a lot of clients (node.js connections) SUBSCRIBING to particular keys with one process PUBLISHING to those keys as needed.
I'm curious about the limits of the PUBLISH/SUBSCRIBE commands and how to mitigate them. An obvious limit is the number of file descriptors open on the machine running Redis, so at some point I'll need to implement Master-Slave or Consistent Hashing across multiple Redis instances.
Does anyone have any suggestions for how to scale this architecture with Redis' PubSub?
Redis PubSub scales really easily since the Master/Slave replication automatically publishes to all slaves.
The easiest way is to load balance the connections to node.js with, for instance, HAProxy, and to run a Redis slave on each webserver that syncs with a single master, which is where the messages are published.
I can't give you exact numbers since that greatly depends on the underlying system, but this should scale extremely well. And you don't need to manage the clients and which server they connect to manually. You obviously need some way to handle session state, so you might need to do that anyway, but that's a lot easier to do in the load balancer than in your application.
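The question also mentions consistent hashing as a way to spread channels over several Redis instances. A minimal sketch of that idea follows; it does not talk to Redis at all, and the `HashRing` class, the instance addresses, and the choice of 100 virtual nodes per instance are assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping channel names to instance names."""

    def __init__(self, nodes, vnodes=100):
        ring = []
        for node in nodes:
            # Several virtual points per node smooth out the distribution.
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._keys = [point for point, _ in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, channel):
        # First ring point clockwise from the channel's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(channel)) % len(self._keys)
        return self._ring[idx][1]


instances = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]  # hypothetical
channels = [f"channel:{i}" for i in range(1000)]

ring = HashRing(instances)
before = {c: ring.node_for(c) for c in channels}

bigger = HashRing(instances + ["redis-d:6379"])
after = {c: bigger.node_for(c) for c in channels}

moved = [c for c in channels if before[c] != after[c]]
print(f"{len(moved)} of {len(channels)} channels moved after adding an instance")
```

With this scheme both the publisher and the subscribers compute `node_for(channel)` to agree on an instance, and adding an instance only moves the channels that now belong to the new one.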

Cassandra failover vs other databases?

Cassandra offers controlled consistency like "write to 2 nodes and tell me it's done".
Two "master" nodes and some slaves make for a system with good failover.
MongoDB offers replication pairs. Is that failover as strong as Cassandra's?
Is there any other database with this out-of-the-box functionality?
Cassandra is a fully distributed system, so there is no need for explicit failover. If the machine you are sending requests to dies, you just reconnect to another (RRDNS, haproxy, any method is fine). Even losing an entire datacenter is handled by Cassandra without your app having to care.
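The "write to 2 nodes and tell me it's done" idea from the question can be illustrated with the general replica-overlap rule: a read is guaranteed to touch at least one replica holding the latest acknowledged write when the write and read acknowledgement counts together exceed the replication factor. The sketch below is plain arithmetic, not a Cassandra API; the function names and the replication factor of 3 are assumptions.

```python
def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(write_acks, read_acks, replication_factor):
    """True when every read set must overlap every acknowledged write set."""
    return write_acks + read_acks > replication_factor


def tolerated_failures(acks_required, replication_factor):
    """Replicas that can be down while the operation still succeeds."""
    return replication_factor - acks_required


rf = 3  # assumed replication factor
print(quorum(rf))                         # 2
print(reads_see_latest_write(2, 2, rf))   # True
print(reads_see_latest_write(2, 1, rf))   # False
print(tolerated_failures(2, rf))          # 1
```

With three replicas, writing to two and reading from two keeps reads consistent while still tolerating the loss of one replica, with no explicit failover step.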