Cassandra failover vs other databases? - replication

Cassandra offers tunable consistency, such as "write to 2 nodes and tell me when it's done".
Two "master" nodes plus some slaves make for a system with good failover.
MongoDB offers replica pairs. Do they give failover strength similar to Cassandra's?
Is there any other database with this kind of out-of-the-box functionality?

Cassandra is a fully distributed system, so there is no need for explicit failover. If the machine you are sending requests to dies, you just reconnect to another (RRDNS, haproxy, any method is fine). Even losing an entire datacenter is handled by Cassandra without your app having to care.
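The "write to 2 nodes" behaviour from the question corresponds to picking a consistency level per request. As a rough illustration, here is the arithmetic behind it in plain Python. This is not Cassandra driver code, and the function names are made up for this sketch.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (what the QUORUM level waits for)."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int, required_acks: int) -> int:
    """Replicas that may be down while a request needing `required_acks` can still succeed."""
    if not 1 <= required_acks <= replication_factor:
        raise ValueError("required_acks must be between 1 and replication_factor")
    return replication_factor - required_acks


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set overlaps every write set, i.e. R + W > RF."""
    return read_acks + write_acks > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor for the example
    print("quorum:", quorum(rf))
    print("failures tolerated when writing to 2 nodes:", tolerated_failures(rf, 2))
    print("read 2 / write 2 overlaps:", reads_see_latest_write(rf, 2, 2))
    print("read 1 / write 1 overlaps:", reads_see_latest_write(rf, 1, 1))
```

With a replication factor of 3, writing to 2 nodes still succeeds with one replica down, which is why no explicit failover step is needed on the application side.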

Related

What is the difference between Redis Replicated setup, Redis Cluster setup, Redis Sentinel setup, and Redis with Master with Slave only? [REDISSON]

I've read https://github.com/redisson/redisson and found that it supports several Redis setups:
Redis Replicated setup (including support of AWS ElastiCache and Azure Redis Cache)
Redis Cluster setup (including support of AWS ElastiCache Cluster and Azure Redis Cache)
Redis Sentinel setup
Redis with Master with Slave only
I am not an expert in clusters and I don't understand the difference between these setups.
Could you briefly explain the differences?
Disclaimer: I am an AWS employee.
I do not know how Redis Replicated Setup is different from Redis in Master-Slave mode. Maybe they mean cross-region replication?
In any case, I can try and explain setups I know about:
Redis with Master with Slave only - a single-shard setup where you create a primary node together with one or more secondary (slave) replicas (let's hope the PC police won't arrest me). This setup is used to improve the durability of your in-memory store. Using your secondaries for reads is not advised, because replication is asynchronous: such a setup offers only eventual consistency, and replica reads may be stale (depending on the replication lag).
Redis Cluster setup - the setup supported by cloud providers such as AWS ElastiCache. In this setup your workload can be spread horizontally across multiple shards, and each shard may have its own secondary replicas. Your client library must support this setup, since it requires maintaining connections to several nodes at the client level. Moreover, there are some locality rules you need to follow in order to use cluster mode efficiently:
Keys with foo{<shard>}bar notation will be routed to their shard according to what is stored inside curly brackets.
You can not use mset, mget and other multi-key commands across shards. You can still use these commands if their keys contain the same {shard} part.
There are additional cluster mode admin commands exposed by Redis, but cloud providers usually hide them from users, since the providers use them to manage the Redis cluster themselves.
A Redis cluster can migrate part of your workload between shards. However, it is still obliged to preserve correctness with respect to the {shard} notation. Since your client library is responsible for fetching data from the right shard, it must handle the "MOVED" response, with which a node redirects it to another node.
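To illustrate the {shard} routing rule, here is a small self-contained sketch of how a key is mapped to one of the 16384 hash slots, following the Redis Cluster specification (CRC16 of the key, or of the part inside the first non-empty pair of curly brackets, modulo 16384). The function names are my own and this is not taken from any client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hashed_part(key: bytes) -> bytes:
    """Return the substring that is actually hashed for this key."""
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    # No closing bracket, or nothing between the brackets: hash the whole key.
    if end == -1 or end == start + 1:
        return key
    return key[start + 1:end]


def key_slot(key: str) -> int:
    """Hash slot (0..16383) that a key belongs to."""
    return crc16_xmodem(hashed_part(key.encode())) % 16384


if __name__ == "__main__":
    for key in ("foo", "foo{user1}bar", "{user1}:cart", "{user1}:profile"):
        print(key, "->", key_slot(key))
```

Keys that share the same {shard} part land in the same slot, which is what makes multi-key commands such as mget usable on them.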
Redis Sentinel setup - uses additional servers that provide monitoring, management, and service discovery for your Redis nodes. It is not strictly required, and I believe it is less popular among users. It serves as a single source of truth regarding each node's health and state. Many Redis client libraries can connect to Redis Sentinel nodes in order to achieve automatic service discovery and a seamless failover flow. One reason this setup is less popular is that cloud services like AWS ElastiCache provide this functionality out of the box.

Redis mirror datacenter for active-active (xdcr feature)

Newbie to redis here so please bear with me.
I am looking for a method to have dual datacenter in active-active configuration. My need here is that:
If a datacenter goes down, the other datacenter should not need any intervention to carry on working.
Networking issues, if they occur, should not cause either datacenter to fail on set/get Redis calls.
Restarting a failed datacenter should provide a means to mirror the working Redis data back before it goes active.
I have been reading up on the replication abilities of Redis. What I understand is that there is:
master-slave(s) replication
A cluster can have masters that are sharded
But what I haven't seen is a cluster replicating to another cluster.
Question:
Is there an architecture design where all Redis instances in one DC replicate with the other? (It looks like Couchbase has this.)
I do see a keyspace notifier which can be used with pub/sub. What I want to know is whether I can use it to pub/sub from Redis to Redis, from one DC to the other, to act as replication.

Does Redis delete all the keys when a master and its slave fail in a Redis Cluster?

I have a question. Suppose I am using a Redis cluster with 3 shards (each with a master and a slave). I learned that if a master and its slave fail at the same time, Redis Cluster is not able to continue to operate. What happens after that?
Would Redis cluster delete all the other keys from other 2 nodes as well? (When it comes back)
Do we need to manually restart this cluster and can we somehow retain the other keys values (on other nodes)?
How will it behave if I use Azure Redis Cache?
Thanks In Advance
1. Would Redis cluster delete all the other keys from other 2 nodes as well? (When it comes back)
First of all, only client operations are blocked, not cluster activity, and nothing is done to the data. The documentation says:
Redis Cluster failure detection is used to recognize when a master or slave node is no longer reachable by the majority of nodes and then respond by promoting a slave to the role of master. When slave promotion is not possible the cluster is put in an error state to stop receiving queries from clients.
Next, regarding whether the data gets deleted (from the Replication documentation):
In setups where Redis replication is used, it is strongly advised to have persistence turned on in the master
This means you will lose data only if persistence was turned off and both the master and its slave went down. In that case, when the pair comes back up, you will not be able to recover the data. So keep Redis persistence turned on.
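To make this concrete, here is a toy model in Python. It is not Redis code: the `Shard` class and `cluster_state` function are invented for illustration. It assumes the default behaviour, where the cluster stops accepting queries once any shard has neither a working master nor a slave that can be promoted (as far as I know this is governed by the `cluster-require-full-coverage` setting). The point is that the keys on the surviving shards are left alone.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    master_up: bool = True
    slave_up: bool = True
    keys: dict = field(default_factory=dict)

    def available(self) -> bool:
        # The shard can serve its slots if the master is up,
        # or if a slave is up and can be promoted to master.
        return self.master_up or self.slave_up


def cluster_state(shards: list) -> str:
    """Return 'ok' if every shard can serve its slots, otherwise 'fail'."""
    return "ok" if all(shard.available() for shard in shards) else "fail"


if __name__ == "__main__":
    shards = [Shard(keys={"a": 1}), Shard(keys={"b": 2}), Shard(keys={"c": 3})]

    shards[0].master_up = False   # master dies, slave can be promoted
    print(cluster_state(shards))  # cluster keeps serving

    shards[0].slave_up = False    # slave dies too: no node left for these slots
    print(cluster_state(shards))  # cluster stops accepting queries

    # Nothing touched the data held by the other two shards.
    print(shards[1].keys, shards[2].keys)
```

Whether the failed shard's own keys come back after a restart depends on persistence, as noted above.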
2. Do we need to manually restart this cluster and can we somehow retain the other keys values (on other nodes)?
I think the answer above covers this.
3. How will it behave if I use Azure Redis Cache?
From the Azure Redis Cache FAQ:
High Availability/SLA: Azure Redis Cache guarantees that a Standard/Premium cache will be available at least 99.9% of the time. To learn more about our SLA, see Azure Redis Cache Pricing. The SLA only covers connectivity to the Cache endpoints. The SLA does not cover protection from data loss. We recommend using the Redis data persistence feature in the Premium tier to increase resiliency against data loss.
So it's kinda their headache
OR
Redis Cluster: If you want to create caches larger than 53 GB or want to shard data across multiple Redis nodes, you can use Redis clustering which is available in the Premium tier. Each node consists of a primary/replica cache pair for high availability. For more information, see How to configure clustering for a Premium Azure Redis Cache.

Does using ActiveMQ in Master/Slave mode with JDBC preclude use of journaling?

My group is looking to distribute our ActiveMQ queues across multiple brokers to achieve high availability. Of the three supported master-slave setups (pure, shared filesystem, JDBC) we are considering shared file system and JDBC.
I am seeing conflicting statements within the ActiveMQ documentation. Can the JDBC master-slave setup use ActiveMQ's high-performance journal or not?
On this page, ActiveMQ claims that
it cannot use the high performance journal.
On this page, ActiveMQ suggests that the two can, in fact, be used together:
For long term persistence we recommend using JDBC coupled with our high performance journal.
Can anyone shed light on this apparent conflict?
You should not use journaling with JDBC master/slave, because the journal is not replicated. Any messages in the master's journal that have not yet been batch-submitted to the JDBC store will be isolated until restart, i.e. the journal is not visible to the slave.

Redis PUBLISH/SUBSCRIBE limits

I'm considering Redis for a section of the architecture of a new project. It will consist of a lot of clients (node.js connections) SUBSCRIBING to particular keys with one process PUBLISHING to those keys as needed.
I'm curious about the limits of the PUBLISH/SUBSCRIBE commands and how to mitigate them. An obvious limit is the number of file descriptors open on the machine running Redis, so at some point I'll need to implement Master-Slave or Consistent Hashing across multiple Redis instances.
Does anyone have suggestions for how to scale this architecture with Redis' PubSub?
Redis PubSub scales really easily, since Master/Slave replication automatically propagates published messages to all slaves.
The easiest way is to load balance the connections to node.js with, for instance, HAProxy, and to run a Redis slave on each web server. Each slave syncs with a single master, which publishes the messages.
I can't give you exact numbers since that greatly depends on the underlying system, but this should scale extremely well. And you don't need to manage the clients and which server they connect to manually. You obviously need some way to handle session state, so you might need to do that anyway, but that's a lot easier to do in the load balancer than in your application.