I'm currently exploring Redis Cluster. I've started 6 instances on 3 physical servers (3 masters and 3 slaves) with persistence enabled.
I've noticed that when I kill one of the master instances, its slave is promoted to master after some time. However, it remains the master even after I restart the killed instance.
Since Redis replicates asynchronously, I was thinking of a scenario where the master is killed immediately after flushing the data, i.e. before it was able to replicate that data.
Will this data get replicated to the new master (initially the slave) once the instance comes back up?
No. If the master hasn't replicated the data to the slave, the data will be lost. When the old master recovers, it will become a slave of some other node, based on some rules. The old master will then replicate data from its new master, so its own unreplicated writes are not carried over.
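To make the sequence concrete, here is a minimal toy model in plain Python. It does not talk to Redis at all: the `Node` class, its `pending` queue, and the `flush_to` / `resync_from` methods are hypothetical names invented for illustration, and the model assumes the rejoining node simply adopts its new master's dataset.

```python
class Node:
    """Toy in-memory node; a stand-in for illustration, not a Redis server."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []  # writes accepted locally but not yet replicated

    def write(self, key, value):
        # The master applies (and acknowledges) the write before replicating it.
        self.data[key] = value
        self.pending.append((key, value))

    def flush_to(self, replica):
        # Asynchronous replication step: ship queued writes to the replica.
        for key, value in self.pending:
            replica.data[key] = value
        self.pending.clear()

    def resync_from(self, new_master):
        # Rejoining as a replica: the local dataset is replaced by the master's.
        self.data = dict(new_master.data)
        self.pending.clear()


master = Node("A")
replica = Node("B")

master.write("k1", "v1")
master.flush_to(replica)         # k1 reaches the replica
master.write("k2", "v2")         # A is killed before k2 is replicated

new_master = replica             # failover: B is promoted
master.resync_from(new_master)   # A restarts and becomes B's replica

print(sorted(new_master.data))   # ['k1']
print(sorted(master.data))       # ['k1']
```

In this sketch `k2` exists only on the old master at the moment it dies, and it disappears once that node resynchronizes from the promoted node.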
Related
In a 6-node Redis Cluster with 3 master nodes and 3 slave nodes, if, say, a master goes down, the corresponding slave is promoted. When the old master comes back up, it becomes a slave.
Is it possible to force it somehow from the redis config or otherwise, so that when it comes back live, the old master will be promoted as a master as it was at the beginning?
Thank you!
If the old master comes up with property:
slaveof no one
It will join the cluster as a master, but I don't think you would want to do that.
The 'old master' does not have the latest data; if you force it to become the master, any writes accepted since the failover will be lost.
We have a Redis cluster with 3 shards, each with a replica node. Suppose a lock is acquired in a shard and, while the thread is holding the lock, both the master and the replica node of that shard go down.
Will the cluster wait until the shard comes back live and not accept new locks until then OR will it run with 2 shards and create a new lock in a new shard?
That depends on the retryAttempts and retryInterval settings. Together they should cover a window long enough to outlast the failover.
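As a rough illustration of why those two settings matter, the sketch below retries a lock attempt a fixed number of times with a fixed pause. The function `acquire_with_retry` and the helper `make_flaky_lock` are hypothetical and are not the client library's implementation; only the two parameter names are borrowed from the answer above. The point is the arithmetic: the client keeps trying for roughly `retry_attempts * retry_interval_s` seconds, so that product has to exceed the expected failover time.

```python
import time


def acquire_with_retry(try_acquire, retry_attempts, retry_interval_s, sleep=time.sleep):
    """Call try_acquire() once, then retry up to retry_attempts more times."""
    last_error = None
    for attempt in range(retry_attempts + 1):
        try:
            return try_acquire()
        except ConnectionError as exc:
            last_error = exc
            if attempt < retry_attempts:
                sleep(retry_interval_s)  # wait before the next attempt
    raise last_error


def make_flaky_lock(failures_before_success):
    """Hypothetical stand-in for a shard that is down for the first N calls."""
    state = {"calls": 0}

    def try_acquire():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("shard unavailable")
        return "locked"

    return try_acquire


# The shard is unreachable for two attempts, then the failover completes.
result = acquire_with_retry(make_flaky_lock(2), retry_attempts=3, retry_interval_s=0.01)
print(result)  # locked
```

If the shard stays down longer than the retry window, the last `ConnectionError` is raised to the caller instead.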
I am very new to Redis cache implementation.
Could you please tell me what the replication factor means?
How does it work, and what is its impact?
Thanks.
At the base of Redis replication (excluding the high availability features provided as an additional layer by Redis Cluster or Redis Sentinel) there is a very simple to use and configure leader follower (master-slave) replication: it allows replica Redis instances to be exact copies of master instances. The replica will automatically reconnect to the master every time the link breaks, and will attempt to be an exact copy of it regardless of what happens to the master.
This system works using three main mechanisms:
1. When a master and a replica instances are well-connected, the master keeps the replica updated by sending a stream of commands to the replica, in order to replicate the effects on the dataset happening in the master side due to: client writes, keys expired or evicted, any other action changing the master dataset.
2. When the link between the master and the replica breaks, for network issues or because a timeout is sensed in the master or the replica, the replica reconnects and attempts to proceed with a partial resynchronization: it means that it will try to just obtain the part of the stream of commands it missed during the disconnection.
3. When a partial resynchronization is not possible, the replica will ask for a full resynchronization. This will involve a more complex process in which the master needs to create a snapshot of all its data, send it to the replica, and then continue sending the stream of commands as the dataset changes.
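The choice between a partial and a full resynchronization can be sketched with a toy backlog. This is a simplification under stated assumptions: the `Backlog` class is hypothetical, it counts whole commands rather than bytes, and it ignores the replication ID check that a real master also performs.

```python
from collections import deque


class Backlog:
    """Toy replication backlog that keeps only the newest `capacity` commands."""

    def __init__(self, capacity):
        self.commands = deque(maxlen=capacity)
        self.end_offset = 0  # offset just past the newest command

    def append(self, command):
        self.commands.append(command)
        self.end_offset += 1

    @property
    def start_offset(self):
        # Offset of the oldest command still held in the backlog.
        return self.end_offset - len(self.commands)

    def resync(self, replica_offset):
        """Return ("partial", missed_commands) or ("full", None)."""
        if self.start_offset <= replica_offset <= self.end_offset:
            skip = replica_offset - self.start_offset
            return "partial", list(self.commands)[skip:]
        # The replica is too far behind: the commands it needs were discarded.
        return "full", None


backlog = Backlog(capacity=3)
for cmd in ["SET a 1", "SET b 2", "SET c 3", "SET d 4", "SET e 5"]:
    backlog.append(cmd)

print(backlog.resync(3))  # ('partial', ['SET d 4', 'SET e 5'])
print(backlog.resync(1))  # ('full', None)
```

A replica that had processed three commands before the link broke only needs the last two, while one that had processed a single command has fallen outside the backlog and must receive a full snapshot.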
Redis uses asynchronous replication by default, which, being low latency and high performance, is the natural replication mode for the vast majority of Redis use cases.
Synchronous replication of certain data can be requested by the clients using the WAIT command. However WAIT is only able to ensure that there are the specified number of acknowledged copies in the other Redis instances, it does not turn a set of Redis instances into a CP system with strong consistency: acknowledged writes can still be lost during a failover, depending on the exact configuration of the Redis persistence. However with WAIT the probability of losing a write after a failure event is greatly reduced to certain hard to trigger failure modes.
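A minimal sketch of how a client might use WAIT after a write follows. The helper `write_with_ack` is hypothetical; it assumes a client object exposing `set(key, value)` and `wait(num_replicas, timeout)` in the style of redis-py, where the timeout is in milliseconds and the return value is the number of replicas that acknowledged. `FakeClient` is a stand-in so the example runs without a server.

```python
def write_with_ack(client, key, value, min_replicas=1, timeout_ms=100):
    """Write a key, then block until enough replicas acknowledge or time out."""
    client.set(key, value)
    acked = client.wait(min_replicas, timeout_ms)
    # False does NOT mean the write was rolled back: it is still on the master.
    return acked >= min_replicas


class FakeClient:
    """Stand-in for a real client; reports a fixed number of acknowledgements."""

    def __init__(self, replicas_acking):
        self.replicas_acking = replicas_acking
        self.store = {}

    def set(self, key, value):
        self.store[key] = value
        return True

    def wait(self, num_replicas, timeout):
        return self.replicas_acking


print(write_with_ack(FakeClient(replicas_acking=1), "order:1", "paid"))  # True
print(write_with_ack(FakeClient(replicas_acking=0), "order:2", "paid"))  # False
```

A `False` result only tells the caller that the write is at higher risk of being lost in a failover, so the application has to decide whether to retry, alert, or accept the risk.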
In a Redis master-slave architecture, when a master fails, a slave is promoted to master. Since only the master can perform write operations, what happens to data during the window in which the slave is being promoted? Does my system remain unresponsive?
Define "data":)
Client connections to the master will be closed upon its failure, so your system will be notified of that. Any data that was not written to the master and the replicas before the failure will therefore still reside in your application/system.
Once your system tries using a replica it will be able to read the data in it up to the point it was synchronized before failure. Once the replica is promoted to masterhood, your system will be able to continue writing data.
Note that Redis' replication is asynchronous. That means that slaves may lag behind the master and therefore lose some updates in case of failure. Refer to the WAIT command for more information about ensuring consistency.
I have commented out the "save" directives on both my master and slave, as I only want in-memory caching without persisting to a file. This works fine, but as soon as the master goes down, and before the slave can be promoted to master (it actually freezes for a minute), the slave starts flushing its data. How can I prevent the slave from flushing the data?
Thanks
zafer
Actually, the slave does not flush the data when the master goes down.
It starts a SYNC with the master (flushing its data first) once it has lost the connection to the master and then re-established it.
IMO, the problem is that the master restarts immediately, so the slave can reconnect before it has been promoted to master. Since persistence is disabled, the restarted master comes back empty, and the slave then copies that empty dataset.
You should delay the restart of the master until the slave has been promoted. Depending on how the HA is automated, it may not be very convenient. A simple (but not very reliable) solution is to just put a delay in the start script of the Redis instance. The delay should be long enough that you are 100% sure the slave will have been promoted before the delay expires. A more complex solution is to try to connect to the slave in the start script of the master, and run the INFO command to check its status, before allowing the start.
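The INFO-based check could look roughly like the sketch below. It only parses text that a start script has already fetched from the slave (for example the output of INFO replication); how the script obtains that text is left out. The function names `parse_info` and `replica_was_promoted` are hypothetical, and the sample strings are hand-written in the `key:value` layout that INFO uses.

```python
def parse_info(text):
    """Parse INFO-style output ("key:value" lines, "#" section headers)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        info[key] = value
    return info


def replica_was_promoted(info_text):
    """True once the former slave reports itself as a master."""
    return parse_info(info_text).get("role") == "master"


# Hand-written samples; real output contains many more fields.
sample_before = "# Replication\r\nrole:slave\r\nmaster_host:10.0.0.1\r\nmaster_link_status:down\r\n"
sample_after = "# Replication\r\nrole:master\r\nconnected_slaves:0\r\n"

print(replica_was_promoted(sample_before))  # False
print(replica_was_promoted(sample_after))   # True
```

A start script would poll until this check returns true, and only then start the old master.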
See the following discussion for more information:
https://groups.google.com/d/topic/redis-db/wmRSuIgHcEs/discussion