Reconnect Shutdown Redis Instance back to Cluster

Given a redis cluster with six nodes (3M/3S) on ports 7000-7005 with master nodes on ports 7000-7002 and slave nodes on the rest, master node 7000 is shut down, so node 7003 becomes the new master:
$ redis-cli -p 7003 cluster nodes
2a23385e94f8a27e54ac3b89ed3cabe394826111 127.0.0.1:7004 slave 1108ef4cf01ace085b6d0f8fd5ce5021db86bdc7 0 1452648964358 5 connected
5799de96ff71e9e49fd58691ce4b42c07d2a0ede 127.0.0.1:7000 master,fail - 1452648178668 1452648177319 1 disconnected
dad18a1628ded44369c924786f3c920fc83b59c6 127.0.0.1:7002 master - 0 1452648964881 3 connected 10923-16383
dfcb7b6cd920c074cafee643d2c631b3c81402a5 127.0.0.1:7003 myself,master - 0 0 7 connected 0-5460
1108ef4cf01ace085b6d0f8fd5ce5021db86bdc7 127.0.0.1:7001 master - 0 1452648965403 2 connected 5461-10922
bf60041a282929cf94a4c9eaa203a381ff6ffc33 127.0.0.1:7005 slave dad18a1628ded44369c924786f3c920fc83b59c6 0 1452648965926 6 connected
How does one go about [automatically] reconnecting/restarting node 7000 as a slave instance of 7003?

Redis Cluster: Re-adding a failed over node has a detailed explanation of what happens.
Basically, the restarted node becomes a slave of the slave that replaced it during the failover (and is now a master).
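In practice, assuming the node's config and data directory are intact (the path below is a placeholder), it is usually enough to start the instance again and let cluster gossip do the rest — a sketch:

```shell
# Start the failed instance again with its original config; its nodes.conf
# still holds its node ID, so it rejoins the cluster automatically and is
# reconfigured as a slave of the node that won the failover (7003 here).
redis-server /path/to/7000/redis.conf

# Verify: 7000 should now be listed as "slave" of 7003's node ID.
redis-cli -p 7000 cluster nodes
```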

Have you seen the Redis Sentinel Documentation?
Redis Sentinel provides high availability for Redis. In practical
terms this means that using Sentinel you can create a Redis deployment
that resists without human intervention certain kinds of failures.

Related

How to restart redis cluster node after failure

I am experimenting with Redis Cluster as per the documentation, and I have one small point of confusion.
Initial Configuration
35edd8052caf37149b4f9cc800fcd2ba60018ab5 127.0.0.1:30005@40005 slave bd76f831d34ed265a964e5f5caff2c0807c96b85 0 1524390407263 5 connected
d9e92c606f1fddebf84bbbc6f76485e418647683 127.0.0.1:30003@40003 master - 0 1524390407263 8 connected 10923-16383
edf62838d10b99018a0ecb7698c1b9ac52aa3bbb 127.0.0.1:30002@40002 myself,master - 0 1524390407000 2 connected 5461-10922
bd76f831d34ed265a964e5f5caff2c0807c96b85 127.0.0.1:30001@40001 master - 0 1524390407062 1 connected 0-5460
55a72ea5b4d0a77e2b18ca2b3f74b20d3550244c 127.0.0.1:30006@40006 slave edf62838d10b99018a0ecb7698c1b9ac52aa3bbb 0 1524390407562 6 connected
26788ce4523c95a93bd63907c1c75827fe61476a 127.0.0.1:30004@40004 slave d9e92c606f1fddebf84bbbc6f76485e418647683 0 1524390407263 8 connected
Now, to test what happens when a master fails, I failed one manually using the following command.
redis-cli -p 30001 debug segfault
Now the configuration looks like this (30001 has failed and 30005 has been promoted to master):
35edd8052caf37149b4f9cc800fcd2ba60018ab5 127.0.0.1:30005@40005 master - 0 1524390694964 9 connected 0-5460
d9e92c606f1fddebf84bbbc6f76485e418647683 127.0.0.1:30003@40003 master - 0 1524390695064 8 connected 10923-16383
edf62838d10b99018a0ecb7698c1b9ac52aa3bbb 127.0.0.1:30002@40002 myself,master - 0 1524390694000 2 connected 5461-10922
bd76f831d34ed265a964e5f5caff2c0807c96b85 127.0.0.1:30001@40001 master,fail - 1524390636966 1524390636165 1 disconnected
55a72ea5b4d0a77e2b18ca2b3f74b20d3550244c 127.0.0.1:30006@40006 slave edf62838d10b99018a0ecb7698c1b9ac52aa3bbb 0 1524390694964 6 connected
26788ce4523c95a93bd63907c1c75827fe61476a 127.0.0.1:30004@40004 slave d9e92c606f1fddebf84bbbc6f76485e418647683 0 1524390695164 8 connected
How can I add 30001 back into the cluster? Also, how can I start that node only?
I am following this document:
https://redis.io/topics/cluster-tutorial. (It states that "I restarted the crashed instance so that it rejoins the cluster as a slave", but does not mention how to do that.)
Creating a cluster using redis-trib.rb requires running Redis instances, which we should start using a custom config file:
../redis-server redis.conf
where redis.conf contains config for that node.
For instance
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
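Assuming the layout from the cluster tutorial — one directory per node, named after its port, each holding its own redis.conf — the six instances can be started in a loop before creating the cluster:

```shell
# Start one redis-server per node directory; each instance picks up its
# own redis.conf (port, cluster-config-file, appendonly settings).
for port in 7000 7001 7002 7003 7004 7005; do
  (cd "$port" && ../redis-server ./redis.conf &)
done
```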
The Redis cluster is then created as below:
./redis-trib.rb create --replicas 1 host1:port1 host2:port2 host3:port3 host4:port4 host5:port5 host6:port6
The ruby script will randomly assign masters and slaves among these instances, and each node will write a nodes.conf file (the name set by cluster-config-file in redis.conf) holding its node information.
When you start the server again with ../redis-server redis.conf, it will pick up its node information (its ID and its master/slave role) from nodes.conf and reconnect to the cluster.
You can restart the redis instance on the required port using the same command you used to start it earlier, i.e.
cd 30001
../redis-server redis.conf
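Once it is back up, you can confirm (using the port from the question above) that it rejoined as a slave rather than reclaiming its old master role:

```shell
# The line flagged "myself" shows this node's current view of its own role;
# after the restart it should read "myself,slave", with 30005's node ID
# listed as its master.
redis-cli -p 30001 cluster nodes | grep myself
```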
Assuming that you followed the tutorial and created the cluster using the create-cluster script, i.e.
# pwd: redis/utils/create-cluster
./create-cluster start
./create-cluster create
To bring back the node that you failed, start it again using
./create-cluster start
This will start the failed node. Currently running nodes won't be affected.
https://github.com/antirez/redis/blob/unstable/utils/create-cluster/create-cluster#L25

Redis-nodes went out of cluster

I have a setup with 3 redis servers and 3 redis sentinels, one of each on each node. In the beginning there is one master node and two slave nodes, but after some time all of the redis nodes drop out of the cluster and go into slave mode.
I don't know why this is happening; am I missing something here?
Redis-Server configurations:
Redis-Server A (master node)
Redis-Server B (slaveof node A)
Redis-Server C (slaveof node A)
Redis-Sentinel Configuration for all three nodes are:
sentinel monitor Redis-Cluster ip-of-master-node 6379 2
sentinel down-after-milliseconds Redis-Cluster 2000
sentinel failover-timeout Redis-Cluster 2000
sentinel parallel-syncs Redis-Cluster 1
bind ip-of-interface
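One way to debug this (assuming the default sentinel port 26379 and a placeholder hostname node-a) is to ask each sentinel what it currently believes about the master; disagreement between sentinels, or flags such as s_down/o_down, usually points at connectivity or quorum problems:

```shell
# Ask one sentinel for its view of the monitored master, its peer
# sentinels, and the known slaves; repeat against each sentinel.
redis-cli -h node-a -p 26379 sentinel master Redis-Cluster
redis-cli -h node-a -p 26379 sentinel sentinels Redis-Cluster
redis-cli -h node-a -p 26379 sentinel slaves Redis-Cluster
```

It may also be worth noting that down-after-milliseconds and failover-timeout of 2000 ms are far below the defaults and quite aggressive; on a loaded network such short timeouts can trigger spurious failovers.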

Redis - Make failover master return to slave state and master take up it's old master role

I have a Redis v4.0.7 cluster consisting of 4 servers. These 4 servers are all running Ubuntu v17.10 64 bit virtual machines (in VirtualBox) on my Windows PC. I have shifted all the slaves by one server, and will be using M1 for master 1 and S1 for slave 1 in the following explanation of my "issue":
192.168.56.101 (with a master on port 7000 (M1) and slave on port 7001 (S4))
192.168.56.102 (with a master on port 7000 (M2) and slave on port 7001 (S1))
192.168.56.103 (with a master on port 7000 (M3) and slave on port 7001 (S2))
192.168.56.104 (with a master on port 7000 (M4) and slave on port 7001 (S3))
I am fiddling a little bit with the setup to check if the failover "works".
Therefore I tried shutting down M2, which means that S2 takes over and becomes the master. This works as intended. However, if I start the (old) M2 again, it is now a slave and remains one until I shut S2 down, at which point it takes over the master role again.
I was wondering whether there is a command I can issue to the slave that has taken over the master role which makes it return to its (old) slave role and hand the master role back to the (old) master, in this case M2.
I have tried googling the "issue", but to no avail.
You can do this by running:
redis-cli -h M2_IP_ADDRESS -p M2_PORT CLUSTER FAILOVER
The above command triggers a manual failover: M2 will become the master again and S2 will go back to being its slave. Note that CLUSTER FAILOVER must be sent to the node you want promoted (here M2, which is currently a slave).
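A sketch of the full sequence, using the addresses from the setup above (M2 on 192.168.56.102, port 7000):

```shell
# 1. Confirm the old master is currently running as a slave.
redis-cli -h 192.168.56.102 -p 7000 role

# 2. Trigger a manual, coordinated failover from that slave; the command
#    returns OK immediately and the promotion happens in the background,
#    without losing writes.
redis-cli -h 192.168.56.102 -p 7000 CLUSTER FAILOVER

# 3. Verify that the roles have swapped back.
redis-cli -h 192.168.56.102 -p 7000 cluster nodes | grep myself
```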

Redis Cluster: No automatic failover for master failure

I am trying to implement a Redis cluster with 6 machine.
I have a vagrant cluster of six machines:
192.168.56.101
192.168.56.102
192.168.56.103
192.168.56.104
192.168.56.105
192.168.56.106
all running redis-server
I edited the /etc/redis/redis.conf file on all of the above servers, adding this:
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
cluster-slave-validity-factor 0
appendonly yes
I then ran this on one of the six machines:
./redis-trib.rb create --replicas 1 192.168.56.101:6379 192.168.56.102:6379 192.168.56.103:6379 192.168.56.104:6379 192.168.56.105:6379 192.168.56.106:6379
A Redis cluster is up and running. I checked manually by setting a value on one machine; it shows up on the other machines.
$ redis-cli -p 6379 cluster nodes
3c6ffdddfec4e726f29d06a6da550f94d976f859 192.168.56.105:6379 master - 0 1450088598212 5 connected
47d04bc98ab42fc793f9f382855e5c54ab8f2e20 192.168.56.102:6379 slave caf2cec45114dc8f4cbc6d96c6dbb20b62a39f90 0 1450088598716 7 connected
040d4bb6a00569fc44eec05440a5fe0796952ccf 192.168.56.101:6379 myself,slave 5318e48e9ef0fc68d2dc723a336b791fc43e23c8 0 0 4 connected
caf2cec45114dc8f4cbc6d96c6dbb20b62a39f90 192.168.56.104:6379 master - 0 1450088599720 7 connected 0-10922
d78293d0821de3ab3d2bca82b24525e976e7ab63 192.168.56.106:6379 slave 5318e48e9ef0fc68d2dc723a336b791fc43e23c8 0 1450088599316 8 connected
5318e48e9ef0fc68d2dc723a336b791fc43e23c8 192.168.56.103:6379 master - 0 1450088599218 8 connected 10923-16383
My problem is that when I shut down or stop redis-server on any one master machine, the whole cluster goes down, yet if all three slaves die the cluster still works properly.
What should I do so that a slave is promoted to master when a master fails (fault tolerance)?
I am under the assumption that redis handles all of this and that I need not worry about it after deploying the cluster. Am I right, or do I have to do it myself?
Another question: let's say I have six machines with 16GB RAM each. How much data in total would I be able to handle on this Redis cluster with three masters and three slaves?
Thank you.
The setting cluster-slave-validity-factor 0 may be the culprit here.
From redis.conf:
# A slave of a failing master will avoid to start a failover if its data
# looks too old.
In your setup, the slave of the terminated master considers itself unfit to be elected master, because the time since it last contacted its master is greater than the computed value of:
(node-timeout * slave-validity-factor) + repl-ping-slave-period
Therefore, even with a redundant slave, the cluster state is changed to DOWN and becomes unavailable.
You can try a different value, for example the suggested default:
cluster-slave-validity-factor 10
This will ensure that the cluster is able to tolerate one random redis instance failure (it can be a slave or a master instance).
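If you would rather not restart the nodes, the value can (to the best of my knowledge — verify on your Redis version) also be changed at runtime on each node and then persisted:

```shell
# Apply on each of the six instances, then rewrite the config file so
# the change survives a restart.
redis-cli -p 6379 config set cluster-slave-validity-factor 10
redis-cli -p 6379 config rewrite
```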
For your second question: six machines with 16GB RAM each will be able to function as a Redis Cluster of 3 master instances and 3 slave instances, so the theoretical maximum is 16GB x 3 of data. Such a cluster can tolerate a maximum of ONE node failure if cluster-require-full-coverage is turned on; otherwise it may still be able to serve data from the shards that remain available on the functioning instances.
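The arithmetic above can be sketched as follows; the point is that only the masters add capacity, while the slaves add redundancy:

```shell
# Data is sharded across the 3 masters only; each slave holds a full copy
# of its master's shard, so the 3 slaves add no extra capacity.
per_node_gb=16
masters=3
echo "$(( per_node_gb * masters )) GB theoretical maximum"
# prints: 48 GB theoretical maximum
```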

Redis Sentinel for Windows

I'm successfully using Redis for Windows (2.6.8-pre2) in a master slave setup. However, I need to provide some automated failover capability, and it appears the sentinel is the most popular choice. When I run redis in sentinel mode the sentinel connects, but it always thinks the master is down. Also, when I run the sentinel master command it reports that there are 0 slaves (not true) and that there are no other sentinels (again, not true). So it's like it connects to the master, but not correctly.
Has anyone else seen this issue on Windows and, more importantly, is anyone successfully using sentinel in a windows environment? Any help or direction at all is appreciated!
I recommend using this:
1 master node redis server
1 slave node redis server
3 redis sentinels with a quorum of 2
It's important to have at least 3 sentinels so that an odd number of voters decides the quorum.
I made this configuration in Windows 7 and it's working well.
Example of sentinel conf:
port 20001
logfile "sentinel1.log"
sentinel monitor shard1 127.0.0.1 16379 2
sentinel down-after-milliseconds shard1 5000
sentinel failover-timeout shard1 30000
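With one such file per sentinel (the other two files are assumed to differ only in port and logfile, e.g. ports 20002 and 20003), each sentinel is started with the --sentinel flag:

```shell
# On Windows this would be redis-server.exe. Each sentinel needs its own
# writable config file, since it persists discovered state back into it.
redis-server sentinel1.conf --sentinel
redis-server sentinel2.conf --sentinel
redis-server sentinel3.conf --sentinel
```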