Resharding keys when a node goes down in redis cluster - redis

Suppose I have a redis cluster with nodes 10.0.0.1, 10.0.0.2, 10.0.0.3 and 10.0.0.4, which I'm using as a cache.
Then, for whatever reason, node 10.0.0.4 fails and goes down. This brings down the entire cluster:
2713:M 13 Apr 21:07:52.415 * FAIL message received from [id1] about [id2]
2713:M 13 Apr 21:07:52.415 # Cluster state changed: fail
Which causes any query to be shut down with "CLUSTERDOWN The cluster is down".
However, since I'm using the cluster as a cache, I don't really care if a node goes down. A key can get resharded to a different node and lose its contents without affecting my application.
Is there a way to set up such an automated resharding?

I found something close enough to what I need.
By setting cluster-require-full-coverage to "no", the rest of the cluster will continue to respond to queries, although the client needs to handle the possibility of being redirected to a failing node.
Then I can replace the broken node by running:
redis-trib.rb call 10.0.0.1:6379 cluster forget [broken_node_id]
redis-trib.rb add-node 10.0.0.5:6379 10.0.0.1:6379
redis-trib.rb fix 10.0.0.1:6379
Where 10.0.0.5:6379 is the node that will replace the broken one.

By assuming you have only master nodes in your current cluster, you will definitely get cluster down error because there is no replica of down master and Redis thinks cluster is not in safe and triggers an error.
Solution
Create a new node (Create redis.conf with desired parameters.)
Join that node to cluster
redis-trib.rb add-node 127.0.0.1:6379 EXISTING_MASTER_IP:EXISTING_MASTER_PORT
Make node slave of 10.0.0.4
redis-cli -p 6379 cluster replicate NODE_ID_OF_TARGET_MASTER
To Test
First be sure, cluster is in good shape.(All slots are covered and nodes are agreed about configurations.)
redis-trib.rb check 127.0.0.1:6379 (On any master)
Kill process of 10.0.0.4
Wait Slave to be new master.(It happens quickly. Slots assigned to 10.0.0.4 will be resharded automatically to Slave.)
Check cluster and be sure all slots are moved new master
redis-trib.rb check 127.0.0.1:6379 (On any master)
No manual actions needed. Additionally, if you have more slaves in cluster they may be promoted as new masters of other masters as well. (e.g. You have a setup of 3 master, 3 slaves. Master1 goes down, Slave1 becomes new master. Slave1 goes down, Slave1 can be new master as Master1.)

Related

how to check the message published from redis sentinel to redis master?

Question Background:
I deploy a redis cluster in k8s cluster and use Redis-Sentinel to implement ha for redis cluster. My redis cluster structure likes below:
One master
One slave
three sentinel (serve a specific redis cluster)
When i login the container of the one of sentinels, i execute a command:
sentinel sentinels mymaster
Luckly, i get a desirable output. These are two sentinel's infos. After a period of time, i execute "sentinels mymaster" command again, i found that there is a additional sentinel and don't find this instance through IP address or runId。
I know that sentinel discover other sentinels and master and slave through sub the channel of sentinel:hello in redis master.
Question:
how to check the message published from redis sentinel to redis master? I have opened log for master and set the log level to debug.
You can see the Sentinel's activity (when it discovers a sentinel, a replica, failsover to a new master, etc.) in the sentinel log file, not the master. If a sentinel is running on a host, it will be in the same directory the master or replica log file is. For me on CentOS it's /var/log/redis/sentinel.log.

Correct shutdown sequence for Redis cluster

Suppose I have the following Redis replication setup:
3 machines
Each machine has a Redis server and a Redis sentinel.
One of the servers is set as master, the other two are its slaves.
What would be the correct sequence and commands to gracefully shutdown this setup, all while keeping the existing master as master and existing slaves as slaves (meaning, no failover or reconfig should take place)
Thanks.
Shutdown sequence
You should shutdown sentinels first, to avoid alarms/notifications and failover. Then you can shutdown slaves and master.
Shutdown command
You can gracefully shutdown Redis instances (sentinel, slave and master) with the shutdown command.
For Redis version older than 3.0 (not very sure), there's no shutdown command for Redis sentinel. But you can just use killall or kill -9 process_id to kill it without any side effect.
============================================================================
UPDATE
In my original answer, I suggested shutdown slaves and master first, to avoid alarms from sentinel. In fact, there's another way to avoid alarms. You can simply remove the master from sentinel before shutdown the master: SENTINEL REMOVE <name>. After removing the master, you don't need to care the shutdown order any more.
How about the startup order?
If you use SENTINEL MONITOR <name> <ip> <port> <quorum> command to dynamically add a master to monitor, you can startup sentinel, and add masters dynamically. Instead, if you add the master with sentinel's configuration file, you can startup Redis first, to avoid alarms from sentinel.

How to switch redis master in sentinel configuration

I have a redis sentinel configuration with one master, two slaves and 3 sentinels running. I noticed that at some point the sentinels may switch the master electing one of the slaves as master. This is causing problems to an application which is connecting to the master node as a standalone client(I'm working on changing the code to use sentinels). I wanted to know if it is possible to switch the master by connecting to the sentinel client i.e. through 'redis-cli'
Can somebody let me know if there is a command that I can use to switch the master IP?
The client applications should use a client library that supports sentinel in the case where a redis master goes down and the sentinels select a new master. Not sure how beneficial it is to have sentinel setup if your client applications are not taking advantage of it. A client application that supports sentinel will query sentinel for the master ip and should be somewhat tolerant to faults occurring with the master connection. You can trigger a manual failover like the other answer states:
redis-cli -h {sentinel-ip} -p {26379 or sentinel port} sentinel failover {mastername}
But you will not be able to pick which node it fails over to. You can control a configuration value slave_priority in the redis.conf file so that it prefers a node over the rest. A description of the slave priority can be found here: https://redis.io/topics/sentinel
You can manually trigger a failover by running:
redis-cli -a {password} -p {sentinel_port} SENTINEL failover {cluster_name}
If you are using Lettuce Client you can use masterSlaveStatefulConnection and pass the sentinel URI it will perform auto discovery in the background and will refresh the master node internally.
https://github.com/lettuce-io/lettuce-core/wiki/Master-Replica

Redis - configure sentinel to elect slave if master shutdown

Hi i have create a cluster Redis with sentinel composed by 3 aws instances, i have configured sentinel to have an HA redis cluster and work, but if i simulate a crash of master (shutdown of master instance), sentinel installed on slaves, not locate sentinel of master and the election fail.
My sentinel configuration is:
sentinel monitor master ip-master 6379 2
sentinel down-after-milliseconds master 5000
sentinel failover-timeout master 10000
sentinel parallel-syncs master 1
Same file to all instaces
There are issues when running sentinel on the same node as the master and attempting to trigger a failover. Try it w/o running Sentinel on the master. Ultimately this means not running Sentinel on the same nodes as the Redis instances.
In your case your dead-node simulation is showing why you should not run Sentinel on the same node as Redis: If the node dies you lose one of your sentinels. In theory it should still work but as you and others have seen it isn't certain to work. I have some theories why but I've not yet confirmed them.
In a sense Sentinel is partly a monitoring system. Running a monitoring solution on the same nodes as are being monitored is generally unadvisable anyway, so you should be using off-node sentinels anyway. As Sentinel is resource efficient you don't necessarily need dedicated machines or large VMs. Indeed if you have a static set of application servers (where your client code runs), you should run Sentinel there, keeping in mind you want 3 minimum and a quorum of 50%+1.
recent redis version introduced the "protected-mode" option, which defaults to yes.
with protected-mode set to yes, redis instances, without a password set will not allow remote clients to execute commands.
this also affects sentinels master election.
try it with setting "protected-mode no" in the sentinels. this will allow them to talk to each other.
If you don't want to set protected-mode as no. you'd better set masterauth myredis in redis.conf and use sentinel auth-pass mymaster myredis in sentinel.conf

Redis master/slave replication - single point of failure?

How does one upgrade to a newer version of Redis with zero downtime? Redis slaves are read-only, so it seems like you'd have to take down the master and your site would be read-only for 45 seconds or more while you waited for it to reload the DB.
Is there a way around this?
Redis Team has very good documentation on this
Core Steps:
Setup your new Redis instance as a slave for your current Redis instance. In order to do so you need a different server, or a server that has enough RAM to keep two instances of Redis running at the same time.
If you use a single server, make sure that the slave is started in a different port than the master instance, otherwise the slave will not be able to start at all.
Wait for the replication initial synchronization to complete (check the slave log file).
Make sure using INFO that there are the same number of keys in the master and in the slave. Check with redis-cli that the slave is working as you wish and is replying to your commands.
Configure all your clients in order to use the new instance (that is, the slave).
Once you are sure that the master is no longer receiving any query (you can check this with the MONITOR command), elect the slave to master using the SLAVEOF NO ONE command, and shut down your master.
Full Documentation:
Upgrading or restarting a Redis instance without downtime
When taking the node offline, promote the slave to master using the SLAVEOF command, then when you bring it back online you set it up as a slave and it will copy all data from the online node.
You may also need to make sure your client can handle changed/missing master nodes appropriately.
If you want to get really fancy, you can set up your client to promote a slave if it detects an error writing to the master.
You can use Redis Sentinel for doing this, the sentinel will automatically promote a slave as new master.
you can find more info here http://redis.io/topics/sentinel.
Sentinel is a system used to manage redis servers , it monitors the redis master and slaves continuously, and whenever a master goes down it will automatically promote a slave in to master. and when the old master is UP it will be made as slave of the new master.
Here there will be no downtime or manual configuration of config file is needed.
You can visit above link to find out how to configure sentinel for your redis servers.
Note, you may have to check and set the following config to write to your slave.
("Since Redis 2.6 by default slaves are read-only")
redis-cli config set slave-read-only no
-- Example
-bash-4.1$ redis-cli info
Server
redis_version:2.6.9
-bash-4.1$ redis-cli slaveof admin2.mypersonalsite.com 6379
OK
-bash-4.1$ redis-cli set temp 42
(error) READONLY You can't write against a read only slave.
-bash-4.1$ redis-cli slaveof no one
OK
-bash-4.1$ redis-cli set temp 42
OK
-bash-4.1$ redis-cli get temp
"42"
-bash-4.1$ redis-cli config set slave-read-only no
OK
-bash-4.1$ redis-cli slaveof admin2.mypersonalsite.com 6379
OK
-bash-4.1$ redis-cli set temp 42
OK
-bash-4.1$ redis-cli get temp
"42"