Redis - unable to failover

I have a master-slave (3 node) setup using Redis Sentinel, but when I try to do a failover I see the error below:
127.0.0.1:26379> sentinel failover mymaster
(error) NOGOODSLAVE No suitable replica to promote
Below are the Redis server configurations for the master and slave nodes:
#redis.conf of master node
bind 127.0.0.1 192.26.x.1
protected-mode no
daemonize yes
logfile /opt/softwares/redis-6.0.16/log/redis-server.log
#redis.conf of slave node 1
bind 127.0.0.1 192.26.x.2
protected-mode no
daemonize yes
logfile /opt/softwares/redis-6.0.16/log/redis-server.log
replicaof 192.26.x.1 6379
#redis.conf of slave node 2
bind 127.0.0.1 192.26.x.3
protected-mode no
daemonize yes
logfile /opt/softwares/redis-6.0.16/log/redis-server.log
replicaof 192.26.x.1 6379
Below is the config for my Sentinel nodes:
# Same for all sentinel nodes
bind 127.0.0.1 192.26.x.1
protected-mode no
port 26379
daemonize yes
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/redis-sentinel.log"
sentinel monitor mymaster 192.26.x.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
When I queried the slaves' status, I could see that the master-related fields are not set:
127.0.0.1:26379> SENTINEL masters
31) "master-link-down-time"
32) "0"
33) "master-link-status"
34) "err"
35) "master-host"
36) "?"
37) "master-port"
38) "0"
I start the Redis servers on all nodes first and then the Redis Sentinels; I am not sure whether the startup order matters.
Please let me know whether I am missing some configuration or doing something wrong. The Redis version I am using is 6.0.16.
Thanks in advance.

Changing the order of the IP addresses in the bind directive in all configuration files solved the issue, i.e.
bind <public_ip> <localhost>
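With the IPs from the question, this presumably ends up looking like the following on each node (a sketch of the fixed bind lines only, not complete config files):

# redis.conf / sentinel.conf on node 192.26.x.1
bind 192.26.x.1 127.0.0.1
# redis.conf / sentinel.conf on node 192.26.x.2
bind 192.26.x.2 127.0.0.1
# redis.conf / sentinel.conf on node 192.26.x.3
bind 192.26.x.3 127.0.0.1

The likely reason the order matters is that Redis uses the first listed bind address for its outgoing replication and Sentinel traffic, and 127.0.0.1 is not routable to the other nodes.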

Related

Redis sentinel becomes stable after first failover

I have a Redis Sentinel setup with 3 Redis instances and 3 redis-sentinel instances. When I bring them up for the first time, the slaves constantly change their master to 127.0.0.1, then back to the real master, then to 127.0.0.1 again, and so on. When I force the master to be down for 20 seconds, another instance becomes master and everything becomes stable. Here are my configs. What can cause this and how can I solve it? The logs constantly report fix-slave-config while the Sentinels are unstable.
Sentinel config on the initial master:
daemonize yes
port 6380
bind 0.0.0.0
supervised systemd
pidfile "/run/redis/redis-sentinel.log"
logfile "/var/log/redis/sentinel.log"
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel auth-pass mymaster ***
sentinel down-after-milliseconds mymaster 4000
sentinel failover-timeout mymaster 2000
sentinel parallel-syncs mymaster 1
sentinel announce-ip IP_OF_CURRENT_MACHINE
Sentinel config on the initial slaves:
daemonize yes
port 6380
bind 0.0.0.0
supervised systemd
pidfile "/run/redis/redis-sentinel.log"
logfile "/var/log/redis/sentinel.log"
sentinel monitor mymaster MASTER_IP 6379 2
sentinel auth-pass mymaster ***
sentinel down-after-milliseconds mymaster 4000
sentinel failover-timeout mymaster 2000
sentinel parallel-syncs mymaster 1
sentinel announce-ip IP_OF_CURRENT_MACHINE
What I tried:
quorum = 1, 2
down-after-milliseconds = 4000, 10000
failover-timeout = 2000, 6000
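One diagnostic that matches this symptom is to ask each Sentinel which master address it is currently advertising; if any of them still answers with a loopback address, it was bootstrapped from the sentinel monitor mymaster 127.0.0.1 6379 2 line above and keeps steering the slaves back to 127.0.0.1. A sketch, assuming the Sentinels listen on the port 6380 configured above:

redis-cli -p 6380 sentinel get-master-addr-by-name mymaster
# an answer of "127.0.0.1" here means this Sentinel is announcing an address
# the other nodes cannot use, which fits the fix-slave-config flapping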

Redis Sentinel cannot fail over to the slave service

I am deploying a simple master-slave Redis setup with two servers, 192.168.0.101 and 192.168.0.103, where 101 is the master.
Here is the sentinel.conf on the 103 server:
port 26379
bind 192.168.0.103 127.0.0.1
sentinel myid 49f552d5540fdcb8aa60be25208c56b689d3c0b0
sentinel monitor mymaster 192.168.0.101 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 900000
sentinel auth-pass mymaster arsenal
sentinel config-epoch mymaster 0
# Generated by CONFIG REWRITE
dir "/etc/redis"
sentinel leader-epoch mymaster 3
sentinel known-slave mymaster 192.168.0.103 6379
sentinel current-epoch 3
And here is my redis.conf on the 103 server:
bind 127.0.0.1 ::1
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
daemonize yes
supervised no
dbfilename dump.rdb
dir /var/lib/redis
slaveof device1 6379
masterauth arsenal
slave-serve-stale-data yes
slave-read-only yes
slave-priority 100
requirepass arsenal
slave-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
activerehashing yes
aof-rewrite-incremental-fsync yes
I started the Sentinel on 192.168.0.103 with redis-server sentinel.conf --sentinel, and the log shows:
7951:X 14 Mar 14:19:48.479 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
7951:X 14 Mar 14:19:48.479 # Sentinel ID is 49f552d5540fdcb8aa60be25208c56b689d3c0b0
7951:X 14 Mar 14:19:48.479 # +monitor master mymaster 192.168.0.101 6379 quorum 2
7951:X 14 Mar 14:20:48.480 # +sdown slave 192.168.0.103:6379 192.168.0.103 6379 # mymaster 192.168.0.101 6379
7951:X 14 Mar 14:21:11.577 # +sdown master mymaster 192.168.0.101 6379
My Sentinel client call looks like this:
from redis.sentinel import Sentinel  # redis-py client
sentinel = Sentinel([('device3', 26379)], password='arsenal')
sentinel.discover_master('mymaster')
# raises: MasterNotFoundError: No master found for 'mymaster'
The problem is that after I stop the redis-server service on 101, the Sentinel cannot promote the 103 server to master.
Does anyone have an idea? Thanks.
In your config sentinel monitor mymaster 192.168.0.101 6379 2, the quorum is 2, which means a failover can only start when two or more Sentinels agree that the master is down.
See the Redis Sentinel docs: a stable deployment needs at least three Sentinels. With only one Sentinel, it cannot elect a leader (which requires a majority of votes) to start a failover.
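If your Redis version supports it, SENTINEL CKQUORUM gives a direct answer on whether enough Sentinels are reachable to authorize a failover; a quick check against the single Sentinel above:

redis-cli -h 192.168.0.103 -p 26379 sentinel ckquorum mymaster
# with only one Sentinel this is expected to return a NOQUORUM error rather
# than OK, confirming that a lone Sentinel cannot authorize a failover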

redis sentinel not promoting +sdown to +odown

I set up a cluster of 3 redis-sentinel instances (3.2.6-1) on three instances of redis-server (3.2.6-1).
I checked the firewall for the 6379 and 26379 TCP ports and it's all good.
The configuration for my redis-sentinel looks like this:
port 26379
dir "/tmp"
sentinel myid 0559ec26112bebce70bbfa5849f77338453315b
sentinel monitor rback 10.3.0.43 6379 2
sentinel down-after-milliseconds rback 5000
sentinel failover-timeout rback 10000
daemonize yes
pidfile "/var/run/redis/redis-sentinel.pid"
loglevel notice
logfile "/var/log/redis/redis-sentinel.log"
When I start the redis-server and redis-sentinel instances, I can query sentinel master rback on port 26379 and see the options:
9) "flags"
10) "master"
...
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
In the logs of the redis-sentinel, I see this:
26851:X 12 Jun 15:22:35.092 * +sentinel sentinel 4b22b6ff1b983432028f8cdb0db75cd553bec4b3 XXXXX 26379 # redis-back XXXXX 6379
26851:X 12 Jun 15:22:40.105 * +sentinel sentinel 8fc263bf82226364917478541c13f2c7f5b746e6 XXXXX 26379 # redis-back XXXXX 6379
26851:X 12 Jun 15:22:40.168 # +sdown sentinel 4b22b6ff1b983432028f8cdb0db75cd553bec4b3 XXXXX 26379 # redis-back XXXXX 6379
26851:X 12 Jun 15:22:45.120 # +sdown sentinel 8fc263bf82226364917478541c13f2c7f5b746e6 XXXXX 26379 # redis-back XXXXX 6379
And if I run the sleep command or crash the master Redis, I see each Sentinel log a +sdown event, but it is never promoted to +odown and no new master is elected.
How can I debug this?
Thanks
Added information:
I ran tcpdump and analysed the traffic with Wireshark, and found that the Sentinel connects to the other Sentinels and tries to communicate with them, but receives "DENIED Redis is running in protected mode...", even though the redis-servers are not running in protected mode.
The problem is the communication between the Sentinels.
Starting with version 3.2, Redis adds a "protected-mode" configuration flag to sentinel.conf as well.
A Sentinel will receive the error "DENIED Redis is running in protected mode..." if protected mode is not disabled in its sentinel.conf.
I found this information here:
https://newbiedba.wordpress.com/2016/07/01/redis-3-2-sentinel-with-protected-mode/
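Based on that, the fix is to explicitly disable protected mode (or add explicit bind addresses) in sentinel.conf on each Sentinel node and restart the Sentinels; a minimal sketch of the added lines:

# sentinel.conf on every Sentinel node (Redis >= 3.2)
protected-mode no
# alternatively, keep protected mode and bind the reachable interface instead:
# bind <this_node_ip> 127.0.0.1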

Redis Sentinel failover configuration always receives +sdown

I'm testing Redis failover with this simple setup:
3 Ubuntu 16.04 servers, with redis and redis-sentinel configured on each box.
Master IP: 192.168.0.18
Resque IP: 192.168.0.16
Resque2 IP: 192.168.0.13
Data replication works well, but I can't get failover to work.
When I start redis-sentinel, I always get a +sdown message after 60 seconds:
14913:X 17 Jul 10:40:03.505 # +monitor master mymaster 192.168.0.18 6379 quorum 2
14913:X 17 Jul 10:41:03.525 # +sdown master mymaster 192.168.0.18 6379
This is the configuration file for redis-sentinel:
bind 192.168.0.18
port 16379
sentinel monitor mymaster 192.168.0.18 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 6000
loglevel verbose
logfile "/var/log/redis/sentinel.log"
repl-ping-slave-period 5
slave-serve-stale-data no
repl-backlog-size 8mb
min-slaves-to-write 1
min-slaves-max-lag 10
The bind directive uses the proper IP on each box.
I followed the Redis tutorial here: https://redis.io/topics/sentinel, but I can't get the failover to work.
Redis server version: 3.2.9
The issue is in how redis-sentinel works: Sentinel cannot handle a password-protected redis-server.
In your redis-server configuration file (/etc/redis/redis.conf), do not use the "requirepass" directive if you want to use redis-sentinel.
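Applied to this setup, that advice amounts to commenting out the password directives in each redis-server config and restarting the servers (a sketch of the change only, with <password> as a placeholder):

# /etc/redis/redis.conf on every node
# requirepass <password>    <- remove or comment out, per the answer above
# masterauth <password>     <- likewise on the replicas, if it was set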

Redis Sentinel output in conf file

I was testing Redis Sentinel's failover ability. It worked, and Sentinel added some lines to the conf files. It auto-discovered the other Sentinels and slave replicas, but it added some strange IDs.
Can anyone tell me what those IDs represent? Since they come right after known-sentinel, I assume they are the IDs of those Sentinels, but I can't be sure.
# Generated by CONFIG REWRITE
sentinel known-slave redis_master 127.0.0.1 6379
sentinel known-slave redis_master 127.0.0.1 6381
sentinel known-sentinel redis_master 127.0.0.1 26380 26f81b692201f11f0f16747b007da9d4f079d9d3 # this
sentinel known-sentinel redis_master 127.0.0.1 26381 0b613c6146bbf261f08c1b13f1d1b2dbc2f99413 # and this?
It's the run_id of the Sentinel. Remember that Sentinel is a special Redis instance. Connect to the Sentinel and use "info server" to see its information, which includes the run_id, e.g.:
redis-cli -h sentinel_host -p sentinel_port
info server
If you have multiple Sentinels, you can use
sentinel sentinels mymaster (or redis_master in your situation)
to list the other Sentinels' information.
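As a quick cross-check (using the host and port from the generated lines above), the run_id reported by each Sentinel's INFO output should match the ID written after its known-sentinel entry:

redis-cli -h 127.0.0.1 -p 26380 info server | grep run_id
# expected: run_id:26f81b692201f11f0f16747b007da9d4f079d9d3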