I'm trying to set up an automatic failover system in a 3-node Redis cluster. I installed redis-sentinel on each of these nodes (just like in this blog post: http://www.symantec.com/connect/blogs/configuring-redis-high-availability).
Everything is fine as long as I have two or three nodes up. The problem is that whenever there's only one node remaining and it's a slave, it does not get elected as master automatically. The quorum is set to 1, so the last node detects the ODOWN of the master but can't vote for the failover since there's no majority.
To overcome this (surprising) issue, I wrote a little script that asks the other nodes for their masters; if they don't answer, I set the current node as the master. This script is called from the redis-sentinel.conf file as a notification script. However, as soon as the redis-sentinel service is started, this configuration is "erased"! If I look at the configuration file in /etc, the "sentinel notification-script" line has disappeared (redis-sentinel rewrites its configuration file, so why not), BUT the configuration I wrote is no longer applied:
1) 1) "name"
2) "mymaster"
3) "ip"
4) "x.x.x.x"
5) "port"
6) "6379"
7) "runid"
8) "somerunid"
9) "flags"
10) "master"
11) "pending-commands"
12) "0"
13) "last-ping-sent"
14) "0"
15) "last-ok-ping-reply"
16) "395"
17) "last-ping-reply"
18) "395"
19) "down-after-milliseconds"
20) "30000"
21) "info-refresh"
22) "674"
23) "role-reported"
24) "master"
25) "role-reported-time"
26) "171302"
27) "config-epoch"
28) "0"
29) "num-slaves"
30) "1"
31) "num-other-sentinels"
32) "1"
33) "quorum"
34) "1"
35) "failover-timeout"
36) "180000"
37) "parallel-syncs"
38) "1"
That is the output of the sentinel masters command. The only thing is that I had previously set "down-after-milliseconds" to 5000 and "failover-timeout" to 10000 ...
Has anyone run into anything similar? Well, should someone have a little idea about what's happening, I'd be glad to hear it ;)
This is a reason not to place your sentinels on your Redis instance nodes. Think of them as monitoring agents. You wouldn't place your website monitor on the same node running your website and expect to catch the node's death. The same is expected with Sentinel.
The proper route to sentinel monitoring is to ideally run them from the clients, and if that isn't possible or workable, then from dedicated nodes as close to the clients as possible.
As antirez said, you need to have enough sentinels to hold the election. There are actually two elections: 1) deciding on the new master and 2) deciding which sentinel handles the promotion. In your scenario you only have one sentinel left, but to elect a sentinel to handle the promotion, that sentinel needs votes from a majority of all sentinels it has seen. In your case that means two sentinels must vote before an election can take place. This majority requirement is not configurable and is unaffected by the quorum setting. It is in place to reduce the chance of ending up with multiple masters.
I would also strongly advise against setting the quorum to less than half+1 of your sentinels. This can lead to split-brain operation, where you have two masters (or in your case, three). If you lost connectivity between your master and the two slaves while clients still had connectivity, your settings could trigger split brain: a slave gets promoted and new connections talk to that master while existing ones keep talking to the original. You then have valid data in two masters, which will likely conflict with each other.
The author of that Symantec article only considers the Redis daemon dying, not the node. Thus it really isn't an HA setup.
The quorum is only used to reach the ODOWN state, which triggers the failover. For the failover to actually happen, the slave must be voted in by a majority of Sentinels, so a single node can't get elected. If you have such a requirement, and you don't mind that only the majority side can continue operating in your cluster (this means unbounded data loss on the minority side if clients get partitioned with a minority where there is a master), you can just add Sentinels on your client machines as well. This way the total number of Sentinels is, for example, 5, and even if two Redis nodes are down, the one remaining node plus the two Sentinels running client-side are enough to reach the majority of 3. Unfortunately the Sentinel documentation is not complete enough to explain this: all the information needed to get the picture right is there, but there are no examples for faster reading / deploying.
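For example, with the three Redis nodes plus two Sentinels running on client machines (five Sentinels total), each sentinel.conf could monitor the same master with a quorum of 3. A minimal sketch, assuming the master is named "mymaster" and using placeholder addresses plus the timings mentioned in the question:

sentinel monitor mymaster x.x.x.x 6379 3
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000

With that layout, losing two Redis nodes (and the Sentinels co-located with them) still leaves three Sentinels, one on the surviving Redis node and two client-side, which is enough to authorize the failover.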
My work system consists of Spring web applications that use Redis as a transaction counter and conditionally block transaction requests.
The transaction is as follows:
Check whether or not data exists. (HGET)
If it doesn't, save a new one with a count of 0 and set an expiration time. (HSET, EXPIRE)
Increase the count value. (INCRBY)
If the increased count reaches a specific configured limit, set the transaction to 'blocked'. (HSET)
The limit value is my company's business policy.
Such read and write operations are requested one after another, immediately.
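In redis-cli terms, a single request boils down to roughly this sequence (the key name, field names and TTL below are placeholders, and the increment is shown as HINCRBY since the count lives in a hash):

127.0.0.1:6379> HGET tx:1234 count
(nil)
127.0.0.1:6379> HSET tx:1234 count 0
(integer) 1
127.0.0.1:6379> EXPIRE tx:1234 60
(integer) 1
127.0.0.1:6379> HINCRBY tx:1234 count 1
(integer) 1

Once the incremented value reaches the configured limit, a final HSET marks the transaction as blocked.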
Currently, I use one Redis instance on one machine (master only, no replicas).
I want to get Redis HA, so I need slave instances, but at the same time I want all reads and writes to go only to the master instance because of slave data replication latency.
After some research, I found that it is a good idea to have a proxy server for Redis HA. However, with a proxy, it seems impossible to direct requests only to the master instance and keep the slaves purely for failover.
Is it possible??
Thanks in advance.
What you need is Redis Sentinel.
With Redis Sentinel, you can get the master address from Sentinel and read/write against the master. If the master goes down, Redis Sentinel performs the failover and elects a new master; you can then get the address of the new master from Sentinel.
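For example, from redis-cli you can ask any Sentinel for the current master address (assuming the default Sentinel port 26379 and a master named "mymaster"):

127.0.0.1:26379> SENTINEL get-master-addr-by-name mymaster
1) "x.x.x.x"
2) "6379"

A Sentinel-aware client repeats this lookup after a failover to discover the newly promoted master.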
Since you're going to use Lettuce as the Redis cluster driver, you should set the read preference to MASTER and things should work fine. A sample configuration might look like this:
import java.util.ArrayList;
import java.util.List;

import io.lettuce.core.ReadFrom;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisNode;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

// Route all reads to master nodes only
LettuceClientConfiguration lettuceClientConfiguration =
        LettuceClientConfiguration.builder().readFrom(ReadFrom.MASTER).build();

// Seed the cluster topology with the known nodes
RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration();
List<RedisNode> redisNodes = new ArrayList<>();
redisNodes.add(new RedisNode("127.0.0.1", 9000));
redisNodes.add(new RedisNode("127.0.0.1", 9001));
redisNodes.add(new RedisNode("127.0.0.1", 9002));
redisNodes.add(new RedisNode("127.0.0.1", 9003));
redisNodes.add(new RedisNode("127.0.0.1", 9004));
redisNodes.add(new RedisNode("127.0.0.1", 9005));
redisClusterConfiguration.setClusterNodes(redisNodes);

LettuceConnectionFactory lettuceConnectionFactory =
        new LettuceConnectionFactory(redisClusterConfiguration, lettuceClientConfiguration);
lettuceConnectionFactory.afterPropertiesSet();
See it in action at Redis Cluster Configuration.
RabbitMQ version: 3.7.21
Erlang version: 21.3.8.10
My team had 2 nodes hit the memory watermark last night, so I rebuilt the bad nodes, but that left some queues in a bad state. I want to clear them out so that we can recreate them.
The stats show NaN for Ready, Unacked, and Total and the stats in queue look like:
It looks like the queue's node is one that no longer exists so unfortunately I can't access it. It's completely gone.
I have tried the following commands:
rabbitmqctl eval 'Q = rabbit_misc:r(<<"/">>, queue, <<"QUEUE">>), rabbit_amqqueue:internal_delete(Q).'
rabbitmqctl eval 'Q = {resource, <<"/">>, queue, <<"QUEUE">>}, rabbit_amqqueue:internal_delete(Q).'
but get this error:
{:undef, [{:rabbit_amqqueue, :internal_delete, [{:resource, "/", :queue, "QUEUE"}], []}, {:erl_eval, :do_apply, 6, [file: 'erl_eval.erl', line: 680]}, {:rpc, :"-handle_call_call/6-fun-0-", 5, [file: 'rpc.erl', line: 197]}]}
Which I assume means it's trying to make an RPC call to a node that no longer exists and failing. This seems crazy to me because not only is the node gone, it has also been forgotten by the cluster, yet a couple of queues still remain.
Looks like there are 3 options:
Comb through the Mnesia tables and delete the corrupted ones
Fully rebuild the cluster and migrate to a new cluster
Rename your queues and ignore corrupted ones
We're going to go with Option 3 for now. I'm sure eventually there will be a breaking change in RabbitMQ that makes Option 2 more appealing, but for now the quick fix is best for me.
According to https://groups.google.com/g/rabbitmq-users/c/VSjzvOUfS3s/m/q8OmFTqACAAJ, the internal_delete function in 3.7.x takes two arguments:
In 3.7.x rabbit_amqqueue:internal_delete takes two arguments (acting user name is the second one).
Therefore, the next time you need to delete a queue in a bad state, try
rabbitmqctl eval 'Q = {resource, <<"/">>, queue, <<"QUEUE">>}, rabbit_amqqueue:internal_delete(Q, <<"CLI">>).'
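Afterwards you can confirm the stale queue record is really gone by listing the queues in the vhost, e.g.:

rabbitmqctl list_queues -p / name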
In redis-cli on the master, what command lists the slaves connected to the master?
I only found a command to check the status of a server.
To check the status of the current server, open redis-cli and run:
> role
1) "master"
2) (integer) 196364
3) 1) 1) "192.168.1.90"
      2) "6379"
      3) "196364"
   2) 1) "192.168.1.7"
      2) "6379"
      3) "196364"
The easiest way to list all replicas connected to a Redis master is with the CLIENT LIST command, e.g.:
CLIENT LIST TYPE replica
Note: the TYPE filter was added in v5.
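Alternatively, INFO replication on the master lists each connected replica on its own slaveN: line (this also works on versions older than 5). The output looks roughly like this, with the exact fields varying by version:

127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.90,port=6379,state=online,offset=196364,lag=0
slave1:ip=192.168.1.7,port=6379,state=online,offset=196364,lag=0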
My setup is Redis master-slave replication. I am sure the slaves are read-only because when I connect to a slave and try to write data, "(error) READONLY You can't write against a read only slave." is returned.
However, when I check the slowlog there are SET commands, eg:
127.0.0.1:6379> slowlog get 1
1) 1) (integer) 1360
2) (integer) 1544276677
3) (integer) 10653
4) 1) "SET"
2) "some value"
Can anyone explain this? Thanks in advance.
The Redis replica is replaying commands sent from the master, so the SET command must have originated from it.
It is unclear why that command ended up in the slowlog, but it could be due to any number of reasons (e.g. IO or CPU pressure). If this happened once I wouldn't worry about it, but if it is pathological you may want to review your replica's infrastructure and configuration.
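One thing worth noting: the entry above took 10653 microseconds, which is just over Redis's default slowlog threshold of 10000 microseconds, so even a mildly slow write gets recorded. You can inspect (or raise) the threshold on the replica; the values shown below are the defaults:

127.0.0.1:6379> CONFIG GET slowlog-log-slower-than
1) "slowlog-log-slower-than"
2) "10000"
127.0.0.1:6379> CONFIG GET slowlog-max-len
1) "slowlog-max-len"
2) "128"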
I'm hitting a replication issue on a three-node Disque cluster. It seems weird because the use case is fairly typical, so it's entirely possible I'm doing something wrong.
This is how to reproduce it locally:
# relevant disque info
disque_version:1.0-rc1
disque_git_sha1:0192ba7e
disque_git_dirty:0
disque_build_id:b02910aa5c47590a
Start 3 disque nodes on ports 9001, 9002 and 9003, then have the servers on ports 9002 and 9003 meet with 9001.
127.0.0.1:9002> CLUSTER MEET 127.0.0.1 9001 #=> OK
127.0.0.1:9003> CLUSTER MEET 127.0.0.1 9001 #=> OK
The HELLO reports the same data for all three nodes, as expected.
127.0.0.1:9003> hello
1) (integer) 1
2) "e93cbbd17ad12369dd2066a55f9d4c51be9c93dd"
3) 1) "b61c63e8fd0c67544f895f5d045aa832ccb47e08"
2) "127.0.0.1"
3) "9001"
4) "1"
4) 1) "b32eb6501e272a06d4c20a1459260ceba658b5cd"
2) "127.0.0.1"
3) "9002"
4) "1"
5) 1) "e93cbbd17ad12369dd2066a55f9d4c51be9c93dd"
2) "127.0.0.1"
3) "9003"
4) "1"
Enqueuing a job succeeds, but the job does not show up in either QLEN or QPEEK on the other nodes.
127.0.0.1:9001> addjob myqueue body 1 #=> D-b61c63e8-IFA29ufvL37FRVjVVWisbO/x-05a1
127.0.0.1:9001> qlen myqueue #=> 1
127.0.0.1:9002> qlen myqueue #=> 0
127.0.0.1:9002> qpeek myqueue 1 #=> (empty list or set)
127.0.0.1:9003> qlen myqueue #=> 0
127.0.0.1:9003> qpeek myqueue 1 #=> (empty list or set)
When explicitly setting a replication level higher than the number of nodes, disque fails with a NOREPL as one would expect. An explicit replication level of 2 succeeds, but the jobs are still nowhere to be seen on nodes 9002 and 9003. The same behavior happens regardless of the node on which I add the job.
My understanding is that replication happens synchronously when calling ADDJOB (unless ASYNC is explicitly used), but it doesn't seem to be working properly. The test suite passes on the master branch, so I'm hitting a wall here and will have to dig into the source code. Any help would be greatly appreciated!
The job is replicated, but it's enqueued only in one node. Try killing the first node to see the job enqueued in a different one.
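For example (hypothetical session, after shutting down the node on port 9001 and waiting for the remaining nodes to detect the failure and re-queue the job):

127.0.0.1:9002> qlen myqueue #=> 1   (or on 9003, whichever node holds the replica)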