We have two datacenters, each with two Redis instances. Generally they are replicated as a chain.
NY1 (Master) --> NY2 (Slave) --> CO1 (Slave) --> CO2 (Slave)
NY is New York and CO is Colorado, our backup datacenter. In order to save bandwidth over the WAN, we don't want CO1 and CO2 connected to NY1. Rather, we want a chain configuration, where only one slave connects directly to the master and the others are all "slaves of slaves".
Can this sort of replication layout be maintained using Sentinel? Or do all slaves have to be a slave of the master, and not a slave of a slave?
Currently this type of setup isn't possible with Sentinel because Sentinel rewrites the configurations of all monitored Redis systems.
For example, if you set up a system as you described and have Sentinel monitoring all of the hosts, then when the master goes down and a failover is forced, each of the Redis hosts will be reconfigured. One of the replicas (any of them) will become the new master, and the others will become replicas of the new master. When the old master comes back online, it will be reconfigured to be a replica of the new master.
However, in general you can get Redis to work the way you want. You can have as many replicas of a replica as you need by setting the replicaof config value to a replica.
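For example, here is a sketch of the chain using the hostnames from the question (the exact hostnames and ports are placeholders, and older Redis versions spell the directive slaveof instead of replicaof); each directive goes in the redis.conf of the host named in the comment:
# NY2 (prime replica) replicates directly from the master NY1
replicaof ny1.example.com 6379
# CO1 replicates from NY2, so only one copy of the replication stream crosses the WAN
replicaof ny2.example.com 6379
# CO2 replicates from CO1, staying entirely inside the Colorado datacenter
replicaof co1.example.com 6379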
Personally, I would still use Sentinel to monitor the master and the "prime" replicas (those that replicate from the master itself). This could result in one of the prime replicas becoming the new master, so I would enable the notification option. This tells Sentinel to call a script whenever a failover happens. In that script you can send an email, hit a Slack webhook, or whatever else you want to do. When I received the notification, I'd manually reconfigure the hosts back into the layout I want, but with the new master. It'd be a pain to do it this way, but I'd still get automatic failover of the master and prime replicas, so my apps would continue working.
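As a sketch of what that looks like in sentinel.conf (the master name and script paths here are placeholders, not part of the original setup):
# called for notable Sentinel events (+sdown, +odown, etc.); the script can send an email or hit a Slack webhook
sentinel notification-script mymaster /path/to/notify.sh
# called when a failover completes, with the old and new master addresses passed as arguments
sentinel client-reconfig-script mymaster /path/to/reconfig.sh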
Related
I am using Sentinel as a high availability solution for redis.
I have a problem.
To reduce the replication pressure on the master, our Redis instances are arranged in multiple levels, as follows:
In the Sentinel documentation I found that it can monitor multiple masters, so I introduced it, hoping it would work as follows:
The second level of replicas also acts as "masters" logically, so they need to be monitored too.
I got the opposite of what I wanted: when the Sentinels started, they held elections and ended up with many independent masters, i.e. actual masters (role: master), not the logical masters I intended.
Q: So can sentinels do the monitoring mode in the figure above?
My main configuration is as follows:
sentinel monitor top-master xxx.x.x.x 6379 2
sentinel monitor second-level-first xxx.x.x.x 6379 2
sentinel monitor second-level-second xxx.x.x.x 6379 2
sentinel monitor second-level-third xxx.x.x.x 6379 2
IN BRIEF - NO
To answer the above you would want to drill down into what sentinel is doing.
It is going to discover all of the slaves connected to a master.
It establishes a pub/sub connection with those nodes.
When your actual master fails and another node becomes the master, this change cannot be propagated correctly through your multi-level setup.
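You can see this discovery for yourself (assuming Sentinel is on its default port 26379 and the monitored master is named top-master, as in your configuration above):
# ask the master which replicas are attached to it
redis-cli -h <master-ip> -p 6379 INFO replication
# ask a Sentinel which slaves it has discovered for a monitored master
redis-cli -p 26379 SENTINEL slaves top-master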
In fact, to dig further, can you please share the configuration of your slave nodes on level 1? This should not have been possible at all; I am just wondering how it worked.
If you can share the config files, I will go through them and update this answer accordingly.
When running a single Redis instance, I can use "slaveof" to create a read-only replica (or as many as I like) of this one Redis node.
When using Redis Cluster, I split my data into partitions (masters) and can create a slave for each partition.
Is it possible to treat this cluster as a single instance and connect a "slaveof" slave to it, which would hold a replica of all the data in the cluster and not just the partition of the connected node?
If this is not possible with Redis Cluster, might it be a working solution when using Sentinel?
Our current Problem:
We are using the "slaveof" feature together with keepalived to fail over our Redis instance on an outage of the master.
But we have lots of "slaveof" slaves connected to the virtual IP of the failover setup to deliver cached data.
Now, every time the system fails over (e.g. for maintenance), all connected slaves stall for up to 30 seconds while they resync their data with the new master.
We already played with all possible Redis config parameters but can't get this sync time any shorter (e.g. by relying on the replication backlog, which isn't available on the new master after the failover).
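For example, these are the kinds of redis.conf replication parameters we experimented with (the values here are only illustrative):
# size and retention of the replication backlog used for partial resyncs
repl-backlog-size 64mb
repl-backlog-ttl 3600
# diskless sync can shorten full resyncs when disk I/O is the bottleneck
repl-diskless-sync yes
repl-diskless-sync-delay 5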
Anyone any ideas?
There is a very good doc here: http://redis.io/presentation/Redis_Cluster.pdf, and here: http://fr.slideshare.net/NoSQLmatters/no-sql-matters-bcn-2014 (slide #9), or better: https://www.javacodegeeks.com/2015/09/redis-clustering.html
If you want a "slave" of the full dataset in Redis Cluster mode, you need to add a replica for every master node.
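For example, with a recent redis-cli (5.0 or later; older versions used the redis-trib.rb script instead), you would attach one replica to each master in the cluster roughly like this (the addresses and node ID are placeholders):
# add a new node as a replica of a specific existing master in the cluster
redis-cli --cluster add-node 10.0.0.7:6379 10.0.0.1:6379 --cluster-slave --cluster-master-id <master-node-id>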
Regards,
Well, I just read this article:
https://seanmcgary.com/posts/how-to-build-a-fault-tolerant-redis-cluster-with-sentinel
The author used a single master with 2 slaves (instead of one), and let Redis Sentinel take care of electing one of the slaves as the new master when the master goes down.
You could play with this setup to see if the election of Master occurs quickly. While it's happening, clients would be served by a slave and should experience no downtime.
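If you do test it, these sentinel.conf parameters are the main knobs that influence how quickly a failover happens (the master name and values are illustrative):
# how long the master must be unreachable before this Sentinel considers it down
sentinel down-after-milliseconds mymaster 5000
# upper bound on how long a failover attempt may take before it is retried
sentinel failover-timeout mymaster 60000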
Hello stack community,
I have a question about Redis sentinel for a specific problem case. I use AWS with Multi AZ to create a sensu cluster.
On eu-central-1a I have Sensu + Redis (master), RabbitMQ + Sentinel, and 2 other Sentinels. The same on eu-central-1b, except that the Redis in this AZ is my slave.
What happens if there is a problem and eu-central-1a cannot communicate with eu-central-1b? I think the Sentinels on eu-central-1b would promote my Redis slave to master, because they cannot contact my Redis master. So I would have 2 Redis masters running at the same time in 2 different AZs.
But when the link between the AZs is restored, I will still have 2 masters with 2 different data sets. What will happen in this case? Will one master become a slave and the data be replicated without loss? Do we need to restart a master so that it becomes a slave?
Sentinel detects changes to the master, for example:
If the master goes down and is unreachable, a slave is elected as the new master. This is based on the quorum, where multiple Sentinels agree that the master has gone down; the failover then occurs.
Once Sentinel detects the old master coming back online, I believe it is reconfigured as a slave and the new master carries on. You will lose some data in the switchover from the old master to the new one; that is inevitable.
If you lose the connection between the AZs then yes, Sentinel won't work correctly, as it relies on multiple Sentinels agreeing that the Redis master is down. You shouldn't use Sentinel in a 2-Sentinel system.
A basic solution would be to put an extra Sentinel on another server, perhaps a client/application server that isn't running Redis/Sentinel. This way you can make proper use of the quorum, with the Sentinels agreeing that the master is down.
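Once the extra Sentinel is in place, you can ask any Sentinel whether the deployment is currently able to reach quorum and authorize a failover (the master name is whatever you used in your sentinel monitor line):
redis-cli -p 26379 SENTINEL ckquorum mymaster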
I've been doing some reading on how to use Redis Sentinel, and I know it's possible to have 2 or more sentinels, and load balance between them when calling from the client side.
Is it good practice to have these 2 sentinels in the same server as my master + slave? In other words, have 1 sentinel in the same physical server as master, and another in same physical server as slave?
It seems to me that if the master server dies, the sentinel on the slave will simply promote the slave to master. If the slave server dies, it doesn't matter because the master is still up.
Am I missing something? What are the downsides?
I'd rather have the sentinels on the same physical server as the master/slave to reduce latency.
First, Sentinel is not a load balancer or a proxy for Redis.
Second, not all failures are the death of a host. Sometimes the server hangs briefly, sometimes a network cable gets unplugged, etc. Because of this, it is not good practice to run Sentinel on the same hosts as your Redis instances. If you're using Sentinel to manage failover, anything less than three Sentinels running on nodes other than your Redis master and slave(s) is asking for trouble.
Sentinel uses a quorum mechanism to vote on a failover and on which slave to promote. With fewer than three Sentinels you run the risk of split-brain, where two or more Redis servers think they are the master.
Imagine the scenario where you run two servers and run sentinel on each. If you lose one you lose reliable failover capability.
Clients only connect to Sentinel to learn the current master connection information. Anytime the client loses connectivity they repeat this process. Sentinel is not a proxy for Redis - commands for Redis go directly to Redis.
The only reliable reason to run Sentinel with less than three sentinels is for service discovery, which means not using it for failover management.
Consider the two host scenario:
Host A: redis master + sentinel 1 (Quorum 1)
Host B: redis slave + sentinel 2 (Quorum 1)
If Host B temporarily loses network connectivity to Host A in this scenario, Host B will promote itself to master. Now you have:
Host A: redis master + sentinel 1 (Quorum 1)
Host B: redis master + sentinel 2 (Quorum 1)
Any clients which connect to Sentinel 2 will be told Host B is the master, whereas clients which connect to Sentinel 1 will be told Host A is the master (which, if you have your Sentinels behind a load balancer, means half of your clients).
Thus what you need to run to obtain minimum acceptable reliable failover management is:
Host A: Redis master
Host B: Redis Slave
Host C: Sentinel 1
Host D: Sentinel 2
Host E: Sentinel 3
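As a sketch, Hosts C, D, and E would each run the same minimal sentinel.conf along these lines (the master name, address, and timings are placeholders matching the layout above):
port 26379
# monitor the master on Host A; require 2 Sentinels to agree before declaring it objectively down
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000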
Your clients connect to the sentinels and obtain the current master for the Redis instance (by name), then connect to it. If the master dies the connection should be dropped by the client whereupon the client will/should connect to Sentinel again and get the new information.
How well each client library handles this is dependent on the library.
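With redis-cli you can do by hand what a client library does (assuming a Sentinel on its default port 26379 and a master registered under the name mymaster):
# returns the IP and port of the current master for the named service
redis-cli -h <sentinel-host> -p 26379 SENTINEL get-master-addr-by-name mymaster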
Ideally, Hosts C, D, and E are either the same hosts you connect to Redis from (i.e. the client hosts), or represent a good sampling of them. The main thrust here is to ensure you are checking reachability from wherever you need to connect to Redis. Failing that, place them in the same DC/rack/region as the clients.
If you are wanting to have your clients talk to a load balancer try to have your Sentinels on those LB nodes if possible, adding additional non-LB hosts as needed to obtain an odd number of sentinels > 2. An exception to this is if your client hosts are dynamic in that the number of them is inconsistent (they scale up for traffic, down for slow periods, for example). In this scenario you pretty much must run your Sentinels on non-client and non-redis-server hosts.
Note that if you do this you will then need to write a daemon which monitors the Sentinel PUBSUB channel for the master-switch event and updates the LB, which you must configure to only talk to the current master (never try to talk to both). It is more work to do it that way, but it makes the use of Sentinel transparent to the client, which only needs to know the LB IP/port.
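A minimal sketch of what such a daemon listens for (the channel name is the real Sentinel event name; what you do with each message, e.g. updating the LB, is up to you):
# each message is: <master-name> <old-ip> <old-port> <new-ip> <new-port>
redis-cli -h <sentinel-host> -p 26379 SUBSCRIBE +switch-master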
It all depends on the level of disaster recovery (DR) you want to achieve. Let's assume you have the following components, independently of where they are hosted:
2 Sentinels
1 Master
1+ Slaves
One host scenario
Host fails: you lose everything; a bad replication scenario for most use cases.
Two host scenario
Host 1:
(Current elected) Master
1 Sentinel
Host 2:
Slave
1 Sentinel
It is true that in this scenario the hosts can fail one at a time, which gives you some level of security. Just make sure that by "different server" you mean physically different hosts; if these are just VMs on the same host, you do not get the same level of DR (disaster recovery).
Regarding your question:
I'd rather have the sentinels on the same server as the master/slave to reduce latency.
Notice that Sentinels keep track of the current master and slaves, but Redis clients do not connect to the master via the Sentinels; they just learn from the Sentinels where the current master is. So in terms of reads and writes you're not looking at any considerable latency gains.
Configuration provider. Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.
(see: http://redis.io/topics/sentinel)
The way I see it, the only latency gains you get are for the heartbeats exchanged between the master/slaves and the Sentinels. As long as you are not spreading your servers across the whole world, that should be fine.
It all depends on the use cases, but it seems you would do best to keep things as separate as possible if all other things are equal (costs, distance to clients, etc).
You can have sentinels on the same machine as the master/slave, but the Sentinels must be odd in number (3/5/7). There should be at least three Sentinels, and it is a must to have a dedicated machine for at least one of them.
If you have only two nodes, then in a split-brain (network disruption) situation, the slave will be promoted to master. Both masters will now accept data from clients. However, when things come back to normal, one of the masters will be demoted to a slave. That master will lose all of its data, as it is now a slave and will replicate the data from the current master.
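One related knob, not mentioned above, that limits how many writes an isolated master can accept during the split (and therefore how much data can be lost when it is demoted) is the min-replicas setting in the master's redis.conf (values illustrative; older versions spell these min-slaves-*):
# stop accepting writes if fewer than 1 replica is connected, or if replicas lag by more than 10 seconds
min-replicas-to-write 1
min-replicas-max-lag 10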
Check this for a good explanation of Redis architectural designs and split-brain:
https://web.archive.org/web/20170527053749/http://www.yzuzun.com/2015/04/some-architectural-design-concepts-for-redis/
It's certainly not a recommended approach.
The Redis Sentinel docs explain the tradeoffs pretty well. Hope this helps.
https://redis.io/topics/sentinel#example-sentinel-deployments
We have a redis cluster with a master and a slave managed by three sentinel processes, and an additional remote slave, hosted in a different datacenter, for transparent failover and data preservation in the case that something bad happens to the master and slave machines.
It may happen that a transient error takes down the master redis process only, and in this situation we would like to see the slave process promoted to master, and the remote slave reslaved to it. However, it seems that sentinel could just as easily promote the remote slave to master, and we have not found any way to prevent this.
Is there any way to mark a particular slave machine as unpromotable, so that sentinel will not try to make it the master in the event of a failover?
Yes. In the slave's config file, set the slave-priority setting to zero (the number, not the word).
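For instance, in the remote slave's redis.conf (newer Redis versions also accept the spelling replica-priority):
# a priority of 0 means Sentinel will never select this replica for promotion
slave-priority 0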