Is the master always the Redis instance with the lowest priority?

I am running master-slave Redis with Sentinel.
When I kill my master, the lowest-priority slave becomes the new master.
But when I start my old master again, which has an even lower priority, it does not become the master.
Is this behavior intended and documented somewhere? I can't seem to find anything in the Redis Sentinel documentation.

In regards to "failover behavior. It's clearly states that the lowest priority slave is preferred (unless it's zero) -
see the docs ("Slaves priority" section)
In Regards to "fallback" behavior. Once the old master goes back online, it will not regain it's old master status back. This is intentional as the idea is to change the state of the sentinel ha-cluster as little as possible. Once the next failover takes place, if the old master ( now a slave ) has the lowest priority, it'll promoted to master again.

Related

Redis Cluster Master Failover Scenario

In a 6-node Redis Cluster, with 3 master nodes and 3 slave nodes, if a master goes down, the corresponding slave will be promoted. When the old master comes back online, it will be a slave.
Is it possible to force it somehow, from the Redis config or otherwise, so that when it comes back online the old master is promoted back to master, as it was at the beginning?
Thank you!
If the old master comes up with the property:
slaveof no one
it will join the cluster as a master, but I don't think you would want to do that.
The 'old master' does not have the latest data; if you force it to become the master, there will be data loss.
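If you just want to confirm what role the old master ends up with after it rejoins, a minimal sketch (assuming redis-py; the address and port are placeholders for one cluster node) is:

# Sketch only: assumes redis-py; address/port are placeholders.
import redis

node = redis.Redis(host="10.0.0.4", port=7000)

# ROLE reports whether this node currently acts as a master or a slave,
# and, for a slave, which master it replicates from.
print(node.execute_command("ROLE"))

# INFO replication exposes the same thing in a readable form.
print(node.info("replication")["role"])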

Is the SETNX always executed on redis master in the sentinel system?

I wanted to ask:
Is the SETNX command (both the NX test and the SET) guaranteed to execute on the Redis master in the context of the "Redis Sentinel System"?
Is it guaranteed to be atomic in the context of the "Redis Sentinel System"?
My understanding after reading the documentation is YES to both, because:
Ad. 1: only the master can accept writes, and since SETNX has a set/write component it has to go to the master (because all writes go through the master).
Ad. 2: since the SET will be executed on the master, it only makes sense to check the NX part on the master as well (no slaves queried, ever); otherwise it would be unnecessarily time consuming and could undermine atomicity.
Can someone confirm this with 100% certainty, or maybe point me to some documentation that clears my doubts?
Thanks in advance!
I can confirm the above with 99.97% (3 sigma) certainty.
Ad. 1 only the master can accept writes and since SETNX has a set/write component it has to go to the master (because all writes go through the master).
Correct, excluding the scenario where you deliberately enable writing to replicas and connect to a replica.
Ad. 2 since the SET will be executed on the master it only makes sense to check the NX part also on the master (no slaves queried, ever), otherwise it would be unnecessarily time consuming and could undermine atomicity.
Yep.
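As an illustration, here is a minimal redis-py sketch (the sentinel addresses and the master name "mymaster" are placeholders) that lets Sentinel resolve the current master and then runs the equivalent of SETNX against it. The NX check and the SET are one command, so both happen atomically on the master.

# Sketch only: assumes redis-py and a Sentinel-managed master named "mymaster".
from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("10.0.0.1", 26379), ("10.0.0.2", 26379), ("10.0.0.3", 26379)],
    socket_timeout=0.5,
)

# master_for() resolves and connects to the current master, so the write
# (including its NX condition) is executed on the master.
master = sentinel.master_for("mymaster", socket_timeout=0.5)

# SET key value NX is the modern form of SETNX; it is a single atomic command.
acquired = master.set("my-lock", "owner-1", nx=True)
print("lock acquired:", bool(acquired))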

Redis sentinel vs clustering

I understand Redis Sentinel is a way of configuring HA (high availability) among multiple Redis instances. As I see it, there is one Redis instance actively serving client requests at any given time, and two additional servers on standby (waiting for a failure to happen, so one of them can take over).
Is it a waste of resources?
Is there a better way of making full use of the resources available?
Is Redis clustering an alternative to Redis Sentinel?
I already looked up the Redis documentation for Sentinel and clustering; can somebody with experience please explain?
UPDATE
OK. In my real deployment scenario I have two servers dedicated to Redis. I have another server on which my JBoss server is running. The application running in JBoss is configured to connect to the Redis master server (M).
Failover scenario
Ideally, I think that when the master cache server fails (either the Redis process goes down or the machine fails), the application in JBoss needs to connect to the slave cache server. How would I configure the Redis servers to achieve this?
+--------+          +--------+
| Master |----------| Slave  |
|        |          |        |
+--------+          +--------+
Configuration: quorum = 1
First, let's talk Sentinel.
Sentinel manages the failover; it doesn't configure Redis for HA. That is an important distinction. Second, the diagram you posted is actually a bad setup - you don't want to run Sentinel on the same node as the Redis nodes it is managing. When you lose that host you lose both.
As to "Is it waste of resources?" it depends on your use case. You don't need three Redis nodes in that setup, you only need two. Three increases your redundancy, but is not required. If you need the added redundancy then it isn't a waste of resources. If you don't need redundancy then you just run a single Redis instance and call it good - as running more would be "wasted".
Another reason for running two slaves would be to split reads. Again, if you need it then it wouldn't be a waste.
As to "Is there a better way of using full use of the resources available?" we can't answer that as it is far too dependent on your specific scenario and code. That said if the amount of data to store is "small" and the command rate is not exceedingly high, then remember you don't need to dedicate a host to Redis.
Now for "Is Redis clustering an alternative to Redis sentinel?".
It really depends entirely on your use case. Redis Cluster is not an HA solution - it is a multiple writer/larger-than-ram solution. If your goal is just HA then it likely won't be suitable for you. Redis Cluster comes with limitations, particularly around multi-key operations, so it isn't necessarily a straightforward "just use cluster" operation.
If you think having three hosts running Redis (and three running sentinel) is wasteful, you'll likely hold Cluster to be even more so as it does require more resources.
The questions you've asked are probably too broad and opinion-based to survive as written. If you have a specific case/problem you are working out please update with that so we can provide specific assistance and information.
Update for specifics:
For proper failover management in your scenario I would go with 3 sentinels, one running on your JBoss server. If you have 3 JBoss nodes then go with one on each. I'd have a Redis pod (master+slave) on separate nodes, and let sentinel manage the failover.
From there it is a matter of wiring up JBoss/Jedis to use Sentinel for its information and connection management. As I don't use those, a quick search turns up that Jedis has support for it; you just need to configure it correctly. Some examples I found are at "Looking for an example of Jedis with Sentinel" and https://github.com/xetorthio/jedis/issues/725, which talk about JedisSentinelPool being the route for using a pool.
When Sentinel executes a failover the clients will be disconnected and Jedis will (should?) handle the reconnection by asking the Sentinels who the current master is.
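The same pattern in any Sentinel-aware client looks roughly like this; a minimal redis-py sketch (the addresses and the master name "mymaster" are placeholders; the Jedis equivalent is JedisSentinelPool):

# Sketch only: assumes redis-py; addresses are placeholders.
from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("jboss-host", 26379), ("redis-a", 26379), ("redis-b", 26379)]
)

# Ask the sentinels who the current master is - this is what a client
# repeats after a failover-induced disconnect.
print(sentinel.discover_master("mymaster"))

master = sentinel.master_for("mymaster")   # writes always go to the current master
replica = sentinel.slave_for("mymaster")   # optional read splitting

master.set("greeting", "hello")
print(replica.get("greeting"))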
This is not a direct answer to your question, but I think it's helpful information for Redis newbies like me. Also, this question appears as the first link in Google when searching for "Redis cluster vs sentinel".
Redis Sentinel is the name of the Redis high availability solution...
It has nothing to do with Redis Cluster and is intended to be used by
people that don't need Redis Cluster, but simply a way to perform
automatic fail over when a master instance is not functioning
correctly.
Taken from the Redis Sentinel design draft 1.3
It's not obvious when you are new to Redis and implementing a failover solution. The official documentation about Sentinel and clustering doesn't compare one to the other, so it's hard to choose the right approach without reading tons of documentation.
The recommendation, everywhere, is to start with an odd number of instances, not using two or a multiple of two. That was corrected, but let's correct some other points.
First, to say that Sentinel provides failover without HA is false. When you have failover, you have HA with the additional benefit of application state being replicated. The distinction is that you can have HA in a system without replication (it's HA but it's not fault tolerant).
Second, running a sentinel on the same machine as its target redis instance is not a "bad setup": if you lose your sentinel, or your redis instance, or the whole machine, the results are the same. That's probably why every example of such configurations shows both running on the same machine.
Additional info to the above answers
Redis Cluster
One main purpose of Redis Cluster is to distribute your data load equally/uniformly by sharding.
Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what is called a hash slot.
There are 16384 hash slots in Redis Cluster, and every node in a Redis Cluster is responsible for a subset of the hash slots. So, for example, you may have a cluster with 3 nodes,
where:
Node A contains hash slots from 0 to 5500,
Node B contains hash slots from 5501 to 11000,
Node C contains hash slots from 11001 to 16383
This allows us to add and remove nodes in the cluster easily. For example, if we want to add a new node D, we need to move some hash slots from nodes A, B and C to D (see the sketch at the end of this answer for how keys map to slots).
Redis Cluster supports the master-slave structure: you can create slaves A1, B1, C1 along with masters A, B, C when creating a cluster, so when master B goes down, slave B1 gets promoted to master.
You don't need additional failover handling when using Redis Cluster and you should definitely not point Sentinel instances at any of the Cluster nodes.
So in practical terms, what do you get with Redis Cluster?
1. The ability to automatically split your dataset among multiple nodes.
2. The ability to continue operations when a subset of the nodes is experiencing failures or is unable to communicate with the rest of the cluster.
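To make the hash-slot idea concrete, here is a small sketch (assuming redis-py and a cluster node reachable at the placeholder address) that asks the cluster which slot a key maps to. Keys sharing the same {hash tag} land in the same slot, which is how related keys are kept on one node for multi-key commands.

# Sketch only: assumes redis-py; address/port are placeholders for any cluster node.
import redis

node = redis.Redis(host="10.0.0.1", port=7000)

# CLUSTER KEYSLOT returns the hash slot (0-16383) that a key belongs to.
print(node.execute_command("CLUSTER", "KEYSLOT", "user:1000"))

# Keys that share the same {hash tag} map to the same slot, so they can be
# used together in multi-key commands on a cluster.
print(node.execute_command("CLUSTER", "KEYSLOT", "{user:1000}.profile"))
print(node.execute_command("CLUSTER", "KEYSLOT", "{user:1000}.sessions"))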
Redis Sentinel
Redis supports multiple slaves replicating data from a master node.
This provides a backup for data in master node.
Redis Sentinel is a system designed to manage masters and slaves. It runs as a separate program. The minimum number of sentinels required in an ideal system is 3. They communicate among themselves and make sure that the master is alive; if it is not, they will promote one of the slaves to master, so later, when the dead node comes back up, it will act as a slave of the new master.
The quorum is configurable. Basically it is the number of sentinels that need to agree that the master is down; N/2 + 1 should agree, where N is the number of nodes in the pod (note that this setup is called a pod, not a cluster; see the configuration sketch below).
So in practical terms, what do you get with Redis Sentinel?
It will make sure that the master is always available (if the master goes down, a slave will be promoted to master).
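Sentinels are normally configured through sentinel.conf (for example, a line like "sentinel monitor mymaster 10.0.0.1 6379 2", where the trailing 2 is the quorum), but the same can be done at runtime. A minimal sketch (assuming redis-py; the addresses and the name "mymaster" are placeholders):

# Sketch only: assumes redis-py and a sentinel listening at the placeholder address.
import redis

sentinel_node = redis.Redis(host="10.0.0.10", port=26379)

# Start monitoring a master named "mymaster"; the final argument is the quorum:
# the number of sentinels that must agree that the master is down.
sentinel_node.execute_command("SENTINEL", "MONITOR", "mymaster", "10.0.0.1", "6379", "2")

# Ask this sentinel which address it currently believes is the master.
print(sentinel_node.execute_command("SENTINEL", "GET-MASTER-ADDR-BY-NAME", "mymaster"))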
References:
https://fnordig.de/2015/06/01/redis-sentinel-and-redis-cluster/
https://redis.io/topics/cluster-tutorial
This is my understanding after banging my head throughout the documentation.
Sentinel is a kind of hot-standby solution where the slaves are kept replicated and ready to be promoted at any time. However, it won't support any multi-node writes. Slaves can be configured for read operations. It's NOT true that Sentinel won't provide HA; it has all the features of a typical active-passive cluster (though that's not the right term to use here).
Redis Cluster is more or less a distributed solution, working on top of shards. Each chunk of data is distributed among master and slave nodes. A minimum replication factor of 2 ensures that you have two copies of each shard available across master and slaves.
If you know how sharding works in Mongo or Elasticsearch, it will be easy to catch up.
Redis can operate in a partitioned cluster mode (with many masters and slaves of those masters) or in single-instance mode (a single master with replica slaves).
The link here says:
When using Redis in single instance mode, in which a single Redis server manages the entire unpartitioned database, Redis Sentinel is used to manage its availability
It also says:
A Redis cluster, in which data is partitioned among multiple primary instances, manages availability by itself and requires no extra components.
So HA can be ensured in the 2 mentioned scenarios. Hope this clears up the doubts. Redis Cluster and Sentinel are not alternatives to each other. They are just used to ensure HA in the different cases of a partitioned or non-partitioned master.
Redis Sentinel performs the failover, promoting a replica when the sentinels see that a master is down. You typically want an odd number of sentinel nodes. For the example of one master and one replica, 3 sentinels should be used so there can be a consensus on the decision. Ideally the 3rd sentinel is on a 3rd server so the decision is not skewed (depending on the failure). Sentinel takes care of changing the master/replica config settings on your nodes so that promotion and syncing occur in the correct order and you don't overwrite data by bringing back an old failed master that now contains older data.
Once you have your sentinel nodes set up to perform failovers, you need to ensure you are pointing to the correct instance. See an example of HAProxy configuration for this. HAProxy performs health checks and will point to the new master if a failure occurs.
Clustering will allow you to scale horizontally and can help handle high loads. It does take a bit of work to set up and configure up front.
There is an open-source fork of Redis, "KeyDB", that has eliminated the need for sentinel nodes with an active-replica option. This allows the replica node to accept reads and writes. When a failover occurs, HAProxy stops reads/writes to the failed node and just uses the remaining active node, which is already synced. Timestamping enables the failed nodes to rejoin automatically and resync without losing data when they come back online. Setup is simple, and for higher traffic you don't need special upfront setup to direct reads to the replica node and reads/writes to the master. See an example of active replication here. KeyDB is also multi-threaded, which for some applications might be an alternative to clustering, but it really depends on what your needs are.
There is also an example of setting up clustering manually and with the create-cluster tool. The steps are the same if you are using Redis (replace 'keydb' with 'redis' in the instructions).

promoting a master in replication

I am designing a replication algorithm, to promote a master among many slaves. I want it to be faster and simpler than Paxos. The basic idea is:
Assign each node a 'Promotion Priority'; for example, for 5 nodes there would be priorities 50, 40, 30, 20 and 10, with 50 the highest and 10 the lowest.
When a master needs to be elected, all slaves will send (at the same time) the other 4 nodes a message requesting to become the master, but only the node that is confirmed by all slaves with a confirmation message will be elected. A slave will send a confirmation message if its own 'Promotion Priority' is lower than that of the asking node, or if the asking node with higher priority times out in issuing a rejection message for its own request.
If a slave receives a rejection message from a slave with a higher 'Promotion Priority', it will abort the procedure.
There should be no nodes with the same priority.
There will be a minimum number of confirmation messages that a slave should collect in order to become a master.
This algorithm should be faster because all the slaves will be electing a master in parallel and the priority will help to speed up the process.
What do you think about it? Does any other algorithm for master promotion with priority exist?
What do you think about it?
It is hard to completely assess the validity of your algorithm without knowing the details of your requirements. Overall, it looks like a valid approach, but there are a few issues that I think deserve some attention.
Your question has some similarities to A distributed algorithm to assign a shared resource to one node out of many. Consequently, some of the arguments raised in my answer to that question hold for this question as well.
When master needs to be elected, all slaves will send (at the same
time) the other 4 nodes a message requesting to become a master, but
only that master will be elected that will be confirmed by all slaves
with a confirmation message.
This approach assumes that all slaves know how many slaves are present at any time -- otherwise the prospective master can never conclude that it has received a confirmation from all slaves. Implicitly, this means that no slave can leave or join the system without breaking the algorithm.
In practice though, these slaves will come and go, because of crashes, reboots, network outages etc. The chances of this increase with the number of slaves, but whether or not this is a problem depends on your requirements. How fault tolerant does your system have to be?
By the way, since you mention that there are many slaves, I assume that you are using multicast or broadcast to send the request messages. Otherwise, depending on what "many" means to you, your setup could be error-prone with regard to administering where all the slaves reside.
A slave will send confirmation message if its own 'Promotion Priority'
is lower than the asking node, or if the asking node with higher
priority times out to issue rejection message for its own request.
Similar to the previous remark: a slave might draw the wrong conclusion if some slave has a problem responding for whatever reason. In fact, if one slave is down or has a network problem, all other slaves will draw the same (most likely erroneous) conclusion that the non-responsive slave is the master.
This algorithm should be faster because all the slaves will be
electing a master in parallel
The issues raised in this answer are almost inherent to doing the master selection in a distributed fashion though, and hard to resolve without introducing some kind of centralized decision maker. You gain some, you lose some...
Does any other algorithm for master promotion with priority exist?
Another approach would be to have all slaves in the system constantly maintain administration about who the current master is. This could be done (at the cost of some network bandwidth) by having every slave multicast/broadcast its priority periodically, via some sort of heartbeat message. As a result, every slave will be aware of every other slave, and at the moment that a master needs to be selected, every slave can do that instantly. Network issues or other "system health" problems will be detected because heartbeats are missed. This algorithm is flexible with regard to slaves joining and leaving the system. The higher the heartbeat frequency, the more responsive your system will be to topology changes. However, you might still run into issues of slaves drawing independent conclusions because of a disconnected network. If that is a problem, then you might not be able to solve this in a completely parallel fashion.
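A minimal, single-process sketch of that bookkeeping (the transport for the heartbeat messages is omitted, and the node names and timeout are illustrative, not from the answer above): every node records the last heartbeat it saw per peer and, when a master is needed, picks the highest-priority peer whose heartbeat is still fresh.

# Sketch only: heartbeat transport (multicast/broadcast) is omitted;
# node names and the timeout value are illustrative.
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds without a heartbeat before a node counts as gone

# node_id -> (priority, time the last heartbeat was received)
peers = {}

def on_heartbeat(node_id, priority):
    """Record a heartbeat message received from a peer."""
    peers[node_id] = (priority, time.monotonic())

def elect_master():
    """Pick the live node with the highest priority, or None if nobody is alive."""
    now = time.monotonic()
    alive = {n: p for n, (p, ts) in peers.items() if now - ts <= HEARTBEAT_TIMEOUT}
    if not alive:
        return None
    return max(alive, key=alive.get)

# Example: three peers heard from recently, one that has gone silent.
on_heartbeat("node-a", 50)
on_heartbeat("node-b", 40)
on_heartbeat("node-c", 30)
peers["node-d"] = (60, time.monotonic() - 10)  # stale heartbeat, treated as down

print(elect_master())  # -> "node-a"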

How to approach WCF services synchronization?

I have implemented a WCF service and now my client wants to have three copies of it, working independently on different machines. A master-slave approach. I need to find a solution that will enable this behavior:
The first service that is instantiated "asks" the other two whether they are alive. If not, it becomes the master and is the one that is active on the net. The other two, once instantiated, see that there is already a master alive, so they become slaves and start sleeping. There needs to be some mechanism to periodically check whether the master is dead, and if so, to choose the next copy that is alive to become the master (until it dies in turn).
This, I think, should be a kind of architectural pattern, so I would be more than glad to be given any advice.
thanks
I would suggest looking at the WCF peer channel (System.Net.PeerToPeer) to facilitate each node knowing about the other nodes. Here is a link that offers a decent introduction.
As for determining which node should be the master, the trick will be negotiating which node should be the master if two or more nodes come online at about the same time. Once the nodes become aware of each other, there needs to be some deterministic mechanism for establishing the master. For example, you could use the earliest creation time, the lowest value of the last octet of each node's IP address, or anything really. You'll just need to define some scheme that allows the nodes to negotiate this automatically.
Finally, as for checking if the master is still alive, I would suggest using the event-based mechanism described here. The master could send out periodic health-and-status events that the other nodes would register for. I'd put a try/catch/finally block at the code entry point so that if the master were to crash, it could publish one final MasterClosing event to let the slaves know it's going away. What this does not account for is a system crash, e.g., power failure, etc. To handle this, provide a timeout in the slaves so that when the timeout expires, the slaves can query the master to see if it's still there. If not, the slaves can negotiate between themselves using your deterministic algorithm about who should be the next master.