Handle io.lettuce.core.RedisReadOnlyException when network is partitioned - redis

I use Sentinel to discover the current Redis master. My setup is one Redis master, three slaves, and three Sentinel nodes. This works fine in most situations, but I have found that if a network split isolates the current master and the Sentinel node configured first in the list of Sentinel nodes from the other nodes, the remaining two Sentinel nodes elect a new master, as intended.
My problem is that when the previously isolated master regains access to the common network and is reconfigured as a slave, my application is never notified that a new master was elected. It keeps writing to the demoted node, still believing it is the master, and ends up with "Error in execution; nested exception is io.lettuce.core.RedisReadOnlyException: READONLY You can't write against a read only slave."
I do not know whether this is a Redis problem or a framework problem. Should Redis, when it is reconfigured from master to slave, terminate the connection as it does under normal circumstances when a new master is elected, or should the framework handle the exception and query Sentinel for the current master?
One more interesting aspect: if the Sentinel node configured first in the Sentinel list remains isolated, the behavior persists even if the application accessing Redis is restarted.
Is there any mechanism to handle this situation, or is this a bug or a missing feature in the framework?
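One way to let the client track master changes itself is to connect through Lettuce's Sentinel-aware master/slave API rather than a plain connection; it listens for Sentinel's failover announcements and switches to the newly promoted master. Below is a minimal sketch, assuming Lettuce 5.x, a Sentinel master id of "mymaster", and placeholder host names; in Lettuce 6 the same API is named MasterReplica/StatefulRedisMasterReplicaConnection.

import io.lettuce.core.ReadFrom;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.codec.StringCodec;
import io.lettuce.core.masterslave.MasterSlave;
import io.lettuce.core.masterslave.StatefulRedisMasterSlaveConnection;

public class SentinelAwareWriter {
    public static void main(String[] args) {
        // Ask Sentinel for the current master instead of hard-coding a Redis host.
        RedisURI sentinelUri = RedisURI.Builder
                .sentinel("sentinel-1", 26379, "mymaster")
                .withSentinel("sentinel-2", 26379)
                .withSentinel("sentinel-3", 26379)
                .build();

        RedisClient client = RedisClient.create();

        // The master/slave connection watches Sentinel topology changes
        // (e.g. +switch-master) and reconnects to the newly elected master.
        StatefulRedisMasterSlaveConnection<String, String> connection =
                MasterSlave.connect(client, StringCodec.UTF8, sentinelUri);

        // Route commands to the current master so writes never hit a demoted node.
        connection.setReadFrom(ReadFrom.MASTER);

        connection.sync().set("key", "value");

        connection.close();
        client.shutdown();
    }
}

Whether this fully covers the partition scenario above depends on the rejoining node's Sentinel publishing the topology change, so treat it as a starting point rather than a guaranteed fix.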

Related

Redis sentinel implementation over the internet

I'm trying to implement Redis Sentinel with two separate environments in which the master and replica Redis instances run. The two environments, i.e. Primary and Backup, communicate over the internet. Each environment has two nodes, and each node runs one pod containing the redis + sentinel processes. The following architecture represents this setup.
Let's consider a scenario: if the master Redis (Node 1) goes down, Sentinel will invoke the failover process and make one of the replicas the master. Suppose the Node 3 replica becomes the master; so far everything works as expected. Now when Node 1 becomes available again, its Redis will start as master and, after the Sentinels communicate, will act as a replica. Ideally, Redis should bind on 1.2.3.4:30001, but it is binding on the private IP of the node, i.e. 192.168.x.x.
My question is why this is happening. As per my understanding, Sentinel is responsible for the config rewrites and for asking the Node 1 Redis to become a replica, so why is Sentinel using the private IP rather than the public IP?
Hopefully I have properly conveyed my problem to you. If you need any further information, feel free to comment.
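For reference, Redis and Sentinel both have announce settings that control which address and port a node advertises when it sits behind NAT or port forwarding, which is the kind of situation described above. A hedged sketch of the relevant directives (the IP and ports are placeholders; Redis 5+ uses the replica-announce-* names, older versions use slave-announce-*):

# redis.conf on Node 1: advertise the public address instead of the pod's private IP
replica-announce-ip 1.2.3.4
replica-announce-port 30001

# sentinel.conf on Node 1: advertise the public address to the other Sentinels
sentinel announce-ip 1.2.3.4
sentinel announce-port 30002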

Redis connect single instance slave (slave of) to cluster or sentinel

When running a single Redis instance, I can use "slave of" to create one (or as many as I like) read-only replicas of this one Redis node.
When using Redis Cluster, I split my data into partitions (masters) and can create a slave for each partition.
Is it possible to treat this cluster as a single instance and connect a "slave of" slave to it that holds a replica of all data in the cluster, not just the partition of the connected node?
If this is not possible with Redis Cluster, might it be a working solution when using Sentinel?
Our current problem:
We are using the "slave of" feature together with keepalived to fail over our Redis instance when the master has an outage.
But we have lots of "slave of" slaves connected to the virtual IP of the failover setup to deliver cached data.
Now every time the system fails over (e.g. for maintenance), all connected slaves stall for up to 30 seconds while they resync their data with the new master.
We have already played with all the relevant Redis config parameters but can't get this syncing time any shorter (e.g. by relying on the replication backlog, which isn't available on the new master after the failover).
Does anyone have any ideas?
There is a very good doc here: http://redis.io/presentation/Redis_Cluster.pdf, here: http://fr.slideshare.net/NoSQLmatters/no-sql-matters-bcn-2014 (slide #9), and, even better, here: https://www.javacodegeeks.com/2015/09/redis-clustering.html
If you want a "slave" in Redis Cluster mode, you need to use replication on all nodes, i.e. give every master its own replica, as sketched below.
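A hedged sketch of attaching one replica per master with redis-cli (Redis 5+ syntax; the hosts and the master node id are placeholders):

# Repeat for each master in the cluster: add one new node as its replica.
redis-cli --cluster add-node 10.0.0.11:6379 10.0.0.1:6379 \
    --cluster-slave --cluster-master-id <master-node-id>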
Well, I just read this article:
https://seanmcgary.com/posts/how-to-build-a-fault-tolerant-redis-cluster-with-sentinel
The author used a single master with Redis Cluster, with two slaves per master instead of one, and let Redis Sentinel take care of electing a slave to master when the master is down.
You could play with this setup to see whether the election of a new master happens quickly. While it is happening, clients would be served by a slave and should experience no downtime.
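For reference, a minimal Sentinel configuration for that kind of one-master, two-slave setup might look like the sketch below (the master name, IP, and timings are placeholders; the quorum of 2 assumes three Sentinel processes):

# sentinel.conf (placeholders; one copy per Sentinel process)
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1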

Unslave a redis slave

I have a setup of 3 instances in a failover cluster, one master and two slaves, all monitored by Sentinels. At one point I decide I don't need one of the slaves and want to reuse that Redis instance for something else; what commands do I issue?
I tried running slaveof no one on that slave, but it's enslaved again in a few seconds.
Sentinels remember forever the slaves they have seen, in order to reconnect them when they return after a crash or a network partition.
For the Sentinels to forget the slave you want to remove, the Redis docs say "you need to send a SENTINEL RESET mastername command to all the Sentinels: they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output."
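Put together, a hedged redis-cli sequence for this (the master name, hosts, and ports are placeholders; older Redis versions use SLAVEOF instead of REPLICAOF):

# On the slave you want to detach:
redis-cli -h slave-host -p 6379 REPLICAOF NO ONE

# On every Sentinel, so they drop the stale slave list and rediscover replicas:
redis-cli -h sentinel-host -p 26379 SENTINEL RESET mymaster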

Can we mark a slave as unpromotable by redis-sentinel?

We have a Redis cluster with a master and a slave managed by three Sentinel processes, plus an additional remote slave hosted in a different datacenter for transparent failover and data preservation in case something bad happens to the master and slave machines.
It may happen that a transient error takes down only the master Redis process, and in this situation we would like to see the slave promoted to master and the remote slave re-slaved to it. However, it seems that Sentinel could just as easily promote the remote slave to master, and we have not found any way to prevent this.
Is there any way to mark a particular slave machine as unpromotable, so that sentinel will not try to make it the master in the event of a failover?
Yes. In the slave's config file, set the slave-priority setting to zero (the number, not the word).
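A sketch of what that looks like in the remote slave's config file (slave-priority as quoted in the answer; Redis 5+ also accepts the replica-priority alias):

# redis.conf on the remote, never-to-be-promoted slave
# A priority of 0 marks this replica as not eligible for promotion by Sentinel.
slave-priority 0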

How does ServiceStack PooledRedisClientManager failover work?

According to the git commit messages, ServiceStack has recently added failover support. I initially assumed this meant that I could pull one of my Redis instances down, and my pooled client manager would handle the failover elegantly and try to connect with one of my alternate Redis instances. Unfortunately, my code just bugs out and says that it can't connect with the initial Redis instance.
I am currently running instances of Redis 2.6.12 on Windows, with the master at port 6379 and a slave at 6380, and sentinels set up to automatically promote the slave to a master if the master goes down. I am currently instantiating my client manager like this:
PooledRedisClientManager pooledClientManager =
    new PooledRedisClientManager(new string[1] { "localhost:6379" },
                                 new string[1] { "localhost:6380" });
where the first array is read-write hosts (for the master), and the second array is read-only hosts (for the slave).
When I terminate the master at port 6379, the sentinels promote the slave to a master. Now, when I try to run my C# code, instead of failing over to port 6380, it simply breaks and returns the error "could not connect to redis Instance at localhost:6379".
Is there a way around this, or will failover simply not work the way I want it to?
PooledRedisClientManager.FailoverTo lets you reset which hosts are the read/write hosts vs. the read-only hosts and restarts the factory. This allows a quick transition without needing to recreate clients.
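A minimal usage sketch, assuming the overload that takes separate read/write and read-only host lists as described above, and assuming the promoted master is now at localhost:6380 (hosts are placeholders):

// After Sentinel promotes the slave, point the pooled manager at the new topology.
pooledClientManager.FailoverTo(
    new[] { "localhost:6380" },   // read/write hosts (new master)
    new[] { "localhost:6379" });  // read-only hosts (old master, once it rejoins as a slave)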