How does ServiceStack PooledRedisClientManager failover work?

According to the git commit messages, ServiceStack has recently added failover support. I initially assumed this meant that I could pull one of my Redis instances down, and my pooled client manager would handle the failover elegantly and try to connect with one of my alternate Redis instances. Unfortunately, my code just bugs out and says that it can't connect with the initial Redis instance.
I am currently running instances of Redis 2.6.12 on Windows, with the master at port 6379 and a slave at 6380, and with sentinels set up to automatically promote the slave to master if the master goes down. I am currently instantiating my client manager like this:
PooledRedisClientManager pooledClientManager =
    new PooledRedisClientManager(new string[] { "localhost:6379" },
                                 new string[] { "localhost:6380" });
where the first array is read-write hosts (for the master), and the second array is read-only hosts (for the slave).
When I terminate the master at port 6379, the sentinels promote the slave to a master. Now, when I try to run my C# code, instead of failing over to port 6380, it simply breaks and returns the error "could not connect to redis Instance at localhost:6379".
Is there a way around this, or will failover simply not work the way I want it to?

PooledRedisClientManager.FailoverTo allows you to reset which hosts are read/write vs. read-only and restarts the factory. This allows for a quick transition without needing to recreate clients.
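For example, once your sentinels (or your own monitoring) report that the slave on 6380 has been promoted, you can repoint the existing pool. A minimal sketch, assuming the (readWriteHosts, readOnlyHosts) overload and the host names from the question:

// Repoint the pool at the newly promoted master; 6380 now serves both
// read/write and read-only traffic until the old master rejoins as a slave.
pooledClientManager.FailoverTo(
    new[] { "localhost:6380" },   // read/write hosts
    new[] { "localhost:6380" });  // read-only hosts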

Related

Handle io.lettuce.core.RedisReadOnlyException when network is partitioned

I have a setup where I query sentinel for the current redis master. The setup is one redis master, three slaves, and three sentinel nodes. This works fine in most situations, but I have found that if a network split isolates the current master and the sentinel node configured first in the list of sentinel nodes from the other nodes, the other two sentinel nodes re-elect a new master, as intended.
My problem is that when the isolated previous master rejoins the common network and is reconfigured to a slave, my application is never notified that a new master was elected and keeps writing to what it still thinks is the master, ending up with "Error in execution; nested exception is io.lettuce.core.RedisReadOnlyException: READONLY You can't write against a read only slave."
I do not know if this is a redis problem or a framework problem. Should redis, when it is reconfigured from master to slave, terminate the connection as it does in normal circumstances when a new master is elected, or should the framework handle the exception and query for the current master (see the sentinel query sketch below)?
One more interesting aspect of this is if the sentinel node configured first in the sentinel node list continues to be isolated, the behavior continues even if the application accessing redis is restarted.
Is there any mechanism to handle this situation or is this a bug or enhancement to the framework?
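For reference, the "query for the current master" option mentioned above boils down to asking any reachable sentinel directly. A minimal sketch with redis-cli; the sentinel host placeholder, the default sentinel port, and the master name "mymaster" are assumptions based on a typical sentinel setup:

# Ask a sentinel which node it currently considers the master;
# the reply is the master's IP and port.
redis-cli -h <sentinel-host> -p 26379 SENTINEL get-master-addr-by-name mymaster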

Redis Cluster or Replication without proxy

Is it possible to build a one-master (port 6378) + two-slave (read-only, ports 6379 and 6380) "cluster" on one machine to improve performance (especially reads) without using any proxy? Can the site or code connect to the master instance and read data from the read-only nodes? Or if I use 3 instances of Redis, do I have to use a proxy anyway?
Edit: It seems like the slave nodes don't hold any data and just try to redirect to the master instance, but that is not the correct way, am I right?
Definitely. You can code the paths in your app so writes and reads go to different servers. Depending on the programming language that you're using and the Redis client, this may be easier or harder to achieve.
Edit: that said, I'm unsure how you're running a cluster with a single master - the minimum should be 3.
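As an illustration of splitting reads and writes in application code, here is a minimal sketch with ServiceStack.Redis (the client used elsewhere on this page); the ports follow the question and are otherwise placeholders:

using System;
using ServiceStack.Redis;

// Writes are sent to the master, reads to the read-only slaves.
var manager = new PooledRedisClientManager(
    new[] { "localhost:6378" },                     // read/write host (master)
    new[] { "localhost:6379", "localhost:6380" });  // read-only hosts (slaves)

using (var write = manager.GetClient())             // read/write connection
    write.Set("greeting", "hello");

using (var read = manager.GetReadOnlyClient())      // read-only connection
    Console.WriteLine(read.Get<string>("greeting"));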
You need to send a READONLY command after connecting to the slave before you can execute any read commands.
The READONLY command only affects the current socket session, which means you need to send it on every TCP connection.
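For illustration, the per-connection handshake against a replica looks roughly like this (ports from the question, key name is a placeholder):

redis-cli -p 6379      # connect to a read-only slave
READONLY               # enable reads on this connection
GET somekey            # now answered locally instead of redirected to the master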

Redis Cluster configuration for CacheManager.NET

I have a basic question about Redis connection parameters from the CacheManager.NET perspective. In the case where we have a Redis cluster with a master and 2 slaves, plus a quorum of sentinel processes, should we provide the IP:PORT combinations pointing to the sentinel processes OR to the actual Redis server processes?
As suggested in https://seanmcgary.com/posts/how-to-build-a-fault-tolerant-redis-cluster-with-sentinel, it is advisable to ask the sentinel process for the actual master before making the connection. That is probably in line with Jedis, which provides JedisSentinelPool to do the initial lookup.
Essentially what we want is load balancing on reads (via CacheManager.NET), with writes going to the current master node of the cluster.
CacheManager relies on StackExchange.Redis for the Redis implementation. Therefore, whatever this client library supports, CacheManager does, too.
Unfortunately, sentinel support is not implemented; there have been issues on GitHub about that for years.
That being said, I did some testing with a multi master/slave + sentinel setup. I added all the non-sentinel nodes as endpoints to the Multiplexer configuration and it kind of works, because the Redis client knows how to handle multiple master/slave instances.
In the process of switching to another master, the client might throw exceptions saying that it cannot write to a read-only slave and the like. CacheManager might retry those calls and after a short amount of time, when the leader election is done, the calls should go through.
But this is not 100% stable and I would not put that in production, as "official" support is still missing...
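For reference, "adding all the non-sentinel nodes as endpoints" amounts to something like the following StackExchange.Redis configuration, which CacheManager can then be pointed at; the host names are placeholders:

using StackExchange.Redis;

// List every master and slave (but not the sentinels); the client will
// discover on its own which endpoint is currently writable.
var options = new ConfigurationOptions
{
    AllowAdmin = true,
    AbortOnConnectFail = false
};
options.EndPoints.Add("redis-node1", 6379);
options.EndPoints.Add("redis-node2", 6379);
options.EndPoints.Add("redis-node3", 6379);

var multiplexer = ConnectionMultiplexer.Connect(options);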
As an alternative to running with sentinels, you could run Redis in Cluster mode, which should just work, or put the instances behind a proxy which deals with all that master/slave handling.
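A minimal per-node configuration for cluster mode looks roughly like this; the port and file name are placeholders and each node gets its own:

# redis.conf (one instance per node)
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 5000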
For the proxy route, Twemproxy is one alternative (a minimal configuration sketch follows below).
I still have to add support for Twemproxy to CacheManager, as many features are simply not available, like Lua scripting, getting a list of servers, or flush commands...
This will come in 1.0.2
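For context, a Twemproxy (nutcracker) pool for such a setup could be declared roughly like this; the pool name, ports, and weights are placeholders:

redis_pool:
  listen: 127.0.0.1:22121
  redis: true
  auto_eject_hosts: false
  servers:
    - 127.0.0.1:6379:1
    - 127.0.0.1:6380:1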
Hope that helps.

Failing over with single Replication Group on ElastiCache Redis

I'm testing out ElastiCache backed by Redis with the following specs:
Using Redis 2.8, with Multi-AZ
Single replication group
1 master node in us-east-1b, 1 slave node in us-east-1c, 1 slave node in us-east-1d
The part of the application writing is directly using the endpoint for the master node (primary-node.use1.cache.amazonaws.com)
The part of the application doing only reads is pointing to a custom endpoint (readonly.redis.mydomain.com) configured in HAProxy, which then points to the two other read slave end points. (readslave1.use1.cache.amazonaws.com and readslave2.use1.cache.amazonaws.com)
Now let's say the primary node (master) fails in us-east-1b.
From what I understand, if the master instance fails, I won't have to change the url for the end point for writing to Redis (primary-node.use1.cache.amazonaws.com), although from there, I still have the following questions:
Do I have to change the endpoint names for the read only slaves?
How long until the missing slave is added into the pool?
If there's anything else I'm missing, I'd appreciate the advice/information.
Thanks!
If you are using ElastiCache, you should make use of the "Primary Endpoint" provided by AWS.
That endpoint is actually backed by Route53; if the primary (master) redis goes down, since you have Multi-AZ enabled, it will automatically fail over to one of the read replicas (slaves).
In that case, you don't need to modify the endpoint of your redis.
I don't know why you have such a design; it seems you only want to write to the master but always read from the slaves.
For the HAProxy part, you should include a TCP check for ALL 3 redis nodes, using their "Read Endpoint".
In HAProxy, you can check whether an endpoint is currently a SLAVE; if it is, HAProxy should route the read traffic to it.
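A hedged sketch of what that check could look like in haproxy.cfg; the read-endpoint names are taken from the question, everything else is a placeholder. The check sends INFO REPLICATION and only keeps servers that report the slave role:

backend redis_readonly
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:slave
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server readslave1 readslave1.use1.cache.amazonaws.com:6379 check inter 1s
    server readslave2 readslave2.use1.cache.amazonaws.com:6379 check inter 1s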
Notice that in the application layer, if your redis driver doesn't support auto reconnect, your application will fail to connect to the new master node.
In addition to "auto reconnect", since AWS uses Route53 DNS to do the failover, some libraries will NOT do a DNS lookup again, which means they keep resolving to the OLD IP, i.e. the old master.
Using HAProxy can solve this problem.

Redis failover scenario

Currently I have a redis instance; now I would like to make it more failure proof.
Is it possible to achieve the following?
I connect to redis with the ServiceStack library, and I want the client to switch to the failover server automatically when the primary server is not available.
You should configure a Redis instance as a slave of your master instance, either using the slaveof command or, more likely, by adding a slaveof directive in the configuration file (something like 'slaveof 127.0.0.1 6380'; look at the documentation for more info); then use Redis Sentinel to monitor the instances and promote the slave to master when the master fails.
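Along those lines, the relevant configuration would be roughly the following; the master address mirrors the example above, while the master name, quorum, and timeouts are placeholders:

# redis.conf on the slave: replicate from the master
slaveof 127.0.0.1 6380

# sentinel.conf on each sentinel: watch the master and fail over automatically
sentinel monitor mymaster 127.0.0.1 6380 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000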
Moreover, you either have to use a Redis client that supports sentinel and handles the redirection when the slave is promoted to master, or use a network configuration (like a virtual IP) to make the redirection transparent for your application.