Should I run haproxy for db and redis sentinel on web nodes? - redis

I am setting up a cluster of servers using Vagrant and playing with Redis Sentinel and HAProxy for PostgreSQL connections (with pgpool). I was curious whether it makes sense to put HAProxy and Redis Sentinel on each of my web server nodes and have the applications connect directly to those local instances. The thought is that this distributes the connections to the DB and Redis and removes the single point of failure of one shared HAProxy that every node connects to before traffic is split across the DB nodes. I can also keep the database connection (via HAProxy) and Redis (via Sentinel) encapsulated on localhost. Does this make sense?

It only makes sense if you're trying to save on resources or costs.
Please note that Redis Sentinel needs a fixed, finite list of sentinel instances, which doesn't fit the scenario of placing one per machine, as your machine count will probably scale or change.
Otherwise, it always makes the most sense to put different infrastructure components (especially those with a clustering/HA nature, such as Redis) on different machines.
By mixing them all together, you usually end up with applications getting in each other's way and stealing CPU from each other once the load increases. You also risk designing your applications/scripts/flows to be location-aware (i.e. to assume external resources are always local), which is not good practice either.

Related

MaxScale cluster (master-master) setup

When deploying multiple MaxScale instances in a Master-Slave topology (failover from master to slave with Keepalived or similar) in front of a Galera Cluster in read-write-split mode, everything works fine. But what about a Master-Master-like topology in a round-robin fashion? Is this possible?
Ex.: Having one MaxScale at 10.0.0.1, a second at 10.0.0.2, and HAProxy in front of them with a roundrobin or leastconn distribution algorithm (or even without HAProxy or any load balancer, where applications just connect randomly to one or the other). Is this possible and well supported by MaxScale?
Usually you can connect to any number of MaxScale instances as long as certain features are enabled to guarantee that all MaxScale instances pick the same server where they send writes to.
Galera Clusters
If you are using a Galera cluster, this can be done in a safe and conflict-free manner by enabling the root_node_as_master parameter. It uses the Galera cluster itself to select which node it uses for writes. This allows your applications to connect to either of the MaxScale instances.
Even without this parameter it would not cause any problems with the database itself but due to the way Galera works, you increase the likelihood of running into a conflict when you COMMIT your transaction if you write to multiple nodes.
Asynchronous Replication Clusters
If you use MaxScale with a cluster that uses asynchronous replication, you still can do this as long as you configure your mariadbmon monitor to use cooperative_monitoring_locks. This causes the MaxScale instances to communicate via the database about which servers they see and which of them they chose for writes.
An added benefit of cooperative_monitoring_locks is that you can enable the automated cluster management parameters auto_failover and auto_rejoin without having to worry about two MaxScales attempting to change the replication configuration at the same time: cooperative_monitoring_locks makes sure only one MaxScale does it.

Redis Cluster or Replication without proxy

Is it possible to build a one master (port 6378) + two slaves (read-only, ports 6379 and 6380) "cluster" on one machine to increase performance (especially for reads) without using any proxy? Can the site or code connect to the master instance and read data from the read-only nodes? Or, if I use 3 instances of Redis, do I have to use a proxy anyway?
Edit: It seems the slave nodes don't have any data; they try to redirect to the master instance. That is not the correct way, am I right?
Definitely. You can code the paths in your app so writes and reads go to different servers. Depending on the programming language that you're using and the Redis client, this may be easier or harder to achieve.
Edit: that said, I'm unsure how you're running a cluster with a single master - the minimum should be 3.
You need to send a READONLY command after connecting to the slave before you can execute any read commands.
READONLY only affects the current socket session, which means you need to send it on every TCP connection.
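As a rough illustration of the per-connection nature of READONLY, the sketch below opens a raw TCP connection and sends the command before any reads. It assumes the replica is running in cluster mode (READONLY is a Redis Cluster command); `encode_command` and `open_readonly_connection` are hypothetical helper names, and a real application would normally let its Redis client library handle this.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def open_readonly_connection(host: str, port: int, timeout: float = 2.0) -> socket.socket:
    """Open a new TCP connection to a replica and send READONLY on it.

    READONLY applies to this socket only, so it has to be repeated
    for every new connection (for example, each one in a pool).
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    sock.sendall(encode_command("READONLY"))
    reply = sock.recv(64)
    if not reply.startswith(b"+OK"):
        sock.close()
        raise RuntimeError(f"READONLY was not accepted: {reply!r}")
    return sock
```

After `open_readonly_connection("127.0.0.1", 6379)` succeeds, read commands such as `encode_command("GET", "foo")` can be sent on the returned socket; a second socket to the same replica would need its own READONLY.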

Redis Cluster configuration for CacheManager.NET

I have a basic question about Redis connection parameters from the CacheManager.NET perspective. When we have a Redis cluster with a master and 2 slaves, plus a quorum of sentinel processes, should we provide the IP:PORT combinations pointing to the sentinel processes or to the actual Redis server processes?
As suggested in https://seanmcgary.com/posts/how-to-build-a-fault-tolerant-redis-cluster-with-sentinel, it is advisable to ask the sentinel process about the actual master before making the connection. And probably that goes in line with Jedis which provides JedisSentinelPool to do the initial lookup.
Essentially, what we want is load balancing on reads (via CacheManager.NET), with writes going to the current master node of the cluster.
CacheManager relies on StackExchange.Redis for the Redis implementation. Therefore, whatever this client library supports, CacheManager does, too.
Unfortunately, sentinel support is not implemented; there have been issues open on GitHub about that for years.
That being said, I did some testing with a Multi Master/Slave + Sentinel setup. Added all the non-sentinel nodes as endpoints to the Multiplexer configuration and it kinda works because the Redis Client knows how to handle multiple master/slave instances.
In the process of switching to another master, the client might throw exceptions that it cannot write to a readonly slave and such. CacheManager might retry those calls and after a short amount of time, when the leader election is done, the call should go through.
But this is not 100% stable and I would not put that in production, as "official" support is still missing...
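To make the retry behaviour during a master switch more concrete, here is a minimal sketch of the general idea in Python. It is not CacheManager's or StackExchange.Redis's actual implementation; `ReadOnlyReplicaError` and `retry_during_failover` are hypothetical names standing in for whatever error and retry hook the real client exposes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ReadOnlyReplicaError(Exception):
    """Hypothetical stand-in for a 'cannot write to a read-only replica' error."""


def retry_during_failover(op: Callable[[], T], attempts: int = 5, delay: float = 0.2) -> T:
    """Call op(), retrying while a failover is (presumably) still in progress."""
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except ReadOnlyReplicaError:
            if attempt == attempts:
                raise  # the new master never became writable in time
            time.sleep(delay * attempt)  # wait a little longer on each retry
    raise ValueError("attempts must be at least 1")
```

The caveat above still applies: retries only paper over the window in which the new master is being elected, and writes issued in that window can still fail once the attempts run out.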
As an alternative to running with sentinels, you could run Redis in Cluster mode, which should just work, or behind a proxy that deals with all that master/slave stuff.
Twemproxy is one alternative.
I still have to add support for Twemproxy to CacheManager, as many features are simply not available, like Lua scripting or get a list of servers or flush commands...
This will come in 1.0.2
Hope that helps.

Redis cluster via HAProxy

I have a Redis Cluster that clients connect to via HAProxy with a virtual IP. The Redis cluster has three nodes (each node shares its server with a running sentinel instance).
My question is: when a client gets a "MOVED" error/message from a cluster node upon sending a request, does it bypass HAProxy the second time it connects, since it was given an IP:port when the MOVED message was issued? If not, how does HAProxy know to send it to the correct node the second time?
I just need to understand how this works under the hood.
If you want to use HAProxy in front of Redis Cluster nodes, you will need to either:
1. Set up an HAProxy for each master/slave pair, and wire up something to update HAProxy when a failure happens, as well as probably intercept the topology related commands to insert the virtual IPs rather than the IPs the nodes themselves have and report via the topology commands/responses.
2. Customize HAProxy to teach it how to be the cluster-aware Redis client so the actual client doesn't know about cluster at all. This means teaching it the Redis protocol, storing the cluster's topology information, and selecting the node to query based on the key(s) being accessed by the consumer code.
With Redis Cluster the client must be able to access every node in the cluster. Of the two options above Option 2 is the "easier" one, but at this point I wouldn't recommend either.
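For a sense of what "selecting the node based on the key" involves, the sketch below computes the Redis Cluster hash slot for a key: CRC16 of the key (or of its hash tag, if present) modulo 16384. The function name `key_slot` is just for illustration; a cluster-aware client would additionally need the slot-to-node map, which this sketch leaves out.

```python
import binascii


def key_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot (0..16383) for a key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # A hash tag only counts if there is at least one byte between the braces.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant
    # that Redis Cluster uses for key hashing.
    return binascii.crc_hqx(key, 0) % 16384
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot and therefore on the same node.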
Conceivably you could use the VIP as a "first place to get the topology info" IP, but I suspect you'd develop serious issues, as that original IP would not be one of the ones properly reported as a node handling data. For that you could simply use round-robin DNS and avoid that problem, or pass the built-in "here is a list of cluster IPs (or names?)" setting in the initial connection configuration.
Your simplest, and least likely to be problematic, route is to go "full native" and simply give full and direct access to every node in the cluster to your clients and not use HAProxy at all.
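To illustrate why a MOVED reply bypasses the proxy, here is a minimal sketch of how a client might parse it. The error text carries the slot and the address of the owning node as the cluster itself knows it, so the client's next connection goes to that address, not to the VIP. `Moved` and `parse_moved` are hypothetical names, and the example address is made up.

```python
from typing import NamedTuple, Optional


class Moved(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error: str) -> Optional[Moved]:
    """Parse an error such as 'MOVED 3999 127.0.0.1:6381'; return None otherwise."""
    parts = error.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    # Split on the last colon so the host part is kept whole.
    host, _, port = parts[2].rpartition(":")
    return Moved(int(parts[1]), host, int(port))
```

A cluster-aware client would then connect to `(moved.host, moved.port)` and usually remember that the slot now lives there, which is why every node has to be reachable from the client.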

twemproxy (nutcracker) adding redis instance and keeping consistency

I set up twemproxy (nutcracker) with 2 redis servers as backends including slaves, sentinel and failover.
As soon as I add another redis server some of the keys are not able to be read, probably due to twemproxy redirecting to another redis.
How do I add another redis instance without breaking the consistency?
I want to use the setup as a consistent and very fast database.
Here are my settings:
redis_cluster:
  auto_eject_hosts: false
  distribution: ketama
  hash: fnv1a_32
  listen: 127.0.0.1:6379
  preconnect: true
  redis: true
  servers:
    - 127.0.0.1:7004:1 redis_1
    - 127.0.0.1:7005:1 redis_2
I want to keep sharding a job of the server and be able to add instances. Do I need to use another setup?
Twemproxy can't do that. You can use Redis Cluster, or, if you want to stay with Twemproxy, you have to use a technique called presharding: start right away with something like 32 or 64 instances, even if they all run on the same host at first, then move instances from one box to another as you scale out to multiple actual servers. The name to the right of each instance configured in Twemproxy (e.g. "redis_1") is what gets hashed, so you can change the IP address when you move an instance and the hashing will stay the same for that server.
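The effect of hashing on the server name can be sketched with a small consistent-hash ring. This is a simplified illustration, not Twemproxy's actual ketama implementation: `NamedRing`, the MD5-based point hash, and the 160 points per server are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Dict


class NamedRing:
    """Tiny consistent-hash ring that places servers by name, not by address."""

    def __init__(self, servers: Dict[str, str], points_per_server: int = 160):
        self.servers = dict(servers)  # name -> "host:port"
        # Each server contributes several points, derived from its name only.
        self._ring = sorted(
            (self._hash(f"{name}-{i}"), name)
            for name in self.servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")

    def server_for(self, key: str) -> str:
        """Return the name of the first server point clockwise from the key."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

    def address_for(self, key: str) -> str:
        return self.servers[self.server_for(key)]
```

Moving `redis_1` to a new address changes what `address_for` returns but not which name a key maps to, so no keys go missing. Adding a `redis_3` entry, by contrast, reassigns part of the key space to the new name, which matches the unreadable keys described in the question.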
Redis Cluster is release candidate 2 at this point. While it needs more testing and deployments to be battle tested as Redis is, it is already a viable product, so you may want to test it as well.