Redis Cluster fail-over issue. Slave to master promotion takes more than 2 seconds. How to solve it? - redis

I have a Redis cluster with multiple masters and 1 slave per master.
I am using Jedis. Redis version is 3.0.7.
Now when one of the masters goes down, it takes more than 2 seconds before its slave is promoted to master, which is far too long for my scenario.
Following are the two different exceptions and their stack traces:
JedisClusterException & JedisClusterMaxRedirectionsException.
JedisConnectionException!! - exc: redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
at redis.clients.jedis.Protocol.processError(Protocol.java:115)
at redis.clients.jedis.Protocol.process(Protocol.java:151)
at redis.clients.jedis.Protocol.read(Protocol.java:205)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297)
at redis.clients.jedis.Connection.getIntegerReply(Connection.java:222)
at redis.clients.jedis.Jedis.incr(Jedis.java:548)
at redis.clients.jedis.JedisCluster$27.execute(JedisCluster.java:319)
at redis.clients.jedis.JedisCluster$27.execute(JedisCluster.java:316)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:119)
at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:30)
at redis.clients.jedis.JedisCluster.incr(JedisCluster.java:321)
at com.demo.redisCluster.App.main(App.java:37)
.................................................
JedisClusterMaxRedirectionsException!! - exc: redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:97)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:30)
at redis.clients.jedis.JedisCluster.incr(JedisCluster.java:321)
at com.demo.redisCluster.App.main(App.java:37)

Related

Is it possible for only the Redis master instances to handle all reads and writes, and for the slaves to be used only for failover?

My work system consists of Spring web applications that use Redis as a transaction counter and conditionally block transaction requests.
The transaction is as follows:
1. Check whether the data exists. (HGET)
2. If it doesn't, save a new record with a count of 0 and set an expiration time. (HSET, EXPIRE)
3. Increase the count value. (INCRBY)
4. If the increased count reaches a specific configured limit, mark the transaction as 'blocked'. (HSET)
The limit value is set by my company's business policy.
These read and write operations are requested one after another, immediately.
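To make the flow concrete, here is a rough sketch of what one such check-and-count pass might look like with Jedis. The key and field names, the limit, and the TTL are made up, and the increment is shown as HINCRBY on the same hash (the steps above list INCRBY):

import redis.clients.jedis.Jedis;

public class TxCounterSketch {
    private static final int LIMIT = 100;        // business-policy limit (placeholder value)
    private static final int TTL_SECONDS = 60;   // expiration window (placeholder value)

    // One transaction check; returns false once the transaction is blocked.
    static boolean registerAndCheck(Jedis jedis, String txId) {
        String key = "tx:" + txId;
        if (jedis.hget(key, "status") == null) {      // HGET: does the record exist?
            jedis.hset(key, "status", "active");      // HSET: create it with count 0
            jedis.hset(key, "count", "0");
            jedis.expire(key, TTL_SECONDS);           // EXPIRE: set the expiration time
        }
        long count = jedis.hincrBy(key, "count", 1);  // increase the count
        if (count >= LIMIT) {
            jedis.hset(key, "status", "blocked");     // HSET: block further transactions
            return false;
        }
        return true;
    }
}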
Currently, I use one Redis instance on one machine (only a master, no replicas).
I want Redis HA, so I need slave instances, but at the same time I want all reads and writes to go only to the master instances because of slave replication latency.
After some research, I found that a proxy server is a common way to get Redis HA. However, with a proxy it seems impossible to have only the master instances receive requests and keep the slaves purely for failover.
Is it possible?
Thanks in advance.
What you need is Redis Sentinel.
With Redis Sentinel, you can get the master address from the sentinels and read/write against the master. If the master goes down, Sentinel performs the failover and elects a new master; you then get the new master's address from the sentinels again.
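With Jedis, the sentinel lookup is handled by JedisSentinelPool; a minimal sketch (the sentinel addresses and the master name "mymaster" are placeholders for your own deployment) might look like this:

import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelExample {
    public static void main(String[] args) {
        Set<String> sentinels = new HashSet<>();
        sentinels.add("127.0.0.1:26379");
        sentinels.add("127.0.0.1:26380");
        sentinels.add("127.0.0.1:26381");

        // The pool asks the sentinels for the current master and reconnects
        // to the newly promoted master after a failover.
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels);
             Jedis jedis = pool.getResource()) {
            jedis.set("foo", "bar");                // all reads and writes go to the current master
            System.out.println(jedis.get("foo"));
        }
    }
}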
If you're going to use Lettuce as the Redis Cluster driver, set the read preference to master and things should work fine. A sample configuration might look like this:
import java.util.ArrayList;
import java.util.List;
import io.lettuce.core.ReadFrom;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisNode;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

// Route reads to the masters only (slaves are kept purely for failover).
LettuceClientConfiguration lettuceClientConfiguration =
        LettuceClientConfiguration.builder().readFrom(ReadFrom.MASTER).build();

RedisClusterConfiguration redisClusterConfiguration = new RedisClusterConfiguration();
List<RedisNode> redisNodes = new ArrayList<>();
redisNodes.add(new RedisNode("127.0.0.1", 9000));
redisNodes.add(new RedisNode("127.0.0.1", 9001));
redisNodes.add(new RedisNode("127.0.0.1", 9002));
redisNodes.add(new RedisNode("127.0.0.1", 9003));
redisNodes.add(new RedisNode("127.0.0.1", 9004));
redisNodes.add(new RedisNode("127.0.0.1", 9005));
redisClusterConfiguration.setClusterNodes(redisNodes);

LettuceConnectionFactory lettuceConnectionFactory =
        new LettuceConnectionFactory(redisClusterConfiguration, lettuceClientConfiguration);
lettuceConnectionFactory.afterPropertiesSet();
See it in action at Redis Cluster Configuration.
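Once the factory is built, it can be plugged into a template as usual; a minimal usage sketch (the key name is illustrative) might be:

StringRedisTemplate redisTemplate = new StringRedisTemplate(lettuceConnectionFactory);
redisTemplate.afterPropertiesSet();
// With ReadFrom.MASTER, both calls go to the master that owns the slot for "counter".
redisTemplate.opsForValue().increment("counter", 1);
String value = redisTemplate.opsForValue().get("counter");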

Redis cluster cannot add nodes

There are two Redis servers, and I have run three Redis instances on each server.
When I executed cluster meet [ip] [port] to add the cluster nodes, I found I could only add the nodes running on the same server. Every time I run this command, it always echoes "OK", but when I use cluster nodes to check the node list, it always shows this:
172.18.0.155:7010> cluster meet 172.18.0.156 7020
OK
172.18.0.155:7010> cluster nodes
ad829d8b297c79f644f48609f17985c5586b4941 127.0.0.1:7010#17010 myself,master - 0 1540538312000 1 connected
87a8017cfb498e47b6b48f0ad69fc066c466a9c2 172.18.0.156:7020#17020 handshake - 1540538308677 0 0 disconnected
fdf5879554741759aab14eba701dc185b605ac16 127.0.0.1:7012#17012 master - 0 1540538313000 0 connected
ec7b3ecba7a175ddb81f254821243dd469a7f961 127.0.0.1:7011#17011 master - 0 1540538314288 2 connected
You can see the node's status is disconnected, and it disappears from the list if you check again about 5 seconds later.
Has anybody met this problem before? I have no idea how to solve it. Please help me. Thanks a lot.
I have solved the problem. I had made a mistake in the bind configuration: once I set bind to just the one IP that the other nodes can reach, the cluster nodes could be added normally.
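For reference, the relevant part of redis.conf on each node would look roughly like this (the address is just an example). Binding only to 127.0.0.1 makes the node advertise the loopback address, which the other server cannot reach, and that seems to be why the handshake nodes kept dropping off the list:

# redis.conf - bind the interface that other cluster nodes can reach,
# not only the loopback address
bind 172.18.0.155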

Apache Ignite - Distributed Queue and Executors

I am planning to use Apache Ignite Distributed Queue.
I am using Ignite with a Spring Boot application. On bootup, I add 20 names to a queue, but since there are 3 servers in the cluster, the same 20 names get added 3 times. I want to add them to the queue only once.
Ignite ignite = Ignition.ignite();
IgniteQueue<String> queue = ignite.queue(
"queueName", // Queue name.
0, // Queue capacity. 0 for unbounded queue.
null // Collection configuration.
);
Distributed executors will poll from the queue and run the task. Here, the executor is expected to poll, run the task, and then add the same name back to the queue; I'm trying to achieve round robin here.
Only one executor should be running a given task at any point in time, even though there are multiple servers in the cluster.
Any suggestions for this?
You can launch an Ignite cluster singleton service (https://apacheignite.readme.io/docs/cluster-singletons) which fills the queue with data. Alternatively, you can add the data only from the coordinator node (the oldest node in the cluster): ignite.cluster().forOldest().node().isLocal().
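A rough sketch of such a cluster singleton service (the class name, queue name, and "name-N" values are illustrative) could look like this; because the service is deployed as a cluster singleton, the seeding code runs on exactly one node:

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteQueue;
import org.apache.ignite.configuration.CollectionConfiguration;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

public class QueueLoaderService implements Service {
    @IgniteInstanceResource
    private transient Ignite ignite;

    @Override public void init(ServiceContext ctx) {
        // Create (or get) the queue and seed it once; if the singleton ever migrates
        // after a node failure, init() runs again, so a guard like the IgniteAtomicLong
        // shown below may still be useful.
        IgniteQueue<String> queue = ignite.queue("queueName", 0, new CollectionConfiguration());
        for (int i = 1; i <= 20; i++)
            queue.add("name-" + i);
    }

    @Override public void execute(ServiceContext ctx) { /* nothing to do; the work happens in init() */ }

    @Override public void cancel(ServiceContext ctx) { /* no cleanup needed */ }
}

// Deployed once for the whole cluster, e.g. on application startup:
// ignite.services().deployClusterSingleton("queueLoader", new QueueLoaderService());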
I fixed the boot-time duplicate cache loading issue this way:
// A shared atomic counter guards the load so that only the first node to boot performs it.
final IgniteAtomicLong cacheLoadCnt = ignite.atomicLong(cacheName + "Cnt", 0, true);
if (cacheLoadCnt.get() == 0) {
    loadCache();
    cacheLoadCnt.addAndGet(1);
}

How to scale down a CrateDB cluster?

For testing, I wanted to shrink my 3 node cluster to 2 nodes, to later go and do the same thing for my 5 node cluster.
However, after following the best practice of shrinking a cluster:
1. Back up all tables.
2. For all tables: alter table xyz set (number_of_replicas=2) if it was less than 2 before.
3. SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = <half of the cluster + 1>;
   3 a. If the data check should always be green, set the min_availability to 'full':
   https://crate.io/docs/reference/configuration.html#graceful-stop
4. Initiate graceful stop on one node.
5. Wait for the data check to turn green.
6. Repeat from 3.
7. When done, persist the node configurations in crate.yml:
   gateway.recover_after_nodes: n
   discovery.zen.minimum_master_nodes: (n/2) + 1
   gateway.expected_nodes: n
My cluster never went back to "green" again, and I also have a critical node check failing.
What went wrong here?
crate.yml:
...
################################## Discovery ##################################
# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.
# Set to ensure a node sees M other master eligible nodes to be considered
# operational within the cluster. Its recommended to set it to a higher value
# than 1 when running more than 2 nodes in the cluster.
#
# We highly recommend to set the minimum master nodes as follows:
# minimum_master_nodes: (N / 2) + 1 where N is the cluster size
# That will ensure a full recovery of the cluster state.
#
discovery.zen.minimum_master_nodes: 2
# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
# discovery.zen.ping.timeout: 3s
#
# Time a node is waiting for responses from other nodes to a published
# cluster state.
#
# discovery.zen.publish_timeout: 30s
# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
# For example, Amazon Web Services doesn't support multicast discovery.
# Therefore, you need to specify the instances you want to connect to a
# cluster as described in the following steps:
#
# 1. Disable multicast discovery (enabled by default):
#
discovery.zen.ping.multicast.enabled: false
#
# 2. Configure an initial list of master nodes in the cluster
# to perform discovery when new nodes (master or data) are started:
#
# If you want to debug the discovery process, you can set a logger in
# 'config/logging.yml' to help you doing so.
#
################################### Gateway ###################################
# The gateway persists cluster meta data on disk every time the meta data
# changes. This data is stored persistently across full cluster restarts
# and recovered after nodes are started again.
# Defines the number of nodes that need to be started before any cluster
# state recovery will start.
#
gateway.recover_after_nodes: 3
# Defines the time to wait before starting the recovery once the number
# of nodes defined in gateway.recover_after_nodes are started.
#
#gateway.recover_after_time: 5m
# Defines how many nodes should be waited for until the cluster state is
# recovered immediately. The value should be equal to the number of nodes
# in the cluster.
#
gateway.expected_nodes: 3
So there are two things that are important:
The number of replicas is essentially the number of nodes you can lose in a typical setup (2 is recommended so that you can scale down AND lose a node in the process and still be OK).
The procedure is recommended for clusters > 2 nodes ;)
CrateDB will automatically distribute the shards across the cluster in a way that no replica and primary share a node. If that is not possible (which is the case if you have 2 nodes and 1 primary with 2 replicas), the data check will never return to 'green'. So in your case, set the number of replicas to 1 in order to get the cluster back to green (alter table mytable set (number_of_replicas = 1)).
The critical node check is due to the cluster not having received an updated crate.yml yet: your file still has the configuration of a 3-node cluster in it, hence the message. Since CrateDB only loads the expected_nodes at startup (it's not a runtime setting), a restart of the whole cluster is required to conclude scaling down. It can be done with a rolling restart, but be sure to set SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = <half of the cluster + 1>; properly, otherwise the consensus will not work.
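For a cluster scaled down from 3 to 2 nodes, the settings to persist in crate.yml would end up roughly like this (n = 2, so (n/2) + 1 = 2):

gateway.recover_after_nodes: 2
gateway.expected_nodes: 2
discovery.zen.minimum_master_nodes: 2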
Also, it's recommended to scale down one node at a time in order to avoid overloading the cluster with rebalancing and accidentally losing data.

Authentication failures in cassandra when 1 of 16 nodes is down

I have a Cassandra cluster running:
Cassandra 2.0.11.83 | DSE 4.6.0 | CQL spec 3.1.1 | Thrift protocol 19.39.0
The cluster has 18 nodes, split among 3 datacenters, 6 in each. My system_auth keyspace has the following replication defined:
replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': '4',
    'DC2': '4',
    'DC3': '4'
}
and my authenticator/authorizer are set to:
authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer
This morning I brought down one of the nodes in DC1 for maintenance. Within a few seconds to a minute, client applications started logging exceptions like this:
"User my_application_user has no MODIFY permission on or any of its parents"
Running 'LIST ALL PERMISSIONS of my_application_user' on one of the other nodes shows that user to have SELECT and MODIFY on the keyspace xxxxx, so I am rather confused. Do I have a setup issue? Is this a bug of some sort?
Re-posting this as the answer, as BrianC suggested above.
So this is resolved... Here's the sequence of events that seems to have fixed it:
Add 18 more nodes
Run cleanup on original nodes (this was part of the original plan)
Run a scrub on 1 table, since it was throwing exceptions on cleanup
Run a repair on the system_auth KS on the original troubled node
Wait for repair service to complete a full pass on all keyspaces
Decommission the original 18 nodes.
Honestly, I don't know exactly what fixed it. The system_auth repair makes the most sense, but what doesn't make sense is that repairs had run many passes before, so why it worked now I don't know. I hope this at least helps someone.
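For anyone hitting the same thing, the system_auth repair mentioned above is just an ordinary keyspace repair run on the troubled node, roughly:

# run on the node that was reporting the authentication failures
nodetool repair system_auth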