Redis node restarts when active connections reach limit

I have multiple Redis clusters running in cluster mode. Each node has the default limit of 10k client connections. One behavior I have observed is that when client connections reach this limit, the node restarts.
My expectation is that new client connections should be refused when this limit is breached, instead of the node restarting.
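For reference, the limit in question is the maxclients directive. A minimal redis.conf sketch (10000 is the default; the comment describes the documented behavior when the limit is hit, which is a rejected connection, not a restart):

```conf
# redis.conf
# Maximum number of simultaneous client connections.
# Once this limit is reached, new connections receive the error
# "ERR max number of clients reached" and are closed; the server
# itself is expected to keep running.
maxclients 10000
```

If the node is actually restarting at the limit, the cause is likely external to this setting (e.g. a supervisor or orchestrator reacting to the node), since Redis itself only refuses the excess connections.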

Related

redis slave node continually requires auth

I have deployed a Redis service in master-slave mode with 3 nodes on Kubernetes, and I am using Sentinel to keep it highly available. All nodes have requirepass and masterauth set.
When I connect to any slave node and execute the AUTH command, then go idle for a few seconds (about 5-15), Redis requires auth again.
As far as I know, Redis has no auth-expiry setting, so I'm curious whether this is a Redis mechanism I don't know about, or a problem with my Redis service.
I guess that your Redis server has its timeout config set to N seconds.
The timeout option controls whether or not to close the connection after a client is idle for N seconds (0 to disable) (quote from redis.conf).
You connect to Redis with redis-cli and send the AUTH command. If you then send no other commands for N seconds, Redis closes your connection. When you type your next command, redis-cli creates a new connection and sends it. Since no AUTH command has been sent on this new connection, the command fails and Redis asks you for authentication again.
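You can check and change this at runtime with CONFIG GET/SET; a sketch of a redis-cli session (the value 10 shown here is illustrative, not a default):

```
127.0.0.1:6379> CONFIG GET timeout
1) "timeout"
2) "10"
127.0.0.1:6379> CONFIG SET timeout 0
OK
```

Setting timeout to 0 disables idle disconnects entirely, which would make the re-auth symptom disappear if this is indeed the cause.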

Redis cluster node failure not detected on MISCONF

We currently have a Redis cache cluster with 3 masters and 3 slaves hosted on 3 Windows servers (1 master/slave per server). We are using StackExchange.Redis as our client.
We have RDB disabled but AOF enabled, and experienced a problem with the cluster in the following situation:
One of our servers became full and the Redis node on this server was unable to write to the AOF file (the error returned to the client was MISCONF Errors writing to the AOF file: No space left on device).
The cluster did not detect that the node was failing and so did not exclude it from the cluster.
All cache operations were blocked until we freed up some space on the server.
We know that we don't need the AOF, so we have disabled it since the incident.
But we would like to confirm or refute our understanding of Redis clustering: in our view, if a node experiences a failure, the cluster should redirect all requests to another node. We have tested this by stopping a master node, and a slave was promoted to master, so we are confident the cluster works; but we are not sure why, in our case, the node was not marked as failing.
Is the cluster capable of detecting a node failure when the failure only manifests when a client sends a request to the cluster?
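For context, cluster failure detection is driven by the node-to-node cluster bus (PING/PONG), governed by a single timeout; a sketch of the relevant directive (the value shown is the shipped default):

```conf
# redis.conf (cluster mode)
# A node is flagged PFAIL, and eventually FAIL, only when it stops
# answering cluster bus pings for this long. A node that still
# answers pings but returns application-level errors to clients
# (such as MISCONF) is not considered failed by this mechanism.
cluster-node-timeout 15000
```

This would explain the observed behavior: the out-of-disk node kept responding on the cluster bus, so no failover was triggered even though client writes were failing.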

How to stop client from reconnecting to server when the server is down?

How can we stop a client from reconnecting to the server after some number of retries?
In our case (an in-memory DB for fast retrieval), we use Ignite and Oracle in parallel, so that if the Ignite server is down, I can still get my data from Oracle.
But when I start my application while the Ignite server node is down for some reason, my application waits indefinitely until it can connect to the server.
Console message:
Failed to connect to any address from IP finder (will retry to join topology every 2000 ms; change 'reconnectDelay' to configure the frequency of retries):
There is a TcpDiscoverySpi.joinTimeout property, which does exactly what you want: https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html#setJoinTimeout-long-
By default it is not set, so the node will try to reconnect endlessly.
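In the usual Spring XML configuration for an Ignite node, that property can be set like this; a sketch, where the 10-second value is illustrative:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
      <!-- Give up joining the topology after 10 s instead of
           retrying forever; the join then fails with an exception
           the application can catch and fall back to Oracle on. -->
      <property name="joinTimeout" value="10000"/>
    </bean>
  </property>
</bean>
```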

Redis WAIT behavior upon failover

I have a Redis HA deployment with 3 nodes (1 Master, 2 Slaves) and a sentinel running on every node. On the client side, I use a WAIT 2 0 to block indefinitely until my write reached the 2 slaves (and I am OK with that).
What would be the behavior of the WAIT command upon:
1) a network partition isolates the master and the client from the 2 slaves so my client is currently blocked by the WAIT
2) the majority of sentinels elects one of the slaves as the new master (since there is still a quorum)
3) the network partition heals and the old master becomes a slave of the new one
Would the WAIT still block? Or would it release the client, returning "0" slaves reached?
Many thanks
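Not an answer to the partition scenario itself, but worth noting: the second argument to WAIT is a timeout in milliseconds, and a non-zero value bounds how long the client can stay blocked, returning however many replicas acknowledged so far. A redis-cli sketch (the replies shown are illustrative):

```
127.0.0.1:6379> SET key value
OK
127.0.0.1:6379> WAIT 2 1000
(integer) 2
```

With WAIT 2 0 as described above, the client has no such bound and relies entirely on Redis releasing it.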

ActiveMQ failover protocol not reconnecting to master after restarting

I am using ActiveMQ version 5.4 and I have a pure master slave configuration. My slave is configured to start its network transport connectors in the event of a failure. My clients are configured using the failover protocol, just like the docs say:
failover://(tcp://masterhost:61616,tcp://slavehost:61616)?randomize=false
When my master dies, the clients successfully fail over to the slave perfectly. The problem is that after I recover (i.e. stop the slave, copy over the data, restart the master, then restart the slave), the clients are still trying to connect to the slave (which does not have any open network connectors at that point). Thus, the clients never reconnect to the master after restarting it. Is this how it's supposed to work?
I've seen this as well. If you're using the PooledConnectionFactory, set an expiry timeout on the pooled connections via setExpiryTimeout. The API documentation suggests that this will force reconnection to the master broker:
allow connections to expire, irrespective of load or idle time. This is useful with failover to force a reconnect from the pool, to reestablish load balancing or use of the master post recovery
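Wired up in Spring XML, the suggestion above looks roughly like this; a sketch assuming activemq-pool is on the classpath, with illustrative bean ids and a 30-second expiry:

```xml
<bean id="amqConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
  <property name="brokerURL"
            value="failover://(tcp://masterhost:61616,tcp://slavehost:61616)?randomize=false"/>
</bean>

<bean id="pooledConnectionFactory" class="org.apache.activemq.pool.PooledConnectionFactory">
  <property name="connectionFactory" ref="amqConnectionFactory"/>
  <!-- Expire pooled connections periodically so clients re-run the
       failover transport and can rediscover the recovered master. -->
  <property name="expiryTimeout" value="30000"/>
</bean>
```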