Too many pub/sub connections in Redis Sentinel - redis

I have 5 CMDB servers where I installed Redis server (1 is the master and the remaining 4 are replicas). To achieve high availability I installed 5 Sentinel nodes on different middleware servers that monitor the master Redis server. Redis version is 6.2.
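For context, each Sentinel points at the master with a configuration along these lines (a sketch only; the master name, IP placeholder and quorum here are illustrative, not my exact config):

sentinel monitor mymaster <master-ip> 6379 3
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000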
The max client limit is 10000 connections, but for some reason the Sentinel nodes are opening too many pub/sub connections with the other Sentinel nodes and running out of the client limit. Why are so many pub/sub connections being created?
Among the 5 Sentinel nodes, only one node is still accessible; the remaining nodes throw the error "Max client limit reached".
Please check the output of the working Sentinel node:
----------------------------------------------------------------
127.0.0.1:26379> info clients
# Clients
connected_clients:1119
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:32
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
--------------------------------------------------
Output of a Sentinel node that has reached the max client limit:
127.0.0.1:26379> info clients
Error: Connection reset by peer
127.0.0.1:26379> info clients
ERR max number of clients reached
127.0.0.1:26379>
--------------------------------------------------------
Sample output of the "client list" command on the working Sentinel node:
id=264830 addr=x.x.x.x:5543 laddr=x.x.x.x:26379 fd=839 name= age=2327 idle=2327 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1
id=267063 addr=x.x.x.x:20618 laddr=x.x.x.x:26379 fd=935 name= age=2065 idle=2065 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1
id=264831 addr=x.x.x.x:46830 laddr=x.x.x.x:26379 fd=840 name= age=2327 idle=2327 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20504 events=r cmd=subscribe user=default redir=-1
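To see how many of these subscriber connections exist and which peers they come from, something like this can be run on the Sentinel host (a rough sketch; adjust host/port as needed, and note the awk split assumes IPv4 addresses):

# total pub/sub subscriber connections on this sentinel
redis-cli -p 26379 client list | grep -c 'cmd=subscribe'

# subscriber connections grouped by peer IP
redis-cli -p 26379 client list | awk -F'[ =:]+' '/cmd=subscribe/ {print $4}' | sort | uniq -c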
Please help: how can I prevent too many pub/sub connections between Sentinel nodes? Because of this, applications are failing to connect to the Redis server.

Related

Failed to send message to remote node in Ignite

I am running my service in a local environment and trying to connect to a remote node, but it shows the error "Failed to send message to remote node".
I want to run my service in the local environment and connect it to a remote Ignite node on a different server.
My configuration is:
IgniteConfiguration igniteConfig = new IgniteConfiguration();
igniteConfig.setIgniteInstanceName("MasterCacheCluster");
igniteConfig.setPeerClassLoadingEnabled(true);
igniteConfig.setClientMode(true);
TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
ipFinder.setAddresses(Arrays.asList("server_address:47500..47509"));
discoverySpi.setIpFinder(ipFinder);
igniteConfig.setDiscoverySpi(discoverySpi);
DataStorageConfiguration dataCfg = new DataStorageConfiguration();
DataRegionConfiguration rgnCfg = new DataRegionConfiguration();
rgnCfg.setName("Sample_Cluster_Region");
rgnCfg.setPageEvictionMode(DataPageEvictionMode.RANDOM_2_LRU);
rgnCfg.setPersistenceEnabled(true);
rgnCfg.setMetricsEnabled(true);
dataCfg.setDataRegionConfigurations(rgnCfg);
Ignite ignite = Ignition.start(igniteConfig);
ignite.cluster().active(true);
System.out.println("Cluster Size: " +
ignite.cluster().nodes().size());
return ignite;
** server address is hidden for privacy reasons
[13:12:18,839][SEVERE][exchange-worker-#62%MasterCacheCluster%][TcpCommunicationSpi] Failed to send message to remote node [node=TcpDiscoveryNode [id=724fff2c-76c2-44e7-921f-b7c37dac7d15, consistentId=7c4ed309-0b9b-40ba-84a1-90384e0940ea, addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 10.3.0.8, 127.0.0.1], sockAddrs=null, discPort=47500, order=1, intOrder=1, lastExchangeTime=1676878928401, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtPartitionsSingleMessage [parts=null, partCntrs=null, partsSizes=null, partHistCntrs=null, err=null, client=true, exchangeStartTime=1676878928573, finishMsg=null, super=GridDhtPartitionsAbstractMessage [exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=96f70bd7-cbfb-4a3e-900d-00a93b10d892, consistentId=96f70bd7-cbfb-4a3e-900d-00a93b10d892, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 172.16.16.50], sockAddrs=HashSet [/[0:0:0:0:0:0:0:1]:0, /127.0.0.1:0, LAPTOP-6AUCFF2I/172.16.16.50:0], discPort=0, order=2, intOrder=0, lastExchangeTime=1676878923997, loc=true, ver=2.14.0#20220929-sha1:951e8deb, isClient=true], topVer=2, msgTemplate=null, span=org.apache.ignite.internal.processors.tracing.NoopSpan#baed14f, nodeId8=96f70bd7, msg=null, type=NODE_JOINED, tstamp=1676878928556], nodeId=96f70bd7, evt=NODE_JOINED], lastVer=GridCacheVersion [topVer=0, order=1676878923547, nodeOrder=0, dataCenterId=0], super=GridCacheMessage [msgId=1, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], err=null, skipPrepare=false]]]]]
class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=724fff2c-76c2-44e7-921f-b7c37dac7d15, addrs=[/10.3.0.8:47100, /[0:0:0:0:0:0:0:1%lo]:47100, /127.0.0.1:47100]]
Your client tries to establish a communication link to the server node with id=724fff2c-76c2-44e7-921f-b7c37dac7d15 after receiving its address through the discovery protocol. This exception basically implies that there's no connectivity between your local host and "server_address":47100. Every single node (including clients) should be visible to the rest of the cluster. My guess is you have some firewall rules or something like that.
Try running some tools to troubleshoot; you could start with:
nc -vz "server_address" 47100
It should be run from your laptop.
It's also worth mentioning that your server exposes IPv6 addresses. It's recommended to use IPv4 at the moment. Add the -Djava.net.preferIPv4Stack=true JVM param to both the client and server JVM start scripts.
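If you start the client from Java code rather than a start script, the same flag can also be applied programmatically before Ignite touches the network. A minimal sketch (the class name and the rest of the configuration are placeholders, not your actual code):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ClientStart {
    public static void main(String[] args) {
        // Equivalent to passing -Djava.net.preferIPv4Stack=true on the command line.
        // It must run before any networking classes are initialized, so it goes at
        // the very top of main(); setting the flag in the start script is the safer option.
        System.setProperty("java.net.preferIPv4Stack", "true");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true);
        Ignite ignite = Ignition.start(cfg);
        System.out.println("Cluster size: " + ignite.cluster().nodes().size());
    }
}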

Ignite recurring stacktrace: Failed to process selector key

Ignite version: 2.14.0
Node configuration: 2 nodes running on the same PC (IPv4) using localhost and 255 available ports:
TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
ipFinder.setAddresses(Collections.singletonList("127.0.0.1"));
Also 2 different working dirs, a thread pool of 16, and 2 caches (one atomic, one transactional). A rough configuration sketch follows below.
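For completeness, each node is started with roughly this kind of configuration (a sketch only; the instance names, work directories and the nodeIndex argument are illustrative, not my exact code):

import java.util.Collections;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder;

public class LocalNode {
    public static void main(String[] args) {
        int nodeIndex = Integer.parseInt(args[0]); // 0 or 1, keeps instance names and work dirs apart

        TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
        ipFinder.setAddresses(Collections.singletonList("127.0.0.1"));

        TcpDiscoverySpi discovery = new TcpDiscoverySpi();
        discovery.setIpFinder(ipFinder);
        discovery.setLocalPort(47500);
        discovery.setLocalPortRange(255); // the "255 available ports" mentioned above

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setIgniteInstanceName("node-" + nodeIndex);
        cfg.setLocalHost("127.0.0.1");                          // bind to IPv4 loopback only
        cfg.setWorkDirectory("/tmp/ignite-work-" + nodeIndex);  // separate work dir per node
        cfg.setDiscoverySpi(discovery);

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Nodes in topology: " + ignite.cluster().nodes().size());
    }
}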
What happens: Using an ExecutorService I submit 8 threads to the pool. The classes run correctly (4 on each node) and execute tasks as expected.
But during execution, the following exception is raised repeatedly on both nodes: GRAVE (SEVERE): "Failed to process selector key".
The application generates a high computational load. A simple "for loop" with a sleep gives no error.
Full stack follows:
GRAVE: Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=3, bytesRcvd=97567668, bytesSent=100128669, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675265761563, hashCode=2143442267, interrupted=false, runner=grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric#69a257d1, super=GridNioSessionImpl [locAddr=/0:0:0:0:0:0:0:1:47101, rmtAddr=/0:0:0:0:0:0:0:1:56361, createTime=1675265760336, closeTime=0, bytesSent=8479762, bytesRcvd=7459908, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675265760336, lastSndTime=1675265761545, lastRcvTime=1675265761563, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser#b329ba4, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]]
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto [i.e. "An existing connection was forcibly closed by the remote host"]
at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:51)
at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:58)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:50)
at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:466)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1715)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1407)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2511)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2273)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at java.base/java.lang.Thread.run(Thread.java:834)
Expected: I read that it could be a configuration problem, but I don't understand how to fix it.
The configuration seems simple, and even though the execution produces no calculation errors, I would like to avoid this exception.
NODE1
[2023-02-02 15:54:30] [AVVERTENZA] Client disconnected abruptly due to network connection loss or because the connection was left open on application shutdown. [cls=class o.a.i.i.util.nio.GridNioException, msg=Connessione in corso interrotta forzatamente dall'host remoto] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [AVVERTENZA] Unacknowledged messages queue size overflow, will attempt to reconnect [remoteAddr=/127.0.0.1:63660, queueLimit=4096] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63671] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63672] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63673] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [GRAVE ] Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=2269317, bytesSent=3928093, bytesRcvd0=1909138, bytesSent0=720914, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349670621, hashCode=722948156, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric#69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670, createTime=1675349670241, closeTime=0, bytesSent=720914, bytesRcvd=1909138, bytesSent0=720914, bytesRcvd0=1909138, sndSchedTime=1675349670241, lastSndTime=1675349670277, lastRcvTime=1675349670621, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser#16179752, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]] - [org.apache.ignite.logger.java.JavaLogger error:
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto
NODE2
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:63669] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection from remote node while connecting to this node, rejecting [locNode=8a378874-f3ae-4d0c-9733-a6b143097658, locNodeOrder=1, rmtNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNodeOrder=2] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63670, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] TCP client created [client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, bytesRcvd=84, bytesSent=56, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349671637, hashCode=762674116, interrupted=false, runner=grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=9391 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=753, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=531, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, reserveCnt=9, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=org.apache.ignite.internal.processors.metric.impl.LongAdderMetric#69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101, createTime=1675349671637, closeTime=0, bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675349671637, lastSndTime=1675349671637, lastRcvTime=1675349671637, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser#544beb47, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]], super=GridAbstractCommunicationClient [lastUsed=1675349671637, closed=false, connIdx=0]], duration=339ms] - [org.apache.ignite.logger.java.JavaLogger info:]

Ignite client unstable since upgrade from 2.7.0 to 2.9.0

We upgraded a bunch of libraries in our application, one of them being Ignite. Right now the Ignite node running in client mode is crashing. My thinking is that one of the upgrades caused the cache to increase in size (so I don't think the upgrade of Ignite itself is the problem).
So I increased the heap size from 10 to 20 GB, but when about 50% is used the JVM hangs.
I'm confused about why it does this when only 50% is in use.
[12/3/20 16:07:58:788 GMT] 000000c4 IgniteKernal I .... Heap [used=9937MB, free=51.48%, comm=10680MB]
followed by
[12/3/20 16:08:26:410 GMT] 000000bd IgniteKernal W Possible too long JVM pause: 2418 milliseconds.
[12/3/20 16:08:27:465 GMT] 000000c5 TcpCommunicat W Client disconnected abruptly due to network connection loss or because the connection was left open on application shutdown. [cls=class o.a.i.i.util.nio.GridNioException, msg=Connection reset by peer]
[12/3/20 16:08:27:411 GMT] 000000c5 TcpCommunicat E Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=48849402273, bytesSent=15994664546, bytesRcvd0=54446, bytesSent0=102, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=null, finished=false, heartbeatTs=1607011706410, hashCode=433635054, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#51]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=9025120, resendCnt=0, rcvCnt=9025150, sentCnt=9025152, reserved=true, lastAck=9025120, nodeLeft=false, node=TcpDiscoveryNode [id=b3ca311e-077f-42a5-884a-807b539730b6, consistentId=10.60.46.12:48500, addrs=ArrayList [10.60.46.12], sockAddrs=HashSet [hex-wgc-p-web02/10.60.46.12:48500], discPort=48500, order=1, intOrder=1, lastExchangeTime=1607006097079, loc=false, ver=2.9.0#20201015-sha1:70742da8, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=9025120, resendCnt=0, rcvCnt=9025150, sentCnt=9025152, reserved=true, lastAck=9025120, nodeLeft=false, node=TcpDiscoveryNode [id=b3ca311e-077f-42a5-884a-807b539730b6, consistentId=10.60.46.12:48500, addrs=ArrayList [10.60.46.12], sockAddrs=HashSet [hex-wgc-p-web02/10.60.46.12:48500], discPort=48500, order=1, intOrder=1, lastExchangeTime=1607006097079, loc=false, ver=2.9.0#20201015-sha1:70742da8, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric#69a257d1, super=GridNioSessionImpl [locAddr=/10.223.132.3:52550, rmtAddr=/10.60.46.12:48100, createTime=1607006097572, closeTime=0, bytesSent=15994657850, bytesRcvd=48849402273, bytesSent0=102, bytesRcvd0=54446, sndSchedTime=1607006097572, lastSndTime=1607011706410, lastRcvTime=1607011706410, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser#93200255, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]]]
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:51)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:235)
at sun.nio.ch.IOUtil.read(IOUtil.java:204)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:394)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1330)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2472)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2239)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1880)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:822)
[12/3/20 16:08:44:437 GMT] 000000c4 SystemOut O [16:08:44] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1607011706420]]]
[12/3/20 16:08:44:436 GMT] 000000c4 W java.util.logging.LogManager$RootLogger log Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1607011706420]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1607011706420]
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1806)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1801)
at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234)
at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:822)
[12/3/20 16:08:44:434 GMT] 000000c4 G W Thread [name="tcp-comm-worker-#1-#63", id=211, state=WAITING, blockCnt=2, waitCnt=100]
[12/3/20 16:08:44:432 GMT] 000000c4 G E Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [workerName=tcp-comm-worker, threadName=tcp-comm-worker-#1-#63, blockedFor=18s]
[12/3/20 16:09:14:486 GMT] 000000c4 SystemOut O [16:09:14] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1607011736000]]]
These look like network issues.
[12/3/20 16:08:27:411 GMT] 000000c5 TcpCommunicat E Failed to process selector key
[workerName=tcp-comm-worker, threadName=tcp-comm-worker-#1-#63, blockedFor=18s]
Check that you are able to connect from your client machines to your server machines and that firewall configs are properly set up.
see: https://ignite.apache.org/docs/latest/clustering/network-configuration
Make sure you've set -Djava.net.preferIPv4Stack=true if you are using IPv4 addresses.
If there are containers and/or private addresses involved, it might cause connection issues.
See: https://ignite.apache.org/docs/latest/clustering/running-client-nodes-behind-nat#limitations

Syntax error using CLIENT KILL USER <username> on Redis 5.9.102

I'm trying to kill a Redis client by user, as per the docs, but I get a syntax error in redis-cli:
redis:6379> client kill user my_client
(error) ERR syntax error
redis:6379> info
# Server
redis_version:5.9.102
What's the correct syntax for this command?
According to this commit (May 1, 2020), which was committed to the unstable branch, your syntax is correct, but it has not been released in the stable versions such as the one you are using.
If you want to kill a client by ip:port instead, then you need something like this:
127.0.0.1:6379> client list
id=272 addr=127.0.0.1:51374 fd=8 name= age=66 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client
id=273 addr=127.0.0.1:51376 fd=9 name= age=19 idle=16 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
127.0.0.1:6379> client kill 127.0.0.1:51376
OK
127.0.0.1:6379>
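If the address changes between lookups, killing by client id (available since Redis 2.8.12, so it works on your version as well) is another option. A sketch using the id from the same client list output as above:

127.0.0.1:6379> client kill id 273
(integer) 1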

redis client list returns "cmd=NULL"

When I run client list in redis-cli, I find some entries showing cmd=NULL, e.g.:
id=198375 addr=10.213.96.168:37090 fd=696 name= age=8064 idle=8064 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL
What are those cmd=NULL entries?
cmd=xxx shows the LAST COMMAND that the client executed.
However, if a client has connected to Redis but has NOT executed any command yet, there is no last command, so it is reported as NULL.
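One way to see this yourself (a rough sketch; host and port are placeholders): open a raw TCP connection that never sends a command, for example with nc, and then check client list from another terminal:

$ nc localhost 6379        # connect and type nothing

# in another terminal:
$ redis-cli client list | grep 'cmd=NULL'

The idle nc connection shows up with cmd=NULL, since it has never sent a command.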