ConnectTimeoutException infinispan - infinispan

During an Infinispan cache load, I randomly see the following error message in my application logs:
Cluster might have completely shutdown, try resetting transport layer and topology id
io.netty.channel.ConnectTimeoutException: connection timed out: 172.XX.20.XX:11222
A connectivity check (telnet, nc -z ip 11222) succeeds at that time, there are no errors in the Infinispan server logs, and the number of connections to the server is not very high.
Infinispan server version 9.4.11
Could anyone help me work out what is causing this?
Thanks,
Satyabrata Mohanty

What are your Hot Rod client connection and socket timeout settings?
I would increase the connection timeout and see if it happens again. If the server is overloaded, your connection timeout might be too short.
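A single successful nc -z check does not show how close connect latency gets to the timeout while the cache load is running. The sketch below is a plain TCP probe in Python, not the Hot Rod client. It measures how long a connection takes to open, so the result can be compared with the configured connect timeout. The function name, the host, and the 2-second default are assumptions for illustration.

```python
import socket
import time


def measure_connect(host, port, timeout_s=2.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    A timeout and a refused connection are both reported as None.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        # socket.timeout and ConnectionRefusedError are both OSError subclasses.
        return None
```

Calling measure_connect(server_ip, 11222) in a loop during the load, from the same host as the application, would show whether connect times spike toward the timeout.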

Related

Apache Ignite - how to close client

I start a client that connects to a remote server.
The job involves a lot of computation.
Then the client is accidentally disconnected.
However, the client's computation keeps running on the remote server.
Is there a way to stop it?
That will eventually happen once the socket timeout is reached, I guess.

How to stop client from reconnecting to server when the server is down?

How can we stop a client from reconnecting to the server after a number of retries?
In our case (an in-memory DB for fast retrieval), we use Ignite and Oracle in parallel, so that if the Ignite server is down I can get my data from Oracle.
But when I start my application while the Ignite server node is down for some reason, the application waits indefinitely until it connects to the server.
Console message:
Failed to connect to any address from IP finder (will retry to join topology every 2000 ms; change 'reconnectDelay' to configure the frequency of retries):
There is a TcpDiscoverySpi.joinTimeout property, which does exactly what you want: https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html#setJoinTimeout-long-
By default it is not set, so the node keeps trying to connect endlessly.
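To make the effect of a join timeout concrete, here is a language-neutral sketch in Python of the behaviour described: retry at a fixed delay, give up once a deadline passes, then fall back. It does not use the Ignite API. The try_join callable and all parameter names are hypothetical; in Ignite itself the setting is TcpDiscoverySpi.setJoinTimeout, as linked above.

```python
import time


def join_with_timeout(try_join, join_timeout_s, retry_delay_s=2.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call try_join() until it returns True or join_timeout_s has elapsed.

    Returns True if a join attempt succeeded, False if the deadline passed.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + join_timeout_s
    while True:
        if try_join():
            return True
        # Stop if the next retry would start after the deadline.
        if clock() + retry_delay_s > deadline:
            return False
        sleep(retry_delay_s)
```

The caller can then pick the data source from the result, for example reading from Oracle when join_with_timeout(...) returns False.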

How to dispose of idle PUBSUB Redis connections

I was recently investigating some issues on my Redis clusters and saw that many connections stick around even though they have been idle for a long time.
After some investigation, I found that I have these two settings on my cluster:
timeout 300
tcp-keepalive 0
The stale connections that aren't going away are PUB/SUB client connections (StackExchange.Redis clients, in fact, but that's beside the point), so they do not respect the timeout configuration. As such, the tcp-keepalive seems to be the only other configuration that can ensure these connections get cleaned up over time.
So I applied this setting to all the nodes:
redis-trib.rb call 127.0.0.1:6001 config set tcp-keepalive 300
At this point I went home, and I came back the next morning, assuming the stale connections would have been disposed of properly. Sadly, I was greeted by all the same connections.
My question is: Is there any way, from the Redis server side, to dispose of these connections gracefully after they've been established? Is it expected that connections which were already established before the tcp-keepalive setting was applied will not be disposed of?
The only solution I've found besides restarting the Redis servers is to do a bit of scripting and use the CLIENT KILL command, which is doable, but I was hoping for something configuration-based to handle this.
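The scripting approach mentioned above could look roughly like the following sketch. It parses the text returned by CLIENT LIST and builds CLIENT KILL ID commands for subscribers that have been idle longer than a threshold. It assumes a Redis version whose CLIENT LIST output includes the id, idle, sub and psub fields. The function names and the threshold are hypothetical, and an idle threshold alone may not separate a dead peer from a quiet subscriber.

```python
def parse_client_list(raw):
    """Parse CLIENT LIST output into a list of dicts, one per connection."""
    clients = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        for token in line.split(" "):
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value
        clients.append(fields)
    return clients


def stale_pubsub_kill_commands(raw, min_idle_s):
    """Return CLIENT KILL commands for subscribers idle for at least min_idle_s."""
    commands = []
    for client in parse_client_list(raw):
        # sub/psub count channel and pattern subscriptions for the connection.
        subscribed = int(client.get("sub", 0)) + int(client.get("psub", 0)) > 0
        if subscribed and int(client.get("idle", 0)) >= min_idle_s:
            commands.append("CLIENT KILL ID " + client["id"])
    return commands
```

The sketch only produces the command strings; sending them to each node (for example through redis-cli) is left to the surrounding script, so the list can be reviewed first.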
Thanks in advance for any insight here!

When does Resque open a redis connection?

I've been running into Redis::TimeoutError: Connection timed out errors on Heroku, and I'm trying to pin down the problem. I'm only using Resque to connect to redis, so I'm wondering how Resque connects to redis:
When does Resque connect to redis? When a worker is started?
How long do redis connections last, typically?
It's unclear to me when connections are made and how long they last. Can anyone shed some light on this for me? Thanks!
Typically, connections to Redis from Rails apps are established lazily, the first time the connection is used. For troubleshooting, it is sometimes useful to force the connection by adding a Redis PING (http://redis.io/commands/ping) to the initializer code.
Once a connection is established, it is kept open indefinitely. If the connection is dropped, a reconnect is attempted the next time it is used.
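As an illustration of forcing the connection early with a PING, the sketch below opens a socket, sends an inline PING, and checks for the +PONG reply. It is written in Python with a raw socket rather than through Resque or a Redis client library, and the function name and timeout are assumptions. A server that requires authentication would answer with an error instead, which this sketch reports as False.

```python
import socket


def redis_ping(host, port, timeout_s=1.0):
    """Send an inline PING and return True if the server replies with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Covers connect timeouts, refused connections and read timeouts.
        return False
    return reply.startswith(b"+PONG")
```

Running such a check at startup surfaces a connection timeout immediately, rather than on the first job that happens to touch Redis.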
Also, be aware that as of early 2015, Heroku had an ongoing issue establishing connections to Redis instances on AWS, as the connections would occasionally time out. Heroku support is aware of that, so you may be able to get some help reaching out to them.

Redis Log says Connection Reset by Peer

I am getting the "Connection Reset by Peer" error in Redis Log.
Your client probably either timed out or raised another exception that closed the connection before Redis was able to write to it, or it crashed.
Either increase your client's timeout, or check the client logs to see what happens.
If you try to connect to redis before it's fully started, you might get this error from the client.
I fixed it by adding a two-second sleep after starting the container and before trying to connect. It would be better to wait for some kind of health indicator.
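One possible health indicator is simply whether the port accepts connections yet. The sketch below polls until a TCP connection succeeds or a deadline passes, instead of sleeping for a fixed time. The function name, the 30-second deadline and the polling interval are assumptions. An open port does not guarantee that Redis has finished loading its data, so a PING check would be a stricter test.

```python
import socket
import time


def wait_until_ready(host, port, deadline_s=30.0, interval_s=0.5):
    """Poll host:port until a TCP connection succeeds.

    Returns True once connected, or False if deadline_s passes first.
    """
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Not accepting connections yet; wait a little and try again.
            time.sleep(interval_s)
    return False
```

Called right after the container starts, this returns as soon as the server is reachable and fails clearly when it never comes up.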