Client timeout in Gemfire logs - gemfire

​
I am currently in the process of trouble shooting Gemfire logs for client timeout:
​
Logs:​
​[warning 2018/06/11 23:45:05.685 CDT xxxxsrv01_instance_42404_cacheserver <ClientHealthMonitor Thread> tid=0x97] Server connection from [identity(xx.xx.xx.xx(5338:loner):33228:4cae4ef2,connection=2; port=33261] is terminated because its client timeout of 10,000 has expired.
Even increasing the ​"read-timeout" value in client pool config to 200000 gives "client time out of 200000 has expired".
Also noticed below logs:
xxxxx_myinstance_42404_cacheserver <Handshaker /0:0:0:0:0:0:0:0:42404 Thread 229> tid=0x2c29cc] Bridge server: failed accepting client connection {0}
​Can you please help me on possible causes of the issue.
Regards,​
Rahul N.​

Related

Lettuce client for Redis - Cluster Topology Refresh Options not working

I'm using lettuce client version 6.2.0 to connect to a Redis cluster (v 6.2) with 3 masters each having 1 replica. I'm trying that the client re-discovers the cluster topology after a master goes down. Here is the client code I have:
List<RedisURI> redisURIs = new ArrayList<>();
redisURIs.add(RedisURI.create("redis://127.0.0.1:7000"));
redisURIs.add(RedisURI.create("redis://127.0.0.1:7001"));
redisURIs.add(RedisURI.create("redis://127.0.0.1:7002"));
redisURIs.add(RedisURI.create("redis://127.0.0.1:7003"));
redisURIs.add(RedisURI.create("redis://127.0.0.1:7004"));
redisURIs.add(RedisURI.create("redis://127.0.0.1:7005"));
ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
.enableAllAdaptiveRefreshTriggers()
.refreshTriggersReconnectAttempts(1)
.enablePeriodicRefresh(Duration.ofSeconds(5))
.build();
ClusterClientOptions clientOptions = ClusterClientOptions.builder()
.autoReconnect(true).topologyRefreshOptions(topologyRefreshOptions).build();
ClientResources clientResources = ClientResources.builder().reconnectDelay(Delay.equalJitter()).build();
RedisClusterClient clusterClient = RedisClusterClient.create(clientResources, redisURIs);
clusterClient.setOptions(clientOptions);
The problem is that despite the setting enablePeriodicRefresh(Duration.ofSeconds(5)) the refresh interval is still taken as 60 seconds, instead of 5 seconds. Till 1 minute after a master goes down, the client stops working, i.e. it is not able to issue incr operation through clusterClient and the error keeps repeating:
Jul 18, 2022 5:56:21 PM io.lettuce.core.protocol.ConnectionWatchdog lambda$run$4
WARNING: Cannot reconnect to [127.0.0.1:7000]: Connection refused: /127.0.0.1:7000
After 1 minute timeout, it shows the warning message:
Jul 18, 2022 5:56:22 PM io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh lambda$openConnections$12
WARNING: Unable to connect to [127.0.0.1:7000]: Connection refused: /127.0.0.1:7000
Command timed out after 1 minute(s)
..and then it is able to proceed with commands. Even after that, it keeps showing the warning message:
Jul 18, 2022 5:56:27 PM io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh lambda$openConnections$12
WARNING: Unable to connect to [127.0.0.1:7000]: Connection refused: /127.0.0.1:7000
What am I missing here?
The enablePeriodicRefresh() setting works after the connection timeout.
You don't set the connection timeout, but it defaults to 60sec.
Adjusting the connection timeout will give you the desired result.
ex) redisURIs.add(RedisURI.create("redis://127.0.0.1:7000/0?timeout=10s"));

org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable inside kubernetes environment

We have a setup wherein, one ignite server node serves 15 to 20 thick client nodes and 40 to 50 thin client nodes, thin client connection is singlton,
In operation, some times we get below error,
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
On the Server node, we are inserting data inside a third party store using CacheStoreAdapters
Don't know where it goes wrong since out of 100 operations one operation fails with the above error.
Also, let me know what can we do for this failure handling.
Apache Ignite version: 2.8
Edits: (Code Snippet)
ClientConfiguration cfg = new ClientConfiguration()
.setAddresses("host:port");
IgniteClient client = Ignition.startClient(cfg); // this client is singleton
client.getOrCreateCache("ABC_CACHE").put(key, val);
StatckTrace:
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:499)
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:491)
at org.apache.ignite.internal.client.thin.TcpClientChannel.access$100(TcpClientChannel.java:92)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:538)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.readInt(TcpClientChannel.java:572)
at org.apache.ignite.internal.client.thin.TcpClientChannel.processNextResponse(TcpClientChannel.java:272)
at org.apache.ignite.internal.client.thin.TcpClientChannel.receive(TcpClientChannel.java:234)
at org.apache.ignite.internal.client.thin.TcpClientChannel.service(TcpClientChannel.java:171)
at org.apache.ignite.internal.client.thin.ReliableChannel.service(ReliableChannel.java:160)
at org.apache.ignite.internal.client.thin.ReliableChannel.request(ReliableChannel.java:187)
at org.apache.ignite.internal.client.thin.TcpIgniteClient.getOrCreateCache(TcpIgniteClient.java:114)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:535)
... 36 more
You probably have network or NAT configured which will reset connections when not used, or even sporadically.
In this case, you will have to reconnect.
Another option, are you sure you are connecting to thin client port and not some other port?

Openshift online v3 - Timeout when reading response headers from daemon process

I created an python api on openshift online with python image. If you request all the data, it takes more than 30 seconds to respond. The server gives a 504 gateway timeout http response. How do you configure the length a response can take? > I created an annotation on the route, this seems to set proxy timeout.
haproxy.router.openshift.io/timeout: 600s
Problem remains, I now got logging. It looks like the message comes from mod_wsgi.
I want to try alter the configuration of the httpd (mod_wsgi-express process) from request-timeout 60 to request-timeout 600. Where doe you configure this. I am using base image https://github.com/sclorg/s2i-python-container/tree/master/2.7
Logging:
Timeout when reading response headers from daemon process 'localhost:8080':/tmp/mod_wsgi-localhost:8080:1000430000/htdocs
Does someone know how to fix this error on openshift online
Next to alter timeout of haproxy of the route of my app
haproxy.router.openshift.io/timeout: 600s
I altered the request-timeout and socket-timeout in app.sh of my python application. So the mod_wsgi-express server is configured with a higher timeout
ARGS="$ARGS --request-timeout 600"
ARGS="$ARGS --socket-timeout 600"
My application now wait 10 minutes before cancelling a request

ActiveMQ Master/Slave on Weblogic - vm transport issue

I am trying to configure ActiveMQ master/slave setup on a single WebLogic machine. The problem is when I start Managed Server1 it successfully connects to vm transport and everything works perfectly, but when I start Managed Server2 I am receiving the following errors in broker logs
INFO 2016-September-27 10:08:00,227 ActiveMQEndpointWorker:124 - Connection attempt already in progress, ignoring connection exception
INFO 2016-September-27 10:08:01,161 TransportConnector:260 - Connector vm://localhost started
INFO 2016-September-27 10:08:30,228 TransportConnector:291 - Connector vm://localhost stopped
INFO 2016-September-27 10:08:30,229 TransportConnector:260 - Connector vm://localhost started
WARN 2016-September-27 10:08:30,228 ActiveMQManagedConnection:385 - Connection failed: javax.jms.JMSException: peer (vm://localhost#61) stopped.
WARN 2016-September-27 10:08:30,231 TransportConnection:823 - Failed to add Connection ID:ndl-wls-300.mydomain.com-52251-1474966937425-65:1 due to java.lang.NullPointerException
ERROR 2016-September-27 10:08:30,233 ActiveMQEndpointWorker:183 - Failed to connect to broker [vm://localhost?create=false]: java.lang.NullPointerException
javax.jms.JMSException: java.lang.NullPointerException
Please help, I am stuck with this.
I still don't see the reason for the slave within the same VM. I suggest you reach out to an ActiveMQ expert consultant to validate your architecture.
However, I think I can help you move a little bit closer to this issue:
There is a fundamental miss understanding here.. the vm url is broken down like this:
vm://${brokerName}?option=value,etc
The first time you create vm://localhost?create=true.. you have created a broker
The second time you reference vm://localhost?create=false.. you have created a client connection to the first broker.
To get two brokers, you'd need two different vm://${brokerName}?create=true

Tomcat Connection Pool Exhausted

I'm using Apache Tomcat JDBC connection pooling in my project. I'm confused because under heavy load I keep seeing the following error:
12:26:36,410 ERROR [] (http-/XX.XXX.XXX.X:XXXXX-X) org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-/XX.XXX.XXX.X:XXXXX-X] Timeout: Pool empty. Unable to fetch a connection in 10 seconds, none available[size:4; busy:4; idle:0; lastwait:10000].
12:26:36,411 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/APP].[AppConf]] (http-/XX.XXX.XXX.X:XXXXX-X) JBWEB000236: Servlet.service() for servlet AppConf threw exception: org.jboss.resteasy.spi.UnhandledException: java.lang.NullPointerException
My expectation was that with pooling, requests for new connections would be held in a queue until a connection became available. Instead it seems that requests are rejected when the pool has reached capacity. Can this behaviour be changed?
Thanks,
Dal
This is my pool configuration:
PoolProperties p = new PoolProperties();
p.setUrl("jdbc:oracle:thin:#" + server + ":" + port + ":" + SID_SVC);
p.setDriverClassName("oracle.jdbc.driver.OracleDriver");
p.setUsername(username);
p.setPassword(password);
p.setMaxActive(4);
p.setInitialSize(1);
p.setMaxWait(10000);
p.setRemoveAbandonedTimeout(300);
p.setMinEvictableIdleTimeMillis(150000);
p.setTestOnBorrow(true);
p.setValidationQuery("SELECT 1 from dual");
p.setMinIdle(1);
p.setMaxIdle(2);
p.setRemoveAbandoned(true);
p.setJdbcInterceptors(
"org.apache.tomcat.jdbc.pool.interceptor.ConnectionState;"
+ "org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer;"
+ "org.apache.tomcat.jdbc.pool.interceptor.ResetAbandonedTimer");
This working as per design/implementation, if you see the log Timeout: Pool empty. Unable to fetch a connection in 10 seconds and your configuration is p.setMaxWait(10000);. The requesting thread waits for 10seconds(10000 millseconds, maxwait) before giving up waiting for connection.
Now you have two solutions, increase the number of maxActive connection or check if there are any connection leaks/long running queries(which you do not expect).