org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable inside kubernetes environment - ignite

We have a setup in which one Ignite server node serves 15 to 20 thick client nodes and 40 to 50 thin client nodes. The thin client connection is a singleton.
During operation, we sometimes get the error below:
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
On the server node, we insert data into a third-party store using CacheStoreAdapters.
We don't know where it goes wrong, since roughly one operation out of 100 fails with the above error.
Also, please let me know what we can do to handle this failure.
Apache Ignite version: 2.8
Edits: (Code Snippet)
ClientConfiguration cfg = new ClientConfiguration()
.setAddresses("host:port");
IgniteClient client = Ignition.startClient(cfg); // this client is singleton
client.getOrCreateCache("ABC_CACHE").put(key, val);
Stack trace:
org.apache.ignite.client.ClientConnectionException: Ignite cluster is unavailable [sock=Socket[addr=hostnm19.hostx.com/10.13.10.19,port=30519,localport=57552]]
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:499)
at org.apache.ignite.internal.client.thin.TcpClientChannel.handleIOError(TcpClientChannel.java:491)
at org.apache.ignite.internal.client.thin.TcpClientChannel.access$100(TcpClientChannel.java:92)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:538)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.readInt(TcpClientChannel.java:572)
at org.apache.ignite.internal.client.thin.TcpClientChannel.processNextResponse(TcpClientChannel.java:272)
at org.apache.ignite.internal.client.thin.TcpClientChannel.receive(TcpClientChannel.java:234)
at org.apache.ignite.internal.client.thin.TcpClientChannel.service(TcpClientChannel.java:171)
at org.apache.ignite.internal.client.thin.ReliableChannel.service(ReliableChannel.java:160)
at org.apache.ignite.internal.client.thin.ReliableChannel.request(ReliableChannel.java:187)
at org.apache.ignite.internal.client.thin.TcpIgniteClient.getOrCreateCache(TcpIgniteClient.java:114)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.ignite.internal.client.thin.TcpClientChannel$ByteCountingDataInput.read(TcpClientChannel.java:535)
... 36 more

Your network or NAT layer is probably configured to reset connections that are idle, or it resets them sporadically.
In that case, the client has to reconnect.
Another thing to check: are you sure you are connecting to the thin client port and not some other port?
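The "you will have to reconnect" advice can be sketched as a small retry wrapper. This is only an illustration, not part of the Ignite API: the class name RetryOnConnectionReset, the method withRetry and the back-off values are hypothetical, and it assumes the thin client is able to open a fresh connection on a later call. Retrying is only safe for operations that can be repeated without side effects, such as a put of the same key and value.

```java
import java.util.function.Supplier;

public final class RetryOnConnectionReset {
    private RetryOnConnectionReset() {}

    /**
     * Runs op and retries it when it throws an exception of type retryOn.
     * Any other exception is rethrown immediately. After maxAttempts failed
     * attempts the last retryable exception is rethrown.
     */
    public static <T> T withRetry(Supplier<T> op,
                                  Class<? extends RuntimeException> retryOn,
                                  int maxAttempts,
                                  long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                if (!retryOn.isInstance(e)) {
                    throw e; // not a connection problem: do not retry
                }
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    // Linear back-off: wait a little longer after each failure.
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread was interrupted
                }
            }
        }
        throw last;
    }
}
```

With the snippet from the question, a call could look like this (3 attempts and 100 ms are arbitrary values):
RetryOnConnectionReset.withRetry(() -> { client.getOrCreateCache("ABC_CACHE").put(key, val); return null; }, ClientConnectionException.class, 3, 100L);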

Related

Redis client Lettuce command timeout versus socket timeout

We have defined Lettuce client connection factory to be able to connect to Redis defining custom socket and command timeout:
@Bean
LettuceConnectionFactory lettuceConnectionFactory() {
    final SocketOptions socketOptions = SocketOptions.builder().connectTimeout(socketTimeout).build();
    final ClientOptions clientOptions =
            ClientOptions.builder().socketOptions(socketOptions).build();
    LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .commandTimeout(redisCommandTimeout)
            .clientOptions(clientOptions).build();
    RedisStandaloneConfiguration serverConfig = new RedisStandaloneConfiguration(redisHost,
            redisPort);
    final LettuceConnectionFactory lettuceConnectionFactory = new LettuceConnectionFactory(serverConfig,
            clientConfig);
    lettuceConnectionFactory.setValidateConnection(true);
    // Return the configured instance; building a second factory here would discard setValidateConnection(true).
    return lettuceConnectionFactory;
}
The Lettuce documentation defines these default values:
Default socket timeout is 10 seconds
Default command timeout is 60 seconds
If the Redis service is down, the application must receive a timeout within 300 ms. Which of the two timeouts should be given the larger value?
Github example project:
https://github.com/cristianprofile/spring-data-redis-lettuce
In the socket options you specify the connect timeout. This is the maximum time the Redis client (Lettuce) is allowed to spend establishing a TCP/IP connection to a Redis server. This value should be relatively small (e.g. up to 1 minute).
If the client cannot establish a connection to the server within 1 minute, I guess it's safe to say the server is not available (the server is down, the address or port is wrong, network security such as a firewall blocks the connection, etc.).
The command timeout is completely different. Once a connection is established, the client can send commands to the server, and it expects the server to respond to those commands. This timeout configures how long the client waits for the server's response to a command.
I think this timeout can be set to a bigger value (e.g. a few minutes) in case a command sends a lot of data to the server and it takes time to transfer and store that much data.
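The difference between the two timeouts can be illustrated without Lettuce, using plain JDK sockets. This is an analogy under stated assumptions, not how Lettuce is implemented: the class TimeoutKinds, the method probe and the Result enum are hypothetical names, and the socket read timeout (SO_TIMEOUT) merely stands in for a command timeout. The first timeout bounds how long the TCP connection attempt may take; the second bounds how long we wait for a reply once connected.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public final class TimeoutKinds {
    private TimeoutKinds() {}

    public enum Result { CONNECT_FAILED, NO_RESPONSE, RESPONDED }

    /**
     * Connects to host:port, sends one line and waits for the first reply byte.
     * connectTimeoutMs limits the TCP connect; responseTimeoutMs limits the
     * wait for a reply. Both should be positive (0 would mean "wait forever").
     */
    public static Result probe(String host, int port,
                               int connectTimeoutMs, int responseTimeoutMs) {
        try (Socket socket = new Socket()) {
            try {
                // Phase 1: establish the connection (refused, unreachable or timed out => failure).
                socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            } catch (IOException e) {
                return Result.CONNECT_FAILED;
            }
            // Phase 2: the connection exists; now bound the wait for an answer.
            socket.setSoTimeout(responseTimeoutMs);
            socket.getOutputStream().write("PING\r\n".getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            return socket.getInputStream().read() >= 0 ? Result.RESPONDED : Result.NO_RESPONSE;
        } catch (IOException e) {
            // Includes SocketTimeoutException: connected, but no reply in time.
            return Result.NO_RESPONSE;
        }
    }
}
```

A server that is down typically shows up in the first phase, while a server that accepts connections but hangs shows up in the second. If a failure has to surface within a fixed budget such as 300 ms, it seems reasonable that neither timeout should exceed that budget.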

ActiveMQ DB persistent does not reconnect when HA cluster DB failover occurs

Whenever an HA failover occurs, the ActiveMQ broker gets stuck with the messages it holds and cannot restart by itself.
Messages are processed successfully once we restart ActiveMQ.
The following bean is configured to stop and restart the connectors in case of IOExceptions.
<bean id="ioExceptionHandler" class="org.apache.activemq.util.DefaultIOExceptionHandler">
    <property name="ignoreSQLExceptions" value="false"></property>
    <property name="stopStartConnectors" value="true"></property>
</bean>
When the failover occurs, we get "connection is closed" exceptions, as shown below:
Initiating stop/restart of transports on BrokerService[localhost] due to IO exception, java.io.IOException: The connection is closed. | org.apache.activemq.util.DefaultIOExceptionHandler | ActiveMQ Transport: tcp:///hostname:52272#8501
java.io.IOException: The connection is closed.
It then tries to restart the transport connectors as a result of this configuration, but it is not able to proceed any further.
INFO | waiting for broker persistence adapter checkpoint to succeed before restarting transports | org.apache.activemq.util.DefaultIOExceptionHandler | IOExceptionHandler: restart transports.
Please let us know whether any configuration is required for the broker to restart and process the messages it holds.

Aerospike heartbeat configuration for single server, error "Unable to find any suitable network device for node ID"

I want to run Aerospike server in single-server mode.
Now I have this configuration:
service {
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    service-threads 4
    transaction-queues 4
    transaction-threads-per-queue 4
    proto-fd-max 15000
}
logging {
    console {
        context any info
    }
}
network {
    service {
        address 127.0.0.1
        port 3000
    }
    heartbeat {
        mode multicast
        multicast-group 239.1.99.222
        port 9918
        # To use unicast-mesh heartbeats, remove the 3 lines above, and see
        # aerospike_mesh.conf for alternative.
        interval 150
        timeout 10
    }
    fabric {
        port 3001
    }
    info {
        port 3003
    }
}
namespace test {
    replication-factor 1
    memory-size 20M
    default-ttl 1d # 1 day, use 0 to never expire/evict.
    storage-engine memory
}
When I try to start the server, I get this error in the log:
"Unable to find any suitable network device for node ID"
I don't want the server to be reachable from the internet.
How can I achieve this and fix the issue?
The node ID is generated from the MAC address of a network interface on the host.
https://github.com/aerospike/aerospike-server/blob/master/cf/src/socket.c#L2470
If the host has none of the default interface names that Aerospike is aware of, you might get this error.
To fix this problem, you can specify your interface name.
http://www.aerospike.com/docs/operations/troubleshoot/startup#problem-with-network-interface
To avoid exposing your Aerospike node on the internet, you can bind it only to localhost or to a private interface, or use other network tools or devices, such as a firewall or ACL, to keep the server port from being exposed. The best way is to ensure that the server hosting Aerospike is itself not exposed to the internet. If that is not doable, use a firewall to restrict access to the Aerospike port to the IPs of your Aerospike clients. You can also use the database credentials available in the Enterprise Edition.
http://www.aerospike.com/docs/guide/security.html

ActiveMQ Master/Slave on Weblogic - vm transport issue

I am trying to configure ActiveMQ master/slave setup on a single WebLogic machine. The problem is when I start Managed Server1 it successfully connects to vm transport and everything works perfectly, but when I start Managed Server2 I am receiving the following errors in broker logs
INFO 2016-September-27 10:08:00,227 ActiveMQEndpointWorker:124 - Connection attempt already in progress, ignoring connection exception
INFO 2016-September-27 10:08:01,161 TransportConnector:260 - Connector vm://localhost started
INFO 2016-September-27 10:08:30,228 TransportConnector:291 - Connector vm://localhost stopped
INFO 2016-September-27 10:08:30,229 TransportConnector:260 - Connector vm://localhost started
WARN 2016-September-27 10:08:30,228 ActiveMQManagedConnection:385 - Connection failed: javax.jms.JMSException: peer (vm://localhost#61) stopped.
WARN 2016-September-27 10:08:30,231 TransportConnection:823 - Failed to add Connection ID:ndl-wls-300.mydomain.com-52251-1474966937425-65:1 due to java.lang.NullPointerException
ERROR 2016-September-27 10:08:30,233 ActiveMQEndpointWorker:183 - Failed to connect to broker [vm://localhost?create=false]: java.lang.NullPointerException
javax.jms.JMSException: java.lang.NullPointerException
Please help, I am stuck with this.
I still don't see the reason for the slave within the same VM. I suggest you reach out to an ActiveMQ expert consultant to validate your architecture.
However, I think I can help you move a little bit closer to this issue:
There is a fundamental misunderstanding here. The vm URL is broken down like this:
vm://${brokerName}?option=value,etc
The first time you use vm://localhost?create=true, you create a broker.
The second time you reference vm://localhost?create=false, you create a client connection to that first broker.
To get two brokers, you'd need two different broker names, each with vm://${brokerName}?create=true
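The breakdown of the vm URL can be made concrete with a small sketch that only uses java.net.URI. It does not touch ActiveMQ at all; the class VmUriParts and its methods are hypothetical names, and it assumes options in the query string are separated by "&". It shows that the part after vm:// is the broker name, so two URIs with the same name refer to the same broker regardless of the create option.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public final class VmUriParts {
    private VmUriParts() {}

    /** Returns the broker name, i.e. the authority part of vm://brokerName?... */
    public static String brokerName(String vmUri) {
        URI uri = URI.create(vmUri);
        if (!"vm".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not a vm:// URI: " + vmUri);
        }
        return uri.getAuthority();
    }

    /** Returns the query options (for example create=false) as an ordered map. */
    public static Map<String, String> options(String vmUri) {
        Map<String, String> result = new LinkedHashMap<>();
        String query = URI.create(vmUri).getQuery();
        if (query == null || query.isEmpty()) {
            return result;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                result.put(pair, "");
            } else {
                result.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return result;
    }

    /** True when both URIs name the same in-VM broker. */
    public static boolean sameBroker(String first, String second) {
        return brokerName(first).equals(brokerName(second));
    }
}
```

Here vm://localhost?create=true and vm://localhost?create=false name the same broker ("localhost"), whereas vm://broker1?create=true and vm://broker2?create=true (made-up names) would name two distinct brokers.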

Apache http core nio 4.3.3 reverse proxy SSL error

I am developing a reverse proxy using HttpCore NIO 4.3.3 and need to connect to a secure (HTTPS) endpoint via the proxy. I took the reverse proxy example (Asynchronous HTTP reverse proxy) [1] and added SSL support as shown below.
SSLContext clientSSLContext =
SSLUtil.createClientSSLContext(TRUST_STORE_LOCATION,
TRUST_STORE_PASSWORD);
final IOEventDispatch connectingEventDispatch =
new DefaultHttpClientIODispatch(
clientHandler,
clientSSLContext,
ConnectionConfig.DEFAULT);
...
connectingIOReactor.execute(connectingEventDispatch);
When I send the request, I am getting this error,
java.io.IOException: SSL not supported
The Stack trace is given below.
[client<-proxy] 00000001 java.io.IOException: SSL not supported
java.io.IOException: SSL not supported
at org.apache.http.impl.nio.pool.BasicNIOConnFactory.create(BasicNIOConnFactory.java:159)
at org.apache.http.impl.nio.pool.BasicNIOConnFactory.create(BasicNIOConnFactory.java:1)
at org.apache.http.nio.pool.AbstractNIOConnPool.requestCompleted(AbstractNIOConnPool.java:484)
at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.completed(AbstractNIOConnPool.java:770)
at org.apache.http.impl.nio.reactor.SessionRequestImpl.completed(SessionRequestImpl.java:127)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:423)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:288)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
at java.lang.Thread.run(Thread.java:662)
I enabled SSL debug logs as well, but still could not figure out the issue.
Then I debugged this and found that the proxy receives the request from the client and drops it because of an exception inside the handle method of HttpAsyncRequestConsumer. The exception is java.io.IOException: SSL not supported
Also note that the SSLContext was working fine with a reverse proxy written using the netty transport.
Any help would be appreciated.
[1] https://hc.apache.org/httpcomponents-core-ga/examples.html
Regards,
Ravindra.
When using a connection pool on the client side to manage outgoing connections one needs to ensure that the connection factory used by the pool to create new connection objects is SSL capable. Please make sure that the connection pool is properly configured.
Thanks a lot for the advice. That solved the issue.
clientSSLContext =
SSLUtil.createClientSSLContext(TRUST_STORE_LOCATION,
TRUST_STORE_PASSWORD);
BasicNIOConnFactory connectionFactory =
new BasicNIOConnFactory(
clientSSLContext,
null,
ConnectionConfig.DEFAULT);
proxyConnPool = new ProxyConnPool(connectingIOReactor, connectionFactory, 5000);