We have started using Apache Ignite and we are using TCP-communication. What we are seeing is that the clients are opening a port for communication just like the server.
My first assumption was that we don't need to open up from the server to the client, everything seemed to be working fine. However, in some cases when the topology is changing we got stack traces in the logs that indicates that the server is initiating communication with the client on this port and fails.
My question is why is the server trying to communicate directly with the client? Do we need to let the servers communicate with the client or can we simply ignore the error messages?
Below is an example of the stack trace:
2016-07-04 16:02:32,298 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: ad8937b4-eb38-442a-8e06-9625c6246d7b
org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true], topic=T4 [topic=TOPIC_CACHE, id1=ee261127-933b-36b7-b4ef-f5be9bb4bff2, id2=ad8937b4-eb38-442a-8e06-9625c6246d7b, id3=0], msg=GridContinuousMessage [type=MSG_EVT_NOTIFICATION, routineId=7107ffc5-9868-422f-8509-4739558869f7, data=null, futId=null], policy=2]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1290)
at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1508)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1229)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1200)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1182)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:843)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:802)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:787)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$700(CacheContinuousQueryHandler.java:91)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$1.onEntryUpdated(CacheContinuousQueryHandler.java:412)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:343)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2522)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2246)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1644)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1484)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:2940)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:129)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:260)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:258)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:244)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:81)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:203)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:847)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:105)
at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:810)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1993)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1933)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1285)
... 30 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and GridCacheTransaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[/xxx.xx.x.xxx:47100, /xxx.xx.x.xxx:47100]]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2496)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2137)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2031)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1967)
... 32 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
2016-07-04 16:02:34,923 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: 95d9812d-4a16-4589-93a8-0bf2aa6b8413
Client nodes are different from server nodes mostly by the fact that they don't hold cache data and don't execute computations.
Other than that, client nodes are first-class cluster citizens and participate in communications the same way as servers do. So yes, they need to accept connections.
See https://apacheignite.readme.io/docs/clients-vs-servers
Related
I'm using Apache ignite without any integration with database. I'm getting following exception on my windows machine. After investigating, found that ports for which I'm getting error are used by ODBC driver.
https://apacheignite.readme.io/v1.7/docs/connecting-string
I don't know if its required by Ignite but if now can we disable ODBC/JDBC driver loading, so that it doesn't need those ports.
org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1741)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:987)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:671)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:596)
at org.apache.ignite.Ignition.start(Ignition.java:327)
at framework.cache.CacheManager.initialize(CacheManager.java:129)
Caused by: org.apache.ignite.IgniteCheckedException: Failed to start client connector processor.
at org.apache.ignite.internal.processors.odbc.ClientListenerProcessor.start(ClientListenerProcessor.java:175)
at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1738)
... 10 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Failed to bind to any [host:port] from the range [host=null, portFrom=10800, portTo=10900, lastErr=class org.apache.ignite.IgniteCheckedException: Failed to initialize NIO selector.]
at org.apache.ignite.internal.processors.odbc.ClientListenerProcessor.start(ClientListenerProcessor.java:171)
... 11 common frames omitted
To prevent Ignite from binding to JDBC/ODBC ports, you should set IgniteConfiguration#clientConnectorConfiguration to null.
If you just set odbcEnabled and jdbcEnabled to false, then Ignite will still bind to this port, but JDBC and ODBC connections won't be processed.
I'm able connect IBM Queue directly but when u tried to connect from mule getting the below error and not able to deploy. I'm getting the below error
ERROR 2017-04-25 06:45:13,582
[main]org.mule.retry.notifiers.ConnectNotifier: Failed to connect/reconnect:
WebSphereMQConnector
{
name=WMQ2
lifecycle=initialise
this=5e7abaf7
numberOfConcurrentTransactedReceivers=4
createMultipleTransactedReceivers=true
connected=false
supportedProtocols=[wmq]
serviceOverrides=<none>
}
. Root Exception was: Connection timed out: connect. Type: class java.net.ConnectException
ERROR 2017-04-25 06:50:23,943 [main] org.mule.module.launcher.application.DefaultMuleApplication:
************************************************
Message : JMSWMQ0018: Failed to connect to queue manager 'RQACBRKB' with connection mode 'Client' and host name '172.11.11.11(6912)'.
JMS Code : JMSWMQ0018
Element : /WMQ2 # app:config.xml:14 (WMQ)
--------------------------------------------------------------------------------
Root Exception stack trace:
java.net.ConnectException: Connection timed out: connect
at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;AMQ9213: A communications error for occurred [1=java.net.ConnectException[Connection timed out: connect],3=rbitbrka.apl.com] at com.ibm.mq.jmqi.remote.impl.RemoteTCPConnection.connnectUsingLocalAddress(RemoteTCPConnection.java:810) ~[?:?]
PFB connector details:
<wmq:connector name="WMQ5" hostName="${mq.host}" port="${mq.port}" queueManager="${mq.queue.manager}" channel="CLIENTS.SALES.CRM" username="${mq.user}" password="${mq.password}" transportType="CLIENT_MQ_TCPIP" specification="1.1" targetClient="JMS_COMPLIANT" validateConnections="false" doc:name="WMQ" maxRedelivery="-1">
<reconnect frequency="${mq.reconnection.period.ms}" count="${mq.reconnection.attempt}"/>
</wmq:connector>
When i telnet the ip and port getting below error:
C:\Users\111>telnet 172.11.11.11 6912
Connecting To 172.11.11.11...Could not open connection to the host, on port 6912: Connect failed
But when i ping getting responce
C:\Users\111>ping 172.11.11.11
The pertinent pieces of information from your provided error are:-
JMSWMQ0018: Failed to connect to queue manager 'RQACBRKB'
with connection mode 'Client' and host name '172.11.11.11(6912)'.
com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;
MQRC 2538 is MQRC_HOST_NOT_AVAILABLE which is explained in Knowledge Center. In there it mentions the most common reasons for this error:-
The listener has not been started on the remote system. (please check that your listener is running on port 6912 on the machine at IP address 172.11.11.11)
The connection name in the client channel definition is incorrect. (the connection name your client is using is '172.11.11.11(6912)' - is this correct?)
The network is currently unavailable.
A firewall blocking the port, or protocol-specific traffic.
The security call initializing the IBM MQ client is blocked by a security exit on the SVRCONN channel at the server.
"java.net.ConnectException: Connection timed out: connect" usually occurs when you have a config issue or you are unable to connect to the remote server. As mentioned above do you have an error on the MQ end and if not have you checked the connection properties within config. If these are correct, are you able to reach MQ from another client, such as SOAPUI?
Can you also post the connector and flow details?
I am trying to get drill running on a 3 node cluster made up of ec2 instances. I configured
the drill-override.conf file so that it can connect to my zookeeper cluster. and then started
drillbit.sh on all three nodes, checked the status, and they were running. I try to open drill
using drill-conf and the first time it works and I ensure it can see all three drillbits
by using:
SELECT * FROM sys.drillbits;
and it shows all three drillbits properly. However, I reboot the cluster and try to retry to process
and this is the error that I get:
[ec2-user#ip-<private IP> bin]$ ./drill-conf
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.ConnectTimeoutException: connection timed out: <private DNS>/<private IP>:31010 (state=,code=0)
java.sql.SQLException: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.ConnectTimeoutException: connection timed out: <private DNS>/<private IP>:31010
at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:159)
at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
at net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
at sqlline.Commands.connect(Commands.java:1083)
at sqlline.Commands.connect(Commands.java:1015)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
at sqlline.SqlLine.dispatch(SqlLine.java:742)
at sqlline.SqlLine.initArgs(SqlLine.java:528)
at sqlline.SqlLine.begin(SqlLine.java:596)
at sqlline.SqlLine.start(SqlLine.java:375)
at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.exec.rpc.RpcException: CONNECTION : io.netty.channel.ConnectTimeoutException: connection timed out: <private DNS>/<private IP>:31010
at org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:448)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:237)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:200)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe$1.run(AbstractEpollStreamChannel.java:460)
at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: io.netty.channel.ConnectTimeoutException: connection timed out: <private DNS>/<private IP>:31010
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:47)
at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:213)
... 12 more
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: ip-172-31-24-19.us-west-2.compute.internal/172.31.24.19:31010
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe$1.run(AbstractEpollStreamChannel.java:458)
... 6 more
apache drill 1.6.0
"the only truly happy people are children, the creative minority and drill users"
0: jdbc:drill:>
I've looked this up in a few places and tried out their solutions one of which was making sure
/etc/hosts is configured properly and that didn't fix it. Is there anything else I can try to solve the
issue?
Found the solution:
Had to make sure my security group for the ec2 instaqnces were configured properly by letting in the correct traffic to port 31010.
When ever I start the infinispan server 8 in domain mode, I am getting the below exception .I am not sure what is going wrong.
ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.datagrid-infinispan-endpoint.memcached.memcached-connector: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan-endpoint.memcached.memcached-connector: DGENDPT10004: Failed to start MemcachedServer
Caused by: java.lang.IllegalStateException: failed to create a child event loop
... 5 more
Caused by: io.netty.channel.ChannelException: failed to open a new selector
Caused by: java.io.IOException: Unable to establish loopback connection
... 24 more
Caused by: java.io.IOException: An existing connection was forcibly closed y the remote host
... 32 more
2016-05-11 20:01:55,600 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-8) MSC000001: Failed to start service jboss.datagrid-infinispan-endpoint.websocket.websocket-connector: org.jboss.msc.service.StartException in service jboss.datagrid-infinispan-endpoint.websocket.websocket-connector: GENDPT10004: Failed to start WebSocketServer
caused by: java.lang.IllegalStateException: failed to create a child event loop ... 5 more
Your logs contain Unable to establish loopback connection which suggests that there is something wrong with your OS configuration.
The similar issue can be found here: Failed to initialize monitor Thread: Unable to establish loopback connection
Hadoop is successfully running in distributed mode.
Getting following error while starting HBase in distributed mode.
Tried everything in hbase-site.xml configuration. No idea how to proceed with the problem?
014-03-10 13:55:42,493 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-112-11-1-111.ec2.internal/112.11.1.111:2181.
Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-03-10 13:55:42,494 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-03-10 13:55:42,594 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
2014-03-10 13:55:42,594 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
2014-03-10 13:55:42,595 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2104)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:152)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2118)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1109)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1099)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1083)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:162)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:155)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:345)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2099)
Make sure that ZooKeeper is provisioned and running expectedly.
Check zoo.cfg and /etc/hosts to make sure that all zookeeper servers are reachable by the HBase master.