Hive Metastore read timeout

We are using Hive 2.3.3 along with Hadoop 2.7.7 and Spark 2.4.4.
We are using MariaDB as the backend database for the Metastore.
The startup of the Metastore service is fine, and I am able to access the Hive CLI and run queries. I am not starting HS2, to try to isolate the issue.
But after a while, all of a sudden, the Hive Metastore service gets stuck and stops responding, even to simple queries like show databases or show tables.
All the tables we run queries against are empty (no partitions have been created yet), and this environment has been newly set up.
Error from hive.log:
2020-10-09T18:43:56,971 DEBUG [IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser] ipc.Client: IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser: closed
2020-10-09T18:43:56,971 DEBUG [IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser] ipc.Client: IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser: stopped, remaining connections 0
2020-10-09T18:53:53,769 WARN [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 5s. getAllFunctions
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_all_functions(ThriftHiveMetastore.java:3812) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_functions(ThriftHiveMetastore.java:3800) ~[hive-exec-2.3.3.jar:2.3.3]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllFunctions(HiveMetaStoreClient.java:2393) ~[hive-exec-2.3.3.jar:2.3.3]
....
2020-10-09T18:53:58,777 INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Closed a connection to metastore, current connections: 0
2020-10-09T18:53:58,777 INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Trying to connect to metastore with URI thrift://master:9083
2020-10-09T18:53:58,779 INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Opened a connection to metastore, current connections: 1
2020-10-09T18:53:58,805 INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Connected to metastore.
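For reference, the reconnect behaviour visible above ("Attempting to reconnect (1 of 1) after 5s" and the read timeout) appears to correspond to the client-side metastore settings in hive-site.xml. A minimal sketch of those knobs, with illustrative values only (not a recommendation):
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <!-- how long a metastore Thrift call may block before a SocketTimeoutException -->
  <value>1800s</value>
</property>
<property>
  <name>hive.metastore.failure.retries</name>
  <!-- number of reconnect attempts, i.e. the "(1 of 1)" in the log -->
  <value>3</value>
</property>
<property>
  <name>hive.metastore.client.connect.retry.delay</name>
  <!-- delay between reconnect attempts, i.e. the "after 5s" in the log -->
  <value>5s</value>
</property>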

Related

server.HiveServer2: Error starting priviledge synchonizer

Hive version 3.1.2.
Hadoop components (HDFS/YARN/job history server) with Kerberos authentication.
Hive Kerberos config:
hive.server2.authentication=KERBEROS
hive.server2.authentication.kerberos.principal=hiveserver2/_HOST@BDP.COM
hive.server2.authentication.kerberos.keytab=/etc/kerberos/hadoop/hiveserver2.bdp-05.keytab
hive.metastore.sasl.enabled=true
hive.metastore.kerberos.keytab.file=/etc/kerberos/hadoop/metastore.bdp-05.keytab
hive.metastore.kerberos.principal=metastore/_HOST@BDP.COM
First, start the Metastore:
./bin/hive --service metastore > /dev/null &
Nothing abnormal in the log.
Then start HiveServer2:
./bin/hive --service hiveserver2 > /dev/null &
Here are the startup logs:
2020-12-30T11:28:48,746 INFO [main] server.HiveServer2: Starting HiveServer2
2020-12-30T11:28:49,168 INFO [main] security.UserGroupInformation: Login successful for user hiveserver2/bigdata-server-05#BDP.COM using keytab file /etc/kerberos/hadoop/hiveserver2.bdp-05.keytab
2020-12-30T11:28:49,171 INFO [main] cli.CLIService: SPNego httpUGI not created, spNegoPrincipal: , ketabFile:
2020-12-30T11:28:49,187 INFO [main] SessionState: Hive Session ID = 0754b9bc-f2f9-4d4c-ab95-a7359764bc49
2020-12-30T11:28:50,052 INFO [main] session.SessionState: Created HDFS directory: /tmp/hive/hiveserver2/0754b9bc-f2f9-4d4c-ab95-a7359764bc49
2020-12-30T11:28:50,066 INFO [main] session.SessionState: Created local directory: /tmp/hive/0754b9bc-f2f9-4d4c-ab95-a7359764bc49
2020-12-30T11:28:50,069 INFO [main] session.SessionState: Created HDFS directory: /tmp/hive/hiveserver2/0754b9bc-f2f9-4d4c-ab95-a7359764bc49/_tmp_space.db
2020-12-30T11:28:50,600 INFO [main] metastore.HiveMetaStoreClient: Trying to connect to metastore with URI thrift://bigdata-server-05:9083
2020-12-30T11:28:50,605 INFO [main] metastore.HiveMetaStoreClient: HMSC::open(): Could not find delegation token. Creating KERBEROS-based thrift connection.
2020-12-30T11:28:50,653 INFO [main] metastore.HiveMetaStoreClient: Opened a connection to metastore, current connections: 1
2020-12-30T11:28:50,653 INFO [main] metastore.HiveMetaStoreClient: Connected to metastore.
2020-12-30T11:28:50,654 INFO [main] metastore.RetryingMetaStoreClient: RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hiveserver2/bigdata-server-05#BDP.COM (auth:KERBEROS) retries=1 delay=1 lifetime=0
2020-12-30T11:28:50,781 INFO [main] service.CompositeService: Operation log root directory is created: /tmp/hive/operation_logs
2020-12-30T11:28:50,783 INFO [main] service.CompositeService: HiveServer2: Background operation thread pool size: 100
2020-12-30T11:28:50,783 INFO [main] service.CompositeService: HiveServer2: Background operation thread wait queue size: 100
2020-12-30T11:28:50,783 INFO [main] service.CompositeService: HiveServer2: Background operation thread keepalive time: 10 seconds
2020-12-30T11:28:50,784 INFO [main] service.CompositeService: Connections limit are user: 0 ipaddress: 0 user-ipaddress: 0
2020-12-30T11:28:50,787 INFO [main] service.AbstractService: Service:OperationManager is inited.
2020-12-30T11:28:50,787 INFO [main] service.AbstractService: Service:SessionManager is inited.
2020-12-30T11:28:50,787 INFO [main] service.AbstractService: Service:CLIService is inited.
2020-12-30T11:28:50,787 INFO [main] service.AbstractService: Service:ThriftBinaryCLIService is inited.
2020-12-30T11:28:50,787 INFO [main] service.AbstractService: Service:HiveServer2 is inited.
2020-12-30T11:28:50,835 INFO [pool-7-thread-1] SessionState: Hive Session ID = 693b0399-aabd-42b5-a4b2-a4cebbd325d4
2020-12-30T11:28:50,838 INFO [main] results.QueryResultsCache: Initializing query results cache at /tmp/hive/_resultscache_
2020-12-30T11:28:50,844 INFO [pool-7-thread-1] session.SessionState: Created HDFS directory: /tmp/hive/hiveserver2/693b0399-aabd-42b5-a4b2-a4cebbd325d4
2020-12-30T11:28:50,844 INFO [main] results.QueryResultsCache: Query results cache: cacheDirectory /tmp/hive/_resultscache_/results-23ae949b-6894-4a17-8141-0eacf5fe5a63, maxCacheSize 2147483648, maxEntrySize 10485760, maxEntryLifetime 3600000
2020-12-30T11:28:50,846 INFO [pool-7-thread-1] session.SessionState: Created local directory: /tmp/hive/693b0399-aabd-42b5-a4b2-a4cebbd325d4
2020-12-30T11:28:50,849 INFO [pool-7-thread-1] session.SessionState: Created HDFS directory: /tmp/hive/hiveserver2/693b0399-aabd-42b5-a4b2-a4cebbd325d4/_tmp_space.db
2020-12-30T11:28:50,861 INFO [main] events.NotificationEventPoll: Initializing lastCheckedEventId to 0
2020-12-30T11:28:50,862 INFO [main] server.HiveServer2: Starting Web UI on port 10002
2020-12-30T11:28:50,885 INFO [pool-7-thread-1] metadata.HiveMaterializedViewsRegistry: Materialized views registry has been initialized
2020-12-30T11:28:50,894 INFO [main] util.log: Logging initialized #4380ms
2020-12-30T11:28:51,009 INFO [main] service.AbstractService: Service:OperationManager is started.
2020-12-30T11:28:51,009 INFO [main] service.AbstractService: Service:SessionManager is started.
2020-12-30T11:28:51,010 INFO [main] service.AbstractService: Service:CLIService is started.
2020-12-30T11:28:51,010 INFO [main] service.AbstractService: Service:ThriftBinaryCLIService is started.
2020-12-30T11:28:51,013 WARN [main] security.HadoopThriftAuthBridge: Client-facing principal not set. Using server-side setting: hiveserver2/_HOST#BDP.COM
2020-12-30T11:28:51,013 INFO [main] security.HadoopThriftAuthBridge: Logging in via CLIENT based principal
2020-12-30T11:28:51,019 INFO [main] security.UserGroupInformation: Login successful for user hiveserver2/bigdata-server-05#BDP.COM using keytab file /etc/kerberos/hadoop/hiveserver2.bdp-05.keytab
2020-12-30T11:28:51,019 INFO [main] security.HadoopThriftAuthBridge: Logging in via SERVER based principal
2020-12-30T11:28:51,023 INFO [main] security.UserGroupInformation: Login successful for user hiveserver2/bigdata-server-05#BDP.COM using keytab file /etc/kerberos/hadoop/hiveserver2.bdp-05.keytab
2020-12-30T11:28:51,030 INFO [main] delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2020-12-30T11:28:51,033 INFO [main] security.TokenStoreDelegationTokenSecretManager: New master key with key id=0
2020-12-30T11:28:51,034 INFO [Thread[Thread-8,5,main]] security.TokenStoreDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2020-12-30T11:28:51,035 INFO [Thread[Thread-8,5,main]] delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2020-12-30T11:28:51,035 INFO [Thread[Thread-8,5,main]] security.TokenStoreDelegationTokenSecretManager: New master key with key id=1
2020-12-30T11:28:51,040 INFO [main] thrift.ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
2020-12-30T11:28:51,040 INFO [main] service.AbstractService: Service:HiveServer2 is started.
2020-12-30T11:28:51,041 ERROR [main] server.HiveServer2: Error starting priviledge synchonizer:
java.lang.NullPointerException: null
at org.apache.hive.service.server.HiveServer2.startPrivilegeSynchonizer(HiveServer2.java:985) ~[hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:726) [hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1037) [hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140) [hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305) [hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149) [hive-service-3.1.2.jar:3.1.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_271]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_271]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_271]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_271]
at org.apache.hadoop.util.RunJar.run(RunJar.java:318) [hadoop-common-3.1.3.jar:?]
at org.apache.hadoop.util.RunJar.main(RunJar.java:232) [hadoop-common-3.1.3.jar:?]
2020-12-30T11:28:51,044 INFO [main] server.HiveServer2: Shutting down HiveServer2
In my case, hiveserver2-site.xml had been created by Apache Ranger when the ranger-hive-plugin was turned on. Once I disabled the ranger-hive-plugin, hiveserver2-site.xml was edited (not removed) by Ranger.
Here is the remaining configuration:
<configuration>
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
</property>
<property>
<name>hive.security.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
</property>
<property>
<name>hive.conf.restricted.list</name>
<value>hive.security.authorization.enabled,hive.security.authorization.manager,hive.security.authenticator.manager</value>
</property>
</configuration>
Starting HiveServer2 with this file in place hits the exception above.
Removing hiveserver2-site.xml makes it start fine.
I don't know why. Can somebody explain?
Is this still relevant? If yes, check the logs: you should see that HiveServer2 tries to connect to ZooKeeper. If no quorum is configured, it will try to connect to localhost:2181, so either ZooKeeper must be running there or the ZooKeeper quorum servers must be configured.
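If that is the situation here, a minimal hive-site.xml sketch for pointing HiveServer2 at an existing ZooKeeper ensemble could look like the following (hostnames are placeholders):
<property>
  <name>hive.zookeeper.quorum</name>
  <!-- comma-separated list of ZooKeeper servers -->
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hive.zookeeper.client.port</name>
  <value>2181</value>
</property>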

Linux IntelliJ IDEA Spark: Error initializing SparkContext

I tried to run a Spark 1.6.0 (spark-1.6.0-bin-hadoop2.6) program in local mode using IntelliJ IDEA. It fails with the error below. (The Chinese text in the stack trace means "Cannot assign requested address".)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/09/17 16:18:25 INFO SparkContext: Running Spark version 1.6.0
16/09/17 16:18:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/17 16:18:25 INFO SecurityManager: Changing view acls to: ron
16/09/17 16:18:25 INFO SecurityManager: Changing modify acls to: ron
16/09/17 16:18:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ron); users with modify permissions: Set(ron)
16/09/17 16:18:26 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
16/09/17 16:18:26 ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: 无法指定被请求的地址: Service 'sparkDriver' failed after 16 retries!
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
16/09/17 16:18:26 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.net.BindException: 无法指定被请求的地址: Service 'sparkDriver' failed after 16 retries!
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
1. Get your hostname using the "hostname" command.
2. If it is not already present, add an entry for your hostname to the /etc/hosts file, as follows (see the sketch below):
127.0.0.1 your_hostname
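A short shell sketch of those two steps (the hostname shown is only an example):
# Step 1: find the machine's hostname
hostname
# suppose it prints: ron-laptop

# Step 2: map that hostname to the loopback address if /etc/hosts does not already do so
echo "127.0.0.1  ron-laptop" | sudo tee -a /etc/hosts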

Why does the Ignite client open a port?

We have started using Apache Ignite with TCP communication. We are seeing that the clients open a communication port just like the servers do.
My first assumption was that we don't need to open connectivity from the server to the client, and everything seemed to be working fine. However, in some cases when the topology changes, we get stack traces in the logs indicating that the server is initiating communication with the client on this port and failing.
My question is: why is the server trying to communicate directly with the client? Do we need to allow the servers to communicate with the clients, or can we simply ignore the error messages?
Below is an example of the stack trace:
2016-07-04 16:02:32,298 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: ad8937b4-eb38-442a-8e06-9625c6246d7b
org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true], topic=T4 [topic=TOPIC_CACHE, id1=ee261127-933b-36b7-b4ef-f5be9bb4bff2, id2=ad8937b4-eb38-442a-8e06-9625c6246d7b, id3=0], msg=GridContinuousMessage [type=MSG_EVT_NOTIFICATION, routineId=7107ffc5-9868-422f-8509-4739558869f7, data=null, futId=null], policy=2]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1290)
at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1508)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1229)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1200)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1182)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:843)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:802)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:787)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$700(CacheContinuousQueryHandler.java:91)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$1.onEntryUpdated(CacheContinuousQueryHandler.java:412)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:343)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2522)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2246)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1644)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1484)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:2940)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:129)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:260)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:258)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:244)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:81)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:203)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:847)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$1700(GridIoManager.java:105)
at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:810)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[xxx.xx.x.xxx], sockAddrs=[/xxx.xx.x.xxx:0, /xxx.xx.x.xxx:0], discPort=0, order=51, intOrder=29, lastExchangeTime=1467640045240, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1993)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1933)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1285)
... 30 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and GridCacheTransaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=ad8937b4-eb38-442a-8e06-9625c6246d7b, addrs=[/xxx.xx.x.xxx:47100, /xxx.xx.x.xxx:47100]]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2496)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2137)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2031)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1967)
... 32 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
Suppressed: org.apache.ignite.IgniteCheckedException: Failed to connect to address: /xxx.xx.x.xxx:47100
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2501)
... 35 common frames omitted
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:2360)
... 35 common frames omitted
2016-07-04 16:02:34,923 ERROR [marshaller-cache-#67%PMCacheCluster%] [org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler] [NONE] - Failed to send event notification to node: 95d9812d-4a16-4589-93a8-0bf2aa6b8413
Client nodes differ from server nodes mainly in that they don't hold cache data and don't execute computations.
Other than that, client nodes are first-class cluster citizens and participate in communications the same way as servers do. So yes, they need to accept connections.
See https://apacheignite.readme.io/docs/clients-vs-servers
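If the firewall is the constraint, one option is to make the client's communication port predictable so a server-to-client rule can be opened for it. A rough Java sketch using the public TcpCommunicationSpi setters (the port values are illustrative, not taken from the question):
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class ClientNodeStart {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true); // start this node as a client

        // Pin the port this client listens on for TCP communication so that
        // server -> client traffic can be allowed through the firewall.
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
        commSpi.setLocalPort(47100);  // illustrative port
        commSpi.setLocalPortRange(1); // keep the fallback range small so the port stays predictable
        cfg.setCommunicationSpi(commSpi);

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Client node started: " + ignite.name());
    }
}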

JDBC client to Hive - No data or no sasl data in the stream Exception

We have a Kerberised cluster and I'm trying to run a Java action in Oozie where I make a JDBC connection to Hive. This JDBC connection works fine on the Sandbox without Kerberos.
The connection string is as simple as the following, where I'm providing username and password in it:
Connection con = DriverManager.getConnection("jdbc:hive2://W12345:10000/control;principal=hive/W12345.companynet.net@COMPANYNET.NET","user123","passw123");
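For context, here is a minimal, self-contained sketch of the kind of client code involved; the driver class name, the try-with-resources structure and the test query are assumptions beyond the single line above:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcCheck {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver, assumed to be on the classpath of the Oozie Java action
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://W12345:10000/control;principal=hive/W12345.companynet.net@COMPANYNET.NET";
        try (Connection con = DriverManager.getConnection(url, "user123", "passw123");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("show tables")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}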
The Oozie action (strangely) completes successfully, and the Java action log does not show any errors:
1742 [main] INFO org.apache.hive.jdbc.Utils - Supplied authorities: W12345:10000
1742 [main] INFO org.apache.hive.jdbc.Utils - Resolved authority: W12345:10000
1766 [main] INFO org.apache.hive.jdbc.HiveConnection - Will try to open client transport with JDBC Uri: jdbc:hive2://W12345:10000/control;principal=hive/W12345.companynet.net#COMPANYNET.NET
<<< Invocation of Main class completed <<<
Oozie Launcher ends
1785 [main] INFO org.apache.hadoop.mapred.Task - Task:attempt_1464245290012_0129_m_000000_0 is done. And is in the process of committing
1847 [main] INFO org.apache.hadoop.mapred.Task - Task attempt_1464245290012_0129_m_000000_0 is allowed to commit now
1854 [main] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_1464245290012_0129_m_000000_0' to hdfs://danskehadoop/user/user123/oozie-oozi/0000013-160527101253015-oozie-oozi-W/JavaAction--java/output/_temporary/1/task_1464245290012_0129_m_000000
1909 [main] INFO org.apache.hadoop.mapred.Task - Task 'attempt_1464245290012_0129_m_000000_0' done.
But in reality the Java main does not complete its execution correctly (and does not run the needed queries), because the JDBC connection fails with an exception that I can see only in the Hive log:
ERROR [HiveServer2-Handler-Pool: Thread-78363]: server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:328)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
... 10 more
I'm actually connected to the cluster, and I have already done a kinit for my username.
Does anybody know what could the cause of this exception be?
Thanks in advance for the help!
Antonio
This happened to me on the MapR Hadoop distribution.
In my case it was Keepalived checking the Hive port every 5 seconds and producing this error: I had simply used the "nc" command to check whether the Hive port was in use, with no authentication method. Later I switched to the "maprcli" command, which uses SASL authentication, and the error was gone.
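To illustrate the failure mode (a hypothetical probe along the lines described above): a plain TCP check opens the Thrift port without any SASL handshake, which HiveServer2 then logs as "No data or no sasl data in the stream":
# Hypothetical Keepalived-style port check: connects and closes immediately,
# so the SASL-enabled HiveServer2 logs the error above for every probe.
nc -z -w 2 W12345 10000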

HBase Master and Region servers could not be started

Hadoop is successfully running in distributed mode.
Getting the following error while starting HBase in distributed mode.
I have tried everything in the hbase-site.xml configuration. Any idea how to proceed with this problem?
2014-03-10 13:55:42,493 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-112-11-1-111.ec2.internal/112.11.1.111:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-03-10 13:55:42,494 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-03-10 13:55:42,594 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
2014-03-10 13:55:42,594 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
2014-03-10 13:55:42,595 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2104)
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:152)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2118)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1109)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1099)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:1083)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:162)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:155)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:345)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2099)
Make sure that ZooKeeper is provisioned and running as expected.
Check zoo.cfg and /etc/hosts to make sure that all ZooKeeper servers are reachable from the HBase master.
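On the HBase side, the distributed-mode and ZooKeeper settings usually come down to a few hbase-site.xml properties; a minimal sketch with placeholder hostnames for the ZooKeeper quorum:
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <!-- must list the hosts where ZooKeeper is actually running -->
  <value>zk1.ec2.internal,zk2.ec2.internal,zk3.ec2.internal</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>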