Error when trying to start HiveServer2: NullPointerException in ThriftBinaryCLIService - hive

When I start hiveserver2 with the following command:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
I receive the following error before the program exits:
2022-09-12T14:46:53,713 ERROR [Thrift Server] transport.TServerSocket: Could not set socket timeout.
java.net.SocketException: Socket is closed
at java.net.ServerSocket.setSoTimeout(ServerSocket.java:666) ~[?:1.8.0_292]
at org.apache.thrift.transport.TServerSocket.listen(TServerSocket.java:117) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:146) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:169) ~[hive-service-3.1.3.jar:3.1.3]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Hive Session ID = 56c28481-2b0c-4712-808d-ff7ccf31b543
Hive Session ID = 9771e219-095c-4524-b34a-b8e05c335fc0
2022-09-12T14:48:03,871 ERROR [Thrift Server] thrift.ThriftCLIService: Exception caught by ThriftBinaryCLIService. Exiting.
java.lang.NullPointerException: null
at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:169) ~[hive-service-3.1.3.jar:3.1.3]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Here is a brief explanation of my setup:
I am using vagrant and VirtualBox to create a "virtual" cluster.
This is very loosely based on this repository (it hasn't been updated in a while, so I have had to make many changes to get it to work): https://github.com/njvijay/vagrant-jilla-hadoop
I have created 5 nodes (1 name node and 4 data nodes). The name node also hosts YARN, Hive, Pig, Spark, MySQL, Python, etc.
I am using Ubuntu 14.04.6, Hadoop 2.10.1, Hive 3.1.3, Spark 3.3.0 and Pig 0.15

It seems that there may be some compatibility issue between Hadoop 2 and Spark 3. I was able to resolve the error after updating Hadoop, Hive and Spark to the latest versions.

Related

facing hive query error on show databases query - Unable to instantiate

I initialized Hive and it worked. Later I ran the SHOW DATABASES command, but I got the error below.
I am using MySQL for the metadata.
adminn@master:~$ hive
Hive Session ID = e9e9145a-0c38-4007-a9af-ded86a4226ea
Logging initialized using configuration in jar:file:/home/adminn/apache-hive-3.1.1-bin/lib/hive-common-3.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
FAILED: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
I added the below property to the hive-site.xml file, and this resolved the issue.
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
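A quick way to check that the class named in that property can actually be loaded is a small Java sketch like the one below. It assumes, as is usual for JDBC, that the MySQL connector JAR also has to be on the classpath (for Hive, typically its lib directory). The class name DriverClassCheck and the method isLoadable are hypothetical helpers, not part of Hive.

```java
// Minimal sketch: checks whether a JDBC driver class is visible on the current classpath.
// DriverClassCheck and isLoadable are hypothetical names used only for illustration.
class DriverClassCheck {

    // Returns true if the named class can be loaded and initialized, false otherwise.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is missing from the classpath, or failed to link/initialize.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class from the hive-site.xml property above.
        String driver = args.length > 0 ? args[0] : "com.mysql.jdbc.Driver";
        System.out.println(driver + " loadable: " + isLoadable(driver));
    }
}
```

Run it with the same classpath that Hive uses. If it reports false for com.mysql.jdbc.Driver, the connector JAR is probably missing, and setting the property alone will not be enough.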

Error in installation of Hive 2.1.0 on Hadoop 2.7.2 - Pseudo distributed mode

I followed the Apache Hadoop installation guides and was able to install Hadoop along with Pig. Both are working fine.
Following is the configuration:
Hadoop: 2.7.2
Hive: 2.1.0
Machine: Ubuntu 14.04 LTS 64-bit
Java: Version 9
Now I tried to install Apache Hive 2.1.0 according to this link [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingfromaTarball].
When I then start the Hive CLI to test it, it throws the following error every time and exits.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.ClassCastException: jdk.internal.loader.ClassLoaders$AppClassLoader (in module: java.base) cannot be cast to java.net.URLClassLoader (in module: java.base)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:374)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:350)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:663)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:533)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
However, there is a catch: if I invoke the Beeline CLI instead, it works fine.
Could you please help with the following:
a. Are the Beeline CLI and the Hive CLI the same, or are there specific differences?
b. How do I install and configure Hive on my machine?
A: Beeline CLI vs. Hive CLI: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_dataintegration/content/beeline-vs-hive-cli.html
B: According to:
http://openjdk.java.net/projects/jigsaw/talks/prepare-for-jdk9-j1-2015.pdf
Java 9 no longer uses java.net.URLClassLoader for the application class loader, which is why the cast in SessionState fails with a ClassCastException.
I was able to solve the issue by pointing Hive to JDK 8.
Note: I have only just begun using Hive/Hadoop. Perhaps someone could provide a better explanation, or a workaround that allows JDK 9 to be used.
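To see the class loader change for yourself, a minimal Java sketch such as the following can be run under each JDK. It only inspects the system class loader; it does not reproduce Hive's SessionState code, and the class name SystemLoaderCheck is a hypothetical one chosen for illustration.

```java
import java.net.URLClassLoader;

// Minimal sketch: reports whether the JVM's system class loader is a URLClassLoader.
// SystemLoaderCheck is a hypothetical name; this is not Hive code.
class SystemLoaderCheck {

    // A cast of the system class loader to URLClassLoader only succeeds when this is true.
    static boolean systemLoaderIsUrlClassLoader() {
        return ClassLoader.getSystemClassLoader() instanceof URLClassLoader;
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("system class loader = "
                + ClassLoader.getSystemClassLoader().getClass().getName());
        System.out.println("is URLClassLoader: " + systemLoaderIsUrlClassLoader());
    }
}
```

When the last line reports false, any code that casts the system class loader to URLClassLoader, as the stack trace in the question suggests SessionState does in Hive 2.1.0, will fail with the ClassCastException shown there.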

Use saiku with apache hive

Have you ever used Saiku for data analysis on a big data platform (Hadoop)? My recent work requires integrating some legacy BI tools with Hadoop to support common OLAP queries on HDFS/HBase.
I found a solution implemented with Phoenix and HBase here, which bridges Saiku and HBase through the SQL dialect in Phoenix, and it worked. However, this method can only handle data within HBase through the HBase API. It cannot launch any MapReduce-style job when building the data cube. I would prefer an alternative that is more compatible with big data tooling, such as going through Apache Hive.
Saiku is based on Mondrian. My version of Saiku uses Mondrian-4.0.0.0-SNAPSHOT.jar, which I found can already work well with Hive. I also found many Hive 0.13 jars in Saiku's lib directory, so I thought a simple hive2 datasource configuration would work. I started a hiveserver2 on the name node of my HDFS cluster and added the following datasource to Saiku.
Name: hive2
Connection Type: Mondrian
URL: jdbc:hive2://localhost:10000/default
Schema: /datasources/movie.xml
Jdbc Driver: org.apache.hive.jdbc.HiveDriver
Username: ubuntu
Password: XXXX
Saiku did connect to hiveserver2 successfully but failed to load the datasource. I found the following error in the Saiku log:
name:hive2
driver:mondrian.olap4j.MondrianOlap4jDriver
url:jdbc:mondrian:Jdbc=jdbc:hive2://localhost:10000/default;Catalog=mondrian:///datasources/movie.xml;JdbcDrivers=org.apache.hive.jdbc.HiveDriver
12:41:48,110 WARN [RolapSchema] Model is in legacy format
12:41:50,464 ERROR [SecurityAwareConnectionManager] Error connecting: hive2
mondrian.olap.MondrianException: Mondrian Error:Internal error: while quoting identifier
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:992)
at mondrian.olap.Util.newInternal(Util.java:2543)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:245)
at mondrian.spi.impl.JdbcDialectImpl.<init>(JdbcDialectImpl.java:146)
at mondrian.spi.DialectManager$DialectManagerImpl$1.createDialect(DialectManager.java:210)
...
Caused by: java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveDatabaseMetaData.getIdentifierQuoteString(HiveDatabaseMetaData.java:342)
at org.apache.commons.dbcp.DelegatingDatabaseMetaData.getIdentifierQuoteString(DelegatingDatabaseMetaData.java:306)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:238)
... 99 more
I looked into the Hive 0.13 source and found that getIdentifierQuoteString isn't implemented yet and simply throws an exception:
public String getIdentifierQuoteString() throws SQLException {
    throw new SQLException("Method not supported");
}
At this point I'm puzzled. Is it practical to use Saiku with Hive? It ships Hive 0.13 jars in its lib directory, yet it cannot load a simple Hive datasource. Should I simply modify the Hive source? I found that in the newly released Hive 1.0 this function is implemented by simply returning an empty string.
Does anyone have a good idea? Thanks!
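For illustration, the failure mode and the "return an empty string" behaviour can be sketched in plain Java without a Hive server. This is not Mondrian's or Hive's actual code: the class QuoteStringFallback, its methods, and the proxy-based stand-in driver are all hypothetical, and the fallback to an empty string simply mirrors what the question says Hive 1.0 returns.

```java
import java.lang.reflect.Proxy;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Minimal sketch of a defensive caller; all names here are hypothetical.
class QuoteStringFallback {

    // Returns the driver's identifier quote string, or "" if the driver cannot report one.
    static String quoteStringOrEmpty(DatabaseMetaData metaData) {
        try {
            String quote = metaData.getIdentifierQuoteString();
            return quote == null ? "" : quote;
        } catch (SQLException e) {
            // Mirrors the behaviour described for Hive 1.0: treat "not supported" as no quoting.
            return "";
        }
    }

    // Stand-in for a driver whose metadata methods all throw, like the Hive 0.13 snippet above.
    static DatabaseMetaData unsupportedMetaData() {
        return (DatabaseMetaData) Proxy.newProxyInstance(
                QuoteStringFallback.class.getClassLoader(),
                new Class<?>[] { DatabaseMetaData.class },
                (proxy, method, methodArgs) -> {
                    throw new SQLException("Method not supported");
                });
    }

    public static void main(String[] args) {
        // prints "[]" because the SQLException is caught and replaced by an empty string
        System.out.println("[" + quoteStringOrEmpty(unsupportedMetaData()) + "]");
    }
}
```

A caller written this way tolerates drivers that do not implement the method. The stack trace above shows that the Mondrian version in use does not, which is why the datasource fails to load.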

Apache hive error Merging of credentials not supported in this version of hadoop

I am using Hadoop 1.2.1, HBase 0.94.14 and Hive 1.0.0. There are three data nodes in my cluster and also three region servers. I have to import some data from HBase into Hive. I have configured Hive successfully, but when I run a command to count the number of rows in a Hive table, it gives the following error:
ERROR [main]: exec.Task (SessionState.java:printError(833)) - Job Submission failed with exception 'java.lang.RuntimeException(java.io.IOException: Merging of credentials not supported in this version of hadoop)'
java.lang.RuntimeException: java.io.IOException: Merging of credentials not supported in this version of hadoop
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:485)
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:856)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:540)
I changed the Hive version to 0.14, but I get the same error.
What is the solution?
Note: I cannot upgrade Hadoop.
Your version of Hive is current, and it is not the source of your error. You need to upgrade Hadoop to 2.4.0 or above.
The error originates from here: https://github.com/apache/hive/blob/3b6825b5b61e943e8e41743f5cbf6d640e0ebdf5/shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java#L579

Show databases command not working in hive?

I connected to Hive, and when I try to show all databases using the command below, I get the following error:
techgene@slaveone:~/apps/hive-0.12.0$ hive
Logging initialized using configuration in jar:file:/home/techgene/apps/hive-0.12.0/lib/hive-common-0.12.0.jar!/hive-log4j.properties
hive> show databases;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Can you please provide a solution for this?
This problem usually occurs when a Hive CLI session was not ended properly. In that case, kill the improperly closed Hive CLI session as follows, then launch a fresh Hive CLI.
ramisetty@aspire:~$ jps
3710 SecondaryNameNode
4103 RunJar -------------------------> hive CLI instance.
4019 TaskTracker
3467 DataNode
3242 NameNode
4366 Jps
3788 JobTracker
ramisetty@aspire:~$ kill -9 4103
ramisetty@aspire:~$
If the problem still persists, follow the solutions available in the question "FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient".