Apache Hive error: Merging of credentials not supported in this version of hadoop

I am using Hadoop 1.2.1, HBase 0.94.14 and Hive 1.0.0. My cluster has three datanodes and three regionservers. I have to import some data from HBase into Hive. I configured Hive successfully, but when I ran a query to count the rows in a Hive table, it failed with the following error:
ERROR [main]: exec.Task (SessionState.java:printError(833)) - Job Submission failed with exception 'java.lang.RuntimeException(java.io.IOException: Merging of credentials not supported in this version of hadoop)'
java.lang.RuntimeException: java.io.IOException: Merging of credentials not supported in this version of hadoop
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:485)
at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:856)
at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:540)
I changed the Hive version to 0.14 but got the same error.
How can I fix this?
Note: I cannot upgrade hadoop.

Although your version of Hive is current, it is not the source of your error. You need to upgrade your Hadoop version to 2.4.0 or above.
The error originates from here: https://github.com/apache/hive/blob/3b6825b5b61e943e8e41743f5cbf6d640e0ebdf5/shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java#L579
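To illustrate why swapping the Hive version alone did not change the outcome, here is a simplified sketch of the general shim pattern. It is not Hive's actual code: the names `ShimSketch`, `Shim`, `shimFor` and `tryMerge` are hypothetical, and the version check is an assumption made only for the illustration. The point is that the failing branch is chosen by the Hadoop version on the classpath.

```java
import java.io.IOException;

public class ShimSketch {
    /** Hypothetical stand-in for a version-specific compatibility layer. */
    interface Shim {
        void mergeCredentials() throws IOException;
    }

    /** Picks a shim from the Hadoop version only; the Hive version plays no part here. */
    static Shim shimFor(String hadoopVersion) {
        if (hadoopVersion.startsWith("1.")) {
            // Assumed behaviour for the old line: the operation is simply rejected.
            return () -> {
                throw new IOException(
                    "Merging of credentials not supported in this version of hadoop");
            };
        }
        // Assumed behaviour for newer lines: the operation succeeds.
        return () -> { };
    }

    /** Returns the error message for the given Hadoop version, or null if there is none. */
    static String tryMerge(String hadoopVersion) {
        try {
            shimFor(hadoopVersion).mergeCredentials();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("1.2.1 -> " + tryMerge("1.2.1"));
        System.out.println("2.4.0 -> " + tryMerge("2.4.0"));
    }
}
```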

Related

Hive error on SHOW DATABASES query: Unable to instantiate

I initialized Hive and it worked. Later I ran the SHOW DATABASES command and got the error below.
I am using MySQL for metadata.
adminn@master:~$ hive
Hive Session ID = e9e9145a-0c38-4007-a9af-ded86a4226ea
Logging initialized using configuration in jar:file:/home/adminn/apache-hive-3.1.1-bin/lib/hive-common-3.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
FAILED: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
I added the below property to the hive-site.xml file, and this resolved the issue.
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
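Since the "Unable to instantiate" message is generic, it may help to confirm that the configured driver class can actually be loaded. The following is a minimal sketch, assuming the MySQL connector jar is expected on the same classpath the check runs with; the class name `DriverCheck` and the method `isClassAvailable` are made up for this illustration.

```java
public class DriverCheck {
    /** Returns true if the named class can be located on the current classpath. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the driver class named in the hive-site.xml property above.
        String driver = args.length > 0 ? args[0] : "com.mysql.jdbc.Driver";
        System.out.println(driver + " available: " + isClassAvailable(driver));
    }
}
```

If this prints `false` when run with Hive's classpath, the connector jar is probably missing rather than the property being wrong.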

Error when trying to start HiveServer2: NullPointerException in ThriftBinaryCLIService

When I start hiveserver2 with the following command:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
I receive the following error before the program exits:
2022-09-12T14:46:53,713 ERROR [Thrift Server] transport.TServerSocket: Could not set socket timeout.
java.net.SocketException: Socket is closed
at java.net.ServerSocket.setSoTimeout(ServerSocket.java:666) ~[?:1.8.0_292]
at org.apache.thrift.transport.TServerSocket.listen(TServerSocket.java:117) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:146) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:169) ~[hive-service-3.1.3.jar:3.1.3]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Hive Session ID = 56c28481-2b0c-4712-808d-ff7ccf31b543
Hive Session ID = 9771e219-095c-4524-b34a-b8e05c335fc0
2022-09-12T14:48:03,871 ERROR [Thrift Server] thrift.ThriftCLIService: Exception caught by ThriftBinaryCLIService. Exiting.
java.lang.NullPointerException: null
at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:169) ~[hive-service-3.1.3.jar:3.1.3]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Here is a brief explanation of my setup:
I am using vagrant and VirtualBox to create a "virtual" cluster.
This is very loosely based on this repository: https://github.com/njvijay/vagrant-jilla-hadoop (it hasn't been updated in a while, so I had to make many changes to get it to work).
I have created 5 nodes (1 name node and 4 data nodes). The name node also hosts YARN, Hive, Pig, Spark, MySQL, Python, etc.
I am using Ubuntu 14.04.6, Hadoop 2.10.1, Hive 3.1.3, Spark 3.3.0 and Pig 0.15.
It seems that there may be some compatibility issue between Hadoop 2 and Spark 3. I was able to resolve the error after updating Hadoop, Hive and Spark to the latest versions.

FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster java.lang.NoSuchMethodError:

I am trying to run a MapReduce job on an EMR cluster. The version of Hadoop on EMR is 2.7.3.
The code reads HFiles residing in an S3 bucket, but every run fails with the error below.
2018-02-22 20:02:11,641 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskLog.createLogSyncer()Ljava/util/concurrent/ScheduledExecutorService;
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.<init>(MRAppMaster.java:250)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.<init>(MRAppMaster.java:233)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1472)
2018-02-22 20:02:12,188 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1
End of LogType:syslog
The code was originally designed to read files from HDFS and worked fine on CDH-based clusters where the Hadoop version is 2.6.0. However, there was a requirement to read the HFiles from an S3 bucket on an EMR-based cluster in AWS. I made a few changes to the code so that it can read from any file system. Below is a snippet of the change:
...
Path JSONOutputjob2 = new Path( args[1] );
FileSystem.get(JSONOutputjob2.toUri(), conf2).delete(JSONOutputjob2, true);
...
I am passing the path as an argument and here are the options that I have tried with the file path.
s3n://emr-ip/path/to/the/file
s3a://emr-ip/path/to/the/file
s3://emr-ip/path/to/the/file
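For context, `FileSystem.get(URI, Configuration)` chooses the file system implementation from the URI's scheme and authority, so the three paths above ask for three different implementations of the same bucket. The sketch below uses only `java.net.URI` to show what each path resolves to; `SchemeDemo`, `schemeOf` and `bucketOf` are hypothetical names, and this by itself does not explain the NoSuchMethodError.

```java
import java.net.URI;

public class SchemeDemo {
    /** The part of the path that selects the file system implementation. */
    static String schemeOf(String path) {
        return URI.create(path).getScheme();
    }

    /** The authority component, which for S3-style paths is the bucket name. */
    static String bucketOf(String path) {
        return URI.create(path).getAuthority();
    }

    public static void main(String[] args) {
        String[] paths = {
            "s3n://emr-ip/path/to/the/file",
            "s3a://emr-ip/path/to/the/file",
            "s3://emr-ip/path/to/the/file"
        };
        for (String p : paths) {
            System.out.println(schemeOf(p) + " -> bucket " + bucketOf(p));
        }
    }
}
```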
This error is really driving me crazy. I have updated my pom.xml to use the Hadoop version available on the cluster and rebuilt the project. The build succeeded, but the job still does not work. Any suggestions or help would be much appreciated.
Edit:
I have updated my pom to use the AWS Hadoop version, i.e. 2.7.3, which did not fix the issue.
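A NoSuchMethodError generally means that the class found at run time is a different version from the one the calling code was compiled against. One way to investigate, assuming the cause is mixed Hadoop jars on the classpath (which nothing above confirms), is to print where a class was loaded from and whether it declares the method. `ClasspathProbe`, `locationOf` and `hasDeclaredMethod` are hypothetical helper names.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathProbe {
    /** Returns the jar or directory a class was loaded from, or a marker if unknown. */
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    /** Returns true if the class itself declares a method with the given name. */
    static boolean hasDeclaredMethod(Class<?> c, String name) {
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Example arguments: org.apache.hadoop.mapred.TaskLog createLogSyncer
        Class<?> c = Class.forName(args[0]);
        System.out.println("loaded from: " + locationOf(c));
        System.out.println("declares " + args[1] + ": " + hasDeclaredMethod(c, args[1]));
    }
}
```

Running this with the same classpath as the failing job would show which jar actually supplies the class named in the stack trace.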

Error in installation of Hive 2.1.0 on Hadoop 2.7.2 - Pseudo distributed mode

I followed the Apache Hadoop installation guides and was able to install Hadoop along with Pig. Both are working fine.
Following is the configuration:
Hadoop: 2.7.2
Hive: 2.1.0
Machine: Ubuntu 14.04 LTS 64-bit
Java: Version 9
Now I tried to install Apache Hive 2.1.0 according to this link [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingfromaTarball].
... and started a test run of the Hive CLI, but every time it throws the following error and exits.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.ClassCastException: jdk.internal.loader.ClassLoaders$AppClassLoader (in module: java.base) cannot be cast to java.net.URLClassLoader (in module: java.base)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:374)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:350)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:663)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:533)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
However, there is a catch: if I invoke the Beeline CLI, it works fine.
Could you please help with the following:
a. Are the Beeline CLI and the Hive CLI the same, or is there a specific difference?
b. How do I install/configure Hive on my machine?
A: Beeline CLI vs Hive CLI: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_dataintegration/content/beeline-vs-hive-cli.html
B: According to:
http://openjdk.java.net/projects/jigsaw/talks/prepare-for-jdk9-j1-2015.pdf
Java 9 no longer uses java.net.URLClassLoader for its application class loader, which is why the cast in SessionState fails.
I was able to solve the issue by pointing Hive to JDK 8.
** I have only begun using Hive/Hadoop. Perhaps someone could provide a better explanation or a workaround so that we can use JDK 9.
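The class loader change can be checked directly on any JVM. This is a minimal sketch; `LoaderCheck` and its methods are hypothetical names, and it only reports what the running JVM does without changing Hive's behaviour.

```java
import java.net.URLClassLoader;

public class LoaderCheck {
    /** True if the system (application) class loader is a URLClassLoader. */
    static boolean systemLoaderIsUrlClassLoader() {
        return ClassLoader.getSystemClassLoader() instanceof URLClassLoader;
    }

    /** Java 8 and older report a specification version of the form "1.x". */
    static boolean isJava8OrOlder() {
        return System.getProperty("java.specification.version").startsWith("1.");
    }

    public static void main(String[] args) {
        System.out.println("Java spec version: "
            + System.getProperty("java.specification.version"));
        System.out.println("System loader class: "
            + ClassLoader.getSystemClassLoader().getClass().getName());
        System.out.println("Is URLClassLoader: " + systemLoaderIsUrlClassLoader());
    }
}
```

Code that casts the system class loader to `URLClassLoader`, as the stack trace above suggests Hive 2.1.0 does, can only work when the last line prints `true`.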

Use Saiku with Apache Hive

Have you ever used Saiku for data analysis on a big data platform (Hadoop)? My recent work requires integrating some legacy BI tools with Hadoop to support common OLAP queries on HDFS/HBase.
I found a solution implemented with Phoenix and HBase here, which bridges Saiku and HBase through the SQL dialect in Phoenix, and it worked. However, this method can only handle data within HBase through the HBase API. It cannot launch any MapReduce-style job when building the data cube. I would prefer a more big-data-compatible alternative, such as Apache Hive.
Saiku is based on Mondrian. My version of Saiku uses Mondrian-4.0.0.0-SNAPSHOT.jar, which I found can already work well with Hive. I also found many Hive 0.13 jars in Saiku's lib directory, so I thought a simple hive2 datasource configuration would work. I started a HiveServer2 on the namenode of my HDFS cluster and added the following datasource to Saiku:
Name: hive2
Connection Type: Mondrian
URL: jdbc:hive2://localhost:10000/default
Schema: /datasources/movie.xml
Jdbc Driver: org.apache.hive.jdbc.HiveDriver
Username: ubuntu
Password: XXXX
Saiku did connect to HiveServer2 successfully but failed to load the datasource. I found the following error in the Saiku log:
name:hive2
driver:mondrian.olap4j.MondrianOlap4jDriver
url:jdbc:mondrian:Jdbc=jdbc:hive2://localhost:10000/default;Catalog=mondrian:///datasources/movie.xml;JdbcDrivers=org.apache.hive.jdbc.HiveDriver
12:41:48,110 WARN [RolapSchema] Model is in legacy format
12:41:50,464 ERROR [SecurityAwareConnectionManager] Error connecting: hive2
mondrian.olap.MondrianException: Mondrian Error:Internal error: while quoting identifier
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:992)
at mondrian.olap.Util.newInternal(Util.java:2543)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:245)
at mondrian.spi.impl.JdbcDialectImpl.<init>(JdbcDialectImpl.java:146)
at mondrian.spi.DialectManager$DialectManagerImpl$1.createDialect(DialectManager.java:210)
...
Caused by: java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveDatabaseMetaData.getIdentifierQuoteString(HiveDatabaseMetaData.java:342)
at org.apache.commons.dbcp.DelegatingDatabaseMetaData.getIdentifierQuoteString(DelegatingDatabaseMetaData.java:306)
at mondrian.spi.impl.JdbcDialectImpl.deduceIdentifierQuoteString(JdbcDialectImpl.java:238)
... 99 more
I looked into the Hive 0.13 source and found that getIdentifierQuoteString isn't implemented yet and simply throws an exception:
public String getIdentifierQuoteString() throws SQLException {
throw new SQLException("Method not supported");
}
At this point I'm puzzled. Is it practical to use Saiku with Hive? It ships Hive 0.13 jars in its lib directory but cannot load a simple Hive datasource. Should I simply modify the Hive source? I found that in the newly released Hive 1.0 this function is implemented by simply returning an empty string.
Does anyone have a good idea? Thanks!
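One possible direction, short of patching Hive, is to treat the unsupported call as "no quoting" on the caller side. The sketch below shows that fallback pattern with plain JDBC types. It is not Mondrian's or Saiku's code: `QuoteStringFallback`, `quoteStringOrDefault` and `unsupportedMetaData` are hypothetical names, the stub only imitates a driver that throws, and where such a wrapper could be hooked into Mondrian is left open.

```java
import java.lang.reflect.Proxy;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

public class QuoteStringFallback {
    /** Asks the driver for its quote string, using the fallback if the call is unsupported. */
    static String quoteStringOrDefault(DatabaseMetaData md, String fallback) {
        try {
            return md.getIdentifierQuoteString();
        } catch (SQLException e) {
            return fallback;
        }
    }

    /** Stub imitating a driver whose metadata methods all throw "Method not supported". */
    static DatabaseMetaData unsupportedMetaData() {
        return (DatabaseMetaData) Proxy.newProxyInstance(
            QuoteStringFallback.class.getClassLoader(),
            new Class<?>[] { DatabaseMetaData.class },
            (proxy, method, args) -> {
                throw new SQLException("Method not supported");
            });
    }

    public static void main(String[] args) {
        // The empty-string fallback mirrors what the question says Hive 1.0 returns.
        String quote = quoteStringOrDefault(unsupportedMetaData(), "");
        System.out.println("quote string: [" + quote + "]");
    }
}
```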