Hive HBase integration failure

I am using Hadoop 2.7.0, Hive 1.2.0, and HBase 1.0.1.1.
I have created a simple table in HBase:
hbase(main):021:0> create 'hbasetohive', 'colFamily'
0 row(s) in 0.2680 seconds
=> Hbase::Table - hbasetohive
hbase(main):022:0> put 'hbasetohive', '1s', 'colFamily:val','1strowval'
0 row(s) in 0.0280 seconds
hbase(main):023:0> scan 'hbasetohive'
ROW COLUMN+CELL
1s column=colFamily:val, timestamp=1434644858733, value=1strowval
1 row(s) in 0.0170 seconds
Now I am trying to access this HBase table through a Hive external table, but when I select from the external table I get the error below.
hive (default)> CREATE EXTERNAL TABLE hbase_hivetable_k(key string, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = "colFamily:val")
> TBLPROPERTIES("hbase.table.name" = "hbasetohive");
OK
Time taken: 1.688 seconds
hive (default)> Select * from hbase_hivetable_k;
OK
hbase_hivetable_k.key hbase_hivetable_k.value
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.Scan.setCaching(I)V
at org.apache.hadoop.hive.hbase.HiveHBaseInputFormatUtil.getScan(HiveHBaseInputFormatUtil.java:123)
at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:99)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1667)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
It drops out of the Hive prompt entirely.
Can someone please tell me what the issue is here?
I am also using the following .hiverc from the hive/conf directory:
SET hive.cli.print.header=true;
set hive.cli.print.current.db=true;
set hive.auto.convert.join=true;
SET hbase.scan.cacheblock=0;
SET hbase.scan.cache=10000;
SET hbase.client.scanner.cache=10000;
add JAR /usr/lib/hive/auxlib/zookeeper-3.4.6.jar;
add JAR /usr/lib/hive/auxlib/hive-hbase-handler-1.2.0.jar;
add JAR /usr/lib/hive/auxlib/guava-14.0.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-common-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-client-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-hadoop2-compat-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-hadoop-compat-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/commons-configuration-1.6.jar;
add JAR /usr/lib/hive/auxlib/hadoop-common-2.7.0.jar;
add JAR /usr/lib/hive/auxlib/hbase-annotations-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-it-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-prefix-tree-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-protocol-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-rest-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-server-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-shell-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/hbase-thrift-1.0.1.1.jar;
add JAR /usr/lib/hive/auxlib/high-scale-lib-1.1.1.jar;
add JAR /usr/lib/hive/auxlib/hive-serde-1.2.0.jar;
add JAR /usr/lib/hbase/lib/commons-beanutils-1.7.0.jar;
add JAR /usr/lib/hbase/lib/commons-beanutils-core-1.8.0.jar;
add JAR /usr/lib/hbase/lib/commons-cli-1.2.jar;
add JAR /usr/lib/hbase/lib/commons-codec-1.9.jar;
add JAR /usr/lib/hbase/lib/commons-collections-3.2.1.jar;
add JAR /usr/lib/hbase/lib/commons-compress-1.4.1.jar;
add JAR /usr/lib/hbase/lib/commons-digester-1.8.jar;
add JAR /usr/lib/hbase/lib/commons-el-1.0.jar;
add JAR /usr/lib/hbase/lib/commons-io-2.4.jar;
add JAR /usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar;
add JAR /usr/local/src/spark/lib/spark-assembly-1.3.1-hadoop2.6.0.jar;

I was having the same issue; it occurs because Hive 1.2.0 is not compatible with HBase 1.x.
As mentioned in the HBaseIntegration wiki:
Version information: As of Hive 0.9.0 the HBase integration requires at least HBase 0.92; earlier versions of Hive worked with HBase 0.89/0.90.
Version information: Hive 1.x will remain compatible with HBase 0.98.x and lower versions. Hive 2.x will be compatible with HBase 1.x and higher. (See HIVE-10990 for details.) Consumers wanting to work with HBase 1.x using Hive 1.x will need to compile Hive 1.x stream code themselves.
So to make Hive 1.x work with HBase 1.x, you have to download the source code of the Hive 2.0 branch from Hive on GitHub and build it. After building, replace the hive-hbase-handler jar file with the newer version, and it will work.
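As a rough sketch of that workaround (the branch name, module name, and auxlib path are assumptions based on the setup in the question; adjust for your environment):

git clone https://github.com/apache/hive.git
cd hive
git checkout branch-2.0
# build only the hbase-handler module, skipping tests
mvn clean package -pl hbase-handler -am -DskipTests
# back up the 1.2.0 handler jar, then drop in the freshly built one
mv /usr/lib/hive/auxlib/hive-hbase-handler-1.2.0.jar /tmp/
cp hbase-handler/target/hive-hbase-handler-*.jar /usr/lib/hive/auxlib/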

Related

Hive with HBase (both Kerberos) java.net.SocketTimeoutException .. on table 'hbase:meta'

Error: I am receiving timeout errors when trying to query HBase from Hive using the HBaseStorageHandler:
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68199: row 'phoenix_test310,,'
on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hbase-master.example.com,16020,1583728693297, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
... 3 more
I tried to follow what documentation I could find, and added some HBase configuration options to hive-site.xml based on this Cloudera link.
Environment:
Hadoop 2.9.2
HBase 1.5
Hive 2.3.6
Zookeeper 3.5.6
First, the Cloudera link should be ignored: Hive detects the presence of HBase through environment variables and then automatically reads the hbase-site.xml configuration settings.
There is no need to duplicate HBase settings within hive-site.xml.
Configuring Hive for HBase
Modify your hive-env.sh as follows:
# replace <hbase-install> with your installation path /etc/hbase for example
export HBASE_BIN="<hbase-install>/bin/hbase"
export HBASE_CONF_DIR="<hbase-install>/conf"
Separately, you should ensure the HADOOP_* environment variables are also set in hive-env.sh, and that the HBase lib directory is added to HADOOP_CLASSPATH, as in the sketch below.
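For example, a minimal hive-env.sh along those lines (a sketch assuming a tarball install under /usr/local/hadoop and /etc/hbase; adjust paths to your layout):

export HADOOP_HOME=/usr/local/hadoop
export HBASE_BIN="/etc/hbase/bin/hbase"
export HBASE_CONF_DIR="/etc/hbase/conf"
# make the HBase client jars visible on Hive's Hadoop classpath
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/etc/hbase/lib/*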
We solved this error by adding the property hbase.client.scanner.timeout.period=600000 (HBase 1.2).
https://docs.cloudera.com/documentation/enterprise/5-5-x/topics/admin_hbase_scanner_heartbeat.html#concept_xsl_dz1_jt
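If you want to try the same fix, the property belongs in the client-side hbase-site.xml (the value is in milliseconds, so 600000 is ten minutes); a minimal sketch:

<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>600000</value>
</property>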

Alluxio + Hive on EMR

I have Alluxio 1.8 installed on an EMR 5.19.0 cluster, and can see my S3 tables using /usr/local/alluxio/bin/alluxio fs ls /.
However, when I start up Hive and issue
hive> [[DDL w/ LOCATION = alluxio://master_host:19998/my_table ]]
I get the following:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found
Is there a way of getting past this? I've tried starting Hive with --auxpath pointing both to /usr/local/alluxio/client/alluxio-1.8.1-client.jar and to a copy of the jar on HDFS, without any success.
Any help?
I posted a blog post about the reasons for the error message java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found. Here are some tips; I hope they help:
For Hive, set environment variable HIVE_AUX_JARS_PATH in conf/hive-env.sh:
export HIVE_AUX_JARS_PATH=/<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar:${HIVE_AUX_JARS_PATH}
which I guess is equivalent to what you have done to set --auxpath.
Depending on how you run Hive (e.g., Hive on MR, Spark, or Tez), you may also need to make sure the runtime can access the client jar. Taking Hive on MR as an example, you probably also need to append the path of the Alluxio client jar to mapreduce.application.classpath or yarn.application.classpath so that each task of the MR jobs can access it; see the sketch below.
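For Hive on MR, that could look like the following in mapred-site.xml (a sketch reusing the client jar path from the question; the leading entries mirror the Hadoop defaults):

<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,/usr/local/alluxio/client/alluxio-1.8.1-client.jar</value>
</property>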

Not able to start hiveserver2 for Apache Hive

Could anyone help resolve the problem below? I'm trying to start hiveserver2: I have configured hive-site.xml and the configuration file for the Hadoop directory path as well, and the jar file hive-service-rpc-2.1.1.jar is available in the lib directory. I am able to start hive, but not hiveserver2.
$ hive --service hiveserver2 Exception in thread "main" java.lang.ClassNotFoundException: /home/directory/Hadoop/Hive/apache-hive-2/1/1-bin/lib/hive-service-rpc-2/1/1/jar
export HIVE_HOME=/usr/local/hive-1.2.1/
export HIVE_HOME=/usr/local/hive-2.1.1
I am glad to say I solved the problem. The issue was that I have different Hive versions installed: my command came from 1.2.1, but it was finding its jars from 2.1.1 (note the two HIVE_HOME exports above).
You can use which hive (or which hiveserver2) to find where the command you are running comes from, as in the sketch below.
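A quick way to check for that kind of mix-up (a sketch; the paths are illustrative):

# see which launcher scripts are actually first on your PATH
which hive
which hiveserver2
# confirm HIVE_HOME points at the version you expect
echo $HIVE_HOME
ls $HIVE_HOME/lib/hive-service-rpc-*.jar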

Cannot validate serde: org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe

I am getting Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe while creating a table in Hive. Below is the table creation script:
CREATE EXTERNAL TABLE ratings(user_id INT, movie_id INT, rating INT, rating_time STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="::")
LOCATION '/user/hive/ratings';
HDP Version : 2.1.1
You are facing this problem because your Hive lib directory does not have the hive-contrib jar, or hive-site.xml is not pointing to it.
Check the /usr/lib/hive/lib folder; there must be a hive-contrib-<version>.jar in it.
If you do not find the jar in that folder, download it from this link (take care to pick the correct version) and put it in the Hive lib folder mentioned above.
You can then make the jar visible to your Hive CLI in two ways.
For a single session:
add jar /usr/lib/hive/lib/hive-contrib-<version>.jar;
For a permanent solution, add this to your hive-site.xml:
<property>
  <name>hive.aux.jars.path</name>
  <value>/usr/lib/hive/lib/*</value>
</property>
P.S.: The MultiDelimitSerDe class was added after hive-contrib-0.13, so please ensure that you are using the correct version.
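To verify the fix end to end, a sketch of a Hive session (substitute the hive-contrib version you actually installed for the 0.14.0 shown here):

add jar /usr/lib/hive/lib/hive-contrib-0.14.0.jar;
CREATE EXTERNAL TABLE ratings(user_id INT, movie_id INT, rating INT, rating_time STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="::")
LOCATION '/user/hive/ratings';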

hbase migration to 0.98 ClassNotFoundException WritableByteArrayComparable

After a recent migration from HBase 0.94.13 to HBase 0.98.12, my code is failing to execute.
I am simply trying to connect to a table via a dependent jar file developed by another team that uses Spring's HbaseTemplate. I have manually placed all the jar files required to execute the code, including hbase-client-0.98.12-mapr-1506.jar (we have the MapR distribution).
I am receiving the following error:
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
It seems to be occurring because, as of HBase 0.96.x, WritableByteArrayComparable has been renamed to ByteArrayComparable.
How can I make the old code work again?
I was able to make it work by keeping the old jar hbase-0.94.9-mapr-1308 in the classpath, along the lines of the sketch below. It was a dirty fix, but it did the job.
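That fix amounts to something like the following when launching the job (a sketch; the jar location, my-app.jar, and the main class are placeholders):

# keep the 0.94 jar ahead of the 0.98 client so the old
# WritableByteArrayComparable class still resolves
export HADOOP_CLASSPATH=/opt/mapr/hbase/lib/hbase-0.94.9-mapr-1308.jar:$HADOOP_CLASSPATH
hadoop jar my-app.jar com.example.MyHBaseJob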
The other team whose dependent jar I was using to connect to M7, finally updated their code and now things are back to normal again. Thanks.