Failing to Drop Database in Hive - hive

I'm trying to initialize database in hive before each run.
The code is:
command="hive -e \"drop database if exists some_db cascade; create database some_db\"";
eval $command;
execution fails with error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to clean up java.sql.SQLSyntaxErrorException: Table 'hive.COMPLETED_COMPACTIONS' doesn't exist
it passes on rerun,
eval $command;
I can't explain also the fact that when calling the command for multiple times, it keep failing alternately.
Appreciate for advice to what happens in Hive and how to make it work on the first attempt, thanks.
Details:
Hive 2.1 running on AWS EMR 5.7, didn't see such behavior on HIVE 1.0
Full Error Stack
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to clean up java.sql.SQLSyntaxErrorException: Table 'hive.COMPLETED_COMPACTIONS' doesn't exist
at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:125)
at org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(ExceptionMapper.java:69)
at org.mariadb.jdbc.MariaDbStatement.executeQueryEpilog(MariaDbStatement.java:259)
at org.mariadb.jdbc.MariaDbStatement.execute(MariaDbStatement.java:287)
at org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:470)
at org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:486)
at com.jolbox.bonecp.StatementHandle.executeUpdate(StatementHandle.java:497)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:1721)
at org.apache.hadoop.hive.metastore.AcidEventListener.onDropDatabase(AcidEventListener.java:51)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database_core(HiveMetaStore.java:1098)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database(HiveMetaStore.java:1130)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy19.drop_database(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_database.getResult(ThriftHiveMetastore.java:10518)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_database.getResult(ThriftHiveMetastore.java:10502)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Table 'hive.COMPLETED_COMPACTIONS' doesn't exist
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:479)
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.result(AbstractQueryProtocol.java:400)
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:381)
at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:337)
at org.mariadb.jdbc.MariaDbStatement.execute(MariaDbStatement.java:277)
... 27 more
Blockquote

Looks like your Hive metastore schema needs to be upgraded. You can check the schema version with the schematool command.
Example output:
Metastore connection URL: jdbc:mysql://XXXXXXXXXXXXXXXX:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : org.mariadb.jdbc.Driver
Metastore connection User: hive
Hive distribution version: 2.1.0
Metastore schema version: 2.1.0
If your metastore schema and Hive distribution versions are not the same, upgrade the schema.
Make sure you have a backup first, just in case. Also remember that there could be changes to the schema that breaks functionality with older Hive distributions so you may not be able to share a matastore between Hive distribution versions.

Related

Apache Hive query on Tez FileNotFoundException

I'm receiving this exception when executing a Hive query on Tez with Hive 2.3.6 and Tez 0.9.2
I know Tez is configured correctly because I can manually run map-reduce jobs via Hadoop.
Dag submit failed due to java.io.FileNotFoundException: Path is not a file: /tmp/hive/root/_tez_session_dir/f4f4b17c-0657-41fa-8674-df83fa3ad362/lib
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:62)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:150)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1829)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:709)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:381)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606)
This error is seen on Hive 2.2+ or Hive 2.3.x+ when
hive.aux.jars.path in hive-site.xml is configured with an invalid path.
or
HIVE_AUX_JARS_PATH environment variable is configured improperly (usually in hive-env.sh)

java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()

I'm using hadoop 3.2.1 and hive 2.3.6.
When I run show databases, it shows the following error
'''
hive> show databases;
Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
'''
What does it mean?And why do i get this error? Please give clarity.
Thanks in Advance.
According to the release page [1] Hive 2.3.3 works with Hadoop 2.x.y (not 3.x.y) so if you want to run with Hadoop 3.2.1 try a newer version.
Other than that the error looks like a classpath problem related with guava. I guess you have one Guava version coming from Hive and another version coming from Hadoop. Try removing one of them. For instance:
cd apache-hive-2.3.3-bin/lib
rm guava*
Even if you solve the problem above most likely you will bump up into another so it is better to choose versions that are compatible.
[1] https://hive.apache.org/downloads.html
Please upgrade to apache-hive-3.1.2 if that is an option for you. I had exact same issue that was resolved by the upgrade. Other option may be would be to compare lib folder of hive 2.3.6 versus hive 3.1.2; this issue is primarily due to incompatible jars.

Hive INSERT OVERWRITE query not writing to hdfs directory:Cannot get DistCp constructor

I have created a view of hbase in hive with 10 miliion rows and when i am running below query ,distcp is invoked and it throws below error.
INSERT OVERWRITE DIRECTORY '/mapred/INPUT' select hive_cdper1.cid,hive_cdper1.emptyp,hive_cdper1.ethtyp,hive_cdper1.gdtyp,hive_cdseg.mrtl from hive_cdper1 join hive_cdseg on hive_cdper1.cnm=hive_cdseg.cnm;
Output:map 100% reduce 100%
2016-10-17 15:05:34,688 INFO [main]: exec.Task (SessionState.java:printInfo(951)) - Moving data to: /mapred/INPUT
from
hdfs://mycluster/mapred/INPUT/.hive-staging_hive_2016-10-17_14-57-48_620_6609613978089243090-1/-ext-10000 2016-10-17 15:05:34,693 INFO [main]: common.FileUtils
(FileUtils.java:copy(551)) - Source is 483335659 bytes. (MAX: 4000000)
2016-10-17 15:05:34,693 INFO [main]: common.FileUtils
(FileUtils.java:copy(552)) - Launch distributed copy (distcp) job.
2016-10-17 15:05:34,695 ERROR [main]: exec.Task
(SessionState.java:printError(960)) - Failed with exception Unable to
move source
hdfs://mycluster/mapred/INPUT/.hive-staging_hive_2016-10-17_14-57-48_620_6609613978089243090-1/-ext-10000 to destination /mapred/INPUT
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move
source
hdfs://mycluster/mapred/INPUT/.hive-staging_hive_2016-10-17_14-57-48_620_6609613978089243090-1/-ext-10000 to destination /mapred/INPUT
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2644)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Cannot get DistCp constructor:
org.apache.hadoop.tools.DistCp.()
at org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1160)
at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2622)
... 21 more
What i wonder here is:i am writing to the same cluster ,then why it is invoking distcp instead of normal cp.
Here i am using hive 1.2.1 with hadoop 2.7.2 and my cluster name is mycluster.
Note:i have tried setting hive.exec.copyfile.maxsize=4000000 but didnt work.
Appreciate your suggestions..
1) check permission of your destination path /mapred/INPUT
2) If write permission is not there for other user, then hadoop fs -chmod a+w /mapred/INPUT
Setting below properties in hive-site.xml solved my issue.
<property>
<name>hive.exec.copyfile.maxsize</name>
<value>3355443200</value>
<description>Maximum file size (in Mb) that Hive uses to do single HDFS copies between directories.Distributed copies (distcp) will be used instead for bigger files so that copies can be done faster.</description>
</property>

Install Spark on existing Hadoop cluster (ISSUE with HIVE)

I am trying to get a Spark/Shark cluster up but keep running into the same problem. I have followed the instructions on https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster and addressed Hive as stated.
Here are the details, any help would be great.
I already installed the following package:
Spark/Shark 1.0.0
Apache Hadoop 2.4.0
Apache Hive 0.13
Scala 2.9.3
Java 7
I configure ~/spark/conf/spark-env.sh as:
export HADOOP_HOME=/path/to/hadoop/
export HIVE_HOME=/path/to/hive/
export MASTER=spark://xxx.xxx.xxx.xxx:7077
export SPARK_HOME=/path/to/spark
export SPARK_MEM=4g
export HIVE_CONF_DIR=/path/to/hive/conf/
source $SPARK_HOME/conf/spark-env.sh
When start spark with "./spark-withinfo", I get the following errors:
-hiveconf hive.root.logger=INFO,console
Starting the Shark Command Line Client
14/07/07 16:26:57 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
14/07/07 16:26:57 [main]: WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
Logging initialized using configuration in jar:file:/path/to/hive/lib/hive-exec-0.13.0.jar!/hive-log4j.properties
14/07/07 16:26:57 [main]: INFO SessionState:
Logging initialized using configuration in jar:file:/path/to/hive/lib/hive-exec-0.13.0.jar!/hive-log4j.properties
14/07/07 16:26:57 [main]: INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:344)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:128)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1139)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:51)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2444)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2456)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:338)
... 2 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1137)
... 7 more
Caused by: java.lang.NoSuchFieldError: METASTOREINTERVAL
at org.apache.hadoop.hive.metastore.RetryingRawStore.init(RetryingRawStore.java:78)
at org.apache.hadoop.hive.metastore.RetryingRawStore.<init>(RetryingRawStore.java:60)
at org.apache.hadoop.hive.metastore.RetryingRawStore.getProxy(RetryingRawStore.java:71)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:413)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:401)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:439)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:325)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:285)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4102)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:121)
... 12 more
I guess Spark can not find some libs ton connect metastore in Hive, but I have been stacked here for a couple days and don't know how to solve it. BTW, I use MYSQL for hive metadata, and everything works well in hive.
Any help is appreciated. Thanks in advance.
You may need to add mysql connector jar file before you start spark...
In my case, I added mysql connector jar like below.
$SPARK_HOME/bin/compute-classpath.sh
CLASSPATH=$CLASSPATH:/opt/big/hive/lib/mysql-connector-java-5.1.25-bin.jar

hive0.10.0 Exception in thread "main" java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B

could u help me? I use hive 0.10.0
hive> show tables;
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B
at org.apache.hadoop.hive.ql.plan.api.Query.setStartedIsSet(Query.java:487)
at org.apache.hadoop.hive.ql.plan.api.Query.setStarted(Query.java:474)
at org.apache.hadoop.hive.ql.QueryPlan.updateCountersInQueryPlan(QueryPlan.java:309)
at org.apache.hadoop.hive.ql.QueryPlan.getQueryPlan(QueryPlan.java:450)
at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:622)
at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1097)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:973)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:893)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
This issue comes because of incompatible "libthrift" jar version. So, I have downloaded the latest libthrift-0.9.3.jar, and it worked for me.
I faced the similar issue. The version of Hive used is not compatible with Hadoop. The thrift version used by hadoop is different from the one used by hive. It good to use the compatible version of Hive or replace the thirft (jar) library used by Hadoop with one used by hive.
When i faced this problem, this was my situation:
In HADOOP_HOME/lib, I placed mahout-examples-0.7-job.jar, which is not supposed to be there for some other excercises.
When I run Hive, it throwed me the same error like in your question.
I moved mahout.X.y.jar from lib, then started hive CLi and it worked like a charm.