Error while reading data in Hive using DBeaver - hive

I am trying to select data using DBeaver with the following query:
Select * from tablename where datecol='';
The above query works fine for most dates, but for one date it throws the error below:
SQL Error [2] [08S01]: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 2, vertexId=vertex_1663507669262_19026_3_00, diagnostics=[Task failed, taskId=task_1663507669262_19026_3_00_000009, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1663507669262_19026_3_00_000009_0:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

Related

HIVE NIFI ORC java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1

I have a pipeline in NiFi and everything works fine, except that sometimes (very rarely) some records go to the retry queue of the PutHiveStreaming processor when trying to write to Hive.
The NiFi logs show the exception:
ERROR [put-hive-streaming-0] o.a.h.h.streaming.AbstractRecordWriter Unable to close org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater[hdfs://hdfscluster/user/hive/metastore/bi_sureyield_db.db/bi_sureyield_events/year=2020/month=1/day=10/delta_127318799_127318808/bucket_00000] due to: Column has wrong number of index entries found: 0 expected: 1
java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.close(OrcRecordUpdater.java:502)
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.closeBatch(AbstractRecordWriter.java:218)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl$6.run(HiveEndPoint.java:998)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl$6.run(HiveEndPoint.java:995)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.closeImpl(HiveEndPoint.java:994)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.markDead(HiveEndPoint.java:760)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commit(HiveEndPoint.java:854)
at org.apache.nifi.util.hive.HiveWriter$4.call(HiveWriter.java:237)
at org.apache.nifi.util.hive.HiveWriter$4.call(HiveWriter.java:234)
at org.apache.nifi.util.hive.HiveWriter.lambda$null$3(HiveWriter.java:373)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.nifi.util.hive.HiveWriter.lambda$callWithTimeout$4(HiveWriter.java:373)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-04-29 14:55:39,542 ERROR [put-hive-streaming-0] o.a.h.h.streaming.AbstractRecordWriter Unable to close org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater[hdfs://hdfscluster/user/hive/metastore/bi_sureyield_db.db/bi_sureyield_events/year=2020/month=1/day=10/delta_127318799_127318808/bucket_00001] due to: null
java.nio.channels.ClosedChannelException: null
at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1521)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.orc.impl.PhysicalFsWriter$BufferedStream.spillToDiskAndClear(PhysicalFsWriter.java:286)
at org.apache.orc.impl.PhysicalFsWriter.finalizeStripe(PhysicalFsWriter.java:337)
at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2665)
at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.close(OrcRecordUpdater.java:502)
at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.closeBatch(AbstractRecordWriter.java:218)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl$6.run(HiveEndPoint.java:998)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl$6.run(HiveEndPoint.java:995)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.closeImpl(HiveEndPoint.java:994)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.markDead(HiveEndPoint.java:760)
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commit(HiveEndPoint.java:854)
at org.apache.nifi.util.hive.HiveWriter$4.call(HiveWriter.java:237)
at org.apache.nifi.util.hive.HiveWriter$4.call(HiveWriter.java:234)
at org.apache.nifi.util.hive.HiveWriter.lambda$null$3(HiveWriter.java:373)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.nifi.util.hive.HiveWriter.lambda$callWithTimeout$4(HiveWriter.java:373)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I'm unable to understand where the issue lies and how to fix it.

Spark SQL query `SHOW VIEWS IN` through Hive metastore fails with `missing 'FUNCTIONS' at 'IN'`

I have Spark (2.4.4) running with a Hive metastore. When accessing it through JDBC/ODBC with a query like
SHOW VIEWS IN space1
I get the following error:
[2020-03-18T10:54:57,722][DEBUG][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.execution.SparkSqlParser][][] Parsing command: SHOW VIEWS IN `space1`
[2020-03-18T10:54:57,733][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error executing query, currentState RUNNING,
org.apache.spark.sql.catalyst.parser.ParseException:
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)
== SQL ==
SHOW VIEWS IN `space1`
-----------^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
[2020-03-18T10:54:57,765][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.catalyst.parser.ParseException:
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)
== SQL ==
SHOW VIEWS IN `space1`
-----------^^^
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269) ~[spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
E.g. I get it when I connect Tableau to Spark, or when I fire the query explicitly via a JDBC-connected SQL tool.
Any ideas?
Note that a query like
SELECT * FROM `employer` WHERE `Name` IN ('John','Alex');
finishes without a problem!
Somebody else has also had this problem but got no response: https://community.powerbi.com/t5/Desktop/Spark-connector-issue/td-p/952481
The SHOW VIEWS command only works since Spark 3; that is why you are seeing this error.
See: https://issues.apache.org/jira/browse/SPARK-31113
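Until an upgrade is possible, here is a minimal workaround sketch for Spark 2.x, assuming the database `space1` from the question (`my_view` is a placeholder name): views are returned by SHOW TABLES alongside regular tables, and DESCRIBE EXTENDED reveals which entries are views.
-- Spark 2.x has no SHOW VIEWS; views appear in SHOW TABLES output
SHOW TABLES IN `space1`;
-- for any entry, DESCRIBE EXTENDED includes a Type row ("VIEW" for views)
DESCRIBE EXTENDED `space1`.`my_view`;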

Failure while running hive query - Map operator initialization failed - OOM

I am executing a Hive query that joins 3 tables, but I am getting the error below. Please let me know what this error means and how I can resolve it.
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1535088835132_0435_1_02, diagnostics=[Task failed, taskId=task_1535088835132_0435_1_02_000028, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
at org.apache.hadoop.io.Text.set(Text.java:224)
at org.apache.hadoop.io.Text.set(Text.java:214)
at org.apache.hadoop.io.Text.<init>(Text.java:93)
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:311)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:346)
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.read(MapJoinKeyObject.java:112)
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey.read(MapJoinKey.java:68)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:127)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
at java.util.Arrays.hashCode(Arrays.java:4146)
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
at java.util.HashMap.hash(HashMap.java:339)
at java.util.HashMap.put(HashMap.java:612)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.math.BigDecimal.valueOf(BigDecimal.java:1202)
at java.math.BigDecimal.createAndStripZerosToMatchScale(BigDecimal.java:4401)
at java.math.BigDecimal.stripTrailingZeros(BigDecimal.java:2598)
at org.apache.hadoop.hive.common.type.HiveDecimal.trim(HiveDecimal.java:238)
at org.apache.hadoop.hive.common.type.HiveDecimal.normalize(HiveDecimal.java:252)
at org.apache.hadoop.hive.common.type.HiveDecimal.create(HiveDecimal.java:71)
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
at java.util.Arrays.hashCode(Arrays.java:4146)
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
at java.util.HashMap.hash(HashMap.java:339)
at java.util.HashMap.put(HashMap.java:612)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.common.type.HiveDecimal.create(HiveDecimal.java:71)
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
at java.util.Arrays.hashCode(Arrays.java:4146)
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
at java.util.HashMap.hash(HashMap.java:339)
at java.util.HashMap.put(HashMap.java:612)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:59, Vertex vertex_1535088835132_0435_1_02 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1535088835132_0435_1_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:51, Vertex vertex_1535088835132_0435_1_03 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
According to the log provided, you are running a map join on Tez and it failed while loading the hash table; the underlying exception is java.lang.OutOfMemoryError: GC overhead limit exceeded.
Try increasing the mapper container size and the Java heap size. Check your current configuration and increase it accordingly. This is only an example:
set hive.tez.container.size=9216;
set hive.tez.java.opts=-Xmx6144m;
A good tuning guide explaining these configuration parameters can be found here: Demystify Apache Tez Memory Tuning - Step by Step
If nothing helps, the small-side table may simply be too big to fit in memory; consider decreasing hive.auto.convert.join.noconditionaltask.size to force the query to use a shuffle join for a table of that size, as in the sketch below.
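For concreteness, a minimal sketch of that fallback; the threshold value is illustrative only and should be tuned to your actual table sizes:
-- lower the threshold so the large table no longer qualifies for a map join
set hive.auto.convert.join.noconditionaltask.size=10000000;
-- or disable automatic map-join conversion entirely and always use a shuffle join
set hive.auto.convert.join=false;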

ERROR while running a simple count(*) using hive2

I get
Exception in thread "main" java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:349)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:251)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:434)
at Hive.main(Hive.java:21)
I have added hive-serde-2.1.1.jar in IntelliJ. If I run select * from <table> it gives me results, but if I run select count(*) from <table> I get the above error. Can anyone help me with this?
UPDATE: I get this stack trace from the YARN GUI:
Application application_1499656878152_0003 failed 2 times due to Error launching appattempt_1499656878152_0003_000002. Got exception: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
. Failing the application.

Beeline unable to create external hbase table but hive cli can

I have an HBase 1.2.3 cluster and have installed Hive 2.1.1. When I try to create an external HBase table through Beeline/HiveServer2, I get an exception, but it works fine from the Hive CLI. The create statement is the following:
create external table hbase_xing(id int, name string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ("hbase.columns.mapping" = ":key,f:name")
tblproperties("hbase.table.name" = "xing");
Exception is:
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Fri Jan 06 19:38:24 CST 2017, null, java.net.SocketTimeoutException: callTimeout=120000, callDuration=128483: row 'xing,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=c3.estarspace.com,16020,1483694415877, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:223)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:303)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:313)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:205)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:742)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy24.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:830)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:845)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1166)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:347)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: callTimeout=120000, callDuration=128483: row 'xing,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=c3.estarspace.com,16020,1483694415877, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
... 3 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to c3.estarspace.com/192.168.0.13:16020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to c3.estarspace.com/192.168.0.13:16020 is closing. Call id=12, waitTime=3
at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1239)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1210)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 4 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to c3.estarspace.com/192.168.0.13:16020 is closing. Call id=12, waitTime=3
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1037)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:844)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:572)
) (state=08S01,code=1)
I put hbase.zookeeper.quorum into hive-site.xml.
Tracing the source code, it looks like something goes wrong when Hive invokes the HBase client to reach the region server.
This problem only seems to exist in HBase 1.2.3; after upgrading to HBase 1.2.4, Hive works correctly.
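For reference, a hedged sketch of the client-side ZooKeeper settings involved; the hostnames and port are placeholders, and whether a session-level set is honored depends on your HiveServer2 restricted-parameter configuration:
-- point the HBase storage handler at the right ZooKeeper ensemble
set hbase.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;
set hbase.zookeeper.property.clientPort=2181;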