Beeline unable to create external hbase table but hive cli can - hive

I have hbase 1.2.3 cluster, and install hive 2.1.1. When I try to create external hbase table through beeline/hiveserver2, I got exception. But if I use hive cli, it is ok. The create statement is as following:
create external table hbase_xing(id int, name string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ("hbase.columns.mapping" = ":key,f:name")
tblproperties("hbase.table.name" = "xing");
Exception is:
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Fri Jan 06 19:38:24 CST 2017, null, java.net.SocketTimeoutException: callTimeout=120000, callDuration=128483: row 'xing,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=c3.estarspace.com,16020,1483694415877, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:223)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:303)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:313)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:205)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:742)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy24.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:830)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:845)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1166)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:347)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: callTimeout=120000, callDuration=128483: row 'xing,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=c3.estarspace.com,16020,1483694415877, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
... 3 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to c3.estarspace.com/192.168.0.13:16020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to c3.estarspace.com/192.168.0.13:16020 is closing. Call id=12, waitTime=3
at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1239)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1210)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 4 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to c3.estarspace.com/192.168.0.13:16020 is closing. Call id=12, waitTime=3
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1037)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:844)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:572)
) (state=08S01,code=1)
I put hbase.zookeeper.quorum into the hive-site.xml.

Track the source code, It looks like something wrong when HIVE to call HBASE Client to call the region server.
This problem seems only exist in HBase 1.2.3. Upgrade to HBase 1.2.4, HIVE function corrently.

Related

Presto - Ranger Issue with Hive Connector

We have a Presto(Version - 323-E.8) connector with Ranger enabled CDP Hive3 cluster where I'm able to run the select query on existing Hive ORC foramatted tables but couldn't create or delete any views on Hive metastore. It's throwing permissions issue error and my admin has granted all the permissions to the user from Ranger & AD and I'm able to perform all the operations from beeline with same user on the server.
Hive Properties:
*connector.name=hive-hadoop2
hive.metastore.uri=thrift://XXXXX
hive.views-execution.enabled=true
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/_HOST#XXXX
hive.metastore.client.principal=XXXXX
hive.metastore.client.keytab=/abc/xxxx.keytab
hive.hdfs.wire-encryption.enabled=false
hive.metastore.thrift.impersonation.enabled=true
hive.config.resources=/etc/cdp/core-site.xml,/etc/cdp/hdfs-site.xml,/etc/cdp/hive-site.xml
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.presto.principal=hdfs/_HOST#XXXXX
hive.hdfs.presto.principal=XXXX
hive.hdfs.presto.keytab=/abc/xxxx.keytab
hive.security=ranger
ranger.policy-rest-url=https://XXXXX:6182
ranger.service-name=cm_hive
ranger.authentication-type=KERBEROS
ranger.kerberos-principal=XXXX
ranger.kerberos-keytab=/abc/xxxx.keytab
ranger.plugin-policy-ssl-config-file=/abc/ssl-client.xml*
Error:
io.prestosql.spi.PrestoException: Operation type CREATE_VIEW not allowed for user:XXXXX
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.createTable(ThriftHiveMetastore.java:1036)
at io.prestosql.plugin.hive.metastore.thrift.BridgingHiveMetastore.createTable(BridgingHiveMetastore.java:184)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.createTable(CachingHiveMetastore.java:524)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.createTable(CachingHiveMetastore.java:524)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore$CreateTableOperation.run(SemiTransactionalHiveMetastore.java:2692)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore$Committer.executeAddTableOperations(SemiTransactionalHiveMetastore.java:1668)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore$Committer.access$1000(SemiTransactionalHiveMetastore.java:1282)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.commitShared(SemiTransactionalHiveMetastore.java:1225)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:991)
at io.prestosql.plugin.hive.HiveMetadata.commit(HiveMetadata.java:2408)
at io.prestosql.plugin.hive.HiveConnector.commit(HiveConnector.java:202)
at io.prestosql.transaction.InMemoryTransactionManager$TransactionMetadata$ConnectorTransactionMetadata.commit(InMemoryTransactionManager.java:595)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Operation type CREATE_VIEW not allowed for user:XXXXX
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result$create_table_resultStandardScheme.read(ThriftHiveMetastore.java:52658)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result$create_table_resultStandardScheme.read(ThriftHiveMetastore.java:52626)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result.read(ThriftHiveMetastore.java:52552)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table(ThriftHiveMetastore.java:1490)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table(ThriftHiveMetastore.java:1477)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.prestosql.plugin.base.util.LoggingInvocationHandler.handleInvocation(LoggingInvocationHandler.java:60)
at com.google.common.reflect.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:86)
at com.sun.proxy.$Proxy370.create_table(Unknown Source)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastoreClient.createTable(ThriftHiveMetastoreClient.java:161)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.lambda$createTable$51(ThriftHiveMetastore.java:1024)
at io.prestosql.plugin.hive.metastore.thrift.ThriftMetastoreApiStats.lambda$wrap$0(ThriftMetastoreApiStats.java:42)
at io.prestosql.plugin.hive.util.RetryDriver.run(RetryDriver.java:130)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.createTable(ThriftHiveMetastore.java:1022)
... 19 more

Spark SQL query `SHOW VIEWS IN` through Hive metastore fails with `missing 'FUNCTIONS' at 'IN'`

Have Spark (2.4.4) with a Hive metastore running. When accessing it through JDBC/ODBC with a query like
SHOW VIEWS IN space1
i get following error:
[2020-03-18T10:54:57,722][DEBUG][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.execution.SparkSqlParser][][] Parsing command: SHOW VIEWS IN `space1`
[2020-03-18T10:54:57,733][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error executing query, currentState RUNNING,
org.apache.spark.sql.catalyst.parser.ParseException:
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)
== SQL ==
SHOW VIEWS IN `space1`
-----------^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69) ~[spark-catalyst_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) ~[spark-sql_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
[2020-03-18T10:54:57,765][ERROR][HiveServer2-Background-Pool: Thread-203][org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation][][] Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.catalyst.parser.ParseException:
missing 'FUNCTIONS' at 'IN'(line 1, pos 11)
== SQL ==
SHOW VIEWS IN `space1`
-----------^^^
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269) ~[spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_201]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:?]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) [spark-hive-thriftserver_2.11-2.4.4.jar:2.4.4]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_201]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
E.g. i get it when i connect Tableau to my Spark, or i can fire the query explicitly via a JDBC connected SQL tool.
Any idea's ?
Note, a query like
SELECT * FROM `employer` WHERE `Name` IN ('John','Alex');
finishes without a problem!
Also somebody else had this problem before but got no response: https://community.powerbi.com/t5/Desktop/Spark-connector-issue/td-p/952481
SHOW VIEWS command only works since Spark 3. That is why you are seeing that error.
See: https://issues.apache.org/jira/browse/SPARK-31113

Why do I get "File could only be replicated to 0 nodes" when writing to a partitioned table?

I create an external table in Hive with partitions and then try to populate it from the existing table, however, I get the following exceptions:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/pavel.db/browserdatapart/.hive-staging_hive_2018-12-28_13-22-45_751_6056004898772238481-1/_task_tmp.-ext-10000/cityid=1/_tmp.000001_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:814)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:133)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
... 18 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/pavel.db/browserdatapart/.hive-staging_hive_2018-12-28_13-22-45_751_6056004898772238481-1/_task_tmp.-ext-10000/cityid=1/_tmp.000001_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:459)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:290)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:202)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:184)
at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1580)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1375)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:
According to the internet these exceptions occur when datanode can't communicate with namenode or when you are running low on memory, but in my case everything is fine. I have already tried formatting my namenode and datanode as well. What else could be the issue?
https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo I've also read this. And it didn't help me.
I am running on tez. this works:
insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata limit 100;
And this fails with the exception I provided:
insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata;
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;
Setting the above parameters solved it for me. I guess hive was not able to replicate data to those partitions that show up in the exception, because there were more than the maximum (which is 224 in the case of my dataset).

Nifi putSQL processor raises exception on HIVE on simple insert

I'm feeding the putSQL processor using flowfiles like this one:
insert into test_nifi values ( '1476781027812');
I also tried using the version without the final ';' results are the same.
2016-10-18 10:49:58,858 ERROR [Timer-Driven Process Thread-4] o.apache.nifi.processors.standard.PutSQL PutSQL[id=d3103678-0157-1000-0000-000036cdfdbc] Failed to update database for [StandardFlow
FileRecord[uuid=b0e562b4-e974-4262-9b0b-c968f3488da4,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1476666309626-4, container=default, section=4], offset=332174, length=48],
offset=0,name=2479999684241677.avro,size=48]] due to java.sql.SQLException: Method not supported; it is possible that retrying the operation will succeed, so routing to retry: java.sql.SQLExcept
ion: Method not supported
2016-10-18 10:49:58,860 ERROR [Timer-Driven Process Thread-4] o.apache.nifi.processors.standard.PutSQL
java.sql.SQLException: Method not supported
at org.apache.hive.jdbc.HiveConnection.commit(HiveConnection.java:614) ~[na:na]
at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334) ~[na:na]
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211) ~[na:na]
at org.apache.nifi.processors.standard.PutSQL.onTrigger(PutSQL.java:371) ~[nifi-standard-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
SelectHiveQL processor works with no problem.
If putSQL it's broken ( at least on hive ) how can I execute SQL code which does not returns data ?
I'm connecting to Apache Hive (version 1.1.0-cdh5.5.2)
The table where I'm inserting the data is defined as :
CREATE TABLE `test_nifi`(
`value` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://hdfs-prod/user/hive/warehouse/unifieddata.db/test_nifi';
inserts from belinee are working.
I solved it by myself.... the correct processor is PutHiveQL :P

Hive error : java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

I have hive query which run successful sometimes but maximum time gives an error "java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
Below is my error log
java.lang.RuntimeException: java.io.IOException: Couldn't create proxy
provider class org.apache.hadoop.hdfs.server.namenode.ha.Con\
figuredFailoverProxyProvider at
org.apache.hadoop.mapred.lib.CombineFileInputFormat.isSplitable(CombineFileInputFormat.java:154)
at
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:283)
at
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:239)
at
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:75)
at
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:336)
at
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:302)
at
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:435)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:525)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:517)
at
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:399)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295) at
org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:415) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292) at
org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:564) at
org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:559) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:415) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:559)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:550)
at
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
at
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516) at
org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283) at
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101) at
org.apache.hadoop.hive.ql.Driver.run(Driver.java:924) at
org.apache.hadoop.hive.ql.Driver.run(Driver.java:914) at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
at
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
at
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
at
org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
at
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) at
org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606) at
org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by:
java.io.IOException: Couldn't create proxy provider class
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverPr\
oxyProvider at
org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:475)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:148)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:632) at
org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:570) at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:147)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367) at
org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169) at
org.apache.hadoop.mapred.lib.CombineFileInputFormat.isSplitable(CombineFileInputFormat.java:151)
... 45 more Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor32.newInstance(Unknown
Source) at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:458)
... 53 more Caused by: java.lang.OutOfMemoryError: GC overhead limit
exceeded at java.util.Arrays.copyOf(Arrays.java:2219) at
java.util.ArrayList.grow(ArrayList.java:242) at
java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216) at
java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208) at
java.util.ArrayList.add(ArrayList.java:440) at
java.lang.String.split(String.java:2288) at
sun.net.util.IPAddressUtil.textToNumericFormatV4(IPAddressUtil.java:47)
at java.net.InetAddress.getAllByName(InetAddress.java:1129) at
java.net.InetAddress.getAllByName(InetAddress.java:1098) at
java.net.InetAddress.getByName(InetAddress.java:1048) at
org.apache.hadoop.security.SecurityUtil$StandardHostResolver.getByName(SecurityUtil.java:474)
at
org.apache.hadoop.security.SecurityUtil.getByName(SecurityUtil.java:461)
at
org.apache.hadoop.net.NetUtils.createSocketAddrForHost(NetUtils.java:235)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:215)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
at
org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
at
org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628) at
org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
at
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider.(ConfiguredFailoverProxyProvider.java:88)
at sun.reflect.GeneratedConstructorAccessor32.newInstance(Unknown
Source) at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.hdfs.NameNodeProxies.createFailoverProxyProvider(NameNodeProxies.java:458)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:148)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:632) at
org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:570) at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:147)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367) at
org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169) Job
Submission failed with exception
'java.lang.RuntimeException(java.io.IOException: Couldn't create proxy
provider class org.apac\
he.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider)'
Could anyone tell me why this happen?
I just stumbled across a similar exception myself, and increasing the hive client heap didn't help. I found I was able to clear up the OutOfMemory GC Overhead exception by adding a partition column to the where clause of the query, so I've concluded that having a very large number of splits is causing this exception. I haven't dug into the code, but I believe I've seen this happen with string concatenation in a loop triggering gc thrashing, and something similar might be happening in the CombineHiveInputFormat.getSplits method.