I am using Hive 2 with Tez. When I run a query, it fails with the execution error shown below.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
ERROR [432a4475-d246-4596-ad4c-54de6fea86c8 main] exec.Task: Failed to execute tez graph.
java.lang.IllegalArgumentException: Can not create a Path from an empty string
You have to put the Tez tarball into HDFS (e.g. /user/hadoop/tez) and also set that path as tez.lib.uris in tez-site.xml (tez/conf/tez-site.xml).
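A minimal sketch of those two steps; the tarball name, the tez-dist/target/ build-output location, and the /user/hadoop/tez HDFS directory are examples rather than required values:

# upload the Tez tarball produced by the build to HDFS
hdfs dfs -mkdir -p /user/hadoop/tez
hdfs dfs -put tez-dist/target/tez-x.y.z.tar.gz /user/hadoop/tez/

<!-- tez/conf/tez-site.xml -->
<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/user/hadoop/tez/tez-x.y.z.tar.gz</value>
</property>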
Related question. Environment:
Hadoop 3.3.5
Hive 3.1.3
Tez 0.10.2
I followed the instructions at this link to build Tez 0.10.2 for Hadoop 3.3.5: https://tez.apache.org/install.html
The database is stored in an S3 bucket, and I am able to run 'select count(*) from m1.t1' with hive.execution.engine=mr.
When I set hive.execution.engine=tez and run the same query, I get this error immediately:
2023-02-15T21:21:09,208 INFO [a6e2cd1a-b2c9-42d8-9568-8e0b64677f77 main] client.TezClient: App did not succeed. Diagnostics: Application application_1676506240754_0019 failed 2 times due to AM Container for appattempt_1676506240754_0019_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2023-02-15 21:21:08.730]Exception from container-launch.
Container id: container_1676506240754_0019_02_000001
Exit code: 1
[2023-02-15 21:21:08.732]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster
[2023-02-15 21:21:08.733]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster
If I set tez.use.cluster.hadoop-libs to true in tez-site.xml, the YARN application starts but fails with an AWS credential error, even though I have set the fs.s3a credentials in Hadoop's core-site.xml, Hive's hive-site.xml, and as environment variables in .bashrc.
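For reference, the flag being toggled is the standard Tez property and sits in tez-site.xml like this:

<property>
  <name>tez.use.cluster.hadoop-libs</name>
  <value>true</value>
</property>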
The keys are masked; these are samples only:
echo $AWS_ACCESS_KEY_ID
I9U996400005XXXXXXXX
echo $AWS_SECRET_KEY
mPY8GiU6NegNWoVnaODXXXXXXXXXXXXXXXXXXXX
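The corresponding core-site.xml entries use the standard fs.s3a property names and, in this setup, presumably look along these lines (values masked the same way):

<property>
  <name>fs.s3a.access.key</name>
  <value>I9U996400005XXXXXXXX</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>mPY8GiU6NegNWoVnaODXXXXXXXXXXXXXXXXXXXX</value>
</property>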
hive> set hive.execution.engine=tez;
hive> select count(*) from m1.t1;
Query ID = hdp-user_20230215210146_62ed9fab-5d4a-42a9-bf54-5fb6f84a9048
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1676506240754_0015)
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 container INITIALIZING -1 0 0 -1 0 0
Reducer 2 container INITED 1 0 0 1 0 0
----------------------------------------------------------------------------------------------
VERTICES: 00/02 [>>--------------------------] 0% ELAPSED TIME: 2.03 s
----------------------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1676506240754_0015_3_00, diagnostics=[Vertex vertex_1676506240754_0015_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t1 initializer failed, vertex=vertex_1676506240754_0015_3_00 [Map 1], java.nio.file.AccessDeniedException: s3a://hadoop-cluster/warehouse/tablespace/managed/hive/m1.db/t1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
I tried adding all of the fs.s3a properties from core-site.xml to tez-site.xml, and setting fs.s3a.access.key and fs.s3a.secret.key inside the Hive session, but I still get the same error.
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
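For clarity, the in-session attempt was along these lines (keys masked); per the update further down, these values apparently never reach the Tez tasks because the fs.s3a credential keys sit on Hive's hidden-config list by default:

hive> set fs.s3a.access.key=I9U996400005XXXXXXXX;
hive> set fs.s3a.secret.key=mPY8GiU6NegNWoVnaODXXXXXXXXXXXXXXXXXXXX;
hive> select count(*) from m1.t1;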
Question: according to the Tez install instructions,
"Ensure tez.use.cluster.hadoop-libs is not set in tez-site.xml, or if it is set, the value should be false."
But when it is set to false, Tez cannot run at all.
When it is set to true, I get the AWS credential error even though I have set the credentials in every possible location and environment variable.
==========================================================
Update:
Not sure if this is the right answer to the problem, but I finally got it working by adding this property to hive-site.xml:
<property>
  <name>hive.conf.hidden.list</name>
  <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password,fs.s3a.proxy.password,dfs.adls.oauth2.credential,fs.adl.oauth2.credential</value>
</property>
By default, all fs.s3a credential settings are hidden configs even if you don't set this property. I explicitly added the property and removed every fs.s3a credential-related entry from its value.
Now I can run select count(*) with Tez.
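One hedged way to sanity-check the change, assuming the new hive-site.xml has been picked up, is to ask Hive to echo the property back inside a session; once the key is off the hidden list, its configured value should be printed rather than stripped:

hive> set fs.s3a.access.key;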
I use HDP 3.1, and I use Ambari to deploy the Hadoop cluster and Hive. I want to use only one user (hdfs) to run all of the programs (Hadoop, Hive, Sqoop, YARN, ...), so I changed all of the users to hdfs in the Set Accounts step when deploying the Hadoop cluster in Ambari. After deployment, I ran Sqoop to import data from MySQL into Hive and hit the following issue.
19/02/20 18:44:13 INFO hive.HiveImport: ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.hadoop.hive.ql.metadata.HiveException: Load Data failed for hdfs://datacenter1:8020/user/hdfs/person/part-m-00000 as the file is not owned by hive and load data is also not ran as hive
19/02/20 18:44:13 INFO hive.HiveImport: INFO : Completed executing command(queryId=hdfs_20190220184412_d61d8591-04fc-41a7-b412-d64935ddd046); Time taken: 0.235 seconds
19/02/20 18:44:13 INFO hive.HiveImport: Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.hadoop.hive.ql.metadata.HiveException: Load Data failed for hdfs://datacenter1:8020/user/hdfs/person/part-m-00000 as the file is not owned by hive and load data is also not ran as hive (state=08S01,code=1)
19/02/20 18:44:13 INFO hive.HiveImport: Closing: 0: jdbc:hive2://datacenter2:2181,datacenter1:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/20 18:44:13 ERROR tool.ImportTool: Import failed: java.io.IOException: Hive exited with status 2
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:299)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:234)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:558)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:656)
at org.apache.sqoop.Sqoop.run(Sqoop.java:150)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:186)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:240)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:249)
at org.apache.sqoop.Sqoop.main(Sqoop.java:258)
This issue happened in the reduce step. I don't know why it needs the hive user. Does anyone know how to resolve it?
Change the Hive config in hive-site.xml:
change the value of hive.load.data.owner from hive to nifi.
Restart all Hive services and check again.
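A sketch of that hive-site.xml change; nifi is the owning user in my setup, so substitute whichever user actually owns the staged files (hdfs in the question above):

<property>
  <name>hive.load.data.owner</name>
  <value>nifi</value>
</property>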
Stack: Ambari 2.4.2.0, HDP 2.5.3.0, CentOS 6.8, FreeIPA 3.0.0
When I try to use the hdp user to submit a job on YARN, the _000001 container is created and launched successfully, but I get an error when the _000002 container is launched after being created:
2018-11-27 22:13:35,919 WARN privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(170)) - Shell execution returned exit code: 255. Privileged Execution Operation Output:
main : command provided 1
main : run as user is hdp
main : requested yarn user is hdp
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Full command array for failed execution:
[/usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor, hdp, hdp, 1, application_1543327888220_0001, container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/usercache/hdp/appcache/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/launch_container.sh, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.tokens, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid, /hadoop/yarn/local, /hadoop/yarn/log, cgroups=none]
2018-11-27 22:13:35,921 WARN runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(107)) - Launch container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:175)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:103)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
... 9 more
There is no further log output about the privileged operation. Does anybody have an idea?
Thanks in advance!
Problem resolved: the issue was the submitted job itself, not YARN or the privileged container-executor.
My suggestion is to look for the details in the container logs rather than the ResourceManager/NodeManager logs.
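For example, the aggregated logs for the failed application can usually be pulled with the standard YARN CLI once the application has finished (the application id is the one from the log above):

yarn logs -applicationId application_1543327888220_0001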
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Failed to create database 'metastore_db', see the next exception for details.
SQL Error code: 40000
Use --verbose for detailed stacktrace.
* schemaTool failed *
FYI,
please check the permissions on the Hive installation directory.
The Hive installation directory should be owned by the same user that runs Hadoop.
That's how it worked for me.
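A hedged illustration of that check and fix; the install path and the user/group names below are examples, not values taken from the question:

# inspect who owns the Hive install directory (path is an example)
ls -ld /usr/local/hive

# hand ownership to the user that runs Hadoop (user/group are examples)
sudo chown -R hadoop:hadoop /usr/local/hive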
I am trying to run a Pig script in local mode on a single-node cluster, as shown below.
hduser@ubuntu:~$ pig -x local -f "/home/hduser/ddsoft/pigscript/FirstUDF.pig"
But I am getting the error below.
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file
'/home/hduser/ddsoft/hive-0.13.1-bin/hcatalog/share/hcatalog/hcatalog-core-0.13.1.jar'
does not exist.
How do I register the jar file mentioned in the error message? I tried updating .bashrc, but it didn't fix the error.
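For context, a reference like this usually comes from a REGISTER statement (or an equivalent classpath setting) inside the script; purely as an illustration, a REGISTER line pointing at the path from the error would look like the following, and it fails because no jar exists at that location on this machine:

REGISTER '/home/hduser/ddsoft/hive-0.13.1-bin/hcatalog/share/hcatalog/hcatalog-core-0.13.1.jar';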