Load local data files into Hive table failed when using Hive

When I tried to load local data files into a Hive table, it reported an error while moving the files. I found the link below, which gives steps to fix this issue. I followed those steps, but it still doesn't work.
http://answers.mapr.com/questions/3565/getting-started-with-hive-load-the-data-from-sample-table-txt-into-the-table-fails
After mkdir /user/hive/tmp and setting hive.exec.scratchdir=/user/hive/tmp, it still reports RuntimeException Cannot make directory: file/user/hive/tmp/hive_2013*. How can I fix this issue? Can anyone familiar with Hive help? Thanks!
Hive version is 0.10.0.
Hadoop version is 1.1.2.

I suspect a permission issue here, because you are using the MapR distribution.
Make sure that the user trying to create the directory has permissions to create the directory on CLDB.
An easy way to debug here is to do
hadoop fs -chmod -R 777 /user/hive
and then try to load the data again, to confirm whether it's a permission issue.
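For example, a minimal sequence you could try (the scratch directory is the one from your question; the file path and table name below are placeholders, not from the thread):
hadoop fs -mkdir /user/hive/tmp
hadoop fs -chmod -R 777 /user/hive
# the file path and table name are placeholders; substitute your own
hive -e "SET hive.exec.scratchdir=/user/hive/tmp; LOAD DATA LOCAL INPATH '/path/to/sample.txt' INTO TABLE my_table;"
If the load succeeds after the chmod, it was indeed a permission problem and you can tighten the mode again afterwards.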

Related

Alluxio + Hive on EMR

I have Alluxio 1.8 installed on an EMR 5.19.0 cluster, and can see my S3 tables using /usr/local/alluxio/bin/alluxio fs ls /.
However, when I start up hive and issue
hive> [[DDL w/ LOCATION = alluxio://master_host:19998/my_table ]], I get the following:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found
Is there a way of getting past this? I've tried starting hive with --auxpath pointing to both /usr/local/alluxio/client/alluxio-1.8.1-client.jar and a copy of the jar on hdfs without any success.
Any help?
I posted a blog post about the reasons for the error message java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found. Here are some tips; I hope they help:
For Hive, set the environment variable HIVE_AUX_JARS_PATH in conf/hive-env.sh:
export HIVE_AUX_JARS_PATH=/<PATH_TO_ALLUXIO>/client/alluxio-1.8.1-client.jar:${HIVE_AUX_JARS_PATH}
which I guess is equivalent to what you have done to set --auxpath.
Depending on your Hive setup (e.g., Hive on MR, Spark, or Tez), you may also need to make sure the runtime can access the client jar. Taking Hive on MR as an example, you probably also need to append the path of the Alluxio client jar to mapreduce.application.classpath or yarn.application.classpath so that each task of the MR jobs can access this jar, as in the sketch below.
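For Hive on MR, a sketch of that change in mapred-site.xml might look like this (the first two entries stand in for whatever your existing classpath value already contains; the jar path is the one mentioned in your question):
<property>
  <name>mapreduce.application.classpath</name>
  <!-- keep your existing entries and append the Alluxio client jar -->
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,/usr/local/alluxio/client/alluxio-1.8.1-client.jar</value>
</property>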

Failed to create database 'metastore_db', see the next exception for details

I'm getting the following exception while trying to start Hive on Ubuntu 14.04 LTS: Caused by: java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details. The Hadoop installation is correct and working fine. Can anyone tell me what the problem is?
It is because you're not in the same folder where you created your metastore. I was facing the same problem because I was in my main user's folder; when I changed from the main user's folder to hduser's, my Hive started working.
See the mistake: I tried to find the xml file, but it was not there, so I searched and found where it actually was.
Similar to #dk14: in my case, I was in a folder on which I had no permission to write as my user; I moved to another directory and it worked fine.
The reason for the above error is that the user you are logged in as doesn't have permission to write in that particular directory, i.e. the directory from which you are running the schematool command.
For example, my Apache Hive setup was in /opt/apache-hive-3.1.2-bin, so I ran the command:
sudo chown -R hadoopusr /opt/apache-hive-3.1.2-bin/
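After fixing ownership, re-running the schema initialization from that directory should work; a minimal sketch, assuming the default embedded Derby metastore implied by the error above:
cd /opt/apache-hive-3.1.2-bin
# -dbType derby assumes the embedded Derby metastore; adjust if you use MySQL/Postgres
bin/schematool -dbType derby -initSchema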
It is happening because you are in a different folder from the one where Hive is installed.
So first of all, change directory to the folder where your Hive is installed, and after that try to run hive once again.
Hive should then work properly.
Best of luck.
After spending some (a lot of) time, I found that the issue was with creating the metastore_db directory: it already existed inside the DERBY_HOME/bin path and I didn't have admin access to it. You can either:
delete that folder using admin rights, or
open hive-site.xml (inside the HIVE_HOME/conf path) in an editor, check the connection string there, and change the database name to something else; that worked for me.
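For reference, the connection string is the javax.jdo.option.ConnectionURL property; a sketch of pointing it at a different database name (metastore_db2 is just an example name):
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <!-- "metastore_db2" is an example; any new name or an absolute path works -->
  <value>jdbc:derby:;databaseName=metastore_db2;create=true</value>
</property>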

Apache oozie sharedlib is showing a blank list

I'm relatively new to Apache Oozie and did an installation on Ubuntu 14.04 with Hadoop 2.6.0 and JDK 1.8. I was able to install Oozie, and the web console is visible at port 11000 of my server.
Now, when I copied the examples bundled with Oozie and tried to run them, I ran into an error which says no sharelib exists.
I installed the sharelib as below:
bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310
(my namenode is running on localhost 54310 and JT on localhost 54311)
hadoop fs -ls /user/hduser/share/lib shows the shared library created as per the oozie-site.xml file. However, when I check the shared library using the command
oozie admin -oozie http://localhost:11000/oozie -shareliblist
the list is blank, and jobs are also failing for the same reason.
Any clues on how I should approach this problem?
Thanks.
The sharelib create command looks fine.
If you haven't done so already, copy the core-site.xml from your Hadoop installation folder into $OOZIE_HOME/conf/hadoop-conf/.
There might already be a "placeholder" core-site.xml in the hadoop-conf folder; delete or rename that one. Oozie doesn't get its Hadoop configuration directly from your Hadoop install (as Hive does, for example) but from the core-site.xml you place in that hadoop-conf folder. A sketch of these steps follows below.
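A sketch of those steps, assuming a tarball Hadoop install with its config under $HADOOP_HOME/etc/hadoop (adjust the paths to your layout):
# set aside the placeholder and copy in the real core-site.xml
mv $OOZIE_HOME/conf/hadoop-conf/core-site.xml $OOZIE_HOME/conf/hadoop-conf/core-site.xml.bak
cp $HADOOP_HOME/etc/hadoop/core-site.xml $OOZIE_HOME/conf/hadoop-conf/
# restart Oozie so it picks up the new configuration
$OOZIE_HOME/bin/oozied.sh stop
$OOZIE_HOME/bin/oozied.sh start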
Okay, I got a solution for this.
When I was creating the sharelib directory, it was being created on HDFS, but while running the job a local path was being referred to. So I extracted the oozie-sharedlib tar.gz file into my local /user/hduser/share/lib directory, and it's working now.
But I did not figure out the reason, so it's still an open question.
I encountered the same issue, and it turned out that Oozie was not able to communicate with HDFS because it could not find the location of core-site.xml (or any other Hadoop configuration), which has to be declared inside oozie-site.xml.
The corresponding property in oozie-site.xml is oozie.service.HadoopAccessorService.hadoop.configurations.
This property was defined wrongly in my case. I changed it to point to where my Hadoop configuration XMLs are present, and then Oozie started communicating with HDFS and was able to locate the sharelib on HDFS.
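For illustration, the property looks roughly like this in oozie-site.xml (the Hadoop config path is an example; point it at the directory that actually holds your core-site.xml):
<property>
  <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
  <!-- "*" applies this Hadoop conf dir to all NameNode/JobTracker addresses; the path is an example -->
  <value>*=/usr/local/hadoop/etc/hadoop</value>
</property>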

Setting permissions for cloudera hadoop

I installed Cloudera Hadoop 4 on a cluster of about 20 nodes. Using Cloudera Manager it went really smoothly, but when I want to create an input directory using hadoop fs -mkdir input I get the following error: mkdir: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x Looks like a classic wrong-permissions case, but I have no clue where to start to fix this.
I found this document, which I think would solve my problem if I knew what to do with it. For starters, I don't know whether I am using MapReduce v1 or v2 (I don't see any YARN service in my Cloudera Manager, so my guess would be v1?). Second, since the whole installation was automatic, I don't know what is installed and where.
Could anyone point me towards some easy steps to solve my problem? I'm really looking for the easiest solution here, I don't care at all about security since it is only a test. If I could give all users all possible permissions that would be fine.
I solved my problem: in Cloudera Manager, go to the HDFS configuration under Advanced and put the following in the HDFS Service Configuration Safety Valve:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
Changing dfs.permissions is always a solution, but you can also try changing the user. In my system, write permission is assigned only to the 'hdfs' user. The user can be changed with the following command:
su hdfs
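If you'd rather keep permissions enabled and stay as your own user, another option (a sketch, using the root user from the error message above) is to give that user its own HDFS home directory:
# create an HDFS home for root and hand it over; user/group names match the error above
sudo -u hdfs hadoop fs -mkdir /user/root
sudo -u hdfs hadoop fs -chown root:root /user/root
# relative paths such as "input" now resolve to /user/root/input
hadoop fs -mkdir input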
hdfs1 -> Configuration -> View and Edit -> uncheck "Check HDFS Permissions". This worked, thanks Shehaz.
1. Do not modify dfs.permissions; keep its value as true.
2. Add groups for a particular user if required (optional):
groupadd development
groupadd production
echo "Group production and development are created."
3. Create the user with an existing group and assign it an HDFS directory to use:
useradd -g development clouddev3
sudo -u hdfs hadoop fs -mkdir -p /user/clouddev3
sudo -u hdfs hadoop fs -chown -R clouddev3:development /user/clouddev3
echo "User clouddev3 created and owns /user/clouddev3 directory in hdfs"
Now log in as the clouddev3 user and try:
hdfs dfs -ls /user/clouddev3
or hdfs dfs -ls

Hadoop DFS permission issue when running job

I'm getting the following permission error, and am not sure why Hadoop is trying to write to this particular folder:
hadoop jar /usr/lib/hadoop/hadoop-*-examples.jar pi 2 100000
Number of Maps = 2
Samples per Map = 100000
Wrote input for Map #0
Wrote input for Map #1
Starting Job
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=myuser, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
Any idea why it is trying to write to the root of my hdfs?
Update: After temporarily setting the HDFS root (/) to 777 permissions, I saw that a "/tmp" folder was being written. I suppose one option is to just create a "/tmp" folder with open permissions for all to write to, but it would be nice from a security standpoint if this were instead written to the user folder (i.e. /user/myuser/tmp).
I was able to get this working with the following setting:
<configuration>
<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>/user</value>
</property>
<!-- ... -->
</configuration>
A restart of the jobtracker service is required as well (special thanks to Jeff on the Hadoop mailing list for helping me track down the problem!).
1) Create the {mapred.system.dir}/mapred directory in HDFS using the following command:
sudo -u hdfs hadoop fs -mkdir /hadoop/mapred/
2) Give ownership of it to the mapred user:
sudo -u hdfs hadoop fs -chown mapred:hadoop /hadoop/mapred/
You can also make a new user named "hdfs". Quite a simple solution, but probably not as clean.
Of course, this is when you are using Hue with Cloudera Hadoop Manager (CDH3).
You need to set the permissions for the Hadoop root directory (/) instead of setting the permissions for the system's root directory. I was confused too, but then realized that the directory mentioned belongs to Hadoop's file system and not the local system's.
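Once you have confirmed it is the HDFS root, a sketch of restoring sane defaults after the temporary 777 experiment mentioned in the update above (create /tmp only if it does not already exist):
# undo the temporary 777 on the HDFS root
sudo -u hdfs hadoop fs -chmod 755 /
sudo -u hdfs hadoop fs -mkdir /tmp
# 1777 (world-writable with sticky bit) is the usual convention for /tmp; plain 777 also matches the update above
sudo -u hdfs hadoop fs -chmod 1777 /tmp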