HIVE> FAILED: SemanticException Line 1:23 Invalid path

I tried to load the data into my table 'users' in LOCAL mode, and I am using Cloudera on VirtualBox. I have placed my file inside the /home/cloudera/Desktop/Hive/ directory, but I am getting an error:
FAILED: SemanticException Line 1:23 Invalid path ''/home/cloudera/Desktop/Hive/hive_input.txt'': No files matching path file:/home/cloudera/Desktop/Hive/hive_input.txt
My syntax to load data into the table:
LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/Hive/hive_input.txt' INTO TABLE users;

Yes, I removed LOCAL as suggested by @Bhaskar, and the path is my HDFS path where the file exists, not the underlying Linux path.
LOAD DATA INPATH '/user/cloudera/input_project/' INTO TABLE users;
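For completeness, a minimal sketch of getting the local file into HDFS first, assuming the file still exists at the local path from the question and that /user/cloudera/input_project/ is the intended HDFS directory:
hdfs dfs -mkdir -p /user/cloudera/input_project
hdfs dfs -put /home/cloudera/Desktop/Hive/hive_input.txt /user/cloudera/input_project/
After that, the LOAD DATA INPATH statement above can find the file in HDFS.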

You should change the permissions on the folder that contains your file:
chmod -R 755 /home/user/

Another reason could be a file access issue. If you are running the Hive CLI as user01 and accessing a file (your INPATH) from user02's home directory, it will give you the same error.
So the solution could be:
1. Move the file to a location where user01 can access it (a sketch follows below).
OR
2. Relaunch the Hive CLI after logging in as user02.
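A minimal sketch of option 1, using a hypothetical file name and /tmp as the shared location (both are assumptions; adjust to your setup):
# run as user02: copy the file somewhere user01 can read it
cp /home/user02/hive_input.txt /tmp/hive_input.txt
chmod 644 /tmp/hive_input.txt
user01 can then point LOAD DATA LOCAL INPATH at /tmp/hive_input.txt.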

Check whether you are using a Sqoop import in your script that imports data into Hive from an empty table. This may cause the Sqoop import to delete the HDFS location of the Hive table.
To confirm, run hdfs dfs -ls before and after you execute the Sqoop import; if the directory is gone, re-create it using hdfs dfs -mkdir (a sketch follows below).
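A sketch of that check, with /user/hive/warehouse/users as a placeholder for the table's actual HDFS location:
hdfs dfs -ls /user/hive/warehouse/users         # before the Sqoop import
# ... run the Sqoop import ...
hdfs dfs -ls /user/hive/warehouse/users         # after: does the directory still exist?
hdfs dfs -mkdir -p /user/hive/warehouse/users   # re-create it if it was removed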

My path to the file in HDFS was data/file.csv; note that it is not /data/file.csv.
I specified the LOCATION during table creation as data/file.csv.
Executing
LOAD DATA INPATH '/data/file.csv' INTO TABLE example_table;
failed with the mentioned exception. However, executing
LOAD DATA INPATH 'data/file.csv' INTO TABLE example_table;
worked as desired.
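For what it's worth, a relative path like data/file.csv is resolved against the user's HDFS working directory (typically /user/<username>), which a quick listing makes visible:
hdfs dfs -ls data/file.csv     # relative: typically resolves to /user/<your-user>/data/file.csv
hdfs dfs -ls /data/file.csv    # absolute: a different location entirely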

Related

Output of Hive extract to local drive error

I have created a directory in HDFS named Test, but when I try to insert records from Hive into HDFS it throws the following error:
Error: WARNING:root:could not open file '/etc/apt/sources.list.d/dotnetdev.list'
WARNING:root:could not open file '/etc/apt/sources.list.d/HDP.list'
INSERT: command not found
The command I used:
INSERT OVERWRITE DIRECTORY '/Test' select * from table limit 10
Please help
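The "INSERT: command not found" line suggests the statement was typed at the Linux shell rather than inside Hive; a minimal sketch of running the same statement through the Hive CLI (your_table is a placeholder for the actual table name) would be:
# your_table is a placeholder; the statement itself must run inside Hive, not bash
hive -e "INSERT OVERWRITE DIRECTORY '/Test' SELECT * FROM your_table LIMIT 10;"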

How to export HDFS directory with beeline (no HDFS access)?

I have access to a Hive cluster through beeline. Results of some queries get stored as files in HDFS (e.g. /user/hive/warehouse/project). These results are just lines of text.
Would it be possible to "download" those files to my local machine using only beeline, as I don't have access to HDFS?
You can, with:
INSERT OVERWRITE LOCAL DIRECTORY '/your/path/' SELECT your_query
Try something like this:
beeline -e "select * from yourtable" > LOCAL/PATH/your_output
I'm running this command from a Unix server against a remote HDFS server.
Regards.
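A slightly fuller sketch, assuming a placeholder JDBC URL jdbc:hive2://your-host:10000 and that CSV-style output is wanted (--outputformat and --silent are standard beeline options):
beeline -u jdbc:hive2://your-host:10000 --silent=true --outputformat=csv2 \
    -e "select * from yourtable" > /local/path/your_output.csv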

Import Text data to Greenplum database

I have a text file with some data and want to import it into a Greenplum database. After online research, I found that it's better to use the COPY command if your data size is small, so I decided to use it.
Here is the scenario:
I have placed my text file at the location /bin/bash /data. I can access this file using the terminal, but once I run the following COPY SQL script against the Greenplum database it says:
could not open file "/bin/bash /data/data.txt" for reading: No such file or directory
Below is my SQL script:
COPY userdata(customerid,time,trans,quantity) from '/bin/bash /data/data.txt' WITH DELIMITER ',';
From the Greenplum database documentation I found the following line:
The COPY source file must be accessible to the master host. Specify the COPY source file name relative to the master host location.
But I do not know how to make it accessible to the master host and relative to the master host location.
The path to your file doesn't make any sense.
/bin/bash /data/data.txt is certainly not a valid name for a path.
If your data.txt file is located in the /data folder with content in the following format:
12345,5:32AM,air,2
67890,6:42PM,rail,4
You could use the below command :
COPY userdata(customerid,time,trans,quantity) FROM '/data/data.txt' WITH DELIMITER AS ',';
Also, the SQL user should have permission to access data.txt in the /data folder.
Perhaps do an ls -l and check whether the SQL user can read data.txt.
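A quick version of that check, assuming the database runs as the gpadmin OS user (an assumption; substitute your own):
ls -l /data/data.txt                  # owner and mode of the file
sudo -u gpadmin head /data/data.txt   # can the database OS user actually read it?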

What is the path for a bootstrapped file for a Pig job running in Amazon EMR

I bootstrap a data file in my EMR job. The bootstrapping succeeds and the file is copied to the /home/hadoop/contents/ folder with the right permissions.
However, when I try to access it in the Pig script like below:
userdidstopick = load '/home/hadoop/contents/UserIdsToPick.txt' AS (uid:chararray);
I get an error that the input path does not exist:
hdfs://10.183.166.176:9000/home/hadoop/contents/UserIdsToPick.txt
When running Ruby jobs, the bootstrapped file was always accessible under the /home/hadoop/contents/ folder and everything worked for me.
Is it different for Pig?
By default, Pig on EMR is configured to access the HDFS location instead of the local filesystem. The error shows the HDFS path it tried.
There are two ways to solve this:
Either copy the file to S3 and load it directly from S3:
userdidstopick = load 's3_bucket_location/UserIdsToPick.txt' AS (uid:chararray);
Or you can first copy the file to HDFS (instead of the local filesystem) and then use the same path you are using today (a sketch follows below).
I would prefer the first option.
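A minimal sketch of the second option, assuming the hadoop CLI is available on the master node and that /contents is an acceptable HDFS target directory (both assumptions):
hadoop fs -mkdir -p /contents
hadoop fs -put /home/hadoop/contents/UserIdsToPick.txt /contents/
and then in the Pig script:
userdidstopick = load '/contents/UserIdsToPick.txt' AS (uid:chararray);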

Pig is storing the data in Temporary Directory instead of actual directory

The Pig command below (local mode) is saving the file in a temporary directory, whereas I was expecting it to be stored in the actual directory. Any thoughts?
STORE c INTO '/user/hue/pigbasic1' USING PigStorage('*');
hadoop fs -cat /user/hue/pigbasic1/_temporary/0/task_local1045577955_0002_r_000000/part-r-00000;
Thanks in advance