I have access to a Hive cluster through beeline. The results of some queries get stored as files in HDFS (e.g. /user/hive/warehouse/project). These results are just lines of text.
Would it be possible to "download" those files to my local machine using only beeline, as I don't have access to HDFS?
You can, by running:
INSERT OVERWRITE LOCAL DIRECTORY '/your/path/' SELECT your_query
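A slightly fuller form lets you control the field delimiter; this is a sketch assuming a table named your_table, and keep in mind that with beeline the "local" directory is local to the HiveServer2 host rather than to your client machine:
INSERT OVERWRITE LOCAL DIRECTORY '/your/path/'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM your_table;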
Try something like this:
beeline -e "select * from yourtable" > LOCAL/PATH/your_output
I'm running this command from a Unix server against a remote HDFS server.
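A more complete invocation might look like the sketch below; the JDBC URL, username, and CSV output format are placeholders to adapt to your cluster:
beeline -u jdbc:hive2://your-hs2-host:10000/default -n your_user --silent=true --outputformat=csv2 -e "select * from yourtable" > /local/path/your_output.csv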
Regards.
I'm trying to import data from a .csv file into a PostgreSQL database hosted on a Linux server, using the following command:
COPY areas_brasil FROM 'C:/Temp/RELATORIO_DTB_BRASIL_MUNICIPIO.csv' with delimiter '|' null 'NULL';
But I'm receiving the following error:
ERROR: could not open file
"C:/Temp/RELATORIO_DTB_BRASIL_MUNICIPIO.csv" for reading: No such file
or directory TIP: COPY FROM instructs the PostgreSQL server process
to read a file. You may want a client-side facility such as psql's
\copy.
The .csv file is on a client computer (running Windows 10) from which I have administrator access to the database hosted on the server (running Linux - Debian).
Thanks for helping me!
Welcome to SO.
COPY .. FROM 'path' assumes that the file is located on the server. If you wish to execute COPY without having the file on the database server, you can either use \copy or just feed psql's STDIN from your client console, e.g. on Unix systems (you have to find the cat and | equivalent for Windows):
$ cat file.csv | psql yourdb -c "COPY areas_brasil FROM STDIN DELIMITER '|';"
Using \COPY inside psql, it can be done like this:
\COPY areas_brasil FROM '/home/jones/file.csv' DELIMITER '|';
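Since the file in this question sits on a Windows 10 client at C:/Temp/, the same \copy approach run from psql on that client would look roughly like this (a sketch reusing the question's path, delimiter, and NULL marker):
\COPY areas_brasil FROM 'C:/Temp/RELATORIO_DTB_BRASIL_MUNICIPIO.csv' DELIMITER '|' NULL 'NULL'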
See this answer for more details.
Files are being written to a directory using the COPY query:
Copy (SELECT * FROM animals) To '/var/lib/postgresql/data/backups/2020-01-01/animals.sql' With CSV DELIMITER ',';
However, if the directory 2020-01-01 does not exist, we get the error
could not open file "/var/lib/postgresql/data/backups/2020-01-01/animals.sql" for writing: No such file or directory
The PostgreSQL server is running inside a Docker container with the volume mapping /mnt/backups:/var/lib/postgresql/data/backups.
The COPY query is being sent from a Node.js app outside of the Docker container.
The mapped host directory /mnt/backups was created by Docker Compose and is owned by root, so the Node.js app sending the COPY query is unable to create the missing directories due to insufficient permissions.
The backup file is meant to be transferred out of the Docker container to the Docker host.
Question: Is it possible to use an SQL query to ask PostgreSQL 11.2 to create a directory if it does not exist? If not, how would you recommend the directory creation be done?
Using Node.js 12.14.1 on an Ubuntu 18.04 host. Using PostgreSQL 11.2 inside the container, Docker 19.03.5.
An easy way to solve it is to create the file directly on the client machine. Using STDOUT with COPY, you can redirect the query output to the client's standard output, which you can catch and save to a file. For instance, using psql on the client machine:
$ psql -U your_user -d your_db -c "COPY (SELECT * FROM animals) TO STDOUT WITH CSV DELIMITER ','" > file.csv
Creating the output directory in case it does not exist:
$ mkdir -p /mnt/backups/2020-01/ && psql -U your_user -d your_db -c "COPY (SELECT * FROM animals) TO STDOUT WITH CSV DELIMITER ','" > /mnt/backups/2020-01/file.csv
On a side note: try to avoid exporting files onto the database server. Although it is possible, I consider it a bad practice. Doing so, you will either write a file into the postgres system directories or give the postgres user permission to write somewhere else, and that is something you shouldn't be comfortable with. Export data directly to the client, either using COPY as I mentioned or following the advice from @Schwern. Good luck!
Postgres has its own backup and restore utilities, which are likely to be a better choice than rolling your own.
When used with one of the archive file formats and combined with pg_restore, pg_dump provides a flexible archival and transfer mechanism. pg_dump can be used to backup an entire database, then pg_restore can be used to examine the archive and/or select which parts of the database are to be restored. The most flexible output file formats are the “custom” format (-Fc) and the “directory” format (-Fd). They allow for selection and reordering of all archived items, support parallel restoration, and are compressed by default. The “directory” format is the only format that supports parallel dumps.
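To illustrate the parallel, directory-format dump mentioned above, here is a sketch; the job count, output directory, and database name are placeholders:
pg_dump -Fd --jobs=4 -f /path/to/your/backups/animals_dir database_name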
A simple backup rotation script might look like this:
#!/bin/sh
table='animals'
url='postgres://username@host:port/database_name'
date=$(date -Idate)
file="/path/to/your/backups/$date/$table.sql"
mkdir -p "$(dirname "$file")"
pg_dump "$url" -w -Fc --table="$table" -f "$file"
To avoid hard-coding the database password, -w means it will not prompt for a password and will instead look for a password file (such as ~/.pgpass). Or you can use any of Postgres's many authentication options.
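For the restore side, a sketch; the archive path matches the script above and the target database name is a placeholder:
# restore just the animals table from the custom-format archive into an existing database
pg_restore -d database_name --table=animals /path/to/your/backups/2020-01-01/animals.sql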
I wrote a batch script that creates an SQL file for my database.
It automatically writes the id, the image file path, and the idproduct, but I don't know how to run that SQL from a batch file against the XAMPP/phpMyAdmin MySQL database.
Simply use the mysql.exe command-line tool in your batch file (note that there is no space between -p and the password):
mysql -u username -ppassword database_name < "c:\path\to\SqLfile.sql"
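As a rough sketch of wiring that into the batch script itself; the XAMPP install path, credentials, and database name below are assumptions to adjust to your setup:
@echo off
REM generate the SQL file with your existing logic, then load it into MySQL
set SQLFILE=c:\path\to\SqLfile.sql
set MYSQL=C:\xampp\mysql\bin\mysql.exe
"%MYSQL%" -u root -pYourPassword your_database < "%SQLFILE%"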
I tried to load the data into my table 'users' in LOCAL mode; I am using Cloudera on VirtualBox. I placed my file inside the /home/cloudera/Desktop/Hive/ directory, but I am getting an error:
FAILED: SemanticException Line 1:23 Invalid path ''/home/cloudera/Desktop/Hive/hive_input.txt'': No files matching path file:/home/cloudera/Desktop/Hive/hive_input.txt
My syntax to load data into the table:
Load DATA LOCAL INPATH '/home/cloudera/Desktop/Hive/hive_input.txt' INTO Table users
Yes, I removed the LOCAL as per @Bhaskar, and the path is my HDFS path where the file exists, not the underlying Linux path.
Load DATA INPATH '/user/cloudera/input_project/' INTO Table users;
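If the file only exists on the local Linux filesystem, it has to be copied into HDFS first; a sketch using the paths from this question (it assumes your user can write to the target directory):
hdfs dfs -mkdir -p /user/cloudera/input_project/
hdfs dfs -put /home/cloudera/Desktop/Hive/hive_input.txt /user/cloudera/input_project/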
You should change the permissions on the folder that contains your file:
chmod -R 755 /home/user/
Another reason could be a file access issue. If you are running the Hive CLI as user01 and accessing a file (your INPATH) from user02's home directory, it will give you the same error.
So the solution could be (see the sketch after this list):
1. Move the file to a location where user01 can access the file.
OR
2. Relaunch the Hive CLI after logging in as user02.
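A minimal sketch of option 1, assuming the file sits in user02's home directory and /tmp is used as a world-readable staging location (both paths are placeholders):
cp /home/user02/your_input_file.txt /tmp/your_input_file.txt
chmod 644 /tmp/your_input_file.txt
Then point LOAD DATA LOCAL INPATH at the staged copy in /tmp.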
Check whether you are using a Sqoop import in your script and trying to import data into Hive from an empty table.
This may cause the Sqoop import to delete the HDFS location of the Hive table.
To confirm, run hdfs dfs -ls on the table's location before and after you execute the Sqoop import, and re-create the directory using hdfs dfs -mkdir if it has disappeared.
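A concrete sketch of that check; the warehouse path below is an assumption, so use your table's actual LOCATION:
hdfs dfs -ls /user/hive/warehouse/users
# ... run the Sqoop import here ...
hdfs dfs -ls /user/hive/warehouse/users
hdfs dfs -mkdir -p /user/hive/warehouse/users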
My path to the file in HDFS was data/file.csv; note that it is not /data/file.csv.
I specified the LOCATION during table creation as data/file.csv.
Executing
LOAD DATA INPATH '/data/file.csv' INTO TABLE example_table;
failed with the mentioned exception. However, executing
LOAD DATA INPATH 'data/file.csv' INTO TABLE example_table;
worked as desired.
When I tried to load local data files into a Hive table, it reported an error while moving the files. I found the link below, which gives suggestions for fixing this issue. I followed those steps, but it still doesn't work.
http://answers.mapr.com/questions/3565/getting-started-with-hive-load-the-data-from-sample-table txt-into-the-table-fails
After mkdir /user/hive/tmp and setting hive.exec.scratchdir=/user/hive/tmp, it still reports RuntimeException Cannot make directory: file/user/hive/tmp/hive_2013*. How can I fix this issue? Anyone familiar with Hive who can help? Thanks!
The Hive version is 0.10.0.
The Hadoop version is 1.1.2.
I suspect a permission issue here, because you are using the MapR distribution.
Make sure that the user trying to create the directory has permissions to create the directory on CLDB.
An easy way to debug here is to do
$ hadoop fs -chmod -R 777 /user/hive
and then try to load the data, to confirm whether it's a permission issue.