Is it possible to download CrateDB results to a CSV file? - cratedb

I need to query the database and drop the results into a CSV file.
I remember I did this with SQL Server; is it possible to do this with CrateDB?

CrateDB's crash shell supports several output formats, including CSV.
Example:
crash --format csv -c 'select * from sys.cluster' > sys_cluster.csv
See https://crate.io/docs/reference/crash for details.

Of course, yes. Where are you getting stuck?
You can find the official GitHub repository of examples, crate/crate-sample-apps, where CrateDB is used. You can use it as a baseline for understanding CrateDB.
You can also convert the values returned by CrateDB to CSV using language-specific libraries, for example the csv module in Python, as sketched below.
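Here is a minimal sketch of that approach, assuming the crate Python client is installed (pip install crate) and a CrateDB node is reachable on localhost:4200; the host, port, query, and output file name are placeholders:

# Minimal sketch: export a CrateDB query result to CSV with the crate client and the csv module.
# Host, port, query, and file name are assumptions; adjust for your setup.
import csv
from crate import client

conn = client.connect("http://localhost:4200")
cursor = conn.cursor()
cursor.execute("SELECT * FROM sys.cluster")

with open("sys_cluster.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cursor.description])  # header row from column names
    writer.writerows(cursor.fetchall())                       # data rows

cursor.close()
conn.close()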

Related

Downloading data from BigQuery with Stata

How can I download data (either with queries or the full table) from BigQuery with Stata?
In Stata, I would like to run something like:
download "SELECT * FROM `project.dataset.table`" "~/Downloads/table.csv"
and have it download the query result to the local csv file.
You should be able to use odbc to do this. For example, your code might be:
odbc load, exec('"SELECT * FROM project.dataset.table"') dsn("ds_name") clear
export delimited "~/Downloads/table.csv", replace
But this depends on BigQuery being accessible through ODBC; you may need to set up that link first. If so, you may find this presentation and this documentation helpful.

Is there an alternative way to import data into Postgres than using psql?

I am in a strict corporate environment and don't have access to Postgres' psql, so I can't do what's shown, e.g., in the SO question Convert SQLITE SQL dump file to POSTGRESQL. However, I can generate the SQLite dump file (.sql). The resulting dump.sql file is 1.3 GB.
What would be the best way to import this data into Postgres? I also have DBeaver and can connect to both databases simultaneously but unfortunately can't do INSERT from SELECT.
I think the term for that is 'absurd', not 'strict'.
DBeaver has an 'execute script' feature. But who knows, maybe it will be blocked.
EnterpriseDB offers binary downloads. If you unzip those to a local drive you might be able to execute psql from the bin subdirectory.
If you can install psycopg2 or pg8000 for Python, you should be able to connect to the database and then loop over the dump file, sending each line to the database with cur.execute(line). It might take some fiddling if the dump file has any multi-line commands, but the example you linked to doesn't show any of those. A sketch of that loop is below.
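A minimal sketch with psycopg2, assuming the dump really does contain one complete statement per line; the connection parameters and file name are placeholders:

# Minimal sketch: replay a one-statement-per-line SQL dump with psycopg2.
# Connection parameters and file name are assumptions; adjust for your environment.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="target_db",
                        user="me", password="secret")
cur = conn.cursor()

with open("dump.sql") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("--"):
            continue            # skip blank lines and SQL comments
        cur.execute(line)       # assumes each line is a complete statement

conn.commit()
cur.close()
conn.close()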

Importing and maintaining multiple csv files into PostgreSQL

I am new to using SQL, so please bear with me.
I need to import several hundred CSV files into PostgreSQL. My web search has only indicated how to import many CSV files into one table. However, most CSV files have different column types (all have one-line headers). Is it possible to somehow run a loop and have each CSV imported into a table with the same name as the CSV file? Creating each table manually and specifying columns is not an option. I know that COPY will not work, as the table needs to already be specified.
Perhaps this is not feasible in PostgreSQL? I would like to accomplish this in pgAdmin III or the psql console, but I am open to other ideas (using something like R to convert the CSVs into a format more easily loaded into PostgreSQL?).
I am using PostgreSQL on a Windows 7 computer. It was requested that I use PostgreSQL, thus the focus of the question.
The desired result is a database full of tables, that I will then join with a spreadsheet that includes specific site data. Thanks!
Use pgfutter.
The general syntax looks like this:
pgfutter csv <filename.csv>
In order to run this on all csv files in a directory from Windows Command Prompt, navigate to the desired directory and enter:
for %f in (*.csv) do pgfutter csv %f
Note that the directory containing the downloaded program must be added to the PATH environment variable.
EDIT:
Here is the command-line code for Linux users.
Run it as:
pgfutter csv *.csv
Or, if that doesn't work:
find . -iname '*.csv' -exec pgfutter csv {} \;
In the terminal, use nano to create a script that loops over the CSV files in a directory and loads them into the Postgres DB:
>nano run_pgfutter.sh
The content of run_pgfutter.sh:
#! /bin/bash
for i in /mypath/*.csv
do
./pgfutter csv "${i}"
done
Then make the file executable:
chmod u+x run_pgfutter.sh

How to add headers when we DUMP the data in output file using PIG scripts?

I tried to search for it but cannot find the tip/recommendations.
Here is my situation: I have all the data lined up correctly and the output working fine using a Pig script, with the files stored in an output directory. There are more than 100 output files, so I accumulated them into a single results file using another Pig script.
I was wondering if there is anything in Pig Latin that will help me add a header to the accumulated results file so that business users can use it right away?
Please advise
If you are using DUMP in a Pig script and redirecting the result to a single file, you can use DESCRIBE before DUMP. Doing so will include the schema information as a header in your output file:
A = LOAD 'test' USING PigStorage() AS (col1:int, col2:chararray);
DESCRIBE A;
DUMP A;
output will be something like:
A: {col1: int,col2: chararray}
1,test
2,test
...
Pig can store the schema into a different file ".pig_schema" using PigStorage:
store A into 'outputFile' using PigStorage('\t', '-schema');
This will save your data in outputFile using tabs as delimiters and also create the schema file.
You can store the header in a separate file, LOAD it and UNION it with your data. Then you need to do an ORDER BY (that might be tricky depending on your data).
Another way would be to use hadoop getmerge.
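For reference, hadoop fs -getmerge concatenates the part files in an HDFS output directory into a single local file; the paths below are placeholders:
hadoop fs -getmerge /user/me/pig_output /local/path/merged_results.tsv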
In general, this is not something pig is very good at, you might as well write a script in another language.

Generate DDL SQL create table statement after scanning CSV file

Are there any command line tools (Linux, Mac, and/or Windows) that I could use to scan a delimited file and output a DDL create table statement with the data type determined for me?
Did some googling, but couldn't find anything. Was wondering if others might know, thanks!
DDL-generator can do this. It can generate DDL for YAML, JSON, CSV, Pickle, and HTML sources (although I don't know how the last one works). I just tried it on some data exported from Salesforce and it worked pretty well. Note that you need to use it with Python 3; I could not get it to work with Python 2.7. A rough usage sketch follows.
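Assuming this refers to the ddlgenerator package on PyPI, the command-line usage is roughly as follows; the dialect, input file, and output path are placeholders, so check the project's README for the exact syntax:
pip install ddlgenerator
ddlgenerator postgresql mydata.csv > create_mydata.sql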
You can also try https://github.com/mshanu/idli. It can take a CSV file as input and generate a CREATE statement with appropriate types. It can generate DDL for MySQL, Oracle, and Postgres. I am actively working on this and happy to receive feedback for future improvements.