I have a CSV file with multiple rows, each containing a SELECT statement that I would like to execute, and I want to save all the results to one file.
Each row in the CSV looks something like this:
SELECT 'somehardcodedname' AS name,age FROM account WHERE id='somehardcodedid';
I attempted to run the following command:
\o output.csv
\i input.csv
This creates a file, but the file is mostly useless: I get headers and row counts before/after each result, and the table formatting makes the CSV ugly (a table inside a table).
Is there any way to modify the command so that the results of the individual SELECT queries all end up as one result set saved to the file? The input.csv contains several thousand rows, so I cannot combine them into one query.
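For reference, psql can suppress the headers and row-count footers before the output is redirected; a minimal sketch, assuming psql 12 or newer for the built-in csv format:
\pset format csv
\pset tuples_only on
\o output.csv
\i input.csv
\o
On older psql versions, \t (tuples only) together with \a (unaligned output) and \pset fieldsep ',' gives a similar header-free, comma-separated result.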
I want to extract information from a text file (almost 1GB) and store it in PostgreSQL database.
Text file is in following format:
DEBUG, 2017-03-23T10:02:27+00:00, ghtorrent-40 -- ghtorrent.rb:Repo EFForg/https-everywhere exists
DEBUG, 2017-03-24T12:06:23+00:00, ghtorrent-49 -- ghtorrent.rb:Repo Shikanime/print exists
...
and I want to extract 'DEBUG', timestamp, 'ghtorrent-40', 'ghtorrent' and "Repo EFForg/https-everywhere exists" from each line and store it in database.
I have done this using other languages like Python (psycopg2) and C++ (libpqxx), but is it possible to write a function in PostgreSQL itself to import the whole file?
I am currently using the pgAdmin 4 tool for PostgreSQL.
I am thinking of using something like pg_read_file in a function to read the file one line at a time and insert each line into the table.
An approach I use with my large XML files - 130GB or bigger - is to upload the whole file into a temporary unlogged table and extract the content I want from there. Unlogged tables are not crash-safe, but they are much faster than logged ones, which totally suits the purpose of a temporary table ;-)
Considering the following table ..
CREATE UNLOGGED TABLE tmp (raw TEXT);
.. you can import this 1GB file using a single psql line from your console (unix)..
$ cat 1gb_file.txt | psql -d db -c "COPY tmp FROM STDIN"
After that all you need is to apply your logic to query and extract the information you want. Depending on the size of your table, you can create a second table from a SELECT, e.g.:
CREATE TABLE t AS
SELECT
  trim((string_to_array(raw,','))[1]) AS operation,
  trim((string_to_array(raw,','))[2])::timestamp AS tmst,
  trim((string_to_array(raw,','))[3]) AS txt
FROM tmp
WHERE raw LIKE '%DEBUG%' AND
      raw LIKE '%ghtorrent-40%' AND
      raw LIKE '%Repo EFForg/https-everywhere exists%';
Adjust the string_to_array calls and the WHERE clause to your logic! Optionally, you can replace these multiple LIKE operations with a single SIMILAR TO.
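For instance, a sketch of the same filter collapsed into one SIMILAR TO pattern (this relies on the three fragments appearing in that order, which matches the log format shown above):
WHERE raw SIMILAR TO '%DEBUG%ghtorrent-40%Repo EFForg/https-everywhere exists%'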
.. and your data would be ready to be played with:
SELECT * FROM t;
operation | tmst | txt
-----------+---------------------+------------------------------------------------------------------
DEBUG | 2017-03-23 10:02:27 | ghtorrent-40 -- ghtorrent.rb:Repo EFForg/https-everywhere exists
(1 row)
Once your data is extracted you can DROP TABLE tmp; to free some disk space ;)
Further reading: COPY, PostgreSQL array functions and pattern matching
I have several queries that need to be run on a weekly basis in Microsoft SQL Server Management Studio. Each one is just a relatively simple SELECT query, and the results need to be saved into a CSV file. Right now someone spends an hour running each script in turn and saving the results.
I figured this could be somewhat automated but am struggling.
From reading previous questions here I've gotten as far as using SQLCMD mode, and by putting :output c:\filename.csv I get the output saved into a file, but I am having trouble getting separate files to be generated for each query.
For simplicity's sake, assume my query looks like this:
OUT: C:\File1.csv
SELECT * FROM table1;
OUT: C:\File2.csv
SELECT * FROM table2;
OUT: C:\File3.csv
SELECT * FROM table3;
Instead of getting three files with the output of each query, I end up with File1 and File2 containing a couple of unreadable characters, and all three query results in File3. I know Oracle has a SPOOL OFF command; is there something similar for OUT: in SSMS?
I ran a somewhat modified query and was able to get three files with three different query results. I ran the following for a quick test:
:OUT C:\File1.csv
SELECT 'Hello'
GO
:OUT C:\File2.csv
SELECT 'My'
GO
:OUT C:\File3.csv
SELECT 'Friend'
This gave me three separate files, with the results of each query in a separate file. All I did was take out the semicolons and add the keyword GO, which terminates a batch and moves on to the next one. I hope this helps.
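If the weekly run should not need SSMS at all, the same SQLCMD-mode script can also be executed unattended with the sqlcmd command-line utility, which honours the :OUT directives; a sketch with placeholder server, database and file names:
sqlcmd -S MyServer -d MyDatabase -E -i WeeklyReports.sql
Here -E uses Windows authentication and -i points at the script file, so the whole thing can be dropped into a scheduled task or SQL Server Agent job.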
I have a Get File Names step with a regular expression that picks up 4 CSV files.
After that I have a Text File Input step which defines the fields of the CSV and reads these files.
Once this step is completed, a Table Output step is executed.
The problem is that the Text File Input step seems to read all 4 files in a single pass, so the Table Output step inserts the rows of all 4 files. My output table therefore ends up with 20 rows (5 per file).
The expected behaviour is: read one file, insert its 5 rows into the output table, then execute a SQL script that moves the data to a final table and truncates the temp table. Then repeat the process for the second, third and last file.
The temporary table is truncated on every file load, but the final table is not; it grows incrementally.
How can I do that in Pentaho?
Change your current job to a subjob that executes once for each incoming record.
In the new main job you need:
a transformation that runs Get Filenames and links it to Copy Rows to Result
a Job entry with your current job. Configure it to execute for each row.
In the subjob you have to replace Get Filenames with Get Rows from Result and reconfigure the field that contains the filename.
My team uses a query that generates a text file over 500MB in size.
The query is executed from a Korn Shell script on an AIX server connecting to DB2.
The results are ordered and grouped by a specific field.
My question: Is it possible, using SQL, to write all rows with a given value of this field to their own text file?
For example: all rows with VENDORID = 1 would go to 1.txt, VENDORID = 2 to 2.txt, etc.
The field in question currently has 1000+ distinct values, so I would expect the same number of text files.
Here is an alternative approach that gets each file directly from the database.
You can use the DB2 EXPORT command to generate each file. Something like this should create one of them (quoted so the shell does not expand the *):
db2 "export to 1.txt of DEL select * from table where vendorid = 1"
I would use a shell script, or something like Perl, to automate running such a command for each value.
Depending on how fancy you want to get, you could just hardcode the range of vendorids, or you could first get the list of distinct vendorids from the table and loop over that.
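Since the job already runs from a Korn shell script on AIX, a rough sketch of that loop (database and table names are placeholders, adjust them to your schema):
#!/bin/ksh
db2 connect to MYDB

# -x suppresses column headers, so each output line is just one vendor id
for vid in $(db2 -x "select distinct vendorid from mytable"); do
    db2 "export to ${vid}.txt of del select * from mytable where vendorid = ${vid}"
done

db2 connect reset
This keeps everything in the existing script and lets DB2 write each vendor's rows straight to its own file.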
This method might scale a bit better than extracting one huge text file first.
I need a simple way to export data from an SQLite database of multiple tables, then import them into another database.
Here is my scenario. I have 5 tables: A, B, C, D, E.
Each table has a primary key as the first column called ID. I want a Unix command that will dump ONLY the data in the row from the primary key in a format that can be imported into another database.
I know I can do a
sqlite3 db .dump | grep INSERT
but that gives me ALL the data in the table. I'm not a database expert, and I'm trying to do this with plain Unix commands that I can put in a shell script, rather than writing C++ code to do it (which is what people are telling me is the easiest way). I just refuse to write C++ code for a task that can probably be done in 4-5 statements from the command line.
Any suggestions?
This may look a little weird, however you may give it a try:
Create a text file (the initf referenced below) and place the following statements in it:
.mode insert
.output insert.sql
select * from TABLE where STATEMENT; -- place the needed select query here
.output stdout
Feed this file to sqlite3:
$ sqlite3 -init initf DATA.DB .schema > schema.sql
As a result you will get two files: one with plain INSERT statements (insert.sql) and another with the db schema (schema.sql).
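To get the data into the other database, both files can then be fed straight back to sqlite3 (new.db is a placeholder name). Note that .mode insert writes INSERT INTO "table" by default, so either pass the target table name on the .mode line (e.g. .mode insert my_table) or adjust the statements before importing:
$ sqlite3 new.db < schema.sql
$ sqlite3 new.db < insert.sql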
I suggest finding a tool that can take your query and export the results to a CSV. It sounds like you wanted a script. Did you want to reuse and automate this?
For other scenarios, perhaps consider the sqlite-manager Firefox plugin. It supports running your ad hoc queries and exporting the results to a CSV.
Within that, give it this statement:
SELECT ID FROM TableA
Repeat for each table as you need.
You can use the sqlite3 shell from bash. For example, if you want to get INSERT statements for all records in one table, you can do the following:
$ sqlite3 /path/to/db_name.db
sqlite> .mode insert
sqlite> .output insert.sql
sqlite> select * from table_name;
This creates a file called insert.sql and puts an INSERT statement for every record in the given table into it.
A sample of what you get in insert.sql:
INSERT INTO "table" VALUES("data for record one");
INSERT INTO "table" VALUES("data for record two");
..
You can also use the quote function of SQLite:
echo "SELECT 'INSERT INTO my_new_table (my_new_key) VALUES (' || quote(my_old_key) || ');' FROM my_old_table;" | sqlite3 my_database.db > statements.sql