Export data to file in Postgres - SQL

I have one table with an id, a name, and complex queries. Below is just a sample of that table:
ID  name       Query
1   advisor_1  "Select * from advisor"
2   student_1  "Select * from student where id = 12"
3   faculty_4  "Select * from student where id = 12"
I want to iterate over this table and save each query's result set to its own CSV file.
Is there any way I can do this automatically through an anonymous block?
I don't want to do it manually, as the table has lots of rows.
Can anyone please help?

Not being a superuser means the export can't be done in a server-side DO block.
It could be done client-side in any programming language that can talk to the database; or, assuming a psql-only environment, it's possible to generate a list of \copy statements with an SQL query.
As an example of the latter, assuming the unique output filenames are built from the ID column, something like this should work:
SELECT format('\copy (%s) TO ''file-%s.csv'' CSV', query, id)
FROM table_with_queries;
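With the sample rows from the question, the output would be one \copy meta-command per row, along these lines:
\copy (Select * from advisor) TO 'file-1.csv' CSV
\copy (Select * from student where id = 12) TO 'file-2.csv' CSV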
The result of this query should be put into a file in a format such that it can be directly included into psql, like this:
\pset format unaligned
\pset tuples_only on
-- \g with an argument treats it as an output file.
SELECT format('\copy (%s) TO ''file-%s.csv'' CSV', query, id)
FROM table_with_queries \g /tmp/commands.sql
\i /tmp/commands.sql
As a side note, that process cannot be managed with the \gexec meta-command introduced in PostgreSQL 9.6, because \copy itself is a meta-command: \gexec iterates only over SQL queries, not meta-commands. Otherwise the whole thing could be done with a single \gexec invocation.
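For completeness: if server-side COPY were allowed (superuser, or membership in the pg_write_server_files role on PostgreSQL 11+), \gexec alone would do the job, since COPY without the backslash is plain SQL. A minimal sketch, assuming a writable /tmp on the server:
SELECT format('COPY (%s) TO ''/tmp/file-%s.csv'' CSV', query, id)
FROM table_with_queries \gexec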

You may use an anonymous block like this (if your problem is the code):
DO $$
DECLARE
  rec RECORD;
BEGIN
  FOR rec IN SELECT id, query FROM table_name
  LOOP
    -- server-side COPY: the file is written by the database server process
    EXECUTE 'COPY (' || rec.query || ') TO ' || quote_literal('d:/csv' || rec.id || '.csv') || ' CSV';
  END LOOP;
END $$;
As for the permission problem: you should use a location on the server that the database process has write access to (or request one from your vendor).

Related

How to extract table names from a PL/SQL package body file?

I need to get the table names queried in a PL/SQL package file.
I know there is an option to do this in Notepad++ with a regex replace, but I don't know what regex to apply to get the table names (I understand it must be a regex that takes the keyword "FROM" and captures the next string after the space).
For example, given this code:
CREATE OR REPLACE PACKAGE BODY pac_example AS
  FUNCTION f1 RETURN NUMBER IS
  BEGIN
    SELECT * FROM table1;
    RETURN 1;
  END f1;

  FUNCTION f2 RETURN NUMBER IS
  BEGIN
    SELECT * FROM table2;
    RETURN 1;
  END f2;
END pac_example;
And after Replace All I expect to get the file with only the table names:
table1
table2
If you're interested only in table names that are directly referenced from the PACKAGE BODY, a simple and straightforward method is to query all_dependencies or user_dependencies.
SELECT owner,
       referenced_name AS table_name
FROM all_dependencies
WHERE type IN ('PACKAGE BODY')
  AND name IN ('PAC_EXAMPLE')
  AND referenced_type = 'TABLE';
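Against the sample package above, this returns one row per referenced table, e.g. (owner hypothetical):
OWNER   TABLE_NAME
-----   ----------
SCOTT   TABLE1
SCOTT   TABLE2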
To my knowledge no one has done this with 100% accuracy. The closest you get is ALL/DBA_DEPENDENCIES, but it does not tell you whether the table is accessed in a SELECT, INSERT, UPDATE or DELETE.
It will, however, resolve synonyms.
The downside is that it will not include tables referenced in dynamic SQL.
If you have a database that uses a particular naming convention for tables (e.g. Tnnn_XXXXX) you could do:
SELECT DISTINCT c.text, c.name, c.type, t.table_name
FROM user_source c, user_tables t
WHERE UPPER(c.text) LIKE '%' || t.table_name || '%' -- maybe REGEXP_LIKE is better
ORDER BY 2, 1, 4;
I worked on a project decades ago where they wanted a CRUD matrix of programs (PLSQL, SQL, Oracle Forms/Reports, ProC, ProCOBOL) and what tables each accessed.
The only solution available at the time was for me to write a parser (in C) that parsed the codebase looking for SQL and processing it. Mine even reported columns as well as tables. The C program parsed the source, looking for keywords and characters to drive a state engine. It took a couple of weeks to refine and get working across all the different codebase types.
By the end, the only thing it could not do was dynamic queries where the table name was built up from variable values. But the workaround here was to capture the tkprof files and process these.
Tragically, I do not have the source code for this anymore.
However, if I were to do it again, I would use Lex/Yacc/Bison to parse SQL and build a system around those tools.
A quick search found this:
https://github.com/jgarzik/sqlfun
https://www.oreilly.com/library/view/flex-bison/9780596805418/ch04.html
Not a small undertaking.
Ctrl+H
Find what: (?:\A(?:(?!FROM).)*|\G)FROM\s+(\w+(?:\s*,\s*\w+)*)(?:(?!FROM).)*
Replace with: $1\n # the captured table name followed by a newline
check Wrap around
check Regular expression
CHECK . matches newline
Replace all
Explanation:
(?: # start non capture group
\A # beginning of file
(?:(?!FROM).)* # Tempered greedy token: match any character that does not begin FROM
| # OR
\G # restart from last match position
) # end group
FROM\s+ # literally FROM followed by 1 or more spaces
( # start group 1
\w+ # 1 or more word characters (table name)
(?:\s*,\s*\w+)* # non capture group spaces comma spaces and 1 or more word characters, optional more tables
) # end group
(?:(?!FROM).)* # Tempered greedy token: match up to (but not including) the next FROM
Replacement:
$1\n # content of group 1 (the table name) followed by a newline
You can use the following regex to search for table names.
Regex: FROM\s([^;]+)
Replacement: \n%\1%\n
Then follow this answer for replacing the other data in the file.
As mentioned earlier, the views all_dependencies and user_dependencies can list the dependencies, but they won't cover dynamic queries; and a keyword search in Notepad++ on 'FROM' only finds tables referenced after a FROM clause.
The code snippet below can be considered for a fuller analysis: it goes through the source line by line and word by word, and checks each word against the data dictionary (using the sample you mentioned).
declare
  l_line        varchar2(4000);
  ln_space_pos  number;
  l_word        varchar2(4000);
  l_table_flag  varchar2(1) := 'N';
  -- source compiled in the package body; name the package to be searched here
  cursor l_pkg_body_cur is
    select text
    from all_source
    where upper(name) = upper('pac_example')
      and type = 'PACKAGE BODY';
begin
  for rec in l_pkg_body_cur
  loop
    -- line-by-line processing: strip semicolons and surrounding blanks
    l_line := trim(replace(rec.text, ';'));
    while l_line is not null
    loop
      -- word-by-word processing: take the text up to the next space
      ln_space_pos := instr(l_line, ' ');
      if ln_space_pos = 0 then
        l_word := l_line;  -- last word on the line
        l_line := null;
      else
        l_word := substr(l_line, 1, ln_space_pos - 1);
        l_line := trim(substr(l_line, ln_space_pos + 1));
      end if;
      begin
        -- validate whether the word is a table name
        select 'Y'
        into l_table_flag
        from all_tables
        where upper(table_name) = upper(trim(l_word))
          and rownum = 1;
      exception
        when no_data_found then
          l_table_flag := 'N';
      end;
      if l_table_flag = 'Y' then
        dbms_output.put_line(trim(l_word)); -- table name
      end if;
    end loop;
  end loop;
end;
--output:
Statement processed.
table1
table2
In a similar way, this query can be altered to find views or synonyms by changing the base table to all_views or all_synonyms, as required.
This is the most straightforward approach, though it might take more processing time depending on the package size.
The same can be done with UNIX scripting if needed: to check the file directly, a shell script (or UTL_FILE operations) can read it line by line, and a SQL session can do the above validation and display the results.
Hopefully this provides the most accurate results.

Drop all tables in a Redshift schema - without dropping permissions

I would like to drop all tables in a Redshift schema. Even though this solution works:
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
it is NOT good for me, since it drops the SCHEMA permissions as well.
A solution like
DO $$ DECLARE
r RECORD;
BEGIN
-- if the schema you operate on is not "current", you will want to
-- replace current_schema() in query with 'schematodeletetablesfrom'
-- *and* update the generate 'DROP...' accordingly.
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE IF EXISTS ' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
as reported in the thread How can I drop all the tables in a PostgreSQL database?
would be ideal. Unfortunately it doesn't work on Redshift (apparently there is no support for FOR loops).
Is there any other solution to achieve it?
Run this SQL and copy+paste the result into your SQL client.
If you want to do it programmatically, you need to build a little bit of code around it.
SELECT 'DROP TABLE IF EXISTS ' || tablename || ' CASCADE;'
FROM pg_tables
WHERE schemaname = '<your_schema>'
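The result is one DROP statement per table; with hypothetical table names it looks like:
DROP TABLE IF EXISTS table_a CASCADE;
DROP TABLE IF EXISTS table_b CASCADE;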
I solved it with a procedure that deletes all records instead. Using this technique with TRUNCATE fails, but DELETE works fine for my intents and purposes.
create or replace procedure sp_truncate_dwh() as $$
DECLARE
  tables RECORD;
BEGIN
  FOR tables IN SELECT tablename
                FROM pg_tables
                WHERE schemaname = 'dwh'
                ORDER BY tablename
  LOOP
    EXECUTE 'delete from dwh.' || quote_ident(tables.tablename);
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;
--call sp_truncate_dwh()
In addition to demircioglu's answer, I had to add COMMIT after every drop statement to drop all tables in my schema:
SELECT 'DROP TABLE IF EXISTS ' || tablename || ' CASCADE; COMMIT;'
FROM pg_tables
WHERE schemaname = '<your_schema>'
P.S.: I do not have the required reputation to add this note as a comment and had to add it as an answer.
Using Python and psycopg2 locally on my PC, I came up with this script to delete all tables in a schema:
import logging
import psycopg2

logger = logging.getLogger(__name__)

schema = "schema_to_be_deleted"
conn = None
try:
    conn = psycopg2.connect("dbname='{}' port='{}' host='{}' user='{}' password='{}'".format("DB_NAME", "DB_PORT", "DB_HOST", "DB_USER", "DB_PWD"))
    cursor = conn.cursor()
    # parameterized query avoids quoting problems in the schema name
    cursor.execute("SELECT tablename FROM pg_tables WHERE schemaname = %s", (schema,))
    rows = cursor.fetchall()
    for row in rows:
        cursor.execute("DROP TABLE {}.{}".format(schema, row[0]))
    cursor.close()
    conn.commit()
except psycopg2.DatabaseError as error:
    logger.error(error)
finally:
    if conn is not None:
        conn.close()
Replace the values DB_NAME, DB_PORT, DB_HOST, DB_USER and DB_PWD with the correct ones to connect to the Redshift DB.
The following recipe differs from the other answers in that it generates one SQL statement for all the tables we're going to delete.
SELECT 'DROP TABLE ' || LISTAGG("table", ', ') || ';'
FROM svv_table_info
WHERE "table" LIKE 'staging_%';
Example result:
DROP TABLE staging_077815128468462e9de8ca6fec22f284, staging_abc, staging_123;
As in other answers, you will need to copy the generated SQL and execute it separately.
References
|| operator concatenates strings
LISTAGG function concatenates every table name into a string with a separator
The table svv_table_info is used because LISTAGG doesn't want to work with pg_tables for me. Complaint:
One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, etc
Update: I just now noticed that the SVV_TABLE_INFO page says:
The SVV_TABLE_INFO view doesn't return any information for empty tables.
...which means empty tables will not be in the list returned by this query. I usually delete transient tables to save disk space, so this doesn't bother me much; but in general this factor should be considered.

How can I loop to get counts from multiple tables?

I am trying to derive a table with counts from multiple tables. The tables are not on my schema. The table names on the schema I am interested in all start with 'STAF_' and end with '_TS'. The criterion I am looking for is SEP = 'MO'. So, for example, the query in its base form is:
select area, count(SEP) areacount
from mous.STAF_0001_TS
where SEP = 'MO'
group by area;
I have about 1000 tables that I'd like to do this for.
Ultimately, I'd like the output to be a table on my schema that looks like the following:
area | areacount
-----|----------
0001 | 3
0002 | 7
0003 | 438
Thank you.
As a first step I'd write an SQL query that generates an SQL query:
SELECT 'SELECT area, count(*) AS areacount FROM ' || c.table_name || ' WHERE SEP = ''MO'' GROUP BY area UNION ALL' AS run_me
FROM all_tables c
WHERE c.table_name LIKE 'STAF\_%\_TS' escape '\'
Running this will produce output that is itself another SQL query. Copy the result text out of your results grid, paste it back into your query pane, delete the final UNION ALL, and run it.
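With the table names from the question, the generated text looks like:
SELECT area, count(*) AS areacount FROM STAF_0001_TS WHERE SEP = 'MO' GROUP BY area UNION ALL
SELECT area, count(*) AS areacount FROM STAF_0002_TS WHERE SEP = 'MO' GROUP BY area UNION ALL
...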
Once you dig how to write an SQL query that generates an SQL query, you can look at turning it into a view, or creating a dynamic query in a string.
Gotta say, this is a horrible way to store data; you'd be better off using ONE table with an extra column containing whatever is in the xxx of STAF_xxx_TS right now.
In Oracle 12c, you can embed a FUNCTION that will query the number of rows in any given table. Then you can use that function in your main query. Here is an example:
WITH FUNCTION cnt ( p_owner VARCHAR2, p_table_name VARCHAR2 ) RETURN NUMBER IS
  l_cnt NUMBER;
BEGIN
  EXECUTE IMMEDIATE 'SELECT count(*) FROM ' || p_owner || '.' || p_table_name INTO l_cnt;
  RETURN l_cnt;
EXCEPTION WHEN OTHERS THEN
  RETURN NULL; -- this happens for entries in ALL_TABLES that are not directly accessible (e.g., IOT overflow tables)
END cnt;
SELECT t.owner, t.table_name, cnt(t.owner, t.table_name)
FROM all_tables t
WHERE t.table_name LIKE 'STAF\_%\_TS' escape '\';

Select columns with particular column names in PostgreSQL

I want to write a simple query to select a number of columns in PostgreSQL. However, I keep getting errors. I tried a few options, but they did not work for me. At the moment I am getting the following error:
org.postgresql.util.PSQLException: ERROR: syntax error at or near
"column"
To get the columns with values, I try the following:
select * from weather_data where column like '%2010%'
Any ideas?
column is a reserved word. You cannot use it as an identifier unless you double-quote it, like: "column".
That doesn't mean you should, though. Just don't use reserved words as identifiers. Ever.
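Purely for illustration (assuming a column really named column existed, which, again, you should avoid), the double-quoted form parses fine:
select * from weather_data where "column" like '%2010%'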
To select a list of columns with 2010 in their name, you can use this function to build the SQL command dynamically from the system catalog table pg_attribute:
CREATE OR REPLACE FUNCTION f_build_select(_tbl regclass, _pattern text)
RETURNS text AS
$func$
SELECT format('SELECT %s FROM %s'
, string_agg(quote_ident(attname), ', ')
, $1)
FROM pg_attribute
WHERE attrelid = $1
AND attname LIKE ('%' || $2 || '%')
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
$func$ LANGUAGE sql;
Call:
SELECT f_build_select('weather_data', '2010');
Returns something like:
SELECT foo2010, bar2010_id FROM weather_data;
You cannot make this fully dynamic, because the return type is unknown until we actually build the query.
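In psql, one way to build and run the generated statement in a single step is the \gexec meta-command (PostgreSQL 9.6+), which executes each result cell as a query. A minimal sketch:
SELECT f_build_select('weather_data', '2010') \gexec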
This will get you the list of columns in a specific table (you can optionally add schema if needed):
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'yourtable'
and column_name like '%2010%'
You can then use that query to create a dynamic sql statement to return your results.
Attempts to use dynamic structures like this usually indicate that you should be using data formats like hstore, json, xml, etc. that are amenable to dynamic access.
You can get a dynamic column list by creating the SQL on the fly in your application. You can query the INFORMATION_SCHEMA to get information about the columns of a table and build the query.
It's possible to do this in PL/PgSQL and run the generated query with EXECUTE, but you'll find it somewhat difficult to work with the resulting RECORD: you must get and decode composite tuples, and you can't expand the result set into a normal column list. Observe:
craig=> CREATE OR REPLACE FUNCTION retrecset() returns setof record as $$
values (1,2,3,4), (10,11,12,13);
$$ language sql;
craig=> select retrecset();
retrecset
---------------
(1,2,3,4)
(10,11,12,13)
(2 rows)
craig=> select * from retrecset();
ERROR: a column definition list is required for functions returning "record"
craig=> select (r).* FROM (select retrecset()) AS x(r);
ERROR: record type has not been registered
About all you can do is get the raw record and decode it in the client. You can't index into it from SQL, you can't convert it to anything else, etc. Most client APIs don't provide facilities for parsing the text representations of anonymous records so you'll likely have to write this yourself.
So: you can return dynamic records from PL/PgSQL without knowing their result type, but it's not particularly useful, and it's a pain to deal with on the client side. You really want to just use the client to generate queries in the first place.
You can't search all columns like that. You have to specify a specific column.
For example,
select * from weather_data where weather_date like '%2010%'
or better yet if it is a date, specify a date range:
select * from weather_data where weather_date between '2010-01-01' and '2010-12-31'
Found this here:
SELECT 'SELECT ' || array_to_string(ARRAY(
           SELECT 'o' || '.' || c.column_name
           FROM information_schema.columns AS c
           WHERE table_name = 'officepark'
           AND c.column_name NOT IN ('officeparkid', 'contractor')
       ), ',') || ' FROM officepark AS o' AS sqlstmt
The result is a SQL SELECT query you just have to execute further.
It fits my needs, since I pipe the result into the shell like this:
psql -U myUser -d myDB -t -c "SELECT...As sqlstm" | psql -U myUser -d myDB
That returns the formatted output, but it only works in the shell.
Hope this helps someone someday.

Oracle/SQL - Using query results as parameters in another query - looping?

Hi everyone. What I'm wondering is whether I can create a table that lists the record counts of other tables. It would get those table names from a table. So let's assume I have the table TABLE_LIST that looks like this:
name
---------
sports_products <-- contains 10 records
house_products <-- contains 8 records
beauty_products <-- contains 15 records
I would like to write a statement that pulls the names from that table, queries each of those tables to count their records, and ultimately produces this table:
name            | numRecords
----------------|-----------
sports_products | 10
house_products  | 8
beauty_products | 15
So I think I would need to do something like this pseudocode:
select *
from foreach tableName in select name from table_list
select count(*) as numRecords
from tableName
loop
You can have a function that does this for you via dynamic SQL.
However, make sure to declare it as AUTHID CURRENT_USER: you do not want anyone to gain some sort of privilege elevation by exploiting your function.
create or replace function SampleFunction
(
  owner     in VarChar
, tableName in VarChar
) return integer authid current_user is
  result Integer;
begin
  execute immediate 'select count(*) from "' || owner || '"."' || tableName || '"'
    into result;
  return result;
end;
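A sketch of how it could be called against the TABLE_LIST table from the question, assuming the tables live in your own schema (hence USER as the owner) and that the stored names match the exact case of the table names, since the function double-quotes them:
select name, SampleFunction(user, name) as numRecords
from table_list;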
One option is to simply keep your DB statistics up to date (use the dbms_stats package or EM), and then:
select num_rows
from all_tables
where table_name in (select name from table_list);
I think Robert Giesecke's solution will work fine.
A more exotic way of solving this is by using dbms_xmlgen.getxml.
See for example: Identify a table with maximum rows in Oracle