A column-safe `INSERT INTO t1 SELECT * FROM ...`?

Is there some way to do an INSERT INTO t1 SELECT * FROM ... such that it fails if the column names do not coincide?
I'm using PostgreSQL 9.x. The column names are not known in advance.
Motivation: I'm doing a periodic refresh of materialized views by the (quite standard) PL/pgSQL procedure:
CREATE OR REPLACE FUNCTION matview_refresh(name) RETURNS void AS
$BODY$
DECLARE
matview ALIAS FOR $1;
entry matviews%ROWTYPE;
BEGIN
SELECT * INTO entry FROM matviews WHERE mv_name = matview;
IF NOT FOUND THEN
RAISE EXCEPTION 'Materialized view % does not exist.', matview;
END IF;
EXECUTE 'TRUNCATE TABLE ' || matview;
EXECUTE 'INSERT INTO ' || matview || ' SELECT * FROM ' || entry.v_name;
UPDATE matviews SET last_refresh=CURRENT_TIMESTAMP WHERE mv_name=matview;
RETURN;
END;
$BODY$ LANGUAGE plpgsql;
I preferred a TRUNCATE followed by an INSERT INTO ... SELECT * over a DROP/CREATE because it seemed lighter and more concurrency-friendly. It would fail if someone added or removed columns from the view (in which case I would do the DROP/CREATE), but that doesn't matter: the refresh would not complete and we would catch the problem soon. What does matter is what happened today: someone swapped the order of two columns of the same type in the view, and the refresh inserted bogus data.

Build this into your plpgsql function to verify that view and table share the same column names in the same sequence exactly:
IF EXISTS (
SELECT 1
FROM (
SELECT *
FROM pg_attribute
WHERE attrelid = matview::regclass
AND attisdropped = FALSE
AND attnum > 0
) t
FULL OUTER JOIN (
SELECT *
FROM pg_attribute
WHERE attrelid = entry.v_name::regclass
AND attisdropped = FALSE
AND attnum > 0
) v USING (attnum, attname) -- atttypid to check for type, too
WHERE t.attname IS NULL
OR v.attname IS NULL
) THEN
RAISE EXCEPTION 'Mismatch between table and view!';
END IF;
The FULL OUTER JOIN adds a row with NULL values for any mismatch between the two lists of column names. So if EXISTS finds a row, something is off.
The cast to ::regclass raises an exception right away if either the table or the view does not exist (or is out of scope: not in the search_path and not schema-qualified).
If you also want to check data types of the columns, just add atttypid to the USING clause.
As an aside: querying pg_catalog tables is regularly faster by an order of magnitude than querying the bloated views in information_schema. information_schema is only good for SQL standard compliance and portability of code. Since you are writing 100 % Postgres-specific code, neither is relevant here.
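To see what the check reports, here is a small sketch on throwaway objects. The names demo_src, demo_mv and demo_view are made up for illustration and are not part of the question; the view deliberately lists b before a.

```sql
-- Hypothetical demo objects (temporary, gone at session end).
CREATE TEMP TABLE demo_src (id int, a text, b text);
CREATE TEMP TABLE demo_mv  (id int, a text, b text);
-- The view swaps the order of a and b, like the incident in the question.
CREATE TEMP VIEW demo_view AS SELECT id, b, a FROM demo_src;

-- Same join as in the check, but listing the offending positions
-- instead of only testing for their existence.
SELECT attnum, t.attname AS table_col, v.attname AS view_col
FROM  (
   SELECT attnum, attname
   FROM   pg_attribute
   WHERE  attrelid = 'demo_mv'::regclass
   AND    NOT attisdropped
   AND    attnum > 0
   ) t
FULL   OUTER JOIN (
   SELECT attnum, attname
   FROM   pg_attribute
   WHERE  attrelid = 'demo_view'::regclass
   AND    NOT attisdropped
   AND    attnum > 0
   ) v USING (attnum, attname)
WHERE  t.attname IS NULL
   OR  v.attname IS NULL
ORDER  BY attnum;
```

With identical column lists the query returns no rows. For the demo objects it should report positions 2 and 3 only, since id matches on both sides.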

You can query information_schema.columns to get columns in the right order:
SELECT INTO cols array_to_string(array_agg(column_name::text), ',')
FROM (
SELECT column_name
FROM information_schema.columns
WHERE table_name = matview -- the function parameter, not the literal string 'matview'
ORDER BY ordinal_position
) AS x;
EXECUTE 'INSERT INTO ' || matview || ' SELECT ' || cols || ' FROM ' || entry.v_name;
You can also get the column list directly from pg_attribute; just replace the inner SELECT on information_schema.columns with:
SELECT attname AS column_name
FROM pg_attribute
WHERE attrelid = matview::regclass
AND attnum > 0 -- skip system columns such as ctid and xmin
AND attisdropped = false
ORDER BY attnum;
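A minimal, self-contained sketch of this idea follows. It assumes format() is available (PostgreSQL 9.1 or later), and the demo2_* objects are hypothetical stand-ins for the materialized table and its view. Because the same quoted column list is used on both sides of the INSERT, columns are matched by name, and a column missing from the view makes the statement fail instead of loading wrong data.

```sql
-- Hypothetical demo objects; the view lists b before a on purpose.
CREATE TEMP TABLE demo2_src (id int, a text, b text);
CREATE TEMP TABLE demo2_mv  (id int, a text, b text);
CREATE TEMP VIEW  demo2_view AS SELECT id, b, a FROM demo2_src;
INSERT INTO demo2_src VALUES (1, 'left', 'right');

DO $$
DECLARE
   cols text;
BEGIN
   -- Column names of the target table, quoted and in table order.
   SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum)
   INTO   cols
   FROM   pg_attribute
   WHERE  attrelid = 'demo2_mv'::regclass
   AND    attnum > 0
   AND    NOT attisdropped;

   -- regclass values are rendered as properly quoted relation names.
   EXECUTE format('INSERT INTO %s (%s) SELECT %s FROM %s',
                  'demo2_mv'::regclass, cols, cols, 'demo2_view'::regclass);
END
$$;

-- a should hold 'left' and b should hold 'right' despite the view's order.
SELECT * FROM demo2_mv;
```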

Related

How to get value printed on Postgres

I have a requirement to translate this into an SQL script.
I am using the information schema to get all the columns of a table and print the distinct count of each.
I was able to get the counts, but I cannot print the column names properly.
Please find the code below.
I have to pass the value of colum_lbl to my SELECT clause, but if I do so I get a GROUP BY error.
So I passed colum_lbl within quotes. Now every row of the result has the hardcoded string 'colum_lbl' as its value; I have to replace it with the actual value read in the FOR loop.
Any other efficient method for this requirement would be very much appreciated. Thanks in advance.
do $$
DECLARE
colum_lbl text;
BEGIN
DROP TABLE IF EXISTS tmp_table;
CREATE TABLE tmp_table
(
colnm varchar(50),
cnt integer
);
FOR colum_lbl IN
SELECT distinct column_name
FROM information_schema.columns
WHERE table_schema = 'cva_aggr'
AND table_name = 'employee' AND column_name in ('empid','empnm')
LOOP
EXECUTE
'Insert into tmp_table
SELECT '' || colum_lbl || '',count(distinct ' || colum_lbl || ')
FROM employee ';
END LOOP;
END; $$
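One possible fix, sketched under assumptions: inside a quoted string, '' is an escaped single quote, so the || operators in the code above never run and the variable is not substituted. Building the statement with format() (PostgreSQL 9.1 or later) avoids the quote counting: %L inserts the value as a string literal and %I as an identifier. The tables demo_employee and demo_counts are hypothetical stand-ins for cva_aggr.employee and tmp_table.

```sql
-- Hypothetical demo data standing in for cva_aggr.employee.
CREATE TEMP TABLE demo_employee (empid int, empnm text);
INSERT INTO demo_employee VALUES (1, 'ann'), (2, 'bob'), (3, 'ann');
CREATE TEMP TABLE demo_counts (colnm text, cnt bigint);

DO $$
DECLARE
   colum_lbl text;
BEGIN
   FOR colum_lbl IN
      SELECT column_name
      FROM   information_schema.columns
      WHERE  table_name = 'demo_employee'
      AND    column_name IN ('empid', 'empnm')
   LOOP
      -- %L: column name as a quoted literal, %I: column name as an identifier
      EXECUTE format(
         'INSERT INTO demo_counts SELECT %L, count(DISTINCT %I) FROM demo_employee',
         colum_lbl, colum_lbl);
   END LOOP;
END
$$;

SELECT * FROM demo_counts ORDER BY colnm;
```

With the sample rows above, the result should list empid with 3 distinct values and empnm with 2.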

In postgres/timescaledb, for all tables that match filter get all results with condition

I have a timescale db with multiple tables having the same structure.
I want to retrieve the most recent row from each table where a value is true.
My logic is to:
retrieve the names of all tables where this condition can be true
loop over the list of table names and select the rows where the condition is met
I get a syntax error on the FOR loop, but I expect I am doing more things wrong.
Can someone suggest a solution, please? Thank you in advance.
DECLARE
tablename text;
BEGIN
FOR tablename IN
SELECT table_name FROM information_schema.tables
WHERE table_name LIKE 'ohlc%'
LOOP
SELECT WHERE tablename.is_active is TRUE
ORDER BY time_stamp DESC
Limit 1
END LOOP;
END;
To restate your problem:
First, find the tables in the schema that have a specific column name (see "How to find a table having a specific column in postgresql").
Then loop over the tables that meet that condition (see "Function to loop through and select data from multiple tables").
The trickiest part is quoting identifiers correctly with quote_ident.
create or replace function test0()
returns table (_is_active boolean, id int) as
$$
declare tbl text;
begin
for tbl in
select quote_ident( table_name)
from information_schema.columns
where table_schema = 'public'
and table_name ilike 'ohlc%'
and column_name = 'is_active'
loop
return query EXECUTE
'select ' || quote_ident('is_active') || ' , ' || quote_ident('id') || ' from ' || tbl || ' where '|| quote_ident('is_active') ||' is true';
end loop;
end
$$ language plpgsql;
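The function above returns every active row. A variation that keeps only the most recent active row per table, as the question asks, might look like the sketch below. The function name latest_active_rows is hypothetical, and the sketch assumes every matching table lives in schema public and has a time_stamp column that can be cast to timestamptz.

```sql
CREATE OR REPLACE FUNCTION latest_active_rows()   -- hypothetical name
  RETURNS TABLE (tbl text, time_stamp timestamptz)
  LANGUAGE plpgsql AS
$$
DECLARE
   _tbl text;
BEGIN
   FOR _tbl IN
      SELECT c.table_name
      FROM   information_schema.columns c
      WHERE  c.table_schema = 'public'
      AND    c.table_name LIKE 'ohlc%'
      AND    c.column_name = 'is_active'
   LOOP
      -- %L embeds the table name as a literal, %I as a quoted identifier.
      RETURN QUERY EXECUTE format(
         'SELECT %L::text, t.time_stamp::timestamptz
          FROM   public.%I t
          WHERE  t.is_active
          ORDER  BY t.time_stamp DESC
          LIMIT  1', _tbl, _tbl);
   END LOOP;
END
$$;

-- One row per table that has at least one active row.
SELECT * FROM latest_active_rows();
```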

How to select a column from all tables in which it resides?

I have many tables that have the same column 'customer_number'.
I can get a list of all these table by query:
SELECT table_name FROM ALL_TAB_COLUMNS
WHERE COLUMN_NAME = 'customer_number';
The question is how do I get all the records that have a specific customer number from all these tables without running the same query against each of them.
To get records from a table, you have to write a query against that table. So you can't get all the records with the specified field without a query against each one of these tables.
If there is a subset of columns that you are interested in and this subset is shared among all tables, you may use a UNION/UNION ALL operation like this:
select * from (
select customer_number, phone, address from table1
union all
select customer_number, phone, address from table2
union all
select customer_number, phone, address from table3
)
where customer_number = 'my number'
Or, in the simple case where you just want to know which tables have records for a particular client:
select * from (
select 'table1' src_tbl, customer_number from table1
union all
select 'table2', customer_number from table2
union all
select 'table3', customer_number from table3
)
where customer_number = 'my number'
Otherwise you have to query each table separately.
DBMS_XMLGEN enables you to run dynamic SQL statements without custom PL/SQL.
Sample Schema
create table table1(customer_number number, a number, b number);
insert into table1 values(1,1,1);
create table table2(customer_number number, a number, c number);
insert into table2 values(2,2,2);
create table table3(a number, b number, c number);
insert into table3 values(3,3,3);
Query
--Get CUSTOMER_NUMBER and A from all tables with the column CUSTOMER_NUMBER.
--
--Convert XML to columns.
select
table_name,
to_number(extractvalue(xml, '/ROWSET/ROW/CUSTOMER_NUMBER')) customer_number,
to_number(extractvalue(xml, '/ROWSET/ROW/A')) a
from
(
--Get results as XML.
select table_name,
xmltype(dbms_xmlgen.getxml(
'select customer_number, a from '||table_name
)) xml
from user_tab_columns
where column_name = 'CUSTOMER_NUMBER'
);
TABLE_NAME CUSTOMER_NUMBER A
---------- --------------- -
TABLE1                   1 1
TABLE2                   2 2
Warnings
These overly generic solutions often have issues. They won't perform as well as plain old SQL statements and they are more likely to run into bugs. In general, these types of solutions should be avoided for production code. But they are still very useful for ad hoc queries.
Also, this solution assumes that you want the same columns from each row. If each row is different then things get much more complicated and you may need to look into technologies like ANYDATASET.
I assume you want to automate this. Two approaches.
SQL to generate SQL scripts
spool run_rep.sql
set head off pages 0 lines 200 trimspool on feedback off
SELECT 'prompt ' || table_name || chr(10) ||
'select ''' || table_name ||
''' tname, CUSTOMER_NUMBER from ' || table_name || ';' cmd
FROM all_tab_columns
WHERE column_name = 'CUSTOMER_NUMBER';
spool off
@run_rep.sql
PLSQL
Similar idea to use dynamic sql:
DECLARE
TYPE rcType IS REF CURSOR;
rc rcType;
CURSOR c1 IS SELECT table_name FROM all_tab_columns WHERE column_name = 'CUST_NUM';
cmd VARCHAR2(4000);
cNum NUMBER;
BEGIN
FOR r1 IN c1 LOOP
cmd := 'SELECT cust_num FROM ' || r1.table_name ;
OPEN rc FOR cmd;
LOOP
FETCH rc INTO cNum;
EXIT WHEN rc%NOTFOUND;
-- Probably best to INSERT this into a temp table and then
-- SELECT * from that to avoid DBMS_OUTPUT buffer-full issues
DBMS_OUTPUT.PUT_LINE ( 'T:' || r1.table_name || ' C: ' || cNum );
END LOOP;
CLOSE rc;
END LOOP;
END;

Dynamic Sql: Create array from records using array of column names

I am pulling all of the column_names (cname1) from a crosstab table that I made. There are thousands of these column names so I combined them into an array. I then want to use dynamic sql (or whatever works) to use those column_names to make an array based off of the records of that same crosstab table. I keep getting the error:
ERROR: missing "LOOP" at end of SQL expression
CREATE OR REPLACE FUNCTION mffcu.test_ty_hey()
RETURNS setof record
LANGUAGE plpgsql
AS $function$
Declare
cname1 text;
Begin
for cname1 in select array_agg(column_name) as useme
from(
select column_name::text
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'crosstab_183'
and ordinal_position != 1
) as fin
join mffcu.crosstab_183 a on fin.id = a.id;
loop
sql2 ='select distinct array['|| columnname ||'] from mffcu.crosstab_183';
execute sql2;
end loop;
END;
$function$
I cannot for the life of me figure out why I'm getting this error.
for cname1 in select array_agg(column_name) as useme
from(
select column_name::text
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'crosstab_183'
and ordinal_position != 1
) as fin
join mffcu.crosstab_183 a on fin.id = a.id; -- remove this semicolon: it ends the query before LOOP
loop

SELECT dynamic columns without functions in PostgreSQL

I need to select rows from two or more tables ("A", "B"). They have different columns, and I don't use inheritance for them.
For example:
SELECT * FROM "A" UNION SELECT * FROM "B"
ERROR: each UNION query must have the same number of columns
I can understand why.
I tried getting the common columns from the root table in the root schema:
SELECT column_name FROM information_schema.columns
WHERE table_schema = 'client_root' AND table_name ='conditions'
That works. But I can't use it in a query like this:
SELECT
(SELECT column_name FROM information_schema.columns
WHERE table_schema = 'client_root' AND table_name ='conditions')
FROM "client_123"."A"
So how can I use the result of the subselect as the column list of the outer SELECT?
What you are trying to do is hardly possible in its entirety.
Create dynamic SQL
First, here is what you can do: a plpgsql function that creates the SQL for such a query:
CREATE OR REPLACE FUNCTION f_union_common_col_sql(text, text)
RETURNS text
AS $function$
DECLARE
_cols text;
BEGIN
_cols := string_agg(attname, ', ')
FROM (
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $1::regclass::oid
AND a.attnum >= 1
INTERSECT
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $2::regclass::oid
AND a.attnum >= 1
) x;
RETURN 'SELECT ' || _cols || '
FROM ' || $1::regclass::text || '
UNION
SELECT ' || _cols || '
FROM ' || $2::regclass::text;
END;
$function$ LANGUAGE plpgsql;
COMMENT ON FUNCTION f_union_common_col_sql(text, text) IS 'Create SQL to query all visible columns that two tables have in common.
# Without duplicates. Use UNION ALL if you want to include duplicates.
# Depends on visibility dictated by search_path
$1 .. table1: optionally schema-qualified, case sensitive!
$2 .. table2: optionally schema-qualified, case sensitive!';
Call:
SELECT f_union_common_col_sql('myschema1.tbl1', 'myschema2.tbl2');
Gives you the complete query. Execute it in a second call.
You can find most everything I used here in the manual on plpgsql functions.
The aggregate function string_agg() was introduced with PostgreSQL 9.0. In older versions you would use: array_to_string(array_agg(attname), ', ').
Execute dynamic SQL?
Next, here is what you can hardly do:
CREATE OR REPLACE FUNCTION f_union_common_col(text, text)
RETURNS SETOF record AS
$BODY$
DECLARE
_cols text;
BEGIN
_cols := string_agg(attname, ', ')
FROM (
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $1::regclass::oid
AND a.attnum >= 1
INTERSECT
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $2::regclass::oid
AND a.attnum >= 1
) x;
RETURN QUERY EXECUTE '
SELECT ' || _cols || '
FROM ' || $1::regclass::text || '
UNION
SELECT ' || _cols || '
FROM ' || $2::regclass::text;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
COMMENT ON FUNCTION f_union_common_col(text, text) IS 'Query all visible columns that two tables have in common.
# Without duplicates. Use UNION ALL if you want to include duplicates.
# Depends on visibility dictated by search_path
# !BUT! you need to specify a column definition list for every call. So, hardly useful.
$1 .. table1 (optionally schema-qualified)
$2 .. table2 (optionally schema-qualified)';
A function call requires you to specify the list of target columns, so this is hardly useful at all:
SELECT * from f_union_common_col('myschema1.tbl1', 'myschema2.tbl2')
ERROR: a column definition list is required for functions returning "record"
There is no easy way around this. You would have to dynamically create a function or at least a complex type. This is where I stop.
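For reference, this is what a column definition list looks like in a call. The sketch uses a hypothetical stand-in function, demo_records, rather than f_union_common_col, so that it runs without any particular tables. The caller has to know the names and types of the result columns in advance, which is exactly what makes the approach awkward for dynamically chosen columns.

```sql
-- Hypothetical stand-in for any function declared RETURNS SETOF record.
CREATE OR REPLACE FUNCTION demo_records()
  RETURNS SETOF record
  LANGUAGE plpgsql AS
$$
BEGIN
   RETURN QUERY EXECUTE 'SELECT 1 AS id, ''x''::text AS label';
END
$$;

-- Fails with: a column definition list is required for functions returning "record"
-- SELECT * FROM demo_records();

-- Works: names and types are supplied by the caller and must match the result.
SELECT * FROM demo_records() AS t(id int, label text);
```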