SELECT dynamic columns without functions in PostgreSQL

I need to select rows from two or more tables ("A", "B"). They have different columns, and I don't use inheritance for them.
For example:
SELECT * FROM "A" UNION SELECT * FROM "B"
ERROR: each UNION query must have the same number of columns
I understand why.
So I tried to get the shared columns from the root table in the root schema:
SELECT column_name FROM information_schema.columns
WHERE table_schema = 'client_root' AND table_name ='conditions'
That works. But I can't use it in a query like this:
SELECT
(SELECT column_name FROM information_schema.columns
WHERE table_schema = 'client_root' AND table_name ='conditions')
FROM "client_123"."A"
So how can I feed the result of the subselect into the outer SELECT list?

What you are trying to do is hardly possible in its entirety.
Create dynamic SQL
First, here is what you can do: a plpgsql function that creates the SQL for such a query:
CREATE OR REPLACE FUNCTION f_union_common_col_sql(text, text)
RETURNS text
AS $function$
DECLARE
_cols text;
BEGIN
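_cols := string_agg(quote_ident(attname), ', ') -- quote column names safely
FROM (
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $1::regclass::oid
AND a.attnum >= 1 -- exclude system columns
AND NOT a.attisdropped -- exclude dropped columns
INTERSECT
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $2::regclass::oid
AND a.attnum >= 1
AND NOT a.attisdropped
) x;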
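-- The regclass cast validates each name and keeps schema qualification intact;
-- quote_ident() would wrap 'myschema1.tbl1' into one single quoted identifier.
RETURN 'SELECT ' || _cols || '
FROM ' || $1::regclass::text || '
UNION
SELECT ' || _cols || '
FROM ' || $2::regclass::text;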
END;
$function$ LANGUAGE plpgsql;
COMMENT ON FUNCTION f_union_common_col_sql(text, text) IS 'Create SQL to query all visible columns that two tables have in common.
# Without duplicates. Use UNION ALL if you want to include duplicates.
# Depends on visibility dictated by search_path
$1 .. table1: optionally schema-qualified, case sensitive!
$2 .. table2: optionally schema-qualified, case sensitive!';
Call:
SELECT f_union_common_col_sql('myschema1.tbl1', 'myschema2.tbl2');
Gives you the complete query. Execute it in a second call.
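If you work in psql 9.6 or later, the second call can be folded into the first one with the \gexec meta-command, which sends each value of the result back to the server as a new statement. A minimal sketch, assuming the function above has been created and using the placeholder table names from the example call:

```sql
-- psql only: \gexec executes the generated text as a query
SELECT f_union_common_col_sql('myschema1.tbl1', 'myschema2.tbl2') \gexec
```

From any other client you would fetch the string and send it as a second statement yourself.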
You can find almost everything I used here in the manual on plpgsql functions.
The aggregate function string_agg() was introduced with PostgreSQL 9.0. In older versions you would use: array_to_string(array_agg(attname), ', ').
Execute dynamic SQL?
Next, here is what you can hardly do:
CREATE OR REPLACE FUNCTION f_union_common_col(text, text)
RETURNS SETOF record AS
$BODY$
DECLARE
_cols text;
BEGIN
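_cols := string_agg(quote_ident(attname), ', ') -- quote column names safely
FROM (
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $1::regclass::oid
AND a.attnum >= 1 -- exclude system columns
AND NOT a.attisdropped -- exclude dropped columns
INTERSECT
SELECT a.attname
FROM pg_attribute a
WHERE a.attrelid = $2::regclass::oid
AND a.attnum >= 1
AND NOT a.attisdropped
) x;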
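-- Table names must be concatenated into the string, not left inside the literal.
RETURN QUERY EXECUTE '
SELECT ' || _cols || '
FROM ' || $1::regclass::text || '
UNION
SELECT ' || _cols || '
FROM ' || $2::regclass::text;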
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
COMMENT ON FUNCTION f_union_common_col(text, text) IS 'Query all visible columns that two tables have in common.
# Without duplicates. Use UNION ALL if you want to include duplicates.
# Depends on visibility dictated by search_path
# !BUT! you need to specify a column definition list for every call. So, hardly useful.
$1 .. table1 (optionally schema-qualified)
$2 .. table2 (optionally schema-qualified)';
A function call requires you to specify the list of target columns, so this is hardly useful at all:
SELECT * from f_union_common_col('myschema1.tbl1', 'myschema2.tbl2')
ERROR: a column definition list is required for functions returning "record"
There is no easy way around this. You would have to dynamically create a function or at least a complex type. This is where I stop.
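For completeness, this is roughly what a working call would have to look like. It is only a sketch: it assumes the two tables share exactly the columns id (integer) and name (text), which are made-up names, and that the function happens to emit them in that order.

```sql
SELECT *
FROM   f_union_common_col('myschema1.tbl1', 'myschema2.tbl2')
       AS t(id integer, name text);  -- hypothetical shared columns, hypothetical order
```

Since you must already know the shared columns and their types to write that list, the function saves you very little.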

Related

Select from all tables inside the schema containing column with name

How can I select (table_name, table_name.age)?
I need to get the values of the column 'age' from all tables that have this column.
I have this function
CREATE OR REPLACE FUNCTION union_all_tables()
RETURNS TABLE
(
age bigint
) AS
$$
DECLARE
dynamic_query text = '';
r_row record;
BEGIN
FOR r_row IN SELECT table_schema || '.' || table_name qualified_table_name
FROM information_schema.COLUMNS
WHERE column_name = 'age'
LOOP
dynamic_query := dynamic_query || format('UNION SELECT ' ||
'age ' ||
'FROM %s ',r_row.qualified_table_name) || E'\n'; -- adding new line for pretty print, it is not necessary
END LOOP;
dynamic_query := SUBSTRING(dynamic_query, 7) || ';';
RAISE NOTICE 'Union all tables in staging, executing statement: %', dynamic_query;
RETURN QUERY EXECUTE dynamic_query;
END;
$$
LANGUAGE plpgsql;
You don't need to generate a single huge UNION statement. If you use RETURN QUERY the result of that query is appended to the overall result of the function every time you use it.
When dealing with dynamic SQL you should also use format() to properly deal with identifiers.
Your function can be simplified to:
CREATE OR REPLACE FUNCTION union_all_tables()
RETURNS TABLE (table_schema text, table_name text, age bigint)
AS
$$
DECLARE
dynamic_query text = '';
r_row record;
BEGIN
FOR r_row IN SELECT c.table_schema, c.table_name
FROM information_schema.columns c
WHERE c.column_name = 'age'
LOOP
dynamic_query := format(
'select %L::text as table_schema, %L::text as table_name, age from %I.%I', -- explicit casts so the literals match the declared text columns
r_row.table_schema, r_row.table_name,
r_row.table_schema, r_row.table_name);
RETURN QUERY EXECUTE dynamic_query;
END LOOP;
END;
$$
LANGUAGE plpgsql;
Note that the whole function will fail if there is (at least) one table where the age column is not a bigint.
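One way to avoid that failure is to let the catalog query skip tables whose age column has a different type. A minimal sketch of the adjusted driving query, assuming you simply want to ignore those tables (the alternative would be to cast age to bigint inside the dynamic statement):

```sql
-- Only pick up tables whose "age" column is declared as bigint
SELECT c.table_schema, c.table_name
FROM   information_schema.columns c
WHERE  c.column_name = 'age'
AND    c.data_type   = 'bigint';
```

Used as the FOR loop source in the function above, this would silently leave out any table with, say, an integer or text age column.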

How to use dynamic column names in an UPDATE or SELECT statement in a function?

In PostgreSQL 9.1, PL/pgSQL, given a query:
select fk_list.relname from ...
where relname is of type name (e.g., "table_name").
How do you get the appropriate value for "relname" that can be used directly in an UPDATE statement as:
Update <relname> set ...
within the PL/pgSQL script?
Using quote_ident(r.relname) as:
Update quote_ident(r.relname) Set ...
fails with:
syntax error at or near "("
LINE 55: UPDATE quote_ident(r.relname) ....
The complete code I am working with:
CREATE FUNCTION merge_children_of_icd9 (ocicd9 text,
ocdesc text, ncicd9 text, ncdesc text)
RETURNS void AS $BODY$
DECLARE
r RECORD;
BEGIN
FOR r IN
WITH fk_actions ( code, action ) AS (
VALUES ('a', 'error'),
('r', 'restrict'),
('c', 'cascade'),
('n', 'set null'),
('d', 'set default')
), fk_list AS (
SELECT pg_constraint.oid AS fkoid, conrelid, confrelid::regclass AS parentid,
conname, relname, nspname,
fk_actions_update.action AS update_action,
fk_actions_delete.action AS delete_action,
conkey AS key_cols
FROM pg_constraint
JOIN pg_class ON conrelid = pg_class.oid
JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
JOIN fk_actions AS fk_actions_update ON confupdtype = fk_actions_update.code
JOIN fk_actions AS fk_actions_delete ON confdeltype = fk_actions_delete.code
WHERE contype = 'f'
), fk_attributes AS (
SELECT fkoid, conrelid, attname, attnum
FROM fk_list
JOIN pg_attribute ON conrelid = attrelid AND attnum = ANY(key_cols)
ORDER BY fkoid, attnum
), fk_cols_list AS (
SELECT fkoid, array_agg(attname) AS cols_list
FROM fk_attributes
GROUP BY fkoid
)
SELECT fk_list.fkoid, fk_list.conrelid, fk_list.parentid, fk_list.conname,
fk_list.relname, fk_cols_list.cols_list
FROM fk_list
JOIN fk_cols_list USING (fkoid)
WHERE parentid = 'icd9'::regclass
LOOP
RAISE NOTICE 'now in loop. relname is %', quote_ident(r.relname);
RAISE NOTICE 'cols_list[1] is %', quote_ident(r.cols_list[1]);
RAISE NOTICE 'cols_list[2] is %', quote_ident(r.cols_list[2]);
RAISE NOTICE 'now doing update';
UPDATE quote_ident(r.relname) SET r.cols_list[1] = ncicd9, r.cols_list[2] = ncdesc
WHERE r.cols_list[1] = ocicd9 AND r.cols_list[2] = ocdesc;
RAISE NOTICE 'finished update';
END LOOP;
RETURN;
END $BODY$ LANGUAGE plpgsql VOLATILE;
-- select merge_children_of_icd9('', 'aodm type 2', '', 'aodm, type 2');
I'm sure this kind of thing is done often, but I can't seem to find anything like it using PostgreSQL. Is there a better way?
In an UPDATE statement in PL/pgSQL, the table name has to be given as a literal. If you want to dynamically set the table name and the columns, you should use the EXECUTE command and paste the query string together:
EXECUTE 'UPDATE ' || quote_ident(r.relname) ||
' SET ' || quote_ident(r.cols_list[1]) || ' = $1, ' ||
quote_ident(r.cols_list[2]) || ' = $2' ||
' WHERE ' || quote_ident(r.cols_list[1]) || ' = $3 AND ' ||
quote_ident(r.cols_list[2]) || ' = $4'
USING ncicd9, ncdesc, ocicd9, ocdesc;
The USING clause can only be used for substituting data values, as shown above.
You need dynamic SQL with EXECUTE like @Patrick already provided.
However, both your function and the EXECUTE part can be much simpler. In particular, use format() to concatenate longer query strings safely (available since pg 9.1):
CREATE OR REPLACE FUNCTION merge_children_of_icd9 (_ocicd9 text, _ocdesc text
, _ncicd9 text, _ncdesc text)
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
_sql text;
BEGIN
FOR _sql IN
SELECT format('UPDATE %3$s SET %1$I = $3 , %2$I = $4
WHERE %1$I = $1 AND %2$I = $2'
, x.cols[1], x.cols[2], x.conrelid::regclass::text)
FROM (
SELECT c.conrelid, array_agg(a.attname ORDER BY a.attnum) AS cols
FROM pg_constraint c
JOIN pg_attribute a ON a.attrelid = c.conrelid
AND a.attnum = ANY(c.conkey)
WHERE c.confrelid = 'icd9'::regclass
AND c.contype = 'f'
GROUP BY c.oid, c.conrelid
ORDER BY c.oid
) x
LOOP
-- RAISE NOTICE '%', _sql; -- debug?
EXECUTE _sql
USING _ocicd9, _ocdesc, _ncicd9, _ncdesc;
END LOOP;
END
$func$;
The function errors out if the FK constraint does not span at least two columns or if the data type of the columns is not compatible with text. May or may not be as intended.
Details on how to pass identifiers safely and execute dynamic SQL:
Table name as a PostgreSQL function parameter
Define table and column names as arguments in a plpgsql function?

PLpgSQL function to find columns with only NULL values in a given table

We have to find columns of a table with only NULL values. We are trying to build a plpgsql function that takes a table's name and returns the list of such columns.
How to create such a function?
We are using PgAdmin 1.16.
You can query the catalog table pg_attribute to get a list of columns which are not defined NOT NULL and therefore can hold NULL values:
SELECT quote_ident(attname) AS column_can_be_null
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass -- valid, visible table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
AND NOT attnotnull -- exclude columns defined NOT NULL!
ORDER BY attnum;
Where tbl is your (optionally schema-qualified) table name.
This doesn't tell you whether a column actually contains any NULL values. You'd have to test each column. Like this:
Full automation with plpgsql function
CREATE OR REPLACE FUNCTION f_all_null_columns_of_tbl(_tbl regclass)
RETURNS SETOF text AS
$func$
DECLARE
_row_ct bigint; -- count rows in table $1
_sql text; -- SQL string to test for NULL values
_cols text[]; -- array of candidate column names
_nulls bool[]; -- array of test results
BEGIN
EXECUTE 'SELECT count(*) FROM ' || _tbl
INTO _row_ct;
IF _row_ct = 0 THEN
RAISE EXCEPTION 'Table % has no rows!', _tbl; -- pointless for empty table
ELSE
RAISE NOTICE '% rows in table %.', _row_ct, _tbl;
END IF;
SELECT INTO _sql, _cols
'SELECT ARRAY[' || string_agg('bool_and(' || col || ' IS NULL)', ', ')
|| '] FROM ' || _tbl
, array_agg(col)
FROM (
SELECT quote_ident(attname) AS col
FROM pg_attribute
WHERE attrelid = _tbl -- valid, visible table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
AND NOT attnotnull -- exclude columns defined NOT NULL!
ORDER BY attnum
) sub;
EXECUTE _sql INTO _nulls;
FOR i IN 1 .. array_upper(_cols, 1)
LOOP
IF _nulls[i] THEN -- column is NULL in all rows
RETURN NEXT _cols[i];
END IF;
END LOOP;
RETURN;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_all_null_columns_of_tbl('my_schema.my_table');
Tested with Postgres 9.1 and 9.3.
This uses a number of advanced plpgsql features.
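To make the central step less opaque: the string assembled in _sql is a single statement that tests all candidate columns in one pass over the table. For a hypothetical table my_schema.my_table with two nullable columns col_a and col_b (names invented for illustration), it would look roughly like this:

```sql
-- One boolean per candidate column: true if every row holds NULL in it
SELECT ARRAY[bool_and(col_a IS NULL), bool_and(col_b IS NULL)]
FROM   my_schema.my_table;
```

The resulting bool[] lines up position by position with the _cols array, which is why the final loop can pair _nulls[i] with _cols[i].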
SQL Fiddle.
Related answer building SQL code and executing it, with modern syntax:
Replace empty strings with null values
About traversing a record:
Loop through columns of RECORD

A column-safe `INSERT INTO t1 SELECT * FROM ...`?

Is there some way to do an INSERT INTO t1 SELECT * FROM ... such that it fails if the column names do not coincide?
I'm using PostgreSQL 9.x. The column names are not known in advance.
Motivation: I'm doing a periodic refresh of materialized views by the (quite standard) PL/pgSQL procedure:
CREATE OR REPLACE FUNCTION matview_refresh(name) RETURNS void AS
$BODY$
DECLARE
matview ALIAS FOR $1;
entry matviews%ROWTYPE;
BEGIN
SELECT * INTO entry FROM matviews WHERE mv_name = matview;
IF NOT FOUND THEN
RAISE EXCEPTION 'Materialized view % does not exist.', matview;
END IF;
EXECUTE 'TRUNCATE TABLE ' || matview;
EXECUTE 'INSERT INTO ' || matview || ' SELECT * FROM ' || entry.v_name;
UPDATE matviews SET last_refresh=CURRENT_TIMESTAMP WHERE mv_name=matview;
RETURN;
END;
$BODY$ LANGUAGE plpgsql;
I preferred a TRUNCATE followed by an INSERT INTO ... SELECT * over a DROP/CREATE because it seemed lighter and more concurrency-friendly. It would fail if someone added or removed columns from the view (then I would do the DROP/CREATE), but that doesn't matter: in that case the refresh would not complete and we would catch the problem soon. What does matter is what happened today: someone swapped the order of two columns of the same type in the view, and the refresh inserted bogus data.
Build this into your plpgsql function to verify that view and table share the same column names in the same sequence exactly:
IF EXISTS (
SELECT 1
FROM (
SELECT *
FROM pg_attribute
WHERE attrelid = matview::regclass
AND attisdropped = FALSE
AND attnum > 0
) t
FULL OUTER JOIN (
SELECT *
FROM pg_attribute
WHERE attrelid = entry.v_name::regclass
AND attisdropped = FALSE
AND attnum > 0
) v USING (attnum, attname) -- atttypid to check for type, too
WHERE t.attname IS NULL
OR v.attname IS NULL
) THEN
RAISE EXCEPTION 'Mismatch between table and view!';
END IF;
The FULL OUTER JOIN adds a row with NULL values for any mismatch between the list of column names. So, if EXISTS finds a row, something is off.
And the cast to ::regclass would raise an exception right away if either the table or the view does not exist (or is out of scope: not in the search_path and not schema-qualified).
If you also want to check data types of the columns, just add atttypid to the USING clause.
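A standalone version of that check, with the type comparison included, can also be handy for debugging because it lists the offending columns instead of just raising an exception. This is a sketch: my_matview and my_view are placeholder names for your table and view.

```sql
SELECT attnum, attname, atttypid::regtype AS data_type
     , t.attrelid IS NULL AS missing_in_table
     , v.attrelid IS NULL AS missing_in_view
FROM  (
   SELECT attrelid, attnum, attname, atttypid
   FROM   pg_attribute
   WHERE  attrelid = 'my_matview'::regclass  -- placeholder table name
   AND    NOT attisdropped
   AND    attnum > 0
   ) t
FULL   OUTER JOIN (
   SELECT attrelid, attnum, attname, atttypid
   FROM   pg_attribute
   WHERE  attrelid = 'my_view'::regclass     -- placeholder view name
   AND    NOT attisdropped
   AND    attnum > 0
   ) v USING (attnum, attname, atttypid)     -- position, name and type must all match
WHERE  t.attrelid IS NULL
   OR  v.attrelid IS NULL
ORDER  BY attnum;
```

An empty result means table and view agree. Note that atttypid only compares the base type; a difference in type modifier, such as varchar(10) versus varchar(20), would additionally need atttypmod.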
As an aside: querying pg_catalog tables is regularly faster by an order of magnitude than querying the bloated views in information_schema. information_schema is only good for SQL standard compliance and portability of code. Since you are writing 100 % Postgres-specific code, neither is relevant here.
You can query information_schema.columns to get columns in the right order:
SELECT INTO cols array_to_string(array_agg(column_name::text), ',')
FROM (
SELECT column_name
FROM information_schema.columns
WHERE table_name = matview -- the function parameter, not the string literal 'matview'
ORDER BY ordinal_position
) AS x;
EXECUTE 'INSERT INTO ' || matview || ' SELECT ' || cols || ' FROM ' || entry.v_name;
You can get column list directly from pg_attribute -- just replace inner SELECT from information_schema.columns by:
SELECT attname AS column_name
FROM pg_attribute
WHERE attrelid = matview::regclass AND NOT attisdropped AND attnum > 0 -- skip system columns
ORDER BY attnum;

Dynamically select the columns to be used in a SELECT statement

I would love to be able to use the system tables (Oracle in this case) to drive which fields are used in a SELECT statement. Something like:
SELECT
(
select column_name
from all_tab_cols
where table_Name='CLARITY_SER'
AND OWNER='CLARITY'
AND data_type='DATE'
)
FROM CLARITY_SER
This syntax doesn't work, as the subquery returns multiple rows, instead of one row with multiple columns.
Is it possible to generate a SQL statement dynamically by querying the table schema information in order to select only certain columns?
** edit **
Do this without using a function or procedure, if possible.
You can do this:
declare
l_sql varchar2(32767);
rc sys_refcursor;
begin
l_sql := 'select ';
for r in
( select column_name
from all_tab_cols
where table_Name='CLARITY_SER'
AND OWNER='CLARITY'
AND data_type='DATE'
)
loop
l_sql := l_sql || r.column_name || ',';
end loop;
l_sql := rtrim(l_sql,',') || ' from clarity_ser';
open rc for l_sql;
...
end;
No, it's not possible to specify a column list dynamically in SQL. You'll need to use a procedural language to run the first query, use that to construct a second query, then run the second query.
You could use dynamic SQL. Create a function that takes the table name, owner, data type, executes the inner query and returns a comma-separated list of column names, or an array table if you prefer. Then construct the outer query and execute it with execute immediate.
CREATE FUNCTION get_column_list(
table_name IN varchar2,
owner_name IN varchar2,
data_type IN varchar2)
RETURN varchar2
IS
BEGIN
...... (get columns and return comma-separated list)
END;
/
If your function returns a comma-separated list you can inline it:
execute immediate 'select ' || get_column_list(table_name, owner_name, datatype) || ' from ' || table_name
Admittedly it's a long time since I played with Oracle, so I may be a bit off, but I'm pretty sure this is quite doable.
In SQL*Plus you could do this:
COLUMN cols NEW_VALUE cols
SELECT max( ltrim( sys_connect_by_path( column_name, ',' ), ',' ) ) cols
FROM
(
select rownum rn, column_name
from all_tab_cols
where table_Name='CLARITY_SER'
and OWNER='CLARITY'
AND data_type='DATE'
)
start with rn = 1 connect by rn = prior rn +1
;
select &cols from clarity.clarity_ser;