INSERT with dynamic column names - sql

I have column names stored in variable colls, next I execute code:
DO $$
DECLARE
v_name text := quote_ident('colls');
BEGIN
EXECUTE 'insert into table1 select '|| colls ||' from table2 ';
-- EXECUTE 'insert into table1 select '|| v_name ||' from table2 ';
END$$;
I have got error: column "colls" does not exist. Program used colls as name not as variable. What am I doing wrong?
I have found similar example in documentation:
https://www.postgresql.org/docs/8.1/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

I have column names stored in variable colls
No, you don't. You have a variable v_name - which holds a single word: 'colls'. About variables in SQL:
User defined variables in PostgreSQL
Read the chapters Identifiers and Key Words and Constants in the manual.
And if you had multiple column names in a single variable, you could not use quote_ident() like that. It would escape the whole string as a single identifier.
I guess the basic misunderstanding is this: 'colls' is a string constant, not a variable. There are no other variables in a DO statement than the ones you declare in the DECLARE section. You might be looking for a function that takes a variable number of column names as parameter(s) ...
CREATE OR REPLACE FUNCTION f_insert_these_columns(VARIADIC _cols text[])
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT 'INSERT INTO table1 SELECT '
|| string_agg(quote_ident(col), ', ')
|| ' FROM table2'
FROM unnest(_cols) col
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_insert_these_columns('abd', 'NeW Deal'); -- column names case sensitive!
SELECT f_insert_these_columns(VARIADIC '{abd, NeW Deal}'); -- column names case sensitive!
Note how I unnest the array of column names and escape them one by one.
A VARIADIC parameter should be perfect for your use case. You can either pass a list of column names or an array.
Either way, be vary of SQL injection.
Related, with more explanation:
Pass multiple values in single parameter
Table name as a PostgreSQL function parameter

Related

Using a variable in a Postgres function [duplicate]

It must be simple, but I'm making my first steps into Postgres functions and I can't find anything that works...
I'd like to create a function that will modify a table and / or column and I can't find the right way of specifying my tables and columns as arguments in my function.
Something like:
CREATE OR REPLACE FUNCTION foo(t table)
RETURNS void AS $$
BEGIN
alter table t add column c1 varchar(20);
alter table t add column c2 varchar(20);
alter table t add column c3 varchar(20);
alter table t add column c4 varchar(20);
END;
$$ LANGUAGE PLPGSQL;
select foo(some_table)
In another case, I'd like to have a function that alters a certain column from a certain table:
CREATE OR REPLACE FUNCTION foo(t table, c column)
RETURNS void AS $$
BEGIN
UPDATE t SET c = "This is a test";
END;
$$ LANGUAGE PLPGSQL;
Is it possible to do that?
You must defend against SQL injection whenever you turn user input into code. That includes table and column names coming from system catalogs or from direct user input alike. This way you also prevent trivial exceptions with non-standard identifiers. There are basically three built-in methods:
1. format()
1st query, sanitized:
CREATE OR REPLACE FUNCTION foo(_t text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('
ALTER TABLE %I ADD COLUMN c1 varchar(20)
, ADD COLUMN c2 varchar(20)', _t);
END
$func$;
format() requires Postgres 9.1 or later. Use it with the %I format specifier.
The table name alone may be ambiguous. You may have to provide the schema name to avoid changing the wrong table by accident. Related:
INSERT with dynamic table name in trigger function
How does the search_path influence identifier resolution and the "current schema"
Aside: adding multiple columns with a single ALTER TABLE command is cheaper.
2. regclass
You can also use a cast to a registered class (regclass) for the special case of existing table names. Optionally schema-qualified. This fails immediately and gracefully for table names that are not be valid and visible to the calling user. The 1st query sanitized with a cast to regclass:
CREATE OR REPLACE FUNCTION foo(_t regclass)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'ALTER TABLE ' || _t || ' ADD COLUMN c1 varchar(20)
, ADD COLUMN c2 varchar(20)';
END
$func$;
Call:
SELECT foo('table_name');
Or:
SELECT foo('my_schema.table_name'::regclass);
Aside: consider using just text instead of varchar(20).
3. quote_ident()
The 2nd query sanitized:
CREATE OR REPLACE FUNCTION foo(_t regclass, _c text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'UPDATE ' || _t -- sanitized with regclass
|| ' SET ' || quote_ident(_c) || ' = ''This is a test''';
END
$func$;
For multiple concatenations / interpolations, format() is cleaner ...
Related answers:
Table name as a PostgreSQL function parameter
Postgres functions vs prepared queries
Case sensitive!
Be aware that unquoted identifiers are not cast to lower case here. When used as identifier in SQL [Postgres casts to lower case automatically][7]. But here we pass strings for dynamic SQL. When escaped as demonstrated, CaMel-case identifiers (like UserS) will be preserved by doublequoting ("UserS"), just like other non-standard names like "name with space" "SELECT"etc. Hence, names are case sensitive in this context.
My standing advice is to use legal lower case identifiers exclusively and never worry about that.
Aside: single quotes are for values, double quotes are for identifiers. See:
Are PostgreSQL column names case-sensitive?

Dynamic column name as date in postgresql crosstab [duplicate]

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.

Column names as variables

I want to pass column name as a parameter. Here I have written a query to select the column 'user_id' from table users, but it returns user_id as text.
EXECUTE format('SELECT $1 FROM users
WHERE name=''name''')
USING 'user_id';
How can I do this?
Can you not use Dynamic SQL to get this result? something like:
EXECUTE 'SELECT '|| get_columns()|| ' FROM table_name'
According to documentation...
Note that parameter symbols can only be used for data values — if you
want to use dynamically determined table or column names, you must
insert them into the command string textually. For example, if the
preceding query needed to be done against a dynamically selected table...
So you need something like this
EXECUTE 'SELECT '
|| quote_ident(p_column)
|| ' FROM users '
|| ' WHERE name = $1'
INTO v_result
USING name;
p_column is the column name received as a parameter in your function.
v_result is a variable for storing the result. This INTO part can be discarded if you don't care about the result.
name is the value supplied to the where condition. Must be somewhere in your function or get as a parameter too.

Update substrings using lookup table and replace function

Here's my setup:
Table 1 (table_with_info): Contains a list of varchars with substrings that I'd like to replace.
Table 2 (sub_info): Contains two columns: the substring in table_with_info that I'd like to replace and the string I'd like to replace it with.
What I'd like to do is replace all the substrings in table_with_info with their substitutions in sub_info.
This works to a point but the issue is that select replace(...) returns a new row for each one of the substituted words replaced and doesn't replace all of the ones in an individual row.
I'm explaining the best I can but I don't know if it's too clear. Here's the code an example of what's happening/what I'd like to happen.
Here's my code:
create table table_with_info
(
val varchar
);
insert into table_with_info values
('this this is test data');
create table sub_info
(
word_from varchar,
word_to varchar
);
insert into sub_info values
('this','replace1')
, ('test', 'replace2');
update table_with_info set val = (select replace("val", "word_from", "word_to")
from "table_with_info", "sub_info"
the update() function doesn't work as select() returns two rows:
Row 1: replace1 replace1 is test data
Row 2: this this is replace2 data
so what I'd like for it for the select statement to return is:
Row 1: replace1 replace1 is test data
Any thoughts? I can't create UDFs on the system I'm running.
Your UPDATE statement is incorrect in multiple ways. Consult the manual before you try to run anything like this again. You introduce two cross joins that would make this statement extremely expensive, besides yielding nonsense.
To do this properly, you need to administer each UPDATE sequentially. In a single statement, one row version eliminates the other, while each replace would use the same original row version. You can use a DO statement for this or wrap it in a plpgsql function for instance:
DO
$do$
DECLARE
r sub_info;
BEGIN
FOR r IN
TABLE sub_info
-- SELECT * FROM sub_info ORDER BY ??? -- order is relevant
LOOP
UPDATE table_with_info
SET val = replace(val, r.word_from, r.word_to)
WHERE val LIKE ('%' || r.word_from || '%'); -- avoid empty updates
END LOOP;
END
$do$;
Be aware, that the order in which updates are applied can make a difference! If the first update creates a string where the second matches (but not otherwise) ..
So, order your columns in sub_info if that can be relevant.
Avoid empty updates. Without the additional WHERE clause, you would write many new row versions without changing anything. Expensive and useless.
double-quotes are optional for legal, lower-case names.
->SQLfiddle
Expanding on Erwin's answer, a do block with dynamic SQL can do the trick as well:
do $$
declare
rec record;
repl text;
begin
repl := 'val'; -- quote_ident() this if needed
for rec in select word_from, word_to from sub_info
loop
repl := 'replace(' || repl || ', '
|| quote_literal(rec.word_from) || ', '
|| quote_literal(rec.word_to) || ')';
end loop;
-- now do them all in a single query
execute 'update ' || 'table_with_info'::regclass || ' set val = ' || repl;
end;
$$ language plpgsql;
Optionally, build a like parameter in a similar way to avoid updating rows needlessly.

Update multiple columns in a trigger function in plpgsql

Given the following schema:
create table account_type_a (
id SERIAL UNIQUE PRIMARY KEY,
some_column VARCHAR
);
create table account_type_b (
id SERIAL UNIQUE PRIMARY KEY,
some_other_column VARCHAR
);
create view account_type_a view AS select * from account_type_a;
create view account_type_b view AS select * from account_type_b;
I try to create a generic trigger function in plpgsql, which enables updating the view:
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
An unsuccessful effort of mine was:
create function updateAccount() returns trigger as $$
declare
target_table varchar := substring(TG_TABLE_NAME from '(.+)_view');
cols varchar;
begin
execute 'select string_agg(column_name,$1) from information_schema.columns
where table_name = $2' using ',', target_table into cols;
execute 'update ' || target_table || ' set (' || cols || ') = select ($1).*
where id = ($1).id' using NEW;
return NULL;
end;
$$ language plpgsql;
The problem is the update statement. I am unable to come up with a syntax that would work here. I have successfully implemented this in PL/Perl, but would be interested in a plpgsql-only solution.
Any ideas?
Update
As #Erwin Brandstetter suggested, here is the code for my PL/Perl solution. I incoporated some of his suggestions.
create function f_tr_up() returns trigger as $$
use strict;
use warnings;
my $target_table = quote_ident($_TD->{'table_name'}) =~ s/^([\w]+)_view$/$1/r;
my $NEW = $_TD->{'new'};
my $cols = join(',', map { quote_ident($_) } keys $NEW);
my $vals = join(',', map { quote_literal($_) } values $NEW);
my $query = sprintf(
"update %s set (%s) = (%s) where id = %d",
$target_table,
$cols,
$vals,
$NEW->{'id'});
spi_exec_query($query);
return;
$$ language plperl;
While #Gary's answer is technically correct, it fails to mention that PostgreSQL does support this form:
UPDATE tbl
SET (col1, col2, ...) = (expression1, expression2, ..)
Read the manual on UPDATE.
It's still tricky to get this done with dynamic SQL. I'll assume a simple case where views consist of the same columns as their underlying tables.
CREATE VIEW tbl_view AS SELECT * FROM tbl;
Problems
The special record NEW is not visible inside EXECUTE. I pass NEW as a single parameter with the USING clause of EXECUTE.
As discussed, UPDATE with list-form needs individual values. I use a subselect to split the record into individual columns:
UPDATE ...
FROM (SELECT ($1).*) x
(Parenthesis around $1 are not optional.) This allows me to simply use two column lists built with string_agg() from the catalog table: one with and one without table qualification.
It's not possible to assign a row value as a whole to individual columns. The manual:
According to the standard, the source value for a parenthesized
sub-list of target column names can be any row-valued expression
yielding the correct number of columns. PostgreSQL only allows the
source value to be a row constructor or a sub-SELECT.
INSERT is implemented simpler. If the structure of view and table are identical we can omit the column definition list. (Can be improved, see below.)
Solution
I made a couple of updates to your approach to make it shine.
Trigger function for UPDATE:
CREATE OR REPLACE FUNCTION f_trg_up()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME from '(.+)_view$'));
_cols text;
_vals text;
BEGIN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ')
, string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s
SET (%s) = (%s)
FROM (SELECT ($1).*) x', _tbl, _cols, _vals)
USING NEW;
RETURN NEW; -- Don't return NULL unless you knwo what you're doing
END
$func$;
Trigger function for INSERT:
CREATE OR REPLACE FUNCTION f_trg_ins()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME FROM '(.+)_view$'));
BEGIN
EXECUTE format('INSERT INTO %s SELECT ($1).*', _tbl)
USING NEW;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Triggers:
CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();
CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();
Before Postgres 11 the syntax (oddly) was EXECUTE PROCEDURE instead of EXECUTE FUNCTION - which also still works.
db<>fiddle here - demonstrating INSERT and UPDATE
Old sqlfiddle
Major points
Include the schema name to make the table reference unambiguous. There can be multiple table of the same name in one database with multiple schemas!
Query pg_catalog.pg_attribute instead of information_schema.columns. Less portable, but much faster and allows to use the table-OID.
How to check if a table exists in a given schema
Table names are NOT safe against SQLi when concatenated as strings for dynamic SQL. Escape with quote_ident() or format() or with an object-identifer type. This includes the special trigger function variables TG_TABLE_SCHEMA and TG_TABLE_NAME!
Cast to the object identifier type regclass to assert the table name is valid and get the OID for the catalog look-up.
Optionally use format() to build the dynamic query string safely.
No need for dynamic SQL for the first query on the catalog tables. Faster, simpler.
Use RETURN NEW instead of RETURN NULL in these trigger functions unless you know what you are doing. (NULL would cancel the INSERT for the current row.)
This simple version assumes that every table (and view) has a unique column named id. A more sophisticated version might use the primary key dynamically.
The function for UPDATE allows the columns of view and table to be in any order, as long as the set is the same.
The function for INSERT expects the columns of view and table to be in identical order. If you want to allow arbitrary order, add a column definition list to the INSERT command, just like with UPDATE.
Updated version also covers changes to the id column by using OLD additionally.
Postgresql doesn't support updating multiple columns using the set (col1,col2) = select val1,val2 syntax.
To achieve the same in postgresql you'd use
update target_table
set col1 = d.val1,
col2 = d.val2
from source_table d
where d.id = target_table.id
This is going to make the dynamic query a bit more complex to build as you'll need to iterate the column name list you're using into individual fields. I'd suggest you use array_agg instead of string_agg as an array is easier to process than splitting the string again.
Postgresql UPDATE syntax
documentation on array_agg function