Update multiple columns in a trigger function in plpgsql

Given the following schema:
create table account_type_a (
id SERIAL UNIQUE PRIMARY KEY,
some_column VARCHAR
);
create table account_type_b (
id SERIAL UNIQUE PRIMARY KEY,
some_other_column VARCHAR
);
create view account_type_a_view AS select * from account_type_a;
create view account_type_b_view AS select * from account_type_b;
I am trying to create a generic trigger function in plpgsql that enables updating the views:
create trigger trUpdate instead of UPDATE on account_type_a_view
for each row execute procedure updateAccount();
create trigger trUpdate instead of UPDATE on account_type_b_view
for each row execute procedure updateAccount();
An unsuccessful effort of mine was:
create function updateAccount() returns trigger as $$
declare
target_table varchar := substring(TG_TABLE_NAME from '(.+)_view');
cols varchar;
begin
execute 'select string_agg(column_name,$1) from information_schema.columns
where table_name = $2' using ',', target_table into cols;
execute 'update ' || target_table || ' set (' || cols || ') = select ($1).*
where id = ($1).id' using NEW;
return NULL;
end;
$$ language plpgsql;
The problem is the update statement. I am unable to come up with a syntax that would work here. I have successfully implemented this in PL/Perl, but would be interested in a plpgsql-only solution.
Any ideas?
Update
As @Erwin Brandstetter suggested, here is the code for my PL/Perl solution. I incorporated some of his suggestions.
create function f_tr_up() returns trigger as $$
use strict;
use warnings;
my $target_table = quote_ident($_TD->{'table_name'}) =~ s/^([\w]+)_view$/$1/r;
my $NEW = $_TD->{'new'};
my $cols = join(',', map { quote_ident($_) } keys $NEW);
my $vals = join(',', map { quote_literal($_) } values $NEW);
my $query = sprintf(
"update %s set (%s) = (%s) where id = %d",
$target_table,
$cols,
$vals,
$NEW->{'id'});
spi_exec_query($query);
return;
$$ language plperl;

While @Gary's answer is technically correct, it fails to mention that PostgreSQL does support this form:
UPDATE tbl
SET (col1, col2, ...) = (expression1, expression2, ..)
Read the manual on UPDATE.
It's still tricky to get this done with dynamic SQL. I'll assume a simple case where views consist of the same columns as their underlying tables.
CREATE VIEW tbl_view AS SELECT * FROM tbl;
Problems
The special record NEW is not visible inside EXECUTE. I pass NEW as a single parameter with the USING clause of EXECUTE.
As discussed, UPDATE with list-form needs individual values. I use a subselect to split the record into individual columns:
UPDATE ...
FROM (SELECT ($1).*) x
(Parentheses around $1 are not optional.) This allows me to simply use two column lists built with string_agg() from the catalog table: one with and one without table qualification.
It's not possible to assign a row value as a whole to individual columns. The manual:
According to the standard, the source value for a parenthesized
sub-list of target column names can be any row-valued expression
yielding the correct number of columns. PostgreSQL only allows the
source value to be a row constructor or a sub-SELECT.
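As a minimal sketch of the two source forms the manual mentions: the temporary table demo_tbl and its values are made up for illustration, and the sub-SELECT form needs Postgres 9.5 or later.

```sql
-- Hypothetical demo table, not part of the original schema
CREATE TEMP TABLE demo_tbl (id int PRIMARY KEY, col1 text, col2 text);
INSERT INTO demo_tbl VALUES (1, 'a', 'b');

-- Source value is a row constructor
UPDATE demo_tbl SET (col1, col2) = ('x', 'y') WHERE id = 1;

-- Source value is a sub-SELECT returning exactly one row
UPDATE demo_tbl SET (col1, col2) = (SELECT 'p'::text, 'q'::text) WHERE id = 1;
```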
INSERT is implemented simpler. If the structure of view and table are identical we can omit the column definition list. (Can be improved, see below.)
Solution
I made a couple of updates to your approach to make it shine.
Trigger function for UPDATE:
CREATE OR REPLACE FUNCTION f_trg_up()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME from '(.+)_view$'));
_cols text;
_vals text;
BEGIN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ')
, string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s t
SET (%s) = (%s)
FROM (SELECT ($1).*) x
WHERE t.id = ($2).id', _tbl, _cols, _vals) -- assumes a unique column "id" in every table
USING NEW, OLD;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Trigger function for INSERT:
CREATE OR REPLACE FUNCTION f_trg_ins()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME FROM '(.+)_view$'));
BEGIN
EXECUTE format('INSERT INTO %s SELECT ($1).*', _tbl)
USING NEW;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Triggers:
CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();
CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();
Before Postgres 11 the syntax (oddly) was EXECUTE PROCEDURE instead of EXECUTE FUNCTION - which also still works.
db<>fiddle here - demonstrating INSERT and UPDATE
Major points
Include the schema name to make the table reference unambiguous. There can be multiple tables of the same name in one database with multiple schemas!
Query pg_catalog.pg_attribute instead of information_schema.columns. Less portable, but much faster, and it allows using the table OID.
How to check if a table exists in a given schema
Table names are NOT safe against SQLi when concatenated as strings for dynamic SQL. Escape with quote_ident() or format() or with an object identifier type. This includes the special trigger function variables TG_TABLE_SCHEMA and TG_TABLE_NAME!
Cast to the object identifier type regclass to assert the table name is valid and get the OID for the catalog look-up.
Optionally use format() to build the dynamic query string safely.
No need for dynamic SQL for the first query on the catalog tables. Faster, simpler.
Use RETURN NEW instead of RETURN NULL in these trigger functions unless you know what you are doing. (NULL would cancel the INSERT for the current row.)
This simple version assumes that every table (and view) has a unique column named id. A more sophisticated version might use the primary key dynamically.
The function for UPDATE allows the columns of view and table to be in any order, as long as the set is the same.
The function for INSERT expects the columns of view and table to be in identical order. If you want to allow arbitrary order, add a column definition list to the INSERT command, just like with UPDATE.
Updated version also covers changes to the id column by using OLD additionally.
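A possible end-to-end check, assuming the two trigger functions f_trg_up() and f_trg_ins() from above already exist. The table tbl, the view tbl_view and the sample values are placeholders; the key is passed explicitly because the INSERT function copies NEW as a whole.

```sql
-- Placeholder table and a view with identical columns
CREATE TABLE tbl (id int PRIMARY KEY, some_column varchar);
CREATE VIEW tbl_view AS SELECT * FROM tbl;

CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON tbl_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();

CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON tbl_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();

-- Both statements target the view; the triggers redirect them to tbl
INSERT INTO tbl_view (id, some_column) VALUES (1, 'foo');
UPDATE tbl_view SET some_column = 'bar' WHERE id = 1;

TABLE tbl;  -- inspect the underlying table
```

On versions before Postgres 11, EXECUTE PROCEDURE would be used in place of EXECUTE FUNCTION.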

Postgresql doesn't support updating multiple columns using the set (col1,col2) = select val1,val2 syntax.
To achieve the same in postgresql you'd use
update target_table
set col1 = d.val1,
col2 = d.val2
from source_table d
where d.id = target_table.id
This is going to make the dynamic query a bit more complex to build as you'll need to iterate the column name list you're using into individual fields. I'd suggest you use array_agg instead of string_agg as an array is easier to process than splitting the string again.
Postgresql UPDATE syntax
documentation on array_agg function

Related

Dynamic query that uses CTE gets "syntax error at end of input"

I have a table that looks like this:
CREATE TABLE label (
hid UUID PRIMARY KEY DEFAULT UUID_GENERATE_V4(),
name TEXT NOT NULL UNIQUE
);
I want to create a function that takes a list of names and inserts multiple rows into the table, ignoring duplicate names, and returns an array of the IDs generated for the rows it inserted.
This works:
CREATE OR REPLACE FUNCTION insert_label(nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
WITH new_names AS (
INSERT INTO label(name)
SELECT tn.name
FROM tmp_names tn
WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name)
RETURNING hid
)
SELECT ARRAY_AGG(hid) INTO ids
FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
I have many tables with the exact same columns as the label table, so I would like to have a function that can insert into any of them. I'd like to create a dynamic query to do that. I tried that, but this does not work:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('WITH new_names AS ( INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid)', h_tbl);
EXECUTE query_str;
SELECT ARRAY_AGG(hid) INTO ids FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
This is the output I get when I run that function:
psql=# select insert_label('label', array['how', 'now', 'brown', 'cow']);
ERROR: syntax error at end of input
LINE 1: ...SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
^
QUERY: WITH new_names AS ( INSERT INTO label(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
CONTEXT: PL/pgSQL function insert_label(regclass,text[]) line 19 at EXECUTE
The query generated by the dynamic SQL looks like it should be exactly the same as the query from static SQL.
I got the function to work by changing the return value from an array of UUIDs to a table of UUIDs and not using CTE:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS TABLE (hid UUID)
AS $$
DECLARE
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid', h_tbl);
RETURN QUERY EXECUTE query_str;
DROP TABLE tmp_names;
RETURN;
END;
$$ LANGUAGE PLPGSQL;
I don't know if one way is better than the other, returning an array of UUIDs or a table of UUIDs, but at least I got it to work one of those ways. Plus, possibly not using a CTE is more efficient, so it may be better to stick with the version that returns a table of UUIDs.
What I would like to know is why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
If anyone can let me know what I did wrong, I would appreciate it.
... why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
No, it was only the CTE without (required) outer query. (You had SELECT ARRAY_AGG(hid) INTO ids FROM new_names in the static version.)
There are more problems, but just use this query instead:
INSERT INTO label(name)
SELECT unnest(nms)
ON CONFLICT DO NOTHING
RETURNING hid;
label.name is defined UNIQUE NOT NULL, so this simple UPSERT can replace your function insert_label() completely.
It's much simpler and faster. It also defends against possible duplicates from within your input array that you didn't cover, yet. And it's safe under concurrent write load - as opposed to your original, which might run into race conditions. Related:
How to use RETURNING with ON CONFLICT in PostgreSQL?
I would just use the simple query and replace the table name.
But if you still want a dynamic function:
CREATE OR REPLACE FUNCTION insert_label(_tbl regclass, _nms text[])
RETURNS TABLE (hid uuid)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$$
INSERT INTO %s(name)
SELECT unnest($1)
ON CONFLICT DO NOTHING
RETURNING hid
$$, _tbl)
USING _nms;
END
$func$;
If you don't need an array as result, stick with the set (RETURNS TABLE ...). Simpler.
Pass values (_nms) to EXECUTE in a USING clause.
The tablename (_tbl) is type regclass, so the format specifier %I for format() would be wrong. Use %s instead. See:
Table name as a PostgreSQL function parameter
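If an array result is needed after all, the missing outer query mentioned above can live inside the dynamic statement, with EXECUTE ... INTO capturing its result. This is only a sketch: the function name insert_label_arr and the CTE name ins are made up, and the rest follows the function above.

```sql
CREATE OR REPLACE FUNCTION insert_label_arr(_tbl regclass, _nms text[])
  RETURNS uuid[]
  LANGUAGE plpgsql AS
$func$
DECLARE
   _ids uuid[];
BEGIN
   EXECUTE format(
      $q$
      WITH ins AS (
         INSERT INTO %s(name)
         SELECT unnest($1)
         ON     CONFLICT DO NOTHING
         RETURNING hid
         )
      SELECT array_agg(hid) FROM ins
      $q$, _tbl)          -- the outer SELECT completes the CTE
   INTO  _ids
   USING _nms;

   RETURN _ids;           -- NULL if no row was inserted
END
$func$;
```

Call it like the original, for example SELECT insert_label_arr('label', array['how', 'now', 'brown', 'cow']);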

PL/pgSQL dynamic query in stored function to return any table's column names

I am trying to write a stored function with a dynamic query that returns all column names from a table, which can then be used to create a dynamic query for a joined view trigger function. But I am struggling to create a stored function with a dynamic query returning column_name from information_schema.
Here is the SQL query I was hoping to convert to a stored function passing the table_name and table_schema as function parameters:
select
column_name
from
information_schema.columns
where
table_name = 'projects' -- to be replaced by parameter
and table_schema = 'public'; -- to be replaced by parameter
I (think I now) understand the basics of needing to use Execute and Format for neatness, but only got a result with passing a table name. This post had a good example of passing a table name: Refactor a PL/pgSQL function to return the output of various SELECT queries
The idea would be to dynamically get the columns then process into a function based on this scratch dynamic query...
DO $$
DECLARE
item varchar;
column_name varchar default 'name';
table_name varchar default 'projects';
temp_string varchar default '';
begin
FOR item IN execute format('SELECT %I FROM %I',column_name,table_name)
loop
temp_string := temp_string || ',NEW.' || item;
END LOOP;
RAISE NOTICE '%', temp_string;
END$$;
And ultimately into the trigger function for views based on a table with a foreign key join. I.e. so the INSERT and UPDATE code is dynamically created for any parent table of a view with a join:
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO projects VALUES(NEW.id,NEW.name);
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
UPDATE projects SET id=NEW.id, name=NEW.name WHERE id=OLD.id;
RETURN NEW;
ELSIF TG_OP = 'DELETE' THEN
DELETE FROM projects WHERE id=OLD.id;
RETURN NULL;
END IF;
RETURN NEW;
END;
$function$
And finally work out how to deal with foreign key columns.
End result is the parent table can be updated via the view in QGIS. Is this even possible?
I am not exactly sure I understand what you are after, but I think "parent table can be updated via the view" indicates the goal. If so, you are headed in the wrong direction entirely and none of what you are seeking is needed. What you want is an INSTEAD OF trigger on the view(s). The fiddle here demonstrates an INSTEAD OF trigger on a view generated with a join; such views normally cannot be updated.
Your idea to "dynamically get the columns then process ... and ultimately into the trigger function for views" seems extremely ambitious. A better approach may be to build a template for the trigger and associated functions, then make the necessary specific column changes. Your trigger(s) must exist well before any DML action on the views.
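A rough template along those lines might look like the sketch below. Only projects(id, name) comes from the question; the child table project_notes, the view projects_view, and the function and trigger name projects_view_upd are invented for illustration and would be adapted per view.

```sql
CREATE TABLE projects (id int PRIMARY KEY, name text);
-- Hypothetical child table with a foreign key to the parent
CREATE TABLE project_notes (project_id int REFERENCES projects, note text);

-- A view built on a join is not automatically updatable
CREATE VIEW projects_view AS
SELECT p.id, p.name, n.note
FROM   projects p
LEFT   JOIN project_notes n ON n.project_id = p.id;

CREATE FUNCTION projects_view_upd()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Write only the parent table's columns back to the parent table
   UPDATE projects SET id = NEW.id, name = NEW.name WHERE id = OLD.id;
   RETURN NEW;
END
$func$;

CREATE TRIGGER projects_view_upd
INSTEAD OF UPDATE ON projects_view
FOR EACH ROW EXECUTE FUNCTION projects_view_upd();

INSERT INTO projects VALUES (1, 'first project');
UPDATE projects_view SET name = 'renamed' WHERE id = 1;
```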

Using a variable in a Postgres function [duplicate]

It must be simple, but I'm making my first steps into Postgres functions and I can't find anything that works...
I'd like to create a function that will modify a table and / or column and I can't find the right way of specifying my tables and columns as arguments in my function.
Something like:
CREATE OR REPLACE FUNCTION foo(t table)
RETURNS void AS $$
BEGIN
alter table t add column c1 varchar(20);
alter table t add column c2 varchar(20);
alter table t add column c3 varchar(20);
alter table t add column c4 varchar(20);
END;
$$ LANGUAGE PLPGSQL;
select foo(some_table)
In another case, I'd like to have a function that alters a certain column from a certain table:
CREATE OR REPLACE FUNCTION foo(t table, c column)
RETURNS void AS $$
BEGIN
UPDATE t SET c = "This is a test";
END;
$$ LANGUAGE PLPGSQL;
Is it possible to do that?
You must defend against SQL injection whenever you turn user input into code. That includes table and column names coming from system catalogs or from direct user input alike. This way you also prevent trivial exceptions with non-standard identifiers. There are basically three built-in methods:
1. format()
1st query, sanitized:
CREATE OR REPLACE FUNCTION foo(_t text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format('
ALTER TABLE %I ADD COLUMN c1 varchar(20)
, ADD COLUMN c2 varchar(20)', _t);
END
$func$;
format() requires Postgres 9.1 or later. Use it with the %I format specifier.
The table name alone may be ambiguous. You may have to provide the schema name to avoid changing the wrong table by accident. Related:
INSERT with dynamic table name in trigger function
How does the search_path influence identifier resolution and the "current schema"
Aside: adding multiple columns with a single ALTER TABLE command is cheaper.
2. regclass
You can also use a cast to a registered class (regclass) for the special case of existing table names, optionally schema-qualified. This fails immediately and gracefully for table names that are not valid and visible to the calling user. The 1st query sanitized with a cast to regclass:
CREATE OR REPLACE FUNCTION foo(_t regclass)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'ALTER TABLE ' || _t || ' ADD COLUMN c1 varchar(20)
, ADD COLUMN c2 varchar(20)';
END
$func$;
Call:
SELECT foo('table_name');
Or:
SELECT foo('my_schema.table_name'::regclass);
Aside: consider using just text instead of varchar(20).
3. quote_ident()
The 2nd query sanitized:
CREATE OR REPLACE FUNCTION foo(_t regclass, _c text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'UPDATE ' || _t -- sanitized with regclass
|| ' SET ' || quote_ident(_c) || ' = ''This is a test''';
END
$func$;
For multiple concatenations / interpolations, format() is cleaner ...
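As a sketch of that remark, the same function can be written with a single format() call: %s for the regclass value (already escaped on output), %I for the column name and %L for the string literal.

```sql
CREATE OR REPLACE FUNCTION foo(_t regclass, _c text)
  RETURNS void
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- %s: regclass, %I: identifier, %L: quoted literal
   EXECUTE format('UPDATE %s SET %I = %L', _t, _c, 'This is a test');
END
$func$;
```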
Related answers:
Table name as a PostgreSQL function parameter
Postgres functions vs prepared queries
Case sensitive!
Be aware that unquoted identifiers are not cast to lower case here. When used as identifiers in SQL, Postgres casts them to lower case automatically. But here we pass strings for dynamic SQL. When escaped as demonstrated, CaMeL-case identifiers (like UserS) are preserved by double-quoting ("UserS"), just like other non-standard names such as "name with space" or "SELECT". Hence, names are case sensitive in this context.
My standing advice is to use legal lower case identifiers exclusively and never worry about that.
Aside: single quotes are for values, double quotes are for identifiers. See:
Are PostgreSQL column names case-sensitive?

Update substrings using lookup table and replace function

Here's my setup:
Table 1 (table_with_info): Contains a list of varchars with substrings that I'd like to replace.
Table 2 (sub_info): Contains two columns: the substring in table_with_info that I'd like to replace and the string I'd like to replace it with.
What I'd like to do is replace all the substrings in table_with_info with their substitutions in sub_info.
This works to a point but the issue is that select replace(...) returns a new row for each one of the substituted words replaced and doesn't replace all of the ones in an individual row.
I'm explaining as best I can, but I don't know if it's clear. Here's the code and an example of what's happening and what I'd like to happen.
Here's my code:
create table table_with_info
(
val varchar
);
insert into table_with_info values
('this this is test data');
create table sub_info
(
word_from varchar,
word_to varchar
);
insert into sub_info values
('this','replace1')
, ('test', 'replace2');
update table_with_info set val = (select replace("val", "word_from", "word_to")
from "table_with_info", "sub_info"
The UPDATE doesn't work because the SELECT returns two rows:
Row 1: replace1 replace1 is test data
Row 2: this this is replace2 data
What I'd like the select statement to return instead is a single row with all substitutions applied:
Row 1: replace1 replace1 is replace2 data
Any thoughts? I can't create UDFs on the system I'm running.
Your UPDATE statement is incorrect in multiple ways. Consult the manual before you try to run anything like this again. You introduce two cross joins that would make this statement extremely expensive, besides yielding nonsense.
To do this properly, you need to administer each UPDATE sequentially. In a single statement, one row version eliminates the other, while each replace would use the same original row version. You can use a DO statement for this or wrap it in a plpgsql function for instance:
DO
$do$
DECLARE
r sub_info;
BEGIN
FOR r IN
TABLE sub_info
-- SELECT * FROM sub_info ORDER BY ??? -- order is relevant
LOOP
UPDATE table_with_info
SET val = replace(val, r.word_from, r.word_to)
WHERE val LIKE ('%' || r.word_from || '%'); -- avoid empty updates
END LOOP;
END
$do$;
Be aware that the order in which updates are applied can make a difference, for instance if the first update creates a string that the second one matches (but would not have matched otherwise).
So, order the rows from sub_info (add ORDER BY to the loop query) if that can be relevant.
Avoid empty updates. Without the additional WHERE clause, you would write many new row versions without changing anything. Expensive and useless.
Double quotes are optional for legal, lower-case names.
Expanding on Erwin's answer, a do block with dynamic SQL can do the trick as well:
do $$
declare
rec record;
repl text;
begin
repl := 'val'; -- quote_ident() this if needed
for rec in select word_from, word_to from sub_info
loop
repl := 'replace(' || repl || ', '
|| quote_literal(rec.word_from) || ', '
|| quote_literal(rec.word_to) || ')';
end loop;
-- now do them all in a single query
execute 'update ' || 'table_with_info'::regclass || ' set val = ' || repl;
end;
$$ language plpgsql;
Optionally, build a like parameter in a similar way to avoid updating rows needlessly.
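One way this could look, as a sketch only: the variable cond is an addition that collects one LIKE test per lookup row, and format() with %L replaces the manual quote_literal() calls. Characters such as % or _ inside word_from act as LIKE wildcards here, which can only match more rows than strictly needed.

```sql
do $$
declare
   rec  record;
   repl text := 'val';    -- nested replace() expression, built up in the loop
   cond text := 'false';  -- OR-ed LIKE tests, one per lookup row
begin
   for rec in select word_from, word_to from sub_info
   loop
      repl := format('replace(%s, %L, %L)', repl, rec.word_from, rec.word_to);
      cond := cond || format(' or val like %L', '%' || rec.word_from || '%');
   end loop;
   -- single UPDATE that only touches rows containing at least one word_from
   execute format('update table_with_info set val = %s where %s', repl, cond);
end;
$$;
```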

Update multiple columns that start with a specific string

I am trying to update a bunch of columns in a DB for testing purposes of a feature. I have a table that is built with hibernate so all of the columns that are created for an embedded entity begin with the same name. I.e. contact_info_address_street1, contact_info_address_street2, etc.
I am trying to figure out if there is a way to do something to the effect of:
UPDATE table SET contact_info_address_* = null;
If not, I know I can do it the long way, just looking for a way to help myself out in the future if I need to do this all over again for a different set of columns.
You need dynamic SQL for this. So you must defend against possible SQL injection.
Basic query
The basic query to generate the DML command needed can look like this:
SELECT format('UPDATE tbl SET (%s) = (%s)'
,string_agg (quote_ident(attname), ', ')
,string_agg ('NULL', ', ')
)
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass
AND NOT attisdropped
AND attnum > 0
AND attname ~~ 'foo_%';
Returns:
UPDATE tbl SET (foo_a, foo_b, foo_c) = (NULL, NULL, NULL);
Make use of the "column-list syntax" of UPDATE to shorten the code and simplify the task.
I query the system catalogs instead of information schema because the latter, while being standardized and guaranteed to be portable across major versions, is also notoriously slow and sometimes unwieldy. There are pros and cons, see:
Get column names and data types of a query, table or view
quote_ident() for the column names prevents SQL-injection - also necessary for identifiers.
string_agg() requires 9.0+.
Full automation with PL/pgSQL function
CREATE OR REPLACE FUNCTION f_update_cols(_tbl regclass, _col_pattern text
, OUT row_ct int, OUT col_ct bigint)
LANGUAGE plpgsql AS
$func$
DECLARE
_sql text;
BEGIN
SELECT INTO _sql, col_ct
format('UPDATE %s SET (%s) = (%s)'
, _tbl -- use the table passed in, not a hard-coded name
, string_agg (quote_ident(attname), ', ')
, string_agg ('NULL', ', ')
)
, count(*)
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped columns
AND attnum > 0 -- no system columns
AND attname LIKE _col_pattern; -- only columns matching pattern
-- RAISE NOTICE '%', _sql; -- output SQL for debugging
EXECUTE _sql;
GET DIAGNOSTICS row_ct = ROW_COUNT;
END
$func$;
COMMENT ON FUNCTION f_update_cols(regclass, text)
IS 'Updates all columns of table _tbl ($1)
that match _col_pattern ($2) in a LIKE expression.
Returns the count of columns (col_ct) and rows (row_ct) affected.';
Call:
SELECT * FROM f_update_cols('myschema.tbl', 'foo%');
To make the function more practical, it returns information as described in the comment. More about obtaining the result status in plpgsql in the manual.
I use the variable _sql to hold the query string, so I can collect the number of columns found (col_ct) in the same query.
The object identifier type regclass is the most efficient way to automatically avoid SQL injection (and sanitize non-standard names) for the table name, too. You can use schema-qualified table names to avoid ambiguities. I would advise to do so if you (can) have multiple schemas in your db! See:
Table name as a PostgreSQL function parameter
db<>fiddle here
There's no handy shortcut sorry. If you have to do this kind of thing a lot, you could create a function to dynamically execute sql and achieve your goal.
CREATE OR REPLACE FUNCTION reset_cols() RETURNS boolean AS $$ BEGIN
EXECUTE (select 'UPDATE table SET '
|| array_to_string(array(
select column_name::text
from information_schema.columns
where table_name = 'table'
and column_name::text like 'contact_info_address_%'
),' = NULL,')
|| ' = NULL');
RETURN true;
END; $$ LANGUAGE plpgsql;
-- run the function
SELECT reset_cols();
It's not very nice though. A better function would be one that accepts the tablename and column prefix as args. Which I'll leave as an exercise for the readers :)