Dynamic query that uses CTE gets "syntax error at end of input" - sql

I have a table that looks like this:
CREATE TABLE label (
hid UUID PRIMARY KEY DEFAULT UUID_GENERATE_V4(),
name TEXT NOT NULL UNIQUE
);
I want to create a function that takes a list of names and inserts multiple rows into the table, ignoring duplicate names, and returns an array of the IDs generated for the rows it inserted.
This works:
CREATE OR REPLACE FUNCTION insert_label(nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
WITH new_names AS (
INSERT INTO label(name)
SELECT tn.name
FROM tmp_names tn
WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name)
RETURNING hid
)
SELECT ARRAY_AGG(hid) INTO ids
FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
I have many tables with the exact same columns as the label table, so I would like to have a function that can insert into any of them. I'd like to create a dynamic query to do that. I tried that, but this does not work:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('WITH new_names AS ( INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid)', h_tbl);
EXECUTE query_str;
SELECT ARRAY_AGG(hid) INTO ids FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
This is the output I get when I run that function:
psql=# select insert_label('label', array['how', 'now', 'brown', 'cow']);
ERROR: syntax error at end of input
LINE 1: ...SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
^
QUERY: WITH new_names AS ( INSERT INTO label(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
CONTEXT: PL/pgSQL function insert_label(regclass,text[]) line 19 at EXECUTE
The query generated by the dynamic SQL looks like it should be exactly the same as the query from static SQL.
I got the function to work by changing the return value from an array of UUIDs to a table of UUIDs and not using CTE:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS TABLE (hid UUID)
AS $$
DECLARE
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid', h_tbl);
RETURN QUERY EXECUTE query_str;
DROP TABLE tmp_names;
RETURN;
END;
$$ LANGUAGE PLPGSQL;
I don't know if one way is better than the other, returning an array of UUIDs or a table of UUIDs, but at least I got it to work one of those ways. Plus, possibly not using a CTE is more efficient, so it may be better to stick with the version that returns a table of UUIDs.
What I would like to know is why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
If anyone can let me know what I did wrong, I would appreciate it.

... why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
No, it was only the CTE without (required) outer query. (You had SELECT ARRAY_AGG(hid) INTO ids FROM new_names in the static version.)
There are more problems, but just use this query instead:
INSERT INTO label(name)
SELECT unnest(nms)
ON CONFLICT DO NOTHING
RETURNING hid;
label.name is defined UNIQUE NOT NULL, so this simple UPSERT can replace your function insert_label() completely.
It's much simpler and faster. It also defends against possible duplicates from within your input array that you didn't cover, yet. And it's safe under concurrent write load - as opposed to your original, which might run into race conditions. Related:
How to use RETURNING with ON CONFLICT in PostgreSQL?
I would just use the simple query and replace the table name.
But if you still want a dynamic function:
CREATE OR REPLACE FUNCTION insert_label(_tbl regclass, _nms text[])
RETURNS TABLE (hid uuid)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$$
INSERT INTO %s(name)
SELECT unnest($1)
ON CONFLICT DO NOTHING
RETURNING hid
$$, _tbl)
USING _nms;
END
$func$;
If you don't need an array as result, stick with the set (RETURNS TABLE ...). Simpler.
Pass values (_nms) to EXECUTE in a USING clause.
The tablename (_tbl) is type regclass, so the format specifier %I for format() would be wrong. Use %s instead. See:
Table name as a PostgreSQL function parameter

Related

query has no destination for result data in a function that has a set of instructions in postgresql

I am trying to automate a set of sentences that I execute several times a day. For this I want to put them in a postgres function and just call the function to execute the sentences consecutively. If everything runs OK then in the end return the SUCCESS value. The following function replicates my idea and the error I am getting when executing the function:
CREATE OR REPLACE FUNCTION createTable() RETURNS int AS $$
BEGIN
DROP TABLE IF EXISTS MY_TABLE;
CREATE TABLE MY_TABLE
(
ID integer
)
WITH (
OIDS=FALSE
);
insert into MY_TABLE values(1);
select * from MY_TABLE;
RETURN 'SUCCESS';
END;
$$ LANGUAGE plpgsql;
Invocation:
select * from createTable();
With my ignorance of postgresql I would expect to obtain the SUCCESS value as a return (If everything runs without errors). But the returned message causes me confusion, isn't it the same as a function in any other programming language? When executing the function I get the following message:
query has no destination for result data Hint: If you want to
discard the results of a SELECT, use PERFORM instead.
query has no destination for result data Hint: If you want to discard the results of a SELECT, use PERFORM instead.
You are getting this error because you do not assign the results to any variable in the function. In a function, you would typically do something like this instead:
select * into var1 from MY_TABLE;
Therefore, your function would look something like this:
CREATE OR REPLACE FUNCTION createTable() RETURNS int AS $$
DECLARE
var1 my_table%ROWTYPE;
BEGIN
DROP TABLE IF EXISTS MY_TABLE;
CREATE TABLE MY_TABLE
(
ID integer
)
WITH (
OIDS=FALSE
);
insert into MY_TABLE values(1);
select * into var1 from MY_TABLE;
<do something with var1>
RETURN 'SUCCESS';
END;
$$ LANGUAGE plpgsql;
Otherwise, if you don't put the results into a variable, then you're likely hoping to achieve some side effect (like advancing a sequence or firing a trigger somehow). In that case, plpgsql expects you to use PERFORM instead of SELECT
Also, BTW your function RETURNS int but at the bottom of your definition you RETURN 'SUCCESS'. SUCCESS is a text type, not an int, so you will eventually get this error once you get past that first error message -- be sure to change it as necessary.

Input table for PL/pgSQL function

I would like to use a plpgsql function with a table and several columns as input parameter. The idea is to split the table in chunks and do something with each part.
I tried the following function:
CREATE OR REPLACE FUNCTION my_func(Integer)
RETURNS SETOF my_part
AS $$
DECLARE
out my_part;
BEGIN
FOR i IN 0..$1 LOOP
FOR out IN
SELECT * FROM my_func2(SELECT * FROM table1 WHERE id = i)
LOOP
RETURN NEXT out;
END LOOP;
END LOOP;
RETURN;
END;
$$
LANGUAGE plpgsql;
my_func2() is the function that does some work on each smaller part.
CREATE or REPLACE FUNCTION my_func2(table1)
RETURNS SETOF my_part2 AS
$$
BEGIN
RETURN QUERY
SELECT * FROM table1;
END
$$
LANGUAGE plpgsql;
If I run:
SELECT * FROM my_func(99);
I guess I should receive the first 99 IDs processed for each id.
But it says there is an error for the following line:
SELECT * FROM my_func2(select * from table1 where id = i)
The error is:
The subquery is only allowed to return one column
Why does this happen? Is there an easy way to fix this?
There are multiple misconceptions here. Study the basics before you try advanced magic.
Postgres does not have "table variables". You can only pass 1 column or row at a time to a function. Use a temporary table or a refcursor (like commented by #Daniel) to pass a whole table. The syntax is invalid in multiple places, so it's unclear whether that's what you are actually trying.
Even if it is: it would probably be better to process one row at a time or rethink your approach and use a set-based operation (plain SQL) instead of passing cursors.
The data types my_part and my_part2 are undefined in your question. May be a shortcoming of the question or a problem in the test case.
You seem to expect that the table name table1 in the function body of my_func2() refers to the function parameter of the same (type!) name, but this is fundamentally wrong in at least two ways:
You can only pass values. A table name is an identifier, not a value. You would need to build a query string dynamically and execute it with EXECUTE in a plpgsql function. Try a search, many related answers her on SO. Then again, that may also not be what you wanted.
table1 in CREATE or REPLACE FUNCTION my_func2(table1) is a type name, not a parameter name. It means your function expects a value of the type table1. Obviously, you have a table of the same name, so it's supposed to be the associated row type.
The RETURN type of my_func2() must match what you actually return. Since you are returning SELECT * FROM table1, make that RETURNS SETOF table1.
It can just be a simple SQL function.
All of that put together:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Note the parentheses, which are essential for decomposing a row type. Per documentation:
The parentheses are required here to show that compositecol is a column name not a table name
But there is more ...
Don't use out as variable name, it's a keyword of the CREATE FUNCTION statement.
The syntax of your main query my_func() is more like psudo-code. Too much doesn't add up.
Proof of concept
Demo table:
CREATE TABLE table1(table1_id serial PRIMARY KEY, txt text);
INSERT INTO table1(txt) VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g');
Helper function:
CREATE or REPLACE FUNCTION my_func2(_row table1)
RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;
Main function:
CREATE OR REPLACE FUNCTION my_func(int)
RETURNS SETOF table1 AS
$func$
DECLARE
rec table1;
BEGIN
FOR i IN 0..$1 LOOP
FOR rec IN
SELECT * FROM table1 WHERE table1_id = i
LOOP
RETURN QUERY
SELECT * FROM my_func2(rec);
END LOOP;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM my_func(99);
SQL Fiddle.
But it's really just a a proof of concept. Nothing useful, yet.
As the error log is telling you.. you can return only one column in a subquery, so you have to change it to
SELECT my_func2(SELECT Specific_column_you_need FROM hasval WHERE wid = i)
a possible solution can be that you pass to funct2 the primary key of the table your funct2 needs and then you can obtain the whole table by making the SELECT * inside the function

Passing a ResultSet into a Postgresql Function

Is it possible to pass the results of a postgres query as an input into another function?
As a very contrived example, say I have one query like
SELECT id, name
FROM users
LIMIT 50
and I want to create a function my_function that takes the resultset of the first query and returns the minimum id. Is this possible in pl/pgsql?
SELECT my_function(SELECT id, name FROM Users LIMIT 50); --returns 50
You could use a cursor, but that very impractical for computing a minimum.
I would use a temporary table for that purpose, and pass the table name for use in dynamic SQL:
CREATE OR REPLACE FUNCTION f_min_id(_tbl regclass, OUT min_id int) AS
$func$
BEGIN
EXECUTE 'SELECT min(id) FROM ' || _tbl
INTO min_id;
END
$func$ LANGUAGE plpgsql;
Call:
CREATE TEMP TABLE foo ON COMMIT DROP AS
SELECT id, name
FROM users
LIMIT 50;
SELECT f_min_id('foo');
Major points
The first parameter is of type regclass to prevent SQL injection. More info in this related answer on dba.SE.
I made the temp table ON COMMIT DROP to limit its lifetime to the current transaction. May or may not be what you want.
You can extend this example to take more parameters. Search for code examples for dynamic SQL with EXECUTE.
-> SQLfiddle demo
I would take the problem on the other side, calling an aggregate function for each record of the result set. It's not as flexible but can gives you an hint to work on.
As an exemple to follow your sample problem:
CREATE OR REPLACE FUNCTION myMin ( int,int ) RETURNS int AS $$
SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
$$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE my_function ( int ) (
SFUNC = myMin, STYPE = int, INITCOND = 2147483647 --maxint
);
SELECT my_function(id) from (SELECT * FROM Users LIMIT 50) x;
It is not possible to pass an array of generic type RECORD to a plpgsql function which is essentially what you are trying to do.
What you can do is pass in an array of a specific user defined TYPE or of a particular table row type. In the example below you could also swap out the argument data type for the table name users[] (though this would obviously mean getting all data in the users table row).
CREATE TYPE trivial {
"ID" integer,
"NAME" text
}
CREATE OR REPLACE FUNCTION trivial_func(data trivial[])
RETURNS integer AS
$BODY$
DECLARE
BEGIN
--Implementation here using data
return 1;
END$BODY$
LANGUAGE 'plpgsql' VOLATILE;
I think there's no way to pass recordset or table into function (but I'd be glad if i'm wrong). Best I could suggest is to pass array:
create or replace function my_function(data int[])
returns int
as
$$
select min(x) from unnest(data) as x
$$
language SQL;
sql fiddle demo

Update multiple columns in a trigger function in plpgsql

Given the following schema:
create table account_type_a (
id SERIAL UNIQUE PRIMARY KEY,
some_column VARCHAR
);
create table account_type_b (
id SERIAL UNIQUE PRIMARY KEY,
some_other_column VARCHAR
);
create view account_type_a view AS select * from account_type_a;
create view account_type_b view AS select * from account_type_b;
I try to create a generic trigger function in plpgsql, which enables updating the view:
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
An unsuccessful effort of mine was:
create function updateAccount() returns trigger as $$
declare
target_table varchar := substring(TG_TABLE_NAME from '(.+)_view');
cols varchar;
begin
execute 'select string_agg(column_name,$1) from information_schema.columns
where table_name = $2' using ',', target_table into cols;
execute 'update ' || target_table || ' set (' || cols || ') = select ($1).*
where id = ($1).id' using NEW;
return NULL;
end;
$$ language plpgsql;
The problem is the update statement. I am unable to come up with a syntax that would work here. I have successfully implemented this in PL/Perl, but would be interested in a plpgsql-only solution.
Any ideas?
Update
As #Erwin Brandstetter suggested, here is the code for my PL/Perl solution. I incoporated some of his suggestions.
create function f_tr_up() returns trigger as $$
use strict;
use warnings;
my $target_table = quote_ident($_TD->{'table_name'}) =~ s/^([\w]+)_view$/$1/r;
my $NEW = $_TD->{'new'};
my $cols = join(',', map { quote_ident($_) } keys $NEW);
my $vals = join(',', map { quote_literal($_) } values $NEW);
my $query = sprintf(
"update %s set (%s) = (%s) where id = %d",
$target_table,
$cols,
$vals,
$NEW->{'id'});
spi_exec_query($query);
return;
$$ language plperl;
While #Gary's answer is technically correct, it fails to mention that PostgreSQL does support this form:
UPDATE tbl
SET (col1, col2, ...) = (expression1, expression2, ..)
Read the manual on UPDATE.
It's still tricky to get this done with dynamic SQL. I'll assume a simple case where views consist of the same columns as their underlying tables.
CREATE VIEW tbl_view AS SELECT * FROM tbl;
Problems
The special record NEW is not visible inside EXECUTE. I pass NEW as a single parameter with the USING clause of EXECUTE.
As discussed, UPDATE with list-form needs individual values. I use a subselect to split the record into individual columns:
UPDATE ...
FROM (SELECT ($1).*) x
(Parenthesis around $1 are not optional.) This allows me to simply use two column lists built with string_agg() from the catalog table: one with and one without table qualification.
It's not possible to assign a row value as a whole to individual columns. The manual:
According to the standard, the source value for a parenthesized
sub-list of target column names can be any row-valued expression
yielding the correct number of columns. PostgreSQL only allows the
source value to be a row constructor or a sub-SELECT.
INSERT is implemented simpler. If the structure of view and table are identical we can omit the column definition list. (Can be improved, see below.)
Solution
I made a couple of updates to your approach to make it shine.
Trigger function for UPDATE:
CREATE OR REPLACE FUNCTION f_trg_up()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME from '(.+)_view$'));
_cols text;
_vals text;
BEGIN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ')
, string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s
SET (%s) = (%s)
FROM (SELECT ($1).*) x', _tbl, _cols, _vals)
USING NEW;
RETURN NEW; -- Don't return NULL unless you knwo what you're doing
END
$func$;
Trigger function for INSERT:
CREATE OR REPLACE FUNCTION f_trg_ins()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME FROM '(.+)_view$'));
BEGIN
EXECUTE format('INSERT INTO %s SELECT ($1).*', _tbl)
USING NEW;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Triggers:
CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();
CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();
Before Postgres 11 the syntax (oddly) was EXECUTE PROCEDURE instead of EXECUTE FUNCTION - which also still works.
db<>fiddle here - demonstrating INSERT and UPDATE
Old sqlfiddle
Major points
Include the schema name to make the table reference unambiguous. There can be multiple table of the same name in one database with multiple schemas!
Query pg_catalog.pg_attribute instead of information_schema.columns. Less portable, but much faster and allows to use the table-OID.
How to check if a table exists in a given schema
Table names are NOT safe against SQLi when concatenated as strings for dynamic SQL. Escape with quote_ident() or format() or with an object-identifer type. This includes the special trigger function variables TG_TABLE_SCHEMA and TG_TABLE_NAME!
Cast to the object identifier type regclass to assert the table name is valid and get the OID for the catalog look-up.
Optionally use format() to build the dynamic query string safely.
No need for dynamic SQL for the first query on the catalog tables. Faster, simpler.
Use RETURN NEW instead of RETURN NULL in these trigger functions unless you know what you are doing. (NULL would cancel the INSERT for the current row.)
This simple version assumes that every table (and view) has a unique column named id. A more sophisticated version might use the primary key dynamically.
The function for UPDATE allows the columns of view and table to be in any order, as long as the set is the same.
The function for INSERT expects the columns of view and table to be in identical order. If you want to allow arbitrary order, add a column definition list to the INSERT command, just like with UPDATE.
Updated version also covers changes to the id column by using OLD additionally.
Postgresql doesn't support updating multiple columns using the set (col1,col2) = select val1,val2 syntax.
To achieve the same in postgresql you'd use
update target_table
set col1 = d.val1,
col2 = d.val2
from source_table d
where d.id = target_table.id
This is going to make the dynamic query a bit more complex to build as you'll need to iterate the column name list you're using into individual fields. I'd suggest you use array_agg instead of string_agg as an array is easier to process than splitting the string again.
Postgresql UPDATE syntax
documentation on array_agg function

How to create SQL function with parameter?

I have an SQL function:
-- Function: insert_into_wgs()
-- DROP FUNCTION insert_into_wgs();
CREATE OR REPLACE FUNCTION insert_into_wgs()
RETURNS void AS
$BODY$
BEGIN
INSERT INTO parcels (num,vid,support_num,name_dispatcher,cadastr,the_geom,status_id)
SELECT num,vid,support_num,name_dispatcher,cadastr,ST_TRANSFORM(the_geom,4326),status_id
FROM parcels_temp
WHERE num NOT IN (SELECT num FROM parcels)AND deleted='no';
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION insert_into_wgs() OWNER TO postgres;
You see that it works only for parcels, parcels_temp tables. How can I create function with parameters insert_into_wgs(table, table_temp);?
As mentioned in the comment, you have to use dynamic SQL if you want to parametrize identifiers. And use EXECUTE in plpgsql.
In the base query better use NOT EXISTS instead of col NOT IN (<subquery>).
Now, if only the table names change and the columns stay the same, this simple demo would do the job:
CREATE OR REPLACE FUNCTION insert_into_tbl(_tbl text, _tbl_tmp text)
RETURNS void AS
$func$
BEGIN
EXECUTE format($x$
INSERT INTO %I (...) -- enter columns
SELECT ... -- enter columns
FROM %I t
WHERE NOT EXISTS (
SELECT 1
FROM %1$I p -- reuse first table name
WHERE p.num = t.num
)
AND deleted = 'no'$x$, _tbl, _tbl_tmp);
END
$func$ LANGUAGE plpgsql;
Check out format() in the manual.
There are quite a few related answers on SO. Try this search.