Is there a way to define a named constant in a PostgreSQL query? - sql

Is there a way to define a named constant in a PostgreSQL query? For example:
MY_ID = 5;
SELECT * FROM users WHERE id = MY_ID;

This question has been asked before (How do you use script variables in PostgreSQL?). However, there is a trick that I use for queries sometimes:
with const as (
select 1 as val
)
select . . .
from const cross join
<more tables>
That is, I define a CTE called const that has the constants defined there. I can then cross join this into my query, any number of times at any level. I have found this particularly useful when I'm dealing with dates, and need to handle date constants across many subqueries.

PostgreSQL has no built-in way to define (global) variables like MySQL or Oracle. (There is a limited workaround using "customized options"). Depending on what you want exactly there are other ways:
For one query
You can provide values at the top of a query in a CTE like #Gordon already provided.
Global, persistent constant:
You could create a simple IMMUTABLE function for that:
CREATE FUNCTION public.f_myid()
RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT 5';
(Parallel safety settings only apply to Postgres 9.6 or later.)
It has to live in a schema that is visible to the current user, i.e. is in the respective search_path. Like the schema public, by default. If security is an issue, make sure it's the first schema in the search_path or schema-qualify it in your call:
SELECT public.f_myid();
Visible for all users in the database (that are allowed to access schema public).
Multiple values for current session:
CREATE TEMP TABLE val (val_id int PRIMARY KEY, val text);
INSERT INTO val(val_id, val) VALUES
( 1, 'foo')
, ( 2, 'bar')
, (317, 'baz');
CREATE FUNCTION f_val(_id int)
RETURNS text LANGUAGE sql STABLE PARALLEL RESTRICTED AS
'SELECT val FROM val WHERE val_id = $1';
SELECT f_val(2); -- returns 'baz'
Since plpgsql checks the existence of a table on creation, you need to create a (temporary) table val before you can create the function - even if a temp table is dropped at the end of the session while the function persists. The function will raise an exception if the underlying table is not found at call time.
The current schema for temporary objects comes before the rest of your search_path per default - if not instructed otherwise explicitly. You cannot exclude the temporary schema from the search_path, but you can put other schemas first.
Evil creatures of the night (with the necessary privileges) might tinker with the search_path and put another object of the same name in front:
CREATE TABLE myschema.val (val_id int PRIMARY KEY, val text);
INSERT INTO val(val_id, val) VALUES (2, 'wrong');
SET search_path = myschema, pg_temp;
SELECT f_val(2); -- returns 'wrong'
It's not much of a threat, since only privileged users can alter global settings. Other users can only do it for their own session. Consider the related chapter of manual on creating functions with SECURITY DEFINER.
A hard-wired schema is typically simpler and faster:
CREATE FUNCTION f_val(_id int)
RETURNS text LANGUAGE sql STABLE PARALLEL RESTRICTED AS
'SELECT val FROM pg_temp.val WHERE val_id = $1';
Related answers with more options:
How to test my ad-hoc SQL with parameters in Postgres query window
Passing user id to PostgreSQL triggers

In addition to the sensible options Gordon and Erwin already mentioned (temp tables, constant-returning functions, CTEs, etc), you can also (ab)use the PostgreSQL GUC mechanism to create global-, session- and transaction-level variables.
See this prior post which shows the approach in detail.
I don't recommend this for general use, but it could be useful in narrow cases like the one mentioned in the linked question, where the poster wanted a way to provide the application-level username to triggers and functions.

I've found this solution:
with vars as (
SELECT * FROM (values(5)) as t(MY_ID)
)
SELECT * FROM users WHERE id = (SELECT MY_ID FROM vars)

I've found a mixture of the available approaches to be best:
Store your variables in a table:
CREATE TABLE vars (
id INT NOT NULL PRIMARY KEY DEFAULT 1,
zipcode INT NOT NULL DEFAULT 90210,
-- etc..
CHECK (id = 1)
);
Create a dynamic function, which loads the contents of your table, and uses it to:
Re/Create another separate static immutable getter function.
CREATE FUNCTION generate_var_getter()
RETURNS VOID AS $$
DECLARE
var_name TEXT;
var_value TEXT;
new_rows TEXT[];
new_sql TEXT;
BEGIN
FOR var_name IN (
SELECT columns.column_name
FROM information_schema.columns
WHERE columns.table_schema = 'public'
AND columns.table_name = 'vars'
ORDER BY columns.ordinal_position ASC
) LOOP
EXECUTE
FORMAT('SELECT %I FROM vars LIMIT 1', var_name)
INTO var_value;
new_rows := ARRAY_APPEND(
new_rows,
FORMAT('(''%s'', %s)', var_name, var_value)
);
END LOOP;
new_sql := FORMAT($sql$
CREATE OR REPLACE FUNCTION var_get(key_in TEXT)
RETURNS TEXT AS $config$
DECLARE
result NUMERIC;
BEGIN
result := (
SELECT value FROM (VALUES %s)
AS vars_tmp (key, value)
WHERE key = key_in
);
RETURN result;
END;
$config$ LANGUAGE plpgsql IMMUTABLE;
$sql$, ARRAY_TO_STRING(new_rows, ','));
EXECUTE new_sql;
RETURN;
END;
$$ LANGUAGE plpgsql;
Add an update trigger to your table, so that after you change one of your variables, generate_var_getter() is called, and the immutable var_get() function is recreated.
CREATE FUNCTION vars_regenerate_update()
RETURNS TRIGGER AS $$
BEGIN
PERFORM generate_var_getter();
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trigger_vars_regenerate_change
AFTER INSERT OR UPDATE ON vars
EXECUTE FUNCTION vars_regenerate_update();
Now you can easily keep your variables in a table, but also get blazing-fast immutable access to them. The best of both worlds:
INSERT INTO vars DEFAULT VALUES;
-- INSERT 0 1
SELECT var_get('zipcode')::INT;
-- 90210
UPDATE vars SET zipcode = 84111;
-- UPDATE 1
SELECT var_get('zipcode')::INT;
-- 84111

When your query uses "GROUP BY":
WITH const AS (
select 5 as MY_ID,
'2022-03-1'::date as MY_DAY)
SELECT u.user_group,
COUNT(*),
const.MY_DAY
FROM users u
CROSS JOIN const
WHERE 1=1
GROUP BY u.user_group, const.MY_ID, const.MY_DAY
the sample contains more fields, than the OP, but that helps to more visitors, who are looking for the subject.
Without GROUP BY:
WITH const AS (
select 5 as MY_ID)
SELECT u.* FROM users u
CROSS JOIN const
WHERE u.id = const.MY_ID
credits to #GordonLinoff
Without GROUP BY and no column-name conflicts:
WITH const AS (
select 5 as MY_ID)
SELECT users.* FROM users
CROSS JOIN const
WHERE id = MY_ID

Related

Dynamic query that uses CTE gets "syntax error at end of input"

I have a table that looks like this:
CREATE TABLE label (
hid UUID PRIMARY KEY DEFAULT UUID_GENERATE_V4(),
name TEXT NOT NULL UNIQUE
);
I want to create a function that takes a list of names and inserts multiple rows into the table, ignoring duplicate names, and returns an array of the IDs generated for the rows it inserted.
This works:
CREATE OR REPLACE FUNCTION insert_label(nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
WITH new_names AS (
INSERT INTO label(name)
SELECT tn.name
FROM tmp_names tn
WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name)
RETURNING hid
)
SELECT ARRAY_AGG(hid) INTO ids
FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
I have many tables with the exact same columns as the label table, so I would like to have a function that can insert into any of them. I'd like to create a dynamic query to do that. I tried that, but this does not work:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('WITH new_names AS ( INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid)', h_tbl);
EXECUTE query_str;
SELECT ARRAY_AGG(hid) INTO ids FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
This is the output I get when I run that function:
psql=# select insert_label('label', array['how', 'now', 'brown', 'cow']);
ERROR: syntax error at end of input
LINE 1: ...SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
^
QUERY: WITH new_names AS ( INSERT INTO label(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
CONTEXT: PL/pgSQL function insert_label(regclass,text[]) line 19 at EXECUTE
The query generated by the dynamic SQL looks like it should be exactly the same as the query from static SQL.
I got the function to work by changing the return value from an array of UUIDs to a table of UUIDs and not using CTE:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS TABLE (hid UUID)
AS $$
DECLARE
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid', h_tbl);
RETURN QUERY EXECUTE query_str;
DROP TABLE tmp_names;
RETURN;
END;
$$ LANGUAGE PLPGSQL;
I don't know if one way is better than the other, returning an array of UUIDs or a table of UUIDs, but at least I got it to work one of those ways. Plus, possibly not using a CTE is more efficient, so it may be better to stick with the version that returns a table of UUIDs.
What I would like to know is why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
If anyone can let me know what I did wrong, I would appreciate it.
... why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
No, it was only the CTE without (required) outer query. (You had SELECT ARRAY_AGG(hid) INTO ids FROM new_names in the static version.)
There are more problems, but just use this query instead:
INSERT INTO label(name)
SELECT unnest(nms)
ON CONFLICT DO NOTHING
RETURNING hid;
label.name is defined UNIQUE NOT NULL, so this simple UPSERT can replace your function insert_label() completely.
It's much simpler and faster. It also defends against possible duplicates from within your input array that you didn't cover, yet. And it's safe under concurrent write load - as opposed to your original, which might run into race conditions. Related:
How to use RETURNING with ON CONFLICT in PostgreSQL?
I would just use the simple query and replace the table name.
But if you still want a dynamic function:
CREATE OR REPLACE FUNCTION insert_label(_tbl regclass, _nms text[])
RETURNS TABLE (hid uuid)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$$
INSERT INTO %s(name)
SELECT unnest($1)
ON CONFLICT DO NOTHING
RETURNING hid
$$, _tbl)
USING _nms;
END
$func$;
If you don't need an array as result, stick with the set (RETURNS TABLE ...). Simpler.
Pass values (_nms) to EXECUTE in a USING clause.
The tablename (_tbl) is type regclass, so the format specifier %I for format() would be wrong. Use %s instead. See:
Table name as a PostgreSQL function parameter

How to use one sql parameter to represent input array

Is there a way to write sql for Oracle, MS SQL:
Select * from table where id in(:arr)
Select * from table where id in(#arr)
With one param in sql 'arr' to represent an array of items?
I found examples that explode arr to #arr0,.., #arrn and feed array as n+1 separate parameters, not array, like this
Select * from table where id in(:arr0, :arr1, :arr2)
Select * from table where id in(#arr0, #arr1, #arr2)
Not what i want.
These will cause change in sql query and this creates new execution plans based on number of parameter.
I ask for .net, c# and Oracle and MS SQL.
Thanks for constructive ideas!
/ip/
I believe Table Value Parameter is good option for this case. Have a look at a sample code below in SQL Server.
-- Your table
CREATE TABLE SampleTable
(
ID INT
)
INSERT INTO SampleTable VALUES
(1010),
(2010),
(3010),
(4010),
(5010),
(6010),
(7010),
(8030)
GO
-- Create a TABLE type in SQL database which you can fill from front-end code
CREATE TYPE ParameterTableType AS TABLE
(
ParameterID INT
--, some other columns
)
GO
-- Create a stored proc using table type defined above
CREATE PROCEDURE ParameterArrayProcedure
(
#ParameterTable AS ParameterTableType READONLY
)
AS
BEGIN
SELECT
S.*
FROM SampleTable S
INNER JOIN #ParameterTable P ON S.ID = P.ParameterID
END
GO
-- Populated table type variable
DECLARE #ParameterTable AS ParameterTableType
INSERT INTO #ParameterTable (ParameterID) VALUES (1010), (4010), (7010)
EXECUTE ParameterArrayProcedure #ParameterTable
DROP PROCEDURE ParameterArrayProcedure
DROP TYPE ParameterTableType
DROP TABLE SampleTable
GO
Apart from Table Value Parameter, you can also use Json or XML values as SQL parameter but yes, it will definitely change your execution plan accordingly.
In addition to a Table Valued Parameter as Steve mentioned, there are a couple of other techniques available. For example you can parse a delimited string
Example
Declare #arr varchar(50) = '10,20,35'
Select A.*
From YourTable A
Join string_split(#arr,',') B on A.ID=value
Or even
Select A.*
From YourTable A
Where ID in ( select value from string_split(#arr,',') )
Oracle
In other languages (i.e. Java) you can pass an SQL collection as a bind parameter and directly use it in an SQL statement.
However, C# does not support passing SQL collections and only supports passing OracleCollectionType.PLSQLAssociativeArray (documentation link) which is a PL/SQL only data-type and cannot be used (directly) in SQL statements.
To pass an array, you would need to pass a PLSQLAssociativeArray to a PLSQL stored procedure and use that to convert it to an SQL collection that you can use in an SQL statement. An example of a procedure to convert from a PL/SQL associative array to an SQL collection is:
CREATE TYPE IntList AS TABLE OF INTEGER
/
CREATE PACKAGE tools IS
TYPE IntMap IS TABLE OF INTEGER INDEX BY BINARY_INTEGER;
FUNCTION IntMapToList(
i_map IntMap
) RETURN IntList;
END;
/
CREATE PACKAGE BODY tools IS
FUNCTION IntMapToList(
i_map IntMap
) RETURN IntList
IS
o_list IntList := IntList();
i BINARY_INTEGER;
BEGIN
IF i_map IS NOT NULL THEN
i := o_list.FIRST;
WHILE i IS NOT NULL LOOP
o_list.EXTEND;
o_list( o_list.COUNT ) := i_map( i );
i := i_map.NEXT( i );
END LOOP;
END IF;
RETURN o_list;
END;
END;
/

Passing a ResultSet into a Postgresql Function

Is it possible to pass the results of a postgres query as an input into another function?
As a very contrived example, say I have one query like
SELECT id, name
FROM users
LIMIT 50
and I want to create a function my_function that takes the resultset of the first query and returns the minimum id. Is this possible in pl/pgsql?
SELECT my_function(SELECT id, name FROM Users LIMIT 50); --returns 50
You could use a cursor, but that very impractical for computing a minimum.
I would use a temporary table for that purpose, and pass the table name for use in dynamic SQL:
CREATE OR REPLACE FUNCTION f_min_id(_tbl regclass, OUT min_id int) AS
$func$
BEGIN
EXECUTE 'SELECT min(id) FROM ' || _tbl
INTO min_id;
END
$func$ LANGUAGE plpgsql;
Call:
CREATE TEMP TABLE foo ON COMMIT DROP AS
SELECT id, name
FROM users
LIMIT 50;
SELECT f_min_id('foo');
Major points
The first parameter is of type regclass to prevent SQL injection. More info in this related answer on dba.SE.
I made the temp table ON COMMIT DROP to limit its lifetime to the current transaction. May or may not be what you want.
You can extend this example to take more parameters. Search for code examples for dynamic SQL with EXECUTE.
-> SQLfiddle demo
I would take the problem on the other side, calling an aggregate function for each record of the result set. It's not as flexible but can gives you an hint to work on.
As an exemple to follow your sample problem:
CREATE OR REPLACE FUNCTION myMin ( int,int ) RETURNS int AS $$
SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
$$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE AGGREGATE my_function ( int ) (
SFUNC = myMin, STYPE = int, INITCOND = 2147483647 --maxint
);
SELECT my_function(id) from (SELECT * FROM Users LIMIT 50) x;
It is not possible to pass an array of generic type RECORD to a plpgsql function which is essentially what you are trying to do.
What you can do is pass in an array of a specific user defined TYPE or of a particular table row type. In the example below you could also swap out the argument data type for the table name users[] (though this would obviously mean getting all data in the users table row).
CREATE TYPE trivial {
"ID" integer,
"NAME" text
}
CREATE OR REPLACE FUNCTION trivial_func(data trivial[])
RETURNS integer AS
$BODY$
DECLARE
BEGIN
--Implementation here using data
return 1;
END$BODY$
LANGUAGE 'plpgsql' VOLATILE;
I think there's no way to pass recordset or table into function (but I'd be glad if i'm wrong). Best I could suggest is to pass array:
create or replace function my_function(data int[])
returns int
as
$$
select min(x) from unnest(data) as x
$$
language SQL;
sql fiddle demo

Update multiple columns in a trigger function in plpgsql

Given the following schema:
create table account_type_a (
id SERIAL UNIQUE PRIMARY KEY,
some_column VARCHAR
);
create table account_type_b (
id SERIAL UNIQUE PRIMARY KEY,
some_other_column VARCHAR
);
create view account_type_a view AS select * from account_type_a;
create view account_type_b view AS select * from account_type_b;
I try to create a generic trigger function in plpgsql, which enables updating the view:
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
create trigger trUpdate instead of UPDATE on account_view_type_a
for each row execute procedure updateAccount();
An unsuccessful effort of mine was:
create function updateAccount() returns trigger as $$
declare
target_table varchar := substring(TG_TABLE_NAME from '(.+)_view');
cols varchar;
begin
execute 'select string_agg(column_name,$1) from information_schema.columns
where table_name = $2' using ',', target_table into cols;
execute 'update ' || target_table || ' set (' || cols || ') = select ($1).*
where id = ($1).id' using NEW;
return NULL;
end;
$$ language plpgsql;
The problem is the update statement. I am unable to come up with a syntax that would work here. I have successfully implemented this in PL/Perl, but would be interested in a plpgsql-only solution.
Any ideas?
Update
As #Erwin Brandstetter suggested, here is the code for my PL/Perl solution. I incoporated some of his suggestions.
create function f_tr_up() returns trigger as $$
use strict;
use warnings;
my $target_table = quote_ident($_TD->{'table_name'}) =~ s/^([\w]+)_view$/$1/r;
my $NEW = $_TD->{'new'};
my $cols = join(',', map { quote_ident($_) } keys $NEW);
my $vals = join(',', map { quote_literal($_) } values $NEW);
my $query = sprintf(
"update %s set (%s) = (%s) where id = %d",
$target_table,
$cols,
$vals,
$NEW->{'id'});
spi_exec_query($query);
return;
$$ language plperl;
While #Gary's answer is technically correct, it fails to mention that PostgreSQL does support this form:
UPDATE tbl
SET (col1, col2, ...) = (expression1, expression2, ..)
Read the manual on UPDATE.
It's still tricky to get this done with dynamic SQL. I'll assume a simple case where views consist of the same columns as their underlying tables.
CREATE VIEW tbl_view AS SELECT * FROM tbl;
Problems
The special record NEW is not visible inside EXECUTE. I pass NEW as a single parameter with the USING clause of EXECUTE.
As discussed, UPDATE with list-form needs individual values. I use a subselect to split the record into individual columns:
UPDATE ...
FROM (SELECT ($1).*) x
(Parenthesis around $1 are not optional.) This allows me to simply use two column lists built with string_agg() from the catalog table: one with and one without table qualification.
It's not possible to assign a row value as a whole to individual columns. The manual:
According to the standard, the source value for a parenthesized
sub-list of target column names can be any row-valued expression
yielding the correct number of columns. PostgreSQL only allows the
source value to be a row constructor or a sub-SELECT.
INSERT is implemented simpler. If the structure of view and table are identical we can omit the column definition list. (Can be improved, see below.)
Solution
I made a couple of updates to your approach to make it shine.
Trigger function for UPDATE:
CREATE OR REPLACE FUNCTION f_trg_up()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME from '(.+)_view$'));
_cols text;
_vals text;
BEGIN
SELECT INTO _cols, _vals
string_agg(quote_ident(attname), ', ')
, string_agg('x.' || quote_ident(attname), ', ')
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0; -- no system columns
EXECUTE format('
UPDATE %s
SET (%s) = (%s)
FROM (SELECT ($1).*) x', _tbl, _cols, _vals)
USING NEW;
RETURN NEW; -- Don't return NULL unless you knwo what you're doing
END
$func$;
Trigger function for INSERT:
CREATE OR REPLACE FUNCTION f_trg_ins()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$func$
DECLARE
_tbl regclass := quote_ident(TG_TABLE_SCHEMA) || '.'
|| quote_ident(substring(TG_TABLE_NAME FROM '(.+)_view$'));
BEGIN
EXECUTE format('INSERT INTO %s SELECT ($1).*', _tbl)
USING NEW;
RETURN NEW; -- Don't return NULL unless you know what you're doing
END
$func$;
Triggers:
CREATE TRIGGER trg_instead_up
INSTEAD OF UPDATE ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_up();
CREATE TRIGGER trg_instead_ins
INSTEAD OF INSERT ON a_view
FOR EACH ROW EXECUTE FUNCTION f_trg_ins();
Before Postgres 11 the syntax (oddly) was EXECUTE PROCEDURE instead of EXECUTE FUNCTION - which also still works.
db<>fiddle here - demonstrating INSERT and UPDATE
Old sqlfiddle
Major points
Include the schema name to make the table reference unambiguous. There can be multiple table of the same name in one database with multiple schemas!
Query pg_catalog.pg_attribute instead of information_schema.columns. Less portable, but much faster and allows to use the table-OID.
How to check if a table exists in a given schema
Table names are NOT safe against SQLi when concatenated as strings for dynamic SQL. Escape with quote_ident() or format() or with an object-identifer type. This includes the special trigger function variables TG_TABLE_SCHEMA and TG_TABLE_NAME!
Cast to the object identifier type regclass to assert the table name is valid and get the OID for the catalog look-up.
Optionally use format() to build the dynamic query string safely.
No need for dynamic SQL for the first query on the catalog tables. Faster, simpler.
Use RETURN NEW instead of RETURN NULL in these trigger functions unless you know what you are doing. (NULL would cancel the INSERT for the current row.)
This simple version assumes that every table (and view) has a unique column named id. A more sophisticated version might use the primary key dynamically.
The function for UPDATE allows the columns of view and table to be in any order, as long as the set is the same.
The function for INSERT expects the columns of view and table to be in identical order. If you want to allow arbitrary order, add a column definition list to the INSERT command, just like with UPDATE.
Updated version also covers changes to the id column by using OLD additionally.
Postgresql doesn't support updating multiple columns using the set (col1,col2) = select val1,val2 syntax.
To achieve the same in postgresql you'd use
update target_table
set col1 = d.val1,
col2 = d.val2
from source_table d
where d.id = target_table.id
This is going to make the dynamic query a bit more complex to build as you'll need to iterate the column name list you're using into individual fields. I'd suggest you use array_agg instead of string_agg as an array is easier to process than splitting the string again.
Postgresql UPDATE syntax
documentation on array_agg function

Find out which schema based on table values

My database is separated into schemas based on clients (i.e.: each client has their own schema, with same data structure).
I also happen to have an external action that does not know which schema it should target. It comes from another part of the system that has no concepts of clients and does not know in which client's set it is operating. Before I process it, I have to find out which schema that request needs to target
To find the right schema, I have to find out which holds the record R with a particular unique ID (string)
From my understanding, the following
SET search_path TO schema1,schema2,schema3,...
will only look through the tables in schema1 (or the first schema that matches the table) and will not do a global search.
Is there a way for me to do a global search across all schemas, or am I just going to have to use a for loop and iterate through all of them, one at a time?
You could use inheritance for this. (Be sure to consider the limitations.)
Consider this little demo:
CREATE SCHEMA master; -- no access of others ..
CREATE SEQUENCE master.myseq; -- global sequence for globally unique ids
CREATE table master.tbl (
id int primary key DEFAULT nextval('master.myseq')
, foo text);
CREATE SCHEMA x;
CREATE table x.tbl() INHERITS (master.tbl);
INSERT INTO x.tbl(foo) VALUES ('x');
CREATE SCHEMA y;
CREATE table y.tbl() INHERITS (master.tbl);
INSERT INTO y.tbl(foo) VALUES ('y');
SELECT * FROM x.tbl; -- returns 'x'
SELECT * FROM y.tbl; -- returns 'y'
SELECT * FROM master.tbl; -- returns 'x' and 'y' <-- !!
Now, to actually identify the table a particular row lives in, use the tableoid:
SELECT *, tableoid::regclass AS table_name
FROM master.tbl
WHERE id = 2;
Result:
id | foo | table_name
---+-----+-----------
2 | y | y.tbl
You can derive the source schema from the tableoid, best by querying the system catalogs with the tableoid directly. (The displayed name depends on the setting of search_path.)
SELECT n.nspname
FROM master.tbl t
JOIN pg_class c ON c.oid = t.tableoid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE t.id = 2;
This is also much faster than looping through many separate tables.
You will have to iterate over all namespaces. You can get a lot of this information from the pg_* system catalogs. In theory, you should be able to resolve the client -> schema mapping at request time without talking to the database so that the first SQL call you make is:
SET search_path = client1,global_schema;
While I think Erwin's solution is probably preferable if you can re-structure your tables, an alternative that doesn't require any schema changes is to write a PL/PgSQL function that scans the tables using dynamic SQL based on the system catalog information.
Given:
CREATE SCHEMA a;
CREATE SCHEMA b;
CREATE TABLE a.testtab ( searchval text );
CREATE TABLE b.testtab (LIKE a.testtab);
INSERT INTO a.testtab(searchval) VALUES ('ham');
INSERT INTO b.testtab(searchval) VALUES ('eggs');
The following PL/PgSQL function searches all schemas containing tables named _tabname for values in _colname equal to _value and returns the first matching schema.
CREATE OR REPLACE FUNCTION find_schema_for_value(_tabname text, _colname text, _value text) RETURNS text AS $$
DECLARE
cur_schema text;
foundval integer;
BEGIN
FOR cur_schema IN
SELECT nspname
FROM pg_class c
INNER JOIN pg_namespace n ON (c.relnamespace = n.oid)
WHERE c.relname = _tabname AND c.relkind = 'r'
LOOP
EXECUTE
format('SELECT 1 FROM %I.%I WHERE %I = $1',
cur_schema, _tabname, _colname
) INTO foundval USING _value;
IF foundval = 1 THEN
RETURN cur_schema;
END IF;
END LOOP;
RETURN NULL;
END;
$$ LANGUAGE 'plpgsql';
If there are are no matches then null is returned. If there are multiple matches the result will be one of them, but no guarantee is made about which one. Add an ORDER BY clause to the schema query if you want to return (say) the first in alphabetical order or something. The function is also trivially modified to return setof text and RETURN NEXT cur_schema if you want to return all the matches.
regress=# SELECT find_schema_for_value('testtab','searchval','ham');
find_schema_for_value
-----------------------
a
(1 row)
regress=# SELECT find_schema_for_value('testtab','searchval','eggs');
find_schema_for_value
-----------------------
b
(1 row)
regress=# SELECT find_schema_for_value('testtab','searchval','bones');
find_schema_for_value
-----------------------
(1 row)
By the way, you can re-use the table definitions without inheritance if you want, and you really should. Either use a common composite data type:
CREATE TYPE public.testtab AS ( searchval text );
CREATE TABLE a.testtab OF public.testtab;
CREATE TABLE b.testtab OF public.testtab;
in which case they share the same data type but not any data; or or via LIKE:
CREATE TABLE public.testtab ( searchval text );
CREATE TABLE a.testtab (LIKE public.testtab);
CREATE TABLE b.testtab (LIKE public.testtab);
in which case they're completely unconnected to each other after creation.