Strange 'Unknown table' when performing plpgsql script - sql

I have the same table in several schemas of a PostgreSQL database server. I need to
execute a query like the one below:
CREATE OR REPLACE FUNCTION git_search() RETURNS SETOF git_log AS $$
DECLARE
sch name;
BEGIN
FOREACH sch IN
select schema_name from information_schema.schemata where schema_name not in ('pg_toast','pg_temp_1','pg_toast_temp_1','pg_catalog','information_schema')
LOOP
qry := 'select count(*) from'|| quote_ident(sch) || '.git_log gl where gl.author_contributor_id = 17';
RETURN QUERY qry;
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
select git_search();
but I get the error:
ERROR: "git_log" type not exists
SQL state: 42704
The git_log table is unknown in the first line of the script (the CREATE clause).
Can anybody help me?
There are more than 100 schemas where I need to perform this query. What is the best way to do this? Where can I create the function for this purpose?

The table name would serve just fine as composite type name, because a composite type of the same name (schema-qualified) is created with every table automatically.
The immediate cause of the error: none of your tables (actually the associated composite type of the same name) named git_log can be found in the current search_path, so the type name cannot be resolved.
Since you are operating with many schemas and many instances of tables called git_log, you need to be unambiguous and schema-qualify the table name. Just pick any one of your tables in one of the schemas (one_schema.git_log in the function below); they all share the same layout.
But the rest of your function isn't going to work either. It's not a "plpgsql script", but a function definition. Try this:
CREATE OR REPLACE FUNCTION git_search()
RETURNS SETOF one_schema.git_log AS
$func$
DECLARE
sch text;
BEGIN
FOR sch IN
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name NOT LIKE 'pg_%'
AND schema_name <> 'information_schema'
ORDER BY schema_name
LOOP
RETURN QUERY EXECUTE format(
'SELECT count(*)
FROM %I.git_log
WHERE author_contributor_id = 17', sch);
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM git_search();
Major points
FOREACH is for looping through arrays; you need a FOR loop.
You need dynamic SQL. Search for examples with more explanation.
Call the function with SELECT * FROM.
Related answer (one of many):
Passing column names dynamically for a record variable in PostgreSQL
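One caveat about the function above: count(*) yields a single bigint column, which does not match the row type one_schema.git_log unless that table happens to consist of exactly one bigint column. If the goal is a count per schema, a variant along the following lines may be closer. This is only a sketch: the function name git_search_counts, the parameter _author_id, and the output column names are assumptions, not part of the original answer.

```sql
-- Hypothetical variant: one result row per schema that contains a git_log table.
CREATE OR REPLACE FUNCTION git_search_counts(_author_id int)
  RETURNS TABLE (schema_name text, ct bigint)
  LANGUAGE plpgsql AS
$func$
DECLARE
   sch text;
BEGIN
   FOR sch IN
      SELECT t.table_schema              -- only schemas that actually have the table
      FROM   information_schema.tables t
      WHERE  t.table_name = 'git_log'
      AND    t.table_type = 'BASE TABLE'
      ORDER  BY 1
   LOOP
      -- %I quotes the schema name as an identifier; values are passed with USING
      RETURN QUERY EXECUTE format(
         'SELECT $1, count(*) FROM %I.git_log WHERE author_contributor_id = $2', sch)
      USING sch, _author_id;
   END LOOP;
END
$func$;

SELECT * FROM git_search_counts(17);
```

Looking up schemas through information_schema.tables skips schemas without a git_log table, so the dynamic query should not fail on a missing relation.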

Related

PL/pgSQL dynamic query in stored function to return any table's column names

I am trying to write a stored function with a dynamic query that returns all column names from a table, which can then be used to build a dynamic query for a joined view trigger function. But I am struggling to create a stored function with a dynamic query returning column_name from information_schema.
Here is the SQL query I was hoping to convert to a stored function passing the table_name and table_schema as function parameters:
select
column_name
from
information_schema.columns
where
table_name = 'projects' -- to be replaced by parameter
and table_schema = 'public'; -- to be replaced by parameter
I (think I now) understand the basics of needing to use Execute and Format for neatness, but only got a result with passing a table name. This post had a good example of passing a table name: Refactor a PL/pgSQL function to return the output of various SELECT queries
The idea would be to dynamically get the columns then process into a function based on this scratch dynamic query...
DO $$
DECLARE
item varchar;
column_name varchar default 'name';
table_name varchar default 'projects';
temp_string varchar default '';
begin
FOR item IN execute format('SELECT %I FROM %I',column_name,table_name)
loop
temp_string := temp_string || ',NEW.' || item;
END LOOP;
RAISE NOTICE '%', temp_string;
END$$;
And ultimately into the trigger function for views based on a table with a foreign key join. I.e. so the INSERT and UPDATE code is dynamically created for any parent table of a view with a join:
RETURNS trigger
LANGUAGE plpgsql
AS $function$
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO projects VALUES(NEW.id,NEW.name);
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
UPDATE projects SET id=NEW.id, name=NEW.name WHERE id=OLD.id;
RETURN NEW;
ELSIF TG_OP = 'DELETE' THEN
DELETE FROM projects WHERE id=OLD.id;
RETURN NULL;
END IF;
RETURN NEW;
END;
$function$
And finally work out how to deal with foreign key columns.
End result is the parent table can be updated via the view in QGIS. Is this even possible?
I am not exactly sure I understand what you are after, but I think "parent table can be updated via the view" indicates the goal. If so, you are headed in the wrong direction entirely and none of what you are seeking is needed. What you want is an INSTEAD OF trigger on the view(s). The fiddle here demonstrates an INSTEAD OF trigger on a view generated with a join; such views cannot normally be updated.
Your idea to "dynamically get the columns then process ... and ultimately into the trigger function for views" seems extremely ambitious. A better approach may be to build a template for the trigger and associated functions, then make the necessary column-specific changes. Your trigger(s) must exist well before any DML action on the views.
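As a rough illustration of the INSTEAD OF approach, here is a minimal sketch. Only the projects table with columns id and name comes from the question; the clients table, the client_id foreign key, the view projects_v and the function name projects_v_write are assumptions made up for this example.

```sql
-- Hypothetical parent tables and a join view on top of them.
CREATE TABLE clients  (id int PRIMARY KEY, client_name text);
CREATE TABLE projects (id int PRIMARY KEY, name text, client_id int REFERENCES clients);

CREATE VIEW projects_v AS
SELECT p.id, p.name, p.client_id, c.client_name
FROM   projects p
LEFT   JOIN clients c ON c.id = p.client_id;

-- Redirect writes on the view to the parent table.
CREATE FUNCTION projects_v_write()
  RETURNS trigger
  LANGUAGE plpgsql AS
$function$
BEGIN
   IF TG_OP = 'INSERT' THEN
      INSERT INTO projects (id, name, client_id)
      VALUES (NEW.id, NEW.name, NEW.client_id);
      RETURN NEW;
   ELSIF TG_OP = 'UPDATE' THEN
      UPDATE projects
      SET    id = NEW.id, name = NEW.name, client_id = NEW.client_id
      WHERE  id = OLD.id;
      RETURN NEW;
   ELSE  -- DELETE
      DELETE FROM projects WHERE id = OLD.id;
      RETURN OLD;  -- a non-null result so the row is counted as deleted
   END IF;
END
$function$;

CREATE TRIGGER projects_v_write
INSTEAD OF INSERT OR UPDATE OR DELETE ON projects_v
FOR EACH ROW EXECUTE FUNCTION projects_v_write();  -- EXECUTE PROCEDURE before Postgres 11
```

The joined column client_name is read-only in this sketch; writing it through the view would need extra logic against clients.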

PostgreSQL Function returning result set from dynamic tables names

In my database, I have the standard app tables and backup tables. Eg. for a table "employee", I have a table called "bak_employee". The bak_employee table is a backup of the employee table. I use it to restore the employee table between tests.
I figured I can use these "bak_" tables to see the changes that have occurred during the test like this:
SELECT * FROM employee EXCEPT SELECT * FROM bak_employee
This will show me the inserted and updated records. I'll ignore the deleted records for now.
Now, what I would like to do is go through all the tables in my database to see if there are any changes in any of the tables. I was thinking of doing this as a function so it's easy to call over and over. This is what I have so far:
CREATE OR REPLACE FUNCTION public.show_diff()
RETURNS SETOF diff_tables AS
$BODY$
DECLARE
app_tables text;
BEGIN
FOR app_tables IN
SELECT table_name
FROM information_schema.tables
WHERE table_catalog = 'myDatabase'
AND table_schema = 'public'
AND table_name not like 'bak_%' -- exclude existing backup tables
LOOP
-- somehow loop through tables to see what's changed something like:
EXECUTE 'SELECT * FROM ' || app_tables || ' EXCEPT SELECT * FROM bak_' || app_tables;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql;
But obviously this isn't going to return me any useful information. Any help would be appreciated.
You cannot return various well-known row types from the same function in the same call. A cheap fix is to cast each row type to text, so we have a common return type.
CREATE OR REPLACE FUNCTION public.show_diff()
RETURNS SETOF text AS -- text!!
$func$
DECLARE
app_table text;
BEGIN
FOR app_table IN
SELECT table_name
FROM information_schema.tables
WHERE table_catalog = 'myDatabase'
AND table_schema = 'public'
AND table_name NOT LIKE 'bak_%' -- exclude existing backup tables
LOOP
RETURN NEXT ' ';
RETURN NEXT '=== ' || app_table || ' ===';
RETURN QUERY EXECUTE format(
'SELECT x::text FROM (TABLE %I EXCEPT ALL TABLE %I) x'
, app_table, 'bak_' || app_table);
END LOOP;
RETURN;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM public.show_diff();
I had the test suggested by @a_horse at first, but after your comment I realized that there is no need for this. EXCEPT considers NULL values to be equal and shows all differences.
While at it, I improved and simplified your solution some more. Use EXCEPT ALL: it is cheaper and does not run the risk of folding complete duplicates.
Using EXCEPT clause in PostgreSQL
TABLE is just syntactical sugar.
Is there a shortcut for SELECT * FROM in psql?
However, if you have an index on a unique (combination of) column(s), a JOIN like I suggested before should be faster: finding the only possible duplicate via index should be substantially cheaper.
The crucial element is the cast of the row type to text (x::text).
You can even make the function work for any table, but never more than one at a time, with a polymorphic parameter type:
Refactor a PL/pgSQL function to return the output of various SELECT queries
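To make the difference between EXCEPT and EXCEPT ALL concrete, here is a small sketch with made-up data. The column layout (id, name) and the sample rows are assumptions for illustration only.

```sql
-- Hypothetical sample data: employee holds one extra duplicate and one new row.
CREATE TEMP TABLE employee     (id int, name text);
CREATE TEMP TABLE bak_employee (id int, name text);

INSERT INTO bak_employee VALUES (1, 'Ann'), (2, NULL);
INSERT INTO employee     VALUES (1, 'Ann'), (2, NULL), (2, NULL), (3, 'Bob');

-- EXCEPT folds duplicates first: only the new row (3,Bob) is reported.
SELECT x::text FROM (TABLE employee EXCEPT TABLE bak_employee) x;

-- EXCEPT ALL keeps the surplus duplicate: (2,) and (3,Bob) are reported.
SELECT x::text FROM (TABLE employee EXCEPT ALL TABLE bak_employee) x;
```

The NULL in name does not prevent the rows with id 2 from matching, which is the behavior the answer relies on.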

Check if table is empty in runtime

I am trying to write a script that drops some obsolete tables in a Postgres database. I want to be sure the tables are empty before dropping them. I also want to keep the script among our migration scripts, where it must be safe to run even after these tables have actually been dropped.
Here is my script:
CREATE OR REPLACE FUNCTION __execute(TEXT) RETURNS VOID AS $$
BEGIN EXECUTE $1; END;
$$ LANGUAGE plpgsql STRICT;
CREATE OR REPLACE FUNCTION __table_exists(TEXT, TEXT) RETURNS bool as $$
SELECT exists(SELECT 1 FROM information_schema.tables WHERE (table_schema, table_name, table_type) = ($1, $2, 'BASE TABLE'));
$$ language sql STRICT;
CREATE OR REPLACE FUNCTION __table_is_empty(TEXT) RETURNS bool as $$
SELECT not exists(SELECT 1 FROM $1 );
$$ language sql STRICT;
-- Start migration here
SELECT __execute($$
DROP TABLE oldtable1;
$$)
WHERE __table_exists('public', 'oldtable1')
AND __table_is_empty('oldtable1');
-- drop auxilary functions here
And finally I got:
ERROR: syntax error at or near "$1"
LINE 11: SELECT not exists(SELECT 1 FROM $1 );
Is there any other way?
You must use EXECUTE if you want to pass a table name as parameter in a Postgres function.
Example:
CREATE OR REPLACE FUNCTION __table_is_empty(param character varying)
RETURNS bool
AS $$
DECLARE
v int;
BEGIN
EXECUTE 'select 1 WHERE EXISTS( SELECT 1 FROM ' || quote_ident(param) || ' ) '
INTO v;
IF v THEN return false; ELSE return true; END IF;
END;
$$ LANGUAGE plpgsql;
Demo: http://sqlfiddle.com/#!12/09cb0/1
No, no, no. For many reasons.
@kordirko already pointed out the immediate cause for the error message: In plain SQL, variables can only be used for values, not for key words or identifiers. You can fix that with dynamic SQL, but that still doesn't make your code right.
You are applying programming paradigms from other programming languages. With PL/pgSQL, it is extremely inefficient to split your code into multiple separate tiny sub-functions. The overhead is huge in comparison.
Your actual call is also a time bomb. Expressions in the WHERE clause are executed in any order, so this may or may not raise an exception for non-existing table names:
WHERE __table_exists('public', 'oldtable1')
AND __table_is_empty('oldtable1');
... which will roll back your whole transaction.
Finally, you are completely open to race conditions. Like @Frank already commented, a table can be in use by concurrent transactions, in which case open locks may stall your attempt to drop the table. This could also lead to deadlocks (which the system resolves by rolling back all but one of the competing transactions). Take out an exclusive lock yourself before you check whether the table is (still) empty.
Proper function
This is safe for concurrent use. It takes an array of table names (and optionally a schema name) and only drops existing, empty tables that are not locked in any way:
CREATE OR REPLACE FUNCTION f_drop_tables(_tbls text[] = '{}'
, _schema text = 'public'
, OUT drop_ct int) AS
$func$
DECLARE
_tbl text; -- loop var
_empty bool; -- for empty check
BEGIN
drop_ct := 0; -- init!
FOR _tbl IN
SELECT quote_ident(table_schema) || '.'
|| quote_ident(table_name) -- qualified & escaped table name
FROM information_schema.tables
WHERE table_schema = _schema
AND table_type = 'BASE TABLE'
AND table_name = ANY(_tbls)
LOOP
EXECUTE 'SELECT NOT EXISTS (SELECT 1 FROM ' || _tbl || ')'
INTO _empty; -- check first, only lock if empty
IF _empty THEN
EXECUTE 'LOCK TABLE ' || _tbl; -- now table is ripe for the plucking
EXECUTE 'SELECT NOT EXISTS (SELECT 1 FROM ' || _tbl || ')'
INTO _empty; -- recheck after lock
IF _empty THEN
EXECUTE 'DROP TABLE ' || _tbl; -- go in for the kill
drop_ct := drop_ct + 1; -- count tables actually dropped
END IF;
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql STRICT;
Call:
SELECT f_drop_tables('{foo1,foo2,foo3,foo4}');
To call with a different schema than the default 'public':
SELECT f_drop_tables('{foo1,foo2,foo3,foo4}', 'my_schema');
Major points
Reports the number of tables actually dropped. (Adapt to report info of your choice.)
Using the information schema like in your original. Seems the right choice here, but be aware of subtle limitations:
How to check if a table exists in a given schema
For use under heavy concurrent load (with long transactions), consider the NOWAIT option for the LOCK command and possibly catch exceptions from it.
Per documentation on "Table-level Locks":
ACCESS EXCLUSIVE
Conflicts with locks of all modes (ACCESS SHARE, ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE). This mode guarantees that the holder is the only transaction accessing the table in any way.
Acquired by the ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, CLUSTER, and VACUUM FULL commands. This is also the default lock mode for LOCK TABLE statements that do not specify a mode explicitly.
Bold emphasis mine.
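The NOWAIT suggestion above could look roughly like the following. The helper f_try_lock is a hypothetical name, not part of the answer; it only sketches how to catch the lock_not_available error (SQLSTATE 55P03) that LOCK ... NOWAIT raises when the table is busy.

```sql
-- Hypothetical helper: try to take an ACCESS EXCLUSIVE lock without waiting.
CREATE OR REPLACE FUNCTION f_try_lock(_tbl regclass)
  RETURNS boolean
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- regclass output is already quoted (and schema-qualified where needed), so %s is safe here
   EXECUTE format('LOCK TABLE %s IN ACCESS EXCLUSIVE MODE NOWAIT', _tbl);
   RETURN true;                      -- lock acquired, held until the transaction ends
EXCEPTION
   WHEN lock_not_available THEN
      RETURN false;                  -- another transaction holds a conflicting lock
END
$func$;
```

The lock only helps if the check and the DROP TABLE run in the same transaction as the call; in autocommit mode it is released as soon as the statement finishes.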
I work on DB2 so I can't say for sure if this will work on a Postgres database, but try this:
select case when count(*) > 0 then True else False end from $1
...in place of:
SELECT not exists(SELECT 1 FROM $1 )
If Postgres does not have a CASE / END expression capability, I'd be shocked if it didn't have some kind of similar IF/THEN/ELSE like expression ability to use as a substitute.

Update multiple columns that start with a specific string

I am trying to update a bunch of columns in a DB for testing purposes of a feature. I have a table that is built with hibernate so all of the columns that are created for an embedded entity begin with the same name. I.e. contact_info_address_street1, contact_info_address_street2, etc.
I am trying to figure out if there is a way to do something to the effect of:
UPDATE table SET contact_info_address_* = null;
If not, I know I can do it the long way, just looking for a way to help myself out in the future if I need to do this all over again for a different set of columns.
You need dynamic SQL for this. So you must defend against possible SQL injection.
Basic query
The basic query to generate the DML command needed can look like this:
SELECT format('UPDATE tbl SET (%s) = (%s)'
,string_agg (quote_ident(attname), ', ')
,string_agg ('NULL', ', ')
)
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass
AND NOT attisdropped
AND attnum > 0
AND attname ~~ 'foo_%';
Returns:
UPDATE tbl SET (foo_a, foo_b, foo_c) = (NULL, NULL, NULL);
Make use of the "column-list syntax" of UPDATE to shorten the code and simplify the task.
I query the system catalogs instead of information schema because the latter, while being standardized and guaranteed to be portable across major versions, is also notoriously slow and sometimes unwieldy. There are pros and cons, see:
Get column names and data types of a query, table or view
quote_ident() for the column names prevents SQL-injection - also necessary for identifiers.
string_agg() requires 9.0+.
Full automation with PL/pgSQL function
CREATE OR REPLACE FUNCTION f_update_cols(_tbl regclass, _col_pattern text
, OUT row_ct int, OUT col_ct bigint)
LANGUAGE plpgsql AS
$func$
DECLARE
_sql text;
BEGIN
SELECT INTO _sql, col_ct
format('UPDATE %s SET (%s) = (%s)'
, _tbl -- regclass is rendered as a safely quoted table name
, string_agg (quote_ident(attname), ', ')
, string_agg ('NULL', ', ')
)
, count(*)
FROM pg_attribute
WHERE attrelid = _tbl
AND NOT attisdropped -- no dropped columns
AND attnum > 0 -- no system columns
AND attname LIKE _col_pattern; -- only columns matching pattern
-- RAISE NOTICE '%', _sql; -- output SQL for debugging
EXECUTE _sql;
GET DIAGNOSTICS row_ct = ROW_COUNT;
END
$func$;
COMMENT ON FUNCTION f_update_cols(regclass, text)
IS 'Updates all columns of table _tbl ($1)
that match _col_pattern ($2) in a LIKE expression.
Returns the count of columns (col_ct) and rows (row_ct) affected.';
Call:
SELECT * FROM f_update_cols('myschema.tbl', 'foo%');
To make the function more practical, it returns information as described in the comment. More about obtaining the result status in plpgsql in the manual.
I use the variable _sql to hold the query string, so I can collect the number of columns found (col_ct) in the same query.
The object identifier type regclass is the most efficient way to automatically avoid SQL injection (and sanitize non-standard names) for the table name, too. You can use schema-qualified table names to avoid ambiguities. I would advise to do so if you (can) have multiple schemas in your db! See:
Table name as a PostgreSQL function parameter
There's no handy shortcut sorry. If you have to do this kind of thing a lot, you could create a function to dynamically execute sql and achieve your goal.
CREATE OR REPLACE FUNCTION reset_cols() RETURNS boolean AS $$ BEGIN
EXECUTE (select 'UPDATE table SET '
|| array_to_string(array(
select column_name::text
from information_schema.columns
where table_name = 'table'
and column_name::text like 'contact_info_address_%'
),' = NULL,')
|| ' = NULL');
RETURN true;
END; $$ LANGUAGE plpgsql;
-- run the function
SELECT reset_cols();
It's not very nice though. A better function would be one that accepts the tablename and column prefix as args. Which I'll leave as an exercise for the readers :)

use selected value as table name in postgres

I've searched for the answer, but didn't find one.
So I've got a table types:
CREATE TABLE types
(
type_id serial NOT NULL,
type_name character varying,
CONSTRAINT un_type_name UNIQUE (type_name)
)
which holds type names, let's say users, and this is the name of the corresponding table users. This design may be a bit ugly, but it was made to allow users to create their own types. (Is there a better way to achieve this?)
now i want to perform a query like this one:
select type_name, (select count(*) from ???) from types
to get list of all type names and count of objects of each type.
can this be done?
You cannot do it directly in SQL
You can use a PLpgSQL function and dynamic SQL
CREATE OR REPLACE FUNCTION tables_count(OUT type_name character varying, OUT rows bigint)
RETURNS SETOF record AS $$
BEGIN
FOR tables_count.type_name IN SELECT types.type_name FROM types
LOOP
EXECUTE 'SELECT COUNT(*) FROM ' || quote_ident(tables_count.type_name) INTO tables_count.rows;
RETURN NEXT;
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM tables_count();
I don't have enough information, but I do suspect that something is off with your design. You shouldn't need an extra table for every type.
Be that as it may, what you want to do cannot be done - in pure SQL.
It can be done with a plpgsql function executing dynamic SQL, though:
CREATE OR REPLACE FUNCTION f_type_ct()
RETURNS TABLE (type_name text, ct bigint) AS
$BODY$
DECLARE
tbl text;
BEGIN
FOR tbl IN SELECT t.type_name FROM types t ORDER BY t.type_name
LOOP
RETURN QUERY EXECUTE
'SELECT $1, count(*) FROM ' || tbl::regclass
USING tbl;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql;
Call:
SELECT * FROM f_type_ct();
You'll need to study most of the chapter about plpgsql in the manual to understand what's going on here.
One special hint: the cast to regclass is a safeguard against SQLi. You could also use the more generally applicable quote_ident() for that, but that does not properly handle schema-qualified table names, while the cast to regclass does. It also only accepts table names that are visible to the calling user.
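The difference between quote_ident() and a regclass lookup can be seen with a few standalone calls. The names used here (users, my_schema, My Table, no_such_table) are made-up examples, and the last line assumes no table called no_such_table exists in the current database.

```sql
-- quote_ident() treats its whole argument as ONE identifier.
SELECT quote_ident('users');             -- users
SELECT quote_ident('My Table');          -- "My Table"
SELECT quote_ident('my_schema.users');   -- "my_schema.users" (not schema + table)

-- To quote schema and table separately, format() with %I is one option.
SELECT format('%I.%I', 'my_schema', 'My Table');   -- my_schema."My Table"

-- A regclass lookup resolves the name against existing tables instead.
-- to_regclass() (Postgres 9.4+) returns NULL rather than raising an error.
SELECT to_regclass('no_such_table');     -- NULL, assuming no such table exists
```

A plain cast such as 'no_such_table'::regclass raises an error instead of returning NULL, which is what makes it a safeguard inside the function above.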