Postgres SQL query across different schemas - sql

We have multiple schemas, and I would like to run a simple count query across them, such as:
SELECT COUNT(col_x) FROM schema1.table WHERE col_x IS NOT NULL
I saw that I'm able to get all the schemas with:
SELECT schema_name FROM information_schema.schemata
So by using:
set search_path to schema1;
SELECT COUNT(col_x)
FROM table
WHERE col_x is not NULL;
I was able to run the query for schema1
The question is: is it possible to run this in a loop, using the schema name as a parameter for search_path, so that the query runs across all schemas? Or is there any other efficient way to do so?

You will need some plpgsql and dynamic SQL for this. Here is an anonymous block for illustration:
do language plpgsql
$$
declare
    v_schema_name   text;
    table_row_count bigint;
    sysSchema       text[] := array['pg_toast','pg_temp_1','pg_toast_temp_1','pg_catalog','public','information_schema'];
    -- other declarations here
begin
    for v_schema_name in
        SELECT schema_name FROM information_schema.schemata WHERE schema_name != ALL(sysSchema)
    loop
        begin
            execute format('select count(col_x) from %I.t_table', v_schema_name)
            into table_row_count;
            raise notice 'Schema % count %', v_schema_name, table_row_count;
        exception when others then
            null; -- t_table may not exist in some schemata
        end;
        -- other statements here
    end loop;
end;
$$;
And btw, WHERE col_x IS NOT NULL is redundant: count(col_x) already counts only non-NULL values.
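A quick illustration of that (hypothetical throwaway table, just to show the behaviour):
create temp table demo (col_x int);
insert into demo values (1), (null), (3);
select count(col_x) from demo;                          -- 2
select count(col_x) from demo where col_x is not null;  -- 2 as well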

Related

Can I select data across multiple schemas within the same SQL database?

I have one database with multiple schemas. I would like to run SELECT * FROM info across all schemas that start with "team".
The schemas are not fixed, meaning that schemas are added and dropped continuously, so I can't hardcode them in the query. But I'm only interested in schemas that start with "team". How do I do this?
If all tables have an identical structure, you can write a PL/pgSQL function that does this:
create function get_info(p_schema_prefix text)
    returns table (... column definitions go here ...)
as
$$
declare
    l_rec record;
    l_sql text;
begin
    for l_rec in select table_schema, table_name
                 from information_schema.tables
                 where table_name = 'info'
                   and table_schema like p_schema_prefix||'%'
    loop
        l_sql := format('select id, data from %I.%I', l_rec.table_schema, l_rec.table_name);
        return query execute l_sql;
    end loop;
end;
$$
language plpgsql;
Then use it like this:
select *
from get_info('team')
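For illustration, assuming each info table has the columns id bigint and data text (the columns the SELECT inside the loop pulls out; adjust to your real definition), the return clause would be:
returns table (id bigint, data text)
The column list has to match what the dynamic SELECT returns, otherwise return query execute fails with an error like "structure of query does not match function result type".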

How to delete all schemas in postgres

I'm using django-tenants, and for some tests I need to delete all schemas at once, so I was wondering how I could delete all schemas with a single statement/script from the postgresql shell, because deleting them one by one is not scalable.
Thanks so much.
To delete all schemas you must use dynamic SQL, and you can get the schema names from the system catalogs (for example information_schema.schemata). Example query:
do
$body$
declare
    f_rec record;
begin
    for f_rec in
        SELECT schema_name::text
        FROM information_schema.schemata
        WHERE schema_name <> 'public'
          AND schema_name <> 'information_schema'
          AND schema_name NOT LIKE 'pg\_%'   -- skip system schemas, which cannot be dropped
    loop
        -- format %I quotes the schema name safely
        execute format('DROP SCHEMA %I CASCADE', f_rec.schema_name);
    end loop;
end;
$body$
language plpgsql;
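If you want to double-check first, you can generate the DROP statements without executing anything, using the same filter:
SELECT format('DROP SCHEMA %I CASCADE;', schema_name)
FROM information_schema.schemata
WHERE schema_name <> 'public'
  AND schema_name <> 'information_schema'
  AND schema_name NOT LIKE 'pg\_%';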

Access dynamic column name of row type in trigger function

I am trying to create a dynamic function to use for setting up triggers.
CREATE OR REPLACE FUNCTION device_bid_modifiers_count_per()
RETURNS TRIGGER AS
$$
DECLARE
    devices_count INTEGER;
    table_name regclass := TG_ARGV[0];
    column_name VARCHAR := TG_ARGV[1];
BEGIN
    LOCK TABLE device_types IN EXCLUSIVE MODE;
    EXECUTE format('LOCK TABLE %s IN EXCLUSIVE MODE', table_name);
    SELECT INTO devices_count device_types_count();
    IF TG_OP = 'DELETE' THEN
        SELECT format(
            'PERFORM validate_bid_modifiers_count(%s, %s, OLD.%s, %s)',
            table_name,
            column_name,
            column_name,
            devices_count
        );
    ELSE
        SELECT format(
            'PERFORM validate_bid_modifiers_count(%s, %s, NEW.%s, %s)',
            table_name,
            column_name,
            column_name,
            devices_count
        );
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
My issue is with the execution of the dynamic function validate_bid_modifiers_count(). Currently it throws:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT: PL/pgSQL function device_bid_modifiers_count_per() line 21 at SQL statement
I can't really wrap my head around this. I understand that format() returns the correct string for the function call with its arguments. How do I fix this and make it work?
This should do it:
CREATE OR REPLACE FUNCTION device_bid_modifiers_count_per()
RETURNS TRIGGER AS
$func$
DECLARE
    devices_count int := device_types_count();
    table_name regclass := TG_ARGV[0];
    column_name text := TG_ARGV[1];
BEGIN
    LOCK TABLE device_types IN EXCLUSIVE MODE;
    EXECUTE format('LOCK TABLE %s IN EXCLUSIVE MODE', table_name);

    IF TG_OP = 'DELETE' THEN
        PERFORM validate_bid_modifiers_count(table_name
                                           , column_name
                                           , (row_to_json(OLD) ->> column_name)::bigint
                                           , devices_count);
    ELSE
        PERFORM validate_bid_modifiers_count(table_name
                                           , column_name
                                           , (row_to_json(NEW) ->> column_name)::bigint
                                           , devices_count);
    END IF;

    RETURN NEW;
END
$func$ LANGUAGE plpgsql;
The immediate cause of the error message was the outer SELECT. Without a target, you need to replace it with PERFORM in plpgsql. But the inner PERFORM in the query string passed to EXECUTE was wrong, too: PERFORM is a plpgsql command, not valid in an SQL string passed to EXECUTE, which expects SQL code. You have to use SELECT there. Finally, OLD and NEW are not visible inside EXECUTE and would each raise an exception of their own the way you had it. All issues are fixed by dropping EXECUTE.
A simple and fast way to get the value of a dynamic column name from the row types OLD and NEW: cast to json, then you can parameterize the key name as demonstrated. This should be a bit simpler and faster than the alternative with dynamic SQL, which is possible as well, like:
...
EXECUTE format('SELECT validate_bid_modifiers_count($1, $2, ($3.%I)::bigint, $4)'
             , column_name)
USING table_name, column_name, OLD, devices_count;
...
Related:
Get values from varying columns in a generic trigger
Trigger with dynamic field name
Aside: Not sure why you need the heavy locks.
Aside 2: Consider writing a separate trigger function for each trigger instead. More noisy DDL, but simpler and faster to execute.
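For example, a dedicated function for the campaign_id case might look roughly like this (a minimal sketch; the function name is made up, the locks are left out, and a DELETE trigger would use OLD instead of NEW):
CREATE OR REPLACE FUNCTION campaign_bid_modifiers_count_per()
RETURNS TRIGGER AS
$func$
BEGIN
    -- the column name is hard-coded, so no dynamic lookup is needed
    PERFORM validate_bid_modifiers_count(TG_RELID::regclass
                                       , 'campaign_id'
                                       , NEW.campaign_id
                                       , device_types_count());
    RETURN NEW;
END
$func$ LANGUAGE plpgsql;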
As I pointed out in the comment to Erwin Brandstetter's answer, I initially had an almost identical solution.
But the issue was that I was getting the error
ERROR: record "new" has no field "column_name"
CONTEXT: SQL statement "SELECT validate_bid_modifiers_count(table_name, column_name, NEW.column_name, devices_count)"
PL/pgSQL function device_bid_modifiers_count_per() line 15 at PERFORM
This is why I thought I needed a way to dynamically evaluate things.
I currently have this working with the following solution, which still looks ugly to me (ugly because I don't like the 2 IF statements; I would like it to be super dynamic, but maybe I am asking for too much):
CREATE OR REPLACE FUNCTION device_bid_modifiers_count_per()
RETURNS TRIGGER AS
$func$
DECLARE
    row RECORD;
    table_name regclass := TG_ARGV[0];
    column_name text := TG_ARGV[1];
    devices_count INTEGER;
BEGIN
    LOCK TABLE device_types IN EXCLUSIVE MODE;
    EXECUTE format('LOCK TABLE %s IN EXCLUSIVE MODE', table_name);
    devices_count := device_types_count();
    IF TG_OP = 'DELETE' THEN
        row := OLD;
    ELSE
        row := NEW;
    END IF;
    IF column_name = 'campaign_id' THEN
        PERFORM validate_bid_modifiers_count(table_name, column_name, row.campaign_id, devices_count);
    ELSIF column_name = 'adgroup_id' THEN
        PERFORM validate_bid_modifiers_count(table_name, column_name, row.adgroup_id, devices_count);
    ELSE
        RAISE EXCEPTION 'invalid_column_name %', column_name;
    END IF;
    RETURN NEW;
END;
$func$ LANGUAGE plpgsql;
I am open to more robust solution suggestions.
Basically, the second condition kind of defeats the purpose of having a single function; at this point I could just as well have split it into two functions. The goal is to define multiple (2) triggers using this one function (providing arguments to it).

PL/PGSQL dynamic trigger for all tables in schema

I am looking to automate each table update with an automatic update of the updated_at column. I am able to make this work for a specific table using a trigger. But my main goal, which I can't find anywhere, is to create a function that dynamically grabs all the tables in the schema and creates that same trigger, only changing the table name the trigger references. For the life of me I can't figure it out.
I believe this shouldn't be as tricky as I'm making it, as every table in our schema has the exact same column name of 'updated_at'.
One solution that I tried and thought would work was turning the table schema into an array and iterating through that to create the trigger on each iteration. But I don't have a ton of psql experience, so I find myself googling for hours to solve this one little thing.
SELECT ARRAY (
    SELECT table_name::text
    FROM information_schema.tables
    WHERE table_schema = 'public') AS tables;
I have also tried:
DO $$
DECLARE
    t text;
BEGIN
    FOR t IN
        SELECT table_name FROM information_schema.columns
        WHERE column_name = 'updated_at'
    LOOP
        EXECUTE format('CREATE TRIGGER update_updatedAt
                        BEFORE UPDATE ON %I
                        FOR EACH ROW EXECUTE PROCEDURE updated_at()',
                       t);
    END LOOP;
END;
$$ language 'plpgsql';
Procedure:
CREATE OR REPLACE FUNCTION updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = now();
RETURN NEW;
END;
$$ language 'plpgsql';
Your DO block works. The only problem with it is the trigger name: a trigger name only has to be unique per table, but reusing one generic name for every table makes the triggers hard to tell apart, and the loop fails if a trigger with that name already exists on a table. So you can add a table_name suffix/prefix to the trigger name.
DO $$
DECLARE
    t text;
BEGIN
    FOR t IN
        SELECT table_name FROM information_schema.columns
        WHERE column_name = 'updated_at'
    LOOP
        EXECUTE format('CREATE TRIGGER update_updatedAt_%I
                        BEFORE UPDATE ON %I
                        FOR EACH ROW EXECUTE PROCEDURE updated_at()',
                       t, t);
    END LOOP;
END;
$$ language 'plpgsql';
Additionally, you may add a check against information_schema.triggers to see if the trigger already exists, to be safe:
IF NOT EXISTS (SELECT 1 FROM information_schema.triggers
               WHERE trigger_name = 'update_updatedat_' || t)
THEN
    -- EXECUTE the CREATE TRIGGER statement here
END IF;

SQL: send query to all databases available

How is it possible to send a query to all databases on a server? I do not want to input all the database names; the script should auto-detect them.
example query:
SELECT SUM(tourney_results.amt_won)-SUM((tourney_summary.amt_buyin+tourney_summary.amt_fee)) as results
FROM tourney_results
INNER JOIN tourney_summary
ON tourney_results.id_tourney=tourney_summary.id_tourney
Where id_player=(SELECT id_player FROM player WHERE player_name='Apple');
So what I want to achieve here: if there are 2 databases, and the first one would return 60 and the second one would return 50, I need the 55 output here.
All databases would have the same structure, tables etc.
You can do it using plpgsql and dblink. First install the dblink extension in the database you are connecting to:
CREATE EXTENSION dblink;
Then use a plpgsql function which iterates over all databases on the server and executes the query (comments inline). Note that I used a sample query in the function; you have to adapt it to your real query:
CREATE OR REPLACE FUNCTION test_dblink() RETURNS BIGINT AS
$$
DECLARE
    pg_database_row record;
    query_result BIGINT;
    _dbname TEXT;
    _conn_name TEXT;
    return_value BIGINT;
BEGIN
    -- initialize the final value
    return_value := 0;
    -- iterate over the records in the meta table pg_database
    FOR pg_database_row IN SELECT * FROM pg_database WHERE (NOT datistemplate) AND (datallowconn) LOOP
        _dbname := pg_database_row.datname;
        -- build a connection name for dblink
        _conn_name := _dbname || 'myconn';
        -- close the connection if it is already active
        IF _conn_name = ANY (dblink_get_connections()) THEN
            PERFORM dblink_disconnect(_conn_name);
        END IF;
        -- open the connection with the actual database name
        PERFORM dblink_connect(_conn_name, 'dbname=' || _dbname);
        -- check if the table exists in the database
        PERFORM * FROM dblink(_conn_name, 'SELECT 1 FROM pg_tables WHERE tablename = ''your_table''') AS t(id int);
        IF FOUND THEN
            -- if the table exists, perform the query and save the result in a variable
            SELECT * FROM dblink(_conn_name, 'SELECT sum(id) FROM your_table limit 1') AS t(total int) INTO query_result;
            IF query_result IS NOT NULL THEN
                return_value := return_value + query_result;
            END IF;
        END IF;
        PERFORM dblink_disconnect(_conn_name);
    END LOOP;
    RETURN return_value;
END;
$$
LANGUAGE plpgsql;
Execute the function with
select test_dblink();
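To carry the query from the question instead of the sample, the dblink call inside the loop could look roughly like this (a sketch; the result column type is an assumption, and note the doubled single quotes inside the SQL string literal):
SELECT * FROM dblink(_conn_name,
    'SELECT SUM(tourney_results.amt_won) - SUM(tourney_summary.amt_buyin + tourney_summary.amt_fee)
     FROM tourney_results
     INNER JOIN tourney_summary ON tourney_results.id_tourney = tourney_summary.id_tourney
     WHERE id_player = (SELECT id_player FROM player WHERE player_name = ''Apple'')'
) AS t(results numeric) INTO query_result;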