I recently learned about row-level security policies in SQL, and love the idea of being able to run important security logic inside the database. However, I'm not sure how to make the workflow for updating RLS policies as good as updating API code is.
Since RLS policies are essentially just stateless logic, it feels like I should be able to define them all in an sql schema file that is checked into git and run on each deploy to idempotently set the policies.
We can get some of the way there by using drop if exists, e.g.:
drop policy if exists "everyone can read" on patterns;
create policy "everyone can read" on patterns for select using (auth.role() = 'anon');
This seems pretty good, but it's not quite truly declarative, because it won't drop any policies that still exist in the database from a previous version of the schema. Can I drop all existing policies for a table and then recreate them? Or is there another way to go about this?
I figured out how to drop all policies. There are two catalog tables that list policies; pg_policy has low-level info, whereas pg_policies is more useful for this because it includes the table names directly.
-- Drop all existing policies
do
$$
declare
rec record;
begin
for rec in (SELECT tablename, policyname FROM pg_policies)
loop
execute 'drop policy "'||rec.policyname||'" on '||rec.tablename;
end loop;
end;
$$;
You have two options:
query pg_policy for all policies on a table and drop them
drop all policies that ever existed in any version of your schema
Related
I'm going to guess that the answer is "no" based on the below error message (and this Google result), but is there anyway to perform a cross-database query using PostgreSQL?
databaseA=# select * from databaseB.public.someTableName;
ERROR: cross-database references are not implemented:
"databaseB.public.someTableName"
I'm working with some data that is partitioned across two databases although data is really shared between the two (userid columns in one database come from the users table in the other database). I have no idea why these are two separate databases instead of schema, but c'est la vie...
Note: As the original asker implied, if you are setting up two databases on the same machine you probably want to make two schemas instead - in that case you don't need anything special to query across them.
postgres_fdw
Use postgres_fdw (foreign data wrapper) to connect to tables in any Postgres database - local or remote.
Note that there are foreign data wrappers for other popular data sources. At this time, only postgres_fdw and file_fdw are part of the official Postgres distribution.
For Postgres versions before 9.3
Versions this old are no longer supported, but if you need to do this in a pre-2013 Postgres installation, there is a function called dblink.
I've never used it, but it is maintained and distributed with the rest of PostgreSQL. If you're using the version of PostgreSQL that came with your Linux distro, you might need to install a package called postgresql-contrib.
dblink() -- executes a query in a remote database
dblink executes a query (usually a SELECT, but it can be any SQL
statement that returns rows) in a remote database.
When two text arguments are given, the first one is first looked up as
a persistent connection's name; if found, the command is executed on
that connection. If not found, the first argument is treated as a
connection info string as for dblink_connect, and the indicated
connection is made just for the duration of this command.
one of the good example:
SELECT *
FROM table1 tb1
LEFT JOIN (
SELECT *
FROM dblink('dbname=db2','SELECT id, code FROM table2')
AS tb2(id int, code text);
) AS tb2 ON tb2.column = tb1.column;
Note: I am giving this information for future reference. Reference
I have run into this before an came to the same conclusion about cross database queries as you. What I ended up doing was using schemas to divide the table space that way I could keep the tables grouped but still query them all.
Just to add a bit more information.
There is no way to query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave.
contrib/dblink allows cross-database queries using function calls. Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.
PostgreSQL FAQ
Yes, you can by using DBlink (postgresql only) and DBI-Link (allows foreign cross database queriers) and TDS_LInk which allows queries to be run against MS SQL server.
I have used DB-Link and TDS-link before with great success.
I have checked and tried to create a foreign key relationships between 2 tables in 2 different databases using both dblink and postgres_fdw but with no result.
Having read the other peoples feedback on this, for example here and here and in some other sources it looks like there is no way to do that currently:
The dblink and postgres_fdw indeed enable one to connect to and query tables in other databases, which is not possible with the standard Postgres, but they do not allow to establish foreign key relationships between tables in different databases.
If performance is important and most queries are read-only, I would suggest to replicate data over to another database. While this seems like unneeded duplication of data, it might help if indexes are required.
This can be done with simple on insert triggers which in turn call dblink to update another copy. There are also full-blown replication options (like Slony) but that's off-topic.
see https://www.cybertec-postgresql.com/en/joining-data-from-multiple-postgres-databases/ [published 2017]
These days you also have the option to use https://prestodb.io/
You can run SQL on that PrestoDB node and it will distribute the SQL query as required. It can connect to the same node twice for different databases, or it might be connecting to different nodes on different hosts.
It does not support:
DELETE
ALTER TABLE
CREATE TABLE (CREATE TABLE AS is supported)
GRANT
REVOKE
SHOW GRANTS
SHOW ROLES
SHOW ROLE GRANTS
So you should only use it for SELECT and JOIN needs. Connect directly to each database for the above needs. (It looks like you can also INSERT or UPDATE which is nice)
Client applications connect to PrestoDB primarily using JDBC, but other types of connection are possible including a Tableu compatible web API
This is an open source tool governed by the Linux Foundation and Presto Foundation.
The founding members of the Presto Foundation are: Facebook, Uber,
Twitter, and Alibaba.
The current members are: Facebook, Uber, Twitter, Alibaba, Alluxio,
Ahana, Upsolver, and Intel.
In case someone needs a more involved example on how to do cross-database queries, here's an example that cleans up the databasechangeloglock table on every database that has it:
CREATE EXTENSION IF NOT EXISTS dblink;
DO
$$
DECLARE database_name TEXT;
DECLARE conn_template TEXT;
DECLARE conn_string TEXT;
DECLARE table_exists Boolean;
BEGIN
conn_template = 'user=myuser password=mypass dbname=';
FOR database_name IN
SELECT datname FROM pg_database
WHERE datistemplate = false
LOOP
conn_string = conn_template || database_name;
table_exists = (select table_exists_ from dblink(conn_string, '(select Count(*) > 0 from information_schema.tables where table_name = ''databasechangeloglock'')') as (table_exists_ Boolean));
IF table_exists THEN
perform dblink_exec(conn_string, 'delete from databasechangeloglock');
END IF;
END LOOP;
END
$$
In our test environment, the schema is prepended to the trigger DDL as one might expect. However, in our QA and PROD environments, the schema prefix doesn't show in the DDL. We always connect as the "SCHEMA" user so it hasn't been a problem thus far. Is it worth updating the QA and PROD DDL's to include the schema prefix? If we don't ever connect to the DB as a user/schema other than "SCHEMA", do we really have anything to worry about?
TEST DDL:
create or replace TRIGGER "SCHEMA"."MDATA_BIR_TRG"
BEFORE INSERT ON "SCHEMA"."METADATA"
FOR EACH ROW
BEGIN
---CODE HERE.
END;
QA DDL:
create or replace TRIGGER "MDATA_BIR_TRG"
BEFORE INSERT ON "METADATA"
FOR EACH ROW
BEGIN
---CODE HERE.
END;
I agree with omeinusch that the schema name is not that important (as long as the current schema is the same as the schema where the object is intended to reside). There is no need to recompile the trigger and make it fully qualified.
A common approach to exporting an object's DDL is to use the SQL Developer's export wizard which does allow you to indicate whether the DDL of the object is schema qualified.
Directions to obtain DDL from SQL Developer export wizard
right click on the object in the connection navigator and select export
choose characteristics of export (include schema by selecting check)
make sure file path is entered.
click next.
No, the SCHEMA is optional and only needed if you want ensure that the handled object belongs to a defined schema or not. If you "don't care" and always use mean your current schema, you can omit it.
I'm new to SQL programming, and I couldn't find an answer to this question online.
I'm working with pl/pgsql and I wish to achieve the following result:
I have a table A with certain attributes.
I am supposed to keep this table updated at any time - thus whenever a change was made that can affect A's values (in other tables B or C which are related to A) - a trigger is fired which updates the values (in the process - new values can be inserted into A, as well as old values can be deleted).
At the same time, I want to prevent from someone insert values into A.
What I want to do is to create a trigger which will prevent insertion into A (by returning NULL) - but I don't want this trigger to be called when I'm doing the insertion from another Trigger - so eventually - insertion to A will only be allowed from within a specific trigger.
As I said before, I'm new to SQL, and I don't know if this is even possible.
Yes, totally possible.
1. Generally disallow UPDATE to A
I would operate with privileges:
REVOKE ALL ON TABLE A FROM public; -- and from anybody else who might have it
That leaves superusers such as postgres who ignore these lowly restrictions. Catch those inside your trigger-function on A with pg_has_role():
IF pg_has_role('postgres', 'member') THEN
RETURN NULL;
END IF;
Where postgres is an actual superuser. Note: this catches other superusers as well, since they are member of every role, even other superusers.
You could catch non-superusers in a similar fashion (alternative to the REVOKE approach).
2. Allow UPDATE for daemon role
Create a non-login role, which is allowed to update A:
CREATE ROLE a_update NOLOGIN;
-- GRANT USAGE ON SCHEMA xyz TO a_update; -- may be needed, too
GRANT UPDATE ON TABLE A TO a_update;
Create trigger functions on tables B and C, owned by this daemon role and with SECURITY DEFINER. Details:
Is there a way to disable updates/deletes but still allow triggers to perform them?
Add to the trigger function on A:
IF pg_has_role('postgres', 'member') THEN
RETURN NULL;
ELSIF pg_has_role('a_update', 'member') THEN
RETURN NEW;
END IF;
For simple 1:1 dependencies, you can also work with foreign key constraints (additionally) using ON UPDATE CASCADE.
I have a Sql Server 2008 database I inherited. A number of apps and SSIS packages work off that database. Not too long ago the scope of the database changed and a lot of new tables were added. As a result of this a lot of the table names (and even the database name itself) no longer make sense, resulting in a very confusing schema.
I could rename the tables straight away and change the apps and processes to use the new names but the chaos and downtime it would cause in the meantime would not be acceptable.
Is there a way I can add an alternate name for a table (like a permanent alias) that I could use to refer to either the new or old table name until all of my refactoring is complete?
Create a synonym first.
CREATE SYNONYM dbo.SensibleName FOR dbo.CrazyName;
Now find all the references to CrazyName in your codebase, and update them to reference SensibleName instead. Once you believe you have found them all, you can eventually run:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
DROP SYNONYM dbo.SensibleName;
EXEC sp_rename N'dbo.CrazyName', N'SensibleName', N'OBJECT';
COMMIT TRANSACTION;
If you need to make column names more sensible, you'll have to do so using a view, as synonyms only cover a subset of database-level objects.
Some other info here.
You can rename it with sp_rename and then add synonym:
CREATE SYNONYM OldTableName FOR NewTableName
I'm going to guess that the answer is "no" based on the below error message (and this Google result), but is there anyway to perform a cross-database query using PostgreSQL?
databaseA=# select * from databaseB.public.someTableName;
ERROR: cross-database references are not implemented:
"databaseB.public.someTableName"
I'm working with some data that is partitioned across two databases although data is really shared between the two (userid columns in one database come from the users table in the other database). I have no idea why these are two separate databases instead of schema, but c'est la vie...
Note: As the original asker implied, if you are setting up two databases on the same machine you probably want to make two schemas instead - in that case you don't need anything special to query across them.
postgres_fdw
Use postgres_fdw (foreign data wrapper) to connect to tables in any Postgres database - local or remote.
Note that there are foreign data wrappers for other popular data sources. At this time, only postgres_fdw and file_fdw are part of the official Postgres distribution.
For Postgres versions before 9.3
Versions this old are no longer supported, but if you need to do this in a pre-2013 Postgres installation, there is a function called dblink.
I've never used it, but it is maintained and distributed with the rest of PostgreSQL. If you're using the version of PostgreSQL that came with your Linux distro, you might need to install a package called postgresql-contrib.
dblink() -- executes a query in a remote database
dblink executes a query (usually a SELECT, but it can be any SQL
statement that returns rows) in a remote database.
When two text arguments are given, the first one is first looked up as
a persistent connection's name; if found, the command is executed on
that connection. If not found, the first argument is treated as a
connection info string as for dblink_connect, and the indicated
connection is made just for the duration of this command.
one of the good example:
SELECT *
FROM table1 tb1
LEFT JOIN (
SELECT *
FROM dblink('dbname=db2','SELECT id, code FROM table2')
AS tb2(id int, code text);
) AS tb2 ON tb2.column = tb1.column;
Note: I am giving this information for future reference. Reference
I have run into this before an came to the same conclusion about cross database queries as you. What I ended up doing was using schemas to divide the table space that way I could keep the tables grouped but still query them all.
Just to add a bit more information.
There is no way to query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave.
contrib/dblink allows cross-database queries using function calls. Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.
PostgreSQL FAQ
Yes, you can by using DBlink (postgresql only) and DBI-Link (allows foreign cross database queriers) and TDS_LInk which allows queries to be run against MS SQL server.
I have used DB-Link and TDS-link before with great success.
I have checked and tried to create a foreign key relationships between 2 tables in 2 different databases using both dblink and postgres_fdw but with no result.
Having read the other peoples feedback on this, for example here and here and in some other sources it looks like there is no way to do that currently:
The dblink and postgres_fdw indeed enable one to connect to and query tables in other databases, which is not possible with the standard Postgres, but they do not allow to establish foreign key relationships between tables in different databases.
If performance is important and most queries are read-only, I would suggest to replicate data over to another database. While this seems like unneeded duplication of data, it might help if indexes are required.
This can be done with simple on insert triggers which in turn call dblink to update another copy. There are also full-blown replication options (like Slony) but that's off-topic.
see https://www.cybertec-postgresql.com/en/joining-data-from-multiple-postgres-databases/ [published 2017]
These days you also have the option to use https://prestodb.io/
You can run SQL on that PrestoDB node and it will distribute the SQL query as required. It can connect to the same node twice for different databases, or it might be connecting to different nodes on different hosts.
It does not support:
DELETE
ALTER TABLE
CREATE TABLE (CREATE TABLE AS is supported)
GRANT
REVOKE
SHOW GRANTS
SHOW ROLES
SHOW ROLE GRANTS
So you should only use it for SELECT and JOIN needs. Connect directly to each database for the above needs. (It looks like you can also INSERT or UPDATE which is nice)
Client applications connect to PrestoDB primarily using JDBC, but other types of connection are possible including a Tableu compatible web API
This is an open source tool governed by the Linux Foundation and Presto Foundation.
The founding members of the Presto Foundation are: Facebook, Uber,
Twitter, and Alibaba.
The current members are: Facebook, Uber, Twitter, Alibaba, Alluxio,
Ahana, Upsolver, and Intel.
In case someone needs a more involved example on how to do cross-database queries, here's an example that cleans up the databasechangeloglock table on every database that has it:
CREATE EXTENSION IF NOT EXISTS dblink;
DO
$$
DECLARE database_name TEXT;
DECLARE conn_template TEXT;
DECLARE conn_string TEXT;
DECLARE table_exists Boolean;
BEGIN
conn_template = 'user=myuser password=mypass dbname=';
FOR database_name IN
SELECT datname FROM pg_database
WHERE datistemplate = false
LOOP
conn_string = conn_template || database_name;
table_exists = (select table_exists_ from dblink(conn_string, '(select Count(*) > 0 from information_schema.tables where table_name = ''databasechangeloglock'')') as (table_exists_ Boolean));
IF table_exists THEN
perform dblink_exec(conn_string, 'delete from databasechangeloglock');
END IF;
END LOOP;
END
$$