Cascade Delete Children not working as expected - sql

I have two tables, one of which holds the polymorphic relationships between corporations, and I've added foreign key references on the ids so that deleting a parent deletes all of its children. With the table setup below, if I delete a parent corporation the child corporation persists, which is not what I expected. If I delete a corporation_relationship via the parent_id, the parent and its children cascade delete, and if I delete the relationship via the child_id, the parent and siblings are unaffected. My questions are: what am I doing wrong, and how can I ensure that deleting a parent also deletes its children without adding any new columns?
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TYPE "corporation_relationship_type" AS ENUM (
'campus',
'network'
);
CREATE TABLE "corporations" (
"id" uuid PRIMARY KEY NOT NULL DEFAULT uuid_generate_v4(),
"name" varchar(255) NOT NULL
);
CREATE TABLE "corporation_relationships" (
"parent_id" uuid NOT NULL,
"child_id" uuid NOT NULL,
"type" corporation_relationship_type NOT NULL,
PRIMARY KEY ("parent_id", "child_id")
);
ALTER TABLE "corporation_relationships" ADD FOREIGN KEY ("parent_id") REFERENCES "corporations" ("id") ON DELETE CASCADE;
ALTER TABLE "corporation_relationships" ADD FOREIGN KEY ("child_id") REFERENCES "corporations" ("id") ON DELETE CASCADE;
Example queries:
If I add 2 corporations and then add a relationship to the two like so:
insert into corporations (id, name) values ('f9f8f7f6-f5f4f3f2-f1f0f0f0-f0f0f0f0', 'Father');
insert into corporations (id, name) values ('f9f8f7f6-f5f4f3f2-f1f0f0f0-f0f0f0f1', 'Son');
insert into corporation_relationships (parent_id, child_id, type) values ('f9f8f7f6-f5f4f3f2-f1f0f0f0-f0f0f0f0', 'f9f8f7f6-f5f4f3f2-f1f0f0f0-f0f0f0f1', 'campus');
My output for select * from corporations; will be:
id | name
--------------------------------------+--------------------
f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f0 | Father
f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f1 | Son
(2 rows)
My output for select * from corporation_relationships; is:
parent_id | child_id | type
--------------------------------------+--------------------------------------+--------
f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f0 | f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f1 | campus
Now if I delete the 'father' by executing delete FROM corporations WHERE id = 'f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f0'; I would expect my output of select * from corporations; to be nothing but instead it is the following:
id | name
--------------------------------------+--------------------
f9f8f7f6-f5f4-f3f2-f1f0-f0f0f0f0f0f1 | Son
(1 row)
Also, it is noteworthy that the corporation_relationships table is empty after this delete as well but I would want the cascade to keep going past that table and delete the child entity as well.

The second foreign key constraint on corporation_relationships, the one on child_id that references the corporations table, has nothing to do with the cascade you expect. To clarify, that foreign key cascades a delete from corporations to corporation_relationships: when you delete a referenced row in corporations, the relationship rows that point at it are deleted. You need the opposite direction.
To make it work as you expect in your design, you need a column in corporations that references a primary key in corporation_relationships.
So you need to:
create a single-column primary key, e.g. id, in corporation_relationships (the existing key on (parent_id, child_id) is a composite and does not serve this purpose).
create a column in corporations and add a foreign key constraint on it, with a cascade delete rule, that references the new corporation_relationships pk.
remove the child_id column from corporation_relationships; it is redundant at this point.
When you create a relationship, set its id in the fk column of the corresponding child row in corporations.
Now, if you delete a parent corporation, the delete removes all of its relationships, which in turn remove the corresponding child corporations, and so on recursively.
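The steps above could look roughly like the following sketch. The table names corp and corp_rel, the relationship_id column, and the integer keys are assumptions made to keep the example short; none of them come from the question.

```sql
-- Hypothetical tables; integer keys are used only for brevity.
CREATE TABLE corp (
    id              integer PRIMARY KEY,
    name            varchar(255) NOT NULL,
    relationship_id integer            -- NULL for a top-level corporation
);

CREATE TABLE corp_rel (
    id        integer PRIMARY KEY,
    parent_id integer NOT NULL REFERENCES corp (id) ON DELETE CASCADE
);

-- Added afterwards because the two tables reference each other.
ALTER TABLE corp
    ADD FOREIGN KEY (relationship_id) REFERENCES corp_rel (id) ON DELETE CASCADE;

INSERT INTO corp (id, name) VALUES (1, 'Father');
INSERT INTO corp_rel (id, parent_id) VALUES (10, 1);
INSERT INTO corp (id, name, relationship_id) VALUES (2, 'Son', 10);

-- Deletes Father, which cascades to relationship 10, which cascades to Son.
DELETE FROM corp WHERE id = 1;
```

Note that this does add a column to corporations, which the question wanted to avoid.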
That said, in my opinion your design is not correct.
To define tree-like relations you do not need the intermediate table, i.e. corporation_relationships. You can define them in the single corporations table. For that you need just one column, parent_id, which would be a foreign key with a cascade delete rule that references the pk of the same table. Top-level corporations would have NULL in parent_id; every child would hold its parent's id.
Also, the type column in corporation_relationships is not an attribute of the relation itself; it is an attribute of the child.
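A minimal sketch of the single-table variant, assuming each corporation has at most one parent. The table name corporation_tree and the integer keys are hypothetical and chosen only to keep the example self-contained.

```sql
-- Hypothetical table: every row points at its parent in the same table.
CREATE TABLE corporation_tree (
    id        integer PRIMARY KEY,
    name      varchar(255) NOT NULL,
    parent_id integer REFERENCES corporation_tree (id) ON DELETE CASCADE
);

INSERT INTO corporation_tree (id, name, parent_id) VALUES
    (1, 'Father',   NULL),   -- top-level corporation
    (2, 'Son',      1),
    (3, 'Grandson', 2);

-- Deleting the root removes Son, and through Son also Grandson.
DELETE FROM corporation_tree WHERE id = 1;
```

If a corporation can belong to several parents, this layout no longer fits and a separate relationship table is still needed.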

Postgres doesn't maintain referential integrity with optional polymorphic relationships, so I created a trigger to do this for me:
CREATE FUNCTION cascade_delete_children() RETURNS trigger AS $$
BEGIN
-- Check if the corporation is a parent
IF OLD.id IN (SELECT parent_id FROM corporation_relationships) THEN
-- Delete all of the corporation's children
DELETE FROM corporations WHERE id IN (SELECT child_id FROM corporation_relationships WHERE parent_id = OLD.id);
END IF;
RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE trigger cascade_delete_children BEFORE DELETE ON corporations
FOR EACH ROW EXECUTE PROCEDURE cascade_delete_children();

Related

Performance of ON DELETE CASCADE in PostgreSQL

I have an issue with the performance of ON DELETE CASCADE, and I'm trying to understand why it takes so long. For the purposes of this question I simplified the real case to the schema below:
CREATE TABLE IF NOT EXISTS public.items
(
id uuid NOT NULL,
name text COLLATE pg_catalog."default",
CONSTRAINT items_pk PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS public.links
(
parent uuid,
child uuid,
CONSTRAINT links_parent_fk FOREIGN KEY (parent)
REFERENCES public.items (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE,
CONSTRAINT links_child_fk FOREIGN KEY (child)
REFERENCES public.items (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS parent_idx
ON public.links USING btree
(parent ASC NULLS LAST);
CREATE INDEX IF NOT EXISTS child_idx
ON public.links USING btree
(child ASC NULLS LAST);
CREATE EXTENSION "uuid-ossp";
and data can be generated with:
INSERT INTO public.items
SELECT uuid_generate_v4 (), 'item_' || i
FROM generate_series(1, 134001) AS i;
INSERT INTO links
SELECT (SELECT id FROM public.items WHERE name='item_1'), id FROM public.items;
Briefly, the database contains two tables. Table items contains a list of items (identifier and name columns), and table links defines relations between items (parent <-> child). In the presented case all items (children) belong to the item named 'item_1' (parent).
I call a query in order to delete all children assigned to parent:
BEGIN;
EXPLAIN ANALYZE DELETE FROM public.items where id in (SELECT child FROM public.links WHERE parent = (SELECT id FROM public.items WHERE name='item_1'));
ROLLBACK;
Among other things, the execution plan shows:
"Trigger for constraint links_parent_fk: time=10451.471 calls=134001"
"Trigger for constraint links_child_fk: time=2962.035 calls=134001"
The question is: why does the trigger for constraint links_parent_fk consume so much time?
I experimented with swapping the data between the columns of the links table. After that, the trigger for links_child_fk took ~10 s and the trigger for links_parent_fk took ~3 s. I'm curious why there is such a difference between the execution of these two cascades.
PostgreSQL version: 12.4 and 13.9.

PostgreSQL Cascade for columns (not foreign key)

create table parent (
child_type not null
child_id not null
);
create table child1(id not null);
create table child2(id not null);
create table child3(id not null);
And there's some rows in table parent like this:
child_type,child_id
"child1",1
"child1",2
"child2",1
"child3",1
I want to delete child row when I delete parent row.
Is there any way to make this trigger on delete cascade?
I presume that (child_type,child_id) is the primary key of parent (and this advice will only work if it is thus: if you want deleting of a parent row to trigger a delete in a child via a FK cascade, the parent must have a primary key)
You create associations like this:
create table child1(
child_type VARCHAR(20) DEFAULT 'child1',
id INT not null,
FOREIGN KEY (child_type,id) REFERENCES parent(child_type, child_id) ON DELETE CASCADE
);
create table child2(
child_type VARCHAR(20) DEFAULT 'child2',
id INT not null,
FOREIGN KEY (child_type,id) REFERENCES parent(child_type, child_id) ON DELETE CASCADE
);
create table child3(
child_type VARCHAR(20) DEFAULT 'child3',
id INT not null,
FOREIGN KEY (child_type,id) REFERENCES parent(child_type, child_id) ON DELETE CASCADE
);
The child's foreign key cannot reference child_id alone when the parent's PK is composite; the child must have the same N columns, holding the same values, as the parent PK.
FWIW that table structure is really wonky, and it will probably come around to bite you time and again.
Prefer something more normal, like:
create table parent (
id INT PRIMARY KEY
);
create table child1(id INT PRIMARY KEY, parent_id INT REFERENCES parent(id));
create table child2(id INT PRIMARY KEY, parent_id INT REFERENCES parent(id));
create table child3(id INT PRIMARY KEY, parent_id INT REFERENCES parent(id));
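If the goal is for child rows to disappear together with their parent, the foreign keys in such a layout can carry ON DELETE CASCADE. A minimal sketch with one child table; the names parent_norm and child1_norm and the integer keys are assumptions for illustration.

```sql
-- Hypothetical names; the child points at the parent, not the other way round.
CREATE TABLE parent_norm (
    id integer PRIMARY KEY
);

CREATE TABLE child1_norm (
    id        integer PRIMARY KEY,
    parent_id integer NOT NULL REFERENCES parent_norm (id) ON DELETE CASCADE
);

INSERT INTO parent_norm (id) VALUES (1), (2);
INSERT INTO child1_norm (id, parent_id) VALUES (10, 1), (11, 1), (12, 2);

-- Removes parent 1 together with children 10 and 11; child 12 stays.
DELETE FROM parent_norm WHERE id = 1;
```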
I hope this is a contrived stand-in for your actual problem, as it really is a terrible design. Assuming you actually "want to delete child row when I delete parent row": unless you alter your data model and define FK constraints, you need a delete trigger on table parent. You CANNOT cascade deletes without an FK, as that is where you tell Postgres to do so. BTW, your table definitions are invalid: NOT NULL is a constraint, not a data type, and you have not specified a data type.
After correcting that, you can build a trigger which deletes the corresponding rows from the appropriate child table, provided your child_type column is understood to actually name the table in which the child resides. That is a very poor design resting on an extremely risky assumption, but:
-- setup
create table parent (
child_type text not null
,child_id integer not null
);
create table child1(id integer not null);
create table child2(id integer not null);
create table child3(id integer not null);
insert into parent(child_type, child_id)
values ('child1',1),('child1',2),('child2',1),('child3',1);
insert into child1(id) values (1),(2);
insert into child2(id) values (1);
insert into child3(id) values (1);
Now create the trigger function, then attach it to the parent table.
The trigger function now builds and dynamically executes the appropriate delete statement. Note I always generate a raise notice to display the actual statement before executing it, and do so here. You may consider it not necessary.
-- build trigger function.
create or replace function parent_adr()
returns trigger
language plpgsql
as $$
declare
base_del_lk constant text = 'delete from %I where id = %s'; -- %I quotes child_type as an identifier
sql_delete_stmt_l text;
begin
sql_delete_stmt_l = format(base_del_lk,old.child_type, old.child_id);
raise notice 'Running statement==%', sql_delete_stmt_l;
EXECUTE sql_delete_stmt_l;
return old;
end;
$$;
-- and define the trigger on the parent table.
create trigger parent_adr_trig
after delete
on parent
for each row
execute procedure parent_adr();
--- test.
delete from parent where child_type = 'child1';

Foreign key to table A or table B

Consider a situation where I define an object, a group of objects, then a table that links them together:
CREATE TABLE obj (
id INTEGER PRIMARY KEY,
name text
) ;
CREATE TABLE group (
id INTEGER PRIMARY KEY,
grpname TEXT
) ;
CREATE TABLE relation (
objid INTEGER,
grpid INTEGER,
PRIMARY KEY (objid, grpid)
) ;
I am looking for cascade delete when applicable so I add the foreign key
ALTER TABLE relation
ADD FOREIGN KEY (objid)
REFERENCES obj(id)
ON DELETE CASCADE ;
ALTER TABLE relation
ADD FOREIGN KEY (grpid)
REFERENCES group(id)
ON DELETE CASCADE ;
So far all is OK. Now suppose I want to add support for groups of groups. I am thinking of changing the relation table like this:
CREATE TABLE relation_ver1 (
parent INTEGER,
child INTEGER,
PRIMARY KEY (parent, child)
) ;
ALTER TABLE relation_ver1
ADD FOREIGN KEY (parent)
REFERENCES group(id)
ON DELETE CASCADE ;
Here I get to the question: I would like to apply cascade delete to child too, but here I do not know whether child refers to a group or an object.
Can I add a foreign key to table obj or group?
The only solution I have found so far is to add child_obj and child_grp fields, add the respective foreign keys, and then, when inserting e.g. an object, use a 'special' (sort of null) group, and do the reverse when inserting a subgroup.
Consider the relation:
relation_ver1(parent, child_obj, child_group)
I claim that this relation has the following disadvantages:
You have to deal with the NULL special case.
Approx. 1/3 of values are NULL. NULL values are bad.
Fortunately, there is an easy way to fix this. Since there is a multi-value dependency in your data, you can decompose your table into 2 smaller tables that are 4NF compliant. For example:
relation_ver_obj(parent, child_obj) and
relation_ver_grp(parent, child_group).
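The decomposition could be sketched as follows. The table is named grp here because GROUP is a reserved word in PostgreSQL; that name, the NOT NULL constraints, and the sample rows are assumptions and not part of the question.

```sql
CREATE TABLE obj (id integer PRIMARY KEY, name text);
CREATE TABLE grp (id integer PRIMARY KEY, grpname text);  -- stands in for "group"

-- Group-to-object membership: both sides cascade.
CREATE TABLE relation_ver_obj (
    parent    integer NOT NULL REFERENCES grp (id) ON DELETE CASCADE,
    child_obj integer NOT NULL REFERENCES obj (id) ON DELETE CASCADE,
    PRIMARY KEY (parent, child_obj)
);

-- Group-to-subgroup membership: both columns reference grp.
CREATE TABLE relation_ver_grp (
    parent      integer NOT NULL REFERENCES grp (id) ON DELETE CASCADE,
    child_group integer NOT NULL REFERENCES grp (id) ON DELETE CASCADE,
    PRIMARY KEY (parent, child_group)
);

INSERT INTO obj (id, name) VALUES (1, 'object one');
INSERT INTO grp (id, grpname) VALUES (1, 'top'), (2, 'sub');
INSERT INTO relation_ver_obj (parent, child_obj) VALUES (1, 1);
INSERT INTO relation_ver_grp (parent, child_group) VALUES (1, 2);

-- Each delete removes only the link rows that mention the deleted key.
DELETE FROM grp WHERE id = 2;
DELETE FROM obj WHERE id = 1;
```

Neither table needs a NULL column, and every foreign key refers to exactly one kind of entity.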
The primary reason why we have foreign keys is not so as to be able to do things like cascaded deletes. The primary reason for the existence of foreign keys is referential integrity.
This means that grpid is declared as REFERENCES group(id) in order to ensure that grpid will never be allowed to take any value which is not found in group(id). So, it is an issue of validity. A cascaded DELETE also boils down to validity: if a key is deleted, then any and all foreign keys referring to that key would be left invalid, so clearly, something must be done about them. Cascaded deletion is one possible solution. Setting the foreign key to NULL, thus voiding the relationship, is another possible solution.
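As an illustration of the second option, here is a small sketch using ON DELETE SET NULL. The tables team and member are hypothetical and unrelated to the schema in the question.

```sql
-- Hypothetical tables showing ON DELETE SET NULL instead of CASCADE.
CREATE TABLE team (
    id integer PRIMARY KEY
);

CREATE TABLE member (
    id      integer PRIMARY KEY,
    team_id integer REFERENCES team (id) ON DELETE SET NULL
);

INSERT INTO team (id) VALUES (1);
INSERT INTO member (id, team_id) VALUES (100, 1);

-- The member row survives; its team_id is set to NULL.
DELETE FROM team WHERE id = 1;
```

This only works when the referencing column is nullable.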
Your notion of having a child id refer to either a group or an object violates any notion of referential integrity. Relational Database theory has no use and no provision for polymorphism. A key must refer to one and only one kind of entity. If not, then you start running into problems like the one you have just discovered, but even worse, you cannot have any referential integrity guarantees in your database. That's not a nice situation to be in.
The way to handle the need for relationships to different kinds of entities is to use a set of foreign keys, one for each possible related entity, out of which only one may be non-NULL. So, here is what it would look like:
CREATE TABLE tree_relation (
parent_id INTEGER NOT NULL,
child_object_id INTEGER,
child_group_id INTEGER,
-- UNIQUE rather than PRIMARY KEY: primary key columns cannot be NULL
UNIQUE (parent_id, child_object_id),
UNIQUE (parent_id, child_group_id) );
ALTER TABLE tree_relation
ADD FOREIGN KEY (parent_id) REFERENCES group(id) ON DELETE CASCADE;
ALTER TABLE tree_relation
ADD FOREIGN KEY (child_object_id) REFERENCES obj(id) ON DELETE CASCADE;
ALTER TABLE tree_relation
ADD FOREIGN KEY (child_group_id) REFERENCES group(id) ON DELETE CASCADE;
All you need to do is ensure that only one of child_object_id, child_group_id is non-NULL.
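One way to enforce that rule in the database itself is a CHECK constraint. The sketch below is an assumption about how it could be done; the table names grp (standing in for the reserved word group) and tree_relation_checked are hypothetical.

```sql
CREATE TABLE obj (id integer PRIMARY KEY);
CREATE TABLE grp (id integer PRIMARY KEY);  -- stands in for "group"

CREATE TABLE tree_relation_checked (
    parent_id       integer NOT NULL REFERENCES grp (id) ON DELETE CASCADE,
    child_object_id integer REFERENCES obj (id) ON DELETE CASCADE,
    child_group_id  integer REFERENCES grp (id) ON DELETE CASCADE,
    -- True only when exactly one of the two child columns is NULL.
    CHECK ((child_object_id IS NULL) <> (child_group_id IS NULL))
);

INSERT INTO obj (id) VALUES (1);
INSERT INTO grp (id) VALUES (1), (2);

-- Both rows satisfy the check: one child column set, the other NULL.
INSERT INTO tree_relation_checked (parent_id, child_object_id, child_group_id)
VALUES (1, 1, NULL),
       (1, NULL, 2);
```

A row with both child columns set, or both NULL, is rejected with a check violation.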

Very slow SQL DELETE query on table with foreign key constraint

I have got some trouble with a SQL DELETE query.
I work on a database (postgres 9.3) with 2 tables (Parent and Child).
The child has a relation to the parent with a foreign key.
Parent Table
CREATE TABLE parent
(
id bigint NOT NULL,
...
CONSTRAINT parent_pkey PRIMARY KEY (id)
)
Child Table
CREATE TABLE child
(
id bigint NOT NULL,
parent_id bigint,
...
CONSTRAINT child_pkey PRIMARY KEY (id),
CONSTRAINT fk_adc9xan172ilseglcmi1hi0co FOREIGN KEY (parent_id)
REFERENCES parent (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
I inserted in both tables 200'000 entries without any relation ( Child.parent_id = NULL).
But a DELETE query like the one below takes more than 20 minutes,
even without a WHERE condition.
DELETE FROM Parent;
Without the foreign key constraint, the same delete finishes in 400 ms.
What did I miss?
A workable solution is the example below, but I don't know if it is a good idea. Maybe someone could suggest a better way to do this.
BEGIN WORK;
ALTER TABLE Parent DISABLE TRIGGER ALL;
DELETE FROM Parent;
ALTER TABLE Parent ENABLE TRIGGER ALL;
COMMIT WORK;
When you delete from Parent, the Child table needs to be queried by parent_id to ensure that no child row refers to the parent row you are about to delete.
To ensure that the child lookup runs quickly, you need to have an index on your parent_id column in the Child table.
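PostgreSQL does not create an index on the referencing column automatically, so it has to be added by hand. A minimal sketch, using cut-down versions of the two tables (the elided columns are left out) and a hypothetical index name:

```sql
-- Reduced versions of the tables from the question.
CREATE TABLE parent (
    id bigint PRIMARY KEY
);

CREATE TABLE child (
    id        bigint PRIMARY KEY,
    parent_id bigint REFERENCES parent (id)
);

-- Lets the foreign key check find referencing rows without scanning child.
CREATE INDEX child_parent_id_idx ON child (parent_id);
```

On an existing table only the CREATE INDEX statement is needed.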

PostgreSQL delete fails with ON DELETE rule on inherited table

In my PostgreSQL 9.1 database I've defined RULEs that delete rows from child tables whenever a parent table row is deleted. This all worked OK, until I introduced inheritance. If the parent (referencing) table INHERITS from another table and I delete from the base table then the DELETE succeeds, but the RULE doesn't appear to fire at all - the referenced row is not deleted. If I try to delete from the derived table I get an error:
update or delete on table "referenced" violates foreign key constraint "fk_derived_referenced" on table "derived"
There is no other row in the parent table that would violate the foreign key: it's being referenced by the row that's being deleted! How do I fix this?
The following script reproduces the problem:
-- Schema
CREATE TABLE base
(
id serial NOT NULL,
name character varying(100),
CONSTRAINT pk_base PRIMARY KEY (id)
);
CREATE TABLE referenced
(
id serial NOT NULL,
value character varying(100),
CONSTRAINT pk_referenced PRIMARY KEY (id)
);
CREATE TABLE derived
(
referenced_id integer,
CONSTRAINT pk_derived PRIMARY KEY (id),
CONSTRAINT fk_derived_referenced FOREIGN KEY (referenced_id) REFERENCES referenced (id)
)
INHERITS (base);
-- The rule
CREATE OR REPLACE RULE rl_derived_delete_referenced
AS ON DELETE TO derived DO ALSO
DELETE FROM referenced r WHERE r.id = old.referenced_id;
-- Some test data
INSERT INTO referenced (id, value)
VALUES (1, 'referenced 1');
INSERT INTO derived (id, name, referenced_id)
VALUES (2, 'derived 2', 1);
-- Delete from base - deletes the "base" and "derived" rows, but not "referenced"
--DELETE FROM base
--WHERE id = 2;
-- Delete from derived - fails with:
-- update or delete on table "referenced" violates foreign key constraint "fk_derived_referenced" on table "derived"
DELETE FROM derived
WHERE id = 2;
As I said in my comment, this seems an unusual way to do things. But you can make it work with a deferred constraint.
CREATE TABLE derived
(
referenced_id integer,
CONSTRAINT pk_derived PRIMARY KEY (id),
CONSTRAINT fk_derived_referenced FOREIGN KEY (referenced_id)
REFERENCES referenced (id) DEFERRABLE INITIALLY DEFERRED
)
INHERITS (base);
The PostgreSQL docs, Rules vs. Triggers, say
Many things that can be done using triggers can also be implemented
using the PostgreSQL rule system. One of the things that cannot be
implemented by rules are some kinds of constraints, especially foreign
keys.
But it's not clear to me that this specific limitation is what you're running into.
Also, you need to check whether other records still reference the to-be-deleted rows. I added a test derived record, #3, which points to the same referenced record, #1.
-- The rule
CREATE OR REPLACE RULE rl_derived_delete_referenced
AS ON DELETE TO tmp.derived DO ALSO (
DELETE FROM tmp.referenced re_del
WHERE re_del.id = OLD.referenced_id
AND NOT EXISTS ( SELECT * FROM tmp.derived other
WHERE other.referenced_id = re_del.id
AND other.id <> OLD.id )
;
);
-- Some test data
INSERT INTO tmp.referenced (id, value)
VALUES (1, 'referenced 1');
-- EXPLAIN ANALYZE
INSERT INTO tmp.derived (id, name, referenced_id)
VALUES (2, 'derived 2', 1);
INSERT INTO tmp.derived (id, name, referenced_id)
VALUES (3, 'derived 3', 1);
-- Delete from base - deletes the "base" and "derived" rows, but not "referenced"
--DELETE FROM base
--WHERE id = 2;
-- Delete from derived - fails with:
-- update or delete on table "referenced" violates foreign key constraint "fk_derived_referenced" on table "derived"
EXPLAIN ANALYZE
DELETE FROM tmp.derived
WHERE id = 2
;
SELECT * FROM tmp.base;
SELECT * FROM tmp.derived;
SELECT * FROM tmp.referenced;