Directionless relationship failing in PostgreSQL - sql

I am trying to create a 2-way relationship table in PostgreSQL for my 3 objects. This idea has stemmed from the following question https://dba.stackexchange.com/questions/48568/how-to-relate-two-rows-in-the-same-table where I also want to store the relationship and its reverse between rows.
For context on my database: Object 1 which contains (aka relates to many) object2s. In turn, these object2s also relate to many object3s. A 1-to-many relationship (object 1 to object 2) and many-to-many relationship (object 2 to object 3)
Each of the objects have been assigned a UUID in other tables which contain info regarding them. Based on their UUID's I want to be able to query them and get the associated objects UUID as well. This in turn will show me the associations and direct me as to which object I should be looking at for location, info, etc just by knowing the UUID.
PLEASE NOTE - THAT ONE BOX MAY HAVE A RELTIONSHIP OF 10 SLOTS. THEREFORE THAT ONE UUID ASSIGNED FOR THE BOX WILL APPEAR IN MY UUID1 COLUMN 10 TIMES!! THIS IS A MUST!
My next step was to try and create a directionless relationship using this query:
CREATE TABLE bridge_x
(uuid1 UUID NOT NULL REFERENCES temp (uuid1), uuid2 UUID NOT NULL REFERENCES temp (uuid2),
PRIMARY KEY(uuid1, uuid2),
CONSTRAINT temp_temp_directionless
FOREIGN KEY (uuid2, uuid1)
REFERENCES bridge_x (uuid1, uuid2)
);
Is there any other way I can store ALL the information mentioned and be able to query the UUID in order to see the relationship between the objects?

You'll need a composite primary key in the bridge table. An example, using polygameous marriages:
CREATE TABLE person
(person_id INTEGER NOT NULL PRIMARY KEY
, name varchar NOT NULL
);
CREATE TABLE marriage
( person1 INTEGER NOT NULL
, person2 INTEGER NOT NULL
, comment varchar
, CONSTRAINT marriage_1 FOREIGN KEY (person1) REFERENCES person(person_id)
, CONSTRAINT marriage_2 FOREIGN KEY (person2) REFERENCES person(person_id)
, CONSTRAINT order_in_court CHECK (person1 < person2)
, CONSTRAINT polygamy_allowed UNIQUE (person1,person2)
);
INSERT INTO person(person_id,name) values (1,'Bob'),(2,'Alice'),(3,'Charles');
INSERT INTO marriage(person1,person2, comment) VALUES(1,2, 'Crypto marriage!') ; -- Ok
INSERT INTO marriage(person1,person2, comment) VALUES(2,1, 'Not twice!' ) ; -- Should fail
INSERT INTO marriage(person1,person2, comment) VALUES(3,3, 'No you dont...' ) ; -- Should fail
INSERT INTO marriage(person1,person2, comment) VALUES(2,3, 'OMG she did it again.' ) ; -- Should fail (does not)
INSERT INTO marriage(person1,person2, comment) VALUES(3,4, 'Non existant persons are not allowed to marry !' ) ; -- Should fail
SELECT p1.name, p2.name, m.comment
FROM marriage m
JOIN person p1 ON m.person1 = p1.person_id
JOIN person p2 ON m.person2 = p2.person_id
;
Result:
CREATE TABLE
CREATE TABLE
INSERT 0 3
INSERT 0 1
ERROR: new row for relation "marriage" violates check constraint "order_in_court"
DETAIL: Failing row contains (2, 1, Not twice!).
ERROR: new row for relation "marriage" violates check constraint "order_in_court"
DETAIL: Failing row contains (3, 3, No you dont...).
INSERT 0 1
ERROR: insert or update on table "marriage" violates foreign key constraint "marriage_2"
DETAIL: Key (person2)=(4) is not present in table "person".
name | name | comment
-------+---------+-----------------------
Bob | Alice | Crypto marriage!
Alice | Charles | OMG she did it again.
(2 rows)

Related

SQL set constraint on how many times a PK can be referenced

I'm building a demo database of zoo for my school project and I've encountered following problem: I have a table Pavilion, which has some primary key id_pavilion and column capacity (this is information about what is the highest number of animals which can live in this pavilion).
Let's say that each pavilion can contain 2 animals at maximum.
Pavilion
id_pavilion capacity
-----------------------
1 2
2 2
3 2
4 2
Animal
id_an-column2-column3 id_pavilion
---------------------------------------
1 2
2 2
3 2
4 2
(This shows what I'm trying to prevent)
Then I have table animal, which contains some information about the animal and mainly the id_pavilion from Pavilion as a foreign key.
My question is: how can I add such a constraint that the PK id_pavilion from Pavilion can be referenced in table Animal only so many times as the capacity allows?
Looking at your example data, one could argue that every PAVILION can accommodate 2 animals, right? One could also say that the "accommodations" need to be in place before the animals can be kept in an appropriate manner. Thus, we could create a table called ACCOMMODATION, listing all available spaces.
create table pavilion( id primary key, capacity )
as
select level, 2 from dual connect by level <= 4 ;
create table accommodation(
id number generated always as identity start with 1000 primary key
, pavilionid number references pavilion( id )
) ;
Generate all accommodations
-- No "human intervention" here.
-- Only the available spaces will be INSERTed.
insert into accommodation ( pavilionid )
select id
from pavilion P1, lateral (
select 1
from dual
connect by level <= ( select capacity from pavilion where id = P1.id )
) ;
-- we can accommodate 8 animals ...
select count(*) from accommodation ;
COUNT(*)
----------
8
-- accommodations and pavilions
SQL> select * from accommodation ;
ID PAVILIONID
---------- ----------
1000 1
1001 1
1002 2
1003 2
1004 3
1005 3
1006 4
1007 4
8 rows selected.
Each animal should be in a single (defined) location. When an animal is "added" to the zoo, it can only (physically) be in a single location/accommodation. We can use a UNIQUE key and a FOREIGN key (referencing ACCOMMODATION) to enforce this.
-- the ANIMAL table will have more columns eg GENUS, SPECIES, NAME etc
create table animal(
id number generated always as identity start with 2000
-- , name varchar2( 64 )
, accommodation number
) ;
alter table animal
add (
constraint animal_pk primary key( id )
, constraint accommodation_unique unique( accommodation )
, constraint accommodation_fk
foreign key( accommodation ) references accommodation( id )
);
Testing
-- INSERTs will also affect the columns GENUS, SPECIES, NAME etc
-- when the final version of the ANIMAL table is in place.
insert into animal( accommodation ) values ( 1001 ) ;
SQL> insert into animal( accommodation ) values ( 1000 ) ;
1 row inserted.
SQL> insert into animal( accommodation ) values ( 1001 ) ;
1 row inserted.
-- trying to INSERT into the same location again
-- MUST fail (due to the unique constraint)
SQL> insert into animal( accommodation ) values ( 1000 );
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 1000 )
Error report -
ORA-00001: unique constraint (...ACCOMMODATION_UNIQUE) violated
SQL> insert into animal( accommodation ) values ( 1001 );
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 1001 )
Error report -
ORA-00001: unique constraint (...ACCOMMODATION_UNIQUE) violated
-- trying to INSERT into a location that does not exist
-- MUST fail (due to the foreign key constraint)
SQL> insert into animal( accommodation ) values ( 9999 ) ;
Error starting at line : 1 in command -
insert into animal( accommodation ) values ( 9999 )
Error report -
ORA-02291: integrity constraint (...ACCOMMODATION_FK) violated - parent key not found
Animals and accommodations
select
A.id as animal
, P.id as pavilion
, AC.id as location --(accommodation)
from pavilion P
join accommodation AC on P.id = AC.pavilionid
join animal A on AC.id = A.accommodation
;
ANIMAL PAVILION LOCATION
---------- ---------- ----------
2000 1 1000
2001 1 1001
DBfiddle here. Tested with Oracle 12c and 18c. (You'll need version 12c+ for LATERAL join to work.)
What you are trying to enforce at the database level is more of a 'business logic' rule rather than a hard data constraint. You can not implement it directly in your table designs; even if you could (as #serg mentions in the comments) it would require a very expensive (in terms of CPU/resources) lock on the table to perform the counting.
Another option, that would achieve your goal and keep the business logic separate from the data design, is to use a SQL Trigger.
A trigger can run before the data is inserted into your table; here you can check how many rows have already been inserted for that 'pavilion entity' and abort or allow the insert.
A comment for the "school project" side of things:
This being said, the sort of logic you are talking about is much better served within your consuming application rather than the database (my opinion, others may disagree). Also perhaps think about defining the size limit in the data, so you can have different sized pavilions.
Notes:
For anyone visiting this question in the future, the above link is for an oracle trigger (as OP has tagged the question for oracle). This link is for Microsoft SQL Server Triggers.
The answer is "not easily". Although the idea of keeping the "accommodations" in the pavilions as a separate table is a clever one, animals are put into pavilions, not accommodations. Modeling accommodations makes it much trickier to move animals around.
Perhaps the simplest approach is to use triggers. This starts with an animal_count column in pavilions. This column starts at zero and is incremented or decremented as animals move in or out. You can use a check constraint to validate that the pavilion is not over-capacity.
Unfortunately, maintaining this column requires triggers on the animals table, one for insert, update, and delete.
In the end, the trigger is maintaining the count and if you attempt to put an animal in a full pavilion, you will violate the check constraint.
You need a column (say "NrOccupants") that is updated when an animal is placed into or removed from each pavilion. Then you add a check constraint to that column that prevents the application code from adding more animals to a pavilion than is permitted by the rule that is enforced by the check constraint.
Here is an example of the SQL DDL that would do that.
CREATE SCHEMA Pavilion
GO
CREATE TABLE Pavilion.Pavilion
(
pavilionNr int NOT NULL,
capacity tinyint CHECK (capacity IN (2)) NOT NULL,
nrOccupants tinyint CHECK (nrOccupants IN (0, 2)) NOT NULL,
CONSTRAINT Pavilion_PK PRIMARY KEY(pavilionNr)
)
GO
CREATE TABLE Pavilion.Animal
(
animalNr int NOT NULL,
name nchar(50) NOT NULL,
pavilionNr int NOT NULL,
type nchar(50) NOT NULL,
weight smallint NOT NULL,
CONSTRAINT Animal_PK PRIMARY KEY(animalNr)
)
GO
ALTER TABLE Pavilion.Animal ADD CONSTRAINT Animal_FK FOREIGN KEY (pavilionNr) REFERENCES Pavilion.Pavilion (pavilionNr) ON DELETE NO ACTION ON UPDATE NO ACTION
GO

Build correct logical db

I use 4 table in database.
Example:
Create table applicant(
Id int not null primary key,
IdName integer
idSkill integer,
idContact integer
Constraint initial foreign key (idName) References Initiale(id)
CONSTRAINT contacT foreign key (idContact) References contact(id)
CONSTRAINT Skills foreign key (idSkill) references skill(id))
Create table Initiale(
Id int not null primary key,
firstname text,
middlename text)
Create table contact(
Id int not null primary key,
phone text,
email text)
Create tabke Skills(
Id int not null primary key,,
Nane text)
I want insert data promptly in 4 table,but i not understand, what get id and insert in applicant.
Create tabke Skills ..... will fail. You should use Create table Skills.
Nor should you have Id int not null primary key,, (two commas), this would fail.
You should very likely be using Id INTEGER PRIMARY KEY not Id int not null primary key.
That is because generally an Id column should be a unique identifier of the row. With SQLite INTEGER PRIMARY KEY has a special meaining whilst INT PRIMARY KEY does not.
That is, if INTEGER PRIMARY KEY is used then the column will be an alias of the rowid column which has to be a unique signed 64 bit integer and importantly if no value is provided when inserting a row then SQLite will assign a unique integer. i.e. the all important Id.
This Id will initially be 1, then likely 2, then likely 3 and so on, although there is no guarantee that the Id will be monotonically increasing.
There are additional errors, mainly comma's omitted. The following should work :-
CREATE TABLE IF NOT EXISTS applicant(
Id INTEGER PRIMARY KEY,
IdName integer,
idSkill integer,
idContact integer,
Constraint initial foreign key (idName) References Initiale(id),
CONSTRAINT contacT foreign key (idContact) References contact(id),
CONSTRAINT Skills foreign key (idSkill) references Skills(id));
Create table IF NOT EXISTS Initiale(
Id INTEGER PRIMARY KEY,
firstname text,
middlename text);
Create table IF NOT EXISTS contact(
Id INTEGER PRIMARY KEY,
phone text,
email text);
Create table IF NOT EXISTS Skills(
Id INTEGER PRIMARY KEY,
Nane text);
You could then insert data along the lines of :-
INSERT INTO Initiale (firstname,middlename) -- Note absence of Id so SQLite will generate
VALUES
('Fred','James'), -- very likely id 1
('Alan','Roy'), -- very likely id 2
('Simon','Gerorge')-- very likely id 3
;
INSERT INTO contact -- alternative way of getting Id generated (specify null for Id)
VALUES
(null,'0123456789','email01#email.com'), -- very likely id 1
(null,'0987654321','email02#email.com'), -- very likely id 2
(null,'3333333333','email03.#email.com') -- very likely id 3
;
INSERT INTO Skills (Nane)
VALUES
('Skill01'),('Skill02'),('Skill03') -- very likely id's 1,2 and 3
;
INSERT INTO applicant (IdName,idSkill,idContact)
VALUES
-- First applicant
(2, -- Alan Roy
3, -- Skill 3
1), -- Contact 0123456789 )
-- Second Applicant
(3, -- Simon George
3, -- Skill 3
2), -- Contact 0987654321 )
-- Third Applicant
(2, -- Alan Roy again????
1, -- Skill 1
3), -- contact 3333333333)
(1,1,1) -- Fred James/ Skill 1/ Contact 0123456789
--- etc
;
Noting that Rows on the Initiale, Contact and Skills table MUST exists before insertions can be made into the Applicant table.
You could then run a query such as :-
SELECT * FROM applicant
JOIN Initiale ON Initiale.Id = idName
JOIN contact ON contact.Id = idContact
JOIN Skills ON Skills.Id = idSkill
This would result in (using the data as inserted above) :-

How to deal with an id that needs to be matched to multiple ids in SQLite?

I'm new to databases, so I'll start by showing what I would do if I was using a simple table in a csv file. Presently, I'm building a Shiny (R) app to keep track of people taking part in studies. The idea is to make sure no one is doing more than one study at the same time, and that enough time has passed between studies.
A single table would look like something like this:
study_title contact_person tasks first_name last_name
MX9345-3 John Doe OGTT Michael Smith
MX9345-3 John Doe PVT Michael Smith
MX9345-3 John Doe OGTT Julia Barnes
MX9345-3 John Doe PVT Julia Barnes
...
So each study has a single contact person, but multiple tasks. It is possible other studies will use the same tasks.
Each task should have a description
Each person can be connected to multiple studies (the final database has timestamps to make sure this does not happen at the same time), and consequently repeat the same tasks.
the SQLite code could look something like this
CREATE TABLE studies (study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (task_name TEXT, description TEXT);
CREATE TABLE participants (first_name TEXT, last_name TEXT);
Now I'm stuck. If I add a primary key and foreign keys (say in studies an ID for each study, and foreign keys for each task and person), the primary keys will repeat, which is not possible. A Study is defined by the tasks it contains (akin to an album and music tracks).
How should I approach this situation in SQLite? And importantly, how are the INSERTs done in these situations? I've seen lots of SELECT examples, but few INSERTs that match all IDs in each table, for example when adding a new person to a running study.
What you do is use tables to map/reference/relate/associate.
The first step would be to utilise alias's of the rowid so instead of :-
CREATE TABLE studies (study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (task_name TEXT, description TEXT);
CREATE TABLE participants (first_name TEXT, last_name TEXT);
you would use :-
CREATE TABLE studies (id INTEGER PRIMARY KEY,study_title TEXT, contact_person TEXT);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, task_name TEXT, description TEXT);
CREATE TABLE participants (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
With SQLite INTEGER PRIMARY KEY (or INTEGER PRIMARY KEY AUTOINCREMENT) makes the column (id in the above although they can have any valid column name) and alias of the rowid (max of 1 per table), which uniquely identifies the rows.
Why not to use AUTOINCREMENT plus more seeSQLite Autoincrement
Insert some data for demonstration :-
INSERT INTO studies (study_title, contact_person)
VALUES ('Maths','Mr Smith'),('English','Mrs Taylor'),('Geography','Mary White'),('Phsyics','Mr Smith');
INSERT INTO tasks (task_name,description)
VALUES ('Task1','Do task 1'),('Task2','Do task 2'),('Task3','Do task 3'),('Task4','Do task 4'),('Mark','Mark the sudies');
INSERT INTO participants (first_name,last_name)
VALUES ('Tom','Jones'),('Susan','Smythe'),('Sarah','Toms'),('Alan','Francis'),('Julian','MacDonald'),('Fred','Bloggs'),('Rory','Belcher');
First mapping/reference... Table :-
CREATE TABLE IF NOT EXISTS study_task_relationship (study_reference INTEGER, task_reference INTEGER, PRIMARY KEY (study_reference,task_reference));
Map/relate Study's with Tasks (many-many possible)
Do some mapping (INSERT some data) :-
INSERT INTO study_task_relationship
VALUES
(1,2), -- Maths Has Task2
(1,5), -- Maths has Mark Questions
(2,1), -- English has Task1
(2,4), -- English has Task4
(2,5), -- English has Mark questions
(3,3), -- Geography has Task3
(3,1), -- Geoegrapyh has Task1
(3,2), -- Geography has Task2
(3,5), -- Geography has Mark Questions
(4,4) -- Physics has Task4
;
- See comments on each line
List the Studies along with the tasks
SELECT study_title, task_name -- (just want the Study title and task name)
FROM study_task_relationship -- use the mapping table as the main table
JOIN studies ON study_reference = studies.id -- get the related studies
JOIN tasks ON task_reference = tasks.id -- get the related tasks
ORDER BY study_title -- Order by Study title
results in :-
List each study with all it's tasks
SELECT study_title, group_concat(task_name,'...') AS tasklist
FROM study_task_relationship
JOIN studies ON study_reference = studies.id
JOIN tasks ON task_reference = tasks.id
GROUP BY studies.id
ORDER by study_title;
results in :-
Add study-participants associative table and populate it :-
CREATE TABLE IF NOT EXISTS study_participants_relationship (study_reference INTEGER, participant_reference INTEGER, PRIMARY KEY (study_reference,participant_reference));
INSERT INTO study_participants_relationship
VALUES
(1,1), -- Maths has Tom Jones
(1,5), -- Maths has Julian MacDonald
(1,6), -- Maths has Fred Bloggs
(2,4), -- English has Alan Francis
(2,7), -- English has Rory Belcher
(3,3), -- Geography has Sarah Toms
(3,2) -- Susan Smythe
;
You can now, as an example, get a list of participants the tasks along with the study title :-
SELECT study_title, task_name, participants.first_name ||' '||participants.last_name AS fullname
FROM study_task_relationship
JOIN tasks ON study_task_relationship.task_reference = tasks.id
JOIN studies On study_task_relationship.study_reference = studies.id
JOIN study_participants_relationship ON study_task_relationship.study_reference = study_participants_relationship.study_reference
JOIN participants ON study_participants_relationship.participant_reference = participants.id
ORDER BY fullname, study_title
which would result in :-
FOREIGN KEYS
As you can see there is no actual need for defining FOREIGN KEYS. They are really just an aid to stop you inadvertently doing something like :-
INSERT INTO study_participants_relationship VALUES(30,25);
No such study nor no such participant
To utilise FOREIGN KEYS you have to ensure that they are enabled, the simplest is just to issue the command to turn them on (as if it were a normal SQL statment).
PRAGMA foreign_keys=1
A FOREIGN KEY is a constraint, it stops you INSERTING, UPDATING or DELETING a row that would violate the constraint/rule.
Basically the rule is that the column to which the FOREIGN key is defined (the child) must have a value that is in the referenced table/column the parent.
So assumning that FOREIGN KEYS are turned on then coding :-
CREATE TABLE IF NOT EXISTS study_participants_relationship
(
study_reference INTEGER REFERENCES studies(id), -- column foreign key constraint
participant_reference INTEGER,
FOREIGN KEY (participant_reference) REFERENCES participants(id) -- table foreign key constraint
PRIMARY KEY (study_reference,participant_reference
)
);
Would result in INSERT INTO study_participants_relationship VALUES(30,25); failing e.g.
FOREIGN KEY constraint failed: INSERT INTO study_participants_relationship VALUES(30,25);
It fails as there is no row in studies with an id who's value is 30 (the first column foreign key constraint). If the value 30 did exist then the second constraint would kick in as there is no row in participants with an id of 25.
There is no difference between a column Foreign key constraint and a table Foreign key constraint other than where and how they are coded.
However, the above wouldn't stop you deleting all rows from the study_participants_relationship table as it would stop you deleting a row from the studies or participants table if they were referenced by the study_participants_relationship table.
"deal with an id that needs to be matched to multiple ids in SQLite?"
For many-to-many couplings, make extra coupling tables, like the study_task and participent_task tables below. This is many-to-many since a task can be on many studies and a study can have many tasks.
"make sure no one is doing more than one study at the same time"
That could be handled by letting each participant only have a column for current study (no place for more than one then).
PRAGMA foreign_keys = ON;
CREATE TABLE study (id INTEGER PRIMARY KEY, study_title TEXT, contact_person TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, task_name TEXT, description TEXT);
CREATE TABLE participant (
id INTEGER PRIMARY KEY,
first_name TEXT,
last_name TEXT,
id_current_study INTEGER references study(id),
started_current_study DATE
);
CREATE TABLE study_task (
id_study INTEGER NOT NULL references study(id),
id_task INTEGER NOT NULL references task(id),
primary key (id_study,id_task)
);
CREATE TABLE participant_task (
id_participant INTEGER NOT NULL references participant(id),
id_task INTEGER NOT NULL references task(id),
status TEXT check (status in ('STARTED', 'DELIVERED', 'PASSED', 'FAILED')),
primary key (id_participant,id_task)
);
insert into study values (1, 'MX9345-3', 'John Doe');
insert into study values (2, 'MX9300-2', 'Jane Doe');
insert into participant values (1001, 'Michael', 'Smith', 1,'2018-04-21');
insert into participant values (1002, 'Julia', 'Barnes', 1, '2018-04-10');
insert into task values (51, 'OGTT', 'Make a ...');
insert into task values (52, 'PVT', 'Inspect the ...');
insert into study_task values (1,51);
insert into study_task values (1,52);
insert into study_task values (2,51);
--insert into study_task values (2,66); --would fail since 66 doesnt exists (controlled and enforced by foreign key)
The PRAGMA on the first line is needed to make SQLite (above version 3.6 from 2009 I think) enforce foreign keys, without it it just accepts the foreign key syntax, but no controlling is done.

Key is not present in table

table description
# \d invites;
Table "public.invites"
Column | Type | Modifiers
-----------------------+-----------------------------+---------------------
id | integer | not null default
email | character varying |
key | character varying |
sender_user_id | integer | not null
receiver_user_id | integer |
Indexes:
"invites_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"fk_invites_receiver_user_id"
FOREIGN KEY (receiver_user_id) REFERENCES users(id)
"fk_invites_sender_user_id"
FOREIGN KEY (sender_user_id) REFERENCES users(id)
you can see Foreighn Key "fk_invites_receiver_user_id" FOREIGN KEY (receiver_user_id) REFERENCES users(id).
But the records for user with pk is absent in the parent table, where in the reference table fk is exists.
# select id from users where id = 958;
id
----
(0 rows)
select count(*) from invites where receiver_user_id = 958;
count
-------
1
(1 row)
the question is how it can be, simple way to fix the conflicts is delete wrong records, but want to exclude such situation in the future, and when i try to restore the data there is an error:
DETAIL: Key (receiver_user_id)=(958) is not present in table "users".
Command was: ALTER TABLE ONLY invites
ADD CONSTRAINT fk_invites_receiver_user_id
FOREIGN KEY (receiver_user_id) REFERENCES users(id);
P.S.
database=# select version();
version
------------------------------------------------------------------------
PostgreSQL 9.4.14 on x86_64-unknown-linux-gnu,
compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit
Create fail
-- tmp schema
-- \i tmp.sql
-- tables
CREATE TABLE users
(id serial NOT NULL PRIMARY KEY
, name text
);
CREATE TABLE refs
( user_from INTEGER NOT NULL
, user_to INTEGER NOT NULL
, msg text
);
-- data
INSERT INTO users(name) VALUES ('Alice'), ('Bob');
INSERT INTO refs VALUES (1,2), (1,3);
-- FK constraints
ALTER TABLE refs
ADD CONSTRAINT bad_user_from
FOREIGN KEY (user_from) references users(id);
ALTER TABLE refs
ADD CONSTRAINT bad_user_to
FOREIGN KEY (user_to) references users(id)
NOT VALID ; -- <<--HERE
-- This should fail; user=3 does not exist
INSERT INTO refs VALUES (3,2), (2,3);
Repair
-- "repair" the broken refs (by introducing a dummy row)
INSERT INTO users(id,name) VALUES (0, 'NoName');
-- Make the bad FKs point to the dummy row
UPDATE refs r
SET user_to =0
WHERE NOT EXISTS(
SELECT *
FROM users u
WHERE u.id= r.user_to
);
UPDATE refs r
SET user_from =0
WHERE NOT EXISTS(
SELECT *
FROM users u
WHERE u.id= r.user_from
);
SELECT * FROM refs r
JOIN users u0 ON u0.id= r.user_from
JOIN users u1 ON u1.id= r.user_to
;
INSERT INTO users(name) VALUES ('NewName');
-- see what we'v got
SELECT * FROM users;
SELECT* FROM refs r
JOIN users u0 ON u0.id= r.user_from
JOIN users u1 ON u1.id= r.user_to;
Validate constraint
\d refs
\d users
-- enforce the constraint
ALTER TABLE refs
VALIDATE CONSTRAINT bad_user_to; -- <<-- HERE
-- check if valid
\d refs
\d users
-- This should not fail; user=3 does exist now
INSERT INTO refs VALUES (3,2), (2,3);
The answer is unfortunately “data corruption”.
You should delete the offending row, dump the database with pg_dumpall and restore it into a fresh cluster created with initdb.
You should also try to find out what caused the data corruption. Did you have any crashes recently? Could it be that you have unreliable storage or hardware problems? Anything weird in the database log?
To exclude such problems as much as possible, always use the latest fixpack for your PostgreSQL version and use reliable hardware.

Restrict foreign key relationship to rows of related subtypes

Overview: I am trying to represent several types of entities in a database, which have a number of basic fields in common, and then each has some additional fields that are not shared with the other types of entities. Workflow would frequently involve listing the entities together, so I have decided to have a table with their common fields, and then each entity will have its own table with its additional fields.
To implement: There is a common field, “status”, which all entities have; however, some entities will only support a subset of all possible statuses. I also want each type of entity to enforce the use of its subset of statuses. Finally, I will also want to include this field when listing the entities together, so excluding it from the set of common fields seems incorrect, as this would require a union of the specific type tables and the lack of “implements interface” in SQL means that inclusion of that field would be by-convention.
Why I’m here: Below is an solution that is functional, but I am interested if there is a better or more common way to solve the problem. In particular, the fact that this solution requires me to make a redundant unique constraint and a redundant status field feels inelegant.
create schema test;
create table test.statuses(
id integer primary key
);
create table test.entities(
id integer primary key,
status integer,
unique(id, status),
foreign key (status) references test.statuses(id)
);
create table test.statuses_subset1(
id integer primary key,
foreign key (id) references test.statuses(id)
);
create table test.entites_subtype(
id integer primary key,
status integer,
foreign key (id) references test.entities(id),
foreign key (status) references test.statuses_subset1(id),
foreign key (id, status) references test.entities(id, status) initially deferred
);
Some data:
insert into test.statuses(id) values
(1),
(2),
(3);
insert into test.entities(id, status) values
(11, 1),
(13, 3);
insert into test.statuses_subset1(id) values
(1), (2);
insert into test.entites_subtype(id, status) values
(11, 1);
-- Test updating subtype first
update test.entites_subtype
set status = 2
where id = 11;
update test.entities
set status = 2
where id = 11;
-- Test updating base type first
update test.entities
set status = 1
where id = 11;
update test.entites_subtype
set status = 1
where id = 11;
/* -- This will fail
insert into test.entites_subtype(id, status) values
(12, 3);
*/
Simplify building on MATCH SIMPLE behavior of fk constraints
If at least one column of multicolumn foreign constraint with default MATCH SIMPLE behaviour is NULL, the constraint is not enforced. You can build on that to largely simplify your design.
CREATE SCHEMA test;
CREATE TABLE test.status(
status_id integer PRIMARY KEY
,sub bool NOT NULL DEFAULT FALSE -- TRUE .. *can* be sub-status
,UNIQUE (sub, status_id)
);
CREATE TABLE test.entity(
entity_id integer PRIMARY KEY
,status_id integer REFERENCES test.status -- can reference all statuses
,sub bool -- see examples below
,additional_col1 text -- should be NULL for main entities
,additional_col2 text -- should be NULL for main entities
,FOREIGN KEY (sub, status_id) REFERENCES test.status(sub, status_id)
MATCH SIMPLE ON UPDATE CASCADE -- optionally enforce sub-status
);
It is very cheap to store some additional NULL columns (for main entities):
How much disk-space is needed to store a NULL value using postgresql DB?
BTW, per documentation:
If the refcolumn list is omitted, the primary key of the reftable is used.
Demo-data:
INSERT INTO test.status VALUES
(1, TRUE)
, (2, TRUE)
, (3, FALSE); -- not valid for sub-entities
INSERT INTO test.entity(entity_id, status_id, sub) VALUES
(11, 1, TRUE) -- sub-entity (can be main, UPDATES to status.sub cascaded)
, (13, 3, FALSE) -- entity (cannot be sub, UPDATES to status.sub cascaded)
, (14, 2, NULL) -- entity (can be sub, UPDATES to status.sub NOT cascaded)
, (15, 3, NULL) -- entity (cannot be sub, UPDATES to status.sub NOT cascaded)
SQL Fiddle (including your tests).
Alternative with single FK
Another option would be to enter all combinations of (status_id, sub) into the status table (there can only be 2 per status_id) and only have a single fk constraint:
CREATE TABLE test.status(
status_id integer
,sub bool DEFAULT FALSE
,PRIMARY KEY (status_id, sub)
);
CREATE TABLE test.entity(
entity_id integer PRIMARY KEY
,status_id integer NOT NULL -- cannot be NULL in this case
,sub bool NOT NULL -- cannot be NULL in this case
,additional_col1 text
,additional_col2 text
,FOREIGN KEY (status_id, sub) REFERENCES test.status
MATCH SIMPLE ON UPDATE CASCADE -- optionally enforce sub-status
);
INSERT INTO test.status VALUES
(1, TRUE) -- can be sub ...
(1, FALSE) -- ... and main
, (2, TRUE)
, (2, FALSE)
, (3, FALSE); -- only main
Etc.
Related answers:
MATCH FULL vs MATCH SIMPLE
Two-column foreign key constraint only when third column is NOT NULL
Uniqueness validation in database when validation has a condition on another table
Keep all tables
If you need all four tables for some reason not in the question consider this detailed solution to a very similar question on dba.SE:
Enforcing constraints “two tables away”
Inheritance
... might be another option for what you describe. If you can live with some major limitations. Related answer:
Create a table of two types in PostgreSQL