Constraint on a group of rows - sql

For a simple example, let's say I have a list table and a list_entry table:
CREATE TABLE list
(
id SERIAL PRIMARY KEY
);
CREATE TABLE list_entry
(
id SERIAL PRIMARY KEY,
list_id INTEGER NOT NULL
REFERENCES list(id)
ON DELETE CASCADE,
position INTEGER NOT NULL,
value TEXT NOT NULL,
CONSTRAINT list_entry__position_in_list_unique
UNIQUE(list_id, position)
);
I now want to add the following constraint: all list entries with the same list_id have position entries that form a contiguous sequence starting at 1.
And I have no idea how.
I first thought about EXCLUDE constraints, but that seems to lead nowhere.
I could of course create a trigger, but I'd prefer not to if at all possible.

You can't do that with a constraint - you would need to implement the logic in code (e.g. using triggers, stored procedures, application code, etc.)
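As a rough illustration of the logic such a trigger or application check would need: given the UNIQUE(list_id, position) constraint from the question, a list's positions are contiguous from 1 exactly when the smallest position is 1 and the largest equals the row count. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, purely so the example is self-contained. The function names find_broken_lists and demo are hypothetical.

```python
import sqlite3


def find_broken_lists(conn):
    """Return the list_ids whose positions are not exactly 1..n."""
    rows = conn.execute(
        """
        SELECT list_id
        FROM list_entry
        GROUP BY list_id
        HAVING MIN(position) <> 1 OR MAX(position) <> COUNT(*)
        ORDER BY list_id
        """
    ).fetchall()
    return [r[0] for r in rows]


def demo():
    # Assumption: an in-memory SQLite table mirroring the question's schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE list_entry (
            id INTEGER PRIMARY KEY,
            list_id INTEGER NOT NULL,
            position INTEGER NOT NULL,
            value TEXT NOT NULL,
            UNIQUE (list_id, position)
        )
        """
    )
    conn.executemany(
        "INSERT INTO list_entry (list_id, position, value) VALUES (?, ?, ?)",
        [
            (1, 1, "a"), (1, 2, "b"), (1, 3, "c"),  # contiguous from 1
            (2, 1, "a"), (2, 3, "c"),               # gap at position 2
            (3, 2, "b"), (3, 3, "c"),               # does not start at 1
        ],
    )
    return find_broken_lists(conn)
```

In PostgreSQL, the same query could probably be run from a deferred constraint trigger, so that intermediate states inside a transaction are allowed and the check only happens at commit.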

I'm not aware of a way to do this with constraints. Normally a trigger would be the most straightforward choice, but if you want to avoid one, you can compute the next position number for the given list_id at insert time. For example, to insert a list_entry with list_id = 1:
INSERT INTO list_entry (list_id,position,value) VALUES
(1,(SELECT coalesce(max(position),0)+1 FROM list_entry WHERE list_id = 1),42);

You can use a generated column to reference the previous number in the list, essentially building a linked list. This works in Postgres:
create table list_entry
(
pos integer not null primary key,
val text not null,
prev_pos integer not null
references list_entry (pos)
generated always as (greatest(0, pos-1)) stored
);
In this implementation, the first item (pos=0) points to itself.

Related

a trigger to check a value

I have two tables :
create table building(
id integer,
name varchar(15),
rooms_num integer,
primary key(id)
);
create table room(
id integer,
building_id integer,
primary key(id),
foreign key(building_id) references building(id)
);
As you can see, there is a rooms_num field in the building table, which holds the number of rooms in that building, and a building_id in the room table, which identifies the room's building.
All I want is that, when I insert a row into the room table, the database checks whether the room number is within bounds.
Wouldn't it be better to code this with a trigger?
I have tried this, but I don't know what to put in the condition part:
CREATE TRIGGER onRoom
ON room
BEFORE INSERT
????
First you should tighten up your data model. Everything that should be present should be marked NOT NULL such as the building's name. I tend to like making sure required text fields have actual values in them, so I put a check constraint in that matches at least one "word" character (alphanumeric).
You should never allow negative room numbers, right? In addition, you should avoid using real-world information like a room number or building number as a primary key. It should be considered a candidate key only. Best practice in my opinion would be to use UUIDs for primary keys, but some people love their auto-incrementing integers, so I won't push here. The point is that rooms tend to (for example) get drywall put up separating them or taken down to make bigger spaces. Switching around primary key IDs can have unexpected results. It is better to separate how a room is identified within the database from the data itself, so that you can add "Room 6A".
It should also be a safe bet that you won't have more than 32,767 rooms per building, so int2 (16-bit, 2-byte integer) would work here.
CREATE TABLE building (
id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name varchar(15) NOT NULL UNIQUE CHECK (name ~ '\w'),
rooms_num int2 NOT NULL CHECK (rooms_num >= 0)
);
CREATE TABLE room (
id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
-- Should there be a room name too?
room_id int2 NOT NULL CHECK (room_id > 0),
building_id integer NOT NULL REFERENCES building (id),
UNIQUE (room_id, building_id)
);
Okay, now you have a more solid foundation to work from. Let's make the trigger.
CREATE FUNCTION room_in_building ()
RETURNS TRIGGER LANGUAGE plpgsql STRICT STABLE AS $$
BEGIN
IF (
-- We can safely just check upper bounds because the check constraint on
-- the table prevents zero or negative values.
SELECT b.rooms_num >= NEW.room_id
FROM building AS b
WHERE b.id = NEW.building_id
) THEN
RETURN NEW; -- Everything looks good
ELSE
RAISE EXCEPTION 'Room number (%) out of bounds', NEW.room_id;
END IF;
END;
$$;
CREATE TRIGGER valid_room_number
BEFORE INSERT OR UPDATE -- Updates shouldn't break the data model either
ON room
FOR EACH ROW EXECUTE PROCEDURE room_in_building();
You should also add an update trigger for the building table. If someone were to update the rooms_num column to a smaller number, you could end up with an inconsistent data model.
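Here is a minimal sketch of what that check on the building table could look like. It uses Python's built-in sqlite3 module and SQLite trigger syntax as a stand-in, so it is not the PL/pgSQL you would write in PostgreSQL, only an illustration of the condition to test. The trigger name building_rooms_num_check and the helper functions are hypothetical.

```python
import sqlite3


def make_db():
    # Assumption: simplified SQLite versions of the building/room tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE building (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            rooms_num INTEGER NOT NULL CHECK (rooms_num >= 0)
        );
        CREATE TABLE room (
            id INTEGER PRIMARY KEY,
            room_id INTEGER NOT NULL CHECK (room_id > 0),
            building_id INTEGER NOT NULL REFERENCES building (id),
            UNIQUE (room_id, building_id)
        );
        -- Reject shrinking rooms_num below an existing room number.
        CREATE TRIGGER building_rooms_num_check
        BEFORE UPDATE OF rooms_num ON building
        FOR EACH ROW
        WHEN EXISTS (
            SELECT 1 FROM room
            WHERE building_id = NEW.id AND room_id > NEW.rooms_num
        )
        BEGIN
            SELECT RAISE(ABORT, 'rooms_num is smaller than an existing room number');
        END;
        """
    )
    conn.execute("INSERT INTO building (id, name, rooms_num) VALUES (1, 'Main', 10)")
    conn.execute("INSERT INTO room (room_id, building_id) VALUES (7, 1)")
    return conn


def try_shrink(conn, new_rooms_num):
    """Return True if the update is accepted, False if the trigger rejects it."""
    try:
        conn.execute(
            "UPDATE building SET rooms_num = ? WHERE id = 1", (new_rooms_num,)
        )
        return True
    except sqlite3.DatabaseError:
        return False
```

With room 7 present, try_shrink(conn, 8) is accepted while try_shrink(conn, 5) is rejected and leaves rooms_num unchanged.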

How to create an array to store primary keys?

I have 3 tables (Users, Links and LinkLists) and I want to store user ids and their link ids in an array. However, when I try the code below I get following error:
incompatible types: integer[] and integer
Is there any way that I can store user_id and an array of link ids in a table?
CREATE TABLE Users (
id serial primary key not null,
email varchar(64) unique not null
);
CREATE TABLE Links (
id serial primary key not null,
name varchar(64) not null
);
CREATE TABLE LinkLists(
user_id integer unique not null REFERENCES Users(id),
links integer [] REFERENCES Links(id) -- problem here --
);
Example:
Users Table
1 example@gmail.com
Links Table
1 google.com
2 twitter.com
LinkLists Table
1 [1,2]
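You probably do not need an array column in the LinkLists table. A foreign key has to reference a single column value, so it cannot be declared on an integer[] column.
Instead, store one row per (user, link) pair with two foreign keys. Note that user_id must not be unique on its own, otherwise each user could have only one link:
CREATE TABLE LinkLists(
user_id integer not null REFERENCES Users(id),
links integer not null REFERENCES Links(id),
PRIMARY KEY (user_id, links)
);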
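To illustrate the one-row-per-link layout, here is a small sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name link_lists, the column name link_id, the sample e-mail address and the helper functions are all assumptions made for the example.

```python
import sqlite3


def build():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
        CREATE TABLE links (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE link_lists (
            user_id INTEGER NOT NULL REFERENCES users (id),
            link_id INTEGER NOT NULL REFERENCES links (id),
            PRIMARY KEY (user_id, link_id)
        );
        INSERT INTO users VALUES (1, 'user@example.com');
        INSERT INTO links VALUES (1, 'google.com'), (2, 'twitter.com');
        INSERT INTO link_lists VALUES (1, 1), (1, 2);
        """
    )
    return conn


def links_for(conn, user_id):
    """Collect a user's link ids into a list (the 'array' from the question)."""
    rows = conn.execute(
        "SELECT link_id FROM link_lists WHERE user_id = ? ORDER BY link_id",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]


def add_link(conn, user_id, link_id):
    """Return True if the row is accepted, False on an FK or PK violation."""
    try:
        conn.execute("INSERT INTO link_lists VALUES (?, ?)", (user_id, link_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

If you still want the array shape when reading, PostgreSQL's array_agg(link_id) with GROUP BY user_id should give you one array per user from the same rows.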
In my opinion, you do not need the third table, LinkLists.
Define a user_id foreign key on the Links table that references Users.
Or, perhaps the best approach, use a tree-structured table with a self-referencing foreign key.
Actually, you can do this!
Just add a new array-typed column to the Users table, and define an AFTER INSERT trigger function for the Users table.
Then update the array column using a CTE.
I am using that approach successfully.

Postgres create table error

I am trying to create my very first table in postgres, but when I execute this SQL:
create table public.automated_group_msg (
automated_group_msg_idx integer NOT NULL DEFAULT nextval ('automated_group_msg_idx'::regclass),
group_idx integer NOT NULL,
template_idx integer NOT NULL,
CONSTRAINT automated_group_msg_pkey PRIMARY KEY (automated_group_msg_idx),
CONSTRAINT automated_group_msg_group_idx_fkey FOREIGN KEY (group_idx)
REFERENCES public.groups (group_idx) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT automated_msg_template_idx_fkey FOREIGN KEY (template_idx)
REFERENCES public.template (template_idx) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
)
WITH (
OIDS = FALSE
);
I get the following error:
ERROR: relation "automated_group_msg_idx" does not exist
Your error is (likely) because the sequence you're trying to use doesn't exist yet.
But you can create a sequence on the fly using this syntax:
create table public.automated_group_msg (
id serial primary key,
... -- other columns
)
Not directly related to your question, but naming columns with the table name in the name of the column is generally speaking an anti-pattern, especially for primary keys for which id is the industry standard. It also allows for app code refactoring using abstract classes whose id column is always id. It's crystal clear what automated_group_msg.id means and also crystal clear that automated_group_msg.automated_group_msg_id is a train wreck and contains redundant information. Attribute column names like customer.birth_date should also not be over-decorated as customer.customer_birth_date for the same reasons.
You just need to create the sequence before creating the table:
CREATE SEQUENCE automated_group_msg_idx;

Enforcing existence of many-to-many and one-to-one information at the same time (PostgreSQL)

The relation "astronomers discover stars" is m:n, as an astronomer can discover many stars, and a star can likewise be discovered by many astronomers.
In any case however, there will be a single date of discovery for every star. If there are many astronomers working on it, they should do it simultaneously, otherwise, only the first one gets the credits of the discovery.
So, in PostgreSQL, we have:
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE,
discovery_date date
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now the trick question: I want to enforce that whenever there is a discovery date for a star, there will be at least one entry in the discovered table, and vice versa.
I could create a unique key (star_id, discovery_date) in table star, and then use that combination as a foreign key in table discovered instead of the star_id. That would solve half of the problem, but it would still be possible to have a discovery_date without any astronomers for it.
I can't see any simple solution other than using a trigger to check, or only allowing data enter via a stored procedure that will at the same time insert a discovery_date and entries into the table discovered.
Any ideas?
Thank you!
I would just move the discovery_date column to the discovered table:
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
discovery_date date not null,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now you have the problem of multiple dates for the same star but as you say in the question only the first one(s) will get the credit.

Correct way to create a table that references variables from another table

I have these relationships:
User(uid:integer,uname:varchar), key is uid
Recipe(rid:integer,content:text), key is rid
Rating(rid:integer, uid:integer, rating:integer) , key is (uid,rid).
I built the table in the following way:
CREATE TABLE User(
uid INTEGER PRIMARY KEY ,
uname VARCHAR NOT NULL
);
CREATE TABLE Recipes(
rid INTEGER PRIMARY KEY,
content VARCHAR NOT NULL
);
Now for the Rating table: I want it to be impossible to insert a uid\rid that does not exist in User\Recipe.
My question is: which of the following is the correct way to do it? Or please suggest the correct way if none of them are correct. Moreover, I would really appreciate if someone could explain to me what is the difference between the two.
First:
CREATE TABLE Rating(
rid INTEGER,
uid INTEGER,
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid),
FOREIGN KEY (rid) REFERENCES Recipes,
FOREIGN KEY (uid) REFERENCES User
);
Second:
CREATE TABLE Rating(
rid INTEGER REFERENCES Recipes,
uid INTEGER REFERENCES User,
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid)
);
EDIT:
I think User is problematic as a name for a table so ignore the name.
Technically both versions are the same in Postgres. The docs for CREATE TABLE say so quite clearly:
There are two ways to define constraints: table constraints and column constraints. A column constraint is defined as part of a column definition. A table constraint definition is not tied to a particular column, and it can encompass more than one column. Every column constraint can also be written as a table constraint; a column constraint is only a notational convenience for use when the constraint only affects one column.
So when you have to reference a compound key a table constraint is the only way to go.
But for every other case I prefer the shortest and most concise form where I don't need to give names to stuff I'm not really interested in. So my version would be like this:
CREATE TABLE usr(
uid SERIAL PRIMARY KEY ,
uname TEXT NOT NULL
);
CREATE TABLE recipes(
rid SERIAL PRIMARY KEY,
content TEXT NOT NULL
);
CREATE TABLE rating(
rid INTEGER REFERENCES recipes,
uid INTEGER REFERENCES usr,
rating INTEGER NOT NULL CHECK (rating between 0 and 5),
PRIMARY KEY(rid,uid)
);
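If you want to convince yourself that the two spellings behave the same, here is a small sketch that creates the rating table both ways and tries the same inserts against each. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so treat it as an illustration of the idea, not as proof about Postgres itself. The names TABLE_LEVEL, COLUMN_LEVEL and accepted are hypothetical.

```python
import sqlite3

# Foreign keys declared as table constraints.
TABLE_LEVEL = """
CREATE TABLE rating (
    rid INTEGER,
    uid INTEGER,
    rating INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 5),
    PRIMARY KEY (rid, uid),
    FOREIGN KEY (rid) REFERENCES recipes (rid),
    FOREIGN KEY (uid) REFERENCES usr (uid)
)
"""

# Foreign keys declared as column constraints.
COLUMN_LEVEL = """
CREATE TABLE rating (
    rid INTEGER REFERENCES recipes (rid),
    uid INTEGER REFERENCES usr (uid),
    rating INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 5),
    PRIMARY KEY (rid, uid)
)
"""


def accepted(rating_ddl, rid, uid, rating):
    """Build a fresh database with the given DDL and try one insert."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE usr (uid INTEGER PRIMARY KEY, uname TEXT NOT NULL)")
    conn.execute("CREATE TABLE recipes (rid INTEGER PRIMARY KEY, content TEXT NOT NULL)")
    conn.execute(rating_ddl)
    conn.execute("INSERT INTO usr VALUES (1, 'Joe')")
    conn.execute("INSERT INTO recipes VALUES (1, 'cookies')")
    try:
        conn.execute("INSERT INTO rating VALUES (?, ?, ?)", (rid, uid, rating))
        return True
    except sqlite3.IntegrityError:
        return False
```

Both variants accept (1, 1, 5) and reject a rating that points at a missing user or recipe, or a rating value outside 0 to 5.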
This is a SQL Server based solution, but the concept applies to most any RDBMS.
Like so:
CREATE TABLE Rating (
rid int NOT NULL,
uid int NOT NULL,
CONSTRAINT PK_Rating PRIMARY KEY (rid, uid)
);
ALTER TABLE Rating ADD CONSTRAINT FK_Rating_Recipes FOREIGN KEY(rid)
REFERENCES Recipes (rid);
ALTER TABLE Rating ADD CONSTRAINT FK_Rating_User FOREIGN KEY(uid)
REFERENCES User (uid);
This ensures that the values in Rating are only values that exist in both the User table and the Recipes table. Please note that I didn't include the other fields you had in the Rating table; just add those.
Assume the users table has 3 users, Joe, Bob and Bill, with respective IDs 1, 2, 3, and the recipes table has cookies, chicken pot pie, and pumpkin pie, with respective IDs 1, 2, 3. Then inserts into the Rating table will only be allowed for these values; the minute you enter 4 for a rid or a uid, the database throws an error and does not commit the transaction.
Try it yourself, it's a good learning experience.
In PostgreSQL, a correct way to implement these tables is the following (user is a reserved word, so the table name has to be double-quoted):
CREATE SEQUENCE uid_seq;
CREATE SEQUENCE rid_seq;
CREATE TABLE "User"(
uid INTEGER PRIMARY KEY DEFAULT nextval('uid_seq'),
uname VARCHAR NOT NULL
);
CREATE TABLE Recipes(
rid INTEGER PRIMARY KEY DEFAULT nextval('rid_seq'),
content VARCHAR NOT NULL
);
CREATE TABLE Rating(
rid INTEGER NOT NULL REFERENCES Recipes(rid),
uid INTEGER NOT NULL REFERENCES "User"(uid),
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid)
);
There is no real difference between the two options that you have written.
A simple (i.e. single-column) foreign key may be declared in-line with the column declaration or not. It's merely a question of style. A third way would be to omit foreign key declarations from the CREATE TABLE entirely and add them later using ALTER TABLE statements; done in a transaction (presumably along with all the other tables, constraints, etc.), the table would never exist without its required constraints. Choose whichever you think is easiest for a human coder to read and understand, i.e. easiest to maintain.
EDIT: I overlooked the REFERENCES clause in the second version when I wrote my original answer. The two versions are identical in terms of referential integrity; they are just two syntaxes for the same thing.