How to create and normalise this SQL relational database?

I am trying to design an SQLite database. I have made a quick design on paper of what I want.
So far, I have one table with the following columns:
user_id (unique; the primary key of the table)
number_of_items_allowed (can be any number and does not have to be unique)
list_of_items (a list of any size, as long as its length is less than or equal to number_of_items_allowed; this list stores item IDs)
The column I am struggling with the most is list_of_items. I know that a relational database does not allow a list in a column and that you must create a second table to hold that information in order to normalise the database. I have looked at a few Stack Overflow answers, including this one, which says that you can't store lists in a column, but I was not able to apply the accepted answer to my case.
I have thought about a secondary table that would have one row for each item ID belonging to a user_id; the primary key in that case would be the combination of the item ID and the user_id. However, I was not sure if that would be the ideal way of going about it.

Consider the following schema with 3 tables:
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    user TEXT NOT NULL,
    number_of_items_allowed INTEGER NOT NULL CHECK(number_of_items_allowed >= 0)
);

CREATE TABLE items (
    item_id INTEGER PRIMARY KEY,
    item TEXT NOT NULL
);

CREATE TABLE users_items (
    user_id INTEGER NOT NULL REFERENCES users(user_id) ON UPDATE CASCADE ON DELETE CASCADE,
    item_id INTEGER NOT NULL REFERENCES items(item_id) ON UPDATE CASCADE ON DELETE CASCADE,
    PRIMARY KEY(user_id, item_id)
);
For this schema, you need a BEFORE INSERT trigger on users_items which checks if a new item can be inserted for a user by comparing the user's number_of_items_allowed to the current number of items that the user has:
CREATE TRIGGER check_number_before_insert_users_items
BEFORE INSERT ON users_items
BEGIN
    SELECT CASE
        WHEN (SELECT COUNT(*) FROM users_items WHERE user_id = NEW.user_id) >=
             (SELECT number_of_items_allowed FROM users WHERE user_id = NEW.user_id)
        THEN RAISE(ABORT, 'No more items allowed')
    END;
END;
You will also need a trigger that checks, when number_of_items_allowed is updated, that the new value is not less than the user's current number of items:
CREATE TRIGGER check_number_before_update_users
BEFORE UPDATE ON users
BEGIN
    SELECT CASE
        WHEN (SELECT COUNT(*) FROM users_items WHERE user_id = NEW.user_id) > NEW.number_of_items_allowed
        THEN RAISE(ABORT, 'There are already more items for this user than the value inserted')
    END;
END;
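To see the schema and the insert trigger working together, here is a minimal sketch driven from Python's built-in sqlite3 module (the user and item names are made up for the demo):

```python
import sqlite3

# Exercise the schema and BEFORE INSERT trigger from the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    user TEXT NOT NULL,
    number_of_items_allowed INTEGER NOT NULL CHECK(number_of_items_allowed >= 0)
);
CREATE TABLE items (item_id INTEGER PRIMARY KEY, item TEXT NOT NULL);
CREATE TABLE users_items (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    item_id INTEGER NOT NULL REFERENCES items(item_id),
    PRIMARY KEY(user_id, item_id)
);
CREATE TRIGGER check_number_before_insert_users_items
BEFORE INSERT ON users_items
BEGIN
    SELECT CASE
        WHEN (SELECT COUNT(*) FROM users_items WHERE user_id = NEW.user_id) >=
             (SELECT number_of_items_allowed FROM users WHERE user_id = NEW.user_id)
        THEN RAISE(ABORT, 'No more items allowed')
    END;
END;
""")
conn.execute("INSERT INTO users VALUES (1, 'alice', 1)")  # allowed at most 1 item
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 'pen'), (2, 'ink')])
conn.execute("INSERT INTO users_items VALUES (1, 1)")     # first item: fits the limit
try:
    conn.execute("INSERT INTO users_items VALUES (1, 2)") # second item: over the limit
except sqlite3.IntegrityError as e:
    print(e)  # No more items allowed
```

The RAISE(ABORT, ...) in the trigger surfaces in Python as a sqlite3.IntegrityError, so the application can catch it like any other constraint violation.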


Constraint on a group of rows

For a simple example, let's say I have a list table and a list_entry table:
CREATE TABLE list
(
    id SERIAL PRIMARY KEY
);
CREATE TABLE list_entry
(
    id SERIAL PRIMARY KEY,
    list_id INTEGER NOT NULL
        REFERENCES list(id)
        ON DELETE CASCADE,
    position INTEGER NOT NULL,
    value TEXT NOT NULL,
    CONSTRAINT list_entry__position_in_list_unique
        UNIQUE(list_id, position)
);
I now want to add the following constraint: all list entries with the same list_id have position entries that form a contiguous sequence starting at 1.
And I have no idea how.
I first thought about EXCLUDE constraints, but that seems to lead nowhere.
I could of course create a trigger, but I'd prefer not to, if at all possible.
You can't do that with a constraint - you would need to implement the logic in code (e.g. using triggers, stored procedures, application code, etc.)
I'm not aware of a way to do this with constraints. Normally a trigger would be the most straightforward choice, but if you want to avoid triggers, you can compute the next position for the given list_id at insert time, e.g. when inserting a list_entry with list_id = 1:
INSERT INTO list_entry (list_id, position, value)
VALUES (1, (SELECT coalesce(max(position), 0) + 1 FROM list_entry WHERE list_id = 1), '42');
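A quick sketch of this pattern, translated to SQLite purely so it is self-contained (SERIAL becomes INTEGER PRIMARY KEY; table and column names as in the question):

```python
import sqlite3

# Demonstrate the coalesce(max(position), 0) + 1 pattern for contiguous
# positions per list, using SQLite as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE list (id INTEGER PRIMARY KEY);
CREATE TABLE list_entry (
    id INTEGER PRIMARY KEY,
    list_id INTEGER NOT NULL REFERENCES list(id) ON DELETE CASCADE,
    position INTEGER NOT NULL,
    value TEXT NOT NULL,
    UNIQUE(list_id, position)
);
""")
conn.execute("INSERT INTO list (id) VALUES (1)")
for v in ("a", "b", "c"):
    # Each insert computes the next free position for list 1 inside the
    # statement itself, so the sequence stays 1, 2, 3, ...
    conn.execute(
        "INSERT INTO list_entry (list_id, position, value) "
        "VALUES (1, (SELECT coalesce(max(position), 0) + 1 "
        "            FROM list_entry WHERE list_id = 1), ?)", (v,))
print(conn.execute(
    "SELECT position, value FROM list_entry ORDER BY position").fetchall())
# [(1, 'a'), (2, 'b'), (3, 'c')]
```

Under concurrent writers two sessions could compute the same next position; the UNIQUE(list_id, position) constraint then rejects one of them, so the invariant still holds.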
You can use a generated column to reference the previous number in the list, essentially building a linked list. This works in Postgres:
create table list_entry
(
    pos integer not null primary key,
    val text not null,
    prev_pos integer not null
        references list_entry (pos)
        generated always as (greatest(0, pos - 1)) stored
);
In this implementation, the first item (pos=0) points to itself.

How to solve deadlock when inserting rows in sql many-to-many relationship with minimum cardinality one restriction?

This year I've been learning about relational databases and how to design them. In order to strengthen my knowledge, I'm trying to design and implement a database using Python and sqlite3.
The database is about a textile company, and, among other things, they want to keep information about the following:
Materials they use to make their products
Shops where they look for materials
Some shops (the ones where they do buy materials) are considered suppliers.
They want to know what suppliers provide what materials
About this last relationship, there are some restrictions:
A supplier can provide more than one material (Supplier class maximum cardinality many)
A material can be provided by more than one supplier (Material class maximum cardinality many)
All materials must be provided by at least one supplier (Material class minimum cardinality one)
All suppliers must provide at least one material (Supplier class minimum cardinality one)
This is how I think the ER diagram looks, given these indications:
Entity-Relation diagram for "Provides" relationship
Given the minimum cardinality one, I think I have to implement the integrity restrictions with triggers. This is how I think the logical design (the actual tables in the database) looks:
Logical diagram for "Provides" relationship
With the following integrity restrictions:
IR1. Minimum cardinality one in Material-Provides: every value of the 'cod_material' attribute from the Material table must appear at least once as a value of the 'cod_material' attribute in the Provides table.
IR2. Minimum cardinality one in Supplier-Provides: every value of the 'cod_supplier' attribute from the Supplier table must appear at least once as a value of the 'cod_supplier' attribute in the Provides table.
All of this means that, when inserting new suppliers or materials, I will also have to insert what material they provided (in the case of the suppliers) or what supplier has provided it (in the case of the materials).
This is what the triggers I made to enforce the integrity restrictions look like (I should also add that I've been working with PL/SQL, and SQLite's dialect differs, so I'm not used to this syntax and there may be some errors):
CREATE TRIGGER IF NOT EXISTS check_mult_provides_supl
AFTER INSERT ON Supplier
BEGIN
    SELECT CASE
        WHEN ((SELECT p.cod_supplier FROM Provides p WHERE p.cod_supplier = new.cod_supplier) IS NULL)
        THEN RAISE(ABORT, 'This shop has not provided any material yet')
    END;
END;
CREATE TRIGGER IF NOT EXISTS check_mult_provides_mat
AFTER INSERT ON Material
BEGIN
    SELECT CASE
        WHEN ((SELECT p.cod_material FROM Provides p WHERE p.cod_material = new.cod_material) IS NULL)
        THEN RAISE(ABORT, 'This material has not been provided by anyone')
    END;
END;
I've tried adding new rows to the tables Material and Supplier respectively, and the triggers are working (or at least they're not allowing me to insert new rows without a row in the Provides table).
This is when I reach the deadlock:
Having the database empty, if I try to insert a row in the tables Material or Supplier the triggers fire and they don't allow me (because first I need to insert the corresponding row in the table Provides). However, if I try to insert a row in the Provides table, I get a foreign key constraint error (obviously, since that supplier and material are not inserted into their respective tables yet), so basically I cannot insert rows in my database.
The only answers I can think of are not very satisfactory: momentarily disabling a constraint (either the foreign key constraint or the trigger) puts database integrity at risk, since rows inserted while it is disabled never fire the trigger, even if it is re-enabled afterwards. The other thing I thought of was relaxing the minimum cardinality restrictions, but I assume a many-to-many relationship with a minimum cardinality of one must be common in real databases, so there must be another kind of solution.
How can I get out of this deadlock? Maybe a procedure would do the trick (although SQLite doesn't have stored procedures, I think I can emulate them through the Python API with create_function() in the sqlite3 module)?
Just in case anyone wants to reproduce this part of the database, here is the code for the creation of the tables (I finally decided to autoincrement the primary key, so the datatype is an integer, as opposed to the ER diagram and the logical diagram, which said a character datatype):
CREATE TABLE IF NOT EXISTS Material (
    cod_material integer AUTO_INCREMENT PRIMARY KEY,
    descriptive_name varchar(100) NOT NULL,
    cost_price float NOT NULL
);

CREATE TABLE IF NOT EXISTS Shop (
    cod_shop integer AUTO_INCREMENT PRIMARY KEY,
    name varchar(100) NOT NULL,
    web varchar(100) NOT NULL,
    phone_number varchar(12),
    mail varchar(100),
    address varchar(100)
);

CREATE TABLE IF NOT EXISTS Supplier (
    cod_proveedor integer PRIMARY KEY CONSTRAINT FK_Supplier_Shop REFERENCES Shop(cod_shop)
);

CREATE TABLE IF NOT EXISTS Provides (
    cod_material integer CONSTRAINT FK_Provides_Material REFERENCES Material(cod_material),
    cod_supplier integer CONSTRAINT FK_Provides_Supplier REFERENCES Supplier(cod_supplier),
    CONSTRAINT PK_Provides PRIMARY KEY (cod_material, cod_supplier)
);
I believe that you want a DEFERRED FOREIGN KEY. The triggers, however, will interfere, as they would still fire.
However, you also need to look at the code that you have posted. There is no AUTO_INCREMENT keyword in SQLite; it is AUTOINCREMENT (though you very probably do not need AUTOINCREMENT, as INTEGER PRIMARY KEY will do all that you require).
If you check the SQLite AUTOINCREMENT documentation, it notes:
The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and disk I/O overhead and should be avoided if not strictly needed. It is usually not needed.
The Supplier table as coded is close to useless: it is a single column that references a shop, with no other data. Moreover, the Provides table references the Supplier table on a non-existent column (cod_supplier; the Supplier table defines cod_proveedor).
Coding CONSTRAINT name REFERENCES table(column) does not adhere to the syntax, as CONSTRAINT is a table-level clause whilst REFERENCES is a column-level clause, and this appears to cause some confusion.
I suspect that you may have resorted to triggers because the FK conflicts weren't doing anything. By default, foreign key processing is turned off and has to be enabled as per Enabling Foreign Key Support. I don't believe the triggers are required.
Anyway, I believe that the following, which includes changes to overcome the above issues, demonstrates DEFERRED FOREIGN KEYS:
DROP TABLE IF EXISTS Provides;
DROP TABLE IF EXISTS Supplier;
DROP TABLE IF EXISTS Shop;
DROP TABLE IF EXISTS Material;
DROP TRIGGER IF EXISTS check_mult_provides_supl;
DROP TRIGGER IF EXISTS check_mult_provides_mat;
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS Material (
    cod_material integer PRIMARY KEY,
    descriptive_name varchar(100) NOT NULL,
    cost_price float NOT NULL
);

CREATE TABLE IF NOT EXISTS Shop (
    cod_shop integer PRIMARY KEY,
    name varchar(100) NOT NULL,
    web varchar(100) NOT NULL,
    phone_number varchar(12),
    mail varchar(100),
    address varchar(100)
);

CREATE TABLE IF NOT EXISTS Supplier (
    cod_supplier INTEGER PRIMARY KEY,
    cod_proveedor integer /*PRIMARY KEY*/ REFERENCES Shop(cod_shop) DEFERRABLE INITIALLY DEFERRED
);

CREATE TABLE IF NOT EXISTS Provides (
    cod_material integer REFERENCES Material(cod_material) DEFERRABLE INITIALLY DEFERRED,
    cod_supplier integer REFERENCES Supplier(cod_supplier) DEFERRABLE INITIALLY DEFERRED,
    PRIMARY KEY (cod_material, cod_supplier)
);
/*
CREATE TRIGGER IF NOT EXISTS check_mult_provides_supl
AFTER INSERT ON Supplier
BEGIN
    SELECT CASE
        WHEN ((SELECT p.cod_supplier FROM Provides p WHERE p.cod_supplier = new.cod_supplier) IS NULL)
        THEN RAISE(ABORT, 'This shop has not provided any material yet')
    END;
END;

CREATE TRIGGER IF NOT EXISTS check_mult_provides_mat
AFTER INSERT ON Material
BEGIN
    SELECT CASE
        WHEN ((SELECT p.cod_material FROM Provides p WHERE p.cod_material = new.cod_material) IS NULL)
        THEN RAISE(ABORT, 'This material has not been provided by anyone')
    END;
END;
*/
-- END TRANSACTION; need to use this if it fails before getting to commit
BEGIN TRANSACTION;
INSERT INTO Shop (name,web,phone_number,mail,address) VALUES ('shop1','www.shop1.com','000000000000','shop1@email.com','1 Somewhere Street, SomeTown etc');
INSERT INTO Supplier (cod_proveedor) VALUES((SELECT max(cod_shop) FROM Shop));
INSERT INTO Material (descriptive_name,cost_price)VALUES('cotton',10.5);
INSERT INTO Provides VALUES((SELECT max(cod_material) FROM Material),(SELECT max(cod_supplier) FROM Supplier ));
COMMIT;
SELECT * FROM shop
JOIN Supplier ON Shop.cod_shop = cod_proveedor
JOIN Provides ON Provides.cod_supplier = Supplier.cod_supplier
JOIN Material ON Provides.cod_material = Material.cod_material
;
DROP TABLE IF EXISTS Provides;
DROP TABLE IF EXISTS Supplier;
DROP TABLE IF EXISTS Shop;
DROP TABLE IF EXISTS Material;
DROP TRIGGER IF EXISTS check_mult_provides_supl;
DROP TRIGGER IF EXISTS check_mult_provides_mat;
When run as is, the transaction commits and the final SELECT returns the newly inserted shop, supplier, material and provides rows joined together (result grid not reproduced here).
However, if the INSERT into Supplier is altered to:
INSERT INTO Supplier (cod_proveedor) VALUES((SELECT max(cod_shop) + 1 FROM Shop));
i.e. the referenced shop is not an existing shop (the id is 1 greater than any), then the COMMIT fails. The messages/log are:
BEGIN TRANSACTION
> OK
> Time: 0s
INSERT INTO Shop (name,web,phone_number,mail,address) VALUES ('shop1','www.shop1.com','000000000000','shop1@email.com','1 Somewhere Street, SomeTown etc')
> Affected rows: 1
> Time: 0.002s
INSERT INTO Supplier (cod_proveedor) VALUES((SELECT max(cod_shop) + 1 FROM Shop))
> Affected rows: 1
> Time: 0s
INSERT INTO Material (descriptive_name,cost_price)VALUES('cotton',10.5)
> Affected rows: 1
> Time: 0s
INSERT INTO Provides VALUES((SELECT max(cod_material) FROM Material),(SELECT max(cod_supplier) FROM Supplier ))
> Affected rows: 1
> Time: 0s
COMMIT
> FOREIGN KEY constraint failed
> Time: 0s
That is, the deferred inserts themselves were successful, BUT the commit failed.
You may wish to refer to the SQLite documentation on transactions.
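A minimal sketch of this deferred-FK behaviour driven from Python's sqlite3 module (a cut-down two-table version of the schema above, names as in the answer):

```python
import sqlite3

# With DEFERRABLE INITIALLY DEFERRED, the foreign key is only checked at
# COMMIT, so a child row may briefly reference a missing parent inside an
# open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Shop (cod_shop INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Supplier (
    cod_supplier INTEGER PRIMARY KEY,
    cod_proveedor INTEGER REFERENCES Shop(cod_shop)
        DEFERRABLE INITIALLY DEFERRED
);
""")
conn.execute("BEGIN")
# Insert the supplier first: no FK error yet, the check is deferred.
conn.execute("INSERT INTO Supplier (cod_supplier, cod_proveedor) VALUES (1, 1)")
conn.execute("INSERT INTO Shop (cod_shop, name) VALUES (1, 'shop1')")
conn.execute("COMMIT")  # both rows exist now, so the deferred check passes

conn.execute("BEGIN")
conn.execute("INSERT INTO Supplier (cod_supplier, cod_proveedor) VALUES (2, 99)")
try:
    conn.execute("COMMIT")  # fails: shop 99 was never inserted
except sqlite3.Error as e:
    print(e)  # FOREIGN KEY constraint failed
    conn.execute("ROLLBACK")  # the failed COMMIT leaves the transaction open
```

Note that the insert order inside the transaction no longer matters; only the state at COMMIT does, which is exactly what breaks the mutual-dependency deadlock.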
I think the design of your database should be reconsidered, since the table Provides mixes two different sets of information: which shops offer which materials, and which supplier provides a certain material. A better design would separate those two kinds of information, so that you can strengthen the constraints expressed through foreign keys.
Here is a sketch of the tables, not tied to a specific RDBMS.
Material (cod_material, descriptive_name, cost_price)
    PK (cod_material)

Shop (cod_shop, name, web, phone_number, mail, address)
    PK (cod_shop)

ShopMaterial (cod_shop, cod_material)
    PK (cod_shop, cod_material)
    cod_shop FK for Shop, cod_material FK for Material

SupplierMaterial (cod_sup, cod_material)
    PK (cod_sup, cod_material)
    cod_sup FK for Shop, cod_material FK for Material
    (cod_sup, cod_material) FK for ShopMaterial
The different foreign keys already take into account several constraints. The only constraint not enforced is, I think:
All materials must be provided by at least one supplier
This constraint cannot be enforced automatically, since you first have to insert a material, then add the corresponding (cod_shop, cod_material) pairs, and then the (cod_sup, cod_material) pairs. For this, I think the best option is to define, at the application level, a procedure that inserts, at the same time, the material, the shops from which it can be obtained, and its supplier, as well as a procedure that removes a material together with the relevant pairs in the ShopMaterial and SupplierMaterial tables.
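A sketch of such an application-level procedure using Python's sqlite3 module; the function name add_material and the simplified DDL are illustrative, not part of the original design:

```python
import sqlite3

# Insert a material together with its shop and supplier pairs in ONE
# transaction, so the "every material has a supplier" rule is never false
# at commit time. Table layout follows the sketch above (simplified).
def add_material(conn, name, price, shop_ids, supplier_ids):
    with conn:  # one transaction: all rows are inserted, or none are
        cur = conn.execute(
            "INSERT INTO Material (descriptive_name, cost_price) VALUES (?, ?)",
            (name, price))
        mat_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO ShopMaterial (cod_shop, cod_material) VALUES (?, ?)",
            [(s, mat_id) for s in shop_ids])
        conn.executemany(
            "INSERT INTO SupplierMaterial (cod_sup, cod_material) VALUES (?, ?)",
            [(s, mat_id) for s in supplier_ids])
        return mat_id

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Material (cod_material INTEGER PRIMARY KEY,
                       descriptive_name TEXT, cost_price REAL);
CREATE TABLE ShopMaterial (cod_shop INTEGER, cod_material INTEGER,
                           PRIMARY KEY (cod_shop, cod_material));
CREATE TABLE SupplierMaterial (cod_sup INTEGER, cod_material INTEGER,
                               PRIMARY KEY (cod_sup, cod_material));
""")
mat = add_material(conn, "cotton", 10.5, shop_ids=[1], supplier_ids=[1])
```

If any of the three inserts fails, the context manager rolls the whole transaction back, so no material can ever exist without its shop and supplier rows.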

PostgreSQL - Constraint Based on Column in Another Table

I have two tables, one called polls and one called votes. Polls stores a list of strings representing options that people can vote for:
CREATE TABLE IF NOT EXISTS Polls (
    id SERIAL PRIMARY KEY,
    options text[]
);
Votes stores votes that users have made (where a vote is an integer representing the index of the option they voted for):
CREATE TABLE IF NOT EXISTS Votes (
    id SERIAL PRIMARY KEY,
    poll_id integer REFERENCES Polls(id),
    value integer NOT NULL,
    cast_by integer NOT NULL
);
I want to ensure that whenever a row is created in the Votes table, the value of 'value' is in the range [0,length(options)) for the corresponding row in Polls (by corresponding, I mean the row where the poll_id's match).
Is there any kind of check or foreign key constraint I can implement to accomplish this? Or do I need some kind of trigger? If so, what would that trigger look like and would there be performance concerns? Would it be just as performant to just manually query for the corresponding poll using a SELECT statement and then assert that 'value' is valid before inserting into Votes table?
For this requirement you cannot use a CHECK constraint, because a check constraint can only refer to columns of the same table.
You can refer to the official manual on this.
So here you should use a trigger on the BEFORE INSERT event of your Votes table, or use a function/procedure (depending on your version of PostgreSQL) for the insert operation, in which you check the value before inserting and raise an exception if the condition is not satisfied.
Using a trigger:
create or replace function id_exist() returns trigger as
$$
begin
    if new.value < 0 or new.value >= (select array_length(options, 1) from polls where id = new.poll_id) then
        raise exception 'value is not in range';
    end if;
    return new;
end;
$$
language plpgsql;

CREATE TRIGGER check_value BEFORE INSERT ON votes
FOR EACH ROW EXECUTE PROCEDURE id_exist();
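The question also asks whether manually querying and asserting before the INSERT would work. A sketch of that application-side check (SQLite stand-in, with the Postgres text[] column emulated as a JSON-encoded string purely for illustration; cast_vote is a hypothetical helper name):

```python
import sqlite3
import json

# Application-side range check before inserting a vote. Unlike the trigger,
# a check-then-insert is racy unless both statements share one transaction.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Polls (id INTEGER PRIMARY KEY, options TEXT);
CREATE TABLE Votes (id INTEGER PRIMARY KEY,
                    poll_id INTEGER REFERENCES Polls(id),
                    value INTEGER NOT NULL, cast_by INTEGER NOT NULL);
""")
conn.execute("INSERT INTO Polls (id, options) VALUES (1, ?)",
             (json.dumps(["yes", "no", "abstain"]),))

def cast_vote(conn, poll_id, value, user_id):
    row = conn.execute("SELECT options FROM Polls WHERE id = ?",
                       (poll_id,)).fetchone()
    n_options = len(json.loads(row[0]))
    if not 0 <= value < n_options:  # the range [0, len(options))
        raise ValueError("value is not in range")
    conn.execute("INSERT INTO Votes (poll_id, value, cast_by) VALUES (?, ?, ?)",
                 (poll_id, value, user_id))

cast_vote(conn, 1, 2, user_id=7)      # option index 2 exists: accepted
try:
    cast_vote(conn, 1, 3, user_id=8)  # out of range: rejected
except ValueError as e:
    print(e)  # value is not in range
```

Performance-wise this costs the same extra SELECT as the trigger does, but the trigger cannot be bypassed by a code path that forgets to call the helper.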
I would suggest that you modify your data model to have a table, PollOptions:
CREATE TABLE IF NOT EXISTS PollOptions (
    PollOptionsId SERIAL PRIMARY KEY, -- could use GENERATED ALWAYS AS IDENTITY instead
    PollId INT NOT NULL REFERENCES Polls(id),
    OptionNumber int,
    Option text,
    UNIQUE (PollId, Option)
);
Then your Votes table should have a foreign key reference to PollOptions. You can use either PollOptionsId or (PollId, Option).
No triggers or special functions are needed if you set up the data correctly.
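A sketch of the remodelled schema in action (SQLite stand-in for the Postgres DDL above): the foreign key alone rejects a vote for a non-existent option.

```python
import sqlite3

# Votes reference a PollOptions row, so the FK guarantees the voted
# option actually exists; no trigger needed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FKs are off by default in SQLite
conn.executescript("""
CREATE TABLE Polls (id INTEGER PRIMARY KEY);
CREATE TABLE PollOptions (
    PollOptionsId INTEGER PRIMARY KEY,
    PollId INT NOT NULL REFERENCES Polls(id),
    OptionNumber INT,
    Option TEXT,
    UNIQUE (PollId, Option)
);
CREATE TABLE Votes (
    id INTEGER PRIMARY KEY,
    PollOptionsId INT NOT NULL REFERENCES PollOptions(PollOptionsId),
    cast_by INT NOT NULL
);
""")
conn.execute("INSERT INTO Polls (id) VALUES (1)")
conn.execute("INSERT INTO PollOptions VALUES (10, 1, 0, 'yes')")
conn.execute("INSERT INTO Votes (PollOptionsId, cast_by) VALUES (10, 7)")  # ok
try:
    conn.execute("INSERT INTO Votes (PollOptionsId, cast_by) VALUES (99, 8)")
except sqlite3.IntegrityError as e:
    print(e)  # FOREIGN KEY constraint failed
```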

How can I set a unique key check based on another table

I have created three tables: supplier, item and purchase. The supplier id is referenced by the item table, and the item id is referenced by the purchase table. I do not want a purchase to contain two items from the same supplier. How can I set up such a constraint?
CREATE TABLE csupplier (
    supid NUMBER(10) PRIMARY KEY,
    supname VARCHAR2(30)
);

CREATE TABLE ctitem (
    itemid NUMBER(10) PRIMARY KEY,
    itemname VARCHAR2(50),
    supid NUMBER(10)
);

ALTER TABLE CTITEM
    ADD CONSTRAINT CTITEM_FK1 FOREIGN KEY (SUPID) REFERENCES CSUPPLIER (SUPID);

CREATE TABLE cPurchase (
    purchaseid NUMBER(10) PRIMARY KEY,
    itemid NUMBER(10),
    purchaseqty NUMBER(10)
);

ALTER TABLE CPURCHASE
    ADD CONSTRAINT CPURCHASE_FK1 FOREIGN KEY (ITEMID) REFERENCES CTITEM (ITEMID);
For example, I do not want to be able to insert item-1 and item-3 (which have the same supplier) into the same purchase.
The problem is that Oracle does not understand the concept of "at the same time". It understands transactions, DML statements and unique keys. So we need to frame your question in terms Oracle can understand: for instance, a given purchase cannot have more than one item from the same supplier.
Your first problem is that your data model can't support such a rule. Your cpurchase table has a primary key of purchaseid, which means you have one record per item purchased. There is no set of purchased items against which we can enforce the rule. So the first thing is to change the data model:
CREATE TABLE cPurchase (
    purchaseid NUMBER(10) PRIMARY KEY
);

CREATE TABLE cPurchaseItem (
    purchaseid NUMBER(10),
    itemid NUMBER(10),
    purchaseqty NUMBER(10)
);

ALTER TABLE CPURCHASEITEM
    ADD CONSTRAINT CPURCHASEITEM_PK PRIMARY KEY (PURCHASEID, ITEMID);

ALTER TABLE CPURCHASEITEM
    ADD CONSTRAINT CPURCHASEITEM_FK1 FOREIGN KEY (PURCHASEID) REFERENCES CPURCHASE;

ALTER TABLE CPURCHASEITEM
    ADD CONSTRAINT CPURCHASEITEM_FK2 FOREIGN KEY (ITEMID) REFERENCES CTITEM (ITEMID);
Now we have a header-detail structure which assigns multiple items to one purchase, which means we can attempt to enforce the rule.
The next problem is that supid is not an attribute of cpurchaseitem. There is no way to build a check constraint on a table or column which executes a query against another table. What you are after is a SQL assertion, a notional construct that would allow us to define such rules. Alas, neither Oracle nor any other mainstream RDBMS supports assertions at the moment.
So that leaves us with three options:
Go procedural, and write a transaction API which enforces this rule.
Denormalise cpurchaseitem to include supid, then build a unique constraint on (purchaseid, supid). You would need to populate supid whenever you populate cpurchaseitem.
Write an after statement trigger:
(Warning: this is coded wildstyle and may contain bugs and/or compilation errors.)
create or replace trigger cpurchaseitem_trg
    after insert or update on cpurchaseitem
declare
    rec_count number;
begin
    select count(*)
    into   rec_count
    from   (select pi.purchaseid, i.supid
            from   cpurchaseitem pi
                   join ctitem i on pi.itemid = i.itemid
            group  by pi.purchaseid, i.supid
            having count(*) > 1);
    if rec_count > 0 then
        raise_application_error(-20000,
            'more than one item for a supplier!');
    end if;
end;
Frankly, none of these solutions is especially appealing. The API is a solid solution but open to circumvention. The trigger will suffer from scaling issues as the number of purchases grows over time (although this can be mitigated by writing a compound trigger instead, left as an exercise for the reader). Denormalisation is the safest (and probably most performant) solution, even though it's not modelling best practice.
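A sketch of the denormalisation option, using SQLite in place of Oracle (column names follow the tables above; the UNIQUE constraint does all the work):

```python
import sqlite3

# Denormalised cpurchaseitem: supid is stored on each row, and
# UNIQUE(purchaseid, supid) rejects a second item from the same
# supplier within one purchase.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cpurchaseitem (
    purchaseid INTEGER,
    itemid INTEGER,
    supid INTEGER,
    purchaseqty INTEGER,
    PRIMARY KEY (purchaseid, itemid),
    UNIQUE (purchaseid, supid)
);
""")
conn.execute("INSERT INTO cpurchaseitem VALUES (1, 101, 5, 2)")
conn.execute("INSERT INTO cpurchaseitem VALUES (1, 102, 6, 1)")  # other supplier: ok
try:
    conn.execute("INSERT INTO cpurchaseitem VALUES (1, 103, 5, 4)")  # supplier 5 again
except sqlite3.IntegrityError as e:
    print(e)  # UNIQUE constraint failed
```

The cost of this approach is keeping supid in sync with ctitem, which is why the answer notes it must be populated alongside every cpurchaseitem insert.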
There are two solutions to your problem:
1. Alter the table cPurchase, add the supid column, and make a unique key on that column. This will solve your problem.
CREATE TABLE cPurchase (
    purchaseid NUMBER(10) PRIMARY KEY,
    itemid NUMBER(10),
    purchaseqty NUMBER(10),
    supid NUMBER(10) UNIQUE
);
2. If altering the table is not possible, write a row-level BEFORE INSERT/UPDATE trigger. In this trigger, find the supid for the inserted itemid in ctitem, and then check whether any item from that supplier already exists in your purchase table.
CREATE OR REPLACE TRIGGER SUP_CHECK
BEFORE INSERT ON cPurchase
FOR EACH ROW
DECLARE
    L_COUNT NUMBER;
BEGIN
    SELECT COUNT(*)
    INTO   L_COUNT
    FROM   cPurchase c
    WHERE  c.itemid IN (SELECT itemid
                        FROM   ctitem ct
                        WHERE  ct.supid = (SELECT supid
                                           FROM   ctitem
                                           WHERE  itemid = :new.itemid));
    IF L_COUNT > 0 THEN
        RAISE_APPLICATION_ERROR(-20001, 'an item from this supplier already exists');
    END IF;
EXCEPTION
    WHEN ...
    -- exception handling
END;

Deleting millions of record in bunch in postgresql

I have to delete rows from a table that has 120 million records.
The rows with the highest and the second-highest entry_date should not be deleted.
Table has many constraints.
One PRIMARY key
Two FOREIGN keys
and two indexes other than index on primary key.
I have already successfully tried the method of creating a temp table and moving the required data into it, then dropping the main table and moving the filtered data back from temp to main. That worked fine.
But I need a way to delete records in batches.
CREATE TABLE values
(
    value_id bigint NOT NULL,
    content_definition_id bigint NOT NULL,
    value_s text,
    value_n double precision,
    "order" integer,
    scope_id integer NOT NULL,
    answer boolean NOT NULL,
    date timestamp without time zone NOT NULL,
    entry_date timestamp without time zone NOT NULL,
    CONSTRAINT "value_PK" PRIMARY KEY (value_id),
    CONSTRAINT content_definition_id_fk FOREIGN KEY (content_definition_id)
        REFERENCES content_definition (content_definition_id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION,
    CONSTRAINT scope_fk FOREIGN KEY (scope_id)
        REFERENCES scopes (scope_id) MATCH SIMPLE
        ON UPDATE RESTRICT ON DELETE RESTRICT
);

-- Index: fki_content_definition_id_fk
-- Index: fki_value_value_scope_id
How can I delete the records in batches, e.g. only 1 million rows at a time, and so on?
This assumes you have no conflicting locks (note that index page locks may slow things down as well).
Recent PostgreSQL allows you to use a CTE in a DELETE statement, i.e. you can:
WITH ids_to_delete AS (
    SELECT value_id
    FROM values
    WHERE ...
    LIMIT ...
)
DELETE FROM values
WHERE value_id IN (SELECT value_id FROM ids_to_delete);
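Driven from application code, the same idea becomes a loop of small transactions. A sketch in Python with sqlite3 standing in for PostgreSQL (the table name, cutoff date and batch size are made up; in Postgres the inner SELECT would be the CTE form above):

```python
import sqlite3

# Batched deletion: each pass removes at most BATCH matching rows in its
# own small transaction, so locks and undo/WAL stay bounded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (value_id INTEGER PRIMARY KEY, entry_date TEXT)")
conn.executemany("INSERT INTO vals VALUES (?, ?)",
                 [(i, f"2020-01-{i % 28 + 1:02d}") for i in range(10_000)])

BATCH = 1_000
while True:
    with conn:  # one transaction per batch
        cur = conn.execute(
            "DELETE FROM vals WHERE value_id IN "
            "(SELECT value_id FROM vals WHERE entry_date < '2020-01-20' LIMIT ?)",
            (BATCH,))
    if cur.rowcount == 0:  # nothing left to delete
        break
```

Between batches other transactions get a chance to run, which is the main point of chunking a delete on a 120-million-row table.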
You could also try a MERGE with the conditions from your temp tables and use the DELETE part of it; that should give you good performance.