Database design queries regarding inheritance and foreign key references


I have a question about a design problem I ran into.
There is a table A with subtypes B and C. Table A has an attribute type that tells whether the row is a B or a C. The attributes common to B and C are in A.
The problem is that B has no extra attributes: everything B needs is already in A. C, however, does have extra attributes.
Is it an acceptable solution to create only tables A and C? To extract the B entities, I would filter table A on the type attribute.
Can you recommend any reference material?
I also have a second question, where table A has subtypes B, C and D. Table Z has a column that must hold the primary id of either a B or a C, but NOT a D.
I thought of making Z's column a foreign key to the primary id column of A and then adding a trigger to ensure that the id doesn't belong to a D.
Can anyone please comment?
Thank you!

Many people just enforce all these rules in application code. That is, they "simply" don't insert wrong data. Of course this is very fragile and depends on writing perfect application code at all times. So we want the database to enforce the constraints instead, so that wrong data can never be inserted.
CREATE TABLE A (
  id INT PRIMARY KEY,
  type CHAR(1) NOT NULL,
  unique key (id, type)
);
CREATE TABLE B (
  id INT PRIMARY KEY,
  type CHAR(1) NOT NULL DEFAULT 'B',
  FOREIGN KEY (id, type) REFERENCES A(id, type)
);
If you can force B.type to always be 'B' (CHECK constraint, trigger, or reference a one-row lookup table) then it can of course reference parent rows in A where type='B'. And do something similar in tables C and D, so then each row of A can be referenced by a row from only one sub-type table.
That is, if A.type is 'B' on a given row, and C.type can only be 'C', then no row of C can reference any row where A.type is 'B'.
Now if you want table Z to reference B or C but not D, you can reference by id and type, so Z also has its own type column. You can restrict Z.type by using a lookup table:
CREATE TABLE Ztypes (
  type CHAR(1) PRIMARY KEY
);
INSERT INTO Ztypes VALUES ('B'), ('C');
CREATE TABLE Z (
  id INT PRIMARY KEY,
  Aid INT NOT NULL,
  type CHAR(1) NOT NULL,
  FOREIGN KEY (Aid, type) REFERENCES A(id, type),
  FOREIGN KEY (type) REFERENCES Ztypes(type)
);
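To see the two foreign keys on Z working together, here is a minimal sketch that runs the same idea on an in-memory SQLite database through Python's sqlite3 module. SQLite, the CHECK on A.type, and the helper names make_db, add_a and try_insert_z are assumptions made for illustration; the answer above does not name a database engine.

```python
import sqlite3

# Schema adapted from the answer above; SQLite syntax is an assumption.
SCHEMA = """
CREATE TABLE A (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('B', 'C', 'D')),
    UNIQUE (id, type)              -- target of the composite foreign keys
);
CREATE TABLE Ztypes (type TEXT PRIMARY KEY);
INSERT INTO Ztypes VALUES ('B'), ('C');   -- 'D' is deliberately absent
CREATE TABLE Z (
    id   INTEGER PRIMARY KEY,
    Aid  INTEGER NOT NULL,
    type TEXT NOT NULL,
    FOREIGN KEY (Aid, type) REFERENCES A(id, type),
    FOREIGN KEY (type) REFERENCES Ztypes(type)
);
"""


def make_db():
    """Open an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn


def add_a(conn, a_id, a_type):
    """Insert a supertype row."""
    conn.execute("INSERT INTO A (id, type) VALUES (?, ?)", (a_id, a_type))


def try_insert_z(conn, z_id, a_id, a_type):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO Z (id, Aid, type) VALUES (?, ?, ?)",
            (z_id, a_id, a_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    add_a(db, 1, "B")
    add_a(db, 3, "D")
    print(try_insert_z(db, 1, 1, "B"))  # references a B row
    print(try_insert_z(db, 2, 3, "D"))  # 'D' is not in Ztypes
    print(try_insert_z(db, 3, 3, "B"))  # A has no row (3, 'B')
```

A row in Z has to satisfy both constraints at once: its type must appear in Ztypes, and the pair (Aid, type) must match a real row of A, so lying about the type of a D row does not help.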

You've already got the answer you were looking for. But for others who run across this, it's worth researching two techniques: Class Table Inheritance and Shared Primary Key.
These two techniques used together make it fast, simple and easy to join A's data with either B's or C's data. And in this pattern, B contains only the key, but still contains useful information.
Both of these techniques have their own tags.
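As a rough illustration of the two techniques combined, the sketch below gives each subtype table a primary key that is also a foreign key to A, so a join on id is all that is needed. It uses SQLite through Python's sqlite3 module; the column names common and extra, the sample rows, and the helper names are made up for the example.

```python
import sqlite3


def build():
    """Create A plus one table per subtype, all sharing A's primary key."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE A (
            id     INTEGER PRIMARY KEY,
            type   TEXT NOT NULL,
            common TEXT                 -- attributes shared by B and C
        );
        -- B has no attributes of its own, so it holds only the shared key.
        CREATE TABLE B (id INTEGER PRIMARY KEY REFERENCES A(id));
        CREATE TABLE C (
            id    INTEGER PRIMARY KEY REFERENCES A(id),
            extra TEXT                  -- attribute that only C has
        );
        INSERT INTO A VALUES (1, 'B', 'shared-1'), (2, 'C', 'shared-2');
        INSERT INTO B VALUES (1);
        INSERT INTO C VALUES (2, 'only-for-C');
    """)
    return conn


def rows_of_b(conn):
    """All B entities: the join on the shared key filters A down to B rows."""
    return conn.execute(
        "SELECT A.id, A.common FROM A JOIN B ON B.id = A.id ORDER BY A.id"
    ).fetchall()


def rows_of_c(conn):
    """All C entities together with their extra attribute."""
    return conn.execute(
        "SELECT A.id, A.common, C.extra "
        "FROM A JOIN C ON C.id = A.id ORDER BY A.id"
    ).fetchall()


if __name__ == "__main__":
    db = build()
    print(rows_of_b(db))
    print(rows_of_c(db))
```

Keeping a key-only B table is a design choice rather than a requirement: it records which A rows are B's, and it gives other tables something to reference when they must point at a B specifically.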

Related

How to inherit generated column in Postgres

I have 3 tables as follows:
CREATE TABLE A (
  ID integer PRIMARY KEY generated always as identity,
  data integer
);
CREATE TABLE B (
  other_data integer
) INHERITS (A);
CREATE TABLE C (
  other_other_data integer
) INHERITS (A);
My intent is to have unique ids across tables B and C so they don't mix up. When trying this approach and inserting data into B (insert into B (other_data) values (1)), I get the following error:
ERROR: null value in column "id" of relation "B" violates not-null constraint
DETAIL: Failing row contains (null)
I suspect this approach is not the correct way to make 2 tables share unique ids. And Postgres documentation really sucks. I tried using serial for a long time, only to find out that buried in the wiki is a warning not to use it.
-edit: added the insert statement

How to maintain database integrity when table inheritance cannot be used

I have several different component types that each have drastically different data specs to store so each component type needs its own table, but they all share some common columns. I'm most concerned with [component.ID] which must be a unique identifier to a component regardless of component type (unique across many tables).
First Option
My first idea was inheritance where the table for each component type inherits a generic [component] table.
create table if not exists component (
  ID bigint primary key default nextval('component_id_seq'),
  typeID bigint not null references componentType (ID),
  manufacturerID bigint not null references manufacturer (ID),
  retailPrice numeric check (retailPrice >= 0.0),
  purchasePrice numeric check (purchasePrice >= 0.0),
  manufacturerPartNum varchar(255) not null,
  isLegacy boolean default false,
  check (retailPrice >= purchasePrice)
);
create table if not exists motherboard (
  foo bigint,
  bar bigint
) inherits (component); -- <-- this guy right here!!
/* there would be many other tables with different specific types of components
which each inherit the [component] table */
PostgreSQL inheritance has some caveats that seem to make this a bad idea.
Constraints like unique or primary key are not respected by the inheriting table. Even if you specify unique in the inheriting table it would only be unique in that table and could duplicate values in the parent table or other inheriting tables.
References do not carry over from the parent table. So the references for typeID or manufacturerID would not apply to the inheriting table.
References to the parent table would not include data in the inheriting tables. This is the worst deal breaker for me using inheritance because I need to be able to reference to all components regardless of type.
Second Option
If I don't use inheritance and just use the component table as a master component list with data common to any component of any type, and then have a table for each type of component where each entry refers to a component.ID, that works fine, but how do I enforce it?
How do I enforce that each entry in the component table has one and only one corresponding entry in only one of many other tables? The part that baffles me is that there are many tables and the corresponding entry could be in any of them.
A simple reference back to the component table will ensure that each row in the many specific component type tables has one valid component.id to which it belongs.
Third Option
Last of all I could forego a master component table altogether and just have each table for a specific component type have those same columns. Then I am left with the conundrum of how to enforce a unique component ID across many tables and also how to search across all these many tables (which may very well grow or shrink) in queries. I don't want a huge unwieldy UNION between all these tables. That would bog any select query to frozen molasses speed.
Fourth Option
This strikes me as a problem that comes up from time to time in DB design, and there is probably a name for it that I don't know and perhaps a solution that is entirely different from the above three options.

The foreign key should contain the type of the subcomponent; the example speaks for itself.
create table component(
  id int generated always as identity primary key,
  -- or
  -- id serial primary key,
  type_id int not null,
  general_info text,
  unique (type_id, id)
);
create table subcomponent_1 (
  id int primary key,
  type_id int generated always as (1) stored,
  -- or
  -- type_id int default 1 check(type_id = 1),
  specific_info text,
  foreign key(type_id, id) references component(type_id, id)
);
insert into component (type_id, general_info)
values (1, 'component type 1');
insert into subcomponent_1 (id, specific_info)
values (1, 'specific info');
Note:
update component
set type_id = 2
where id = 1;
ERROR: update or delete on table "component" violates foreign key constraint "subcomponent_1_type_id_id_fkey" on table "subcomponent_1"
DETAIL: Key (type_id, id)=(1, 1) is still referenced from table "subcomponent_1".
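The same pattern can be tried without a PostgreSQL server. The sketch below is an assumption-laden port to SQLite through Python's sqlite3 module: it uses the commented-out "default 1 check(type_id = 1)" variant instead of the generated column, and it adds a hypothetical subcomponent_2 table to show that a row of the wrong subtype cannot attach to a type 1 component.

```python
import sqlite3


def build():
    """Create the component tables and the two sample rows from the answer."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE component (
            id           INTEGER PRIMARY KEY,
            type_id      INTEGER NOT NULL,
            general_info TEXT,
            UNIQUE (type_id, id)
        );
        CREATE TABLE subcomponent_1 (
            id            INTEGER PRIMARY KEY,
            type_id       INTEGER NOT NULL DEFAULT 1 CHECK (type_id = 1),
            specific_info TEXT,
            FOREIGN KEY (type_id, id) REFERENCES component(type_id, id)
        );
        -- Hypothetical second subtype, pinned to type_id = 2.
        CREATE TABLE subcomponent_2 (
            id            INTEGER PRIMARY KEY,
            type_id       INTEGER NOT NULL DEFAULT 2 CHECK (type_id = 2),
            specific_info TEXT,
            FOREIGN KEY (type_id, id) REFERENCES component(type_id, id)
        );
        INSERT INTO component (id, type_id, general_info)
        VALUES (1, 1, 'component type 1');
        INSERT INTO subcomponent_1 (id, specific_info)
        VALUES (1, 'specific info');
    """)
    return conn


def rejected(conn, sql):
    """Return True if the statement is refused by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


if __name__ == "__main__":
    db = build()
    # Changing the type of a component that already has a subcomponent row.
    print(rejected(db, "UPDATE component SET type_id = 2 WHERE id = 1"))
    # Attaching a type 2 subcomponent to the type 1 component.
    print(rejected(db, "INSERT INTO subcomponent_2 (id) VALUES (1)"))
```

Because the subtype table fixes its own type_id and the foreign key includes it, each component id can appear in at most one subtype table. The schema does not force a subtype row to exist for every component, so that half of "one and only one" still needs a trigger or application logic.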

many to many to many SQL schemas

I have 2 tables: A and B.
A many-to-many relation unites them through a join table A_B.
Now my needs have evolved: an A and a B can be related in more than one way.
I don't know what the most conventional way to do that is.
Must I declare a new "relation_way" table that contains the different "ways" for an A to be connected to a B and use it to compose a ternary key in A_B?

I would simply add a column to a_b that states the type of the relationship, e.g. relation_type that stores e.g. owned_by or referred_to or however you want to describe that relation (your obfuscated table and column names do not help a bit in answering this).
create table a_b
(
  a_id integer not null references a,
  b_id integer not null references b,
  relation_type text not null
);
If you allow multiple relations but with different types between two entities, then include the relation_type in the primary key of the a_b table.
If you want to restrict the possible relation types, you should create a lookup table:
create table relation_type
(
  id integer primary key,
  type_name varchar(20) not null unique
);
and reference that from the link table:
create table a_b
(
  a_id integer not null references a,
  b_id integer not null references b,
  relation_type_id integer not null references relation_type,
  primary key (a_id, b_id, relation_type_id)
);
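A quick way to check what this link table allows is to run it against an in-memory SQLite database, as in the sketch below. SQLite, the minimal a and b tables, the sample relation types, and the helper names open_db and link are assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY);
CREATE TABLE relation_type (
    id        INTEGER PRIMARY KEY,
    type_name VARCHAR(20) NOT NULL UNIQUE
);
CREATE TABLE a_b (
    a_id             INTEGER NOT NULL REFERENCES a(id),
    b_id             INTEGER NOT NULL REFERENCES b(id),
    relation_type_id INTEGER NOT NULL REFERENCES relation_type(id),
    PRIMARY KEY (a_id, b_id, relation_type_id)
);
INSERT INTO a VALUES (1);
INSERT INTO b VALUES (1);
INSERT INTO relation_type VALUES (1, 'owned_by'), (2, 'referred_to');
"""


def open_db():
    """Open an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def link(conn, a_id, b_id, relation_type_id):
    """Try to relate an a to a b; return False if a constraint refuses it."""
    try:
        conn.execute(
            "INSERT INTO a_b (a_id, b_id, relation_type_id) VALUES (?, ?, ?)",
            (a_id, b_id, relation_type_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_db()
    print(link(db, 1, 1, 1))   # first relation between the pair
    print(link(db, 1, 1, 2))   # same pair, different relation type
    print(link(db, 1, 1, 1))   # exact duplicate of the first link
    print(link(db, 1, 1, 99))  # relation type missing from the lookup table
```

The three-column primary key is what lets the same pair be linked once per relation type while still blocking exact duplicates.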

Add relation with fixed column value

I would like to create a 'conditional' (foreign key) relation between 3 tables. In my case, it's like this (of course it's quite a bit more complex, but I've stripped it down to demonstrate the problem):
Table [ItemTable]
Column int Id (PK)
Column str ItemName
Table [ItemGroup]
Column int Id (PK)
Column str GroupName
Table [Settings]
Column int Id (PK)
Column str RefersTo ('I' means item, 'G' means item group)
Column int Reference (foreign key depending on 'RefersTo')
The goal now is to create relations with constraints like this:
Settings.Reference refers to ItemTable.Id when Settings.RefersTo equals 'I'
Settings.Reference refers to ItemGroup.Id when Settings.RefersTo equals 'G'
No relation if RefersTo is empty (so no constraint in this situation)
It sounds like a refer-here-or-there relation, but I don't know how to achieve it with MS SQL. I usually use the graphical designer in Management Studio to create and modify table definitions.
Any help is appreciated. Thank you in advance.

Foreign keys don't have filter clauses in their definition. But you can do this using computed columns:
create table Settings (
  . . .
  reference_i as (case when refersto = 'I' then reference end) persisted,
  reference_g as (case when refersto = 'G' then reference end) persisted,
  constraint fk_settings_reference_index
    foreign key (reference_i) references ItemTable(id),
  constraint fk_settings_reference_group
    foreign key (reference_g) references ItemGroup(id)
);

This is not a good design, and if you can, it would be better to change it as @VojtěchDohnal already suggested.
If you cannot change it, you could use an AFTER INSERT trigger to check whether the value of Reference comes from the correct table for the current value of RefersTo and, if not, abort the insert and raise an error. Triggers are not the best option performance-wise, though.
You cannot use an indexed view (which would have been the best option, since it would be schema-bound and would pick up all new and deleted values from your items or groups), because your data comes from two different sources: you would need a UNION to generate the full list of possible values, and indexed views have the limitation that "The SELECT statement in the view definition must not contain UNION".
The last option: You could use an additional table where you keep all data (Type('I', 'G'), Value (Id's from ItemTable for 'I', Id's from ItemGroup for 'G')) with possible Id's for each table and then make your composite foreign key refer to this new table.
The drawback is that in this case you would need to keep track of changes in both ItemTable and ItemGroup tables and update the newly created table accordingly (for newly inserted values, or deleted values) which is not so nice when it comes to maintenance.
For this last scenario the code would be something like:
CREATE TABLE ItemTable (Id INT PRIMARY KEY IDENTITY(1,1), ItemName VARCHAR(100))
CREATE TABLE ItemGroup (Id INT PRIMARY KEY IDENTITY(1,1), GroupName VARCHAR(100))
CREATE TABLE Settings (Id INT PRIMARY KEY IDENTITY(1,1), RefersTo CHAR(1), Reference int)
INSERT INTO ItemTable (ItemName) values ('TestItemName1'), ('TestItemName2'), ('TestItemName3'), ('TestItemName4')
INSERT INTO [ItemGroup] (GroupName) values ('Group1'), ('Group2')
SELECT * FROM ItemTable
SELECT * FROM ItemGroup
SELECT * FROM Settings
CREATE TABLE ReferenceValues (Type char(1), Val INT, PRIMARY KEY (Type, Val))
INSERT INTO ReferenceValues
SELECT 'I' as Type, i.Id as Val
FROM dbo.ItemTable i
UNION
SELECT 'G' as Type, g.Id as Val
FROM dbo.ItemGroup as g
ALTER TABLE dbo.Settings
ADD FOREIGN KEY (RefersTo, Reference) REFERENCES dbo.ReferenceValues(Type, Val);
INSERT INTO Settings (RefersTo, Reference)
VALUES ('I', 1) -- will work
INSERT INTO Settings (RefersTo, Reference)
VALUES ('G', 4) -- will not work

After thinking it over, I came to the conclusion to discard the whole idea of the one-column-multi-relation thingy.
Answer accepted: regardless of whether it is a good or bad idea, implementing it as desired is not possible :)
Thank you all for your answers and comments!

How to constrain one column with values from a column from another table?

This isn't a big deal, but my OCD is acting up with the following problem in the database I'm creating. I'm not used to working with databases, but the data has to be stored somewhere...
Problem
I have two tables A and B.
One of the data fields is common to both tables: segments. There's a finite number of segments, and I want to write queries that connect values from A to B through their segment values, very much as if the following table structure was used:
However, as you can see, the table Segments is empty. There's nothing more I want to put into that table other than the ID to give the other tables as foreign keys. I want my tables to be as simple as possible, and therefore adding another one just seems wrong.
Note also that one of these tables (A, say) is actually the master, in the sense that you should be able to put any value for segment into A, but for B one should first check with A before inserting.
EDIT
I tried one of the answers below:
create table A(
  id int primary key identity,
  segment int not null
)
create table B(
  id integer primary key identity,
  segment int not null
)
--Andomar's suggestion
alter table B add constraint FK_B_SegmentID
  foreign key (segment) references A(segment)
This produced the following error.
Maybe I was somehow unclear that segments is not-unique in A or B and can appear many times in both tables.
Msg 1776, Level 16, State 0, Line 11
There are no primary or candidate keys in the referenced table 'A' that match the referencing column list in the foreign key 'FK_B_SegmentID'.
Msg 1750, Level 16, State 0, Line 11
Could not create constraint. See previous errors.

You can create a foreign key relationship directly from B.SegmentID to A.SegmentID. There's no need for the extra table.
Update: If the SegmentIDs aren't unique in TableA, then you do need the extra table to store the segment IDs, and create foreign key relationships from both tables to this table. This however is not enough to enforce that all segment IDs in TableB also occur in TableA. You could instead use triggers.
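For the non-unique case, the trigger route could look roughly like the sketch below. It is written for SQLite through Python's sqlite3 module so that it can be run as-is; SQL Server trigger syntax is different, and the trigger name and the helpers add_a and add_b are invented for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE A (id INTEGER PRIMARY KEY, segment INTEGER NOT NULL);
CREATE TABLE B (id INTEGER PRIMARY KEY, segment INTEGER NOT NULL);

-- Refuse an insert into B when its segment does not occur anywhere in A.
CREATE TRIGGER b_segment_must_exist_in_a
BEFORE INSERT ON B
FOR EACH ROW
WHEN NOT EXISTS (SELECT 1 FROM A WHERE A.segment = NEW.segment)
BEGIN
    SELECT RAISE(ABORT, 'segment does not occur in A');
END;
"""


def open_db():
    """Create both tables and the guarding trigger in an in-memory database."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.executescript(SCHEMA)
    return conn


def add_a(conn, segment):
    """A is the master table: any segment value is accepted, repeats included."""
    conn.execute("INSERT INTO A (segment) VALUES (?)", (segment,))


def add_b(conn, segment):
    """Return True if B accepted the row, False if the trigger aborted it."""
    try:
        conn.execute("INSERT INTO B (segment) VALUES (?)", (segment,))
        return True
    except sqlite3.DatabaseError:
        return False


if __name__ == "__main__":
    db = open_db()
    add_a(db, 7)
    add_a(db, 7)          # segments may repeat in A
    print(add_b(db, 7))   # segment 7 exists in A
    print(add_b(db, 8))   # segment 8 does not
```

This sketch only guards inserts into B. Updates to B.segment, and deletes or updates in A that would orphan rows in B, would need triggers of their own.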

You can ensure the segment exists in A with a foreign key:
alter table B add constraint FK_B_SegmentID
foreign key (SegmentID) references A(SegmentID)
To avoid rows in B without a segment at all, make B.SegmentID not nullable:
alter table B alter column SegmentID int not null
There is no need to create a Segments table unless you want to associate extra data with a SegmentID.

As Andomar and Mark Byers wrote, you don't have to create an extra table.
You can also CASCADE UPDATEs or DELETEs on the master. Be very careful with ON DELETE CASCADE though!
For queries use a JOIN:
SELECT *
FROM A
JOIN B ON a.SegmentID = b.SegmentID
Edit:
You have to add a UNIQUE constraint on segment_id in the "master" table to avoid duplicates there, or else the foreign key is not possible. Like this:
ALTER TABLE A ADD CONSTRAINT UNQ_A_SegmentID UNIQUE (SegmentID);

If I've understood correctly, a given segment cannot be inserted into table B unless it has also been inserted into table A. In which case, table A should reference table Segments and table B should reference table A; it would be implicit that table B ultimately references table Segments (indirectly via table A) so an explicit reference is not required. This could be done using foreign keys (i.e. no triggers required).
Because table A has its own key I assume a given segment_ID can appear in table A more than once, therefore for B to be able to reference the segment_ID value in A then a superkey would need to be defined on the compound of A_ID and segment_ID. Here's a quick sketch:
CREATE TABLE Segments
(
  segment_ID INTEGER NOT NULL UNIQUE
);
CREATE TABLE A
(
  A_ID INTEGER NOT NULL UNIQUE,
  segment_ID INTEGER NOT NULL
    REFERENCES Segments (segment_ID),
  A_data INTEGER NOT NULL,
  UNIQUE (segment_ID, A_ID) -- superkey
);
CREATE TABLE B
(
  B_ID INTEGER NOT NULL UNIQUE,
  A_ID INTEGER NOT NULL,
  segment_ID INTEGER NOT NULL,
  FOREIGN KEY (segment_ID, A_ID)
    REFERENCES A (segment_ID, A_ID),
  B_data INTEGER NOT NULL
);
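To check how this sketch behaves, the snippet below loads the same three tables into an in-memory SQLite database through Python's sqlite3 module and tries a few inserts. The choice of SQLite, the sample values, and the helper names open_db and accepted are assumptions for illustration only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Segments (segment_ID INTEGER NOT NULL UNIQUE);
CREATE TABLE A (
    A_ID       INTEGER NOT NULL UNIQUE,
    segment_ID INTEGER NOT NULL REFERENCES Segments (segment_ID),
    A_data     INTEGER NOT NULL,
    UNIQUE (segment_ID, A_ID)            -- superkey referenced by B
);
CREATE TABLE B (
    B_ID       INTEGER NOT NULL UNIQUE,
    A_ID       INTEGER NOT NULL,
    segment_ID INTEGER NOT NULL,
    B_data     INTEGER NOT NULL,
    FOREIGN KEY (segment_ID, A_ID) REFERENCES A (segment_ID, A_ID)
);
INSERT INTO Segments VALUES (10), (20);
"""


def open_db():
    """Open an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def accepted(conn, sql, params):
    """Run an INSERT; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


INSERT_A = "INSERT INTO A (A_ID, segment_ID, A_data) VALUES (?, ?, 0)"
INSERT_B = "INSERT INTO B (B_ID, A_ID, segment_ID, B_data) VALUES (?, ?, ?, 0)"

if __name__ == "__main__":
    db = open_db()
    print(accepted(db, INSERT_A, (1, 10)))     # segment 10 is in Segments
    print(accepted(db, INSERT_A, (2, 30)))     # segment 30 is not
    print(accepted(db, INSERT_B, (1, 1, 10)))  # matches the A row (10, 1)
    print(accepted(db, INSERT_B, (2, 1, 20)))  # A row 1 is not in segment 20
```

One consequence of this design is that every B row names a specific A row, not just a segment, so it fits best when B's data really belongs to a particular row of A.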
