Best approach for cascade deleting a related entity in MS SQL

I need advice on the best way to design a DB for the following scenario.
Consider the DB structure below (it's not real; it just illustrates the problem):
File
(
Id INT PRIMARY KEY...,
Name VARCHAR(),
TypeId SMALLINT,
...
/*other common fields*/
)
FileContent
(
Id INT PRIMARY KEY...,
FileId INT NOT NULL UNIQUE FOREIGN KEY REFERENCES File(Id) ON DELETE CASCADE,
Content VARBINARY(MAX) NOT NULL,
)
Book
(
Id INT PRIMARY KEY...,
Name VARCHAR(255),
Author VARCHAR(255)
...
CoverImageId INT FOREIGN KEY REFERENCES File(Id),
)
BookPageType
(
Id TINYINT PRIMARY KEY...,
Name VARCHAR(50),
)
BookPage
(
Id INT PRIMARY KEY...,
TypeId TINYINT FOREIGN KEY REFERENCES BookPageType(Id),
BookId INT FOREIGN KEY REFERENCES Book(Id) ON DELETE CASCADE,
Name VARCHAR(100),
CreatedDate DATETIME2,
...
/*other common fields*/
)
BookPage1
(
Id INT NOT NULL PRIMARY KEY REFERENCES BookPage(Id) ON DELETE CASCADE,
FileId INT FOREIGN KEY REFERENCES File(Id),
...
/* other specific fields */
)
...
BookPageN
(
Id INT NOT NULL PRIMARY KEY REFERENCES BookPage(Id) ON DELETE CASCADE,
ImageId INT FOREIGN KEY REFERENCES File(Id),
...
/* other specific fields */
)
Now the question: I want to delete a Book with all its pages and data (this works well with ON DELETE CASCADE), but how do I make the cascade delete the associated files as well (a 1-to-1 relationship)?
I see the following approaches:
Add the file columns to every table that uses a file, but I don't want to copy the file schema into every table.
Add foreign keys to the File table (pointing to the page, for example), but since I use files in, e.g., 10 tables, I would end up with 10 foreign keys in the File table. This is also not good.
Use triggers, which I don't want to do.
Thanks in advance.

If you find you need this, it may be a sign that your database needs refactoring.
You said this example is not real, so I won't ask about the N page tables, though it looks strange. If not all files have a 1-to-1 relationship, and you need to remove only the files that no other book refers to, it sounds like a job for a trigger.
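As a rough illustration of the trigger idea, here is a minimal sketch. It uses SQLite through Python's sqlite3 module only so that it can be run as-is; the table names `file` and `book_page` and the trigger name are simplified stand-ins for the schema in the question, and a SQL Server trigger would be written differently (against the `deleted` pseudo-table). The trigger removes a file once the last page that uses it is gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book_page (
    id INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES file(id)
);
-- After a page row is deleted, remove its file unless another page still uses it.
CREATE TRIGGER book_page_delete_orphan_file
AFTER DELETE ON book_page
BEGIN
    DELETE FROM file
    WHERE id = OLD.file_id
      AND NOT EXISTS (SELECT 1 FROM book_page WHERE file_id = OLD.file_id);
END;
INSERT INTO file (id, name) VALUES (1, 'shared.png'), (2, 'single.png');
INSERT INTO book_page (id, file_id) VALUES (1, 1), (2, 1), (3, 2);
""")


def file_ids():
    """Return the ids of all rows currently in the file table."""
    return [row[0] for row in conn.execute("SELECT id FROM file ORDER BY id")]


conn.execute("DELETE FROM book_page WHERE id = 3")  # only user of file 2
after_single = file_ids()
conn.execute("DELETE FROM book_page WHERE id = 1")  # file 1 is still used by page 2
after_first_shared = file_ids()
conn.execute("DELETE FROM book_page WHERE id = 2")  # last user of file 1
after_last_shared = file_ids()
print(after_single, after_first_shared, after_last_shared)
```

The NOT EXISTS guard is what makes this safe for files that are shared between pages; with a strict 1-to-1 relationship it could be dropped.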

So what you have defined is a many-to-many relationship between BookPage and File. This is a result of the one-to-many relationship between BookPage and BookPageN combined with the one-to-many relationship between File and BookPageN. To get the relationships you describe in the text, you need to turn the relationship around to point from BookPageN to File. Maybe instead of having so many BookPageN tables you could find a way to consolidate them into a single table. Maybe just use the BookPage table and allow nulls for the fields that are optional.

Related

Postgresql Foreign Key Actions - Delete Attribute and Change Other Attributes Related to This

I created 3 tables, as in the image. Each student can be enrolled in multiple classes, so I tried to build a one-to-many relation.
What I want is this: when a student is deleted from the "Student" table, the course registrations for that student in the "Bridge" table are set to null. How can I do this in PostgreSQL (pgAdmin 4)? Can you help me, please? Thank you.

You are describing the on delete set null option of foreign key constraints. The create table statement for bridge would look like:
create table bridge (
std_id int references students(std_id) on delete set null,
class_id int references class(class_id)
);
I am unsure that set null is your best pick for such a bridge table though. This leaves "gaps" in your data that do not make a lot of sense. on delete cascade would probably make more sense - and you could apply it to both foreign keys:
create table bridge (
std_id int references students(std_id) on delete cascade,
class_id int references class(class_id) on delete cascade
);
That way, the bridge table is properly cleaned up when any parent record is dropped. This also opens the way to set up a composite primary key made of both columns in the bridge table.
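To make the cascade variant concrete, here is a small sketch that can be run without a PostgreSQL server. It uses SQLite via Python's sqlite3 module as a stand-in (an assumption for the sake of runnability; the foreign key clauses read the same in PostgreSQL), and the sample rows and the `name`/`title` columns are made up. It also adds the composite primary key mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE students (std_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE class (class_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE bridge (
    std_id INTEGER NOT NULL REFERENCES students(std_id) ON DELETE CASCADE,
    class_id INTEGER NOT NULL REFERENCES class(class_id) ON DELETE CASCADE,
    PRIMARY KEY (std_id, class_id)  -- one enrolment per student/class pair
);
INSERT INTO students (std_id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO class (class_id, title) VALUES (10, 'Math'), (20, 'Art');
INSERT INTO bridge (std_id, class_id) VALUES (1, 10), (1, 20), (2, 10);
""")

# Deleting a student removes that student's enrolments, nothing else.
conn.execute("DELETE FROM students WHERE std_id = 1")
remaining = conn.execute(
    "SELECT std_id, class_id FROM bridge ORDER BY std_id, class_id"
).fetchall()
print(remaining)
```

With on delete set null instead, the composite primary key would not be possible, because primary key columns cannot hold nulls.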

"Multiple" Foreign Key

I have tables:
MUSICIANS (musician_id, ...)
PROGRAMMERS (programmer_id, ...)
COPS (cop_id, ...)
Then I'm going to have a specific table
RICH_PEOPLE (rich_person_id, ...)
where rich_person_id is either musician_id, programmer_id or cop_id. (Assume that all the musician_ids, programmer_ids, cop_ids are different.)
Is it possible to directly create a Foreign Key on the field rich_person_id?
P.S. I would like the database to
ensure that there is a record of either MUSICIANS, PROGRAMMERS or COPS with the same id as the new RICH_PEOPLE record's rich_person_id before inserting it into RICH_PEOPLE
make deleting from either MUSICIANS, PROGRAMMERS or COPS fail (or require cascade deletion) if there is a RICH_PEOPLE record with the same id
P.P.S. I wouldn't like
creating an extra table like POSSIBLY_RICH_PEOPLE with the only field possibly_rich_person_id
creating triggers

You can create three nullable foreign keys, one to each foreign table. Then use a CHECK constraint to ensure only one value is not null at any given time.
For example:
create table rich_people (
rich_person_id int primary key not null,
musician_id int references musicians (musician_id),
programmer_id int references programmers (programmer_id),
cop_id int references cops (cop_id),
check (musician_id is not null and programmer_id is null and cop_id is null
or musician_id is null and programmer_id is not null and cop_id is null
or musician_id is null and programmer_id is null and cop_id is not null)
);
This way, referential integrity will be ensured at all times. Deletions will require cascade deletion or other strategy to keep data integrity.
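The behaviour of this design can be checked with a short sketch. SQLite through Python's sqlite3 module is used here purely as a runnable stand-in for whichever database is actually in use, and the helper `try_insert` and the sample ids are hypothetical. The CHECK expression is the same as in the statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE musicians (musician_id INTEGER PRIMARY KEY);
CREATE TABLE programmers (programmer_id INTEGER PRIMARY KEY);
CREATE TABLE cops (cop_id INTEGER PRIMARY KEY);
CREATE TABLE rich_people (
    rich_person_id INTEGER PRIMARY KEY,
    musician_id INTEGER REFERENCES musicians (musician_id),
    programmer_id INTEGER REFERENCES programmers (programmer_id),
    cop_id INTEGER REFERENCES cops (cop_id),
    CHECK (musician_id IS NOT NULL AND programmer_id IS NULL AND cop_id IS NULL
        OR musician_id IS NULL AND programmer_id IS NOT NULL AND cop_id IS NULL
        OR musician_id IS NULL AND programmer_id IS NULL AND cop_id IS NOT NULL)
);
INSERT INTO musicians VALUES (1);
INSERT INTO programmers VALUES (2);
""")


def try_insert(rich_id, musician_id, programmer_id, cop_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO rich_people VALUES (?, ?, ?, ?)",
            (rich_id, musician_id, programmer_id, cop_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


one_parent = try_insert(100, 1, None, None)        # exactly one parent set
two_parents = try_insert(101, 1, 2, None)          # CHECK rejects two parents
no_parent = try_insert(102, None, None, None)      # CHECK rejects zero parents
missing_parent = try_insert(103, None, None, 999)  # FK rejects unknown cop
print(one_parent, two_parents, no_parent, missing_parent)
```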

You can do this in a somewhat different way:
Create a table people with a person_id.
Use this key as the primary key (and foreign key) for each of your occupation tables.
Use this key as the primary key (and foreign key) for your rich_people table.
Postgres supports a concept called "inheritance", which facilitates this type of construct. Your occupation tables can "inherit" columns from people.
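A minimal sketch of the shared-key layout follows. It uses plain foreign keys rather than Postgres inheritance, and SQLite via Python's sqlite3 module as a stand-in so that it runs anywhere; the `instrument` and `net_worth` columns and the ON DELETE CASCADE choice are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY);
-- Each occupation table reuses person_id as both its primary and foreign key.
CREATE TABLE musicians (
    person_id INTEGER PRIMARY KEY REFERENCES people(person_id) ON DELETE CASCADE,
    instrument TEXT
);
CREATE TABLE rich_people (
    person_id INTEGER PRIMARY KEY REFERENCES people(person_id) ON DELETE CASCADE,
    net_worth INTEGER
);
INSERT INTO people VALUES (1);
INSERT INTO musicians VALUES (1, 'cello');
INSERT INTO rich_people VALUES (1, 1000000);
""")

# A rich person without a matching people row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO rich_people VALUES (99, 5)")
    unknown_person_accepted = True
except sqlite3.IntegrityError:
    unknown_person_accepted = False

# Deleting the person cascades to every table keyed on person_id.
conn.execute("DELETE FROM people WHERE person_id = 1")
rich_left = conn.execute("SELECT COUNT(*) FROM rich_people").fetchone()[0]
musicians_left = conn.execute("SELECT COUNT(*) FROM musicians").fetchone()[0]
print(unknown_person_accepted, rich_left, musicians_left)
```

Note that this only guarantees that a rich person exists in people; it does not by itself require that the person also has a row in one of the occupation tables.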

What is the simplest way to delete a child row when its parent is deleted, without knowing what its parent is?

Given multiple entity types:
Cluster
Hypervisor
VirtualMachine
and given properties that could belong to any one of them (but no more than one per row):
CpuInfo
CpuSpeed
CpuTotal
...
DataStore
...
What is the simplest way to delete a property with its parent?
Attempted Solutions
ON DELETE CASCADE
ON DELETE CASCADE seems to require a nullable foreign key for each possible parent, which strikes me as a poor design:
CREATE TABLE CpuInfo
(
-- Properties
Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
CpuSpeed INT,
AllocatedTotal INT,
CpuTotal INT,
AvailableTotal INT,
-- Foreign keys for all possible parents
ClusterId INT,
HypervisorId INT,
VirtualMachineId INT,
FOREIGN KEY (ClusterId) REFERENCES Cluster(Id) ON DELETE CASCADE,
FOREIGN KEY (HypervisorId) REFERENCES Hypervisor(Id) ON DELETE CASCADE,
FOREIGN KEY (VirtualMachineId) REFERENCES VirtualMachine(Id) ON DELETE CASCADE
);
Junction Tables with Triggers
Parents are related to properties through junction tables. For example:
CREATE TABLE HypervisorCpuInfo
(
HypervisorId INT NOT NULL,
CpuInfoId INT NOT NULL,
FOREIGN KEY (HypervisorId) REFERENCES Hypervisor(Id),
FOREIGN KEY (CpuInfoId) REFERENCES CpuInfo(Id) ON DELETE CASCADE
);
There is then a DELETE trigger for each entity type. The trigger selects the IDs of the entity's properties and deletes them. When the properties are deleted, the child junction rows are then deleted as well, via ON DELETE CASCADE.
This doesn't model the business rules very well, though, since it allows the same CpuInfo to belong to multiple entities. It also adds a lot of tables to the design.
Is there a simpler solution?

I think a "junction table" might be fitting for DRYness (it isn't a real junction because of the 1:n relation)
You could call your "junction table" a "super table" (something like "machine" [sorry I'm not native]):
In this table you put all the keys to your properties (make each foreign key column unique to ensure 1:1*). The very type of your "machine" (Cluster,Hypervisor,VirtualMachine) is in the "triple key" you already tried - also in the super-table.
To ensure "machine" is only of one entity add a constraint:
ALTER TABLE CpuInfo WITH CHECK ADD CONSTRAINT [CK_keyIDs] CHECK (
(ClusterId IS NULL AND HypervisorId IS NULL AND VirtualMachineId IS NOT NULL)
OR (ClusterId IS NULL AND HypervisorId IS NOT NULL AND VirtualMachineId IS NULL)
OR (ClusterId IS NOT NULL AND HypervisorId IS NULL AND VirtualMachineId IS NULL));
GO
The good thing is you are quite free with your entities, you could allow a PC to be a Cluster at the same time.
*the key-column! the ID already has to be unique

Where do you store ad-hoc properties in a relational database?

Let's say you have a relational DB table like INVENTORY_ITEM. It's generic in the sense that anything that's in inventory needs a record here. Now let's say there are tons of different types of inventory, and each type might have unique fields that it needs to keep track of (e.g. forks might track the number of tines, but refrigerators wouldn't have a use for that field). These fields must be user-definable per category type.
There are many ways to solve this:
1. Use ALTER TABLE statements to actually add nullable columns on the fly (yuk)
2. Have two tables with a one-to-one mapping, INVENTORY_ITEM, and INVENTORY_ITEM_USER, and use ALTER TABLE statements to add and remove nullable columns from the latter table on the fly (a bit nicer).
3. Add a CUSTOM_PROPERTY table, and a CUSTOM_PROPERTY_VALUE table, and add/remove rows in CUSTOM_PROPERTY when the user adds and removes rows, and store the values in the latter table. This is nice and generic, but the performance would suffer. If you had an average of 20 values per item, the number of rows in CUSTOM_PROPERTY_VALUE goes up at 20 times the rate, and you still need to include columns in CUSTOM_PROPERTY_VALUE for every different data type that you might want to store.
4. Have one big varchar(MAX) field on INVENTORY_ITEM to store custom properties as XML.
5. I guess you could have individual tables for each category type that hang off the INVENTORY_ITEM table, and these get created/destroyed on the fly when the user creates inventory types, and the columns get updated when they add/remove properties to those types. Seems messy though.
Is there a best-practice for this? It seems to me that option 4 is clean, but doesn't allow you to easily search by the metadata. I've used a variant of 3 before, but only on a table that had a really small number of rows, so performance wasn't an issue. It always seemed to me that 2 was a good idea, but it doesn't fit well with auto-generated entity frameworks, so you'd have to exclude the custom properties table from the entity generation and just write your own custom data access code to handle it.
Am I missing any alternatives? Is there a way for SQL server to "look into" XML data in a column so it could actually do stuff with option 4 now?

I am using the xml type column for this kind of situations...
http://msdn.microsoft.com/en-us/library/ms189887.aspx
Before XML, we had to use option 3, which in my view is still a good way to do it, especially if you have a data access layer that can handle the type conversion properly for you. We stored everything as string values and defined a column that held the original data type for the conversion.
Options 1 and 2 are a no-go. Don't change the database schema in production on the fly.
Option 5 could be done in a separate database... But still no control over the schema and the user would need the rights to create tables etc.

Definitely the 3.
Sometimes 4 if you have a very good reason to do so.
Do not ever dynamically modify database structure to accommodate for incoming data. One day something could break and damage your database. It is simply not done this way.

3 or 4 are the only ones I would consider - you don't want to be changing the schema on the fly, especially if you're using some kind of mapping layer.
I've generally gone with option 3. As a bit of sanity, I always have a type column in the CUSTOM_PROPERTY table, which is repeated in the CUSTOM_PROPERTY_VALUE table. By adding a superkey to the CUSTOM_PROPERTY table of <Primary Key, Type>, you can then have a foreign key that references this (as well as the simpler foreign key to just the primary key). And finally, a check constraint that ensures that only the relevant column in CUSTOM_PROPERTY_VALUE is not null, based on this type column.
In this way, you know that if someone has defined a CUSTOM_PROPERTY, say, Tine count, of type int, that you're actually only ever going to find an int stored in the CUSTOM_PROPERTY_VALUE table, for all instances of this property.
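The superkey and check constraint described above can be sketched in a few lines. This is a simplified illustration, not the full schema: it uses SQLite via Python's sqlite3 module as a runnable stand-in for SQL Server, supports only two value types, and the column names and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE custom_property (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    type TEXT NOT NULL CHECK (type IN ('int', 'text')),
    UNIQUE (id, type)  -- superkey targeted by the composite foreign key
);
CREATE TABLE custom_property_value (
    item_id INTEGER NOT NULL,
    property_id INTEGER NOT NULL,
    type TEXT NOT NULL,  -- repeated from custom_property
    int_value INTEGER,
    char_value TEXT,
    PRIMARY KEY (item_id, property_id),
    FOREIGN KEY (property_id, type) REFERENCES custom_property (id, type),
    -- Only the column that matches the declared type may be non-null.
    CHECK ((int_value IS NULL OR type = 'int')
       AND (char_value IS NULL OR type = 'text'))
);
INSERT INTO custom_property (id, name, type) VALUES (1, 'Tine count', 'int');
""")


def try_insert(item_id, type_, int_value, char_value):
    """Try to store a value for property 1; return False if a constraint fails."""
    try:
        conn.execute(
            "INSERT INTO custom_property_value "
            "(item_id, property_id, type, int_value, char_value) "
            "VALUES (?, 1, ?, ?, ?)",
            (item_id, type_, int_value, char_value),
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok_int = try_insert(100, "int", 4, None)             # matches the declared type
wrong_type = try_insert(101, "text", None, "four")   # FK: property 1 is not 'text'
wrong_column = try_insert(102, "int", None, "four")  # CHECK: wrong value column
print(ok_int, wrong_type, wrong_column)
```

The composite foreign key stops a value row from claiming a different type than its property, and the check constraint ties the populated column to that type.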
Edit
If you need it to reference multiple entity tables, then it can get more complex, especially if you want full referential integrity. For instance (with two distinct entity types in the database):
create table dbo.Entities (
EntityID uniqueidentifier not null,
EntityType varchar(10) not null,
constraint PK_Entities PRIMARY KEY (EntityID),
constraint CK_Entities_KnownTypes CHECK (
EntityType in ('Foo','Bar')),
constraint UQ_Entities_KnownTypes UNIQUE (EntityID,EntityType)
)
go
create table dbo.Foos (
EntityID uniqueidentifier not null,
EntityType as CAST('Foo' as varchar(10)) persisted,
FooFixedProperty1 int not null,
FooFixedProperty2 varchar(150) not null,
constraint PK_Foos PRIMARY KEY (EntityID),
constraint FK_Foos_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
constraint FK_Foos_Entities_Type FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType)
)
go
create table dbo.Bars (
EntityID uniqueidentifier not null,
EntityType as CAST('Bar' as varchar(10)) persisted,
BarFixedProperty1 float not null,
BarFixedProperty2 int not null,
constraint PK_Bars PRIMARY KEY (EntityID),
constraint FK_Bars_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
constraint FK_Bars_Entities_Type FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType)
)
go
create table dbo.ExtendedProperties (
PropertyID uniqueidentifier not null,
PropertyName varchar(100) not null,
PropertyType int not null,
constraint PK_ExtendedProperties PRIMARY KEY (PropertyID),
constraint CK_ExtendedProperties CHECK (
PropertyType between 1 and 4), --Or make type a varchar, and change check to IN('int', 'float'), etc
constraint UQ_ExtendedProperty_Names UNIQUE (PropertyName),
constraint UQ_ExtendedProperties_Types UNIQUE (PropertyID,PropertyType)
)
go
create table dbo.PropertyValues (
EntityID uniqueidentifier not null,
PropertyID uniqueidentifier not null,
PropertyType int not null,
IntValue int null,
FloatValue float null,
DecimalValue decimal(15,2) null,
CharValue varchar(max) null,
EntityType varchar(10) not null,
constraint PK_PropertyValues PRIMARY KEY (EntityID,PropertyID),
constraint FK_PropertyValues_ExtendedProperties FOREIGN KEY (PropertyID) references dbo.ExtendedProperties (PropertyID) on delete cascade,
constraint FK_PropertyValues_ExtendedProperty_Types FOREIGN KEY (PropertyID,PropertyType) references dbo.ExtendedProperties (PropertyID,PropertyType),
constraint FK_PropertyValues_Entities FOREIGN KEY (EntityID) references dbo.Entities (EntityID) on delete cascade,
constraint FK_PropertyValues_Entitiy_Types FOREIGN KEY (EntityID,EntityType) references dbo.Entities (EntityID,EntityType),
constraint CK_PropertyValues_OfType CHECK (
(IntValue is null or PropertyType = 1) and
(FloatValue is null or PropertyType = 2) and
(DecimalValue is null or PropertyType = 3) and
(CharValue is null or PropertyType = 4)),
--Shoot for bonus points
FooID as CASE WHEN EntityType='Foo' THEN EntityID END persisted,
constraint FK_PropertyValues_Foos FOREIGN KEY (FooID) references dbo.Foos (EntityID),
BarID as CASE WHEN EntityType='Bar' THEN EntityID END persisted,
constraint FK_PropertyValues_Bars FOREIGN KEY (BarID) references dbo.Bars (EntityID)
)
go
--Now we wrap up inserts into the Foos, Bars and PropertyValues tables as either Stored Procs, or instead of triggers
--To get the proper additional columns and/or base tables populated

My inclination would be to store things as XML if the database supports that nicely, or else have a small number of different tables for different data types (try to format data so it will fit one of a small number of types--don't use one table for VARCHAR(15), another for VARCHAR(20), etc.) Something like #5, but with all tables pre-created, and everything shoehorned into the existing tables. Each row should hold a main-record ID, record-type indicator, and a piece of data. Set up an index based on record-type, subsorted by data, and it will be possible to query for particular field values (where RecType==19 and Data=='Fred'). Querying for records that match multiple field values would be harder, but such is life.

MySQL Lookup table and id/keys

Hoping someone can shed some light on this: Do lookup tables need their own ID?
For example, say I have:
Table users: user_id, username
Table categories: category_id, category_name
Table users_categories: user_id, category_id
Would each row in "users_categories" need an additional ID field? What would the primary key of said table be? Thanks.

You have a choice. The primary key can be either:
A new, otherwise meaningless INTEGER column.
A key made up of both user_id and category_id.
I prefer the first solution but I think you'll find a majority of programmers here prefer the second.

You could create a composite key that uses both keys.
Normally, if there is no suitable single key in a table, you create a composite key made up of 2 or more fields, for example:
CREATE TABLE topic_replies (
topic_id int unsigned not null,
id int unsigned not null auto_increment,
user_id int unsigned not null,
message text not null,
PRIMARY KEY(topic_id, id));
Therefore, in your case you could add the following:
ALTER TABLE users_categories ADD PRIMARY KEY (user_id, category_id);
Then, whenever you want to reference a certain row, all you need to do is pass the two key values from your other tables. To link them, however, each column needs to be declared as a foreign key:
ALTER TABLE users_categories ADD CONSTRAINT fk_1 FOREIGN KEY (category_id) REFERENCES categories (category_id);
ALTER TABLE users_categories ADD CONSTRAINT fk_2 FOREIGN KEY (user_id) REFERENCES users (user_id);
Creating a new primary key column in your users_categories table is also an option. Just know that it's not always necessary.

If your users_categories table has a unique primary key over (user_id, category_id), then - no, not necessarily.
Only if you
want to refer to single rows of that table from someplace else easily
have more than one equal user_id, category_id combination
you could benefit from a separate ID field.
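To show what the composite primary key buys you, here is a short sketch. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained), with made-up sample rows. The key on (user_id, category_id) rejects a repeated pair, so the table needs no extra id column to stay free of duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE users_categories (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    PRIMARY KEY (user_id, category_id)
);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO categories VALUES (7, 'books'), (8, 'music');
INSERT INTO users_categories VALUES (1, 7);
""")

# The same user may be linked to another category...
conn.execute("INSERT INTO users_categories VALUES (1, 8)")

# ...but repeating an existing pair violates the composite primary key.
try:
    conn.execute("INSERT INTO users_categories VALUES (1, 7)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

pairs = conn.execute(
    "SELECT user_id, category_id FROM users_categories ORDER BY category_id"
).fetchall()
print(duplicate_accepted, pairs)
```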

Every table needs a primary key and unique ID in SQL no matter what. Just make it users_categories_id, you technically never have to use it but it has to be there.
