How do I enforce referential integrity on a Type 6 SCD dimension table?

I'm having difficulties with designing the Primary and Foreign key relationship between my fact table and a Type 6 SCD Dimension table.
The dimension table has the following definition:
CREATE TABLE DimTable
(
    surrogate_key INT,
    row_key INT IDENTITY (1,1),
    natural_key INT NOT NULL,
    current_value INT NOT NULL,
    historic_value INT NOT NULL,
    is_current BIT NOT NULL,
    record_start_date_id INT NOT NULL,
    record_end_date_id INT NOT NULL,
    -- Primary Key
    CONSTRAINT pk_dimtable_surrogate_key_row_key PRIMARY KEY (surrogate_key, row_key)
);
A sample of what the data looks like:
surrogate_key | row_key | natural_key | current_value | historic_value | is_current | record_start_date_id | record_end_date_id
-------------------------------------------------------------------------------------------------------------------------------
121 | 2591227 | 123456 | 20090807 | 20090807 | 0 | 20180807 | 99991231
121 | 2591228 | 123456 | 20140807 | 20090807 | 0 | 20180807 | 99991231
121 | 2591229 | 123456 | 20141107 | 20140807 | 1 | 20180807 | 99991231
122 | 2591230 | 456789 | 20090807 | 20090807 | 1 | 20180807 | 99991231
From my understanding of the Wikipedia page, I should be able to enforce referential integrity through a PK/FK relationship. However, the master surrogate key is not unique across this table, so I don't know how to point the surrogate_id in my fact table to the surrogate_key with an FK constraint.
Is there any way around this limitation, or do I understand the description wrong?
Btw, this is my first time asking a question here, so if anything is unclear or missing please let me know!
EDIT: Column names are generic dummy names. The actual column names are more descriptive.

I believe you have misunderstood the concept of a surrogate key. In your table, the row_key attribute is the one that actually behaves like a surrogate key, because it identifies a single row.
I suggest reading up on surrogate keys before going further. You may need to make a lot of changes to your process.
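As a rough illustration of that suggestion, the fact table could reference row_key instead of surrogate_key. This is only a sketch in SQL Server syntax: FactTable, dim_row_key, measure_value and the constraint names are hypothetical and do not come from the question, and it assumes row_key values are in fact unique, as they are in the sample data.

```sql
-- Sketch only (SQL Server syntax). FactTable, dim_row_key, measure_value and
-- the constraint names are hypothetical, not taken from the question.

-- A foreign key must reference a PRIMARY KEY or UNIQUE constraint, so declare
-- row_key unique on its own (assumes no duplicate row_key values exist).
ALTER TABLE DimTable
    ADD CONSTRAINT uq_dimtable_row_key UNIQUE (row_key);

CREATE TABLE FactTable
(
    fact_id       INT IDENTITY (1,1) NOT NULL,
    dim_row_key   INT NOT NULL,  -- points at one specific version of a dimension member
    measure_value INT NOT NULL,
    CONSTRAINT pk_facttable PRIMARY KEY (fact_id),
    CONSTRAINT fk_facttable_dimtable
        FOREIGN KEY (dim_row_key) REFERENCES DimTable (row_key)
);
```

With this layout, surrogate_key still groups all versions of the same member, so a fact row can reach the current values by joining through row_key and then filtering on is_current for the same surrogate_key.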

Related

MariaDB foreign key auto generated index not created for the first column of PK

I'm facing a question I can't find an answer to. I can't understand why the index that is normally auto-generated when a FK is created is missing when the column is the first one in the PK. Here is what I mean:
Create a simple schema with:
CREATE TABLE cat (name VARCHAR(255) PRIMARY KEY);
CREATE TABLE dog (name VARCHAR(255) PRIMARY KEY);
CREATE TABLE cat_dog_couple
(
cat_name VARCHAR(255),
dog_name VARCHAR(255),
PRIMARY KEY (cat_name, dog_name),
CONSTRAINT fk__cat_dog_couple__cat_name FOREIGN KEY (cat_name) references cat(name),
CONSTRAINT fk__cat_dog_couple__dog_name FOREIGN KEY (dog_name) references dog(name)
);
These indexes will be generated:
+----------------+------------+------------------------------+--------------+-------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
+----------------+------------+------------------------------+--------------+-------------+
| cat_dog_couple | 0 | PRIMARY | 1 | cat_name |
| cat_dog_couple | 0 | PRIMARY | 2 | dog_name |
| cat_dog_couple | 1 | fk__cat_dog_couple__dog_name | 1 | dog_name |
+----------------+------------+------------------------------+--------------+-------------+
I don't really understand why the index fk__cat_dog_couple__cat_name is not created.
Is it a bug? A technical limitation? A technical choice?
Tested on MariaDB 10.4.x and 10.5.x.
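One likely explanation, offered as an assumption rather than a confirmed answer: InnoDB only needs a foreign key's columns to be the leftmost columns of some index. The PRIMARY key (cat_name, dog_name) already starts with cat_name, so it can serve that foreign key, while dog_name is not a leftmost prefix and gets its own index. A sketch to check this is to reverse the column order in the primary key. The table and constraint names below are hypothetical.

```sql
-- Sketch only (MariaDB/MySQL syntax); table and constraint names are hypothetical.
-- Same schema as above, but with the primary key columns in the opposite order.
CREATE TABLE cat_dog_couple_reversed
(
    cat_name VARCHAR(255),
    dog_name VARCHAR(255),
    PRIMARY KEY (dog_name, cat_name),
    CONSTRAINT fk__cdc_reversed__cat_name FOREIGN KEY (cat_name) REFERENCES cat(name),
    CONSTRAINT fk__cdc_reversed__dog_name FOREIGN KEY (dog_name) REFERENCES dog(name)
);

-- If the leftmost-prefix explanation holds, the extra index should now appear
-- for cat_name rather than for dog_name.
SHOW INDEX FROM cat_dog_couple_reversed;
```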

Supertype & Subtypes and one to one relationship

I have the following supertype/multiple subtypes tables in SQL Server
supertype: Doctor and subtypes: Paediatrician, Orthopedic and Dentist
create table Doctor
(
    DoctorID int primary key,
    Name varchar(100)
    -- add other attributes common to all doctors here.
);
create table Paediatrician
(
    PaediatricianId int primary key,
    DoctorID int foreign key references Doctor(DoctorID)
    -- add some other attributes related to Paediatrician here.
);
create table Orthopedic
(
    OrthopedicId int primary key,
    DoctorID int foreign key references Doctor(DoctorID)
    -- add some other attributes related to Orthopedic here.
);
create table Dentist
(
    DentistId int primary key,
    DoctorID int foreign key references Doctor(DoctorID)
    -- add some other attributes related to Dentist here.
);
My business logic is that a doctor can be either a Paediatrician, a Dentist or an Orthopedic, but never more than one of the subtypes. The above design does not enforce this: I can create a Doctor with Id = 1 and then go to the Dentist and Orthopedic tables and assign a DoctorID value of 1 in both. How do I enforce that a doctor can be present in only one subtype table?
I would arrange this bit differently. I would have 3 tables, a Doctor table (like you already have), a Specialist table and a SpecialistAttributes table.
The Doctor table contains all the Doctors' info, easy.
The Specialist Table contains your SpecialistTypeID and SpecialistDescription etc.
Your 3 example specialists would each be a row in this table.
The SpecialistAttributes table contains all the attributes needed for the specialists. In your Doctor table, you have a foreign key to look up the SpecialistTypeID, so there can be only one, and each SpecialistType then has a number of SpecialistAttributes it can link to.
The other benefit of organising your data this way is that if you need to add any specialist roles or attributes, you don't need to change the structure of your database; you just add more rows.
Doctor Table
| ID | Name | Specialist_FK |
---------------------------------
| 1 | Smith | 2 |
| 2 | Davies | 3 |
| 3 | Jones | 3 |
Specialist Table
| ID | Speciality |
----------------------
| 1 | Paediatrician |
| 2 | Orthopedic |
| 3 | Dentist |
SpecialistAttribute Table
| ID | SpecialityID_FK | Description | Other |
------------------------------------------------------------
| 1 | 1 | Paediatrician Info 1 | Other Info |
| 2 | 1 | Paediatrician Info 2 | Other Info |
| 3 | 2 | Orthopedic Info 1 | Other Info |
| 4 | 2 | Orthopedic Info 2 | Other Info |
| 5 | 3 | Dentist Info 1 | Other Info |
| 6 | 3 | Dentist Info 2 | Other Info |
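The three tables above could be declared roughly as follows. This is a sketch in SQL Server syntax: the column types and lengths are assumptions, and the column names follow the sample tables, with the foreign key column of the attribute table written as SpecialityID_FK.

```sql
-- Sketch only (SQL Server syntax); column types and lengths are assumptions.
CREATE TABLE Specialist
(
    ID         INT PRIMARY KEY,
    Speciality VARCHAR(100) NOT NULL
);

CREATE TABLE Doctor
(
    ID            INT PRIMARY KEY,
    Name          VARCHAR(100),
    -- A single NOT NULL column means each doctor has exactly one speciality.
    Specialist_FK INT NOT NULL FOREIGN KEY REFERENCES Specialist (ID)
);

CREATE TABLE SpecialistAttribute
(
    ID              INT PRIMARY KEY,
    SpecialityID_FK INT NOT NULL FOREIGN KEY REFERENCES Specialist (ID),
    Description     VARCHAR(200),
    Other           VARCHAR(200)
);
```

Because the speciality lives in one column on Doctor, the "only one subtype" rule is enforced by the table shape itself rather than by extra code.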
There is no built-in constraint or feature in SQL Server that handles this directly. You need to write custom logic for it, either in a stored procedure or in a trigger.
You can write a stored procedure that is responsible for inserting into these tables. Before inserting, it checks whether the doctor ID already exists in any of the subtype tables. If it does, the procedure raises a custom error; otherwise it inserts the record into the respective table.
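A minimal sketch of such a procedure for the Dentist table, in SQL Server syntax. The procedure name, parameter names, error number and message are hypothetical, and equivalent procedures would be needed for Paediatrician and Orthopedic. It only helps if every insert goes through the procedure, and as written it does not guard against two sessions inserting the same doctor at the same moment.

```sql
-- Sketch only (SQL Server syntax). Procedure name, parameters and the error
-- number/message are hypothetical.
CREATE PROCEDURE dbo.InsertDentist
    @DentistId INT,
    @DoctorID  INT
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the insert if the doctor already appears in any subtype table.
    IF EXISTS (SELECT 1 FROM Paediatrician WHERE DoctorID = @DoctorID)
       OR EXISTS (SELECT 1 FROM Orthopedic WHERE DoctorID = @DoctorID)
       OR EXISTS (SELECT 1 FROM Dentist WHERE DoctorID = @DoctorID)
    BEGIN
        THROW 50001, 'This doctor is already assigned to a subtype.', 1;
    END;

    -- No conflict found, so insert into the requested subtype table.
    INSERT INTO Dentist (DentistId, DoctorID)
    VALUES (@DentistId, @DoctorID);
END;
```

It would be called as, for example, EXEC dbo.InsertDentist @DentistId = 10, @DoctorID = 1;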

SQL: What does UNIQUE mean when creating a primary key?

I've got a table called students:
+------------+------------+-----------+---------------------+---------------------+
| student_id | first_name | surname | email | reg_date |
+------------+------------+-----------+---------------------+---------------------+
| 1 | Emily | Jackson | emilyj@gmail.com | 2012-10-14 11:14:13 |
| 2 | Daniel | ALexander | daniela@hotmail.com | 2014-08-19 08:08:23 |
| 3 | Sarah | Bell | sbell@gmail.com | 1998-07-04 13:16:32 |
| 4 | Alex | Harte | AHarte@hotmail.com | 1982-06-14 00:00:00 |
+------------+------------+-----------+---------------------+---------------------+
When creating the table:
CREATE TABLE students(
    student_id INT NOT NULL AUTO_INCREMENT,
    first_name VARCHAR(30) NOT NULL,
    surname VARCHAR(50) NOT NULL,
    email VARCHAR(200) NOT NULL,
    reg_date DATETIME NOT NULL,
    PRIMARY KEY (student_id),
    UNIQUE (email));
What does the 'UNIQUE (email)' mean? Does it mean if the primary key isn't unique, look at the email to see if that's unique instead? Or something different?
Thanks
The UNIQUE keyword creates a unique constraint on the columns that are mentioned in its argument list (in this case, email). It does not interfere with the primary key. It will enforce unique values on the email column, that is, fail with an exception when a row is about to be INSERTed (or UPDATEd) that would collide with an existing row.
A primary key (by default) implies a unique constraint. So as you designate student_id as your primary key, the RDBMS will also automatically maintain unique values in that column for you.
Further reading: http://www.w3schools.com/sql/sql_unique.asp
It allows the engine to use it as an index in queries and enforces uniqueness whenever records are inserted or updated, raising a unique key constraint violation when an already existing email is inserted or updated.
Example: http://sqlfiddle.com/#!9/7a0aee
More Information: http://dev.mysql.com/doc/refman/5.7/en/partitioning-limitations-partitioning-keys-unique-keys.html
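To make the behaviour described in the answers above concrete, here is a small sketch against the students table from the question (MySQL syntax). The names and the example.com address are made up for illustration.

```sql
-- Sketch only (MySQL syntax); the inserted values are made up for illustration.

-- First insert succeeds: the email is not in the table yet.
INSERT INTO students (first_name, surname, email, reg_date)
VALUES ('Test', 'Student', 'test.student@example.com', '2015-01-01 09:00:00');

-- Second insert gets a new student_id from AUTO_INCREMENT, so the primary key
-- is fine, but it reuses the email and is therefore rejected with a
-- duplicate-entry error (error 1062 in MySQL).
INSERT INTO students (first_name, surname, email, reg_date)
VALUES ('Other', 'Student', 'test.student@example.com', '2015-01-02 09:00:00');
```

The two constraints are checked independently: a row must satisfy both the primary key on student_id and the unique constraint on email.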

Are there problems with this 'Soft Delete' solution using EAV tables?

I've read some information about the ugly side of just setting a deleted_at field in your tables to signify a row has been deleted.
Namely
http://richarddingwall.name/2009/11/20/the-trouble-with-soft-delete/
Are there any potential problems with taking a row from a table you want to delete and pivoting it into some EAV tables?
For instance.
Let's say I have two tables, deleted and deleted_rows, described as follows.
mysql> describe deleted;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| tablename | varchar(255) | YES | | NULL | |
| deleted_at | timestamp | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
mysql> describe deleted_rows;
+--------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| entity | int(11) | YES | MUL | NULL | |
| name | varchar(255) | YES | | NULL | |
| value | blob | YES | | NULL | |
+--------+--------------+------+-----+---------+----------------+
Now, when you want to delete a row from any table, you delete it from that table and then insert it into these tables like so.
deleted
+----+-----------+---------------------+
| id | tablename | deleted_at |
+----+-----------+---------------------+
| 1 | products | 2011-03-23 00:00:00 |
+----+-----------+---------------------+
deleted_rows
+----+--------+-------------+-------------------------------+
| id | entity | name | value |
+----+--------+-------------+-------------------------------+
| 1 | 1 | Title | A Great Product |
| 2 | 1 | Price | 55.00 |
| 3 | 1 | Description | You guessed it... it's great. |
+----+--------+-------------+-------------------------------+
A few things I see off the bat:
- You'll need to use application logic to do the pivot (Ruby, PHP, Python, etc.).
- The table could grow pretty big because I'm using blob to handle the unknown size of the row value.
Do you see any other glaring problems with this type of soft delete?
Why not mirror your tables with archive tables?
create table mytable(
    col_1 int
   ,col_2 varchar(100)
   ,col_3 date
   ,primary key(col_1)
);
create table mytable_deleted(
    delete_id int not null auto_increment
   ,delete_dtm datetime not null
    -- All of the original columns
   ,col_1 int
   ,col_2 varchar(100)
   ,col_3 date
   ,index(col_1)
   ,primary key(delete_id)
);
And then simply add on-delete triggers to your tables that insert the current row into the mirrored table before the deletion. That would give you a dead-simple and very performant solution.
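The on-delete trigger for the mytable example might look like the following sketch (MySQL syntax). The trigger name is hypothetical, and using NOW() for the delete time is an assumption.

```sql
-- Sketch only (MySQL syntax); the trigger name is hypothetical.
-- Copies the row into the archive table just before it is deleted.
CREATE TRIGGER mytable_before_delete
BEFORE DELETE ON mytable
FOR EACH ROW
    INSERT INTO mytable_deleted (delete_dtm, col_1, col_2, col_3)
    VALUES (NOW(), OLD.col_1, OLD.col_2, OLD.col_3);
```

Since the trigger body is a single statement, no BEGIN ... END block or DELIMITER change is needed. delete_id is filled in by AUTO_INCREMENT.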
You could actually generate the tables and trigger code using the data dictionary.
Note that you might not want a unique index on the original primary key (col_1) in the archive table, because you may end up deleting the same row twice over time if you are using natural keys. Unless you plan to hook up the archive tables in your application (for undo purposes), you can drop the index entirely. Also, I added the time of deletion (delete_dtm) and a surrogate key that can be used to delete the deleted (hehe) rows.
You may also consider range partitioning the archive table on delete_dtm. This makes it pretty much effortless to purge data from the tables.

How to change a primary key in SQL to auto_increment?

I have a table in MySQL that has a primary key:
mysql> desc gifts;
+---------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+-------+
| giftID | int(11) | NO | PRI | NULL | |
| name | varchar(80) | YES | | NULL | |
| filename | varchar(80) | YES | | NULL | |
| effectiveTime | datetime | YES | | NULL | |
+---------------+-------------+------+-----+---------+-------+
but I wanted to make it auto_increment.
The following statement failed. How can it be modified so that it works? Thanks
mysql> alter table gifts modify giftID int primary key auto_increment;
ERROR 1068 (42000): Multiple primary key defined
Leave off the primary key attribute:
ALTER TABLE gifts MODIFY giftID INT AUTO_INCREMENT;
Certain column attributes, such as PRIMARY KEY, aren't exactly properties of the column so much as shortcuts for other things. A column marked PRIMARY KEY, for example, is placed in the PRIMARY index. Furthermore, all columns in the PRIMARY index are given the NOT NULL attribute. (Aside: to have a multi-column primary key, you must use a separate constraint clause rather than multiple PRIMARY KEY column attributes.) Since the column is already in the PRIMARY index, you don't need to specify it again when you modify the column. Try SHOW CREATE TABLE gifts; to see the effects of using the PRIMARY KEY attribute.
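As a small illustration of the aside about multi-column primary keys, the sketch below uses a separate constraint clause (MySQL syntax). The gift_tags table and its columns are hypothetical and not part of the question.

```sql
-- Sketch only (MySQL syntax); gift_tags and its columns are hypothetical.
-- A multi-column primary key is declared in its own clause, not by marking
-- each column with the PRIMARY KEY attribute.
CREATE TABLE gift_tags (
    giftID INT NOT NULL,
    tag    VARCHAR(40) NOT NULL,
    PRIMARY KEY (giftID, tag)
);
```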