Database Normalization using Foreign Key - sql

I have a sample table like below where Course Completion Status of a Student is being stored:
Create Table StudentCourseCompletionStatus
(
CourseCompletionID int primary key identity(1,1),
StudentID int not null,
AlgorithmCourseStatus nvarchar(30),
DatabaseCourseStatus nvarchar(30),
NetworkingCourseStatus nvarchar(30),
MathematicsCourseStatus nvarchar(30),
ProgrammingCourseStatus nvarchar(30)
)
Insert into StudentCourseCompletionStatus Values (1, 'In Progress', 'In Progress', 'Not Started', 'Completed', 'Completed')
Insert into StudentCourseCompletionStatus Values (2, 'Not Started', 'In Progress', 'Not Started', 'Not Applicable', 'Completed')
Now as part of normalizing the schema I have created two other tables - CourseStatusType and Status for storing the Course Status names and Status.
Create Table CourseStatusType
(
CourseStatusTypeID int primary key identity(1,1),
CourseStatusType nvarchar(100) not null
)
Insert into CourseStatusType Values ('AlgorithmCourseStatus')
Insert into CourseStatusType Values ('DatabaseCourseStatus')
Insert into CourseStatusType Values ('NetworkingCourseStatus')
Insert into CourseStatusType Values ('MathematicsCourseStatus')
Insert into CourseStatusType Values ('ProgrammingCourseStatus')
Insert into CourseStatusType Values ('OperatingSystemsCourseStatus')
Insert into CourseStatusType Values ('CompilerCourseStatus')
Create Table Status
(
StatusID int primary key identity(1,1),
StatusName nvarchar (100) not null
)
Insert into Status Values ('Completed')
Insert into Status Values ('Not Started')
Insert into Status Values ('In Progress')
Insert into Status Values ('Not Applicable')
The modified table is as below:
Create Table StudentCourseCompletionStatus1
(
CourseCompletionID int primary key identity(1,1),
StudentID int not null,
CourseStatusTypeID int not null CONSTRAINT [FK_StudentCourseCompletionStatus1_CourseStatusType] FOREIGN KEY (CourseStatusTypeID) REFERENCES dbo.CourseStatusType (CourseStatusTypeID),
StatusID int not null CONSTRAINT [FK_StudentCourseCompletionStatus1_Status] FOREIGN KEY (StatusID) REFERENCES Status (StatusID),
)
I have few question on this:
Is this the correct way to normalize it ? The old table was very helpful to get data easily - I can store a student's course status in a single row, but now 5 rows are required. Is there a better way to do it?
Moving the data from the old table to this new table seems to be not an easy task. Can I achieve this using a query or I have to manually to do this ?
Any help is appreciated.

vou could also consider storing results in flat table like this:
studentID,courseID,status
1,1,"completed"
1,2,"not started"
2,1,"not started"
2,3,"in progress"
you will also need additional Courses table like this
courserId,courseName
1, math
2, programming
3, networking
and a students table
students
1 "john smith"
2 "perry clam"
3 "john deere"
etc..you could also optionally create a status table to store the distinct statusstrings statusstings and refer to their PK instead ofthestrings
studentID,courseID,status
1,1,1
1,2,2
2,1,2
2,3,3
... etc
and status table
id,status
1,"completed"
2,"not started"
3,"in progress"
the beauty of this representation is: it is quite easy to filter and aggregate data , i.e it is easy to query which subjects a particular person have completed, how many subjects are completed by an average student, etc. this things are much more difficult in the columnar design like you had. you can also easily add new subjects without the need to adapt your tables or even queries they,will just work.
you can also always usin SQLs PIVOT query to get it to a familiar columnar presentation like
name,mathstatus,programmingstatus,networkingstatus,etc..

but now 5 rows are required
No, it's still just one row. That row simply contains identifiers for values stored in other tables.
There are pros and cons to this. One of the main reasons to normalize in this way is to protect the integrity of the data. If a column is just a string then anything can be stored there. But if there's a foreign key relationship to a table containing a finite set of values then only one of those options can be stored there. Additionally, if you ever want to change the text of an option or add/remove options, you do it in a centralized place.
Moving the data from the old table to this new table seems to be not an easy task.
No problem at all. Create your new numeric columns on the data table and populate them with the identifiers of the lookup table records associated with each data table record. If they're nullable, you can make them foreign keys right away. If they're not nullable then you need to populate them before you can make them foreign keys. Once you've verified that the data is correct, remove the old de-normalized columns. Done.

In StudentCourseCompletionStatus1 you still need 2 associations to Status and CourseStatusType. So I think you should consider following variant of normalization:
It means, that your StudentCourseCompletionStatus would hold only one CourseStatusID and another table CourseStatus would hold the associations to CourseType and Status.
To move your data you can surely use a query.

Related

Securing values for other tables from enumerable table in SQL Server database

English is not my native language, so I might have misused words Enumerator and Enumerable in this context. Please get a feel for what I'm trying to say and correct me if I'm wrong.
I'm looking into not having tables for each enumerator I need in my database.
I "don't want" to add tables for (examples:) service duration type, user type, currency type, etc. and add relations for each of them.
Instead of a table for each of them which values will probably not change a lot, and for which I'd have to create relationships with other tables, I'm looking into having just 2 tables called Enumerator (eg: user type, currency...) and Enumerable (eg: for user type -> manager, ceo, delivery guy... and for currency -> euro, dollar, pound...).
Though here's the kicker. If I implement it like that, I'm loosing the rigidity of the foreign key relationships -> I can't accidentally insert a row in users table that will have a user type of some currency or service duration type, or something else.
Is there another way to resolve the issue of having so many enumerators and enumerables with the benefit of having that rigidity of the foreign key and with the benefit of having all of them in just those 2 tables?
Best I can think of is to create a trigger for BEFORE UPDATE and BEFORE INSERT to check if (for example) the column type of user table is using the id of the enumerable table that belongs to the correct enumerator.
This is a short example in SQL
CREATE TABLE [dbo].[Enumerator]
(
[Id] INT NOT NULL PRIMARY KEY,
[Name] VARCHAR(50)
)
CREATE TABLE [dbo].[Enumerable]
(
[Id] INT NOT NULL PRIMARY KEY,
[EnumeratorId] INT NOT NULL FOREIGN KEY REFERENCES Enumerator(Id),
[Name] VARCHAR(50)
)
INSERT INTO Enumerator (Id, Name)
VALUES (1, 'UserType'),
(2, 'ServiceType');
INSERT INTO Enumerable (Id, EnumeratorId, Name) -- UserType
VALUES (1, 1, 'CEO'),
(2, 1, 'Manager'),
(3, 1, 'DeliveryGuy');
INSERT INTO Enumerable (Id, EnumeratorId, Name) -- ServiceDurationType
VALUES (4, 2, 'Daily'),
(5, 2, 'Weekly'),
(6, 2, 'Monthly');
CREATE TABLE [dbo].[User]
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY (1,1),
[Type] INT NOT NULL FOREIGN KEY REFERENCES Enumerable(Id)
)
CREATE TABLE [dbo].[Service]
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY (1,1),
[Type] INT NOT NULL FOREIGN KEY REFERENCES Enumerable(Id)
)
The questions are:
Is it viable to resolve enumerators and enumerables with 2 tables and with before update and before insert triggers, or is it more trouble than it's worth?
Is there a better way to resolve this other than using before update and before insert triggers?
Is there a better way to resolve enumerators and enumerables that is not using 2 tables and triggers, nor creating a table with relations for each of them?
I ask for your wisdom as I don't have one or more big projects behind me and I didn't get a chance to create a DB like this until now.

SQL sever constraint data

I have 2 table Product and Supplier
create table Product(
ProductCode int not null primary key,
Name varchar(50) not null ,
PurchasePrice numeric(20,3) not null,
SellPrice numeric(20,3) not null ,
Type varchar(50) not null,
SupplierCode int not null
)
go
create table Supplier(
SupplierCode int not null primary key,
SupplierName varchar(50) not null ,
Address varchar(50) not null
)
I want : A product of Samsung must be television, mobile or tablet.
Help me.
database enter image description here
I want "SupplierCode<>4" because Supplier supply 'food'
You can't achieve it this way do something like this in check constraint because value depends on different tables.
The most straightforward way would be to create a trigger. This one is after insert. It just deletes rows that are not acceptable. You can experiment and make it instead of insert instead
CREATE TRIGGER insertProduct
ON Sales.Product
AFTER INSERT, UPDATE
AS
delete from inserted a
join Supplier b on a.SupplierCode = b.SupplierCode
where
(b.Supplier = 'Samsung' and a.Type not in ('phone', whatever you need...))
or (b.Supplier = 'different' and a.Type not in (...))
--here you put any additional logic for different suppliers
if ##ROWCOUNT > 0
RAISERROR ('Couldn't insert into table', 16, 10);
GO
Depending on your case what you can also do is make inserting to the tables available only via stored procedures. Then put the checks in the stored procedure instead.
If I understand you correct, you want to define for each suppliers what kind of products are allowed in your tables.
I think you need a table that defines what producttypes are allowed for what Suppliers, and than you can force this in a trigger or in your client.
First you need a table to define the kind of products
table ProductType (ID, Name)
this holds info like
(1, 'Television'), (2, 'Tablet'), (3, 'Mobi'), (4, 'Food'), and so on...
Then replace the field Type by ProductTypeID in your Product table
Now you can have a table that defines what product types each supplier may have
table AllowedProducts (ID, SupplierCode, ProductTypeID)
and finally you can check this in your client, or in a trigger if you want to keep this rule in your database
When inserting data you can check if the ProductTypeID of the selected Product is present in table AllowedProducts for the selected ´Supplier` and if not reject the insert.

SQL Server trigger can't insert

I beginning to learn how to write trigger with this basic database.
I'm also making my very 1st database.
Schema
Team:
TeamID int PK (TeamID int IDENTITY(0,1) CONSTRAINT TeamID_PK PRIMARY KEY)
TeamName nvarchar(100)
History:
HistoryID int PK (HistoryID int IDENTITY(0,1) CONSTRAINT HistoryID_PK PRIMARY KEY)
TeamID int FK REF Team(TeamID)
WinCount int
LoseCount int
My trigger: when a new team is inserted, it should insert a new history row with that team id
CREATE TRIGGER after_insert_Player
ON Team
FOR INSERT
AS
BEGIN
INSERT INTO History (TeamID, WinCount, LoseCount)
SELECT DISTINCT i.TeamID
FROM Inserted i
LEFT JOIN History h ON h.TeamID = i.TeamID
AND h.WinCount = 0 AND h.LoseCount = 0
END
Executed it returns
The select list for the INSERT statement contains fewer items than the insert list. The number of SELECT values must match the number of INSERT columns.
Please help thank. I'm using SQL Server
The error text is the best guide, it is so clear ..
You try inserting one value from i.TeamID into three columns (TeamID,WinCount,LoseCount)
consider these WinCount and LoseCount while inserting.
Note: I Think the structure of History table need to revisit, you should select WinCount and LoseCount as Expressions not as actual columns.
When you specify insert columns, you say which columns you will be filling. But in your case, right after insert you select only one column (team id).
You either have to modify the insert to contain only one column, or select, to retrieve 3 fields as in insert.
If you mention the columns where values have to be inserted(Using INSERT-SELECT).
The SELECT Statement has to contain the same number of columns that have been specified to be inserted. Also, ensure they are of the same data type.(You might face some issues otherwise)

Foreign key to 1 of several tables

I have two tables (OrderFreshGoods and OrderUtensils) and then I have an AuditTrail table. The AuditTrail table is related to the OrderFreshGoods table but I want to change it so that an Audit must relate to either an OrderFreshGoods or OrderUtensils record. I have seen a lot of solutions where the Audit table say would have 2 foreign keys (OrderIDFresh, OrderIDUtensils and it is optional that 1 of them must be populated). Note that I do not want that solution. I want the Audit table to have 1 foreign key (OrderID) and it must relate to either OrderFreshGoods.OrderID or OrderUtensils.OrderID.
Also my two order tables have no fields in common and are used in a large number of queries around the system so I don't want a parent table for both types of order.
Can anybody help? My sql script is below, the comments should help explain my tests...
--Setup tables
create table OrderFreshGoods (OrderID int not null primary key, sellBy date, name varchar(20))
go
create table OrderUtensils (OrderID int not null primary key, requiresOver18CheckForKnives bit, colour varchar(20), title varchar(20))
go
create table AuditTrail (AuditId int not null primary key, OrderID int, timeOfEvent date, eventDescription varchar(100))
go
--Base data
insert into OrderFreshGoods values (7, DATEADD(dd, 3, getdate()), 'Organic milk')
insert into OrderUtensils values (8, 0, 'Red', 'Garlic crusher')
--Test data!!!!!!!!!!!!!!
--This should work
insert into AuditTrail values (15, 7, getdate(), 'Logging order for Organic Milk from Corkys Coffee shop.')
--This should work
insert into AuditTrail values (16, 8, getdate(), 'Logging order for a Red Garlic Crusher from Perrys Pizza Place.')
--This should not be allowed
insert into AuditTrail values (17, 9, getdate(), 'Wrongly adding an audit entry before the order, please stop me now!')
--This should not be allowed
insert into AuditTrail values (18, null, getdate(), 'Oh dear, bad code has caused the OrderId to be lost, please stop me now!')
What you want is not possible as you describe it, it goes against the very premise of Relational databases.
If you leave out the actual Foreign Key, then you can populate the AuditTrail.OrderId with whatever you want.
But you'd lose the referential integrity check, so your third insert into AuditTrail statement wouldn't fail. That could then be fixed by applying an on-insert trigger which does a reference check. But it would still not prevent Orders from being deleted, causing the pseudo-relation to go bad again.
Another and perhaps much better alternative is to add an AuditId field to both of the Order tables, and fill that as needed.
If you want to use always proper relations, then I can see next solution:
create yet another table xxxx, like:
create table xxxx(id int identity(1,1) primary key)
add foreign keys in all your auditable tables to xxxx(id)
create new records in xxxx while adding records to auditable tables (triggers can be handy) (you may include more info into xxxx)
add foreign key in audit table to xxxx(id)
This way you can insert into audit table only records, pointing to xxxx - and these are always related to actual orders (and do not vanish when some order gets deleted either).

SQL can I have a "conditionally unique" constraint on a table?

I've had this come up a couple times in my career, and none of my local peers seems to be able to answer it. Say I have a table that has a "Description" field which is a candidate key, except that sometimes a user will stop halfway through the process. So for maybe 25% of the records this value is null, but for all that are not NULL, it must be unique.
Another example might be a table which must maintain multiple "versions" of a record, and a bit value indicates which one is the "active" one. So the "candidate key" is always populated, but there may be three versions that are identical (with 0 in the active bit) and only one that is active (1 in the active bit).
I have alternate methods to solve these problems (in the first case, enforce the rule code, either in the stored procedure or business layer, and in the second, populate an archive table with a trigger and UNION the tables when I need a history). I don't want alternatives (unless there are demonstrably better solutions), I'm just wondering if any flavor of SQL can express "conditional uniqueness" in this way. I'm using MS SQL, so if there's a way to do it in that, great. I'm mostly just academically interested in the problem.
If you are using SQL Server 2008 a Index filter would maybe your solution:
http://msdn.microsoft.com/en-us/library/ms188783.aspx
This is how I enforce a Unique Index with multiple NULL values
CREATE UNIQUE INDEX [IDX_Blah] ON [tblBlah] ([MyCol]) WHERE [MyCol] IS NOT NULL
In the case of descriptions which are not yet completed, I wouldn't have those in the same table as the finalized descriptions. The final table would then have a unique index or primary key on the description.
In the case of the active/inactive, again I might have separate tables as you did with an "archive" or "history" table, but another possible way to do it in MS SQL Server at least is through the use of an indexed view:
CREATE TABLE Test_Conditionally_Unique
(
my_id INT NOT NULL,
active BIT NOT NULL DEFAULT 0
)
GO
CREATE VIEW dbo.Test_Conditionally_Unique_View
WITH SCHEMABINDING
AS
SELECT
my_id
FROM
dbo.Test_Conditionally_Unique
WHERE
active = 1
GO
CREATE UNIQUE CLUSTERED INDEX IDX1 ON Test_Conditionally_Unique_View (my_id)
GO
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1) -- This insert will fail
You could use this same method for the NULL/Valued descriptions as well.
Thanks for the comments, the initial version of this answer was wrong.
Here's a trick using a computed column that effectively allows a nullable unique constraint in SQL Server:
create table NullAndUnique
(
id int identity,
name varchar(50),
uniqueName as case
when name is null then cast(id as varchar(51))
else name + '_' end,
unique(uniqueName)
)
insert into NullAndUnique default values
insert into NullAndUnique default values -- Works
insert into NullAndUnique default values -- not accidentally :)
insert into NullAndUnique (name) values ('Joel')
insert into NullAndUnique (name) values ('Joel') -- Boom!
It basically uses the id when the name is null. The + '_' is to avoid cases where name might be numeric, like 1, which could collide with the id.
I'm not entirely aware of your intended use or your tables, but you could try using a one to one relationship. Split out this "sometimes" unique column into a new table, create the UNIQUE index on that column in the new table and FK back to the original table using the original tables PK. Only have a row in this new table when the "unique" data is supposed to exist.
OLD tables:
TableA
ID pk
Col1 sometimes unique
Col...
NEW tables:
TableA
ID
Col...
TableB
ID PK, FK to TableA.ID
Col1 unique index
Oracle does. A fully null key is not indexed by a Btree in index in Oracle, and Oracle uses Btree indexes to enforce unique constraints.
Assuming one wished to version ID_COLUMN based on the ACTIVE_FLAG being set to 1:
CREATE UNIQUE INDEX idx_versioning_id ON mytable
(CASE active_flag WHEN 0 THEN NULL ELSE active_flag END,
CASE active_flag WHEN 0 THEN NULL ELSE id_column END);