Use a common table with many to many relationship - sql

I have two SQL tables: Job and Employee. I need to compare each Job's language proficiencies with each Employee's language proficiencies. A Language Proficiency is composed of a Language and a Language Level.
create table dbo.EmployeeLanguageProficiency (
EmployeeId int not null,
LanguageProficiencyId int not null,
constraint PK_ELP primary key clustered (EmployeeId, LanguageProficiencyId)
)
create table dbo.JobLanguageProficiency (
JobId int not null,
LanguageProficiencyId int not null,
constraint PK_JLP primary key clustered (JobId, LanguageProficiencyId)
)
create table dbo.LanguageProficiency (
Id int identity not null
constraint PK_LanguageProficiency_Id primary key clustered (Id),
LanguageCode nvarchar (4) not null,
LanguageLevelId int not null,
constraint UQ_LP unique (LanguageCode, LanguageLevelId)
)
create table dbo.LanguageLevel (
Id int identity not null
constraint PK_LanguageLevel_Id primary key clustered (Id),
Name nvarchar (80) not null
constraint UQ_LanguageLevel_Name unique (Name)
)
create table dbo.[Language]
(
Code nvarchar (4) not null
constraint PK_Language_Code primary key clustered (Code),
Name nvarchar (80) not null
)
My question is about the LanguageProficiency table. I added an Id as its PK, but I am not sure this is the best option.
What do you think about this schema?

Your primary key on (EmployeeId, LanguageProficiencyId) allows an employee to have more than one proficiency level for the same language, which seems counterintuitive.
This would be cleaner, as it allows only one entry per language:
create table dbo.EmployeeLanguageProficiency (
EmployeeId int not null,
LanguageId int not null,
LanguageLevelId int not null,
constraint PK_ELP primary key clustered (EmployeeId, LanguageId)
)
I don't see the point of the LanguageProficiency table at the moment.
The same applies to Job, of course, unless you want to allow a "range" of proficiencies. But assuming that a proficiency higher than required does no harm, that can easily be expressed with a >= comparison in your queries.
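The >= comparison could look roughly like the query below. This is only a sketch under several assumptions that are not in the original post: JobLanguageProficiency is reshaped the same way as the employee table (JobId, LanguageId, LanguageLevelId, keyed on JobId and LanguageId), a higher LanguageLevelId means a higher proficiency, and @JobId is a placeholder variable.

```sql
-- Hypothetical job to match against.
declare @JobId int = 1;

-- Employees who meet or exceed every language requirement of the job.
-- Assumes LanguageLevelId values increase with proficiency.
select e.EmployeeId
from dbo.JobLanguageProficiency as j
join dbo.EmployeeLanguageProficiency as e
    on  e.LanguageId = j.LanguageId
    and e.LanguageLevelId >= j.LanguageLevelId
where j.JobId = @JobId
group by e.EmployeeId
-- Keep only employees who matched all of the job's required languages.
having count(*) = (select count(*)
                   from dbo.JobLanguageProficiency
                   where JobId = @JobId);
```

If the level Ids are not ordered by proficiency, an explicit rank column on LanguageLevel would be needed for the comparison instead.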

Related

How to use constraints to force two child items be from the same parent?

I have a Jobs table that holds jobs.
I have a Tasks table that holds tasks that belong to a job (1:many).
I have a Task_Relationships table that holds the data about which tasks depend on other tasks within a job.
I have 2 jobs, each job has 3 tasks and within the jobs the tasks are related as in the diagram. The Task_Relationships table is to represent that tasks within a job have dependencies between them.
How to ensure that when I add an entry to the Task_Relationships table say (1,2) representing the fact that task 1 is related to task 2, that tasks 1 and 2 are in the same job? I'm trying to enforce this through keys and not through code.
drop table if exists dbo.test_jobs
create table dbo.test_jobs (
[Id] int identity(1,1) primary key not null,
[Name] varchar(128) not null
)
drop table if exists dbo.test_tasks
create table dbo.test_tasks (
[Id] int identity(1,1) primary key not null,
[Job_Id] int not null,
[Name] varchar(128) not null,
constraint fk_jobs foreign key ([Job_Id]) references dbo.test_jobs(Id)
)
drop table if exists dbo.test_task_relationships
create table dbo.test_task_relationships (
[Id] int identity(1,1) not null,
[From_Task] int not null,
[To_Task] int not null,
constraint fk_tasks_from foreign key ([From_Task]) references dbo.test_tasks(Id),
constraint fk_tasks_to foreign key ([To_Task]) references dbo.test_tasks(Id)
)
A reliance on identity columns as primary keys is not helping you here. And it is a logic fault to use an identity column in the relationship table IMO. Surely you do not intend to allow multiple rows to exist in that table with the same values for <from_task, to_task>.
Imagine the child table defined as:
create table dbo.test_tasks (
Job_Id int not null,
Task_Id tinyint not null,
Name varchar(128) not null,
constraint pk_tasks primary key clustered (Job_Id, Task_Id),
constraint fk_jobs foreign key ([Job_Id]) references dbo.test_jobs(Id)
);
Now your relationship table can be transformed into:
create table dbo.test_task_relationships (
From_Job int not null,
From_Task tinyint not null,
To_Job int not null,
To_Task tinyint not null
);
I'll leave it to you to complete the DDL but that should make your goal trivial.
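One possible way to complete that DDL is sketched below. It departs slightly from the skeleton by collapsing From_Job and To_Job into a single Job_Id column, so that both tasks necessarily belong to the same job. The constraint names and the self-reference check are assumptions, not part of the original answer.

```sql
create table dbo.test_task_relationships (
    Job_Id    int     not null,
    From_Task tinyint not null,
    To_Task   tinyint not null,
    -- One row per (job, from, to) combination; no identity column needed.
    constraint pk_task_relationships
        primary key clustered (Job_Id, From_Task, To_Task),
    -- Both foreign keys share Job_Id, so the two tasks must be in the same job.
    constraint fk_rel_from foreign key (Job_Id, From_Task)
        references dbo.test_tasks (Job_Id, Task_Id),
    constraint fk_rel_to foreign key (Job_Id, To_Task)
        references dbo.test_tasks (Job_Id, Task_Id),
    -- Assumption: a task should not depend on itself.
    constraint ck_rel_not_self check (From_Task <> To_Task)
);
```

Keeping both From_Job and To_Job and adding a CHECK (From_Job = To_Job) constraint would give the same guarantee with one redundant column.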
You can declare a superkey in the Task table that includes the Job_Id column as well as columns from an existing key.
create table dbo.test_tasks (
[Id] int identity(1,1) primary key not null,
[Job_Id] int not null,
[Name] varchar(128) not null,
constraint fk_jobs foreign key ([Job_Id]) references dbo.test_jobs(Id),
constraint UQ_Tasks_WithJob UNIQUE (Id, Job_Id)
)
You can then add the Job_Id column to the relationships table and include it in both foreign key constraints:
create table dbo.test_task_relationships (
[Id] int identity(1,1) not null,
[From_Task] int not null,
Job_Id int not null,
[To_Task] int not null,
constraint fk_tasks_from foreign key ([From_Task], Job_Id) references dbo.test_tasks(Id, Job_Id),
constraint fk_tasks_to foreign key ([To_Task], Job_Id) references dbo.test_tasks(Id, Job_Id)
)
There is now no way for the table to contain mismatched tasks. If you don't want to expose the Job_Id column to applications, wrap this table in a view and use a trigger to populate the column automatically during insert.
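As a quick illustration of the effect, the statements below sketch one accepted and one rejected insert. They assume freshly created tables (so identity values start at 1) and that fk_jobs is declared on Job_Id; the sample names are made up.

```sql
-- Assumes freshly created tables, so identity values start at 1,
-- and that fk_jobs is declared on Job_Id.
insert into dbo.test_jobs (Name) values ('Job A');           -- Id 1
insert into dbo.test_jobs (Name) values ('Job B');           -- Id 2

insert into dbo.test_tasks (Job_Id, Name) values (1, 'A1');  -- Id 1
insert into dbo.test_tasks (Job_Id, Name) values (1, 'A2');  -- Id 2
insert into dbo.test_tasks (Job_Id, Name) values (2, 'B1');  -- Id 3

-- Accepted: tasks 1 and 2 both belong to job 1.
insert into dbo.test_task_relationships (From_Task, Job_Id, To_Task)
values (1, 1, 2);

-- Rejected by fk_tasks_to: task 3 belongs to job 2, so the pair
-- (Id = 3, Job_Id = 1) does not exist in dbo.test_tasks.
insert into dbo.test_task_relationships (From_Task, Job_Id, To_Task)
values (1, 1, 3);
```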

Foreign Key References Two Tables

I'm currently working on a database and I came across a new problem to me. The entities involved are Universe, Competition, Game, Pot. Here are the SQL files to create the tables:
CREATE TABLE Universe (
id int NOT NULL IDENTITY PRIMARY KEY,
history nvarchar (max),
creation_date date
);
CREATE TABLE Pot (
pot_name nvarchar(100),
universe_id int FOREIGN KEY REFERENCES Universe(id),
pot_description nvarchar(100),
media_description nvarchar(100),
is_official_pot bit,
PRIMARY KEY (pot_name, universe_id)
);
CREATE TABLE Competition (
universe_id int NOT NULL FOREIGN KEY REFERENCES Universe(id),
compt_name nvarchar(100) NOT NULL,
alias nvarchar(100),
history nvarchar(max),
rules nvarchar (max),
winner_id nvarchar(100) FOREIGN KEY REFERENCES RaulUser(username),
edition int NOT NULL,
is_official_competition bit NOT NULL,
PRIMARY KEY (universe_id, compt_name, edition)
);
CREATE TABLE Game (
id int NOT NULL IDENTITY PRIMARY KEY,
pot_name nvarchar (100) NOT NULL,
universe_id int NOT NULL,
competition_name nvarchar(100) NOT NULL,
competition_edition int NOT NULL,
competition_round int NOT NULL,
home_raul_u_username nvarchar (100) FOREIGN KEY REFERENCES RaulUser(username) NOT NULL,
home_team nvarchar (100) NOT NULL FOREIGN KEY REFERENCES Team(team_name),
home_score int,
away_raul_u_username nvarchar (100) NOT NULL FOREIGN KEY REFERENCES RaulUser(username),
away_team nvarchar (100) NOT NULL FOREIGN KEY REFERENCES Team(team_name),
away_score int,
is_over bit NOT NULL,
played_date date,
FOREIGN KEY (universe_id, competition_name, competition_edition) REFERENCES Competition(universe_id, compt_name, edition),
FOREIGN KEY (universe_id, pot_name) REFERENCES Pot(universe_id, pot_name)
);
The problem starts with this last table (Game), as I can't use universe_id as a Foreign Key for different tables. What's the best approach to solving this? Creating an M:M table Game_Pot?
I only need to record the Pot of each Game because Pots change over time and I don't want to lose that data.
Sorry for the long post and thank you all in advance :)
The only problem that I see is in the definition of table Game:
FOREIGN KEY (universe_id, pot_name) REFERENCES Pot(universe_id, pot_name)
Ordering of columns matters. The primary key of table Pot is (pot_name, universe_id), so you need to swap the columns in the foreign key, like so:
FOREIGN KEY (pot_name, universe_id) REFERENCES Pot(pot_name, universe_id)
Note that having an identity (or similar) primary key in every table might simplify your design: it would let you reduce the number of columns in the child tables and use single-column foreign keys. Meanwhile, you can still enforce uniqueness on column tuples in the parent tables with unique constraints.
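A rough sketch of that surrogate-key variant follows. The id column on Pot, the pot_id column, the constraint name, and the reduced Game table are all hypothetical and only meant to show the shape of the design.

```sql
-- Pot gets a surrogate key; the natural key is kept as a UNIQUE constraint.
CREATE TABLE Pot (
    id int NOT NULL IDENTITY PRIMARY KEY,   -- hypothetical surrogate key
    pot_name nvarchar(100) NOT NULL,
    universe_id int NOT NULL FOREIGN KEY REFERENCES Universe(id),
    pot_description nvarchar(100),
    media_description nvarchar(100),
    is_official_pot bit,
    CONSTRAINT UQ_Pot_Name_Universe UNIQUE (pot_name, universe_id)
);

-- Reduced Game table: only the columns needed to show the single-column key.
CREATE TABLE Game (
    id int NOT NULL IDENTITY PRIMARY KEY,
    pot_id int NOT NULL FOREIGN KEY REFERENCES Pot(id),
    is_over bit NOT NULL,
    played_date date
);
```

One trade-off to weigh: with single-column surrogate keys, the database no longer guarantees by itself that a game's pot and competition belong to the same universe, which the shared universe_id column in the composite design does.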

How do I implement a key to be both PK and FK?

I'm very new to SQL and just practicing for my SQL exam by going through past papers, however, I'm stuck on how to implement the staffID & projNo in the allocation table to be both primary and foreign keys. I've tried to use solutions online but none work.
Here is the relational schema (PK, FK)
staff( **staffID**, firstname, lastName, gender, dob, jobTitle)
project ( **projNo**, projName, description )
allocation ( ***staffID***, ***projNo***, hours )
Here is my SQL:
CREATE TABLE staff (
staffID CHAR (4) PRIMARY KEY,
firstName VARCHAR (30) NOT NULL,
lastName VARCHAR (30) NOT NULL,
gender CHAR (1) CHECK (gender IN ('M','F')),
dob DATE NOT NULL,
jobTitle VARCHAR (30) NOT NULL
);
CREATE TABLE project (
projNo CHAR (4) PRIMARY KEY,
projName VARCHAR (20) NOT NULL,
description VARCHAR (30) NOT NULL
);
CREATE TABLE allocation (
staffID CHAR (4),
projNo CHAR (4),
hours int (2)
);
To give the two columns a primary key constraint, use:
ALTER TABLE table_name ADD CONSTRAINT MyPrimaryConstraint PRIMARY KEY (column(s));
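Applied to the allocation table from the question, that could look something like the following. The constraint names are made up, and the sketch assumes staffID and projNo are declared NOT NULL, which some database systems require before a primary key can be added this way.

```sql
-- Composite primary key over both columns.
ALTER TABLE allocation
    ADD CONSTRAINT pk_allocation PRIMARY KEY (staffID, projNo);

-- Each column is also a foreign key to its parent table.
ALTER TABLE allocation
    ADD CONSTRAINT fk_allocation_staff
    FOREIGN KEY (staffID) REFERENCES staff (staffID);

ALTER TABLE allocation
    ADD CONSTRAINT fk_allocation_project
    FOREIGN KEY (projNo) REFERENCES project (projNo);
```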
There's really not much to it:
CREATE TABLE allocation (
staffID CHAR (4)
CONSTRAINT ALLOCATION_FK1
REFERENCES STAFF(STAFFID)
ON DELETE CASCADE,
projNo CHAR (4)
CONSTRAINT ALLOCATION_FK2
REFERENCES PROJECT(PROJNO)
ON DELETE CASCADE,
hours int (2),
CONSTRAINT PK_ALLOCATION
PRIMARY KEY (STAFFID, PROJNO)
USING INDEX
);
CREATE INDEX ALLOCATION_1
ON ALLOCATION (STAFFID);
CREATE INDEX ALLOCATION_2
ON ALLOCATION (PROJNO);
Here I've defined the foreign key constraints as part of the column definition, but there's no reason other than convenience to do it this way; you could, if you chose, make them out-of-line constraints. Because the primary key consists of multiple columns you have to make it an out-of-line constraint as shown here.
I've also defined indexes on both of the foreign keys. This is important - delete performance from the parent tables will suffer if you don't do this.
Best of luck.
You can define a composite key on the two columns, staffID CHAR(4) and projNo CHAR(4), and then add the foreign keys as usual.
CREATE TABLE [allocation](
[staffID] [char](4) NOT NULL,
[projNo] [char](4) NOT NULL,
[hours] [int] NULL,
CONSTRAINT [PK_allocation] PRIMARY KEY CLUSTERED
(
[staffID] ASC,
[projNo] ASC
))
Then add the foreign keys with ALTER TABLE:
ALTER TABLE [dbo].[allocation] WITH CHECK ADD CONSTRAINT [FK_allocation_project] FOREIGN KEY([projNo])
REFERENCES [dbo].[project] ([projNo])
GO
ALTER TABLE [dbo].[allocation] CHECK CONSTRAINT [FK_allocation_project]
GO
ALTER TABLE [dbo].[allocation] WITH CHECK ADD CONSTRAINT [FK_allocation_staff] FOREIGN KEY([staffID])
REFERENCES [dbo].[staff] ([staffID])
GO
ALTER TABLE [dbo].[allocation] CHECK CONSTRAINT [FK_allocation_staff]
GO

Composite Keys and Referential Integrity in T-SQL

Is it possible, in T-SQL, to have a relationship table with a composite key composed of 1 column defining Table Type and another column defining the Id of a row from a table referenced in the Table Type column?
For a shared email address example:
Three different user tables (UserA, UserB, UserC)
One UserType table (UserType)
One Email table (EmailAddress)
One Email-User relationship table (EmailRelationship)
The EmailRelationship table contains three columns: EmailId, UserTypeId and UserId.
Can I have a relationship from each User table to the EmailRelationship table (or some other way?) to maintain referential integrity?
I've tried making all three columns in the EmailRelationship table a composite primary key, and I've tried making only UserTypeId and UserId the primary key.
CREATE TABLE [dbo].[UserType](
[Id] [int] IDENTITY(1,1) NOT NULL ,
[Type] [varchar](50) NOT NULL)
insert into [dbo].[UserType]
([Type])
values
('A'),('B'),('C')
CREATE TABLE [dbo].[UserA](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserTypeId] [int] NOT NULL,
[Name] [varchar](50) NOT NULL)
insert into [dbo].[UserA]
(UserTypeId,Name)
values
(1,'UserA')
CREATE TABLE [dbo].[UserB](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserTypeId] [int] NOT NULL,
[Name] [varchar](50) NOT NULL)
insert into [dbo].[UserB]
(UserTypeId,Name)
values
(2,'UserB')
CREATE TABLE [dbo].[UserC](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserTypeId] [int] NOT NULL,
[Name] [varchar](50) NOT NULL)
insert into [dbo].[UserC]
(UserTypeId,Name)
values
(3,'UserC')
CREATE TABLE [dbo].[Email](
[Id] [int] IDENTITY(1,1) NOT NULL,
[EmailAddress] [varchar](50) NOT NULL)
insert into [dbo].[email]
(EmailAddress)
values
('SharedEmail@SharedEmail.com')
CREATE TABLE [dbo].[EmailRelationship](
[EmailId] [int] NOT NULL,
[UserTypeId] [int] NOT NULL,
[UserId] [int] NOT NULL)
insert into [dbo].[EmailRelationship]
(EmailId, UserTypeId, UserId)
values
(1,1,1),(1,2,1),(1,3,1)
No, there isn't. A foreign key can refer to one table, and one table only. I can think of three ways you could approach this.
The first is to have three columns, one for each user table, each with a foreign key, plus a check constraint to ensure that one, and only one, of the values is not null:
CREATE TABLE dbo.EmailRelationship
(
EmailId INT NOT NULL,
UserTypeId INT NOT NULL,
UserAId INT NULL,
UserBId INT NULL,
UserCId INT NULL,
CONSTRAINT FK_EmailRelationship__UserAID FOREIGN KEY (UserAId)
REFERENCES dbo.UserA (Id),
CONSTRAINT FK_EmailRelationship__UserBID FOREIGN KEY (UserBId)
REFERENCES dbo.UserB (Id),
CONSTRAINT FK_EmailRelationship__UserCID FOREIGN KEY (UserCId)
REFERENCES dbo.UserC (Id),
CONSTRAINT CK_EmailRelationship__ValidUserId CHECK
(CASE WHEN UserTypeID = 1 AND UserAId IS NOT NULL AND ISNULL(UserBId, UserCId) IS NULL THEN 1
WHEN UserTypeID = 2 AND UserBId IS NOT NULL AND ISNULL(UserAId, UserCId) IS NULL THEN 1
WHEN UserTypeID = 3 AND UserCId IS NOT NULL AND ISNULL(UserAId, UserBId) IS NULL THEN 1
ELSE 0
END = 1)
);
Then, as a quick example, trying to insert a UserAId with a UserTypeID of 2 gives you an error:
INSERT EmailRelationship (EmailID, UserTypeID, UserAId)
VALUES (1, 2, 1);
The INSERT statement conflicted with the CHECK constraint "CK_EmailRelationship__ValidUserId".
The second approach is to just have a single user table, and store user type against it, along with any other common attributes
CREATE TABLE dbo.[User]
(
Id INT IDENTITY(1, 1) NOT NULL,
UserTypeID INT NOT NULL,
Name VARCHAR(50) NOT NULL,
CONSTRAINT PK_User__UserID PRIMARY KEY (Id),
CONSTRAINT FK_User__UserTypeID FOREIGN KEY (UserTypeID) REFERENCES dbo.UserType (Id),
CONSTRAINT UQ_User__Id_UserTypeID UNIQUE (Id, UserTypeID)
);
-- NOTE THE UNIQUE CONSTRAINT, THIS WILL BE USED LATER
Then you can just use a normal foreign key constraint on your email relationship table:
CREATE TABLE dbo.EmailRelationship
(
EmailId INT NOT NULL,
UserId INT NOT NULL,
CONSTRAINT PK_EmailRelationship PRIMARY KEY (EmailId, UserId),
CONSTRAINT FK_EmailRelationship__EmailId
FOREIGN KEY (EmailID) REFERENCES dbo.Email (Id),
CONSTRAINT FK_EmailRelationship__UserId
FOREIGN KEY (UserId) REFERENCES dbo.[User] (Id)
);
It is then no longer necessary to store UserTypeId against the email relationship because you can join back to User to get this.
Then, if for whatever reason you do need specific tables for different user types (this is not unheard of), you can create these tables, and enforce referential integrity to the user table:
CREATE TABLE dbo.UserA
(
UserID INT NOT NULL,
UserTypeID AS 1 PERSISTED,
SomeOtherCol VARCHAR(50),
CONSTRAINT PK_UserA__UserID PRIMARY KEY (UserID),
CONSTRAINT FK_UserA__UserID_UserTypeID FOREIGN KEY (UserID, UserTypeID)
REFERENCES dbo.[User] (Id, UserTypeID)
);
The foreign key from UserID and the computed column UserTypeID back to the User table, ensures that you can only enter users in this table where the UserTypeID is 1.
A third option is just to have a separate junction table for each User table:
CREATE TABLE dbo.UserAEmailRelationship
(
EmailId INT NOT NULL,
UserAId INT NOT NULL,
CONSTRAINT PK_UserAEmailRelationship PRIMARY KEY (EmailId, UserAId),
CONSTRAINT FK_UserAEmailRelationship__EmailId FOREIGN KEY (EmailId)
REFERENCES dbo.Email (Id),
CONSTRAINT FK_UserAEmailRelationship__UserAId FOREIGN KEY (UserAId)
REFERENCES dbo.UserA (Id)
);
CREATE TABLE dbo.UserBEmailRelationship
(
EmailId INT NOT NULL,
UserBId INT NOT NULL,
CONSTRAINT PK_UserBEmailRelationship PRIMARY KEY (EmailId, UserBId),
CONSTRAINT FK_UserBEmailRelationship__EmailId FOREIGN KEY (EmailId)
REFERENCES dbo.Email (Id),
CONSTRAINT FK_UserBEmailRelationship__UserBId FOREIGN KEY (UserBId)
REFERENCES dbo.UserB (Id)
);
Each approach has its merits and drawbacks, so you would need to assess what is best for your scenario.
No it does not work that way. You cannot use a column value as a dynamic reference to different tables.
In general the data design is flawed.
Thanks to @GarethD I created a CHECK constraint that called a scalar-function that would enforce referential integrity (only upon insert, refer to caveat below):
Using my above example:
alter FUNCTION [dbo].[UserTableConstraint](@Id int, @UserTypeId int)
RETURNS int
AS
BEGIN
IF EXISTS (SELECT Id From [dbo].[UserA] WHERE Id = @Id and UserTypeId = @UserTypeId)
return 1
ELSE IF EXISTS (SELECT Id From [dbo].[UserB] WHERE Id = @Id and UserTypeId = @UserTypeId)
return 1
ELSE IF EXISTS (SELECT Id From [dbo].[UserC] WHERE Id = @Id and UserTypeId = @UserTypeId)
return 1
return 0
end;
alter table [dbo].[emailrelationship]
--drop constraint CK_UserType
with CHECK add constraint CK_UserType
CHECK([dbo].[UserTableConstraint](UserId,UserTypeId) = 1)
I am sure there is a not insignificant overhead to a Scalar-function call from within a CONSTRAINT. If the above becomes prohibitive I will report back here, though the tables in question will not have to deal with a large volume of INSERTs.
If there are any other reasons to not do the above, I would like to hear them. Thanks!
Update:
I've tested INSERT and UPDATE with 100k rows (SQL Server 2014, 2.1ghz quadcore w/ 8gb ram):
INSERT takes 2 seconds without the CONSTRAINT
and 3 seconds with the CHECK CONSTRAINT
Turning on IO and TIME STATISTICS causes the INSERT tests to run in:
1.7 seconds without the CONSTRAINT
and 10 seconds with the CHECK CONSTRAINT
I left the STATISTICS on for the UPDATE 100k rows test:
just over 1sec without the CONSTRAINT
and 1.5sec with the CHECK CONSTRAINT
My referenced tables (UserA, UserB, UserC from my example) only contain around 10k rows each, so anybody else looking to implement the above may want to run some additional testing, especially if your referenced tables contain millions of rows.
Caveat:
The above solution may not be suitable for most uses, because referential integrity is checked only by the CHECK CONSTRAINT upon INSERT. Any other operation that modifies the data needs to take that into account. For example, with the above in place, if an Email is deleted, any related EmailRelationship entries will be left pointing to invalid data.

Two Composite Primary Key in SQL

I am trying to define table as follows:
CREATE TABLE dbo.[User]
(
Id int NOT NULL IDENTITY PRIMARY KEY,
Name nvarchar(1024) NOT NULL
);
CREATE TABLE [Group]
(
Id int NOT NULL IDENTITY PRIMARY KEY,
Name nvarchar(1024) NOT NULL
);
CREATE TABLE [UserToGroup]
(
Name VARCHAR(20) NOT NULL,
UserId int NOT NULL,
GroupId int NOT NULL,
PRIMARY KEY CLUSTERED ( UserId, Name),
PRIMARY KEY CLUSTERED ( GroupId, Name),
FOREIGN KEY ( UserId ) REFERENCES [User] ( Id ) ON UPDATE NO ACTION ON DELETE CASCADE,
FOREIGN KEY ( GroupId ) REFERENCES [Group] ( Id ) ON UPDATE NO ACTION ON DELETE CASCADE
);
How can I create a table with two composite primary keys?
Name VARCHAR(20) NOT NULL,
UserId int NOT NULL,
GroupId int NOT NULL,
UNIQUE ( UserId, Name),
UNIQUE ( GroupId, Name)
In the relational model and in SQL there is no logical difference between one key and another so there's no very strong reason to have a different syntax for specifying one key over any other. However, for better or worse, the authors of the SQL standard decided to make a limitation that the PRIMARY KEY constraint syntax can only be used once per table and that where you need more than one key you have to use one or more UNIQUE constraints instead. Arguably it would be desirable to drop that limitation but since it's fundamentally just a bit of syntactical sugar that's unlikely to happen any time soon.
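Put together, a full table definition along those lines might look like the sketch below. Here one key is declared as the PRIMARY KEY and the other as UNIQUE; which of the two gets the PRIMARY KEY label is an arbitrary choice, and the constraint names are assumptions. Declaring both as UNIQUE, as in the fragment above, works too.

```sql
CREATE TABLE [UserToGroup]
(
    Name    VARCHAR(20) NOT NULL,
    UserId  int NOT NULL,
    GroupId int NOT NULL,
    -- Only one PRIMARY KEY is allowed per table...
    CONSTRAINT PK_UserToGroup PRIMARY KEY CLUSTERED (UserId, Name),
    -- ...so the second key is declared as UNIQUE instead.
    CONSTRAINT UQ_UserToGroup_Group_Name UNIQUE (GroupId, Name),
    FOREIGN KEY (UserId) REFERENCES dbo.[User] (Id)
        ON UPDATE NO ACTION ON DELETE CASCADE,
    FOREIGN KEY (GroupId) REFERENCES [Group] (Id)
        ON UPDATE NO ACTION ON DELETE CASCADE
);
```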