Foreign key to table with composite primary key - sql

We are building a translation system and are struggling to find the best way to model our database.
What we have right now is:
CREATE TABLE Translation
(
id INT NOT NULL PRIMARY KEY,
EN VARCHAR(MAX) NULL,
DE VARCHAR(MAX) NULL,
FR VARCHAR(MAX) NULL,
...
);
This solution combines all translations into one row. The downside is that adding a language means adding a column. The upside is that you have a primary key which can be used for foreign keys.
Alternative solution:
CREATE TABLE TranslationId
(
id INT NOT NULL PRIMARY KEY
);
CREATE TABLE Translation
(
id INT NOT NULL,
Language VARCHAR(2) NOT NULL,
Translation VARCHAR(MAX) NULL
);
id in Translation has a foreign key to the id of TranslationId (and is not unique in the Translation table). This solution doesn't have the disadvantage of the first solution. The disadvantage is that this may be overengineered. To get all the translations for a certain id, you need to pass through an extra table.
Both solutions will work. Any thoughts on either solution?
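For the second layout, one possible refinement (a sketch, not something stated in the question) is to make (id, Language) a composite primary key, so that each id has at most one row per language. The constraint names PK_Translation and FK_Translation_TranslationId are made up for illustration, and SQL Server syntax is assumed because of VARCHAR(MAX).

```sql
-- Sketch only: constraint names are hypothetical; SQL Server syntax assumed.
CREATE TABLE TranslationId
(
    id INT NOT NULL PRIMARY KEY
);

CREATE TABLE Translation
(
    id          INT          NOT NULL,
    Language    VARCHAR(2)   NOT NULL,
    Translation VARCHAR(MAX) NULL,
    -- One row per (id, language) pair.
    CONSTRAINT PK_Translation PRIMARY KEY (id, Language),
    -- Every translation row must belong to a registered id.
    CONSTRAINT FK_Translation_TranslationId
        FOREIGN KEY (id) REFERENCES TranslationId (id)
);

-- All translations for one id; TranslationId is not needed for reading.
SELECT Language, Translation
FROM Translation
WHERE id = 1;
```

Other tables can still point a plain INT foreign key at TranslationId (id), which keeps the benefit the first layout had.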

Related

Is there a way to create a conditional foreign key depending on a field's value?

Trying to create a table like
CREATE TABLE SearchUser(
SearchID int NOT NULL PRIMARY KEY,
UserID int FOREIGN KEY REFERENCES Users(UserId),
[scan_id] NVARCHAR(8),
[scan_type] NVARCHAR(3),
[tokens_left] INT);
where depending on scan_type value, I would like to be able to utilize different foreign keys
i.e.
if [scan_type] = 'ORG'
I would like [scan_id] to be a foreign key to Org(scan_id)
if [scan_type] = 'PER'
I would like [scan_id] to be a foreign key to Per(scan_id)
In SQL Server it's impossible to create a dynamic foreign key, but you can implement table inheritance, which solves your problem, e.g.:
CREATE TABLE BaseScan(Id INT PRIMARY KEY,SharedProperties....);
CREATE TABLE OrgScan(
Id INT...,
BaseScanId INT NOT NULL FOREIGN KEY REFERENCES BaseScan(Id));
CREATE TABLE dbo.PerScan(
Id INT...,
BaseScanId INT NOT NULL FOREIGN KEY REFERENCES BaseScan(Id));
This way you'll be able to reference BaseScan.Id in SearchUser and then join the data you need depending on the scan_type value.
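A rough sketch of what that could look like follows. The column name BaseScanId in SearchUser is an assumption made here for illustration, and BaseScan, OrgScan and Users are assumed to exist as described above.

```sql
-- Sketch only: BaseScanId in SearchUser is a hypothetical column name.
CREATE TABLE SearchUser(
    SearchID INT NOT NULL PRIMARY KEY,
    UserID INT FOREIGN KEY REFERENCES Users(UserId),
    -- A single, ordinary foreign key to the shared base table.
    BaseScanId INT NOT NULL FOREIGN KEY REFERENCES BaseScan(Id),
    scan_type NVARCHAR(3),
    tokens_left INT
);

-- Organisation scans for each search; join PerScan the same way for 'PER'.
SELECT su.SearchID, o.*
FROM SearchUser AS su
JOIN OrgScan AS o ON o.BaseScanId = su.BaseScanId
WHERE su.scan_type = 'ORG';
```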
As other devs mention, in a nutshell this is not possible directly. But there is still hope if you want to implement something equivalent.
You can apply a check constraint on the column. Create a function that performs the check and call it from the check constraint, as below.
ALTER TABLE YourTable
ADD CONSTRAINT chk_CheckFunction CHECK ( dbo.CheckFunction() = 1 );
In the function you can write your logic accordingly.
CREATE FUNCTION dbo.CheckFunction ()
RETURNS int
AS
BEGIN
RETURN ( SELECT 1 );
END;
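As a hedged illustration of what that logic might look like for the SearchUser case, the sketch below checks that scan_id exists in the table selected by scan_type. The function name dbo.ScanExists and the constraint name are invented here, and Org and Per are assumed to have a scan_id column as in the question.

```sql
-- Sketch only: dbo.ScanExists and chk_SearchUser_Scan are hypothetical names.
CREATE FUNCTION dbo.ScanExists (@scan_id NVARCHAR(8), @scan_type NVARCHAR(3))
RETURNS bit
AS
BEGIN
    DECLARE @ok bit = 0;

    -- Look in the table that matches the scan type.
    IF @scan_type = 'ORG' AND EXISTS (SELECT 1 FROM dbo.Org WHERE scan_id = @scan_id)
        SET @ok = 1;
    IF @scan_type = 'PER' AND EXISTS (SELECT 1 FROM dbo.Per WHERE scan_id = @scan_id)
        SET @ok = 1;

    RETURN @ok;
END;
GO  -- batch separator used by SSMS/sqlcmd

ALTER TABLE dbo.SearchUser
ADD CONSTRAINT chk_SearchUser_Scan
    CHECK (dbo.ScanExists(scan_id, scan_type) = 1);
```

Unlike a real foreign key, this is only evaluated when SearchUser rows are inserted or updated. Deleting a row from Org or Per will not be blocked, so orphaned scan_id values remain possible.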

Database Schema - Many-to-Many Normalisation

I'm designing a schema where a case can have many forms attached and a form can be used for many cases. The Form table basically holds the structure of an HTML form which gets rendered on the client side. When the form is submitted, the name/value pairs for the fields are stored separately. Is there any value in keeping the name/value attributes separate from the join table as follows?
CREATE TABLE [Case] (
ID int NOT NULL PRIMARY KEY,
...
);
CREATE TABLE CaseForm (
CaseID int NOT NULL FOREIGN KEY REFERENCES [Case] (ID),
FormID int NOT NULL FOREIGN KEY REFERENCES Form (ID),
CONSTRAINT PK_CaseForm PRIMARY KEY (CaseID, FormID)
);
CREATE TABLE CaseFormAttribute (
ID int NOT NULL PRIMARY KEY,
CaseID int NOT NULL,
FormID int NOT NULL,
Name varchar(255) NOT NULL,
Value varchar(max),
-- A composite primary key must be referenced by a single composite foreign key
CONSTRAINT FK_CaseFormAttribute_CaseForm FOREIGN KEY (CaseID, FormID) REFERENCES CaseForm (CaseID, FormID)
);
CREATE TABLE Form (
ID int NOT NULL PRIMARY KEY,
FieldsJson varchar (max) NOT NULL
);
Am I overcomplicating the schema, since the same many-to-many relationship can be achieved by turning the CaseFormAttribute table into the join table and getting rid of the CaseForm table altogether, as follows?
CREATE TABLE CaseFormAttribute (
ID int NOT NULL PRIMARY KEY,
CaseID int NOT NULL FOREIGN KEY REFERENCES [Case] (ID),
FormID int NOT NULL FOREIGN KEY REFERENCES Form (ID),
Name varchar(255) NOT NULL,
Value varchar(max) NULL
);
Basically what I'm trying to ask is which is the better design?
The main benefit of splitting up the two would depend on whether or not additional fields would ever be added to the CaseForm table. For instance, say that you want to record if a Form is incomplete. You may add an Incomplete bit field to that effect. Now, you have two main options for retrieving that information:
1. Clustered index scan on CaseForm
2. Create a nonclustered index on CaseForm.Incomplete which includes CaseID, FormID, and scan that
If you didn't split the tables, your two main options would be:
3. Clustered index scan on CaseFormAttribute
4. Create a nonclustered index on CaseFormAttribute.Incomplete which includes CaseID, FormID, and scan that
For the purposes of this example, query options 1 and 2 are roughly the same in terms of performance. Introducing the nonclustered index adds overhead in multiple ways. It's a little less streamlined than the clustered index (it may take more reads to scan in this particular example), it's additional storage space that CaseForm will take up, and the index has to be maintained for updates to the table. Option 4 will also perform similarly, with the same caveats as option 2. Option 3 will be your worst performer, as a clustered index scan will include reading all of the BLOB data in your Value field, even though it only needs the bit in Incomplete to determine whether or not to return that (Case, Form) pair.
So it really does depend on what direction you're going in the future.
Also, if you stay with the split approach, consider shifting CaseFormAttribute.ID to CaseForm, and then use CaseForm.ID as your PK/FK in CaseFormAttribute. The caveat here is that we're assuming that all Forms will be inserted at the same time for a given Case. If that's not true, then you would invite some page splits because your inserts will be somewhat random, though still generally increasing.
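One way to read that last suggestion is sketched below. The constraint names and the choice of (CaseFormID, Name) as the attribute key are assumptions made here, not something the answer spells out, and [Case] and Form are assumed to exist already.

```sql
-- Sketch only: constraint names and the attribute key are assumptions.
CREATE TABLE CaseForm (
    ID int NOT NULL PRIMARY KEY,                        -- surrogate key moved here
    CaseID int NOT NULL FOREIGN KEY REFERENCES [Case] (ID),
    FormID int NOT NULL FOREIGN KEY REFERENCES Form (ID),
    CONSTRAINT UQ_CaseForm UNIQUE (CaseID, FormID)      -- still one row per pair
);

CREATE TABLE CaseFormAttribute (
    CaseFormID int NOT NULL FOREIGN KEY REFERENCES CaseForm (ID),
    Name varchar(255) NOT NULL,
    Value varchar(max) NULL,
    -- Attributes of one form submission are stored next to each other.
    CONSTRAINT PK_CaseFormAttribute PRIMARY KEY (CaseFormID, Name)
);
```

With this layout the attribute table carries one narrow foreign key instead of two, and any extra per-form flag such as Incomplete stays in the small CaseForm table.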

Why would a database architect choose to de-normalize referenced child tables

Why would a DBA choose to have a large, heavily referenced lookup table instead of several small, dedicated lookup tables with only one or two tables referencing each one? For example:
CREATE TABLE value_group (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
group_name VARCHAR(30) NOT NULL
);
CREATE TABLE value_group_value (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
value_group_id INT NOT NULL,
value_id INT NOT NULL,
FOREIGN KEY (value_group_id) REFERENCES value_group(id)
);
CREATE TABLE value (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
value_text VARCHAR(30) NOT NULL
);
Example groups would be something along the lines of:
'State Abbreviation' with the corresponding values being a list of all the U.S. state abbreviations.
'Name Prefix' with the corresponding values being a list of strings such as 'Mr.', 'Mrs.', 'Dr.', etc.
In my experience, normalizing these value tables into one table per value_group makes changes easier, provides clarity, and makes queries perform faster:
CREATE TABLE state_abbrv (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
abbreviation CHAR(2) NOT NULL
);
CREATE TABLE name_prefix (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
prefix VARCHAR(30) NOT NULL
);
With n tables like that for n groups in the value_group table. Each of these new tables could then be directly referenced from another table or using some intermediary table depending on the desired relationship.
What factors would influence a DBA to use the first setup described over the second?
In my experience, the primary advantages of a single, standardized "table of tables" structure for lookups are code reuse, simplified documentation (if you're in the 1% of folks who document your database, that is), and the ability to add new lookup tables without changing the database structure.
And if I had a dollar for every time I saw something in a database that made me wonder "what was the DBA thinking?", I could retire to the Bahamas.
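To make the code-reuse point concrete, here is a small sketch against the generic tables from the question. The same query shape serves every lookup; only the group name changes. The 'Name Suffix' group is a made-up example.

```sql
-- Sketch only: one query shape works for any lookup group.
SELECT v.id, v.value_text
FROM value_group AS g
JOIN value_group_value AS gv ON gv.value_group_id = g.id
JOIN value AS v ON v.id = gv.value_id
WHERE g.group_name = 'State Abbreviation'
ORDER BY v.value_text;

-- Adding a new lookup is data, not DDL ('Name Suffix' is hypothetical).
INSERT INTO value_group (group_name) VALUES ('Name Suffix');
```

The trade-off is visible in the same query: every lookup costs two extra joins, and nothing stops a referencing table from pointing at a value that belongs to the wrong group.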

Inheritance in SQL. How to guarantee it?

I am trying to create inheritance as in a C# object using SQL Server and I have:
create table dbo.Evaluations
(
Id int not null primary key clustered,
Created datetime not null
);
create table dbo.Exams
(
Id int not null,
Value int not null
-- Some other fields
);
create table dbo.Tests
(
Id int not null,
Order int not null
-- Some other fields
);
alter table dbo.Exams
add constraint FK_Exams_Id foreign key (Id) references dbo.Evaluations(Id);
alter table dbo.Tests
add constraint FK_Tests_Id foreign key (Id) references dbo.Evaluations(Id);
Which would translate to:
public class Evaluation {}
public class Exam : Evaluation {}
public class Test : Evaluation {}
I think this is the way to go but I have a problem:
How to force that an Evaluation has only one Test or one Exam but not both?
To find which type of evaluation I have I can check exam or test for null. But should I have an EvaluationType in Evaluations table instead?
NOTE:
In reality I have 4 subtypes each one with around 40 to 60 different columns.
And in the Evaluations table I have around 20 common columns, which are also the ones I query most often to build lists.
First, don't use reserved words such as order for column names.
You have a couple of choices on what to do. For this simple example, I would suggest just having the two foreign key references in the evaluation table, along with some constraints and computed columns. Something like this:
create table dbo.Evaluations
(
EvaluationId int not null primary key clustered,
ExamId int references exams(ExamId),
TestId int references tests(TestId),
Created datetime not null,
EvaluationType as (case when ExamId is not null then 'Exam' when TestId is not null then 'Test' end),
check (not (ExamId is not null and TestId is not null))
);
This approach gets less practical if you have lots of subtypes. For your case, though, it provides the following:
Foreign key references to the subtables.
A column specifying the type.
A validation that at most one type is set for each evaluation.
It does have a slight overhead of storing the extra, unused id, but that is a small overhead.
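A quick way to see the constraint at work is sketched below. It assumes the table definition above, and that exams already contains a row with ExamId = 1 and tests a row with TestId = 1.

```sql
-- Assumes exams has ExamId = 1 and tests has TestId = 1 (hypothetical data).

-- Accepted: only ExamId is set, so EvaluationType is computed as 'Exam'.
INSERT INTO dbo.Evaluations (EvaluationId, ExamId, TestId, Created)
VALUES (1, 1, NULL, GETDATE());

-- Rejected by the CHECK constraint: both ExamId and TestId are set.
INSERT INTO dbo.Evaluations (EvaluationId, ExamId, TestId, Created)
VALUES (2, 1, 1, GETDATE());
```

Note that the check as written also accepts a row where both ids are NULL. If every evaluation must have exactly one subtype, the condition would need to require that one of them is not null as well.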
EDIT:
With four subtypes, you can go in the other direction of having a single reference and type in the parent table and then using conditional columns and indexes to enforce the constraints:
create table dbo.Evaluations
(
EvaluationId int not null primary key clustered,
EvaluationType varchar(255) not null,
ChildId int not null,
CreatedAt datetime not null,
-- Computed columns must be persisted to take part in a foreign key
ExamId as (case when EvaluationType = 'Exam' then ChildId end) persisted,
TestId as (case when EvaluationType = 'Test' then ChildId end) persisted,
Other1Id as (case when EvaluationType = 'Other1' then ChildId end) persisted,
Other2Id as (case when EvaluationType = 'Other2' then ChildId end) persisted,
Foreign Key (ExamId) references exams(ExamId),
Foreign Key (TestId) references tests(TestId),
Foreign Key (Other1Id) references other1(Other1Id),
Foreign Key (Other2Id) references other2(Other2Id)
);
In some ways, this is the better solution to the problem. It minimizes storage and is extensible for additional types. Note that it is using computed columns for the foreign key references, so it is still maintaining relational integrity.
In my experience, the best approach is to include all columns in one table.
The relational model does not map well onto object-oriented design.
If you treat every class as one table, you can run into performance problems with a large number of rows in the "base table" (base class), or you can suffer from a lot of joins if you have several levels of inheritance.
If you want to minimize the work needed to get a correct structure, write your own tool that generates create/alter scripts for the tables of the chosen classes. It's in fact pretty easy. You can then generate your data access layer as well. The result is an automated worker: you can focus on complex tasks and delegate the work for "trained monkeys" to the computer rather than to humans.

SQL one-to-many

I am trying to build an SQL schema for a system where we have channels, each with an id, and one or more fixtures. I am having difficulty finding a way to implement this one-to-many mapping. (i.e. One channel to many fixtures). I am using the H2 database engine.
I cannot have a table :
id | fixture
----|----------
1 | 1
1 | 2
2 | 3
CREATE TABLE channel(
id INT NOT NULL PRIMARY KEY,
fixture INT NOT NULL
);
... as the PRIMARY KEY id must be UNIQUE.
Similarly, I cannot map as follows:
CREATE TABLE channel(
id INT NOT NULL PRIMARY KEY,
f_set INT NOT NULL REFERENCES fixtures(f_set)
);
CREATE TABLE fixtures(
id INT NOT NULL PRIMARY KEY,
f_set INT NOT NULL
);
... as this requires f_set to be UNIQUE.
I am currently implementing it as follows:
CREATE TABLE channel(
id INT NOT NULL PRIMARY KEY,
f_set INT NOT NULL REFERENCES fixture_set(id)
);
CREATE TABLE fixtures(
id INT NOT NULL PRIMARY KEY,
f_set INT NOT NULL REFERENCES fixture_set(id)
);
CREATE TABLE fixture_set(
id INT NOT NULL PRIMARY KEY
);
... but this means that we can have a channel with a fixture_set which does not have any assigned fixtures (Not ideal).
I was wondering if you had any suggestions for how I might approach this (or where my understanding is wrong). Thanks
"One-to-many" means that many items (may) reference one item. If it's one channel to many fixtures, then fixtures should reference channels, not the other way round, which means the reference column should be in the fixtures table:
CREATE TABLE channel(
id INT NOT NULL PRIMARY KEY
);
CREATE TABLE fixtures(
id INT NOT NULL PRIMARY KEY,
channel_id INT NOT NULL REFERENCES channel(id)
);
You can add a CONSTRAINT just to check it.
Sorry for not pasting a snippet... I don't know anything about H2 specifics.
Or you could avoid the fixture-set concept altogether.
Then you would just need:
channel table, with just the id (plus other fields not involved on that matter, of course)
a channelfixtures table, with channelId and fixtureId. Primary key would be composed of (channelId, fixtureId)
a fixture table, only if you need it.
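A minimal sketch of that three-table layout follows, using the inline REFERENCES form that the question's own H2 snippets use. Table and column names follow the wording above; nothing else is assumed about the real schema.

```sql
-- Sketch only: names follow the answer's wording.
CREATE TABLE channel(
    id INT NOT NULL PRIMARY KEY
);
CREATE TABLE fixture(
    id INT NOT NULL PRIMARY KEY
);
CREATE TABLE channelfixtures(
    channelId INT NOT NULL REFERENCES channel(id),
    fixtureId INT NOT NULL REFERENCES fixture(id),
    -- Composite key: a fixture can be attached to a channel only once.
    PRIMARY KEY (channelId, fixtureId)
);

-- Fixtures attached to channel 1.
SELECT fixtureId FROM channelfixtures WHERE channelId = 1;
```

As written, this permits the same fixture on several channels (many-to-many). If each fixture must belong to exactly one channel, a UNIQUE constraint on fixtureId, or simply a channel_id column on the fixture table, would express that.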