Foreign Key Relationships and "belongs to many" - sql

edit - Based on the responses below, I'm going to revisit my design. I think I can avoid this mess by being a little bit more clever with how I set out my business objects and rules. Thanks everyone for your help!
--
I have the following model:
S belongs to T
T has many S
A,B,C,D,E (etc) have 1 T each, so the T should belong to each of A,B,C,D,E (etc)
At first I set up my foreign keys so that in A, fk_a_t would be the foreign key on A.t to T(id), in B it'd be fk_b_t, etc. Everything looks fine in my UML (using MySQL Workbench), but generating the Yii models results in it thinking that T has many A,B,C,D (etc), which to me is the reverse.
It sounds to me like I either need to have A_T, B_T, C_T (etc) tables, which would be a pain as there are a lot of tables that have this relationship, or use some sort of behavior, which from what I've googled is the better way, such that A,B,C,D (etc) can behave as a T. I'm not clear on exactly how to do the latter (I will continue to google more on this).
EDIT - to clarify, a T can only belong to one of A, or B, or C, (etc) and not two A's, nor an A and a B (that is, it is not a many to many). My question is in regards to how to describe this relationship in the Yii Framework models - e.g., (A,B,C,D,...) HAS_ONE T, and T belongs to (A,B,C,D,...). From a business use case, this all makes sense, but I'm not sure if I have it correctly set up in the database, or if I do, whether I need to use a "behavior" in Yii to make it understand the relationship. @rwmnau I understand what you mean, I hope my clarification helps.
UML:
Here's the DDL (auto generated). Just pretend that there are more than 3 tables referencing T.
-- -----------------------------------------------------
-- Table `mydb`.`T`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`T` (
`id` INT NOT NULL AUTO_INCREMENT ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB;
-- -----------------------------------------------------
-- Table `mydb`.`S`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`S` (
`id` INT NOT NULL AUTO_INCREMENT ,
`thing` VARCHAR(45) NULL ,
`t` INT NOT NULL ,
PRIMARY KEY (`id`) ,
INDEX `fk_S_T` (`t` ASC) ,
CONSTRAINT `fk_S_T`
FOREIGN KEY (`t` )
REFERENCES `mydb`.`T` (`id` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
-- -----------------------------------------------------
-- Table `mydb`.`A`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`A` (
`id` INT NOT NULL AUTO_INCREMENT ,
`T` INT NOT NULL ,
`stuff` VARCHAR(45) NULL ,
`bar` VARCHAR(45) NULL ,
`foo` VARCHAR(45) NULL ,
PRIMARY KEY (`id`) ,
INDEX `fk_A_T` (`T` ASC) ,
CONSTRAINT `fk_A_T`
FOREIGN KEY (`T` )
REFERENCES `mydb`.`T` (`id` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
-- -----------------------------------------------------
-- Table `mydb`.`B`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`B` (
`id` INT NOT NULL AUTO_INCREMENT ,
`T` INT NOT NULL ,
`stuff2` VARCHAR(45) NULL ,
`foobar` VARCHAR(45) NULL ,
`other` VARCHAR(45) NULL ,
PRIMARY KEY (`id`) ,
INDEX `fk_B_T` (`T` ASC) ,
CONSTRAINT `fk_B_T`
FOREIGN KEY (`T` )
REFERENCES `mydb`.`T` (`id` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
-- -----------------------------------------------------
-- Table `mydb`.`C`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `mydb`.`C` (
`id` INT NOT NULL AUTO_INCREMENT ,
`T` INT NOT NULL ,
`stuff3` VARCHAR(45) NULL ,
`foobar2` VARCHAR(45) NULL ,
`other4` VARCHAR(45) NULL ,
PRIMARY KEY (`id`) ,
INDEX `fk_C_T` (`T` ASC) ,
CONSTRAINT `fk_C_T`
FOREIGN KEY (`T` )
REFERENCES `mydb`.`T` (`id` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;

Your problem is in part that you have no way to distinguish which of the tables a T is related to.
Further, if a T may match only one record across three or four other tables, this is not a normal relationship and cannot be modelled using normal techniques. A trigger can ensure this is true, but with only the id column in T, what prevents it from matching an id of 10 in table A and an id of 10 in table C (violating the rules)?
BTW, naming columns ID is usually a poor choice for maintenance. It is much clearer what is going on if you name PK columns after their table and use the exact name of the PK for FKs.
An alternative solution for you is to have in the middle table a column for each type of id and a trigger to ensure that only one of them has values, but this is a pain to query if you need all the ids. A compound PK of id and idtype could work to ensure no repeats within a type, but to have no repeats at all, you will need a trigger.

This is a dilemma that comes up fairly regularly, and there is no perfect solution IMHO.
However I would recommend the following:
Combine the S and T table. I don't see any real need for the T table.
Invert the way the A/B/C tables relate to the S (formerly T) table. By this I mean remove the FK on the A/B/C side and create nullable FK columns on the S side. So now your S table has three additional nullable columns: A_ID, B_ID, C_ID.
Create a check constraint on the S table, ensuring that exactly one of these columns always has a value (or none of them has a value if that is allowed).
If having exactly one value is the rule, you can also create a unique constraint across these three columns to ensure that only one S can be related to an A/B/C.
If no value in any of these columns is allowed, the above rule will have to be enforced with a check constraint as well.
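The nullable-FK-plus-check idea above can be sketched roughly as follows. This is a minimal illustration run against SQLite through Python's sqlite3 module so that it is self-contained; the column names a_id, b_id, c_id follow the answer, while the per-column UNIQUE and the helper try_insert are my own assumptions. CHECK syntax and enforcement differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY);
CREATE TABLE B (id INTEGER PRIMARY KEY);
CREATE TABLE C (id INTEGER PRIMARY KEY);
CREATE TABLE S (
    id   INTEGER PRIMARY KEY,
    a_id INTEGER UNIQUE REFERENCES A(id),
    b_id INTEGER UNIQUE REFERENCES B(id),
    c_id INTEGER UNIQUE REFERENCES C(id),
    -- exactly one of the three owner columns must be filled in
    CHECK ((a_id IS NOT NULL) + (b_id IS NOT NULL) + (c_id IS NOT NULL) = 1)
);
INSERT INTO A (id) VALUES (1);
INSERT INTO A (id) VALUES (2);
INSERT INTO B (id) VALUES (1);
""")

def try_insert(a_id, b_id, c_id):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO S (a_id, b_id, c_id) VALUES (?, ?, ?)",
                (a_id, b_id, c_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

one_owner = try_insert(1, None, None)    # accepted
two_owners = try_insert(2, 1, None)      # rejected by the CHECK
no_owner = try_insert(None, None, None)  # rejected by the CHECK
same_owner = try_insert(1, None, None)   # rejected by UNIQUE on a_id
```

If rows with no owner at all should be allowed, the comparison in the CHECK would become `<= 1` instead of `= 1`.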
Update After Your Comment
Ok, then I would forget about inverting the relationships, and keep the FK on the A/B/C side. I would still enforce the uniqueness of usage using a check constraint, but it would need to cross tables and will likely look different for each flavor of SQL (e.g. SQL Server requires a UDF to go across tables in a check constraint). I still think you can nuke the T table.
Regarding the ORM side of things, I don't know yii at all, so can't speak to that. But if you enforce the relationship at the database level, how you implement it via code shouldn't matter, as the database is responsible for the integrity of the data (they will just look like vanilla relationships to the ORM). However, it may present a problem with trapping the specific error that comes up if at runtime the check constraint's rule is violated.
I should also mention that if there is a large (or even reasonably large) amount of data going into the tables in question, the approach I am recommending might not be the best, as your check constraint will have to check all 20 tables to enforce the rule.

You only require a table in the middle if it's a many-to-many relationship, and it doesn't sound like that's the case, so don't worry about those.
Your question isn't clear - can a T belong to more than 1 A, more than 1 B, and so on? Or does a single T belong to each of A-E, and to no others? It's the difference between a 1-to-1 relationship (each T has exactly one each of A-E) and a 1-to-many relationship (each A-E has exactly 1 T, but a T can belong to many A, many B, and so on). Does this make sense?
Also, I'd second the request for some more info in your question to help solidify what you're asking for.

I had to face a similar situation some weeks ago (not my own db; I prefer to combine all the tables into one, except in very specific situations).
The solution I implemented was: in the "T" model file I did something like this in the relations() function:
'id_1' => array(self::BELONGS_TO, 'A', 'id'),
'id_2' => array(self::BELONGS_TO, 'B', 'id'),
'id_3' => array(self::BELONGS_TO, 'C', 'id'),
I hope this helps you.
Regards.

Related

Do I really need PRIMARY KEY when using UNIQUE NOT NULL columns?

My knowledge in SQL is limited and I would appreciate someone who could help me to clarify the use of PRIMARY KEY in the following circumstances. I created a table to support ISO country information. I'm using MariaDB 10 but I believe that will not be relevant for the kind of questions I have(?)
CREATE TABLE IF NOT EXISTS python.country
(
iso_code INTEGER( 3) NOT NULL ,
iso_2_alpha VARCHAR( 2) NOT NULL ,
iso_3_alpha VARCHAR( 3) NOT NULL ,
short_name VARCHAR( 32) NOT NULL ,
long_name VARCHAR( 64) NOT NULL ,
flag_link VARCHAR(2000) DEFAULT(NULL),
CONSTRAINT CK_iso_code CHECK (iso_code > 0 AND iso_code <= 999) ,
CONSTRAINT CK_iso_alpha CHECK (
iso_2_alpha RLIKE BINARY '^[A-Z]+$' AND LENGTH(iso_2_alpha) = 2
AND
iso_3_alpha RLIKE BINARY '^[A-Z]+$' AND LENGTH(iso_3_alpha) = 3
) ,
CONSTRAINT CK_names CHECK (
short_name RLIKE '^\\p{L}+(\\.?[[:blank:]]\\p{L}+)*\\p{L}+$'
AND
long_name RLIKE '^\\p{L}+(\\.?[[:blank:]]\\p{L}+)*\\p{L}+$'
) ,
CONSTRAINT UN_short_name UNIQUE (short_name) ,
CONSTRAINT UN_long_name UNIQUE (long_name) ,
CONSTRAINT UN_iso_2_alpha UNIQUE (iso_2_alpha) ,
CONSTRAINT UN_iso_3_alpha UNIQUE (iso_3_alpha)
-- ???
-- CONSTRAINT PK_country PRIMARY KEY (iso_code,iso_2_alpha,iso_3_alpha)
); -- ENGINE = 'InnoDB';
Question 1: Since all main columns (iso_code,iso_2_alpha,iso_3_alpha) are NOT NULL and UNIQUE, does it make sense to create a composite PRIMARY KEY? I "believe" it's a waste of space and time when inserting new elements?
Question 2: Can I safely use iso_code as the FOREIGN KEY in another table?
Many thanks.
Since all main columns (iso_code,iso_2_alpha,iso_3_alpha) are NOT NULL and UNIQUE, does it make sense to create a composite PRIMARY KEY? I "believe" it's a waste of space and time when inserting new elements?
Your proposed PK is a superkey over existing keys. It's not necessary in and of itself. You could choose to declare one of your unique key constraints as a PK instead but it's not necessary.
Can I safely use iso_code as the FOREIGN KEY in another table?
If you also mark iso_code as a unique key in this table, that should work fine.
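As a rough illustration of that point, here is a sketch using SQLite via Python's sqlite3 module (chosen only so the example is self-contained; MariaDB syntax differs slightly). The city table and its columns are hypothetical and not part of the question. The parent table has no PRIMARY KEY at all, only NOT NULL UNIQUE columns, and the foreign key still works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE country (
    iso_code    INTEGER     NOT NULL UNIQUE,
    iso_2_alpha VARCHAR(2)  NOT NULL UNIQUE,
    short_name  VARCHAR(32) NOT NULL UNIQUE
);
-- hypothetical child table, not part of the question
CREATE TABLE city (
    name         VARCHAR(64) NOT NULL,
    country_code INTEGER     NOT NULL REFERENCES country (iso_code)
);
INSERT INTO country VALUES (620, 'PT', 'Portugal');
INSERT INTO city VALUES ('Lisbon', 620);
""")

# A city pointing at a country code that does not exist should be refused.
try:
    conn.execute("INSERT INTO city VALUES ('Atlantis', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

city_count = conn.execute("SELECT COUNT(*) FROM city").fetchone()[0]
```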
Some people would recommend that every table always have an autogenerated column marked as PK. That's fine so long as you also enforce the logical keys. Unfortunately, many people will just create that auto-PK and no other keys, which means your data is nonsense.
You've chosen (currently) to just have the logical keys. I think that's fine in this case, especially as several (iso_code, iso_2_alpha and iso_3_alpha) are likely to be more compact than the recommended autogenerated column.
Can't comment on performance and efficiency, but one thing with composite keys is that when you use them as a primary key, you have to repeat them in your foreign key. I.e., with a PK of iso_code, iso_2_alpha, iso_3_alpha, all three become additional FK columns in every related table. You then also have to query by these 3 columns in your SQL queries. Bit of a PITA IMO when you can simply use a generic, unique self-generating column.
If you can use iso_code and you are sure you never ever ever will have the chance to require inserting a duplicate iso_code that has a different iso_2_alpha, iso_3_alpha then go ahead. But, you should future proof and make table more robust and anticipate the unexpected, use a new dedicated id column unrelated to the business, IMHO.

Problems on having a field that will be null very often on a table in SQL Server

I have a column that sometimes will be null. This column is also a foreign key, so I want to know if I'll have problems with performance or with data consistency if this column will have weight
I know it's a foolish question but I want to be sure.
There is not necessarily a problem with this, other than that it is likely an indication that you have a poorly normalized design. There might be performance implications due to the way indexes are structured and the sparseness of the column with nulls, but without knowing your structure or intended querying scenarios any conclusions one might draw would be pure speculation.
A better solution might be a shared primary key, where table A has a primary key and there are zero or one records in B with the same primary key.
If table A can have one or zero B, but more than one A can refer to B, then what you have is a one to many relationship. This can be represented as Pieter laid out in his answer. This allows multiple A records to refer to the same B, and in turn each B may optionally refer to an A.
So you see there are two optional structures to address this problem, and choosing between them is not guesswork. There is a distinct rationale for why you would choose one or the other, but it depends on the nature of the relationships you are modelling.
Instead of this design:
create table Detail (
ID int identity not null primary key
)
go
create table Master (
ID int identity not null primary key,
DetailID int null references Detail(ID)
)
go
consider this instead
create table Master (
ID int identity not null primary key
)
go
create table Detail (
ID int identity not null primary key,
MasterID int not null references Master(ID)
)
go
Now the Foreign Key is never null, rather the existence (or not) of the Detail record indicates whether it exists.
If a Detail can exist for multiple records, create a mapping table to manage the relationship.
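To make the second design concrete, here is a small sketch using SQLite through Python's sqlite3 module (an assumption made for portability; the answer itself targets SQL Server, where `identity` would be used for the keys). The sample rows and the has_detail alias are invented for illustration. The foreign key column is NOT NULL, and "no detail" shows up simply as a missing row in a LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Master (
    ID INTEGER PRIMARY KEY
);
CREATE TABLE Detail (
    ID       INTEGER PRIMARY KEY,
    MasterID INTEGER NOT NULL REFERENCES Master(ID)
);
INSERT INTO Master (ID) VALUES (1);
INSERT INTO Master (ID) VALUES (2);
-- only Master 1 has a Detail row
INSERT INTO Detail (ID, MasterID) VALUES (10, 1);
""")

# The absence of a Detail row, not a NULL foreign key, marks "no detail".
rows = conn.execute("""
    SELECT m.ID, d.ID IS NOT NULL AS has_detail
    FROM Master AS m
    LEFT JOIN Detail AS d ON d.MasterID = m.ID
    ORDER BY m.ID
""").fetchall()
```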

ORACLE Table design: M:N table best practice

I'd like to hear your suggestions on this very basic question:
Imagine these three tables:
--DROP TABLE a_to_b;
--DROP TABLE a;
--DROP TABLE b;
CREATE TABLE A
(
ID NUMBER NOT NULL ,
NAME VARCHAR2(20) NOT NULL ,
CONSTRAINT A_PK PRIMARY KEY ( ID ) ENABLE
);
CREATE TABLE B
(
ID NUMBER NOT NULL ,
NAME VARCHAR2(20) NOT NULL ,
CONSTRAINT B_PK PRIMARY KEY ( ID ) ENABLE
);
CREATE TABLE A_TO_B
(
id NUMBER NOT NULL,
a_id NUMBER NOT NULL,
b_id NUMBER NOT NULL,
somevalue1 VARCHAR2(20) NOT NULL,
somevalue2 VARCHAR2(20) NOT NULL,
somevalue3 VARCHAR2(20) NOT NULL
) ;
How would you design table a_to_b?
I'll give some discussion starters:
synthetic id-PK column or combined a_id,b_id-PK (dropping the "id" column)
When synthetic: What other indices/constraints?
When combined: Also index on b_id? Or even b_id,a_id (don't think so)?
Also combined when these entries are referenced themselves?
Also combined when these entries perhaps are referenced themselves in the future?
Heap or Index-organized table
Always or only up to x "somevalue"-columns?
I know that the decision for one of the designs is closely related to the question how the table will be used (read/write ratio, density, etc.), but perhaps we get a 20/80 solution as blueprint for future readers.
I'm looking forward to your ideas!
Blama
I have always made the PK be the combination of the two FKs, a_id and b_id in your example. Adding a synthetic id field to this table does no good, since you never end up looking for a row based on a knowledge of its id.
Using the compound PK gives you a constraint that prevents the same instance of the relationship between a and b from being inserted twice. If duplicate entries need to be permitted, there's something wrong with your data model at the conceptual level.
The index you get behind the scenes (for every DBMS I know of) will be useful to speed up common joins. An extra index on b_id is sometimes useful, depending on the kinds of joins you do frequently.
Just as a side note, I don't use the name "id" for all my synthetic pk columns. I prefer a_id, b_id. It makes it easier to manage the metadata, even though it's a little extra typing.
CREATE TABLE A_TO_B
(
a_id NUMBER NOT NULL REFERENCES A (a_id),
b_id NUMBER NOT NULL REFERENCES B (b_id),
PRIMARY KEY (a_id, b_id),
...
) ;
It's not unusual for ORMs to require (or, in more clueful ORMs, hope for) an integer column named "id" in addition to whatever other keys you have. Apart from that, there's no need for it. An id number like that makes the table wider (which usually degrades I/O performance just slightly), and adds an index that is, strictly speaking, unnecessary. It isn't necessary to identify the entity--the existing key does that--and it leads new developers into bad habits. (Specifically, giving every table an integer column named "id", and believing that that column alone is the only key you need.)
You're likely to need one or more of these indexed.
a_id
b_id
{a_id, b_id}
{b_id, a_id}
I believe Oracle should automatically index {a_id, b_id}, because that's the primary key. Oracle doesn't automatically index foreign keys. Oracle's indexing guidelines are online.
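A minimal sketch of the compound-key layout with a second index for lookups that start from b. It runs on SQLite through Python's sqlite3 module rather than Oracle, purely so it can be executed anywhere; the index name a_to_b_b_a and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE a (a_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE b (b_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE a_to_b (
    a_id       INTEGER NOT NULL REFERENCES a (a_id),
    b_id       INTEGER NOT NULL REFERENCES b (b_id),
    somevalue1 TEXT    NOT NULL,
    PRIMARY KEY (a_id, b_id)   -- also gives an index on {a_id, b_id}
);
-- secondary index for joins that go from b to a
CREATE INDEX a_to_b_b_a ON a_to_b (b_id, a_id);
INSERT INTO a VALUES (1, 'a1');
INSERT INTO b VALUES (1, 'b1');
INSERT INTO a_to_b VALUES (1, 1, 'first');
""")

# The compound primary key refuses a second row for the same (a, b) pair.
try:
    conn.execute("INSERT INTO a_to_b VALUES (1, 1, 'again')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```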
In general, you need to think carefully about whether you need ON UPDATE CASCADE or ON DELETE CASCADE. In Oracle, you only need to think carefully about whether you need ON DELETE CASCADE. (Oracle doesn't support ON UPDATE CASCADE.)
The other comments so far are good.
Also consider adding begin_dt and end_dt to the relationship. In this way, you can answer a good number of questions about each relationship through time (consider baseline issues).

Complex Foreign Key Constraint in SQL

Is there a way to define a constraint using SQL Server 2005 to not only ensure a foreign key exists in another table, but also meets a certain criteria?
For example, say I have two tables:
Table A
--------
Id - int
FK_BId - int
Table B
--------
Id - int
Name - string
SomeBoolean - bit
Can I define a constraint that says FK_BId must point to a record in Table B, AND that record in Table B must have SomeBoolean = true? Thanks in advance for any help you can provide.
You can enforce the business rule using a composite key on (Id, SomeBoolean), reference this in table A with a CHECK constraint on FK_BSomeBoolean to ensure it is always TRUE. BTW I'd recommend avoiding BIT and instead using CHAR(1) with domain checking e.g.
CHECK (SomeBoolean IN ('F', 'T'))
The table structure could look like this:
CREATE TABLE B
(
Id INTEGER NOT NULL UNIQUE, -- candidate key 1
Name VARCHAR(20) NOT NULL UNIQUE, -- candidate key 2
SomeBoolean CHAR(1) DEFAULT 'F' NOT NULL
CHECK (SomeBoolean IN ('F', 'T')),
UNIQUE (Id, SomeBoolean) -- superkey
);
CREATE TABLE A
(
Id INTEGER NOT NULL UNIQUE,
FK_BId INTEGER NOT NULL,
FK_BSomeBoolean CHAR(1) DEFAULT 'T' NOT NULL
CHECK (FK_BSomeBoolean = 'T'),
FOREIGN KEY (FK_BId, FK_BSomeBoolean)
REFERENCES B (Id, SomeBoolean)
);
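To see how the composite key and the CHECK work together, here is a sketch of the same two tables exercised on SQLite via Python's sqlite3 module (SQL Server 2005 is the question's target, so treat this only as an illustration of the technique). The helper can_reference and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE B (
    Id          INTEGER NOT NULL UNIQUE,
    SomeBoolean CHAR(1) NOT NULL DEFAULT 'F'
                CHECK (SomeBoolean IN ('F', 'T')),
    UNIQUE (Id, SomeBoolean)            -- superkey targeted by the FK
);
CREATE TABLE A (
    Id              INTEGER NOT NULL UNIQUE,
    FK_BId          INTEGER NOT NULL,
    FK_BSomeBoolean CHAR(1) NOT NULL DEFAULT 'T'
                    CHECK (FK_BSomeBoolean = 'T'),
    FOREIGN KEY (FK_BId, FK_BSomeBoolean) REFERENCES B (Id, SomeBoolean)
);
INSERT INTO B VALUES (1, 'T');
INSERT INTO B VALUES (2, 'F');
""")

def can_reference(a_id, b_id, flag="T"):
    """Hypothetical helper: True if a row in A may point at the given B row."""
    try:
        with conn:
            conn.execute("INSERT INTO A VALUES (?, ?, ?)", (a_id, b_id, flag))
        return True
    except sqlite3.IntegrityError:
        return False

true_row = can_reference(1, 1)          # B(1) is 'T': accepted
false_row = can_reference(2, 2)         # no (2, 'T') in B: FK fails
cheating = can_reference(3, 2, "F")     # CHECK (FK_BSomeBoolean = 'T') fails

# Flipping a referenced B row to 'F' is refused as well.
try:
    with conn:
        conn.execute("UPDATE B SET SomeBoolean = 'F' WHERE Id = 1")
    flip_allowed = True
except sqlite3.IntegrityError:
    flip_allowed = False
```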
I think what you're looking for is out of the scope of foreign keys, but you could do the check in triggers, stored procedures, or your code.
If it is possible to do, I'd say that you would make it a compound foreign key, using ID and SomeBoolean, but I don't think it actually cares what the value is.
In some databases (I can't check SQL Server) you can add a check constraint that references other tables.
ALTER TABLE a ADD CONSTRAINT fancy_fk
CHECK (FK_BId IN (SELECT Id FROM b WHERE SomeBoolean));
I don’t believe this behavior is standard.

Multiple yet mutually exclusive foreign keys - is this the way to go?

I have three tables: Users, Companies and Websites.
Users and companies have websites, and thus each user record has a foreign key into the Websites table. Also, each company record has a foreign key into the Websites table.
Now I want to include foreign keys in the Websites table back into their respective "parent" records. How do I do that? Should I have two foreign keys in each website record, with one of them always NULL? Or is there another way to go?
If we look into the model here, we will see the following:
A user is related to exactly one website
A company is related to exactly one website
A website is related to exactly one user or company
The third relation implies existence of a "user or company" entity whose PRIMARY KEY should be stored somewhere.
To store it you need to create a table that would store a PRIMARY KEY of a website owner entity. This table can also store attributes common for a user and a company.
Since it's a one-to-one relation, website attributes can be stored in this table too.
The attributes not shared by users and companies should be stored in the separate table.
To force the correct relationships, you need to make the PRIMARY KEY of the website composite with owner type as a part of it, and force the correct type in the child tables with a CHECK constraint:
CREATE TABLE website_owner (
type INT NOT NULL,
id INT NOT NULL,
-- website_attributes,
-- common_attributes,
CHECK (type IN (1, 2)), -- 1 for user, 2 for company
PRIMARY KEY (type, id)
);
CREATE TABLE user (
type INT NOT NULL,
id INT NOT NULL PRIMARY KEY,
-- user_attributes,
CHECK (type = 1),
FOREIGN KEY (type, id) REFERENCES website_owner (type, id)
);
CREATE TABLE company (
type INT NOT NULL,
id INT NOT NULL PRIMARY KEY,
-- company_attributes,
CHECK (type = 2),
FOREIGN KEY (type, id) REFERENCES website_owner (type, id)
);
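A runnable sketch of this type-discriminator layout, using SQLite through Python's sqlite3 module as a stand-in for whatever DBMS is in use. The url column, the sample owner row and the helper try_insert are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE website_owner (
    type INTEGER NOT NULL CHECK (type IN (1, 2)),  -- 1 = user, 2 = company
    id   INTEGER NOT NULL,
    url  TEXT,                                     -- stand-in website attribute
    PRIMARY KEY (type, id)
);
CREATE TABLE user (
    type INTEGER NOT NULL CHECK (type = 1),
    id   INTEGER NOT NULL PRIMARY KEY,
    FOREIGN KEY (type, id) REFERENCES website_owner (type, id)
);
CREATE TABLE company (
    type INTEGER NOT NULL CHECK (type = 2),
    id   INTEGER NOT NULL PRIMARY KEY,
    FOREIGN KEY (type, id) REFERENCES website_owner (type, id)
);
INSERT INTO website_owner (type, id, url) VALUES (2, 7, 'example.com');
""")

def try_insert(table, type_, id_):
    """Hypothetical helper: True if the child row is accepted."""
    try:
        with conn:
            conn.execute(
                f"INSERT INTO {table} (type, id) VALUES (?, ?)", (type_, id_)
            )
        return True
    except sqlite3.IntegrityError:
        return False

company_ok = try_insert("company", 2, 7)     # matches owner row (2, 7)
user_wrong_owner = try_insert("user", 1, 7)  # no owner row (1, 7): FK fails
user_wrong_type = try_insert("user", 2, 7)   # CHECK (type = 1) fails
```

The owner row of type 2 can only ever be claimed by the company table, which is the mutual exclusion the answer is after.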
You don't need a parent column; you can look up the parents with a simple select (or join) on the users and companies tables. If you want to know whether this is a user or a company website, I suggest using a boolean column in your websites table.
Why do you need a foreign key from website to user/company at all? The principle of not duplicating data would suggest it might be better to scan the user/company tables for a matching website id. If you really need to you could always store a flag in the website table that denotes whether a given website record is for a user or a company, and then scan the appropriate table.
The problem I have with the accepted answer (by Quassnoi) is that the object relationships are the wrong way around: company is not a sub-type of a website owner; we had companies before we had websites and we can have companies who are website owners. Also, it seems to me that website ownership is a relationship between a website and either a person or a company i.e. we should have a relationship table (or two) in the schema. It may be an acceptable approach to keep personal website ownership separate from corporate website ownership and only bring them together when required e.g. via VIEWs:
CREATE TABLE People
(
person_id CHAR(9) NOT NULL UNIQUE, -- external identifier
person_name VARCHAR(100) NOT NULL
);
CREATE TABLE Companies
(
company_id CHAR(6) NOT NULL UNIQUE, -- external identifier
company_name VARCHAR(255) NOT NULL
);
CREATE TABLE Websites
(
url CHAR(255) NOT NULL UNIQUE
);
CREATE TABLE PersonalWebsiteOwnership
(
person_id CHAR(9) NOT NULL UNIQUE
REFERENCES People ( person_id ),
url CHAR(255) NOT NULL UNIQUE
REFERENCES Websites ( url )
);
CREATE TABLE CorporateWebsiteOwnership
(
company_id CHAR(6) NOT NULL UNIQUE
REFERENCES Companies( company_id ),
url CHAR(255) NOT NULL UNIQUE
REFERENCES Websites ( url )
);
CREATE VIEW WebsiteOwnership AS
SELECT url, company_name AS website_owner_name
FROM CorporateWebsiteOwnership
NATURAL JOIN Companies
UNION
SELECT url, person_name AS website_owner_name
FROM PersonalWebsiteOwnership
NATURAL JOIN People;
The problem with the above is there is no way of using database constraints to enforce the rule that a website is either owned by a person or a company but not both.
If we can assume the DBMS enforces check constraints (as the accepted answer does), then we can exploit the fact that a (human) person and a company are both legal persons and employ a super-type table (LegalPersons) while still retaining the relationship table approach (WebsiteOwnership), this time using the VIEWs to separate personal website ownership from corporate website ownership, with strongly typed attributes:
CREATE TABLE LegalPersons
(
legal_person_id INT NOT NULL UNIQUE, -- internal artificial identifier
legal_person_type CHAR(7) NOT NULL
CHECK ( legal_person_type IN ( 'Company', 'Person' ) ),
UNIQUE ( legal_person_type, legal_person_id )
);
CREATE TABLE People
(
legal_person_id INT NOT NULL,
legal_person_type CHAR(7) NOT NULL
CHECK ( legal_person_type = 'Person' ),
UNIQUE ( legal_person_type, legal_person_id ),
FOREIGN KEY ( legal_person_type, legal_person_id )
REFERENCES LegalPersons ( legal_person_type, legal_person_id ),
person_id CHAR(9) NOT NULL UNIQUE, -- external identifier
person_name VARCHAR(100) NOT NULL
);
CREATE TABLE Companies
(
legal_person_id INT NOT NULL,
legal_person_type CHAR(7) NOT NULL
CHECK ( legal_person_type = 'Company' ),
UNIQUE ( legal_person_type, legal_person_id ),
FOREIGN KEY ( legal_person_type, legal_person_id )
REFERENCES LegalPersons ( legal_person_type, legal_person_id ),
company_id CHAR(6) NOT NULL UNIQUE, -- external identifier
company_name VARCHAR(255) NOT NULL
);
CREATE TABLE WebsiteOwnership
(
legal_person_id INT NOT NULL,
legal_person_type CHAR(7) NOT NULL,
UNIQUE ( legal_person_type, legal_person_id ),
FOREIGN KEY ( legal_person_type, legal_person_id )
REFERENCES LegalPersons ( legal_person_type, legal_person_id ),
url CHAR(255) NOT NULL UNIQUE
REFERENCES Websites ( url )
);
CREATE VIEW CorporateWebsiteOwnership AS
SELECT url, company_name
FROM WebsiteOwnership
NATURAL JOIN Companies;
CREATE VIEW PersonalWebsiteOwnership AS
SELECT url, person_name
FROM WebsiteOwnership
NATURAL JOIN People;
What we need are new DBMS features for 'distributed foreign keys' ("For each row in this table there must be exactly one row in one of these tables") and 'multiple assignment' to allow the data to be added into tables thus constrained in a single SQL statement. Sadly we are a long way from getting such features!
First of all, do you really need this bi-directional link? It is a good practice to avoid it unless absolutely needed.
I understand it that you wish to know whether the site belongs to a user or to a company. You can achieve that by having a simple boolean field in the Website table - [BelongsToUser]. If true, then you look up a user, if false - you look up a company.
A bit late, but all the existing answers seemed to fall somewhat short of the mark:
Owner to website is a 1:Many relation
Website to owner is a 1:1 relation
Users and Companies tables should not have a foreign key into the Websites table
None of the website data, common to users and companies or not, should be in the Users or Companies tables
None of the owner's information, common or not, should be in the Websites table
MySQL ignores, silently, CHECK constraints on tables (no enforcement of referential integrity)
The DBMS ought to handle the 'relation' logic, not the application using the database
Some of this is recognized in the answer from onedaywhen, yet that answer still missed the opportunity to make MySQL do the heavy lifting and enforce the referential integrity.
A website can only have one owner, legally, anyway. A person, or company, can have any number of websites, including none. A link in the database from owner to website can only be 1:1 at any level of normalization. In reality the relation is 1:Many, and would require having multiple table entries for each owner that happens to own more than one website. A link from website to owner is 1:1 in both database terms and in reality. Having the link from website to owner represents the model better. With an index in the website table, doing the 1:Many lookup for a given owner becomes reasonably efficient.
The CHECK attribute in SQL would be an excellent solution, if MySQL didn't happen to silently ignore it (as it does in versions before 8.0.16).
MySQL Docs 13.1.20 CREATE TABLE Syntax
The CHECK clause is parsed but ignored by all storage engines.
MySQL's functionality does offer two solutions as work-arounds to implement the behavior of CHECK and keep the referential integrity of the data. Triggers with stored procedures is one, and works well with all manner of constraints. Easier to implement, though less versatile, is using a VIEW with a WITH CHECK OPTION clause, which MySQL will implement.
MySQL Docs 24.5.4 The View WITH CHECK OPTION Clause
The WITH CHECK OPTION clause can be given for an updatable view to prevent inserts to rows for which the WHERE clause in the select_statement is not true. It also prevents updates to rows for which the WHERE clause is true but the update would cause it to be not true (in other words, it prevents visible rows from being updated to nonvisible rows).
The MySQLTUTORIAL site gives a good example of both options in their Introduction to the SQL CHECK constraint tutorial. (You have to think around the typos, but good otherwise.)
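For the trigger route mentioned above, a rough sketch of the idea follows. It uses SQLite via Python's sqlite3 module so it can be run as-is; SQLite's `RAISE(ABORT, ...)` stands in for what would be a `SIGNAL` statement in a MySQL trigger, and the trigger names and helper try_sql are invented for the example. The rule enforced is the one this answer cares about: the owner columns must agree with is_personal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WebsitesData (
    id            INTEGER PRIMARY KEY,
    name          TEXT,
    is_personal   INTEGER NOT NULL,
    owner_user    INTEGER,
    owner_company INTEGER
);
CREATE TRIGGER WebsitesData_owner_ins BEFORE INSERT ON WebsitesData
WHEN NOT (
    (NEW.is_personal = 1 AND NEW.owner_user IS NOT NULL AND NEW.owner_company IS NULL) OR
    (NEW.is_personal = 0 AND NEW.owner_user IS NULL AND NEW.owner_company IS NOT NULL)
)
BEGIN
    SELECT RAISE(ABORT, 'owner columns do not match is_personal');
END;
CREATE TRIGGER WebsitesData_owner_upd BEFORE UPDATE ON WebsitesData
WHEN NOT (
    (NEW.is_personal = 1 AND NEW.owner_user IS NOT NULL AND NEW.owner_company IS NULL) OR
    (NEW.is_personal = 0 AND NEW.owner_user IS NULL AND NEW.owner_company IS NOT NULL)
)
BEGIN
    SELECT RAISE(ABORT, 'owner columns do not match is_personal');
END;
""")

def try_sql(sql):
    """Hypothetical helper: True if the statement succeeds, False if it is refused."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False

personal_ok = try_sql(
    "INSERT INTO WebsitesData (name, is_personal, owner_user) VALUES ('blog', 1, 42)"
)
both_owners = try_sql(
    "INSERT INTO WebsitesData (name, is_personal, owner_user, owner_company) "
    "VALUES ('shop', 0, 42, 7)"
)
bad_update = try_sql(
    "UPDATE WebsitesData SET owner_company = 7 WHERE name = 'blog'"
)
```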
Having found this question while trying to resolve a similar mutually exclusive foreign key split and developing a solution, with hints generated by the answers, it seems only proper to share my solution in return.
Recommended Solution
For the minimum impact to the existing schema, and the application accessing the data, retain the Users and Companies tables as they are. Rename the Websites table and replace it with a VIEW named Websites which the application can continue to access. Except when dealing with the ownership information, all the old queries to Websites should still work. So:
The setup
-- Keep the `Users` table about "users"
CREATE TABLE `Users` (
`id` SERIAL PRIMARY KEY,
`name` VARCHAR(180)
-- user_attributes
);
-- Keep the `Companies` table about "companies"
CREATE TABLE `Companies` (
`id` SERIAL PRIMARY KEY,
`name` VARCHAR(180)
-- company_attributes
);
-- Attach ownership information about the website to the website's record in the `Websites` table, renamed to `WebsitesData`
CREATE TABLE `WebsitesData` (
`id` SERIAL PRIMARY KEY,
`name` VARCHAR(255),
`is_personal` BOOL,
`owner_user` BIGINT UNSIGNED DEFAULT NULL,
`owner_company` BIGINT UNSIGNED DEFAULT NULL,
-- website_attributes
FOREIGN KEY `WebsiteOwner_User` (`owner_user`)
REFERENCES `Users` (`id`)
ON DELETE RESTRICT ON UPDATE CASCADE,
FOREIGN KEY `WebsiteOwner_Company` (`owner_company`)
REFERENCES `Companies` (`id`)
ON DELETE RESTRICT ON UPDATE CASCADE
);
-- Create a new `VIEW` with the original name of `Websites` as the gateway to the website records which can enforce the constraints you need
CREATE VIEW `Websites` AS
SELECT * FROM `WebsitesData` WHERE
(`is_personal`=TRUE AND `owner_user` IS NOT NULL AND `owner_company` IS NULL) OR
(`is_personal`=FALSE AND `owner_user` IS NULL AND `owner_company` IS NOT NULL)
WITH CHECK OPTION;
Usage
-- Use the Websites VIEW for the INSERT, UPDATE, and SELECT operations as you normally would and leave the WebsitesData table in the background.
INSERT INTO `Websites` SET
`is_personal`=TRUE,
`owner_user`=$userID;
INSERT INTO `Websites` SET
`is_personal`=FALSE,
`owner_company`=$companyID;
-- Or, using different field lists based on the type of owner
INSERT INTO `Websites` (`is_personal`,`owner_user`, ...)
VALUES (TRUE, $userID, ...);
INSERT INTO `Websites` (`is_personal`,`owner_company`, ...)
VALUES (FALSE, $companyID, ...);
-- Or, using a common field list, and placing NULL in the proper place
INSERT INTO `Websites` (`is_personal`,`owner_user`,`owner_company`,...)
VALUES (TRUE, $userID, NULL, ...);
INSERT INTO `Websites` (`is_personal`,`owner_user`,`owner_company`,...)
VALUES (FALSE, NULL, $companyID, ...);
-- Change the company that owns a website
-- Will ERROR if the site was owned by a User.
UPDATE `Websites` SET `owner_company`=$new_companyID
WHERE `id`=$siteID;
-- Force change the ownership from a User to a Company
UPDATE `Websites` SET
`owner_company`=$new_companyID,
`owner_user`=NULL,
`is_personal`=FALSE
WHERE `id`=$siteID;
-- Force change the ownership from a Company to a User
UPDATE `Websites` SET
`owner_user`=$new_userID,
`owner_company`=NULL,
`is_personal`=TRUE
WHERE `id`=$siteID;
-- Selecting the owner of a site without needing to know if it is personal or not
(SELECT `Users`.`name` AS `Owner`
FROM `Websites`
JOIN `Users` ON `Websites`.`owner_user`=`Users`.`id`
WHERE `is_personal`=TRUE AND `Websites`.`id`=$siteID)
UNION
(SELECT `Companies`.`name` AS `Owner`
FROM `Websites`
JOIN `Companies` ON `Websites`.`owner_company`=`Companies`.`id`
WHERE `is_personal`=FALSE AND `Websites`.`id`=$siteID);
-- Selecting the sites owned by a User
SELECT `name` FROM `Websites`
WHERE `is_personal`=TRUE AND `owner_user`=$userID;
SELECT `Websites`.`name`
FROM `Websites`
JOIN `Users` ON `Websites`.`owner_user`=`Users`.`id`
WHERE `is_personal`=TRUE AND `Users`.`name`="$user_name";
-- Selecting the sites owned by a Company
SELECT `name` FROM `Websites` WHERE `is_personal`=FALSE AND `owner_company`=$companyID;
SELECT `Websites`.`name`
FROM `Websites`
JOIN `Companies` ON `Websites`.`owner_company`=`Companies`.`id`
WHERE `is_personal`=FALSE AND `Companies`.`name`="$company_name";
-- Listing all websites and their owners
(SELECT `Websites`.`name` AS `Website`,`Users`.`name` AS `Owner`
FROM `Websites`
JOIN `Users` ON `Websites`.`owner_user`=`Users`.`id`
WHERE `is_personal`=TRUE)
UNION ALL
(SELECT `Websites`.`name` AS `Website`,`Companies`.`name` AS `Owner`
FROM `Websites`
JOIN `Companies` ON `Websites`.`owner_company`=`Companies`.`id`
WHERE `is_personal`=FALSE)
ORDER BY Website, Owner;
-- Listing all users or companies which own at least one website
(SELECT `Users`.`name` AS `Owner`
FROM `Websites`
JOIN `Users` ON `Websites`.`owner_user`=`Users`.`id`
WHERE `is_personal`=TRUE)
UNION DISTINCT
(SELECT `Companies`.`name` AS `Owner`
FROM `Websites`
JOIN `Companies` ON `Websites`.`owner_company`=`Companies`.`id`
WHERE `is_personal`=FALSE)
ORDER BY `Owner`;
Normalization Level Up
As a technical note for normalization, the ownership information could be factored out of the Websites table and a new table created to hold the ownership data, including the is_personal column.
CREATE TABLE `WebOwnersData` (
`id` SERIAL PRIMARY KEY,
`is_personal` BOOL,
`user` BIGINT UNSIGNED DEFAULT NULL,
`company` BIGINT UNSIGNED DEFAULT NULL,
FOREIGN KEY `WebOwners_User` (`user`)
REFERENCES `Users` (`id`)
ON DELETE RESTRICT ON UPDATE CASCADE,
FOREIGN KEY `WebOwners_Company` (`company`)
REFERENCES `Companies` (`id`)
ON DELETE RESTRICT ON UPDATE CASCADE
);
CREATE TABLE `Websites` (
`id` SERIAL PRIMARY KEY,
`name` VARCHAR(255),
`owner` BIGINT UNSIGNED DEFAULT NULL,
-- website_attributes
-- A foreign key must reference a base table, so it points at `WebOwnersData` rather than the `WebOwners` view
FOREIGN KEY `Website_Owner` (`owner`)
REFERENCES `WebOwnersData` (`id`)
ON DELETE RESTRICT ON UPDATE CASCADE
);
CREATE VIEW `WebOwners` AS
SELECT * FROM `WebOwnersData` WHERE
(`is_personal`=TRUE AND `user` IS NOT NULL AND `company` IS NULL) OR
(`is_personal`=FALSE AND `user` IS NULL AND `company` IS NOT NULL)
WITH CHECK OPTION;
I believe, however, that the created VIEW, with its constraints, prevents any of the anomalies that normalization aims to remove, and adds complexity that is not needed in the situation. The normalization process is always a trade off anyway.