1:1 Foreign Key Constraints - sql

How do you specify that a foreign key constraint should be a 1:1 relationship in transact sql? Is declaring the column UNIQUE enough? Below is my existing code.!
CREATE TABLE [dbo].MyTable(
[MyTablekey] INT IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[OtherTableKey] INT NOT NULL UNIQUE
CONSTRAINT [FK_MyTable_OtherTable] FOREIGN KEY REFERENCES [dbo].[OtherTable]([OtherTableKey]),
...
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED
(
[MyTableKey] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

A foreign key column with the UNIQUE and NOT NULL constraints that references a UNIQUE, NOT NULL column in another table creates a 1:(0|1) relationship, which is probably what you want.
If there was a true 1:1 relationship, every record in the first table would have a corresponding record in the second table and vice-versa. In that case, you would probably just want to make one table (unless you needed some strange storage optimization).

You could declare the column to be both the primary key and a foreign key. This is a good strategy for "extension" tables that are used to avoid putting nullable columns into the main table.

#bosnic:
You have a table CLIENT that has a 1:1 relationship with table SALES_OFFICE because, for example, the logic of your system says so.
What your app logic says, and what your data model say are 2 different things. There is nothing wrong with enforcing that relationship with your business logic code, but it has no place in the data model.
Would you really incorporate the data of SALES_OFFICE into CLIENT table?
If every CLIENT has a unique SALES_OFFICE, and every SALES_OFFICE has a singular, unique CLIENT - then yes, they should be in the same table. We just need a better name. ;)
And if another tables need to relate them selfs with SALES_OFFICE?
There's no reason to. Relate your other tables to CLIENT, since CLIENT has a unique SALES_OFFICE.
And what about database normalization best practices and patterns?
This is normalization.
To be fair, SALES_OFFICE and CLIENT is obviously not a 1:1 relationship - it's 1:N. Hopefully, your SALES_OFFICE exists to serve more than 1 client, and will continue to exist (for a while, at least) without any clients.
A more realistic example is SALES_OFFICE and ZIP_CODE. A SALES_OFFICE must have exactly 1 ZIP_CODE, and 2 SALES_OFFICEs - even if they have an equivalent ZIP_CODE - do not share the instance of a ZIP_CODE (so, changing the ZIP_CODE of 1 does not impact the other). Wouldn't you agree that ZIP_CODE belongs as a column in SALES_OFFICE?

Based on your code above, the unique constraint would be enough given that the for every primary key you have in the table, the unique constrained column is also unique. Also, this assumes that in [OtherTable], the [OtherTableKey] column is the primary key of that table.

If there was a true 1:1 relationship, every record in the first table would have a corresponding record in the second table and vice-versa. In that case, you would probably just want to make one table (unless you needed some strange storage optimization).
This is very incorrect. Let me give you an example. You have a table CLIENT that has a 1:1 relationship with table SALES_OFFICE because, for example, the logic of your system says so. Would you really incorporate the data of SALES_OFFICE into CLIENT table? And if another tables need to relate them selfs with SALES_OFFICE? And what about database normalization best practices and patterns?
A foreign key column with the UNIQUE and NOT NULL constraints that references a UNIQUE, NOT NULL column in another table creates a 1:(0|1) relationship, which is probably what you want.
The first part of your answer is the right answer, without the second part, unless the data in second table is really a kind of information that belongs to first table and never will be used by other tables.

Related

What does it mean when there is no "Id" at the end of a column name which appears to be a foreign key?

This is the databases ERD my final project for school is on (at the bottom), I am required to make a database using this information. I understand how to add the tables that are setup like 'trainer' and even how to add self-joining tables to my database, but something we have NOT learned is what it means or what to do when there is no Id at the end? Like 'evolvesfrom' and 'pokemonfightexppoint'.
Do you not have to add an Id at the end? From what my teacher taught us, I assumed you did. From what I see in this ERD is how evolvesfrom is self-joining itself to pokemonId. I know how to complete this only when there is an Id at the end of evolvesfrom.
For something like trainerId, it is super easy to understand how to add the constraints and everything like so:
CREATE TABLE trainer (
trainerId INT IDENTITY(1, 1),
trainerName VARCHAR(50) NOT NULL,
CONSTRAINT pk_trainer_trainerId PRIMARY KEY (trainerId)
);
I just don't understand how to do this when there is no Id added. For the pokemonFight table, it is noted that "It is assumed that a
Pokémon can play any battles at any battle locations. In other words, the battle experience points are functionally dependent on Pokémon, battle, and battle location", if that makes a difference.
If possible, could anyone show me an example on how to add a table, with constraints on either the pokemon or the pokemonFight table? (obviously you don't have to include the data types or anything).
Thank you in advance.
I am using SQL Server.
There is no required naming convention for columns in SQL Server that differentiates between a data column, a primary key column or a foreign key column.
The only constraints on column names are that they follow the rules for SQL Server identifier naming. However in a particular work environment you might well use a naming convention which does include ID at the end of the column name in order to clearly make the intention of the column obvious.
To create a self-referencing foreign key you just do the same as normal which can be as part of the create table or an alter table.
CREATE TABLE pokemon (
pokemonId INT IDENTITY(1, 1),
...
CONSTRAINT fk_pokemon_evolvesFrom FOREIGN KEY (evolvesFrom) REFERENCES pokemon (pokemonId)
);
-- OR
ALTER TABLE pokemon
ADD CONSTRAINT fk_pokemon_evolvesFrom FOREIGN KEY (evolvesFrom)
REFERENCES pokemon (pokemonId)

composite primary key with practical approach

I just need to understand the concept behind the composite primary key. I have googled about it, understood that it is a combination of more than one column of a table.But my questions is, what is the practical approach of this key over any data? when i should use this concept? can you show me any practical usage of this key on excel or SQL server?
It may be a weird type of question for any sql expert. I apologize for this kind of idiotic question. If anybody feels it is an idiot question, please forgive me.
A typical use-case for a composite primary key is a junction/association table. Consider orders and products. One order could have many products. One product could be in many orders. The orderProducts table could be defined as:
create table orderProducts (
orderId int not null references orders(orderId),
productId int not null references products(productId),
quantity int,
. . .
);
It makes sense to declare (orderId, productId) as a composite primary key. This would impose the constraint that any given order has any given product only once.
That said, I would normally use a synthetic key (orderProductId) and simply declare the combination as unique.
The benefit of a composite primary key as that it enforces the uniques (which could also be done with a uniqueness constraint). It also wastes no space that would be needed for an additional key.
There are downsides to composite primary keys as compared to identity keys:
Identity keys keep track of the order of inserts.
Identity keys are typically only 4 bytes.
Foreign key references consist of only one column.
By default, SQL Server clusters on primary keys. This imposes an ordering and can result in fragmentation (although that is doubtful for this example).
Let's say I have a table of cars. It includes the model and make of the cars. I do not want to insert the same exact car into my table, but there are cars that will have the same make and cars that will have the same model (assume both Ford and Toyota make a car called the 'BlergWagon').
I could enforce uniqueness of make/model with a composite key that includes both values. A unique key on just make would not allow me to add more than 1 Toyota and a unique key on just model would not allow me to enter more than 1 BlergWagon.
Another example would be grades, terms, years, students, and classes. I could enforce uniqueness for a student in a class and a specific semester in a specific year so that my table does not have 2 dupe records that show the same class in the same semester in the same year with the same student.
Another part of your post is about primary key, which I'll assume means you are talking about a clustered index. Clustered index enforces order of the table. So you could throw this onto an identity column to order the table and add a unique, nonclustered index to enforce uniqueness on your other columns.

SQL sub-types with overlapping child tables

Consider the problem above where the 'CommonChild' entity can be a child of either sub-type A or B, but not C. How would I go about designing the physical model in a relational [SQL] database?
Ideally, the solution would allow...
for an identifying relationship between CommonChild and it's related sub-type.
a 1:N relationship.
Possible Solutions
Add an additional sub-type to the super-type and move sub-type A and B under the new sub-type. The CommonChild can then have a FK constraint on the newly created sub-type. Works for the above, but not if an additional entity is added which can have a relationship with sub-type A and C, but not B.
Add a FK constraint between the CommonChild and SuperType. Use a trigger or check constraint (w/ UDF) against the super-type's discriminator before allowing a new tuple into CommonChild. Seems straight forward, but now CommonChild almost seems like new subtype itself (which it is not).
My model is fundamentally flawed. Remodel and the problem should go away.
I'm looking for other possible solutions or confirmation of one of the above solutions I've already proposed.
Thanks!
EDIT
I'm going to implement the exclusive foreign key solution provided by Branko Dimitrijevic (see accepted answer).
I am going to make a slight modifications in this case as:
the super-type, sub-type, and "CommonChild" all have the same PKs and;
the PKs are 3 column composites.
The modification is to to create an intermediate table whose sole role is to enforce the exclusive FK constraint between the sub-types and the "CommonChild" (exact model provided by Dimitrijevic minus the "CommonChild's" attributes.). The CommonChild's PK will have a normal FK constraint to the intermediate table.
This will prevent the "CommonChild" from having 2 sets of 3 column composite FKs. Plus, since the identifying relationship is maintained from super-type to "CommonChild", [read] queries can effectively ignore the intermediate table altogether.
Looks like you need a variation of exclusive foreign keys:
CREATE TABLE CommonChild (
Id AS COALESCE(SubTypeAId, SubTypeBId) PERSISTED PRIMARY KEY,
SubTypeAId int REFERENCES SubTypeA (SuperId),
SubTypeBId int REFERENCES SubTypeB (SuperId),
Attr6 varchar,
CHECK (
(SubTypeAId IS NOT NULL AND SubTypeBId IS NULL)
OR (SubTypeAId IS NULL AND SubTypeBId IS NOT NULL)
)
);
There are couple of thing to note here:
There are two NULL-able FOREIGN KEYs.
There is a CHECK that allows exactly one of these FKs be non-NULL.
There is a computed column Id which equals one of the FKs (whichever is currently non-NULL) which is also a PRIMARY KEY. This ensures that:
One parent cannot have multiple children.
A "grandchild" table can reference the CommonChild.Id directly from its FK. The SuperType.Id is effectively popagated all the way down.
We don't have to mess with NULL-able UNIQUE constraints, which are problematic in MS SQL Server (see below).
A DBMS-agnostic way of of doing something similar would be...
CREATE TABLE CommonChild (
Id int PRIMARY KEY,
SubTypeAId int UNIQUE REFERENCES SubTypeA (SuperId),
SubTypeBId int UNIQUE REFERENCES SubTypeB (SuperId),
Attr6 varchar,
CHECK (
(SubTypeAId IS NOT NULL AND SubTypeAId = Id AND SubTypeBId IS NULL)
OR (SubTypeAId IS NULL AND SubTypeBId IS NOT NULL AND SubTypeBId = Id)
)
)
Unfortunately a UNIQUE column containing more than one NULL is not allowed by MS SQL Server, which is not the case in most DBMSes. However, you can just omit the UNIQUE constraint if you don't want to reference SubTypeAId or SubTypeBId directly.
Wondering what am I missing here?
Admittedly, it is hard without having the wording of the specific problem, but things do feel a bit upside-down.

Constraint is key is index is constraint?

in SQL Server Management Studio (SSMS) 2008 R2 Dev on the table (because I cannot ask without it)
--SET ANSI_NULL_DFLT_ON ON
create table B (Id int)
I create unique constraint
ALTER TABLE B
ADD CONSTRAINT IX_B
UNIQUE (ID)
WITH (IGNORE_DUP_KEY = ON)
SSMS shows that I do not have any constraint but have a key + index instead while context options (on right clicking) hint me to create (sorry, script) still constraint.
MAIN QUESTION:
What is a key here?
Why is it needed for this constraint?
Why is unique constraint called by key (and key by unique constraint)?
Sorry, again, why is key called by index? They seem to have the same name (though had I created without explicit name they would have called differently)...
Sorry, again...
COLLATERAL questions:
Which functionality is different between "unique constraint" vs. "unique index"? I searched today for a single difference (wanted to find 10) quite a time and could not find any.
In other words, what's for (why) are concepts (or constructs) "unique constraint" and "unique index" are duplicated in SQL Server?
Bonus question (for those whom this question seemed too simple):
What is the sense in unique index (or in unique constraint) permitting dupes?
insert into B VALUES (1)
insert into B VALUES (1)
insert into B VALUES (1)
Update: Sorry Thanks, guys and ladies (bonus is withdrawn)
Update2: Was not there a difference between "unique index" and "unique constraint" in previous SQL Server (I vaguely recall that one of them did not permit NULL)?
Update3: Really, I was always pissed off (confused) by a "foreign key constraint" is called by "foreign key" while it is not a key and the key, foreign one is in another table, foreign one ... and just found that this is universal confusion to rattle on it. At least, now I memorize that I should remember au contraire.
Update4:
#Damien_The_Unbeliever, thanks,
these are, at least, some hrooks to memorize confusion.
Though, some bewilderments:
Why are these candidates NOT NULL by default?
Initially I really wanted to insert much shorter script:
CREATE TABLE A(A INT UNIQUE);
which produced:
WTF this "candidate" for PRIMARY and KEY has multiple identities syndrome, then?
Is not "UQ_" stand for Unique constraint in naming practices?
Now, scripting of this index UQ__A__3214EC262AA05119 produces nameless... not index ... constraint(?!):
ALTER TABLE [dbo].[A] ADD UNIQUE NONCLUSTERED
(
[ID] ASC
)
WITH
( PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON)
ON
[PRIMARY]
WTheF - How is unique index is identified as constraint? Where, by what
Scripting of key produces the same! again, constraint...
Why is it nameless? Why is it impossible to script as "modify" but only as "create"?!
This does not make sense!
Now if to execute the produced script, there are 2 dupes,
to execute once more - voila: 3 dupe "candidates"
Note: if to create the unique constraint by separate T-SQL statement with custom/manual name, as I made at the top, then scripting does not produce anonymous DML and, respectively, their execution does not permit multiplication of "candidates"
Strange candidates for primary keys, aren't they? need to visit a shrink (with multiple identities symptoms)?
A UNIQUE constraint is also known as a UNIQUE KEY constraint. Basically, there can be multiple KEYs for a table. One is selected (somewhat arbitrarily) to be the PRIMARY KEY for the table. Other keys are created as UNIQUE KEY constraints.
As an implementation detail, a UNIQUE KEY constraint is implemented by placing a UNIQUE index on the same columns in the table. However, it is possible to create UNIQUE indexes on a table (via CREATE INDEX), without creating a UNIQUE constraint.
UNIQUE constraints are similar to PRIMARY KEYs - they can be the target reference for a FOREIGN KEY constraint. A UNIQUE index, by itself, cannot be so referenced.
In SSMS, PRIMARY KEY, UNIQUE, and FOREIGN KEY constraints will always show up under the "Keys" folder for a table. CHECK and DEFAULT constraints will show up under the "Constraints" folder
This only answers part of your question, as I'm not clear on why management studio displays these objects the way it does.
A unique constraint is part of the (logical) relational model. Essentially if you were drawing out the logical model, a unique constraint would appear on the drawing.
A unique index (like all indexes) is an implementation detail, so is part of the physical model.
SQL Server uses a unique index to implement a unique constraint. There is a logical difference, and I think this shows up in the SQL Server diagramming tool--if you use a unique constraint on a foreign key in what otherwise would be a 1:n relationship, it shows up as a 1:1. This is not true when using a unique index (again, not 100% sure on this).
first off all your table has only 1 value not 3, take a look
create table B (Id int)
ALTER TABLE B
ADD CONSTRAINT IX_B
UNIQUE (ID)
WITH (IGNORE_DUP_KEY = ON)
insert into B VALUES (1)
insert into B VALUES (1)
insert into B VALUES (1)
--Duplicate key was ignored.
--Duplicate key was ignored.
select * from B
One row, right?
This is because you have this WITH (IGNORE_DUP_KEY = ON)
now do this
create table C (Id int)
ALTER TABLE C
ADD CONSTRAINT IX_C
UNIQUE (ID)
insert into C VALUES (1)
insert into C VALUES (1)
The second row won't go into the table now
The way that SQL Server implements constraint is that it creates an index behind it to facilitate fast lookups among other things.
Perhaps you really want a primary key on this table?
create table D (Id int not null primary key)
A unique constraint is implemented internally as an index.
Pretty much the only difference between explicit CREATE INDEX and adding a constraint via ALTER TABLE is the ability to have INCLUDE columns as an explicit index.
SSMS is somewhat confusing in how it presents this. No idea why
Personally, I think IGNORE_DUP_KEY is pointless and have never used it.
Specifies the error response to
duplicate key values in a multiple-row
insert operation on a unique clustered
or unique nonclustered index. The
default is OFF.
ON
A warning message is issued and
only the rows violating the unique
index fail.
OFF
An error message is issued and the
entire INSERT transaction is rolled
back.
The IGNORE_DUP_KEY setting applies
only to insert operations that occur
after the index is created or rebuilt.
The setting has no affect during the
index operation.
Edit:
Found this from me before: Can I set ignore_dup_key on for a primary key?
Generally, IGNORE_DUP_KEY has several answers too

Generic Database table design

Just trying to figure out the best way to design my table for the following scenario:
I have several areas in my system (documents, projects, groups and clients) and each of these can have comments logged against them.
My question is should I have one table like this:
CommentID
DocumentID
ProjectID
GroupID
ClientID
etc
Where only one of the ids will have data and the rest will be NULL or should I have a separate CommentType table and have my comments table like this:
CommentID
CommentTypeID
ResourceID (this being the id of the project/doc/client)
etc
My thoughts are that option 2 would be more efficient from an indexing point of view. Is this correct?
Option 2 is not a good solution for a relational database. It's called polymorphic associations (as mentioned by #Daniel Vassallo) and it breaks the fundamental definition of a relation.
For example, suppose you have a ResourceId of 1234 on two different rows. Do these represent the same resource? It depends on whether the CommentTypeId is the same on these two rows. This violates the concept of a type in a relation. See SQL and Relational Theory by C. J. Date for more details.
Another clue that it's a broken design is that you can't declare a foreign key constraint for ResourceId, because it could point to any of several tables. If you try to enforce referential integrity using triggers or something, you find yourself rewriting the trigger every time you add a new type of commentable resource.
I would solve this with the solution that #mdma briefly mentions (but then ignores):
CREATE TABLE Commentable (
ResourceId INT NOT NULL IDENTITY,
ResourceType INT NOT NULL,
PRIMARY KEY (ResourceId, ResourceType)
);
CREATE TABLE Documents (
ResourceId INT NOT NULL,
ResourceType INT NOT NULL CHECK (ResourceType = 1),
FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);
CREATE TABLE Projects (
ResourceId INT NOT NULL,
ResourceType INT NOT NULL CHECK (ResourceType = 2),
FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);
Now each resource type has its own table, but the serial primary key is allocated uniquely by Commentable. A given primary key value can be used only by one resource type.
CREATE TABLE Comments (
CommentId INT IDENTITY PRIMARY KEY,
ResourceId INT NOT NULL,
ResourceType INT NOT NULL,
FOREIGN KEY (ResourceId, ResourceType) REFERENCES Commentable
);
Now Comments reference Commentable resources, with referential integrity enforced. A given comment can reference only one resource type. There's no possibility of anomalies or conflicting resource ids.
I cover more about polymorphic associations in my presentation Practical Object-Oriented Models in SQL and my book SQL Antipatterns.
Read up on database normalization.
Nulls in the way you describe would be a big indication that the database isn't designed properly.
You need to split up all your tables so that the data held in them is fully normalized, this will save you a lot of time further down the line guaranteed, and it's a lot better practice to get into the habit of.
From a foreign key perspective, the first example is better because you can have multiple foreign key constraints on a column but the data has to exist in all those references. It's also more flexible if the business rules change.
To continue from #OMG Ponies' answer, what you describe in the second example is called a Polymorphic Association, where the foreign key ResourceID may reference rows in more than one table. However in SQL databases, a foreign key constraint can only reference exactly one table. The database cannot enforce the foreign key according to the value in CommentTypeID.
You may be interested in checking out the following Stack Overflow post for one solution to tackle this problem:
MySQL - Conditional Foreign Key Constraints
The first approach is not great, since it is quite denormalized. Each time you add a new entity type, you need to update the table. You may be better off making this an attribute of document - I.e. store the comment inline in the document table.
For the ResourceID approach to work with referential integrity, you will need to have a Resource table, and a ResourceID foreign key in all of your Document, Project etc.. entities (or use a mapping table.) Making "ResourceID" a jack-of-all-trades, that can be a documentID, projectID etc.. is not a good solution since it cannot be used for sensible indexing or foreign key constraint.
To normalize, you need to the comment table into one table per resource type.
Comment
-------
CommentID
CommentText
...etc
DocumentComment
---------------
DocumentID
CommentID
ProjectComment
--------------
ProjectID
CommentID
If only one comment is allowed, then you add a unique constraint on the foreign key for the entity (DocumentID, ProjectID etc.) This ensures that there can only be one row for the given item and so only one comment. You can also ensure that comments are not shared by using a unique constraint on CommentID.
EDIT: Interestingly, this is almost parallel to the normalized implementation of ResourceID - replace "Comment" in the table name, with "Resource" and change "CommentID" to "ResourceID" and you have the structure needed to associate a ResourceID with each resource. You can then use a single table "ResourceComment".
If there are going to be other entities that are associated with any type of resource (e.g. audit details, access rights, etc..), then using the resource mapping tables is the way to go, since it will allow you to add normalized comments and any other resource related entities.
I wouldn't go with either of those solutions. Depending on some of the specifics of your requirements you could go with a super-type table:
CREATE TABLE Commentable_Items (
commentable_item_id INT NOT NULL,
CONSTRAINT PK_Commentable_Items PRIMARY KEY CLUSTERED (commentable_item_id))
GO
CREATE TABLE Projects (
commentable_item_id INT NOT NULL,
... (other project columns)
CONSTRAINT PK_Projects PRIMARY KEY CLUSTERED (commentable_item_id))
GO
CREATE TABLE Documents (
commentable_item_id INT NOT NULL,
... (other document columns)
CONSTRAINT PK_Documents PRIMARY KEY CLUSTERED (commentable_item_id))
GO
If the each item can only have one comment and comments are not shared (i.e. a comment can only belong to one entity) then you could just put the comments in the Commentable_Items table. Otherwise you could link the comments off of that table with a foreign key.
I don't like this approach very much in your specific case though, because "having comments" isn't enough to put items together like that in my mind.
I would probably go with separate Comments tables (assuming that you can have multiple comments per item - otherwise just put them in your base tables). If a comment can be shared between multiple entity types (i.e., a document and a project can share the same comment) then have a central Comments table and multiple entity-comment relationship tables:
CREATE TABLE Comments (
comment_id INT NOT NULL,
comment_text NVARCHAR(MAX) NOT NULL,
CONSTRAINT PK_Comments PRIMARY KEY CLUSTERED (comment_id))
GO
CREATE TABLE Document_Comments (
document_id INT NOT NULL,
comment_id INT NOT NULL,
CONSTRAINT PK_Document_Comments PRIMARY KEY CLUSTERED (document_id, comment_id))
GO
CREATE TABLE Project_Comments (
project_id INT NOT NULL,
comment_id INT NOT NULL,
CONSTRAINT PK_Project_Comments PRIMARY KEY CLUSTERED (project_id, comment_id))
GO
If you want to constrain comments to a single document (for example) then you could add a unique index (or change the primary key) on the comment_id within that linking table.
It's all of these "little" decisions that will affect the specific PKs and FKs. I like this approach because each table is clear on what it is. In databases that's usually better then having "generic" tables/solutions.
Of the options you give, I would go for number 2.
Option 2 is a good way to go. The issue that I see with that is you are putting the resouce key on that table. Each of the IDs from the different resources could be duplicated. When you join resources to the comments you will more than likely come up with comments that do not belong to that particular resouce. This would be considered a many to many join. I would think a better option would be to have your resource tables, the comments table, and then tables that cross reference the resource type and the comments table.
If you carry the same sort of data about all comments regardless of what they are comments about, I'd vote against creating multiple comment tables. Maybe a comment is just "thing it's about" and text, but if you don't have other data now, it's likely you will: date the comment was entered, user id of person who made it, etc. With multiple tables, you have to repeat all these column definitions for each table.
As noted, using a single reference field means that you could not put a foreign key constraint on it. This is too bad, but it doesn't break anything, it just means you have to do the validation with a trigger or in code. More seriously, joins get difficult. You can just say "from comment join document using (documentid)". You need a complex join based on the value of the type field.
So while the multiple pointer fields is ugly, I tend to think that's the right way to go. I know some db people say there should never be a null field in a table, that you should always break it off into another table to prevent that from happening, but I fail to see any real advantage to following this rule.
Personally I'd be open to hearing further discussion on pros and cons.
Pawnshop Application:
I have separate tables for Loan, Purchase, Inventory & Sales transactions.
Each tables rows are joined to their respective customer rows by:
customer.pk [serial] = loan.fk [integer];
= purchase.fk [integer];
= inventory.fk [integer];
= sale.fk [integer];
I have consolidated the four tables into one table called "transaction", where a column:
transaction.trx_type char(1) {L=Loan, P=Purchase, I=Inventory, S=Sale}
Scenario:
A customer initially pawns merchandise, makes a couple of interest payments, then decides he wants to sell the merchandise to the pawnshop, who then places merchandise in Inventory and eventually sells it to another customer.
I designed a generic transaction table where for example:
transaction.main_amount DECIMAL(7,2)
in a loan transaction holds the pawn amount,
in a purchase holds the purchase price,
in inventory and sale holds sale price.
This is clearly a denormalized design, but has made programming alot easier and improved performance. Any type of transaction can now be performed from within one screen, without the need to change to different tables.