Foreign key to an arbitrary table - sql

I need to store a foreign key in a table, but one that doesn't reference a single fixed table. To explain: I'd like to do something similar to inheritance, although it isn't really inheritance. For a given record in my table I have two important fields: the arbitrary (generic) key, and a field representing the type of thing that key refers to. The idea is to store an integer and then, depending on the type of the key, join to the corresponding table.
Is it even possible? What are the alternatives? I don’t want inheritance – I’m not using an ODBMS.

A foreign key in a "child" table must reference a single "parent" table. The parent must be specified at the time when you define your key. All rows in your child table must reference the same parent table - there can be no row-by-row differentiation.
Note, however, that a column does not need to be a foreign key in order to be used in a join. A foreign key constraint will prevent insertions of incorrect keys into the child table, and deal with deletions in a parent table by cascading into the child or throwing an error. If you do not care for this functionality, you could store your "foreign key" column normally, and use it in outer joins, like this:
select *
from child c
left outer join referenced1 r1 on c.fk = r1.pk AND c.code = 'first'
left outer join referenced2 r2 on c.fk = r2.pk AND c.code = 'second'
left outer join referenced3 r3 on c.fk = r3.pk AND c.code = 'third'
The above example assumes that your "foreign key" consists of two columns - the fk that indicates what row you reference, and code that indicates what table you want.
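To see the join above in action, here is a minimal sketch using Python's built-in sqlite3 module. The table contents and the name column are made up for illustration, and only two of the three referenced tables are created. Note that nothing stops the dangling row (fk = 99) from being stored, because there is no real constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE referenced1 (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE referenced2 (pk INTEGER PRIMARY KEY, name TEXT);
    -- No FOREIGN KEY constraint: fk is a plain integer, code says which table it means.
    CREATE TABLE child (id INTEGER PRIMARY KEY, fk INTEGER, code TEXT);

    INSERT INTO referenced1 VALUES (1, 'r1-one');
    INSERT INTO referenced2 VALUES (1, 'r2-one');
    INSERT INTO child VALUES (10, 1, 'first'), (11, 1, 'second'), (12, 99, 'first');
""")

# Each outer join only matches rows whose code selects that table.
rows = conn.execute("""
    SELECT c.id, COALESCE(r1.name, r2.name) AS target
    FROM child c
    LEFT OUTER JOIN referenced1 r1 ON c.fk = r1.pk AND c.code = 'first'
    LEFT OUTER JOIN referenced2 r2 ON c.fk = r2.pk AND c.code = 'second'
    ORDER BY c.id
""").fetchall()
print(rows)
```

The row with id 12 comes back with no target, which is exactly the integrity gap you accept with this approach.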

A multi-table foreign key (or, to be more specific, "something like a foreign key") in one column is possible if you give up on creating an FK constraint and store the related table's name in another column. It will work after a fashion, as long as you don't care about your data integrity. In fact, some people consider this one of the SQL antipatterns. :-)
Of course, you can then create a trigger procedure that checks on every insert or update whether the new value exists in the "related" table. It's not elegant, though.
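A rough sketch of such a trigger check, written for SQLite through Python's sqlite3 module. The table and column names (child, fk, code, referenced1, referenced2) are hypothetical, and trigger syntax differs between database engines. This only covers inserts; updates, and deletes in the referenced tables, would need triggers of their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE referenced1 (pk INTEGER PRIMARY KEY);
    CREATE TABLE referenced2 (pk INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id   INTEGER PRIMARY KEY,
        fk   INTEGER NOT NULL,
        code TEXT NOT NULL CHECK (code IN ('first', 'second'))
    );

    -- Reject an insert whose (code, fk) pair has no matching row.
    CREATE TRIGGER child_fk_check BEFORE INSERT ON child
    BEGIN
        SELECT RAISE(ABORT, 'no matching row in the referenced table')
        WHERE (NEW.code = 'first'
               AND NOT EXISTS (SELECT 1 FROM referenced1 WHERE pk = NEW.fk))
           OR (NEW.code = 'second'
               AND NOT EXISTS (SELECT 1 FROM referenced2 WHERE pk = NEW.fk));
    END;

    INSERT INTO referenced1 VALUES (1);
""")

def try_insert(fk, code):
    """Return True if the row was accepted, False if the database rejected it."""
    try:
        conn.execute("INSERT INTO child (fk, code) VALUES (?, ?)", (fk, code))
        return True
    except sqlite3.DatabaseError:
        return False

ok_first = try_insert(1, "first")    # referenced1 contains pk = 1
ok_second = try_insert(1, "second")  # referenced2 is empty
print(ok_first, ok_second)
```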
I think the better way is to create the foreign keys in the "related" tables (one table for every type):
main_table:
id
...(other columns)
table_type_first:
main_table_id
foreign_key_for_type_1
...(other columns)
table_type_second:
main_table_id
foreign_key_for_type_2
...(other columns)
It's also not brilliant, and maybe it's not exactly what you need (you don't have a type column; the "entity" type depends on the existence of a record in the "type" tables), but it provides more data integrity.
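Here is one way the layout above might look, sketched with Python's sqlite3 module. The target tables type_1_target and type_2_target are assumptions (the outline doesn't name what the keys point at), and so is making main_table_id the primary key of each type table. SQLite only enforces foreign keys after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE type_1_target (id INTEGER PRIMARY KEY);
    CREATE TABLE type_2_target (id INTEGER PRIMARY KEY);
    CREATE TABLE main_table (id INTEGER PRIMARY KEY);

    CREATE TABLE table_type_first (
        main_table_id          INTEGER PRIMARY KEY REFERENCES main_table(id),
        foreign_key_for_type_1 INTEGER NOT NULL REFERENCES type_1_target(id)
    );
    CREATE TABLE table_type_second (
        main_table_id          INTEGER PRIMARY KEY REFERENCES main_table(id),
        foreign_key_for_type_2 INTEGER NOT NULL REFERENCES type_2_target(id)
    );

    INSERT INTO type_1_target VALUES (100);
    INSERT INTO main_table VALUES (1), (2);
    INSERT INTO table_type_first VALUES (1, 100);
""")

# A key that points at nothing is now rejected by a real constraint.
try:
    conn.execute("INSERT INTO table_type_first VALUES (2, 999)")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False

# The "type" of a main_table row is derived from which type table has a row for it.
kinds = conn.execute("""
    SELECT m.id,
           CASE WHEN f.main_table_id IS NOT NULL THEN 'first'
                WHEN s.main_table_id IS NOT NULL THEN 'second'
           END AS kind
    FROM main_table m
    LEFT JOIN table_type_first  f ON f.main_table_id = m.id
    LEFT JOIN table_type_second s ON s.main_table_id = m.id
    ORDER BY m.id
""").fetchall()
print(dangling_allowed, kinds)
```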

Related

One Primary Key Value in many tables

This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL developer). I have amongst other tables a table called: Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
For both I have created columns, with each table having its own primary key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and Parentcompany
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.
A primary key only has to be unique within its own table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique
values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or
multiple fields. When multiple fields are used as a primary key, they
are called a composite key.
If a table has a primary key defined on any field(s), then you cannot
have two records having the same value of that field(s).
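A quick way to convince yourself, sketched with Python's sqlite3 module rather than Oracle (the table and column names are simplified versions of the ones in the question): the same value is accepted as a primary key in two different tables, but a second copy within one table is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parentcompany (parentcompany_id TEXT PRIMARY KEY);
    CREATE TABLE manufacturer  (manufacturer_id  TEXT PRIMARY KEY);

    -- Same key value in two different tables: no conflict.
    INSERT INTO parentcompany VALUES ('MBE');
    INSERT INTO manufacturer  VALUES ('MBE');
""")

# Same key value twice in ONE table: the primary key constraint refuses it.
try:
    conn.execute("INSERT INTO manufacturer VALUES ('MBE')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(duplicate_accepted)
```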
A primary key is unique per table. You can use the same PK value in every separate table in the database. In fact that happens all the time, since the PK is often an incrementing number representing the ID of a row: 1, 2, 3, 4...
For your case, a more common implementation would be a hierarchical table called Company with the fields company_name and parent_company_name. If a company has a parent, its parent_company_name field holds the company_name value of the parent's row.
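A minimal sketch of that hierarchical table, using Python's sqlite3 module. The company names are invented, and using company_name as the primary key is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE company (
        company_name        TEXT PRIMARY KEY,
        -- Self-reference: NULL means the company has no parent.
        parent_company_name TEXT REFERENCES company(company_name)
    );
    INSERT INTO company VALUES ('ParentCo', NULL);
    INSERT INTO company VALUES ('MakerCo', 'ParentCo');
    INSERT INTO company VALUES ('IndieCo', NULL);
""")

# Join the table to itself to list every company next to its parent, if any.
pairs = conn.execute("""
    SELECT c.company_name, p.company_name
    FROM company c
    LEFT JOIN company p ON p.company_name = c.parent_company_name
    ORDER BY c.company_name
""").fetchall()
print(pairs)
```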
There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.
Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then, as I go along, I will create specific company tables like Manufacturer and ParentCompany that reference a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!

How to reference a composite primary key into a single field?

I got this composite primary key in Table 1:
Table 1: Applicant
CreationDate PK
FamilyId PK
MemberId PK
I need to create a foreign key in Table 2 to reference this composite key. But I do not want to create three fields in Table 2; I want to concatenate them into a single field.
Table 2: Sales
SalesId int,
ApplicantId -- This should be "CreationDate-FamilyId-MemberId"
What are the possible ways to achieve this?
Note: I know I can create another field in Table 1 with the concatenation of the three columns, but then I will have redundant info.
What you're asking for is tantamount to saying "I want to treat three pieces of information as one piece of information without explicitly making it one piece of information". Which is to say that it's not possible.
That said, there are ways to make what you want happen:
1. Create a surrogate key (i.e. an identity column) and use that as the FK reference.
2. Create a computed column that is the concatenation of the three columns and use that as the FK reference.
All else being equal (ease of implementation, politics, etc), I'd prefer the first. What you have is really a natural key and doesn't make a good PK if it's going to be referenced externally. Which isn't to say that you can't enforce uniqueness with a unique key; you can and should.
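As an illustration of the surrogate key option, here is a sketch using Python's sqlite3 module instead of the asker's (unspecified) database. The column types and sample values are assumptions; the point is that Sales carries one column while the three-part natural key stays unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE applicant (
        applicant_id  INTEGER PRIMARY KEY,   -- surrogate key
        creation_date TEXT    NOT NULL,
        family_id     INTEGER NOT NULL,
        member_id     INTEGER NOT NULL,
        UNIQUE (creation_date, family_id, member_id)  -- natural key stays unique
    );
    CREATE TABLE sales (
        sales_id     INTEGER PRIMARY KEY,
        applicant_id INTEGER NOT NULL REFERENCES applicant(applicant_id)
    );
    INSERT INTO applicant (creation_date, family_id, member_id)
        VALUES ('2020-01-01', 7, 1);
    INSERT INTO sales (applicant_id) VALUES (1);
""")

def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_natural_key = is_rejected(
    "INSERT INTO applicant (creation_date, family_id, member_id) "
    "VALUES ('2020-01-01', 7, 1)")
unknown_applicant = is_rejected("INSERT INTO sales (applicant_id) VALUES (42)")

# The three natural-key parts are still reachable from a sale through one join.
sale = conn.execute("""
    SELECT s.sales_id, a.creation_date, a.family_id, a.member_id
    FROM sales s JOIN applicant a ON a.applicant_id = s.applicant_id
""").fetchone()
print(duplicate_natural_key, unknown_applicant, sale)
```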

ON UPDATE CASCADE with two columns in a single table in SQL Server [duplicate]

I have a database table called Lesson:
columns: [LessonID, LessonNumber, Description] ...plus some other columns
I have another table called Lesson_ScoreBasedSelection:
columns: [LessonID,NextLessonID_1,NextLessonID_2,NextLessonID_3]
When a lesson is completed, its LessonID is looked up in the Lesson_ScoreBasedSelection table to get the three possible next lessons, each of which is associated with a particular range of scores. If the score was 0-33, the LessonID stored in NextLessonID_1 would be used. If the score was 34-66, the LessonID stored in NextLessonID_2 would be used, and so on.
I want to constrain all the columns in the Lesson_ScoreBasedSelection table with foreign keys referencing the LessonID column in the lesson table, since every value in the Lesson_ScoreBasedSelection table must have an entry in the LessonID column of the Lesson table. I also want cascade updates turned on, so that if a LessonID changes in the Lesson table, all references to it in the Lesson_ScoreBasedSelection table get updated.
This particular cascade update seems like a very straightforward, one-way update, but when I try to apply a foreign key constraint to each field in the Lesson_ScoreBasedSelection table referencing the LessonID field in the Lesson table, I get the error:
Introducing FOREIGN KEY constraint 'c_name' on table 'Lesson_ScoreBasedSelection' may cause cycles or multiple cascade paths.
Can anyone explain why I'm getting this error or how I can achieve the constraints and cascading updating I described?
You can't have more than one cascading RI link to a single table in any given linked table. Microsoft explains this:
You receive this error message because
in SQL Server, a table cannot appear
more than one time in a list of all
the cascading referential actions that
are started by either a DELETE or an
UPDATE statement. For example, the
tree of cascading referential actions
must only have one path to a
particular table on the cascading
referential actions tree.
Given this SQL Server restriction, why not solve the problem by creating a table with the columns SelectionID (PK), LessonID, Next_LessonID and QualifyingScore? Use a constraint to ensure that the combination of LessonID and QualifyingScore is unique.
In the QualifyingScore column, I'd use a tinyint, and make it 0, 1, or 2. That, or you could do a QualifyingMinScore and QualifyingMaxScore column so you could say,
SELECT * FROM NextLesson
WHERE LessonID = @MyLesson
AND QualifyingMinScore <= @MyScore
AND @MyScore <= QualifyingMaxScore
Cheers,
Eric
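For anyone who wants to try the min/max variant, here is a small sketch of the lookup using Python's sqlite3 module. The snake_case names, the score ranges and the helper next_lesson are illustrative assumptions, and cascading updates are deliberately left out: this only shows the one-row-per-score-range layout and the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE lesson (lesson_id INTEGER PRIMARY KEY);
    CREATE TABLE next_lesson (
        selection_id         INTEGER PRIMARY KEY,
        lesson_id            INTEGER NOT NULL REFERENCES lesson(lesson_id),
        next_lesson_id       INTEGER NOT NULL REFERENCES lesson(lesson_id),
        qualifying_min_score INTEGER NOT NULL,
        qualifying_max_score INTEGER NOT NULL,
        UNIQUE (lesson_id, qualifying_min_score)
    );
    INSERT INTO lesson VALUES (1), (2), (3), (4);
    INSERT INTO next_lesson
        (lesson_id, next_lesson_id, qualifying_min_score, qualifying_max_score)
    VALUES (1, 2, 0, 33), (1, 3, 34, 66), (1, 4, 67, 100);
""")

def next_lesson(lesson_id, score):
    """Return the next lesson for this score, or None if no range matches."""
    row = conn.execute(
        """
        SELECT next_lesson_id FROM next_lesson
        WHERE lesson_id = ?
          AND qualifying_min_score <= ?
          AND ? <= qualifying_max_score
        """,
        (lesson_id, score, score),
    ).fetchone()
    return row[0] if row else None

print(next_lesson(1, 20), next_lesson(1, 50), next_lesson(1, 90))
```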

Database table id-key Null value and referential integrity

I'm learning databases, using SQLce. Got some problems, with this error:
A foreign key value cannot be inserted because a corresponding primary key value does not exist.
How do integrity checking and acceptance of data work when I try to save a row that leaves one foreign key unspecified? Isn't it possible to set it to NULL in some way, meaning it will not reference the other table? If so, how would I do that? (For an integer key field.)
Also, what if you save a row with a valid foreign key that corresponds to an existing primary key in other table. But then decide to delete that entry in this other table. So the foreign key will no longer be valid. Will I be allowed to delete? How does it work? I would think it should then be simply reset to a null value.. But maybe it's not that simple?
What you need to do is insert your data starting from the parent down.
So if you have an orders table and an items table that refers to orders, you have to create the new order first before adding all the children to the list.
Many data access libraries (in C# there is LINQ to SQL) will try to abstract this problem away.
If you need to delete data you actually have to go the other way: delete the items before you delete the parent order record.
Of course, this assumes you are enforcing the foreign key. It is possible not to enforce the key, which might be useful during a bulk delete.
This is because of "bad data" you have in the tables. Check if you have all corresponding values in the primary table.
The DBMS checks referential integrity to ensure the "correctness" of the data within the database.
For example, if you have a column called some_id in TableA with values 1 through 10 and a column called some_id in TableB with values 1 through 11, then TableA has no corresponding value for the 11 that already exists in TableB.
You can make a foreign key nullable but I don't recommend it. There are too many problems and inconsistencies that can arise. Redesign your tables so that you don't need to populate the foreign key for values that don't exist. Usually you can do that by moving the column to a new table for example.
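For reference, this is roughly what a nullable foreign key looks like in practice. The sketch uses SQLite through Python's sqlite3 module, not SQL Server CE, so the exact syntax and supported actions may differ there; the orders/items tables are hypothetical. It shows three things from the question: a NULL key is accepted, a key with no matching parent is refused, and ON DELETE SET NULL resets the child's key when the parent row is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE items (
        id       INTEGER PRIMARY KEY,
        -- Nullable FK; deleting the parent resets the reference to NULL.
        order_id INTEGER REFERENCES orders(id) ON DELETE SET NULL
    );
    INSERT INTO orders VALUES (1);
    INSERT INTO items VALUES (10, 1);     -- points at an existing order
    INSERT INTO items VALUES (11, NULL);  -- references nothing; allowed
""")

# A non-NULL key must match an existing parent row.
try:
    conn.execute("INSERT INTO items VALUES (12, 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Deleting the parent is allowed here because of ON DELETE SET NULL.
conn.execute("DELETE FROM orders WHERE id = 1")
items = conn.execute("SELECT id, order_id FROM items ORDER BY id").fetchall()
print(orphan_accepted, items)
```

Without an ON DELETE action, the default behaviour is to refuse the delete while child rows still point at the parent.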

How do I implement this multi-table database design/constraint, normalized?

I have data that kinda looks like this...
Elements
Class | Synthetic ID (pk)
A | 2
A | 3
B | 4
B | 5
C | 6
C | 7
Elements_Xref
ID (pk) | Synthetic ID | Real ID (fk)
. | 2 | 77-8F <--- A class
. | 3 | 30-7D <--- A class
. | 6 | 21-2A <--- C class
. | 7 | 30-7D <--- C class
So I have these elements that are assigned synthetic IDs and are grouped into classes. But these synthetic IDs are then paired with Real IDs that we actually care about. There is also a constraint that a Real ID cannot recur in a single class. How can I capture all of this in one coherent design?
I don't want to jam the Real ID into the upper table because
It is nullable (there are periods where we don't know what the Real ID of something should be).
It's a foreign key to more data.
Obviously this could be done with triggers acting as constraints, but I'm wondering if this could be implemented with regular constraints/unique indexes. Using SQL Server 2005.
I've thought about having two main tables SyntheticByClass and RealByClass and then putting IDs of those tables into another xref/link table, but that still doesn't guarantee that the classes of both elements match. Also solvable via trigger.
Edit: This is keyword stuffing but I think it has to do with normalization.
Edit^2: As indicated in the comments below, I seem to have implied that foreign keys cannot be nullable. That is false; they can! What cannot be done is setting a unique index on fields where NULLs repeat. Although unique indexes support NULL values, they cannot accommodate more than one NULL in a set. Since the Real ID assignment is initially sparse, multiple NULL Real IDs per class are more than likely.
Edit^3: Dropped the redundant Elements.ID column.
Edit^4: General observations. There seem to be three major approaches at work, one of which I already mentioned.
Triggers. Use a trigger as a constraint to break any data operations that would corrupt the integrity of the data.
Index a view that joins the tables. Fantastic, I had no idea you could do that with views and indexes.
Create a multi-column foreign key. Didn't think of doing this, didn't know it was possible. Add the Class field to the Xref table. Create a UNIQUE constraint on (Class + Real ID) and a foreign key constraint on (Class + Synthetic ID) back to the Elements table.
Comments from before the question was made into a 'bonus' question
What you'd like to be able to do is express that the join of Elements and Elements_Xref has a unique constraint on Class and Real ID. If you had a DBMS that supported SQL-92 ASSERTION constraints, you could do it.
AFAIK, no DBMS supports them, so you are stuck with using triggers.
It seems odd that the design does not constrain Real ID to be unique across classes; from the discussion, it seems that a given Real ID could be part of several different classes. Were the Real ID 'unique unless null', then you would be able to enforce the uniqueness more easily, if the DBMS supported the 'unique unless null' concept (most don't; I believe there is one that does, but I forget which it is).
Comments before edits made 2010-02-08
The question rules out 'jamming' the Real_ID in the upper table (Elements); it doesn't rule out including the Class in the lower table (Elements_Xref), which then allows you to create a unique index on Class and Real_ID in Elements_Xref, achieving (I believe) the required result.
It isn't clear from the sample data whether the synthetic ID in the Elements table is unique or whether it can repeat with different classes (or, indeed whether a synthetic ID can be repeated in a single class). Given that there seems to be an ID column (which presumably is unique) as well as the Synthetic ID column, it seems reasonable to suppose that sometimes the synthetic ID repeats - otherwise there are two unique columns in the table for no very good reason. For the most part, it doesn't matter - but it does affect the uniqueness constraint if the class is copied to the Elements_Xref table. One more possibility; maybe the Class is not needed in the Elements table at all; it should live only in the Elements_Xref table. We don't have enough information to tell whether this is a possibility.
Comments for changes made 2010-02-08
Now that the Elements table has the Synthetic ID as the primary key, things are somewhat easier. There's a comment that the 'Class' information actually is a 'month', but I'll try to ignore that.
In the Elements_Xref table, we have a unique ID column, and then a Synthetic ID (which is not marked as a foreign key to Elements, but presumably must actually be one), and the Real ID. We can see from the sample data that more than one Synthetic ID can map to a given Real ID. It is not clear why the Elements_Xref table has both the ID column and the Synthetic ID column.
We do not know whether a single Synthetic ID can only map to a single Real ID or whether it can map to several Real ID values.
Since the Synthetic ID is the primary key of Elements, we know that a single Synthetic ID corresponds to a single Class.
We don't know whether the mapping of Synthetic ID to Real ID varies over time (it might as Class is date-related), and whether the old state has to be remembered.
We can assume that the tables are reduced to the bare minimum and that there are other columns in each table, the contents of which are not directly material to the question.
The problem states that the Real ID is a foreign key to other data and can be NULL.
I can't see a perfectly non-redundant design that works.
I think that the Elements_Xref table should contain:
Synthetic ID
Class
Real ID
with (Synthetic ID, Class) as a 'foreign key' referencing Elements, and a NOT NULL constraint on Real ID, and a unique constraint on (Class, Real ID).
The Elements_Xref table only contains rows for which the Real ID is known - and correctly enforces the uniqueness constraint that is needed.
The weird bit is that the (Synthetic ID, Class) data in Elements_Xref must match the same columns in Elements, even though the Synthetic ID is the primary key of Elements.
In IBM Informix Dynamic Server, you can achieve this:
CREATE TABLE elements
(
class CHAR(1) NOT NULL,
synthetic_id SERIAL NOT NULL PRIMARY KEY,
UNIQUE(class, synthetic_id)
);
CREATE TABLE elements_xref
(
class CHAR(1) NOT NULL,
synthetic_id INTEGER NOT NULL REFERENCES elements(synthetic_id),
FOREIGN KEY (class, synthetic_id) REFERENCES elements(class, synthetic_id),
real_id CHAR(5) NOT NULL,
PRIMARY KEY (class, real_id)
);
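The same idea can be tried out without Informix. Below is a sketch of an equivalent schema in SQLite, driven from Python's sqlite3 module, loaded with the sample data from the question. The real ID '55-AA' is made up for the mismatch test, and the single-column REFERENCES clause is dropped since the composite key already covers it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE elements (
        class        TEXT    NOT NULL,
        synthetic_id INTEGER PRIMARY KEY,
        UNIQUE (class, synthetic_id)       -- target for the composite FK
    );
    CREATE TABLE elements_xref (
        class        TEXT    NOT NULL,
        synthetic_id INTEGER NOT NULL,
        real_id      TEXT    NOT NULL,
        PRIMARY KEY (class, real_id),      -- a real ID cannot recur within a class
        FOREIGN KEY (class, synthetic_id) REFERENCES elements(class, synthetic_id)
    );
    INSERT INTO elements VALUES
        ('A', 2), ('A', 3), ('B', 4), ('B', 5), ('C', 6), ('C', 7);
    INSERT INTO elements_xref VALUES
        ('A', 2, '77-8F'), ('A', 3, '30-7D'), ('C', 6, '21-2A'), ('C', 7, '30-7D');
""")

def accepted(cls, synthetic_id, real_id):
    """Return True if the xref row passes all constraints."""
    try:
        conn.execute("INSERT INTO elements_xref VALUES (?, ?, ?)",
                     (cls, synthetic_id, real_id))
        return True
    except sqlite3.IntegrityError:
        return False

repeat_in_class = accepted('A', 3, '77-8F')  # '77-8F' is already used in class A
wrong_class     = accepted('B', 6, '55-AA')  # synthetic ID 6 belongs to class C
other_class     = accepted('B', 4, '30-7D')  # same real ID, different class
print(repeat_in_class, wrong_class, other_class)
```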
I would:
Create a UNIQUE constraint on Elements(Synthetic ID, Class)
Add Class column to Elements_Xref
Add a FOREIGN KEY constraint on Elements_Xref table, referring to (Synthetic ID, Class)
At this point we know for sure that Elements_Xref.Class always matches Elements.Class.
Now we need to implement "unique when not null" logic. Follow the link and scroll to section "Use Computed Columns to Implement Complex Business Rules":
Indexes on Computed Columns: Speed Up Queries, Add Business Rules
Alternatively, you can create an indexed view on (Class, RealID) with WHERE RealID IS NOT NULL in its WHERE clause - that will also enforce "unique when not null" logic.
Create an indexed view for Elements_Xref with Where Real_Id Is Not Null and then create a unique index on that view
Create View Elements_Xref_View With SchemaBinding As
Select Elements.Class, Elements_Xref.Real_Id
-- Schema-bound views need two-part object names; the dbo schema is assumed here
From dbo.Elements_Xref
Inner Join dbo.Elements On Elements.Synthetic_Id = Elements_Xref.Synthetic_Id
Where Real_Id Is Not Null
Go
Create Unique Clustered Index Elements_Xref_Unique_Index
On Elements_Xref_View (Class, Real_Id)
Go
This serves no purpose other than simulating a unique index that treats nulls properly, i.e. null != null.
You can
Create a view from the result set of joining Elements_Xref and Elements together on Synthetic ID
Add a unique constraint on Class and [Real ID]. In other news, this is also how you do functional indexes in MSSQL: by indexing views.
Here is some sql:
CREATE VIEW unique_const_view AS
SELECT e.[Synthetic ID], e.Class, x.[Real ID]
FROM Elements AS e
JOIN [Elements_Xref] AS x
ON e.[Synthetic ID] = x.[Synthetic ID]
CREATE UNIQUE INDEX unique_const_view_index ON unique_const_view ( Class, [Real ID] );
Now, apparently, and unbeknownst to me, this solution doesn't work in Microsoft land, because in MS SQL Server duplicate NULLs violate a UNIQUE constraint, which is against the SQL spec. This is where the problem is discussed.
This is the Microsoft workaround:
create unique nonclustered index idx on dbo.DimCustomer(emailAddress)
where EmailAddress is not null;
Filtered indexes like this one were introduced in SQL Server 2008, so this will not work on 2005.
I think a trigger is your best option. Constraints can't cross to other tables to get information. Same thing with a unique index (although I suppose a materialized view with an index might be possible), they are unique within the table. When you put the trigger together, remember to do it in a set-based fashion not row-by-row and test with a multi-row insert and multi-row update where the real key is repeated in the dataset.
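To give an idea of the shape of such a trigger, here is a sketch in SQLite via Python's sqlite3 module. It is not SQL Server syntax: SQLite triggers fire once per row, so the set-based advice above does not carry over directly, and the trigger name and error text are invented. It also only guards inserts; updates would need a matching trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE elements (class TEXT NOT NULL, synthetic_id INTEGER PRIMARY KEY);
    CREATE TABLE elements_xref (
        id           INTEGER PRIMARY KEY,
        synthetic_id INTEGER NOT NULL REFERENCES elements(synthetic_id),
        real_id      TEXT                  -- nullable: may be unknown for a while
    );

    -- Look up the new row's class through elements, then refuse the insert
    -- if another xref row of the same class already carries this real_id.
    CREATE TRIGGER xref_real_id_unique_per_class
    BEFORE INSERT ON elements_xref
    WHEN NEW.real_id IS NOT NULL
    BEGIN
        SELECT RAISE(ABORT, 'real_id already used in this class')
        WHERE EXISTS (
            SELECT 1
            FROM elements_xref x
            JOIN elements e  ON e.synthetic_id  = x.synthetic_id
            JOIN elements ne ON ne.synthetic_id = NEW.synthetic_id
            WHERE x.real_id = NEW.real_id AND e.class = ne.class
        );
    END;

    INSERT INTO elements VALUES ('A', 2), ('A', 3), ('C', 7);
    INSERT INTO elements_xref (synthetic_id, real_id) VALUES (2, '30-7D');
""")

def accepted(synthetic_id, real_id):
    """Return True if the xref row was stored, False if it was rejected."""
    try:
        conn.execute(
            "INSERT INTO elements_xref (synthetic_id, real_id) VALUES (?, ?)",
            (synthetic_id, real_id))
        return True
    except sqlite3.DatabaseError:
        return False

same_class  = accepted(3, '30-7D')  # class A already has '30-7D'
other_class = accepted(7, '30-7D')  # class C does not
null_twice  = accepted(3, None) and accepted(3, None)  # NULLs never collide
print(same_class, other_class, null_twice)
```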
I don't think either of your two reasons is an obstacle to putting Real ID in Elements. If a given element has 0 or 1 Real IDs (but never more than 1), it should absolutely be in the Elements table. This would then allow you to constrain uniqueness within Class (I think).
Could you expand on your two reasons not to do this?
Create a new table real_elements with the fields Real ID, Class and Synthetic ID and a primary key of (Class, Real ID), and add rows to it only when you actually assign a Real ID.
This constrains Real IDs to be unique within a class and gives you a way to match a class and Real ID to the synthetic ID.
As for Real ID being a foreign key: do you mean that if it appears in two classes, the data keyed off it will be the same? If so, then add another table keyed on Real ID. real_elements, and any other table needing Real ID as a foreign key, can then reference that key.