multiple tables need one to many relationship - sql

I have a SQL database with multiple tables: A, B, C, D. Entities in those tables are quite different things, with different columns, and different kind of relations between them.
However, they all share one little thing: the need for a comment system which, in this case, would have an identical structure: author_id, date, content, etc.
I wonder which strategy would be the best for this schema to have A,..D tables use a comment system. In a classical 'blog' web site I would use a one-to-many relationship with a post_id inside the 'comments' table.
Here it looks like I need an A_comments, B_comments, etc tables to handle this problem, which looks a little bit weird.
Is there a better way?

Create a comment table with a comment_id primary key and the various attributes of a comment.
Additionally, create A_comment thus:
CREATE TABLE A_comment (
comment_id PRIMARY KEY REFERENCES comment(comment_id),
A_id REFERENCES A(A_id)
)
Do likewise for B, C and D. This ensures referential integrity between comment and all the other tables, which you can't do if you store the ids to A, B, C and D directly in comment.
Declaring A_comment.comment_id as the primary key ensures that a comment can only belong to one entry in A. It doesn't prevent a comment from belonging to an entry in A and an entry in B, but there's only so much you can achieve with foreign keys; this would require database-level constraints, which no database I know of supports.
This design also doesn't prevent orphaned comments, but I can't think of any way to prevent this in SQL, except, of course, to do the very thing you wanted to avoid: create multiple comment tables.

I had a similar "problem" with comments for more different object types (e.g. articles and shops in my case, each type has it's own table).
What I did: in my comments table I have two columns that manage the linking:
object_type (ENUM type) that determines the object/table we are linking to, and
object_id (unsigned integer type that matches the primary key of your other tables (or the biggest of them)) that point to the exact row in the particular table.
The column structure is then: id, object_type, object_id, author_id, date, content, etc.
Important thing here is to have an index on both of the columns, (object_type, object_id), to make indexing fast.

I presume that you are talking of a single comments table with a foreign key to "exactly one of" A, B, C or D.
The fact that SQL cannot handle this is one of its fundamental weaknesses. The question gets asked over and over and over again.
See, e.g.
What is the best way to enforce a 'subset' relationship with integrity constraints
Your constraint is a "foreign key" from your "single comments" table into a view, which is the union of the identifiers in A, B, C and D. SQL supports only foreign keys into base tables.
Observe that SQL as a language does have support for your situation, in the form of CREATE ASSERTION. I know of no SQL products that support that statement, however.
EDIT
You should also keep in mind that with a 'single' comments table, you might need to enforce disjointness between the keys in A,B,C and D, for otherwise it might happen some time that a comment automatically gets "shared" between entity occurrences from different tables, which might not be desirable.

You could have a single comments table and within that table have a column that contains a value differentiating which table the comment belongs to - ie a 1 in that column means it's a comment for table A, 2 for table B, and so on. If you didn't want to have "magic numbers" in the comments table, you could have another table that has just two columns: one with the number and another detailing which table the number represents.

You don't need a separate comment table for every other, one is enough. Every comment will have a unique ID, so you don't have to worry about conflicts.

Related

How to remove redundancy when a lot of columns can have repeating data?

I am working on a library system. I have a book table in which there are 12 columns including author, title, publication etc. Among those 12 there are 2 columns book_id and isbn that are unique. When a library get n copies of a same book the only columns that are going to be unique are book_id and isbn. All other columns are going to repeat n times. Is this a bad design. Is there a way to remove this redundency? I thought of creating another table where I could store uniquely all the redudent columns and seperate book_id and isbn form it. Should I do that or is the table just fine as it is?
I think you should use two different tables.
In one table you can store unique keys like book_id, isbn_id. And in another table you can use one of these book_id or isbn_id for a reference foreign key. Or other way is you can create an auto increment id also in first table for foreign key reference to second table.
In second table you can keep all other 10 columns + 1 column for reference from 1st table.
This will help you to maintain redundancy and this first table can be also used for mapping as well if in future you want to extend database tables.
What you have described sounds like two different tables:
books which defines the attributes of a book.
copies which defines the attributes of an incoming copy.
You seem to understand the attributes of books.
The attributes of copies might include (besides a foreign key reference to books)
condition
acquired_date
source
other information about the acquisition
It might also have "disposed of" date as well. Or you could maintain that in a copies_transactions table of some sort.

Is it possible to implement a TRUE one-to-one relation?

Consider the following model where a Customer should have one and only one Address and an Address should belong to one and only one Customer:
To implement it, as almost everybody in DB field says, Shared PK is the solution:
But I think it is a fake one-to-one relationship. Because nothing in terms of database relationship actually prevents deleting any row in table Address. So truely, it is 1..[0..1] not 1..1
Am I right? Is there any other way to implement a true 1..1 relation?
Update:
Why cascade delete is not a solution:
If we consider cascade delete as a solution we should put this on either of the tables. Let's say if a row is deleted from table Address, it causes corresponding row in table Customer to be deleted. it's okay but half of the solution. If a row in Customer is deleted, the corresponding row in Address should be deleted as well. This is the second half of the solution, and it obviously makes a cycle.
Beside my comment
You could implement DELETE CASCADE See HOW
I realize there is also the problem of insert.
You have to insert Customer first and then Address
So I think the best way if you really want a 1:1 is create a single table instead.
Customer
CustomerID
Name
Address
City
Sorry, is this meant to be a real-world database relationship? In all of the many databases I have ever built with customer data, there has always been real cases of either customers with multiple addresses, or more than one organisation at the same address.
I wouldn't want to lead you into a database modelling fallacy by suggesting anything different.
Yes, the "shared PK" idiom you show is for 1-to-0-or-1.
The straightforward way to have a true 1-to-1 correspondence is to have one table with Customer and Address as CKs (candidate keys). (Via UNIQUE NOT NULL and/or PRIMARY KEY.) You could offer the separate tables as views. Unfortunately typical DBMSs have restrictions on what you can do via the views, in particular re updating.
The relational way to have separate CUSTOMER and ADDRESS tables and a third table/association/relationship with Customer and Address columns as CKs plus FKs on Customer to and from CUSTOMER and on Address to and from ADDRESS (or equivalent constraint(s)). Unfortunately most DBMSs needlessly won't let you declare cycles in FKs and you cannot impose the constraints without triggers/complexity. (Ultimately, if you want to have proper integrity in a typical SQL database you need to use triggers and complex idioms.)
Entity-oriented design methods unfortunately artificially distinguish between entities, associations and properties. Here is an example where if you consider the simplest design to simply be the one table with PKs then you don't want to always have to have distinct tables for each entity. Or if you consider the simplest design to be the three tables (or even two) with the PKs and FKs (or some other constraint(s) for 1-to-1) then unfortunately typical DBMSs just don't declaratively/ergonomically support that particular design situation.
(Straightforward relational design is to have values (that are sometimes used as ids) 1-to-1 with application things but then just have whatever relevant application relationships/associations/relations and corresponding/representing tables/relations as needed to describe your application situations.)
It's possible in principle to implement a true 1-1 data structure in some DBMSs. It's very difficult to add data or modify data in such a structure using standard SQL however. Standard SQL only permits one table to be updated at a time and therefore as soon as you insert a row into one or other table the intended constraint is broken.
Here are two examples. First using Tutorial D. Note that the comma between the two INSERT statements ensures that the 1-1 constraint is never broken:
VAR CUSTOMER REAL RELATION {
id INTEGER} KEY{id};
VAR ADDRESS REAL RELATION {
id INTEGER} KEY{id};
CONSTRAINT one_to_one (CUSTOMER{id} = ADDRESS{id});
INSERT CUSTOMER RELATION {
TUPLE {id 1234}
},
INSERT ADDRESS RELATION {
TUPLE {id 1234}
};
Now the same thing in SQL.
CREATE TABLE CUSTOMER (
id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE ADDRESS (
id INTEGER NOT NULL PRIMARY KEY);
INSERT INTO CUSTOMER (id)
VALUES (1234);
INSERT INTO ADDRESS (id)
VALUES (1234);
ALTER TABLE CUSTOMER ADD CONSTRAINT one_to_one_1
FOREIGN KEY (id) REFERENCES ADDRESS (id);
ALTER TABLE ADDRESS ADD CONSTRAINT one_to_one_2
FOREIGN KEY (id) REFERENCES CUSTOMER (id);
The SQL version uses two foreign key constraints, which is the only kind of multi-table constraint supported by most SQL DBMSs. It requires two INSERT statements which means I could only insert a row before adding the constraints, not after.
A strict one-to-one constraint probably isn't very useful in practice but it's actually just a special case of something more important and interesting: join dependency. A join dependency is effectively an "at least one" constraint between tables rather than "exactly one". In the world outside databases it is common to encounter examples of business rules that ought to be implemented as join dependencies ("each customer must have AT LEAST ONE addresss", "each order must have AT LEAST ONE item in it"). In SQL DBMSs it's hard or impossible to implement join dependencies. The usual solution is simply to ignore such business rules thus weakening the data integrity value of the database.
Yes, what you say is true, the dependent side of a 1:1 relationship may not exist -- if only for the time it takes to create the dependent entity after creating the independent entity. In fact, all relationships may have a zero on one side or the other. You can even turn the relationship into a 1:m by placing the FK of the address in the Customer row and making the field not null. You can still have addresses that aren't referenced by any customer.
At first glance, a m:n may look like an exception. The intersection entry is generally defined so that neither FK can be null. But there can be customers and addresses both that have no entry referring to them. So this is really a 0..m:0..n relationship.
What of it? Everyone I've ever worked with has understood that "one" (as in 1:1) or "many" (as in 1:m or m:n) means "no more than this." There is no "exactly this, no more or less." For example, we can design a 1:3 relationship on paper. We cannot strictly enforce it in any database. We have to use triggers, stored procedures and/or scheduled tasks to seek out and call our attention to deviations. Execute a stored procedure weekly, for instance, that will seek and and flag or delete any such orphaned addresses.
Think of it like a "frictionless surface." It exists only on paper.
I see this question as a conceptual misunderstanding. Relations are between different things. Things with a "true 1-to-1 relation" are by definition aspects or attributes of the same thing, and belong in the same table. No, of course a person and and address are not the same, but if they are inseparable, and must always be inserted, deleted, or otherwise acted upon as a unit, then as data they are "the same thing". This is exactly what is described here.
Yes, and it's actually quite easy: just put both entities in the same table!
OTOH, if you need to keep them in separate tables for some reason, then you need a key in one table referencing1 a key in another, and vice-versa. This, of course, represents a "chicken and egg" problem2 which can be resolved by deferring the enforcement of FKs to the end of the transaction3. This works only on DBMSes that support deferred constraints (such as Oracle and PostgreSQL).
1 Via a foreign key.
2 Inserting a row in the first table is impossible because that would violate the referential integrity towards the second table, but inserting a row in the second table is impossible because that would violate the referential integrity towards the first table, etc... Ditto for deletion.
3 So you simply insert both rows, and then check both FKs.

Is it possible to implement an any-any relationship only using 2 tables?

I don't know whether my idea below is applicable:
I have 2 tables, namely A and B.
Each row in table A can be associated with zero or more rows of table B.
Each row in table B can also be associated with zero or more rows of table A.
Table A contains (among others) 2 columns AId (as a primary key) and BId (as a foreign key).
Table B also contains (among others) 2 columns BId (as a primary key) and AId (as a foreign key).
A cascade delete rule is also setup for each foreign key relationship in DB and model class.
It means deleting a row of A will also delete rows, associated with it, of B or deleting a row of B will delete rows, associated with it, of A.
Is it practically possible to do this scenario?
No, not if you are following normal form.
many to Many relationships are a hallmark of needing an intersection table.
More info:
So here's an example. question tagging. A tag can be on multiple questions, and a question can have multiple tags. this is a many to many realtionship. You COULD put multiple entries of tagIds in the Question entity's Tag Column. But you lose A LOT by doing this.
You will not have integrity, because it is VERY difficult to maintain whether or not a tag exists in your tag table as well as in the questions tag column.
This also violates normal form, because a single column cannot have multiple values.
You also cannot easily join on that column, since it has multiple values in it.
I'm assuming that by 'any-any' relationship you are referring to 'many-to-many'.
What you describe in your post is not a many-to-many relation. What you describe is two separate one-to-many relations.
You have a one-to-many relation from TableA to TableB via the AId column in TableB. And you have another one-to-many relation from TableB to TableA via the BId column in TableA. Having two one-to-many relationships in opposite direction is not the same thing as having a many-to-many relationship. Take Stefan's tagging example and consider three queries (QId1, QId2 and QId3) and three tags (TId1, TId2 and TId3). Try to express that all QId1, QId2 and QId3 are tagged each with all TId1, TId2 and TId3. You'll realize that you cannot, because you're trying to express 9 relations in only 6 available 'foreign key' fields. A true many-to-many relation requires up to MxN 'links' possible between two tables of size M and N, while your design allows for M+N (not surprising, since your design is M links in one of the 1-to-many relations and another N links in the other 1-to-many relation).
What you need is a join table. So you have an a and b table that each have a primary key. Then you create an ab table that only has 2 columns. Both are a foreign key. One goes to the a table and the other goes to the b table.
Google for "Database Normalization". You will find lots of examples.
It's possible, but you definitely don't want to do it that way.
You can use a comma separated string of identities in one of the tables. Looking up value from that table to the other is of course a major hassle. Looking up values from the the other table is a nightmare.
Using a cascading trigger with this method is of course out of the question. It might be possible by making an update trigger do the work, but the performance for that would be so bad that it's pointless.
To do this efficiently you absolutely need another table for the relations.

Do multiple foreign keys make sense?

can it make sense for one table to have multiple foreign keys?
Suppose I have three tables, Table A, Table B and Table C. If I think of the tables as objects (and they are mapped to objects in my code), then both Tables A and B have a many to one relationship with Table C. I.e. Table/object A and B can each have many instances of C. So the way I designed it is that Table C points to the primary key in both Table A and Table B: in other words, Table C has 2 foreign keys (a_Id, and b_Id).
Edit: I forgot to mention also that Table A can have many instances of Table B. So Table B has a foreign key into Table A. If this makes a difference...
I am wondering if this makes sense or is there a better way to do it? Thanks.
This is fine, but note that it only makes sense if a C always has to have both an A and a B as a pair.
If you just want A's to have C's and B's to have C's, but A and B are otherwise unrelated then you should put the foreign key in A and in B and allow it to be nullable.
Update: after clarification it seems you want two separate relationships: an A can have many Cs, and a B can have many Cs, but a C can only belong to one A or one B.
Solution: It's two separate one-to-many relationships, so create two new tables A_C and B_C, and put the foreign keys there. A_C contains a foreign key to A and a foreign key to C. Similarly for B_C.
In your scenario, of two different tables being referenced by a third, this is also fine. I have tables in my databases that have 3-4 or even more foreign keys. This is provided that the entity always requires all of the references to exist.
Many to many relationships are also implemented as a single table with two (or more) foreign keys, so yes, they do make sense in that context.
See this article about implementing this kind of relationship for PHP and MySQL.
If you can't formulate the relations between A,B,C objects any other way, it makes perfect sense to define the FKs like you did.
Yes, that makes perfect sense. It is not at all uncommon to have a table with multiple foreign keys to other tables.
I think it's ok to do it this way, but maybe that's because I do it this way. In my case, I have a table full of people, and a table full of roles those people can fulfill. Since each person can be in any number of those roles, the simplest way to have it work was to add a third table tracking these relationships.
It's not a great solution, but it's better than adding a new column to the table each time a new role comes along, and having to rewrite the code running those queries each time. And I sure can't think of a better way to handle it!

Multiple foreign keys to a single column

I'm defining a database for a customer/ order system where there are two highly distinct types of customers. Because they are so different having a single customer table would be very ugly (it'd be full of null columns as they are pointless for one type).
Their orders though are in the same format. Is it possible to have a CustomerId column in my Order table which has a foreign key to both the Customer Types? I have set it up in SQL server and it's given me no problems creating the relationships, but I'm yet to try inserting any data.
Also, I'm planning on using nHibernate as the ORM, could there be any problems introduced by doing the relationships like this?
No, you can't have a single field as a foreign key to two different tables. How would you tell where to look for the key?
You would at least need a field that tells what kind of user it is, or two separate foreign keys.
You could also put the information that is common for all users in one table and have separate tables for the information that is specific for the user types, so that you have a single table with user id as primary key.
A foreign key can only reference a single primary key, so no. However, you could use a bridge table:
CustomerA <---- CustomerA_Orders ----> Order
CustomerB <---- CustomerB_Orders ----> Order
So Order doesn't even have a foreign key; whether this is desirable, though...
I inherited a SQL Server database where this was done (a single column used in four foreign key relationships with four unrelated tables), so yes, it's possible. My predecessor is gone, though, so I can't ask why he thought it was a good idea.
He used a GUID column ("uniqueidentifier" type) to avoid the ambiguity problem, and he turned off constraint checking on the foreign keys, since it's guaranteed that only one will match. But I can think of lots of reasons that you shouldn't, and I haven't thought of any reasons you should.
Yours does sound like the classical "specialization" problem, typically solved by creating a parent table with the shared customer data, then two child tables that contain the data unique to each class of customer. Your foreign key would then be against the parent customer table, and your determination of which type of customer would be based on which child table had a matching entry.
You can create a foreign key referencing multiple tables. This feature is to allow vertical partioining of your table and still maintain referential integrity. In your case however, this is not applicable.
Your best bet would be to have a CustomerType table with possible columns - CustomerTypeID, CustomerID, where CustomerID is the PK and then refernce your OrderID table to CustomerID.
Raj
I know this is a very old question; however if other people are finding this question through the googles, and you don't mind adding some columns to your table, a technique I've used (using the original question as a hypothetical problem to solve) is:
Add a [CustomerType] column. The purpose of storing a value here is to indicate which table holds the PK for your (assumed) [CustomerId] FK column. Optional - addition of a check constraint (to ensure CustomerType is in CustomerA or CustomerB) will help you sleep better at night.
Add a computed column for each [CustomerType], eg:
[CustomerTypeAId] as case when [CustomerType] = 'CustomerA' then [CustomerId] end persisted
[CustomerTypeBId] as case when [CustomerType] = 'CustomerB' then [CustomerId] end persisted
Add your foreign keys to the calculated (and persisted) columns.
Caveat: I'm primarily in a MSSQL environment; so I don't know how well this translates to other DBMS (ie: Postgres, ORACLE, etc).
As noted, if the key is, say, 12345, how would you know which table to look it up in? You could, I suppose, do something to insure that the key values for the two tables never overlapped, but this is too ugly and painful to contemplate. You could have a second field that says which customer type it is. But if you're going to have two fields, why not have one field for customer type 1 id and another for customer type 2 id.
Without knowing more about your app, my first thought is that you really should have a general customer table with the data that is common to both, and then have two additional tables with the data specific to each customer type. I would think that there must be a lot of data common to the two -- basic stuff like name and address and customer number at the least -- and repeating columns across tables sucks big time. The additional tables could then refer back to the base table. As there is then a single key for the base table, the issue of foreign keys having to know which table to refer to evaporates.
Two distinct types of customer is a classic case of types and subtypes or, if you prefer, classes and subclasses. Here is an answer from another question.
Essentially, the class-table-inheritance technique is like Arnand's answer. The use of the shared-primary-key technique is what allows you to get around the problems created by two types of foreign key in one column. The foreign key will be customer-id. That will identify one row in the customer table, and also one row in the appropriate kind of customer type table, as the case may be.
Create a "customer" table include all the columns that have same data for both types of customer.
Than create table "customer_a" and "customer_b"
Use "customer_id" from "consumer" table as foreign key in "customer_a" and "customer_b"
customer
|
---------------------------------
| |
cusomter_a customer_b