Bridge Table Primary Key or Composite/Compound Key - sql

For a Bridge Table I have the PK from 2 other tables. What are the pros and cons of making a PK field for the bridge table or making a composite/compound between the two fields.
I want to make sure I am following best practices.
Some Links I am reading over:
https://dba.stackexchange.com/questions/3134/in-sql-is-it-composite-or-compound-keys
http://social.msdn.microsoft.com/Forums/en-US/sqlgetstarted/thread/9d3cfd17-e596-4411-b3d8-66e0ec8bfdc7/
http://www.ben-morris.com/identity-surrogate-vs-composite-keys-in-sql-server
Composite primary keys versus unique object ID field

You have to enforce a unique constraint of some kind on the two foreign keys. The easiest way to do that is with a primary key constraint.
An additional, surrogate ID number isn't really useful. Some people use it because it makes foreign key constraints and joins to the "bridge" table easier to write. I think that, if you think it's hard to make a join using two integers, you shouldn't be working with databases in the first place.

Related

How can I use functionality of Primary Key and Foreign Key in Clickhouse?

I recently created a relational database model and it has a lot of primary key and foreign key relations. I want to use clickhouse for my database but it turns out that clickhouse does not support foreign key and unique primary keys. Can someone tell me if I am missing anything here.
You are right. CH does not have unique & foreign constraints.
Moreover JOINs are not the best part of ClickHouse.
ClickHouse suggests to create single wide denormalized table and avoid joins as possible.

SQL many-to-many

I need to build a simple forum (message board) as a school project. But I came across one problem. In the img above there are 2 tables: post and category, which have many-to-many relationship. I made a bridge table, which stores the postKey and categoryKey. Is it a bad practice to create a composite primary key from those 2 keys, or I need something like postCategoryKey? And what else should I improve?
On my opinion, there is no need for PostCategoryKey, due to it's only a relationship table and you won't accés it by postCategoryKey.
I would create the PK using the 2 others FK (postKey and categoryKey).
Hope it helps!
It depends, if you plan later on to add some extra metadata to postCategoryKey in a separate table, then it makes sense.
In your case - I'd go with a composite primary key and get rid of postCategoryKey
You will have to make postKey and categoryKey not null and create a unique constraint on them anyway. That makes them a key for the table, no matter whether you call this the "primary key" or not.
So, there are three options:
Leave this as described with NOT NULL and the unique constraint.
Declare the two columns the table's primary key.
Create an additional column postCategoryKey and make this the primary key.
The decision doesn't really matter. Some companies have a database style convention. In that case it's easy; just follow the company rules. Some people want every table to have a single-column primary key. If so, add that PK column. Some people want bridge tables to have a composite primary key to imediately show what identifies a row. My personal preference is the latter, but any method is as good as the other actually. Just stay consistent in your database.

Primary key in "many-to-many" table

I have a table in a SQL database that provides a "many-to-many" connection.
The table contains id's of both tables and some fields with additional information about the connection.
CREATE TABLE SomeTable (
f_id1 INTEGER NOT NULL,
f_id2 INTEGER NOT NULL,
additional_info text NOT NULL,
ts timestamp NULL DEFAULT now()
);
The table is expected to contain 10 000 - 100 000 entries.
How is it better to design a primary key? Should I create an additional 'id' field, or to create a complex primary key from both id's?
DBMS is PostgreSQL
This is a "hard" question in the sense that there are pretty good arguments on both sides. I have a bias toward putting in auto-incremented ids in all tables that I use. Over time, I have found that this simply helps with the development process and I don't have to think about whether they are necessary.
A big reason for this is so foreign key references to the table can use only one column.
In a many-to-many junction table (aka "association table"), this probably isn't necessary:
It is unlikely that you will add a table with a foreign key relationship to a junction table.
You are going to want a unique index on the columns anyway.
They will probably be declared not null anyway.
Some databases actually store data based on the primary key. So, when you do an insert, then data must be moved on pages to accommodate the new values. Postgres is not one of those databases. It treats the primary key index just like any other index. In other words, you are not incurring "extra" work by declaring one more more columns as a primary key.
My conclusion is that having the composite primary key is fine, even though I would probably have an auto-incremented primary key with separate constraints. The composite primary key will occupy less space so probably be more efficient than an auto-incremented id. However, if there is any chance that this table would be used for a foreign key relationship, then add in another id field.
A surrogate key wont protect you from adding multiple instances of (f_id1, f_id2) so you should definitely have a unique constraint or primary key for that. What would the purpose of a surrogate key be in your scenario?
Yes that's actually what people commonly do, that key is called surrogate key.. I'm not exactly sure with PostgreSQL, but in MySQL by using surrogate key you can delete/edit the records from the user interface.. Besides, this allows the database to query the single key column faster than it could multiple columns.. Hope it helps..

Can a foreign key be the only primary key

I just have a quick question. Can a table have it's only primary key as a foreign key?
To clarify. When I've been creating tables I sometimes have a table with multiple keys where some of them are foreign keys. For example:
create table Pet(
Name varchar(20),
Owner char(1),
Color varchar(10),
primary key(Name, Owner),
foreign key(Owner) referecnes Person(Ssn)
);
So now I'm wondering if it's possible to do something like this:
create table WorksAs(
Worker char(1),
Work varcahr(30),
primary key(Worker),
foreign key(Worker) references Person(Ssn)
);
This would result in two tables having the exact same primary key. Is this something that should be avoided or is it an ok way to design a database? If the above is not a good standard I would simply make the Work variable a primary key as well and that would be fine, but it seems simpler to just skip if it is not needed.
Yes, it's perfectly legal to do that.
In fact, this is the basis of IS-A relations ;)
Yes. Because of the following reasons.
Making them the primary key will force uniqueness (as opposed to imply it).
The primary key will presumably be clustered (depending on the dbms) which will improve performance for some queries.
It saves the space of adding a unique constraint which in some DBMS also creates a unique index
Yes, you might do so. But you need to be careful as foreign keys can have NULL values whereas Primary can't.
Sure. You can use this approach when mapping inheritance hierarchies using the Concrete Table Inheritance or Class Table Inheritance approach, see e.g. SQL Alchemy docs

Should every table have a primary key?

I read somewhere saying that every table should have a primary key to fulfill 1NF.
I have a tbl_friendship table.
There are 2 fields in the table : Owner and Friend.
Fields of Owner and Friends are foreign keys of auto increment id field in tbl_user.
Should this tbl_friendship has a primary key?
Should I create an auto increment id field in tbl_friendship and make it as primary key?
Primary keys can apply to multiple columns! In your example, the primary key should be on both columns, For example (Owner, Friend). Especially when Owner and Friend are foreign keys to a users table rather than actual names say (personally, my identity columns use the "Id" naming convention and so I would have (OwnerId, FriendId)
Personally I believe every table should have a primary key, but you'll find others who disagree.
Here's an article I wrote on the topic of normal forms.
http://michaeljswart.com/2011/01/ridiculously-unnormalized-database-schemas-part-zero/
Yes every table should have a primary key.
Yes you should create surrogate key.. aka an auto increment pk field.
You should also make "Friend" an FK to that auto increment field.
If you think that you are going to "rekey" in the future you might want to look into using natural keys, which are fields that naturally identify your data. The key to this is while coding always use the natural identifiers, and then you create unique indexes on those natural keys. In the future if you have to re-key you can, because your ux guarantees your data is consistent.
I would only do this if you absolutely have to, because it increases complexity, in your code and data model.
It is not clear from your description, but are owner and friend foreign keys and there can be only one relationship between any given pair? This makes two foreign key column a perfect candidate for a natural primary key.
Another option is to use surrogate key (extra auto-incremented column as you suggested). Take a look here for an in-depth discussion.
A primary key can be something abstract as well. In this case, each tuple (owner, friend), e.g. ("Dave","Matt") can form a unique entry and therefore be your primary key. In that case, it would be useful not to use names, but keys referencing another table. If you guarantee, that these tuples can't have duplicates, you have a valid primary key.
For processing reasons it might be useful to introduce a special primary key, like an autoincrement field (e.g. in MySQL) or using a sequence with Oracle.
To comply with 1NF (which is not completely aggreed upon what defines 1NF), yes you should have a primary key identified on each table. This is necessary to provide for uniqueness of each record.
http://en.wikipedia.org/wiki/First_normal_form
In general, you can create a primary key in many ways, one of which is to have an auto-increment column, another is to have a column with GUIDs, another is to have two or more columns that will identify a row uniquely when taken together.
Your table will be much easier to manage in the long term if it has a primary key. At the very least, you need to uniquely identify each record in the table. The field that is used to uniquely identify each record might as well be the primary key.
Yes every table should have (at least one) key. Duplicating rows in any table is undesirable for lots of reasons so put the constraint on those two columns.