Database "key/ID" design ideas, Surrogate Key, Primary Key, etc - sql

So I've seen several mentions of a surrogate key lately, and I'm not really sure what it is and how it differs from a primary key.
I always assumed that ID was my primary key in a table like this:
Users
ID, Guid
FirstName, Text
LastName, Text
SSN, Int
however, wikipedia defines a surrogate key as "A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data."
According to Wikipedia, it looks like ID is my surrogate key, and my primary key might be SSN+ID? Is this right? Is that a bad table design?
Assuming that table design is sound, would something like this be bad, for a table where the data didn't have anything unique about it?
LogEntry
ID, Guid
LogEntryID, Int [sql identity field +1 every time]
LogType, Int
Message, Text

No, your ID can be both a surrogate key (which just means it's not "derived from application data", e.g. an artificial key), and it should be your primary key, too.
The primary key is used to uniquely and safely identify any row in your table. It has to be stable, unique, and NOT NULL - an "artificial" ID usually has those properties.
I would normally recommend against using "natural" or real data for primary keys - are not REALLY 150% sure it's NEVER going to change?? The Swiss equivalent of the SSN for instance changes each time a woman marries (or gets divorced) - hardly an ideal candidate. And it's not guaranteed to be unique, either......
To spare yourself all that grief, just use a surrogate (artificial) ID that is system-defined, unique, and never changes and never has any application meaning (other than being your unique ID).
Scott Ambler has a pretty good article here which has a "glossary" of all the various keys and what they mean - you'll find natural, surrogate, primary key and a few more.

First, a Surrogate key is a key that is artificially generated within the database, as a unique value for each row in a table, and which has no dependency whatsoever on any other attribute in the table.
Now, the phrase Primary Key is a red herring. Whether a key is primary or an alternate doesn't mean anything. What matters is what the key is used for. Keys can serve two functions which are fundementally inconsistent with one another.
They are first and foremost there to ensure the integrity and consistency of your data! Each row in a table represents an instance of whatever entity that table is defined to hold data for. No Surrogate Key, by definition, can ever perform this function. Only a properly designed natural Key can do this. (If all you have is a surrogate key, you can always add another row with every other attributes exactly identical to an existing row, as long as you give it a different surrogate key value)
Secondly they are there to serve as references (pointers) for the foreign Keys in other tables which are children entities of an entity in the table with the Primary Key. A Natural Key, (especially if it is a composite of multiple attributes) is not a good choice for this function because it would mean tha that A) the foreign keys in all the child tables would also have to be composite keys, making them very wide, and thereby decreasing performance of all constraint operations and of SQL Joins. and B) If the value of the key changed in the main table, you would be required to do cascading updates on every table where the value was represented as a FK.
So the answer is simple... Always (wherever you care about data integrity/consistency) use a natural key and, where necessary, use both! When the natural key is a composite, or long, or not stable enough, add an alternate Surrogate key (as auto-incrementing integer for example) for use as targets of FKs in child tables. But at the risk of losing data consistency of your table, DO NOT remove the natural key from the main table.
To make this crystal clear let's make an example.
Say you have a table with Bank accounts in it... A natural Key might be the Bank Routing Number and the Account Number at the bank. To avoid using this twin composite key in every transaction record in the transactions table you might decide to put an artificially generated surrogate key on the BankAccount table which is just an integer. But you better keep the natural Key! If you didn't, if you did not also have the composite natural key, you could quite easily end up with two rows in the table as follows
id BankRoutingNumber BankAccountNumber BankBalance
1 12345678932154 9876543210123 $123.12
2 12345678932154 9876543210123 ($3,291.62)
Now, which one is right?
To marc from comments below, What good does it do you to be able to "identify the row"?? No good at all, it seems to me, because what we need to be able to identify is which bank account the row represents! Identifying the row is only important for internal database technical functions, like joins in queries, or for FK constraint operations, which, if/when they are necessary, should be using a surrogate key anyway, not the natural key.
You are right in that a poor choice of a natural key, or sometimes even the best available choice of a natural key, may not be truly unique, or guaranteed to prevent duplicates. But any choice is better than no choice, as it will at least prevent duplicate rows for the same values in the attributes chosen as the natural key. These issues can be kept to a minimum by the appropriate choice of key attributes, but sometimees they are unavoidable and must be dealt with. But it is still better to do so than to allow incorrect inaccurate or redundant data into the database.
As to "ease of use" If all you are using the natural key for is to constrain the insertion of duplicate rows, and you are using another, surrogate, key as the target for FK constraints, I do not see any ease of use issues of concern.

Wow, you opened a can of worms with this question. Database purists will tell you never to use surrogate keys (like you have above). On the other hand, surrogate keys can have some tremendous benefits. I use them all the time.
In SQL Server, a surrogate key is typically an auto-increment Identity value that SQL Server generates for you. It has NO relationship to the actual data stored in the table. The opposite of this is a Natural key. An example might be Social Security number. This does have a relationship to the data stored in the table. There are benefits to natural keys, but, IMO, the benefits to using surrogate keys outweigh natural keys.
I noticed in your example, you have a GUID for a primary key. You generally want to stay away from GUIDS as primary keys. The are big, bulky and can often be inserted into your database in a random way, causing major fragmentation.
Randy

The reason that database purists get all up in arms about surrogate keys is because, if used improperly, they can allow data duplication, which is one of the evils that good database design is meant to banish.
For instance, suppose that I had a table of email addresses for a mailing list. I would want them to be unique, right? There's no point in having 2, 3, or n entries of the same email address. If I use email_address as my primary key ( which is a natural key -- it exists as data independently of the database structure you've created ), this will guarantee that I will never have a duplicate email address in my mailing list.
However, if I have a field called id as a surrogate key, then I can have any number of duplicate email addresses. This becomes bad if there are then 10 rows of the same email address, all with conflicting subscription information in other columns. Which one is correct, if any? There's no way to tell! After that point, your data integrity is borked. There's no way to fix the data but to go through the records one by one, asking people what subscription information is really correct, etc.
The reason why non-purists want it is because it makes it easy to use standardized code, because you can rely on refering to a single database row with an integer value. If you had a natural key of, say, the set ( client_id, email, category_id ), the programmer is going to hate coding around this instance! It kind of breaks the encapsulation of class-based coding, because it requires the programmer to have deep knowledge of table structure, and a delete method may have different code for each table. Yuck!
So obviously this example is over-simplified, but it illustrates the point.

Users Table
Using a Guid as a primary key for your Users table is perfect.
LogEntry table
Unless you plan to expose your LogEntry data to an external system or merge it with another database, I would simply use an incrementing int rather than a Guid as the primary key. It's easier to work with and will use slightly less space, which could be significant in a huge log stretching several years.

The primary key is whatever you make it. Whatever you define as the primary key is the primary key. Usually its an integer ID field.
The surrogate key is also this ID field. Its a surrogate for the natural key, which defines uniqueness in terms of your application data.
The idea behind having an integer ID as the primary key (even it doesnt really mean anything) is for indexing purposes. You would then probably define a natural key as a unique constraint on your table. This way you get the best of both worlds. Fast indexing with your ID field and each row still maintains its natural uniqueness.
That said, some people swear by just using a natural key.

There are actually three kinds of keys to talk about. The primary key is what is used to uniquely identify every row in a table. The surrogate key is an artificial key that is created with that property. A natural key is a primary key which is derived from the actual real life data.
In some cases the natural key may be unwieldy so a surrogate key may be created to be used as a foreign key, etc. For example, in a log or diary the PK might be the date, time, and the full text of the entry (if it is possible to add two entries at the exact same time). Obviously it would be a bad idea to use all of that every time that you wanted to identify a row, so you might make a "log id". It might be a sequential number (the most common) or it might be the date plus a sequential number (like 20091222001) or it might be something else. Some natural keys may work well as a primary key though, such as vehicle VIN numbers, student ID numbers (if they are not reused), or in the case of joining tables the PKs of the two tables being joined.
This is just an overview of table key selection. There's a lot to consider, although in most shops you'll find that they go with, "add an identity column to every table and that's our primary key". You then get all of the problems that go with that.
In your case I think that a LogEntryID for your log items seems reasonable. Is the ID an FK to the Users table? If not then I might question having both the ID and the LogEntryID in the same table as they are redundant. If it is, then I would change the name to user_id or something similar.

Related

Surrogate key 'preference' explanation

As I understand there is a war going on between purists of natural key and purists of surrogate key. In likes to this this post (there are more) people say 'natural key is bad for you, always use surrogate...
However, either I am stupid or blind but I can not see a reason to have surrogate key always!
Say you have 3 tables in configuration like this:
Why would I need a surrogate key for it?? I mean it makes perfect sense not to have it.
Also, can someone please explain why primary keys should never change according to surrogate key purists? I mean, if I have say color_id VARCHAR(30) and a key is black, and I no longer need black because I am changing it to charcoal, why is it a bad idea to change black key to charcoal and all referencing columns too?
EDIT: Just noticed that I dont even need to change it! Just create new one, change referencing columns (same as I would have to do with surrogate key) and leave old one in peace....
In surrogate key mantra I need to then create additional entry with, say, id=232 and name=black. How does that benefit me really? I have a spare key in table which I don't need any more. Also I need to join to get a colour name while otherwise I can stay in one table and be merry?
Please explain like to a 5 year old, and please keep in mind that I am not trying to say 'surrogate key is bad', I am trying to understand why would someone say things like 'always use surrogate key!'.
Surrogate keys are useful where there is an suboptimal natural key: no more, no less.
A suboptimal natural key would be a GUID or varchar or otherwise wide/non-ordered.
However, the decision to use a surrogate is an implementation decision after the conceptual and logical modelling process, based on knowledge of how the chosen RDBMS works.
However, this best practice of "have a surrogate key" is now "always have a surrogate key" and it introduced immediately. Object Relation Mappers also often add surrogate keys to all tables whether needed or not which doesn't help.
For a link (many-many) table, you don't need one: SQL: Do you need an auto-incremental primary key for Many-Many tables?. For a table with 2 int columns, the overhead is an extra 50% of data for a surrogate column (assuming ints and ignoring row metadata)
Well, I am more on the natural keys myself :)
But surrogate keys can have its advantages, even if you like me want to go "natural" all the way :)
For example, I have a table that, due to various constraints, has to be defined as being dependent from others. Something like
Table Fee (
foreign_key1,
foreign_key2,
foreign_key3,
value)
the record is defined/identified by the three foreign keys but at the same time, you can have at most 2 of them to be null.
So you cannot create them as a primary keys (u'll just put an unique on the 3 columns)
In order to have a primary key on that table, the only way to do that is to use a surrogate :)
Now... why not to change a primary key... This can be considered pretty philosophical... I see it in this way, hope it will make sense...
A primary key, in itself, is not only a combination of unique+not null, it is more about "the real essence of the record", what it defines the record at the core.
In that sense, it is not something you could change easily, could you?
Consider yourself as an example. You have a nick, but it does not defines what u really are. You could change it, but the essence of being yourself would not change.
Now, if you maintain the nickname, but change your essence... would it still be the same person? Nope, it would make more sense to consider it a "new" person... And for records it's the same...
So that's why you usually do not change the primary key and define a new record from scratch
Always remember surrogate key is the additional column for our actual table columns let us take a table columns like below
patient_name
address
mobile_no
email_address
See here imagine we are working with admission of patient records so here we can't take mobile_no has primary key because we can take but some people might not have mobile no instead of this go for surrogate key and make it as primary key and make actual mobile_no, patient_name as primary key then we can easily perform ..here if mobile no changed no problem we can still search with the help of surrogate key
like below..
Here you can write surrogate key on the top of actual data
patient_no----->primary key[surrogate key]
patient_name ---->pk
address
mobile_no--->pk
email_address

SQL: Primary key column. Artificial "Id" column vs "Natural" columns [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Relational database design question - Surrogate-key or Natural-key?
When I create relational table there is a temptation to choose primary key column the column which values are unique. But for optimization and uniformity purposes I create artifical Id column every time. If there is a column (or columns combination) that should be unique I create Unique Index for that instead of marking them as (composite) primary key column(s).
Is it really a good practice always to prefer artificial "Id" column + indexes instead of natural columns for a primary key?
This is a bit of a religious debate. My personal preference is to have synthetic primary keys rather than natural primary keys but there are good arguments on both sides. Realistically, so long as you are consistent and reasonable, either approach can work well.
If you use natural keys, the two major downsides are the presence of composite keys and mutating primary key values. If you have composite primary keys, you'd obviously have to have multiple columns in each child table. That can get unwieldy from a data model perspective when there are many relationships among entities. But it can also cause grief for people developing queries-- it's awfully easy to create queries that use N-1 of N join conditions and get almost the right result. If you have natural keys, you'll also inevitably encounter a situation where the natural key value changes and you then have to ripple that change through many different entities-- that's vastly more complicated than changing a unique value in the table.
On the other hand, if you use synthetic keys, you're wasting space by adding additional columns, adding additional overhead to maintain an additional index, and you're increasing the risk that you'll get functionally duplicated results. It's awfully easy to either forget to create a unique constraint on the business key or to see that there is a non-unique index on the combination and just assume that it was a unique index. I actually just got bitten by this particular failing a couple days ago-- I had indexed the composite natural key (with a non-unique index) rather than creating a unique constraint. Dumb mistake but one that's relatively easy to make.
From a query writing and naming convention standpoint, I would also tend to prefer synthetic keys because it's nice to know when you're joining tables that the primary key of A is going to be A_ID and the primary key of B is going to be B_ID. That's far more self-documenting than trying to remember that the primary key of A is the combination of A_NAME and A_REVISION_NUMBER and that the primary key of B is B_CODE.
There is little or no difference between a key enforced through a PRIMARY KEY constraint and a key enforced through a UNIQUE constraint. What's important is that you enforce ALL the keys necessary from a data integrity perspective. Usually that means at least one "natural" key (a key exposed to the users/consumers of the data and used to identify the facts about the universe of discourse) per table.
Optionally you might also want to create "technical" keys to support the application and database features rather than the end user (usually called surrogate keys). That should be very much a secondary consideration however. In the interests of simplicity (and very often performance as well) it usually makes sense only to create surrogate keys where you have identified a particular need for them and not before.
It depends on your natural columns. If they are small and steadily increasing, then they are good candidates for the primary key.
Small - the smaller the key, the more values you can get into a single row, and the faster your index scans will be
Steadily increasing - produces fewer index reshuffles as the table grows, improving performance.
My preference is to always use an artificial key.
First it is consistent. Anyone working on your application knows that there is a key and they can make assumptions on it. This makes it easier to understand and maintain.
I've also seen scenarios where the natural key (aka. a string from an HR system that identifies an employee) has to change during the life of the application. If you have an artificial key that links the natural id to your employee record then you only have to change that natural id in the one table. However, if that natural id is a primary key and you have it duplicated across a number of other tables as a foreign key, then you have a mess on your hands.
In my humble opinion, it is always better to have an artificial Id, if I understand properly your meaning of it.
Some people would use, for instance, business significant unique values as their table Id, and I have already read on MSDN, and even in the NHibernate official documentation that a unique business meaningless value is prefered (artificial Id), though you want to create an index on that value for future reference. So, the day the company will change their nomenclature, the system shall still be running flawlessly.
Yes, it is. If nothing else, one of the most important properties of the artificial primary key is opacity, which means the artificial key doesn't reflect any information beyond itself; if you use natural row contents for keys, you wind up exposing that information to things like Web interfaces, which is just a terrible idea on all manner of principle.

Relational database design question - Surrogate-key or Natural-key?

Which one is the best practice and Why?
a) Type Table, Surrogate/Artificial Key
Foreign key is from user.type to type.id:
b) Type Table, Natural Key
Foreign key is from user.type to type.typeName:
I believe that in practice, using a natural key is rarely the best option. I would probably go for the surrogate key approach as in your first example.
The following are the main disadvantages of the natural key approach:
You might have an incorrect type name, or you may simply want to rename the type. To edit it, you would have to update all the tables that would be using it as a foreign key.
An index on an int field will be much more compact than one on a varchar field.
In some cases, it might be difficult to have a unique natural key, and this is necessary since it will be used as a primary key. This might not apply in your case.
The first one is more future proof, because it allows you to change the string representing the type without updating the whole user table. In other words you use a surrogate key, an additional immutable identifier introduced for the sake of flexibility.
A good reason to use a surrogate key (instead of a natural key like name) is when the natural key isn't really a good choice in terms of uniqueness. In my lifetime i've known no fewer than 4 "Chris Smith"s. Person names are not unique.
I prefer to use the surrogate key. It is often people will identity and use the natural key which will be fine for a while, until they decide they want to change the value. Then problems start.
You should probably always use an ID number (that way if you change the type name, you don't need to update the user table) it also allows you to keep your datasize down, as a table full of INTs is much smaller than one full of 45 character varchars.
If typeName is a natural key, then it's probably the preferable option, because it won't require a join to get the value.
You should only really use a surrogate key (id) when the name is likely to change.
Surrogate key for me too, please.
The other might be easier when you need to bang out some code, but it will eventually be harder. Back in the day, my tech boss decided using an email addr as a primary key was a good idea. Needless to say, when people wanted to change their addresses it really sucked.
Use natural keys whenever they work. Names usually don't work. They are too mutable.
If you are inventing your own data, you might as well invent a syntheic key. If you are building a database of data provided by other people or their software, analyze the source data to see how they identify things that need identification.
If they are managing data at all well, they will have natural keys that work for the important stuff. For the unimportant stuff, suit yourself.
well i think surrgote key is helpful when you don't have any uniquely identified key whose value is related and meaningful as is to be its primary key... moreover surrgote key is easier to implement and less overhead to maintain.
but on the other hand surrgote key is sometimes make extra cost by joining tables.
think about 'User' ... I have
UserId varchar(20), ID int, Name varchar(200)
as the table structure.
now consider that i want to take a track on many tables as who is inserting records... if i use Id as a primary key, then [1,2,3,4,5..] etc will be in foreign tables and whenever i need to know who is inserting data i've to join User Table with it because 1,2,3,4,5,6 is meaningless. but if i use UserId as a primary key which is uniquely identified then on other foreign tables [john, annie, nadia, linda123] etc will be saved which is sometimes easily distinguishable and meaningful . so i need not to join user table everytime when i do query.
but mind it, it takes some extra physical space as varchar is saved in foreign tables which takes extra bytes.. and ofcourse indexing has a significant performance issue where int performs better rather than varchar
Surrogate key is a substitution for the natural primary key.
It is just a unique identifier or number for each row that can be used for the primary key to the table.
The only requirement for a surrogate primary key is that it is unique for each row in the table.
It is useful because the natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.

Should I use Natural Identity Columns without GUID?

Is it ok to define tables primary key with 2 foreign key column that in combination must be unique? And thus not add a new id column (Guid or int)?
Is there a negative to this?
Yes, it's completely OK. Why not? The downside of composite primary keys is that it can be long and it might be harder to identify a single row uniquely from the application perspective. However, for a couple integer columns (specially in junction tables), it's a good practice.
Natural versus Artificial primary keys is one of those issues that gets widely debated and IMHO the discussion only seems to see the positions harden.
In my opinion both work so long as the developer knows how to avoid the downside of both. A natural primary key (whether composite or single column) more nearly ensures that duplicate rows are not added to the DB. Whereas with artificial primary keys it is necessary to first check the record is unique (as opposed to the primary key, which being artificial will always be unique). One effective way to achieve this is to have a unique or candidate index on the fields that make the record unique (e.g. the fields that make a candidate for a primary key)
Meanwhile an artificial primary key makes for easy joining. Relations can be made with a single field to single field join. With a composite key the writer of the SQL statement must know how many fields to include in the join.
For some definitions of "ok", yes. As long as you never intend to add additional fields to this intersection table, that'll be fine. However, if you intend to have more fields, it's good practice to have an ID field. It's still fine, come to think of it, but can be more awkward.
Unless, of course, disk space is at a serious premium!
If you look into any database textbook, you will find such tables en masse. This is the default way to define n-to-m relations. For example:
article = (id, title, text)
author = (id, name)
article_author = (article_id, author_id)
Semantically, article_author is not a new entity so you might refrain from defining it as the primary key and instead create it as a normal index with UNIQUE constraint.
Yes, I agree, with "some definition of OK" it is OK. But the moment you decide to reference this composite primary key from somewhere (i.e. move it to a foreign key), it quickly bocomes NG (Not Good).

What are the down sides of using a composite/compound primary key?

What are the down sides of using a composite/compound primary key?
Could cause more problems for normalisation (2NF, "Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF")
More unnecessary data duplication. If your composite key consists of 3 columns, you will need to create the same 3 columns in every table, where it is used as a foreign key.
Generally avoidable with the help of surrogate keys (read about their advantages and disadvantages)
I can imagine a good scenario for composite key -- in a table representing a N:N relation, like Students - Classes, and the key in the intermediate table will be (StudentID, ClassID). But if you need to store more information about each pair (like a history of all marks of a student in a class) then you'll probably introduce a surrogate key.
There's nothing wrong with having a compound key per se, but a primary key should ideally be as small as possible (in terms of number of bytes required). If the primary key is long then this will cause non-clustered indexes to be bloated.
Bear in mind that the order of the columns in the primary key is important. The first column should be as selective as possible i.e. as 'unique' as possible. Searches on the first column will be able to seek, but searches just on the second column will have to scan, unless there is also a non-clustered index on the second column.
I think this is a specialisation of the synthetic key debate (whether to use meaningful keys or an arbitrary synthetic primary key). I come down almost completely on the synthetic key side of this debate for a number of reasons. These are a few of the more pertinent ones:
You have to keep dependent child
tables on the end of a foriegn key
up to date. If you change the the
value of one of the primary key
fields (which can happen - see
below) you have to somehow change
all of the dependent tables where
their PK value includes these
fields. This is a bit tricky
because changing key values will
invalidate FK relationships with
child tables so you may (depending
on the constraint validation options
available on your platform) have to
resort to tricks like copying the
record to a new one and deleting the
old records.
On a deep schema the keys can get
quite wide - I've seen 8 columns
once.
Changes in primary key values can be
troublesome to identify in ETL
processes loading off the system.
The example I once had occasion to
see was an MIS application
extracting from an insurance
underwriting system. On some
occasions a policy entry would be
re-used by the customer, changing
the policy identifier. This was a
part of the primary key of the
table. When this happens the
warehouse load is not aware of what
the old value was so it cannot match
the new data to it. The developer
had to go searching through audit
logs to identify the changed value.
Most of the issues with non-synthetic primary keys revolve around issues when PK values of records change. The most useful applications of non-synthetic values are where a database schema is intended to be used, such as an M.I.S. application where report writers are using the tables directly. In this case short values with fixed domains such as currency codes or dates might reasonably be placed directly on the table for convenience.
I would recommend a generated primary key in those cases with a unique not null constraint on the natural composite key.
If you use the natural key as primary then you will most likely have to reference both values in foreign key references to make sure you are identifying the correct record.
Take the example of a table with two candidate keys: one simple (single-column) and one compound (multi-column). Your question in that context seems to be, "What disadvantage may I suffer if I choose to promote one key to be 'primary' and I choose the compound key?"
First, consider whether you actually need to promote a key at all: "the very existence of the PRIMARY KEY in SQL seems to be an historical accident of some kind. According to author Chris Date the earliest incarnations of SQL didn't have any key constraints and PRIMARY KEY was only later addded to the SQL standards. The designers of the standard obviously took the term from E.F.Codd who invented it, even though Codd's original notion had been abandoned by that time! (Codd originally proposed that foreign keys must only reference one key - the primary key - but that idea was forgotten and ignored because it was widely recognised as a pointless limitation)." [source: David Portas' Blog: Down with Primary Keys?
Second, what criteria would you apply to choose which key in a table should be 'primary'?
In SQL, the choice of key PRIMARY KEY is arbitrary and product specific. In ACE/Jet (a.k.a. MS Access) the two main and often competing factors is whether you want to use PRIMARY KEY to favour clustering on disk or whether you want the columns comprising the key to appears as bold in the 'Relationships' picture in the MS Access user interface; I'm in the minority by thinking that index strategy trumps pretty picture :) In SQL Server, you can specify the clustered index independently of the PRIMARY KEY and there seems to be no product-specific advantage afforded. The only remaining advantage seems to be the fact you can omit the columns of the PRIMARY KEY when creating a foreign key in SQL DDL, being a SQL-92 Standard behaviour and anyhow doesn't seem such a big deal to me (perhaps another one of the things they added to the Standard because it was a feature already widespread in SQL products?) So, it's not a case of looking for drawbacks, rather, you should be looking to see what advantage, if any, your SQL product gives the PRIMARY KEY. Put another way, the only drawback to choosing the wrong key is that you may be missing out on a given advantage.
Third, are you rather alluding to using an artificial/synthetic/surrogate key to implement in your physical model a candidate key from your logical model because you are concerned there will be performance penalties if you use the natural key in foreign keys and table joins? That's an entirely different question and largely depends on your 'religious' stance on the issue of natural keys in SQL.
Need more specificity.
Taken too far, it can overcomplicate Inserts (Every key MUST exist) and documentation and your joined reads could be suspect if incomplete.
Sometimes it can indicate a flawed data model (is a composite key REALLY what's described by the data?)
I don't believe there is a performance cost...it just can go really wrong really easily.
when you se it on a diagram are less readable
when you use it on a query join are less
readable
when you use it on a foregein key
you have to add a check constraint
about all the attribute have to be
null or not null (if only one is
null the key is not checked)
usualy need more storage when use it
as foreign key
some tool doesn't manage composite
key
The main downside of using a compound primary key, is that you will confuse the hell out of typical ORM code generators.