Multi-Column Primary Key or Unique Constraint? - sql

I have a Country table which includes the columns ID, Name and Code. All three of these columns should contain unique values and cannot be NULL.
My question is, would I be better to create a primary key on the ID column and a unique constraint for the Name and Code columns (in one index, I guess), or would it be better to just include the Name and Code columns in the primary key with the ID? And why? Are there potential downsides or complexities that will result from having a multiple-column primary key?

First of all - yes, a compound primary key made up of three columns makes it more annoying to join to this table - any other table wanting to join to the Country table will also have to have all three columns to establish the join.
And more importantly - it's NOT the same restriction!
If you have the PK on ID and a unique constraint on both Code and Name separately, then this is NOT valid:
ID Code Name
--------------------------
41 CH Switzerland
341 CH Liechtenstein
555 LIE Liechtenstein
because the Code of CH and the Name of Liechtenstein both appear twice.
But if you have a single PK on all three columns together - then this is valid because each row has a different tuple of data - (41, CH, Switzerland) is not the same as (341, CH, Liechtenstein) and therefore, those two rows are admissible.
If you put the PK on all three columns at once, then the uniqueness only extends to the whole tuple (all three columns) - each column separately can have "duplicates".
So it really boils down to
what you really need (which uniqueness)
how easy you want to make it to join to this table

Related

Foreign and Primary Key Conceptual Questions

I am a newbie at SQL/PostgreSQL, and I had a conceptual question about foreign keys and keys in general:
Let's say I have two tables: Table A and Table B.
A has a bunch of columns, two of which are A.id, A.seq. The primary key is btree(A.id, A.seq) and it has a foreign key constraint that A.id references B.id. Note that A.seq is a sequential column (first row has value 1, second has 2, etc).
Now say B has a bunch of columns, one of which is the above mentioned B.id. The primary key is btree(B.id).
I have the following questions:
What exactly does btree do? What is the significance of having two column names in the btree rather than just one (as in btree(B.id)).
Why is it important that A references B instead of B referencing A? Why does order matter when it comes to foreign keys??
Thanks so much! Please correct me if I used incorrect terminology for anything.
EDIT: I am using postgres
A btree index stored values in sorted order, which means you can not only search for a single primary key value, but you can also efficiently search for a range of values:
SELECT ... WHERE id between 6060842 AND 8675309
PostgreSQL also supports other index types, but only btree is supported for a unique index (for example, the primary key).
In your B table, the primary key being a single column id means that only one row can exist for each value in id. In other words, it is unique, and if you search for a value by primary key, it will find at most one row (it may also find zero rows if you don't have a row with that value).
The primary key in your A table is for (id, seq). This means you can have multiple rows for each value of id. You can also have multiple rows for each value of seq as long as they are for different id values. The combination must be unique though. You can't have more than one row with the same pair of values.
When a foreign key in A references B, it means that the row must exist in B before you are allowed to store the row in A with the same id value. But the reverse is not necessary.
Example:
Suppose B is for users and A is for phones. You must store a user before you can store a phone record for that user. You can store one or more phones for that user, one per row in the A table. We say that a row in A therefore references the user row in B, meaning, "this phone belongs to user #1234."
But the reverse is not restricted. A user might have no phones (at least not known to this database), so there is no requirement for B to reference A. In other words, it is not required for a user to have a phone. You can store a row in B (the user) even if there is no row in A (the phones) for that user.
The reference also means you are not allowed to DELETE FROM B WHERE id = ? if there are rows in another table that reference that given row in B. Deleting that user would cause those other rows to become orphaned. No one would be able to know who those phones belonged to, if the user row they reference is deleted.
To your questions:
There are several strategies to implement unique keys. The most common one is to use an index that is "unique" using a "b-tree" strategy. That's what "btree" means in PostgreSQL.
Having two columns in a key just depends on how you want to design your table. When you have a key with more than one column that is called a "composite key".
When A references B, the columns in B must represent a "key". The columns in A do not represent a key, but just a reference to one. In fact the values in A for that column can be repeated; that is, multiple rows in A can point to the same row in B.
Your data structure makes no sense. Why would the primary key of A haver both id and name? Normally it would just be id. In some data models, you might have a version or timestamp added. I can't think of a reasonable data model where name would also be included.
In addition, B's foreign key would have to be to both id and name.
But, your question is what is btree for? Most databases don't have such an option. A primary key would typically be expressed as:
id int primary key;
constraint unq_t_id primary key (id);
btree is a type of index -- in fact the default type of index in all databases that I'm aware of. Databases that have a plethora of available types of indexes -- such as Postgres -- you can specify the index type associated with the primary key.

Optimized structure for normalization of multi value attribute

Suppose I have a table in which I have a column which is multi-valued, these values are primary key of another table, now I could normalize this by making it single-valued column and repeating the same row for each value that occurs to be in the column, but I guess this is so redundant, note that this column can have up to 100 values(which are foreign key derived from another table).
Now if I make it single-valued column then obviously my rows will be multiplied by the number of keys I take from the foreign table, what is the best practice to normalize this kind of scenario in the most optimized way?
Table-1(
pk,
column-1,
column-2)
Table-2(
pk,
column-1,
column-2,
FK(Table-1))
Here my FK can relate to more than one item of table-1

One Primary Key Value in many tables

This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL developer). I have amongst other tables a table called: Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
for both I have created columns, each having their own Primary Key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and Parentcompany
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.
A primary key is unique for each table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique
values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or
multiple fields. When multiple fields are used as a primary key, they
are called a composite key.
If a table has a primary key defined on any field(s), then you cannot
have two records having the same value of that field(s).
Primary key is table-unique. You can use same value of PI for every separate table in DB. Actually that often happens as PI often incremental number representing ID of a row: 1,2,3,4...
For your case more common implementation would be to have hierarchical table called Company, which would have fields: company_name and parent_company_name. In case company has a parent, in field parent_company_name it would have some value from field company_name.
There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.
Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then I will create, as I go along specific company tables like Manufacturer and parent company that reference a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!

How to reference a composite primary key into a single field?

I got this composite primary key in Table 1:
Table 1: Applicant
CreationDate PK
FamilyId PK
MemberId PK
I need to create a foreign key in Table 2 to reference this composite key. But i do not want to create three fields in Table 2 but to concatenate them in a single field.
Table 2: Sales
SalesId int,
ApplicantId -- This should be "CreationDate-FamilyId-MemberId"
What are the possible ways to achieve this ?
Note: I know i can create another field in Table 1 with the three columns concatenation but then i will have redundant info
What you're asking for is tantamount to saying "I want to treat three pieces of information as one piece of information without explicitly making it one piece of information". Which is to say that it's not possible.
That said, there are ways to make happen what you want to happen
Create a surrogate key (i.e. identity column) and use that as the FK reference
Create a computed column that is the concatenation of the three columns and use that as the FK reference
All else being equal (ease of implementation, politics, etc), I'd prefer the first. What you have is really a natural key and doesn't make a good PK if it's going to be referenced externally. Which isn't to say that you can't enforce uniqueness with a unique key; you can and should.

Table design, composite key

I have a table with some data summary which consist of client_id, location_id, category_id and summary columns. Values of the three id's columns are not unique.
At the moment I have created a composite key from client_id, location_id, category_id using primary keys. Those three columns will uniquely identify rows.
My question is, if I still should include unique primary key for that table for example column with auto-increment id ?
That depends completely on your uses of the table. If you don't want to refer to a given row in a query (for example, having a dependent table), the separate PK is unnecessary (eg. if you always ask for statistics for a given client and a given location and a given category). However, if you do have dependent tables, you probably want a separate PK as well.
If your composite key is the primary clustered index then I would say it's not necessary.