I know that a primary key must be unique, but is it okay for a primary key to be equal to a different column in the same table by coincidence?
For instance, I have 2 tables. One table is called person that holds information about a person (ID, email, telephone, address, name). The other table is staff (ID, pID(person ID), salary, position).
In staff the ID column is the primary key and is used to uniquely identify a staff member. The number is from 1 - 100. However, the pID (person ID) may be equal to the ID. For instance the staff ID may be 1 and the pID that it references to may be equal to 1.
Is that okay?
The job of the primary key is to uniquely and reliably identify each row - therefore, it must be unique and NOT NULL - anything else is irrelevant.
If you just happen to have a second column with the exact same values - I'd be wondering why that is the case - but that doesn't in any way affect the primary key negatively.
Primary key of a table must be unique and not null. There are no restrictions on uniquity between tables. It's 100% up to you.
Yes. There's no checking of relationships between different columns in a table.
The restriction you're worried about doesn't even make sense. Suppose you had a table for persons with columns ID, name, and year_of_birth. It wouldn't allow someone who was born in 1975 to have ID = 1975.
Related
This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL developer). I have amongst other tables a table called: Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
for both I have created columns, each having their own Primary Key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and Parentcompany
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.
A primary key is unique for each table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique
values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or
multiple fields. When multiple fields are used as a primary key, they
are called a composite key.
If a table has a primary key defined on any field(s), then you cannot
have two records having the same value of that field(s).
Primary key is table-unique. You can use same value of PI for every separate table in DB. Actually that often happens as PI often incremental number representing ID of a row: 1,2,3,4...
For your case more common implementation would be to have hierarchical table called Company, which would have fields: company_name and parent_company_name. In case company has a parent, in field parent_company_name it would have some value from field company_name.
There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.
Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then I will create, as I go along specific company tables like Manufacturer and parent company that reference a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!
I get the point that primary indices are unique to each record and hence retrieving a record gets faster using primary indexing. What happens when we use secondary indexing.
Of what I can think of,
ID Name School
1 John XYZ
2 Roger XYZ
3 Ray ABC
4 Matt KJL
5 Roger ABC
if we have secondary indexing on Name, then it will help me retrieve records relevant to names and not with id hence it would not restrict me to one record if I query a record for Roger and I would be able to get result pertaining to both Rogers. Hence if the table is extensively queried based on the secondary index, it should be used.
Am I right?
Apart from speeding up specific queries, perhaps the most common case for secondary indexes is to speed up checking of UNIQUE constraints. Consider e.g. a table
CREATE TABLE Person (
id int primary key,
fname text not null,
lname text not null,
date_of_birth date not null,
...
UNIQUE (fname, lname, date_of_birth)
)
Here we want to enforce the UNIQUE constraint to ensure the same person doesn't appear in the table multiple times under different ids. But at the same time we wouldn't want to make (fname, lname, date_of_birth) the primary key, because a person's name could potentially change, and because using 3 attributes as reference can be cumbersome.
Now, when inserting a new record into the table, the DBMS needs to check whether it already contains another tuple with the same (fname, lname, date_of_birth), and a secondary index on these attributes can help speed this check up.
Note that UNIQUE constraints automatically generate their indexes, so there is no need to create them explicitly.
Another common case where secondary indexes are required (and must be created explicitly) are foreign key constraints that target attributes that do not make up the primary key for the target table.
I got a Person table, each Person can visit several countries. The countries visited by each Person is stored in table CountryVisit
Person:
PersonId,
Name
CountryVisit:
CountryVisitId (primary key)
PersonId (foreign key to 'Person.PersonId')
CountryName
VisitDate
For the CountryVisit Table, my primary key is CountryVisitId which is an identity column. This design will result in that a Person can have only 1 CountryVisit but the CountryVisitId can be 40 for example. Is it a better practice to create another surrogate key column to act as an identity column while the CountryVisitId be a natural key that is unique for each PersonId ?
It is pretty good. I would suggest that you have a separate table for countries, with one row per country. Then the CountryVisits table would have:
CountryVisitId PrimaryKey,
PersonId ForeignKey,
CountryId ForeignKey,
VisitDate
This will ensure that the country name is always spelled correctly and consistently. If you want a list of countries to get started, check out this Wikipedia page. Also note that your definition of country may be different from the standard list of countries (there are actually several out there), so you should use your own auto-incremented primary key, rather than using the country code.
And, you should relax the requirement and remove the unique or primary key on PersonId, CountryId, unless you want to enforce only one visit per country.
Is it that a primary key is the selected candidate key chosen for a given table?
Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key.
Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.
More on this link with example
John Woo's answer is correct, as far as it goes. Here are a few additional points.
A primary key is always one of the candidate keys. Fairly often, it's the only candidate.
A table with no candidate keys does not represent a relation. If you're using the relational model to help you build a good database, then every table you design will have at least one candidate key.
The relational model would be complete without the concept of primary key. It wasn't in the original presentation of the relational model. As a practical matter, the use of foreign key references without a declared primary key leads to a mess. It could be a logically correct mess, but it's a mess nonetheless. Declaring a primary key lets the DBMS help you enforce the data rules. Most of the time, having the DBMS help you enforce the data rules is a good thing, and well worth the cost.
Some database designers and some users have some mental confusion about whether the primary key identifies a row (record) in a table or an instance of an entity in the subject matter that the table represents. In an ideal world, it's supposed to do both, and there should be a one-for-one correspondence between rows in an entity table and instances of the corresponding entity.
In the real world, things get screwed up. Somebody enters the same new employee twice, and the employee ends up with two ids. Somebody gets hired, but the data entry slips through the cracks in some manual process, and the employee doesn't get an id, until the omission is corrected. A database that does not collapse the first time things get screwed up is more robust than one that does.
Primary key -> Any column or set of columns that can uniquely identify a record in the table is a primary key. (There can be only one Primary key in the table)
Candidate key -> Any column or set of columns that are candidate to become primary key are Candidate key. (There can be one or more candidate key(s) in the table, if there is only one candidate key, it can be chosen as Primary key)
A Primary key is a special kind of index in that:
there can be only one;
it cannot be nullable
it must be unique.
Candidate keys are selected from the set of super keys, the only thing we take care while selecting the candidate key is: It should not have any redundant attribute.
Example of an Employee table:
Employee (
Employee ID,
FullName,
SSN,
DeptID
)
Candidate Key: are individual columns in a table that qualifies for the uniqueness of all the rows. Here in Employee table EmployeeID & SSN are Candidate keys.
Primary Key: are the columns you choose to maintain uniqueness in a table. Here in Employee table, you can choose either EmployeeID or SSN columns, EmployeeID is a preferable choice, as SSN is a secure value.
Alternate Key: Candidate column other the Primary column, like if EmployeeID is PK then SSN would be the Alternate key.
Super Key: If you add any other column/attribute to a Primary Key then it becomes a super key, like EmployeeID + FullName, is a Super Key.
Composite Key: If a table does not have a single column that qualifies for a Candidate key, then you have to select 2 or more columns to make a row unique. Like if there is no EmployeeID or SSN columns, then you can make FullName + DateOfBirth as Composite primary Key. But still, there can be a narrow chance of duplicate row.
There is no difference. A primary key is a candidate key. By convention one candidate key in a relation is usually chosen to be the "primary" one but the choice is essentially arbitrary and a matter of convenience for database users/designers/developers. It doesn't make a "primary" key fundamentally any different to any other candidate key.
A table can have so many column which can uniquely identify a row. This columns are referred as candidate keys, but primary key should be one of them because one primary key is enough for a table. So selection of primary key is important among so many candidate key. Thats the main difference.
Think of a table of vehicles with an integer Primary Key.
The registration number would be a candidate key.
In the real world registration numbers are subject change so it depends somewhat on the circumstances what might qualify as a candidate key.
Primary key -> Any column or set of columns that can uniquely identify a record in the table is a primary key. (There can be only one Primary key in the table) and
the candidate key-> the same as Primary key but the Primary Key chosen by DB administrator's prospective for example(the primary key the least candidate key in size)
A primary key is a column (or columns) in a table that uniquely identifies the rows in that table.
CUSTOMERS
CustomerNo FirstName LastName
1 Sally Thompson
2 Sally Henderson
3 Harry Henderson
4 Sandra Wellington
For example, in the table above, CustomerNo is the primary key.
The values placed in primary key columns must be unique for each row: no duplicates can be tolerated. In addition, nulls are not allowed in primary key columns.
So, having told you that it is possible to use one or more columns as a primary key, how do you decide which columns (and how many) to choose?
Well there are times when it is advisable or essential to use multiple columns. However, if you cannot see an immediate reason to use multiple columns, then use one. This isn't an absolute rule, it is simply advice. However, primary keys made up of single columns are generally easier to maintain and faster in operation. This means that if you query the database, you will usually get the answer back faster if the tables have single column primary keys.
Next question — which column should you pick? The easiest way to choose a column as a primary key (and a method that is reasonably commonly employed) is to get the database itself to automatically allocate a unique number to each row.
In a table of employees, clearly any column like FirstName is a poor choice since you cannot control employee's first names. Often there is only one choice for the primary key, as in the case above. However, if there is more than one, these can be described as 'candidate keys' — the name reflects that they are candidates for the responsible job of primary key.
If superkey is a big set than candidate key is some smaller set inside big set and primary key any one element(one at a time or for a table) in candidate key set.
First you have to know what is a determinant?
the determinant is an attribute that used to determine another attribute in the same table.
SO the determinant must be a candidate key. And you can have more than one determinant.
But primary key is used to determine the whole record and you can have only one primary key.
Both primary and candidate key can consist of one or more attributes
I have created a table called STUDENT which consists of Unique ID as the PRIMARY KEY along with other related attributes like Name, Addr, Gender etc....
Since I don't want to increase the table size of the STUDENT, I created a weak entity called ACADEMIC RECORDS which stores the previous Academic Records of the student.But in this table i have only created a PRIMARY KEY Unique ID which references the Unique ID of the student. and there is no other attribute in conjunction with Unique ID in the weak entity ACADEMIC RECORD Table.
As I came across the definition OF A WEAK ENTITY which define its primary key as a combination of weak entity's attribute and the owner's table's primary key(in this case STUDENT)
Is my method of creating a foreign key as the only primary key in the table ACADEMIC RECORD correct??
STUDENT Table
**UID** NAME GENDER ADDRESS PHNO.
ACADEMIC RECORD Table
**UID** HighschoolMarks GradSchoolMarks
There's nothing necessarily wrong with having a primary key that's also entirely a foreign key. It's a common way of implementing something like ‘base classes’, where an entity has a row in a base table, and may have a row in one or more extension tables (one to one-or-zero relationship).
I'm not sure it's the most appropriate schema for what you're doing though. If there really is an exactly one-to-one relationship between academic_records and students, it looks like they are part of the same entity to me.
In which case from a normalisation point of view the record columns should be part of the main students table. Maybe there's an efficiency reason to denormalise, but “I don't want to increase the table size” is not normally an adequate reason. Databases can cope with big tables.
I'm not completely clear on what you are asking, but it sounds like you are using the correct method.
Your STUDENT table requires a primary key to provide a unique reference for each row.
Your ACADEMIC_RECORD table also requires a primary key to provide a unique reference for each row.
Each student may have zero or more academic records, and you want to identify the student to which each academic record belongs. You do this by adding a column in the ACADEMIC_RECORD table which contains the id (the primary key) of the student:
STUDENT ACADEMIC_RECORD
id
id <-------> student_id
name high_school_marks
gender grade_school_marks
address
phone_no
Assume you have three students: Alice, Bob and Dave. Alice and Dave have three academic records, and Bob has two. Your tables would look something like this (I've omitted some columns to make things clearer):
STUDENT
id name
1 Alice
2 Bob
3 Dave
ACADEMIC_RECORD
id student_id
1 1
2 1
3 1
4 2
5 2
6 3
7 3
8 3