Regarding Candidate Keys and Superkeys - sql

I have a quick question regarding candidate keys and superkeys. Say you have two keys (a, b), where 'a' is a primary key and 'b' is a candidate key. Would the combination of these two keys be a superkey, i.e. would (a,b) be a superkey? Or would it be a candidate key? My assumption is that it would be a superkey, because the definition of a candidate key states that it is an irreducible superkey, and the combination of the two fields a and b could be reduced to either a or b. Is this logic correct? Or am I missing something here? Thanks!

Would the combination of these two keys be a superkey, i.e. would (a,b) be a superkey?
Yes, it would still uniquely identify rows.
Or would it be a candidate key?
No, it would no longer be minimal.
My assumption is that it would be a superkey, because the definition of a candidate key states that it is an irreducible superkey, and the combination of the two fields a and b could be reduced to either a or b. Is this logic correct?
Almost. Yes, it would be a superkey, but not because it can be reduced: it is a superkey because it is unique. Being reducible is only what stops it from being a candidate key.
Every candidate key is a superkey, but not every superkey is a candidate key. So {a} is both a candidate key and a superkey, {b} is both a candidate key and a superkey, and {a, b} is just a superkey.
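To make the distinction concrete, here is a minimal Python sketch that tests uniqueness and minimality against sample rows. The table contents and the helper names `is_superkey` and `is_candidate_key` are assumptions made up for illustration. Note that a real key is a constraint on every valid state of the table, so a check on sample data can refute a key but cannot prove one.

```python
from itertools import combinations

# Hypothetical sample rows: a and b are each unique on their own, c is not.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 2, "b": "y", "c": 10},
    {"a": 3, "b": "z", "c": 20},
]

def is_superkey(rows, cols):
    """True if no two rows agree on all of the given columns."""
    seen = {tuple(row[c] for c in cols) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, cols):
    """True if cols is a superkey and no proper non-empty subset of it is one."""
    if not is_superkey(rows, cols):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(cols))
        for subset in combinations(cols, size)
    )

print(is_superkey(rows, ("a", "b")))       # True
print(is_candidate_key(rows, ("a", "b")))  # False
print(is_candidate_key(rows, ("a",)))      # True
```

Here (a, b) passes the uniqueness test but fails the minimality test, because dropping either column still leaves a unique set.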

Related

Candidate key or Super key

Consider a relational table with different columns, what would you call the collection of unique and not null values, super key or candidate key?
A super key is a set of one or more columns that uniquely identifies rows in a table.
Candidate keys are selected from the set of super keys. The only thing we take care of while selecting a candidate key is that it should not have any redundant attribute. That's the reason they are also termed minimal super keys.
In an Employee table there are three columns: Emp_Code, Emp_Number, Emp_Name
Super keys:
All of the following sets are able to uniquely identify rows of the employee table.
{Emp_Code}
{Emp_Number}
{Emp_Code, Emp_Number}
{Emp_Code, Emp_Name}
{Emp_Code, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
Candidate Keys:
As I stated above, they are the minimal super keys with no redundant attributes.
{Emp_Code}
{Emp_Number}
As a Summary:
A superkey is a set of columns that uniquely identifies a row, whereas a candidate key is a MINIMAL set of columns that uniquely identifies a row. So essentially a superkey is a candidate key, possibly with extra unnecessary columns in it.
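The Employee example can be reproduced with a short Python sketch that enumerates every column subset. The sample rows are invented for illustration (codes and numbers unique, names repeating), and the helper `unique_on` is a hypothetical name. Uniqueness in a sample only approximates a real key constraint.

```python
from itertools import combinations

# Hypothetical sample data: codes and numbers are unique, names are not.
columns = ("Emp_Code", "Emp_Number", "Emp_Name")
employees = [
    ("E01", 1001, "Ann"),
    ("E02", 1002, "Bob"),
    ("E03", 1003, "Ann"),
]

def unique_on(rows, idxs):
    """True if projecting rows onto the given column positions yields no duplicates."""
    projected = [tuple(row[i] for i in idxs) for row in rows]
    return len(set(projected)) == len(projected)

# Every non-empty column subset that is unique in the sample.
superkeys = [
    frozenset(columns[i] for i in idxs)
    for size in range(1, len(columns) + 1)
    for idxs in combinations(range(len(columns)), size)
    if unique_on(employees, idxs)
]

# A candidate key is a superkey with no smaller superkey inside it.
candidate_keys = [k for k in superkeys if not any(other < k for other in superkeys)]

for key in superkeys:
    label = "candidate key" if key in candidate_keys else "superkey only"
    print(sorted(key), label)
```

With this sample the script reports six superkeys, of which only {Emp_Code} and {Emp_Number} are candidate keys.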

Superkeys of this relation

I am trying to find the superkeys of this relation, but I am having trouble working out how many superkeys there are and exactly what they are. I figured out that the candidate keys are {A},{B},{C},{D}.
Here is the relation:
R(A,B,C,D)
Functional Dependencies:
A->B
B->C
C->D
D->A
Candidate keys: {A},{B},{C},{D} (from what I figured out)
Can someone please help me find the superkeys, and how exactly to find them?
Let me keep it simple:
Here is a definition for super key and candidate key:
Super Key
Super key stands for superset of a key.
A Super Key is a set of one or more attributes that are taken collectively and can identify all other attributes uniquely.
Candidate Keys
Candidate Keys are super keys for which no proper subset is a super key. In other words candidate keys are minimal super keys.
Thus, any set of attributes that contains a candidate key is a super key.
In this example, every single attribute is a candidate key, so every non-empty subset of {A,B,C,D} is a super key. That gives 2^4 - 1 = 15 super keys in total.
Hope this helps!
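As a way to find the superkeys mechanically, the usual approach is to compute the attribute closure of each subset under the functional dependencies. The following Python sketch does that for the relation in the question. The function name `closure` and the brute-force enumeration are illustrative choices that work for small relations, not something taken from the thread.

```python
from itertools import combinations

attributes = "ABCD"
# Functional dependencies from the question, as (left side, right side) pairs.
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# A subset is a superkey when its closure is the whole relation.
superkeys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if closure(combo, fds) == set(attributes)
]
# Candidate keys are the superkeys with no smaller superkey inside them.
candidate_keys = [k for k in superkeys if not any(other < k for other in superkeys)]

print(len(superkeys))                                      # 15
print(sorted("".join(sorted(k)) for k in candidate_keys))  # ['A', 'B', 'C', 'D']
```

Because the dependencies form a cycle, every attribute determines all the others, which is why each non-empty subset qualifies.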

Which is better, have a primary key composed of an integer and a foreign key or have a primary key autoincrement and a foreign key?

I have a problem. The database admin has the following structure:
As you can see, the primary key of the table TCModulo is a composite key of ID_modulo and ID_sistema, which is a foreign key to the table TCSistemas.
I think it would be better for the field ID_modulo of the table TCModulo to be the primary key with an auto_increment constraint, and for the field ID_sistema to be only a foreign key.
Which one is better?
Whether the PK of TCModulo is (ID_modulo) or (ID_modulo,ID_sistema) depends on what goes in the table. We cannot answer your question unless you tell us. Presumably an ID_modulo value in a row is how you refer to some modulo. You have to tell us how to do that. But after that (for every column) (and given what situations can arise) there is no choice left about which sets of columns are candidates for primary key.
A set of columns whose subrow values are unique in a table is called a superkey. Any subrow containing a unique subrow is unique. So any set of columns containing a superkey is a superkey. A subrow that contains no (smaller) unique subrow is called a candidate key. So a superkey that contains no (smaller) superkey is a candidate key. One of the candidate keys of a table is chosen as primary key.
If ID_modulo uniquely determined a module over the whole application, then (ID_modulo) would be unique with no smaller unique subrow inside so it would be a candidate key. It would be the only one so it would be the primary key.
If ID_modulo uniquely determined a module only per sistema, then (ID_modulo,ID_sistema) would be unique with no smaller unique subrow (assuming there can be more than one sistema) so it would be a candidate key. It would be the only one so it would be the primary key.
So what candidate keys are available to be chosen as primary key is up to how your application refers to modulos. After that there is no choice about candidate keys. In each of these two cases there's only one candidate key so there's no choice about primary key either.
As to whether you should have a unique id overall or only within sistema or both or anything else, that depends on other ergonomic issues. E.g. you are uniquely kentverger in stackoverflow (now; user names aren't necessarily unique), but perhaps uniquely Kent at home. E.g. you probably prefer to call today something like the 4th of July, rather than day 185. But note that any candidate key serves as a unique identifier. So if ID_modulo is unique only within sistema, still (ID_modulo,ID_sistema) is unique overall.
Note that this has nothing to do with modulos being many-to-one with sistemas per se. It has to do with columns forming unique subrows.
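The per-sistema case can be sketched with Python's built-in `sqlite3` module and an in-memory database. The table and column names come from the question, but the column types, the sample values, and the reduced column list are assumptions, since the original schema is not shown in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE TCSistemas (
        ID_sistema INTEGER PRIMARY KEY
    );
    -- Case where ID_modulo is unique only within a sistema:
    -- the composite (ID_modulo, ID_sistema) is the candidate key.
    CREATE TABLE TCModulo (
        ID_modulo  INTEGER NOT NULL,
        ID_sistema INTEGER NOT NULL REFERENCES TCSistemas (ID_sistema),
        PRIMARY KEY (ID_modulo, ID_sistema)
    );
""")
conn.executemany("INSERT INTO TCSistemas VALUES (?)", [(1,), (2,)])

# The same ID_modulo may appear under two different sistemas.
conn.executemany("INSERT INTO TCModulo VALUES (?, ?)", [(1, 1), (1, 2)])

# Repeating a full (ID_modulo, ID_sistema) pair violates the primary key.
try:
    conn.execute("INSERT INTO TCModulo VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```

If ID_modulo were instead unique across the whole application, the table would declare `PRIMARY KEY (ID_modulo)` alone and the second insert of ID_modulo 1 would be rejected.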
I always prefer to use an identity (auto-increment) for the primary key, as it keeps the pages clustered better and avoids fragmentation on the disk. You need a foreign key ID_sistema anyway, so add that too.

Difference between Primary key and Candidate key

I have read about Keys in RDBMS.
https://stackoverflow.com/a/6951124/1647112
I however couldn't understand the need to use a candidate key. If a primary key is all that is needed to uniquely identify a row in a table, why is candidate key required?
Please give a good example as to state the differences and importance of various keys.
Thanks in advance.
A table can have one or more candidate keys - these are keys that uniquely identify a row in the table.
However, only one of these candidate keys can be chosen to be the primary key.
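As an example of a table with more than one candidate key, the sketch below uses Python's built-in `sqlite3` module. The `employee` table, its columns, and the sample values are hypothetical. One candidate key is declared as the primary key and the other is still enforced through `NOT NULL UNIQUE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with two candidate keys: emp_id and email.
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,      -- candidate key chosen as primary key
        email  TEXT NOT NULL UNIQUE,     -- the other candidate key
        name   TEXT NOT NULL             -- not a key: names may repeat
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'ann@example.com', 'Ann')")
conn.execute("INSERT INTO employee VALUES (2, 'ann2@example.com', 'Ann')")

# A repeated email is rejected even though emp_id and name differ.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'ann@example.com', 'Bob')")
    email_enforced = False
except sqlite3.IntegrityError:
    email_enforced = True

print(email_enforced)  # True
```

The candidate key that was not chosen as primary key still matters: declaring it keeps duplicate values out of the table.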
From the above answer, I came to this conclusion:
Super key(one or more attributes used for selecting one or more rows)
||
\/
Candidate key(one or more attributes from super used for selecting a single row)
||
\/
Primary key(one attribute among candidate keys used for selecting a single row)
Am I correct?

Superkey vs. Candidate key

What is the difference between a super key and a candidate key in ERDB?
A superkey is a set of columns that uniquely identifies a row. A candidate key is a MINIMAL set of columns that uniquely identifies a row. So essentially a superkey is a candidate key, possibly with extra unnecessary columns in it.
candidate key is a minimal superkey
Candidate key = minimal key to identify a row
Super key = at least as wide as a candidate key
For me, a super key would generally introduce redundant columns compared with a candidate key
Let's keep it simple
SuperKey - A set of attributes that uniquely identifies a row. So if even a single attribute is unique, every set of attributes that contains it is a superkey.
Candidate Key - A superkey with no proper subset that can still identify the rows uniquely. Or we can simply say that it is a minimal superkey.
In a nutshell: a CANDIDATE KEY is a minimal SUPER KEY.
A super key is a combination of columns (or attributes) that uniquely identifies any record (or tuple) in a relation (table) in an RDBMS.
For instance, consider the following dependencies in a table having columns A, B, C and D
(Giving this table just for a quick example so not covering all dependencies that R could have).
Attribute set (determinant) ---can identify---> (dependent)
A   -----> AD
B   -----> ABCD
C   -----> CD
AC  -----> ACD
AB  -----> ABCD
ABC -----> ABCD
BCD -----> ABCD
Now B, AB, ABC, and BCD each identify all columns, so those four qualify as super keys.
But B⊂AB, B⊂ABC, and B⊂BCD, so AB, ABC, and BCD are disqualified as candidate keys: a proper subset of each already identifies every column, so they aren't minimal. Hence only B is a candidate key.
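The same filtering can be written as a short Python sketch. The dictionary below simply copies the closure table from this example, and the variable names are illustrative. As in the example, only the listed determinants are considered, so other sets containing B are not enumerated.

```python
# Closures copied from the example table (determinant -> attributes it identifies).
closures = {
    "A": "AD",
    "B": "ABCD",
    "C": "CD",
    "AC": "ACD",
    "AB": "ABCD",
    "ABC": "ABCD",
    "BCD": "ABCD",
}
all_columns = set("ABCD")

# Super keys: determinants whose closure covers every column.
superkeys = [set(det) for det, dep in closures.items() if set(dep) == all_columns]

# Candidate keys: super keys with no other listed super key as a proper subset.
candidate_keys = [k for k in superkeys if not any(other < k for other in superkeys)]

print(sorted("".join(sorted(k)) for k in superkeys))       # ['AB', 'ABC', 'B', 'BCD']
print(sorted("".join(sorted(k)) for k in candidate_keys))  # ['B']
```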