2NF (second normal form) difficult exercise

I have R(A,B,C,D) with AB as primary key and AD --> C.
I think it is in 2NF because you cannot determine C from a subset of AB.
From the wiki: "a table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table"
But many people say it is only in 1NF because of the definition
"in 2NF if it is in 1NF and every non-prime attribute of the table is dependent on the whole of every candidate key"
so AD is not the whole primary key, just a part of it combined with a non-prime attribute.
If you can, please also give some references other than Wikipedia so I can back up my claim, if it is really correct.

You state as a fact that AB is the primary key for the given relation R. For that to be true, there has to be at least one more functional dependency besides AD->C.
In order to explain 2NF, I will assume that the missing FD is, say, B->D. So we have a relation R(A,B,C,D) with the FDs:
AD->C
B->D
Then our primary key is AB. In simple words, 2NF deals with partial dependency, that is, a non-prime attribute depending on only part of the primary key. (So if the primary key is just one attribute, then the relation R is already in 2NF!)
Formally:
Given a functional dependency X->A of a relation R where:
X is a set of attributes of R
A is a non-prime attribute not in X
then to be in 2NF, X should not be a proper subset of any key.
Coming back to our example. The primary key is AB, so the prime attributes are A and B. The non-prime attributes are C and D.
Let's consider the first FD, AD->C.
Here C is a non-prime attribute. To not violate the 2NF condition, AD must not be a proper subset of the primary key AB. AD is not a proper subset of AB (D is not part of the key), so this FD does not violate the 2NF condition.
Let's see the next FD, B->D.
Here D is a non-prime attribute and B is a proper subset of the primary key AB, so this FD violates the 2NF condition.
Hence the relation R is not in second normal form.
On the other hand, if the set of FDs for R had been:
AD->C
AB->D
Our primary key is still AB but now the relation R is in second normal form.
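The reasoning above can be checked mechanically. The following is a minimal Python sketch, not part of the original answer: the helper names (closure, candidate_keys, partial_dependencies) are made up for illustration, and the FD sets are the two assumed in this answer. It computes attribute closures, finds the candidate keys by brute force, and lists every non-prime attribute that depends on a proper subset of a key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]

def partial_dependencies(attributes, fds):
    """List (key_part, attribute) pairs that violate 2NF."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*(set(key) for key in keys))
    non_prime = set(attributes) - prime
    violations = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                for attr in sorted(closure(part, fds) & non_prime):
                    violations.append(("".join(part), attr))
    return violations

# First FD set assumed in the answer: AD->C and B->D.
fds_with_partial = [("AD", "C"), ("B", "D")]
# Second FD set from the answer: AD->C and AB->D.
fds_without_partial = [("AD", "C"), ("AB", "D")]

print(candidate_keys("ABCD", fds_with_partial))
print(partial_dependencies("ABCD", fds_with_partial))
print(partial_dependencies("ABCD", fds_without_partial))
```

Under these assumptions the first FD set reports D depending on B alone, while the second reports no partial dependency, which matches the conclusion above.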

Related

From UML to SQL (PostgreSQL)

I am training for an upcoming exam and just finished this (simple) exercise.
I just wanted to be sure that I implemented everything correctly, especially the Composition with the multiplicities 1 and 0..*
My Answer:
CREATE TABLE exam.A(
  idA integer,
  b text NOT NULL,
  c float DEFAULT -1.0 CONSTRAINT negative_c CHECK (c < 0.0),
  PRIMARY KEY (idA));
CREATE TABLE exam.B(
  idB integer,
  c integer,
  PRIMARY KEY (idB));
CREATE TABLE exam.RelationAandB(
  idA integer NOT NULL,
  idB integer,
  b integer,
  c text,
  FOREIGN KEY (idA) REFERENCES exam.A(idA) ON DELETE CASCADE,
  FOREIGN KEY (idB) REFERENCES exam.B(idB),
  PRIMARY KEY (idA, idB));

Your SQL code is pretty good, but I see the following issues:
In UML class diagrams, attributes are mandatory by default. They would be optional only when qualified with the multiplicity expression [0..1]. Consequently, all attributes would need to be coded as NOT-NULL columns. Possibly, however, your instructor has not been aware of this or is using a non-standard reading for "UML data models".
The string-valued attribute A::b has a "{not empty}" property modifier, which reads as a constraint requiring non-empty strings. Notice that having a non-empty string value is not the same as being mandatory (NOT NULL) because the empty string "" satisfies the NOT NULL constraint.
You also need an ON DELETE CASCADE rule on idB in the RelationAandB table, because no matter whether an A or a B is deleted, the associated tuples in RelationAandB have to be deleted as well.
I think, for readability, it is preferable to add the PRIMARY KEY declaration to the column definition if the key is non-composite (has just one column). The same holds for single-column FOREIGN KEY declarations.
Many people think that a composition implies a deletion dependency, although this is not warranted by the UML semantics (see Aggregation versus Composition), and it is also not based on common sense (see my remarks below). In your SQL code, you did not implement such a dependency ("whenever an A is deleted, all dependent Bs have to be deleted as well"). That is correct according to the UML semantics of the class diagram, but such a dependency may have been the intention of your instructor, especially since he made it mandatory for a B component to have an A composite (by the multiplicity 1 at the composite side). Such a mandatory composite constraint implies that, when their composite is deleted, components either have to be deleted as well or have to be re-assigned to another (A) composite.
If your instructor's intention was that there should be a deletion dependency, then you should add a corresponding foreign key declaration in exam.B from idB to RelationAandB with an ON DELETE CASCADE rule (note that PostgreSQL only accepts this if idB is declared unique in RelationAandB): idB integer REFERENCES exam.RelationAandB (idB) ON DELETE CASCADE,
Concerning the question if a composition implies a lifecycle dependency between a composite and its components, we have to distinguish between three levels of abstraction: 1) the purely conceptual (philosophical) level, which should be the common sense of a data modeler, 2) the UML semantics, which is often not precisely defined, and 3) the level of (e.g., SQL) code. At the conceptual level, it should be clear that there are compositions with and without such a lifecycle dependency, so the very fact that there is a composition does not imply a lifecycle dependency.
Unfortunately, UML does not define any means of declaring that a composition has existentially dependent components. In my SO answer Aggregation versus Composition, I have proposed using a stereotype "inseparable" for such a composition.
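To see the effect of the cascade rule and of the "{not empty}" constraint discussed above, here is a small sketch that uses Python's built-in sqlite3 module instead of PostgreSQL, so it runs without a server. The table layout follows the question, but the exam schema prefix is dropped, and the NOT NULL and CHECK (b <> '') additions are my reading of the remarks above, not a confirmed solution to the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE A (
    idA INTEGER PRIMARY KEY,
    b   TEXT NOT NULL CHECK (b <> ''),              -- mandatory and not empty
    c   REAL NOT NULL DEFAULT -1.0 CHECK (c < 0.0)
);
CREATE TABLE B (
    idB INTEGER PRIMARY KEY,
    c   INTEGER NOT NULL
);
CREATE TABLE RelationAandB (
    idA INTEGER NOT NULL REFERENCES A(idA) ON DELETE CASCADE,
    idB INTEGER NOT NULL REFERENCES B(idB) ON DELETE CASCADE,
    b   INTEGER NOT NULL,
    c   TEXT NOT NULL,
    PRIMARY KEY (idA, idB)
);
""")

conn.execute("INSERT INTO A (idA, b) VALUES (1, 'x')")
conn.execute("INSERT INTO B (idB, c) VALUES (10, 5)")
conn.execute("INSERT INTO RelationAandB VALUES (1, 10, 7, 'link')")

# Deleting the B row removes the association row through the cascade on idB.
conn.execute("DELETE FROM B WHERE idB = 10")
remaining = conn.execute("SELECT COUNT(*) FROM RelationAandB").fetchone()[0]
print("association rows left:", remaining)

# An empty string passes NOT NULL but is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO A (idA, b) VALUES (2, '')")
    empty_rejected = False
except sqlite3.IntegrityError:
    empty_rejected = True
print("empty string rejected:", empty_rejected)
```

The same ON DELETE CASCADE and CHECK clauses are valid PostgreSQL, but the pragma line is SQLite-specific and the column types would be integer, text and float as in the question.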

In which normal form are these FDs?

I've been trying to figure out the difference between the 2nd and 3rd Normal Form using this example. The definitions didn't do the trick for me...
These are the functional dependencies:
A is the candidate key. (A --> A,B,C,D)
FDs:
A --> CD
AC --> D
CD --> B
D --> B
My idea: it is in 1st and 2nd Normal Form, because A, the candidate key, doesn't consist of two or more columns. But B is transitively dependent on A (via D), so it's not in 3rd Normal Form.
Is that correct? Especially the argument that A consists of fewer than two columns?

First, let us see what 2NF and 3NF are. From the context of the question it is clear that 1NF is understood, so I will refer to it. If it is unclear as well, let me know, I will clarify that as well.
2NF: R is in second normal form, if and only if it is in first normal form and no non-prime attribute is dependent on any proper subset of any candidate key of the relation.
Non-prime attributes are attributes which are not part of any candidate key. So, if a non-prime attribute is determined by a functional dependency whose left side is a proper subset of a candidate key, then the relation is not in 2NF.
For example, let's consider an invoices(number, year, age) table where (number, year) is a candidate key. age can be determined by the year alone, so the table is not in 2NF.
In your case, since the key consists of a single attribute, assuming the relation is in 1NF, we can say it is in 2NF as well. However, it is in 3NF if and only if it is in 2NF and no non-prime attribute is transitively dependent on a key.
In your case, A is the key, but since
A -> D -> B
B is transitively dependent on A, so your table is not in 3NF. To achieve 3NF, you will need to create another table, which will be in relation with this one via D and will hold B. Possible solution:
T1(A, C, D)
T2(D, B)
Note that AC -> D and A -> CD are redundant: A is the candidate key, and a candidate key determines every other attribute. If that's not the case, you will need to take a look at 1NF as well.
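The 3NF test and the proposed decomposition can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, the keys are passed in by hand, and the FDs listed for T1 and T2 are my own projection of the question's FDs onto each table.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations_3nf(attributes, fds, keys):
    """List (lhs, attribute) pairs where lhs is not a superkey and attribute is non-prime."""
    prime = set().union(*(set(key) for key in keys))
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(attributes)
        for attr in sorted(set(rhs) - set(lhs)):
            if not is_superkey and attr not in prime:
                bad.append((lhs, attr))
    return bad

# The relation from the question, with A as its only candidate key.
fds = [("A", "CD"), ("AC", "D"), ("CD", "B"), ("D", "B")]
original = violations_3nf("ABCD", fds, keys=["A"])
print(original)

# Proposed decomposition; the per-table FDs are projected by hand (assumption).
t1 = violations_3nf("ACD", [("A", "CD"), ("AC", "D")], keys=["A"])
t2 = violations_3nf("BD", [("D", "B")], keys=["D"])
print(t1, t2)
```

With these inputs the original relation reports CD -> B and D -> B as violations, and neither T1 nor T2 reports any.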

Clarification on foreign keys, SQL

If I make one of the attributes in my relation A a foreign key that references another relation B, is that attribute required to be the primary key of A (or part of the primary key of A)?
Also, my understanding is that in order to reference an attribute, the referenced attribute must be a key or unique. Am I then right in asserting that we couldn't reference part of a primary key (i.e. if the primary key had two attributes we would need to reference both of them or neither, since by itself neither attribute is guaranteed to be unique)?

A foreign key must reference a unique key of some sort, whether it is the primary key or not. You cannot reference just part of a composite unique key, unless that part is a unique key in its own right.
The referencing field(s) can be a unique key (making the relationship 1:0..1), but need not be one.
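A quick way to try this out is the following sketch with Python's built-in sqlite3 module; the table names are invented for the demonstration. One caveat that is specific to the engine: SQLite accepts the faulty declaration and only reports the mismatch when a row is written to the child table, whereas PostgreSQL rejects the CREATE TABLE statement itself because the referenced column has no unique constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parent (
    x INTEGER,
    y INTEGER,
    PRIMARY KEY (x, y)
);
-- References the whole composite key: allowed.
CREATE TABLE child_full (
    x INTEGER,
    y INTEGER,
    FOREIGN KEY (x, y) REFERENCES parent (x, y)
);
""")
conn.execute("INSERT INTO parent VALUES (1, 1)")
conn.execute("INSERT INTO child_full VALUES (1, 1)")

# References only x, which is neither a primary key nor unique on its own.
conn.execute("CREATE TABLE child_partial (x INTEGER REFERENCES parent (x))")
try:
    conn.execute("INSERT INTO child_partial VALUES (1)")
    partial_reference_accepted = True
    error_message = ""
except sqlite3.Error as exc:
    partial_reference_accepted = False
    error_message = str(exc)

print("partial reference accepted:", partial_reference_accepted)
print("error:", error_message)
```

Adding UNIQUE (x) to parent would make the partial reference legal, which is the "unique key in its own right" case mentioned above.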

Yes, your understanding is right. Let's say you were storing information about dogs being washed at a doggy parlor, and you had two tables (tbl_Dog, tbl_DogsWashed).
tbl_Dog has the columns (DogId,DogsName,Breed,OwnersIdentityNumber)
tbl_DogsWashed has the columns (DogsWashedId,DogsName)
If you linked the two tables using the dog's name, you would run into trouble as soon as two different dogs with the same name have had washes, because the name does not identify a single dog.
Rather, tbl_DogsWashed should have the columns (DogsWashedId,DogId): you would look up the DogId using DogsName, Breed, OwnersIdentityNumber etc. and populate tbl_DogsWashed with the primary key from tbl_Dog.

Functional dependencies keys and normal form

I am trying to understand functional dependencies
Let's say we have R with {A,B,C,D,E} and FDs A->B, BC->E and ED->A.
What are the keys and is R in 3NF or BCNF?

The keys here are ACD, BCD and ECD. Since each attribute of the relation R appears in at least one of the keys, all the attributes in your relation R are prime attributes.
Note that if a relation has only prime attributes, then it is already in 3NF.
Hence the given relation R is in 3NF.
To be in BCNF, for each non-trivial functional dependency X->Y, X should be a superkey. We see that the very first dependency (A->B) violates this, and hence the relation R is not in BCNF.

The keys are ACD, BCD and ECD.
The prime attributes are (A,B,C,D,E), because each of them appears in some candidate key.
Note that if a relation has only prime attributes, then it is already in 3NF.
Hence the given relation R is in 3NF.
To be in BCNF, for each non-trivial functional dependency X->Y, X should be a superkey. We see that the very first dependency (A->B) violates this, and hence the relation R is not in BCNF.

The candidate keys are ACD, BCD and ECD.
The prime attributes are (A,B,C,D,E), because each of them appears in some candidate key.
Now, first we check the relation for BCNF.
For BCNF, the left side of every FD must be a superkey, and as you can see, none of the FDs satisfies this condition.
For 3NF, each FD must satisfy one of two conditions:
1. Either the left side is a superkey,
2. or, if the first condition fails, the right side of the same FD consists of prime attributes.
If every FD satisfies one of these conditions, then the relation is in 3NF. Since all the attributes are prime attributes, the relation R is in 3NF but not in BCNF.
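The answers above can be reproduced with a short brute-force sketch in Python. The function names are hypothetical helpers, not a library API, and the candidate-key search is exponential in the number of attributes, which is fine for five of them.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]

def is_bcnf(attributes, fds):
    """Every non-trivial FD must have a superkey on its left side."""
    return all(
        set(rhs) <= set(lhs) or closure(lhs, fds) == set(attributes)
        for lhs, rhs in fds
    )

def is_3nf(attributes, fds):
    """Each FD needs a superkey on the left or only prime attributes on the right."""
    prime = set().union(*(set(key) for key in candidate_keys(attributes, fds)))
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue
        if not (set(rhs) - set(lhs)) <= prime:
            return False
    return True

# R(A,B,C,D,E) with A->B, BC->E and ED->A, as given in the question.
fds = [("A", "B"), ("BC", "E"), ("ED", "A")]
keys = candidate_keys("ABCDE", fds)
print(keys)
print("3NF:", is_3nf("ABCDE", fds))
print("BCNF:", is_bcnf("ABCDE", fds))
```

The key the answers call ECD is printed as CDE, because the sketch sorts the attributes of each key alphabetically.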

Superkey vs. Candidate key

What is the difference between a superkey and a candidate key in ERDB?

A superkey is a set of columns that uniquely identifies a row. A candidate key is a MINIMAL set of columns that uniquely identifies a row. So essentially a superkey is a candidate key, possibly with extra unnecessary columns in it.

candidate key is a minimal superkey

Candidate key = minimal key to identify a row
Super key = at least as wide as a candidate key
For me, a super key generally carries redundant columns compared with a candidate key

Let's keep it simple
SuperKey - A set of attributes that uniquely identifies a row. So if even a single attribute is unique, then every attribute set containing that attribute is a superkey.
Candidate Key - A superkey of which no proper subset can identify the rows uniquely. Or we can simply say that it is the minimal superkey.

In a nutshell: a CANDIDATE KEY is a minimal SUPER KEY.
A super key is a combination of columns (or attributes) that uniquely identifies any record (or tuple) in a relation (table) in an RDBMS.
For instance, consider the following dependencies in a table having columns A, B, C and D
(this table is just a quick example, so it does not cover all dependencies that R could have).
Attribute set (determinant) ---can identify---> (dependent)
A ---> AD
B ---> ABCD
C ---> CD
AC ---> ACD
AB ---> ABCD
ABC ---> ABCD
BCD ---> ABCD
Now, B, AB, ABC and BCD each identify all columns, so those four qualify as super keys.
But B⊂AB, B⊂ABC and B⊂BCD, hence AB, ABC and BCD are disqualified as CANDIDATE KEYS: a proper subset of each already identifies every row, so they aren't minimal. Hence only B is a candidate key, not the others.
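The distinction can also be seen by enumerating both kinds of key. The sketch below is in Python and assumes the FDs A->D, B->ACD and C->D, which is my reading of the table in this answer; the function names are made up for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def superkeys(attributes, fds):
    """Every attribute set whose closure is the whole relation."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if closure(combo, fds) == set(attributes):
                found.append("".join(combo))
    return found

def candidate_keys(attributes, fds):
    """Superkeys that have no smaller superkey as a proper subset."""
    all_superkeys = [set(key) for key in superkeys(attributes, fds)]
    return [
        "".join(sorted(key))
        for key in all_superkeys
        if not any(other < key for other in all_superkeys)
    ]

# FDs read off the example table (assumption): A->D, B->ACD, C->D.
fds = [("A", "D"), ("B", "ACD"), ("C", "D")]
all_sk = superkeys("ABCD", fds)
all_ck = candidate_keys("ABCD", fds)
print(all_sk)
print(all_ck)
```

Besides the four superkeys listed in the answer, the enumeration also finds BC, BD, ABD and ABCD, since every attribute set containing B is a superkey; B remains the only candidate key.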
