Hi this is my first time posting so sorry if formatting is bad.
I have a question regarding normalization. I came across a problem in homework and am having trouble finding any information on it. This is not specific to any database language, but we are working with SQL.
I am given a table like T(A, B, C, D, E, F, G)
and the following dependencies: A, B -> C; A -> D, E; D -> E
We are then asked to give the primary keys, followed by a 2NF and a 3NF decomposition.
This was my original answer:
for 2NF: (A, B, C); (A, D, E)
and for 3NF: (A, B, C); (A, D, E); (D, E), with the primary key being A and B.
I then noticed the table contains two attributes, F and G, that appear in no dependency.
Now this is where I am having trouble: would these be considered candidate keys, or are they dropped in 2NF or 3NF as possible redundancy? Since we aren't given actual data, it is hard for me to understand why a table would have two attributes that have no relation to a primary key.
Thanks
I've been trying to figure out the difference between the 2nd and 3rd Normal Form using this example. The definitions didn't do the trick for me...
These are the functional dependencies:
A is the candidate key. (A --> A,B,C,D)
FDs:
A --> CD
AC --> D
CD --> B
D --> B
My idea: it's in 1st and 2nd normal form, because A, the candidate key, doesn't consist of two or more columns. But B is transitively dependent on A via D, so it's not in 3rd normal form.
Is that correct? Especially the argument that A consists of fewer than two columns?
First, let us see what 2NF and 3NF are. From the context of the question it is clear that 1NF is understood, so I will refer to it. If it is unclear, let me know and I will clarify that as well.
2NF: R is in second normal form, if and only if it is in first normal form and no non-prime attribute is dependent on any proper subset of any candidate key of the relation.
Non-prime attributes are attributes which are not part of any candidate key. So, if a non-prime attribute is determined by a proper subset of a candidate key, then the relation is not in 2NF.
For example, let's consider an invoices(number, year, age) table where (number, year) is a candidate key. age can be determined by the year alone, so the table is not in 2NF.
In your case, since the key is a single attribute, it has no non-empty proper subset, so assuming the table is in 1NF, we can say it is in 2NF as well. However, it is in 3NF if and only if it is in 2NF and no non-prime attribute is transitively dependent on a key.
In your case, A is the key, but since
A -> D -> B
B is transitively dependent on A, so your table is not in 3NF. To achieve 3NF, you will need to create another table, which is related to this one via D and holds B. Possible solution:
T1(A, C, D)
T2(D, B)
Note that AC -> D and A -> CD add nothing beyond the key: since A is the candidate key, it determines every other attribute anyway. If that's not the case, you will need to take a look at 1NF as well.
(Keys: A in T1, D in T2.)
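To double-check the reasoning above, here is a minimal sketch of the usual attribute-closure computation, applied to the FDs from the question. The function name `closure` and the string encoding of attribute sets are my own choices for illustration, not anything from the question.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole determinant is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FDs from the question, written as (determinant, dependent) pairs.
FDS = [("A", "CD"), ("AC", "D"), ("CD", "B"), ("D", "B")]

a_closure = closure("A", FDS)  # A reaches every attribute, so A is a key
d_closure = closure("D", FDS)  # D reaches B without being a key

print(sorted(a_closure))
print(sorted(d_closure))
```

Since D determines B but does not determine A, D is a non-key determinant of a non-prime attribute, which is exactly the A -> D -> B chain that breaks 3NF.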
In one of my lectures, we took the following schema
R(a, b, c, d, e)
a -> b
e -> b
c, d -> a
c, d -> b
c, d -> e
and took it to 2NF as follows:
R1(c, d, a, e)
c, d -> a, e
R2(a, e, b) (Not in 2NF)
a -> b
e -> b
Naturally, if I want to take my schema to 3NF this causes a problem, since b cannot be partially determined by a and e. What I want to do is simply create separate relations as follows:
R3(e, b)
e -> b
and
R4(a, b)
a -> b
In this instance b is fully functionally dependent on the primary key, which brings me to 2NF, and the transitive dependencies are eliminated for relations 3 and 4, which are in 3NF. However, I think it could be argued that this solution is not satisfactory, as the value of b could potentially be different in each relation and there could be anomalies when it is inevitably used as a foreign key. Any thoughts on this?
We seek decompositions "preserving" FDs and (this is usually not stated explicitly) not introducing other constraints. An FD is preserved when it holds in some component. The idea is that we can check that an FD holds in recompositions by just checking that it holds in the components, rather than having to join and then check. We also prefer an FD and its attributes to be in just one component, or we would need to add a constraint that where the determinant values agree the dependent values agree. There's always a 3NF schema preserving all FDs without introducing other constraints. When an FD cannot be preserved to get to BCNF, there is instead an "equality dependency" introduced that two components must have the same projection on the FD attributes.
We don't normalize to a given NF by moving through lower NFs. That can preclude good higher NF designs arising. We use an algorithm for a given NF.
When some FDs (functional dependencies) hold, others do, per Armstrong's axioms. We must look among all FDs for NF violators and FDs to preserve, not just some given ones that form a cover. Algorithms also take that into account.
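As a rough illustration of that last point, the sketch below computes attribute closures for the lecture schema from the question. It finds the candidate keys by brute force and shows that one of the given FDs already follows from the others. The helper names `closure` and `candidate_keys` are hypothetical, and the brute-force search is only sensible for tiny schemas like this one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # Skip supersets of keys found at smaller sizes: not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


# FDs from the question: a -> b, e -> b, cd -> a, cd -> b, cd -> e.
FDS = [("a", "b"), ("e", "b"), ("cd", "a"), ("cd", "b"), ("cd", "e")]

keys = candidate_keys("abcde", FDS)

# Drop the given FD cd -> b and check whether it still follows from the rest.
without_cd_b = [fd for fd in FDS if fd != ("cd", "b")]
still_implied = "b" in closure("cd", without_cd_b)

print(keys, still_implied)
```

Here cd -> b is redundant because cd -> a and a -> b already imply it, which is why an algorithm should work from closures or a minimal cover rather than from the FD list as written.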
See this recent answer.
PS PKs (primary keys) don't matter, CKs (candidate keys) do. There can be more than one and they can be composite. A PK is just some CK you decided to call PK. So highlighting attributes of a PK is in general inadequate. Just list the CKs.
PPS An (update) anomaly is a certain thing, and it's not what you are using "anomaly" for.
I've been given the relation R(A, B, C) and the functional dependencies AB → C and C → B.
And am looking to justify what form it is in, and then to transform it into BCNF.
Now I proposed that it was in 3NF, as the second FD is a transitive dependency with a key attribute as its RHS. This second FD also violates BCNF, because C is not a superkey for R.
However - I am unsure how to go about decomposing into BCNF.
If I decompose into;
This voids the first FD, and effectively makes (A,C) the new key - so it doesn't seem correct! Can this relation be converted to BCNF?
Can this relation be converted to BCNF?
Every relation can be converted to BCNF by applying the “analysis algorithm”, which can be found in any good book on databases.
Note that the relation has two keys, AB and AC, so all attributes are prime (and for this reason the relation is automatically in 3NF).
You must start by finding all the dependencies that violate BCNF, in this case only C → B, since C is not a superkey.
Then you decompose the relation into two relations, one containing C and all the attributes determined by it (in this case only B), and the other including all the other attributes plus C.
So the decomposition is actually:
R1(B, C), with key C, with the only (non-trivial) dependency C → B
R2(A, C), with key AC, without (non-trivial) dependencies
Then the decomposition must be repeated for every relation that has some dependency that violates BCNF, but in this case there is no such relation, because both R1 and R2 are in BCNF.
Finally note that the decomposition does not preserve the dependencies. In fact the dependency AB → C is not preserved in the decomposition.
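The loss of AB → C can be checked mechanically. Below is a minimal sketch of the standard textbook test for dependency preservation, which never computes the join: it repeatedly restricts attribute closures to each component. The function names `closure` and `is_preserved` and the string encoding of attribute sets are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(fd, components, fds):
    """Check whether fd can be enforced using the components alone."""
    lhs, rhs = fd
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for component in components:
            comp = set(component)
            # Only what is derivable from, and visible in, this component counts.
            gained = closure(reached & comp, fds) & comp
            if not gained <= reached:
                reached |= gained
                changed = True
    return set(rhs) <= reached


FDS = [("AB", "C"), ("C", "B")]
DECOMPOSITION = ["BC", "AC"]  # R1(B, C) and R2(A, C)

c_is_superkey = closure("C", FDS) == set("ABC")
ab_c_preserved = is_preserved(("AB", "C"), DECOMPOSITION, FDS)
c_b_preserved = is_preserved(("C", "B"), DECOMPOSITION, FDS)

print(c_is_superkey, ab_c_preserved, c_b_preserved)
```

C → B survives because it lives entirely inside R1, while no component sees A and B together, so AB → C can only be checked after joining R1 and R2.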
I have R(A,B,C,D) with AB as primary key and AD --> C.
I think it is in 2NF because you cannot determine C from a subset of AB.
from wiki "a table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table"
but many people say it is only in 1NF because of the definition
"in 2NF if it is in 1NF and every non-prime attribute of the table is dependent on the whole of every candidate key"
so AD is not the whole primary key, but just a part of it together with another, non-prime attribute.
If you can, please also give some references other than Wikipedia so I can demonstrate my thesis, if it is really correct.
You state as a fact that AB is the primary key for the given relation R. For that to be true, there has to be at least one more functional dependency other than AD->C.
To explain 2NF, I assume that the missing FD is, say, B->D. So we have a relation R(A,B,C,D) with FDs:
AD->C
B->D
Then our primary key is AB. In simple words, 2NF deals with partial dependency, that is, an attribute depending on only part of the primary key. (So if we have a primary key that's just one attribute, then the relation R is already in 2NF!)
Formally:
Given a functional dependency X->A of a relation R where:
X is a set of attributes of R
A is a non-prime attribute not in X
then to be in 2NF, X should not be a proper subset of any key.
Coming back to our example. The primary key is AB, so the prime attributes are A and B. The non-prime attributes are C and D.
Let's consider the first FD, AD->C
Here C is a non-prime attribute. To not violate the 2NF condition, AD should not be a proper subset of the primary key AB. AD is not a proper subset of AB, so it does not violate the 2NF condition.
Let's see the next FD, B->D
Here D is a non-prime attribute and B is a proper subset of the primary key AB, and therefore it violates the 2NF condition.
Hence the relation R is not in second normal form.
On the other hand if the set of FD's for R would have been:
AD->C
AB->D
Our primary key is still AB but now the relation R is in second normal form.
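The two cases can be checked with a small script. This is only a sketch under my own assumptions: the helper names `closure`, `candidate_keys` and `partial_dependencies` are made up for the example, and the key search is brute force, which is fine for four attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # superset of a smaller key, so not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


def partial_dependencies(attributes, fds):
    """List (key part, non-prime attributes it determines) pairs that break 2NF."""
    keys = candidate_keys(attributes, fds)
    non_prime = set(attributes) - set().union(*keys)
    violations = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependent = (closure(part, fds) - set(part)) & non_prime
                if dependent:
                    violations.append((set(part), dependent))
    return violations


first = partial_dependencies("ABCD", [("AD", "C"), ("B", "D")])
second = partial_dependencies("ABCD", [("AD", "C"), ("AB", "D")])

print(first)
print(second)
```

With B->D the part {B} of the key determines the non-prime attribute D, so 2NF fails; with AB->D no proper part of the key determines anything non-prime.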
I have a SQL database with multiple tables: A, B, C, D. Entities in those tables are quite different things, with different columns, and different kind of relations between them.
However, they all share one little thing: the need for a comment system which, in this case, would have an identical structure: author_id, date, content, etc.
I wonder which strategy would be the best for this schema to have A,..D tables use a comment system. In a classical 'blog' web site I would use a one-to-many relationship with a post_id inside the 'comments' table.
Here it looks like I need A_comments, B_comments, etc. tables to handle this problem, which looks a little bit weird.
Is there a better way?
Create a comment table with a comment_id primary key and the various attributes of a comment.
Additionally, create A_comment thus:
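CREATE TABLE A_comment (
    -- INTEGER is an assumed type; use whatever type comment.comment_id and A.A_id have
    comment_id INTEGER PRIMARY KEY REFERENCES comment(comment_id),
    A_id INTEGER REFERENCES A(A_id)
);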
Do likewise for B, C and D. This ensures referential integrity between comment and all the other tables, which you can't do if you store the ids to A, B, C and D directly in comment.
Declaring A_comment.comment_id as the primary key ensures that a comment can only belong to one entry in A. It doesn't prevent a comment from belonging to an entry in A and an entry in B, but there's only so much you can achieve with foreign keys; this would require database-level constraints, which no database I know of supports.
This design also doesn't prevent orphaned comments, but I can't think of any way to prevent this in SQL, except, of course, to do the very thing you wanted to avoid: create multiple comment tables.
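Here is a minimal sketch of this design using Python's built-in sqlite3 module with an in-memory database. The column types, the NOT NULL constraints and the sample rows are assumptions for the example; only the table layout comes from the answer above. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE A (A_id INTEGER PRIMARY KEY);
CREATE TABLE comment (
    comment_id INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL,
    date       TEXT NOT NULL,
    content    TEXT NOT NULL
);
CREATE TABLE A_comment (
    comment_id INTEGER PRIMARY KEY REFERENCES comment(comment_id),
    A_id       INTEGER NOT NULL REFERENCES A(A_id)
);
""")


def try_insert(sql, params):
    """Run an insert and report whether the constraints accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO A (A_id) VALUES (1), (2)")
conn.execute(
    "INSERT INTO comment (comment_id, author_id, date, content) VALUES (?, ?, ?, ?)",
    (1, 42, "2024-01-01", "first comment"),
)

link = "INSERT INTO A_comment (comment_id, A_id) VALUES (?, ?)"
first_link = try_insert(link, (1, 1))   # comment 1 attached to A row 1
second_link = try_insert(link, (1, 2))  # same comment on another A row: primary key clash
dangling = try_insert(link, (99, 1))    # comment 99 does not exist: foreign key clash

print(first_link, second_link, dangling)
```

As the answer says, nothing here stops comment 1 from also being linked through a B_comment table, and a row in comment with no link row at all is still allowed.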
I had a similar "problem" with comments for several different object types (e.g. articles and shops in my case; each type has its own table).
What I did: in my comments table I have two columns that manage the linking:
object_type (ENUM type) that determines the object/table we are linking to, and
object_id (an unsigned integer type that matches the primary key type of your other tables, or the biggest of them) that points to the exact row in the particular table.
The column structure is then: id, object_type, object_id, author_id, date, content, etc.
The important thing here is to have a composite index on both columns, (object_type, object_id), to make lookups fast.
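A minimal sketch of this layout, again with Python's sqlite3 module and an in-memory database. SQLite has no ENUM type, so a CHECK constraint stands in for it here; the index name, the column types and the sample rows are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id          INTEGER PRIMARY KEY,
    -- CHECK stands in for the ENUM type, which SQLite lacks
    object_type TEXT NOT NULL CHECK (object_type IN ('article', 'shop')),
    object_id   INTEGER NOT NULL,
    author_id   INTEGER NOT NULL,
    date        TEXT NOT NULL,
    content     TEXT NOT NULL
);
-- Composite index covering the two linking columns
CREATE INDEX comments_object_idx ON comments (object_type, object_id);
""")

rows = [
    ("article", 7, 1, "2024-01-01", "nice article"),
    ("shop", 7, 2, "2024-01-02", "friendly staff"),
    ("shop", 7, 3, "2024-01-03", "good prices"),
    ("shop", 8, 1, "2024-01-04", "closed on Sundays"),
]
conn.executemany(
    "INSERT INTO comments (object_type, object_id, author_id, date, content) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# All comments for shop 7; article 7 shares the id but not the type.
shop_comments = [
    content
    for (content,) in conn.execute(
        "SELECT content FROM comments "
        "WHERE object_type = ? AND object_id = ? ORDER BY id",
        ("shop", 7),
    )
]
print(shop_comments)
```

Keep in mind that object_id is not a real foreign key in this design, so the database will not stop a comment from pointing at a row that does not exist.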
I presume that you are talking of a single comments table with a foreign key to "exactly one of" A, B, C or D.
The fact that SQL cannot handle this is one of its fundamental weaknesses. The question gets asked over and over and over again.
See, e.g.
What is the best way to enforce a 'subset' relationship with integrity constraints
Your constraint is a "foreign key" from your "single comments" table into a view, which is the union of the identifiers in A, B, C and D. SQL supports only foreign keys into base tables.
Observe that SQL as a language does have support for your situation, in the form of CREATE ASSERTION. I know of no SQL products that support that statement, however.
EDIT
You should also keep in mind that with a 'single' comments table, you might need to enforce disjointness between the keys in A,B,C and D, for otherwise it might happen some time that a comment automatically gets "shared" between entity occurrences from different tables, which might not be desirable.
You could have a single comments table and within that table have a column that contains a value differentiating which table the comment belongs to - ie a 1 in that column means it's a comment for table A, 2 for table B, and so on. If you didn't want to have "magic numbers" in the comments table, you could have another table that has just two columns: one with the number and another detailing which table the number represents.
You don't need a separate comment table for every other, one is enough. Every comment will have a unique ID, so you don't have to worry about conflicts.