Before I go ahead and convert my entity relationship diagram into SQL statements, I thought I'd ask if someone could verify that this model doesn't contain any absurdities or anomalies that will surface once I have a SQL database schema.
I am particularly unsure about the cardinality of the relationship between Customer and VIP, and of the relationship between Supplier and CD. Should start_date of the VIP entity be a weak key? Are there any other potential weak keys besides the name attribute of the Song entity?
Legend
Entity
Attribute
Weak Entity
Relationship
Identifying Relationship
Cardinality Ratio
I've used the following websites as references to construct my diagram:
http://en.wikipedia.org/wiki/File:ERD_Representation.svg
http://en.wikipedia.org/wiki/Entity-relationship_model
http://www.cse.ohio-state.edu/~gurari/course/cse670/cse670Ch2.xht
Software used to create the diagram: Dia (Linux)
Sorry this is a late answer, but in case it's useful there are two improvements you can make.
1) The "is-a" relationship between "VIP" and "CUSTOMER" indicates the presence of a superclass (customer) and subclass (vip). You may want to model VIP as a subclass.
2) Since you are tracking dates for the relationship "rents", the cardinality must be taken "over time". Therefore the cardinality on both sides is "N" (i.e., not "1" on the side of customers).
Minor improvement: in "Song" (a weak entity class), set the partial identifier to "track" rather than "name". This allows multiple recordings of the same song on a CD (e.g., two versions), because the track number is always unique within the CD.
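In case it helps, here is a minimal sketch of that weak entity using Python's built-in sqlite3 module. The column names (cd_id, title, track, name) and the sample rows are my own assumptions, not taken from your diagram; the only point is that the primary key of song combines the parent's key with the partial identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE cd (
    cd_id INTEGER PRIMARY KEY,          -- hypothetical key for the CD entity
    title TEXT NOT NULL
);
CREATE TABLE song (
    cd_id INTEGER NOT NULL REFERENCES cd(cd_id) ON DELETE CASCADE,
    track INTEGER NOT NULL,             -- partial identifier of the weak entity
    name  TEXT NOT NULL,
    PRIMARY KEY (cd_id, track)          -- parent key + partial identifier
);
""")

conn.execute("INSERT INTO cd (cd_id, title) VALUES (1, 'Example Album')")
# Two recordings of the same song on one CD are fine: the tracks differ.
conn.execute("INSERT INTO song VALUES (1, 1, 'Same Song')")
conn.execute("INSERT INTO song VALUES (1, 2, 'Same Song')")

def try_insert(sql):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Reusing track 2 on the same CD violates the composite primary key.
duplicate_track_ok = try_insert("INSERT INTO song VALUES (1, 2, 'Other Song')")
song_count = conn.execute("SELECT COUNT(*) FROM song").fetchone()[0]
```

Had "name" been the partial identifier, the second insert would have been the one rejected.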
I have the following problem: there are multiple scenarios that might be right or wrong. I've been searching for a while and haven't found a specific answer:
Doctor Clinic Example:
We have doctor, patient, treatment, treatment-type
Doctor: id, name....
Patient: id, name...
Treatment: date, cost
Treatment-Type: id, name
A Doctor can perform multiple treatments, and a Patient can also receive multiple treatments, so each is connected to Treatment with a (1-N) relationship.
The Treatment entity is a weak entity, as it cannot be defined in the absence of Doctor or Patient. So my question is: when we convert this ERD to actual tables, which is the correct (or best-practice) scenario?
1 - doctor-id, patient-id cannot define the Treatment table uniquely, so we add to Treatment table the treatment-id field, and the PK is (doctor-id, patient-id, treatment-id).
2 - We add treatment-id field, and the PK is(treatment-id).
3 - The PK will be (doctor-id, patient-id, date).
I struggled to find out whether 'date' can be part of a PK, and also whether I can create a unique ID for a weak entity.
Thanks in advance.
Weak entity sets are entity sets that are partially identified by a parent entity set's primary key. A weak entity set necessarily depends on its parent entity set for existence (we say it participates totally in its identifying relationship), but not everything with an existence dependency is a weak entity set. Regular entity sets can also participate totally in one or more relationships. So, it depends on how you identify an entity set. See also my answer to the question "is optionality (mandatory, optional) and participation (total, partial) are same?"
An entity set that is uniquely identified by its own attributes is a regular entity set. An entity set that is partially identified by a parent entity set's primary key is a weak entity set. An entity set that is fully identified by a parent entity set's primary key is a subtype.
You should also note that weak entity sets can only have one parent entity set according to the entity-relationship model as Chen described it. Being identified by multiple parent entity sets would make it a relationship rather than an entity set.
In some schema design tools, a different interpretation is used where tables are equated to entity sets and relationships equated to FK constraints, and an identifying relationship would be an FK that is part of the PK of a table. This approach is closer to the network data model than the entity-relationship model, despite having adopted much of ER's terminology.
Let's take a look at your examples:
In example 1, we should consider whether treatment-id is identifying on its own (i.e. a surrogate key) or only in combination with doctor-id and patient-id (i.e. an ordinal number). If it's a surrogate key, it would be a mistake to include doctor-id and patient-id in the PK; example 2 would be the right way of handling it. If it's an ordinal number, then it's basically the same as example 3: two foreign entity keys and a value set in a primary key. I'll say more about that in my comments on example 3.
In example 2, treatment-id is a surrogate key which means Treatment is a regular entity set which participates totally in its relationships with Patient and Doctor. This would be my recommended solution, since it's the simplest.
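Here is a rough sketch of example 2 using Python's built-in sqlite3 module. The table and column names and the sample data are my assumptions; your real schema will differ. Total participation is expressed by making both foreign keys NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE doctor  (doctor_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE treatment (
    treatment_id INTEGER PRIMARY KEY,   -- surrogate key, identifying on its own
    doctor_id  INTEGER NOT NULL REFERENCES doctor(doctor_id),
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    date TEXT NOT NULL,
    cost REAL NOT NULL
);
INSERT INTO doctor  VALUES (1, 'Dr. A');
INSERT INTO patient VALUES (1, 'Patient B');
""")

# Two treatments by the same doctor for the same patient on the same day.
insert = "INSERT INTO treatment (doctor_id, patient_id, date, cost) VALUES (?, ?, ?, ?)"
conn.execute(insert, (1, 1, "2024-01-01", 50.0))
conn.execute(insert, (1, 1, "2024-01-01", 75.0))
ids = [row[0] for row in conn.execute("SELECT treatment_id FROM treatment")]

# A treatment without a doctor is rejected by the NOT NULL constraint.
try:
    conn.execute(insert, (None, 1, "2024-01-02", 10.0))
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
```

Each treatment gets its own treatment_id, so nothing about doctor, patient, or date needs to be unique.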
In example 3, you have a primary key consisting of two foreign entity keys and a value set.
The entity-relationship model doesn't cover such relations: relations with a single entity key are called entity relations, and relations with multiple entity keys are called relationship relations. Value sets are only described as the codomains of attributes, not the domains. The ER model's inability to handle arbitrary relations is a consequence of artificial distinctions between entity sets vs value sets, and between attributes vs relationships. Other data modeling disciplines like the relational model and object-role modeling are complete and can handle any kind of relation.
Back to example 3, despite the ER model's shortcomings, it's not invalid to create such a table/relation in an actual database. However, think about what the primary key means - can a patient receive only one treatment per day from the same doctor? I would think multiple treatments should be possible, in which case you might need to add another ordinal number, e.g. (doctor-id, patient-id, date, treatment-id). In that case, it might be simpler just to do (doctor-id, patient-id, treatment-id).
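To make the difference concrete, here is a small sketch with Python's sqlite3 module comparing the two composite keys. Table names, the treatment_no column, and the sample values are assumptions of mine, and I've left out the foreign key declarations to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Example 3: the date is part of the key.
CREATE TABLE treatment_by_date (
    doctor_id  INTEGER NOT NULL,
    patient_id INTEGER NOT NULL,
    date TEXT NOT NULL,
    cost REAL NOT NULL,
    PRIMARY KEY (doctor_id, patient_id, date)
);
-- Alternative: an ordinal number per doctor/patient pair.
CREATE TABLE treatment_by_ordinal (
    doctor_id    INTEGER NOT NULL,
    patient_id   INTEGER NOT NULL,
    treatment_no INTEGER NOT NULL,      -- hypothetical ordinal, not a surrogate key
    date TEXT NOT NULL,
    cost REAL NOT NULL,
    PRIMARY KEY (doctor_id, patient_id, treatment_no)
);
""")

def insert_ok(sql, params):
    """Return True if the insert succeeds, False on a key violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

by_date = "INSERT INTO treatment_by_date VALUES (?, ?, ?, ?)"
first_of_day = insert_ok(by_date, (1, 1, "2024-01-01", 50.0))
second_same_day = insert_ok(by_date, (1, 1, "2024-01-01", 75.0))

by_ordinal = "INSERT INTO treatment_by_ordinal VALUES (?, ?, ?, ?, ?)"
ordinal_first = insert_ok(by_ordinal, (1, 1, 1, "2024-01-01", 50.0))
ordinal_second = insert_ok(by_ordinal, (1, 1, 2, "2024-01-01", 75.0))
```

With the date in the key, the second treatment on the same day is rejected; with the ordinal number, both are accepted.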
One argument against such composite/natural keys is that they add up - a many-to-many association between two relations, each with 3 columns in their primary keys, could have up to 6 columns in its primary key! That gets inconvenient quickly, but on the other hand, those columns are relevant related info that would otherwise need to be retrieved from joined tables if the association was identified by a surrogate key.
Sorry about the long answer, but I hope this covers all the fine points. Let me know if you have any questions.
If I have a design like the image above, and I wanted to have another table called "favourite fruit" that selects only one fruit for each person, would it make sense to have it as a weak entity of fruit table with a pk (personid UNIQUE, fruitid, artificialid UNIQUE) ?
Be careful. Entities only exist in the diagram if they have properties/attributes, even in partitioning relationships (the "ISA" relationship).
[...] and I wanted to have another table called "favourite fruit [...]"
Be careful here too. In the Conceptual Model there are no tables, only entities. Like entities, relationships can later become tables too. Using table terminology at this stage can cause confusion.
It's just as the user reaanb said in the comments. Relations/relationships express exactly this: relationships between things (in this case, entities). You need to remember why you build relationships in the diagram. Always ask yourself:
Why does this relationship exist? Why am I creating it between X and Y? Do I really need to persist it?
You can without any problems create more than one relationship between two entities, since they represent different things (relationships).
Now. If you know that a user has a favorite fruit, then you know that these two entities are related in some way. Therefore, without a doubt, we have a relationship.
In your case, if a user always has a favorite fruit, then we have a 1-N relationship, where the N part is in users, as a fruit can be the favorite of many users.
On the other hand, if a user may or may not have a favorite fruit and you do not want a nullable field, we have to define an N-N relationship. This relationship will become a table when decomposed. In this case you'll have a table with only two columns. Both are foreign keys to the two tables, "users" and "fruits", and together they form a composite primary key on this table. To prevent a user from having more than one favorite fruit, set the "user" column as unique.
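As a rough illustration of that second option, here is a sketch with Python's sqlite3 module. The table names, column names, and sample rows are assumptions on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE users  (user_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE fruits (fruit_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE favourite_fruit (
    user_id  INTEGER NOT NULL UNIQUE REFERENCES users(user_id),  -- one favourite per user
    fruit_id INTEGER NOT NULL REFERENCES fruits(fruit_id),
    PRIMARY KEY (user_id, fruit_id)
);
INSERT INTO users  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO fruits VALUES (1, 'apple'), (2, 'pear');
""")

def insert_ok(user_id, fruit_id):
    """Return True if the favourite is stored, False on a constraint violation."""
    try:
        conn.execute("INSERT INTO favourite_fruit VALUES (?, ?)", (user_id, fruit_id))
        return True
    except sqlite3.IntegrityError:
        return False

ann_apple = insert_ok(1, 1)   # Ann picks apple
bob_apple = insert_ok(2, 1)   # the same fruit can be the favourite of many users
ann_pear = insert_ok(1, 2)    # a second favourite for Ann hits the UNIQUE constraint
# Cy (user 3) simply has no row, so no NULL is needed anywhere.
users_with_favourite = conn.execute("SELECT COUNT(*) FROM favourite_fruit").fetchone()[0]
```

Note that once the user column is unique it could serve as the primary key by itself; the composite key is kept here only to follow the description above.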
If you have any questions, please comment and I will answer.
How do we translate something like this into SQL?
Entity A -thick line- relation -simple line- Entity B
It's easy enough to write any of the other connections, but somehow I can't seem to figure it out when it comes to one thick line and one simple line, as shown above.
I have an entity (Entity A - Season) whose primary key is the date of a football season, and an entity (Entity B - Football team) whose primary key has two parts: its name and the primary key of the Season entity. Because of that doubt, I can't relate them properly.
Relationships (diamonds) do not typically form independent tables. However, a many-many relationship will usually get a separate table. Depending on your notation (there are many), your diagram could represent a many-many relationship or a 1:1 relationship.
Strong entities (your rectangles) get tables.
In your ER diagram, you will also typically see attributes for each table in circles connected by lines to the entity itself. Those attributes are turned into columns for each table. Attributes which are underlined in the diagram are representative of a primary key for a particular entity.
Additional or strange constraints that aren't typically easily represented in an ER diagram are usually put as side notes.
To answer your question, you must know whether or not it's a many-many relationship; if so, you would create a SeasonClub table with the two different primary keys inside it.
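If it does turn out to be many-many, a sketch of such a table could look like the following (Python's sqlite3 module; the table names, column names, and sample data are assumptions, since I can't see your attributes).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE season (start_date TEXT NOT NULL PRIMARY KEY);
CREATE TABLE club   (name       TEXT NOT NULL PRIMARY KEY);
CREATE TABLE season_club (
    season_start TEXT NOT NULL REFERENCES season(start_date),
    club_name    TEXT NOT NULL REFERENCES club(name),
    PRIMARY KEY (season_start, club_name)   -- both primary keys together
);
INSERT INTO season VALUES ('2023-08-01'), ('2024-08-01');
INSERT INTO club   VALUES ('Club A'), ('Club B');
INSERT INTO season_club VALUES
    ('2023-08-01', 'Club A'),
    ('2023-08-01', 'Club B'),
    ('2024-08-01', 'Club A');
""")

# One season has many clubs ...
clubs_2023 = [row[0] for row in conn.execute(
    "SELECT club_name FROM season_club WHERE season_start = ? ORDER BY club_name",
    ("2023-08-01",))]
# ... and one club plays in many seasons.
seasons_club_a = [row[0] for row in conn.execute(
    "SELECT season_start FROM season_club WHERE club_name = ? ORDER BY season_start",
    ("Club A",))]
```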
A database architect and I were arguing over whether a table with a compound primary key and subtypes makes sense relationally, and whether it is good practice.
Say we have two tables Employee and Project. We create a composite table Employee_Project with a composite primary key back to Employee and Project.
Is there a valid way for Employee_Project to have subtypes? Or can you think of any scenario where a composite key table can have subtypes?
To me a composite key relationship is a 'Is A' relationship (Employee_Project is a Employee and a Project). Subtypes are also a 'Is A' relationship. So if you have a composite key with a subtype, it's two 'Is A' relationships in one sentence, which makes me believe this is a bad practice.
Employee-project is a bit hard, but one can imagine something like this -- although I'm not much of a chemist.
Or something like this, which would require different legal forms (fields) for single person ownership vs joint (time-share).
Or like this, providing that different forms are needed for full time and temp.
Employee projects have subtypes if the candidate subtypes are
not utterly different, but
not exactly alike
That means that
Every employee project has some attributes (columns) in common. So they're not utterly different.
Some employee projects have different attributes than others. So they're not exactly alike.
The determination has to do with common and distinct attributes. It doesn't have anything to do with the number of columns in a candidate key. Do you have employee projects that are not utterly different, but not exactly alike?
The most common business supertype/subtype example concerns organizations and individuals. They're not utterly different.
Both have addresses.
Both have phone numbers.
Both can be plaintiffs and defendants in court.
But they're not exactly alike.
Individuals can go to college.
Organizations can have a CEO.
Individuals can get married.
Individuals can have children.
Organizations (in the USA) can be liquidated.
So you can express individuals and organizations as subtypes of a supertype called, say, "Parties". The attributes all the subtypes have in common relate to the supertype.
Parties have addresses.
Parties have phone numbers.
Parties can be plaintiffs and defendants in court.
Again, this has to do with attributes that are held in common, and attributes that are distinct. It has nothing to do with the number of columns in a candidate key.
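A minimal sketch of the parties example, using Python's sqlite3 module, might look like this. All table names, column names, and sample values are assumptions for illustration; the common attributes live in the supertype table and the distinct ones in the subtype tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
-- Supertype: attributes every party has.
CREATE TABLE parties (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL CHECK (party_type IN ('individual', 'organization')),
    address    TEXT NOT NULL,
    phone      TEXT NOT NULL
);
-- Subtypes: attributes only some parties have. Each shares the supertype's key.
CREATE TABLE individuals (
    party_id INTEGER PRIMARY KEY REFERENCES parties(party_id),
    college  TEXT
);
CREATE TABLE organizations (
    party_id INTEGER PRIMARY KEY REFERENCES parties(party_id),
    ceo_name TEXT
);
INSERT INTO parties VALUES (1, 'individual',   '1 Main St', '555-0100');
INSERT INTO parties VALUES (2, 'organization', '2 Side St', '555-0200');
INSERT INTO individuals   VALUES (1, 'State College');
INSERT INTO organizations VALUES (2, 'Jane Doe');
""")

# Common attributes come from the supertype, distinct ones from the subtypes.
rows = conn.execute("""
    SELECT p.party_id, p.address, i.college, o.ceo_name
    FROM parties p
    LEFT JOIN individuals   i ON i.party_id = p.party_id
    LEFT JOIN organizations o ON o.party_id = p.party_id
    ORDER BY p.party_id
""").fetchall()

# A subtype row cannot exist without its supertype row.
try:
    conn.execute("INSERT INTO individuals VALUES (99, 'Nowhere U')")
    orphan_subtype_allowed = True
except sqlite3.IntegrityError:
    orphan_subtype_allowed = False
```

This sketch does not stop a party of type 'organization' from also getting a row in individuals; enforcing that takes extra constraints that I've left out.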
"To me a composite key relationship is a 'Is A' relationship (Employee_Project is a Employee and a Project)."
Database designers don't think that way. We think in terms of a table's predicate.
If an employee can have many projects and a project can have many employees, it is a many-to-many join that RDBMSs can only represent easily in one way (the way you have outlined above). You can see in the ER diagram below (employee / departments is one of the classic many-to-many examples) that it does not have a separate ER component. The separate table is a leaky abstraction of RDBMSs (which is probably why you are having a hard time modeling it).
http://www.library.cornell.edu/elicensestudy/dlfdeliverables/fallforum2003/ERD_final.doc
Bridge Entities
When an instance of an entity may be related to multiple instances of another entity and vice versa, that is called a “many-to-many relationship.” In the example below, a supplier may provide many different products, and each type of product may be offered by many suppliers:
While this relationship model is perfectly valid, it cannot be translated directly into a relational database design. In a relational database, relationships are expressed by keys in a table column that point to the correct instance in the related table. A many-to-many relationship does not allow this relationship expression, because each record in each table might have to point to multiple records in the other table.
http://users.csc.calpoly.edu/~jdalbey/205/Lectures/ERD_image004.gif
Here they do not even bother with a separate box, although they add it in later (at this step it is a 'pure' ER diagram). It can also be explicitly represented with a box and a diamond superimposed on each other.
If the degree of an entity is 8, what is the minimum number of attributes required to form the primary key?
Degree depends on the relationship. A binary relationship involves two entities, so its degree is two. A ternary relationship involves three entities, so its degree is three. In general, the degree of a relationship is the number of entities that take part in it.
Entities don't have a "degree". What you may be referring to is the degree of a relationship, which is sometimes referred to as a "Degree of an Entity" relationship. If this is what you are asking about, then the "Degree of a Relationship" in an RDBMS is the count of entities involved in that relationship.
For example, in a relationship between a product and the store that carries it, there are two entities (Product and Store), so it is a binary relationship (Degree = 2). In a relationship between vendor and store, there could be three entities involved (Vendor, Product, and Store), so this would be a ternary relationship (Degree = 3).
In general, RDBMSs do not model ternary or higher-degree relationships directly; they require that you implement them with multiple binary relationships (e.g., you would need Vendor -> Product and Product -> Store relationships).
In principle the minimum number of attributes needed to form a primary key of any relation is zero. It is perfectly possible (though relatively unusual) to have a key consisting of zero attributes. A relation variable with a key consisting of no attributes is limited to one tuple at most.
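SQL products generally don't let you declare an empty key, so here is a small plain-Python sketch of the idea instead. The is_key helper and the sample rows are hypothetical, and the helper only checks uniqueness of the projection (not irreducibility).

```python
from typing import Iterable, Sequence


def is_key(rows: Sequence[dict], attrs: Iterable[str]) -> bool:
    """Return True if no two rows agree on all of the given attributes."""
    attrs = tuple(attrs)
    # Project every row onto the chosen attributes.
    projected = [tuple(row[a] for a in attrs) for row in rows]
    # The attributes are unique exactly when no projected tuple repeats.
    return len(set(projected)) == len(projected)


one_row = [{"name": "config", "value": 1}]
two_rows = one_row + [{"name": "other", "value": 2}]

# With zero attributes every row projects onto the empty tuple (),
# so the empty key holds only while there is at most one row.
empty_key_no_rows = is_key([], ())
empty_key_one_row = is_key(one_row, ())
empty_key_two_rows = is_key(two_rows, ())
name_key_two_rows = is_key(two_rows, ("name",))
```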