Class Diagram: In A Composition Relationship Should a Child Class Always Have An ID Field? - oop

I'm having a hard time converting my database tables and foreign keys to a class diagram with classes and associations.
My question is:
"In a composition relationship, should a child class always have an ID field?"
In my class diagram, there are two component classes, PurchaseItem and PurchaseFinisher, which make up the Purchase class. PurchaseItem already comes with an ID field from its table, but PurchaseFinisher doesn't, because it is identified by the id_purchase and id_payment_method foreign keys.
Thanks in advance.
This is my DB diagram:
I can't see any redundancy between Purchase and Product, as you said. Could you please show it to me based on my DB diagram? My tables are well modeled (I hope). My problem is in the class definitions.

In a class diagram, no class requires an id property: each class instance (aka object) has its own identity with or without explicit id property.
In a database, you of course need an explicit id property to uniquely identify the object among the others in the database and to retrieve it later. By the way, you may annotate such properties with a trailing {id}. UML does not define any semantics for it, but it is generally expressive enough to help database designers.
In the case of composition, the main question is whether a composed object can easily be identified by alternate means. There are several related ORM database techniques, for example:
you can use the owning object’s id together with another property if this is sufficient to identify the element. The two together would make a composite primary key in database.
you can use a unique id to identify the object (surrogate primary key) and use the id of the owning object as foreign key.
For PurchaseItem you have everything that is needed, although the diagram does not tell which of the two approaches you’ll use (e.g., is the id unique globally, or unique within the purchase?).
But for PurchaseFinisher it is unclear whether you could uniquely identify an occurrence. If a payment method can only be used once per purchase, that’s fine, as the payment method can then be used to identify the object.
If it were allowed to pay the same amount twice (say, half of the overall price each time) in the same currency with the same payment method, you’d have indistinguishable duplicates. So some kind of identifier would be needed from the database point of view.
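The two techniques above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table layouts, the amount and description columns, and the sample values are hypothetical; only id_purchase and id_payment_method come from the question. PurchaseFinisher uses a composite primary key, PurchaseItem a surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE purchase (id INTEGER PRIMARY KEY);
CREATE TABLE payment_method (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Technique 1: owner id + another property form a composite primary key.
CREATE TABLE purchase_finisher (
    id_purchase       INTEGER NOT NULL REFERENCES purchase (id) ON DELETE CASCADE,
    id_payment_method INTEGER NOT NULL REFERENCES payment_method (id),
    amount            NUMERIC NOT NULL,  -- hypothetical column
    PRIMARY KEY (id_purchase, id_payment_method)
);

-- Technique 2: surrogate primary key, owner id as a plain foreign key.
CREATE TABLE purchase_item (
    id          INTEGER PRIMARY KEY,
    id_purchase INTEGER NOT NULL REFERENCES purchase (id) ON DELETE CASCADE,
    description TEXT NOT NULL            -- hypothetical column
);

INSERT INTO purchase (id) VALUES (1);
INSERT INTO payment_method (id, name) VALUES (1, 'card');
INSERT INTO purchase_finisher VALUES (1, 1, 50);
""")


def second_payment_with_same_method_allowed() -> bool:
    """Try to pay the same amount again with the same payment method."""
    try:
        conn.execute("INSERT INTO purchase_finisher VALUES (1, 1, 50)")
        return True
    except sqlite3.IntegrityError:
        return False  # the composite key rejects the duplicate


duplicate_allowed = second_payment_with_same_method_allowed()

# With a surrogate key, two otherwise identical rows remain distinguishable.
for _ in range(2):
    conn.execute(
        "INSERT INTO purchase_item (id_purchase, description) VALUES (1, 'pen')"
    )
item_count = conn.execute("SELECT COUNT(*) FROM purchase_item").fetchone()[0]
```

If two payments with the same method must be possible within one purchase, the composite key is too strict and the surrogate-key layout is the one that fits.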

Related

Weak Entity in ERD

I have the following problem, for which I have multiple scenarios that might be right or wrong. I've been searching for a while and haven't found a specific answer:
Doctor Clinic Example:
We have doctor, patient, treatment, treatment-type
Doctor: id, name....
Patient: id, name...
Treatment: date, cost
Treatment-Type: id, name
Doctor can do multiple treatments, and Patient can also do multiple treatments, so they are connected with Treatment with(1-N) relationship.
Treatment entity is a weak entity, as it cannot be defined in the absence of Doctor or Patient, so my question is, when we convert this ERD to actual tables, which is the correct (or the best-practice) scenario?
1 - doctor-id, patient-id cannot define the Treatment table uniquely, so we add to Treatment table the treatment-id field, and the PK is (doctor-id, patient-id, treatment-id).
2 - We add treatment-id field, and the PK is(treatment-id).
3 - The PK will be (doctor-id, patient-id, date).
I struggled to find out whether 'date' can be part of the PK, and also whether I can create a unique ID for a weak entity.
Thanks in advance.
Weak entity sets are entity sets that are partially identified by a parent entity set's primary key. A weak entity set necessarily depends on its parent entity set for existence (we say it participates totally in its identifying relationship), but not everything with an existence dependency is a weak entity set. Regular entity sets can also participate totally in one or more relationships. So, it depends on how you identify an entity set. See also my answer to the question "is optionality (mandatory, optional) and participation (total, partial) are same?"
An entity set that is uniquely identified by its own attributes is a regular entity set. An entity set that is partially identified by a parent entity set's primary key is a weak entity set. An entity set that is fully identified by a parent entity set's primary key is a subtype.
You should also note that weak entity sets can only have one parent entity set according to the entity-relationship model as Chen described it. Being identified by multiple parent entity sets would make it a relationship rather than an entity set.
In some schema design tools, a different interpretation is used where tables are equated to entity sets and relationships equated to FK constraints, and an identifying relationship would be an FK that is part of the PK of a table. This approach is closer to the network data model than the entity-relationship model, despite having adopted much of ER's terminology.
Let's take a look at your examples:
In example 1, we should consider whether treatment-id is identifying on its own (i.e. a surrogate key) or only in combination with doctor-id and patient-id (i.e. an ordinal number). If it's a surrogate key, it would be a mistake to include doctor-id and patient-id in the PK; example 2 would be the right way of handling it. If it's an ordinal number, then it's basically the same as example 3: two foreign entity keys and a value set in a primary key. I'll say more about that in my comments on example 3.
In example 2, treatment-id is a surrogate key which means Treatment is a regular entity set which participates totally in its relationships with Patient and Doctor. This would be my recommended solution, since it's the simplest.
In example 3, you have a primary key consisting of two foreign entity keys and a value set.
The entity-relationship model doesn't cover such relations: relations with a single entity key are called entity relations, and relations with multiple entity keys are called relationship relations. Value sets are only described as the codomains of attributes, not the domains. The ER model's inability to handle arbitrary relations is a consequence of artificial distinctions between entity sets vs value sets, and between attributes vs relationships. Other data modeling disciplines like the relational model and object-role modeling are complete and can handle any kind of relation.
Back to example 3, despite the ER model's shortcomings, it's not invalid to create such a table/relation in an actual database. However, think about what the primary key means - can a patient receive only one treatment per day from the same doctor? I would think multiple treatments should be possible, in which case you might need to add another ordinal number, e.g. (doctor-id, patient-id, date, treatment-id). In that case, it might be simpler just to do (doctor-id, patient-id, treatment-id).
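To make the meaning of the key in example 3 concrete, here is a minimal sketch using Python's sqlite3 module. The table names treatment_by_date and treatment, the column types, and the sample rows are assumptions for illustration; the columns themselves follow the question. It compares the (doctor-id, patient-id, date) key with the surrogate key of example 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE doctor  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);

-- Example 3: two foreign keys plus the date form the primary key.
CREATE TABLE treatment_by_date (
    doctor_id  INTEGER NOT NULL REFERENCES doctor (id),
    patient_id INTEGER NOT NULL REFERENCES patient (id),
    date       TEXT NOT NULL,
    cost       NUMERIC,
    PRIMARY KEY (doctor_id, patient_id, date)
);

-- Example 2: surrogate key; the foreign keys are ordinary mandatory columns.
CREATE TABLE treatment (
    treatment_id INTEGER PRIMARY KEY,
    doctor_id    INTEGER NOT NULL REFERENCES doctor (id),
    patient_id   INTEGER NOT NULL REFERENCES patient (id),
    date         TEXT NOT NULL,
    cost         NUMERIC
);

INSERT INTO doctor  VALUES (1, 'Dr. A');
INSERT INTO patient VALUES (1, 'P. B');
""")


def rows_after_two_same_day_treatments(table: str) -> int:
    """Record two treatments on the same day and count what was stored."""
    for cost in (100, 40):
        try:
            conn.execute(
                f"INSERT INTO {table} (doctor_id, patient_id, date, cost) "
                "VALUES (1, 1, '2024-01-15', ?)",
                (cost,),
            )
        except sqlite3.IntegrityError:
            pass  # the second row collides with the primary key
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


rows_with_date_key = rows_after_two_same_day_treatments("treatment_by_date")
rows_with_surrogate_key = rows_after_two_same_day_treatments("treatment")
```

With the date inside the key only one of the two same-day treatments is stored, while the surrogate-key table keeps both.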
One argument against such composite/natural keys is that they add up - a many-to-many association between two relations, each with 3 columns in their primary keys, could have up to 6 columns in its primary key! That gets inconvenient quickly, but on the other hand, those columns are relevant related info that would otherwise need to be retrieved from joined tables if the association was identified by a surrogate key.
Sorry about the long answer, but I hope this covers all the fine points. Let me know if you have any questions.

Table with foreign key that can reference different tables

I'm trying to build a table for an inbox app that stores its messages within inbox table.
The structure of the table is as follows:
inbox_id|sender_id|receiver_id|subject|message
The columns sender_id and receiver_id are FOREIGN KEYS and can reference multiple tables.
There are currently 3 types of users within the database and they all can send messages to each other. A user of type UserType1 can send a message to UserType2 and vice versa, or UserType1 can send messages to UserType1. So receiver and sender can reference one of these 3 tables.
My solution to this problem is to build an inbox_user table containing a column for each user type and to have sender_id and receiver_id reference it.
My main concern is the limited flexibility of the solution and the wasteful use of resources. I would always have 2 empty columns per row, and that would become even worse if I introduced more user types.
Would this be considered a bad practice? What are some more flexible and intelligent designs?
From your description it sounds like the best practice should be applied to your user table and not this inbox table. Of course, I don't know your constraints, but if you have 2 or 3 types of users with each type of user in its own table, that is a poor design (again, not knowing your constraints). The preference is to store all users in one table with a column to indicate their type. Then the reference to your inbox table becomes straightforward with both sender and receiver FKs pointing back to the same user table.
Otherwise, you're going to end up using multiple columns to reference each table, as you said (UserTypeASenderID, UserTypeBSenderID, UserTypeCSenderID, etc). I would rather have nullable FK columns and keep referential integrity than implement some other solution and lose the constraints.
Three different types of users is a classic type/subtype situation (or, if you prefer, class/subclass). There are several ways to design for a class/subclass situation. Two that look good to me are "Class Table Inheritance" and "Single Table Inheritance" as explained by Martin Fowler. You can find a synopsis online. You can also visit the tags with these names here, read up on the info presented, and look at the tagged questions.
Single table inheritance suffers from a lot of NULLs for fields that are not applicable some of the time, as you pointed out in the Q. This may or may not be a problem for you, depending on your case.
Class table inheritance involves a little programming when a new entry is made in a subclass (subtype). It also involves more joining, but it isn't very expensive joining. Class table inheritance is frequently combined with a technique called shared primary key. In this technique, the subclass tables end up with a primary key that is a duplicate of the primary key in the superclass table. It's also a foreign key to the superclass table. This makes joining subclass data and superclass data simple, easy, and fast.
Shared primary key resolves the quandary you stated in your Q, namely how to reference more than one table with one foreign key. A foreign key reference in some other table to the superclass table will also be a reference to at least one of the subclass tables. This seems like magic. Try it, see if you like it.
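A rough sketch of class table inheritance with a shared primary key, using Python's sqlite3 module, may make this less magical. All names here (app_user, user_type1, user_type2, detail, add_user, send) are hypothetical, and only two of the three user types are shown; the inbox columns follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Superclass table: data common to every kind of user.
CREATE TABLE app_user (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Subclass tables: the primary key doubles as a foreign key to app_user.
CREATE TABLE user_type1 (
    user_id INTEGER PRIMARY KEY REFERENCES app_user (user_id),
    detail  TEXT
);
CREATE TABLE user_type2 (
    user_id INTEGER PRIMARY KEY REFERENCES app_user (user_id),
    detail  TEXT
);
-- One foreign key per role, both pointing at the superclass table.
CREATE TABLE inbox (
    inbox_id    INTEGER PRIMARY KEY,
    sender_id   INTEGER NOT NULL REFERENCES app_user (user_id),
    receiver_id INTEGER NOT NULL REFERENCES app_user (user_id),
    subject     TEXT,
    message     TEXT
);
""")


def add_user(name: str, subtype_table: str, detail: str) -> int:
    """The 'little programming': copy the new key into the subclass table."""
    user_id = conn.execute(
        "INSERT INTO app_user (name) VALUES (?)", (name,)
    ).lastrowid
    conn.execute(
        f"INSERT INTO {subtype_table} (user_id, detail) VALUES (?, ?)",
        (user_id, detail),
    )
    return user_id


def send(sender_id: int, receiver_id: int, subject: str) -> bool:
    try:
        conn.execute(
            "INSERT INTO inbox (sender_id, receiver_id, subject, message) "
            "VALUES (?, ?, ?, '')",
            (sender_id, receiver_id, subject),
        )
        return True
    except sqlite3.IntegrityError:
        return False  # sender or receiver is not a known user


alice = add_user("Alice", "user_type1", "type-1 data")
bob = add_user("Bob", "user_type2", "type-2 data")
cross_type_ok = send(alice, bob, "hello")
unknown_receiver_ok = send(alice, 999, "lost")

# The shared key lets the inbox row join straight to the subclass table.
receiver_detail = conn.execute("""
    SELECT t2.detail
    FROM inbox AS i
    JOIN user_type2 AS t2 ON t2.user_id = i.receiver_id
""").fetchone()[0]
```

The inbox table never needs to know how many user types exist; adding a fourth type means adding one more subclass table.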

How do I structure a generic item that can have a relationship with different tables?

In my example, I have a watch, which is an indication that a user wants notifications about events on a different item, say a group or an organization.
I see two ways to do this:
Have a groupwatch resource, with a groupwatch table, with id,user,group (group FK to group resource and table); and an orgwatch resource, with an orgwatch table, with id,user,organization (org FK to organization resource and table)
Have a generic watch resource, with a watch table, with id,user,type,typeid. type is one of group or organization, and typeid is the ID of the group or organization being watched.
Since both of them are watches, it seems a waste to have two different tables and resources to watch 2 different objects. It gets worse if I start watching 4, 5, 6, 20, 50 different types of resources.
On the other hand, a foreign key relationship appears impossible if I just have a generic typeid, which means that my database (if relational) and my framework (activerecord or anything else) cannot enforce it correctly.
How do I best implement this type of "association to different types of record/table for each record in my table"?
UPDATE:
Are my only choices for doing this:
separate tables/resources for each watch type, which enables the database to enforce relational integrity and do joins
single table for all watches, but I will have to enforce relational integrity and do joins at the app level?
If you add a new type of resource once every six months, you may want to define your tables in such a way that adding new resources involves changing data definitions. If you add a new resource type every week, you may want to make your data definitions stay the same when you add new types. There's a downside to either choice.
If you do choose to define tables in such a way that the types are visible in the table structure, there are two patterns often used with type/subtype (aka class/subclass) situations.
One pattern has been called "single table inheritance". Put data about all the types in a single table, and leave some columns NULL wherever they do not apply.
Another pattern has been called "class table inheritance". Define one table for the superclass, with all the data that is common to all the types. Then define tables for each subtype (subclass) to contain class specific data. Make the primary key of the subtype tables a duplicate of the primary key in the supertype table, and also declare it as a foreign key that references the primary key of the supertype table. It's going to be up to the app, at insert time, to replicate the value of the primary key in the supertype table over in the subtype table.
I like Fowler's treatment of these two patterns.
http://martinfowler.com/eaaCatalog/classTableInheritance.html
http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html
This matter of sharing primary keys has a few beneficial effects.
First, it enforces the one-to-one nature of the IS-A relationships.
Second, it makes it easy to find out whether a given entry belongs to a desired subtype, by just joining with the subtype table. You don't really need an extra type field.
Third, it speeds up the joins, because of the index that gets built when you declare a primary key.
If you want a structure that can adapt to new attributes without changing data definitions, you can look into E-A-V design. Be careful, though. Sometimes this results in data that is nearly impossible to use, because the logical structure is so obscure. I usually think of E-A-V as an anti-pattern for this reason, although there are some who really like the results they get from it.

How to model a mutually exclusive relationship in SQL Server

I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal, although if it's the only way to get it right, this is what I will have to do.
I have an Item table that can be linked to any number of tables, and these tables may increase over time. An Item can only be linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Examples of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration.
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My initial thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
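One common way to go a step further and have the database itself use the discriminator is a composite foreign key. The sketch below, written with Python's sqlite3 module, is an assumption layered on top of the answer rather than something it spells out: the names base_entity, entity_type, person, office, and item are hypothetical, and text labels stand in for the integer discriminator. It guarantees that a base ID is used by at most one subtype table, not that it is used by exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE base_entity (
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL CHECK (entity_type IN ('person', 'office')),
    UNIQUE (entity_id, entity_type)  -- target of the composite foreign keys
);
-- Each subtype table pins its own discriminator value and must match
-- the (id, type) pair stored in the base table.
CREATE TABLE person (
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL DEFAULT 'person' CHECK (entity_type = 'person'),
    full_name   TEXT,
    FOREIGN KEY (entity_id, entity_type)
        REFERENCES base_entity (entity_id, entity_type)
);
CREATE TABLE office (
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL DEFAULT 'office' CHECK (entity_type = 'office'),
    room_label  TEXT,
    FOREIGN KEY (entity_id, entity_type)
        REFERENCES base_entity (entity_id, entity_type)
);
-- An item points at exactly one base entity, whatever its subtype.
CREATE TABLE item (
    item_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    owner_id INTEGER NOT NULL REFERENCES base_entity (entity_id)
);

INSERT INTO base_entity VALUES (1, 'person');
INSERT INTO person (entity_id, full_name) VALUES (1, 'Ann');
INSERT INTO item (name, owner_id) VALUES ('Poster', 1);
""")


def office_row_for_a_person_allowed() -> bool:
    """Entity 1 is registered as a person; try to reuse its id as an office."""
    try:
        conn.execute(
            "INSERT INTO office (entity_id, room_label) VALUES (1, 'B12')"
        )
        return True
    except sqlite3.IntegrityError:
        return False  # (1, 'office') does not exist in base_entity


reuse_allowed = office_row_for_a_person_allowed()
```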
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
Your link table is OK.
The trouble you will have is that you will need to generate dynamic SQL at runtime. Parameterized SQL does not typically allow the objects in the FROM list to be parameters.
If you want to avoid this, you may be able to denormalize a little, say by creating a table to hold the id (assuming the ids are unique across the other tables), the type_id representing which table is the source, and a generated description, e.g. the name value from the initial record.
You would trigger the creation of this denormalized list when the base info is modified, and you could use it for generalized queries, then resort to your dynamic queries when needed at runtime.

Subtyping database tables

I hear a lot about subtyping tables when designing a database, and I'm fully aware of the theory behind them. However, I have never actually seen table subtyping in action. How can you create subtypes of tables? I am using MS Access, and I'm looking for a way of doing it in SQL as well as through the GUI (Access 2003).
Cheers!
An easy example would be to have a Person table with a primary key and some columns in that table. Now you can create another table called Student that has a foreign key to the person table (its supertype). Now the student table has some columns which the supertype doesn't have like GPA, Major, etc. But the name, last name and such would be in the parent table. You can always access the student name back in the Person table through the foreign key in the Student table.
Anyways, just remember the following:
The hierarchy depicts the relationship between supertypes and subtypes
Supertypes have common attributes
Subtypes have unique attributes
Subtypes of tables is a conceptual thing in EER diagrams. I haven't seen an RDBMS (excluding object-relational DBMSs) that supports it directly. They are usually implemented in one of two ways:
A set of nullable columns for each property of the subtype in a single table
A table for base type properties, plus some other tables, each with at most one row per base table row, that contain the subtype properties
The notion of table sub-types is useful when using an ORM mapper to produce a class sub-type hierarchy that exactly models the domain.
A sub-type table will have a foreign key back to its parent, which is also the sub-type table's primary key.
Keep in mind that in designing a bound application, as with an Access application, subtypes impose a heavy cost in terms of joins.
For instance, if you have a supertype table with three subtype tables and you need to display all three in a single form at once (and you need to show not just the supertype data), you end up with a choice of using three outer joins and Nz(), or you need a UNION ALL of three mutually exclusive SELECT statements (one for each subtype). Neither of these will be editable.
I was going to paste some SQL from the first major app where I worked with super/subtype tables, but looking at it, the SQL is so complicated it would just confuse people. That's not so much because my app was complicated, but it's because the nature of the problem is complex -- presenting the full set of data to the user, both super- and subtypes, is by its very nature complex. My conclusion from working with it was that I'd have been better off with only one subtype table.
That's not to say it's not useful in some circumstances, just that Access's bound forms don't necessarily make it easy to present this data to the user.
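For a feel of the two query shapes mentioned above, here is a deliberately small sketch using Python's sqlite3 module rather than Access. The tables contact, person, company, and agency and their columns are hypothetical, and COALESCE stands in for Access's Nz().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, phone TEXT);
CREATE TABLE person  (contact_id INTEGER PRIMARY KEY
                          REFERENCES contact (contact_id), full_name TEXT);
CREATE TABLE company (contact_id INTEGER PRIMARY KEY
                          REFERENCES contact (contact_id), company_name TEXT);
CREATE TABLE agency  (contact_id INTEGER PRIMARY KEY
                          REFERENCES contact (contact_id), agency_name TEXT);
INSERT INTO contact VALUES (1, '555-0001'), (2, '555-0002'), (3, '555-0003');
INSERT INTO person  VALUES (1, 'Ann Lee');
INSERT INTO company VALUES (2, 'Acme');
INSERT INTO agency  VALUES (3, 'Tax Office');
""")

# Shape 1: three outer joins, picking whichever subtype column is present.
outer_join_rows = conn.execute("""
    SELECT c.contact_id, c.phone,
           COALESCE(p.full_name, co.company_name, a.agency_name) AS display_name
    FROM contact AS c
    LEFT JOIN person  AS p  ON p.contact_id  = c.contact_id
    LEFT JOIN company AS co ON co.contact_id = c.contact_id
    LEFT JOIN agency  AS a  ON a.contact_id  = c.contact_id
    ORDER BY 1
""").fetchall()

# Shape 2: UNION ALL of three mutually exclusive SELECTs, one per subtype.
union_rows = conn.execute("""
    SELECT c.contact_id, c.phone, p.full_name
    FROM contact AS c JOIN person AS p ON p.contact_id = c.contact_id
    UNION ALL
    SELECT c.contact_id, c.phone, co.company_name
    FROM contact AS c JOIN company AS co ON co.contact_id = c.contact_id
    UNION ALL
    SELECT c.contact_id, c.phone, a.agency_name
    FROM contact AS c JOIN agency AS a ON a.contact_id = c.contact_id
    ORDER BY 1
""").fetchall()
```

Both queries return the same three rows for this sample data; real forms usually need many more subtype columns, which is where the complexity described above comes from.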
I have a similar problem I've been working on.
While looking for a repeatable pattern, I wanted to make sure I didn't abandon referential integrity, which meant that I wouldn't use a (TABLE_NAME, PK_ID) solution.
I finally settled on:
Base Type Table: CUSTOMER
Sub Type Tables: PERSON, BUSINESS, GOVT_ENTITY
I put nullable PERSON_ID, BUSINESS_ID and GOVT_ENTITY_ID fields in CUSTOMER, with foreign keys on each, and a check constraint that only one is not null. It's easy to add new sub types: you just need to add the nullable foreign key and modify the check constraint.
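A minimal sketch of that layout with Python's sqlite3 module follows. The table and column names come from the description above; the name columns, the customer_id key, and the exact form of the CHECK expression are assumptions, and other databases may need the constraint written differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person      (person_id      INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE business    (business_id    INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE govt_entity (govt_entity_id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE customer (
    customer_id    INTEGER PRIMARY KEY,
    person_id      INTEGER REFERENCES person (person_id),
    business_id    INTEGER REFERENCES business (business_id),
    govt_entity_id INTEGER REFERENCES govt_entity (govt_entity_id),
    -- In SQLite each IS NOT NULL test yields 0 or 1, so the sum counts
    -- how many of the three references are filled in.
    CHECK ((person_id IS NOT NULL)
         + (business_id IS NOT NULL)
         + (govt_entity_id IS NOT NULL) = 1)
);

INSERT INTO person   VALUES (1, 'Ann');
INSERT INTO business VALUES (1, 'Acme');
""")


def try_insert(person_id, business_id, govt_entity_id) -> bool:
    try:
        conn.execute(
            "INSERT INTO customer (person_id, business_id, govt_entity_id) "
            "VALUES (?, ?, ?)",
            (person_id, business_id, govt_entity_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False  # check constraint or foreign key violated


results = {
    "one subtype": try_insert(1, None, None),
    "two subtypes": try_insert(1, 1, None),
    "no subtype": try_insert(None, None, None),
    "unknown person": try_insert(42, None, None),
}
```

Only the first insert succeeds: the check constraint rejects rows with zero or two references, and the foreign key rejects a reference to a person that does not exist.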