Model diagram doesn't seem right. How else can I relate the objects? - class-diagram

I have an entity diagram from some analysis that I'd like someone to look over. For some reason the System object just doesn't seem right to me. Is there a better way to relate the objects?
It's basically a user authentication/management system in its infancy.
http://www.dumpt.com/img/viewer.php?file=zlh8ltbtho4mutbbb3yk.gif
Cheers,
Mike

User and Company should have a common base class (they both have names and mail addresses); you can then link System to this base class. That's a common pattern in business modeling; see, for example, chapter one of Martin Fowler's book "Analysis Patterns".
EDIT: Or, if you think this makes more sense, you can use System as the base class itself, put the e-mail address there, and perhaps give System a better name such as LegalPerson or CorporateBody.
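As a rough illustration of the common-base-class idea, here is a minimal Python sketch. The base class name Party, the fields password_hash and registration_no, and the register method are all assumptions made for the example; the original diagram may relate System to the other objects differently.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """Hypothetical base class holding what User and Company share."""
    name: str
    email: str


@dataclass
class User(Party):
    password_hash: str = ""      # assumed field, not taken from the diagram


@dataclass
class Company(Party):
    registration_no: str = ""    # assumed field, not taken from the diagram


@dataclass
class System:
    """Links to the base class, so it accepts users and companies alike."""
    name: str
    parties: list = field(default_factory=list)

    def register(self, party: Party) -> None:
        self.parties.append(party)


system = System("auth")
system.register(User("Mike", "mike@example.com", "not-a-real-hash"))
system.register(Company("Acme", "info@example.com", "12345"))
```

System only ever deals with Party, so a third kind of party could be added later without touching it.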

Considering the password has a 1-to-1 relationship with the User, and is not keyed to any other tables, I'd suggest saving yourself an inner join and just making it another column in the property table. Otherwise, looks pretty good.

It's hard to evaluate the "rightness" of something without some metrics of comparison. The easiest metrics for class designs are queries.
Think up as many as you can of the queries that you will eventually want to run against this data. Write them down and see how the design supports them. If you're unhappy, try another design and see how the queries look then.

Related

Should I have a separate class for each database table?

I assumed this was good practice, for reasons such as it helps to clearly represent relationships between classes. However, according to this website it's not:
http://www.tonymarston.net/php-mysql/good-bad-oop.html#a5
So, is having a class for each database table good OO practice or not?
From a brief read of the article, its introduction says these points are 'complaints' made against another article the author had written. I believe the author is actually saying that a class per table is a good approach and explains why, defending it against remarks such as "This code leads to a problematic dependency between the DB and the class".
Having a class per table is certainly a good place to start from; however, like any pattern, it is not necessarily required in all situations. One example is a junction table, where perhaps a single class can handle interactions for numerous different junction tables. Likewise, you may have tables with a one-to-one relationship that fit the program better as a single class.

What is the best approach to develop and implement a superclass/subclass ED diagram for these two entities?

I was wondering if someone could help me decide what is the best way to develop two simple database entities. I have come up with two ways but I can't see the obvious reason why one would be better than the other.
(there is a mistake in GroupMessage entity two, the attribute message appears twice)
You have not provided a lot of background information, but it is clear that there are shared attributes between FriendMessage and GroupMessage. There is probably additional commonality between the Sender and Creator attributes, and likewise between the Receiver and Group attributes. That makes the first design the clear preference, if only on DRY grounds. I cannot think of a single reason or circumstance that would favor the second, completely disjoint, representation.

How to model data with unknown attributes?

What are good ways to model data that will need to be queried but where it's impossible to fully define up front?
For instance... say I want to model information about the countries of the world. Each country has a population, a flag and a list of languages, that's easy enough. But say we also want to model the win/loss record of their national baseball team and not all countries have one, of course. Or, we want to track the lineage of their kings & queens (again, obviously not applicable to most countries). Or, we decide we want to model the number of yurts the average clan member will erect in a lifetime.
Anyway, point is, we don't (and won't ever) know what's coming until it hits us. What approaches are there that are both scalable and query-able?
Is this, perhaps, a good use for a Document-centric database (MongoDB?) or perhaps some design pattern could be applied to the classic Relational database?
You can do that in a pure Relational Database, and enjoy the speed and power of Relational databases.
You need to use Sixth Normal Form, the proper method, with full integrity and control.
EAV is a subset of 6NF without the integrity or control, and is usually very badly implemented.
My answers to these questions provide a full treatment of the subject. The last one is particularly long due to the context and arguments raised.
EAV-6NF Answer One
EAV-6NF Answer Two
EAV-6NF Answer Three
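To make the contrast with EAV a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It only gestures at the idea described above (one narrow, typed, constrained table per optional attribute, keyed by the entity) and is not taken from the linked answers. All table names, column names and numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE country (
    country_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
-- One narrow table per optional fact: the key plus a single typed, checked column.
CREATE TABLE country_population (
    country_id INTEGER PRIMARY KEY REFERENCES country(country_id),
    population INTEGER NOT NULL CHECK (population >= 0)
);
CREATE TABLE country_baseball_wins (
    country_id INTEGER PRIMARY KEY REFERENCES country(country_id),
    wins       INTEGER NOT NULL CHECK (wins >= 0)
);
""")

conn.execute("INSERT INTO country VALUES (1, 'Japan'), (2, 'Mongolia')")
conn.execute("INSERT INTO country_baseball_wins VALUES (1, 12)")  # made-up number

# Countries without a baseball team simply have no row in that table.
rows = conn.execute("""
    SELECT c.name, w.wins
    FROM country c
    LEFT JOIN country_baseball_wins w USING (country_id)
    ORDER BY c.name
""").fetchall()

# Unlike a generic (entity, attribute, value) table, the database rejects bad rows.
try:
    conn.execute("INSERT INTO country_baseball_wins VALUES (99, 5)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

A new kind of attribute becomes a new small table rather than a new row in a catch-all table, so each attribute keeps its own type and constraints.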
All databases ought to be capable of evolving over time. If you have the right people and organisation in place then you should have no problem adding new attributes to the model as they arise.
You can apply the Entity-Attribute-Value model, but it is a PITA in Rails; I have used MongoDB and it is great for what you need.
If you're using Rails, you can use serialize :column_name in your model and it'll persist most objects successfully for you without any additional work. If you don't think you'll have a need for a schema-less NoSQL database, this is probably about the easiest thing you can do to get this functionality.
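class Country < ActiveRecord::Base
  # Persist a Ruby object (here a Hash) serialized into the data column.
  serialize :data
  def add_statistic(name, value)
    self.data ||= {}  # the column may still be nil on a new record
    data[name] = value
  end
  def get_statistic(name)
    data && data[name]
  end
end
Those methods are somewhat superfluous; they're there just to give you an example.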
The biggest downside to this type of system is if you have a need to search or query based on these things. Rails won't handle Country.find_by_win_loss_record_of_national_basketball_team for you, after all.
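The same trade-off can be sketched outside Rails. The following Python example (using only the standard sqlite3 and json modules) stores the open-ended statistics as serialized text, which is roughly what a serialized column amounts to. The table, function names and values are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT PRIMARY KEY, data TEXT NOT NULL)")


def save_country(name, stats):
    # The whole dict is stored as one opaque text value.
    conn.execute("INSERT INTO countries VALUES (?, ?)", (name, json.dumps(stats)))


def countries_with(stat_name):
    # The database cannot filter on keys inside the blob in this setup,
    # so every row is loaded and decoded in application code.
    cursor = conn.execute("SELECT name, data FROM countries")
    return sorted(name for name, blob in cursor if stat_name in json.loads(blob))


save_country("Japan", {"baseball_wins": 12})          # made-up numbers
save_country("Mongolia", {"yurts_per_lifetime": 3})

found = countries_with("baseball_wins")
```

Writing is trivial, but any search walks the full table, which is the downside described above.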

How can an object-oriented programmer get his/her head around database-driven programming?

I have been programming in C# and Java for a little over a year and have a decent grasp of object oriented programming, but my new side project requires a database-driven model. I'm using C# and Linq which seems to be a very powerful tool but I'm having trouble with designing a database around my object oriented approach.
My two main questions are:
How do I deal with inheritance in my database?
Let's say I'm building a staff rostering application and I have an abstract class, Event. From Event I derive abstract classes ShiftEvent and StaffEvent. I then have concrete classes Shift (derived from ShiftEvent) and StaffTimeOff (derived from StaffEvent). There are other derived classes, but for the sake of argument these are enough.
Should I have a separate table for ShiftEvents and StaffEvents? Maybe I should have separate tables for each concrete class? Both of these approaches seem like they would give me problems when interacting with the database. Another approach could be to have one Event table, and this table would have nullable columns for every type of data in any of my concrete classes. All of these approaches feel like they could impede extensibility down the road. More than likely there is a third approach that I have not considered.
My second question:
How do I deal with collections and one-to-many relationships in an object oriented way?
Let's say I have a Products class and a Categories class. Each instance of Categories would contain one or more products, but the products themselves should have no knowledge of categories. If I want to implement this in a database, then each product would need a category ID which maps to the categories table. But this introduces more coupling than I would prefer from an OO point of view. The products shouldn't even know that the categories exist, much less have a data field containing a category ID! Is there a better way?
Linq to SQL using a table per class solution:
http://blogs.microsoft.co.il/blogs/bursteg/archive/2007/10/01/linq-to-sql-inheritance.aspx
Other solutions (such as my favorite, LLBLGen) allow other models. Personally, I like the single table solution with a discriminator column, but that is probably because we often query across the inheritance hierarchy and thus see it as the normal query, whereas querying a specific type only requires a "where" change.
All said and done, I personally feel that mapping OO into tables is putting the cart before the horse. There have been continual claims that the impedance mismatch between OO and relations has been solved... and there have been plenty of OO specific databases. None of them have unseated the powerful simplicity of the relation.
Instead, I tend to design the database with the application in mind, map those tables to entities and build from there. Some see this as a loss of OO in the design process, but in my mind the data layer shouldn't reach high enough into your application to affect the design of the higher-order systems just because you used a relational model for storage.
I had the opposite problem: how to get my head around OO after years of database design. Come to that, a decade earlier I had the problem of getting my head around SQL after years of "structured" flat-file programming. There are just enough similarities between class and data entity decomposition to mislead you into thinking that they're equivalent. They aren't.
I tend to agree with the view that once you're committed to a relational database for storage, you should design a normalised model and compromise your object model where unavoidable. This is because you're more constrained by the DBMS than you are by your own code: building a compromised data model is more likely to cause you pain.
That said, in the examples given, you have choices: if ShiftEvent and StaffEvent are mostly similar in terms of attributes and are often processed together as Events, then I'd be inclined to implement a single Events table with a type column. Single-table views can be an effective way to separate out the sub-classes, and on most db platforms they are updatable. If the classes differ more in terms of attributes, then a table for each might be more appropriate. I don't think I like the three-table idea: "has one or none" relationships are seldom necessary in relational design. Anyway, you can always create an Event view as the union of the two tables.
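A minimal sketch of the single-table option, using Python's sqlite3 module: one events table with a type column, plus one view per sub-class. The column names, type values and sample rows are assumptions for illustration. Note that SQLite views are read-only, so this shows only the read side of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    event_type TEXT NOT NULL CHECK (event_type IN ('shift_event', 'staff_event')),
    starts_at  TEXT NOT NULL,
    ends_at    TEXT NOT NULL,
    staff_id   INTEGER            -- only meaningful for staff events
);
-- Each view exposes one sub-class and hides the columns it does not use.
CREATE VIEW shift_events AS
    SELECT event_id, starts_at, ends_at
    FROM events WHERE event_type = 'shift_event';
CREATE VIEW staff_events AS
    SELECT event_id, starts_at, ends_at, staff_id
    FROM events WHERE event_type = 'staff_event';
""")

conn.executemany(
    "INSERT INTO events (event_type, starts_at, ends_at, staff_id) VALUES (?, ?, ?, ?)",
    [
        ("shift_event", "2024-01-01T09:00", "2024-01-01T17:00", None),
        ("shift_event", "2024-01-02T09:00", "2024-01-02T17:00", None),
        ("staff_event", "2024-01-03T00:00", "2024-01-04T00:00", 7),
    ],
)

# Querying across the hierarchy is the plain table; one sub-class is a view.
all_events = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
shift_ids = [row[0] for row in
             conn.execute("SELECT event_id FROM shift_events ORDER BY event_id")]
staff_rows = conn.execute("SELECT event_id, staff_id FROM staff_events").fetchall()
```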
As to Product and Category, if one Category can have many Products, but not vice versa, then the normal relational way to represent this is for the product to contain a category id. Yes, it's coupling, but it's only data coupling, and it's not a mortal sin. The column should probably be indexed, so that it's efficient to retrieve all products for a category. If you're really horrified by the notion then pretend it's a many-to-many relationship and use a separate ProductCategorisation table. It's not that big a deal, although it implies a potential relationship that doesn't really exist and might mislead someone coming to the app in future.
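The plain one-to-many layout described here might look as follows in a small sqlite3 sketch. Table names, column names, the index name and the sample data are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
-- Index the foreign key so "all products for a category" stays cheap.
CREATE INDEX idx_product_category ON product(category_id);
""")

conn.execute("INSERT INTO category VALUES (1, 'Kitchen'), (2, 'Garden')")
conn.executemany(
    "INSERT INTO product (name, category_id) VALUES (?, ?)",
    [("Kettle", 1), ("Toaster", 1), ("Rake", 2)],
)

kitchen_products = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM product WHERE category_id = ? ORDER BY name", (1,)
    )
]
```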
In my opinion, these paradigms (the Relational Model and OOP) apply to different domains, making it difficult (and pointless) to try to create a mapping between them.
The Relational Model is about representing facts (such as "A is a person"), i.e. intangible things that have the property of being "unique". It doesn't make sense to talk about several "instances" of the same fact - there is just the fact.
Object Oriented Programming is a programming paradigm detailing a way to construct computer programs to fulfill certain criteria (re-use, polymorphism, information hiding...). An object is typically a metaphor for some tangible thing - a car, an engine, a manager or a person etc. Tangible things are not facts - there may be two distinct objects with identical state without them being the same object (hence the difference between equals and == in Java, for example).
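The distinction can be shown in a few lines of Python, where == plays the role of Java's equals and the is operator plays the role of Java's ==. The Car class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Car:
    model: str
    colour: str


a = Car("Mini", "red")
b = Car("Mini", "red")

same_state = a == b     # True: the generated __eq__ compares field values
same_object = a is b    # False: two distinct objects that happen to agree

# Facts behave like values in a set: stating the same fact twice adds nothing.
facts = {("Mini", "red"), ("Mini", "red")}
fact_count = len(facts)
```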
Spring and similar tools provide access to relational data programmatically, so that the facts can be represented by objects in the program. This does not mean that OOP and the Relational Model are the same, or that they should be confused with each other. Use the Relational Model to design databases (collections of facts) and OOP to design computer programs.
TL;DR version (Object-Relational impedance mismatch distilled):
Facts = the recipe on your fridge.
Objects = the content of your fridge.
Frameworks such as
Hibernate http://www.hibernate.org/
JPA http://java.sun.com/developer/technicalArticles/J2EE/jpa/
can help you to smoothly solve this problem of inheritance. e.g. http://www.java-tips.org/java-ee-tips/enterprise-java-beans/inheritance-and-the-java-persistenc.html
I also got to understand database design, SQL, and particularly the data centered world view before tackling the object oriented approach. The object-relational-impedance-mismatch still baffles me.
The closest thing I've found to getting a handle on it is this: looking at objects not from an object oriented programming perspective, or even from an object oriented design perspective, but from an object oriented analysis perspective. The best book on OOA that I got was written in the early 90s by Peter Coad.
On the database side, the best model to compare with OOA is not the relational model of data, but the Entity-Relationship (ER) model. An ER model is not really relational, and it doesn't specify the logical design. Many relational apologists think that is ER's weakness, but it is actually its strength. ER is best used not for database design but for requirements analysis of a database, otherwise known as data analysis.
ER data analysis and OOA are surprisingly compatible with each other. ER, in turn, is fairly compatible with relational data modeling and hence with SQL database design. OOA is, of course, compatible with OOD and hence with OOP.
This may seem like the long way around. But if you keep things abstract enough, you won't waste too much time on the analysis models, and you'll find it surprisingly easy to overcome the impedance mismatch.
The biggest thing to get over in terms of learning database design is this: data linkages like the foreign key to primary key linkage you objected to in your question are not horrible at all. They are the essence of tying related data together.
There is a phenomenon in pre database and pre object oriented systems called the ripple effect. The ripple effect is where a seemingly trivial change to a large system ends up causing consequent required changes all over the entire system.
OOP contains the ripple effect primarily through encapsulation and information hiding.
Relational data modeling overcomes the ripple effect primarily through physical data independence and logical data independence.
On the surface, these two seem like fundamentally contradictory modes of thinking. Eventually, you'll learn how to use both of them to good advantage.
My guess off the top of my head:
On the topic of inheritance I would suggest having 3 tables: Event, ShiftEvent and StaffEvent. Event has the common data elements kind of like how it was originally defined.
The last one can go the other way, I think. You could have a table with category ID and product ID with no other columns where for a given category ID this returns the products but the product may not need to get the category as part of how it describes itself.
The big question: how can you get your head around it? It just takes practice. You try implementing a database design, run into problems with your design, you refactor and remember for next time what worked and what didn't.
To answer your specific questions... this is a little bit of opinion thrown in, as in "how I would do it", not taking into account performance needs and such. I always start fully normalized and go from there based on real-world testing:
Table Event
    EventID
    Title
    StartDateTime
    EndDateTime
Table ShiftEvent
    ShiftEventID
    EventID
    ShiftSpecificProperty1
    ...
Table Product
    ProductID
    Name
Table Category
    CategoryID
    Name
Table CategoryProduct
    CategoryID
    ProductID
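The layout above could be tried out with a sketch like the following, written against Python's sqlite3 module. The table and column names follow the listing; the column types, keys, constraints and sample rows are assumptions added to make it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Event (
    EventID       INTEGER PRIMARY KEY,
    Title         TEXT NOT NULL,
    StartDateTime TEXT NOT NULL,
    EndDateTime   TEXT NOT NULL
);
-- Sub-class table: shift-specific columns plus a link back to the common row.
CREATE TABLE ShiftEvent (
    ShiftEventID           INTEGER PRIMARY KEY,
    EventID                INTEGER NOT NULL UNIQUE REFERENCES Event(EventID),
    ShiftSpecificProperty1 TEXT
);
CREATE TABLE Product  (ProductID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- Junction table: Product itself carries no category column.
CREATE TABLE CategoryProduct (
    CategoryID INTEGER NOT NULL REFERENCES Category(CategoryID),
    ProductID  INTEGER NOT NULL REFERENCES Product(ProductID),
    PRIMARY KEY (CategoryID, ProductID)
);
""")

conn.execute(
    "INSERT INTO Event VALUES (1, 'Morning shift', '2024-01-01T09:00', '2024-01-01T17:00')"
)
conn.execute("INSERT INTO ShiftEvent VALUES (1, 1, 'Front desk')")
conn.execute("INSERT INTO Product VALUES (1, 'Kettle')")
conn.execute("INSERT INTO Category VALUES (1, 'Kitchen')")
conn.execute("INSERT INTO CategoryProduct VALUES (1, 1)")

# A full shift is the common row joined to its shift-specific row.
shift = conn.execute("""
    SELECT e.Title, s.ShiftSpecificProperty1
    FROM ShiftEvent s JOIN Event e ON e.EventID = s.EventID
""").fetchone()

products_in_category = [
    row[0]
    for row in conn.execute("""
        SELECT p.Name
        FROM CategoryProduct cp JOIN Product p ON p.ProductID = cp.ProductID
        WHERE cp.CategoryID = ?
    """, (1,))
]
```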
Also reiterating what Pierre said - an ORM tool like Hibernate makes dealing with the friction between relational structures and OO structures much nicer.
There are several possibilities in order to map an inheritance tree to a relational model.
NHibernate, for instance, supports the 'table per class hierarchy', 'table per subclass' and 'table per concrete class' strategies:
http://www.hibernate.org/hib_docs/nhibernate/html/inheritance.html
For your second question:
You can create a 1:n relation in your DB, where the Products table of course has a foreign key to the Categories table.
However, this does not mean that your Product class needs to have a reference to the Category instance to which it belongs.
You can create a Category class, which contains a set or list of products, and you can create a Product class, which has no notion of the Category to which it belongs.
Again, you can easily do this using (N)Hibernate:
http://www.hibernate.org/hib_docs/reference/en/html/collections.html
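Independently of any particular ORM, the shape of such a mapping can be sketched by hand in Python. The foreign key lives only in the rows coming back from the database, while the Product object never sees it. The class layout, the load_categories helper and the row format are assumptions for illustration and are not (N)Hibernate's API.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    product_id: int
    name: str                      # deliberately no category field


@dataclass
class Category:
    category_id: int
    name: str
    products: list = field(default_factory=list)


def load_categories(rows):
    """Build Category objects from joined rows of the form
    (category_id, category_name, product_id, product_name)."""
    categories = {}
    for cat_id, cat_name, prod_id, prod_name in rows:
        category = categories.setdefault(cat_id, Category(cat_id, cat_name))
        category.products.append(Product(prod_id, prod_name))
    return list(categories.values())


# Rows as a join of the two tables might return them (made-up data).
rows = [
    (1, "Kitchen", 10, "Kettle"),
    (1, "Kitchen", 11, "Toaster"),
    (2, "Garden", 12, "Rake"),
]
categories = load_categories(rows)
```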
Sounds like you are discovering the Object-Relational Impedance Mismatch.
"The products shouldn't even know that the categories exist, much less have a data field containing a category ID!"
I disagree here. Rather than supplying a category ID yourself, I would let your ORM do it for you. Then in code you would have something like this (borrowing from NHibernate's and Castle's ActiveRecord):
class Category
{
    [HasMany]
    public IList<Product> Products { get; set; }
    // ...
}
class Product
{
    [BelongsTo]
    public Category ParentCategory { get; set; }
}
Then, to see which category a product is in, you'd just do something simple like:
Product.ParentCategory
I think you can set up the ORMs differently, but either way, on the inheritance question, I ask: why do you care? Either go about it with objects and forget about the database, or do it a different way. It might seem silly, but unless you really, really can't have a bunch of tables, or don't want a single table for some reason, why would you care about the database? For instance, I have the same setup with a few inheriting objects, and I just go about my business. I haven't looked at the actual database yet, as it doesn't concern me. What concerns me is the underlying SQL and the correct data coming back.
If you have to care about the database then you're going to need to either modify your objects or come up with a custom way of doing things.
I guess a bit of pragmatism would be good here. Mappings between objects and tables always have a bit of strangeness here and there. Here's what I do:
I use Ibatis to talk to my database (Java to Oracle). Whenever I have an inheritance structure where I want a subclass to be stored in the database, I use a "discriminator". This is a trick where you have one table for all the classes (types), with every field that any of them could possibly want to store. There is one extra column in the table, containing a string which Ibatis uses to decide which type of object it needs to return.
It looks funny in the database, and sometimes can get you into trouble with relations to fields which are not in all Classes, but 80% of the time this is a good solution.
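What the discriminator does on the object side can be imitated in plain Python. This is not Ibatis configuration, only a hand-rolled sketch of the same idea; the classes, column names and discriminator values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    title: str


@dataclass
class Shift(Event):
    location: str = ""


@dataclass
class StaffTimeOff(Event):
    reason: str = ""


# Discriminator value -> (class to build, columns that class actually uses).
MAPPING = {
    "shift": (Shift, ("event_id", "title", "location")),
    "time_off": (StaffTimeOff, ("event_id", "title", "reason")),
}


def row_to_object(row):
    """Pick the class from the row's type column and ignore unused columns."""
    cls, columns = MAPPING[row["type"]]
    return cls(**{column: row[column] for column in columns})


# Rows from the single wide table; columns a type does not use are NULL (None).
rows = [
    {"type": "shift", "event_id": 1, "title": "Morning",
     "location": "Front desk", "reason": None},
    {"type": "time_off", "event_id": 2, "title": "Leave",
     "location": None, "reason": "Holiday"},
]
objects = [row_to_object(row) for row in rows]
```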
Regarding your relation between category and product, I would add a categoryId column to the product, because that would make life really easy, both SQL-wise and mapping-wise. If you're really stuck on doing the "theoretically correct thing", you can consider an extra table which has only 2 columns, connecting the Categories and their products. It will work, but generally this construction is only used when you need many-to-many relations.
Try to keep it as simple as possible. Having an "academic solution" is nice, but it generally means a bit of overkill and is harder to refactor because it is too abstract (like hiding the relations between Category and Product).
I hope this helps.

How do I validate the class diagram for a given domain?

I am working on car dealership business domain model/UML class diagram.
I am new to modeling, so I would like to know how to validate the class diagram. It's very important for me to have an appropriate, if not 100 percent correct, class diagram to use in further development (use cases, etc.).
Is it possible to build a completely incorrect model? Or are there only appropriate and less appropriate models?
If I have a Customer associated with SalesTeam, modeling a customer being served by SalesTeam, is that wrong? I have seen examples of Customer being associated with Order, Order with ItemOrder and ItemOrder with ItemInventory, where the SalesTeam or Staff is associated with Order.
How do I validate my model and relationships?
To validate domain models, do the following.
Write use cases. During the writing, make sure you're using nouns and verbs in a consistent way. To be sure that your nouns make sense, be sure to record notes in the domain model.
Walk through each use case, following along on your domain model. Are the entities there? The relationships required for navigation? The attributes of each entity?
Since it's a domain model, try to avoid describing things as classes -- they're usually real-world entities.
For example "customer entity in direct relationship with sales team entity" is something you'll learn from the use cases. For example, customers are associated with orders, but the order is created by the sales team. So, you have two navigation paths between customer and order: direct and via the sales team. Both appear (to me) to be true.
You must compare your domain model with your use cases to be sure both agree.
The short answer is that this is not very important.
Use your domain class diagrams to keep a note of what you think is in the domain, that is all. It is not your god, and it will not hurt you to change it as you go.
Domain experts should help you to validate the domain model.
As far as validating the specific relationships, as you develop the model further and investigate the collaborations between objects you will discover more and different relationships. You will need to revisit the domain model often during your analysis and development.
I don't think it matters that it's 'correct' up front (i.e. before you move onto looking at use cases and further analysis), only that it is useful - it gives you a conceptual model of the problem and what the main classes involved are. It isn't going to be finished until the software is no longer being developed or maintained.
If it represents the way you view the problem right now, it's good enough for you to start further analysis. Revise it as your view of the problem changes and you learn more.