I want to create an application for creating graphical documents. Each document consists of several geometric shapes (Ball, Brick, Cylinder, Cube).
So I created two diagrams for my application, as shown in this picture:
I want to know which diagram is better and why, and what the advantages and disadvantages of both approaches are.
Of course it depends on the requirements. But from a neutral standpoint the left one is definitely the better one, since GraphicDocument does not need to know the concrete form of its elements, only that they are shapes. So you can easily extend it without having to change GraphicDocument.
In other words: the left is loosely coupled while the right one is tightly coupled.
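To make the loose coupling concrete, here is a minimal sketch in Python. It assumes the left diagram has GraphicDocument holding a collection of an abstract Shape. The `describe` method and the `add` method are hypothetical names chosen for illustration; they are not taken from the diagrams.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """The only type GraphicDocument depends on."""

    @abstractmethod
    def describe(self) -> str:
        """Return a short label (hypothetical method, for illustration)."""


class Ball(Shape):
    def describe(self) -> str:
        return "Ball"


class Cube(Shape):
    def describe(self) -> str:
        return "Cube"


class GraphicDocument:
    """Knows that its elements are shapes, never which concrete shapes."""

    def __init__(self) -> None:
        self._shapes: list[Shape] = []

    def add(self, shape: Shape) -> None:
        self._shapes.append(shape)

    def describe(self) -> list[str]:
        return [shape.describe() for shape in self._shapes]


# A shape added later: GraphicDocument above is not modified.
class Cylinder(Shape):
    def describe(self) -> str:
        return "Cylinder"


doc = GraphicDocument()
for shape in (Ball(), Cube(), Cylinder()):
    doc.add(shape)
print(doc.describe())  # ['Ball', 'Cube', 'Cylinder']
```

In the tightly coupled variant, GraphicDocument would need one collection per concrete class, so adding Cylinder would mean editing GraphicDocument as well.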
As Thomas says, it depends, but the right diagram has the potential to be more specific. That diagram opens the door to expressing existential quantification, which is to say that there must be some number of Cubes in a Document, for example. If you don't care to express that, the left diagram is clearly more expressive with fewer symbols.
I've come across resources that depict UML diagrams with verbs like 'wrote' to describe how one class uses another. Does this convention exist in UML, and is it overkill to add it to my designs?
ex:
Yes, this is a common convention: the name over the association (Wrote) is the name of the association. You may add the solid triangle to show the order of reading.
But often the associations are shown without name, or without the triangle, if this information is not important for the understanding of the diagram. Adding this systematically in the diagram might make it more difficult to read and give a feeling of information overload. So, up to you to find the right balance in your specific case.
Just trying to summarize a few experiences:
Using the name/triangle notation is often advantageous when working with business stakeholders. In that case the triangle is mandatory, because its absence can lead to confusion. Not so in the above example, but it should be a modeling rule set in the domain.
Applying roles/multiplicities is practical when moving over to technical aspects. At that stage the label is not important any more, as it can be guessed from the role names. So the best is to have diagrams for business people having just the labels/triangles and ones for techies containing roles/multiplicities.
If for any reason you want both notations, make sure that you have enough space to distinguish between labels and role names. That makes dense diagrams impossible.
Like in a Chinese Restaurant: if there's all you can eat please listen to your stomach.
I am currently enrolled in the Online Oracle Academy Database Design course, which briefly delves into the use of Matrix Diagrams to make sure all possible relationships are covered in an Entity Relationship Diagram.
The following practice problem was supplied by the course, instructing us to complete a matrix diagram for four entities: RUNNER, CITY FOR RACE, RACE TYPE, and RUNNING EVENT
The following is the supplied solution from the course:
I was able to find the following alternative solution for the same problem:
My concern stems from just how radically different these two ERDs are from each other. Is it better practice to come up with as many relationships as possible, even going so far as to fill out all boxes in the Matrix Diagram, or to do something more akin to the first solution? Or is this simply an issue which should be handled based on the current situation and the needs of the business that we are creating the ERD for?
They are not radically different. The second ERD has all the relationships of the first, it just expands due to the presumption that the knowledge that:
a runner has visited a city (if for instance you want to know if runners actually made it to a race after having registered for it)
an event may consist of multiple race types, implying a different model for what an event actually is
or that a runner has chosen a race type (I'm having a more difficult time thinking of a sensible reason here, but there are possibilities)
is important to whatever it is this database is supporting.
If you do not have such a reason to track a relationship, it's wasted effort to do so. It's good to keep future possibilities in mind when considering whether you have a reason, but Ockham's Razor is very much a guiding principle in schema design.
Occasionally I run into situations where all of the conditions below are true for two highly similar, but not quite identical entities or objects. This makes it difficult for me to decide how to model them, either on the database end or in terms of object modeling. I'm going to try to spell out the issue and my questions in detail, because I've found it to be a really difficult modeling problem to define. I'm trying to do both data and object modeling with these entities, so I'm going to use the terminology of both disciplines a little loosely.
1) Both entities share many identical properties, but have a few unique ones not found in the other.
2) One is not a supertype or subtype of another.
3) The overlap is not due to object inheritance.
4) The objects are used for different purposes in the same domain, but often in close proximity in any workflow. This frequently leads those with even moderate domain knowledge to confuse the entities. On the other hand, this fine separation in purposes leads to greater differences between the methods of the associated objects than their properties.
5) In some situations it may be possible to create bridge tables on the database side to express M2M relationships between the entities. Nevertheless, they have so many properties (or columns, on the database side) in common that it might make sense to store them in the same table.
Some cases in point I've run into include:
1) "Product vs. Project confusion" - especially in software marketing, where Products and Projects share many of the same properties. Normally a product will have multiple projects associated with it, but it is also unusual yet conceivable for a project to be used in multiple products.
2) The subtle differences between Features and Components in software development. A feature is a means of supplying a benefit, from the customer's point of view, while a component is a means of implementing features on the developer's side. This is a really subtle distinction which nevertheless counts for a lot. For further discussion see Rod Maupin's post at http://www.installationdeveloper.com/347/features-and-components-101/
3) Templates vs. Types in a lot of different problem domains. For example, when identifying types of guitars through a TypeID column, the TypeTable it refers to would probably have columns corresponding to colors, string sizes, body shapes, etc. A template, on the other hand, is something you'd build a guitar from, so it would have different methods than a Type, perhaps linked to an "Apply Template" or "Make Item from Template" menu command. Nevertheless, it would have many of the same columns or properties as a Type, such as color, shape, string size etc. This distinction raises its head in thousands of different object types and templates in many problem domains, not just this narrow example. To complicate matters further, in some situations it might be helpful to associate multiple Templates with a particular Type, and/or vice-versa.
I haven't run into this problem of overlapping entities often, but when it does occur, it becomes a real bottleneck and leads to a lot of wasted time refactoring the data and object models. I've read books on both topics and done a lot of searches of data/object modeling webpages about the issue, but have yet to see it discussed. The only hits for "overlap" and "data model" I could find on StackOverflow were for differentiating between similar columns in one table or entity, not across tables or entities. My questions are:
1) Is there a formal name for this issue?
2) Is there a simple shortcut or trick of the trade to identify such overlapping entities at the beginning of the modeling process, rather than much further down the line, when late recognition makes refactoring an issue?
3) How should such overlapping entities be handled? I assume that in terms of OOP, they ought to have separate objects since their methods tend to be different. Inheriting one from the other would be awkward though. A more difficult question would be whether or not it would make sense to use separate tables on the database end. Combining them might require a complex series of views plus waste storage space when the properties/columns they don't have in common are left null. Storing them in separate tables might also be wasteful though, if the common properties could be stored in single columns.
It's a tricky issue to even recognize, let alone handle. I have only a moderate amount of experience with data/object modeling, so the input of someone who really knows what they're doing would be helpful. Thanks :)
Your question concerns both database modeling aspects and object-oriented (programming) modeling aspects. Let’s start from an abstract point of view.
You say:
1) Both entities share many identical properties, but have a few unique ones not found in the other.
2) One is not a supertype or subtype of another.
and:
3) The overlap is not due to object inheritance.
But note that inheritance should not be confused with subtyping, even if they are often tied together! See for instance Inheritance (object-oriented programming) in Wikipedia, where this statement is supported by two citations [1,2].
In other words, even if A is not a subtype of B, and B is not a subtype of A, you can find a C from which both A and B inherit attributes.
So you may or may not think of this C as an “abstract supertype” of both A and B; in any case it is convenient to consider it a common ancestor, at least from a database point of view, so that the common attributes can be factored out into a “supertable”.
Then, from the object-oriented programming side, you can see A and B as subtypes of C or simply as two different things, depending on the characteristics of your Object-Relational Mapping tools, on the problem at hand, etc.
Of course, this way of modelling things does not prevent A and B, in addition to inheriting from C, from having one or more relations between them, as in the Products-Projects example that you gave.
So, here is my answer to your three final questions:
1) Yes, it is called inheritance.
2) You can check if two entities have a significant number of common attributes.
3) You can model them in the database with a common table, that perhaps has some common property like integrity constraints, and with two tables that have a foreign key to it. Of course this rule is not to be applied blindly; like all human rules, it has exceptions. From the programming point of view, on the other hand, you can decide whether or not to model them both with a supertype. This depends on many factors and should be decided on a case-by-case basis.
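As a rough sketch of the "common table plus two tables with a foreign key" layout, here is a small example using Python's built-in sqlite3 module and the Product/Project case from the question. The table name `work_item` and all column names are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Hypothetical "supertable" holding the attributes both entities share.
    CREATE TABLE work_item (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        owner TEXT NOT NULL
    );
    -- Each specific table adds only its own columns
    -- and points at the shared row through a foreign key.
    CREATE TABLE product (
        work_item_id INTEGER PRIMARY KEY REFERENCES work_item(id),
        list_price   REAL NOT NULL
    );
    CREATE TABLE project (
        work_item_id INTEGER PRIMARY KEY REFERENCES work_item(id),
        deadline     TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO work_item VALUES (1, 'Editor', 'alice')")
conn.execute("INSERT INTO product VALUES (1, 49.0)")
conn.execute("INSERT INTO work_item VALUES (2, 'Editor rewrite', 'bob')")
conn.execute("INSERT INTO project VALUES (2, '2030-01-01')")

# Reading a product means joining its own columns with the shared ones.
products = conn.execute(
    "SELECT w.name, w.owner, p.list_price "
    "FROM product AS p JOIN work_item AS w ON w.id = p.work_item_id"
).fetchall()
print(products)  # [('Editor', 'alice', 49.0)]
```

No column is left null for attributes an entity does not have, and a many-to-many bridge table between `product` and `project` could still be added alongside this layout.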
This is a generic question, I don't know if it belongs to Programming or StackOverflow.
I'm writing a little simulation. Without going very deep into its details, consider that many kinds of entities are involved. They correspond to objects, since I'm using an OOP language.
There are Guys that inhabit the world simulated
There are Maps
A map has many Lots, that are pieces of land with some characteristics
There are Tribes (guys belong to tribes)
There is a generic class called Position to locate the elements
There are Bots in control of tribes that move guys around
There is a World that represents the world simulated
and so on.
If the simulated world was laid down as a database, the objects would be tables with lots of references, but in memory I have to use a different strategy. So, for example, a Tribe has an array of Guys as a property. The World has an array of Bots, of Tribes, of Maps. A Map has a Dictionary whose key is a Position and whose value is a Lot. A Guy has a Position that is where he stands.
The way I lay down such connections is pretty much arbitrary. For example, I could have an array of Guys in the World, or an Array of guys per Lot (the guys standing on a piece of land), or an array of Guys per Bot (with the Guys controlled by the bot).
Doing so, I also have to pass around a lot of objects. For example, a Bot must have information about the Map and opponent Guys to decide how to move its Guys.
As said, in a database I'd have a Guys table connected to the Lots table (indicating its position), to the Tribe table (indicating which Tribe it belongs to) and so it would also be easy to query "All the guys in Position [1, 5]". "All the Guys of Tribe 123". "All the Guys controlled by Bot B standing on the Lot b34 not belonging to the Tribe 456" and so on.
I've worked with APIs where to get the simplest information you had to make an instance of the CustomerContextCollection and pass it to CustomerQueryFactory to get back a CustomerInPlaceQuery to... When people criticize OOP and cite verbose abstractions that soon smell ridiculous, that's what I mean. I want to avoid such things and having to rely on deep abstractions and (anti-pattern) abstract contexts.
The question is: what is the preferred, clean way to manage entities and collections of entities that are deeply linked in multiple ways?
It depends on your definition of "clean". In my case, I define clean as: I can implement desired behavior in an obvious, efficient manner.
Building OOP software is not a data modeling exercise. I'd suggest stepping back a little. What does each one of those objects actually do? What methods are you going to implement?
Just because "guys are in a lot" doesn't mean that the lot object needs a collection of guys; it only needs one if there are operations on a lot that affect all the guys in it. And even then, it doesn't necessarily need a collection of guys - it needs a way to get the guys in the lot. This may be an internally stored collection, but it could also be a simple method that calls back into the world to find guys matching a criterion. The implementation of that lookup should be transparent to anyone.
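A minimal sketch of that idea in Python, assuming positions are plain `(x, y)` tuples. The method names `guys_at` and `guys` are hypothetical, and the linear scan is just one possible implementation hidden behind the method.

```python
class Guy:
    def __init__(self, name: str, position: tuple[int, int]) -> None:
        self.name = name
        self.position = position


class World:
    def __init__(self) -> None:
        self.guys: list[Guy] = []

    def guys_at(self, position: tuple[int, int]) -> list[Guy]:
        # A linear scan for now; it could become an index later
        # without any caller noticing.
        return [guy for guy in self.guys if guy.position == position]


class Lot:
    def __init__(self, world: World, position: tuple[int, int]) -> None:
        self._world = world
        self.position = position

    def guys(self) -> list[Guy]:
        # No stored collection: the lot asks the world when needed.
        return self._world.guys_at(self.position)


world = World()
world.guys += [Guy("Chris", (1, 5)), Guy("Pat", (0, 0))]
lot = Lot(world, (1, 5))
print([guy.name for guy in lot.guys()])  # ['Chris']
```

Because the lot never stores its own list, moving a guy only means changing his position; there is no second collection to keep in sync.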
From the tenor of your questions, it seems like you're thinking of this from a "how do I generate reports" perspective. Step back and think of the behaviors you're trying to implement first.
Another thing I find extremely valuable is to differentiate between Entities and Values. Entities are objects where identity matters - you may have two guys, both named "Chris", but they are two different objects and remain distinct despite having the same "key". Values, on the other hand, act like ints. From your above list, Position sounds a lot like a value - Position(0,0) is Position(0,0) regardless of which chunk of memory (identity) those bits are stored in. The distinction has a big effect on how you compare and store values vs. entities. For example, your Guy objects (entities) would store their Position as a simple member variable.
I've found a great reference for how to think about such things is Eric Evans's "Domain-Driven Design" book. He's focused on business systems, but I've found the discussions very valuable for how you think about building OO systems in general.
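The entity/value distinction can be sketched in Python as follows. Using a frozen dataclass for the value is one common way to get equality by content; the field names `x` and `y` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Value: equality and hashing come from the coordinates."""

    x: int
    y: int


class Guy:
    """Entity: the default identity-based equality is kept on purpose."""

    def __init__(self, name: str, position: Position) -> None:
        self.name = name
        self.position = position  # stored as a plain member variable


a = Guy("Chris", Position(0, 0))
b = Guy("Chris", Position(0, 0))
print(a.position == b.position)  # True: same value
print(a == b)                    # False: two distinct guys
```

Because Position is immutable and hashable, it can also serve directly as a dictionary key, for example in a map from Position to Lot.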
I would say that no 'true' answer exists to your core question -- a best way to manage collections of entities that are linked in multiple ways. It really depends on the kind of application (simulation) - here are some thoughts:
Is execution time important?
If this is the case, there is really no way around analyzing how your simulator will iterate over (query) the objects from the pool: sketch out the basic simulation loop and check which kinds of events require iterating over which kinds of model entities (I assume you are developing a discrete-event simulation?). Then you should organize the data structures in a way that optimizes the most frequent/time-consuming events (as opposed to "laying down the connections arbitrarily"). Additionally, you may want to use special data structures (such as k-d trees) to organize entities with properties that you need to query often (e.g., position data). For some typical problems, e.g. collision detection, there are also many approaches to solving them efficiently (so look for suitable libraries/frameworks, e.g. for multi-agent simulation).
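As a small illustration of organizing data around a frequent query, here is a sketch of an exact-position index in Python. This is a plain dictionary, not a k-d tree, so it only answers "who stands exactly here?" and not range or nearest-neighbour queries. The class and method names are hypothetical, and guys are represented by integer ids for brevity.

```python
from collections import defaultdict


class PositionIndex:
    """Maps a grid position to the ids of the guys standing on it."""

    def __init__(self) -> None:
        self._by_position: dict[tuple[int, int], set[int]] = defaultdict(set)
        self._position_of: dict[int, tuple[int, int]] = {}

    def place(self, guy_id: int, position: tuple[int, int]) -> None:
        # Remove the guy from his previous cell, if any, then add him to the new one.
        old = self._position_of.get(guy_id)
        if old is not None:
            self._by_position[old].discard(guy_id)
        self._by_position[position].add(guy_id)
        self._position_of[guy_id] = position

    def at(self, position: tuple[int, int]) -> set[int]:
        # .get avoids creating empty entries for positions never used;
        # a copy is returned so callers cannot corrupt the index.
        return set(self._by_position.get(position, ()))


index = PositionIndex()
index.place(1, (1, 5))
index.place(2, (1, 5))
index.place(1, (2, 5))      # guy 1 moves away
print(index.at((1, 5)))     # {2}
```

The lookup is a single dictionary access instead of a scan over all guys, at the cost of updating the index on every move.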
How flexible do you want to make it?
If you really want to make it super-flexible and really don't want to decide on the hierarchy of the model entities, why not just use an in-memory database? As you already said, databases are easily applicable to your problem (and you can easily save the model state, which may also be useful).
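For instance, a minimal sketch with Python's built-in sqlite3 module and an in-memory database could look like this. The schema (a single `guy` table with `tribe_id`, `x`, `y` columns) is an assumption made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE guy (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        tribe_id INTEGER NOT NULL,
        x        INTEGER NOT NULL,
        y        INTEGER NOT NULL
    );
    -- Indexes for the two lookups used below.
    CREATE INDEX guy_by_position ON guy (x, y);
    CREATE INDEX guy_by_tribe ON guy (tribe_id);
    """
)
db.executemany(
    "INSERT INTO guy (name, tribe_id, x, y) VALUES (?, ?, ?, ?)",
    [("Ann", 123, 1, 5), ("Bob", 456, 1, 5), ("Cy", 123, 0, 0)],
)

# "All the guys in Position [1, 5]"
at_1_5 = [row[0] for row in db.execute(
    "SELECT name FROM guy WHERE x = ? AND y = ? ORDER BY name", (1, 5))]
# "All the Guys of Tribe 123"
tribe_123 = [row[0] for row in db.execute(
    "SELECT name FROM guy WHERE tribe_id = ? ORDER BY name", (123,))]

print(at_1_5)     # ['Ann', 'Bob']
print(tribe_123)  # ['Ann', 'Cy']
```

New kinds of queries then need only a new SELECT, not a new collection threaded through the object graph.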
How clean is clean enough?
If you want to be absolutely sure that the rest of your simulator is not affected by the design choices you make with regard to your model representation, hide it behind an interface (say, ModelWorld), which defines methods for all the types of queries your simulator may invoke (this is orthogonal to the second point and may help with the first point, i.e. figuring out what kind of access pattern your simulator exhibits). This allows you to change implementations easily, without affecting any other parts of the simulator code.
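A sketch of such an interface in Python might look like the following. Only the name ModelWorld comes from the text above; the two query methods, the `ListBackedWorld` implementation and the `count_opponents_at` helper are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Guy:
    name: str
    tribe: str
    position: tuple[int, int]


class ModelWorld(ABC):
    """Every query the simulator may ask, and nothing about storage."""

    @abstractmethod
    def guys_at(self, position: tuple[int, int]) -> list[Guy]: ...

    @abstractmethod
    def guys_of_tribe(self, tribe: str) -> list[Guy]: ...


class ListBackedWorld(ModelWorld):
    """Simplest possible implementation: one list, linear scans."""

    def __init__(self, guys: list[Guy]) -> None:
        self._guys = list(guys)

    def guys_at(self, position: tuple[int, int]) -> list[Guy]:
        return [g for g in self._guys if g.position == position]

    def guys_of_tribe(self, tribe: str) -> list[Guy]:
        return [g for g in self._guys if g.tribe == tribe]


def count_opponents_at(world: ModelWorld, position: tuple[int, int], own_tribe: str) -> int:
    # Simulator code depends only on the ModelWorld interface.
    return sum(1 for g in world.guys_at(position) if g.tribe != own_tribe)


world = ListBackedWorld([
    Guy("Ann", "red", (1, 5)),
    Guy("Bob", "blue", (1, 5)),
    Guy("Cy", "red", (0, 0)),
])
print(count_opponents_at(world, (1, 5), "red"))  # 1
```

Swapping `ListBackedWorld` for an indexed or database-backed implementation would leave `count_opponents_at` unchanged.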
So I have ended up in a situation with a project I'm part of that has two types of lists (at the moment, at least): booking and shift list. Both lists are made so that we have a List object containing logic for both of them, and a separate Shift and Booking list objects for individual features.
The List object is starting to be overwhelming. It has pagination, editor capabilities, selection and double click to open popups, mouse hover popups, as well as filtering. I would like to refactor the code into something more maintainable, perhaps split into smaller units. What design patterns should I be thinking about here?
If it matters, the List object contains over 3k lines of OO JavaScript code.
Well, in fact it is overwhelming. There is no simple answer and no single design pattern to choose. I would begin by applying the "separation of concerns" design principle: one class or set of functions does only one thing. That will help to reduce the complexity. Then you can apply structural design patterns. To begin, you can just use delegation. In your case the decorator design pattern can fit, as you can "decorate" the basic list with functionalities depending on the usage ...
Before thinking about design patterns, think about separation of concerns to divide your code into small understandable parts. Then use some design patterns to link them all.
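To illustrate delegation and decoration, here is a small sketch. It is written in Python rather than the project's JavaScript, and the class names (`BasicList`, `FilteredList`, `PagedList`) and the single `rows` method are assumptions, not the structure of the actual List object.

```python
class BasicList:
    """Core concern only: holding the rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def rows(self):
        return list(self._rows)


class FilteredList:
    """Decorator adding filtering; delegates storage to the wrapped list."""

    def __init__(self, inner, predicate):
        self._inner = inner
        self._predicate = predicate

    def rows(self):
        return [row for row in self._inner.rows() if self._predicate(row)]


class PagedList:
    """Decorator adding pagination on top of any object with rows()."""

    def __init__(self, inner, page_size, page=0):
        self._inner = inner
        self._page_size = page_size
        self._page = page

    def rows(self):
        start = self._page * self._page_size
        return self._inner.rows()[start:start + self._page_size]


# Features are combined per usage instead of living in one 3k-line object.
bookings = BasicList(range(1, 11))
even_bookings = FilteredList(bookings, lambda n: n % 2 == 0)
second_page = PagedList(even_bookings, page_size=2, page=1)
print(second_page.rows())  # [6, 8]
```

Each class handles one concern, so a booking list and a shift list could each stack only the decorators they need.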
Good luck !
my2c