Using embedded classes to group related properties - oop

Let's say I'm designing a Person class.
Is it appropriate to use an embedded class to group similar properties of this person?
For instance, let's say a person has a weight, height, hair color, and eye color.
Instead of hanging these properties directly off of the person, what if I created a class called PersonPhysicalAttributes that had these properties?
So when you need to set a person's height, you'd use
personA.PhysicalAttributes.Height = 6.1;
Would you say this is a workable design?
EDIT:
One of the answers mentions grouping address properties in a separate class. I agree that this is a case where a separate class works well. The address class could also be reused in an employer, customer or vendor class.
However, I chose physical attributes as an example for a reason. My question is: does it make sense to break these out into another class when you're reasonably sure that the class won't be used in any other context, strictly for ease of IntelliSense/grouping?

It depends on the case. If you have a group of properties such that all or most of them change together (in short, the properties are strongly bound to each other), then it is appropriate to move them into another class. For example, a person has an address, which contains houseNo, street, city and zipcode. These properties can be associated with a Person, but they also make sense together as a group on their own. So including them directly in the Person class would be inappropriate. Instead, you should make a separate class for them called Address and associate Address with Person. But weight, eyeColor, hairColor and height are all independent properties. They do not naturally form a group. It is better that they remain on the Person class as individual, independent properties. If you force a subgroup such as the PhysicalAttributes one you mentioned, you will frequently run into situations where you violate the Law of Demeter.

This would be a violation of the Law of Demeter. In particular, by designing it this way, you are actually coupling the calling code to both your Person class and your PersonPhysicalAttributes class, thereby making later change to your code even harder.
I would avoid this approach, personally.
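If you still want the grouping internally, one way to keep callers from depending on it is to have Person delegate. A minimal Java sketch (the question's snippet looks like C#; the class layout and accessor names here are my own assumptions, not the asker's code):

```java
// Hypothetical sketch: the grouping class stays an internal detail of Person.
final class PhysicalAttributes {
    private double height;
    private double weight;

    double getHeight() { return height; }
    void setHeight(double height) { this.height = height; }
    double getWeight() { return weight; }
    void setWeight(double weight) { this.weight = weight; }
}

final class Person {
    // Callers never see this field, so it can be reshaped or removed later.
    private final PhysicalAttributes physical = new PhysicalAttributes();

    // Person forwards each call, so client code talks only to Person.
    double getHeight() { return physical.getHeight(); }
    void setHeight(double height) { physical.setHeight(height); }
    double getWeight() { return physical.getWeight(); }
    void setWeight(double weight) { physical.setWeight(weight); }
}
```

With this shape the caller writes personA.setHeight(6.1) and is coupled to Person only. The trade-off is that the grouping no longer shows up in autocompletion, which was the original motivation.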

Not only is it appropriate, it's desired to split a big class into smaller classes. However, consider your naming. It would be most sensible to compose Person of objects like
Physique { Height, Weight, ... }
Face { EyeColor, HairColor, ... }
Psyche { Iq, Mood, ... }

Related

Are the relationships in this UML class diagram correct?

The image shows the logistics of a warehouse, in a very simplified form. The concept: there are documents: ReceivingWayBill, DispatchingWaybill, ReplacementOrder.
They interact with the main classes: Warehouse, Counterparty, Item.
And there is the register class: ItemRemainsInWarehouse. A document is the confirmation of an operation (receiving, dispatching, and so on). The register simply stores information about the quantity of goods remaining.
Please set aside the many other problems with this scheme, such as the lack of generalization, getters and setters, and plenty more.
Can anyone tell me whether the relationships between the classes, which are drawn as aggregations everywhere, are placed correctly, or should the associations be considered in more detail?
It is hard (maybe impossible) to correct your whole model from the explanation provided, but here are some improvements.
You should put multiplicities on your relationships. They are important. In some relationships you have 1 (ReplacementOrder, Warehouse), and some of your relationships may be * (Item, ReceivingWayBill).
You put aggregation between your classes, and aggregation is a type of association. You can use plain associations too. You can find a lot of similar questions and answers that explain the differences between association and aggregation (and composition). See Question 1, Question 2 and Question 3. But I recommend this answer.
I think there is NOT a very significant difference between aggregation and association. See my example in this question.
Robert C. Martin says (see here):
Association represents the ability of one instance to send a message to another instance.
Aggregation is the typical whole/part relationship. This is exactly the same as an association with the exception that instances
cannot have cyclic aggregation relationships (i.e. a part cannot
contain its whole).
Therefore, some of your relationships are exactly aggregations (the relationships between Item and the other classes). Your Counterparty does not have a good API definition. Your other relationships are about using the Warehouse class. I think (just a guess) the other classes only use the Warehouse class's services (public methods). In this case, they can be associations. Otherwise, if they need an instance of Warehouse as a part, they are aggregations.
Aggregation is evil!
Read the UML specs about the two variants they introduced (p. 110):
none: Indicates that the Property has no aggregation semantics. [hear, hear!]
shared: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.
composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects (see the definition of parts in 11.2.3).
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
Now, that last sentence clearly indicates where you should use composite (!) aggregation: in security-related applications. When you delete a person record in a database you also need to delete all related entities. The often-used example of a car being composed of motor, tires, etc. does not really fit. The tires do not vanish when you "delete" the car, simply because you cannot delete it. Even worse is the use of shared aggregation, since it has no definition per definition (sic!).
So what should you do? Use multiplicities! That is what people usually want to show: there are 0..n, 1, etc. elements related to the class at the other side. Optionally, you name these using roles to make it explicit.
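To make the multiplicity point concrete, here is a rough Java sketch of how "1" and "*" ends commonly end up in code. The class names come from the question; the fields, constructor and accessor names are assumptions for illustration, and other mappings from UML to code are possible:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

final class Warehouse {
    final String name;
    Warehouse(String name) { this.name = name; }
}

final class Item {
    final String sku; // hypothetical attribute
    Item(String sku) { this.sku = sku; }
}

final class ReceivingWayBill {
    // Multiplicity 1: every waybill refers to exactly one warehouse.
    private final Warehouse warehouse;
    // Multiplicity *: a waybill may list any number of items.
    private final List<Item> items = new ArrayList<>();

    ReceivingWayBill(Warehouse warehouse) {
        this.warehouse = Objects.requireNonNull(warehouse, "warehouse");
    }

    void addItem(Item item) {
        items.add(Objects.requireNonNull(item, "item"));
    }

    Warehouse warehouse() { return warehouse; }

    // Read-only view so the role name "items" does not leak the list itself.
    List<Item> items() { return Collections.unmodifiableList(items); }
}
```

Note that nothing in this code says "aggregation" or "association"; the multiplicities and role names carry the information, which is the point made above.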
If you consider DispatchingWaybill and ReceivingWaybill it looks like those are association classes. With the right multiplicities (1-* / *-1) you can leave it this way. (Edit: note the little dots at the association's ends which tell that the class at the opposite has an attribute named after the role.)
Alternatively, attach each of them with a dashed line to an association between the classes they are currently connected to.

How to model OO scenario

I repeatedly run into a scenario similar to this:
A container business class that models a hierarchy.
A business class that participates in this hierarchy and is aggregated by the aforementioned class.
Let me give you an example.
A Map has Countries. Now the Map should know where each Country is, since its main responsibility, besides containing all countries, is to know the location and proximity of each. From this point of view, a method such as isNeighbour(Country A, Country B) seems like a correct addition to Map. However, each Country should also offer a method to tell whether another country is nearby, say spain.isNeighbour(italy). This is indeed useful. Now, if I don't want to duplicate functionality and responsibility, what approach should I take?
The current example I am working on is for my university: each course requires other courses and also blocks the next-level ones. The major contains all courses and dictates which course precedes which. Say I want to add a dependency of one course on another, e.g. to take Calculus 2 you need Calculus 1... Should I go calculus.addRequired(calculus2) and then pass it to the major object, or maybe computerScience.addRequired(calculus1, calculus2)...
I don't want to have both alternatives because that seems error-prone to me, but at the same time I want each course to be able to answer what its requirements are. I don't really know how to distribute responsibilities correctly.
First, there is no problem with the two calling each other.
You can have
boolean Map.isNeighbour(Country A, Country B) { return A.isNeighbour(B); }
or
boolean Country.isNeighbour(Country other) { return map.isNeighbour(this, other); }
The second seems to need a reference to a global map. The first makes Map look like a simple facade.
Second, you say it is persisted. It might also be a good idea to create a service that queries the DB with the relevant parameters. This can be either Map or some repository service. This will also allow you to query with only the identities of entities (e.g. countryId) instead of full objects.
I believe neither solution is better or worse. The only difference is where other developers expect the methods to be located. But when I think about it, this would mean Map will have all the responsibilities of Country, thus breaking SRP, especially if it is not a call-through to the Country method.
I would put the isNeighbour() method into Country.
Country would contain a map of neighbours. And then the container can call this method on the country instance in question.
This way the logic is maintained by the countries, and the container simply delegates the question to them.
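A minimal Java sketch of that delegation. The container is called WorldMap here only to avoid clashing with java.util.Map, and addNeighbour is a helper I am assuming, not something from the question:

```java
import java.util.HashSet;
import java.util.Set;

final class Country {
    private final String name;
    private final Set<Country> neighbours = new HashSet<>();

    Country(String name) { this.name = name; }

    String name() { return name; }

    // Borders are symmetric, so record the link on both sides.
    void addNeighbour(Country other) {
        neighbours.add(other);
        other.neighbours.add(this);
    }

    boolean isNeighbour(Country other) {
        return neighbours.contains(other);
    }
}

final class WorldMap {
    // The container keeps no adjacency data of its own; it only delegates.
    boolean isNeighbour(Country a, Country b) {
        return a.isNeighbour(b);
    }
}
```

Both entry points exist, but the adjacency data lives in one place, so they cannot disagree.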
In case of courses it is possible that Course-1 is required for Course-2 in Major-1, but not in Major-2. In this case I would introduce another class, e.g. CourseInMajor that would contain the required courses for a given course in a given Major.
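A rough Java sketch of the CourseInMajor idea, under the assumption that the Major is the single place where dependencies are recorded (the method names are illustrative):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class Course {
    final String title;
    Course(String title) { this.title = title; }
}

// The prerequisites of one course within one particular major.
final class CourseInMajor {
    private final Course course;
    private final Set<Course> required = new HashSet<>();

    CourseInMajor(Course course) { this.course = course; }

    Course course() { return course; }
    void addRequired(Course prerequisite) { required.add(prerequisite); }
    boolean requires(Course other) { return required.contains(other); }
}

final class Major {
    private final Map<Course, CourseInMajor> entries = new HashMap<>();

    // Single entry point for recording "course needs prerequisite".
    void addRequired(Course course, Course prerequisite) {
        entries.computeIfAbsent(course, CourseInMajor::new).addRequired(prerequisite);
    }

    boolean requires(Course course, Course prerequisite) {
        CourseInMajor entry = entries.get(course);
        return entry != null && entry.requires(prerequisite);
    }
}
```

The same Course object can then appear in two Major instances with different prerequisites, because the requirement data is stored per major rather than on the course.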

OO Design Encapsulation

I have a question with regard to encapsulation. As I understand it, encapsulation lets you hide the implementation details using private/protected data members and provides public methods and properties to operate on the data. The idea here is to prevent the direct modification of the data members by the class consumers.
But I have a concern with the property getters or other public methods which return private/protected data members. For example, if I have a class like this:
public class Inventory
{
    private List<Guitar> guitars = new List<Guitar>();
    public void AddGuitar(string serialnumber, string price)
    {
        Guitar guitar = new Guitar(serialnumber, price);
        guitars.Add(guitar);
    }
    public List<Guitar> GetGuitars()
    {
        return guitars;
    }
}
Now if a consumer of the Inventory class calls GetGuitars, they get the list of guitars maintained inside the Inventory class. The consumer can then modify the list: delete, add, or modify items. To me it looks like we are not encapsulating. I think I should return a copy of the guitar list from GetGuitars(). What do you think?
Is my understanding of encapsulation right?
Thanks
Encapsulating lists of objects can be achieved quite nicely by restricting access to them using a suitable interface.
I think you're right to control additions to your list via your AddGuitar method as you can exert control over what goes in. You can reinforce this design, IMHO, by altering GetGuitars to return IEnumerable instead of List.
This reduces the control the caller has on your list, whilst also being non-committal in returning an abstract type. This way your internal data structure can change without the public interface needing to also.
You are right. With a getter like that, clients are able to modify the list. If adding a guitar requires some special handling, this is not desirable. In this case you have two choices:
Return a copy of the list (as you already suggested).
Wrap it with ReadOnlyCollection within the getter.
Both cases should be documented in method description so that clients are not "surprised" when they attempt to modify the list externally.
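The two choices behave differently once the inventory changes. Here is a small Java analogue of the C# class from the question (Collections.unmodifiableList plays the role of ReadOnlyCollection; the two getter names are made up for the comparison):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Guitar {
    final String serialNumber;
    final String price;

    Guitar(String serialNumber, String price) {
        this.serialNumber = serialNumber;
        this.price = price;
    }
}

final class Inventory {
    private final List<Guitar> guitars = new ArrayList<>();

    void addGuitar(String serialNumber, String price) {
        guitars.add(new Guitar(serialNumber, price));
    }

    // Choice 1: a snapshot. The caller may change it freely, and later
    // additions to the inventory do not show up in it.
    List<Guitar> getGuitarsCopy() {
        return new ArrayList<>(guitars);
    }

    // Choice 2: a read-only view. It reflects later additions, and any
    // attempt to modify it throws UnsupportedOperationException.
    List<Guitar> getGuitarsView() {
        return Collections.unmodifiableList(guitars);
    }
}
```

Either way the internal list can only grow through addGuitar. Both options protect the list, not the elements: that is enough here only because this Guitar has final fields.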
If you want your List to be unmodifiable, why not use the AsReadOnly method: http://msdn.microsoft.com/en-us/library/e78dcd75.aspx
As for encapsulation: internal members should be readable and writable only through methods; the members themselves are not accessible from outside.
In terms of risk, it is indeed better if you return a copy of your list or make it unmodifiable (create a whole new unmodifiable list when you add a guitar, functional programming-style).
In terms of encapsulation, it would be better to get rid of the getGuitars() method and then Inventory class should offer the functionality associated with it ( for example, printInventoryReport() or whatever). This way, no client class needs to know at all how you store your guitars and you keep the related code into the Inventory class. The tradeoff is that this class gets bigger and every time you need something new from the guitar list you need to modify the Inventory.
I recommend a good article : http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html
It was quite incendiary back in the day, but I think there's a lot of truth in there.
And if you stay with the getter, a small tip would be to decide whether you need it to be a List or whether a Collection would do. Maybe even an Iterable! This way you reveal as little as possible about your implementation, which results in better encapsulation.
I would agree that returning the list leaves something to be desired in terms of encapsulation. You may want to consider writing a getter for individual items, or possibly an iterator. The list seems like an implementation detail, so other classes really have no business accessing it directly.
There are (at least) two issues here.
The first is about hiding the implementation. You could change the "guitars" field to an array or a database but you could leave the signature of the methods AddGuitar and getGuitars unchanged so client code wouldn't break.
The second is about whether or not you want to return a defensive copy of the guitar list or not. Once you have the list of guitars do you want to add and delete elements? Since you have a method to add guitars I would assume not.

Worker vs data class

I have a data class which encapsulates relevant data items in it. Those data items are set and get by users one by one when needed.
My confusion about the design has to do with which object should be responsible for handling the update of multiple properties of that data object. Sometimes an update operation will be performed which affects many properties at once.
So, which class should have the update() method? Is it the data class itself or another manager class? The update() method requires data exchange with many different objects, so I don't want to make it a member of the data class because I believe it should know nothing about the other objects required for update. I want the data class to be only a data structure. Am I thinking wrong? What would be the right approach?
My code:
class RefData
{
    Matrix mX;
    Vector mV;
    int mA;
    bool mB;
    getX();
    setB();
    update(); // affects almost every member attribute in the class, but requires relations with many different classes, which makes this class dependent on them.
}
or,
class RefDataUpdater
{
    update(RefData*); // something like this?
}
There is this really great section in the book Clean Code, by Robert C. Martin, that speaks directly to this issue.
And the answer is: it depends. It depends on what you are trying to accomplish in your design, and on whether you might have more than one data object that exhibits similar behavior.
First, your data class could be considered a Data Transfer Object (DTO). As such, its ideal form is simply a class without any public methods--only public properties -- basically a data structure. It will not encapsulate any behavior, it simply groups together related data. Since other objects manipulate these data objects, if you were to add a property to the data object, you'd need to change all the other objects that have functions that now need to access that new property. However, on the flip side, if you added a new function to a manager class, you need to make zero changes to the data object class.
So, I think often you want to think about how many data objects might have an update function that relates directly to the properties of that class. If you have 5 classes that contain 3-4 properties but all have an update function, then I'd lean toward having the update function be part of the "data-class" (which is more of an OO-design). But, if you have one data-class in which it is likely to have properties added to it in the future, then I'd lean toward the DTO design (object as a data structure)--which is more procedural (requiring other functions to manipulate it) but still can be part of an otherwise Object Oriented architecture.
All this being said, as Robert Martin points out in the book:
There are ways around this that are well known to experienced
object-oriented designers: VISITOR, or dual-dispatch, for example.
But these techniques carry costs of their own and generally return the
structure to that of a procedural program.
Now, in the code you show, you have properties with types of Vector, and Matrix, which are probably more complex types than a simple DTO would contain, so you may want to think about what those represent and whether they could be moved to separate classes--with different functions to manipulate--as you typically would not expose a Matrix or a Vector directly as a property, but encapsulate them.
As already written, it depends, but I'd probably go with an external support class that handles the update.
First of all, I'd like to know why you'd use such a method. I believe it's safe to assume that the method doesn't only call setter methods for a list of parameters it receives, but I'll consider this case as well.
1) the trivial updater method
In this case I mean something like this:
public update(a, b, c)
{
    setA(a);
    setB(b);
    setC(c);
}
In this case I'd probably not use such a method at all; I'd either define a macro for it or call the setters themselves. But if it must be a method, then I'd place it inside the data class.
2) the complex updater method
The method in this case doesn't only contain calls to setters, but it also contains logic. If the logic is some sort of simple property update logic, I'd try to put that logic inside the setters (that's what they are for in the first place), but if the logic involves multiple properties, I'd put it inside an external supporting class (or a business logic class, if an appropriate one already exists), since it's not a great idea to have logic reside inside data classes.
Developing clear code that can be easily understood is very important, and it's my belief that putting logic of any kind (except, say, setter logic) inside data classes won't help you achieve that.
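A small Java sketch of the external-updater variant. The question's code is C++-like; Sensor is a made-up stand-in for "the other objects required for update", and only two of the fields are modelled:

```java
// Plain data holder: it has no idea where new values come from.
final class RefData {
    private int a;
    private boolean b;

    int getA() { return a; }
    void setA(int a) { this.a = a; }
    boolean isB() { return b; }
    void setB(boolean b) { this.b = b; }
}

// Hypothetical collaborator that the update depends on.
interface Sensor {
    int read();
}

// The updater owns the dependencies and the multi-property logic.
final class RefDataUpdater {
    private final Sensor sensor;

    RefDataUpdater(Sensor sensor) { this.sensor = sensor; }

    void update(RefData data) {
        int reading = sensor.read();
        data.setA(reading);
        data.setB(reading > 0); // a rule spanning two properties lives here
    }
}
```

RefData compiles without knowing that Sensor exists, so adding more collaborators later changes only the updater.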
Edit
I just thought I'd add something else. Where to put such methods also depends on your class and the purpose it fulfills. If we're talking, for instance, about Business/Domain Object classes, and we're not using an Anemic Domain Model, these classes are allowed to (and should) contain behavior/logic.
On the other hand, if this data class is, say, an Entity (a persistence object) that is not also used in the Domain Model (a complex Domain Model), I would strongly advise against placing logic inside it. The same goes for data classes which "feel" like pure data objects (more like structs): don't pollute them, keep the logic outside.
I guess like everywhere in software, there's no silver bullet and the right answer is: it depends (upon the classes, what this update method is doing, what's the architecture behind the application and other application specific considerations).

How to solve cross references in OOP?

I have encountered this a couple of times now, and I wondered what the OO way to solve circular references is. By that I mean class A has class B as a member, and B in turn has class A as a member.
One example of this would be class Person that has Person spouse as a member.
Person jack = new Person("Jack");
Person jill = new Person("Jill");
jack.setSpouse(jill);
jill.setSpouse(jack);
Another example would be Product classes that have some Collection of other Products as a member. That collection could, for example, be products that people who are interested in this product might also be interested in, and we want to maintain that list on a per-product basis, not based on some shared attributes (e.g. we don't want to just display all other products in the same category).
Product pc = new Product("pc");
Product monitor = new Product("monitor");
Product tv = new Product("tv");
pc.setSeeAlso({monitor, tv});
monitor.setSeeAlso({pc});
tv.setSeeAlso(null);
(These products are just for making a point; the issue is not about whether or not certain products would relate to each other.)
Would this be bad design in OOP in general? Would/should all OOP languages allow this, or is it just bad practice? If it's bad practice, what would be the nicest way of solving this?
The examples you give are (to me, anyway) examples of reasonable OO design.
The cross-referencing issue you describe isn't an artifact of any design process but a real-life characteristic of the things you're representing as objects, so I don't see there's a problem.
What have you encountered that has given you the impression that this approach is bad-design?
Update 11 March:
In systems that lack garbage collection, where memory is managed explicitly, one common approach is to require all objects to have an owner: some other object responsible for managing the lifetime of that object.
One example is Delphi's TComponent class, which provides cascading support - destroy the parent component, and all owned components are also destroyed.
If you're working on such a system, the kinds of referential loop described in this question may be considered poor design because there's no clear owner, no one object responsible for managing lifetimes.
The way that I've seen this handled in some systems is to retain the references (because they properly capture the business concerns), and to add in an explicit TransactionContext object that owns everything loaded into the business domain from the database. This context object takes care of knowing which objects need to be saved, and cleans everything up when processing is complete.
It's not a fundamental problem in OO design. An example of a time it might become a problem is in graph traversal, for instance, finding the shortest path between two objects - you could potentially get into an infinite loop. However, that's something you would have to consider on a case-by-case basis. If you know there could be cross-references in a case like that, then code some checks in to avoid infinite loops (for instance, maintaining a set of visited nodes to avoid re-visiting). But if there's no reason it could be a problem (such as in the examples you gave in your question), then it's not bad at all to have such cross-references. And in many cases, as you've described, it's a good solution to the problem at hand.
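For the traversal case, a visited set is usually all it takes. A minimal Java sketch using the see-also example from the question (the hops method and the SeeAlsoGraph class are illustrative names, and seeAlso is exposed as a plain field for brevity):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

final class Product {
    final String name;
    final List<Product> seeAlso = new ArrayList<>();

    Product(String name) { this.name = name; }
}

final class SeeAlsoGraph {
    // Breadth-first search: number of see-also hops from start to target,
    // or -1 if target cannot be reached.
    static int hops(Product start, Product target) {
        Set<Product> visited = new HashSet<>();
        Queue<Product> frontier = new ArrayDeque<>();
        visited.add(start);
        frontier.add(start);
        int depth = 0;
        while (!frontier.isEmpty()) {
            // Process one whole level so that depth equals the hop count.
            for (int i = frontier.size(); i > 0; i--) {
                Product current = frontier.remove();
                if (current == target) {
                    return depth;
                }
                for (Product next : current.seeAlso) {
                    // add() returns false for nodes already seen, which is
                    // what stops the pc <-> monitor cycle from looping.
                    if (visited.add(next)) {
                        frontier.add(next);
                    }
                }
            }
            depth++;
        }
        return -1;
    }
}
```

Each product is queued at most once, so the search terminates however many cross-references the graph contains.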
I do not think this is an example of cross referencing.
Cross referencing usually pertains to this case:
class A
{
    public void MethodA(B objectB)
    {
        objectB.SomeMethodInB();
    }
}
class B
{
    public void MethodB(A objectA)
    {
        objectA.SomeMethodInA();
    }
}
In this case each object kind of "reaches in" to each other; A calls B, B calls A, and they become tightly coupled. This is made even worse if A and B are in different packages/namespaces/assemblies; in many cases those would create compile time errors as assemblies are compiled linearly.
The way to solve that is to have either object implement an interface with the desired method.
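A rough Java sketch of that fix. Notifiable and notifyDone are invented names; the point is only that A names an interface instead of naming B:

```java
// Lives alongside A (or in a shared package); it never mentions B.
interface Notifiable {
    void notifyDone(String message);
}

final class A {
    // A depends only on the interface, not on the concrete class B.
    void methodA(Notifiable target) {
        target.notifyDone("A finished");
    }
}

// B still depends on A, but the dependency now runs in one direction only.
final class B implements Notifiable {
    private String lastMessage;

    @Override
    public void notifyDone(String message) {
        lastMessage = message;
    }

    String lastMessage() { return lastMessage; }

    void methodB(A a) {
        a.methodA(this);
    }
}
```

A and Notifiable can now be compiled without B, which removes the build-order problem between packages or assemblies described above.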
In your case you only have one level of "reaching in":
public class Person
{
    public void setSpouse(Person person)
    { ... }
}
I do not think this is unreasonable, nor even a case of cross-referencing/circular references.
The main time this is a problem is if it becomes too confusing to cope with, or maintain, as it can become a form of spaghetti code.
However, to touch on your examples:
See Also is perfectly valid if this is a feature you need in your code. It is a simple list of pointers (or references) to other items a user may be interested in.
Similarly, it is perfectly valid to add spouse, as this is a simple real-world relationship that would not be confusing to someone maintaining your code.
I have always seen it as a potential code smell, or perhaps a warning to take a step back and rationalise what I am doing.
As for some systems finding recursive relationships in your code (mentioned in a comment above), these can come up regardless of this sort of design. I have recently worked on a metadata capture system that had recursive 'types' of relationships, i.e. columns being logically related to other columns. It needs to be handled by the code trying to parse your system.
I don't think the circular references as such are a problem.
However, putting all those relationships inside objects may add too much clutter, so you may instead want to represent them externally. E.g. you might use a hash table to store relationships between products.
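A minimal Java sketch of keeping the relationships outside the objects. SeeAlsoRegistry and its methods are hypothetical names; Product here carries no references to other products at all:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class Product {
    final String name;
    Product(String name) { this.name = name; }
}

// Holds the see-also relationships in a hash table, outside Product.
final class SeeAlsoRegistry {
    private final Map<Product, Set<Product>> related = new HashMap<>();

    void relate(Product from, Product to) {
        related.computeIfAbsent(from, key -> new LinkedHashSet<>()).add(to);
    }

    // Read-only view; products with no entries yield an empty set, not null.
    Set<Product> seeAlso(Product product) {
        return Collections.unmodifiableSet(
                related.getOrDefault(product, Collections.emptySet()));
    }
}
```

Since Product does not override equals or hashCode, the table is keyed by object identity, which is fine as long as each product is represented by a single instance.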
Referencing other objects is not bad OO design at all. What matters is the way state is managed within each object.
A good rule of thumb is the Law of Demeter. See the well-known paper on the LoD (the paperboy and the wallet).
One way to fix this is to refer to other object via an id.
e.g.
Person jack = new Person(new PersonId("Jack"));
Person jill = new Person(new PersonId("Jill"));
jack.setSpouse(jill.getId());
jill.setSpouse(jack.getId());
I'm not saying it is a perfect solution, but it will prevent circular references. You are using an id object instead of an object reference to model the relationship.
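Filled out a little, that could look like the Java sketch below. PersonRegistry is an assumption on my part: once you hold only an id, something has to turn it back into a Person when you need one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PersonId {
    private final String value;

    PersonId(String value) { this.value = Objects.requireNonNull(value); }

    // Value equality, so two ids built from the same string match.
    @Override
    public boolean equals(Object o) {
        return o instanceof PersonId && ((PersonId) o).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }
}

final class Person {
    private final PersonId id;
    private PersonId spouseId; // an id, not an object reference

    Person(PersonId id) { this.id = id; }

    PersonId getId() { return id; }
    PersonId getSpouseId() { return spouseId; }
    void setSpouse(PersonId spouseId) { this.spouseId = spouseId; }
}

// Hypothetical lookup that resolves an id to a Person on demand.
final class PersonRegistry {
    private final Map<PersonId, Person> people = new HashMap<>();

    void register(Person person) { people.put(person.getId(), person); }
    Person find(PersonId id) { return people.get(id); }
}
```

The object graph is now acyclic, at the cost of an extra lookup and the possibility of an id that resolves to nothing.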