Let's say I have a class to model a city. Its characteristics are the following:
It has only two properties "name" and "population", both private, that are set in the constructor.
It has getters for these properties, but not setters.
I don't want any user of this class to set the properties, I want them to use a public .edit() method.
This method needs to open a form for entering the new name and population of the city, i.e. a view. Then, if I have a view, I would like to implement the MVC pattern, so the idea would be that the controller receives the .edit() call, renders the view, retrieves the data back, and sends it to the model so that it changes its state.
But, if I do so, I have to change the properties of the city model from private to public. So, if any user instantiates my class, she/he can directly change the properties.
So, the philosophical question: Isn't that breaking the encapsulation?
EDIT Just to make it more explicit:
This city_instance.edit() method should be the only way to mutate the object.
Besides, I see that part of my problem comes from the misunderstanding that a model is an object (you can read that in PHP MVC frameworks), when it is actually a different abstraction: a layer that groups the business logic (domain objects plus, I guess, more things).
Disclaimer: I don't really understand where you are proposing the .edit() method to be implemented, so it would help if you could clarify that a little bit.
The first thing to consider here is that in the bulleted list of your question you seem to imply that a City instance acts like an immutable object: it takes its instance variables in the constructor and doesn't allow anybody on the outside to change them. However, you later state that you actually want to create a way to visually edit a City instance. These two requirements are clearly going to create some tension, since they are more or less opposites.
If you go the MVC approach, by separating the view from the model you have two main choices:
Treat your City objects as immutable and, instead of editing an instance when the values are changed in the form, throw away the original object and create a new one.
Provide a way to mutate an existing City instance.
The first approach keeps your model intact if you actually consider a City as an immutable object. For the second one there are many different ways to go:
The most standard way is to provide, in the City class, a mutator. This can have the shape of independent setters for each property or a common message (I think this is the .edit() method you mentioned) to alter many properties at once by taking an array. Note that here you don't take a form object as a parameter, since models should not be aware of the views. If you want your view to take note of internal changes in the model, you use the Observer pattern.
Use "friend" classes for controllers. Some languages allow for friend classes to access an object's internals. In this case you could create a controller that is a friend class of your model that can make the connection between the model and the view without having to add mutators to your model.
Use reflection to accomplish something similar to the friend classes.
The first of these three approaches is the only language-agnostic choice. Whether that breaks encapsulation or not is difficult to say, since the requirements themselves would be conflicting (it would basically mean wanting a model, separated from the view, that can be altered by the user but that doesn't allow the model itself to be changed from the outside). I would however agree that separating the model from the view promotes having an explicit mutation mechanism if you want mutable instances.
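As a rough Java sketch of the first option (a single mutator on the model), assuming a hypothetical City.edit that takes plain values rather than a form object, and a hypothetical CityController that translates form fields into that call. The field names "name" and "population" and the validation rules are assumptions made for illustration:

```java
import java.util.Map;

// Hypothetical model: fields stay private and edit() is the only mutator.
final class City {
    private String name;
    private long population;

    City(String name, long population) {
        edit(name, population);
    }

    String getName() { return name; }
    long getPopulation() { return population; }

    // Validates and changes both properties at once; knows nothing about views.
    void edit(String newName, long newPopulation) {
        if (newName == null || newName.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (newPopulation < 0) {
            throw new IllegalArgumentException("population must not be negative");
        }
        this.name = newName;
        this.population = newPopulation;
    }
}

// Hypothetical controller: turns raw form fields into one call on the model.
final class CityController {
    void handleEditForm(City city, Map<String, String> formFields) {
        String name = formFields.get("name");
        long population = Long.parseLong(formFields.get("population"));
        city.edit(name, population);
    }
}
```

The properties never become public here: the controller can only go through edit(), exactly like any other client of the class.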
HTH
NOTE: I'm referring to MVC as it applies to Web applications. MVC can apply to many kinds of apps, and it's implemented in many kinds of ways, so it's really hard to say MVC does or does not do any specific thing unless you are talking strictly about something defined by the pattern, and not a particular implementation.
I think you have a very specific view of what "encapsulation" is, and that view does not agree with the textbook definition of encapsulation, nor does it agree with the common usage of it. There is no definition of "encapsulation" I can find that requires that there be no setters. In fact, since setters are in and of themselves methods that can be used to "edit" the object, it's kind of a silly argument.
From the Wikipedia entry (note where it says "like getter and setter"):
In general, encapsulation is one of the four fundamentals of OOP (object-oriented programming). Encapsulation is to hide the variables or something inside a class, preventing unauthorized parties to use. So the public methods like getter and setter access it and the other classes call these methods for accessing.
http://en.wikipedia.org/wiki/Encapsulation_(object-oriented_programming)
Now, that's not to say that MVC doesn't break encapsulation, I'm just saying that your idea of what Encapsulation is is very specific and not particularly canonical.
Certainly, there are a number of problems that using Getters and Setters can cause, such as returning lists that you can then change directly outside of the object itself. You have to be careful (if you care) to keep your data hidden. You can also replace a collection with another collection, which is probably not what you intend.
The Law of Demeter is more relevant here than anything else.
But all of this is really just a red herring anyway. MVC is only about the GUI, and the GUI should be as simple as possible. It should have almost no logic in either the view or the controller. You should be using simple view models to deserialize your form data into a simple structure, which can then be applied to any business architecture you like (if you don't want setters, then create your business layer with objects that don't use setters and use mutators).
There is little need for complex architecture in the UI layer. The UI layer is more of a boundary and gateway that translates the flat form and command nature of HTTP to whatever business object model you choose. As such, it's not going to be purely OO at the UI level, because HTTP isn't.
This is called an impedance mismatch, which is often associated with ORMs, because object models do not map easily to relational models. The same is true of HTTP to business objects. You can think of MVC as a corollary to an ORM in that respect.
Reading the wikipedia entry about God Objects, it says that a class is a god object when it knows too much or does too much.
I see the logic behind this, but if it's true, then how do you couple every different class? Don't you always use a master class for connecting window management, DB connections, etc?
The main function/method may know about the existence of the windows, databases, and other objects. It may perform over-arching tasks like introducing the model to the controller.
But that doesn't mean it manages all the little details. It probably doesn't know anything about how the database or windows are implemented.
If it did, it could be accused of being a God object.
A god object is an object that contains references, directly or indirectly, to most if not all objects within an application. As the question observes, it is almost impossible to avoid having a god object in an application. Some object must hold references to the various subsystems: UI, database, communications, business logic, etc. Note that the god object need not be application-defined. Many frameworks have built-in god objects with names like "application context", "application environment", "session", "activator", etc.
The issue is not whether a god object exists, but rather how it is used. I will illustrate with an extreme example...
Let's say that in my application I want to standardize how many decimal places of precision to show when displaying numbers. However, I want the precision to be configurable. I create a class whose responsibility is to convert numbers to strings:
class NumberFormatter {
    ...
    String format(double value) {
        int decimalPlaces = getConfiguredPrecision();
        return formatDouble(value, decimalPlaces);
    }

    int getConfiguredPrecision() {
        return /* what ??? */;
    }
}
The question is, how does getConfiguredPrecision figure out what to return? One way would be to give NumberFormatter a reference to the global application context which it stores in a member field called _appContext. Then we could write:
return _appContext.getPreferenceManager().getNumericPreferences().getDecimalPlaces();
By doing this, we have just made NumberFormatter into a god object as well! Why? Because now we can (indirectly) reference virtually any object in the application through its _appContext field. Is this bad? Yes, it is.
I'm going to write a unit test for NumberFormatter. Let's set up the parameters... it needs an application context?! WTF, that has 57 methods I need to mock. Oh, it only needs the pref manager... WTF, I have to mock 14 methods! Numeric prefs!?! Screw it, the class is simple enough, I don't need to test it...
Let's say that the application context had another method, getDatabaseManager(). Last week we were using SQL, so the method returned an SQL database object. But this week, we've decided to change to a NoSQL database and the method now returns a new type. Is NumberFormatter affected by the change? Hmmm, I can't remember... yeah, it might be, I see it takes an application context in the constructor... let me open the source and take a look... nope, we're in luck: it only accesses getPreferenceManager()... now let's check the other 93 classes that take an application context as a parameter...
This same scenario occurs if a change is made to the preferences manager, or the numeric preferences object. The moral of the story is that an object should only hold references to the things that it needs to perform its job, and only those things. In the case of NumberFormatter, all it needs to know is a single integer -- the number of decimal places. It could be created directly by the application god object who knows the magic number (or the pref manager or better still, numeric prefs), without turning the formatter into a god object itself. Furthermore, any components that need to format numbers could be given a formatter instead of the god object. Wins all around.
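A minimal Java sketch of that alternative, assuming the formatter is simply handed the number of decimal places in its constructor. The use of String.format and Locale.ROOT is an illustrative choice, not something the original code specifies:

```java
import java.util.Locale;

// Sketch: the formatter depends only on the single value it needs,
// not on an application context.
final class NumberFormatter {
    private final int decimalPlaces;

    NumberFormatter(int decimalPlaces) {
        if (decimalPlaces < 0) {
            throw new IllegalArgumentException("decimalPlaces must be >= 0");
        }
        this.decimalPlaces = decimalPlaces;
    }

    String format(double value) {
        // Locale.ROOT keeps the decimal separator independent of the machine's locale.
        return String.format(Locale.ROOT, "%." + decimalPlaces + "f", value);
    }
}
```

Whoever already holds the numeric preferences reads the integer once and passes it in, so a unit test needs nothing more than new NumberFormatter(2).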
So, to summarize, the problem is not the existence of a god object but rather the act of conferring god-like status to other objects willy-nilly.
Incidentally, the design principle that tackles this problem head-on has become known as the Law of Demeter. Or "when paying at a restaurant, give the server your money not your wallet."
In my experience this most often occurs when you're dealing with code that is the product of "develop as you go" project management (or lack thereof): a project that is not thought through and planned, where object responsibilities are loose and not delegated properly. In these scenarios you find a "god-object" being the catch-all for code that doesn't have any obvious organization or delegation.
It is not the interconnectedness or coupling of the different classes that is the problem with god-objects. It's the fact that a god-object can often take on most if not all of the responsibilities of its derived children, and that its defined responsibilities are fairly unpredictable (to anyone other than the developer).
Simply knowing about "multiple" classes doesn't make one a God; knowing about multiple classes in order to solve a problem that should be split into several sub-problems does make one a God.
I think the focus should be on whether a problem should be split into several sub-problems, not on the number of classes a given object knows about (as you pointed out, sometimes knowing about several classes is necessary).
Gods are over-hyped.
I had a bunch of objects which were responsible for their own construction (get properties from network message, then build). By construction I mean setting frame sizes, colours, that sort of thing, not literal object construction.
The code got really bloated and messy when I started adding conditions to control the building algorithm, so I decided to separate the algorithm into a "Builder" class, which essentially gets the properties of the object, works out what needs to be done and then applies the changes to the object.
The advantage of having the builder algorithm separate is that I can wrap/decorate it, or override it completely. The object itself doesn't need to worry about how it is built; it just creates a builder and 'decorates' the builder with the extra functionality that it needs to get the job done.
I am quite happy with this approach except for one thing... Because my Builder does not inherit from the object itself (object is large and I want run-time customisation), I have to expose a lot of internal properties of the object.
It's like employing a builder to rebuild your house. He isn't a house himself but he needs access to the internal details, he can't do anything by looking through the windows. I don't want to open my house up to everyone, just the builder.
I know objects are supposed to look after themselves, and in an ideal world my object (house) would build itself, but I am refactoring the build portion of this object only, and I need a way to apply building algorithms dynamically, and I hate opening up my objects with getters and setters just for the sake of the Builder.
I should mention I'm working in Obj-C++ so lack friend classes or internal classes. If the explanation was too abstract I'd be happy to clarify with something a little more concrete. Mostly just looking for ideas or advice about what to do in this kind of situation.
Cheers folks,
Sam
EDIT: is it a good approach to declare a
@interface House (StuffTheBuilderNeedsAccessTo)
category inside Builder.h? That way I suppose I could declare the properties the builder needs and put synthesizers inside House.mm. Nobody would have access to the properties unless they included the Builder header...
That's all I can think of!
I would suggest using Factory pattern to build the object.
You can search for "Factory" on SO and you'll get a number of questions related to it.
Also see the Builder pattern.
You might want to consider using a delegate. Add a delegate method (and a protocol for the supported methods) to your class. The objects of the Builder class can be used as delegates.
The delegate can implement methods like calculateFrameSize (which returns a frame size) etc. The returned value of the delegate can be stored as an ivar. This way the implementation details of your class remain hidden. You are just outsourcing part of the logic.
There is in fact a design pattern called, suitably enough, Builder, which tries to solve the problem of creating different configurations for a certain class. Check that out. Maybe it can give you some ideas?
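The delegate idea can be sketched as follows. The question is about Objective-C++, so this Java version is only an illustration of the shape; HouseBuildDelegate, House.build and the read-only getFrameSize accessor are hypothetical names, with calculateFrameSize taken from the description above:

```java
// Hypothetical protocol: the only thing the house asks of its builder.
interface HouseBuildDelegate {
    int calculateFrameSize(int requestedSize);
}

final class House {
    private int frameSize;                  // stays private, no setter exposed
    private final HouseBuildDelegate delegate;

    House(HouseBuildDelegate delegate) {
        this.delegate = delegate;
    }

    // The house asks the delegate for a value and stores it itself,
    // so the builder never touches the house's internals.
    void build(int requestedSize) {
        this.frameSize = delegate.calculateFrameSize(requestedSize);
    }

    int getFrameSize() { return frameSize; }
}
```

Different builders can then be swapped in at run time by passing a different delegate, without opening up any property of the house for writing.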
But the underlying problem is still there; the builder needs to have access to the properties of the object it is building.
I don't know Obj-C++, so I don't know if this is possible, but this sounds like a problem for Categories. Expose only the necessary methods to your house in the declaration of the house itself, create a category that contains all the private methods you want to keep hidden.
What about the other way around, using multiple inheritance, so your class is also a Builder? That would mean that the bulk of the algorithms could be in the base class, and be extended to fit the needs of your specific House. It is not very beautiful, but it should let you abstract most of the functionality.
I often find myself needing reference to an object that is several objects away, or so it seems. The options I see are passing a reference through a middle-man or just making something available statically. I understand the danger of global scope, but passing a reference through an object that does nothing with it feels ridiculous. I'm okay with a little bit passing around, I suppose. I suspect there's a line to be drawn somewhere.
Does anyone have insight on where to draw this line?
Or a good way to deal with the problem of distributing references amongst dependent objects?
Use the Law of Demeter (with moderation and good taste, not dogmatically). If you're coding a.b.c.d.e, something IS wrong -- you've nailed forevermore the implementation of a to have a b which has a c which... EEP!-) One or at the most two dots is the maximum you should be using. But the alternative is NOT to plump things into globals (and ensure thread-unsafe, buggy, hard-to-maintain code!), it is to have each object "surface" those characteristics it is designed to maintain as part of its interface to clients going forward, instead of just letting poor clients go through such unending chains of nested refs!
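A small Java sketch of what "surfacing" can look like, using hypothetical Order, Customer and Address classes invented for this example. Instead of clients writing order.getCustomer().getAddress().getCity(), each object exposes the one thing its clients need:

```java
final class Address {
    private final String city;
    Address(String city) { this.city = city; }
    String city() { return city; }
}

final class Customer {
    private final Address address;
    Customer(Address address) { this.address = address; }

    // Surfaces what clients need without handing out the Address itself.
    String shippingCity() { return address.city(); }
}

final class Order {
    private final Customer customer;
    Order(Customer customer) { this.customer = customer; }

    // One dot for the caller: order.shippingCity().
    String shippingCity() { return customer.shippingCity(); }
}
```

If Customer later stores its address differently, only Customer changes; callers of Order.shippingCity() are unaffected.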
This smells of an abstraction that may need some improvement. You seem to be violating the Law of Demeter.
In some cases a global isn't too bad.
Consider, you're probably programming against an operating system's API. That's full of globals, you can probably access a file or the registry, write to the console. Look up a window handle. You can do loads of stuff to access state that is global across the whole computer, or even across the internet... and you don't have to pass a single reference to your class to access it. All this stuff is global if you access the OS's API.
So, when you consider the number of global things that often exist, a global in your own program probably isn't as bad as many people try and make out and scream about.
However, if you want to have very nice OO code that is all unit testable, I suppose you should be writing wrapper classes around any access to globals, whether they come from the OS or are declared by yourself, to encapsulate them. This means your class that uses this global state can get references to the wrappers, and they could be replaced with fakes.
Hmm, anyway. I'm not quite sure what advice I'm trying to give here, other than say, structuring code is all a balance! And, how to do it for your particular problem depends on your preferences, preferences of people who will use the code, how you're feeling on the day on the academic to pragmatic scale, how big the code base is, how safety critical the system is and how far off the deadline for completion is.
I believe your question is revealing something about your classes. Maybe the responsibilities could be improved? Maybe moving some code would solve problems?
Tell, don't ask.
That's how it was explained to me. There is a natural tendency to call classes to obtain some data. Taken too far, asking too much typically leads to heavy "getter sequences". But there is another way. I must admit it is not easy to find, but it improves gradually, both in a specific code base and in the coder's habits.
Class A wants to perform a calculation, and asks for B's data. Sometimes, it is appropriate that A tells B to do the job, possibly passing some parameters. This could replace B's "getName()", used by A to check the validity of the name, by an "isValid()" method on B.
"Asking" has been replaced by "telling" (calling a method that executes the computation).
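A minimal Java sketch of that replacement, with hypothetical classes Account (playing B) and Registration (playing A). The validity rule itself, a non-blank name, is an assumption made only for the example:

```java
// Plays the role of B: it owns the name and the rule about the name.
final class Account {
    private final String name;

    Account(String name) { this.name = name; }

    // "Tell": B performs the check on its own data; no getName() is needed.
    boolean isValid() {
        return name != null && !name.isBlank();
    }
}

// Plays the role of A: it no longer pulls the name out of B to inspect it.
final class Registration {
    boolean accept(Account account) {
        return account.isValid();
    }
}
```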
For me, this is the question I ask myself when I find too many getter calls. Gradually, the methods find their place in the correct object, and everything gets a bit simpler: I have fewer getters and fewer calls to them. I have less code, and it carries more meaning and aligns better with the functional requirements.
Move the data around
There are other cases where I move some data. For example, if a field moves two objects up, the length of the "getter chain" is reduced by two.
I believe nobody can find the correct model at first.
I first think about it (using hand-written diagrams is quick and a big help), then code it, then think again facing the real thing... Then I code the rest, and any smells I feel in the code, I think again...
Split and merge objects
If a method on A needs data from C, with B as a middle man, I can check whether A and C have something in common. Possibly, A or a part of A could become C (possible splitting of A, merging of A and C) ...
However, there are cases where I keep the getters of course.
But it's less likely a long chain will be created.
A long chain will probably get broken by one of the techniques above.
I have three patterns for this:
Pass the necessary reference to the object's constructor -- the reference can then be stored as a data member of the object, and doesn't need to be passed again; this implies that the object's factory has the necessary reference. For example, when I'm creating a DOM, I pass the element name to the DOM node when I construct the DOM node.
Let things remember their parent, and get references to properties via their parent; this implies that the parent or ancestor has the necessary property. For example, when I'm creating a DOM, there are various things which are stored as properties of the top-level DomDocument ancestor, and its child nodes can access those properties via the reference which each one has to its parent.
Put all the different things which are passed around as references into a single class, and then pass around just that one class instance as the only thing that's passed around. For example, there are many properties required to render a DOM (e.g. the GDI graphics handle, the viewport coordinates, callback events, etc.) ... I put all of these things into a single 'Context' instance which is passed as the only parameter to the methods of the DOM nodes to be rendered, and each method can get whichever properties it needs out of that context parameter.
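The first and third patterns can be sketched together in Java. DomNode and RenderContext are hypothetical stand-ins for the classes described above; the context here carries only a viewport width and an output log, in place of the graphics handle and callbacks the real one would hold:

```java
import java.util.ArrayList;
import java.util.List;

// Pattern 3: everything the render pass needs travels in one object.
final class RenderContext {
    final int viewportWidth;
    final List<String> output = new ArrayList<>();

    RenderContext(int viewportWidth) { this.viewportWidth = viewportWidth; }
}

final class DomNode {
    private final String elementName;               // pattern 1: given once, at construction
    private final List<DomNode> children = new ArrayList<>();

    DomNode(String elementName) { this.elementName = elementName; }

    DomNode add(DomNode child) {
        children.add(child);
        return this;
    }

    // Each node takes whatever it needs from the single context parameter.
    void render(RenderContext context) {
        context.output.add(elementName + " width=" + context.viewportWidth);
        for (DomNode child : children) {
            child.render(context);
        }
    }
}
```

Adding a new rendering property then means adding a field to the context, not another parameter on every render method.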
I have encountered this a couple of times now, and I wondered what the OO way to solve circular references is. By that I mean class A has class B as a member, and B in turn has class A as a member.
One example of this would be class Person that has Person spouse as a member.
Person jack = new Person("Jack");
Person jill = new Person("Jill");
jack.setSpouse(jill);
jill.setSpouse(jack);
Another example would be Product classes that have some Collection of other Products as a member. That collection could for example be products that people who are interested in this product might also be interested in, and we want to maintain that list on a per-product basis, not based on some shared attributes (e.g. we don't want to just display all other products in the same category).
Product pc = new Product("pc");
Product monitor = new Product("monitor");
Product tv = new Product("tv");
pc.setSeeAlso(List.of(monitor, tv));   // assumes java.util.List
monitor.setSeeAlso(List.of(pc));
tv.setSeeAlso(null);
(these products are just for making a point, the issue is not about whether or not certain products would relate to each other)
Would this be bad design in OOP in general? Would/should all OOP languages allow this, or is it just bad practice? If it's bad practice, what would be the nicest way of solving this?
The examples you give are (to me, anyway) examples of reasonable OO design.
The cross-referencing issue you describe isn't an artifact of any design process but a real-life characteristic of the things you're representing as objects, so I don't see there's a problem.
What have you encountered that has given you the impression that this approach is bad design?
Update 11 March:
In systems that lack garbage collection, where memory management is explicitly managed, one common approach is to require all objects to have an owner - some other object responsible for managing the lifetime of that object.
One example is Delphi's TComponent class, which provides cascading support - destroy the parent component, and all owned components are also destroyed.
If you're working on such a system, the kinds of referential loop described in this question may be considered poor design because there's no clear owner, no one object responsible for managing lifetimes.
The way that I've seen this handled in some systems is to retain the references (because they properly capture the business concerns), and to add in an explicit TransactionContext object that owns everything loaded into the business domain from the database. This context object takes care of knowing which objects need to be saved, and cleans everything up when processing is complete.
It's not a fundamental problem in OO design. An example of a time it might become a problem is in graph traversal, for instance, finding the shortest path between two objects - you could potentially get into an infinite loop. However, that's something you would have to consider on a case-by-case basis. If you know there could be cross-references in a case like that, then code some checks in to avoid infinite loops (for instance, maintaining a set of visited nodes to avoid re-visiting). But if there's no reason it could be a problem (such as in the examples you gave in your question), then it's not bad at all to have such cross-references. And in many cases, as you've described, it's a good solution to the problem at hand.
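The visited-set check can be sketched in Java as a breadth-first search. Node and Graphs.shortestHops are hypothetical names for this example; the map of hop counts doubles as the set of visited nodes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class Node {
    final String name;
    final List<Node> links = new ArrayList<>();   // may point back, forming cycles

    Node(String name) { this.name = name; }
}

final class Graphs {
    // Returns the number of hops from start to goal, or -1 if goal is unreachable.
    static int shortestHops(Node start, Node goal) {
        Map<Node, Integer> hops = new HashMap<>();   // also serves as the visited set
        Deque<Node> queue = new ArrayDeque<>();
        hops.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            if (current == goal) {
                return hops.get(current);
            }
            for (Node next : current.links) {
                if (!hops.containsKey(next)) {       // skip nodes already seen
                    hops.put(next, hops.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

Without the containsKey check, two nodes that reference each other would keep re-enqueueing one another forever.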
I do not think this is an example of cross referencing.
Cross referencing usually pertains to this case:
class A
{
    public void MethodA(B objectB)
    {
        objectB.SomeMethodInB();
    }
}

class B
{
    public void MethodB(A objectA)
    {
        objectA.SomeMethodInA();
    }
}
In this case each object kind of "reaches in" to each other; A calls B, B calls A, and they become tightly coupled. This is made even worse if A and B are in different packages/namespaces/assemblies; in many cases those would create compile time errors as assemblies are compiled linearly.
The way to solve that is to have either object implement an interface with the desired method.
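One way this could look, sketched in Java with a hypothetical Notifiable interface. B depends only on the interface, so the dependency between the two concrete classes runs in one direction:

```java
// Hypothetical interface that carries the one method B needs to call back.
interface Notifiable {
    void notifyDone(String message);
}

final class B {
    // B knows nothing about A, only about the interface.
    void doWork(Notifiable caller) {
        caller.notifyDone("B finished");
    }
}

final class A implements Notifiable {
    private String lastMessage;

    void methodA(B b) {
        b.doWork(this);
    }

    @Override
    public void notifyDone(String message) {
        lastMessage = message;
    }

    String lastMessage() { return lastMessage; }
}
```

The interface can live in the same package or assembly as B, so B compiles without any reference to A.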
In your case you only have one level of "reaching in":
public class Person
{
    public void setSpouse(Person person)
    { ... }
}
I do not think this is unreasonable, nor even a case of cross-referencing/circular references.
The main time this is a problem is if it becomes too confusing to cope with, or maintain, as it can become a form of spaghetti code.
However, to touch on your examples;
See Also is perfectly valid if this is a feature you need in your code - it is a simple list of pointers (or references) to other items a user may be interested in.
Similarly, it is perfectly valid to add spouse, as this is a simple real-world relationship that would not be confusing to someone maintaining your code.
I have always seen it as a potential code smell, or perhaps a warning to take a step back and rationalise what I am doing.
As for some systems finding recursive relationships in your code (mentioned in a comment above), these can come up regardless of this sort of design. I have recently worked on a metadata capture system that had recursive 'types' of relationships - i.e Columns being logically related to other columns. It needs to be handled by the code trying to parse your system.
I don't think the circular references as such are a problem.
However, putting all those relationships inside objects may add too much clutter, so you may instead want to represent them externally. E.g. you might use a hash table to store relationships between products.
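A short Java sketch of the external representation, assuming a hypothetical SeeAlsoIndex that keeps the relationships in a hash map keyed by product:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class Product {
    final String name;
    Product(String name) { this.name = name; }
}

// Hypothetical index: relationships live here, not inside Product.
final class SeeAlsoIndex {
    private final Map<Product, Set<Product>> related = new HashMap<>();

    void relate(Product from, Product to) {
        related.computeIfAbsent(from, key -> new LinkedHashSet<>()).add(to);
    }

    // Read-only view; empty when nothing has been recorded for the product.
    Set<Product> seeAlso(Product product) {
        return Collections.unmodifiableSet(related.getOrDefault(product, Set.of()));
    }
}
```

Product keeps no reference to other products at all, so the object graph itself contains no cycles even when pc relates to monitor and monitor relates back to pc.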
Referencing other objects is not a real bad OO design at all. It's the way state is managed within each object.
A good rule of thumb is the Law of Demeter. Look at this excellent paper on the LoD (the Paperboy and the Wallet).
One way to fix this is to refer to other object via an id.
e.g.
Person jack = new Person(new PersonId("Jack"));
Person jill = new Person(new PersonId("Jill"));
jack.setSpouse(jill.getId());
jill.setSpouse(jack.getId());
I'm not saying it is a perfect solution, but it will prevent circular references. You are using an ID object instead of an object reference to model the relationship.
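To make the idea concrete, here is a Java sketch under some assumptions: PersonId is a small value object with equals and hashCode, and a hypothetical PersonRegistry resolves an id back to a Person when the actual object is needed:

```java
import java.util.HashMap;
import java.util.Map;

// Value object: two ids with the same text are equal.
final class PersonId {
    private final String value;

    PersonId(String value) { this.value = value; }

    @Override
    public boolean equals(Object other) {
        return other instanceof PersonId && ((PersonId) other).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }
}

final class Person {
    private final PersonId id;
    private PersonId spouseId;      // an id, not a reference to another Person

    Person(PersonId id) { this.id = id; }

    PersonId getId() { return id; }
    void setSpouse(PersonId spouseId) { this.spouseId = spouseId; }
    PersonId getSpouseId() { return spouseId; }
}

// Hypothetical lookup used to turn an id back into an object.
final class PersonRegistry {
    private final Map<PersonId, Person> people = new HashMap<>();

    void add(Person person) { people.put(person.getId(), person); }
    Person find(PersonId id) { return people.get(id); }
}
```

The trade-off is an extra lookup, registry.find(jack.getSpouseId()), wherever the spouse object itself is required.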