In "The Unified Modelling Language User's Guide" by Bochs et. al. there is in 'Chapter 4. Classes' there (sub)section titled "Modeling the Distribution of Responsibilities in a System". It talks about balancing responsibilities, which can lead to splitting or coalescing of classes.
Could you give me example how such "balancing of responsibilities" might look like?
I think "balancing of responsibilities" happens when classes are neither too small nor too large. If responsibilities are not well balanced, then the object model has a few large classes that have too many responsibilities. Or even worse, there is a single object that does all the work. This is sometimes called the "God object". It is considered an anti-pattern. Google "god object".
A related consequence of having a God object is having many small classes that do very little other than encapsulate data. When responsibilities are well balanced, every class has a set of well-defined services and holds just enough attributes and methods to fulfill those responsibilities.
Wikipedia refers to well-balanced responsibilities as "ravioli code", but I had never heard that term used before.
Related
I am trying to understand the basic OOP concept called abstraction. When I say "understand", I mean not just to learn a definition, but really have a deep understanding.
On the internet, I have seen many definitions such as:
Hiding the low level implementation and providing high level specification
and
focusing on essential qualities rather than specific examples.
I understand that the iPhone button is a great example of abstraction, since I, as a user, don't have to know how the screen is displayed, all I have to know is to press the button.
What do you think of the following conclusion, when it comes to abstraction:
Abstraction takes many specific instances of objects and extracts their common information and functions by providing a single, generalised concept.
So based on this, a class is actually an abstraction of many instances, right?
I disagree with both of your examples. An iPhone button is not an abstraction of the screen, it is an interface to use the phone. A class is also not an abstraction of its instances.
An abstraction can be thought of treating a specific concept as a form of a more general concept.
To repeat an overused example: all vehicles can move. Cars rotate wheels, airplanes use jets, trains run on tracks.
Given a collection of vehicles, instead of being burdened with knowing the specifics of each vehicles' inner workings, and having to:
car.RotateWheel();
airplane.StartJet();
train.MoveOnTrack();
we could treat these objects as the more abstract vehicle, and tell them to
vehicle.Move();
In this case vehicle is an abstraction. It does not represent any specific object, but represents the common functionality of cars, airplanes and trains and allows us to interact with these specific objects without knowing anything about them except that they are a type of vehicle.
In the context of OOP, vehicle would most likely be a base class of the more specific types of vehicles.
IMHO there are actually 2 underlying concepts that needs to be understood here.
Abstraction: The idea of dealing only with "What" of something rather than "How" of something. For example: When you call an object method you only care about what the method does and not how it does what it does. There are layers of abstraction i.e the upper layer is only interested in what the below layer does and not how it does it. Another example: When you are writing assembly instruction you only care what a particular instruction does and not how the underlying circuit in the CPU execute the instruction.
Generalization: The idea of comparing a bunch of things (objects, functions, basically anything) and figure out the commonality between them and then extracting that commonality. A class with a bunch of properties is the generalization of the instances of the classes as all the instances have the same properties but different values for those properties.
The goal of object-oriented programming is to take the real-world thinking into software development as much as possible. That is, abstraction means what any dictionary may define.
For example, one of possible definitions of abstraction in Oxford Dictionary:
The quality of dealing with ideas rather than events.
WordReference.com's definition is even more eloquent:
the act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances.
In fact, WordReference.com's one is one of possible definitions of abstraction and you should be surprised because it's not a programming explanation of abstraction.
Perhaps you want a more programming alike definition of abstraction, and I'll try to provide a good summary:
Abstraction is the process of turning concrete realities into object representations which could be used as archetypes. Usually, in most OOP languages, archetypes are represented by types which in turn could be defined by classes, structures and interfaces. Types may abstract data or behaviors.
One good example of abstraction would be that a chair made of oak wood is still a chair. That's the way our mind works. You learn that certain forms are the most basic definition of many things. Your brain doesn't see all details of a given chair, but it sees that it fulfills the requirements to consider something a chair. Object-oriented programming and abstraction just mirrors this.
Recently, i go back to read some parts of the "UML Reference Manual" book, second edition (obviously by: Booch, Rumbaugh, Jacobson).
(see: http://www.amazon.com/Unified-Modeling-Language-Reference-Manual/dp/020130998X)
Meanwhile, i have found these "strange" words in the first chapiter "UML overview" at "Complexity of UML" section:
There is far too much use of generalization at the expense of essential distinctions. The myth that inheritance is always good has been a curse of object orientation from earliest days.
I can't see how this sentence can be fully in line with Object Oriented Paradigm which states that inheritance is a fundamental principle.
Any idea/help please?
You seem to believe the two points are mutually exclusive. They are not. Inheritance is a fundamental and powerful principle of object-oriented programming, and it is overused.
It is overused typically by inexperienced developers who are so captivated with the idea of inheritance that they are more focused on the inheritance tree than solving the problem. They try to factor out as much code as possible to some parent base class so they can just reuse it throughout the tree, and as a result they have a brittle design.
One of the greatest evils of software engineering is tight coupling between classes. That's the sort of thing that causes you to have to work through the weekend after the customer asks for a simple change. Why? Because making a change in one class has an effect on another class, and fixing that class has an effect on another, and so on.
Well, there is no tighter coupling than inheritance.
When you factor too much out to the "top level," every derived class is coupled to it. And as you find more and more code you want to factor out to various levels, you eventually have these deep trees, and every change made at the top cascades throughout the tree. As a result, you start to have methods that return null or are empty. They're unnecessary for the class, but the inheritance contract demands they be there. This violates the Liskov Substitution Principle.
So use inheritance of course. But do it smartly. Favor delegation to inheritance if you have any doubt. And when you do use inheritance, make sure you aren't factoring commonalities to the top level (of the whole tree or a subtree) just to reuse common code, but rather do so because there is a commonality of behavior from top to bottom.
If your tree is more than two or three levels deep (and I think three is really pushing it), you are almost certainly setting yourself up for trouble.
Everything is good in moderation. Remember that the quote is not saying do not use it, or avoid, etc. Rather it is saying it is an overused principal when other OO abstractions or principals work better. Inheritance is powerful but it's coupling is tight.
Wisely or rather randomly the author of the UML book is saying pointing out this current truism that inheritance is often over-used and over-referenced. What about all the other principals and abstractions. I find that developers typically only hit the OO highlights (inheritance being one) and use that abstraction to excess.
For me in UML it is a good reminder that UML is OO generally, but it is not limited to Java or .Net OO features. Many languages only offer of the abstractions available across all languages. UML attempts to help you model and express many of them.
Remember the author only said 'too much use', not bad or incorrect. Also remember that maybe you are an expert developer who does not apply inheritance incorrectly.
The creator of the Clojure language claims that "open, and large, set of functions operate upon an open, and small, set of extensible abstractions is the key to algorithmic reuse and library interoperability". Obviously it contradicts the typical OOP approach where you create a lot of abstractions (classes) and a relatively small set of functions operating on them. Please suggest a book, a chapter in a book, an article, or your personal experience that elaborate on the topics:
motivating examples of problems that appear in OOP and how using "many functions upon few abstractions" would address them
how to effectively do MFUFA* design
how to refactor OOP code towards MFUFA
how OOP languages' syntax gets in the way of MFUFA
*MFUFA: "many functions upon few abstractions"
There are two main notions of "abstraction" in programming:
parameterisation ("polymorphism", genericity).
encapsulation (data hiding),
[Edit: These two are duals. The first is client-side abstraction, the second implementer-side abstraction (and in case you care about these things: in terms of formal logic or type theory, they correspond to universal and existential quantification, respectively).]
In OO, the class is the kitchen sink feature for achieving both kinds of abstraction.
Ad (1), for almost every "pattern" you need to define a custom class (or several). In functional programming on the other hand, you often have more lightweight and direct methods to achieve the same goals, in particular, functions and tuples. It is often pointed out that most of the "design patterns" from the GoF are redundant in FP, for example.
Ad (2), encapsulation is needed a little bit less often if you don't have mutable state lingering around everywhere that you need to keep in check. You still build ADTs in FP, but they tend to be simpler and more generic, and hence you need fewer of them.
When you write program in object-oriented style, you make emphasis on expressing domain area in terms of data types. And at first glance this looks like a good idea - if we work with users, why not to have a class User? And if users sell and buy cars, why not to have class Car? This way we can easily maintain data and control flow - it just reflects order of events in the real world. While this is quite convenient for domain objects, for many internal objects (i.e. objects that do not reflect anything from real world, but occur only in program logic) it is not so good. Maybe the best example is a number of collection types in Java. In Java (and many other OOP languages) there are both arrays, Lists. In JDBC there's ResultSet which is also kind of collection, but doesn't implement Collection interface. For input you will often use InputStream that provides interface for sequential access to the data - just like linked list! However it doesn't implement any kind of collection interface as well. Thus, if your code works with database and uses ResultSet it will be harder to refactor it for text files and InputStream.
MFUFA principle teaches us to pay less attention to type definition and more to common abstractions. For this reason Clojure introduces single abstraction for all mentioned types - sequence. Any iterable is automatically coerced to sequence, streams are just lazy lists and result set may be transformed to one of previous types easily.
Another example is using PersistentMap interface for structs and records. With such common interfaces it becomes very easy to create resusable subroutines and do not spend lots of time to refactoring.
To summarize and answer your questions:
One simple example of an issue that appears in OOP frequently: reading data from many different sources (e.g. DB, file, network, etc.) and processing it in the same way.
To make good MFUFA design try to make abstractions as common as possible and avoid ad-hoc implementations. E.g. avoid types a-la UserList - List<User> is good enough in most cases.
Follow suggestions from point 2. In addition, try to add as much interfaces to your data types (classes) as it possible. For example, if you really need to have UserList (e.g. when it should have a lot of additional functionality), add both List and Iterable interfaces to its definition.
OOP (at least in Java and C#) is not very well suited for this principle, because they try to encapsulate the whole object's behavior during initial design, so it becomes hard add more functions to them. In most cases you can extend class in question and put methods you need into new object, but 1) if somebody else implements their own derived class, it will not be compatible with yours; 2) sometimes classes are final or all fields are made private, so derived classes don't have access to them (e.g. to add new functions to class String one should implement additional classStringUtils). Nevertheless, rules I described above make it much easier to use MFUFA in OOP-code. And best example here is Clojure itself, which is gracefully implemented in OO-style but still follows MFUFA principle.
UPD. I remember another description of difference between object oriented and functional styles, that maybe summarizes better all I said above: designing program in OO style is thinking in terms of data types (nouns), while designing in functional style is thinking in terms of operations (verbs). You may forget that some nouns are similar (e.g. forget about inheritance), but you should always remember that many verbs in practice do the same thing (e.g. have same or similar interfaces).
A much earlier version of the quote:
"The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
...comes from the foreword to the famous SICP book. I believe this book has a lot of applicable material on this topic.
I think you're not getting that there's a difference between libraries and programmes.
OO libraries which work well usually generate a small number of abstractions, which programmes use to build the abstractions for their domain. Larger OO libraries (and programmes) use inheritance to create different versions of methods and introduce new methods.
So, yes, the same principle applies to OO libraries.
The object oriented concepts : encapsulation, data abstraction and data hiding are 3 different concepts, but very much related to each other. So i am having difficulty in understanding the concepts fully by reading the information from internet. The information available at one place contradicts with information at another place in the internet. Could someone guide me to a tutorial which clearly explains the 3 concepts and brings out the difference between the three?
First of all, don't be too ambitious ,as you said these 3 concepts are related (especially the first two) and could be used for one another in many contexts. Using them correctly is much more important than having a complete final definition.
"data hiding" is all about putting a wall between the client and (part of) the implementation. Some objects of a module can be internal to the module and invisible to its users. As such, this is a way, a method to avoid dependency. if I cannot know how one thing is implemented, its implementation can change.
"data abstraction" is regrouping different kind of data under the same abstraction. It is close to the idea of a protocol. You don't know how the object is implemented, but you know it respect a well-known protocol, i.e a set of method that works over different type of data. In python, file-like object are a good example. In Java, one uses interfaces. It is good because you have less to learn, and also because you can check some properties at the abstraction level, i.e for all kind of data regrouped under this abstraction.
"encapsulation" is about putting a shell around objects that simplify their usage. it is linked to the idea that objects in a code base can be regrouped in layers increasingly low-level. One object in a layer calls only those of layers beneath him. For example, if you want to draw a line on the screen, the line obkect may only encapsulate an openGL context, the pixel drawer, and other stuff. These lower-level objects are encapsulated by the line object. Note the encapsulation can be applied to the same object when it is part of different layers at the same time, not good but sometimes unavoidable. For example, the file-like object in python have high-level/encapsulating method (open, close, read) and low-levels ones (seek).
That's it. Obviously, the definition of each could be broader but these make the three concepts a bit more different.
The wrapping up of data and functions into a single unit (called
class) is known as encapsulation. Data encapsulation is the most
striking feature of a class. The data is not accessible to the
outside world, and only those functions, which are wrapped in
the class, can access it. These functions provide the interface
between the object’s data and the program. This insulation of
the data from direct access by the program is called data hiding
or information hiding.
Abstraction refers to the act of representing essential features
without including the background details or explanations.
Classes use the concept of abstraction and are defined as a list
of abstract attributes such as size, weight and cost, and
functions to operate on these attributes. They encapsulate all the
essential properties of the objects that are to be created. The
attributes are sometimes called data members because they hold
information. The functions that operate on these data are
sometimes called methods or member functions.
Since the classes use the concept of data abstraction, they are
known as Abstract Data Types (ADT
The Law of Demeter indicates that you should only speak to objects that you know about directly. That is, do not perform method chaining to talk to other objects. When you do so, you are establishing improper linkages with the intermediary objects, inappropriately coupling your code to other code.
That's bad.
The solution would be for the class you do know about to essentially expose simple wrappers that delegate the responsibility to the object it has the relationship with.
That's good.
But, that seems to result in the class having low cohesion. No longer is it simply responsible for precisely what it does, but it also has the delegates that in a sense, making the code less cohesive by duplicating portions of the interface of its related object.
That's bad.
Does it really result in lowering cohesion? Is it the lesser of two evils?
Is this one of those gray areas of development, where you can debate where the line is, or are there strong, principled ways of making a decision of where to draw the line and what criteria you can use to make that decision?
Grady Booch in "Object Oriented Analysis and Design":
"The idea of cohesion also comes from structured design. Simply stated, cohesion
measures the degree of connectivity among the elements of a single module (and
for object-oriented design, a single class or object). The least desirable form of
cohesion is coincidental cohesion, in which entirely unrelated abstractions are
thrown into the same class or module. For example, consider a class comprising
the abstractions of dogs and spacecraft, whose behaviors are quite unrelated. The
most desirable form of cohesion is functional cohesion, in which the elements of
a class or module all work together to provide some well-bounded behavior.
Thus, the class Dog is functionally cohesive if its semantics embrace the behavior
of a dog, the whole dog, and nothing but the dog."
Subsitute Dog with Customer in the above and it might be a bit clearer. So the goal is really just to aim for functional cohesion and to move away from coincidental cohesion as much as possible. Depending on your abstractions, this may be simple or could require some refactoring.
Note cohesion applies just as much to a "module" than to a single class, ie a group of classes working together. So in this case the Customer and Order classes still have decent cohesion because they have this strong relationshhip, customers create orders, orders belong to customers.
Martin Fowler says he'd be more comfortable calling it the "Suggestion of Demeter" (see the article Mocks aren't stubs):
"Mockist testers do talk more about avoiding 'train wrecks' - method chains of style of getThis().getThat().getTheOther(). Avoiding method chains is also known as following the Law of Demeter. While method chains are a smell, the opposite problem of middle men objects bloated with forwarding methods is also a smell. (I've always felt I'd be more comfortable with the Law of Demeter if it were called the Suggestion of Demeter .)"
That sums up nicely where I'm coming from: it is perfectly acceptable and often necessary to have a lower level of cohesion than the strict adherence to the "law" might require. Avoid coincidental cohesion and aim for functional cohesion, but don't get hung up on tweaking where needed to fit in more naturally with your design abstraction.
If you are violating the Law of Demeter by having
int price = customer.getOrder().getPrice();
the solution is not to create a getOrderPrice() and transform the code into
int price = customer.getOrderPrice();
but instead to note that this is a code smell and make the relevant changes that hopefully both increase cohesion and lower coupling. Unfortunately there is no simple refactoring here that always applies, but you should probably apply tell don't ask
I think you may have misunderstood what cohesion means. A class that is implemented in terms of several other classes does not necessarily have low cohesion, as long as it represents a clear concept, and has a clear purpose. For example, you may have a class Person, which is implemented in terms of classes Date (for date of birth), Address, and Education (a list of schools the person went to). You may provide wrappers in Person for getting the year of birth, the last school the person went to, or the state where he lives, to avoid exposing the fact that Person is implemented in terms of those other classes. This would reduce coupling, but it would make Person no less cohesive.
It’s a grey area.
These principals are meant to help you in your work, if you find you’re working for them (i.e. they’re getting in your way and/or you find it over complicates your code) then you’re conforming too hard and you need to back off.
Make it work for you, don’t work for it.
I don't know if this actually lowers cohesion.
Aggregation/composition are all about a class utilising other classes to meet the contract it exposes through its public methods.
The class does not need to duplicate the interface of it's related objects. It's actually hiding any knwowledge about these aggregated classes from the method caller.
To obey the law of Demeter in the case of multiple levels of class dependency, you just need to apply aggregation/composition and good encapsulation at each level.
In other words each class has one or more dependencies on other classes, however these are only ever dependencies on the referenced class and not on any objects returned from properies/methods.
In the situations where there seems to be a tradeoff between coupling and cohesion, I'd probably ask myself "if somebody else had already written this logic, and I were looking for a bug in it, where would I look first?", and write the code that way.