OOP - How to choose a possible object candidate?

I'm wondering what techniques I should use to choose the right objects in OOP.
Is there any must-read book about OOP in terms of how to choose objects?
Best,

Just write something that gets the job done, even if it's ugly, then refactor continuously:
eliminate duplicate code (don't repeat yourself)
increase cohesion
reduce coupling
But:
don't over-engineer; keep it simple
don't write stuff you ain't gonna need
It's not a precise recipe, just some general guidelines. Keep practicing.
P.S.
Code objects are not related to tangible real-life objects; they are just constructs that hold related information together.
Don't believe what the Java books/schools teach about objects; they're lying.

You probably mean "the right class", rather than "the right object". :-)
There are a few techniques, such as text analysis (a.k.a. underlining the nouns) and Class Responsibility Collaborator (CRC).
With "underlining the nouns", you basically start with a written, natural language (i.e. plain English) description of the problem you want to solve and underline the nouns. That gives you a list of candidate classes. You will need to perform several passes to refine it into a list of classes to implement.
For CRC, check out the Wikipedia article.
I suggest The OPEN Toolbox of Techniques for full reference.
Hope it helps.

I am assuming an understanding of what a struct, type, class, set, state, alphabet, scalar, vector and relationship are.
An object is a noun, a method is a verb. Object members can represent identity, state or a scalar value per field. Relationships between objects are usually represented with references, where the references are members of objects. When relationships are complex, multidirectional, have arity greater than 2, or represent some sort of grouping or containment, the relationships themselves can be expressed as objects.
For other, broader technical reasons objects are most likely the only way to represent any form of information in OOP languages.

I am adding a second answer due to demian's comment:
Sometimes the class is so obvious
because it's tangible, but other times
the concept of the object is too abstract,
like a db connector.
That is true. My preferred approach is to perform a behavioural analysis of the system (using use cases, for example), and then derive system operations. Once you have a stable list of system operations (such as PrintDocument, SaveDocument, SpellCheck, MergeMail, etc. for a word processor) you need to assign each of them to a class. If you have developed a list of candidate classes with some of the techniques that I mentioned earlier, you will be able to allocate some of the operations. But some will remain unallocated. These signal the need for more abstract or unintuitive classes, which you will need to make up using your good judgment.
The whole method is documented in a white paper at www.openmetis.com.

You should check out Domain-Driven Design, by Eric Evans. It provides very useful concepts for thinking about the objects in your model, what their functions are in the domain, and how they could be organized to work together. It's not a cookbook, and probably not a beginner's book, but I have read it at different stages of my career, and every time I found something valuable in it.

Related

Abstraction as a definition

I am trying to understand the basic OOP concept called abstraction. When I say "understand", I mean not just to learn a definition, but really have a deep understanding.
On the internet, I have seen many definitions such as:
Hiding the low level implementation and providing high level specification
and
focusing on essential qualities rather than specific examples.
I understand that the iPhone button is a great example of abstraction, since I, as a user, don't have to know how the screen is displayed, all I have to know is to press the button.
What do you think of the following conclusion, when it comes to abstraction:
Abstraction takes many specific instances of objects and extracts their common information and functions by providing a single, generalised concept.
So based on this, a class is actually an abstraction of many instances, right?
I disagree with both of your examples. An iPhone button is not an abstraction of the screen, it is an interface to use the phone. A class is also not an abstraction of its instances.
Abstraction can be thought of as treating a specific concept as a form of a more general concept.
To repeat an overused example: all vehicles can move. Cars rotate wheels, airplanes use jets, trains run on tracks.
Given a collection of vehicles, instead of being burdened with knowing the specifics of each vehicle's inner workings, and having to:
car.RotateWheel();
airplane.StartJet();
train.MoveOnTrack();
we could treat these objects as the more abstract vehicle, and tell them to
vehicle.Move();
In this case vehicle is an abstraction. It does not represent any specific object, but represents the common functionality of cars, airplanes and trains and allows us to interact with these specific objects without knowing anything about them except that they are a type of vehicle.
In the context of OOP, vehicle would most likely be a base class of the more specific types of vehicles.
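The vehicle example above can be sketched in Python as follows. This is only an illustration: the class and method names mirror the ones used in the answer, the returned strings are made up, and `move_all` is a hypothetical helper standing in for "code that only knows about vehicles".

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class Vehicle(ABC):
    """The abstraction: callers only know that a vehicle can move."""

    @abstractmethod
    def move(self) -> str:
        ...


class Car(Vehicle):
    def rotate_wheels(self) -> str:
        return "car rotates its wheels"

    def move(self) -> str:
        return self.rotate_wheels()


class Airplane(Vehicle):
    def start_jet(self) -> str:
        return "airplane starts its jets"

    def move(self) -> str:
        return self.start_jet()


class Train(Vehicle):
    def move_on_track(self) -> str:
        return "train moves on its track"

    def move(self) -> str:
        return self.move_on_track()


def move_all(vehicles: Iterable[Vehicle]) -> List[str]:
    # The caller never touches rotate_wheels(), start_jet() or move_on_track().
    return [vehicle.move() for vehicle in vehicles]
```

Because `Vehicle` is declared abstract, it cannot be instantiated on its own, which matches the point that it "does not represent any specific object".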
IMHO there are actually 2 underlying concepts that need to be understood here.
Abstraction: The idea of dealing only with the "what" of something rather than the "how". For example, when you call an object's method you only care about what the method does, not how it does it. There are layers of abstraction, i.e. the upper layer is only interested in what the layer below does and not how it does it. Another example: when you write an assembly instruction you only care what that instruction does and not how the underlying circuitry in the CPU executes it.
Generalization: The idea of comparing a bunch of things (objects, functions, basically anything), figuring out what they have in common, and then extracting that commonality. A class with a bunch of properties is a generalization of its instances, as all the instances have the same properties but different values for those properties.
The goal of object-oriented programming is to take the real-world thinking into software development as much as possible. That is, abstraction means what any dictionary may define.
For example, here is one of the possible definitions of abstraction in the Oxford Dictionary:
The quality of dealing with ideas rather than events.
WordReference.com's definition is even more eloquent:
the act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances.
In fact, WordReference.com's definition is just one of the possible definitions of abstraction, and it is worth noticing that it is not a programming explanation of abstraction at all.
Perhaps you want a more programming-oriented definition of abstraction, so I'll try to provide a good summary:
Abstraction is the process of turning concrete realities into object representations which could be used as archetypes. Usually, in most OOP languages, archetypes are represented by types which in turn could be defined by classes, structures and interfaces. Types may abstract data or behaviors.
One good example of abstraction is that a chair made of oak wood is still a chair. That's the way our mind works. You learn that certain forms are the most basic definition of many things. Your brain doesn't look at every detail of a given chair; it sees that the thing fulfills the requirements to be considered a chair. Object-oriented programming and abstraction just mirror this.

Few words in this definition of Abstraction

I'm sorry if my question doesn't meet the standards of SO, but I had a hard time understanding the last few words of this definition of abstraction from Grady Booch:
"An abstraction denotes the essential
characteristics of an object that distinguish it from all other kinds
of objects and thus provide crisply defined conceptual boundaries,
relative to the perspective of the viewer."
Please explain what he means by "relative to the perspective of the viewer". Any example would be really helpful.
They simply mean that from the point of view of the person trying to understand the abstraction, it should be clear what it is, what it includes and what it doesn't.
However, how it is implemented might not be that clearly different from other abstractions.
For example:
A URI is a different abstraction from a Name. It's clear to both a developer and a user what each of them is. However, implementation-wise they both might be little more than strings.
I think that what they are trying to say is that the semantics and the behaviors define abstractions correctly, not how they are going to be implemented.
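As a rough sketch of the URI/Name point, here are two types that both wrap a plain string but expose different behaviour. The class names `Name` and `Uri` and the methods `initials` and `host` are hypothetical and chosen only for illustration; `urlparse` comes from Python's standard library.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Name:
    value: str  # implementation: just a string

    def initials(self) -> str:
        # Behaviour that only makes sense for a name.
        return "".join(part[0].upper() for part in self.value.split())


@dataclass(frozen=True)
class Uri:
    value: str  # implementation: also just a string

    def host(self) -> str:
        # Behaviour that only makes sense for a URI.
        return urlparse(self.value).hostname or ""
```

Even when both hold the same characters, `Name("x")` and `Uri("x")` are not equal and offer different operations, so the semantics rather than the storage define each abstraction.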
Definitions in the OOP world differ and are not always very clear. For example, here is a definition of abstraction from Tony Hoare:
"Abstraction arises from a recognition of similarities
between certain objects, situations, or processes in the real world,
and the decision to concentrate upon those similarities and to
ignore for the time being the differences."
Maybe this is clearer to you. However, I do not care too much about the words of these definitions.
What is important to understand about abstraction is that its function is to expose to the user (or viewer) a set of behaviors (an interface) that completely describe and identify an entity (or object). Once you know these behaviors (methods) you can and should ignore the actual implementation of these methods. What the user should care about is providing input parameters and receiving the right results.
I think this is a more practical definition of abstraction.

"Many functions operating upon few abstractions" principle vs OOP

The creator of the Clojure language claims that having an "open, and large, set of functions operate upon an open, and small, set of extensible abstractions is the key to algorithmic reuse and library interoperability". Obviously this contradicts the typical OOP approach, where you create a lot of abstractions (classes) and a relatively small set of functions operating on them. Please suggest a book, a chapter in a book, an article, or your personal experience that elaborates on the following topics:
motivating examples of problems that appear in OOP and how using "many functions upon few abstractions" would address them
how to effectively do MFUFA* design
how to refactor OOP code towards MFUFA
how OOP languages' syntax gets in the way of MFUFA
*MFUFA: "many functions upon few abstractions"
There are two main notions of "abstraction" in programming:
1) parameterisation ("polymorphism", genericity),
2) encapsulation (data hiding).
[Edit: These two are duals. The first is client-side abstraction, the second implementer-side abstraction (and in case you care about these things: in terms of formal logic or type theory, they correspond to universal and existential quantification, respectively).]
In OO, the class is the kitchen sink feature for achieving both kinds of abstraction.
Ad (1), for almost every "pattern" you need to define a custom class (or several). In functional programming on the other hand, you often have more lightweight and direct methods to achieve the same goals, in particular, functions and tuples. It is often pointed out that most of the "design patterns" from the GoF are redundant in FP, for example.
Ad (2), encapsulation is needed a little bit less often if you don't have mutable state lingering around everywhere that you need to keep in check. You still build ADTs in FP, but they tend to be simpler and more generic, and hence you need fewer of them.
When you write a program in object-oriented style, you emphasize expressing the domain in terms of data types. At first glance this looks like a good idea: if we work with users, why not have a class User? And if users buy and sell cars, why not have a class Car? This way we can easily maintain data and control flow, since it just reflects the order of events in the real world. While this is quite convenient for domain objects, it is not so good for many internal objects (i.e. objects that do not reflect anything from the real world but occur only in program logic). Maybe the best example is the number of collection types in Java. In Java (and many other OOP languages) there are both arrays and Lists. In JDBC there's ResultSet, which is also a kind of collection but doesn't implement the Collection interface. For input you will often use InputStream, which provides an interface for sequential access to data, just like a linked list! However, it doesn't implement any kind of collection interface either. Thus, if your code works with a database and uses ResultSet, it will be harder to refactor it to work with text files and InputStream.
The MFUFA principle teaches us to pay less attention to type definitions and more to common abstractions. For this reason Clojure introduces a single abstraction for all the types mentioned: the sequence. Any iterable is automatically coerced to a sequence, streams are just lazy lists, and a result set can easily be transformed into one of the previous types.
Another example is using the IPersistentMap interface for structs and records. With such common interfaces it becomes very easy to create reusable subroutines and avoid spending lots of time on refactoring.
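To make the "one abstraction, many functions" idea concrete, here is a minimal Python sketch. It uses Python's iterable protocol as a stand-in for Clojure's sequence abstraction, which is an analogy rather than how Clojure itself works. The function names (`non_blank`, `words`, `count`) and the three sample sources are assumptions made up for the example.

```python
import io
from typing import Iterable, Iterator


def non_blank(lines: Iterable[str]) -> Iterator[str]:
    """Drop empty lines and surrounding whitespace."""
    for line in lines:
        stripped = line.strip()
        if stripped:
            yield stripped


def words(lines: Iterable[str]) -> Iterator[str]:
    """Split every line into words."""
    for line in lines:
        yield from line.split()


def count(items: Iterable[object]) -> int:
    """Count the items of any iterable, lazily."""
    return sum(1 for _ in items)


def generated_lines() -> Iterator[str]:
    # Stands in for a lazy source such as a network stream.
    yield "alpha beta"
    yield ""
    yield "gamma"


# Three very different sources, all seen through the same abstraction.
sources = {
    "list": ["alpha beta", "", "gamma"],
    "file": io.StringIO("alpha beta\n\ngamma\n"),
    "generator": generated_lines(),
}

results = {name: count(words(non_blank(src))) for name, src in sources.items()}
```

None of the functions knows or cares whether the data comes from memory, a file-like object or a generator, so a new function works with every source and a new source works with every function.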
To summarize and answer your questions:
1) One simple example of an issue that appears frequently in OOP: reading data from many different sources (e.g. DB, file, network, etc.) and processing it in the same way.
2) To make a good MFUFA design, try to make abstractions as common as possible and avoid ad-hoc implementations. E.g. avoid types à la UserList; List<User> is good enough in most cases.
3) Follow the suggestions from point 2. In addition, try to add as many interfaces to your data types (classes) as possible. For example, if you really need UserList (e.g. when it should have a lot of additional functionality), add both the List and Iterable interfaces to its definition.
4) OOP (at least in Java and C#) is not very well suited to this principle, because these languages try to encapsulate an object's whole behavior during the initial design, so it becomes hard to add more functions later. In most cases you can extend the class in question and put the methods you need into the new class, but (a) if somebody else implements their own derived class, it will not be compatible with yours; (b) sometimes classes are final or all fields are private, so derived classes don't have access to them (e.g. to add new functions to class String one has to implement an additional class StringUtils). Nevertheless, the rules I described above make it much easier to use MFUFA in OOP code. The best example here is Clojure itself, which is gracefully implemented in OO style but still follows the MFUFA principle.
UPD. I remember another description of the difference between object-oriented and functional styles that maybe summarizes everything I said above better: designing a program in OO style is thinking in terms of data types (nouns), while designing in functional style is thinking in terms of operations (verbs). You may forget that some nouns are similar (e.g. forget about inheritance), but you should always remember that many verbs in practice do the same thing (e.g. have the same or similar interfaces).
A much earlier version of the quote:
"The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
...comes from the foreword to the famous SICP book. I believe this book has a lot of applicable material on this topic.
I think you're not getting that there's a difference between libraries and programmes.
OO libraries which work well usually generate a small number of abstractions, which programmes use to build the abstractions for their domain. Larger OO libraries (and programmes) use inheritance to create different versions of methods and introduce new methods.
So, yes, the same principle applies to OO libraries.

Object Oriented Programming beyond just methods?

I have a very limited understanding of OOP.
I've been programming in .Net for a year or so, but I'm completely self taught so some of the uses of the finer points of OOP are lost on me.
Encapsulation, inheritance, abstraction, etc. I know what they mean (superficially), but what are their uses?
I've only ever used OOP for putting reusable code into methods, but I know I am missing out on a lot of functionality.
Even classes -- I've only made an actual class two or three times. Rather, I typically just include all of my methods with the MainForm.
OOP is way too involved to explain in a StackOverflow answer, but the main thrust is as follows:
Procedural programming is about writing code that performs actions on data. Object-oriented programming is about creating data that performs actions on itself.
In procedural programming, you have functions and you have data. The data is structured but passive and you write functions that perform actions on the data and resources.
In object-oriented programming, data and resources are represented by objects that have properties and methods. Here, the data is no longer passive: a method is a means of instructing the data or resource to perform some action on itself.
The reason that this distinction matters is that in procedural programming, any data can be inspected or modified in any arbitrary way by any part of the program. You have to watch out for unexpected interactions between different functions that touch the same data, and you have to modify a whole lot of code if you choose to change how the data is stored or organized.
But in object-oriented programming, when encapsulation is used properly, no code except that inside the object needs to know (and thus won't become dependent on) how the data object stores its properties or mutates itself. This helps greatly to modularize your code because each object now has a well-defined interface, and so long as it continues to support that interface and other objects and free functions use it through that interface, the internal workings can be modified without risk.
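The encapsulation argument can be illustrated with a small sketch. The two account classes below are hypothetical and store their data in completely different ways, yet the client function `pay_and_spend` (also made up for this example) works with either because it only uses the shared interface of `deposit`, `withdraw` and `balance`.

```python
from typing import List


class LedgerAccount:
    """Keeps every transaction and sums them on demand."""

    def __init__(self) -> None:
        self._transactions: List[int] = []  # internal detail, in cents

    def deposit(self, cents: int) -> None:
        if cents <= 0:
            raise ValueError("deposit must be positive")
        self._transactions.append(cents)

    def withdraw(self, cents: int) -> None:
        if cents <= 0 or cents > self.balance():
            raise ValueError("invalid withdrawal")
        self._transactions.append(-cents)

    def balance(self) -> int:
        return sum(self._transactions)


class RunningTotalAccount:
    """Same interface, but only stores the current total."""

    def __init__(self) -> None:
        self._total = 0  # internal detail, in cents

    def deposit(self, cents: int) -> None:
        if cents <= 0:
            raise ValueError("deposit must be positive")
        self._total += cents

    def withdraw(self, cents: int) -> None:
        if cents <= 0 or cents > self._total:
            raise ValueError("invalid withdrawal")
        self._total -= cents

    def balance(self) -> int:
        return self._total


def pay_and_spend(account) -> int:
    # Client code: depends on the interface, not on how data is stored.
    account.deposit(10_000)
    account.withdraw(2_500)
    return account.balance()
```

Swapping one class for the other requires no change in `pay_and_spend`, which is the "internal workings can be modified without risk" property described above.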
Additionally, the concepts of objects, along with the use of inheritance and composition, allow you to model your data structurally in your code. If you need to have data that represents an employee, you create an Employee class. If you need to work with a printer resource, you create a Printer class. If you need to draw pushbuttons on a dialog, you create a Button class. This way, not only do you achieve greater modularization, but your modules reflect a useful model of whatever real-world things your program is supposed to be working with.
You can try this: http://homepage.mac.com/s_lott/books/oodesign.html It might help you see how to design objects.
You should go through this question: "I can't create a clear picture of implementing OOP concepts, though I understand most of the OOP concepts. Why?"
I was in the same situation, and I am self-taught too. I followed those steps, and now I am starting to understand how to implement OOP. My code is now more modular and better structured.
OOP can be used to model things in the real world that your application deals with. For example, a video game will probably have classes for the player, the badguys, NPCs, weapons, ammo, etc... anything that the system wants to deal with as a distinct entity.
Some links I just found that are intros to OOD:
http://accu.informika.ru/acornsig/public/articles/ood_intro.html
http://www.fincher.org/tips/General/SoftwareEngineering/ObjectOrientedDesign.shtml
http://www.softwaredesign.com/objects.html
Keeping it very brief: instead of doing operations on data in a bunch of different places, you ask the object to do its thing, without caring how it does it.
Polymorphism: different objects can do different things under the same name, so you can ask any object of a particular supertype to perform that named operation without knowing its concrete type.
I learned OOP using Turbo Pascal and found it immediately useful when I tried to model physical objects. Typical examples include a Circle object with fields for location and radius and methods for drawing, checking whether a point is inside or outside, and other actions. I guess you start thinking of classes as objects, and of methods as verbs and actions. Procedural programming is like writing a script: it is often linear and follows, step by step, what needs to be done. In the OOP world you build an available repertoire of actions and tasks (like Lego pieces) and use them to do what you want to do.
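A Circle like the one described might look roughly like this in Python (the original was Turbo Pascal, so this is a loose translation). The method names `contains`, `area` and `move_by` are assumptions, and drawing is left out to keep the sketch self-contained.

```python
import math
from dataclasses import dataclass


@dataclass
class Circle:
    x: float       # location of the centre
    y: float
    radius: float

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside the circle or on its edge."""
        return math.hypot(px - self.x, py - self.y) <= self.radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

    def move_by(self, dx: float, dy: float) -> None:
        """Shift the centre; the circle changes its own state."""
        self.x += dx
        self.y += dy
```

The fields are the nouns (where the circle is, how big it is) and the methods are the verbs (check a point, move).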
Inheritance is used when common code should or can be shared by multiple objects. You can easily go too far the other way and create way too many classes for what you need. If I am dealing with shapes, do I really need two different classes for rectangles and squares, or can I use a common class with different values (fields)?
Mastery comes with experience and practice. Once you start scratching your head on how to solve particular problems (especially when it comes to making your code usable again in the future), slowly you will gain the confidence to start including more and more OOP features into your code.
Good luck.

How to design objects?

So there are many ways of structuring objects (I'm talking about OOP here). For this question, I will use the classic "Car" example of OOP. Basically, how do I know when to make the car an object, or the wheel of a car an object, when both program structures would accomplish the goal?
How do I classify and categorize the parts of an object to determine whether or not they are better suited as simple attributes or variables of an object, or if they really need to be an object themselves?
Well, the first thing you have to realize is that OOAD ("object-oriented analysis and design") is a tool, not an end in itself. What you get out of that process is a model, not a true representation of what you're modelling. That model makes certain assumptions. The purpose of that model is to solve a problem you have.
So how do you know how to design objects? How do you know if you've done it right? By the end result: has it solved your problem?
So, for the Car example, in some models a car count could simply be an integer count, for example the car traffic through an intersection in a traffic model. In such a model rarely do you care about the make, model or construction of cars, just the number. You might care about the type of vehicle to the point of is it a truck or car (for example). Do you model that as a Vehicle object with a type of Car or Truck? Or just separate carCount and truckCount tallies?
The short answer is: whichever works best.
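The two options from the traffic example could be sketched like this. All names here (`IntersectionTally`, `VehicleKind`, `Vehicle`, `count_of`) are hypothetical, and neither option is "the" right model; which one fits depends on what the model has to answer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


# Option 1: separate tallies, no vehicle objects at all.
@dataclass
class IntersectionTally:
    car_count: int = 0
    truck_count: int = 0

    def record(self, kind: str) -> None:
        if kind == "car":
            self.car_count += 1
        elif kind == "truck":
            self.truck_count += 1
        else:
            raise ValueError(f"unknown vehicle kind: {kind}")


# Option 2: one Vehicle object per observation, with a type attribute.
class VehicleKind(Enum):
    CAR = "car"
    TRUCK = "truck"


@dataclass(frozen=True)
class Vehicle:
    kind: VehicleKind


def count_of(vehicles: Iterable[Vehicle], kind: VehicleKind) -> int:
    return sum(1 for vehicle in vehicles if vehicle.kind is kind)
```

Option 1 is enough if only the numbers matter; option 2 costs more but leaves room to attach further state or behaviour to each vehicle later.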
The normal test for whether something should be an object is: does it have behaviour? Remember that ultimately objects = data + behaviour.
So you might say that cars have the following state:
Number of wheels;
Height of suspension;
Left or right drive;
Colour;
Width;
Weight;
Length;
Height;
Number of doors;
Whether it has a sunroof;
Whether it has a stereo, CD player, MP3 player and/or satnav;
Size of the petrol tank;
Number of cylinders;
Number of turbochargers and/or fuel injection;
Maximum torque;
Maximum brake-horsepower;
and so on.
Chances are you'll only care about a small subset of that: pick whatever is relevant. A racing game might go into more detail about the wheels, such as how hot they are, how worn, the width and tread type and so on. In such a case, a Wheel object could be said to be a collection of all that state (but little behaviour) because a Car has a number of Wheels and the Wheels are interchangeable.
So that brings up the second point about objects: an object can exist because of a relationship such that the object represents a complete set of data. So a Wheel could have tread, width, temperature and so on. You can't divide that up and say a Car has tread but no wheel width, so it makes sense for Wheel to be an object, since a Wheel in its entirety is interchangeable.
But again, does that make sense for what you're doing? That's the key question.
Don't start out by classifying things - seems like people are too eager to start building inheritance hierarchies.
Write down a list of specific, concrete scenarios - what your app will do, step by step. An object model is only useful if it does what you need it to do - so start working back from the scenarios to see what common objects and behaviours you can shake out of each one.
Identify the "roles" in your scenarios - not necessarily actual class names - just vague "roles" that turn up when you think through concrete scenarios for how your software will work. These roles might later become classes, interfaces, abstract classes - whatever you need - at the start they're just placeholders for doing a type of work.
Work out what each role "does". The key is having a bunch of named roles - that identify things that the objects will do. This is about distilling out a set of things each role can do - they might do the whole thing, or put together a bunch of other objects to do the work, or they might co-ordinate the work... it depends on your scenarios.
The most important thing in OOD/OOP - is OBJECTS DO THINGS - not what's inside them - what they do.
Don't think about inheritance early on - because it will tie you up in overcomplicated hierarchies and make you think in terms of SQL-oriented programming rather than object-oriented programming. Inheritance is just one way of sharing common code. There are lots of other ways - delegation, mixins, prototype-based programming...
Here are some guidelines I came up with to help with this:
What should be on a checklist that would help someone develop good OO software?
There are some good answers here, but possibly more than you were looking for. To address your specific questions briefly:
How do I know when to make the car an object, or the wheel of a car an object, when both program structures would accomplish the goal?
When you need to distinguish one instance from another, then you need an object. The key distinction of an object is: it has identity.
Extending this answer slightly to classes, when the behaviors and/or properties of two similar objects diverge, you need a new class.
So, if you're modeling a traffic simulation that counts wheels, a Vehicle class with a NumberOfWheels property may be sufficient. If you're modeling a racing simulation with detailed road-surface and wheel-torque physics, each wheel probably needs to be an independent object.
How do I classify and categorize the parts of an object to determine whether or not they are better suited as simple attributes or variables of an object, or if they really need to be an object themselves?
The key distinctions are identity and behavior. A part with unique existence is an object. A part with autonomous behavior requires its own class.
For example, if you're creating a very simple car-crash simulation, NumberOfPassengers and DamageResistance may be sufficient properties of a generic Vehicle class. This would be enough to tell you if the car was totalled and the passengers survived. If your simulation is much more detailed, perhaps you want to know how far each passenger was thrown in a head-on collision, then you would need a Passenger class and distinct Passenger objects in each Vehicle.
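The car-crash example can be sketched at both levels of detail. Everything below is hypothetical: the class names, the `belted` flag and especially the "physics", which is a made-up placeholder formula used only to give each passenger some behaviour of its own.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SimpleVehicle:
    """Coarse model: passengers are just a number."""
    number_of_passengers: int
    damage_resistance: float

    def is_totalled(self, impact: float) -> bool:
        return impact > self.damage_resistance


@dataclass(eq=False)  # eq=False: two passengers are equal only if they are the same object
class Passenger:
    seat: str
    belted: bool

    def throw_distance(self, impact: float) -> float:
        # Placeholder formula, not real physics.
        return 0.0 if self.belted else impact / 10


@dataclass
class DetailedVehicle:
    """Detailed model: each passenger has identity and behaviour."""
    damage_resistance: float
    passengers: List[Passenger] = field(default_factory=list)

    def is_totalled(self, impact: float) -> bool:
        return impact > self.damage_resistance
```

With `eq=False`, two passengers with identical attributes remain distinct objects, which is the "it has identity" criterion from the answer expressed in code.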
I like Wirfs-Brock's Responsibility-Driven Design (RDD) and also recommend this updated (free paper) Responsibility-Driven Modeling approach by Alistair Cockburn.
In over 15 years of OO development, whenever I've felt I'm getting lost in a software architecture, going back to the RDD basics always helps me clarify what the software is supposed to be doing and how.
If you like a test-driven approach, this article shows how to relate RDD to mocking objects and tests.
Attributes or variables are often "base" types of a language. The question is what you can sensibly abstract.
For example, you can reduce a Wheel to descriptors made up of base types like integers, floating-point values and strings, which represent characteristic attributes of any wheel: numberOfTreads, diameter, width, recommendedPressure, brand. Those attributes can all be expressed with base types to make a Wheel object.
Can you group some of those attributes into a more abstract arrangement that you can reuse, independent of a Wheel? I think so. Perhaps create a Dimensions object with the attributes diameter and width. Then your Wheel has a Dimensions object instance associated with it, instead of diameter and width. But you could think about using that Dimensions object with other objects, which may not necessarily be Wheel instances.
Going up the list, you can reduce a Car to be made up of base types, but also other objects, such as Wheel objects. It is sensible to do so, because other motor and non-motor vehicles (such as a Bicycle) also contain Wheel instances.
Abstracting Wheel and Dimensions lets you re-use these object types in different contexts you may not initially consider. It makes your life a little easier because you have less code to rewrite, in theory.
If you can create a hierarchy of objects, to the point where the deepest, lowest-level object is only made up of a few base types, that is probably a good place to start.
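The Wheel/Dimensions decomposition described in this answer could be written down as follows. The attribute names follow the answer (converted to Python naming); the units and the `Pulley` class, included only to show `Dimensions` being reused outside of `Wheel`, are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Dimensions:
    diameter: float  # unit is an assumption, e.g. centimetres
    width: float


@dataclass(frozen=True)
class Wheel:
    dimensions: Dimensions       # grouped attributes instead of two loose floats
    number_of_treads: int
    recommended_pressure: float
    brand: str


@dataclass
class Car:
    wheels: List[Wheel]


@dataclass
class Bicycle:
    wheels: List[Wheel]


@dataclass(frozen=True)
class Pulley:
    """Hypothetical non-wheel object that reuses Dimensions."""
    dimensions: Dimensions
```

The lowest-level object, `Dimensions`, is made up of base types only, and each level above it combines base types with the objects below.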
If it's true that "both program structures would accomplish the goal" equally well, then it doesn't matter which you pick.
If, however, the program does not have a single fixed "goal" but will evolve significantly over its lifetime, then pick either one for now, and refactor as necessary as future modifications dictate. We call it "software" for a reason.
Grow your classes bottom-up.
1) Class boundaries and semantics depend on context. Until you have a context, you don't have anything. (You may not even have a car in your example). Context is given by the user story (or use case).
2) Throw all the state and behavior suggested by the given context into one class (you could name this after the user story if you would like).
3) Use systematic Refactoring to tease this class apart into separate classes. While refactoring, use existing classes as reuse opportunities.
When you're done, you'll have a set of well-defined classes that are just enough to fulfill the needs of the given user story (and the user stories that came before).