Extending a class hierarchy - oop

Recently, I've found a puzzling (to me) problem: Let's say I have a hierarchy of classes C1...Cn. Assume that at least some classes have more than one (direct) child class, but none have more than one parent (i.e. no multiple inheritance). I would like to change the hierarchy's behaviour. My first impulse would be to create subclasses D1...Dn and override methods as necessary, but there is one problem: when calling a newly overridden method, an actual Di may need to be passed as a formal D1 (or some level in between); this can be solved by subclassing C1 -> D1 -> D2 .... But when calling an unchanged method, any actual Di will need to be passed as a formal Ci, so we would have to derive each Di directly from Ci. Is there any elegant or generally accepted way to solve this riddle? If so, is there any way without resorting to multiple inheritance?
If there is no general way to achieve this, can the author of the original C hierarchy follow certain rules to provide this possibility?
For those who prefer a more practical approach, the original hierarchy implements SOAP in Ada. I am working on XML-RPC. From an abstract point of view, SOAP is a superset of XML-RPC, but the actual XML "on the wire" is quite different. In principle, one can perform most of the work by throwing away some of the data types (e.g. XML-RPC has one integer and one floating point type whereas SOAP has several of each), and replacing the routines that convert the remaining types to and from XML.
However, due to the aforementioned inheritance problems, I ended up copying almost the entire SOAP hierarchy. The only code I've been able to re-use properly turned out to be the HTTP part (as it is concerned only with text payloads, not with SOAP objects).
[Edit: Removed a simplifying assumption that would allow for a simple solution not applicable to the more general problem]

Because you assume single inheritance and direct subclass relationships, the solution is simply to create D1 as a subclass of Cn, in which case every D will also be an instance of every C class.
In practice, the solution is not to create deep class hierarchies for no reason. If you're really into small classes, then use a system which accommodates multiple inheritance in some form.

Related

Method requires specific subtype but collection is of base abstract type. What is wrong?

Recently I ran into a situation like this. I'm generalizing the problem because I think it relates more to structural design than to the specific problem.
General problem
There is a hierarchy of classes: an abstract base class Base and some concrete classes D1, D2, D3 that inherit from it. The class A contains a collection of objects of type Base. A requires a computation from some service class B, but the B.process() method accepts only a collection of type D1. Let's say that this is important because, if the input collection contains any other type, the value returned is simply wrong.
A has an interface that allows clients to add elements to the internal collection, which is not exposed in any other way. The classes in the hierarchy are constructed by those same clients, which pass the new values to A; A does not have enough context to construct them itself.
Attempts, questions and thoughts
My major concern was the need to determine at run time the type of each element in A's collection, so that A can filter the right ones and pass them to B.process(). Even if it is possible (it is in my particular problem; more on that later), it just seems wrong! I think an object that holds references to the abstract base class shouldn't have to know the concrete types of the instances it holds.
I tried to:
Change the parameter type to B.process(c: Base[]) so A doesn't have to downcast, but that doesn't solve anything: A still needs to filter the elements or the computation will be wrong.
Pass the complete Base[] collection to B.process(), but that just defers the problem of selection/downcasting to B.
Put a process() method in Base so D1 can override the behavior (plain polymorphism). The problem here is that a process() returning a SomeValue type only makes sense for D1.
Separate the interface that adds elements, so that a more specific A.addD1Element(e: D1) method could put D1 objects in a different collection and pass that to B. It should work, but it also looks... I don't know, weird. If method overloading based on parameter type is possible, at least the process won't be so cumbersome for clients of the class.
Just separate the D1 class from the hierarchy. This is a more aggressive variation of the previous one. The issue is that D1 seems related to the whole hierarchy except for the specific requirements of B.
Those were some of my thoughts on the problem.
As it happens, the language I use supports checking the type of an object at run time (instanceof), and it is easy to filter the collection based on that check. But as I said, my question is more about the paradigm. What about a language, say C++, where such a check is less handy?
So what could be a solution to this kind of problem? What kind of refactoring or design pattern could be applied so that the problem is easy to deal with or simply fades away?
This question looks related, but I believe mine is more general (although I provide a more specific context). The most upvoted answer suggests splitting into different collections. This is also something I'm considering, but it forces a change to A's implementation every time a new type is added.
Context (problem in action)
I'm asking in a general way because the general problem really intrigues me, but I know that most of the time a design can be analyzed only in the context of the particular problem it tries to solve.
The problem at hand is similar to this:
A is a class (some kind of entity, like a DDD entity) that models a sort of agreement or debt a customer incurs for a service. It has different costs, including a monthly payment. Base and its related classes are Payments of different types. They share a lot in common, although most of it is data (date, amount, interest, etc.); but there is at least one type of payment that has different, additional information: the monthly payment (D1). Those payments need to be analyzed carefully, so a different class (B) is responsible for that, using more contextual information and all the payments of that type at once. The service needs the additional data that is specific to those payments, so it cannot receive an abstract Payment type (at least not in that design). Other payments don't have the specific information that MonthlyPayment has, so they cannot produce the values that the business requires and that B generates (those values make no sense for other payment types).
All payments are stored in the same collection so other methods of the class can process all payments in a generic way.
This is mostly the context. I think the design is not the best, but I fail to see a better one.
Maybe separate only MonthlyPayment (D1) into a different collection, as described earlier? But it is not the only payment that requires additional processing (it is the most complex, though), so I could end up with a different collection for every payment type and no hierarchy at all. Right now there are four payment types and two of them require additional, specific analysis, but more types can be added later, and the need to modify the implementation every time a new type is added persists.
Is this more discrete approach, with different collections by type, a better one here? The abstract base class Payment can still be used for payments that can be manipulated through the common interface. I could also use a layer supertype or something like that to allow reuse of common functionality (the language allows a kind of mixin as well) and stop using the base class as the root of a hierarchy.
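To make the separate-collection idea concrete, here is a minimal Java sketch. Every name in it is a hypothetical stand-in (Payment for Base, MonthlyPayment for D1, Agreement for A, MonthlyAnalyzer for B), and the "analysis" is a placeholder; it only shows how an overloaded add method can keep a typed list next to the general one.

```java
import java.util.ArrayList;
import java.util.List;

// All names are hypothetical stand-ins: Payment = Base, MonthlyPayment = D1,
// Agreement = A, MonthlyAnalyzer = B.
abstract class Payment {
    final double amount;
    Payment(double amount) { this.amount = amount; }
}

class OneOffPayment extends Payment {
    OneOffPayment(double amount) { super(amount); }
}

class MonthlyPayment extends Payment {
    final int month; // stands in for the extra data only monthly payments carry
    MonthlyPayment(double amount, int month) {
        super(amount);
        this.month = month;
    }
}

class MonthlyAnalyzer {
    // Accepting List<MonthlyPayment> makes a wrong element type a compile-time error.
    int latestMonth(List<MonthlyPayment> payments) {
        int latest = 0;
        for (MonthlyPayment p : payments) {
            latest = Math.max(latest, p.month);
        }
        return latest;
    }
}

class Agreement {
    private final List<Payment> all = new ArrayList<>();
    private final List<MonthlyPayment> monthly = new ArrayList<>();

    void add(Payment p) { all.add(p); }

    // Overload picked at compile time from the argument's static type.
    void add(MonthlyPayment p) {
        all.add(p);
        monthly.add(p);
    }

    // Generic processing still runs over every payment through the base type.
    double total() {
        double sum = 0;
        for (Payment p : all) {
            sum += p.amount;
        }
        return sum;
    }

    int latestMonth(MonthlyAnalyzer analyzer) { return analyzer.latestMonth(monthly); }
}

class PaymentsDemo {
    public static void main(String[] args) {
        Agreement agreement = new Agreement();
        agreement.add(new OneOffPayment(50.0));
        agreement.add(new MonthlyPayment(100.0, 3));
        System.out.println(agreement.total());
        System.out.println(agreement.latestMonth(new MonthlyAnalyzer()));
    }
}
```

One caveat of this sketch: overload resolution uses the static type, so a MonthlyPayment held in a variable of type Payment would go through the general add and miss the typed list. It also keeps the drawback already noted, namely that Agreement changes whenever a new specially treated type appears.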
Uf. I am sorry for the length of the text. I hope it is at least readable and clear. Thank you very much in advance.

Nested Class - Good Design?

I have a problem where I have a main class which solves a numerical problem. For simplicity, assume that it solves Ax=b. Now, I want to give the user the ability to choose the method used to solve it. There are thousands of options and each option has thousands of options within it.
My idea was to design it as follows: create a main class and then create subclasses for each method and sub-subclasses for the details of each method (which might interact via inheritance).
For instance, I envisage the user doing something like:
Model.method = 'CG', Model.preconditioning = off, and then Model.Solve; in the Model class there is a CG subclass that runs. Within CG there are methods CG_Precond and CG_NoPrecond, one of which runs depending on whether preconditioning is on or off. (Assume that the methods are wildly different.) So, in essence, the user is running Model.CG.CG_NoPrecond.
Is this good design? Should nested classes be avoided?
One important note is that other than the Model class, all of the subclasses contain only methods and no data of their own (other than what is returned).
I spent some time reading some really beautiful answers on SO, and my problem (I believe) aligns with the requirements of the accepted answer to "Why/when should you use nested classes in .net? Or shouldn't you?".
First, you should create a class Solver and use the Strategy Pattern to create subclasses which represent the different methods to solve the problem.
The options and suboptions are harder to get right. If I understood you correctly, CG_Precond and CG_NoPrecond should be subclasses of a CG class (which is itself a subclass of Solver), as they seem to share some inner logic.
If the options are more like predefined values for the different methods, where each method requires different values and types of values, then it becomes more difficult. In that case I would like you to present some more examples of options, suboptions and so on.
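As a rough illustration of that layout, here is a small Java sketch. The names Solver, ConjugateGradient, PlainCG, JacobiCG and Model are all assumptions made for the example, and the Jacobi (diagonal) preconditioner is simply a convenient stand-in for "CG with preconditioning"; the matrix is assumed to be symmetric positive definite.

```java
// Hypothetical sketch: every class name here is illustrative only.
interface Solver {
    double[] solve(double[][] a, double[] b);
}

// CG skeleton shared by both variants; only the preconditioning step differs.
abstract class ConjugateGradient implements Solver {
    protected abstract double[] precondition(double[][] a, double[] r);

    @Override
    public double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[] x = new double[n];
        double[] r = b.clone();              // residual for the initial guess x = 0
        double[] z = precondition(a, r);
        double[] p = z.clone();
        double rz = dot(r, z);
        for (int iter = 0; iter < 10 * n && Math.sqrt(dot(r, r)) > 1e-12; iter++) {
            double[] ap = multiply(a, p);
            double alpha = rz / dot(p, ap);
            for (int i = 0; i < n; i++) {
                x[i] += alpha * p[i];
                r[i] -= alpha * ap[i];
            }
            z = precondition(a, r);
            double rzNext = dot(r, z);
            double beta = rzNext / rz;
            for (int i = 0; i < n; i++) {
                p[i] = z[i] + beta * p[i];
            }
            rz = rzNext;
        }
        return x;
    }

    static double dot(double[] u, double[] v) {
        double sum = 0;
        for (int i = 0; i < u.length; i++) {
            sum += u[i] * v[i];
        }
        return sum;
    }

    static double[] multiply(double[][] a, double[] v) {
        double[] out = new double[v.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = dot(a[i], v);
        }
        return out;
    }
}

class PlainCG extends ConjugateGradient {
    @Override
    protected double[] precondition(double[][] a, double[] r) { return r.clone(); }
}

class JacobiCG extends ConjugateGradient {
    @Override
    protected double[] precondition(double[][] a, double[] r) {
        double[] z = new double[r.length];
        for (int i = 0; i < r.length; i++) {
            z[i] = r[i] / a[i][i];           // divide by the diagonal of A
        }
        return z;
    }
}

// The model owns the data and delegates the algorithm to whichever strategy is set.
class Model {
    private Solver solver = new PlainCG();
    void setSolver(Solver solver) { this.solver = solver; }
    double[] solve(double[][] a, double[] b) { return solver.solve(a, b); }
}

class SolverDemo {
    public static void main(String[] args) {
        Model model = new Model();
        model.setSolver(new JacobiCG());
        double[] x = model.solve(new double[][] {{4, 1}, {1, 3}}, new double[] {1, 2});
        System.out.println(x[0] + ", " + x[1]);
    }
}
```

The user-facing choice then becomes a single call such as model.setSolver(new JacobiCG()), and the solver classes hold no data of their own, which matches the constraint stated in the question.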

can overriding of a method be prevented by downcasting to a superclass?

I'm trying to understand whether the answer to the following question is the same in all major OOP languages; and if not, then how do those languages differ.
Suppose I have class A that defines methods act and jump; method act calls method jump. A's subclass B overrides method jump (i.e., the appropriate syntax is used to ensure that whenever jump is called, the implementation in class B is used).
I have object b of class B. I want it to behave exactly as if it was of class A. In other words, I want the jump to be performed using the implementation in A. What are my options in different languages?
For example, can I achieve this with some form of casting to the superclass (upcasting)? Or perhaps by creating a proxy object that knows which methods to call?
I would want to avoid creating a brand new object of class A and carefully setting up the sharing of internal state between a and b because that's obviously not future-proof, and complicated. I would also want to avoid copying the state of b into a brand new object of class A because there might be a lot of data to copy.
UPDATE
I asked this question specifically about Python; the answers there suggest that this is impossible to achieve in Python, although technically it can be done... kind of.
It appears that apart from technical feasibility, there's a strong argument against doing this from a design perspective. I'm asking about that in a separate question.
The comments reiterated: Prefer composition over inheritance.
Inheritance works well when your subclasses have well defined behavioural differences from their superclass, but you'll frequently hit a point where that model gets awkward or stops making sense. At that point, you need to reconsider your design.
Composition is usually the better solution. Delegating your object's varying behaviour to a different object (or objects) may reduce or eliminate your need for subclassing.
In your case, the behavioural differences between class A and class B could be encapsulated in the Strategy pattern. You could then change the behaviour of class A (and class B, if still required) at the instance level, simply by assigning a new strategy.
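A minimal Java sketch of that idea follows. The names JumpStrategy and Performer are hypothetical, and the strings merely stand in for whatever the two jump implementations really do.

```java
// Hypothetical names: JumpStrategy and Performer are illustrative only.
interface JumpStrategy {
    String jump();
}

class Performer {
    private JumpStrategy jumpStrategy;

    Performer(JumpStrategy jumpStrategy) { this.jumpStrategy = jumpStrategy; }

    // Behaviour can be swapped per instance, with no new object and no copied state.
    void setJumpStrategy(JumpStrategy jumpStrategy) { this.jumpStrategy = jumpStrategy; }

    String act() { return "act, then " + jumpStrategy.jump(); }
}

class StrategyDemo {
    public static void main(String[] args) {
        JumpStrategy plain = () -> "plain jump"; // what class A used to do
        JumpStrategy high = () -> "high jump";   // what class B used to override
        Performer performer = new Performer(high);
        System.out.println(performer.act());
        performer.setJumpStrategy(plain);        // now behaves like the old A
        System.out.println(performer.act());
    }
}
```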
The Strategy pattern may require more code in the short run, but it's clean and maintainable. Method swizzling, monkey patching, and all those cool things that allow us to poke around in our specific language implementation are fun, but the potential for unexpected side effects is high and the code tends to be difficult to maintain.
What you are asking for is not supported by OOP.
If you subclass A with class B and override its methods, then when a concrete instance of B is created, all the overridden or new implementations of the base methods are associated with it (whether we are talking about Java, or C++ with virtual tables, etc.).
You have instantiated an object of class B.
Why would you expect to be able to call the superclass's method if you have overridden it?
You could call it explicitly, of course, e.g. by calling super inside the method, but you cannot do it automatically, and casting will not help you do that either.
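A short Java sketch of both points, using the class names A and B from the question (the method parentJump is a hypothetical addition for the example): the cast changes only the static type, while super reaches the parent implementation from inside B.

```java
class A {
    String jump() { return "A.jump"; }
    String act() { return "act -> " + jump(); } // jump() is dispatched on the run-time type
}

class B extends A {
    @Override
    String jump() { return "B.jump"; }

    // Hypothetical helper: the only place the parent version is reachable is inside B.
    String parentJump() { return super.jump(); }
}

class DispatchDemo {
    public static void main(String[] args) {
        B b = new B();
        A viewedAsA = (A) b;                    // cast changes the static type only
        System.out.println(viewedAsA.act());    // prints "act -> B.jump"
        System.out.println(b.parentJump());     // prints "A.jump"
    }
}
```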
I can't imagine why you would want to do that.
If you need to use class A then use class A.
If you need to override its functionality then use its subclass B.
Most programming languages go to some trouble to support dynamic dispatch of virtual functions (the case of calling the overridden method jump in a subclass instead of the parent class's implementation) -- to the degree that working around it or avoiding it is difficult. In general, specialization/polymorphism is a desirable feature -- arguably a goal of OOP in the first place.
Take a look at the Wikipedia article on Virtual Functions, which gives a useful overview of the support for virtual functions in many programming languages. It will give you a place to start when considering a specific language, as well as the trade-offs to weigh when looking at a language where the programmer can control how dispatch behaves (see the section on C++, for example).
So loosely, the answer to your question is, "No, the behavior is not the same in all programming languages." Furthermore, there is no language independent solution. C++ may be your best bet if you need the behavior.
You can actually do this with Python (sort of), with some awful hacks. It requires that you implement something like the wrappers we were discussing in your first Python-specific question, but as a subclass of B. You then need to implement write-proxying as well (the wrapper object shouldn't contain any of the state normally associated with the class hierarchy; it should redirect all attribute access to the underlying instance of B).
But rather than redirecting method lookup to A and then calling the method with the wrapped instance, you'd call the method passing the wrapper object as self. This is legal because the wrapper class is a subclass of B, so the wrapper instance is an instance of the classes whose methods you're calling.
This would be very strange code, requiring you to dynamically generate classes using both IS-A and HAS-A relationships at the same time. It would probably also end up fairly fragile and have bizarre results in a lot of corner cases (you generally can't write 100% perfect wrapper classes in Python exactly because this sort of strange thing is possible).
I'm completely leaving aside whether this is a good idea or not.

Act on base or subclass without RTTI or base class modification

I asked a similar question yesterday that was specific to a technology, but now I find myself wondering about the topic in the broad sense.
For simplicity's sake, we have two classes, A and B, where B is derived from A. B truly "is a" A, and all of the routines defined in A have the same meaning in B.
Let's say we want to display a list of As, some of which are actually Bs. As we traverse our list of As, if the current object is actually a B, we want to display some of B's additional properties... or maybe we just want to color the Bs differently, but neither A nor B has any notion of "color" or "display stuff".
Solutions:
Make the A class semi-aware of B by including a method called isB() in A that returns false. B will override the method and return true. Display code would have a check like: if (currentA.isB()) { B b = (B) currentA; }
Provide a display() method in A that B can override.... but then we start merging the UI and the model. I won't consider this unless there is some cool trick I'm not seeing.
Use instanceof to check if the current A object to be displayed is really a B.
Just add all the junk from B to A, even though it doesn't apply to A. Basically, just contain a B (that does not inherit from A) in A and set it to null until it applies. This is somewhat attractive. I guess this is similar to #1, with composition over inheritance.
It seems like this particular problem should come up from time to time and have an obvious solution.
So I guess the question maybe really boils down to:
If I have a subclass that extends a base class by adding additional functionality (not just changing the existing behavior of the base class), am I doing something tragically wrong? It all seems to instantly fall apart as soon as we try to act on a collection of objects that may be A or B.
A variant of option 2 (or hybrid of 1 and 2) may make sense: after all, polymorphism is the standard solution to "Bs are As but need to behave differently in situation X." Agreed, a display() method would probably tie the model to the UI too closely, but presumably the different renderings you want at the UI level reflect semantic or behavioural differences at the model level. Could those be captured in a method? For example, instead of an outright getDisplayColour() method, could it be a getPriority() (for example) method, to which A and B return different values but it is still up to the UI to decide how to translate that into a colour?
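As a small Java sketch of that suggestion: the classes Task and UrgentTask are hypothetical stand-ins for A and B, and the threshold and colour names are made up. The model exposes only a priority; the view alone decides what colour that means.

```java
// Hypothetical stand-ins: Task plays the role of A, UrgentTask the role of B.
class Task {
    int getPriority() { return 0; }
}

class UrgentTask extends Task {
    @Override
    int getPriority() { return 10; }
}

class TaskView {
    // UI-side decision: the model knows nothing about colours.
    static String colourFor(Task task) {
        return task.getPriority() >= 5 ? "red" : "black";
    }
}

class PriorityDemo {
    public static void main(String[] args) {
        System.out.println(TaskView.colourFor(new Task()));
        System.out.println(TaskView.colourFor(new UrgentTask()));
    }
}
```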
Given your more general question, however, of "how can we handle additional behaviour that we can't or won't allow to be accessed polymorphically via the base class," for example if the base class isn't under our control, your options are probably option 3, the Visitor pattern or a helper class. In each case you are effectively farming out the polymorphism to an external entity -- in option 3, the UI (e.g. the presenter or controller), which performs an instanceof check and does different things depending on whether it's a B or not; in Visitor or the helper case, the new class. Given your example, Visitor is probably overkill (also, I think it wouldn't be possible to implement if you were not able or willing to change the base class to accommodate it), so I'd suggest a simple class called something like "renderer":
    import java.awt.Color;

    // Keeps the run-time type check in one place instead of scattering it through the UI.
    public abstract class Renderer {
        public static Renderer create(A obj) {
            if (obj instanceof B) {
                return new BRenderer();
            }
            return new ARenderer();
        }

        public abstract Color getColor();
    }
    // implementations of ARenderer and BRenderer per your UI logic
This encapsulates the run-time type checking and bundles the code up into reasonably well-defined classes with clear responsibilities, without the conceptual overhead of Visitor. (Per GrizzlyNyo's answer, though, if your hierarchy or function set is more complex than what you've shown here, Visitor could well be more appropriate, but many people find Visitor hard to get their heads around and I would tend to avoid it for simple situations -- but your mileage may vary.)
The answer given by itowlson covers most of the question pretty well. I will now deal with the very last paragraph as simply as I can.
Inheritance should be implemented for reuse, for your derived class to be reused in old code, not for your class reusing parts of the base class (you can use aggregation for that).
From that standpoint, if you have a class that is to be used on new code with some new functionality, but should be used transparently as a former class, then inheritance is your solution. New code can use the new functionality and old code will seamlessly use your new objects.
While this is the general intention, there are some common pitfalls. The line here is subtle, and your question is about precisely that line. If you have a collection of objects of type base, that should be because those objects are meant to be used only with base's methods. They are 'bases' and behave like bases.
Using techniques such as instanceof or downcasts (dynamic_cast<>() in C++) to detect the real run-time type is something that I would flag in a code review and only accept after having the programmer explain in great detail why any other option is worse than that solution. I would accept it, for example, in itowlson's answer, on the premise that the information is not available through the given operations in base; that is, the base type does not have any method that would offer enough information for the caller to determine the color. And ask whether it really does not make sense to include such an operation: besides the presentation color, are you going to perform any operation on the objects based on that same information? If logic depends on the real type, then the operation should be in the base class, to be overridden in derived classes. If that is not possible (the operation is new and only for some given subtypes), there should at least be an operation in the base to allow the caller to determine that a downcast will not fail. And then again, I would really require a sound reason for the caller code to require knowledge of the real type. Why does the user want to see it in different colors? Will the user perform different operations on each one of the types?
If you end up needing code that bypasses the type system, your design has a strange smell to it. Of course, never say never, but you can surely say: avoid depending on instanceof or downcasts for logic.
This looks like a textbook case for the Visitor design pattern (which implements double dispatch).
See this answer for link to a thorough explanation on the Visitor and Composite patterns.
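For readers who have not met the pattern, a minimal Java sketch follows. PlainEntry and DetailedEntry are hypothetical stand-ins for A and B, and LabelVisitor is an invented example of display logic that lives outside the model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: PlainEntry plays the role of A, DetailedEntry the role of B.
interface EntryVisitor<R> {
    R visitPlain(PlainEntry entry);
    R visitDetailed(DetailedEntry entry);
}

class PlainEntry {
    final String name;
    PlainEntry(String name) { this.name = name; }

    // First dispatch: the run-time type of the entry selects which accept() runs.
    <R> R accept(EntryVisitor<R> visitor) { return visitor.visitPlain(this); }
}

class DetailedEntry extends PlainEntry {
    final String detail;
    DetailedEntry(String name, String detail) {
        super(name);
        this.detail = detail;
    }

    // Second dispatch: this override hands the visitor a fully typed DetailedEntry.
    @Override
    <R> R accept(EntryVisitor<R> visitor) { return visitor.visitDetailed(this); }
}

// Display logic stays outside the model classes.
class LabelVisitor implements EntryVisitor<String> {
    @Override
    public String visitPlain(PlainEntry entry) { return entry.name; }

    @Override
    public String visitDetailed(DetailedEntry entry) {
        return entry.name + " (" + entry.detail + ")";
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        List<PlainEntry> entries = new ArrayList<>();
        entries.add(new PlainEntry("first"));
        entries.add(new DetailedEntry("second", "extra"));
        LabelVisitor labels = new LabelVisitor();
        for (PlainEntry entry : entries) {
            System.out.println(entry.accept(labels));
        }
    }
}
```

Note that this needs an accept method in the base class, so it does not satisfy the "no base class modification" constraint in the title; it trades the instanceof check for that one hook.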

How do you define a Single Responsibility?

I know about "class having a single reason to change". Now, what is that exactly? Are there some smells/signs that could tell that class does not have a single responsibility? Or could the real answer hide in YAGNI and only refactor to a single responsibility the first time your class changes?
The Single Responsibility Principle
There are many obvious cases, e.g. CoffeeAndSoupFactory. Coffee and soup in the same appliance can lead to quite distasteful results. In this example, the appliance might be broken into a HotWaterGenerator and some kind of Stirrer. Then a new CoffeeFactory and SoupFactory can be built from those components and any accidental mixing can be avoided.
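A toy Java sketch of that split, with the class names taken from the paragraph above and everything else (method names, the strings returned) invented for illustration:

```java
// Two small single-purpose components...
class HotWaterGenerator {
    String heat() { return "hot water"; }
}

class Stirrer {
    String stir(String contents) { return "stirred " + contents; }
}

// ...reused by two factories that each have one reason to change.
class CoffeeFactory {
    private final HotWaterGenerator water;
    private final Stirrer stirrer;

    CoffeeFactory(HotWaterGenerator water, Stirrer stirrer) {
        this.water = water;
        this.stirrer = stirrer;
    }

    String make() { return stirrer.stir(water.heat() + " + coffee"); }
}

class SoupFactory {
    private final HotWaterGenerator water;
    private final Stirrer stirrer;

    SoupFactory(HotWaterGenerator water, Stirrer stirrer) {
        this.water = water;
        this.stirrer = stirrer;
    }

    String make() { return stirrer.stir(water.heat() + " + soup powder"); }
}

class FactoryDemo {
    public static void main(String[] args) {
        HotWaterGenerator water = new HotWaterGenerator();
        Stirrer stirrer = new Stirrer();
        System.out.println(new CoffeeFactory(water, stirrer).make());
        System.out.println(new SoupFactory(water, stirrer).make());
    }
}
```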
Among the more subtle cases, the tension between data access objects (DAOs) and data transfer objects (DTOs) is very common. DAOs talk to the database, DTOs are serializable for transfer between processes and machines. Usually DAOs need a reference to your database framework, therefore they are unusable on your rich clients which neither have the database drivers installed nor have the necessary privileges to access the DB.
Code Smells
The methods in a class start to be grouped by areas of functionality ("these are the Coffee methods and these are the Soup methods").
Implementing many interfaces.
Write a brief, but accurate description of what the class does.
If the description contains the word "and" then it needs to be split.
Well, this principle is to be taken with a grain of salt... to avoid class explosion.
A single responsibility does not translate to single method classes. It means a single reason for existence... a service that the object provides for its clients.
A nice way to stay on the road... Use the object-as-person metaphor... If the object were a person, who would I ask to do this? Assign that responsibility to the corresponding class. However, you wouldn't ask the same person to manage your files, compute salaries, issue paychecks, and verify financial records... Why would you want a single object to do all these? (It's okay if a class takes on multiple responsibilities as long as they are all related and coherent.)
If you employ a CRC card, it's a nice subtle guideline. If you're having trouble getting all the responsibilities of that object on a CRC card, it's probably doing too much... a max of 7 would do as a good marker.
Another code smell from the refactoring book would be HUGE classes. Shotgun Surgery, where a single change forces many small edits scattered across the code, is a related one... and so is the case where making a change to one area in a class causes bugs in unrelated areas of the same class...
Finding that you are making changes to the same class for unrelated bug-fixes again and again is another indication that the class is doing too much.
A simple and practical way to check for single responsibility (not only in classes but also in their methods) is the choice of name. When you design a class, if you easily find a name that specifies exactly what it defines, you're on the right track.
Difficulty choosing a name is nearly always a symptom of bad design.
The methods in your class should be cohesive... they should work together and make use of the same data structures internally. If you find you have too many methods that don't seem entirely well related, or seem to operate on different things, then quite likely you don't have a good single responsibility.
Often it's hard to initially find responsibilities, and sometimes you need to use the class in several different contexts and then refactor the class into two classes as you start to see the distinctions. Sometimes you find that it's because you are mixing an abstract and concrete concept together. They tend to be harder to see, and, again, use in different contexts will help clarify.
The obvious sign is when your class ends up looking like a Big Ball of Mud, which is really the opposite of SRP (single responsibility principle).
Basically, all the object's services should be focused on carrying out a single responsibility, meaning every time your class changes and adds a service which does not respect that, you know you're "deviating" from the "right" path ;)
The cause is usually some quick fixes hastily added to the class to repair defects. So the reason why you are changing the class is usually the best criterion for detecting whether you are about to break the SRP.
Martin's Agile Principles, Patterns, and Practices in C# helped me a lot to grasp SRP. He defines SRP as:
A class should have only one reason to change.
So what is driving change?
Martin's answer is:
[...] each responsibility is an axis of change. (p. 116)
and further:
In the context of the SRP, we define a responsibility to be a reason for change. If you can think of more than one motive for changing a class, that class has more than one responsibility (p. 117)
In fact SRP is encapsulating change. If change happens, it should just have a local impact.
Where is YAGNI?
YAGNI can be nicely combined with SRP: When you apply YAGNI, you wait until some change is actually happening. If this happens you should be able to clearly see the responsibilities which are inferred from the reason(s) for change.
This also means that responsibilities can evolve with each new requirement and change. Thinking further, SRP and YAGNI together give you the means to think in terms of flexible designs and architectures.
Perhaps a little more technical than other smells:
If you find you need several "friend" classes or functions, that's usually a good indicator of an SRP violation, because the required functionality is not actually exposed publicly by your class.
If you end up with an excessively "deep" hierarchy (a long list of derived classes until you get to leaf classes) or "broad" hierarchy (many, many classes derived shallowly from a single parent class), it's usually a sign that the parent class does either too much or too little. Doing nothing is the limit of that, and yes, I have seen that in practice, with an "empty" parent class definition just to group together a bunch of unrelated classes in a single hierarchy.
I also find that refactoring to single responsibility is hard. By the time you finally get around to it, the different responsibilities of the class will have become entwined in the client code making it hard to factor one thing out without breaking the other thing. I'd rather err on the side of "too little" than "too much" myself.
Here are some things that help me figure out if my class is violating SRP:
Fill out the XML doc comments on a class. If you use words like if, and, but, except, when, etc., your class is probably doing too much.
If your class is a domain service, it should have a verb in the name. Many times you have classes like "OrderService", which should probably be broken up into "GetOrderService", "SaveOrderService", "SubmitOrderService", etc.
If you end up with MethodA that uses MemberA and MethodB that uses MemberB and it is not part of some concurrency or versioning scheme, you might be violating SRP.
If you notice that you have a class that just delegates calls to a lot of other classes, you might be stuck in proxy class hell. This is especially true if you end up instantiating the proxy class everywhere when you could just use the specific classes directly. I have seen a lot of this. Think ProgramNameBL and ProgramNameDAL classes as a substitute for using a Repository pattern.
I've also been trying to get my head around the SOLID principles of OOD, specifically the single responsibility principle, aka SRP (as a side note the podcast with Jeff Atwood, Joel Spolsky and "Uncle Bob" is worth a listen). The big question for me is: What problems is SOLID trying to address?
OOP is all about modeling. The main purpose of modeling is to present a problem in a way that allows us to understand it and solve it. Modeling forces us to focus on the important details. At the same time we can use encapsulation to hide the "unimportant" details so that we only have to deal with them when absolutely necessary.
I guess you should ask yourself: What problem is your class trying to solve? Has the important information you need to solve this problem risen to the surface? Are the unimportant details tucked away so that you only have to think about them when absolutely necessary?
Thinking about these things results in programs that are easier to understand, maintain and extend. I think this is at the heart of OOD and the SOLID principles, including SRP.
Another rule of thumb I'd like to throw in is the following:
If you feel the need to either write some sort of cartesian product of cases in your test cases, or if you want to mock certain private methods of the class, Single Responsibility is violated.
I recently had this in the following way:
I had a certain abstract syntax tree of a coroutine which will be generated into C later. For now, think of the nodes as Sequence, Iteration and Action. Sequence chains two coroutines, Iteration repeats a coroutine until a user-defined condition is true, and Action performs a certain user-defined action. Furthermore, it is possible to annotate Actions and Iterations with code blocks, which define the actions and conditions to evaluate as the coroutine walks ahead.
It was necessary to apply a certain transformation to all of these code blocks (for those interested: I needed to replace the conceptual user variables with actual implementation variables in order to prevent variable clashes. Those who know Lisp macros can think of gensym in action :) ). Thus, the simplest thing that would work was a visitor which knows the operation internally, traverses all the syntax tree nodes, and on each visit just calls the operation on the annotated code block of the Action or Iteration. However, in this case, I would have had to duplicate the assertion "transformation is applied" in my test code for the visitAction method and the visitIteration method. In other words, I had to check the product of test cases of the responsibilities Traversal (== {traverse iteration, traverse action, traverse sequence}) x Transformation (well, code block transformed, which blew up into iteration transformed and action transformed). Thus, I was tempted to use PowerMock to remove the transformation method and replace it with some 'return "I was transformed!";' stub.
However, according to the rule of thumb, I split the class into a class TreeModifier which contains a NodeModifier-instance, which provides methods modifyIteration, modifySequence, modifyCodeblock and so on. Thus, I could easily test the responsibility of traversing, calling the NodeModifier and reconstructing the tree and test the actual modification of the code blocks separately, thus removing the need for the product tests, because the responsibilities were separated now (into traversing and reconstructing and the concrete modification).
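A rough Java sketch of what such a split might look like. Only the names TreeModifier, NodeModifier, modifyCodeblock, Sequence, Iteration and Action come from the description above; the fields, constructors and the single-method shape of NodeModifier are assumptions, and the instanceof chain is used purely to keep the example short.

```java
interface Node {}

final class Action implements Node {
    final String code;
    Action(String code) { this.code = code; }
    @Override public String toString() { return "Action(" + code + ")"; }
}

final class Iteration implements Node {
    final String condition;
    final Node body;
    Iteration(String condition, Node body) {
        this.condition = condition;
        this.body = body;
    }
    @Override public String toString() { return "Iteration(" + condition + ", " + body + ")"; }
}

final class Sequence implements Node {
    final Node first;
    final Node second;
    Sequence(Node first, Node second) {
        this.first = first;
        this.second = second;
    }
    @Override public String toString() { return "Sequence(" + first + ", " + second + ")"; }
}

// Responsibility 1: what happens to a single code block (assumed single-method shape).
interface NodeModifier {
    String modifyCodeblock(String code);
}

// Responsibility 2: walking the tree and rebuilding it around the modified blocks.
final class TreeModifier {
    private final NodeModifier modifier;

    TreeModifier(NodeModifier modifier) { this.modifier = modifier; }

    Node modify(Node node) {
        if (node instanceof Action) {
            return new Action(modifier.modifyCodeblock(((Action) node).code));
        }
        if (node instanceof Iteration) {
            Iteration iteration = (Iteration) node;
            return new Iteration(modifier.modifyCodeblock(iteration.condition),
                                 modify(iteration.body));
        }
        if (node instanceof Sequence) {
            Sequence sequence = (Sequence) node;
            return new Sequence(modify(sequence.first), modify(sequence.second));
        }
        throw new IllegalArgumentException("unknown node type: " + node);
    }
}

class TreeModifierDemo {
    public static void main(String[] args) {
        Node tree = new Sequence(new Action("a"), new Iteration("c", new Action("b")));
        // A trivial stub modifier is enough to test the traversal on its own.
        System.out.println(new TreeModifier(code -> "<" + code + ">").modify(tree));
    }
}
```

With this shape, the traversal can be tested with a trivial stub modifier, and each real modifier can be tested on plain strings, so no product of test cases is needed.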
It also is interesting to notice that later on, I could heavily reuse the TreeModifier in various other transformations. :)
If you have trouble extending the functionality of a class without fearing that you might break something else, or you cannot use the class without adjusting lots of options that modify its behavior, it smells like your class is doing too much.
Once I was working with a legacy class that had a method "ZipAndClean", which was obviously zipping and cleaning the specified folder...