Method requires specific subtype but collection is of base abstract type. What is wrong? - oop

Recently I ran into a situation like this. I'm generalizing the problem because I think it relates more to structural design than to the specific case.
General problem
There is a hierarchy of classes: an abstract base class Base and some concrete classes D1, D2, D3 that inherit from it. The class A contains a collection of objects of type Base. A requires a computation from a service class B, but the B.process() method accepts only a collection of type D1. Let's say that this is important because if the input collection contains any other type, the returned value is simply wrong.
A has an interface that allows clients to add elements to the internal collection, which is not exposed in any other way. The classes in the hierarchy are constructed by those same clients, which pass the new values to A; A does not have enough context to construct them itself.
Attempts, questions and thoughts
My major concern was the need to determine at runtime the type of each element in A's collection, so that A can filter the right ones and pass them to B.process(). Even if that is possible (it is in my particular problem, more on that later), it just seems wrong! I think an object that holds references to the abstract base class shouldn't have to know the concrete instances it holds.
I tried to:
Change the parameter type to B.process(c: Base[]) so A doesn't have to downcast, but that doesn't solve anything: A still needs to filter the elements or the computation will be wrong.
Pass the complete Base[] collection to B.process(), but that just defers the problem of selection/downcasting to B.
Put a process() method in Base so D1 can override the behavior (well-known polymorphism). The problem here is that a process() returning a SomeValue type only makes sense for D1.
Separate the interface that adds elements, so that a more specific A.addD1Element(e: D1) method puts D1 objects in a different collection that can be passed to B. It should work, but it also looks... I don't know, weird. If the language supports method overloading based on parameter type, at least the process won't be so cumbersome for clients of the class.
Just separate the D1 class from the hierarchy. This is a more aggressive variation of the previous one. The issue is that D1 seems related to the whole hierarchy except for the specific requirements of B.
Those were some of my thoughts on the problem.
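As a rough illustration of the fourth attempt (a more specific add method that also fills a typed collection), here is a minimal Java sketch. All names (Payment, MonthlyPayment, OneOffPayment, Agreement, MonthlyAnalyzer) are hypothetical stand-ins for Base, D1, another subtype, A and B; the interest sum is an invented placeholder for whatever B really computes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plays the role of Base (hypothetical name).
abstract class Payment {
    private final double amount;
    Payment(double amount) { this.amount = amount; }
    double amount() { return amount; }
}

// Plays the role of D1: carries extra data that only B understands.
class MonthlyPayment extends Payment {
    private final double interest;
    MonthlyPayment(double amount, double interest) {
        super(amount);
        this.interest = interest;
    }
    double interest() { return interest; }
}

// Any other subtype without the extra data.
class OneOffPayment extends Payment {
    OneOffPayment(double amount) { super(amount); }
}

// Plays the role of B: its signature only admits the subtype it can analyze.
class MonthlyAnalyzer {
    double totalInterest(List<MonthlyPayment> payments) {
        double sum = 0;
        for (MonthlyPayment p : payments) {
            sum += p.interest();
        }
        return sum;
    }
}

// Plays the role of A.
class Agreement {
    private final List<Payment> all = new ArrayList<>();
    private final List<MonthlyPayment> monthly = new ArrayList<>();

    // Generic entry point for payments that need no special treatment.
    void add(Payment p) { all.add(p); }

    // More specific overload; selected at compile time from the static
    // type of the argument, so no runtime type check is needed.
    void add(MonthlyPayment p) {
        all.add(p);
        monthly.add(p);
    }

    // Generic processing still works over the whole collection.
    double total() {
        double sum = 0;
        for (Payment p : all) {
            sum += p.amount();
        }
        return sum;
    }

    double monthlyInterest(MonthlyAnalyzer analyzer) {
        return analyzer.totalInterest(Collections.unmodifiableList(monthly));
    }
}

class AgreementDemo {
    public static void main(String[] args) {
        Agreement agreement = new Agreement();
        agreement.add(new MonthlyPayment(100.0, 5.0));
        agreement.add(new OneOffPayment(40.0));
        System.out.println(agreement.total());
        System.out.println(agreement.monthlyInterest(new MonthlyAnalyzer()));
    }
}
```

One caveat with this sketch: Java resolves overloads statically, so a MonthlyPayment held in a variable declared as Payment would go through the generic add and never reach the typed list. It also still requires touching Agreement whenever a new specially treated type appears, which is exactly the drawback raised in the question.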
For instance, the language I use supports checking the type of an object at runtime (instanceof), and it is easy to filter the collection based on that check. But as I said, my question is more about the paradigm. What about a language, say C++, where such a check is less handy?
So what could be a solution to this kind of problem? What kind of refactoring or design pattern could be applied so the problem is easy to deal with or simply fades away?
This question looks related, but I believe mine is more general (although I provide a more specific context). The most upvoted answer suggests splitting into different collections. This is also something I'm considering, but it forces a change to A's implementation every time a new type is added.
Context (problem in action)
I'm asking in a general way because that is what really intrigues me, but I know that most of the time a design can be analyzed only in the context of the particular problem it tries to solve.
The problem at hand is similar to this:
A is a class (some kind of entity, like a DDD entity) that models a sort of agreement or debt a customer incurs for a service. It has different costs, including a monthly payment. Base and its related classes are Payments of different types. They share a lot in common, although most of it is data (date, amount, interest, etc.); but there is at least one type of payment that has different, additional information: the monthly payment (D1). Those payments need to be analyzed carefully, so a different class (B) is responsible for that, using more contextual information and all the payments of that type at once. The service needs the additional data that is specific to those payments, so it cannot receive an abstract Payment type (at least not in that design). Other payments don't have the specific information MonthlyPayment does, so they cannot produce the values that the business requires and that B generates (those values make no sense for other payment types).
All payments are stored in the same collection so other methods of the class can process all payments in a generic way.
This is mostly the context. I think the design is not the best, but I fail to see a better one.
Maybe separate only MonthlyPayment (D1) into a different collection, as described earlier? But it is not the only payment that requires additional processing (it is the most complex, though), so I could end up with a different collection for every payment type and no hierarchy at all. Right now there are four payment types and two of them require additional, specific analysis, but more types can be added later, and the issue of having to modify the implementation every time a new type is added persists.
Is this more discrete approach of different collections by type a better one here? The abstract base class Payment can still be used for payments that can be manipulated through the common interface. I could also use a layer supertype or something like that to allow reuse of common functionality (the language supports a kind of mixin as well) and stop using the base class as the root of a hierarchy.
Uf. I am sorry for the length of the text. I hope it is at least readable and clear. Thank you very much in advance.

Related

Class with a list of materials: best practice

I've created the custom class ZMaterial that can be instantiated passing an ID to the constructor which sets the properties for a single material using SELECTs and BAPIs. This class is basically used to READ and UPDATE a single material.
Now I need to create a service to return a list of materials. I already have the procedural code for it in a static method (for now actually a function module), but I would like to keep using a full OOP approach and instantiate a list of my custom material object. The first approach I found is to enhance the static method to instantiate a list of my single material object after the selects are executed and I have the data in internal tables, but it does not seem the most OOP.
The second option in my mind is to create a new class ZMaterialList with one property being a list of objects ZMaterial and then a constructor with the necessary input parameters for the database select. The problem I see with this option is that I create a full class just for the constructor.
What do you think is the best way to proceed?
Create a separate class to produce the list of materials. The single responsibility principle says each class should do exactly one thing. In all but the most simple cases, using a thing is a different responsibility than producing it.
Don’t make a ZMaterialList class. A list’s focus would be managing the list items, i.e. adding, removing, iterating, sorting etc. But you should be fine with a regular STANDARD TABLE OF REF TO ZMaterial.
Make a ZMaterialReader, -Repository, -Query or -Factory class or the like, depending on the precise way you want to produce the ZMaterials. Readers read by keys, repositories read and write, queries use varying sets of selection criteria, factories instantiate with possibly different sets of inputs.
You can well let that class use the original FUNCTION underneath. It’s good style to exploit what’s already there. Just make sure you trust that code, put it in a test harness, and keep it away from the rest of your OO code.
Extract all public interaction of ZMaterial to an interface and use only that interface. That allows you to offer alternative implementations of ZMaterial, ones that differ in the way they are produced or how they store their data.
Split single production from mass production. Reading MARA to retrieve a single material is okay. But you don’t want thousands of ZMaterials reading MARA individually - that wrecks performance.
Now you’ve got the interface, you could offer a second implementation of ZMaterial whose constructor receives all relevant data and relies on it already having been validated to avoid additional SELECTs.
You could also offer an implementation that doesn’t store its data at all but only stores pointers to rows in internal tables somewhere else. See the flyweight pattern for ideas.
If you expect mass updates on the materials, such as “reclassify all of these as B”, consider extracting these list-oriented operations to separate classes as well.
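The separation described in this answer (an interface for the material, a second implementation built from already-validated data, and a separate class that produces lists in one pass) can be sketched as follows. The sketch is in Java rather than ABAP, purely to show the shape of the design; every name (Material, PreloadedMaterial, MaterialQuery, findByPrefix) is hypothetical, and the in-memory Map is an assumed stand-in for the database table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Public interaction extracted to an interface, so that several
// implementations (self-loading, preloaded, flyweight) can coexist.
interface Material {
    String id();
    String description();
}

// Implementation whose constructor receives data that was already read
// and validated, so it performs no database access of its own.
final class PreloadedMaterial implements Material {
    private final String id;
    private final String description;

    PreloadedMaterial(String id, String description) {
        this.id = id;
        this.description = description;
    }

    public String id() { return id; }
    public String description() { return description; }
}

// Producing materials is this class's only responsibility.
final class MaterialQuery {
    private final Map<String, String> table; // stand-in for the DB table
    private int reads = 0;                   // counts simulated round trips

    MaterialQuery(Map<String, String> table) { this.table = table; }

    // Mass production: one pass over the data source for the whole list,
    // instead of one read per material.
    List<Material> findByPrefix(String prefix) {
        reads++;
        List<Material> result = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            if (row.getKey().startsWith(prefix)) {
                result.add(new PreloadedMaterial(row.getKey(), row.getValue()));
            }
        }
        return result;
    }

    int reads() { return reads; }
}
```

Callers receive a plain list of Material references, which corresponds to the advice above to use a regular table of references rather than a dedicated list class.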

Extending a class hierarchy

Recently, I've found a puzzling (to me) problem: Let's say I have a hierarchy of classes C1...C_n. Assume that at least some classes have more than one (direct) child class, but none have more than one parent (i.e. no multiple inheritance). I would like to change the hierarchy's behaviour. My first impulse would be to create subclasses D1...Dn and override methods as necessary, but there is one problem: when calling a newly overridden method, an actual Di may need to be passed as a formal D1 (or some level in between); this can be solved by subclassing C1 -> D1 -> D2 .... But when calling an unchanged method, any actual Di will need to be passed as a formal Ci, so we would have to derive each Di directly from Ci. Is there any elegant or generally accepted way to solve this riddle? If so, is there any way without resorting to multiple inheritance?
If there is no general way to achieve this, can the author of the original C hierarchy follow certain rules to provide this possibility?
For those who prefer a more practical approach, the original hierarchy implements SOAP in Ada. I am working on XML-RPC. From an abstract point of view, SOAP is a superset to XML-RPC, but the actual XML "on the wire" is quite different. In principle, one can perform most of the work by throwing away some of the data types (e.g. XML-RPC has one integer and one floating point type whereas SOAP has several of each), and replacing the routines that convert the remaining types to and from XML.
However, due to the aforementioned inheritance problems, I ended up copying almost the entire SOAP hierarchy. The only code I've been able to re-use properly turned out to be the HTTP part (as it is concerned only with text payloads, not with SOAP objects).
[Edit: Removed a simplifying assumption that would allow for a simple solution not applicable to the more general problem]
The solution, because you assume single inheritance, and direct subclass relationships, is to just create D(1) as a subclass of C(n), in which case all Ds will also be in every C class.
In practice, the solution is not to create deep class hierarchies for no reason. If you're really into small classes, then use a system which accommodates multiple inheritance in some form.

"Visitor" a lot of different type of object

In one application I have a lot of different objects, let's say: square, circle, etc. ... a lot of different shapes. I'm sorry for the trivial example.
With all these objects I want to create documents of different types: xml, txt, html, etc. (e.g. I want to scan the whole tree of objects (shapes) and produce the xml file).
The natural approach I thought of is the visitor pattern. I tried it and it works :-)
- all the objects have one visit method accepting the IVisitor interface.
- I have one concrete visitor for every kind of document I want to create (XmlVisitor, TxtVisitor, etc). Every visitor has one "visit" method for every kind of object.
My doubt is ... it doesn't seem to scale well if I have a lot of objects ...
From the logical point of view it's OK: I just have to add the new shape and the corresponding method in each concrete visitor, that's all.
What do you think? Is an alternative possible?
I think that you have correctly implemented a visitor pattern and as a result you also have implemented a double dispatching mechanism. If you consider the "not scaling well" as a need to add a bunch of methods in case of adding a new shape and/or visitor, then it is just a side effect of the pattern. Some people consider this "method explosion" as harmful and opt for a different implementation, such as having a "decision matrix" object. For this example in particular I think that the DD approach is the way to go and that actually it does scale well, since you add new methods as you add new requirements (i.e. you add new visit* methods as new shapes are added or you add a new visitor class as new document types are needed).
HTH
It seems to me that what worries you the most is that you are matching against many different kinds of objects, and worry that as more and more object types are added, the performance will suffer. I don't think you need to worry about that, though: The performance of the visitor pattern is not really affected by the potential number of objects or visitors, since it is based on virtual table lookup - the passed object will contain a link to (a link to) the method which should be called.
So the visitor pattern, though relatively expensive in indirect accesses, is scalable in this regard.
I believe you have :
A class hierarchy (Shapes in your example) and
Operations on the class hierarchy (exportXML, exportToHTML etc in your example)
You have to consider what is more likely to change -
You should choose Visitor pattern if the class hierarchy is more or less fixed but you would like to add more operations in future. Visitor pattern will allow you to add more operations (e.g. JSON export) without touching the existing class hierarchy.
OTOH if the operations are more or less fixed, but more Shape objects can be added, then you should use regular inheritance. Define a ShapeExport interface which has methods like exportToXML, exportToHTML etc. Let all Shapes implement that interface. Now you can add new Shape which implements the same interface without touching existing code.
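To make the Visitor side of this trade-off concrete, here is a minimal Java sketch. The names (Shape, Circle, Square, ShapeVisitor, XmlExport, TxtExport) and the output formats are assumptions made for illustration, not the asker's actual classes.

```java
// One visit method per concrete shape; a new export format is a new
// implementation of this interface and touches no shape class.
interface ShapeVisitor<R> {
    R visitCircle(Circle circle);
    R visitSquare(Square square);
}

interface Shape {
    <R> R accept(ShapeVisitor<R> visitor);
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }

    // Second half of the double dispatch: the shape picks the method.
    public <R> R accept(ShapeVisitor<R> visitor) {
        return visitor.visitCircle(this);
    }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }

    public <R> R accept(ShapeVisitor<R> visitor) {
        return visitor.visitSquare(this);
    }
}

final class XmlExport implements ShapeVisitor<String> {
    public String visitCircle(Circle c) {
        return "<circle r=\"" + c.radius + "\"/>";
    }
    public String visitSquare(Square s) {
        return "<square side=\"" + s.side + "\"/>";
    }
}

final class TxtExport implements ShapeVisitor<String> {
    public String visitCircle(Circle c) { return "circle " + c.radius; }
    public String visitSquare(Square s) { return "square " + s.side; }
}

class ShapeExportDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.5), new Square(2.0) };
        for (Shape shape : shapes) {
            System.out.println(shape.accept(new XmlExport()));
            System.out.println(shape.accept(new TxtExport()));
        }
    }
}
```

Adding a JSON export here means writing one new class. Adding a Triangle means adding a method to ShapeVisitor and to every existing visitor, which is the cost the answers above describe.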

Act on base or subclass without RTTI or base class modification

I asked a similar question yesterday that was specific to a technology, but now I find myself wondering about the topic in the broad sense.
For simplicity's sake, we have two classes, A and B, where B is derived from A. B truly "is a" A, and all of the routines defined in A have the same meaning in B.
Let's say we want to display a list of As, some of which are actually Bs. As we traverse our list of As, if the current object is actually a B, we want to display some of Bs additional properties....or maybe we just want to color the Bs differently, but neither A nor B have any notion of "color" or "display stuff".
Solutions:
Make the A class semi-aware of B by basically including a method called isB() in A that returns false. B will override the method and return true. Display code would have a check like: if (currentA.isB()) { B b = (B) currentA; ... }
Provide a display() method in A that B can override.... but then we start merging the UI and the model. I won't consider this unless there is some cool trick I'm not seeing.
Use instanceof to check if the current A object to be displayed is really a B.
Just add all the junk from B to A, even though it doesn't apply to A. Basically just contain a B (that does not inherit from A) in A and set it to null until it applies. This is somewhat attractive. This is similar to #1 I guess w/ composition over inheritance.
It seems like this particular problem should come up from time to time and have an obvious solution.
So I guess the question maybe really boils down to:
If I have a subclass that extends a base class by adding additional functionality (not just changing the existing behavior of the base class), am I doing something tragically wrong? It all seems to instantly fall apart as soon as we try to act on a collection of objects that may be A or B.
A variant of option 2 (or hybrid of 1 and 2) may make sense: after all, polymorphism is the standard solution to "Bs are As but need to behave differently in situation X." Agreed, a display() method would probably tie the model to the UI too closely, but presumably the different renderings you want at the UI level reflect semantic or behavioural differences at the model level. Could those be captured in a method? For example, instead of an outright getDisplayColour() method, could it be a getPriority() (for example) method, to which A and B return different values but it is still up to the UI to decide how to translate that into a colour?
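A minimal Java sketch of that idea, assuming a hypothetical Priority enum and a hypothetical ColourScheme class on the UI side (neither appears in the question): the model exposes a semantic property polymorphically, and only the UI knows about colours.

```java
// Semantic property owned by the model (hypothetical).
enum Priority { NORMAL, HIGH }

class A {
    Priority getPriority() { return Priority.NORMAL; }
}

class B extends A {
    // B differs from A in a model-level sense; no UI concept leaks in.
    @Override
    Priority getPriority() { return Priority.HIGH; }
}

// UI-side translation; the model never sees this class.
final class ColourScheme {
    static String colourFor(A item) {
        switch (item.getPriority()) {
            case HIGH:
                return "red";
            default:
                return "black";
        }
    }
}

class PriorityDemo {
    public static void main(String[] args) {
        A[] items = { new A(), new B() };
        for (A item : items) {
            System.out.println(ColourScheme.colourFor(item));
        }
    }
}
```

The display code never asks whether an item is a B, so a later subclass with its own priority needs no change in the UI beyond, possibly, a new colour mapping.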
Given your more general question, however, of "how can we handle additional behaviour that we can't or won't allow to be accessed polymorphically via the base class," for example if the base class isn't under our control, your options are probably option 3, the Visitor pattern or a helper class. In both cases you are effectively farming out the polymorphism to an external entity -- in option 3, the UI (e.g. the presenter or controller), which performs an instanceof check and does different things depending on whether it's a B or not; in Visitor or the helper case, the new class. Given your example, Visitor is probably overkill (also, if you were not able/willing to change the base class to accommodate it, it wouldn't be possible to implement it I think), so I'd suggest a simple class called something like "renderer":
// Color is whatever colour type your UI toolkit provides (e.g. java.awt.Color)
public abstract class Renderer {
    // Factory method: the only place that inspects the runtime type.
    public static Renderer create(A obj) {
        if (obj instanceof B) {
            return new BRenderer();
        } else {
            return new ARenderer();
        }
    }

    public abstract Color getColor();
}
// implementations of ARenderer and BRenderer per your UI logic
This encapsulates the run-time type checking and bundles the code up into reasonably well-defined classes with clear responsibilities, without the conceptual overhead of Visitor. (Per GrizzlyNyo's answer, though, if your hierarchy or function set is more complex than what you've shown here, Visitor could well be more appropriate, but many people find Visitor hard to get their heads around and I would tend to avoid it for simple situations -- but your mileage may vary.)
The answer given by itowlson covers most of the question pretty well. I will now deal with the very last paragraph as simply as I can.
Inheritance should be implemented for reuse, for your derived class to be reused in old code, not for your class reusing parts of the base class (you can use aggregation for that).
From that standpoint, if you have a class that is to be used on new code with some new functionality, but should be used transparently as a former class, then inheritance is your solution. New code can use the new functionality and old code will seamlessly use your new objects.
While this is the general intention, there are some common pitfalls. The line here is subtle, and your question is about precisely that line. If you have a collection of objects of type base, that should be because those objects are meant to be used only with base's methods. They are 'bases', and behave like bases.
Using techniques such as 'instanceof' or downcasts (dynamic_cast<>() in C++) to detect the real runtime type is something that I would flag in a code review and only accept after having the programmer explain in great detail why every other option is worse than that solution. I would accept it, for example, in itowlson's answer under the premise that the information is not available through the given operations in base. That is, the base type does not have any method that would offer enough information for the caller to determine the color. And if it does not make sense to include such an operation: besides the presentation color, are you going to perform any operation on the objects based on that same information? If logic depends on the real type, then the operation should be in the base class, to be overridden in derived classes. If that is not possible (the operation is new and only for some given subtypes), there should at least be an operation in the base to allow the caller to determine that a downcast will not fail. And then again, I would really require a sound reason for the caller code to require knowledge of the real type. Why does the user want to see it in different colors? Will the user perform different operations on each one of the types?
If you end up needing code that bypasses the type system, your design has a strange smell to it. Of course, never say never, but you can surely say: avoid depending on instanceof or downcasts for logic.
This looks like a textbook case for the Visitor design pattern (also known as "Double Dispatch").
See this answer for link to a thorough explanation on the Visitor and Composite patterns.

How to solve cross references in OOP?

I have encountered this a couple of times now, and I wondered what the OO way to solve circular references is. By that I mean class A has class B as a member, and B in turn has class A as a member.
One example of this would be class Person that has Person spouse as a member.
Person jack = new Person("Jack");
Person jill = new Person("Jill");
jack.setSpouse(jill);
jill.setSpouse(jack);
Another example would be Product classes that have some Collection of other Products as a member. That collection could, for example, hold products that people who are interested in this product might also be interested in, and we want to maintain that list on a per-product basis, not based on some shared attributes (e.g. we don't want to just display all other products in the same category).
Product pc = new Product("pc");
Product monitor = new Product("monitor");
Product tv = new Product("tv");
// assumes setSeeAlso takes a java.util.List<Product> (Java 9+ for List.of)
pc.setSeeAlso(List.of(monitor, tv));
monitor.setSeeAlso(List.of(pc));
tv.setSeeAlso(null);
(these products are just for making a point, the issue is not about whether or not certain products would relate to each other)
Would this be bad design in OOP in general? Would/should all OOP languages allow this, or is it just bad practice? If it's bad practice, what would be the nicest way of solving this?
The examples you give are (to me, anyway) examples of reasonable OO design.
The cross-referencing issue you describe isn't an artifact of any design process but a real-life characteristic of the things you're representing as objects, so I don't see there's a problem.
What have you encountered that has given you the impression that this approach is bad design?
Update 11 March:
In systems that lack garbage collection, where memory management is explicitly managed, one common approach is to require all objects to have an owner - some other object responsible for managing the lifetime of that object.
One example is Delphi's TComponent class, which provides cascading support - destroy the parent component, and all owned components are also destroyed.
If you're working on such a system, the kinds of referential loop described in this question may be considered poor design because there's no clear owner, no one object responsible for managing lifetimes.
The way that I've seen this handled in some systems is to retain the references (because they properly capture the business concerns), and to add in an explicit TransactionContext object that owns everything loaded into the business domain from the database. This context object takes care of knowing which objects need to be saved, and cleans everything up when processing is complete.
It's not a fundamental problem in OO design. An example of a time it might become a problem is in graph traversal, for instance, finding the shortest path between two objects - you could potentially get into an infinite loop. However, that's something you would have to consider on a case-by-case basis. If you know there could be cross-references in a case like that, then code some checks in to avoid infinite loops (for instance, maintaining a set of visited nodes to avoid re-visiting). But if there's no reason it could be a problem (such as in the examples you gave in your question), then it's not bad at all to have such cross-references. And in many cases, as you've described, it's a good solution to the problem at hand.
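The visited-set check mentioned in this answer can be sketched in Java as below. Product and its seeAlso list loosely follow the question's example; SeeAlsoWalker and reachableFrom are hypothetical names introduced for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class Product {
    final String name;
    final List<Product> seeAlso = new ArrayList<>();
    Product(String name) { this.name = name; }
}

final class SeeAlsoWalker {
    // Breadth-first walk over "see also" links. The visited set makes the
    // walk terminate even when products reference each other in a cycle.
    static List<String> reachableFrom(Product start) {
        List<String> names = new ArrayList<>();
        Set<Product> visited = new HashSet<>();
        Queue<Product> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            Product current = queue.remove();
            names.add(current.name);
            for (Product next : current.seeAlso) {
                // add() returns false if the product was already seen.
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return names;
    }
}

class SeeAlsoDemo {
    public static void main(String[] args) {
        Product pc = new Product("pc");
        Product monitor = new Product("monitor");
        Product tv = new Product("tv");
        pc.seeAlso.add(monitor);
        pc.seeAlso.add(tv);
        monitor.seeAlso.add(pc); // cycle: pc -> monitor -> pc
        System.out.println(SeeAlsoWalker.reachableFrom(pc));
    }
}
```

Product does not override equals or hashCode here, so the set compares by object identity, which is what a cycle check over object references needs.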
I do not think this is an example of cross referencing.
Cross referencing usually pertains to this case:
class A
{
    public void MethodA(B objectB)
    {
        objectB.SomeMethodInB();
    }
}
class B
{
    public void MethodB(A objectA)
    {
        objectA.SomeMethodInA();
    }
}
In this case each object kind of "reaches in" to each other; A calls B, B calls A, and they become tightly coupled. This is made even worse if A and B are in different packages/namespaces/assemblies; in many cases those would create compile time errors as assemblies are compiled linearly.
The way to solve that is to have either object implement an interface with the desired method.
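One way to read that suggestion, sketched in Java: A talks to an interface rather than to B, so only B depends on the concrete A. The interface name Notifiable and the small state fields are hypothetical additions that exist only to make the effect observable.

```java
// Abstraction that A depends on instead of the concrete class B.
interface Notifiable {
    void someMethodInB();
}

class A {
    private int calls = 0;

    // A no longer mentions B, so it could live in a module that is
    // compiled before B's module.
    void methodA(Notifiable target) {
        target.someMethodInB();
    }

    void someMethodInA() { calls++; }

    int calls() { return calls; }
}

class B implements Notifiable {
    private boolean notified = false;

    public void someMethodInB() { notified = true; }

    // B still knows A directly; the concrete dependency is now one-way.
    void methodB(A objectA) {
        objectA.someMethodInA();
    }

    boolean notified() { return notified; }
}
```

In a multi-module build, the interface would sit with A (or in a shared module), which removes the compile-time cycle between the two modules.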
In your case you only have one level of "reaching in":
public class Person
{
    public void setSpouse(Person person)
    { ... }
}
I do not think this is unreasonable, nor even a case of cross-referencing/circular references.
The main time this is a problem is if it becomes too confusing to cope with, or maintain, as it can become a form of spaghetti code.
However, to touch on your examples;
See Also is perfectly valid if this is a feature you need in your code - it is a simple list of pointers (or references) to other items a user may be interested in.
Similarly, it is perfectly valid to add spouse, as this is a simple real-world relationship that would not be confusing to someone maintaining your code.
I have always seen it as a potential code smell, or perhaps a warning to take a step back and rationalise what I am doing.
As for some systems finding recursive relationships in your code (mentioned in a comment above), these can come up regardless of this sort of design. I have recently worked on a metadata capture system that had recursive 'types' of relationships - i.e Columns being logically related to other columns. It needs to be handled by the code trying to parse your system.
I don't think the circular references as such are a problem.
However, putting all those relationships inside objects may add too much clutter, so you may instead want to represent them externally. E.g. you might use a hash table to store relationships between products.
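A small Java sketch of such an external representation, assuming a hypothetical SeeAlsoIndex class that owns the relationships while Product itself stays free of them:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// The product carries no references to other products.
final class Product {
    final String name;
    Product(String name) { this.name = name; }
}

// Hash table holding the "see also" relationships outside the objects.
final class SeeAlsoIndex {
    private final Map<Product, Set<Product>> related = new HashMap<>();

    void relate(Product from, Product to) {
        related.computeIfAbsent(from, key -> new LinkedHashSet<>()).add(to);
    }

    // Read-only view; products without entries yield an empty set.
    Set<Product> seeAlso(Product product) {
        Set<Product> found = related.getOrDefault(product, Collections.emptySet());
        return Collections.unmodifiableSet(found);
    }
}
```

The cycle between pc and monitor still exists as data in the index, but neither object holds a reference to the other.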
Referencing other objects is not bad OO design at all. What matters is the way state is managed within each object.
A good rule of thumb is the Law of Demeter. A good introduction to LoD is the paper known as "Paperboy and the wallet".
One way to fix this is to refer to the other object via an id.
e.g.
Person jack = new Person(new PersonId("Jack"));
Person jill = new Person(new PersonId("Jill"));
jack.setSpouse(jill.getId());
jill.setSpouse(jack.getId());
I'm not saying it is a perfect solution, but it will prevent circular references. You are using an identifier object instead of a direct object reference to model the relationship.
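For the id-based approach to be usable, something has to turn an id back into an object. A minimal Java sketch follows, where PersonRegistry and spouseOf are hypothetical additions and PersonId is given value equality so it can serve as a map key:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Value object: two ids with the same text are equal.
final class PersonId {
    private final String value;

    PersonId(String value) { this.value = Objects.requireNonNull(value); }

    @Override
    public boolean equals(Object other) {
        return other instanceof PersonId && ((PersonId) other).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public String toString() { return value; }
}

final class Person {
    private final PersonId id;
    private PersonId spouseId; // null while no spouse is set

    Person(PersonId id) { this.id = id; }

    PersonId getId() { return id; }

    void setSpouse(PersonId spouseId) { this.spouseId = spouseId; }

    Optional<PersonId> getSpouseId() { return Optional.ofNullable(spouseId); }
}

// Resolves ids to objects; the only place holding Person references.
final class PersonRegistry {
    private final Map<PersonId, Person> people = new HashMap<>();

    void register(Person person) { people.put(person.getId(), person); }

    // Empty when no spouse is set or the id is not registered.
    Optional<Person> spouseOf(Person person) {
        return person.getSpouseId().map(people::get);
    }
}
```

The object graph is now acyclic, at the price of a lookup step and the possibility of a dangling id when the referenced person is never registered.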