Difference between visitor design pattern and depth first search? - oop

A depth-first search seems able to perform similar functions to the visitor design pattern. A visitor allows you to define some data structures and then add operations on those structures (in the form of multiple visitors) as needed, without modifying the structures themselves. A description of the visitor pattern is provided on Wikipedia. If we do a depth-first search (or any other graph search algorithm, like breadth-first search) on the data structure, and every time an element of the structure is found we run our desired operation, then this seems to perform the same function as the visitor. For example, consider a tree: even if some nodes of the tree have different types, we can still check the node type during the DFS and perform a different operation based on that type.

A depth-first search is just that: a search. The Visitor pattern is orthogonal to a depth-first search, in the sense that a Visitor doesn't necessarily care how the tree is traversed; it just knows what it needs to do on/to/with each node.

You can have a Visitor implementation without having DFS. Similarly, you can do a DFS without using the Visitor pattern. They are completely separate.
I happen to agree with the implication that they could be used together in an elegant way.
As a note, for the canonical Visitor pattern the objects being visited need to implement some sort of AcceptVisitor interface -- the clause "without modifying the structures themselves" in your question leads me to question whether you are doing that.
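For reference, here is a minimal sketch of that canonical arrangement, in which the visited objects expose an accept method and the visitor supplies one operation per concrete type. The Shape/Circle/Square names are purely illustrative, not taken from the question:

```java
// Minimal canonical Visitor: illustrative Shape/Circle/Square names.
interface ShapeVisitor {
    void visit(Circle c);
    void visit(Square s);
}

interface Shape {
    void accept(ShapeVisitor v);   // the "AcceptVisitor" hook
}

class Circle implements Shape {
    double radius = 1.0;
    public void accept(ShapeVisitor v) { v.visit(this); } // dispatches on the concrete type
}

class Square implements Shape {
    double side = 2.0;
    public void accept(ShapeVisitor v) { v.visit(this); }
}

// A new operation, added without touching Circle or Square:
class AreaVisitor implements ShapeVisitor {
    double total = 0;
    public void visit(Circle c) { total += Math.PI * c.radius * c.radius; }
    public void visit(Square s) { total += s.side * s.side; }
}

class VisitorDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(), new Square() };
        AreaVisitor area = new AreaVisitor();
        for (Shape s : shapes) s.accept(area);  // the traversal lives outside the visitor
        if (Math.abs(area.total - (Math.PI + 4.0)) > 1e-9) throw new AssertionError();
        System.out.println("total area = " + area.total);
    }
}
```

Note that the loop over `shapes` is the "search"; the visitor only sees one element at a time.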

Let me answer the question in code:
/**
 * This method makes it easier for a class to process all
 * the nodes in an item. The class should implement
 * the NodeVisitor interface. This method will cause the
 * NodeVisitor.processNode() method to be called once for each
 * node in this item.
 * @param visitor the class that will visit each node
 * @throws PipelineException some recoverable exception
 */
public void visitNodes(NodeVisitor visitor) throws PipelineException {
    visitNodes(visitor, root);
}

private void visitNodes(NodeVisitor visitor, Node node) throws PipelineException {
    visitor.processNode(node);
    if (node.hasChildren()) {
        int childCount = node.getChildCount();
        NodeList children = node.getChildren();
        for (int i = 0; i < childCount; i++) {
            visitNodes(visitor, children.get(i));
        }
    }
}

/**
 * Classes that implement this interface can be used with
 * Item.visitNodes(). This interface provides a convenient
 * way to iterate over all the nodes in an item.
 */
public interface NodeVisitor {
    void processNode(Node node) throws PipelineException;
}
In this case the visitNodes() method implements a depth-first search, but it doesn't have to. It could implement any search that hits all nodes. It's the combination of the visitNodes() method and the NodeVisitor interface that comprise the "visitor pattern" (or one particular manifestation of it.)
There's no "tradeoff" between the design pattern and the search algorithm. The design pattern just makes the algorithm easier to use.

In graph theory you can construct a graph such that the minimum spanning tree is not a path produced by a depth-first search; precisely, it's a non-path: https://cs.stackexchange.com/questions/6749/depth-first-search-to-find-minimum-spanning-tree. I don't think you can apply anything like this to a design pattern.

Depth-first search is an "algorithm", while the visitor pattern is a way to set aside algorithmic concerns and focus on the actions to perform.
Actually, the visitor pattern can be a good way to index content, as it provides "structure-agnostic" behaviour (you can change the structure without rewriting the visitors).
But if you want to perform a search, I would not advise using it. Every search algorithm is tied to a particular type of structure (tree, digraph, flow graph, etc.).
In some cases you can achieve a depth-first search with the visitor pattern, but that is not the purpose of the pattern.
The use of the visitor pattern doesn't depend on what kind of parsing you are using, but on what the parsing must be performed for.

The visitor instance may choose to visit child nodes however it wishes and is not bound to the traversal order of depth-first search. (For example, it could use breadth-first search.)
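As an illustration, here is a hypothetical breadth-first counterpart to a visitNodes() method that honours the same visitor contract as the recursive DFS version above. The Node and NodeVisitor types here are simplified stand-ins, not the question's actual classes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Simplified stand-ins for the Node/NodeVisitor types discussed above.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();
    Node(String name) { this.name = name; }
}

interface NodeVisitor {
    void processNode(Node node);
}

class BfsTraversal {
    // Same visitor contract as a DFS version, but breadth-first order.
    static void visitNodes(NodeVisitor visitor, Node root) {
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            visitor.processNode(n);     // the visitor never sees the queue
            queue.addAll(n.children);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("a");
        Node b = new Node("b"), c = new Node("c"), d = new Node("d");
        root.children.add(b); root.children.add(c);
        b.children.add(d);
        StringBuilder order = new StringBuilder();
        visitNodes(n -> order.append(n.name), root);
        // BFS visits "abcd"; a preorder DFS would visit "abdc".
        if (!order.toString().equals("abcd")) throw new AssertionError(order.toString());
        System.out.println(order);
    }
}
```

Swapping the queue for a stack (or recursion) would give depth-first order without changing a single visitor.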
Another thing to mention is the structure getting traversed does not need to be an explicit tree. I've implemented Visitors that iterate over structured data that wasn't tree like at all. I used a visitor in that instance because I could hide the complicated binary format of the file I was parsing and allow clients to control which parts of the structure they wanted to parse without needing them to know the file format specification.

I too go with NRITH's answer, since the visitor pattern knows only 'what to do with the node'; on top of that, Visitor lets you 'define a new operation without changing the classes of the elements on which it operates', while DFS is about how the node search is performed. For performing depth traversal with short-circuiting of branches, there is the Hierarchical Visitor pattern (http://c2.com/cgi/wiki?HierarchicalVisitorPattern).
You can also look into When should I use the Visitor Design Pattern? which talks about the relation between visitor pattern/DFS/Hierarchical visitor pattern.

The visitor pattern as (first?) described in "Design Patterns" by Erich Gamma et al. does not necessarily include the traversal through the data structure in the accept method. Although this is a very convenient combination, there is an explicit example of external iteration at the end of the Sample Code section in the book.
So as others already said, a depth first traversal implemented outside of the accept method could still implement the Visitor Pattern. The question is then, what's the difference between calling element.accept(visitor) which in turn directly calls visitor.visitElement(me) compared to directly calling visitor.visitElement(element)?
I can see only two reasons one may want to do that:
You can not or do not want to find out the concrete class of element, and by just stupidly calling element.accept(visitor), the element itself has to decide, whether visitor.visitElement or e.g. visitor.visitAnotherElement is the right operation to be called.
Some of the elements are composites without external access to the contained inner elements, and the visitor operations are defined for the inner elements. So the accept method would loop over the inner elements and call visitor.visitInnerElement(innerElement). And since you cannot get hold of the inner elements from outside, you also cannot call visitor.visitInnerElement(innerElement) from outside.
In summary: If you have a nicely encapsulated traversal algorithm, where you can pass in a "Visitor"-like class, and which is able to dispatch to the matching visit methods depending on the object type encountered during traversal, you do not need to bother about accept-methods. You will still be able to add new operations by just adding new Visitor implementations, and without touching your data structure nor your traversal algorithm. Whether this should still be called Visitor Pattern is a pretty academic discussion.
I think in most cases bothering with an accept method makes only sense if implementation of the accept method also includes the traversal.

Related

Abstract factory: ways of realization

I'm learning design patterns now, reading different resources for every pattern, and I have a question about the Abstract Factory pattern. I've read about two ways of implementing it. I'll show how the factories are used, without their implementations. For the example, I'm making different doors.
First way: we have a single DoorFactory class with different methods for making the different types of doors (each returning the appropriate door class):
$doorFactory = new DoorFactory();
$door1 = $doorFactory->createWoodDoor();
$doorFactory = new DoorFactory();
$door2 = $doorFactory->createSteelDoor();
Second way: we have a parent class DoorFactory and subclasses WoodDoorFactory and SteelDoorFactory. These classes implement the same createDoor() method (and return the appropriate Door class):
$woodDoorFactory = new WoodDoorFactory();
$door1 = $woodDoorFactory->createDoor();
$steelDoorFactory = new SteelDoorFactory();
$door2 = $steelDoorFactory->createDoor();
Which way do you think is more optimal and canonical?
Please imagine the situation where your factory is passed around, and all the client code needs to ask of the factory is to create a door (it doesn't care about wood or steel); you will see why the 2nd way is better. Let's say we have a class Client with a method foo which uses the factory (I'm using Java, but it should be easy to understand):
class Client {
    private DoorFactory factory;

    public Client(DoorFactory factory) { this.factory = factory; }

    public void foo() {
        Door door = factory.createDoor();
    }
}
Now you can pass a WoodDoorFactory or a SteelDoorFactory or WhateverDoorFactory to the constructor of Client.
Moreover, your 1st way may be a violation of the Single Responsibility Principle, since the DoorFactory class knows many things that are probably unrelated: it knows how to create a wood door, which requires Wood APIs (just an example), and it knows how to create a steel door, which requires Steel APIs. This clearly reduces the opportunity to reuse the DoorFactory class in another environment that does not want to depend on Wood APIs or Steel APIs. This problem does not arise with your 2nd way.
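To illustrate, the whole second-way arrangement might be sketched in Java like this (Door, WoodDoor, and friends are invented names for the example, not a real API):

```java
// Sketch of the "second way": each concrete factory owns one product family.
interface Door { String material(); }
class WoodDoor implements Door { public String material() { return "wood"; } }
class SteelDoor implements Door { public String material() { return "steel"; } }

interface DoorFactory { Door createDoor(); }

class WoodDoorFactory implements DoorFactory {
    public Door createDoor() { return new WoodDoor(); }  // only this class depends on WoodDoor
}

class SteelDoorFactory implements DoorFactory {
    public Door createDoor() { return new SteelDoor(); }
}

class FactoryDemo {
    // The client sees only the DoorFactory interface, never the concrete doors.
    static String build(DoorFactory factory) { return factory.createDoor().material(); }

    public static void main(String[] args) {
        if (!build(new WoodDoorFactory()).equals("wood")) throw new AssertionError();
        if (!build(new SteelDoorFactory()).equals("steel")) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Swapping the factory handed to `build` changes the product without touching the client.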
As with the other answers, I also generally prefer the second method in practice. I've found it to be a more flexible and useful approach for Dependency Injection.
That said, I do think there are cases where the first approach works just as well, if not better; they just aren't as common. An example that comes to mind is an XML Document Object Model. If you have ever used Microsoft's C++ XML DOM Document API, you will be familiar with this approach (see https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms760218(v%3dvs.85)).
In this case, there are a limited number of well-defined elements that can go into an XML document. There isn't a need to be able to dynamically extend the types of elements that can go into an XML document either - this is all predetermined by some standards committee. Therefore, the first factory approach works here, because you can predetermine all the different types of things you need to be able to create from the get-go.
The other advantage, in this case, is that by making the XML Document class the factory for all of the elements contained therein, the XML Document has complete control over the lifetime of those internal objects. NOTE: they disallow using the same sub-elements in multiple XML Document instances. If you wanted to take a node from one XML Document and place it in another, you would be required to go through the second XML Document to produce a new node element, as well as copies of any and all sub-elements.
A notable difference in this case from the example in the OP is that rather than the Factory Method being used to provide multiple ways of creating the same type of thing, here the Factory knows how to create a bunch of highly related (and connected) object types.

Is there a commonly accepted design pattern for base methods implementing "early exit" functionality?

I have a class hierarchy of patterns: patterns are split into simple patterns and compound patterns, both of which have concrete implementations.
Patterns have a Match method which returns a Result, which can be a Node or an Error.
All patterns can check for a memoized result when matching. Simple patterns return an error on EOF.
Is there a pattern that allows a more simple way to reuse implemented functionality than mine? Let's say we're using a single-inheritance, single-dispatch language like C# or Java.
My approach is to implement Match at pattern level only and call a protected abstract method InnerMatch inside it. At simple pattern level, InnerMatch is implemented to handle EOF and calls protected abstract InnerInnerMatch, which is where concrete implementations define their specific functionality.
I find this approach better than adding an out bool handled parameter to Match and calling the base method explicitly in each class, but I don't like how I have to define new methods. Is there a design pattern that describes a better solution?
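For concreteness, the layered template methods described above might look like this in Java. Result is simplified here to a plain String and the memo table is keyed only by position; both are placeholder simplifications, not the asker's actual types:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of layered template methods: Match -> InnerMatch -> InnerInnerMatch.
abstract class Pattern {
    private final Map<Integer, String> memo = new HashMap<>();

    // Public entry point: handles memoization once, for all patterns.
    public final String match(String input, int pos) {
        return memo.computeIfAbsent(pos, p -> innerMatch(input, p));
    }

    protected abstract String innerMatch(String input, int pos);
}

abstract class SimplePattern extends Pattern {
    // Mid-level template method: early exit on EOF, shared by all simple patterns.
    protected final String innerMatch(String input, int pos) {
        if (pos >= input.length()) return "error: EOF";
        return innerInnerMatch(input, pos);
    }

    protected abstract String innerInnerMatch(String input, int pos);
}

class LiteralPattern extends SimplePattern {
    private final char ch;
    LiteralPattern(char ch) { this.ch = ch; }

    // Leaf level: only the pattern-specific logic lives here.
    protected String innerInnerMatch(String input, int pos) {
        return input.charAt(pos) == ch ? "node: " + ch : "error: expected " + ch;
    }
}

class EarlyExitDemo {
    public static void main(String[] args) {
        Pattern p = new LiteralPattern('x');
        if (!p.match("x", 0).equals("node: x")) throw new AssertionError();
        if (!p.match("x", 1).equals("error: EOF")) throw new AssertionError(); // handled mid-layer
        System.out.println("ok");
    }
}
```

Each layer gets exactly one responsibility, at the cost of one extra abstract method per layer, which is the trade-off the question is about.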
Possibly Strategy pattern
The strategy pattern (also known as the policy pattern) is a software design pattern that enables an algorithm's behavior to be selected at runtime. The strategy pattern
defines a family of algorithms,
encapsulates each algorithm, and
makes the algorithms interchangeable within that family.
And perhaps Chain of Responsibility
The chain-of-responsibility pattern is a design pattern consisting of a source of command objects and a series of processing objects. Each processing object contains logic that defines the types of command objects that it can handle; the rest are passed to the next processing object in the chain. A mechanism also exists for adding new processing objects to the end of this chain.
But the Chain of Responsibility would depend more on how you want to handle allowing multiple 'Patterns'(your Objects, not 'design patterns') to be 'processed' in order.
Chain of Responsibility might also be good for allowing you to have dynamic Pattern "sets" that different inputs can be processed with. (Depending on your needs.)
You'll have to encapsulate your input values, but that isn't too big of a deal.
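For reference, here is a minimal Chain of Responsibility sketch, with generic, invented Handler names rather than the asker's Pattern classes:

```java
// Minimal Chain of Responsibility: each handler either handles the
// request or passes it to the next handler in the chain.
abstract class Handler {
    private Handler next;

    Handler setNext(Handler next) { this.next = next; return next; }

    String handle(String request) {
        String result = tryHandle(request);
        if (result != null) return result;                   // handled: stop here
        return next != null ? next.handle(request) : "unhandled";
    }

    protected abstract String tryHandle(String request);     // null means "pass along"
}

class DigitHandler extends Handler {
    protected String tryHandle(String r) {
        return r.chars().allMatch(Character::isDigit) ? "digits" : null;
    }
}

class LetterHandler extends Handler {
    protected String tryHandle(String r) {
        return r.chars().allMatch(Character::isLetter) ? "letters" : null;
    }
}

class ChainDemo {
    public static void main(String[] args) {
        Handler chain = new DigitHandler();
        chain.setNext(new LetterHandler());
        if (!chain.handle("123").equals("digits")) throw new AssertionError();
        if (!chain.handle("abc").equals("letters")) throw new AssertionError();
        if (!chain.handle("a1").equals("unhandled")) throw new AssertionError();
        System.out.println("ok");
    }
}
```

New processing objects can be appended with setNext without touching existing handlers, which is what makes the dynamic "sets" mentioned above possible.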

"Visiting" a lot of different types of object

In one application I have a lot of different objects, let's say: square, circle, etc. (a lot of different shapes; I'm sorry for the trivial example).
From all these objects I want to create documents of different types: XML, TXT, HTML, etc. (e.g. I want to scan the whole object (shape) tree and produce the XML file).
The natural approach I thought of is the visitor pattern; I tried it and it works :-)
- All the objects have one visit method accepting the IVisitor interface.
- I have one concrete visitor for every kind of document I want to create (XmlVisitor, TxtVisitor, etc.). Every visitor has one "visit" method for every kind of object.
My doubt is... it doesn't seem to scale well if I have a lot of objects.
From the logic point of view it's OK: I just have to add the new shape and the corresponding method in each concrete visitor, that's all.
What do you think? Is an alternative possible?
I think that you have correctly implemented a visitor pattern and as a result you also have implemented a double dispatching mechanism. If you consider the "not scaling well" as a need to add a bunch of methods in case of adding a new shape and/or visitor, then it is just a side effect of the pattern. Some people consider this "method explosion" as harmful and opt for a different implementation, such as having a "decision matrix" object. For this example in particular I think that the DD approach is the way to go and that actually it does scale well, since you add new methods as you add new requirements (i.e. you add new visit* methods as new shapes are added or you add a new visitor class as new document types are needed).
HTH
It seems to me that what worries you the most is that you are matching against many different kinds of objects, and worry that as more and more object types are added, the performance will suffer. I don't think you need to worry about that, though: The performance of the visitor pattern is not really affected by the potential number of objects or visitors, since it is based on virtual table lookup - the passed object will contain a link to (a link to) the method which should be called.
So the visitor pattern, though relatively expensive in indirect accesses, is scalable in this regard.
I believe you have :
A class hierarchy (Shapes in your example) and
Operations on the class hierarchy (exportXML, exportToHTML etc in your example)
You have to consider what is more likely to change -
You should choose Visitor pattern if the class hierarchy is more or less fixed but you would like to add more operations in future. Visitor pattern will allow you to add more operations (e.g. JSON export) without touching the existing class hierarchy.
OTOH if the operations are more or less fixed, but more Shape objects can be added, then you should use regular inheritance. Define a ShapeExport interface which has methods like exportToXML, exportToHTML etc. Let all Shapes implement that interface. Now you can add new Shape which implements the same interface without touching existing code.
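A minimal sketch of that ShapeExport approach (the method bodies and shape names here are invented for illustration):

```java
// Inheritance alternative: the operations are fixed in one interface,
// and new shapes implement it without touching existing code.
interface ShapeExport {
    String exportToXML();
    String exportToHTML();
}

class Circle implements ShapeExport {
    public String exportToXML()  { return "<circle/>"; }
    public String exportToHTML() { return "<span>circle</span>"; }
}

// A new shape is added without modifying Circle or any export code:
class Square implements ShapeExport {
    public String exportToXML()  { return "<square/>"; }
    public String exportToHTML() { return "<span>square</span>"; }
}

class ExportDemo {
    public static void main(String[] args) {
        ShapeExport[] shapes = { new Circle(), new Square() };
        StringBuilder xml = new StringBuilder();
        for (ShapeExport s : shapes) xml.append(s.exportToXML());
        if (!xml.toString().equals("<circle/><square/>")) throw new AssertionError();
        System.out.println(xml);
    }
}
```

The trade-off is exactly the one described above: adding a new operation (say, exportToJSON) now means touching every shape, while adding a new shape touches nothing.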

Visitor pattern - adding new ConcreteElement classes is hard?

I read a book about the visitor pattern. It gives the same class diagram as in the oodesign's website.
It says that adding new ConcreteElement classes is hard, but I didn't understand why. As I understand it, the ConcreteVisitor defines the set of operations to be used on the ConcreteElements. So when I add a new element that uses the same operations I defined earlier, I don't need to add anything (just the ConcreteElement itself). If I add a new element that doesn't fit the operations I defined earlier in the visitors, I need to add a new visitor. But that I would have to do with any design pattern.
Well, you have to extend all your visitors.
You have a caller, some elements that need to be visited, and an element - the visitor - that does the processing of the individual elements. Your goal is to keep the implementation of the elements and the caller fixed, and extend functionality via new visitors.
Usually you have a lot of the concrete visitors. If you add a new type of element to be processed, you will need to change all the concrete visitors to take this into account.
Why?
Well, imagine that the caller is "Factory", and you have the elements "Car" and "Bike".
For operation "Paint" you have to have the methods
void process(Car c); // Paint a car
void process(Bike b); // Paint a bike
Likewise for operations "Assemble", "Package", "Wash" etc.
If you add an element "Scooter", all the operations have to be extended with a new method
void process(Scooter s); // Handle a Scooter
This is a bit of work. Also, you may hit the issue where the element you add is so different from the others that you can't easily fit it into the existing operations.
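To make the cost concrete, here is a sketch of the Factory/Car/Bike example as Java interfaces; adding a Scooter would force a new process() overload into every existing operation:

```java
// Sketch of the Factory/Car/Bike example. Each operation (Paint, Wash, ...)
// is one visitor-like implementation of the Operation interface.
class Car {}
class Bike {}

interface Operation {
    String process(Car c);
    String process(Bike b);
    // Adding Scooter means adding `String process(Scooter s);` here,
    // which breaks every existing implementation until it is extended.
}

class Paint implements Operation {
    public String process(Car c)  { return "paint car"; }
    public String process(Bike b) { return "paint bike"; }
}

class Wash implements Operation {
    public String process(Car c)  { return "wash car"; }
    public String process(Bike b) { return "wash bike"; }
}

class OperationsDemo {
    public static void main(String[] args) {
        if (!new Paint().process(new Car()).equals("paint car")) throw new AssertionError();
        if (!new Wash().process(new Bike()).equals("wash bike")) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The upside, as noted below, is that the compiler then points out every operation you forgot to extend.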
Wikipedia (http://en.wikipedia.org/wiki/Visitor_pattern) says
In essence, the visitor allows one to add new virtual functions to a
family of classes without modifying the classes themselves; instead,
one creates a visitor class that implements all of the appropriate
specializations of the virtual function. The visitor takes the
instance reference as input, and implements the goal through double
dispatch.
That's a pretty abstract way of saying what I tried to say above. Usually you'd add these methods to the elements, but if you can't, you have to add the methods to something else and pass that along to do the processing. This is a bit of extra work, but it may be worth it if the situation merits it.
This came up in an SO question just recently. To quote myself from that question, and more specifically from the discussion:
The reason a precondition of not changing the set of entities (classes
you visit) is because it forces you to implement a new VisitXYZ in
each concrete visitor. But I never took much stock in that reasoning
because if you are supporting a persistence visitor, and a text search
visitor, and a print visitor, and a validation visitor and you go and
add a new entity you are going to want to implement all that
functionality anyway. The visitor pattern (with a common base class)
just lets the compiler find the ones you forgot to implement for you.
So yes, it is often said that it's hard to add additional concrete elements (or entities), but in my opinion that is hogwash.
If you add a new concrete element, then all of your visitor classes will need to add a new visit method for the new element. If you didn't use visitors, you would have to add the equivalent methods to your new concrete element anyway.
But adding a new method to each your visitors may be harder than adding the equivalent set of methods to a new element class. The reason is that visitors often need to traverse the element tree structure and may need to manage its own state data as it does so. Adding a new visit method may require modifying that state data which involves thinking about how the new method interacts with existing visit methods for other elements.
It may be simpler to add the equivalent methods to your new element class if you didn't have visitors because you will only need to worry about the internal state of the new concrete element which is more cohesive.
Essentially, the visitor pattern is a kind of data manipulator: it will
Traverse among elements following some rules
Do some calculation and manipulation with the data those elements provide
In one word, the visitor pattern extends the system's functionality without touching the element class definitions.
But does this mean the visitor class must be revised whenever a new concrete element class is added? That depends, I believe, on how the visitor pattern is designed and implemented.
If you separate all the visit methods of the visitor into visit functors and dynamically bind them together, then it might be easier to extend the whole system, for both visitor and visitee.
Here is an implementation of the visitor pattern I wrote several years ago; the code is a bit old and not well polished, but it somehow works :)
https://github.com/tezheng/visitor

Act on base or subclass without RTTI or base class modification

I asked a similar question yesterday that was specific to a technology, but now I find myself wondering about the topic in the broad sense.
For simplicity's sake, we have two classes, A and B, where B is derived from A. B truly "is a" A, and all of the routines defined in A have the same meaning in B.
Let's say we want to display a list of As, some of which are actually Bs. As we traverse our list of As, if the current object is actually a B, we want to display some of B's additional properties... or maybe we just want to color the Bs differently, but neither A nor B has any notion of "color" or "display stuff".
Solutions:
Make the A class semi-aware of B by including a method isB() in A that returns false; B overrides the method and returns true. Display code would then have a check like: if (currentA.isB()) { B b = (B) currentA; ... }
Provide a display() method in A that B can override.... but then we start merging the UI and the model. I won't consider this unless there is some cool trick I'm not seeing.
Use instanceof to check if the current A object to be displayed is really a B.
Just add all the junk from B to A, even though it doesn't apply to A. Or, basically, have A contain a B (one that does not inherit from A) and set it to null until it applies. This is somewhat attractive, and is similar to #1, I guess, with composition over inheritance.
It seems like this particular problem should come up from time to time and have an obvious solution.
So I guess the question maybe really boils down to:
If I have a subclass that extends a base class by adding additional functionality (not just changing the existing behavior of the base class), am I doing something tragically wrong? It all seems to instantly fall apart as soon as we try to act on a collection of objects that may be A or B.
A variant of option 2 (or hybrid of 1 and 2) may make sense: after all, polymorphism is the standard solution to "Bs are As but need to behave differently in situation X." Agreed, a display() method would probably tie the model to the UI too closely, but presumably the different renderings you want at the UI level reflect semantic or behavioural differences at the model level. Could those be captured in a method? For example, instead of an outright getDisplayColour() method, could it be a getPriority() (for example) method, to which A and B return different values but it is still up to the UI to decide how to translate that into a colour?
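As a sketch of that idea, with getPriority() and the colour mapping as hypothetical examples (not from the question):

```java
// The model exposes a semantic value; only the UI maps it to a colour.
class A {
    int getPriority() { return 0; }            // default priority for plain As
}

class B extends A {
    @Override int getPriority() { return 1; }  // Bs are more urgent, say
}

class Ui {
    // The UI owns the colour decision; the model knows nothing about colours.
    static String colourFor(A item) {
        return item.getPriority() > 0 ? "red" : "black";
    }

    public static void main(String[] args) {
        if (!colourFor(new A()).equals("black")) throw new AssertionError();
        if (!colourFor(new B()).equals("red")) throw new AssertionError(); // no instanceof needed
        System.out.println("ok");
    }
}
```

The dispatch happens through ordinary polymorphism, so a future C subclass only has to override getPriority() and the UI code is untouched.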
Given your more general question, however, of "how can we handle additional behaviour that we can't or won't allow to be accessed polymorphically via the base class" (for example, if the base class isn't under our control), your options are probably option 3, the Visitor pattern, or a helper class. In both cases you are effectively farming out the polymorphism to an external entity: in option 3, the UI (e.g. the presenter or controller), which performs an instanceof check and does different things depending on whether it's a B or not; in the Visitor or helper case, the new class. Given your example, Visitor is probably overkill (also, if you were not able/willing to change the base class to accommodate it, it wouldn't be possible to implement, I think), so I'd suggest a simple class called something like "Renderer":
public abstract class Renderer {
    public static Renderer Create(A obj) {
        if (obj instanceof B)
            return new BRenderer();
        else
            return new ARenderer();
    }

    public abstract Color getColor();
}
// implementations of ARenderer and BRenderer per your UI logic
This encapsulates the run-time type checking and bundles the code up into reasonably well-defined classes with clear responsibilities, without the conceptual overhead of Visitor. (Per GrizzlyNyo's answer, though, if your hierarchy or function set is more complex than what you've shown here, Visitor could well be more appropriate, but many people find Visitor hard to get their heads around and I would tend to avoid it for simple situations -- but your mileage may vary.)
The answer given by itowlson covers pretty well most part of the question. I will now deal with the very last paragraph as simply as I can.
Inheritance should be implemented for reuse, so that your derived class can be reused in old code, not so that your class can reuse parts of the base class (you can use aggregation for that).
From that standpoint, if you have a class that is to be used on new code with some new functionality, but should be used transparently as a former class, then inheritance is your solution. New code can use the new functionality and old code will seamlessly use your new objects.
While this is the general intention, there are some common pitfalls; the line here is subtle, and your question is precisely about that line. If you have a collection of objects of type base, that should be because those objects are meant to be used only with base's methods. They are 'bases'; they behave like bases.
Using techniques such as 'instanceof' or downcasts (dynamic_cast<>() in C++) to detect the real runtime type is something I would flag in a code review, and only accept after the programmer had explained in great detail why every other option is worse. I would accept it, for example, in itowlson's answer, under the premise that the information is not available through the operations in base. That is, the base type does not have any method that would give the caller enough information to determine the colour. And consider whether it makes sense to include such an operation: besides the representation colour, are you going to perform any operation on the objects based on that same information? If logic depends on the real type, then the operation should be in the base class, to be overridden in the derived classes. If that is not possible (the operation is new and only applies to some subtypes), there should at least be an operation in the base that lets the caller determine that a downcast will not fail. And then again, I would really require a sound reason for the caller code to need knowledge of the real type. Why does the user want to see the items in different colours? Will the user perform different operations on each of the types?
If you end up needing code that bypasses the type system, your design has a strange smell to it. Of course, never say never, but you can surely say: avoid depending on instanceof or downcasts for logic.
This looks like a textbook case for the Visitor design pattern (also known as "Double Dispatch").
See this answer for link to a thorough explanation on the Visitor and Composite patterns.