Implementing similar UseCases looks like code duplication - oop

I have the following case. User can export several object types (transaction, invoice, etc) to external accounting system.
Export algorithm has steps:
fetch objects by some filter
export objects one by one to the accounting system (web service method per object type)
register the fact that given document was exported, so it wont be exported again
prepare summary for user (num of exported documents, error messages etc)
The algorithm is the same for all object types but there are some important differences which must be handled:
different types
different target web service method, different object to DTO mappings
different filters per object type
I've considered a few solutions:
don't treat the export algorithm as code duplication and implement an algorithm per object type. Export of any data to any external system may be described by such algorithm - does it mean that we should always have one general class to export anything to anywhere?:)
move the differences to strategies (one strategy interface to create abstraction for all differences) - I even implemented it.
use generics - unfortunately I'm coding in PHP and it's not possible
The question:
Is creating a separate export algorithm per object type a code duplication?
Maybe all of them should be treated as separate Use Cases?
If it's a duplication then what techniques to avoid it should I consider?
Description of my first implementation:
In first approach I defined an Exportable abstraction but I was not happy about it. Each object has completely different payload.
An Exportable interface defined only one method getId and it was used to register that object was exported (and thanks to that wont be exported again).
For this purpose the abstraction was fine, but the problem was moved to exportService which had to check the concrete instance to choose DTO mapper and endpoint. So the exportService broke SOLID.

None of the things you have described above are domain-specific logic (and in fact you don't even mention the problem domain in your question), so I don't think it really falls under domain-driven design. Because it's not domain-specific logic I wouldn't worry too much about code duplication, especially considering that the solution doesn't seem obvious.
Keep it simple and just write out each use case separately. If you find that there's common code that's easily refactored, do so after you get everything working smooth. Don't overthink it or add patterns before they are obviously necessary.

Related

Deciding extent of coupling

I have a Component which has API exposed with some 10 functionality in all. I can think of two ways to achieve it:
Give out all these functionality as separate functions.
Expose only one function which takes an XML as input. Based on request_Type specified and the parameters passed in the XML, I internally call one of the respective functions.
Q1. Will the second design be more loosely coupled than the first ?
I always read about how I should try my components to be loosely coupled, should I really go to this extent to achieve lose coupling ?
Q2. Which one of these would be a better design in terms of OOP and why?
Edit:
If I am exposing this API over D-Bus for others to use, will type checking still be a consideration to compare the two approaches? From what I understand type checking is done at compile time, but in case when this function is exposed over some IPC, issue of type checking comes into picture ?
The two alternatives you propose do not differ in the (obviously quite large) number of "functions" you want to offer from your API. However, the second seems to have many disadvantages because you are loosing any strong type checking, it will become much harder to document the functionality etc. (The only advantage I see is that you don't need to change your API if you add functionality. But at the disadvantage that users will not be able to figure out API changes like deleted functions until run-time.)
What is more related with this question is the Single Responsiblity Principle (http://en.wikipedia.org/wiki/Single_responsibility_principle). As you are talking about OOP, you should not expose your tens of functions within one class but split them among different classes, each with a single responsibility. Defining good "responsibilities" and roles requires some practice, but following some basic guidelines will help you to get started quickly. See Are there any rules for OOP? for a good starting point.
Reply to the question edit
I haven't used D-Bus, so this might be totally wrong. But from a quick look at the tutorial I read
Each object supports one or more interfaces. Think of an interface as
a named group of methods and signals, just as it is in GLib or Qt or
Java. Interfaces define the type of an object instance.
DBus identifies interfaces with a simple namespaced string, something
like org.freedesktop.Introspectable. Most bindings will map these
interface names directly to the appropriate programming language
construct, for example to Java interfaces or C++ pure virtual classes.
As far as I understand, D-Bus has the concept of differnt objects which provide interfaces consisting of several methods. This means (to me) that my answer above still applies. The "D-Bus native" way of specifying your API would mean to exhibit interfaces and I don't see any reason why good OOP design guidelines shouldn't be valid, here. As D-Bus seems to map these even to native language constructs, this is even more likely.
Of course, nobody keeps you from just building your own API description language in XML. However, things like are some kind of abuse of underlying techniques. You should have good reasons for doing such things.

Why would I create an interface for each mapper class?

In cases of MVC applications where the model is split into separate domain and mapper layers, why would you give each of the mapper classes its own interface?
I have seen a few examples now, some from well respected developers such as the case with this blog, http://site.svn.dasprids.de/trunk/application/modules/blog/models/
I suspect that its because the developers are expecting the code to be re-used by others who may have their own back-ends. Is this the case? Or am I missing something?
Note that in the examples I have seen, developers are not necessarily creating interfaces for the domain objects.
Since interfaces are contracts between classes (I'm kinda assuming that you already know that). When a class expects you to pass an object with as specific interface, the goal is to inform you, that this class instance expect specific method to be executable on said object.
The only case that i can think of, when having a defined interface for data mappers make sense might be when using unit of work to manage the persistence. But even then it would make more sense to simply inject a factory, that can create data mappers.
TL;DR: someone's been overdoing.
P.S.: it is quite possible, that I am completely wrong about this one, since I'm a bit biased on the subject - my mappers contain only 3 (+constructor) public methods: fetch(), store() and remove() .. though names method names tend to change. I prefer to take the retrieval conditions from domain object, as described here.

How do I refactor a class with lots of operations which all require its internal data?

My Problem
I have a class with just a few fields but which represents a relatively complicated data structure. This class is central in my program and over time I found myself adding more and more functionality into it, making things a mess. Since (almost) all of its methods rely on its internal fields, I could not think of a way to move some of the methods elsewhere, even though most methods are independent of each other. How can I refactor this class to make it simpler and reduce the number of methods which are directly implemented in it?
More Information
The class in question represents a sort of automaton. It supports a ton of operations such as retrieving information about it, performing various binary operations between it and other automata, querying for specific information stored inside it, saving it to file, etc. Almost all of these operations depend on the precise implementation of the class - in my specific case I maintain an edge-set-based implementation, but other implementations were also used in the past and might be used again in the future.
Except for a narrow set of basic helper methods which are commonly used, most methods are independent of each other.
The language I am using is Java, but I'm hoping for general answers which could be applied to any statically-typed, object-oriented language.
What I've Tried
I tried refactoring it somehow to multiple types, but each of its operations require access to most of its fields, and I'm hesitant about migrating these operations elsewhere because I can't think of a way to do that without exposing the class's implementation.
I'm also not sure where I should migrate the operations to, assuming they are indeed independent of the implementation. An external utility class? An abstract base type? Will appreciate any input about this.
Perhaps you could remodel the data that your class holds, so that instead of holding the data directly, it holds objects that hold the data? Then you could move the methods that manipulate that data into the new classes, leaving the original class as a sort of container / dispatcher class.

What functionality to build into business objects?

What functionality do you think should be built into a persistable business object at bare minimum?
For example:
validation
a way to compare to another object of the same type
undo capability (the ability to roll-back changes)
The functionality dictated by the domain & business.
Read Domain Driven Design.
A persistable business object should consist of the following:
Data
New
Save
Delete
Serialization
Deserialization
Often, you'll abstract the functionality to retrieve them into a repository that supports:
GetByID
GetAll
GetByXYZCriteria
You could also wrap this type of functionality into collection classes (e.g. BusinessObjectTypeCollection), however there's a lot of movement towards using the Repository Pattern in Domain Driven Design to provide these type of accessors (e.g. InvoicingRepository.GetAllCustomers, InvoicingRepository.GetAllInvoices).
You could put the business rules in the New, Save, Update, Delete ... but sometimes you could have an external business rules engine that you pass off the objects to.
This is just one piece of an answer, but I would say that you need a way to get to all objects with which this object has a relationship. In the beginning you may try to be smart and only include one-way navigability for some relationships, but I have found that this is usually more trouble than it's worth.
All persistent frameworks also include finders, ways to do cascading deletes... sorts....
Once you start modeling, all business objects should know how to manage themselves. Whenever you find another class referring TO your business object too much, it's usually time to push that behavior into the business object itself.
Of the three things noted in the question, I would say that validation is the only one that is truly required. The others depend on the overall archetecture of the application.
Also, the business rules should be in the business objects.
Whether an object should do its own serialization is an interesting question. I have had great success in the past by having each object handle its own serialization, but I can also see merit in having a serialization module load and save the business objects just the same way as the GUI writes to and reads from the objects. Then your validation will protect against errors in the database or files too.
I can't think of anything else that is required in general.

Dealing with "global" data structures in an object-oriented world

This is a question with many answers - I am interested in knowing what others consider to be "best practice".
Consider the following situation: you have an object-oriented program that contains one or more data structures that are needed by many different classes. How do you make these data structures accessible?
You can explicitly pass references around, for example, in the constructors. This is the "proper" solution, but it means duplicating parameters and instance variables all over the program. This makes changes or additions to the global data difficult.
You can put all of the data structures inside of a single object, and pass around references to this object. This can either be an object created just for this purpose, or it could be the "main" object of your program. This simplifies the problems of (1), but the data structures may or may not have anything to do with one another, and collecting them together in a single object is pretty arbitrary.
You can make the data structures "static". This lets you reference them directly from other classes, without having to pass around references. This entirely avoids the disadvantages of (1), but is clearly not OO. This also means that there can only ever be a single instance of the program.
When there are a lot of data structures, all required by a lot of classes, I tend to use (2). This is a compromise between OO-purity and practicality. What do other folks do? (For what it's worth, I mostly come from the Java world, but this discussion is applicable to any OO language.)
Global data isn't as bad as many OO purists claim!
After all, when implementing OO classes you've usually using an API to your OS. What the heck is this if it isn't a huge pile of global data and services!
If you use some global stuff in your program, you're merely extending this huge environment your class implementation can already see of the OS with a bit of data that is domain specific to your app.
Passing pointers/references everywhere is often taught in OO courses and books, academically it sounds nice. Pragmatically, it is often the thing to do, but it is misguided to follow this rule blindly and absolutely. For a decent sized program, you can end up with a pile of references being passed all over the place and it can result in completely unnecessary drudgery work.
Globally accessible services/data providers (abstracted away behind a nice interface obviously) are pretty much a must in a decent sized app.
I must really really discourage you from using option 3 - making the data static. I've worked on several projects where the early developers made some core data static, only to later realise they did need to run two copies of the program - and incurred a huge amount of work making the data non-static and carefully putting in references into everything.
So in my experience, if you do 3), you will eventually end up doing 1) at twice the cost.
Go for 1, and be fine-grained about what data structures you reference from each object. Don't use "context objects", just pass in precisely the data needed. Yes, it makes the code more complicated, but on the plus side, it makes it clearer - the fact that a FwurzleDigestionListener is holding a reference to both a Fwurzle and a DigestionTract immediately gives the reader an idea about its purpose.
And by definition, if the data format changes, so will the classes that operate on it, so you have to change them anyway.
You might want to think about altering the requirement that lots of objects need to know about the same data structures. One reason there does not seem to be a clean OO way of sharing data is that sharing data is not very object-oriented.
You will need to look at the specifics of your application but the general idea is to have one object responsible for the shared data which provides services to the other objects based on the data encapsulated in it. However these services should not involve giving other objects the data structures - merely giving other objects the pieces of information they need to meet their responsibilites and performing mutations on the data structures internally.
I tend to use 3) and be very careful about the synchronisation and locking across threads. I agree it is less OO, but then you confess to having global data, which is very un-OO in the first place.
Don't get too hung up on whether you are sticking purely to one programming methodology or another, find a solution which fits your problem. I think there are perfectly valid contexts for singletons (Logging for instance).
I use a combination of having one global object and passing interfaces in via constructors.
From the one main global object (usually named after what your program is called or does) you can start up other globals (maybe that have their own threads). This lets you control the setting up of program objects in the main objects constructor and tearing them down again in the right order when the application stops in this main objects destructor. Using static classes directly makes it tricky to initialize/uninitialize any resources these classes use in a controlled manner. This main global object also has properties for getting at the interfaces of different sub-systems of your application that various objects may want to get hold of to do their work.
I also pass references to relevant data-structures into constructors of some objects where I feel it is useful to isolate those objects from the rest of the world within the program when they only need to be concerned with a small part of it.
Whether an object grabs the global object and navigates its properties to get the interfaces it wants or gets passed the interfaces it uses via its constructor is a matter of taste and intuition. Any object you're implementing that you think might be reused in some other project should definately be passed data structures it should use via its constructor. Objects that grab the global object should be more to do with the infrastructure of your application.
Objects that receive interfaces they use via the constructor are probably easier to unit-test because you can feed them a mock interface, and tickle their methods to make sure they return the right arguments or interact with mock interfaces correctly. To test objects that access the main global object, you have to mock up the main global object so that when they request interfaces (I often call these services) from it they get appropriate mock objects and can be tested against them.
I prefer using the singleton pattern as described in the GoF book for these situations. A singleton is not the same as either of the three options described in the question. The constructor is private (or protected) so that it cannot be used just anywhere. You use a get() function (or whatever you prefer to call it) to obtain an instance. However, the architecture of the singleton class guarantees that each call to get() returns the same instance.
We should take care not to confuse Object Oriented Design with Object Oriented Implementation. Al too often, the term OO Design is used to judge an implementation, just as, imho, it is here.
Design
If in your design you see a lot of objects having a reference to exactly the same object, that means a lot of arrows. The designer should feel an itch here. He should verify whether this object is just commonly used, or if it is really a utility (e.g. a COM factory, a registry of some kind, ...).
From the project's requirements, he can see if it really needs to be a singleton (e.g. 'The Internet'), or if the object is shared because it's too general or too expensive or whatsoever.
Implementation
When you are asked to implement an OO Design in an OO language, you face a lot of decisions, like the one you mentioned: how should I implement all the arrows to the oft used object in the design?
That's the point where questions are addressed about 'static member', 'global variable' , 'god class' and 'a-lot-of-function-arguments'.
The Design phase should have clarified if the object needs to be a singleton or not. The implementation phase will decide on how this singleness will be represented in the program.
Option 3) while not purist OO, tends to be the most reasonable solution. But I would not make your class a singleton; and use some other object as a static 'dictionary' to manage those shared resources.
I don't like any of your proposed solutions:
You are passing around a bunch of "context" objects - the things that use them don't specify what fields or pieces of data they are really interested in
See here for a description of the God Object pattern. This is the worst of all worlds
Simply do not ever use Singleton objects for anything. You seem to have identified a few of the potential problems yourself