What design pattern is used by IProject.setDescription in Eclipse - api

I'm designing an API with a specific pattern in mind, but don't know if this pattern has a name. It's similar to the Command pattern in GoF (Gang of Four) but not exactly.
One simple example of it I can find is in Eclipse, where you manipulate a project (IProject) not by calling methods on the project that change its state, but by this three-step process:
extracting its state into a descriptor object (IProjectDescription) with getDescription
setting properties on the descriptor. E.g. setName
applying the descriptor back to the original project with setDescription
The general principle seems to be that you have a complex object as part of a framework with many potentially interdependent properties, and rather than working directly on that object, one property at a time, you extract the properties into a simple data object, manipulate that, and apply it back.
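To make the three steps concrete, here is a minimal sketch in Java. Foo, FooState and their properties are hypothetical names made up for illustration; this is not how Eclipse implements IProject or IProjectDescription, just the same extract / edit / apply shape.

```java
import java.util.Objects;

// Hypothetical plain data object holding the externally meaningful state.
final class FooState {
    private String name;
    private int priority;

    FooState(String name, int priority) {
        this.name = Objects.requireNonNull(name);
        this.priority = priority;
    }

    String getName() { return name; }
    void setName(String name) { this.name = Objects.requireNonNull(name); }
    int getPriority() { return priority; }
    void setPriority(int priority) { this.priority = priority; }
}

// Hypothetical framework object; its internal representation stays private.
final class Foo {
    private String name = "untitled";
    private int priority = 0;

    // Step 1: hand out a detached copy of the state.
    FooState getState() {
        return new FooState(name, priority);
    }

    // Step 3: validate the whole descriptor first, then apply it in one go.
    void setState(FooState state) {
        if (state.getName().isEmpty() || state.getPriority() < 0) {
            throw new IllegalArgumentException("invalid state; nothing was changed");
        }
        this.name = state.getName();
        this.priority = state.getPriority();
    }
}

final class DescriptorDemo {
    public static void main(String[] args) {
        Foo foo = new Foo();
        FooState state = foo.getState(); // step 1: extract
        state.setName("renamed");        // step 2: edit the copy
        state.setPriority(5);
        foo.setState(state);             // step 3: apply
        System.out.println(foo.getState().getName());
    }
}
```

Because setState checks every property before assigning any of them, a bad descriptor leaves the object untouched, which is the "all in one hit" behaviour the approach makes possible.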
It has some of the attributes of the Command pattern, in that the data object encapsulates all of the changes like a Command would - but it's not really a Command, because you don't execute it on the object, it's simply a representation of the state of the object.
It also has some attributes of a Transactional API, in that, by making the changes all in one hit with the set... call, you allow for the entire modification to effectively "roll back" if any one property change fails. But while that's an advantage of the approach, it's not really the main purpose of it. And what's more, you can achieve the transactional nature without this approach, by simply adding transactional methods to the API (like commit and rollback).
There are two advantages in this pattern that I do want to exploit, although I don't see them being exploited by the Eclipse example above:
1. You can represent the meaningful state of the underlying object while its implementation changes. This is useful for upgrading, or for copying state between different types of representations. Say I release a new version of my API where I create an object Foo2 which is a totally new form of my old Foo1, but both have the same basic properties. To upgrade a Foo1 to a Foo2, I can extract those properties as a FooState: foo2.setFooState(foo1.getFooState()), as simple as that. The way in which the properties are interpreted and represented is encapsulated in the Foos and can be totally different.
2. I can persist and transmit the state of the underlying object with my simple data object, where persisting the object itself would be much more complex. So I can extract the state of Foo as a FooState, persist it as a simple XML document, and later apply it to some new object by "loading" it and applying it. Or I can transmit the FooState simply to a web service as a JSON object, whereas the Foo itself is too big and complex to transmit. (Or the objects on each end of the service call are entirely different, like Foo1 and Foo2.)
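The first advantage could look roughly like the sketch below. The two properties (a name and a width) and the internal representations are assumptions invented for the example; only the names Foo1, Foo2, FooState, getFooState and setFooState come from the description above.

```java
// Hypothetical shared state object: only the properties both versions agree on.
final class FooState {
    private final String name;
    private final int widthMm;

    FooState(String name, int widthMm) {
        this.name = name;
        this.widthMm = widthMm;
    }

    String getName() { return name; }
    int getWidthMm() { return widthMm; }
}

// Old implementation: stores the properties more or less as they are.
final class Foo1 {
    private String name = "";
    private int widthMm;

    FooState getFooState() { return new FooState(name, widthMm); }

    void setFooState(FooState state) {
        name = state.getName();
        widthMm = state.getWidthMm();
    }
}

// New implementation: same meaningful properties, different representation.
final class Foo2 {
    private char[] nameChars = new char[0];
    private double widthMeters;

    FooState getFooState() {
        return new FooState(new String(nameChars), (int) Math.round(widthMeters * 1000.0));
    }

    void setFooState(FooState state) {
        nameChars = state.getName().toCharArray();
        widthMeters = state.getWidthMm() / 1000.0;
    }
}
```

Neither class knows about the other; the upgrade is just foo2.setFooState(foo1.getFooState()), and the same FooState could equally be written to XML or JSON for the second advantage.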
Anyway, I can't find a name or example of this pattern anywhere, neither in the Gang of Four design patterns, nor even in Martin Fowler's comprehensive "bliki".

The Data Transfer Object (DTO) that Martin Fowler describes in his book Patterns of Enterprise Application Architecture seems to serve the purpose you describe in point 2.
A DTO is a fairly simple extraction of the more complex Domain Model that it represents.
Fowler describes how a DTO can be combined with an assembler to keep the DTO independent of the actual Domain Object (or Objects) that it is supposed to represent. The assembler knows how to create a DTO from the Domain Object and vice versa. He also mentions that the DTO needs to be serializable in order to persist or transmit its state. What you describe in point 2 seems to match this description.
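A rough sketch of the DTO plus assembler arrangement, in Java. Customer, CustomerDto and CustomerAssembler are hypothetical names, and splitting the full name on the first space is a simplification chosen for the example, not something taken from Fowler's text.

```java
import java.io.Serializable;

// Hypothetical domain object; it knows nothing about the DTO.
final class Customer {
    private final long id;
    private final String firstName;
    private final String lastName;

    Customer(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    long getId() { return id; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

// The DTO: flat, serializable, no behaviour, and unaware of Customer.
final class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;
    long id;
    String fullName;
}

// The assembler is the only class that depends on both sides.
final class CustomerAssembler {
    private CustomerAssembler() {}

    static CustomerDto toDto(Customer customer) {
        CustomerDto dto = new CustomerDto();
        dto.id = customer.getId();
        dto.fullName = customer.getFirstName() + " " + customer.getLastName();
        return dto;
    }

    static Customer fromDto(CustomerDto dto) {
        // Simplifying assumption: everything after the first space is the last name.
        String[] parts = dto.fullName.split(" ", 2);
        String lastName = parts.length > 1 ? parts[1] : "";
        return new Customer(dto.id, parts[0], lastName);
    }
}
```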
What you've described in point 1 though does not seem to be an intended purpose, but definitely seems achievable using this pattern.
I'm not sure if you went through the Pattern catalog of his book or the book itself. The book itself describes this in much greater detail.
You may also want to have a look at Transfer Object definition from Oracle which Fowler says here is what he describes as DTO.

Not every design is documented as a single design pattern; in fact, most system designs are combinations of multiple patterns.
However, one part of what you're doing with IProjectDescription is using a Memento, although yours seems to be a polymorphic variation. Consider patterns as they appear in pattern catalogues to be pared down to the essential starting point, not the end result. Patterns are by their very nature supposed to be extended and combined.
The Command pattern can give you commit and rollback (do/undo), and combining it with Memento in that way is quite a common approach. The same thing is seen in the Java Servlet API with HttpServletRequest and HttpServletResponse.
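As an illustration of that combination, here is a small Java sketch in which a command takes a memento before it runs and uses it to undo. TextDocument and AppendCommand are hypothetical names; this shows the general shape only, not any particular framework's implementation.

```java
// Originator: the only class that can look inside its own memento.
final class TextDocument {
    private String text = "";

    // Memento: an opaque snapshot of the document's state.
    static final class Memento {
        private final String text;

        private Memento(String text) { this.text = text; }
    }

    Memento save() { return new Memento(text); }
    void restore(Memento memento) { text = memento.text; }
    void append(String suffix) { text = text + suffix; }
    String getText() { return text; }
}

// Command: captures a memento on execute() so that undo() can roll back.
final class AppendCommand {
    private final TextDocument document;
    private final String suffix;
    private TextDocument.Memento before;

    AppendCommand(TextDocument document, String suffix) {
        this.document = document;
        this.suffix = suffix;
    }

    void execute() {
        before = document.save();
        document.append(suffix);
    }

    void undo() {
        if (before != null) {
            document.restore(before);
            before = null;
        }
    }
}
```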

Related

Preserve Whole Object VS Don't Look For Things

I was reading Fowler's Refactoring Book and saw Preserve Whole Object. A different, newer opinion says that this refactoring is the exact opposite of what you should do: The Clean Code Talks - Don't Look For Things!.
Fowler does mention that you should look to see if the method can just be moved to the class which uses the large list of arguments. I think that would be the only reasonable alternative. This refactoring seems like a band-aid for a poorly defined method.
The Fowler source material is a bit dated. Is the prevailing wisdom to let this technique go the way of the dodo or is there actually a case when you'd want to do this kind of refactoring? Or have I misunderstood the test-driven style because those examples deal with object construction, not message sending?
There are many concepts in the Object Oriented Design such as Patterns, Principles and Practices that may seem to be either similar or contradictory at first. In fact, most of them are neither similar nor contradictory. And the thing that makes them different and consistent is their intent.
The seeming contradiction between the Preserve Whole Object refactoring and the Service Locator pattern mentioned in The Clean Code Talks video arises when they are treated as the same concept, although they differ in their intent and their essence.
The Preserve Whole Object refactoring is simply a technique used to make code easier to read, understand and maintain by reducing the number of arguments to a function. The Service Locator, on the other hand, is a design pattern that is used to manage dependencies between different components in a system using the Inversion of Control concept. Unlike the Preserve Whole Object refactoring technique which has local effect on a system, that is applied to a small part of the system (a function), the Service Locator pattern has a global effect on the system and addresses a bigger architectural problem (Dependency Management).
When to use the Preserve Whole Object refactoring?
Use the Preserve Whole Object refactoring when you have two or more arguments to a function which are basically the properties of one object; pass the object instead.
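A before/after sketch in Java, loosely modelled on the kind of temperature-range example used to explain this refactoring. TempRange and HeatingPlan are hypothetical classes written for this illustration.

```java
final class TempRange {
    private final int low;
    private final int high;

    TempRange(int low, int high) {
        this.low = low;
        this.high = high;
    }

    int getLow() { return low; }
    int getHigh() { return high; }
}

final class HeatingPlan {
    private final TempRange allowed;

    HeatingPlan(TempRange allowed) {
        this.allowed = allowed;
    }

    // Before: the caller has to pull low and high out of its range object.
    boolean withinRange(int low, int high) {
        return low >= allowed.getLow() && high <= allowed.getHigh();
    }

    // After: the caller passes the whole object it already holds.
    boolean withinRange(TempRange observed) {
        return withinRange(observed.getLow(), observed.getHigh());
    }
}
```

A caller goes from plan.withinRange(room.getLow(), room.getHigh()) to plan.withinRange(room). Both overloads are kept here only to show the two forms side by side.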
There is a similar concept called Parameter Object (aka Argument Object; see the Introduce Parameter Object refactoring), which states that if you have a group of parameters that are not the properties of one object but are conceptually related to each other or naturally go together, you should wrap them in a class of their own and pass an instance of that class instead. It is mainly used when sending a message to an object.
A quote from Clean Code, Chapter 3: Functions, Function Arguments, page 43 (Robert C. Martin):
Argument Objects
When a function seems to need more than two or three arguments, it is likely that some of
those arguments ought to be wrapped into a class of their own. Consider, for example, the
difference between the two following declarations:
Circle makeCircle(double x, double y, double radius);
Circle makeCircle(Point center, double radius);
Reducing the number of arguments by creating objects out of them may seem like
cheating, but it’s not. When groups of variables are passed together, the way x and
y are in the example above, they are likely part of a concept that deserves a name of its
own.
When to use the Service Locator pattern?
Use the Service Locator pattern when your class has dependencies that are not conceptually related and you do not want your class to depend on concrete implementations. Actually, this is when you would want to use any of the Dependency Management approaches. Another alternative is the Dependency Injection approach, which explicitly specifies all the dependencies as separate arguments to the constructor, whereas the Service Locator passes all the dependencies in a single container object. In fact, it is this very similarity between the Service Locator pattern and the Preserve Whole Object refactoring, combining the arguments in a single object, that serves as a source of confusion. The Dependency Management techniques are mainly used in object construction.
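The difference between the two Dependency Management approaches can be sketched as follows. Clock, Mailer, the two reminder classes and this tiny ServiceLocator are all hypothetical, written only to contrast the two constructor shapes; a real locator or DI container would do considerably more.

```java
import java.util.HashMap;
import java.util.Map;

interface Clock { long now(); }

interface Mailer { void send(String to, String body); }

// Dependency Injection: each dependency is a separate, explicit constructor argument.
final class InjectedReminder {
    private final Clock clock;
    private final Mailer mailer;

    InjectedReminder(Clock clock, Mailer mailer) {
        this.clock = clock;
        this.mailer = mailer;
    }

    void remind(String to) {
        mailer.send(to, "It is " + clock.now());
    }
}

// A minimal locator: one container object that hands out services by type.
final class ServiceLocator {
    private final Map<Class<?>, Object> services = new HashMap<>();

    <T> void register(Class<T> type, T implementation) {
        services.put(type, implementation);
    }

    <T> T get(Class<T> type) {
        Object service = services.get(type);
        if (service == null) {
            throw new IllegalStateException("no service registered for " + type.getName());
        }
        return type.cast(service);
    }
}

// Service Locator: the constructor receives the container and looks things up itself.
final class LocatingReminder {
    private final Clock clock;
    private final Mailer mailer;

    LocatingReminder(ServiceLocator locator) {
        this.clock = locator.get(Clock.class);
        this.mailer = locator.get(Mailer.class);
    }

    void remind(String to) {
        mailer.send(to, "It is " + clock.now());
    }
}
```

Note that the LocatingReminder constructor signature no longer says which services the class needs, which is the trade-off the linked discussions argue about.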
There are pros and cons to both approaches of the Dependency Management which are discussed in the Inversion of Control Containers and the Dependency Injection pattern article by Martin Fowler.
When to use both?
Sometimes there will be situations where your class has two or more dependencies that are conceptually related, and you might want to combine them in a single object and pass it as a dependency using the Service Locator. So, as you can see, these two concepts are not mutually exclusive.

Object persistence terminology: 'repository' vs. 'store' vs. 'context' vs. 'retriever' vs. (...)

I'm not sure how to name data store classes when designing a program's data access layer (DAL).
(By data store class, I mean a class that is responsible to read a persisted object into memory, or to persist an in-memory object.)
It seems reasonable to name a data store class according to two things:
what kinds of objects it handles;
whether it loads and/or persists such objects.
⇒ A class that loads Banana objects might be called e.g. BananaSource.
I don't know how to go about the second point (ie. the Source bit in the example). I've seen different nouns apparently used for just that purpose:
repository: this sounds very general. Does this denote something read-/write-accessible?
store: this sounds like something that potentially allows write access.
context: sounds very abstract. I've seen this with LINQ and object-relational mappers (ORMs).
P.S. (several months later): This is probably appropriate for containers that contain "active" or otherwise supervised objects (the Unit of Work pattern comes to mind).
retriever: sounds like something read-only.
source & sink: probably not appropriate for object persistence; a better fit with data streams?
reader / writer: quite clear in its intention, but sounds too technical to me.
Are these names arbitrary, or are there widely accepted meanings / semantic differences behind each? More specifically, I wonder:
What names would be appropriate for read-only data stores?
What names would be appropriate for write-only data stores?
What names would be appropriate for mostly read-only data stores that are occasionally updated?
What names would be appropriate for mostly write-only data stores that are occasionally read?
Does one name fit all scenarios equally well?
As no one has yet answered the question, I'll post what I have decided on in the meantime.
Just for the record, I have pretty much decided on calling most data store classes repositories. First, it appears to be the most neutral, non-technical term from the list I suggested, and it seems to be well in line with the Repository pattern.
Generally, "repository" seems to fit well where data retrieval/persistence interfaces are something similar to the following:
public interface IRepository<TResource, TId>
{
    int Count { get; }
    TResource GetById(TId id);
    IEnumerable<TResource> GetManyBySomeCriteria(...);
    TId Add(TResource resource);
    void Remove(TId id);
    void Remove(TResource resource);
    ...
}
Another term I have decided on using is provider, which I'll be preferring over "repository" whenever objects are generated on-the-fly instead of being retrieved from a persistence store, or when access to a persistence store happens in a purely read-only manner. (Factory would also be appropriate, but sounds more technical, and I have decided against technical terms for most uses.)
P.S.: Some time has gone by since writing this answer, and I've had several opportunities at work to review someone else's code. One term I've thus added to my vocabulary is Service, which I am reserving for SOA scenarios: I might publish a FooService that is backed by a private Foo repository or provider. The "service" is basically just a thin public-facing layer above these that takes care of things like authentication, authorization, or aggregating / batching DTOs for proper "chunkiness" of service responses.
Well, to add something to your conclusion:
A repository: is meant to care about only one entity and follows certain patterns, like the interface you showed.
A store: is allowed to do a bit more, including working with other entities.
A reader/writer: is split so that you can show semantically, and inject into other classes, only the reading or only the writing functionality. It comes from the CQRS pattern.
A context: is more or less bound to an ORM, as you mentioned, and is usually used under the hood of a repository or store. Some use it directly instead of building a repository on top, but it's harder to abstract.
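The reader/writer split might look like this in Java. The Banana class is borrowed from the question's example; BananaReader, BananaWriter, InMemoryBananaStore and BananaLabelPrinter are hypothetical names, and the in-memory map only stands in for a real persistence store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class Banana {
    private final int id;
    private final String variety;

    Banana(int id, String variety) {
        this.id = id;
        this.variety = variety;
    }

    int getId() { return id; }
    String getVariety() { return variety; }
}

// Read side and write side are separate interfaces.
interface BananaReader {
    Optional<Banana> getById(int id);
}

interface BananaWriter {
    void save(Banana banana);
}

// One class may implement both; consumers still see only what they ask for.
final class InMemoryBananaStore implements BananaReader, BananaWriter {
    private final Map<Integer, Banana> byId = new HashMap<>();

    @Override
    public Optional<Banana> getById(int id) {
        return Optional.ofNullable(byId.get(id));
    }

    @Override
    public void save(Banana banana) {
        byId.put(banana.getId(), banana);
    }
}

// This consumer is handed the read side only, so it cannot write by accident.
final class BananaLabelPrinter {
    private final BananaReader reader;

    BananaLabelPrinter(BananaReader reader) {
        this.reader = reader;
    }

    String labelFor(int id) {
        return reader.getById(id)
                .map(b -> "#" + b.getId() + " " + b.getVariety())
                .orElse("unknown banana");
    }
}
```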

What functionality to build into business objects?

What functionality do you think should be built into a persistable business object at bare minimum?
For example:
validation
a way to compare to another object of the same type
undo capability (the ability to roll-back changes)
The functionality dictated by the domain & business.
Read Domain Driven Design.
A persistable business object should consist of the following:
Data
New
Save
Delete
Serialization
Deserialization
Often, you'll abstract the functionality to retrieve them into a repository that supports:
GetByID
GetAll
GetByXYZCriteria
You could also wrap this type of functionality into collection classes (e.g. BusinessObjectTypeCollection); however, there's a lot of movement towards using the Repository Pattern in Domain Driven Design to provide these types of accessors (e.g. InvoicingRepository.GetAllCustomers, InvoicingRepository.GetAllInvoices).
You could put the business rules in the New, Save, Update, Delete ... but sometimes you could have an external business rules engine that you pass off the objects to.
This is just one piece of an answer, but I would say that you need a way to get to all objects with which this object has a relationship. In the beginning you may try to be smart and only include one-way navigability for some relationships, but I have found that this is usually more trouble than it's worth.
All persistent frameworks also include finders, ways to do cascading deletes... sorts....
Once you start modeling, all business objects should know how to manage themselves. Whenever you find another class referring TO your business object too much, it's usually time to push that behavior into the business object itself.
Of the three things noted in the question, I would say that validation is the only one that is truly required. The others depend on the overall architecture of the application.
Also, the business rules should be in the business objects.
Whether an object should do its own serialization is an interesting question. I have had great success in the past by having each object handle its own serialization, but I can also see merit in having a serialization module load and save the business objects just the same way as the GUI writes to and reads from the objects. Then your validation will protect against errors in the database or files too.
I can't think of anything else that is required in general.

Should entities have behavior or not?

Should entities have behavior? or not?
Why or why not?
If not, does that violate Encapsulation?
If your entities do not have behavior, then you are not writing object-oriented code. If everything is done with getters and setters and no other behavior, you're writing procedural code.
A lot of shops say they're practicing SOA when they keep their entities dumb. Their justification is that the data structure rarely changes, but the business logic does. This is a fallacy. There are plenty of patterns to deal with this problem, and they don't involve reducing everything to bags of getters and setters.
Entities should not have behavior. They represent data and data itself is passive.
I am currently working on a legacy project that has included behavior in entities and it is a nightmare, code that no one wants to touch.
You can read more on my blog post: Object-Oriented Anti-Pattern - Data Objects with Behavior.
[Preview] Object-Oriented Anti-Pattern - Data Objects with Behavior:
Attributes and Behavior
Objects are made up of attributes and behavior but Data Objects by definition represent only data and hence can have only attributes. Books, Movies, Files, even IO Streams do not have behavior. A book has a title but it does not know how to read. A movie has actors but it does not know how to play. A file has content but it does not know how to delete. A stream has content but it does not know how to open/close or stop. These are all examples of Data Objects that have attributes but do not have behavior. As such, they should be treated as dumb data objects and we as software engineers should not force behavior upon them.
Passing Around Data Instead of Behavior
Data Objects are moved around through different execution environments but behavior should be encapsulated and is usually pertinent only to one environment. In any application data is passed around, parsed, manipulated, persisted, retrieved, serialized, deserialized, and so on. An entity for example usually passes from the hibernate layer, to the service layer, to the frontend layer, and back again. In a distributed system it might pass through several pipes, queues, caches and end up in a new execution context. Attributes can apply to all three layers, but particular behavior such as save, parse, serialize only make sense in individual layers. Therefore, adding behavior to data objects violates encapsulation, modularization and even security principles.
Code written like this:
book.Write();
book.Print();
book.Publish();
book.Buy();
book.Open();
book.Read();
book.Highlight();
book.Bookmark();
book.GetRelatedBooks();
can be refactored like so:
Book book = author.WriteBook();
printer.Print(book);
publisher.Publish(book);
customer.Buy(book);
BookReader reader = new BookReader();
reader.Open(book);
reader.Read();
reader.Highlight();
reader.Bookmark();
librarian.GetRelatedBooks(book);
What a difference natural object-oriented modeling can make! We went from a single monstrous Book class to six separate classes, each of them responsible for their own individual behavior.
This makes the code:
easier to read and understand because it is more natural
easier to update because the functionality is contained in smaller encapsulated classes
more flexible because we can easily substitute one or more of the six individual classes with overridden versions
easier to test because the functionality is separated, and easier to mock
It depends on what kind of entity they are -- but the term "entity" implies, to me at least, business entities, in which case they should have behavior.
A "Business Entity" is a modeling of a real world object, and it should encapsulate all of the business logic (behavior) and properties/data that the object representation has in the context of your software.
If you're strictly following MVC, your model (entities) won't have any inherent behavior. I do however include whatever helper methods allow the easiest management of the entities persistence, including methods that help with maintaining its relationship to other entities.
If you plan on exposing your entities to the world, you're better off (generally) keeping behavior off of the entity. If you want to centralize your business operations (e.g. ValidateVendorOrder), you wouldn't want the Order to have an IsValid() method that runs some logic to validate itself. You don't want that code running on a client: what if they fudge it? It is akin to not providing any client UI to set the price on an item being placed in a shopping cart, but having a bogus price posted in the URL. If you don't have server-side validation, that's not good! And duplicating that validation on both sides is redundant and violates DRY (Don't Repeat Yourself).
Another example of when having behaviors on an entity just doesn't work is lazy loading. A lot of ORMs today will allow you to lazy load data when a property is accessed on an entity. If you're building a 3-tier app, this just doesn't work, as your client will ultimately and inadvertently try to make database calls when accessing properties.
These are my off-the-top-of-my-head arguments for keeping behavior off of entities.

Dealing with "global" data structures in an object-oriented world

This is a question with many answers - I am interested in knowing what others consider to be "best practice".
Consider the following situation: you have an object-oriented program that contains one or more data structures that are needed by many different classes. How do you make these data structures accessible?
1. You can explicitly pass references around, for example, in the constructors. This is the "proper" solution, but it means duplicating parameters and instance variables all over the program. This makes changes or additions to the global data difficult.
2. You can put all of the data structures inside of a single object, and pass around references to this object. This can either be an object created just for this purpose, or it could be the "main" object of your program. This simplifies the problems of (1), but the data structures may or may not have anything to do with one another, and collecting them together in a single object is pretty arbitrary.
3. You can make the data structures "static". This lets you reference them directly from other classes, without having to pass around references. This entirely avoids the disadvantages of (1), but is clearly not OO. This also means that there can only ever be a single instance of the program.
When there are a lot of data structures, all required by a lot of classes, I tend to use (2). This is a compromise between OO-purity and practicality. What do other folks do? (For what it's worth, I mostly come from the Java world, but this discussion is applicable to any OO language.)
Global data isn't as bad as many OO purists claim!
After all, when implementing OO classes you're usually using an API to your OS. What the heck is this if it isn't a huge pile of global data and services!
If you use some global stuff in your program, you're merely extending this huge environment your class implementation can already see of the OS with a bit of data that is domain specific to your app.
Passing pointers/references everywhere is often taught in OO courses and books, academically it sounds nice. Pragmatically, it is often the thing to do, but it is misguided to follow this rule blindly and absolutely. For a decent sized program, you can end up with a pile of references being passed all over the place and it can result in completely unnecessary drudgery work.
Globally accessible services/data providers (abstracted away behind a nice interface obviously) are pretty much a must in a decent sized app.
I must really really discourage you from using option 3 - making the data static. I've worked on several projects where the early developers made some core data static, only to later realise they did need to run two copies of the program - and incurred a huge amount of work making the data non-static and carefully putting in references into everything.
So in my experience, if you do 3), you will eventually end up doing 1) at twice the cost.
Go for 1, and be fine-grained about what data structures you reference from each object. Don't use "context objects", just pass in precisely the data needed. Yes, it makes the code more complicated, but on the plus side, it makes it clearer - the fact that a FwurzleDigestionListener is holding a reference to both a Fwurzle and a DigestionTract immediately gives the reader an idea about its purpose.
And by definition, if the data format changes, so will the classes that operate on it, so you have to change them anyway.
You might want to think about altering the requirement that lots of objects need to know about the same data structures. One reason there does not seem to be a clean OO way of sharing data is that sharing data is not very object-oriented.
You will need to look at the specifics of your application, but the general idea is to have one object responsible for the shared data, which provides services to the other objects based on the data encapsulated in it. However, these services should not involve giving other objects the data structures; they should merely give other objects the pieces of information they need to meet their responsibilities, and perform mutations on the data structures internally.
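A small Java sketch of that idea: one object owns the shared structure and exposes services instead of the structure itself. Inventory and its methods are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Owns the shared data; the map itself never leaves this class.
final class Inventory {
    private final Map<String, Integer> stock = new HashMap<>();

    // Mutation happens internally, on behalf of the caller.
    void receive(String sku, int quantity) {
        stock.merge(sku, quantity, Integer::sum);
    }

    // Callers get the piece of information they need, not the structure.
    boolean isAvailable(String sku, int quantity) {
        return stock.getOrDefault(sku, 0) >= quantity;
    }

    // Returns false and changes nothing if there is not enough stock.
    boolean reserve(String sku, int quantity) {
        if (!isAvailable(sku, quantity)) {
            return false;
        }
        stock.merge(sku, -quantity, Integer::sum);
        return true;
    }
}
```

Other classes then depend on questions like isAvailable rather than on the fact that the data happens to live in a map, so the representation can change without touching them.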
I tend to use 3) and be very careful about the synchronisation and locking across threads. I agree it is less OO, but then you confess to having global data, which is very un-OO in the first place.
Don't get too hung up on whether you are sticking purely to one programming methodology or another, find a solution which fits your problem. I think there are perfectly valid contexts for singletons (Logging for instance).
I use a combination of having one global object and passing interfaces in via constructors.
From the one main global object (usually named after what your program is called or does) you can start up other globals (maybe that have their own threads). This lets you control the setting up of program objects in the main objects constructor and tearing them down again in the right order when the application stops in this main objects destructor. Using static classes directly makes it tricky to initialize/uninitialize any resources these classes use in a controlled manner. This main global object also has properties for getting at the interfaces of different sub-systems of your application that various objects may want to get hold of to do their work.
I also pass references to relevant data-structures into constructors of some objects where I feel it is useful to isolate those objects from the rest of the world within the program when they only need to be concerned with a small part of it.
Whether an object grabs the global object and navigates its properties to get the interfaces it wants, or gets passed the interfaces it uses via its constructor, is a matter of taste and intuition. Any object you're implementing that you think might be reused in some other project should definitely be passed the data structures it should use via its constructor. Objects that grab the global object should be more to do with the infrastructure of your application.
Objects that receive interfaces they use via the constructor are probably easier to unit-test because you can feed them a mock interface, and tickle their methods to make sure they return the right arguments or interact with mock interfaces correctly. To test objects that access the main global object, you have to mock up the main global object so that when they request interfaces (I often call these services) from it they get appropriate mock objects and can be tested against them.
I prefer using the singleton pattern as described in the GoF book for these situations. A singleton is not the same as any of the three options described in the question. The constructor is private (or protected) so that it cannot be used just anywhere. You use a get() function (or whatever you prefer to call it) to obtain an instance. However, the architecture of the singleton class guarantees that each call to get() returns the same instance.
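For reference, a minimal Java sketch of such a singleton. AppRegistry and its put/lookup methods are hypothetical; eager initialization in a static final field is just one of several ways to get the "same instance every time" guarantee.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AppRegistry {
    // Created once, when the class is initialized.
    private static final AppRegistry INSTANCE = new AppRegistry();

    private final Map<String, String> settings = new ConcurrentHashMap<>();

    // Private constructor: nobody else can create a second instance.
    private AppRegistry() {}

    // Every call returns the same instance.
    static AppRegistry get() {
        return INSTANCE;
    }

    void put(String key, String value) {
        settings.put(key, value);
    }

    // Returns null when the key has not been set.
    String lookup(String key) {
        return settings.get(key);
    }
}
```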
We should take care not to confuse Object Oriented Design with Object Oriented Implementation. All too often, the term OO Design is used to judge an implementation, just as, imho, it is here.
Design
If in your design you see a lot of objects having a reference to exactly the same object, that means a lot of arrows. The designer should feel an itch here. He should verify whether this object is just commonly used, or if it is really a utility (e.g. a COM factory, a registry of some kind, ...).
From the project's requirements, he can see if it really needs to be a singleton (e.g. 'The Internet'), or if the object is shared because it's too general or too expensive or whatsoever.
Implementation
When you are asked to implement an OO Design in an OO language, you face a lot of decisions, like the one you mentioned: how should I implement all the arrows to the oft used object in the design?
That's the point where questions are addressed about 'static member', 'global variable' , 'god class' and 'a-lot-of-function-arguments'.
The Design phase should have clarified if the object needs to be a singleton or not. The implementation phase will decide on how this singleness will be represented in the program.
Option 3), while not purist OO, tends to be the most reasonable solution. But I would not make your class a singleton; instead, use some other object as a static 'dictionary' to manage those shared resources.
I don't like any of your proposed solutions:
1. You are passing around a bunch of "context" objects - the things that use them don't specify what fields or pieces of data they are really interested in.
2. See here for a description of the God Object pattern. This is the worst of all worlds.
3. Simply do not ever use Singleton objects for anything. You seem to have identified a few of the potential problems yourself.