There is a scenario in which a Credential class contains a digital Proof. Each type of Credential can support multiple different types of Proof, such as JWSProof or a simple RSASignatureProof, all derived from the same base class (Proof) and each specifying its own verification behaviour.
Suppose that a client sends a credential containing a JWS proof. After receiving the data, the controller endpoint maps it to a Credential (which could eventually be stored afterwards) and somehow recognizes the proof type contained inside it. Its task is to verify the proof before processing (or saving) the credential, so it should use the logic implemented in the related class (JWSProof in this case) by constructing it with a factory after collecting the cryptographic material necessary for the verification.
So my question is: do you think this is the correct approach? I feel like JWSProof is somehow useless, as its logic could easily become a static function. How would you model this case, and how many repositories are necessary to store all of these types of objects/classes?
Based on your explanation (see comments to the question) that each type of Proof operates on the same kind of input data in terms of data structure (in your case simply a string), we are only concerned with different proof verification logic.
This sounds like Proof is behaviour only (logic) and does not represent data in your domain model. This is why I would suggest handling this via the strategy pattern. This means there is a strategy implementation containing the verification logic for each different kind of proof. I would not suggest using static methods, because that makes testing and maintenance difficult.
You could implement a Credential factory that creates the corresponding Credential aggregate and already injects the specific proof strategy based on the proof type.
If this solution is feasible for your problem, you do not even require different Credential class types for the different proof logic, because when storing the Credential aggregate it always has the same structure and only the proof type's value (e.g. create a custom ProofType value object for strong typing) differs.
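A minimal sketch of what that might look like (all type names here are illustrative, not taken from your code):

```csharp
using System;

// Strategy: one implementation per proof type; behaviour only, no persisted data.
public interface IProofVerificationStrategy
{
    bool Verify(string proofValue, string cryptoMaterial);
}

public sealed class JwsProofVerification : IProofVerificationStrategy
{
    public bool Verify(string proofValue, string cryptoMaterial)
        => /* JWS-specific signature check goes here */ true;
}

public sealed class RsaSignatureProofVerification : IProofVerificationStrategy
{
    public bool Verify(string proofValue, string cryptoMaterial)
        => /* RSA-specific signature check goes here */ true;
}

// Value object for strong typing of the proof type.
public sealed record ProofType(string Value);

// The aggregate always has the same structure; only the injected strategy differs.
public sealed class Credential
{
    private readonly IProofVerificationStrategy _verification;
    public ProofType ProofType { get; }
    public string ProofValue { get; }

    public Credential(ProofType proofType, string proofValue,
                      IProofVerificationStrategy verification)
    {
        ProofType = proofType;
        ProofValue = proofValue;
        _verification = verification;
    }

    public bool VerifyProof(string cryptoMaterial)
        => _verification.Verify(ProofValue, cryptoMaterial);
}

// Factory that creates the aggregate and injects the matching strategy.
public static class CredentialFactory
{
    public static Credential Create(string proofTypeName, string proofValue)
    {
        IProofVerificationStrategy strategy = proofTypeName switch
        {
            "JWSProof" => new JwsProofVerification(),
            "RSASignatureProof" => new RsaSignatureProofVerification(),
            _ => throw new NotSupportedException(proofTypeName)
        };
        return new Credential(new ProofType(proofTypeName), proofValue, strategy);
    }
}
```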
For pure computational logic that doesn't need to change (and I'd suggest that a proof algorithm is such a thing: it likely has no state from verification to verification), source code (e.g. a static function) is the correct repository.
Your ProofRepository would then fundamentally just be a map from an identifier for a proof (e.g. a string like "JWSProof") to a static function (different languages will make this easier or harder to model, especially if the proof implementations need to take different arguments).
You might by the same token have a repository for cryptographic materials needed by particular proofs.
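As a rough sketch, such a ProofRepository might be nothing more than the following (the names and the verification function signature are assumptions made for illustration):

```csharp
using System;
using System.Collections.Generic;

// A "repository" of proof logic: a lookup from proof type name to a pure function.
public static class ProofVerifiers
{
    private static readonly Dictionary<string, Func<string, string, bool>> Verifiers =
        new()
        {
            // (proofValue, cryptoMaterial) => isValid
            ["JWSProof"] = (proof, key) => VerifyJws(proof, key),
            ["RSASignatureProof"] = (proof, key) => VerifyRsaSignature(proof, key),
        };

    public static bool Verify(string proofTypeName, string proofValue, string cryptoMaterial)
        => Verifiers.TryGetValue(proofTypeName, out var verify)
            ? verify(proofValue, cryptoMaterial)
            : throw new NotSupportedException(proofTypeName);

    // Placeholders for the actual verification algorithms.
    private static bool VerifyJws(string proof, string key) => true;
    private static bool VerifyRsaSignature(string proof, string key) => true;
}
```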
I have the following case. A user can export several object types (transaction, invoice, etc.) to an external accounting system.
The export algorithm has these steps:
fetch objects by some filter
export objects one by one to the accounting system (web service method per object type)
register the fact that the given document was exported, so it won't be exported again
prepare a summary for the user (number of exported documents, error messages, etc.)
The algorithm is the same for all object types but there are some important differences which must be handled:
different types
different target web service method, different object to DTO mappings
different filters per object type
I've considered a few solutions:
don't treat the export algorithm as code duplication and implement an algorithm per object type. The export of any data to any external system may be described by such an algorithm - does that mean we should always have one general class to export anything to anywhere? :)
move the differences to strategies (one strategy interface to create an abstraction for all differences) - I even implemented it.
use generics - unfortunately I'm coding in PHP and it's not possible
The question:
Is creating a separate export algorithm per object type code duplication?
Maybe all of them should be treated as separate Use Cases?
If it's a duplication then what techniques to avoid it should I consider?
Description of my first implementation:
In the first approach I defined an Exportable abstraction, but I was not happy with it. Each object has a completely different payload.
The Exportable interface defined only one method, getId, which was used to register that an object had been exported (and thanks to that it won't be exported again).
For this purpose the abstraction was fine, but the problem was moved to the exportService, which had to check the concrete instance to choose the DTO mapper and endpoint. So the exportService broke SOLID.
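Roughly, that first approach looked like this (simplified, and shown as C#-style pseudocode rather than the actual PHP):

```csharp
public interface IExportable { string GetId(); }

public class Invoice : IExportable { public string GetId() => "INV-1"; }
public class Transaction : IExportable { public string GetId() => "TRX-1"; }

// The problem spot: the service has to inspect concrete types to pick
// the DTO mapping and the target web service method.
public class ExportService
{
    public void Export(IExportable item)
    {
        if (item is Invoice)
        {
            // map Invoice -> InvoiceDto, call the invoice web service method
        }
        else if (item is Transaction)
        {
            // map Transaction -> TransactionDto, call the transaction web service method
        }
        // ...a new branch for every new object type

        RegisterAsExported(item.GetId());
    }

    private void RegisterAsExported(string id)
    {
        // persist the fact that this document was exported
    }
}
```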
None of the things you have described above are domain-specific logic (and in fact you don't even mention the problem domain in your question), so I don't think it really falls under domain-driven design. Because it's not domain-specific logic I wouldn't worry too much about code duplication, especially considering that the solution doesn't seem obvious.
Keep it simple and just write out each use case separately. If you find that there's common code that's easily refactored, do so after you get everything working smoothly. Don't overthink it or add patterns before they are obviously necessary.
I have a component that exposes an API with some 10 functions in all. I can think of two ways to achieve this:
Expose each piece of functionality as a separate function.
Expose only one function which takes an XML document as input. Based on the request_Type specified and the parameters passed in the XML, I internally call the respective function.
Q1. Will the second design be more loosely coupled than the first?
I always read about how I should try to keep my components loosely coupled; should I really go to this extent to achieve loose coupling?
Q2. Which one of these would be a better design in terms of OOP and why?
Edit:
If I am exposing this API over D-Bus for others to use, will type checking still be a consideration when comparing the two approaches? From what I understand, type checking is done at compile time, but when the function is exposed over some IPC mechanism, does the issue of type checking still come into the picture?
The two alternatives you propose do not differ in the (obviously quite large) number of "functions" you want to offer from your API. However, the second seems to have many disadvantages: you are losing any strong type checking, it will become much harder to document the functionality, etc. (The only advantage I see is that you don't need to change your API when you add functionality. But the disadvantage is that users will not be able to discover API changes like deleted functions until run time.)
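To make the type-checking point concrete, the contrast might look roughly like this (hypothetical interfaces, not your actual API):

```csharp
// Design 1: separate, strongly typed functions.
public interface IAccountApi
{
    void Transfer(string fromAccount, string toAccount, decimal amount);
    decimal GetBalance(string account);
}
// Callers get compile-time errors for wrong parameter types or removed methods.

// Design 2: a single generic entry point.
public interface IGenericApi
{
    // request_Type and all parameters travel inside the XML document
    string Execute(string requestXml);
}
// A misspelled request_Type, a missing parameter, or a removed operation
// only surfaces at run time, and the XML contract must be documented separately.
```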
More related to this question is the Single Responsibility Principle (http://en.wikipedia.org/wiki/Single_responsibility_principle). As you are talking about OOP, you should not expose your tens of functions within one class but split them among different classes, each with a single responsibility. Defining good "responsibilities" and roles requires some practice, but following some basic guidelines will help you get started quickly. See Are there any rules for OOP? for a good starting point.
Reply to the question edit
I haven't used D-Bus, so this might be totally wrong. But from a quick look at the tutorial I read
Each object supports one or more interfaces. Think of an interface as a named group of methods and signals, just as it is in GLib or Qt or Java. Interfaces define the type of an object instance.

DBus identifies interfaces with a simple namespaced string, something like org.freedesktop.Introspectable. Most bindings will map these interface names directly to the appropriate programming language construct, for example to Java interfaces or C++ pure virtual classes.
As far as I understand, D-Bus has the concept of different objects which provide interfaces consisting of several methods. This means (to me) that my answer above still applies. The "D-Bus native" way of specifying your API would be to expose interfaces, and I don't see any reason why good OOP design guidelines shouldn't be valid here. As D-Bus seems to map these even to native language constructs, this is even more likely.
Of course, nobody keeps you from just building your own API description language in XML. However, things like that are some kind of abuse of the underlying technique. You should have good reasons for doing such things.
One of the big uses of code generation in C++ is to support message serialisation. Typically, you want to specify message contents and layout in the same step and produce code for that message type that can give you objects capable of being serialised to/from communication streams. In the past, this has usually resulted in code that looks like:
class MyMessage : public SerialisableObject
{
    // message members
    int myNumber_;
    std::string myString_;
    std::vector<MyOtherSerialisableObject> aBunchOfThingsIWantToSerialise_;

public:
    // ctor, dtor, accessors, mutators, then:
    virtual void Serialise(SerialisationStream & stream)
    {
        stream & myNumber_;
        stream & myString_;
        stream & aBunchOfThingsIWantToSerialise_;
    }
};
The problem with using this kind of design is that it violates an important rule of good architecture: you should not have to specify the intent of a design twice. Duplication of intent, like duplicated code and other common development duplication, leaves room for one place in the code to become divergent from the other, causing errors.
In the above, the duplication is the list of members. Potential errors include adding a member to the class but forgetting to add it to the serialisation list, serialising a member twice (possibly by not using the same order as the member declaration, or possibly due to a misspelling of a similar member, among other ways), or serialising something that is not a member (which might produce a compiler error, unless name lookup finds something at a different scope than the object that matches lookup rules). That kind of mistake is the same reason we no longer try to match every heap allocation with a delete (instead using smart pointers) or every file open with a close (using RAII ctor/dtor mechanisms) - we don't want to have to match up our intent in multiple places, because there are times we - or another engineer less familiar with the intent - make mistakes.
Generally, therefore, this has been one of the things that code generation could take care of. You might create a file MyMessage.cg to specify both layout and members in one step:
serialisable MyMessage
{
    int myNumber_;
    std::string myString_;
    std::vector<MyOtherSerialisableObject> aBunchOfThingsIWantToSerialise_;
};
that would be run through a code generation utility and produce the code.
I was wondering if it is possible yet to do this in C++0x without external code generation. Are there any new language mechanisms that make it possible to specify a class as serialisable once, such that the names and layout of its members are used to lay out the message during serialisation?
To be clear, I know that there are tricks with boost tuples and fusion that can come close to this kind of behaviour even in the pre-C++0x language. Those usages, though, being based on indexing into the tuple rather than by-member-name access, have all been brittle to changes in layout, as other places in the code that access the messages would then also need to be reordered. Some kind of by-member-name access is necessary to avoid duplicating the layout specification in the places in the code that use the messages.
Also, I know it might be nice to take this up to the next level and ask for specifying when some of the members shouldn't be serialised. Other languages that offer serialisation built in often offer some kind of attribute to do this, so
int myNonSerialisedNumber_ [[noserialise]];
might seem natural. However, I personally think it is bad design to have serialisable objects where not everything is serialised, since the lifetime of messages is in the transport to/from the communications layer, separate from other data lifetimes. Also, you could have an object which has a purely serialisable object as one of its members, so such functionality doesn't buy anything the language doesn't already offer.
Is this possible? Or did the standards committee leave out this kind of introspective capability? I don't need it to look like the code-gen file above - any simple method for compile-time specification of layout and members in a single step would solve this common problem.
This is both possible and practical in C++11 – in fact it was possible back in C++03, the syntax was just a little too unwieldy. I wrote a small library based around the same idea - see the following:
www.github.com/molw5/framework
Sample syntax:
class Object : serializable <Object,
    value <NAME("Field 1"), int>,
    value <NAME("Field 2"), float>,
    value <NAME("Field 3"), double>>
{
};
Most of the underlying code could be reproduced, in principle, in C++03 - some of the implementation details without variadic templates would have been... tricky, but I believe it would have been possible to recover the core functionality. What you could not reproduce in C++03 was the NAME macro above, and the syntax relies fairly heavily on it. The macro provides the machinery necessary to generate a unique typename from a string; that is, the following:
NAME("Field 1")
expands to
type_string <'F', 'i', 'e', 'l', 'd', ' ', '1'>
through the use of some common macros and constexpr (for character extraction). Back in C++03 something similar to the type_string above would need to be entered manually.
C++, of any form, supports neither introspection nor reflection (to the extent that they are different).
One nice thing about doing serialization manually (i.e. without introspection or reflection) is that you can provide object versioning. You can support older forms of the serialization and simply create reasonable defaults for the data that wasn't in the old versions. Or, if a new version removes some data, you can simply deserialize and discard it.
It seems to me that what you need is Boost.Serialization.
I'm not sure how to name data store classes when designing a program's data access layer (DAL).
(By data store class, I mean a class that is responsible to read a persisted object into memory, or to persist an in-memory object.)
It seems reasonable to name a data store class according to two things:
what kinds of objects it handles;
whether it loads and/or persists such objects.
⇒ A class that loads Banana objects might be called e.g. BananaSource.
I don't know how to go about the second point (i.e. the Source bit in the example). I've seen different nouns apparently used for just that purpose:
repository: this sounds very general. Does this denote something read-/write-accessible?
store: this sounds like something that potentially allows write access.
context: sounds very abstract. I've seen this with LINQ and object-relational mappers (ORMs).
P.S. (several months later): This is probably appropriate for containers that contain "active" or otherwise supervised objects (the Unit of Work pattern comes to mind).
retriever: sounds like something read-only.
source & sink: probably not appropriate for object persistence; a better fit with data streams?
reader / writer: quite clear in its intention, but sounds too technical to me.
Are these names arbitrary, or are there widely accepted meanings / semantic differences behind each? More specifically, I wonder:
What names would be appropriate for read-only data stores?
What names would be appropriate for write-only data stores?
What names would be appropriate for mostly read-only data stores that are occasionally updated?
What names would be appropriate for mostly write-only data stores that are occasionally read?
Does one name fit all scenarios equally well?
As no one has yet answered the question, I'll post what I have decided in the meantime.
Just for the record, I have pretty much decided on calling most data store classes repositories. First, it appears to be the most neutral, non-technical term from the list I suggested, and it seems to be well in line with the Repository pattern.
Generally, "repository" seems to fit well where data retrieval/persistence interfaces are something similar to the following:
public interface IRepository<TResource, TId>
{
    int Count { get; }

    TResource GetById(TId id);
    IEnumerable<TResource> GetManyBySomeCriteria(...);

    TId Add(TResource resource);
    void Remove(TId id);
    void Remove(TResource resource);

    ...
}
Another term I have decided on using is provider, which I'll be preferring over "repository" whenever objects are generated on-the-fly instead of being retrieved from a persistence store, or when access to a persistence store happens in a purely read-only manner. (Factory would also be appropriate, but sounds more technical, and I have decided against technical terms for most uses.)
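For the read-only / on-the-fly cases, a provider contract can be much smaller than the repository above; a possible sketch (the names are illustrative):

```csharp
using System.Collections.Generic;

// A provider only hands objects out; there is no Add/Remove, and the objects
// may be generated on the fly rather than loaded from a persistence store.
public interface IProvider<TResource, TId>
{
    TResource GetById(TId id);
    IEnumerable<TResource> GetManyBySomeCriteria(string criteria);
}
```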
P.S.: Some time has gone by since writing this answer, and I've had several opportunities at work to review someone else's code. One term I've thus added to my vocabulary is Service, which I am reserving for SOA scenarios: I might publish a FooService that is backed by a private Foo repository or provider. The "service" is basically just a thin public-facing layer above these that takes care of things like authentication, authorization, or aggregating / batching DTOs for proper "chunkiness" of service responses.
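For illustration, such a service might look roughly like this (Foo, FooDto, the repository interface, and the auth checker are all placeholders):

```csharp
public class Foo { public string Name { get; set; } }
public class FooDto { public int Id { get; set; } public string Name { get; set; } }

public interface IFooRepository { Foo GetById(int id); }       // the private repository/provider
public interface IAuthorizationChecker { void EnsureCanRead(string callerToken, string resource); }

// Thin public-facing service: authentication/authorization and DTO shaping,
// with persistence delegated to the repository behind it.
public class FooService
{
    private readonly IFooRepository _repository;
    private readonly IAuthorizationChecker _auth;

    public FooService(IFooRepository repository, IAuthorizationChecker auth)
    {
        _repository = repository;
        _auth = auth;
    }

    public FooDto GetFoo(int id, string callerToken)
    {
        _auth.EnsureCanRead(callerToken, nameof(Foo));   // cross-cutting concern handled here
        Foo foo = _repository.GetById(id);
        return new FooDto { Id = id, Name = foo.Name };  // shape the result into a DTO
    }
}
```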
Well, to add something to your conclusion:
A repository is meant to care about only one entity and follows certain patterns, like the one you showed.
A store is allowed to do a bit more, also working with other entities.
A reader/writer is separated so you can semantically show and inject only reading or writing functionality into other classes; it comes from the CQRS pattern (see the sketch after this list).
A context is more or less bound to an ORM mapper, as you mentioned, and is usually used under the hood of a repository or store; some use it directly instead of building a repository on top, but it's harder to abstract.
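To illustrate the reader/writer split (the interface and entity names are made up for the example):

```csharp
using System.Collections.Generic;

public class Order { public int Id { get; set; } public int CustomerId { get; set; } }

// Read side: inject where code only needs to query.
public interface IOrderReader
{
    Order GetById(int id);
    IEnumerable<Order> GetByCustomer(int customerId);
}

// Write side: inject where code is allowed to change state.
public interface IOrderWriter
{
    void Add(Order order);
    void Remove(int id);
}

// One class may implement both, but consumers declare only the side they need.
public class OrderStore : IOrderReader, IOrderWriter
{
    private readonly Dictionary<int, Order> _orders = new Dictionary<int, Order>();

    public Order GetById(int id) => _orders[id];

    public IEnumerable<Order> GetByCustomer(int customerId)
    {
        foreach (var order in _orders.Values)
            if (order.CustomerId == customerId) yield return order;
    }

    public void Add(Order order) => _orders[order.Id] = order;
    public void Remove(int id) => _orders.Remove(id);
}
```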
For various reasons, we are writing a new business objects/data storage library. One of the requirements of this layer is to separate the logic of the business rules from the actual data storage layer.
It is possible to have multiple data storage layers that implement access to the same object - for example, a main "database" data storage source that implements most objects, and another "ldap" source that implements a User object. In this scenario, User can optionally come from an LDAP source, perhaps with slightly different functionality (e.g., it is not possible to save/update the User object), but otherwise it is used by the application in the same way. Another data storage type might be a web service, or an external database.
There are two main ways we are looking at implementing this, and a co-worker and I disagree on a fundamental level about which is correct. I'd like some advice on which one is best to use. I'll try to keep my descriptions of each as neutral as possible, as I'm looking for some objective viewpoints here.
Business objects are base classes, and data storage objects inherit business objects. Client code deals with data storage objects.
In this case, common business rules are inherited by each data storage object, and it is the data storage objects that are directly used by the client code.
This has the implication that client code determines which data storage method to use for a given object, because it has to explicitly declare an instance of that type of object. Client code needs to explicitly know connection information for each data storage type it is using.
If a data storage layer implements different functionality for a given object, client code explicitly knows about it at compile time because the object looks different. If the data storage method is changed, client code has to be updated.
Business objects encapsulate data storage objects.
In this case, business objects are directly used by the client application. The client application passes base connection information to the business layer. The decision about which data storage method a given object uses is made by the business object code. Connection information would be a chunk of data taken from a config file (the client app does not really know/care about the details of it), which may be a single connection string for a database, or several connection strings for various data storage types. Additional data storage connection types could also be read from another spot - e.g., a configuration table in a database that specifies URLs to various web services.
The benefit here is that if a new data storage method is added to an existing object, a configuration setting can be set at runtime to determine which method to use, and it is completely transparent to the client applications. Client apps do not need to be modified if the data storage method for a given object changes.
Business objects are base classes, data source objects inherit from business objects. Client code deals primarily with base classes.
This is similar to the first method, but client code declares variables of the base business object types, and Load()/Create()/etc static methods on the business objects return the appropriate data source-typed objects.
The architecture of this solution is similar to the first method, but the main difference is the decision about which data storage object to use for a given business object is made by the business layer, not the client code.
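In rough pseudocode, that third method might look like this (the type names are invented just for illustration):

```csharp
// Base business object: business rules live here; clients declare variables of this type.
public abstract class User
{
    public string UserName { get; set; }
    public abstract void Save();

    // The business layer, not the client, picks the data-source-specific subtype.
    public static User Load(string id)
    {
        return DataSourceConfig.UserSource == "ldap"
            ? new LdapUser(id)
            : new DatabaseUser(id);
    }
}

// Data storage objects inherit the business object and add persistence details.
public class DatabaseUser : User
{
    public DatabaseUser(string id) { /* load from the database */ }
    public override void Save() { /* write to the database */ }
}

public class LdapUser : User
{
    public LdapUser(string id) { /* load from LDAP */ }
    public override void Save()
        => throw new System.NotSupportedException("LDAP users are read-only.");
}

// Placeholder for whatever configuration mechanism supplies the choice.
public static class DataSourceConfig
{
    public static string UserSource { get; set; } = "database";
}
```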
I know there are already existing ORM libraries that provide some of this functionality, but please discount those for now (there is the possibility that a data storage layer is implemented with one of these ORM libraries) - also note I'm deliberately not telling you what language is being used here, other than that it is strongly typed.
I'm looking for some general advice here on which method is better to use (or feel free to suggest something else), and why.
Might I suggest another alternative, with possibly better decoupling: business objects use data objects, and data objects implement storage objects. This should keep the business rules in the business objects but without any dependence on the storage source or format, while allowing the data objects to support whatever manipulations are required, including changing the storage objects dynamically (e.g. for online/offline manipulation).
This falls into the second category above (business objects encapsulate data storage objects), but separates data semantics from storage mechanisms more clearly.
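A rough sketch of that layering, with all names purely illustrative:

```csharp
// Storage object: knows only how to move raw records to/from one medium.
public interface IUserStorage
{
    UserRecord Read(string id);
    void Write(UserRecord record);
}

// Data object: owns the record and can switch storage dynamically (online/offline).
public class UserData
{
    private IUserStorage _storage;
    public UserData(IUserStorage storage) { _storage = storage; }

    public void SwitchStorage(IUserStorage storage) { _storage = storage; }
    public UserRecord Load(string id) => _storage.Read(id);
    public void Save(UserRecord record) => _storage.Write(record);
}

// Business object: business rules only; no dependence on storage source or format.
public class UserAccount
{
    private readonly UserData _data;
    public UserAccount(UserData data) { _data = data; }

    public void Deactivate(string id)
    {
        UserRecord record = _data.Load(id);
        record.IsActive = false;   // the business rule
        _data.Save(record);
    }
}

public class UserRecord
{
    public string Id { get; set; }
    public bool IsActive { get; set; }
}
```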
You can also have a facade to keep your client from calling the business layer directly. It also creates common entry points into your business layer.
As said, your business layer should not be exposed to anything but your DTOs and facade.
Yes. Your client can deal with DTOs. It's the ideal way to pass data through your application.
I generally like the "business object encapsulates data object/storage" approach best. However, in the short term you may find high redundancy between your data objects and your business objects that may not seem worthwhile. This is especially true if you opt for an ORM as the basis of your data-access layer (DAL). But the long term is where the real pay-off is: the application life cycle. As illustrated, it isn't uncommon for "data" to come from one or more storage subsystems (not limited to an RDBMS), especially with the advent of cloud computing, and as is commonly the case in distributed systems. For example, you may have some data that comes from a RESTful service, another chunk or object from an RDBMS, another from an XML file, LDAP, and so on. With this realization comes the importance of very good encapsulation of the data access from the business. Take care what dependencies you expose (DI) through your ctors and properties, too.
That said, an approach I've been toying with is to put the "meat" of the architecture in a business controller. Thinking of contemporary data access more as a resource than in the traditional way, the controller accepts a URI or other form of metadata that can be used to know which data resources it must manage for the business objects. The business objects then DO NOT themselves encapsulate the data access; rather, the controller does. This keeps your business objects lightweight and specific and allows your controller to provide optimization, composability, transaction ambiance, and so forth. Note that your controller would then "host" your business object collections, much like the controller piece of many ORMs does.
Additionally, consider business rule management. If you squint hard at your UML (or the model in your head, like I do :D ), you will notice that your business rules are actually another model, sometimes even a persistent one (if you are using a business rules engine, for example). I'd consider letting the business controller actually control your rules subsystem too, and letting your business objects reference the rules through the controller. The reason is that, inevitably, rule implementations often need to perform lookups and cross-checking in order to determine validity. Often this requires both hydrated business object lookups and back-end database lookups. Consider detecting duplicate entities, for example, where only the "new" one is hydrated. By leaving your rules to be managed by your business controller, you can do most anything you need without sacrificing that nice clean abstraction in your "domain model".
In pseudo-code:
using (MyConcreteBusinessContext ctx = new MyConcreteBusinessContext("datares://model1?DataSource=myserver;Catalog=mydatabase;Trusted_Connection=True ruleres://someruleresource?type=StaticRules&handler=My.Org.Business.Model.RuleManager"))
{
    User user = ctx.GetUserById("SZE543");
    user.IsLogonActive = false;
    ctx.Save();
}

// a business object
class User : BusinessBase
{
    public User(BusinessContext ctx) : base(ctx) {}

    public bool Validate()
    {
        IValidator v = ctx.GetValidator(this);
        return v.Validate();
    }
}

// a validator
class UserValidator : BaseValidator, IValidator
{
    User userInstance;

    public UserValidator(User user)
    {
        userInstance = user;
    }

    public bool Validate()
    {
        // actual validation code here
        return true;
    }
}
Clients should never deal with storage objects directly. They can deal with DTOs directly, but any object that has any logic for storage that is not wrapped in your business object should not be called by the client directly.
Check out CSLA.net by Rocky Lhotka.
Well, here I am, the co-worker Greg mentioned.
Greg described the alternatives we have been considering with great accuracy. I just want to add some additional considerations to the situation description.
Client code can be unaware of the data storage where business objects are stored, but that is possible only when there is a single data storage, or when there are multiple data storages for the same business object type (users stored in a local database and in an external LDAP) but the client does not create these business objects. In terms of system analysis, it means that there should be no use cases in which the existence of two data storages for objects of the same type can affect the use case flow.
As soon as the need to distinguish objects created in different data storages arises, the client component must become aware of the multiplicity of data storages in its universe, and it will inevitably become responsible for deciding which data storage to use at the moment of object creation (and, I think, of loading an object from a data storage). The business layer can pretend it is making these decisions, but the decision-making algorithm will be based on the type and content of the information coming from the client component, making the client effectively responsible for the decision.
This responsibility can be implemented in numerous ways: it can be a connection object of a specific type for each data storage; it can be segregated methods to call to create new BO instances, etc.
Regards,
Michael
CSLA has been around a long time.
However, I like the approach that is discussed in Eric Evans' book:
http://dddcommunity.org/