Clean Architecture - Robert Martin - Use Case Granularity


I am considering implementing Robert Martin's Clean Architecture in a project and I am trying to find out how to handle non-trivial use cases.
I am finding it difficult to scale the architecture to complex/composed use cases, especially use cases where the actor is the system as opposed to a user, as in system performing some sort of batch processing.
For illustration purposes, let's assume a use case like "System updates all account balances" implemented in pseudocode like
class UpdateAllAccountBalancesInteraction {
    function Execute() {
        Get a list of all accounts
        For each account
            Get a list of all new transactions for account
            For each transaction
                Perform some specific calculation on the transaction
                Update account balance
    }
}
In addition, "Get a list of all accounts", "Get a list of all new transactions for account", "Perform some specific calculation on the transaction", "Update account balance" are all valid use cases of their own and each of them is already implemented in its own interaction class.
A few questions arise:
Is the use case "System updates all account balances" even a valid use case, or should it be broken down into smaller use cases (although from a business perspective it seems to make sense; it is a legitimate business scenario)?
Is UpdateAllAccountBalancesInteraction a legitimate interaction?
Is an interaction allowed to, or supposed to, orchestrate other interactions?
Does code that orchestrates other interactions really belong somewhere else?
Is it OK to have UpdateAllAccountBalancesInteraction as an interaction, but have it call functions shared by the other interactors rather than act as an orchestrator of other interactors?

Clearly, you have a need for high-level interactions that share some (or a lot of) common functionality with lower-level interactions. This is OK.
If the business requires a use case called UpdateAllAccountBalances, then it is a valid use case, and it's good that you're naming it in a way that reflects the business logic.
It's o.k. for one interaction to call other interactions, if this reflects your business logic accurately. Ask yourself the following question: If the requirements for UpdateAccountBalance change, should this also affect UpdateAllAccountBalances in exactly the same way? If the answer is yes, then the best way to achieve this is to have UpdateAllAccountBalances call UpdateAccountBalance, because otherwise, you'll need to make a change in two places in order to keep them consistent. If the answer is no, then you want to decouple the two interactions, and this can be done by having them call shared functions.
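The first option (one interaction calling another) could be sketched roughly as follows. This is only an illustration under assumptions: the Account fields and the class names UpdateAccountBalance and UpdateAllAccountBalances are stand-ins based on the question, and the "specific calculation" is reduced to a plain sum.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    # Hypothetical account shape; the question does not define one.
    account_id: str
    balance: float = 0.0
    new_transactions: list = field(default_factory=list)


class UpdateAccountBalance:
    """Single-account use case: apply pending transactions to one balance."""

    def execute(self, account: Account) -> None:
        # Placeholder for "some specific calculation on the transaction".
        account.balance += sum(account.new_transactions)
        account.new_transactions.clear()


class UpdateAllAccountBalances:
    """Batch use case that orchestrates the single-account use case."""

    def __init__(self, accounts, update_one: UpdateAccountBalance):
        self._accounts = accounts
        self._update_one = update_one

    def execute(self) -> None:
        # Any change to UpdateAccountBalance is picked up here automatically.
        for account in self._accounts:
            self._update_one.execute(account)


accounts = [Account("a1", 100.0, [10.0, -5.0]), Account("a2", 50.0, [25.0])]
UpdateAllAccountBalances(accounts, UpdateAccountBalance()).execute()
```

If the two use cases should evolve independently, the body of execute in UpdateAccountBalance would instead move into a shared function that both interactions call.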

My suggestion is to approach the problem differently. Represent the problem itself in a domain model, rather than using a procedural approach. You're seeing some of the problems with use cases, one of which is that their granularity is generally indeterminate.
In a domain model, the standard way to represent a specific thing (i.e. an "account") is with two objects. One representing the specific account, and an associated object representing those things common to all accounts.
AccountCatalog (1) ---- (*) SpecificAccount
In your example, SpecificAccount would have a service (method) "UpdateBalance". AccountCatalog has a service (method) "UpdateAllBalances", which sends a message UpdateBalance to all SpecificAccounts in its collection.
Now anything can send the UpdateAllBalances message: another object, a human interaction, or another system.
I should note, that it can be common for an account to "know" (i.e. maintain) its own balance, rather than it being told to update.
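A minimal sketch of that two-object model, assuming a simple in-memory collection. The method names record_transaction and add are illustrative additions, not part of the description above.

```python
class SpecificAccount:
    """One particular account; it knows how to update its own balance."""

    def __init__(self, account_id, balance=0):
        self.account_id = account_id
        self.balance = balance
        self._pending = []  # transactions not yet applied (assumed detail)

    def record_transaction(self, amount):
        self._pending.append(amount)

    def update_balance(self):
        self.balance += sum(self._pending)
        self._pending.clear()


class AccountCatalog:
    """Holds what is common to all accounts, including the collection itself."""

    def __init__(self):
        self._accounts = {}

    def add(self, account):
        self._accounts[account.account_id] = account

    def update_all_balances(self):
        # Sends the UpdateBalance message to every account in the collection.
        for account in self._accounts.values():
            account.update_balance()


catalog = AccountCatalog()
first, second = SpecificAccount("a1", 100), SpecificAccount("a2", 50)
catalog.add(first)
catalog.add(second)
first.record_transaction(20)
second.record_transaction(-10)
catalog.update_all_balances()
```

The caller of update_all_balances can be a scheduler, a UI handler, or another system; the catalog does not need to know which.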

Related

What is the use of single responsibility principle?

I am trying to understand the Single Responsibility Principle, but I am having a tough time grasping the concept. I am reading the book "Design Patterns and Best Practices in Java" by Lucian-Paul Torje, Adrian Ianculescu, and Kamalmeet Singh.
In this book I am reading the Single Responsibility Principle chapter, where they have a Car class as shown below:
They said Car has both Car logic and database operations. In future if we want to change database then we need to change database logic and might need to change also car logic. And vice versa...
The solution would be to create two classes as shown below:
My question is: even if we create two classes, suppose we add a new property called 'price' to the Car class (or rename the property 'model' to 'carModel'). Don't we then also need to update the CarDAO class, for example by changing the SQL?
So What is the use of SRP here?

Great question.
First, keep in mind that this is a simplistic example in the book. It's up to the reader to expand on this a little and imagine more complex scenarios. In all of these scenarios, further imagine that you are not the only developer on the team; instead, you are working in a large team, and communication between developers often takes the form of negotiating class interfaces, i.e. APIs, public methods, public attributes, and database schemas. In addition, you will often have to worry about rollbacks, backwards compatibility, and synchronizing releases and deploys.
Suppose, for example, that you want to swap out the database, say, from MySQL to PostgreSQL. With SRP, you will reimplement CarDAO, change whatever dialect-specific SQL was used, and leave the Car logic intact. However, you may have to make a small change, possibly in configuration, to tell Car to use the new PostgreSQL DAO. A reasonable DI framework would make this simple.
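To make the swap concrete, here is a rough sketch. The two DAO classes are in-memory stand-ins for the MySQL and PostgreSQL implementations (they run no SQL), and the names sale_price and quote are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Car:
    car_id: str
    model: str
    price: float

    def sale_price(self, discount: float) -> float:
        """Business logic only: no knowledge of how cars are stored."""
        return self.price * (1 - discount)


class CarDAO(Protocol):
    """The interface the business side depends on."""

    def save(self, car: Car) -> None: ...
    def find(self, car_id: str) -> Optional[Car]: ...


class DictCarDAO:
    """Stand-in for the first backend (imagine MySQL-specific SQL here)."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, car: Car) -> None:
        self._rows[car.car_id] = car

    def find(self, car_id: str) -> Optional[Car]:
        return self._rows.get(car_id)


class ListCarDAO:
    """Stand-in for the replacement backend (imagine PostgreSQL here)."""

    def __init__(self) -> None:
        self._rows = []

    def save(self, car: Car) -> None:
        self._rows = [c for c in self._rows if c.car_id != car.car_id] + [car]

    def find(self, car_id: str) -> Optional[Car]:
        return next((c for c in self._rows if c.car_id == car_id), None)


def quote(dao: CarDAO, car_id: str, discount: float) -> float:
    """Caller code: works with whichever DAO is injected."""
    car = dao.find(car_id)
    if car is None:
        raise KeyError(car_id)
    return car.sale_price(discount)


quotes = []
for dao in (DictCarDAO(), ListCarDAO()):
    dao.save(Car("c1", "Model S", 20000.0))
    quotes.append(quote(dao, "c1", 0.25))
```

Only the line that chooses which DAO to construct changes when the backend changes; Car and quote stay as they are.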
Suppose, in another example, that you want to delegate CarDAO to another developer to integrate with memcached, so that reads, while eventually consistent, are fast. Again, this developer would not need to know anything about the business logic in Car. Instead, they only need to operate behind the CRUD methods of CarDAO, and possibly declare a few more methods in the CarDAO API with different consistency guarantees.
Suppose, in yet another example, your team hires a database engineer specializing in compliance law. In preparation for the upcoming IPO, the database engineer is tasked with keeping an audit log of all changes across all tables in the company's 35 databases. With SRP, our intrepid DBA would not have to worry about any of the business logic using any of our tables; instead, their mutation tracking magic can be deftly injected into DAOs all over, using decorators or other aspect-oriented programming techniques. (This could also be done on the other side of the SQL interface, by the way.)
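One way the audit log could be layered on is a wrapper around the DAO. This is only a sketch: InMemoryCarDAO and AuditingCarDAO are hypothetical names, and the log is a plain list rather than a real audit store.

```python
class InMemoryCarDAO:
    """Stand-in for the real DAO; stores rows in a dict instead of a database."""

    def __init__(self):
        self.rows = {}

    def save(self, car_id, fields):
        self.rows[car_id] = dict(fields)

    def find(self, car_id):
        return self.rows.get(car_id)


class AuditingCarDAO:
    """Decorator: same interface as the wrapped DAO, plus mutation tracking."""

    def __init__(self, inner, audit_log):
        self._inner = inner
        self._audit_log = audit_log

    def save(self, car_id, fields):
        before = self._inner.find(car_id)
        self._inner.save(car_id, fields)
        # Record the change without touching any business logic.
        self._audit_log.append(
            {"car_id": car_id, "before": before, "after": dict(fields)}
        )

    def find(self, car_id):
        return self._inner.find(car_id)


audit_log = []
dao = AuditingCarDAO(InMemoryCarDAO(), audit_log)
dao.save("c1", {"model": "Roadster", "price": 100})
dao.save("c1", {"model": "Roadster", "price": 80})
```

Code that uses the DAO cannot tell whether it received the plain or the auditing version, which is the point.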
Alright one last one - suppose now that a systems engineer is brought onto the team, and is tasked with sharding this data across multiple regions (data centers) in AWS. This engineer could take SRP even further and add a component whose only role is to tell us, for each ID, the home region of each entity. Each time we do a cross-region read, the new component bumps a counter; each week, an automated tool migrates data frequently read across regions into a new home region to reduce latency.
Now, let's take our imagination even further, and assume that business is booming - suddenly, you are working for a Fortune 500 company with multiple departments spanning multiple countries. Business Analysts from the Finance Department want to use your table to plot quarterly growth in auto sales in their post-IPO investor reports. Instead of giving them access to Car (because the logic used for reporting might be different from the logic used to prepare data for rendering on a web UI), you could, potentially, create a read-only interface for CarDAO with a short list of carefully curated public attributes that you now have to maintain across department boundaries. God forbid you have to rename one of these attributes: be prepared for a 3-month sunset plan and many many sad dashboards and late-night escalations. (And please don't give them direct access to the actual SQL table, because the implicit assumption will be that the entire table is the public interface.) Oops, my scars may be showing.
A corollary is that, if you need to change the business logic in Car (say, add a method that computes the lower sale price of each Tesla after an embarrassing recall), you wouldn't touch the CarDAO, since if car.brand == 'Tesla': price = price * 0.6 has nothing to do with data access.
Additional Reading: CQRS

For adding new property you need to change both classes only if that property should be saved to database. If it is a property used in business logic then you do not need to change DAO. Also if you change your database from one vendor to another or from SQL to NoSQL you will have to make changes only in DAO class. And if you need to change some business logic then you need to change only Car class.

Single responsibility principle as stated by Robert C. Martin means that
A class should have only one reason to change.
Keeping this principle in mind will generally lead to smaller and highly cohesive classes, which in turn means that fewer people need to work on these classes simultaneously, and the code becomes more robust.
In your example, keeping data access and business logic (price calculation) separate means that you are less likely to break one when making changes to the other.

Domain Driven Design - Creating general purpose entities vs. Context specific Entities

Situation
Suppose you have Orders and Clients as entities in your application. In one aggregate, the Order entity is considered to be the root but you also want to make use of the Client entity for simple things. In another the Client is the root entity and the Order entity is touched ever so lightly.
An example:
Let's say that in the Order aggregate I use the Client only to read details like name, address, build order history and not to make the client do client specific business logic. (like persistence, passwords resets and back flips..).
On the other hand, in the Client aggregate I use the Order entity to report on the client's buying habits, order totals, and order counts, without requiring advanced order functionality like order processing, updating, status changes, etc.
Possible solution
I believe the better solution is to create the entities for each aggregate specific to the aggregate context, because making them full featured (general purpose) and ready for any situation and usage seems like overkill and could potentially become a maintenance nightmare. (and potentially memory intensive)
Question
What is the DDD recommended way of handling this situation?
What is your take on the matter?

The basic driver for these decisions should be the ubiquitous language, and consequently the real-world domain you're modeling. If both approaches work in a specific domain, I'd favor separation over god classes for maintainability reasons.
Apart from separating behavior into different aggregates, you should also take care that you don't mix different bounded contexts. Depending on the requirements of your domain, it could make sense to separate the Purchase Context from the Reporting Context (to extend on your example).
To decide on a context design, context maps are a helpful tool.

You are on the right track. In DDD, entities are not merely containers encapsulating all attributes related to a "subject" (for example: a customer, or an order). This is a very important concept that eludes a lot of people. An entity in DDD represents an operation boundary, thus only the data necessary to perform the operation is considered to be a part of the entity. Exactly which data to include in an entity can be difficult to decide because some data is relevant in several different use cases. Here are some tips when analyzing data:
Analyze invariants: things that must be considered when applying validation rules and that cannot be out of sync should be in the same aggregate.
Drop the database thinking; normalization is not a concern of DDD.
Just because things look the same, it doesn't mean that they are. For example: the current shipping address registered on a customer is different from the shipping address which a specific order was shipped to.
Don't look at reads. Reading, like creating a report or populating a viewmodel/DTO/whatever, has nothing to do with operation boundaries and can typically be a 360-degree view of the data. In fact, don't even use your domain model when returning reads; use a different architectural stack.
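The shipping address tip can be illustrated with a small sketch. The class shapes here are assumptions: the order keeps only the customer's identity plus its own immutable copy of the address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Value object: immutable, compared by its values."""
    street: str
    city: str


class Customer:
    def __init__(self, customer_id, shipping_address):
        self.customer_id = customer_id
        self.shipping_address = shipping_address  # the *current* address

    def move_to(self, new_address):
        self.shipping_address = new_address


class Order:
    def __init__(self, order_id, customer):
        self.order_id = order_id
        # Reference the other aggregate by identity only.
        self.customer_id = customer.customer_id
        # Snapshot of where this order ships; later moves must not change it.
        self.shipped_to = customer.shipping_address


customer = Customer("c1", Address("1 Old Road", "Oslo"))
order = Order("o1", customer)
customer.move_to(Address("2 New Street", "Bergen"))
```

After the move, the customer and the order hold different addresses, and neither aggregate needs the other's full feature set.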

Aggregate - Correct Usage (DDD)

I have been trying to get started on Domain Driven Design (DDD) and therefore I've been studying it for a while now. I have a problem and I seek help around how I can solve it in a DDD fashion.
I have a Client class, which contains a hell lot of attributes - some of them are simple attributes, such as string contactName whereas others are complex ones, such as list addresses, list websites, etc.
DDD advocates that Client should be an Entity and it should also be an Aggregate root - ie, the client code should manipulate only the Client object itself and it's down to the Client object to perform operations on its inner objects (addresses, websites, names, etc.).
Here's the point where I get confused:
There are tons of business rules in the application that depend on the Client's inner objects - for instance:
Depending on the Client's country of birth or residence and her address, some FATCA (a US regulation) restrictions may be applicable.
I need to enrich some inner objects with data that comes from other systems, both internal to my organisation as well as external.
The application has to decide whether a Client is allowed to perform an operation and to that end, the app needs to scrutinize a lot of client details and make a decision - also, as the app scrutinizes the Client it needs to update many of its attributes to keep track of what led the application to that decision.
I could list hundreds of rules here - but you get the idea. My point is that I need to update many of the Client's inner attributes. From the domain perspective, the root is the Client - it's the Client that the user searches for in the GUI. The user cares only about the Client as a whole. Say, an isolated address is meaningless - it only exists if it's part of a Client.
Having said all that, my question is:
Eric Evans says it's OK for the root to return transient references to inner objects, preferably VOs (keyword here: VO) - but any manipulation on the inner objects should be performed by the root itself.
I have hundreds of manipulations that I need to perform on my clients - if I move all of them to the root, the root is going to become huge - it will have at least 10K lines of code!
According to Eric, a VO should be immutable - so if my root returns VOs, the client code won't be allowed to change them. So doing something like this would be unacceptable in a service: client.getExternalInfo().update(getDataFromExternalSystem())
So my question boils down to how on Earth I should update the inner objects without breaking the DDD rules?
I don't see any easy way out.
UPDATE I:
I've just come across Specifications, which seems to be the ideal DDD concept to my problem.
I'm still reading about it but I have decided to post this update anyway.

I have been studying DDD for awhile myself and am struggling to master it.
First, you're right: Specification is a fine pattern to use for validation or business rules in general, assuming the rules you are applying fit well with a predicate tree.
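For readers new to the pattern, a minimal sketch of a Specification with a predicate tree follows. The Client fields and the rule itself are invented for illustration and are not actual FATCA logic.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    country_of_birth: str
    country_of_residence: str


class Specification:
    """Base class; operators build a predicate tree out of smaller specs."""

    def is_satisfied_by(self, candidate) -> bool:
        raise NotImplementedError

    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)

    def __invert__(self):
        return Not(self)


class And(Specification):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def is_satisfied_by(self, candidate):
        return (self.left.is_satisfied_by(candidate)
                and self.right.is_satisfied_by(candidate))


class Or(Specification):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def is_satisfied_by(self, candidate):
        return (self.left.is_satisfied_by(candidate)
                or self.right.is_satisfied_by(candidate))


class Not(Specification):
    def __init__(self, inner):
        self.inner = inner

    def is_satisfied_by(self, candidate):
        return not self.inner.is_satisfied_by(candidate)


class BornIn(Specification):
    def __init__(self, country):
        self.country = country

    def is_satisfied_by(self, client):
        return client.country_of_birth == self.country


class ResidentIn(Specification):
    def __init__(self, country):
        self.country = country

    def is_satisfied_by(self, client):
        return client.country_of_residence == self.country


# Hypothetical rule: review anyone born in or resident in the US.
needs_review = BornIn("US") | ResidentIn("US")
exempt = ~needs_review

alice = Client("Alice", "US", "FR")
bob = Client("Bob", "DE", "DE")
```

Each rule lives in its own small class, so the aggregate root does not have to absorb hundreds of checks.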
Of course, I don't know the specifics of your design, but I do wonder about the model itself. You mention that your Client class has "a hell lot of attributes". Are you sure your model is not somewhat anemic? Could your design benefit from some more analysis, perhaps breaking it out into other Aggregates? Is this a single Bounded Context? Should it be?

Specifications is definitely the way to go for complex business logic.
One question though - are you modeling the inner entities like addresses and names as ValueObjects? The best rule of thumb I can think of for those is if you can say they're equal, without an ID, they're likely value objects. Does your domain consider names to have a state?
If you're looking at a problem where few entities take in many types of change AND need an audit trail, you might want to also explore EventSourcing. Ideally the entity declares its reaction to an event, but you can also have the mutating code be held in the event for easy extensibility. There's pros and cons in that approach, of course.
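As a rough sketch of the first variant (the entity declares its reaction to each event), assuming two made-up event types and an in-memory history in place of a real event store:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactRenamed:
    new_name: str


@dataclass(frozen=True)
class AddressChanged:
    new_address: str


class Client:
    def __init__(self):
        self.contact_name = ""
        self.address = ""
        self.history = []  # doubles as the audit trail

    def apply(self, event):
        # Dispatch to the reaction declared for this event type.
        handler = getattr(self, "_on_" + type(event).__name__)
        handler(event)
        self.history.append(event)

    def _on_ContactRenamed(self, event):
        self.contact_name = event.new_name

    def _on_AddressChanged(self, event):
        self.address = event.new_address

    @classmethod
    def replay(cls, events):
        """Rebuild the current state from the recorded events."""
        client = cls()
        for event in events:
            client.apply(event)
        return client


client = Client()
client.apply(ContactRenamed("Ada"))
client.apply(AddressChanged("1 Main St"))
client.apply(AddressChanged("2 Side St"))
rebuilt = Client.replay(client.history)
```

The history answers "what led to this state" without extra tracking attributes on the entity.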

Should the rule "one transaction per aggregate" be taken into consideration when modeling the domain?

Taking into consideration the domain events pattern and this post, why do people recommend keeping to a one-aggregate-per-transaction model? There are good cases when one aggregate could change the state of another one. Even removing an aggregate (or altering its identity) will lead to altering the state of other aggregates that reference it. Some people say that keeping one transaction per aggregate helps scalability (keeping one aggregate per server). But doesn't this type of thinking break a fundamental characteristic of DDD: being technology agnostic?
So based on the statements above and on your experience, is it bad to design aggregates and domain events that lead to changes in other aggregates, which in turn leads to having 2 or more aggregates per transaction (e.g. when a new order is placed with 100 items, change the customer's state from normal to V.I.P.)?

There are several things at play here and even more trade-offs to be made.
First and foremost, you are right, you should think about the model first. After all, the interplay of language, model and domain is what we're doing this all for: coming up with carefully designed abstractions as a solution to a problem.
The tactical patterns - from the DDD book - are a means to an end. In that respect we shouldn't overemphasize them, even though they have served us well (and caused major headaches for others). They help us find "units of consistency" in the model, things that change together, a transactional boundary. And therein lies the problem, I'm afraid. When something happens and when the side effects of it happening should be visible are two different things. Yet all too often they are treated as one, and thus cause this uncomfortable feeling, to which we respond by trying to squeeze everything within the boundary, without questioning. Still, we're left with that uncomfortable feeling. There are a lot of things that logically can be treated as a "whole change", whereas physically there are multiple small changes. It takes skill and experience, or even blunt trial and error, to know when that is the case. Not everything can be solved this way, mind you.
To scale or not to scale, that is often the question. If you don't need to scale, keep things on one box, be content with a certain backup/restore strategy, you can bend the rules and affect multiple aggregates in one go. But you have to be aware you're doing just that and not take it as a given, because inevitably change is going to come and it might mess with this particular way of handling things. So, fair warning. More subtle is the question as to why you're changing multiple aggregates in one go. People often respond to that with the "your aggregate boundaries are wrong" answer. In reality it means you have more domain and model exploration to do, to uncover the true motivation for those synchronous, multi-aggregate changes. Often a UI or service is the one that has this "unreasonable" expectation. But there might be other reasons and all it might take is a different set of abstractions to solve the same problem. This is a pretty essential aspect of DDD.
The example you gave seems like something I could handle as two separate transactions: an order was placed, and as a reaction to that, because the order was placed with 100 items, the customer was made a VIP. As MikeSW hinted at in his answer (I started writing mine after he posted his), the question is when, who, how, and why should this customer status change be observed. Basically it's the "next" behavior that dictates the consistency requirements of the previous behavior(s).

An aggregate groups related business objects while an aggregate root (AR) is the 'representative' of that aggregate. The AR itself is an entity modeling a (bigger, more complex) domain concept. In DDD a model is always relative to a context (the bounded context - BC), i.e. that model is valid only in that BC.
This allows you to define a model representative of the specific business context and you don't need to shove everything in one model only. An Order is an AR in one context, while in another is just an id.
Since an AR pretty much encapsulates all the lower concepts and business rules, it acts as a whole i.e as a transaction/unit of work. A repository always works with AR because 1) a repo always deals with business objects and 2) the AR represents the business object for a given context.
When you have a use case involving 2 or more AR the business workflow and the correct modelling of that use case is paramount. In a lot of cases those AR can be modified independently (one doesn't care about other) or an AR changes as a result of other AR behaviour.
In your example, it's pretty trivial: when the customer places an order for 100 items, a domain event is generated and published. Then you have a handler which will check if the order complies with the customer promotions rules and if it does, a command is issued which will have the result of changing the client state to VIP.
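That flow could be sketched like this. Everything here is an assumption made for illustration: the EventBus is a tiny synchronous in-process stand-in (a real one would typically queue events and run handlers in their own transaction), the threshold of 100 comes from the question, and the command is reduced to a direct method call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Domain event published after the order aggregate is saved."""
    customer_id: str
    item_count: int


class Customer:
    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.status = "normal"

    def promote_to_vip(self):
        self.status = "VIP"


class EventBus:
    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event):
        for handler in self._handlers.get(type(event), []):
            handler(event)


VIP_THRESHOLD = 100  # promotion rule taken from the example in the question

customers = {"c1": Customer("c1"), "c2": Customer("c2")}
bus = EventBus()


def promote_large_buyers(event):
    # Handler: checks the promotion rule, then changes the *other* aggregate.
    if event.item_count >= VIP_THRESHOLD:
        customers[event.customer_id].promote_to_vip()


bus.subscribe(OrderPlaced, promote_large_buyers)
bus.publish(OrderPlaced("c1", 100))
bus.publish(OrderPlaced("c2", 3))
```

The order and the customer are each modified in their own step, so each transaction still touches a single aggregate.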
Domain events are very powerful and allow you to implement transactions, but in an eventually consistent environment. The old db transaction is an implementation detail and it's usually used when persisting one AR (remember, ARs are treated as a logical unit, but persisting one may involve multiple tables, hence the db transaction).
Eventual consistency is a 'feature' of domain events which naturally fits a rich domain (and the real world, actually). For some cases you might need instant consistency; however, those are particular cases and they are related to the UI rather than how the domain works. Of course, it really depends from one domain to another. In your example, the customer won't mind becoming a VIP 2 seconds or 2 minutes after the order was placed instead of in the same millisecond.

Writing an API, benefits of: including nested objects automatically, not at all, or provide a parameter to specify which to include?

For example, we have an entity called ServiceConfig that contains a pointer to a Service and a Professional. If returned without including the fields would look like this:
{
    'type': '__Pointer',
    'className': 'Service',
    'objectId': 'q92he840'
}
At which point they could query again to retrieve that service. However, it is often the case that they need the Service name. In which case it is inefficient to have to query again to get the service every time.
Options:
Automatically return the Service. In which case, we should automatically return the Industry for that Service as well in case they need that... same applies to all. Seems like we're returning data too often here.
Allow them to pass an includes parameter that specifies which entities to include. Format is an array of strings where using a . can allow them to include subclasses. In this case ['Professional', 'Service.Industry'] would work.
Can anyone identify why any one solution would be better than the others? I feel that the last solution is the best; however, it does not seem to be common in the APIs I've seen.

This is a good API Design decision to spend your time on before you release an initial version. Both your approaches are valid and it all depends on what you think are the most common ways that clients would use your API.
Here are some points that you could consider:
You might prefer the first approach, where you do not give all the data upfront. Sometimes it is about efficiency, and at times it is also about security and ensuring that any additional important data is only fetched on an as-needed basis and on authorization.
Implementing the 2nd approach is going to take more effort on the part of your team to design, code and test the API. So you might want to consider how much effort you want to put into release 1.0.
Since you have nested data, the second approach will serve you well. Several public APIs do that as a matter of fact. For example, look at the LinkedIn public API and particularly the facets section, where you can specify the fields or additional information that you would like returned.
Look at some of the client applications that you have written, and if you can identify for sure that some data is needed upfront anyway, then it can help in designing the return data.
Eventually monitoring API usage and doing some analysis on the number of calls, methods invoked will give you good inputs on what to do next.
If I had to make a choice and have a little bit more leeway in terms of effort, I would go with the 2nd option, even if it is a simple version at first.
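A simple first version of the includes parameter might look like the sketch below. The store dictionary, the field names and the sample records are all made up for illustration; only the pointer shape and the dotted-path format come from the question.

```python
import copy


def pointer(class_name, object_id):
    return {"type": "__Pointer", "className": class_name, "objectId": object_id}


def is_pointer(value):
    return isinstance(value, dict) and value.get("type") == "__Pointer"


def resolve_includes(entity, includes, store):
    """Return a copy of entity with the pointers named in includes expanded.

    Each include is a dotted path such as 'Service.Industry'; every segment
    that is still a pointer is replaced by the record it refers to.
    """
    result = copy.deepcopy(entity)
    for path in includes:
        node = result
        for field_name in path.split("."):
            value = node.get(field_name)
            if is_pointer(value):
                key = (value["className"], value["objectId"])
                value = copy.deepcopy(store[key])
                node[field_name] = value
            if not isinstance(value, dict):
                break  # missing or scalar field: nothing further to expand
            node = value
    return result


# Hypothetical data standing in for the backend.
store = {
    ("Service", "q92he840"): {"name": "Haircut",
                              "Industry": pointer("Industry", "ind1")},
    ("Industry", "ind1"): {"name": "Beauty"},
    ("Professional", "p1"): {"name": "Sam"},
}
service_config = {
    "Service": pointer("Service", "q92he840"),
    "Professional": pointer("Professional", "p1"),
}

bare = resolve_includes(service_config, [], store)
full = resolve_includes(service_config, ["Professional", "Service.Industry"], store)
```

With an empty includes list the client still gets pointers, so the default response stays small and callers opt in to the nested data they need.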
