OOP and Persistence

OOP and Persistence - oop

In the spirit of tell don't ask and never let an object get into an invalid state oop design, I'm wondering how persistence would be handled in a dynamic environment.
For a contrived example, imagine you need to write a POS application for an airline. You can only sell seats that are available. Seats are grouped such that plane -> sections -> rows -> seats. If a section is unavailable then all rows and therefore seats in that section are also unavailable. And obviously if a row in unavailable then all seats in the row are also not available.
Now the environment is highly dynamic in that maintenance personnel may be making sections, rows, or seats available/unavailable frequently. Further, imagine it's somewhat expensive to build the airplane object graph. However without constructing the entire graph for each sale attempt I don't see how you can keep business rules out of the persistence layer, which in my mind is an absolute must.
Is oop just not a viable choice for this kind of problem?
Edit:
If it makes a difference, assume the system is persisted by a db server and the inputs to the system are made via http thin clients.

I don't see why you can't construct the airplane object once, and manage it on the fly? You'd have to make it safe for concurrent access, but isn't this EXACTLY the type of problem OOP is good for?
Maybe I misunderstand your scenario.

Separate seat availability from from the plane itself. The whole plane can't be reconstructed on each request, but the seat availability might be able to be requeried each time.
availableSeats = availabilityService.getAvailableSeats(flightId)
availableSections = planeObj.getAvailableSections(seatAvailabilityList)

Related

What is the use of single responsibility principle?

I am trying to understand the Single Responsibility principle but I have tough time in grasping the concept. I am reading the book "Design Patterns and Best Practices in Java by Lucian-Paul Torje; Adrian Ianculescu; Kamalmeet Singh ."
In this book I am reading Single responsibility principle chapter ,
where they have a car class as shown below:
They said Car has both Car logic and database operations. In future if we want to change database then we need to change database logic and might need to change also car logic. And vice versa...
The solution would be to create two classes as shown below:
My question is even if we create two classes , let’s consider we are adding a new property called ‘price’ to the class CAR [Or changing the property ‘model’ to ‘carModel’ ] then don’t you think we also need to update CarDAO class like changing the SQL or so on.
So What is the use of SRP here?

Great question.
First, keep in mind that this is a simplistic example in the book. It's up to the reader to expand on this a little and imagine more complex scenarios. In all of these scenarios, further imagine that you are not the only developer on the team; instead, you are working in a large team, and communication between developers often take the form of negotiating class interfaces i.e. APIs, public methods, public attributes, database schemas. In addition, you often will have to worry about rollbacks, backwards compatibility, and synchronizing releases and deploys.
Suppose, for example, that you want to swap out the database, say, from MySQL to PostgreSQL. With SRP, you will reimplement CarDAO, change whatever dialect-specific SQL was used, and leave the Car logic intact. However, you may have to make a small change, possibly in configuration, to tell Car to use the new PostgreSQL DAO. A reasonable DI framework would make this simple.
Suppose, in another example, that you want to delegate CarDAO to another developer to integrate with memcached, so that reads, while eventually consistent, are fast. Again, this developer would not need to know anything about the business logic in Car. Instead, they only need to operate behind the CRUD methods of CarDAO, and possibly declare a few more methods in the CarDAO API with different consistency guarantees.
Suppose, in yet another example, your team hires a database engineer specializing in compliance law. In preparation for the upcoming IPO, the database engineer is tasked with keeping an audit log of all changes across all tables in the company's 35 databases. With SRP, our intrepid DBA would not have to worry about any of the business logic using any of our tables; instead, their mutation tracking magic can be deftly injected into DAOs all over, using decorators or other aspect programming techniques. (This could also be done of the other side of the SQL interface, by the way.)
Alright one last one - suppose now that a systems engineer is brought onto the team, and is tasked with sharding this data across multiple regions (data centers) in AWS. This engineer could take SRP even further and add a component whose only role is to tell us, for each ID, the home region of each entity. Each time we do a cross-region read, the new component bumps a counter; each week, an automated tool migrates data frequently read across regions into a new home region to reduce latency.
Now, let's take our imagination even further, and assume that business is booming - suddenly, you are working for a Fortune 500 company with multiple departments spanning multiple countries. Business Analysts from the Finance Department want to use your table to plot quarterly growth in auto sales in their post-IPO investor reports. Instead of giving them access to Car (because the logic used for reporting might be different from the logic used to prepare data for rendering on a web UI), you could, potentially, create a read-only interface for CarDAO with a short list of carefully curated public attributes that you now have to maintain across department boundaries. God forbid you have to rename one of these attributes: be prepared for a 3-month sunset plan and many many sad dashboards and late-night escalations. (And please don't give them direct access to the actual SQL table, because the implicit assumption will be that the entire table is the public interface.) Oops, my scars may be showing.
A corollary is that, if you need to change the business logic in Car (say, add a method that computes the lower sale price of each Tesla after an embarrassing recall), you wouldn't touch the CarDAO, since if car.brand == 'Tesla; price = price * 0.6 has nothing to do with data access.
Additional Reading: CQRS

For adding new property you need to change both classes only if that property should be saved to database. If it is a property used in business logic then you do not need to change DAO. Also if you change your database from one vendor to another or from SQL to NoSQL you will have to make changes only in DAO class. And if you need to change some business logic then you need to change only Car class.

Single responsibility principle as stated by Robert C. Martin means that
A class should have only one reason to change.
Keeping this principle in mind will generally lead to smaller and highly cohesive classes, which in turn means that less people need to work on these classes simultaneously, and the code becomes more robust.
In your example, keeping data access and business logic (price calculation) logic separate means that you are less likely to break the other when making changes.

Domain Driven Design - Creating general purpose entities vs. Context specific Entities

Situation
Suppose you have Orders and Clients as entities in your application. In one aggregate, the Order entity is considered to be the root but you also want to make use of the Client entity for simple things. In another the Client is the root entity and the Order entity is touched ever so lightly.
An example:
Let's say that in the Order aggregate I use the Client only to read details like name, address, build order history and not to make the client do client specific business logic. (like persistence, passwords resets and back flips..).
On the other hand, in the Client aggregate I use the Order entity to report on the client's buying habbits, order totals, order counting, without requiring advanced order functionality like order processing, updating, status changes, etc.
Possible solution
I believe the better solution is to create the entities for each aggregate specific to the aggregate context, because making them full featured (general purpose) and ready for any situation and usage seems like overkill and could potentially become a maintenance nightmare. (and potentially memory intensive)
Question
What is the DDD recommended way of handling this situation?
What is your take on the matter?

The basic driver for these decisions should be the ubiquitous language, and consequently the real world domain you're modeling. If both works in a specific domain, I'd favor separation over god-classes for maintainability reasons.
Apart from separating behavior into different aggregates, you should also take care that you don't mix different bounded contexts. Depending on the requirements of your domain, it could make sense to separate the Purchase Context from the Reporting Context (to extend on your example).
To decide on a context design, context maps are a helpful tool.

You are one the right track. In DDD, entities are not merely containers encapsulating all attributes related to a "subject" (for example: a customer, or an order). This is a very important concept that eludes a lot of people. An entity in DDD represents an operation boundary, thus only the data necessary to perform the operation is considered to be a part of the entity. Exactly which data to include in an entity can be difficult to consider because some data is relevant in a different use-cases. Here are some tips when analyzing data:
Analyze invariants, things that must be considered when applying validation rules and that can not be out of sync should be in the same aggregate.
Drop the database-thinking, normalization is not a concern of DDD
Just because things look the same, it doesn't mean that they are. For example: the current shipping address registered on a customer is different from the shipping address which a specific order was shipped to.
Don't look at reads. Reading, like creating a report or populating av viewmodel/dto/whatever has nothing to do with operation boundaries and can typically be a 360 deg view of the data. In fact don't event use your domain model when returning reads, use a different architectural stack.

Aggregate - Correct Usage (DDD)

I have been trying to get started on Domain Driven Design (DDD) and therefore I've been studying it for a while now. I have a problem and I seek help around how I can solve it in a DDD fashion.
I have a Client class, which contains a hell lot of attributes - some of them are simple attributes, such as string contactName whereas others are complex ones, such as list addresses, list websites, etc.
DDD advocates that Client should be an Entity and it should also be an Aggregate root - ie, the client code should manipulate only the Client object itself and it's down to the Client object to perform operations on its inner objects (addresses, websites, names, etc.).
Here's the point where I get confused:
There are tons of business rules in the application that depend on the Client's inner objects - for instance:
Depending on the Client's country of birth or resident and her address, some FATCA (an US regulation) restrictions may be applicable.
I need to enrich some inner objects with data that comes from other systems, both internal to my organisation as well as external.
The application has to decide whether a Client is allowed to perform an operation and to that end, the app needs to scrutinize a lot of client details and make a decision - also, as the app scrutinizes the Client it needs to update many of its attributes to keep track of what led the application to that decision.
I could list hundreds of rules here - but you get the idea. My point is that I need to update many of the Client's inner attributes. From the domain perspective, the root is the Client - it's the Client that the user searches for in the GUI. The user cares only about the Client as a whole. Say, an isolated address is meaningless - it only exists if it's part of a Client.
Having said all that, my question is:
Eric Evans says it's OK for the root to return transient references to inner objects, preferably VOs (keyword here: VO) - but any manipulation on the inner objects should be performed by the root itself.
I have hundreds of manipulations that I need to perform on my clients - if I move all of them to the root, the root is going to become huge - it will have at least 10K lines of code!
According to Eric, a VO should be immutable - so if my root returns VOs, the client code won't be allowed to change them. So doing something like this would be unacceptable in a service: client.getExternalInfo().update(getDataFromExternalSystem())
So my question boils down to how on Earth I should update the inner objects without breaking the DDD rules?
I don't see any easy way out.
UPDATE I:
I've just come across Specifications, which seems to be the ideal DDD concept to my problem.
I'm still reading about it but I have decided to post this update anyway.

I have been studying DDD for awhile myself and am struggling to master it.
First, you're right: Specification is a fine pattern to use for validation or business rules in general, assuming the rules you are applying fit well with a predicate tree.
Of course, I don't know the specifics of your design, but I do wonder about the model itself. You mention that your Client class has "a hell lot of attributes". Are you sure your model is not somewhat anemic? Could your design benefit from some more analysis, perhaps breaking it out into other Aggregates? Is this a single Bounded Context? Should it be?

Specifications is definitely the way to go for complex business logic.
One question though - are you modeling the inner entities like addresses and names as ValueObjects? The best rule of thumb I can think of for those is if you can say they're equal, without an ID, they're likely value objects. Does your domain consider names to have a state?
If you're looking at a problem where few entities take in many types of change AND need an audit trail, you might want to also explore EventSourcing. Ideally the entity declares its reaction to an event, but you can also have the mutating code be held in the event for easy extensibility. There's pros and cons in that approach, of course.

Should the rule "one transaction per aggregate" be taken into consideration when modeling the domain?

Taking into consideration the domain events pattern and this post , why do people recomend keeping one aggregate per transaction model ? There are good cases when one aggregate could change the state of another one . Even by removing an aggregate (or altering it's identity) will lead to altering the state of other aggregates that reference it. Some people say that keeping one transaction per aggregates help scalability (keeping one aggregate per server) . But doesn't this type of thinking break the fundamental characteristic about DDD : technology agnostic ?
So based on the statements above and on your experience, is it bad to design aggregates, domain events, that lead to changes in other aggregates and this will lead to having 2 or more aggregates per transaction (ex. : when a new order is placed with 100 items change the customer's state from normal to V.I.P. )?

There are several things at play here and even more trade-offs to be made.
First and foremost, you are right, you should think about the model first. Afterall, the interplay of language, model and domain is what we're doing this all for: coming up with carefully designed abstractions as a solution to a problem.
The tactical patterns - from the DDD book - are a means to an end. In that respect we shouldn't overemphasize them, eventhough they have served us well (and caused major headaches for others). They help us find "units of consistency" in the model, things that change together, a transactional boundary. And therein lies the problem, I'm afraid. When something happens and when the side effects of it happening should be visible are two different things. Yet all too often they are treated as one, and thus cause this uncomfortable feeling, to which we respond by trying to squeeze everything within the boundary, without questioning. Still, we're left with that uncomfortable feeling. There are a lot of things that logically can be treated as a "whole change", whereas physically there are multiple small changes. It takes skill and experience, or even blunt trying to know when that is the case. Not everything can be solved this way mind you.
To scale or not to scale, that is often the question. If you don't need to scale, keep things on one box, be content with a certain backup/restore strategy, you can bend the rules and affect multiple aggregates in one go. But you have to be aware you're doing just that and not take it as a given, because inevitably change is going to come and it might mess with this particular way of handling things. So, fair warning. More subtle is the question as to why you're changing multiple aggregates in one go. People often respond to that with the "your aggregate boundaries are wrong" answer. In reality it means you have more domain and model exploration to do, to uncover the true motivation for those synchronous, multi-aggregate changes. Often a UI or service is the one that has this "unreasonable" expectation. But there might be other reasons and all it might take is a different set of abstractions to solve the same problem. This is a pretty essential aspect of DDD.
The example you gave seems like something I could handle as two separate transactions: an order was placed, and as a reaction to that, because the order was placed with a 100 items, the customer was made a VIP. As MikeSW hinted at in his answer (I started writing mine after he posted his), the question is when, who, how, and why should this customer status change be observed. Basically it's the "next" behavior that dictates the consistency requirements of the previous behavior(s).

An aggregate groups related business objects while an aggregate root (AR) is the 'representative' of that aggregate. Th AR itself is an entity modeling a (bigger, more complex) domain concept. In DDD a model is always relative to a context (the bounded context - BC) i.e that model is valid only in that BC.
This allows you to define a model representative of the specific business context and you don't need to shove everything in one model only. An Order is an AR in one context, while in another is just an id.
Since an AR pretty much encapsulates all the lower concepts and business rules, it acts as a whole i.e as a transaction/unit of work. A repository always works with AR because 1) a repo always deals with business objects and 2) the AR represents the business object for a given context.
When you have a use case involving 2 or more AR the business workflow and the correct modelling of that use case is paramount. In a lot of cases those AR can be modified independently (one doesn't care about other) or an AR changes as a result of other AR behaviour.
In your example, it's pretty trivial: when the customer places an order for 100 items, a domain event is generated and published. Then you have a handler which will check if the order complies with the customer promotions rules and if it does, a command is issued which will have the result of changing the client state to VIP.
Domain events are very powerful and allows you to implement transactions but in an eventual consistent environment. The old db transaction is an implementation detail and it's usually used when persisting one AR (remember AR are treated as a logical unit but persisting one may involve multiple tables hence db transaction).
Eventual consistency is a 'feature' of domain events which fits naturally a rich domain (and the real world actually). For some cases you might need instant consistency however those are particular cases and they are related to UI rather than how Domain works. Of course, it really depends from one domain to another. In your example, the customer won't mind it became a VIP 2 seconds or 2 minutes after the order was placed instead of the same milisecond.

Organizing interconnected objects

This is a generic question, I don't know if it belongs to Programming or StackOverflow.
I'm writing a litte simulation. Without going very deep into its details, consider that many kind of identities are involved. They correspond to Object since I'm using a OOP language.
There are Guys that inhabit the world simulated
There are Maps
A map has many Lots, that are pieces of land with some characteristics
There are Tribes (guys belong to tribes)
There is a generic class called Position to locate the elements
There are Bots in control of tribes that move guys around
There is a World that represents the world simulated
and so on.
If the simulated world was laid down as a database, the objects would be tables with lots of references, but in memory I have to use a different strategy. So, for example, a Tribe has an array of Guys as a property, The world has a, array of Bots, of Tribes, of Maps. A Map has a Dictionary whose key is a Position and whose value is a Lot. A Guy has a Position that is where he stands.
The way I lay down such connections is pretty much arbitrary. For example, I could have an array of Guys in the World, or an Array of guys per Lot (the guys standing on a piece of land), or an array of Guys per Bot (with the Guys controlled by the bot).
Doing so, I also have to pass around a lot of objects. For example, a Bot must have informations about the Map and opponent Guys to decide how to move its Guys.
As said, in a database I'd have a Guys table connected to the Lots table (indicating its position), to the Tribe table (indicating which Tribe it belongs to) and so it would also be easy to query "All the guys in Position [1, 5]". "All the Guys of Tribe 123". "All the Guys controlled by Bot B standing on the Lot b34 not belonging to the Tribe 456" and so on.
I've worked with APIs where to get the simplest information you had to make an instance of the CustomerContextCollection and pass it to CustomerQueryFactory to get back a CustomerInPlaceQuery to... When people criticize OOP and cite verbose abstractions that soon smell ridiculous, that's what I mean. I want to avoid such things and having to relay on deep abstractions and (anti pattern) abstract contexts.
The question is: what is the preferred, clean way to manage entities and collections of entities that are deeply linked in multiple ways?

It depends on your definition of "clean". In my case, I define clean as: I can implement desired behavior in an obvious, efficient manner.
Building OOP software is not a data modeling exercise. I'd suggest stepping back a little. What does each one of those objects actually do? What methods are you going to implement?
Just because "guys are in a lot" doesn't mean that the lot object needs a collection of guys; it only needs one if there are operations on a lot that affect all the guys in it. And even then, it doesn't necessarily need a collection of guys - it needs a way to get the guys in the lot. This may be an internally stored collection, but it could also be a simple method that calls back into the world to find guys matching a criteria. The implementation of that lookup should be transparent to anyone.
From the tenor of your questions, it seems like you're thinking of this from a "how do I generate reports" perspective. Step back and think of the behaviors you're trying to implement first.
Another thing I find extremely valuable is to differentiate between Entities and Values. Entities are objects where identity matters - you may have two guys, both named "Chris", but they are two different objects and remain distinct despite having the same "key". Values, on the other hand, act like ints. From your above list, Position sounds a lot like a value - Position(0,0) is Position(0,0) regardless of which chunk of memory (identity) those bits are stored in. The distinction has a bit effect on how you compare and store values vs. entities. For example, your Guy objects (entities) would store their Position as a simple member variable.
I've found a great reference for how to think about such things is Eric Evan's "Domain Driven Design" book. He's focused on business systems, but the discussions are very valuable for how you think about building OO systems in general I've found.

I would say that no 'true' answer exists to your core question -- a best way to manage collections of entities that are linked in multiple ways. It really depends on the kind of application (simulation) - here are some thoughts:
Is execution time important?
If this is the case, there is really no way around analyzing in which way your simulator will iterate over (query) the objects from the pool: sketch out the basic simulation loop and check what kind of events will require to iterate over what kind of model entities (I assume you are developing a discrete-event simulation?). Then you should organize the data structures in a way that optimizes the most frequent/time-consuming events (as opposed to "laying down the connections arbitrarily"). Additionally, you may want to use special data structures (such as k-d trees) to organize entities with properties that you need to query often (e.g., position data). For some typical problems, e.g. collision detection, there is also a whole lot of approaches to solve them efficiently (so look for suitable libraries/frameworks, e.g. for multi-agent simulation).
How flexible do you want to make it?
If you really want to make it super-flexible and really don't want to decide on the hierarchy of the model entities, why not just use an in-memory database? As you already said, databases are easily applicable to your problem (and you can easily save the model state, which may also be useful).
How clean is clean enough?
If you want to be absolutely sure that the rest of your simulator is not affected by the design choices you make in regards of your model representation, hide it behind an interface (say, ModelWorld), which defines methods for all the types of queries your simulator may invoke (this is orthogonal to the second point and may help with the first point, i.e. figuring out what kind of access pattern your simulator exhibits). This allows you to change implementations easily, without affecting any other parts of the simulator code.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas