Does an external service (API) fit the DDD definition of a Repository?

I haven't found anything concrete on the topic from the gurus, except this from Vaughn Vernon:

There are other ways to implement an Anticorruption Layer, such as by means of a Repository (12). However, since Repositories are typically used to persist and reconstitute Aggregates, creating Value Objects by that means seems misplaced. If our goal is to produce an Aggregate from an Anticorruption Layer, a Repository may be a more natural source.
My concern is that ORMs and database queries are mostly what the DDD literature uses as examples of Repositories. My project is very scarce in domain logic because it's mainly a wrapper over a couple of APIs and combines the outcomes of those Contexts. The responsibilities of the project are broad and could fit many areas/contexts of the business as a whole. The only architectural rule enforced from the beginning is the onion architecture, and at least the DDD technical modeling concepts seem fitting to me. I must say it's hard to reason about the domain in this particular ongoing project.

Does an external service (API) fit the DDD definition of a Repository?
Maybe?
REPOSITORIES address the middle and end of the [domain object's] life cycle, providing the means of finding and retrieving persistent objects while encapsulating the immense infrastructure involved.
Repository is a pattern, motivated by the notion of separation of concerns -- you shouldn't have to fuss with the details of persistence when you are working on the domain logic.

DDD is about the domain only. Details of how your app persists entities are not its concern. That's why you define the interface of your repository (in the case of .NET) in your domain, while the actual implementation is part of the infrastructure of your code.
Repositories are nothing but a pattern for doing "CRUD" operations on an entity without worrying about how it's done. Remember that your client class (the one using the repository) can only see the exposed public methods. Whatever happens inside is a mystery :)
DDD says: give me an interface to operate with; how you implement it is your problem. You can effectively persist your entities using an external API (think of the Twitter API), a text file, or an ORM (direct connection to a database). DDD doesn't care.
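To make that concrete, here is a minimal sketch (all names are hypothetical, not from any real API): the domain declares the interface, and an infrastructure adapter satisfies it by calling a remote service instead of a database.

```typescript
// Domain layer: the aggregate and the repository contract.
// The domain knows nothing about HTTP.
interface Customer {
  id: string;
  name: string;
}

interface CustomerRepository {
  findById(id: string): Promise<Customer | null>;
  save(customer: Customer): Promise<void>;
}

// Infrastructure layer: an implementation backed by an external API.
// `fetchJson` stands in for whatever HTTP client you actually use.
type FetchJson = (
  url: string,
  init?: { method: string; body?: string }
) => Promise<unknown>;

class ApiCustomerRepository implements CustomerRepository {
  constructor(private baseUrl: string, private fetchJson: FetchJson) {}

  async findById(id: string): Promise<Customer | null> {
    const data = await this.fetchJson(`${this.baseUrl}/customers/${id}`);
    return data ? (data as Customer) : null;
  }

  async save(customer: Customer): Promise<void> {
    await this.fetchJson(`${this.baseUrl}/customers/${customer.id}`, {
      method: "PUT",
      body: JSON.stringify(customer),
    });
  }
}
```

Swapping the API-backed implementation for an ORM-backed one changes nothing on the domain side.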

Take a modern JavaScript website as an example. You'll have plenty of REST calls to create/find/update/delete your domain objects.
In the case of a server application, you'll have a database and a DAO implementation as a client interface to your database. In your web application, you'll also have some REST-client functionality as a client interface to your server application. Both are considered repositories, no matter whether the implementation of the client interface accesses data in your database, your server, your file system, etc.

Related

Domain services seem to require only a fraction of the total queries defined in repositories -- how to address that?

I'm currently facing some doubts about layering and repositories.
I was thinking of creating my repositories in a persistence module. Those repositories would inherit from (or implement/extend) repositories declared in the domain layer module, keeping the domain "persistence agnostic".
The issue is that, from all I can see, the needs of the domain layer regarding its repositories are quite humble. In general, they tend to be rather CRUDish.
It's generally at the application layer, when solving particular business use-cases, that the queries tend to be more complex and contrived (and thus the number of repository methods tends to explode).
So this raises the question of how to deal with this:
1) Should I just leave the domain repository interfaces simple and add the extra methods in the repository implementations (such that the application layer, which does know about the repository implementations, can use them)?
2) Should I just add those methods at the domain level repository implementations? I think not.
3) Should I create another set of repositories to be used just at the application layer level? This would probably mean moving to a more CQRSesque application.
Thanks
I think you should react to the realities of your business / requirements.
That is, if your use-cases are clearly not "persistence agnostic", then don't hold on to that particular restriction. Not everything can be reduced to CRUD; in fact, I think most things worth implementing can't be reduced to CRUD persistence. Most database systems, relational or otherwise, have a lot of features nowadays, and it seems quaint to just ignore those. Use them.
If you don't want to mix SQL with other code, there are still a lot of other "patterns" that let you do that without requiring you to abstract access to something you actually don't need an abstraction for.
On the flip side, you build a dependency on a particular persistence system. Is that a problem? Most of the time it actually isn't, but you have to decide for yourself.
All in all, I would choose option 4: model the problem. If I need complicated SQL to build a use-case, and I don't need database independence (I rarely if ever do), then I just write it where it is used, end of story.
You can use other tools like refactoring later to correct design issues.
The Application layer doesn't have to know about the Infrastructure.
Normally it should be fine working with just what the Repository interfaces declared in the Domain provide. The concrete implementations are injected at runtime.
Declaring repository interfaces in the Domain layer is not only about using them in domain services but also elsewhere.
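A minimal sketch of that wiring (names hypothetical): the application service depends only on the domain interface, and the composition root injects a concrete implementation at runtime.

```typescript
// Domain layer: the contract only.
interface Order {
  id: string;
  total: number;
}
interface OrderRepository {
  findById(id: string): Promise<Order | null>;
}

// Application layer: depends on the interface, never on a concrete class.
class GetOrderTotal {
  constructor(private orders: OrderRepository) {}
  async execute(id: string): Promise<number> {
    const order = await this.orders.findById(id);
    if (!order) throw new Error(`order ${id} not found`);
    return order.total;
  }
}

// Infrastructure layer: one possible implementation, chosen at the
// composition root; a SQL- or REST-backed one would slot in the same way.
class InMemoryOrderRepository implements OrderRepository {
  constructor(private rows: Map<string, Order>) {}
  async findById(id: string): Promise<Order | null> {
    return this.rows.get(id) ?? null;
  }
}
```

The composition root does `new GetOrderTotal(new InMemoryOrderRepository(...))`; swapping the repository implementation touches no application code.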
Should I create another set of repositories to be used just at the application layer level? This would probably mean moving to a more CQRSesque application.
You could do that, but you would lose some reusability.
It is also not related to CQRS - CQRS is a vertical division of the whole application between queries and commands, not giving horizontal layers different ways of fetching data.
Given that a repository is not about querying but about working with full aggregates most of the time, perhaps you could elaborate on why you may need to create a separate set of repositories that are used only in your application/integration layer?
Perhaps you need to have a read-specific implementation that is optimised for data retrieval:
This would probably mean moving to a more CQRSesque application
Well, you'd probably want to implement read-specific bits that make sense. I usually have my data access separated by namespace and, at times, even in a separate assembly. I then use I{Aggregate}Query implementations that return the relevant bits of data in as simple a type as possible. However, it is quite possible to map to a more complex read model that even has relations, but it is still only a read model and is not concerned with any command processing. To this end, the domain is never even aware of these classes.
I would not go with extending the repositories.
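As a rough sketch of that read side (types and names invented here, following the I{Aggregate}Query naming from the answer): the query interface returns a flat read model and is never seen by the domain.

```typescript
// Read model: a flat shape optimised for display, not an aggregate.
interface OrderSummary {
  id: string;
  customerId: string;
  total: number;
}

// The read-side contract; the domain never references these types.
interface IOrderQuery {
  summariesFor(customerId: string): Promise<OrderSummary[]>;
}

// One possible implementation reading a pre-joined source
// (a SQL view, a denormalised table, a cached projection, ...).
class InMemoryOrderQuery implements IOrderQuery {
  constructor(private rows: OrderSummary[]) {}
  async summariesFor(customerId: string): Promise<OrderSummary[]> {
    // In a database-backed implementation this filter would be a WHERE clause.
    return this.rows.filter((r) => r.customerId === customerId);
  }
}
```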

Is a repository only limited to the database in domain driven design?

When talking about a repository, everyone imagines an abstraction over a database.
But can a repository be an abstraction over a REST service or some other source of loading aggregates?
Yes, totally. It can even be an implementation over an event store.
The abstract concept is the repository; that's all that matters to the domain, nothing else.
Implementation details belong to the infrastructure (and are called adapters, in ports-and-adapters terms).
But can a repository be an abstraction over a REST service or some other source of loading aggregates?
Yes. Evans's motivation for the Repository pattern was to provide the application with the illusion that the collection of aggregates lives in memory, providing a clean separation between the code that needs to understand the details of persistence and the code that does not.
See Domain Driven Design, chapter 6.

Service vs Controller vs Provider naming conventions

As I grow in my professional career, I consider naming conventions very important. I noticed that people throw around controller (LibraryController), service (LibraryService), and provider (LibraryProvider) and use them somewhat interchangeably. Is there any specific reasoning to use one vs. the other?
If there are websites that have more concrete definitions, that would be great.
In Java Spring, Spring Boot and also .NET you would have:
Repository: persists data in the database and performs SQL queries.
Service: contains most of the business logic.
Controller: defines REST endpoints and contains as little logic as possible.
Conceptually this means that the WHAT (functional) is separated from the HOW (technical) as much as possible. The services try to stay technologically neutral. By contrast, a controller only wants to define an external contract for communication. And finally, the repository only wants to facilitate access to the database.
Organizing your code in this way keeps the business logic short, clean and maintainable. Unfortunately it is not always easy to keep them separated; e.g. it is tempting to pollute or enrich your objects with metadata in the form of decorators/annotations (e.g. database column name and data type).
Some developers don't see harm in this and get away with it. Others keep their objects strictly separated and define multiple sets of objects.
The objects for the database are often referred to as "entities" or "models".
For a REST controller, they are often referred to as DTOs, which stands for data transfer objects.
Having multiple objects means that you need Mappers to convert one type of object to another. Some frameworks can do this for you (e.g. MapStruct).
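A hand-written mapper of the kind a tool like MapStruct would generate might look like this (the entity and DTO shapes are made up for illustration):

```typescript
// Persistence-side entity, shaped like the database row.
interface BookEntity {
  book_id: number;
  book_title: string;
  author_name: string;
}

// Transport-side DTO, shaped like the REST contract.
interface BookDto {
  id: number;
  title: string;
  author: string;
}

// The mapper keeps the two shapes decoupled: renaming a database
// column touches this one function, not every controller.
function toBookDto(entity: BookEntity): BookDto {
  return {
    id: entity.book_id,
    title: entity.book_title,
    author: entity.author_name,
  };
}
```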
It would be easy to claim that strictness is always a good thing, but it can slow you down. It's okay to strike a balance.
In Node.js, the concepts of controllers and services are identical. However, the term Repository isn't used very often; instead, it's often called a Provider, or Repositories are just generalized as a kind of Service.
NestJS has stronger opinions about this (which can be a good thing). The naming conventions of NestJS (a Node.js framework) are strongly influenced by the naming conventions of Angular, which is of course a totally different kind of framework (front-end).
(For completeness: in Angular, a Provider is actually just something that can be injected as a dependency. Most providers are Services, but not necessarily; it could be a Guard or even a Module. A Module would be more like a set of tools/drivers or a connector.
PS: Anyway, the usage of the term Module is a bit confusing, because there are also "ES6 modules", which are a totally different thing.)
ES6 and more modern versions of JavaScript (including TypeScript) are extremely powerful when it comes to constructing and destructuring objects, and that can make mappers unnecessary.
Having said that, most Node.js and Angular developers prefer to use TypeScript these days, which has more features than Java or C# when it comes to defining types.
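For instance, destructuring can rename and reshape in a single expression, which often replaces a dedicated mapper class (sample data invented):

```typescript
// A row as it might come back from persistence (snake_case keys).
const row = { book_id: 7, book_title: "Domain-Driven Design", author_name: "Evans" };

// Destructure-and-rename in one step: no mapper class needed.
const { book_id: id, book_title: title, author_name: author } = row;
const dto = { id, title, author };
```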
So, all these frameworks are influencing each other. And they pretty much all agree on what a Controller and a Service is. It's mostly the Repository and Provider words that have different meanings. It really is just a matter of conventions. If your framework has a convention, then stick to that. If there isn't one, then pick one yourself.
These terms can be synonymous with each other depending on context, which is why each framework or language creator is free to explicitly define them as they see fit... think function/method/procedure or process/service: all pretty much the same thing, but with slight differences in different contexts.
Just going off formal English definitions:
Provider: a person or thing that provides something.
i.e. the provider does a service by controlling some process.
Service: the action of helping or doing work for someone.
i.e. the service is provided by controlling some work process.
Controller: a person or thing that directs or regulates something.
i.e. the controller directs something to provide a service.
These definitions are just listed to explain how a developer looks at common English meanings when defining the terminology of a framework or language; it's not always one-for-one, and the similarity in terminology actually gives the developer a means of naming things that are very, very similar but still slightly different.
So, for example, let's take AngularJS. Here the developers decided to use the term Controller to imply an "HTML Controller", a Service to imply something like a "quasi-class" (since they are instantiated with the new keyword), and a Provider as really a superset of Service and Factory, which is also similar. You could really program any application using any of them and wouldn't lose much, though one might be a little better than another in a certain context; I don't personally believe it's worth the extra confusion... essentially they are all providers. The Angular people could have just defined factory, provider and service as a single term, "provider", and then passed in modifiers for things like "static" and "void" as most languages do, and the exact same functionality could have been provided. That would have been my preference; however, I've learned not to fight the conventions and terminology of the framework you're working in, no matter how strongly you disagree.
I was also looking for a more meaningful name than Provider :) and found this a useful post.
Old dev here that stumbled on this. My opinion, and how I've seen it used over the last 20 years, is that it varies by language, but the Java/C# crowd mostly uses these terms as follows.
A service handles business logic and deals with domain objects. You find services in controllers and other services.
A repository does NOT handle business logic, but instead acts like a pool of domain objects (with helper methods for finding or persisting them). Services often contain repositories. Repositories often contain a context and are responsible for mapping from infrastructure-shaped data to domain-shaped data if the definitions have drifted apart. Controllers also often contain repositories for CRUD endpoints.
A context handles infrastructure the domain owns. Most often this is a database, but "context" means that anything that touches this data does so through (in) this context. A context returns infrastructure-shaped data. A repository often contains a context. A context directly in services is sometimes appropriate. A context in a controller is a hard no.
A provider provides access to infrastructure some other app owns. Most often these are REST APIs, but they can also be Kafka streams or RPC classes that read data from or push data to someone else. If the source of truth for some of your domain object's fields changes, you will probably see a provider next to a context in your repository, and your repository handles insulating the rest of your code from that change. Providers that provide RPC functionality are often found in services. In microservices, gateways, or vertical-slice architectures you sometimes see providers directly in controllers.
One old guy's opinion, but I hope it helps.
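The layering described above could be sketched as follows (all names hypothetical): the service contains a repository, and the repository contains a context and maps infrastructure-shaped data to domain-shaped data.

```typescript
// Context: handles infrastructure the domain owns; returns
// infrastructure-shaped data (here, a fake table of rows).
interface UserRow {
  user_id: number;
  display_name: string;
}
class UserContext {
  constructor(private rows: UserRow[]) {}
  async queryById(id: number): Promise<UserRow | null> {
    return this.rows.find((r) => r.user_id === id) ?? null;
  }
}

// Domain shape.
interface User {
  id: number;
  name: string;
}

// Repository: no business logic; maps infrastructure shape to domain shape.
class UserRepository {
  constructor(private context: UserContext) {}
  async findById(id: number): Promise<User | null> {
    const row = await this.context.queryById(id);
    return row ? { id: row.user_id, name: row.display_name } : null;
  }
}

// Service: business logic against domain objects; contains the repository.
class GreetingService {
  constructor(private users: UserRepository) {}
  async greet(id: number): Promise<string> {
    const user = await this.users.findById(id);
    return user ? `Hello, ${user.name}` : "Hello, stranger";
  }
}
```

A provider would sit beside the context inside the repository, with the same mapping duty for data some other app owns.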

Does OData violate separation of concern?

I am looking at OData, and it is very powerful; at the same time it's very disconcerting. It is the equivalent of exposing your data source to a remote user. There is no service layer, nothing, nada, and very few injection points, resulting in an almost comical 2-tier architecture.
My concerns are:
It's hard to enforce patterns such as DDD while using OData.
It is also hard to use OData against a set of SOA Data Transfer Objects, because these are not usually queryable; i.e. by the time you get the DTOs, the DB query is done, but OData is just starting up. There is not a lot of value in querying against them, because the DTOs are already in memory.
I can create a view on the DB itself that's a representation of the OData entity, but this puts UI concerns into persistence. Big no-no.
At best, combining the result sets from various DDD services now happens at the UI layer using kludgy JavaScript - a maintenance nightmare with a poor reusability record.
Another possibility is to write a translator for the OData entity, which is likely a ViewModel-level class, that then translates to the DTOs, which then translate to domain objects, which then translate to the conceptual models; the rest the ORM can handle. But this would require an inordinate amount of resources.
In short, how do you make OData play nice with SOA, OO encapsulation principles, DDD and just good old SOC?
There needs to be a clear separation between the OData standard and OData implementations.
As for the standard:
The standard itself, in my view, lets you have an out-of-the-box accessible data endpoint with full metadata in a portable manner. With support for relations and projections, consumers can compose viewmodels on the server or later on the client. It is important to note that OData supports exposing operations (functions), so on top of automatic CRUD you can have remote methods that integrate seamlessly into the relational pattern (functions can have results that you can further query, still on the server, and functions can also act as the "smart writer" code).
To wrap it up: OData gives shape to what actually happens most of the time when someone needs massive REST-compliant data access: it formalizes common, always-repeated scenarios, like how to query for data or submit data back. This might still be affected by what you write in point 4, but in my opinion that's not an OData-related issue. This is simply how AJAX and mashups work: you'll have a client with lots of code dealing with combining data.
Other issues of yours can be answered with selecting the most appropriate server implementation. There are a couple of implementations already:
EF4/EF5 + WCF Data Services, the most "automatic" option. In this use case you might just be correct regarding some of your concerns, but with the fine extensibility model of EF you can interact with the automatic operations as you wish. Having an application that is driven by the actual EDMX model is a true DDD scenario.
ASP.NET Web API lets you have a totally code-based back-end for what the client perceives as a static, relational endpoint, so this is where you can think in 3 layers: the DB, a middle tier to bridge between DB data and what is best for the clients, and a client tier as a smart consumer of that model.
JayData provides these in server-side JavaScript with the added dynamism of JavaScript.
SAP NetWeaver Gateway exposes complex SAP data in a manner that is easy to consume for simple scenarios.
OO concerns:
With V3 of OData we now have "instance methods" (which are definitely server methods too), so what SOA took away by brutally separating things into data and functionality, OData gives back: functionality and data defined and encapsulated together, mapped to the SOA concept of static methods with context data acting as the "this".
2-tier concerns:
IMHO, 2-tier architectures became "ancient" not because the client had too much knowledge about the server-side structure, but because they did not scale well and the new thin clients were too dumb to act like a real DB client. In fact, 2-tier was always easier to work with (from a developer's point of view), and now that OData server implementations are indeed a middle tier with logic, you can actually get the best of both worlds:
- code against a virtually 2-tier architecture
- and scale as a normal n-tier application can.

SOA architecture data access

In my SOA architecture, I have several WCF services.
All of my services need to access the database.
Should I create a specialized WCF service in charge of all database access?
Or is it OK if each of my services has its own database access?
In one version, I have just one Entity layer instanced in one service, and all the other services depend on this service.
In the other one the Entity layer is duplicated in each of my services.
The main drawback of the first version is the coupling it induces.
The drawback of the other version is the layer duplication, and maybe that it's SOA bad practice?
So, what do you think, good people of Stack Overflow?
Just my personal opinion: if you create a service for all database access, then multiple services depend on ONE service, which sort of defeats the point of SOA (i.e. services are autonomous), as you have articulated. When you talk of layer duplication: if each service has its own data to deal with, is it really duplication? I realize that you probably have the same means of interacting with your relational databases, or, back from the OOA days, a common class library that encapsulated data access for you. This is one of those things I struggle with myself, but I see no problem with each service having its own data layer. In fact, in Michele Bustamante's book (Chapter 1, page 8) she actually depicts this and adds "Services encapsulate business components and data access"; if you notice, each service has a separate DALC layer. This is a good question.
It sounds as if you have several services but a single database.
If this is correct, you do not really have a pure SOA architecture, since the services are not independent. (There is nothing wrong with not having a pure SOA architecture; it can often be the correct choice.)
Adding an extra WCF layer would just complicate and slow down your solution.
I would recommend that you create a single data access DLL which contains all data access and is referenced by each WCF service. That way you do not have any duplication of code. Since you have a single database, any change in the database/data layer would require a redeployment of all services in any case.
Why not just use a dependency injection framework? If the services are currently using the same database, then just allow them to share the same code; and if they were in the same project, they would all use the same DLL.
That way, later, if you need to put in some code that you don't want the others to share, you can make changes and just create a new DAO layer.
If there is a certain singleton that all will use, then you can just inject that in when you inject the DAO layer.
But this will require that they all use the same DI framework container.
The real win that SOA brings is that it reduces the number of linkages between applications.
In the past I've worked with organizations that have done it many different ways. Some data layers are integrated, and some are abstracted.
The way I've seen it done most successfully is when you create generic data-layer services for each app/database and build the higher-level services on top of that newly created data layer.