I'm exposing some CRUD methods through WCF service, for some data objects persisted in a database through NHibernate. Is it a good approach to use NHibernate classes as data contracts, or is it better to wrap them or replace them with some other data contracts? What is your approach?
Our team just went through a good few months debating this design point, so I've got a lot of links to share ;-)
Short answer: You "should" translate from your NHibernate classes into a domain model.
Long answer: I think the answer to this is a matter of principle. If you ever want to be interoperable, you should not use Datasets as your DTOs (I love Hanselman's post on this). I'm not saying that it's never a good idea; clearly people have had success doing so. Just know that you are cutting corners and it's a risky proposition.
If you have complete control over the classes you are pushing the data into, you could build a nice domain model and just map the NHibernate data into those classes. Using the NHibernate classes directly, on the other hand, will more than likely cause serious issues, as IList<> (which a <bag> maps to) is not serializable. You'd have to write your own serializer, or use something like NetDataContractSerializer, but you lose interoperability.
You will need to weigh the amount of work involved in building some wrapper classes, and the translation between them, but then you have complete flexibility in what your domain model will look like. Then you're able to do things (as we have done) like code generation for your NHibernate maps and objects, and your data contracts serve as an abstraction from your data, as they should.
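To make that translation layer concrete, here is a rough sketch of the kind of wrapper/data-contract pair we mean. The Customer/Order types and the translator are invented for illustration, not taken from any particular codebase: the NHibernate entity keeps its IList<> associations, while the data contract exposed over WCF stays a plain, serializable class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.Serialization;

// NHibernate-mapped entity: virtual members for proxying, IList<> for a <bag>.
public class Customer
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<Order> Orders { get; set; } = new List<Order>();
}

public class Order
{
    public virtual Guid Id { get; set; }
    public virtual decimal Total { get; set; }
}

// Data contract exposed by the WCF service: plain, serializable, no NHibernate types.
[DataContract]
public class CustomerDto
{
    [DataMember] public Guid Id { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public List<OrderDto> Orders { get; set; }
}

[DataContract]
public class OrderDto
{
    [DataMember] public Guid Id { get; set; }
    [DataMember] public decimal Total { get; set; }
}

public static class CustomerTranslator
{
    // The translation decides exactly how much of the graph crosses the wire.
    public static CustomerDto ToDto(Customer customer) => new CustomerDto
    {
        Id = customer.Id,
        Name = customer.Name,
        Orders = customer.Orders
            .Select(o => new OrderDto { Id = o.Id, Total = o.Total })
            .ToList()
    };
}
```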
P.S. You might want to take a look at ADO.NET Data Services, which is a RESTful way to expose your data and, at this point, seems to be the most interoperable choice for doing so.
You would not want to expose your domain model directly, but map the domain to some kind of message as it hits the process boundary. You could leverage NHibernate to do the mapping work for you. In this case you would have two mappings: one for your domain model and another for your lightweight messages.
I don't have direct experience in doing this, but I have sent Datasets across via WCF before and that works just fine. I think your biggest issue in using NHibernate classes as data objects over WCF will be a lack of interoperability (as is also the case when using Datasets). Not only does the client have to use .NET, the client must also use NHibernate. This goes against SOA principles, but if you know for sure that you won't be reusing this component then there's no strong reason not to.
It's at least worth a try.
For my application, I am looking at using an ORM and am currently trying to decide whether the domain layer should interface with it through a Data Access Object, repositories, or something else. I am hesitant to pair an ORM with repositories because they can become redundant if the ORM entities are identical to the domain objects, but having one big DAO seems kludgy. I want to keep my SQL centralized, but I can't figure out which of these options, if any, would make the most sense. Any suggestions on an appropriate design pattern?
This is very opinion-based, but I tend toward creating separate entities from my domain models. The domain model needs to closely match your domain, whereas your entities need to closely model your storage. They may initially match very closely, and seem really redundant, but they very often drift dramatically from each other very quickly.
That being said, wrappers that do nothing but map domain entities to persistence entities often feel horrible and like a giant waste of time. Additionally, the separation doesn't pay off until much later in the game, when you are refactoring and realize that your domain isn't quite right but you don't want to modify your persistence layer.
The good news is, most languages/frameworks have some form of a mapping library that will let you automatically map from one object to another that is similarly structured. This is a great way to speed this up initially, while still giving yourself flexibility to create a manual mapping when the requirements change out from under you.
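For example, in .NET a library like AutoMapper can handle the structural copy for you while the two models still look alike. The Order/OrderEntity types below are made up, and the inline configuration is only for brevity (you would normally configure the mapper once at startup):

```csharp
using AutoMapper;

// Domain model and persistence entity start out looking almost identical.
public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class OrderEntity
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public static class OrderMapping
{
    public static OrderEntity ToEntity(Order order)
    {
        // Configure once in real code; shown inline here for brevity.
        var config = new MapperConfiguration(cfg => cfg.CreateMap<Order, OrderEntity>());
        IMapper mapper = config.CreateMapper();
        return mapper.Map<OrderEntity>(order);
    }
}
```

When the models drift apart, you replace the automatic map for the affected members with an explicit, manual mapping and the rest of the code is unaffected.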
As I grow in my professional career, I consider naming conventions very important. I noticed that people throw around controller (LibraryController), service (LibraryService), and provider (LibraryProvider) and use them somewhat interchangeably. Is there any specific reasoning to use one vs. the other?
If there are websites that have more concrete definitions, that would be great.
In Java Spring, Spring Boot and also .NET you would have:
Repository: persist data in the database and perform SQL queries.
Service: contain most of the business logic
Controller: define REST endpoints, which contain as little logic as possible.
Conceptually this means that the WHAT (functional) is separated from the HOW (technical) as much as possible. The services try to stay technologically neutral. By contrast a controller only wants to define an external contract for communication. And finally the repository only wants to facilitate the access to the database.
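A minimal sketch of that layering in .NET terms might look like the following (the Book types, route, and method names are purely illustrative):

```csharp
using Microsoft.AspNetCore.Mvc;

// Controller: defines the REST endpoint and contains as little logic as possible.
[ApiController]
[Route("api/books")]
public class BookController : ControllerBase
{
    private readonly BookService _service;
    public BookController(BookService service) => _service = service;

    [HttpGet("{id}")]
    public ActionResult<BookDto> Get(int id) => _service.GetBook(id);
}

// Service: holds the business logic and stays technology-neutral.
public class BookService
{
    private readonly BookRepository _repository;
    public BookService(BookRepository repository) => _repository = repository;

    public BookDto GetBook(int id)
    {
        var book = _repository.FindById(id);
        return new BookDto { Id = book.Id, Title = book.Title };
    }
}

// Repository: only knows how to fetch and persist data.
public class BookRepository
{
    public Book FindById(int id) => /* query the database here */ new Book { Id = id, Title = "..." };
}

public class Book { public int Id { get; set; } public string Title { get; set; } }
public class BookDto { public int Id { get; set; } public string Title { get; set; } }
```

The controller stays thin, the service holds the decision-making, and the repository is the only piece that knows about the database.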
Organizing your code in this way keeps the business logic short, clean and maintainable. Unfortunately it is not always easy to keep these layers separated. For example, it is tempting to pollute or enrich your objects with metadata in the form of decorators/annotations (e.g. database column names and data types).
Some developers don't see harm in this and get away with it. Others keep their objects strictly separated and define multiple sets of objects.
The objects for the database are often referred to as "entities" or "models".
For a REST controller, they are often referred to as DTOs, which stands for data transfer objects.
Having multiple objects means that you need Mappers to convert one type of object to another. Some frameworks can do this for you (e.g. MapStruct).
It would be easy to claim that strictness is always a good thing, but it can slow you down. It's okay to strike a balance.
In Node.js, the concepts of controllers and services are identical to the above. However, the term Repository isn't used very often. Instead, that would be called a Provider, or Repositories are simply generalized as a kind of Service.
NestJS has stronger opinions about this (which can be a good thing). The naming conventions of NestJS (a Node.js framework) are strongly influenced by the naming conventions of Angular, which is of course a totally different kind of framework (front-end).
(For completeness: in Angular, a Provider is actually just something that can be injected as a dependency. Most providers are Services, but not necessarily; it could also be a Guard or even a Module. A Module would be more like a set of tools, a driver, or a connector.)
PS: The usage of the term Module is a bit confusing anyway, because there are also "ES6 modules", which are a totally different thing.
ES6 and more modern versions of JavaScript (including TypeScript) are extremely powerful when it comes to constructing and destructuring objects, and that can make mappers unnecessary.
Having said that, most Node.js and Angular developers prefer to use TypeScript these days, which has more features than Java or C# when it comes to defining types.
So, all these frameworks are influencing each other. And they pretty much all agree on what a Controller and a Service is. It's mostly the Repository and Provider words that have different meanings. It really is just a matter of conventions. If your framework has a convention, then stick to that. If there isn't one, then pick one yourself.
These terms can be synonymous with each other depending on context, which is why each framework or language creator is free to explicitly define them as they see fit... think function/method/procedure or process/service: all pretty much the same thing, but with slight differences in different contexts.
Just going off formal English definitions:
Provider: a person or thing that provides something.
i.e. the provider does a service by controlling some process.
Service: the action of helping or doing work for someone.
i.e. the service is provided by controlling some work process.
Controller: a person or thing that directs or regulates something.
i.e. the controller directs something to provide a service.
These definitions are just listed to explain how a developer looks at common English meanings when defining the terminology of a framework or language; it's not always one-for-one, and the similarity in terminology actually gives the developer a way to name things that are very similar but still slightly different.
So, for example, let's take AngularJS. Here the developers decided to use the term Controller to mean "HTML controller", a Service to mean something like a "quasi class" (since they are instantiated with the new keyword), and a Provider is really a superset of Service and Factory, which is also similar. You could really program any application using any of them and wouldn't lose much; though one might be a little better than another in a certain context, I don't personally believe it's worth the extra confusion... essentially they are all providers. The Angular people could have just defined factory, provider and service as a single term, "provider", and then passed in modifiers for things like "static" and "void" as most languages do, and the exact same functionality could have been provided; this would have been my preference, however I've learned not to fight the conventions and terminology of the framework you're working in, no matter how strongly you disagree.
I was also looking for a more meaningful name than Provider :)
And found this useful post
Old dev here who stumbled on this. In my opinion, and from how I've seen these terms used over the last 20 years, it varies by language, but the Java/C# crowd mostly uses them as follows.
A service handles business logic and deals with domain objects. You find services in controllers and other services.
A repository does NOT handle business logic, but instead acts like a pool of domain objects (with helper methods for finding or persisting them). Services often contain repositories. Repositories often contain a context and are responsible for mapping from infrastructure-shaped data to domain-shaped data if the definitions have drifted apart. Controllers also often contain repositories for CRUD endpoints.
A context handles infrastructure the domain owns. Most often this is a database, but context means that anything that touches this data does so through (in) this context. A context returns infrastructure shaped data. A repository often contains a context. Context directly in services is sometimes appropriate. Context in controller is a hard no.
A provider provides access to infrastructure some other app owns. Most often these are REST APIs, but they can also be Kafka streams or RPC classes that read data from or push data to someone else. If the source of truth for some of your domain objects' fields changes, you will probably see a provider next to a context in your repository, and your repository handles insulating the rest of your code from that change. Providers that provide RPC functionality are often found in services. In microservices, gateways, or vertical slice architecture you sometimes see providers directly in controllers.
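A rough sketch of how those pieces might nest (all type names are invented for illustration, and this is only one possible way to wire it up):

```csharp
// Context: owns access to the infrastructure the domain owns (here, a database)
// and returns infrastructure-shaped data.
public class OrdersDbContext
{
    public OrderRecord LoadOrderRecord(int id) { /* run the SQL here */ return new OrderRecord(); }
}

// Provider: accesses infrastructure some other app owns (a REST API, a Kafka stream, ...).
public class PricingProvider
{
    public decimal GetCurrentPrice(string sku) { /* call the external API here */ return 0m; }
}

// Repository: no business logic; maps infrastructure-shaped data to domain-shaped
// objects and insulates the rest of the code from where the data actually lives.
public class OrderRepository
{
    private readonly OrdersDbContext _context;
    private readonly PricingProvider _pricing;

    public OrderRepository(OrdersDbContext context, PricingProvider pricing)
    {
        _context = context;
        _pricing = pricing;
    }

    public Order FindById(int id)
    {
        var record = _context.LoadOrderRecord(id);
        return new Order { Id = id, UnitPrice = _pricing.GetCurrentPrice(record.Sku) };
    }
}

// Service: business logic, built on top of repositories.
public class OrderService
{
    private readonly OrderRepository _orders;
    public OrderService(OrderRepository orders) => _orders = orders;

    public bool IsExpensive(int orderId) => _orders.FindById(orderId).UnitPrice > 100m;
}

public class OrderRecord { public string Sku { get; set; } }
public class Order { public int Id { get; set; } public decimal UnitPrice { get; set; } }
```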
One old guy’s opinion but I hope it helps.
In all the examples I have seen, ORMs tend to be used directly or behind some kind of DAL repository (presumably so that they can be swapped out in the future).
I am no fan of direct ORM use as it will be hard to swap out, but I am equally no fan of losing the full domain change tracking it provides!
In the past I would have written a data mapper class (Fowler) for each object in my domain, but I have learnt through experience that this CRUD coding drains around 1/3 of my time.
I realize that changing my data access strategy is rather unlikely (I have never had to do so before), but I am really concerned that by using an ORM directly I will be locking myself into using it until the end of time.
I have been thinking about wrapping the ORM (no decision on the ORM itself yet) in a generic ORM container and passing this around to finder classes for each of the domain objects. However, I am totally unsure what a generic ORM wrapper class would look like!
Has anyone got any real-life advice here? Please don't feel it necessary to sugar-coat your answers!!
The repository has a number of functions:
It allows for unit testing with a mock implementation
It allows you to hide the full implementation of the ORM from the consumer, and implement security functions
It provides a layer of abstraction for business logic (although some people use a separate service layer for this), and
It allows you to change the ORM implementation, if necessary.
Another container to genericize your ORM feels like overengineering to me. As you pointed out, it is unlikely that you will ever change your underlying implementation, but if you do, your repositories seem like the sensible place to do it.
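As a sketch of that boundary (the interface, entity, and session usage shown are illustrative, not a prescription), consumers depend only on the interface, which is what enables mocking in unit tests and keeps the ORM hidden:

```csharp
using NHibernate;

// Illustrative entity; in practice this is one of your mapped domain classes.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

// Consumers depend only on this interface, so the ORM never leaks out
// and a fake implementation can be swapped in for unit tests.
public interface ICustomerRepository
{
    Customer GetById(int id);
    void Save(Customer customer);
}

public class NHibernateCustomerRepository : ICustomerRepository
{
    private readonly ISession _session;
    public NHibernateCustomerRepository(ISession session) => _session = session;

    public Customer GetById(int id) => _session.Get<Customer>(id);
    public void Save(Customer customer) => _session.SaveOrUpdate(customer);
}
```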
To point you in the direction of someone much wiser than me on these matters, one of the issues with having a generic ORM wrapper as highlighted by Ayende in his blog post The false myth of encapsulating data access in the DAL is that different ORMs are inherently too different to encapsulate effectively, having different methods for transaction handling, etc.
And on top of that, there's really not much point in switching ORMs anyway - one of the main reasons for encapsulating the DAL in case of change was to cope with switching databases, but most modern ORMs are able to work with many different databases anyway.
I'm a student currently dabbling in a .Net n-tier app that uses Nhibernate+WCF+WPF.
One of the things that is done quite terribly is object graph serialisation; in fact, it isn't done at all. Currently associations are ignored and we are using DTOs everywhere.
As far as I can tell, one way to proceed is to predefine which objects and collections should be loaded and serialised to go across the wire, thus being able to present some associations to the client. However, this seems limited, inflexible and inconsistent (can you tell that I don't like this idea?).
One option that occurred to me was to simply replace the NHProxies that lazy-load collections on the client tier with a "disconnected proxy" that would retrieve the associated data over the wire. This would mean we'd have to expand our web service signature a little and do some hackery on our generated proxies, but it seemed like a good T4/other code gen experiment.
As far as I can tell this seems to be a common stumbling block but after doing a lot of reading I haven't been able to figure out any good/generally accepted solutions. I'm looking for a bit of direction as much as any particular solution, but if there is an easy way to make the client "feel" connected please let me know.
You ask a very good question that unfortunately does not have a very clean answer. Even if you were able to get lazy loading to work over WCF (which we were able to do) you still would have issues using the proxy interceptor. Trust me on this one, you want POCO objects on the client tier!
What you really need to consider... what has emerged as the industry-standard approach to this problem, from the research I have seen, is called persistence vs. usage, or persistence ignorance. In other words, your object model and mappings represent your persistence domain, but they do not match your ideal usage scenarios. You don't want to bring the whole database down to the client just to display a couple of properties, right?
It seems like such a simple problem, but the solution is either very simple or very complex. On one hand you can design your entities around your usage scenarios, but then you end up with a proliferation of your object domain, making it difficult to maintain. On the other, you still want the rich object model relationships in order to write granular business logic.
To simplify this problem, let's examine the two main gaps we need to fill: the gap between the database and the service layer, and the gap between the service and the client. NHibernate fills the first one just fine by providing an ORM to load data into your objects. It does a decent job, but in order to achieve great performance it needs to be tweaked using custom loading strategies. I digress…
The second gap, between the server and the client, is where things get dicey. To simplify, imagine you did not send any mapped entities over the wire to the client at all. Try creating a mechanism that translates business entities into DTO objects and, likewise, DTO objects back into business entities. That way your client deals only with DTOs (POCOs, of course), and your business logic can maintain its rich structure. This allows you to leverage not only NHibernate's lazy-loading mechanism, but also other benefits from the session such as the L1 cache.
For brevity and intellectual property reasons I will not go into the design of said mechanism, but hopefully this is enough information to point you in the right direction. If you don't care about performance or latency at all… just turn lazy loading off altogether and work through the serialization issues.
It has been a while for me but the injection/disconnected proxies may not be as bad as it sounds. Since you are a student I am going to assume you have some time and want to muck around a bit.
If you want to inject your own custom serialization/deserialization logic you can use IDataContractSurrogate which can be applied using DataContractSerializerOperationBehavior. I have only done a few basic things with this but it may be worth looking into. By adding some fun logic (read: potentially hackish) at this layer you might be able to make it more connected.
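A skeleton of that idea might look something like the following. This is a hedged sketch: only the two methods where you would unwrap NHibernate proxies are filled in, the proxy detection shown is a simplification, and the rest of the interface is passed through untouched.

```csharp
using System;
using System.CodeDom;
using System.Collections.ObjectModel;
using System.Reflection;
using System.Runtime.Serialization;
using NHibernate.Proxy;

public class NHibernateDataContractSurrogate : IDataContractSurrogate
{
    // Report the real entity type instead of the runtime proxy type.
    public Type GetDataContractType(Type type)
    {
        return typeof(INHibernateProxy).IsAssignableFrom(type) ? type.BaseType : type;
    }

    // Unwrap the proxy (forcing initialization) before serialization.
    public object GetObjectToSerialize(object obj, Type targetType)
    {
        if (obj is INHibernateProxy proxy)
            return proxy.HibernateLazyInitializer.GetImplementation();
        return obj;
    }

    // Everything below is a pass-through for this sketch.
    public object GetDeserializedObject(object obj, Type targetType) => obj;
    public object GetCustomDataToExport(MemberInfo memberInfo, Type dataContractType) => null;
    public object GetCustomDataToExport(Type clrType, Type dataContractType) => null;
    public void GetKnownCustomDataTypes(Collection<Type> customDataTypes) { }
    public Type GetReferencedTypeOnImport(string typeName, string typeNamespace, object customData) => null;
    public CodeTypeDeclaration ProcessImportedType(CodeTypeDeclaration typeDeclaration, CodeCompileUnit compileUnit) => typeDeclaration;
}
```

You would then assign an instance of this class to the DataContractSurrogate property of each operation's DataContractSerializerOperationBehavior so that WCF consults it during (de)serialization.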
Here is an MSDN post from someone who came to the same realization: the DynamicProxy used by NHibernate makes it impossible to directly serialize NHibernate objects that do lazy loading.
If you are really determined to transport the object graph across the network and preserve lazy-loading functionality, take a look at some code I produced over here: http://slagd.com/?page_id=6. Basically it creates a fake session on the other side of the wire and allows the NHibernate proxies to retain their functionality. I'm not saying it's the right way to do things, but it might give you some ideas.
Within an n-tier app that makes use of a WCF service to interact with the database, what is the best practice way of making use of LinqToSql classes throughout the app?
I've seen it done a couple of different ways, but they seemed like they burned a lot of hours creating extra interfaces, message classes, and the like, which reduces the benefit you get from not having to write your data access code.
Is there a good way to do it currently? Are we stuck waiting for the Entity Framework?
LINQ to SQL isn't really suitable for use with a distributed app. The change tracking and lazy loading is part of the DataContext which is tied to the database so cannot travel across the wire. You can move L2S entities across the wire, modify them, move them back and update the database by reattaching them to the DataContext but that is pretty limited and you lose all concurrency checks as the old values are never kept around.
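To illustrate that reattach dance (a hedged sketch; ShopDataContext, Customer, and the version column are hypothetical stand-ins for the designer-generated types):

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Stand-ins for the designer-generated entity and context (names are invented).
[Table(Name = "Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)] public int Id;
    [Column] public string Name;
    [Column(IsVersion = true, IsDbGenerated = true)] public System.Data.Linq.Binary RowVersion;
}

public class ShopDataContext : DataContext
{
    public ShopDataContext(string connectionString) : base(connectionString) { }
    public Table<Customer> Customers => GetTable<Customer>();
}

public static class CustomerUpdater
{
    public static void Update(ShopDataContext db, Customer modifiedCustomer)
    {
        // Reattach the detached entity as modified so LINQ to SQL emits an UPDATE.
        // The version column above is what makes this overload legal; the original
        // values are gone, so only the timestamp gives you any concurrency check.
        db.Customers.Attach(modifiedCustomer, true);
        db.SubmitChanges();
    }
}
```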
BTW I believe the same is true for L2E.
It is certainly not a good idea to pass the LINQ to SQL objects around to other parts of a distributed system. If you do that, you couple your clients to the structure of the database, which is never a good idea. This was/is one of the major problems with DataSets, by the way.
It is better to create your own classes for the transfer of data. Those classes, of course, would be implemented as DataContracts. In your service layer, you'd convert between the LINQ to SQL objects and instances of the data carrier objects. It is tedious, but it decouples the clients of the service from the database schema. It also has the advantage of giving you better control over the data that is passed around in your system.