Within an n-tier app that makes use of a WCF service to interact with the database, what is the best practice way of making use of LinqToSql classes throughout the app?
I've seen it done a couple of different ways, but they seemed to burn a lot of hours creating extra interfaces, message classes, and the like, which reduces the benefit you get from not having to write your data access code.
Is there a good way to do it currently? Are we stuck waiting for the Entity Framework?
LINQ to SQL isn't really suitable for use with a distributed app. The change tracking and lazy loading are part of the DataContext, which is tied to the database and so cannot travel across the wire. You can move L2S entities across the wire, modify them, move them back, and update the database by reattaching them to the DataContext, but that is pretty limited and you lose all concurrency checks because the old values are never kept around.
BTW I believe the same is true for L2E.
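A minimal sketch of the reattach-and-update pattern described above (NorthwindDataContext and Customer are hypothetical, designer-generated names):

```
// Hypothetical service-side update using LINQ to SQL's Attach.
public void UpdateCustomer(Customer modifiedCustomer)
{
    using (var db = new NorthwindDataContext())
    {
        // Attach as modified: the entity is written back with no original
        // values available, so optimistic concurrency checking only works
        // if the table carries a timestamp/rowversion column.
        db.Customers.Attach(modifiedCustomer, true);
        db.SubmitChanges();
    }
}
```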
It is certainly not a good idea to pass the linq-to-sql object around to other parts of a distributed system. If you do that, you would couple your clients to the structure of the database, which is never a good idea. This was/is one of the major problems with DataSets by the way.
It is better to create your own classes for transferring the data. Those classes, of course, would be implemented as DataContracts. In your service layer you'd convert between the LINQ to SQL objects and instances of the data carrier objects. It is tedious, but it decouples the clients of the service from the database schema. It also has the advantage of giving you better control over the data that is passed around in your system.
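For illustration, a sketch of such a data carrier and its translation code (CustomerDto, its property names, and the designer-generated Customer entity are all hypothetical):

```
using System.Runtime.Serialization;

// Hand-written data carrier, independent of the database schema.
[DataContract]
public class CustomerDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public string Email { get; set; }
}

// Service-layer translation between the L2S entity and the DTO.
public static class CustomerTranslator
{
    public static CustomerDto ToDto(Customer entity)
    {
        return new CustomerDto { Id = entity.Id, Name = entity.Name, Email = entity.Email };
    }

    public static void ApplyToEntity(CustomerDto dto, Customer entity)
    {
        // Copy only the fields the client is allowed to change.
        entity.Name = dto.Name;
        entity.Email = dto.Email;
    }
}
```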
I am working on an application design and investigating how to pass data between WCF services and the ASP.NET web application. The options I am looking at are to either use DataSets or Entity Framework.
I have a bunch of questions:
If I use Entity Framework to pass data, would it add more overhead to the WCF communication?
Are DataSets considered "lightweight" in terms of communication overhead?
If I use Entity Framework, how can I maintain object models if I use stored procedures to return complex data?
Overall I need to figure out the pros and cons of using these technologies.
Your question touches on the principles of service-oriented architecture (SOA). In SOA, a service should expose operations (methods) that apply to business objects through contracts. The business objects should be defined in a standard way, which is why WCF uses WSDL and XML Schema to do this.
An SOA tenet is that business objects are shared through a schema and/or a contract, but not as a specific implementation. By this principle, a DataSet is a heavyweight, .NET-specific, database-oriented object which should not be used to represent a business object. The Microsoft Patterns & Practices group apparently disregards SOA tenets, since it shows how to use DataSets as Data Transfer Objects. Whether they're just promoting vendor lock-in or just trashing the whole concept of SOA is anybody's guess. If there is even the remotest chance your service will ever be consumed by a non-.NET client, please do not use a DataSet.
If you decide on using the Entity Framework then I'd recommend using Code First to define the Entity Framework data model and just expose those "code first" business objects in your service. This SO question & answers provide a good discussion on using Entity Framework with WCF.
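To make that concrete, here is a hedged sketch of what exposing a "code first" class from a WCF service might look like (Product, CatalogContext, and ICatalogService are invented names):

```
using System.Data.Entity;
using System.Runtime.Serialization;
using System.ServiceModel;

// A "code first" business object doubling as the WCF data contract.
[DataContract]
public class Product
{
    [DataMember] public int ProductId { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public decimal Price { get; set; }
}

public class CatalogContext : DbContext
{
    public DbSet<Product> Products { get; set; }
}

[ServiceContract]
public interface ICatalogService
{
    [OperationContract]
    Product GetProduct(int productId);
}

public class CatalogService : ICatalogService
{
    public Product GetProduct(int productId)
    {
        using (var db = new CatalogContext())
        {
            return db.Products.Find(productId);
        }
    }
}
```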
Are you really considering passing entire DataSets up and down? This will destroy your object model and be more difficult to maintain, though I suppose you could implement the Repository pattern. Still, even just sending the changes, you would need to do strange things like ensuring you do not transfer the schema, and perhaps using the compression option. But it is a very poor choice when compared to Entity Framework Code First with nice clean POCOs in a separate assembly, and without any of the EF infrastructure polluting the DTOs.
EF should not add overhead as such, but it depends on how it is implemented and what data objects are being passed around. If you pass data into a context and use the correct option when SaveChanges is called, such as SaveChangesOptions.ReplaceOnUpdate, only the changed entities will be updated. The queries executed are efficient so long as you are mindful not to lazy load stuff you don't need. You need to understand LINQ to Entities well and batch your updates as you would any other potentially expensive method call. Run some tests with your database profiler running and try to improve the efficiency of your interactions with EF, and monitor your IIS logs for data sizes and transmission times.
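As a small example of keeping lazy loading in check, the sketch below eagerly loads a child collection in a single query (Order, OrderLine, and StoreContext are hypothetical Code First types):

```
using System.Collections.Generic;
using System.Data.Entity;   // brings in the lambda-based Include extension
using System.Linq;

public class Order
{
    public int OrderId { get; set; }
    public virtual ICollection<OrderLine> Lines { get; set; }
}

public class OrderLine
{
    public int OrderLineId { get; set; }
    public string Product { get; set; }
}

public class StoreContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class OrderQueries
{
    // Pull the order and its lines in one query, rather than letting lazy
    // loading fire a separate query per navigation property later on.
    public static Order GetOrderWithLines(StoreContext db, int orderId)
    {
        return db.Orders
                 .Include(o => o.Lines)
                 .SingleOrDefault(o => o.OrderId == orderId);
    }
}
```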
DataSets are not considered lightweight due to the need to encapsulate a schema, and somebody might make a mistake and send up a whole heap of data in multiple tables, including tables that are dependencies. These might need to be pulled in anyway, either on the client or the server - very messy! EF does support stored procedures in a sensible manner, as they can be part of your model and get called when specific entities need to be saved. An ORM will complement your OO design and lead to cleaner code.
Also, if you are doing something simple and only require CRUD without much in the way of business logic, consider WCF Data Services.
I want to know if using Self Tracking Entities (in Entity Framework) is recommended with WCF services? If yes, can you point me to a tutorial that shows how to do that?
Actually, I am going to develop a WPF application using Prism with MEF and MVVM. I have decided to use Entity Framework. I want suggestions and advice regarding this approach.
Any help will be appreciated.
I want to know if using Self Tracking Entities (in Entity Framework) is recommended with WCF services?
It depends who you ask. If you ask MS they will tell you yes, because they simply don't have anything better to offer. STEs were a response to this very old MS Connect suggestion. The problem is that EF itself has terribly bad support for merging changes between two entity graphs (you must do it completely yourself), and developers working on the MS platform (sometimes including me) share some common behaviors:
They are too lazy to develop their own solution to a problem and expect some magic directly in the APIs provided by MS.
Most of the time they are not trained / skilled / competent in the technology they have to use, because they have to move to a new one too often.
The only APIs they know are part of the .NET Framework. They don't look for other options, nor do they compare features.
The first two points are a result of the MS strategy where RAD became synonymous with the designer (or, more recently, T4 templates).
I share @Richard's opinion about STEs. I would add one additional drawback of STEs - they move large data sets between participants. If you decide to get an entity graph from the server, change a single entity in the graph, and push the data back, they will transfer the whole graph again. Transferring only the changed entities means fighting with the STEs' core logic. I'm also afraid that they track changes entirely at the per-entity level instead of the per-property level. When modifying entities with large binary or string data, this can result in transferring far too much unneeded data between the service and the database and between the service and the client.
Anyway, for a simple application with low data traffic and small entities they can do a good job and allow you to build your application quickly, but without strict separation of concerns. You will get entities from the service and bind them directly to the WPF UI, and they will be able to track changes for you. Later you will push the entities back to the service and they will be able to persist the changes. Your client and service will be tightly coupled, but in some scenarios that can be good enough.
I would avoid self tracking entities in general - I blogged about it here.
Create your own DTOs and use them to manage the transfer of data - then build your POCO objects in the service and use them with Entity Framework for persistence.
If you want self tracking, then there is a slightly cleaner approach here.
I'm exposing some CRUD methods through WCF service, for some data objects persisted in a database through NHibernate. Is it a good approach to use NHibernate classes as data contracts, or is it better to wrap them or replace them with some other data contracts? What is your approach?
Our team just went through a good few months debating this design point, so I've got a lot of links to share ;-)
Short answer: You "should" translate from your NHibernate classes into a domain model.
Long answer: I think the answer to this is a matter of principle. If you ever want to be interoperable, you should not use Datasets as your DTOs (I love Hanselman's post on this). I'm not saying that it's never a good idea; clearly people have had success doing so. Just know that you are cutting corners and it's a risky proposition.
If you have complete control over the classes you are pushing the data into, you could build a nice domain model and just map the NHibernate data into those classes. You will more than likely have serious issues doing that, as IList<> (which a <bag> maps to) is not serializable. You'd have to write your own serializer, or use something like NetDataContractSerializer, but you lose interoperability.
You will need to weigh the amount of work involved in building some wrapper classes and the translation between them, but then you have complete flexibility in what your domain model will look like. Then you're able to do things (as we have done) like code generation for your NHibernate maps and objects. Then your data contracts serve as an abstraction from your data, as they should.
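A hedged sketch of that kind of wrapper/translation layer, showing the collection issue mentioned above (Invoice, InvoiceLine, and the contract classes are invented names):

```
using System.Collections.Generic;
using System.Linq;
using System.Runtime.Serialization;

// Hypothetical NHibernate-mapped entities; Lines would be a <bag> in the
// mapping, so it surfaces as IList<InvoiceLine> at runtime.
public class Invoice
{
    public virtual int Id { get; set; }
    public virtual IList<InvoiceLine> Lines { get; set; }
}

public class InvoiceLine
{
    public virtual string Product { get; set; }
    public virtual int Quantity { get; set; }
}

// Wire-friendly contracts: collections become concrete List<T> so the
// DataContractSerializer has no trouble with them.
[DataContract]
public class InvoiceContract
{
    [DataMember] public int Id { get; set; }
    [DataMember] public List<InvoiceLineContract> Lines { get; set; }
}

[DataContract]
public class InvoiceLineContract
{
    [DataMember] public string Product { get; set; }
    [DataMember] public int Quantity { get; set; }
}

public static class InvoiceTranslator
{
    public static InvoiceContract ToContract(Invoice invoice)
    {
        return new InvoiceContract
        {
            Id = invoice.Id,
            Lines = invoice.Lines.Select(l => new InvoiceLineContract
            {
                Product = l.Product,
                Quantity = l.Quantity
            }).ToList()
        };
    }
}
```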
P.S. You might want to take a look at ADO.NET Data Services, a RESTful way to expose your data which, at this point, seems to be the most interoperable choice.
You would not want to expose your domain model directly, but rather map the domain to some kind of message as it hits the process boundary. You could leverage NHibernate to do the mapping work for you. In this case you would have two mappings, one for your domain model and another for your lightweight messages.
I don't have direct experience in doing this, but I have sent DataSets across via WCF before and that works just fine. I think your biggest issue in using NHibernate objects as data objects over WCF will be a lack of interoperability (as is also the case when using DataSets). Not only does the client have to use .NET, the client must also use NHibernate. This goes against SOA principles, but if you know for sure that you won't be reusing this component then there's no great reason not to.
It's at least worth a try.
I'm a student currently dabbling in a .NET n-tier app that uses NHibernate + WCF + WPF.
One of the things that is done quite terribly is object graph serialisation. In fact, it isn't done at all: currently associations are ignored and we are using DTOs everywhere.
As far as I can tell, one way to proceed is to predefine which objects and collections should be loaded and serialised to go across the wire, thus being able to present some associations to the client; however, this seems limited, inflexible and inconsistent (can you tell that I don't like this idea?).
One option that occurred to me was to simply replace the NHProxies that lazy load collections on the client tier with a "disconnectedProxy" that would retrieve the associated stuff over the wire. This would mean we'd have to expand our web service signature a little and do some hackery on our generated proxies, but this seemed like a good T4 / other code-gen experiment.
As far as I can tell this seems to be a common stumbling block, but after doing a lot of reading I haven't been able to figure out any good / generally accepted solutions. I'm looking for a bit of direction as much as any particular solution, but if there is an easy way to make the client "feel" connected, please let me know.
You ask a very good question that unfortunately does not have a very clean answer. Even if you were able to get lazy loading to work over WCF (which we were able to do) you still would have issues using the proxy interceptor. Trust me on this one, you want POCO objects on the client tier!
What you really need to consider - what, from the research I have seen, has come to be the industry-standard way of framing this problem - is called persistence vs. usage, or persistence ignorance. In other words, your object model and mappings represent your persistence domain, but they do not match your ideal usage scenarios. You don't want to bring the whole database down to the client just to display a couple of properties, right?
It seems like such a simple problem, but the solution is either very simple or very complex. On one hand you can design your entities around your usage scenarios, but then you end up with a proliferation of your object domain, making it difficult to maintain. On the other, you still want the rich object model relationships in order to write granular business logic.
To simplify this problem, let's examine the two main gaps we need to fill: the gap between the database and the service layer, and the gap between the service and the client. NHibernate fills the first one just fine by providing an ORM to load data into your objects. It does a decent job, but in order to achieve great performance it needs to be tweaked using custom loading strategies. I digress…
The second gap, between the server and the client, is where things get dicey. To simplify, imagine you did not send any mapped entities over the wire to the client at all. Try creating a mechanism that converts business entities into DTOs and, likewise, DTOs back into business entities. That way your client deals only with DTOs (POCOs, of course), and your business logic can maintain its rich structure. This allows you to leverage not only NHibernate's lazy loading mechanism but also other benefits of the session, such as the L1 cache.
For brevity and intellectual property reasons I will not go into the design of said mechanism, but hopefully this is enough information to point you in the right direction. If you don't care about performance or latency at all, just turn lazy loading off altogether and work through the serialization issues.
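Purely as an invented illustration (not the mechanism the answer alludes to), one possible shape is a small "assembler" per aggregate that owns the two-way conversion; Project, ProjectDto, and IAssembler are made-up names:

```
using System.Runtime.Serialization;

// Hypothetical business entity (NHibernate-mapped, stays server side).
public class Project
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

// Wire-level POCO the client actually sees.
[DataContract]
public class ProjectDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

// The exchange mechanism: one place that knows how to go from entity to DTO
// and how to apply DTO changes back onto a loaded entity.
public interface IAssembler<TEntity, TDto>
{
    TDto ToDto(TEntity entity);
    void ApplyChanges(TEntity entity, TDto dto);
}

public class ProjectAssembler : IAssembler<Project, ProjectDto>
{
    public ProjectDto ToDto(Project entity)
    {
        return new ProjectDto { Id = entity.Id, Name = entity.Name };
    }

    public void ApplyChanges(Project entity, ProjectDto dto)
    {
        // The entity stays attached to the NHibernate session, so dirty
        // checking and the L1 cache still work; only permitted fields are copied.
        entity.Name = dto.Name;
    }
}
```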
It has been a while for me, but the injection / disconnected proxies may not be as bad as they sound. Since you are a student, I am going to assume you have some time and want to muck around a bit.
If you want to inject your own custom serialization/deserialization logic you can use IDataContractSurrogate which can be applied using DataContractSerializerOperationBehavior. I have only done a few basic things with this but it may be worth looking into. By adding some fun logic (read: potentially hackish) at this layer you might be able to make it more connected.
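For what it's worth, a rough sketch of the idea: an IDataContractSurrogate that swaps an NHibernate proxy for its real implementation just before serialization, hooked up through the DataContractSerializerOperationBehavior mentioned above. The proxy-unwrapping approach and the assumption that the session is still open are mine, not a documented recipe:

```
using System;
using System.CodeDom;
using System.Collections.ObjectModel;
using System.Reflection;
using System.Runtime.Serialization;
using NHibernate.Proxy;

// Sketch: replace uninitialized NHibernate proxies with the loaded entity
// so the DataContractSerializer never sees the DynamicProxy type.
public class NHibernateProxySurrogate : IDataContractSurrogate
{
    public Type GetDataContractType(Type type)
    {
        // Advertise the real entity type instead of the runtime proxy type.
        return typeof(INHibernateProxy).IsAssignableFrom(type) ? type.BaseType : type;
    }

    public object GetObjectToSerialize(object obj, Type targetType)
    {
        var proxy = obj as INHibernateProxy;
        // Forces the load, so this only works while the session is still open.
        return proxy != null ? proxy.HibernateLazyInitializer.GetImplementation() : obj;
    }

    public object GetDeserializedObject(object obj, Type targetType) { return obj; }

    // The remaining members only matter for schema import/export.
    public object GetCustomDataToExport(MemberInfo memberInfo, Type dataContractType) { return null; }
    public object GetCustomDataToExport(Type clrType, Type dataContractType) { return null; }
    public void GetKnownCustomDataTypes(Collection<Type> customDataTypes) { }
    public Type GetReferencedTypeOnImport(string typeName, string typeNamespace, object customData) { return null; }
    public CodeTypeDeclaration ProcessImportedType(CodeTypeDeclaration typeDeclaration, CodeCompileUnit compileUnit) { return typeDeclaration; }
}
```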
Here is an MSDN post from someone who came to the same realization: the DynamicProxy used by NHibernate makes it impossible to directly serialize NHibernate objects that do lazy loading.
If you are really determined to transport the object graph across the network and preserve lazy loading functionality, take a look at some code I produced over at http://slagd.com/?page_id=6 . Basically it creates a fake session on the other side of the wire and allows the NHibernate proxies to retain their functionality. Not saying it's the right way to do things, but it might give you some ideas.
What do you suggest for Data Access layer? Using ORMs like Entity Framework and Hibernate OR Code Generators like Subsonic, .netTiers, T4, etc.?
For me, this is a no-brainer, you generate the code.
I'm going to go slightly off topic here because there's a bigger underlying fallacy at play. The fallacy is that these ORM frameworks solve the object/relational impedance mismatch. This claim is a barefaced lie.
I find the best way to resolve the object/relational impedance mismatch is to either use OOP exclusively and use an object database or use the idioms of the relational database exclusively and ignore OOP.
The abstraction "everything is a table" is to me, much more powerful than the abstraction "everything is a class." It takes less code, less intellectual effort and leads to faster code when you code to the database rather than to an object model.
To me this seems obvious. If your application is data driven then surely your code should be data driven too? Yet to say this is hugely controversial.
The central problem here is that OOP becomes a really leaky abstraction when used in conjunction with a database. Code that looks perfectly sensible when written to the idioms of OOP looks completely insane when you see the traffic that code generates at the database. When that messiness becomes a performance problem, OOP is the first casualty.
There is really no way to resolve this. Databases work with sets of data. OOP focuses on instances of classes. Trying to marry the two is always going to end in divorce.
So to answer your question, I believe you should generate your classes and try and make them map the underlying database structure as closely as possible.
Perhaps controversially, I've always felt that using code generators for the ADO.NET plumbing is fundamentally solving the wrong problem.
At some point, hopefully not too long after learning about Connection Strings, SqlCommands, DataAdapters, and all that, we notice that:
Such code is ugly
It is very boring to write
It's very easy to miss something if you're doing it by hand
It has to be repeated every time you want to access the database
So, is the problem to solve "how to do the same thing lots of times, fast"?
I say no.
Using code generators to make this process quick still means that you have a ton of code, all the same, all over your business classes (or your data access layer, if you separate that from your business logic).
And then, if you want to do something generically (like track stored procedure usage, for instance), you end up having to customise your code generator if it doesn't already have the feature you want. And even if it does, you still have to regenerate everything all the time.
I like to do things once, not many times, no matter how fast I can do them.
So I rolled my own Data Access class that knows how to add parameters, set up and close connections, manage transactions, and other cool stuff. It only had to be written once, and calling its methods from a Business object that needs some database stuff done consists of one line of code.
When I needed to make the application support multithreaded database accesses, it required a change to the Data Access class only, and all the business classes just do what they already did.
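In the same spirit, here is a stripped-down sketch of what such a hand-rolled helper can look like (the class, the "Main" connection string name, and the stored procedure names are illustrative, not the author's actual code):

```
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

// Owns connection setup/teardown and parameter plumbing so business code
// can call a stored procedure in a single line.
public static class DataAccess
{
    private static readonly string ConnectionString =
        ConfigurationManager.ConnectionStrings["Main"].ConnectionString;

    public static List<T> Query<T>(string procName,
                                   Func<IDataRecord, T> map,
                                   params SqlParameter[] parameters)
    {
        var results = new List<T>();
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(procName, connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddRange(parameters);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    results.Add(map(reader)); // caller decides how a row becomes an object
            }
        }
        return results;
    }

    public static int Execute(string procName, params SqlParameter[] parameters)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(procName, connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddRange(parameters);
            connection.Open();
            return command.ExecuteNonQuery();
        }
    }
}
```

A business method then reduces to something like DataAccess.Query("GetCustomersByCity", r => new Customer { Id = r.GetInt32(0), Name = r.GetString(1) }, new SqlParameter("@City", "Oslo")), which is the one-line call described above.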
There is no right answer; it all depends on your project. As Simon points out, if your application is entirely data driven, then it might make sense, depending on the size and complexity of the domain, to use a non-OOP paradigm. I had a lot of success building a system using the Transaction Script pattern, which passed XML messages around the system.
However, this system started to break down after five or six years as the application grew in size and complexity (5 or 6 web apps, several web services, tons of COM+ components, legacy and .NET code, 8+ databases with 800+ tables and 4,000+ procedures). No one knew what anything was, and duplication was running rampant.
There are other ways to alleviate the maintenance burden than OOP; however, if you have a very complex domain then having a rich domain model is ideal IMHO, as it allows the business rules to be expressed in nice encapsulated components.
To answer your question, avoid code generators if you can. Code generators are a recipe for disaster, but if you do go with code generation, do not modify the generated code. Also be sure to have a good process in place that makes it easy for developers to get the newly generated code.
I recommend one of the following: use an ORM, or hand-roll a lightweight DAL. I am currently transitioning a project from my hand-rolled DAL over to NHibernate and am having a lot of success; however, I like having the option of using either approach. Furthermore, if you properly separate your concerns (data access from business layer from presentation) you can have a single service layer that talks to a DAO (Data Access Object) which for one object is an ORM but for another is hand rolled. I like this flexibility as it allows you to apply the best tool to the job.
I like NHibernate over a hand-rolled DAL because, while my DAL does abstract away most of the ADO.NET code, you still have to write the code that maps a data reader to an object, or takes an object and creates the parameters.
I've always preferred to go the code generator route, especially in C# where you can make use of extended classes to add functionality to the basic data objects.
Hate to say this, but it depends. If you find an ORM tool that fits your needs, go for it. We wrote our own system in small steps while developing the application. We are using C++ and there are not that many tools out there anyway. Ours ended up being an XML description of the database; from that, the SQL generation script, the basic object layer, and the metadata were generated.
Do your homework and evaluate these tools, and you will find a good fit for your needs.