I read a debate in the comments here (current live site, without comments).
Why the debate? To me, a DataSet is like a relational database, while an object is more of a hierarchical model. Why do people insist on a "pure" object model when we still deal with relational databases? Why not combine the two?
And if we should, is there any lightweight, comprehensive framework that allows us to do that (not a heavy mammoth, like NHibernate, which has a huge learning curve)?
"Pure objects" are a lot easier to work with, the typed object gives you intellisense and compile-time type checking.
Bare datasets are very cumbersome and annoying to work with - you need to know the column names, there's no type checking possible, so if you mistype a column name, you're out of luck and won't discover the error until runtime (the worst possible scenario).
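A minimal sketch of that difference, using Python as a stand-in for .NET. The `Customer` class and the column names are made up for illustration. A misspelled key on a bare row only fails when that line actually runs, whereas a misspelled attribute on a typed object can be flagged by an editor or a static checker before the program runs.

```python
from dataclasses import dataclass

# A "bare dataset" row: column names are just strings.
row = {"customer_name": "Alice", "credit_limit": 500}

def credit_limit_from_row(r):
    # Typo in the column name: nothing complains until this line executes.
    return r["credit_limt"]

@dataclass
class Customer:
    # Hypothetical typed object; its fields are known to tools.
    name: str
    credit_limit: int

def customer_from_row(r) -> Customer:
    # The only place that needs to know the column names.
    return Customer(name=r["customer_name"], credit_limit=r["credit_limit"])

customer = customer_from_row(row)
print(customer.credit_limit)  # 500
```

Calling `credit_limit_from_row(row)` raises `KeyError` at runtime, which is the failure mode described above.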
Typed datasets are a step in the right direction, but the "things" you work with in your .NET code are still tied very closely to your database implementation. That is not typically a good thing, since any change in the underlying database might affect your app all the way up to your UI and force a lot of changes.
Using an ORM like NHibernate allows you to better abstract and decouple the database (physical storage) layer from your logical business model - only in the simplest of scenarios will those two be an exact 1:1 match, so you'll need some kind of "translation" or mapping between the two anyway.
So all in all - using typed datasets might be okay for small, simple apps, but for a challenging, larger-scale, enterprise-level business app, I would never recommend coupling your business object model so closely and tightly to the database.
Marc
why do people absolutely want "pure" Object model
Because you don't want your application to depend on the database schema
Well, all the reasons you give are the same academic reasons that were given for EJB in Java, which turned out to be a mess. So aren't people falling for another fashionable hype?
As I read here:
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
The promise is one thing; the reality is another.
Where is the proof for these claims?
Scientifically, complexity is tied to the concept of entropy: you cannot reduce the inherent complexity of things, you can only move it somewhere else. So to me there is something fundamentally irrational here.
Ted Neward is highly controversial because, it seems to me, everybody is herding like in the old EJB days: nobody dared to say EJB sucked until Rod Johnson came out with Spring.
Now it seems nobody dares to say ORM frameworks like Hibernate, Entity Framework, etc. are too complex, maybe because there isn't yet a Rod Johnson II :)
You claim that adding a new layer solves the problem, but that's not always the case, even theoretically. It's like adding more team members when a project becomes a mess: more programmers also add coordination and communication problems.
And in practice, it seems that the layers that should be independent, at least from the GUI viewpoint, aren't really. I see many people struggle to do simple stuff in the GUI when they use an ORM.
I recently had a debate with a colleague who is not a fan of OOP. What caught my attention was what he said:
"What's the point of doing my coding in objects? If it's reuse then I can just create a library and call whatever functions I need for whatever task is at hand. Do I need these concepts of polymorphism, inheritance, interfaces, patterns or whatever?"
We are in a small company developing small projects for e-commerce sites and real estate.
How can I take advantage of OOP in an "everyday, real-world" setup? Or was OOP really meant to solve complex problems and not intended for "everyday" development?
My personal view: context
When you program in OOP you have a greater awareness of the context. It helps you to organize the code in such a way that it is easier to understand because the real world is also object oriented.
The good things about OOP come from tying a set of data to a set of behaviors.
So, if you need to do many related operations on a related set of data, you can write many functions that operate on a struct, or you can use an object.
Objects give you some code reuse help in the form of inheritance.
IME, it is easier to work with an object with a known set of attributes and methods than it is to keep a set of complex structs and the functions that operate on them.
Some people will go on about inheritance and polymorphism. These are valuable, but the real value in OOP (in my opinion) comes from the nice way it encapsulates and associates data with behaviors.
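To make the "data tied to behavior" point concrete, here is a small sketch comparing a plain data structure plus free functions with an object. The shopping-cart names are hypothetical and chosen only because the question mentions e-commerce sites.

```python
# Procedural style: a plain structure plus free functions that must
# all agree on its layout.
def cart_add(cart, name, price_cents):
    cart["items"].append((name, price_cents))

def cart_total(cart):
    return sum(price for _, price in cart["items"])

# Object style: the data and the operations that keep it valid live together.
class Cart:
    def __init__(self):
        self._items = []

    def add(self, name, price_cents):
        # The invariant is enforced in one place, next to the data it protects.
        if price_cents < 0:
            raise ValueError("price must not be negative")
        self._items.append((name, price_cents))

    def total(self):
        return sum(price for _, price in self._items)

plain = {"items": []}
cart_add(plain, "book", 1250)

cart = Cart()
cart.add("book", 1250)
cart.add("pen", 199)
print(cart_total(plain), cart.total())  # 1250 1449
```

Both versions work. The difference is that callers of `Cart` cannot append a malformed item without going through `add`, while any code holding the plain dictionary can.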
Should you use OOP on your projects? That depends on how well your language supports OOP. It also depends on the types of problems you need to solve.
But, if you are doing small websites, you are still talking about enough complexity that I would use OOP design given proper support in the development language.
Beyond just getting something to work (your friend's point), a well-designed OO design is easier to understand, follow, expand, extend and implement. It is much easier, for example, to delegate work that is categorically similar, or to hold together data that should stay together (yes, even a C struct is an object).
Well, I'm sure a lot of people will give a lot more academically correct answers, but here's my take on a few of the most valuable advantages:
OOP allows for better encapsulation
OOP allows the programmer to think in more logical terms, making software projects easier to design and understand (if well designed)
OOP is a time saver. For example, look at the things you can do with a C++ string object, vectors, etc. All that functionality (and much more) comes for "free." Now, those are really features of the class libraries and not OOP itself, but almost all OOP implementations come with nice class libraries. Can you implement all that stuff in C (or most of it)? Sure. But why write it yourself?
Look at the use of Design Patterns and you'll see the utility of OOP. It's not just about encapsulation and reuse, but extensibility and maintainability. It's the interfaces that make things powerful.
A few examples:
Implementing a stream (decorator pattern) without objects is difficult
Adding a new operation to an existing system such as a new encryption type (strategy pattern) can be difficult without objects.
Look at the way PostgreSQL is implemented versus the way your database book says a database should be implemented and you'll see a big difference. The book will suggest node objects for each operator. Postgres uses myriad tables and macros to try to emulate these nodes. It is much less pretty and much harder to extend because of that.
The list goes on.
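As an illustration of the strategy-pattern example above, here is a minimal sketch. The two "ciphers" are toy transformations, not real encryption, and all class and function names are assumptions made up for this example.

```python
import codecs
from typing import Protocol

class Cipher(Protocol):
    """Interface every strategy must satisfy."""
    def encrypt(self, text: str) -> str: ...

class Rot13Cipher:
    # Toy transformation for illustration only; not real encryption.
    def encrypt(self, text: str) -> str:
        return codecs.encode(text, "rot_13")

class ReverseCipher:
    # A second toy strategy, added without touching any existing code.
    def encrypt(self, text: str) -> str:
        return text[::-1]

def store_message(text: str, cipher: Cipher) -> str:
    # The caller depends only on the interface, not on a concrete cipher.
    return cipher.encrypt(text)

print(store_message("Hello", Rot13Cipher()))    # Uryyb
print(store_message("Hello", ReverseCipher()))  # olleH
```

Adding a third cipher means writing one more class with an `encrypt` method; `store_message` stays unchanged.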
The power of most programming languages is in the abstractions that they make available. Object Oriented programming provides a very powerful system of abstractions in the way it allows you to manage relationships between related ideas or actions.
Consider the task of calculating areas for an arbitrary and expanding collection of shapes. Any programmer can quickly write functions for the area of a circle, square, triangle, etc. and store them in a library. The difficulty comes when trying to write a program that identifies and calculates the area of an arbitrary shape. Each time you add a new kind of shape, say a pentagon, you would need to update and extend something like an IF or CASE structure to allow your program to identify the new shape and call the correct area routine from your "library of functions". After a while, the maintenance costs associated with this approach begin to pile up.
With object-oriented programming, a lot of this comes free-- just define a Shape class that contains an area method. Then it doesn't really matter what specific shape you're dealing with at run time, just make each geometrical figure an object that inherits from Shape and call the area method. The Object Oriented paradigm handles the details of whether at this moment in time, with this user input, do we need to calculate the area of a circle, triangle, square, pentagon or the ellipse option that was just added half a minute ago.
What if you decided to change the interface behind the way the area function was called? With Object Oriented programming you would just update the Shape class and the changes automagically propagate to all entities that inherit from that class. With a non Object Oriented system you would be facing the task of slogging through your "library of functions" and updating each individual interface.
In summary, Object Oriented programming provides a powerful form of abstraction that can save you time and effort by eliminating repetition in your code and streamlining extensions and maintenance.
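The shape example above might be sketched like this. The class names follow the answer's wording; the `total_area` helper is an assumption added to show a caller that never inspects the concrete type.

```python
import math
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        """Each concrete shape supplies its own formula."""

class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        self.width, self.height = width, height

    def area(self) -> float:
        return self.width * self.height

def total_area(shapes) -> float:
    # No IF/CASE on the kind of shape: dispatch happens through area().
    return sum(shape.area() for shape in shapes)

print(total_area([Rectangle(2, 3), Rectangle(1, 1)]))  # 7
```

Adding an ellipse later means adding one class; `total_area` and every other caller stay as they are.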
Around 1994 I was trying to make sense of OOP and C++ at the same time, and found myself frustrated, even though I could understand in principle what the value of OOP was. I was so used to being able to mess with the state of any part of the application from other languages (mostly Basic, Assembly, and Pascal-family languages) that it seemed like I was giving up productivity in favor of some academic abstraction. Unfortunately, my first few encounters with OO frameworks like MFC made it easier to hack, but didn't necessarily provide much in the way of enlightenment.
It was only through a combination of persistence, exposure to alternate (non-C++) ways of dealing with objects, and careful analysis of OO code that both 1) worked and 2) read more coherently and intuitively than the equivalent procedural code that I started to really get it. And 15 years later, I'm regularly surprised at new (to me) discoveries of clever, yet impressively simple OO solutions that I can't imagine doing as neatly in a procedural approach.
I've been going through the same set of struggles trying to make sense of the functional programming paradigm over the last couple of years. To paraphrase Paul Graham, when you're looking down the power continuum, you see everything that's missing. When you're looking up the power continuum, you don't see the power, you just see weirdness.
I think, in order to commit to doing something a different way, you have to 1) see someone obviously being more productive with more powerful constructs and 2) suspend disbelief when you find yourself hitting a wall. It probably helps to have a mentor who is at least a tiny bit further along in their understanding of the new paradigm, too.
Barring the gumption required to suspend disbelief, if you want someone to quickly grok the value of an OO model, I think you could do a lot worse than to ask someone to spend a week with the Pragmatic Programmers book on Rails. It unfortunately does leave out a lot of the details of how the magic works, but it's a pretty good introduction to the power of a system of OO abstractions. If, after working through that book, your colleague still doesn't see the value of OO for some reason, he/she may be a hopeless case. But if they're willing to spend a little time working with an approach that has a strongly opinionated OO design that works, and gets them from 0-60 far faster than doing the same thing in a procedural language, there may just be hope. I think that's true even if your work doesn't involve web development.
I'm not so sure that bringing up the "real world" would be as much a selling point as a working framework for writing good apps, because it turns out that, especially in statically typed languages like C# and Java, modeling the real world often requires tortuous abstractions. You can see a concrete example of the difficulty of modeling the real world by looking at thousands of people struggling to model something as ostensibly simple as the geometric abstraction of "shape" (shape, ellipse, circle).
All programming paradigms have the same goal: hiding unneeded complexity.
Some problems are easily solved with an imperative paradigm, like your friend uses. Other problems are easily solved with an object-oriented paradigm. There are many other paradigms. The main ones (logic programming, functional programming, and imperative programming) are all equivalent to each other; object-oriented programming is usually thought of as an extension to imperative programming.
Object-oriented programming is best used when the programmer is modeling items that are similar, but not the same. An imperative paradigm would put the different kinds of models into one function. An object-oriented paradigm separates the different kinds of models into different methods on related objects.
Your colleague seems to be stuck in one paradigm. Good luck.
To me, the power of OOP doesn't show itself until you start talking about inheritance and polymorphism.
If one's argument for OOP rests on the concepts of encapsulation and abstraction, well, that isn't a very convincing argument for me. I can write a huge library and only document the interfaces to it that I want the user to be aware of, or I can rely on language-level constructs like packages in Ada to make fields private and only expose what I want to expose.
However, the real advantage comes when I've written code in a generic hierarchy so that it can be reused later such that the same exact code interfaces are used for different functionality to achieve the same result.
Why is this handy? Because I can stand on the shoulders of giants to accomplish my current task. The idea is that I can boil the parts of a problem down to the most basic parts, the objects that compose the objects that compose... the objects that compose the project. By using a class that defines behavior very well in the general case, I can use that same proven code to build a more specific version of the same thing, and then a more specific version of the same thing, and then yet an even more specific version of the same thing. The key is that each of these entities has commonality that has already been coded and tested, and there is no need to reimplement it later. If I don't use inheritance for this, I end up reimplementing the common functionality or explicitly linking my new code against the old code, which provides a scenario for me to introduce control flow bugs.
Polymorphism is very handy in instances where I need to achieve a certain functionality from an object, but the same functionality is also needed from similar, but unique types. For instance, in Qt, there is the idea of inserting items onto a model so that the data can be displayed and you can easily maintain metadata for that object. Without polymorphism, I would need to bother myself with much more detail than I currently do (i.e. I would need to implement the same code interfaces that conduct the same business logic as the item that was originally intended to go on the model). Because the base class of my data-bound object interacts natively with the model, I can instead insert metadata onto this model with no trouble. I get what I need out of the object with no concern over what the model needs, and the model gets what it needs with no concern over what I have added to the class.
Ask your friend to look at any object in his room, house or city, and see if he can name a single one that is a system in itself and is capable of doing some meaningful work alone. A button doesn't do anything by itself; it takes lots of objects to make a phone call. Similarly, a car engine is made of the crankshaft, pistons and spark plugs. OOP concepts have evolved from our perception of natural processes and things in our lives. The "Inside COM" book explains the purpose of COM with an analogy to a childhood game of identifying animals by asking questions.
Design trumps technology and methodology. Good designs tend to incorporate universal principles of complexity management, such as the Law of Demeter, which is at the heart of what OO language features strive to codify.
Good design is not dependent on the use of OO-specific language features, although it is typically in one's best interests to use them.
Not only does it make programming easier and more maintainable in the current situation for other people (and yourself), it also already allows easier database CRUD (Create, Read, Update, Delete) operations.
You can find more info about it looking up:
- Java : Hibernate
- Dot Net : Entity Framework
See also how LINQ (Visual Studio) can make your programming life MUCH easier.
Also, you can start using design patterns for solving real life problems (design patterns are all about OO)
Perhaps it is even fun to demonstrate with a little demo:
Let's say you need to store employees, accounts, members, books in a text file in a similar way.
P.S. I tried writing it in a pseudo-code way :)
The OO way
Code you call:
io.file.save(objectsCollection.ourFunctionForSaving())
class objectsCollection
    function ourFunctionForSaving() As String
        String _Objects = ""
        for each _Object in objectsCollection
            ' append each object's text form, separated by "-"
            _Objects &= _Object & "-"
        end for
        return _Objects
    end function
end class
NON-OO Way
I don't think I'll write down the non-OO code. But think about it :)
NOW LET'S SAY
In the OO way, the above class is the parent class of all the classes that save books, employees, members, accounts, ...
What happens if we want to change the way of saving to a text file? For example, to make it compatible with a current standard (.CSV).
And let's say we would like to add a load function, how much code do you need to write?
In the OO way you only need to add a new Sub method which can split all the data into parameters (this happens once).
Let your colleague think about that :)
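A runnable sketch of the idea behind the pseudo-code above, assuming a hypothetical `Record` base class that owns the text format. The "-" separator is taken from the pseudo-code, and the `fields` convention is an assumption made for this example.

```python
class Record:
    """Hypothetical base class: the text format lives in one place."""
    fields = ()  # subclasses list their attribute names, in order

    def __init__(self, *values):
        for name, value in zip(self.fields, values):
            setattr(self, name, value)

    def to_line(self) -> str:
        # Written once; every subclass saves the same way.
        return "-".join(str(getattr(self, name)) for name in self.fields)

    @classmethod
    def from_line(cls, line: str):
        # The load function is also written once, in the base class.
        return cls(*line.split("-"))

class Employee(Record):
    fields = ("name", "department")

class Book(Record):
    fields = ("title", "author")

def save_all(records) -> str:
    return "\n".join(record.to_line() for record in records)

text = save_all([Employee("Ann", "Sales"), Book("Dune", "Herbert")])
print(Employee.from_line("Ann-Sales").department)  # Sales
```

Switching to CSV would mean changing only `to_line` and `from_line`. Note that this naive format breaks if a value itself contains "-", and every loaded value comes back as a string.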
In domains where state and behavior are poorly aligned, Object-Orientation reduces the overall dependency density (i.e. complexity) within these domains, which makes the resulting systems less brittle.
This is because the essence of Object-Orientation is that, organizationally, it doesn't distinguish between state and behavior at all, treating both uniformly as "features". Objects are just sets of features clumped to minimize overall dependency.
In other domains, Object-Orientation is not the best approach. There are different language paradigms for different problems. Experienced developers know this, and are willing to use whatever language is closest to the domain.
I look around and see some great snippets of code for defining rules, validation, business objects (entities) and the like, but I have to admit to having never seen a great and well-written business layer in its entirety.
I'm left knowing what I don't like, but not knowing what a great one is.
Can anyone point out some good OO business layers (or great business objects) or let me know how they judge a business layer and what makes one great?
Thanks
I’ve never encountered a well written business layer.
Here is Alex Papadimoulis's take on this:
[...] If you think about it, virtually every line of code in a software application is business logic:
The Customers database table, with its CustomerNumber (CHAR-13), ApprovedDate (DATETIME), and SalesRepName (VARCHAR-35) columns: business logic. If it wasn’t, it’d just be Table032 with Column01, Column02, and Column03.
The subroutine that extends a ten-percent discount to first time customers: definitely business logic. And hopefully, not soft-coded.
And the code that highlights past-due invoices in red: that’s business logic, too. Internet Explorer certainly doesn’t look for the strings “unpaid” and “30+ days” and go, hey, that sure would look good with a #990000 background!
So how then is possible to encapsulate all of this business logic in a single layer of code? With terrible architecture and bad code of course!
[...] By implying that a system’s architecture should include a layer dedicated to business logic, many developers employ all sorts of horribly clever techniques to achieve that goal. And it always ends up in a disaster.
I imagine this is because business logic, as a general rule, is arbitrary and nasty. Garbage in, garbage out.
Also, most of the really good business layers are most probably proprietary. ;-)
Good business layers have been designed after a thorough domain analysis. If you can capture the business' semantics and isolate it from any kind of implementation, whether that be in data storage or any specific application (including presentation), then the logic should be well-factored and reusable in different contexts.
Just as a good database schema design should capture business semantics and isolate itself from any application, a business layer should do the same. Even if a database schema and a business layer describe the same entities and concepts, the two should be usable in separate contexts: a database schema shouldn't have to change even when the business logic changes, unless the schema doesn't reflect the current business. A business layer should work with any storage schema provided that it's abstracted via an intermediate layer. For example, the ADO.NET Entity Framework lets you design a conceptual schema which maps to the business layer and has a separate mapping to the storage schema which can be changed without recompiling the business object layer or conceptual layer.
If a person from the business side of things can look at code written with the business layer and have a rough idea of what's going on then it might be a good indication that the objects were designed right: you've successfully conveyed a solution in the problem domain without obfuscating it with artifacts from the solution domain.
I've always been stuck between a rock and a hard place. Ideally, your business logic wouldn't be at all concerned with database or UI-related issues.
Keys Cause Problems
Still, I find things like primary and foreign keys causing problems. Even tools like Entity Framework don't completely eliminate this creep. It can be extremely inefficient to convert IDs passed as POST data into their respective objects, only to pass this to the business layer, which then passes them to the data layer to just be stripped down again.
Even NoSQL databases come with problems. They tend to return full object models, but they usually return more than you need and can lead to problems because you're assuming that object model won't change. And keys are still found in NoSQL databases.
Reuse vs. Overhead
There's also the issue of code reuse. It's pretty common for data layers to return fully populated objects, including every column in that particular table or tables. However, often business logic only cares about a limited subset of this information. It lends itself to specialized data transfer objects that only carry with them the relevant data. Of course, you need to convert between representations, so you create a mapper class. Then, when you save, you need to somehow convert these lesser objects back into the full database representation or do a partial UPDATE (meaning another SQL command).
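The trade-off described above could look roughly like this. The table, the classes and the SQL are all hypothetical; the sketch only shows the extra mapper and the partial UPDATE that a narrowed-down object forces on you.

```python
from dataclasses import dataclass

@dataclass
class CustomerRow:
    # Full row, as a data layer would typically return it.
    id: int
    name: str
    email: str
    credit_limit: int
    notes: str

@dataclass(frozen=True)
class CreditCheckInput:
    # Specialized DTO: only what the credit rule actually needs.
    customer_id: int
    credit_limit: int

def to_credit_check_input(row: CustomerRow) -> CreditCheckInput:
    # The mapper: one more piece of code to write and maintain.
    return CreditCheckInput(customer_id=row.id, credit_limit=row.credit_limit)

def partial_update(dto: CreditCheckInput, new_limit: int):
    # Saving from the smaller object means a separate, partial UPDATE.
    sql = "UPDATE customers SET credit_limit = ? WHERE id = ?"
    return sql, (new_limit, dto.customer_id)

row = CustomerRow(7, "Ann", "ann@example.com", 500, "long free-text notes")
dto = to_credit_check_input(row)
print(partial_update(dto, 800))
```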
So, I see a lot of business layer classes accepting objects mapping directly to database tables (data transfer objects). I also see a lot of business layers accepting raw UI values (presentation objects), as well. It's also not unusual to see business layers calling out to the database mid-computation to retrieve needed data. To try to grab it up-front would probably be inefficient (think about how an if-statement can affect the data that gets retrieved), and lazy-loaded values result in a lot of magic or unintended calls out to the database.
Write Your Logic First
Recently, I've been trying to write the "core" code first. This is the code that performs the actual business logic. I don't know about you, but many times when going over someone else's code, I ask the question, "But, where does it do [business rule]?" Often, the business logic is so crowded with concerns about grabbing data, transforming it and whatnot that I can't even see it (needle in a hay stack). So, now I implement the logic first and as I figure out what data I need, I add it as a parameter or add it to a parameter object. Getting the rest of the code to fit this new interface usually falls on a mediator class of some kind.
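A minimal sketch of that "logic first" approach. The discount rule, the parameter object and the mediator function are all invented for illustration; the point is only that the rule itself reads no database and no UI values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiscountInput:
    # Parameter object: grows as the rule turns out to need more data.
    order_total_cents: int
    is_first_order: bool

def discount_cents(data: DiscountInput) -> int:
    # The business rule, and nothing else (hypothetical: 10% off a first order).
    if data.is_first_order:
        return data.order_total_cents // 10
    return 0

def handle_checkout(order_row: dict, previous_order_count: int) -> int:
    # Mediator: adapts whatever the data layer returns to the rule's interface.
    data = DiscountInput(
        order_total_cents=order_row["total_cents"],
        is_first_order=previous_order_count == 0,
    )
    return order_row["total_cents"] - discount_cents(data)

print(handle_checkout({"total_cents": 2000}, previous_order_count=0))  # 1800
```

Anyone asking "where does it apply the discount?" has exactly one short function to read.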
Like I said, though, you have to keep a lot in mind when writing business layers, including performance. The approach above has been useful lately because I don't have rights to version control or the database schema yet. I am working in a dark room with just my understanding of the requirements so far.
Write with Testing in Mind
Utilizing dependency injection can be useful for designing a good architecture up-front. Try to think about how you would test your code without hitting a database or other service. This also lends itself to small, reusable classes that can run in multiple contexts.
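One way this can look in practice, as a sketch with made-up names: the business class receives its data source through the constructor, so a test can hand it an in-memory fake instead of a database.

```python
from datetime import date

class InvoiceChecker:
    def __init__(self, invoice_source):
        # Injected dependency: anything with an unpaid_invoices() method.
        self._source = invoice_source

    def overdue_ids(self, today: date):
        return [
            invoice["id"]
            for invoice in self._source.unpaid_invoices()
            if invoice["due"] < today
        ]

class FakeInvoiceSource:
    # Test double: no database, no network.
    def __init__(self, invoices):
        self._invoices = invoices

    def unpaid_invoices(self):
        return list(self._invoices)

fake = FakeInvoiceSource([
    {"id": 1, "due": date(2020, 1, 10)},
    {"id": 2, "due": date(2020, 3, 1)},
])
checker = InvoiceChecker(fake)
print(checker.overdue_ids(today=date(2020, 2, 1)))  # [1]
```

In production the same `InvoiceChecker` would be constructed with a source that queries the real database.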
Conclusion
My conclusion is that there really is no such thing as a perfect business layer. Even in the same application, there can be times when one approach only works 90% of the time. The best we can do is try to write the simplest thing that works. For the longest time I avoided DTOs and wrapped ADO.NET DataRows with objects so updates were immediately recorded in the underlying DataTable. This was a HUGE mistake because I couldn't copy objects and constraints caused exceptions to be thrown at weird times. I only did it to avoid setting parameter values explicitly.
Martin Fowler has blogged extensively about DSLs. I would recommend starting there.
http://martinfowler.com/bliki/dsl.html
It was helpful to me to learn and play with CSLA.Net (if you are a MS guy). I've never implemented a "pure" CSLA application, but have used many of the ideas presented in the architecture.
Your best bet is to keep looking for that elusive magic bullet and use the ideas that best fit the problem you are solving. Keep it simple.
One problem I find is that even when you have a nicely designed business layer it is hard to stop business logic leaking out, and development tools tend to encourage this. For example as soon as you add a validator control to an ASP.NET WebForm you have let business logic leak out into the view. The validation should occur in the business layer and only the results of it displayed in the view. And as soon as you add constraints to a database you then have business logic in your database as well. DBA types tend to disagree strongly with this last point though.
Neither have I. We don't create a business layer in our applications. Instead we use MVC-ARS. The business logic is embedded in the (S) state machine and the (A) action.
Possibly because in reality we are never able to fully decouple the business logic from the "process" (the inputs, outputs and interface), and because ultimately people find it hard to deal with the abstract, let alone relate it back to reality.
What do you suggest for Data Access layer? Using ORMs like Entity Framework and Hibernate OR Code Generators like Subsonic, .netTiers, T4, etc.?
For me, this is a no-brainer, you generate the code.
I'm going to go slightly off topic here because there's a bigger underlying fallacy at play. The fallacy is that these ORM frameworks solve the object/relational impedance mismatch. This claim is a barefaced lie.
I find the best way to resolve the object/relational impedance mismatch is to either use OOP exclusively and use an object database or use the idioms of the relational database exclusively and ignore OOP.
The abstraction "everything is a table" is to me, much more powerful than the abstraction "everything is a class." It takes less code, less intellectual effort and leads to faster code when you code to the database rather than to an object model.
To me this seems obvious. If your application is data driven then surely your code should be data driven too? Yet to say this is hugely controversial.
The central problem here is that OOP becomes a really leaky abstraction when used in conjunction with a database. Code that looks perfectly sensible when written to the idioms of OOP looks completely insane when you see the traffic that code generates at the database. When that messiness becomes a performance problem, OOP is the first casualty.
There is really no way to resolve this. Databases work with sets of data. OOP focuses on instances of classes. Trying to marry the two is always going to end in divorce.
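The "traffic at the database" point can be illustrated with a small sketch using Python's built-in `sqlite3` module. The schema, the `Customer` class and the query counter are assumptions made up for this example; the pattern shown is the one commonly called the N+1 query problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
""")

stats = {"queries": 0}

def query(sql, params=()):
    # Every round trip to the database goes through here and is counted.
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

class Customer:
    def __init__(self, id, name):
        self.id, self.name = id, name

    def order_total(self):
        # Looks innocent per object, but costs one query per customer.
        rows = query(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (self.id,),
        )
        return rows[0][0]

def totals_object_style():
    customers = [Customer(*row) for row in query("SELECT id, name FROM customers")]
    return {c.name: c.order_total() for c in customers}

def totals_set_style():
    # One set-based statement does the same work.
    return dict(query("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """))
```

With three customers, `totals_object_style()` issues four queries (one for the list plus one per customer) while `totals_set_style()` issues one, and the gap grows with the number of rows.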
So to answer your question, I believe you should generate your classes and try and make them map the underlying database structure as closely as possible.
Perhaps controversially, I've always felt that using code generators for the ADO.NET plumbing is fundamentally solving the wrong problem.
At some point, hopefully not too long after learning about Connection Strings, SqlCommands, DataAdapters, and all that, we notice that:
Such code is ugly
It is very boring to write
It's very easy to miss something if you're doing it by hand
It has to be repeated every time you want to access the database
So, the problem to solve is "how to do the same thing lots of times fast"?
I say no.
Using code generators to make this process quick still means that you have a ton of code, all the same, all over your business classes (or your data access layer, if you separate that from your business logic).
And then, if you want to do something generically (like track stored procedure usage, for instance), you end up having to customise your code generator if it doesn't already have the feature you want. And even if it does, you still have to regenerate everything all the time.
I like to do things once, not many times, no matter how fast I can do them.
So I rolled my own Data Access class that knows how to add parameters, set up and close connections, manage transactions, and other cool stuff. It only had to be written once, and calling its methods from a Business object that needs some database stuff done consists of one line of code.
When I needed to make the application support multithreaded database accesses, it required a change to the Data Access class only, and all the business classes just do what they already did.
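A rough sketch of what such a write-once data access class might look like, using Python's `sqlite3` in place of ADO.NET. This is not the author's actual class; `DataAccess`, `CustomerService` and the table are hypothetical names for illustration.

```python
import sqlite3

class DataAccess:
    """Connection, parameter and transaction plumbing, written once."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        # Using the connection as a context manager commits on success
        # and rolls back if the statement raises.
        with self._conn:
            return self._conn.execute(sql, params).rowcount

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()

class CustomerService:
    # Business class: each database action is a single line.
    def __init__(self, db: DataAccess):
        self._db = db

    def rename(self, customer_id, new_name):
        return self._db.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id)
        )

db = DataAccess()
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customers VALUES (?, ?)", (1, "Ann"))
service = CustomerService(db)
service.rename(1, "Anna")
print(db.query("SELECT id, name FROM customers"))  # [(1, 'Anna')]
```

Cross-cutting changes, such as logging every statement, would then be made inside `DataAccess` only.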
There is no right answer; it all depends on your project. As Simon points out, if your application is all data driven, then it might make sense, depending on the size and complexity of the domain, to use a non-OOP paradigm. I had a lot of success building a system using a Transaction Script pattern, which passed XML messages around the system.
However this system started to break down after five or six years as the application grew in size and complexity (5 or 6 webs, several web services, tons of COM+ components, legacy and .net code, 8+ databases with 800+ tables 4,000+ procedures). No one knew what anything was, and duplication was running rampant.
There are other ways to ease maintenance than OOP; however, if you have a very complex domain then having a rich domain model is ideal IMHO, as it allows the business rules to be expressed in nicely encapsulated components.
To answer your question, avoid code generators if you can. Code generators are a recipe for disaster, but if you do go with code generation do not modify the generated code. Also be sure to have a good process in place that is easy for developers to get the new generated code.
I recommend using one of the following: an ORM or a hand-rolled lightweight DAL. I am currently transitioning a project over to NHibernate from my hand-rolled DAL and am having a lot of success; however, I like having the option of using either. Further, if you properly separate your concerns (Data Access from Business Layer from Presentation) you can have a single service layer that talks to a DAO (Data Access Object) that for one object is an ORM but for another is hand rolled. I like this flexibility as it allows me to apply the best tool to the job.
I like NHibernate over a hand-rolled DAL because, while my DAL does abstract away most of the ADO.NET code, you still have to write the code that turns a data reader into an object, or takes an object and creates the parameters.
I've always preferred to go the code generator route, especially in C# where you can make use of extended classes to add functionality to the basic data objects.
Hate to say this, but it depends. If you find an ORM tool that fits your needs, go for it. We wrote our own system in small steps while developing the application. We are using C++ and there are not that many tools out there anyway. Ours ended up being an XML description of the database, from which the SQL generation script, the basic object layer and the metadata were generated.
Do your homework and evaluate these tools and you will find a good fit for your needs.
I really need to see some honest, thoughtful debate on the merits of the currently accepted enterprise application design paradigm.
I am not convinced that entity objects should exist.
By entity objects I mean the typical things we tend to build for our applications, like "Person", "Account", "Order", etc.
My current design philosophy is this:
All database access must be accomplished via stored procedures.
Whenever you need data, call a stored procedure and iterate over a SqlDataReader or the rows in a DataTable
(Note: I have also built enterprise applications with Java EE; Java folks, please substitute the equivalent for my .NET examples.)
I am not anti-OO. I write lots of classes for different purposes, just not entities. I will admit that a large portion of the classes I write are static helper classes.
I am not building toys. I'm talking about large, high volume transactional applications deployed across multiple machines. Web applications, windows services, web services, b2b interaction, you name it.
I have used OR Mappers. I have written a few. I have used the Java EE stack, CSLA, and a few other equivalents. I have not only used them but actively developed and maintained these applications in production environments.
I have come to the battle-tested conclusion that entity objects are getting in our way, and our lives would be so much easier without them.
Consider this simple example: you get a support call about a certain page in your application that is not working correctly, maybe one of the fields is not being persisted like it should be. With my model, the developer assigned to find the problem opens exactly 3 files. An ASPX, an ASPX.CS and a SQL file with the stored procedure. The problem, which might be a missing parameter to the stored procedure call, takes minutes to solve. But with any entity model, you will invariably fire up the debugger, start stepping through code, and you may end up with 15-20 files open in Visual Studio. By the time you step down to the bottom of the stack, you forgot where you started. We can only keep so many things in our heads at one time. Software is incredibly complex without adding any unnecessary layers.
Development complexity and troubleshooting are just one side of my gripe.
Now let's talk about scalability.
Do developers realize that each and every time they write or modify any code that interacts with the database, they need to do a thorough analysis of the exact impact on the database? And not just the development copy, I mean a mimic of production, so you can see that the additional column you now require for your object just invalidated the current query plan and a report that was running in 1 second will now take 2 minutes, just because you added a single column to the select list? And it turns out that the index you now require is so big that the DBA is going to have to modify the physical layout of your files?
If you let people get too far away from the physical data store with an abstraction, they will create havoc with an application that needs to scale.
I am not a zealot. I can be convinced if I am wrong, and maybe I am, since there is such a strong push towards Linq to Sql, ADO.NET EF, Hibernate, Java EE, etc. Please think through your responses, if I am missing something I really want to know what it is, and why I should change my thinking.
[Edit]
It looks like this question is suddenly active again, so now that we have the new comment feature I have commented directly on several answers. Thanks for the replies, I think this is a healthy discussion.
I probably should have been more clear that I am talking about enterprise applications. I really can't comment on, say, a game that's running on someone's desktop, or a mobile app.
One thing I have to put up here at the top in response to several similar answers: orthogonality and separation of concerns often get cited as reasons to go entity/ORM. Stored procedures, to me, are the best example of separation of concerns that I can think of. If you disallow all other access to the database, other than via stored procedures, you could in theory redesign your entire data model and not break any code, so long as you maintained the inputs and outputs of the stored procedures. They are a perfect example of programming by contract (just so long as you avoid "select *" and document the result sets).
Ask someone who's been in the industry for a long time and has worked with long-lived applications: how many application and UI layers have come and gone while a database has lived on? How hard is it to tune and refactor a database when there are 4 or 5 different persistence layers generating SQL to get at the data? You can't change anything! ORMs or any code that generates SQL lock your database in stone.
I think it comes down to how complicated the "logic" of the application is, and where you have implemented it. If all your logic is in stored procedures, and all your application does is call those procedures and display the results, then developing entity objects is indeed a waste of time. But for an application where the objects have rich interactions with one another, and the database is just a persistence mechanism, there can be value to having those objects.
So, I'd say there is no one-size-fits-all answer. Developers do need to be aware that, sometimes, trying to be too OO can cause more problems than it solves.
Theory says that highly cohesive, loosely coupled implementations are the way forward.
So I suppose you are questioning that approach, namely separating concerns.
Should my aspx.cs file be interacting with the database, calling a sproc, and understanding IDataReader?
In a team environment, especially where you have less technical people dealing with the aspx portion of the application, I don't need these people being able to "touch" this stuff.
Separating my domain from my database protects me from structural changes in the database, surely a good thing? Sure database efficacy is absolutely important, so let someone who is most excellent at that stuff deal with that stuff, in one place, with as little impact on the rest of the system as possible.
Unless I am misunderstanding your approach, one structural change in the database could have a large impact area with the surface of your application. I see that this separation of concerns enables me and my team to minimise this. Also any new member of the team should understand this approach better.
Also, your approach seems to advocate the business logic of your application to reside in your database? This feels wrong to me, SQL is really good at querying data, and not, imho, expressing business logic.
Interesting thought though, although it feels one step away from SQL in the aspx, which from my bad old unstructured asp days, fills me with dread.
One reason - separating your domain model from your database model.
What I do is use Test Driven Development, so I write my UI and Model layers first and the Data layer is mocked. The UI and model are built around domain-specific objects, and later I map these objects to whatever technology I'm using for the Data layer. It's a bad idea to let the database structure determine the design of your application. Where possible, write the app first and let that influence the structure of your database, not the other way around.
For me it boils down to I don't want my application to be concerned with how the data is stored. I'll probably get slapped for saying this...but your application is not your data, data is an artifact of the application. I want my application to be thinking in terms of Customers, Orders and Items, not a technology like DataSets, DataTables and DataRows...cuz who knows how long those will be around.
I agree that there is always a certain amount of coupling, but I prefer that coupling to reach upwards rather than downwards. I can tweak the limbs and leaves of a tree more easily than I can alter its trunk.
I tend to reserve sprocs for reporting, as the queries do tend to get a little nastier than the application's general data access.
I also tend to think that, with proper unit testing early on, scenarios like that one column not being persisted are not likely to be a problem.
Eric,
You are dead on. For any really scalable / easily maintained / robust application the only real answer is to dispense with all the garbage and stick to the basics.
I've followed a similar trajectory with my career and have come to the same conclusions. Of course, we're considered heretics and looked at funny. But my stuff works and works well.
Every line of code should be looked at with suspicion.
I would like to answer with an example similar to the one you proposed.
At my company I had to build a simple CRUD section for products. I built all my entities and a separate DAL. Later another developer had to change a related table, and he even renamed several fields. The only file I had to change to update my form was the DAL for that table.
What (in my opinion) entities bring to a project is:
Orthogonality: Changes in one layer might not affect other layers (of course, if you make a huge change to the database it will ripple through all the layers, but most small changes won't).
Testability: You can test your logic without touching your database. This increases the performance of your tests (allowing you to run them more frequently).
Separation of concerns: In a big product you can assign the database to a DBA and he can optimize the hell out of it. Assign the Model to a business expert that has the knowledge necessary to design it. Assign individual forms to developers more experienced on webforms etc..
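To illustrate the testability point, here is a minimal sketch in Java. The ProductStore interface, the BulkPricing class and the 10% bulk-discount rule are all invented for this example. The business rule talks to the data layer through a small interface, so a test can hand it a fake and never open a database connection.

```java
// Hypothetical seam between business logic and the data layer.
interface ProductStore {
    long priceInCents(String sku);
}

// Business rule under test. The 10% discount for 10+ items is an invented rule.
final class BulkPricing {
    private final ProductStore store;

    BulkPricing(ProductStore store) {
        this.store = store;
    }

    long totalInCents(String sku, int quantity) {
        long total = store.priceInCents(sku) * quantity;
        // Integer arithmetic on cents avoids floating-point rounding surprises.
        return quantity >= 10 ? total - total / 10 : total;
    }
}

final class BulkPricingDemo {
    public static void main(String[] args) {
        // A fake store that prices everything at 200 cents; no database involved.
        ProductStore fake = sku -> 200L;
        BulkPricing pricing = new BulkPricing(fake);
        System.out.println(pricing.totalInCents("ABC", 3));  // 600
        System.out.println(pricing.totalInCents("ABC", 10)); // 1800
    }
}
```

In production the same BulkPricing class would receive a ProductStore backed by the real DAL (or by stored procedures, for that matter).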
Finally I would like to add that most ORM mappers support stored procedures since that's what you are using.
Cheers.
I think you may be "biting off more than you can chew" on this topic. Ted Neward was not being flippant when he called it the "Vietnam of Computer Science".
One thing I can absolutely guarantee you is that it will change nobody's point of view on the matter, as has been proven so often on innumerable other blogs, forums, podcasts etc.
It's certainly ok to have open discussion and debate about a controversial topic, it's just that this one has been done so many times that both "sides" have agreed to disagree and just got on with writing software.
If you want to do some further reading on both sides, see articles on Ted's blog, Ayende Rahien, Jimmy Nilsson, Scott Bellware, Alt.Net, Stephen Forte, Eric Evans etc.
@Dan, sorry, that's not the kind of thing I'm looking for. I know the theory. Your statement "is a very bad idea" is not backed up by a real example. We are trying to develop software in less time, with fewer people, with fewer mistakes, and we want the ability to easily make changes. Your multi-layer model, in my experience, is a negative in all of the above categories. Especially with regard to making the data model the last thing you do. The physical data model must be an important consideration from day 1.
I found your question really interesting.
Usually I need entity objects to encapsulate the business logic of an application. It would be really complicated and inadequate to push this logic into the data layer.
What would you do to avoid these entity objects? What solution do you have in mind?
Entity objects can facilitate caching on the application layer. Good luck caching a data reader.
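A minimal sketch of what that can look like, in Java. Person, PersonCache and the loader function are hypothetical names, and a real cache would also need eviction and thread safety, which are left out here. The point is that a plain object can sit in a map and be handed out again, whereas a data reader is a forward-only cursor tied to an open connection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical entity; immutable so cached instances can be shared safely.
final class Person {
    final int id;
    final String firstName;

    Person(int id, String firstName) {
        this.id = id;
        this.firstName = firstName;
    }
}

// Very small read-through cache: no eviction, not thread-safe (sketch only).
final class PersonCache {
    private final Map<Integer, Person> byId = new HashMap<>();
    private final IntFunction<Person> loader; // stands in for the database call
    private int loadCount = 0;

    PersonCache(IntFunction<Person> loader) {
        this.loader = loader;
    }

    Person get(int id) {
        Person person = byId.get(id);
        if (person == null) {
            // Cache miss: go to the (assumed) data layer once and remember the result.
            person = loader.apply(id);
            loadCount++;
            byId.put(id, person);
        }
        return person;
    }

    int loadCount() {
        return loadCount;
    }
}

final class PersonCacheDemo {
    public static void main(String[] args) {
        PersonCache cache = new PersonCache(id -> new Person(id, "Person" + id));
        cache.get(1);
        cache.get(1);
        System.out.println(cache.loadCount()); // 1: the second get was served from memory
    }
}
```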
We should also talk about the notion what entities really are.
When I read through this discussion, I get the impression that most people here are looking at entities in the sense of an Anemic Domain Model.
A lot of people are considering the Anemic Domain Model as an antipattern!
There is value in rich domain models. That is what Domain Driven Design is all about.
I personally believe that OO is a way to conquer complexity. This means not only technical complexity (like data-access, ui-binding, security ...) but also complexity in the business domain!
If we can apply OO techniques to analyze, model, design and implement our business problems, this is a tremendous advantage for maintainability and extensibility of non-trivial applications!
There are differences between your entities and your tables. Entities should represent your model, tables just represent the data-aspect of your model!
It is true that data lives longer than apps, but consider this quote from David Laribee: Models are forever ... data is a happy side effect.
Some more links on this topic:
Why Setters and Getters are evil
Return of pure OO
POJO vs. NOJO
Super Models Part 2
TDD, Mocks and Design
Really interesting question. Honestly I can not prove why entities are good. But I can share my opinion why I like them. Code like
void exportOrder(Order order, String fileName){...};
is not concerned with where the order came from: the DB, a web request, a unit test, etc. It makes this method declare more explicitly what exactly it requires, instead of taking a DataRow and documenting which columns it expects to have and which types they should be. The same applies if you implement it as a stored procedure: you still need to pass a record id to it, even though the record does not necessarily have to be present in the DB.
The implementation of this method would be based on the Order abstraction, not on how exactly it is represented in the DB. Most of the operations of this kind that I have implemented really do not depend on how the data is stored. I do understand that some operations require coupling with the DB structure for performance and scalability purposes; in my experience there are just not too many of them. Very often it is enough to know that Person has .getFirstName() returning String and .getAddress() returning Address, and that Address has .getZipCode(), etc., without caring which tables are involved in storing that data.
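To make that concrete, here is a minimal sketch of the formatting half of something like exportOrder. The Order and OrderLine shapes and the CSV layout are assumptions made up for this example, and it returns a String instead of writing to a file so that it stays self-contained. Nothing in it knows whether the order came from a table, a web request or a test.

```java
import java.util.List;

// Hypothetical order line; field names are invented for this example.
final class OrderLine {
    final String sku;
    final int quantity;

    OrderLine(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

// Hypothetical order entity.
final class Order {
    final int id;
    final List<OrderLine> lines;

    Order(int id, List<OrderLine> lines) {
        this.id = id;
        this.lines = lines;
    }
}

final class OrderCsv {
    // Depends only on the Order abstraction, not on column names or tables.
    static String render(Order order) {
        StringBuilder out = new StringBuilder();
        out.append("order,").append(order.id).append('\n');
        for (OrderLine line : order.lines) {
            out.append(line.sku).append(',').append(line.quantity).append('\n');
        }
        return out.toString();
    }
}

final class OrderCsvDemo {
    public static void main(String[] args) {
        // Built by hand here; it could just as well come from a DAO or a web request.
        Order order = new Order(42, List.of(new OrderLine("A1", 2), new OrderLine("B2", 1)));
        System.out.print(OrderCsv.render(order));
    }
}
```

With a DataRow-based version, the equivalent of order.lines would be string lookups that the compiler cannot check.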
If you have to deal with problems like the ones you described, such as an additional column breaking report performance, then for your tasks the DB is a critical part, and you should indeed stay as close to it as possible. While entities can provide some convenient abstractions, they can hide some important details as well.
Scalability is an interesting point here: most websites which require enormous scalability (like Facebook, LiveJournal, Flickr) tend to use a DB-ascetic approach, where the DB is used as rarely as possible and scalability issues are solved by caching, especially in RAM. http://highscalability.com/ has some interesting articles on it.
There are other good reasons for entity objects besides abstraction and loose coupling. One of the things I like most is the strong typing that you can't get with a DataReader or a DataTable. Another reason is that, when done well, proper entity classes can make the code more maintainable by using first-class constructs for domain-specific terms that anyone looking at the code is likely to understand, rather than a bunch of strings with field names in them used for indexing a DataRow. Stored procedures are really orthogonal to the use of an ORM, since a lot of mapping frameworks give you the ability to map to sprocs.
I wouldn't consider sprocs + datareaders a substitute for a good ORM. With stored procedures, you're still constrained by, and tightly coupled to, the procedure's type signature, which uses a different type system than the calling code. Stored procedures can be subject to modification to accommodate additional options and schema changes. An alternative to stored procedures in the case where the schema is subject to change is to use views--you can map objects to views and then re-map views to the underlying tables when you change them.
I can understand your aversion to ORMs if your experience mainly consists of Java EE and CSLA. You might want to have a look at LINQ to SQL, which is a very lightweight framework and is primarily a one-to-one mapping with the database tables, but usually only needs minor extension for them to be full-blown business objects. LINQ to SQL can also map input and output objects to stored procedures' parameters and results.
The ADO.NET Entity framework has the added advantage that your database tables can be viewed as entity classes inheriting from each other, or as columns from multiple tables aggregated into a single entity. If you need to change the schema, you can change the mapping from the conceptual model to the storage schema without changing the actual application code. And again, stored procedures can be used here.
I think that more IT projects in enterprises fail because of unmaintainability of the code or poor developer productivity (which can happen from, e.g., context switching between sproc-writing and app-writing) than scalability problems of an application.
I would also like to add to Dan's answer that separating both models could enable your application to be run on different database servers or even database models.
What if you need to scale your app by load balancing more than one web server? You could install the full app on all web servers, but a better solution is to have the web servers talk to an application server.
But if there aren't any entity objects, they won't have very much to talk about.
I'm not saying that you shouldn't write monoliths if it's a simple, internal, short-lived application. But as soon as it gets moderately complex, or it should last a significant amount of time, you really need to think about a good design.
This saves time when it comes to maintaining it.
By splitting application logic from presentation logic and data access, and by passing DTOs between them, you decouple them, allowing them to change independently.
You might find this post on comp.object interesting.
I'm not claiming to agree or disagree but it's interesting and (I think) relevant to this topic.
A question: How do you handle disconnected applications if all your business logic is trapped in the database?
In the type of Enterprise application I'm interested in, we have to deal with multiple sites, some of them must be able to function in a disconnected state.
If your business logic is encapsulated in a Domain layer that is simple to incorporate into various application types -say, as a dll- then I can build applications that are aware of the business rules and are able, when necessary, to apply them locally.
In keeping the Domain layer in stored procedures on the database you have to stick with a single type of application that needs a permanent line-of-sight to the database.
It's ok for a certain class of environments, but it certainly doesn't cover the whole spectrum of Enterprise applications.
@jdecuyper, one maxim I repeat to myself often is "if your business logic is not in your database, it is only a recommendation". I think Paul Nielsen said that in one of his books. Application layers and UI come and go, but data usually lives for a very long time.
How do I avoid entity objects? Stored procedures mostly. I also freely admit that business logic tends to reach through all layers in an application whether you intend it to or not. A certain amount of coupling is inherent and unavoidable.
I have been thinking about this same thing a lot lately; I was a heavy user of CSLA for a while, and I love the purity of saying that "all of your business logic (or at least as much as is reasonably possible) is encapsulated in business entities".
I have seen the business entity model provide a lot of value in cases where the design of the database is different than the way you work with the data, which is the case in a lot of business software.
For example, the idea of a "customer" may consist of a main record in a Customer table, combined with all of the orders the customer has placed, as well as all the customer's employees and their contact information, and some of the properties of a customer and its children may be determined from lookup tables. It's really nice from a development standpoint to be able to work with the Customer as a single entity, since from a business perspective, the concept of Customer contains all of these things, and the relationships may or may not be enforced in the database.
While I appreciate the quote that "if your business rule is not in your database, it's only a suggestion", I also believe that you shouldn't design the database to enforce business rules, you should design it to be efficient, fast and normalized.
That said, as others have noted above, there is no "perfect design", the tool has to fit the job. But using business entities can really help with maintenance and productivity, since you know where to go to modify business logic, and objects can model real-world concepts in an intuitive way.
Eric,
No one is stopping you from choosing the framework/approach that you would wish. If you are going to go the "data driven/stored procedure-powered" path, then by all means, go for it! Especially if it really, really helps you deliver your applications on-spec and on-time.
The caveat being (a flipside to your question that is), ALL of your business rules should be on stored procedures, and your application is nothing more than a thin client.
That being said, same rules apply if you do your application in OOP : be consistent. Follow OOP's tenets, and that includes creating entity objects to represent your domain models.
The only real rule here is the word consistency. Nobody is stopping you from going DB-centric. No one is stopping you from doing old-school structured (aka, functional/procedural) programs. Hell, no one is stopping anybody from doing COBOL-style code. BUT an application has to be very, very consistent once going down this path, if it wishes to attain any degree of success.
I'm really not sure what you consider "Enterprise Applications". But I'm getting the impression you are defining it as an Internal Application where the RDBMS would be set in stone and the system wouldn't have to be interoperable with any other systems whether internal or external.
But what if you had a database with 100 tables? At 4 stored procedures per table just for basic CRUD operations, that's 400 stored procedures which need to be maintained, aren't strongly typed (so are susceptible to typos), and can't be unit tested. What happens when you get a new CTO who is an Open Source evangelist and wants to change the RDBMS from SQL Server to MySQL?
A lot of software today, whether enterprise applications or products, uses SOA and has some requirement for exposing web services; at least the software I am and have been involved with does.
Using your approach you would end up exposing a Serialized DataTable or DataRows. Now this may be deemed acceptable if the Client is guaranteed to be .NET and on an internal network. But when the Client is not known then you should be striving to Design an API which is intuitive and in most cases you would not want to be exposing the Full Database schema.
I certainly wouldn't want to explain to a Java developer what a DataTable is and how to use it. There's also the consideration of bandwidth and payload size: serialized DataTables and DataSets are very heavy.
There is no silver bullet in software design and it really depends on where the priorities lie; for me it's unit-testable code and loosely coupled components that can be easily consumed by any client.
just my 2 cents
I'd like to offer another angle to the problem of distance between OO and RDB: history.
Any software has a model of reality that is to some degree an abstraction of reality. No computer program can capture all the complexities of reality, and programs are written just to solve a set of problems from reality. Therefore any software model is a reduction of reality. Sometimes the software model forces reality to reduce itself. Like when you want the car rental company to reserve any car for you as long as it is blue and has alloys, but the operator can't comply because your request won't fit in the computer.
RDB comes from a very old tradition of putting information into tables, called accounting. Accounting was done on paper, then on punch cards, then in computers. But accounting is already a reduction of reality. Accounting has forced people to follow its system for so long that it has become accepted reality. That's why it is relatively easy to make computer software for accounting: accounting had its information model long before the computer came along.
Given the importance of good accounting systems, and the acceptance you get from any business managers, these systems have become very advanced. The database foundations are now very solid and no one hesitates about keeping vital data in something so trustworthy.
I guess that OO must have come along when people found that other aspects of reality are harder to model than accounting (which is already a model). OO has become a very successful idea, but persistence of OO data is relatively underdeveloped. RDB/Accounting has had easy wins, but OO is a much larger field (basically everything that isn't accounting).
So many of us have wanted to use OO but we still want safe storage of our data. What can be safer than to store our data the same way as the esteemed accounting system does? It is an enticing prospect, but we all run into the same pitfalls. Very few have taken the trouble to think about OO persistence compared to the massive efforts by the RDB industry, which has had the benefit of accounting's tradition and position.
Prevayler and db4o are some suggestions, I'm sure there are others I haven't heard of, but none have seemed to get half the press as, say, hibernation.
Storing your objects in good old files doesn't even seem to be taken seriously for multiuser applications, and especially web applications.
In my everyday struggle to close the chasm between OO and RDB I use OO as much as possible but try to keep inheritance to a minimum. I don't often use SPs. I'll use the advanced query stuff only in aspects that look like accounting.
I'll be happily surprised when the chasm is closed for good. I think the solution will come when Oracle launches something like "Oracle Object Instance Base". To really catch on, it will have to have a reassuring name.
Not a lot of time at the moment, but just off the top of my head...
The entity model lets you give a consistent interface to the database (and other possible systems) even beyond what a stored procedure interface can do. By using enterprise-wide business models you can make sure that all applications affect the data consistently which is a VERY important thing. Otherwise you end up with bad data, which is just plain evil.
If you only have one application then you don't really have an "enterprise" system, regardless of how big that application or your data are. In that case you can use an approach similar to what you talk about. Just be aware of the work that will be needed if you decide to grow your systems in the future.
Here are a few things that you should keep in mind (IMO) though:
Generated SQL code is bad (exceptions to follow). Sorry, I know that a lot of people think that it's a huge time saver, but I've never found a system that could generate more efficient code than what I could write and often the code is just plain horrible. You also often end up generating a ton of SQL code that never gets used. The exception here is very simple patterns, like maybe lookup tables. A lot of people get carried away on it though.
Entities <> Tables (or even logical data model entities necessarily). A data model often has data rules that should be enforced as closely to the database as possible which can include rules around how table rows relate to each other or other similar rules that are too complex for declarative RI. These should be handled in stored procedures. If all of your stored procedures are simple CRUD procs, you can't do that. On top of that, the CRUD model usually creates performance issues because it doesn't minimize round trips across the network to the database. That's often the biggest bottleneck in an enterprise application.
Sometimes, your application and data layer are not that tightly coupled. For example, you may have a telephone billing application. You later create a separate application which monitors phone usage to a) better advertise to you b) optimise your phone plan.
These applications have different concerns and data requirements (even if the data is coming out of the same database), so they would drive different designs. Your code base can end up an absolute mess (in either application) and a nightmare to maintain if you let the database drive the code.
Applications that have domain logic separated from the data storage logic are adaptable to any kind of data source (database or otherwise) or UI (web or windows(or linux etc.)) application.
You're pretty much stuck in your database, which isn't bad if you're with a company that is satisfied with the current database system you're using. However, because databases evolve over time, there might be a new database system that is really neat and new that your company wants to use. What if they wanted to switch to a web services method of data access (like service-oriented architecture sometimes does)? You might have to port your stored procedures all over the place.
Also the domain logic abstracts away the UI, which can be more important in large complex systems that have ever evolving UIs (especially when they are constantly searching for more customers).
Also, while I agree that there is no definitive answer to the question of stored procedures versus domain logic, I'm in the domain logic camp (and I think they are winning over time), because I believe that elaborate stored procedures are harder to maintain than elaborate domain logic. But that's a whole other debate.
I think that you are just used to writing a specific kind of application, and solving a certain kind of problem. You seem to be attacking this from a "database first" perspective. There are lots of developers out there where data is persisted to a DB but performance is not a top priority. In lots of cases putting an abstraction over the persistence layer simplifies code greatly and the performance cost is a non-issue.
Whatever you are doing, it's not OOP. It's not wrong, it's just not OOP, and it doesn't make sense to apply your solutions to every other problem out there.
Interesting question. A couple thoughts:
How would you unit test if all of your business logic was in your database?
Wouldn't changes to your database structure, specifically ones that affect several pages in your app, be a major hassle to change throughout the app?
Good Question!
One approach I rather like is to create an iterator/generator object that emits instances of objects that are relevant to a specific context. Usually this object wraps some underlying database access stuff, but I don't need to know that when using it.
For example,
An AnswerIterator object generates AnswerIterator.Answer objects. Under the hood it's iterating over a SQL Statement to fetch all the answers, and another SQL statement to fetch all related comments. But when using the iterator I just use the Answer object that has the minimum properties for this context. With a little bit of skeleton code this becomes almost trivial to do.
I've found that this works well when I have a huge dataset to work on, and when done right, it gives me small, transient objects that are relatively easy to test.
It's basically a thin veneer over the Database Access stuff, but it still gives me the flexibility of abstracting it when I need to.
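For what it's worth, the kind of skeleton code I mean might look roughly like this in Java. This is only a sketch: RawAnswerRow and the in-memory comments map are assumed stand-ins for the two SQL statements, and the Answer fields are invented for the example.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// One raw row from the (assumed) "all answers" query.
final class RawAnswerRow {
    final int answerId;
    final String body;

    RawAnswerRow(int answerId, String body) {
        this.answerId = answerId;
        this.body = body;
    }
}

// Emits small, transient Answer objects one at a time.
final class AnswerIterator implements Iterator<AnswerIterator.Answer> {

    // Only the minimum properties needed in this context.
    static final class Answer {
        final int id;
        final String body;
        final List<String> comments;

        Answer(int id, String body, List<String> comments) {
            this.id = id;
            this.body = body;
            this.comments = comments;
        }
    }

    private final Iterator<RawAnswerRow> rows;                       // stand-in for the answers cursor
    private final Map<Integer, List<String>> commentsByAnswerId;     // stand-in for the comments query

    AnswerIterator(Iterator<RawAnswerRow> rows, Map<Integer, List<String>> commentsByAnswerId) {
        this.rows = rows;
        this.commentsByAnswerId = commentsByAnswerId;
    }

    @Override
    public boolean hasNext() {
        return rows.hasNext();
    }

    @Override
    public Answer next() {
        // Pull one raw row and attach its comments (empty list if it has none).
        RawAnswerRow row = rows.next();
        List<String> comments = commentsByAnswerId.getOrDefault(row.answerId, List.of());
        return new Answer(row.answerId, row.body, comments);
    }
}

final class AnswerIteratorDemo {
    public static void main(String[] args) {
        List<RawAnswerRow> rows = List.of(new RawAnswerRow(1, "First"), new RawAnswerRow(2, "Second"));
        Map<Integer, List<String>> comments = Map.of(1, List.of("Nice", "Agreed"));
        AnswerIterator answers = new AnswerIterator(rows.iterator(), comments);
        while (answers.hasNext()) {
            AnswerIterator.Answer answer = answers.next();
            System.out.println(answer.id + ": " + answer.body + " (" + answer.comments.size() + " comments)");
        }
    }
}
```

The calling code only ever sees Answer objects, and in a test the two constructor arguments are just an in-memory list and map.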
The objects in my apps tend to relate one-to-one to the database, but I'm finding using Linq To Sql rather than sprocs makes it much easier writing complicated queries, especially being able to build them up using the deferred execution. e.g. from r in Images.User.Ratings where etc. This saves me trying to work out several join statements in sql, and having Skip & Take for paging also simplifies the code rather than having to embed the row_number & 'over' code.
Why stop at entity objects? If you don't see the value with entity objects in an enterprise level app, then just do your data access in a purely functional/procedural language and wire it up to a UI. Why not just cut out all the OO "fluff"?