Eiffel: Recommendation for ORM (Object Relationship Model) design - orm

Recommendations I understood in Java (which has a lot of restrictions, at least for me), even with Hibernate, were to have separated layers:
Entities like persons, children, users, etc...
DAO entities linked to database
Service providing entities and functionalities, where I'll do the SQL
WebService providing an interface over needs
As I'm starting with Eiffel and store, I'm facing a difficulty I have always had in programming (hoping there's somebody on this earth who does not have the same problem): I always want to generalize things more than necessary. Every time I copy-paste, I refactor and look for a solution that lets me write it only once... which costs time on the delivery of the software but, for me, adds quality and flexibility to it. I'm currently working alone in a company where I'm going to be the lead developer and, if the future allows, there will be more developers. The goal is to develop a platform of services in Eiffel, postgresql-odbc, and an Angular web front-end.
I'd like to have the most generic pattern possible, to be able to manage entities in the future in typical situations such as:
Database entities
Relationships
one_to_one
one_to_many
many_to_one
many_to_many
At the point I'm at now, I'm about to develop an architecture which ideally, for me, has:
DB_ENTITY, which has relations: BAG[RELATIONSHIP[P,S]] where P=Primary and S=Secondary
Primary is P->DB_ENTITY when ONE and BAG[P] when MANY
A COMPANY in my design will inherit from DB_ENTITY and add relationships such as a BRANCH. So I was thinking of having in my COMPANY class branches: RELATIONSHIP[like Current, BRANCH]
The relationship classes would help me create the CRUD SQL statements in the "service" layer in a more abstract manner.
When I try something more lightweight, I find restrictions in the pattern where I have to repeat operations... that's more or less my difficulty.
Can you think of any disadvantages of such a model, which I'm creating as a first shot at development?
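To make the idea concrete, here is a minimal sketch of a relationship object that produces the SQL for its own lookup. It is written in Python for brevity rather than Eiffel, and it is only one possible reading of the design: the class names, the `<primary>_id` foreign-key convention and the `<primary>_<secondary>` link-table name are assumptions, not something the design above specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    ONE_TO_ONE = "one_to_one"
    ONE_TO_MANY = "one_to_many"
    MANY_TO_ONE = "many_to_one"
    MANY_TO_MANY = "many_to_many"


@dataclass(frozen=True)
class Relationship:
    """Describes how a primary table relates to a secondary table."""
    primary: str              # table of the owning entity, e.g. "company"
    secondary: str            # table of the related entity, e.g. "branch"
    cardinality: Cardinality

    def select_related_sql(self) -> str:
        """Build a parameterized SELECT for the secondary rows of one primary row."""
        if self.cardinality is Cardinality.MANY_TO_MANY:
            # Assumed convention: a link table named <primary>_<secondary>.
            link = f"{self.primary}_{self.secondary}"
            return (
                f"SELECT s.* FROM {self.secondary} s "
                f"JOIN {link} l ON l.{self.secondary}_id = s.id "
                f"WHERE l.{self.primary}_id = ?"
            )
        if self.cardinality is Cardinality.MANY_TO_ONE:
            # Assumed convention: the primary row holds the foreign key.
            return (
                f"SELECT s.* FROM {self.secondary} s "
                f"JOIN {self.primary} p ON p.{self.secondary}_id = s.id "
                f"WHERE p.id = ?"
            )
        # one_to_one and one_to_many: the secondary row holds the foreign key.
        return f"SELECT * FROM {self.secondary} WHERE {self.primary}_id = ?"


# A COMPANY with many BRANCHes, as in the question.
branches = Relationship("company", "branch", Cardinality.ONE_TO_MANY)
print(branches.select_related_sql())  # SELECT * FROM branch WHERE company_id = ?
```

One trade-off this makes visible: the SQL is only as general as the naming conventions baked into the relationship class, so any table that deviates from them needs an override or an extra parameter.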

Quenio dos Santos did not want to create an account on stackexchange, so I'll quote his answer, which could be useful for others:
I recommend the book:
https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software-ebook/dp/B00794TAUG/ref=sr_1_2?s=digital-text&ie=UTF8&qid=1540350158&sr=1-2&keywords=domain+driven+design&dpID=51OWGtzQLLL&preST=_SY445_QL70_&dpSrc=srch
Not just because of the Repository pattern.
You should be able to implement reusable, abstract classes out of the
repository pattern if you want to get away from repetitive code. In
Java, MyBatis is a framework that helps with that. I really don’t know
if there is anything equivalent in Eiffel. I’m also a new Eiffel
developer.
Some pros-and-cons of the Repository pattern:
You define the SQL yourself. Some see it as a con, but some see it as a pro, because you have a clear understanding of the mapping from
the database to your classes, and it allows you to optimize your
queries, and join several tables into a single class, or a smaller
number of classes, when appropriate in your context.
More freedom on how you define your domain model. It can be quite different from the schema of your database. Your classes don’t have to
be just a set of anemic attribute holders of the tables in your
database, but they can have behavior and be useful and expressive in a
setting completely independent from your database.
ORM frameworks, like Hibernate, are sometimes hard-to-use for a new developer not very familiar with them. With the repository pattern,
because the mapping is so clear and available in your own code, it
tends to be easier to understand and debug.
You can also have different implementations of your repositories for different technologies, such as NoSQL databases. With ORM frameworks,
you tend to be stuck with a relational database, unless you rework
quite a bit of your dependencies on the ORM framework. It is easier to
encapsulate the data store technology behind repositories, and keep a
clean separation between your domain model and your database.
I’d say those are the main points. Having said that, these are very
general guidelines. I don’t have any experience with data persistence
in Eiffel, so I couldn’t say what the best practices are for Eiffel
developers. But I have a gut feeling that the repository pattern fits
Eiffel well because the original goal of the language was to build
more abstract, domain-rich class libraries, which is the purpose
behind the repository pattern. (By the way, I’m calling it a pattern,
but I’m not sure the author calls it that. The author would probably
call aggregates, entities and repositories, all kinds of classes used
in domain-driven design, all together a pattern.)
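As a rough illustration of the points above, the following Python sketch shows one abstract repository with two interchangeable implementations, one in memory and one backed by SQL. It is not taken from the book or from any Eiffel library: the `Company` class, the repository names and the `company` table are all hypothetical, and SQLite stands in for a real database only so that the example runs on its own.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Company:
    id: int
    name: str


class CompanyRepository(ABC):
    """What the domain code sees: no SQL and no storage details."""

    @abstractmethod
    def add(self, company: Company) -> None: ...

    @abstractmethod
    def get(self, company_id: int) -> Optional[Company]: ...


class InMemoryCompanyRepository(CompanyRepository):
    """Non-relational implementation, handy for tests."""

    def __init__(self) -> None:
        self._items = {}

    def add(self, company: Company) -> None:
        self._items[company.id] = company

    def get(self, company_id: int) -> Optional[Company]:
        return self._items.get(company_id)


class SqlCompanyRepository(CompanyRepository):
    """Relational implementation: the SQL mapping is explicit and local."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS company (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, company: Company) -> None:
        self._conn.execute(
            "INSERT INTO company (id, name) VALUES (?, ?)",
            (company.id, company.name),
        )

    def get(self, company_id: int) -> Optional[Company]:
        row = self._conn.execute(
            "SELECT id, name FROM company WHERE id = ?", (company_id,)
        ).fetchone()
        return Company(*row) if row else None


# The calling code is identical for both storage technologies.
for repo in (InMemoryCompanyRepository(), SqlCompanyRepository(sqlite3.connect(":memory:"))):
    repo.add(Company(1, "Acme"))
    print(type(repo).__name__, repo.get(1))
```

The repetitive part (one `add`/`get` pair per entity) is what a reusable abstract base or a mapping helper would factor out.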

Related

ASP.NET MVC4 n-Tier Architecture: best approach

I am developing a 3-tier architecture for an MVC4 webapp + EntityFramework5.
I want to keep the layers separate, so that only the DAL knows that I'm using EF, for example.
Actually I have a lot of classes to manage that:
DAL
Entity POCO
Entity DataContext : DbContext
Entity Repository
BL
Entity ViewModel
Entity Service(instantiate Entity Repository)
WEB
Entity Controllers (instantiate Entity Service)
This is working but is quite hard to maintain. I was thinking of removing the Entity Repository in the DAL and using the DataContext directly (if I'm not wrong, DbContext has after all been designed to be a Repository and a Unit of Work), but that would force me to add a reference to EntityFramework.dll in my BL. It is not a big issue, but I'm not sure it is the best choice.
Any advice?
(I hope I gave enough information; if you need more, just ask)
You can use this, this and this article.
An experienced Architect does not need to go through every single step in the book to get a reasonable design done for a small web
application. Such Architects can use their experience to speed up the
process. Since I have done similar web applications before and have
understood my deliverable, I am going to take the faster approach to
get the initial part of our DMS design done. That will hopefully
assist me to shorten the length of this article.
For those who do not have experience, let me briefly mention the general steps involved in architecting software below...
Understand the initial customer requirement - Ask questions and do research to further elaborate the requirement
Define the process flow of the system preferably in visual (diagram) form. I usually draw a process-flow diagram here. In my
effort, I would try to define the manual version of the system first
and then would try to convert that into the automated version while
identifying the processes and their relations. This process-flow
diagram that we draw here can be used as the medium to validate the
captured requirements with the customer too.
Identify the software development model that suits your requirements
When the requirements are fully captured and defined before the design start, you can use the 'Water-Fall' model. But when the
requirements are undefined, a variant of 'Spiral' can be used to deal
with that.
When requirements are not defined, the system gets defined while it is being designed. In such cases, you need to leave adequate room
in the respective modules where later expansions are expected.
Decide what architecture to be used. In my case, to design our Document Management System (DMS), I will be using a combination of
ASP.NET MVC and Multitier Architecture (Three Tier Variant).
Analyze the system and identify its modules or sub systems.
Pick one sub system at a time and further analyze it and identify all granular level requirements belonging to that part of the systems.
Recognize the data entities and define the relationships among entities (Entity Relationship Diagram or ER Diagram). That can
be followed by identifying the business entities (Some business entities
directly map with the classes of your system) and define the business
process flow.
Organize your entities. This is where you normalize your database, and decide what OOP concepts and design patterns are to be used
etc.
Make your design consistent. Follow the same standards across all modules and layers. This includes streamlining the concepts (as an
example, if you have used two different design patterns in two
different modules to achieve the same goal, then pick the better
approach and use that in both the places), and conventions used in the
project.
Tuning the design is the last part of the process. In order to do this, you need to have a meeting with the project team. In that
meeting you need to present your design to your team and make them ask
questions about it. Take this as an opportunity to honestly evaluate/
adjust your design.

Are ORM's counterproductive to OO design?

In OOD, design of an object is said to be characterized by its identity and behavior.
Having used ORM's in the past, the primary purpose, in my opinion, revolves around the ability to store/retrieve data. That is to say, ORM objects are not designed by behavior, but rather by data (i.e. database tables). Case in point: Many ORM tools come with a point-to-a-database-table-and-click-object-generator.
If objects are no longer characterized by behavior this will, in my opinion, muddy the identity and responsibility of the objects. Subsequently, if objects are not defined by a responsibility this could lend a hand to having tightly coupled classes and overall poor design.
Furthermore, I would think that in an application setting, you would be heading towards scalability issues.
So, my question is, do you think that ORM's are counterproductive to OO design? Perhaps the underlying question would be whether or not they are counterproductive to application development.
There's a well-known and oft-ignored impedance mismatch between the requirements of good database design and the requirements of good OO design. Most developers (in my experience) either do not understand this impedance mismatch or do not care. Since it's more common to start with the database and generate the objects from it (rather than the reverse), then yes, you'll end up with objects that are great as a persistence layer but sub-optimal from an OO perspective. (The reverse, generating the database from the object model, makes me want to stab my eyes out.)
Why are they sub-optimal from an OO perspective? Because the objects produced by an ORM are not business objects, even with partial classes and the like. Business objects model behavior. ORM objects model persistence. I'm not going to spend ten paragraphs arguing this distinction. It's something Rocky Lhotka has covered quite well in his books on Business Objects and his CSLA framework. Whether or not you like or use CSLA, I think his arguments are solid ones.
If objects are no longer characterized by behavior this will, in my opinion, muddy the
identity and responsibility of the objects.
The objects in question do have database reading and writing as defined behavior. They just don't have much other than that.
The reality of the situation is pretty simple: object orientation isn't an end in itself, it's a means to an end -- but in some cases it just doesn't do much to improve the end result. A lot of uses for ORM form a case in point -- they are thousands of variations of CRUD applications that don't need or want to attach any real behavior to most of the data they process.
The application as a whole gains flexibility by not encoding much (if any) of the data's "behavior" into the code of the application itself. Instead, they're often better off with that as "dumb" data, that they simply pass through from UI to database and back out to reports and such. With a bit of care, this can allow a substantial level of user customization that's almost impossible to match when you try to treat the data as real objects with real behavior encoded into the application proper.
Of course, there's another side to that: it can make it substantially more difficult to ensure the integrity of the data or that the data is only used appropriately -- I've seen code that accidentally used the wrong field in a calculation, so they were averaging the office numbers instead of office sizes in square feet. Both were user-defined fields that just said the contents should be numeric. The application had no way to know that one made perfect sense, and the other didn't at all.
Case in point: Many OR/M tools come with a point-to-a-database-table-and-click-object-generator.
Yes, but there are equal, if not greater, numbers of ORM solutions that base themselves off your objects and generate your database tables.
If you start with data and then tramp down forcing auto-generated objects to have object behaviours, yes, you might get confused... But if you start with the object and generate the database as a secondary layer, you end up with something a lot more usable, even if the database isn't perhaps as optimised as it could be.
If you're looking for an excuse not to use ORM, don't use it. I personally find it saves me thousands of lines of code doing trivial things that the ORM does just great.
I don't believe ORM are counterproductive to OO design, unless you want to insist that persistence is an integral part of behavior.
I'd separate persistence from business behavior. You're certainly free to add business behavior to any object that an ORM generates for you.
I've also seen ORM systems which tend to go from the OO model and generate the database.
If anything, I would say ORMs are more biased towards producing good OO code than they are to producing good database code.
Ideally, a successful ORM bridges the two worlds and your application code would be great from a business domain problem-solving and implementation perspective and your database code and model would be great from a normalization, performance and ETL/reporting/replication whatever perspective.
On systems where it separates your data from behavior it's absolutely counterproductive.
ORM systems tend to analyze existing database tables to create stupid "Objects" in your language. These things are not true objects because they do not contain business behavior, but since people have these structures, they tend to want to use them.
Ruby on Rails (Active Record) actually binds your data to a "Live" class--this is much better.
I've never seen a system I really liked--ActiveRecord is close but it makes a few rubiesque assumptions that I'm not quite comfortable with--the biggest being supplying public setters & getters by default.
But to sum up--I've seen a lot of good OO programmers write screwed up code because of ORM.
As with most questions of this type, it depends on your usage.
The main ways to use ORM tools are:
Define object by data, use this object throughout application code (BAD BAD BAD)
Use ORM objects only for data access, define your own objects with an interpretive layer between (Much Better)
If starting from scratch a 3rd method is to design data from your object model. (Best if possible)
So yes, if you define the object by the data tables and use that throughout your code, you will not be using OOD and will be introducing very poor design and maintenance issues.
But if you only use the ORM objects as a data access tool (replacing ADO) then you are free to use good OOD and ORM together. Sure, more code is required to build the interpretation layer, but enables much better practices, with not much more code required than old ADO code.
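The second approach (ORM objects for data access only, with an interpretive layer in between) might look roughly like the sketch below. It is in Python rather than C#, and every name in it is hypothetical: `OrderRecord` stands for whatever table-shaped class an ORM would generate, `Order` for a hand-written business object, and the two mapping functions are the interpretation layer.

```python
from dataclasses import dataclass


@dataclass
class OrderRecord:
    """Table-shaped data, as a data-access tool might return it (hypothetical)."""
    id: int
    quantity: int
    unit_price_cents: int
    status: str  # "NEW" or "SHIPPED" as stored in the database


class Order:
    """Business object: defined by behavior, not by the table layout."""

    def __init__(self, order_id: int, quantity: int, unit_price_cents: int, shipped: bool) -> None:
        self.order_id = order_id
        self._quantity = quantity
        self._unit_price_cents = unit_price_cents
        self._shipped = shipped

    def total_cents(self) -> int:
        return self._quantity * self._unit_price_cents

    def ship(self) -> None:
        # A business rule that no generated record class would carry.
        if self._shipped:
            raise ValueError("order already shipped")
        self._shipped = True


def to_domain(record: OrderRecord) -> Order:
    """Interpretation layer, database side to domain side."""
    return Order(record.id, record.quantity, record.unit_price_cents,
                 shipped=(record.status == "SHIPPED"))


def to_record(order: Order) -> OrderRecord:
    """Interpretation layer, domain side back to database side."""
    return OrderRecord(order.order_id, order._quantity, order._unit_price_cents,
                       "SHIPPED" if order._shipped else "NEW")


order = to_domain(OrderRecord(id=1, quantity=3, unit_price_cents=250, status="NEW"))
order.ship()
print(order.total_cents(), to_record(order).status)  # 750 SHIPPED
```

The extra code is the two mapping functions; in exchange, a schema change only touches `OrderRecord` and the mappers, not the business rules.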
I'm answering this from a C# perspective, since that is where I do the majority of my development....
I see your point, but I think with the ability to create partial classes you can still create objects with any behavior you like and still get the power that an OR/M brings to the table for data retrieval.
From what I've seen of ORMs they do not go against OO principles - fairly orthogonal to them in fact. (for info - pretty new to ORM technology, Java perspective)
My reasoning is that ORMs help you store the data members of a class to a persistent store without having to couple to that store and write that code yourself. You still decide on the data members and write the behaviours of a class.
I guess you could abuse ORMs to break OO principles, but then you can do that with anything. You might use tooling to create skeletal data classes from a pre-existing table, but you would still create methods etc.

.NET Dataset vs Business Object : Why the debate? Why not combine the two?

I read a debate in the comments here (current live site, without comments).
Why the debate? A Dataset for me is like a relational database, an Object is a hierarchical-like model. Why do people absolutely want a "pure" Object model, whereas we still deal with relational databases, so why not combine the two?
And if we should, is there any lightweight, comprehensive framework that allows us to do that (not a heavy mammoth, like NHibernate, which has a huge learning curve)?
"Pure objects" are a lot easier to work with, the typed object gives you intellisense and compile-time type checking.
Bare datasets are very cumbersome and annoying to work with - you need to know the column names, there's no type checking possible, so if you mistype a column name, you're out of luck and won't discover the error until runtime (the worst possible scenario).
Typed datasets are a step in the right direction, but the "things" you work with in your .NET code are still tied very closely and tightly to your database implementation - not typically a good thing, since any change in the underlying database might affect your app all the way up to your UI and cause a lot of changes being necessary.
Using an ORM like NHibernate allows you to better abstract and decouple the database (physical storage) layer from your logical business model - only in the simplest of scenarios will those two be an exact 1:1 match, so you'll need some kind of "translation" or mapping between the two anyway.
So all in all - using typed datasets might be okay for small, simple apps, but for a challenging, larger-scale, enterprise-level business app, I would never recommend coupling your business object model so closely and tightly to the database.
Marc
why do people absolutely want "pure" Object model
Because you don't want your application to depend on the database schema
Well, all the reasons you give are the same as the academic reasons that were given for EJB in Java, which was a mess in the past. So aren't people falling into another fashionable hype?
As I read here:
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
the promise is one thing, the reality is another.
Where is the proof for these claims?
Scientifically, complexity is tied to the concept of entropy: you cannot reduce the inherent complexity of things, you can only move it somewhere else, so to me there is something fundamentally irrational here.
Ted Neward is highly controversial because it seems to me that everybody is herding like in the old EJB days: nobody dared to say EJB sucked until Rod Johnson came out with Spring.
Now it seems nobody cares to say ORM frameworks like Hibernate, Entity Framework, etc. are too complex, maybe because there isn't yet another Rod Johnson II :)
You claim that adding a new layer solves the problem. That is not always the case, even theoretically, just as adding more team members when a project becomes a mess does not always help, because more programmers also add to the coordination and communication problems.
And in practice, it seems that the layers that should be independent, at least from the GUI viewpoint, aren't really. I see many people struggle to do simple stuff in the GUI when they use an ORM.

Are Databases and Functional Programming at odds?

I've been a web developer for some time now, and have recently started learning some functional programming. Like others, I've had some significant trouble applying many of these concepts to my professional work. For me, the primary reason is that FP's goal of remaining stateless seems quite at odds with the fact that most web development work I've done has been heavily tied to databases, which are very data-centric.
One thing that made me a much more productive developer on the OOP side of things was the discovery of object-relational mappers like MyGeneration d00dads for .Net, Class::DBI for perl, ActiveRecord for ruby, etc. This allowed me to stay away from writing insert and select statements all day, and to focus on working with the data easily as objects. Of course, I could still write SQL queries when their power was needed, but otherwise it was abstracted nicely behind the scenes.
Now, turning to functional programming, it seems that many of the FP web frameworks, like Links, require writing a lot of boilerplate SQL code, as in this example. Weblocks seems a little better, but it seems to use a kind of OOP model for working with data, and still requires code to be manually written for each table in your database, as in this example. I suppose you could use some code generation to write these mapping functions, but that seems decidedly un-lisp-like.
(Note I have not looked at Weblocks or Links extremely closely, I may just be misunderstanding how they are used).
So the question is: for the database access portions (which I believe are pretty large) of a web application, or other development requiring an interface with a SQL database, we seem to be forced down one of the following paths:
Don't Use Functional Programming
Access Data in an annoying, un-abstracted way that involves manually writing a lot of SQL or SQL-like code ala Links
Force our functional Language into a pseudo-OOP paradigm, thus removing some of the elegance and stability of true functional programming.
Clearly, none of these options seems ideal. Has anyone found a way to circumvent these issues? Is there really even an issue here?
Note: I personally am most familiar with LISP on the FP front, so if you want to give any examples and know multiple FP languages, lisp would probably be the preferred language of choice
PS: For Issues specific to other aspects of web development see this question.
Coming at this from the perspective of a database person, I find that front end developers try too hard to find ways to make databases fit their model rather than consider the most effective ways to use databases, which are not object oriented or functional but relational and based on set theory. I have seen this generally result in poorly performing code. And further, it creates code that is difficult to performance tune.
When considering database access there are three main considerations - data integrity (why all business rules should be enforced at the database level not through the user interface), performance, and security. SQL is written to manage the first two considerations more effectively than any front end language. Because it is specifically designed to do that. The task of a database is far different than the task of a user interface. Is it any wonder that the type of code that is most effective in managing the task is conceptually different?
And databases hold information critical to the survival of a company. Is it any wonder that businesses aren't willing to experiment with new methods when their survival is at stake? Heck, many businesses are unwilling to even upgrade to new versions of their existing database. So there is an inherent conservatism in database design. And it is deliberately that way.
I wouldn't try to write T-SQL or use database design concepts to create your user-interface, why would you try to use your interface language and design concepts to access my database? Because you think SQL isn't fancy (or new) enough? Or you don't feel comfortable with it? Just because something doesn't fit the model you feel most comfortable with, doesn't mean it is bad or wrong. It means that it is different and probably different for a legitimate reason. You use a different tool for a different task.
First of all, I would not say that CLOS (Common Lisp Object System) is "pseudo-OO". It is first class OO.
Second, I believe that you should use the paradigm that fits your needs.
You cannot statelessly store data, while a function is flow of data and does not really need state.
If you have several needs intermixed, mix your paradigms. Do not restrict yourself to only using the lower right corner of your toolbox.
You should look at the paper "Out of the Tar Pit" by Ben Moseley and Peter Marks, available here: "Out of the Tar Pit" (Feb. 6, 2006)
It is a modern classic which details a programming paradigm/system called Functional-Relational Programming. While not directly relating to databases, it discusses how to isolate interactions with the outside world (databases, for example) from the functional core of a system.
The paper also discusses how to implement a system where the internal state of the application is defined and modified using a relational algebra, which obviously is related to relational databases.
This paper will not give an exact answer to how to integrate databases and functional programming, but it will help you design a system to minimize the problem.
Functional languages do not have the goal to remain stateless, they have the goal to make management of state explicit. For instance, in Haskell, you can consider the State monad as the heart of "normal" state and the IO monad a representation of state which must exist outside of the program. Both of these monads allow you to (a) explicitly represent stateful actions and (b) build stateful actions by composing them using referentially transparent tools.
You reference a number of ORMs, which, per their name, abstract databases as sets of objects. Truly, this is not what the information in a relational database represents! Per its name, it represents relational data. SQL is an algebra (language) for handling relationships on a relational data set and is actually quite "functional" itself. I bring this up so as to consider that (a) ORMs are not the only way to map database information, (b) SQL is actually a pretty nice language for some database designs, and (c) functional languages often have relational algebra mappings which expose the power of SQL in an idiomatic (and in the case of Haskell, typechecked) fashion.
I would say most Lisps are a poor man's functional language. They are fully capable of being used according to modern functional practices, but since they don't require them, the community is less likely to use them. This leads to a mixture of methods which can be highly useful but certainly obscures how pure functional interfaces can still use databases meaningfully.
I don't think the stateless nature of fp languages is a problem with connecting to databases. Lisp is a non-pure functional programming language so it shouldn't have any problem dealing with state. And pure functional programming languages like Haskell have ways of dealing with input and output that can be applied to using databases.
From your question it seems like your main problem lies in finding a good way to abstract away the record-based data you get back from your database into something that is lisp-y (lisp-ish?) without having to write a lot of SQL code. This seems more like a problem with the tooling/libraries than a problem with the language paradigm. If you want to do pure FP maybe lisp isn't the right language for you. Common lisp seems more about integrating good ideas from oo, fp and other paradigms than about pure fp. Maybe you should be using Erlang or Haskell if you want to go the pure FP route.
I do think the 'pseudo-oo' ideas in lisp have their merit too. You might want to try them out. If they don't fit the way you want to work with your data you could try creating a layer on top of Weblocks that allows you to work with your data the way you want. This might be easier than writing everything yourself.
Disclaimer: I'm not a Lisp expert. I'm mostly interested in programming languages and have been playing with Lisp/CLOS, Scheme, Erlang, Python and a bit of Ruby. In daily programming life I'm still forced to use C#.
If your database doesn't destroy information, then you can work with it in a functional manner consistent with "pure functional" programming values by working in functions of the entire database as a value.
If at time T the database states that "Bob likes Suzie", and you had a function likes which accepted a database and a liker, then so long as you can recover the database at time T you have a pure functional program that involves a database. e.g.
# Start: Time T
likes(db, "Bob")
=> "Suzie"
# Change who Bob likes
...
likes(db, "Bob")
=> "Alice"
# Recover the database from T
db = getDb(T)
likes(db, "Bob")
=> "Suzie"
To do this you can't ever throw away information you might use (which in all practicality means you cannot throw away information), so your storage needs will increase monotonically. But you can start to work with your database as a linear series of discrete values, where subsequent values are related to the prior ones through transactions.
This is the major idea behind Datomic, for example.
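A runnable version of the pseudocode above could look like the following Python sketch. It is only an illustration of the "database as a value" idea, not how Datomic works internally: the `History` class and its methods are made up for the example, each transaction appends a new read-only snapshot, and `likes` stays a pure function of whichever snapshot it is given.

```python
from types import MappingProxyType


class History:
    """Append-only sequence of read-only database values (hypothetical helper)."""

    def __init__(self):
        self._snapshots = [MappingProxyType({})]  # time 0: empty database

    def transact(self, liker, liked):
        """Record a fact by appending a new value; return its time index."""
        new_value = dict(self._snapshots[-1])  # copy, never mutate the old value
        new_value[liker] = liked
        self._snapshots.append(MappingProxyType(new_value))
        return len(self._snapshots) - 1

    def get_db(self, t):
        """Recover the database exactly as it was at time t."""
        return self._snapshots[t]

    def latest(self):
        return self._snapshots[-1]


def likes(db, liker):
    """Pure function: the answer depends only on the database value passed in."""
    return db[liker]


history = History()
t = history.transact("Bob", "Suzie")      # Start: time T
print(likes(history.get_db(t), "Bob"))    # Suzie
history.transact("Bob", "Alice")          # Change who Bob likes
print(likes(history.latest(), "Bob"))     # Alice
print(likes(history.get_db(t), "Bob"))    # Suzie (recovered from T)
```

Copying the whole dictionary on every transaction is the naive form of the monotonically growing storage mentioned above; persistent data structures share the unchanged parts instead.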
Not at all. There are a genre of databases known as 'Functional Databases', of which Mnesia is perhaps the most accessible example. The basic principle is that functional programming is declarative, so it can be optimised. You can implement a join using List Comprehensions on persistent collections and the query optimiser can automagically work out how to implement the disk access.
Mnesia is written in Erlang and there is at least one web framework (Erlyweb) available for that platform. Erlang is inherently parallel with a shared-nothing threading model, so in certain ways it lends itself to scalable architectures.
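To show what "a join using list comprehensions" means, here is the same shape in Python with made-up sample data. Unlike Mnesia's query support, Python does not optimise this: it is a plain nested loop over in-memory lists, so the sketch illustrates only the declarative form of the query, not the optimiser.

```python
# Two in-memory "tables" (illustrative sample data, not from any real system).
employees = [
    {"name": "Ann", "dept_id": 1},
    {"name": "Raj", "dept_id": 2},
    {"name": "Lee", "dept_id": 1},
]
departments = [
    {"id": 1, "title": "Engineering"},
    {"id": 2, "title": "Sales"},
]

# Roughly: SELECT e.name, d.title FROM employees e JOIN departments d ON e.dept_id = d.id
joined = [
    (e["name"], d["title"])
    for e in employees
    for d in departments
    if e["dept_id"] == d["id"]
]
print(joined)  # [('Ann', 'Engineering'), ('Raj', 'Sales'), ('Lee', 'Engineering')]
```

Because the comprehension states what is wanted rather than how to fetch it, a query engine is free to pick an index or a different join order, which is the point the answer makes.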
A database is the perfect way to keep track of state in a stateless API. If you subscribe to REST, then your goal is to write stateless code that interacts with a datastore (or some other backend) that keeps track of state information in a transparent way so that your client doesn't have to.
The idea of an Object-Relational Mapper, where you import a database record as an object and then modify it, is just as applicable and useful to functional programming as it is to object oriented programming. The one caveat is that functional programming does not modify the object in place, but the database API can allow you to modify the record in place. The control flow of your client would look something like this:
Import the record as an object (the database API can lock the record at this point),
Read the object and branch based on its contents as you like,
Package a new object with your desired modifications,
Pass the new object to the appropriate API call which updates the record on the database.
The database will update the record with your changes. Pure functional programming might disallow reassigning variables within the scope of your program, but your database API can still allow in-place updates.
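The four steps above can be sketched as follows. This is a Python illustration under stated assumptions: `FakeDatabase` is a stand-in for a real database API (no locking, no transactions), and `Account` and `withdraw` are hypothetical names. The record is an immutable value inside the program, and only the database call performs an in-place update.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    """Immutable record: it cannot be modified in place by the program."""
    id: int
    balance: int


class FakeDatabase:
    """Stand-in for a real database API; the only place where state changes."""

    def __init__(self):
        self._rows = {1: Account(id=1, balance=100)}

    def load(self, account_id):
        return self._rows[account_id]

    def update(self, account):
        self._rows[account.id] = account


def withdraw(account, amount):
    """Pure function: returns a new Account instead of mutating the old one."""
    if amount > account.balance:
        return account  # branch on the contents: nothing to change
    return replace(account, balance=account.balance - amount)


db = FakeDatabase()
before = db.load(1)             # 1. import the record as an object
after = withdraw(before, 30)    # 2-3. read it, branch, package a new object
db.update(after)                # 4. pass the new object to the update call
print(before.balance, db.load(1).balance)  # 100 70
```

The old value `before` is still intact after the update, which is what keeps the client code referentially transparent even though the stored record changed.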
I'm most comfortable with Haskell. The most prominent Haskell web framework (comparable to Rails and Django) is called Yesod. It seems to have a pretty cool, type-safe, multi-backend ORM. Have a look at the Persistent chapter in their book.
Databases and Functional Programming can be fused.
for example:
Clojure is a functional programming language based on relational database theory.
Clojure -> DBMS, Super Foxpro
STM -> Transaction,MVCC
Persistent Collections -> db, table, col
hash-map -> indexed data
Watch -> trigger, log
Spec -> constraint
Core API -> SQL, Built-in function
function -> Stored Procedure
Meta Data -> System Table
Note: In the latest spec2, spec is more like RMDB.
see: spec-alpha2 wiki: Schema-and-select
I advocate: Building a relational data model on top of hash-map to achieve a combination of NoSQL and RMDB advantages. This is actually a reverse implementation of PostgreSQL.
Duck Typing: If it looks like a duck and quacks like a duck, it must be a duck.
If Clojure's data model is like an RMDB, Clojure's facilities are like an RMDB, and Clojure's data manipulation is like an RMDB, then Clojure must be an RMDB.
Clojure is a functional programming language based on relational database theory
Everything is RMDB
Implement relational data model and programming based on hash-map (NoSQL)

Why do we need entity objects? [closed]

I really need to see some honest, thoughtful debate on the merits of the currently accepted enterprise application design paradigm.
I am not convinced that entity objects should exist.
By entity objects I mean the typical things we tend to build for our applications, like "Person", "Account", "Order", etc.
My current design philosophy is this:
All database access must be accomplished via stored procedures.
Whenever you need data, call a stored procedure and iterate over a SqlDataReader or the rows in a DataTable.
(Note: I have also built enterprise applications with Java EE; Java folks, please substitute the equivalent for my .NET examples.)
I am not anti-OO. I write lots of classes for different purposes, just not entities. I will admit that a large portion of the classes I write are static helper classes.
I am not building toys. I'm talking about large, high volume transactional applications deployed across multiple machines. Web applications, windows services, web services, b2b interaction, you name it.
I have used OR Mappers. I have written a few. I have used the Java EE stack, CSLA, and a few other equivalents. I have not only used them but actively developed and maintained these applications in production environments.
I have come to the battle-tested conclusion that entity objects are getting in our way, and our lives would be so much easier without them.
Consider this simple example: you get a support call about a certain page in your application that is not working correctly, maybe one of the fields is not being persisted like it should be. With my model, the developer assigned to find the problem opens exactly 3 files. An ASPX, an ASPX.CS and a SQL file with the stored procedure. The problem, which might be a missing parameter to the stored procedure call, takes minutes to solve. But with any entity model, you will invariably fire up the debugger, start stepping through code, and you may end up with 15-20 files open in Visual Studio. By the time you step down to the bottom of the stack, you forgot where you started. We can only keep so many things in our heads at one time. Software is incredibly complex without adding any unnecessary layers.
Development complexity and troubleshooting are just one side of my gripe.
Now let's talk about scalability.
Do developers realize that each and every time they write or modify any code that interacts with the database, they need to do a thorough analysis of the exact impact on the database? And not just the development copy, I mean a mimic of production, so you can see that the additional column you now require for your object just invalidated the current query plan and a report that was running in 1 second will now take 2 minutes, just because you added a single column to the select list? And it turns out that the index you now require is so big that the DBA is going to have to modify the physical layout of your files?
If you let people get too far away from the physical data store with an abstraction, they will create havoc with an application that needs to scale.
I am not a zealot. I can be convinced if I am wrong, and maybe I am, since there is such a strong push towards Linq to Sql, ADO.NET EF, Hibernate, Java EE, etc. Please think through your responses, if I am missing something I really want to know what it is, and why I should change my thinking.
[Edit]
It looks like this question is suddenly active again, so now that we have the new comment feature I have commented directly on several answers. Thanks for the replies, I think this is a healthy discussion.
I probably should have been more clear that I am talking about enterprise applications. I really can't comment on, say, a game that's running on someone's desktop, or a mobile app.
One thing I have to put up here at the top in response to several similar answers: orthogonality and separation of concerns often get cited as reasons to go entity/ORM. Stored procedures, to me, are the best example of separation of concerns that I can think of. If you disallow all other access to the database, other than via stored procedures, you could in theory redesign your entire data model and not break any code, so long as you maintained the inputs and outputs of the stored procedures. They are a perfect example of programming by contract (just so long as you avoid "select *" and document the result sets).
Ask someone who's been in the industry for a long time and has worked with long-lived applications: how many application and UI layers have come and gone while a database has lived on? How hard is it to tune and refactor a database when there are 4 or 5 different persistence layers generating SQL to get at the data? You can't change anything! ORMs or any code that generates SQL lock your database in stone.
I think it comes down to how complicated the "logic" of the application is, and where you have implemented it. If all your logic is in stored procedures, and all your application does is call those procedures and display the results, then developing entity objects is indeed a waste of time. But for an application where the objects have rich interactions with one another, and the database is just a persistence mechanism, there can be value to having those objects.
So, I'd say there is no one-size-fits-all answer. Developers do need to be aware that, sometimes, trying to be too OO can cause more problems than it solves.
Theory says that highly cohesive, loosely coupled implementations are the way forward.
So I suppose you are questioning that approach, namely separating concerns.
Should my aspx.cs file be interacting with the database, calling a sproc, and understanding IDataReader?
In a team environment, especially where you have less technical people dealing with the aspx portion of the application, I don't need these people being able to "touch" this stuff.
Separating my domain from my database protects me from structural changes in the database, surely a good thing? Sure database efficacy is absolutely important, so let someone who is most excellent at that stuff deal with that stuff, in one place, with as little impact on the rest of the system as possible.
Unless I am misunderstanding your approach, one structural change in the database could have a large impact area with the surface of your application. I see that this separation of concerns enables me and my team to minimise this. Also any new member of the team should understand this approach better.
Also, your approach seems to advocate the business logic of your application to reside in your database? This feels wrong to me, SQL is really good at querying data, and not, imho, expressing business logic.
Interesting thought though, although it feels one step away from SQL in the aspx, which from my bad old unstructured asp days, fills me with dread.
One reason - separating your domain model from your database model.
What I do is use Test Driven Development, so I write my UI and Model layers first and the Data layer is mocked. The UI and model are built around domain-specific objects, and later I map these objects to whatever technology I'm using in the Data Layer. It's a bad idea to let the database structure determine the design of your application. Where possible, write the app first and let that influence the structure of your database, not the other way around.
For me it boils down to I don't want my application to be concerned with how the data is stored. I'll probably get slapped for saying this...but your application is not your data, data is an artifact of the application. I want my application to be thinking in terms of Customers, Orders and Items, not a technology like DataSets, DataTables and DataRows...cuz who knows how long those will be around.
I agree that there is always a certain amount of coupling, but I prefer that coupling to reach upwards rather than downwards. I can tweak the limbs and leaves of a tree more easily than I can alter its trunk.
I tend to reserve sprocs for reporting, as the queries do tend to get a little nastier than the application's general data access.
I also tend to think that, with proper unit testing early on, scenarios like that one column not being persisted are unlikely to be a problem.
Eric,
You are dead on. For any really scalable / easily maintained / robust application the only real answer is to dispense with all the garbage and stick to the basics.
I've followed a similar trajectory with my career and have come to the same conclusions. Of course, we're considered heretics and looked at funny. But my stuff works and works well.
Every line of code should be looked at with suspicion.
I would like to answer with an example similar to the one you proposed.
At my company I had to build a simple CRUD section for products. I built all my entities and a separate DAL. Later another developer had to change a related table, and he even renamed several fields. The only file I had to change to update my form was the DAL for that table.
What (in my opinion) entities bring to a project is:
Orthogonality: Changes in one layer might not affect other layers (of course, if you make a huge change to the database it will ripple through all the layers, but most small changes won't).
Testability: You can test your logic without touching your database. This increases performance on your tests (allowing you to run them more frequently).
Separation of concerns: In a big product you can assign the database to a DBA and he can optimize the hell out of it. Assign the Model to a business expert that has the knowledge necessary to design it. Assign individual forms to developers more experienced on webforms, etc.
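The testability point in the list above can be illustrated with a small sketch. It assumes the business logic only depends on a tiny "store" interface, so a test can swap the real DAL for an in-memory stand-in; `Product`, `InMemoryProductStore` and `discounted_price` are hypothetical names, written in Python for brevity.

```python
from dataclasses import dataclass


@dataclass
class Product:
    sku: str
    price_cents: int


class InMemoryProductStore:
    """Stand-in for the real DAL: same get() method, no database behind it."""

    def __init__(self, products):
        self._by_sku = {p.sku: p for p in products}

    def get(self, sku):
        return self._by_sku[sku]


def discounted_price(store, sku, percent):
    # Business logic under test: it only knows about Product, not about tables.
    product = store.get(sku)
    return product.price_cents * (100 - percent) // 100


store = InMemoryProductStore([Product("A-1", 2000)])
price = discounted_price(store, "A-1", 25)
```

If a field gets renamed in the database, only the real store implementation has to change; `discounted_price` and its tests stay as they are.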
Finally I would like to add that most ORM mappers support stored procedures since that's what you are using.
Cheers.
I think you may be "biting off more than you can chew" on this topic. Ted Neward was not being flippant when he called it the "Vietnam of Computer Science".
One thing I can absolutely guarantee you is that it will change nobody's point of view on the matter, as has been proven so often on innumerable other blogs, forums, podcasts etc.
It's certainly ok to have open discussion and debate about a controversial topic, it's just that this one has been done so many times that both "sides" have agreed to disagree and just got on with writing software.
If you want to do some further reading on both sides, see articles on Ted's blog, Ayende Rahien, Jimmy Nilsson, Scott Bellware, Alt.Net, Stephen Forte, Eric Evans etc.
@Dan, sorry, that's not the kind of thing I'm looking for. I know the theory. Your statement "is a very bad idea" is not backed up by a real example. We are trying to develop software in less time, with fewer people, with fewer mistakes, and we want the ability to easily make changes. Your multi-layer model, in my experience, is a negative in all of the above categories. Especially with regards to making the data model the last thing you do. The physical data model must be an important consideration from day 1.
I found your question really interesting.
Usually I need entity objects to encapsulate the business logic of an application. It would be really complicated and inadequate to push this logic into the data layer.
What would you do to avoid these entity objects? What solution do you have in mind?
Entity Objects can facilitate caching on the application layer. Good luck caching a datareader.
We should also talk about the notion of what entities really are.
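As a rough illustration of application-layer caching of entities, here is a minimal sketch in Python with sqlite3. The `person` table, the `Person` class and `PersonCache` are assumptions made up for the example, and there is deliberately no invalidation or expiry logic, which a real cache would need.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    id: int
    name: str


class PersonCache:
    """Keeps already-loaded Person entities in memory, keyed by id."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}
        self.queries = 0  # counts how often the database was actually hit

    def get(self, person_id):
        if person_id not in self._cache:
            self.queries += 1
            row = self._conn.execute(
                "SELECT id, name FROM person WHERE id = ?", (person_id,)
            ).fetchone()
            self._cache[person_id] = Person(*row)
        return self._cache[person_id]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Eric')")

people = PersonCache(conn)
first = people.get(1)
second = people.get(1)  # served from memory, no second query
```

A forward-only reader has nothing comparable to hold on to once it has been consumed, which is the point being made.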
When I read through this discussion, I get the impression that most people here are looking at entities in the sense of an Anemic Domain Model.
A lot of people consider the Anemic Domain Model an antipattern!
There is value in rich domain models. That is what Domain Driven Design is all about.
I personally believe that OO is a way to conquer complexity. This means not only technical complexity (like data-access, ui-binding, security ...) but also complexity in the business domain!
If we can apply OO techniques to analyze, model, design and implement our business problems, this is a tremendous advantage for maintainability and extensibility of non-trivial applications!
There are differences between your entities and your tables. Entities should represent your model, tables just represent the data-aspect of your model!
It is true that data lives longer than apps, but consider this quote from David Laribee: Models are forever ... data is a happy side effect.
Some more links on this topic:
Why Setters and Getters are evil
Return of pure OO
POJO vs. NOJO
Super Models Part 2
TDD, Mocks and Design
Really interesting question. Honestly I can not prove why entities are good. But I can share my opinion why I like them. Code like
void exportOrder(Order order, String fileName){...};
is not concerned with where the order came from: the DB, a web request, a unit test, etc. It makes this method declare more explicitly what exactly it requires, instead of taking a DataRow and documenting which columns it expects to have and which types they should be. The same applies if you implement it somehow as a stored procedure: you still need to pass a record id to it, even though the order does not necessarily have to be present in the DB.
The implementation of this method would be based on the Order abstraction, not on how exactly it is represented in the DB. Most such operations that I have implemented really do not depend on how the data is stored. I do understand that some operations require coupling with the DB structure for performance and scalability purposes; in my experience there are just not too many of them. In my experience, very often it is enough to know that Person has .getFirstName() returning String and .getAddress() returning Address, and that Address has .getZipCode(), etc., and not care which tables are involved in storing that data.
If you have to deal with problems like the ones you described, such as an additional column breaking report performance, then for your tasks the DB is a critical part, and you indeed should be as close to it as possible. While entities can provide some convenient abstractions, they can hide some important details as well.
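A small sketch of what such a storage-agnostic export might look like, here in Python instead of Java. The fields of `Order` and the JSON output format are assumptions for the example, and the function takes an open file-like object rather than a file name to keep the sketch self-contained.

```python
import io
import json
from dataclasses import dataclass, asdict


@dataclass
class Order:
    order_id: int
    customer: str
    total_cents: int


def export_order(order, out):
    # Works on the Order abstraction only; it neither knows nor cares
    # whether the order came from a DB row, a web request or a unit test.
    json.dump(asdict(order), out)


# This order was built by hand and never existed in any database.
buffer = io.StringIO()
export_order(Order(order_id=7, customer="ACME", total_cents=1250), buffer)
exported = json.loads(buffer.getvalue())
```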
Scalability is an interesting point here: most websites which require enormous scalability (like facebook, livejournal, flickr) tend to use a DB-ascetic approach, where the DB is used as rarely as possible and scalability issues are solved by caching, especially in RAM. http://highscalability.com/ has some interesting articles on it.
There are other good reasons for entity objects besides abstraction and loose coupling. One of the things I like most is the strong typing that you can't get with a DataReader or a DataTable. Another reason is that when done well, proper entity classes can make the code more maintainable by using first-class constructs for domain-specific terms that anyone looking at the code is likely to understand rather than a bunch of strings with field names in them used for indexing a DataRow. Stored procedures are really orthogonal to the use of an ORM since a lot of mapping frameworks give you the ability to map to sprocs.
I wouldn't consider sprocs + datareaders a substitute for a good ORM. With stored procedures, you're still constrained by, and tightly-coupled to, the procedure's type signature, which uses a different type system than the calling code. Stored procedures can be subject to modification to accommodate additional options and schema changes. An alternative to stored procedures in the case where the schema is subject to change is to use views--you can map objects to views and then re-map views to the underlying tables when you change them.
I can understand your aversion to ORMs if your experience mainly consists of Java EE and CSLA. You might want to have a look at LINQ to SQL, which is a very lightweight framework and is primarily a one-to-one mapping with the database tables but usually only needs minor extension for them to be full-blown business objects. LINQ to SQL can also map input and output objects to stored procedures' parameters and results.
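The view idea can be sketched like this, using Python and sqlite3 purely for illustration. The table names `person` / `person2`, the view `person_v` and the helper `load_names` are invented; the point is only that the reading code keeps querying the view while the view is re-mapped after a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original schema: a single full_name column, exposed through a view.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ada Lovelace')")
conn.execute("CREATE VIEW person_v AS SELECT id, full_name AS name FROM person")


def load_names(conn):
    # Application code is mapped to the view, never to the base table.
    return [name for (name,) in conn.execute("SELECT name FROM person_v ORDER BY id")]


before = load_names(conn)

# Schema change: the name is now stored as two columns in a new table.
conn.execute("CREATE TABLE person2 (id INTEGER PRIMARY KEY, first TEXT, last TEXT)")
conn.execute("INSERT INTO person2 VALUES (1, 'Ada', 'Lovelace')")

# Re-map the view to the new layout; load_names() is left untouched.
conn.execute("DROP VIEW person_v")
conn.execute(
    "CREATE VIEW person_v AS SELECT id, first || ' ' || last AS name FROM person2"
)

after = load_names(conn)
```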
The ADO.NET Entity framework has the added advantage that your database tables can be viewed as entity classes inheriting from each other, or as columns from multiple tables aggregated into a single entity. If you need to change the schema, you can change the mapping from the conceptual model to the storage schema without changing the actual application code. And again, stored procedures can be used here.
I think that more IT projects in enterprises fail because of unmaintainability of the code or poor developer productivity (which can happen from, e.g., context switching between sproc-writing and app-writing) than scalability problems of an application.
I would also like to add to Dan's answer that separating both models could enable your application to be run on different database servers or even database models.
What if you need to scale your app by load balancing more than one web server? You could install the full app on all web servers, but a better solution is to have the web servers talk to an application server.
But if there aren't any entity objects, they won't have very much to talk about.
I'm not saying that you shouldn't write monoliths if it's a simple, internal, short-lived application. But as soon as it gets moderately complex, or it should last a significant amount of time, you really need to think about a good design.
This saves time when it comes to maintaining it.
By splitting application logic from presentation logic and data access, and by passing DTOs between them, you decouple them, allowing them to change independently.
You might find this post on comp.object interesting.
I'm not claiming to agree or disagree but it's interesting and (I think) relevant to this topic.
A question: How do you handle disconnected applications if all your business logic is trapped in the database?
In the type of Enterprise application I'm interested in, we have to deal with multiple sites, some of them must be able to function in a disconnected state.
If your business logic is encapsulated in a Domain layer that is simple to incorporate into various application types -say, as a dll- then I can build applications that are aware of the business rules and are able, when necessary, to apply them locally.
In keeping the Domain layer in stored procedures on the database you have to stick with a single type of application that needs a permanent line-of-sight to the database.
It's ok for a certain class of environments, but it certainly doesn't cover the whole spectrum of Enterprise applications.
@jdecuyper, one maxim I repeat to myself often is "if your business logic is not in your database, it is only a recommendation". I think Paul Nielsen said that in one of his books. Application layers and UI come and go, but data usually lives for a very long time.
How do I avoid entity objects? Stored procedures mostly. I also freely admit that business logic tends to reach through all layers in an application whether you intend it to or not. A certain amount of coupling is inherent and unavoidable.
I have been thinking about this same thing a lot lately; I was a heavy user of CSLA for a while, and I love the purity of saying that "all of your business logic (or at least as much as is reasonably possible) is encapsulated in business entities".
I have seen the business entity model provide a lot of value in cases where the design of the database is different than the way you work with the data, which is the case in a lot of business software.
For example, the idea of a "customer" may consist of a main record in a Customer table, combined with all of the orders the customer has placed, as well as all the customer's employees and their contact information, and some of the properties of a customer and its children may be determined from lookup tables. It's really nice from a development standpoint to be able to work with the Customer as a single entity, since from a business perspective, the concept of Customer contains all of these things, and the relationships may or may not be enforced in the database.
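Here is a rough sketch of working with such a Customer as a single entity even though its data lives in more than one table. It is Python with sqlite3 for illustration only; the `customer` and `customer_order` tables, the classes and `load_customer` are assumptions, and the employees and lookup tables from the example are left out to keep it short.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: int
    total_cents: int


@dataclass
class Customer:
    customer_id: int
    name: str
    orders: list = field(default_factory=list)

    def lifetime_value(self):
        # Business-level question answered on the entity, not on table rows.
        return sum(order.total_cents for order in self.orders)


def load_customer(conn, customer_id):
    # One entity, assembled from two tables.
    cid, name = conn.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    orders = [
        Order(order_id, total)
        for order_id, total in conn.execute(
            "SELECT id, total_cents FROM customer_order "
            "WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
    ]
    return Customer(cid, name, orders)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE customer_order (id INTEGER PRIMARY KEY, customer_id INTEGER, total_cents INTEGER)"
)
conn.execute("INSERT INTO customer VALUES (1, 'ACME')")
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    [(10, 1, 500), (11, 1, 750), (12, 2, 999)],
)

acme = load_customer(conn, 1)
```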
While I appreciate the quote that "if your business rule is not in your database, it's only a suggestion", I also believe that you shouldn't design the database to enforce business rules, you should design it to be efficient, fast and normalized.
That said, as others have noted above, there is no "perfect design", the tool has to fit the job. But using business entities can really help with maintenance and productivity, since you know where to go to modify business logic, and objects can model real-world concepts in an intuitive way.
Eric,
No one is stopping you from choosing the framework/approach that you would wish. If you are going to go the "data driven/stored procedure-powered" path, then by all means, go for it! Especially if it really, really helps you deliver your applications on-spec and on-time.
The caveat being (a flipside to your question that is), ALL of your business rules should be on stored procedures, and your application is nothing more than a thin client.
That being said, same rules apply if you do your application in OOP : be consistent. Follow OOP's tenets, and that includes creating entity objects to represent your domain models.
The only real rule here is the word consistency. Nobody is stopping you from going DB-centric. No one is stopping you from doing old-school structured (aka, functional/procedural) programs. Hell, no one is stopping anybody from doing COBOL-style code. BUT an application has to be very, very consistent once going down this path, if it wishes to attain any degree of success.
I'm really not sure what you consider "Enterprise Applications". But I'm getting the impression you are defining it as an Internal Application where the RDBMS would be set in stone and the system wouldn't have to be interoperable with any other systems whether internal or external.
But what if you had a database with 100 tables? At 4 stored procedures per table just for basic CRUD operations, that's 400 stored procedures which need to be maintained, which aren't strongly typed and so are susceptible to typos, and which can't be unit tested. What happens when you get a new CTO who is an Open Source Evangelist and wants to change the RDBMS from SQL Server to MySQL?
A lot of software today whether Enterprise Applications or Products are using SOA and have some requirements for exposing Web Services, at least the software I am and have been involved with do.
Using your approach you would end up exposing a Serialized DataTable or DataRows. Now this may be deemed acceptable if the Client is guaranteed to be .NET and on an internal network. But when the Client is not known then you should be striving to Design an API which is intuitive and in most cases you would not want to be exposing the Full Database schema.
I certainly wouldn't want to explain to a Java developer what a DataTable is and how to use it. There's also the consideration of bandwidth and payload size, and serialized DataTables and DataSets are very heavy.
There is no silver bullet with software design and it really depends on where the priorities lie; for me it's in unit-testable code and loosely coupled components that can be easily consumed by any client.
just my 2 cents
I'd like to offer another angle to the problem of distance between OO and RDB: history.
Any software has a model of reality that is to some degree an abstraction of reality. No computer program can capture all the complexities of reality, and programs are written just to solve a set of problems from reality. Therefore any software model is a reduction of reality. Sometimes the software model forces reality to reduce itself. Like when you want the car rental company to reserve any car for you as long as it is blue and has alloys, but the operator can't comply because your request won't fit in the computer.
RDB comes from a very old tradition of putting information into tables, called accounting. Accounting was done on paper, then on punch cards, then in computers. But accounting is already a reduction of reality. Accounting has forced people to follow its system so long that it has become accepted reality. That's why it is relatively easy to make computer software for accounting, accounting has had its information model, long before the computer came along.
Given the importance of good accounting systems, and the acceptance you get from any business managers, these systems have become very advanced. The database foundations are now very solid and no one hesitates about keeping vital data in something so trustworthy.
I guess that OO must have come along when people have found that other aspects of reality are harder to model than accounting (which is already a model). OO has become a very successful idea, but persistence of OO data is relatively underdeveloped. RDB/Accounting has had easy wins, but OO is a much larger field (basically everything that isn't accounting).
So many of us have wanted to use OO but we still want safe storage of our data. What can be safer than to store our data the same way as the esteemed accounting system does? It is an enticing prospect, but we all run into the same pitfalls. Very few have taken the trouble to think of OO persistence compared to the massive efforts by the RDB industry, which has had the benefit of accounting's tradition and position.
Prevayler and db4o are some suggestions, and I'm sure there are others I haven't heard of, but none seem to have gotten half the press of, say, Hibernate.
Storing your objects in good old files doesn't even seem to be taken seriously for multiuser applications, and especially web applications.
In my everyday struggle to close the chasm between OO and RDB I use OO as much as possible but try to keep inheritance to a minimum. I don't often use SPs. I'll use the advanced query stuff only in aspects that look like accounting.
I'll be happily surprised when the chasm is closed for good. I think the solution will come when Oracle launches something like "Oracle Object Instance Base". To really catch on, it will have to have a reassuring name.
Not a lot of time at the moment, but just off the top of my head...
The entity model lets you give a consistent interface to the database (and other possible systems) even beyond what a stored procedure interface can do. By using enterprise-wide business models you can make sure that all applications affect the data consistently which is a VERY important thing. Otherwise you end up with bad data, which is just plain evil.
If you only have one application then you don't really have an "enterprise" system, regardless of how big that application or your data are. In that case you can use an approach similar to what you talk about. Just be aware of the work that will be needed if you decide to grow your systems in the future.
Here are a few things that you should keep in mind (IMO) though:
Generated SQL code is bad (exceptions to follow). Sorry, I know that a lot of people think that it's a huge time saver, but I've never found a system that could generate more efficient code than what I could write and often the code is just plain horrible. You also often end up generating a ton of SQL code that never gets used. The exception here is very simple patterns, like maybe lookup tables. A lot of people get carried away on it though.
Entities <> Tables (or even logical data model entities necessarily). A data model often has data rules that should be enforced as closely to the database as possible which can include rules around how table rows relate to each other or other similar rules that are too complex for declarative RI. These should be handled in stored procedures. If all of your stored procedures are simple CRUD procs, you can't do that. On top of that, the CRUD model usually creates performance issues because it doesn't minimize round trips across the network to the database. That's often the biggest bottleneck in an enterprise application.
Sometimes, your application and data layer are not that tightly coupled. For example, you may have a telephone billing application. You later create a separate application which monitors phone usage to a) better advertise to you b) optimise your phone plan.
These applications have different concerns and data requirements (even if the data is coming out of the same database), so they would drive different designs. Your code base can end up an absolute mess (in either application) and a nightmare to maintain if you let the database drive the code.
Applications that have domain logic separated from the data storage logic are adaptable to any kind of data source (database or otherwise) or UI (web or windows(or linux etc.)) application.
You're pretty much stuck in your database, which isn't bad if you're with a company that is satisfied with the current database system you're using. However, because databases evolve over time, there might be a new database system that is really neat and new that your company wants to use. What if they wanted to switch to a web services method of data access (like service-oriented architecture sometimes does)? You might have to port your stored procedures all over the place.
Also the domain logic abstracts away the UI, which can be more important in large complex systems that have ever evolving UIs (especially when they are constantly searching for more customers).
Also, while I agree that there is no definitive answer to the question of stored procedures versus domain logic, I'm in the domain logic camp (and I think they are winning over time), because I believe that elaborate stored procedures are harder to maintain than elaborate domain logic. But that's a whole other debate.
I think that you are just used to writing a specific kind of application, and solving a certain kind of problem. You seem to be attacking this from a "database first" perspective. There are lots of developers out there where data is persisted to a DB but performance is not a top priority. In lots of cases putting an abstraction over the persistence layer simplifies code greatly and the performance cost is a non-issue.
Whatever you are doing, it's not OOP. It's not wrong, it's just not OOP, and it doesn't make sense to apply your solutions to every other problem out there.
Interesting question. A couple thoughts:
How would you unit test if all of your business logic was in your database?
Wouldn't changes to your database structure, specifically ones that affect several pages in your app, be a major hassle to change throughout the app?
Good Question!
One approach I rather like is to create an iterator/generator object that emits instances of objects that are relevant to a specific context. Usually this object wraps some underlying database access stuff, but I don't need to know that when using it.
For example,
An AnswerIterator object generates AnswerIterator.Answer objects. Under the hood it's iterating over a SQL Statement to fetch all the answers, and another SQL statement to fetch all related comments. But when using the iterator I just use the Answer object that has the minimum properties for this context. With a little bit of skeleton code this becomes almost trivial to do.
I've found that this works well when I have a huge dataset to work on, and when done right, it gives me small, transient objects that are relatively easy to test.
It's basically a thin veneer over the Database Access stuff, but it still gives me the flexibility of abstracting it when I need to.
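A minimal sketch of such an iterator, in Python with sqlite3 for illustration. The table layout (`answer`, `answer_comment`) and the fields of `Answer` are assumptions, and `Answer` is a module-level class here rather than nested inside the iterator. For a really huge dataset you would probably stream the comments as well instead of collecting them in a dict first.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    answer_id: int
    body: str
    comments: tuple


class AnswerIterator:
    """Emits small Answer objects; callers never see the SQL underneath."""

    def __init__(self, conn, question_id):
        self._conn = conn
        self._question_id = question_id

    def __iter__(self):
        # Statement 1: all comments related to this question's answers.
        comments_by_answer = {}
        comment_rows = self._conn.execute(
            "SELECT c.answer_id, c.body FROM answer_comment c "
            "JOIN answer a ON a.id = c.answer_id "
            "WHERE a.question_id = ? ORDER BY c.id",
            (self._question_id,),
        )
        for answer_id, body in comment_rows:
            comments_by_answer.setdefault(answer_id, []).append(body)
        # Statement 2: the answers themselves, yielded one at a time.
        answer_rows = self._conn.execute(
            "SELECT id, body FROM answer WHERE question_id = ? ORDER BY id",
            (self._question_id,),
        )
        for answer_id, body in answer_rows:
            yield Answer(answer_id, body, tuple(comments_by_answer.get(answer_id, ())))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answer (id INTEGER PRIMARY KEY, question_id INTEGER, body TEXT)")
conn.execute("CREATE TABLE answer_comment (id INTEGER PRIMARY KEY, answer_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO answer VALUES (?, ?, ?)",
    [(1, 42, "Use views"), (2, 42, "Use sprocs"), (3, 7, "Other question")],
)
conn.executemany(
    "INSERT INTO answer_comment VALUES (?, ?, ?)",
    [(1, 1, "Agreed"), (2, 1, "Nice"), (3, 3, "Off topic")],
)

answers = list(AnswerIterator(conn, 42))
```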
The objects in my apps tend to relate one-to-one to the database, but I'm finding using Linq To Sql rather than sprocs makes it much easier writing complicated queries, especially being able to build them up using the deferred execution. e.g. from r in Images.User.Ratings where etc. This saves me trying to work out several join statements in sql, and having Skip & Take for paging also simplifies the code rather than having to embed the row_number & 'over' code.
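To show what building a query up with deferred execution plus Skip and Take looks like in spirit, here is a small Python sketch using lazy generators. Note the big difference: LINQ to SQL translates the composed query into SQL, while this example only filters and pages an in-memory list. The helper names `where` and `page` and the sample ratings are made up.

```python
from itertools import islice


def where(rows, predicate):
    # Lazy filter: nothing is evaluated until the result is iterated.
    return (row for row in rows if predicate(row))


def page(rows, skip, take):
    # Rough analogue of Skip(skip).Take(take), also lazy.
    return islice(rows, skip, skip + take)


ratings = [{"user": "u1", "stars": s} for s in (5, 2, 4, 5, 1, 3, 4)]

# Compose the query step by step; no work has been done yet.
query = where(ratings, lambda r: r["stars"] >= 3)
query = page(query, skip=2, take=2)

# Execution happens only here, when the query is consumed.
stars = [r["stars"] for r in query]
```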
Why stop at entity objects? If you don't see the value with entity objects in an enterprise level app, then just do your data access in a purely functional/procedural language and wire it up to a UI. Why not just cut out all the OO "fluff"?