Business rules in DMN or database table? - bpmn

I'm learning Camunda, the workflow engine. I understand that for some long-running processes, process modeling brings many tactical and strategic benefits such as expressiveness, fault tolerance and observability, with additional overhead of course.
The book I'm reading also advocates the use of DMN (decision tables) to bundle business rules inside the process model. The motive is to centralize maintenance and decouple configuration from the code. I'm taking this advice with a grain of salt, as decision tables seem somewhat clunky to work with. There is no strong typing and there are no powerful IDE features. I'm used to implementations where business parameters are stored in a database table and consumed by the application. The implementation also provides an admin GUI to maintain these parameters at runtime.
Why should I favor DMN over a more solid database-based solution?

You are moving the business logic to BPMN to get it out of the code, make it transparent in a graphical model, make it accessible to all stakeholders, support business-IT alignment, empower the business to own their business process/logic, support multi-version enactment at runtime, and more...
The same reasoning applies to business rules, which are too complex to be modeled out as graphs in BPMN diagrams. The DMN standard is also aimed at business people and the expression language used is intentionally kept simpler than an Excel formula. It is the "Friendly Enough Expression Language" (FEEL). So you see where this is going.
Database tables
are not well accessible to business users
do not allow flexible changes to the table structure(s)/schema at runtime
usually do not support multi-version enactment at runtime
do not support a graphical, logical decomposition of rules (into DRDs) unless you work with multiple tables, and even then DB schemas are not flexible
cannot be easily deployed to many systems
cannot be easily tested in unit test
likely do not automatically generate audit data, which is accessible for audits and analytics
These are just a few points. So, for business rules, definitely DMN over DB tables.
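To make the testing and versioning points above concrete, here is a minimal sketch of a decision table kept as plain data and evaluated with a first-hit policy. It is not Camunda's API and it does not use FEEL; the names `Rule`, `DISH_TABLE_V1` and `decide`, and the dish rules themselves, are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """One row of a decision table: two input conditions and one output."""
    season: Optional[str]            # None means "any value" for this input
    guests: Callable[[int], bool]    # condition on the guest count
    dish: str                        # output of the rule


# Hypothetical table, version 1. Rows are checked top to bottom and the
# first matching row wins (a "first hit" policy).
DISH_TABLE_V1 = [
    Rule("Winter", lambda g: g <= 8, "Spareribs"),
    Rule("Winter", lambda g: g > 8, "Pasta"),
    Rule("Summer", lambda g: True, "Salad"),
    Rule(None, lambda g: True, "Stew"),      # catch-all row
]


def decide(table, season, guests):
    """Return the output of the first rule whose conditions all match."""
    for rule in table:
        if rule.season in (None, season) and rule.guests(guests):
            return rule.dish
    return None
```

Because the table is just a value, a second version can live next to the first (for example a `DISH_TABLE_V2` list), and each version can be checked in an ordinary unit test without a database.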

Can Distributed SQL tools be applied as alternatives to 2 phase commit or sagas patterns for distributed transaction coordination?

I am currently reading the Microservices Patterns and it says there are mostly two approaches for distributed transactions: two phase commit (2PC) and sagas pattern.
Also, I've heard about currently evolving Distributed SQL (DSQL) tools like CockroachDB, YugabyteDB and YDB, which also support distributed ACID-like transactions via their own low-level db nodes communication.
So the question is, could the latter be applied as an alternative to the former ones?
To illustrate the question, consider the following typical microservices distributed transaction sample. Here we need 2PC or sagas for the red zone coordination.
What I want would be to completely eliminate the need to develop and support coordination from the business logic side moving it to the general DSQL engine:
On the one hand, it is clear that such approach somehow breaks the microservice's responsibility segregation principle. Also, as far as I understand, DSQL tools evolved mostly for replication/sharding tasks, and not for the microservices' business logic coordination. On the other hand, it would very much simplify developing and supporting such solutions.
I think it depends what you want to de-couple.
With distributed SQL databases, many operations on one database have no impact on the other databases in the same cluster, such as rolling upgrades (vs. monolithic databases, where you have to take down all applications sharing the same database), scaling up, and scaling out.
You can also, within the same cluster, dedicate nodes to specific applications, or move them to different regions. And with PostgreSQL compatibility, you can serve many use cases (relational, JSON, key-value, timeseries...) with the same database. For these, you benefit from sharing the same infrastructure, managed service provider, skills... and still de-couple applications.
With the need to de-couple more, like replicating asynchronously, YugabyteDB has xCluster replication. And, there are many levels of coordination possible, in the SQL layer, between the application and the data. The PostgreSQL compatibility comes with triggers, which can call external actions, or Foreign Data Wrappers, which can interact with other databases with a standard API.
So I would say distributed databases bring more possibilities between de-coupling everything (like the choice of the database vendor) and full consolidation that would impact applications.
Having separate databases in a microservice architecture has a few different benefits. Some are obviated by using a distributed database--CockroachDB's resilience, scalability, and dynamic load balancing mean that you generally won't need to worry about tuning your database to the workload of an individual service. But others aren't. Having separate databases forces your services to interact through well-defined APIs, which prevents their logic from becoming tightly coupled, allowing them to develop and update independently. If you're using a shared database to enforce consistency, you're necessarily introducing logical coupling no matter the mechanism, in that you need a shared understanding of what needs to be consistent. This may be unavoidable, in which case the key is really to make the consistency logic explicit and legible, which can be a bit trickier when implemented at the Raft level than in an API.
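For readers who want to see what "coordination from the business logic side" means in the question, here is a minimal saga sketch: each step has a compensating action, and a failure undoes the completed steps in reverse order. The services are in-memory stand-ins, and every name here (`run_saga`, `reserve_stock`, `charge_card` and so on) is hypothetical. With a single distributed ACID transaction, this bookkeeping would be replaced by one commit or rollback.

```python
class SagaFailed(Exception):
    """Raised after a failed step once earlier steps have been compensated."""


def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of all previously completed
    steps run in reverse order, and SagaFailed is raised.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for undo in reversed(completed):
                undo()
            raise SagaFailed(str(exc)) from exc
        completed.append(compensate)


# Hypothetical in-memory stand-ins for two services, each with its own data.
stock = {"widget": 5}
payments = []


def reserve_stock():
    stock["widget"] -= 1


def release_stock():
    stock["widget"] += 1


def charge_card():
    # Simulated failure in the second service.
    raise RuntimeError("card declined")


def refund_card():
    payments.pop()


def place_order():
    """Return True if the order went through, False if it was rolled back."""
    try:
        run_saga([(reserve_stock, release_stock), (charge_card, refund_card)])
        return True
    except SagaFailed:
        return False
```

Note that the compensation logic encodes exactly the "shared understanding of what needs to be consistent" mentioned above; a shared database moves that understanding into the schema rather than removing it.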

When is tight coupling essential or a good thing?

From all my reading and research on OO design/patterns/principles, I've found that the general consensus is that loose coupling (and high cohesion) is almost always the better design. I completely agree, speaking from my past software project experience.
But let's say some particular software company (which I don't work at) has some questionably designed large-scale software that interacts with some hardware. The modules (that I never worked on) are tightly coupled, with function calls that go 20+ levels deep to manage state. Class boundaries are never clearly defined and use cases are poorly thought out. A good software developer (not me) would bring up these issues but only get turned down by the more senior developers, who say that development practices (like SOLID or TDD) don't really apply because the software has worked for years using the "traditional" methodology, and it's too late to change. And the biggest complaints from the customers (I don't know who they are) are about the quality of the product.
Because of the above unrealistic scenario (which I was never a part of), I wondered whether there are cases where tight coupling is preferred or even required. When does a developer need to cross module boundaries, share state, increase dependencies and reduce testability? What are some examples of systems that are so complex that they would require this? I couldn't come up with a good case myself, so I'm hoping some of the more experienced craftsmen can help me out.
Thanks. Again, I don't know this company.
A tightly coupled architecture integrates enterprise applications around a single point of truth, which is often a single spatially-enabled RDBMS. The types of applications that are linked include engineering design (CAD), facility records management (GIS), asset management, workflow, ERP, CRM, outage management, and other enterprise applications.
A major advantage of a tightly coupled architecture is that it enables the rapid and efficient processing of large volumes of data, provides a single point of truth instead of several, often redundant, data sources, and enables open access to data throughout the organization.
Tightly coupled architectures rely on standards such as SQL, ODBC, JDBC, and OLEDB, SQL/MM, and the Simple Feature Specification for SQL from the OGC, to provide open and secure access to data, including geo-spatial data, throughout the organization.
Loosely coupled Web services require substantial redundancies unlike tight coupling between clients and service, which minimizes redundancies.
One problem with asynchronous, loosely coupled Web services is that some business functions can exceed the resource capacity of the message queuing servers or system.
Loosely coupled Web services can be made to switch to tight coupling mode to avoid system overloads of scarce resources.

Any advantages when using an ORM Tool (Framework)?

I searched over .... I see many advantages, but it seems that all the advantages come from a comparison with in-line SQL. I know in-line SQL is bad. But why compare with a bad option to make the other look better?
If stored procedures are used (possibly exclusively), it seems none of these advantages still exist. Stored procedures definitely provide advantages in terms of security and performance (if an ORM can outrun a stored procedure, then the stored procedure is badly written), and a well-written stored procedure is an automatic repository (pattern). Stored procedures can definitely provide better transaction and transaction isolation control.
I really appreciate an answer -- how ORM is better over a well architected application using stored procedures.
--- Thanks for all the answers that I have received so far ... It seems that the advantages still come from comparing an ORM's "dynamically generated SQL" with "statically written in-line SQL" in the code. Yes, it has advantages. But that is not the question.
The question is better stated as the following:
If you consider having the stored procedures implement your business logic (SPs can be written in very advanced and very efficient ways), then in the application code (.NET, Java) you have a very thin wrapper layer around the stored procedures, organized by business need. My question is how an ORM outperforms this architecture (a well-designed one, of course).
ORM tools make it possible to develop an abstraction layer between the database and the model in the OO environment. The main advantage of this layer is that developers who are not familiar with SQL can work with the model.
I have been seeking a good answer myself. Here is what I feel makes the difference:
1) ORM increases the developer productivity - mapping domain class to database is easier.
2) Stored Procs can potentially contain business logic - it is difficult to test these. This is mainly because of lack of tools/mocking framework.
3) ORM frameworks are well tested and give you features like caching out of the box, so there is no need to reinvent the wheel. Most applications I've seen that do not use an ORM end up writing an in-house data layer that an ORM offers out of the box.
That being said - ORM does add some overhead as well, and it requires the developers to be aware of a new platform - writing efficient mapping comes with practise so there is a learning curve.
In the modern day setup, network bandwidth isn't as precious as rapid development and good quality (well tested) code. I guess this makes ORM well suited for database driven apps.
An ORM is a tool that can be used to build what you call a "well architected system". The idea is that when you are developing in a non-Relational language, there will be an impedance mismatch between the relational operation set provided by SQL/Stored Procedures and the language that you are using to build the rest of your application.
For developers using an object-oriented language (whether it is C++, C#, or Java) there are many considerations when mapping a complex relational schema into a rich Domain Model. It is certainly possible to perform all of this mapping in your own code, but as your interactions in this "no-man's-land" between OO and Relational paradigms grow more complex the more useful an ORM engine and associated tooling can be.
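As a rough illustration of the mapping work described above, this sketch loads one order and its lines into domain objects by hand, using Python's built-in `sqlite3` module. The schema and the names `Order`, `OrderLine`, `load_order` and `make_demo_db` are assumptions for the example. This is the kind of code an ORM would generate or replace, and it grows quickly once inheritance, lazy loading or change tracking enter the picture.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product: str
    quantity: int


@dataclass
class Order:
    id: int
    customer: str
    lines: list = field(default_factory=list)


def make_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
        CREATE TABLE order_lines (
            id INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL REFERENCES orders(id),
            product TEXT NOT NULL,
            quantity INTEGER NOT NULL
        );
        INSERT INTO orders VALUES (1, 'Alice');
        INSERT INTO order_lines (order_id, product, quantity)
            VALUES (1, 'Widget', 2), (1, 'Gadget', 1);
    """)
    return conn


def load_order(conn, order_id):
    """Hand-written mapping: two queries, rows copied into objects."""
    row = conn.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    order = Order(id=row[0], customer=row[1])
    lines = conn.execute(
        "SELECT product, quantity FROM order_lines "
        "WHERE order_id = ? ORDER BY id",
        (order_id,),
    )
    for product, quantity in lines:
        order.lines.append(OrderLine(product, quantity))
    return order
```

Every developer on a team can write this function slightly differently, which is part of the consistency argument for choosing one established mapping tool.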
Some considerations as you plan out your mapping layer:
Do you need to manage single-table or multi-table inheritance?
Do you want to leverage lazy loading?
Do you want to manually keep classes and tables synchronized or are you planning on using a tool to generate per-table classes (such as with a DataSet)?
Another consideration, especially when working in a team, is that when relational to domain layer mapping is performed by hand, there can be a great deal of variation in the way developers write the mapping. This can lead to inconsistencies, overlapping, and gaps that are difficult to detect. The selection of an ORM (especially a well known / solidly established ORM) can have an enormous (hopefully positive) impact on the solution and the pre-existing community surrounding that ORM will shape how you conceive of the mapping layer (you will find that there are significant cultural differences between Spring.NET and Entity Framework users, for instance).
Does an ORM make a good architecture? No. Are there systems whose architectures would be better off with an ORM? Definitely. Are there projects that have been crippled by the unnecessary addition of an ORM? I'm guessing that there are many.
I suggest approaching this question from a different angle, and apply it to the specific application you are working on. Do you have any pain points by using SQL and/or Stored Procedures that an ORM might solve? Do you see any risks or have any concerns over problems that the introduction of an ORM might cause? Only by weighing the answers to these questions will you be able to determine if an ORM is a good fit for any given solution.

Should the data access layer contain business logic?

I've seen a trend to move business logic out of the data access layer (stored procedures, LINQ, etc.) and into a business logic component layer (like C# objects).
Is this considered the "right" way to do things these days? If so, does this mean that some database developer positions may be eliminated in favor of more middle-tier coding positions? (i.e. more c# code rather than more long stored procedures.)
If the application is small with a short lifetime, then it's not worth putting time into abstracting the concerns into layers. In larger, long-lived applications your logic/business rules should not be coupled to the data access. It creates a maintenance nightmare as the application grows.
Moving concerns into their own layers, also known as separation of concerns, has been around for a while:
Wikipedia: "The term separation of concerns was probably coined by Edsger W. Dijkstra in his 1974 paper 'On the role of scientific thought'."
For Application Architecture a great book to start with is Domain Driven Design. Eric Evans breaks down the different layers of the application in detail. He also discusses the database impedance and what he calls a "Bounded Context"
Bounded Context
A blog is a system that displays posts from newest to oldest so that people can comment on. Some would view this as one system, or one "Bounded Context." If you subscribe to DDD, one would say there are two systems or two "Bounded Contexts" in a blog: A commenting system and a publication system. DDD argues that each system is independent (of course there will be interaction between the two) and should be modeled as such. DDD gives concrete guidance on how to separate the concerns into the appropriate layers.
Other resources that might interest you:
Domain Driven Design Quickly
Applying Domain Driven Design and Patterns
Clean Code
Working Effectively with Legacy Code
Refactor
Until I had a chance to experience The Big Ball of Mud or Spaghetti Code I had a hard time understanding why Application Architecture was so important...
The right way to do things will always depend on the size, availability requirements and lifespan of your application. To use stored procs or not to use stored procs... Tools such as NHibernate and LINQ to SQL are great for small to mid-size projects. To make myself clear, I've never used NHibernate or LINQ to SQL on a large application, but my gut feeling is that an application will reach a size where optimizations will need to be done on the database server via views, stored procedures, etc. to keep the application performant. To do this work, developers with both development and database skills will be needed.
Data access logic belongs in the data access layer, business logic belongs in the business layer. I don't see how mixing the two could ever be considered a good idea from a design standpoint.
Separation of layers does not automatically mean not using stored procedures for business logic. This separation is equally possible:
Presentation Layer: .Net, PHP, whatever
Business Layer: Stored Procedures
Data Layer: Stored Procedures or DML
This works very well with Oracle, for example, where the business layer may be implemented in packages in a different schema from the data layer (to enforce proper separation of concerns).
What matters is the separation of concerns, not the language/technology used at each level.
(I expect to get roundly flamed for this heresy!)
The perfect world doesn't exist. It's about elegance versus what works better.
Executing complex SQL queries inside the data access layer is much more performant than having a service request data many times and then merge and transform it. When you write complex queries, you are putting business logic in those queries.
It really depends on the requirements. Either way is fine, as long as it's NOT "behind the button", as it were. I think stored procedures are better for "classic" client-server apps with changing needs. A strict middle "business logic" layer is better for apps that need to be very scalable, run on multiple database platforms, etc.
If you are building a layered architecture, and the architecture contains a dedicated business layer, then of course you should put business logic there. However, you can ask any five designers/architects/developers what 'business logic' actually is, and get six different answers. (Hey, I'm an architect myself, so I know all about 'on the one hand, but on the other'!). Is navigating an object graph part of the data layer or business layer? Depends on which EAA patterns you are using, and on exactly how complicated/clever your domain objects are. Or is it perhaps even part of your presentation?
But in more concrete terms: database development tools tend to lag behind Eclipse/Visual Studio/NetBeans, and stored procedures have never been extremely comfortable for large-scale development. Yes, of course you can code everything in TSQL, PL/SQL, etc., but there's a price to pay. What's more, having several languages and platforms involved in one solution increases maintenance costs and delays. On the other hand, moving data access out of reach of DBAs can cause other headaches, especially in shared infrastructure environments with any kind of availability requirements. But overall, yes, modern tools and languages are currently moving logic from the data(base) layer into the application layer. We'll have to see how well it works out and scales.
The reason I've seen this trend is that LINQ and LINQ to SQL ORM give you a nice type-safe alternative to stored procedures.
What's "right" is whether you benefit from doing this personally.
Yes, business logic should be in the business logic layer. For me this is the biggest drawback of using stored procedures for everything and thus moving some of the business rules to the DB. I prefer to have that logic in the BLL and have the DAL only communicate with the DB.
It is ALWAYS a good idea to separate your layers. I can't tell you the number of times I've seen stored procedures that are VERY gnarly from lots of business logic written into the sproc. Also if you modify your complex stored procedure for whatever reason, you have the potential to break EVERYTHING that uses it.
Us devs at my company are moving to LINQ w/ the EF and dismissing the stored procedure unless we absolutely need it. LINQ and the EF make separating our layers a lot easier...when the EF is not being difficult. But that's another rant. :)
There will likely always be some level of business logic in the data layer. The data itself is a representation of some of that logic. For instance, primary keys are often created based on business logic rules.
For example, the rule that an order cannot have more than one customer is part of the business logic, but it is also present (or should be) in the data layer.
Further, some kinds of business rules are best done on the database itself for efficiency reasons. These are usually stored procedures, and thus exist in the data layer. An example might be a trigger that goes off if a customer has spent more than $X in a year, or if a ship-to is different from a bill-to.
Many of these rules might be handled in the business layer as well, but they also need a data layer component. It depends on where your error handling is.
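A small sketch of the order/customer example above, using Python's built-in `sqlite3` module: the schema itself states that every order has exactly one existing customer, so the rule holds even if application code forgets to check it. The table names and the helpers `make_db` and `try_insert_order` are assumptions for the example.

```python
import sqlite3


def make_db():
    """In-memory database whose schema carries the business rule."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            -- A single NOT NULL column: exactly one customer per order.
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        );
        INSERT INTO customers VALUES (1, 'Alice');
    """)
    return conn


def try_insert_order(conn, customer_id):
    """Return True if the database accepted the order, False if it refused."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Where the error gets handled is still a separate choice: the business layer can validate first and give a friendlier message, with the constraint acting as the last line of defence.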
Business logic in the data layer was common in client/server apps, as there really was no business logic layer per se (unless you could really, seriously prevent anyone from connecting to the database outside the application). Now that web apps are more common, you're seeing more 3- and 4-tier apps (client+web server+app server+database server), and more companies are following best practices and consolidating business logic in its own tier. I don't think there will be any less work for database developers, they'll probably just become the ones that write the business logic layer (and let an ORM tool write most of the database layer).
There are also technical reasons/limitations to be considered when planning where to author the business rules.
In most LOB applications, centralization and performance push developers to use the database itself as the primary business layer, so in a sense the DAL and BL are mixed or unified.
A typical example would be the field that calculates the current location of a rental item, a piece of information that should be available for one or for many listed items, making an SQL view with a User Defined Function the most powerful candidate to hold the rule.
The above example is valid of course if a specific database design and processes implementation is preferred, but I just want to point out that in real world, we choose based on technical limitations and other principles, more often than we do for organizing our code.

Why do we need entity objects? [closed]

I really need to see some honest, thoughtful debate on the merits of the currently accepted enterprise application design paradigm.
I am not convinced that entity objects should exist.
By entity objects I mean the typical things we tend to build for our applications, like "Person", "Account", "Order", etc.
My current design philosophy is this:
All database access must be accomplished via stored procedures.
Whenever you need data, call a stored procedure and iterate over a SqlDataReader or the rows in a DataTable
(Note: I have also built enterprise applications with Java EE; Java folks, please substitute the equivalent for my .NET examples)
I am not anti-OO. I write lots of classes for different purposes, just not entities. I will admit that a large portion of the classes I write are static helper classes.
I am not building toys. I'm talking about large, high volume transactional applications deployed across multiple machines. Web applications, windows services, web services, b2b interaction, you name it.
I have used OR Mappers. I have written a few. I have used the Java EE stack, CSLA, and a few other equivalents. I have not only used them but actively developed and maintained these applications in production environments.
I have come to the battle-tested conclusion that entity objects are getting in our way, and our lives would be so much easier without them.
Consider this simple example: you get a support call about a certain page in your application that is not working correctly, maybe one of the fields is not being persisted like it should be. With my model, the developer assigned to find the problem opens exactly 3 files. An ASPX, an ASPX.CS and a SQL file with the stored procedure. The problem, which might be a missing parameter to the stored procedure call, takes minutes to solve. But with any entity model, you will invariably fire up the debugger, start stepping through code, and you may end up with 15-20 files open in Visual Studio. By the time you step down to the bottom of the stack, you forgot where you started. We can only keep so many things in our heads at one time. Software is incredibly complex without adding any unnecessary layers.
Development complexity and troubleshooting are just one side of my gripe.
Now let's talk about scalability.
Do developers realize that each and every time they write or modify any code that interacts with the database, they need to do a thorough analysis of the exact impact on the database? And not just the development copy, I mean a mimic of production, so you can see that the additional column you now require for your object just invalidated the current query plan and a report that was running in 1 second will now take 2 minutes, just because you added a single column to the select list? And it turns out that the index you now require is so big that the DBA is going to have to modify the physical layout of your files?
If you let people get too far away from the physical data store with an abstraction, they will create havoc with an application that needs to scale.
I am not a zealot. I can be convinced if I am wrong, and maybe I am, since there is such a strong push towards Linq to Sql, ADO.NET EF, Hibernate, Java EE, etc. Please think through your responses, if I am missing something I really want to know what it is, and why I should change my thinking.
[Edit]
It looks like this question is suddenly active again, so now that we have the new comment feature I have commented directly on several answers. Thanks for the replies, I think this is a healthy discussion.
I probably should have been more clear that I am talking about enterprise applications. I really can't comment on, say, a game that's running on someone's desktop, or a mobile app.
One thing I have to put up here at the top in response to several similar answers: orthogonality and separation of concerns often get cited as reasons to go entity/ORM. Stored procedures, to me, are the best example of separation of concerns that I can think of. If you disallow all other access to the database, other than via stored procedures, you could in theory redesign your entire data model and not break any code, so long as you maintained the inputs and outputs of the stored procedures. They are a perfect example of programming by contract (just so long as you avoid "select *" and document the result sets).
Ask someone who's been in the industry for a long time and has worked with long-lived applications: how many application and UI layers have come and gone while a database has lived on? How hard is it to tune and refactor a database when there are 4 or 5 different persistence layers generating SQL to get at the data? You can't change anything! ORMs or any code that generates SQL lock your database in stone.
I think it comes down to how complicated the "logic" of the application is, and where you have implemented it. If all your logic is in stored procedures, and all your application does is call those procedures and display the results, then developing entity objects is indeed a waste of time. But for an application where the objects have rich interactions with one another, and the database is just a persistence mechanism, there can be value to having those objects.
So, I'd say there is no one-size-fits-all answer. Developers do need to be aware that, sometimes, trying to be too OO can cause more problems than it solves.
Theory says that highly cohesive, loosely coupled implementations are the way forward.
So I suppose you are questioning that approach, namely separating concerns.
Should my aspx.cs file be interacting with the database, calling a sproc, and understanding IDataReader?
In a team environment, especially where you have less technical people dealing with the aspx portion of the application, I don't need these people being able to "touch" this stuff.
Separating my domain from my database protects me from structural changes in the database, surely a good thing? Sure database efficacy is absolutely important, so let someone who is most excellent at that stuff deal with that stuff, in one place, with as little impact on the rest of the system as possible.
Unless I am misunderstanding your approach, one structural change in the database could have a large impact area with the surface of your application. I see that this separation of concerns enables me and my team to minimise this. Also any new member of the team should understand this approach better.
Also, your approach seems to advocate the business logic of your application to reside in your database? This feels wrong to me, SQL is really good at querying data, and not, imho, expressing business logic.
Interesting thought though, although it feels one step away from SQL in the aspx, which from my bad old unstructured asp days, fills me with dread.
One reason - separating your domain model from your database model.
What I do is use Test Driven Development, so I write my UI and model layers first and the data layer is mocked. The UI and model are built around domain-specific objects, and later I map these objects to whatever technology I'm using in the data layer. It's a bad idea to let the database structure determine the design of your application. Where possible, write the app first and let that influence the structure of your database, not the other way around.
For me it boils down to I don't want my application to be concerned with how the data is stored. I'll probably get slapped for saying this...but your application is not your data, data is an artifact of the application. I want my application to be thinking in terms of Customers, Orders and Items, not a technology like DataSets, DataTables and DataRows...cuz who knows how long those will be around.
I agree that there is always a certain amount of coupling, but I prefer that coupling to reach upwards rather than downwards. I can tweak the limbs and leaves of a tree more easily than I can alter its trunk.
I tend to reserve sprocs for reporting as the queries do tend to get a little nastier than the applications general data access.
I also tend to think that, with proper unit testing early on, scenarios like that one column not being persisted are unlikely to be a problem.
Eric,
You are dead on. For any really scalable / easily maintained / robust application the only real answer is to dispense with all the garbage and stick to the basics.
I've followed a similar trajectory with my career and have come to the same conclusions. Of course, we're considered heretics and looked at funny. But my stuff works and works well.
Every line of code should be looked at with suspicion.
I would like to answer with an example similar to the one you proposed.
At my company I had to build a simple CRUD section for products. I built all my entities and a separate DAL. Later another developer had to change a related table, and he even renamed several fields. The only file I had to change to update my form was the DAL for that table.
What (in my opinion) entities bring to a project is:
Orthogonality: Changes in one layer might not affect other layers (of course, if you make a huge change on the database it would ripple through all the layers, but most small changes won't).
Testability: You can test your logic without touching your database. This increases performance on your tests (allowing you to run them more frequently).
Separation of concerns: In a big product you can assign the database to a DBA and he can optimize the hell out of it. Assign the Model to a business expert that has the knowledge necessary to design it. Assign individual forms to developers more experienced on webforms etc..
Finally I would like to add that most ORM mappers support stored procedures since that's what you are using.
Cheers.
I think you may be "biting off more than you can chew" on this topic. Ted Neward was not being flippant when he called it the "Vietnam of Computer Science".
One thing I can absolutely guarantee you is that it will change nobody's point of view on the matter, as has been proven so often on innumerable other blogs, forums, podcasts etc.
It's certainly OK to have open discussion and debate about a controversial topic; it's just that this one has been done so many times that both "sides" have agreed to disagree and just got on with writing software.
If you want to do some further reading on both sides, see articles on Ted's blog, Ayende Rahien, Jimmy Nilsson, Scott Bellware, Alt.Net, Stephen Forte, Eric Evans etc.
@Dan, sorry, that's not the kind of thing I'm looking for. I know the theory. Your statement "is a very bad idea" is not backed up by a real example. We are trying to develop software in less time, with fewer people, with fewer mistakes, and we want the ability to easily make changes. Your multi-layer model, in my experience, is a negative in all of the above categories, especially with regard to making the data model the last thing you do. The physical data model must be an important consideration from day 1.
I found your question really interesting.
Usually I need entity objects to encapsulate the business logic of an application. It would be really complicated and inadequate to push this logic into the data layer.
What would you do to avoid these entity objects? What solution do you have in mind?
Entity objects can facilitate caching on the application layer. Good luck caching a DataReader.
We should also talk about what entities really are.
When I read through this discussion, I get the impression that most people here are looking at entities in the sense of an Anemic Domain Model.
A lot of people consider the Anemic Domain Model an antipattern!
There is value in rich domain models. That is what Domain Driven Design is all about.
I personally believe that OO is a way to conquer complexity. This means not only technical complexity (like data-access, ui-binding, security ...) but also complexity in the business domain!
If we can apply OO techniques to analyze, model, design and implement our business problems, this is a tremendous advantage for maintainability and extensibility of non-trivial applications!
There are differences between your entities and your tables. Entities should represent your model, tables just represent the data-aspect of your model!
It is true that data lives longer than apps, but consider this quote from David Laribee: Models are forever ... data is a happy side effect.
Some more links on this topic:
Why Setters and Getters are evil
Return of pure OO
POJO vs. NOJO
Super Models Part 2
TDD, Mocks and Design
Really interesting question. Honestly, I cannot prove why entities are good, but I can share why I like them. Code like
void exportOrder(Order order, String fileName){...};
is not concerned with where the order came from: the DB, a web request, a unit test, etc. The method declares explicitly what it requires, instead of taking a DataRow and documenting which columns it expects and which types they should have. The same applies if you implement it as a stored procedure: you still need to pass a record id to it, even though the order need not be present in the DB at all.
The implementation of this method would be based on the Order abstraction, not on how exactly the order is represented in the DB. Most such operations that I have implemented really do not depend on how the data is stored. I do understand that some operations require coupling with the DB structure for performance and scalability purposes; in my experience there are just not that many of them. Very often it is enough to know that Person has .getFirstName() returning String and .getAddress() returning Address, that Address has .getZipCode(), etc., without caring which tables are involved in storing that data.
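To make that concrete, here is a minimal Java sketch. The Order interface, InMemoryOrder, OrderExporter and the line format are all hypothetical names invented for illustration; none of them come from the post above. The point is only that the export code sees the abstraction, so it works the same for an order built in a unit test and for one loaded from a database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: callers depend on this, not on tables or rows.
interface Order {
    String getId();
    List<String> getItemNames();
}

// An order built entirely in memory, e.g. by a unit test or a web request handler.
class InMemoryOrder implements Order {
    private final String id;
    private final List<String> itemNames;

    InMemoryOrder(String id, List<String> itemNames) {
        this.id = id;
        this.itemNames = new ArrayList<>(itemNames);
    }

    @Override public String getId() { return id; }
    @Override public List<String> getItemNames() { return itemNames; }
}

class OrderExporter {
    // Renders the order as text lines; it never asks where the order came from.
    static List<String> toLines(Order order) {
        List<String> lines = new ArrayList<>();
        lines.add("order:" + order.getId());
        for (String name : order.getItemNames()) {
            lines.add("item:" + name);
        }
        return lines;
    }
}
```

A database-backed implementation of Order could be added later without touching OrderExporter, which is the kind of independence described above.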
If you have to deal with problems like the ones you described, such as an additional column breaking report performance, then the DB is a critical part of your tasks, and you should indeed stay as close to it as possible. While entities can provide some convenient abstractions, they can hide some important details as well.
Scalability is an interesting point here: most websites that require enormous scalability (like Facebook, LiveJournal, Flickr) tend to use a DB-ascetic approach, where the DB is used as rarely as possible and scalability issues are solved by caching, especially in RAM. http://highscalability.com/ has some interesting articles on it.
There are other good reasons for entity objects besides abstraction and loose coupling. One of the things I like most is the strong typing that you can't get with a DataReader or a DataTable. Another reason is that, when done well, proper entity classes can make the code more maintainable by using first-class constructs for domain-specific terms that anyone looking at the code is likely to understand, rather than a bunch of strings with field names in them used for indexing a DataRow. Stored procedures are really orthogonal to the use of an ORM, since a lot of mapping frameworks give you the ability to map to sprocs.
I wouldn't consider sprocs + DataReaders a substitute for a good ORM. With stored procedures, you're still constrained by, and tightly coupled to, the procedure's type signature, which uses a different type system than the calling code. Stored procedures can be subject to modification to accommodate additional options and schema changes. An alternative to stored procedures when the schema is subject to change is to use views: you can map objects to views and then re-map the views to the underlying tables when you change them.
I can understand your aversion to ORMs if your experience mainly consists of Java EE and CSLA. You might want to have a look at LINQ to SQL, which is a very lightweight framework. It is primarily a one-to-one mapping with the database tables, but the generated classes usually need only minor extension to be full-blown business objects. LINQ to SQL can also map input and output objects to stored procedures' parameters and results.
The ADO.NET Entity framework has the added advantage that your database tables can be viewed as entity classes inheriting from each other, or as columns from multiple tables aggregated into a single entity. If you need to change the schema, you can change the mapping from the conceptual model to the storage schema without changing the actual application code. And again, stored procedures can be used here.
I think that more IT projects in enterprises fail because of unmaintainability of the code or poor developer productivity (which can happen from, e.g., context switching between sproc-writing and app-writing) than scalability problems of an application.
I would also like to add to Dan's answer that separating both models could enable your application to be run on different database servers or even database models.
What if you need to scale your app by load balancing more than one web server? You could install the full app on all web servers, but a better solution is to have the web servers talk to an application server.
But if there aren't any entity objects, they won't have very much to talk about.
I'm not saying that you shouldn't write monoliths if it's a simple, internal, short-lived application. But as soon as it gets moderately complex, or it should last a significant amount of time, you really need to think about a good design.
This saves time when it comes to maintaining it.
By splitting application logic from presentation logic and data access, and by passing DTOs between them, you decouple them, allowing them to change independently.
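As a rough illustration of that split, here is a small Java sketch. Every name in it (CustomerDto, CustomerStore, MapCustomerStore, GreetingService) is hypothetical and chosen only for this example. The application logic knows the DTO and the store interface, but nothing about how storage or the UI work.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DTO: a plain data carrier passed between layers.
final class CustomerDto {
    final int id;
    final String name;

    CustomerDto(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Data access boundary; an implementation could use SQL, a web service, or a file.
interface CustomerStore {
    CustomerDto findById(int id);
}

// In-memory stand-in for the data access layer.
class MapCustomerStore implements CustomerStore {
    private final Map<Integer, CustomerDto> rows = new HashMap<>();

    void put(CustomerDto dto) { rows.put(dto.id, dto); }

    @Override public CustomerDto findById(int id) { return rows.get(id); }
}

// Application logic: depends on the store interface and the DTO only.
class GreetingService {
    private final CustomerStore store;

    GreetingService(CustomerStore store) { this.store = store; }

    String greetingFor(int customerId) {
        CustomerDto customer = store.findById(customerId);
        return customer == null ? "Hello, guest" : "Hello, " + customer.name;
    }
}
```

Swapping MapCustomerStore for a database-backed CustomerStore would leave GreetingService and any presentation code that calls it unchanged.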
You might find this post on comp.object interesting.
I'm not claiming to agree or disagree but it's interesting and (I think) relevant to this topic.
A question: How do you handle disconnected applications if all your business logic is trapped in the database?
In the type of Enterprise application I'm interested in, we have to deal with multiple sites, some of them must be able to function in a disconnected state.
If your business logic is encapsulated in a Domain layer that is simple to incorporate into various application types -say, as a dll- then I can build applications that are aware of the business rules and are able, when necessary, to apply them locally.
In keeping the Domain layer in stored procedures on the database you have to stick with a single type of application that needs a permanent line-of-sight to the database.
It's ok for a certain class of environments, but it certainly doesn't cover the whole spectrum of Enterprise applications.
@jdecuyper, one maxim I repeat to myself often is "if your business logic is not in your database, it is only a recommendation". I think Paul Nielsen said that in one of his books. Application layers and UI come and go, but data usually lives for a very long time.
How do I avoid entity objects? Stored procedures mostly. I also freely admit that business logic tends to reach through all layers in an application whether you intend it to or not. A certain amount of coupling is inherent and unavoidable.
I have been thinking about this same thing a lot lately; I was a heavy user of CSLA for a while, and I love the purity of saying that "all of your business logic (or at least as much as is reasonably possible) is encapsulated in business entities".
I have seen the business entity model provide a lot of value in cases where the design of the database is different than the way you work with the data, which is the case in a lot of business software.
For example, the idea of a "customer" may consist of a main record in a Customer table, combined with all of the orders the customer has placed, as well as all the customer's employees and their contact information, and some of the properties of a customer and its children may be determined from lookup tables. It's really nice from a development standpoint to be able to work with the Customer as a single entity, since from a business perspective, the concept of Customer contains all of these things, and the relationships may or may not be enforced in the database.
While I appreciate the quote that "if your business rule is not in your database, it's only a suggestion", I also believe that you shouldn't design the database to enforce business rules, you should design it to be efficient, fast and normalized.
That said, as others have noted above, there is no "perfect design", the tool has to fit the job. But using business entities can really help with maintenance and productivity, since you know where to go to modify business logic, and objects can model real-world concepts in an intuitive way.
Eric,
No one is stopping you from choosing the framework/approach that you would wish. If you are going to go the "data driven/stored procedure-powered" path, then by all means, go for it! Especially if it really, really helps you deliver your applications on-spec and on-time.
The caveat being (a flipside to your question that is), ALL of your business rules should be on stored procedures, and your application is nothing more than a thin client.
That being said, same rules apply if you do your application in OOP : be consistent. Follow OOP's tenets, and that includes creating entity objects to represent your domain models.
The only real rule here is the word consistency. Nobody is stopping you from going DB-centric. No one is stopping you from doing old-school structured (aka, functional/procedural) programs. Hell, no one is stopping anybody from doing COBOL-style code. BUT an application has to be very, very consistent once going down this path, if it wishes to attain any degree of success.
I'm really not sure what you consider "Enterprise Applications". But I'm getting the impression you are defining it as an Internal Application where the RDBMS would be set in stone and the system wouldn't have to be interoperable with any other systems whether internal or external.
But what if you had a database with 100 tables? At 4 stored procedures per table just for basic CRUD operations, that's 400 stored procedures which need to be maintained, aren't strongly typed (so are susceptible to typos), and can't be unit tested. What happens when you get a new CTO who is an open source evangelist and wants to change the RDBMS from SQL Server to MySQL?
A lot of software today whether Enterprise Applications or Products are using SOA and have some requirements for exposing Web Services, at least the software I am and have been involved with do.
Using your approach you would end up exposing a Serialized DataTable or DataRows. Now this may be deemed acceptable if the Client is guaranteed to be .NET and on an internal network. But when the Client is not known then you should be striving to Design an API which is intuitive and in most cases you would not want to be exposing the Full Database schema.
I certainly wouldn't want to explain to a Java developer what a DataTable is and how to use it. There's also the consideration of bandwidth and payload size: serialized DataTables and DataSets are very heavy.
There is no silver bullet in software design, and it really depends on where the priorities lie. For me it's unit-testable code and loosely coupled components that can be easily consumed by any client.
just my 2 cents
I'd like to offer another angle to the problem of distance between OO and RDB: history.
Any software has a model of reality that is to some degree an abstraction of reality. No computer program can capture all the complexities of reality, and programs are written just to solve a set of problems from reality. Therefore any software model is a reduction of reality. Sometimes the software model forces reality to reduce itself. Like when you want the car rental company to reserve any car for you as long as it is blue and has alloys, but the operator can't comply because your request won't fit in the computer.
RDB comes from a very old tradition of putting information into tables, called accounting. Accounting was done on paper, then on punch cards, then in computers. But accounting is already a reduction of reality. Accounting has forced people to follow its system for so long that it has become accepted reality. That's why it is relatively easy to make computer software for accounting: accounting had its information model long before the computer came along.
Given the importance of good accounting systems, and the acceptance they get from business managers, these systems have become very advanced. The database foundations are now very solid and no one hesitates about keeping vital data in something so trustworthy.
I guess that OO must have come along when people found that other aspects of reality are harder to model than accounting (which is already a model). OO has become a very successful idea, but persistence of OO data is relatively underdeveloped. RDB/accounting has had easy wins, but OO is a much larger field (basically everything that isn't accounting).
So many of us have wanted to use OO, but we still want safe storage of our data. What can be safer than to store our data the same way as the esteemed accounting system does? It is an enticing prospect, but we all run into the same pitfalls. Very few have taken the trouble to think about OO persistence, compared to the massive efforts of the RDB industry, which has had the benefit of accounting's tradition and position.
Prevayler and db4o are some suggestions, and I'm sure there are others I haven't heard of, but none seem to have gotten half the press of, say, Hibernate.
Storing your objects in good old files doesn't even seem to be taken seriously for multiuser applications, and especially web applications.
In my everyday struggle to close the chasm between OO and RDB I use OO as much as possible but try to keep inheritance to a minimum. I don't often use SPs. I'll use the advanced query stuff only in aspects that look like accounting.
I'll be happily surprised when the chasm is closed for good. I think the solution will come when Oracle launches something like "Oracle Object Instance Base". To really catch on, it will have to have a reassuring name.
Not a lot of time at the moment, but just off the top of my head...
The entity model lets you give a consistent interface to the database (and other possible systems) even beyond what a stored procedure interface can do. By using enterprise-wide business models you can make sure that all applications affect the data consistently which is a VERY important thing. Otherwise you end up with bad data, which is just plain evil.
If you only have one application then you don't really have an "enterprise" system, regardless of how big that application or your data are. In that case you can use an approach similar to what you talk about. Just be aware of the work that will be needed if you decide to grow your systems in the future.
Here are a few things that you should keep in mind (IMO) though:
Generated SQL code is bad (exceptions to follow). Sorry, I know that a lot of people think that it's a huge time saver, but I've never found a system that could generate more efficient code than what I could write and often the code is just plain horrible. You also often end up generating a ton of SQL code that never gets used. The exception here is very simple patterns, like maybe lookup tables. A lot of people get carried away on it though.
Entities <> Tables (or even logical data model entities necessarily). A data model often has data rules that should be enforced as closely to the database as possible which can include rules around how table rows relate to each other or other similar rules that are too complex for declarative RI. These should be handled in stored procedures. If all of your stored procedures are simple CRUD procs, you can't do that. On top of that, the CRUD model usually creates performance issues because it doesn't minimize round trips across the network to the database. That's often the biggest bottleneck in an enterprise application.
Sometimes, your application and data layer are not that tightly coupled. For example, you may have a telephone billing application. You later create a separate application which monitors phone usage to a) better advertise to you and b) optimise your phone plan.
These applications have different concerns and data requirements (even though the data is coming out of the same database), so they would drive different designs. Your code base can end up an absolute mess (in either application) and a nightmare to maintain if you let the database drive the code.
Applications that have domain logic separated from the data storage logic are adaptable to any kind of data source (database or otherwise) or UI (web or windows(or linux etc.)) application.
You're pretty much stuck with your database, which isn't bad if you're with a company that is satisfied with the database system you're using. However, because databases evolve over time, there might be a new database system that is really neat and that your company wants to use. What if they wanted to switch to a web services method of data access (as service-oriented architecture sometimes does)? You might have to port your stored procedures all over the place.
Also the domain logic abstracts away the UI, which can be more important in large complex systems that have ever evolving UIs (especially when they are constantly searching for more customers).
Also, while I agree that there is no definitive answer to the question of stored procedures versus domain logic, I'm in the domain logic camp (and I think it is winning over time), because I believe that elaborate stored procedures are harder to maintain than elaborate domain logic. But that's a whole other debate.
I think that you are just used to writing a specific kind of application, and solving a certain kind of problem. You seem to be attacking this from a "database first" perspective. There are lots of developers out there where data is persisted to a DB but performance is not a top priority. In lots of cases putting an abstraction over the persistence layer simplifies code greatly and the performance cost is a non-issue.
Whatever you are doing, it's not OOP. It's not wrong, it's just not OOP, and it doesn't make sense to apply your solutions to every other problem out there.
Interesting question. A couple thoughts:
How would you unit test if all of your business logic was in your database?
Wouldn't changes to your database structure, specifically ones that affect several pages in your app, be a major hassle to change throughout the app?
Good Question!
One approach I rather like is to create an iterator/generator object that emits instances of objects that are relevant to a specific context. Usually this object wraps some underlying database access stuff, but I don't need to know that when using it.
For example,
An AnswerIterator object generates AnswerIterator.Answer objects. Under the hood it's iterating over a SQL Statement to fetch all the answers, and another SQL statement to fetch all related comments. But when using the iterator I just use the Answer object that has the minimum properties for this context. With a little bit of skeleton code this becomes almost trivial to do.
I've found that this works well when I have a huge dataset to work on, and when done right, it gives me small, transient objects that are relatively easy to test.
It's basically a thin veneer over the Database Access stuff, but it still gives me the flexibility of abstracting it when I need to.
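A minimal Java sketch of that idea follows. It is an illustration under assumptions, not the poster's actual code: the row source is a plain Iterator of string arrays standing in for a database cursor, the Answer fields are invented, and the second query for related comments is left out to keep it short.

```java
import java.util.Iterator;

// Wraps a raw row source and emits small, context-specific Answer objects.
class AnswerIterator implements Iterator<AnswerIterator.Answer> {

    // Only the properties this context needs (hypothetical fields).
    static final class Answer {
        final int id;
        final String body;

        Answer(int id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    // Stand-in for a database cursor: each row is {id, body} as strings.
    private final Iterator<String[]> rows;

    AnswerIterator(Iterator<String[]> rows) {
        this.rows = rows;
    }

    @Override public boolean hasNext() {
        return rows.hasNext();
    }

    // Converts the next raw row into a typed, transient Answer.
    @Override public Answer next() {
        String[] row = rows.next();
        return new Answer(Integer.parseInt(row[0]), row[1]);
    }
}
```

Because the iterator only needs something that yields rows, a test can feed it a list of arrays, while production code could hand it an adapter over a real result set.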
The objects in my apps tend to relate one-to-one to the database, but I'm finding using Linq To Sql rather than sprocs makes it much easier writing complicated queries, especially being able to build them up using the deferred execution. e.g. from r in Images.User.Ratings where etc. This saves me trying to work out several join statements in sql, and having Skip & Take for paging also simplifies the code rather than having to embed the row_number & 'over' code.
Why stop at entity objects? If you don't see the value with entity objects in an enterprise level app, then just do your data access in a purely functional/procedural language and wire it up to a UI. Why not just cut out all the OO "fluff"?