Is O/R Mapping worth it? - orm

The query languages (QL) provided with ORMs can be very expressive. Unfortunately, once you have a fleet of complex queries, and then some puzzling schema or data problem arises, it is very difficult to enlist the DBA help that you need. Here they are, part of the team that is evolving the database, yet they can't read the application QL, much less suggest modifications. I generally end up grabbing generated SQL out of the log for them. But then when they recommend changes to it, how does that relate to the original QL? The process is not round-trip.
So after a decade of promoting the value of ORMs, I am now wondering if I should be writing my SQL manually. And maybe all that I really want the framework to do is automate the data marshaling as much as possible.
Question: Have you found a way to deal with the round-trip issue in your organization? Is there a SQL-marshaling framework that scales well, and maintains easily?
(Yes, I know that pure SQL might bind me to the database vendor. But it is possible to write standards-compliant SQL.)
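As a rough illustration of what "only automate the data marshaling" could look like, here is a minimal sketch in Python using the standard-library sqlite3 module. The Customer class, the fetch_all helper, and the customer table are hypothetical names invented for this example; they are not part of any framework mentioned in this thread.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:
    id: int
    name: str
    city: str


def fetch_all(conn, cls, sql, params=()):
    """Run hand-written SQL and marshal each row into an instance of cls.

    Result columns are matched to dataclass fields by name, so the SQL
    itself stays readable (and editable) by a DBA.
    """
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    wanted = {f.name for f in fields(cls)}
    return [
        cls(**{col: val for col, val in zip(columns, row) if col in wanted})
        for row in cursor
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York"), (3, "Alan", "London")],
)

# The statement in the code is the same statement the DBA sees in the log.
CUSTOMERS_BY_CITY = "SELECT id, name, city FROM customer WHERE city = ? ORDER BY id"
londoners = fetch_all(conn, Customer, CUSTOMERS_BY_CITY, ("London",))
print(londoners)
```

Because the SQL is written by hand, a change the DBA recommends can be pasted straight back into the application, which is the round trip the question is asking for.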

I think that what you want is a solution that maximizes the benefits of ORM without preventing you using other means. We have much the same issue as you do in our application; very heavy queries, and a large data model. Given the size of the data model, ORM is invaluable for the vast majority of the application. It allows us to extend the data model without having to go to a great deal of effort hand-maintaining SQL scripts. Moreover, and you touched on this, we support four database vendors, so the abstraction is nice.
However, there are instances where we've had to tune the queries manually, and since we chose a flexible ORM solution, we can do that too. As you say, it gets out of our way when we need it gone, and simply marshals objects for us.
So, in short (yep, short) yes, ORM is worth it, but like every solution to a problem, it's not a panacea.

In general, ORMs increase developer productivity a lot, so I'd keep using them unless they've become a bigger problem than they're worth. If a majority of your tables are big enough that you are having a lot of problems, consider ditching the ORM. I would definitely not say that ORMs are a bad idea in general. Most databases are small enough and most queries are simple enough that they work well.
I've overcome that problem by using stored procedures or hand-written SQL only for the poorly performing queries. DBAs love stored procedures because they can modify them without telling you. Most (if not all) ORMs allow you to mix in hand written SQL or stored procedures.

Today's O/R frameworks, as I believe you're familiar with, support the option of defining some queries manually ((N)Hibernate does). That can be used for the complex parts of a schema, while the straightforward parts use the ORM as provided by the framework.
Another thing for you to check out might be the iBatis framework (http://ibatis.apache.org/). I haven't used it, but I've read that it stays closer to SQL, and that people familiar with databases and SQL prefer it over a full-blown ORM framework like Hibernate because it is closer to what they know than the completely different concept of ORM.

Related

What are the limitations of ORM in general?

I know that there are a lot of ORM fans out there but how do you deal with a database with more than 300 tables and some of the tables have more than 100 fields?
Most of the sample applications that I have seen only use a few fields. Is it prudent to use ORM at such a large scale? I think that ORM is redundant (why create another layer when in reality databases do not get changed easily?).
For me it makes sense to use ORM for small applications that might get moved from database to database, or for applications that can be run on multiple platforms.
Otherwise it seems useless or simply another headache.
Any ideas?
I have used ORM in some projects (Hibernate) and not in others. ORM limitations are the same as for all abstractions, you give up some flexibility and you must invest in learning the specifics of the implementation. However you typically gain coding efficiency, reduce duplication, centralize configuration, and get other improvements that are specific to the implementation. Note that database portability is not always without effort - obviously not if you use vendor-specific features.
You don't mention whether your project already has a data access implementation. If you're starting from scratch then the size of the database should not concern you too much as ORM should actually save you more on a bigger database in terms of efficiency and reducing duplication. However if you're contemplating replacing an existing data access implementation and you don't foresee the database changing much then your efforts will almost certainly outweigh the benefits.
BTW, I suspect sample applications use small databases because they're less effort to create and make the examples easier for users to understand, not because the developers think that their ORM solution is only appropriate for small databases.
The great added value of the ORM is that the business logic developers can focus on interaction with objects rather than database tables.
For example, sometimes your business object might be quite complex or span multiple database tables (e.g. @SecondaryTable in JPA 2.0). You don't need to know how the entity is represented in the database in order to do your job.
And what about relations? As a developer, I don't need to know if the relation is realised as a join table, foreign key or whatever. I just need to set appropriate object-oriented associations and the ORM will do the rest of the work for me.
I've seen quite large projects (> 50 developers) that worked fine with an ORM, even though at that time the tools were not as good and mature as they are now.
You might want to see this thread: Is ORM fit for complex projects?

Raw SQL vs OOP based queries (ORM)?

I am doing a project that requires frequent database access, insertions and deletions. Should I go for raw SQL commands or should I prefer an ORM technique? The project can work fine without any objects, using only SQL commands. Does this affect scalability in general?
EDIT: The project is one of the types where the user isn't provided with my content, but the user generates content, and the project is online. So, the amount of content depends upon the number of users, and if the project has even 50000 users, and additionally every user can create content or read content, then what would be the most apt approach?
If you have no (or limited) experience with ORMs, then it will take time to learn a new API. Plus, you have to keep in mind that they sacrifice speed for 'magic'. For example, most ORMs will select the wildcard '*' for fields, even when you just need a list of titles from your Articles table.
And ORMs will always fail in niche cases.
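To make the '*' point concrete, here is a small sketch comparing the two queries against an in-memory SQLite database. The articles table and its contents are made up for the example, and the character counts are only a crude stand-in for the data a real driver would transfer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [("First", "x" * 10_000), ("Second", "y" * 10_000)],
)

# What a naive mapping layer might issue: every column, including large bodies.
all_columns = conn.execute("SELECT * FROM articles ORDER BY id").fetchall()

# What a page that lists titles actually needs.
titles_only = conn.execute("SELECT title FROM articles ORDER BY id").fetchall()

# Rough measure of how much data each query hands back to the application.
chars_all = sum(len(str(value)) for row in all_columns for value in row)
chars_titles = sum(len(str(value)) for row in titles_only for value in row)
print(chars_all, chars_titles)
```

Many ORMs do offer some way to restrict the selected columns, but it usually has to be requested explicitly.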
Most of the ORMs out there (the ones based on the ActiveRecord pattern) are extremely flawed from an OOP point of view. They create a tight coupling between your database structure and class/model.
You can think of ORMs as technical debt. They will make the start of a project easier. But as the code grows more complex, you will begin to encounter more and more problems caused by limitations in the ORM's API. Eventually, you will have situations where it is impossible to do something with the ORM and you will have to start writing SQL fragments and entire statements directly.
I would suggest staying away from ORMs and implementing a DataMapper pattern in your code. This will give you separation between your Domain Objects and the Database Access Layer.
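A minimal sketch of that separation, assuming a single hypothetical Article domain object and an ArticleMapper class that owns all of its SQL (both names are invented for this example, and SQLite stands in for whatever database is actually used):

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    """Domain object: knows nothing about tables or SQL."""
    title: str
    body: str
    id: Optional[int] = None

    def summary(self, length: int = 20) -> str:
        return self.body[:length]


class ArticleMapper:
    """Owns all SQL for Article; the only class that touches the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def insert(self, article: Article) -> None:
        cursor = self.conn.execute(
            "INSERT INTO article (title, body) VALUES (?, ?)",
            (article.title, article.body),
        )
        article.id = cursor.lastrowid

    def find(self, article_id: int) -> Optional[Article]:
        row = self.conn.execute(
            "SELECT id, title, body FROM article WHERE id = ?", (article_id,)
        ).fetchone()
        if row is None:
            return None
        return Article(id=row[0], title=row[1], body=row[2])

    def delete(self, article: Article) -> None:
        self.conn.execute("DELETE FROM article WHERE id = ?", (article.id,))
        article.id = None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
mapper = ArticleMapper(conn)

draft = Article(title="Hello", body="ORMs are a trade-off, not a panacea.")
mapper.insert(draft)
loaded = mapper.find(draft.id)
print(loaded.title, "|", loaded.summary())
```

The domain class can be changed or tested without a database, and every query stays in one place where it can be tuned by hand.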
I'd say it's better to try to achieve the objective in the most simple way possible.
If using an ORM has no real added advantage, and the application is fairly simple, I would not use an ORM.
If the application is really about processing large sets of data, and there is no business logic, I would not use an ORM.
That doesn't mean that you shouldn't design your application properly though, but again: if using an ORM doesn't give you any benefit, then why should you use it?
For speed of development, I would go with an ORM, in particular if most data access is CRUD.
This way you don't have to also develop the SQL and write data access routines.
Scalability shouldn't suffer, though you do need to understand what you are doing (you could hurt scalability with raw SQL as well).
If the project is oriented toward either:
- data editing (as in viewing simple tables of data and editing them)
- performance (as in designing the fastest algorithm to do a simple task)
Then you could go with direct sql commands in your code.
The thing you don't want to do is take this approach in large software, where you end up with many classes and lots of code. If you are in this situation and you scatter SQL everywhere in your code, you will clearly regret it someday. You will have a hard time making changes to your domain model. Any modification would become really hard (except for adding functionality or entities independent of the existing ones).
More information would be good, though, such as:
- What do you mean by frequent (how frequent) ?
- What performance do you need ?
EDIT
It seems you're making some sort of CMS service. My bet is you don't want to start stuffing your code with SQL. @teresko's pattern suggestion seems interesting: separating your application logic from the DB (which is always good), while keeping the possibility of customizing every query. Nonetheless, adding a layer that fills in-memory objects can take more time than simply using the database result to write your page, but I don't think that small difference should matter in your case.
I'd suggest choosing a good pattern that separates your business logic and data access, like what @teresko suggested.
It depends a bit on timescale and your current knowledge of MySQL and ORM systems. If you don't have much time, just do whatever you know best, rather than wasting time learning a whole new set of code.
With more time, an ORM system like Doctrine or Propel can massively improve your development speed. When the schema is still changing a lot, you don't want to be spending a lot of time just rewriting queries. With an ORM system, it can be as simple as changing the schema file and clearing the cache.
Then when the design settles down, keep an eye on performance. If you do use ORM and your code is solid OOP, it's not too big an issue to migrate to SQL one query at a time.
That's the great thing about coding with OOP - a decision like this doesn't have to bind you forever.
I would always recommend using some form of ORM for your data access layer, as there has been a lot of time invested into the security aspect. That alone is a reason to not roll your own, unless you feel confident about your skills in protecting against SQL injection and other vulnerabilities.

NHibernate vs EF4 - Performance on Low End Computer

I'm working on a small Windows Form application that will be run on a Netbook computer. I will control the hardware/environment, meaning I provide the hardware and software to the end user. It will have a single database on the local drive that only this one app will access. It will have a couple tables and a few hundred (or maybe a couple thousands) rows in one of the tables. No foreign keys, etc. Really simple. I just need a place to store this data and perform simple queries and map to objects (ORM).
I understand the basics of Nhibernate and EF4 and have experimented a little with both. I'd use EF4 with POCOs if I decided to use EF.
I don't think performance is an issue because it's a small amount of data. But Netbooks are not very powerful, so I'm wondering which of these two products would offer me a more lightweight solution.
We're a Microsoft shop and not using EF4 yet, but I think we may be going that way as our data engine of the future, so this may influence my decision. But this app is kind of an island of its own so I could potentially use NHibernate without too much political fallout. :) My general impression of EF4 and its wizards and generators and magic is that it's bloated. I may be wrong, but that's the feeling I get. I'd hate to select EF4 and find out it's bogging down my Netbook's performance.
Any comments are welcome. I know this is a wide open subject. ;)
I don't think the difference between the two is even measurable with a small amount of data. The SQL query itself will take much longer than the work done by the ORM.
Yes, this is a wide open subject. You will only know the difference for the exact case you use when you measure it.
Personally, I wouldn't use an ORM at all for just a couple of tables.
I also wouldn't think about performance before I have a performance problem.
I like NHibernate, and am still waiting for EF to impress me, but for your not-so-complex application I would use neither of them.
Instead I'd advise using Linq2Sql, the easiest solution and powerful enough.
I think NHibernate and EF are for more complex applications.

What is the difference between NHibernate and iBATIS.NET?

I am looking for some up to date information comparing NHibernate and iBATIS.NET. I found some information searching Google, but a good bit of it applies either to the Java versions of these products or is dated.
Some specific things I am interested in:
Which is better if you control both the data model and the application?
iBATIS is repeatedly called simpler to learn - does this have long-term maintenance consequences (i.e. easy to start, hard to maintain)?
Do both make it easy to switch the underlying database vendor?
How skilled do your developers need to be with SQL?
Any major feature that one has that the other lacks?
Is either product more suitable for a particular type of application?
Real world examples of observed benefits and drawbacks are appreciated!
EDIT: Thanks for the information. I am doing my own evaluation as well. One thing I am wondering about still, does iBATIS help you to save/update complex object graphs? It seems like NHibernate is nice in that I can pass it a root object and it figures out the details of what, if anything, needs to be updated in the database.
I did some research a while ago.
One specific question of mine might give you some additional information:
Would you use NHibernate for a project with a legacy database, which is partly out of your control?
Some of your points of interest I can answer:
Which is better if you control both the data model and the application?
I can answer it the other way around: If you don't have control over the data model and thus facing some legacy database, iBatis is the better choice.
iBATIS is repeatedly called simpler to learn - does this have long-term maintenance consequences (i.e. easy to start, hard to maintain)?
It depends on what you want to do with it. If you have a domain-driven development approach then iBatis might get painful over time. If you just do simple data manipulation and don't have a full-blown domain model then NHibernate might be overkill.
Do both make it easy to switch the underlying database vendor?
Both have mechanisms to shield you from a specific database vendor, but I admit that I have not done intensive research in this direction.
How skilled do your developers need to be with SQL?
When you use iBatis, you need more SQL skills than with NHibernate. Using iBatis you always need to code some SQL. NHibernate doesn't require you to code SQL statements -- it can even generate the DDL for you. Powerful features will inevitably require you to go back to good old SQL.
Some other points:
I personally find iBatis much more lightweight. You can get things done very quickly. NHibernate is more powerful, but has many more features, which you can use in the wrong way.
It is possible to combine the use of NHibernate and iBatis! You can use NHibernate for your business logic. For reporting purposes, where you just read data out of tables, fall back to iBatis.
If your application has a longer life cycle and a lot of business logic, consider NHibernate. It has a lot of features that help you handle business objects.
The community around NHibernate is very active and comes up with useful tools.
In a sense it's comparing apples to oranges.
Which is better if you control both the data model and the application?
They both work with normalized databases well, so they are more-or-less equal if you can shape the db. iBatis is better at mapping to legacy databases since it doesn't actually care about the database structure at all. It only cares about the shape of the result set.
iBATIS is repeatedly called simpler to learn - does this have long-term maintenance consequences (i.e. easy to start, hard to maintain)?
It is much simpler, but that is because it has a much smaller feature set. I don't think it has any ticking time bomb long-term maintenance issues.
Do both make it easy to switch the underlying database vendor?
Yes
How skilled do your developers need to be with SQL?
Both require a good knowledge of SQL. With iBatis, you still have to write the SQL queries/procs. With NHibernate you have to know how to write NHibernate queries to get effective SQL. Neither is a replacement for SQL knowledge.
Any major feature that one has that the other lacks?
iBatis is a data mapper (a term used on the iBatis site). NHibernate is a full-blown Object Relational Mapper. iBatis is a great way to go if you primarily want something that takes the monotony out of mapping objects to result sets. However, it doesn't go all the way in trying to solve the object/relational mismatch. NHibernate has many more features such as dirty tracking, caching based on identity (identity map), flexible querying, dynamic SQL, batching, etc. NHibernate is much more dynamic in that it can do many things in one trip to the DB that could take iBatis several trips.
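For readers unfamiliar with the terms, here is a toy sketch of what "identity map" and "dirty tracking" mean in general. It is written in Python against SQLite and is not how NHibernate is actually implemented; the Session and Person classes and the person table are hypothetical names used only for illustration.

```python
import sqlite3


class Person:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class Session:
    """Toy unit of work: identity map plus snapshot-based dirty tracking."""

    def __init__(self, conn):
        self.conn = conn
        self._identity_map = {}  # primary key -> the one loaded instance
        self._snapshots = {}     # primary key -> name as last read or written

    def get(self, person_id):
        if person_id in self._identity_map:
            return self._identity_map[person_id]  # served without a second query
        row = self.conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            return None
        person = Person(*row)
        self._identity_map[person_id] = person
        self._snapshots[person_id] = person.name
        return person

    def flush(self):
        """Write back only the objects whose state differs from the snapshot."""
        dirty = [
            (person.name, pk)
            for pk, person in self._identity_map.items()
            if person.name != self._snapshots[pk]
        ]
        self.conn.executemany("UPDATE person SET name = ? WHERE id = ?", dirty)
        for name, pk in dirty:
            self._snapshots[pk] = name
        return len(dirty)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

session = Session(conn)
ann = session.get(1)
bob = session.get(2)
same_ann = session.get(1)   # same object as ann, not a fresh copy
ann.name = "Anne"           # bob is left untouched
updated = session.flush()   # only the changed row is written
print(same_ann is ann, updated)
```

With a result-set mapper in the iBatis style, deciding which rows need an UPDATE is left to the application code.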
We recently posted an article comparing these two tools, and I think many of your questions are addressed. The article is here on our wiki site.

Stored procedures or OR mappers?

Which is better? Or use an OR mapper with SP's? If you have a system with SP's already, is an OR mapper worth it?
I like ORM's because you don't have to reinvent the wheel. That being said, it completely depends on your application needs, development style and that of the team.
This question has already been covered: Why is parameterized SQL generated by NHibernate just as fast as a stored procedure?
There is nothing good to be said about stored procedures. They were a necessity 10 years ago but every single benefit of using sprocs is no longer valid. The two most common arguments are regarding security and performance. The "sending stuff over the wire" crap doesn't hold either, I can certainly create a query dynamically to do everything on the server too. One thing the sproc proponents won't tell you is that it makes updates impossible if you are using column conflict resolution on a merge publication. Only DBAs who think they are the database overlord insist on sprocs because it makes their job look more impressive than it really is.
This has been discussed at length on previous questions.
What are the pros and cons to keeping SQL in Stored Procs versus Code
At my work, we mostly do line of business apps - contract work.
For this type of business, I'm a huge fan of ORM. About four years ago (when the ORM tools were less mature) we studied up on CSLA and rolled our own simplified ORM tool that we use in most of our applications, including some enterprise-class systems that have 100+ tables.
We estimate that this approach (which of course includes a lot of code generation) creates a time savings of up to 30% in our projects. Seriously, it's ridiculous.
There is a small performance trade-off, but it's insubstantial as long as you have a decent understanding of software development. There are always exceptions that require flexibility.
For instance, extremely data-intensive batch operations should still be handled in specialized sprocs if possible. You probably don't want to send 100,000 huge records over the wire if you could do it in a sproc right on the database.
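The difference can be sketched as follows. SQLite has no stored procedures, so in this example a single set-based UPDATE stands in for "doing it right on the database"; the invoice table and both function names are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO invoice (amount, status) VALUES (?, ?)",
    [(float(i), "open") for i in range(1, 1001)],
)


def close_small_invoices_row_by_row(conn, limit):
    """Pull every open row to the client, decide in application code, write back one by one."""
    rows = conn.execute(
        "SELECT id, amount FROM invoice WHERE status = 'open'"
    ).fetchall()
    changed = 0
    for invoice_id, amount in rows:
        if amount < limit:
            conn.execute(
                "UPDATE invoice SET status = 'closed' WHERE id = ?", (invoice_id,)
            )
            changed += 1
    return changed


def close_small_invoices_set_based(conn, limit):
    """Let the database do the work in one statement; no rows are fetched by the client."""
    cursor = conn.execute(
        "UPDATE invoice SET status = 'closed' WHERE status = 'open' AND amount < ?",
        (limit,),
    )
    return cursor.rowcount


first = close_small_invoices_row_by_row(conn, 101)
second = close_small_invoices_set_based(conn, 201)
print(first, second)
```

Against a remote server the first version pays a network round trip per row, which is where the "100,000 huge records over the wire" cost comes from.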
This is the type of problem that newbie devs run into whether they're using ORM or not. They just have to see the results and if they're competent, they will get it.
What we've seen in our web apps is that usually the most difficult to solve performance bottlenecks are no longer database-related even with ORM. Rather, they're on the front-end (browser) due to bandwidth, AJAX overhead, etc. Even mid-range database servers are incredibly powerful these days.
Of course, other shops who work on much larger high-demand systems may have different experiences there. :)
Stored procedures hands down. OR Mappers are language specific, and often add graphic slowdowns.
Stored procedures means you're not limited by the language interface, and you can merely tack on new interfaces to the database in forwards compatible ways.
My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases. Database developers should realize the tasks people are trying to achieve with complicated OR-Mappers and create server-side utilities that assist in performing this task.
OR Mappers also are epic targets of the "leaky abstraction" syndrome ( Joel On Software: Leaky Abstractions )
Where it's quite easy to find things it just can't handle because of the abstraction layer not being psychic.
Stored procedures are better, in my view, because they can have an independent security configuration from the underlying tables.
This means you can allow specific operations without allowing writes/reads to specific tables. It also limits the damage that people can do if they discover a SQL injection exploit.
Definitely ORMs. More flexible, more portable (generally they tend to have portability built in). In case of slowness you may want to use caching or hand-tuned SQL in hot spots.
Generally stored procedures have several problems with maintainability.
separate from application (so many changes have now to be made in two places)
generally harder to change
harder to put under version control
harder to make sure they're updated (deployment issues)
portability (already mentioned)
I personally have found that SP's tend to be faster performance wise, at least for the large data items that I execute on a regular basis. But I know many people that swear by OR tools and wouldn't do ANYTHING else.
I would argue that using an OR mapper will increase readability and maintainability of your applications source code, while using SP will increase the performance of the application.
They are not actually mutually exclusive, though to your point they usually are so.
The advantage of using Object Relational mapping is that you can swap out data sources. Not only database structure, but you could use any data source. With the advent of web services / Service-oriented architecture / ESB's, in a larger corporation, it would be wise to consider having a higher level separation of concerns than what you could get in stored procedures. However, in smaller companies and in applications that will never use a different data source, SP's can fit the bill fine. And one last point: it is not necessary to use an OR mapper to get the abstraction. My former team had great success by simply using an adapter model using Spring.NET to plug in the data source.
@Kent Fredrick
"My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases"
I think you're talking about the difference between the relational model and the object-oriented model. This is actually why we need ORMs, but the implementations of these models were done on purpose - it is not a design flaw - it is just how things turned out to be historically.
Use stored procedures where you have identified a performance bottleneck. If you haven't identified a bottleneck, what are you doing with premature optimisation?
Use stored procedures where you are concerned about security access to a particular table.
Use stored procs when you have a SQL wizard who is prepared to sit and write complex queries that join together loads of tables in a legacy database, to do the things that are hard in an OR mapper.
Use the OR mapper for the other (at least) 80% of your database: where the selects and updates are so routine as to make access through stored procedures alone a pointless exercise in manual coding, and where updates are so infrequent that there is no performance cost. Use an OR mapper to automate the easy stuff.
Most OR mappers can talk to stored procs for the rest.
You should not use stored procs assuming that they're faster than a SQL statement in a string; this is not necessarily the case in the last few versions of MS SQL Server.
You do not need to use stored procs to thwart SQL injection attacks; there are other ways to make sure that your query parameters are strongly typed and not just string-concatenated.
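One such way is bound (parameterized) queries. The sketch below contrasts string concatenation with parameter binding using Python's sqlite3 module; the account table and the malicious input are invented for the example, and other drivers use different placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
conn.executemany("INSERT INTO account (owner) VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = "SELECT owner FROM account WHERE owner = '" + malicious + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is never parsed as SQL.
safe = conn.execute(
    "SELECT owner FROM account WHERE owner = ?", (malicious,)
).fetchall()

print(len(leaked), len(safe))
```

The concatenated query returns every account, while the bound query finds no owner with that literal name.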
You don't need to use an OR mapper to get a POCO domain model, but it does help.
If you already have a data API that's exposed as sprocs, you'd need to justify a major architectural overhaul to go to ORM.
For a green-fields build, I'd evaluate several things:
If there's a dedicated DBA on the team, I'd lean to sprocs
If there's more than one application touching the same DB I'd lean to sprocs
If there's no possibility of database migration ever, I'd lean to sprocs
If I'm trying to implement MVCC in the DB, I'd lean to sprocs
If I'm deploying this as a product with potentially multiple backend dbs (MySql, MSSql, Oracle), I'd lean to ORM
If I'm on a tight deadline, I'd lean to ORM, since it's a faster way to create my domain model and keep it in sync with the data model (with appropriate tooling).
If I'm exposing the same domain model in multiple ways (web app, web service, RIA client), I'd lean to ORM, as the data model is then hidden behind my ORM facade, making a robust domain model more valuable to me.
I think performance is a bit of a red herring; Hibernate seems to perform nearly as well as or better than hand-coded SQL (due to its caching tiers), and it's easy to write a bad query in your sproc either way.
The most important criteria are probably the team's skillset and long-term database portability needs.
Well, the SP's are already there. It doesn't make sense to can them really. I guess the question is: does it make sense to use a mapper with SP's?
"I'm trying to drive in a nail. Should I use the heel of my shoe or a glass bottle?"
Both Stored Procedures and ORMs are difficult and annoying to use for a developer (though not necessarily for a DBA or architect, respectively), because they incur a start-up cost and higher maintenance cost that doesn't guarantee a pay-off.
Both will pay off well if the requirements aren't expected to change much over the lifespan of the system, but they will get in your way if you're building the system to discover the requirements in the first place.
Straight-coded SQL or quasi-ORM like LINQ and ActiveRecord is better for build-to-discover projects (which happen in the enterprise a lot more than the PR wants you to think).
Stored Procedures are better in a language-agnostic environment, or where fine-grained control over permissions is required. They're also better if your DBA has a better grasp of the requirements than your programmers.
Full-blown ORMs are better if you do Big Design Up Front, use lots of UML, want to abstract the database back-end, and your architect has a better grasp of the requirements than either your DBA or programmers.
And then there's option #4: Use all of them. A whole system is not usually just one program, and while many programs may talk to the same database, they could each use whatever method is appropriate both for the program's specific task, and for its level of maturity. That is: you start with straight-coded SQL or LINQ, then mature the program by refactoring in ORM and Stored Procedures where you see they make sense.