Replace relational DB (SQL Server) with rules-based/declarative implementation?

I have started working on a project in the financial services industry that is based (mainly) on SQL Server (2000), ColdFusion (8), and some Access/.NET applications. This project started as some simple Access forms/VBA and was slowly converted to web interfaces.
I could say that the database design and application coding was done by people that were learning on the job and didn't have the opportunity to learn about good design principles from the start. Many of the business rules are set in a myriad of cascading functions and stored procedures as well as in the web server templates. There is a huge amount of special case handling deep within complex 500-line SQL UDFs that use uncommented constants. It is very difficult to trace all of the interactions between the 10-20 UDFs that might be involved in a query. Some of the queries seem to take way too long to run (up to 15 minutes).
While the tables are fairly well indexed, there is a lack of FK relationships and almost no referential integrity. The DB is updated infrequently with daily batches of low volume (1,000 records in multiple tables.) It is primarily used to serve as a data repository - I suppose a data warehouse. We get very infrequent deadlocks or delays.
So, my question is: If I want to re-implement the whole project including the database and front-end would it make sense to look at non-relational implementations? The primary DB is only about 1GB (.mdf) so it could fit easily in memory. I would like to move from the SQL query structure to some declarative model that could be efficiently compiled and executed. If necessary, I could use the SQL DB just as a data store.

Why do you want to move away from the relational approach? Any other approach is only going to bury the business logic deeper in the code. As you pointed out, the data model is fairly simple, so you could first look at improving the data model itself. There may be no referential integrity constraints because the initial designers assumed they would lower performance; they might be doing the checks in code that is itself inefficient.
Your DB is small, so adding referential integrity constraints is unlikely to affect performance noticeably. If required, you can rewrite some of the UDFs. Why don't you use a query analyzer to look at the performance metrics? That will give you a good starting point for analysis.
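To make the point about constraints and query plans concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the account/trade tables are hypothetical, not taken from the question. On SQL Server you would declare the same kind of FOREIGN KEY constraint in T-SQL and inspect the execution plan with the server's own tooling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE trade (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES account(id),
        amount REAL NOT NULL
    );
    CREATE INDEX ix_trade_account ON trade(account_id);
""")
conn.execute("INSERT INTO account (id, name) VALUES (1, 'Alpha')")
conn.execute("INSERT INTO trade (account_id, amount) VALUES (1, 250.0)")

# An orphaned row is rejected by the database itself, not by application code.
try:
    conn.execute("INSERT INTO trade (account_id, amount) VALUES (99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Ask the engine how it intends to run a query before guessing at bottlenecks.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM trade WHERE account_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # the 4th column holds the plan detail
print(orphan_rejected, plan_text)
```

The constraint costs one index lookup per inserted row, which is negligible for daily batches of about 1,000 records.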

If I want to re-implement the whole project including the database and front-end would it make sense to look at non-relational implementations?
In general, most developers, even those who breathe map/reduce and wear NoSQL T-shirts, feel a LOT more comfortable with SQL.
If your application follows the classic MVC/MVP model, then most frameworks (e.g. Spring, Rails, Grails, Django, Webmachine, etc.) come with first-class support for a SQL back end, and some support for a NoSQL one.
If you see no actual benefit that NoSQL can bring to your system (here are the benefits I posted to another question), why bother?
I would like to have a set of "english-language" rules that describe the transformations from the underlying raw data to a form that can be directly consumed by the application (web)
It seems that you are talking about a classic persistence layer with a service layer on top of it, where the "english-language" rules are just "english-language" methods in your service layer. You might need a more sophisticated rules engine, but most of the time it is not needed.
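A minimal sketch of that idea, assuming a hypothetical trade table and made-up rule names (none of this comes from the question): each business rule becomes one readably named method in a service class, and each magic number becomes a named constant instead of an uncommented literal buried in a UDF.

```python
import sqlite3


class PositionService:
    """Hypothetical service layer: each method names exactly one business rule."""

    # A named constant replaces an uncommented literal inside a 500-line UDF.
    LARGE_TRADE_THRESHOLD = 10_000.0

    def __init__(self, conn):
        self.conn = conn

    def total_exposure_for_account(self, account_id):
        row = self.conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM trade WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        return row[0]

    def large_trades_for_account(self, account_id):
        return self.conn.execute(
            "SELECT id, amount FROM trade "
            "WHERE account_id = ? AND amount >= ? ORDER BY id",
            (account_id, self.LARGE_TRADE_THRESHOLD),
        ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade (id INTEGER PRIMARY KEY, account_id INTEGER, amount REAL);
    INSERT INTO trade (account_id, amount) VALUES (1, 5000.0), (1, 12000.0), (2, 300.0);
""")
service = PositionService(conn)
exposure = service.total_exposure_for_account(1)
large = service.large_trades_for_account(1)
print(exposure, large)
```

The web templates then call the service methods and never see SQL, so a rule change touches one method.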

Related

How to structure many and/or complex SQL queries?

Having quite a lot of SQL queries, many of them ad hoc, my database has grown a bit messy. I have two problems:
It is hard to keep many different views/sprocs in good order when only using names to structure them
Subqueries (views that call other views, 2-4 levels deep) add to the structural mess and are hard to maintain
Now I would like to get a better structure in my database and/or in my C# code (could be Java/Python/Ruby/whatever). How?
Should I use "schemas" in SQL to separate views in different areas? Like namespaces.
Should I avoid having a lot of TSQL in the database altogether and instead keep the querying in my C#? That would move the database logic closer to the rest of the system, and would be much easier to maintain and keep under version control, but at the same time I appreciate being close to the data, to keep performance good (with the help of SQL Profiler).
Any other suggestion?
Update: the database and the C# projects are a few years old and have grown, and will continue to grow over time (different areas of functionality), and new projects will be added. I need to clean it up in a good way, or change strategy.
Ok, why is complexity in your C# code easier to manage than complexity in your TSQL?
It must be that either your tooling for or knowledge of C# is superior. You should address that.
You can adopt naming conventions, and organise files to aid in this task. Use development environments to support TSQL and source control. Make sure you can deploy new and upgraded database schemas programmatically. Just like you should with C#.
Without knowing the details of your project I can't specify an exact structure.
How should you decide where logic should be implemented?
At a generic level this is simple.
The database should perform set-based operations that can benefit from the use of indexes on your data. This code is easier to write in TSQL and other set-based query languages.
Your application/business layer should perform row-level operations, ideally in a stateless (Shared/static) fashion. This code is easier to write in C# and other procedural languages.
Scaling database servers is more difficult than a stateless application layer. How do you maintain synchronization across multiple machines?
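As a rough illustration of that split (a sketch using Python and sqlite3 rather than C# and SQL Server, with a hypothetical sale table): the aggregation runs as one set-based statement in the database, while the per-row formatting is a stateless function in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, amount_cents INTEGER);
    CREATE INDEX ix_sale_region ON sale(region);
    INSERT INTO sale (region, amount_cents) VALUES
        ('north', 1200), ('north', 800), ('south', 500);
""")

# Set-based work stays in the database, where it can use indexes.
totals = conn.execute(
    "SELECT region, SUM(amount_cents) FROM sale GROUP BY region ORDER BY region"
).fetchall()


def format_total(region, cents):
    """Stateless row-level rule: depends only on its arguments."""
    return f"{region.title()}: {cents / 100:.2f}"


# Row-level work happens in the application layer, one row at a time.
report = [format_total(region, cents) for region, cents in totals]
print(report)
```

Because format_total holds no state, any number of application servers can run it, which is the scaling argument made above.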
There is no exact right answer, just many shades of grey. The best solution will be based on your requirements. Start with the de facto Microsoft stack (VS 2013, SQL Server 2012, C#, EF6) and embellish or simplify from there.
I just found this article by Martin Fowler "Domain Logic and SQL" http://martinfowler.com/articles/dblogic.html .
Performance first?
"One of the first questions people consider with this kind of thing is performance. Personally I don't think performance should be the first question. My philosophy is that most of the time you should focus on writing maintainable code. Then use a profiler to identify hot spots and then replace only those hot spots with faster but less clear code"
Maintainability
"For any long-lived enterprise application, you can be sure of one thing - it's going to change a lot. As a result you have to ensure that the system is organized in such a way that's easy to change. Modifiability is probably the main reason why people put business logic in memory [= application code instead of TSQL]."
Encapsulation
"Using views, or indeed stored procedures, provides encapsulation only up to a point. In many enterprise applications data comes from multiple sources, not just multiple relational databases, but also legacy systems, other applications, and files. [...] In this case full encapsulation really can only be done by a layer within the application code, which further implies that domain logic should also sit in memory."

How to go from a full SQL querying to something like a NoSQL?

In one of my processes I have a SQL query that takes 10-20% of the total execution time. This query filters my database and loads a list of PricingGrid objects.
So I want to improve its performance.
So far I have thought of 2 solutions:
Use a NoSQL solution; AFAIK these are good for improving read performance.
But the migration seems hard and needs a lot of work (like importing the data from SQL Server to NoSQL on a regular basis)
I don't have any knowledge of NoSQL; I don't even know which one I should use (the first I'd try is RavenDB because I follow Ayende and it's made by the .NET community).
I might have to change some things in my model to make my objects suitable for a NoSQL database
Load all my PricingGrid objects in memory (in a static IEnumerable)
This might be a problem if my server doesn't have enough memory to load everything
I might reinvent the wheel (indexes...) invented by the NoSQL providers
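For reference, the second option can be sketched in a few lines. This is Python rather than .NET, and the PricingGrid fields below are hypothetical because the real class is not shown. The "index" being reinvented is just a dictionary keyed by whatever the query filters on.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingGrid:
    # Hypothetical fields; the question does not describe the real class.
    grid_id: int
    product_code: str
    price: float


class PricingGridCache:
    """Loads all grids once and indexes them by product code."""

    def __init__(self, grids):
        self._by_product = defaultdict(list)
        for grid in grids:
            self._by_product[grid.product_code].append(grid)

    def for_product(self, product_code):
        # Dictionary lookup instead of scanning the whole list on every call.
        return list(self._by_product.get(product_code, []))


cache = PricingGridCache([
    PricingGrid(1, "A", 10.0),
    PricingGrid(2, "B", 12.5),
    PricingGrid(3, "A", 11.0),
])
ids_for_a = [grid.grid_id for grid in cache.for_product("A")]
print(ids_for_a)
```

Every additional filter column needs its own dictionary, and the cache has to be refreshed when the table changes, which is where the "reinventing the wheel" concern comes from.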
I think I'm not the first one wondering about this, so what would be the best solution? Are there any tools that could help me?
.net 3.5, SQL Server 2005, windows server 2005
Migrating your data from SQL is only the first step.
Moving to a document store (like RavenDB or MongoDB) also means that you need to:
Denormalize your data
Perform schema validation in your code
Handle concurrency of complex operations in your code since you no longer have transactions (at least not the same way)
Perform rollbacks in the event of partial commits (changes)
Depending on your updates, reads and network model you might also need to handle conflicts
You provided very limited information but it sounds like your needs include a single database server and that your data fits well in the relational model.
In such a case I would vote against a NoSQL solution, it is more likely that you can speed up your queries with database optimizations and still retain all the added value of a RDBMS.
Non-relational databases are tools for a specific job (no matter how they sell them), if you need them it is usually because your data doesn't fit well in the relational model or if you have a need to distribute your data over multiple machines (size or availability). For instance, I use MongoDB for a write-intensive high throughput job management application. It is centralized and the data is very transient so the "cost" of having low durability is acceptable. This doesn't sound like the case for you.
If you prefer to use a NoSQL solution, perhaps you should try Memcached+MySQL (InnoDB). This will allow you to get the speed benefits of an in-memory cache (in the form of a memcached daemon plugin) with the underlying protection and capabilities of an RDBMS (MySQL). It should also ease data migration and somewhat reduce the amount of changes required in your code.
I myself have never used it; I find that I either need NoSQL for the reasons I stated above or that I can optimize the RDBMS using stored procedures, indexes and table views in a way which is sufficient for my needs.
Asaf has provided great information in regards to the usage of NoSQL and when it is most appropriate. Given that your main concern was performance, I would tend to agree with his opinion - it would take you much more time and effort to adopt a completely new (and very different) data persistence platform than it would to trick out your SQL Server cluster. That said, my answer is mainly to address the "how" part of your question.
Addressing misunderstandings:
Denormalizing Data - You do not need to manually denormalize your existing data. This will be done for you when it is migrated over. More than anything you need to simply think about your data in a different fashion - root aggregates, entity and value types, etc.
Concurrency/Transactions - Transactions are possible in both Mongo and Raven, they are simply done in a different fashion. One of the inherent ways Raven does this is by using an ORM-like "unit of work" pattern with its RavenSession objects. Yes, your data validation needs to be done in code, but you already should be doing it there anyway. In my experience this is an over-hyped con.
How:
Install Raven or Mongo on a primary server, run it as a service.
Create or extend an existing application that uses the database you intend to port. This application needs all the model classes/libraries that your SQL database provides persistence for.
a. In your "data layer" you likely have a repository class somewhere. Extract an interface from this, and use it to build another repository class for your Raven/Mongo persistence. Both DBs have plenty of good documentation for using their APIs to push/pull/update changes in the document graphs. It's pretty damn simple.
b. Load your SQL data into C# objects in memory. Pull back your top-level objects (just the entities) and load their inner collections and related data in memory. Your repository is probably already doing this (e.g. when fetching an Order object, ensure not only its properties but also associated collections like Items are loaded in memory).
c. Instantiate your Raven/Mongo repository and push the data to it. Primary entities become "top level documents" or "root aggregates" serialized in JSON, and their collections' data nested within. Save changes and close the repository. Note: You may break this step down into as many little pieces as your data deems necessary.
Once your data is migrated, play around with it and ensure you are satisfied. You may want to modify your application Models a little to adjust the way they are persisted to Raven/Mongo - for instance you may want to make both Orders and Items top-level documents and simply use reference values (much like relationships in RDBMS systems). Watch out here though, as doing so sort of goes against the principle and performance behind NoSQL, as now you have to tap the DB twice to get the Order and the Items.
If satisfied, shard/replicate your mongo/raven servers across your remaining available server boxes.
Obviously there are tons of little details I did not explain, but that is the general process, and much of it depends on the applications already consuming the database and may be tricky if more than one app/system talks to it.
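Steps a to c can be sketched roughly as follows. This is an illustration in Python, not C#. The orders/order_item tables are hypothetical, and InMemoryDocumentRepository is a stand-in that stores JSON strings in a dictionary. It is not the RavenDB or MongoDB client API, which you would call inside your own repository implementation.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class OrderRepository(ABC):
    """Step a: the interface extracted from the existing repository."""

    @abstractmethod
    def save(self, order): ...

    @abstractmethod
    def get(self, order_id): ...


class InMemoryDocumentRepository(OrderRepository):
    """Stand-in for a document store: one JSON document per root aggregate."""

    def __init__(self):
        self.documents = {}

    def save(self, order):
        self.documents[f"orders/{order['id']}"] = json.dumps(order)

    def get(self, order_id):
        return json.loads(self.documents[f"orders/{order_id}"])


def load_orders(conn):
    """Step b: load each Order together with its Items collection."""
    orders = []
    for order_id, customer in conn.execute(
        "SELECT id, customer FROM orders ORDER BY id"
    ).fetchall():
        items = [
            {"sku": sku, "quantity": quantity}
            for sku, quantity in conn.execute(
                "SELECT sku, quantity FROM order_item WHERE order_id = ? ORDER BY id",
                (order_id,),
            )
        ]
        orders.append({"id": order_id, "customer": customer, "items": items})
    return orders


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_item (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT, quantity INTEGER);
    INSERT INTO orders (id, customer) VALUES (1, 'Acme'), (2, 'Birch');
    INSERT INTO order_item (order_id, sku, quantity) VALUES (1, 'X1', 2), (1, 'Y2', 1), (2, 'X1', 5);
""")

# Step c: push each aggregate to the new store as one nested document.
target = InMemoryDocumentRepository()
for order in load_orders(conn):
    target.save(order)
print(target.get(1))
```

Note how the Items end up nested inside the Order document: the denormalization happens as a side effect of serializing the object graph.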
Lastly, just to reiterate what Asaf said... learn as much as you can about NoSQL and its best use-cases. It is an amazing tool, but not a golden solution for all data persistence. In your case try to really find the bottlenecks in your current solution and see if they are solvable. As one of my systems guys says, "technology for technology's sake is bullshit"

Does an ORM for social networking sites make any sense?

The reason why I ask this is because I need to know whether not using ORM for a social networking site makes any sense at all.
My arguments for why an ORM does not fit social networking sites are:
Social networking sites are not a product, thus you don't need to support multiple databases. You know what database to use, and you most likely won't change it every now and then.
Social networking sites require many-to-many relationships between users, and in the end you will sometimes need to write plain SQL to get those relations. The value of an ORM is thus decreased again.
Related to the previous point, an ORM sometimes runs multiple queries in the backend to fetch its records, which may be inefficient and may cause a bottleneck in the database. In the end you have to write plain SQL queries. If we know we are going to write plain SQL anyway, what is the point of using an ORM?
This is my limited understanding based on my limited experience. What is your experience with building social networking sites? Are my points valid? Is it lame to use bare SQL without worrying about using an ORM? At what points may an ORM help in building a social networking site?
The value of using an ORM is to help speed up development, by automating the tedious work of assigning query results to object fields, and tracking changes to object fields so you can save them to the database. Hence the term Object-Relational Mapping.
An ORM has little value for you regarding database portability, since you only use the one database you deploy on.
The runtime performance aspect of an ORM is no better than, and typically much worse than writing plain SQL yourself. The generic methods of query generation often make naive mistakes and result in redundant queries, as you have mentioned. Again, the benefit is in development time, not runtime efficiency.
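The redundant-query behaviour is commonly called the N+1 pattern. The sketch below does not use a real ORM. It imitates naive lazy loading by hand with sqlite3, on hypothetical author/post tables, and counts the statements issued, so the only claim is about the query counts.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many statements are sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


raw = sqlite3.connect(":memory:")
raw.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO post (author_id, title) VALUES (1, 'a1'), (1, 'a2'), (2, 'b1');
""")

# Naive lazy loading: one query for the authors, then one more per author.
db = CountingConnection(raw)
lazy = {}
for author_id, name in db.query("SELECT id, name FROM author ORDER BY id"):
    rows = db.query(
        "SELECT title FROM post WHERE author_id = ? ORDER BY id", (author_id,)
    )
    lazy[name] = [title for (title,) in rows]
lazy_queries = db.count

# Hand-written SQL: the same data in a single joined query.
db = CountingConnection(raw)
joined = {}
for name, title in db.query(
    "SELECT a.name, p.title FROM author a "
    "LEFT JOIN post p ON p.author_id = a.id ORDER BY a.id, p.id"
):
    joined.setdefault(name, [])
    if title is not None:
        joined[name].append(title)
join_queries = db.count
print(lazy_queries, join_queries)
```

Most ORMs offer some form of eager loading that produces the joined version, but you have to ask for it.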
Using an ORM versus not using an ORM doesn't seem to make a huge difference for scalability. Other techniques with more bang-for-the-buck for scalability include:
Managing indexes in the RDBMS. Improve as many algorithms as possible from O(n) to O(log n).
Intelligent caching architecture.
Horizontal scaling by database partitioning/sharding.
Database load-balancing and replication. Read from slave databases where possible, and write to a single master database. Index slaves and masters differently.
Supplement the RDBMS with complementary technology, such as Sphinx Search.
Vertical scaling by throwing hardware at the problem. Jeff Atwood has commented about this on the StackOverflow podcast.
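To illustrate the first item in the list above, here is a small step-counting sketch. It is only an analogy: a real RDBMS index is typically a B-tree rather than a sorted Python list, but the growth rates of an unindexed scan versus an indexed lookup behave in the same way.

```python
def linear_steps(keys, target):
    """Unindexed lookup: examine rows one by one until the target is found."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            break
    return steps


def binary_steps(sorted_keys, target):
    """Indexed lookup: halve the search range on every comparison."""
    low, high, steps = 0, len(sorted_keys), 0
    while low < high:
        steps += 1
        mid = (low + high) // 2
        if sorted_keys[mid] < target:
            low = mid + 1
        else:
            high = mid
    return steps


keys = list(range(1_000_000))
scan_cost = linear_steps(keys, 999_999)
index_cost = binary_steps(keys, 999_999)
print(scan_cost, index_cost)
```

For a million keys the scan needs a million comparisons in the worst case, while the halving search needs at most about twenty.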
Some people advocate moving your data management to a distributed architecture using cloud computing or distributed non-relational databases. This is probably not necessary until you get a very large number of users. Once you grow to a certain level of magnitude, all the rules change and you probably can't use an RDBMS anyway. But unless you are the data architect at Yahoo or Facebook or LinkedIn, don't worry about it -- cloud computing is over-hyped.
There's a common wisdom that the database is always the bottleneck in web apps, but there's also a case that improving efficiency on the front-end is at least as important. Cf. books by Steve Souders.
Julia Lerman in Programming Entity Framework (2009), p.503 shows that there's a 220% increase in query execution cost between using a DataReader directly and using Microsoft’s LINQ to Entities.
Also see Jeff Atwood's post on All Abstractions are Failed Abstractions, where he shows that using LINQ is at least double the cost of using plain SQL even in a naive way.
Here's my response to your points:
An ORM does not need multiple databases to be effective; in fact, most ORM usage has nothing to do with the ability to adapt to different databases.
Most modern ORM frameworks are flexible enough to fetch 'lightweight' variants of mapped classes; it really depends on how you implement them.
If really required, you can write native SQL queries within the ORM frameworks. Do note that caching and performance-related algorithms are often part of these frameworks.
IMO, an ORM helps you write cleaner, clearer code. If you use it sloppily you can cause excessive queries, but that isn't a rule by any means. If I were you I would start using the ORM and best practices of a framework, and only drop to SQL if you find yourself needing functionality that the ORM does not provide.
Also note that in web applications, many people are moving away from SQL databases. An ORM might help you to migrate to a non-relational database (precisely because you do not have SQL in your application code). Look at the use of JDO and JPA in Google's App Engine.
IMHO, an ORM is needed.
It allows you to access the database in an OOP way, whether you use multiple databases or not.
It gives cleaner code: you can define all methods related to a particular table in the table class file. If you need a raw SQL join query, no problem, define it there. It follows DRY and KISS, and is much better than writing similar raw SQL queries again and again.
The odds of your site being big enough that scaling becomes an issue are quite small, so why prematurely optimize by doing everything in raw SQL instead of an ORM? You can get fairly far by throwing better hardware at a database, assuming the database and application design are decent. While you may need to write raw SQL for things like creating friend graphs, what about all the little things like updating the database when someone changes their email, sends a private message, uploads a photo, etc.? Using an ORM can simplify all the simple database tasks you will have to do while still allowing you to hand-code where absolutely necessary.

What are the advantages of using an ORM? [closed]

As a web developer looking to move from hand-coded PHP sites to framework-based sites, I have seen a lot of discussion about the advantages of one ORM over another. It seems to be useful for projects of a certain (?) size, and even more important for enterprise-level applications.
What does it give me as a developer? How will my code differ from the individual SELECT statements that I use now? How will it help with DB access and security? How does it find out about the DB schema and user credentials?
Edit: @duffymo pointed out what should have been obvious to me: ORM is only useful for OOP code. My code is not OO, so I haven't run into the problems that ORM solves.
I'd say that if you aren't dealing with objects there's little point in using an ORM.
If your relational tables/columns map 1:1 with objects/attributes, there's not much point in using an ORM.
If your objects don't have any 1:1, 1:m or m:n relationships with other objects, there's not much point in using an ORM.
If you have complex, hand-tuned SQL, there's not much point in using an ORM.
If you've decided that your database will have stored procedures as its interface, there's not much point in using an ORM.
If you have a complex legacy schema that can't be refactored, there's not much point in using an ORM.
So here's the converse:
If you have a solid object model, with relationships between objects that are 1:1, 1:m, and m:n, don't have stored procedures, and like the dynamic SQL that an ORM solution will give you, by all means use an ORM.
Decisions like these are always a choice. Choose, implement, measure, evaluate.
ORMs are being hyped as the solution to data access problems. Personally, after having used them in an enterprise project, I find they are far from being the solution for enterprise application development. Maybe they work in small projects. Here are the problems we have experienced with them, specifically with nHibernate:
Configuration: ORM technologies require configuration files to map table schemas into object structures. In large enterprise systems the configuration grows very quickly and becomes extremely difficult to create and manage. Maintaining the configuration also gets tedious and unmaintainable as business requirements and models constantly change and evolve in an agile environment.
Custom Queries: The ability to map custom queries that do not fit into any defined object is either not supported or not recommended by the framework providers. Developers are forced to find work-arounds by writing adhoc objects and queries, or writing custom code to get the data they need. They may have to use Stored Procedures on a regular basis for anything more complex than a simple Select.
Proprietary binding: These frameworks require the use of proprietary libraries and proprietary object query languages that are not standardized in the computer science industry. These proprietary libraries and query languages bind the application to the specific implementation of the provider with little or no flexibility to change if required and no interoperability to collaborate with each other.
Object Query Languages: New query languages called Object Query Languages are provided to perform queries on the object model. They automatically generate SQL queries against the database and the user is abstracted from the process. To object-oriented developers this may seem like a benefit since they feel the problem of writing SQL is solved. The problem in practice is that these query languages cannot support some of the intermediate to advanced SQL constructs required by most real-world applications. They also prevent developers from tweaking the SQL queries if necessary.
Performance: The ORM layers use reflection and introspection to instantiate and populate the objects with data from the database. These are costly operations in terms of processing and add to the performance degradation of the mapping operations. The object queries are translated into unoptimized SQL queries without the option of tuning them, causing significant performance losses and overloading the database management system. Performance tuning the SQL is almost impossible since the frameworks provide little flexibility over controlling the SQL that gets autogenerated.
Tight coupling: This approach creates a tight dependency between model objects and database schemas. Developers don't want a one-to-one correlation between database fields and class fields. Changing the database schema has ripple effects in the object model and mapping configuration, and vice versa.
Caches: This approach also requires the use of object caches and contexts that are necessary to maintain and track the state of the object and reduce database roundtrips for the cached data. These caches, if not maintained and synchronized in a multi-tiered implementation, can have significant ramifications in terms of data accuracy and concurrency. Often third-party caches or external caches have to be plugged in to solve this problem, adding extensive burden to the data-access layer.
For more information on our analysis you can read:
http://www.orasissoftware.com/driver.aspx?topic=whitepaper
At a very high level: ORMs help to reduce the Object-Relational impedance mismatch. They allow you to store and retrieve full live objects from a relational database without doing a lot of parsing/serialization yourself.
What does it give me as a developer?
For starters it helps you stay DRY. Either your schema or your model classes are authoritative and the other is automatically generated, which reduces the number of bugs and the amount of boilerplate code.
It helps with marshaling. ORMs generally handle marshaling the values of individual columns into the appropriate types so that you don't have to parse/serialize them yourself. Furthermore, it allows you to retrieve fully formed objects from the DB rather than simply row objects that you have to wrap yourself.
How will my code differ from the individual SELECT statements that I use now?
Since your queries will return objects rather than just rows, you will be able to access related objects using attribute access rather than creating a new query. You are generally able to write SQL directly when you need to, but for most operations (CRUD) the ORM will make the code for interacting with persistent objects simpler.
How will it help with DB access and security?
Generally speaking, ORMs have their own API for building queries (eg. attribute access) and so are less vulnerable to SQL injection attacks; however, they often allow you to inject your own SQL into the generated queries so that you can do strange things if you need to. Such injected SQL you are responsible for sanitizing yourself, but, if you stay away from using such features then the ORM should take care of sanitizing user data automatically.
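The difference between sanitized and hand-injected SQL can be shown without any ORM. The sketch below uses Python's sqlite3 module and a hypothetical users table. The parameterized call is what query builders typically generate for you, and the concatenated string is what you risk when you splice your own SQL fragments in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, is_admin INTEGER);
    INSERT INTO users (name, is_admin) VALUES ('alice', 0), ('root', 1);
""")

malicious = "nobody' OR '1'='1"

# Unsafe: user input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the driver sends the value as a bound parameter, never as SQL text.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The unsafe query returns every user because the injected OR clause is always true; the parameterized one looks for a user literally named "nobody' OR '1'='1" and finds none.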
How does it find out about the DB schema and user credentials?
Many ORMs come with tools that will inspect a schema and build up a set of model classes that allow you to interact with the objects in the database. [Database] user credentials are generally stored in a settings file.
If you write your data access layer by hand, you are essentially writing your own feature poor ORM.
Oren Eini has a nice blog post which sums up what essential features you may need in your DAL/ORM and why writing your own becomes a bad idea over time:
http://ayende.com/Blog/archive/2006/05/12/25ReasonsNotToWriteYourOwnObjectRelationalMapper.aspx
EDIT: The OP has commented in other answers that his code base isn't very object oriented. Dealing with object mapping is only one facet of ORMs. The Active Record pattern is a good example of how ORMs are still useful in scenarios where objects map 1:1 to tables.
Top Benefits:
Database Abstraction
API-centric design mentality
High Level == Less to worry about at the fundamental level (it's been thought of for you)
I have to say, working with an ORM is really the evolution of database-driven applications. You worry less about the boilerplate SQL you always write, and more on how the interfaces can work together to make a very straightforward system.
I love not having to worry about INNER JOIN and SELECT COUNT(*). I just work in my high level abstraction, and I've taken care of database abstraction at the same time.
Having said that, I have never really run into an issue where I realistically needed to run the same code on more than one database system at a time. However, that's not to say that case doesn't exist; it's a very real problem for some developers.
I can't speak for other ORM's, just Hibernate (for Java).
Hibernate gives me the following:
Automatically updates schema for tables on production system at run-time. Sometimes you still have to update some things manually yourself.
Automatically creates foreign keys which keeps you from writing bad code that is creating orphaned data.
Implements connection pooling. Multiple connection pooling providers are available.
Caches data for faster access. Multiple caching providers are available. This also allows you to cluster together many servers to help you scale.
Makes database access more transparent so that you can easily port your application to another database.
Makes queries easier to write. The following query, which would normally require you to write 'join' three times, can be written like this:
"from Invoice i where i.customer.address.city = ?" retrieves all invoices for a specific city.
A list of Invoice objects is returned. I can then call invoice.getCustomer().getCompanyName(); if the data is not already in the cache, the database is queried automatically in the background.
You can reverse-engineer a database to create the hibernate schema (haven't tried this myself) or you can create the schema from scratch.
There is of course a learning curve as with any new technology but I think it's well worth it.
When needed you can still drop down to the lower SQL level to write an optimized query.
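For comparison, this is roughly the hand-written SQL that a path expression like i.customer.address.city stands for. It is a sketch in Python with sqlite3. The invoice/customer/address tables and their columns are assumptions, since the answer does not show the actual mapping, and the SQL Hibernate generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, company_name TEXT,
                           address_id INTEGER REFERENCES address(id));
    CREATE TABLE invoice (id INTEGER PRIMARY KEY,
                          customer_id INTEGER REFERENCES customer(id));
    INSERT INTO address (id, city) VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO customer (id, company_name, address_id) VALUES (1, 'Acme', 1), (2, 'Birch', 2);
    INSERT INTO invoice (id, customer_id) VALUES (1, 1), (2, 2), (3, 1);
""")


def invoices_in_city(conn, city):
    """Each dot in the path expression becomes one explicit JOIN here."""
    return conn.execute(
        """
        SELECT i.id, c.company_name
        FROM invoice i
        JOIN customer c ON c.id = i.customer_id
        JOIN address a ON a.id = c.address_id
        WHERE a.city = ?
        ORDER BY i.id
        """,
        (city,),
    ).fetchall()


oslo = invoices_in_city(conn, "Oslo")
print(oslo)
```

This is also the kind of query you would write when dropping down to SQL for an optimized version.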
Most databases in use are relational databases, which do not directly translate to objects. What an Object-Relational Mapper does is take the data and create a shell around it with utility functions for updating, removing, inserting, and other operations that can be performed. So instead of thinking of it as an array of rows, you now have a list of objects that you can manipulate as you would any other and simply call obj.Save() when you're done.
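A hand-rolled sketch of that "shell around the data" idea, in Python with sqlite3. The Customer class and its table are hypothetical, and a real ORM would generate the SQL from a mapping instead of having it written out by hand like this.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    name: str
    email: str
    id: Optional[int] = None  # None until the row has been inserted

    def save(self, conn):
        """Insert on first save, update afterwards."""
        if self.id is None:
            cursor = conn.execute(
                "INSERT INTO customer (name, email) VALUES (?, ?)",
                (self.name, self.email),
            )
            self.id = cursor.lastrowid
        else:
            conn.execute(
                "UPDATE customer SET name = ?, email = ? WHERE id = ?",
                (self.name, self.email, self.id),
            )

    @classmethod
    def find(cls, conn, customer_id):
        row = conn.execute(
            "SELECT name, email, id FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return cls(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

ann = Customer("Ann", "ann@example.com")
ann.save(conn)                      # INSERT
ann.email = "ann@new.example.com"
ann.save(conn)                      # UPDATE of the same row
reloaded = Customer.find(conn, ann.id)
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(reloaded, row_count)
```

The calling code only touches attributes and save(), which is the whole appeal.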
I suggest you take a look at some of the ORM's that are in use, a favourite of mine is the ORM used in the python framework, django. The idea is that you write a definition of how your data looks in the database and the ORM takes care of validation, checks and any mechanics that need to run before the data is inserted.
What does it give me as a developer?
Saves you time, since you don't have to code the db access portion.
How will my code differ from the individual SELECT statements that I use now?
You will use either attributes or xml files to define the class mapping to the database tables.
How will it help with DB access and security?
Most frameworks try to adhere to db best practices where applicable, such as parametrized SQL and such. Because the implementation detail is coded in the framework, you don't have to worry about it. For this reason, however, it's also important to understand the framework you're using, and be aware of any design flaws or bugs that may open unexpected holes.
How does it find out about the DB schema and user credentials?
You provide the connection string as always. The framework providers (e.g. SQL, Oracle, MySQL specific classes) provide the implementation that queries the db schema, processes the class mappings, and renders / executes the db access code as necessary.
Personally I've not had a great experience with using ORM technology to date. I'm currently working for a company that uses nHibernate and I really can't get on with it. Give me a stored proc and DAL any day! More code, sure ... but also more control and code that's easier to debug. I should add that my experience comes from using an early version of nHibernate.
Using an ORM will remove dependencies from your code on a particular SQL dialect. Instead of directly interacting with the database you'll be interacting with an abstraction layer that provides insulation between your code and the database implementation. Additionally, ORMs typically provide protection from SQL injection by constructing parameterized queries. Granted you could do this yourself, but it's nice to have the framework guarantee.
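As a small illustration of why parameterized queries matter, the sketch below compares a string-concatenated query with a parameterized one. It uses Python's built-in `sqlite3` module as a stand-in database; the `account` table and the input string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
conn.executemany("INSERT INTO account (owner) VALUES (?)", [("alice",), ("bob",)])

# Attacker-controlled input that tries to widen the WHERE clause.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and becomes part of the query.
unsafe_sql = "SELECT owner FROM account WHERE owner = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed as a bound parameter and is only ever compared as data.
safe_rows = conn.execute(
    "SELECT owner FROM account WHERE owner = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # every row leaks
print(len(safe_rows))    # no owner has that literal name
```

An ORM that builds its statements this way gives you the second behaviour by default, which is the guarantee referred to above.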
ORMs work in one of two ways: some discover the schema from an existing database (the LINQToSQL designer does this), while others require you to map your class onto a table. In both cases, once the schema has been mapped, the ORM may be able to create (or recreate) your database structure for you. DB permissions probably still need to be applied by hand or via custom SQL.
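The "discover the schema" approach can be sketched in a few lines. This is only an illustration of the idea, assuming SQLite (whose `PRAGMA table_info` lists a table's columns); the `trade` table and the `discover_class` helper are hypothetical, and real tools read much richer metadata such as types, keys, and relationships.

```python
import sqlite3
from collections import namedtuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trade (id INTEGER PRIMARY KEY, symbol TEXT, quantity INTEGER)"
)
conn.execute("INSERT INTO trade (symbol, quantity) VALUES ('MSFT', 100)")


def discover_class(conn, table):
    """Build a record class whose fields mirror the table's columns.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    # Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return namedtuple(table.capitalize(), columns)


Trade = discover_class(conn, "trade")
trades = [Trade(*row) for row in conn.execute("SELECT * FROM trade")]
print(trades[0].symbol, trades[0].quantity)
```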
Typically, the credentials are supplied programmatically via the API, through a configuration file, or both: defaults come from the configuration file but can be overridden in code.
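A minimal sketch of the "defaults from a file, overrides in code" pattern, using Python's standard `configparser`. The section name, keys, and the `load_db_settings` helper are assumptions for illustration, not the configuration format of any particular ORM.

```python
import configparser

# Stand-in for the contents of a configuration file on disk.
DEFAULT_CONFIG = """
[database]
host = localhost
user = app_user
password = change-me
"""


def load_db_settings(config_text, **overrides):
    """Read defaults from the config text, then apply overrides given in code."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    settings = dict(parser["database"])
    settings.update(overrides)
    return settings


# The file supplies host and password; the caller swaps in a different user.
settings = load_db_settings(DEFAULT_CONFIG, user="report_user")
```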
While I agree with the accepted answer almost completely, I think it can be amended with lightweight alternatives in mind.
If you have complex, hand-tuned SQL
If your objects don't have any 1:1, 1:m or m:n relationships with other objects
If you have a complex legacy schema that can't be refactored
...then you might benefit from a lightweight ORM where SQL is not obscured or abstracted to the point where it is easier to write your own database integration.
These are a few of the many reasons why the developer team at my company decided that we needed a more flexible abstraction that sits on top of JDBC.
There are many open source alternatives around that accomplish similar things, and jORM is our proposed solution.
I would recommend evaluating a few of the strongest candidates before choosing a lightweight ORM. They differ slightly in how they abstract the database, even though they might look similar from a top-down view.
jORM
ActiveJDBC
ORMLite
My concern with ORM frameworks is probably the very thing that makes them attractive to lots of developers: namely, that they obviate the need to 'care' about what's going on at the DB level. Most of the problems that we see during the day-to-day running of our apps are database problems. I worry slightly that in a world that is 100% ORM, people won't know what queries are hitting the database, or if they do, they won't be sure how to change or optimize them.
{I realize this may be a controversial answer :) }

Stored procedures or OR mappers?

Which is better? Or should you use an OR mapper with SP's? If you have a system with SP's already, is an OR mapper worth it?
I like ORM's because you don't have to reinvent the wheel. That being said, it completely depends on your application needs, development style and that of the team.
This question has already been covered: Why is parameterized SQL generated by NHibernate just as fast as a stored procedure?
There is nothing good to be said about stored procedures. They were a necessity 10 years ago, but every single benefit of using sprocs is no longer valid. The two most common arguments are regarding security and performance. The "sending stuff over the wire" crap doesn't hold either; I can certainly create a query dynamically to do everything on the server too. One thing the sproc proponents won't tell you is that it makes updates impossible if you are using column conflict resolution on a merge publication. Only DBAs who think they are the database overlord insist on sprocs because it makes their job look more impressive than it really is.
This has been discussed at length on previous questions.
What are the pros and cons to keeping SQL in Stored Procs versus Code
At my work, we mostly do line of business apps - contract work.
For this type of business, I'm a huge fan of ORM. About four years ago (when the ORM tools were less mature) we studied up on CSLA and rolled our own simplified ORM tool that we use in most of our applications, including some enterprise-class systems that have 100+ tables.
We estimate that this approach (which of course includes a lot of code generation) creates a time savings of up to 30% in our projects. Seriously, it's ridiculous.
There is a small performance trade-off, but it's insubstantial as long as you have a decent understanding of software development. There are always exceptions that require flexibility.
For instance, extremely data-intensive batch operations should still be handled in specialized sprocs if possible. You probably don't want to send 100,000 huge records over the wire if you could do it in a sproc right on the database.
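The difference can be sketched as follows. SQLite runs in-process, so nothing actually crosses a network here; the point is only to contrast the shape of the two approaches. The `position` table and the doubling and halving of prices are made up for the example, and a single set-based statement stands in for the work a sproc would do on the server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE position (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO position (price) VALUES (?)", [(100.0,)] * 1000)

# Row-by-row: every record is fetched to the client, changed there,
# and written back with its own statement.
round_trips = 0
for row_id, price in conn.execute("SELECT id, price FROM position").fetchall():
    conn.execute("UPDATE position SET price = ? WHERE id = ?", (price * 2, row_id))
    round_trips += 1

# Set-based: one statement, and the data never leaves the database engine.
cur = conn.execute("UPDATE position SET price = price / 2")
rows_touched = cur.rowcount

print(round_trips, rows_touched)
```

With a remote server, the first loop pays network latency once per row, which is the cost the paragraph above warns about.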
This is the type of problem that newbie devs run into whether they're using ORM or not. They just have to see the results and if they're competent, they will get it.
What we've seen in our web apps is that the performance bottlenecks that are hardest to solve are usually no longer database-related, even with ORM. Rather, they're on the front-end (browser) due to bandwidth, AJAX overhead, etc. Even mid-range database servers are incredibly powerful these days.
Of course, other shops who work on much larger high-demand systems may have different experiences there. :)
Stored procedures hands down. OR Mappers are language specific, and often add graphic slowdowns.
Stored procedures mean you're not limited by the language interface, and you can simply tack on new interfaces to the database in forwards-compatible ways.
My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases. Database developers should realize the tasks people are trying to achieve with complicated OR-Mappers and create server-side utilities that assist in performing this task.
OR Mappers are also epic targets of the "leaky abstraction" syndrome (Joel On Software: Leaky Abstractions), where it's quite easy to find things the mapper just can't handle because the abstraction layer isn't psychic.
Stored procedures are better, in my view, because they can have an independent security configuration from the underlying tables.
This means you can allow specific operations without allowing writes/reads to specific tables. It also limits the damage that people can do if they discover a SQL injection exploit.
Definitely ORMs. More flexible, more portable (generally they tend to have portability built in). In case of slowness you may want to use caching or hand-tuned SQL in hot spots.
Generally stored procedures have several problems with maintainability.
separate from application (so many changes have now to be made in two places)
generally harder to change
harder to put under version control
harder to make sure they're updated (deployment issues)
portability (already mentioned)
I personally have found that SP's tend to be faster performance wise, at least for the large data items that I execute on a regular basis. But I know many people that swear by OR tools and wouldn't do ANYTHING else.
I would argue that using an OR mapper will increase readability and maintainability of your applications source code, while using SP will increase the performance of the application.
They are not actually mutually exclusive, though to your point they usually are so.
The advantage of using Object Relational mapping is that you can swap out data sources. Not only the database structure; you could use any data source. With the advent of web services / Service-oriented architecture / ESBs, in a larger corporation it would be wise to consider having a higher-level separation of concerns than what you could get in stored procedures. However, in smaller companies and in applications that will never use a different data source, SP's can fit the bill fine. And one last point: it is not necessary to use an OR mapper to get the abstraction. My former team had great success by simply using an adapter model with Spring.NET to plug in the data source.
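The adapter idea does not depend on Spring.NET; a rough Python sketch of the same shape is below. All names (`CustomerSource`, `SqlCustomerSource`, `InMemoryCustomerSource`, `greeting`) are hypothetical, and the in-memory adapter stands in for any non-database source such as a web service.

```python
import sqlite3
from typing import Dict, Optional, Protocol


class CustomerSource(Protocol):
    """The only interface the application code depends on."""

    def find_name(self, customer_id: int) -> Optional[str]: ...


class SqlCustomerSource:
    """Adapter backed by a SQL database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def find_name(self, customer_id: int) -> Optional[str]:
        row = self.conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class InMemoryCustomerSource:
    """Adapter backed by a plain dict, standing in for any other data source."""

    def __init__(self, names: Dict[int, str]) -> None:
        self.names = names

    def find_name(self, customer_id: int) -> Optional[str]:
        return self.names.get(customer_id)


def greeting(source: CustomerSource, customer_id: int) -> str:
    """Application logic that never learns where the data comes from."""
    name = source.find_name(customer_id)
    return f"Hello, {name}" if name else "Unknown customer"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")

from_sql = greeting(SqlCustomerSource(conn), 1)
from_memory = greeting(InMemoryCustomerSource({1: "Ada"}), 1)
missing = greeting(InMemoryCustomerSource({}), 1)
```

Swapping the data source is then a matter of passing a different adapter, with or without an OR mapper behind it.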
# Kent Fredrick
"My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases"
I think you're talking about the difference between the relational model and the object-oriented model. This is actually why we need ORMs, but these models were implemented the way they are on purpose. It is not a design flaw; it is just how things turned out historically.
Use stored procedures where you have identified a performance bottleneck. If you haven't identified a bottleneck, what are you doing with premature optimisation?
Use stored procedures where you are concerned about security access to a particular table.
Use stored procs when you have a SQL wizard who is prepared to sit and write complex queries that join together loads of tables in a legacy database, to do the things that are hard in an OR mapper.
Use the OR mapper for the other (at least) 80% of your database: where the selects and updates are so routine as to make access through stored procedures alone a pointless exercise in manual coding, and where updates are so infrequent that there is no performance cost. Use an OR mapper to automate the easy stuff.
Most OR mappers can talk to stored procs for the rest.
You should not use stored procs on the assumption that they're faster than a SQL statement in a string; this is not necessarily the case in the last few versions of MS SQL Server.
You do not need to use stored procs to thwart SQL injection attacks; there are other ways to make sure that your query parameters are strongly typed and not just string-concatenated.
You don't need to use an OR mapper to get a POCO domain model, but it does help.
If you already have a data API that's exposed as sprocs, you'd need to justify a major architectural overhaul to go to ORM.
For a green-fields build, I'd evaluate several things:
If there's a dedicated DBA on the team, I'd lean to sprocs
If there's more than one application touching the same DB I'd lean to sprocs
If there's no possibility of database migration ever, I'd lean to sprocs
If I'm trying to implement MVCC in the DB, I'd lean to sprocs
If I'm deploying this as a product with potentially multiple backend dbs (MySql, MSSql, Oracle), I'd lean to ORM
If I'm on a tight deadline, I'd lean to ORM, since it's a faster way to create my domain model and keep it in sync with the data model (with appropriate tooling).
If I'm exposing the same domain model in multiple ways (web app, web service, RIA client), I'd lean to ORM, as the data model is then hidden behind my ORM facade and a robust domain model is more valuable to me.
I think performance is a bit of a red herring; Hibernate seems to perform nearly as well as or better than hand-coded SQL (due to its caching tiers), and it's easy to write a bad query in your sproc either way.
The most important criteria are probably the team's skillset and long-term database portability needs.
Well, the SP's are already there. It doesn't really make sense to can them. I guess the question is: does it make sense to use a mapper with SP's?
"I'm trying to drive in a nail. Should I use the heel of my shoe or a glass bottle?"
Both Stored Procedures and ORMs are difficult and annoying to use for a developer (though not necessarily for a DBA or architect, respectively), because they incur a start-up cost and higher maintenance cost that doesn't guarantee a pay-off.
Both will pay off well if the requirements aren't expected to change much over the lifespan of the system, but they will get in your way if you're building the system to discover the requirements in the first place.
Straight-coded SQL or quasi-ORM like LINQ and ActiveRecord is better for build-to-discover projects (which happen in the enterprise a lot more than the PR wants you to think).
Stored Procedures are better in a language-agnostic environment, or where fine-grained control over permissions is required. They're also better if your DBA has a better grasp of the requirements than your programmers.
Full-blown ORMs are better if you do Big Design Up Front, use lots of UML, want to abstract the database back-end, and your architect has a better grasp of the requirements than either your DBA or programmers.
And then there's option #4: Use all of them. A whole system is not usually just one program, and while many programs may talk to the same database, they could each use whatever method is appropriate both for the program's specific task, and for its level of maturity. That is: you start with straight-coded SQL or LINQ, then mature the program by refactoring in ORM and Stored Procedures where you see they make sense.