How to structure many and/or complex SQL queries? - sql

Having quite a lot of SQL queries, many of them ad-hoc-ones, a database has grown a bit messy. I have two problems :
Hard to keep many different views/sproc's in good order when only using names to structure them
Subqueries (views that calls other views, in 2-4 levels) adds to the structure mess and are hard to maintain
Now I like to get a better structure in my database and/or in my C# code (could be java/python/ruby/whatever). How?
Should I use "schemas" in SQL to separate views in different areas? Like namespaces.
Should I avoid having lot of TSQL in the database altoghether and instead keep the querying to my C#? That would move the database logic closer to the rest of the system, and would be much easier to maintain and keep in code versioning, but at the same time I appreciate being close to the data, to keep performance good (with the help of SQL profiler).
Any other suggestion?
Update: the database and the c#-projects are a few years old and has grown, and will continue to grow over time (different areas of functionality) + new projects will be added. I need to clean it up in a good way, or change strategy.

Ok, why is complexity in your C# code easier to manage than complexity in your TSQL?
It must be that either your tooling for or knowledge of C# is superior. You should address that.
You can adopt naming conventions, and organise files to aid in this task. Use development environments to support TSQL and source control. Make sure you can deploy new and upgraded database schemas programmatically. Just like you should with C#.
Without knowing the details of your project I can't specify an exact structure.
How should you decide where logic should be implemented?
At a generic level this is simple.
The database should perform set based operations that can benefit from the use of indecies on your data. This code is easier to write in TSQL and other set based query languages.
Your application/business layer should perform row level operations, ideally in a stateless (Shared/static) fashion. This code is easier to write in c#, and other procedural languages.
Scaling database servers is more difficult than a stateless application layer. How do you maintain synchronization across multiple machines?
There is no exact right answer. There are just many shades of grey. The best solution will be based on your requirements. Start with the MS Defacto VS 2013, SQL 2012, C#, EF6 and embellish or simplify from there.

I just found this article by Martin Fowler "Domain Logic and SQL" http://martinfowler.com/articles/dblogic.html .
Performance first?
"One of the first questions people consider with this kind of thing is performance. Personally I don't think performance should be the first question. My philosophy is that most of the time you should focus on writing maintainable code. Then use a profiler to identify hot spots and then replace only those hot spots with faster but less clear code"
Maintainability
"For any long-lived enterprise application, you can be sure of one thing - it's going to change a lot. As a result you have to ensure that the system is organized in such a way that's easy to change. Modifiability is probably the main reason why people put business logic in memory [= application code instead of TSQL]."
Encapsulation
"Using views, or indeed stored procedures, provides encapsulation only up to a point. In many enterprise applications data comes from multiple sources, not just multiple relational databases, but also legacy systems, other applications, and files. [...] In this case full encapsulation really can only be done by a layer within the application code, which further implies that domain logic should also sit in memory."

Related

Raw SQL vs OOP based queries (ORM)?

I was doing a project that requires frequent database access, insertions and deletions. Should I go for Raw SQL commands or should I prefer to go with an ORM technique? The project can work fine without any objects and using only SQL commands? Does this affect scalability in general?
EDIT: The project is one of the types where the user isn't provided with my content, but the user generates content, and the project is online. So, the amount of content depends upon the number of users, and if the project has even 50000 users, and additionally every user can create content or read content, then what would be the most apt approach?
If you have no ( or limited ) experience with ORM, then it will take time to learn new API. Plus, you have to keep in mind, that the sacrifice the speed for 'magic'. For example, most ORMs will select wildcard '*' for fields, even when you just need list of titles from your Articles table.
And ORMs will aways fail in niche cases.
Most of ORMs out there ( the ones based on ActiveRecord pattern ) are extremely flawed from OOP's point of view. They create a tight coupling between your database structure and class/model.
You can think of ORMs as technical debt. It will make the start of project easier. But, as the code grows more complex, you will begin to encounter more and more problems caused by limitations in ORM's API. Eventually, you will have situations, when it is impossible to to do something with ORM and you will have to start writing SQL fragments and entires statements directly.
I would suggest to stay away from ORMs and implement a DataMapper pattern in your code. This will give you separation between your Domain Objects and the Database Access Layer.
I'd say it's better to try to achieve the objective in the most simple way possible.
If using an ORM has no real added advantage, and the application is fairly simple, I would not use an ORM.
If the application is really about processing large sets of data, and there is no business logic, I would not use an ORM.
That doesn't mean that you shouldn't design your application property though, but again: if using an ORM doesn't give you any benefit, then why should you use it ?
For speed of development, I would go with an ORM, in particular if most data access is CRUD.
This way you don't have to also develop the SQL and write data access routines.
Scalability should't suffer, though you do need to understand what you are doing (you could hurt scalability with raw SQL as well).
If the project is either oriented :
- data editing (as in viewing simple tables of data and editing them)
- performance (as in designing the fastest algorithm to do a simple task)
Then you could go with direct sql commands in your code.
The thing you don't want to do, is do this if this is a large software, where you end up with many classes, and lot's of code. If you are in this case, and you scatter sql everywhere in your code, you will clearly regret it someday. You will have a hard time making changes to your domain model. Any modification would become really hard (except for adding functionalities or entites independant with the existing ones).
More information would be good, though, as :
- What do you mean by frequent (how frequent) ?
- What performance do you need ?
EDIT
It seems you're making some sort of CMS service. My bet is you don't want to start stuffing your code with SQL. #teresko's pattern suggestion seems interesting, seperating your application logic from the DB (which is always good), but giving the possiblity to customize every queries. Nonetheless, adding a layer that fills in memory objects can take more time than simply using the database result to write your page, but I don't think that small difference should matter in your case.
I'd suggest to choose a good pattern that seperates your business logique and dataAccess, like what #terekso suggested.
It depends a bit on timescale and your current knowledge of MySQL and ORM systems. If you don't have much time, just do whatever you know best, rather than wasting time learning a whole new set of code.
With more time, an ORM system like Doctrine or Propel can massively improve your development speed. When the schema is still changing a lot, you don't want to be spending a lot of time just rewriting queries. With an ORM system, it can be as simple as changing the schema file and clearing the cache.
Then when the design settles down, keep an eye on performance. If you do use ORM and your code is solid OOP, it's not too big an issue to migrate to SQL one query at a time.
That's the great thing about coding with OOP - a decision like this doesn't have to bind you forever.
I would always recommend using some form of ORM for your data access layer, as there has been a lot of time invested into the security aspect. That alone is a reason to not roll your own, unless you feel confident about your skills in protecting against SQL injection and other vulnerabilities.

Why use SQL database? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
I'm not quite sure stackoverflow is a place for such a general question, but let's give it a try.
Being exposed to the need of storing application data somewhere, I've always used MySQL or sqlite, just because it's always done like that. As it seems like the whole world is using these databases (most of all software products, frameworks, etc), it is rather hard for a beginning developer like me to start thinking about whether this is a good solution or not.
Ok, say we have some object-oriented logic in our application, and objects are related to each other somehow. We need to map this logic to the storage logic, so relations between database objects are required too. This leads us to using relational database, and I'm ok with that - to put it simple, our database table rows sometimes will need to have references to other tables' rows. But why use SQL language for interaction with such a database?
SQL query is a text message. I can understand this is cool for actually understanding what it does, but isn't it silly to use text table and column names for a part of application that no one ever seen after deploynment? If you had to write a data storage from scratch, you would have never used this kind of solution. Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database. And it surely would name tables and colons by id numbers, not ascii-strings. In the case of changes in table structure those byte queries could be recompiled according to new db schema, stored in XML or something like that.
What are the problems of my idea? Is there any reason for me not to write it myself and to use SQL database instead?
EDIT To make my question more clear. Most of answers claim that SQL, being a text query, helps developers better understand the query itself and debug it more easily. Personally, I haven't seen people writing SQL queries by hand for a while. Everyone I know, including me, is using ORM. This situation, in which we build up a new level of abstraction to hide SQL, leads to thinking if we need SQL or not. I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why.
EDIT2 SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
Everyone I know, including me, is using ORM
Strange. Everyone I know, including me, still writes most of the SQL by hand. You typically end up with tighter, more high performance queries than you do with a generated solution. And, depending on your industry and application, this speed does matter. Sometimes a lot. yeah, I'll sometimes use LINQ for a quick-n-dirty where I don't really care what the resulting SQL looks like, but thus far nothing automated beats hand-tuned SQL for when performance against a large database in a high-load environment really matters.
If all you need to do is store some application data somewhere, then a general purpose RDBMS or even SQLite might be overkill. Serializing your objects and writing them to a file might be simpler in some cases. An advantage to SQLite is that if you have a lot of this kind of information, it is all contained in one file. A disadvantage is that it is more difficult to read it. For example, if you serialize you data to YAML, you can read the file with any text editor or shell.
Personally, I would have used some
'compiled db query' bytecode, that
would be assembled once inside a
client application and passed to the
database.
This is how some database APIs work. Check out static SQL and prepared statements.
Is there any reason for me not to
write it myself and to use SQL
database instead?
If you need a lot of features, at some point it will be easier to use an existing RDMBS then to write your own database from scratch. If you don't need many features, a simpler solution may be wiser.
The whole point of database products is to avoid writing the database layer for every new program. Yes, a modern RDMBS might not always be a perfect fit for every project. This is because they were designed to be very general, so in practice, you will always get additional features you don't need. That doesn't mean it is better to have a custom solution. The glove doesn't always need to be a perfect fit.
UPDATE:
But why use SQL language for
interaction with such a database?
Good question.
The answer to that may be found in the original paper describing the relational model A Relational Model of Data for Large Shared Data Banks, by E. F. Codd, published by IBM in 1970. This paper describes the problems with the existing database technologies of the time, and explains why the relational model is superior.
The reason for using the relational model, and thus a logical query language like SQL, is data independence.
Data independence is defined in the paper as:
"... the independence of application programs and terminal activities from the growth in data types and changes in data representations."
Before the relational model, the dominate technology for databases was referred to as the network model. In this model, the programmer had to know the on-disk structure of the data and traverse the tree or graph manually. The relational model allows one to write a query against the conceptual or logical scheme that is independent of the physical representation of the data on disk. This separation of logical scheme from the physical schema is why we use the relational model. For a more on this issue, here are some slides from a database class. In the relational model, we use logic based query languages like SQL to retrieve data.
Codd's paper goes into more detail about the benefits of the relational model. Give it a read.
SQL is a query language that is easy to type into a computer in contrast with the query languages typically used in a research papers. Research papers generally use relation algebra or relational calculus to write queries.
In summary, we use SQL because we happen to use the relational model for our databases.
If you understand the relational model, it is not hard to see why SQL is the way it is. So basically, you need to study the relation model and database internals more in-depth to really understand why we use SQL. It may be a bit of a mystery otherwise.
UPDATE 2:
SQL is an interface between a human
and a database. The question is why do
we have to use it for
application/database interaction? I
still ask for examples of human beings
writing/debugging SQL.
Because the database is a relational database, it only understands relational query languages. Internally it uses a relational algebra like language for specifying queries which it then turns into a query plan. So, we write our query in a form we can understand (SQL), the DB takes our SQL query and turns it into its internal query language. Then it takes the query and tries to find a "query plan" for executing the query. Then it executes the query plan and returns the result.
At some point, we must encode our query in a format that the database understands. The database only knows how to convert SQL to its internal representation, that is why there is always SQL at some point in the chain. It cannot be avoided.
When you use ORM, your just adding a layer on top of the SQL. The SQL is still there, its just hidden. If you have a higher-level layer for translating your request into SQL, then you don't need to write SQL directly which is beneficial in some cases. Some times we do not have such a layer that is capable of doing the kinds of queries we need, so we must use SQL.
Given the fact that you used MySQL and SQLite, I understand your point of view completely. Most DBMS have features that would require some of the programming from your side, while you get it from database for free:
Indexes - you can store large amounts of data and still be able to filter and search very quickly because of indexes. Of course, you could implement you own indexing, but why reinvent the wheel
data integrity - using database features like cascading foreign keys can ensure data integrity across the system. You only need to declare relationship between data, and system takes care of the rest. Of course, once more, you could implement constraints in code, but it's more work. Consider, for example, deletion, where you would have to write code in object's destructor to track all dependent objects and act accordingly
ability to have multiple applications written in different programming languages, working on different operating systems, some even distributed across the network - all using the same data stored in a common database
dead easy implementation of observer pattern via triggers. There are many cases where only some data depends on some other data and it does not affect UI aspect of application. Ensuring consistency can be very tricky or require a lot of programming. Of course, you could implement trigger-like behavior with objects but it requires more programming than simple SQL definition
There are some good answers here. I'll attempt to add my two cents.
I like SQL, I can think in it pretty easily. The queries produced by layers on top of the DB (like ORM frameworks) are usually hideous. They'll select tons of extra stuff, join in things you don't need, etc.; all because they don't know that you only want a small part of the object in this code. When you need high performance, you'll often end up going in and using at least some custom SQL queries in an ORM system just to speed up a few bottlenecks.
Why SQL? As others have said, it's easy for humans. It makes a good lowest common denominator. Any language can make SQL and call command line clients if necessary, and they is pretty much always a good library.
Is parsing out the SQL inefficient? Somewhat. The grammar is pretty structured, so there aren't tons of ambiguities that would make the parser's job really hard. The real thing is that the overhead of parsing out SQL is basically nothing.
Let's say you run a query like "SELECT x FROM table WHERE id = 3", and then do it again with 4, then 5, over and over. In that case, the parsing overhead may exist. That's why you have prepared statements (as others have mentioned). The server parses the query once, and can swap in the 3 and 4 and 5 without having to reparse everything.
But that's the trivial case. In real life, your system may join 6 tables and have to pull hundreds of thousands of records (if not more). It may be a query that you let run on a database cluster for hours, because that's the best way to do things in your case. Even with a query that takes only a minute or two to execute, the time to parse the query is essentially free compared to pulling records off disk and doing sorting/aggregation/etc. The overhead of sending the ext "LEFT OUTER JOIN ON" is only a few bytes compared to sending special encoded byte 0x3F. But when your result set is 30 MB (let alone gigs+), those few extra bytes are worthless compared to not having to mess with some special query compiler object.
Many people use SQL on small databases. The biggest one I interact with is only a few dozen gigs. SQL is used on everything from tiny files (like little SQLite DBs may be) up to terabyte size Oracle clusters. Considering it's power, it's actually a surprisingly simple and small command set.
It's an ubiquitous standard. Pretty much every programming language out there has a way to access SQL databases. Try that with a proprietary binary protocol.
Everyone knows it. You can find experts easily, new developers will usually understand it to some degree without requiring training
SQL is very closely tied to the relational model, which has been thoroughly explored in regard to optimization and scalability. But it still frequently requires manual tweaking (index creation, query structure, etc.), which is relatively easy due to the textual interface.
But why use SQL language for interaction with such a database?
I think it's for the same reason that you use a human-readable (source code) language for interaction with the compiler.
Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database.
This is an existing (optional) feature of databases, called "stored procedures".
Edit:
I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why
When I implemented my own ORM, I implemented the ORM framework using ADO.NET: and using ADO.NET includes using SQL statements in its implementation.
After all the edits and comments, the main point of your question appears to be : why is the nature of SQL closer to being a human/database interface than to being an application/database interface ?
And the very simple answer to that question is : because that is exactly what it was originally intended to be.
The predecessors of SQL (QUEL being presumably the most important one) were intended to be exactly that : a QUERY language, i.e. one that didn't have any of INSERT, UPDATE, DELETE.
Moreover, it was intended to be a query language that could be used by any user, provided that user was aware of the logical structure of the database, and obviously knew how to express that logical structure in the query language he was using.
The original ideas behind QUEL/SQL were that a database was built using "just any mechanism conceivable", that the "real" database could be really just anything (e.g. one single gigantic XML file - allthough 'XML' was not considered a valid option at the time), and that there would be "some kind of machinery" that understood how to transform the actual structure of that 'just anything' into the logical relational structure as it was perceived by the SQL user.
The fact that in order to actually achieve that, the underlying structures are required to lend themselves to "viewing them relationally", was not understood as well in those days as it is now.
Yes, it is annoying to have to write SQL statements to store and retrieve objects.
That's why Microsoft have added things like LINQ (language integrated query) into C# and VB.NET to make it possible to query databases using objects and methods instead of strings.
Most other languages have something similar with varying levels of success depending on the abilities of that language.
On the other hand, it is useful to know how SQL works and I think it is a mistake to shield yourself entirely from it. If you use the database without thinking you can write extremely inefficient queries and index the database incorrectly. But once you understand how to use SQL correctly and have tuned your database, you have a very powerful tried-and-tested tool available for finding exactly the data you need extremely quickly.
My biggest reason for SQL is Ad-hoc reporting. That report your business users want but don't know that they need it yet.
SQL is an interface between a human
and a database. The question is why do
we have to use it for
application/database interaction? I
still ask for examples of human beings
writing/debugging SQL.
I use sqlite a lot right from the simplest of tasks (like logging my firewall logs directly to a sqlite database) to more complex analytic and debugging tasks in my day-to-day research. Laying out my data in tables and writing SQL queries to munge them in interesting ways seems to be the most natural thing to me in these situations.
On your point about why it is still used as an interface between application/database, this is my simple reasoning:
There is about 3-4 decades of
serious research in that area
starting in 1970 with Codd's seminal
paper on Relational Algebra.
Relational Algebra forms the
mathematical basis to SQL (and other
QLs), although SQL does not
completely follow the relational
model.
The "text" form of the language
(aside from being easily
understandable to humans) is also
easily parsable by machines (say
using a grammar parser like like
lex) and is easily convertable to whatever "bytecode" using any number of optimizations.
I am not sure if doing this in any
other way would have yielded
compelling benefits in the generic cases. Otherwise it
would have been probably discovered
and adopted in the 3 decades of
research. SQL probably provides the
best tradeoffs when bridging the
divide between humans/databases and
applications/databases.
The question that then becomes interesting to ask is, "What are the real benefits of doing SQL in any other "non-text" way?" Will google for this now:)
SQL is a common interface used by the DBMS platform - the entire point of the interface is that all database operations can be specified in SQL without needing supplementary API calls. This means that there is a common interface across all clients of the system - application software, reports and ad-hoc query tools.
Secondly, SQL gets more and more useful as queries get more complex. Try using LINQ to specify a 12-way join a with three conditions based on existential predicates and a condition based on an aggregate calculated in a subquery. This sort of thing is fairly comprehensible in SQL but unlikely to be possible in an ORM.
In many cases an ORM will do 95% of what you want - most of the queries issued by applications are simple CRUD operations that an ORM or other generic database interface mechanism can handle easily. Some operations are best done using custom SQL code.
However, ORMs are not the be-all and end-all of database interfacing. Fowler's Patterns of Enterprise Application Architecture has quite a good section on other types of database access strategy with some discussion of the merits of each.
There are often good reasons not to use an ORM as the primary database interface layer. An example of a good one is that platform database libraries like ADO.Net often do a good enough job and integrate nicely with the rest of the environment. You might find that the gain from using some other interface doesn't really outweigh the benefits from the integration.
However, the final reason that you can't really ignore SQL is that you are ultimately working with a database if you are doing a database application. There are many, many WTF stories about screw-ups in commercial application code done by people who didn't understand databases properly. Poorly thought-out database code can cause trouble in so many ways, and blithely thinking that you don't need to understand how the DBMS works is an act of Hubris that is bound to come and bite you some day. Worse yet, it will come and bite some other poor schmoe who inherits your code.
While I see your point, SQL's query language has a place, especially in large applications with a lot of data. And to point out the obvious, if the language wasn't there, you couldn't call it SQL (Structured Query Language). The benefit of having SQL over the method you described is SQL is generally very readable, though some really push the limits on their queries.
I whole heartly agree with Mark Byers, you should not shield yourself from SQL. Any developer can write SQL, but to really make your application perform well with SQL interaction, understanding the language is a must.
If everything was precompiled with bytecode as you described, I'd hate to be the one to have to debug the application after the original developer left (or even after not seeing the code for 6 months).
I think the premise of the question is incorrect. That SQL can be represented as text is immaterial. Most modern databases would only compile queries once and cache them anyway, so you already have effectively a 'compiled bytecode'. And there's no reason this couldn't happen client-wise though I'm not sure if anyone's done it.
You said SQL is a text message, well I think of him as a messenger, and, as we know, don't shoot the messenger. The real issue is that relations are not a good enough way of organising real world data. SQL is just lipstick on the pig.
If the first part you seem to refer to what is usually called the Object - relational mapping impedance. There are already a lot of frameworks to alleviate that problem. There are tradeofs as well. Some things will be easier, others will get more complex, but in the general case they work well if you can afford the extra layer.
In the second part you seem to complain about SQL being text (it uses strings instead of ids, etc)... SQL is a query language. Any language (computer or otherwise) that is meant to be read or written by humans is text oriented for that matter. Assembly, C, PHP, you name it. Why? Because, well... it does make sense, doesn't it?
If you want precompiled queries, you already have stored procedures. Prepared statements are also compiled once on the fly, IIRC. Most (if not all) db drivers talk to the database server using a binary protocol anyway.
yes, text is a bit inefficient. But actually getting the data is a lot more costly, so the text based sql is reasonably insignificant.
SQL was created to provide an interface to make ad hoc queries against a relational database.
Generally, most relational databases understand some form of SQL.
Object-oriented databases exist, and (presumably) use objects to do their querying... but as I understand it, OO databases have a lot more overheard, and relational databases work just fine.
Relational Databases also allow you to operate in a "disconnected" state. Once you have the information you asked for, you can close the database connection. With an OO database, you either need to return all objects related to the current one (and the ones they're related to... and the... etc...) or reopen the connection to retrieve new objects as they are accessed.
In addition to SQL, you also have ORMs (object-relational mappings) that map objects to SQL and back. There are quite a few of them, including LINQ (.NET), the MS Entity Framework (.NET), Hibernate (Java), SQLAlchemy (Python), ActiveRecord (Ruby), Class::DBI (Perl), etc...
A database language is useful because it provides a logical model for your data independent of any applications that use it. SQL has a lot of shortcomings however, not the least being that its integration with other languages is poor, type support is about 30 years behind the rest of the industry and it has never been a truly relational language anyway.
SQL has survived mostly because the database market has been and remains dominated by the three mega-vendors who have a vested interest in protecting their investment. That's changing and SQL's days are probably numbered but the model that will finally replace it probably hasn't arrived yet - although there are plenty of contenders around these days.
I don't think most people are getting your question, though I think it's very clear. Unfortunately I don't have the "correct" answer. I would guess it's a combination of several things:
Semi-arbitrary decisions when it was designed such as ease of use, not needing a SQL compiler (or IDE), portability, etc.
It happened to catch on well (probably due to similar reasons)
And now due to historical reasons (compatibility, well known, proven, etc.) continues to be used.
I don't think most companies have bothered with another solution because it works well, isn't much of a bottleneck, it's a standard, blah, blah..
One of the Unix design principles can be said thusly, "Write programs to handle text streams, because that is a universal interface.".
And that, I believe, is why we typically use SQL instead of some 'byte-SQL' that only has a compilation interface. Even if we did have a byte-SQL, someone would write a "Text SQL", and the loop would be complete.
Also, MySQL and SQLite are less full-featured than, say, MSSQL and Oracle SQL. So you're still in the low end of the SQL pool.
Actually there are a few non-SQL database (like Objectivity, Oracle Berkeley DB, etc.) products came but non of them succeeded. In future if someone finds intuitive alternative for SQL, that will answer your question.
There are a lot of non relational database systems. Here are just a few:
Memcached
Tokyo Cabinet
As far as finding a relational database that doesn't use SQL as its primary interface, I think you won't find it. Reason: SQL is a great way to talk about relations. I can't figure out why that's a big deal to you: if you don't like SQL, put an abstraction over it (like an ORM) so you don't have to worry about it. Let the abstraction worry about it. It gets you to the same place.
However, the problem your'e really mentioning here is the object-relation disconnect - the problem is with the relation itself. Objects and relational-tuples don't always lend themselves to be a 1-1 relationship, which is the reason why a developer can frustrated with a database. The solution to that is to use a different database type.
Because often, you cannot be sure that (citing you) "no one ever seen after deployment". Knowing that there is an easy interface for reporting and for dataset level querying is a good path for evolution of your app.
You're right, that there are other solutions that may be valid in some situations: XML, plain text files, OODB...
But having a set of common interfaces (like ODBC) is a huge plus for the life of data.
I think the reason might be the search/find/grab algorithms the sql laungage is connected to do. Remember that sql has been developed for 40 years - and the goal has been both preformence wise and user firendly wise.
Ask yourself what the best way of finding 2 attibutes is. Now why investigating that each time you would want to do something that includes that each time you develope your application. Assuming the main goal is the developing of your application when developing an application.
An application has similarities with other applications, a database has similarities with other databases. So there should be a "best way" of these to interact, logically.
Also ask yourself how you would develop a better console only application that does not use sql laungage. If you cannot do that I think you need to develope a new kind of GUI that are even more fundamentally easier to use than with a console - to develope things from it. And that might actually be possible. But still most development of applications is based around console and typing.
Then when it comes to laungage I don´t think you can make a much more fundamentally easier text laungage than sql. And remember that each word of anything is inseperatly connected to its meaning - if you remove the meaning the word cannot be used - if you remove the word you cannot communicate the meaning. You have nothing to describe it with (And maybe you cannot even think it beacuse it woulden´t be connected to anything else you have thought before...).
So basically the best possible algorithms for database manipulation are assigned to words - if you remove these words you will have to assign these manipulations something else - and what would that be?
i think you can use ORM
if and only if you know the basic of sql.
else the result there isn't the best

Why no love for SQL? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I've heard a lot lately that SQL is a terrible language, and it seems that every framework under the sun comes pre-packaged with a database abstraction layer.
In my experience though, SQL is often the much easier, more versatile, and more programmer-friendly way to manage data input and output. Every abstraction layer I've used seems to be a markedly limited approach with no real benefit.
What makes SQL so terrible, and why are database abstraction layers valuable?
This is partly subjective. So this is my opinion:
SQL has a pseudo-natural-language style. The inventors believed that they can create a language just like English and that database queries will be very simple. A terrible mistake. SQL is very hard to understand except in trivial cases.
SQL is declarative. You can't tell the database how it should do stuff, just what you want as result. This would be perfect and very powerful - if you wouldn't have to care about performance. So you end up in writing SQL - reading execution plans - rephrasing SQL trying to influence the execution plan, and you wonder why you can't write the execution plan yourself.
Another problem of the declarative language is that some problems are easier to solve in a imperative manner. So you either write it in another language (you'll need standard SQL and probably a data access layer) or by using vendor specific language extensions, say by writing stored procedures and the like. Doing so you will probably find that you're using one of the worst languages you've ever seen - because it was never designed to be used as an imperative language.
SQL is very old. SQL has been standardized, but too late, many vendors already developed their language extensions. So SQL ended up in dozens of dialects. That's why applications are not portable and one reason to have a DB abstraction layer.
But it's true - there are no feasible alternatives. So we all will use SQL for the next few years.
Aside from everything that was said, a technology doesn't have to be bad to make an abstraction layer valuable.
If you're doing a very simple script or application, you can afford to mix SQL calls in your code wherever you like. However, if you're doing a complex system, isolating the database calls in separate module(s) is a good practice and so it is isolating your SQL code. It improves your code's readability, maintainability and testability. It allows you to quickly adapt your system to changes in the database model without breaking up all the high level stuff, etc.
SQL is great. Abstraction layers over it makes it even greater!
One point of abstraction layers is the fact that SQL implementations tend to be more or less incompatible with each other since the standard is slightly ambiguous, and also because most vendors have added their own (nonstandard) extras there. That is, SQL written for a MySQL DB might not work quite similarly with, say, an Oracle DB — even if it "should".
I agree, though, that SQL is way better than most of the abstraction layers out there. It's not SQL's fault that it's being used for things that it wasn't designed for.
SQL gets badmouthed from several sources:
Programmers who are not comfortable with anything but an imperative language.
Consultants who have to deal with many incompatible SQL-based products on a daily basis
Nonrelational database vendors trying to break the stranglehold of relational database vendors on the market
Relational database experts like Chris Date who view current implementations of SQL as insufficient
If you stick to one DBMS product, then I definitely agree that SQL DBs are more versatile and of higher quality than their competition, at least until you hit a scalability barrier intrinsic in the model. But are you really trying to write the next Twitter, or are you just trying to keep some accounting data organized and consistent?
Criticism of SQL is often a standin for criticisms of RDBMSes. What critics of RDBMSes seem not to understand is that they solve a huge class of computing problems quite well, and that they are here to make our lives easier, not harder.
If they were serious about criticizing SQL itself, they'd back efforts like Tutorial D and Dataphor.
It's not so terrible. It's an unfortunate trend in this industry to rubbish the previous reliable technology when a new "paradigm" comes out. At the end of the day, these frameworks are very most probably using SQL to communicate with the database so how can it be THAT bad? That said, having a "standard" abstraction layer means that a developer can focus on the application code and not the SQL code. Without such a standard layer you'd probably write a lightweight one each time you're developing a system, which is a waste of effort.
SQL is designed for management and query of SET based data. It is often used to do more and edge cases lead to frustration at times.
Actual USE of SQL can be SO impacted by the base database design that the SQL may not be the issue, but the design might - and when you toss in the legacy code associated with a bad design, changes are more impactive and costly to impliment (no one like to go back and "fix" stuff that is "working" and meeting objectives)
Carpenters can pound nails with hammers, saw lumber with saws and smooth boards with planes. It IS possible to "saw" using hammers and planes, but dang it is frustrating.
I wont say it's terrible. It's unsuitable for some tasks. For example: you can not write good procedural code with SQL. I was once forced to work with set manipulation with SQL. It took me a whole weekend to figure that out.
SQL was designed for relational algebra - that's where it should to be used.
I've heard a lot lately that SQL is a terrible language, and it seems that every framework under the sun comes pre-packaged with a database abstraction layer.
Note that these layers just convert their own stuff into SQL. For most database vendors SQL is the only way to communicate with the engine.
In my experience though, SQL is often the much easier, more versatile, and more programmer-friendly way to manage data input and output. Every abstraction layer I've used seems to be a markedly limited approach with no real benefit.
… reason for which I just described above.
The database layers don't add anything, they just limit you. They make the queries disputably more simple but never more efficient.
By definition, there is nothing in the database layers that is not in SQL.
What makes SQL so terrible, and why are database abstraction layers valuable?
SQL is a nice language, however, it takes some brain twist to work with it.
In theory, SQL is declarative, that is you declare what you want to get and the engine provides it in the fastest way possible.
In practice, there are many ways to formulate a correct query (that is the query that return correct results).
The optimizers are able to build a Lego castle out of some predefined algorithms (yes, they are multiple), but they just cannot make new algorithms. It still takes an SQL developer to assist them.
However, some people expect the optimizer to produce "the best plan possible", not "the best plan available for this query with given implementation of the SQL engine".
And as we all know, when the computer program does not meet people's expectations, it's the program that gets blamed, not the expectations.
In most cases, however, reformulating a query can produce a best plan possible indeed. There are tasks when it's impossible, however, with the new and growing improvements to SQL these cases get fewer and fewer in number.
It would be nice, though, if the vendors provided some low-level access to the functions like "get the index range", "get a row by the rowid" etc., like C compilers let you to embed the assembly right into the language.
I recenty wrote an article on this in my blog:
Double-thinking in SQL
I'm a huge ORM advocate and I still believe that SQL is very useful, although it's certainly possible to do terrible things with it (just like anything else). .
I look at SQL as a super-efficient language that does not have code re-use or maintainability/refactoring as priorities.
So lightning fast processing is the priority. And that's acceptable. You just have to be aware of the trade-offs, which to me are considerable.
From an aesthetic point of view, as a language I feel that it is lacking some things since it doesn't have OO concepts and so on -- it feels like very old school procedural code to me. But it's far and away the fastest way to do certain things, and that's a powerful niche!
SQL is excellent for certain kinds of tasks, especially manipulating and retrieving sets of data.
However, SQL is missing (or only partially implements) several important tools for managing change and complexity:
Encapsulation: SQL's encapsulation mechanisms are coarse. When you write SQL code, you have to know everything about the implementation of your data. This limits the amount of abstraction you can achieve.
Polymorphism: if you want to perform the same operation on different tables, you've got to write the code twice. (One can mitigate this with imaginative use of views.)
Visibility control: there's no standard SQL mechanism for hiding pieces of the code from one another or grouping them into logical units, so every table, procedure, etc. is
accessible from every other one, even when it's undesirable.
Modularity and Versioning
Finally, manually coding CRUD operations in SQL (and writing the code to hook it up to the rest of one's application) is repetitive and error-prone.
A modern abstraction layer provides all of those features, and allows us to use SQL where it's most effective while hiding the disruptive, repetitive implementation details. It provides tools to help overcome the object-relational impedance mismatch that complicates data access in object-oriented software development.
I would say that a database abstraction layer included with a framework is a good thing because it solves two very important problems:
It keeps the code distinct. By putting the SQL into another layer, which is generally very thin and should only be doing the basics of querying and handoff of results (in a standardized way), you keep your application free from the clutter of SQL. It's the same reason web developers (should) put CSS and Javascript in separate files. If you can avoid it, do not mix your languages.
Many programmers are just plain bad at using SQL. For whatever reason, a large number of developers (especially web developers) seem to be very, very bad at using SQL, or RDBMSes in general. They treat the database (and SQL by extension) as the grubby little middleman they have to go through to get to data. This leads to extremely poorly thought out databases with no indexes, tables stacked on top of tables in dubious manners, and very poorly written queries. Or worse, they try to be too general (Expert System, anyone?) and cannot reasonably relate data in any meaningful way.
Unfortunately, sometimes the way that someone tries to solve a problem and tools they use, whether due to ignorance, stubbornness, or some other trait, are in direct opposition with one another, and good luck trying to convince them of this. As such, in addition to just being a good practice, I consider a database abstraction layer to be a sort of safety net, as it not only keeps the SQL out of the poor developer's eyes, but it makes their code significantly easier to refactor, since all the queries are in one place.
SQL is based on Set Theory, while most high level languages are object oriented these days. Object programmers typically like to think in objects, and have to make a mental shift to use Set based tools to store their objects. Generally, it is much more natural (for the OO programmer) to just cut code in the language of their choice and do something like object.save or object.delete in application code instead of having to write sql queries and call the database to achieve the same result.
Of course, sometimes for complex things, SQL is easier to use and more efficient, so it is good to have a handle on both types of technology.
IMO, the problem that I see that people have with SQL has nothing to do with relational design nor the SQL language itself. It has to do with the discipline of modeling the data layer which in many ways is fundamentally different than modeling a business layer or interface. Mistakes in modeling at the presentation layer are generally much easier to correct than at the data layer where you have multiple applications using the database. These problems are the same as those encountered in modeling a service layer in SOA designs where you have to account for current consumers of your service and the input and output contracts.
SQL was designed to interact with relational database models. There are other data models that have existed for some time, but the discipline about designing the data layer properly exists regardless of the theoretical model used and thus, the difficulties that developers typically have with SQL are usually related to attempts to impose a non-relational data model onto a relational database product.
For one thing, they make it trivial to use parameterized queries, protecting you from SQL injection attacks. Using raw SQL, from this perspective, is riskier, that is, easier to get wrong from a security perspective. They also often present an object-oriented perspective on your database, relieving you of having to do this translation.
Heard a lot recently? I hope you're not confusing this with the NoSql movement. As far as i'm aware that is mainly a bunch of people who use NoSql for high scalability web apps and appear to have forgotten that SQL is an effective tool in a non "high scalability web app" scenario.
The abstraction layer business is just about sorting out the difference between Object Oriented code and Table - Set based code such as SQL likes to talk. Usually this results in writing lots of boiler plate and dull transition code between the two. ORM automates this and thus saves time for business objecty people.
For experienced SQL programmer the bad sides are
Verbosity
As many have said here, SQL is declarative, which means optimizing is not direct. It's like rallying compared to circuit racing.
Frameworks that try to address all possible dialects and don't support shortcuts of any of them
No easy version control.
For others, the reasons are that
some programmers are bad at SQL. Probably because SQL operates with sets, while programming languages work in object or functional paradigm. Thinking in sets (union, product, intersect) is a matter of habbit that some people don't have.
some operations aren't self-explanatory: i.e. at first it's not clear that where and having filter different sets.
there are too many dialects
The primary goal of SQL frameworks is to reduce your typing. They somehow do, but too often only for very simple queries. If you try doing something complex, you have to use strings and type a lot. Frameworks that try to handle everything possible, like SQL Alchemy, become too huge, like another programming language.
[update on 26.06.10] Recently I worked with Django ORM module. This is the only worthy SQL framework I've seen. And this one makes working with stuff a lot. Complex aggregates are a bit harder though.
SQL is not a terrible language, it just doesn't play too well with others sometimes.
If for example if you have a system that wants to represent all entities as objects in some OO language or another, then combining this with SQL without any kind of abstraction layer can become rather cumbersome. There's no easy way to map a complex SQL query onto the OO-world. To ease the tension between those worlds additional layers of abstraction are inserted (an OR-Mapper for example).
SQL is a really good language for data manipulation. From a developer perspective, what I don't like with it is that changing the database don't break your code at compile time... So I use abstraction which add this feature at the price of performance and maybe expressiveness of the SQL language, because in most application you don't need all the stuff SQL has.
The other reason why SQL is hated, is because of relational databases.
The CAP Theorem becomes popular:
What goals might you want from a
shared-data system?
Strong Consistency: all clients see the same view, even in presence of
updates
High Availability: all clients can find some replica of the data, even in
the presence of failures
Partition-tolerance: the system properties hold even when the system
is partitioned
The theorem states that you can always
have only two of the three CAP
properties at the same time
Relational database address Strong Consistency and Partition-Tolerance.
So more and more people realize that relational database is not the silver bullet, and more and more people begin to reject it in favor of high availability, because high availability makes horizontal scaling more easy. Horizontal scaling gain popularity because we have reached the limit of Moore law, so the best way to scale is to add more machine.
If relational database is rejected, SQL is rejected too.
Quick, write me SQL to paginate a dataset that works in MySQL, Oracle, MSSQL, PostgreSQL, and DB2.
Oh, right, standard SQL doesn't define any operators to limit the number of results coming back and which row to start at.
• Every vendor extends the SQL syntax to suit their needs. So unless you're doing fairly simple things, your SQL code is not portable.
• The syntax of SQL is not orthogonal; e.g., the select, insert, update,anddelete statements all have completely different syntactical structure.
I agree with your points, but to answer your question, one thing that makes SQL so "terrible" is the lack of complete standardization of T-SQL between database vendors (Sql Server, Oracle etc.), which makes SQL code unlikely to be completely portable. Database abstraction layers solve this problem, albeit with a performance cost (sometimes a very severe one).
Living with pure SQL can really be a maintenance hell. For me the greatest advantage of ORMs is the ability to safely refactor code without tedious "DB refactoring" procedures. There are good unit testing frameworks and refactoring tools for OO languages, but I yet have to see Resharper's counterpart for SQL, for example.
Still all DALs have SQL behind the scenes, and still you need to know it to understand what's happening to your database, but daily working with good abstraction layer becomes easier.
If you haven't used SQL too much, I think the major problem is the lack of good developer tools.
If you have lots of experience with SQL, you will have, at one point or another, been frustrated by the lack of control over the execution plan. This is an inherent problem in the way SQL was specified to the vendors. I think SQL needs to become a more robust language to truly harness the underlying technology (which is very powerful).
SQL has many flaws, as some other posters here have pointed out. Still, I much prefer to use SQL over many of the tools that people offer as alternatives, because the "simplifications" are often more complicated than the thing they were supposed to simplify.
My theory is that SQL was invented by a bunch of ivory-tower blue-skiers. The whole non-procedural structure. Sounds great: tell me what you want rather than how you want to do it. But in practice, it's often easier to just give the steps. Often this seems like trying to give car maintenance instructions by describing how the car should perform when you're done. Yes, you could say, "I want the car to once again get 30 miles per gallon, and to run with this humming sound like this ... hmmmm ... and, etc" But wouldn't it be easier for everyone to just say, "Replace the spark plugs" ? And even when you do figure out how to express a complex query in non-procedural terms, the database engine often comes up with a very inefficient execution plan to get there. I think SQL would be much improved by the addition of standardized ways to tell it which table to read first and what index to use.
And the handling of nulls drive me crazy! Yes, theoretically it must have sounded great when someone said, "Hey, if null means unknown, then adding an unknown value to a known value should give an unknown value. After all, by definition, we have no idea what the unknown value is." Theoretically, absolutely true. In practice, if we have 10,000 customers and we know exactly how much money 9,999 owe us but there's some question about the amount owed by the last one, and management says, "What are our total accounts receivable?", yes, the mathematically correct answer is "I don't know". But the practical answer is "we calculate $4,327,287.42 but one account is in question so that number isn't exact". I'm sure management would much rather get a close if not certain number than a blank stare. But SQL insists on this mathemcatically pristine approach, so every operation you do, you have to add extra code to check for nulls and handle them special.
All that said, I'd still rather use SQL than some layer built on top of SQL, that just creates another whole set of things I need to learn, and then I have to know that ultimately this will be translated to SQL, and sometimes I can just trust it to do the translation correctly and efficiently, but when things get complex I can't, so now I have to know the extra layer, I still have to know SQL, and I have to know how it's going to translate to I can trick the layer into tricking SQL into doing the right thing. Arggh.
There's no love for SQL because SQL is bad in syntax, semantics and current usage. I'll explain:
it's syntax is a cobol shrapnel, all the cobol criticism applies here (to a lesser degree, to be fair). Trying to be natural language like without actually attempting to interpret natural language creates arbirtrary syntax (is it DROP TABLE or DROP , UPDATE TABLE , UPDATE or UPDATE IN , DELETE or DELETE FROM ...) and syntactical monstrosities like SELECT (how many pages does it fill?)
semantics is also deeply flawed, Date explains it in great detail, but it will suffice to note that a three valued boolean logic doesn't really fit a relational algebra where a row can only be or not be part of a table
having a programming language as the main (and often only) interface to databases proved to be a really bad choice and it created a new category of security flaws
I'd agree with most of the posts here that the debate over the utility of SQL is mostly subjective, but I think it's more subjective in the nature of your business needs.
Declarative languages, as Stefan Steinegger has pointed out, are good for specifying what you want, not how you want to do it. This means that your various implementations of SQL are decent from a high-level perspective : that is, if all you want is to get some data and nothing else matters, you can satisfy yourself with writing relatively simple queries, and choosing the implementation of SQL that is right for you.
If you work on a much "lower" level, and you need to optimize all of that yourself, it's far from ideal. Using a further layer of abstraction can help, but if what you're really trying to do is specify the methods for optimizing queries and so forth, it's a little counter intuitive to add a middleman when trying to optimize.
The biggest problem I have with SQL is like other "standardized" languages, there are very few real standards. I'd almost prefer having to learn a whole new language between Sybase and MySQL so that I don't get the two conventions confused.
While SQL does get the job done it certainly has issues...
it tries to simultaneously be the high level and the low level abstraction, and that's ... odd. Perhaps it should have been two or more standards at different levels.
it is a huge failure as a standard. Lots of things go wrong when a standard either stirs in everything, asks too much of implementations, asks too little, or for some reason does not accomplish the partially social goal of motivating vendors and implementors to produce strictly conforming interoperable complete implementations. You certainly cannot say SQL has done any of that. Look at some other standards and note that success or failure of the standard is clearly a factor of the useful cooperation attained:
RS-232 (Bad, not nearly enough specified, even which pin transmits and which pin receives is optional, sheesh. You can comply but still achieve nothing. Chance of successful interop: really low until the IBM PC made a de-facto useful standard.)
IEEE 754-1985 Floating Point (Bad, overreach: not a single supercomputer or scientific workstation or RISC microprocessor ever adopted it, although eventually after 20 years we were able to implement it nicely in HW. At least the world eventually grew into it.)
C89, C99, PCI, USB, Java (Good, whether standard or spec, they succeeded in motivating strict compliance from almost everyone, and that compliance resulted in successful interoperation.)
it failed to be selected for arguably the most important database in the world. While this is more of a datapoint than a reason, the fact that Google Bigtable is not SQL and not relational is kind of an anti-achievement for SQL.
I don't dislike SQL, but I also don't want to have to write it as part of what I am developing. The DAL is not about speed to market - actually, I have never thought that there would be a DAL implementation that would be faster than direct queries from the code. But the goal of the DAL is to abstract. Abstraction comes at a cost, and here it is that it will take longer to implement.
The benefits are huge, though. Writing native tests around the code, using expressive classes, strongly typed datasets, etc. We use a "DAL" of sorts, which is a pure DDD implementation using Generics in C#. So we have generic repositories, unit of work implementations (code based transactions), and logical separation. We can do things like mock out our datasets with little effort and actually develop ahead of database implementations. There was an upfront cost in building such a framework, but it is very nice that business logic is the star of the show again. We consume data as a resource now, and deal with it in the language we are natively using in the code. An added benefit of this approach is the clear separation it provides. I no longer see a database query in a web page, for example. Yes, that page needs data. Yes, the database is involved. But now, no matter where I am pulling data from, there is one (and only one) place to go into the code and find it. Maybe not a big deal on smaller projects, but when you have hundreds of pages in a site or dozens of windows in a desktop application, you truly can appreciate it.
As a developer, I was hired to implement the requirements of the business using my logical and analytical skills - and our framework implementation allows for me to be more productive now. As a manager, I would rather have my developers using their logical and analytical skills to solve problems than to write SQL. The fact that we can build an entire application that uses the database without having the database until closer to the end of the development cycle is a beautiful thing. It isn't meant as a knock against database professionals. Sometimes a database implementation is more complex than the solution. SQL (and in our case, Views and Stored Procs, specifically) are an abstraction point where code can consume data as a service. In shops where there is a definite separation between the data and development teams, this helps to eliminate sitting in a holding pattern waiting for database implementation and changes. Developers can focus on the problem domain without hovering over a DBA and the DBA can focus on the correct implementation without a developer needing it right now.
Many posts here seem to argue that SQL is bad because it doesn't have "code optimization" features, and that you have no control over execution plans.
What SQL engines are good at is to come up with an execution plan for a written instruction, geared towards the data, the actual contents. If you care to take a look beyond the programming side of things, you will see that there is more to data than bytes being passed between application tiers.

Is O/R Mapping worth it?

The expressiveness of the query languages (QL) provided with ORMs can be very powerful. Unfortunately, once you have a fleet of complex queries, and then some puzzling schema or data problem arises, it is very difficult to enlist the DBA help that you need? Here they are, part of the team that is evolving the database, yet they can't read the application QL, much less suggest modifications. I generally end up grabbing generated SQL out of the log for them. But then when they recommend changes to it, how does that relate to the original QL? The process is not round-trip.
So after a decade of promoting the value of ORMs, I am now wondering if I should be writing my SQL manually. And maybe all that I really want the framework to do is automate the data marshaling as much as possible.
Question: Have you found a way to deal with the round-trip issue in your organization? Is there a SQL-marshaling framework that scales well, and maintains easily?
(Yes, I know that pure SQL might bind me to the database vendor. But it is possible to write standards-compliant SQL.)
I think that what you want is a solution that maximizes the benefits of ORM without preventing you using other means. We have much the same issue as you do in our application; very heavy queries, and a large data model. Given the size of the data model, ORM is invaluable for the vast majority of the application. It allows us to extend the data model without having to go to a great deal of effort hand-maintaining SQL scripts. Moreover, and you touched on this, we support four database vendors, so the abstraction is nice.
However, there are instances where we've had to tune the queries manually, and since we chose a flexible ORM solution, we can do that too. As you say, it gets out of our way when we need it gone, and simply marshals objects for us.
So, in short (yep, short) yes, ORM is worth it, but like every solution to a problem, it's not a panacea.
In general, ORMs increase developer productivity a lot so I'd using them unless they've become a bigger problem than they're worth. If a majority of your tables are big enough that you are having a lot of problems, consider ditching the ORM. I would definitely not say that ORMs are a bad idea in general. Most databases are small enough and most queries are simple enough that they work well.
I've overcome that problem by using stored procedures or hand-written SQL only for the poorly performing queries. DBAs love stored procedures because they can modify them without telling you. Most (if not all) ORMs allow you to mix in hand written SQL or stored procedures.
todays O/R frameworks, as i believe you're familiar with, support the option of defining some queries manually ((N)Hibernate does). that can be used for complex parts of schemas, and for straight-forward parts use the ORM as provided by the framework.
another thing for you to check out might be the iBatis framework (http://ibatis.apache.org/). i haven't used it, but i've read that it's more close to SQL and people familiar with databases and SQL prefer it over full-blown ORM framework like hibernate, because it's closer to them than the completely different concept of ORM.

Stored procedures or OR mappers?

Which is better? Or use and OR mapper with SP's? If you have a system with SP's already, is an OR mapper worth it?
I like ORM's because you don't have to reinvent the wheel. That being said, it completely depends on your application needs, development style and that of the team.
This question has already been covered Why is parameterized SQL generated by NHibernate just as fast as a stored procedure?
There is nothing good to be said about stored procedures. There were a necessity 10 years ago but every single benefit of using sprocs is no longer valid. The two most common arguments are regarding security and performance. The "sending stuff over the wire" crap doesn't hold either, I can certainly create a query dynamically to do everything on the server too. One thing the sproc proponents won't tell you is that it makes updates impossible if you are using column conflict resolution on a merge publication. Only DBAs who think they are the database overlord insist on sprocs because it makes their job look more impressive than it really is.
This has been discussed at length on previous questions.
What are the pros and cons to keeping SQL in Stored Procs versus Code
At my work, we mostly do line of business apps - contract work.
For this type of business, I'm a huge fan of ORM. About four years ago (when the ORM tools were less mature) we studied up on CSLA and rolled our own simplified ORM tool that we use in most of our applications,including some enterprise-class systems that have 100+ tables.
We estimate that this approach (which of course includes a lot of code generation) creates a time savings of up to 30% in our projects. Seriously, it's rediculous.
There is a small performance trade-off, but it's insubstantial as long as you have a decent understanding of software development. There are always exceptions that require flexibility.
For instance, extremely data-intensive batch operations should still be handled in specialized sprocs if possible. You probably don't want to send 100,000 huge records over the wire if you could do it in a sproc right on the database.
This is the type of problem that newbie devs run into whether they're using ORM or not. They just have to see the results and if they're competent, they will get it.
What we've seen in our web apps is that usually the most difficult to solve performance bottlenecks are no longer database-related even with ORM. Rather, tey're on the front-end (browser) due to bandwidth, AJAX overhead, etc. Even mid-range database servers are incredibly powerful these days.
Of course, other shops who work on much larger high-demand systems may have different experiences there. :)
Stored procedures hands down. OR Mappers are language specific, and often add graphic slowdowns.
Stored procedures means you're not limited by the language interface, and you can merely tack on new interfaces to the database in forwards compatible ways.
My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases. Database developers should realize the tasks people are trying to achieve with complicated OR-Mappers and create server-side utilities that assist in performing this task.
OR Mappers also are epic targets of the "leaky abstraction" syndrome ( Joel On Software: Leaky Abstractions )
Where its quite easy to find things it just cant handle because of the abstraction layer not being psychic.
Stored procedures are better, in my view, because they can have an independent security configuration from the underlying tables.
This means you can allow specific operations without out allowing writes/reads to specific tables. It also limits the damage that people can do if they discover a SQL injection exploit.
Definitely ORMs. More flexible, more portable (generally they tend to have portability built in). In case of slowness you may want to use caching or hand-tuned SQL in hot spots.
Generally stored procedures have several problems with maintainability.
separate from application (so many changes have now to be made in two places)
generally harder to change
harder to put under version control
harder to make sure they're updated (deployment issues)
portability (already mentioned)
I personally have found that SP's tend to be faster performance wise, at least for the large data items that I execute on a regular basis. But I know many people that swear by OR tools and wouldn't do ANYTHING else.
I would argue that using an OR mapper will increase readability and maintainability of your applications source code, while using SP will increase the performance of the application.
They are not actually mutually exclusive, though to your point they usually are so.
The advantage of using Object Relational mapping is that you can swap out data sources. Not only database structure, but you could use any data source. With advent web services / Service-oriented architecture / ESB's, in a larger corporation, it would be wise to consider having a higher level separation of concerns than what you could get in stored procedures. However, in smaller companies and in application that will never use a different data source, then SP's can fit the bill fine. And one last point, it is not necessary to use an OR mapper to get the abstraction. My former team had great success by simply using an adapter model using Spring.NET to plug-in the data source.
# Kent Fredrick
My personal opinion of OR Mappers is their existence highlights a design flaw in the popular structure of databases"
I think you're talking about the difference between the relational model and object-oriented model. This is actually why we need ORMs, but the implementations of these models were done on purpose - it is not a design flow - it is just how things turned out to be historically.
Use stored procedures where you have identified a performance bottleneck. if you haven't identified a bottleneck, what are you doing with premature optimisation?
Use stored procedures where you are concerned about security access to a particular table.
Use stored procs when you have a SQL wizard who is prepared to sit and write complex queries that join together loads of tables in a legacy database- to do the things that are hard in an OR mapper.
Use the OR mapper for the other (at least) 80% of your database: where the selects and updates are so routine as to make access through stored procedures alone a pointless exercise in manual coding, and where updates are so infrequent that there is no performance cost. Use an OR mapper to automate the easy stuff.
Most OR mappers can talk to stored procs for the rest.
You should not use stored procs assuming that they're faster than a sql statement in a string, this is not necessarily the case in the last few versions of MS SQL server.
You do not need to use stored procs to thwart SQL injection attacks, there are other ways to do make sure that your query parameters are strongly typed and not just string-concatenated.
You don't need to use an OR mapper to get a POCO domain model, but it does help.
If you already have a data API that's exposed as sprocs, you'd need to justify a major architectural overhaul to go to ORM.
For a green-fields build, I'd evaluate several things:
If there's a dedicated DBA on the team, I'd lean to sprocs
If there's more than one application touching the same DB I'd lean to sprocs
If there's no possibility of database migration ever, I'd lean to sprocs
If I'm trying to implement MVCC in the DB, I'd lean to sprocs
If I'm deploying this as a product with potentially multiple backend dbs (MySql, MSSql, Oracle), I'd lean to ORM
If I'm on a tight deadline, I'd lean to ORM, since it's a faster way to create my domain model and keep it in sync with the data model (with appropriate tooling).
If I'm exposing the same domain model in multiple ways (web app, web service, RIA client), I'll lean to ORM as then data model is then hidden behind my ORM facade, making a robust domain model is more valuable to me.
I think performance is a bit of a red herring; hibernate seems to perform nearly as well or better than hand-coded SQL (due to it's caching tiers), and it's easy to write a bad query in your sproc either way.
The most important criteria are probably the team's skillset and long-term database portability needs.
Well the SP's are already there. It doesn't make sense to can them really. I guess does it make sense to use a mapper with SP's?
"I'm trying to drive in a nail. Should I use the heel of my shoe or a glass bottle?"
Both Stored Procedures and ORMs are difficult and annoying to use for a developer (though not necessarily for a DBA or architect, respectively), because they incur a start-up cost and higher maintenance cost that doesn't guarantee a pay-off.
Both will pay off well if the requirements aren't expected to change much over the lifespan of the system, but they will get in your way if you're building the system to discover the requirements in the first place.
Straight-coded SQL or quasi-ORM like LINQ and ActiveRecord is better for build-to-discover projects (which happen in the enterprise a lot more than the PR wants you to think).
Stored Procedures are better in a language-agnostic environment, or where fine-grained control over permissions is required. They're also better if your DBA has a better grasp of the requirements than your programmers.
Full-blown ORMs are better if you do Big Design Up Front, use lots of UML, want to abstract the database back-end, and your architect has a better grasp of the requirements than either your DBA or programmers.
And then there's option #4: Use all of them. A whole system is not usually just one program, and while many programs may talk to the same database, they could each use whatever method is appropriate both for the program's specific task, and for its level of maturity. That is: you start with straight-coded SQL or LINQ, then mature the program by refactoring in ORM and Stored Procedures where you see they make sense.