In non-procedural languages, what specifies how things are to be done? - sql

If you compare C vs SQL, this is the argument:
In contrast to procedural languages such as C, which describe how things should be done, SQL is nonprocedural and describes what should be done.
So, the how part for languages like SQL is specified by the language itself, is it? What if I want to change the way some query works? Suppose I want to change the way a SELECT is handled. Is that possible?

So, the how part for languages like SQL is specified by the language itself, is it?
Not strictly by the language (i.e. SQL), but normally by the database and its optimiser. As such, even where the same data is being queried from tables with the same structures and the same indexes, some databases will build the result set in a different way to others.
Suppose I want to change the way a SELECT is handled. Is that possible?
To some degree, yes. You can either:
Rewrite the query, to achieve the same result a different way, or
Use hinting - http://en.wikipedia.org/wiki/Hint_%28SQL%29
Neither of these directly instructs the database engine which approach to use, but both of them will affect how the result set is returned - this is likely to vary between databases.
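For illustration, a hedged sketch of Oracle-style hint syntax (the table and index names are made up): the hint nudges the optimiser toward a particular access path without changing what the query asks for.
SELECT /*+ INDEX(e emp_last_name_idx) */ employee_id, last_name
FROM   employees e
WHERE  last_name = 'Smith';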
Additionally, I understand that some databases have additional interfaces that allow more low-level interaction with the database engine, enabling greater control over how a query is built than is possible from plain SQL. (However, your question did specify SQL.)

This is actually exaggerating the difference. There is no clear-cut point at which one is telling how things are done and the other only telling what is done. Rather, one may have to specify what/how things are done at a greater level of detail than the other. A typical SQL implementation allows the user to control such things as what indexes are used (or ignored), what kind of locking to do, and so on.
If you were to do the same job in C, you would (at some point) have to specify a great deal more detail (unless you used something like ODBC). Nonetheless, you're still telling what should be done, not all the details of how it should be done (e.g., despite being about as low-level as possible short of assembly language, C will still do some type conversions automatically, so you don't have to tell it how to do something like adding an integer to a floating point number -- you just tell it to add them, and it handles the details).
Bottom line: trying to talk about one as procedural and the other as non-procedural is misleading. SQL doesn't always require as much detail, but it's a difference of degree, not really "how" versus "what".

Related

When to use Oracle hints?

I'm doing some refactoring on an Oracle schema (Oracle version 10), and I see a lot of views that use the /*+ ALL_ROWS */ hint. In other views there are also other types of hints.
Why should I use a hint? Doesn't the DB make the best choice based on the query? Many thanks!
That's a good question, but there's no single answer to it because there are different categories of hint for which different advice would apply. http://docs.oracle.com/cd/E11882_01/server.112/e16638/hintsref.htm#PFGRF501
ALL_ROWS is an optimisation approach, and it's perfectly valid to specify it in order to make it clear that your goal is to get the last row of the result set as early as possible, not the first. In many cases the optimiser will deduce that from the query anyway, so it may be redundant, but you're not going to harm anything by using it correctly.
Then there are other categories, some of which might be characterised as being for testing and exploration, such as optimizer_features_enable. Arguably the hints that affect join order, access path, and join operations are of this type, as they're sometimes discouraged for use in applications. However, the optimiser is not perfect and does not have perfect information, and sometimes it will make a choice based on incomplete information that needs to be corrected.
Some hints are unquestionably useful and appropriate -- APPEND is possibly the best example, as it is the standard method for invoking direct path insert.
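A minimal sketch of that (table names are illustrative): the APPEND hint asks for a direct-path insert, writing new blocks above the high-water mark instead of going through the buffer cache row by row.
INSERT /*+ APPEND */ INTO sales_archive
SELECT * FROM sales WHERE sale_date < DATE '2010-01-01';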
In the end it's really difficult to give generalised advice on this. Really, every hint needs to be assessed in respect of whether it should be used in production code or not, but if you understand the optimiser and understand what the hints you are considering really do and whether there are better alternatives -- e.g. better statistics, different init parameters, or dynamic sampling (itself a hint) -- then you'll be able to make your own assessment and defend it if challenged.

Is it possible to write a SQL statement in plain assembly language processor-level code?

Just recently a friend suggested it is possible and achievable (though very difficult) to write a SQL statement in assembly code, since every programming operation eventually gets down to processor-level execution.
I did a bit of research on SQL's behaviour and although it follows relational algebra's theory and platform-independent execution, I still believe that the level of abstraction and semantics are rather distant as to even consider a way to translate a SQL statement to assembly code (a very operations/memory/resources specific set of instructions).
Perhaps you could mimic a SQL statement's processor operations result and try to replicate it with a pure assembly set of instructions. You would come to realise though, that you still would not be writing/translating SQL statements.
Take for instance, MonetDB's SQL Reference page, they state the following in the third paragraph:
"The architecture is based on a compiler, which translates SQL
statements into the MonetDB Assembly Language (MAL). In this process
common optimization heuristics, specific to the relational algebra are
performed."
The SQL language, however, does not even allow raw assembly instructions to be typed in, whereas common languages such as the C-based ones and C# do allow for such typing/imports.
What do you guys think? Thanks for sharing your thoughts!
Anything that runs on your computer can be coded using an assembly language. If a SQL database can run on your machine, then it can be coded in assembly.
It can be ridiculously hard to do though.
The SQL example you mention isn't that far removed from what happens when C or other compiled languages are translated to machine code. Modern optimizing compilers don't translate your C code directly to assembly. They use one (or more) intermediate representations that are easier to perform optimizations on. It's a multi-step process, and the actual assembly output isn't the main part of it complexity-wise.
If you look at it that way, your SQL case is not very different. You could imagine an SQL pre-processor that produces native code from the MAL, given a sufficiently fixed environment (the schema, notably). With something like that, adding extensions to that SQL dialect to allow inline assembly (for aggregate functions, for instance) could make sense. And doing all that manually (i.e. without the pre-processor itself) would be possible.
You lose all the portability and flexibility you get from a runtime SQL interpreter, though: you would have to recompile every time your schema changes, data-dependent optimizations become nearly impossible, etc. So the situations where this would be useful are, I believe, very limited. (The same goes for other languages that are usually run through a VM or interpreter - compiling them down to native code usually carries heavy restrictions.)
The SQL language, however, does not even allow raw assembly instructions to be typed in, whereas common languages such as the C-based ones and C# do allow for such typing/imports.
No, SQL does not allow this because it is a higher-level language than C (or C#). In SQL, the code describes what should be done, not how, nor any details of how to do it. The implementation has to parse the code and compile it into a set of low-level instructions that do what the SQL code describes.
For example, for a SELECT we have no guarantee on what the plan to access the tables will be, in what order they will be accessed, which (if any) indices will be used, what type of operations will be used for joins, if temporary tables will be used or the sorting is done in memory, etc...
So, something like this would be ill-defined and extremely dangerous to be allowed:
SELECT *
FROM a_table AS a
JOIN another_table AS b
ON b.aid = a.id
WHERE b_data LIKE 'Alex%'
( .CODE
getRSP PROC
mov rax, rsp
add rax, 8
ret
getRSP ENDP
END
)
AND a_date BETWEEN '2000-01-01'
AND '2099-12-31'
ORDER BY b_year
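The control SQL does give you is indirect: you can at least ask the engine what plan it chose for a query. A minimal sketch (EXPLAIN-style syntax as in PostgreSQL or MySQL; Oracle uses EXPLAIN PLAN FOR ...; the tables are the ones from the example above):
-- Ask the engine for its chosen plan rather than dictating one.
EXPLAIN
SELECT *
FROM a_table AS a
JOIN another_table AS b ON b.aid = a.id
WHERE b_data LIKE 'Alex%';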
If you are interested in compilation of relational queries/operations to assembler, you might want to check out this paper: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf. In this DBMS, components of LLVM are used to produce CPU-instructions (which I assume is what you mean when you say assembler) from a query within the DBMS.
Also, even though I might be preaching to the choir, I want to make clear that MAL has nothing to do with CPU-instruction assembler. Every single MAL statement is backed by an implementation in C. MAL is used only (ta-da!) as an intermediate representation that is easy to optimize and interpret.
Well, the machine executes instructions you could have written in assembly. However, I wouldn't call writing the assembly language directly doing a SQL query. SQL could be interpreted very differently... e.g. by librarians consulting encyclopediae, in contexts where raw assembly might have little meaning.
No. SQL is an abstraction that can be interpreted* by different SQL implementations with different SQL environments with different physical layouts. Maybe the layouts even change over time, as you ALTER TABLE and now you have a mixture of old and new tuple layouts. Also, there's more you can do with SQL than just run it. You can also type-check it, analyze it to see what kind of effects it has, put it in a view definition or stored procedure, etc.
Here's another way to put it. Can you "write" HTML as assembly language? Maybe you can write a program that, when executed, has the same effect as a browser rendering a particular page. But can your program be processed by AdBlock, NoScript, and whatever other filters I have installed? Anything that supports all of the relevant operations on HTML is going to be isomorphic to HTML itself. Similarly with SQL, and any other language. Any other data structure, in fact: a change in representation must preserve the meaning of all the relevant operations on that data structure. And languages tend to have lots of relevant operations.
(* I don't mean "interpreted" as in "vs compiled"; I mean "given meaning".)

Why prefix sql function names?

What is a scenario that exemplifies a good reason to use prefixes, such as fn_GetName, on function names in SQL Server? It would seem that it would be unnecessary since usually the context of its usage would make it clear that it's a function. I have not used any other language that has ever needed prefixes on functions, and I can't think of a good scenario that would show why SQL is any different.
My only thinking is that perhaps in older IDE's it was useful for grouping functions together when the database objects were all listed together, but modern IDE's already make it clear what is a function.
You are correct about older IDEs
As a DBA trying to fix permissions using SQL Server Enterprise Manager (SQL Server 2000 and 7.0), it was a complete bugger trying to view them. If you had ufn or usp or vw it became easier to group things together because of how the GUI presented the data.
Saying that, if you have SELECT * FROM Thing, what is Thing? A view or a table? It may work for you as a developer but it will crash when deployed because you don't grant permissions on the table itself: only views or procs. Surely a vwThing will keep your blood pressure down...?
If you use schemas, it becomes irrelevant. You can "namespace" your objects: for example, tables in "Data" and other objects per client in other schemas, e.g. WebGUI.
Edit:
Functions. You have table-valued and scalar functions. If you stick with the code-style "VerbNoun" concept, how do you know which is which without some other clue? (Of course, the pair below can't actually happen, because object names are unique.)
SELECT dbo.GetName() FROM MyTable
SELECT * FROM dbo.GetName()
If you use a plural to signify a table valued function, this is arguably worse
SELECT dbo.GetName() FROM MyTable
SELECT * FROM dbo.GetNames()
Whereas this is less ambiguous, albeit offensive to some folk ;-)
SELECT dbo.sfnGetName() FROM MyTable
SELECT * FROM dbo.tfnGetName()
With schemas. And no name ambiguity.
SELECT ScalarFN.GetName() FROM MyTable
SELECT * FROM TableFN.GetName()
Your "any other language" comment doesn't apply. SQL isn't structured like c#, Java, f#, Ada (OK, PL/SQL might be), VBA, whatever: there is no object or namespace hierarchy. No Object.DoStuff method stuff.
A prefix may be just the thing to keep you sane...
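A minimal sketch of the schema-as-namespace idea in T-SQL (the schema, function, and table names here are illustrative, not an established convention):
CREATE SCHEMA ScalarFN;
GO
-- Scalar function living in its own "namespace" schema
CREATE FUNCTION ScalarFN.GetName (@Id int)
RETURNS nvarchar(50)
AS
BEGIN
    RETURN (SELECT Name FROM dbo.MyTable WHERE Id = @Id);
END;
GO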
There's no need to prefix function names with fn_ any more than there's a need to prefix table names with t_ (a convention I have seen). This sort of systematic prefix tends to be used by people who are not yet comfortable with the language and who need the convention as an extra help to understanding the code.
Like all naming conventions, it hardly matters what the convention actually is. What really matters is to be consistent. So even if the convention is wrong, it is still important to stick to it for consistency. Yes, it may be argued that if the naming convention is wrong then it should be changed, but the effort implies a lot: rename all objects, change source code and all maintenance procedures, get the dev team committed to following the new convention, have all the support and ops personnel follow the new rules, etc. In a large org, the effort to change a well-established naming convention is simply overwhelming.
I don't know what your situation is, but you should consider carefully before proposing a naming convention change just for the sake of 'pretty'. No matter how bad the existing naming convention in your org is, it is far better to stick to it and keep the naming consistent than to ignore it and start your own.
Of course, more often than not, the naming convention is not only bad, it is also not followed, and names are inconsistent. In that case, sucks to be you...
What is a scenario that exemplifies a good reason to use prefixes
There is none. People do all kinds of stuff because they have always done so, and quite a number of bad habits are justified with ancient knowledge that has been wrong for many years.
I'm not a fan of prefixes, but perhaps one advantage could be that these fn_ prefixes might make it easier to identify that a particular function is user-defined, rather than in-built.
We had many painful meetings at our company about this. IMO, we don't need prefixes on any object names. However, you could make an argument for putting them on views, if the name of the view might conflict with the underlying table name.
In any case, there is no SQL requirement that prefixes be used, and frankly, IMO, they have no real value.
As others have noticed, any scenario you define can easily be challenged, so there's no rule defining it as necessary; however, there's equally no rule saying it's unnecessary, either. Like all naming conventions, it's more important to have consistent usage rather than justification.
One (and the only) advantage I can think of is that a consistently applied prefixing scheme could make using IntelliSense / autocomplete easier, since related functionality is automagically grouped together.
Since I recently stumbled over this question, I'd like to add the following point in favour of prefixes: imagine you have some object ID from some system table and you want to determine whether it's a function, proc, view, etc. You could of course run a test for each type, but it's much easier to extract the prefix from the object name and then act upon that. This gets even easier when you use an underscore to separate the prefix from the name, e.g. usp_Foo instead of uspFoo. IMHO it's not just about stupid IDEs.
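A sketch of what acting on such a prefix could look like (SQL Server syntax, relying on the underscore separator; the object names are illustrative):
SELECT name,
       LEFT(name, CHARINDEX('_', name + '_') - 1) AS prefix   -- e.g. 'usp', 'ufn', 'vw'
FROM   sys.objects;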

Why use SQL database? [closed]

I'm not quite sure stackoverflow is a place for such a general question, but let's give it a try.
Being exposed to the need of storing application data somewhere, I've always used MySQL or SQLite, just because it's always done like that. As it seems like the whole world is using these databases (most software products, frameworks, etc.), it is rather hard for a beginning developer like me to start thinking about whether this is a good solution or not.
Ok, say we have some object-oriented logic in our application, and objects are related to each other somehow. We need to map this logic to the storage logic, so relations between database objects are required too. This leads us to using a relational database, and I'm ok with that - to put it simply, our database table rows will sometimes need to have references to other tables' rows. But why use the SQL language for interaction with such a database?
An SQL query is a text message. I can understand this is cool for actually understanding what it does, but isn't it silly to use text table and column names for a part of the application that no one ever sees after deployment? If you had to write a data storage layer from scratch, you would never have used this kind of solution. Personally, I would have used some 'compiled db query' bytecode that would be assembled once inside a client application and passed to the database. And it would surely name tables and columns by ID numbers, not ASCII strings. In the case of changes to the table structure, those byte queries could be recompiled according to the new DB schema, stored in XML or something like that.
What are the problems of my idea? Is there any reason for me not to write it myself and to use SQL database instead?
EDIT To make my question more clear: most answers claim that SQL, being a text query, helps developers better understand the query itself and debug it more easily. Personally, I haven't seen people writing SQL queries by hand for a while. Everyone I know, including me, is using ORM. This situation, in which we build up a new level of abstraction to hide SQL, leads to wondering whether we need SQL at all. I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why.
EDIT2 SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
Everyone I know, including me, is using ORM
Strange. Everyone I know, including me, still writes most of the SQL by hand. You typically end up with tighter, more high-performance queries than you do with a generated solution. And, depending on your industry and application, this speed does matter. Sometimes a lot. Yeah, I'll sometimes use LINQ for a quick-n-dirty where I don't really care what the resulting SQL looks like, but thus far nothing automated beats hand-tuned SQL when performance against a large database in a high-load environment really matters.
If all you need to do is store some application data somewhere, then a general-purpose RDBMS or even SQLite might be overkill. Serializing your objects and writing them to a file might be simpler in some cases. An advantage of SQLite is that if you have a lot of this kind of information, it is all contained in one file. A disadvantage is that it is more difficult to read. For example, if you serialize your data to YAML, you can read the file with any text editor or shell.
Personally, I would have used some 'compiled db query' bytecode that would be assembled once inside a client application and passed to the database.
This is how some database APIs work. Check out static SQL and prepared statements.
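A minimal sketch of a prepared statement (PostgreSQL-style PREPARE/EXECUTE syntax; the table and column names are illustrative). The statement is parsed and planned once, then executed repeatedly with different parameter values:
-- Parse and plan once...
PREPARE get_user (integer) AS
    SELECT name, email FROM users WHERE id = $1;
-- ...then execute repeatedly without re-parsing.
EXECUTE get_user(3);
EXECUTE get_user(4);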
Is there any reason for me not to write it myself and to use SQL database instead?
If you need a lot of features, at some point it will be easier to use an existing RDBMS than to write your own database from scratch. If you don't need many features, a simpler solution may be wiser.
The whole point of database products is to avoid writing the database layer for every new program. Yes, a modern RDBMS might not always be a perfect fit for every project. This is because they were designed to be very general, so in practice you will always get additional features you don't need. That doesn't mean it is better to have a custom solution. The glove doesn't always need to be a perfect fit.
UPDATE:
But why use SQL language for interaction with such a database?
Good question.
The answer to that may be found in the original paper describing the relational model A Relational Model of Data for Large Shared Data Banks, by E. F. Codd, published by IBM in 1970. This paper describes the problems with the existing database technologies of the time, and explains why the relational model is superior.
The reason for using the relational model, and thus a logical query language like SQL, is data independence.
Data independence is defined in the paper as:
"... the independence of application programs and terminal activities from the growth in data types and changes in data representations."
Before the relational model, the dominant technology for databases was referred to as the network model. In that model, the programmer had to know the on-disk structure of the data and traverse the tree or graph manually. The relational model allows one to write a query against the conceptual or logical schema that is independent of the physical representation of the data on disk. This separation of logical schema from physical schema is why we use the relational model. For more on this issue, here are some slides from a database class. In the relational model, we use logic-based query languages like SQL to retrieve data.
Codd's paper goes into more detail about the benefits of the relational model. Give it a read.
SQL is a query language that is easy to type into a computer, in contrast with the query languages typically used in research papers. Research papers generally use relational algebra or relational calculus to write queries.
In summary, we use SQL because we happen to use the relational model for our databases.
If you understand the relational model, it is not hard to see why SQL is the way it is. So basically, you need to study the relational model and database internals more in depth to really understand why we use SQL. It may be a bit of a mystery otherwise.
UPDATE 2:
SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
Because the database is a relational database, it only understands relational query languages. Internally it uses a relational-algebra-like language for specifying queries, which it then turns into a query plan. So, we write our query in a form we can understand (SQL), and the DB takes our SQL query and turns it into its internal query language. Then it takes the query and tries to find a "query plan" for executing it. Then it executes the query plan and returns the result.
At some point, we must encode our query in a format that the database understands. The database only knows how to convert SQL to its internal representation, that is why there is always SQL at some point in the chain. It cannot be avoided.
When you use an ORM, you're just adding a layer on top of the SQL. The SQL is still there, it's just hidden. If you have a higher-level layer for translating your requests into SQL, then you don't need to write SQL directly, which is beneficial in some cases. Sometimes we do not have such a layer capable of doing the kinds of queries we need, so we must use SQL.
Given the fact that you used MySQL and SQLite, I understand your point of view completely. Most DBMSs have features that would otherwise require some programming on your side, but that you get from the database for free:
Indexes - you can store large amounts of data and still be able to filter and search very quickly because of indexes. Of course, you could implement your own indexing, but why reinvent the wheel?
data integrity - using database features like cascading foreign keys can ensure data integrity across the system. You only need to declare the relationship between data, and the system takes care of the rest (see the sketch after this list). Of course, once more, you could implement constraints in code, but it's more work. Consider, for example, deletion, where you would have to write code in the object's destructor to track all dependent objects and act accordingly
ability to have multiple applications written in different programming languages, working on different operating systems, some even distributed across the network - all using the same data stored in a common database
dead easy implementation of the observer pattern via triggers. There are many cases where only some data depends on some other data and it does not affect the UI aspect of the application. Ensuring consistency can be very tricky or require a lot of programming. Of course, you could implement trigger-like behaviour with objects, but it requires more programming than a simple SQL definition
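A minimal sketch of the cascading-delete point above (table names are illustrative): declare the relationship once and the database cleans up dependent rows, instead of the application writing destructor-style tracking code.
CREATE TABLE orders (
    id    INTEGER PRIMARY KEY,
    total NUMERIC NOT NULL
);
CREATE TABLE order_items (
    id       INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
    qty      INTEGER NOT NULL
);
-- Deleting an order automatically removes its items:
DELETE FROM orders WHERE id = 42;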
There are some good answers here. I'll attempt to add my two cents.
I like SQL, I can think in it pretty easily. The queries produced by layers on top of the DB (like ORM frameworks) are usually hideous. They'll select tons of extra stuff, join in things you don't need, etc.; all because they don't know that you only want a small part of the object in this code. When you need high performance, you'll often end up going in and using at least some custom SQL queries in an ORM system just to speed up a few bottlenecks.
Why SQL? As others have said, it's easy for humans. It makes a good lowest common denominator. Any language can generate SQL and call command-line clients if necessary, and there is pretty much always a good library.
Is parsing out the SQL inefficient? Somewhat. The grammar is pretty structured, so there aren't tons of ambiguities that would make the parser's job really hard. The real thing is that the overhead of parsing out SQL is basically nothing.
Let's say you run a query like "SELECT x FROM table WHERE id = 3", and then do it again with 4, then 5, over and over. In that case, the parsing overhead may exist. That's why you have prepared statements (as others have mentioned). The server parses the query once, and can swap in the 3 and 4 and 5 without having to reparse everything.
But that's the trivial case. In real life, your system may join 6 tables and have to pull hundreds of thousands of records (if not more). It may be a query that you let run on a database cluster for hours, because that's the best way to do things in your case. Even with a query that takes only a minute or two to execute, the time to parse the query is essentially free compared to pulling records off disk and doing sorting/aggregation/etc. The overhead of sending the text "LEFT OUTER JOIN ON" is only a few bytes compared to sending a specially encoded byte 0x3F. But when your result set is 30 MB (let alone gigs+), those few extra bytes are irrelevant compared to the benefit of not having to mess with some special query compiler object.
Many people use SQL on small databases. The biggest one I interact with is only a few dozen gigs. SQL is used on everything from tiny files (as little SQLite DBs may be) up to terabyte-sized Oracle clusters. Considering its power, it's actually a surprisingly simple and small command set.
It's a ubiquitous standard. Pretty much every programming language out there has a way to access SQL databases. Try that with a proprietary binary protocol.
Everyone knows it. You can find experts easily, and new developers will usually understand it to some degree without requiring training.
SQL is very closely tied to the relational model, which has been thoroughly explored in regard to optimization and scalability. But it still frequently requires manual tweaking (index creation, query structure, etc.), which is relatively easy due to the textual interface.
But why use SQL language for interaction with such a database?
I think it's for the same reason that you use a human-readable (source code) language for interaction with the compiler.
Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database.
This is an existing (optional) feature of databases, called "stored procedures".
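A minimal sketch of the idea (PostgreSQL syntax, which models these as server-side functions; the table, column, and function names are illustrative). The query text lives and is planned inside the database; the client only sends the call and a parameter value:
CREATE FUNCTION find_users(name_prefix text)
RETURNS TABLE(user_id integer, user_name text)
LANGUAGE sql AS $$
    SELECT id, name FROM users WHERE name LIKE name_prefix || '%';
$$;
-- Client-side call:
SELECT * FROM find_users('Alex');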
Edit:
I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why
When I implemented my own ORM, I implemented the ORM framework using ADO.NET: and using ADO.NET includes using SQL statements in its implementation.
After all the edits and comments, the main point of your question appears to be : why is the nature of SQL closer to being a human/database interface than to being an application/database interface ?
And the very simple answer to that question is : because that is exactly what it was originally intended to be.
The predecessors of SQL (QUEL being presumably the most important one) were intended to be exactly that : a QUERY language, i.e. one that didn't have any of INSERT, UPDATE, DELETE.
Moreover, it was intended to be a query language that could be used by any user, provided that user was aware of the logical structure of the database, and obviously knew how to express that logical structure in the query language he was using.
The original ideas behind QUEL/SQL were that a database was built using "just any mechanism conceivable", that the "real" database could be really just anything (e.g. one single gigantic XML file - although 'XML' was not considered a valid option at the time), and that there would be "some kind of machinery" that understood how to transform the actual structure of that 'just anything' into the logical relational structure as it was perceived by the SQL user.
The fact that in order to actually achieve that, the underlying structures are required to lend themselves to "viewing them relationally", was not understood as well in those days as it is now.
Yes, it is annoying to have to write SQL statements to store and retrieve objects.
That's why Microsoft have added things like LINQ (language integrated query) into C# and VB.NET to make it possible to query databases using objects and methods instead of strings.
Most other languages have something similar with varying levels of success depending on the abilities of that language.
On the other hand, it is useful to know how SQL works and I think it is a mistake to shield yourself entirely from it. If you use the database without thinking you can write extremely inefficient queries and index the database incorrectly. But once you understand how to use SQL correctly and have tuned your database, you have a very powerful tried-and-tested tool available for finding exactly the data you need extremely quickly.
My biggest reason for SQL is ad-hoc reporting: that report your business users want but don't yet know they need.
SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
I use sqlite a lot right from the simplest of tasks (like logging my firewall logs directly to a sqlite database) to more complex analytic and debugging tasks in my day-to-day research. Laying out my data in tables and writing SQL queries to munge them in interesting ways seems to be the most natural thing to me in these situations.
On your point about why it is still used as an interface between application/database, this is my simple reasoning:
There are about three to four decades of serious research in that area, starting in 1970 with Codd's seminal paper on relational algebra. Relational algebra forms the mathematical basis of SQL (and other QLs), although SQL does not completely follow the relational model.
The "text" form of the language (aside from being easily understandable to humans) is also easily parsable by machines (say, using a grammar parser like lex) and is easily convertible to whatever "bytecode" using any number of optimizations.
I am not sure that doing this in any other way would have yielded compelling benefits in the generic case; otherwise it would probably have been discovered and adopted during those three decades of research. SQL probably provides the best tradeoffs when bridging the divide between humans/databases and applications/databases.
The question that then becomes interesting to ask is, "What are the real benefits of doing SQL in any other "non-text" way?" Will google for this now:)
SQL is a common interface used by the DBMS platform - the entire point of the interface is that all database operations can be specified in SQL without needing supplementary API calls. This means that there is a common interface across all clients of the system - application software, reports and ad-hoc query tools.
Secondly, SQL gets more and more useful as queries get more complex. Try using LINQ to specify a 12-way join with three conditions based on existential predicates and a condition based on an aggregate calculated in a subquery. This sort of thing is fairly comprehensible in SQL but unlikely to be possible in an ORM.
In many cases an ORM will do 95% of what you want - most of the queries issued by applications are simple CRUD operations that an ORM or other generic database interface mechanism can handle easily. Some operations are best done using custom SQL code.
However, ORMs are not the be-all and end-all of database interfacing. Fowler's Patterns of Enterprise Application Architecture has quite a good section on other types of database access strategy with some discussion of the merits of each.
There are often good reasons not to use an ORM as the primary database interface layer. An example of a good one is that platform database libraries like ADO.Net often do a good enough job and integrate nicely with the rest of the environment. You might find that the gain from using some other interface doesn't really outweigh the benefits from the integration.
However, the final reason that you can't really ignore SQL is that you are ultimately working with a database if you are doing a database application. There are many, many WTF stories about screw-ups in commercial application code done by people who didn't understand databases properly. Poorly thought-out database code can cause trouble in so many ways, and blithely thinking that you don't need to understand how the DBMS works is an act of Hubris that is bound to come and bite you some day. Worse yet, it will come and bite some other poor schmoe who inherits your code.
While I see your point, SQL's query language has a place, especially in large applications with a lot of data. And to point out the obvious, if the language wasn't there, you couldn't call it SQL (Structured Query Language). The benefit of having SQL over the method you described is SQL is generally very readable, though some really push the limits on their queries.
I wholeheartedly agree with Mark Byers: you should not shield yourself from SQL. Any developer can write SQL, but to really make your application perform well with SQL interaction, understanding the language is a must.
If everything was precompiled with bytecode as you described, I'd hate to be the one to have to debug the application after the original developer left (or even after not seeing the code for 6 months).
I think the premise of the question is incorrect. That SQL can be represented as text is immaterial. Most modern databases would only compile queries once and cache them anyway, so you effectively already have a 'compiled bytecode'. And there's no reason this couldn't happen client-side, though I'm not sure if anyone's done it.
You said SQL is a text message; well, I think of it as a messenger, and, as we know, don't shoot the messenger. The real issue is that relations are not a good enough way of organising real-world data. SQL is just lipstick on the pig.
In the first part you seem to refer to what is usually called the object-relational impedance mismatch. There are already a lot of frameworks to alleviate that problem. There are tradeoffs as well: some things will be easier, others will get more complex, but in the general case they work well if you can afford the extra layer.
In the second part you seem to complain about SQL being text (it uses strings instead of ids, etc)... SQL is a query language. Any language (computer or otherwise) that is meant to be read or written by humans is text oriented for that matter. Assembly, C, PHP, you name it. Why? Because, well... it does make sense, doesn't it?
If you want precompiled queries, you already have stored procedures. Prepared statements are also compiled once on the fly, IIRC. Most (if not all) db drivers talk to the database server using a binary protocol anyway.
Yes, text is a bit inefficient. But actually getting the data is a lot more costly, so the overhead of text-based SQL is comparatively insignificant.
SQL was created to provide an interface to make ad hoc queries against a relational database.
Generally, most relational databases understand some form of SQL.
Object-oriented databases exist, and (presumably) use objects to do their querying... but as I understand it, OO databases have a lot more overhead, and relational databases work just fine.
Relational Databases also allow you to operate in a "disconnected" state. Once you have the information you asked for, you can close the database connection. With an OO database, you either need to return all objects related to the current one (and the ones they're related to... and the... etc...) or reopen the connection to retrieve new objects as they are accessed.
In addition to SQL, you also have ORMs (object-relational mappings) that map objects to SQL and back. There are quite a few of them, including LINQ (.NET), the MS Entity Framework (.NET), Hibernate (Java), SQLAlchemy (Python), ActiveRecord (Ruby), Class::DBI (Perl), etc...
A database language is useful because it provides a logical model for your data independent of any applications that use it. SQL has a lot of shortcomings however, not the least being that its integration with other languages is poor, type support is about 30 years behind the rest of the industry and it has never been a truly relational language anyway.
SQL has survived mostly because the database market has been and remains dominated by the three mega-vendors who have a vested interest in protecting their investment. That's changing and SQL's days are probably numbered but the model that will finally replace it probably hasn't arrived yet - although there are plenty of contenders around these days.
I don't think most people are getting your question, though I think it's very clear. Unfortunately I don't have the "correct" answer. I would guess it's a combination of several things:
Semi-arbitrary decisions when it was designed such as ease of use, not needing a SQL compiler (or IDE), portability, etc.
It happened to catch on well (probably due to similar reasons)
And now due to historical reasons (compatibility, well known, proven, etc.) continues to be used.
I don't think most companies have bothered with another solution because it works well, isn't much of a bottleneck, it's a standard, blah, blah..
One of the Unix design principles can be said thusly, "Write programs to handle text streams, because that is a universal interface.".
And that, I believe, is why we typically use SQL instead of some 'byte-SQL' that only has a compilation interface. Even if we did have a byte-SQL, someone would write a "Text SQL", and the loop would be complete.
Also, MySQL and SQLite are less full-featured than, say, MSSQL and Oracle SQL. So you're still in the low end of the SQL pool.
Actually, a few non-SQL database products (like Objectivity, Oracle Berkeley DB, etc.) came along, but none of them succeeded. In the future, if someone finds an intuitive alternative to SQL, that will answer your question.
There are a lot of non relational database systems. Here are just a few:
Memcached
Tokyo Cabinet
As far as finding a relational database that doesn't use SQL as its primary interface, I think you won't find it. Reason: SQL is a great way to talk about relations. I can't figure out why that's a big deal to you: if you don't like SQL, put an abstraction over it (like an ORM) so you don't have to worry about it. Let the abstraction worry about it. It gets you to the same place.
However, the problem you're really mentioning here is the object-relational disconnect - the problem is with the relation itself. Objects and relational tuples don't always lend themselves to a 1-1 relationship, which is the reason a developer can get frustrated with a database. The solution to that is to use a different database type.
Because often you cannot be sure that, as you put it, no one will ever see it after deployment. Knowing that there is an easy interface for reporting and for dataset-level querying is a good path for the evolution of your app.
You're right, that there are other solutions that may be valid in some situations: XML, plain text files, OODB...
But having a set of common interfaces (like ODBC) is a huge plus for the life of data.
I think the reason might be the search/find/grab algorithms the SQL language is connected to. Remember that SQL has been developed over 40 years, with both performance and user-friendliness as goals.
Ask yourself what the best way of finding two attributes is. Why investigate that from scratch every time you want to do something that involves it, when the main goal is developing your application?
An application has similarities with other applications, and a database has similarities with other databases. So, logically, there should be a "best way" for these to interact.
Also ask yourself how you would develop a better console-only application that does not use the SQL language. If you cannot do that, I think you would need to develop a new kind of GUI that is even more fundamentally easy to use than a console to develop things from. And that might actually be possible. But still, most development of applications is based around a console and typing.
Then, when it comes to language, I don't think you can make a fundamentally much easier text language than SQL. And remember that each word is inseparably connected to its meaning - if you remove the meaning the word cannot be used, and if you remove the word you cannot communicate the meaning. You have nothing to describe it with (and maybe you cannot even think it, because it wouldn't be connected to anything else you have thought before...).
So basically the best possible algorithms for database manipulation are assigned to words - if you remove these words you will have to assign these manipulations to something else - and what would that be?
I think you can use an ORM if and only if you know the basics of SQL; otherwise the result isn't the best.

Languages other than SQL in postgres

I've been using PostgreSQL a little bit lately, and one of the things that I think is cool is that you can use languages other than SQL for scripting functions and whatnot. But when is this actually useful?
For example, the documentation says that the main use for PL/Perl is that it's pretty good at text manipulation. But isn't that more of something that should be programmed into the application?
Secondly, is there any valid reason to use an untrusted language? It seems like making it so that any user can execute any operation would be a bad idea on a production system.
PS. Bonus points if someone can make PL/LOLCODE seem useful.
#Mike: this kind of thinking makes me nervous. I've heard too many times "this should be infinitely portable", but when the question is asked - do you actually foresee that there will be any porting? - the answer is no.
Sticking to the lowest common denominator can really hurt performance, as can the introduction of abstraction layers (ORMs, PHP PDO, etc.). My opinion is:
Evaluate realistically if there is a need to support multiple RDBMS's. For example if you are writing an open source web application, chances are that you need to support MySQL and PostgreSQL at least (if not MSSQL and Oracle)
After the evaluation, make the most of the platform you decided upon
And BTW: you are mixing relational with non-relational databases (CouchDB is not an RDBMS comparable with Oracle, for example), further exemplifying the point that the perceived need for portability is often greatly overestimated.
"isn't that [text manipulation] more of something that should be programmed into the application?"
Usually, yes. The generally accepted "three-tier" application design for databases says that your logic should be in the middle tier, between the client and the database. However, sometimes you need some logic in a trigger or need to index on a function, requiring that some code be placed into the database. In that case all the usual "which language should I use?" questions come up.
If you only need a little logic, the most-portable language should probably be used (pl/pgSQL). If you need to do some serious programming though, you might be better off using a more expressive language (maybe pl/ruby). This will always be a judgment call.
"is there any valid reason to use an untrusted language?"
As above, yes. Again, putting direct file access (for example) into your middle tier is best when possible, but if you need to fire things off based on triggers (that might need access to data not available directly to your middle tier), then you need untrusted languages. It's not ideal, and should generally be avoided. And you definitely need to guard access to it.
These days, any "unique" or "cool" feature in a DBMS makes me incredibly nervous. I break out in a rash and have to stop work until the itching goes away.
I just hate to be locked in to a platform unnecessarily. Suppose you build a big chunk of your system in PL/Perl inside the database. Or in C# within SQL Server, or PL/SQL within Oracle, there are plenty of examples*.
Now you suddenly discover that your chosen platform doesn't scale. Or isn't fast enough. Or something. Worse, there's a new kid on the database block (something like MonetDB, CouchDB, or Caché, say, but much cooler) that would solve all your problems (even if your only problem, like mine, is having an uncool database platform). And you can't switch to it without recoding half your application.
(*Admittedly, the paid-for products are to some extent seeking to lock you in by persuading you to use their unique features, which is not an accusation that can directly be levelled at the free providers, but the effect is the same).
So that's a rant on the first part of the question. Heart-felt, though.
is there any valid reason to use an untrusted language? It seems like making it so that any user can execute any operation would be a bad idea
My goodness, yes it does! A sort of "Perl injection attack"? Almost worth doing it just to see what happens, I'd have thought.
For philosophical reasons outlined above I think I'll pass on the PL/LOLCODE challenge. Although I was somewhat amazed to discover it was a link to something extant.
From my perspective, I guess the answer is 'it depends'.
There is an argument that manipulation of the data belongs in the database layer, so that the business logic does not need to be overly concerned about how the manipulation happens, it just knows that it has.
Another very good reason to process data in the DB layer is if the volume of data being crunched means that network bandwidth will become an issue. I once had to categorise very large amounts of data. Processing this in the application layer was severely restricted by the time required to transfer all the data across the network for processing.
I then wrote a binning algorithm in PL/pgSQL and it worked much faster.
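The original function isn't shown, but a minimal sketch of the idea in PL/pgSQL might look like the following (the measurements table and the fixed-width binning logic are illustrative assumptions). The point is that only one row per bin crosses the network:
CREATE OR REPLACE FUNCTION bin_counts(bin_width numeric)
RETURNS TABLE(bin_start numeric, n bigint)
LANGUAGE plpgsql AS $$
BEGIN
    -- Aggregate inside the database; return one row per bin.
    RETURN QUERY
    SELECT floor(m.value / bin_width) * bin_width, count(*)
    FROM measurements m
    GROUP BY 1
    ORDER BY 1;
END;
$$;
-- SELECT * FROM bin_counts(10);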
Regarding untrusted languages, I heard a podcast from Josh Berkus (a Postgres advocate) who discussed an application of PostgreSQL that brought in data from MySQL as part of its processing, so that the communication itself was handled by the Postgres server. I don't remember the full details; I think it was on the FLOSS Weekly podcast, which is quite an interesting discussion of the history of PostgreSQL and some of the uses it is put to.
The untrusted versions of the procedural languages allow you to access I/O on the system.
This can come in handy if you need a trigger or something to send an email or connect to a socket server to send a popup notification. There are tons of uses for this type of thing, and because of PostgreSQL's isolation levels you can safely do things like this.
You can put checkpoints in the function so if the transaction fails the email or whatever won't go out. The nice thing about doing this is it removes the logic from the client and puts it on the server.
I think most additional languages are offered so that if you develop in that language on a regular basis, you can feel comfortable writing db functions, triggers, etc. The usefulness of these features is to provide a control over data as close to the data as possible.
An example of a useful stored procedure I recently wrote in an external language that would not have been possible in pl/sql is a version of 'df' which allowed SQL table generators to pick a tablespace with the most free space available at runtime.
I used plperlu, and it was relatively simple, although I had to be careful with data typing.
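Not the original code, but a rough sketch of the shape such a function can take (plperlu is needed because it shells out to the OS; the df -P column layout and all names here are assumptions):
CREATE OR REPLACE FUNCTION disk_free()
RETURNS TABLE(filesystem text, avail_kb bigint)
LANGUAGE plperlu AS $$
    # Untrusted language: we can run an OS command, which trusted
    # languages like plpgsql cannot do.
    my @rows;
    for my $line (split /\n/, `df -Pk`) {
        next if $line =~ /^Filesystem/;    # skip the header row
        my ($fs, $size, $used, $avail) = split /\s+/, $line;
        push @rows, { filesystem => $fs, avail_kb => $avail };
    }
    return \@rows;
$$;
-- e.g. pick the filesystem with the most free space:
-- SELECT filesystem FROM disk_free() ORDER BY avail_kb DESC LIMIT 1;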