I've been studying proper relational algebra from Christopher Date's book Database in Depth: Relational Theory for Practitioners. Throughout the book he uses Tutorial D, the language he and Hugh Darwen came up with, to convey the theory. In general I think Tutorial D is a very workable query language, much more flexible than SQL, and so (just for fun) I was keen to take a stab at writing a little (undoubtedly poorly performing) RDBMS based on Tutorial D rather than SQL.
Realizing this is a mammoth task even to make something basic, I wonder whether there are existing storage systems that represent relations in the relational sense rather than tables in the SQL sense, and that don't assume any particular query language is used to access the data, but just provide low-level functions like product, join, intersect, union, project, etc. (at the C level, not at a query language level).
Am I making sense? :) Basically I'd like to take something like this and stick a Tutorial D (or similar) query interface in front of it.
It's really easy to do everything in memory, but representing the data structures on disk in a fashion that is even mildly efficient is pretty tricky and probably over my head without some serious research.
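To make the "easy in memory" part concrete, here is a minimal sketch of a few of those low-level operators. It is written in Python rather than C purely for brevity, and everything in it (the function names, the representation of a relation as a frozenset of attribute/value pairs, the sample suppliers and shipments data) is an assumption made for illustration, not taken from any existing storage engine. It ignores types, headings of empty relations, and persistence entirely.

```python
from itertools import product as cartesian

# Assumed representation: a relation is a frozenset of tuples, and each tuple
# is a frozenset of (attribute, value) pairs, so duplicates cannot occur.

def relation(*rows):
    """Build a relation from dicts that all share the same attribute names."""
    return frozenset(frozenset(row.items()) for row in rows)

def project(rel, attrs):
    """Keep only the named attributes; duplicate tuples collapse automatically."""
    return frozenset(
        frozenset((a, v) for a, v in tup if a in attrs) for tup in rel
    )

def restrict(rel, predicate):
    """Keep only the tuples for which the predicate holds."""
    return frozenset(t for t in rel if predicate(dict(t)))

def natural_join(left, right):
    """Combine tuples that agree on every attribute they have in common.

    With no common attributes this degenerates to the Cartesian product.
    """
    result = set()
    for l, r in cartesian(left, right):
        ld, rd = dict(l), dict(r)
        if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
            result.add(frozenset({**ld, **rd}.items()))
    return frozenset(result)

def union(left, right):
    """Set union; the caller must ensure both relations have the same heading."""
    return left | right

suppliers = relation(
    {"sno": "S1", "city": "London"},
    {"sno": "S2", "city": "Paris"},
)
shipments = relation(
    {"sno": "S1", "pno": "P1"},
    {"sno": "S1", "pno": "P2"},
    {"sno": "S2", "pno": "P1"},
)

joined = natural_join(suppliers, shipments)
cities_shipping_p1 = project(
    restrict(joined, lambda t: t["pno"] == "P1"), {"city"}
)
print(sorted(dict(t)["city"] for t in cities_shipping_p1))  # ['London', 'Paris']
```

The hard part the question asks about, laying these structures out on disk efficiently, is exactly what this sketch leaves out.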
A typical SQL-based RDBMS, where SQL is the interface for structured input between the user and the database engine, uses what is called a query optimizer, which takes the query expression and generates a set of execution plans.
The cheapest execution plan is then executed on the database; that's what generates result sets.
So, if you took an open source RDBMS implementation and wanted to modify it to accept a different query language, all you would have to do would be to translate the query language of your choice into an execution plan.
That's not to say that what you're trying to do is easy. Just that it should be possible, without having to write your own RDBMS. You would need to write a lexer and interpreter for your query language and then figure out how to transfer your interpreted query expression to the database engine's optimizer so that it can generate the execution plans, and execute the most efficient of them.
Take a look at SQLite as a compact open source relational database engine.
Dave Voorhis' Rel already does what you seem to want to build.
http://dbappbuilder.sourceforge.net/Rel.php
Unless of course it is your express purpose to try and build for yourself ...
Note that a front end for Tutorial D would not be query-language agnostic ;)
My vote also goes for Rel.
Hugh Darwen maintains a list of projects related to TTM (the spec for a D language, of which Tutorial D is an implementation). I'm sure he would love to hear of your efforts if they come to anything.
I'm not quite sure Stack Overflow is the place for such a general question, but let's give it a try.
Whenever I needed to store application data somewhere, I've always used MySQL or SQLite, just because that's how it's always done. Since it seems like the whole world is using these databases (most software products, frameworks, etc.), it is rather hard for a beginning developer like me to start thinking about whether this is a good solution or not.
OK, say we have some object-oriented logic in our application, and objects are related to each other somehow. We need to map this logic to the storage logic, so relations between database objects are required too. This leads us to using a relational database, and I'm OK with that: to put it simply, our database table rows will sometimes need to have references to other tables' rows. But why use SQL language for interaction with such a database?
An SQL query is a text message. I can understand this is cool for actually understanding what it does, but isn't it silly to use text table and column names for a part of the application that no one ever sees after deployment? If you had to write a data store from scratch, you would never have used this kind of solution. Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database. And it surely would name tables and columns by ID numbers, not ASCII strings. If the table structure changed, those byte queries could be recompiled according to the new DB schema, stored in XML or something like that.
What are the problems of my idea? Is there any reason for me not to write it myself and to use SQL database instead?
EDIT To make my question clearer. Most of the answers claim that SQL, being a text query, helps developers better understand the query itself and debug it more easily. Personally, I haven't seen people writing SQL queries by hand for a while. Everyone I know, including me, is using ORM. This situation, in which we build up a new level of abstraction to hide SQL, makes me wonder whether we need SQL at all. I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why.
EDIT2 SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
Everyone I know, including me, is using ORM
Strange. Everyone I know, including me, still writes most of the SQL by hand. You typically end up with tighter, higher-performance queries than you do with a generated solution. And, depending on your industry and application, this speed does matter. Sometimes a lot. Yeah, I'll sometimes use LINQ for a quick-n-dirty query where I don't really care what the resulting SQL looks like, but thus far nothing automated beats hand-tuned SQL when performance against a large database in a high-load environment really matters.
If all you need to do is store some application data somewhere, then a general purpose RDBMS or even SQLite might be overkill. Serializing your objects and writing them to a file might be simpler in some cases. An advantage of SQLite is that if you have a lot of this kind of information, it is all contained in one file. A disadvantage is that it is more difficult to read. For example, if you serialize your data to YAML, you can read the file with any text editor or shell.
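As a rough illustration of the "serialize and write to a file" option, here is a minimal sketch. It uses JSON from the Python standard library as a stand-in for YAML (an assumption made only so the example has no third-party dependency), and the Settings class, its fields, and the file name are all hypothetical.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

@dataclass
class Settings:
    # Hypothetical application data, used only for illustration.
    username: str
    recent_files: list

def save(settings, path):
    """Write the object as indented JSON so it stays readable in a text editor."""
    path.write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")

def load(path):
    """Rebuild the object from the JSON file."""
    return Settings(**json.loads(path.read_text(encoding="utf-8")))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"
    original = Settings("example", ["a.txt", "b.txt"])
    save(original, path)
    restored = load(path)

print(restored == original)  # True
```

This works well while the data is small and read as a whole; once you need to search or update parts of it, the features of a real database start to pay off.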
Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database.
This is how some database APIs work. Check out static SQL and prepared statements.
Is there any reason for me not to write it myself and to use SQL database instead?
If you need a lot of features, at some point it will be easier to use an existing RDBMS than to write your own database from scratch. If you don't need many features, a simpler solution may be wiser.
The whole point of database products is to avoid writing the database layer for every new program. Yes, a modern RDBMS might not always be a perfect fit for every project. This is because they were designed to be very general, so in practice, you will always get additional features you don't need. That doesn't mean it is better to have a custom solution. The glove doesn't always need to be a perfect fit.
UPDATE:
But why use SQL language for interaction with such a database?
Good question.
The answer to that may be found in the original paper describing the relational model, A Relational Model of Data for Large Shared Data Banks, written by E. F. Codd at IBM and published in 1970. This paper describes the problems with the existing database technologies of the time, and explains why the relational model is superior.
The reason for using the relational model, and thus a logical query language like SQL, is data independence.
Data independence is defined in the paper as:
"... the independence of application programs and terminal activities from the growth in data types and changes in data representations."
Before the relational model, the dominant technology for databases was referred to as the network model. In this model, the programmer had to know the on-disk structure of the data and traverse the tree or graph manually. The relational model allows one to write a query against the conceptual or logical schema that is independent of the physical representation of the data on disk. This separation of the logical schema from the physical schema is why we use the relational model. For more on this issue, here are some slides from a database class. In the relational model, we use logic-based query languages like SQL to retrieve data.
Codd's paper goes into more detail about the benefits of the relational model. Give it a read.
SQL is a query language that is easy to type into a computer, in contrast with the query languages typically used in research papers. Research papers generally use relational algebra or relational calculus to write queries.
In summary, we use SQL because we happen to use the relational model for our databases.
If you understand the relational model, it is not hard to see why SQL is the way it is. So basically, you need to study the relational model and database internals in more depth to really understand why we use SQL. It may be a bit of a mystery otherwise.
UPDATE 2:
SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
Because the database is a relational database, it only understands relational query languages. Internally it uses a relational algebra like language for specifying queries which it then turns into a query plan. So, we write our query in a form we can understand (SQL), the DB takes our SQL query and turns it into its internal query language. Then it takes the query and tries to find a "query plan" for executing the query. Then it executes the query plan and returns the result.
At some point, we must encode our query in a format that the database understands. The database only knows how to convert SQL to its internal representation, that is why there is always SQL at some point in the chain. It cannot be avoided.
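You can watch part of that pipeline from the outside. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement. The orders table, its columns, and the index name are hypothetical. The same SQL text is planned twice, once before and once after an index is created, and the exact wording of the plan rows depends on the SQLite version, so no output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5), ("alice", 7.25)],
)

# The query is written against the logical schema only.
query = "SELECT total FROM orders WHERE customer = ?"

def plan(sql, params):
    """Return the human-readable detail column of each plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

plan_before = plan(query, ("alice",))

# Change the physical design; the query text does not change at all.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = plan(query, ("alice",))

totals = sorted(row[0] for row in conn.execute(query, ("alice",)))

print(plan_before)
print(plan_after)
print(totals)  # [7.25, 10.0]
```

The application only ever sends the SQL text; which plan runs is the engine's decision, which is the data independence described above.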
When you use an ORM, you're just adding a layer on top of the SQL. The SQL is still there, it's just hidden. If you have a higher-level layer for translating your request into SQL, then you don't need to write SQL directly, which is beneficial in some cases. Sometimes we do not have a layer capable of doing the kinds of queries we need, so we must use SQL.
Given the fact that you used MySQL and SQLite, I understand your point of view completely. Most DBMSs give you, for free, features that you would otherwise have to program yourself:
Indexes - you can store large amounts of data and still be able to filter and search very quickly because of indexes. Of course, you could implement your own indexing, but why reinvent the wheel?
Data integrity - using database features like cascading foreign keys can ensure data integrity across the system. You only need to declare the relationships between data, and the system takes care of the rest. Of course, once more, you could implement constraints in code, but it's more work. Consider, for example, deletion, where you would have to write code in an object's destructor to track all dependent objects and act accordingly.
The ability to have multiple applications written in different programming languages, working on different operating systems, some even distributed across the network, all using the same data stored in a common database.
Dead easy implementation of the observer pattern via triggers. There are many cases where some data depends only on some other data and it does not affect the UI aspect of the application. Ensuring consistency can be very tricky or require a lot of programming. Of course, you could implement trigger-like behavior with objects, but it requires more programming than a simple SQL definition.
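Two of the points above (cascading foreign keys and triggers) can be sketched in a few lines with Python's built-in sqlite3 module. The author, book, and audit_log tables and the trigger name are hypothetical, chosen only for illustration. Note that SQLite in particular enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author(id) ON DELETE CASCADE
    );
    CREATE TABLE audit_log (message TEXT NOT NULL);

    -- Observer-style behavior: every new book is recorded without app code.
    CREATE TRIGGER log_book_insert AFTER INSERT ON book
    BEGIN
        INSERT INTO audit_log (message) VALUES ('added book: ' || NEW.title);
    END;
""")

conn.execute("INSERT INTO author (id, name) VALUES (1, 'Example Author')")
conn.execute("INSERT INTO book (title, author_id) VALUES ('Example Book', 1)")

# Declared integrity: a book pointing at a missing author is rejected.
try:
    conn.execute("INSERT INTO book (title, author_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the author removes the dependent book rows automatically.
conn.execute("DELETE FROM author WHERE id = 1")
remaining_books = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
log = [row[0] for row in conn.execute("SELECT message FROM audit_log")]

print(orphan_rejected, remaining_books, log)
```

Doing the same in application code would mean tracking dependents by hand in every program that touches the data.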
There are some good answers here. I'll attempt to add my two cents.
I like SQL, I can think in it pretty easily. The queries produced by layers on top of the DB (like ORM frameworks) are usually hideous. They'll select tons of extra stuff, join in things you don't need, etc.; all because they don't know that you only want a small part of the object in this code. When you need high performance, you'll often end up going in and using at least some custom SQL queries in an ORM system just to speed up a few bottlenecks.
Why SQL? As others have said, it's easy for humans. It makes a good lowest common denominator. Any language can generate SQL and call command line clients if necessary, and there is pretty much always a good library.
Is parsing out the SQL inefficient? Somewhat. The grammar is pretty structured, so there aren't tons of ambiguities that would make the parser's job really hard. The real thing is that the overhead of parsing out SQL is basically nothing.
Let's say you run a query like "SELECT x FROM table WHERE id = 3", and then do it again with 4, then 5, over and over. In that case, the parsing overhead may exist. That's why you have prepared statements (as others have mentioned). The server parses the query once, and can swap in the 3 and 4 and 5 without having to reparse everything.
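A minimal sketch of that pattern with Python's built-in sqlite3 module follows. The items table and its rows are hypothetical (the name table from the example above is a reserved word, so a different name is used). The SQL text contains a ? placeholder and stays identical across calls, so only the bound value changes. As far as I know, the sqlite3 module also keeps a per-connection cache of compiled statements, which is what lets the identical text be reused without reparsing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, x TEXT)")
conn.executemany(
    "INSERT INTO items (id, x) VALUES (?, ?)",
    [(3, "three"), (4, "four"), (5, "five")],
)

# One query text, many executions: the value is bound, not spliced into the SQL.
query = "SELECT x FROM items WHERE id = ?"
results = [conn.execute(query, (item_id,)).fetchone()[0] for item_id in (3, 4, 5)]

print(results)  # ['three', 'four', 'five']
```

Binding values this way also avoids building SQL strings by hand, which removes a common source of injection bugs.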
But that's the trivial case. In real life, your system may join 6 tables and have to pull hundreds of thousands of records (if not more). It may be a query that you let run on a database cluster for hours, because that's the best way to do things in your case. Even with a query that takes only a minute or two to execute, the time to parse the query is essentially free compared to pulling records off disk and doing sorting/aggregation/etc. The overhead of sending the text "LEFT OUTER JOIN ON" is only a few bytes compared to sending a special encoded byte 0x3F. But when your result set is 30 MB (let alone gigs+), those few extra bytes are negligible compared to not having to mess with some special query compiler object.
Many people use SQL on small databases. The biggest one I interact with is only a few dozen gigs. SQL is used on everything from tiny files (like little SQLite DBs may be) up to terabyte-size Oracle clusters. Considering its power, it's actually a surprisingly simple and small command set.
It's a ubiquitous standard. Pretty much every programming language out there has a way to access SQL databases. Try that with a proprietary binary protocol.
Everyone knows it. You can find experts easily, and new developers will usually understand it to some degree without requiring training.
SQL is very closely tied to the relational model, which has been thoroughly explored in regard to optimization and scalability. But it still frequently requires manual tweaking (index creation, query structure, etc.), which is relatively easy due to the textual interface.
But why use SQL language for interaction with such a database?
I think it's for the same reason that you use a human-readable (source code) language for interaction with the compiler.
Personally, I would have used some 'compiled db query' bytecode, that would be assembled once inside a client application and passed to the database.
This is an existing (optional) feature of databases, called "stored procedures".
Edit:
I would be very grateful if you could give some examples in which SQL is used without ORM purposely, and why
When I implemented my own ORM, I implemented the ORM framework using ADO.NET, and using ADO.NET includes using SQL statements in its implementation.
After all the edits and comments, the main point of your question appears to be: why is the nature of SQL closer to being a human/database interface than to being an application/database interface?
And the very simple answer to that question is: because that is exactly what it was originally intended to be.
The predecessors of SQL (QUEL being presumably the most important one) were intended to be exactly that: a QUERY language, i.e. one that didn't have any of INSERT, UPDATE, DELETE.
Moreover, it was intended to be a query language that could be used by any user, provided that user was aware of the logical structure of the database, and obviously knew how to express that logical structure in the query language he was using.
The original ideas behind QUEL/SQL were that a database was built using "just any mechanism conceivable", that the "real" database could be really just anything (e.g. one single gigantic XML file, although 'XML' was not considered a valid option at the time), and that there would be "some kind of machinery" that understood how to transform the actual structure of that 'just anything' into the logical relational structure as it was perceived by the SQL user.
The fact that in order to actually achieve that, the underlying structures are required to lend themselves to "viewing them relationally", was not understood as well in those days as it is now.
Yes, it is annoying to have to write SQL statements to store and retrieve objects.
That's why Microsoft have added things like LINQ (language integrated query) into C# and VB.NET to make it possible to query databases using objects and methods instead of strings.
Most other languages have something similar with varying levels of success depending on the abilities of that language.
On the other hand, it is useful to know how SQL works and I think it is a mistake to shield yourself entirely from it. If you use the database without thinking you can write extremely inefficient queries and index the database incorrectly. But once you understand how to use SQL correctly and have tuned your database, you have a very powerful tried-and-tested tool available for finding exactly the data you need extremely quickly.
My biggest reason for SQL is ad hoc reporting: the report your business users want but don't yet know they need.
SQL is an interface between a human and a database. The question is why do we have to use it for application/database interaction? I still ask for examples of human beings writing/debugging SQL.
I use sqlite a lot right from the simplest of tasks (like logging my firewall logs directly to a sqlite database) to more complex analytic and debugging tasks in my day-to-day research. Laying out my data in tables and writing SQL queries to munge them in interesting ways seems to be the most natural thing to me in these situations.
On your point about why it is still used as an interface between application/database, this is my simple reasoning:
There is about 3-4 decades of serious research in that area, starting in 1970 with Codd's seminal paper on Relational Algebra. Relational Algebra forms the mathematical basis of SQL (and other QLs), although SQL does not completely follow the relational model.
The "text" form of the language (aside from being easily understandable to humans) is also easily parsable by machines (say using a lexer generator like lex) and is easily convertible to whatever "bytecode" using any number of optimizations.
I am not sure if doing this in any other way would have yielded compelling benefits in the generic cases. Otherwise it would probably have been discovered and adopted in the 3 decades of research. SQL probably provides the best tradeoffs when bridging the divide between humans/databases and applications/databases.
The question that then becomes interesting to ask is, "What are the real benefits of doing SQL in any other 'non-text' way?" Will google for this now :)
SQL is a common interface used by the DBMS platform - the entire point of the interface is that all database operations can be specified in SQL without needing supplementary API calls. This means that there is a common interface across all clients of the system - application software, reports and ad-hoc query tools.
Secondly, SQL gets more and more useful as queries get more complex. Try using LINQ to specify a 12-way join with three conditions based on existential predicates and a condition based on an aggregate calculated in a subquery. This sort of thing is fairly comprehensible in SQL but unlikely to be possible in an ORM.
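A scaled-down sketch of the kind of query meant here, run through Python's built-in sqlite3 module: one existential predicate plus one condition on an aggregate computed in a subquery. The customer and purchase tables and their rows are hypothetical, and a real case would involve far more joins than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount REAL
    );
    INSERT INTO customer (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO purchase (customer_id, amount) VALUES (1, 50), (1, 150), (2, 20);
""")

# Customers who have at least one purchase (EXISTS) and whose total spend
# exceeds the average amount of a single purchase (aggregate subqueries).
query = """
    SELECT c.name
    FROM customer AS c
    WHERE EXISTS (SELECT 1 FROM purchase AS p WHERE p.customer_id = c.id)
      AND (SELECT SUM(p.amount) FROM purchase AS p WHERE p.customer_id = c.id)
          > (SELECT AVG(amount) FROM purchase)
    ORDER BY c.name
"""
names = [row[0] for row in conn.execute(query)]
print(names)  # ['a']
```

Even at this size the intent reads fairly directly from the SQL, which is the point being made about comprehensibility.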
In many cases an ORM will do 95% of what you want - most of the queries issued by applications are simple CRUD operations that an ORM or other generic database interface mechanism can handle easily. Some operations are best done using custom SQL code.
However, ORMs are not the be-all and end-all of database interfacing. Fowler's Patterns of Enterprise Application Architecture has quite a good section on other types of database access strategy with some discussion of the merits of each.
There are often good reasons not to use an ORM as the primary database interface layer. An example of a good one is that platform database libraries like ADO.Net often do a good enough job and integrate nicely with the rest of the environment. You might find that the gain from using some other interface doesn't really outweigh the benefits from the integration.
However, the final reason that you can't really ignore SQL is that you are ultimately working with a database if you are doing a database application. There are many, many WTF stories about screw-ups in commercial application code done by people who didn't understand databases properly. Poorly thought-out database code can cause trouble in so many ways, and blithely thinking that you don't need to understand how the DBMS works is an act of Hubris that is bound to come and bite you some day. Worse yet, it will come and bite some other poor schmoe who inherits your code.
While I see your point, SQL's query language has a place, especially in large applications with a lot of data. And to point out the obvious, if the language wasn't there, you couldn't call it SQL (Structured Query Language). The benefit of having SQL over the method you described is SQL is generally very readable, though some really push the limits on their queries.
I wholeheartedly agree with Mark Byers: you should not shield yourself from SQL. Any developer can write SQL, but to really make your application perform well with SQL interaction, understanding the language is a must.
If everything was precompiled with bytecode as you described, I'd hate to be the one to have to debug the application after the original developer left (or even after not seeing the code for 6 months).
I think the premise of the question is incorrect. That SQL can be represented as text is immaterial. Most modern databases would only compile queries once and cache them anyway, so you already have effectively a 'compiled bytecode'. And there's no reason this couldn't happen client-side, though I'm not sure if anyone's done it.
You said SQL is a text message, well I think of him as a messenger, and, as we know, don't shoot the messenger. The real issue is that relations are not a good enough way of organising real world data. SQL is just lipstick on the pig.
In the first part you seem to refer to what is usually called the object-relational impedance mismatch. There are already a lot of frameworks to alleviate that problem. There are tradeoffs as well. Some things will be easier, others will get more complex, but in the general case they work well if you can afford the extra layer.
In the second part you seem to complain about SQL being text (it uses strings instead of ids, etc)... SQL is a query language. Any language (computer or otherwise) that is meant to be read or written by humans is text oriented for that matter. Assembly, C, PHP, you name it. Why? Because, well... it does make sense, doesn't it?
If you want precompiled queries, you already have stored procedures. Prepared statements are also compiled once on the fly, IIRC. Most (if not all) db drivers talk to the database server using a binary protocol anyway.
Yes, text is a bit inefficient. But actually getting the data is a lot more costly, so the cost of text-based SQL is insignificant by comparison.
SQL was created to provide an interface to make ad hoc queries against a relational database.
Generally, most relational databases understand some form of SQL.
Object-oriented databases exist, and (presumably) use objects to do their querying... but as I understand it, OO databases have a lot more overhead, and relational databases work just fine.
Relational Databases also allow you to operate in a "disconnected" state. Once you have the information you asked for, you can close the database connection. With an OO database, you either need to return all objects related to the current one (and the ones they're related to... and the... etc...) or reopen the connection to retrieve new objects as they are accessed.
In addition to SQL, you also have ORMs (object-relational mappings) that map objects to SQL and back. There are quite a few of them, including LINQ (.NET), the MS Entity Framework (.NET), Hibernate (Java), SQLAlchemy (Python), ActiveRecord (Ruby), Class::DBI (Perl), etc...
A database language is useful because it provides a logical model for your data independent of any applications that use it. SQL has a lot of shortcomings however, not the least being that its integration with other languages is poor, type support is about 30 years behind the rest of the industry and it has never been a truly relational language anyway.
SQL has survived mostly because the database market has been and remains dominated by the three mega-vendors who have a vested interest in protecting their investment. That's changing and SQL's days are probably numbered but the model that will finally replace it probably hasn't arrived yet - although there are plenty of contenders around these days.
I don't think most people are getting your question, though I think it's very clear. Unfortunately I don't have the "correct" answer. I would guess it's a combination of several things:
Semi-arbitrary decisions when it was designed such as ease of use, not needing a SQL compiler (or IDE), portability, etc.
It happened to catch on well (probably due to similar reasons)
And now due to historical reasons (compatibility, well known, proven, etc.) continues to be used.
I don't think most companies have bothered with another solution because it works well, isn't much of a bottleneck, it's a standard, blah, blah..
One of the Unix design principles can be stated thus: "Write programs to handle text streams, because that is a universal interface."
And that, I believe, is why we typically use SQL instead of some 'byte-SQL' that only has a compilation interface. Even if we did have a byte-SQL, someone would write a "Text SQL", and the loop would be complete.
Also, MySQL and SQLite are less full-featured than, say, MSSQL and Oracle SQL. So you're still in the low end of the SQL pool.
Actually, a few non-SQL database products (like Objectivity, Oracle Berkeley DB, etc.) have appeared, but none of them succeeded. If someone finds an intuitive alternative to SQL in the future, that will answer your question.
There are a lot of non-relational database systems. Here are just a few:
Memcached
Tokyo Cabinet
As far as finding a relational database that doesn't use SQL as its primary interface, I think you won't find it. Reason: SQL is a great way to talk about relations. I can't figure out why that's a big deal to you: if you don't like SQL, put an abstraction over it (like an ORM) so you don't have to worry about it. Let the abstraction worry about it. It gets you to the same place.
However, the problem you're really mentioning here is the object-relational disconnect: the problem is with the relation itself. Objects and relational tuples don't always lend themselves to a 1-1 relationship, which is the reason why a developer can get frustrated with a database. The solution to that is to use a different database type.
Because often, you cannot be sure that (citing you) "no one ever sees after deployment". Knowing that there is an easy interface for reporting and for dataset-level querying is a good path for the evolution of your app.
You're right, that there are other solutions that may be valid in some situations: XML, plain text files, OODB...
But having a set of common interfaces (like ODBC) is a huge plus for the life of data.
I think the reason might be the search/find/grab algorithms that the SQL language is tied to. Remember that SQL has been developed for 40 years, with both performance and user-friendliness as goals.
Ask yourself what the best way of finding two attributes is. Now, why investigate that again every time you develop an application that needs to do it? After all, the main goal when developing an application is the application itself.
An application has similarities with other applications, a database has similarities with other databases. So there should be a "best way" for these to interact, logically.
Also ask yourself how you would develop a better console-only application that does not use the SQL language. If you cannot do that, I think you would need to develop a new kind of GUI that is fundamentally easier to use than a console for developing things. And that might actually be possible. But most application development is still based around the console and typing.
Then, when it comes to language, I don't think you can make a text language that is fundamentally much easier than SQL. And remember that each word is inseparably connected to its meaning: if you remove the meaning, the word cannot be used; if you remove the word, you cannot communicate the meaning. You have nothing to describe it with (and maybe you cannot even think it, because it wouldn't be connected to anything else you have thought before...).
So basically the best possible algorithms for database manipulation are assigned to words. If you remove these words, you will have to assign these manipulations to something else, and what would that be?
I think you can use an ORM if and only if you know the basics of SQL. Otherwise the result won't be the best.
I occasionally hear things about how SQL sucks and it's not a good language, but I never really hear much about alternatives to it. So, are there other good languages that serve the same purpose (database access) and what makes them better than SQL? Are there any good databases that use this alternative language?
EDIT:
I'm familiar with SQL and use it all the time. I don't have a problem with it, I'm just interested in any alternatives that might exist, and why people like them better.
I'm also not looking for alternative kinds of databases (the NoSQL movement), just different ways of accessing databases.
I certainly agree that SQL's syntax is difficult to work with, both from the standpoint of automatically generating it and from the standpoint of parsing it, and it's not the style of language we would write today if we were designing SQL for the demands we now place on it. I don't think we'd find so many varied keywords if we designed the language today. I suspect join syntax would be different, and functions like GROUP_CONCAT would have more regular syntax rather than sticking more keywords in the middle of the parentheses to control their behavior. Create your own laundry list of inconsistencies and redundancies in SQL that you'd like or expect to see smoothed out if we redesigned the language today.
There aren't any alternatives to SQL for speaking to relational databases (i.e. SQL as a protocol), but there are many alternatives to writing SQL in your applications. These alternatives have been implemented in the form of frontends for working with relational databases. Some examples of a frontend include:
SchemeQL and CLSQL, which are probably the most flexible, owing to their Lisp heritage, but they also look a lot more like SQL than other frontends.
LINQ (in .Net)
ScalaQL and ScalaQuery (in Scala)
SqlStatement, ActiveRecord and many others in Ruby
HaskellDB
...the list goes on for many other languages.
I think that the underlying theme today is that rather than replace SQL with one new query language, we are instead creating language-specific frontends to hide the SQL in our regular every-day programming languages, and treating SQL as the protocol for talking to relational databases.
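To make the "frontend that hides the SQL" idea concrete, here is a minimal sketch in Python. The `Query` class and its methods are hypothetical, invented for this illustration, and are not the API of any of the libraries listed above; SQLite stands in for the relational database. The point is only that the application composes a query in its own language and SQL is what ends up on the wire.

```python
import sqlite3


class Query:
    """Hypothetical, minimal query builder; not the API of any real library."""

    def __init__(self, table):
        self.table = table
        self.columns = ["*"]
        self.conditions = []  # list of (column, value) equality filters

    def select(self, *columns):
        self.columns = list(columns)
        return self

    def where(self, **equals):
        self.conditions.extend(equals.items())
        return self

    def to_sql(self):
        """Return the SQL text plus the parameters to bind to it."""
        # Table and column names are interpolated as-is (fine for a sketch,
        # not for untrusted input); values are always bound as parameters.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in self.conditions)
        return sql, [value for _, value in self.conditions]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Paris"), ("Cid", "Oslo")],
)

# The application never writes SQL by hand; the frontend generates it.
sql, params = Query("users").select("name").where(city="Oslo").to_sql()
names = sorted(name for (name,) in conn.execute(sql, params))
print(sql, params, names)
```

Real frontends add joins, type checking and dialect handling on top of this, but the division of labour is the same: host-language expressions in, SQL text out.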
Take a look at this list.
Hibernate Query Language is probably the most common. The advantage of Hibernate is that objects map very easily (nearly automatically) to the relational database, and the developer doesn't have to spend much time doing database design. Check out the Hibernate website for more info. I'm sure others will chime in with other interesting query languages...
Of course, there's plenty of NoSQL stuff out there, but you specifically mention that you're not interested in those.
"I occasionally hear things about how SQL sucks and it's not a good language"
SQL is over thirty years old. Insights about "which features make something a 'good' language and which ones make it a 'bad' one" have evolved more rapidly than SQL itself.
Also, SQL is not a language that conforms to current standards of "what it takes to be relational", so, SQL just isn't a relational language to boot.
"but I never really hear much about alternatives to it."
I invite you to ponder the possibility that you are trying to hear only in the wrong places (that is, the commercial DBMS industry exclusively).
"So, are other good languages that serve the same purpose (database access) and what makes them better than SQL?"
Date&Darwen describe the features that a modern data manipulation language must conform to in their "Third Manifesto", the most recent version of which is laid down in their book "Databases, Types & the Relational Model".
"Are there any good databases that use this alternative language?"
If by "good", you mean something like "industrial-strength", then no. The closest thing available would probably be Dataphor.
The Rel project offers an implementation for the Tutorial D language defined in "Databases, Types & The Relational Model", but the current prime goal of Rel is to be educational in nature.
My SIRA_PRISE project offers an implementation for "truly relational" data management, but I hesitate to also label it "an implementation of a language".
And of course, you might also look into some non-relational stuff, as some have proposed, but I personally dismiss non-relational data management as multiple decades of technological regression. Not worth considering, that is.
Oh, and by the way, a software system that is used to manage databases is not "a database", but "a DataBase Management System", "DBMS" for short. Just like a photograph is not the same thing as a camera, and if you are discussing cameras, and you want to avoid confusion, then you should be using the proper word "cameras" instead of "photograph".
Perhaps you're thinking of the criticism C. Date and his friends have uttered against existing relational databases and SQL; they say the systems and language aren't 100% relational, and should be. I don't really see any real problem here; as far as I can see you can have a 100% relational system, if you want, just by disciplining the way in which you use SQL.
What I personally keep running into is the lack of expressive power SQL inherits from its theoretical basis, relational algebra. One issue is the lack of support for the use of domain ordering, which you run into when you work with data marked by dates, timestamps, etcetera. I once tried to do a reporting application entirely in plain SQL on a database full of timestamps and it just wasn't feasible. Another is the lack of support for path traversal: most of my data look like directed graphs that I need to traverse paths in, and SQL can't do it. (It lacks "transitive closure". SQL-1999 can do it with "recursive subqueries" but I haven't seen them in actual use yet. There are also various hacks to make SQL cope but they're ugly.) These problems are also discussed by some of Date's writings, by the way.
Recently I was pointed at .QL which appears to address the transitive closure issue nicely, but I don't know whether it can resolve the issue with ordered domains.
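For readers who have not seen the recursive queries mentioned above, here is a small sketch of a transitive-closure query written as a recursive common table expression. It uses Python's built-in `sqlite3` module as a stand-in DBMS; the `edge` table and its contents are made up for the example, and support for `WITH RECURSIVE` varies between products and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A hypothetical directed graph stored as an edge list.
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")],
)

# Start from the direct successors of the given node, then keep following
# edges. UNION (rather than UNION ALL) discards duplicates, so the recursion
# also terminates on graphs that contain cycles.
reachable_sql = """
WITH RECURSIVE reachable(node) AS (
    SELECT dst FROM edge WHERE src = ?
    UNION
    SELECT edge.dst FROM edge JOIN reachable ON edge.src = reachable.node
)
SELECT node FROM reachable ORDER BY node
"""
reachable_from_a = [node for (node,) in conn.execute(reachable_sql, ("a",))]
print(reachable_from_a)
```

This covers reachability only; it does nothing for the ordered-domain problem described above.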
Take a look at LINQ to SQL...
Tried it out a couple months ago and never looked back....
Direct answer: I don't think there's any serious contender out there. DBase and its imitators (Foxpro, Codebase etc) were contenders for a while, but I think they basically lost the database query language war. There have been many other database products that had their own query language, like Progress and Paradox and several others I've used whose names I don't remember and surely many more that I never heard of. But I don't think any other contender even came close to getting a non-trivial share of the market.
As simple proof that there is a difference between a database format and a query language, the last version of DBase I used -- many years ago now -- offered both the "traditional" DBase query language and SQL, both of which could be used to access the same data.
Side ramble: I wouldn't say that SQL sucks, but it has many flaws. With the benefit of the years of experience and hindsight we now have, I'm sure one could design a better query language. But creating a better query language, and convincing people to use it, are two very different things. Would it be enough better to convince people that it was worth the trouble of learning? People have invested many years of their lives learning to use SQL effectively. Even if your new language is easier to use, there would surely be a learning curve. And how would you migrate your existing systems from SQL to the new language? Etc. It can be done, of course, just like C++, C#, and Java have largely overthrown COBOL and FORTRAN. But it takes a combination of technical superiority and good marketing to pull it off.
Still, I get a chuckle out of people who rush forward to defend SQL anytime someone criticizes it, who insist that any problem you have with SQL must be your own ineptitude in using it and not any fault of SQL, that you must just not have reached the higher plane of thinking necessary to comprehend its perfection, etc. Calm down, take a deep breath: We are insulting a computer language, not your mother.
Back in the 1980's, ObjectStore provided transparent object access. It was kind of like an RDBMS plus an ORM, except without all those extra leaky abstraction layers: it stored objects directly in the database.
So this alternative was really "no language at all", or perhaps "the language you're already using". You'd write C++ code and create or traverse objects as if they were native objects, and the database took care of everything as needed. Kind of like ActiveRecord but it actually worked as well as the ActiveRecord marketing blitzes claim. :-)
(Of course, it didn't have Oracle's marketing muscle, and it didn't have MySQL's zero cost, so everybody ignored it. And now we try to replicate that with RDBMSs and ORMs, and some people try to argue that tables actually make sense for storing objects, and that writing a giant XML file to tell your computer how to map objects to tables is somehow a reasonable solution.)
I think you might be interested in looking at Dataphor, which is an open-source relational development environment with its own database server (which speaks D), and the ability to derive user interfaces from its query language.
Also, it appears Ingres still supports QUEL, and it's open source.
The general movement these days is NoSQL; generally these technologies are:
Distributed "hashtables" that store data as key/value pairs
Document-oriented databases
Personally I think there is nothing wrong with SQL as long as it fits your needs. SQL is expressive and great for working with structured data.
SQL works fine for the domain for which it was designed — interrelated tables of data. This is generally found in traditional business data processing. SQL doesn't work that well when trying to persist a complex network of objects.
If your needs are to store and process relatively traditional data, use some SQL-based DBMS.
In response to your edit:
If you're looking for alternatives to the SQL DML for retrieving data from relational data stores, I've never heard of any serious alternative to SQL.
The knocks SQL gets are not, I think, so much against the language as against the underlying data storage principles on which the language is based. People often confuse the language SQL with the relational data model on which RDBMSes are built.
Relational databases are not the only kind of databases around. I should say a word about object databases, as I haven't seen them in responses from others. I had some experience with the Zope Python framework, which uses ZODB for object persistence instead of an RDBMS (well, it's theoretically possible to replace ZODB with another database within Zope, but the last time I checked I didn't manage to get it working, so I can't be positive about that).
The ZODB mindset is really different, more like object programming that would happen to be persistent.
ORM can be seen as a kind of language
In a way I believe the object-database model is what ORMs are about: accessing persistent data through your usual object model. It's a kind of language and it's gaining some market share, but for now we don't see it as a language but as an abstraction layer. However, I believe it would be much more efficient to use an ORM over an object database than over SQL (in other words, the performance of the ORMs I happened to use with some SQL database as the base layer sucked).
There are many implementations of SQL (SQL Server, MySQL, Oracle, etc.), but there is no other language that serves the same purpose in the sense of being a general purpose language designed for relational data storage and retrieval.
There are object databases such as db4o, and there are similar so-called NoSQL databases, a term that refers to just about any data storage mechanism that doesn't rely on SQL, but most commonly to open-source products like Cassandra, based loosely on Google's Bigtable concept.
There are also a number of special-purpose database products like CDF, but you probably don't need to worry about those - if you need one, you'll know.
None of these are equivalent to SQL.
That doesn't mean they're "better" or "worse" - they're just not the same. Dennis Forbes wrote a great post recently breaking down a number of the strange claims surfacing against SQL. He maintains (and I agree) that these complaints originate largely from people and shops who have either picked the wrong tool for the job in the first place, or aren't using their SQL DBMS properly (I'm not even surprised anymore when I see another SQL database where every column is a varchar(50) and there's not a single index or key, anywhere).
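As a rough illustration of the "not a single index or key" complaint, the sketch below asks SQLite (via Python's built-in `sqlite3` module) for its query plan before and after adding an index. The `users` table, its columns and the index name are made up for the example, and the exact wording of the plan differs between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return SQLite's plan for a query as one string (last column = detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT name FROM users WHERE email = ?"

before = plan(lookup, ("user42@example.com",))  # no index on email: full scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(lookup, ("user42@example.com",))   # the planner can use the index

print(before)
print(after)
```

The query text is identical in both cases; only the schema changed, which is the sense in which these complaints are about how the DBMS is used rather than about SQL.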
If you are implementing yet another social networking site and aren't too concerned with ACID principles, by all means start looking into products such as db4o. If you are developing a mission-critical business system, however, I highly highly recommend that you think twice before joining the "SQL sucks" chorus. Do the research first, find out what features the various products can and cannot support.
Edit - I was busy writing my answer and didn't see the question update from a few minutes ago. Having said that, SQL is essentially inseparable from the DBMS itself. If you run a SQL database product, then you access it with SQL, period.
Perhaps you are looking for abstractions over the syntax; Linq to SQL, Entity Framework, Hibernate/NHibernate, SubSonic, and a host of other ORM tools all provide their own SQL-like syntax that is not quite SQL. All of these "compile down" to SQL. If you run SQL Server, then you can also write CLR Functions/Procedures/Triggers, which allows you to write code in any .NET language that will run inside the database; however, this isn't really a substitute for SQL, more of an extension to it.
I'm not aware of any full "language" that you can layer on top of a SQL database; short of switching to a different database product, you're eventually going to see SQL on the pipe.
SQL is de-facto.
Frameworks that try to shield developers from it have eventually created their own specific language (Hibernate HQL comes to mind).
SQL solves a problem fairly well. It is no more difficult to learn than a high level programming language. If you already know a functional language then it is a breeze to grasp SQL.
Considering the leading database vendors providing state of the art databases (Oracle and SQL Server) support SQL and have invested years into optimization engines, etc. and all leading data modelling software and change management software deals in SQL, I'd say it is the safest bet.
Also, there is more to a database than just queries. There is scalability, backup and recovery, data mining. The big vendors support a lot of things that even the new "cache" engines don't even consider.
Problems with SQL have motivated me to cook up a draft query language called SMEQL over at the Portland Pattern Repository wiki. Comments Welcome. It borrows ideas from functional programming and IBM's experimental Business System 12 language. (I originally called it TQL, but found later that name was taken.)
Within the .NET world, while it still has a SQL-esque feel to it, LINQ-to-SQL will allow you to have a good mix of SQL and in-memory .NET processing of your data. It also simplifies a lot of the lower-level data plumbing that nobody really wants to do.
If you want to see a type of database with a completely different mindset, take a look at CouchDB. "Better" is obviously a relative requirement, and this sort of non-relational database is "better" only in certain scenarios.
SQL the language is very powerful, and relational database management systems have been and still are a huge success. But there is a class of application that requires very high scalability and availability, but not necessarily a high degree of data consistency (eventual consistency is what matters). A variety of systems get better performance and scaling than an RDBMS by relaxing the need for full ACID-compliant transactions. These have been named "NoSQL", but as others point out, this is a misnomer: perhaps they should be called NoACID databases.
Michael Stonebraker covers this in The "NoSQL" Discussion has Nothing to Do With SQL.
Can anyone recommend a good ANSI SQL reference manual?
I don't necessarily mean a tutorial but a proper reference document to look up when you need either a basic or more in-depth explanation or example.
Currently I am using W3Schools SQL Tutorial and SQL Tutorial which are ok, but I don't find them "deep" enough.
Of course, each major RDBMS producer will have some sort of reference manual targeting their own product, but they tend to be biased and will sometimes use proprietary extensions.
EDITED: The aim of the question was to focus on the things database engines have in common i.e. the SQL roots. But understanding the differences can also be a positive thing - this is quite interesting.
Here's the ‘Second Informal Review Draft’ of SQL:1992, which seems to have been accurate enough for everything I've looked up. 1992 covers most of the stuff routinely used across DBMSs.
SQL isn't like C or Java, where there is a standard for the language, and then a number of companies and organizations are implementing the language as best they can, following the standard.
Instead, the major databases came before the SQL standard, and the standard is a sort of compromise where every database vendor wanted to get their particular dialect and features in the standard.
Therefore, there is much more variety between databases than between typical programming language compilers, and to use a database, you really need to know that particular SQL dialect.
Having said that, I've got Gulutzan and Pelzer's SQL-99 Complete, Really here in my bookshelf. It is a good book if you need to know what the standard really contains. (And yes, there is a newer version since SQL-99, but no one seems to care.)
EDIT: Actually, there is not just one newer version since SQL-99, but three: SQL:2003, SQL:2006, and SQL:2008. And still no one seems to care. Actually, many don't even care about SQL-99, so SQL-92 is still, in a way, "the standard".
ANSI documents can all be purchased from -- you guessed it -- ANSI.
http://webstore.ansi.org/
The main problem with an ANSI SQL reference manual is that you can't find a DB which implements it. And even where one does, you'll find that ANSI SQL can't solve some of the daily problems. Which is why all professional databases define extensions.
So at work, you'll need a reference manual for the specific version of the database which you use.
This reminds me of my 2nd year university course where we learned relational theory instead of SQL.
Read a good book on Relational Theory. Database theory and practice have evolved since Edgar Codd originally defined the relational model back in 1969. Independent of any SQL products, SQL and Relational Theory draws on decades of research to present the most up-to-date treatment of the material available anywhere. Anyone with a modest to advanced background in SQL will benefit from the many insights in this book.
O'Reilly, January 2009
I found the best way to learn SQL was to actually get to writing queries and understanding the nature of joins/conditionals etc. I found this link with a lot of DIY examples to be the best : http://sqlzoo.net/
It's a little outdated, but this book is really helpful in looking at how the different vendors implement things. I believe it includes the ANSI standard.
http://www.amazon.com/SQL-Nutshell-2nd-Kevin-Kline/dp/0596004818/ref=sr_1_1?ie=UTF8&s=books&qid=1257963172&sr=8-1
I really like just about anything Joe Celko has written: Celko's Books
I think this may be helpful to you.
Understanding the ANSI SQL standard
By: Kevin Kline
http://www.amazon.com/gp/product/1565927443/102-0105946-4028970?v=glance&n=283155
The DevGuru resources always worked well for me:
http://www.devguru.com/technologies/t-sql/home.asp
Although I must admit it's not strictly an 'ANSI' focused resource. I've always been MS SQL centric, and it was helpful to me when I was starting out. IMHO your best bet will be to use several resources, specifically including at least one for each DB platform you want to use.
To Quote the DevGuru intro for their T-SQL resource:
Although there are standards for SQL, such as ANSI SQL92 and SQL99, most databases use their own dialect and/or extensions. Microsoft's flavor of SQL used in SQL Server 7 and SQL Server 2000 is called T-SQL. While many of the examples in this quick reference may work on other databases, it is assumed that SQL Server 2000 is used, especially for advanced topics such as stored procedures.
I know that most SQL server software allows you to do "an Update on a Join", but I am wondering, is this in the SQL standards?
(eg. can I assume that any software package allows this?)
Note: I am asking this because I am writing a database library that should be easily extensible to database software that is not included in the original build. As such there's no point in answering with a remark such as "a, b, c and d all allow that - together they make up the lion's share of the market, so you can assume that all software packages allow that". No, I am interested in whether it is in the standards or not.
If I understand the question right, I think the answer is no, there is no standard "update based on a join". The postgres manual page for UPDATE includes this under "Compatibility":
This command conforms to the SQL standard, except that the FROM and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with UPDATE.
Some other database systems offer a FROM option in which the target table is supposed to be listed again within FROM. That is not how PostgreSQL interprets FROM. Be careful when porting applications that use this extension.
While this doesn't explicitly say there isn't, the Compatibility notes in that manual generally note when there is a related, but not identical, feature in the standard. What's more, the mention of other systems with different behaviour demonstrates that if there is a standard, you can't rely on it anyway.
According to the ANSI SQL-92 standard, an UPDATE on JOINed tables is NOT part of the standards; See http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt sections 13.9 and 13.10 (you'll have to search for 391, the page number).
I tried to find an ANSI 2003 standard, but the closest I came was here: www.wiscorp.com/sql_2003_standard.zip (a late draft). There was no substantial difference between the two in regards to the UPDATE statement and JOIN syntax.
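Since a joined UPDATE is not in the standard, a library that has to stay portable can usually express the same thing with correlated subqueries. The sketch below is one way to do that, run against SQLite through Python's built-in `sqlite3` module. The `product` and `price_change` tables are invented for the example, and while the statement uses only the plain searched-UPDATE form as far as I know, it is worth checking against each target DBMS rather than assuming it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: current prices, plus a table of pending changes.
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, price NUMERIC);
CREATE TABLE price_change (product_id INTEGER PRIMARY KEY, new_price NUMERIC);
INSERT INTO product VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO price_change VALUES (1, 12), (3, 25);
""")

# Instead of "UPDATE product ... FROM/JOIN price_change" (a vendor extension),
# look up the new value with a correlated scalar subquery. The WHERE EXISTS
# guard keeps rows without a pending change from being set to NULL.
conn.execute("""
UPDATE product
SET price = (SELECT new_price FROM price_change
             WHERE price_change.product_id = product.id)
WHERE EXISTS (SELECT 1 FROM price_change
              WHERE price_change.product_id = product.id)
""")

prices = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
print(prices)
```

The cost is that the lookup is written twice and may be planned less efficiently than a vendor-specific joined UPDATE.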
Stu
You're presuming that all software packages adhere to ANSI SQL Standards.....in reality, none of them that I'm aware of adhere completely to the standards.
If you're looking to adhere to ANSI SQL standards, the best place to start would be with the documented standards themselves. Here's the SQL-92 document:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Careful now, folks. Writing truly portable code is much more difficult than you would imagine and you also have to be willing to give up a lot in the areas of performance, ease of coding/maintenance, and readability. Just declare and use one variable in, say, SQL Server and your code is no longer truly portable. Write an audit trigger and I can guarantee that your trigger won't be portable between Oracle, SQL Server, and several other popular engines. And it shouldn't really matter, because it's not actually rocket science in any RDBMS (well, except maybe for writing a joined UPDATE in Oracle without using MERGE {which is standard but not portable, yet}).
Also, don't forget there are two basic types of SQL. That which supports the single row nature of most front-end code and that of batch code. If you really want your batch code to perform well, you'll use many of the "proprietary extensions" to the database engine you're using to efficiently process sometimes billions of rows overnight... the same night. ;-)
Be careful when aiming at writing code for "true" portability. You might end up with a tangled mess that's a whole lot slower than you might have ever imagined.