Most SQL databases follow the ANSI SQL standards to a degree, but:
The standard is ambiguous, leaving some areas open to interpretation (e.g. how different operations on NULLs should be handled)
Some vendors contradict the standard outright or just lack functionality defined by the standard (e.g. MySQL has a list of differences between the standard and their implementation)
Some databases behave differently depending on how they are configured, although configuration can be changed to make them behave the same way (e.g. Oracle performs case-sensitive string comparisons by default, while SQL Server does them case-insensitively)
There is some functionality that is not part of the standard but is implemented by different RDBMSs anyway, albeit under different names (e.g. Oracle's LISTAGG = MySQL's GROUP_CONCAT)
Is there a resource with a comprehensive list of quirks and gotchas to pay attention to when you are trying to write something that is supposed to be compatible with multiple databases?
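To make two of the points above concrete (NULL handling and configurable string comparison), here is a minimal sketch using SQLite through Python's built-in sqlite3 module. SQLite is only chosen because it ships with Python; the helper name scalar is made up for the example, and other engines may well answer differently, which is the whole point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Comparing NULL with NULL yields NULL (None in Python), not true.
null_equals_null = scalar("SELECT NULL = NULL")

# SQLite compares text case-sensitively by default (BINARY collation) ...
default_compare = scalar("SELECT 'abc' = 'ABC'")

# ... but an explicit collation makes the same comparison case-insensitive.
nocase_compare = scalar("SELECT 'abc' = 'ABC' COLLATE NOCASE")

print(null_equals_null, default_compare, nocase_compare)  # None 0 1
```

The same two string literals compare as different or equal depending only on the collation in effect, which is the kind of configuration-dependent behaviour described for Oracle and SQL Server.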
I'm not sure how comprehensive this list is, but maybe this will help -
http://troels.arvin.dk/db/rdbms/
Besides the one already mentioned, you can find a comparison on Wikipedia.
A similar question was also posted on Stack Overflow, where you can find a couple of useful links.
I just wanted to know if there is an SQL standard compliance validator out there for Visual Studio 2019 Professional (something that could be set to strict, so that only absolutely compliant syntax would be accepted). It would be nice if it had support for native languages too, but I'm used to that kind of tooling being CLR-only (I don't really know why; probably because of linking, but that is only a guess and I may be completely wrong).
Something important would be that it needs to be standard compliant, not only SQL-server compliant. What is not in the standard is an error.
The goal is to make SQL code that is completely independent of the DBMS. Thank you for taking the time to read my question.
"The goal is to make SQL code that is completely independent of the DBMS."
Impossible goal, unless you are going to forsake writing SQL at all. It is perhaps sad, but different databases differ on very fundamental things, picking and choosing the parts of the standard they want. Happily, the major things like SELECT, JOIN and GROUP BY are common but the details are not.
You can think of them as dialects of a spoken language that vary over time and region. I'm most familiar with English, but it is true that all languages evolve and change. I can read Shakespearean English, but I am not going to write English like that. It would be grammatically incorrect in some cases, and it would use unknown words and alternative meanings of common words.
Here are just some examples of some features that differ widely among databases:
Intervals. In standard syntax, adding an interval to a date looks like some_date + INTERVAL '1' DAY. This varies significantly across databases.
Some databases do not support FULL JOIN.
Some databases do not support recursive CTEs. Some use the recursive keyword; some do not.
Some databases do not support the VALUES() constructor in the FROM clause.
Some databases allow the FROM clause to be optional.
The standard has nifty functionality, such as FILTER and aggregation by functionally dependent ids, that few databases support.
Limitations on data types vary significantly -- what is the longest string, for instance.
The standard uses FETCH to limit results, which some databases do not support.
Parsing strings into dates and formatting dates into strings is totally database-dependent.
Extracting date/time components uses extract() in the standard, but not every database supports it (SQL Server, for example, uses DATEPART() instead).
These are just a few of the differences off the top of my head -- in no way meant to be complete or even the most important. I just want to point out that what you want to do is not possible.
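As a small illustration of one item from the list above (limiting results), here is a sketch that runs against SQLite via Python's sqlite3 module. The table t and the ROW_LIMIT_SYNTAX dictionary are made up for the example, and the dictionary is a reminder of common spellings, not a complete reference. Whether the standard FETCH form is accepted depends on the engine and version, so the code probes for it instead of assuming an answer.

```python
import sqlite3

# Common spellings of "give me the first two rows"; illustrative only.
ROW_LIMIT_SYNTAX = {
    "standard": "SELECT n FROM t ORDER BY n FETCH FIRST 2 ROWS ONLY",
    "mysql": "SELECT n FROM t ORDER BY n LIMIT 2",
    "sqlite": "SELECT n FROM t ORDER BY n LIMIT 2",
    "sqlserver": "SELECT TOP 2 n FROM t ORDER BY n",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t (n) VALUES (?)", [(1,), (2,), (3,)])

# SQLite's own spelling works as expected.
limited = conn.execute(ROW_LIMIT_SYNTAX["sqlite"]).fetchall()

# Probe whether this SQLite build also accepts the standard spelling.
try:
    conn.execute(ROW_LIMIT_SYNTAX["standard"]).fetchall()
    standard_fetch_supported = True
except sqlite3.OperationalError:
    standard_fetch_supported = False

print(limited)
print("standard FETCH accepted:", standard_fetch_supported)
```

On the SQLite builds I have seen, the standard form is rejected with a syntax error, so "write to the standard" would fail here even for something as basic as limiting rows.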
Everybody loves to mention how JDBC abstracts away vendor-specific differences between SQLs to present a single SQL flavor that would work against a whole slew of them.
But no book or reference on JDBC ever mentions a (detailed) specification or even a decent, user-space coverage of this SQL supported by (a specific version of) JDBC, say JDBC 4.1!
So, what ends up happening (at least with me) is that, if I'm working with MySQL, I must refer to the MySQL reference manual and then try to guard myself against accidentally using MySQL-specific features. For writing portable SQL (at least at the level supported by the JDBC driver version I'm using), I would rather refer to a JDBC spec or to an SQL spec directly instead of referring to MySQL, PostgreSQL, etc.
Is the SQL standard itself (2008, 2003, etc), on which a particular version of JDBC is based, freely available? Or, do I have to shell out $$ to get a copy?
There is no "JDBC SQL", just ISO SQL and the vendor implementations of it. JDBC defines the interface for talking to SQL databases; it's a different layer from the query language itself.
The reference for JDBC itself is the JSR documentation:
JDBC 4.0
JDBC 4.1
Unfortunately the official SQL standards are expensive and must be purchased from the ISO.
You can find late-stage drafts that're perfectly good for reference when you're not trying to develop a conforming implementation here among other places.
The SQL spec isn't the most friendly and readable of things, so in practice it's a good idea to use vendor documentation that's actually intended to be read by human beings. You can compare a couple of vendor docs or fall back on the standard doc when uncertainty arises.
Compliance with the spec is far from uniform across DBs; writing code strictly to the spec doesn't necessarily mean it'll actually work. For example, MySQL didn't implement window functions or common table expressions until version 8.0; PostgreSQL doesn't implement SQL/PSM (instead offering PL/pgSQL) and only gained the CALL statement in version 11; most vendors use different ways of specifying auto-increment columns or sequence generators; and so on.
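To show what "different ways of specifying auto-increment columns" looks like in practice, here is a rough sketch. The AUTO_ID_COLUMN mapping, the create_table_sql helper and the item table are all made up for illustration, and the fragments are the spellings I know of for each product, not an exhaustive list. Only the SQLite variant is actually executed, because SQLite ships with Python.

```python
import sqlite3

# One way to declare an auto-generated integer key in each dialect.
AUTO_ID_COLUMN = {
    "standard": "id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY",
    "mysql": "id INT AUTO_INCREMENT PRIMARY KEY",
    "postgresql": "id SERIAL PRIMARY KEY",
    "sqlserver": "id INT IDENTITY(1,1) PRIMARY KEY",
    "sqlite": "id INTEGER PRIMARY KEY AUTOINCREMENT",
}


def create_table_sql(dialect, table="item"):
    """Build a CREATE TABLE statement using the dialect's key syntax."""
    return f"CREATE TABLE {table} ({AUTO_ID_COLUMN[dialect]}, name VARCHAR(50))"


conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("sqlite"))
conn.executemany("INSERT INTO item (name) VALUES (?)", [("a",), ("b",)])

# The ids were never supplied; the engine generated them.
ids = [row[0] for row in conn.execute("SELECT id FROM item ORDER BY id")]
print(ids)  # [1, 2]
```

The rest of the CREATE TABLE statement is identical everywhere; a single column definition is enough to tie the script to one product.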
Please don't use the w3schools SQL guides; they're severely outdated, wrong, fail to differentiate between vendor extensions and the standard, and should generally be avoided. I mention them because w3schools tends to come up quite high in search rankings - back in the day they used to actually be useful.
You can download the JDBC 4.1 specification from http://download.oracle.com/otndocs/jcp/jdbc-4_1-mrel-spec/index.html but this only covers JDBC itself, not SQL. The specification is more a description of the interface; it does expect databases to support some level of the SQL standards, but when it comes to requirements on queries, don't expect to find more than a reference to the SQL standard.
You usually need to use the database-specific SQL anyway, because even though there is an SQL standard, database vendors don't implement it to the letter. JDBC itself defines some escapes to bridge the gaps, but as far as I know, they are hardly ever used. Drivers also - usually - don't translate standard SQL to database-specific SQL if the database doesn't support the standard SQL.
If you want to look at the official SQL standard, you need to buy it from ISO or your country-specific ISO representative. That said, with some searching you can find and download draft versions of the specification for free. I am not sure how helpful that is though, as the SQL standard documents are not intended as a reference manual; they are meant to be a formal description and go deep into details that are only relevant to an implementer.
I'm looking for an SQL implementation (and an editor for it) whose scripts can be translated into many other SQL dialects.
For example, I would write my script files in that SQL language and then translate them into script files for other SQL dialects (e.g. MS SQL, MySQL, ...).
If you're sure to use only ANSI SQL to construct your scripts, you should be good to go.
I agree with @Justin Niessner: all SQL vendors pay attention to the SQL standards, notably core SQL-92. To take SQL Server as an example: although they find Sybase legacy code tricky to deprecate, they are not afraid to do so, and entirely new features (e.g. MERGE in SQL Server 2008) tend to extend their Standard SQL equivalents rather than reinvent the wheel.
For a product that has good Standards compliance, take a look at Mimer
Here at Mimer Information Technology, we pride ourselves on conforming to the SQL standard and we play an active role in the Database Languages standardization group which determines exactly what is SQL standard.
Mimer also provides extremely useful SQL validators for SQL-92, SQL-99 and SQL:2003 respectively.
I was researching the same thing a while ago and found a project called Liquibase. It is aimed at change tracking, but it also handles converting between different DBMSs. You can download the source code and see how data types are converted across databases. The source is on GitHub; browse the Java files there and you will probably find something helpful.
If all you want are basic operations, these are fairly universal. For instance:
SELECT
INSERT
DELETE
UPDATE
FROM
WHERE
JOIN
...are all at the most basic level the same across implementations.
However, the more complicated your scripts get, the more difficult it becomes to make them "universal". Things like aggregation, subqueries, cursors, while loops, functions, indexes, constraints, temp tables, variables, string manipulation, window operations etc. are all pretty much database-specific.
Some of these do have "universal" equivalents but the more generic you make your code the worse it will perform.
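For what it's worth, here is a sketch of a script that sticks to the basic statements listed above. The author and book tables are invented for the example. I have only run it against SQLite through Python's sqlite3 module; I would expect plain statements like these to be accepted by the other major engines too, but that is an expectation, not something this snippet proves.

```python
import sqlite3

# Only plain INSERT / UPDATE / DELETE / SELECT with an inner JOIN:
# no variables, no vendor functions, no engine-specific types.
BASIC_STATEMENTS = [
    "CREATE TABLE author (id INTEGER PRIMARY KEY, name VARCHAR(50))",
    "CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title VARCHAR(100))",
    "INSERT INTO author (id, name) VALUES (1, 'Ann')",
    "INSERT INTO author (id, name) VALUES (2, 'Bob')",
    "INSERT INTO book (id, author_id, title) VALUES (10, 1, 'First edition')",
    "UPDATE book SET title = 'Second edition' WHERE id = 10",
    "DELETE FROM author WHERE id = 2",
]

conn = sqlite3.connect(":memory:")
for statement in BASIC_STATEMENTS:
    conn.execute(statement)

rows = conn.execute(
    "SELECT a.name, b.title "
    "FROM author a JOIN book b ON b.author_id = a.id "
    "WHERE a.id = 1"
).fetchall()
print(rows)  # [('Ann', 'Second edition')]
```

As soon as the script needs a generated key, a date calculation or a row limit, it leaves this common ground.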
Can anyone recommend a good ANSI SQL reference manual?
I don't necessarily mean a tutorial but a proper reference document to look up when you need either a basic or more in-depth explanation or example.
Currently I am using W3Schools SQL Tutorial and SQL Tutorial which are ok, but I don't find them "deep" enough.
Of course, each major RDBMS producer will have some sort of reference manuals targeting their own product, but they tend to be biased and will sometimes use proprietary extensions.
EDITED: The aim of the question was to focus on the things database engines have in common i.e. the SQL roots. But understanding the differences can also be a positive thing - this is quite interesting.
Here's the ‘Second Informal Review Draft’ of SQL:1992, which seems to have been accurate enough for everything I've looked up. 1992 covers most of the stuff routinely used across DBMSs.
SQL isn't like C or Java, where there is a standard for the language, and then a number of companies and organizations are implementing the language as best they can, following the standard.
Instead, the major databases came before the SQL standard, and the standard is a sort of compromise where every database vendor wanted to get their particular dialect and features in the standard.
Therefore, there is much more variety between databases than between typical programming language compilers, and to use a database, you really need to know that particular SQL dialect.
Having said that, I've got Gulutzan and Pelzer's SQL-99 Complete, Really here on my bookshelf. It is a good book if you need to know what the standard really contains. (And yes, there is a newer version since SQL-99, but no one seems to care.)
EDIT: Actually, there is not just one newer version since SQL-99, but three: SQL:2003, SQL:2006, and SQL:2008. And still no one seems to care. Actually, many don't even care about SQL-99, so SQL-92 is still, in a way, "the standard".
ANSI documents can all be purchased from -- you guessed it -- ANSI.
http://webstore.ansi.org/
The main problem with an ANSI SQL reference manual is that you can hardly find a DB which implements it. And where one does, you'll find that ANSI SQL can't solve some of the daily problems, which is why all professional databases define extensions.
So at work, you'll need a reference manual for the specific version of the database which you use.
This reminds me of my 2nd year university course where we learned relational theory instead of SQL.
Read a good book on Relational Theory. Database theory and practice have evolved since Edgar Codd originally defined the relational model back in 1969. Independent of any SQL products, SQL and Relational Theory draws on decades of research to present the most up-to-date treatment of the material available anywhere. Anyone with a modest to advanced background in SQL will benefit from the many insights in this book.
O'Reilly, January 2009
I found the best way to learn SQL was to actually get to writing queries and understanding the nature of joins/conditionals etc. I found this link with a lot of DIY examples to be the best : http://sqlzoo.net/
It's a little outdated, but this book is really helpful for looking at how the different vendors implement things; I believe it also covers the ANSI standard.
http://www.amazon.com/SQL-Nutshell-2nd-Kevin-Kline/dp/0596004818/ref=sr_1_1?ie=UTF8&s=books&qid=1257963172&sr=8-1
I really like just about anything Joe Celko has written: Celko's Books
I think this may be helpful to you.
Understanding the ANSI SQL standard
By: Kevin Kline
http://www.amazon.com/gp/product/1565927443/102-0105946-4028970?v=glance&n=283155
The DevGuru resources always worked well for me:
http://www.devguru.com/technologies/t-sql/home.asp
Although I must admit it's not strictly an 'ANSI' focused resource. I've always been MS SQL centric, and it was helpful to me when I was starting out. IMHO your best bet will be to use several resources - specifically including at least one for each DB platform you want to use.
To quote the DevGuru intro for their T-SQL resource:
Although there are standards for SQL, such as ANSI SQL92 and SQL99, most databases use their own dialect and/or extensions. Microsoft's flavor of SQL used in SQL Server 7 and SQL Server 2000 is called T-SQL. While many of the examples in this quick reference may work on other databases, it is assumed that SQL Server 2000 is used, especially for advanced topics such as stored procedures.
I know that most SQL server software allows you to do "an UPDATE on a JOIN", but I am wondering, is this in the SQL standards?
(i.e. can I assume that any software package allows this?)
Note: I am asking this because I am writing a database library that should be easily extensible to database software that is not included in the original build. As such there's no point in answering with a remark such as "a, b, c and d all allow that - together they make up the lion's share of the market, so you can assume that all software packages allow that". No, I am interested in whether it is in the standards or not.
If I understand the question right, I think the answer is no, there is no standard "update based on a join". The postgres manual page for UPDATE includes this under "Compatibility":
This command conforms to the SQL standard, except that the FROM and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with UPDATE.
Some other database systems offer a FROM option in which the target table is supposed to be listed again within FROM. That is not how PostgreSQL interprets FROM. Be careful when porting applications that use this extension.
While this doesn't explicitly say there isn't, the Compatibility notes in that manual generally note when there is a related, but not identical, feature in the standard. What's more, the mention of other systems with different behaviour demonstrates that if there is a standard, you can't rely on it anyway.
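If the goal is to stay away from the FROM extension, the usual workaround is a correlated subquery in the SET clause, guarded by EXISTS so that rows without a match are left alone. As far as I can tell this form is within the standard UPDATE syntax, but treat that as my reading, not a guarantee for every engine. A minimal sketch, run against SQLite via Python's sqlite3 module, with invented product and new_price tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE new_price (product_id INTEGER PRIMARY KEY, price INTEGER);
INSERT INTO product (id, price) VALUES (1, 100);
INSERT INTO product (id, price) VALUES (2, 200);
INSERT INTO new_price (product_id, price) VALUES (1, 150);
""")

# No join in the UPDATE itself: the "joined" value comes from a
# correlated subquery, and EXISTS restricts the update to matched rows.
conn.execute("""
UPDATE product
SET price = (SELECT np.price
             FROM new_price np
             WHERE np.product_id = product.id)
WHERE EXISTS (SELECT 1
              FROM new_price np
              WHERE np.product_id = product.id)
""")

prices = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
print(prices)  # [(1, 150), (2, 200)]
```

Without the WHERE EXISTS guard, product 2 would have its price set to NULL, because the subquery returns no row for it.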
According to the ANSI SQL-92 standard, an UPDATE on JOINed tables is NOT part of the standard; see http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt sections 13.9 and 13.10 (you'll have to search for 391, the page number).
I tried to find an ANSI 2003 standard, but the closest I came was here: www.wiscorp.com/sql_2003_standard.zip (a late draft). There was no substantial difference between the two in regards to the UPDATE statement and JOIN syntax.
Stu
You're presuming that all software packages adhere to ANSI SQL standards... In reality, none of them that I'm aware of adhere completely to the standards.
If you're looking to adhere to ANSI SQL standards, the best place to start would be with the documented standards themselves. Here's the SQL-92 document:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Careful now, folks. Writing truly portable code is much more difficult than you would imagine and you also have to be willing to give up a lot in the areas of performance, ease of coding/maintenance, and readability. Just declare and use one variable in, say, SQL Server and your code is no longer truly portable. Write an audit trigger and I can guarantee that your trigger won't be portable between Oracle, SQL Server, and several other popular engines. And it shouldn't really matter, because it's not actually rocket science in any RDBMS (well, except maybe for writing a joined UPDATE in Oracle without using MERGE {which is standard but not portable, yet}).
Also, don't forget there are two basic types of SQL: that which supports the single-row nature of most front-end code, and batch code. If you really want your batch code to perform well, you'll use many of the "proprietary extensions" to the database engine you're using to efficiently process sometimes billions of rows overnight... the same night. ;-)
Be careful when aiming at writing code for "true" portability. You might end up with a tangled mess that's a whole lot slower than you might have ever imagined.