Design patterns to ease migration of SQL code between vendors - sql

I am looking at ways which a developer can take advantage of implementation specific advanced features(T-SQL, PL-SQL etc) and at the same time makes it easy to migrate the ANSI compliant SQL code to a new database. By that I mean, how to efficiently abstract the ANSI compliant SQL code and move it across different databases?

I use the Mimer SQL validators:
Check Your SQL Against the SQL-92
Standard. Mimer SQL-92 Validator
recognizes the standards Entry,
Transitional, Intermediate and Full
SQL-92, Persistent Stored Modules
(PSM) SQL-99 and Database Triggers
Copy and paste SQL text into the webpage and hit the 'Test SQL' button.
..though I'm not sure that is the kind of "efficient" process you are looking for, it may of some use.

Related

SQL in Access and SQL in MS SQL Server

I have heard that SQL is mostly the same from program to program, but there are some differences. I am wondering if there are any differences in SQL between Access (2007 if it matters) and MS SQL Server? I wonder because I regularly use Access and want to learn SQL from a book, and I wonder if a book using MS SQL Server will serve my purposes? I am considering "Access 2007 Pure SQL" and "Beginning SQL Joes 2 Pros", the second of which uses MS SQL Server. Thanks for any help!
There's multiple differences, even down to simple things like the string concatenation operator. Access uses &, SQL Server uses +. SQL is like English. There's British English, Canadian English, American English, Australian English, etc... Multiple dialects, mostly but not totally compatible with each other.
That's not to say that things are totally imcompatible - learning SQL on any DBMS is of use, because the core concepts of relational databases remain the same regardless of which DBMS you're on. It's just how you interface with them that's different.
MS Access uses JET SQL while SQL Server uses Transact SQL. For the most part, they are very similar. SQL in general is a programming language designed for managing data in relational database management systems. So all the flavors feature a common subset. But there are differences too. For more info, refer to this article on Convert Microsoft Access (JET SQL) to SQL Server (T-SQL) Cheatsheet. There are numerous other resources on web, but this should give you a quick picture of some differences.
I would say that Access SQL and T-SQL (SQL Server) have more differences than similarities. Any appearance of similarity are due to 1) both being based on the SQL-89 Standard (but both T-SQL and the Standards have moved on greatly, Access not so), 2) the SQL Server team tried but failed to make Access2000 (Jet 4.0) compliant with entry level SQL-92 Standard (the de facto "bare minimum" Standard).
Take for example the UPDATE statement. In its simplest form, i.e. involving a literal or input parameter (scalar) values, the two broadly are the same. However, when updating one table using the values from another table, the latest T-SQL syntax (2008) supports the SQL-92 scalar subquery syntax, the SQL-99 and SQL:2003 Standards' MERGE syntax with useful proprietary extensions, plus its older proprietary UODATE..FROM syntax (which should be avoided nowadays because it allows potentially ambiguous results), all of which can optionally use SQL:2003 common table expressions (useful for simplifying the SQL-92 scalar subquery syntax).
For Access you are compelled to use its proprietary UPDATE..FROM syntax, which is not the same as the T-SQL proprietary UPDATE..FROM syntax but has the same problem of allowing potentially ambiguous results (but this time cannot be avoided!), unless the query involves aggregated values in which case you cannot use SQL at all (!!) and must resort to client side (non-SQL) procedural code (because Access does not support procedural SQL code, another huge difference from T-SQL).

Which language has good SQL parsing library?

I'm looking for good SQL parser. One that will work with subselects, non-select queries, CTE, window functions and other legal SQL elements.
Result would be some kind of abstract syntax tree, that I could later on work on.
Language is mostly irrelevant, as I am willing to learn new language just to use the library, if it exists.
I know that it is technically possible to extract parser from some open source database, but it's far from easy (at least for the parser of PostgreSQL which is what I need).
There's a non-validating SQL parser in Python: python-sqlparse. The tokens are exposed as objects. I doubt if they support "other legal SQL statements", window functions, and the like though as those are controlled by vendor specific grammars and no vendor is technically fully compliant with SQL standards.
Um (knowing that you're willing to learn a new language), why would you need to work on the syntax tree? If you need some magic in dealing with the database, probably you don't need to reinvent the wheel: Python got a fantastic database toolkit - SQL ALchemy.
You can google "sql parser". This is the one that listed: General SQL Parser Here are some highlighted features listed on official website:
Offline SQL syntax check
Highly customizable SQL formatter
In-depth analysis of SQL script
Fully access to SQL query parse tree
Custom SQL engine for various databases
Major programming language support
It's a commercial SQL library.
Our DMS Software Reengineering Toolkit has PL/SQL and ANSI SQL 2011 full parsers (to ASTs) and prettyprinters (ASTs back to valid text). Neither of these are PostGres SQL, but DMS has a dialect mechanism that enables one to relatively easily build a dialect from a base grammar, by revising just some of the grammar rules and retaining the rest. Doing this from the SQL 2011 grammar seems like a practical way to tackle the problem.
DMS also offers facilities to access/traverse/modify the ASTs, both procedurally and in terms of surface-syntax patterns and transformations. Think of this as "life beyond parsing".

Which SQL Implementation can translate to many other(s)?

I'm looking for a SQL Implementation (and its Editor) that can be used for translating it to many other(s) SQL Languages.
For example, when i code in that SQL Language to script file(s), and then i translate to other(s) SQL Language script file(s) (for ex: MS SQL's , MySQL's , ...).
If you're sure to use only ANSI SQL to construct your scripts, you should be good to go.
I agree with #Justin Niessner: all SQL vendors pay attention to the SQL Standards, notably core SQL-92. To take SQL Server as an example, although they find Sybase legacy code is tricky to deprecate they are not afraid to do so and entirely new features (e.g. MERGE in MSSQL2008) tend to extend their Standard SQL equivalents, rather than reinventing the wheel.
For a product that has good Standards compliance, take a look at Mimer
Here at Mimer Information Technology, we pride ourselves on conforming
to the SQL standard and we play an active role in the Database
Languages standardization group which determines exactly what is SQL
standard.
Mimer also provide extremely useful SQL validators for SQL-92, SQL-99 and SQL:2003 respectively.
I've been researching the same thing a while ago. What I've found is that there is a project liquibase. It is aimed at change tracking but also converting between different DBMS. You can download source code and see different datatypes conversions across databases. Source at github browse for java files there, probably you'll find something helpful
If all you want are basic operations, these are fairly universal. For instance:
SELECT
INSERT
DELETE
UPDATE
FROM
WHERE
JOIN
...are all at the most basic level the same across implementations.
However, the more complicated your scripts get, the more difficult it becomes to make them "universal". Things like aggregation, subqueries, cursors, while loops, functions, indexes, constraints, temp tables, variables, string manipulation, window operations etc. are all pretty much database-specific.
Some of these do have "universal" equivalents but the more generic you make your code the worse it will perform.

are there open source validation parsers for major SQL dialects (TSQL, Oracle, MySQL)? or at least precise specs for these dialects?

word on the street is that Perl is defined not by a spec but by whatever the current interpreter version happens to accept. Now, let's consider an SQL dialect like TSQL. Is there a published spec of it that would allow making a validator equivalent to the one inside SQL Server? Are there such validators already in existence as open source? And the same question for Oracle.
Ok, so for MySQL I am guessing that validator could be extracted directly from the MySQL codebase. Nevertheless, do they in fact publish the spec itself in case I wanted to make my own validator?
You seem to have an idea of what to do for MySQL. I can't really say much about Oracle apart from that it mostly implements ANSI SQL and the PL/SQL procedural language extensions to SQL can mostly be found here for Oracle 9i.
For SQL Server:
Microsoft Books On Line (BOL) is the official reference spec. There are different pages for different versions of SQL Server, however.
There are a few projects relating to this.
http://www.sqlparser.com/ - This has .NET, Java, COM and VCL versions for Oracle, DB2, Mysql and SQL Server / Sybase (T-SQL). Quite reasonably priced too.
http://www.codeproject.com/Articles/1136/SharpHSQL-An-SQL-engine-written-in-C (c#)
http://antlr.org/ - This looks like a good bet.
I often use this site for formatting of SQL but it also does some validation although it's fairly crude:
http://www.dpriver.com/pp/sqlformat.htm
This is a similar site:
http://www.tsqltidy.com/
I would suggest that writing a validator for SQL even in just one of its variations is a massive undertaking. You could look at the various ISO/IEC standards for ANSI SQL. ANSI SQL-92 is very widely implemented, but there is a SQL:2008 standard as well.
You'd have to pay for the documentation for those standards though and they aren't cheap.
Good luck.

Are all SQL Geospatial implementations database specific?

My team is looking into geospatial features offered by different database platforms.
Are all of the implementations database specific, or is there a ANSI SQL standard, or similar type of standard, which is being offered, or will be offered in the future?
I ask, because I would like the implemented code to be as database agnostic as possible (our project is written to be ANSI SQL standard).
Is there any known plan for standardization of this functionality in the future?
Currently, there are more than one specifications followed by popular proprietary and open source implementations of spatial databases:
The OpenGIS - Simple Features for SQL
ISO SQL Multimedia Specification for Spatial - ISO/IEC 13249-3:2006 - Information technology -- Database languages -- SQL multimedia and application packages -- Part 3: Spatial
PostGIS, Oracle, Microsoft SQL Server and to some limited degree MySQL, all the databases implement the standard interfaces to manipulate spatial data. However, in spite of this fairly standardized features, all databases usually differ on simple SQL level what may make the database-agnostic implementation of your solution tricky. You likely need to survey the features you are interested and compare what various vendors provide.
For example GIS extensions for MySQL and for PostgreSQL both follow OpenGIS "Simple Features Specification for SQL" standard.
I haven't tried it, but Google tells me FDO is "an open-source API for manipulating, defining and analyzing geospatial information regardless of where it is stored". It's listed on osgeo.org - a point in its favour in my opinion.
There are providers for MySQL & Oracle. Disappointingly though SQL Server and Postgis aren't listed on the FDO providers page.
The only standard I know of is http://www.opengeospatial.org/standards/sfs and I don't know how well all the spatial database extensions implement it.
there are a number of geo-databases which are accessible with hibernate spatial
Oracle10g
Postgresql
MySQL
using an abtraction layer like hibernate is a good idea anyways, if you plan to write a database agnostic application. hibernatespatial fills this gap for geo features.