I have already searched a lot of resources on the net about parsing: parsing integers, parsing chars, parsing strings. However, I just can't create a program that will parse a SQL query and convert it to another dialect.
For example, MySQL to MS SQL.
Does anybody have some sample query conversion code or relevant links?
SQL conversion from one database to another is quite complicated. There are lots of things to handle, such as data type conversion, differing function syntax, and proprietary join syntax; stored procedures are much more difficult still to convert.
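To make the "different syntax of functions" point concrete, here is a minimal sketch of a MySQL to T-SQL rewrite. The function name `mysql_to_tsql` and the two-entry `FUNCTION_MAP` are assumptions made for illustration; the approach is plain regular-expression substitution, not real parsing, so it would also rewrite matches inside string literals and it ignores data types, LIMIT, joins and everything else mentioned above.

```python
import re

# Hypothetical, deliberately tiny mapping table; a real converter needs far more.
FUNCTION_MAP = {"NOW": "GETDATE", "IFNULL": "ISNULL"}


def mysql_to_tsql(query: str) -> str:
    """Naively rewrite a few MySQL constructs into their T-SQL counterparts."""
    # MySQL backtick-quoted identifiers become T-SQL bracket-quoted identifiers.
    query = re.sub(r"`([^`]+)`", r"[\1]", query)

    # Rename function calls: a mapped name followed by an opening parenthesis.
    def rename(match):
        return FUNCTION_MAP[match.group(1).upper()] + "("

    pattern = r"\b(" + "|".join(FUNCTION_MAP) + r")\s*\("
    return re.sub(pattern, rename, query, flags=re.IGNORECASE)


print(mysql_to_tsql(
    "SELECT `name`, IFNULL(`nick`, 'n/a') FROM `users` WHERE `created` < NOW()"
))
```

Anything beyond such one-to-one renames (LIMIT versus TOP, joins, stored procedures) needs a real parser, because the rewrite depends on where a construct sits in the statement.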
Here are two articles with working demos of SQL query conversion:
Rewrite Oracle proprietary joins to ANSI SQL compliant joins.
Rewrite SQL Server proprietary joins to ANSI SQL compliant joins.
Microsoft provides some guidelines for migrating from other databases to their products. You can download documents from their site that will assist with the necessary conversions for your queries: see Migration to Microsoft SQL Server 2008. The guides are Word documents.
You could use Antlr or a similar tool. There is an almost ready-to-use MySQL grammar for Antlr, see http://www.antlr.org/grammar/list
Adding a VB.NET target to Antlr will not be that easy, but I suppose you'd be just fine with an existing C# backend.
Related
Does a grammar (like in EBNF or similar format) exist for MS Access SQL syntax? Like how TSQL syntax is documented with EBNF: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-transact-sql?view=sql-server-2017.
I have only been able to find tutorials with examples, but not a full grammar.
You can find the full Access SQL reference here on MS Docs.
Note that some statements are exclusive to the SQL Server-compatible syntax (anything with the DECIMAL type and CHECK constraints), and this isn't properly described in the reference.
It isn't as extensive and well-written as the T-SQL stuff, but it's closest to what you're asking.
Is it possible to implement my own database server taking Oracle PL/SQL syntax as the basis? I would also like to ask why different database solutions (SQL Server, MySQL, SQLite, etc.) have different syntax. Can't they have some specific standard of syntax for basic operations, including PL/SQL (excluding SQLite)? Why does everyone have a different syntax? Sorry for diverting the question into patent issues, but I could not find a better place to ask it.
Of course you can, but you have to parse the PL/SQL yourself into something other platforms understand. (You can use ANTLR, for example, as the parser tool. There is even a full-featured grammar for PL/SQL.) This is possible for small solutions with a small instruction set, but for large, full support of PL/SQL you need to be Oracle-sized.
To answer the why: two reasons:
There is no standard, so everyone picks their own;
You don't want customers to leave, so your own 'best' framework, incompatible with the others, becomes your USP and prevents users from simply porting their code to another platform. They are stuck on yours.
I have heard that SQL is mostly the same from program to program, but there are some differences. I am wondering if there are any differences in SQL between Access (2007 if it matters) and MS SQL Server? I wonder because I regularly use Access and want to learn SQL from a book, and I wonder if a book using MS SQL Server will serve my purposes? I am considering "Access 2007 Pure SQL" and "Beginning SQL Joes 2 Pros", the second of which uses MS SQL Server. Thanks for any help!
There are multiple differences, even down to simple things like the string concatenation operator. Access uses &, SQL Server uses +. SQL is like English. There's British English, Canadian English, American English, Australian English, etc. Multiple dialects, mostly but not totally compatible with each other.
That's not to say that things are totally incompatible. Learning SQL on any DBMS is of use, because the core concepts of relational databases remain the same regardless of which DBMS you're on. It's just how you interface with them that's different.
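As a small illustration of the concatenation difference, the sketch below rewrites Access's & into + while leaving quoted literals alone. The function name `access_concat_to_tsql` is made up for this example, and it only swaps the operator character: it does not account for semantic differences between the two operators, such as how each treats NULL or numeric operands.

```python
def access_concat_to_tsql(expr: str) -> str:
    """Replace Access's & operator with + outside of quoted string literals."""
    out = []
    quote = None  # the quote character of the literal we are inside, if any
    for ch in expr:
        if quote:
            # Inside a literal: copy characters through until the closing quote.
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "&":
            ch = "+"
        out.append(ch)
    return "".join(out)


print(access_concat_to_tsql("[First] & ' & ' & [Last]"))
```

A doubled quote inside a literal closes and immediately reopens the literal state, so escaped quotes pass through unchanged.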
MS Access uses JET SQL while SQL Server uses Transact SQL. For the most part, they are very similar. SQL in general is a programming language designed for managing data in relational database management systems. So all the flavors feature a common subset. But there are differences too. For more info, refer to this article on Convert Microsoft Access (JET SQL) to SQL Server (T-SQL) Cheatsheet. There are numerous other resources on web, but this should give you a quick picture of some differences.
I would say that Access SQL and T-SQL (SQL Server) have more differences than similarities. Any appearance of similarity is due to 1) both being based on the SQL-89 Standard (but both T-SQL and the Standards have moved on greatly, Access not so), 2) the SQL Server team tried but failed to make Access2000 (Jet 4.0) compliant with entry level SQL-92 Standard (the de facto "bare minimum" Standard).
Take, for example, the UPDATE statement. In its simplest form, i.e. involving literal or input parameter (scalar) values, the two are broadly the same. However, when updating one table using the values from another table, the latest T-SQL syntax (2008) supports the SQL-92 scalar subquery syntax, the SQL-99 and SQL:2003 Standards' MERGE syntax with useful proprietary extensions, plus its older proprietary UPDATE..FROM syntax (which should be avoided nowadays because it allows potentially ambiguous results), all of which can optionally use SQL:2003 common table expressions (useful for simplifying the SQL-92 scalar subquery syntax).
For Access you are compelled to use its proprietary UPDATE..FROM syntax, which is not the same as the T-SQL proprietary UPDATE..FROM syntax but has the same problem of allowing potentially ambiguous results (but this time cannot be avoided!), unless the query involves aggregated values in which case you cannot use SQL at all (!!) and must resort to client side (non-SQL) procedural code (because Access does not support procedural SQL code, another huge difference from T-SQL).
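For readers who have not seen the SQL-92 scalar subquery form of UPDATE, here is a minimal sketch of it. It runs against an in-memory SQLite database purely because that ships with Python; SQLite stands in for neither T-SQL nor Access, and the tables `products` and `new_prices` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE new_prices (product_id INTEGER, price REAL);
    INSERT INTO products VALUES (1, 10.0), (2, 20.0);
    INSERT INTO new_prices VALUES (1, 12.5);
""")

# SQL-92 style: each target row takes its new value from a scalar subquery.
# The WHERE EXISTS guard keeps rows without a match from being set to NULL.
conn.execute("""
    UPDATE products
    SET price = (SELECT np.price FROM new_prices AS np
                 WHERE np.product_id = products.id)
    WHERE EXISTS (SELECT 1 FROM new_prices AS np
                  WHERE np.product_id = products.id)
""")

rows = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()
print(rows)
```

One caveat about the stand-in: SQL Server raises an error when such a subquery returns more than one row, which is what removes the ambiguity, whereas SQLite quietly uses the first row.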
I'm looking for a good SQL parser. One that will work with subselects, non-select queries, CTEs, window functions and other legal SQL elements.
The result would be some kind of abstract syntax tree that I could later work on.
The language is mostly irrelevant, as I am willing to learn a new language just to use the library, if it exists.
I know that it is technically possible to extract parser from some open source database, but it's far from easy (at least for the parser of PostgreSQL which is what I need).
There's a non-validating SQL parser in Python: python-sqlparse. The tokens are exposed as objects. I doubt it supports "other legal SQL statements", window functions, and the like, though, as those are controlled by vendor-specific grammars and no vendor is technically fully compliant with the SQL standards.
Um (knowing that you're willing to learn a new language), why would you need to work on the syntax tree? If you need some magic in dealing with the database, you probably don't need to reinvent the wheel: Python has a fantastic database toolkit, SQLAlchemy.
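To show what "non-validating, with tokens exposed as objects" can mean in practice, here is a hand-rolled sketch using only the standard library. It is not python-sqlparse's API; the names `Token`, `TOKEN_SPEC`, `KEYWORDS` and `tokenize` are assumptions for this example, and the keyword list is intentionally tiny.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Order matters: earlier patterns win. UNKNOWN catches anything else, so the
# tokenizer never rejects input (that is what makes it non-validating).
TOKEN_SPEC = [
    ("STRING", r"'(?:[^']|'')*'"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"<>|<=|>=|[-+*/=<>(),.;]"),
    ("SPACE", r"\s+"),
    ("UNKNOWN", r"."),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in TOKEN_SPEC))
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "AS"}


def tokenize(sql: str) -> list:
    """Split SQL text into Token objects without checking its grammar."""
    tokens = []
    for match in MASTER.finditer(sql):
        kind, text = match.lastgroup, match.group()
        if kind == "SPACE":
            continue  # whitespace carries no meaning here
        if kind == "NAME" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append(Token(kind, text))
    return tokens


for token in tokenize("SELECT a FROM t WHERE b >= 'x''y'"):
    print(token)
```

Building an abstract syntax tree on top of such a token stream is where the vendor-specific grammar comes in, which is the hard part the question is really about.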
You can google "sql parser". This is the one that is listed: General SQL Parser. Here are some highlighted features from the official website:
Offline SQL syntax check
Highly customizable SQL formatter
In-depth analysis of SQL script
Full access to SQL query parse tree
Custom SQL engine for various databases
Major programming language support
It's a commercial SQL library.
Our DMS Software Reengineering Toolkit has PL/SQL and ANSI SQL 2011 full parsers (to ASTs) and prettyprinters (ASTs back to valid text). Neither of these is PostgreSQL, but DMS has a dialect mechanism that enables one to relatively easily build a dialect from a base grammar, by revising just some of the grammar rules and retaining the rest. Doing this from the SQL 2011 grammar seems like a practical way to tackle the problem.
DMS also offers facilities to access/traverse/modify the ASTs, both procedurally and in terms of surface-syntax patterns and transformations. Think of this as "life beyond parsing".
Word on the street is that Perl is defined not by a spec but by whatever the current interpreter version happens to accept. Now, let's consider an SQL dialect like TSQL. Is there a published spec of it that would allow making a validator equivalent to the one inside SQL Server? Are there such validators already in existence as open source? And the same question for Oracle.
Ok, so for MySQL I am guessing that validator could be extracted directly from the MySQL codebase. Nevertheless, do they in fact publish the spec itself in case I wanted to make my own validator?
You seem to have an idea of what to do for MySQL. I can't really say much about Oracle apart from that it mostly implements ANSI SQL and the PL/SQL procedural language extensions to SQL can mostly be found here for Oracle 9i.
For SQL Server:
Microsoft Books On Line (BOL) is the official reference spec. There are different pages for different versions of SQL Server, however.
There are a few projects relating to this.
http://www.sqlparser.com/ - This has .NET, Java, COM and VCL versions for Oracle, DB2, MySQL and SQL Server / Sybase (T-SQL). Quite reasonably priced too.
http://www.codeproject.com/Articles/1136/SharpHSQL-An-SQL-engine-written-in-C (C#)
http://antlr.org/ - This looks like a good bet.
I often use this site for formatting SQL; it also does some validation, although it's fairly crude:
http://www.dpriver.com/pp/sqlformat.htm
This is a similar site:
http://www.tsqltidy.com/
I would suggest that writing a validator for SQL even in just one of its variations is a massive undertaking. You could look at the various ISO/IEC standards for ANSI SQL. ANSI SQL-92 is very widely implemented, but there is a SQL:2008 standard as well.
You'd have to pay for the documentation for those standards though and they aren't cheap.
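If "whatever the engine accepts" is the practical definition of a dialect, one cheap alternative to writing a validator is to ask an engine to compile the statement without running it. The sketch below does this with SQLite only because it ships with Python; it says nothing about T-SQL, Oracle or MySQL, and the helper name `syntax_error` is made up for the example.

```python
import sqlite3


def syntax_error(sql: str):
    """Return None if SQLite can compile the statement, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        # EXPLAIN makes SQLite compile the statement without executing it.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        # Note: references to missing tables or columns are reported here too,
        # so this checks more than pure syntax.
        return str(exc)
    finally:
        conn.close()


print(syntax_error("SELECT 1 + 2"))
print(syntax_error("SELEC 1 + 2"))
```

The same idea applies to other engines through whatever "prepare but do not execute" facility they offer, but the details differ per product and are not covered here.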
Good luck.