BNF notation of T-SQL - sql

Do you know where I can get the BNF (Backus-Naur Form) notation for the latest version of T-SQL? This is the Microsoft version and I can't find anything for it. I found the revised ISO standard SQL2, also called SQL-92, here, but it seems to lack some features of Microsoft's T-SQL.

Have you checked this out, by the way?
General SQL Parser
They have engineered the notation from the ground up.

I developed a T-SQL grammar for ANTLR 4 in EBNF form. Check it out in the official grammars repository. It is based on the MSDN description.
Currently implemented syntax:
Control of Flow (MSDN).
Cursors (MSDN).
Data manipulation language (DML):
Delete (MSDN).
Insert (MSDN).
Select (MSDN).
Update (MSDN).
Expressions (MSDN).
Predicates.
Transactions (MSDN).
And other syntax. Check the grammar and test files for more detail.

I know this is an old question, but I just found this grammar file hosted on Bitbucket that can be used with the GOLD Parsing System.
Since you're looking for TSQL's BNF (I was too), and it doesn't really exist, this grammar is the next best thing IMO.

SQL-92 in BNF
SQL Server 2005 is based on SQL-92 with some SQL-99 features and Microsoft's T-SQL extensions. This is the best I have found so far.
Let me know if you find a more up-to-date one.

Related

A grammar for Access SQL

Does a grammar (like in EBNF or similar format) exist for MS Access SQL syntax? Like how TSQL syntax is documented with EBNF: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-transact-sql?view=sql-server-2017.
I have only been able to find tutorials with examples, but not a full grammar.
You can find the full Access SQL reference here on MS Docs.
Note that some statements are exclusive to the SQL Server-compatible syntax (anything with the DECIMAL type and CHECK constraints), and this isn't properly described in the reference.
It isn't as extensive and well-written as the T-SQL documentation, but it's the closest to what you're asking for.

Which language has good SQL parsing library?

I'm looking for a good SQL parser, one that will work with subselects, non-select queries, CTEs, window functions and other legal SQL elements.
The result would be some kind of abstract syntax tree that I could work on later.
The language is mostly irrelevant, as I am willing to learn a new language just to use the library, if it exists.
I know that it is technically possible to extract the parser from some open source database, but it's far from easy (at least for the parser of PostgreSQL, which is what I need).
There's a non-validating SQL parser in Python: python-sqlparse. The tokens are exposed as objects. I doubt that it supports "other legal SQL statements", window functions, and the like, though, as those are controlled by vendor-specific grammars and no vendor is technically fully compliant with the SQL standards.
Um (knowing that you're willing to learn a new language), why would you need to work on the syntax tree? If you need some magic in dealing with the database, you probably don't need to reinvent the wheel: Python has a fantastic database toolkit, SQLAlchemy.
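To make "non-validating, with tokens exposed as objects" concrete, here is a minimal standard-library sketch. It is not python-sqlparse's API; the `Token` class, the `KEYWORDS` set and the `tokenize` function are hypothetical names invented for this illustration, and the keyword list is deliberately tiny.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "KEYWORD", "NAME", "NUMBER", "STRING", "OP" or "UNKNOWN"
    text: str


# Assumption: a handful of keywords for the demo; real dialects have hundreds.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "INSERT", "UPDATE", "DELETE"}

TOKEN_RE = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<STRING>'(?:[^']|'')*')
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<OP><=|>=|<>|[-+*/=<>(),.;])
  | (?P<UNKNOWN>.)
""", re.VERBOSE)


def tokenize(sql):
    """Split SQL text into Token objects without checking that it is valid SQL."""
    tokens = []
    for match in TOKEN_RE.finditer(sql):
        kind, text = match.lastgroup, match.group()
        if kind == "SPACE":
            continue  # whitespace carries no meaning here
        if kind == "NAME" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append(Token(kind, text))
    return tokens


for token in tokenize("SELECT a FROM t WHERE x >= 10"):
    print(token.kind, token.text)
```

Because nothing is validated, nonsense such as `FROM FROM SELECT` is tokenized without complaint; rejecting it would be the job of a grammar for a specific dialect.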
You can google "sql parser". This is one of the results listed: General SQL Parser. Here are some highlighted features listed on the official website:
Offline SQL syntax check
Highly customizable SQL formatter
In-depth analysis of SQL script
Fully access to SQL query parse tree
Custom SQL engine for various databases
Major programming language support
It's a commercial SQL library.
Our DMS Software Reengineering Toolkit has full PL/SQL and ANSI SQL 2011 parsers (to ASTs) and prettyprinters (ASTs back to valid text). Neither of these is PostgreSQL, but DMS has a dialect mechanism that makes it relatively easy to build a dialect from a base grammar by revising just some of the grammar rules and retaining the rest. Doing this from the SQL 2011 grammar seems like a practical way to tackle the problem.
DMS also offers facilities to access/traverse/modify the ASTs, both procedurally and in terms of surface-syntax patterns and transformations. Think of this as "life beyond parsing".

Which SQL Implementation can translate to many other(s)?

I'm looking for a SQL implementation (and its editor) that can be translated to many other SQL dialects.
For example, I would write script files in that SQL language and then translate them into script files for other SQL dialects (e.g. MS SQL, MySQL, ...).
If you make sure to use only ANSI SQL to construct your scripts, you should be good to go.
I agree with @Justin Niessner: all SQL vendors pay attention to the SQL standards, notably core SQL-92. To take SQL Server as an example, although Microsoft finds Sybase legacy code tricky to deprecate, they are not afraid to do so, and entirely new features (e.g. MERGE in SQL Server 2008) tend to extend their Standard SQL equivalents rather than reinvent the wheel.
For a product that has good Standards compliance, take a look at Mimer:
"Here at Mimer Information Technology, we pride ourselves on conforming to the SQL standard and we play an active role in the Database Languages standardization group which determines exactly what is SQL standard."
Mimer also provide extremely useful SQL validators for SQL-92, SQL-99 and SQL:2003 respectively.
I was researching the same thing a while ago. What I found is a project called Liquibase. It is aimed at change tracking, but it also converts between different DBMSs. You can download the source code and see the datatype conversions across databases. The source is on GitHub; browse the Java files there and you'll probably find something helpful.
If all you want are basic operations, these are fairly universal. For instance:
SELECT
INSERT
DELETE
UPDATE
FROM
WHERE
JOIN
...are all at the most basic level the same across implementations.
However, the more complicated your scripts get, the more difficult it becomes to make them "universal". Things like aggregation, subqueries, cursors, while loops, functions, indexes, constraints, temp tables, variables, string manipulation, window operations etc. are all pretty much database-specific.
Some of these do have "universal" equivalents but the more generic you make your code the worse it will perform.
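As a small illustration of the "basic operations" listed above, the sketch below runs plain SELECT, INSERT, UPDATE, DELETE and JOIN statements against an in-memory SQLite database using Python's standard `sqlite3` module. The table and column names are made up for the example. Only SQLite is exercised here, so this does not prove portability; it merely shows the kind of statement text that the answer describes as common to most implementations.

```python
import sqlite3

# Hypothetical schema for illustration only.
statements = [
    "CREATE TABLE authors (id INTEGER PRIMARY KEY, name VARCHAR(50))",
    "CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title VARCHAR(100))",
    "INSERT INTO authors (id, name) VALUES (1, 'Ada')",
    "INSERT INTO authors (id, name) VALUES (2, 'Bob')",
    "INSERT INTO books (id, author_id, title) VALUES (1, 1, 'Notes')",
    "INSERT INTO books (id, author_id, title) VALUES (2, 2, 'Drafts')",
    "UPDATE books SET title = 'Final' WHERE id = 2",
    "DELETE FROM books WHERE id = 1",
]

conn = sqlite3.connect(":memory:")
for sql in statements:
    conn.execute(sql)

# A basic inner join with a filter and an explicit ordering.
rows = conn.execute(
    "SELECT a.name, b.title "
    "FROM authors a JOIN books b ON b.author_id = a.id "
    "WHERE b.id > 0 "
    "ORDER BY a.name"
).fetchall()
print(rows)
conn.close()
```

Anything beyond this level (identity columns, string functions, procedural code) is where the dialects start to diverge, as the answer notes.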

Are there open source validation parsers for major SQL dialects (T-SQL, Oracle, MySQL)? Or at least precise specs for these dialects?

Word on the street is that Perl is defined not by a spec but by whatever the current interpreter version happens to accept. Now, let's consider an SQL dialect like T-SQL. Is there a published spec for it that would allow making a validator equivalent to the one inside SQL Server? Do such validators already exist as open source? And the same question for Oracle.
Ok, so for MySQL I am guessing that validator could be extracted directly from the MySQL codebase. Nevertheless, do they in fact publish the spec itself in case I wanted to make my own validator?
You seem to have an idea of what to do for MySQL. I can't really say much about Oracle apart from that it mostly implements ANSI SQL and the PL/SQL procedural language extensions to SQL can mostly be found here for Oracle 9i.
For SQL Server:
Microsoft SQL Server Books Online (BOL) is the official reference spec. There are different pages for different versions of SQL Server, however.
There are a few projects relating to this.
http://www.sqlparser.com/ - This has .NET, Java, COM and VCL versions for Oracle, DB2, MySQL and SQL Server / Sybase (T-SQL). Quite reasonably priced too.
http://www.codeproject.com/Articles/1136/SharpHSQL-An-SQL-engine-written-in-C (C#)
http://antlr.org/ - This looks like a good bet.
I often use this site for formatting of SQL but it also does some validation although it's fairly crude:
http://www.dpriver.com/pp/sqlformat.htm
This is a similar site:
http://www.tsqltidy.com/
I would suggest that writing a validator for SQL even in just one of its variations is a massive undertaking. You could look at the various ISO/IEC standards for ANSI SQL. ANSI SQL-92 is very widely implemented, but there is a SQL:2008 standard as well.
You'd have to pay for the documentation for those standards though and they aren't cheap.
Good luck.
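If the dialect is effectively defined by what the engine accepts, one pragmatic approach is to let the engine itself do the checking. The sketch below shows that idea with SQLite through Python's standard `sqlite3` module, simply because it needs no server; it says nothing about T-SQL, Oracle or MySQL syntax. The helper name `check_syntax` is hypothetical, and it assumes a single statement per call.

```python
import sqlite3


def check_syntax(sql):
    """Return None if SQLite can compile the statement, else the engine's message.

    Assumption: `sql` holds exactly one statement. Prefixing it with EXPLAIN
    makes SQLite compile the statement without running it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


print(check_syntax("SELECT 1 + 2"))
print(check_syntax("SELEC 1 + 2"))
```

Note that compiling a statement also resolves table and column names, so against an empty database a query on a missing table is reported as an error as well. That is stricter than a pure grammar check.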

SQL - Parsing a query

I worked on a parser for arithmetic expressions.
The key there was building a syntax tree, where leaves are variables and nodes are operators.
Now I'm thinking about parsing SQL queries. Parsing a simple select won't be a problem, but I'm not so sure about complex queries.
Can you point me to a good reference on SQL parsing? Thank you in advance!
Take a look at the SQL BNF grammars.
Some code samples:
Look at the Open SQL parser on SourceForge.
There was a question about SQL parser libraries before. Look there.
I'm not sure if you know C# or .NET, but LINQ to SQL basically does this by building expression trees that are then executed only when the query is 'called'.
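For reference, the syntax-tree idea from the question (leaves are variables, inner nodes are operators) can be sketched as a small recursive-descent parser. This is only an illustration for arithmetic expressions, not an SQL parser; the `Var` and `BinOp` classes and the `parse` function are names chosen for this example, and characters other than names, operators and parentheses are silently ignored.

```python
import re
from dataclasses import dataclass


@dataclass
class Var:
    name: str  # leaf: a variable


@dataclass
class BinOp:
    op: str  # inner node: an operator with two children
    left: object
    right: object


def parse(text):
    """Parse names, + - * / and parentheses into a tree of Var and BinOp."""
    tokens = re.findall(r"[A-Za-z_]\w*|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def factor():  # factor := NAME | "(" expr ")"
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected {tok!r}")
        return Var(tok)

    def term():  # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in ("*", "/"):
            node = BinOp(take(), node, factor())
        return node

    def expr():  # expr := term (("+" | "-") term)*
        node = term()
        while peek() in ("+", "-"):
            node = BinOp(take(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree


print(parse("a + b * c"))
```

A parser for SQL follows the same pattern with one function per grammar rule, which is why a BNF grammar for the target dialect is the usual starting point.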