Is there a way to make SQL standard compliant queries using Visual Studio? - sql

I'd like to know whether there is an SQL standard compliance validator for Visual Studio 2019 Professional (something that could be set to strict, so that only fully compliant syntax is accepted). It would be nice if it also supported native languages, but I'm used to that kind of tooling being CLR-only (I don't really know why; my guess is linking, but I may be completely wrong).
Importantly, it needs to check compliance with the standard, not just with SQL Server: anything that is not in the standard should be an error.
The goal is to make SQL code that is completely independent of the DBMS. Thank you for taking the time to read my question.

"The goal is to make SQL code that is completely independent of the DBMS."
Impossible goal, unless you are going to forsake writing SQL at all. It is perhaps sad, but different databases differ on very fundamental things, picking and choosing the parts of the standard they want. Happily, the major things like SELECT, JOIN and GROUP BY are common but the details are not.
You can think of them as dialects of a spoken language that vary over time and region. I'm most familiar with English, but it is true that all languages evolve and change. I can read Shakespearean English, but I am not going to write English like that. It would be grammatically incorrect in some cases, and it would use unknown words and alternative meanings of common words.
Here are just some examples of some features that differ widely among databases:
Intervals. In the standard syntax, adding one day to a date is written as datecol + INTERVAL '1' DAY. This varies significantly across databases.
Some databases do not support FULL JOIN.
Some databases do not support recursive CTEs. Some use the recursive keyword; some do not.
Some databases do not support the VALUES() constructor in the FROM clause.
Some databases allow the FROM clause to be optional.
The standard has nifty functionality, such as FILTER and aggregation by functionally dependent ids, that few databases support.
Limitations on data types vary significantly -- what is the longest string, for instance.
The standard uses FETCH to limit results, which some databases do not support.
Parsing strings into dates and formatting dates into strings is totally database-dependent.
Extracting date/time components uses extract() in the standard, but not every database supports it (SQL Server, for example, uses DATEPART() instead).
These are just a few of the differences off the top of my head -- in no way meant to be complete or even the most important. I just want to point out that what you want to do is not possible.
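To make one item from the list concrete, here is a rough sketch of how row limiting is spelled in a few dialects. The table name `items` and the helpers `first_rows_query` and `runs_on_sqlite` are made up for illustration, and SQLite (via Python's built-in sqlite3 module) is used only because it is easy to run locally. Which forms a given engine accepts depends on the product and version, so treat the map as a starting point rather than a reference.

```python
import sqlite3

# Row-limiting syntax per dialect. FETCH FIRST is the standard form;
# the others are vendor-specific spellings of the same idea.
ROW_LIMIT_TEMPLATES = {
    "standard": "SELECT name FROM items ORDER BY name FETCH FIRST {n} ROWS ONLY",
    "sqlite": "SELECT name FROM items ORDER BY name LIMIT {n}",
    "mysql": "SELECT name FROM items ORDER BY name LIMIT {n}",
    "sqlserver": "SELECT TOP ({n}) name FROM items ORDER BY name",
}


def first_rows_query(dialect: str, n: int) -> str:
    """Return a query for the first n item names in the given dialect."""
    return ROW_LIMIT_TEMPLATES[dialect].format(n=int(n))


def runs_on_sqlite(conn: sqlite3.Connection, sql: str) -> bool:
    """Report whether SQLite accepts and executes the statement."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("c",), ("a",), ("b",)])

# Try every spelling against one engine to see which ones it accepts.
for dialect in ROW_LIMIT_TEMPLATES:
    sql = first_rows_query(dialect, 2)
    print(f"{dialect:10} accepted by SQLite: {runs_on_sqlite(conn, sql)}")
```

Running the same loop against each engine you care about is a cheap way to find out which spellings you can actually share.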

Related

Performance of SQL standards vs T-SQL extensions

Articles on the internet say user-defined functions can either burden or increase the performance.
Now, I know that standard SQL is pretty limited, however, some of the behavior can still be written as in T-SQL built-in functions.
For example, adddays() vs. dateadd(). Another point I've heard is that it's also better to use coalesce(), the ANSI standard function, rather than isNull().
What is the performance difference between using the ANSI SQL standard functions vs T-SQL functions?
Does T-SQL add any burden whatsoever on performance while trying to make the job easier, or not?
My research does not seem to indicate a trend.
You will need to approach this on a case-by-case basis and do actual testing. There is no general rule, other than Microsoft tries to make the entire stack perform as well as possible. TESTING is what you need to do - we can't tell you that always a certain thing would be faster. That would be really bad advice.
It is important to do this testing on your actual production data, preferably a copy of it. Do not rely on tests done against data sets that aren't yours. When you're talking about performance differences of functions, some very subtle things can make a big difference. Things like the size of the table, the data types involved, the indexing, and the SQL Server version can change the result of these tests. That is why "no one has done this" for you. We can't.
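As a rough illustration of the kind of test meant here, the sketch below times two equivalent null-handling functions against the same table. It is only a stand-in: it uses SQLite through Python's sqlite3 module with synthetic data, and SQLite's IFNULL plays the role of T-SQL's ISNULL. The table `t` and the helpers `run` and `best_time` are invented for the example. The numbers it prints say nothing about SQL Server; the point is the shape of the harness, which you would point at a copy of your own data.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
# Every third value is NULL so the null-handling functions have work to do.
conn.executemany(
    "INSERT INTO t (val) VALUES (?)",
    [(None if i % 3 == 0 else i,) for i in range(10_000)],
)

# Two spellings of the same computation: standard vs. vendor-specific.
CANDIDATES = {
    "coalesce": "SELECT SUM(COALESCE(val, 0)) FROM t",
    "ifnull": "SELECT SUM(IFNULL(val, 0)) FROM t",
}


def run(sql: str):
    """Execute the query and return its single scalar result."""
    return conn.execute(sql).fetchone()[0]


def best_time(sql: str, repeats: int = 5, runs: int = 20) -> float:
    """Best-of-`repeats` wall time in seconds for `runs` executions."""
    return min(timeit.repeat(lambda: run(sql), repeat=repeats, number=runs))


results = {name: run(sql) for name, sql in CANDIDATES.items()}
# Only compare timings once both variants are known to return the same answer.
assert len(set(results.values())) == 1

for name, sql in CANDIDATES.items():
    print(f"{name:10} {best_time(sql):.4f}s")
```

Taking the best of several repeats reduces noise from other activity on the machine, but it does not replace testing with realistic table sizes and indexes.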

List of differences between SQL databases

Most SQL databases follow the ANSI SQL standards to a degree, but
The standard is ambiguous, leaving some areas open to interpretation (eg: how different operations with NULLs should be handled is ambiguous)
Some vendors contradict the standard outright or just lack functionality defined by the standard (eg: MySQL has a list of differences between the standard and their implementation)
Some databases will behave differently depending on how they are configured, but configuration can be changed to have them behave the same way (eg: Oracle performs case-sensitive string comparisons by default, while SQL Server does them case-insensitively)
There is some functionality that is not part of the standard but is implemented by different RDBMSs anyway, albeit with different names (eg: Oracle's LISTAGG = MySQL's GROUP_CONCAT)
Is there a resource with a comprehensive list of quirks and gotchas to pay attention to when you are trying to write something that is supposed to be compatible with multiple databases?
I'm not sure how comprehensive this list is, but maybe this will help -
http://troels.arvin.dk/db/rdbms/
Besides what has already been mentioned, you can find some comparisons on Wikipedia.
A similar question has also been posted on Stack Overflow, where you can find a couple of useful links.
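The LISTAGG / GROUP_CONCAT point from the question can be handled with a small per-dialect lookup. This is only a sketch: the `staff` table and the `names_per_dept_query` helper are invented for the example, the expressions should be checked against the version of each product you target, and only the SQLite variant is actually executed here.

```python
import sqlite3

# The same "concatenate values per group" operation in several dialects.
STRING_AGG = {
    "oracle": "LISTAGG({col}, ',') WITHIN GROUP (ORDER BY {col})",
    "mysql": "GROUP_CONCAT({col} ORDER BY {col} SEPARATOR ',')",
    "postgresql": "STRING_AGG({col}, ',' ORDER BY {col})",
    "sqlserver": "STRING_AGG({col}, ',') WITHIN GROUP (ORDER BY {col})",
    "sqlite": "GROUP_CONCAT({col}, ',')",
}


def names_per_dept_query(dialect: str) -> str:
    """Build a query listing comma-separated staff names per department."""
    agg = STRING_AGG[dialect].format(col="name")
    return f"SELECT dept, {agg} AS names FROM staff GROUP BY dept ORDER BY dept"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (dept TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("ops", "ann"), ("ops", "bob"), ("dev", "cy")],
)

# SQLite does not promise an order inside group_concat, so sort after splitting.
for dept, names in conn.execute(names_per_dept_query("sqlite")):
    print(dept, sorted(names.split(",")))
```

Keeping the dialect-specific fragments in one table like this at least makes the non-portable parts easy to find.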

Which SQL Implementation can translate to many other(s)?

I'm looking for an SQL implementation (and its editor) that can be translated to many other SQL dialects.
For example, I would write script files in that SQL language and then translate them into script files for other SQL dialects (e.g. MS SQL, MySQL, ...).
If you're sure to use only ANSI SQL to construct your scripts, you should be good to go.
I agree with @Justin Niessner: all SQL vendors pay attention to the SQL standards, notably core SQL-92. To take SQL Server as an example: although they find legacy Sybase code tricky to deprecate, they are not afraid to do so, and entirely new features (e.g. MERGE in MSSQL2008) tend to extend their Standard SQL equivalents rather than reinvent the wheel.
For a product that has good Standards compliance, take a look at Mimer
Here at Mimer Information Technology, we pride ourselves on conforming
to the SQL standard and we play an active role in the Database
Languages standardization group which determines exactly what is SQL
standard.
Mimer also provides extremely useful SQL validators for SQL-92, SQL-99 and SQL:2003 respectively.
I researched the same thing a while ago and found a project called Liquibase. It is aimed at change tracking, but it also converts between different DBMSs. You can download the source code and see the data type conversions across databases. The source is on GitHub; browse the Java files there and you will probably find something helpful.
If all you want are basic operations, these are fairly universal. For instance:
SELECT
INSERT
DELETE
UPDATE
FROM
WHERE
JOIN
...are all at the most basic level the same across implementations.
However, the more complicated your scripts get, the more difficult it becomes to make them "universal". Things like aggregation, subqueries, cursors, while loops, functions, indexes, constraints, temp tables, variables, string manipulation, window operations etc. are all pretty much database-specific.
Some of these do have "universal" equivalents but the more generic you make your code the worse it will perform.

SQL: Update on join, in standards?

I know that most SQL server software allows you to do an "UPDATE on a JOIN", but I am wondering: is this in the SQL standard?
(i.e. can I assume that any software package allows this?)
Note: I am asking this because I am writing a database library that should be easily extensible to database software that is not included in the original build. As such, there's no point in answering with a remark such as "a, b, c and d all allow that - together they make up the lion's share of the market, so you can assume that all software packages allow that". No, I am interested in whether it is in the standard or not.
If I understand the question right, I think the answer is no, there is no standard "update based on a join". The postgres manual page for UPDATE includes this under "Compatibility":
This command conforms to the SQL standard, except that the FROM and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with UPDATE.
Some other database systems offer a FROM option in which the target table is supposed to be listed again within FROM. That is not how PostgreSQL interprets FROM. Be careful when porting applications that use this extension.
While this doesn't explicitly say there isn't, the Compatibility notes in that manual generally note when there is a related, but not identical, feature in the standard. What's more, the mention of other systems with different behaviour demonstrates that if there is a standard, you can't rely on it anyway.
According to the ANSI SQL-92 standard, an UPDATE on JOINed tables is NOT part of the standards; See http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt sections 13.9 and 13.10 (you'll have to search for 391, the page number).
I tried to find an ANSI 2003 standard, but the closest I came was here: www.wiscorp.com/sql_2003_standard.zip (a late draft). There was no substantial difference between the two in regards to the UPDATE statement and JOIN syntax.
Stu
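For what it's worth, the usual way to get the effect of a joined UPDATE without a join is a correlated subquery in SET plus an EXISTS guard in WHERE. The sketch below runs it on SQLite through Python's sqlite3 module only as a convenient test bed; the `products` and `price_changes` tables are made up for the example, and you should still verify the statement on every engine you intend to support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE price_changes (product_id INTEGER PRIMARY KEY, new_price INTEGER);
    INSERT INTO products VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO price_changes VALUES (1, 110), (3, 330);
    """
)

# No FROM or JOIN in the UPDATE itself: the second table is reached only
# through correlated subqueries. The EXISTS guard keeps rows without a
# matching change from being set to NULL.
PORTABLE_UPDATE = """
UPDATE products
SET price = (SELECT pc.new_price
             FROM price_changes pc
             WHERE pc.product_id = products.id)
WHERE EXISTS (SELECT 1
              FROM price_changes pc
              WHERE pc.product_id = products.id)
"""
conn.execute(PORTABLE_UPDATE)

rows = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()
print(rows)  # [(1, 110), (2, 200), (3, 330)]
```

The cost is that the lookup is written twice, and some optimizers handle this form worse than their own join-based extension.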
You're presuming that all software packages adhere to ANSI SQL standards... In reality, none of them that I'm aware of adhere completely to the standards.
If you're looking to adhere to ANSI SQL standards, the best place to start would be with the documented standards themselves. Here's the SQL-92 document:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Careful now, folks. Writing truly portable code is much more difficult than you would imagine, and you also have to be willing to give up a lot in the areas of performance, ease of coding/maintenance, and readability. Just declare and use one variable in, say, SQL Server and your code is no longer truly portable. Write an audit trigger and I can guarantee that your trigger won't be portable between Oracle, SQL Server, and several other popular engines. And it shouldn't really matter, because it's not actually rocket science in any RDBMS (well, except maybe for writing a joined UPDATE in Oracle without using MERGE {which is standard but not portable, yet}).
Also, don't forget there are two basic types of SQL. That which supports the single row nature of most front-end code and that of batch code. If you really want your batch code to perform well, you'll use many of the "proprietary extensions" to the database engine you're using to efficiently process sometimes billions of rows overnight... the same night. ;-)
Be careful when aiming at writing code for "true" portability. You might end up with a tangled mess that's a whole lot slower than you might have ever imagined.

Reasons for SQL differences

Why are SQL distributions so non-standard despite an ANSI standard existing for SQL? Are there really that many meaningful differences in the way SQL databases work or is it just the two databases with which I have been working: MS-SQL and PostgreSQL? Why do these differences arise?
The ANSI standard specifies only a limited set of commands and data types. Once you go beyond those, the implementors are on their own. And some very important concepts aren't specified at all, such as auto-incrementing columns. SQLite auto-assigns values to an INTEGER PRIMARY KEY column, MySQL requires AUTO_INCREMENT, PostgreSQL uses sequences, etc. It's a mess, and that's only among the OSS databases! Try getting Oracle, Microsoft, and IBM to collectively decide on a tricky bit of functionality.
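To show how far apart the auto-increment spellings are, here is a small sketch that keeps the column definition per dialect in a lookup table. The `users` table and the `create_table_sql` helper are invented for the example, the "standard" entry is the identity-column form from later revisions of the standard, and only the SQLite variant is executed (through Python's sqlite3 module).

```python
import sqlite3

# How an auto-numbered primary key column is declared in each dialect.
AUTO_ID_COLUMN = {
    "standard": "id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY",
    "sqlite": "id INTEGER PRIMARY KEY",
    "mysql": "id INTEGER AUTO_INCREMENT PRIMARY KEY",
    "postgresql": "id SERIAL PRIMARY KEY",
    "sqlserver": "id INTEGER IDENTITY(1,1) PRIMARY KEY",
}


def create_table_sql(dialect: str, table: str = "users") -> str:
    """Build a CREATE TABLE statement with a dialect-specific id column."""
    return f"CREATE TABLE {table} ({AUTO_ID_COLUMN[dialect]}, name VARCHAR(50))"


conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("sqlite"))
# The id column is omitted from the INSERT, so the engine assigns it.
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])

print(conn.execute("SELECT id, name FROM users ORDER BY id").fetchall())
# [(1, 'ann'), (2, 'bob')]
```

The INSERT stays the same across dialects; only the DDL needs the lookup, which is why schema definitions are usually where portability breaks first.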
It's a form of "Stealth lock-in". Joel goes into great detail here:
http://www.joelonsoftware.com/articles/fog0000000056.html
http://www.joelonsoftware.com/articles/fog0000000052.html
Companies end up tying their business functionality to non-standard or weird unsupported functionality in their implementation; this restricts their ability to move away from their vendor to a competitor.
On the other hand, it's pretty short-sighted because anyone with half a brain will tend to abstract away the proprietary pieces, or avoid the lock-in altogether, if it gets too egregious.
First, I don't find databases to be as bad as, say, browsers or operating systems in terms of incompatibility. Anyone with a few hours of training can start doing selects, inserts, deletes and updates on any SQL database. Meanwhile, it's difficult to write HTML that renders identically on every browser or write system code for more than one OS. Generally, differences in SQL are related to performance or fairly esoteric features. The major exception seems to be date formats and functions.
Second, database developers generally are motivated to add features that differentiate their product from everyone else. Products like Oracle, MS SQL Server and MySQL are vast ecosystems that rarely cross-pollinate in practice. At my workplace, we use Oracle and MySQL, but we could probably switch over to 100% Oracle in about a day if needed or desired. So I care a lot about the shiny toys Oracle gives us with each release, but I don't even know what version of MySQL we are using. IBM, Microsoft, PostgreSQL and the rest might as well not exist as far as we are concerned. Having the features to get and keep customers and users is far more important than compatibility in the database world. (That's the positive spin on the "lock-in" answer, I suppose.)
Third, there are legitimate reasons for different companies to implement SQL differently. For instance, Oracle has a multi-versioning system that allows very fast and scalable consistent reads. Other databases lack that feature, but usually are faster inserting rows and rolling back transactions. This is a fundamental difference in these systems. It doesn't make one better than the other (at least in the general case), just different. One should not be surprised if the SQL on top of a database engine takes advantage of its strengths and attempts to minimize its weaknesses. In fact, it would be irresponsible of the developers to not do this.
John: The standard actually covers lots of subjects, including identity columns, sequences, triggers, routines, upsert, etc. But of course, many of these standards-components may have been brought in place later than the first implementations; and this could be a reason why SQL standards compliance is somewhat low, generally.
Neall: There are actually areas where the SQL standard is ahead of the implementations. For example, it would be nice to have CREATE ASSERTION, but as far as I know, no DBMS implements assertions yet.
Personally, I believe that the closed nature of some ISO standards (like the SQL standard) is part of the problem: When a standard is not readily available online, it's less likely to be known by implementors/planners, and too few customers ask for compliance because they don't know what to ask for.
It's certainly effective lock-in, as 1800 says. But in fairness to the database vendors, the SQL standard is always playing catch-up to current databases' feature sets. Most databases we have today are of pretty ancient lineages. If you trace Microsoft SQL Server back to its roots, I think you'll find Ingres - one of the very first relational databases written in the '70s. And Postgres was originally written by some of the same people in the '80s as a successor to Ingres. Oracle goes way back, and I'm not sure where MySQL came in.
Database non-portability does suck, but it could be a lot worse.