Application-time period tables - sql-server-2016

I am implementing a bitemporal solution for a few of our tables, using the native temporal table features, and some custom columns and code to handle the application/valid time.
However, I just stumbled across a reference to something which is supposedly in the SQL:2011 standard:
From Wikipedia:
As of December 2011, ISO/IEC 9075, Database Language SQL:2011 Part 2:
SQL/Foundation included clauses in table definitions to define
"application-time period tables" (valid time tables),
"system-versioned tables" (transaction time tables) and
"system-versioned application-time period tables" (bitemporal tables)
This pdf actually has code to do this (application-time):
CREATE TABLE Emp(
ENo INTEGER,
EStart DATE,
EEnd DATE,
EDept INTEGER,
PERIOD FOR EPeriod (EStart, EEnd)
)
This code will not run in SSMS. Has something changed that makes this invalid SQL now? It looks like what used to be undocumented support for application-time/bitemporal tables has now been removed?

Just because it's in the standard doesn't mean it's in any particular implementation. Each vendor has a stretch goal of full standard coverage, but not one of them is there yet, and I doubt it will happen in my lifetime.
Currently SQL Server supports system time, but it does not support application time. There may be another vendor who does; I'm not sure, as I don't follow all the various RDBMS platforms as they mature. I know it's on the SQL Server radar but there have been no formal announcements to date.
The example in the PDF is just that: an example of what could be done by a platform that supports application time. The next example is this...
INSERT INTO Emp
VALUES (22217,
DATE ‘2010-01-01’,
DATE '2011-11-12', 3)
...which also isn't valid in SQL Server for more than one reason, and violates a few best practices to boot. Maybe this stuff is all valid in DB2, as you suggest, but the standard is not supposed to be vendor-specific. I mean, by definition, if nothing else.
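On an engine without PERIOD FOR, the valid-time columns from the question can still be approximated by hand. What follows is only a minimal sketch, using Python's built-in sqlite3 module as a stand-in engine (an assumption made so the example runs anywhere; it is not SQL Server). It uses two ordinary date columns, a CHECK constraint to keep each period well-formed, and a lookup for the row valid on a given day. The helper name dept_on is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Plain columns stand in for PERIOD FOR EPeriod (EStart, EEnd).
# The CHECK keeps each period non-empty; the period is treated as
# closed-open, i.e. EStart is included and EEnd is excluded.
conn.execute("""
    CREATE TABLE Emp (
        ENo    INTEGER NOT NULL,
        EStart TEXT    NOT NULL,   -- ISO-8601 dates compare correctly as text
        EEnd   TEXT    NOT NULL,
        EDept  INTEGER,
        CHECK (EStart < EEnd)
    )
""")
conn.execute(
    "INSERT INTO Emp (ENo, EStart, EEnd, EDept) VALUES (?, ?, ?, ?)",
    (22217, "2010-01-01", "2011-11-12", 3),
)


def dept_on(conn, eno, day):
    """Return the department valid for employee `eno` on `day`, or None."""
    row = conn.execute(
        "SELECT EDept FROM Emp WHERE ENo = ? AND EStart <= ? AND ? < EEnd",
        (eno, day, day),
    ).fetchone()
    return row[0] if row else None
```

Note that this sketch does nothing to stop two rows for the same employee from having overlapping periods; that is the kind of bookkeeping native application-time support is meant to take over, and here it would have to be enforced by custom code.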

IBM DB2 supports what you are asking about. Think of the SQL standard as a definition of the recommended way a vendor should expose a feature if they support it, well at least after SQL 92, which is kind of a core. In the history of SQL dialects, sometimes vendors get ahead of the standard and dialects diverge. A vendor would be kind of foolish to implement a feature in a non-standard way after it has been standardized, but sometimes they do. Hot on the left, cold on the right; that is a standard. It works the other way around, but people tend to get burned.
In this case, it looks like IBM decided to implement the feature and make their way of implementing it part of the standard in one fell swoop. Microsoft has not yet decided it is worth their trouble.

Related

JDBC SQL: Where is the detailed specification?

Everybody loves to mention how JDBC abstracts away vendor-specific differences between SQLs to present a single SQL flavor that would work against a whole slew of them.
But no book or reference on JDBC ever mentions a (detailed) specification or even a decent, user-space coverage of this SQL supported by (a specific version of) JDBC, say JDBC 4.1!
So, what ends up happening (at least with me) is that, if I'm working with MySQL, I must refer to the MySQL reference manual and then try to guard myself against accidentally using MySQL-specific features. For writing portable SQL (at least at the level supported by the JDBC driver version I'm using), I would prefer to refer to a JDBC spec or to an SQL spec directly instead of referring to MySQL, PostgreSQL, etc.
Is the SQL standard itself (2008, 2003, etc), on which a particular version of JDBC is based, freely available? Or, do I have to shell out $$ to get a copy?
There is no "JDBC SQL", just ISO SQL and the vendor implementations of it. JDBC defines the interface for talking to SQL databases; it's a different layer from the query language itself.
The reference for JDBC itself is the JSR documentation:
JDBC 4.0
JDBC 4.1
Unfortunately the official SQL standards are expensive and must be purchased from the ISO.
You can find late-stage drafts that're perfectly good for reference when you're not trying to develop a conforming implementation here among other places.
The SQL spec isn't the most friendly and readable of things, so in practice it's a good idea to use vendor documentation that's actually intended to be read by human beings. You can compare a couple of vendor docs or fall back on the standard doc when uncertainty arises.
Compliance with the spec isn't exactly ideal across DBs; writing code strictly to the spec doesn't necessarily mean it'll actually work. For example, at the time of writing, MySQL doesn't implement window functions or common table expressions, and PostgreSQL doesn't implement SQL/PSM (instead offering PL/pgSQL) or the CALL statement; most vendors use different ways of specifying auto-increment columns or sequence generators; and so on.
Please don't use the w3schools SQL guides, they're severely outdated, wrong, fail to differentiate between vendor extensions and the standard, and should generally be avoided. I mention them because w3schools tends to come up quite high in search rankings - back in the day they used to actually be useful.
You can download the JDBC 4.1 specification from http://download.oracle.com/otndocs/jcp/jdbc-4_1-mrel-spec/index.html but this only covers JDBC itself, not SQL. The specification is more a description of the interface; it does expect databases to support some level of the SQL standards, but when it comes to requirements on queries, don't expect to find more than a reference to the SQL standard.
You usually need to use the database-specific SQL anyway, because even though there is a SQL standard, database vendors don't implement it to the letter. JDBC itself defines some escapes to bridge the gaps, but as far as I know, they are hardly ever used. Drivers also - usually - don't translate standard SQL to database-specific SQL if the database doesn't support the standard SQL.
If you want to look at the official SQL standard, you need to buy it from ISO or your country-specific ISO representative. That said, with some searching you can find and download draft versions of the specification for free. I am not sure how helpful that is though, as the SQL standard documents are not intended as a reference manual; they are meant to be a formal description, and they go really deep into details that are only relevant to an implementer.

Does a version control database storage engine exist?

I was just wondering if a storage engine type existed that allowed you to do version control on row level contents. For instance, if I have a simple table with ID, name, value, and ID is the PK, I could see that row 354 started as (354, "zak", "test")v1 then was updated to be (354, "zak", "this is version 2 of the value")v2 , and could see a change history on the row with something like select history (value) where ID = 354.
It's kind of an esoteric thing, but it would beat having to keep writing these separate history tables and functions every time a change is made...
It seems you are looking more for auditing features. Oracle and several other DBMS have full auditing features. But many DBAs still end up implementing trigger based row auditing. It all depends on your needs.
Oracle supports several granularities of auditing that are easy to configure from the command line.
I see you tagged this as MySQL, but asked about any storage engine. Anyway, other answers are saying the same thing, so I'm going to delete this post, as originally it was about the flashback features.
Obviously you are really after a MySQL solution, so this probably won't help you much, but Oracle has a feature called Total Recall (more formally Flashback Archive) which automates the process you are currently hand-rolling. The Archive is a set of compressed tables which are populated with changes automatically, and queryable with a simple AS OF syntax.
Naturally being Oracle they charge for it: it needs an additional license on top of the Enterprise Edition, alas. Find out more (PDF).
Oracle and SQL Server both call this feature Change Data Capture. There is no equivalent for MySQL at this time.
You can achieve similar behavior with triggers (search for "triggers to catch all database changes") - particularly if they implement SQL92 INFORMATION_SCHEMA.
Otherwise I'd agree with mrjoltcola
Edit: The only gotcha I'd mention with MySQL and triggers is that (as of the latest community version I downloaded) it requires the user account have the SUPER privilege, which can make things a little ugly
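The trigger approach mentioned above can be sketched roughly as follows. This is a minimal illustration run against SQLite through Python's built-in sqlite3 module (an assumption made so it is self-contained; MySQL's trigger syntax and privileges differ). The table, trigger, and helper names (items, items_history, items_keep_history, value_history) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        value TEXT
    );

    -- One row per superseded version of an items row.
    CREATE TABLE items_history (
        id         INTEGER,
        name       TEXT,
        value      TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- Copy the previous version into the history table on every update.
    CREATE TRIGGER items_keep_history
    AFTER UPDATE ON items
    FOR EACH ROW
    BEGIN
        INSERT INTO items_history (id, name, value)
        VALUES (OLD.id, OLD.name, OLD.value);
    END;
""")

conn.execute("INSERT INTO items VALUES (354, 'zak', 'test')")
conn.execute(
    "UPDATE items SET value = 'this is version 2 of the value' WHERE id = 354"
)


def value_history(conn, item_id):
    """Return every value the row has held, oldest first, current value last."""
    old = [r[0] for r in conn.execute(
        "SELECT value FROM items_history WHERE id = ? ORDER BY rowid",
        (item_id,),
    )]
    current = [r[0] for r in conn.execute(
        "SELECT value FROM items WHERE id = ?", (item_id,)
    )]
    return old + current
```

Calling value_history(conn, 354) plays the role of the "select history (value) where ID = 354" wished for in the question. The history table and trigger still have to be written once per table, so this reduces the hand-rolled work rather than removing it.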
CouchDB has full versioning for every change made, but it is part of the NOSQL world, so would probably be a pretty crazy shift from what you are currently doing.
The Wikipedia article on Google's Bigtable mentions that it allows versioning by adding a time dimension to the tables:
Each table has multiple dimensions (one of which is a field for time, allowing versioning).
There are also links there to several non-Google implementations of a Bigtable-type DBMS.
I think Bigtable, the Google DB engine, does something like that: it associates a timestamp with every update of a row.
Maybe you can try Google App Engine.
There is a Google paper explaining how Big Table works.
The book Refactoring Databases has some insights on the matter.
But it also points out there is no real solution currently, other than carefully making changes and managing them manually.
One approximation to this is a temporal database - which allows you to see the status of the whole database at different times in the past. I'm not sure that wholly answers your question though; it would not allow you to see the contents of Row 1 at time t1 while simultaneously letting you look at the contents of Row 2 at a separate time t2.
"It's kind of an esoteric thing, but it would beat having to keep writing these separate history tables and functions every time a change is made..."
I wouldn't call audit trails (which is obviously what you're talking of) an "esoteric thing" ...
And : there is still a difference between the history of database updates, and the history of reality. Historical database tables should really be used to reflect the history of reality, NOT the history of database updates.
The history of database updates is already kept by the DBMS in its logs and journals. If someone needs to inquire into the history of database updates, then he/she should really resort to the logs and journals, not to any kind of application-level construct that can NEVER provide sufficient guarantee that it reflects ALL updates.

SQL: Update on join, in standards?

I know that most SQL server software allows you to do "an update on a join", but I am wondering, is this in the SQL standards?
(eg. can I assume that any software package allows this?)
Note: I am asking this because I am writing a database library that should be easily extensible to database software that is not included in the original build. As such there's no point in answering with a remark such as "a, b, c and d all allow that - together they make up the lion's share of the market, so you can assume that all software packages allow that". No, I am interested in whether it is in the standards or not.
If I understand the question right, I think the answer is no, there is no standard "update based on a join". The postgres manual page for UPDATE includes this under "Compatibility":
This command conforms to the SQL standard, except that the FROM and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with UPDATE.
Some other database systems offer a FROM option in which the target table is supposed to be listed again within FROM. That is not how PostgreSQL interprets FROM. Be careful when porting applications that use this extension.
While this doesn't explicitly say there isn't, the Compatibility notes in that manual generally note when there is a related, but not identical, feature in the standard. What's more, the mention of other systems with different behaviour demonstrates that if there is a standard, you can't rely on it anyway.
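A commonly used way to get the effect of an update on a join without FROM or JOIN in the UPDATE statement is a correlated subquery. The following is a minimal sketch, assuming SQLite via Python's built-in sqlite3 module as the test engine and two hypothetical tables, product and new_price. It is offered as an illustration of the shape of such a statement, not as a guarantee that every engine accepts it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE new_price (product_id INTEGER PRIMARY KEY, price INTEGER);

    INSERT INTO product VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO new_price VALUES (1, 110), (3, 330);
""")

# No FROM or JOIN in the UPDATE itself: the other table is reached through
# a correlated scalar subquery, and EXISTS limits the update to rows that
# actually have a match (otherwise unmatched rows would be set to NULL).
conn.execute("""
    UPDATE product
       SET price = (SELECT np.price
                      FROM new_price np
                     WHERE np.product_id = product.id)
     WHERE EXISTS (SELECT 1
                     FROM new_price np
                    WHERE np.product_id = product.id)
""")

prices = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```

Product 2 has no row in new_price, so the EXISTS guard leaves it untouched. The cost is that the join condition is written twice, which is part of why vendors offer their own shorter extensions.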
According to the ANSI SQL-92 standard, an UPDATE on JOINed tables is NOT part of the standards; See http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt sections 13.9 and 13.10 (you'll have to search for 391, the page number).
I tried to find an ANSI 2003 standard, but the closest I came was here: www.wiscorp.com/sql_2003_standard.zip (a late draft). There was no substantial difference between the two in regards to the UPDATE statement and JOIN syntax.
Stu
You're presuming that all software packages adhere to ANSI SQL Standards.....in reality, none of them that I'm aware of adhere completely to the standards.
If you're looking to adhere to ANSI SQL standards, the best place to start would be with the documented standards themselves. Here's the SQL-92 document:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Careful now, folks. Writing truly portable code is much more difficult than you would imagine and you also have to be willing to give up a lot in the areas of performance, ease of coding/maintenance, and readability. Just declare and use one variable in, say, SQL Server and your code is no longer truly portable. Write an audit trigger and I can guarantee that your trigger won't be portable between Oracle, SQL Server, and several other popular engines. And, it shouldn't really matter because it's not actually rocket science in any RDBMS (well, except maybe for writing a joined UPDATE in Oracle without using MERGE {which is standard but not portable, yet}).
Also, don't forget there are two basic types of SQL. That which supports the single row nature of most front-end code and that of batch code. If you really want your batch code to perform well, you'll use many of the "proprietary extensions" to the database engine you're using to efficiently process sometimes billions of rows overnight... the same night. ;-)
Be careful when aiming at writing code for "true" portability. You might end up with a tangled mess that's a whole lot slower than you might have ever imagined.

How Important is SQL Portability?

It seems to me, from both personal experience and SO questions and answers, that SQL implementations vary substantially. One of the first issues for SQL questions is: What dbms are you using?
In most cases with SQL there are several ways to structure a given query, even using the same dialect. But I find it interesting that the relative portability of various approaches is frequently not discussed, nor valued very highly when it is.
But even disregarding the likelihood that any given application may or not be subject to conversion, I'd think that we would prefer that our skills, habits, and patterns be as portable as possible.
In your work with SQL, how strongly do you prefer standard SQL syntax? How actively do you eschew proprietary variations? Please answer without reference to proprietary preferences for the purpose of perceived better performance, which most would concede is usually a sufficiently legitimate defense.
I vote against standard/vendor independent sql
- The database is only seldom actually switched.
- There is no single database that fully conforms to the current SQL standard. So even when you conform to the standard, you are not vendor independent.
- Vendor differences go beyond SQL syntax. Locking behaviour is different. Isolation levels are different.
- Database testing is pretty tough and underdeveloped. No need to make it even harder by throwing multiple vendors into the game if you don't absolutely need it.
- There is a lot of power in the vendor-specific tweaks (think 'limit', 'analytic functions', or 'hints').
So the quintessence:
- If there is no requirement for vendor independence, get specialised for the vendor you are actually using.
- If there is a requirement for vendor independence, make sure that whoever pays the bill knows that this will cost money. Make sure you have every single RDBMS available for testing. And use it too.
- Put every piece of sql in a special layer, which is pluggable, so you can use the power of the database AND work with different vendors
- Only where the difference is a pure question of syntax go with the standard, e.g. using the oracle notation for (outer) joins vs the ANSI standard syntax.
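The "pluggable layer" idea from the list above might look something like the following. This is a rough sketch under stated assumptions: the function name first_names_sql, the person table, and the dialect keys are all hypothetical, and only the SQLite variant is actually executed here (through Python's built-in sqlite3 module). The other strings are included only to show how the same logical query is spelled differently.

```python
import sqlite3


def first_names_sql(dialect, n):
    """Return the dialect-specific SQL text for 'the first n names'."""
    n = int(n)  # only a validated integer is ever placed into the SQL text
    templates = {
        "sqlite":    "SELECT name FROM person ORDER BY name LIMIT {n}",
        "mysql":     "SELECT name FROM person ORDER BY name LIMIT {n}",
        "sqlserver": "SELECT TOP ({n}) name FROM person ORDER BY name",
        "standard":  "SELECT name FROM person ORDER BY name "
                     "FETCH FIRST {n} ROWS ONLY",
    }
    return templates[dialect].format(n=n)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany(
    "INSERT INTO person (name) VALUES (?)",
    [("Cid",), ("Ann",), ("Bob",)],
)

# Application code asks the layer for the query; it never spells out
# LIMIT or TOP itself.
rows = [r[0] for r in conn.execute(first_names_sql("sqlite", 2))]
```

Adding a vendor then means adding one entry per query in the layer rather than hunting through application code, which is the point of keeping every piece of SQL in one place.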
We take it very seriously at our shop. We do not allow non-standard SQL or extensions unless they're supported on ALL of the major platforms. Even then, they're flagged within the code as non-standard and justifications are necessary.
It is not up to the application developer to make their queries run fast; we have a clear separation of duties. The query is to be optimized only by the DBMS itself or the DBA's tuning of the DBMS.
Real databases, like DB2/z :-), process standard SQL plenty fast.
The reason we enforce this is to give the customer choice. They don't like the idea of being locked into a specific vendor any more than we do.
In my experience, query portability turns out to be not so important. We work with various data sources (mainly MSSQL and MySQL), but we know which data is stored where and can optimize accordingly. Since we control the systems, we decide when - if ever - structures are moved and queries need to be rewritten.
I also like to use certain other server-specific functionality, such as query notification in SQL Server, which MySQL doesn't offer. So there, again, we use it when we can and don't worry about portability.
Furthermore, parts of our apps need to query schema information and act on it. Here, again, we have server-specific code for the different systems, instead of trying to restrict ourselves to the lowest common denominator.
There is no clear answer whether SQL portability is desirable or not - it really depends a lot on the situation, such as the type of application.
If the application is going to be a service - ie there will only ever be you hosting it, then obviously nobody but you will care whether your SQL is portable enough, so you could safely ignore it as long as you have no specific plans to drop support for your current platform.
If the application is going to be installed at a number of sites, which each have their own established database systems, obviously SQL portability is very important to people. It lets you widen your potential market, and may give a bit of peace of mind to clients who are on the fence in regards to their database system. Whether you want to support that, or you are happy selling only to, for instance, Oracle customers, or only to MySQL/PostgreSQL customers, for example, is up to you and what you think your market is.
If you are coding in PHP, then the vast majority of your potential customers are probably going to expect MySQL. If so, then it's not a big deal to assume MySQL. Or similarly if you are in C#/.NET then you could assume Microsoft SQL Server. Again, however, there is a flip side because there may exist a small but less competitive market of PHP or .NET users who want to connect to other database systems than the usual.
So I would largely regard this as a market research question, unless as in my first example you are providing a hosted service where it doesn't matter to users, in which case it is for your own convenience only.

Reasons for SQL differences

Why are SQL distributions so non-standard despite an ANSI standard existing for SQL? Are there really that many meaningful differences in the way SQL databases work or is it just the two databases with which I have been working: MS-SQL and PostgreSQL? Why do these differences arise?
The ANSI standard specifies only a limited set of commands and data types. Once you go beyond those, the implementors are on their own. And some very important concepts aren't specified at all, such as auto-incrementing columns. SQLite auto-assigns a value to an INTEGER PRIMARY KEY column, MySQL requires AUTO_INCREMENT, PostgreSQL uses sequences, etc. It's a mess, and that's only among the OSS databases! Try getting Oracle, Microsoft, and IBM to collectively decide on a tricky bit of functionality.
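To make the auto-increment differences concrete, here is a small side-by-side sketch. The table name note and the dictionary CREATE_NOTE are hypothetical, and only the SQLite statement is executed (via Python's built-in sqlite3 module); the other strings are shown purely for comparison and are not run here.

```python
import sqlite3

# The same logical table, spelled per dialect. Only the SQLite entry is
# executed below; the rest are listed for comparison.
CREATE_NOTE = {
    "standard": (
        "CREATE TABLE note ("
        "id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY, "
        "body VARCHAR(100))"
    ),
    "sqlite": (
        "CREATE TABLE note (id INTEGER PRIMARY KEY, body VARCHAR(100))"
    ),
    "mysql": (
        "CREATE TABLE note ("
        "id INTEGER AUTO_INCREMENT PRIMARY KEY, body VARCHAR(100))"
    ),
    "postgresql": (
        "CREATE TABLE note (id SERIAL PRIMARY KEY, body VARCHAR(100))"
    ),
    "sqlserver": (
        "CREATE TABLE note ("
        "id INTEGER IDENTITY(1,1) PRIMARY KEY, body VARCHAR(100))"
    ),
}

conn = sqlite3.connect(":memory:")
conn.execute(CREATE_NOTE["sqlite"])

# The id column is omitted from the INSERT; the engine assigns it.
first_id = conn.execute("INSERT INTO note (body) VALUES ('first')").lastrowid
second_id = conn.execute("INSERT INTO note (body) VALUES ('second')").lastrowid
```

The INSERT statements are the same everywhere; it is the DDL, and the way the generated key is read back, that differ from one engine to the next.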
It's a form of "Stealth lock-in". Joel goes into great detail here:
http://www.joelonsoftware.com/articles/fog0000000056.html
http://www.joelonsoftware.com/articles/fog0000000052.html
Companies end up tying their business functionality to non-standard or weird unsupported functionality in their implementation, this restricts their ability to move away from their vendor to a competitor.
On the other hand, it's pretty short-sighted because anyone with half a brain will tend to abstract away the proprietary pieces, or avoid the lock-in altogether, if it gets too egregious.
First, I don't find databases to be as bad as, say, browsers or operating systems in terms of incompatibility. Anyone with a few hours of training can start doing selects, inserts, deletes and updates on any SQL database. Meanwhile, it's difficult to write HTML that renders identically on every browser or write system code for more than one OS. Generally, differences in SQL are related to performance or fairly esoteric features. The major exception seems to be date formats and functions.
Second, database developers generally are motivated to add features that differentiate their product from everyone else. Products like Oracle, MS SQL Server and MySQL are vast ecosystems that rarely cross-pollinate in practice. At my workplace, we use Oracle and MySQL, but we could probably switch over to 100% Oracle in about a day if needed or desired. So I care a lot about the shiny toys Oracle gives us with each release, but I don't even know what version of MySQL we are using. IBM, Microsoft, PostgreSQL and the rest might as well not exist as far as we are concerned. Having the features to get and keep customers and users is far more important than compatibility in the database world. (That's the positive spin on the "lock-in" answer, I suppose.)
Third, there are legitimate reasons for different companies to implement SQL differently. For instance, Oracle has a multi-versioning system that allows very fast and scalable consistent reads. Other databases lack that feature, but usually are faster inserting rows and rolling back transactions. This is a fundamental difference in these systems. It doesn't make one better than the other (at least in the general case), just different. One should not be surprised if the SQL on top of a database engine takes advantage of its strengths and attempts to minimize its weaknesses. In fact, it would be irresponsible of the developers to not do this.
John: The standard actually covers lots of subjects, including identity columns, sequences, triggers, routines, upsert, etc. But of course, many of these standards-components may have been brought in place later than the first implementations; and this could be a reason why SQL standards compliance is somewhat low, generally.
Neall: There are actually areas where the SQL standard is ahead of the implementations. For example, it would be nice to have CREATE ASSERTION, but as far as I know, no DBMS implements assertions yet.
Personally, I believe that the closed nature of some ISO standards (like the SQL standard) is part of the problem: When a standard is not readily available online, it's less likely to be known by implementors/planners, and too few customers ask for compliance because they don't know what to ask for.
It's certainly effective lock-in, as 1800 says. But in fairness to the database vendors, the SQL standard is always playing catch-up to current databases' feature sets. Most databases we have today are of pretty ancient lineages. If you trace Microsoft SQL Server back to its roots, I think you'll find Ingres - one of the very first relational databases written in the '70s. And Postgres was originally written by some of the same people in the '80s as a successor to Ingres. Oracle goes way back, and I'm not sure where MySQL came in.
Database non-portability does suck, but it could be a lot worse.