I'm used to writing lines of code and having them execute in order, from up to down. I believe this is how nearly most programming languages work.
In SQL, if I understand it correctly, one first must create a table by performing some table operations (FROM to choose the starting tables, JOIN to combine them in various ways, WHERE to filter out some rows, GROUP BY to filter our duplicates, etc.).
Once you have your outputted table, you use SELECT to perform one final table operation: eliminate the undesired fields.
So I would expect FROM to be the first line of code, and SELECT to be the very last line of code. But SELECT is always first for some reason...
Is there an explanation as to why SQL commands are written in reverse?
SQL has a standard and that standard specifies the ordering.
You are missing something fundamental about SQL: it is a descriptive language, not a procedural language. That is, a SQL query describes the result set being produced, not the steps that must be taken to produce those results.
So, there is no "execution order" for the SQL clause. All taken together describe the results.
That said, there might be notes going back to the original development of SQL. My best guess is that the first thing one sees on a tabular report are the column headings, and that is why the SELECT is first. I appreciate the elegance of having FROM first (because FROM defines the table and column aliases used throughout the query). But that simply is not how the language is defined.
This is the structure of syntax used by SQL, like semantics of any programming language.
SELECT is basically used where we need to filter and fetch particular data. Then, FROM is used which refers to table.
It's used in normal SQL queries also for the same purpose.
In terms of procedure we use SELECT as below:
CREATE PROCEDURE
SelectAllStudent
As
SELECT * FROM students
GO
And then we execute procedure as:
EXEC SelectAllStudent
Is it possible to have Postgres reject queries which use its proprietary extensions to the SQL language?
e.g. select a::int from b; should throw an error, forcing the use of proper casts as in select cast(a as int) from b;
Perhaps more to the point is the question of whether it is possible to write SQL that is supported by all RDBMS with the same resulting behaviour?
PostgreSQL has no such feature. Even if it did, it wouldn't help you tons because interpretations of the SQL standard vary, support for standard syntax and features vary, and some DBs are relaxed about restrictions that others enforce or have limitations others don't. Syntax is the least of your problems.
The only reliable way to write cross-DB portable SQL is to test that SQL on every target database as part of an automated test suite. And to swear a lot.
In many places the query parser/rewriter transforms the standard "spelling" of a query into the PostgreSQL internal form, which will be emitted on dump/reload. In particular, PostgreSQL doesn't store the raw source code for things like views, check constraint expressions, index expressions, etc. It stores the internal parse tree, and reconstructs the source from that when it's asked to dump or display the object.
For example:
regress=> CREATE TABLE sometable ( x varchar(100) );
CREATE TABLE
regress=> CREATE VIEW someview AS SELECT CAST (x AS integer) FROM sometable;
CREATE VIEW
regress=> SELECT pg_get_viewdef('someview');
pg_get_viewdef
-------------------------------------
SELECT (sometable.x)::integer AS x
FROM sometable;
(1 row)
It'd be pretty useless anyway, since the standard fails to specify some pretty common and important pieces of functionality and often has rather ambiguous specifications of things it does define. Until recently it didn't define a way to limit the number of rows returned by a query, for example, so every database had its own different syntax (TOP, LIMIT / OFFSET, etc).
Other things the standard specifies are not implemented by most vendors, so using them is pretty pointless. Good luck using the SQL-standard generated and identity columns across all DB vendors.
It'd be quite nice to have a "prefer standard spelling" dump mode, that used CAST instead of ::, etc, but it's really not simple to do because some transformations aren't 1:1 reversible, e.g.:
regress=> CREATE VIEW v AS SELECT '1234' SIMILAR TO '%23%';
CREATE VIEW
regress=> SELECT pg_get_viewdef('v');
SELECT ('1234'::text ~ similar_escape('%23%'::text, NULL::text));
or:
regress=> CREATE VIEW v2 AS SELECT extract(dow FROM current_date);
CREATE VIEW
regress=> SELECT pg_get_viewdef('v2');
SELECT date_part('dow'::text, ('now'::text)::date) AS date_part;
so you see that significant changes would need to be made to how PostgreSQL internally represents and works with functions and expressions before what you want would be possible.
Lots of the SQL standard stuff uses funky one-off syntax that PostgreSQL converts into function calls and casts during parsing, so it doesn't have to add special case features every time the SQL committe have another brain-fart and pull some new creative bit of syntax out of ... somewhere. Changing that would require adding tons of new expression node types and general mess, all for no real gain.
Perhaps more to the point is the question of whether it is possible to
write SQL that is supported by all RDBMS with the same resulting
behaviour?
No, not even for many simple statments..
select top 10 ... -- tsql
select ... limit 10 -- everyone else
many more examples exist. Use an orm or something similar if you want to insulate yourself from database choice.
If you do write sql by hand, then trying to follow the SQL standard is always a good choice :-)
You could use a tool like Mimer's SQL Validator to validate that queries follow the SQL spec before running them:
http://developer.mimer.com/validator/parser92/index.tml
You could force users to write queries in HQL or JPQL, which would then get translated in to the correct SQL dialect for your database.
Using Delphi I need to create a class which will contain specific table structure (without data), including all fields, constraints, foreign keys, indexes. The goal is having "standard" table, compare them and find differences. This thing ought to be included into my big project so I can't use any "outer" comparators. Furthermore, this functionality might be extended, so I need to have my own realization. The question is how can I retrieve this information, having connection string and knowing specific table name. SQL Server 2008 is being used.
If you look at Delphi source, it's done this way :
Select * from Table where 1=2
Update:
Metadata can be retrieved with Information Schema Views , for example constraints :
SELECT * FROM databaseName.INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
Where TABLE_NAME='tableName'
Been a very long time since I touched Delphi, but I do remember a couple things I used to do. Like
select top 0 * from table
returns 0 records but the TQuery is 'populated' with the meta data. Or I think on a TClientDataSet you can set the rows to -1, which has the same effect.
As I said, it's been a long time since I tinkered in Delphi and I was using the BDE and not native clients so this could all be useless info.
Hope this helps a bit though.
I have lately learned what is dynamic sql and one of the most interesting features of it to me is that we can use dynamic columns names and tables. But I cannot think about useful real life examples. The only one that came into my mind is statistical table.
Let`s say that we have table with name, type and created_data. Then we want to have a table that in columns are years from created_data column and in row type and number of names created in years. (sorry for my English)
What can be other useful real life examples of using dynamic sql with column and table as parameters? How do you use it?
Thanks for any suggestions and help :)
regards
Gabe
/edit
Thx for replies, I am particulary interested in examples that do not contain administrative things or database convertion or something like that, I am looking for examples where the code in example java is more complicated than using a dynamic sql in for example stored procedure.
An example of dynamic SQL is to fix a broken schema and make it more usable.
For example if you have hundreds of users and someone originally decided to create a new table for each user, you might want to redesign the database to have only one table. Then you'd need to migrate all the existing data to this new system.
You can query the information schema for table names with a certain naming pattern or containing certain columns then use dynamic SQL to select all the data from each of those tables then put it into a single table.
INSERT INTO users (name, col1, col2)
SELECT 'foo', col1, col2 FROM user_foo
UNION ALL
SELECT 'bar', col1, col2 FROM user_bar
UNION ALL
...
Then hopefully after doing this once you will never need to touch dynamic SQL again.
Long-long ago I have worked with appliaction where users uses their own tables in common database.
Imagine, each user can create their own table in database from UI. To get the access to data from these tables, developer needs to use the dynamic SQL.
I once had to write an Excel import where the excel sheet was not like a csv file but layed out like a matrix. So I had to deal with a unknown number of columns for 3 temporary tables (columns, rows, "infield"). The rows were also a short form of tree. Sounds weird, but was a fun to do.
In SQL Server there was no chance to handle this without dynamic SQL.
Another example from a situation I recently came up against. A MySQL database of about 250 tables, all in MyISAM engine and no database design schema, chart or other explanation at all - well, except the not so helpful table and column names.
To plan for conversion to InnoDB and find possible foreign keys, we either had to manually check all queries (and the conditions used in JOIN and WHERE clauses) created from the web frontend code or make a script that uses dynamic SQL and checks all combinations of columns with compatible datatype and compares the data stored in those columns combinations (and then manually accept or reject these possibilities).
Dealing with SQL shows us some limitations and gives us an opportunity to imagine what could be.
Which improvements to SQL are you waiting for? Which would you put on top of the wish list?
I think it can be nice if you post in your answer the database your feature request lacks.
T-SQL Specific: A decent way to select from a result set returned by a stored procedure that doesn't involve putting it into a temporary table or using some obscure function.
SELECT * FROM EXEC [master].[dbo].[xp_readerrorlog]
I know it's wildly unrealistic, but I wish they'd make the syntax of INSERT and UPDATE consistent. Talk about gratuitous non-orthogonality.
Operator to manage range of dates (or numbers):
where interval(date0, date1) intersects interval(date3, date4)
EDIT: Date or numbers, of course are the same.
EDIT 2: It seems Oracle have something to go, the undocumented OVERLAPS predicate. More info here.
A decent way of walking a tree with hierarchical data. Oracle has CONNECT BY but the simple and common structure of storing an object and a self-referential join back to the table for 'parent' is hard to query in a natural way.
More SQL Server than SQL but better integration with Source Control. Preferably SVN rather than VSS.
Implicit joins or what it should be called (That is, predefined views bound to the table definition)
SELECT CUSTOMERID, SUM(C.ORDERS.LINES.VALUE) FROM
CUSTOMER C
A redesign of the whole GROUP BY thing so that every expression in the SELECT clause doesn't have to be repeated in the GROUP BY clause
Some support for let expressions or otherwise more legal places to use an alias, a bit related to the GROUP BY thing, but I find other times what I just hate Oracle for forcing me to use an outer select just to reference a big expression by alias.
I would like to see the ability to use Regular Expressions in string handling.
A way of dynamically specifying columns/tables without having to resort to full dynamic sql that executes in another context.
Ability to define columns based on other columns ad infinitum (including disambiguation).
This is a contrived example and not a real world case, but I think you'll see where I'm going:
SELECT LTRIM(t1.a) AS [a.new]
,REPLICATE(' ', 20 - LEN([a.new])) + [a.new] AS [a.conformed]
,LEN([a.conformed]) as [a.length]
FROM t1
INNER JOIN TABLE t2
ON [a.new] = t2.a
ORDER BY [a.new]
instead of:
SELECT LTRIM(t1.a) AS [a.new]
,REPLICATE(' ', 20 - LEN(LTRIM(t1.a))) + LTRIM(t1.a) AS [a.conformed]
,LEN(REPLICATE(' ', 20 - LEN(LTRIM(t1.a))) + LTRIM(t1.a)) as [a.length]
FROM t1
INNER JOIN TABLE t2
ON LTRIM(t1.a) = t2.a
ORDER BY LTRIM(t1.a)
Right now, in SQL Server 2005 and up, I would use a CTE and build up in successive layers.
I'd like the vendors to actually standardise their SQL. They're all guilty of it. The LIMIT/OFFSET clause from MySQL and PostGresql is a good solution that no-one else appears to do. Oracle has it's own syntax for explicit JOINs whilst everyone else uses ANSI-92. MySQL should deprecate the CONCAT() function and use || like everyone else. And there are numerous clauses and statements that are outside the standard that could be wider spread. MySQL's REPLACE is a good example. There's more, with issues about casting and comparing types, quirks of column types, sequences, etc etc etc.
parameterized order by, as in:
select * from tableA order by #columName
Support in SQL to specify if you want your query plan to be optimized to return the first rows quickly, or all rows quickly.
Oracle has the concept of FIRST_ROWS hint, but a standard approach in the language would be useful.
Automatic denormalization.
But I may be dreaming.
Improved pivot tables. I'd like to tell it to automatically create the columns based on the keys found in the data.
On my wish list is a database supporting sub-queries in CHECK-constraints, without having to rely on materialized view tricks. And a database which supports the SQL standard's "assertions", i.e. constraints which may span more than one table.
Something else: A metadata-related function which would return the possible values of a given column, if the set of possible values is low. I.e., if a column has a foreign key to another column, it would return the existing values in the column being referred to. Of if the column has a CHECK-constraint like "CHECK foo IN(1,2,3)", it would return 1,2,3. This would make it easier to create GUI elements based on a table schema: If the function returned a list of two values, the programmer could decide that a radio button widget would be relevant - or if the function returned - e.g. - 10 values, the application showed a dropdown-widget instead. Etc.
UPSERT or MERGE in PostgreSQL. It's the one feature whose absence just boggles my mind. Postgres has everything else; why can't they get their act together and implement it, even in limited form?
Check constraints with subqueries, I mean something like:
CHECK ( 1 > (SELECT COUNT(*) FROM TABLE WHERE A = COLUMN))
These are all MS Sql Server/T-SQL specific:
"Natural" joins based on an existing Foreign Key relationship.
Easily use a stored proc result as a resultset
Some other loop construct besides while
Unique constraints across non NULL values
EXCEPT, IN, ALL clauses instead of LEFT|RIGHT JOIN WHERE x IS [NOT] NULL
Schema bound stored proc (to ease #2)
Relationships, schema bound views, etc. across multiple databases
WITH clause for other statements other than SELECT, it means for UPDATE and DELETE.
For instance:
WITH table as (
SELECT ...
)
DELETE from table2 where not exists (SELECT ...)
Something which I call REFERENCE JOIN. It joins two tables together by implicitly using the FOREIGN KEY...REFERENCES constraint between them.
A relational algebra DIVIDE operator. I hate always having to re-think how to do all elements of table a that are in all of given from table B.
http://www.tc.umn.edu/~hause011/code/SQLexample.txt
String Agregation on Group by (In Oracle is possible with this trick):
SELECT deptno, string_agg(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
More OOP features:
stored procedures and user functions
CREATE PROCEDURE tablename.spname ( params ) AS ...
called via
EXECUTE spname
FROM tablename
WHERE conditions
ORDER BY
which implicitly passes a cursor or a current record to the SP. (similar to inserted and deleted pseudo-tables)
table definitions with inheritance
table definition as derived from base table, inheriting common columns etc
Btw, this is not necessarily real OOP, but only syntactic sugar on existing technology, but it would simplify development a lot.
Abstract tables and sub-classing
create abstract table person
(
id primary key,
name varchar(50)
);
create table concretePerson extends person
(
birth date,
death date
);
create table fictionalCharacter extends person
(
creator int references concretePerson.id
);
Increased temporal database support in Sql Server. Intervals, overlaps, etc.
Increased OVER support in Sql Server, including LAG, LEAD, and TOP.
Arrays
I'm not sure what's holding this back but lack of arrays lead to temp tables and related mess.
Some kind of UPGRADE table which allows to make changes on the table to be like the given:
CREATE OR UPGRADE TABLE
(
a VARCHAR,
---
)
My wish list (for SQLServer)
Ability to store/use multiple execution plans for a stored procedure concurrently and have the system automatically understand the best stored plan to use at each execution.
Currently theres one plan - if it is no longer optimal its used anyway or a brand new one is computed in its place.
Native UTF-8 storage
Database mirroring with more than one standby server and the ability to use a recovery model approaching 'simple' provided of course all servers are up and the transaction commits everywhere.
PCRE in replace functions
Some clever way of reusing fragments of large sql queries, stored match conditions, select conditions...etc. Similiar to functions but actually implemented more like preprocessor macros.
Comments for check constraints. With this feature, an application (or the database itself when raising an error) can query the metadata and retrieve that comment to show it to the user.
Automated dba notification in the case where the optimizer generates a plan different that the plan that that the query was tested with.
In other words, every query can be registered. At that time, the plan is saved. Later when the query is executed, if there is a change to the plan, the dba receives a notice, that something unexpected occurred.