SQL Order of processing - sql

Consider I have a query like
select * from A
Except
select * from B
union all
select * from B
except
select * from A
Query is processed like
select *
from
(
select * from A
Except
select * from B
) a
union all
(
select * from B
Except
select * from A
) b
How is the order of processing defined in sql. Will it process like this at any case
select * from A
Except
select * from
(
select * from B
union all
select * from B
) a
except
select * from A

EXCEPT and UNION are processed "left to right". Meaning that without any parenthesis to make the determination they will process in the order they appear in the sql.
https://msdn.microsoft.com/en-us/library/ms188055.aspx

UNION and EXCEPT have equal precedence but will bind from left to right, meaning they are evaluated in "left to right" order, as they are processed.
From #SeanLange's URL (TL;DR), worth taking note of:
If EXCEPT or INTERSECT is used together with other operators in an
expression, it is evaluated in the context of the following
precedence:
Expressions in parentheses
The INTERSECT operator
EXCEPT and UNION evaluated from left to right based on their position in the
expression

Related

SQL statement to return non-intersection records

I was recently asked this question and was a little stumped so I want to ask the experts...
Given two tables A & B, I want to return all the values from A and B that do not overlap. Think of two overlapping circles; how do we return all the data that is NOT in the overlapping center section? And, I had to use ANSI Standard SQL rather than Oracle syntax.
Assuming we want everything exclusive to both A & B, my answer was
select *
from A
cross join B
minus
(select a.common_column from a
intersect
select b.common_column)
Does this look correct, or even close? If it is correct, is there a more efficient way to do this?
BTW - my solution was soundly rejected....
Thank you!
Given the tables A and B, you are looking for (A U B) - (A & B). In other words, you need A union B minus their intersection. Remember A and B must be union-compatible for this query to work. I would do:
(select * from A
union
select * from B
)
minus
(select * from A
intersect
select * from B
)
May be full outer join?
select coalesce(A.col, B.col)
from A full outer join B on A.col = B.col
where A.col is null or B.col is null;
For computing a set symmetric difference, you can use a combination of MINUS and UNION ALL:
select * from (
(select * from A
minus
select * from B)
union all
(select * from B
minus
select * from A)
)
Your query was rejected because it is syntactically incorrect: the number of columns differ and it confuses cross join and union all. However, I think you have the right idea for solving this.
You can easily fix this:
(select *
from A
union all
select *
from B
) minus
(select *
from A
intersect
select *
from B
);
That is, combine everything using union all and then subtract the rows that occur in both tables.
Of course, if there is a single id, then you can use the id with join and other operations.
Just like Frank Schmitt answered in the meantime:
Here it is including a data example:
WITH
table_a(name) AS (
SELECT 'From_A_1'
UNION ALL SELECT 'From_A_2'
UNION ALL SELECT 'From_A_3'
UNION ALL SELECT 'From_A_4'
UNION ALL SELECT 'From_A_5'
UNION ALL SELECT 'From_BOTH_6'
UNION ALL SELECT 'From_BOTH_7'
UNION ALL SELECT 'From_BOTH_8'
)
,
table_b(name) AS (
SELECT 'From_B_1'
UNION ALL SELECT 'From_B_2'
UNION ALL SELECT 'From_B_3'
UNION ALL SELECT 'From_B_4'
UNION ALL SELECT 'From_B_5'
UNION ALL SELECT 'From_BOTH_6'
UNION ALL SELECT 'From_BOTH_7'
UNION ALL SELECT 'From_BOTH_8'
)
(SELECT * FROM table_a EXCEPT SELECT * FROM table_b)
UNION ALL
(SELECT * FROM table_b EXCEPT SELECT * FROM table_a)
ORDER BY name
;
name
From_A_1
From_A_2
From_A_3
From_A_4
From_A_5
From_B_1
From_B_2
From_B_3
From_B_4
From_B_5
You will need to select all the data from both tables, except where they overlap, and then combine the data with a union. The code provided should work for your example.
SELECT *
FROM
(
SELECT * FROM Table1
EXCEPT SELECT * FROM Table2
)
UNION
SELECT *
FROM
(
SELECT * FROM Table2
EXCEPT SELECT * FROM Table1
)
Hope this helps.

Join two tables with geometry column into one

I have two tables that are exactly the same. I want to join them together into one large dataset. I tried simply SELECT-INTO query but got an error...
SELECT * INTO dbo.ParkingBay
FROM (SELECT * FROM dbo.ParkingBay_Old
UNION
SELECT * FROM dbo.ParkingBay_New) AS PARKING_BAY;
The error is:
The geometry data type cannot be selected as DISTINCT because it is
not comparable.
The UNION performs a DISTINCT on the combined result set.
UNION ALL eliminates this DISTINCT step, but would create the possibility of dupes in the result.
If you are OK with dupe possibility, then try this
SELECT * INTO dbo.ParkingBay
FROM (SELECT * FROM dbo.ParkingBay_Old
UNION ALL
SELECT * FROM dbo.ParkingBay_New) AS PARKING_BAY;
It looks like ALL solves everything:
SELECT * INTO dbo.ParkingBay
FROM (SELECT * FROM dbo.ParkingBay_Old
UNION ALL
SELECT * FROM dbo.ParkingBay_New) AS PARKING_BAY;

SQL condition on unioned tables

I'm trying to execute a query with union and then put an additional condition on the result. Basically, I want to have something like
SELECT * FROM (A UNION B) WHERE condition
Is this possible to do? When I try to execute
SELECT * FROM (A UNION B)
SQL Management Studio complains about the closing bracket:
Incorrect syntax near ')'.
SELECT *
FROM ( SELECT * FROM A
UNION
SELECT * FROM B
) temp
WHERE condition
Here is the correct syntax:
SELECT *
FROM (select from A
UNION
select from B
) ab
WHERE condition;
I agree that the original syntax "looks right" for a from statement, but SQL doesn't allow it. Also, you need an alias on the subquery (requirement of SQL Server). Finally, you might consider using union all, if you know there are no duplicates in tables.
how about with CTE:
with tbl as (
select * from a
union
select * from b)
select * from tbl where condition
Use alias for derived table
SELECT * FROM (select * from A UNION select * from B) as t WHERE condition

Postgresql UNION takes 10 times as long as running the individual queries

I am trying to get the diff between two nearly identical tables in postgresql. The current query I am running is:
SELECT * FROM tableA EXCEPT SELECT * FROM tableB;
and
SELECT * FROM tableB EXCEPT SELECT * FROM tableA;
Each of the above queries takes about 2 minutes to run (Its a large table)
I wanted to combine the two queries in hopes to save time, so I tried:
SELECT * FROM tableA EXCEPT SELECT * FROM tableB
UNION
SELECT * FROM tableB EXCEPT SELECT * FROM tableA;
And while it works, it takes 20 minutes to run!!! I would guess that it would at most take 4 minutes, the amount of time to run each query individually.
Is there some extra work UNION is doing that is making it take so long? Or is there any way I can speed this up (with or without the UNION)?
UPDATE: Running the query with UNION ALL takes 15 minutes, almost 4 times as long as running each one on its own, Am I correct in saying that UNION (all) is not going to speed this up at all?
With regards to your "extra work" question. Yes. Union not only combines the two queries but also goes through and removes duplicates. It's the same as using a distinct statement.
For this reason, especially combined with your except statements "union all" would likely be faster.
Read more here:
http://www.postgresql.org/files/documentation/books/aw_pgsql/node80.html
In addition to combining the results of the first and second query, UNION by default also removes duplicate records. (see http://www.postgresql.org/docs/8.1/static/sql-select.html). The extra work involved in checking for duplicate records between the two queries is probably responsible for the extra time. In this situation there should not be any duplicate records so the extra work looking for duplicates can be avoided by specifying UNION ALL.
SELECT * FROM tableA EXCEPT SELECT * FROM tableB
UNION ALL
SELECT * FROM tableB EXCEPT SELECT * FROM tableA;
I don't think your code returns resultset you intend it to. I rather think you want to do this:
SELECT *
FROM (
SELECT * FROM tableA
EXCEPT
SELECT * FROM tableB
) AS T1
UNION
SELECT *
FROM (
SELECT * FROM tableB
EXCEPT
SELECT * FROM tableA
) AS T2;
In other words, you want the set of mutually exclusive members. If so, you need to read up on relational operator precedence in SQL ;) And when you have, you may realise the above can be rationalised to:
SELECT * FROM tableA
UNION
SELECT * FROM tableB
EXCEPT
SELECT * FROM tableA
INTERSECT
SELECT * FROM tableB;
FWIW, using subqueries (derived tables T1 and T2) to explicitly show (what would otherwise be implicit) relational operator precedence, your original query is this:
SELECT *
FROM (
SELECT *
FROM (
SELECT *
FROM tableA
EXCEPT
SELECT *
FROM tableB
) AS T2
UNION
SELECT *
FROM tableB
) AS T1
EXCEPT
SELECT *
FROM tableA;
The above can be relationalised to:
SELECT *
FROM tableB
EXCEPT
SELECT *
FROM tableA;
...and I think not what is intended.
You could use tableA FULL OUTER JOIN tableB, which would give what you want (with a propre join condition) with only 1 table scan, it probably would be faster than the 2 queries above.
Post more info please.

What's wrong with my UNION SELECT

SELECT *
FROM
(SELECT Campus_ID AS A_Campus, * FROM A_Campuses
UNION
SELECT Campus_ID AS H_Campus, * FROM H_Campuses
UNION
SELECT Campus_ID AS B_Campus, * FROM B_Campuses)
ORDER BY Campus_Name ASC
It gives the sql error
#1064 - You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server
version for the right syntax to use
near '* FROM A_Campuses UNION
SELECT Campus_ID AS H_Campus, * FROM
H_Campuses' at line 3
Step 1
You have two syntax errors
not aliasing the derived table.
* cannot be used AFTER a column name, unless you alias the source table
SELECT *
FROM
(SELECT Campus_ID AS A_Campus, A.* FROM A_Campuses A
UNION
SELECT Campus_ID AS H_Campus, H.* FROM H_Campuses H
UNION
SELECT Campus_ID AS B_Campus, B.* FROM B_Campuses B) AS X
ORDER BY Campus_Name ASC
Step 2
But as Phil points out, since you are sub-querying only to do an order by, there is no need to subquery at all. ORDER BY applies to the entire UNION-ed result.
SELECT Campus_ID AS A_Campus, A.* FROM A_Campuses A
UNION
SELECT Campus_ID AS H_Campus, H.* FROM H_Campuses H
UNION
SELECT Campus_ID AS B_Campus, B.* FROM B_Campuses B
ORDER BY Campus_Name ASC
Step 3
The next thing to point out is that A_, H_ and B_ must ALL have compatible structures for the UNION to align properly. It is also worth mentioning that aliasing Campus_ID as different column names has no value. The column names of the resultant result of a UNION is the FIRST name encountered across the UNION parts - in this case all the column names will come from A_Campuses, as well as the additional column A_Campus. In actual fact, you will have two columns A_Campus and Campus_ID which will always hold EXACTLY the same values. What you probably wanted was to indicate the SOURCE of the data: (notice that I have not even bothered to alias the columns for the 2nd and 3rd parts of the UNION)
SELECT 'A' AS Source, A.* FROM A_Campuses A
UNION ALL
SELECT 'H', H.* FROM H_Campuses H
UNION ALL
SELECT 'B', B.* FROM B_Campuses B
ORDER BY Campus_Name ASC
Note
For performance reasons, use UNION ALL instead of UNION, which performs a DISTINCT against the final result. If you had duplicate Campus_ID across different tables, as well as exactly the same record data, UNION results in one of them being removed, whereas UNION ALL keeps both (or all 3) copies. Given the addition of the Source column, this is not a possibility, so using UNION ALL will result in a faster query.
The columns in each branch of the UNION normally need the same name, or will end up with a single name. Also, a sub-select needs an alias (the 'AS C' below), at least in standard SQL; even if you don't mention the alias anywhere else in the query, as below.
I think what you're after is likely:
SELECT *
FROM (SELECT "A" AS Campus_ID, * FROM A_Campuses
UNION
SELECT "H" AS Campus_ID, * FROM H_Campuses
UNION
SELECT "B" AS Campus_ID, * FROM B_Campuses) AS C
ORDER BY Campus_Name ASC
That isn't how you write a union query.
Any ORDER BY clause applies to the union so you don't need to sub-query it. Also, you should aim to have the same column names and aliases across all parts of the union. Your A_Campus, H_Campus and B_Campus aliases will be lost (not sure which one will win out). For example
SELECT Campus_ID, Campus_Name FROM A_Campuses
UNION
SELECT Campus_ID, Campus_Name FROM H_Campuses
UNION
SELECT Campus_ID, Campus_Name FROM B_Campuses
ORDER BY Campus_Name ASC
I'd also refrain from using SELECT * in a union as you need to be specific about what you're selecting.
perhaps more parenthesis are needed
SELECT *
FROM
(
(SELECT Campus_ID AS A_Campus, * FROM A_Campuses)
UNION
(SELECT Campus_ID AS H_Campus, * FROM H_Campuses)
UNION
(SELECT Campus_ID AS B_Campus, * FROM B_Campuses)
)
ORDER BY Campus_Name ASC
Also can you UNION with different column names like that? You might need a fake A_Campus and B_Campus in the H_Campus subquery to get identical columns.