Until today I thought that the CTE had to go before the main SELECT clause, but it appears that a CTE can be dropped into any subselect. As an example:
-- tested on BigQuery, Redshift, MySQL, Postgres
SELECT CONCAT(Odd.num, Even.num) FROM
(WITH Numbers AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5) SELECT * FROM Numbers) Odd,
(WITH Numbers AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6) SELECT * FROM Numbers) Even
Outside of just playing around with things though, is there ever a case for this sort of "embedded" CTE, or is there no reason to ever have it outside of the top, as if it's a variable declaration?
Not only that, you can have CTEs inside of CTEs.
WITH CTE1 AS (
WITH CTE2 AS (
…
)
)
SELECT …
I had to work on someone’s query that did this 13 levels deep. What a pain. The main reason this isn’t good is that it makes a mess of readability. I think what you are describing isn’t quite as bad as it keeps subsections of the query together. Doing oddball things like this generally leads to confusion when others try to understand your query.
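One thing the embedded form does give you is name scoping: each subselect can reuse the same CTE name without a collision. Below is a minimal sketch of that, run through Python's sqlite3 module. SQLite is my assumption here (the question lists BigQuery, Redshift, MySQL and Postgres), and `||` stands in for CONCAT. The hoisted variant with the names Odd and Even as top-level CTEs is my own rewrite for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Embedded form: each derived table has its own WITH clause, so the name
# "Numbers" is local to its subselect and the two definitions do not clash.
embedded = """
SELECT Odd.num || Even.num AS pair
FROM
  (WITH Numbers AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5)
   SELECT * FROM Numbers) AS Odd,
  (WITH Numbers AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6)
   SELECT * FROM Numbers) AS Even
ORDER BY pair
"""

# Hoisted form: the same data declared once at the top, which forces
# distinct CTE names but reads like a list of declarations.
hoisted = """
WITH Odd  AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5),
     Even AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6)
SELECT Odd.num || Even.num AS pair
FROM Odd, Even
ORDER BY pair
"""

embedded_pairs = [row[0] for row in conn.execute(embedded)]
hoisted_pairs = [row[0] for row in conn.execute(hoisted)]
print(embedded_pairs)
print(embedded_pairs == hoisted_pairs)
```

Both forms return the same nine pairs, so the choice is mostly about readability: embedding keeps a helper next to the only place that uses it, hoisting keeps all definitions in one spot.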
This is a minimized version of a complex recursive query. The query works when the columns in the recursive member (the second part of the UNION ALL) of the recursive CTE are listed explicitly:
with t (c,p) as (
select 2,1 from dual
), rec (c,p) as (
select c,p from t
union all
select t.c,t.p from rec join t on rec.c = t.p
)
select * from rec
I don't understand why the error "ORA-01789: query block has incorrect number of result columns" is raised when t.* is specified instead:
with t (c,p) as (
select 2,1 from dual
), rec (c,p) as (
select c,p from t
union all
select t.* from rec join t on rec.c = t.p
)
select * from rec
Why is t.* not equivalent to t.c,t.p here? Could you please point me to documentation that explains the reasoning?
UPDATE: reproducible on 11g and 18 (dbfiddle).
I finally asked on the AskTom forum. According to the response from Oracle expert Connor McDonald, this behavior complies with the documentation, namely the sentence "The number of column aliases following WITH query_name and the number of columns in the SELECT lists of the anchor and recursive query blocks must be the same", which can be found in this paragraph.
The point is that the star expression is expanded after the check that the column counts match. Hence the columns must be listed explicitly; shortening them to a star is not possible.
Seems like there could be some kind of bug to me. I modified the query slightly just to test various cases and am now able to reproduce an ORA-00600 error in my Oracle 19.6.0.0.0 database! Running the problematic query on apex.oracle.com or on livesql.oracle.com (which is running 19.8.0.0.0) also results in errors. Reporting it to Oracle now!
The following post had compelling reasons for generally avoiding the use of select * in SQL.
Why is SELECT * considered harmful?
The discussion included examples of when it was or wasn't acceptable to use select *. However, I did not see any discussion of common table expressions (CTEs). Are there any drawbacks to using select * in CTEs?
Example:
WITH CTE1 AS
(
SELECT Doc, TotalDue
FROM ARInvoices
WHERE CustomerName = 'ABC'
UNION
SELECT Doc, - TotalDue
FROM ARInvoiceMemos
WHERE CustomerName = 'ABC'
)
select * from CTE1
UNION
Select 'Total' as Doc, sum(TotalDue)
FROM CTE1
Since you already properly listed the column names in the cte, I don't see any harm in using select * from the cte.
In fact, it might be just the right place to use select *, since there is no point of listing the columns twice.
Unless you don't need all the columns returned by the cte (i.e. a column in the cte is used in the query, but not in the select clause). In that case, I would suggest listing only the columns you need, even if the from clause points to a cte.
Note that if the cte itself uses select *, then all of the drawbacks listed in the post you linked to apply to it.
My main objection to select * is that it's usually used by lazy developers who don't consider the consequences of the *.
Note: Everything I've written here applies to derived tables as well.
In theory the rule of thumb that select * is ill advised always applies.
In practice though, if you are a developer who considers things like design and general good programming practice as important as functionality, your CTE will most likely be coded to only return the columns which are actually needed, so select * from CTE1 might not be so bad.
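To make the point above concrete, here is a small sketch using Python's sqlite3 module. The table layouts and sample rows are made up for illustration (the question only shows the query), and SQLite is an assumption. It runs the query from the question, then adds a column to ARInvoices and runs it again: because the CTE lists its columns explicitly, the outer select * still returns the same two columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data; only the column names come from the question.
conn.executescript("""
CREATE TABLE ARInvoices (Doc TEXT, TotalDue REAL, CustomerName TEXT);
CREATE TABLE ARInvoiceMemos (Doc TEXT, TotalDue REAL, CustomerName TEXT);
INSERT INTO ARInvoices VALUES ('INV-1', 100, 'ABC'), ('INV-2', 50, 'XYZ');
INSERT INTO ARInvoiceMemos VALUES ('MEMO-1', 30, 'ABC');
""")

sql = """
WITH CTE1 AS (
    SELECT Doc, TotalDue FROM ARInvoices WHERE CustomerName = 'ABC'
    UNION
    SELECT Doc, -TotalDue FROM ARInvoiceMemos WHERE CustomerName = 'ABC'
)
SELECT * FROM CTE1
UNION
SELECT 'Total' AS Doc, SUM(TotalDue) FROM CTE1
"""

def run():
    """Return the result column names and the rows (sorted for comparison)."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], sorted(cur.fetchall())

columns_before, rows_before = run()

# A schema change on the base table does not leak through the CTE,
# because the CTE names the columns it exposes.
conn.execute("ALTER TABLE ARInvoices ADD COLUMN Notes TEXT")
columns_after, rows_after = run()

print(columns_before, rows_before)
print(columns_after == columns_before and rows_after == rows_before)
```

Had the CTE itself been written as select * from ARInvoices, the new Notes column would have changed its shape and broken the UNION with ARInvoiceMemos.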
I found this example code online when I searched for "how to do an Exclusive Between oracle sql".
Someone was proving that, in Oracle, BETWEEN is inclusive by default.
They used this code:
with x as (
select 1 col1 from dual
union
select 2 col1 from dual
union
select 3 col1 from dual
UNION
select 4 col1 from dual
)
select *
from x
where col1 between 2 and 3
I've never seen an example like this. What is going on with the WITH?
In short, the WITH clause defines a named inline view, or subquery. It is useful when you will refer to something multiple times, or when you want to abstract parts of a complex query to make it easier to read.
If you are from SQL Server world, you can also think of it like a temporary table.
So:
WITH foo as (select * from tab)
select * from foo;
is like
select * from (select * from tab);
Though it may be more efficient, since the database can resolve foo to a single dataset even if it is queried multiple times.
It also reduces repetition. If you use a subquery more than once in a statement, you can consider factoring it out using WITH.
It has nothing to do with the BETWEEN example, it is just the author's choice of approach for demonstrating a concept.
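As a rough illustration of both points (WITH is just a named subquery, and BETWEEN is inclusive), here is a sketch run through Python's sqlite3 module. SQLite is an assumption, so the "from dual" of the Oracle original is dropped. The exclusive variant with explicit > and < is my own addition, since that was what the original search was about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The query from the question, with the subquery factored out via WITH.
with_form = """
WITH x AS (
    SELECT 1 AS col1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
)
SELECT col1 FROM x WHERE col1 BETWEEN 2 AND 3 ORDER BY col1
"""

# The same query with the subquery written inline in the FROM clause.
inline_form = """
SELECT col1 FROM (
    SELECT 1 AS col1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
) x
WHERE col1 BETWEEN 2 AND 3 ORDER BY col1
"""

# An exclusive range has to be spelled out with > and <.
exclusive_form = """
WITH x AS (
    SELECT 1 AS col1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
)
SELECT col1 FROM x WHERE col1 > 1 AND col1 < 4 ORDER BY col1
"""

with_rows = [r[0] for r in conn.execute(with_form)]
inline_rows = [r[0] for r in conn.execute(inline_form)]
exclusive_rows = [r[0] for r in conn.execute(exclusive_form)]

print(with_rows)       # both bounds 2 and 3 are included
print(inline_rows)     # identical to the WITH form
print(exclusive_rows)  # bounds 1 and 4 are excluded
```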
I just recently learned of the existence of the new "EXCEPT" clause in SQL Server (a bit late, I know...) through reading code written by a co-worker. It truly amazed me!
But then I have some questions regarding its usage: when is it recommended to be employed? Is there a difference, performance-wise, between using it versus a correlated query employing "AND NOT EXISTS..."?
After reading EXCEPT's article in the BOL I thought it was just a shorthand for the second option, but was surprised when I rewrote a couple queries using it (so they had the "AND NOT EXISTS" syntax much more familiar to me) and then checked the execution plans - surprise! The EXCEPT version had a shorter execution plan, and executed faster, also. Is this always so?
So I'd like to know: what are the guidelines for using this powerful tool?
EXCEPT treats NULL values as matching.
This query:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
WHERE value NOT IN
(
SELECT value
FROM p
)
will return an empty rowset.
This query:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
WHERE NOT EXISTS
(
SELECT NULL
FROM p
WHERE p.value = q.value
)
will return:
NULL
1
And this query:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
EXCEPT
SELECT *
FROM p
will return:
1
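The three results above can be checked side by side. This is a sketch using Python's sqlite3 module rather than SQL Server, which is an assumption on my part; the NULL handling shown here follows standard SQL three-valued logic, so the outcome is the same. The column is renamed to val and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Shared CTE definitions: q = {NULL, 1}, p = {NULL, 2}.
ctes = """
WITH q (val) AS (SELECT NULL UNION ALL SELECT 1),
     p (val) AS (SELECT NULL UNION ALL SELECT 2)
"""

def run(body):
    """Run a query body against the shared CTEs and return the val column."""
    return [row[0] for row in conn.execute(ctes + body)]

# NOT IN: a NULL in the subquery makes every comparison unknown, so no row passes.
not_in = run("SELECT val FROM q WHERE val NOT IN (SELECT val FROM p)")

# NOT EXISTS with "=": NULL = NULL is not true, so the NULL row is kept as well.
not_exists = run("""
SELECT val FROM q
WHERE NOT EXISTS (SELECT 1 FROM p WHERE p.val = q.val)
ORDER BY val
""")

# EXCEPT: NULLs are treated as matching, so only 1 survives.
except_rows = run("SELECT val FROM q EXCEPT SELECT val FROM p")

print(not_in)
print(not_exists)
print(except_rows)
```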
A recursive reference is also allowed in an EXCEPT clause inside a recursive CTE, though it behaves in a strange way: it returns everything except the last row of the previous set, not everything except the whole previous set:
WITH q (value) AS
(
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
),
rec (value) AS
(
SELECT value
FROM q
UNION ALL
SELECT *
FROM (
SELECT value
FROM q
EXCEPT
SELECT value
FROM rec
) q2
)
SELECT TOP 10 *
FROM rec
---
1
2
3
-- original set
1
2
-- everything except the last row of the previous set, that is 3
1
3
-- everything except the last row of the previous set, that is 2
1
2
-- everything except the last row of the previous set, that is 3, etc.
1
SQL Server developers must just have forgotten to forbid it.
I have done a lot of analysis of EXCEPT, NOT EXISTS, NOT IN and LEFT OUTER JOIN. Generally the left outer join is the fastest for finding missing rows, especially when joining on a primary key. NOT IN can be very fast if you know the select will return a small list.
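For anyone who hasn't seen the left outer join pattern for finding missing rows, here is a minimal sketch through Python's sqlite3 module. The customers and orders tables are hypothetical, and SQLite is an assumption. It only shows that the pattern returns the same rows as NOT EXISTS; it makes no claim about which is faster on your engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: which customers have no orders?
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1), (11, 3);
""")

# Left outer join, then keep only the rows where the join found nothing.
left_join_missing = conn.execute("""
SELECT c.id, c.name
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL
ORDER BY c.id
""").fetchall()

# The NOT EXISTS spelling of the same question.
not_exists_missing = conn.execute("""
SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
ORDER BY c.id
""").fetchall()

print(left_join_missing)
print(left_join_missing == not_exists_missing)
```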
I use EXCEPT a lot to compare what is being returned when rewriting code. Run the old code saving results. Run new code saving results and then use except to capture all differences. It is a very quick and easy way to find differences, especially when needing to get all differences including null. Very good for on the fly easy coding.
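The comparison workflow described above might look roughly like this. It is a sketch through Python's sqlite3 module; the tables old_results and new_results and their contents are hypothetical stand-ins for the saved output of the old and new code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical saved outputs of the old and the rewritten query.
conn.executescript("""
CREATE TABLE old_results (id INTEGER, amount INTEGER);
CREATE TABLE new_results (id INTEGER, amount INTEGER);
INSERT INTO old_results VALUES (1, 10), (2, NULL), (3, 30);
INSERT INTO new_results VALUES (1, 10), (2, NULL), (3, 31);
""")

# Rows the old code produced that the new code does not.
only_in_old = conn.execute(
    "SELECT * FROM old_results EXCEPT SELECT * FROM new_results"
).fetchall()

# Rows the new code produces that the old code did not.
only_in_new = conn.execute(
    "SELECT * FROM new_results EXCEPT SELECT * FROM old_results"
).fetchall()

print(only_in_old)
print(only_in_new)
```

The row (2, NULL) appears in neither difference, because EXCEPT treats the two NULLs as matching. A join on amount would have reported it as a mismatch.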
But every situation is different. I say this to every developer I have ever mentored: try it. Do timings all the different ways. Try it, time it, do it.
EXCEPT compares all (paired) columns of two full-selects.
NOT EXISTS compares two or more tables according to the conditions specified in the WHERE clause of the sub-query following the NOT EXISTS keyword.
EXCEPT can be rewritten by using NOT EXISTS.
(EXCEPT ALL can be rewritten by using ROW_NUMBER and NOT EXISTS.)
Got this from here
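Here is a sketch of what that rewrite involves, run through Python's sqlite3 module. The tables a and b are hypothetical. To match EXCEPT exactly, the NOT EXISTS version needs DISTINCT and a NULL-safe comparison on every column. I'm using SQLite's IS operator for that, which is an assumption specific to SQLite; other engines spell a NULL-safe comparison differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data with a duplicate row and a NULL to expose the differences.
conn.executescript("""
CREATE TABLE a (x INTEGER, y INTEGER);
CREATE TABLE b (x INTEGER, y INTEGER);
INSERT INTO a VALUES (1, 1), (1, 1), (2, NULL), (3, 3);
INSERT INTO b VALUES (2, NULL), (4, 4);
""")

except_rows = conn.execute("""
SELECT x, y FROM a
EXCEPT
SELECT x, y FROM b
ORDER BY x
""").fetchall()

# Naive rewrite: "=" never matches NULL and duplicates are kept.
naive_rows = conn.execute("""
SELECT a.x, a.y FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.x = a.x AND b.y = a.y)
ORDER BY a.x
""").fetchall()

# Faithful rewrite: DISTINCT removes duplicates, IS compares NULLs as equal.
faithful_rows = conn.execute("""
SELECT DISTINCT a.x, a.y FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.x IS a.x AND b.y IS a.y)
ORDER BY a.x
""").fetchall()

print(except_rows)
print(naive_rows)
print(faithful_rows == except_rows)
```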
There is no accounting for SQL Server's execution plans. When I have had performance issues, I have always found it utterly arbitrary (from a user's perspective; I'm sure the algorithm writers would understand why) which syntax produced the better execution plan.
In this case, something about the query parameter comparison allows SQL to figure out a shortcut that it couldn't from a straight select statement. I'm sure that is a deficiency in the algorithm. In other words, you could logically interpolate the same thing, but the algorithm doesn't make that translation on an exists query. Sometimes that is because an algorithm that could reliably figure it out would take longer to execute than the query itself, or at least the algorithm designer thought so.
If your query is fine-tuned, there is no performance difference between EXCEPT and NOT EXISTS/NOT IN. The first time I ran EXCEPT after converting my correlated query to it, I was surprised: it returned the result in just 7 seconds, while the correlated query took 22 seconds. Then I added a DISTINCT clause to my correlated query and reran it, and it also returned in 7 seconds. So EXCEPT is good when you don't know how, or don't have time, to fine-tune your query; otherwise both perform the same.