Recursive query with select * raises ORA-01789

This is a minimized version of a complex recursive query. The query works when the columns in the recursive member (the second part of the union all) of the recursive CTE are listed explicitly:
with t (c,p) as (
select 2,1 from dual
), rec (c,p) as (
select c,p from t
union all
select t.c,t.p from rec join t on rec.c = t.p
)
select * from rec
I don't understand why the error ORA-01789: query block has incorrect number of result columns is raised when t.* is specified instead.
with t (c,p) as (
select 2,1 from dual
), rec (c,p) as (
select c,p from t
union all
select t.* from rec join t on rec.c = t.p
)
select * from rec
Why is t.* not equivalent to t.c,t.p here? Could you please point me to documentation that explains the reasoning?
UPDATE: reproducible on 11g and 18 (dbfiddle).

I finally asked on the AskTom forum. According to the response from Oracle expert Connor McDonald, this behavior complies with the documentation, namely the sentence "The number of column aliases following WITH query_name and the number of columns in the SELECT lists of the anchor and recursive query blocks must be the same", which can be found in this paragraph.
The point is that the star expression is expanded only after the check that the column counts match. Hence the columns must be listed explicitly; abbreviating them to a star is not possible.

Seems like there could be some kind of bug to me. I modified the query slightly just to test various cases and am now able to reproduce an ORA-00600 error in my Oracle 19.6.0.0.0 database! Running the problematic query on apex.oracle.com or on livesql.oracle.com (which is running 19.8.0.0.0) also results in errors. Reporting it to Oracle now!

Related

In-line CTE in SQL

Until today I thought that the CTE had to go before the main SELECT clause, but it appears that a CTE can be dropped into any subselect. As an example:
-- tested on BigQuery, Redshift, MySQL, Postgres
SELECT CONCAT(Odd.num, Even.num) FROM
(WITH Numbers AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5) SELECT * FROM Numbers) Odd,
(WITH Numbers AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6) SELECT * FROM Numbers) Even
Outside of just playing around with things though, is there ever a case for this sort of "embedded" CTE, or is there no reason to ever have it outside of the top, as if it's a variable declaration?
Not only that, you can have CTEs inside of CTEs.
WITH CTE1 AS (
WITH CTE2 AS (
…
)
)
SELECT …
I had to work on someone’s query that did this 13 levels deep. What a pain. The main reason this isn’t good is that it makes a mess of readability. I think what you are describing isn’t quite as bad as it keeps subsections of the query together. Doing oddball things like this generally leads to confusion when others try to understand your query.

Does Oracle always resolve CTE with clauses even if they are not used in the result set

If I have an Oracle SQL query like this:
with
query1 as (
select * from animals where type = 'dog'
),
query2 as (
select * from animals where type = 'cat'
)
select * from query1;
Will the DBMS actually do the work of resolving/running query2, or does Oracle know that query2 is not required by the final output, so the work of that CTE/with should be skipped?
Oracle version is 12c Enterprise.
I was going to say "it's up to the optimizer" or "this is hard to answer" or "you need to look at the execution plan". But coming up with a single example where the code is not run is sufficient.
So here is an example demonstrating that at least one version of Oracle for at least one example does not evaluate the CTE:
with query1 as (
select * from animals where type = 'dog'
),
query2 as (
select a.*, type + 1 from animals a
)
select * from query1;
The second CTE would generate an error if it were evaluated.
This is not a guarantee, of course, that Oracle always ignores unused CTEs. And there could possibly be more arcane explanations for the behavior, but non-evaluation seems like the simplest.

Using nested query twice within one query - for FROM and WHERE

I am trying to figure out why the following Microsoft SQL code does not work. I simplified the query as it is quite complex. Basically the part that is not working is the second nested subquery (line FROM a) - I get an error: Invalid object name 'a'.
I would appreciate any advice on why it is not working and how I could make it work. Some background sources on why it is not working would also be helpful, as I struggle to find any information on the limitations of nested queries beyond some basics.
SELECT *
FROM (
SELECT ... FROM ...
) a
WHERE x IN(
SELECT x
FROM a
WHERE v1=v2)
I managed to solve my problem thanks to the suggestion in the comments to use a CTE.
So I transformed it into:
WITH CTE_1
AS
(
SELECT ... FROM ...
)
SELECT * FROM CTE_1
WHERE x IN(
SELECT x
FROM CTE_1
WHERE v1=v2)
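As a rough illustration of why the CTE version works: the alias of a derived table can qualify column references, but it is not a table name that a nested FROM clause can look up, whereas a CTE name is. The sketch below reproduces both shapes in SQLite through Python's sqlite3 module. SQLite is only a stand-in here (the question is about SQL Server, whose error text differs), and the INNER query with its x, v1 and v2 values is a hypothetical replacement for the elided SELECT ... FROM ....

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the elided inner SELECT of the question.
INNER = "SELECT 1 AS x, 5 AS v1, 5 AS v2 UNION ALL SELECT 2, 5, 6"

# Shape of the failing query: the alias "a" is reused as a table name.
derived_table_sql = f"""
SELECT * FROM ({INNER}) a
WHERE x IN (SELECT x FROM a WHERE v1 = v2)
"""

# Shape of the fix: the CTE name can be referenced in both places.
cte_sql = f"""
WITH cte_1 AS ({INNER})
SELECT * FROM cte_1
WHERE x IN (SELECT x FROM cte_1 WHERE v1 = v2)
"""


def run(sql):
    """Return the rows, or the error text if the engine rejects the query."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


derived_result = run(derived_table_sql)
cte_result = run(cte_sql)
print(derived_result)
print(cte_result)
```

The first query is rejected because no table named a exists; the second returns only the row where v1 = v2.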

TABLE/CAST/MULTISET vs subquery in FROM clause

The following query doesn't work. It is expected to fail since temp.col references something that is unavailable in that context.
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
(select level || temp.col from dual connect by level < 3);
The error message from Oracle is: ORA-00904: "TEMP"."COL": invalid identifier
But why does the next query work? I see CAST/MULTISET as a way to go from a SQL table to a collection type, and TABLE as a way to go back to a SQL table. Why do we use such a round trip? I guess it is to make the query work, but how does it do that?
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
table(
cast(
multiset(
select level || temp.col from dual connect by level < 3
) as sys.odcivarchar2list
)
) t;
The result is:
COL COLUMN_VALUE
--- ------------
A 1A
A 2A
B 1B
B 2B
Note how the second column is named COLUMN_VALUE. It looks like a name generated by one of the constructs CAST/MULTISET or TABLE.
EDIT
Following the accepted answer below, I checked the documentation and found that the TABLE mechanism is a table collection expression. The expression between parentheses is the collection expression. The documentation defines a mechanism called left correlation:
The collection_expression can reference columns of tables defined to
its left in the FROM clause. This is called left correlation. Left
correlation can occur only in table_collection_expression. Other
subqueries cannot contain references to columns defined outside the
subquery.
So this is like LATERAL in 12c.
Oracle allows lateral inline views to reference other tables inside the inline view.
In old versions this feature was mostly used for optimizations, as discussed in the Oracle optimizer blog here. Explicit lateral joins were added in 12c. Your first query only needs a small change to work in 12c:
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
lateral(select level || temp.col from dual connect by level < 3);
Apparently Oracle also silently uses lateral joins for collection unnesting. There are a few cases where SQL uses a logical cross join but the tables are obviously closely related, such as XMLTable, JSON_table, and queries like your second example. In those cases it makes sense to execute the two tables together. I assume the lateral mechanism is used there, although neither the execution plan nor the 10053 optimizer trace uses the word "lateral". The documentation even has an example very similar to yours in the section Collection Unnesting: Examples. However, this "feature" is still not well documented.
On a side note, in general you should avoid SQL features that increase the context. Features like lateral joins, common table expressions, and correlated subqueries can be useful, but they can also make SQL statements more difficult to understand. A regular inline view can be run and understood all by itself and has a very simple interface - its projected columns. That simplicity makes it easier to assemble small components into a large statement.
I suggest you re-write your query like below. Treat each inline view like you would a function or procedure - give them good names and comments. It will help you later when you assemble them into large, realistic statements.
select col, the_level||col
from
(
--Good comment 1.
select 'A' col from dual union all
select 'B' col from dual
) good_name_1
cross join
(
--Good comment 2.
select level the_level
from dual
connect by level < 3
) good_name_2

When to use EXCEPT as opposed to NOT EXISTS in Transact SQL?

I just recently learned of the existence of the new "EXCEPT" clause in SQL Server (a bit late, I know...) through reading code written by a co-worker. It truly amazed me!
But then I have some questions regarding its usage: when is it recommended to be employed? Is there a difference, performance-wise, between using it versus a correlated query employing "AND NOT EXISTS..."?
After reading EXCEPT's article in the BOL I thought it was just a shorthand for the second option, but was surprised when I rewrote a couple queries using it (so they had the "AND NOT EXISTS" syntax much more familiar to me) and then checked the execution plans - surprise! The EXCEPT version had a shorter execution plan, and executed faster, also. Is this always so?
So I'd like to know: what are the guidelines for using this powerful tool?
EXCEPT treats NULL values as matching.
This query:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
WHERE value NOT IN
(
SELECT value
FROM p
)
will return an empty rowset.
This query:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
WHERE NOT EXISTS
(
SELECT NULL
FROM p
WHERE p.value = q.value
)
will return
NULL
1
And this one:
WITH q (value) AS
(
SELECT NULL
UNION ALL
SELECT 1
),
p (value) AS
(
SELECT NULL
UNION ALL
SELECT 2
)
SELECT *
FROM q
EXCEPT
SELECT *
FROM p
will return:
1
Recursive reference is also allowed in EXCEPT clause in a recursive CTE, though it behaves in a strange way: it returns everything except the last row of a previous set, not everything except the whole previous set:
WITH q (value) AS
(
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
),
rec (value) AS
(
SELECT value
FROM q
UNION ALL
SELECT *
FROM (
SELECT value
FROM q
EXCEPT
SELECT value
FROM rec
) q2
)
SELECT TOP 10 *
FROM rec
---
1
2
3
-- original set
1
2
-- everything except the last row of the previous set, that is 3
1
3
-- everything except the last row of the previous set, that is 2
1
2
-- everything except the last row of the previous set, that is 3, etc.
1
SQL Server developers must just have forgotten to forbid it.
I have done a lot of analysis of EXCEPT, NOT EXISTS, NOT IN and LEFT OUTER JOIN. Generally the left outer join is the fastest for finding missing rows, especially when joining on a primary key. NOT IN can be very fast if you know the select will return a small list.
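For readers who have not seen the left outer join pattern for finding missing rows, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module purely so that it is self-contained; the customers and orders tables and their contents are made up for illustration, and the sketch shows only the shape of the query, not its speed on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: every order points at a customer.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Customers without any order: unmatched left rows get NULLs on the
# right side, so filtering on a non-nullable right column keeps them.
missing = conn.execute("""
SELECT c.id, c.name
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL
ORDER BY c.id
""").fetchall()

print(missing)
```

The IS NULL test should be made on a column that can never be NULL in a matched row, such as the primary key of the right-hand table.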
I use EXCEPT a lot to compare what is being returned when rewriting code. Run the old code saving results. Run new code saving results and then use except to capture all differences. It is a very quick and easy way to find differences, especially when needing to get all differences including null. Very good for on the fly easy coding.
But every situation is different. I say this to every developer I have ever mentored: try it. Do timings all the different ways. Try it, time it, do it.
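The comparison workflow described above (save the old results, save the new results, then use EXCEPT to capture the differences) might look roughly like this. The sketch uses SQLite through Python's sqlite3 module as a stand-in engine, and the old_result and new_result tables with their rows are hypothetical. EXCEPT is run in both directions because it only reports rows of its left-hand side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical saved outputs of the old and the rewritten code.
conn.executescript("""
CREATE TABLE old_result (id INTEGER, amount INTEGER);
CREATE TABLE new_result (id INTEGER, amount INTEGER);
INSERT INTO old_result VALUES (1, 100), (2, NULL), (3, 300);
INSERT INTO new_result VALUES (1, 100), (2, NULL), (3, 350);
""")

# Rows the old code produced that the new code does not.
only_in_old = conn.execute("""
SELECT id, amount FROM old_result
EXCEPT
SELECT id, amount FROM new_result
""").fetchall()

# Rows the new code produced that the old code did not.
only_in_new = conn.execute("""
SELECT id, amount FROM new_result
EXCEPT
SELECT id, amount FROM old_result
""").fetchall()

print(only_in_old)
print(only_in_new)
```

The row with the NULL amount exists on both sides and is not reported as a difference, which is the NULL-matching behavior that makes EXCEPT convenient for this kind of check.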
EXCEPT compares all (paired) columns of two full-selects.
NOT EXISTS compares two or more tables according to the conditions specified in the WHERE clause of the subquery following the NOT EXISTS keyword.
EXCEPT can be rewritten by using NOT EXISTS.
(EXCEPT ALL can be rewritten by using ROW_NUMBER and NOT EXISTS.)
Got this from here
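A sketch of what such a rewrite of EXCEPT with NOT EXISTS can look like, under two assumptions that follow from how EXCEPT behaves: the comparison has to treat two NULLs as matching, and the result needs DISTINCT because EXCEPT removes duplicates. The example runs on SQLite through Python's sqlite3 module as a stand-in engine; the q and p sets are modeled on the earlier NULL example, with one duplicated row added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# q contains a NULL and a duplicated 1; p contains a NULL and a 2.
CTES = """
WITH q (value) AS (SELECT NULL UNION ALL SELECT 1 UNION ALL SELECT 1),
     p (value) AS (SELECT NULL UNION ALL SELECT 2)
"""

except_rows = conn.execute(CTES + """
SELECT value FROM q
EXCEPT
SELECT value FROM p
""").fetchall()

# NOT EXISTS rewrite: DISTINCT mimics the duplicate removal of EXCEPT,
# and the extra IS NULL branch makes NULL match NULL.
not_exists_rows = conn.execute(CTES + """
SELECT DISTINCT q.value
FROM q
WHERE NOT EXISTS (
    SELECT 1
    FROM p
    WHERE p.value = q.value
       OR (p.value IS NULL AND q.value IS NULL)
)
""").fetchall()

print(except_rows)
print(not_exists_rows)
```

With several columns, each compared pair needs its own equality-or-both-NULL condition, which is where EXCEPT becomes noticeably shorter to write.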
There is no accounting for SQL Server's execution plans. Whenever I have had performance issues, I have found it utterly arbitrary (from a user's perspective; I'm sure the algorithm writers would understand why) when one syntax produced a better execution plan than another.
In this case, something about the query parameter comparison allows SQL to figure out a shortcut that it couldn't from a straight select statement. I'm sure that is a deficiency in the algorithm. In other words, you could logically interpolate the same thing, but the algorithm doesn't make that translation on an exists query. Sometimes that is because an algorithm that could reliably figure it out would take longer to execute than the query itself, or at least the algorithm designer thought so.
If your query is fine-tuned, there is no performance difference between EXCEPT and NOT EXISTS/NOT IN. The first time I ran EXCEPT after converting my correlated query to it, I was surprised: it returned the result in just 7 seconds, while the correlated query took 22 seconds. Then I added a DISTINCT clause to my correlated query and reran it, and it also returned in 7 seconds. So EXCEPT is good when you don't know how, or don't have time, to fine-tune your query; otherwise both perform the same.