I found this example code online when I searched for "how to do an exclusive BETWEEN in Oracle SQL".
Someone was demonstrating that, in Oracle, BETWEEN is inclusive by default.
They used this code:
with x as (
select 1 col1 from dual
union
select 2 col1 from dual
union
select 3 col1 from dual
UNION
select 4 col1 from dual
)
select *
from x
where col1 between 2 and 3
I've never seen such an example. What is going on with the WITH?
In short, the WITH clause (subquery factoring, also known as a common table expression or CTE) gives a name to a subquery so that it can be used like an inline view. It is useful when you will refer to something multiple times, or when you want to abstract parts of a complex query to make it easier to read.
If you are from the SQL Server world, you can also think of it like a temporary table.
So:
WITH foo as (select * from tab)
select * from foo;
is like
select * from (select * from tab);
Though it may be more efficient, since the named subquery (x in the example above) can be resolved to a single dataset, even if it is queried multiple times.
It also reduces repetition. If you use a subquery more than once in a statement, you can consider factoring it out using WITH.
It has nothing to do with the BETWEEN example, it is just the author's choice of approach for demonstrating a concept.
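As a small sketch of the "refer to something multiple times" case, the query below reuses the same named subquery x twice: once in the FROM clause and once in a scalar subquery. The data mirrors the example from the question; the filter on the average is only an illustration and is not part of the original.

```sql
-- x is defined once and referenced twice in the same statement
with x as (
  select 1 col1 from dual
  union all
  select 2 col1 from dual
  union all
  select 3 col1 from dual
  union all
  select 4 col1 from dual
)
select col1
from x
where col1 > (select avg(col1) from x);  -- avg is 2.5, so rows 3 and 4 qualify
```

Without WITH, the four-row UNION ALL would have to be written out in both places.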
Related
Is it possible to create a CTE without a FROM, and if not isn't that the whole point of a CTE in the first place?
WITH cte AS
(
SELECT 1 AS col1, 2 AS col2
)
SELECT col1, col2 FROM cte;
> ORA-00923: FROM keyword not found where expected
It seems a quick-fix for this is just adding FROM DUAL whenever needed. Is that what's supposed to be done?
Yes, that's exactly what dual is supposed to be used for.
Selecting from the DUAL table is useful for computing a constant
expression with the SELECT statement. Because DUAL has only one row,
the constant is returned only once.
https://docs.oracle.com/cd/B19306_01/server.102/b14200/queries009.htm
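Applied to the CTE from the question, the fix would look roughly like this (a minimal sketch; only the FROM dual has been added):

```sql
WITH cte AS
(
  -- dual supplies the single row that the constant expressions are evaluated against
  SELECT 1 AS col1, 2 AS col2 FROM dual
)
SELECT col1, col2 FROM cte;
```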
In Oracle, we can write this to generate a single row using a SELECT statement.
SELECT 1 AS x FROM dual
What is Teradata's equivalent?
Generally, no such table is needed
In most cases, no table is really needed in the Teradata database. The following is valid SQL (just like in H2, PostgreSQL, Redshift, SQL Server, SQLite, Sybase ASE, Sybase SQL Anywhere, Vertica)
SELECT 1
SELECT 1 WHERE 1 = 1
Exceptions
However, there is an exception when a set operation is desirable. For example, this is invalid in Teradata:
SELECT 1 UNION ALL SELECT 2
Yielding this error:
A SELECT for a UNION,INTERSECT or MINUS must reference a table.
But since the FROM clause is generally optional, it's very easy to emulate a DUAL table as follows:
SELECT 1 FROM (SELECT 1 AS "DUMMY") AS "DUAL"
UNION ALL
SELECT 2 FROM (SELECT 1 AS "DUMMY") AS "DUAL"
Compatibility
If compatibility with Oracle etc. is needed, it is easy to create a view that behaves like Oracle's dual:
CREATE VIEW "DUAL" AS (SELECT 1 AS "DUMMY");
Notice that DUAL is a keyword in Teradata, thus the view needs to be quoted.
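Assuming the view above has been created, the set operation that failed earlier could then be written in the Oracle style. This is an untested sketch; the column alias x is only illustrative.

```sql
-- The quoted "DUAL" refers to the one-row view, not the Teradata keyword
SELECT 1 AS x FROM "DUAL"
UNION ALL
SELECT 2 AS x FROM "DUAL";
```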
Other dialects
In case anyone is interested, the jOOQ user manual lists various ways of emulating DUAL (if it's required) in 30+ SQL dialects.
The following query doesn't work. It is expected to fail since temp.col references something that is unavailable in that context.
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
(select level || temp.col from dual connect by level < 3);
The error message from Oracle is: ORA-00904: "TEMP"."COL": invalid identifier
But why does the next query work? I see CAST/MULTISET as a way to go from a SQL table to a collection type, and TABLE as a way to go back to a SQL table. Why do we need such a round trip? I guess it makes the query work, but how?
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
table(
cast(
multiset(
select level || temp.col from dual connect by level < 3
) as sys.odcivarchar2list
)
) t;
The result is:
COL COLUMN_VALUE
--- ------------
A 1A
A 2A
B 1B
B 2B
Note that the second column is named COLUMN_VALUE. It looks like a name generated by one of the constructs, CAST/MULTISET or TABLE.
EDIT
Following the accepted answer below, I checked the documentation and found that the TABLE mechanism is a table collection expression. The expression between the parentheses is the collection expression. The documentation defines a mechanism called left correlation:
The collection_expression can reference columns of tables defined to
its left in the FROM clause. This is called left correlation. Left
correlation can occur only in table_collection_expression. Other
subqueries cannot contain references to columns defined outside the
subquery.
So this is like LATERAL in 12c.
Oracle allows a lateral inline view to reference columns of tables that appear to its left in the FROM clause.
In old versions this feature was mostly used for optimizations, as discussed in the Oracle optimizer blog here. Explicit lateral joins were added in 12c. Your first query only needs a small change to work in 12c:
with temp as (
select 'A' col from dual
union all
select 'B' col from dual
)
select *
from temp,
lateral(select level || temp.col from dual connect by level < 3);
Apparently Oracle also silently uses lateral joins for collection unnesting. There are a few cases where SQL uses a logical cross join but the tables are obviously closely related, such as XMLTable, JSON_table, and queries like your second example. In those cases it makes sense to execute the two tables together. I assume the lateral mechanism is used there, although neither the execution plan nor the 10053 optimizer trace uses the word "lateral". The documentation even has an example very similar to yours in the "Collection Unnesting: Examples" section. However, this "feature" is still not well documented.
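For comparison, 12c also added CROSS APPLY, which is another explicit way to write a left-correlated join. The sketch below assumes 12c or later; the aliases val and t are my own names, not from the question. It should return the same four rows as the TABLE(CAST(MULTISET(...))) version.

```sql
with temp as (
  select 'A' col from dual
  union all
  select 'B' col from dual
)
select temp.col, t.val
from temp
cross apply (
  -- temp.col is visible here because CROSS APPLY permits left correlation
  select level || temp.col as val
  from dual
  connect by level < 3
) t;
```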
On a side note, in general you should avoid SQL features that increase the context. Features like lateral joins, common table expressions, and correlated subqueries can be useful, but they can also make SQL statements more difficult to understand. A regular inline view can be run and understood all by itself and has a very simple interface - its projected columns. That simplicity makes it easier to assemble small components into a large statement.
I suggest you re-write your query like below. Treat each inline view like you would a function or procedure - give them good names and comments. It will help you later when you assemble them into large, realistic statements.
select col, the_level||col
from
(
--Good comment 1.
select 'A' col from dual union all
select 'B' col from dual
) good_name_1
cross join
(
--Good comment 2.
select level the_level
from dual
connect by level < 3
) good_name_2
BigQuery does not seem to have support for UNION yet:
https://developers.google.com/bigquery/docs/query-reference
(I don't mean unioning tables together for the source. It has that.)
Is it coming soon?
If you want UNION so that you can combine query results, you can use subselects in BigQuery:
SELECT foo, bar
FROM
(SELECT integer(id) AS foo, string(title) AS bar
FROM publicdata:samples.wikipedia limit 10),
(SELECT integer(year) AS foo, string(state) AS bar
FROM publicdata:samples.natality limit 10);
This is almost exactly equivalent to the SQL
SELECT id AS foo, title AS bar
FROM publicdata:samples.wikipedia limit 10
UNION ALL
SELECT year AS foo, state AS bar
FROM publicdata:samples.natality limit 10;
(Note that if you want SQL UNION and not UNION ALL, this won't work.)
Alternately, you could run two queries and append the result.
BigQuery recently added support for Standard SQL, including the UNION operation.
When submitting a query through the web UI, just make sure to uncheck "Use Legacy SQL" under the SQL Version rubric.
In legacy SQL, you can always do:
SELECT * FROM (query 1), (query 2);
It does the same thing as:
SELECT * FROM (query 1) UNION ALL SELECT * FROM (query 2);
Note that, if you're using standard SQL, the comma operator now means JOIN - you have to use the UNION syntax if you want a union:
In legacy SQL, the comma operator , has the non-standard meaning of UNION ALL when applied to tables. In standard SQL, the comma operator has the standard meaning of JOIN.
For example:
#standardSQL
SELECT
column_name,
count(*)
from
(SELECT * FROM me.table1 UNION ALL SELECT * FROM me.table2)
group by 1
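To make the difference between the two union flavours concrete, here is a minimal standard SQL sketch. It uses UNNEST over literal arrays instead of real tables, so the values are purely illustrative. Standard SQL in BigQuery expects an explicit ALL or DISTINCT after UNION.

```sql
#standardSQL
-- UNION ALL keeps duplicates: 1, 2, 2, 3 (row order is not guaranteed)
SELECT x FROM UNNEST([1, 2]) AS x
UNION ALL
SELECT x FROM UNNEST([2, 3]) AS x;

-- UNION DISTINCT removes duplicates: 1, 2, 3 (row order is not guaranteed)
SELECT x FROM UNNEST([1, 2]) AS x
UNION DISTINCT
SELECT x FROM UNNEST([2, 3]) AS x;
```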
Standard SQL support helped me a great deal when I needed an INTERSECT in BigQuery:
#standardSQL
WITH
a AS (
SELECT
*
FROM
table_a),
b AS (
SELECT
*
FROM
table_b)
SELECT
*
FROM
a INTERSECT DISTINCT
SELECT
*
FROM
b
Adapted from: https://gist.github.com/yancya/bf38d1b60edf972140492e3efd0955d0
Unions are indeed supported. An excerpt from the link that you posted:
Note: Unlike many other SQL-based systems, BigQuery uses the comma syntax to indicate table unions, not joins. This means you can run a query over several tables with compatible schemas as follows:
// Find suspicious activity over several days
SELECT FORMAT_UTC_USEC(event.timestamp_in_usec) AS time, request_url
FROM [applogs.events_20120501], [applogs.events_20120502], [applogs.events_20120503]
WHERE event.username = 'root' AND NOT event.source_ip.is_internal;
Here's my query:
SELECT my_view.*
FROM my_view
WHERE my_view.trial in (select 2 as trial_id from dual union select 3 from dual union select 4 from dual)
and my_view.location like ('123-%')
When I execute this query it returns results which do not conform to the my_view.location like ('123-%') condition. It's as if that condition is being ignored completely. I can even change it to my_view.location IS NULL and it returns the same results, despite that field being not-nullable.
I know this query seems ridiculous with the selects from dual, but I've structured it this way to replicate a problem I have when I use a 'WITH' clause (in my real query, the results of the 'WITH' query take the place of the selects from dual).
I can modify the query like so and it returns the expected results:
SELECT my_view.*
FROM my_view
WHERE my_view.trial in (2, 3, 4)
and my_view.location like ('123-%')
Unfortunately I do not know the trial values up front (they are queried for in a 'WITH' clause) so I cannot structure my query this way. What am I doing wrong?
I will add that the my_view view is composed of 3 other views whose results are combined with UNION ALL, and each of them retrieves some data over a DB link. Not that I believe that should matter, but in case it does.
One thing you could try if you don't have luck with this route is to replace "IN" with an "EXISTS" or "NOT EXISTS" statement.
If you can accomplish what you want using joins, that would be the best option for performance. If you have views pulling data from views, you can often write a single query with subqueries that does what you want and performs better.
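As a sketch of the EXISTS variant, the query from the question might be rewritten as follows. The alias t is my own, and whether this actually avoids the problem with the location predicate is unverified.

```sql
SELECT my_view.*
FROM my_view
WHERE EXISTS (
        -- correlated lookup against the same three trial ids
        SELECT 1
        FROM (select 2 as trial_id from dual
              union all
              select 3 as trial_id from dual
              union all
              select 4 as trial_id from dual) t
        WHERE t.trial_id = my_view.trial
      )
  and my_view.location like '123-%';
```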
If you do
EXPLAIN PLAN FOR
SELECT my_view.*
FROM my_view
WHERE my_view.trial in (select 2 as trial_id from dual union select 3 from dual union select 4 from dual)
and my_view.location like ('123-%');
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
you should see where the location predicate is (or is not) being applied. My bet is that it has something to do with the DB links and you won't be able to reproduce it if all the tables are local.
Optimizing a distributed query gets complicated.
Try changing the UNION query to use UNION ALL, as in:
SELECT my_view.*
FROM my_view
WHERE my_view.trial in (select 2 as trial_id from dual
UNION ALL
select 3 AS TRIAL_ID from dual
UNION ALL
select 4 AS TRIAL_ID from dual)
and my_view.location like ('123-%')
I also put in "AS TRIAL_ID" on the 3 and 4 cases. I agree that neither of these should matter, but I've run into cases occasionally where things that I thought shouldn't matter mattered.
Good luck.