Running nested CTEs - sql

On the BigQuery SELECT syntax page it gives the following:
query_statement:
query_expr
query_expr:
[ WITH cte[, ...] ]
{ select | ( query_expr ) | set_operation }
[ ORDER BY expression [{ ASC | DESC }] [, ...] ]
[ LIMIT count [ OFFSET skip_rows ] ]
I understand how the (second line) select could be either:
{ select | set_operation }
But what is the ( query_expr ) in the middle for? For example, if it can refer to itself, wouldn't it make the possibility to construct a lisp-like query such as:
with x as (select 1 a)
(with y as (select 2 b)
(with z as (select 3 c)
select * from x, y, z))
Actually, I just tested it and the answer is yes. If so, what would be an actual use case of the above construction where you can use ( query_expr ) ?
And, is there ever a case where using a nested CTE can do something that multiple CTEs cannot? (For example, the current answer is just a verbose way or writing what would more properly be written with a single WITH expression with multiple CTEs)

The following construction might be useful to model how an ETL flow might work and to encapsulate certain 'steps' that you don't want the outer query to have access to. But this is quite a stretch...
WITH output AS (
WITH step_csv_query AS (
SELECT * FROM 'Sales.CSV'),
step_filter_csv AS (
SELECT * FROM step_csv_query WHERE Country='US'),
step_mysql_query AS (
SELECT * FROM MySQL1 LEFT OUTER JOIN MySQL2...),
step_join_queries AS (
SELECT * FROM step_filter_csv INNER JOIN step_mysql_query USING (id)
) SELECT * FROM step_join_queries -- output is last step
) SELECT * FROM output -- or whatever we want to do with the output...

This query might be useful in that case where a CTE is being referred to by subsequent CTEs.
For instance, you can use this if you want to join two tables, use expressions and query the resulting table.
with x as (select '2' as id, 'sample' as name )
(with y as ( select '2' as number, 'customer' as type)
(with z as ( select CONCAT('C00',id), name, type from x inner join y on x.id=y.number)
Select * from z))
The above query gives the following output :
Though there are other ways to achieve the same, the above method would be much easier for debugging.
Following article can be referred for further information on the use cases.
In nested CTEs, the same CTE alias can be reused which is not possible in case of multiple CTEs. For example, in the following query the inner CTE will override the outer CTEs with the same alias :
with x as (select '1')
(with x as (select '2' as id, 'sample' as name )
(with x as ( select '2' as number, 'customer' as type)
select * from x))

Related

postgres: COUNT, DISTINCT is not implemented for window functions

I am trying to use COUNT(DISTINC column) OVER(PARTITION BY column) when I am using COUNT + window function(OVER).
I get an error like the one in the title and can't get it to work.
I have looked into how to deal with this error, but I have not found an example of how to deal with such a complex query as the one below.
I cannot find an example of how to deal with such a complex query as shown below, and I am not sure how to handle it.
The COUNT part of the problem exists on line 65.
How can such a complex query be resolved without slowing down?
WITH RECURSIVE "cte" AS((
SELECT
"videos_productvideocomment"."id",
"videos_productvideocomment"."user_id",
"videos_productvideocomment"."video_id",
"videos_productvideocomment"."parent_id",
"videos_productvideocomment"."text",
"videos_productvideocomment"."commented_at",
"videos_productvideocomment"."edited_at",
"videos_productvideocomment"."created_at",
"videos_productvideocomment"."updated_at",
"videos_productvideocomment"."id" AS "root_id"
FROM
"videos_productvideocomment"
WHERE
(
"videos_productvideocomment"."parent_id" IS NULL
AND "videos_productvideocomment"."video_id" = 'f264433c-c0af-49cc-8b40-84453da71b2d'
)
) UNION(
SELECT
"videos_productvideocomment"."id",
"videos_productvideocomment"."user_id",
"videos_productvideocomment"."video_id",
"videos_productvideocomment"."parent_id",
"videos_productvideocomment"."text",
"videos_productvideocomment"."commented_at",
"videos_productvideocomment"."edited_at",
"videos_productvideocomment"."created_at",
"videos_productvideocomment"."updated_at",
"cte"."root_id" AS "root_id"
FROM
"videos_productvideocomment"
INNER JOIN
"cte"
ON "videos_productvideocomment"."parent_id" = "cte"."id"
))
SELECT
*,
EXISTS(
SELECT
(1) AS "a"
FROM
"videos_productvideolikecomment" U0
WHERE
(
U0."comment_id" = t."id"
AND U0."user_id" = '3bd3bc86-0335-481e-9fd2-eb2fb1168f48'
)
LIMIT 1
) AS "liked"
FROM
(
SELECT DISTINCT
"cte"."id",
"cte"."created_at",
"cte"."updated_at",
"cte"."user_id",
"cte"."text",
"cte"."commented_at",
"cte"."edited_at",
"cte"."parent_id",
"cte"."video_id",
"cte"."root_id" AS "root_id",
COUNT(DISTINCT "cte"."root_id") OVER(PARTITION BY "cte"."root_id") AS "reply_count", <--- here
COUNT("videos_productvideolikecomment"."id") OVER(PARTITION BY "cte"."id") AS "liked_count"
FROM
"cte"
LEFT OUTER JOIN
"videos_productvideolikecomment"
ON (
"cte"."id" = "videos_productvideolikecomment"."comment_id"
)
) t
WHERE
t."id" = t."root_id"
ORDER BY
CASE
WHEN t."user_id" = '3bd3bc86-0335-481e-9fd2-eb2fb1168f48' THEN 0
ELSE 1
END ASC,
"liked_count" DESC
DISTINCT will look for duplicates and remove it, but in big data it will take a lot of time to process this query, you should process the middle of the record in the programming part I think it will be fast than. Thank

How pass paraeter to WITH clause

I want pass parameter to WITH class something like this:
WITH sction(id) AS (
SELECT q.value1
FROM Example q
WHERE q.id=id )
is it possible?Anyone can help me?
WITH factoring clause a.k.a. CTE (Common Table Expression) is what we some time ago used to call a "subquery". As such, it uses a WHERE clause which you can use to pass that "parameter". For example:
WITH sction AS
(SELECT q.id,
q.value1
FROM Example q
)
SELECT *
FROM sction
WHERE id = 125 --> "125" is that "parameter" you pass while selecting from SCTION CTE
As of a subquery I mentioned: that would have been
select *
from (
select id, value1 from example --> this is a CTE
)
where id = 125
In CTE, it is moved "up", outside your "main" query.
with sction as (select q.value1,q.id from example q )
select * from sction where id = 1;

How to search for multiple matches using the IN operator in BigQuery?

Right now I am filtering my rows by using the WHERE operator and 2 conditional statements. It seems somewhat inefficient that I am writing 2 conditions. Would it be possible to check if "amznbida" and "ksga" are in the array by only writing in one statement?
standardSQL
-- Get all the keys
SELECT
*
FROM `encoded-victory-198215.DFP_TEST.test3`
WHERE
"amznbida" IN UNNEST(ARRAY(SELECT name FROM UNNEST(keywords)))
AND
"ksga"IN UNNEST(ARRAY(SELECT name FROM UNNEST(keywords)))
Just remove the UNNEST(ARRAY( part and leave the subquery - you should be fine.
working example:
SELECT
*,
t in (select * from unnest(a)) condition
FROM unnest([
struct('a' as t, ['a', 'b', 'c'] as a),
('b',['r', 'f'])
])
Below is for BigQuery Standard SQL
#standardSQL
SELECT *
FROM `encoded-victory-198215.DFP_TEST.test3`
WHERE 2 = (SELECT COUNT(DISTINCT name) FROM UNNEST(keywords) WHERE name IN ("amznbida", "ksga"))
Yu can test, play with above using dummy data as below
#standardSQL
WITH `encoded-victory-198215.DFP_TEST.test3` AS (
SELECT
ARRAY<STRUCT<value ARRAY<STRING>, name STRING>>[
STRUCT(['ksg-1', 'ksg-2'], 'ksga'), STRUCT(['amznbid-1', 'amznbid-2'], 'amznbida')
] keywords,
1 impression UNION ALL
SELECT
ARRAY<STRUCT<value ARRAY<STRING>, name STRING>>[
STRUCT(['xxx-1', 'xxx-2'], 'xxxa'), STRUCT(['amznbid-1', 'amznbid-2'], 'amznbida')
] keywords,
2 impression
)
SELECT *
FROM `encoded-victory-198215.DFP_TEST.test3`
WHERE 2 = (SELECT COUNT(DISTINCT name) FROM UNNEST(keywords) WHERE name IN ("amznbida", "ksga"))
with result
Row keywords.value keywords.name impression
1 ksg-1 ksga 1
ksg-2
amznbid-1 amznbida
amznbid-2

Using a case statement as an if statement

I am attempting to create an IF statement in BigQuery. I have built a concept that will work but it does not select the data from a table, I can only get it to display 1 or 0
Example:
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN
(Select * from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest`) -- This Does not
work Scalar subquery cannot have more than one column unless using SELECT AS
STRUCT to build STRUCT values at [16:4] END
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN 1 --- This does work
Else
0
END
How can I Get this query to return results from an existing table?
You question is still a little generic, so my answer same as well - and just mimic your use case at extend I can reverse engineer it from your comments
So, in below code - project.dataset.yourtable mimics your table ; whereas
project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered mimic your respective views
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'bbb' cols, 'latest' filter
), `project.dataset.yourtable_Prior_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'prior'
), `project.dataset.yourtable_Latest_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'latest'
), check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
the result is
Row cols filter
1 aaa prior
2 bbb latest
if you changed your table to
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'aaa' cols, 'latest' filter
) ......
the result will be
Row cols filter
Query returned zero records.
I hope this gives you right direction
Added more explanations:
I can be wrong - but per your question - it looks like you have one table project.dataset.yourtable and two views project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered which present state of your table prior and after some event
So, first three CTE in the answer above just mimic those table and views which you described in your question.
They are here so you can see concept and can play with it without any extra work before adjusting this to your real use-case.
For your real use-case you should omit them and use your real table and views names and whatever columns the have.
So the query for you to play with is:
#standardSQL
WITH check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
It should be a very simple IF statement in any language.
Unfortunately NO! it cannot be done with just simple IF and if you see it fit you can submit a feature request to BigQuery team for whatever you think makes sense

jpql INTERSECT without INTERSECT

i have four queries that return intergers.
select listOfIntegers from [something]...
(edit: the results are ROWS)
and need a way to do
select ...
intersect
select ...
intersect
select ...
intersect
select ...
but in jpql there is no intersect as such.
so, is there a way to mimic the behavior using some other jpql to get the same result?
(for those that are unsure about intersect) basically i need to get all the values that appear in ALL the selects...
result from select 1: 1,2,3,4
result from select 2: 1,2,5,6
result from select 3: 1,2,7,8
result from select 4: 1,2,9,0
so the result i want with intersect: 1,2
thnx a lot
p.s. there is no chance to use ANYHTING OTHER THAN JPQL :( no native queries, etc...
Can you use something like this?:
select s1.result
from select_1 as s1
where exists (
select *
from select_2 as s2
where s2.result = s1.result
)
and exists (
select *
from select_3 as s3
where s3.result = s1.result
)
and exists (
select *
from select_4 as s4
where s4.result = s1.result
);