How to optimise performance of a query with 10+ joins - google-bigquery
I am running a query in BigQuery that contains 10+ joins, and it becomes very slow when I try to fetch 3-4 months of data.
It works fine for a day or two of data, but runs for 3+ hours even for 15 days of data.
I need to fetch the last 6 months of data, but I cannot even fetch 15 days.
Any optimization suggestions would be very welcome.
Sample query below:
with staging as (
SELECT *
FROM base_table
WHERE condition1,condition2,...condition n
),
login_users as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
menu_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile1_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile1_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile2_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile3_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile3_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile4_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile4_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile5_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile5_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile6_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile6_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile7_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile7_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
)
select
lu.Brand,
lu.Summary_Date,
lu.Country,
count(distinct lu.field3) loggedInUsers,
count(distinct mc.field3) clickedMenu,
count(distinct setc.field3) clickedtile1,
count(distinct sets.field3) tile1_success,
count(distinct histc.field3) clickedtile2,
count(distinct detc.field3) clickedtile3,
count(distinct detUs.field3) tile3_success,
count(distinct helpc.field3) clickedtile4,
count(distinct helps.field3) tile4_success,
count(distinct gcc.field3) clickedtile5,
count(distinct gcs.field3) tile5_success,
count(distinct gcp.field3) clickedtile6,
count(distinct gcps.field3) tile6_success,
count(distinct gcd.field3) clickedtile7,
count(distinct gcds.field3) tile7_success
from login_users lu
left join menu_clicks mc
on lu.field5 = mc.field5 and lu.field2 = mc.field2 and lu.field6 < mc.field6
left join tile1_clicks setc
on mc.field5 = setc.field5 and mc.field2 = setc.field2 and mc.field6 < setc.field6
left join tile1_success sets
on setc.field5 = sets.field5 and setc.field2 = sets.field2 and setc.field6 < sets.field6
left join tile2_clicks histc
on mc.field5 = histc.field5 and mc.field2 = histc.field2 and mc.field6 < histc.field6
left join tile3_clicks detc
on mc.field5 = detc.field5 and mc.field2 = detc.field2 and mc.field6 < detc.field6
left join tile3_success detUs
on detc.field5 = detUs.field5 and detc.field2 = detUs.field2 and detc.field6 < detUs.field6
left join tile4_clicks helpc
on mc.field5 = helpc.field5 and mc.field2 = helpc.field2 and mc.field6 < helpc.field6
left join tile4_success helps
on helpc.field5 = helps.field5 and helpc.field2 = helps.field2 and helpc.field6 < helps.field6
left join tile5_clicks gcc
on mc.field5 = gcc.field5 and mc.field2 = gcc.field2 and mc.field6 < gcc.field6
left join tile5_success gcs
on gcc.field5 = gcs.field5 and gcc.field2 = gcs.field2 and gcc.field6 < gcs.field6
left join tile6_clicks gcp
on mc.field5 = gcp.field5 and mc.field2 = gcp.field2 and mc.field6 < gcp.field6
left join tile6_success gcps
on gcp.field5 = gcps.field5 and gcp.field2 = gcps.field2 and gcp.field6 < gcps.field6
left join tile7_clicks gcd
on mc.field5 = gcd.field5 and mc.field2 = gcd.field2 and mc.field6 < gcd.field6
left join tile7_success gcds
on gcd.field5 = gcds.field5 and gcd.field2 = gcds.field2 and gcd.field6 < gcds.field6
group by 1,2,3
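One direction that may be worth testing for a funnel query shaped like the one above is to replace the chain of range self-joins (which multiply rows whenever a session has several matching events) with a single per-session aggregation. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for BigQuery, and the `events` table, its columns (`user_id`, `session_id`, `event`, `ts`) and the three-step funnel are hypothetical simplifications of the placeholders in the question (field3, field5/field2, the CTE filters, field6). It shows that the two formulations agree on toy data; it says nothing about actual BigQuery run times.

```python
import sqlite3

# Hypothetical toy data: one row per event, ts is an integer timestamp.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id TEXT, session_id TEXT, event TEXT, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("u1", "s1", "login", 1), ("u1", "s1", "menu", 2), ("u1", "s1", "tile", 3),
        ("u2", "s2", "login", 1), ("u2", "s2", "tile", 2), ("u2", "s2", "menu", 3),
        ("u3", "s3", "menu", 1), ("u3", "s3", "login", 2),
        ("u4", "s4", "login", 1), ("u4", "s4", "menu", 2),
        ("u4", "s4", "tile", 4), ("u4", "s4", "menu", 5),
    ],
)

# Join formulation, same shape as the question: each step is a left join
# on the session keys plus a "happened later" range condition.
JOIN_SQL = """
WITH login AS (SELECT * FROM events WHERE event = 'login'),
     menu  AS (SELECT * FROM events WHERE event = 'menu'),
     tile  AS (SELECT * FROM events WHERE event = 'tile')
SELECT COUNT(DISTINCT lu.user_id),
       COUNT(DISTINCT mc.user_id),
       COUNT(DISTINCT tc.user_id)
FROM login lu
LEFT JOIN menu mc
  ON lu.session_id = mc.session_id AND lu.user_id = mc.user_id AND lu.ts < mc.ts
LEFT JOIN tile tc
  ON mc.session_id = tc.session_id AND mc.user_id = tc.user_id AND mc.ts < tc.ts
"""

# Aggregation formulation: find the first login per session, then the first
# menu click after it and the last tile click, all in one grouped pass.
# A tile click counts if it is later than that first qualifying menu click.
AGG_SQL = """
WITH first_login AS (
    SELECT user_id, session_id, MIN(ts) AS login_ts
    FROM events
    WHERE event = 'login'
    GROUP BY user_id, session_id
),
per_session AS (
    SELECT fl.user_id, fl.session_id,
           MIN(CASE WHEN e.event = 'menu' AND e.ts > fl.login_ts THEN e.ts END) AS first_menu,
           MAX(CASE WHEN e.event = 'tile' THEN e.ts END) AS last_tile
    FROM first_login fl
    JOIN events e
      ON e.user_id = fl.user_id AND e.session_id = fl.session_id
    GROUP BY fl.user_id, fl.session_id
)
SELECT COUNT(DISTINCT user_id),
       COUNT(DISTINCT CASE WHEN first_menu IS NOT NULL THEN user_id END),
       COUNT(DISTINCT CASE WHEN last_tile > first_menu THEN user_id END)
FROM per_session
"""

join_counts = conn.execute(JOIN_SQL).fetchone()
agg_counts = conn.execute(AGG_SQL).fetchone()
print(join_counts, agg_counts)
```

The only join left in the second query is an equality join against one row per session, so it cannot fan out the way the chained range joins can. Whether this carries over to the real query depends on details the question leaves out (the CTE filters, and the Brand/Summary_Date/Country grouping, which would need to be added to the GROUP BY).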
Related
mongodb equivalent to sql WITH clause?
I want to run a query that uses SQL's WITH clause:
    with t1 as (
        select field1, field2, calculatedField from table1
    ),
    t2 as (
        select field1, field2, calculatedField from table2
    )
    select a.*, b.*
    from t1 as a
    left join t2 as b on a.field1 = b.field1
How can I achieve this in MongoDB? I can't find an operator in MongoDB for WITH.
How to add 2 columns to existing query result based on another query?
I have a query like this:
    SELECT field1 as field1,
           field2 as field2,
           (select count(*) from ... where ...=field1) as field3
    FROM ...
It works fine, and I see 3 columns in the results. Then I need the inner query to return one more column:
    SELECT field1 as field1,
           field2 as field2,
           (select count(*) as my_count, sum(*) as my_sum from ... where ...=field1) as field3
    FROM ...
This syntax doesn't work. How can I achieve it?
This partial query makes it unclear what you really want, but I would expect that the subquery actually correlates to the outer query (otherwise, you could just cross join). If so, a typical solution is a lateral join. In Postgres:
    select field1 as field1, field2 as field2, x.*
    from ...
    left join lateral (
        select count(*) as my_count, sum(*) as my_sum from ...
    ) x on true
Oracle supports lateral joins starting with version 12. You just need to replace left join lateral with outer apply (and drop the on true).
The following would seem to do what you want, and it should work fine in Oracle 9i:
    SELECT t.field1, t.field2, x.my_count, x.my_sum
    FROM SOME_TABLE t
    LEFT OUTER JOIN (
        select FIELD1, count(*) as my_count, sum(SOME_FIELD) as my_sum
        from SOME_OTHER_TABLE
        GROUP BY FIELD1
    ) x ON x.FIELD1 = t.FIELD1
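The derived-table join in the answer directly above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 as a stand-in for Oracle, and the rows in SOME_TABLE and SOME_OTHER_TABLE are made up for illustration; only the query shape comes from the answer (an ORDER BY is added to make the output deterministic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE some_table (field1 TEXT, field2 INTEGER);
    CREATE TABLE some_other_table (field1 TEXT, some_field INTEGER);
    INSERT INTO some_table VALUES ('a', 1), ('b', 2), ('c', 3);
    INSERT INTO some_other_table VALUES ('a', 10), ('a', 5), ('b', 7);
    """
)

# Aggregate the other table once per key, then left join the result so
# that rows without any match still appear (with NULL count and sum).
rows = conn.execute(
    """
    SELECT t.field1, t.field2, x.my_count, x.my_sum
    FROM some_table t
    LEFT OUTER JOIN (
        SELECT field1, COUNT(*) AS my_count, SUM(some_field) AS my_sum
        FROM some_other_table
        GROUP BY field1
    ) x ON x.field1 = t.field1
    ORDER BY t.field1
    """
).fetchall()

for row in rows:
    print(row)
```

Both aggregates come back as separate columns from a single pass over the second table, which is what the scalar subquery in the question could not do.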
You can use a CTE (Common Table Expression) to precompute the values:
    WITH q as (select count(*) as my_count, sum(*) as my_sum from ... )
    SELECT field1 as field1, field2 as field2, q.my_count as field3, q.my_sum as field4
    FROM ...
    CROSS JOIN q
Or you can always use the simpler, less performant way:
    SELECT field1 as field1,
           field2 as field2,
           (select count(*) from ... ) as field3,
           (select sum(*) from ... ) as field4
    FROM ...
With the limited (and somewhat confusing: 2 databases, sum(*) ...) information, here is the logic:
    SELECT field1 as field1,
           field2 as field2,
           (select count(*) from ... ) as my_count,
           (select sum(<my field>) from ...) as my_sum
    FROM ...
Oracle SQL - Define table names for later usage?
I wanted to know if there is a way, in Oracle SQL, to do some kind of range definition (like in Excel). For example:
    DEFINE TABLE1 = SELECT FIELD1, FIELD2, FIELD3 FROM [SCHEMA].[TABLE0][WHERE/GROUP BY/HAVING/ORDER BY/...];
    DEFINE TABLE2 = SELECT FIELD1, FIELD2, FIELD3 FROM TABLE1 [WHERE/GROUP BY/HAVING/ORDER BY/...];
    DEFINE TABLE3 = SELECT FIELD1, FIELD2, FIELD3 FROM TABLE2 LEFT JOIN TABLE1 ON [CONDITIONS];
    SELECT * FROM TABLE3;
Thanks a lot in advance.
Based on your examples, it sounds like you want to create views:
    CREATE VIEW TABLE1 AS SELECT FIELD1, FIELD2, FIELD3 FROM [SCHEMA].[TABLE0][WHERE/GROUP BY/HAVING/...];
    CREATE VIEW TABLE2 AS SELECT FIELD1, FIELD2, FIELD3 FROM TABLE1 [WHERE/GROUP BY/HAVING/...];
    CREATE VIEW TABLE3 AS SELECT FIELD1, FIELD2, FIELD3 FROM TABLE2 LEFT JOIN TABLE1 ON [CONDITIONS];
    SELECT * FROM TABLE3;
To close this question: based on one of the comments (from Steve), what I needed was a WITH clause, as I didn't have DDL privileges. Thanks.
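For readers wondering what the WITH-clause version of the view chain might look like, here is a minimal sketch. It runs on Python's built-in sqlite3 rather than Oracle, and the contents of table0 and the two filter conditions are hypothetical stand-ins for the bracketed placeholders in the question; only the TABLE1 / TABLE2 / TABLE3 chaining mirrors the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table0 (field1 INTEGER, field2 TEXT, field3 INTEGER);
    INSERT INTO table0 VALUES (1, 'x', 10), (2, 'y', 20), (3, 'x', -5);
    """
)

# Each named subquery can refer to the ones defined before it, and no
# CREATE VIEW (DDL) privilege is needed because nothing is persisted.
rows = conn.execute(
    """
    WITH table1 AS (
        SELECT field1, field2, field3 FROM table0 WHERE field3 > 0
    ),
    table2 AS (
        SELECT field1, field2, field3 FROM table1 WHERE field2 = 'x'
    ),
    table3 AS (
        SELECT t2.field1, t2.field2, t1.field3
        FROM table2 t2
        LEFT JOIN table1 t1 ON t1.field1 = t2.field1
    )
    SELECT * FROM table3
    """
).fetchall()

print(rows)
```

The named subqueries exist only for the duration of the statement, so the definitions have to be repeated in every query that needs them, which is the trade-off compared with views.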
Matching a table between a select inside a function call and the outer select
I have to call a function inside a select and, at the same time, pass it a subquery that matches rows from the outer select. I tried this, but it doesn't work:
    SELECT field1, field2,
           function(select field3 from table2 where table2.id = table1.id and table2.id = 3)
    FROM table1
    WHERE ...
How should I write the select inside the function call?
You should do a union of the results:
    SELECT field1, field2 FROM table1 WHERE ...
    UNION ALL
    SELECT function(select * from table2 where table2.id = table1.id) -- function returning max
Joining multiple tables in bigquery
I would like to be able to join multiple tables in BigQuery. Joining two is pretty trivial:
    SELECT t1.field1 AS field1, t2.field2 AS field2, t1.field3 AS field3
    FROM [datasetName.tableA] t1
    JOIN [datasetName.tableB] t2 ON t1.somefield = t2.anotherfield
But what if I want to join three or more tables? Can I just do it as
    SELECT t1.field1 AS field1, t2.field2 AS field2, t1.field3 AS field3, t3.field4 as field4
    FROM [datasetName.tableA] t1
    JOIN [datasetName.tableB] t2
    JOIN [datasetName.tableC] t3
    ON t1.somefield = t2.anotherfield AND t1.somefield=t3.yetanotherfield
I've tried that and it doesn't work. I think I need to do something like
    SELECT t12.field1 as field1, t12.field2 as field2, t3.field3 as field3,
    FROM (
        SELECT t1.field1 AS field1, t2.field2 AS field2, t1.field3 AS field3
        FROM [datasetName.tableA] t1
        JOIN [datasetName.tableB] t2 ON t1.somefield = t2.anotherfield
    ) t12
    JOIN [datasetName.tableC] t3 ON t12.field1 = t3.field1
But is there a simpler way to accomplish this?
Thanks, Brad
I think you are looking for something like the query below:
    SELECT t1.field1 AS field1, t2.field2 AS field2, t1.field3 AS field3, t3.field4 AS field4
    FROM [datasetName.tableA] t1
    JOIN [datasetName.tableB] t2 ON t1.somefield = t2.anotherfield
    JOIN [datasetName.tableC] t3 ON t1.somefield = t3.yetanotherfield
You can also use the USING(field) notation in Standard SQL, which is really just syntactic sugar for @mikhail-berlyant's answer:
    #standardSQL
    SELECT t1.field1 AS field1, t2.field2 AS field2, t1.field3 AS field3
    FROM `datasetName.tableA` t1
    JOIN `datasetName.tableB` t2 USING(commonfield_AB)
    JOIN `datasetName.tableC` t3 USING(commonfield_AC)
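To see that the chained ON form and the USING form return the same rows, here is a small runnable sketch. It uses Python's built-in sqlite3 as a stand-in for BigQuery, and the table names (`table_a`, `table_b`, `table_c`), the key columns (`key_ab`, `key_ac`) and the sample rows are hypothetical; the point is only that each JOIN carries its own condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (key_ab INTEGER, key_ac INTEGER, field1 TEXT, field3 TEXT);
    CREATE TABLE table_b (key_ab INTEGER, field2 TEXT);
    CREATE TABLE table_c (key_ac INTEGER, field4 TEXT);
    INSERT INTO table_a VALUES (1, 10, 'a1', 'x1'), (2, 20, 'a2', 'x2'), (3, 30, 'a3', 'x3');
    INSERT INTO table_b VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO table_c VALUES (10, 'c1'), (30, 'c3');
    """
)

# Explicit form: every JOIN is followed immediately by its own ON clause.
on_rows = conn.execute(
    """
    SELECT t1.field1, t2.field2, t1.field3, t3.field4
    FROM table_a t1
    JOIN table_b t2 ON t1.key_ab = t2.key_ab
    JOIN table_c t3 ON t1.key_ac = t3.key_ac
    ORDER BY t1.field1
    """
).fetchall()

# USING form: valid when the join column has the same name on both sides.
using_rows = conn.execute(
    """
    SELECT t1.field1, t2.field2, t1.field3, t3.field4
    FROM table_a t1
    JOIN table_b t2 USING (key_ab)
    JOIN table_c t3 USING (key_ac)
    ORDER BY t1.field1
    """
).fetchall()

print(on_rows)
print(using_rows)
```

Only the first row of table_a has a match in both other tables, so both inner-join queries return that single combined row.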