How to optimise performance of a query with 10+ joins - google-bigquery

I am running a query in BigQuery that contains 10+ joins, and it becomes very slow when I try to fetch 3-4 months of data.
It works fine for a day or two of data, but runs for 3+ hours when fetching even 15 days.
I need to fetch the last 6 months of data, but I cannot even fetch 15 days.
Any optimization suggestions would be very welcome.
Sample query below:
with staging as (
SELECT *
FROM base_table
WHERE condition1,condition2,...condition n
),
login_users as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
menu_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile1_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile1_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile2_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile3_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile3_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile4_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile4_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile5_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile5_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile6_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile6_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile7_clicks as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
),
tile7_success as (
SELECT
field1,
field2,
field3,
field4,
field5,
field6
FROM staging
where conditions
)
select
lu.Brand,
lu.Summary_Date,
lu.Country,
count(distinct lu.field3) loggedInUsers,
count(distinct mc.field3) clickedMenu,
count(distinct setc.field3) clickedtile1,
count(distinct sets.field3) tile1_success,
count(distinct histc.field3) clickedtile2,
count(distinct detc.field3) clickedtile3,
count(distinct detUs.field3) tile3_success,
count(distinct helpc.field3) clickedtile4,
count(distinct helps.field3) tile4_success,
count(distinct gcc.field3) clickedtile5,
count(distinct gcs.field3) tile5_success,
count(distinct gcp.field3) clickedtile6,
count(distinct gcps.field3) tile6_success,
count(distinct gcd.field3) clickedtile7,
count(distinct gcds.field3) tile7_success
from login_users lu
left join menu_clicks mc
on lu.field5 = mc.field5 and lu.field2 = mc.field2 and lu.field6 < mc.field6
left join tile1_clicks setc
on mc.field5 = setc.field5 and mc.field2 = setc.field2 and mc.field6 < setc.field6
left join tile1_success sets
on setc.field5 = sets.field5 and setc.field2 = sets.field2 and setc.field6 < sets.field6
left join tile2_clicks histc
on mc.field5 = histc.field5 and mc.field2 = histc.field2 and mc.field6 < histc.field6
left join tile3_clicks detc
on mc.field5 = detc.field5 and mc.field2 = detc.field2 and mc.field6 < detc.field6
left join tile3_success detUs
on detc.field5 = detUs.field5 and detc.field2 = detUs.field2 and detc.field6 < detUs.field6
left join tile4_clicks helpc
on mc.field5 = helpc.field5 and mc.field2 = helpc.field2 and mc.field6 < helpc.field6
left join tile4_success helps
on helpc.field5 = helps.field5 and helpc.field2 = helps.field2 and helpc.field6 < helps.field6
left join tile5_clicks gcc
on mc.field5 = gcc.field5 and mc.field2 = gcc.field2 and mc.field6 < gcc.field6
left join tile5_success gcs
on gcc.field5 = gcs.field5 and gcc.field2 = gcs.field2 and gcc.field6 < gcs.field6
left join tile6_clicks gcp
on mc.field5 = gcp.field5 and mc.field2 = gcp.field2 and mc.field6 < gcp.field6
left join tile6_success gcps
on gcp.field5 = gcps.field5 and gcp.field2 = gcps.field2 and gcp.field6 < gcps.field6
left join tile7_clicks gcd
on mc.field5 = gcd.field5 and mc.field2 = gcd.field2 and mc.field6 < gcd.field6
left join tile7_success gcds
on gcd.field5 = gcds.field5 and gcd.field2 = gcds.field2 and gcd.field6 < gcds.field6
group by 1,2,3
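One thing that may be worth trying, since staging is referenced by every other CTE: BigQuery does not necessarily materialize a CTE, so the filter over base_table can end up being evaluated once per reference. The sketch below writes staging to a temporary table once and reads from that instead. It is a minimal illustration, not a tuned rewrite of the query above: the column names (user_id, session_id, event_name, event_ts) and the inline rows are hypothetical stand-ins for field1..field6 and base_table, and only two of the fifteen CTEs are shown.

```sql
-- BigQuery Standard SQL, run as a multi-statement query.
-- Hypothetical columns and rows; in practice this SELECT would be the
-- filtered read of base_table, listing only the columns used later.
CREATE TEMP TABLE staging AS
SELECT 'u1' AS user_id, 's1' AS session_id, 'login' AS event_name,
       TIMESTAMP '2024-01-01 10:00:00' AS event_ts
UNION ALL
SELECT 'u1', 's1', 'menu_click', TIMESTAMP '2024-01-01 10:01:00'
UNION ALL
SELECT 'u2', 's2', 'login', TIMESTAMP '2024-01-01 11:00:00';

-- The downstream CTEs now read the small temp table, not base_table.
WITH login_users AS (
  SELECT user_id, session_id, event_ts
  FROM staging
  WHERE event_name = 'login'
),
menu_clicks AS (
  SELECT user_id, session_id, event_ts
  FROM staging
  WHERE event_name = 'menu_click'
)
SELECT
  COUNT(DISTINCT lu.user_id) AS loggedInUsers,
  COUNT(DISTINCT mc.user_id) AS clickedMenu
FROM login_users AS lu
LEFT JOIN menu_clicks AS mc
  ON lu.session_id = mc.session_id
 AND lu.event_ts < mc.event_ts;
```

This only addresses repeated scanning. The inequality conditions (field6 < field6) can still multiply rows at every join level, so the query plan is the place to check which of the two dominates the run time.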

Related

mongodb equivalent to sql WITH clause?

I want to query something with SQL's WITH clause, query:
with t1 as(
select field1, field2, calculatedField
from table1
),
t2 as (
select field1, field2, calculatedField
from table2
)
select a.*, b.*
from t1 as a
left join t2 as b on a.field1 = b.field1
How can I achieve this in MongoDB? I can't find an operator in MongoDB equivalent to WITH.

How to add 2 columns to existing query result based on another query?

I have a query like this:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... where ...=field1) as field3
FROM
...
And it works fine - and I see 3 columns in results
Then I need to add one more column from the inner query:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) as my_count, sum(*) as my_sum from ...where ...=field1 ) as field3
FROM
...
This syntax doesn't work.
How can I achieve this?
The partial query makes it unclear what you really want, but I would expect the subquery to be correlated with the outer query (otherwise, you could just cross join). If so, a typical solution is a lateral join.
In Postgres:
select
field1 as field1,
field2 as field2,
x.*
from ...
left join lateral (
select count(*) as my_count, sum(...) as my_sum
from ...
where ... = field1
) x on true
Oracle supports lateral joins starting with version 12c. There, replace left join lateral with outer apply (outer apply takes no on clause).
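As a rough illustration of the outer apply form, here is a self-contained sketch for Oracle 12c or later. The tables some_table and some_other_table, the column some_field, and the inline rows are all hypothetical, since the question elides the real names.

```sql
-- Oracle 12c+. Inline CTEs stand in for the real (unnamed) tables.
WITH some_table AS (
  SELECT 1 AS field1, 'a' AS field2 FROM dual UNION ALL
  SELECT 2, 'b' FROM dual
),
some_other_table AS (
  SELECT 1 AS field1, 10 AS some_field FROM dual UNION ALL
  SELECT 1, 5 FROM dual
)
SELECT t.field1,
       t.field2,
       x.my_count,
       x.my_sum
FROM some_table t
OUTER APPLY (
  -- Correlated subquery: evaluated per row of some_table
  SELECT COUNT(*)          AS my_count,
         SUM(o.some_field) AS my_sum
  FROM some_other_table o
  WHERE o.field1 = t.field1
) x;
```

Because the subquery aggregates without a GROUP BY, it always returns exactly one row, so rows of some_table with no match get my_count = 0 and a NULL my_sum.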
The following would seem to do what you want, and it should work fine in Oracle 9i:
SELECT t.field1,
t.field2,
x.my_count,
x.my_sum
FROM SOME_TABLE t
LEFT OUTER JOIN (select FIELD1,
count(*) as my_count,
sum(SOME_FIELD) as my_sum
from SOME_OTHER_TABLE
GROUP BY FIELD1) x
ON x.FIELD1 = t.FIELD1
You can use a CTE (Common Table Expression) to precompute the values:
WITH
q as (select count(*) as my_count, sum(...) as my_sum from ... )
SELECT
field1 as field1 ,
field2 as field2 ,
q.my_count as field3,
q.my_sum as field4
FROM
...
CROSS JOIN q
Or... you can always use the less performant, simpler way:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... ) as field3,
(select sum(...) from ... ) as field4
FROM
...
With the limited (and slightly confusing: two databases, sum(*), ...) information you have given,
here is the logic:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... ) as my_count,
(select sum(<my field>) from ...) as my_sum
FROM
...

Oracle SQL - Define table names for later usage?

I wanted to know whether there is a way, in Oracle SQL, to define something like named ranges (as in Excel). For example:
DEFINE TABLE1 = SELECT FIELD1, FIELD2, FIELD3 FROM [SCHEMA].[TABLE0][WHERE/GROUP BY/HAVING/ORDER BY/...];
DEFINE TABLE2 = SELECT FIELD1, FIELD2, FIELD3 FROM TABLE1 [WHERE/GROUP BY/HAVING/ORDER BY/...];
DEFINE TABLE3 = SELECT FIELD1, FIELD2, FIELD3 FROM TABLE2 LEFT JOIN TABLE1 ON [CONDITIONS];
SELECT * FROM TABLE3;
Thanks a lot in advance.
Based on your examples, it sounds like you want to create views:
CREATE VIEW TABLE1 AS
SELECT FIELD1, FIELD2, FIELD3
FROM [SCHEMA].[TABLE0][WHERE/GROUP BY/HAVING/...];
CREATE VIEW TABLE2 AS
SELECT FIELD1, FIELD2, FIELD3
FROM TABLE1 [WHERE/GROUP BY/HAVING/...];
CREATE VIEW TABLE3 AS
SELECT FIELD1, FIELD2, FIELD3
FROM TABLE2
LEFT JOIN TABLE1 ON [CONDITIONS];
SELECT * FROM TABLE3;
To close this question: as one of the comments (from Steve) pointed out, what I needed was a WITH clause, since I don't have DDL privileges.
Thanks,
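For readers in the same situation, the three DEFINE steps from the question could be sketched as chained CTEs, which need no DDL privileges. This is only an illustration: table0 and its rows are hypothetical stand-ins for [SCHEMA].[TABLE0], and the WHERE and ON conditions are made up, because the question leaves them unspecified.

```sql
-- Oracle: each CTE may reference the ones defined before it.
WITH table0 AS (
  -- Hypothetical data in place of [SCHEMA].[TABLE0]
  SELECT 1 AS field1, 'a' AS field2, 10 AS field3 FROM dual UNION ALL
  SELECT 2, 'b', 20 FROM dual UNION ALL
  SELECT 3, 'c', 30 FROM dual
),
table1 AS (
  SELECT field1, field2, field3
  FROM table0
  WHERE field3 >= 20          -- made-up condition
),
table2 AS (
  SELECT field1, field2, field3
  FROM table1
  WHERE field1 <= 2           -- made-up condition
),
table3 AS (
  SELECT t2.field1, t2.field2, t1.field3
  FROM table2 t2
  LEFT JOIN table1 t1 ON t1.field1 = t2.field1
)
SELECT * FROM table3;
```

Unlike views, these names exist only for the duration of the single statement, so the WITH block has to be repeated in every query that needs it.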

Correlating a select passed to a function with the outer select

I have to call a function inside a select, and the argument of that function is itself a select that must be matched against the table of the outer select.
I tried this, but it doesn't work:
SELECT field1,
field2,
function(select field3 from table2 where table2.id = table1.id and table2.id = 3)
FROM table1
WHERE ...
How should I write the select inside the function?
You could combine the results with a union:
SELECT field1,
field2
FROM table1
WHERE ...
Union All
Select function(select * from table2 where table2.id = table1.id) --function returning max
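Another possibility, if the function takes a single scalar value: in PostgreSQL, for one, a subquery passed as a function argument needs its own pair of parentheses in addition to the call's parentheses, and it may then reference the outer table directly. The question does not say which database is used, so treat the following as a sketch under that assumption. upper() stands in for the unnamed function, and the inline rows are hypothetical.

```sql
-- PostgreSQL. Note the double parentheses: the inner pair makes the
-- subquery a scalar expression, the outer pair belongs to the call.
WITH table1 AS (
  SELECT 1 AS id, 'a' AS field1, 'x' AS field2 UNION ALL
  SELECT 3, 'b', 'y'
),
table2 AS (
  SELECT 3 AS id, 'hello' AS field3
)
SELECT field1,
       field2,
       upper((SELECT t2.field3
              FROM table2 t2
              WHERE t2.id = table1.id)) AS result
FROM table1;
```

The subquery must return at most one row per outer row. Where it finds no match it yields NULL, which is then passed to the function.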

Joining multiple tables in bigquery

I would like to be able to join multiple tables in BigQuery. Joining two is pretty trivial.
SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3
FROM [datasetName.tableA] t1
JOIN [datasetName.tableB] t2
ON t1.somefield = t2.anotherfield
But what if I want to join three or more tables? Can I just do it as
SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3,
t3.field4 as field4
FROM [datasetName.tableA] t1
JOIN [datasetName.tableB] t2
JOIN [datasetName.tableC] t3
ON t1.somefield = t2.anotherfield AND t1.somefield=t3.yetanotherfield
I've tried that and it doesn't work. I think I need to do something like
SELECT
t12.field1 as field1,
t12.field2 as field2,
t3.field3 as field3
FROM
(SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3
FROM [datasetName.tableA] t1
JOIN [datasetName.tableB] t2
ON t1.somefield = t2.anotherfield) t12
JOIN
[datasetName.tableC] t3
ON t12.field1 = t3.field1
But is there a simpler way to accomplish this?
Thanks,
Brad
I think you are looking for something like the query below:
SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3,
t3.field4 AS field4
FROM [datasetName.tableA] t1
JOIN [datasetName.tableB] t2 ON t1.somefield = t2.anotherfield
JOIN [datasetName.tableC] t3 ON t1.somefield = t3.yetanotherfield
You can also use the USING(field) notation in Standard SQL, which is really just syntactic sugar for the ON clauses in @mikhail-berlyant's answer.
#standardSQL
SELECT
t1.field1 AS field1,
t2.field2 AS field2,
t1.field3 AS field3
FROM `datasetName.tableA` t1
JOIN `datasetName.tableB` t2 USING(commonfield_AB)
JOIN `datasetName.tableC` t3 USING(commonfield_AC)
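To try the chained form without real tables, here is a self-contained sketch in BigQuery Standard SQL. The inline CTEs and the shared key column id are hypothetical stand-ins for the three dataset tables and their common fields.

```sql
-- BigQuery Standard SQL: each JOIN carries its own USING (or ON) clause.
WITH tableA AS (
  SELECT 1 AS id, 'a1' AS field1, 'a3' AS field3 UNION ALL
  SELECT 2, 'b1', 'b3'
),
tableB AS (
  SELECT 1 AS id, 'a2' AS field2 UNION ALL
  SELECT 2, 'b2'
),
tableC AS (
  SELECT 1 AS id, 'a4' AS field4
)
SELECT
  t1.field1 AS field1,
  t2.field2 AS field2,
  t1.field3 AS field3,
  t3.field4 AS field4
FROM tableA AS t1
JOIN tableB AS t2 USING (id)
JOIN tableC AS t3 USING (id);
```

Only id = 1 is present in all three inputs, so the inner joins keep that row alone. Switching the last join to LEFT JOIN would keep id = 2 as well, with a NULL field4.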