SQL query optimization with a join on Presto

I have the following query, which shows, out of all clicks on products, how many products have more than one image. The two subqueries run fine on their own, but the combined query does not finish within the 3-minute execution limit. How can it best be optimized?
select DATE(DATE_TRUNC('week',dt)) AS wk_dt,
COUNT(DISTINCT a.product_id) tot_prods,
COUNT(DISTINCT b.product_id) multiimp_prods,
count(a.product_id) AS total_clicks,
count(case when b.product_id>0 then a.product_id END) AS total_mulimg_clicks
from (
select product_id,DATE(from_unixtime((time/1000)+19800)) as dt--,count(distinct_id) clicks
from silver.mixpanel_android__product_clicked
WHERE DATE(from_unixtime((time/1000)+19800)) BETWEEN date(date_trunc('week',cast(Current_date - interval '14' day AS date))) AND Current_date
GROUP BY 1,2) a --2,471,245 1,458,476
LEFT join (
SELECT product_id,tot_img
FROM (SELECT DISTINCT product_id, catalog_id,(a + b + c + d + e) AS tot_img
FROM (
SELECT product_id,catalog_id,
CASE WHEN images is NULL then 0 else 1 end as a,
CASE WHEN img_2 is NULL then 0 else 1 end as b,
CASE WHEN img_3 is NULL then 0 else 1 end as c,
CASE WHEN img_4 is NULL then 0 else 1 end as d,
CASE WHEN img_5 is NULL then 0 else 1 end as e
FROM (
SELECT DISTINCT id AS product_id,catalog_id,
TRIM(images) AS images,
TRIM(SPLIT_PART(images,',',2)) AS "img_2",
TRIM(SPLIT_PART(images,',',3)) AS "img_3",
TRIM(SPLIT_PART(images,',',4)) AS "img_4",
TRIM(SPLIT_PART(images,',',5)) AS "img_5"
FROM silver.supply__products --WHERE date(created) <= date(date_trunc('week',cast(Current_date - interval '14' day AS date)))
)
--GROUP BY 1,2
order by 6 asc
)
)
WHERE tot_img>1
) b
on a.product_id=b.product_id
GROUP BY 1

Related

Why is a value not returned if another value is 0 in SQL query

I have a query that does not return the Expenses value when drvalue (debtor value) is 0 or NULL.
If I change drvalue to any value greater than 0, the query returns the Expenses value.
drvalue is the sum of all the values for a specific period.
The query is below:
SELECT f.vehiclenumber,f.fleettype,f.IsCreditor,
ISNULL(SUM(l.DrValue),0) AS drvalue,
CASE WHEN iscreditor=1 then Sum (DrValue) - Sum(CrValue) else null end AS Profit,
sum(l.CrValue) AS CrValue ,
sum(l.distance) AS LoadDist,
sum(l.DrValue)/DDist AS DRVal ,
d.Liters AS Liters,d.ddist AS DDist, d.DDiesel AS DDiesel, d.DieselCost AS dieslCost,
(MAX(isnull(l.Closingkm,0))-MIN(isnull(l.OpeningKM,0))) AS [CO],ISNULL(SUM(l.DrValue),0) / CASE WHEN (isnull(MAX(l.Closingkm),0)-MIN(isnull(l.OpeningKM,0)))=0 THEN 1 ELSE (isnull(MAX(l.Closingkm),0)-MIN(isnull(l.OpeningKM,0))) END AS TotalCPK,
(d.DieselCost/sum(l.DrValue)) * 100 AS DieselPerc,
SUM(jobdetails.total) AS Expenses,
count(l.vehicleNo) AS LoadCount
FROM tblVehicle AS f
LEFT JOIN (SELECT Vehicleno, loaddate, DrValue, CrValue,Distance, OpeningKM, closingkm FROM tblloads WHERE DrValue IS NOT NULL) AS l ON f.VehicleNumber = l.VehicleNo AND l.loaddate >= '2020-06-01' and l.loaddate <= '2020-06-30'
LEFT JOIN (SELECT fleet , NULLIF(SUM(Liters),0) AS Liters,NULLIF(sum(distance),0) AS DDist, NULLIF(sum(distance)/ CASE WHEN sum(Liters)=0 THEN 1 ELSE sum(Liters) END,0) AS DDiesel, NULLIF(SUM (Manual_Amount),0) AS DieselCost FROM tblinput
WHERE [Date] >= '2020-06-01' And [Date] <= '2020-06-30'
GROUP BY Fleet) AS d ON l.VehicleNo = d.Fleet
LEFT JOIN (SELECT jd.fleet, SUM(Total) AS total From tblJobDetails jd, tbljobcards WHERE tbljobcards.JobID = jd.jobid AND jobdate >= '2020-06-01' AND jobdate <= '2020-06-30' GROUP BY jd.fleet) AS jobdetails ON jobdetails.fleet = l.vehicleno
WHERE VehicleCategory <> 'T'
GROUP BY f.vehiclenumber,f.fleettype,f.IsCreditor,d.Liters,d.ddist, d.DDiesel,d.DieselCost
ORDER BY fleettype
RESULTS
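One likely cause, sketched under assumptions: the expenses subquery is joined on l.vehicleno, and l comes from a LEFT JOIN that finds no rows when a vehicle has no loads with a non-NULL DrValue in the period. In that case l.vehicleno is NULL, so the later joins keyed on it cannot match. The tiny reproduction below uses SQLite through Python; the tables vehicle, loads and jobs and their rows are hypothetical stand-ins for tblVehicle, tblloads and the jobdetails subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (vehiclenumber TEXT);
    CREATE TABLE loads   (vehicleno TEXT, drvalue REAL);
    CREATE TABLE jobs    (fleet TEXT, total REAL);
    INSERT INTO vehicle VALUES ('V1'), ('V2');
    INSERT INTO loads   VALUES ('V1', 500.0);           -- V2 has no loads
    INSERT INTO jobs    VALUES ('V1', 40.0), ('V2', 75.0);
""")


def expenses(join_key):
    """Return (vehicle, expenses) rows, matching jobs.fleet against join_key."""
    sql = f"""
        SELECT f.vehiclenumber, j.total
        FROM vehicle AS f
        LEFT JOIN loads AS l ON l.vehicleno = f.vehiclenumber
        LEFT JOIN jobs  AS j ON j.fleet = {join_key}
        ORDER BY f.vehiclenumber
    """
    return conn.execute(sql).fetchall()


chained = expenses("l.vehicleno")        # keyed on the optional table, as in the question
anchored = expenses("f.vehiclenumber")   # keyed on the driving table

print(chained)   # V2 has expenses in jobs, but they come back as NULL
print(anchored)  # V2 keeps its expenses
```

If this is what is happening, joining d and jobdetails on f.VehicleNumber instead of l.VehicleNo should bring the expenses back. Note also that tblloads is not pre-aggregated, so a vehicle with several loads repeats its single jobdetails row, and SUM(jobdetails.total) may be inflated.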

How do I Separate One Column values into Multiple rows based on condition in SQL

How can I separate the values of one column into multiple columns based on a condition? The resulting table will be shown in a grid view. I tried Count(*) in multiple SELECT statements, but the result was not what I expected. Thanks in advance.
Table: RegistrationReport
Date        Type
----------------
02/05/2015  A
04/05/2015  B
04/05/2015  C
05/05/2015  A
I need output like this:
Date        Type 1  Type 2  Type 3
----------------------------------
02/05/2015  A       -       -
04/05/2015  -       B       -
04/05/2015  -       -       C
05/05/2015  A       -       -
----------------------------------
Total:      2       1       1
Try the simple query below, which also returns the Total row.
;with CTE as(
SELECT Date
,case when Type = 'A' then 'A' else '-' end as 'Type_1'
, case when Type = 'B' then 'B' else '-' end as 'Type_2'
, case when Type = 'C' then 'C' else '-' end as 'Type_3'
FROM RegistrationReport
)
select cast(Date as varchar(20))Date
,Type_1
,Type_2
,Type_3
from CTE
UNION ALL
SELECT 'Total:'
,CAST(SUM(case when Type_1= 'A' then 1 else 0 end)as varchar(10))
,CAST(SUM(case when Type_2= 'B' then 1 else 0 end)as varchar(10))
,CAST(SUM(case when Type_3= 'C' then 1 else 0 end) as varchar(10))
FROM CTE
This gives the required output.
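To try the conditional-aggregation approach without a SQL Server instance, here is a minimal sketch that runs the same idea on SQLite from Python. Several things are assumptions made for the sketch: the dates are stored as ISO text, the sort_key column is added only so that the Total row sorts last, and CAST(... AS TEXT) stands in for the varchar casts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RegistrationReport (Date TEXT, Type TEXT)")
conn.executemany(
    "INSERT INTO RegistrationReport VALUES (?, ?)",
    [("2015-05-02", "A"), ("2015-05-04", "B"),
     ("2015-05-04", "C"), ("2015-05-05", "A")],
)

sql = """
WITH cte AS (
    SELECT Date,
           CASE WHEN Type = 'A' THEN 'A' ELSE '-' END AS Type_1,
           CASE WHEN Type = 'B' THEN 'B' ELSE '-' END AS Type_2,
           CASE WHEN Type = 'C' THEN 'C' ELSE '-' END AS Type_3
    FROM RegistrationReport
)
SELECT 0 AS sort_key, Date AS label, Type_1, Type_2, Type_3 FROM cte
UNION ALL
SELECT 1, 'Total:',
       CAST(SUM(CASE WHEN Type_1 = 'A' THEN 1 ELSE 0 END) AS TEXT),
       CAST(SUM(CASE WHEN Type_2 = 'B' THEN 1 ELSE 0 END) AS TEXT),
       CAST(SUM(CASE WHEN Type_3 = 'C' THEN 1 ELSE 0 END) AS TEXT)
FROM cte
ORDER BY sort_key, label
"""

# Drop the helper sort_key column before displaying the rows.
rows = [row[1:] for row in conn.execute(sql)]
for row in rows:
    print(row)
```

The explicit sort key matters because UNION ALL on its own does not guarantee that the Total row comes last.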
Assuming you know the number of columns, and it is relatively few, I think the easiest solution is to just self join:
Select distinct Cast(coalesce(a.date, b.date, c.date) as varchar) as Date
, isnull(a.Type, '--') as Type1
, isnull(b.Type, '--') as Type2
, isnull(c.Type, '--') as Type3
from Table a
full outer join Table b
on a.date = b.date
full outer join Table c
on isnull(a.date, b.date) = c.date
where isnull(a.type, 'A') = 'A'
and isnull(b.type, 'B') = 'B'
and isnull(c.type, 'C') = 'C'
union all
select 'Total'
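, cast(count(distinct case when type = 'A' then Date end) as varchar(10)) -- cast so the column types match the varchar columns above
, cast(count(distinct case when type = 'B' then Date end) as varchar(10))
, cast(count(distinct case when type = 'C' then Date end) as varchar(10))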
from Table

sql join and group by generated date range

I have Table1 and I need a query to populate Table2:
The problem is the Date column. I want to track each location/partner combination per day. The main issue is that I can't use DateCreated as the default date, because it doesn't necessarily cover the whole date range; in this example it has no rows for 2015-01-07 and 2015-01-09. The same is true of the other date columns.
So my idea is to first select dates from a table that contains the needed date range, and then do the calculation for each day/location/partner combination from a CTE. But then I can't figure out how to join on LocationId and PartnerId.
Columns:
Date
CreatedItems - number of created items where Table1.DateCreated = Table2.Date
DeliveredItems - number of delivered items where Table1.DateOut = Table2.Date
CycleTime - number of days the delivered item was in the location (DateOut - DateIn + 1)
I started with something like this, but it's very likely that I completely missed the point:
with d as
(
select date from DimDate
where date between DATEADD(DAY, -365, getdate()) and getdate()
),
cr as -- created items
(
select
DateCreated,
LocationId,
PartnerId,
CreatedItems = count(*)
from Table1
where DateCreated is not null
group by DateCreated,
LocationId,
PartnerId
),
del as -- delivered items
(
select
DateOut,
LocationId,
PartnerId,
DeliveredItems = count(*),
CycleTime = DATEDIFF(Day, DateOut, DateIn)
from Table1
where DateOut is not null
and Datein is not null
group by DateOut,
LocationId,
PartnerId
)
select
d.Date
from d
LEFT OUTER JOIN cr on cr.DateCreated = d.Date -- MISSING JOIN PER LocationId and PartnerId
LEFT OUTER JOIN del on del.DateOut = d.Date -- MISSING JOIN PER LocationId and PartnerId
with range(days) as (
select 0 union all select 1 union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 /* extend as necessary */
)
select dateadd(day, r.days, t.DateCreated) as "Date", locationId, PartnerId,
sum(
case
when dateadd(day, r.days, t.DateCreated) = t.DateCreated
then 1 else 0
end) as CreatedItems,
sum(
case
when dateadd(day, r.days, t.DateCreated) = t.Dateout
then 1 else 0
end) as DeliveredItems,
sum(
case
when dateadd(day, r.days, t.DateCreated) = t.Dateout
then datediff(day, t.DateIn, t.DateOut) + 1 else 0
end) as CycleTime
from
<yourtable> as t
inner join range as r
on r.days between 0 and datediff(day, t.DateCreated, t.DateOut)
group by dateadd(day, r.days, t.DateCreated), LocationId, PartnerId;
If you only want the end dates (rather than all the dates in between) this is probably a better approach:
with range(dt) as (
select distinct DateCreated from T union
select distinct DateOut from T
)
select r.dt as "Date", locationId, PartnerId,
sum(
case
when r.dt = t.DateCreated
then 1 else 0
end) as CreatedItems,
sum(
case
when r.dt = t.Dateout
then 1 else 0
end) as DeliveredItems,
sum(
case
when r.dt = t.Dateout
then datediff(day, t.DateIn, t.DateOut) + 1 else 0
end) as CycleTime
from
<yourtable> as t
inner join range as r
on r.dt in (t.DateCreated, t.DateOut)
group by r.dt, LocationId, PartnerId;
What about specifying a WHERE clause? Something like this:
WHERE cr.LocationId = del.LocationId AND
cr.PartnerId = del.PartnerId
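The calendar-first idea from the question can also be made to work by cross joining the dates with the distinct location/partner combinations, and then left joining both aggregates on all three keys. The sketch below runs on SQLite from Python and rests on several assumptions: the sample rows are made up, a recursive CTE stands in for DimDate, and CycleTime is summed per day (the question does not say whether a sum or an average is wanted).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Table1 (
    DateCreated TEXT, DateIn TEXT, DateOut TEXT,
    LocationId INTEGER, PartnerId INTEGER)""")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?)", [
    ("2015-01-06", "2015-01-06", "2015-01-08", 1, 10),
    ("2015-01-06", "2015-01-06", None,         1, 10),   # not delivered yet
    ("2015-01-08", "2015-01-08", "2015-01-08", 1, 10),
])

sql = """
WITH RECURSIVE d(dt) AS (            -- stand-in for the DimDate range
    SELECT '2015-01-06'
    UNION ALL
    SELECT date(dt, '+1 day') FROM d WHERE dt < '2015-01-09'
),
combos AS (                          -- every location/partner pair seen
    SELECT DISTINCT LocationId, PartnerId FROM Table1
),
cr AS (
    SELECT DateCreated, LocationId, PartnerId, COUNT(*) AS CreatedItems
    FROM Table1
    WHERE DateCreated IS NOT NULL
    GROUP BY DateCreated, LocationId, PartnerId
),
del AS (
    SELECT DateOut, LocationId, PartnerId,
           COUNT(*) AS DeliveredItems,
           SUM(CAST(julianday(DateOut) - julianday(DateIn) AS INTEGER) + 1) AS CycleTime
    FROM Table1
    WHERE DateOut IS NOT NULL AND DateIn IS NOT NULL
    GROUP BY DateOut, LocationId, PartnerId
)
SELECT d.dt, c.LocationId, c.PartnerId,
       COALESCE(cr.CreatedItems, 0),
       COALESCE(del.DeliveredItems, 0),
       COALESCE(del.CycleTime, 0)
FROM d
CROSS JOIN combos AS c
LEFT JOIN cr  ON cr.DateCreated = d.dt
             AND cr.LocationId = c.LocationId AND cr.PartnerId = c.PartnerId
LEFT JOIN del ON del.DateOut = d.dt
             AND del.LocationId = c.LocationId AND del.PartnerId = c.PartnerId
ORDER BY d.dt, c.LocationId, c.PartnerId
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Because cr and del are each joined to the date/combination grid rather than to each other, a day with deliveries but no creations (or the reverse) still gets a row.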

ORACLE SQL: Fill in missing dates

I have the following code which gives me production dates and production volumes for a thirty day period.
select
(case when trunc(so.revised_due_date) <= trunc(sysdate)
then trunc(sysdate) else trunc(so.revised_due_date) end) due_date,
(case
when (case when sp.pr_typ in ('VV','VD') then 'DVD' when sp.pr_typ in ('RD','CD')
then 'CD' end) = 'CD'
and (case when so.tec_criteria in ('PI','MC')
then 'XX' else so.tec_criteria end) = 'OF'
then sum(so.revised_qty_due)
end) CD_OF_VOLUME
from shop_order so
left join scm_prodtyp sp
on so.prodtyp = sp.prodtyp
where so.order_type = 'MD'
and so.plant = 'W'
and so.status_code between '4' and '8'
and trunc(so.revised_due_date) <= trunc(sysdate)+30
group by trunc(so.revised_due_date), so.tec_criteria, sp.pr_typ
order by trunc(so.revised_due_date)
The problem is that when a date has no production planned, that date doesn't appear on the report. Is there a way of filling in the missing dates?
For example, the current report shows the following:
DUE_DATE CD_OF_VOLUME
14/04/2015 35,267.00
15/04/2015 71,744.00
16/04/2015 20,268.00
17/04/2015 35,156.00
18/04/2015 74,395.00
19/04/2015 3,636.00
21/04/2015 5,522.00
22/04/2015 15,502.00
04/05/2015 10,082.00
Note: missing dates (20/04/2015, 23/04/2015 to 03/05/2015)
Range is always for a thirty day period from sysdate.
How do you fill in the missing dates?
Do you need some kind of calendar table?
Thanks
You can get the 30-day period from SYSDATE as follows (I assume you want to include SYSDATE?):
WITH mydates AS (
SELECT TRUNC(SYSDATE) - 1 + LEVEL AS due_date FROM dual
CONNECT BY LEVEL <= 31
)
Then use the above to do a LEFT JOIN with your query (perhaps not a bad idea to put your query in a CTE as well):
WITH mydates AS (
SELECT TRUNC(SYSDATE) - 1 + LEVEL AS due_date FROM dual
CONNECT BY LEVEL <= 31
), myorders AS (
select
(case when trunc(so.revised_due_date) <= trunc(sysdate)
then trunc(sysdate) else trunc(so.revised_due_date) end) due_date,
(case
when (case when sp.pr_typ in ('VV','VD') then 'DVD' when sp.pr_typ in ('RD','CD')
then 'CD' end) = 'CD'
and (case when so.tec_criteria in ('PI','MC')
then 'XX' else so.tec_criteria end) = 'OF'
then sum(so.revised_qty_due)
end) CD_OF_VOLUME
from shop_order so
left join scm_prodtyp sp
on so.prodtyp = sp.prodtyp
where so.order_type = 'MD'
and so.plant = 'W'
and so.status_code between '4' and '8'
and trunc(so.revised_due_date) <= trunc(sysdate)+30
group by trunc(so.revised_due_date), so.tec_criteria, sp.pr_typ
order by trunc(so.revised_due_date)
)
SELECT mydates.due_date, myorders.cd_of_volume
FROM mydates LEFT JOIN myorders
ON mydates.due_date = myorders.due_date;
If you want to show a zero on "missing" dates instead of a NULL, use COALESCE(myorders.cd_of_volume, 0) AS cd_of_volume above.
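The calendar-plus-LEFT-JOIN pattern above can be checked on a small scale. The sketch below runs on SQLite from Python, so a recursive CTE stands in for Oracle's CONNECT BY LEVEL, and a hypothetical orders table holds three of the rows from the sample report (with dates rewritten as ISO text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (due_date TEXT, cd_of_volume INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2015-04-19", 3636), ("2015-04-21", 5522), ("2015-04-22", 15502)],
)

sql = """
WITH RECURSIVE mydates(due_date) AS (   -- one row per day in the range
    SELECT '2015-04-19'
    UNION ALL
    SELECT date(due_date, '+1 day') FROM mydates WHERE due_date < '2015-04-23'
)
SELECT d.due_date, COALESCE(o.cd_of_volume, 0) AS cd_of_volume
FROM mydates AS d
LEFT JOIN orders AS o ON o.due_date = d.due_date
ORDER BY d.due_date
"""
filled = conn.execute(sql).fetchall()
for row in filled:
    print(row)
```

The calendar has to be the left-hand side of the join; with the orders on the left, days without production would still be dropped.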
What you can do is this:
Create a new table with all the days you need.
WITH DAYS AS
(SELECT TRUNC(SYSDATE) - ROWNUM DDD
FROM ALL_OBJECTS
WHERE ROWNUM < 365)
SELECT
DAYS.DDD
FROM
DAYS;
Then do a full outer join between those tables:
select DUE_DATE , CD_OF_VOLUME , DDD
from (
select
(case when trunc(so.revised_due_date) <= trunc(sysdate)
then trunc(sysdate) else trunc(so.revised_due_date) end) due_date,
(case
when (case when sp.pr_typ in ('VV','VD') then 'DVD' when sp.pr_typ in ('RD','CD')
then 'CD' end) = 'CD'
and (case when so.tec_criteria in ('PI','MC')
then 'XX' else so.tec_criteria end) = 'OF'
then sum(so.revised_qty_due)
end) CD_OF_VOLUME
from shop_order so
left join scm_prodtyp sp
on so.prodtyp = sp.prodtyp
where so.order_type = 'MD'
and so.plant = 'W'
and so.status_code between '4' and '8'
and trunc(so.revised_due_date) <= trunc(sysdate)+30
group by trunc(so.revised_due_date), so.tec_criteria, sp.pr_typ
order by trunc(so.revised_due_date)
) full outer join NEW_TABLE new on ( new .DDD = DUE_DATE )
where new .DDD between /* */ AND /* */ /* pick your own limit) */
You can get the gaps by using CONNECT BY and a left join.
Assuming your schema is:
create table tbl(DUE_DATE date, CD_OF_VOLUME float);
insert into tbl values(to_date('14/04/2015','DD/MM/YYYY'),35267.00);
insert into tbl values(to_date('15/04/2015','DD/MM/YYYY'),71744.00);
insert into tbl values(to_date('16/04/2015','DD/MM/YYYY'),20268.00);
insert into tbl values(to_date('17/04/2015','DD/MM/YYYY'),35156.00);
insert into tbl values(to_date('18/04/2015','DD/MM/YYYY'),74395.00);
insert into tbl values(to_date('19/04/2015','DD/MM/YYYY'),3636.00);
insert into tbl values(to_date('21/04/2015','DD/MM/YYYY'),5522.00);
insert into tbl values(to_date('22/04/2015','DD/MM/YYYY'),15502.00);
insert into tbl values(to_date('04/05/2015','DD/MM/YYYY'),10082.00);
you can say:
with cte as
(
select (select min(DUE_DATE)-1 from tbl)+ level as dt
from dual
connect by level <= (select max(DUE_DATE)-min(DUE_DATE) from tbl)
)
select to_char(c.dt,'DD/MM/YYYY') gap,null volume
from cte c
left join tbl t on c.dt=t.DUE_DATE
where t.DUE_DATE is null
order by c.dt
Result:
GAP VOLUME
20/04/2015 (null)
23/04/2015 (null)
24/04/2015 (null)
25/04/2015 (null)
26/04/2015 (null)
27/04/2015 (null)
28/04/2015 (null)
29/04/2015 (null)
30/04/2015 (null)
01/05/2015 (null)
02/05/2015 (null)
03/05/2015 (null)
Note: you can apply this to your original query. The simplest way is to wrap your query as a subquery and use it in place of tbl in the snippet above.

counting events over flexible ranges

I am trying to count events (which are rows in the event_table) in the year before and the year after a particular target date for each person. For example, say I have a person 100 and target date is 10/01/2012. I would like to count events in 9/30/2011-9/30/2012 and in 10/02/2012-9/30/2013.
My query looks like:
select *
from (
select id, target_date
from subsample_table
) as i
left join (
select id, event_date, count(*) as N
, case when event_date between target_date-365 and target_date-1 then 0
when event_date between target_date+1 and target_date+365 then 1
else 2 end as after
from event_table
group by id, target_date, period
) as h
on i.id = h.id
and i.target_date = h.event_date
The output should look something like:
id target_date after N
100 10/01/2012 0 1000
100 10/01/2012 1 0
It's possible that some people do not have any events in the before or after periods (or both), and it would be nice to have zeros in that case. I don't care about the events outside the 730 days.
Any suggestions would be greatly appreciated.
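I think the following may be close to what you are trying to accomplish. It joins on id only, keeps events inside the 730-day window, and counts each period with a conditional sum, so people with no events still get a row with zeros.
select i.id
, i.target_date
, count(h.event_date) as N -- all events in the window, including any on the target date itself
, SUM(case when h.event_date between i.target_date-365 and i.target_date-1
then 1
else 0
end) AS Prior_
, SUM(case when h.event_date between i.target_date+1 and i.target_date+365
then 1
else 0
end) as After_
from subsample_table i
left join
event_table h
on i.id = h.id
and h.event_date between i.target_date-365 and i.target_date+365
group by i.id, i.target_date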
This is a generic answer. I don't know what date functions Teradata has, so I will use SQL Server syntax.
select id, target_date, sum(before) before, sum(after) after, sum(righton) righton
from yourtable t
join (
select id, target_date td
, case when yourdate >= dateadd(year, -1, target_date)
and yourdate < target_date then 1 else 0 end before
, case when yourdate <= dateadd(year, 1, target_date)
and yourdate > target_date then 1 else 0 end after
, case when yourdate = target_date then 1 else 0 end righton
from yourtable
where whatever
group by id, target_date) sq on t.id = sq.id and target_date = td
where whatever
group by id, target_date
This answer assumes that an id can have more than one target date.
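As a way to test the before/after counting logic on a small scale, here is a sketch on SQLite from Python. The table names subsample and events and all rows are hypothetical, dates are stored as ISO text, and SQLite's date() modifiers stand in for Teradata or SQL Server date arithmetic. Person 200 has no events, which shows the zero rows produced by the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subsample (id INTEGER, target_date TEXT);
    CREATE TABLE events    (id INTEGER, event_date TEXT);
    INSERT INTO subsample VALUES (100, '2012-10-01'), (200, '2012-10-01');
    INSERT INTO events VALUES
        (100, '2012-03-15'),   -- year before
        (100, '2012-09-30'),   -- year before
        (100, '2012-10-01'),   -- on the target date: counted in neither period
        (100, '2013-01-10'),   -- year after
        (100, '2014-06-01');   -- outside the 730-day window
""")

sql = """
SELECT s.id, s.target_date,
       SUM(CASE WHEN e.event_date >= date(s.target_date, '-365 days')
                 AND e.event_date <  s.target_date
                THEN 1 ELSE 0 END) AS before_n,
       SUM(CASE WHEN e.event_date >  s.target_date
                 AND e.event_date <= date(s.target_date, '+365 days')
                THEN 1 ELSE 0 END) AS after_n
FROM subsample AS s
LEFT JOIN events AS e ON e.id = s.id
GROUP BY s.id, s.target_date
ORDER BY s.id
"""
counts = conn.execute(sql).fetchall()
for row in counts:
    print(row)
```

This returns one row per person with two count columns. If the long format from the question is needed (one row per period), the two columns can be unpivoted afterwards.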