Different where condition for each column - sql
Is there a way to write query like this in SQL Server, without using select two times and then join?
select trans_date, datepart(HOUR,trans_time) as hour,
(datepart(MINUTE,trans_time)/30)*30 as minute,
case
when paper_number = 11111/*paperA*/
then sum(t1.price*t1.amount)/SUM(t1.amount)*100
end as avgA,
case
when paper_number = 22222/*PaperB*/
then sum(t1.price*t1.amount)/SUM(t1.amount)*100
end as avgB
from dbo.transactions t1
where trans_date = '2006-01-01' and (paper_number = 11111 or paper_number = 22222)
group by trans_date, datepart(HOUR,trans_time), datepart(MINUTE,trans_time)/30
order by hour, minute
SQL Server asks me to add paper_number to the GROUP BY, and returns NULLs when I do so:
trans_date hour minute avgA avgB
2006-01-01 9 30 1802.57199725463 NULL
2006-01-01 9 30 NULL 169125.886524823
2006-01-01 10 0 1804.04742534103 NULL
2006-01-01 10 0 NULL 169077.777777778
2006-01-01 10 30 1806.18773535637 NULL
2006-01-01 10 30 NULL 170274.550381867
2006-01-01 11 0 1804.43466045433 NULL
2006-01-01 11 0 NULL 170743.4
2006-01-01 11 30 1807.04532012137 NULL
2006-01-01 11 30 NULL 171307.00280112
Try:
with cte as
(select trans_date,
datepart(HOUR,trans_time) as hour,
(datepart(MINUTE,trans_time)/30)*30 as minute,
sum(case when paper_number = 11111/*paperA*/
then t1.price*t1.amount else 0 end) as wtdSumA,
sum(case when paper_number = 11111/*paperA*/
then t1.amount else 0 end) as amtSumA,
sum(case when paper_number = 22222/*PaperB*/
then t1.price*t1.amount else 0 end) as wtdSumB,
sum(case when paper_number = 22222/*PaperB*/
then t1.amount else 0 end) as amtSumB
from dbo.transactions t1
where trans_date = '2006-01-01'
group by trans_date, datepart(HOUR,trans_time), datepart(MINUTE,trans_time)/30)
select trans_date, hour, minute,
case amtSumA when 0 then 0 else 100 * wtdSumA / amtSumA end as avgA,
case amtSumB when 0 then 0 else 100 * wtdSumB / amtSumB end as avgB
from cte
order by hour, minute
You can derive this without the CTE, like so:
select trans_date,
datepart(HOUR,trans_time) as hour,
(datepart(MINUTE,trans_time)/30)*30 as minute,
case sum(case when paper_number = 11111/*paperA*/ then t1.amount else 0 end)
when 0 then 0
else 100 * sum(case when paper_number = 11111 then t1.price*t1.amount else 0 end)
/ sum(case when paper_number = 11111 then t1.amount else 0 end) end as avgA,
case sum(case when paper_number = 22222/*paperB*/ then t1.amount else 0 end)
when 0 then 0
else 100 * sum(case when paper_number = 22222 then t1.price*t1.amount else 0 end)
/ sum(case when paper_number = 22222 then t1.amount else 0 end) end as avgB
from dbo.transactions t1
where trans_date = '2006-01-01'
group by trans_date, datepart(HOUR,trans_time), datepart(MINUTE,trans_time)/30
order by 1,2,3
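The idea behind both queries above, one accumulator per paper inside each half-hour group instead of one group per paper, can be sketched outside SQL. The Python below is only an illustration: the sample rows are invented, `weighted_averages` is a hypothetical helper, and prices are integers to keep the arithmetic exact.

```python
from collections import defaultdict
from datetime import time

# Invented sample rows: (trans_time, paper_number, price, amount)
rows = [
    (time(9, 35), 11111, 18, 10),
    (time(9, 50), 11111, 20, 30),
    (time(9, 40), 22222, 1690, 5),
    (time(10, 5), 11111, 18, 20),
]

def weighted_averages(rows, papers):
    # One entry per half-hour bucket; each holds [sum(price*amount), sum(amount)] per paper,
    # which mirrors sum(case when paper_number = ... then ... else 0 end).
    buckets = defaultdict(lambda: {p: [0, 0] for p in papers})
    for t, paper, price, amount in rows:
        if paper not in papers:
            continue
        key = (t.hour, t.minute // 30 * 30)
        acc = buckets[key][paper]
        acc[0] += price * amount
        acc[1] += amount
    # Divide only after aggregating, returning 0 when a paper had no volume in the bucket.
    return {
        key: {p: (100 * s / a if a else 0) for p, (s, a) in buckets[key].items()}
        for key in sorted(buckets)
    }

averages = weighted_averages(rows, (11111, 22222))
```

Each bucket yields a single entry holding both averages, which is the one-row-per-half-hour shape the question asks for.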
Use the SUM() function on the entire CASE expression:
select trans_date, datepart(HOUR,trans_time) as hour, (datepart(MINUTE,trans_time)/30)*30 as minute,
sum(case when paper_number = 11111/*paperA*/ then t1.price*t1.amount end) * 1.00
/ sum(case when paper_number = 11111/*paperA*/ then t1.amount end) * 100 as avgA,
sum(case when paper_number = 22222/*PaperB*/ then t1.price*t1.amount end) * 1.00
/ sum(case when paper_number = 22222/*paperB*/ then t1.amount end) * 100 as avgB
from dbo.transactions t1
where trans_date = '2006-01-01'
group by trans_date, datepart(HOUR,trans_time), datepart(MINUTE,trans_time)/30
order by hour, minute
You could also try using UNPIVOT and PIVOT like below:
WITH prepared AS (
SELECT
trans_date,
trans_time = DATEADD(MINUTE, DATEDIFF(MINUTE, '00:00', trans_time) / 30 * 30, CAST('00:00' AS time)),
paper_number,
total = price * amount,
amount
FROM transactions
),
unpivoted AS (
SELECT
trans_date,
trans_time,
attribute = attribute + CAST(paper_number AS varchar(10)),
value
FROM prepared
UNPIVOT (value FOR attribute IN (total, amount)) u
),
pivoted AS (
SELECT
trans_date,
trans_time,
avgA = total11111 * 100 / amount11111,
avgB = total22222 * 100 / amount22222
FROM unpivoted
PIVOT (
SUM(value) FOR attribute IN (total11111, amount11111, total22222, amount22222)
) p
)
SELECT *
FROM pivoted
;
To explain how the above query works, here is a description of the transformations that the original dataset undergoes during the query's execution, using the following example:
trans_date trans_time paper_number price amount
---------- ---------- ------------ ----- ------
2013-04-09 11:12:35 11111 10 15
2013-04-09 11:13:01 22222 24 10
2013-04-09 11:28:44 11111 12 5
2013-04-09 11:36:20 22222 20 11
The prepared CTE produces the following column set:
trans_date trans_time paper_number total amount
---------- ---------- ------------ ----- ------
2013-04-09 11:00:00 11111 150 15
2013-04-09 11:00:00 22222 240 10
2013-04-09 11:00:00 11111 60 5
2013-04-09 11:30:00 22222 220 11
where trans_time is the original trans_time rounded down to the nearest half-hour and total is price multiplied by amount.
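The rounding in the prepared CTE relies on integer division: the number of minutes since midnight is divided by 30 (dropping the remainder) and multiplied by 30 again. A minimal Python sketch of the same arithmetic, where `floor_to_half_hour` is a hypothetical name used only for illustration:

```python
from datetime import time

def floor_to_half_hour(t: time) -> time:
    # Whole minutes since midnight, like DATEDIFF(MINUTE, '00:00', trans_time)
    minutes = t.hour * 60 + t.minute
    # Integer division drops the remainder; multiplying back gives the bucket start
    floored = minutes // 30 * 30
    return time(floored // 60, floored % 60)
```

For the sample rows, 11:12:35 and 11:28:44 both map to 11:00:00, while 11:36:20 maps to 11:30:00.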
The unpivoted CTE unpivots the total and amount values to produce attribute and value:
trans_date trans_time paper_number attribute value
---------- ---------- ------------ --------- -----
2013-04-09 11:00:00 11111 total 150
2013-04-09 11:00:00 11111 amount 15
2013-04-09 11:00:00 22222 total 240
2013-04-09 11:00:00 22222 amount 10
2013-04-09 11:00:00 11111 total 60
2013-04-09 11:00:00 11111 amount 5
2013-04-09 11:30:00 22222 total 220
2013-04-09 11:30:00 22222 amount 11
Then paper_number is combined with attribute to form a single column, also called attribute:
trans_date trans_time attribute value
---------- ---------- ----------- -----
2013-04-09 11:00:00 total11111 150
2013-04-09 11:00:00 amount11111 15
2013-04-09 11:00:00 total22222 240
2013-04-09 11:00:00 amount22222 10
2013-04-09 11:00:00 total11111 60
2013-04-09 11:00:00 amount11111 5
2013-04-09 11:30:00 total22222 220
2013-04-09 11:30:00 amount22222 11
Finally, the pivoted CTE pivots the value data back, aggregating it along the way with SUM() and using the attribute values as column names:
trans_date trans_time total11111 amount11111 total22222 amount22222
---------- ---------- ---------- ----------- ---------- -----------
2013-04-09 11:00:00 210 20 240 10
2013-04-09 11:30:00 NULL NULL 220 11
The pivoted values are then additionally processed (every totalNNN is multiplied by 100 and divided by the corresponding amountNNN) to form the final output:
trans_date trans_time avgA avgB
---------- ---------- ---- ----
2013-04-09 11:00:00 1050 2400
2013-04-09 11:30:00 NULL 2000
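The numbers in the walkthrough can be checked by replaying the same steps in a few lines of Python. This is a sketch that starts from the prepared rows of the example (with price and amount still separate); the variable and function names are chosen for illustration and are not part of the query.

```python
from collections import defaultdict

# Sample rows after half-hour rounding: (trans_time, paper_number, price, amount)
prepared = [
    ("11:00:00", 11111, 10, 15),
    ("11:00:00", 22222, 24, 10),
    ("11:00:00", 11111, 12, 5),
    ("11:30:00", 22222, 20, 11),
]

# UNPIVOT step: each row becomes two (attribute, value) rows,
# with the paper number appended to the attribute name.
unpivoted = []
for trans_time, paper, price, amount in prepared:
    unpivoted.append((trans_time, f"total{paper}", price * amount))
    unpivoted.append((trans_time, f"amount{paper}", amount))

# PIVOT step: SUM(value) per (trans_time, attribute); combinations that
# never occur simply stay absent, which plays the role of NULL.
pivoted = defaultdict(dict)
for trans_time, attribute, value in unpivoted:
    pivoted[trans_time][attribute] = pivoted[trans_time].get(attribute, 0) + value

def average(row, paper):
    total, amount = row.get(f"total{paper}"), row.get(f"amount{paper}")
    if total is None or amount is None:
        return None  # NULL in, NULL out
    # Integer division, matching integer columns in SQL Server for non-negative values
    return total * 100 // amount

final = {t: (average(r, 11111), average(r, 22222)) for t, r in sorted(pivoted.items())}
```

Here `final` holds (1050, 2400) for 11:00:00 and (None, 2000) for 11:30:00, matching the output table.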
There are a couple of issues that may need to be addressed:
If price and amount are different data types, total and amount may end up as different data types as well. UNPIVOT requires the values being unpivoted to be of exactly the same type, so you'll need to add an explicit conversion of total and amount to some common type, preferably one that prevents data or precision loss. That could be done in the prepared CTE like this (assuming the common type to be decimal(10,2)):
total = CAST(price * amount AS decimal(10,2)),
amount = CAST(amount AS decimal(10,2))
If the aggregated amounts may ever end up 0, you'll need to account for division by 0. One way to do that is to replace a 0 amount with NULL using NULLIF, which makes the result of the division NULL as well. Applying ISNULL or COALESCE to that result then lets you turn it into some default value, 0 for instance. So, change the avgA and avgB expressions in the pivoted CTE to:
avgA = ISNULL(total11111 * 100 / NULLIF(amount11111, 0), 0),
avgB = ISNULL(total22222 * 100 / NULLIF(amount22222, 0), 0)
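To make the NULLIF/ISNULL combination concrete, here is a small Python sketch of the same guard. The helpers `nullif`, `isnull` and `safe_average` are illustrative stand-ins for the SQL functions, with None playing the role of NULL.

```python
def nullif(value, unwanted):
    # NULLIF(a, b): NULL when a equals b, otherwise a
    return None if value == unwanted else value

def isnull(value, default):
    # ISNULL(a, b): b when a is NULL, otherwise a
    return default if value is None else value

def safe_average(total, amount):
    divisor = nullif(amount, 0)
    # Arithmetic involving NULL yields NULL in SQL; emulate that explicitly
    quotient = None if total is None or divisor is None else total * 100 / divisor
    return isnull(quotient, 0)
```

Both a zero amount and a missing (NULL) amount end up as the default 0 instead of raising a division error.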
Related
Calculate how long a process took, taking into account opening hours
I have two tables. The opening hours table gives, for each seller and store, the opening and closing times for each day of the week. The second table is the operations table, which holds all the information about the processes.
What I need is to calculate how many seconds each process took, counting only the hours when the store was open.
I tried to solve this with CASE WHEN. It works when the process takes less than 2 days, but I don't know how to handle processes that take longer. The other problem with this code is that the CASE WHEN takes a long time to run.
Can anybody help me with these issues?
Opening hours table:
sellerid sellerstoreid day dayweek opening closing next_day opening_next_day days_to_next
123 abc 1 monday 09:00:00 17:00:00 2 09:00:00 1
123 abc 2 tuesday 09:00:00 17:00:00 4 09:00:00 2
123 abc 4 thursday 09:00:00 17:00:00 5 09:30:00 1
123 abc 5 friday 09:30:00 17:00:00 1 09:00:00 3
Where:
sellerid + sellerstoreid + day works as a primary key;
dayweek translates day from number to name;
opening and closing are the opening and closing times for that day;
opening_next_day shows the opening time of the next available date for that store and seller;
days_to_next indicates in how many days the store will reopen.
Process table:
delivery_id sellerid sellerstoreid process end_time
a1 123 abc p1 05/12/2022 16:00:00.000
a1 123 abc p2 06/12/2022 16:00:00.000
a1 123 abc p3 06/12/2022 16:00:00.000
a1 123 abc p4 08/12/2022 16:00:00.000
a1 123 abc p5 13/12/2022 16:00:00.000
Where:
The end_time of the previous process is the start time of the current process.
with table_1 as (
select delivery_id
, sellerid
, sellerstoreid
, process
, lag(end_time, 1) over (partition by delivery_id order by end_time) as start_time
, extract(dow from lag(end_time, 1) over (partition by delivery_id order by end_time)) as dow_start_time
, end_time
, extract(dow from end_time) as dow_end_time
from process_table
),
table_2 as (
select tb1.*
, oh_start.opening as start_opening
, oh_start.closing as start_closing
, oh_end.opening as end_opening
, oh_end.closing as end_closing
from table_1 tb1
left join opening_hours oh_start
on oh_start.sellerid = tb1.sellerid
and oh_start.sellerstoreid = tb1.sellerstoreid
and oh_start.day = dow_start_time
left join opening_hours oh_end
on oh_end.sellerid = tb1.sellerid
and oh_end.sellerstoreid = tb1.sellerstoreid
and oh_end.day = dow_end_time
)
select *
, case
when dow_start_time = dow_end_time then
extract(epoch from (
case when end_time::time > start_opening
then (case when end_time::time > start_closing then start_closing else end_time::time end)
else start_opening end
- case when start_time::time > start_opening
then (case when start_time::time < start_closing then start_time::time else start_closing end)
else start_opening end))
when dow_start_time <> dow_end_time then
extract(epoch from
(start_closing
- case when start_time::time > start_opening
then (case when start_time::time < start_closing then start_time::time else start_closing end)
else start_opening end)
+ (case when end_time::time > end_opening
then (case when end_time::time > end_closing then end_closing else end_time::time end)
else end_opening end
- end_opening))
end status_duration
from table_2
SQL query to get top 24 records, then average the first 12 and bottom 12
I'm attempting to analyze each account's performance (A_Count & B_Count) during their first year versus their second year. This should only return clients who have at least 24 months of totals (records).
Volume table:
Account ReportDate A_Count B_Count
1001A 2019-01-01 47 100
1001A 2019-02-01 50 105
1002A 2019-02-01 50 105
I think I'm on the right track by wanting to grab the top 24 records for each account (only if 24 exist) and then grabbing the top 12 and bottom 12, but I'm not sure how to get there.
I guess the ideal output would be:
Account YR1_A_Avg YR1_B_Avg YR2_A_Avg YR2_B_Avg FirstDate LastDate
1001A 47 100 53 115 2019-01-01 2021-12-31
1002A 50 105 65 130 2019-02-01 2022-01-01
1003A 15 180 38 200 2017-05-01 2019-04-01
I'm not too worried about performance.
Assuming there are no gaps in ReportDate (per Account):
select Account
,avg(case when year_index = 1 then A_Count end) as YR1_A_Avg
,avg(case when year_index = 1 then B_Count end) as YR1_B_Avg
,avg(case when year_index = 2 then A_Count end) as YR2_A_Avg
,avg(case when year_index = 2 then B_Count end) as YR2_B_Avg
,min(ReportDate) as FirstDate
,max(ReportDate) as LastDate
from
(
select *
,count(*) over(partition by Account) as cnt
,(row_number() over(partition by Account order by ReportDate)-1)/12 +1 as year_index
from Volume
) t
where cnt >= 24 and year_index <= 2
group by Account
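The bucketing used in this answer (keep accounts with at least 24 rows, then treat rows 1 to 12 as year 1 and rows 13 to 24 as year 2) can be sketched in Python as follows. The data is made up, `yearly_averages` is a hypothetical helper, and dates are assumed to be ISO strings so that sorting them is chronological.

```python
from collections import defaultdict
from statistics import mean

def yearly_averages(records):
    """records: iterable of (account, report_date, a_count, b_count) tuples."""
    by_account = defaultdict(list)
    for account, report_date, a_count, b_count in records:
        by_account[account].append((report_date, a_count, b_count))
    result = {}
    for account, rows in by_account.items():
        if len(rows) < 24:                      # same role as cnt >= 24
            continue
        rows.sort()                             # order by ReportDate
        year1, year2 = rows[:12], rows[12:24]   # year_index 1 and 2
        result[account] = {
            "YR1_A_Avg": mean([r[1] for r in year1]),
            "YR1_B_Avg": mean([r[2] for r in year1]),
            "YR2_A_Avg": mean([r[1] for r in year2]),
            "YR2_B_Avg": mean([r[2] for r in year2]),
            "FirstDate": year1[0][0],
            "LastDate": year2[-1][0],
        }
    return result

# Made-up data: 1001A has 24 monthly rows, 1002A only 5
records = [("1001A", f"{2019 + i // 12}-{i % 12 + 1:02d}-01", 10 if i < 12 else 20, 100)
           for i in range(24)]
records += [("1002A", f"2019-{m:02d}-01", 50, 105) for m in range(1, 6)]
summary = yearly_averages(records)
```

Unlike AVG over integer columns in SQL Server, which truncates, `mean` here returns the exact average.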
Add a counting condition into dense_rank window Function SQL
I have a query that counts how many times you've visited and whether you have converted or not. What I'd like is for the dense_rank to restart the count whenever there has been a conversion:
SELECT uid,
channel,
time,
conversion,
dense_rank() OVER (PARTITION BY uid ORDER BY time asc) as visit_order
FROM table
In the current output, this customer (uid) had a conversion at visit 18, and I want the visit_order count from dense_rank to restart at 0 for the same customer until it hits the next non-null conversion.
See this (I do not like "try this"):
SELECT
id,
ts,
conversion,
-- SC,
ROW_NUMBER() OVER (PARTITION BY id,SC) R
FROM (
SELECT
id,
ts,
conversion,
-- COUNT(conversion) OVER (PARTITION BY id, conversion=0 ORDER BY ts ) CC,
SUM(CASE WHEN conversion=1 THEN 1000 ELSE 1 END) OVER (PARTITION BY id ORDER BY ts )
- SUM(CASE WHEN conversion=1 THEN 1000 ELSE 1 END) OVER (PARTITION BY id ORDER BY ts )%1000 SC
FROM sample
ORDER BY ts
) x
ORDER BY ts;
Output:
id ts conversion R
1 2022-01-15 10:00:00 0 1
1 2022-01-16 10:00:00 0 2
1 2022-01-17 10:00:00 0 3
1 2022-01-18 10:00:00 1 1
1 2022-01-19 10:00:00 0 2
1 2022-01-20 10:00:00 0 3
1 2022-01-21 10:00:00 0 4
1 2022-01-22 10:00:00 0 5
1 2022-01-23 10:00:00 0 6
1 2022-01-24 10:00:00 0 7
1 2022-01-25 10:00:00 1 1
1 2022-01-26 10:00:00 0 2
1 2022-01-27 10:00:00 0 3
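The result shown here, a per-user counter that restarts at 1 on every conversion row, can be sketched outside SQL. The Python below illustrates the intended output rather than translating the query: `visit_order` is a hypothetical helper, and it assumes the rows for one user are already sorted by time and that conversion is 0 or 1.

```python
def visit_order(conversions):
    """conversions: per-visit flags (0 or 1) for one user, sorted by time."""
    order, counter = [], 0
    for converted in conversions:
        if converted:
            counter = 1      # a conversion row starts a new run
        else:
            counter += 1     # otherwise keep counting within the current run
        order.append(counter)
    return order

# Conversion flags taken from the sample output (rows 4 and 11 are conversions)
flags = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
ranks = visit_order(flags)
```

For these flags, `ranks` reproduces the R column: 1, 2, 3, then 1 through 7, then 1, 2, 3.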
MsSql Compare specific datetimes in sequence based on ID
I have a table where we store our data from a call, and it looks like this:
CallID Arrive_Seq DateTime ActivitytypeID
1 1 2018-01-01 05:00:00 1
1 2 2018-01-01 05:00:01 2
1 3 2018-01-01 06:00:00 21
1 4 2018-01-01 06:00:01 28
1 5 2018-01-01 06:00:02 13
1 6 2018-01-01 06:00:03 22
1 7 2018-01-01 06:00:05 29
1 8 2018-01-01 06:05:00 21
1 9 2018-01-01 06:05:01 28
1 10 2018-01-01 06:05:02 13
1 11 2018-01-01 06:05:03 22
1 12 2018-01-01 06:07:45 29
Now I want to select the datediff between ActivitytypeID 21 and 29 in arrive_seq order. In this example they occur twice (on arrive_seq 3, 8 and 7, 12). The order is not fixed, and an ActivitytypeID can occur more or fewer times in the sequence, but they are always connected with each other. Think of it as ActivitytypeID 21 = 'Call started' and ActivitytypeID 29 = 'Call ended'.
In the example the answer would be:
SELECT DATEDIFF (SECOND, '2018-01-01 06:00:00', '2018-01-01 06:00:05') = 5 -- Compares datetime of arrive_seq 3 and 7
AND
SELECT DATEDIFF (SECOND, '2018-01-01 06:00:05', '2018-01-01 06:07:45') = 460 -- Compares datetime of arrive_seq 21 and 29
Total duration = 465
I have tried this code, but it doesn't always work because the row number can change based on arrive_seq and ActivitytypeID:
;WITH CallbackDuration AS (
SELECT ROW_NUMBER() OVER(ORDER BY a.time_stamp ASC) AS RowNumber,
DATEDIFF(second, a.time_stamp, b.time_stamp) AS 'Duration'
FROM Table a
JOIN Table b on a.call_id = b.call_id
WHERE a.call_id = 1
AND a.activity_type = 21
AND b.activity_type = 29
GROUP BY a.time_stamp, b.time_stamp,a.call_id)
SELECT SUM(Duration) AS 'Duration'
FROM CallbackDuration
WHERE RowNumber in (1,5,9)
I think this is what you want:
select
call_start, call_end,
datediff (second, call_start, call_end) as duration
from (
select
call_timestamp as call_end,
lag(call_timestamp) over (partition by call_id order by call_timestamp) as call_start,
activity_type as call_end_activity,
lag (activity_type) over (partition by call_id order by call_timestamp) as call_start_activity
from call_log
where activity_type in (21, 29)
) x
where call_start_activity = 21;
Result:
call_start call_end duration
----------------------- ----------------------- -----------
2018-01-01 06:00:00.000 2018-01-01 06:00:05.000 5
2018-01-01 06:05:00.000 2018-01-01 06:07:45.000 165
(2 rows affected)
Note that the duration of the second call is based on your sample data, where it starts at 2018-01-01 06:05:00.
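The LAG-based pairing can be pictured as walking through the start and end events in time order and comparing each event with the one before it. A minimal Python sketch under that reading, where `call_durations` is a hypothetical helper and the events are the type 21 and 29 rows from the question's sample data:

```python
from datetime import datetime

def call_durations(events, start_type=21, end_type=29):
    """events: (timestamp, activity_type) tuples for one call, sorted by timestamp."""
    # Keep only start/end events, like WHERE activity_type IN (21, 29)
    relevant = [e for e in events if e[1] in (start_type, end_type)]
    durations = []
    # zip pairs each event with its predecessor, which is what LAG provides
    for previous, current in zip(relevant, relevant[1:]):
        if previous[1] == start_type:   # WHERE call_start_activity = 21
            durations.append(int((current[0] - previous[0]).total_seconds()))
    return durations

events = [
    (datetime(2018, 1, 1, 6, 0, 0), 21),
    (datetime(2018, 1, 1, 6, 0, 1), 28),
    (datetime(2018, 1, 1, 6, 0, 5), 29),
    (datetime(2018, 1, 1, 6, 5, 0), 21),
    (datetime(2018, 1, 1, 6, 7, 45), 29),
]
durations = call_durations(events)
```

With these events the durations are 5 and 165 seconds, the same values as in the result table.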
This query seems to return your expected result:
declare @x int = 21
declare @y int = 29
;with cte(CallID, Arrive_Seq, DateTime, ActivitytypeID) as
(
select
a, b, cast(c as datetime), d
from (values
(1,1,'2018-01-01 05:00:00',1)
,(1,2,'2018-01-01 05:00:01',2)
,(1,3,'2018-01-01 06:00:00',21)
,(1,4,'2018-01-01 06:00:01',28)
,(1,5,'2018-01-01 06:00:02',13)
,(1,6,'2018-01-01 06:00:03',22)
,(1,7,'2018-01-01 06:00:05',29)
,(1,8,'2018-01-01 06:05:00',21)
,(1,9,'2018-01-01 06:05:01',28)
,(1,10,'2018-01-01 06:05:02',13)
,(1,11,'2018-01-01 06:05:03',22)
,(1,12,'2018-01-01 06:07:45',29)
) t(a,b,c,d)
)
select
sum(ss)
from
(
select
*, ss = datediff(ss, DateTime, lead(datetime) over (order by Arrive_Seq))
, rn = row_number() over (order by Arrive_Seq)
from
cte
where
ActivitytypeID in (@x, @y)
) t
where
rn % 2 = 1
How to calculate time duration using Microsoft SQL?
I want to find the time duration for each person from one start time. I want to calculate the time duration from one start time for each day and multiple end times for multiple users.
This is my code:
SELECT *, CAST(DATEDIFF(n, CAST(End_Time AS datetime), CAST(Start_Time AS datetime)) AS FLOAT) / 60 AS Time_Duration
FROM (
SELECT NAME,
MAX(CASE WHEN DESCRIPTION = 'Green' THEN Final_Value END) AS Start_Time,
MAX(CASE WHEN DESCRIPTION = 'Red' THEN Final_Value END) AS End_Time
FROM mydata
WHERE NAME != 'NA'
GROUP BY NAME
) C
I am not able to get any results for the time duration. This is what my output looks like:
Name Start_time End_time Time_Duration
1 Day_1 5/6/15 2:30
2 John 5/6/15 3:30
3 Ben 5/6/15 4:30
4 Mike 5/6/15 5:30
5 Day_2 5/7/15 2:30
6 John_2 5/7/15 4:30
7 Ben_2 5/7/15 5:30
8 Mike_2 5/7/15 6:30
I want it to look like this:
Name Start_time End_time Time_Duration
1 Day_1 5/6/15 2:30
2 John 5/6/15 3:30 1.00
3 Ben 5/6/15 4:30 2.00
4 Mike 5/6/15 5:30 3.00
5 Day_2 5/7/15 2:30
6 John_2 5/7/15 4:30 2.00
7 Ben_2 5/7/15 5:30 3.00
8 Mike_2 5/7/15 6:30 4.00
Assuming that the values in the name column have the day number as a suffix (and none for day 1):
WITH td AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY [day] ORDER BY final_value) rnum
FROM (SELECT *,
CASE WHEN CHARINDEX('_', name) = 0 THEN '1'
ELSE SUBSTRING(name, CHARINDEX('_', name) + 1, LEN(name) - CHARINDEX('_', name)) END [day]
FROM t_dur
) tt
)
SELECT t1.name,
CASE WHEN rnum = 1 THEN t1.final_value END start_time,
CASE WHEN rnum <> 1 THEN t1.final_value END end_time,
CASE CAST(DATEDIFF(hour, (SELECT t2.final_value FROM td t2 WHERE t2.[day] = t1.[day] AND t2.rnum = 1), t1.final_value) AS DECIMAL(5,2))
WHEN 0 THEN NULL
ELSE CAST(DATEDIFF(hour, (SELECT t2.final_value FROM td t2 WHERE t2.[day] = t1.[day] AND t2.rnum = 1), t1.final_value) AS DECIMAL(5,2))
END time_duration
FROM td t1
Result:
name start_time end_time time_duration
Day_1 2015-05-06 02:30:00.000 NULL NULL
John NULL 2015-05-06 03:30:00.000 1.00
Ben NULL 2015-05-06 04:30:00.000 2.00
Mike NULL 2015-05-06 05:30:00.000 3.00
Day_2 2015-05-07 02:30:00.000 NULL NULL
John_2 NULL 2015-05-07 04:30:00.000 2.00
Ben_2 NULL 2015-05-07 05:30:00.000 3.00
Mike_2 NULL 2015-05-07 06:30:00.000 4.00