Further group by WEEK NUMBER in sub-query - sql

I am trying to subtract values from 2 columns and group them by week number.
The event code column has the values 3 and 4. I am trying to sum the duration for event code 4 and subtract the duration for event code 3. These values need to be derived for the last 12 weeks.
Here is what I have so far. I am stuck on further grouping by week number:
SELECT DISTINCT CUSTOMER_ID,
((SELECT SUM(DURATION_IN_SECONDS)/60 FROM TABLE1 ee WHERE ee.CUSTOMER_ID = e.CUSTOMER_ID AND EVENT_CODE IN (4))-
(SELECT SUM(DURATION_IN_SECONDS)/60 FROM TABLE1 ee WHERE ee.CUSTOMER_ID = e.CUSTOMER_ID AND EVENT_CODE IN (3))) AS UNPRODUCTIVE_MINUTES
FROM TABLE1 e
WHERE TIMEDATE >= TO_DATE('01-OCT-19','DD-MON-YY')
AND TIMEDATE <= TO_DATE('31-DEC-19','DD-MON-YY')
GROUP BY CUSTOMER_ID
The above query produces results like this:
CUSTOMER_ID UNPRODUCTIVE_MINUTES
A100 1601
But my result has to be like this:
CUSTOMER_ID WEEKNUMBER UNPRODUCTIVE_MINUTES
A100 12 171
A100 11 108
A100 10 112
A100 9 110
A100 8 98
A100 7 67
A100 6 117
A100 5 100
A100 4 111
A100 3 77
A100 2 73
A100 1 87

I am not sure how you want to calculate the week number, but I guess it is floor((timedate - start date) / 7) + 1, so I wrote the query accordingly.
Select customer_id,
Sum(case when EVENT_CODE = 4 then DURATION_IN_SECONDS else (-1 * DURATION_IN_SECONDS) end)/60 as dur,
Floor((Trunc(timedate) - TO_DATE('01-OCT-19','DD-MON-YY')) / 7) + 1 as weeknumber
From table1 e
Where TIMEDATE >= TO_DATE('01-OCT-19','DD-MON-YY')
AND TIMEDATE <= TO_DATE('31-DEC-19','DD-MON-YY')
AND EVENT_CODE in (3, 4)
GROUP BY CUSTOMER_ID, Floor((Trunc(timedate) - TO_DATE('01-OCT-19','DD-MON-YY')) / 7) + 1
Here, I have not tested for event_code 3 explicitly: the WHERE clause keeps only event codes 3 and 4, so the ELSE branch applies to event_code 3 alone, and its DURATION_IN_SECONDS is subtracted from the total for event_code 4.
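To check the grouping logic outside Oracle, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the days_since_start column are assumptions made for illustration (the column stands in for TRUNC(timedate) minus the start date); only the conditional SUM and the week expression mirror the query above.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, event_code, duration_in_seconds, days_since_start).
# days_since_start stands in for TRUNC(timedate) - start date in the Oracle query.
rows = [
    ("A100", 4, 600, 0),   # week 1
    ("A100", 3, 120, 3),   # week 1
    ("A100", 4, 900, 7),   # week 2
    ("A100", 3, 300, 13),  # week 2
    ("A100", 4, 60, 14),   # week 3
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (customer_id TEXT, event_code INTEGER, "
    "duration_in_seconds INTEGER, days_since_start INTEGER)"
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)

weekly = conn.execute(
    """
    SELECT customer_id,
           days_since_start / 7 + 1 AS weeknumber,  -- integer division acts as FLOOR here
           SUM(CASE WHEN event_code = 4 THEN duration_in_seconds
                    ELSE -duration_in_seconds END) / 60.0 AS unproductive_minutes
    FROM table1
    WHERE event_code IN (3, 4)
    GROUP BY customer_id, days_since_start / 7 + 1
    ORDER BY weeknumber
    """
).fetchall()

for row in weekly:
    print(row)
# ('A100', 1, 8.0)
# ('A100', 2, 10.0)
# ('A100', 3, 1.0)
```

Each week nets the event code 4 seconds against the event code 3 seconds before converting to minutes, so one pass over the table replaces the two correlated subqueries.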
Cheers!!

The TO_CHAR(TIMEDATE, 'ww') function can be used directly to determine the week number.
There is no need for correlated subqueries; conditional aggregation should be used instead.
Write your DATE literals as DATE 'yyyy-mm-dd', according to the ISO standard, as in the query below.
The BETWEEN operator is enough for inclusive date ranges, instead of inequalities.
Query:
SELECT CUSTOMER_ID,
TO_CHAR(TIMEDATE, 'ww') AS WEEK,
NVL(SUM(CASE
WHEN EVENT_CODE = 4 THEN
DURATION_IN_SECONDS / 60
END),
0) - NVL(SUM(CASE
WHEN EVENT_CODE = 3 THEN
DURATION_IN_SECONDS / 60
END),
0) AS UNPRODUCTIVE_MINUTES
FROM TABLE1 e
WHERE TIMEDATE BETWEEN DATE '2019-10-01' AND DATE '2019-12-31'
GROUP BY CUSTOMER_ID, TO_CHAR(TIMEDATE, 'ww')
ORDER BY CUSTOMER_ID, WEEK
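One thing worth checking before relying on 'ww': it is the week of the year (1-53, with days 1 to 7 of the year forming week 1), not a week counted from the start of your date range. The sketch below reproduces that rule in Python and compares it with a week number counted from 1 October 2019. The helper names oracle_ww and relative_week are made up for this illustration.

```python
from datetime import date

def oracle_ww(d: date) -> int:
    """Week of year as the 'WW' format defines it: days 1-7 of the year are week 1."""
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1) // 7 + 1

def relative_week(d: date, start: date) -> int:
    """Week number counted from an arbitrary start date (the start date is in week 1)."""
    return (d - start).days // 7 + 1

start = date(2019, 10, 1)
for d in (date(2019, 10, 1), date(2019, 10, 8), date(2019, 12, 31)):
    print(d, oracle_ww(d), relative_week(d, start))
# 2019-10-01 40 1
# 2019-10-08 41 2
# 2019-12-31 53 14
```

So for this date range the WEEK column would run from 40 to 53. If the weeks must be numbered from 1, subtracting the first week number (or using a date-difference expression) is one way to get there.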

Related

Snowflake SQL - Count Distinct Users within descending time interval

I want to count the number of distinct users over the last 60 days, then the number of distinct users over the last 59 days, and so on.
Ideally, the output would look like this (TARGET OUTPUT):
Day Distinct Users
60 200
59 200
58 188
57 185
56 180
[...] [...]
where 60 days has the maximum possible number of distinct users, 59 days has slightly fewer, and so on.
My query looks like this:
select
count(distinct (case when datediff(day,DATE,current_date) <= 60 then USER_ID end)) as day_60,
count(distinct (case when datediff(day,DATE,current_date) <= 59 then USER_ID end)) as day_59,
count(distinct (case when datediff(day,DATE,current_date) <= 58 then USER_ID end)) as day_58
FROM Table
The issue with my query is that it outputs the data in columns instead of rows (as shown below) and, most importantly, I have to write out this logic 60 times, once for each of the 60 days.
Current Output:
Day_60 Day_59 Day_58
209 207 207
Is it possible to write the SQL in a way that creates the target as shown initially above?
Using the data below in CTE format:
with data_cte(dates,userid) as
(select * from values
('2022-05-01'::date,'UID1'),
('2022-05-01'::date,'UID2'),
('2022-05-02'::date,'UID1'),
('2022-05-02'::date,'UID2'),
('2022-05-03'::date,'UID1'),
('2022-05-03'::date,'UID2'),
('2022-05-03'::date,'UID3'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID2'),
('2022-05-04'::date,'UID3'),
('2022-05-04'::date,'UID4'),
('2022-05-05'::date,'UID1'),
('2022-05-06'::date,'UID1'),
('2022-05-07'::date,'UID1'),
('2022-05-07'::date,'UID2'),
('2022-05-08'::date,'UID1')
)
Query to get all dates with their total and distinct user counts:
select dates,count(userid) cnt, count(distinct userid) cnt_d
from data_cte
group by dates;
DATES CNT CNT_D
2022-05-01 2 2
2022-05-02 2 2
2022-05-03 3 3
2022-05-04 5 4
2022-05-05 1 1
2022-05-06 1 1
2022-05-08 1 1
2022-05-07 2 2
Query to get the difference between each date and the current date:
select dates,datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates;
DATES DDIFF CNT CNT_D
2022-05-01 45 2 2
2022-05-02 44 2 2
2022-05-03 43 3 3
2022-05-04 42 5 4
2022-05-05 41 1 1
2022-05-06 40 1 1
2022-05-08 38 1 1
2022-05-07 39 2 2
To get only the records whose date difference is within a certain range, add a HAVING clause:
select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43;
DDIFF CNT CNT_D
43 3 3
42 5 4
41 1 1
39 2 2
38 1 1
40 1 1
If you need to prefix 'day_' to each date-difference count, you can wrap the previous query in an outer query that adds the prefix to the date-difference column, as follows.
I am using CTE syntax, but you may use a sub-query instead, given that you will select from a table:
,cte_1 as (
select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43)
select 'day_'||to_char(ddiff) days,
cnt,
cnt_d
from cte_1;
DAYS CNT CNT_D
day_43 3 3
day_42 5 4
day_41 1 1
day_39 2 2
day_38 1 1
day_40 1 1
I have updated the answer to get the distinct user count for each range of days.
A filter can be added to the final query to limit it to the number of days needed.
with data_cte(dates,userid) as
(select * from values
('2022-05-01'::date,'UID1'),
('2022-05-01'::date,'UID2'),
('2022-05-02'::date,'UID1'),
('2022-05-02'::date,'UID2'),
('2022-05-03'::date,'UID5'),
('2022-05-03'::date,'UID2'),
('2022-05-03'::date,'UID3'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID6'),
('2022-05-04'::date,'UID2'),
('2022-05-04'::date,'UID3'),
('2022-05-04'::date,'UID4'),
('2022-05-05'::date,'UID7'),
('2022-05-06'::date,'UID1'),
('2022-05-07'::date,'UID8'),
('2022-05-07'::date,'UID2'),
('2022-05-08'::date,'UID9')
),cte_1 as
(select datediff(day,dates,current_date()) ddiff,userid
from data_cte), cte_2 as
(select distinct ddiff from cte_1 )
select cte_2.ddiff,
(select count(distinct userid)
from cte_1 where cte_1.ddiff <= cte_2.ddiff) cnt
from cte_2
order by cte_2.ddiff desc
DDIFF CNT
47 9
46 9
45 9
44 8
43 5
42 4
41 3
40 1
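The correlated subquery above recounts the users for every window. As a cross-check of the logic, here is a rough Python sketch that gets the same kind of result in one pass by widening the window from the newest activity to the oldest. The events, the fixed today value and the function name distinct_users_by_window are all assumptions for illustration, not part of the Snowflake answer.

```python
from datetime import date

# Hypothetical events: (activity date, user id)
events = [
    (date(2022, 5, 1), "UID1"),
    (date(2022, 5, 1), "UID2"),
    (date(2022, 5, 3), "UID3"),
    (date(2022, 5, 4), "UID1"),
    (date(2022, 5, 5), "UID4"),
]
today = date(2022, 5, 6)  # fixed "current date" so the result is reproducible

def distinct_users_by_window(events, today):
    """Return [(days_back, distinct users seen within the last days_back days)]."""
    users_by_age = {}
    for day, user in events:
        users_by_age.setdefault((today - day).days, set()).add(user)
    seen = set()
    result = []
    for age in sorted(users_by_age):   # newest activity first
        seen |= users_by_age[age]      # widen the window to include this age
        result.append((age, len(seen)))
    return result[::-1]                # widest window first, like ORDER BY ddiff DESC

print(distinct_users_by_window(events, today))
# [(5, 4), (3, 3), (2, 2), (1, 1)]
```

As in the SQL, only day differences that actually occur in the data produce a row; days with no activity would need a generated series of day numbers to appear.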
You can unpivot your current output.
A sample:
select
*
from (
select
209 Day_60,
207 Day_59,
207 Day_58
)unpivot ( cnt for days in (Day_60,Day_59,Day_58));

3 or more consecutive entries in the last 15 days

I have the following data:
ID EMP_ID SALE_DATE
---------------------------------
1 777 5/28/2016
2 777 5/29/2016
3 777 5/30/2016
4 777 5/31/2016
5 888 5/26/2016
6 888 5/28/2016
7 888 5/29/2016
8 999 5/29/2016
9 999 5/30/2016
10 999 5/31/2016
I need to fetch the emp_id values that have 3 or more consecutive days of sales in the last 15 days.
Output should be:
777
999
Following is the query:
SELECT TRUNC (sale_date), emp_id
FROM table1
WHERE sale_date >= SYSDATE - 14
GROUP BY TRUNC (sale_date), emp_id
HAVING COUNT (*) >= 3
But this returns consecutive transactions in the last three days only.
Note: This is Oracle.
Assuming you have one row per day, you can use lead():
select distinct emp_id
from (select t1.*,
lead(sale_date, 1) over (partition by emp_id order by sale_date) as sd_1,
lead(sale_date, 2) over (partition by emp_id order by sale_date) as sd_2
from table1 t1
where sale_date >= trunc(sysdate) - 14
) t
where sd_1 = sale_date + 1 and
sd_2 = sale_date + 2;
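To sanity-check the idea against the sample data, here is a small Python sketch of the same test: a date starts a streak if the next two calendar days are also present for that employee. The function name employees_with_streak and its parameters are made up for this example, and it keeps a set of days per employee, so duplicate rows for one day would not break it.

```python
from datetime import date, timedelta

# Sample rows from the question: (emp_id, sale_date)
sales = [
    (777, date(2016, 5, 28)), (777, date(2016, 5, 29)),
    (777, date(2016, 5, 30)), (777, date(2016, 5, 31)),
    (888, date(2016, 5, 26)), (888, date(2016, 5, 28)), (888, date(2016, 5, 29)),
    (999, date(2016, 5, 29)), (999, date(2016, 5, 30)), (999, date(2016, 5, 31)),
]

def employees_with_streak(sales, today, window_days=15, streak=3):
    """Employees with `streak` consecutive sale days inside the last `window_days` days."""
    cutoff = today - timedelta(days=window_days - 1)
    days_by_emp = {}
    for emp_id, day in sales:
        if cutoff <= day <= today:
            days_by_emp.setdefault(emp_id, set()).add(day)
    result = []
    for emp_id, days in days_by_emp.items():
        # A streak starts at d when d+1 ... d+streak-1 are all present,
        # which is what comparing the two LEAD values checks.
        if any(all(d + timedelta(days=k) in days for k in range(1, streak)) for d in days):
            result.append(emp_id)
    return sorted(result)

print(employees_with_streak(sales, today=date(2016, 5, 31)))
# [777, 999]
```

Employee 888 is excluded because 26, 28 and 29 May contain a gap, which matches the expected output in the question.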

Sum Based on Date

I currently have code that is meant to sum every quantity by year. I wrote it expecting it to sum all the charges in 2016 and in 2017, but it isn't running correctly.
I added two different PARTITION BY clauses to test whether either would work, and neither does. When I take them out, the Annual column just shows me the quantity for that specific receipt.
Here is my current code:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY Date)
as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY ReceiptNumber)
as Annual2017
FROM Table1
GROUP BY ReceiptNumber, Quantity, Date
I would like my data to look like this
ReceiptNumber Quantity Date Annual2016 Annual2017
1 5 2016-01-05 17 13
2 11 2017-04-03 17 13
3 12 2016-11-11 17 13
4 2 2017-09-09 17 13
Here is a sample of some of the data I am pulling from:
ReceiptNumber Quantity Date
1 5 2016-01-05
2 11 2017-04-03
3 12 2016-11-11
4 2 2017-09-09
5 96 2015-07-08
6 15 2016-12-12
7 24 2016-04-19
8 31 2017-01-02
9 10 2017-04-04
10 18 2015-10-10
11 56 2017-06-02
Try something like this
Select
..
sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016,
sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
..
Where Date >= '2016-01-01' and Date < '2018-01-01'
If you want it printed only once at the top then you should run it in a separate query like:
SELECT YEAR(Date) y, sum(Quantity) s FROM Table1 GROUP BY YEAR(Date)
and then do the main query like this:
SELECT * FROM table1
Easy, peasey ... ;-)
Your original question could also be answered with:
SELECT *,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2016 ) Annual2016,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2017 ) Annual2017
FROM table1
You need conditional aggregation inside a window aggregate. Simply remove both PARTITION BY clauses, as you're already filtering the year in the CASE:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
FROM Table1
You probably don't need the final GROUP BY ReceiptNumber, Quantity, Date
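If you want to try the effect without your own database, here is a rough sketch using Python's built-in sqlite3 module. It swaps the window aggregate for a CROSS JOIN against a one-row conditional aggregate, which should give the same per-row totals as SUM(...) OVER () while avoiding any dependence on window-function support. The column names receipt_number, quantity and sale_date are stand-ins chosen for this example, and only the first five sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (receipt_number INTEGER, quantity INTEGER, sale_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 5, "2016-01-05"),
        (2, 11, "2017-04-03"),
        (3, 12, "2016-11-11"),
        (4, 2, "2017-09-09"),
        (5, 96, "2015-07-08"),
    ],
)

rows = conn.execute(
    """
    SELECT t.receipt_number, t.quantity, t.sale_date,
           totals.annual2016, totals.annual2017
    FROM table1 AS t
    CROSS JOIN (
        -- One row holding both yearly totals; ISO date strings compare correctly as text.
        SELECT SUM(CASE WHEN sale_date >= '2016-01-01' AND sale_date < '2017-01-01'
                        THEN quantity ELSE 0 END) AS annual2016,
               SUM(CASE WHEN sale_date >= '2017-01-01' AND sale_date < '2018-01-01'
                        THEN quantity ELSE 0 END) AS annual2017
        FROM table1
    ) AS totals
    ORDER BY t.receipt_number
    """
).fetchall()

for row in rows:
    print(row)
# (1, 5, '2016-01-05', 17, 13)
# (2, 11, '2017-04-03', 17, 13)
# (3, 12, '2016-11-11', 17, 13)
# (4, 2, '2017-09-09', 17, 13)
# (5, 96, '2015-07-08', 17, 13)
```

Every row carries the same two totals because the aggregate is computed over the whole table, not per date or per receipt, which is what dropping PARTITION BY achieves.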

SQL count number of users every 7 days

I am new to SQL and I need to find the count of users for every 7 days. I have a table with users for every single day, starting from April 2015 up until now:
...
2015-05-16 00:00
2015-05-16 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-18 00:00
2015-05-18 00:00
...
and I need to count the number of users every 7 days (weekly) so I have data weekly.
SELECT COUNT(user_id), Activity_Date FROM TABLE_NAME
I need output like this:
TotalUsers week1 week2 week3 ..........and so on
82 80 14 16
I am using DB Visualizer to query Oracle database.
You should try the following:
Select
sum(Week1) + sum(Week2) + sum(Week3) + sum(Week4) + sum(Week5) as Total,
sum(Week1) as Week1,
sum(Week2) as Week2,
sum(Week3) as Week3,
sum(Week4) as Week4,
sum(Week5) as Week5
From (
select
case when week = 1 then 1 else 0 end as Week1,
case when week = 2 then 1 else 0 end as Week2,
case when week = 3 then 1 else 0 end as Week3,
case when week = 4 then 1 else 0 end as Week4,
case when week = 5 then 1 else 0 end as Week5
from
(
Select
CEILING(datepart(dd,visitdate)/7+1) week,
user_id
from visitor
)T
)D
You need to add month & year in the result as well.
SELECT COUNT(user_id) FROM TABLE_NAME WHERE Activity_Date > TRUNC(SYSDATE) - 7;
That would get the number of users for the last 7 days.
This is my test table:
user_id act_date
1 01/04/2015
2 01/04/2015
3 04/04/2015
4 05/04/2015
..
This is my query:
select week_offset, count(*) nb from (
select trunc((act_date-to_date('01042015','DDMMYYYY'))/7) as week_offset from test_date)
group by week_offset
order by 1
and this is the output:
week_offset nb
0 6
1 3
4 5
5 7
6 3
7 1
18 1
week_offset is the number of weeks elapsed since 01/04/2015; from it we can also derive the first day of each week.
How do you define your weeks? Here's an approach for SQL Server that starts each seven-day block relative to the start of April. The expressions will vary according to your specific needs:
select
dateadd(
dd,
datediff(dd, cast('20150401' as date), Activity_Date) / 7 * 7,
cast('20150401' as date)
) as WeekStart,
count(*)
from T
group by datediff(dd, cast('20150401' as date), Activity_Date) / 7
Oracle:
select
trunc(Activity_date, 'DAY') as WeekStart,
count(*)
from T
group by trunc(Activity_date, 'DAY') /* D and DAY are the same thing */
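For comparison, the seven-day-block arithmetic from the SQL Server query can be sketched in Python as below. The activity dates are invented sample data and week_start is a helper name made up for this example; the anchor date is the 1 April 2015 used above.

```python
from collections import Counter
from datetime import date, timedelta

anchor = date(2015, 4, 1)  # start of the first seven-day block

# Hypothetical activity dates, one entry per user visit
activity_dates = [
    date(2015, 4, 1), date(2015, 4, 1), date(2015, 4, 4), date(2015, 4, 7),
    date(2015, 4, 8), date(2015, 4, 14), date(2015, 4, 15),
]

def week_start(day: date, anchor: date) -> date:
    """First day of the seven-day block containing `day`, counted from `anchor`."""
    offset_weeks = (day - anchor).days // 7  # whole weeks elapsed since the anchor
    return anchor + timedelta(days=7 * offset_weeks)

counts = Counter(week_start(d, anchor) for d in activity_dates)
for start, n in sorted(counts.items()):
    print(start, n)
# 2015-04-01 4
# 2015-04-08 2
# 2015-04-15 1
```

Blocks anchored this way start on whatever weekday the anchor falls on, whereas TRUNC(date, 'DAY') in Oracle snaps to the first day of the week for the session's territory, so the two approaches can put the same date in different buckets.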

Oracle select sum by time window

Let's assume that we have an Oracle table with the following format and data:
TIMESTAMP MESSAGENO ORGMESSAGE
------------------------- ---------------------- -------------------------------------
27.04.13 1 START PERIOD
27.04.13 3 10
27.04.13 4 5
28.04.13 5 6
28.04.13 3 20
29.04.13 4 25
29.04.13 5 26
30.04.13 2 END PERIOD
30.04.13 1 START PERIOD
01.05.13 3 10
02.05.13 4 15
02.05.13 5 16
03.05.13 3 30
03.05.13 4 35
04.05.13 5 36
05.05.13 2 END PERIOD
I want to select sum of all the ORGMESSAGE for all the period (window between START PERIOD and END PERIOD) grouped by MESSAGENO.
Example output would be:
PERIOD START PERIOD END MESSAGENO SUM
------------ ------------- -------- ----
27.04.13 30.04.13 3 25
27.04.13 30.04.13 4 30
27.04.13 30.04.13 5 32
30.04.13 05.05.13 3 45
30.04.13 05.05.13 4 50
30.04.13 05.05.13 5 52
I am guessing that an Oracle analytic function would be suitable, but I really don't know how or where to start.
Thanks in advance for any help.
If we assume that the period starts and ends match, then a simple way to find the matching messages is to count the preceding number of starts. This is a cumulative sum and it is easy in Oracle. The rest is just aggregation:
select min(timestamp) as periodstart, max(timestamp) as periodend, messageno, sum(to_number(orgmessage)) as period_sum
from (select om.*,
sum(case when messageno = 1 then 1 else 0 end) over (order by timestamp) as grp
from orgmessages om
) om
where messageno not in (1, 2)
group by grp, messageno;
Note that this method (as with the others) really wants the timestamp to be unique on each record. In the data presented, these solutions will work. But if you have multiple starts and ends on the same day, none of them will work assuming that timestamp only has the date.
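The "count the preceding starts" idea can be sketched in plain Python as follows. It assumes the rows arrive in exactly the order shown in the question, since the dates alone cannot order rows within a day, and that every START PERIOD has a matching END PERIOD. The function name sums_by_period is made up for this example.

```python
from collections import defaultdict

# Rows from the question, in the order listed: (timestamp, messageno, orgmessage)
rows = [
    ("27.04.13", 1, "START PERIOD"), ("27.04.13", 3, "10"), ("27.04.13", 4, "5"),
    ("28.04.13", 5, "6"), ("28.04.13", 3, "20"), ("29.04.13", 4, "25"),
    ("29.04.13", 5, "26"), ("30.04.13", 2, "END PERIOD"),
    ("30.04.13", 1, "START PERIOD"), ("01.05.13", 3, "10"), ("02.05.13", 4, "15"),
    ("02.05.13", 5, "16"), ("03.05.13", 3, "30"), ("03.05.13", 4, "35"),
    ("04.05.13", 5, "36"), ("05.05.13", 2, "END PERIOD"),
]

def sums_by_period(rows):
    """Sum ORGMESSAGE per (period, messageno); a period is the running count of starts."""
    period = 0
    bounds = {}                  # period number -> [start timestamp, end timestamp]
    totals = defaultdict(int)    # (period number, messageno) -> sum
    for timestamp, messageno, orgmessage in rows:
        if messageno == 1:       # START PERIOD opens a new group
            period += 1
            bounds[period] = [timestamp, None]
        elif messageno == 2:     # END PERIOD closes the current group
            bounds[period][1] = timestamp
        else:
            totals[(period, messageno)] += int(orgmessage)
    return [
        (bounds[p][0], bounds[p][1], m, total)
        for (p, m), total in sorted(totals.items())
    ]

for line in sums_by_period(rows):
    print(line)
# ('27.04.13', '30.04.13', 3, 30)
# ('27.04.13', '30.04.13', 4, 30)
# ('27.04.13', '30.04.13', 5, 32)
# ('30.04.13', '05.05.13', 3, 40)
# ('30.04.13', '05.05.13', 4, 50)
# ('30.04.13', '05.05.13', 5, 52)
```

With the rows exactly as listed, messageno 3 adds up to 30 and 40 in the two periods. Because the period boundaries are taken from the START and END rows themselves, the reported start and end do not depend on which data rows survive the filter.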
First find all period ends per period start. Then join with your table to group and sum.
select
dates.start_date,
dates.end_date,
messageno,
sum(to_number(orgmessage)) as period_sum
from mytable
join
(
select start_dates.timestmp as start_date, min(end_dates.timestmp) as end_date
from (select * from mytable where orgmessage = 'START PERIOD') start_dates
join (select * from mytable where orgmessage = 'END PERIOD') end_dates
on start_dates.timestmp < end_dates.timestmp
group by start_dates.timestmp
) dates on mytable.timestmp between dates.start_date and dates.end_date
where mytable.orgmessage not like '%PERIOD%'
group by dates.start_date, dates.end_date, messageno
order by dates.start_date, dates.end_date, messageno;
SQL fiddle: http://www.sqlfiddle.com/#!4/365de/15.
Please try this one; replace rrr with your table name:
select periodstart, periodend, messageno, sum(to_number(orgmessage)) s
from (select TIMESTAMP periodstart,
(select min (TIMESTAMP) from rrr r2 where orgmessage = 'END PERIOD' and r2.TIMESTAMP > r.TIMESTAMP) periodend
from rrr r
where orgmessage = 'START PERIOD'
) borders, rrr r
where r.TIMESTAMP between borders.periodstart and borders.periodend
and r.orgmessage not in ('END PERIOD', 'START PERIOD')
group by periodstart, periodend, messageno
order by periodstart, periodend, messageno