count grouped records by ID and show as weekly with day/week outputted - sql

I have a table in which I have to count the total records assigned to each USER per week (Monday to Sunday).
Table BooksIssued
BOOKID USER DATE
1 A 20211001
2 A 20211002
3 A 20211003
4 A 20211004
5 B 20211009
6 C 20211008
7 C 20211008
20211001 is a Friday.
The expected output of the SQL query is as follows; the WEEKDATE column shows the week end date (i.e. Sunday):
WEEKCOUNT USER WEEKDATE
3 A 10/03
1 A 10/10
1 B 10/10
2 C 10/10
I am unable to get a date containing the day in the output, because the grouping is done on the user and the week part of the date. Please suggest how to get the above output.
I am using Vertica.
Below is a sample query I tried (though I could not get the day part of the date):
SELECT USER, date_part('WEEK', date)) as WEEKDATE
       SUM(CASE WHEN DATE >= timestampadd(WEEK, DATEDIFF(WEEK, date('1900-01-01 00:00:00.000'), date(sysdate)), date('1900-01-01 00:00:00.000'))
                AND  DATE <  timestampadd(WEEK, DATEDIFF(WEEK, date('1900-01-01 00:00:00.000'), date(sysdate)) + 1, date('1900-01-01 00:00:00.000'))
                THEN 1 ELSE 0 END) AS WEEKCOUNT,
FROM   BOOKSISSUED
GROUP BY USER, date_part('WEEK', date)
When I add date_part('DAY', date) to the SELECT clause, I get an error because it is not in the GROUP BY.
Please help.

Do you mean this?
WITH
-- your input ...
indata(BOOKID,USR,DT) AS (
SELECT 1,'A',DATE '20211001'
UNION ALL SELECT 2,'A',DATE '20211002'
UNION ALL SELECT 3,'A',DATE '20211003'
UNION ALL SELECT 4,'A',DATE '20211004'
UNION ALL SELECT 5,'B',DATE '20211009'
UNION ALL SELECT 6,'C',DATE '20211008'
UNION ALL SELECT 7,'C',DATE '20211008'
)
SELECT
COUNT(*) AS week_count
, usr
, TO_CHAR(
DATE_TRUNC('WEEK',dt) + INTERVAL '6 DAYS'
, 'MM/DD'
) AS trcweek
FROM indata
GROUP BY 2,3
ORDER BY 2,3
;
week_count | usr | trcweek
------------+-----+---------
3 | A | 10/03
1 | A | 10/10
1 | B | 10/10
2 | C | 10/10
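The week-ending-Sunday logic above (truncate to the Monday of the week, then add six days) can be cross-checked outside the database. Below is a minimal Python sketch, assuming the sample rows from the question; the helper name `week_end_sunday` is made up for illustration and is not part of Vertica.

```python
from collections import Counter
from datetime import date, timedelta


def week_end_sunday(d):
    """Return the Sunday that closes the Monday-to-Sunday week containing d."""
    monday = d - timedelta(days=d.weekday())  # date.weekday(): Monday is 0
    return monday + timedelta(days=6)


# Sample rows from the question: (user, issue date)
rows = [
    ("A", date(2021, 10, 1)), ("A", date(2021, 10, 2)),
    ("A", date(2021, 10, 3)), ("A", date(2021, 10, 4)),
    ("B", date(2021, 10, 9)),
    ("C", date(2021, 10, 8)), ("C", date(2021, 10, 8)),
]

# Count rows per (user, week end date), the same grouping as the SQL query
counts = Counter((user, week_end_sunday(d)) for user, d in rows)
for (user, week_end), n in sorted(counts.items()):
    print(n, user, week_end.strftime("%m/%d"))
```

For the sample rows this prints the same four groups as the query result above.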

Please check the SQL query syntax, in particular the second column of the SELECT clause and the second column of the GROUP BY clause:
SELECT USER, date_part('WEEK', date) as WEEKDATE,
SUM(CASE WHEN DATE >= timestampadd(WEEK, DATEDIFF(WEEK, date('1900-01-01 00:00:00.000'), date(sysdate)), date('1900-01-01 00:00:00.000'))
AND DATE < timestampadd(WEEK, DATEDIFF(WEEK, date('1900-01-01 00:00:00.000'), date(sysdate)) + 1, date('1900-01-01 00:00:00.000'))
THEN 1 ELSE 0 END) AS WEEKCOUNT
FROM BOOKSISSUED
GROUP BY USER, date_part('WEEK', date)

Related

Use SQL to get monthly churn count and churn rate

Currently using Postgres 9.5
I want to calculate monthly churn_count and churn_rate of the search function.
churn_count: number of users who used the search function last month but not this month
churn_rate: churn_count/total_users_last_month
My dummy data is:
CREATE TABLE yammer_events (
occurred_at TIMESTAMP,
user_id INT,
event_name VARCHAR(50)
);
INSERT INTO yammer_events (occurred_at, user_id, event_name) VALUES
('2014-06-01 00:00:01', 1, 'search_autocomplete'),
('2014-06-01 00:00:01', 2, 'search_autocomplete'),
('2014-07-01 00:00:01', 1, 'search_run'),
('2014-07-01 00:00:02', 1, 'search_run'),
('2014-07-01 00:00:01', 2, 'search_run'),
('2014-07-01 00:00:01', 3, 'search_run'),
('2014-08-01 00:00:01', 1, 'search_run'),
('2014-08-01 00:00:01', 4, 'search_run');
Ideal output should be:
|month     |churn_count|churn_rate_percentage|
|---       |---        |---                  |
|2014-07-01|0          |0                    |
|2014-08-01|2          |66.6                 |
In June: user 1, 2 (2 users)
In July: user 1, 2, 3 (3 users)
In August: user 1, 4 (2 users)
In July, we didn't lose any customers. In August, we lost customers 2 and 3, so the churn_count is 2 and the rate is 2/3*100 = 66.6.
I tried the following query to calculate churn_count, but the result is really weird.
WITH monthly_activity AS (
SELECT distinct DATE_TRUNC('month', occurred_at) AS month,
user_id
FROM yammer_events
WHERE event_name LIKE 'search%'
)
SELECT last_month.month+INTERVAL '1 month', COUNT(DISTINCT last_month.user_id)
FROM monthly_activity last_month
LEFT JOIN monthly_activity this_month
ON last_month.user_id = this_month.user_id
AND this_month.month = last_month.month + INTERVAL '1 month'
AND this_month.user_id IS NULL
GROUP BY 1
Thank you in advance!
An easy way to do it would be to aggregate the users of each month in an array, and from there count the users of the previous month that are missing from the current one (an array difference), using the window function LAG(), e.g.
WITH j AS (
SELECT date_trunc('month',occurred_at::date) AS month,
array_agg(distinct user_id) AS users,
count(distinct user_id) AS total_users
FROM yammer_events
WHERE event_name LIKE 'search%'
GROUP BY 1
ORDER BY 1
)
SELECT month::date,
cardinality(LAG(users) OVER w - users) AS churn_count,
(cardinality(LAG(users) OVER w - users)::numeric /
LAG(total_users) OVER w::numeric) * 100 AS churn_rate_percentage
FROM j
WINDOW w AS (ORDER BY month
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW);
month | churn_count | churn_rate_percentage
------------+-------------+-------------------------
2014-06-01 | |
2014-07-01 | 0 | 0.00000000000000000000
2014-08-01 | 2 | 66.66666666666666666700
(3 rows)
Note: this query relies on the intarray extension. If it is not installed in your database, run:
CREATE EXTENSION intarray;
WITH monthly_activity AS (
SELECT distinct DATE_TRUNC('month', occurred_at) AS month,
user_id
FROM yammer_events
WHERE event_name LIKE 'search%'
)
SELECT
last_month.month+INTERVAL '1 month',
SUM(CASE WHEN this_month.month IS NULL THEN 1 ELSE 0 END) AS churn_count,
SUM(CASE WHEN this_month.month IS NULL THEN 1 ELSE 0 END)*1.00/COUNT(DISTINCT last_month.user_id)*100 AS churn_rate_percentage
FROM monthly_activity last_month
LEFT JOIN monthly_activity this_month
ON last_month.month + INTERVAL '1 month' = this_month.month
AND last_month.user_id = this_month.user_id
GROUP BY 1
ORDER BY 1
LIMIT 2
I think my way is more circuitous but easier for beginners to understand. Just for your reference.
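The definition of churn used in both answers (users seen in the previous month but not in the current one) can be checked with plain set arithmetic. This is a minimal Python sketch, assuming the dummy data from the question with the search filter already applied; the function name `monthly_churn` is hypothetical. Like the LAG() approach, it compares each month with the previous month that has data, so it assumes there are no empty months in between.

```python
from collections import defaultdict
from datetime import date

# (event date, user_id) pairs from the question's dummy data; all are search events
events = [
    (date(2014, 6, 1), 1), (date(2014, 6, 1), 2),
    (date(2014, 7, 1), 1), (date(2014, 7, 1), 1),
    (date(2014, 7, 1), 2), (date(2014, 7, 1), 3),
    (date(2014, 8, 1), 1), (date(2014, 8, 1), 4),
]


def monthly_churn(events):
    """Return (month, churn_count, churn_rate_percentage) for each month after the first."""
    users_by_month = defaultdict(set)
    for day, user_id in events:
        users_by_month[day.replace(day=1)].add(user_id)

    months = sorted(users_by_month)
    result = []
    for prev, cur in zip(months, months[1:]):
        churned = users_by_month[prev] - users_by_month[cur]  # set difference
        rate = 100.0 * len(churned) / len(users_by_month[prev])
        result.append((cur, len(churned), round(rate, 1)))
    return result


for month, churn_count, churn_rate in monthly_churn(events):
    print(month, churn_count, churn_rate)
```

The rate is rounded to one decimal here, so August shows 66.7 rather than the truncated 66.6 in the question.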

SQL 30 day active user query

I have a table of users and how many events they fired on a given date:
|DATE      |USERID|EVENTS|
|---       |---   |---   |
|2021-08-27|1     |5     |
|2021-07-25|1     |7     |
|2021-07-23|2     |3     |
|2021-07-20|3     |9     |
|2021-06-22|1     |9     |
|2021-05-05|1     |4     |
|2021-05-05|2     |2     |
|2021-05-05|3     |6     |
|2021-05-05|4     |8     |
|2021-05-05|5     |1     |
I want to create a table showing the number of active users for each date, with an active user defined as someone who has fired an event on the given date or in any of the preceding 30 days.
|DATE      |ACTIVE_USERS|
|---       |---         |
|2021-08-27|1           |
|2021-07-25|3           |
|2021-07-23|2           |
|2021-07-20|2           |
|2021-06-22|1           |
|2021-05-05|5           |
I tried the following query which returned only the users who were active on the specified date:
SELECT COUNT(DISTINCT USERID), DATE
FROM table
WHERE DATE >= (CURRENT_DATE() - interval '30 days')
GROUP BY 2 ORDER BY 2 DESC;
I also tried using a window function with rows between but seems to end up getting the same result:
SELECT
DATE,
SUM(ACTIVE_USERS) AS ACTIVE_USERS
FROM
(
SELECT
DATE,
CASE
WHEN SUM(EVENTS) OVER (PARTITION BY USERID ORDER BY DATE ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) >= 1 THEN 1
ELSE 0
END AS ACTIVE_USERS
FROM table
)
GROUP BY 1
ORDER BY 1
I'm using SQL:ANSI on Snowflake. Any suggestions would be much appreciated.
This is tricky to do with window functions because count(distinct) is not permitted there. You can use a self-join:
select t1.date, count(distinct t2.userid)
from table t1 join
table t2
on t2.date <= t1.date and
t2.date > t1.date - interval '30 day'
group by t1.date;
However, that can be expensive. One solution is to "unpivot" the data. That is, do an incremental count per user of going "in" and "out" of active states and then do a cumulative sum:
with d as ( -- calculate the dates with "ins" and "outs"
select userid, date, +1 as inc
from table
union all
select userid, date + interval '30 day', -1 as inc
from table
),
d2 as ( -- accumulate to get the net actives per day
select date, userid, sum(inc) as change_on_day,
sum(sum(inc)) over (partition by userid order by date) as running_inc
from d
group by date, userid
),
d3 as ( -- summarize into active periods; the row where running_inc drops to 0 closes its period
select userid, min(date) as start_date, max(date) as end_date
from (select d2.*,
sum(case when running_inc = 0 then 1 else 0 end) over (partition by userid order by date)
- (case when running_inc = 0 then 1 else 0 end) as active_period
from d2
) d2
group by userid, active_period
)
select dates.date, count(d3.userid)
from (select distinct date from table) dates left join
d3
on dates.date >= d3.start_date and dates.date < d3.end_date
group by dates.date;
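To sanity-check either query against the expected table in the question, the rolling window can be computed by brute force. This is a minimal Python sketch, assuming the sample rows from the question and the same boundary as the self-join (an event counts for a date when it falls within the 30 days ending on that date); the function name `active_users` is made up for illustration.

```python
from datetime import date, timedelta

# (date, userid) pairs from the question's sample table
events = [
    (date(2021, 8, 27), 1), (date(2021, 7, 25), 1),
    (date(2021, 7, 23), 2), (date(2021, 7, 20), 3),
    (date(2021, 6, 22), 1), (date(2021, 5, 5), 1),
    (date(2021, 5, 5), 2), (date(2021, 5, 5), 3),
    (date(2021, 5, 5), 4), (date(2021, 5, 5), 5),
]


def active_users(events, window_days=30):
    """Map each event date to the number of distinct users active in the window ending on it."""
    result = {}
    for day in sorted({d for d, _ in events}, reverse=True):
        start = day - timedelta(days=window_days)
        # Same boundary as the self-join: start < event date <= day
        result[day] = len({user for d, user in events if start < d <= day})
    return result


for day, n in active_users(events).items():
    print(day, n)
```

This is quadratic in the number of rows, so it is only meant for checking results on small samples, not as a replacement for the SQL.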

Get SUM of Current Week data and Current Year data from SQLite

I have a SQLite database and sales table is like the following,
| Id | quantity | dateTime |
------------------------------------
| 1 | 10 | 2019-12-25 12:55 |
| 2 | 05 | 2019-12-30 12:55 |
| 3 | 25 | 2020-08-23 12:55 |
| 4 | 25 | 2020-08-24 12:55 |
| 5 | 56 | 2020-08-25 12:55 |
| 6 | 25 | 2020-08-26 12:55 |
| 7 | 12 | 2020-08-27 12:55 |
| 8 | 30 | 2020-08-28 12:55 |
| 9 | 40 | 2020-08-29 12:55 |
I need to get the current week's data (Mon to Sun) and the current year's data (Jan to Dec). So if I pass today's date, I need to get only the current week's sales data grouped by day, like the following.
If I pass today's date and time (2020-08-28 13:55), the query should give me the current week's data like this:
Day Sold Items (SUM(quantity))
Monday 20
Tuesday 25
Wednesday 10
Thursday 50
Friday 60
Saturday 0 (If the date hasn't come yet I need to get 0)
Sunday 0
And the same for the current year's data when I pass the current date:
Month Sold Items (SUM(quantity))
JAN 20
FEB 25
MAR 10
APR 50
MAY 60
JUN 0 (If the month hasn't come yet I need to get 0)
JUL 0
... ...
I tried with multiple queries in SQLite but couldn't get what I need. Here are the queries I tried,
Weekly Data (This one gave me past week data also)
SELECT SUM(quantity) as quantity, strftime('%w', dateTime) as Day
From sales
Group by strftime('%w', dateTime)
Monthly Data
SELECT SUM(quantity) as quantity, strftime('%m', dateTime) as Month
From sales
Group by strftime('%m', dateTime)
Can anybody help me achieve this? Thanks in advance.
For the totals of the current week you need a CTE that returns the names of the days and another one that returns the Monday of the current week.
You must cross join these CTEs and left join your table to aggregate:
with
days as (
select 1 nr, 'Monday' day union all
select 2, 'Tuesday' union all
select 3, 'Wednesday' union all
select 4, 'Thursday' union all
select 5, 'Friday' union all
select 6, 'Saturday' union all
select 7, 'Sunday'
),
weekMonday as (
select date(
'now',
case when strftime('%w', 'now') <> '1' then '-7 day' else '0 day' end,
'weekday 1'
) monday
)
select d.day,
coalesce(sum(t.quantity), 0) [Sold Items]
from days d cross join weekMonday wm
left join tablename t
on strftime('%w', t.dateTime) + 0 = d.nr % 7
and date(t.dateTime) between wm.monday and date(wm.monday, '6 day')
group by d.nr, d.day
order by d.nr
For the totals of the current year you need a CTE that returns the month names and then left join the table to aggregate:
with
months as (
select 1 nr, 'JAN' month union all
select 2 nr, 'FEB' union all
select 3 nr, 'MAR' union all
select 4 nr, 'APR' union all
select 5 nr, 'MAY' union all
select 6 nr, 'JUN' union all
select 7 nr, 'JUL' union all
select 8 nr, 'AUG' union all
select 9 nr, 'SEP' union all
select 10 nr, 'OCT' union all
select 11 nr, 'NOV' union all
select 12 nr, 'DEC'
)
select m.month,
coalesce(sum(t.quantity), 0) [Sold Items]
from months m
left join tablename t
on strftime('%m', t.dateTime) + 0 = m.nr
and date(t.dateTime) between date('now','start of year') and date('now','start of year', '1 year', '-1 day')
group by m.nr, m.month
order by m.nr
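The weekly query above depends on 'now', which makes it awkward to test. Below is a minimal Python sketch using the standard sqlite3 module that runs the same idea against an in-memory copy of the question's sales table, with the reference date passed as a parameter instead of 'now'. The names `make_sample_db`, `WEEK_QUERY`, and `weekly_totals` are hypothetical, and the table is assumed to be called sales as in the question.

```python
import sqlite3


def make_sample_db():
    """Build an in-memory copy of the question's sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (Id INTEGER, quantity INTEGER, dateTime TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            (1, 10, "2019-12-25 12:55"), (2, 5, "2019-12-30 12:55"),
            (3, 25, "2020-08-23 12:55"), (4, 25, "2020-08-24 12:55"),
            (5, 56, "2020-08-25 12:55"), (6, 25, "2020-08-26 12:55"),
            (7, 12, "2020-08-27 12:55"), (8, 30, "2020-08-28 12:55"),
            (9, 40, "2020-08-29 12:55"),
        ],
    )
    return conn


WEEK_QUERY = """
WITH days(nr, day) AS (
    VALUES (1, 'Monday'), (2, 'Tuesday'), (3, 'Wednesday'), (4, 'Thursday'),
           (5, 'Friday'), (6, 'Saturday'), (7, 'Sunday')
),
week_monday(monday) AS (
    -- Step back a week unless :today is a Monday, then move forward to Monday
    SELECT date(:today,
                CASE WHEN strftime('%w', :today) <> '1' THEN '-7 days' ELSE '+0 days' END,
                'weekday 1')
)
SELECT d.day, COALESCE(SUM(s.quantity), 0)
FROM days d
CROSS JOIN week_monday wm
LEFT JOIN sales s
  ON CAST(strftime('%w', s.dateTime) AS INTEGER) = d.nr % 7
 AND date(s.dateTime) BETWEEN wm.monday AND date(wm.monday, '+6 days')
GROUP BY d.nr, d.day
ORDER BY d.nr
"""


def weekly_totals(conn, today):
    """Return (day name, sold items) for the Monday-to-Sunday week containing today."""
    return conn.execute(WEEK_QUERY, {"today": today}).fetchall()


conn = make_sample_db()
for day, sold in weekly_totals(conn, "2020-08-28"):
    print(day, sold)
```

With the question's rows and 2020-08-28 as the reference date, only the sales dated 2020-08-24 through 2020-08-29 are counted, and Sunday comes back as 0.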
You can use the query below (written in SQL Server syntax) to get the weekly data. I am assuming that every date has a single entry and hence I am not grouping; otherwise you can add a GROUP BY.
First we build the weekly calendar based on the input date (I have taken the current date),
and then left join the table to the calendar to get the required sold items info.
WITH seq(n) AS
(
SELECT 0 UNION ALL SELECT n + 1 FROM seq
WHERE n < DATEDIFF(DAY, (SELECT DATEADD(DAY, 2 - DATEPART(WEEKDAY, GETDATE()), CAST(GETDATE() AS DATE)) [Week_Start_Date]), (Select DATEADD(DAY, 8 - DATEPART(WEEKDAY, GETDATE()), CAST(GETDATE() AS DATE)) [Week_End_Date]))
),
CALENDAR(d) AS
(
SELECT DATEADD(DAY, n, (SELECT DATEADD(DAY, 2 - DATEPART(WEEKDAY, GETDATE()), CAST(GETDATE() AS DATE)) [Week_Start_Date])) FROM seq
)
SELECT coalesce(QUANTITY, 0) sold_items ,DATENAME(WEEKDAY, d) week_day FROM CALENDAR a left outer join Table_WEEKDAY b
on (a.d = convert(date, b.dateTime))
ORDER BY d
OPTION (MAXRECURSION 0);
You can try the below:
select day,coalesce(sum(quantity),0) as quantity
from
(select 0 as day union all select 1 union all select 2 union all select 3 union all select 4
union all select 5 union all select 6) as d
left join sales on cast(strftime('%w', dateTime) as int)=day
group by strftime('%w', dateTime),day
order by day

ORACLE SQL: Group the data by the last 4 weeks

I have trouble with dates: I need a query that counts the ids from each of the last four weeks.
I tried this, but it doesn't work.
SELECT count(a.id), sysdate
FROM table_1 a, table_2 b
WHERE b.fk_id = a.id
AND a.column = some_id
CONNECT BY LEVEL <=4
I need an output like this
| count(a.id) | week |
| 2 | 1 |
| 6 | 2 |
| 7 | 3 |
| 21 | 4 |
So, the "count(a.id)" values are the counts of the IDs in each of the past 4 weeks.
Here's a MS SQL Server solution. You should be able to convert it to Oracle if needed.
select count(id) as 'count(a.id)'
, datepart(week, MyDate) as 'week'
from table_1
where datepart(week, MyDate) between datepart(week, getdate()) - 4 and datepart(week, getdate()) - 1
group by datepart(week, MyDate)
And here's my attempt at doing this in Oracle.
select count(a.id)
, Week
from (
select cast(TO_CHAR(MyDate, 'WW') as int) +
case when cast(TO_CHAR(MyDate, 'D') as int) < cast(TO_CHAR(trunc(MyDate, 'year'), 'D') as int) then 1 else 0 end Week
, id
, MyDate
from table_1
) a
where a.MyDate between sysdate - cast(TO_CHAR(sysdate, 'D') as int) - 28 + 1 and sysdate - cast(TO_CHAR(sysdate, 'D') as int)
group by Week

breakdown by weeks

Below is a simple query and its result. Is there a way to aggregate the total EVENTs by 7 days, then sum up the total EVENTs? Would a rollup function work? I am using SQL Server 2005 and 2008. Thanks again, folks.
SELECT DATE_SOLD, count(DISTINCT PRODUCTS) AS PRODUCT_SOLD
FROM PRODUCTS
WHERE DATE_SOLD >='10/1/2009'
and DATE_SOLD <'10/1/2010'
GROUP BY DATE_SOLD
RESULTS:
DATE_SOLD PRODUCT_SOLD
10/1/09 5
10/2/09 11
10/3/09 14
10/4/09 6
10/5/09 11
10/6/09 13
10/7/09 10
Total 70
10/8/09 4
10/9/09 11
10/10/09 8
10/11/09 4
10/12/09 7
10/13/09 4
10/14/09 9
Total 47
Not having your table design to work with, here's what I think you are after (although I have to admit the output needs to be cleaned up). It should, at least, get you some way to the solution you are looking for.
CREATE TABLE MyTable(
event_date date,
event_type char(1)
)
GO
INSERT MyTable VALUES ('2009-1-01', 'A')
INSERT MyTable VALUES ('2009-1-11', 'B')
INSERT MyTable VALUES ('2009-1-11', 'C')
INSERT MyTable VALUES ('2009-1-20', 'N')
INSERT MyTable VALUES ('2009-1-20', 'N')
INSERT MyTable VALUES ('2009-5-23', 'D')
INSERT MyTable VALUES ('2009-5-23', 'E')
INSERT MyTable VALUES ('2009-5-10', 'F')
INSERT MyTable VALUES ('2009-5-10', 'F')
GO
WITH T AS (
SELECT DATEPART(MONTH, event_date) event_month, event_date, event_type
FROM MyTable
)
SELECT CASE WHEN (GROUPING(event_month) = 0)
THEN event_month ELSE '99' END AS event_month,
CASE WHEN (GROUPING(event_date) = 1)
THEN '9999-12-31' ELSE event_date END AS event_date,
COUNT(DISTINCT event_type) AS event_count
FROM T
GROUP BY event_month, event_date WITH ROLLUP
ORDER BY event_month, event_date
This gives the following output:
event_month event_date event_count
1 2009-01-01 1
1 2009-01-11 2
1 2009-01-20 1
1 9999-12-31 4
5 2009-05-10 1
5 2009-05-23 2
5 9999-12-31 3
99 9999-12-31 7
The rows with '99' as the month or '9999-12-31' as the date are the totals.
SELECT DATEDIFF(week, 0, DATE_SOLD) AS [Week],
DATEADD(week, DATEDIFF(week, 0, DATE_SOLD), 0) AS [From],
DATEADD(week, DATEDIFF(week, 0, DATE_SOLD), 0) + 6 AS [To],
COUNT(DISTINCT PRODUCTS) AS PRODUCT_SOLD
FROM dbo.PRODUCTS
WHERE DATE_SOLD >= '2009-10-01'
AND DATE_SOLD < '2010-10-01'
GROUP BY DATEDIFF(week, 0, DATE_SOLD) WITH ROLLUP
ORDER BY DATEDIFF(week, 0, DATE_SOLD)
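The question's expected output groups the days into 7-day blocks counted from the first date, rather than into calendar weeks. A minimal Python sketch of that bucketing, assuming the daily counts shown in the question; the function name `seven_day_totals` is made up for illustration. Note that it sums the daily counts, as the "Total" rows in the question do, which is not the same as a COUNT(DISTINCT ...) taken over the whole week.

```python
from datetime import date, timedelta
from itertools import groupby

# (DATE_SOLD, PRODUCT_SOLD) rows from the question's result set
daily = [
    (date(2009, 10, 1), 5), (date(2009, 10, 2), 11), (date(2009, 10, 3), 14),
    (date(2009, 10, 4), 6), (date(2009, 10, 5), 11), (date(2009, 10, 6), 13),
    (date(2009, 10, 7), 10), (date(2009, 10, 8), 4), (date(2009, 10, 9), 11),
    (date(2009, 10, 10), 8), (date(2009, 10, 11), 4), (date(2009, 10, 12), 7),
    (date(2009, 10, 13), 4), (date(2009, 10, 14), 9),
]


def seven_day_totals(daily, start):
    """Sum the daily counts in consecutive 7-day blocks beginning at start."""
    def bucket(row):
        return (row[0] - start).days // 7  # 0 for the first block, 1 for the second, ...

    totals = []
    for index, rows in groupby(sorted(daily), key=bucket):
        first_day = start + timedelta(days=7 * index)
        last_day = first_day + timedelta(days=6)
        totals.append((first_day, last_day, sum(count for _, count in rows)))
    return totals


for first_day, last_day, total in seven_day_totals(daily, date(2009, 10, 1)):
    print(first_day, last_day, total)
```

For the question's rows this gives 70 for 10/1 through 10/7 and 47 for 10/8 through 10/14, matching the "Total" lines.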