SQL COUNT number of patients each month - sql

I have a table with:
PATIENT_ID  START_DATE  END_DATE    Ward
1           19/01/2022  19/02/2022  A
2           20/01/2022  19/03/2022  A
And I want to create a summarized table to show for each month, how many patients were active in that ward as well as the total number of patient days for that month. Is this possible in SQL?
I'm thinking I might need an external DIM_DATE table that holds every month from the earliest START_DATE across all patients up to now, but that doesn't sound very efficient.
Note that I have shown only one ward, but there are other wards as well.
Expected result:
Month       Ward  COUNT_PATIENTS  TOTAL_NUMBER_DAYS
31/01/2022  A     2               33
28/02/2022  A     2               47
31/03/2022  A     1               19

with data(PATIENT_ID, START_DATE, END_DATE, Ward) as (
select column1, to_date(column2, 'dd/mm/yyyy'), to_date(column3, 'dd/mm/yyyy'), column4
from values
(1, '19/01/2022','19/02/2022','A'),
(2, '20/01/2022','19/03/2022','A')
), ranges as (
select
date_trunc('month', min(start_date)) as min_start,
dateadd('day', -1, dateadd('month', 1, date_trunc('month', max(end_date)))) as max_end
from data
), gen as (
select
row_number() over(order by null)-1 as rn
from table(generator(ROWCOUNT => 1000))
), all_months as (
select
dateadd('month', g.rn, date_trunc(month, r.min_start)) as month_start,
dateadd('day', -1, dateadd('month', 1, month_start)) as month_end
from ranges as r
cross join gen as g
qualify month_start <= r.max_end
)
select
a.month_end as month,
d.ward,
count(distinct patient_id) as count_patients,
sum(datediff(days, greatest(d.start_date, a.month_start), least(d.END_DATE, a.month_end))+1) as total_numbers_days
from all_months as a
left join data as d
on a.month_start between date_trunc('month', d.START_DATE) and date_trunc('month', d.END_DATE)
group by 1,2
order by 1,2
gives:
MONTH       WARD  COUNT_PATIENTS  TOTAL_NUMBERS_DAYS
2022-01-31  A     2               25
2022-02-28  A     2               47
2022-03-31  A     1               19
I think your 33 is wrong, as the partials are:
MONTH       WARD  PATIENT_ID  _START      _END        DAYS
2022-01-31  A     1           2022-01-19  2022-01-31  13
2022-01-31  A     2           2022-01-20  2022-01-31  12
2022-02-28  A     1           2022-02-01  2022-02-19  19
2022-02-28  A     2           2022-02-01  2022-02-28  28
2022-03-31  A     2           2022-03-01  2022-03-19  19
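The per-month arithmetic can also be cross-checked outside the database. The following is a minimal Python sketch, not part of the SQL answer: it clips each stay to every month it touches and counts both endpoints as a day, which is the same convention as the `datediff(...)+1` expression. The names `stays`, `month_bounds` and `monthly_summary` are illustrative only.

```python
from datetime import date, timedelta

# Sample rows from the question: (patient_id, start_date, end_date, ward).
stays = [
    (1, date(2022, 1, 19), date(2022, 2, 19), "A"),
    (2, date(2022, 1, 20), date(2022, 3, 19), "A"),
]

def month_bounds(year, month):
    """Return the first and last day of the given month."""
    first = date(year, month, 1)
    next_first = date(year + month // 12, month % 12 + 1, 1)
    return first, next_first - timedelta(days=1)

def monthly_summary(rows):
    """Map (month_end, ward) to (distinct patients, total patient days)."""
    summary = {}
    for patient_id, start, end, ward in rows:
        year, month = start.year, start.month
        while (year, month) <= (end.year, end.month):
            first, last = month_bounds(year, month)
            # Clip the stay to the month; both endpoints count as a day.
            days = (min(end, last) - max(start, first)).days + 1
            entry = summary.setdefault((last, ward), [set(), 0])
            entry[0].add(patient_id)
            entry[1] += days
            year, month = year + month // 12, month % 12 + 1
    return {key: (len(p), d) for key, (p, d) in sorted(summary.items())}

result = monthly_summary(stays)
for (month_end, ward), (patients, days) in result.items():
    print(month_end, ward, patients, days)
```

For the two sample rows this gives 25 days for January, 47 for February and 19 for March, in line with the partials listed above.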

Related

Aggregate monthly rows created date and ended date

I need to adapt a graph from the current BI implementation to an SQL one. The graph shows the number of requests received, and each request has three fields that are relevant for this query: the id, the created date and the end date.
The graph looks like this https://i.stack.imgur.com/NRIjr.png:
+----+-------------+------------+
| ID | CREATE_DATE | END_DATE   |
+----+-------------+------------+
| 1  | 2022-01-01  | 2022-02-10 |
| 2  | 2022-01-03  | 2022-03-01 |
| 3  | 2022-02-01  | 2022-04-01 |
| 4  | 2022-03-01  | null       |
+----+-------------+------------+
So for this particular example we'd have something like this:
January: active: 2 (requests 1 and 2), finished: 0;
February: active 2 (requests 2, 3), finished 1 (request 1);
March: active 2 (requests 3, 4) finished 1 (request 2)
So for each month I want the requests that were active in that month (those whose end date falls after that month or is null) and the requests that finished during that month (this could be split into a separate query, of course). I tried the query below, but it doesn't take into account the requests that ended in a particular month and only gives me the cumulative sum.
Edit: I forgot to mention that one of the requirements is that the beginning and end dates of the graph can be set by the user. So I might want to see the months from April-2020 to April-2022 and look at the two-year behaviour!
WITH cte AS (
    SELECT date_trunc('month', r.date_init) AS mon,
           count(r.id) AS mon_sum
    FROM "FOLLOWUP"."CAT_REQUEST" r
    GROUP BY 1
)
SELECT to_char(mon, 'YYYY-mm') AS mon_text,
       COALESCE(sum(c.mon_sum) OVER (ORDER BY mon), 0) AS running_sum
FROM generate_series('2022-01-01', '2023-12-25', interval '1 month') mon
LEFT JOIN cte c USING (mon)
ORDER BY mon
I wrote a query using slightly different business logic; it should give the result you need. Sample query:
with month_list as (
select 1 as id, 'January' as mname union all
select 2 as id, 'February' as mname union all
select 3 as id, 'March' as mname union all
select 4 as id, 'April' as mname union all
select 5 as id, 'May' as mname union all
select 6 as id, 'June' as mname union all
select 7 as id, 'July' as mname union all
select 8 as id, 'August' as mname union all
select 9 as id, 'September' as mname union all
select 10 as id, 'October' as mname union all
select 11 as id, 'November' as mname union all
select 12 as id, 'December' as mname
),
test_table as (
select
id,
create_date,
end_date,
extract(month from create_date) as month1,
extract(month from end_date) as month2
from
your_table
)
select
t1.mname,
count(*) as "actived"
from
month_list t1
inner join
test_table t2 on (t1.id >= t2.month1) and (t1.id < t2.month2)
group by
t1.id, t1.mname
order by
t1.id
/* --- Result:
mname actived
--------------------
January 2
February 2
March 1
*/
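Note that the query above compares month numbers only, so it cannot span more than one year, and a row with a null end_date never satisfies the join condition. As a cross-check, here is a minimal Python sketch of the counting rule stated in the question (active: created before the month ends and still open after it ends; finished: end date inside the month). It is an illustration rather than a replacement for the SQL, and the names `requests`, `month_starts` and `active_and_finished`, along with the user-chosen range parameters, are hypothetical.

```python
from datetime import date

# Sample requests from the question: (id, create_date, end_date or None).
requests = [
    (1, date(2022, 1, 1), date(2022, 2, 10)),
    (2, date(2022, 1, 3), date(2022, 3, 1)),
    (3, date(2022, 2, 1), date(2022, 4, 1)),
    (4, date(2022, 3, 1), None),
]

def month_starts(first, last):
    """Yield the first day of every month from first to last, inclusive."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        year, month = year + month // 12, month % 12 + 1

def active_and_finished(rows, first, last):
    """Return (month, active, finished) for each month in the chosen range."""
    report = []
    for start in month_starts(first, last):
        next_start = date(start.year + start.month // 12, start.month % 12 + 1, 1)
        # Active: created before the month ends and still open after it ends.
        active = sum(1 for _, created, ended in rows
                     if created < next_start and (ended is None or ended >= next_start))
        # Finished: the end date falls inside the month.
        finished = sum(1 for _, _, ended in rows
                       if ended is not None and start <= ended < next_start)
        report.append((start.strftime("%Y-%m"), active, finished))
    return report

# The range is a parameter, so it can cover several years.
report = active_and_finished(requests, date(2022, 1, 1), date(2022, 3, 31))
for row in report:
    print(row)
```

For the sample data this reproduces the figures given in the question: 2 active and 0 finished in January, then 2 active and 1 finished in both February and March.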
PostgreSQL has many date & time functions and types.
Here are some samples.
In these samples, now() stands in for the chosen date.
-- get the date 12 months before the chosen date (returns a timestamp with time zone)
select now() - '12 month'::interval as newdate
-- Return:
2021-04-03 18:22:48.344 +0400
-- if you need only date, you can cast this to date
select (now() - '12 month'::interval)::date as newdate
-- Return:
2021-04-03
-- generate a series from 12 months before the selected date up to that date, in one-month steps:
SELECT t1.datelist::date
from generate_series
(
now()-'12 month'::interval,
now(),
'1 month'
)
AS t1(datelist)
-- Return:
2021-04-03
2021-05-03
2021-06-03
2021-07-03
2021-08-03
2021-09-03
2021-10-03
2021-11-03
2021-12-03
2022-01-03
2022-02-03
2022-03-03
2022-04-03
-- the same series, also extracting the year and the month name:
-- this sample may be what you need.
SELECT
extract(year from t1.datelist) as "year",
TO_CHAR(t1.datelist, 'Month') as "month",
trim(TO_CHAR(t1.datelist, 'Month')) || '-' || trim(to_char(t1.datelist, 'yyyy')) as "formatted_date"
from generate_series
(
now()-'12 month'::interval,
now(),
'1 month'
)
AS t1(datelist)
-- Return:
year month formatted_date
------------------------------------
2021 April April-2021
2021 May May-2021
2021 June June-2021
2021 July July-2021
2021 August August-2021
2021 September September-2021
2021 October October-2021
2021 November November-2021
2021 December December-2021
2022 January January-2022
2022 February February-2022
2022 March March-2022
2022 April April-2022

SQL - Constructing an SCD2 type dimension from overlapping periods

I have data like this:
GroupId DateFrom DateTo value_
Gr1 2022-03-01 2022-08-01 10
Gr2 2022-01-01 2022-12-31 20
Gr3 2022-01-01 2022-12-31 30
I'm trying to construct an SCD2 type dimension by doing an unpivot on data above
WITH UnPivoted AS (SELECT 'Gr1' AS GroupId, '2022-03-01' AS DateFrom, '2022-08-01' AS DateTo, 10 as value_ UNION ALL
SELECT 'Gr2', '2022-01-01', '2022-12-31', 20 UNION ALL
SELECT 'Gr3', '2022-01-01', '2022-12-31', 30
)
SELECT DateFrom, DateTo, SUM([Gr1]) Gr1, SUM([Gr2]) Gr2, SUM([Gr3]) Gr3
FROM UnPivoted
PIVOT (
SUM(value_) FOR GroupId IN ([Gr1],[Gr2],[Gr3])
) pvt
GROUP BY DateFrom, DateTo
with result:
DateFrom DateTo Gr1 Gr2 Gr3
2022-03-01 2022-08-01 10 NULL NULL
2022-01-01 2022-12-31 NULL 20 30
But, as you can see, date ranges are not identical so my GROUP BY does not work. And there is an overlap in date ranges so output is not correct.
I would like to get this result instead:
DateFrom    DateTo      Gr1  Gr2  Gr3
2022-01-01  2022-03-01       20   30
2022-03-01  2022-08-01  10   20   30
2022-08-01  2022-12-31       20   30
The best approach that I can come up with is to get all distinct values of DateFrom and DateTo and go through intervals between them one by one, constructing a new row for each interval.
Is there an easier way of getting the desired result?
In case someone else is in the same situation, the script below works. It also has some additional logic to adjust end dates so they do not overlap with start dates.
input (Unpivoted CTE):
GroupId DateFrom DateTo value_
Gr1 2022-03-01 2022-08-01 10
Gr2 2022-01-01 2022-12-31 20
Gr3 2022-01-01 2022-12-31 30
script:
WITH UnPivoted AS (SELECT 'Gr1' AS GroupId, CAST('2022-03-01' AS date) AS DateFrom, CAST('2022-08-01' AS date) AS DateTo, 10 as value_ UNION ALL
SELECT 'Gr2', '2022-01-01', '2022-12-31', 20 UNION ALL
SELECT 'Gr3', '2022-01-01', '2022-12-31', 30
)
,UniqueDateRanges AS (
SELECT DISTINCT DateFrom
FROM UnPivoted
UNION
SELECT DISTINCT DATEADD(d,1,DateTo)
FROM UnPivoted
)
,DateIntervals_SCD2 AS (
SELECT DateFrom
,CAST(NULLIF(LEAD(DateFrom,1,NULL) OVER(PARTITION BY '1' ORDER BY DateFrom),NULL) AS date) AS DateTo1
,CAST(DATEADD(d,-1,NULLIF(LEAD(DateFrom,1,NULL) OVER(PARTITION BY '1' ORDER BY DateFrom),NULL)) AS date) AS DateTo1_adjusted
FROM UniqueDateRanges
)
,Dataset_Fixed_SCD2 AS (
SELECT di.DateFrom, DateTo1_adjusted AS DateTo, up.GroupId, up.value_
FROM DateIntervals_SCD2 di
LEFT JOIN UnPivoted up ON di.DateFrom BETWEEN up.DateFrom AND up.DateTo AND DateTo1_adjusted BETWEEN up.DateFrom AND up.DateTo
WHERE DateTo1_adjusted IS NOT NULL
)
SELECT *
FROM Dataset_Fixed_SCD2
PIVOT (
SUM(value_) FOR GroupId IN ([Gr1],[Gr2],[Gr3])
) pvt
output:
DateFrom    DateTo      Gr1  Gr2  Gr3
2022-01-01  2022-02-28       20   30
2022-03-01  2022-08-01  10   20   30
2022-08-02  2022-12-31       20   30
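The interval-splitting idea behind the script (collect every start date and every day after an end date as a boundary, then look up which groups cover each resulting interval) can be sketched in a few lines of Python. This is only an illustration of the logic under the same end-date adjustment as the script; `rows` and `split_into_intervals` are hypothetical names.

```python
from datetime import date, timedelta

# Input rows from the question: (group_id, date_from, date_to, value).
rows = [
    ("Gr1", date(2022, 3, 1), date(2022, 8, 1), 10),
    ("Gr2", date(2022, 1, 1), date(2022, 12, 31), 20),
    ("Gr3", date(2022, 1, 1), date(2022, 12, 31), 30),
]

def split_into_intervals(data):
    """Cut overlapping ranges into non-overlapping intervals with one value per group."""
    # Every start date, and every day after an end date, opens a new interval.
    boundaries = sorted({r[1] for r in data} | {r[2] + timedelta(days=1) for r in data})
    result = []
    for start, next_start in zip(boundaries, boundaries[1:]):
        end = next_start - timedelta(days=1)
        # A group contributes when its range covers the whole interval.
        values = {g: v for g, lo, hi, v in data if lo <= start and end <= hi}
        result.append((start, end, values))
    return result

intervals = split_into_intervals(rows)
for start, end, values in intervals:
    print(start, end, values)
```

On the sample input this yields the same three intervals as the output table above, with Gr1 present only in the middle one.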

Add missing month in result with values from previous month

I have a result set with the month as the first column. Some months are missing from the result. For each missing month, I need to repeat the previous month's record, up to the last month.
Current data:
Desired Output:
I have an SQL query, but instead of filling in only the missing months it takes every row into account and generates rows for all of them.
select
to_char(generate_series(date_trunc('MONTH',to_date(period,'YYYYMMDD')+interval '1' month),
date_trunc('MONTH',now()+interval '1' day),
interval '1' month) - interval '1 day','YYYYMMDD') as period,
name,age,salary,rating
from( values ('20201205','Alex',35,100,'A+'),
('20210110','Alex',35,110,'A'),
('20210512','Alex',35,999,'A+'),
('20210625','Jhon',20,175,'B-'),
('20210922','Jhon',20,200,'B+')) v (period,name,age,salary,rating) order by 2,3,4,5,1;
Output of this query:
Can someone help me get the desired output?
Regards!!
You can achieve this with a recursive cte like this:
with RECURSIVE ctetest as (SELECT * FROM (values ('2020-12-31'::date,'Alex',35,100,'A+'),
('2021-01-31'::date,'Alex',35,110,'A'),
('2021-05-31'::date,'Alex',35,999,'A+'),
('2021-06-30'::date,'Jhon',20,175,'B-'),
('2021-09-30'::date,'Jhon',20,200,'B+')) v (mth, emp, age, salary, rating)),
cte AS (
SELECT MIN(mth) AS mth, emp, age, salary, rating
FROM ctetest
GROUP BY emp, age, salary, rating
UNION
SELECT COALESCE(n.mth, (l.mth + interval '1 day' + interval '1 month' - interval '1 day')::date), COALESCE(n.emp, l.emp),
COALESCE(n.age, l.age), COALESCE(n.salary, l.salary), COALESCE(n.rating, l.rating)
FROM cte l
LEFT OUTER JOIN ctetest n ON n.mth = (l.mth + interval '1 day' + interval '1 month' - interval '1 day')::date
AND n.emp = l.emp
WHERE (l.mth + interval '1 day' + interval '1 month' - interval '1 day')::date <= (SELECT MAX(mth) FROM ctetest)
)
SELECT * FROM cte order by 2, 1;
Note that although ctetest is not itself recursive (it is only used to supply the test data), if any CTE among multiple CTEs is recursive, the RECURSIVE keyword must follow WITH.
You can use cross join lateral to fill the gaps and then union all with the original data.
WITH the_table (period, name, age, salary, rating) as ( values
('2020-12-01'::date, 'Alex', 35, 100, 'A+'),
('2021-01-01'::date, 'Alex', 35, 110, 'A'),
('2021-05-01'::date, 'Alex', 35, 999, 'A+'),
('2021-06-01'::date, 'Jhon', 20, 100, 'B-'),
('2021-09-01'::date, 'Jhon', 20, 200, 'B+')
),
t as (
select *, coalesce(
lead(period) over (partition by name order by period) - interval 'P1M',
max(period) over ()
) last_period
from the_table
)
SELECT lat::date period, name, age, salary, rating
from t
cross join lateral generate_series
(period + interval 'P1M', last_period, interval 'P1M') lat
UNION ALL
SELECT * from the_table
ORDER BY name, period;
Please note that using integer data type for a date column is sub-optimal. Better review your data design and use date data type instead. You can then present it as integer if necessary.
period      name  age  salary  rating
2020-12-01  Alex  35   100     A+
2021-01-01  Alex  35   110     A
2021-02-01  Alex  35   110     A
2021-03-01  Alex  35   110     A
2021-04-01  Alex  35   110     A
2021-05-01  Alex  35   999     A+
2021-06-01  Alex  35   999     A+
2021-07-01  Alex  35   999     A+
2021-08-01  Alex  35   999     A+
2021-09-01  Alex  35   999     A+
2021-06-01  Jhon  20   100     B-
2021-07-01  Jhon  20   100     B-
2021-08-01  Jhon  20   100     B-
2021-09-01  Jhon  20   200     B+
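The fill rule used by the lateral-join query (repeat each person's row until their next row, and repeat their last row up to the latest period in the whole table) can be sketched in Python as a cross-check. This assumes every period is the first day of a month, as in the sample data of that answer; `rows`, `next_month` and `fill_missing_months` are illustrative names.

```python
from datetime import date

# Sample rows from the answer above: (period, name, age, salary, rating).
rows = [
    (date(2020, 12, 1), "Alex", 35, 100, "A+"),
    (date(2021, 1, 1), "Alex", 35, 110, "A"),
    (date(2021, 5, 1), "Alex", 35, 999, "A+"),
    (date(2021, 6, 1), "Jhon", 20, 100, "B-"),
    (date(2021, 9, 1), "Jhon", 20, 200, "B+"),
]

def next_month(d):
    """Return the first day of the month after d."""
    return date(d.year + d.month // 12, d.month % 12 + 1, 1)

def fill_missing_months(data):
    """Repeat each person's latest row for every missing month up to the overall last period."""
    last_period = max(r[0] for r in data)
    by_name = {}
    for row in sorted(data, key=lambda r: (r[1], r[0])):
        by_name.setdefault(row[1], []).append(row)
    filled = []
    for person_rows in by_name.values():
        for current, following in zip(person_rows, person_rows[1:] + [None]):
            filled.append(current)
            # Stop before the person's next real row, or after the overall last period.
            stop = following[0] if following else next_month(last_period)
            period = next_month(current[0])
            while period < stop:
                filled.append((period,) + current[1:])
                period = next_month(period)
    return filled

filled = fill_missing_months(rows)
for row in filled:
    print(*row)
```

For the sample rows this produces the same 14 rows as the table above, ordered by name and period.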

How to fill the time gap after grouping date record for months in postgres

I have table records as -
date n_count
2020-02-19 00:00:00 4
2020-07-14 00:00:00 1
2020-07-17 00:00:00 1
2020-07-30 00:00:00 2
2020-08-03 00:00:00 1
2020-08-04 00:00:00 2
2020-08-25 00:00:00 2
2020-09-23 00:00:00 2
2020-09-30 00:00:00 3
2020-10-01 00:00:00 11
2020-10-05 00:00:00 12
2020-10-19 00:00:00 1
2020-10-20 00:00:00 1
2020-10-22 00:00:00 1
2020-11-02 00:00:00 376
2020-11-04 00:00:00 72
2020-11-11 00:00:00 1
I want to group all the records by month to get the total count per month. That part works, but some months are missing from the result. How can I fill these gaps?
time month_count
"2020-02-01" 4
"2020-07-01" 4
"2020-08-01" 5
"2020-09-01" 5
"2020-10-01" 26
"2020-11-01" 449
This is what I have tried.
SELECT (date_trunc('month', date))::date AS time,
sum(n_count) as month_count
FROM table1
group by time
order by time asc
You can use generate_series() to generate all starts of months between the earliest and latest date available in the table, then bring the table with a left join:
select d.dt, coalesce(sum(t.n_count), 0) as month_count
from (
select generate_series(date_trunc('month', min(date)), date_trunc('month', max(date)), '1 month') as dt
from table1
) as d(dt)
left join table1 t on t.date >= d.dt and t.date < d.dt + interval '1 month'
group by d.dt
order by d.dt
I would simply UNION a date series, generated from MIN and MAX date:
WITH cte AS ( -- 1
SELECT
*,
date_trunc('month', date)::date AS time
FROM
t
)
SELECT
time,
SUM(n_count) as month_count --3
FROM (
SELECT
time,
n_count
FROM cte
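UNION ALL -- plain UNION would remove duplicate (time, n_count) rows, e.g. the two July rows with n_count = 1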
SELECT -- 2
generate_series(
(SELECT MIN(time) FROM cte),
(SELECT MAX(time) FROM cte),
interval '1 month'
)::date,
0
) s
GROUP BY time
ORDER BY time
1. Use a CTE to calculate date_trunc only once. It can be left out if you prefer to reference your table twice in the UNION.
2. Generate a monthly date series from the MIN to the MAX date, each with n_count = 0, and add it to the table rows.
3. Do your calculation.
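Both answers rely on the same idea: create an entry with a zero count for every month between the earliest and latest date, then add the real counts on top. A minimal Python sketch of that idea, using the rows from the question, is shown below as a cross-check of the expected totals. The names `records` and `monthly_totals` are illustrative.

```python
from datetime import date

# Sample rows from the question: (date, n_count).
records = [
    (date(2020, 2, 19), 4), (date(2020, 7, 14), 1), (date(2020, 7, 17), 1),
    (date(2020, 7, 30), 2), (date(2020, 8, 3), 1), (date(2020, 8, 4), 2),
    (date(2020, 8, 25), 2), (date(2020, 9, 23), 2), (date(2020, 9, 30), 3),
    (date(2020, 10, 1), 11), (date(2020, 10, 5), 12), (date(2020, 10, 19), 1),
    (date(2020, 10, 20), 1), (date(2020, 10, 22), 1), (date(2020, 11, 2), 376),
    (date(2020, 11, 4), 72), (date(2020, 11, 11), 1),
]

def monthly_totals(rows):
    """Sum n_count per month, with 0 for months that have no rows."""
    first = min(d for d, _ in rows).replace(day=1)
    last = max(d for d, _ in rows).replace(day=1)
    totals = {}
    month = first
    # Seed every month in the range with zero so gaps still appear.
    while month <= last:
        totals[month] = 0
        month = date(month.year + month.month // 12, month.month % 12 + 1, 1)
    # Add the real counts; duplicate (month, n_count) pairs are all kept.
    for d, n in rows:
        totals[d.replace(day=1)] += n
    return totals

totals = monthly_totals(records)
for month, total in totals.items():
    print(month, total)
```

July comes out as 4 here because both rows with n_count = 1 are counted, which is why duplicates must not be removed when combining the series with the data.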

oracle count based on month

I am attempting to write an Oracle SQL query.
I am looking for a solution to something like the following. Here is the data I have:
start_date end_date customer
01-01-2012 31-06-2012 a
01-01-2012 31-01-2012 b
01-02-2012 31-03-2012 c
I want the count of customer in that date period. My result should look like below
Month : Customer Count
JAN-12 : 2
FEB-12 : 2
MAR-12 : 2
APR-12 : 1
MAY-12 : 1
JUN-12 : 1
One option would be to generate the months separately in another query and join that to your data table (note that I'm assuming that you intended customer A to have an end-date of June 30, 2012 since there is no June 31).
with mnths as(
select add_months( date '2012-01-01', level - 1 ) mnth
from dual
connect by level <= 6 ),
data as (
select date '2012-01-01' start_date, date '2012-06-30' end_date, 'a' customer from dual union all
select date '2012-01-01', date '2012-01-31', 'b' from dual union all
select date '2012-02-01', date '2012-03-31', 'c' from dual
)
select mnths.mnth, count(*)
from data,
mnths
where mnths.mnth between data.start_date and data.end_date
group by mnths.mnth
order by mnths.mnth
Output:
MNTH COUNT(*)
--------- ----------
01-JAN-12 2
01-FEB-12 2
01-MAR-12 2
01-APR-12 1
01-MAY-12 1
01-JUN-12 1
6 rows selected.
WITH TMP(monthyear,start_date,end_date,customer) AS (
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from data
union all
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from TMP
where LAST_DAY(end_date) >= LAST_DAY(start_date)
)
SELECT TO_CHAR(MonthYear, 'MON-YY') TheMonth,
Count(Customer) Customers
FROM TMP
GROUP BY MonthYear
ORDER BY MonthYear;
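The expected counts can be cross-checked with a short Python sketch that walks each customer's range month by month, which is the same expansion the recursive query performs. It is an illustration only: the names are hypothetical, and customer a's end date is read as 30 June 2012, as the first answer assumes.

```python
from datetime import date

# Sample rows from the question: (start_date, end_date, customer).
customers = [
    (date(2012, 1, 1), date(2012, 6, 30), "a"),  # 31-06-2012 read as 30 June
    (date(2012, 1, 1), date(2012, 1, 31), "b"),
    (date(2012, 2, 1), date(2012, 3, 31), "c"),
]

def customers_per_month(rows):
    """Count, for each month, the customers whose range touches that month."""
    counts = {}
    for start, end, _ in rows:
        year, month = start.year, start.month
        while (year, month) <= (end.year, end.month):
            key = date(year, month, 1)
            counts[key] = counts.get(key, 0) + 1
            year, month = year + month // 12, month % 12 + 1
    return dict(sorted(counts.items()))

counts = customers_per_month(customers)
for month, n in counts.items():
    print(month, n)
```

For the sample data the counts are 2 for January to March and 1 for April to June, matching the result asked for in the question.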