Improve query to be less repetitive - sql

Is there a way to improve this query? I see two problems here:
- Repetitive code
- Hard-coded strings
The first CTE calculates count based on 18 months. The second CTE calculates count based on 12 months.
with month_18 as (
select proc_cd, count(*) as month_18 from
(
select distinct patient, proc_cd from
service
where proc_cd = '35'
and month_id >= (select month_id from annual)
and month_id <= '202009' --This month should be 18 months from the month above
and length(patient) > 1
) a
group by proc_cd
),
month_12 as
(
select proc_cd, count(*) as month_12 from
(
select distinct patient, proc_cd from
service
where proc_cd = '35'
and month_id >= '201910'
and month_id <= '202009' --This month should be 12 months from the month above
and length(patient) > 1
) a
group by proc_cd
)
select a.*, b.month_12 from
month_18 a
join month_12 b
on a.proc_cd = b.proc_cd

If I understand correctly, you can use conditional aggregation:
select proc_cd,
count(distinct patient) filter (where month_id >= (select month_id from annual) and month_id <= '202009') as month_18,
count(distinct patient) filter (where month_id >= '201910' and month_id <= '202009') as month_12
from service
where proc_cd = '35' and
length(patient) > 1
group by proc_cd;
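To make the idea of conditional aggregation concrete, here is a minimal Python sketch of the same logic: one pass over the rows, with each window collecting its own set of distinct patients. The function name, the sample rows and the '201904' lower bound (standing in for the value read from annual) are assumptions for illustration only.

```python
from collections import defaultdict

def windowed_distinct_counts(rows, windows):
    """Count distinct patients per proc_cd for several month windows in one pass.

    rows:    iterable of (patient, proc_cd, month_id) with month_id as 'YYYYMM'
    windows: {name: (low, high)} with inclusive 'YYYYMM' bounds
    """
    seen = defaultdict(lambda: {name: set() for name in windows})
    for patient, proc_cd, month_id in rows:
        per_window = seen[proc_cd]  # the group exists even if no window matches
        for name, (low, high) in windows.items():
            # 'YYYYMM' strings compare correctly as plain strings
            if low <= month_id <= high:
                per_window[name].add(patient)
    return {proc: {name: len(s) for name, s in per_window.items()}
            for proc, per_window in seen.items()}

# Hypothetical sample rows, not taken from the question.
rows = [
    ("p1", "35", "201905"),
    ("p1", "35", "202001"),
    ("p2", "35", "201911"),
    ("p3", "35", "201904"),
    ("p4", "35", "201812"),
]
windows = {"month_18": ("201904", "202009"), "month_12": ("201910", "202009")}
counts = windowed_distinct_counts(rows, windows)
print(counts)
```

Each window is evaluated against the same scan of the data, which is what removes the need for two near-identical CTEs.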
If you have to deal with date arithmetic on the month ids, you can convert to a date, do the arithmetic and convert back to a string:
select to_char(to_date(month_id, 'YYYYMM') - interval '12 month', 'YYYYMM')
from (values ('202009')) v(month_id);
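The same month arithmetic can be sketched outside the database, which may help when the bounds are computed by the calling code instead of being hard-coded. This is only an illustration; shift_month_id is a hypothetical helper name.

```python
def shift_month_id(month_id: str, delta_months: int) -> str:
    """Shift a 'YYYYMM' month id by delta_months (negative values go back in time)."""
    year, month = int(month_id[:4]), int(month_id[4:6])
    # Use a zero-based month count so divmod handles the year rollover.
    total = year * 12 + (month - 1) + delta_months
    new_year, new_month0 = divmod(total, 12)
    return f"{new_year:04d}{new_month0 + 1:02d}"

end = "202009"
# An inclusive N-month window ending at `end` starts N - 1 months earlier.
start_12 = shift_month_id(end, -11)
start_18 = shift_month_id(end, -17)
print(start_12, start_18)
```

Note that an inclusive 12-month window ending in 202009 starts at 201910, which matches the bound used in the question.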


Fill missing months in a SELECT query

I'm trying to fill missing months in a SELECT query.
It looks like this :
SELECT sl.loonperiode_dt, (sum(slr.uren)) code_220
FROM HR.soc_loonbrief_regels slr,
HR.soc_loonbrieven sl,
HR.werknemers w,
HR.v_kontrakten vk
WHERE sl.loonperiode_dt BETWEEN '01012018' AND '01122018'
AND slr.loon_code_id IN (394)
AND slr.loonbrief_id = sl.loonbrief_id
AND w.werknemer_id = sl.werknemer_id
AND w.werknemer_id = vk.werknemer_id
AND vk.functie_id IN (121, 122, 128)
AND sl.loonperiode_dt BETWEEN hist_start_dt AND last_day(nvl(hist_eind_dt, sl.loonperiode_dt))
AND w.afdeling_id like '961'
GROUP BY sl.loonperiode_dt
ORDER BY sl.loonperiode_dt
It outputs this table :
31/01/18 234
30/04/18 245,8
31/05/18 714,6
31/07/18 288,04
31/08/18 281
30/11/18 515,12
I obviously would like it to be like that :
31/01/18 234
28/02/18 0
31/03/18 0
30/04/18 245,8
31/05/18 714,6
30/06/18 0
31/07/18 288,04
31/08/18 281
30/09/18 0
31/10/18 0
30/11/18 515,12
31/12/18 0
I have a calendar table 'CONV_HC.calendar' with dates in a column named 'DAT'.
I have seen many questions and answers about this, but I can't figure out how to apply the LEFT JOIN method or any other one to my current problem.
Thanks a lot in advance,
You could join against a ready-made table of months and group by the date, or you can build one with a subquery or a WITH clause, something like:
WITH Months (month) AS (
SELECT 1 AS Month FROM DUAL
UNION ALL
SELECT MONTH + 1
FROM Months
WHERE MONTH < 12
)
SELECT *
FROM Months
LEFT JOIN SomeTable
ON SomeTable.month = Months.MONTH
--ON Extract(MONTH FROM SomeTable.date) = Months.MONTH
edit
A better example:
--Just to simulate some table data
WITH SomeData AS (
SELECT TO_DATE('01/01/2019', 'MM/DD/YYYY') AS Dat, 5 AS Value FROM dual
UNION ALL
SELECT TO_DATE('01/05/2019', 'MM/DD/YYYY') AS Dat, 7 AS Value FROM dual
UNION ALL
SELECT TO_DATE('03/03/2019', 'MM/DD/YYYY') AS Dat, 2 AS Value FROM dual
UNION ALL
SELECT TO_DATE('11/05/2019', 'MM/DD/YYYY') AS Dat, 9 AS Value FROM dual
)
, Months (StartDate, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2019', 'MM/DD/YYYY') AS DATE) AS StartDate, 2019 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(StartDate, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(StartDate, 1)) <= MaxYear
)
SELECT
Months.StartDate AS Dat
, SUM(SomeData.Value) AS SumValue
FROM Months
LEFT JOIN SomeData
ON Extract(MONTH FROM SomeData.Dat) = Extract(MONTH FROM Months.StartDate)
GROUP BY
Months.StartDate
Edit:
You won't find a copy-and-paste solution; you need to take the idea from it and adapt it to your context.
Let's try this. You can add the missing months in the application, or you can JOIN against a ready-made table of months. It doesn't need to be a real table; you can build one, and the WITH clause is an example of that. So let's get every month of 2019, at its last day:
--Getting the last day of every month for 2019
WITH Months (CurrentMonth, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2019', 'MM/DD/YYYY') AS DATE) AS CurrentMonth, 2019 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(CurrentMonth, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(CurrentMonth, 1)) <= MaxYear
)
SELECT LAST_DAY(Months.CurrentMonth) AS LastDay
FROM Months
OK, now we have all months available for the join. Your query already computes the sum, so let's skip that and just use your data. Just add another WITH query.
--Getting the last day of every month for 2018
WITH Months (CurrentMonth, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2018', 'MM/DD/YYYY') AS DATE) AS CurrentMonth, 2018 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(CurrentMonth, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(CurrentMonth, 1)) <= MaxYear
)
, YourData as (
SELECT sl.loonperiode_dt, (sum(slr.uren)) code_220
FROM HR.soc_loonbrief_regels slr,
HR.soc_loonbrieven sl,
HR.werknemers w,
HR.v_kontrakten vk
WHERE sl.loonperiode_dt BETWEEN '01012018' AND '01122018'
AND slr.loon_code_id IN (394)
AND slr.loonbrief_id = sl.loonbrief_id
AND w.werknemer_id = sl.werknemer_id
AND w.werknemer_id = vk.werknemer_id
AND vk.functie_id IN (121, 122, 128)
AND sl.loonperiode_dt BETWEEN hist_start_dt AND last_day(nvl(hist_eind_dt, sl.loonperiode_dt))
AND w.afdeling_id like '961'
GROUP BY sl.loonperiode_dt
--ORDER BY sl.loonperiode_dt
)
SELECT
LAST_DAY(Months.CurrentMonth) AS LastDay
, COALESCE(YourData.code_220, 0) AS code_220
FROM Months
Left Join YourData
on Extract(MONTH FROM Months.CurrentMonth) = Extract(MONTH FROM YourData.loonperiode_dt)
--If you have more years: AND Extract(YEAR FROM Months.CurrentMonth) = Extract(YEAR FROM YourData.loonperiode_dt)
ORDER BY LastDay ASC
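The "generate every month, then left join" idea can also be sketched in a few lines of Python, which may make the shape of the result easier to see. The totals below are the values from the question's output; the function names are hypothetical, and the sketch matches on the full month-end date instead of the month number.

```python
import calendar
from datetime import date

def month_ends(year):
    """Last day of each month of `year`, playing the role of the Months CTE."""
    return [date(year, m, calendar.monthrange(year, m)[1]) for m in range(1, 13)]

def fill_missing_months(totals, year):
    """Left-join the generated months to the totals, defaulting to 0 (like COALESCE)."""
    return [(d, totals.get(d, 0)) for d in month_ends(year)]

# Totals as shown in the question's current output.
totals = {
    date(2018, 1, 31): 234,
    date(2018, 4, 30): 245.8,
    date(2018, 5, 31): 714.6,
    date(2018, 7, 31): 288.04,
    date(2018, 8, 31): 281,
    date(2018, 11, 30): 515.12,
}
filled = fill_missing_months(totals, 2018)
for day, value in filled:
    print(day.strftime("%d/%m/%y"), value)
```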

get list of student with attendance min15days in a month and come for continuous 4 months in a year

I need a query to get the list of students who attended their class for at least 15 days in a month for 4 consecutive months.
The table may look like:
studentid monthyear attendance
1 Apr2018 16
1 May2018 23
1 Jun2018 18
1 Jul2018 16
1 Aug2018 25
2 Apr2018 2
2 May2018 15
and so on...
Db fiddle
Try this query:
select @rn := 0;
select studentid from (
select studentid, month(dt) - (@rn := @rn + 1) grp from (
select * ,
str_to_date(concat('01 ', insert(monthyear, 4, 0, ' ')), '%d %M %Y') dt
from tbl
where attendance >= 15 -- only those records where attendance is at least 15
) a where year(dt) = 2018 -- particular year
order by studentid,dt
) a group by studentid,grp having count(*) >= 4
Demo - I expanded your data with some more cases :)
The idea is simple: if a student has attended for some consecutive months, those months increment by one, just like the row number. So I used the difference between the month and the row number: for consecutive months the difference is constant, so it is enough to group by that difference and keep the groups where the count is >= 4 :)
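The "month minus row number" trick described above can be sketched in Python to show why the difference stays constant inside a run. Students 1 and 2 use the rows from the question; student 3 is a hypothetical extra case with a gap, and the sketch assumes at most one row per student per month.

```python
from itertools import groupby

def ym(year, month):
    """Month index that keeps increasing across year boundaries."""
    return year * 12 + month

def students_with_streak(rows, min_days=15, streak=4):
    """rows: (studentid, month_index, attendance) tuples."""
    qualifying = sorted((s, m) for s, m, a in rows if a >= min_days)
    result = set()
    for student, group in groupby(qualifying, key=lambda r: r[0]):
        months = [m for _, m in group]
        # month - row number is constant within a run of consecutive months
        keys = [m - i for i, m in enumerate(months)]
        for _, run in groupby(keys):
            if len(list(run)) >= streak:
                result.add(student)
    return sorted(result)

rows = [
    (1, ym(2018, 4), 16), (1, ym(2018, 5), 23), (1, ym(2018, 6), 18),
    (1, ym(2018, 7), 16), (1, ym(2018, 8), 25),
    (2, ym(2018, 4), 2), (2, ym(2018, 5), 15),
    # Hypothetical student: five qualifying months, but July is missing.
    (3, ym(2018, 4), 20), (3, ym(2018, 5), 20), (3, ym(2018, 6), 20),
    (3, ym(2018, 8), 20), (3, ym(2018, 9), 20),
]
streak_students = students_with_streak(rows)
print(streak_students)
```

Student 3 has five qualifying months but never four in a row, so only student 1 is returned.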
UPDATE
For SQL Server:
select studentid from (
select studentid, month(dt) - row_number() over (order by studentid, dt) grp from (
select * ,
cast(concat('01 ', stuff(monthyear, 4, 0, ' ')) as date) dt
from tbl
where attendance >= 15 --only those records where attendance is at least 15
) a where year(dt) = 2018 --particular year
) a group by studentid, grp having count(*) >= 4
SQL Server demo
In general, a simple self join that catches the difference in months would suffice.
In this case, a conversion of the monthyear column is required in the join condition itself.
The query, without the conversion:
SELECT t1.studentid, count(*) as cnt
FROM
table t1
INNER JOIN table t2 ON t1.studentid = t2.studentid AND
t2.attendance >= 15
AND t1.monthyear BETWEEN (t2.monthyear - 3) AND t2.monthyear
WHERE
t1.attendance >= 15
GROUP BY
t1.studentid
HAVING
count(*) >=4
The conversion is as follows:
STR_TO_DATE(
CONCAT(SUBSTR(t1.monthyear, 1, LENGTH(t1.monthyear) - 4), ' ', RIGHT(t1.monthyear, 4)), '%M %Y')
so the query should be:
SELECT t1.studentid, count(*) as cnt
FROM
table t1
INNER JOIN table t2 ON t1.studentid = t2.studentid AND
t2.attendance >= 15
AND STR_TO_DATE(
CONCAT(SUBSTR(t1.monthyear, 1, LENGTH(t1.monthyear) - 4), ' ', RIGHT(t1.monthyear, 4)), '%M %Y') BETWEEN DATE_SUB(STR_TO_DATE(
CONCAT(SUBSTR(t2.monthyear, 1, LENGTH(t2.monthyear) - 4), ' ', RIGHT(t2.monthyear, 4)), '%M %Y'), INTERVAL 3 MONTH) AND STR_TO_DATE(
CONCAT(SUBSTR(t2.monthyear, 1, LENGTH(t2.monthyear) - 4), ' ', RIGHT(t2.monthyear, 4)), '%M %Y')
WHERE
t1.attendance >= 15
GROUP BY
t1.studentid
HAVING
count(*) >=4
I think this is the simplest method:
select distinct studentid
from (select t.*, cast(monthyear as date) as my,
lag(cast(monthyear as date), 3) over (partition by studentid order by cast(monthyear as date)) as prev_my
from tbl t
where attendance >= 15
) t
where prev_my = dateadd(month, -3, my);
Here is a db<>fiddle.
The logic is pretty simple:
Only consider rows that satisfy the attendance criterion.
Use LAG() to look at the 3rd record in past.
If all months meet the attendance criterion, then this will be exactly 3 months before.
The select distinct is because you want students, not the specific periods.
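The steps above can be sketched in Python as well: after filtering, look three qualifying rows back and check whether that row is exactly three months earlier. Students 1 and 2 come from the question; student 3 is a hypothetical case with a gap, and the names are illustrative only.

```python
def ym(year, month):
    """Month index that keeps increasing across year boundaries."""
    return year * 12 + month

def students_by_lag(rows, min_days=15, lag=3):
    """rows: (studentid, month_index, attendance) tuples."""
    by_student = {}
    for student, month, attendance in rows:
        if attendance >= min_days:  # only rows meeting the attendance criterion
            by_student.setdefault(student, []).append(month)
    result = []
    for student, months in by_student.items():
        months.sort()
        # months[i - lag] plays the role of LAG(my, 3) over the ordered rows
        if any(months[i] - months[i - lag] == lag for i in range(lag, len(months))):
            result.append(student)
    return sorted(result)

rows = [
    (1, ym(2018, 4), 16), (1, ym(2018, 5), 23), (1, ym(2018, 6), 18),
    (1, ym(2018, 7), 16), (1, ym(2018, 8), 25),
    (2, ym(2018, 4), 2), (2, ym(2018, 5), 15),
    # Hypothetical student: July is missing, so no four-month run exists.
    (3, ym(2018, 4), 20), (3, ym(2018, 5), 20), (3, ym(2018, 6), 20),
    (3, ym(2018, 8), 20), (3, ym(2018, 9), 20),
]
lag_students = students_by_lag(rows)
print(lag_students)
```

If the row three positions back is exactly three months earlier, the two rows in between must be the two intervening months, which is why a single comparison is enough.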

SQL count display 0s using 1 table

I have a table called request, and I'm looking to count how many requests have a non-null rideId for each day of the last week. I have the following query:
Select count(*), dayname(time) as Day
from request
where time >= (select current_timestamp - interval 7 day) and rideId is not null
group by dayname(time)
order by dayofweek(Day);
How can I make it also show the days where there is no request with a rideId, with a count of 0?
Table is: Request(userId, time, rideId)
Move the not null check into your count, and join to a calendar table to bring in the missing days.
SELECT
t1.dname,
COALESCE(t2.numRides, 0) AS numRides
FROM
(
SELECT 'Monday' AS dname, 2 AS dow UNION ALL
SELECT 'Tuesday', 3 UNION ALL
SELECT 'Wednesday', 4 UNION ALL
SELECT 'Thursday', 5 UNION ALL
SELECT 'Friday', 6 UNION ALL
SELECT 'Saturday', 7 UNION ALL
SELECT 'Sunday', 1
) t1
LEFT JOIN
(
SELECT DAYNAME(time) AS dname, COUNT(rideId) AS numRides
FROM request
WHERE time >= DATE_SUB(CURDATE(),INTERVAL 7 DAY)
GROUP BY DAYNAME(time)
) t2
ON t1.dname = t2.dname
ORDER BY t1.dow;
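As a rough illustration of why the fixed list of day names matters, here is the same logic in Python: count only rows with a rideId, then walk the full list of days and default to 0. The sample requests and the function name are made up for the example, and the list is ordered Monday first instead of by MySQL's DAYOFWEEK numbering.

```python
from collections import Counter
from datetime import date, datetime, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def rides_per_weekday(requests, today):
    """requests: (user_id, time, ride_id) tuples; ride_id may be None."""
    # Midnight seven days ago, mirroring DATE_SUB(CURDATE(), INTERVAL 7 DAY).
    cutoff = datetime.combine(today - timedelta(days=7), datetime.min.time())
    counts = Counter(
        DAY_NAMES[t.weekday()]
        for _, t, ride_id in requests
        if t >= cutoff and ride_id is not None  # COUNT(rideId) skips NULLs
    )
    # The fixed day list is the "calendar table"; missing days default to 0.
    return [(name, counts.get(name, 0)) for name in DAY_NAMES]

# Hypothetical sample data.
requests = [
    (1, datetime(2024, 1, 8, 9, 0), 11),    # Monday, has a ride
    (2, datetime(2024, 1, 8, 10, 0), None), # Monday, no ride
    (3, datetime(2024, 1, 9, 9, 0), 12),    # Tuesday, has a ride
    (4, datetime(2023, 12, 25, 9, 0), 13),  # older than the cutoff
]
weekly = rides_per_weekday(requests, date(2024, 1, 10))
print(weekly)
```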
Select a.day, coalesce(b.cnt, 0) as cnt
from (
-- select all days here
) a
left join
(select dayname(time) as day, count(*) as cnt
from request
where some_condition
group by day) b
on a.day = b.day
order by a.day;

Total Count of Active Employees by Date

I have in the past written queries that give me counts by date (hires, terminations, etc...) as follows:
SELECT per.date_start AS "Date",
COUNT(peo.EMPLOYEE_NUMBER) AS "Hires"
FROM hr.per_all_people_f peo,
hr.per_periods_of_service per
WHERE per.date_start BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE
AND per.date_start BETWEEN :PerStart AND :PerEnd
AND per.person_id = peo.person_id
GROUP BY per.date_start
I was now looking to create a count of active employees by date, however I am not sure how I would date the query as I use a range to determine active as such:
SELECT COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo
WHERE peo.current_employee_flag = 'Y'
and TRUNC(sysdate) BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE
Here is a simple way to get started. This works for all the effective and end dates in your data:
select thedate,
SUM(num) over (order by thedate) as numActives
from ((select effective_start_date as thedate, 1 as num from hr.per_periods_of_service) union all
(select effective_end_date as thedate, -1 as num from hr.per_periods_of_service)
) dates
It works by adding one person for each start and subtracting one for each end (via num) and doing a cumulative sum. This might have duplicates dates, so you might also do an aggregation to eliminate those duplicates:
select thedate, max(numActives)
from (select thedate,
SUM(num) over (order by thedate) as numActives
from ((select effective_start_date as thedate, 1 as num from hr.per_periods_of_service) union all
(select effective_end_date as thedate, -1 as num from hr.per_periods_of_service)
) dates
) t
group by thedate;
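The "+1 on start, -1 on end, then cumulative sum" idea can be sketched in Python as follows. The periods are hypothetical, the function name is an assumption, and open-ended periods (end of None) simply contribute no -1, which avoids counting a NULL end date as a termination.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

def active_counts(periods):
    """periods: (start, end) date pairs; end may be None for still-active rows.

    Returns [(date, number_active)] for every date on which the count changes.
    """
    delta = defaultdict(int)
    for start, end in periods:
        delta[start] += 1      # one more active person from this date
        if end is not None:
            delta[end] -= 1    # one fewer from the end date onward
    dates = sorted(delta)
    running = accumulate(delta[d] for d in dates)  # cumulative sum by date
    return list(zip(dates, running))

# Hypothetical periods of service.
periods = [
    (date(2012, 4, 1), date(2012, 4, 10)),
    (date(2012, 4, 5), None),
    (date(2012, 4, 5), date(2012, 4, 20)),
]
timeline = active_counts(periods)
for day, active in timeline:
    print(day, active)
```

Summing the deltas per date before accumulating gives one row per date, which is what the extra aggregation in the SQL achieves.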
If you really want all dates, then it is best to start with a calendar table, and use a simple variation on your original query:
select c.thedate, count(pos.effective_start_date) as NumActives -- count(*) would report 1 for dates with no match
from calendar c left outer join
hr.per_periods_of_service pos
on c.thedate between pos.effective_start_date and pos.effective_end_date
group by c.thedate;
If you want to count all employees who were active during the entire input date range:
SELECT COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo
WHERE peo.EFFECTIVE_START_DATE <= :StartDate
AND (peo.EFFECTIVE_END_DATE IS NULL OR peo.EFFECTIVE_END_DATE >= :EndDate)
Here is my example based on Gordon Linoff's answer, with a small modification: in the subtracting branch every record got -1 in NUM even when its end date was NULL, so those rows are now filtered out.
use AdventureWorksDW2012 --selects the database to work with in MS SSMS
-- and may not work on other platforms
select
t.thedate
,max(t.numActives) AS "Total Active Employees"
from (
select
dates.thedate
,SUM(dates.num) over (order by dates.thedate) as numActives
from
(
(
select
StartDate as thedate
,1 as num
from DimEmployee
)
union all
(
select
EndDate as thedate
,-1 as num
from DimEmployee
where EndDate IS NOT NULL
)
) AS dates
) AS t
group by thedate
ORDER BY thedate
Worked for me; hope it helps somebody.
I was able to get the results I was looking for with the following:
--Active Team Members by Date
SELECT "a_date",
COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo,
(SELECT DATE '2012-04-01'-1 + LEVEL AS "a_date"
FROM dual
CONNECT BY LEVEL <= DATE '2012-04-30'+2 - DATE '2012-04-01'-1
)
WHERE peo.current_employee_flag = 'Y'
AND "a_date" BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE
GROUP BY "a_date"
ORDER BY "a_date"

sql to find row for min date in each month

I have a table, let's say "Records", with this structure:
id date
-- ----
1 2012-08-30
2 2012-08-29
3 2012-07-25
I need to write an SQL query in PostgreSQL to get record_id for MIN date in each month.
month record_id
----- ---------
8 2
7 3
As we can see, 2012-08-29 < 2012-08-30 and both fall in month 8, so we should show record_id = 2.
I tried something like this,
SELECT
EXTRACT(MONTH FROM date) as month,
record_id,
MIN(date)
FROM Records
GROUP BY 1,2
but it shows 3 records.
Can anybody help?
SELECT DISTINCT ON (EXTRACT(MONTH FROM date))
id,
date
FROM Records1
ORDER BY EXTRACT(MONTH FROM date),date
SQLFiddle http://sqlfiddle.com/#!12/76ca2/3
UPD: This query:
1) Orders the records by month and date
2) For every month picks the first record (the first record has MIN(date) because of ordering)
Details here http://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT
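The two steps above (order, then keep the first row of each group) can be mimicked in Python to show what DISTINCT ON is doing. The records are the three rows from the question; the function name is hypothetical.

```python
from datetime import date

def first_record_per_month(records):
    """records: (id, date) pairs. Returns {month: id of the earliest date}."""
    picked = {}
    # 1) order by month, then by date
    for rec_id, d in sorted(records, key=lambda r: (r[1].month, r[1])):
        # 2) keep only the first row seen for each month
        picked.setdefault(d.month, rec_id)
    return picked

records = [
    (1, date(2012, 8, 30)),
    (2, date(2012, 8, 29)),
    (3, date(2012, 7, 25)),
]
earliest = first_record_per_month(records)
print(earliest)
```

As in the SQL, grouping by month number alone merges the same month from different years; grouping by (year, month) would keep them apart.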
This will return multiples if you have duplicate minimum dates:
Select
minbymonth.Month,
r.record_id
From (
Select
Extract(Month From date) As Month,
Min(date) As Date
From
records
Group By
Extract(Month From date)
) minbymonth
Inner Join
records r
On minbymonth.date = r.date
Order By
1;
Or if you have CTEs
With MinByMonth As (
Select
Extract(Month From date) As Month,
Min(date) As Date
From
records
Group By
Extract(Month From date)
)
Select
m.Month,
r.record_id
From
MinByMonth m
Inner Join
Records r
On m.date = r.date
Order By
1;
http://sqlfiddle.com/#!1/2a054/3
select extract(month from date)
, record_id
, date
from
(
select
record_id
, date
, rank() over (partition by extract(month from date) order by date asc) r
from records
) x
where r=1
order by date
SQL Fiddle
select distinct on (date_trunc('month', date))
date_trunc('month', date) as month,
id,
date
from records
order by 1, 3
I think you need to use a subquery, something like this:
SELECT
EXTRACT(MONTH FROM r.date) as month,
r.record_id
FROM Records as r
INNER JOIN (
SELECT
EXTRACT(MONTH FROM date) as month,
MIN(date) as mindate
FROM Records
GROUP BY EXTRACT(MONTH FROM date)
) as sub on EXTRACT(MONTH FROM r.date) = sub.month and r.date = sub.mindate