Fill missing months in a SELECT query

Fill missing months in a SELECT query - sql

I'm trying to fill missing months in a SELECT query.
It looks like this :
SELECT sl.loonperiode_dt, (sum(slr.uren)) code_220
FROM HR.soc_loonbrief_regels slr,
HR.soc_loonbrieven sl,
HR.werknemers w,
HR.v_kontrakten vk
WHERE sl.loonperiode_dt BETWEEN '01012018' AND '01122018'
AND slr.loon_code_id IN (394)
AND slr.loonbrief_id = sl.loonbrief_id
AND w.werknemer_id = sl.werknemer_id
AND w.werknemer_id = vk.werknemer_id
AND vk.functie_id IN (121, 122, 128)
AND sl.loonperiode_dt BETWEEN hist_start_dt AND last_day(nvl(hist_eind_dt, sl.loonperiode_dt))
AND w.afdeling_id like '961'
GROUP BY sl.loonperiode_dt
ORDER BY sl.loonperiode_dt
It outputs this table :
31/01/18 234
30/04/18 245,8
31/05/18 714,6
31/07/18 288,04
31/08/18 281
30/11/18 515,12
I obviously would like it to be like that :
31/01/18 234
28/02/18 0
31/03/18 0
30/04/18 245,8
31/05/18 714,6
30/06/18 0
31/07/18 288,04
31/08/18 281
30/09/18 0
31/10/18 0
30/11/18 515,12
31/12/18 0
I have a calendar table 'CONV_HC.calendar' with dates in a column named 'DAT'.
I have seen many questions and answers about this, but I can't figure out how to apply the LEFT JOIN method or any other one to my current problem.
Thanks a lot in advance,

You could have a already done table with months and "join" with it, group by the date, or you can create one with subquery or using a with statement, something like
WITH Months (month) AS (
SELECT 1 AS Month FROM DUAL
UNION ALL
SELECT MONTH + 1
FROM Months
WHERE MONTH < 12
)
SELECT *
FROM Months
LEFT JOIN SomeTable
ON SomeTable.month = Months.MONTH
--ON Extract(MONTH FROM SomeTable.date) = Months.MONTH
edit
A better example:
--Just to simulate some table data
WITH SomeData AS (
SELECT TO_DATE('01/01/2019', 'MM/DD/YYYY') AS Dat, 5 AS Value FROM dual
UNION ALL
SELECT TO_DATE('01/05/2019', 'MM/DD/YYYY') AS Dat, 7 AS Value FROM dual
UNION ALL
SELECT TO_DATE('03/03/2019', 'MM/DD/YYYY') AS Dat, 2 AS Value FROM dual
UNION ALL
SELECT TO_DATE('11/05/2019', 'MM/DD/YYYY') AS Dat, 9 AS Value FROM dual
)
, Months (StartDate, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2019', 'MM/DD/YYYY') AS DATE) AS StartDate, 2019 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(StartDate, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(StartDate, 1)) <= MaxYear
)
SELECT
Months.StartDate AS Dat
, SUM(SomeData.Value) AS SumValue
FROM Months
LEFT JOIN SomeData
ON Extract(MONTH FROM SomeData.Dat) = Extract(MONTH FROM Months.StartDate)
GROUP BY
Months.StartDate
edit
You won't find a just copy past solution, you need to get the idea from it and change to your context.
let's try this. You can "add" the missing months in an APP, or you can JOIN it with a already done table, doesn't need to be a real table, you can make one. The with statement is an example of it. So lets get all month, at the last day for 2019:
--Geting the last day of every month for 2019
WITH Months (CurrentMonth, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2019', 'MM/DD/YYYY') AS DATE) AS CurrentMonth, 2019 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(CurrentMonth, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(CurrentMonth, 1)) <= MaxYear
)
SELECT LAST_DAY(Months.CurrentMonth) AS LastDay
FROM Months
Ok, now we have all months avaliable for the join. In your query, you already have the sum done so lets skip the sum and just use your data. Just add another with query.
--Geting the last day of every month for 2018
WITH Months (CurrentMonth, MaxYear) AS (
SELECT CAST(TO_DATE('01/01/2018', 'MM/DD/YYYY') AS DATE) AS CurrentMonth, 2018 AS MaxYear FROM DUAL
UNION ALL
SELECT CAST(ADD_MONTHS(CurrentMonth, 1) AS DATE), MaxYear
FROM Months
WHERE EXTRACT(YEAR FROM ADD_MONTHS(CurrentMonth, 1)) <= MaxYear
)
, YourData as (
SELECT sl.loonperiode_dt, (sum(slr.uren)) code_220
FROM HR.soc_loonbrief_regels slr,
HR.soc_loonbrieven sl,
HR.werknemers w,
HR.v_kontrakten vk
WHERE sl.loonperiode_dt BETWEEN '01012018' AND '01122018'
AND slr.loon_code_id IN (394)
AND slr.loonbrief_id = sl.loonbrief_id
AND w.werknemer_id = sl.werknemer_id
AND w.werknemer_id = vk.werknemer_id
AND vk.functie_id IN (121, 122, 128)
AND sl.loonperiode_dt BETWEEN hist_start_dt AND last_day(nvl(hist_eind_dt, sl.loonperiode_dt))
AND w.afdeling_id like '961'
GROUP BY sl.loonperiode_dt
--ORDER BY sl.loonperiode_dt
)
SELECT
LAST_DAY(Months.CurrentMonth) AS LastDay
, COALESCE(YourData.code_220, 0) AS code_220
FROM Months
Left Join YourData
on Extract(MONTH FROM Months.CurrentMonth) = Extract(MONTH FROM YourData.loonperiode_dt)
--If you have more years: AND Extract(YEAR FROM Months.CurrentMonth) = Extract(YEAR FROM YourData.loonperiode_dt)
ORDER BY LastDay ASC

Related

Improve query to be less repetitive

Is there a way to improve this query? I see two problems here -
Repetitive code
Hard coded strings
The first CTE calculates count based on 18 months. The second CTE calculates count based on 12 months.
with month_18 as (
select proc_cd, count(*) as month_18 from
(
select distinct patient, proc_cd from
service
where proc_cd = '35'
and month_id >= (select month_id from annual)
and month_id <= '202009' --This month should be 18 months from the month above
and length(patient) > 1
) a
group by proc_cd
),
month_12 as
(
select proc_cd, count(*) as month_12 from
(
select distinct patient_id, proc_cd from
service
where proc_cd = '35'
and month_id >= '201910'
and month_id <= '202009' --This month should be 12 months from the month above
and length(patient) > 1
) a
group by proc_cd
)
select a.*, b.month_12 from
month_18 a
join month_12 b
on a.proc_cd = b.proc_cd

If I understand correctly, you can use conditional aggregation:
select proc_cd,
count(distinct patient) filter (where month_id >= (select month_id from annual) and month_id <= '202009') as month_18,
count(distinct patient) filter (where month_id >= '201910' and month_id <= '202009')
from service
where proc_cd = 35 and
length(patient) > 1
group by proc_cd;
If you have to deal with date arithmetic on the month ids, you can convert to a date, do the arithmetic and convert back to a string:
select to_char(to_date(month_id, 'YYYYMM') - interval '12 month', 'YYYYMM')
from (values ('202009')) v(month_id);

Last 10 weeks in SQL Server

I'm using a Procedure that returns the turnover of the stores by weeks:
https://i.ibb.co/N3sP2Jp/1.png
I want just the last 10 weeks, from current week.
And the same for the previous year.
SELECT
DATENAME(WEEK, [GP_DATEPIECE]) AS [WEEK],
[et_libelle] AS [STORE NAME],
SUM(TOTALTTC) AS [TU],
SUM(TOTALTTC) AS [TU -1],
GROUP BY
[et_libelle],
DATENAME(WEEK, [GP_DATEPIECE])

THIS ANSWER THE ORIGINAL VERSION OF THE QUESTION.
To get the last 10 weeks in the data, you can do:
where datepiece >= dateadd(week, datediff(week, 0, getdate()) - 10, 0)

I have updated the script as per your requirement and I hope we are now so close to your requirement now. The output from the below query is a raw rows for your data comparison. You can now apply aggregation on your result set by applying GROUP by on YR (for whole years data comparison) and even you can add WK number in GROUP by so that you can compare data between Year and Week wise.
Note: As I applied ISO_Week, Week Starts at Monday and Ends at Sunday.
WITH CTE (COMMON,DayMinus)
AS
(
SELECT 1,0 UNION ALL
SELECT 1,1 UNION ALL
SELECT 1,2 UNION ALL
SELECT 1,3 UNION ALL
SELECT 1,4 UNION ALL
SELECT 1,5 UNION ALL
SELECT 1,6 UNION ALL
SELECT 1,7 UNION ALL
SELECT 1,8 UNION ALL
SELECT 1,9
)
SELECT YEAR(your_date_column) YR,
DATEPART(ISO_WEEK, your_date_column) WK,
*
FROM your_table
WHERE YEAR(your_date_column) IN (2019,2018)
AND DATEPART(ISO_WEEK, your_date_column) IN
(
SELECT A.WKNUM-CTE.DayMinus AS [WEEK NUMBER]
FROM CTE
INNER JOIN (
SELECT 1 AS COMMON,DATENAME(ISO_WEEK,GETDATE()) WKNUM
) A ON CTE.COMMON = A.COMMON
)

SQL counting days with gap / overlapping

I am working on a "counting days" problem almost identical to this one. I have a list of date(s), and need to count how many days used excluding duplicate, and handling the gaps. Same input and output.
From: Markus Jarderot
Input
ID d1 d2
1 2011-08-01 2011-08-08
1 2011-08-02 2011-08-06
1 2011-08-03 2011-08-10
1 2011-08-12 2011-08-14
2 2011-08-01 2011-08-03
2 2011-08-02 2011-08-06
2 2011-08-05 2011-08-09
Output
ID hold_days
1 11
2 8
SQL to find time elapsed from multiple overlapping intervals
But for the life of me I couldn't understand Markus Jarderot's solution.
SELECT DISTINCT
t1.ID,
t1.d1 AS date,
-DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) AS n
FROM Orders t1
LEFT JOIN Orders t2 -- Join for any events occurring while this
ON t2.ID = t1.ID -- is starting. If this is a start point,
AND t2.d1 <> t1.d1 -- it won't match anything, which is what
AND t1.d1 BETWEEN t2.d1 AND t2.d2 -- we want.
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
Why is DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) picking from the min(d1) from the entire list? Is that regardless of ID.
And what does t1.d1 BETWEEN t2.d1 AND t2.d2 do? Is that to ensure only overlapped interval are calculated?
Same thing with group by, I think because if in the event the same identical period will be discarded? I tried to trace the solution by hand but getting more confused.

This is mostly a duplicate of my answer here (including explanation) but with the inclusion of grouping on an id column. It should use a single table scan and does not require a recursive sub-query factoring clause (CTE) or self joins.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE your_table ( id, usr, start_date, end_date ) AS
SELECT 1, 'A', DATE '2017-06-01', DATE '2017-06-03' FROM DUAL UNION ALL
SELECT 1, 'B', DATE '2017-06-02', DATE '2017-06-04' FROM DUAL UNION ALL -- Overlaps previous
SELECT 1, 'C', DATE '2017-06-06', DATE '2017-06-06' FROM DUAL UNION ALL
SELECT 1, 'D', DATE '2017-06-07', DATE '2017-06-07' FROM DUAL UNION ALL -- Adjacent to previous
SELECT 1, 'E', DATE '2017-06-11', DATE '2017-06-20' FROM DUAL UNION ALL
SELECT 1, 'F', DATE '2017-06-14', DATE '2017-06-15' FROM DUAL UNION ALL -- Within previous
SELECT 1, 'G', DATE '2017-06-22', DATE '2017-06-25' FROM DUAL UNION ALL
SELECT 1, 'H', DATE '2017-06-24', DATE '2017-06-28' FROM DUAL UNION ALL -- Overlaps previous and next
SELECT 1, 'I', DATE '2017-06-27', DATE '2017-06-30' FROM DUAL UNION ALL
SELECT 1, 'J', DATE '2017-06-27', DATE '2017-06-28' FROM DUAL UNION ALL -- Within H and I
SELECT 2, 'K', DATE '2011-08-01', DATE '2011-08-08' FROM DUAL UNION ALL -- Your data below
SELECT 2, 'L', DATE '2011-08-02', DATE '2011-08-06' FROM DUAL UNION ALL
SELECT 2, 'M', DATE '2011-08-03', DATE '2011-08-10' FROM DUAL UNION ALL
SELECT 2, 'N', DATE '2011-08-12', DATE '2011-08-14' FROM DUAL UNION ALL
SELECT 3, 'O', DATE '2011-08-01', DATE '2011-08-03' FROM DUAL UNION ALL
SELECT 3, 'P', DATE '2011-08-02', DATE '2011-08-06' FROM DUAL UNION ALL
SELECT 3, 'Q', DATE '2011-08-05', DATE '2011-08-09' FROM DUAL;
Query 1:
SELECT id,
SUM( days ) AS total_days
FROM (
SELECT id,
dt - LAG( dt ) OVER ( PARTITION BY id
ORDER BY dt ) + 1 AS days,
start_end
FROM (
SELECT id,
dt,
CASE SUM( value ) OVER ( PARTITION BY id
ORDER BY dt ASC, value DESC, ROWNUM ) * value
WHEN 1 THEN 'start'
WHEN 0 THEN 'end'
END AS start_end
FROM your_table
UNPIVOT ( dt FOR value IN ( start_date AS 1, end_date AS -1 ) )
)
WHERE start_end IS NOT NULL
)
WHERE start_end = 'end'
GROUP BY id
Results:
| ID | TOTAL_DAYS |
|----|------------|
| 1 | 25 |
| 2 | 13 |
| 3 | 9 |

The brute force method is to create all days (in a recursive query) and then count:
with dates(id, day, d2) as
(
select id, d1 as day, d2 from mytable
union all
select id, day + 1, d2 from dates where day < d2
)
select id, count(distinct day)
from dates
group by id
order by id;
Unfortunately there is a bug in some Oracle versions and recursive queries with dates don't work there. So try this code and see whether it works in your system. (I have Oracle 11.2 and the bug still exists there; so I guess you need Oracle 12c.)

I guess Markus' idea is to find all starting points that are not within other ranges and all ending points that aren't. Then just take the first starting point till the first ending point, then the next starting point till the next ending point, etc. As Markus isn't using a window function to number starting and ending points, he must find a more complicated way to achieve this. Here is the query with ROW_NUMBER. Maybe this gives you a start what to look for in Markus' query.
select startpoint.id, sum(endpoint.day - startpoint.day)
from
(
select id, d1 as day, row_number() over (partition by id order by d1) as rn
from mytable m1
where not exists
(
select *
from mytable m2
where m1.id = m2.id
and m1.d1 > m2.d1 and m1.d1 <= m2.d2
)
) startpoint
join
(
select id, d2 as day, row_number() over (partition by id order by d1) as rn
from mytable m1
where not exists
(
select *
from mytable m2
where m1.id = m2.id
and m1.d2 >= m2.d1 and m1.d2 < m2.d2
)
) endpoint on endpoint.id = startpoint.id and endpoint.rn = startpoint.rn
group by startpoint.id
order by startpoint.id;

If all your intervals start at different dates, consider them in ascending order by d1 counting how many days are from d1 to the next interval.
You can discard an interval of it is contained in another one.
The last interval won't have a follower.
This query should give you how many days each interval give
select a.id, a.d1,nvl(min(b.d1), a.d2) - a.d1
from orders a
left join orders b
on a.id = b.id and a.d1 < b.d1 and a.d2 between b.d1 and b.d2
group by a.id, a.d1
Then group by id and sum days

Get SQL to process each row 1 by 1

I have been going around for while trying to get an anwswer to my issue, I think it revolves around cursors in SQL but I am not sure. I think I know how to write the loop for a single row of data but I don't know how to run it for all the records:
Hopefully there is an easy answer:
I have a table, let's call it A, that has Product_Code, Start_Date, End_Date and Value
I would need an output table B that has column: Product_Code, Month, Year, Value when Month * Year is in between Start_Date and End_date
Each record of A should then create several record into B. Hope that's fairly clear, I'm happy to elaborate if not! :)

CREATE TABLE YearMonth(
Year int not null,
Month int not null,
FirstDay date not null,
LastDay date not null
);
Fill this table with as many years and months that your range of data is covered (no problem if you have too much).
You could do this with a statement like this:
WITH y(year) AS (
SELECT 2007
union all
SELECT 2008
union all
SELECT 2009
union all
SELECT 2010
union all
SELECT 2011
union all
SELECT 2012
union all
SELECT 2013
union all
SELECT 2014
union all
SELECT 2015
union all
SELECT 2016
),
m(month) AS (
SELECT 1
union all
SELECT 2
union all
SELECT 3
union all
SELECT 4
union all
SELECT 5
union all
SELECT 6
union all
SELECT 7
union all
SELECT 8
union all
SELECT 9
union all
SELECT 10
union all
SELECT 11
union all
SELECT 12
)
INSERT INTO YearMonth(Year, Month, FirstDay, LastDay)
SELECT y.year
,m.month
,convert(date, convert(nvarchar(4), y.year) + '.' + convert(nvarchar(2), m.month) + '.01', 102)
,DateAdd(day, - 1,
CASE WHEN m.month = 12 THEN
convert(date, convert(nvarchar(4), y.year + 1) + '.01.01', 102)
ELSE
convert(date, convert(nvarchar(4), y.year) + '.' + convert(nvarchar(2), m.month + 1) + '.01', 102)
END)
FROM y CROSS JOIN m
The tricky part to calculate the LastDay works like this: create a date that is the first of the following month, then subtract one day from it. This handles the problem that the last day of the month can be 28, 29, 30, or 31.
Then just use a join:
INSERT INTO B(Product_Code, Month, Year, Value)
SELECT A.Product_Code
,YearMonth.Month
,YearMonth.Year
,A.Value
FROM A
JOIN YearMonth ON YearMonth.LastDay <= A.StartDate
AND YearMonth.FirstDay <= A.EndDate
Depending on the exact interpretation of "Month*Year is in between Start_Date and End_date", you might have to switch one or both of the <=s to <.

sql server rolling 12 months sum with date gaps

Suppose I have a table that indicates the number of items sold in a particular month for each sales rep. However, there will not be a row for a particular person in months where there were no sales. Example
rep_id month_yr num_sales
1 01/01/2012 3
1 05/01/2012 1
1 11/01/2012 1
2 02/01/2012 2
2 05/01/2012 1
I want to be able to create a query that shows for each rep_id and all possible months (01/01/2012, 02/01/2012, etc. through current) a rolling 12 month sales sum, like this:
rep_id month_yr R12_Sum
1 11/01/2012 5
1 12/01/2012 5
1 01/01/2013 5
1 02/01/2013 2
I have found some examples online, but the problem I'm running into is I'm missing some dates for each rep_id. Do I need to cross join or something?

To solve this problem, you need a driver table that has all year/month combinations. Then, you need to create this for each rep.
The solution is then to left join the actual data to this driver and aggregate the period that you want. Here is the query:
with months as (
select 1 as mon union all select 2 union all select 3 union all select 4 union all
select 5 as mon union all select 6 union all select 7 union all select 8 union all
select 9 as mon union all select 10 union all select 11 union all select 12
),
years as (select 2010 as yr union all select 2011 union all select 2012 union all select 2013
),
monthyears as (
select yr, mon, yr*12+mon as yrmon
from months cross join years
),
rmy as (
select *
from monthyears my cross join
(select distinct rep_id from t
) r
)
select rmy.rep_id, rmy.yr, rmy.mon, SUM(t.num_sales) as r12_sum
from rmy join
t
on rmy.rep_id = t.rep_id and
t.year(month_yr)*12 + month(month_yr) between rmy.yrmon - 11 and rmy.yrmon
group by rmy.rep_id, rmy.yr, rmy.mon
order by 1, 2, 3
This hasn't been tested, so it may have syntactic errors. Also, it doesn't convert the year/month combination back to a date, leaving the values in separate columns.

Here is one solution:
SELECT
a.rep_id
,a.month_yr
,SUM(b.R12_Sum) AS R12_TTM
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.rep_id = b.rep_id
AND a.month_yr <= b.month_yr
AND a.month_yr >= DATEADD(MONTH, -11, b.month_yr)
GROUP BY
a.rep_id
,a.month_yr

It's certainly not pretty but is more simple than a CTE, numbers table or self join:
DECLARE #startdt DATETIME
SET #startdt = '2012-01-01'
SELECT rep_id, YEAR(month_yr), MONTH(month_yr), SUM(num_sales)
FROM MyTable WHERE month_yr >= #startdt AND month_yr < DATEADD(MONTH,1,#startdt)
UNION ALL
SELECT rep_id, YEAR(month_yr), MONTH(month_yr), SUM(num_sales)
FROM MyTable WHERE month_yr >= DATEADD(MONTH,1,#startdt) AND month_yr < DATEADD(MONTH,2,#startdt)
UNION ALL
SELECT rep_id, YEAR(month_yr), MONTH(month_yr), SUM(num_sales)
FROM MyTable WHERE month_yr >= DATEADD(MONTH,2,#startdt) AND month_yr < DATEADD(MONTH,3,#startdt)
UNION ALL
SELECT rep_id, YEAR(month_yr), MONTH(month_yr), SUM(num_sales)
FROM MyTable WHERE month_yr >= DATEADD(MONTH,3,#startdt) AND month_yr < DATEADD(MONTH,4,#startdt)
UNION ALL
etc etc

The following demonstrates using a CTE to generate a table of dates and generating a summary report using the CTE. Sales representatives are omitted from the results when they have had no applicable sales.
Try jiggling the reporting parameters, e.g. setting #RollingMonths to 1, for more entertainment.
-- Sample data.
declare #Sales as Table ( rep_id Int, month_yr Date, num_sales Int );
insert into #Sales ( rep_id, month_yr, num_sales ) values
( 1, '01/01/2012', 3 ),
( 1, '05/01/2012', 1 ),
( 1, '11/01/2012', 1 ),
( 2, '02/01/2012', 1 ),
( 2, '05/01/2012', 2 );
select * from #Sales;
-- Reporting parameters.
declare #ReportEnd as Date = DateAdd( day, 1 - Day( GetDate() ), GetDate() ); -- The first of the current month.
declare #ReportMonths as Int = 6; -- Number of months to report.
declare #RollingMonths as Int = 12; -- Number of months in rolling sums.
-- Report.
-- A CTE generates a table of month/year combinations covering the desired reporting time period.
with ReportingIntervals as (
select DateAdd( month, 1 - #ReportMonths, #ReportEnd ) as ReportingInterval,
DateAdd( month, 1 - #RollingMonths, DateAdd( month, 1 - #ReportMonths, #ReportEnd ) ) as FirstRollingMonth
union all
select DateAdd( month, 1, ReportingInterval ), DateAdd( month, 1, FirstRollingMonth )
from ReportingIntervals
where ReportingInterval < #ReportEnd )
-- Join the CTE with the sample data and summarize.
select RI.ReportingInterval, S.rep_id, Sum( S.num_sales ) as R12_Sum
from ReportingIntervals as RI left outer join
#Sales as S on RI.FirstRollingMonth <= S.month_yr and S.month_yr <= RI.ReportingInterval
group by RI.ReportingInterval, S.rep_id
order by RI.ReportingInterval, S.rep_id

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Fill missing months in a SELECT query - sql

Related

Improve query to be less repetitive

Last 10 weeks in SQL Server

SQL counting days with gap / overlapping

Get SQL to process each row 1 by 1

sql server rolling 12 months sum with date gaps

Categories

Resources