Count days from start_date to end_date or end of month - sql

With datediff() I can count the days between two dates, but how can I count the days between the start date and whichever comes first, the end date or the end of the month?
CREATE TABLE table1 (id int, start_date datetime, end_date datetime, jan int);
INSERT INTO table1 (id, start_date, end_date) VALUES
(1, '2016-12-12', '2017-01-17'),
(2, '2017-01-10', '2017-01-10'),
(3, '2017-01-10', '2017-02-10'),
(4, '2017-01-03', '2017-02-03'),
(5, '2016-12-03', '2017-02-03');
If I run:
select id, month(start_date) as month, datediff(end_date, start_date) as diff
from table1;
it returns
id month diff
1 12 36
2 1 0
3 1 31
4 1 31
5 12 62
but I would like it to return:
id month diff
1 12 19
5 12 28
1 1 17
2 1 0
3 1 21
4 1 28
5 1 31
3 2 10
4 2 3
5 2 3
I'm trying to get the number of days on which an event occurs in each month.
I've created a separate query to update a new column with the values, but ideally it shouldn't need a new column, since I would need a new column (and a new update query) for each year-month combination:
update table1 set jan= case
when start_date >= "2017-01-01" and end_date <= last_day("2017-01-01") then datediff(end_date, start_date)+1
when start_date >= "2017-01-01" and start_date <= last_day("2017-01-01") and end_date > last_day("2017-01-01") then datediff(last_day("2017-01-01"), start_date)+1
when start_date < "2017-01-01" and end_date between "2017-01-01" and last_day("2017-01-01") then datediff(end_date, "2017-01-01")+1
when start_date < "2017-01-01" and end_date > last_day("2017-01-01") then day(last_day("2017-01-01"))
else null
end;
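For reference, the four CASE branches above amount to clamping the event to the month and measuring what is left. Here is a minimal Python sketch of that arithmetic, assuming inclusive start and end dates (as the `+1` terms above imply). The function name `days_in_month_overlap` is made up for illustration and is not part of the original query.

```python
import calendar
from datetime import date
from typing import Optional


def days_in_month_overlap(start: date, end: date, year: int, month: int) -> Optional[int]:
    """Days of [start, end] (both inclusive) that fall inside the given month."""
    month_start = date(year, month, 1)
    month_end = date(year, month, calendar.monthrange(year, month)[1])
    # Clamp the event to the month boundaries.
    lo = max(start, month_start)
    hi = min(end, month_end)
    if lo > hi:
        return None  # no overlap, like the ELSE NULL branch
    return (hi - lo).days + 1


# Row 3 of the sample data, measured against January 2017.
print(days_in_month_overlap(date(2017, 1, 10), date(2017, 2, 10), 2017, 1))
```

Because the month is a parameter, one expression covers every year-month combination instead of one column per month.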

Your problem is going to be getting multiple rows... so let's take a different tack.
This ends up being trivial if you have a calendar table: a table with a row-per-date (and a bunch of individual columns and indices):
SELECT Table1.id, Calendar.calendar_month, COUNT(*)
FROM Table1
JOIN Calendar
ON Calendar.calendar_date >= start_date
AND Calendar.calendar_date < end_date
GROUP BY Table1.id, Calendar.calendar_month
ORDER BY Table1.id, MIN(Calendar.calendar_date)
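To see what the calendar join computes, here is a small Python sketch that plays the role of the calendar table by walking one day at a time and grouping by month. It assumes the same half-open comparison as the join (`>= start_date` and `< end_date`). The helper name `days_per_month` is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def days_per_month(start: date, end: date) -> Counter:
    """Count calendar days d with start <= d < end, grouped by (year, month)."""
    counts = Counter()
    day = start
    while day < end:  # half-open, mirroring calendar_date < end_date
        counts[(day.year, day.month)] += 1
        day += timedelta(days=1)
    return counts


# Row 1 of the sample data.
print(days_per_month(date(2016, 12, 12), date(2017, 1, 17)))
```

For id 1 this gives 20 days in December and 16 in January, which add up to the DATEDIFF of 36. Whether the start or end day is counted depends only on which comparisons the join uses.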

I don't know if this is what you're looking for.
select month(start_date) as month,
datediff(LAST_DAY(start_date), start_date) as diff
from table1
UNION ALL
select month(end_date) as month,
IF(end_date < LAST_DAY(start_date), datediff(start_date, end_date),
datediff(end_date, LAST_DAY(start_date)))
from table1;

Related

Get days of the week from a date range in Postgres

So I have the following table :
id  end_date      name      number_of_days  start_date
1   "2022-01-01"  holiday1  1               "2022-01-01"
2   "2022-03-20"  holiday2  1               "2022-03-20"
3   "2022-04-09"  holiday3  1               "2022-04-09"
4   "2022-05-01"  holiday4  1               "2022-05-01"
5   "2022-05-04"  holiday5  3               "2022-05-02"
6   "2022-07-20"  holiday6  9               "2022-07-12"
I want to check whether a week overlaps a holiday range.
So far I can select the holidays that overlap with my chosen week (week_start_date, week_end_date), but I can't get the exact days on which the overlap happens.
This is the query I'm using. I want to add a mechanism to detect the days of the week on which the overlap happens:
SELECT * FROM holidays
where daterange(CAST(start_date AS date), CAST(end_date as date), '[]') && daterange('2022-07-18', '2022-07-26','[]')
The current query returns the overlapping holiday (id = 6), but I'm trying to get the exact days of the week on which the overlap happens (in this case, it should be Monday, Tuesday, Wednesday).
You can use the * operator with tsranges, generate a series of dates with the lower and upper dates and finally with to_char print the days of the week, e.g.
SELECT
id, name, start_date, end_date, array_agg(dow) AS days
FROM (
SELECT *,
trim(
to_char(
generate_series(lower(overlap), upper(overlap),'1 day'),
'Day')) AS dow
FROM holidays
CROSS JOIN LATERAL (SELECT tsrange(start_date,end_date) *
tsrange('2022-07-18', '2022-07-26')) t (overlap)
WHERE tsrange(start_date,end_date) && tsrange('2022-07-18', '2022-07-26')) j
GROUP BY id,name,start_date,end_date,number_of_days;
id | name | start_date | end_date | days
----+----------+------------+------------+----------------------------
6 | holiday6 | 2022-07-12 | 2022-07-20 | {Monday,Tuesday,Wednesday}
(1 row)
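The same idea (intersect the two ranges, then list the weekday of each day in the intersection) can be sketched in Python. This is only an illustration, not the Postgres implementation: it assumes both ranges are inclusive at both ends, and the function name `overlap_weekdays` is made up.

```python
from datetime import date, timedelta

# Fixed English names, so the result does not depend on the locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def overlap_weekdays(a_start: date, a_end: date, b_start: date, b_end: date) -> list:
    """Weekday names of every day shared by [a_start, a_end] and [b_start, b_end]."""
    lo = max(a_start, b_start)  # the intersection starts at the later start
    hi = min(a_end, b_end)      # and ends at the earlier end
    names = []
    day = lo
    while day <= hi:
        names.append(DAY_NAMES[day.weekday()])
        day += timedelta(days=1)
    return names


# holiday6 against the week 2022-07-18 .. 2022-07-26
print(overlap_weekdays(date(2022, 7, 12), date(2022, 7, 20),
                       date(2022, 7, 18), date(2022, 7, 26)))
```

If the ranges do not intersect, the loop never runs and the result is an empty list.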

Query a 30 day interval for every 30 day interval in the last year

I want to query every 30 day interval in 2021, but I don't know how to do it without a for loop in SQL.
Here's pseudocode of what I want to do with a table called _table and a date column called application_date:
for _day in range(335):
select '2021-01-01' + _day as start_date, count(*) as _count
from _table
where '2021-01-01' + _day <= application_date <= ('2021-01-01' + _day + interval '30' day )
It would output something like this:
start_date   _count
2021-01-01   {number of rows between 2021-01-01 and 2021-01-31}
2021-01-02   {number of rows between 2021-01-02 and 2021-02-01}
...          ...
2021-11-30   {number of rows between 2021-11-30 and 2021-12-30}
2021-12-01   {number of rows between 2021-12-01 and 2021-12-31}
Assuming that you have rows for each day, you can group the data by date, count the rows in each group, and then use the sum window function over a frame of the current row plus the next 30 rows. Note that {rows between 2021-01-01 and 2021-01-31} spans 31 days, not 30:
-- sample data
WITH dataset(start_date) AS (
VALUES (date '2021-01-01'),
(date '2021-01-01'),
(date '2021-01-01'),
(date '2021-01-02'),
(date '2021-01-03'),
(date '2021-01-03')
)
-- query
select start_date
, sum(cnt) over (order by start_date ROWS BETWEEN CURRENT ROW AND 30 FOLLOWING) rolling_count_31_days
from (
select start_date
, count(*) cnt
from dataset
where year(start_date) = 2021
group by start_date
)
Output:
start_date   rolling_count_31_days
2021-01-01   6
2021-01-02   3
2021-01-03   2
If some dates are missing, check out this or this answer, which describe how to insert the missing dates into the grouped result with cnt set to 0.
Note that Trino (the new name for PrestoSQL) has extended its support for the RANGE frame type, so there you can implement this without inserting the missing rows.
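The difference between a row-based and a date-based window is easiest to see in a small sketch. The Python below counts by date distance rather than by row position, so gaps in the data need no filler rows. It is an illustration of the idea only, not Trino's implementation, and `rolling_counts` is a hypothetical helper name.

```python
from collections import Counter
from datetime import date, timedelta


def rolling_counts(dates, window_days=31):
    """For each distinct date d, count rows dated within [d, d + window_days - 1]."""
    per_day = Counter(dates)  # the GROUP BY start_date / count(*) step
    result = {}
    for start in sorted(per_day):
        end = start + timedelta(days=window_days - 1)
        # Sum by date distance, not by row position, so missing days are harmless.
        result[start] = sum(n for d, n in per_day.items() if start <= d <= end)
    return result


sample = [date(2021, 1, 1)] * 3 + [date(2021, 1, 2)] + [date(2021, 1, 3)] * 2
for day, total in rolling_counts(sample).items():
    print(day, total)
```

This scans every group for every start date, which is fine for a sketch but quadratic in the number of distinct dates.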

Partition rows where dates are between the previous dates

I have the below table.
I want to identify overlapping intervals of start_date and end_date.
Edit: where rows overlap, I would like to remove the row that has the fewest days between its start and end date.
Example:
pgid 1 & pgid 2 have overlapping days. Remove the row that has the least amount of days between start_date and end_date.
Table A
id  pgid  Start_date  End_date    Days
1   1     8/4/2018    9/10/2018   37
1   2     9/8/2018    9/8/2018    0
1   3     10/29/2018  11/30/2018  32
1   4     12/1/2018   sysdate     123
Expected Results:
id  Start_date  End_date    Days
1   8/4/2018    9/10/2018   37
1   10/29/2018  11/30/2018  32
1   12/1/2018   sysdate     123
I am thinking exists:
select t.*,
(case when exists (select 1
from t t2
where t2.start_date < t.start_date and
t2.end_date > t.end_date and
t2.id = t.id
)
then 2 else 1
end) as overlap_flag
from t;
Maybe lead and lag:
SELECT
CASE
WHEN END_DATE > LEAD (START_DATE) OVER (PARTITION BY id ORDER BY START_DATE) THEN 1
WHEN START_DATE < LAG (END_DATE) OVER (PARTITION BY id ORDER BY START_DATE) THEN 1
ELSE 0
END OVERLAP_FLAG
FROM A
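Both queries above only flag overlapping rows. The removal step the question asks for could look like the following Python sketch. It is one possible reading of the requirement rather than either answer's method: rows are sorted by start date, end dates are treated as inclusive, and when two neighbouring rows overlap, the one spanning fewer days is dropped. The function name is hypothetical, and `sysdate` is replaced by an assumed fixed date of 2019-04-03 (123 days after 12/1/2018).

```python
from datetime import date


def drop_shorter_overlaps(rows):
    """rows: iterable of (pgid, start, end) for one id; returns the rows to keep."""
    kept = []
    for row in sorted(rows, key=lambda r: r[1]):
        if kept and row[1] <= kept[-1][2]:  # starts before the last kept row ends
            # Overlap: keep whichever of the two spans more days.
            if (row[2] - row[1]) > (kept[-1][2] - kept[-1][1]):
                kept[-1] = row
        else:
            kept.append(row)
    return kept


table_a = [
    (1, date(2018, 8, 4), date(2018, 9, 10)),
    (2, date(2018, 9, 8), date(2018, 9, 8)),
    (3, date(2018, 10, 29), date(2018, 11, 30)),
    (4, date(2018, 12, 1), date(2019, 4, 3)),  # assumed stand-in for sysdate
]
for pgid, start, end in drop_shorter_overlaps(table_a):
    print(pgid, start, end)
```

On the sample table this keeps pgid 1, 3 and 4, matching the expected results.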

db2 compare year and month side by side

I need to compare each company's values side by side: current year vs. last year, and current month vs. the same month of the previous year.
I use this query to get the values
SELECT STORE, SUM(TOTAL) as VAL, DATE FROM MYTABLE
WHERE DATE=CURRENT_DATE GROUP BY STORE ORDER BY STORE
below the results
STORE | VAL | DATE
1 10 CURRENT_DATE (2018-27-03)
1 20 2018-26-03
1 30 2018-25-03
2 20 CURRENT_DATE (2018-27-03)
2 20 2018-26-02
and i need this
STORE | VALUE CURRENT YEAR | VALUE LAST YEAR
1 60 30 (CALCULATED)
2 40 50 (CALCULATED)
STORE | VALUE CURRENT MONTH | VALUE SAME MONTH OF LAST YEAR
1 60 30 (CALCULATED)
2 20 50 (CALCULATED)
Thank you
You could just join two sub-selects together.
E.g with this DDL and Data
CREATE TABLE MYTABLE (STORE int, VAL int, D DATE);
INSERT INTO MYTABLE VALUES
( 1, 10, '2018-03-27')
,( 1, 20, '2018-03-26')
,( 1, 10, '2018-02-25')
,( 1, 35, '2017-03-25')
,( 2, 20, '2018-03-27')
,( 2, 15, '2017-03-26');
This will get you the values for the current month and for the same month last year
SELECT C.*, LY.VAL_CURR_MONTH_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_MONTH
FROM MYTABLE WHERE INT(D)/100=INT(CURRENT_DATE)/100
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE INT(D)/100 = INT(CURRENT_DATE)/100 -100
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
Then this for years
SELECT C.*, LY.VAL_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_YEAR
FROM MYTABLE WHERE INT(D)/10000=INT(CURRENT_DATE)/10000
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_LY
FROM MYTABLE
WHERE INT(D)/10000 = INT(CURRENT_DATE)/10000 -1
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
P.S. there are many other ways to manipulate dates, but casting to INT is maybe one of the easier ways
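To make the integer arithmetic concrete: a date cast to INT reads as yyyymmdd, so integer division by 100 leaves yyyymm, and subtracting 100 from that moves back exactly one year. The Python sketch below reproduces the month comparison on the sample data. The helper names are hypothetical, and the current date is assumed to be 2018-03-27.

```python
from datetime import date


def yyyymm(d: date) -> int:
    """Same value as INT(D)/100 in the queries above: 2018-03-27 -> 201803."""
    return (d.year * 10000 + d.month * 100 + d.day) // 100


def month_vs_last_year(rows, today: date) -> dict:
    """Map store -> (current month total, same month last year total or None)."""
    this_month = yyyymm(today)
    last_year = this_month - 100  # 201803 - 100 = 201703
    cur, prev = {}, {}
    for store, val, d in rows:
        key = yyyymm(d)
        if key == this_month:
            cur[store] = cur.get(store, 0) + val
        elif key == last_year:
            prev[store] = prev.get(store, 0) + val
    # Like the LEFT JOIN: only stores with current-month rows, None if no match.
    return {store: (total, prev.get(store)) for store, total in cur.items()}


rows = [
    (1, 10, date(2018, 3, 27)), (1, 20, date(2018, 3, 26)),
    (1, 10, date(2018, 2, 25)), (1, 35, date(2017, 3, 25)),
    (2, 20, date(2018, 3, 27)), (2, 15, date(2017, 3, 26)),
]
print(month_vs_last_year(rows, today=date(2018, 3, 27)))
```

The year comparison works the same way with division by 10000 and a subtraction of 1.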
Also, here is a more flexible way to get the "Same Month of Last Year" value. A similar method can get "last Year" values.
SELECT T.*
, AVG(VAL) OVER(
PARTITION BY STORE
ORDER BY YEAR_MONTH
RANGE BETWEEN 101 PRECEDING AND 100 PRECEDING
) AS SAME_MONTH_PREV_YEAR
FROM
( SELECT STORE
, INTEGER(D)/100 AS YEAR_MONTH
, SUM(VAL) AS VAL
FROM
MYTABLE T
GROUP BY
STORE
, INTEGER(D)/100
) AS T
;
Gives
STORE YEAR_MONTH VAL SAME_MONTH_PREV_YEAR
----- ---------- --- --------------------
1 201703 35 NULL
1 201802 10 NULL
1 201803 30 35
2 201703 15 NULL
2 201803 20 15
It is better to avoid applying functions to table columns in WHERE clauses. See the following queries, which are based on P. Vernon's sample table.
Note: These SQLs are for DB2 LUW 11.1
For month:
SELECT STORE,
SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CURR_MONTH,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE D between first_day(current date) and last_day(current date)
or D between first_day(current date - 1 year) and last_day(current date - 1 year)
GROUP BY STORE
ORDER BY STORE
For year:
SELECT STORE, SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CY,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_LY
FROM MYTABLE
WHERE D between first_day(current date - (month(current date) - 1) months)
and last_day(current date + (12 - month(current date)) months)
or D between first_day(current date - (month(current date) - 1) months - 1 year)
and last_day(current date + (12 - month(current date)) months - 1 year)
GROUP BY STORE
ORDER BY STORE
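The point of these queries is that the date bounds are computed once from the current date, and the raw column D is then compared against them with BETWEEN. A rough Python sketch of how the month bounds are derived, with made-up helper names and using only the standard library:

```python
import calendar
from datetime import date


def month_bounds(d: date):
    """First and last day of the month containing d."""
    last = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, 1), date(d.year, d.month, last)


def same_month_last_year_bounds(d: date):
    """Bounds of the same month one year earlier."""
    # Going through day 1 avoids the invalid date 29 February in non-leap years.
    return month_bounds(date(d.year - 1, d.month, 1))


today = date(2018, 3, 27)  # assumed current date, matching the sample data
print(month_bounds(today))
print(same_month_last_year_bounds(today))
```

A row then qualifies if its date lies between either pair of bounds, with no function applied to the column itself.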

Teradata - Split date range into month columns with day count

I need to split different date ranges over a quarter period into month columns with only the days actually used in that month. Each record (range) would be different.
Example:
Table
Record_ID Start_Date End_Date
1 10/27 11/30
2 11/30 12/14
3 12/14 12/31
Range 1 = 10/5 to 12/14
Range 2 = 11/20 to 12/31
Range 3 = 10/28 to 12/2
Output:
Range 1
Oct Nov Dec
27 30 14
Similar to @ULick's answer using sys_calendar.calendar, but a little more succinct:
CREATE VOLATILE MULTISET TABLE datetest (record_id int, start_date date, end_date date) ON COMMIT PRESERVE ROWS;
INSERT INTO datetest VALUES (1, '2017-10-05', '2017-12-14');
INSERT INTO datetest VALUES (2, '2017-11-20','2017-12-31');
SELECT record_id,
SUM(CASE WHEN month_of_year = 10 THEN 1 ELSE 0 END) as October,
SUM(CASE WHEN month_of_year = 11 THEN 1 ELSE 0 END) as November,
SUM(CASE WHEN month_of_year = 12 THEN 1 ELSE 0 END) as December
FROM datetest
INNER JOIN sys_calendar.calendar cal
ON cal.calendar_date BETWEEN start_date and end_date
GROUP BY record_id;
DROP TABLE datetest;
Because Quarter was mentioned in the question (I'm not sure how it relates here) there is also quarter_of_year and month_of_quarter available in the sys_calendar to slice and dice this even further.
Also, if you are on 16.00+, there is PIVOT functionality, which may help get rid of the CASE statements here.
First, join with the calendar to get all the dates within each range, and compute the number of days in each month (including full months that are not named in Start_Date or End_Date).
Then sum up each month in a column per Range.
create table SplitDateRange ( Range bigint, Start_Date date, End_Date date );
insert into SplitDateRange values ( 1, '2018-10-05', '2018-12-14' );
insert into SplitDateRange values ( 2, '2018-11-20', '2018-12-31' );
insert into SplitDateRange values ( 3, '2018-10-28', '2018-12-02' );
select
Range
, sum(case when mon = 10 then days else 0 end) as "Oct"
, sum(case when mon = 11 then days else 0 end) as "Nov"
, sum(case when mon = 12 then days else 0 end) as "Dec"
from (
select
Range
, extract(MONTH from C.calendar_date) as mon
, max(C.calendar_date) - min(calendar_date) +1 as days
from Sys_Calendar.CALENDAR as C
inner join SplitDateRange as DR
on C.calendar_date between DR.Start_Date and DR.End_Date
group by 1,2
) A
group by Range
order by Range
;
A different approach avoids the cross join to the calendar by using Teradata's EXPAND ON feature for creating time series. It takes more text, but should be more efficient for larger tables or ranges:
SELECT record_id,
Sum(CASE WHEN mth = 10 THEN days_in_month ELSE 0 END) AS October,
Sum(CASE WHEN mth = 11 THEN days_in_month ELSE 0 END) AS November,
Sum(CASE WHEN mth = 12 THEN days_in_month ELSE 0 END) AS December
FROM
( -- this Derived Table simply avoids repeating the EXTRACT/INTERVAL calculations (can't be done directly in the nested Select)
SELECT record_id,
Extract(MONTH From Begin(expanded_pd)) AS mth,
Cast((INTERVAL( base_pd P_INTERSECT expanded_pd) DAY) AS INT) AS days_in_month
FROM
(
SELECT record_id,
PERIOD(start_date, end_date+1) AS base_pd,
expanded_pd
FROM datetest
-- creates one row per month
EXPAND ON base_pd AS expanded_pd BY ANCHOR PERIOD Month_Begin
) AS dt
) AS dt
GROUP BY 1
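The month-anchored expansion can be pictured as cutting the range at every first-of-month and measuring each piece, so the work grows with the number of months rather than the number of days. The Python sketch below mimics that idea. It is an illustration only, not how Teradata implements EXPAND ON, and `split_by_month` is a hypothetical name.

```python
from datetime import date, timedelta


def split_by_month(start: date, end: date):
    """Yield (year, month, days) for each month touched by [start, end], end inclusive."""
    stop = end + timedelta(days=1)  # half-open upper bound, like PERIOD(start_date, end_date+1)
    cur = start
    while cur < stop:
        # First day of the following month: the next anchor point.
        if cur.month == 12:
            nxt = date(cur.year + 1, 1, 1)
        else:
            nxt = date(cur.year, cur.month + 1, 1)
        piece_end = min(nxt, stop)
        yield cur.year, cur.month, (piece_end - cur).days
        cur = piece_end


# Record 1 from the datetest sample: 2017-10-05 .. 2017-12-14
for year, month, days in split_by_month(date(2017, 10, 5), date(2017, 12, 14)):
    print(year, month, days)
```

For record 1 this yields 27, 30 and 14 days for October, November and December, the same figures as the output shown in the question.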