Calculate number of workdays PER MONTH from start_date and end_date - sql

So I have a table that looks like this :
task_id | start_date |end_date
I want to calculate the number of workdays (just days from mondays to fridays , no holidays) per month.
for example : if a task took from 02-01-2022 to 05-02-2022 to be accomplished, i need the result to look something like this
task_id | january |february |march |april .............|december
1 21 4 0 0 .......... 0

You can try to use generate_series function to generate date during your start_date and end_date which we can easy to count then the condition aggregate function to make pivot.
extract can get the month number or workdays(from Mondays to Fridays) by TIMESTAMP type, we can use that be count condition in aggregate function.
SELECT t1.task_id,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 1 THEN 1 END) january,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 2 THEN 1 END) february,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 3 THEN 1 END) march
-- more months you can write
FROM T t1
CROSS JOIN generate_series(t1.start_date,t1.end_date,'1 day'::interval) dt
group by t1.task_id
sqlfiddle

Related

Get occurences of past 2 weeks on any given date

I have data like
id | date |
-------------
1 | 1.1.20 |
3 | 4.1.20 |
2 | 4.1.20 |
1 | 5.1.20 |
6 | 2.1.20 |
What I would like to get is to get the amount of occurrences an user with ID did in the past 2 weeks on any given date so basically "occurences between date - 14 days and date. I'm trying to categorize users by their amount of sessions past 2 weeks, and I'm following them by daily cohorts.
This query does not work since there can be days when the user does not log in aka does not have a row:
COUNT (distinct id) OVER (PARTITION BY id ORDER BY date ROWS BETWEEN 14 PRECEDING AND 0 FOLLOWING)
Unfortunately, Presto does not support range() window functions. One method is a self-join/aggregation or correlated subquery:
select t.id, count(tprev.id)
from t left join
t tprev
on tprev.id = t.id and
tprev.date > t.date - interval '13' day and
tprev.date <= t.date
group by t.id;
This interprets your request as wanting 14 days of data, including the current day.
Another method that is much more verbose but might be faster is to use lag() . . . and lag() again:
select t.id,
(1 + -- current date
(case when lag(date, 1) over (partition by id order by date) > date - interval '14' day then 1 else 0 end) +
(case when lag(date, 2) over (partition by id order by date) > date - interval '14' day then 1 else 0 end) +
. . .
(case when lag(date, 13) over (partition by id order by date) > date - interval '14' day then 1 else 0 end) +
) as cnt_14
from t;

day of week function for week day aggregates

I currently have a query that reads from a table and aggregates based on category. It gives me what I need but I"m trying to add another column that looks at all records for that category/employee combo for the days of this past week. SO if the job with this query runs on Wednesday Night, it needs to get a total of all category/employee records for Monday and Tuesday Night as well.
The query:
SELECT employee,
sum(case when category = 'Shoes' and date_of_report >= current_date - 1 days then daily_total else 0 end) as Shoes_DAILY,
sum(case when category = 'Shoes' and date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days ) then sum(daily_total) else 0 end) as dailyTotalWeek
from shoeTotals
where date_of_report >= current_date
group by employee;
So the third column there is what's messing me up saying function use not valid. here's what I want:
The source table has these records for this past week:
employee | daily_total | date_of_report
--------------------------------------------------
123 14 2019-08-26
123 1 2019-08-27
123 56 2019-08-28
123 6 2019-08-29
123 8 2019-08-30 * today
My desired output would get (based on employee and category) the total for today (8) and then the sum of all the employees' records for that category on each preceding weekday. Running on Monday night would only count that days records, friday night would count monday through friday's as shown above.
employee | shoes_daily | dailyTotalWeek
--------------------------------------------------
123 8 85
What am I doing wrong with the dayofweek function?
You cannot nest aggregation functions. I think you simply want:
select employee,
sum(case when category = 'Shoes' and date_of_report >= current_date - 1 days
then daily_total else 0
end) as Shoes_DAILY,
sum(case when category = 'Shoes' and date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days )
then daily_total else 0
end) as dailyTotalWeek
from shoeTotals
where date_of_report >= current date - ( dayofweek(current date) - 1 ) days
group by employee;

Teradata - Split date range into month columns with day count

I need to split different date ranges over a quarter period into month columns with only the days actually used in that month. Each record (range) would be different.
Example:
Table
Record_ID Start_Date End_Date
1 10/27 11/30
2 11/30 12/14
3 12/14 12/31
Range 1 = 10/5 to 12/14
Range 2 = 11/20 to 12/31
Range 3 = 10/28 to 12/2
Output:
Range 1
Oct Nov Dec
27 30 14
Similar to #ULick's answer using sys_calendar.calendar, but a little more succinct:
CREATE VOLATILE MULTISET TABLE datetest (record_id int, start_date date, end_date date) ON COMMIT PRESERVE ROWS;
INSERT INTO datetest VALUES (1, '2017-10-05', '2017-12-14');
INSERT INTO datetest VALUES (2, '2017-11-20','2017-12-31');
SELECT record_id,
SUM(CASE WHEN month_of_year = 10 THEN 1 ELSE 0 END) as October,
SUM(CASE WHEN month_of_year = 11 THEN 1 ELSE 0 END) as November,
SUM(CASE WHEN month_of_year = 12 THEN 1 ELSE 0 END) as December
FROM datetest
INNER JOIN sys_calendar.calendar cal
ON cal.calendar_date BETWEEN start_date and end_date
GROUP BY record_id;
DROP TABLE datetest;
Because Quarter was mentioned in the question (I'm not sure how it relates here) there is also quarter_of_year and month_of_quarter available in the sys_calendar to slice and dice this even further.
Also, if you are on 16.00+ There is PIVOT functionality which may help get rid of the CASE statements here.
First join with the calendar to get all the dates within the range and get the number of days per each month (incl. full month, not mentioned in Start_Date and End_Date).
Then sum up each month in a column per Range.
create table SplitDateRange ( Range bigint, Start_Date date, End_Date date );
insert into SplitDateRange values ( 1, '2018-10-05', '2018-12-14' );
insert into SplitDateRange values ( 2, '2018-11-20', '2018-12-31' );
insert into SplitDateRange values ( 3, '2018-10-28', '2018-12-02' );
select
Range
, sum(case when mon = 10 then days else 0 end) as "Oct"
, sum(case when mon = 11 then days else 0 end) as "Nov"
, sum(case when mon = 12 then days else 0 end) as "Dec"
from (
select
Range
, extract(MONTH from C.calendar_date) as mon
, max(C.calendar_date) - min(calendar_date) +1 as days
from Sys_Calendar.CALENDAR as C
inner join SplitDateRange as DR
on C.calendar_date between DR.Start_Date and DR.End_Date
group by 1,2
) A
group by Range
order by Range
;
Different approach, avoids the cross join to the calendar by applying Teradata Expand On feature for creating time series. More text, but should be more efficient for larger tables/ranges:
SELECT record_id,
Sum(CASE WHEN mth = 10 THEN days_in_month ELSE 0 END) AS October,
Sum(CASE WHEN mth = 11 THEN days_in_month ELSE 0 END) AS November,
Sum(CASE WHEN mth = 12 THEN days_in_month ELSE 0 END) AS December
FROM
( -- this Derived Table simply avoids repeating then EXTRACT/INTERVAL calculations (can't be done directly in the nested Select)
SELECT record_id,
Extract(MONTH From Begin(expanded_pd)) AS mth,
Cast((INTERVAL( base_pd P_INTERSECT expanded_pd) DAY) AS INT) AS days_in_month
FROM
(
SELECT record_id,
PERIOD(start_date, end_date+1) AS base_pd,
expanded_pd
FROM datetest
-- creates one row per month
EXPAND ON base_pd AS expanded_pd BY ANCHOR PERIOD Month_Begin
) AS dt
) AS dt
GROUP BY 1

Count days from start_date to end_date or end of month

With datediff() I can count the days between two dates, but how can I count the days between the later date or the end of the month and the start date?
CREATE TABLE table1 (id int, start_date datetime, end_date datetime, jan int);
INSERT INTO table1 (id, start_date, end_date) VALUES
(1, '2016-12-12', '2017-01-17'),
(2, '2017-01-10', '2017-01-10'),
(3, '2017-01-10', '2017-02-10'),
(4, '2017-01-03', '2017-02-03'),
(5, '2016-12-03', '2017-02-03');
If I run:
select id, month(start_date) as month, datediff(end_date, start_date) as diff
from table1;
it returns
id month diff
1 12 36
2 1 0
3 1 31
4 1 31
5 12 62
but I would like it to return:
id month diff
1 12 19
5 12 28
1 1 17
2 1 0
3 1 21
4 1 28
5 1 31
3 2 10
4 2 3
5 2 3
I'm trying to get the amount of days in a month a event occurs by month.
I've created a separated query to update a new column with the values, but ideally it shouldn't have a new column, since I would need several new columns for each year-month combination and one for each year-month combination:
update table1 set jan= case
when start_date >= "2017-01-01" and end_date <= last_day("2017-01-01") then datediff(end_date, start_date)+1
when start_date >= "2017-01-01" and start_date <= last_day("2017-01-01") and end_date > last_day("2017-01-01") then datediff(last_day("2017-01-01"), start_date)+1
when start_date < "2017-01-01" and end_date between "2017-01-01" and last_day("2017-01-01") then datediff(end_date, "2017-01-01")+1
when start_date < "2017-01-01" and end_date > last_day("2017-01-01") then day(last_day("2017-01-01"))
else null
end;
Your problem is going to be getting multiple rows... so let's take a different tack.
This ends up being trivial if you have a calendar table: a table with a row-per-date (and a bunch of individual columns and indices):
SELECT Table1.id, Calendar.calendar_month, COUNT(*)
FROM Table1
JOIN Calendar
ON Calendar.calendar_date >= start_date
AND Calendar.calendar_date < end_date
GROUP BY Table1.id, Calendar.calendar_month
ORDER BY Table1.id, MIN(Calendar.calendar_date)
Fiddle Demo
I don't know if this is what you're looking for.
select month(start_date) as month,
datediff(LAST_DAY(start_date), start_date) as diff
from table1
UNION ALL
select month(end_date) as month,
IF(end_date < LAST_DAY(start_date), datediff(start_date, end_date),
datediff(end_date, LAST_DAY(start_date)))
from table1;
DEMO

SQL - How to count records for each status in one line per day?

I have a table Sales
Sales
--------
id
FormUpdated
TrackingStatus
There are several status e.g. Complete, Incomplete, SaveforLater, ViewRates etc.
I want to have my results in this form for the last 8 days(including today).
Expected Result:
Date Part of FormUpdated, Day of Week, Counts of ViewRates, Counts of Sales(complete), Counts of SaveForLater
--------------------------------------
2015-05-19 Tuesday 3 1 21
2015-05-18 Monday 12 5 10
2015-05-17 Sunday 6 1 8
2015-05-16 Saturday 5 3 7
2015-05-15 Friday 67 5 32
2015-05-14 Thursday 17 0 5
2015-05-13 Wednesday 22 0 9
2015-05-12 Tuesday 19 2 6
Here is my sql query:
select datename(dw, FormUpdated), count(ID), TrackingStatus
from Sales
where FormUpdated <= GETDATE()
AND FormUpdated >= GetDate() - 8
group by datename(dw, FormUpdated), TrackingStatus
order by datename(dw, FormUpdated) desc
I do not know how to make the next step.
Update
I forgot to mention, I only need the Date part of the FormUpdated, not all parts.
You can use SUM(CASE WHEN TrackingStatus = 'SomeTrackingStatus' THEN 1 ELSE 0 END)) to get the status count for each tracking status in individual column. Something like this. SQL Fiddle
select
CONVERT(DATE,FormUpdated) FormUpdated,
DATENAME(dw, CONVERT(DATE,FormUpdated)),
SUM(CASE WHEN TrackingStatus = 'ViewRates' THEN 1 ELSE 0 END) c_ViewRates,
SUM(CASE WHEN TrackingStatus = 'Complete' THEN 1 ELSE 0 END) c_Complete,
SUM(CASE WHEN TrackingStatus = 'SaveforLater' THEN 1 ELSE 0 END) c_SaveforLater
from Sales
where FormUpdated <= GETDATE()
AND FormUpdated >= DATEADD(D,-8,GetDate())
group by CONVERT(DATE,FormUpdated)
order by CONVERT(DATE,FormUpdated) desc
You can also use a PIVOT to achieve this result - you'll just need to complete the list of TrackingStatus names in both the SELECT and the FOR, and no GROUP BY required:
WITH DatesOnly AS
(
SELECT Id, CAST(FormUpdated AS DATE) AS DateOnly, DATENAME(dw, FormUpdated) AS DayOfWeek, TrackingStatus
FROM Sales
)
SELECT DateOnly, DayOfWeek,
-- List of Pivoted Columns
[Complete],[Incomplete], [ViewRates], [SaveforLater]
FROM DatesOnly
PIVOT
(
COUNT(Id)
-- List of Pivoted columns
FOR TrackingStatus IN([Complete],[Incomplete], [ViewRates], [SaveforLater])
) pvt
WHERE DateOnly <= GETDATE() AND DateOnly >= GetDate() - 8
ORDER BY DateOnly DESC
SqlFiddle
Also, I think your ORDER BY is wrong - it should just be the Date, not day of week.