Calculating datetime intervals taking into account the current date - SQL

I have a subquery which returns this:
item_id, item_datetime, item_duration_in_days
1, '7-dec-2016-12:00', 3
2, '8-dec-2016-11:00', 4
3, '20-dec-2016-05:00', 10
4, '2-jan-2017-14:00', 50
5, '29-jan-2017-22:00', 89
I want to get the "item_id" whose computed interval contains "now()". The algorithm for that is:
1) var duration_days = interval 'item_duration_in_days[i]'
2) for the very first item:
new_datetime[i] = item_datetime[i] + duration_days
3) for others:
- if a new_datetime from the previous step overlaps with the current item_datetime[i]:
new_datetime[i] = new_datetime[i - 1] + duration_days
- else:
new_datetime[i] = item_datetime[i] + duration_days
4) return an item for each iteration:
{id, item_datetime, new_datetime}
That is, there'll be something like:
item_id item_datetime new_datetime
1 7 dec 2016 10 dec 2016
2 11 dec 2016 15 dec 2016
3 20 dec 2016 30 dec 2016
4 2 jan 2017 22 feb 2017 <------- found because now() == Feb 5
5 22 feb 2017 21 may 2017
How can I do that? I think it needs something like a "fold" function. Can it be done with a single SQL query, or will it have to be a PL/pgSQL procedure for intermediate variable storage?
Or please give me pointers on how to calculate that.

If I understand your task correctly, you need a recursive query: take the first row, then process each following row in turn.
WITH RECURSIVE x AS (
    -- anchor: the earliest item, ending after its own duration
    SELECT *
    FROM (
        SELECT item_id,
               item_datetime,
               item_datetime + (item_duration_in_days::text || ' day')::interval AS cur_end
        FROM ti
        ORDER BY item_datetime
        LIMIT 1
    ) AS first

    UNION ALL

    -- recursive step: pick the next item by datetime and, if it overlaps,
    -- start it one day after the previous computed end
    SELECT item_id,
           cur_start,
           cur_start + (item_duration_in_days::text || ' day')::interval
    FROM (
        SELECT item_id,
               CASE WHEN item_datetime > prev_end THEN item_datetime
                    ELSE prev_end
               END AS cur_start,
               item_duration_in_days
        FROM (
            SELECT ti.item_id,
                   ti.item_datetime,
                   x.cur_end + '1 day'::interval AS prev_end,
                   item_duration_in_days
            FROM x
            JOIN ti ON (
                ti.item_id != x.item_id
                AND ti.item_datetime >= x.item_datetime
            )
            ORDER BY ti.item_datetime
            LIMIT 1
        ) AS a
    ) AS a
)
SELECT * FROM x;
Result:
item_id | item_datetime | cur_end
---------+---------------------+---------------------
1 | 2016-12-07 12:00:00 | 2016-12-10 12:00:00
2 | 2016-12-11 12:00:00 | 2016-12-15 12:00:00
3 | 2016-12-20 05:00:00 | 2016-12-30 05:00:00
4 | 2017-01-02 14:00:00 | 2017-02-21 14:00:00
5 | 2017-02-22 14:00:00 | 2017-05-22 14:00:00
(5 rows)
To see the current job:
....
) SELECT * FROM x WHERE item_datetime <= now() AND cur_end >= now();
item_id | item_datetime | cur_end
---------+---------------------+---------------------
4 | 2017-01-02 14:00:00 | 2017-02-21 14:00:00
(1 row)
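The question also asks whether a PL/pgSQL procedure would be needed for intermediate variable storage. It isn't required, but for reference here is a minimal PL/pgSQL sketch of the same fold, assuming the same ti table and mirroring the "+ 1 day" gap rule of the query above (the function name and output column names are illustrative):

CREATE OR REPLACE FUNCTION item_schedule()
RETURNS TABLE (item_id int, item_start timestamp, item_end timestamp) AS
$$
DECLARE
    r        record;
    prev_end timestamp := NULL;
BEGIN
    FOR r IN
        SELECT t.item_id, t.item_datetime, t.item_duration_in_days
        FROM ti t
        ORDER BY t.item_datetime
    LOOP
        -- push the start forward when it overlaps the previous computed end (+ 1 day, as above)
        item_start := GREATEST(r.item_datetime,
                               COALESCE(prev_end + interval '1 day', r.item_datetime));
        item_end   := item_start + r.item_duration_in_days * interval '1 day';
        item_id    := r.item_id;
        prev_end   := item_end;
        RETURN NEXT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- e.g. SELECT * FROM item_schedule() WHERE item_start <= now() AND item_end >= now();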

Related

Calculate running sum of previous 3 months from monthly aggregated data

I have a dataset that I have aggregated at monthly level. The next part needs me to take, for every block of 3 months, the sum of the data at monthly level.
So essentially my input data (after aggregated to monthly level) looks like:
| month | year | status | count_id |
|-------|------|--------|----------|
| 08    | 2021 | stat_1 | 1        |
| 09    | 2021 | stat_1 | 3        |
| 10    | 2021 | stat_1 | 5        |
| 11    | 2021 | stat_1 | 10       |
| 12    | 2021 | stat_1 | 10       |
| 01    | 2022 | stat_1 | 5        |
| 02    | 2022 | stat_1 | 20       |
and then I need my output data to look like:
| month | year | status | count_id | 3m_sum |
|-------|------|--------|----------|--------|
| 08    | 2021 | stat_1 | 1        | 1      |
| 09    | 2021 | stat_1 | 3        | 4      |
| 10    | 2021 | stat_1 | 5        | 8      |
| 11    | 2021 | stat_1 | 10       | 18     |
| 12    | 2021 | stat_1 | 10       | 25     |
| 01    | 2022 | stat_1 | 5        | 25     |
| 02    | 2022 | stat_1 | 20       | 35     |
i.e. 3m_sum for Feb = Feb + Jan + Dec. I tried to do this using a self join and wrote a query along the lines of:
WITH CTE AS (
    SELECT date_part('month', date_col) as month
         , date_part('year', date_col) as year
         , status
         , count(distinct id) as count_id
    FROM (date_col, status, transaction_id) as a
)
SELECT a.month, a.year, a.status, sum(b.count_id) as 3m_sum
from cte as a
left join cte as b on a.status = b.status
    and b.month >= a.month - 2 and b.month <= a.month
group by 1,2,3
This query NEARLY works. Where it falls apart is in Jan and Feb. My data runs from August 2021 to April 2022. That means the value for Jan should be Nov + Dec + Jan, and similarly the value for Feb should be Dec + Jan + Feb.
As I am joining on the MONTH, all the months from Aug to Nov are treated as values greater than the month of Jan/Feb, and so the query isn't doing the correct sum.
How can I adjust this bit to give the correct sum?
I did think of using a LAG function, but (even though I'm 99% sure a month won't ever be missed), I can't guarantee we will never have a month with 0 values, and therefore my LAG function will be summing the wrong rows.
I also tried doing the same join at the individual-date level (without aggregating in my nested query), but this gave vastly different numbers: I want the sum of the aggregation, and I think the row-level sum double-counted a lot of the rows that the COUNT DISTINCT removes.
You can use a SUM with a window frame of 2 PRECEDING. To ensure you don't miss rows, use a calendar table and left-join all the results to it.
SELECT *,
SUM(a.count_id) OVER (ORDER BY c.year, c.month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM Calendar c
LEFT JOIN a ON a.year = c.year AND a.month = c.month
WHERE c.year >= 2021 AND c.year <= 2022;
db<>fiddle
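The Calendar table above is assumed to already exist. As a minimal sketch (PostgreSQL syntax, hypothetical names), one way to build a month-level calendar covering the range is:

CREATE TABLE Calendar AS
SELECT date_part('year', d)::int  AS year,
       date_part('month', d)::int AS month
FROM generate_series(date '2021-01-01', date '2022-12-01', interval '1 month') AS g(d);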
You could also use LAG but you would need it twice.
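A sketch of that LAG variant, assuming no missing months (exactly the risk called out in the question) and the same table name a as above:

SELECT year, month, status, count_id,
       count_id
       + COALESCE(LAG(count_id, 1) OVER (PARTITION BY status ORDER BY year, month), 0)
       + COALESCE(LAG(count_id, 2) OVER (PARTITION BY status ORDER BY year, month), 0) AS sum_3m
FROM a;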
This should be @Charlieface's answer - only that I get one result different from what you put in your expected result table:
WITH
-- your input - and I avoid keywords like "MONTH" or "YEAR"
-- and also identifiers starting with digits are forbidden -
indata(mm,yy,status,count_id,sum_3m) AS (
SELECT 08,2021,'stat_1',1,1
UNION ALL SELECT 09,2021,'stat_1',3,4
UNION ALL SELECT 10,2021,'stat_1',5,8
UNION ALL SELECT 11,2021,'stat_1',10,18
UNION ALL SELECT 12,2021,'stat_1',10,25
UNION ALL SELECT 01,2022,'stat_1',5,25
UNION ALL SELECT 02,2022,'stat_1',20,35
)
SELECT
*
, SUM(count_id) OVER(
ORDER BY yy,mm
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS sum_3m_calc
FROM indata;
-- out mm | yy | status | count_id | sum_3m | sum_3m_calc
-- out ----+------+--------+----------+--------+-------------
-- out 8 | 2021 | stat_1 | 1 | 1 | 1
-- out 9 | 2021 | stat_1 | 3 | 4 | 4
-- out 10 | 2021 | stat_1 | 5 | 8 | 9
-- out 11 | 2021 | stat_1 | 10 | 18 | 18
-- out 12 | 2021 | stat_1 | 10 | 25 | 25
-- out 1 | 2022 | stat_1 | 5 | 25 | 25
-- out 2 | 2022 | stat_1 | 20 | 35 | 35

How to count employees per hour working in between hours?

How do I count the employees working in each hour, between their in-time and out-time?
I have the table below with each employee's in-time and out-time.
My table:
emp_reader_id att_date in_time out_time Shift_In_Time Shift_Out_Time
111 2020-03-01 2020-03-01 08:55:24.000 2020-03-01 10:26:56.000 09:00:00.0000000 10:30:00.0000000
112 2020-03-01 2020-03-01 08:45:49.000 2020-03-01 11:36:14.000 09:00:00.0000000 11:30:00.0000000
113 2020-03-01 2020-03-01 10:58:19.000 2020-03-01 13:36:31.000 09:00:00.0000000 12:00:00.0000000
I need to count the employees in the format below.
Expected Output:
Period Working Employee Count
0 - 1 0
1 - 2 0
2 - 3 0
3 - 4 0
4 - 5 0
5 - 6 0
6 - 7 0
7 - 8 0
8 - 9 2
9 - 10 2
10 - 11 3
11 - 12 2
12 - 13 1
13 - 14 1
14 - 15 0
15 - 16 0
16 - 17 0
17 - 18 0
18 - 19 0
19 - 20 0
20 - 21 0
21 - 22 0
22 - 23 0
23 - 0 0
I tried the query below against my raw data, but it doesn't work; I need it to work from the table above.
SELECT (DATENAME(hour, C.DT) + ' - ' + DATENAME(hour, DATEADD(hour, 2, C.DT))) as PERIOD,
       Count(C.EVENTID) as Emp_Work_On_Time
FROM trnevents C
WHERE convert(varchar(50), C.DT, 23) = '2020-03-01'
GROUP BY (DATENAME(hour, C.DT) + ' - ' + DATENAME(hour, DATEADD(hour, 2, C.DT)))
You need to have a list of hours (0 to 23) and then left join it to your table.
The following query uses a recursive CTE to generate that list. You could also use a VALUES constructor or a tally table, which gives the same effect (a tally-style sketch follows the demo below).
; with hours as
(
    select hour = 0
    union all
    select hour = hour + 1
    from hours
    where hour < 23
)
select convert(varchar(2), h.hour) + ' - ' + convert(varchar(2), (h.hour + 1) % 24) as [Period],
       count(t.emp_reader_id) as [Working Employee Count]
from hours h
left join timesheet t on h.hour >= datepart(hour, in_time)
                     and h.hour <= datepart(hour, out_time)
group by h.hour
Demo : db<>fiddle
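For reference, a sketch of the tally-style alternative mentioned above (SQL Server syntax; same join and grouping as before):

;with hours as
(
    select top (24) row_number() over (order by (select null)) - 1 as hour
    from sys.all_objects
)
select convert(varchar(2), h.hour) + ' - ' + convert(varchar(2), (h.hour + 1) % 24) as [Period],
       count(t.emp_reader_id) as [Working Employee Count]
from hours h
left join timesheet t on h.hour >= datepart(hour, in_time)
                     and h.hour <= datepart(hour, out_time)
group by h.hour
order by h.hour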
Hope this might help, but take a look at how the shift-in and shift-out times are handled in the code... it seems to me that's automatic, so it could have everything you need.
SELECT COUNT(Idemp) from aaShiftCountEmp WHERE in_time<'2020-03-01 09:00:00.000' AND out_time>'2020-03-01 10:00:00.000'
This is just an example for 9h to 10h, but you can automate it.
By the way, are you sure this shouldn't show the count of people per shift? I mean, are you sure you want 0-1, 1-2 instead of 0-1:30, 1:30-3, etc.?
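A sketch of automating that single-hour check over the whole day, reusing the aaShiftCountEmp table and Idemp column from the example above (the @day variable is illustrative):

declare @day date = '2020-03-01';

;with hours as
(
    select hour = 0
    union all
    select hour + 1 from hours where hour < 23
)
select convert(varchar(2), h.hour) + ' - ' + convert(varchar(2), (h.hour + 1) % 24) as [Period],
       count(e.Idemp) as [Working Employee Count]
from hours h
left join aaShiftCountEmp e
       on e.in_time  < dateadd(hour, h.hour + 1, cast(@day as datetime))
      and e.out_time > dateadd(hour, h.hour,     cast(@day as datetime))
group by h.hour
order by h.hour;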

Date Functions Conversions in SQL

Is there a way using T-SQL to convert an integer number of days to years, months and days?
E.g. 365 converts to 1 year, 0 months and 0 days;
366 converts to 1 year, 0 months and 1 day;
20 converts to 0 years, 0 months and 20 days;
200 converts to 0 years, 13 months and 9 days;
408 converts to 1 year, 3 months and 7 days; etc.
I don't know of any inbuilt way in SQL Server 2008, but the following logic will give you all the pieces you need to concatenate the items together:
select n
     , year(dateadd(day,n,0))-1900 y
     , month(dateadd(day,n,0))-1 m
     , day(dateadd(day,n,0))-1 d
from (
    select 365 n union all
    select 366 n union all
    select 20 n union all
    select 200 n union all
    select 408 n
) d
| n | y | m | d |
|-------|---|---|----|
| 365 | 1 | 0 | 0 |
| 366 | 1 | 0 | 1 |
| 20 | 0 | 0 | 20 |
| 200 | 0 | 6 | 19 |
| 408 | 1 | 1 | 12 |
Note that the zero used in the DATEADD function is the date 1900-01-01, hence 1900 is deducted from the year calculation.
Thanks to Martin Smith for correcting my assumption about the leap year.
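To actually concatenate those pieces into a label, a minimal sketch using the same DATEADD trick (the formatting is illustrative):

select n,
       cast(year(dateadd(day, n, 0)) - 1900 as varchar(10)) + ' years '
     + cast(month(dateadd(day, n, 0)) - 1 as varchar(10)) + ' months '
     + cast(day(dateadd(day, n, 0)) - 1 as varchar(10)) + ' days' as label
from (select 408 as n) d;    -- gives '1 years 1 months 12 days'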
You could also try it without any date functions, just with integer division, if we consider all months to be 30 days:
DECLARE @days INT;
SET @days = 365;
SELECT [Years] = @days / 365,
       [Months] = (@days % 365) / 30,
       [Days] = (@days % 365) % 30;
@days = 365
Years Months Days
1 0 0
@days = 20
Years Months Days
0 0 20

check date ranges with other date ranges

I have the table Distractions with the following columns:
id, startTime, endTime (possibly null)
I also have two parameters defining a period: pstart and pend.
I have to find all distractions for the period and count hours.
For example, we have:
Distractions:
| id | startTime        | endTime          |
|----|------------------|------------------|
| 1  | 01.01.2014 00:00 | 03.01.2014 00:00 |
| 2  | 25.03.2014 00:00 | 02.04.2014 00:00 |
| 3  | 27.03.2014 00:00 | null             |
The columns contain a time component, but it isn't used.
The period is pstart = 01.01.2014 and pend = 31.03.2014.
For the example above the result is:
for id = 1 - 72 hours
for id = 2 - 168 hours (7 days, from the 25th to the 31st - the end of the period)
for id = 3 - 120 hours (5 days, from the 27th to the 31st - the distraction is not completed, therefore take the end of the period)
and the sum is 360.
My code:
select
sum ((ds."endTime" - ds."startTime")*24) as hoursCount
from "Distractions" ds
--where ds."startTime" >= :pstart and ds."endTime" <= :pend
-- I don't know how to create where condition properly.
You'll have to take care of cases where date ranges fall outside the input range, and also account for starttime and endtime being null.
This WHERE clause should keep only the valid date ranges. I have substituted a null starttime with an earliest date and a null endtime with a date far in the future.
where coalesce(endtime,date'9999-12-31') >= :pstart
and coalesce(starttime,date'0000-01-01') <= :pend
Once you have filtered records, you need to adjust the date values so that anything starting before the input :pstart is moved forward to the :pstart,
and anything ending after :pend is moved back to :pend. Subtracting these two should give the value you are looking for. But, there is a catch. Since
the time is 00:00:00, when you subtract the dates, it will miss one full day. So, add 1 to it.
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table myt(
id number,
starttime date,
endtime date
);
insert into myt values( 1 ,date'2014-01-01', date'2014-01-03');
insert into myt values( 2 ,date'2014-03-25', date'2014-04-02');
insert into myt values( 3 ,date'2014-03-27', null);
insert into myt values( 4 ,null, date'2013-04-02');
insert into myt values( 5 ,date'2015-03-25', date'2015-04-02');
insert into myt values( 6 ,date'2013-12-25', date'2014-04-09');
insert into myt values( 7 ,date'2013-12-26', date'2014-01-09');
Query 1:
select id,
       case when coalesce(starttime, date'0000-01-01') < date'2014-01-01'
            then date'2014-01-01'
            else starttime
       end adj_starttime,
       case when coalesce(endtime, date'9999-12-31') > date'2014-03-31'
            then date'2014-03-31'
            else endtime
       end adj_endtime,
       (case when coalesce(endtime, date'9999-12-31') > date'2014-03-31'
             then date'2014-03-31'
             else endtime
        end
        -
        case when coalesce(starttime, date'0000-01-01') < date'2014-01-01'
             then date'2014-01-01'
             else starttime
        end
        + 1) * 24 hoursCount
from myt
where coalesce(endtime, date'9999-12-31') >= date'2014-01-01'
  and coalesce(starttime, date'0000-01-01') <= date'2014-03-31'
order by 1
Results:
| ID | ADJ_STARTTIME | ADJ_ENDTIME | HOURSCOUNT |
|----|--------------------------------|--------------------------------|------------|
| 1 | January, 01 2014 00:00:00+0000 | January, 03 2014 00:00:00+0000 | 72 |
| 2 | March, 25 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 168 |
| 3 | March, 27 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 120 |
| 6 | January, 01 2014 00:00:00+0000 | March, 31 2014 00:00:00+0000 | 2160 |
| 7 | January, 01 2014 00:00:00+0000 | January, 09 2014 00:00:00+0000 | 216 |

Sql Server 2012 - Group data by varying timeslots

I have some data to analyze which is at half-hour granularity, but I would like to group it by 2, 3, 6 and 12 hours, as well as 2 days and 1 week, to make more meaningful comparisons.
|DateTime | Value |
|01 Jan 2013 00:00 | 1 |
|01 Jan 2013 00:30 | 1 |
|01 Jan 2013 01:00 | 1 |
|01 Jan 2013 01:30 | 1 |
|01 Jan 2013 02:00 | 2 |
|01 Jan 2013 02:30 | 2 |
|01 Jan 2013 03:00 | 2 |
|01 Jan 2013 03:30 | 2 |
E.g. the 2-hour grouped result will be:
|DateTime | Value |
|01 Jan 2013 00:00 | 4 |
|01 Jan 2013 02:00 | 8 |
To get the 2-hourly grouped result, I thought of this code:
CASE
    WHEN DatePart(HOUR, DateTime) % 2 = 0 THEN
        CAST(CAST(DatePart(HOUR, DateTime) AS varchar) + ':00' AS TIME)
    ELSE
        CAST(CAST(DATEPART(HOUR, DateTime) AS Int) - 1 AS varchar) + ':00'
END Time
This seems to work all right, but I can't see how to use this approach to generalize to 3, 6 and 12 hours.
For 6 and 12 hours I could just use CASE statements and achieve the result, but is there any way to generalize so that I can achieve 2, 3, 6 and 12 hour granularity and also 2-day and week-level granularity? By generalize, I mean I could pass a variable with the desired granularity to the same query rather than having to write different queries with different CASE statements.
Is this possible? Please provide some pointers.
Thanks a lot!
I think you can use
Declare @Resolution int = 3 -- resolution in hours
Select
    DateAdd(Hour,
            DateDiff(Hour, 0, datetime) / @Resolution * @Resolution, -- integer arithmetic
            0) as bucket,
    Sum(value)
From
    table
Group By
    DateAdd(Hour,
            DateDiff(Hour, 0, datetime) / @Resolution * @Resolution, -- integer arithmetic
            0)
Order By
    bucket
This calculates the number of hours since a known fixed date, rounds down to the resolution size you're interested in, then adds them back on to the fixed date.
It will miss buckets, though, if you have no data in them (see the sketch after the fiddle link for one way to include them).
Example Fiddle
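If you need the empty buckets too, one option is to generate the bucket start times first and left-join the data onto them. A sketch, assuming a table MyTable(DateTime, Value) and a known date range (2 days and 1 week are simply @Resolution = 48 and 168):

Declare @Resolution int = 3;
Declare @From datetime = '2013-01-01', @To datetime = '2013-01-02';

;With Buckets As (
    -- generate bucket start times from @From up to (but not including) @To
    Select @From As bucket
    Union All
    Select DateAdd(Hour, @Resolution, bucket)
    From Buckets
    Where DateAdd(Hour, @Resolution, bucket) < @To
)
Select b.bucket,
       Coalesce(Sum(t.[Value]), 0) As total
From Buckets b
Left Join MyTable t
    On t.[DateTime] >= b.bucket
   And t.[DateTime] < DateAdd(Hour, @Resolution, b.bucket)
Group By b.bucket
Order By b.bucket
Option (MaxRecursion 0);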