dates in postgres - sql

I want to see how much time each client spends connected to our website per day.
My source table is created as below and contains the data shown.
CREATE TABLE source_ (
"nbr" numeric (10),
"begdate" timestamp,
"enddate" timestamp,
"str" varchar(35))
;
INSERT INTO source_
("nbr", "begdate", "enddate", "str")
VALUES
(111, '2019-11-25 07:00:00', '2019-11-25 08:00:00', 'TMP123'),
(222, '2019-03-01 12:04:02', '2019-03-01 12:05:02', 'SOC'),
(111, '2019-11-25 19:00:00', '2019-11-25 19:30:00', 'TMP12'),
(444, '2020-02-11 22:00:00', '2020-02-12 02:00:00', 'MARATEN'),
(444, '2020-02-11 23:00:00', '2020-02-12 01:00:00', 'MARA12'),
(444, '2020-02-12 13:00:00', '2020-02-12 14:00:00', 'MARA12'),
(444, '2020-02-12 07:00:00', '2020-02-12 08:00:00', 'MARA1222')
;
create table target_ (nbr numeric (10), date_ timestamp, state varchar(30), terms interval);
Below is my attempt. As you can see, I associated date_ (the day of the event) with begdate, which is not always correct: see the 4th row, where the event spans two days.
INSERT INTO target_
(nbr, date_, state, terms)
select
nbr,
DATE_TRUNC('day', begdate) as date_,
state,
sum(term) as terms
from (
select
nbr, begdate,
(case
when trim(str) ~ '^TMP' then 'TMP'
when trim(str) ~ '^MARA' then 'MARATEN'
else 'SOC'
end) as state,
(enddate - begdate)as term from source_ ) X
group by nbr, date_, state;
Expected output:
111 2019-11-25 00:00:00+00 TMP 90
222 2019-03-01 00:00:00+00 SOC 60
444 2020-02-11 00:00:00+00 MARATEN 180
444 2020-02-12 00:00:00+00 MARATEN 300

If I understand correctly, you can use generate_series() to expand the periods and then aggregate:
select s.nbr,
gs.dte,
(case when trim(str) ~ '^TMP' then 'TMP'
when trim(str) ~ '^MARA' then 'MARATEN'
else 'SOC'
end) as state,
sum( least(s.enddate, gs.dte + interval '1 day') - greatest(s.begdate, gs.dte)) as terms
from source_ s cross join lateral
generate_series(begdate::date, enddate::date, interval '1 day') gs(dte)
group by s.nbr, state, gs.dte
order by gs.dte, state;
Here is a db<>fiddle.
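To check the arithmetic outside the database, here is a minimal Python sketch of the same idea: cut each session at midnight and sum the pieces per day. The helper names (split_by_day, state_of, daily_totals) are made up for illustration, and only the nbr 444 rows from the question are used, since those are the ones that cross midnight.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def split_by_day(beg, end):
    """Yield (day, duration) pieces of [beg, end), cut at each midnight."""
    cur = beg
    while cur < end:
        next_midnight = datetime.combine(cur.date(), datetime.min.time()) + timedelta(days=1)
        piece_end = min(end, next_midnight)
        yield cur.date(), piece_end - cur
        cur = piece_end

def state_of(s):
    """Same mapping as the CASE expression in the query."""
    s = s.strip()
    if s.startswith("TMP"):
        return "TMP"
    if s.startswith("MARA"):
        return "MARATEN"
    return "SOC"

rows = [  # the nbr 444 rows from the question
    (444, "2020-02-11 22:00:00", "2020-02-12 02:00:00", "MARATEN"),
    (444, "2020-02-11 23:00:00", "2020-02-12 01:00:00", "MARA12"),
    (444, "2020-02-12 13:00:00", "2020-02-12 14:00:00", "MARA12"),
    (444, "2020-02-12 07:00:00", "2020-02-12 08:00:00", "MARA1222"),
]

def daily_totals(rows):
    totals = defaultdict(timedelta)
    for nbr, beg, end, s in rows:
        b, e = datetime.fromisoformat(beg), datetime.fromisoformat(end)
        for day, dur in split_by_day(b, e):
            totals[(nbr, day, state_of(s))] += dur
    return dict(totals)

totals = daily_totals(rows)
for (nbr, day, state), dur in sorted(totals.items()):
    print(nbr, day, state, int(dur.total_seconds() // 60))
# 444 2020-02-11 MARATEN 180
# 444 2020-02-12 MARATEN 300
```

These two lines agree with the 180 and 300 minutes in the expected output for nbr 444.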

Split Time in Seconds for each Hour in a day given start and end time in Redshift

Input: Table time has State and Two timestamps (start and end time) for each.
user state start_time end_time
1 Work 2022-08-15 11:00:38 2022-08-15 14:11:03
1 Break 2022-08-15 14:11:03 2022-08-15 14:25:25
1 Work 2022-08-15 14:25:25 2022-08-15 15:09:10
1 Work 2022-08-15 15:09:10 2022-08-15 15:14:15
1 Break 2022-08-15 15:14:15 2022-08-15 18:07:50
1 Work 2022-08-15 18:07:50 2022-08-15 19:25:31
1 Work 2022-08-15 19:25:31 2022-08-15 19:34:57
1 Work 2022-08-15 19:34:57 2022-08-15 20:10:57
1 Work 2022-08-15 20:10:57
Requirement: Find the total time spent on "Work" (in seconds) within each hour.
For example, if we need the time the user spent working between 7 and 8 PM, the output should be 3593 sec.
I was able to get this, but if we want the work time from 8 to 9 PM (which is about 10 min as per the output), the code below is unable to get it.
generate_series() doesn't seem to work on Redshift:
generate_series (start_time, end_time - interval '1 sec', interval '1 sec')
Code so far:
select user,start_time,
extract(epoch from (end_time-start_time)) as seconds
from times
where TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') >= '2022-08-15 19:00:00'
and TO_CHAR(end_time, 'YYYY-MM-DD HH24:MI:SS') < '2022-08-15 20:00:00'
and state = 'Work'
union all
select user,start_time,
extract(epoch from ((('2022-08-15 20:00:00')::timestamp)-
start_time)) as seconds
from times
where TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') >= '2022-08-15 19:00:00'
and TO_CHAR(end_time, 'YYYY-MM-DD HH24:MI:SS') >= '2022-08-15 19:00:00'
and TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI:SS') < '2022-08-15 20:00:00'
and state = 'Work'
A couple of things: you can replace generate_series() with a recursive CTE; since you want hourly totals, you only need to generate a series of hours; and it is not a good idea to use keywords like "user" as column names.
Also, it isn't clear what you want done with the last row, which has no end time: assume end of hour, end of day, or just ignore it. I went with ignore, but changing this is straightforward.
I'd approach it this way:
create table test (
"user" int,
state varchar(16),
start_time timestamp,
end_time timestamp);
insert into test values
(1, 'Work', '2022-08-15 11:00:38', '2022-08-15 14:11:03'),
(1, 'Break', '2022-08-15 14:11:03', '2022-08-15 14:25:25'),
(1, 'Work', '2022-08-15 14:25:25', '2022-08-15 15:09:10'),
(1, 'Work', '2022-08-15 15:09:10', '2022-08-15 15:14:15'),
(1, 'Break', '2022-08-15 15:14:15', '2022-08-15 18:07:50'),
(1, 'Work', '2022-08-15 18:07:50', '2022-08-15 19:25:31'),
(1, 'Work', '2022-08-15 19:25:31', '2022-08-15 19:34:57'),
(1, 'Work', '2022-08-15 19:34:57', '2022-08-15 20:10:57'),
(1, 'Work', '2022-08-15 20:10:57',NULL);
with recursive hours(hr) as (
select '2022-08-15 00:00:00'::timestamp as hr
union all
select hr + interval '1 hour'
from hours
where hr < '2022-08-15 23:00:00'),
relevants as (
select *
from test t
join hours h
on t.start_time < h.hr + interval '1 hour'
and t.end_time >= h.hr
where t.state = 'Work'),
within_hour as (
select "user", state,
case when start_time < hr
then hr
else start_time end as start_time,
case when end_time > hr + interval '1 hour'
then hr + interval '1 hour'
else end_time end as end_time
from relevants)
select *, extract(epoch from end_time) -
extract(epoch from start_time) as work_sec
from within_hour;
I broke the query out into CTE steps to view what is going on.
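For comparison, here is a rough Python sketch of the same clipping logic, summed per hour. The function name work_seconds_per_hour is hypothetical, and like the query it simply skips the open-ended last row.

```python
from collections import defaultdict
from datetime import datetime, timedelta

rows = [  # (state, start_time, end_time) from the question
    ("Work",  "2022-08-15 11:00:38", "2022-08-15 14:11:03"),
    ("Break", "2022-08-15 14:11:03", "2022-08-15 14:25:25"),
    ("Work",  "2022-08-15 14:25:25", "2022-08-15 15:09:10"),
    ("Work",  "2022-08-15 15:09:10", "2022-08-15 15:14:15"),
    ("Break", "2022-08-15 15:14:15", "2022-08-15 18:07:50"),
    ("Work",  "2022-08-15 18:07:50", "2022-08-15 19:25:31"),
    ("Work",  "2022-08-15 19:25:31", "2022-08-15 19:34:57"),
    ("Work",  "2022-08-15 19:34:57", "2022-08-15 20:10:57"),
    ("Work",  "2022-08-15 20:10:57", None),
]

def work_seconds_per_hour(rows, state="Work"):
    """Return {hour_start: seconds spent in `state` during that hour}."""
    totals = defaultdict(int)
    for st, start, end in rows:
        if st != state or end is None:   # ignore other states and the open row
            continue
        cur = datetime.fromisoformat(start)
        stop = datetime.fromisoformat(end)
        while cur < stop:
            hour_start = cur.replace(minute=0, second=0, microsecond=0)
            piece_end = min(stop, hour_start + timedelta(hours=1))
            totals[hour_start] += int((piece_end - cur).total_seconds())
            cur = piece_end
    return dict(totals)

per_hour = work_seconds_per_hour(rows)
print(per_hour[datetime(2022, 8, 15, 20)])  # 657, i.e. 20:00:00 to 20:10:57
```

The 8 to 9 PM bucket comes out at 657 seconds, which is the "about 10 min" the question mentions.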

Daily totals for sawtooth pattern local maxima

I have multiple monotonic counters that can be reset ad-hoc. These counters exhibit sawtooth behavior when graphed (however they are not strictly increasing). I want a monthly report showing daily sums of the maxima for each counter.
My strategy so far is to put a '1' on the rows where the counter is less than the previous sampling of the counter (also less than or equal to the next). Then calculate a running total on that column to identify series without resets.
Then I group over the daily intervals to calculate max-min for each series in the day, then sum those portions to get grand totals for the day.
What I have works, but it takes ~10s to run. The execution plan shows two big sorts: one in cteData and I think the other is in cteSeries. I feel like I should be able to eliminate one of them but I'm at a loss how to do it.
The result of this code is (which I can now see is actually skipping a sample across the interval boundary):
interval tagname total
2020-01-01 alpha 3
2020-01-01 bravo 4
2020-01-02 alpha 3
2020-01-02 bravo 4
IF OBJECT_ID('tempdb..#counter_data') IS NOT NULL
DROP TABLE #counter_data;
CREATE TABLE #counter_data(
t_stamp DATETIME NOT NULL
,tagname VARCHAR(32) NOT NULL
,val REAL NULL
,PRIMARY KEY(t_stamp, tagname)
);
INSERT INTO #counter_data(t_stamp, tagname, val)
VALUES
('2020-01-01 04:00', 'alpha', 0)
,('2020-01-01 04:00', 'bravo', 0)
,('2020-01-01 08:00', 'alpha', 1)
,('2020-01-01 08:00', 'bravo', 1)
,('2020-01-01 12:00', 'alpha', 2)
,('2020-01-01 12:00', 'bravo', 2)
,('2020-01-01 16:00', 'alpha', 0)
,('2020-01-01 16:00', 'bravo', 3)
,('2020-01-01 20:00', 'alpha', 1)
,('2020-01-01 20:00', 'bravo', 4)
,('2020-01-02 04:00', 'alpha', 2)
,('2020-01-02 04:00', 'bravo', 5)
,('2020-01-02 08:00', 'alpha', 3)
,('2020-01-02 08:00', 'bravo', 6)
,('2020-01-02 12:00', 'alpha', 0)
,('2020-01-02 12:00', 'bravo', 7)
,('2020-01-02 16:00', 'alpha', 1)
,('2020-01-02 16:00', 'bravo', 8)
,('2020-01-02 20:00', 'alpha', 2)
,('2020-01-02 20:00', 'bravo', 9)
;
DECLARE @dateStart AS DATETIME = '2020-01-01';
DECLARE @dateEnd AS DATETIME = DATEADD(month, 2, @dateStart);
WITH cteData AS(
SELECT
t_stamp
,tagname
,val
,CASE
WHEN val < LAG(val) OVER(PARTITION BY tagname ORDER BY t_stamp)
AND val <= LEAD(val) OVER(PARTITION BY tagname ORDER BY t_stamp)
THEN 1
ELSE 0
END AS rn
FROM #counter_data
WHERE
t_stamp >= @dateStart AND t_stamp < @dateEnd
AND tagname IN(
'alpha'
,'bravo'
)
)
,cteSeries AS(
SELECT
CAST(t_stamp AS DATE) AS interval
,tagname
,val
,SUM(rn) OVER(PARTITION BY tagname ORDER BY t_stamp) AS series
FROM cteData
)
,cteSubtotal AS(
SELECT
interval
,tagname
,MAX(val) - MIN(val) AS subtotal
FROM cteSeries
GROUP BY interval, tagname, series
)
,cteGrandTotal AS(
SELECT
interval
,tagname
,SUM(subtotal) AS total
FROM cteSubtotal
GROUP BY interval, tagname
)
SELECT *
FROM cteGrandTotal
ORDER BY interval, tagname
I would just calculate the increase of the counter in each row by comparing it to the previous row:
with cte
as
(
SELECT *,isnull(lag(val) over (partition by tagname order by t_stamp),0) as previousVal
FROM #counter_data
)
SELECT cast(t_stamp as date),tagname, sum(case when val>=previousVal then val-previousVal else val end )
FROM cte
GROUP BY cast(t_stamp as date),tagname;
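As a sanity check on the per-row delta idea, here is a small Python sketch run over the sample data from the question. The function name daily_increase is invented for the example. It treats a drop as a reset (the new value counts as the increase since zero) and, as an assumption on my part, treats a repeated value as zero increase.

```python
from collections import defaultdict

# Rebuild the sample rows: 5 samples per day, two tags.
stamps = [f"2020-01-0{d} {h:02d}:00" for d in (1, 2) for h in (4, 8, 12, 16, 20)]
alpha = [0, 1, 2, 0, 1, 2, 3, 0, 1, 2]
bravo = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
samples = [(t, "alpha", a) for t, a in zip(stamps, alpha)] + \
          [(t, "bravo", b) for t, b in zip(stamps, bravo)]

def daily_increase(samples):
    """Sum each row's increase over the previous sample of the same tag, per day."""
    prev = {}                      # last value seen per tag
    totals = defaultdict(int)
    for t_stamp, tag, val in sorted(samples):
        last = prev.get(tag, 0)    # like isnull(lag(val), 0)
        delta = val - last if val >= last else val   # a drop means a reset
        totals[(t_stamp[:10], tag)] += delta
        prev[tag] = val
    return dict(totals)

totals = daily_increase(samples)
for key in sorted(totals):
    print(key, totals[key])
# ('2020-01-01', 'alpha') 3
# ('2020-01-01', 'bravo') 4
# ('2020-01-02', 'alpha') 4
# ('2020-01-02', 'bravo') 5
```

Unlike the totals shown in the question, the second day picks up the increase across the midnight boundary, because each delta is attributed to the day of the later sample.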
This looks like a gaps-and-islands problem. I think that you want lag() to get the "previous" value and a conditional sum to compute the daily count.
select
tagname,
cast(t_stamp as date) t_date,
sum(case when val = lag_val + 1 then 1 else 0 end) total
from (
select
c.*,
lag(val) over(
partition by tagname, cast(t_stamp as date)
order by t_stamp
) lag_val
from #counter_data c
) c
group by tagname, cast(t_stamp as date)
order by t_date, tagname

Shift manipulation in SQL to get counts

I have attendance data in the following table, called Attendance.
EID is the employee ID. In the shift column, D denotes a day shift and N denotes a night shift.
Now I'm trying to get the following data for each employee:
No of Day shifts - count of D,
No of Night shifts - count of N,
No of Days worked - number of days on which an employee has worked either shift or both shifts (even if an employee worked both day and night on the same day, it is counted as one day).
I can get all three pieces of information in three different results, as follows:
WITH CTE (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID, COUNT(*) AS DayTotal
FROM CTE
WHERE (shift = 'D')
GROUP BY EID
SELECT EID, COUNT(*) AS NightTotal
FROM Attendance
WHERE (shift = 'N')
GROUP BY EID
;
WITH CTE2 (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID, COUNT ( DISTINCT CONVERT (DATE, in_time)) AS [Days]
FROM CTE2
WHERE (shift = 'D' OR shift = 'N')
GROUP BY EID
But I want to have this in single result (table). So I tried following query but it's not giving the intended output.
WITH CTE (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID,
CASE WHEN Shift = 'D' THEN COUNT(Shift) END AS [Day],
CASE WHEN Shift = 'N' THEN COUNT(Shift) END AS [Night],
COUNT ( DISTINCT CONVERT (DATE, in_time)) AS [Days]
FROM CTE
GROUP BY EID, shift
Could you please let me know a way to do this?
The intended result
I think you can get what you want using conditional aggregation:
SELECT EID,
sum(case when shift = 'd' then 1 else 0 end) as dayshifts,
sum(case when shift = 'n' then 1 else 0 end) as nightshifts,
count(*) as total
FROM Attendance a
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND
CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND
PID = 'A002'
GROUP BY EID;
EDIT:
If you want counts of distinct dates for the total, then use count(distinct):
SELECT EID,
sum(case when shift = 'd' then 1 else 0 end) as dayshifts,
sum(case when shift = 'n' then 1 else 0 end) as nightshifts,
count(distinct case when shift in ('d', 'n') then cast(in_time as date) end) as total
FROM Attendance a
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND
CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND
PID = 'A002'
GROUP BY EID;
WITH cte (eid, in_time, shift)
AS (SELECT eid,
in_time,
shift
FROM attendance
WHERE ( in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102)
AND
CONVERT(DATETIME,
'2014-07-31 00:00:00',
102
) )
AND pid = 'A002')
SELECT eid,
Sum(CASE
WHEN shift = 'D' THEN 1
ELSE 0
END) AS DayTotal,
Sum(CASE
WHEN shift = 'N' THEN 1
ELSE 0
END) AS NightTotal,
Count (DISTINCT CONVERT (DATE, in_time)) AS Days
FROM cte
GROUP BY eid
@Chathuranga, since the day and night shifts of the same day should be counted as one day, please let me know if the solution below works for you.
DECLARE @Attendance TABLE (EID INT,
PID CHAR(4),
In_Time DATETIME,
Out_Time DATETIME,
Shift CHAR(1))
INSERT INTO @Attendance
VALUES
('100', 'A001', '2014-07-01 07:00:00.000', '2014-07-01 19:30:00.000', 'D'),
('102', 'A001', '2014-07-01 19:30:00.000', '2014-07-02 07:00:00.000', 'N'),
('100', 'A001', '2014-07-01 19:30:00.000', '2014-07-02 07:00:00.000', 'N'),
('104', 'A001', '2014-07-02 07:00:00.000', '2014-07-02 19:30:00.000', 'D'),
('100', 'A001', '2014-07-03 19:30:00.000', '2014-07-04 07:00:00.000', 'N'),
('102', 'A001', '2014-07-03 19:30:00.000', '2014-07-04 07:00:00.000', 'N'),
('104', 'A001', '2014-07-03 07:00:00.000', '2014-07-03 19:30:15.000', 'D'),
('102', 'A001', '2014-07-04 07:00:00.000', '2014-07-04 19:30:00.000', 'D'),
('100', 'A001', '2014-07-04 07:00:00.000', '2014-07-04 19:30:10.000', 'D')
SELECT EID,
SUM(CASE
WHEN Shift = 'D' THEN 1
ELSE 0
END) AS DayShift,
SUM(CASE
WHEN Shift = 'N' THEN 1
ELSE 0
END) AS NightShift,
COUNT(DISTINCT CAST(In_Time AS DATE)) AS DayTotal
FROM @Attendance
GROUP BY EID
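To see what the conditional aggregation should return for this sample data, here is a small Python sketch of the same grouping. The function name summarize is hypothetical, and only the EID, In_Time and Shift columns are kept.

```python
from collections import defaultdict

attendance = [  # (EID, In_Time, Shift) from the sample rows above
    (100, "2014-07-01 07:00:00", "D"),
    (102, "2014-07-01 19:30:00", "N"),
    (100, "2014-07-01 19:30:00", "N"),
    (104, "2014-07-02 07:00:00", "D"),
    (100, "2014-07-03 19:30:00", "N"),
    (102, "2014-07-03 19:30:00", "N"),
    (104, "2014-07-03 07:00:00", "D"),
    (102, "2014-07-04 07:00:00", "D"),
    (100, "2014-07-04 07:00:00", "D"),
]

def summarize(rows):
    """Return {EID: (day_shifts, night_shifts, distinct_days)}."""
    stats = defaultdict(lambda: {"day": 0, "night": 0, "dates": set()})
    for eid, in_time, shift in rows:
        s = stats[eid]
        s["day"] += shift == "D"        # SUM(CASE WHEN Shift = 'D' THEN 1 ELSE 0 END)
        s["night"] += shift == "N"      # SUM(CASE WHEN Shift = 'N' THEN 1 ELSE 0 END)
        s["dates"].add(in_time[:10])    # COUNT(DISTINCT CAST(In_Time AS DATE))
    return {eid: (s["day"], s["night"], len(s["dates"])) for eid, s in stats.items()}

summary = summarize(attendance)
for eid in sorted(summary):
    print(eid, *summary[eid])
# 100 2 2 3
# 102 1 2 3
# 104 2 0 2
```

Employee 100 worked both shifts on 2014-07-01, so four shifts collapse to three distinct days.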

RANKING for counting phone numbers

Thanks for all your suggestions. It works fine on small test data, but I underestimated the total rows and plans, so I am posting some bigger sample data.
Here it is:
CREATE TABLE MasterTable
(`Date` datetime, `PhNO` int, `Plan_name` varchar(13), `Plan_price` varchar(3))
;
INSERT INTO MasterTable
(`Date`, `PhNO`, `Plan_name`, `Plan_price`)
VALUES
('2014-01-01 13:00:00', 3232222, 'Basepack1', '$32'),
('2014-01-01 13:00:00', 3232222, 'Basepack2', '$31'),
('2014-01-01 13:00:00', 3221111, 'Basepack6', '$21'),
('2014-01-01 13:00:00', 543222, 'BaseValuePack', '$76'),
('2014-01-01 13:00:00', 543222, 'Basepack1', '$30'),
('2014-01-01 13:00:00', 32322221, 'Basepack1', '$37'),
('2014-01-01 13:00:00', 32322221, 'Basepack2', '$21'),
('2014-01-01 13:00:00', 32322354, 'Basepack7', '$23'),
('2014-01-01 13:00:00', 32322254, 'Basepack8', '$11'),
('2014-01-01 13:00:00', 3267767, 'Non-base1', '$21'),
('2014-01-01 13:00:00', 3267762, 'Non-base1', '$21'),
('2014-01-01 13:00:00', 32677676, 'Non-base3', '$76'),
('2014-01-01 13:00:00', 5267767, 'Non-base9', '$21')
Now I basically want to group all the 'Non-base%' plans under 'Casual' category
Here is the desired output:
Date Plan_name Phone_no_count
'2014-01-01 13:00:00' 'Basepack1' 2
'2014-01-01 13:00:00' 'Basepack2' 0
'2014-01-01 13:00:00' 'Basepack6' 1
'2014-01-01 13:00:00' 'BaseValuepack' 1
'2014-01-01 13:00:00' 'Casual' 4
Thanks
...........................................
Previous request:
I need to count all the phone nos categorised by certain plans. But to group those plans, I need to rank them first based on their price.
Here is the ddl
CREATE TABLE MasterTable
(`Date` datetime, `PhNO` int, `Plan_name` varchar(13), `Plan_price` varchar(3))
;
INSERT INTO MasterTable
(`Date`, `PhNO`, `Plan_name`, `Plan_price`)
VALUES
('2014-01-01 13:00:00', 3232222, 'Basepack1', '$32'),
('2014-01-01 13:00:00', 3232222, 'Basepack2', '$31'),
('2014-01-01 13:00:00', 3221111, 'Basepack6', '$21'),
('2014-01-01 13:00:00', 543222, 'BaseValuePack', '$76'),
('2014-01-01 13:00:00', 543222, 'Basepack1', '$30'),
('2014-01-01 13:00:00', 32322221, 'Basepack1', '$37'),
('2014-01-01 13:00:00', 32322221, 'Basepack2', '$21')
;
Now the rule is: a phone number that has more than one plan should be counted only once, under the plan with the higher price.
But there is also a scenario where a phone number has two different packs, a Basepack and a value pack; in that case it should be counted only once, under the value pack (ignoring the price).
Here is the desired output.
Date Plan_name Phone_no_count
-------------------------------------------------
'2014-01-01 13:00:00' 'Basepack1' 2
'2014-01-01 13:00:00' 'Basepack2' 0
'2014-01-01 13:00:00' 'Basepack6' 1
'2014-01-01 13:00:00' 'BaseValuepack' 1
How do I use the rank function to achieve this result?
Try this code, assuming it is MySQL, as the posted code looks like MySQL:
SELECT date,
plan_name,
Sum((SELECT Count(1)
FROM mastertable T2
WHERE t2.phno = tt.phno
AND t2.date = Tt.date
AND ( tt.plan_price > t2.plan_price
OR TT.count = 1 ))) CNT
FROM (SELECT *,
(SELECT Count(1)
FROM mastertable Ti
WHERE ti.date = T.date
AND ti.phno = t.phno) AS Count
FROM mastertable T) Tt
GROUP BY date,
plan_name
Here is the SQL Fiddle Demo
;with cte as
(
select *,ROW_NUMBER()over(partition by phno order by plan_price desc)rn
from #MasterTable
)
,
cte1 as
(
select a.Plan_name,count(*)[Phone_no_count] from cte a
where rn=1
group by a.Plan_name
)
select distinct a.dates,a.Plan_name,isnull(b.Phone_no_count,0)[Phone_no_count]
from #MasterTable a left join cte1 b on a.Plan_name=b.Plan_name
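The ranking rule can be sketched in plain Python to check the desired output. This is only an illustration under two assumptions that are mine, not the question's: a plan whose name contains "value" always wins over any other plan, and prices are compared as integers after stripping the "$". The names rank_key and count_phones are made up.

```python
from collections import Counter

rows = [  # (PhNO, Plan_name, Plan_price) from the original request
    (3232222, "Basepack1", "$32"),
    (3232222, "Basepack2", "$31"),
    (3221111, "Basepack6", "$21"),
    (543222, "BaseValuePack", "$76"),
    (543222, "Basepack1", "$30"),
    (32322221, "Basepack1", "$37"),
    (32322221, "Basepack2", "$21"),
]

def rank_key(row):
    """Value packs rank first; otherwise the higher numeric price wins."""
    _, plan, price = row
    return ("value" in plan.lower(), int(price.lstrip("$")))

def count_phones(rows):
    best = {}                                   # top-ranked row per phone number
    for row in rows:
        ph = row[0]
        if ph not in best or rank_key(row) > rank_key(best[ph]):
            best[ph] = row
    counts = Counter({plan: 0 for _, plan, _ in rows})   # keep plans with no phones
    counts.update(plan for _, plan, _ in best.values())
    return dict(counts)

counts = count_phones(rows)
print(counts)
# {'Basepack1': 2, 'Basepack2': 0, 'Basepack6': 1, 'BaseValuePack': 1}
```

Comparing prices as numbers rather than as the stored strings matters once prices have different digit counts, since '$9' sorts above '$10' as text.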

Calculate working hours between 2 dates in PostgreSQL

I am developing an algorithm with Postgres (PL/pgSQL) and I need to calculate the number of working hours between 2 timestamps, taking into account that weekends are non-working days and the remaining days count only from 8am to 3pm (08:00 to 15:00).
Examples:
From Dec 3rd at 14:00 to Dec 4th at 09:00 should count 2 hours:
3rd = 1, 4th = 1
From Dec 3rd at 15:00 to Dec 7th at 08:00 should count 8 hours:
3rd = 0, 4th = 8, 5th = 0, 6th = 0, 7th = 0
It would be great to consider hour fractions as well.
According to your question working hours are: Mo–Fr, 08:00–15:00.
Rounded results
For just two given timestamps
Operating on units of 1 hour. Fractions are ignored, therefore not precise but simple:
SELECT count(*) AS work_hours
FROM generate_series (timestamp '2013-06-24 13:30'
, timestamp '2013-06-24 15:29' - interval '1h'
, interval '1h') h
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'
AND h::time <= '14:00';
The function generate_series() generates one row if the end is not before the start, and another row for every full given interval (1 hour). This would count every hour entered into. To ignore fractional hours, subtract 1 hour from the end. And don't count hours starting after 14:00.
Use the field pattern ISODOW instead of DOW for EXTRACT() to simplify expressions. Returns 7 instead of 0 for Sundays.
A simple (and very cheap) cast to time makes it easy to identify qualifying hours.
Fractions of an hour are ignored, even if fractions at begin and end of the interval would add up to an hour or more.
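The counting rule may be easier to follow as a loop. Below is a minimal Python sketch that mimics what the query does with generate_series(); count_work_hours is a hypothetical name, and the working window (Mon to Fri, slots starting 08:00 to 14:00) is hard coded as in the query.

```python
from datetime import datetime, time, timedelta

def count_work_hours(t_start, t_end):
    """Count 1-hour steps from t_start that begin Mon-Fri between 08:00 and 14:00."""
    step = timedelta(hours=1)
    last = t_end - step            # same trick as "t_end - interval '1h'"
    hours = 0
    h = t_start
    while h <= last:               # generate_series() includes the stop value
        if h.isoweekday() < 6 and time(8) <= h.time() <= time(14):
            hours += 1
        h += step
    return hours

# The example from the query above: only the slot starting 13:30 qualifies.
print(count_work_hours(datetime(2013, 6, 24, 13, 30), datetime(2013, 6, 24, 15, 29)))  # 1
# First example from the question: Thursday 14:00 and Friday 08:00 qualify.
print(count_work_hours(datetime(2009, 12, 3, 14, 0), datetime(2009, 12, 4, 9, 0)))     # 2
```

Slots are anchored at the start timestamp, not at the top of the hour, which is why partial hours at both ends are dropped.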
For a whole table
CREATE TABLE t (t_id int PRIMARY KEY, t_start timestamp, t_end timestamp);
INSERT INTO t VALUES
(1, '2009-12-03 14:00', '2009-12-04 09:00')
, (2, '2009-12-03 15:00', '2009-12-07 08:00') -- examples in question
, (3, '2013-06-24 07:00', '2013-06-24 12:00')
, (4, '2013-06-24 12:00', '2013-06-24 23:00')
, (5, '2013-06-23 13:00', '2013-06-25 11:00')
, (6, '2013-06-23 14:01', '2013-06-24 08:59') -- max. fractions at begin and end
;
Query:
SELECT t_id, count(*) AS work_hours
FROM (
SELECT t_id, generate_series (t_start, t_end - interval '1h', interval '1h') AS h
FROM t
) sub
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'
AND h::time <= '14:00'
GROUP BY 1
ORDER BY 1;
db<>fiddle here
Old sqlfiddle
More precision
To get more precision you can use smaller time units. 5-minute slices for instance:
SELECT t_id, count(*) * interval '5 min' AS work_interval
FROM (
SELECT t_id, generate_series (t_start, t_end - interval '5 min', interval '5 min') AS h
FROM t
) sub
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'
AND h::time <= '14:55' -- 15.00 - interval '5 min'
GROUP BY 1
ORDER BY 1;
The smaller the unit the higher the cost.
Cleaner with LATERAL in Postgres 9.3+
In combination with the new LATERAL feature in Postgres 9.3, the above query can then be written as:
1-hour precision:
SELECT t.t_id, h.work_hours
FROM t
LEFT JOIN LATERAL (
SELECT count(*) AS work_hours
FROM generate_series (t.t_start, t.t_end - interval '1h', interval '1h') h
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'
AND h::time <= '14:00'
) h ON TRUE
ORDER BY 1;
5-minute precision:
SELECT t.t_id, h.work_interval
FROM t
LEFT JOIN LATERAL (
SELECT count(*) * interval '5 min' AS work_interval
FROM generate_series (t.t_start, t.t_end - interval '5 min', interval '5 min') h
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= '08:00'
AND h::time <= '14:55'
) h ON TRUE
ORDER BY 1;
This has the additional advantage that intervals containing zero working hours are not excluded from the result like in the above versions.
More about LATERAL:
Find most common elements in array with a group by
Insert multiple rows in one table based on number in another table
Exact results
Postgres 8.4+
Or you deal with start and end of the time frame separately to get exact results to the microsecond. Makes the query more complex, but cheaper and exact:
WITH var AS (SELECT '08:00'::time AS v_start
, '15:00'::time AS v_end)
SELECT t_id
, COALESCE(h.h, '0') -- add / subtract fractions
- CASE WHEN EXTRACT(ISODOW FROM t_start) < 6
AND t_start::time > v_start
AND t_start::time < v_end
THEN t_start - date_trunc('hour', t_start)
ELSE '0'::interval END
+ CASE WHEN EXTRACT(ISODOW FROM t_end) < 6
AND t_end::time > v_start
AND t_end::time < v_end
THEN t_end - date_trunc('hour', t_end)
ELSE '0'::interval END AS work_interval
FROM t CROSS JOIN var
LEFT JOIN ( -- count full hours, similar to above solutions
SELECT t_id, count(*)::int * interval '1h' AS h
FROM (
SELECT t_id, v_start, v_end
, generate_series (date_trunc('hour', t_start)
, date_trunc('hour', t_end) - interval '1h'
, interval '1h') AS h
FROM t, var
) sub
WHERE EXTRACT(ISODOW FROM h) < 6
AND h::time >= v_start
AND h::time <= v_end - interval '1h'
GROUP BY 1
) h USING (t_id)
ORDER BY 1;
db<>fiddle here
Old sqlfiddle
Postgres 9.2+ with tsrange
The new range types offer a more elegant solution for exact results in combination with the intersection operator *:
Simple function for time ranges spanning only one day:
CREATE OR REPLACE FUNCTION f_worktime_1day(_start timestamp, _end timestamp)
RETURNS interval
LANGUAGE sql IMMUTABLE AS
$func$ -- _start & _end within one calendar day! - you may want to check ...
SELECT CASE WHEN extract(ISODOW from _start) < 6 THEN (
SELECT COALESCE(upper(h) - lower(h), '0')
FROM (
SELECT tsrange '[2000-1-1 08:00, 2000-1-1 15:00)' -- hours hard coded
* tsrange( '2000-1-1'::date + _start::time
, '2000-1-1'::date + _end::time ) AS h
) sub
) ELSE '0' END
$func$;
If your ranges never span multiple days, that's all you need.
Else, use this wrapper function to deal with any interval:
CREATE OR REPLACE FUNCTION f_worktime(_start timestamp
, _end timestamp
, OUT work_time interval)
LANGUAGE plpgsql IMMUTABLE AS
$func$
BEGIN
CASE _end::date - _start::date -- spanning how many days?
WHEN 0 THEN -- all in one calendar day
work_time := f_worktime_1day(_start, _end);
WHEN 1 THEN -- wrap around midnight once
work_time := f_worktime_1day(_start, NULL)
+ f_worktime_1day(_end::date, _end);
ELSE -- multiple days
work_time := f_worktime_1day(_start, NULL)
+ f_worktime_1day(_end::date, _end)
+ (SELECT count(*) * interval '7:00' -- workday hard coded!
FROM generate_series(_start::date + 1
, _end::date - 1, '1 day') AS t
WHERE extract(ISODOW from t) < 6);
END CASE;
END
$func$;
Call:
SELECT t_id, f_worktime(t_start, t_end) AS worktime
FROM t
ORDER BY 1;
db<>fiddle here
Old sqlfiddle
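The range-intersection idea behind the exact solutions can also be sketched in Python: for each weekday touched by the interval, intersect it with that day's working window and add up the overlaps. This is not a translation of the functions above, just an illustration under the same hard-coded hours; work_time is a made-up name.

```python
from datetime import datetime, time, timedelta

WORK_START, WORK_END = time(8), time(15)   # Mon-Fri 08:00-15:00, as in the question

def work_time(t_start, t_end):
    """Exact working time between two timestamps as a timedelta."""
    total = timedelta(0)
    day = t_start.date()
    while day <= t_end.date():
        if day.isoweekday() < 6:                       # skip Saturday and Sunday
            lo = max(t_start, datetime.combine(day, WORK_START))
            hi = min(t_end, datetime.combine(day, WORK_END))
            if hi > lo:                                # non-empty intersection
                total += hi - lo
        day += timedelta(days=1)
    return total

print(work_time(datetime(2009, 12, 3, 14, 0), datetime(2009, 12, 4, 9, 0)))    # 2:00:00
print(work_time(datetime(2013, 6, 23, 14, 1), datetime(2013, 6, 24, 8, 59)))   # 0:59:00
```

The second call is row 6 of the test table: the hourly count drops it entirely, while the exact calculation keeps the 59 minutes worked on Monday morning.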
How about this: create a small table with 24*7 rows, one row for each hour in a week.
CREATE TABLE hours (
hour timestamp not null,
is_working boolean not null
);
INSERT INTO hours (hour, is_working) VALUES
('2009-11-2 00:00:00', false),
('2009-11-2 01:00:00', false),
. . .
('2009-11-2 08:00:00', true),
. . .
('2009-11-2 15:00:00', true),
('2009-11-2 16:00:00', false),
. . .
('2009-11-2 23:00:00', false);
Likewise add 24 rows for each of the other days. It doesn't matter what year or month you give, as you'll see in a moment. You just need to represent all seven days of the week.
SELECT t.id, t.start, t.end, SUM(CASE WHEN h.is_working THEN 1 ELSE 0 END) AS hours_worked
FROM mytable t JOIN hours h
ON (EXTRACT(DOW FROM h.hour) BETWEEN EXTRACT(DOW FROM t.start)
AND EXTRACT(DOW FROM t.end))
AND (EXTRACT(DOW FROM h.hour) > EXTRACT(DOW FROM t.start)
OR EXTRACT(HOUR FROM h.hour) >= EXTRACT(HOUR FROM t.start))
AND (EXTRACT(DOW FROM h.hour) < EXTRACT(DOW FROM t.end)
OR EXTRACT(HOUR FROM h.hour) <= EXTRACT(HOUR FROM t.end))
GROUP BY t.id, t.start, t.end;
The following functions take these inputs:
working start time of the day
working end time of the day
start time
end time
-- helper function
CREATE OR REPLACE FUNCTION get_working_time_in_a_day(sdt TIMESTAMP, edt TIMESTAMP, swt TIME, ewt TIME) RETURNS INT AS
$$
DECLARE
sd TIMESTAMP; ed TIMESTAMP; swdt TIMESTAMP; ewdt TIMESTAMP; seconds INT;
BEGIN
swdt = sdt::DATE || ' ' || swt; -- work start datetime for a day
ewdt = sdt::DATE || ' ' || ewt; -- work end datetime for a day
IF (sdt < swdt AND edt <= swdt) -- case 1 and 2
THEN
seconds = 0;
END IF;
IF (sdt < swdt AND edt > swdt AND edt <= ewdt) -- case 3 and 4
THEN
seconds = EXTRACT(EPOCH FROM (edt - swdt));
END IF;
IF (sdt < swdt AND edt > swdt AND edt > ewdt) -- case 5
THEN
seconds = EXTRACT(EPOCH FROM (ewdt - swdt));
END IF;
IF (sdt = swdt AND edt > swdt AND edt <= ewdt) -- case 6 and 7
THEN
seconds = EXTRACT(EPOCH FROM (edt - sdt));
END IF;
IF (sdt = swdt AND edt > ewdt) -- case 8
THEN
seconds = EXTRACT(EPOCH FROM (ewdt - sdt));
END IF;
IF (sdt > swdt AND edt <= ewdt) -- case 9 and 10
THEN
seconds = EXTRACT(EPOCH FROM (edt - sdt));
END IF;
IF (sdt > swdt AND sdt < ewdt AND edt > ewdt) -- case 11
THEN
seconds = EXTRACT(EPOCH FROM (ewdt - sdt));
END IF;
IF (sdt >= ewdt AND edt > ewdt) -- case 12 and 13
THEN
seconds = 0;
END IF;
RETURN seconds;
END;
$$
LANGUAGE plpgsql;
-- Get work time difference
CREATE OR REPLACE FUNCTION get_working_time(sdt TIMESTAMP, edt TIMESTAMP, swt TIME, ewt TIME) RETURNS INT AS
$$
DECLARE
seconds INT = 0;
strst VARCHAR(9) = ' 00:00:00';
stret VARCHAR(9) = ' 23:59:59';
tend TIMESTAMP; tempEdt TIMESTAMP;
x int;
BEGIN
<<test>>
WHILE sdt <= edt LOOP
tend = sdt::DATE || stret; -- get the false end datetime for start time
IF edt >= tend
THEN
tempEdt = tend;
ELSE
tempEdt = edt;
END IF;
-- skip saturday and sunday
x = EXTRACT(DOW FROM sdt);
if (x > 0 AND x < 6)
THEN
seconds = seconds + get_working_time_in_a_day(sdt, tempEdt, swt, ewt);
ELSE
-- RAISE NOTICE 'MISSED A DAY';
END IF;
sdt = (sdt + (INTERVAL '1 DAY'))::DATE || strst;
END LOOP test;
--RAISE NOTICE 'diff in minutes = %', (seconds / 60);
RETURN seconds;
END;
$$
LANGUAGE plpgsql;
-- Table Definition
DROP TABLE IF EXISTS test_working_time;
CREATE TABLE test_working_time(
pk SERIAL PRIMARY KEY,
start_datetime TIMESTAMP,
end_datetime TIMESTAMP,
start_work_time TIME,
end_work_time TIME
);
-- Test data insertion
INSERT INTO test_working_time VALUES
(1, '2015-11-03 01:00:00', '2015-11-03 07:00:00', '08:00:00', '22:00:00'),
(2, '2015-11-03 01:00:00', '2015-11-04 07:00:00', '08:00:00', '22:00:00'),
(3, '2015-11-03 01:00:00', '2015-11-05 07:00:00', '08:00:00', '22:00:00'),
(4, '2015-11-03 01:00:00', '2015-11-06 07:00:00', '08:00:00', '22:00:00'),
(5, '2015-11-03 01:00:00', '2015-11-07 07:00:00', '08:00:00', '22:00:00'),
(6, '2015-11-03 01:00:00', '2015-11-03 08:00:00', '08:00:00', '22:00:00'),
(7, '2015-11-03 01:00:00', '2015-11-04 08:00:00', '08:00:00', '22:00:00'),
(8, '2015-11-03 01:00:00', '2015-11-05 08:00:00', '08:00:00', '22:00:00'),
(9, '2015-11-03 01:00:00', '2015-11-06 08:00:00', '08:00:00', '22:00:00'),
(10, '2015-11-03 01:00:00', '2015-11-07 08:00:00', '08:00:00', '22:00:00'),
(11, '2015-11-03 01:00:00', '2015-11-03 11:00:00', '08:00:00', '22:00:00'),
(12, '2015-11-03 01:00:00', '2015-11-04 11:00:00', '08:00:00', '22:00:00'),
(13, '2015-11-03 01:00:00', '2015-11-05 11:00:00', '08:00:00', '22:00:00'),
(14, '2015-11-03 01:00:00', '2015-11-06 11:00:00', '08:00:00', '22:00:00'),
(15, '2015-11-03 01:00:00', '2015-11-07 11:00:00', '08:00:00', '22:00:00'),
(16, '2015-11-03 01:00:00', '2015-11-03 22:00:00', '08:00:00', '22:00:00'),
(17, '2015-11-03 01:00:00', '2015-11-04 22:00:00', '08:00:00', '22:00:00'),
(18, '2015-11-03 01:00:00', '2015-11-05 22:00:00', '08:00:00', '22:00:00'),
(19, '2015-11-03 01:00:00', '2015-11-06 22:00:00', '08:00:00', '22:00:00'),
(20, '2015-11-03 01:00:00', '2015-11-07 22:00:00', '08:00:00', '22:00:00'),
(21, '2015-11-03 01:00:00', '2015-11-03 23:00:00', '08:00:00', '22:00:00'),
(22, '2015-11-03 01:00:00', '2015-11-04 23:00:00', '08:00:00', '22:00:00'),
(23, '2015-11-03 01:00:00', '2015-11-05 23:00:00', '08:00:00', '22:00:00'),
(24, '2015-11-03 01:00:00', '2015-11-06 23:00:00', '08:00:00', '22:00:00'),
(25, '2015-11-03 01:00:00', '2015-11-07 23:00:00', '08:00:00', '22:00:00'),
(26, '2015-11-03 08:00:00', '2015-11-03 11:00:00', '08:00:00', '22:00:00'),
(27, '2015-11-03 08:00:00', '2015-11-04 11:00:00', '08:00:00', '22:00:00'),
(28, '2015-11-03 08:00:00', '2015-11-05 11:00:00', '08:00:00', '22:00:00'),
(29, '2015-11-03 08:00:00', '2015-11-06 11:00:00', '08:00:00', '22:00:00'),
(30, '2015-11-03 08:00:00', '2015-11-07 11:00:00', '08:00:00', '22:00:00'),
(31, '2015-11-03 08:00:00', '2015-11-03 22:00:00', '08:00:00', '22:00:00'),
(32, '2015-11-03 08:00:00', '2015-11-04 22:00:00', '08:00:00', '22:00:00'),
(33, '2015-11-03 08:00:00', '2015-11-05 22:00:00', '08:00:00', '22:00:00'),
(34, '2015-11-03 08:00:00', '2015-11-06 22:00:00', '08:00:00', '22:00:00'),
(35, '2015-11-03 08:00:00', '2015-11-07 22:00:00', '08:00:00', '22:00:00'),
(36, '2015-11-03 08:00:00', '2015-11-03 23:00:00', '08:00:00', '22:00:00'),
(37, '2015-11-03 08:00:00', '2015-11-04 23:00:00', '08:00:00', '22:00:00'),
(38, '2015-11-03 08:00:00', '2015-11-05 23:00:00', '08:00:00', '22:00:00'),
(39, '2015-11-03 08:00:00', '2015-11-06 23:00:00', '08:00:00', '22:00:00'),
(40, '2015-11-03 08:00:00', '2015-11-07 23:00:00', '08:00:00', '22:00:00'),
(41, '2015-11-03 12:00:00', '2015-11-03 18:00:00', '08:00:00', '22:00:00'),
(42, '2015-11-03 12:00:00', '2015-11-04 18:00:00', '08:00:00', '22:00:00'),
(43, '2015-11-03 12:00:00', '2015-11-05 18:00:00', '08:00:00', '22:00:00'),
(44, '2015-11-03 12:00:00', '2015-11-06 18:00:00', '08:00:00', '22:00:00'),
(45, '2015-11-03 12:00:00', '2015-11-07 18:00:00', '08:00:00', '22:00:00'),
(46, '2015-11-03 12:00:00', '2015-11-03 22:00:00', '08:00:00', '22:00:00'),
(47, '2015-11-03 12:00:00', '2015-11-04 22:00:00', '08:00:00', '22:00:00'),
(48, '2015-11-03 12:00:00', '2015-11-05 22:00:00', '08:00:00', '22:00:00'),
(49, '2015-11-03 12:00:00', '2015-11-06 22:00:00', '08:00:00', '22:00:00'),
(50, '2015-11-03 12:00:00', '2015-11-07 22:00:00', '08:00:00', '22:00:00'),
(51, '2015-11-03 12:00:00', '2015-11-03 23:00:00', '08:00:00', '22:00:00'),
(52, '2015-11-03 12:00:00', '2015-11-04 23:00:00', '08:00:00', '22:00:00'),
(53, '2015-11-03 12:00:00', '2015-11-05 23:00:00', '08:00:00', '22:00:00'),
(54, '2015-11-03 12:00:00', '2015-11-06 23:00:00', '08:00:00', '22:00:00'),
(55, '2015-11-03 12:00:00', '2015-11-07 23:00:00', '08:00:00', '22:00:00'),
(56, '2015-11-03 22:00:00', '2015-11-03 23:00:00', '08:00:00', '22:00:00'),
(57, '2015-11-03 22:00:00', '2015-11-04 23:00:00', '08:00:00', '22:00:00'),
(58, '2015-11-03 22:00:00', '2015-11-05 23:00:00', '08:00:00', '22:00:00'),
(59, '2015-11-03 22:00:00', '2015-11-06 23:00:00', '08:00:00', '22:00:00'),
(60, '2015-11-03 22:00:00', '2015-11-07 23:00:00', '08:00:00', '22:00:00'),
(61, '2015-11-03 22:30:00', '2015-11-03 23:30:00', '08:00:00', '22:00:00'),
(62, '2015-11-03 22:30:00', '2015-11-04 23:30:00', '08:00:00', '22:00:00'),
(63, '2015-11-03 22:30:00', '2015-11-05 23:30:00', '08:00:00', '22:00:00'),
(64, '2015-11-03 22:30:00', '2015-11-06 23:30:00', '08:00:00', '22:00:00'),
(65, '2015-11-03 22:30:00', '2015-11-07 23:30:00', '08:00:00', '22:00:00');
-- select query to get work time difference
SELECT
start_datetime,
end_datetime,
start_work_time,
end_work_time,
get_working_time(start_datetime, end_datetime, start_work_time, end_work_time) AS diff_in_seconds
FROM
test_working_time;
This gives the working time only, in seconds, between the start and end datetimes.
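As a side note, the per-day case analysis in the helper function looks like it can be expressed as a single clamped overlap: take the later of the two starts, the earlier of the two ends, and floor the difference at zero. Here is a minimal Python sketch of that idea; working_seconds_in_day is a hypothetical name, and it assumes, like the helper, that both timestamps fall on the same calendar day.

```python
from datetime import datetime, time

def working_seconds_in_day(sdt, edt, swt, ewt):
    """Seconds of [sdt, edt] that fall inside the working window [swt, ewt] of sdt's day."""
    swdt = datetime.combine(sdt.date(), swt)   # work start datetime for the day
    ewdt = datetime.combine(sdt.date(), ewt)   # work end datetime for the day
    overlap = (min(edt, ewdt) - max(sdt, swdt)).total_seconds()
    return max(0, int(overlap))                # no overlap gives 0, never negative

day = datetime(2015, 11, 3)
w_start, w_end = time(8), time(22)
print(working_seconds_in_day(day.replace(hour=1), day.replace(hour=7), w_start, w_end))    # 0
print(working_seconds_in_day(day.replace(hour=1), day.replace(hour=11), w_start, w_end))   # 10800
print(working_seconds_in_day(day.replace(hour=12), day.replace(hour=23), w_start, w_end))  # 36000
```

In SQL the same clamp would be GREATEST(0, ...) over LEAST(edt, ewdt) - GREATEST(sdt, swdt), which also avoids the helper returning NULL when none of its IF branches match.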