I have some phone call data in an MSSQL 2008 database and would like to split it into 15 (or X) minute intervals to be used in some Erlang calculations.
Call log:
call start end
1 2011-01-01 12:00:01 2011-01-01 12:16:00
2 2011-01-01 12:14:00 2011-01-01 12:17:30
3 2011-01-01 12:29:30 2011-01-01 12:46:20
Would be shown as
call start end
1 2011-01-01 12:00:01 2011-01-01 12:15:00
1 2011-01-01 12:15:00 2011-01-01 12:16:00
2 2011-01-01 12:14:00 2011-01-01 12:15:00
2 2011-01-01 12:15:00 2011-01-01 12:17:30
3 2011-01-01 12:29:30 2011-01-01 12:30:00
3 2011-01-01 12:30:00 2011-01-01 12:45:00
3 2011-01-01 12:45:00 2011-01-01 12:46:20
Does anyone have any good suggestions on how to do this?
Thanks in advance
Sample table
create table CallLogTable (call int, start datetime, [end] datetime)
insert CallLogTable select
1, '2011-01-01 12:00:01', '2011-01-01 12:16:00' union all select
2, '2011-01-01 12:14:00', '2011-01-01 12:17:30' union all select
3, '2011-01-01 12:29:30', '2011-01-01 12:46:20'
The query
select
call,
case when st < start then start else st end [start],
case when et > [end] then [end] else et end [end]
from (select *,
xstart = dateadd(mi, 15*(datediff(mi, 0, d.start)/15), 0),
blocks = datediff(mi, d.[start], d.[end])/15+2
from CallLogTable d) d
cross apply (
select
st = dateadd(mi,v.number*15,xstart),
et = dateadd(mi,v.number*15+15,xstart)
from master..spt_values v
where v.type='P' and v.number <= d.blocks
and d.[end] > dateadd(mi,v.number*15,xstart)) v
order by call, start
If you create a view from this query, drop the final order by line.
Notes
The expression for xstart, dateadd(mi, 15*(datediff(mi, 0, d.start)/15), 0), calculates the 15-minute boundary on which the call started.
blocks is pre-calculated as a quick cutoff so that no more rows than necessary are read from spt_values.
cross apply lets each row of the outer table drive the subquery; the subquery builds every 15-minute block covering the call.
The case expressions clamp the first and last blocks to the actual start and end times when those fall inside the 15-minute boundaries.
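The flooring-and-clamping logic behind this query can be sketched outside SQL as well. Here is a minimal Python sketch of the same idea (the function name is mine, and it assumes the block size divides 60):

```python
from datetime import datetime, timedelta

def split_into_blocks(start, end, minutes=15):
    """Split [start, end) into sub-intervals aligned to `minutes` boundaries."""
    step = timedelta(minutes=minutes)
    # Floor `start` down to the nearest block boundary (the query's xstart).
    floored = start - timedelta(minutes=start.minute % minutes,
                                seconds=start.second,
                                microseconds=start.microsecond)
    blocks = []
    cursor = floored
    while cursor < end:
        block_start = max(cursor, start)      # clamp first block to actual start
        block_end = min(cursor + step, end)   # clamp last block to actual end
        blocks.append((block_start, block_end))
        cursor += step
    return blocks
```

Running it on call 3 from the question reproduces the three expected rows.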
Richard is right, this query splits the calls into 15 minute intervals:
Try this:
With CallData ([call],start,[end]) as
(
select [call],start,case when [end]<=dateadd(minute,15,start) then [end] else dateadd(minute,15,start) end as [end] from CallLogTable
union all
select CallData.[call],CallData.[end],case when CallLogTable.[end]<=dateadd(minute,15,CallData.[end]) then CallLogTable.[end] else dateadd(minute,15,CallData.[end]) end as [end] from CallLogTable join CallData on CallLogTable.[call]=CallData.[call]
where CallData.[end]<case when CallLogTable.[end]<=dateadd(minute,15,CallData.[end]) then CallLogTable.[end] else dateadd(m,15,CallData.[end]) end
)
select * from CallData
Unfortunately I do not have a SQL instance at hand, so I cannot test it. The idea should be clear, though, so you can probably adjust it if it fails somewhere.
I added the aliases; the mistake was using m instead of minute in one dateadd. Can you try it to see if it works? Thanks. (That is what happens when no testing is done.)
To split it at 15-minute boundaries (00/15/30/45) you can use this:
With CallData ([call],start,[end]) as
(
select [call],start,case when [end]<=dateadd(minute,15*((datediff(minute,0,start)/15)+1),0) then [end] else dateadd(minute,15*((datediff(minute,0,start)/15)+1),0) end as [end] from CallLogTable
union all
select CallData.[call],CallData.[end],case when CallLogTable.[end]<=dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) then CallLogTable.[end] else dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) end as [end] from CallLogTable join CallData on CallLogTable.[call]=CallData.[call]
where CallData.[end]<case when CallLogTable.[end]<=dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) then CallLogTable.[end] else dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) end
)
select * from CallData order by [call],start
Fascinating problem!
Just for kicks, here's a PostgreSQL approach, using generate_series() to fill out the interior 15-minute intervals. There's undoubtedly a way to coalesce the first two unions that build the first and last intervals, but that is left as an exercise for the reader.
select
c.call
,c.dt_start - date_trunc('day', c.dt_start) as "begin"
,(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60) as "end"
from
call c
where
(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60)
<= date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60)
union select
c.call
,greatest(
c.dt_start - date_trunc('day', c.dt_start),
date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60)
) as "t_last_q"
,c.dt_end - date_trunc('day', c.dt_end) as "t_end"
from
call c
union select TQ.call, TQ.t_next_q, SEQ.SLICE
from
(select cast(g || ' seconds' as interval) as SLICE
from generate_series(0, 86400, 15*60) g) SEQ,
(select
c.call
,(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60) as "t_next_q"
,date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60) as "t_last_q"
from
call c
) TQ
where
SEQ.SLICE > TQ.t_next_q
and SEQ.SLICE <= TQ.t_last_q
There are only 1440 minutes in the day. So you could create a MINUTES table with 1440 rows, whose primary key is a four-digit number representing hour-minute in 24-hour format (e.g. 9:13 PM would be 2113). You could then add as many columns to that table as you need to characterize any minute of the day: which quarter-hour it falls into, whether it is considered off-peak or peak, what its billing rate is under Plan A, and so on. You just keep adding columns as your use cases require. Totally extensible.
In your example, the first column, MINUTES.QuarterHour, would indicate which quarter-hour the minute falls into; 02:17 is in quarter-hour 10, for example, counting from 1 at midnight. Once the table is populated, all you need to do is take the HHMM chunk of the datetime value from your phone-calls table and pull back the quarter-hour that time belongs to, using a simple join on HHMMChunk = MINUTES.id. The advantages: queries are much simpler and easier to write and maintain, and they are probably less computationally intensive.
EDIT: the approach is also generic and portable (i.e. not implementation-specific).
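As a rough illustration of this lookup-table idea, the 1440-row structure could be prototyped in memory before committing it to a database table (all names and columns here are hypothetical):

```python
# Hypothetical in-memory prototype of the 1440-row MINUTES lookup table.
# Keys are HHMM integers (e.g. 2113 for 9:13 PM); add columns as use cases require.
minutes = {}
for m in range(1440):                         # m = minute of the day, 0..1439
    hh, mm = divmod(m, 60)
    minutes[hh * 100 + mm] = {
        "quarter_hour": m // 15 + 1,          # 1..96, counting from 1 at midnight
        "is_peak": 8 * 60 <= m < 20 * 60,     # made-up column: peak = 08:00-20:00
    }
```

A phone-call row's HHMM chunk then indexes straight into this structure, mirroring the proposed join on HHMMChunk = MINUTES.id.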
Related
I am trying to calculate the time of an intervention per day:
Imagine the following scenario:
Intervention Start     Intervention End       Total time
01/09/2021 10:00:00    01/09/2021 12:00:00    01/09/2021 02:00:00
02/09/2021 23:30:00    03/09/2021 01:30:00    02/09/2021 00:30:00 and 03/09/2021 01:30:00
What is the best way to achieve this?
We can have more than 1 day of difference between interventions, also.
You can use a difference in milliseconds, seconds, minutes, etc., depending on your precision requirements, e.g.:
select interventionStart, interventionEnd, datediff(second, interventionStart, interventionEnd) totalTime
from myTable;
You could then convert the seconds to your desired display format (e.g., in .NET you might use TimeSpan's ToString() method).
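A minimal sketch of that display conversion, done outside the database (the function name is mine):

```python
def hhmmss(total_seconds: int) -> str:
    """Format a duration in seconds as H:MM:SS, similar to TimeSpan.ToString()."""
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"
```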
EDIT: If you absolutely need times "per date", then you could do this:
WITH
adjusted AS (
SELECT
interventionStart, interventionEnd
FROM myTable
WHERE CAST(interventionStart AS DATE)=CAST(interventionEnd AS DATE)
UNION ALL
SELECT
interventionStart, CAST(interventionEnd AS DATE) interventionEnd
FROM myTable
WHERE CAST(interventionStart AS DATE)!=CAST(interventionEnd AS DATE)
UNION ALL
SELECT
CAST(interventionEnd AS DATE) interventionStart, interventionEnd
FROM myTable
WHERE CAST(interventionStart AS DATE)!=CAST(interventionEnd AS DATE)
)
SELECT
adjusted.interventionStart
, adjusted.interventionEnd
, DATEDIFF(SECOND, interventionStart, interventionEnd) totalTime
, DATEADD(
SECOND, DATEDIFF(SECOND, interventionStart, interventionEnd)
, CAST(CAST(adjusted.interventionStart AS DATE) AS DATETIME)
) ifYouWish
FROM adjusted;
DbFiddle demo is here
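The midnight-splitting idea behind the adjusted CTE can be sketched in plain Python (the function name is mine): an interval that crosses one or more date boundaries is cut into per-date pieces.

```python
from datetime import datetime, timedelta

def split_by_day(start, end):
    """Cut [start, end] into per-date pieces, splitting at each midnight."""
    pieces = []
    cursor = start
    while cursor.date() < end.date():
        # End the current piece at the next midnight, then continue from there.
        midnight = datetime.combine(cursor.date() + timedelta(days=1),
                                    datetime.min.time())
        pieces.append((cursor, midnight))
        cursor = midnight
    pieces.append((cursor, end))
    return pieces
```

For the second intervention in the question, this yields 30 minutes on 02/09 and 90 minutes on 03/09.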
You could try this if it answers your question. Note that PostgreSQL's AGE() takes the later timestamp first:
SELECT Intervention_Start, Intervention_End, AGE(Intervention_End,
Intervention_Start) AS Total_Time from <table_name>;
Time in seconds by date. You can convert it to minutes or hours at will.
select cast(c.d as date) dt, datediff(second, case when c.d > m.InterventionStart then c.d else m.InterventionStart end,
case when c2.d < m.InterventionEnd then c2.d else m.InterventionEnd end) seconds
from calendar c -- contains d = datetime of the start of the day, 2021-09-01 00:00:00 etc
-- next day
cross apply (values (dateadd(day, 1, c.d))) c2(d)
left join mytable m on c.d < m.InterventionEnd and m.InterventionStart < c2.d
db-fiddle
Let's assume I have a table whose entries contain a timestamp column (stored as a Long) telling us when each entry arrived in the table.
Now I want to write a SELECT query that counts how many entries arrived in a selected interval, grouped at a given frequency.
For example: interval is from 27.10.2020 to 30.10.2020 and frequency is 6 hours. The result of the query would tell me how many entries came in this interval in 6 hour groups.
Like:
27.10.2020 00:00:00 - 27.10.2020 06:00:00 : 2 entries
27.10.2020 06:00:00 - 27.10.2020 12:00:00 : 5 entries
27.10.2020 12:00:00 - 27.10.2020 18:00:00 : 0 entries
27.10.2020 18:00:00 - 28.10.2020 00:00:00 : 11 entries
28.10.2020 00:00:00 - 28.10.2020 06:00:00 : 8 entries
etc ...
The frequency parameter can be inserted in hours, days, weeks ...
Thank you all for you help!
First you need a recursive CTE that returns the time intervals:
with cte as (
select '2020-10-27 00:00:00' datestart,
datetime('2020-10-27 00:00:00', '+6 hour') dateend
union all
select dateend,
min('2020-10-30 00:00:00', datetime(dateend, '+6 hour'))
from cte
where dateend < '2020-10-30 00:00:00'
)
Then you must do LEFT join of this CTE to the table and aggregate:
with cte as (
select '2020-10-27 00:00:00' datestart,
datetime('2020-10-27 00:00:00', '+6 hour') dateend
union all
select dateend,
min('2020-10-30 00:00:00', datetime(dateend, '+6 hour'))
from cte
where dateend < '2020-10-30 00:00:00'
)
select c.datestart, c.dateend, count(t.datecol) entries
from cte c left join tablename t
on datetime(t.datecol, 'unixepoch') >= c.datestart and datetime(t.datecol, 'unixepoch') < c.dateend
group by c.datestart, c.dateend
Replace tablename and datecol with the names of your table and date column.
If your date column contains milliseconds then change the ON clause to this:
on datetime(t.datecol / 1000, 'unixepoch') >= c.datestart
and datetime(t.datecol / 1000, 'unixepoch') < c.dateend
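Since this is SQLite, the whole approach can be exercised end to end with Python's stdlib sqlite3 module. Table and column names here are made up for the demo:

```python
import sqlite3

# In-memory demo of the recursive-CTE bucketing; table/column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table entries (ts integer)")  # unix epoch seconds
conn.executemany(
    "insert into entries values (strftime('%s', ?))",
    [("2020-10-27 03:00:00",), ("2020-10-27 05:59:59",), ("2020-10-27 07:00:00",)],
)

rows = conn.execute("""
with recursive cte as (
  select '2020-10-27 00:00:00' datestart,
         datetime('2020-10-27 00:00:00', '+6 hour') dateend
  union all
  select dateend, min('2020-10-30 00:00:00', datetime(dateend, '+6 hour'))
  from cte
  where dateend < '2020-10-30 00:00:00'
)
select c.datestart, c.dateend, count(t.ts) entries
from cte c left join entries t
  on datetime(t.ts, 'unixepoch') >= c.datestart
 and datetime(t.ts, 'unixepoch') < c.dateend
group by c.datestart, c.dateend
order by c.datestart
""").fetchall()
```

The first bucket (00:00-06:00) picks up two of the sample entries; empty buckets still appear with a count of 0 thanks to the LEFT join.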
Here is one option:
select
datetime((strftime('%s', ts) / (6 * 60 * 60)) * 6 * 60 * 60, 'unixepoch') newts,
count(*) cnt
from mytable
where ts >= '2020-10-27' and ts < '2020-10-30'
group by newts
order by newts
ts represents the datetime column in your table. SQLite does not have a long datatype, so this assumes that you have a legitimate date stored as text.
The logic of the query is to turn the date to an epoch timestamp, then round it to 6 hours, which is represented by 6 * 60 * 60.
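That rounding step is just integer floor division on the epoch value; a minimal sketch:

```python
# Bucket width used by the query: 6 hours, expressed in seconds.
BUCKET = 6 * 60 * 60

def bucket_start(epoch_seconds: int) -> int:
    """Round an epoch timestamp down to the start of its 6-hour bucket,
    mirroring (strftime('%s', ts) / (6*60*60)) * 6*60*60 in the query."""
    return (epoch_seconds // BUCKET) * BUCKET
```

All timestamps that share a bucket start then group together in the GROUP BY.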
I have a query (see SQL Fiddle) which calculates the total track time per day. It worked fine until I found that my data is not clean and it has some intervals overlapping (i.e. starttime is repeated in some cases).
There are 1440 minutes in a day and therefore the maximum track time should be 1440, but due to the overlapping intervals the track time exceeds 1440 minutes per day in some cases.
At the moment the query makes it 1440 if the sum exceeds 1440. But if a value is less than 1440 it still can be wrong.
For example
One interval is from 10:00 to 14:00.
Second interval is from 13:00 to 15:00.
End result is 4 + 2 = 6 hours, where the hour between 13:00 and 14:00 is counted twice.
That is 360 minutes, which is less than 1440, but it is still not a correct answer, because the overlap is double-counted.
I want some help to fix the query so that it skips overlaps and calculates the correct track time. Thanks
;WITH
CTE_Dates
AS
(
SELECT
Email
,CAST(MIN(StartTime) AS date) AS StartDate
,CAST(MAX(EndTime) AS date) AS EndDate
FROM track
GROUP BY Email
)
SELECT
CTE_Dates.Email
,DayStart AS xDate
-- if some intervals overlap, it is possible
-- to get SUM more than 1440 per day
-- truncate such values for now
,CASE
WHEN ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0) > 1440
THEN 1440
ELSE ISNULL(SUM(DATEDIFF(second, RangeStart, RangeEnd)) / 60, 0)
END AS TrackMinutes
FROM
Numbers
CROSS JOIN CTE_Dates
CROSS APPLY
(
SELECT
DATEADD(day, Numbers.Number-1, CTE_Dates.StartDate) AS DayStart
,DATEADD(day, Numbers.Number, CTE_Dates.StartDate) AS DayEnd
) AS A_Date
OUTER APPLY
(
SELECT
-- MAX(DayStart, StartTime)
CASE WHEN DayStart > StartTime THEN DayStart ELSE StartTime END AS RangeStart
-- MIN(DayEnd, EndTime)
,CASE WHEN DayEnd < EndTime THEN DayEnd ELSE EndTime END AS RangeEnd
FROM track AS T
WHERE
T.Email = CTE_Dates.Email
AND T.StartTime < DayEnd
AND T.EndTime > DayStart
) AS A_Track
WHERE
Numbers.Number <= DATEDIFF(day, CTE_Dates.StartDate, CTE_Dates.EndDate)+1
GROUP BY DayStart, CTE_Dates.Email
ORDER BY DayStart;
This is a "gaps and islands" problem. I faked my own test data (since you didn't provide any), but I think it works. The key intuition is that all values within the same "island" (that is, contiguous time interval) will have the same difference from a row_number() column. If you want a little insight into it, do a raw select from the IntervalsByDay cte (as opposed to the subquery I have now); this will show you the islands calculated (with start and end points).
edit: I didn't see that you had a fiddle on the first go around. My answer has been changed to reflect your data and desired output
with i as (
select datediff(minute, '2013-01-01', StartTime) as s,
datediff(minute, '2013-01-01', EndTime) as e
from #track
), brokenDown as (
select distinct n.Number
from i
join dbadmin.dbo.Numbers as n
on n.Number >= i.s
and n.Number <= i.e
), brokenDownWithID as (
select Number, Number - row_number() over(order by Number) as IslandID,
cast(dateadd(minute, number, '2013-01-01') as date) as d
from brokenDown
), IntervalsByDay as (
select
dateadd(minute, min(number), '2013-01-01') as [IntervalStart],
dateadd(minute, max(number), '2013-01-01') as [IntervalEnd],
d,
max(Number) - min(Number) + 1 as [NumMinutes]
from brokenDownWithID
group by IslandID, d
)
select d, sum(NumMinutes) as NumMinutes
from IntervalsByDay
group by d
order by d
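The island-merging that the gaps-and-islands SQL performs can be sketched directly: sort the intervals, merge the overlapping ones, and sum the merged lengths. A minimal Python sketch, with times given as minutes from midnight (the function name is mine):

```python
def merged_minutes(intervals):
    """Total minutes covered by possibly-overlapping (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for s, e in sorted(intervals):
        if cur_end is None or s > cur_end:   # gap: start a new island
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = s, e
        else:                                # overlap or touch: extend the island
            cur_end = max(cur_end, e)
    if cur_end is not None:
        total += cur_end - cur_start
    return total
```

For the question's example, 10:00-14:00 plus 13:00-15:00 merges into 10:00-15:00, i.e. 300 minutes rather than the double-counted 360.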
I have a table in a PostgreSQL database containing dates and a total count per day.
mydate total
2012-05-12 12
2012-05-14 8
2012-05-13 4
2012-05-12 12
2012-05-15 2
2012-05-17 1
2012-05-18 1
2012-05-21 1
2012-05-25 1
Now I need to get the weekly totals for a given date range.
Ex. I want to get the weekly totals from 2012-05-01 up to 2012-05-31.
I'm looking at this output:
2012-05-01 2012-05-07 0
2012-05-08 2012-05-14 36
2012-05-15 2012-05-22 5
2012-05-23 2012-05-29 1
2012-05-30 2012-05-31 0
This works for any given date range:
CREATE FUNCTION f_tbl_weekly_sumtotals(_range_start date, _range_end date)
RETURNS TABLE (week_start date, week_end date, sum_total bigint)
LANGUAGE sql AS
$func$
SELECT w.week_start, w.week_end, COALESCE(sum(t.total), 0)
FROM (
SELECT week_start::date, LEAST(week_start::date + 6, _range_end) AS week_end
FROM generate_series(_range_start::timestamp
, _range_end::timestamp
, interval '1 week') week_start
) w
LEFT JOIN tbl t ON t.mydate BETWEEN w.week_start and w.week_end
GROUP BY w.week_start, w.week_end
ORDER BY w.week_start
$func$;
Call:
SELECT * FROM f_tbl_weekly_sumtotals('2012-05-01', '2012-05-31');
Major points
I wrapped it in a function for convenience, so the date range has to be provided once only.
The subquery w produces the series of weeks starting from the first day of the given date range. The upper bound is capped with LEAST to stay within the upper bound of the given date range.
Then LEFT JOIN to the data table (tbl in my example) to keep all weeks in the result, even where no data rows are found.
The rest should be obvious. COALESCE to output 0 instead of NULL for empty weeks.
Data types have to match, I assumed mydate date and total int for lack of information. (The sum() of an int is bigint.)
Explanation for my particular use of generate_series():
Generating time series between two dates in PostgreSQL
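The week-generation and capping logic of subquery w can be sketched outside the database (the function name is mine):

```python
from datetime import date, timedelta

def week_buckets(range_start, range_end):
    """Weekly (start, end) pairs from range_start; the last end is capped
    at range_end, mirroring the LEAST(week_start + 6, _range_end) in the SQL."""
    buckets = []
    cur = range_start
    while cur <= range_end:
        buckets.append((cur, min(cur + timedelta(days=6), range_end)))
        cur += timedelta(days=7)
    return buckets
```

For May 2012 this yields five buckets, the last one truncated at 2012-05-31.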
Using this function
CREATE OR REPLACE FUNCTION last_day(date)
RETURNS date AS
$$
SELECT (date_trunc('MONTH', $1) + INTERVAL '1 MONTH - 1 day')::date;
$$ LANGUAGE 'sql' IMMUTABLE STRICT;
AND generate_series (from 8.4 onwards) we can create the date partitions.
SELECT wk.wk_start,
CAST(
CASE (extract(month from wk.wk_start) = extract(month from wk.wk_start + interval '6 days'))
WHEN true THEN wk.wk_start + interval '6 days'
ELSE last_day(wk.wk_start)
END
AS date) AS wk_end
FROM
(SELECT CAST(generate_series('2012-05-01'::date,'2012-05-31'::date,interval '1 week') AS date) AS wk_start) AS wk;
Then putting it together with the data
CREATE TABLE my_tab(mydate date,total integer);
INSERT INTO my_tab
values
('2012-05-12'::date,12),
('2012-05-14'::date,8),
('2012-05-13'::date,4),
('2012-05-12'::date,12),
('2012-05-15'::date,2),
('2012-05-17'::date,1),
('2012-05-18'::date,1),
('2012-05-21'::date,1),
('2012-05-25'::date,1);
WITH month_by_week AS
(SELECT wk.wk_start,
CAST(
CASE (extract(month from wk.wk_start) = extract(month from wk.wk_start + interval '6 days'))
WHEN true THEN wk.wk_start + interval '6 days'
ELSE last_day(wk.wk_start)
END
AS date) AS wk_end
FROM
(SELECT CAST(generate_series('2012-05-01'::date,'2012-05-31'::date,interval '1 week') AS date) AS wk_start) AS wk
)
SELECT month_by_week.wk_start,
month_by_week.wk_end,
SUM(COALESCE(mt.total,0))
FROM month_by_week
LEFT JOIN my_tab mt ON mt.mydate BETWEEN month_by_week.wk_start AND month_by_week.wk_end
GROUP BY month_by_week.wk_start,
month_by_week.wk_end
ORDER BY month_by_week.wk_start;
I have two timestamps, e.g. '20-Nov-2010 20:11:22', for when something started and ended. Now I want to count only the time that falls between 9:00 and 21:00, which is a 12-hour window.
The input will be two dates like '10-Nov-2010' and '20-Nov-2010' start date and end date
componentid  starttime             endtime               result
3            13-Nov-2010 10:00:00  13-Nov-2010 21:00:00  11:00
5            14-Nov-2010 09:30:00  14-Nov-2010 22:00:00  11:30
3            15-Nov-2010 08:20:00  15-Nov-2010 20:00:00  11:00
4            16-Nov-2010 08:00:00  16-Nov-2010 23:00:00  12:00
                                              sum:       45:30
(results are in hours and minutes)
As the examples show, I only want the hours and minutes that fall between 9:00 and 21:00, for the period from 10-Nov-2010 to 20-Nov-2010. I don't know how to do that in Oracle SQL; can you please explain how to do it?
This is for the final sum
select
trunc(Mi/60) || ':' || substr('0' || mod(Mi,60), -2) Total
from
(
select sum
(
(
case when endtime-trunc(endtime) > 21.0/24 then 21.0/24 else endtime-trunc(endtime) end
-
case when starttime-trunc(starttime) < 9.0/24 then 9.0/24 else starttime-trunc(starttime) end
) * 24 * 60
) Mi
from tbl
where starttime >= to_date('10-Nov-2010', 'DD-Mon-YYYY')
and endtime < to_date('20-Nov-2010', 'DD-Mon-YYYY') + 1
) M
How would you do it by hand?
You need to determine the minimum (earlier) of the end time on the day and 21:00 on the day to establish the end point.
You need to determine the maximum (later) of the start time on the day and 09:00 on the day to establish the start point.
The difference between these two values is the time you want.
Reducing that to SQL requires a knowledge of the time-manipulation functions for Oracle - this is one of the areas where each DBMS has its own set of idiosyncratic functions and rules and what works on one will not necessarily work on any other.
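A minimal sketch of that min/max clamping, assuming each row starts and ends on the same day (function and parameter names are mine):

```python
from datetime import datetime, time

def window_minutes(start, end, lo=time(9, 0), hi=time(21, 0)):
    """Minutes of [start, end] that fall inside the lo-hi window on start's day."""
    day = start.date()
    win_start = max(start, datetime.combine(day, lo))   # later of start and 09:00
    win_end = min(end, datetime.combine(day, hi))       # earlier of end and 21:00
    return max(int((win_end - win_start).total_seconds() // 60), 0)
```

Applied to the sample rows, 15-Nov 08:20-20:00 gives 660 minutes (11:00) and 16-Nov 08:00-23:00 gives the full 720 (12:00).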
Try this:
SELECT componentid,
starttime,
endtime,
(
CASE WHEN (endtime > TRUNC(endtime) + 21/24) THEN TRUNC(endtime) + 21/24 ELSE endtime END -
CASE WHEN (starttime < TRUNC(starttime) + 9/24) THEN TRUNC(starttime) + 9/24 ELSE starttime END
)*24 result
FROM <YOURTABLE>
In the interest of providing yet another way of doing time arithmetic, I'll use a view with interval arithmetic. (PostgreSQL, not Oracle.)
CREATE VIEW experimental_time_diffs AS
SELECT component_id, start_time, end_time,
case when cast(end_time as time) > '21:00'
then date_trunc('day', end_time) + interval '21' hour else end_time end -
case when cast(start_time as time) < '9:00'
then date_trunc('day', start_time) + interval '9' hour else start_time end
AS adj_elapsed_time
FROM your_table;
And you can sum on the view's column "adj_elapsed_time".
select sum(adj_elapsed_time) as total_elapsed_time
from experimental_time_diffs;