MySQL - Calculate the net time difference between two date-times while excluding breaks?

In a MySQL query I am using the timediff/time_to_sec functions to calculate the total minutes between two date-times.
For example:
2010-03-23 10:00:00
-
2010-03-23 08:00:00
= 120 minutes
What I would like to do is exclude any breaks that occur during the selected time range.
For example:
2010-03-23 10:00:00
-
2010-03-23 08:00:00
-
(break 08:55:00 to 09:10:00)
= 105 minutes
Is there a good method to do this without resorting to a long list of nested IF statements?
UPDATE1:
To clarify - I am trying to calculate how long a user takes to accomplish a given task. If they take a coffee break, that time period needs to be excluded. The coffee breaks are at fixed times.

Sum all the breaks that occur during the period, then subtract that sum from the result of the timediff/time_to_sec calculation:
SELECT TIME_TO_SEC(TIMEDIFF('17:00:00', '09:00:00')) -- 28800
SELECT TIME_TO_SEC(TIMEDIFF('12:30:00', '12:00:00')) -- 1800
SELECT TIME_TO_SEC(TIMEDIFF('10:30:00', '10:15:00')) -- 900
-- 28800 - 1800 - 900 = 26100
Assuming this structure:
CREATE TABLE work_unit (
id INT NOT NULL,
initial_time TIME,
final_time TIME
);
CREATE TABLE break (
id INT NOT NULL,
initial_time TIME,
final_time TIME
);
INSERT INTO work_unit VALUES (1, '09:00:00', '17:00:00');
INSERT INTO break VALUES (1, '10:00:00', '10:15:00');
INSERT INTO break VALUES (2, '12:00:00', '12:30:00');
You can calculate it with the following query:
SELECT *, TIME_TO_SEC(TIMEDIFF(final_time, initial_time)) total_time
, (SELECT SUM(
TIME_TO_SEC(TIMEDIFF(b.final_time, b.initial_time)))
FROM break b
WHERE (b.initial_time BETWEEN work_unit.initial_time AND work_unit.final_time) OR (b.final_time BETWEEN work_unit.initial_time AND work_unit.final_time)
) breaks
FROM work_unit
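Note that the subquery above subtracts a break in full even when only part of it falls inside the work period, and it skips a break that starts before the period and ends after it. A minimal sketch of the clipped-overlap arithmetic, written in Python purely for illustration; `net_seconds` and its arguments are hypothetical names, not part of the original answer, and the breaks are assumed not to overlap each other:

```python
from datetime import datetime

def net_seconds(start, end, breaks):
    """Seconds between start and end, minus the part of each break inside that range.

    Assumes the breaks do not overlap each other.
    """
    total = (end - start).total_seconds()
    for b_start, b_end in breaks:
        # Clip the break to the work period; a negative length means no overlap.
        overlap = (min(end, b_end) - max(start, b_start)).total_seconds()
        total -= max(0.0, overlap)
    return total

fmt = "%Y-%m-%d %H:%M:%S"
start = datetime.strptime("2010-03-23 08:00:00", fmt)
end = datetime.strptime("2010-03-23 10:00:00", fmt)
coffee = (datetime.strptime("2010-03-23 08:55:00", fmt),
          datetime.strptime("2010-03-23 09:10:00", fmt))
print(net_seconds(start, end, [coffee]) / 60)  # 105.0
```

In MySQL the same clipping could probably be expressed with LEAST() and GREATEST() inside the SUM, but that variant is untested here.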

Related

Adding minutes of runtime from on/off records during a time period

I have a SQL database that collects temperature and sensor data from the barn.
The table definition is:
CREATE TABLE [dbo].[DataPoints]
(
[timestamp] [datetime] NOT NULL,
[pointname] [nvarchar](50) NOT NULL,
[pointvalue] [float] NOT NULL
)
The sensors report outside temperature (degrees), inside temperature (degrees), and heating (as on/off).
Sensors create a record when the previous reading has changed, so temperatures are generated every few minutes, one record for heat coming ON, one for heat going OFF, and so on.
I'm interested in how many minutes of heat has been used overnight, so a 24-hour period from 6 AM yesterday to 6 AM today would work fine.
This query:
SELECT *
FROM [home_network].[dbo].[DataPoints]
WHERE (pointname = 'Heaters')
AND (timestamp BETWEEN '2022-12-18 06:00:00' AND '2022-12-19 06:00:00')
ORDER BY timestamp
returns this data:
2022-12-19 02:00:20 | Heaters | 1
2022-12-19 02:22:22 | Heaters | 0
2022-12-19 03:43:28 | Heaters | 1
2022-12-19 04:25:31 | Heaters | 0
The end result should be 22 minutes + 42 minutes = 64 minutes of heat, but I can't see how to get this result from a single query. It also just happens that this result set has two complete heat on/off cycles, but that will not always be the case. So, if the first heat record was = 0, that means that at 6 AM, the heat was already on, but the start time won't show in the query. The same idea applies if the last heat record is =1 at, say 05:15, which means 45 minutes have to be added to the total.
Is it possible to get this minutes-of-heat-time result with a single query? Actually, I don't know the right approach, and it doesn't matter if I have to run several queries. If needed, I could use a small app that reads the raw data, and applies logic outside of SQL to arrive at the total. But I'd prefer to be able to do this within SQL.
This isn't a complete answer, but it should help you get started. From the SQL in the post, I'm assuming you're using SQL Server. I've formatted the code to match. Replace @input with your query above if you want to test on your own data. (SELECT * FROM [home_network].[dbo]...)
--generate dummy table with sample output from question
declare @input as table(
[timestamp] [datetime] NOT NULL,
[pointname] [nvarchar](50) NOT NULL,
[pointvalue] [float] NOT NULL
)
insert into @input values
('2022-12-19 02:00:20','Heaters',1),
('2022-12-19 02:22:22','Heaters',0),
('2022-12-19 03:43:28','Heaters',1),
('2022-12-19 04:25:31','Heaters',0);
--Append a row number to the result, in chronological order
WITH A as (
SELECT *,
ROW_NUMBER() OVER(ORDER BY [timestamp]) as row_count
from @input)
--Self join the table using the row number as a guide
SELECT sum(datediff(MINUTE,startTimes.timestamp,endTimes.timestamp))
from A as startTimes
LEFT JOIN A as endTimes on startTimes.row_count=endTimes.row_count-1
--Only count periods that start with the heater turning on
--(this assumes the first row in the range is an ON record)
WHERE startTimes.row_count%2=1
Your problem can be divided into two steps, plus an optional third:
1. Filter by sensor type and date range, while also getting the time span of each record by calculating the date difference between the timestamp of the current record and the next one in chronological order.
2. Filter the records with ON status and sum the durations.
3. (Optional) Convert to HH:MM:SS format for display.
Here's my take on the problem, with comments on what I do in each step, all combined into a single query.
-- Step 3: Convert output to HH:MM:SS, this is just for show and can be reduced
SELECT STUFF(CONVERT(VARCHAR(8), DATEADD(SECOND, total_duration, 0), 108),
1, 2, CAST(FLOOR(total_duration / 3600) AS VARCHAR(5)))
FROM (
-- Step 2: select records with status ON (1) and aggregate total duration in seconds
SELECT sum(duration) as total_duration
FROM (
-- Step 1: Use LEAD to get next adjacent timestamp and calculate date difference (time span) between the current record and the next one in time order
SELECT TOP 100 PERCENT
DATEDIFF(SECOND, timestamp, LEAD(timestamp, 1, '2022-12-19 06:00:00') OVER (ORDER BY timestamp)) as duration,
pointvalue
FROM [dbo].[DataPoints]
-- filtered by sensor name and time range
WHERE pointname = 'Heaters'
AND (timestamp BETWEEN '2022-12-18 06:00:00' AND '2022-12-19 06:00:00')
ORDER BY timestamp ASC
) AS tmp
WHERE tmp.pointvalue = 1
) as tmp2
Note: As the last record does not have next adjacent timestamp, it will be filled with the end time of inspection (In this case it's 6AM of the next day).
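The LEAD approach covers a trailing ON record, but not the other edge case from the question, where the first record in the window is an OFF because the heat was already on at 6 AM. A rough sketch of the full bookkeeping, in Python for illustration only; `heat_seconds` and the variable names are made up for this example, and an empty window is treated as zero heat because the heater state cannot be inferred from it:

```python
from datetime import datetime

def heat_seconds(records, window_start, window_end):
    """Total seconds the heater was on inside [window_start, window_end].

    records: (timestamp, value) pairs inside the window, sorted by time,
    where 1 means the heater switched on and 0 means it switched off.
    """
    total = 0.0
    on_since = None
    # If the first record is an OFF, the heater was already on at window_start.
    if records and records[0][1] == 0:
        on_since = window_start
    for ts, value in records:
        if value == 1 and on_since is None:
            on_since = ts
        elif value == 0 and on_since is not None:
            total += (ts - on_since).total_seconds()
            on_since = None
    # A trailing ON with no matching OFF runs until the end of the window.
    if on_since is not None:
        total += (window_end - on_since).total_seconds()
    return total

w_start = datetime(2022, 12, 18, 6, 0, 0)
w_end = datetime(2022, 12, 19, 6, 0, 0)
sample = [
    (datetime(2022, 12, 19, 2, 0, 20), 1),
    (datetime(2022, 12, 19, 2, 22, 22), 0),
    (datetime(2022, 12, 19, 3, 43, 28), 1),
    (datetime(2022, 12, 19, 4, 25, 31), 0),
]
print(round(heat_seconds(sample, w_start, w_end) / 60))  # 64
```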
I don't really think this can be achieved in a single query.
Option 1:
Implement a stored procedure that contains the logic for calculating these periods.
Option 2:
Add a new column (duration). When a new record is inserted, calculate the difference between NOW and the previous timestamp, and store it as the duration of the previous record.

sql query for finding ID numbers on date range

I want to get the ID numbers for the last 24 hour range. Say I run a task at 4:00AM each morning and want to get the previous 24 hours of data going back to 4:00AM the previous day. I need to get the id codes to search the correct tables. If the data is like this what would be the best way to query the ID numbers?
ID | Start Time | EndTime
2112 | 2021-08-10 23:25:28.750 | NULL
2111 | 2021-08-06 17:42:27.400 | 2021-08-10 23:25:28.750
2110 | 2021-08-03 20:21:14.093 | 2021-08-06 17:42:27.400
So if I had the date range of 8/10 - 8/11 I would need to get two codes. 2111 and 2112. If I need to get 8/11 - 8/12 I would only get 2112 as the endtime is null.
Any thoughts on the best way to query this out?
You need to do something like this:
DECLARE @employee TABLE(
ID int,
StartTime datetime,
EndTime datetime
)
INSERT INTO @employee SELECT '2112','2021-08-10 23:25:28.750',NULL
INSERT INTO @employee SELECT '2111','2021-08-06 17:42:27.400','2021-08-10 23:25:28.750'
INSERT INTO @employee SELECT '2110','2021-08-03 20:21:14.093','2021-08-06 17:42:27.400'
SELECT * from @employee where
EndTime >= GETDATE()-1 OR EndTime is null
This subtracts one day from the execution time. If you run it right now, you will get only the row with the NULL EndTime in the output, because today is 14.08 and that row's EndTime is NULL (it is still running, I think).
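For an arbitrary date range rather than "the last 24 hours", the usual test is interval overlap: a row qualifies if it starts before the range ends and has not ended before the range starts. A small sketch in Python for illustration; `ids_in_range` is a hypothetical helper, and ranges are assumed to be half-open, with a missing end time meaning the period is still open:

```python
from datetime import datetime

def ids_in_range(rows, range_start, range_end):
    """IDs whose [start, end) period overlaps [range_start, range_end).

    An end of None means the period is still open.
    """
    return sorted(
        row_id
        for row_id, start, end in rows
        if start < range_end and (end is None or end > range_start)
    )

rows = [
    (2112, datetime(2021, 8, 10, 23, 25, 28, 750000), None),
    (2111, datetime(2021, 8, 6, 17, 42, 27, 400000), datetime(2021, 8, 10, 23, 25, 28, 750000)),
    (2110, datetime(2021, 8, 3, 20, 21, 14, 93000), datetime(2021, 8, 6, 17, 42, 27, 400000)),
]
print(ids_in_range(rows, datetime(2021, 8, 10), datetime(2021, 8, 11)))  # [2111, 2112]
print(ids_in_range(rows, datetime(2021, 8, 11), datetime(2021, 8, 12)))  # [2112]
```

The same condition in SQL would look roughly like StartTime < range end AND (EndTime IS NULL OR EndTime > range start), with the two range bounds passed in as parameters.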

HiveQL - Query Number of Entries over fixed unit of time

I have a table that is similar to the following:
LOGIN ID (STRING): TIME_STAMP (STRING HH:MM:SS)
BillyJoel 10:45:00
PianoMan 10:45:30
WeDidnt 10:45:45
StartTheFire 10:46:00
AlwaysBurning 10:46:30
Is there any possible way to get a query that gives me a column of the number of logins over a period of time? Something like this:
3 (number of logins from 10:45:00 - 10:45:59)
2 (number of logins from 10:46:00 - 10:46:59)
Note: If you can only do it with int timestamps, that's alright. My original table is all strings, so I thought I would represent that here. The text in parentheses doesn't need to be printed.
If you want it by minute, you can just lop off the seconds:
select substr(time_stamp, 1, 5) as hhmm, count(*)
from t
group by substr(time_stamp, 1, 5)
order by hhmm;
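To sanity-check what that grouping should return for the sample rows, here is the same idea sketched in Python: keep the first five characters (HH:MM) of each timestamp string and count. `logins_per_minute` is just an illustrative name:

```python
from collections import Counter

logins = [
    ("BillyJoel", "10:45:00"),
    ("PianoMan", "10:45:30"),
    ("WeDidnt", "10:45:45"),
    ("StartTheFire", "10:46:00"),
    ("AlwaysBurning", "10:46:30"),
]

def logins_per_minute(rows):
    """Count logins per HH:MM by dropping the seconds from each timestamp string."""
    counts = Counter(ts[:5] for _, ts in rows)
    return dict(sorted(counts.items()))

print(logins_per_minute(logins))  # {'10:45': 3, '10:46': 2}
```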

How to get timebased data from MSSQL 2008 aggregated at time interval

I have a piece of equipment that reports its number of produced pieces at random time intervals. At each record, the internal counter is reset, so if I want to get the total pieces, I would need to sum over an interval.
ts pieces
--------------------------------
2013-01-23 11:58 2013
2013-01-23 12:12 3025
2013-01-23 12:12 3025
2013-01-23 12:13 112
2013-01-23 12:17 1122
2013-01-23 12:34 3112
2013-01-23 12:36 3025
What if I want to query this data and I want the produced pieces from 12:00 to 12:30. I cannot simply do the following:
SELECT SUM(pieces)
FROM table
WHERE ts BETWEEN '2013-01-23 12:00' and '2013-01-23 12:30'
With this query, I would have too many pieces at the beginning (of the 3025 pieces reported at 12:12, some were produced in the 2 minutes before 12:00) and too few pieces at the end (pieces produced between 12:17 and 12:30 were only reported at 12:34).
Is there a built in feature in SQL server to do such calculations on timebased series, or would it require me to manually interpolate based on dateDiff between first/last values in the interval and last/first outside?
You have to do the interpolation manually. I've built some code that does it, based on various other assumptions (i.e. whether ts represents "all items produced up until and including this minute" or "all items produced before this minute"):
declare @t table (ts datetime2 not null,pieces int not null)
insert into @t(ts,pieces) values
('2013-01-23T11:58:00',2013),
('2013-01-23T12:12:00',3025),
--('2013-01-23T12:12:00',3025), --Error, two identical counts at same time?
('2013-01-23T12:13:00',112 ),
('2013-01-23T12:17:00',1122),
('2013-01-23T12:34:00',3112),
('2013-01-23T12:36:00',3025)
declare @Start datetime2
declare @End datetime2
select @Start = '2013-01-23T12:00:00', @End = '2013-01-23T12:30:00'
;With Numbers as (
select distinct number
from master..spt_values
), RowsNumbered as (
select
ROW_NUMBER() OVER (ORDER BY ts) as rn,
*
from @t
), Paired as (
select
rn2.ts as prevTime,
rn1.ts as thisTime,
rn1.pieces as pieces
from
RowsNumbered rn1
inner join
RowsNumbered rn2
on
rn1.rn = rn2.rn + 1
), Minutes as (
select
DATEADD(minute,-number,thisTime) as Time,
(pieces * 1.0) / DATEDIFF(minute,prevTime,thisTime) as Pieces
from
Paired p
inner join
Numbers n
on
number < 60 and --Reasonable?
DATEADD(minute,-number,thisTime) > prevTime and
number >= 0
)
select SUM(pieces)
from Minutes
where Time >= @Start and Time < @End
I'm also treating the start and end times as a semi-open interval with an inclusive start and exclusive end. This is normally the more sensible way to work with continuous data such as datetime.
Hopefully, you can see how I'm building up each CTE to get from the individual timestamps and piece counts to, for each minute, having a concept of how many pieces were produced in that minute. Once we've got to that point, we can just SUM over the minutes we want to include.
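The per-minute spreading done by the CTEs can be checked outside SQL. Below is a sketch in Python under the same assumptions as the query: each count is spread evenly over the minutes after the previous reading up to and including its own timestamp, the duplicate 12:12 row is left out, and the window is half-open. `pieces_in_window` is a hypothetical name, and unlike the query it has no 60-minute cap on the gap between readings:

```python
from datetime import datetime, timedelta

def pieces_in_window(readings, start, end):
    """Estimate pieces produced in [start, end).

    readings: (timestamp, pieces) pairs sorted by time. Each count is spread
    evenly over the minutes since the previous reading; the first reading has
    no predecessor and is ignored.
    """
    total = 0.0
    for (prev_ts, _), (this_ts, pieces) in zip(readings, readings[1:]):
        minutes = int((this_ts - prev_ts).total_seconds() // 60)
        if minutes <= 0:
            continue  # duplicate timestamp: nothing to spread the count over
        per_minute = pieces / minutes
        for n in range(minutes):
            minute = this_ts - timedelta(minutes=n)
            if start <= minute < end:
                total += per_minute
    return total

readings = [
    (datetime(2013, 1, 23, 11, 58), 2013),
    (datetime(2013, 1, 23, 12, 12), 3025),
    (datetime(2013, 1, 23, 12, 13), 112),
    (datetime(2013, 1, 23, 12, 17), 1122),
    (datetime(2013, 1, 23, 12, 34), 3112),
    (datetime(2013, 1, 23, 12, 36), 3025),
]
estimate = pieces_in_window(readings, datetime(2013, 1, 23, 12, 0), datetime(2013, 1, 23, 12, 30))
print(round(estimate, 1))  # 6239.6
```

That figure is 13/14 of the 3025 pieces, all of the 112 and 1122 pieces, and 12/17 of the 3112 pieces.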

Group DateTime into 5,15,30 and 60 minute intervals

I am trying to group some records into 5-, 15-, 30- and 60-minute intervals:
SELECT AVG(value) as "AvgValue",
sample_date/(5*60) as "TimeFive"
FROM DATA
WHERE id = 123 AND sample_date >= 3/21/2012
I want to run several queries, each of which would group my average values into the desired time increments. So the 5-min query would return results like this:
AvgValue TimeFive
6.90 1995-01-01 00:05:00
7.15 1995-01-01 00:10:00
8.25 1995-01-01 00:15:00
The 30-min query would result in this:
AvgValue TimeThirty
6.95 1995-01-01 00:30:00
7.40 1995-01-01 01:00:00
The datetime column is in yyyy-mm-dd hh:mm:ss format
I am getting implicit conversion errors of my datetime column. Any help is much appreciated!
Using
datediff(minute, '1990-01-01T00:00:00', yourDatetime)
will give you the number of minutes since 1990-1-1 (you can use the desired base date).
Then you can divide by 5, 15, 30 or 60, and group by the result of this division.
I've checked that this is evaluated as an integer division, so you'll get an integer number you can group by.
i.e.
group by datediff(minute, '1990-01-01T00:00:00', yourDatetime) /5
UPDATE As the original question was edited to require the data to be shown in date-time format after the grouping, I've added this simple query that will do what the OP wants:
-- This converts the period to date-time format
SELECT
-- note the 5, the "minute", and the starting point to convert the
-- period back to original time
DATEADD(minute, AP.FiveMinutesPeriod * 5, '2010-01-01T00:00:00') AS Period,
AP.AvgValue
FROM
-- this groups by the period and gets the average
(SELECT
P.FiveMinutesPeriod,
AVG(P.Value) AS AvgValue
FROM
-- This calculates the period (five minutes in this instance)
(SELECT
-- note the division by 5 and the "minute" to build the 5 minute periods
-- the '2010-01-01T00:00:00' is the starting point for the periods
datediff(minute, '2010-01-01T00:00:00', T.Time)/5 AS FiveMinutesPeriod,
T.Value
FROM Test T) AS P
GROUP BY P.FiveMinutesPeriod) AP
NOTE: I've divided this into 3 subqueries for clarity. You should read it from the inside out. It could, of course, be written as a single, compact query.
NOTE: If you change the period and the starting date-time, you can get any interval you need, such as weeks starting from a given day.
If you want to generate test data for this query use this:
CREATE TABLE Test
( Id INT IDENTITY PRIMARY KEY,
Time DATETIME,
Value FLOAT)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:00:22', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:03:22', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:04:45', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:07:21', 20)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:10:25', 30)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:11:22', 30)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:14:47', 30)
The result of executing the query is this:
Period AvgValue
2012-03-22 00:00:00.000 10
2012-03-22 00:05:00.000 20
2012-03-22 00:10:00.000 30
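The arithmetic behind that result is floor division: count whole minutes since a base date, integer-divide by the interval, then multiply back to get the start of the bucket. A short sketch in Python for illustration, using the same base date and test rows as above; `bucket_start` and `average_by_bucket` are hypothetical helper names:

```python
from collections import defaultdict
from datetime import datetime, timedelta

BASE = datetime(2010, 1, 1)  # same starting point as the query above

def bucket_start(ts, minutes):
    """Round ts down to the start of its `minutes`-wide bucket, counted from BASE."""
    periods = int((ts - BASE).total_seconds() // 60) // minutes
    return BASE + timedelta(minutes=periods * minutes)

def average_by_bucket(rows, minutes):
    """Average the values of (timestamp, value) rows per bucket."""
    groups = defaultdict(list)
    for ts, value in rows:
        groups[bucket_start(ts, minutes)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(groups.items())}

rows = [
    (datetime(2012, 3, 22, 0, 0, 22), 10),
    (datetime(2012, 3, 22, 0, 3, 22), 10),
    (datetime(2012, 3, 22, 0, 4, 45), 10),
    (datetime(2012, 3, 22, 0, 7, 21), 20),
    (datetime(2012, 3, 22, 0, 10, 25), 30),
    (datetime(2012, 3, 22, 0, 11, 22), 30),
    (datetime(2012, 3, 22, 0, 14, 47), 30),
]
for start, avg in average_by_bucket(rows, 5).items():
    print(start, avg)
```

Passing 15, 30 or 60 instead of 5 gives the other intervals from the question.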
Building on @JotaBe's answer (which I cannot comment on, otherwise I would), you could also try something like this, which does not require a subquery.
SELECT
AVG(value) AS 'AvgValue',
-- Add the rounded-down minutes back onto the base date to get the rounded time
DATEADD(
MINUTE,
(DATEDIFF(MINUTE, '1990-01-01T00:00:00', your_date) / 30) * 30,
'1990-01-01T00:00:00'
) AS 'TimeThirty'
FROM YourTable
-- WHERE your_date > some max lookback period
GROUP BY
(DATEDIFF(MINUTE, '1990-01-01T00:00:00', your_date) / 30)
This change removes temp tables and subqueries. It uses the same core logic for grouping by 30 minute intervals but, when presenting the data back as part of the result I'm just reversing the interval calculation to get the rounded date & time.
So, in case you googled this, but you need to do it in mysql, which was my case:
In MySQL you can do
GROUP BY
CONCAT(
DATE_FORMAT(`timestamp`,'%m-%d-%Y %H:'),
FLOOR(DATE_FORMAT(`timestamp`,'%i')/5)*5
)
In SQL Server 2022, you can use DATE_BUCKET, which rounds a value down to the start of the specified interval.
SELECT
DATE_BUCKET(minute, 5, d.sample_date) AS TimeFive,
AVG(d.value) AS AvgValue
FROM DATA d
WHERE d.id = 123
AND d.sample_date >= '20121203'
GROUP BY
DATE_BUCKET(minute, 5, d.sample_date);
You can use the following statement. It removes the seconds component, calculates how many minutes the value is past the last five-minute mark, and subtracts them to round down to the start of the time block. This is ideal if you want to change your window: simply change the mod value.
select dateadd(minute, - datepart(minute, [YOURDATE]) % 5, dateadd(minute, datediff(minute, 0, [YOURDATE]), 0)) as [TimeBlock]
This should do what you want. Replace dt with your datetime column, c with the column you want to read, and astro_transit1 with your table. 300 is 5 minutes in seconds, so increase it in steps of 300 for wider intervals. FLOOR is used so that the displayed time matches the DIV grouping.
SELECT FROM_UNIXTIME( 300 * FLOOR( UNIX_TIMESTAMP( r.dt ) /300 ) ) AS 5datetime,
( SELECT r.c FROM astro_transit1 ra WHERE ra.dt = r.dt ORDER BY ra.dt DESC LIMIT 1 ) AS first_val
FROM astro_transit1 r
GROUP BY UNIX_TIMESTAMP( r.dt ) DIV 300
LIMIT 0 , 30