SQL - get device continous uptime - sql

Device uptime time series table
There is a device monitor table recording if a device is up (STATE 1) or down for each day.
DEVICE_ID, STATE, DATE
1 0 2017-10-09
1 1 2017-10-10
1 1 2017-10-11
1 1 2017-10-12
1 0 2017-10-13
1 1 2017-10-14
1 1 2017-10-15
1 0 2017-10-16
1 1 2017-10-17
1 0 2017-10-18
...
2 0 2017-10-10
...
Question
How can I the duration of dates during which each device was up? The device 1 went up on 2017-10-10 and went down on 2017-10-13, hence it was up for 3 days (10, 11, 12). Then 2 days from 2017-10-14 to 2017-10-15.
The expected result should look like below.
DEVICE_ID, STATE, DATE
1 3 2017-10-10
1 2 2017-10-14
1 1 2017-10-17
Please advise.

This is a gaps-and-islands problem. You can solve this version with the difference of row numbers:
select device_id, min(date), max(date), count(*) as num_days
from (select t.*,
row_number() over (partition by device_id order by date) as seqnum,
row_number() over (partition by device_id, state order by date) as seqnum_2
from t
) t
where state = 1
group by device_id, (seqnum - seqnum_2), state;
Why this works is a little tricky to explain. If you stare at the results of the subquery, you will see how the difference between the two row number values defines the adjacent values that you want.

Related

How to find the number of events for the first 24 hours for each user id

I'm working on snowflake to solve a problem. I wanted to find the number of events for the first 24 hours for each user id.
This is a snippet of the database table I'm working on. I modified the table and used a date format without the time for simplification purposes.
user_id
client_event_time
1
2022-07-28
1
2022-07-29
1
2022-08-21
2
2022-07-29
2
2022-07-30
2
2022-08-03
I used the following approach to find the minimum event time per user_id.
SELECT user_id, client_event_time,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY client_event_time) row_number,
MIN(client_event_time) OVER (PARTITION BY user_id) MinEventTime
FROM Data
ORDER BY user_id, client_event_time;
user_id
client_event_time
row_number
MinEventTime
1
2022-07-28
1
2022-07-28
1
2022-07-29
2
2022-07-28
1
2022-08-21
3
2022-07-28
2
2022-07-29
1
2022-07-29
2
2022-07-30
2
2022-07-29
2
2022-08-03
3
2022-07-29
Then I tried to find the difference between the minimum event time and client_event_time, and if the difference is less than or equal to 24, I counted the client_event_time.
with NewTable as (
(SELECT user_id,client_event_time, event_type,
row_number() over (partition by user_id order by CLIENT_EVENT_TIME) row_number,
MIN(client_event_time) OVER (PARTITION BY user_id) MinEventTime
FROM Data
ORDER BY user_id, client_event_time))
SELECT user_id,
COUNT(case when timestampdiff(hh, client_event_time, MinEventTime) <= 24 then 1 else 0 end) AS duration
FROM NEWTABLE
GROUP BY user_id
I got the following result:
user_id
duration
1
3
2
3
I wanted to find the following result:
user_id
duration
1
2
2
2
Could you please help me solve this problem? Thanks!
This looks like a problem for windowed functions! I like them a lot.
Here's you sample data
DECLARE #table TABLE (user_id INT, client_event_time DATETIME)
INSERT INTO #table (user_id, client_event_time) VALUES
(1, '2022-07-28 13:30:00'),
(1, '2022-07-29 08:30:00'),
(1, '2022-08-21 12:34:56'),
(2, '2022-07-29 08:30:00'),
(2, '2022-07-30 13:30:00'),
(2, '2022-08-03 12:34:56')
I added some hours to it, so we can look at 24 hour windows more easily. For user_id 1 we can see they had 2 events in the 24 hours after their initial one. For user_id 2 there was only the first one. We can capture that with a MIN OVER, along with the actual datetimes.
SELECT user_id, MIN(client_event_time) OVER (PARTITION BY user_id) AS FirstEventDateTime, client_event_time
FROM #table
user_id FirstEventDateTime client_event_time
-------------------------------------------------------
1 2022-07-28 13:30:00.000 2022-07-28 13:30:00.000
1 2022-07-28 13:30:00.000 2022-07-29 08:30:00.000
1 2022-07-28 13:30:00.000 2022-08-21 12:34:56.000
2 2022-07-29 08:30:00.000 2022-07-29 08:30:00.000
2 2022-07-29 08:30:00.000 2022-07-30 13:30:00.000
2 2022-07-29 08:30:00.000 2022-08-03 12:34:56.000
Now we have the first datetime and each rows datetime in the resultset together, we can make a comparison:
SELECT user_id, MIN(client_event_time) OVER (PARTITION BY user_id) AS FirstEventDateTime, client_event_time, CASE WHEN DATEDIFF(HOUR,MIN(client_event_time) OVER (PARTITION BY user_id), client_event_time) < 24 THEN 1 ELSE 0 END AS EventsInFirst24Hours
FROM #table
user_id FirstEventDateTime client_event_time EventsInFirst24Hours
----------------------------------------------------------------------------
1 2022-07-28 13:30:00.000 2022-07-28 13:30:00.000 1
1 2022-07-28 13:30:00.000 2022-07-29 08:30:00.000 1
1 2022-07-28 13:30:00.000 2022-08-21 12:34:56.000 0
2 2022-07-29 08:30:00.000 2022-07-29 08:30:00.000 1
2 2022-07-29 08:30:00.000 2022-07-30 13:30:00.000 0
2 2022-07-29 08:30:00.000 2022-08-03 12:34:56.000 0
Now we have an indicator telling us which events occurred in the first 24 hours, all we really need is to sum it, but SQL Server is mean about using a windowed function in another aggregate, so we need to cheat and put it into a subquery.
SELECT user_id, SUM(EventsInFirst24Hours) AS CountOfEventsInFirst24Hours
FROM (
SELECT user_id, MIN(client_event_time) OVER (PARTITION BY user_id) AS FirstEventDateTime, client_event_time, CASE WHEN DATEDIFF(HOUR,MIN(client_event_time) OVER (PARTITION BY user_id), client_event_time) < 24 THEN 1 ELSE 0 END AS EventsInFirst24Hours
FROM #table
) a
GROUP BY user_id
And that gets us to the result:
user_id CountOfEventsInFirst24Hours
-----------------------------------
1 2
2 1
A little about what's going on with the windowed function:
MIN - the aggregation we want it to do. The common aggregate functions have windowed counterparts.
(client_event_time) - the value we want to do it to.
OVER (PARTITION BY user_id) - the window we want to set up. In this case we want to know the minimum datetime for each of the user_ids.
We can partition by as many columns as we'd like.
You can also use an ORDER BY with as many columns as you'd like, but that was not necessary here. Ex:
OVER (PARTITION BY column1, column2 ORDER BY column4, column5 DESC)
Partition (or group by) column1 and column2 and order by column4 and column5 descending.
Easier done with a qualify
with cte as
(select *
from mytable
qualify event_time<=min(event_time) over (partition by user_id) + interval '24 hours')
select user_id, count(*) as counts
from cte
group by user_id
If you want the count of events around 24 hours of the minimun event time, you canuse a group by CTE that givbes you all the minumum event tomes for all users
the rest is to get all the rows that are in the tme limit
WITH min_data as
(SELECT user_id,MIN(client_event_time) mindate FROM data GROUP BY user_id)
SELECT d.user_id, COUNT(*)
FROM data d JOIN min_data md ON d.user_id = md.user_id WHERE client_event_time <= mindate + INTERVAL '24 hour'
GROUP BY d.user_id
ORDER BY d.user_id
user_id
count
1
2
2
2

MSSQL - Running sum with reset after gap

I have been trying to solve a problem for a few days now, but I just can't get it solved. Hence my question today.
I would like to calculate the running sum in the following table. My result so far looks like this:
PersonID
Visit_date
Medication_intake
Previous_date
Date_diff
Running_sum
1
2012-04-26
1
1
2012-11-16
1
2012-04-26
204
204
1
2013-04-11
0
1
2013-07-19
1
1
2013-12-05
1
2013-07-19
139
343
1
2014-03-18
1
2013-12-05
103
585
1
2014-06-24
0
2
2014-12-01
1
2
2015-03-09
1
2014-12-01
98
98
2
2015-09-28
0
This is my desired result. So only the running sum over contiguous blocks (Medication_intake=1) should be calculated.
PersonID
Visit_date
Medication_intake
Previous_date
Date_diff
Running_sum
1
2012-04-26
1
1
2012-11-16
1
2012-04-26
204
204
1
2013-04-11
0
1
2013-07-19
1
1
2013-12-05
1
2013-07-19
139
139
1
2014-03-18
1
2013-12-05
103
242
1
2014-06-24
0
2
2014-12-01
1
2
2015-03-09
1
2014-12-01
98
98
2
2015-09-28
0
I work with Microsoft SQL Server 2019 Express.
Thank you very much for your tips!
This is a gaps and islands problem, and one approach uses the difference in row numbers method:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY PersonID
ORDER BY Visit_date) rn1,
ROW_NUMBER() OVER (PARTITION BY PersonId, Medication_intake
ORDER BY Visit_date) rn2
FROM yourTable
)
SELECT PersonID, Visit_date, Medication_intake, Previous_date, Date_diff,
CASE WHEN Date_diff IS NOT NULL AND Medication_intake = 1
THEN SUM(Date_diff) OVER (PARTITION BY PersonID, rn1 - rn2
ORDER BY Visit_date) END AS Running_sum
FROM cte
ORDER BY PersonID, Visit_date;
Demo
The CASE expression in the outer query computes the rolling sum for date diff along islands of records having a medication intake value of 1. For other records, or for records where date diff be null, the value generated is simply null.

Oracle select and update periodically records

I have a table that includes multiple records duplicated status values.
DEVICE STATUS CHANGE_DATE
1 1 21.11.2017 12:01
1 0 21.11.2017 13:05
1 1 21.11.2017 14:06
1 0 21.11.2017 14:26
1 1 21.11.2017 14:36
2 0 21.11.2017 15:28
2 1 21.11.2017 15:39
Device status change priodically.
First question is, I want to select by device id last status is 1.
Query result will be like this;
DEVICE STATUS CHANGE_DATE
1 1 21.11.2017 14:36
2 1 21.11.2017 15:39
The second query is about update for same result. If data comes like this:
DEVICE STATUS CHANGE_DATE
1 1 11.11.2017 12:36
1 1 21.11.2017 14:36
2 1 21.11.2017 15:39
Device 1 is an old date. And I want to update a record is older 2 days.
How can I create these oracle queries? Select and update older records.
I'm tempted to close the question as too broad. But, I'll answer the first part:
select t.*
from (select t.*,
row_number() over (partition by device order by change_date desc) as seqnum
from t
) t
where seqnum = 1;
I think the "update" should be a separate question.

Get MAX count but keep the repeated calculated value if highest

I have the following table, I am using SQL Server 2008
BayNo FixDateTime FixType
1 04/05/2015 16:15:00 tyre change
1 12/05/2015 00:15:00 oil change
1 12/05/2015 08:15:00 engine tuning
1 04/05/2016 08:11:00 car tuning
2 13/05/2015 19:30:00 puncture
2 14/05/2015 08:00:00 light repair
2 15/05/2015 10:30:00 super op
2 20/05/2015 12:30:00 wiper change
2 12/05/2016 09:30:00 denting
2 12/05/2016 10:30:00 wiper repair
2 12/06/2016 10:30:00 exhaust repair
4 12/05/2016 05:30:00 stereo unlock
4 17/05/2016 15:05:00 door handle repair
on any given day need do find the highest number of fixes made on a given bay number, and if that calculated number is repeated then it should also appear in the resultset
so would like to see the result set as follows
BayNo FixDateTime noOfFixes
1 12/05/2015 00:15:00 2
2 12/05/2016 09:30:00 2
4 12/05/2016 05:30:00 1
4 17/05/2016 15:05:00 1
I manage to get the counts of each but struggling to get the max and keep the highest calculated repeated value. can someone help please
Use window functions.
Get the count for each day by bayno and also find the min fixdatetime for each day per bayno.
Then use dense_rank to compute the highest ranked row for each bayno based on the number of fixes.
Finally get the highest ranked rows.
select distinct bayno,minfixdatetime,no_of_fixes
from (
select bayno,minfixdatetime,no_of_fixes
,dense_rank() over(partition by bayno order by no_of_fixes desc) rnk
from (
select t.*,
count(*) over(partition by bayno,cast(fixdatetime as date)) no_of_fixes,
min(fixdatetime) over(partition by bayno,cast(fixdatetime as date)) minfixdatetime
from tablename t
) x
) y
where rnk = 1
Sample Demo
You are looking for rank() or dense_rank(). I would right the query like this:
select bayno, thedate, numFixes
from (select bayno, cast(fixdatetime) as date) as thedate,
count(*) as numFixes,
rank() over (partition by cast(fixdatetime as date) order by count(*) desc) as seqnum
from t
group by bayno, cast(fixdatetime as date)
) b
where seqnum = 1;
Note that this returns the date in question. The date does not have a time component.

How many Days each item was in each State, the full value of the period

This post is really similar to my question:
SQL Server : how many days each item was in each state
but I dont have the column Revision to see wich is the previous state, and also I want to get the full time of a status, I b
....
I'm want to get how long one item has been in one status in general, my table look like this:
ID DATE STATUS
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 2016-04-05 11:30:00.000 1
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 2016-04-08 11:30:00.000 13
274C5DA9-9C38-4A54-A697-009933BB7B7F 2016-04-29 08:00:00.000 5
274C5DA9-9C38-4A54-A697-009933BB7B7F 2016-05-04 08:00:00.000 4
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-14 07:50:00.000 1
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-21 14:00:00.000 2
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-23 12:15:00.000 3
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-23 16:15:00.000 1
BF122AE1-CB39-4967-8F37-012DC55E92A7 2016-04-05 10:30:00.000 1
BF122AE1-CB39-4967-8F37-012DC55E92A7 2016-04-20 17:00:00.000 5
I want to get this
Column 1 : ID Column 2 : Status Column 3 : Time with the status
Column 3 : Time with the status
= NextDate - PreviosDate + 1
if is the last Status, is count as 1
if is more than one Status on the same day, I get the Last one (u can say that only mather the last Status of the day)
by ID, Status must be unique
I should look like this:
ID STATUS TIME
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 1 3
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 13 1
274C5DA9-9C38-4A54-A697-009933BB7B7F 5 5
274C5DA9-9C38-4A54-A697-009933BB7B7F 4 1
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 1 8
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2 2
BF122AE1-CB39-4967-8F37-012DC55E92A7 1 15
BF122AE1-CB39-4967-8F37-012DC55E92A 5 1
Thanks to #ConradFrix comments, this is how works ..
WITH CTE
AS
(
SELECT
ID,
STATUS,
DATE,
LEAD(DATE, 1) over (partition by ID order by DATE) LEAD,
ISNULL(DATEDIFF(DAYOFYEAR, DATE,
LEAD(DATE, 1) over (partition by ID order by DATE)), 1) DIF_BY_LEAD
FROM TABLE_NAME
)
SELECT ID, STATUS, SUM(DIF_BY_LEAD) AS TIME_STATUS
FROM CTE GROUP BY ID, STATUS
ORDER BY ID, STATUS