Summing dates across multiple rows in SQL? - sql

We have a Table that stores alarms for certain SetPoints in our system. I'm attempting to write a query that first gets the difference between two dates (spread across two rows), and then sums all of the date differences to get a total sum for the amount of time the setpoint was in alarm.
We have one database where I've accomplished something similar, but in that case both the startTime and endTime were in the same row. Here the start and end are on separate rows, so that approach does not work.
Some example Data
| Row | TagID | SetPointID | EventLogTime | InAlarm |
-------------------------------------------------------------------------------------
| 1 | 1 | 2 | 2016-01-01 01:49:18.070 | 1 |
| 2 | 1 | 1 | 2016-01-01 03:23:39.970 | 1 |
| 3 | 1 | 2 | 2016-01-01 03:23:40.070 | 0 |
| 4 | 1 | 1 | 2016-01-01 08:04:01.260 | 0 |
| 5 | 1 | 2 | 2016-01-01 08:04:01.370 | 1 |
| 6 | 1 | 1 | 2016-01-01 11:40:36.367 | 1 |
| 7 | 1 | 2 | 2016-01-01 11:40:36.503 | 0 |
| 8 | 1 | 1 | 2016-01-01 13:00:30.263 | 0 |
Results
| TagID | SetPointID | TotalTimeInAlarm |
------------------------------------------------------
| 1 | 1 | 6.004443 (hours) |
| 1 | 2 | 5.182499 (hours) |
Essentially, what I need to do is get the start time and end time for each tag and each setpoint, and then get the total time in alarm. I'm thinking CTEs might be able to help, but I'm not sure.
I believe the pseudo query logic would be similar to
Define #startTime DATETIME, #endTime DATETIME
SELECT TagID,
SetPointID,
ABS(First Occurrence of InAlarm = True (since last occurrence WHERE InAlarm = False)
- First Occurrence of InAlarm = False (since last occurrence WHERE InAlarm = True))
-- IF no InAlarm = False use #endTime.
GROUP BY TagID, SetPointID

You can use the LEAD windowed function (or LAG) to do this pretty easily. This assumes that the rows always come in pairs with 1-0-1-0 for "InAlarm". If that doesn't happen then it's going to throw things off. You would need to have business rules for these situations in any event.
;WITH CTE_Timespans AS
(
SELECT
TagID,
SetPointID,
InAlarm,
EventLogTime,
LEAD(EventLogTime, 1) OVER (PARTITION BY TagID, SetPointID ORDER BY EventLogTime) AS EndingEventLogTime
FROM
My_Table
)
SELECT
TagID,
SetPointID,
SUM(DATEDIFF(SS, EventLogTime, EndingEventLogTime))/3600.0 AS TotalTime
FROM
CTE_Timespans
WHERE
InAlarm = 1
GROUP BY
TagID,
SetPointID
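As a rough check of the LEAD approach, the sketch below loads the question's sample rows into an in-memory SQLite database from Python and runs an equivalent query. SQLite only stands in for SQL Server here, so several things are assumptions: the table name `alarms` is made up, `julianday()` replaces `DATEDIFF`, and window functions need SQLite 3.25 or later. Because `julianday()` keeps fractional seconds, the totals come out slightly different from the `DATEDIFF(SS, ...)` figures shown in the question.

```python
import sqlite3

# Sample rows from the question: (TagID, SetPointID, EventLogTime, InAlarm).
ROWS = [
    (1, 2, "2016-01-01 01:49:18.070", 1),
    (1, 1, "2016-01-01 03:23:39.970", 1),
    (1, 2, "2016-01-01 03:23:40.070", 0),
    (1, 1, "2016-01-01 08:04:01.260", 0),
    (1, 2, "2016-01-01 08:04:01.370", 1),
    (1, 1, "2016-01-01 11:40:36.367", 1),
    (1, 2, "2016-01-01 11:40:36.503", 0),
    (1, 1, "2016-01-01 13:00:30.263", 0),
]

# Pair every row with the next event of the same tag and setpoint, then
# keep only the spans that start with InAlarm = 1 and add them up.
QUERY = """
WITH spans AS (
    SELECT TagID, SetPointID, InAlarm, EventLogTime,
           LEAD(EventLogTime) OVER (
               PARTITION BY TagID, SetPointID ORDER BY EventLogTime
           ) AS EndingEventLogTime
    FROM alarms
)
SELECT TagID, SetPointID,
       SUM(julianday(EndingEventLogTime) - julianday(EventLogTime)) * 24.0
           AS TotalHours
FROM spans
WHERE InAlarm = 1
GROUP BY TagID, SetPointID
ORDER BY TagID, SetPointID
"""


def total_hours_in_alarm(rows):
    """Return (TagID, SetPointID, hours in alarm) for the given event rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alarms "
        "(TagID INT, SetPointID INT, EventLogTime TEXT, InAlarm INT)"
    )
    conn.executemany("INSERT INTO alarms VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


totals = total_hours_in_alarm(ROWS)
for tag_id, set_point_id, hours in totals:
    print(tag_id, set_point_id, round(hours, 4))
```

An alarm that is still open has no following row, so `LEAD` yields NULL for it and `SUM` skips that span; substituting a fallback end time, as the question's pseudo query suggests, would need an extra `COALESCE`.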

One easy way is to use OUTER APPLY to get the next date that is not InAlarm
SELECT mt.TagID,
mt.SetPointID,
SUM(DATEDIFF(ss,mt.EventLogTime,oa.EventLogTime)) / 3600.0 AS [TotalTimeInAlarm]
FROM MyTable mt
OUTER APPLY (SELECT MIN([EventLogTime]) EventLogTime
FROM MyTable mt2
WHERE mt.TagID = mt2.TagID
AND mt.SetPointID = mt2.SetPointID
AND mt2.EventLogTime > mt.EventLogTime
AND InAlarm = 0
) oa
WHERE mt.InAlarm = 1
GROUP BY mt.TagID,
mt.SetPointID
LEAD() might perform better if using MSSQL 2012+

In SQL Server 2012+ (the first version with LAG):
SELECT tagId, setPointId, SUM(DATEDIFF(second, pt, eventLogTime)) / 3600. AS diff
FROM (
SELECT *,
LAG(inAlarm) OVER (PARTITION BY tagId, setPointId ORDER BY eventLogTime, row) ppa,
LAG(eventLogTime) OVER (PARTITION BY tagId, setPointId ORDER BY eventLogTime, row) pt
FROM (
SELECT LAG(inAlarm) OVER (PARTITION BY tagId, setPointId ORDER BY eventLogTime, row) pa,
*
FROM mytable
) q
WHERE EXISTS
(
SELECT pa
EXCEPT
SELECT inAlarm
)
) q
WHERE ppa = 0
AND inAlarm = 1
GROUP BY
tagId, setPointId
This filters out consecutive events that have the same alarm state before the time differences are computed.

Related

Get some values from the table by selecting

I have a table:
| id | Number |Address
| -----| ------------|-----------
| 1 | 0 | NULL
| 1 | 1 | NULL
| 1 | 2 | 50
| 1 | 3 | NULL
| 2 | 0 | 10
| 3 | 1 | 30
| 3 | 2 | 20
| 3 | 3 | 20
| 4 | 0 | 75
| 4 | 1 | 22
| 4 | 2 | 30
| 5 | 0 | NULL
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh =
(select max(min(t.history)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case where the address first changed and then changed back to a previous value. For example, for id=1 the group by returns:
| Number |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I will be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
These two values are the same only for the trailing run of rows that carry the most recent address in the data. Aggregation then pulls back the earliest row in this group, which is where the last change happened.
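To see the two row numbers line up, here is a small sketch that runs the same query against the question's sample data in an in-memory SQLite database. The table name `t` follows the answer; using SQLite (3.25 or later for window functions) instead of the asker's database is an assumption made only so the example is self-contained.

```python
import sqlite3

# Sample rows from the question: (id, number, address); None stands for NULL.
ROWS = [
    (1, 0, None), (1, 1, None), (1, 2, 50), (1, 3, None),
    (2, 0, 10),
    (3, 1, 30), (3, 2, 20), (3, 3, 20),
    (4, 0, 75), (4, 1, 22), (4, 2, 30),
    (5, 0, None),
]

# seqnum1 counts back from the newest row per id; seqnum2 does the same per
# (id, address). They agree only while the address has not changed yet.
QUERY = """
SELECT id, MIN(number) AS last_change
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY number DESC) AS seqnum1,
           ROW_NUMBER() OVER (PARTITION BY id, address ORDER BY number DESC) AS seqnum2
    FROM t
) AS numbered
WHERE seqnum1 = seqnum2
GROUP BY id
ORDER BY id
"""


def last_address_change(rows):
    """Return (id, number of the row where the current address first appeared)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INT, number INT, address INT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


changes = last_address_change(ROWS)
print(changes)
```

For id=1 the address goes NULL, NULL, 50, NULL, and only the final row (number 3) has matching row numbers, so the earlier NULL rows are correctly ignored.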
I answered my question myself, if anyone needs it, my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Partition & consecutive in SQL

fellow stackers
I have a data set like so:
+---------+------+--------+
| user_id | date | metric |
+---------+------+--------+
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 2 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 3 | 0 |
| 2 | 4 | 1 |
+---------+------+--------+
I am looking to flag those customers who have 3 consecutive "1"s in the metric column. I have a solution as below.
select distinct user_id
from (
select user_id
,metric +
ifnull( lag(metric, 1) OVER (PARTITION BY user_id ORDER BY date), 0 ) +
ifnull( lag(metric, 2) OVER (PARTITION BY user_id ORDER BY date), 0 )
as consecutive_3
from df
) b
where consecutive_3 = 3
While it works, it is not scalable: one can imagine what the above query would look like if I were looking for 50 consecutive "1"s.
May I ask if there is a scalable solution? Any cloud SQL will do. Thank you.
If you only want such users, you can use a sum(). Assuming that metric is only 0 or 1:
select user_id,
(case when max(metric_3) = 3 then 1 else 0 end) as flag_3
from (select df.*,
sum(metric) over (partition by user_id
order by date
rows between 2 preceding and current row
) as metric_3
from df
) df
group by user_id;
By using a windowing clause, you can easily expand to as many adjacent 1s as you like.
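As an illustration of how the frame size scales, the sketch below wraps the windowed-sum query in a Python helper that takes the streak length as a parameter. It runs on an in-memory SQLite database (3.25 or later), which is an assumption, and the helper name `users_with_streak` is hypothetical. Like the answer, it assumes metric is only ever 0 or 1.

```python
import sqlite3

# Sample rows from the question: (user_id, date, metric).
ROWS = [
    (1, 1, 1), (1, 2, 1), (1, 3, 1),
    (2, 1, 1), (2, 2, 1), (2, 3, 0), (2, 4, 1),
]


def users_with_streak(rows, n):
    """Return (user_id, flag) where flag is 1 if the user has n consecutive 1s."""
    n = int(n)  # the frame size is formatted into the SQL, so force an int
    query = f"""
    SELECT user_id,
           CASE WHEN MAX(run_sum) = {n} THEN 1 ELSE 0 END AS flag
    FROM (
        SELECT user_id,
               SUM(metric) OVER (
                   PARTITION BY user_id
                   ORDER BY date
                   ROWS BETWEEN {n - 1} PRECEDING AND CURRENT ROW
               ) AS run_sum
        FROM df
    ) AS windowed
    GROUP BY user_id
    ORDER BY user_id
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE df (user_id INT, date INT, metric INT)")
    conn.executemany("INSERT INTO df VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


flags_3 = users_with_streak(ROWS, 3)
flags_2 = users_with_streak(ROWS, 2)
print(flags_3, flags_2)
```

Only the number in the frame clause changes with the streak length, so looking for 50 in a row is the same query with `49 PRECEDING`.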

SQL: Select single item per name with multiple criteria

I'm trying to select a single item per value in a "Name" column according to several criteria.
The criteria I want to use look like this:
Only include results where IsEnabled = 1
Return the single result with the lowest priority (we're using 1 to mean "top priority")
In case of a tie, return the result with the newest Timestamp
I've seen several other questions that ask about returning the newest timestamp for a given value, and I've been able to adapt that to return the minimum value of Priority - but I can't figure out how to filter off of both Priority and Timestamp.
Here is the question that's been most helpful in getting me this far.
Sample data:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| A | 2018-03-01 | 1 | 5 |
| B | 2018-01-01 | 1 | 1 |
| B | 2018-03-01 | 0 | 1 |
| C | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
| C | 2018-05-01 | 0 | 1 |
| C | 2018-06-01 | 1 | 5 |
+------+------------+-----------+----------+
Desired output:
+------+------------+-----------+----------+
| Name | Timestamp | IsEnabled | Priority |
+------+------------+-----------+----------+
| A | 2018-01-01 | 1 | 1 |
| B | 2018-01-01 | 1 | 1 |
| C | 2018-03-01 | 1 | 1 |
+------+------------+-----------+----------+
What I've tried so far (this gets me only enabled items with lowest priority, but does not filter for the newest item in case of a tie):
SELECT DATA.Name, DATA.Timestamp, DATA.IsEnabled, DATA.Priority
From MyData AS DATA
INNER JOIN (
SELECT MIN(Priority) Priority, Name
FROM MyData
GROUP BY Name
) AS Temp ON DATA.Name = Temp.Name AND DATA.Priority = TEMP.Priority
WHERE IsEnabled=1
Here is a SQL fiddle as well.
How can I enhance this query to only return the newest result in addition to the existing filters?
Use row_number():
select d.*
from (select d.*,
row_number() over (partition by name order by priority, timestamp desc) as seqnum
from mydata d
where isenabled = 1
) d
where seqnum = 1;
The most effective way that I've found for these problems is using a CTE and ROW_NUMBER():
WITH CTE AS(
SELECT *, ROW_NUMBER() OVER( PARTITION BY Name ORDER BY Priority, TimeStamp DESC) rn
FROM MyData
WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
From CTE
WHERE rn = 1;
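The ROW_NUMBER() approach can be checked against the question's sample data with a short sketch like the one below. It uses an in-memory SQLite database from Python, which is an assumption made for portability (window functions need SQLite 3.25 or later); the table and column names follow the question.

```python
import sqlite3

# Sample rows from the question: (Name, Timestamp, IsEnabled, Priority).
ROWS = [
    ("A", "2018-01-01", 1, 1),
    ("A", "2018-03-01", 1, 5),
    ("B", "2018-01-01", 1, 1),
    ("B", "2018-03-01", 0, 1),
    ("C", "2018-01-01", 1, 1),
    ("C", "2018-03-01", 1, 1),
    ("C", "2018-05-01", 0, 1),
    ("C", "2018-06-01", 1, 5),
]

# Rank the enabled rows per name: lowest priority first, newest timestamp
# first within a tie, then keep only the top-ranked row.
QUERY = """
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY Name ORDER BY Priority, Timestamp DESC
           ) AS rn
    FROM MyData
    WHERE IsEnabled = 1
)
SELECT Name, Timestamp, IsEnabled, Priority
FROM ranked
WHERE rn = 1
ORDER BY Name
"""


def best_row_per_name(rows):
    """Return one row per Name according to the question's three criteria."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyData "
        "(Name TEXT, Timestamp TEXT, IsEnabled INT, Priority INT)"
    )
    conn.executemany("INSERT INTO MyData VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


picked = best_row_per_name(ROWS)
for row in picked:
    print(row)
```

The `DESC` on Timestamp is what resolves the tie for C in favour of 2018-03-01; with an ascending sort the older 2018-01-01 row would be returned instead.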

Query update except latest data on same conditions

I need to update all rows that match the same condition (status = 1), except for the latest row.
So this is the table design.
--------------------------------------------------
|idx | status | var1 | date
--------------------------------------------------
| 2 | 1 | cat | 2018-06-17 15:41:32.110
| 3 | 1 | dog | 2018-06-17 11:41:32.110
| 2 | 1 | lamb | 2018-06-17 11:41:32.110
| 2 | 1 | pc | 2018-06-17 09:41:32.110
| 3 | 1 | doll | 2018-06-17 09:41:32.110
What I want is to get all the same conditions
where idx is equal and status = 1, and
update the status to 0 except the most recent row.
In this case, there are 3 rows which have idx of 2 and status = 1,
and 2 rows which have idx of 3 and status = 1.
After the query, the table should look like this
--------------------------------------------------
|idx | status | var1 | date
--------------------------------------------------
| 2 | 1 | cat | 2018-06-17 15:41:32.110
| 3 | 1 | dog | 2018-06-17 11:41:32.110
| 2 | 0 | lamb | 2018-06-17 11:41:32.110
| 2 | 0 | pc | 2018-06-17 09:41:32.110
| 3 | 0 | doll | 2018-06-17 09:41:32.110
I have no idea how to do this. I tried to at least display
the idx values that match the condition more than once, and came up with this query:
select Idx, status, COUNT(Idx) as count from table
group by Idx, status
having COUNT(Idx) > 1 and status = 1
order by Idx
This shows how many rows I have in the same condition,
but I would also like to have rows to display var1 and date
but I don't know how to do that.
Since I am working in .NET, I could build a list of idx values
and run an update for each idx in a for loop,
but I would love to learn how to solve this in SQL itself.
We can try updating with a CTE:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) rn
FROM yourTable
)
UPDATE cte
SET status = 0
WHERE rn > 1 AND status = 1;
You can also achieve it without the CTE:
UPDATE t SET status = 0 FROM tbl t WHERE NOT EXISTS
( SELECT 1 FROM tbl GROUP BY idx HAVING MAX(date)=t.date AND idx=t.idx );
see here: http://rextester.com/BVAS22315
The difference between Tim's and my solution would be that in case of two records with the same idx having exactly the same date, Tim's command would leave only one record unchanged (status=1) while my command would keep them both unchanged.
And, using the window function ROW_NUMBER(), you can also do it like this:
UPDATE t SET status=0 FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) rn
FROM tbl) t
WHERE rn>1
This second version will behave exactly like Tim's solution, see here: http://rextester.com/MFRAMR93418
(Note the identical dates for 'dog' and 'lamb' and only one gets updated.)
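For readers without SQL Server at hand, the ROW_NUMBER() idea can be tried on the question's sample data with the sketch below. It uses an in-memory SQLite database from Python, which is an assumption; SQLite cannot update through a CTE or derived table, so the sketch swaps in a `rowid IN (...)` filter instead of the `UPDATE cte` form shown above. The table name `tbl` follows the second answer.

```python
import sqlite3

# Sample rows from the question: (idx, status, var1, date).
ROWS = [
    (2, 1, "cat", "2018-06-17 15:41:32.110"),
    (3, 1, "dog", "2018-06-17 11:41:32.110"),
    (2, 1, "lamb", "2018-06-17 11:41:32.110"),
    (2, 1, "pc", "2018-06-17 09:41:32.110"),
    (3, 1, "doll", "2018-06-17 09:41:32.110"),
]

# Number the rows per idx from newest to oldest and clear the status of
# everything except the newest one (rn = 1).
UPDATE = """
UPDATE tbl
SET status = 0
WHERE status = 1
  AND rowid IN (
      SELECT rid
      FROM (
          SELECT rowid AS rid,
                 ROW_NUMBER() OVER (PARTITION BY idx ORDER BY date DESC) AS rn
          FROM tbl
      ) AS numbered
      WHERE rn > 1
  )
"""


def clear_all_but_latest(rows):
    """Apply the update and return {var1: status} for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (idx INT, status INT, var1 TEXT, date TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    conn.execute(UPDATE)
    result = dict(conn.execute("SELECT var1, status FROM tbl").fetchall())
    conn.close()
    return result


statuses = clear_all_but_latest(ROWS)
print(statuses)
```

As with the ROW_NUMBER() answers above, two rows with the same idx and identical dates would be ordered arbitrarily, and only one of them would keep status = 1.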

Offsetting MySQL Max

I'm running a MySQL query to get the highest id for each period. I do this with:
SELECT period,max(id) AS maxid
FROM f
WHERE type = '1'
GROUP BY period
This produces:
+--------+-------+
| period | maxid |
+--------+-------+
| 1 | 21878 |
| 2 | 21879 |
| 3 | 20188 |
| 4 | 21873 |
| 5 | 21872 |
| 6 | 21874 |
| 7 | 21875 |
| 8 | 21876 |
| 9 | 21877 |
+--------+-------+
This is the result I am expecting.
However, I now want to run a query which returns the maximum id but one (the second-highest id) for each period. I figured the best way to do this would be to use the offset parameter on LIMIT. To test that this will work, I ran:
SELECT period,(SELECT id FROM freight_data ORDER BY id DESC LIMIT 1) AS maxid
FROM f
WHERE type = '1'
GROUP BY period
This produces:
+--------+-------+
| period | maxid |
+--------+-------+
| 1 | 21903 |
| 2 | 21903 |
| 3 | 21903 |
| 4 | 21903 |
| 5 | 21903 |
| 6 | 21903 |
| 7 | 21903 |
| 8 | 21903 |
| 9 | 21903 |
+--------+-------+
I can see why this is happening: my subquery isn't taking any of the conditions into account when getting the ID, so it's just returning the highest ID in the table.
Thus, my questions are:
How does MAX work? and
Is there a way I can produce a similar result to max(id) but offset by one result?
Any help will be greatly appreciated!
Thanks
You could do this, which is only slightly horrible:
SELECT DISTINCT ff.period, (
SELECT id
FROM f
WHERE period = ff.period
AND type = '1'
ORDER BY id DESC
LIMIT 1, 1
) as max_id_but_1
FROM f as ff
WHERE type = '1';
EDIT:
If every id belongs to only one period, I think you can use this:
SELECT period, max(id)
FROM f
WHERE type = '1'
AND id NOT IN (
SELECT max(id)
FROM f
WHERE type = '1'
GROUP BY period
)
GROUP BY period;
However, you will not get results for periods with only one row. Of course, you could code around that.
If I understand your question correctly you want the second-highest id for each period, right?
This is ugh-tastic and not tested of course:
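SELECT period, max(id) AS maxid
FROM f
WHERE type = '1'
AND id NOT IN (
SELECT max(id)
FROM f
WHERE type = '1'
GROUP BY period
)
GROUP BY period
The filter tests id rather than the maxid alias, because a column alias cannot be referenced in the WHERE clause, and the subquery returns only max(id) so that NOT IN compares a single column.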
SELECT period,
(
SELECT id
FROM f fi
WHERE fi.type = '1'
AND fi.period = f.period
ORDER BY
type DESC, period DESC, id DESC
LIMIT 1, 1
)
FROM f
WHERE type = '1'
GROUP BY
period
Create an index on f (type, period, id) for this to work fast.
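The correlated-subquery approach with an offset can be sketched on a toy table as follows. The rows below are made up for illustration (the question does not show the underlying data), and the example runs on an in-memory SQLite database from Python rather than MySQL, so it spells the offset as `LIMIT 1 OFFSET 1`, the long form of `LIMIT 1, 1`.

```python
import sqlite3

# Hypothetical rows: (id, period, type). Not taken from the question.
ROWS = [
    (1, 1, "1"), (2, 1, "1"), (3, 1, "1"),
    (4, 2, "1"), (5, 2, "2"), (6, 2, "1"),
    (7, 3, "1"),
]

# For each period, sort the matching ids from highest to lowest and skip
# the first one, which leaves the second-highest id (or NULL if none).
QUERY = """
SELECT DISTINCT ff.period,
       (
           SELECT id
           FROM f
           WHERE period = ff.period
             AND type = '1'
           ORDER BY id DESC
           LIMIT 1 OFFSET 1
       ) AS max_id_but_1
FROM f AS ff
WHERE ff.type = '1'
ORDER BY ff.period
"""


def second_highest_per_period(rows):
    """Return (period, second-highest id among type '1' rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE f (id INT, period INT, type TEXT)")
    conn.executemany("INSERT INTO f VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


second = second_highest_per_period(ROWS)
print(second)
```

Unlike the NOT IN variant, this form still returns a row for a period that has only one matching id; the value is simply NULL.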