Whoever answers this thank you so, so much!
Here's a little snippet of my data:
DATE Score Multiplier Weighting
2022-01-05 3 4 7
2022-01-05 4 7 8
2022-01-06 5 2 4
2022-01-06 3 4 7
2022-01-06 4 7 8
2022-01-07 5 2 4
Each row of this data is an event that "happened", and multiple events can occur during the same day.
What I need to do is take the rolling weighted average of this data over the past 3 months.
So for ONLY 2022-01-05, my weighted average (called ADJUSTED) would be:
DATE ADJUSTED
2022-01-05 [(3*4) + (4*7)]/(7+8)
Except I need to do this over the previous 3 months: on Jan 5, 2022, I'd need the rolling weighted average (using the "Weighting" column) over the preceding 3 months. The previous 90 days would also work if that makes it easier.
Not sure if this is a clear enough description, but would appreciate any help.
Thank you!
IF I have interpreted this correctly, I believe a GROUP BY query will meet the need:
sample data
CREATE TABLE mytable(
DATE DATE NOT NULL
,Score INTEGER NOT NULL
,Multiplier INTEGER NOT NULL
,Weighting INTEGER NOT NULL
);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-05',3,4,7);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-05',4,7,8);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',5,2,4);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',3,4,7);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',4,7,8);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-07',5,2,4);
query
select
date
, sum(score) sum_score
, sum(multiplier) sum_multiplier
, sum(weighting) sum_weight
, sum(score * multiplier) * 1.0 / sum(weighting) ADJUSTED
from mytable
group by date
result
+------------+-----------+----------------+------------+-------------------+
| date       | sum_score | sum_multiplier | sum_weight | ADJUSTED          |
+------------+-----------+----------------+------------+-------------------+
| 2022-01-05 |         7 |             11 |         15 | 2.666666666666667 |
| 2022-01-06 |        12 |             13 |         19 | 2.631578947368421 |
| 2022-01-07 |         5 |              2 |          4 | 2.500000000000000 |
+------------+-----------+----------------+------------+-------------------+
Note: I have not attempted to avoid possible divide-by-zero or NULL value problems in the query above.
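The GROUP BY above aggregates within a single day only; the rolling part of the question (the preceding 3 months / 90 days) still needs a look-back over earlier dates. Here is a minimal sketch of that idea using Python's sqlite3 as a stand-in engine (table layout mirrors the sample; the column is named `d` only because DATE is a keyword), applying the asker's own formula SUM(Score*Multiplier)/SUM(Weighting) over a 90-day window:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable(d TEXT, score INT, multiplier INT, weighting INT);
INSERT INTO mytable VALUES
  ('2022-01-05', 3, 4, 7),
  ('2022-01-05', 4, 7, 8),
  ('2022-01-06', 5, 2, 4),
  ('2022-01-06', 3, 4, 7),
  ('2022-01-06', 4, 7, 8),
  ('2022-01-07', 5, 2, 4);
""")

# For each distinct date, a correlated subquery aggregates every row
# whose date falls in the 90 days up to and including that date.
rows = conn.execute("""
SELECT t.d,
       (SELECT SUM(h.score * h.multiplier) * 1.0 / SUM(h.weighting)
          FROM mytable h
         WHERE h.d BETWEEN date(t.d, '-90 days') AND t.d) AS adjusted
  FROM (SELECT DISTINCT d FROM mytable) t
 ORDER BY t.d
""").fetchall()

for d, adj in rows:
    print(d, round(adj, 4))  # 2022-01-05 -> 40/15, then the window grows
```

On 2022-01-05 this gives (3*4 + 4*7) / (7 + 8) = 40/15 ≈ 2.6667, matching the worked example in the question; later dates fold the earlier rows into the window.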
I am using SQL Server.
select DISTINCT caseNumber, dateStarted,dateStopped from patientView where dateStarted !='' and dateStopped != '';
We get the following output:
CaseNumber  dateStarted  dateStopped
1           2022-01-01   2022-01-04
1           2022-01-05   2022-01-19
2           2022-01-03   2022-01-10
4           2022-01-05   2022-01-11
4           2022-01-13   2022-01-14
4           2022-01-21   2022-01-23
5           2022-01-15   2022-01-16
5           2022-01-17   2022-01-24
5           2022-01-24   2022-01-26
8           2022-01-17   2022-01-20
8           2022-01-21   2022-01-28
11          2022-01-18   2022-01-25
11          2022-01-26   2022-01-27
I want to calculate the total duration for each caseNumber. For example, caseNumber 1 has 2 rows, so its total duration would be 18 days.
I would suggest using GROUP BY to collapse the duplicate case numbers, taking the MIN of the start dates and the MAX of the stop dates. You can do something like:
SELECT caseNumber, DATEDIFF(day, min(dateStarted), max(dateStopped)) AS duration
from patientView
where dateStarted != '' and dateStopped != ''
GROUP BY caseNumber;
It is not clear whether you want the sum of the durations for individual patientView records or the duration from the earliest start to the latest end. It is also not clear whether the stop date is inclusive or exclusive. Is 2022-01-01 to 2022-01-04 considered 3 days or 4 days?
Here is code that shows 4 different calculations:
DECLARE #patientView TABLE (CaseNumber INT, dateStarted DATETIME, dateStopped DATETIME)
INSERT #patientView
VALUES
(1, '2022-01-01', '2022-01-04'),
(1, '2022-01-05', '2022-01-19'),
(2, '2022-01-03', '2022-01-10'),
(4, '2022-01-05', '2022-01-11'),
(4, '2022-01-13', '2022-01-14'),
(4, '2022-01-21', '2022-01-23'),
(5, '2022-01-15', '2022-01-16'),
(5, '2022-01-17', '2022-01-24'),
(5, '2022-01-24', '2022-01-26'),
(8, '2022-01-17', '2022-01-20'),
(8, '2022-01-21', '2022-01-28'),
(11, '2022-01-18', '2022-01-25'),
(11, '2022-01-26', '2022-01-27')
SELECT
CaseNumber,
SumDaysExclusive = SUM(DATEDIFF(day, dateStarted, dateStopped)),
SumDaysInclusive = SUM(DATEDIFF(day, dateStarted, dateStopped) + 1),
RangeDaysExclusive = DATEDIFF(day, MIN(dateStarted), MAX(dateStopped)),
RangeDaysInclusive = DATEDIFF(day, MIN(dateStarted), MAX(dateStopped)) + 1
FROM #patientView
GROUP BY CaseNumber
ORDER BY CaseNumber
Results:
CaseNumber  SumDaysExclusive  SumDaysInclusive  RangeDaysExclusive  RangeDaysInclusive
1           17                19                18                  19
2           7                 8                 7                   8
4           9                 12                18                  19
5           10                13                11                  12
8           10                12                11                  12
11          8                 10                9                   10
The test data above uses DATETIME types. (DATE would also work.) If you have dates stored as character data (not a good practice), you may need to add CAST or CONVERT statements.
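The inclusive/exclusive distinction is easy to sanity-check outside the database. A small Python sketch (dates hard-coded from the question's cases 1 and 4) reproduces all four calculations:

```python
from datetime import date

# (caseNumber, dateStarted, dateStopped) rows for cases 1 and 4
rows = [
    (1, date(2022, 1, 1), date(2022, 1, 4)),
    (1, date(2022, 1, 5), date(2022, 1, 19)),
    (4, date(2022, 1, 5), date(2022, 1, 11)),
    (4, date(2022, 1, 13), date(2022, 1, 14)),
    (4, date(2022, 1, 21), date(2022, 1, 23)),
]

def durations(case):
    spans = [(s, e) for c, s, e in rows if c == case]
    sum_excl = sum((e - s).days for s, e in spans)          # SumDaysExclusive
    sum_incl = sum((e - s).days + 1 for s, e in spans)      # SumDaysInclusive
    range_excl = (max(e for _, e in spans)
                  - min(s for s, _ in spans)).days          # RangeDaysExclusive
    return sum_excl, sum_incl, range_excl, range_excl + 1   # ...Inclusive

print(durations(1))  # (17, 19, 18, 19)
print(durations(4))  # (9, 12, 18, 19)
```

Case 4 shows why the choice matters: its three separate stays sum to 9 days exclusive, but the first-start-to-last-stop range is 18 days.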
The data is stored as map(varchar, varchar) and looks like this:
Date Info ID
2020-06-10 {"Price":"102.45", "Time":"09:31", "Symbol":"AAPL"} 10
2020-06-10 {"Price":"10.28", "Time":"12:31", "Symbol":"MSFT"} 10
2020-06-11 {"Price":"12.45", "Time":"09:48", "Symbol":"T"} 10
Is there a way to split up the info column and return a table where each entry has its own column?
Something like this:
Date Price Time Symbol ID
2020-06-10 102.45 09:31 AAPL 10
2020-06-10 10.28 12:31 MSFT 10
Note, there is the potential for the time column to not appear in every entry. For example, an entry can look like this:
Date Info ID
2020-06-10 {"Price":"10.28", "Symbol":"MSFT"} 10
In this case, I would like it to just be filled with a NaN value.
Thanks!
You can use the subscript operator ([]) or the element_at function to access the values in the map. The difference between the two is that [] will fail with an error if the key is missing from the map.
WITH data(dt, info, id) AS (VALUES
(DATE '2020-06-10', map_from_entries(ARRAY[('Price', '102.45'), ('Time', '09:31'), ('Symbol','AAPL')]), 10),
(DATE '2020-06-10', map_from_entries(ARRAY[('Price', '10.28'), ('Time', '12:31'), ('Symbol','MSFT')]), 10),
(DATE '2020-06-11', map_from_entries(ARRAY[('Price', '12.45'), ('Time', '09:48'), ('Symbol','T')]), 10),
(DATE '2020-06-12', map_from_entries(ARRAY[('Price', '20.99'), ('Symbol','X')]), 10))
SELECT
dt AS "date",
element_at(info, 'Price') AS price,
element_at(info, 'Time') AS time,
element_at(info, 'Symbol') AS symbol,
id
FROM data
date | price | time | symbol | id
------------+--------+-------+--------+----
2020-06-10 | 102.45 | 09:31 | AAPL | 10
2020-06-10 | 10.28 | 12:31 | MSFT | 10
2020-06-11 | 12.45 | 09:48 | T | 10
2020-06-12 | 20.99 | NULL | X | 10
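If a quick mental model helps, the NULL-on-missing-key behaviour of element_at versus the failing [] subscript is analogous to Python's dict.get versus plain indexing:

```python
info = {"Price": "10.28", "Symbol": "MSFT"}

# Like element_at: a missing key yields None (NULL) instead of an error.
print(info.get("Time"))

# Like the [] subscript: info["Time"] would raise KeyError here,
# but present keys are returned normally.
print(info["Price"])
```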
This answers the original version of the question.
If the Info column really is a string, you can use regular expressions:
select t.*,
regexp_extract(info, '"Price":"([^"]*)"', 1) as price,
regexp_extract(info, '"Symbol":"([^"]*)"', 1) as symbol,
regexp_extract(info, '"Time":"([^"]*)"', 1) as time
from t;
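For reference, the same extraction is easy to prototype in Python. The helper below is an illustrative stand-in for regexp_extract (here returning None where the SQL would produce a missing value), and shows the patterns working, including the absent "Time" case:

```python
import re

def regexp_extract(s, pattern, group=1):
    # Return the captured group, or None when the pattern
    # does not match anywhere in the string.
    m = re.search(pattern, s)
    return m.group(group) if m else None

info = '{"Price":"10.28", "Symbol":"MSFT"}'
print(regexp_extract(info, r'"Price":"([^"]*)"'))   # 10.28
print(regexp_extract(info, r'"Symbol":"([^"]*)"'))  # MSFT
print(regexp_extract(info, r'"Time":"([^"]*)"'))    # None
```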
I have a transaction table like this:
Trandate channelID branch amount
--------- --------- ------ ------
01/05/2019 1 2 2000
11/05/2019 1 2 2200
09/03/2020 1 2 5600
15/03/2020 1 2 600
12/10/2019 2 10 12000
12/10/2019 2 10 12000
15/11/2019 4 7 4400
15/02/2020 4 2 2500
I need to sum amount and count transactions by year and month. I tried this:
select DISTINCT
DATEPART(YEAR,a.TranDate) as [YearT],
DATEPART(MONTH,a.TranDate) as [monthT],
count(*) as [countoftran],
sum(a.Amount) as [amount],
a.Name as [branch],
a.ChannelName as [channelID]
from transactions as a
where a.TranDate>'20181231'
group by a.Name, a.ChannelName, DATEPART(YEAR,a.TranDate), DATEPART(MONTH,a.TranDate)
order by a.Name, YearT, MonthT
It works like a charm. However, I will use this data in Power BI, and I cannot show these results in a line graph because the year and month are in separate columns.
I tried changing the format in SQL to 'YYYYMM', but Power BI doesn't recognise that column as a date.
So, in the end, I need a result table looks like this:
YearT channelID branch Tamount TranT
--------- --------- ------ ------- -----
31/05/2019 1 2 4200 2
31/03/2020 1 2 6200 2
31/10/2019 2 10 24000 2
30/11/2019 4 7 4400 1
29/02/2020 4 2 2500 1
I have tried several little changes with no result.
Help is much appreciated.
You may try with the following statement:
SELECT
EOMONTH(DATEFROMPARTS(YEAR(Trandate), MONTH(Trandate), 1)) AS YearT,
branch, channelID,
SUM(amount) AS TAmount,
COUNT(*) AS TranT
FROM (VALUES
('20190501', 1, 2, 2000),
('20190511', 1, 2, 2200),
('20200309', 1, 2, 5600),
('20200315', 1, 2, 600),
('20191012', 2, 10, 12000),
('20191012', 2, 10, 12000),
('20191115', 4, 7, 4400),
('20200215', 4, 2, 2500)
) v (Trandate, channelID, branch, amount)
GROUP BY DATEFROMPARTS(YEAR(Trandate), MONTH(Trandate), 1), branch, channelID
ORDER BY DATEFROMPARTS(YEAR(Trandate), MONTH(Trandate), 1)
Result:
YearT       branch  channelID  TAmount  TranT
2019-05-31  2       1          4200     2
2019-10-31  10      2          24000    2
2019-11-30  7       4          4400     1
2020-02-29  2       4          2500     1
2020-03-31  2       1          6200     2
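The month-end grouping is portable: any engine that can build a "first day of month" value can anchor the group. A small sketch with Python's sqlite3 (which lacks EOMONTH but can synthesize the same month-end with date modifiers) reproduces the result above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions(trandate TEXT, channelID INT, branch INT, amount INT);
INSERT INTO transactions VALUES
  ('2019-05-01', 1, 2, 2000), ('2019-05-11', 1, 2, 2200),
  ('2020-03-09', 1, 2, 5600), ('2020-03-15', 1, 2, 600),
  ('2019-10-12', 2, 10, 12000), ('2019-10-12', 2, 10, 12000),
  ('2019-11-15', 4, 7, 4400), ('2020-02-15', 4, 2, 2500);
""")

# date(d, 'start of month', '+1 month', '-1 day') plays the role of
# EOMONTH(DATEFROMPARTS(YEAR(d), MONTH(d), 1)).
rows = conn.execute("""
SELECT date(trandate, 'start of month', '+1 month', '-1 day') AS yearT,
       channelID, branch,
       SUM(amount) AS tamount,
       COUNT(*)   AS trant
  FROM transactions
 GROUP BY yearT, channelID, branch
 ORDER BY yearT
""").fetchall()

for r in rows:
    print(r)
```

Note the leap-year case: February 2020 correctly lands on 2020-02-29.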
Edit: I have edited this question to make it more understandable. Excuse me for any misunderstandings.
I have a temporary table with columns
zone_name, nodeid, nodelabel, nodegainedservice, nodelostservice
Zone1, 3, Windows-SRV1, "2012-11-27 13:10:30+08", "2012-11-27 13:00:40+08"
Zone1, 5, Windows-SRV2, "2012-12-20 13:10:30+08", "2012-12-18 13:00:40+08"
....
....
Many zones and many nodes, and the same nodes gain and lose service many times.
nodegainedservice means the node has come alive, and nodelostservice means the node has gone down.
How could I write a query to fetch each zone's availability over a period?
e.g., Zone1 has Windows-SRV1 and Windows-SRV2. Find how many times, and for how long, Zone1 was down. These servers are replication servers: the zone goes down when all the servers in the zone are down at the same time, and comes back up when any of them comes alive.
Please use the below sample data
zonename nodeid nodelabel noderegainedservice nodelostservice
Zone1 27 Windows-SRV1 2013-02-21 10:04:56+08 2013-02-21 09:48:48+08
Zone1 27 Windows-SRV1 2013-02-21 10:14:01+08 2013-02-21 10:09:27+08
Zone1 27 Windows-SRV1 2013-02-22 10:26:29+08 2013-02-22 10:24:20+08
Zone1 27 Windows-SRV1 2013-02-22 11:27:24+08 2013-02-22 11:25:15+08
Zone1 27 Windows-SRV1 2013-02-28 16:24:59+08 2013-02-28 15:52:59+08
Zone1 27 Windows-SRV1 2013-02-28 16:56:19+08 2013-02-28 16:40:18+08
Zone1 39 Windows-SRV2 2013-02-21 13:15:53+08 2013-02-21 12:26:04+08
Zone1 39 Windows-SRV2 2013-02-23 13:23:10+08 2013-02-22 10:21:14+08
Zone1 39 Windows-SRV2 2013-02-24 13:35:23+08 2013-02-23 13:33:32+08
Zone1 39 Windows-SRV2 2013-02-26 15:17:25+08 2013-02-25 14:25:51+08
Zone1 39 Windows-SRV2 2013-02-28 18:49:56+08 2013-02-28 15:43:01+08
Zone1 13 Windows-SRV3 2013-02-22 17:23:59+08 2013-02-22 10:19:13+08
Zone1 13 Windows-SRV3 2013-02-28 16:54:27+08 2013-02-28 16:13:48+08
Output zone_outages as follows
e.g.,
zonename duration from_time to_time
zone1 00:02:09 2013-02-22 10:24:20+08 2013-02-22 10:26:29+08
zone1 00:02:09 2013-02-22 11:25:15+08 2013-02-22 11:27:24+08
zone1 00:11:11 2013-02-28 16:13:48+08 2013-02-28 16:24:59+08
zone1 00:14:09 2013-02-28 16:40:18+08 2013-02-28 16:54:27+08
Note: There could be entries like this
Zone2 24 Windows-SRV12 \n \n
In this case Zone2's Windows-SRV12 has never gone down, and Zone2's availability will be 100%.
Have you considered PG 9.2's range type instead of two separate timestamp fields?
http://www.postgresql.org/docs/9.2/static/rangetypes.html
Something like:
CREATE TABLE availability (
zone_name varchar, nodeid int, nodelabel varchar, during tsrange
);
INSERT INTO availability
VALUES ('zone1', 3, 'srv1', '[2013-01-01 14:30, 2013-01-01 15:30)');
Unless I'm mistaken, you'd then be able to work with unions, intersections and such, which should make your work simpler. There are likely a few aggregate functions I'm unfamiliar with that cater to this, too.
If needed, additionally look into with statements and window functions for more complex queries:
http://www.postgresql.org/docs/9.2/static/tutorial-window.html
http://www.postgresql.org/docs/9.2/static/functions-window.html
Some testing reveals that sum() doesn't work with tsrange types.
That being said, the sql schema used in the follow-up queries:
drop table if exists nodes;
create table nodes (
zone int not null,
node int not null,
uptime tsrange
);
-- this requires the btree_gist extension:
-- alter table nodes add exclude using gist (uptime with &&, zone with =, node with =);
The data (slight variation from your sample):
insert into nodes values
(1, 1, '[2013-02-20 00:00:00, 2013-02-21 09:40:00)'),
(1, 1, '[2013-02-21 09:48:48, 2013-02-21 10:04:56)'),
(1, 1, '[2013-02-21 10:09:27, 2013-02-21 10:14:01)'),
(1, 1, '[2013-02-22 10:24:20, 2013-02-22 10:26:29)'),
(1, 1, '[2013-02-22 11:25:15, 2013-02-22 11:27:24)'),
(1, 1, '[2013-02-28 15:52:59, 2013-02-28 16:24:59)'),
(1, 1, '[2013-02-28 16:40:18, 2013-02-28 16:56:19)'),
(1, 1, '[2013-02-28 17:00:00, infinity)'),
(1, 2, '[2013-02-20 00:00:01, 2013-02-21 12:15:00)'),
(1, 2, '[2013-02-21 12:26:04, 2013-02-21 13:15:53)'),
(1, 2, '[2013-02-22 10:21:14, 2013-02-23 13:23:10)'),
(1, 2, '[2013-02-23 13:33:32, 2013-02-24 13:35:23)'),
(1, 2, '[2013-02-25 14:25:51, 2013-02-26 15:17:25)'),
(1, 2, '[2013-02-28 15:43:01, 2013-02-28 18:49:56)'),
(2, 3, '[2013-02-20 00:00:01, 2013-02-22 09:01:00)'),
(2, 3, '[2013-02-22 10:19:13, 2013-02-22 17:23:59)'),
(2, 3, '[2013-02-28 16:13:48, 2013-02-28 16:54:27)');
Raw data in order (for clarity):
select *
from nodes
order by zone, uptime, node;
Yields:
zone | node | uptime
------+------+-----------------------------------------------
1 | 1 | ["2013-02-20 00:00:00","2013-02-21 09:40:00")
1 | 2 | ["2013-02-20 00:00:01","2013-02-21 12:15:00")
1 | 1 | ["2013-02-21 09:48:48","2013-02-21 10:04:56")
1 | 1 | ["2013-02-21 10:09:27","2013-02-21 10:14:01")
1 | 2 | ["2013-02-21 12:26:04","2013-02-21 13:15:53")
1 | 2 | ["2013-02-22 10:21:14","2013-02-23 13:23:10")
1 | 1 | ["2013-02-22 10:24:20","2013-02-22 10:26:29")
1 | 1 | ["2013-02-22 11:25:15","2013-02-22 11:27:24")
1 | 2 | ["2013-02-23 13:33:32","2013-02-24 13:35:23")
1 | 2 | ["2013-02-25 14:25:51","2013-02-26 15:17:25")
1 | 2 | ["2013-02-28 15:43:01","2013-02-28 18:49:56")
1 | 1 | ["2013-02-28 15:52:59","2013-02-28 16:24:59")
1 | 1 | ["2013-02-28 16:40:18","2013-02-28 16:56:19")
1 | 1 | ["2013-02-28 17:00:00",infinity)
2 | 3 | ["2013-02-20 00:00:01","2013-02-22 09:01:00")
2 | 3 | ["2013-02-22 10:19:13","2013-02-22 17:23:59")
2 | 3 | ["2013-02-28 16:13:48","2013-02-28 16:54:27")
(17 rows)
Nodes available at 2013-02-21 09:20:00:
with upnodes as (
select zone, node, uptime
from nodes
where '2013-02-21 09:20:00'::timestamp <# uptime
)
select *
from upnodes
order by zone, uptime, node;
Yields:
zone | node | uptime
------+------+-----------------------------------------------
1 | 1 | ["2013-02-20 00:00:00","2013-02-21 09:40:00")
1 | 2 | ["2013-02-20 00:00:01","2013-02-21 12:15:00")
2 | 3 | ["2013-02-20 00:00:01","2013-02-22 09:01:00")
(3 rows)
Nodes available from 2013-02-21 00:00:00 incl to 2013-02-24 00:00:00 excl:
with upnodes as (
select zone, node, uptime
from nodes
where '[2013-02-21 00:00:00, 2013-02-24 00:00:00)'::tsrange && uptime
)
select * from upnodes
order by zone, uptime, node;
Yields:
zone | node | uptime
------+------+-----------------------------------------------
1 | 1 | ["2013-02-20 00:00:00","2013-02-21 09:40:00")
1 | 2 | ["2013-02-20 00:00:01","2013-02-21 12:15:00")
1 | 1 | ["2013-02-21 09:48:48","2013-02-21 10:04:56")
1 | 1 | ["2013-02-21 10:09:27","2013-02-21 10:14:01")
1 | 2 | ["2013-02-21 12:26:04","2013-02-21 13:15:53")
1 | 2 | ["2013-02-22 10:21:14","2013-02-23 13:23:10")
1 | 1 | ["2013-02-22 10:24:20","2013-02-22 10:26:29")
1 | 1 | ["2013-02-22 11:25:15","2013-02-22 11:27:24")
1 | 2 | ["2013-02-23 13:33:32","2013-02-24 13:35:23")
2 | 3 | ["2013-02-20 00:00:01","2013-02-22 09:01:00")
2 | 3 | ["2013-02-22 10:19:13","2013-02-22 17:23:59")
(11 rows)
Zones available from 2013-02-21 00:00:00 incl to 2013-02-24 00:00:00 excl:
with upnodes as (
select zone, node, uptime
from nodes
where '[2013-02-21 00:00:00, 2013-02-24 00:00:00)'::tsrange && uptime
),
upzones_max as (
select u1.zone, tsrange(lower(u1.uptime), max(upper(u2.uptime))) as uptime
from upnodes as u1
join upnodes as u2 on u2.zone = u1.zone and u2.uptime && u1.uptime
group by u1.zone, lower(u1.uptime)
),
upzones as (
select u1.zone, tsrange(min(lower(u2.uptime)), upper(u1.uptime)) as uptime
from upzones_max as u1
join upzones_max as u2 on u2.zone = u1.zone and u2.uptime && u1.uptime
group by u1.zone, upper(u1.uptime)
)
select zone, uptime, upper(uptime) - lower(uptime) as duration
from upzones
order by zone, uptime;
Yields:
zone | uptime | duration
------+-----------------------------------------------+-----------------
1 | ["2013-02-20 00:00:00","2013-02-21 12:15:00") | 1 day 12:15:00
1 | ["2013-02-21 12:26:04","2013-02-21 13:15:53") | 00:49:49
1 | ["2013-02-22 10:21:14","2013-02-23 13:23:10") | 1 day 03:01:56
1 | ["2013-02-23 13:33:32","2013-02-24 13:35:23") | 1 day 00:01:51
2 | ["2013-02-20 00:00:01","2013-02-22 09:01:00") | 2 days 09:00:59
2 | ["2013-02-22 10:19:13","2013-02-22 17:23:59") | 07:04:46
(6 rows)
There might be a better way to write the latter query if you write (or find) a custom aggregate function that sums overlapping range types. The non-trivial issue I ran into was isolating an adequate group by clause; I ended up settling on two nested group by clauses.
The queries could also be rewritten to accommodate your current schema, either by replacing the uptime field by an expression such as tsrange(start_date, end_date), or by writing a view that does so.
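For readers who'd rather verify the interval logic outside SQL: the upzones_max/upzones CTE pair is computing a union of overlapping ranges per zone. A small Python sketch of that merge, run on a subset of the zone 1 sample, reproduces the first two rows of the final result:

```python
from datetime import datetime

ts = lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

def merge_uptimes(intervals):
    # Union of possibly-overlapping [start, end) intervals: sort by
    # start, then extend the last merged interval whenever the next
    # one overlaps or touches it.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Zone 1 node uptimes around 2013-02-21 (subset of the sample data)
up = [
    (ts("2013-02-20 00:00:00"), ts("2013-02-21 09:40:00")),  # node 1
    (ts("2013-02-20 00:00:01"), ts("2013-02-21 12:15:00")),  # node 2
    (ts("2013-02-21 09:48:48"), ts("2013-02-21 10:04:56")),  # node 1
    (ts("2013-02-21 12:26:04"), ts("2013-02-21 13:15:53")),  # node 2
]

for start, end in merge_uptimes(up):
    print(start, "->", end)
```

The merge yields ["2013-02-20 00:00:00", "2013-02-21 12:15:00") and ["2013-02-21 12:26:04", "2013-02-21 13:15:53"), the same zone-1 uptimes the query derives for that window.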
DROP table if exists temptable;
CREATE TABLE temptable
(
zone_name character varying(255),
nodeid integer,
nodelabel character varying(255),
nodegainedservice timestamp with time zone,
nodelostservice timestamp with time zone
);
INSERT INTO tempTable (zone_name, nodeid, nodelabel, nodegainedservice, nodelostservice) VALUES
('Zone1', 27, 'Windows-SRV1', '2013-02-21 10:04:56+08', '2013-02-21 09:48:48+08'),
('Zone1', 27, 'Windows-SRV1', '2013-02-21 10:14:01+08', '2013-02-21 10:09:27+08'),
('Zone1', 27, 'Windows-SRV1', '2013-02-22 10:26:29+08', '2013-02-22 10:24:20+08'),
('Zone1', 27, 'Windows-SRV1', '2013-02-22 11:27:24+08', '2013-02-22 11:25:15+08'),
('Zone1', 27, 'Windows-SRV1', '2013-02-28 16:24:59+08', '2013-02-28 15:52:59+08'),
('Zone1', 27, 'Windows-SRV1', '2013-02-28 16:56:19+08', '2013-02-28 16:40:18+08'),
('Zone1', 39, 'Windows-SRV2', '2013-02-21 13:15:53+08', '2013-02-21 12:26:04+08'),
('Zone1', 39, 'Windows-SRV2', '2013-02-23 13:23:10+08', '2013-02-22 10:21:14+08'),
('Zone1', 39, 'Windows-SRV2', '2013-02-24 13:35:23+08', '2013-02-23 13:33:32+08'),
('Zone1', 39, 'Windows-SRV2', '2013-02-26 15:17:25+08', '2013-02-25 14:25:51+08'),
('Zone1', 39, 'Windows-SRV2', '2013-02-28 18:49:56+08', '2013-02-28 15:43:01+08'),
('Zone2', 13, 'Windows-SRV3', '2013-02-22 17:23:59+08', '2013-02-22 10:19:13+08'),
('Zone2', 13, 'Windows-SRV3', '2013-02-28 16:54:27+08', '2013-02-28 16:13:48+08'),
('Zone2', 14, 'Windows-SRV4', '2013-02-22 11:02:56+08', '2013-02-22 10:01:48+08');
with downodes as (
select zone_name, nodeid, nodelostservice, nodegainedservice
from temptable
WHERE (nodelostservice, nodegainedservice) OVERLAPS ('Wed Feb 20 00:00:00 +0800 2013'::TIMESTAMP, 'Fri Mar 01 00:00:00 +0800 2013'::TIMESTAMP)
),
downzones_max as (
select downodes1.zone_name, downodes1.nodeid, downodes1.nodelostservice, min(downodes2.nodegainedservice) as nodegainedservice
from downodes as downodes1
join downodes as downodes2 on downodes2.zone_name = downodes1.zone_name and ((downodes2.nodelostservice, downodes2.nodegainedservice) OVERLAPS (downodes1.nodelostservice, downodes1.nodegainedservice))
group by downodes1.zone_name, downodes1.nodeid, downodes1.nodelostservice
),
downzones as (
select downodes1.zone_name, downodes1.nodeid, max(downodes2.nodelostservice) as nodelostservice, downodes1.nodegainedservice
from downzones_max as downodes1
join downzones_max as downodes2 on downodes2.zone_name = downodes1.zone_name and ((downodes2.nodelostservice, downodes2.nodegainedservice) OVERLAPS (downodes1.nodelostservice, downodes1.nodegainedservice))
group by downodes1.zone_name, downodes1.nodeid, downodes1.nodegainedservice
),
zone_outages as(
SELECT
zone_name,
nodelostservice,
nodegainedservice,
nodegainedservice - nodelostservice AS duration,
CAST('1' AS INTEGER) as outage_counter
FROM downzones GROUP BY zone_name, nodelostservice, nodegainedservice HAVING COUNT(*) > 1 ORDER BY zone_name, nodelostservice)
select
zone_name,
EXTRACT(epoch from (SUM(duration) / (greatest(1, SUM(outage_counter))))) AS average_duration_seconds,
SUM(outage_counter) AS outage_count
FROM zone_outages GROUP BY zone_name ORDER BY zone_name
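As a cross-check of the approach, the core of zone_outages (a zone is down only while every one of its nodes is down) can be sketched as an interval intersection in Python. Using two of the sample's overlapping outages (timezone offsets dropped for brevity):

```python
from datetime import datetime, timedelta

ts = lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

# Down intervals (nodelostservice -> nodegainedservice) for two
# Zone1 nodes, taken from the sample data.
down = [
    [(ts("2013-02-22 10:24:20"), ts("2013-02-22 10:26:29"))],  # Windows-SRV1
    [(ts("2013-02-22 10:21:14"), ts("2013-02-23 13:23:10"))],  # Windows-SRV2
]

def intersect(a, b):
    # Pairwise intersection of two interval lists: keep only the
    # stretches where both nodes are down simultaneously.
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s < e:
                out.append((s, e))
    return out

zone_down = down[0]
for node_down in down[1:]:
    zone_down = intersect(zone_down, node_down)

for s, e in zone_down:
    print(s, "->", e, "duration", e - s)
```

The intersection runs from 2013-02-22 10:24:20 to 10:26:29, a 00:02:09 outage, matching the first row of the expected zone_outages output in the question.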