Group data into separate partitions based on identified NULL values [closed]


Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I'm looking to break a partition based on a NULL value, as seen below in the 'GroupNumber' column. Within my window function statements, there isn't another identifier in my dataset that could break the groups apart, so the point is to create this "GroupNumber" column. Is there a way to break/reset the partition when a NULL value exists (ordered by date DESC)? Note: there can be multiple NULL instances for each partition. Any help is appreciated.
METHODOLOGY:
Create bit flag column to represent NULL values.
Use rolling sum (sorted by date DESC) to create these groups. This is a great method because at each observed NULL value, the "GROUP" field would increment dynamically. This would allow for aggregate calculations using this new field as a partition.
EXAMPLE SETUP:
IF OBJECT_ID('tempdb..#GroupNULL', 'U') IS NOT NULL
DROP TABLE #GroupNULL
CREATE TABLE #GroupNULL
([ID] INT NOT NULL,
[Date] date NULL,
[Number] INT NULL)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/12/2018', 35)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/11/2018', 27)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/10/2018', 7)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/9/2018', 18)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/8/2018', NULL)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/7/2018', 3)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/6/2018', 42)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/5/2018', 16)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/4/2018', 9)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/3/2018', NULL)
FURTHER CONTEXT: I would like to partition this dataset into 2 groups, with the first NULL value (ordered by date DESC) to be the first value of the group.

Here's an example that should get you pretty close. It uses a windowed aggregate to add up the number of NULLs seen so far, in the order the rows are returned by the query. This works on recent versions of SQL Server/SQL Azure (SQL Server 2012 and later, I believe).
drop table t1
create table t1 (col1 int, col2 int)
insert into t1(col1, col2) values (1, 1)
insert into t1(col1, col2) values (1, 10)
insert into t1(col1, col2) values (2, NULL)
insert into t1(col1, col2) values (2, 10)
insert into t1(col1, col2) values (3, 2)
insert into t1(col1, col2) values (3, NULL)
SELECT
col1,
col2,
IsBoundary,
SUM(IsBoundary) OVER(ORDER BY col1, col2 ROWS UNBOUNDED PRECEDING) + 1 as GroupNumber
FROM
(
SELECT
col1,
col2,
CASE WHEN col2 is NULL then 1 ELSE 0 END as IsBoundary
FROM
t1
) A
ORDER BY col1, col2
col1        col2        IsBoundary  GroupNumber
----------- ----------- ----------- -----------
1           1           0           1
1           10          0           1
2           NULL        1           2
2           10          0           2
3           NULL        1           3
3           2           0           3
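The same running-count idea can be tried against the question's sample rows. The sketch below is an illustration only, not the answerer's code: it assumes Python's built-in sqlite3 module in place of SQL Server (window functions need SQLite 3.25 or later), stores the dates as ISO strings so they sort correctly, and uses the hypothetical names group_null, obs_date and num.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; dates are ISO strings so DESC sorts by time.
conn.execute("CREATE TABLE group_null (id INTEGER, obs_date TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO group_null VALUES (?, ?, ?)",
    [
        (1001, "2018-08-12", 35), (1001, "2018-08-11", 27),
        (1001, "2018-08-10", 7), (1001, "2018-08-09", 18),
        (1001, "2018-08-08", None), (1001, "2018-08-07", 3),
        (1001, "2018-08-06", 42), (1001, "2018-08-05", 16),
        (1001, "2018-08-04", 9), (1001, "2018-08-03", None),
    ],
)

# Flag each NULL row, then take a running sum of the flags per id.
query = """
SELECT obs_date, num,
       1 + SUM(CASE WHEN num IS NULL THEN 1 ELSE 0 END)
           OVER (PARTITION BY id ORDER BY obs_date DESC
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS group_number
FROM group_null
ORDER BY id, obs_date DESC
"""
groups = [row[2] for row in conn.execute(query)]
print(groups)  # [1, 1, 1, 1, 2, 2, 2, 2, 2, 3]
```

Because the flag sits on the NULL row itself, each NULL becomes the first row of a new group. The sample data ends with a NULL, so that last row opens a third group on its own.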

SETUP
IF OBJECT_ID('tempdb..#GroupNULL', 'U') IS NOT NULL
DROP TABLE #GroupNULL
CREATE TABLE #GroupNULL
([ID] INT NOT NULL,
[Date] date NULL,
[Number] INT NULL)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/12/2018', 35)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/11/2018', 27)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/10/2018', 7)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/9/2018', 18)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/8/2018', NULL)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/7/2018', 3)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/6/2018', 42)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/5/2018', 16)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/4/2018', 9)
INSERT INTO #GroupNULL (ID, Date, Number) VALUES (1001, '8/3/2018', NULL)
SOLUTION
SELECT x.*,
SUM(Flagged) OVER(ORDER BY ID, Date DESC ROWS UNBOUNDED PRECEDING) AS [GroupNumber]
FROM
(SELECT *,
CASE WHEN LAG(Number) OVER(PARTITION BY ID ORDER BY Date DESC) IS NULL
THEN 1
ELSE 0
END AS [Flagged]
FROM #GroupNULL) x
ID          Date       Number      Flagged     GroupNumber
----------- ---------- ----------- ----------- -----------
1001        2018-08-12 35          1           1
1001        2018-08-11 27          0           1
1001        2018-08-10 7           0           1
1001        2018-08-09 18          0           1
1001        2018-08-08 NULL        0           1
1001        2018-08-07 3           1           2
1001        2018-08-06 42          0           2
1001        2018-08-05 16          0           2
1001        2018-08-04 9           0           2
1001        2018-08-03 NULL        0           2
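Once a GroupNumber exists, it can serve as the partition for aggregates, which was the stated goal of the question. The following is a minimal sketch of that step under some assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or later) rather than SQL Server, the names group_null, obs_date and num are hypothetical, and the running sum is partitioned by id (a small variation on the query above) so numbering would restart for each ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names standing in for #GroupNULL.
conn.execute("CREATE TABLE group_null (id INTEGER, obs_date TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO group_null VALUES (?, ?, ?)",
    [
        (1001, "2018-08-12", 35), (1001, "2018-08-11", 27),
        (1001, "2018-08-10", 7), (1001, "2018-08-09", 18),
        (1001, "2018-08-08", None), (1001, "2018-08-07", 3),
        (1001, "2018-08-06", 42), (1001, "2018-08-05", 16),
        (1001, "2018-08-04", 9), (1001, "2018-08-03", None),
    ],
)

query = """
WITH flagged AS (
    -- LAG is NULL for the first row and for any row that follows a NULL value.
    SELECT id, obs_date, num,
           CASE WHEN LAG(num) OVER (PARTITION BY id ORDER BY obs_date DESC) IS NULL
                THEN 1 ELSE 0 END AS flagged
    FROM group_null
),
grouped AS (
    SELECT id, num,
           SUM(flagged) OVER (PARTITION BY id ORDER BY obs_date DESC
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS group_number
    FROM flagged
)
-- Aggregate per derived group; SUM ignores the NULL values.
SELECT group_number, COUNT(*) AS row_count, SUM(num) AS total
FROM grouped
GROUP BY id, group_number
ORDER BY group_number
"""
summary = conn.execute(query).fetchall()
print(summary)  # [(1, 5, 87), (2, 5, 70)]
```

With the LAG-based flag, each NULL row closes its group instead of opening the next one, which is why the data splits into exactly two groups of five rows.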

Related

Partition the date into a weeks from a given date to the last date in the record

I wanted to count the time gap between two rows for the same id if the second is less than an hour after the first, and partition the count for the week.
Suppose given date with time is 2020-07-01 08:00
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
The weeks should extend up to the last date in the record. Here, the last date is
2020-07-21 10:25
I have to transform the output of this piece of code and divide the duration by week.
select Id, sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from #Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id;
Output:
id duration_minutes
1 41
2 230
3 115
The desired output should divide this duration on a weekly basis,
like Week 1, Week 2, Week 3, and so on.
Desired Output:
If the
start date is 2020-07-01 08:00
end date is 2020-07-21 10:25
id | Week 1 | Week 2 | Week 3
--------------------------------------
1 | 30 | 0 | 11
2 | 115 | 115 | 0
3 | 0 | 0 | 115
similarly, if the
start date is 2020-07-08 08:00
id | Week 1 | Week 2
---------------------------
1 | 11 | 0
2 | 115 | 0
3 | 0 | 115

Is this what you want?
select Id,
1 + datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7) as week_num,
sum(datediff(minute, Time, next_ts)) as duration_minutes
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from Temp t
) t
where datediff(minute, Time, next_ts) < 60
group by Id, datediff(second, '2020-07-01 06:00', time) / (24 * 60 * 60 * 7)
order by id, week_num;
Here is a db<>fiddle.
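The week arithmetic can be checked outside the database. The sketch below is an illustration in plain Python, not a translation of the exact query: it assumes the start date 2020-07-01 08:00 given in the question, assigns each gap to the week of its first timestamp, and uses hypothetical names (START, events, weekly).

```python
from collections import defaultdict
from datetime import datetime, timedelta

START = datetime(2020, 7, 1, 8, 0)  # start date taken from the question
WEEK = timedelta(days=7)

# Sample rows from the question: id -> timestamps.
events = {
    1: ["2020-07-01 08:00", "2020-07-01 08:01", "2020-07-01 08:06",
        "2020-07-01 08:30", "2020-07-08 09:35", "2020-07-15 16:10",
        "2020-07-15 16:20", "2020-07-17 06:40", "2020-07-17 06:41"],
    2: ["2020-07-01 08:30", "2020-07-01 09:26", "2020-07-01 10:25",
        "2020-07-09 08:30", "2020-07-09 09:26", "2020-07-09 10:25"],
    3: ["2020-07-21 08:30", "2020-07-21 09:26", "2020-07-21 10:25"],
}

weekly = defaultdict(int)  # (id, week_num) -> minutes
for ident, stamps in events.items():
    times = sorted(datetime.strptime(s, "%Y-%m-%d %H:%M") for s in stamps)
    # Pair each row with the next one, like LEAD(...) OVER (PARTITION BY id ORDER BY Time).
    for current, nxt in zip(times, times[1:]):
        gap = nxt - current
        if gap < timedelta(minutes=60):
            week_num = 1 + (current - START) // WEEK  # whole weeks since START
            weekly[(ident, week_num)] += int(gap.total_seconds() // 60)

for key in sorted(weekly):
    print(key, weekly[key])
```

For the sample data this yields 30 minutes for id 1 in week 1 and 11 in week 3, 115 for id 2 in each of weeks 1 and 2, and 115 for id 3 in week 3, which agrees with the first desired-output table in the question.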

I am not able to understand the logic behind the week periods. Anyway, in the example below I am using the following code to set the week:
'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
You can adjust it to ignore the hours, be more precise, or otherwise match your real requirements.
Apart from that, you just need to perform a dynamic PIVOT. Here is the full working example:
DROP TABLE IF EXISTS #Temp;
create table #Temp (
Id integer not null,
Time datetime not null
);
insert into #Temp values (1, '2020-07-01 08:00');
insert into #Temp values (1, '2020-07-01 08:01');
insert into #Temp values (1, '2020-07-01 08:06');
insert into #Temp values (1, '2020-07-01 08:30');
insert into #Temp values (1, '2020-07-08 09:35');
insert into #Temp values (1, '2020-07-15 16:10');
insert into #Temp values (1, '2020-07-15 16:20');
insert into #Temp values (1, '2020-07-17 06:40');
insert into #Temp values (1, '2020-07-17 06:41');
insert into #Temp values (2, '2020-07-01 08:30');
insert into #Temp values (2, '2020-07-01 09:26');
insert into #Temp values (2, '2020-07-01 10:25');
insert into #Temp values (2, '2020-07-09 08:30');
insert into #Temp values (2, '2020-07-09 09:26');
insert into #Temp values (2, '2020-07-09 10:25');
insert into #Temp values (3, '2020-07-21 08:30');
insert into #Temp values (3, '2020-07-21 09:26');
insert into #Temp values (3, '2020-07-21 10:25');
DROP TABLE IF EXISTS #TEST
CREATE TABLE #TEST
(
[ID] INT
,[week_day] VARCHAR(12)
,[time_in_minutes] BIGINT
)
DECLARE @FirstDate DATE;
SELECT @FirstDate = MIN(Time)
FROM #Temp
INSERT INTO #TEST
select id
,'Week ' + CAST(DENSE_RANK() OVER (ORDER BY DATEDIFF(DAY, @FirstDate, next_ts) / 7) AS VARCHAR(12))
,datediff(minute, Time, next_ts)
from (select t.*,
lead(Time) over (partition by id order by Time) as next_ts
from #Temp t
) t
where datediff(minute, Time, next_ts) < 60
DECLARE @columns NVARCHAR(MAX);
SELECT @columns = STUFF
(
(
SELECT ',' + QUOTENAME([week_day])
FROM
(
SELECT DISTINCT CAST(REPLACE([week_day], 'Week ', '') AS INT)
,[week_day]
FROM #TEST
) DS ([rowID], [week_day])
ORDER BY [rowID]
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
);
DECLARE @DanymicSQL NVARCHAR(MAX);
SET @DanymicSQL = N'
SELECT [ID], ' + @columns + '
FROM #TEST
PIVOT
(
SUM([time_in_minutes]) FOR [week_day] IN (' + @columns + ')
) PVT';
EXEC sp_executesql @DanymicSQL;
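The pivot step itself (turning one row per id and week into one column per week) can be sketched without dynamic SQL. The example below is plain Python and only an illustration: the long-form totals are assumed values chosen to be consistent with the desired output in the question, and the names long_rows, columns and wide are hypothetical. Unlike PIVOT, which returns NULL for a missing combination, this sketch fills the gaps with 0.

```python
# Assumed long-form totals: (id, week label, minutes).
long_rows = [
    (1, "Week 1", 30), (1, "Week 3", 11),
    (2, "Week 1", 115), (2, "Week 2", 115),
    (3, "Week 3", 115),
]

# Column list ordered by week number, like the ORDER BY [rowID] step in the SQL.
columns = sorted(
    {label for _, label, _ in long_rows},
    key=lambda label: int(label.replace("Week ", "")),
)

# One output row per id, with 0 where an id has no minutes in a week.
wide = {}
for ident, label, minutes in long_rows:
    row = wide.setdefault(ident, dict.fromkeys(columns, 0))
    row[label] += minutes

print("id", *columns, sep=" | ")
for ident in sorted(wide):
    print(ident, *(wide[ident][c] for c in columns), sep=" | ")
```

The column list is built from the data, which is the same reason the SQL version needs a dynamic statement: the number of week columns is not known in advance.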

How to calculate the average value of following n rows based on another column - SQL (Oracle)

I am trying to calculate the average monthly premium for each POLICY_ID on a monthly basis, as shown below. When a customer updates his/her yearly payment frequency to a value other than 12, I need to calculate the average monthly value for the PREMIUM manually. How can I achieve the values shown in MONTHLY_PREMIUM_DESIRED?
Thanks in advance.
Note: Oracle version 12c
What I've tried:
SELECT
T.*,
SUM(PREMIUM) OVER(PARTITION BY T.POLICY_ID ORDER BY T.POLICY_ID, T.PAYMENT_DATE ROWS BETWEEN CURRENT ROW AND 12/T.YEARLY_PAYMENT_FREQ-1 FOLLOWING ) / (12/T.YEARLY_PAYMENT_FREQ) MONTLY_PREMIUM_CALCULATED
FROM MYTABLE T
;
Code for data:
DROP TABLE MYTABLE;
CREATE TABLE MYTABLE (POLICY_ID NUMBER(11), PAYMENT_DATE DATE, PREMIUM NUMBER(5), YEARLY_PAYMENT_FREQ NUMBER(2),MONTHLY_PREMIUM_DESIRED NUMBER(5));
INSERT INTO MYTABLE VALUES (1, DATE '2014-10-01',120,12,120);
INSERT INTO MYTABLE VALUES (1, DATE '2014-11-01',360,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2014-12-01',0,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-01-01',0,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-02-01',360,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-03-01',0,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-04-01',0,4,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-05-01',720,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-06-01',0,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-07-01',0,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-08-01',0,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-09-01',0,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-10-01',0,2,120);
INSERT INTO MYTABLE VALUES (1, DATE '2015-11-01',120,12,120);
INSERT INTO MYTABLE VALUES (2, DATE '2015-01-01',60,3,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-02-01',0,3,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-03-01',0,3,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-04-01',0,3,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-05-01',180,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-06-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-07-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-08-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-09-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-10-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-11-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2015-12-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-01-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-02-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-03-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-04-01',0,1,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-05-01',15,12,15);
INSERT INTO MYTABLE VALUES (2, DATE '2016-06-01',15,12,15);
SELECT * FROM MYTABLE;
EDIT:
Regardless of the payment frequency, the PREMIUM amount can also be changed by the customer. Below, for POLICY_ID = 1, I have added new records starting from "2015/11/01" to demonstrate this situation. In this case, the average monthly premium increased from 120 to 240.
I also removed the screenshot to make the question more readable.
DROP TABLE MYTABLE2;
CREATE TABLE MYTABLE2 (POLICY_ID NUMBER(11), PAYMENT_DATE DATE, PREMIUM NUMBER(5), YEARLY_PAYMENT_FREQ NUMBER(2),MONTHLY_PREMIUM_DESIRED NUMBER(5));
INSERT INTO MYTABLE2 VALUES (1, DATE '2014-10-01',120,12,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2014-11-01',360,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2014-12-01',0,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-01-01',0,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-02-01',360,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-03-01',0,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-04-01',0,4,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-05-01',720,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-06-01',0,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-07-01',0,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-08-01',0,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-09-01',0,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-10-01',0,2,120);
INSERT INTO MYTABLE2 VALUES (1, DATE '2015-11-01',240,12,240);
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-12-01',240,12,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-01-01',960,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-02-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-03-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-04-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-05-01',960,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-06-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-07-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (1, DATE '2016-08-01',0,4,240); --newly added records
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-01-01',60,3,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-02-01',0,3,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-03-01',0,3,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-04-01',0,3,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-05-01',180,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-06-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-07-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-08-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-09-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-10-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-11-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2015-12-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-01-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-02-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-03-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-04-01',0,1,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-05-01',15,12,15);
INSERT INTO MYTABLE2 VALUES (2, DATE '2016-06-01',15,12,15);
SELECT * FROM MYTABLE2;

I think the calculation is:
select t.*,
premium / (12 / yearly_payment_freq) as monthly_premium_calculated
from mytable t;
EDIT:
I see, you also need this spread over the intermediate months. So you can assign the groups by counting the number of non-zero payments. Then:
select t.*,
( max(premium) over (partition by policy_id, grp) /
(12 / yearly_payment_freq)
) as monthly_premium_calculated
from (select t.*,
sum(case when premium > 0 then 1 else 0 end) over (partition by policy_id order by payment_date) as grp
from mytable t
) t;
Here is a db<>fiddle (it uses Postgres because that is easier to set up than Oracle).
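The grouping trick (a running count of non-zero payments, then the group's payment divided by the number of months it covers) can be traced by hand in a few lines. This is a plain-Python sketch on a shortened sample of POLICY_ID = 2, not the Oracle query itself; the variable names are hypothetical, and it assumes, as in the sample data, that the zero-premium rows carry the same frequency as the payment that covers them.

```python
from itertools import accumulate

# Shortened sample for one policy, ordered by payment date:
# (payment_date, premium, yearly_payment_freq)
rows = [
    ("2015-01-01", 60, 3), ("2015-02-01", 0, 3),
    ("2015-03-01", 0, 3), ("2015-04-01", 0, 3),
    ("2015-05-01", 180, 1), ("2015-06-01", 0, 1), ("2015-07-01", 0, 1),
    ("2016-05-01", 15, 12), ("2016-06-01", 15, 12),
]

# Running count of non-zero payments: each payment opens a new group.
grp = list(accumulate(1 if premium > 0 else 0 for _, premium, _ in rows))

# Largest premium per group, i.e. the payment that opened it.
group_max = {}
for g, (_, premium, _) in zip(grp, rows):
    group_max[g] = max(group_max.get(g, 0), premium)

# Spread each payment over the months it covers: 12 / frequency.
monthly = [group_max[g] / (12 / freq) for g, (_, _, freq) in zip(grp, rows)]

print(grp)      # [1, 1, 1, 1, 2, 2, 2, 3, 4]
print(monthly)  # 15.0 for every row
```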

Get multiple lines from a single table and insert into another table in SQL Server

I need to insert data from a table into another table. It has to be only one row or multiple rows depending on the input parameters.
Here is an example
The table with the original rows
ID | PATTERNID
----+-----------
1 | 1
2 | 1
3 | 1
4 | 1
5 | 2
6 | 3
7 | 3
8 | 3
There can be multiple IDs for one pattern.
I need to insert data into another table by pattern ID.
I am trying to write a stored procedure to which I just have to pass the patternID as a parameter, and which inserts the matching rows into another table.
Thanks for the help !

I think this should work for you...
/*
-- create temp tables for test
CREATE TABLE #SourceTable
(
ID INT
, PATTERNID INT
);
CREATE TABLE #TargetTable
(
ID INT
, PATTERNID INT
);
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (1, 1)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (2, 1)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (3, 1)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (4, 1)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (5, 2)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (6, 3)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (7, 3)
INSERT INTO #SourceTable (ID, PATTERNID) VALUES (8, 3)
*/
DELETE FROM #TargetTable
DECLARE @ParamValue INT;
SET @ParamValue = 2;
INSERT INTO #TargetTable
(
ID
, PATTERNID
)
SELECT ID, PATTERNID FROM #SourceTable
WHERE PATTERNID = @ParamValue
SELECT * FROM #TargetTable
Noel
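The question asks for something callable with just the pattern ID. As a rough sketch of that shape, the example below wraps the same INSERT ... SELECT in a parameterized function. It is an assumption-laden illustration: it uses Python's sqlite3 module instead of a SQL Server stored procedure, and the names source_table, target_table and copy_pattern are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for #SourceTable and #TargetTable.
conn.execute("CREATE TABLE source_table (id INTEGER, pattern_id INTEGER)")
conn.execute("CREATE TABLE target_table (id INTEGER, pattern_id INTEGER)")
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 3), (7, 3), (8, 3)],
)


def copy_pattern(connection, pattern_id):
    """Copy every source row with the given pattern id into the target table."""
    connection.execute(
        "INSERT INTO target_table (id, pattern_id) "
        "SELECT id, pattern_id FROM source_table WHERE pattern_id = ?",
        (pattern_id,),  # bound parameter, not string concatenation
    )


copy_pattern(conn, 3)
rows = conn.execute("SELECT id, pattern_id FROM target_table ORDER BY id").fetchall()
print(rows)  # [(6, 3), (7, 3), (8, 3)]
```

A single set-based statement handles both cases from the question: it inserts one row when the pattern has one ID and several rows when it has several.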

SQL select items that make datetime range between flag toggle

Say I have a table like this one:
CREATE TABLE TESTTABLE (
ID Integer NOT NULL,
ATMOMENT Timestamp NOT NULL,
ISALARM Integer NOT NULL,
CONSTRAINT PK_TESTTABLE PRIMARY KEY (ID)
);
It has ISALARM flag that toggles between 0 and 1 at random moments ATMOMENT, like in this example dataset:
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('1', '01.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('2', '01.01.2016, 00:01:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('3', '01.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('4', '01.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('10', '02.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('11', '02.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('12', '02.01.2016, 00:01:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('20', '03.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('21', '03.01.2016, 00:01:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('22', '03.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('23', '03.01.2016, 00:02:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('30', '04.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('31', '04.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('32', '04.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('33', '04.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('40', '05.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('41', '05.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('42', '05.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('43', '05.01.2016, 00:00:00.000', '0');
I need to select all alarm ranges, i.e. the ATMOMENT ranges where ISALARM is set to 1 at the range's beginning (for the first time after the previous range is closed) and reset back to 0 at the range's end. For clarity, assume the first reset is enough to close such a range, and that a simultaneous ISALARM set and reset is treated as the range end (and possibly also as its beginning).
Example dataset above is expected to produce something like this:
ALARMBEGIN | LASTALARMBEGIN | ALARMEND
-------------------------- | -------------------------- | --------
'01.01.2016, 00:00:00.000' | '01.01.2016, 00:01:00.000' | '01.01.2016, 00:02:00.000'
'02.01.2016, 00:00:00.000' | '02.01.2016, 00:00:00.000' | '02.01.2016, 00:01:00.000'
'03.01.2016, 00:00:00.000' | '03.01.2016, 00:02:00.000' | '03.01.2016, 00:02:00.000'
'04.01.2016, 00:00:00.000' | '04.01.2016, 00:00:00.000' | '04.01.2016, 00:00:00.000'
'05.01.2016, 00:00:00.000' | '05.01.2016, 00:00:00.000' | '05.01.2016, 00:00:00.000'
My own solution to this (below) looks pretty ugly and runs stunningly slowly (about 1 minute) even though TESTTABLE holds a relatively small dataset of only ~2500 records (tested with Firebird 2.5 and PostgreSQL; I'm not good at DB optimization; "CREATE INDEX IDX_TESTTABLE1 ON TESTTABLE (ATMOMENT,ISALARM)" helps, but not very much).
This seems strange to me, because a simple linear iteration over all TESTTABLE records (ordered by ATMOMENT), comparing each ISALARM value to that of the previous record, gives me the ranges I want much faster.
Is there an elegant way to make the SQL select faster and cleaner?
SELECT DISTINCT a1.ATMOMENT AS ALARMBEGIN, a2.ATMOMENT AS LASTALARMBEGIN, a3.ATMOMENT AS ALARMEND
FROM TESTTABLE a1
JOIN TESTTABLE a2 ON
(a1.ATMOMENT<a2.ATMOMENT
AND NOT EXISTS(SELECT * FROM TESTTABLE x WHERE
x.ISALARM=0 AND a1.ATMOMENT<=x.ATMOMENT AND x.ATMOMENT<a2.ATMOMENT))
OR (a1.ATMOMENT=a2.ATMOMENT)
JOIN TESTTABLE a3 ON
(a2.ATMOMENT<a3.ATMOMENT
AND NOT EXISTS(SELECT * FROM TESTTABLE x WHERE
(x.ISALARM=0 AND a2.ATMOMENT<=x.ATMOMENT AND x.ATMOMENT<a3.ATMOMENT)
OR (x.ISALARM=1 AND a2.ATMOMENT<x.ATMOMENT AND x.ATMOMENT<=a3.ATMOMENT)))
OR (a2.ATMOMENT=a3.ATMOMENT)
WHERE a1.ISALARM<>0 AND a2.ISALARM<>0 AND a3.ISALARM=0
AND (NOT EXISTS(SELECT * FROM TESTTABLE x1 WHERE
x1.ATMOMENT<a1.ATMOMENT)
OR EXISTS(SELECT * FROM TESTTABLE x1 WHERE
x1.ISALARM=0
AND x1.ATMOMENT<a1.ATMOMENT
AND NOT EXISTS(SELECT * FROM TESTTABLE x2 WHERE
x1.ATMOMENT<x2.ATMOMENT AND x2.ATMOMENT<a1.ATMOMENT)))
ORDER BY a1.ATMOMENT
Thank you.
Upd 1
Thanks to Gordon Linoff's and Jayvee's solutions (which work very well with Firebird 3.0 and PostgreSQL), I've decided to rely on the ordering efficiency of Firebird 2.5 and contrived a "select" which is even uglier than my previous one but runs significantly faster. For those who need it done with Firebird 2.5:
WITH
GROUPEDTABLE_TT (ATMOMENT, NOTISALARMRESET, ISALARMSET)
AS(
SELECT a.ATMOMENT, MIN(a.ISALARM), MAX(a.ISALARM)
FROM TESTTABLE a
GROUP BY a.ATMOMENT),
INTERVALBEGIN_TT
AS(
SELECT a1.ATMOMENT
FROM GROUPEDTABLE_TT a1
WHERE
a1.ISALARMSET<>0
AND (NOT EXISTS (SELECT * FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT<a1.ATMOMENT)
OR (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT<a1.ATMOMENT
ORDER BY x.ATMOMENT DESC)=0)),
INTERVALLAST_TT
AS(
SELECT a2.ATMOMENT FROM GROUPEDTABLE_TT a2
WHERE a2.ISALARMSET=1
AND (a2.NOTISALARMRESET=0
OR (a2.NOTISALARMRESET=1
AND (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT>a2.ATMOMENT
ORDER BY x.ATMOMENT ASC)=0
AND (SELECT FIRST 1 x.ISALARMSET FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT>a2.ATMOMENT
ORDER BY x.ATMOMENT ASC)=0))),
INTERVALEND_TT
AS(
SELECT a1.ATMOMENT
FROM GROUPEDTABLE_TT a1
WHERE
a1.NOTISALARMRESET=0
AND (a1.ISALARMSET=1
OR (a1.ISALARMSET=0
AND (SELECT FIRST 1 x.ISALARMSET FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT<a1.ATMOMENT
ORDER BY x.ATMOMENT DESC)=1
AND (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
x.ATMOMENT<a1.ATMOMENT
ORDER BY x.ATMOMENT DESC)=1))),
ENCLOSEDINTERVALS_TT (BEGINMOMENT, LASTBEGINMOMENT, ENDMOMENT)
AS(
SELECT ib.ATMOMENT,
(SELECT FIRST 1 il.ATMOMENT FROM INTERVALLAST_TT il WHERE
ib.ATMOMENT<=il.ATMOMENT ORDER BY il.ATMOMENT ASC),
(SELECT FIRST 1 ie.ATMOMENT FROM INTERVALEND_TT ie WHERE
ib.ATMOMENT<=ie.ATMOMENT ORDER BY ie.ATMOMENT ASC)
FROM INTERVALBEGIN_TT ib)
SELECT * FROM ENCLOSEDINTERVALS_TT
ORDER BY BEGINMOMENT
Upd 2
...but my selects seem to show quadratic growth (or at least faster than linear) in the number of fetches as the total record count grows; for FB 2.5 it's better to use a procedure with a single-pass linear iteration, or to use FB 3.0 with the solutions below...

This has been tested in PostgreSQL. The idea is to create 3 ordered common table expressions for beginnings, last beginnings and ends respectively, and then join the 3 of them on their row numbers.
It can be done with less code by creating only one CTE, flagging the rows with a CASE expression and then doing a self-join, which you can do later; written this way, though, the code is more self-explanatory and should be fairly efficient too.
;
with beginnings
as
(
select atmoment, row_number() over(order by atmoment) rn from
(
select *, lag(atmoment,1) over(order by atmoment,isalarm desc) prevtime,
lag(isalarm,1) over(order by atmoment,isalarm desc) prevstatus
from testtable
) t
where coalesce(prevstatus,0)=0 and isalarm=1
),
ends
as
(
select atmoment, row_number() over(order by atmoment) rn from
(
select *, lead(atmoment,1) over(order by atmoment,isalarm) nexttime,
lead(isalarm,1) over(order by atmoment,isalarm) nextstatus
from testtable
) t
where coalesce(nextstatus,1)=1 and isalarm=0
),
lastbeginnings
as
(
select atmoment, row_number() over(order by atmoment) rn from
(
select *, lead(atmoment,1) over(order by atmoment,isalarm desc) nexttime,
lead(isalarm,1) over(order by atmoment,isalarm desc) nextstatus
from testtable
) t
where coalesce(nextstatus,0)=0 and isalarm=1
)
select b.atmoment ALARMBEGIN, lb.atmoment LASTALARMBEGIN, e.atmoment ALARMEND
from beginnings b
join lastbeginnings lb on lb.rn=b.rn
join ends e on e.rn=b.rn
result:
> 2016-01-01 00:00:00 | 2016-01-01 00:01:00 | 2016-01-01 00:02:00
> 2016-01-02 00:00:00 | 2016-01-02 00:00:00 | 2016-01-02 00:01:00
> 2016-01-03 00:00:00 | 2016-01-03 00:02:00 | 2016-01-03 00:02:00
> 2016-01-04 00:00:00 | 2016-01-04 00:00:00 | 2016-01-04 00:00:00
> 2016-01-05 00:00:00 | 2016-01-05 00:00:00 | 2016-01-05 00:00:00

I think you can do this in Firebird 3.0, using row_number():
select isalarm, min(atmoment), max(atmoment)
from (select t.*,
row_number() over (order by atmoment) as seqnum,
row_number() over (partition by isalarm order by atmoment) as seqnum_a
from testtable t
) t
group by isalarm, (seqnum - seqnum_a);
It is a little hard to explain how this works. But if you run the subquery, you'll see how the difference identifies the groups you are interested in.
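To make the difference-of-row-numbers idea visible, the sketch below prints the subquery's output for a tiny made-up sequence. It is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or later) rather than Firebird, the table events and its integer column t are hypothetical, and the timestamps are unique (with tied timestamps, as in the question's data, the order of ROW_NUMBER() is not deterministic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: t is a unique, increasing moment; isalarm is the flag.
conn.execute("CREATE TABLE events (t INTEGER, isalarm INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 0), (4, 0), (5, 1), (6, 0)],
)

inner = """
SELECT t, isalarm,
       ROW_NUMBER() OVER (ORDER BY t) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY isalarm ORDER BY t) AS seqnum_a
FROM events
"""

# Within a run of equal flags both counters advance together,
# so their difference stays constant until the flag changes.
for row in conn.execute(inner + " ORDER BY t"):
    print(row, "diff =", row[2] - row[3])

islands = conn.execute(f"""
SELECT isalarm, MIN(t), MAX(t)
FROM ({inner}) AS numbered
GROUP BY isalarm, seqnum - seqnum_a
ORDER BY MIN(t)
""").fetchall()
print(islands)  # [(1, 1, 2), (0, 3, 4), (1, 5, 5), (0, 6, 6)]
```

The difference alone is not unique (rows 3 to 5 all have a difference of 2 here), which is why the grouping key has to include the flag as well.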

Select MAX dates plus ID value

Please consider the following table...
CREATE TABLE #tmp
(
ID int,
userID int,
testID int,
someDate datetime
)
...containing the following values:
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (1, 1, 50, '2010-10-01')
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (2, 1, 50, '2010-11-01')
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (3, 1, 50, '2010-12-01')
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (4, 2, 20, '2010-10-01')
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (5, 2, 30, '2010-11-01')
INSERT INTO #tmp (ID, userID, testID, someDate) VALUES (6, 2, 20, '2012-11-01')
I need to retrieve the maximum date for each userID/testID combination of values, and also the accompanying ID value. The results should be:
ID userID testID someDate
-------------------------------
3 1 50 2010-12-01
5 2 30 2010-11-01
6 2 20 2012-11-01
When I try the following query, the result set is incorrect: all rows are shown. I cannot omit ID from the GROUP BY clause because that causes an error. Can anyone help, please? It seems long-winded to join the table to itself to get these values.
SELECT ID, userID, testID, MAX(someDate)
FROM #tmp
GROUP BY testId,userID,ID;
http://www.sqlfiddle.com/#!6/d41d8/5219

Please try:
select * from (
select
*,
ROW_NUMBER() over (partition by userID, testID order by SomeDate desc) Rnum
From #tmp
)x where Rnum=1
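The ROW_NUMBER() approach can be checked against the sample rows. The sketch below is a hedged illustration: it assumes SQLite via Python's sqlite3 module (3.25 or later) instead of SQL Server, and the names tmp, user_id, test_id and some_date are hypothetical renderings of the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the question's sample data; dates as ISO strings.
conn.execute(
    "CREATE TABLE tmp (id INTEGER, user_id INTEGER, test_id INTEGER, some_date TEXT)"
)
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?, ?)",
    [
        (1, 1, 50, "2010-10-01"), (2, 1, 50, "2010-11-01"),
        (3, 1, 50, "2010-12-01"), (4, 2, 20, "2010-10-01"),
        (5, 2, 30, "2010-11-01"), (6, 2, 20, "2012-11-01"),
    ],
)

# Number the rows of each (user, test) pair from newest to oldest, keep the newest.
query = """
SELECT id, user_id, test_id, some_date
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY user_id, test_id
                              ORDER BY some_date DESC) AS rnum
    FROM tmp
) AS x
WHERE rnum = 1
ORDER BY id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

This returns IDs 3, 5 and 6, matching the expected result in the question. Because the whole row is kept, the ID comes along without a self-join or a GROUP BY on ID.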
