Find out the sequence number of a pattern - sql

Consider below table
Table
ActivityId Flag Type
---------- ----- -----
1 N
2 N
3 Y EXT
4 Y
5 Y
6 N
7 Y INT
8 Y
9 N
10 N
11 N
12 Y EXT
13 N
14 N
15 N
16 Y EXT
17 Y
18 Y INT
19 Y
20 Y EXT
21 Y
22 N
23 N
The first record always has Flag = N; after that, any sequence of Flag = Y or Flag = N can follow. Every time the flag changes from N to Y, the Type field is either EXT or INT. The following Y records (before the next N) may have Type = EXT, INT or NULL, and this is not important.
I want to calculate a Cycle No for this sequence of N/Y. The first cycle starts when Flag = N (the first record always has Flag = N) and ends when the flag changes to Y with Type = EXT. The next cycle then starts when the flag changes to N and ends when the flag becomes Y with Type = EXT. This repeats until all records are processed. The result for the above table is:
Result
ActivityId Flag Type Cycle No
---------- ----- ----- --------
1 N 1
2 N 1
3 Y EXT 1
4 Y
5 Y
6 N 2
7 Y INT 2
8 Y 2
9 N 2
10 N 2
11 N 2
12 Y EXT 2
13 N 3
14 N 3
15 N 3
16 Y EXT 3
17 Y
18 Y INT
19 Y
20 Y EXT
21 Y
22 N 4
23 N 4
I am using SQL Server 2008 R2 (no LAG/LEAD).
Can you please help me find the SQL query to calculate Cycle No?

I've got a solution. It's not pretty, but through stepwise refinement I get to your result.
The solution has three steps:
1. Isolate the ActivityId of every possible cycle start and cycle end row.
2. Filter out all the dud start events.
3. Number the cycles, and find the ActivityId interval for each cycle.
First I select all the start and end cycle events:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='Ext')
)
select * from cte1
This returns
rn ActivityId Cycle
1 1 Start
2 2 Start
3 3 End
4 6 Start
5 9 Start
6 10 Start
7 11 Start
8 12 End
9 13 Start
10 14 Start
11 15 Start
12 16 End
13 20 End
14 22 Start
15 23 Start
The problem now is that while we are sure of when a cycle ends (when Flag is Y and Type is EXT), we are not sure when it starts: rows 1 and 2 both denote a possible start event. Luckily, we can see that only the very first start event, or a start event directly following an End event, should be counted. Since we do not have LAG or LEAD, we must join the CTE with itself:
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
(a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
select * from cte2
This returns
rn ActivityId Cycle PrevCycle
1 1 Start NULL
2 3 End Start
3 6 Start End
4 12 End Start
5 13 Start End
6 16 End Start
7 22 Start End
I select the first start event, since we always start with one, and then keep the End events that follow a Start event.
Finally, we keep the remaining Start events only if the previous event was an End event.
Now we can find the start and end of each cycle, and number them:
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select * from cte3
Which gives us what we need:
CycleNumber StartId EndId
1 1 3
2 6 12
3 13 16
4 22 NULL
Now we just need to join this back on our table:
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
This gives the result you were looking for.
This is just a quick and dirty solution; with a little TLC you could probably tidy it up and reduce the number of steps.
The full solution is here:
with tab as
(select * from (values
(1,'N',''),(2,'N',''),(3,'Y','EXT'),(4,'Y','')
,(5,'Y',''),(6,'N',''),(7,'Y','INT'),(8,'Y','')
,(9,'N',''),(10,'N',''),(11,'N',''),(12,'Y','EXT')
,(13,'N',''),(14,'N',''),(15,'N',''),(16,'Y','EXT')
,(17,'Y',''),(18,'Y','INT'),(19,'Y',''),(20,'Y','EXT')
,(21,'Y',''),(22,'N',''),(23,'N','')) a(ActivityId,Flag,[Type]))
,CTE1 as
( select
ROW_NUMBER() over (order by t1.ActivityId) rn
,t1.ActivityId
,case when t1.Flag='N' then 'Start' else 'End' end Cycle
from tab t1
where t1.Flag='N' or (t1.Flag='Y' and t1.[Type]='Ext')
)
,CTE2 as
(
select ROW_NUMBER() over (order by a1.activityid) rn
,a1.ActivityId
,a1.Cycle
,a2.Cycle PrevCycle
from CTE1 a1 left join CTE1 a2 on a1.rn=a2.rn+1
where
a2.Cycle is null -- First Cycle
or
( a2.Cycle is not null
and
(
(a1.Cycle='End' and a2.Cycle='Start') -- End of cycle
or
(a1.Cycle='Start'
and a2.Cycle='End') -- next cycles
)
)
)
,cte3 as
(
select ROW_NUMBER() over (order by b1.ActivityId) CycleNumber
,b1.ActivityId StartId,b2.ActivityId EndId
from cte2 b1 left join cte2 b2
on b1.rn=b2.rn-1
where b1.Cycle='Start'
)
select
a.ActivityId,a.Flag,a.[Type],CycleNumber
from tab a
left join cte3 b on a.ActivityId between b.StartId and isnull(b.EndId,a.ActivityId)
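As a sanity check on the expected result, the cycle rule can also be written as a small state machine outside the database. This is only a sketch in Python (not T-SQL); the names `cycle_numbers` and `sample` are made up for illustration, and it assumes the rows arrive ordered by ActivityId with a missing Type represented as None.

```python
def cycle_numbers(rows):
    """Return (activity_id, cycle_no) pairs; cycle_no is None between cycles."""
    result = []
    cycle = 0
    in_cycle = False
    for activity_id, flag, type_ in rows:
        # A cycle opens on the first N seen while no cycle is open.
        if not in_cycle and flag == "N":
            cycle += 1
            in_cycle = True
        result.append((activity_id, cycle if in_cycle else None))
        # A cycle closes on a Y row with Type = EXT; that row still belongs to it.
        if in_cycle and flag == "Y" and type_ == "EXT":
            in_cycle = False
    return result


# Sample data from the question: one flag per ActivityId 1..23.
flags = "NNYYYNYYNNNYNNNYYYYYYNN"
types = {3: "EXT", 7: "INT", 12: "EXT", 16: "EXT", 18: "INT", 20: "EXT"}
sample = [(i, f, types.get(i)) for i, f in enumerate(flags, start=1)]
cycles = [c for _, c in cycle_numbers(sample)]
```

Comparing `cycles` with the CycleNumber column of the query is a quick way to confirm that the Start/End filtering in CTE2 behaves as intended.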

If you are happy with recursion, this can be achieved rather simply with a bit of comparison logic to the preceding row when ordered by your ActivityId:
declare @t table(ActivityId int,Flag nvarchar(1),TypeValue nvarchar(3));
insert into @t values(1 ,'N',null),(2 ,'N',null),(3 ,'Y','EXT'),(4 ,'Y',null),(5 ,'Y',null),(6 ,'N',null),(7 ,'Y','INT'),(8 ,'Y',null),(9 ,'N',null),(10,'N',null),(11,'N',null),(12,'Y','EXT'),(13,'N',null),(14,'N',null),(15,'N',null),(16,'Y','EXT'),(17,'Y',null),(18,'Y','INT'),(19,'Y',null),(20,'Y','EXT'),(21,'Y',null),(22,'N',null),(23,'N',null);
with rn as -- Derived table purely to guarantee incremental row number. If you can guarantee your ActivityId values are incremental start to finish, this isn't required.
( select row_number() over (order by ActivityId) as rn
,ActivityId
,Flag
,TypeValue
from @t
),d as
( select rn -- Recursive CTE that compares the current row to the one previous.
,ActivityId
,Flag
,TypeValue
,cast(1 as decimal(10,5)) as CycleNo
from rn
where rn = 1
union all
select rn.rn
,rn.ActivityId
,rn.Flag
,rn.TypeValue
,cast(
case when d.Flag = 'Y' and d.TypeValue = 'EXT' and d.CycleNo >= 1
then case when rn.Flag = 'N'
then d.CycleNo + 1
else (d.CycleNo + 1) * 0.0001 -- This part keeps track of the cycle number in fractional values, which can be removed by converting the final result to INT.
end
else case when rn.Flag = 'N' and d.CycleNo < 1
then d.CycleNo * 10000
else d.CycleNo
end
end
as decimal(10,5)) as CycleNo
from rn
inner join d
on d.rn = rn.rn - 1
)
select ActivityId
,Flag
,TypeValue
,cast(CycleNo as int) as CycleNo
from d
order by ActivityId;
Output:
+------------+------+-----------+---------+
| ActivityId | Flag | TypeValue | CycleNo |
+------------+------+-----------+---------+
| 1 | N | NULL | 1 |
| 2 | N | NULL | 1 |
| 3 | Y | EXT | 1 |
| 4 | Y | NULL | 0 |
| 5 | Y | NULL | 0 |
| 6 | N | NULL | 2 |
| 7 | Y | INT | 2 |
| 8 | Y | NULL | 2 |
| 9 | N | NULL | 2 |
| 10 | N | NULL | 2 |
| 11 | N | NULL | 2 |
| 12 | Y | EXT | 2 |
| 13 | N | NULL | 3 |
| 14 | N | NULL | 3 |
| 15 | N | NULL | 3 |
| 16 | Y | EXT | 3 |
| 17 | Y | NULL | 0 |
| 18 | Y | INT | 0 |
| 19 | Y | NULL | 0 |
| 20 | Y | EXT | 0 |
| 21 | Y | NULL | 0 |
| 22 | N | NULL | 4 |
| 23 | N | NULL | 4 |
+------------+------+-----------+---------+

Related

How to count all the connected nodes (rows) in a graph on Postgres?

My table has account_id and device_id. One account_id could have multiple device_ids and vice versa. I am trying to count the depth of each connected many-to-many relationship.
Ex:
account_id | device_id
1 | 10
1 | 11
1 | 12
2 | 10
3 | 11
3 | 13
3 | 14
4 | 15
5 | 15
6 | 16
How do I construct a query that knows to combine accounts 1-3 together, 4-5 together, and leave 6 by itself? All 7 entries of accounts 1-3 should be grouped together because they all touched the same account_id or device_id at some point. I am trying to group them together and output the count.
Account 1 was used on devices 10, 11 and 12. Those devices were used by other accounts too, so we want to include those accounts (2 and 3) in the group. Account 3 was in turn used on two more devices, so we include them as well. The expansion of the group brings in any other account or device that also "touched" an account or device already in the group.
A diagram is shown below:
You can use a recursive cte:
with recursive t(account_id, device_id) as (
select 1, 10 union all
select 1, 11 union all
select 1, 12 union all
select 2, 10 union all
select 3, 11 union all
select 3, 13 union all
select 3, 14 union all
select 4, 15 union all
select 5, 15 union all
select 6, 16
),
a as (
select distinct t.account_id as a, t2.account_id as a2
from t join
t t2
on t2.device_id = t.device_id and t.account_id >= t2.account_id
),
cte as (
select a.a, a.a2 as mina
from a
union all
select a.a, cte.a
from cte join
a
on a.a2 = cte.a and a.a > cte.a
)
select grp, array_agg(a)
from (select a, min(mina) as grp
from cte
group by a
) a
group by grp;
Here is a SQL Fiddle.
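What the question asks for is the set of connected components of the account/device graph. For comparison, here is a rough union-find sketch in Python (not Postgres); `account_groups` and `sample_pairs` are hypothetical names, and the pairs are the sample rows from the question.

```python
from collections import defaultdict


def account_groups(pairs):
    """Group account ids that are linked through shared devices."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller root so the result is deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Tag ids so that account 10 and device 10 are distinct nodes.
    for account_id, device_id in pairs:
        union(("account", account_id), ("device", device_id))

    groups = defaultdict(set)
    for account_id, _ in pairs:
        groups[find(("account", account_id))].add(account_id)
    return sorted(sorted(group) for group in groups.values())


sample_pairs = [(1, 10), (1, 11), (1, 12), (2, 10), (3, 11),
                (3, 13), (3, 14), (4, 15), (5, 15), (6, 16)]
grouped = account_groups(sample_pairs)
```

Here `grouped` puts accounts 1 to 3 together, 4 and 5 together, and 6 on its own, which is the grouping the recursive CTE should reproduce.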
You can GROUP BY the device_id and then aggregate together the account_id into a Postgres array. Here is an example query, although I'm not sure what your actual table name is.
SELECT
device_id,
array_agg(account_id) as account_ids
FROM account_device --or whatever your table name is
GROUP BY device_id;
Results - hope it's what you're looking for:
16 | {6}
15 | {4,5}
13 | {3}
11 | {1,3}
14 | {3}
12 | {1}
10 | {1,2}
-- \i tmp.sql
CREATE TABLE graph(
account_id integer NOT NULL --references accounts(id)
, device_id integer not NULL --references devices(id)
,PRIMARY KEY(account_id, device_id)
);
INSERT INTO graph (account_id, device_id)VALUES
(1,10) ,(1,11) ,(1,12)
,(2,10)
,(3,11) ,(3,13) ,(3,14)
,(4,15)
,(5,15)
,(6,16)
;
-- SELECT* FROM graph ;
-- Find the (3) group leaders
WITH seed AS (
SELECT row_number() OVER () AS cluster_id -- give them a number
, g.account_id
, g.device_id
FROM graph g
WHERE NOT EXISTS (SELECT*
FROM graph nx
WHERE (nx.account_id = g.account_id OR nx.device_id = g.device_id)
AND nx.ctid < g.ctid
)
)
SELECT * FROM seed
;
WITH recursive omg AS (
--use the above CTE in a sub-CTE
WITH seed AS (
SELECT row_number()OVER () AS cluster_id
, g.account_id
, g.device_id
, g.ctid AS wtf --we need an (ordered!) canonical id for the tuples
-- ,just to identify and exclude them
FROM graph g
WHERE NOT EXISTS (SELECT*
FROM graph nx
WHERE (nx.account_id = g.account_id OR nx.device_id = g.device_id) AND nx.ctid < g.ctid
)
)
SELECT s.cluster_id
, s.account_id
, s.device_id
, s.wtf
FROM seed s
UNION ALL
SELECT o.cluster_id
, g.account_id
, g.device_id
, g.ctid AS wtf
FROM omg o
JOIN graph g ON (g.account_id = o.account_id OR g.device_id = o.device_id)
-- AND (g.account_id > o.account_id OR g.device_id > o.device_id)
AND g.ctid > o.wtf
-- we still need to exclude duplicates here
-- (which could occur if there are cycles in the graph)
-- , this could be done using an array
)
SELECT *
FROM omg
ORDER BY cluster_id, account_id,device_id
;
Results:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 10
cluster_id | account_id | device_id
------------+------------+-----------
1 | 1 | 10
2 | 4 | 15
3 | 6 | 16
(3 rows)
cluster_id | account_id | device_id | wtf
------------+------------+-----------+--------
1 | 1 | 10 | (0,1)
1 | 1 | 11 | (0,2)
1 | 1 | 12 | (0,3)
1 | 1 | 12 | (0,3)
1 | 2 | 10 | (0,4)
1 | 3 | 11 | (0,5)
1 | 3 | 13 | (0,6)
1 | 3 | 14 | (0,7)
1 | 3 | 14 | (0,7)
2 | 4 | 15 | (0,8)
2 | 5 | 15 | (0,9)
3 | 6 | 16 | (0,10)
(12 rows)
Newer version (I added an Id column to the table)
-- for convenience :set of all adjacent nodes.
CREATE TEMP VIEW pair AS
SELECT one.id AS one
, two.id AS two
FROM graph one
JOIN graph two ON (one.account_id = two.account_id OR one.device_id = two.device_id)
AND one.id <> two.id
;
WITH RECURSIVE flood AS (
SELECT g.id, g.id AS parent_id
, 0 AS lev
, ARRAY[g.id]AS arr
FROM graph g
UNION ALL
SELECT c.id, p.parent_id AS parent_id
, 1+p.lev AS lev
, p.arr || ARRAY[c.id] AS arr
FROM graph c
JOIN flood p ON EXISTS (
SELECT * FROM pair WHERE p.id = pair.one AND c.id = pair.two)
AND p.parent_id < c.id
AND NOT p.arr @> ARRAY[c.id] -- avoid cycles/loops
)
SELECT g.*, a.parent_id
, dense_rank() over (ORDER by a.parent_id)AS group_id
FROM graph g
JOIN (SELECT id, MIN(parent_id) AS parent_id
FROM flood
GROUP BY id
) a
ON g.id = a.id
ORDER BY a.parent_id, g.id
;
New results:
CREATE VIEW
id | account_id | device_id | parent_id | group_id
----+------------+-----------+-----------+----------
1 | 1 | 10 | 1 | 1
2 | 1 | 11 | 1 | 1
3 | 1 | 12 | 1 | 1
4 | 2 | 10 | 1 | 1
5 | 3 | 11 | 1 | 1
6 | 3 | 13 | 1 | 1
7 | 3 | 14 | 1 | 1
8 | 4 | 15 | 8 | 2
9 | 5 | 15 | 8 | 2
10 | 6 | 16 | 10 | 3
(10 rows)

Count groups of NULL values - partition or window?

n | g
---------
1 | 1
2 | NULL
3 | 1
4 | 1
5 | 1
6 | 1
7 | NULL
8 | NULL
9 | NULL
10 | 1
11 | 1
12 | 1
13 | 1
14 | 1
15 | 1
16 | 1
17 | NULL
18 | 1
19 | 1
20 | 1
21 | NULL
22 | 1
23 | 1
24 | 1
25 | 1
26 | NULL
27 | NULL
28 | 1
29 | 1
30 | NULL
31 | 1
From the above column g I should get this result:
x|y
---
1|4
2|1
3|1
where
x stands for the count of contiguous NULLs and
y stands for the times a single group of NULLs occurs.
I.e., there is ...
4 groups of only 1 NULL,
1 group of 2 NULLs and
1 group of 3 NULLs
Compute a running count of not-null values with a window function to form groups, then apply two nested counts ...
SELECT x, count(*) AS y
FROM (
SELECT grp, count(*) FILTER (WHERE g IS NULL) AS x
FROM (
SELECT g, count(g) OVER (ORDER BY n) AS grp
FROM tbl
) sub1
WHERE g IS NULL
GROUP BY grp
) sub2
GROUP BY 1
ORDER BY 1;
count() only counts not null values.
This includes the preceding row with a not null g in the following group (grp) of NULL values - which has to be removed from the count.
I replaced the HAVING clause I had for that in my initial query with WHERE g IS NULL (like @klin uses in his answer); that's simpler.
Related:
Find "n" consecutive free numbers from table
Select longest continuous sequence
If n is a gapless sequence of integer numbers, you can simplify further:
SELECT x, count(*) AS y
FROM (
SELECT grp, count(*) AS x
FROM (
SELECT n - row_number() OVER (ORDER BY n) AS grp
FROM tbl
WHERE g IS NULL
) sub1
GROUP BY 1
) sub2
GROUP BY 1
ORDER BY 1;
Eliminate not null values immediately and deduct the row number from n, thereby arriving at (meaningless) group numbers directly ...
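To see why subtracting the row number works, here is a small sketch of the same idea in Python rather than SQL. The names `null_run_histogram` and `sample` are invented for this illustration; the data is column g from the question, and n is assumed to be gapless.

```python
from collections import Counter


def null_run_histogram(rows):
    """rows: (n, g) pairs ordered by n. Returns sorted (x, y) pairs."""
    null_ns = [n for n, g in rows if g is None]
    # Within a run of consecutive n values, n - row_number stays constant,
    # so it serves as a (meaningless) group key.
    group_keys = [n - row_number for row_number, n in enumerate(null_ns, start=1)]
    run_lengths = Counter(group_keys)          # group key -> x (NULLs in the run)
    return sorted(Counter(run_lengths.values()).items())  # x -> y (runs of that size)


null_positions = {2, 7, 8, 9, 17, 21, 26, 27, 30}
sample = [(n, None if n in null_positions else 1) for n in range(1, 32)]
histogram = null_run_histogram(sample)
```

For the sample data the group keys come out as 1, 5, 5, 5, 12, 15, 19, 19, 21, so there are four runs of length 1, one of length 2 and one of length 3.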
As long as the only possible value in g is 1, sum() is a smart trick (like @klin provided). But then it should really be a boolean column; a numeric type wouldn't make sense. So I assume that's just a simplification of the actual problem in the question.
select x, count(x) y
from (
select s, count(s) x
from (
select *, sum(g) over (order by i) as s
from example
) s
where g isnull
group by 1
) s
group by 1
order by 1;
Test it here.

Select statement that duplicates rows based on N value of column

I have a Power table that stores building circuit details. A circuit can be 1 phase or 3 phase but is always represented as 1 row in the circuit table.
I want to insert the details of the circuits into a join table which joins panels to circuits
My current circuit table has the following details
CircuitID | Voltage | Phase | PanelID | Cct |
1 | 120 | 1 | 1 | 1 |
2 | 208 | 3 | 1 | 3 |
3 | 208 | 2 | 1 | 8 |
Is it possible to write a SELECT that, when it sees a 3-phase row, returns 3 rows (or 2 rows for a 2-phase row) and increments the Cct column by 1 each time, or do I have to write a loop?
CircuitID | PanelID | Cct |
1 | 1 | 1 |
2 | 1 | 3 |
2 | 1 | 4 |
2 | 1 | 5 |
3 | 1 | 8 |
3 | 1 | 9 |
Here is one way to do it.
First generate numbers using a tally table (the best way to do this). There is an excellent article about generating numbers without loops: "Generate a set or sequence without loops".
Then join the numbers table to your table, keeping the rows where the Phase value of each record is greater than or equal to the sequence number in the numbers table:
;WITH e1(n) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e1 CROSS JOIN e2), -- 10*100
numbers as ( SELECT n = ROW_NUMBER() OVER (ORDER BY n) FROM e3 )
SELECT CircuitID,
PanelID,
Cct = Cct + ( n - 1 )
FROM Yourtable a
JOIN numbers b
ON a.Phase >= b.n
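The join against the numbers table simply repeats each circuit once per phase and offsets Cct by the repeat index. A minimal Python sketch of that expansion (not T-SQL; `expand_circuits` and `circuits` are names assumed here, with the rows taken from the question) looks like this:

```python
def expand_circuits(rows):
    """rows: (circuit_id, phase, panel_id, cct). One output row per phase."""
    expanded = []
    for circuit_id, phase, panel_id, cct in rows:
        # n runs 1..phase, mirroring the join condition Phase >= n.
        for n in range(1, phase + 1):
            expanded.append((circuit_id, panel_id, cct + (n - 1)))
    return expanded


circuits = [(1, 1, 1, 1), (2, 3, 1, 3), (3, 2, 1, 8)]
expanded_rows = expand_circuits(circuits)
```

The inner loop plays the role of the tally table, so no explicit loop is needed once that table exists on the SQL side.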
You can do this with a single recursive CTE.
WITH cte AS
(
SELECT [CircuitID], [Voltage], [Phase], [PanelID], [Cct], [Cct] AS [Ref]
FROM [Power]
UNION ALL
SELECT [CircuitID], [Voltage], [Phase], [PanelID], [Cct] + 1, [Ref]
FROM cte
WHERE [Cct] + 1 < [Phase] + [Ref]
)
SELECT [CircuitID], [PanelID], [Cct]
FROM cte
ORDER BY [CircuitID]
Simplest way,
Select y.CircuitID, y.PanelID, y.Cct + x.x - 1 Cct from (
Select 1 CircuitID,120 Voltage,1 Phase,1 PanelID, 1 Cct
union
Select 2,208,3,1,3
union
Select 3,208,2,1,8)y,
(Select 1 x
union
Select 2 x
union
Select 3 x)x
Where x.x <= y.Phase
You can copy and paste this directly and try it. After that, just replace my 'y' table with your real table.

If the difference between two sequences is bigger than 30, deduct bigger sequence

I'm having a hard time trying to make a query that gets a lot of numbers, a sequence of numbers, and if the difference between two of them is bigger than 30, then the sequence resets from this number. So, I have the following table, which has another column other than the number one, which should be maintained intact:
+----+--------+--------+
| Id | Number | Status |
+----+--------+--------+
| 1 | 1 | OK |
| 2 | 1 | Failed |
| 3 | 2 | Failed |
| 4 | 3 | OK |
| 5 | 4 | OK |
| 6 | 36 | Failed |
| 7 | 39 | OK |
| 8 | 47 | OK |
| 9 | 80 | Failed |
| 10 | 110 | Failed |
| 11 | 111 | OK |
| 12 | 150 | Failed |
| 13 | 165 | OK |
+----+--------+--------+
It should turn it into this one:
+----+--------+--------+
| Id | Number | Status |
+----+--------+--------+
| 1 | 1 | OK |
| 2 | 1 | Failed |
| 3 | 2 | Failed |
| 4 | 3 | OK |
| 5 | 4 | OK |
| 6 | 1 | Failed |
| 7 | 4 | OK |
| 8 | 12 | OK |
| 9 | 1 | Failed |
| 10 | 1 | Failed |
| 11 | 2 | OK |
| 12 | 1 | Failed |
| 13 | 16 | OK |
+----+--------+--------+
Thanks for your attention, I will be available to clear any doubt regarding my problem! :)
EDIT: Sample of this table here: http://sqlfiddle.com/#!6/ded5af
With this test case:
declare @data table (id int identity, Number int, Status varchar(20));
insert @data(number, status) values
( 1,'OK')
,( 1,'Failed')
,( 2,'Failed')
,( 3,'OK')
,( 4,'OK')
,( 4,'OK') -- to be deleted, ensures IDs are not sequential
,(36,'Failed') -- to be deleted, ensures IDs are not sequential
,(36,'Failed')
,(39,'OK')
,(47,'OK')
,(80,'Failed')
,(110,'Failed')
,(111,'OK')
,(150,'Failed')
,(165,'OK')
;
delete @data where id between 6 and 7;
This SQL:
with renumbered as (
select rn = row_number() over (order by id), data.*
from @data data
),
paired as (
select
this.*,
startNewGroup = case when this.number - prev.number >= 30
or prev.id is null then 1 else 0 end
from renumbered this
left join renumbered prev on prev.rn = this.rn -1
),
groups as (
select Id,Number, GroupNo = Number from paired where startNewGroup = 1
)
select
Id
,Number = 1 + Number - (
select top 1 GroupNo
from groups where groups.id <= paired.id
order by GroupNo desc)
,status
from paired
;
yields as desired:
Id Number status
----------- ----------- --------------------
1 1 OK
2 1 Failed
3 2 Failed
4 3 OK
5 4 OK
8 1 Failed
9 4 OK
10 12 OK
11 1 Failed
12 1 Failed
13 2 OK
14 1 Failed
15 16 OK
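The logic behind the query is a single pass that remembers the number at which the current group started. As a rough Python sketch (not T-SQL; `renumber` and `numbers` are hypothetical names, and the threshold of 30 with `>=` follows the query above):

```python
def renumber(values, threshold=30):
    """Restart the sequence at 1 whenever the gap to the previous value is >= threshold."""
    result = []
    group_start = None
    previous = None
    for value in values:
        # The first row, or a large enough gap, starts a new group.
        if previous is None or value - previous >= threshold:
            group_start = value
        result.append(1 + value - group_start)
        previous = value
    return result


numbers = [1, 1, 2, 3, 4, 36, 39, 47, 80, 110, 111, 150, 165]
renumbered = renumber(numbers)
```

Note that the gap from 80 to 110 is exactly 30 and still starts a new group, which matches the expected output in the question.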
Update: using the new LAG() function allows somewhat simpler SQL without a self-join early on:
with renumbered as (
select
data.*
,gap = number - lag(number, 1) over (order by number)
from @data data
),
paired as (
select
*,
startNewGroup = case when gap >= 30 or gap is null then 1 else 0 end
from renumbered
),
groups as (
select Id,Number, GroupNo = Number from paired where startNewGroup = 1
)
select
Id
,Number = 1 + Number - ( select top 1 GroupNo
from groups
where groups.id <= paired.id
order by GroupNo desc
)
,status
from paired
;
I don't deserve credit for this answer, but I think this is even shorter:
with gapped as
( select id, number, gap = number - lag(number, 1) over (order by id)
from @data data
)
select Id, status,
ReNumber = Number + 1 - isnull( (select top 1 gapped.Number
from gapped
where gapped.id <= data.id
and gap >= 30
order by gapped.id desc), 1)
from @data data;
This is simply Pieter Geerkens's answer slightly simplified. I removed some intermediate results and columns:
with renumbered as (
select data.*, gap = number - lag(number, 1) over (order by number)
from @data data
),
paired as (
select *
from renumbered
where gap >= 30 or gap is null
)
select Id, Number = 1 + Number - (select top 1 Number
from paired
where paired.id <= renumbered.id
order by Number desc)
, status
from renumbered;
It should have been a comment, but it's too long for that and wouldn't be understandable.
If your IDs are not in sequential order, you might need to add another CTE before this one and use row_number instead of Id to join the recursive CTE.
WITH cte AS
( SELECT
Id, [Number], [Status],
0 AS Diff,
[Number] AS [NewNumber]
FROM
Table1
WHERE Id = 1
UNION ALL
SELECT
t1.Id, t1.[Number], t1.[Status],
CASE WHEN t1.[Number] - cte.[Number] >= 30 THEN t1.Number - 1 ELSE Diff END,
CASE WHEN t1.[Number] - cte.[Number] >= 30 THEN 1 ELSE t1.[Number] - Diff END
FROM Table1 t1
JOIN cte ON cte.Id + 1 = t1.Id
)
SELECT Id, [NewNumber], [Status]
FROM cte
SQL Fiddle
Here is another SQL Fiddle with an example of what you would do if the ID is not sequential.
SQL Fiddle 2
In case SQL Fiddle stops working:
--Order table to make sure there is a sequence to follow
WITH OrderedSequence AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY Id) RnId,
Id,
[Number],
[Status]
FROM
Sequence
),
RecursiveCte AS
( SELECT
Id, [Number], [Status],
0 AS Diff,
[Number] AS [NewNumber],
RnId
FROM
OrderedSequence
WHERE RnId = 1
UNION ALL
SELECT
t1.Id, t1.[Number], t1.[Status],
CASE WHEN t1.[Number] - cte.[Number] >= 30 THEN t1.Number - 1 ELSE Diff END,
CASE WHEN t1.[Number] - cte.[Number] >= 30 THEN 1 ELSE t1.[Number] - Diff END,
t1.RnId
FROM OrderedSequence t1
JOIN RecursiveCte cte ON cte.RnId + 1 = t1.RnId
)
SELECT Id, [NewNumber], [Status]
FROM RecursiveCte
I tried to optimize the queries here, since it took 1h20m to process my data. I had it down to 30s after some further research.
WITH AuxTable AS
( SELECT
id,
number,
status,
relevantId = CASE WHEN
number = 1 OR
((number - LAG(number, 1) OVER (ORDER BY id)) > 29)
THEN id
ELSE NULL
END,
deduct = CASE WHEN
((number - LAG(number, 1) OVER (ORDER BY id)) > 29)
THEN number - 1
ELSE 0
END
FROM @data data
)
,AuxTable2 AS
(
SELECT
id,
number,
status,
AT.deduct,
MAX(AT.relevantId) OVER (ORDER BY AT.id ROWS UNBOUNDED PRECEDING ) AS lastRelevantId
FROM AuxTable AT
)
SELECT
id,
number,
status,
number - MAX(deduct) OVER(PARTITION BY lastRelevantId ORDER BY id ROWS UNBOUNDED PRECEDING ) AS ReNumber
FROM AuxTable2
I think this runs faster, but it's not shorter.

How to select extra columns while using group by clause?

I have a table which contains data in this format.
productid filterName boolfilter numericfilter
1 X 1 NULL
1 Y NULL 99inch
1 Z 0 NULL
2 Y NULL 55kg
2 Y NULL 45kg
3 K NULL 20
3 M NULL 35
3 N NULL 25
4 X 1 NULL
4 K 1 NULL
I need data in this format.
I need the products that have only numeric filters set up and no boolean filters:
productid filterName numericfilter
2 Y 55kg
2 Y 45kg
3 K 20
3 M 35
3 N 25
I have written this query:
SELECT productid
FROM tbl_filters
GROUP BY productid
HAVING SUM(CAST(boolfilter AS INT)) IS NULL
I am getting productid 2 and 3, but I also need the extra columns I mentioned.
When I use multiple columns in the GROUP BY clause, I do not get the required output.
SELECT t.productid, t.filterName, t.numericfilter
FROM Table_Name t
WHERE t.numericfilter IS NOT NULL
AND NOT EXISTS (SELECT 1
FROM TABLE_NAME
WHERE t.productid = productid
AND boolfilter IS NOT NULL)
Working SQL FIDDLE
| PRODUCTID | FILTERNAME | NUMERICFILTER |
|-----------|------------|---------------|
| 2 | Y | 55kg |
| 2 | Y | 45kg |
| 3 | K | 20 |
| 3 | M | 35 |
| 3 | N | 25 |
Use window functions instead:
SELECT productid, filterName, numericfilter
FROM (SELECT f.*,
MAX(boolfilter) OVER (PARTITION BY productid) as maxbf
FROM tbl_filters f
) f
WHERE maxbf is null;
Fiddle DEMO.
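This calculates the maximum of boolfilter for each productid. If every value is NULL, then the result is NULL. Note that you don't need a cast() for this as long as boolfilter is a numeric column; if it is a SQL Server bit column, MAX() rejects it and the cast to int is still needed.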
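The window-function idea can be modelled in a few lines of Python (not SQL) to show what the partitioned MAX does. The names `numeric_only` and `filters` are assumptions for this sketch, NULL is represented as None, and the rows are the sample data from the question.

```python
def numeric_only(rows):
    """rows: (productid, filterName, boolfilter, numericfilter)."""
    # Equivalent of MAX(boolfilter) OVER (PARTITION BY productid):
    # NULLs are ignored, and the result is NULL only if every value is NULL.
    max_bool = {}
    for product_id, _, bool_filter, _ in rows:
        current = max_bool.get(product_id)
        if bool_filter is not None:
            current = bool_filter if current is None else max(current, bool_filter)
        max_bool[product_id] = current
    # Keep every row of the products whose maximum is still NULL.
    return [(p, name, num) for p, name, _, num in rows if max_bool[p] is None]


filters = [
    (1, "X", 1, None), (1, "Y", None, "99inch"), (1, "Z", 0, None),
    (2, "Y", None, "55kg"), (2, "Y", None, "45kg"),
    (3, "K", None, "20"), (3, "M", None, "35"), (3, "N", None, "25"),
    (4, "X", 1, None), (4, "K", 1, None),
]
kept = numeric_only(filters)
```

Because the per-product value is attached to every row instead of collapsing the rows, the extra columns stay available, which is exactly what GROUP BY alone could not give.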