Does sequence contain 5 numbers that are each one apart solved recursively - sql

This is the data:
create table #t
(ID int)
insert into #t
values
(-2)
,(-1)
-- ,(0)
,(1)
,(2)
,(3)
,(4)
,(7)
,(5)
,(21)
,(22)
,(23)
,(24)
,(25)
,(8);
We want to know if there are 5 numbers within the above sequence that are each 1 apart e.g. 21-22-23-24-25 gives a positive result. So is there an island of 5 anywhere in the list?
Non-recursively I've got a few possibilities, but is there a simple recursive solution?
Or is there a simpler non-recursive solution?
--::::::::::::::
--:: 1. LONG-WINDED
with t as
(
select id,
U = (id+5),
L = (id-5)
from #t
)
, up as
(
select x.id,
cnt = count(*)
from t x
join t y on
(y.id > x.L and y.id <= x.id)
group by x.id
)
, down as --<<MAYBE DOWN IS NOT NEEDED
(
select x.id,
cnt = count(*)
from t x
join t y on
(y.id < x.U and y.id >= x.id)
group by x.id
)
select id from up where cnt >= 5
union all
select id from down where cnt >= 5
The following two are better:
--::::::::::::::
--::
--:: 2. PRETTY!!
SELECT *
FROM #t A
WHERE EXISTS
(
SELECT *
FROM #t B
WHERE (
(A.id + 5) > B.id
AND
A.id <= B.id
)
HAVING COUNT(*) >=5
)
--::::::::::::::
--::
--:: 3. PRETTY PRETTY!!
--::
SELECT ID
FROM #t A
CROSS APPLY
(
SELECT cnt = COUNT(*)
FROM #t B
WHERE (A.id + 5) > B.id AND A.id <= B.id
) C
WHERE C.cnt>=5
The following is based on an article by Itzik Ben-Gan:
--::::::::::::::
--::
--:: 4. One of the Windowed functions
--::
WITH x AS
(
SELECT ID,
y = LAG(ID,4) OVER(ORDER BY ID),
dif = ID - LAG(ID,4) OVER(ORDER BY ID)
FROM #t A
)
SELECT ID,y
FROM x
WHERE dif = 4

Yes, there is a much simpler solution. Take the difference between the numbers and an increasing sequence of numbers. If the numbers are consecutive, the difference is constant. So, you can do:
select grp, count(*) as num_in_sequence, min(id) as first_id, max(id) as last_id
from (select t.*,
(id - row_number() over (order by id)) as grp
from #t t
) t
group by grp
having count(*) >= 5;
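To see why the difference is constant inside an island, here is a minimal Python sketch of the same arithmetic (not part of the original answer). The function name `islands` and the choice to drop duplicates before ranking are my assumptions; the SQL above does not deduplicate.

```python
from itertools import groupby


def islands(ids, min_len=5):
    """Return (first, last, count) for every run of consecutive integers
    that is at least min_len long."""
    ordered = sorted(set(ids))  # assumption: duplicates are ignored
    # value - rank stays constant while the values increase by exactly 1
    keyed = [(value - rank, value) for rank, value in enumerate(ordered, start=1)]
    result = []
    for _, run in groupby(keyed, key=lambda pair: pair[0]):
        values = [value for _, value in run]
        if len(values) >= min_len:
            result.append((values[0], values[-1], len(values)))
    return result


ids = [-2, -1, 1, 2, 3, 4, 7, 5, 21, 22, 23, 24, 25, 8]
print(islands(ids))  # [(1, 5, 5), (21, 25, 5)]
```

With the question's data the difference takes the values -3, -2, -1 and 11, and only the groups for -2 (1 to 5) and 11 (21 to 25) have five members.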
EDIT:
I think this is the simplest of all. One window function and a comparison:
select t.*
from (select t.*, lead(id, 4) over (order by id) as id4
from #t t
) t
where id4 - id = 4;
This does make the assumption that there are no duplicates in the ids, which is true of the OP data.
As I look further, this is the last solution in the OP. Kudos!
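The look-ahead comparison can be sketched in a few lines of Python as well. This is only an illustration of the idea behind LEAD(id, 4); the function name `starts_of_five` is hypothetical, and like the SQL it assumes the ids contain no duplicates.

```python
def starts_of_five(ids):
    """Return every id that begins a run of five consecutive integers."""
    ordered = sorted(ids)  # assumption: no duplicate ids
    # Pair each value with the value four positions later, like LEAD(id, 4).
    return [a for a, b in zip(ordered, ordered[4:]) if b - a == 4]


ids = [-2, -1, 1, 2, 3, 4, 7, 5, 21, 22, 23, 24, 25, 8]
print(starts_of_five(ids))  # [1, 21]
```

If the value four rows ahead is exactly 4 larger, the three rows in between have no room to be anything other than the intermediate integers, which is why a single comparison is enough.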

Related

SQL Self Recursive Join with Grouping

In SQL Server, I have the following table:
Name New_Name
---------------
A B
B C
C D
G H
H I
Z B
I want to create a new table that links all the names that are related to a single new groupID
GroupID Name
-------------
1 A
1 B
1 C
1 D
1 Z
2 G
2 H
2 I
I'm a bit stuck on this and can't figure out how to do it apart from a bunch of joins. But I would like to do it properly.
Edited the question to allow grouping from two different start points, A and Z into one group.
Since you've changed the question, I'm updating the answer. The logical structure is the same; the only difference is that instead of walking from G to I when calculating levels, the query now walks from I to G.
;with cte as
(
select
t1.Name as Name, row_number() over (order by t1.Name) r,
t1.New_Name as New_Name,
1 as level
from yt t1 left join yt t2
on t1.New_Name=t2.name
where t2.name is null
union all
select
yt.Name as Name, r,
yt.New_Name as New_Name,
c.level+1 as level
from cte c join yt
on yt.New_Name=c.Name
),
cte2 as
(
select r as group_id, Name from cte
union
select c1.r as group_id, c1.New_name as Name from cte c1
where level = (select min(level) from cte c2 where c2.r=c1.r)
)
select * from cte2;
Below is old answer.
You can try below CTE based query:
create table yt (Name varchar(10), New_Name varchar(10));
insert into yt values
('A','B'),
('B','C'),
('C','D'),
('G','H'),
('H','I');
;with cte as
(
select
t1.Name as Name, row_number() over (order by t1.Name) r,
t1.New_Name as New_Name,
1 as level
from yt t1 left join yt t2
on t1.Name=t2.New_name
where t2.new_name is null
union all
select
yt.Name as Name, r,
yt.New_Name as New_Name,
c.level+1 as level
from cte c join yt
on yt.Name=c.New_Name
),
cte2 as
(
select r as group_id, Name from cte
union
select c1.r as group_id, c1.New_name as Name from cte c1
where level = (select max(level) from cte c2 where c2.r=c1.r)
)
select * from cte2;
A little complex, but it works.
DECLARE @T TABLE (Name VARCHAR(2), New_Name VARCHAR(2))
INSERT INTO @T
VALUES
('A','B'),
('B','C'),
('C','D'),
('G','H'),
('H','I'),
('Z','B')
;WITH CTE AS
(
SELECT * , RN = ROW_NUMBER() OVER(ORDER BY Name) FROM @T
)
)
,CTE2 AS (SELECT T1.RN, T1.Name Name1, T1.New_Name New_Name1,
X.Name Name2, X.New_Name New_Name2,
FLAG = CASE WHEN X.Name IS NULL THEN 1 ELSE 0 END
FROM CTE T1
OUTER APPLY (SELECT * FROM CTE T2 WHERE T2.RN > T1.RN
AND (T2.Name IN (T1.Name , T1.New_Name)
OR T2.New_Name IN (T1.Name , T1.New_Name)
)) AS X
)
,CTE3 AS (SELECT *,
GroupID = ROW_NUMBER() OVER (ORDER BY RN) -
ROW_NUMBER() OVER (PARTITION BY Flag ORDER BY RN) +1
FROM CTE2
)
SELECT
DISTINCT GroupID, Name
FROM
(SELECT * FROM CTE3 WHERE Name2 IS NOT NULL) SRC
UNPIVOT ( Name FOR COL IN ([Name1], [New_Name1], [Name2], [New_Name2])) UNPVT
Result
GroupID Name
-------------------- ----
1 A
1 B
1 C
1 D
1 Z
2 G
2 H
2 I
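For cross-checking the expected output, the grouping can also be treated as a connected-components problem. The sketch below uses a small union-find in Python; this is a different technique from the CTE queries above, and the function name `group_names` and the rule of numbering groups by their alphabetically first member are my assumptions.

```python
def group_names(pairs):
    """pairs: iterable of (Name, New_Name). Returns [(group_id, name)]."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Linking Name to New_Name merges the two components they belong to.
    for name, new_name in pairs:
        root_a, root_b = find(name), find(new_name)
        if root_a != root_b:
            parent[root_b] = root_a

    members = {}
    for name in list(parent):
        members.setdefault(find(name), []).append(name)

    # Assumption: groups are numbered by their alphabetically first member.
    ordered_groups = sorted(sorted(group) for group in members.values())
    return [(gid, name)
            for gid, group in enumerate(ordered_groups, start=1)
            for name in group]


pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("G", "H"), ("H", "I"), ("Z", "B")]
for row in group_names(pairs):
    print(row)
```

Because the links are treated as undirected, the two start points A and Z end up in the same group without any special handling.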

2 rows differences

I would like to get 2 consecutive rows from an SQL table.
One of the columns stores a UNIX timestamp, and this value is the only difference between the 2 rows.
For example:
id_int dt_int
1. row 8211721 509794233
2. row 8211722 509794233
I need only those rows where dt_int is the same (edited).
Do you want both lines to be shown?
A solution could be this:
with foo as
(
select
*
from (values (8211721),(8211722),(8211728),(8211740),(8211741)) a(id_int)
)
select
id_int
from
(
select
id_int
,id_int-isnull(lag(id_int,1) over (order by id_int) ,id_int-6) prev
,isnull(lead(id_int,1) over (order by id_int) ,id_int+6)-id_int nxt
from foo
) a
where prev<=5 or nxt<=5
We use lead and lag to find the differences between neighbouring rows, and keep the rows whose difference to the row before or after is less than or equal to 5.
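As a rough illustration of that neighbour comparison, here is the same filter in Python. The function name `near_neighbours` and the `max_gap` parameter are hypothetical; a missing neighbour is treated as "too far away", which mirrors the isnull(..., 6) defaults in the query.

```python
def near_neighbours(ids, max_gap=5):
    """Keep the ids whose previous or next id (in sorted order) is within max_gap."""
    ordered = sorted(ids)
    keep = []
    for i, value in enumerate(ordered):
        # None stands for "no such neighbour", like the isnull() fallback
        prev_gap = value - ordered[i - 1] if i > 0 else None
        next_gap = ordered[i + 1] - value if i + 1 < len(ordered) else None
        close_prev = prev_gap is not None and prev_gap <= max_gap
        close_next = next_gap is not None and next_gap <= max_gap
        if close_prev or close_next:
            keep.append(value)
    return keep


ids = [8211721, 8211722, 8211728, 8211740, 8211741]
print(near_neighbours(ids))  # [8211721, 8211722, 8211740, 8211741]
```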
If you use SQL Server 2008 R2, lag and lead are not available. You could use row_number() instead:
with foo as
(
select
*
from (values (8211721),(8211722),(8211728),(8211740),(8211741)) a(id_int)
)
, rownums as
(
select
id_int
,row_number() over (order by id_int) rn
from foo
)
select
id_int
from
(
select
cur.id_int
,cur.id_int-prev.id_int prev
,nxt.id_int-cur.id_int nxt
from rownums cur
left join rownums prev
on cur.rn-1=prev.rn
left join rownums nxt
on cur.rn+1=nxt.rn
) a
where isnull(prev,6)<=5 or isnull(nxt,6)<=5
Assuming:
lead() analytical function available.
ID_INT is what we need to sort by to determine table order...
you may need to partition by some value lead(ID_int) over(partition by SomeKeysuchasOrderNumber order by ID_int asc) so that orders and dates don't get mixed together.
WITH CTE AS (
SELECT A.*
, lead(ID_int) over ([missing partition info] ORDER BY id_Int asc) - id_int as ID_INT_DIFF
FROM Table A)
SELECT *
FROM CTE
WHERE ID_INT_DIFF < 5;
You can try this. This version works on SQL Server 2005 and above (it relies on CTEs and ROW_NUMBER()). I don't have a more recent SQL Server to test on today.
declare @t table (id_int int, dt_int int)
INSERT @t SELECT 8211721 , 509794233
INSERT @t SELECT 8211722 , 509794233
INSERT @t SELECT 8211723 , 509794235
INSERT @t SELECT 8211724 , 509794236
INSERT @t SELECT 8211729 , 509794237
INSERT @t SELECT 8211731 , 509794238
;with cte_t as
(SELECT
ROW_NUMBER() OVER (ORDER BY id_int) id
,id_int
,dt_int
FROM @t),
cte_diff as
( SELECT
id_int
,dt_int
-- ORDER BY b.id makes TOP 1 pick the immediately following row
,(SELECT TOP 1 dt_int FROM cte_t b WHERE a.id < b.id ORDER BY b.id) dt_int1
,dt_int - (SELECT TOP 1 dt_int FROM cte_t b WHERE a.id < b.id ORDER BY b.id) Difference
FROM cte_t a
)
SELECT DISTINCT id_int , dt_int FROM @t a
WHERE
EXISTS(SELECT 1 FROM cte_diff b where b.Difference =0 and a.dt_int = b.dt_int)

How to get the middle most record(s) from a group of data in sql

create table #middle
(
A INT,
B INT,
C INT
)
INSERT INTO #middle (A,B,C) VALUES (7,6,2),(1,0,8),(9,12,16),(7, 16, 2),(1,12,8), (9,12,16),(9,12,16),(7, 16, 2),(1,12,8), (9,12,16)
;WITH MIDS
AS (SELECT *,
Row_number()
OVER (
ORDER BY a, b, c DESC )AS rn
FROM #middle)
SELECT *
FROM MIDS
WHERE rn <= (SELECT CASE ( Count(*)%2 )
WHEN 0 THEN ( Count(*) / 2 ) + 1
ELSE ( Count(*) / 2 )
END
FROM MIDS) except (SELECT *
FROM MIDS
WHERE rn < (SELECT ( Count(*) / 2 )
FROM MIDS))
The query I have tried works for more than 4 records but not for 3. How should I modify it so that for 3 records I get the 2nd record, which is the middle one? Try inserting only 3 of the records above to reproduce the problem. Thanks in advance.
You can use OFFSET and FETCH
select *
from #middle
order by a, b, c desc
offset (select count(*) / 2 - (case when count(*) % 2 = 0 then 1 else 0 end) from #middle) rows
fetch next (select 2 - (count(*) % 2) from #middle) rows only
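The offset and fetch expressions are easier to check with a few concrete counts. This Python sketch reproduces only the arithmetic; the function name `middle_rows` is hypothetical, and a plain ascending sort stands in for the ORDER BY a, b, c DESC of the query.

```python
def middle_rows(rows):
    """Return the middle row (odd count) or the two middle rows (even count)."""
    ordered = sorted(rows)  # stand-in for ORDER BY a, b, c DESC
    n = len(ordered)
    if n == 0:
        return []
    # OFFSET: count / 2, minus 1 when the count is even
    offset = n // 2 - (1 if n % 2 == 0 else 0)
    # FETCH NEXT: 1 row for an odd count, 2 rows for an even count
    fetch = 2 - n % 2
    return ordered[offset:offset + fetch]


print(middle_rows([10, 20, 30]))      # [20]
print(middle_rows([10, 20, 30, 40]))  # [20, 30]
```

For 3 rows this skips 1 row and fetches 1; for 4 rows it skips 1 and fetches 2.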
There are many ways to get the median in SQL. Here is a simple way:
select h.*
from (select h.*, row_number() over (order by a, b, c desc) as seqnum,
count(*) over () as cnt
from #middle h
) h
where 2 * seqnum in (cnt, cnt + 1, cnt + 2);
For an even number of records, you will get two rows. You need to decide what you actually want in this case.
How about this:
**EDITED
;WITH MIDS
AS (SELECT *,
Row_number()
OVER (
ORDER BY a, b, c DESC )AS rn
FROM #middle),
Cnt
AS
(SELECT COUNT(*) c, COUNT(*)%2 as rem, COUNT(*)/2 as mid FROM Mids)
SELECT *
FROM MIDS
CROSS APPLY cnt
where (rn >= cnt.mid and rn <= cnt.mid + 1 AND cnt.rem = 0) OR
(cnt.rem <> 0 AND rn = cnt.mid+1)

concatenate recursive cross join

I need to concatenate the name in a recursive cross join way. I don't know how to do this, I have tried a CTE using WITH RECURSIVE but no success.
I have a table like this:
group_id | name
---------------
13 | A
13 | B
19 | C
19 | D
31 | E
31 | F
31 | G
Desired output:
combinations
------------
ACE
ACF
ACG
ADE
ADF
ADG
BCE
BCF
BCG
BDE
BDF
BDG
Of course, the results should multiply if I add a 4th (or more) group.
Native PostgreSQL syntax:
WITH RECURSIVE cte1 AS
(
SELECT *, DENSE_RANK() OVER (ORDER BY group_id) AS rn
FROM mytable
),cte2 AS
(
SELECT
CAST(name AS VARCHAR(4000)) AS name,
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
CAST(CONCAT(c2.name,c1.name) AS VARCHAR(4000)) AS name
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT name as combinations
FROM cte2
WHERE LENGTH(name) = (SELECT MAX(rn) FROM cte1)
ORDER BY name;
Before:
I hope you don't mind that I use SQL Server syntax:
Sample:
CREATE TABLE #mytable(
ID INTEGER NOT NULL
,TYPE VARCHAR(MAX) NOT NULL
);
INSERT INTO #mytable(ID,TYPE) VALUES (13,'A');
INSERT INTO #mytable(ID,TYPE) VALUES (13,'B');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'C');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'D');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'E');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'F');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'G');
Main query:
WITH cte1 AS
(
SELECT *, rn = DENSE_RANK() OVER (ORDER BY ID)
FROM #mytable
),cte2 AS
(
SELECT
TYPE = CAST(TYPE AS VARCHAR(MAX)),
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
[Type] = CAST(CONCAT(c2.TYPE,c1.TYPE) AS VARCHAR(MAX))
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT *
FROM cte2
WHERE LEN(Type) = (SELECT MAX(rn) FROM cte1)
ORDER BY Type;
I've assumed that the order of "cross join" is dependent on ascending ID.
cte1 generates DENSE_RANK() because your IDs contain gaps
cte2 recursive part with CONCAT
main query just filter out required length and sort string
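The three steps above can be mimicked in Python to make the recursion concrete. This is only a sketch: the function name `combinations` and the (group_id, name) tuple layout are my assumptions, and the loop plays the role of the recursive member by extending every prefix with each name of the next group.

```python
def combinations(rows):
    """rows: iterable of (group_id, name). Returns the concatenated combinations."""
    groups = {}
    for group_id, name in rows:
        groups.setdefault(group_id, []).append(name)

    prefixes = [""]
    # Visit groups in ascending id order, the ordering DENSE_RANK() imposes.
    for group_id in sorted(groups):
        # One recursion step: every prefix so far times every name in this group.
        prefixes = [prefix + name for prefix in prefixes for name in groups[group_id]]
    return sorted(prefixes)


rows = [(13, "A"), (13, "B"), (19, "C"), (19, "D"), (31, "E"), (31, "F"), (31, "G")]
print(combinations(rows))
```

With the sample data this yields the 2 x 2 x 3 = 12 strings from ACE to BDG; adding a fourth group multiplies the count again.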
The recursive query is a bit simpler in Postgres:
WITH RECURSIVE t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name
FROM tbl
)
, cte AS (
SELECT grp, name
FROM t
WHERE grp = 1
UNION ALL
SELECT t.grp, c.name || t.name
FROM cte c
JOIN t ON t.grp = c.grp + 1
)
SELECT name AS combi
FROM cte
WHERE grp = (SELECT max(grp) FROM t)
ORDER BY 1;
The basic logic is the same as in the SQL Server version provided by @lad2025; I added a couple of minor improvements.
Or you can use a simple version if your maximum number of groups is not too big (can't be very big, really, since the result set grows exponentially). For a maximum of 5 groups:
WITH t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name AS n
FROM tbl
)
SELECT concat(t1.n, t2.n, t3.n, t4.n, t5.n) AS combi
FROM (SELECT n FROM t WHERE grp = 1) t1
LEFT JOIN (SELECT n FROM t WHERE grp = 2) t2 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 3) t3 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 4) t4 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 5) t5 ON true
ORDER BY 1;
Probably faster for few groups. LEFT JOIN .. ON true makes this work even if higher levels are missing. concat() ignores NULL values. Test with EXPLAIN ANALYZE to be sure.
SQL Fiddle showing both.

Pull out first index record in each REPEATING group ordered by index

I have this table with this data
CREATE TABLE #tbl
(
IDX INTEGER,
VAL VARCHAR(50)
)
--Inserted values for testing
INSERT INTO #tbl(IDX, VAL) VALUES(1,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(2,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(3,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(4,'B')
INSERT INTO #tbl(IDX, VAL) VALUES(5,'B')
INSERT INTO #tbl(IDX, VAL) VALUES(6,'B')
INSERT INTO #tbl(IDX, VAL) VALUES(7,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(8,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(9,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(10,'C')
INSERT INTO #tbl(IDX, VAL) VALUES(11,'C')
INSERT INTO #tbl(IDX, VAL) VALUES(12,'A')
INSERT INTO #tbl(IDX, VAL) VALUES(13,'A')
--INSERT INTO #tbl(IDX, VAL) VALUES(14,'A') -- this line has bad binary code
INSERT INTO #tbl(IDX, VAL) VALUES(14,'A') -- replace with this line and it works
INSERT INTO #tbl(IDX, VAL) VALUES(15,'D')
INSERT INTO #tbl(IDX, VAL) VALUES(16,'D')
Select * From #tbl -- to see what you have inserted...
The output I'm looking for is the FIRST and LAST Idx of each group of consecutive equal Val's, after ordering by Idx. Note that Val's may repeat! Also, Idx may not be stored in ascending order in the table as it is in the insert statements. No cursors please!
i.e
Val First Last
=================
A 1 3
B 4 6
A 7 9
C 10 11
A 12 14
D 15 16
If the idx values are guaranteed to be sequential, then try this:
Select f.val, f.idx first, l.idx last
From #tbl f
join #tbl l
on l.val = f.val
and l.idx > f.idx
and not exists
(Select * from #tbl
Where val = f.val
and idx = l.idx + 1)
and not exists
(Select * from #tbl
Where val = f.val
and idx = f.idx - 1)
and not exists
(Select * from #tbl
Where val <> f.val
and idx Between f.idx and l.idx)
order by f.idx
if the idx values are not sequential, then it needs to be a bit more complex...
Select f.val, f.idx first, l.idx last
From #tbl f
join #tbl l
on l.val = f.val
and l.idx > f.idx
and not exists
(Select * from #tbl
Where val = f.val
and idx = (select Min(idx)
from #tbl
where idx > l.idx))
and not exists
(Select * from #tbl
Where val = f.val
and idx = (select Max(idx)
from #tbl
where idx < f.idx))
and not exists
(Select * from #tbl
Where val <> f.val
and idx Between f.idx and l.idx)
order by f.idx
SQL Server 2012
In SQL Server 2012, you can use a sequence of CTEs with the lag/lead analytic functions, as below (fiddle here). The code does not assume any type or sequence about idx, and queries the first and last occurrence of val within each window.
;with cte as
(
select val, idx,
ROW_NUMBER() over(order by (select 0)) as urn --row_number without ordering
from #tbl),
cte1 as
(
select urn, val, idx,
lag(val, 1) over(order by urn) as prevval,
lead(val, 1) over(order by urn) as nextval
from cte
),
cte2 as
(
select val, idx, ROW_NUMBER() over(order by (select 0)) as orn,
(ROW_NUMBER() over(order by (select 0))+1)/2 as prn from cte1
where (prevval <> nextval or prevval is null or nextval is null)
),
cte3 as
(
select val, FIRST_VALUE(idx) over(partition by prn order by prn) as firstidx,
LAST_VALUE(idx) over(partition by prn order by prn) as lastidx, orn
from cte2
),
cte4 as
(
select val, firstidx, lastidx, min(orn) as rn
from cte3
group by val, firstidx, lastidx
)
select val, firstidx, lastidx
from cte4
order by rn;
SQL Server 2008
In SQL Server 2008, the code is a bit more tortured due to the lack of lag/lead analytic functions (fiddle here). Here also, the code does not assume any type or sequence about idx, and queries the first and last occurrence of val within each window.
;with cte as
(
select val, idx, ROW_NUMBER() over(order by (select 0)) as urn
from #tbl),
cte1 as
(
select m.urn, m.val, m.idx,
_lag.val as prevval, _lead.val as nextval
from cte as m
left join cte as _lag
on _lag.urn = m.urn-1
left join cte AS _lead
on _lead.urn = m.urn+1),
cte2 as
(
select val, idx, ROW_NUMBER() over(order by (select 0)) as orn,
(ROW_NUMBER() over(order by (select 0))+1)/2 as prn from cte1
where (prevval <> nextval or prevval is null or nextval is null)),
cte3 as
( select *, ROW_NUMBER() over(partition by prn order by orn) as rownum
from cte2),
cte4 as
(select o.val, (select i.idx from cte3 as i where i.rownum = 1 and i.prn = o.prn)
as firstidx,
(select i.idx from cte3 as i where i.rownum = 2 and i.prn = o.prn) as lastidx,
o.orn from cte3 as o),
cte5 as (
select val, firstidx, lastidx, min(orn) as rn
from cte4
group by val, firstidx, lastidx
)
select val, firstidx, lastidx
from cte5
order by rn;
Note:
Both solutions assume that the database engine preserves the order of insertion, although a relational database does not guarantee any order in theory.
A way to do it - at least for SQL Server 2008 without using special functionality would be to introduce a helper table and helper variable.
Whether that's actually possible for you as is (given your other requirements) I don't know, but it might put you on a path to a solution, and it does appear to meet your current requirements of no cursor and no lead/lag:
So basically what I do is make a helper table and a helper grouping variable:
(sorry about the naming)
CREATE TABLE #grp
(
idx INTEGER ,
val VARCHAR(50) ,
gidx INT
)
DECLARE @gidx INT = 1
INSERT INTO #grp
( idx, val, gidx )
SELECT idx ,
val ,
0
FROM #tbl AS t
I populate this with the values from your source table #tbl.
Then I do an update trick to assign a value to gidx based on when VAL changes value:
UPDATE g
SET @gidx = gidx = CASE WHEN val <> ISNULL(( SELECT val
FROM #grp AS g2
WHERE g2.idx = g.idx - 1
), val) THEN @gidx + 1
ELSE @gidx
END
FROM #grp AS g
What this does is assign a value of 1 to gidx until VAL changes; then it assigns gidx + 1, which is also stored in the @gidx variable. And so on.
This gives you the following usable result:
idx val gidx
1 A 1
2 A 1
3 A 1
4 B 2
5 B 2
6 B 2
7 A 3
8 A 3
9 A 3
10 C 4
11 C 4
12 A 5
13 A 5
14 A 5
15 D 6
16 D 6
Notice that gidx now is a grouping factor.
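The effect of the update trick can be reproduced with a running counter in Python. This is a sketch of the idea only: the function name `assign_groups` is hypothetical, and it compares each row with the previous row in idx order, whereas the SQL looks up the row at idx - 1 and therefore assumes gapless idx values.

```python
def assign_groups(rows):
    """rows: iterable of (idx, val). Returns [(idx, val, gidx)] in idx order."""
    gidx = 1
    previous_val = None
    out = []
    for idx, val in sorted(rows):
        # Bump the counter whenever val differs from the previous row's val.
        if previous_val is not None and val != previous_val:
            gidx += 1
        out.append((idx, val, gidx))
        previous_val = val
    return out


vals = "AAABBBAAACCAAADD"
rows = [(i, v) for i, v in enumerate(vals, start=1)]
print([g for _, _, g in assign_groups(rows)])
# [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6]
```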
Then it's a simple matter of extracting the data with a sub select:
SELECT ( SELECT TOP 1
VAL
FROM #GRP g3
WHERE g2.gidx = g3.gidx
) AS Val ,
MIN(idx) AS First ,
MAX(idx) AS Last
FROM #grp AS g2
GROUP BY gidx
This yields the result:
A 1 3
B 4 6
A 7 9
C 10 11
A 12 14
D 15 16
I'm assuming that IDX values are unique. If they can also be assumed to start from 1 and have no gaps, as in your example, you could try the following SQL Server 2005+ solution:
WITH partitioned AS (
SELECT
IDX, Val,
grp = IDX - ROW_NUMBER() OVER (PARTITION BY Val ORDER BY IDX ASC)
FROM #tbl
)
SELECT
Val,
FirstIDX = MIN(IDX),
LastIDX = MAX(IDX)
FROM partitioned
GROUP BY
Val, grp
ORDER BY
FirstIDX
;
If IDX values may have gaps and/or may start from a value other than 1, you could use the following modification of the above instead:
WITH partitioned AS (
SELECT
IDX, Val,
grp = ROW_NUMBER() OVER ( ORDER BY IDX ASC)
- ROW_NUMBER() OVER (PARTITION BY Val ORDER BY IDX ASC)
FROM #tbl
)
SELECT
Val,
FirstIDX = MIN(IDX),
LastIDX = MAX(IDX)
FROM partitioned
GROUP BY
Val, grp
ORDER BY
FirstIDX
;
Note: If you end up using either of these queries, please make sure the statement preceding the query is delimited with a semicolon, particularly if you are using SQL Server 2008 or later version.
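To see why the difference of the two row numbers identifies each run, here is a Python sketch of the second variant (the one that tolerates gaps in IDX). The function name `value_runs` is hypothetical, and like the query it assumes IDX values are unique.

```python
from collections import defaultdict


def value_runs(rows):
    """rows: iterable of (idx, val). Returns [(val, first_idx, last_idx)]
    ordered by first_idx."""
    per_val_rank = defaultdict(int)
    bounds = {}
    for overall_rank, (idx, val) in enumerate(sorted(rows), start=1):
        per_val_rank[val] += 1
        # ROW_NUMBER() over all rows minus ROW_NUMBER() within the same val:
        # constant while val repeats, and it changes once another val intervenes.
        grp = overall_rank - per_val_rank[val]
        first, _ = bounds.get((val, grp), (idx, idx))
        bounds[(val, grp)] = (first, idx)
    runs = [(val, first, last) for (val, _), (first, last) in bounds.items()]
    return sorted(runs, key=lambda run: run[1])


rows = [(i, v) for i, v in enumerate("AAABBBAAACCAAADD", start=1)]
for run in value_runs(rows):
    print(run)
```

Note that grp alone is not unique (the first B run and the second A run both get 3 here), which is why the query groups by Val and grp together.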