Find all records within x units of each other - sql

I have a table like this:
CREATE TABLE t(idx integer primary key, value integer);
INSERT INTO t(idx, value)
VALUES
(1, 1),
(2, 2),
(3, 3),
(4, 6),
(5, 7),
(6, 12);
I would like to return all the groups of records where the values are within 2 of each other, with an associated group label as a new column by which to identify them.
I thought perhaps a recursive query might be suitable...but my sql-fu is lacking.

You can use a recursive CTE:
with recursive tt as (
select t.*, row_number() over (order by idx) as seqnum
from t
),
cte as (
select idx, value, value as grp,
seqnum, 1 as lev
from tt
where seqnum = 1
union all
select tt.idx, tt.value,
(case when tt.value > cte.grp + 2 then tt.value else cte.grp end) as grp,
tt.seqnum, 1 + lev
from cte join
tt
on tt.seqnum = cte.seqnum + 1
)
select *
from cte;
Here is a db<>fiddle. Note that the fiddle adds a row with the value 4 to show that the first four rows are then split into two groups.
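To check the grouping logic locally, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or later (for ROW_NUMBER) and, like the query above, that the values increase with idx. The variable names are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t(idx INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO t(idx, value)
    VALUES (1, 1), (2, 2), (3, 3), (4, 6), (5, 7), (6, 12);
""")

query = """
WITH RECURSIVE tt AS (
    -- number the rows so the recursion can step from one row to the next
    SELECT idx, value, ROW_NUMBER() OVER (ORDER BY idx) AS seqnum
    FROM t
),
cte AS (
    -- the first row starts the first group; grp holds the group's starting value
    SELECT idx, value, value AS grp, seqnum
    FROM tt
    WHERE seqnum = 1
    UNION ALL
    -- start a new group when the value is more than 2 above the group start
    SELECT tt.idx, tt.value,
           CASE WHEN tt.value > cte.grp + 2 THEN tt.value ELSE cte.grp END,
           tt.seqnum
    FROM cte
    JOIN tt ON tt.seqnum = cte.seqnum + 1
)
SELECT idx, value, grp FROM cte ORDER BY seqnum;
"""

groups = conn.execute(query).fetchall()
for row in groups:
    print(row)
```

With the sample data the group labels come out as 1, 1, 1, 6, 6, 12, i.e. each group is labelled by its smallest value.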

I assume you want to group the rows so that any two values within a group differ by at most 2. In that case you are right: a recursive query is the solution. Each level of recursion computes the bounds of the next group. The groups are disjoint, so the final step joins the original table to the computed groups and aggregates by group number. Db fiddle here.
with recursive r (minv,maxv,level) as (
select min(t.value), min(t.value) + 2, 1
from t
union all
select minv, maxv, level from (
-- order by t.value, not minv: inside the window clause minv would resolve to r.minv
select t.value as minv, t.value + 2 as maxv, r.level + 1 as level, row_number() over (order by t.value) rn
from r
join t on t.value > r.maxv
) x where x.rn = 1
)
select r.level
, format('ids from %s to %s', min(t.idx), max(t.idx)) as id_label
, format('values from %s to %s', min(t.value), max(t.value)) as value_label
from t join r on t.value between r.minv and r.maxv
group by r.level
order by r.level
(The inner query in the recursive part only limits the newly added rows to one per level. The simpler select min(t.value), min(t.value) + 2 is not possible because aggregate functions are not allowed in the recursive part; the analytic function is a workaround.)

Related

Query to Find maximum possible combinations between two columns

The goal is to create all possible combinations from the two columns. Every article in the first column (100, 101, 102, 103) must appear in each combination.
Sample Code
create table basis
(article Integer,
supplier VarChar(10) );
Insert into basis Values (100, 'A');
Insert into basis Values (101, 'A');
Insert into basis Values (101, 'B');
Insert into basis Values (101, 'C');
Insert into basis Values (102, 'D');
Insert into basis Values (103, 'B');
Result set
combination_nr;article;supplier
1;100;'A'
1;101;'A'
1;102;'D'
1;103;'B'
2;100;'A'
2;101;'B'
2;102;'D'
2;103;'B'
3;100;'A'
3;101;'C'
3;102;'D'
3;103;'B'
Suppose we add one more row for article 102 with supplier 'A'. The result set would then look like this,
and by the calculation given below it would have 24 rows:
1;100;'A'
1;101;'A'
1;102;'A'
1;103;'B'
2;100;'A'
2;101;'A'
2;102;'D'
2;103;'B'
3;100;'A'
3;101;'B'
3;102;'A'
3;103;'B'
4;100;'A'
4;101;'B'
4;102;'D'
4;103;'B'
5;100;'A'
5;101;'C'
5;102;'A'
5;103;'B'
6;100;'A'
6;101;'C'
6;102;'D'
6;103;'B'
Already tried code
I have tried different cross joins, but they always return more rows than my expected result set.
SELECT article, supplier
FROM (SELECT DISTINCT supplier FROM basis) AS t1
CROSS JOIN (SELECT DISTINCT article FROM basis) AS t2;
Calculations:
article 100: 1 supplier ('A')
article 101: 3 suppliers ('A','B','C')
article 102: 1 supplier ('D')
article 103: 1 supplier ('B')
unique articles: 4 (100,101,102,103)
1x3x1x1 x 4 = 12 (combination rows)
You can do what you want using a recursive CTE. It is easier to put the combinations in single rows rather than across multiple rows:
with b as (
select b.*, dense_rank() over (order by article) as seqnum
from basis b
),
cte as (
select convert(varchar(max), concat(article, ':', supplier)) as suppliers, seqnum
from b
where seqnum = 1
union all
select concat(cte.suppliers, ',', concat(article, ':', supplier)), b.seqnum
from cte join
b
on b.seqnum = cte.seqnum + 1
)
select row_number() over (order by suppliers), suppliers
from (select cte.*, max(seqnum) over () as max_seqnum
from cte
) cte
where seqnum = max_seqnum;
For your particular result set, you can unroll the string:
with b as (
select b.*, dense_rank() over (order by article) as seqnum
from basis b
),
cte as (
select convert(varchar(max), concat(article, ':', supplier)) as suppliers, seqnum
from b
where seqnum = 1
union all
select concat(cte.suppliers, ',', concat(article, ':', supplier)), b.seqnum
from cte join
b
on b.seqnum = cte.seqnum + 1
)
select seqnum,
left(s.value, charindex(':', s.value) - 1) as article,
stuff(s.value, 1, charindex(':', s.value), '') as supplier
from (select row_number() over (order by suppliers) as seqnum, suppliers
from (select cte.*, max(seqnum) over () as max_seqnum
from cte
) cte
where seqnum = max_seqnum
) cte cross apply
string_split(suppliers, ',') s;
Here is a db<>fiddle.
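As a cross-check of the expected row counts (1x3x1x1 combinations, times 4 articles), here is a small sketch outside the database. It is not the recursive CTE above but a plain Cartesian product over the suppliers of each article, using Python's itertools; the variable names are made up for the example.

```python
from itertools import product

# Same rows as the basis table in the question.
basis = [(100, "A"), (101, "A"), (101, "B"), (101, "C"), (102, "D"), (103, "B")]

# Collect the suppliers available for each article.
suppliers_by_article = {}
for article, supplier in basis:
    suppliers_by_article.setdefault(article, []).append(supplier)
articles = sorted(suppliers_by_article)

# One combination = one supplier choice per article.
combinations = list(product(*(suppliers_by_article[a] for a in articles)))

# Unroll into (combination_nr, article, supplier) rows, like the expected result set.
rows = [
    (nr, article, supplier)
    for nr, combo in enumerate(combinations, start=1)
    for article, supplier in zip(articles, combo)
]

print(len(combinations), len(rows))
```

For the sample data this gives 3 combinations and 12 rows, which matches the calculation in the question.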

Linked lists: query first and last element of chained lists stored in SQL table

I have an SQL table with "lines" representing elements of chained lists.
I could for example have the following records:
(id, previous_id)
------------------
(1, NULL)
(2, NULL)
(3, 2)
(4, 3)
(5, NULL)
(6, 4)
(7, 5)
We have 3 lists in this table:
(1,)
(2,3,4,6)
(5,7)
I would like to find the last element of each list and the number of elements in the list.
The query I am looking for would output:
last, len
1, 1
6, 4
7, 2
Is this possible in SQL?
You can use a recursive CTE:
with recursive cte as (
select l.previous_id as id, id as last
from lines l
where not exists (select 1 from lines l2 where l2.previous_id = l.id)
union all
select l.previous_id, cte.last
from cte join
lines l
on cte.id = l.id
)
select cte.last, count(*)
from cte
group by cte.last;
Here is a db<>fiddle.
WITH RECURSIVE cte AS (
SELECT id AS first, id AS last, 1 as len
FROM lines
WHERE previous_id IS NULL
UNION ALL
SELECT c.first, l.id, len + 1
FROM cte c
JOIN lines l ON l.previous_id = c.last
)
SELECT DISTINCT ON (first)
last, len -- , first -- also?
FROM cte
ORDER BY first, len DESC;
db<>fiddle here
Produces your result exactly.
If you also want the first element, as your title states, that's readily available.
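For engines without DISTINCT ON, the same forward walk can be finished with a NOT EXISTS filter that keeps only the row for each chain's tail. Below is a minimal sketch against SQLite via Python's sqlite3 module; the column names head_id and tail_id are chosen for the example and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lines(id INTEGER PRIMARY KEY, previous_id INTEGER);
    INSERT INTO lines(id, previous_id) VALUES
        (1, NULL), (2, NULL), (3, 2), (4, 3), (5, NULL), (6, 4), (7, 5);
""")

query = """
WITH RECURSIVE cte(head_id, tail_id, len) AS (
    -- every row without a predecessor starts a chain of length 1
    SELECT id, id, 1 FROM lines WHERE previous_id IS NULL
    UNION ALL
    -- extend each chain by the row that points at its current tail
    SELECT c.head_id, l.id, c.len + 1
    FROM cte AS c
    JOIN lines AS l ON l.previous_id = c.tail_id
)
SELECT c.head_id, c.tail_id, c.len
FROM cte AS c
-- keep only the rows whose tail has no successor, i.e. the complete chains
WHERE NOT EXISTS (SELECT 1 FROM lines AS l WHERE l.previous_id = c.tail_id)
ORDER BY c.head_id;
"""

chains = conn.execute(query).fetchall()
for row in chains:
    print(row)
```

This assumes each element has at most one successor, as in the sample data.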
Here is an implementation in Microsoft SQL Server 2016 db<>fiddle
WITH chain
AS (SELECT l.id AS [first],
l.id AS [last],
1 AS [len]
FROM lines AS l
WHERE l.previous_id IS NULL
UNION ALL
SELECT c.[first],
l.id,
c.[len] + 1 AS [len]
FROM chain AS c
JOIN lines AS l ON l.previous_id = c.[last]),
result
AS (SELECT DISTINCT
c.[first],
c.[last],
c.[len],
ROW_NUMBER() OVER(PARTITION BY c.[first] ORDER BY c.[len] DESC) AS rn
FROM chain as c)
SELECT r.[first],
r.[last],
r.[len]
FROM result AS r
WHERE r.rn = 1
ORDER BY r.[first];

Finding gap in column with SQL Server

I have a table with an int column that is not the primary key, and thousands of records.
I'd like to find the missing ids.
I have this data:
1
2
3
4
6
8
11
14
I'd like to get this as the result: 5,7,9,10,12,13
Do you know how I can do this?
Thanks,
It is easier to get this as ranges:
select (col + 1) as first_missing, (next_col - 1) as last_missing
from (select t.*, lead(col) over (order by col) as next_col
from t
) t
where next_col <> col + 1;
If you actually want this as a list, I would suggest a recursive CTE:
with cte as (
select t.col, lead(col) over (order by col) as next_col, 1 as lev
from t
union all
select cte.col + 1, next_col, lev + 1
from cte
where col + 1 < next_col
)
select cte.col
from cte
where lev > 1;
Note: If the gaps can be more than 100, you will need OPTION (MAXRECURSION 0).
Here is a db<>fiddle.
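Both variants can be tried without a SQL Server instance. The sketch below runs them on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed for LEAD). The list query is rearranged slightly: the ranges are computed first and then expanded recursively, which is the same idea but not the exact statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(col INTEGER)")
conn.executemany("INSERT INTO t(col) VALUES (?)",
                 [(v,) for v in (1, 2, 3, 4, 6, 8, 11, 14)])

# Gaps as ranges: compare each value with the next one in sorted order.
ranges_sql = """
SELECT col + 1 AS first_missing, next_col - 1 AS last_missing
FROM (SELECT col, LEAD(col) OVER (ORDER BY col) AS next_col FROM t) AS s
WHERE next_col <> col + 1
ORDER BY first_missing;
"""

# Gaps as a list: expand every range one value at a time.
list_sql = """
WITH RECURSIVE gaps(first_missing, last_missing) AS (
    SELECT col + 1, next_col - 1
    FROM (SELECT col, LEAD(col) OVER (ORDER BY col) AS next_col FROM t) AS s
    WHERE next_col <> col + 1
),
missing(n, last_missing) AS (
    SELECT first_missing, last_missing FROM gaps
    UNION ALL
    SELECT n + 1, last_missing FROM missing WHERE n < last_missing
)
SELECT n FROM missing ORDER BY n;
"""

ranges = conn.execute(ranges_sql).fetchall()
missing = [row[0] for row in conn.execute(list_sql)]
print(ranges)
print(missing)
```

Note that only gaps between existing values are found; nothing is reported before the smallest or after the largest value.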
Assuming mytab is your table, mycol is the relevant column, and the potential values are 1 to 10,000 (t yields 10 rows, and cross joining it four times yields 10,000 rows):
with t(i) as (select 1 union all select i+1 from t where i<10)
,all_values(mycol) as (select row_number() over (order by (select null)) from t t0,t t1,t t2, t t3)
select *
from all_values a left join mytab t on a.mycol = t.mycol
where t.mycol is null

Swap two adjacent rows of a column in sql

I'm trying to solve the following problem:
Write a sql query to swap two adjacent rows in a column of a table.
Input table
Name Id
A 1
B 2
C 3
D 4
E 5
Output table
Name Id
A 2
B 1
C 4
D 3
E 5
Description: 1 is associated with A and 2 with B; swap them so that 1 is associated with B and 2 with A. Do the same for C and D. Since E doesn't have a pair, leave it as it is.
Note: This could be solved with CASE expressions, but I am looking for a generalized solution. There are only 5 rows now, but there may be 10, 20, etc.
Eg:
SELECT
*, CASE WHEN Name = 'A' THEN 2 WHEN Name = 'B' THEN 1 etc...
FROM YourTable
You can use window functions to solve this.
On MySQL (>= 8.0):
SELECT ID, IFNULL(CASE WHEN t.rn % 2 = 0 THEN LAG(Name) OVER (ORDER BY ID) ELSE LEAD(Name) OVER (ORDER BY ID) END, Name) AS Name
FROM (
SELECT ID, Name, ROW_NUMBER() OVER (ORDER BY ID) AS rn
FROM table_name
) t
demo on dbfiddle.uk
On SQL Server:
SELECT ID, ISNULL(CASE WHEN t.rn % 2 = 0 THEN LAG(Name) OVER (ORDER BY ID) ELSE LEAD(Name) OVER (ORDER BY ID) END, Name) AS Name
FROM (
SELECT ID, Name, ROW_NUMBER() OVER (ORDER BY ID) AS rn
FROM table_name
) t
demo on dbfiddle.uk
If you are on SQL Server, you can try this:
DECLARE @YourTable TABLE (Name VARCHAR(10), Id INT)
INSERT INTO @YourTable VALUES
('A', 1),
('B', 2),
('C', 3),
('D', 4),
('E', 5)
;WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Name) AS RN FROM @YourTable
)
)
SELECT T1.Name, ISNULL(T2.Id, T1.Id) Id FROM CTE T1
LEFT JOIN CTE T2 ON T1.RN + CASE WHEN T1.RN%2 = 0 THEN - 1 ELSE 1 END = T2.RN
Result:
Name Id
---------- -----------
A 2
B 1
C 4
D 3
E 5
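The pairing trick (odd row numbers look one row ahead, even ones look one row back) is not specific to SQL Server. Here is a minimal sketch of the same self-join on SQLite through Python's sqlite3 module, assuming SQLite 3.25+ for ROW_NUMBER; the table name your_table is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table(name TEXT, id INTEGER);
    INSERT INTO your_table(name, id)
    VALUES ('A', 1), ('B', 2), ('C', 3), ('D', 4), ('E', 5);
""")

query = """
WITH cte AS (
    SELECT name, id, ROW_NUMBER() OVER (ORDER BY name) AS rn
    FROM your_table
)
SELECT t1.name, COALESCE(t2.id, t1.id) AS id
FROM cte AS t1
-- odd rows pair with the next row, even rows with the previous one;
-- an unpaired last row finds no partner and keeps its own id
LEFT JOIN cte AS t2
       ON t2.rn = t1.rn + CASE WHEN t1.rn % 2 = 0 THEN -1 ELSE 1 END
ORDER BY t1.name;
"""

swapped = conn.execute(query).fetchall()
for row in swapped:
    print(row)
```

Because the pairing is driven by row numbers rather than by the id values, it keeps working when the ids are not consecutive.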
You didn't specify your DBMS, but the following is standard ANSI SQL.
You can use a values() clause to provide the mapping of the IDs and then join against that:
with id_map (source_id, target_id) as (
values
(1, 2),
(2, 1)
)
select t.name, coalesce(m.target_id, t.id) as mapped_id
from the_table t
left join id_map m on m.source_id = t.id
order by name;
Alternatively if you only want to specify the mapping once for one direction, you can use this:
with id_map (source_id, target_id) as (
values
(1, 2)
)
select t.name,
case id
when m.source_id then m.target_id
when m.target_id then m.source_id
else id
end as mapped_id
from the_table t
left join id_map m on t.id in (m.source_id, m.target_id)
order by name;
Online example: https://rextester.com/FBFH52231

How can I transform my N little queries into one query?

I have a query that gives me the first available value for a given date and pair.
SELECT
TOP 1 value
FROM
my_table
WHERE
date >= 'myinputdate'
AND key = 'myinputkey'
ORDER BY date
I have N pairs of key and date, and I am trying to avoid querying each pair one by one. The table is rather big, and so is N, so this is currently heavy and slow.
How can I query all the pairs in one query?
A solution is to use APPLY like a "function" created on the fly with one or many columns from another set:
DECLARE @inputs TABLE (
myinputdate DATE,
myinputkey INT)
INSERT INTO @inputs(
myinputdate,
myinputkey)
VALUES
('2019-06-05', 1),
('2019-06-01', 2)
SELECT
I.myinputdate,
I.myinputkey,
R.value
FROM
@inputs AS I
CROSS APPLY (
SELECT TOP 1
T.value
FROM
my_table AS T
WHERE
T.date >= I.myinputdate AND
T.key = I.myinputkey
ORDER BY
T.date ) AS R
You can use OUTER APPLY if you also want input pairs without a match to be returned, with a NULL value. APPLY supports fetching multiple columns, and ORDER BY with TOP controls the number of rows returned per pair.
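APPLY is specific to SQL Server, but the "one lookup per input pair" idea can be tried elsewhere with a correlated subquery. The sketch below uses SQLite through Python's sqlite3 module and behaves like the OUTER APPLY variant (unmatched pairs yield NULL). The table inputs, the column names k, d, input_date, input_key and the sample rows are all invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical stand-in for my_table; ISO date strings sort chronologically
    CREATE TABLE my_table(k INTEGER, d TEXT, value INTEGER);
    INSERT INTO my_table(k, d, value) VALUES
        (1, '2019-06-01', 10), (1, '2019-06-07', 11),
        (2, '2019-06-03', 20), (2, '2019-06-09', 21);

    -- the N (date, key) pairs to look up
    CREATE TABLE inputs(input_date TEXT, input_key INTEGER);
    INSERT INTO inputs(input_date, input_key) VALUES
        ('2019-06-05', 1), ('2019-06-01', 2), ('2019-06-01', 3);
""")

query = """
SELECT i.input_date, i.input_key,
       -- first available value on or after the input date for this key
       (SELECT t.value
        FROM my_table AS t
        WHERE t.k = i.input_key AND t.d >= i.input_date
        ORDER BY t.d
        LIMIT 1) AS value
FROM inputs AS i
ORDER BY i.input_key;
"""

results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

On a large table, an index on (key, date) is what would typically keep each per-pair lookup cheap, though that depends on the engine and the data.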
This solution uses no variables. You control your N by setting the right value in the row_num predicate.
There are plenty of ways to do what you want, and the right one depends on your specific needs. As already answered, you can store the conditions in a temp table or table variable and join on them instead of repeating the predicates. You can also create a user-defined table type and pass it as a parameter to a function or procedure. Or you can use CROSS APPLY with a VALUES clause to build the list and join against it.
DROP TABLE IF EXISTS #temp;
CREATE TABLE #temp ( d DATE, k VARCHAR(100) );
GO
INSERT INTO #temp
VALUES ( '20180101', 'a' ),
( '20180102', 'b' ),
( '20180103', 'c' ),
( '20180104', 'd' ),
( '20190101', 'a' ),
( '20190102', 'b' ),
( '20180402', 'c' ),
( '20190103', 'c' ),
( '20190104', 'd' );
SELECT a.d ,
a.k
FROM ( SELECT d ,
k ,
ROW_NUMBER() OVER ( PARTITION BY k ORDER BY d ) row_num -- ascending: earliest date on or after the cutoff
FROM #temp
WHERE (d >= '20180401'
AND k = 'a')
OR (d > '20180401'
AND k = 'b')
OR (d > '20180401'
AND k = 'c')
) a
WHERE a.row_num <= 1;
-- VALUES way
SELECT a.d ,
a.k
FROM ( SELECT t.d ,
t.k ,
ROW_NUMBER() OVER ( PARTITION BY t.k ORDER BY t.d ) row_num -- ascending: earliest date on or after the cutoff
FROM #temp t
CROSS APPLY (VALUES('20180401','a'), ('20180401', 'b'), ('20180401', 'c')) f(d,k)
WHERE t.d >= f.d AND f.k = t.k
) a
WHERE a.row_num <= 1;
If all the keys are using the same date, then use window functions:
SELECT key, value
FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY key ORDER BY date) as seqnum
FROM my_table t
WHERE date >= @input_date AND
key IN ( . . . )
) t
WHERE seqnum = 1;
SELECT key, date,value
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY key ORDER BY date) as rownum,key,date,value
FROM my_table
WHERE
date >= 'myinputdate'
) as d
WHERE d.rownum = 1;