Group identifiers/values that are related with each other between multiple columns - sql

I want to group identifiers that are related with each other between multiple columns and create/assign a unique group id.
Also, If we receive a new row, we can assign the right id respecting what has been done before for others group id
For example:
Col1
Col2
Col3
Col4
AA
Null
33
12
BB
Null
45
12
AA
123
65
15
CC
123
NULL
42
DD
Null
10
42
EE
NULL
20
NULL
FF
145
33
NULL
GG
NULL
NULL
11
Desired result:
The group ID =1 beacuse in col1, it's the same value row 1 and 3 (AA) and for row 4 it's also ID 1 because in the second column, the value for AA it's 123 (the same for CC)
If there is any match between rows and cross the columns, we generate an id
Col1
Col 2
Col 3
Col 4
Group ID
AA
Null
33
12
1
BB
Null
45
12
1
AA
123
65
15
1
CC
123
NULL
42
1
DD
Null
10
42
1
EE
NULL
20
NULL
2
FF
145
33
NULL
1
GG
NULL
NULL
11
3

I've been doing some work on this and agree with Kashyap- I cannot find a way to do this is a single statement. You need either a recursive CTE or a loop. Synapse does not currently support recursive CTEs, which leaves using a loop to create the effect you want.
One concern that came up while I was working with this. As you continue to add data, you'll have more and more overlaps and could eventually end up with just one group. That depends on your dataset- you might have something you can guarantee will have discrete divisions. The way the script I put together works, a new match will update any group IDs, even in existing data. You could modify it to only set group IDs only for new rows, but then you could end up in a situation when one row matches multiple groups.
Certainly not the only option, but this is the script I pulled together. It is dependent on having a unique ID that will remain the same in each iteration. Because the loop uses updates instead of inserts, prepping the data would involve inserting the data into your new table without the group, and you can create your ID at that time using auto-increment or otherwise. The script works best with an INT ID column, but should work with a guid if that is necessary.
So process is essentially this:
Do whatever initial prep you need to do to inserting data into the table and creating an ID
Join the table back onto itself, once for each column that could contain a match
Update the Group ID to be the minimum value across the IDs and current group IDs of that set of matches.
Check to see if we need to do another round. Because we are using minimum ID as a group number, there will be a row where the ID = group ID in each group
CREATE TABLE #testtable
(
[id] INT NOT NULL,
[col1] INT NOT NULL,
[col2] INT NULL,
[col3] INT NULL,
[groupnumber] INT NULL
)
INSERT INTO #testtable
(id,
col1,
col2,
col3)
INSERT INTO #testTable (id, col1, col2, col3)
SELECT 1, 1, 5, 33 UNION ALL -- First
SELECT 2, 2, null, 45 UNION ALL -- Second
SELECT 3, 1, 123, 65 UNION ALL -- First
SELECT 4, 3, 123, null UNION ALL -- First
SELECT 5, 10, null, 10 UNION ALL -- Third
SELECT 6, 5, null, 45 UNION ALL -- Second
SELECT 7, 6, 145, 33 -- First
DECLARE #RemainingRows INT,
#LoopCounter INT, #MaxLoops int -- To protect against infinite loop
SET #RemainingRows = (SELECT COUNT([id]) FROM #testtable)
SET #LoopCounter = 0;
SET #MaxLoops = 10;
WHILE( #RemainingRows > 0
AND #LoopCounter < #MaxLoops )
BEGIN
WITH combineddata AS
(
SELECT
id,
col1,
col2,
col3,
groupnumber
FROM
#testtable
),
--Create a set a rows that contains all rows and all possible matches
matcheddata AS
(
SELECT
c1.id,
c1.col1 AS c1col1,
c1.col2 AS c1col2,
c1.col3 AS c1col3,
c1.groupnumber AS groupNumber1,
c2.id AS RowNum2,
c2.groupnumber AS groupNumber2,
c3.id AS RowNum3,
c3.groupnumber AS groupNumber3,
c4.id AS RowNum4,
c4.groupnumber AS groupNumber4
FROM
combineddata c1
LEFT JOIN
combineddata c2
ON c1.col1 = c2.col1
LEFT JOIN
combineddata c3
ON c1.col2 = c3.col2
LEFT JOIN
combineddata c4
ON c1.col3 = c4.col3
)
UPDATE #testtable
SET
groupnumber =
CASE
WHEN
NEW.groupnumber IS NULL
THEN
NULL
ELSE
NEW.groupnumber
END
FROM
(
SELECT
id,
c1col1,
c1col2,
c1col3,
MIN(groupnumber) AS GroupNumber
FROM
matcheddata CROSS apply (
SELECT
MIN(c) AS GroupNumber
FROM (VALUES
(id),
(RowNum2),
(RowNum3),
(RowNum4),
(groupNumber1),
(groupNumber2),
(groupNumber3),
(groupNumber4)
) AS v (C)
WHERE
c IS NOT NULL) g
GROUP BY
id,
c1col1,
c1col2,
c1col3
) NEW
INNER JOIN # testtable
ON NEW.id = #testtable.id
SET
#LoopCounter = #LoopCounter + 1
SET
#RemainingRows =
(
SELECT
COUNT(t1.id)
FROM
#testtable t1
LEFT JOIN
#testtable t2
ON t1.groupnumber = t2.[id]
WHERE
t2.id IS NULL
OR t2.id <> t2.groupnumber
)
PRINT 'Remaining Rows: ' + CAST(#RemainingRows AS VARCHAR) PRINT 'Counter: ' + CAST(#LoopCounter AS VARCHAR);
END
SELECT * FROM #testtable
IF Object_id('tempdb..#testTable') IS NOT NULL
BEGIN
DROP TABLE # testtable
END```

Related

T-SQL - Copying & Transposing Data

I'm trying to copy data from one table to another, while transposing it and combining it into appropriate rows, with different columns in the second table.
First time posting. Yes this may seem simple to everyone here. I have tried for a couple hours to solve this. I do not have much support internally and have learned a great deal on this forum and managed to get so much accomplished with your other help examples. I appreciate any help with this.
Table 1 has the data in this format.
Type Date Value
--------------------
First 2019 1
First 2020 2
Second 2019 3
Second 2020 4
Table 2 already has the Date rows populated and columns created. It is waiting for the Values from Table 1 to be placed in the appropriate column/row.
Date First Second
------------------
2019 1 3
2020 2 4
For an update, I might use two joins:
update t2
set first = tf.value,
second = ts.value
from table2 t2 left join
table1 tf
on t2.date = tf.date and tf.type = 'First' left join
table1 ts
on t2.date = ts.date and ts.type = 'Second'
where tf.date is not null or ts.date is not null;
use conditional aggregation
select date,max(case when type='First' then value end) as First,
max(case when type='Second' then value end) as Second from t
group by date
You can do conditional aggregation :
select date,
max(case when type = 'first' then value end) as first,
max(case when type = 'Second' then value end) as Second
from table t
group by date;
After that you can use cte :
with cte as (
select date,
max(case when type = 'first' then value end) as first,
max(case when type = 'Second' then value end) as Second
from table t
group by date
)
update t2
set t2.First = t1.First,
t2.Second = t1.Second
from table2 t2 inner join
cte t1
on t1.date = t2.date;
Seems like you're after a PIVOT
DECLARE #Table1 TABLE
(
[Type] NVARCHAR(100)
, [Date] INT
, [Value] INT
);
DECLARE #Table2 TABLE(
[Date] int
,[First] int
,[Second] int
)
INSERT INTO #Table1 (
[Type]
, [Date]
, [Value]
)
VALUES ( 'First', 2019, 1 )
, ( 'First', 2020, 2 )
, ( 'Second', 2019, 3 )
, ( 'Second', 2020, 4 );
INSERT INTO #Table2 (
[Date]
)
VALUES (2019),(2020)
--Show us what's in the tables
SELECT * FROM #Table1
SELECT * FROM #Table2
--How to pivot the data from Table 1
SELECT * FROM #Table1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make column where [Value] is in one of this
) AS [pvt] --Table alias
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
--Using that we can update #Table2
UPDATE [tbl2]
SET [tbl2].[First] = pvt.[First]
,[tbl2].[Second] = pvt.[Second]
FROM #Table1 tbl1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make column where [Value] is in one of this
) AS [pvt] --Table alias
INNER JOIN #Table2 tbl2 ON [tbl2].[Date] = [pvt].[Date]
--Results from #Table 2 after updated
SELECT * FROM #Table2
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4

Finding row of "duplicate" data with greatest value

I have a table setup as follows:
Key || Code || Date
5 2 2018
5 1 2017
8 1 2018
8 2 2017
I need to retrieve only the key and code where:
Code=2 AND Date > the other record's date
So based on this data above, I need to retrieve:
Key 5 with code=2
Key 8 does not meet the criteria since code 2's date is lower than code 1's date
I tried joining the table on itself but this returned incorrect data
Select key,code
from data d1
Join data d2 on d1.key = d2.key
Where d1.code = 2 and d1.date > d2.date
This method returned data with incorrect values and wrong data.
Perhaps you want this:
select d.*
from data d
where d.code = 2 and
d.date > (select d2.date
from data d2
where d2.key = d.key and d2.code = 1
);
If you just want the key, I would go for aggregation:
select d.key
from data d
group by d.key
having max(case when d2.code = 2 then date end) > max(case when d2.code <> 2 then date end);
use row_number, u can select rows with dates in ascending order. This is based on your sample data, selecting 2 rows
DECLARE #table TABLE ([key] INT, code INT, DATE INT)
INSERT #table
SELECT 5, 2, 2018
UNION ALL
SELECT 5, 2, 2018
UNION ALL
SELECT 8, 1, 2018
UNION ALL
SELECT 8, 2, 2017
SELECT [key], code, DATE
FROM (
SELECT [key], code, DATE, ROW_NUMBER() OVER (
PARTITION BY [key], code ORDER BY DATE
) rn
FROM #table
) x
WHERE rn = 2

SQL - Group by numbers according to their difference

I have a table and I want to group rows that have at most x difference at col2.
For example,
col1 col2
abg 3
abw 4
abc 5
abd 6
abe 20
abf 21
After query I want to get groups such that
group 1: abg 3
abw 4
abc 5
abd 6
group 2: abe 20
abf 21
In this example difference is 1.
How can write such a query?
For Oracle (or anything that supports window functions) this will work:
select col1, col2, sum(group_gen) over (order by col2) as grp
from (
select col1, col2,
case when col2 - lag(col2) over (order by col2) > 1 then 1 else 0 end as group_gen
from some_table
)
Check it on SQLFiddle.
This should get what you need, and changing the gap to that of 5, or any other number is a single change at the #lastVal +1 (vs whatever other difference). The prequery "PreSorted" is required to make sure the data is being processed sequentially so you don't get out-of-order entries.
As each current row is processed, it's column 2 value is stored in the #lastVal for test comparison of the next row, but remains as a valid column "Col2". There is no "group by" as you are just wanting a column to identify where each group is associated vs any aggregation.
select
#grp := if( PreSorted.col2 > #lastVal +1, #grp +1, #grp ) as GapGroup,
PreSorted.col1,
#lastVal := PreSorted.col2 as Col2
from
( select
YT.col1,
YT.col2
from
YourTable YT
order by
YT.col2 ) PreSorted,
( select #grp := 1,
#lastVal := -1 ) sqlvars
try this query, you can use 1 and 2 as input and get you groups:
var grp number(5)
exec :grp :=1
select * from YourTABLE
where (:grp = 1 and col2 < 20) or (:grp = 2 and col2 > 6);

How do you find a missing number in a table field starting from a parameter and incrementing sequentially?

Let's say I have an sql server table:
NumberTaken CompanyName
2 Fred 3 Fred 4 Fred 6 Fred 7 Fred 8 Fred 11 Fred
I need an efficient way to pass in a parameter [StartingNumber] and to count from [StartingNumber] sequentially until I find a number that is missing.
For example notice that 1, 5, 9 and 10 are missing from the table.
If I supplied the parameter [StartingNumber] = 1, it would check to see if 1 exists, if it does it would check to see if 2 exists and so on and so forth so 1 would be returned here.
If [StartNumber] = 6 the function would return 9.
In c# pseudo code it would basically be:
int ctr = [StartingNumber]
while([SELECT NumberTaken FROM tblNumbers Where NumberTaken = ctr] != null)
ctr++;
return ctr;
The problem with that code is that is seems really inefficient if there are thousands of numbers in the table. Also, I can write it in c# code or in a stored procedure whichever is more efficient.
Thanks for the help
A solution using JOIN:
select min(r1.NumberTaken) + 1
from MyTable r1
left outer join MyTable r2 on r2.NumberTaken = r1.NumberTaken + 1
where r1.NumberTaken >= 1 --your starting number
and r2.NumberTaken is null
I called my table Blank, and used the following:
declare #StartOffset int = 2
; With Missing as (
select #StartOffset as N where not exists(select * from Blank where ID = #StartOffset)
), Sequence as (
select #StartOffset as N from Blank where ID = #StartOffset
union all
select b.ID from Blank b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
You basically have two cases - either your starting value is missing (so the Missing CTE will contain one row), or it's present, so you count forwards using a recursive CTE (Sequence), and take the max from that and add 1
Edit from comment. Yes, create another CTE at the top that has your filter criteria, then use that in the rest of the query:
declare #StartOffset int = 2
; With BlankFilters as (
select ID from Blank where hasEntered <> 1
), Missing as (
select #StartOffset as N where not exists(select * from BlankFilters where ID = #StartOffset)
), Sequence as (
select #StartOffset as N from BlankFilters where ID = #StartOffset
union all
select b.ID from BlankFilters b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
this may now return a row that does exist in the table, but hasEntered=1
Tables:
create table Blank (
ID int not null,
Name varchar(20) not null
)
insert into Blank(ID,Name)
select 2 ,'Fred' union all
select 3 ,'Fred' union all
select 4 ,'Fred' union all
select 6 ,'Fred' union all
select 7 ,'Fred' union all
select 8 ,'Fred' union all
select 11 ,'Fred'
go
Try the set based approach - should be faster
select min(t1.NumberTaken)+1 as "min_missing" from t t1
where not exists (select 1 from t t2
where t1.NumberTaken = t2.NumberTaken+1)
and t1.NumberTaken > #StartingNumber
This is Sybase syntax, so massage for SQL server consumption if needed.
Create a temp table with all numbers from StartingValue to EndValue and LEFT OUTER JOIN to your data table.

How do I find a "gap" in running counter with SQL?

I'd like to find the first "gap" in a counter column in an SQL table. For example, if there are values 1,2,4 and 5 I'd like to find out 3.
I can of course get the values in order and go through it manually, but I'd like to know if there would be a way to do it in SQL.
In addition, it should be quite standard SQL, working with different DBMSes.
In MySQL and PostgreSQL:
SELECT id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1
In SQL Server:
SELECT TOP 1
id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
In Oracle:
SELECT *
FROM (
SELECT id + 1 AS gap
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
)
WHERE rownum = 1
ANSI (works everywhere, least efficient):
SELECT MIN(id) + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
Systems supporting sliding window functions:
SELECT -- TOP 1
-- Uncomment above for SQL Server 2012+
previd
FROM (
SELECT id,
LAG(id) OVER (ORDER BY id) previd
FROM mytable
) q
WHERE previd <> id - 1
ORDER BY
id
-- LIMIT 1
-- Uncomment above for PostgreSQL
Your answers all work fine if you have a first value id = 1, otherwise this gap will not be detected. For instance if your table id values are 3,4,5, your queries will return 6.
I did something like this
SELECT MIN(ID+1) FROM (
SELECT 0 AS ID UNION ALL
SELECT
MIN(ID + 1)
FROM
TableX) AS T1
WHERE
ID+1 NOT IN (SELECT ID FROM TableX)
There isn't really an extremely standard SQL way to do this, but with some form of limiting clause you can do
SELECT `table`.`num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
LIMIT 1
(MySQL, PostgreSQL)
or
SELECT TOP 1 `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
(SQL Server)
or
SELECT `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
AND ROWNUM = 1
(Oracle)
The first thing that came into my head. Not sure if it's a good idea to go this way at all, but should work. Suppose the table is t and the column is c:
SELECT
t1.c + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
ORDER BY gap ASC
LIMIT 1
Edit: This one may be a tick faster (and shorter!):
SELECT
min(t1.c) + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
This works in SQL Server - can't test it in other systems but it seems standard...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1))
You could also add a starting point to the where clause...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1)) AND ID > 2000
So if you had 2000, 2001, 2002, and 2005 where 2003 and 2004 didn't exist, it would return 2003.
The following solution:
provides test data;
an inner query that produces other gaps; and
it works in SQL Server 2012.
Numbers the ordered rows sequentially in the "with" clause and then reuses the result twice with an inner join on the row number, but offset by 1 so as to compare the row before with the row after, looking for IDs with a gap greater than 1. More than asked for but more widely applicable.
create table #ID ( id integer );
insert into #ID values (1),(2), (4),(5),(6),(7),(8), (12),(13),(14),(15);
with Source as (
select
row_number()over ( order by A.id ) as seq
,A.id as id
from #ID as A WITH(NOLOCK)
)
Select top 1 gap_start from (
Select
(J.id+1) as gap_start
,(K.id-1) as gap_end
from Source as J
inner join Source as K
on (J.seq+1) = K.seq
where (J.id - (K.id-1)) <> 0
) as G
The inner query produces:
gap_start gap_end
3 3
9 11
The outer query produces:
gap_start
3
Inner join to a view or sequence that has a all possible values.
No table? Make a table. I always keep a dummy table around just for this.
create table artificial_range(
id int not null primary key auto_increment,
name varchar( 20 ) null ) ;
-- or whatever your database requires for an auto increment column
insert into artificial_range( name ) values ( null )
-- create one row.
insert into artificial_range( name ) select name from artificial_range;
-- you now have two rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have four rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have eight rows
--etc.
insert into artificial_range( name ) select name from artificial_range;
-- you now have 1024 rows, with ids 1-1024
Then,
select a.id from artificial_range a
where not exists ( select * from your_table b
where b.counter = a.id) ;
This one accounts for everything mentioned so far. It includes 0 as a starting point, which it will default to if no values exist as well. I also added the appropriate locations for the other parts of a multi-value key. This has only been tested on SQL Server.
select
MIN(ID)
from (
select
0 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
For PostgreSQL
An example that makes use of recursive query.
This might be useful if you want to find a gap in a specific range
(it will work even if the table is empty, whereas the other examples will not)
WITH
RECURSIVE a(id) AS (VALUES (1) UNION ALL SELECT id + 1 FROM a WHERE id < 100), -- range 1..100
b AS (SELECT id FROM my_table) -- your table ID list
SELECT a.id -- find numbers from the range that do not exist in main table
FROM a
LEFT JOIN b ON b.id = a.id
WHERE b.id IS NULL
-- LIMIT 1 -- uncomment if only the first value is needed
My guess:
SELECT MIN(p1.field) + 1 as gap
FROM table1 AS p1
INNER JOIN table1 as p3 ON (p1.field = p3.field + 2)
LEFT OUTER JOIN table1 AS p2 ON (p1.field = p2.field + 1)
WHERE p2.field is null;
I wrote up a quick way of doing it. Not sure this is the most efficient, but gets the job done. Note that it does not tell you the gap, but tells you the id before and after the gap (keep in mind the gap could be multiple values, so for example 1,2,4,7,11 etc)
I'm using sqlite as an example
If this is your table structure
create table sequential(id int not null, name varchar(10) null);
and these are your rows
id|name
1|one
2|two
4|four
5|five
9|nine
The query is
select a.* from sequential a left join sequential b on a.id = b.id + 1 where b.id is null and a.id <> (select min(id) from sequential)
union
select a.* from sequential a left join sequential b on a.id = b.id - 1 where b.id is null and a.id <> (select max(id) from sequential);
https://gist.github.com/wkimeria/7787ffe84d1c54216f1b320996b17b7e
Here is an alternative to show the range of all possible gap values in portable and more compact way :
Assume your table schema looks like this :
> SELECT id FROM your_table;
+-----+
| id |
+-----+
| 90 |
| 103 |
| 104 |
| 118 |
| 119 |
| 120 |
| 121 |
| 161 |
| 162 |
| 163 |
| 185 |
+-----+
To fetch the ranges of all possible gap values, you have the following query :
The subquery lists pairs of ids, each of which has the lowerbound column being smaller than upperbound column, then use GROUP BY and MIN(m2.id) to reduce number of useless records.
The outer query further removes the records where lowerbound is exactly upperbound - 1
My query doesn't (explicitly) output the 2 records (YOUR_MIN_ID_VALUE, 89) and (186, YOUR_MAX_ID_VALUE) at both ends, that implicitly means any number in both of the ranges hasn't been used in your_table so far.
> SELECT m3.lowerbound + 1, m3.upperbound - 1 FROM
(
SELECT m1.id as lowerbound, MIN(m2.id) as upperbound FROM
your_table m1 INNER JOIN your_table
AS m2 ON m1.id < m2.id GROUP BY m1.id
)
m3 WHERE m3.lowerbound < m3.upperbound - 1;
+-------------------+-------------------+
| m3.lowerbound + 1 | m3.upperbound - 1 |
+-------------------+-------------------+
| 91 | 102 |
| 105 | 117 |
| 122 | 160 |
| 164 | 184 |
+-------------------+-------------------+
select min([ColumnName]) from [TableName]
where [ColumnName]-1 not in (select [ColumnName] from [TableName])
and [ColumnName] <> (select min([ColumnName]) from [TableName])
Here is standard a SQL solution that runs on all database servers with no change:
select min(counter + 1) FIRST_GAP
from my_table a
where not exists (select 'x' from my_table b where b.counter = a.counter + 1)
and a.counter <> (select max(c.counter) from my_table c);
See in action for;
PL/SQL via Oracle's livesql,
MySQL via sqlfiddle,
PostgreSQL via sqlfiddle
MS Sql via sqlfiddle
It works for empty tables or with negatives values as well. Just tested in SQL Server 2012
select min(n) from (
select case when lead(i,1,0) over(order by i)>i+1 then i+1 else null end n from MyTable) w
If You use Firebird 3 this is most elegant and simple:
select RowID
from (
select `ID_Column`, Row_Number() over(order by `ID_Column`) as RowID
from `Your_Table`
order by `ID_Column`)
where `ID_Column` <> RowID
rows 1
-- PUT THE TABLE NAME AND COLUMN NAME BELOW
-- IN MY EXAMPLE, THE TABLE NAME IS = SHOW_GAPS AND COLUMN NAME IS = ID
-- PUT THESE TWO VALUES AND EXECUTE THE QUERY
DECLARE #TABLE_NAME VARCHAR(100) = 'SHOW_GAPS'
DECLARE #COLUMN_NAME VARCHAR(100) = 'ID'
DECLARE #SQL VARCHAR(MAX)
SET #SQL =
'SELECT TOP 1
'+#COLUMN_NAME+' + 1
FROM '+#TABLE_NAME+' mo
WHERE NOT EXISTS
(
SELECT NULL
FROM '+#TABLE_NAME+' mi
WHERE mi.'+#COLUMN_NAME+' = mo.'+#COLUMN_NAME+' + 1
)
ORDER BY
'+#COLUMN_NAME
-- SELECT #SQL
DECLARE #MISSING_ID TABLE (ID INT)
INSERT INTO #MISSING_ID
EXEC (#SQL)
--select * from #MISSING_ID
declare #var_for_cursor int
DECLARE #LOW INT
DECLARE #HIGH INT
DECLARE #FINAL_RANGE TABLE (LOWER_MISSING_RANGE INT, HIGHER_MISSING_RANGE INT)
DECLARE IdentityGapCursor CURSOR FOR
select * from #MISSING_ID
ORDER BY 1;
open IdentityGapCursor
fetch next from IdentityGapCursor
into #var_for_cursor
WHILE ##FETCH_STATUS = 0
BEGIN
SET #SQL = '
DECLARE #LOW INT
SELECT #LOW = MAX('+#COLUMN_NAME+') + 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' < ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + '
DECLARE #HIGH INT
SELECT #HIGH = MIN('+#COLUMN_NAME+') - 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' > ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + 'SELECT #LOW,#HIGH'
INSERT INTO #FINAL_RANGE
EXEC( #SQL)
fetch next from IdentityGapCursor
into #var_for_cursor
END
CLOSE IdentityGapCursor;
DEALLOCATE IdentityGapCursor;
SELECT ROW_NUMBER() OVER(ORDER BY LOWER_MISSING_RANGE) AS 'Gap Number',* FROM #FINAL_RANGE
Found most of approaches run very, very slow in mysql. Here is my solution for mysql < 8.0. Tested on 1M records with a gap near the end ~ 1sec to finish. Not sure if it fits other SQL flavours.
SELECT cardNumber - 1
FROM
(SELECT #row_number := 0) as t,
(
SELECT (#row_number:=#row_number+1), cardNumber, cardNumber-#row_number AS diff
FROM cards
ORDER BY cardNumber
) as x
WHERE diff >= 1
LIMIT 0,1
I assume that sequence starts from `1`.
If your counter is starting from 1 and you want to generate first number of sequence (1) when empty, here is the corrected piece of code from first answer valid for Oracle:
SELECT
NVL(MIN(id + 1),1) AS gap
FROM
mytable mo
WHERE 1=1
AND NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
AND EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = 1
)
DECLARE #Table AS TABLE(
[Value] int
)
INSERT INTO #Table ([Value])
VALUES
(1),(2),(4),(5),(6),(10),(20),(21),(22),(50),(51),(52),(53),(54),(55)
--Gaps
--Start End Size
--3 3 1
--7 9 3
--11 19 9
--23 49 27
SELECT [startTable].[Value]+1 [Start]
,[EndTable].[Value]-1 [End]
,([EndTable].[Value]-1) - ([startTable].[Value]) Size
FROM
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS startTable
JOIN
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS EndTable
ON [EndTable].Record = [startTable].Record+1
WHERE [startTable].[Value]+1 <>[EndTable].[Value]
If the numbers in the column are positive integers (starting from 1) then here is how to solve it easily. (assuming ID is your column name)
SELECT TEMP.ID
FROM (SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME') AS TEMP
WHERE ID NOT IN (SELECT ID FROM 'TABLE-NAME')
ORDER BY 1 ASC LIMIT 1
SELECT ID+1 FROM table WHERE ID+1 NOT IN (SELECT ID FROM table) ORDER BY 1;