SQL query to count all occurrences that start with a substring

Suppose that I have the following table:
ID: STR:
01 abc
02 abcdef
03 abx
04 abxy
05 abxyz
06 abxyv
I need an SQL query that returns the ID column and, for each row, the number of rows whose string starts with that row's string (the row itself included). The desired result for the table above:
ID: OCC:
01 2
02 1
03 4
04 3
05 1
06 1

You could JOIN the table with itself and GROUP BY the ID to get you the result.
SELECT t1.ID, COUNT(*)
FROM ATable t1
INNER JOIN ATable t2 ON t2.Str LIKE t1.Str + '%'
GROUP BY
t1.ID
Some notes:
You want to make sure you have an index on the Str column.
Depending on the amount of data, your DBMS might choke on the amount it has to handle. Worst case, the join compares every row with every other row, i.e. the square of the number of rows in your table.
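As a hedged sketch of the self-join above, the snippet below runs the same idea against an in-memory SQLite database from Python. The helper name `prefix_counts` and the use of SQLite's `||` concatenation (instead of the `+` used above) are assumptions made for illustration. Note that SQLite's `LIKE` is case-insensitive for ASCII by default, and a `%` or `_` inside `Str` would act as a wildcard in this form.

```python
import sqlite3

def prefix_counts(rows):
    """Return (ID, OCC) pairs: how many strings start with each row's string."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ATable (ID INTEGER PRIMARY KEY, Str TEXT)")
    conn.executemany("INSERT INTO ATable (ID, Str) VALUES (?, ?)", rows)
    # Self-join: t2 is every row whose Str begins with t1.Str (t1 itself included).
    query = """
        SELECT t1.ID, COUNT(*) AS OCC
        FROM ATable t1
        INNER JOIN ATable t2 ON t2.Str LIKE t1.Str || '%'
        GROUP BY t1.ID
        ORDER BY t1.ID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample data from the question.
sample = [(1, "abc"), (2, "abcdef"), (3, "abx"),
          (4, "abxy"), (5, "abxyz"), (6, "abxyv")]
print(prefix_counts(sample))
```

With the sample table this reproduces the OCC column from the question.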

You can do that in SQL Server's T-SQL with the following code. Caution: I do not guarantee how this will perform with a large dataset!
Declare @Table table
(
Id int,
String varchar(10)
)
Insert Into @Table
( Id, String )
Values ( 1, 'abc' ),
( 2, 'abcdef' ),
( 3, 'abx' ),
( 4, 'abxy' ),
( 5, 'abxyz' ),
( 6, 'abxyv' )
-- Every row paired with the rows it is a prefix of
Select t.Id,
t.String
From @Table as t
Inner Join @Table as t2 On t2.String Like t.String + '%'
Order By t.Id
-- Number of matches per row
Select t.Id,
Count(*) As 'Count'
From @Table as t
Inner Join @Table as t2 On t2.String Like t.String + '%'
Group By t.Id
Order By t.Id

Related

Group identifiers/values that are related with each other between multiple columns

I want to group identifiers that are related to each other across multiple columns and assign a unique group ID to each group.
Also, when we receive a new row, we should be able to assign the right ID, respecting what has already been done for the other group IDs.
For example:
Col1 Col2 Col3 Col4
AA NULL 33 12
BB NULL 45 12
AA 123 65 15
CC 123 NULL 42
DD NULL 10 42
EE NULL 20 NULL
FF 145 33 NULL
GG NULL NULL 11
Desired result:
The group ID is 1 for rows 1 and 3 because they share the same value in Col1 (AA). Row 4 also gets ID 1 because its value in Col2 (123) is the same as the Col2 value of the second AA row.
If there is any match between rows in any of the columns, the rows get the same ID:
Col1 Col2 Col3 Col4 Group ID
AA NULL 33 12 1
BB NULL 45 12 1
AA 123 65 15 1
CC 123 NULL 42 1
DD NULL 10 42 1
EE NULL 20 NULL 2
FF 145 33 NULL 1
GG NULL NULL 11 3

I've been doing some work on this and agree with Kashyap: I cannot find a way to do this in a single statement. You need either a recursive CTE or a loop. Synapse does not currently support recursive CTEs, which leaves using a loop to create the effect you want.
One concern came up while I was working on this. As you continue to add data, you'll have more and more overlaps and could eventually end up with just one group. That depends on your dataset; you might have something you can guarantee will have discrete divisions. The way the script I put together works, a new match will update any group IDs, even in existing data. You could modify it to set group IDs only for new rows, but then you could end up in a situation where one row matches multiple groups.
This is certainly not the only option, but it is the script I pulled together. It depends on having a unique ID that remains the same in each iteration. Because the loop uses updates instead of inserts, prepping the data would involve inserting the data into your new table without the group, and you can create your ID at that time using auto-increment or otherwise. The script works best with an INT ID column, but should work with a GUID if that is necessary.
So the process is essentially this:
Do whatever initial prep you need to insert the data into the table and create an ID.
Join the table back onto itself, once for each column that could contain a match.
Update the group ID to be the minimum value across the IDs and current group IDs of that set of matches.
Check whether another round is needed. Because we are using the minimum ID as the group number, each finished group contains a row where ID = group ID.
CREATE TABLE #testtable
(
    [id] INT NOT NULL,
    [col1] INT NOT NULL,
    [col2] INT NULL,
    [col3] INT NULL,
    [groupnumber] INT NULL
)
INSERT INTO #testtable (id, col1, col2, col3)
SELECT 1, 1, 5, 33 UNION ALL -- First
SELECT 2, 2, null, 45 UNION ALL -- Second
SELECT 3, 1, 123, 65 UNION ALL -- First
SELECT 4, 3, 123, null UNION ALL -- First
SELECT 5, 10, null, 10 UNION ALL -- Third
SELECT 6, 5, null, 45 UNION ALL -- Second
SELECT 7, 6, 145, 33 -- First
DECLARE @RemainingRows INT,
        @LoopCounter INT,
        @MaxLoops INT -- To protect against infinite loop
SET @RemainingRows = (SELECT COUNT([id]) FROM #testtable)
SET @LoopCounter = 0;
SET @MaxLoops = 10;
WHILE (@RemainingRows > 0 AND @LoopCounter < @MaxLoops)
BEGIN
    WITH combineddata AS
    (
        SELECT id, col1, col2, col3, groupnumber
        FROM #testtable
    ),
    -- Create a set of rows that contains all rows and all possible matches
    matcheddata AS
    (
        SELECT
            c1.id,
            c1.col1 AS c1col1,
            c1.col2 AS c1col2,
            c1.col3 AS c1col3,
            c1.groupnumber AS groupNumber1,
            c2.id AS RowNum2,
            c2.groupnumber AS groupNumber2,
            c3.id AS RowNum3,
            c3.groupnumber AS groupNumber3,
            c4.id AS RowNum4,
            c4.groupnumber AS groupNumber4
        FROM combineddata c1
        LEFT JOIN combineddata c2 ON c1.col1 = c2.col1
        LEFT JOIN combineddata c3 ON c1.col2 = c3.col2
        LEFT JOIN combineddata c4 ON c1.col3 = c4.col3
    )
    UPDATE #testtable
    SET groupnumber =
        CASE WHEN NEW.groupnumber IS NULL THEN NULL ELSE NEW.groupnumber END
    FROM
    (
        SELECT id, c1col1, c1col2, c1col3,
               MIN(groupnumber) AS GroupNumber
        FROM matcheddata
        CROSS APPLY (
            -- Smallest non-NULL value among the row's own ID, the matched IDs
            -- and the group numbers those rows currently carry
            SELECT MIN(c) AS GroupNumber
            FROM (VALUES
                (id), (RowNum2), (RowNum3), (RowNum4),
                (groupNumber1), (groupNumber2), (groupNumber3), (groupNumber4)
            ) AS v (C)
            WHERE c IS NOT NULL) g
        GROUP BY id, c1col1, c1col2, c1col3
    ) NEW
    INNER JOIN #testtable
        ON NEW.id = #testtable.id
    SET @LoopCounter = @LoopCounter + 1
    -- Rows whose group number does not yet point at a row that is its own group root
    SET @RemainingRows =
    (
        SELECT COUNT(t1.id)
        FROM #testtable t1
        LEFT JOIN #testtable t2
            ON t1.groupnumber = t2.[id]
        WHERE t2.id IS NULL
           OR t2.id <> t2.groupnumber
    )
    PRINT 'Remaining Rows: ' + CAST(@RemainingRows AS VARCHAR)
    PRINT 'Counter: ' + CAST(@LoopCounter AS VARCHAR);
END
SELECT * FROM #testtable
IF Object_id('tempdb..#testTable') IS NOT NULL
BEGIN
    DROP TABLE #testtable
END
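The loop described in this answer can also be sketched in a few lines of Python against SQLite. This is not the author's script: it swaps the CTE and CROSS APPLY for a single correlated UPDATE, starts every row with its own ID as group number, and stops once a pass changes nothing. The table name `t`, the column `grp`, and the helper `assign_groups` are hypothetical names chosen for the example.

```python
import sqlite3

def assign_groups(rows, max_loops=10):
    """Propagate the minimum group number across rows sharing a column value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, col1, col2, col3, grp INTEGER)"
    )
    # Every row starts in its own group (grp = id).
    conn.executemany(
        "INSERT INTO t (id, col1, col2, col3, grp) VALUES (?, ?, ?, ?, ?)",
        [(r[0], r[1], r[2], r[3], r[0]) for r in rows],
    )
    smallest_neighbour = """
        (SELECT MIN(o.grp) FROM t AS o
         WHERE o.col1 = t.col1 OR o.col2 = t.col2 OR o.col3 = t.col3)
    """
    for _ in range(max_loops):  # guard against endless looping
        cur = conn.execute(
            f"UPDATE t SET grp = {smallest_neighbour} WHERE grp > {smallest_neighbour}"
        )
        if cur.rowcount == 0:  # nothing changed, so the groups are stable
            break
    result = conn.execute("SELECT id, grp FROM t ORDER BY id").fetchall()
    conn.close()
    return result

# Test rows from the script above: (id, col1, col2, col3).
rows = [(1, 1, 5, 33), (2, 2, None, 45), (3, 1, 123, 65), (4, 3, 123, None),
        (5, 10, None, 10), (6, 5, None, 45), (7, 6, 145, 33)]
print(assign_groups(rows))
```

NULL values never compare equal, so they do not link rows. As in the script, the group number is the smallest ID in the group, which is why the third group here is numbered 5.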

Find overlapping sets of data in a table

I need to identify duplicate sets of data and give the sets whose data is the same a group id.
id threshold cost
-- ---------- ----------
1 0 9
1 100 7
1 500 6
2 0 9
2 100 7
2 500 6
I have thousands of these sets; most are the same with different ids. I need to find all the sets that have the same thresholds and cost amounts and give them a group id. I'm just not sure where to begin. Is the best way to iterate and insert each set into a table, and then iterate through each set in the table to find what already exists?

This is one of those cases where you can try to do something with relational operators. Or, you can just say: "let's put all the information in a string and use that as the group id". SQL Server seems to discourage this approach, but it is possible. So, let's characterize the groups using:
select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id;
Oh, I think that solves your problem. The groupname can serve as the group id. If you want a numeric id (which is probably a good idea), use dense_rank():
select d.id, dense_rank() over (order by groupname) as groupid
from (select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id
) d;
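To illustrate the "put all the information in a string" idea without relying on FOR XML PATH, here is a rough Python sketch that reads the rows from SQLite in a fixed order and builds the signature string on the client. The helper name `group_ids` is hypothetical, and group ids are handed out in first-seen order rather than by `dense_rank()` over the string, so the numbers may differ from the T-SQL version even though the grouping is the same.

```python
import sqlite3
from itertools import groupby

def group_ids(rows):
    """Map each set id to a group id; sets with identical rows share a group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER, threshold INTEGER, cost INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    ordered = conn.execute(
        "SELECT id, threshold, cost FROM data ORDER BY id, threshold, cost"
    ).fetchall()
    conn.close()

    seen = {}    # signature string -> group id
    result = {}  # set id -> group id
    for set_id, members in groupby(ordered, key=lambda r: r[0]):
        # Same shape as the groupname above, e.g. "0-9;100-7;500-6"
        signature = ";".join(f"{t}-{c}" for _, t, c in members)
        result[set_id] = seen.setdefault(signature, len(seen) + 1)
    return result

data = [(1, 0, 9), (1, 100, 7), (1, 500, 6),
        (2, 0, 9), (2, 100, 7), (2, 500, 6),
        (3, 1, 9), (3, 100, 7), (3, 500, 6)]
print(group_ids(data))
```

Sets 1 and 2 end up in the same group, while set 3 (threshold 1 instead of 0) gets its own.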

Here's the solution to my interpretation of the question:
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
;
WITH BaseTable AS
(
SELECT 1 id, 0 as threshold, 9 as cost
UNION SELECT 1, 100, 7
UNION SELECT 1, 500, 6
UNION SELECT 2, 0, 9
UNION SELECT 2, 100, 7
UNION SELECT 2, 500, 6
UNION SELECT 3, 1, 9
UNION SELECT 3, 100, 7
UNION SELECT 3, 500, 6
)
, BaseCTE AS
(
SELECT
id
--,dense_rank() over (order by threshold, cost ) as GroupId
,
(
SELECT CAST(TblGrouping.threshold AS varchar(8000)) + '/' + CAST(TblGrouping.cost AS varchar(8000)) + ';'
FROM BaseTable AS TblGrouping
WHERE TblGrouping.id = BaseTable.id
ORDER BY TblGrouping.threshold, TblGrouping.cost
FOR XML PATH ('')
) AS MultiGroup
FROM BaseTable
GROUP BY id
)
,
CTE AS
(
SELECT
*
,DENSE_RANK() OVER (ORDER BY MultiGroup) AS GroupId
FROM BaseCTE
)
SELECT *
INTO #tempGrouping
FROM CTE
-- SELECT * FROM #tempGrouping;
UPDATE BaseTable
SET BaseTable.GroupId = #tempGrouping.GroupId
FROM BaseTable
INNER JOIN #tempGrouping
ON BaseTable.Id = #tempGrouping.Id
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
Here BaseTable stands for your table; you don't need the CTE "BaseTable", because you already have a data table.
You may need to take extra precautions if your threshold and cost fields can be NULL.

Joining a list of values with table rows in SQL

Suppose I have a list of values, such as 1, 2, 3, 4, 5 and a table where some of those values exist in some column. Here is an example:
id name
1 Alice
3 Cindy
5 Elmore
6 Felix
I want to create a SELECT statement that will include all of the values from my list as well as the information from those rows that match the values, i.e., perform a LEFT OUTER JOIN between my list and the table, so the result would be as follows:
id name
1 Alice
2 (null)
3 Cindy
4 (null)
5 Elmore
How do I do that without creating a temp table or using multiple UNION operators?

In Microsoft SQL Server 2008 or later, you can use a table value constructor:
Select v.valueId, m.name
From (values (1), (2), (3), (4), (5)) v(valueId)
left Join otherTable m
on m.id = v.valueId
Postgres also has this construct, VALUES lists:
SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num,letter)
Also note the possible Common Table Expression syntax, which can be handy for joins:
WITH my_values(num, str) AS (
VALUES (1, 'one'), (2, 'two'), (3, 'three')
)
SELECT num, str FROM my_values
With Oracle it's possible, though heavier. From ASK TOM:
with id_list as (
select 10 id from dual union all
select 20 id from dual union all
select 25 id from dual union all
select 70 id from dual union all
select 90 id from dual
)
select * from id_list;
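For a quick way to try the VALUES-list approach from this answer, the sketch below runs it through SQLite, which also accepts `VALUES` inside a common table expression. The table name `people`, the CTE name `wanted`, and the helper `left_join_values` are assumptions for the example; the list must be non-empty, since an empty `VALUES` clause is not valid SQL.

```python
import sqlite3

def left_join_values(values, table_rows):
    """LEFT JOIN an in-query list of ids against a table of (id, name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", table_rows)
    # One "(?)" row per wanted value, bound as parameters.
    placeholders = ", ".join("(?)" for _ in values)
    query = f"""
        WITH wanted(id) AS (VALUES {placeholders})
        SELECT w.id, p.name
        FROM wanted w
        LEFT JOIN people p ON p.id = w.id
        ORDER BY w.id
    """
    result = conn.execute(query, list(values)).fetchall()
    conn.close()
    return result

table = [(1, "Alice"), (3, "Cindy"), (5, "Elmore"), (6, "Felix")]
print(left_join_values([1, 2, 3, 4, 5], table))
```

Values with no matching row come back with `None` (SQL NULL) in the name column, and Felix is left out because 6 is not in the list.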

The following solution for Oracle is adapted from this source. The basic idea is to exploit Oracle's hierarchical queries. You have to specify a maximum length of the list (100 in the sample query below).
select d.lstid
, t.name
from (
select substr(
csv
, instr(csv,',',1,lev) + 1
, instr(csv,',',1,lev+1 )-instr(csv,',',1,lev)-1
) lstid
from (select ','||'1,2,3,4,5'||',' csv from dual)
, (select level lev from dual connect by level <= 100)
where lev <= length(csv)-length(replace(csv,','))-1
) d
left join test t on ( d.lstid = t.id )
;
Check out this sql fiddle to see it work.

Bit late on this, but for Oracle you could do something like this to get a table of values:
SELECT rownum + 5 /*start*/ - 1 as myval
FROM dual
CONNECT BY LEVEL <= 100 /*end*/ - 5 /*start*/ + 1
... And then join that to your table:
SELECT *
FROM
(SELECT rownum + 1 /*start*/ - 1 myval
FROM dual
CONNECT BY LEVEL <= 5 /*end*/ - 1 /*start*/ + 1) mypseudotable
left outer join myothertable
on mypseudotable.myval = myothertable.correspondingval

Assuming myTable is the name of your table, following code should work.
;with x as
(
select top (select max(id) from [myTable]) number from [master]..spt_values
),
y as
(select row_number() over (order by x.number) as id
from x)
select y.id, t.name
from y left join myTable as t
on y.id = t.id;
Caution: This is SQL Server implementation.

To get the sequential numbers needed for the output (this avoids typing out n values by hand), fill a table in a loop:
declare @site as int
set @site = 1
while @site<=200
begin
insert into ##table
values (@site)
set @site=@site+1
end
Final output (after the step above):
select * from ##table
select v.id,m.name from ##table as v
left outer join [source_table] m
on m.id=v.id

Suppose the table that holds the values 1,2,3,4,5 is named list_of_values, and the table that contains only some of those values, together with the name column, is named some_values. Then you can do:
SELECT B.id,A.name
FROM [list_of_values] AS B
LEFT JOIN [some_values] AS A
ON B.ID = A.ID

difficult sql query

I have a table containing many columns, I have to make my selection according to these two columns:
TIME ID
-216 AZA
215 AZA
56 EA
-55 EA
66 EA
-03 AR
03 OUI
-999 OP
999 OP
04 AR
87 AR
The expected output is
TIME ID
66 EA
03 OUI
87 AR
I need to select the rows with no matches. There are rows which have the same ID and almost the same time, but with the sign inverted and a small difference. For example, the first row with the TIME -216 matches the second record with time 215. I tried to solve it in many ways, but every time I find myself lost.

First step -- find rows with duplicate IDs. Second step -- filter for rows which are near-inverse duplicates.
First step:
SELECT t1.TIME, t2.TIME, t1.ID FROM mytable t1 JOIN mytable
t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME;
The second part of the join clause ensures we only get one record for each pair.
Second step:
SELECT t1.TIME,t2.TIME,t1.ID FROM mytable t1 JOIN mytable t2 ON t1.ID = t2.ID AND
t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3;
This will produce some duplicate results if e.g. (10, FI), (-10, FI) and (11, FI) are in your table, as there are two valid pairs. You can possibly filter these out as follows:
SELECT t1.TIME,MAX(t2.TIME),t1.ID FROM mytable t1 JOIN mytable t2 ON
t1.ID = t2.ID AND t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3 GROUP BY
t1.TIME,t1.ID;
But it's unclear which result you want to drop. Hopefully this points you in the right direction, though!
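The queries above list the matching pairs; the question asks for the rows that have no partner. One possible way to get there, sketched here with SQLite from Python, is to turn the pair condition into a NOT EXISTS filter. The column name `time_value`, the helper `unmatched_rows`, and the use of SQLite's implicit `rowid` to stop a row from pairing with itself are assumptions for this example; the tolerance of 3 is the one used in the answer.

```python
import sqlite3

def unmatched_rows(rows, tolerance=3):
    """Return rows with no same-ID partner whose time nearly cancels theirs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (time_value INTEGER, id TEXT)")
    conn.executemany("INSERT INTO mytable (time_value, id) VALUES (?, ?)", rows)
    query = """
        SELECT t1.time_value, t1.id
        FROM mytable t1
        WHERE NOT EXISTS (
            SELECT 1
            FROM mytable t2
            WHERE t2.id = t1.id
              AND t2.rowid <> t1.rowid                      -- never pair a row with itself
              AND ABS(t1.time_value + t2.time_value) < ?    -- near-inverse times
        )
        ORDER BY t1.rowid
    """
    result = conn.execute(query, (tolerance,)).fetchall()
    conn.close()
    return result

# Data from the question, in its original order.
times = [(-216, "AZA"), (215, "AZA"), (56, "EA"), (-55, "EA"), (66, "EA"),
         (-3, "AR"), (3, "OUI"), (-999, "OP"), (999, "OP"), (4, "AR"), (87, "AR")]
print(unmatched_rows(times))
```

On the question's data this returns the three rows from the expected output.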

Does this help?
create table #RawData
(
[Time] int,
ID varchar(3)
)
insert into #rawdata ([time],ID)
select -216, 'AZA'
union
select 215, 'AZA'
union
select 56, 'EA'
union
select -55, 'EA'
union
select 66, 'EA'
union
select -03, 'AR'
union
select 03, 'OUI'
union
select -999, 'OP'
union
select 999, 'OP'
union
select 04, 'AR'
union
select 87, 'AR'
union
-- this value added to illustrate that the algorithm does not ignore this value
select 156, 'EA'
--create a copy with an ID to help out
create table #Data
(
uniqueId uniqueidentifier,
[Time] int,
ID varchar(3)
)
insert into #Data(uniqueId,[Time],ID) select newid(),[Time],ID from #RawData
declare @allowedDifference int
select @allowedDifference = 1
--find duplicates with matching inverse time
select *, d1.Time + d2.Time as pairDifference from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=@allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference))
-- now find all ID's ignoring these pairs
select [Time],ID from #data
where uniqueID not in (select d1.uniqueID from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=@allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference)))

How do I find a "gap" in running counter with SQL?

I'd like to find the first "gap" in a counter column in an SQL table. For example, if there are values 1,2,4 and 5 I'd like to find out 3.
I can of course get the values in order and go through it manually, but I'd like to know if there would be a way to do it in SQL.
In addition, it should be quite standard SQL, working with different DBMSes.

In MySQL and PostgreSQL:
SELECT id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1
In SQL Server:
SELECT TOP 1
id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
In Oracle:
SELECT *
FROM (
SELECT id + 1 AS gap
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
)
WHERE rownum = 1
ANSI (works everywhere, least efficient):
SELECT MIN(id) + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
Systems supporting sliding window functions:
SELECT -- TOP 1
-- Uncomment above for SQL Server 2012+
previd + 1 -- first missing value after the previous id
FROM (
SELECT id,
LAG(id) OVER (ORDER BY id) previd
FROM mytable
) q
WHERE previd <> id - 1
ORDER BY
id
-- LIMIT 1
-- Uncomment above for PostgreSQL
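As a small, hedged check of the ANSI variant (the `MIN(id) + 1` form with NOT EXISTS), the following Python snippet runs it on an in-memory SQLite table. The helper name `first_gap` is made up for the example.

```python
import sqlite3

def first_gap(ids):
    """Return the first value after an id that has no successor, or None."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO mytable (id) VALUES (?)", [(i,) for i in ids])
    query = """
        SELECT MIN(mo.id) + 1
        FROM mytable mo
        WHERE NOT EXISTS (
            SELECT NULL FROM mytable mi WHERE mi.id = mo.id + 1
        )
    """
    (gap,) = conn.execute(query).fetchone()
    conn.close()
    return gap

print(first_gap([1, 2, 4, 5]))  # gap inside the sequence
print(first_gap([3, 4, 5]))     # no inner gap: value after the last id
print(first_gap([]))            # empty table
```

For 1,2,4,5 this yields 3. A sequence without an inner gap yields the value after its last id, and an empty table yields NULL (`None` in Python).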

Your answers all work fine if the first value is id = 1; otherwise the leading gap will not be detected. For instance, if your table id values are 3,4,5, your queries will return 6.
I did something like this:
SELECT MIN(ID+1) FROM (
SELECT 0 AS ID UNION ALL
SELECT
ID
FROM
TableX) AS T1
WHERE
ID+1 NOT IN (SELECT ID FROM TableX)
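The idea of seeding the sequence with a virtual 0 so that a missing 1 is reported can be sketched as follows, again with SQLite from Python. This is one possible formulation, not necessarily the exact query above: the seed row is unioned with all ids of the table. The helper name `first_gap_from_zero` is hypothetical, and ids are assumed to be non-NULL positive integers.

```python
import sqlite3

def first_gap_from_zero(ids):
    """First missing positive id, treating 0 as an always-present seed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableX (ID INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO TableX (ID) VALUES (?)", [(i,) for i in ids])
    query = """
        SELECT MIN(T1.ID) + 1
        FROM (
            SELECT 0 AS ID            -- seed, so a missing 1 is detected
            UNION ALL
            SELECT ID FROM TableX
        ) AS T1
        WHERE T1.ID + 1 NOT IN (SELECT ID FROM TableX)
    """
    (gap,) = conn.execute(query).fetchone()
    conn.close()
    return gap

print(first_gap_from_zero([3, 4, 5]))
print(first_gap_from_zero([1, 2, 3, 5]))
print(first_gap_from_zero([]))
```

With ids 3,4,5 the result is 1, with 1,2,3,5 it is 4, and an empty table gives 1.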

There isn't really an extremely standard SQL way to do this, but with some form of limiting clause you can do
SELECT `table`.`num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
ORDER BY `table`.`num`
LIMIT 1
(MySQL; PostgreSQL with double quotes instead of backticks)
or
SELECT TOP 1 [table].[num] + 1
FROM [table]
LEFT JOIN [table] AS [alt]
ON [alt].[num] = [table].[num] + 1
WHERE [alt].[num] IS NULL
ORDER BY [table].[num]
(SQL Server)
or
SELECT gap FROM (
SELECT "table"."num" + 1 AS gap
FROM "table"
LEFT JOIN "table" "alt"
ON "alt"."num" = "table"."num" + 1
WHERE "alt"."num" IS NULL
ORDER BY "table"."num"
)
WHERE ROWNUM = 1
(Oracle)

The first thing that came into my head. Not sure if it's a good idea to go this way at all, but should work. Suppose the table is t and the column is c:
SELECT
t1.c + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
ORDER BY gap ASC
LIMIT 1
Edit: This one may be a tick faster (and shorter!):
SELECT
min(t1.c) + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL

This works in SQL Server - can't test it in other systems but it seems standard...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1))
You could also add a starting point to the where clause...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1)) AND ID > 2000
So if you had 2000, 2001, 2002, and 2005 where 2003 and 2004 didn't exist, it would return 2003.

The following solution:
provides test data;
includes an inner query that produces all the other gaps; and
works in SQL Server 2012.
It numbers the ordered rows sequentially in the "with" clause and then reuses the result twice with an inner join on the row number, offset by 1 so as to compare each row with the one after it, looking for IDs with a gap greater than 1. This is more than was asked for, but more widely applicable.
create table #ID ( id integer );
insert into #ID values (1),(2), (4),(5),(6),(7),(8), (12),(13),(14),(15);
with Source as (
select
row_number()over ( order by A.id ) as seq
,A.id as id
from #ID as A WITH(NOLOCK)
)
Select top 1 gap_start from (
Select
(J.id+1) as gap_start
,(K.id-1) as gap_end
from Source as J
inner join Source as K
on (J.seq+1) = K.seq
where (J.id - (K.id-1)) <> 0
) as G
The inner query produces:
gap_start gap_end
3 3
9 11
The outer query produces:
gap_start
3

Inner join to a view or sequence that has all possible values.
No table? Make a table. I always keep a dummy table around just for this.
create table artificial_range(
id int not null primary key auto_increment,
name varchar( 20 ) null ) ;
-- or whatever your database requires for an auto increment column
insert into artificial_range( name ) values ( null )
-- create one row.
insert into artificial_range( name ) select name from artificial_range;
-- you now have two rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have four rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have eight rows
--etc.
insert into artificial_range( name ) select name from artificial_range;
-- you now have 1024 rows, with ids 1-1024
Then,
select a.id from artificial_range a
where not exists ( select * from your_table b
where b.counter = a.id) ;

This one accounts for everything mentioned so far. It includes 0 as a starting point, which it will default to if no values exist as well. I also added the appropriate locations for the other parts of a multi-value key. This has only been tested on SQL Server.
select
MIN(ID)
from (
select
0 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null

For PostgreSQL
An example that makes use of recursive query.
This might be useful if you want to find a gap in a specific range
(it will work even if the table is empty, whereas the other examples will not)
WITH
RECURSIVE a(id) AS (VALUES (1) UNION ALL SELECT id + 1 FROM a WHERE id < 100), -- range 1..100
b AS (SELECT id FROM my_table) -- your table ID list
SELECT a.id -- find numbers from the range that do not exist in main table
FROM a
LEFT JOIN b ON b.id = a.id
WHERE b.id IS NULL
-- LIMIT 1 -- uncomment if only the first value is needed

My guess:
SELECT MIN(p1.field) + 1 as gap
FROM table1 AS p1
INNER JOIN table1 as p3 ON (p1.field = p3.field + 2)
LEFT OUTER JOIN table1 AS p2 ON (p1.field = p2.field + 1)
WHERE p2.field is null;

I wrote up a quick way of doing it. Not sure this is the most efficient, but it gets the job done. Note that it does not tell you the gap itself, but the ids before and after the gap (keep in mind the gap could span multiple values, for example with 1,2,4,7,11).
I'm using SQLite as an example.
If this is your table structure
create table sequential(id int not null, name varchar(10) null);
and these are your rows
id|name
1|one
2|two
4|four
5|five
9|nine
The query is
select a.* from sequential a left join sequential b on a.id = b.id + 1 where b.id is null and a.id <> (select min(id) from sequential)
union
select a.* from sequential a left join sequential b on a.id = b.id - 1 where b.id is null and a.id <> (select max(id) from sequential);
https://gist.github.com/wkimeria/7787ffe84d1c54216f1b320996b17b7e

Here is an alternative that shows the ranges of all possible gap values in a portable and more compact way:
Assume your table looks like this:
> SELECT id FROM your_table;
+-----+
| id |
+-----+
| 90 |
| 103 |
| 104 |
| 118 |
| 119 |
| 120 |
| 121 |
| 161 |
| 162 |
| 163 |
| 185 |
+-----+
To fetch the ranges of all possible gap values, use the query below.
The subquery lists pairs of ids in which the lowerbound column is smaller than the upperbound column, then uses GROUP BY and MIN(m2.id) to keep only the nearest upperbound for each lowerbound.
The outer query then removes the records where lowerbound is exactly upperbound - 1.
The query doesn't (explicitly) output the two records (YOUR_MIN_ID_VALUE, 89) and (186, YOUR_MAX_ID_VALUE) at both ends; implicitly, no number in either of those ranges has been used in your_table so far.
> SELECT m3.lowerbound + 1, m3.upperbound - 1 FROM
(
SELECT m1.id as lowerbound, MIN(m2.id) as upperbound FROM
your_table m1 INNER JOIN your_table
AS m2 ON m1.id < m2.id GROUP BY m1.id
)
m3 WHERE m3.lowerbound < m3.upperbound - 1;
+-------------------+-------------------+
| m3.lowerbound + 1 | m3.upperbound - 1 |
+-------------------+-------------------+
| 91 | 102 |
| 105 | 117 |
| 122 | 160 |
| 164 | 184 |
+-------------------+-------------------+
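To make the range query easy to try, here is a minimal sketch that runs it against SQLite from Python with the ids listed above. The helper name `gap_ranges` and the added `ORDER BY` (for a deterministic result order) are assumptions of this example.

```python
import sqlite3

def gap_ranges(ids):
    """Return (first_missing, last_missing) for every gap between existing ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO your_table (id) VALUES (?)", [(i,) for i in ids])
    query = """
        SELECT m3.lowerbound + 1, m3.upperbound - 1
        FROM (
            -- For each id, the nearest larger id
            SELECT m1.id AS lowerbound, MIN(m2.id) AS upperbound
            FROM your_table m1
            INNER JOIN your_table m2 ON m1.id < m2.id
            GROUP BY m1.id
        ) m3
        WHERE m3.lowerbound < m3.upperbound - 1   -- drop consecutive ids
        ORDER BY 1
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

ids = [90, 103, 104, 118, 119, 120, 121, 161, 162, 163, 185]
print(gap_ranges(ids))
```

The self-join on `m1.id < m2.id` grows quadratically with the table size, so this form is best suited to small tables or indexed id columns.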

select min([ColumnName]) from [TableName]
where [ColumnName]-1 not in (select [ColumnName] from [TableName])
and [ColumnName] <> (select min([ColumnName]) from [TableName])

Here is a standard SQL solution that runs on all database servers with no change:
select min(counter + 1) FIRST_GAP
from my_table a
where not exists (select 'x' from my_table b where b.counter = a.counter + 1)
and a.counter <> (select max(c.counter) from my_table c);
See it in action for:
PL/SQL via Oracle's livesql,
MySQL via sqlfiddle,
PostgreSQL via sqlfiddle,
MS SQL via sqlfiddle.

It works for empty tables or with negatives values as well. Just tested in SQL Server 2012
select min(n) from (
select case when lead(i,1,0) over(order by i)>i+1 then i+1 else null end n from MyTable) w

If you use Firebird 3, this is the most elegant and simple way:
select RowID
from (
select `ID_Column`, Row_Number() over(order by `ID_Column`) as RowID
from `Your_Table`
order by `ID_Column`)
where `ID_Column` <> RowID
rows 1

-- PUT THE TABLE NAME AND COLUMN NAME BELOW
-- IN MY EXAMPLE, THE TABLE NAME IS = SHOW_GAPS AND COLUMN NAME IS = ID
-- PUT THESE TWO VALUES AND EXECUTE THE QUERY
DECLARE @TABLE_NAME VARCHAR(100) = 'SHOW_GAPS'
DECLARE @COLUMN_NAME VARCHAR(100) = 'ID'
DECLARE @SQL VARCHAR(MAX)
SET @SQL =
'SELECT TOP 1
'+@COLUMN_NAME+' + 1
FROM '+@TABLE_NAME+' mo
WHERE NOT EXISTS
(
SELECT NULL
FROM '+@TABLE_NAME+' mi
WHERE mi.'+@COLUMN_NAME+' = mo.'+@COLUMN_NAME+' + 1
)
ORDER BY
'+@COLUMN_NAME
-- SELECT @SQL
DECLARE @MISSING_ID TABLE (ID INT)
INSERT INTO @MISSING_ID
EXEC (@SQL)
--select * from @MISSING_ID
declare @var_for_cursor int
DECLARE @LOW INT
DECLARE @HIGH INT
DECLARE @FINAL_RANGE TABLE (LOWER_MISSING_RANGE INT, HIGHER_MISSING_RANGE INT)
DECLARE IdentityGapCursor CURSOR FOR
select * from @MISSING_ID
ORDER BY 1;
open IdentityGapCursor
fetch next from IdentityGapCursor
into @var_for_cursor
WHILE @@FETCH_STATUS = 0
BEGIN
-- Build a batch that finds the lower and upper bound of the gap around the missing id
SET @SQL = '
DECLARE @LOW INT
SELECT @LOW = MAX('+@COLUMN_NAME+') + 1 FROM '+@TABLE_NAME
+' WHERE '+@COLUMN_NAME+' < ' + cast( @var_for_cursor as VARCHAR(MAX))
SET @SQL = @SQL + '
DECLARE @HIGH INT
SELECT @HIGH = MIN('+@COLUMN_NAME+') - 1 FROM '+@TABLE_NAME
+' WHERE '+@COLUMN_NAME+' > ' + cast( @var_for_cursor as VARCHAR(MAX))
SET @SQL = @SQL + ' SELECT @LOW,@HIGH'
INSERT INTO @FINAL_RANGE
EXEC( @SQL)
fetch next from IdentityGapCursor
into @var_for_cursor
END
CLOSE IdentityGapCursor;
DEALLOCATE IdentityGapCursor;
SELECT ROW_NUMBER() OVER(ORDER BY LOWER_MISSING_RANGE) AS 'Gap Number',* FROM @FINAL_RANGE

I found that most of these approaches run very, very slowly in MySQL. Here is my solution for MySQL < 8.0. Tested on 1M records with a gap near the end, it takes about 1 second to finish. Not sure if it fits other SQL flavours.
SELECT cardNumber - 1
FROM
(SELECT @row_number := 0) as t,
(
SELECT (@row_number:=@row_number+1), cardNumber, cardNumber-@row_number AS diff
FROM cards
ORDER BY cardNumber
) as x
WHERE diff >= 1
LIMIT 0,1
I assume that the sequence starts from `1`.

If your counter is starting from 1 and you want to generate first number of sequence (1) when empty, here is the corrected piece of code from first answer valid for Oracle:
SELECT
NVL(MIN(id + 1),1) AS gap
FROM
mytable mo
WHERE 1=1
AND NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
AND EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = 1
)

DECLARE @Table AS TABLE(
[Value] int
)
INSERT INTO @Table ([Value])
VALUES
(1),(2),(4),(5),(6),(10),(20),(21),(22),(50),(51),(52),(53),(54),(55)
--Gaps
--Start End Size
--3 3 1
--7 9 3
--11 19 9
--23 49 27
SELECT [startTable].[Value]+1 [Start]
,[EndTable].[Value]-1 [End]
,([EndTable].[Value]-1) - ([startTable].[Value]) Size
FROM
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM @Table
)AS startTable
JOIN
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM @Table
)AS EndTable
ON [EndTable].Record = [startTable].Record+1
WHERE [startTable].[Value]+1 <>[EndTable].[Value]

If the numbers in the column are positive integers (starting from 1), here is how to solve it easily (assuming ID is your column name):
SELECT TEMP.NUM
FROM (SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME') AS TEMP
WHERE TEMP.NUM NOT IN (SELECT ID FROM 'TABLE-NAME')
ORDER BY 1 ASC LIMIT 1

SELECT ID+1 FROM table WHERE ID+1 NOT IN (SELECT ID FROM table) ORDER BY 1;
