Tricky SQL. Consolidating rows - sql

I have a (in my oppinion) tricky SQL problem.
I got a table with subscriptions. Each subscription has an ID and a set of attributes which will change over time. When an attribute value changes a new row is created with the subscription key and the new values – but ONLY for the changed attributes. The values for the attributes that weren’t changed are left empty. It looks something like this (I left out the ValidTo and ValidFrom dates that I use to sort the result correctly):
SubID Att1 Att2
1 J
1 L
1 B
1 H
1 A H
I need to transform this table so I can get the following result:
SubID Att1 Att2
1 J
1 J L
1 B L
1 B H
1 A H
So basically; if an attribute is empty then take the previous value for that attribute.
Anything solution goes…. I mean it doesn’t matter what I have to do to get the result: a view on top of the table, an SSIS package to create a new table or something third.

You can do this with a correlated subquery:
select t.subid,
(select t2.att1 from t t2 where t2.rowid <= t.rowid and t2.att1 is not null order by rowid desc limit 1) as att1,
(select t2.att2 from t t2 where t2.rowid <= t.rowid and t2.att2 is not null order by rowid desc limit 1) as att1
from t
This assumes that you have a rowid or equivalent (such as date time created) that specifies the ordering of the rows. It also uses limit to limit the results. In other databases, this might use top instead. (And Oracle uses a slightly more complex expression.)
I would write this using ValidTo. However, because there is ValidTo and ValidFrom, the actual expression is much more complicated. I would need for the question to clarify the rules for using these values with respect to imputing values at other times.

this one works in oracle 11g
select SUBID
,NVL(ATT1,LAG(ATT1) over(order by ValidTo)) ATT1
,NVL(ATT2,lag(ATT2) over(order by ValidTo)) ATT2
from table_name
i agree with Gordon Linoff and Jack Douglas.this code has limitation as when multiple records with nulls are inserted..
but below code will handle that..
select SUBID
,NVL(ATT1,LAG(ATT1 ignore nulls) over(order by VALIDTO)) ATT1
,NVL(ATT2,LAG(ATT2 ignore nulls) over(order by VALIDTO)) ATT2
from Table_name
please see sql fiddle
http://sqlfiddle.com/#!4/3b530/4

Assuming (based on the fact that you mentioned SSIS) you can use OUTER APPLY to get the previous row:
DECLARE #T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT #T VALUES
(1, 'J', '', '20121201'),
(1, '', 'l', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, 'A', 'H', '20121205');
SELECT T.SubID,
Att1 = COALESCE(NULLIF(T.att1, ''), prev.Att1, ''),
Att2 = COALESCE(NULLIF(T.att2, ''), prev.Att2, '')
FROM #T T
OUTER APPLY
( SELECT TOP 1 Att1, Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
ORDER BY ValidFrom DESC
) prev
ORDER BY T.ValidFrom;
(I've had to add random values for ValidFrom to ensure the order by is correct)
EDIT
The above won't work if you have multiple consecutive rows with blank values - e.g.
DECLARE #T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT #T VALUES
(1, 'J', '', '20121201'),
(1, '', 'l', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, '', 'J', '20121205'),
(1, 'A', 'H', '20121206');
If this is likely to happen you will need two OUTER APPLYs:
SELECT T.SubID,
Att1 = COALESCE(NULLIF(T.att1, ''), prevAtt1.Att1, ''),
Att2 = COALESCE(NULLIF(T.att2, ''), prevAtt2.Att2, '')
FROM #T T
OUTER APPLY
( SELECT TOP 1 Att1
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att1 , '') != ''
ORDER BY ValidFrom DESC
) prevAtt1
OUTER APPLY
( SELECT TOP 1 Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att2 , '') != ''
ORDER BY ValidFrom DESC
) prevAtt2
ORDER BY T.ValidFrom;
However, since each OUTER APPLY is only returning one value I would change this to a correlated subquery, since the above will evaluate PrevAtt1.Att1 and `PrevAtt2.Att2' for every row whether required or not. However if you change this to:
SELECT T.SubID,
Att1 = COALESCE(
NULLIF(T.att1, ''),
( SELECT TOP 1 Att1
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att1 , '') != ''
ORDER BY ValidFrom DESC
), ''),
Att2 = COALESCE(
NULLIF(T.att2, ''),
( SELECT TOP 1 Att2
FROM #T prev
WHERE prev.SubID = T.SubID
AND prev.ValidFrom < t.ValidFrom
AND COALESCE(prev.Att2 , '') != ''
ORDER BY ValidFrom DESC
), '')
FROM #T T
ORDER BY T.ValidFrom;
The subquery will only evaluate when required (ie. when Att1 or Att2 is blank) rather than for every row. The execution plan does not show this, and in fact the "Actual Execution Plan" of the latter appears more intensive it almost certainly won't be. But as always, the key is testing, run both on your data and see which performs the best, and check the IO statistics for reads etc.

I never touched SQL Server, but I read that it supports analytical functions just like Oracle.
> select * from MYTABLE order by ValidFrom;
SUBID A A VALIDFROM
---------- - - -------------------
1 J 2012-12-06 15:14:51
2 j 2012-12-06 15:15:20
1 L 2012-12-06 15:15:31
2 l 2012-12-06 15:15:39
1 B 2012-12-06 15:15:48
2 b 2012-12-06 15:15:55
1 H 2012-12-06 15:16:03
2 h 2012-12-06 15:16:09
1 A H 2012-12-06 15:16:20
2 a h 2012-12-06 15:16:29
select
t.SubID
,last_value(t.Att1 ignore nulls)over(partition by t.SubID order by t.ValidFrom rows between unbounded preceding and current row) as Att1
,last_value(t.Att2 ignore nulls)over(partition by t.SubID order by t.ValidFrom rows between unbounded preceding and current row) as Att2
,t.ValidFrom
from MYTABLE t;
SUBID A A VALIDFROM
---------- - - -------------------
1 J 2012-12-06 15:45:33
1 J L 2012-12-06 15:45:41
1 B L 2012-12-06 15:45:49
1 B H 2012-12-06 15:45:58
1 A H 2012-12-06 15:46:06
2 j 2012-12-06 15:45:38
2 j l 2012-12-06 15:45:44
2 b l 2012-12-06 15:45:53
2 b h 2012-12-06 15:46:02
2 a h 2012-12-06 15:46:09

with Tricky1 as (
Select SubID, Att1, Att2, row_number() over(order by ValidFrom) As rownum
From Tricky
)
select T1.SubID, T1.Att1, T2.Att2
from Tricky1 T1
cross join Tricky1 T2
where (ABS(T1.rownum-T2.rownum) = 1 or (T1.rownum = 1 and T2.rownum = 1))
and T1.Att1 is not null
;
Also, have a look at accessing previous value, when SQL has no notion of previous value, here.

I was at it for quite a while now. I found a rather simple way of doing it. Not the best solution as such as i know there must be other way, but here it goes.
I had to consolidates duplicates too and in 2008R2.
So if you can try to create a table which contains one set of duplicates records.
According to your example create one table where 'ATT1' is blank. Then use Update queries with Inner join on 'SubId' to populate the data that you need

Related

SQL Query for sorting in particular order

I have this data:
parent_id
comment_id
comment_level
NULL
0xB2
1
NULL
0xB6
1
NULL
0xBA
1
NULL
0xBE
1
NULL
0xC110
1
NULL
0xC130
1
123
0xC13580
2
456
0xC135AC
3
123
0xC13680
2
I want the result in such a way that rows where comment_level=1 should be in descending order by comment_id and other rows(i.e. where comment_level!=1) should be in ascending order by comment_id but the order of comment level greater than 1 should be inserted according to to order of comment level 1 (what I mean is that rows with comment_level=1 should remain in descending order and then rows with comment_level!=1 should be inserted in increasing order but it should be inserted following rows where comment_id is less than it)
Result should look like this
NULL 0xC130 1
123 0xC13580 2
456 0xC135AC 3
123 0xC13680 2
NULL 0xC110 1
NULL 0xBE 1
NULL 0xBA 1
NULL 0xB6 1
NULL 0xB2 1
Note the bold rows in above sort by comment_id in ascending order, but they come after their "main" row (with comment_level = 1), where these main rows sort DESC by comment_id.
I tried creating 2 tables for different comment level and used sorting for union but it didn't work out because 2 different order by doesn't work maybe I tried from this Using different order by with union but it gave me an error and after all even if this worked it still might not have given me the whole answer.
I think I understand what you're going for, and a UNION will not be able to do it.
To accomplish this, each row needs to match with a specific "parent" row that does have 1 for the comment_level. If the comment_level is already 1, the row is its own parent. Then we can sort first by the comment_id from that parent record DESC, and then sort ascending by the local comment_id within the a given group of matching parent records.
You'll need something like this:
SELECT t0.*
FROM [MyTable] t0
CROSS APPLY (
SELECT TOP 1 comment_id
FROM [MyTable] t1
WHERE t1.comment_level = 1 AND t1.comment_id <= t0.comment_id
ORDER BY t1.comment_id DESC
) parents
ORDER BY parents.comment_id DESC,
case when t0.comment_level = 1 then 0 else 1 end,
t0.comment_id
See it work here:
https://dbfiddle.uk/qZBb3YjO
There's probably also a solution using a windowing function that will be more efficient.
And here it is:
SELECT parent_id, comment_id, comment_level
FROM (
SELECT t0.*, t1.comment_id as t1_comment_id
, row_number() OVER (PARTITION BY t0.comment_id ORDER BY t1.comment_id desc) rn
FROM [MyTable] t0
LEFT JOIN [MyTable] t1 ON t1.comment_level = 1 and t1.comment_id <= t0.comment_id
) d
WHERE rn = 1
ORDER BY t1_comment_id DESC,
case when comment_level = 1 then 0 else 1 end,
comment_id
See it here:
https://dbfiddle.uk/me1vGNdM
Those varbinary values aren't arbitrary, they're hierarchyid values. And just a guess, they're probably typed that way in the schema (it's too much to be a coincidence). We can use this fact to do what we're looking to do.
with d_raw as (
select * from (values
(NULL, 0xC130 , 1),
(123 , 0xC13580, 2),
(456 , 0xC135AC, 3),
(123 , 0xC13680, 2),
(NULL, 0xC110 , 1),
(NULL, 0xBE , 1),
(NULL, 0xBA , 1),
(NULL, 0xB6 , 1),
(NULL, 0xB2 , 1)
) as x(parent_id, comment_id, comment_level)
),
d as (
select parent_id, comment_id = cast(comment_id as hierarchyid), comment_level
from d_raw
)
select *,
comment_id.ToString(),
comment_id.GetAncestor(comment_id.GetLevel() - 1).ToString()
from d
order by comment_id.GetAncestor(comment_id.GetLevel() - 1) desc,
comment_id
Note - the CTEs that I'm using are just to get the data into the right format and the last SELECT adds additional columns just to show what's going on; you can omit them from your query without consequence. I think the only interesting part of this solution is using comment_id.GetAncestor(comment_id.GetLevel() - 1) to get the root-level node.
One way to do this and possibly the only one is create two separate queries and union them together. Something like:
(select * from table where comment_level = 1 order by comment_id desc)
union
(select * from table where not comment_level = 1 order by comment_id asc)

SQL Comma separated values comparisons

I am having a challenge comparing values in Available column with values in Required column. They are both comma separated.
Available
Required
Match
One, Two, Three
One, Three
1
One, Three
Three, Five
0
One, Two, Three
Two
1
What I want to achieve is, if values in the Required column are all found in the Available column then it gives me a match of 1 and 0 if one or more values that are in the Required column is missing in the Available column
I want to achieve this in SQL.
If I understand the question correctly, an approach based on STRING_SPLIT() and an appropriate JOIN is an option:
Sample data:
SELECT *
INTO Data
FROM (VALUES
('One, Two, Three', 'One, Three'),
('One, Three', 'Three, Five'),
('One, Two, Three', 'Two')
) v (Available, Required)
Statement:
SELECT
Available, Required,
CASE
WHEN EXISTS (
SELECT 1
FROM STRING_SPLIT(Required, ',') s1
LEFT JOIN STRING_SPLIT(Available, ',') s2 ON TRIM(s1.[value]) = TRIM(s2.[value])
WHERE s2.[value] IS NULL
) THEN 0
ELSE 1
END AS Match
FROM Data
Result:
Available
Required
Match
One, Two, Three
One, Three
1
One, Three
Three, Five
0
One, Two, Three
Two
1
A variation of Zhorov's solution.
It is using a set based operator EXCEPT.
SQL
-- DDL and sample data population, start
DECLARE #tbl TABLE (Available VARCHAR(100), Required VARCHAR(100));
INSERT INTO #tbl (Available, Required) VALUES
('One, Two, Three', 'One, Three'),
('One, Three', 'Three, Five'),
('One, Two, Three', 'Two');
-- DDL and sample data population, end
SELECT t.*
, [Match] = CASE
WHEN EXISTS (
SELECT TRIM([value]) FROM STRING_SPLIT(Required, ',')
EXCEPT
SELECT TRIM([value]) FROM STRING_SPLIT(Available, ',')
) THEN 0
ELSE 1
END
FROM #tbl AS t;
Output
+-----------------+-------------+-------+
| Available | Required | Match |
+-----------------+-------------+-------+
| One, Two, Three | One, Three | 1 |
| One, Three | Three, Five | 0 |
| One, Two, Three | Two | 1 |
+-----------------+-------------+-------+
You need to do a cross join to look in all available values, your query would be :
SELECT t.*
,case when SUM(CASE
WHEN t1.Available LIKE '%' + t.Required + '%'
THEN 1
ELSE 0
END) > 0 THEN 1 ELSE 0 END AS [Match_Calculated]
FROM YOUR_TABLE t
CROSS JOIN YOUR_TABLE t1
GROUP BY t.Available
,t.Required
,t.Match
Here's a dbfiddle
You can use "STRING_SPLIT" to achieve your request
;with Source as
(
select 1 id,'One,Two,Three' Available,'One,Three' Required
union all
select 2 id,'One,Three' Available,'Three,Five' Required
union all
select 3 id,'One,Two,Three' Available,'Two' Required
)
,AvailableTmp as
(
SELECT t.id,
x.value
FROM Source t
CROSS APPLY (SELECT trim(value) value
FROM string_split(t.Available, ',')) x
)
,RequiredTmp as
(
SELECT t.id,
x.value
FROM Source t
CROSS APPLY (SELECT trim(value) value
FROM string_split(t.Required, ',')) x
)
,AllMatchTmp as
(
select a.id
,1 Match
From RequiredTmp a
left join AvailableTmp b on a.id=b.id and a.value = b.value
group by a.id
having max(case when b.value is null then 1 else 0 end ) = 0
)
select a.id
,a.Available
,a.Required
,ISNULL(b.Match,0) Match
from Source a
left join AllMatchTmp b on a.id = b.id
Another way using STRING_SPLIT
DECLARE #data TABLE (Available VARCHAR(100), [Required] VARCHAR(100),
INDEX IX_data(Available,[Required]));
INSERT #data
VALUES ('One, Two, Three', 'One, Three'),('One, Three', 'Three, Five'),
('One, Two, Three', 'Two');
SELECT
Available = d.Available,
[Required] = d.[Required],
[Match] = MIN(f.X)
FROM #data AS d
CROSS APPLY STRING_SPLIT(REPLACE(d.[Required],' ',''),',') AS split
CROSS APPLY (VALUES(REPLACE(d.[Available],' ',''))) AS cleaned(String)
CROSS APPLY (VALUES(IIF(split.[value] NOT IN
(SELECT s.[value] FROM STRING_SPLIT(cleaned.String,',') AS s),0,1))) AS f(X)
GROUP BY d.Available, d.[Required];

How to get the result in below format in SQL Server?

I have a table called recipes with following data.
page_no title
-----------------
1 pancake
2 pizza
3 pasta
5 cookie
page_no 0 is always blank, and missing page_no are blank, I want output as below, for the blank page NULL values in the result.
left_title right_title
------------------------
NULL pancake
Pizza pasta
NULL cookie
I have tried this SQL statement, but it's not returning the desired output:
SELECT
CASE WHEN id % 2 = 0
THEN title
END AS left_title,
CASE WHEN id %2 != 0
THEN title
END AS right_title
FROM
recipes
You are quite close. You just need aggregation:
select max(case when id % 2 = 0 then title end) as left_title,
max(case when id % 2 = 1 then title end) as right_title
from recipes
group by id / 2
order by min(id);
SQL Server does integer division, so id / 2 is always an integer.
Using CTE.. this should be give you a good CTE overview
DECLARE #table TABLE (
pageno int,
title varchar(30)
)
INSERT INTO #table
VALUES (1, 'pancake')
, (2, 'pizza')
, (3, 'pasta')
, (5, 'cookie')
;
WITH cte_pages
AS ( -- generate page numbers
SELECT
0 n,
MAX(pageno) maxpgno
FROM #table
UNION ALL
SELECT
n + 1 n,
maxpgno
FROM cte_pages
WHERE n <= maxpgno),
cte_left
AS ( --- even
SELECT
n,
ROW_NUMBER() OVER (ORDER BY n) rn
FROM cte_pages
WHERE n % 2 = 0),
cte_right
AS ( --- odd
SELECT
n,
ROW_NUMBER() OVER (ORDER BY n) rn
FROM cte_pages
WHERE n % 2 <> 0)
SELECT
tl.title left_title,
tr.title right_title --- final output
FROM cte_left l
INNER JOIN cte_right r
ON l.rn = r.rn
LEFT OUTER JOIN #table tl
ON tl.pageno = l.n
LEFT OUTER JOIN #table tr
ON tr.pageno = r.n

Sql query solution needed

I have a table in the following format
ID class indicator
1 A Y
1 B N
1 C(recent) N
2 X N
2 K(recent) N
if a particular ID has its recent class indicator as N then search if it was Y somewhere before that recent entry if yes then answer should be that class if not then the recent entry is the answer.
Example: for ID 1 the recent entry was C but we got a Y at A also so we have to return A but in ID 2, K is the recent entry and we don't have any Y indicator before that so we should return Y.
One approach is to sort the entries for each ID according to this logic, assign a number to each with row_number() and take just the first:
SELECT id, class, indicator
FROM (SELECT id, class, indicator,
ROW_NUMBER() OVER (PARTITION BY id
ORDER BY indicator DESC,
CASE WHEN class LIKE '%(recent)' THEN 0
ELSE 1
END ASC) AS rn
FROM mytable) t
WHERE rn = 1
in your comment with Mureinik's answer you mention you want to use a datetime column in stead of recentin the class column.
One way to do this is like below, but this will not perform great if you have really many records
your table would look sometimes like this in that case :
ID Class Indicator IndicatorDate
-- ----- --------- -------------
1 A Y 31/03/2018 12:59:18
1 B N 1/04/2018 12:59:18
1 C N 3/04/2018 12:59:18
2 X N 3/04/2018 12:59:18
2 K N 2/04/2018 12:59:18
and a possible solution would be this
declare #table table (ID int, Class varchar(1), Indicator varchar(1), IndicatorDate datetime)
insert into #Table(ID, Class, Indicator, IndicatorDate)
values (1, 'A', 'Y', getdate() - 3), (1, 'B', 'N', getdate() - 2), (1, 'C', 'N', getdate()), (2, 'X', 'N', getdate()), (2, 'K', 'N', getdate() - 1)
select td.ID,
(select top 1 t2.Class from #table t2 where t2.ID = td.ID order by t2.Indicator desc, t2.IndicatorDate desc) as selectedClass,
(select top 1 t2.Indicator from #table t2 where t2.ID = td.ID order by t2.Indicator desc, t2.IndicatorDate desc) as selectedIndicator
from
( select distinct t.ID
from #table t
) td
which would return this
ID selectedClass selectedIndicator
-- ------------- -----------------
1 A Y
2 X N
HOWEVER:
This depends on if the endresult of this query is really what you are looking for, its still not entirely clear to me

How do I find a "gap" in running counter with SQL?

I'd like to find the first "gap" in a counter column in an SQL table. For example, if there are values 1,2,4 and 5 I'd like to find out 3.
I can of course get the values in order and go through it manually, but I'd like to know if there would be a way to do it in SQL.
In addition, it should be quite standard SQL, working with different DBMSes.
In MySQL and PostgreSQL:
SELECT id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
LIMIT 1
In SQL Server:
SELECT TOP 1
id + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
In Oracle:
SELECT *
FROM (
SELECT id + 1 AS gap
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
ORDER BY
id
)
WHERE rownum = 1
ANSI (works everywhere, least efficient):
SELECT MIN(id) + 1
FROM mytable mo
WHERE NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
Systems supporting sliding window functions:
SELECT -- TOP 1
-- Uncomment above for SQL Server 2012+
previd
FROM (
SELECT id,
LAG(id) OVER (ORDER BY id) previd
FROM mytable
) q
WHERE previd <> id - 1
ORDER BY
id
-- LIMIT 1
-- Uncomment above for PostgreSQL
Your answers all work fine if you have a first value id = 1, otherwise this gap will not be detected. For instance if your table id values are 3,4,5, your queries will return 6.
I did something like this
SELECT MIN(ID+1) FROM (
SELECT 0 AS ID UNION ALL
SELECT
MIN(ID + 1)
FROM
TableX) AS T1
WHERE
ID+1 NOT IN (SELECT ID FROM TableX)
There isn't really an extremely standard SQL way to do this, but with some form of limiting clause you can do
SELECT `table`.`num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
LIMIT 1
(MySQL, PostgreSQL)
or
SELECT TOP 1 `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
(SQL Server)
or
SELECT `num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
AND ROWNUM = 1
(Oracle)
The first thing that came into my head. Not sure if it's a good idea to go this way at all, but should work. Suppose the table is t and the column is c:
SELECT
t1.c + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
ORDER BY gap ASC
LIMIT 1
Edit: This one may be a tick faster (and shorter!):
SELECT
min(t1.c) + 1 AS gap
FROM t as t1
LEFT OUTER JOIN t as t2 ON (t1.c + 1 = t2.c)
WHERE t2.c IS NULL
This works in SQL Server - can't test it in other systems but it seems standard...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1))
You could also add a starting point to the where clause...
SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1)) AND ID > 2000
So if you had 2000, 2001, 2002, and 2005 where 2003 and 2004 didn't exist, it would return 2003.
The following solution:
provides test data;
an inner query that produces other gaps; and
it works in SQL Server 2012.
Numbers the ordered rows sequentially in the "with" clause and then reuses the result twice with an inner join on the row number, but offset by 1 so as to compare the row before with the row after, looking for IDs with a gap greater than 1. More than asked for but more widely applicable.
create table #ID ( id integer );
insert into #ID values (1),(2), (4),(5),(6),(7),(8), (12),(13),(14),(15);
with Source as (
select
row_number()over ( order by A.id ) as seq
,A.id as id
from #ID as A WITH(NOLOCK)
)
Select top 1 gap_start from (
Select
(J.id+1) as gap_start
,(K.id-1) as gap_end
from Source as J
inner join Source as K
on (J.seq+1) = K.seq
where (J.id - (K.id-1)) <> 0
) as G
The inner query produces:
gap_start gap_end
3 3
9 11
The outer query produces:
gap_start
3
Inner join to a view or sequence that has a all possible values.
No table? Make a table. I always keep a dummy table around just for this.
create table artificial_range(
id int not null primary key auto_increment,
name varchar( 20 ) null ) ;
-- or whatever your database requires for an auto increment column
insert into artificial_range( name ) values ( null )
-- create one row.
insert into artificial_range( name ) select name from artificial_range;
-- you now have two rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have four rows
insert into artificial_range( name ) select name from artificial_range;
-- you now have eight rows
--etc.
insert into artificial_range( name ) select name from artificial_range;
-- you now have 1024 rows, with ids 1-1024
Then,
select a.id from artificial_range a
where not exists ( select * from your_table b
where b.counter = a.id) ;
This one accounts for everything mentioned so far. It includes 0 as a starting point, which it will default to if no values exist as well. I also added the appropriate locations for the other parts of a multi-value key. This has only been tested on SQL Server.
select
MIN(ID)
from (
select
0 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
For PostgreSQL
An example that makes use of recursive query.
This might be useful if you want to find a gap in a specific range
(it will work even if the table is empty, whereas the other examples will not)
WITH
RECURSIVE a(id) AS (VALUES (1) UNION ALL SELECT id + 1 FROM a WHERE id < 100), -- range 1..100
b AS (SELECT id FROM my_table) -- your table ID list
SELECT a.id -- find numbers from the range that do not exist in main table
FROM a
LEFT JOIN b ON b.id = a.id
WHERE b.id IS NULL
-- LIMIT 1 -- uncomment if only the first value is needed
My guess:
SELECT MIN(p1.field) + 1 as gap
FROM table1 AS p1
INNER JOIN table1 as p3 ON (p1.field = p3.field + 2)
LEFT OUTER JOIN table1 AS p2 ON (p1.field = p2.field + 1)
WHERE p2.field is null;
I wrote up a quick way of doing it. Not sure this is the most efficient, but gets the job done. Note that it does not tell you the gap, but tells you the id before and after the gap (keep in mind the gap could be multiple values, so for example 1,2,4,7,11 etc)
I'm using sqlite as an example
If this is your table structure
create table sequential(id int not null, name varchar(10) null);
and these are your rows
id|name
1|one
2|two
4|four
5|five
9|nine
The query is
select a.* from sequential a left join sequential b on a.id = b.id + 1 where b.id is null and a.id <> (select min(id) from sequential)
union
select a.* from sequential a left join sequential b on a.id = b.id - 1 where b.id is null and a.id <> (select max(id) from sequential);
https://gist.github.com/wkimeria/7787ffe84d1c54216f1b320996b17b7e
Here is an alternative to show the range of all possible gap values in portable and more compact way :
Assume your table schema looks like this :
> SELECT id FROM your_table;
+-----+
| id |
+-----+
| 90 |
| 103 |
| 104 |
| 118 |
| 119 |
| 120 |
| 121 |
| 161 |
| 162 |
| 163 |
| 185 |
+-----+
To fetch the ranges of all possible gap values, you have the following query :
The subquery lists pairs of ids, each of which has the lowerbound column being smaller than upperbound column, then use GROUP BY and MIN(m2.id) to reduce number of useless records.
The outer query further removes the records where lowerbound is exactly upperbound - 1
My query doesn't (explicitly) output the 2 records (YOUR_MIN_ID_VALUE, 89) and (186, YOUR_MAX_ID_VALUE) at both ends, that implicitly means any number in both of the ranges hasn't been used in your_table so far.
> SELECT m3.lowerbound + 1, m3.upperbound - 1 FROM
(
SELECT m1.id as lowerbound, MIN(m2.id) as upperbound FROM
your_table m1 INNER JOIN your_table
AS m2 ON m1.id < m2.id GROUP BY m1.id
)
m3 WHERE m3.lowerbound < m3.upperbound - 1;
+-------------------+-------------------+
| m3.lowerbound + 1 | m3.upperbound - 1 |
+-------------------+-------------------+
| 91 | 102 |
| 105 | 117 |
| 122 | 160 |
| 164 | 184 |
+-------------------+-------------------+
select min([ColumnName]) from [TableName]
where [ColumnName]-1 not in (select [ColumnName] from [TableName])
and [ColumnName] <> (select min([ColumnName]) from [TableName])
Here is standard a SQL solution that runs on all database servers with no change:
select min(counter + 1) FIRST_GAP
from my_table a
where not exists (select 'x' from my_table b where b.counter = a.counter + 1)
and a.counter <> (select max(c.counter) from my_table c);
See in action for;
PL/SQL via Oracle's livesql,
MySQL via sqlfiddle,
PostgreSQL via sqlfiddle
MS Sql via sqlfiddle
It works for empty tables or with negatives values as well. Just tested in SQL Server 2012
select min(n) from (
select case when lead(i,1,0) over(order by i)>i+1 then i+1 else null end n from MyTable) w
If You use Firebird 3 this is most elegant and simple:
select RowID
from (
select `ID_Column`, Row_Number() over(order by `ID_Column`) as RowID
from `Your_Table`
order by `ID_Column`)
where `ID_Column` <> RowID
rows 1
-- PUT THE TABLE NAME AND COLUMN NAME BELOW
-- IN MY EXAMPLE, THE TABLE NAME IS = SHOW_GAPS AND COLUMN NAME IS = ID
-- PUT THESE TWO VALUES AND EXECUTE THE QUERY
DECLARE #TABLE_NAME VARCHAR(100) = 'SHOW_GAPS'
DECLARE #COLUMN_NAME VARCHAR(100) = 'ID'
DECLARE #SQL VARCHAR(MAX)
SET #SQL =
'SELECT TOP 1
'+#COLUMN_NAME+' + 1
FROM '+#TABLE_NAME+' mo
WHERE NOT EXISTS
(
SELECT NULL
FROM '+#TABLE_NAME+' mi
WHERE mi.'+#COLUMN_NAME+' = mo.'+#COLUMN_NAME+' + 1
)
ORDER BY
'+#COLUMN_NAME
-- SELECT #SQL
DECLARE #MISSING_ID TABLE (ID INT)
INSERT INTO #MISSING_ID
EXEC (#SQL)
--select * from #MISSING_ID
declare #var_for_cursor int
DECLARE #LOW INT
DECLARE #HIGH INT
DECLARE #FINAL_RANGE TABLE (LOWER_MISSING_RANGE INT, HIGHER_MISSING_RANGE INT)
DECLARE IdentityGapCursor CURSOR FOR
select * from #MISSING_ID
ORDER BY 1;
open IdentityGapCursor
fetch next from IdentityGapCursor
into #var_for_cursor
WHILE ##FETCH_STATUS = 0
BEGIN
SET #SQL = '
DECLARE #LOW INT
SELECT #LOW = MAX('+#COLUMN_NAME+') + 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' < ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + '
DECLARE #HIGH INT
SELECT #HIGH = MIN('+#COLUMN_NAME+') - 1 FROM '+#TABLE_NAME
+' WHERE '+#COLUMN_NAME+' > ' + cast( #var_for_cursor as VARCHAR(MAX))
SET #SQL = #sql + 'SELECT #LOW,#HIGH'
INSERT INTO #FINAL_RANGE
EXEC( #SQL)
fetch next from IdentityGapCursor
into #var_for_cursor
END
CLOSE IdentityGapCursor;
DEALLOCATE IdentityGapCursor;
SELECT ROW_NUMBER() OVER(ORDER BY LOWER_MISSING_RANGE) AS 'Gap Number',* FROM #FINAL_RANGE
Found most of approaches run very, very slow in mysql. Here is my solution for mysql < 8.0. Tested on 1M records with a gap near the end ~ 1sec to finish. Not sure if it fits other SQL flavours.
SELECT cardNumber - 1
FROM
(SELECT #row_number := 0) as t,
(
SELECT (#row_number:=#row_number+1), cardNumber, cardNumber-#row_number AS diff
FROM cards
ORDER BY cardNumber
) as x
WHERE diff >= 1
LIMIT 0,1
I assume that sequence starts from `1`.
If your counter is starting from 1 and you want to generate first number of sequence (1) when empty, here is the corrected piece of code from first answer valid for Oracle:
SELECT
NVL(MIN(id + 1),1) AS gap
FROM
mytable mo
WHERE 1=1
AND NOT EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = mo.id + 1
)
AND EXISTS
(
SELECT NULL
FROM mytable mi
WHERE mi.id = 1
)
DECLARE #Table AS TABLE(
[Value] int
)
INSERT INTO #Table ([Value])
VALUES
(1),(2),(4),(5),(6),(10),(20),(21),(22),(50),(51),(52),(53),(54),(55)
--Gaps
--Start End Size
--3 3 1
--7 9 3
--11 19 9
--23 49 27
SELECT [startTable].[Value]+1 [Start]
,[EndTable].[Value]-1 [End]
,([EndTable].[Value]-1) - ([startTable].[Value]) Size
FROM
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS startTable
JOIN
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM #Table
)AS EndTable
ON [EndTable].Record = [startTable].Record+1
WHERE [startTable].[Value]+1 <>[EndTable].[Value]
If the numbers in the column are positive integers (starting from 1) then here is how to solve it easily. (assuming ID is your column name)
SELECT TEMP.ID
FROM (SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME') AS TEMP
WHERE ID NOT IN (SELECT ID FROM 'TABLE-NAME')
ORDER BY 1 ASC LIMIT 1
SELECT ID+1 FROM table WHERE ID+1 NOT IN (SELECT ID FROM table) ORDER BY 1;