SQL Select first missing value in a series - sql

I was asked the following: I have a table (let's call it tbl) with one column of type int (let's call it num) that holds sequential numbers:
num
---
1
2
4
5
6
8
9
11
Now you need to write a query that returns the first missing number (in this example the answer would be 3).
Here's my answer (it works):
select top 1 (num + 1)
from tbl
where (num + 1) not in (select num from tbl)
After writing this, I was asked: what if tbl contained 10 million records? How would you improve performance (because obviously my inner query would cause a full table scan)?
My thoughts were about an index on the num field and using NOT EXISTS, but I would love to hear some alternatives.

In SQL Server 2012+, I would simply use lead():
select top 1 num + 1
from (select num, lead(num) over (order by num) as next_num
from tbl
) t
where next_num <> num + 1
order by num;
However, I suspect that your version would have the best performance if you have an index on tbl(num). The not exists version is worth testing:
select top 1 (num + 1)
from tbl t
where not exists (select 1 from tbl t2 where t2.num = t.num + 1);
The only issue is getting the first number. You are not guaranteed that the table is read "in order", so without an ORDER BY this returns a missing number, but not necessarily the first one. With an index (or better yet, a clustered index) on num, the following should be fast and guaranteed to return the first number:
select top 1 (num + 1)
from tbl t
where not exists (select 1 from tbl t2 where t2.num = t.num + 1)
order by num;
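As a quick sanity check of the NOT EXISTS plus ORDER BY query, here is a minimal sketch that runs it against the sample data. It uses Python's built-in sqlite3 module rather than SQL Server, so TOP 1 becomes LIMIT 1; the function name first_missing is made up for illustration, and nothing here says anything about SQL Server performance.

```python
import sqlite3


def first_missing(nums):
    """Return the first num + 1 that is absent from nums, or None for an empty table."""
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY stands in for the index on num that the answer assumes.
    conn.execute("CREATE TABLE tbl (num INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO tbl (num) VALUES (?)", [(n,) for n in nums])
    row = conn.execute(
        """
        SELECT t.num + 1
        FROM tbl AS t
        WHERE NOT EXISTS (SELECT 1 FROM tbl AS t2 WHERE t2.num = t.num + 1)
        ORDER BY t.num
        LIMIT 1  -- SQLite spelling of TOP 1
        """
    ).fetchone()
    conn.close()
    return row[0] if row else None


print(first_missing([1, 2, 4, 5, 6, 8, 9, 11]))  # 3
```

Note that when there is no gap the query returns the maximum plus one, and it cannot report a missing 1 if the series starts at 2, because it only ever looks at num + 1.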

Here is another approach using ROW_NUMBER:
;WITH CteRN AS(
SELECT *,
RN = num - ROW_NUMBER() OVER(ORDER BY num)
FROM tbl
)
SELECT TOP 1 num - RN
FROM CteRN
WHERE RN > 0
ORDER BY num ASC
With a proper index on num, here are the stats from a million-row test harness:
Original (NOT IN): CPU time = 296 ms, elapsed time = 289 ms
wewesthemenace: CPU time = 0 ms, elapsed time = 0 ms
notulysses (NOT EXISTS): CPU time = 687 ms, elapsed time = 679 ms
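To see why the ROW_NUMBER trick works, it may help to replay the arithmetic outside the database. The sketch below is plain Python, not the author's query: enumerate plays the role of ROW_NUMBER() OVER (ORDER BY num), and the function name is hypothetical. It assumes the values are distinct and that the series is meant to start at 1.

```python
def first_missing_by_row_number(nums):
    """Mimic RN = num - ROW_NUMBER(): the offset stays 0 until the first gap."""
    for row_number, num in enumerate(sorted(nums), start=1):
        offset = num - row_number  # the RN column of the CTE
        if offset > 0:
            # num - RN is the row number itself, i.e. the first value skipped
            return num - offset
    return None  # no gap found


print(first_missing_by_row_number([1, 2, 4, 5, 6, 8, 9, 11]))  # 3
```

Unlike the num + 1 queries, this variant does detect a missing 1, and it returns nothing when the series has no gap.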

All of the above answers are correct. Alternatively, we can create a table, similar to a date dimension, that holds every sequential number from 1 to n:
insert into sequenceNumber
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10 union all
select 11 union all
select 12 union all
select 13
select * from sequenceNumber
RESULT:
1
2
3
4
5
6
7
8
9
10
11
12
13
Now we can use the query below to find the missing numbers:
select a.num,b.num
from sequenceNumber a
left outer join tbl b on a.num = b.num
where b.num is null
Hope it will be useful.

You can try with not exists:
select top 1 t.num + 1
from tbl t
where not exists (select * from tbl where num = t.num + 1) order by t.num
Or use a row_number:
select top 1 t.r
from (
select num
, row_number() over (order by num) as r
from tbl) t
where t.r <> t.num
order by t.num

Here is one more example from my side:
-- test data
DECLARE @Sequence AS TABLE ( Num INT )
INSERT INTO @Sequence
( Num )
VALUES ( 1 ),
( 2 ),
( 4 ),
( 5 ),
( 6 ),
( 8 ),
( 9 ),
( 11 )
--Final query
SELECT TOP 1
S.RN AS [Missing]
FROM ( SELECT RN = ROW_NUMBER() OVER ( ORDER BY num )
FROM @Sequence
) AS S
LEFT JOIN @Sequence AS S2 ON S.RN = S2.Num
WHERE S2.Num IS NULL
ORDER BY S.RN

Related

SQL Server - Sum of difference between rows in a table

I have a table in the format :
SomeID SomeData
1 3
2 7
3 9
4 10
5 14
6 16
. .
. .
I want to find the sum of the differences between pairs of rows in this table, i.e. ( (7-3) + (10-9) + (16-14) + ....)
What is the best way to do this?
Using a self join along with the modulus:
SELECT SUM(t1.SomeData - t2.SomeData) AS total_diff
FROM yourTable t1
INNER JOIN yourTable t2
ON t1.SomeID = t2.SomeID + 1
WHERE t1.SomeID % 2 = 0;
This answer assumes that the SomeID sequence in fact starts with 1 and increments by 1 with each subsequent row. If not, then we might be able to first apply ROW_NUMBER over SomeID and generate a 1 to N sequence.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY SomeID) rn
FROM yourTable
)
SELECT SUM(t1.SomeData - t2.SomeData) AS total_diff
FROM cte t1
INNER JOIN cte t2
ON t1.rn = t2.rn + 1
WHERE t1.rn % 2 = 0;
You can use the ROW_NUMBER window function to generate a serial number, take it MOD 2 to split the rows into two groups, and then use conditional aggregation.
Query 1:
SELECT SUM(CASE WHEN rn = 0 THEN SomeData END) - SUM(CASE WHEN rn = 1 THEN SomeData END)
FROM (
SELECT SomeData,ROW_NUMBER() over(order by SomeID) % 2 rn
FROM t t1
) t1
Results:
| |
|---|
| 7 |
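For cross-checking the expected total outside the database, here is a small sketch in plain Python. It is not part of either answer; the function name is made up, and it assumes the values are already listed in SomeID order.

```python
def sum_of_pair_differences(values):
    """Sum (2nd - 1st) + (4th - 3rd) + ... over values given in SomeID order."""
    ordered = list(values)
    # Step through the odd-positioned rows and pair each with its successor;
    # a trailing unpaired row is ignored, as the INNER JOIN would ignore it.
    return sum(ordered[i + 1] - ordered[i] for i in range(0, len(ordered) - 1, 2))


print(sum_of_pair_differences([3, 7, 9, 10, 14, 16]))  # 7
```

One difference worth knowing: with an odd number of rows the self join simply drops the unpaired last row, while the conditional aggregation still subtracts it, so the two queries only agree when the row count is even.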

How to arrange continuous serial number in to two or multiple column sequentially in sql server?

I want to display the numbers 1 to 10 (or up to any maximum) in a two-column format using a SQL Server query,
as in the attached screenshot.
Any suggestions are welcome.
Using a couple of inline tallies would be way faster than a WHILE. This version will go up to 1000 integers (500 rows):
DECLARE @Start int = 1,
@End int = 99;
SELECT TOP(CONVERT(int,CEILING(((@End*1.) - @Start + 1)/2)))
(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1)*2 + @Start AS Number1,
CASE WHEN (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1)*2 + @Start +1 <= @End THEN (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1)*2 + @Start +1 END AS Number2
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N1(N)
CROSS APPLY (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N2(N)
CROSS APPLY (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N3(N);
An alternative way that looks less messy with the CASE and TOP would be to use a couple of CTEs:
WITH Tally AS(
SELECT TOP(@End - @Start + 1)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 + @Start AS I
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N1(N)
CROSS APPLY (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N2(N)
CROSS APPLY (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N3(N)),
Numbers AS(
SELECT I AS Number1,
LEAD(I) OVER (ORDER BY I) AS Number2
FROM Tally)
SELECT Number1,
Number2
FROM Numbers
WHERE Number1 % 2 = @Start % 2;
I like to use recursive queries for this:
with cte (num1, num2) as (
select 1, 2
union all
select num1 + 2, num2 + 2 from cte where num2 < 10
)
select * from cte order by num1
You control the maximum number with the inequality condition in the recursive member of the cte.
If you need to generate more than 100 rows, you need to add option(maxrecursion 0) at the very end of the query.
Assuming you are starting with a table with one column, you can use:
select min(number), max(number)
from sample_data
group by floor( (number - 1) / 2);
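A minimal sketch of the GROUP BY pairing, run through Python's built-in sqlite3 module instead of SQL Server. The table and column names follow the query above; the function name pair_up is hypothetical, and the sketch relies on integer division in place of floor(), which gives the same groups for positive integers.

```python
import sqlite3


def pair_up(max_number):
    """Pair 1..max_number into (odd, even) rows via GROUP BY (number - 1) / 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample_data (number INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO sample_data (number) VALUES (?)",
        [(n,) for n in range(1, max_number + 1)],
    )
    rows = conn.execute(
        """
        SELECT MIN(number), MAX(number)
        FROM sample_data
        GROUP BY (number - 1) / 2   -- integer division: 1,2 -> 0; 3,4 -> 1; ...
        ORDER BY MIN(number)
        """
    ).fetchall()
    conn.close()
    return rows


print(pair_up(10))  # [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
```

With an odd maximum the last group holds a single value, so MIN and MAX both return it and the second column repeats the number instead of showing NULL.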
Alternatively, set-based solution using window functions:
use tempdb
;with sample_data as (
select 1 as val union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10
)
, sample_data_split as
(
select
val
, 2- row_number() over (order by val) % 2 as columnid
, NTILE((select count(*) / 2 from sample_data) ) over (order by val) groupid
from sample_data
)
the intermediate result of sample_data_split is:
val columnid groupid
1 1 1
2 2 1
3 1 2
4 2 2
5 1 3
6 2 3
7 1 4
8 2 4
9 1 5
10 2 5
and then to get the resultset into a desired format:
select
min(case when columnid = 1 then val end) as column1
, min(case when columnid = 2 then val end) as column2
from sample_data_split
group by groupid
column1 column2
1 2
3 4
5 6
7 8
9 10
Those CTEs can be merged into a single SELECT:
select
min(case when columnid = 1 then val end) as column1
, min(case when columnid = 2 then val end) as column2
from
(
select
val
, 2- row_number() over (order by val) % 2 as columnid
, NTILE((select count(*) / 2 from sample_data) ) over (order by val) groupid
from sample_data
) d
group by groupid
The advantage of this approach is that it scales well and has no upper bound on the number of rows it can process.
So I arrived at the solution below:
declare @t table
(
id int identity(1,1),
Number_1 int,
Number_2 int
)
declare @min int=1
declare @max int=10
declare @a int=0;
declare @id int=0
while(@min<=@max)
begin
if(@a=0)
begin
insert into @t
select @min,null
set @a=1
end
else if(@a=1)
begin
select top 1 @id=id from @t order by id desc
update @t set Number_2=@min where id=@id
set @a=0
end
set @min=@min+1
end
select Number_1,Number_2 from @t

How to divide one column multi row into different column in oracle?

I want to divide following column into two different column
Table x
ID
1
2
.
.
10
--Output Should be like this
A B
-- --
1 6
2 7
3 8
4 9
5 10
I tried this, but it doesn't work:
SELECT (SELECT * FROM x WHERE id BETWEEN 1 AND 5),
(SELECT * FROM x WHERE id BETWEEN 6 AND 10)
FROM dual;
I also tried SUBSTR, which doesn't work either.
As you did not mention exactly what you want to achieve, I would write a simple query:
select case when id <= 5 then id end as col1,
case when id > 5 and id <= 10 then id end as col2
from table_name
You can achieve it with the following, assuming the total number of ids is a multiple of 2:
With cte as
(Select id, max(id) over() / 2 as mx from your_table)
Select t1.id as a, t2.id as b
From cte t1 join cte t2
On mod(t1.id, mx) = mod(t2.id, mx)
And t1.id <= mx and t2.id > mx
Order by t1.id
Cheers!!
I think you could also use the LEAD function to accomplish your task. Here is an example:
WITH x (val) AS
(
SELECT ROWNUM
FROM dual
CONNECT BY ROWNUM < 12
)
SELECT *
FROM (SELECT val AS A,
LEAD(val, (SELECT CEIL(COUNT(*)/2) FROM x)) OVER (ORDER BY val) AS B
FROM x
ORDER BY val)
WHERE ROWNUM <= (SELECT CEIL(COUNT(*)/2) FROM x);
You would just need to replace the WITH clause and its reference in the query with your actual table.
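To make the LEAD offset easier to follow, here is the same idea replayed in plain Python. It is only an illustration of the logic, not Oracle code: the function name is made up, the offset CEIL(COUNT(*)/2) becomes math.ceil, and a missing partner is shown as None where Oracle would show NULL.

```python
import math


def split_two_columns(values):
    """Pair each value in the first half with the value half a table further on."""
    ordered = sorted(values)
    half = math.ceil(len(ordered) / 2)  # same offset as CEIL(COUNT(*)/2)
    rows = []
    for i in range(half):  # mirrors WHERE ROWNUM <= half
        partner = ordered[i + half] if i + half < len(ordered) else None
        rows.append((ordered[i], partner))
    return rows


print(split_two_columns(range(1, 11)))
# [(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]
```

With 11 rows, as generated by CONNECT BY ROWNUM < 12, column A runs from 1 to 6 and the last row has no partner in column B.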
You can use row_number() and aggregation:
select max(case when id <= 5 then id end) as col1,
max(case when id > 5 then id end) as col2
from (select x.*,
row_number() over (partition by case when id <= 5 then 1 else 2 end order by id) as seqnum
from x
) x
group by seqnum;

SELECT records until new value SQL

I have a table
Val | Number
08 | 1
09 | 1
10 | 1
11 | 3
12 | 0
13 | 1
14 | 1
15 | 1
I need to return the last values where Number = 1 (however many that may be) until Number changes, but do not need the first instances where Number = 1. Essentially I need to select back until Number changes to 0 (15, 14, 13)
Is there a proper way to do this in MSSQL?
Based on following:
I need to return the last values where Number = 1
Essentially I need to select back until Number changes to 0 (15, 14,
13)
Try:
select val, number
from T
where val > (select max(val)
from T
where number<>1)
EDIT: to address all possible combinations:
;with cte1 as
(
select 1 id, max(val) maxOne
from T
where number=1
),
cte2 as
(
select 1 id, isnull(max(val),0) maxOther
from T
where val < (select maxOne from cte1) and number<>1
)
select val, number
from T cross join
(select maxOne, maxOther
from cte1 join cte2 on cte1.id = cte2.id
) X
where val>maxOther and val<=maxOne
I think you can use window functions, something like this:
with cte as (
-- generate two row_number to enumerate distinct groups
select
Val, Number,
row_number() over(partition by Number order by Val) as rn1,
row_number() over(order by Val) as rn2
from Table1
), cte2 as (
-- get groups with Number = 1 and last group
select
Val, Number,
rn2 - rn1 as rn1, max(rn2 - rn1) over() as rn2
from cte
where Number = 1
)
select Val, Number
from cte2
where rn1 = rn2
DEMO: http://sqlfiddle.com/#!3/e7d54/23
DDL
create table T(val int identity(8,1), number int)
insert into T values
(1),(1),(1),(3),(0),(1),(1),(1),(0),(2)
DML
; WITH last_1 AS (
SELECT Max(val) As val
FROM t
WHERE number = 1
)
, last_non_1 AS (
SELECT Coalesce(Max(val), -937) As val
FROM t
WHERE EXISTS (
SELECT val
FROM last_1
WHERE last_1.val > t.val
)
AND number <> 1
)
SELECT t.val
, t.number
FROM t
CROSS
JOIN last_1
CROSS
JOIN last_non_1
WHERE t.val <= last_1.val
AND t.val > last_non_1.val
I know it's a little verbose, but I've deliberately kept it that way to illustrate the methodology:
1. Find the highest val where number = 1.
2. Among all rows whose val is less than the value found in step 1, find the largest val where number <> 1.
3. Finally, return the rows that fall between the values found in steps 1 and 2.
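The three steps can be sketched in plain Python to check the logic against the sample DDL. This is an illustration only: the function name last_run is hypothetical, rows are (val, number) tuples, and None is used where the query uses the sentinel -937 for "no earlier non-1 row".

```python
def last_run(rows, target=1):
    """Return the final consecutive run of rows whose number equals target."""
    # Step 1: highest val where number = target.
    target_vals = [val for val, number in rows if number == target]
    if not target_vals:
        return []
    last_target = max(target_vals)
    # Step 2: largest val below that where number <> target (None if there is none).
    others = [val for val, number in rows if number != target and val < last_target]
    last_other = max(others) if others else None
    # Step 3: rows strictly after step 2's val, up to and including step 1's val.
    return [
        (val, number)
        for val, number in rows
        if val <= last_target and (last_other is None or val > last_other)
    ]


# Same data as the DDL: val is an identity starting at 8.
sample = list(zip(range(8, 18), [1, 1, 1, 3, 0, 1, 1, 1, 0, 2]))
print(last_run(sample))  # [(13, 1), (14, 1), (15, 1)]
```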
select val, count (number) from
yourtable
group by val
having count(number) > 1
The having clause is the key here, giving you all the vals that have more than one value of 1.
This is a common approach for getting rows until some value changes. For your specific case use desc in proper spots.
Create sample table
select * into #tmp from
(select 1 as id, 'Alpha' as value union all
select 2 as id, 'Alpha' as value union all
select 3 as id, 'Alpha' as value union all
select 4 as id, 'Beta' as value union all
select 5 as id, 'Alpha' as value union all
select 6 as id, 'Gamma' as value union all
select 7 as id, 'Alpha' as value) t
Pull top rows until value changes:
with cte as (select * from #tmp t)
select * from
(select cte.*, ROW_NUMBER() over (order by id) rn from cte) OriginTable
inner join
(
select cte.*, ROW_NUMBER() over (order by id) rn from cte
where cte.value = (select top 1 cte.value from cte order by cte.id)
) OnlyFirstValueRecords
on OriginTable.rn = OnlyFirstValueRecords.rn and OriginTable.id = OnlyFirstValueRecords.id
On the left side we put the original table. On the right side we put only the rows whose value equals the value in the first row.
Records on both sides are the same until the target value changes. After row #3 the row numbers are associated with different IDs because of the offset, so those rows never join with the original table:
LEFT RIGHT
ID Value RN ID Value RN
1 Alpha 1 | 1 Alpha 1
2 Alpha 2 | 2 Alpha 2
3 Alpha 3 | 3 Alpha 3
----------------------- result set ends here
4 Beta 4 | 5 Alpha 4
5 Alpha 5 | 7 Alpha 5
6 Gamma 6 |
7 Alpha 7 |
The ID must be unique. Ordering by this ID must be same in both ROW_NUMBER() functions.

SQL - categorize rows

Below is the result set I am working with. What I would like is an additional column that identifies a X number of rows as the same. In my result set, rows 1-4 are the same (would like to mark as 1), rows 5-9 are the same (mark as 2); row 10 (mark as 3)
How is this possible using just SQL? I can't seem to do this using rank or dense_rank functions.
ranking diff bool
-------------------- ----------- -----------
1 0 0
2 0 0
3 0 0
4 0 0
5 54 1
6 0 0
7 0 0
8 0 0
9 0 0
10 62 1
In the general case you can do something like this:
select
t.ranking, t.[diff], t.[bool],
dense_rank() over(order by c.cnt) as rnk
from Table1 as t
outer apply (
select count(*) as cnt
from Table1 as t2
where t2.ranking <= t.ranking and t2.[bool] = 1
) as c
In your case you can do it even without dense_rank():
select
t.ranking, t.[diff], t.[bool],
c.cnt + 1 as rnk
from Table1 as t
outer apply (
select count(*) as cnt
from Table1 as t2
where t2.ranking <= t.ranking and t2.[bool] = 1
) as c;
Unfortunately, in SQL Server 2008 you cannot compute a running total with a window function; in SQL Server 2012 it would be possible with sum([bool]) over(order by ranking).
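The running-total idea is easy to check outside the database. Below is a minimal sketch in plain Python, where itertools.accumulate stands in for the running sum over ranking; the function name categorize is made up, and the input is assumed to be the bool column listed in ranking order.

```python
from itertools import accumulate


def categorize(bools):
    """Group number = 1 + running total of the bool flag, in ranking order."""
    return [1 + running_total for running_total in accumulate(bools)]


print(categorize([0, 0, 0, 0, 1, 0, 0, 0, 0, 1]))
# [1, 1, 1, 1, 2, 2, 2, 2, 2, 3]
```

Each row with bool = 1 starts a new group, which is exactly what the count(*) in the outer apply computes row by row.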
If you have a really large number of rows and your ranking column is unique (or the primary key), you can use a recursive CTE approach, which is the fastest one in SQL Server 2008 R2:
;with cte as
(
select t.ranking, t.[diff], t.[bool], t.[bool] as rnk
from Table1 as t
where t.ranking = 1
union all
select t.ranking, t.[diff], t.[bool], t.[bool] + c.rnk as rnk
from cte as c
inner join Table1 as t on t.ranking = c.ranking + 1
)
select t.ranking, t.[diff], t.[bool], 1 + t.rnk
from cte as t
option (maxrecursion 0)