How to find missing numbers between 2 columns? - sql

I am looking for a way to find missing numbers within a range. I have a beginning number column and an ending number column in the same table.
I am trying to get the skipped numbers. I can get the next skipped number, but don't know how to get a list of the numbers that were not in the range. I have a numbers table if that would be useful.
Here is my example:
doc_num_begin doc_num_end
------------- -----------
20000007 20000008
20000011 20000015
20000016 20000017
I'd like to get 20000009 and 20000010. I have searched but have not been able to find out how to do this using beginning and ending columns.
Thanks

If you have a numbers table, then this is pretty easy:
select n.number
from Numbers n left outer join
RangeTable rt
on n.number between rt.doc_num_begin and rt.doc_num_end
where rt.doc_num_begin is null
This is doing a left outer join from the numbers to the range table, and then keeping the ones that don't match.
Although pretty easy to express, the performance will probably be rather poor due to the non-equijoin. You may also want to put in conditions on the numbers table, so you don't start at 0, 1, . . ., when the ranges start at 20000007. You would do this as:
select n.number
from Numbers n join
(select MIN(doc_num_begin) as MinVal, MAX(doc_num_end) as MaxVal from RangeTable) const
on n.number between const.MinVal and const.MaxVal left outer join
RangeTable rt
on n.number between rt.doc_num_begin and rt.doc_num_end
where rt.doc_num_begin is null
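As a rough way to try the anti-join above, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite and the helper name `missing_numbers` are assumptions for illustration (the question does not name a DBMS), and the numbers table is filled only for the span between the lowest begin and the highest end.

```python
import sqlite3

def missing_numbers(ranges):
    """Return the numbers between the lowest begin and highest end that no range covers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE RangeTable (doc_num_begin INTEGER, doc_num_end INTEGER)")
    conn.executemany("INSERT INTO RangeTable VALUES (?, ?)", ranges)
    # Hypothetical numbers table, populated only for the span we care about.
    conn.execute("CREATE TABLE Numbers (number INTEGER PRIMARY KEY)")
    lo = min(b for b, _ in ranges)
    hi = max(e for _, e in ranges)
    conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(lo, hi + 1)])
    # Left join every number to the ranges and keep the ones with no match.
    rows = conn.execute("""
        SELECT n.number
        FROM Numbers n
        LEFT OUTER JOIN RangeTable rt
          ON n.number BETWEEN rt.doc_num_begin AND rt.doc_num_end
        WHERE rt.doc_num_begin IS NULL
        ORDER BY n.number
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

missing = missing_numbers([(20000007, 20000008), (20000011, 20000015), (20000016, 20000017)])
print(missing)
```

With the sample data from the question this should list the two skipped document numbers.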

If you just have to find the missing ranges, you could use this query:
SELECT
t1.doc_num_end + 1 as start_missing_range,
MIN(t2.doc_num_begin) - 1 as end_missing_range
FROM
your_table t1 INNER JOIN your_table t2
ON t1.doc_num_end < t2.doc_num_begin
GROUP BY
t1.doc_num_end
HAVING
MIN(t2.doc_num_begin) - t1.doc_num_end > 1
EDIT: And this query could be used to expand a range:
SELECT num+start_missing_range
FROM
(select 0 as num
union all select 1 as num
union all select 2 as num
union all select 3 as num
union all select 4 as num
union all select 5 as num
union all select 6 as num
union all select 7 as num
union all select 8 as num
union all select 9 as num) numbers inner join
(SELECT
t1.doc_num_end + 1 as start_missing_range,
MIN(t2.doc_num_begin) - 1 as end_missing_range
FROM
your_table t1 INNER JOIN your_table t2
ON t1.doc_num_end < t2.doc_num_begin
GROUP BY
t1.doc_num_end
HAVING
MIN(t2.doc_num_begin) - t1.doc_num_end > 1) rg
on end_missing_range-start_missing_range>=numbers.num
(This works only if a missing range contains at most 10 numbers. It can easily be extended to more; there will always be a limit, but at least you don't need a table with all of the numbers.)
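The missing-range query can be tried out with a small sketch like the one below. It assumes SQLite through Python's sqlite3 module, and the helpers `missing_ranges` and `expand` are hypothetical names; expanding the ranges in Python instead of SQL is a swapped-in shortcut that avoids the 10-number limit, not part of the answer itself.

```python
import sqlite3

def missing_ranges(ranges):
    """Return (start, end) pairs for the gaps between consecutive ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (doc_num_begin INTEGER, doc_num_end INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", ranges)
    # For each range end, find the nearest later range start and keep real gaps only.
    rows = conn.execute("""
        SELECT t1.doc_num_end + 1        AS start_missing_range,
               MIN(t2.doc_num_begin) - 1 AS end_missing_range
        FROM your_table t1
        INNER JOIN your_table t2 ON t1.doc_num_end < t2.doc_num_begin
        GROUP BY t1.doc_num_end
        HAVING MIN(t2.doc_num_begin) - t1.doc_num_end > 1
        ORDER BY start_missing_range
    """).fetchall()
    conn.close()
    return rows

def expand(gaps):
    """Turn (start, end) gap pairs into the individual missing numbers."""
    return [n for start, end in gaps for n in range(start, end + 1)]

gaps = missing_ranges([(20000007, 20000008), (20000011, 20000015), (20000016, 20000017)])
print(gaps, expand(gaps))
```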

You can use a known sequential ID column from any table or sample database as a source of numbers, and filter on that ID.
Cross joining the table with itself extends the range of numbers you can generate.
SELECT i from (select (w2.WorkOrderID-1)+(w1.WorkOrderID-1)*10000 as i
from AdventureWorks.Production.WorkOrder w1
cross join AdventureWorks.Production.WorkOrder w2
where w1.WorkOrderID<=10000 and w2.WorkOrderID<=10000) as MyNumbers
WHERE i BETWEEN @StartRange and @EndRange
and not exists (SELECT 1 FROM MyTable
WHERE i BETWEEN doc_num_begin AND doc_num_end)

Related

Count Number SQL where not exist

Is there a way to get the numbers that do not exist in the table?
Example
product_number
1
2
3
5
I want only the number 4 as result, because it's a free product number.
The problem is that CONNECT BY ROWNUM doesn't work because it runs out of memory.
You can use lead():
select coalesce(min(product_number), 0) + 1
from (select t.*, lead(product_number) over (order by product_number) as next_pn
from t
) t
where next_pn <> product_number + 1;
Oracle will use an index on (product_number) if one is available.
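To see what the lead() query returns, here is a minimal sketch that runs the same idea on SQLite via Python's sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions) rather than Oracle, and `first_free_number` is a hypothetical helper name.

```python
import sqlite3

def first_free_number(product_numbers):
    """Return the number just after the first row whose successor is not the next integer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (product_number INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO t VALUES (?)", [(n,) for n in product_numbers])
    # LEAD() pairs each number with the next one; a mismatch marks the start of a gap.
    row = conn.execute("""
        SELECT COALESCE(MIN(product_number), 0) + 1
        FROM (SELECT product_number,
                     LEAD(product_number) OVER (ORDER BY product_number) AS next_pn
              FROM t) AS g
        WHERE next_pn <> product_number + 1
    """).fetchone()
    conn.close()
    return row[0]

free = first_free_number([1, 2, 3, 5])
print(free)
```

Note that when there is no gap at all, the COALESCE falls back to 0 and the query returns 1, so that case may need separate handling.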
Something like this?
Select r
From (Select Rownum r -- Generate all numbers from 1 to max
From dual
Connect By Rownum <= (select max(product_number) from products))
where r not in
(
select product_number from products
)
Since you didn't provide a sample of real product numbers but say that CONNECT BY runs out of memory, you presumably have very large product numbers.
So the numbers that need checking should be restricted to the range that can actually contain gaps, that is, between the min and max product numbers. Once that range is known, we can generate each number in it and use the index on product number to check whether that specific number exists or, in this case, doesn't exist. So:
with lh as
(select min(product_number) l
, max(product_number) h
from products
)
, range (pn) as
(select product_number pn
from products
where product_number = (select l from lh)
union all
select pn + 1
from range
where pn + 1 <= (select h from lh)
)
select pn available_product_number
from range
where not exists
( select null
from products
where pn = product_number
)
order by pn;
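The same min-to-max recursion can be sketched on SQLite through Python's sqlite3 module. SQLite is an assumption here (the answer targets Oracle), the CTE is named `rng` instead of `range`, and `available_product_numbers` is a hypothetical helper.

```python
import sqlite3

def available_product_numbers(product_numbers):
    """Return every unused number between the smallest and largest product number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (product_number INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO products VALUES (?)", [(n,) for n in product_numbers])
    # lh holds the bounds; rng walks from low to high one step at a time.
    rows = conn.execute("""
        WITH RECURSIVE
          lh(l, h) AS (SELECT MIN(product_number), MAX(product_number) FROM products),
          rng(pn) AS (
            SELECT l FROM lh
            UNION ALL
            SELECT pn + 1 FROM rng WHERE pn + 1 <= (SELECT h FROM lh)
          )
        SELECT pn
        FROM rng
        WHERE NOT EXISTS (SELECT 1 FROM products WHERE product_number = pn)
        ORDER BY pn
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

available = available_product_numbers([1, 2, 3, 5, 8])
print(available)
```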

How to SELECT top N rows that sum to a certain amount?

Suppose:
MyTable
--
Amount
1
2
3
4
5
MyTable only has one column, Amount, with 5 rows. They are not necessarily in increasing order.
How can I create a function, which takes a @SUM INT, and returns the TOP N rows that sum to this amount?
So for input 6, I want
Amount
1
2
3
Since 1 + 2 + 3 = 6. 2 + 4 / 1 + 5 won't work since I want TOP N ROWS
For 7/8/9/10, I want
Amount
1
2
3
4
I'm using MS SQL Server 2008 R2, if this matters.
Saying "top N rows" is indeed ambiguous when it comes to relational databases.
I assume that you want to order by "amount" ascending.
I would add a second column (to a table or view) like "sum_up_to_here", and create something like that:
create view mytable_view as
select
mt1.amount,
coalesce(sum(mt2.amount), 0) as sum_up_to_here
from
mytable mt1
left join mytable mt2 on (mt2.amount < mt1.amount)
group by mt1.amount
or:
create view mytable_view as
select
mt1.amount,
coalesce((select sum(amount) from mytable where amount < mt1.amount), 0) as sum_up_to_here
from mytable mt1
and then I would select the final rows:
select amount from mytable_view where sum_up_to_here < (some value)
If you don't care about performance you may of course run it in one query:
select amount from
(
select
mt1.amount,
coalesce(sum(mt2.amount), 0) as sum_up_to_here
from
mytable mt1
left join mytable mt2 on (mt2.amount < mt1.amount)
group by mt1.amount
) t where sum_up_to_here < 20
One approach:
select t1.amount
from MyTable t1
left join MyTable t2 on t1.amount > t2.amount
group by t1.amount
having coalesce(sum(t2.amount),0) < 7
SQLFiddle here.
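The self-join approach can be checked against the inputs from the question with a short sketch. It assumes SQLite through Python's sqlite3 module and distinct amounts, and `top_rows_summing_to` is a hypothetical helper name.

```python
import sqlite3

def top_rows_summing_to(amounts, target):
    """Return amounts, smallest first, while the sum of all smaller amounts is below target."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (amount INTEGER)")
    conn.executemany("INSERT INTO MyTable VALUES (?)", [(a,) for a in amounts])
    # t2 supplies every smaller amount; COALESCE covers the smallest row, which has none.
    rows = conn.execute("""
        SELECT t1.amount
        FROM MyTable t1
        LEFT JOIN MyTable t2 ON t1.amount > t2.amount
        GROUP BY t1.amount
        HAVING COALESCE(SUM(t2.amount), 0) < ?
        ORDER BY t1.amount
    """, (target,)).fetchall()
    conn.close()
    return [r[0] for r in rows]

for_six = top_rows_summing_to([3, 1, 5, 2, 4], 6)
for_seven = top_rows_summing_to([3, 1, 5, 2, 4], 7)
print(for_six, for_seven)
```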
In SQL Server you can use CTEs to make it pretty simple to read.
Here is a CTE I did to sum up totals used in sequence. The CTE is similar to the joins above, and holds the total up to any given index. Outside of the CTE I join it back to the original table so I can select it along with other fields.
;with summrp as (
select m1.idx, sum(m2.QtyReq) as sumUsed
from #mrpe m1
join #mrpe m2 on m2.idx <= m1.idx
group by m1.idx
)
select RefNum, RefLineSuf, QtyReq, ProjectedDate, sumUsed from #mrpe m
join summrp on summrp.idx=m.idx
In SQL Server 2012 you can use this shortcut to get a result like Grzegorz's.
SELECT amount
FROM (
SELECT * ,
SUM(amount) OVER (ORDER BY amount ASC) AS total
from demo
) T
WHERE total <= 6
A fiddle in the hand... http://sqlfiddle.com/#!6/b8506/6
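A windowed running total can also be adapted to the asker's rule that inputs 7 through 10 should return four rows. The sketch below is an assumption-laden variation, not the answer's exact query: it uses SQLite 3.25+ via Python's sqlite3, an explicit ROWS frame, and keeps a row while the total of the rows before it is still below the target. `rows_up_to` is a hypothetical helper name.

```python
import sqlite3

def rows_up_to(amounts, target):
    """Keep rows, in ascending order, while the running total before each row is below target."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (amount INTEGER)")
    conn.executemany("INSERT INTO demo VALUES (?)", [(a,) for a in amounts])
    # total includes the current row, so total - amount is the sum of the preceding rows.
    rows = conn.execute("""
        SELECT amount
        FROM (SELECT amount,
                     SUM(amount) OVER (ORDER BY amount
                                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total
              FROM demo) AS t
        WHERE total - amount < ?
        ORDER BY amount
    """, (target,)).fetchall()
    conn.close()
    return [r[0] for r in rows]

window_six = rows_up_to([3, 1, 5, 2, 4], 6)
window_seven = rows_up_to([3, 1, 5, 2, 4], 7)
print(window_six, window_seven)
```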

SQL: create sequential list of numbers from various starting points

I'm stuck on this SQL problem.
I have a column that is a list of starting points (prevdoc), and another column that lists how many sequential numbers I need after the starting point (exdiff).
For example, here are the first several rows:
prevdoc | exdiff
----------------
1 | 3
21 | 2
126 | 2
So I need an output to look something like:
2
3
4
22
23
127
128
I'm lost as to where even to start. Can anyone advise me on the SQL code for this solution?
Thanks!
;with a as
(
select prevdoc + 1 col, exdiff
from <table> where exdiff > 0
union all
select col + 1, exdiff - 1
from a
where exdiff > 1
)
select col from a
If your exdiff is going to be a small number, you can make up a virtual table of numbers using SELECT..UNION ALL as shown here and join to it:
select prevdoc+number
from doc
join (select 1 number union all
select 2 union all
select 3 union all
select 4 union all
select 5) x on x.number <= doc.exdiff
order by 1;
I have provided for 5 but you can expand as required. You haven't specified your DBMS, but in each one there will be a source of sequential numbers, for example in SQL Server, you could use:
select prevdoc+number
from doc
join master..spt_values v on
v.number <= doc.exdiff and
v.number >= 1 and
v.type = 'p'
order by 1;
The master..spt_values table contains numbers between 0-2047 (when filtered by type='p').
If the numbers are not too large, then you can use the following trick in most databases:
select t.prevdoc + seqnum
from t join
(select row_number() over (order by column_name) as seqnum
from INFORMATION_SCHEMA.columns
) nums
on seqnum <= t.exdiff
The use of INFORMATION_SCHEMA columns in the subquery is arbitrary. The only purpose is to generate a sequence of numbers at least as long as the maximum exdiff number.
This approach will work in any database that supports the ranking functions. Most databases have a database-specific way of generating a sequence (such as recursive CTEs in SQL Server and CONNECT BY in Oracle).
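For a quick check of the recursive approach against the sample rows, here is a minimal sketch on SQLite via Python's sqlite3. SQLite and the table name `doc` are assumptions, the counter column is renamed `remaining` for readability, and `expand_sequences` is a hypothetical helper.

```python
import sqlite3

def expand_sequences(rows):
    """For each (prevdoc, exdiff) row, list the exdiff numbers that follow prevdoc."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc (prevdoc INTEGER, exdiff INTEGER)")
    conn.executemany("INSERT INTO doc VALUES (?, ?)", rows)
    # The anchor emits prevdoc + 1; each recursive step emits the next number
    # until the remaining count for that starting point is used up.
    result = conn.execute("""
        WITH RECURSIVE a(col, remaining) AS (
            SELECT prevdoc + 1, exdiff FROM doc WHERE exdiff > 0
            UNION ALL
            SELECT col + 1, remaining - 1 FROM a WHERE remaining > 1
        )
        SELECT col FROM a ORDER BY col
    """).fetchall()
    conn.close()
    return [r[0] for r in result]

sequence = expand_sequences([(1, 3), (21, 2), (126, 2)])
print(sequence)
```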

Maximum difference between rows

I am not strong with SQL at all, so here it goes:
I have a table with a column containing doubles.
I would like to select all rows whose value differs from another row's value by at most 5.
How can I do that?
id value
1 4955.54
2 2884.32
3 8485.45
4 4588.54
5 8487.62
RESULT
id value
3 8485.45
5 8487.62
How can I do that in MySQL?
Many thanks!
This works, although you mean maximum not minimum difference:
SELECT v.id, v.value
FROM Values v
WHERE EXISTS(
SELECT null from Values v2
WHERE v2.id <> v.id and
ABS(v2.value - v.value) BETWEEN 0 AND 5
)
MSDN: EXISTS (Transact-SQL)
MSDN: ABS (Transact-SQL)
MSDN: BETWEEN (Transact-SQL)
select distinct t1.id, t1.value from table t1
inner join table t2 on t1.id <> t2.id
where ABS(t1.value-t2.value)<=5
It's likely to be inefficient if the set of values is large. There is no obvious way to write this query efficiently, but here goes:
select lo.val
, hi.val
from numbers lo
inner join numbers hi
on hi.val - lo.val > 0 and hi.val - lo.val <= 5
If the val column is indexed, it might help to add another condition like so:
select lo.val
, hi.val
from numbers lo
inner join numbers hi
on hi.val > lo.val
where hi.val - lo.val <= 5
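The EXISTS formulation from earlier in this thread can be run against the sample data with a small sketch. It assumes SQLite through Python's sqlite3, uses a hypothetical table name `readings` to avoid the reserved word Values, and `rows_with_close_neighbour` is an illustrative helper name.

```python
import sqlite3

def rows_with_close_neighbour(rows, max_diff=5):
    """Return (id, value) rows that have another row within max_diff of their value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    # A row qualifies if some other row's value is at most max_diff away.
    result = conn.execute("""
        SELECT v.id, v.value
        FROM readings v
        WHERE EXISTS (SELECT 1
                      FROM readings v2
                      WHERE v2.id <> v.id
                        AND ABS(v2.value - v.value) <= ?)
        ORDER BY v.id
    """, (max_diff,)).fetchall()
    conn.close()
    return result

close = rows_with_close_neighbour(
    [(1, 4955.54), (2, 2884.32), (3, 8485.45), (4, 4588.54), (5, 8487.62)]
)
print(close)
```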

How to find "holes" in a table

I recently inherited a database on which one of the tables has the primary key composed of encoded values (Part1*1000 + Part2).
I normalized that column, but I cannot change the old values.
So now I have
select ID from table order by ID
ID
100001
100002
101001
...
I want to find the "holes" in the table (more precisely, the first "hole" after 100000) for new rows.
I'm using the following select, but is there a better way to do that?
select /* top 1 */ ID+1 as newID from table
where ID > 100000 and
ID + 1 not in (select ID from table)
order by ID
newID
100003
101029
...
The database is Microsoft SQL Server 2000. I'm ok with using SQL extensions.
select ID +1 From Table t1
where not exists (select * from Table t2 where t1.id +1 = t2.id);
not sure if this version would be faster than the one you mentioned originally.
SELECT (t1.ID+1) FROM table AS t1
LEFT JOIN table as t2
ON t1.ID+1 = t2.ID
WHERE t2.ID IS NULL
This solution should give you the first and last ID values of the "holes" you are seeking. I use this in Firebird 1.5 on a table of 500K records, and although it does take a little while, it gives me what I want.
SELECT l.id + 1 start_id, MIN(fr.id) - 1 stop_id
FROM (table l
LEFT JOIN table r
ON l.id = r.id - 1)
LEFT JOIN table fr
ON l.id < fr.id
WHERE r.id IS NULL AND fr.id IS NOT NULL
GROUP BY l.id, r.id
For example, if your data looks like this:
ID
1001
1002
1005
1006
1007
1009
1011
You would receive this:
start_id stop_id
1003 1004
1008 1008
1010 1010
I wish I could take full credit for this solution, but I found it at Xaprb.
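The start/stop query above can be reproduced with a short sketch on SQLite via Python's sqlite3. SQLite and the table name `tbl` are assumptions (the answer was written for Firebird), and `find_holes` is a hypothetical helper.

```python
import sqlite3

def find_holes(ids):
    """Return (start_id, stop_id) for each run of missing IDs between existing ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in ids])
    # r is the immediate successor (absent when a hole starts after l);
    # fr is any later row, whose minimum marks where the hole ends.
    rows = conn.execute("""
        SELECT l.id + 1 AS start_id, MIN(fr.id) - 1 AS stop_id
        FROM tbl l
        LEFT JOIN tbl r  ON l.id = r.id - 1
        LEFT JOIN tbl fr ON l.id < fr.id
        WHERE r.id IS NULL AND fr.id IS NOT NULL
        GROUP BY l.id, r.id
        ORDER BY start_id
    """).fetchall()
    conn.close()
    return rows

holes = find_holes([1001, 1002, 1005, 1006, 1007, 1009, 1011])
print(holes)
```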
from How do I find a "gap" in running counter with SQL?
select
MIN(ID)
from (
select
100001 ID
union all
select
[YourIdColumn]+1
from
[YourTable]
where
--Filter the rest of your key--
) foo
left join
[YourTable]
on [YourIdColumn]=ID
and --Filter the rest of your key--
where
[YourIdColumn] is null
The best way is building a temp table with all IDs.
Then make a left join.
declare @maxId int
select @maxId = max(YOUR_COLUMN_ID) from YOUR_TABLE_HERE
declare @t table (id int)
declare @i int
set @i = 1
while @i <= @maxId
begin
insert into @t values (@i)
set @i = @i +1
end
select t.id
from @t t
left join YOUR_TABLE_HERE x on x.YOUR_COLUMN_ID = t.id
where x.YOUR_COLUMN_ID is null
Have thought about this question recently, and looks like this is the most elegant way to do that:
SELECT TOP(@MaxNumber) ROW_NUMBER() OVER (ORDER BY t1.number)
FROM master..spt_values t1 CROSS JOIN master..spt_values t2
EXCEPT
SELECT Id FROM <your_table>
This solution doesn't give all holes in the table, only the next free number after each run of IDs, plus the first free number above the current maximum. It works if you want to fill in gaps in the IDs, and it still gives you a free ID when there is no gap.
select numb + 1 from temp
minus
select numb from temp;
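To illustrate what the set-difference query returns, here is a minimal sketch on SQLite via Python's sqlite3. SQLite spells the operator EXCEPT rather than MINUS; the table name `temp_ids` and the helper `next_free_ids` are hypothetical.

```python
import sqlite3

def next_free_ids(ids):
    """Return the first free number after each run of consecutive IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp_ids (numb INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO temp_ids VALUES (?)", [(i,) for i in ids])
    # Every successor that is not itself an existing ID is a free number.
    rows = conn.execute("""
        SELECT numb + 1 FROM temp_ids
        EXCEPT
        SELECT numb FROM temp_ids
        ORDER BY 1
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

free_ids = next_free_ids([1, 2, 3, 5])
print(free_ids)
```

The last value is always the number just above the current maximum, which is why the query still yields a usable ID when the table has no gaps.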
This will give you the complete picture, where 'Bottom' stands for gap start and 'Top' stands for gap end:
select *
from
(
(select <COL>+1 as id, 'Bottom' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/
except
select <COL>, 'Bottom' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/)
union
(select <COL>-1 as id, 'Top' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/
except
select <COL>, 'Top' AS 'Pos' from <TABLENAME> /*where <CONDITION>*/)
) t
order by t.id, t.Pos
Note: the first and last results are wrong and should be disregarded, but taking them out would make this query a lot more complicated, so this will do for now.
Many of the previous answers are quite good. However, they all fail to return the first value of the sequence and/or fail to consider the lower limit of 100000. They all return intermediate holes but not the very first one (100001 if missing).
A full solution to the question is the following one:
select id + 1 as newid from
(select 100000 as id union select id from tbl) t
where (id + 1 not in (select id from tbl)) and
(id >= 100000)
order by id
limit 1;
The number 100000 is to be used if the first number of the sequence is 100001 (as in the original question); otherwise it is to be modified accordingly
"limit 1" is used in order to have just the first available number instead of the full sequence
For people using Oracle, the following can be used:
select a, b from (
select ID + 1 a, max(ID) over (order by ID rows between current row and 1 following) - 1 b from MY_TABLE
) where a <= b order by a desc;
The following SQL code works well with SQLite, and should also work on MySQL, MS SQL and so on, possibly with minor changes (for example, replacing IIF with CASE where IIF is not available).
On SQLite this takes only 2 seconds on a table with 1 million rows (and about 100 sparse missing rows).
WITH holes AS (
SELECT
IIF(c2.id IS NULL,c1.id+1,null) as start,
IIF(c3.id IS NULL,c1.id-1,null) AS stop,
ROW_NUMBER () OVER (
ORDER BY c1.id ASC
) AS rowNum
FROM |mytable| AS c1
LEFT JOIN |mytable| AS c2 ON c1.id+1 = c2.id
LEFT JOIN |mytable| AS c3 ON c1.id-1 = c3.id
WHERE c2.id IS NULL OR c3.id IS NULL
)
SELECT h1.start AS start, h2.stop AS stop FROM holes AS h1
LEFT JOIN holes AS h2 ON h1.rowNum+1 = h2.rowNum
WHERE h1.start IS NOT NULL AND h2.stop IS NOT NULL
UNION ALL
SELECT 1 AS start, h1.stop AS stop FROM holes AS h1
WHERE h1.rowNum = 1 AND h1.stop > 0
ORDER BY h1.start ASC