SQL: how to find gaps in a specific range of numbers?

I want to find gaps in a numeric column (not the entire column), but in a specific range.
For example :
My column :
1
2
5
6
8
10
18
19
20
I want to specify a specific range in which my SQL query would look for gaps. For example, I want gaps in the range [15,20]. In this example, the gaps are : 15,16,17.
I built a query that retrieves gaps, but it misses gaps that sit at the beginning of my range:
SELECT cur_value + 1 AS start_gap, next_value - 1 AS end_gap
FROM (
SELECT col AS cur_value, LEAD (col) OVER (ORDER BY col) AS next_value
FROM table
--WHERE col BETWEEN 200 AND 300
)
WHERE next_value - cur_value > 1
ORDER BY start_gap;
How can I do that ?
N.B : Performance is very important in my case. I deal with tons of rows.

I think the simplest method might be:
SELECT cur_value + 1 AS start_gap, next_value - 1 AS end_gap
FROM (SELECT col AS cur_value, LEAD (col) OVER (ORDER BY col) AS next_value
FROM (SELECT col
FROM table
WHERE col BETWEEN 200 AND 300
UNION ALL
SELECT 200-1 FROM DUAL
UNION ALL
SELECT 300+1 FROM DUAL
) t
) t
WHERE next_value - cur_value > 1
ORDER BY start_gap;
Note: This will work on arbitrarily long ranges.
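The sentinel idea above (adding the values just below and just above the range so that leading and trailing gaps are also detected) can be tried out with the sketch below. It is only an illustration under assumptions: it runs the same query shape on an in-memory SQLite database through Python's standard sqlite3 module (window functions need SQLite 3.25 or later), and the function name find_gaps and table name t are made up for the example.

```python
import sqlite3

def find_gaps(values, lo, hi):
    """Return (start_gap, end_gap) pairs of missing integers within [lo, hi]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col INTEGER)")  # hypothetical table name
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT cur_value + 1 AS start_gap, next_value - 1 AS end_gap
        FROM (
            SELECT col AS cur_value,
                   LEAD(col) OVER (ORDER BY col) AS next_value
            FROM (
                SELECT col FROM t WHERE col BETWEEN :lo AND :hi
                UNION ALL SELECT :lo - 1   -- sentinel just below the range
                UNION ALL SELECT :hi + 1   -- sentinel just above the range
            )
        )
        WHERE next_value - cur_value > 1
        ORDER BY start_gap
        """,
        {"lo": lo, "hi": hi},
    ).fetchall()
    conn.close()
    return rows

# Sample column from the question, range [15, 20]
print(find_gaps([1, 2, 5, 6, 8, 10, 18, 19, 20], 15, 20))  # [(15, 17)]
```

Because the sentinels lie outside the range, a gap that touches either boundary is reported like any interior gap, and an empty range comes back as a single gap covering it entirely.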

You can use a hierarchical CONNECT BY query to produce the list of numbers in the given range, then keep the values that don't exist in the table.
Assuming your range is 15 to 20 (inclusive, so 20 - 15 + 1 = 6 candidates), use NOT IN:
select * from (
select level - 1 + 15 col
from dual
connect by level <= 20 - 15 + 1
) where col not in (select col from your_table);
Similarly, with NOT EXISTS (which, unlike NOT IN, still returns rows when your_table.col contains NULLs):
select * from (
select level - 1 + 15 col
from dual
connect by level <= 20 - 15 + 1
) t where not exists (select 1 from your_table where col = t.col);
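The generate-then-filter approach can be modeled outside the database as follows. This is a minimal Python sketch, not Oracle code; the function name missing_in_range is hypothetical. It mainly makes the counting explicit: an inclusive range [lo, hi] holds hi - lo + 1 candidates.

```python
def missing_in_range(values, lo, hi):
    """List every integer in the inclusive range [lo, hi] absent from values."""
    present = set(values)  # plays the role of the NOT IN / NOT EXISTS lookup
    # range(lo, hi + 1) yields hi - lo + 1 candidates, so both ends are checked
    return [n for n in range(lo, hi + 1) if n not in present]

print(missing_in_range([1, 2, 5, 6, 8, 10, 18, 19, 20], 15, 20))  # [15, 16, 17]
```

Note that this enumerates every candidate number, so its cost grows with the width of the range, whereas a LEAD-based query only depends on the number of rows inside the range.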

Related

SQL - Extracting an ID range for a packet of records

I have a table with about 40,000,000 records; min(id) = 2 and max(id) = 80,000,000.
I would like to create an automated script that runs in a loop.
But I don't want to run about 80 iterations, because some of them will be empty.
How can I find the min(id) and max(id) range for the first iteration, then for the next one, and so on?
I used mod, but it doesn't work correctly:
SELECT MIN(ID), MAX(ID)
FROM (
SELECT mod(id,45), id FROM table
WHERE mod(id,45) = 0
GROUP BY mod(id,45), id
ORDER BY id desc
)
What I want is:
the first iteration has a range covering 1 million records: min(id) = 2, max(id) = 1 500 000
the second iteration has a range covering 1 million records: min(id) = 1 550 000, max(id) = 5 000 000
and so on
This should be easy in any DBMS that supports ordered row numbering.
Here is a Db2 example.
Every generated SELECT returns 2 rows, except the last one, which may return fewer.
SELECT 'SELECT * FROM MYTAB WHERE I BETWEEN ' || MIN (I) || ' AND ' || MAX (I) AS STMT
FROM
(
SELECT I, (ROW_NUMBER () OVER (ORDER BY I) - 1) / 2 AS RN_
FROM (VALUES 1, 9, 2, 7, 4) MYTAB (I)
) G
GROUP BY RN_
The result is:
STMT
SELECT * FROM MYTAB WHERE I BETWEEN 1 AND 2
SELECT * FROM MYTAB WHERE I BETWEEN 4 AND 7
SELECT * FROM MYTAB WHERE I BETWEEN 9 AND 9
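The grouping trick above (number the rows in id order, integer-divide by the batch size, then take MIN and MAX per group) can be sketched in Python as follows. The function name id_batches is hypothetical, and loading all ids into memory is an assumption made for illustration; for tens of millions of rows the grouping would stay in the database.

```python
def id_batches(ids, batch_size):
    """Return (min_id, max_id) per batch of batch_size consecutive ids."""
    ordered = sorted(ids)  # same role as ROW_NUMBER() OVER (ORDER BY I)
    batches = []
    for start in range(0, len(ordered), batch_size):
        chunk = ordered[start:start + batch_size]
        batches.append((chunk[0], chunk[-1]))  # MIN(I), MAX(I) of the group
    return batches

# Same sample values and batch size (2) as the Db2 example
for lo, hi in id_batches([1, 9, 2, 7, 4], 2):
    print(f"SELECT * FROM MYTAB WHERE I BETWEEN {lo} AND {hi}")
```

Every batch holds exactly batch_size existing rows (except possibly the last), so no iteration is empty regardless of how sparse the ids are.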

How to find missing numbers by 100s in Oracle

I need to find the missing numbers in a table column in Oracle, where the missing numbers must be taken by 100s: if at least one number is found between 2000 and 2099, all missing numbers between 2000 and 2099 must be returned, and so on.
Here is an example that clarifies what I need:
create table test1 ( a number(9,0));
insert into test1 values (2001);
insert into test1 values (2002);
insert into test1 values (2004);
insert into test1 values (2105);
insert into test1 values (3006);
insert into test1 values (9410);
commit;
the result must be 2000,2003,2005 to 2099,2100 to 2104,2106 to 2199,3000 to 3005,3007 to 3099,9400 to 9409,9411 to 9499.
I started with this query but it's obviously not returning what I need :
SELECT Level+(2000-1) FROM dual CONNECT BY LEVEL <= 9999
MINUS SELECT a FROM test1;
You can use a hierarchical query as follows (TEST_TABLE here plays the role of test1 from the question):
SELECT A FROM (
SELECT A + COLUMN_VALUE - 1 AS A
FROM ( SELECT DISTINCT TRUNC(A, - 2) A
FROM TEST_TABLE) T
CROSS JOIN TABLE ( CAST(MULTISET(
SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= 100
) AS SYS.ODCINUMBERLIST) ) LEVELS
)
MINUS
SELECT A FROM TEST_TABLE;
A
----------
2000
2003
2005
2006
2007
2008
2009
.....
.....
I like to use standard recursive queries for this.
with nums (a, max_a) as (
select min(a), max(a) from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
The with clause takes the minimum and maximum value of a in the table, and generates all numbers in between. Then, the outer query filters on those that do not exist in the table.
If you want to generate ranges of missing numbers instead of a comprehensive list, you can use window functions instead:
select a + 1 start_a, lead_a - 1 end_a
from (
select a, lead(a) over(order by a) lead_a
from test1
) t
where lead_a <> a + 1
EDIT:
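If you want the missing values within ranges of hundreds, then we can slightly adapt the recursive solution. Each range runs from the hundred boundary up to boundary + 99, so that the next hundred is not reported as missing:
with nums (a, max_a) as (
select distinct floor(a / 100) * 100 a, floor(a / 100) * 100 + 99 from test1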
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
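The per-hundred logic can be checked against the question's sample data with a short Python model. This is a sketch under assumptions: the function name missing_by_hundreds is hypothetical, values are taken to be non-negative integers, and each bucket is assumed to span its hundred boundary b through b + 99, as the expected result in the question implies.

```python
def missing_by_hundreds(values):
    """For every hundred-bucket holding at least one value, list its missing numbers."""
    present = set(values)
    # Bucket start for each value, e.g. 2105 -> 2100 (non-negative values assumed)
    buckets = sorted({v // 100 * 100 for v in present})
    return [n for b in buckets for n in range(b, b + 100) if n not in present]

sample = [2001, 2002, 2004, 2105, 3006, 9410]
missing = missing_by_hundreds(sample)
print(missing[:3])   # [2000, 2003, 2005]
print(len(missing))  # 394, i.e. 4 buckets * 100 numbers - 6 present values
```

Only hundreds that contain at least one existing value are expanded, which is why nothing between 2200 and 2999 shows up for this data.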
Assuming you define fixed lower and upper bounds for the range, you just need to eliminate the existing values from the generated numbers by using NOT EXISTS, such as:
SQL> exec :min_val:=2000
SQL> exec :max_val:=2499
SQL> SELECT *
FROM
(
SELECT level + :min_val - 1 AS nr
FROM dual
CONNECT BY level <= :max_val - :min_val + 1
)
WHERE NOT EXISTS ( SELECT * FROM test1 WHERE a = nr )
ORDER BY nr;
/

looping in sql with delimiter

I just had this idea: how can I loop in SQL?
For example
I have this column
PARAMETER_VALUE
E,C;S,C;I,X;G,T;S,J;S,F;C,S;
I want to store every value before a comma (,) in one temp column and every value after it in another column,
and it should not stop until there are no more values after a semicolon (;).
Expected Output for Example
COL1 E S I G S S C
COL2 C C X T J F S
etc . . .
You can get this by using the regexp_substr() function with a connect by level <= clause:
with t1(PARAMETER_VALUE) as
(
select 'E,C;S,C;I,X;G,T;S,J;S,F;C,S;' from dual
), t2 as
(
select level as rn,
regexp_substr(PARAMETER_VALUE,'([^,]+)',1,level) as str1,
regexp_substr(PARAMETER_VALUE,'([^;]+)',1,level) as str2
from t1
connect by level <= regexp_count(PARAMETER_VALUE,';')
)
select listagg( regexp_substr(str1,'([^;]+$)') ,' ') within group (order by rn) as col1,
listagg( regexp_substr(str2,'([^,]+$)') ,' ') within group (order by rn) as col2
from t2;
COL1          COL2
------------- -------------
E S I G S S C C C X T J F S
Assuming that you need to separate the input into rows, at the ; delimiters, and then into columns at the , delimiter, you could do something like this:
-- WITH clause included to simulate input data. Not part of the solution;
-- use actual table and column names in the SELECT statement below.
with
t1(id, parameter_value) as (
select 1, 'E,C;S,C;I,X;G,T;S,J;S,F;C,S;' from dual union all
select 2, ',U;,;V,V;' from dual union all
select 3, null from dual
)
-- End of simulated input data
select id,
level as ord,
regexp_substr(parameter_value, '(;|^)([^,]*),', 1, level, null, 2) as col1,
regexp_substr(parameter_value, ',([^;]*);' , 1, level, null, 1) as col2
from t1
connect by level <= regexp_count(parameter_value, ';')
and id = prior id
and prior sys_guid() is not null
order by id, ord
;
 ID ORD COL1 COL2
--- --- ---- ----
  1   1 E    C
  1   2 S    C
  1   3 I    X
  1   4 G    T
  1   5 S    J
  1   6 S    F
  1   7 C    S
  2   1      U
  2   2
  2   3 V    V
  3   1
Note - this is not the most efficient way to split the inputs (nothing will be very efficient - the data model, which is in violation of First Normal Form, is the reason). This can be improved using standard instr and substr, but the query will be more complicated, and for that reason, harder to maintain.
I generated more input data, to illustrate a few things. You may have several inputs that must be broken up at the same time; that must be done with care. (Note the additional conditions in CONNECT BY). I also illustrate the handling of NULL - if a comma comes right after a semicolon, that means that the "column 1" part of that pair must be NULL. That is shown in the output.
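For comparison, the same split (rows at each ; and columns at the ,) can be written in a few lines of Python, which also makes the NULL handling easy to inspect. This is only an illustrative sketch: the function name split_pairs is hypothetical, and unlike the SQL above it returns no row at all for a NULL input instead of one row of NULLs.

```python
def split_pairs(parameter_value):
    """Split 'a,b;c,d;' into [(a, b), (c, d)]; empty parts become None."""
    if parameter_value is None:
        return []
    items = parameter_value.split(";")
    if items and items[-1] == "":
        items.pop()  # drop the empty piece after the trailing semicolon
    pairs = []
    for item in items:
        left, _, right = item.partition(",")
        # An empty string stands for NULL, as it does in Oracle
        pairs.append((left or None, right or None))
    return pairs

print(split_pairs(",U;,;V,V;"))  # [(None, 'U'), (None, None), ('V', 'V')]
```

Doing the split in application code like this is one way around the cost of regular expressions in the database, at the price of moving the raw strings to the client.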

to find minimum missing number in oracle

I want to find the minimum missing number in a column named s_no of a table named test_table in Oracle, and I wrote the following code:
select
min_s_no-1+level missing_number
from (
select min(s_no) min_s_no, max(s_no) max_s_no
from test_table
) connect by level <= max_s_no-min_s_no+1
minus
select s_no from test_table
;
It gives me all the missing numbers as a result, but I want to select only the minimum one.
Can anyone help me, please?
Thanks in advance.
Using the analytic function LEAD you can get the number from the next row in ascending order. Comparing this value with the original number increased by 1 gives you the missing values (wherever the two numbers do not match).
Getting the first missing value in ascending order is the same as selecting the MIN value:
select
num,
lead(num) over (order by num) num_lead,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
order by num;
       NUM   NUM_LEAD MISSING_NUM
---------- ---------- -----------
         4          5
         5          6
         6          9           7
         9         10
        10         13          11
        13
-- first missing number = MIN missing number
select min(missing_num)
from (
select
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
MIN(MISSING_NUM)
----------------
7
ADDENDUM
A good practice in writing SQL is to consider edge cases - here a table that contains a complete interval without holes. The first missing value will be the successor of the last number.
select nvl(min(missing_num),max(num)+1) first_missing_value
from (
select
num,
case when num + 1 != lead(num) over (order by num) then num + 1 end as missing_num
from test_data
);
A complete table returns no MISSING_NUM, so the original query returns NULL. With NVL, the expected result is provided.
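The edge case can be exercised with the sketch below, which ports the query to an in-memory SQLite database via Python's standard sqlite3 module. These are assumptions for illustration: COALESCE stands in for Oracle's NVL, window functions need SQLite 3.25 or later, the values are assumed to be distinct, and the function name first_missing is hypothetical.

```python
import sqlite3

def first_missing(nums):
    """Return the first missing number, or max + 1 when there is no hole."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_data (num INTEGER)")
    conn.executemany("INSERT INTO test_data VALUES (?)", [(n,) for n in nums])
    (result,) = conn.execute(
        """
        SELECT COALESCE(MIN(missing_num), MAX(num) + 1)  -- NVL in Oracle
        FROM (
            SELECT num,
                   CASE WHEN num + 1 != LEAD(num) OVER (ORDER BY num)
                        THEN num + 1 END AS missing_num
            FROM test_data
        )
        """
    ).fetchone()
    conn.close()
    return result

print(first_missing([4, 5, 6, 9, 10, 13]))  # 7
print(first_missing([1, 2, 3]))             # 4 (no hole, so max + 1)
```

For an empty table both MIN and MAX are NULL, so the function returns None; whether that should instead be a fixed starting value depends on the application.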
The best way to find the gaps is to use the analytic functions lead or lag. An example with lag:
with test_data as (
select 1 num from dual union all
select 4 from dual union all
select 6 from dual union all
select 8 from dual union all
select 3 from dual union all
select 9 from dual union all
select 0 from dual
)
select min(gap) min_gap
from (
select num, lag(num) over (order by num)+1 gap
from test_data
)
where num != gap
;
MIN_GAP
------------------
2
In Oracle 12.1 and above, MATCH_RECOGNIZE can make quick work of this kind of problem:
Edited. Initially I was picking the "next number" where a gap exists (in the example, the value 9). But that is not what the OP wants, he wants the first missing number (7 in this case). I edited to change the measures clause, to find the first missing number as requested. End Edit
with test_data (num) as (
select 4 from dual union all
select 5 from dual union all
select 6 from dual union all
select 9 from dual union all
select 10 from dual union all
select 13 from dual
)
-- end of test data; when you use the SQL query below,
-- replace test_data and num with your actual table and column names.
select result as num
from test_data
match_recognize (
order by num
measures last(b.num) + 1 as result
pattern ( ^ a b* c )
define b as num = prev(num) + 1,
c as num > prev(num) + 1
)
;
NUM
---
7

removing extra sub-query in Oracle, selecting array of values

I'm SELECTing some aggregate data and grouping on the date and a particular field. I want to display all values in that field and a count for those values even if there was no data matching that field on that day. E.g.
Date MyField Count
2009-09-25 A 2
2009-09-25 B 0
2009-09-24 A 1
2009-09-24 B 1
The Oracle SQL I currently have to do this is akin to the following:
SELECT today,
mytable.myfield,
COUNT(
CASE WHEN fields.myfield = mytable.myfield AND
date >= today AND
date < tomorrow
THEN 1
END
)
FROM (
SELECT TRUNC(SYSDATE) + 1 - LEVEL AS today,
TRUNC(SYSDATE) + 2 - LEVEL AS tomorrow
FROM DUAL
CONNECT BY LEVEL <= 30
),
(
/* This is the part that seems inefficient */
SELECT DISTINCT myfield
FROM mytable
WHERE myfield IN ('A', 'B')
) fields,
mytable
GROUP BY today, mytable.myfield
ORDER BY today DESC, mytable.myfield ASC
My concern is that I know exactly which values I want to display for myfield, and it seems inefficient to have a SELECT query that accesses mytable. I was wondering if there's some way I could do something like this in that sub-query:
SELECT ('A', 'B') AS myfield
FROM DUAL
I'm using an older version of Oracle where WITH clauses do not work.
You would have to get them as different rows, not different columns. So you'll end up with
select 'A' from dual
union
select 'B' from dual
In that case, the query should be equivalent as long as there are rows in mytable with fields 'A' and 'B'. If ever there aren't, then your subquery will return rows that the original subquery would not.
Why don't you upgrade your Oracle version? The WITH clause was first added in Oracle 9.2 (2002). Are you still using Oracle 8?
You don't have a join between the FIELDS sub-query and MYTABLE, so your resultset will contain a row for every value of MYFIELD for the last thirty days.
However, rather than adding that join, why not ditch the sub-query and just filter on MYTABLE.MYFIELD? Also, if you are concerned about performance you should bound the date in a WHERE clause, otherwise you will process every row in MYTABLE.
select today
, myfield
, count ( case when trunc(somedate) = today then 1 end ) as ab_count
from ( select trunc(sysdate) + 1 - level as today
from dual
connect by level <= 30 )
, mytable
where myfield in ('A', 'B')
and somedate >= trunc(sysdate) - 30
group by today, myfield
order by today desc, myfield asc
/
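The shape of the result (one row per day and per wanted field value, with a zero where nothing matched) can be modeled in Python as follows. This is a sketch under assumptions, not a replacement for the query: the function name daily_counts is hypothetical, and rows is assumed to be a list of (date, field) pairs with the time part already truncated.

```python
from collections import Counter
from datetime import date, timedelta

def daily_counts(rows, fields, today, days=30):
    """Return (day, field, count) for each of the last `days` days and each field."""
    counts = Counter(rows)  # keyed by (day, field)
    result = []
    for offset in range(days):            # today first, then going back in time
        day = today - timedelta(days=offset)
        for field in sorted(fields):      # field ascending within each day
            result.append((day, field, counts[(day, field)]))  # 0 if absent
    return result

rows = [
    (date(2009, 9, 25), "A"), (date(2009, 9, 25), "A"),
    (date(2009, 9, 24), "A"), (date(2009, 9, 24), "B"),
]
for day, field, n in daily_counts(rows, {"A", "B"}, date(2009, 9, 25), days=2):
    print(day, field, n)
```

With the sample rows this prints the four lines from the question's example table, including the zero count for B on 2009-09-25.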
edit
I have run your original query and my revised one against some test data. You will just have to take my word for it that the two result sets were in fact identical - or try it yourself :)
Your query returns:
TODAY M AB_COUNT
----------- - ----------
26-SEP-2009 A 0
26-SEP-2009 B 0
25-SEP-2009 A 2
25-SEP-2009 B 2
24-SEP-2009 A 2
24-SEP-2009 B 0
...
29-AUG-2009 A 1
29-AUG-2009 B 2
28-AUG-2009 A 1
28-AUG-2009 B 0
60 rows selected.
SQL>
My query returns:
TODAY M AB_COUNT
----------- - ----------
26-SEP-2009 A 0
26-SEP-2009 B 0
25-SEP-2009 A 2
25-SEP-2009 B 2
24-SEP-2009 A 2
24-SEP-2009 B 0
...
29-AUG-2009 A 1
29-AUG-2009 B 2
28-AUG-2009 A 1
28-AUG-2009 B 0
60 rows selected.
SQL>