SQL - Extracting an ID range for a packet of records

I have a table with about 40,000,000 records, where MIN(id) = 2 and MAX(id) = 80,000,000.
I would like to create an automated script that runs in a loop.
But I don't want to run about 80 iterations, because some of them would be empty.
Does anyone know how I can find the MIN(id) and MAX(id) range for the first iteration, then for the next one, and so on?
I tried MOD, but it doesn't work correctly:
SELECT MIN(ID), MAX(ID)
FROM (
SELECT mod(id,45), id FROM table
WHERE mod(id,45) = 0
GROUP BY mod(id,45), id
ORDER BY id desc
)
Because I want this:
the first iteration covers a range of 1 million records: min(id) = 2, max(id) = 1 500 000
the second iteration covers a range of 1 million records: min(id) = 1 550 000, max(id) = 5 000 000
and so on

This should be easy in any DBMS that supports ordered row numbering (ROW_NUMBER).
Here is a Db2 example.
Every generated SELECT returns 2 rows, except the last one, which may return fewer.
SELECT 'SELECT * FROM MYTAB WHERE I BETWEEN ' || MIN (I) || ' AND ' || MAX (I) AS STMT
FROM
(
SELECT I, (ROW_NUMBER () OVER (ORDER BY I) - 1) / 2 AS RN_
FROM (VALUES 1, 9, 2, 7, 4) MYTAB (I)
) G
GROUP BY RN_
The result is:
STMT
SELECT * FROM MYTAB WHERE I BETWEEN 1 AND 2
SELECT * FROM MYTAB WHERE I BETWEEN 4 AND 7
SELECT * FROM MYTAB WHERE I BETWEEN 9 AND 9
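To make the grouping logic easier to follow outside the database, here is a minimal Python sketch of the same idea: sort the ids, cut them into batches of a fixed size, and take the first and last id of each batch. The function name id_ranges and the batch_size parameter are made up for illustration; for a table of this size you would normally let the database do this work, as in the query above.

```python
def id_ranges(ids, batch_size):
    """Return (min_id, max_id) for each consecutive batch of batch_size ids."""
    ordered = sorted(ids)
    ranges = []
    for start in range(0, len(ordered), batch_size):
        # The last batch may hold fewer than batch_size ids.
        end = min(start + batch_size, len(ordered)) - 1
        ranges.append((ordered[start], ordered[end]))
    return ranges


# Same sample values as the VALUES clause above, two rows per batch.
print(id_ranges([1, 9, 2, 7, 4], 2))
```

Each tuple corresponds to one generated BETWEEN statement, so no iteration is ever empty.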

Related

How to find missing numbers by 100s in Oracle

I need to find the missing numbers in a table column in Oracle, where the missing numbers must be taken by 100s: if at least one number is found between 2000 and 2099, all missing numbers between 2000 and 2099 must be returned, and so on.
Here is an example that clarifies what I need:
create table test1 ( a number(9,0));
insert into test1 values (2001);
insert into test1 values (2002);
insert into test1 values (2004);
insert into test1 values (2105);
insert into test1 values (3006);
insert into test1 values (9410);
commit;
The result must be 2000, 2003, 2005 to 2099, 2100 to 2104, 2106 to 2199, 3000 to 3005, 3007 to 3099, 9400 to 9409, 9411 to 9499.
I started with this query, but it's obviously not returning what I need:
SELECT Level+(2000-1) FROM dual CONNECT BY LEVEL <= 9999
MINUS SELECT a FROM test1;
You can use a hierarchical query as follows:
SQL> SELECT A FROM (
SELECT A + COLUMN_VALUE - 1 AS A
FROM ( SELECT DISTINCT TRUNC(A, - 2) A
FROM TEST_TABLE) T
CROSS JOIN TABLE ( CAST(MULTISET(
SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= 100
) AS SYS.ODCINUMBERLIST) ) LEVELS
)
MINUS
SELECT A FROM TEST_TABLE;
A
----------
2000
2003
2005
2006
2007
2008
2009
.....
.....
I like to use standard recursive queries for this.
with nums (a, max_a) as (
select min(a), max(a) from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
The with clause takes the minimum and maximum value of a in the table, and generates all numbers in between. Then, the outer query filters on those that do not exist in the table.
If you want to generate ranges of missing numbers instead of a comprehensive list, you can use window functions instead:
select a + 1 start_a, lead_a - 1 end_a
from (
select a, lead(a) over(order by a) lead_a
from test1
) t
where lead_a <> a + 1
Demo on DB Fiddle
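For readers who want to check the LEAD logic by hand, here is a small Python sketch of the same comparison: each value is paired with the next one, and a range is reported whenever the two are not consecutive. The name missing_ranges is hypothetical, and the input is assumed to be a plain list of integers.

```python
def missing_ranges(values):
    """Return (start, end) of every gap between consecutive stored values."""
    ordered = sorted(set(values))
    gaps = []
    # zip(ordered, ordered[1:]) plays the role of LEAD(a) OVER (ORDER BY a).
    for current, following in zip(ordered, ordered[1:]):
        if following != current + 1:
            gaps.append((current + 1, following - 1))
    return gaps


print(missing_ranges([2001, 2002, 2004, 2105, 3006, 9410]))
```

Note that this only reports gaps between the smallest and largest stored value; it does not apply the by-hundreds rule from the question.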
EDIT:
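If you want the missing values within ranges of hundreds, then we can slightly adapt the recursive solution. The upper bound of each block is the block start plus 99, so that a block such as 3000 stops at 3099 and does not spill into 3100:
with nums (a, max_a) as (
select distinct floor(a / 100) * 100 a, floor(a / 100) * 100 + 99 from test1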
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
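As a cross-check of the by-hundreds rule, here is a minimal Python sketch, assuming each block is the 100 values from N00 to N99 and that a block counts only if at least one stored value falls inside it. The function name missing_by_hundreds is invented for this example.

```python
def missing_by_hundreds(values):
    """List every absent number inside each 100-wide block that has a value."""
    present = set(values)
    # Block start = value rounded down to a multiple of 100.
    block_starts = sorted({v // 100 * 100 for v in present})
    return [
        n
        for start in block_starts
        for n in range(start, start + 100)  # start .. start + 99
        if n not in present
    ]


result = missing_by_hundreds([2001, 2002, 2004, 2105, 3006, 9410])
print(result[:5], len(result))
```

With the six sample rows there are four blocks of 100 numbers, so 394 missing values come back.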
Assuming you define fixed lower and upper bounds for the range, you just need to eliminate the existing values from the generated numbers with NOT EXISTS, such as:
SQL> exec :min_val:=2000
SQL> exec :max_val:=2499
SQL> SELECT *
FROM
(
SELECT level + :min_val - 1 AS nr
FROM dual
CONNECT BY level <= :max_val - :min_val + 1
)
WHERE NOT EXISTS ( SELECT * FROM test1 WHERE a = nr )
ORDER BY nr;
/
Demo

How to compare range of integer values in PL/SQL?

I am trying to compare a range of integer values between a test table and a reference table. If any range of values from the test table overlaps with the available ranges in the reference table, it should be deleted.
Sorry if it's not clear, but here is some example data:
TEST_TABLE:
MIN MAX
10 121
122 648
1200 1599
REFERENCE_TABLE:
MIN MAX
50 106
200 1400
1450 1500
MODIFIED TEST_TABLE: (expected result after running PL/SQL)
MIN MAX
10 49
107 121
122 199
1401 1449
1501 1599
In the first row of the example above, 10-121 has been cut into two rows, 10-49 and 107-121, because the values 50, 51, ..., 106 are covered by the first row of the reference_table (50-106); and so on.
Here's what I've written so far with nested loops. I've created two additional temp tables that would store all the values that would be found in the reference table. Then it would create new sets of ranges to be inserted to test_table.
But this does not seem to work correctly and might cause performance issues especially if we're dealing with values of millions and above:
CREATE TABLE new_table (num_value NUMBER);
CREATE TABLE new_table_next (num_value NUMBER, next_value NUMBER);
-- PL/SQL start
DECLARE
l_count NUMBER;
l_now_min NUMBER;
l_now_max NUMBER;
l_final_min NUMBER;
l_final_max NUMBER;
BEGIN
FOR now IN (SELECT min_num, max_num FROM test_table) LOOP
l_now_min:=now.min_num;
l_now_max:=now.max_num;
WHILE (l_now_min < l_now_max) LOOP
SELECT COUNT(*) -- to check if number is found in reference table
INTO l_count
FROM reference_table refr
WHERE l_now_min >= refr.min_num
AND l_now_min <= refr.max_num;
IF l_count > 0 THEN
INSERT INTO new_table (num_value) VALUES (l_now_min);
COMMIT;
END IF;
l_now_min:=l_now_min+1;
END LOOP;
INSERT INTO new_table_next (num_value, next_value)
VALUES (SELECT num_value, (SELECT MIN (num_value) FROM new_table t2 WHERE t2.num_value > t.num_value) AS next_value FROM new_table t);
DELETE FROM test_table t
WHERE now.min_num = t.min_num
AND now.max_num = t.max_num;
COMMIT;
SELECT (num_value + 1) INTO l_final_min FROM new_table_next;
SELECT (next_value - num_value - 2) INTO l_final_max FROM new_table_next;
INSERT INTO test_table (min_num, max_num)
VALUES (l_final_min, l_final_max);
COMMIT;
DELETE FROM new_table;
DELETE FROM new_table_next;
COMMIT;
END LOOP;
END;
/
Please help, I'm stuck. :)
The idea behind this approach is to unwind both tables, keeping track of whether the numbers are in the reference table or the original table. This is really cumbersome, because adjacent values can cause problems.
The idea is then to do a "gaps-and-islands" type solution along both dimensions -- and then only keep the values that are in the original table and not in the second. Perhaps this could be called "exclusionary gaps-and-islands".
Here is a working version:
with vals as (
select min as x, 1 as inc, 0 as is_ref
from test_table
union all
select max + 1, -1 as inc, 0 as is_ref
from test_table
union all
select min as x, 0, 1 as is_ref
from reference_table
union all
select max + 1 as x, 0, -1 as is_ref
from reference_table
)
select min, max
from (select refgrp, incgrp, ref, inc2, min(x) as min, (lead(min(x), 1, max(x) + 1) over (order by min(x)) - 1) as max
from (select v.*,
row_number() over (order by x) - row_number() over (partition by ref order by x) as refgrp,
row_number() over (order by x) - row_number() over (partition by inc2 order by x) as incgrp
from (select v.*, sum(is_ref) over (order by x, inc) as ref,
sum(inc) over (order by x, inc) as inc2
from vals v
) v
) v
group by refgrp, incgrp, ref, inc2
) v
where ref = 0 and inc2 = 1 and min < max
order by min;
And here is a db<>fiddle.
The inverse problem of getting the overlaps is much easier. It might be feasible to "invert" the reference table to handle this.
select greatest(tt.min, rt.min), least(tt.max, rt.max)
from test_table tt join
reference_table rt
on tt.min <= rt.max and tt.max >= rt.min -- is there an overlap? (both bounds are inclusive)
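A quick way to see what the overlap join returns is to run the same GREATEST/LEAST logic in Python. This is only a sketch: the function name overlaps is made up, ranges are (min, max) tuples, and both bounds are treated as inclusive.

```python
def overlaps(test_ranges, ref_ranges):
    """Return the intersection of every overlapping (test, reference) pair."""
    found = []
    for t_min, t_max in test_ranges:
        for r_min, r_max in ref_ranges:
            # Two inclusive ranges overlap when each starts before the other ends.
            if t_min <= r_max and t_max >= r_min:
                found.append((max(t_min, r_min), min(t_max, r_max)))
    return found


test_table = [(10, 121), (122, 648), (1200, 1599)]
reference_table = [(50, 106), (200, 1400), (1450, 1500)]
print(overlaps(test_table, reference_table))
```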
This is modified from a similar task (using dates instead of numbers) I did on Teradata, it's based on the same base data as Gordon's (all begin/end values combined in a single list), but uses a simpler logic:
WITH minmax AS
( -- create a list of all existing start/end values (possible to simplify using Unpivot or Cross Apply)
SELECT Min AS val, -1 AS prio, 1 AS flag -- main table, range start
FROM test_table
UNION ALL
SELECT Max+1, -1, -1 -- main table, range end
FROM test_table
UNION ALL
SELECT Min, 1, 1 -- reference table, adjusted range start
FROM reference_table
UNION ALL
SELECT Max+1, 1, -1 -- reference table, adjusted range end
FROM reference_table
)
, all_ranges AS
( -- create all ranges from current to next row
SELECT minmax.*,
Lead(val) Over (ORDER BY val, prio desc, flag) AS next_val, -- next value = end of range
Sum(flag) Over (ORDER BY val, prio desc, flag ROWS Unbounded Preceding) AS Cnt -- how many overlapping periods exist
FROM minmax
)
SELECT val, next_val-1
FROM all_ranges
WHERE Cnt = 1 -- 1st level only
AND prio + flag = 0 -- either (prio -1 and flag 1) = range start in base table
-- or (prio 1 and flag -1) = range end in ref table
ORDER BY 1
See db-fiddle
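The combined start/end list used by both queries above can be hard to picture, so here is a rough Python sketch of the same sweep. It is not a translation of either query: it keeps two running counters (inside a test range, inside a reference range) and emits a segment whenever only the test counter is positive. The name subtract_ranges is hypothetical, and the sketch assumes inclusive integer bounds and no overlaps within a single table.

```python
def subtract_ranges(test_ranges, ref_ranges):
    """Return the parts of test_ranges that no reference range covers."""
    events = []  # (position, change in test depth, change in reference depth)
    for lo, hi in test_ranges:
        events.append((lo, 1, 0))
        events.append((hi + 1, -1, 0))  # a range ends just after its max
    for lo, hi in ref_ranges:
        events.append((lo, 0, 1))
        events.append((hi + 1, 0, -1))
    events.sort(key=lambda e: e[0])

    result = []
    in_test = in_ref = 0
    i = 0
    while i < len(events):
        position = events[i][0]
        # Apply every event that happens at this position before deciding.
        while i < len(events) and events[i][0] == position:
            in_test += events[i][1]
            in_ref += events[i][2]
            i += 1
        # The segment runs from this position up to just before the next event.
        if i < len(events) and in_test > 0 and in_ref == 0:
            result.append((position, events[i][0] - 1))
    return result


test_table = [(10, 121), (122, 648), (1200, 1599)]
reference_table = [(50, 106), (200, 1400), (1450, 1500)]
print(subtract_ranges(test_table, reference_table))
```

On the sample data this gives the five rows of the expected MODIFIED TEST_TABLE.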
Here's one way to do this. I put the test data in a WITH clause rather than creating the tables (I find this is easier for testing purposes). I used your column names (MIN and MAX); these are very poor choices though, as MIN and MAX are Oracle keywords. They will generate confusion for sure, and they may cause queries to error out.
The strategy is simple - first take the COMPLEMENT of the ranges in REFERENCE_TABLE, which will also be a union of intervals (using NULL as marker for minus infinity and plus infinity); then take the intersection of each interval in TEST_TABLE with each interval in the complement of REFERENCE_TABLE. How that is done is shown in the final (outer) query in the solution below.
with
test_table (min, max) as (
select 10, 121 from dual union all
select 122, 648 from dual union all
select 1200, 1599 from dual
)
, reference_table (min, max) as (
select 50, 106 from dual union all
select 200, 1400 from dual union all
select 1450, 1500 from dual
)
,
prep (min, max) as (
select lag(max) over (order by max) + 1 as min
, min - 1 as max
from ( select min, max from reference_table
union all
select null, null from dual
)
)
select greatest(t.min, nvl(p.min, t.min)) as min
, least (t.max, nvl(p.max, t.max)) as max
from test_table t inner join prep p
on t.min <= nvl(p.max, t.max)
and t.max >= nvl(p.min, t.min)
order by min
;
MIN MAX
---------- ----------
10 49
107 121
122 199
1401 1449
1501 1599
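The complement-then-intersect strategy can also be sketched in a few lines of Python, with None standing in for minus and plus infinity the way NULL does in the query. The function names complement and subtract are made up for illustration, and the sketch assumes the reference ranges do not overlap each other.

```python
def complement(ref_ranges):
    """Return the gaps around and between the reference ranges."""
    gaps = []
    previous_max = None  # None = minus infinity before the first range
    for lo, hi in sorted(ref_ranges):
        start = None if previous_max is None else previous_max + 1
        gaps.append((start, lo - 1))
        previous_max = hi
    # Final gap runs from after the last range to plus infinity (None).
    gaps.append((None if previous_max is None else previous_max + 1, None))
    return gaps


def subtract(test_ranges, ref_ranges):
    """Intersect every test range with every gap of the reference ranges."""
    gaps = complement(ref_ranges)
    result = []
    for t_min, t_max in test_ranges:
        for g_min, g_max in gaps:
            lo = t_min if g_min is None else max(t_min, g_min)
            hi = t_max if g_max is None else min(t_max, g_max)
            if lo <= hi:  # keep only non-empty intersections
                result.append((lo, hi))
    return sorted(result)


test_table = [(10, 121), (122, 648), (1200, 1599)]
reference_table = [(50, 106), (200, 1400), (1450, 1500)]
print(subtract(test_table, reference_table))
```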
Example to resolve the problem:
CREATE TABLE xrange_reception
(
vdeb NUMBER,
vfin NUMBER
);
CREATE TABLE xrange_transfert
(
vdeb NUMBER,
vfin NUMBER
);
CREATE TABLE xrange_resultat
(
vdeb NUMBER,
vfin NUMBER
);
insert into xrange_reception values (10,50);
insert into xrange_transfert values (15,25);
insert into xrange_transfert values (30,33);
insert into xrange_transfert values (40,45);
DECLARE
CURSOR cr_rec IS SELECT * FROM xrange_reception;
CURSOR cr_tra IS
SELECT *
FROM xrange_transfert
ORDER BY vdeb;
i NUMBER;
vdebSui NUMBER;
BEGIN
FOR rc IN cr_rec
LOOP
i := 1;
vdebSui := NULL;
FOR tr IN cr_tra
LOOP
IF tr.vdeb BETWEEN rc.vdeb AND rc.vfin
THEN
IF i = 1 AND tr.vdeb > rc.vdeb
THEN
INSERT INTO xrange_resultat (vdeb, vfin)
VALUES (rc.vdeb, tr.vdeb - 1);
ELSIF i = cr_rec%ROWCOUNT AND tr.vfin < rc.vfin
THEN
INSERT INTO xrange_resultat (vdeb, vfin)
VALUES (tr.vfin, rc.vfin);
ELSIF vdebSui < tr.vdeb
THEN
INSERT INTO xrange_resultat (vdeb, vfin)
VALUES (vdebSui + 1, tr.vdeb - 1);
END IF;
vdebSui := tr.vfin;
i := i + 1;
END IF;
END LOOP;
IF vdebSui IS NOT NULL THEN
IF vdebSui < rc.vfin
THEN
INSERT INTO xrange_resultat (vdeb, vfin)
VALUES (vdebSui + 1, rc.vfin);
END IF;
ELSE
INSERT INTO xrange_resultat (vdeb, vfin)
VALUES (rc.vdeb, rc.vfin);
END IF;
END LOOP;
END;
So:
Table xrange_reception:
vdeb vfin
10 50
Table xrange_transfert:
vdeb vfin
15 25
30 33
40 45
Table xrange_resultat:
vdeb vfin
10 14
26 29
34 39
46 50

SQL : how to find gaps in a specific range of numbers?

I want to find gaps in a numeric column (not the entire column), but in a specific range.
For example :
My column :
1
2
5
6
8
10
18
19
20
I want to specify a specific range in which my SQL query would look for gaps. For example, I want gaps in the range [15,20]. In this example, the gaps are : 15,16,17.
I built a query that retrieves gaps, but it misses gaps that sit at the beginning of my range:
SELECT cur_value + 1 AS start_gap, next_value - 1 AS end_gap
FROM (
SELECT col AS cur_value, LEAD (col) OVER (ORDER BY col) AS next_value
FROM table
--WHERE col BETWEEN 200 AND 300
)
WHERE next_value - cur_value > 1
ORDER BY start_gap;
How can I do that?
N.B.: Performance is very important in my case. I deal with tons of rows.
I think the simplest method might be:
SELECT cur_value + 1 AS start_gap, next_value - 1 AS end_gap
FROM (SELECT col AS cur_value, LEAD (col) OVER (ORDER BY col) AS next_value
FROM (SELECT col
FROM table
WHERE col BETWEEN 200 AND 300
UNION ALL
SELECT 200-1 FROM DUAL
UNION ALL
SELECT 300+1 FROM DUAL
) t
) t
WHERE next_value - cur_value > 1
ORDER BY start_gap;
Note: This will work on arbitrarily long ranges.
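The trick in the query above is the two sentinel rows, one just below and one just above the range, which let the LEAD comparison see gaps at either edge. Here is a minimal Python sketch of that idea; the function name gaps_in_range and its parameters are hypothetical.

```python
def gaps_in_range(values, lo, hi):
    """Return (start_gap, end_gap) pairs for the inclusive range lo..hi."""
    # Keep only values inside the range, then add the sentinels lo-1 and hi+1.
    points = sorted({v for v in values if lo <= v <= hi} | {lo - 1, hi + 1})
    return [
        (current + 1, following - 1)
        for current, following in zip(points, points[1:])
        if following - current > 1
    ]


column = [1, 2, 5, 6, 8, 10, 18, 19, 20]
print(gaps_in_range(column, 15, 20))
```

For the sample column and the range [15,20], the single gap 15 to 17 comes back, which is the case the original query missed.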
You can simply use hierarchical connect by to produce the list of numbers in the given range and then check to see if the value doesn't exist in the table.
Assuming your range is 15 to 20, use NOT IN:
select * from (
select level - 1 + 15 col
from dual
connect by level <= 20 - 15 + 1 -- + 1 so that the upper bound 20 is generated too
) where col not in (select col from your_table);
Demo
Similarly, NOT EXISTS:
select * from (
select level - 1 + 15 col
from dual
connect by level <= 20 - 15 + 1 -- + 1 so that the upper bound 20 is generated too
) t where not exists (select 1 from your_table where col = t.col);
Demo 2

Fill in missing dates in date range from a table

table A
no date count
1 20160401 1
1 20160403 4
2 20160407 3
result
no date count
1 20160401 1
1 20160402 0
1 20160403 4
1 20160404 0
.
.
.
2 20160405 0
2 20160406 0
2 20160407 3
.
.
.
I'm using Oracle and I want to write a query that returns rows for every date within a range based on table A.
Is there some function in Oracle that can help me?
You can use a sequence.
First create the sequence:
CREATE SEQUENCE seq_name START WITH 20160401 MAXVALUE n;
where n is the maximum value you want to display.
Then use this SQL:
select seq_name.nextval,case when seq_name.nextval = date then count else 0 end from tableA;
Note: it's better not to use date and count as column names.
Try this:
with
A as (
select 1 no, to_date('20160401', 'yyyymmdd') dat, 1 cnt from dual union all
select 1 no, to_date('20160403', 'yyyymmdd') dat, 4 cnt from dual union all
select 2 no, to_date('20160407', 'yyyymmdd') dat, 3 cnt from dual),
B as (select min(dat) mindat, max(dat) maxdat from A t),
C as (select level + mindat - 1 dat from B connect by level + mindat - 1 <= maxdat),
D as (select distinct no from A),
E as (select * from D,C)
select E.no, E.dat, nvl(cnt, 0) cnt
from E
full outer join A on A.no = E.no and A.dat = E.dat
order by 1, 2, 3
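To show what the CTEs above build step by step, here is a small Python sketch of the same plan: find the overall first and last date, generate every day in between, cross it with the distinct no values, and default the count to 0 where table A has no row. The function name fill_dates and the tuple layout (no, date, count) are assumptions made for this example.

```python
from datetime import date, timedelta


def fill_dates(rows):
    """rows: list of (no, date, count). Return one row per no and calendar day."""
    counts = {(no, day): cnt for no, day, cnt in rows}
    first = min(day for _, day, _ in rows)
    last = max(day for _, day, _ in rows)
    # Every calendar day from first to last, inclusive (CTE C above).
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    numbers = sorted({no for no, _, _ in rows})  # CTE D above
    # Cross join of numbers and days, with 0 for missing combinations.
    return [(no, day, counts.get((no, day), 0)) for no in numbers for day in days]


table_a = [
    (1, date(2016, 4, 1), 1),
    (1, date(2016, 4, 3), 4),
    (2, date(2016, 4, 7), 3),
]
for row in fill_dates(table_a):
    print(row)
```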
This isn't an Oracle-specific answer; you'll need to translate it to Oracle yourself.
Create an intervals table, containing all integers from 0 to 999. Something like this:
CREATE TABLE intervals (days int);
INSERT INTO intervals (days) VALUES (0), (1);
DECLARE @rc int;
SELECT @rc = 2;
WHILE (SELECT Count(*) FROM intervals) < 1000 BEGIN
INSERT INTO intervals (days) SELECT days + @rc FROM intervals WHERE days + @rc < 1000;
SELECT @rc = @rc * 2
END;
Then all the dates in the range can be identified by adding intervals.days to the first date you've got, where the first date + intervals.days is before the end date and the resulting date is not already in the table. Do this by cross joining intervals to your own table. Something like this (T-SQL syntax, so again you'll need to translate):
SELECT DateAdd(day, i.days, a.first_date)
FROM (SELECT Min(date) AS first_date FROM table_A) a, intervals i
WHERE DateAdd(day, i.days, a.first_date) < (SELECT Max(date) FROM table_A)
AND NOT EXISTS (SELECT 1 FROM table_A aa WHERE aa.date = DateAdd(day, i.days, a.first_date))
Hope this gives you a starting point

MYSQL query to get 'n' rows nearby given row

I have a MySQL table named 'videos', where one of the columns is 'cat' (INT) and 'id' is the PRIMARY KEY.
So, if 'x' is the row number and 'n' is the category id, I need to get the 15 nearby rows.
Case 1: There are many rows in the category before and after 'x'. Just get 7 rows each before and after 'x':
SELECT * FROM videos WHERE cat=n AND id<x ORDER BY id DESC LIMIT 0,7
SELECT * FROM videos WHERE cat=n AND id>x LIMIT 0,7
Case 2: If 'x' is near the beginning/end of the table, print all the rows (say 'y' rows) before/after 'x', and then print 15-y rows after/before 'x'.
Case 1 is not a problem, but I am stuck on Case 2. Is there a generic method to get 'p' rows near a row 'x'?
This query will always position N (exact id match) at the centre of the data, unless there are no more rows (in either direction), in which case rows will be added from the prior/next sections as required, while still preserving data from prior/next (as much as available).
set @n := 28;
SELECT * FROM
(
SELECT * FROM
(
(SELECT v.*, 0 as prox FROM videos v WHERE cat=1 AND id = @n)
union all
(SELECT v.*, @rn1:=@rn1+1 FROM (select @rn1:=0) x, videos v WHERE cat=1 AND id < @n ORDER BY id DESC LIMIT 15)
union all
(SELECT v.*, @rn2:=@rn2+1 FROM (select @rn2:=0) y, videos v WHERE cat=1 AND id > @n ORDER BY id LIMIT 15)
) z
ORDER BY prox
LIMIT 15
) w
order by id
For example, if you had 30 ids for cat=1, and you were looking at item #28, it will show items 16 through 30, #28 is the 3rd row from the bottom.
Some explanation:
SELECT v.*, 0 as prox FROM videos v WHERE cat=1 AND id = @n
v.* means to select all columns in the table/alias v. In this case, v is the alias for the table videos.
0 as prox means to create a column named prox, and it will contain just the value 0
The next query:
SELECT v.*, @rn1:=@rn1+1 FROM (select @rn1:=0) x, videos v WHERE cat=1 AND id < @n ORDER BY id DESC LIMIT 15
v.* - as above
@rn1:=@rn1+1 uses a variable to return a sequence number for each record in this subquery. It starts with 1 and for each record, following the ORDER BY id DESC, it will be numbered 2, then 3 etc.
(select @rn1:=0) x This creates a subquery aliased as x; all it does is initialize the variable @rn1 to 0, so that the first row is numbered 1.
The end result is that the variable and 0 as prox rank each row based on how close it is to the value @n. The clause order by prox limit 15 takes the 15 rows that are closest to @n.
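The ranking idea can be tried out without a database. The following Python sketch mimics the query on a plain list of ids: the matching row gets proximity 0, rows before and after it are numbered 1, 2, 3 and so on moving outward, the closest rows are kept, and the result is sorted by id again. The function name nearby_rows and the total parameter are made up for this illustration.

```python
def nearby_rows(ids, x, total=15):
    """Return up to `total` ids closest to x by position, in id order."""
    ordered = sorted(ids)
    before = [i for i in ordered if i < x][::-1][:total]  # nearest first
    after = [i for i in ordered if i > x][:total]         # nearest first
    ranked = [(0, i) for i in ordered if i == x]          # 0 as prox
    ranked += [(prox, i) for prox, i in enumerate(before, start=1)]
    ranked += [(prox, i) for prox, i in enumerate(after, start=1)]
    ranked.sort(key=lambda pair: pair[0])                 # ORDER BY prox
    return sorted(i for _, i in ranked[:total])           # LIMIT, then ORDER BY id


# 30 ids in the category, looking at id 28.
print(nearby_rows(range(1, 31), 28))
```

With 30 ids and x = 28 only two rows exist after x, so the remaining slots are filled from the rows before it, giving ids 16 through 30.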