SQL to split result in chunks

I need help writing an Oracle SQL query to achieve the following.
Let's say I have a query that returns about 110,000 sorted unique numeric values. They are not necessarily 1 to 110,000; they could be any unique numbers and need not be consecutive. I would like to split them into chunks of 25,000 each, with the last chunk holding the rest (10,000 in this example), and get the min and max of each chunk.
Thanks in advance.
John T.
For this example, I expected to have 5 chunks and the min and max values of each chunk.
Let's ASSUME these numbers are from 1 to 110,000:
Chunk  Min      Max
1      1        25,000
2      25,001   50,000
3      50,001   75,000
4      75,001   100,000
5      100,001  110,000

For example:
with tbl as (
  /* sample data */
  select round(dbms_random.value() * 1000000) n
  from dual
  connect by level <= 110000
)
select chunk_no, count(*) cnt, min(n), max(n)
from (
  select n, floor((row_number() over(order by n) - 1) / 25000) chunk_no
  from tbl
)
group by chunk_no
order by chunk_no
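To make the chunking logic easier to check outside the database, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the answer's SQL: the function name `chunk_ranges` is hypothetical, the input is assumed to be already sorted ascending, and chunks are numbered from 1 to match the table in the question (the SQL above numbers them from 0).

```python
from itertools import islice

def chunk_ranges(sorted_values, chunk_size=25000):
    """Return (chunk_no, count, min, max) for consecutive chunks of sorted values."""
    it = iter(sorted_values)
    result = []
    chunk_no = 1
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        # Input is assumed sorted ascending, so the first and last items
        # of each chunk are its min and max.
        result.append((chunk_no, len(chunk), chunk[0], chunk[-1]))
        chunk_no += 1
    return result

# The question's simplified case: the numbers 1 to 110,000.
rows = chunk_ranges(range(1, 110001))
for row in rows:
    print(row)
```

With the simplified input this yields four full chunks and a final chunk of 10,000 values, as in the expected table.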

Related

Progressive Select Query in Oracle Database

I want to write a select query that selects distinct rows of data progressively.
Explaining with an example,
Say I have 5000 accounts selected for loan repayment, ordered by outstanding amount in descending order (account 1 has the highest outstanding amount and account 5000 the lowest).
I want to select 1000 unique accounts 5 times such that the total outstanding amount is similar in all 5 cases.
I have tried a few methods, such as selecting rownums based on odd/even, but they only work for up to 2 distributions. I was expecting something more like an A.P. (arithmetic progression) in maths that selects data progressively.
A naïve method of splitting sets into (for example) 5 bins, numbered 0 to 4, is to give each row a unique sequential numeric index and then, in order of size, assign the first 10 rows to bins 0,1,2,3,4,4,3,2,1,0, repeating the pattern for each additional set of 10 rows:
WITH indexed_values (value, rn) AS (
SELECT value,
ROW_NUMBER() OVER (ORDER BY value DESC) - 1
FROM table_name
),
assign_bins (value, rn, bin) AS (
SELECT value,
rn,
CASE WHEN MOD(rn, 2 * 5) >= 5
THEN 5 - MOD(rn, 5) - 1
ELSE MOD(rn, 5)
END
FROM indexed_values
)
SELECT bin,
COUNT(*) AS num_values,
SUM(value) AS bin_size
FROM assign_bins
GROUP BY bin
Which, for some random data:
CREATE TABLE table_name ( value ) AS
SELECT FLOOR(DBMS_RANDOM.VALUE(1, 1000001)) FROM DUAL CONNECT BY LEVEL <= 1000;
May output:
BIN  NUM_VALUES  BIN_SIZE
0    200         100012502
1    200         100004633
2    200         99980342
3    200         99976774
4    200         100005756
It will not get the bins to have equal values but it is relatively simple and will get a close approximation if your values are approximately evenly distributed.
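The back-and-forth bin assignment can be checked in isolation with a short Python sketch. This mirrors the CASE expression from the query; the names `assign_bin` and `bin_totals` are hypothetical helpers introduced only for illustration.

```python
def assign_bin(rn, bins=5):
    """Mirror the CASE expression: 0..bins-1, then bins-1..0, repeating."""
    if rn % (2 * bins) >= bins:
        return bins - rn % bins - 1
    return rn % bins

def bin_totals(values, bins=5):
    """Sort values descending, assign each to a bin, return (counts, totals)."""
    counts = [0] * bins
    totals = [0] * bins
    for rn, value in enumerate(sorted(values, reverse=True)):
        b = assign_bin(rn, bins)
        counts[b] += 1
        totals[b] += value
    return counts, totals

# Bin sequence for the first ten rows.
pattern = [assign_bin(rn) for rn in range(10)]
print(pattern)

# For perfectly evenly spaced values the bins come out exactly equal.
counts, totals = bin_totals(range(1, 21))
print(counts, totals)
```

For evenly spaced input such as 1 to 20 the totals match exactly; for real data they are only approximately equal, as the answer notes.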
If you want to select values from a certain bin then:
WITH indexed_values (value, rn) AS (
SELECT value,
ROW_NUMBER() OVER (ORDER BY value DESC) - 1
FROM table_name
),
assign_bins (value, rn, bin) AS (
SELECT value,
rn,
CASE WHEN MOD(rn, 2 * 5) >= 5
THEN 5 - MOD(rn, 5) - 1
ELSE MOD(rn, 5)
END
FROM indexed_values
)
SELECT value
FROM assign_bins
WHERE bin = 0

Oracle SQL - SUM aggregate not working in LEVEL

I have this query, which produces output when using count but no output when using sum:
select to_char(counter)
from (select level counter from dual connect by level <= 1000000)
where counter =
(select sum(a.amount) from table a)
I'm wondering if it's because the query only supports whole-number outputs? I am expecting outputs with decimals.
I am using this as a table value set in Oracle HCM, if anyone's familiar with that. It's why I don't have the aggregate in the SELECT statement as it doesn't support it.
Are you sure your subquery is returning rows? SUM of an empty set is null, not zero, e.g.
SQL> select to_char(counter)
from ( select level counter from dual connect by level <= 1000000 )
where counter =
(select sum(deptno) from scott.emp);
TO_CHAR(COUNTER)
----------------------------------------
310
SQL> select to_char(counter)
from ( select level counter from dual connect by level <= 1000000 )
where counter =
(select sum(deptno) from scott.emp where 1=0);
no rows selected
Your query says: Create all integers from 1 to 1,000,000 and of these show me the one that matches exactly the total amount of the table. So if that total is 123.45, you will get no row. If that amount is -123, you will get no row. If that amount is 1,000,001 you will get no row.
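The cases in which the query returns no row can be summarised in a small Python sketch. This only models the comparison the query performs; `matching_counter` is a hypothetical name, and `None` stands in for SQL NULL.

```python
from decimal import Decimal

def matching_counter(total, limit=1_000_000):
    """Return the generated integer equal to total, or None if no row would match."""
    if total is None:
        # SUM over an empty set is NULL, and NULL never equals any counter.
        return None
    if total != int(total):
        # A fractional total never equals an integer counter.
        return None
    if not 1 <= total <= limit:
        # Only the integers 1..limit are generated.
        return None
    return int(total)

print(matching_counter(310))
print(matching_counter(Decimal("123.45")))
```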
If you simply want the total, but you can have aggregations only in subqueries for some weird reason, just do
select * from (select sum(amount) as total from table) t;

sum with a specific condition in select

I have a number, for example 1000 (1).
I have a query that returns different numbers in no particular order (2), for example: 100, 300, 1000, 400, 500, 600.
I want to write a query (not a loop) that sums the numbers in (2) until the sum falls in the range (1000 - 300, 1000 + 300), i.e. (700, 1300).
For example, 100+300+400 could be an answer, or 400+500, or ...
P.S.: the first sequence of numbers whose sum is in that range is an answer.
Not sure if I understood your question fully, but you may be able to achieve this using the windowing clause of analytic functions.
I created a sample table number_list with the values you'd provided. Assuming (2) to be the output from below query ..
SQL> select * from number_list;
VALUE
----------
100
300
1000
400
500
600
6 rows selected.
.. you now need the first list of numbers whose sum falls within a certain range, i.e. between (1000 - 300) and (1000 + 300) ..
SQL> with sorted_list as
(
select rownum rnum, value from
( select value from number_list order by value ) -- sort values ascending
)
select value from sorted_list where rnum <= (
select min(rnum) from ( -- determine first row of the sorted list whose running sum falls in the specified range
select rnum, value,
sum(value) over ( order by rnum -- accumulate in sorted order so the result is deterministic
rows between
unbounded preceding -- indicate that the window starts at the first row
and current row -- indicate that the window ends at the current row
) running_sum
from sorted_list
) where running_sum between (1000-300) and (1000+300)
);
VALUE
----------
100
300
400
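The running-sum idea behind the windowing clause can be sketched in Python as follows. This is an illustration under the same assumption the answer makes (values are summed in ascending order); the name `first_prefix_in_range` is hypothetical.

```python
from itertools import accumulate

def first_prefix_in_range(values, target=1000, tolerance=300):
    """Return the shortest ascending prefix whose sum lies in
    [target - tolerance, target + tolerance], or [] if none does."""
    ordered = sorted(values)
    low, high = target - tolerance, target + tolerance
    # accumulate() yields the running sum, like SUM(...) OVER (ROWS BETWEEN
    # UNBOUNDED PRECEDING AND CURRENT ROW).
    for count, running_sum in enumerate(accumulate(ordered), start=1):
        if low <= running_sum <= high:
            return ordered[:count]
    return []

result = first_prefix_in_range([100, 300, 1000, 400, 500, 600])
print(result)
```

Like the query, this only considers prefixes of the sorted list, so it would not find a combination such as 400+500.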

SQLServer make summary of N records from set of M records

I'm using sql server and I want to achieve the following:
I have M number of records in the database and I want to take N records summarising the M records.
Here is an example:
M = 1000;
N = 100;
I want to take 100 records from those 1000 but the 100 records should
be from 0 to 1000 (e.g. 0th record, 10th record, 20th record, 30th
record... 990 record, 1000 record) so I can draw a chart for them.
What is the best way to achieve this with a SQL query?
Thank you in advance.
You can do this with ROW_NUMBER() and modulus division:
SELECT *
FROM (SELECT *,ROW_NUMBER() OVER(ORDER BY Col1) RowNum
FROM YourTable
)sub
WHERE RowNum <= 1000 --M if it's a variable
AND RowNum % (1000/100) = 0 -- M/N if they're variables
If you already have a counter, like an INTEGER ID field, it's even easier:
SELECT *
FROM YourTable
WHERE ID <= 1000
AND ID % (1000/100) = 0
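The modulus filter can be tried out in a few lines of Python. This is a sketch that assumes N divides M evenly (as with 1000 and 100) and numbers rows from 1, like ROW_NUMBER(); `sample_rows` is a hypothetical name.

```python
def sample_rows(rows, m=1000, n=100):
    """Keep every (m // n)-th row of the first m rows, numbering rows from 1."""
    step = m // n  # integer division, like 1000/100 on integers in SQL Server
    first_m = list(rows)[:m]
    return [row for row_num, row in enumerate(first_m, start=1)
            if row_num % step == 0]

sample = sample_rows(range(1, 1001))
print(len(sample), sample[:3], sample[-1])
```

Because row numbers start at 1, the sample contains rows 10, 20, ..., 1000 and there is no "0th" row.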

How to select similar numbers using SQL?

Here are some example numbers:
987
1001
1004
1009
1010
1016
1020
1050
For example, I would like to select the top 4 numbers that are closest to the given number 1009 (so the results would be 1001, 1004, 1010 and 1016). How should I write the SQL expression?
Get the distance from the given number by subtracting and using the abs function:
select top 4 Number
from NumberTable
where number <> 1009
order by abs(Number - 1009)
Edit:
As you now mention that you have a very large table, you would need a way to eliminate most of the rows first. You could pick the four closest in both directions and then get the correct ones from that:
select top 4 Number
from (
  select Number
  from (
    select top 4 Number
    from NumberTable
    where number < 1009
    order by number desc
  ) as lower_candidates
  union all
  select Number
  from (
    select top 4 Number
    from NumberTable
    where number > 1009
    order by number
  ) as upper_candidates
) as candidates
order by abs(Number - 1009)
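The "four closest in each direction" approach can be sketched in Python, with binary search on a sorted list playing the role of an index seek. This is only an illustration; `closest` is a hypothetical helper and the list is assumed to be sorted ascending.

```python
from bisect import bisect_left, bisect_right

def closest(sorted_numbers, target, k=4):
    """Return the k numbers nearest to target, excluding target itself."""
    left = bisect_left(sorted_numbers, target)    # first index with value >= target
    right = bisect_right(sorted_numbers, target)  # first index with value > target
    below = sorted_numbers[max(0, left - k):left]  # up to k values below target
    above = sorted_numbers[right:right + k]        # up to k values above target
    # At most 2*k candidates remain, so sorting by distance is cheap.
    return sorted(below + above, key=lambda n: abs(n - target))[:k]

numbers = [987, 1001, 1004, 1009, 1010, 1016, 1020, 1050]
nearest = closest(numbers, 1009)
print(nearest)
```

Only the candidates on either side of the target are sorted by distance, which is the point of the two-sided query: the expensive ORDER BY ABS(...) runs over at most eight rows instead of the whole table.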
If the numbers are evenly distributed, so that you are sure you can find the numbers within a range such as ±100, you can simply get that range first:
select top 4 Number
from (
  select Number
  from NumberTable
  where number between 1009-100 and 1009+100
) as nearby
where number <> 1009
order by abs(Number - 1009)
SELECT TOP 4 number
FROM your_table
WHERE number <> @numberToMatch
ORDER BY ABS(number - @numberToMatch)
Taking bits from all other answers on this page!
Assuming an index on the number column this should perform well (at least in SQL Server)
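DECLARE @Target int
SET @Target = 1009;
SELECT TOP 4 number
FROM
(
  -- ORDER BY is not allowed directly in a UNION branch, so each TOP 4 is wrapped in a derived table
  SELECT number FROM (
    SELECT TOP 4 number FROM YourTable
    WHERE number < @Target
    ORDER BY number DESC
  ) lo
  UNION ALL
  SELECT number FROM (
    SELECT TOP 4 number FROM YourTable
    WHERE number > @Target
    ORDER BY number ASC
  ) hi
) d
ORDER BY ABS(number - @Target)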
Try this:
DECLARE @Target int
SET @Target = 1009
SELECT number FROM (
  SELECT TOP 2 number FROM YourTable
  WHERE number < @Target
  ORDER BY number DESC
) lo
UNION
SELECT number FROM (
  SELECT TOP 2 number FROM YourTable
  WHERE number > @Target
  ORDER BY number ASC
) hi