Select exact number of rows from dual table - sql

The task is the following: select 20 rows from dual table with randomly generated distinct numbers from 23 to 45.
I performed the following:
select distinct floor(dbms_random.value(23,45)) output
from dual
connect by rownum <= 20;
But it returns a random number of rows, fewer than 20. For example:
OUTPUT
44
35
25
27
40
32
26
36
43
34
31
33
37
13 rows selected.
Please help: how do I select exactly 20 numbers, not fewer? Many thanks in advance!

Use a row generator to generate all the numbers; order them randomly using DBMS_RANDOM.VALUE and then get the first 20 rows:
SELECT OUTPUT
FROM (
SELECT 22 + LEVEL AS OUTPUT
FROM DUAL
CONNECT BY 22 + LEVEL <= 45
ORDER BY DBMS_RANDOM.VALUE
)
WHERE ROWNUM <= 20
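On Oracle 12c or later, the same shuffle-and-limit idea can probably be written without the wrapping subquery by using the row limiting clause. This is a minimal sketch under that version assumption, not a replacement for the query above:

```sql
-- Sketch for Oracle 12c or later: same shuffle, limited with FETCH FIRST.
SELECT 22 + LEVEL AS output        -- generates 23, 24, ..., 45
FROM dual
CONNECT BY LEVEL <= 23             -- 23 candidate values
ORDER BY DBMS_RANDOM.VALUE         -- random shuffle of the candidates
FETCH FIRST 20 ROWS ONLY;          -- keep exactly 20 distinct values
```

Because every candidate value is generated exactly once before shuffling, no DISTINCT is needed and the row count is always 20.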
Why your code does not work:
Your code can, in principle, generate 20 distinct numbers, but it is highly unlikely to. It generates 20 rows of random integers, and the DISTINCT clause then removes the duplicates; since duplicates are very likely, the final number of rows usually falls below 20.
Mathematically, with 23 possible values (23 to 45 inclusive), the first row it generates will be unique; then there is a 22-in-23 chance the second row is unique and, given the previous rows are unique, a 21-in-23 chance the 3rd row is unique and ... a 4-in-23 chance the 20th row is unique. (Strictly, DBMS_RANDOM.VALUE(23,45) is always less than 45, so FLOOR of it yields only the 22 values 23 to 44, which makes the odds even worse.) Multiplying all those probabilities together:
WITH probabilities ( number_of_rows, probability ) AS (
SELECT 1, 1 FROM DUAL
UNION ALL
SELECT number_of_rows + 1, probability * ( 23 - number_of_rows ) / 23
FROM probabilities
WHERE number_of_rows < 20
)
SELECT * FROM probabilities;
This gives a probability of about 0.0000025 that you will generate all 20 rows with your method: possible but improbable.
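The same product also has a closed form, which is a quick way to check the figure. Assuming 23 equally likely values and 20 independent draws:

```latex
\[
P(\text{20 distinct rows})
  = \prod_{k=0}^{19} \frac{23 - k}{23}
  = \frac{23!}{3!\,23^{20}}
  \approx 2.5 \times 10^{-6}
\]
```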

Related

Select X rows of each criteria

I can get a count of records based on some criteria such as length of the data in specific columns.
But it seems I can get first X records (say 20 records) and they could all be the same length.
How do I get 20 records of each length?
SELECT LABEL_ID, DEST, WEIGHT_OZ
FROM MYTABLE
WHERE
LENGTH(LABEL_ID) IN (10,13,24)
AND ROWNUM <= 20;
This returns 20 records of labels of length 10 (since there are more than 20 records of that length). How do I get 20 of length 10, 20 of length 13, 20 of length 24, etc.?
Thanks.
Assisted by a post here
WITH rws AS (
SELECT o.LABEL_ID, o.DEST, o.WEIGHT_OZ,
ROW_NUMBER () OVER (
PARTITION BY LENGTH(LABEL_ID)
ORDER BY SOME_DATE_COLUMN DESC
) rn
FROM MYTABLE o
WHERE LENGTH(LABEL_ID) IN (10,13,24)
)
SELECT LABEL_ID, DEST, WEIGHT_OZ
FROM rws
WHERE rn <= 20
ORDER BY LENGTH(LABEL_ID), rn;

sum with a specific condition in select

I have a number, for example 1000 (1).
I have a query that returns different numbers in no particular order (2), for example: 100, 300, 1000, 400, 500, 600.
I want to write a query (not a loop) that sums the numbers in (2) until the sum falls in the range (1000 - 300, 1000 + 300), i.e. (700, 1300).
For example, 100+300+400 could be an answer, or 400+500, or ...
P.S.: the first sequence of numbers whose sum is in that range is the answer.
Not sure if I understood your question fully, but you may be able to achieve this using the windowing clause of analytic functions.
I created a sample table number_list with the values you'd provided. Assuming (2) to be the output from below query ..
SQL> select * from number_list;
VALUE
----------
100
300
1000
400
500
600
6 rows selected.
.. you now need the first list of numbers whose sum falls within a certain range, i.e. between (1000 - 300) and (1000 + 300) ..
SQL> with sorted_list as
(
select rownum rnum, value from
( select value from number_list order by value ) -- sort values ascending
)
select value from sorted_list where rnum <= (
select min(rnum) from ( -- first position whose running total falls in the specified range
select rnum, value,
sum(value) over ( order by rnum -- accumulate in the sorted order
rows between
unbounded preceding -- the window starts at the first row
and current row -- the window ends at the current row
) running_sum
from sorted_list
) where running_sum between (1000-300) and (1000+300)
);
VALUE
----------
100
300
400

How to return the rows between 20th and 30th in Oracle Sql [duplicate]

This question already has answers here:
Oracle SQL: Filtering by ROWNUM not returning results when it should
(2 answers)
Closed 4 years ago.
I have a large table that I'd like to output; however, I only want to see the rows between 20 and 30.
I tried
select col1, col2
from table
where rownum<= 30 and rownum>= 20;
but SQL gave an error.
I also tried where rownum between 20 and 30;
it did not work either.
So what's the best way to do this?
SELECT *
FROM T
ORDER BY I
OFFSET 20 ROWS --skips 20 rows
FETCH NEXT 10 ROWS ONLY --takes 10 rows
This shows only rows 21 to 30. Take care here that you need to sort the data, otherwise you may get different results every time.
See also here in the documentation.
Addendum: As in the possible duplicate link shown, your problem here is that there can't be a row with number 20 if there is no row with number 19. That's why the rownum-approach works to take only the first x records, but when you need to skip records you need a workaround by selecting the rownum in a subquery or using offset ... fetch
Example of an approach using rownum (for older Oracle versions or whatever):
with testtab as (
select 'a' as "COL1" from dual
union all select 'b' from dual
union all select 'c' from dual
union all select 'd' from dual
union all select 'e' from dual
)
select * from
(select rownum as "ROWNR", testtab.* from testtab) tabWithRownum
where tabWithRownum.ROWNR > 2 and tabWithRownum.ROWNR < 4;
--returns only rownr 3, col1 'c'
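When the rows must come back in a defined order, the ROWNUM approach usually needs one more nesting level so that the ordering happens before ROWNUM is assigned. The following is a sketch only; sample_rows and its columns are hypothetical stand-ins for the real table:

```sql
-- Sketch: rows 20 to 30 of an ordered result using only ROWNUM.
-- sample_rows is a hypothetical stand-in generated from DUAL.
WITH sample_rows AS (
  SELECT LEVEL AS id, 'row ' || LEVEL AS col1
  FROM dual
  CONNECT BY LEVEL <= 50
)
SELECT id, col1
FROM (
  SELECT s.*, ROWNUM AS rn     -- ROWNUM is assigned after the inner ORDER BY
  FROM (
    SELECT id, col1
    FROM sample_rows
    ORDER BY id                -- a deterministic order is needed for stable pages
  ) s
  WHERE ROWNUM <= 30           -- the upper bound can be applied directly
)
WHERE rn >= 20;                -- the lower bound must use the aliased column
```

The upper bound is filtered on ROWNUM itself, while the lower bound is filtered on the alias rn one level further out, where the numbers are already fixed.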
ROWNUM is assigned to rows as they pass the WHERE clause, starting at 1. If you filter on rownum between 20 and 30, the first candidate row gets ROWNUM 1, fails the test and is discarded; the next candidate gets ROWNUM 1 again, and so on, so no row ever reaches 20. You can, however, use WITH (whatever you want to name it) AS to wrap your query and give the rownum column an alias. This way you are selecting from your select. Example:
with T as (
select requestor, request_id, program, rownum as "ROW_NUM"
from fnd_conc_req_summary_v where recalc_parameters='N')
select * from T where row_num between 20 and 30;

Oracle: how to "group by" over a range?

If I have a table like this:
pkey age
---- ---
1 8
2 5
3 12
4 12
5 22
I can "group by" to get a count of each age.
select age,count(*) n from tbl group by age;
age n
--- -
5 1
8 1
12 2
22 1
What query can I use to group by age ranges?
age n
----- -
1-10 2
11-20 2
20+ 1
I'm on 10gR2, but I'd be interested in any 11g-specific approaches as well.
SELECT CASE
WHEN age <= 10 THEN '1-10'
WHEN age <= 20 THEN '11-20'
ELSE '21+'
END AS age,
COUNT(*) AS n
FROM tbl
GROUP BY CASE
WHEN age <= 10 THEN '1-10'
WHEN age <= 20 THEN '11-20'
ELSE '21+'
END
Try:
select to_char(floor(age/10) * 10) || '-'
|| to_char(floor(age/10) * 10 + 9) as age,
count(*) as n from tbl group by floor(age/10);
What you are looking for, is basically the data for a histogram.
You would have the age (or age-range) on the x-axis and the count n (or frequency) on the y-axis.
In the simplest form, one could simply count the number of each distinct age value like you already described:
SELECT age, count(*)
FROM tbl
GROUP BY age
When there are too many different values for the x-axis however, one may want to create groups (or clusters or buckets). In your case, you group by a constant range of 10.
We can avoid writing a WHEN ... THEN line for each range; there could be hundreds if it were not about age. Instead, the approach by @MatthewFlaschen is preferable for the reasons mentioned by @NitinMidha.
Now let's build the SQL...
First, we need to split the ages into range-groups of 10 like so:
0-9
10-19
20-29
etc.
This can be achieved by dividing the age column by 10 and then calculating the result's FLOOR:
FLOOR(age/10)
"FLOOR returns the largest integer equal to or less than n"
http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions067.htm#SQLRF00643
Then we take the original SQL and replace age with that expression:
SELECT FLOOR(age/10), count(*)
FROM tbl
GROUP BY FLOOR(age/10)
This is OK, but we cannot see the range, yet. Instead we only see the calculated floor values which are 0, 1, 2 ... n.
To get the actual lower bound, we need to multiply it with 10 again so we get 0, 10, 20 ... n:
FLOOR(age/10) * 10
We also need the upper bound of each range which is lower bound + 10 - 1 or
FLOOR(age/10) * 10 + 10 - 1
Finally, we concatenate both into a string like this:
TO_CHAR(FLOOR(age/10) * 10) || '-' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1)
This creates '0-9', '10-19', '20-29' etc.
Now our SQL looks like this:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1),
COUNT(*)
FROM tbl
GROUP BY FLOOR(age/10)
Finally, apply an order and nice column aliases:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1) AS range,
COUNT(*) AS frequency
FROM tbl
GROUP BY FLOOR(age/10)
ORDER BY FLOOR(age/10)
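To try the final query without creating a table, the sample rows from the question can be supplied inline. This is a sketch: the inline tbl below simply mirrors the question's data, and age_range replaces the range alias purely for illustration:

```sql
-- Sketch: the question's five sample rows, defined inline.
WITH tbl AS (
  SELECT 1 AS pkey, 8 AS age FROM dual UNION ALL
  SELECT 2, 5  FROM dual UNION ALL
  SELECT 3, 12 FROM dual UNION ALL
  SELECT 4, 12 FROM dual UNION ALL
  SELECT 5, 22 FROM dual
)
SELECT
  TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 9) AS age_range,
  COUNT(*) AS frequency
FROM tbl
GROUP BY FLOOR(age/10)         -- one group per block of ten years
ORDER BY FLOOR(age/10);        -- numeric order, not string order
```

With this data the buckets are 0 - 9 (ages 8 and 5), 10 - 19 (12 and 12) and 20 - 29 (22), so the counts are 2, 2 and 1.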
However, in more complex scenarios, these ranges might not be grouped into constant chunks of size 10, but need dynamic clustering.
Oracle has more advanced histogram functions included, see http://docs.oracle.com/cd/E16655_01/server.121/e15858/tgsql_histo.htm#TGSQL366
Credits to @MatthewFlaschen for his approach; I only explained the details.
Here is a solution which creates a "range" table in a sub-query and then uses this to partition the data from the main table:
SELECT DISTINCT descr
, COUNT(*) OVER (PARTITION BY descr) n
FROM age_table INNER JOIN (
select '1-10' descr, 1 rng_start, 10 rng_stop from dual
union (
select '11-20', 11, 20 from dual
) union (
select '20+', 21, null from dual
)) ON age BETWEEN nvl(rng_start, age) AND nvl(rng_stop, age)
ORDER BY descr;
I had to group data by how many transactions appeared in an hour. I did this by extracting the hour from the timestamp:
select extract(hour from transaction_time) as hour
,count(*)
from table
where transaction_date='01-jan-2000'
group by
extract(hour from transaction_time)
order by
extract(hour from transaction_time) asc
;
Giving output:
HOUR COUNT(*)
---- --------
1 9199
2 9167
3 9997
4 7218
As you can see this gives a nice easy way of grouping the number of records per hour.
Add an age_range table and an age_range_id field to your table and group by that instead.
-- excuse the DDL but you should get the idea
create table age_range(
age_range_id tinyint unsigned not null primary key,
name varchar(255) not null);
insert into age_range values
(1, '18-24'),(2, '25-34'),(3, '35-44'),(4, '45-54'),(5, '55-64');
-- again excuse the DML but you should get the idea
select
count(*) as counter, p.age_range_id, ar.name
from
person p
inner join age_range ar on p.age_range_id = ar.age_range_id
group by
p.age_range_id, ar.name order by counter desc;
You can refine this idea if you like - add from_age to_age columns in the age_range table etc - but i'll leave that to you.
hope this helps :)
If using Oracle 9i+, you might be able to use the NTILE analytic function:
WITH tiles AS (
SELECT t.age,
NTILE(3) OVER (ORDER BY t.age) AS tile
FROM tbl t)
SELECT MIN(t.age) AS min_age,
MAX(t.age) AS max_age,
COUNT(t.tile) As n
FROM tiles t
GROUP BY t.tile
The caveat to NTILE is that you can only specify the number of partitions, not the break points themselves, so you need to specify a number that is appropriate. For example, with 100 rows, NTILE(4) will allot 25 rows to each of the four buckets/partitions. You cannot nest analytic functions, so you'd have to layer them using subqueries/subquery factoring to get the desired granularity. Otherwise, use:
SELECT CASE
WHEN t.age BETWEEN 1 AND 10 THEN '1-10'
WHEN t.age BETWEEN 11 AND 20 THEN '11-20'
ELSE '21+'
END AS age,
COUNT(*) AS n
FROM tbl t
GROUP BY CASE
WHEN t.age BETWEEN 1 AND 10 THEN '1-10'
WHEN t.age BETWEEN 11 AND 20 THEN '11-20'
ELSE '21+'
END
I had to get a count of samples by day. Inspired by @Clarkey, I used TO_CHAR to extract the date of the sample from the timestamp in ISO-8601 date format and used that in the GROUP BY and ORDER BY clauses. (Further inspired, I also post it here in case it is useful to others.)
SELECT
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') AS TS_DAY,
COUNT(*)
FROM
TABLE X
GROUP BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD')
ORDER BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') ASC
/
Can you try the below solution:
select count(1), '1-10' from tbl where age between 1 and 10
union all
select count(1), '11-20' from tbl where age between 11 and 20
union all
select count(1), '21+' from tbl where age > 20
My approach:
select range, count(1) from (
select case
when age < 5 then '0-4'
when age < 10 then '5-9'
when age < 15 then '10-14'
when age < 20 then '15-19'
when age < 30 then '20-29'
when age < 40 then '30-39'
when age < 50 then '40-49'
else '50+'
end
as range from
(select round(extract(day from feedback_update_time - feedback_time), 1) as age
from txn_history
) ) group by range
I have flexibility in defining the ranges
I do not repeat the ranges in select and group clauses
but some one please tell me, how to order them by magnitude!
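One possible way to order the buckets by magnitude is to sort on an aggregate of the underlying numeric value, such as MIN(age), instead of on the label string. The sketch below uses a hypothetical generated ages row source and a shortened list of ranges in place of txn_history:

```sql
-- Sketch: order labelled buckets by the smallest value they contain.
-- ages is a hypothetical row source: 3, 6, 9, ..., 60.
WITH ages AS (
  SELECT LEVEL * 3 AS age
  FROM dual
  CONNECT BY LEVEL <= 20
),
bucketed AS (
  SELECT age,
         CASE
           WHEN age < 5  THEN '0-4'
           WHEN age < 10 THEN '5-9'
           WHEN age < 15 THEN '10-14'
           ELSE '15+'
         END AS age_range
  FROM ages
)
SELECT age_range, COUNT(*) AS n
FROM bucketed
GROUP BY age_range
ORDER BY MIN(age);             -- numeric order of the buckets, not string order
```

Sorting on the label alone would place '10-14' before '5-9', because the comparison is done character by character.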

Generating Random Number In Each Row In Oracle Query

I want to select all rows of a table followed by a random number between 1 to 9:
select t.*, (select dbms_random.value(1,9) num from dual) as RandomNumber
from myTable t
But the random number is the same from row to row, only different from each run of the query. How do I make the number different from row to row in the same execution?
Something like?
select t.*, round(dbms_random.value() * 8) + 1 from foo t;
Edit:
David has pointed out this gives uneven distribution for 1 and 9.
As he points out, the following gives a better distribution:
select t.*, floor(dbms_random.value(1, 10)) from foo t;
At first I thought that this would work:
select DBMS_Random.Value(1,9) output
from ...
However, this does not generate an even distribution of output values:
select output,
count(*)
from (
select round(dbms_random.value(1,9)) output
from dual
connect by level <= 1000000)
group by output
order by 1
1 62423
2 125302
3 125038
4 125207
5 124892
6 124235
7 124832
8 125514
9 62557
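The reason is that ROUND maps only the interval [1, 1.5) to 1 and only [8.5, 9) to 9, while every other integer receives an interval of width 1, so the two end values occur about half as often.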
I'd suggest using something like:
floor(dbms_random.value(1,10))
Hence:
select output,
count(*)
from (
select floor(dbms_random.value(1,10)) output
from dual
connect by level <= 1000000)
group by output
order by 1
1 111038
2 110912
3 111155
4 111125
5 111084
6 111328
7 110873
8 111532
9 110953
You don’t need a select … from dual; just write:
SELECT t.*, dbms_random.value(1,9) RandomNumber
FROM myTable t
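If you just use ROUND, the two end numbers (1 and 9) will occur less frequently. Rounding over a wider range and then taking MOD gives a nearly even distribution of integers between 1 and 9 (results 1 and 2 are still slightly under-represented, so FLOOR(DBMS_RANDOM.VALUE(1, 10)) remains the exact alternative):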
SELECT MOD(Round(DBMS_RANDOM.Value(1, 99)), 9) + 1 FROM DUAL