Fetch ordered records between specific ranks - sql

I ran a SQL query to order(sort) my records. Now how can I fetch not the top 1000 records by using limit but records between ranks of 500 to 800? I provide a specific range and get all records in that range of ranks?

If by "rank" you just mean row numbers, LIMIT & OFFSET will do:
SELECT * FROM tbl ORDER BY col OFFSET 499 LIMIT 301; -- "ranks of 500 to 800"
If you mean actual "rank" as implemented by the window functions rank() or dense_rank() use the respective function in a subquery or CTE like demonstrated by #downernn.
Pesky side effect: SELECT * cannot be used to get all columns of the table. You get the additional column "rank" from the subquery unless you spell out the definitive list of desired columns.
Use the row type of the underlying table to work around this:
SELECT (sub.t).* -- parentheses required!
FROM (
SELECT t, rank() OVER (ORDER BY col1) AS rnk -- or dense_rank()?
FROM tbl t
) sub
WHERE rnk BETWEEN 500 AND 800
ORDER BY rnk; -- repeat order (optional)

Use the rank or row_number window function (do some research on their differences and choose the one that suits you) and an outer query to filter the rows:
SELECT *
FROM
(
SELECT f1, f2, ..., RANK() OVER (ORDER BY fn, fm, ...) as r
FROM ...
WHERE ...
) sub -- most databases require an alias for a derived table
WHERE r between 500 and 800
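To see how the three window functions differ on ties, here is a minimal sketch using Python's built-in sqlite3 module instead of the asker's database. The table scores and its columns are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 20), ("d", 30)],  # b and c tie on score
)

ranked = conn.execute(
    """
    SELECT name,
           row_number() OVER (ORDER BY score, name) AS rn,   -- always unique
           rank()       OVER (ORDER BY score)       AS rnk,  -- ties share, gap after
           dense_rank() OVER (ORDER BY score)       AS drnk  -- ties share, no gap
    FROM scores
    ORDER BY score, name
    """
).fetchall()

for row in ranked:
    print(row)
# ('a', 1, 1, 1)
# ('b', 2, 2, 2)
# ('c', 3, 2, 2)
# ('d', 4, 4, 3)
```

Filtering on rnk BETWEEN 2 AND 3 would return both tied rows here, while filtering on rn returns exactly one row per number.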

Use offset.
-- Fetch rows 500 to 800 inclusive
select *
from generate_series(1, 1000)
order by 1
limit 301
offset 499
offset is the number of rows to skip: to start at row 500 you skip the first 499 rows. limit is 301 because there are 301 rows between 500 and 800 inclusive. Use 300 if the upper bound should be exclusive (rows 500 to 799).
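The offset arithmetic can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than PostgreSQL, and the table nums, the column n and the helper rows_between are names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (n INTEGER)")
conn.executemany("INSERT INTO nums (n) VALUES (?)", [(i,) for i in range(1, 1001)])


def rows_between(first, last):
    """Return rows first..last (1-based, inclusive) of the sorted table."""
    offset = first - 1        # number of rows to skip
    limit = last - first + 1  # number of rows to keep
    cur = conn.execute(
        "SELECT n FROM nums ORDER BY n LIMIT ? OFFSET ?", (limit, offset)
    )
    return [r[0] for r in cur]


page = rows_between(500, 800)
print(len(page), page[0], page[-1])  # 301 500 800
```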

Related

How to output only random xx% of query output records in Redshift?

Is there a way to output only a percentage of the total number of output records in Redshift, when you don't know the number of records returned?
Let's say the output of the query will be 1000 records. You just want to select randomly 60% of it... So that will be 600 records in this case.
If I knew that the output is always 1000, then I would use LIMIT 600. But I don't know how many records will be returned, and I want it to be variable.
Any ideas?
PS: Tried to use LIMIT (0.6*COUNT(*)) and it didn't work. The error was that "LIMIT doesn't take a variable"
If you don't need an exact number of records but about 60%, then I would recommend:
where random() <= 0.6
If you do need an exact number, then:
select t.*
from (select t.*,
row_number() over (order by random()) as seqnum,
count(*) over () as cnt
from t
) t
where seqnum <= 0.6 * cnt;
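The exact-count variant can be tried out locally. The sketch below runs the same idea against SQLite through Python's sqlite3 module (not Redshift), with a made-up table t of 1000 ids; it assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 1001)])

sample = conn.execute(
    """
    SELECT id
    FROM (SELECT id,
                 row_number() OVER (ORDER BY random()) AS seqnum,  -- random position
                 count(*)     OVER ()                  AS cnt      -- total row count
          FROM t) AS s
    WHERE seqnum <= 0.6 * cnt
    """
).fetchall()

print(len(sample))  # 600
```

Which 600 rows come back changes from run to run; only the count is fixed.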

Max and min bounds on number of DB results without affecting performance

I want to select rows from a database with bounds on the number of results. I always want to have a minimum number of results returned, even if that means ignoring my other criteria, and I never want to have more than a maximum amount.
My current query looks like this:
(SELECT * FROM Athletes WHERE Height > 72
FETCH FIRST 10 ROWS ONLY)
UNION
(SELECT * FROM Athletes
ORDER BY Height DESC
FETCH FIRST 3 ROWS ONLY)
FETCH FIRST 10 ROWS ONLY
The idea here is that I want to find all athletes taller than six feet (72"). If there are more than ten, I just want any ten of them, but if there are fewer than three, I want the three tallest athletes even if some are under six feet.
This works fine on my test data, but I'd like to get rid of the UNION for production. How can I rewrite this without any performance-draining bits like UNION or DISTINCT?
One method is to use a CTE:
with ten as (
SELECT *
FROM Athletes
WHERE Height > 72
FETCH FIRST 10 ROWS ONLY
)
select t.*
from ten t
where (select count(*) from ten) >= 3
union all
(select a.*
from athletes a
where (select count(*) from ten) < 3
order by height desc
fetch first 3 rows only);
I think this should be pretty fast, because counting 10 rows in a table should be quite fast and there is no duplicate elimination.
EDIT:
Another method uses window functions but is likely to be less performant:
select a.*
from (select a.*,
sum(case when height > 72 then 1 else 0 end) over () as num_gt72,
row_number() over (order by height desc) as seqnum
from (select a.*
from athletes a
order by height desc
fetch first 10 rows only
) a
) a
where seqnum <= num_gt72 or
(seqnum <= 3 and num_gt72 < 3);
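For experimenting with the CTE approach, here is a rough sketch against SQLite via Python's sqlite3 module. SQLite has no FETCH FIRST, so LIMIT stands in for it, and the fallback branch is wrapped in a subquery so that its ORDER BY and LIMIT apply to that branch only. The function name bounded and the sample heights are invented for the example.

```python
import sqlite3


def bounded(heights):
    """Return heights > 72 (at most 10), or the 3 tallest if fewer than 3 qualify."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE athletes (id INTEGER PRIMARY KEY, height INTEGER)")
    conn.executemany("INSERT INTO athletes (height) VALUES (?)", [(h,) for h in heights])
    rows = conn.execute(
        """
        WITH ten AS (
            SELECT * FROM athletes WHERE height > 72 LIMIT 10
        )
        SELECT height FROM ten
        WHERE (SELECT count(*) FROM ten) >= 3
        UNION ALL
        SELECT height FROM (
            SELECT * FROM athletes
            WHERE (SELECT count(*) FROM ten) < 3
            ORDER BY height DESC
            LIMIT 3
        )
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(sorted(bounded([60, 65, 70, 73, 75]), reverse=True))  # [75, 73, 70]
print(len(bounded(list(range(60, 90)))))                    # 10
```

Only one of the two branches ever produces rows, which is why UNION ALL is enough and no duplicate elimination is needed.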

How to skip/offset rows in Oracle database?

I am writing a very simple query for an Oracle DB (version 9).
Somehow I can get first 5 rows:
select * from cities where rownum <= 5
But skipping 5 rows returns an empty result:
select * from cities where rownum >= 5
Using:
Oracle SQL Developer
Oracle DB version 9
Why is the second query returning an empty result?
In Oracle Database 12c (Release 1) and above this is very simple. To skip 5 rows:
SELECT * FROM T OFFSET 5 ROWS
To skip 5 rows and take the next 15:
SELECT * FROM T OFFSET 5 ROWS FETCH NEXT 15 ROWS ONLY
You can use the following query to skip the first n rows.
select * from (
select rslts.*, rownum as rec_no from (
<<Query with proper order by (If you don't have proper order by you will see weird results)>>
) rslts
) where rec_no > <<startRowNum - n>>
The above query is similar to pagination query below.
select * from (
select rslts.*, rownum as rec_no from (
<<Query with proper order by (If you don't have proper order by you will see weird results)>>
) rslts where rownum <= <<endRowNum>>
) where rec_no > <<startRowNum>>
Your cities query:
select * from (
select rslts.*, rownum as rec_no from (
select * from cities order by 1
) rslts
) where rec_no > 5 -- startRowNum
Note: Assume first column in cities table is unique key
Oracle increments rownum each time it adds a row to the result set. So saying rownum < 5 is fine: as it adds each of the first 4 rows it increments rownum, but once rownum = 5 the WHERE clause stops matching, no more rows are added to the result, and, though you don't notice it, rownum stops incrementing.
But if you say WHERE rownum > 5 then right off the bat, the WHERE clause doesn't match; and since, say, the first row isn't added to the result set, rownum isn't incremented... so rownum can never reach a value greater than 5 and the WHERE clause can never match.
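The behaviour described above can be imitated in a few lines of Python. This is a toy model of the numbering rule only, not Oracle's implementation; select_where_rownum and the city names are invented for the illustration.

```python
def select_where_rownum(rows, predicate):
    """Mimic ROWNUM: each candidate row is offered the next number,
    and the counter advances only when the row is actually kept."""
    result = []
    next_rownum = 1
    for row in rows:
        if predicate(next_rownum):
            result.append(row)
            next_rownum += 1
        # a rejected row leaves next_rownum unchanged
    return result


cities = ["city%d" % i for i in range(1, 11)]

first_five = select_where_rownum(cities, lambda rn: rn <= 5)
after_five = select_where_rownum(cities, lambda rn: rn > 5)

print(len(first_five), len(after_five))  # 5 0
```

With rn > 5 the very first candidate is offered number 1 and rejected, so every later row is offered 1 as well and nothing is ever returned.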
To get the result you want, you can use row_number() over() in a subquery, like
select *
from (select row_number() over() rn, -- other values
from table
where -- ...
)
where rn > 5
Update - As noted by others, this kind of query only makes sense if you can control the order of the row numbering, so you should really use row_number() over(order by something), where something is a useful ordering key for deciding which records are "the first 5 records".
rownum is being increased only when a row is being output, so this type of condition won't work.
In any case, you are not ordering your rows, so what's the point?
Using row_number() over (order by id):
select * from
(select row_number() over (order by id) rn, c.* from countries c)
where rn > 5
Using ROWNUM:
select * from
(select rownum rn, c.* from countries c)
where rn > 5
Important note:
The alias (countries c rather than plain countries) is required! Without it, Oracle raises a "missing expression" error.
Even better would be:
select * from mytab sample(5) fetch next 1 rows only;
Sample clause indicates the probability of each row getting picked up in the sampling process. FETCH NEXT clause indicates the number of rows you want to select.
With this code, you can query your table with skip and take.
select * from (
select a.*, rownum rnum from (
select * from cities
) a
) WHERE rnum >= :skip + 1 AND rnum <= :skip + :take
This code works with Oracle 11g. With Oracle 12c, there is already a better way to perform these queries, using OFFSET and FETCH.

How will Oracle optimise a record set if we specify a rownum clause

If I say:
select * from table order by col1 where rownum < 100
If the table has 10 million records, will Oracle bring all 10 million, sort them and then show me the first 99? Or is there a way it will optimise it?
If you do this
select * from table order by col1 where rownum < 100
then Oracle will throw an error, because the WHERE clause must come before the ORDER BY.
If you do this
select * from table where rownum < 100 order by col1
then Oracle will return an arbitrary 99 records, sorted afterwards, because the WHERE clause is applied before the ORDER BY.
If you want to return the first 100 records, ordered by a column, you must put the ORDER BY in a sub-select.
select *
from ( select * from table order by col1 )
where rownum <= 100
Oracle will do the sort, how else will it know the records you want? However, it will be a sort with a stopkey because of the ROWNUM. Oracle doesn't actually sort the entire result set, as some optimisation goes on under the hood, but this is what you can assume takes place.
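As a loose analogy for a sort with a stop key (this is not Oracle's actual algorithm), a top-N selection only needs to remember the N best rows seen so far rather than order the whole input. Python's heapq.nsmallest works that way; the data below is randomly generated for the illustration.

```python
import heapq
import random

random.seed(0)
values = [random.randrange(10_000_000) for _ in range(100_000)]

# Tracks only the 100 smallest values while scanning, instead of sorting all rows.
top_100 = heapq.nsmallest(100, values)

# Same answer as a full sort followed by taking the first 100.
print(top_100 == sorted(values)[:100])  # True
```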
Please see this article by Tom Kyte.

Equivalents to SQL Server TOP

In SQL Server, TOP may be used to return the first n number of rows in a query. For example, SELECT TOP 100 * FROM users ORDER BY id might be used to return the first 100 people that registered for a site. (This is not necessarily the best way, I am just using it as an example).
My question is - What is the equivalent to TOP in other databases, such as Oracle, MySQL, PostgreSQL, etc? If there is not an equivalent keyword, what workarounds can you recommend to achieve the same result?
To select first 100 rows:
MySQL and PostgreSQL:
SELECT *
FROM Table
ORDER BY
column
LIMIT 100
Oracle:
SELECT *
FROM (
SELECT t.*
FROM table t
ORDER BY
column
)
WHERE rownum <= 100
Note that you need a subquery here. If you don't add a subquery, ROWNUM will select the first 100 rows in arbitrary order and then sort them by column.
To select rows between 100 and 300:
MySQL:
SELECT *
FROM TABLE
ORDER BY
column
LIMIT 100, 200
PostgreSQL:
SELECT *
FROM Table
ORDER BY
column
OFFSET 100 LIMIT 200
Oracle:
SELECT *
FROM (
SELECT t.*, ROW_NUMBER() OVER (ORDER BY column) AS rn
FROM table t
)
WHERE rn >= 100
AND rownum <= 200
Note that an attempt to simplify it with ROWNUM BETWEEN 100 AND 200 (as opposed to rn BETWEEN 100 AND 200 in the outer query) will return nothing in Oracle!
RN BETWEEN 100 AND 200 will work in Oracle too but is less efficient.
See the article in my blog for performance details:
Oracle: ROW_NUMBER vs ROWNUM
For Postgres and MySQL it's the LIMIT keyword.
SELECT *
FROM users
ORDER BY id
LIMIT 100;
This is standard SQL (Oracle and SQL Server implement it). This is an example of returning up to 100 rows:
SELECT ID_CONTROL FROM (SELECT ROW_NUMBER() OVER (ORDER BY ID_CONTROL)
ROWNUMBER, ID_CONTROL FROM IWS_CONTROL WHERE
CURRENT_STATE = 15 AND CURRENT_STATUS=0) A WHERE ROWNUMBER <= 100
In SQL Anywhere, it's the same as SQL Server:
SELECT TOP 100 * FROM users ORDER BY id
You can even start in the middle of the result set if you want:
SELECT TOP 100 START AT 50 * FROM users ORDER BY id
gets the 50th through 149th rows of the result set.
LIMIT 100
as in
SELECT * FROM foo ORDER BY bar LIMIT 100
You can use RANK() and DENSE_RANK() in Oracle. Here is a link to the AskTom website explaining how to do pagination and top-n queries with DENSE_RANK in Oracle.
Oracle:
select * from (select * from foo ORDER BY bar) where rownum <= 100
With a nice explanation on how to make it work in AskTom.
In Ingres the same query would be:
select First 100 * from foo ORDER BY bar
Ingres question was already answered in StackOverflow before.
In DB2 you would make your query look like this:
SELECT * FROM tblData FETCH FIRST 10 ROWS ONLY;
In Oracle you want to use a TOP-N query.
For example:
select *
from (SELECT *
FROM foo
where foo_id=[number]
order by foo_id desc)
where rownum <= 3
This will get you the top three results (because I order by desc in the sub query)