Find out if query exceeds arbitrary limit using ROWNUM? - sql

I have a stored proc in Oracle, and we're limiting the number of records with ROWNUM based on a parameter. However, we also have a requirement to know whether the search result count exceeded the arbitrary limit (even though we're only passing data up to the limit; searches can return a lot of data, and if the limit is exceeded a user may want to refine their query.)
The limit's working well, but I'm attempting to pass an OUT value as a flag to signal when the maximum results were exceeded. My idea for this was to get the count of the inner table and compare it to the count of the outer select query (with ROWNUM) but I'm not sure how I can get that into a variable. Has anyone done this before? Is there any way that I can do this without selecting everything twice?
Thank you.
EDIT: For the moment, I am actually doing two identical selects - one for the count only, selected into my variable, and one for the actual records. I then pass back the comparison of the base result count to my max limit parameter. This means two selects, which isn't ideal. Still looking for an answer here.

You can add a column to the query:
select * from (
select . . . , count(*) over () as numrows
from . . .
where . . .
) where rownum <= 1000;
And then report numrows as the size of the final result set.

You could use a nested subquery:
select id, case when max_count > 3 then 'Exceeded' else 'OK' end as flag
from (
select id, rn, max(rn) over () as max_count
from (
select id, rownum as rn
from t
)
where rownum <= 4
)
where rownum <= 3;
The inner level is your actual query (which you probably have filters and an order-by clause in really). The middle later restricts to your actual limit + 1, which still allows Oracle to optimise using a stop key, and uses an analytic count over that inner result set to see if you got a fourth record (without requiring a scan of all matching records). And the outer layer restricts to your original limit.
With a sample table with 10 rows, this gets:
ID FLAG
---------- --------
1 Exceeded
2 Exceeded
3 Exceeded
If the inner query had a filter that returned fewer rows, say:
select id, rownum as rn
from t
where id < 4
it would get:
ID FLAG
---------- --------
1 OK
2 OK
3 OK
Of course for this demo I haven't done any ordering so you would get indeterminate results. And from your description you would use your variable instead of 3, and (your variable + 1) instead of 4.

In my application I do a very simple approach. I do the normal SELECT and when the number of returned rows is equal to the limit then the client application shows LIMIT reached message, because is it very likely that my query would return more rows in case you would not limit the result.
Of course, when the number of rows is exactly the limit then this is a wrong indication. However, in my application the limit is set mainly for performance reasons by end-user, a typical limit is "1000 rows" or "10000 rows", for example.
In my case this solution is fully sufficient - and it is simple.
Update:
Are you aware of the row_limiting_clause? It was introduced in Oracle 12.1
For example this query
SELECT employee_id, last_name
FROM employees
ORDER BY employee_id
OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
will return row 6 to row 16 of the entire result set. It may support you in finding a solution.
Another idea is this one:
SELECT employee_id, last_name
FROM employees
UNION ALL
SELECT NULL, NULL FROM dual
ORDER BY employee_id NULLS LAST
When you get the row where employee_id IS NULL then you know you reached the end of your result-set and no further records will arrive.

Select the whole thing, then select the count and the data, restricting the number of rows.
with
base as
(
select c1, c2, c3
from table
where condition
)
select (select count(*) from base), c1, c2, c3
from base
where rownum < 100

Related

SQL: Give up/return different result if too many rows

Short version, I have a SQL statement where I only want the results if the number of rows returned is less than some value (say 1000) and otherwise I want a different result set. What's the best way to do this without incurring the overhead of returning the 1000 rows (as would happen if I used limit) when I'm just going to throw them away?
For instance, I want to return the results of
SELECT *
FROM T
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name ASC
provided there are at most 1000 entries but if there are more than that I want to return
SELECT *
FROM T
ORDER BY name ASC
LIMIT 25
Two queries isn't bad, but I definitely don't want to get 1000 records back from the first query only to toss them.
(Happy to use Postgres extensions too but prefer SQL)
--
To explain I'm refreshing data requested by client in batches and sometimes the client needs to know if there have been any changes in the part they've already received. If there are too many changes, however, I'm just giving up and starting to send the records from the start again.
WITH max1000 AS (
SELECT the_row, count(*) OVER () AS total
FROM (
SELECT the_row -- named row type
FROM T AS the_row
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name
LIMIT 1001
) sub
)
SELECT (the_row).* -- parentheses required
FROM max1000 m
WHERE total < 1001
UNION ALL
( -- parentheses required
SELECT *
FROM T
WHERE (SELECT total > 1000 FROM max1000 LIMIT 1)
ORDER BY name
LIMIT 25
)
The subquery sub in CTE max1000 gets the complete, sorted result for the first query - wrapped as row type, and with LIMIT 1001 to avoid excess work.
The outer SELECT adds the total row count. See:
Run a query with a LIMIT/OFFSET and also get the total number of rows
The first SELECT of the outer UNION query returns decomposed rows as result - if there are less than 1001 of them.
The second SELECT of the outer UNION query returns the alternate result - if there were more than 1000. Parentheses are required - see:
Combining 3 SELECT statements to output 1 table
Or:
WITH max1000 AS (
SELECT *
FROM T
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name
LIMIT 1001
)
, ct(ok) AS (SELECT count(*) < 1001 FROM max1000)
SELECT *
FROM max1000 m
WHERE (SELECT ok FROM ct)
UNION ALL
( -- parentheses required
SELECT *
FROM T
WHERE (SELECT NOT ok FROM ct)
ORDER BY name
LIMIT 25
);
I think I like the 2nd better. Not sure which is faster.
Either optimizes performance for less than 1001 rows in most calls. If that's the exception, I would first run a somewhat cheaper count. Also depends a lot on available indexes ...
You get no row if the first query finds no row. (Seems like an odd result.)

How can I get the total result count, and a given subset ('page' of results) with the same SQL Query with Oracle

I would like to display a table of results. The data is sourced from a SQL query on an Oracle database. I would like to show the results one page (say, 10 records) at a time, minimising the actual data being sent to the front-end.
At the same time, I would like to show the total number of possible results (say, showing 1-10 of 123), and to allow for pagination (say, to calculate that 10 per page, 123 results, therefore 13 pages).
I can get the total number of results with a single count query.
SELECT count(*) AS NUM_RESULTS FROM ... etc.
and I can get the desired subset with another query
SELECT * FROM ... etc. WHERE ? <= ROWNUM AND ROWNUM < ?
But, is there a way to get all the relevant details in one single query?
Update
Actually, the above query using ROWNUM seems to work for 0 - 10, but not for 10 - 20, so how can I do that too?
ROWNUM is a bit tricky to use.
The ROWNUM pseudocolumn always starts with 1 for the first result that actually gets fetched. If you filter for ROWNUM>10, you will never fetch any result and therefore will not get any.
If you want to use it for paging (not that you really should), it requires nested subqueries:
select * from
(select rownum n, x.* from
(select * from mytable order by name) x
)
where n between 3 and 5;
Note that you need another nested subquery to get the order by right; if you put the order by one level higher
select * from
(select rownum n, x.* from mytable x order by name)
where n between 3 and 5;
it will pick 3 random(*) rows and sort them, but that is ususally not what you want.
(*) not really random, but probably not what you expect.
See http://use-the-index-luke.com/sql/partial-results/window-functions for more effient ways to implement pagination.
You can use inner join on your table and fetch total number of result in your subquery. The example of an query is as follows:
SELECT E.emp_name, E.emp_age, E.emp_sal, E.emp_count
FROM EMP as E
INNER JOIN (SELECT emp_name, COUNT(*) As emp_count
FROM EMP GROUP BY emp_name) AS T
ON E.emp_name = T.emp_name WHERE E.emp_age < 35;
Not sure exactly what you're after based on your question wording, but it seems like you want to see your specialized table of all records with a row number between two values, and in an adjacent field in each record see the total count of records. If so, you can try selecting everything from your table and joining a subquery of a COUNT value as a field by saying where 1=1 (i.e. everywhere) tack that field onto the record. Example:
SELECT *
FROM table_name LEFT JOIN (SELECT COUNT(*) AS NUM_RESULTS FROM table_name) ON 1=1
WHERE ? <= ROWNUM AND ROWNUM < ?

fetching data using rownum in oracle

I have a query in oracle to fetch data from table using rownum but i didn't get any data.
My query is like this :
select * from table-name where rownum<5
is this is a wrong query to fetch data whose row number is less than 5.
when i used query like :
select * from table-name where rownum<=4
than it will gives a result record.
My question is what is wrong here with ?
Is this is syntax error or anything else??..
rownum is a pseudo column that counts rows in the result set after the where clause has been applied.
SELECT table_name
FROM user_tables
WHERE rownum > 2;
TABLE_NAME
------------------------------
0 rows selected
However, this query will always return zero rows, regardless of the number of rows in the table.
To explain this behaviour, we need to understand how Oracle processes ROWNUM. When assigning ROWNUM to a row, Oracle starts at 1 and only increments the value when a row is selected; that is, when all conditions in the WHERE clause are met. Since our condition requires that ROWNUM is greater than 2, no rows are selected and ROWNUM is never incremented beyond 1.
http://blog.lishman.com/2008/03/rownum.html
another stackoverflow link
Edited
this paragraph i find on oracle website which is much better
Conditions testing for ROWNUM values greater than a positive integer are always false. For example, this query returns no rows:
SELECT * FROM employees
WHERE ROWNUM > 1;
The first row fetched is assigned a ROWNUM of 1 and makes the condition false. The second row to be fetched is now the first row and is also assigned a ROWNUM of 1 and makes the condition false. All rows subsequently fail to satisfy the condition, so no rows are returned.
You can also use ROWNUM to assign unique values to each row of a table, as in this example:
Syntax seems correct to me.
However ROWNUM is calculated on result rows for example:
SELECT * FROM TABLE_NAME WHERE ROWNUM < 10 ORDER BY TABLE_FIELD ASC;
and
SELECT * FROM TABLE_NAME WHERE ROWNUM < 10 ORDER BY TABLE_FIELD DESC;
Will give you different results.
Each time a query is executed, for each tuple, Oracle assigns a ROWNUM which is scopet to that only query.
What are you trying to accomplish ?
Since for the reasons rahularyansharma mentions, rownum based queries won't always function the way you might expect, a way around this is to do something like
SELECT * from (SELECT rownum AS rn, TABLE_NAME.* FROM TABLE_NAME)
where rn > 5;
However, be aware that this will be a fairly inefficient operation when operating over large datasets.

How to select first 'N' records from a database containing million records?

I have an oracle database populated with million records. I am trying to write a SQL query that returns the first 'N" sorted records ( say 100 records) from the database based on certain condition.
SELECT *
FROM myTable
Where SIZE > 2000
ORDER BY NAME DESC
Then programmatically select first N records.
The problem with this approach is :
The query results into half million
records and "ORDER BY NAME" causes
all the records to be sorted on NAME in the descending order. This sorting is taking lot of time. (nearly 30-40 seconds. If I omit ORDER BY, it takes only 1 second).
After the sort I am interested in
only first N (100) records. So the sorting of complete records is not useful.
My questions are:
Is it possible to specify the 'N' in
query itself? ( so that sort applies to only N records and query becomes faster).
Any better way in SQL to improve the query to sort
only N elements and return in quick
time.
If your purpose is to find 100 random rows and sort them afterwards then Lasse's solution is correct. If as I think you want the first 100 rows sorted by name while discarding the others you would build a query like this:
SELECT *
FROM (SELECT *
FROM myTable
WHERE SIZE > 2000 ORDER BY NAME DESC)
WHERE ROWNUM <= 100
The optimizer will understand that it is a TOP-N query and will be able to use an index on NAME. It won't have to sort the entire result set, it will just start at the end of the index and read it backwards and stop after 100 rows.
You could also add an hint to your original query to let the optimizer understand that you are interested in the first rows only. This will probably generate a similar access path:
SELECT /*+ FIRST_ROWS*/* FROM myTable WHERE SIZE > 2000 ORDER BY NAME DESC
Edit: just adding AND rownum <= 100 to the query won't work since in Oracle rownum is attributed before sorting : this is why you have to use a subquery. Without the subquery Oracle will select 100 random rows then sort them.
This shows how to pick the top N rows depending on your version of Oracle.
From Oracle 9i onwards, the RANK() and
DENSE_RANK() functions can be used to
determine the TOP N rows. Examples:
Get the top 10 employees based on
their salary
SELECT ename, sal FROM ( SELECT
ename, sal, RANK() OVER (ORDER BY sal
DESC) sal_rank
FROM emp ) WHERE sal_rank <= 10;
Select the employees making the top 10
salaries
SELECT ename, sal FROM ( SELECT
ename, sal, DENSE_RANK() OVER (ORDER
BY sal DESC) sal_dense_rank
FROM emp ) WHERE sal_dense_rank <= 10;
The difference between the two is explained here
Add this:
AND rownum <= 100
to your WHERE-clause.
However, this won't do what you're asking.
If you want to pick 100 random rows, sort those, and then return them, you'll have to formulate a query without the ORDER BY first, then limit that to 100 rows, then select from that and sort.
This could work, but unfortunately I don't have an Oracle server available to test:
SELECT *
FROM (
SELECT *
FROM myTable
WHERE SIZE > 2000
AND rownum <= 100
) x
ORDER BY NAME DESC
But note the "random" part there, you're saying "give me 100 rows with SIZE > 2000, I don't care which 100".
Is that really what you want?
And no, you won't actually get a random result, in the sense that it'll change each time you query the server, but you are at the mercy of the query optimizer. If the data load and index statistics for that table changes over time, at some point you might get different data than you did on the previous query.
Your problem is that the sort is being done every time the query is run. You can eliminate the sort operation by using an index - the optimiser can use an index to eliminate a sort operation - if the sorted column is declared NOT NULL.
(If the column is nullable, it is still possible, by either (a) adding a NOT NULL predicate to the query, or (b) adding a function-based index and modifying the ORDER BY clause accordingly).
Just for reference, in Oracle 12c, this task can be done using FETCH clause. You can see here for examples and additional reference links regarding this matter.

Paging with Oracle and sql server and generic paging method

I want to implement paging in a gridview or in an html table which I will fill using ajax. How should I write queries to support paging? For example if pagesize is 20 and when the user clicks page 3, rows between 41 and 60 must be shown on table. At first I can get all records and put them into cache but I think this is the wrong way. Because data can be very huge and data can be change from other sessions. so how can I implement this? Is there any generic way ( for all databases ) ?
As others have suggested, you can use rownum in Oracle. It's a little tricky though and you have to nest your query twice.
For example, to paginate the query
select first_name from some_table order by first_name
you need to nest it like this
select first_name from
(select rownum as rn, first_name from
(select first_name from some_table order by first_name)
) where rn > 100 and rn <= 200
The reason for this is that rownum is determined after the where clause and before the order by clause. To see what I mean, you can query
select rownum,first_name from some_table order by first_name
and you might get
4 Diane
2 Norm
3 Sam
1 Woody
That's because oracle evaluates the where clause (none in this case), then assigns rownums, then sorts the results by first_name. You have to nest the query so it uses the rownum assigned after the rows have been sorted.
The second nesting has to do with how rownum is treated in a where condition. Basically, if you query "where rownum > 100" then you get no results. It's a chicken and egg thing where it can't return any rows until it finds rownum > 100, but since it's not returning any rows it never increments rownum, so it never counts to 100. Ugh. The second level of nesting solves this. Note it must alias the rownum column at this point.
Lastly, your order by clause must make the query deterministic. For example, if you have John Doe and John Smith, and you order by first name only, then the two can switch places from one execution of the query to the next.
There are articles here http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html
and here http://www.oracle.com/technology/oramag/oracle/07-jan/o17asktom.html. Now that I see how long my post is, I probably should have just posted those links...
Unfortunately, the methods for restricting the range of rows returned by a query vary from one DBMS to another: Oracle uses ROWNUM (see ocdecio's answer), but ROWNUM won't work in SQL Server.
Perhaps you can encapsulate these differences with a function that takes a given SQL statement and first and last row numbers and generates the appropriate paginatd SQL for the target DBMS - i.e. something like:
sql = paginated ('select empno, ename from emp where job = ?', 101, 150)
which would return
'select * from (select v.*, ROWNUM rn from ('
+ theSql
+ ') v where rownum < 150) where rn >= 101'
for Oracle and something else for SQL Server.
However, note that the Oracle solution is adding a new column RN to the results that you'll need to deal with.
I believe that both have a ROWNUM analytic Function. Use that and you'll be identical.
In Oracle it is here
ROW_NUMBER
Yep, just verified that ROW_NUMBER is the same function in both.
"Because...data can be change from other sessions."
What do you want to happen for this ?
For example, user gets the 'latest' ten rows at 10:30.
At 10:31, 3 new rows are added (so those ten being view by the user are no longer the latest).
At 10:32, the user requests then 'next' ten entries.
Do you want that new set to include those three that have been bumped from 8/9/10 down to 11/12/13 ?
If not, in Oracle you can select the data as it was at 10:30
SELECT * FROM table_1 as of timestamp (timestamp '2009-01-29 10:30:00');
You still need the row_number logic, eg
select * from
(SELECT a.*, row_number() over (order by hire_date) rn
FROM hr.employees as of timestamp (timestamp '2009-01-29 10:30:00') a)
where rn between 10 and 19
select *
from ( select /*+ FIRST_ROWS(n) */ a.*,
ROWNUM rnum
from ( your_query_goes_here,
with order by ) a
where ROWNUM <=
:MAX_ROW_TO_FETCH )
where rnum >= :MIN_ROW_TO_FETCH;
Step 1: your query with order by
Step 2: select a.*, ROWNUM rnum from ()a where ROWNUM <=:MAX_ROW_TO_FETCH
Step 3: select * from ( ) where rnum >= :MIN_ROW_TO_FETCH;
put 1 in 2 and 2 in 3
If the expected data set is huge, I'd recommend to create a temp table, a view or a snapshot (materialized view) to store the query results + a row number retrieved either using ROWNUM or ROW_NUMBER analytic function. After that you can simply query this temp storage using row number ranges.
Basically, you need to separate the actual data fetch from the paging.
There is no uniform way to ensure paging across various RDBMS products. Oracle gives you rownum which you can use in where clause like:
where rownum < 1000
SQL Server gives you row_id( ) function which can be used similar to Oracle's rownum. However, row_id( ) isn't available before SQL Server 2005.