How can I get the total result count, and a given subset ('page' of results) with the same SQL Query with Oracle - sql

I would like to display a table of results. The data is sourced from a SQL query on an Oracle database. I would like to show the results one page (say, 10 records) at a time, minimising the actual data being sent to the front-end.
At the same time, I would like to show the total number of possible results (say, showing 1-10 of 123), and to allow for pagination (say, to calculate that 10 per page, 123 results, therefore 13 pages).
I can get the total number of results with a single count query.
SELECT count(*) AS NUM_RESULTS FROM ... etc.
and I can get the desired subset with another query
SELECT * FROM ... etc. WHERE ? <= ROWNUM AND ROWNUM < ?
But, is there a way to get all the relevant details in one single query?
Update
Actually, the above query using ROWNUM seems to work for 0 - 10, but not for 10 - 20, so how can I do that too?

ROWNUM is a bit tricky to use.
The ROWNUM pseudocolumn always starts with 1 for the first result that actually gets fetched. If you filter for ROWNUM>10, you will never fetch any result and therefore will not get any.
If you want to use it for paging (not that you really should), it requires nested subqueries:
select * from
(select rownum n, x.* from
(select * from mytable order by name) x
)
where n between 3 and 5;
Note that you need another nested subquery to get the order by right; if you put the order by one level higher
select * from
(select rownum n, x.* from mytable x order by name)
where n between 3 and 5;
it will pick 3 random(*) rows and sort them, but that is ususally not what you want.
(*) not really random, but probably not what you expect.
See http://use-the-index-luke.com/sql/partial-results/window-functions for more effient ways to implement pagination.

You can use inner join on your table and fetch total number of result in your subquery. The example of an query is as follows:
SELECT E.emp_name, E.emp_age, E.emp_sal, E.emp_count
FROM EMP as E
INNER JOIN (SELECT emp_name, COUNT(*) As emp_count
FROM EMP GROUP BY emp_name) AS T
ON E.emp_name = T.emp_name WHERE E.emp_age < 35;

Not sure exactly what you're after based on your question wording, but it seems like you want to see your specialized table of all records with a row number between two values, and in an adjacent field in each record see the total count of records. If so, you can try selecting everything from your table and joining a subquery of a COUNT value as a field by saying where 1=1 (i.e. everywhere) tack that field onto the record. Example:
SELECT *
FROM table_name LEFT JOIN (SELECT COUNT(*) AS NUM_RESULTS FROM table_name) ON 1=1
WHERE ? <= ROWNUM AND ROWNUM < ?

Related

SQL: Give up/return different result if too many rows

Short version, I have a SQL statement where I only want the results if the number of rows returned is less than some value (say 1000) and otherwise I want a different result set. What's the best way to do this without incurring the overhead of returning the 1000 rows (as would happen if I used limit) when I'm just going to throw them away?
For instance, I want to return the results of
SELECT *
FROM T
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name ASC
provided there are at most 1000 entries but if there are more than that I want to return
SELECT *
FROM T
ORDER BY name ASC
LIMIT 25
Two queries isn't bad, but I definitely don't want to get 1000 records back from the first query only to toss them.
(Happy to use Postgres extensions too but prefer SQL)
--
To explain I'm refreshing data requested by client in batches and sometimes the client needs to know if there have been any changes in the part they've already received. If there are too many changes, however, I'm just giving up and starting to send the records from the start again.
WITH max1000 AS (
SELECT the_row, count(*) OVER () AS total
FROM (
SELECT the_row -- named row type
FROM T AS the_row
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name
LIMIT 1001
) sub
)
SELECT (the_row).* -- parentheses required
FROM max1000 m
WHERE total < 1001
UNION ALL
( -- parentheses required
SELECT *
FROM T
WHERE (SELECT total > 1000 FROM max1000 LIMIT 1)
ORDER BY name
LIMIT 25
)
The subquery sub in CTE max1000 gets the complete, sorted result for the first query - wrapped as row type, and with LIMIT 1001 to avoid excess work.
The outer SELECT adds the total row count. See:
Run a query with a LIMIT/OFFSET and also get the total number of rows
The first SELECT of the outer UNION query returns decomposed rows as result - if there are less than 1001 of them.
The second SELECT of the outer UNION query returns the alternate result - if there were more than 1000. Parentheses are required - see:
Combining 3 SELECT statements to output 1 table
Or:
WITH max1000 AS (
SELECT *
FROM T
WHERE updated_at > timestamp
AND name <= 'Michael'
ORDER BY name
LIMIT 1001
)
, ct(ok) AS (SELECT count(*) < 1001 FROM max1000)
SELECT *
FROM max1000 m
WHERE (SELECT ok FROM ct)
UNION ALL
( -- parentheses required
SELECT *
FROM T
WHERE (SELECT NOT ok FROM ct)
ORDER BY name
LIMIT 25
);
I think I like the 2nd better. Not sure which is faster.
Either optimizes performance for less than 1001 rows in most calls. If that's the exception, I would first run a somewhat cheaper count. Also depends a lot on available indexes ...
You get no row if the first query finds no row. (Seems like an odd result.)

query on the first n rows from large table without checking all rows -oracle sql

Every query is taking a lot of time on my table which is very large. For testing purpose I want queries to be implemented for first few rows of the table. for ex : In select * from table where ROWNUM=1,there would be a check for all rows i.e if ROWNUM is 1 or not. But I want to test my queries for few rows only to save time.
If you want to select only top n rown then you must use -
SELECT *
FROM TABLE
WHERE ROWNUM <= N;
rownum is a pseudocolumn it does not exist on the table records and is assigned at runtime once the predicate (where clause) phase of the query is completed. Due to this only the first query returns the result
select * from hr.employees where employee_id >190 and rownum<2;-- Will return one row
select * from hr.employees where employee_id >190 and rownum>2;-- Won't return any resultset
select * from hr.employees where employee_id >190 and rownum=3;-- Won't return any resultset
Reason behind the last two queries for not returning the resultset is once the predicate (employee_id >190) gets completed and a rownum is assigned to the first row then for (2 query rownum>2) 1>2 returns false and for (3 query rownum=3) 1=3 returns false so no data is returned.
Thanks
Andy
How about creating a mini test-table for your testing queries?
create table my_test_table as select * from big_table where rownum <= n;
Now you could run something like
select count(*) from my_test_table where color='red';
and divide that result by n to get your estimate for what fraction of the rows have color='red' in your big database. Of course, note that you could get really unlucky (i.e. your small table could be a poor sample of the total table population), in which case you can probably just increase n to achieve a better sample.
If you can't or would rather not create a new table, you can certainly just use a nested query:
select * from (select * from big_table where rownum <= n)
where <condition>;

Find out if query exceeds arbitrary limit using ROWNUM?

I have a stored proc in Oracle, and we're limiting the number of records with ROWNUM based on a parameter. However, we also have a requirement to know whether the search result count exceeded the arbitrary limit (even though we're only passing data up to the limit; searches can return a lot of data, and if the limit is exceeded a user may want to refine their query.)
The limit's working well, but I'm attempting to pass an OUT value as a flag to signal when the maximum results were exceeded. My idea for this was to get the count of the inner table and compare it to the count of the outer select query (with ROWNUM) but I'm not sure how I can get that into a variable. Has anyone done this before? Is there any way that I can do this without selecting everything twice?
Thank you.
EDIT: For the moment, I am actually doing two identical selects - one for the count only, selected into my variable, and one for the actual records. I then pass back the comparison of the base result count to my max limit parameter. This means two selects, which isn't ideal. Still looking for an answer here.
You can add a column to the query:
select * from (
select . . . , count(*) over () as numrows
from . . .
where . . .
) where rownum <= 1000;
And then report numrows as the size of the final result set.
You could use a nested subquery:
select id, case when max_count > 3 then 'Exceeded' else 'OK' end as flag
from (
select id, rn, max(rn) over () as max_count
from (
select id, rownum as rn
from t
)
where rownum <= 4
)
where rownum <= 3;
The inner level is your actual query (which you probably have filters and an order-by clause in really). The middle later restricts to your actual limit + 1, which still allows Oracle to optimise using a stop key, and uses an analytic count over that inner result set to see if you got a fourth record (without requiring a scan of all matching records). And the outer layer restricts to your original limit.
With a sample table with 10 rows, this gets:
ID FLAG
---------- --------
1 Exceeded
2 Exceeded
3 Exceeded
If the inner query had a filter that returned fewer rows, say:
select id, rownum as rn
from t
where id < 4
it would get:
ID FLAG
---------- --------
1 OK
2 OK
3 OK
Of course for this demo I haven't done any ordering so you would get indeterminate results. And from your description you would use your variable instead of 3, and (your variable + 1) instead of 4.
In my application I do a very simple approach. I do the normal SELECT and when the number of returned rows is equal to the limit then the client application shows LIMIT reached message, because is it very likely that my query would return more rows in case you would not limit the result.
Of course, when the number of rows is exactly the limit then this is a wrong indication. However, in my application the limit is set mainly for performance reasons by end-user, a typical limit is "1000 rows" or "10000 rows", for example.
In my case this solution is fully sufficient - and it is simple.
Update:
Are you aware of the row_limiting_clause? It was introduced in Oracle 12.1
For example this query
SELECT employee_id, last_name
FROM employees
ORDER BY employee_id
OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
will return row 6 to row 16 of the entire result set. It may support you in finding a solution.
Another idea is this one:
SELECT employee_id, last_name
FROM employees
UNION ALL
SELECT NULL, NULL FROM dual
ORDER BY employee_id NULLS LAST
When you get the row where employee_id IS NULL then you know you reached the end of your result-set and no further records will arrive.
Select the whole thing, then select the count and the data, restricting the number of rows.
with
base as
(
select c1, c2, c3
from table
where condition
)
select (select count(*) from base), c1, c2, c3
from base
where rownum < 100

using a single query to eliminate N+1 select issue

I want to return the last report of a given range of units. The last report will be identified by its time of creation. Therefore, the result would be a collection of last reports for a given range of units. I do not want to use a bunch of SELECT statements e.g.:
SELECT * FROM reports WHERE unit_id = 9999 ORDER BY time desc LIMIT 1
SELECT * FROM reports WHERE unit_id = 9998 ORDER BY time desc LIMIT 1
...
I initially tried this (but already knew it wouldn't work because it will only return 1 report):
'SELECT reports.* FROM reports INNER JOIN units ON reports.unit_id = units.id WHERE units.account_id IS NOT NULL AND units.account_id = 4 ORDER BY time desc LIMIT 1'
So I am looking for some kind of solution using subqueries or derived tables, but I can't just seem to figure out how to do it properly:
'SELECT reports.* FROM reports
WHERE id IN
(
SELECT id FROM reports
INNER JOIN units ON reports.unit_id = units.id
ORDER BY time desc
LIMIT 1
)
Any solution to do this with subqueries or derived tables?
The simple way to do this in Postgres uses distinct on:
select distinct on (unit_id) r.*
from reports r
order by unit_id, time desc;
This construct is specific to Postgres and databases that use its code base. It the expression distinct on (unit_id) says "I want to keep only one row for each unit_id". The row chosen is the first row encountered with that unit_id based on the order by clause.
EDIT:
Your original query would be, assuming that id increases along with the time field:
SELECT r.*
FROM reports r
WHERE id IN (SELECT max(id)
FROM reports
GROUP BY unit_id
);
You might also try this as a not exists:
select r.*
from reports r
where not exists (select 1
from reports r2
where r2.unit_id = r.unit_id and
r2.time > r.time
);
I thought the distinct on would perform well. This last version (and maybe the previous) would really benefit from an index on reports(unit_id, time).

Paging with Oracle and sql server and generic paging method

I want to implement paging in a gridview or in an html table which I will fill using ajax. How should I write queries to support paging? For example if pagesize is 20 and when the user clicks page 3, rows between 41 and 60 must be shown on table. At first I can get all records and put them into cache but I think this is the wrong way. Because data can be very huge and data can be change from other sessions. so how can I implement this? Is there any generic way ( for all databases ) ?
As others have suggested, you can use rownum in Oracle. It's a little tricky though and you have to nest your query twice.
For example, to paginate the query
select first_name from some_table order by first_name
you need to nest it like this
select first_name from
(select rownum as rn, first_name from
(select first_name from some_table order by first_name)
) where rn > 100 and rn <= 200
The reason for this is that rownum is determined after the where clause and before the order by clause. To see what I mean, you can query
select rownum,first_name from some_table order by first_name
and you might get
4 Diane
2 Norm
3 Sam
1 Woody
That's because oracle evaluates the where clause (none in this case), then assigns rownums, then sorts the results by first_name. You have to nest the query so it uses the rownum assigned after the rows have been sorted.
The second nesting has to do with how rownum is treated in a where condition. Basically, if you query "where rownum > 100" then you get no results. It's a chicken and egg thing where it can't return any rows until it finds rownum > 100, but since it's not returning any rows it never increments rownum, so it never counts to 100. Ugh. The second level of nesting solves this. Note it must alias the rownum column at this point.
Lastly, your order by clause must make the query deterministic. For example, if you have John Doe and John Smith, and you order by first name only, then the two can switch places from one execution of the query to the next.
There are articles here http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html
and here http://www.oracle.com/technology/oramag/oracle/07-jan/o17asktom.html. Now that I see how long my post is, I probably should have just posted those links...
Unfortunately, the methods for restricting the range of rows returned by a query vary from one DBMS to another: Oracle uses ROWNUM (see ocdecio's answer), but ROWNUM won't work in SQL Server.
Perhaps you can encapsulate these differences with a function that takes a given SQL statement and first and last row numbers and generates the appropriate paginatd SQL for the target DBMS - i.e. something like:
sql = paginated ('select empno, ename from emp where job = ?', 101, 150)
which would return
'select * from (select v.*, ROWNUM rn from ('
+ theSql
+ ') v where rownum < 150) where rn >= 101'
for Oracle and something else for SQL Server.
However, note that the Oracle solution is adding a new column RN to the results that you'll need to deal with.
I believe that both have a ROWNUM analytic Function. Use that and you'll be identical.
In Oracle it is here
ROW_NUMBER
Yep, just verified that ROW_NUMBER is the same function in both.
"Because...data can be change from other sessions."
What do you want to happen for this ?
For example, user gets the 'latest' ten rows at 10:30.
At 10:31, 3 new rows are added (so those ten being view by the user are no longer the latest).
At 10:32, the user requests then 'next' ten entries.
Do you want that new set to include those three that have been bumped from 8/9/10 down to 11/12/13 ?
If not, in Oracle you can select the data as it was at 10:30
SELECT * FROM table_1 as of timestamp (timestamp '2009-01-29 10:30:00');
You still need the row_number logic, eg
select * from
(SELECT a.*, row_number() over (order by hire_date) rn
FROM hr.employees as of timestamp (timestamp '2009-01-29 10:30:00') a)
where rn between 10 and 19
select *
from ( select /*+ FIRST_ROWS(n) */ a.*,
ROWNUM rnum
from ( your_query_goes_here,
with order by ) a
where ROWNUM <=
:MAX_ROW_TO_FETCH )
where rnum >= :MIN_ROW_TO_FETCH;
Step 1: your query with order by
Step 2: select a.*, ROWNUM rnum from ()a where ROWNUM <=:MAX_ROW_TO_FETCH
Step 3: select * from ( ) where rnum >= :MIN_ROW_TO_FETCH;
put 1 in 2 and 2 in 3
If the expected data set is huge, I'd recommend to create a temp table, a view or a snapshot (materialized view) to store the query results + a row number retrieved either using ROWNUM or ROW_NUMBER analytic function. After that you can simply query this temp storage using row number ranges.
Basically, you need to separate the actual data fetch from the paging.
There is no uniform way to ensure paging across various RDBMS products. Oracle gives you rownum which you can use in where clause like:
where rownum < 1000
SQL Server gives you row_id( ) function which can be used similar to Oracle's rownum. However, row_id( ) isn't available before SQL Server 2005.