What are the differences (advantages and disadvantages) between these two coding techniques?
select * from (
select rownum rnun, * from table where rownum < x
) where rnum > y
select * from (
select * from table
) where rownum < x and x > y
The two queries return different rows.
Neither query is deterministic. So neither query should ever be used in a real system.
The first query appears to be at least an attempt to generate a window of rows (rows between x and y). Since there is no ORDER BY, however, the order of rows is not deterministic and the window probably doesn't do what you want.
The second query returns an arbitrary x rows of data (assuming x > y). Otherwise it returns 0 rows (if y >= x). If you're trying to build some sort of windowing query, this isn't it.
If you want a windowing query that works, you'd want something like
SELECT *
FROM (SELECT a.*,
row_number() over (order by something) rnum
FROM table_name)
WHERE rnum BETWEEN x AND y
If you wanted to use ROWNUM, you'd need something like
SELECT *
FROM (SELECT a.*,
rownum rnum
FROM( SELECT b.*
FROM table_name
ORDER BY something) a)
WHERE rownum < y
AND rnum > x
But this tends to be less efficient than the analytic query approach.
Besides the absence of the “order by” clause…..
May be the question is about Oracle STOPKEY feature? In case of “paging” queries Oracle can use a STOPKEY feature to limit the number of rows in the subquery, this can lead to some performance gain.
Look at this query:
select *
from (select a.*,
row_number() over (order by sname) rnum
from t_patient_card a)
where rnum between 1 and 100
Cost Cardinality
SELECT STATEMENT, GOAL = FIRST_ROWS 313272 3571266
VIEW HOSPITAL2$ 313272 3571266
SORT ORDER BY 313272 3571266
COUNT
TABLE ACCESS FULL HOSPITAL2$ T_PATIENT_CARD 38883 3571266
Oracle fetched all the rows before is return only 100 of them
Let’s rewrite the query like this:
select *
from (
select rownum as rn,tt.* from
(
select t.* from t_patient_card t order by t.sname
)tt where rownum<100
)
WHERE rn >1
In this case we user rownum<100 in the subquery to inform the optimizer that we want to get less the 100 rows.
Cost Cardinality
SELECT STATEMENT, GOAL = ALL_ROWS 313272 99
VIEW HOSPITAL2$ 313272 99
COUNT STOPKEY
VIEW HOSPITAL2$ 313272 3571266
SORT ORDER BY STOPKEY 313272 3571266
TABLE ACCESS FULL HOSPITAL2$ T_PATIENT_CARD 38883 3571266
You can see the “count stopkey” and cardinality is only 99 after this step.In my database the second query executes one second faster then the first one.
Related
I am writing a very simple query for an Oracle DB (version 9).
Somehow I can get first 5 rows:
select * from cities where rownum <= 5
But skipping 5 rows returns an empty result:
select * from cities where rownum >= 5
Using:
Oracle SQL Developer
Oracle DB version 9
Why is the second query returning an empty result?
In Oracle Database 12c (release 1) and above, you can do this very simple, for skip 5 rows:
SELECT * FROM T OFFSET 5 ROWS
and for skip 5 rows and take 15 rows:
SELECT * FROM T OFFSET 5 ROWS FETCH NEXT 15 ROWS ONLY
You can use the following query to skip the first not n of rows.
select * from (
select rslts.*, rownum as rec_no from (
<<Query with proper order by (If you don't have proper order by you will see weird results)>>
) rslts
) where rec_no > <<startRowNum - n>>
The above query is similar to pagination query below.
select * from (
select rslts.*, rownum as rec_no from (
<<Query with proper order by (If you don't have proper order by you will see weird results)>>
) rslts where rownum <= <<endRowNum>>
) where rec_no > <<startRowNum>>
Your cities query:
select * from (
select rslts.*, rownum as rec_no from (
select * from cities order by 1
) rslts
) where rec_no > 5 <<startRowNum>>
Note: Assume first column in cities table is unique key
Oracle increments rownum each time it adds a row to the result set. So saying rownum < 5 is fine; as it adds each of the first 5 rows it increments rownum, but then once ruwnum = 5 the WHERE clause stops matching, no more rows are added to the result, and though you don't notice this rownum stops incrementing.
But if you say WHERE rownum > 5 then right off the bat, the WHERE clause doesn't match; and since, say, the first row isn't added to the result set, rownum isn't incremented... so rownum can never reach a value greater than 5 and the WHERE clause can never match.
To get the result you want, you can use row_number() over() in a subquery, like
select *
from (select row_number() over() rn, -- other values
from table
where -- ...)
where rn > 5
Update - As noted by others, this kind of query only makes sense if you can
control the order of the row numbering, so you should really use row_number() over(order bysomething) where something is a useful ordering key in deciding which records are "the first 5 records".
rownum is being increased only when a row is being output, so this type of condition won't work.
In any case, you are not ordering your rows, so what's the point?
Used row_number() over (order by id):
select * from
(select row_number() over (order by id) rn, c.* from countries c)
where rn > 5
Used ROWNUM:
select * from
(select rownum rn, c.* from countries c)
where rn > 5
Important note:
Using alias as countries c instead of countries is required! Without, it gives an error "missing expression"
Even better would be:
select * from mytab sample(5) fetch next 1 rows only;
Sample clause indicates the probability of each row getting picked up in the sampling process. FETCH NEXT clause indicates the number of rows you want to select.
With this code, you can query your table with skip and take.
select * from (
select a.*, rownum rnum from (
select * from cities
) a
) WHERE rnum >= :skip + 1 AND rnum <= :skip + :take
This code works with Oracle 11g. With Oracle 12, there is already a better way to perform this queries with offset and fetch
I am using a sql to select data from a table for a particular range of records.
I am using rownum to implement greater and less than logic to arrive at the required set of data.
With the use of below sql I can fetch the records from 21 to 40 from my table. In total for this condition table contains 100 rows.
Through this sql I want to fetch an indicator(value) which tells that there are more records for this condition.
I could not find any solution in google.
Sql -
select * from ( select rownum rnum, a.* from(SELECT TO_CHAR(D.DATE,'YYYYMMDD'),D.TYPE,
TO_CHAR(D.VDATE,'YYYYMMDD'),D.AMT,D.PARTICULAR,D.NUM,D.ID,
D.CODE,D.INFO FROM MySCHEMA
.MYTABLE D WHERE D.DATE >= TO_CHAR(TO_DATE('20160701','YYYYMMDD'),'DD-MON-RRRR')
AND D.DATE <= TO_CHAR(TO_DATE('20161105','YYYYMMDD'),'DD-MON-RRRR') AND D.XDATE >= TO_CHAR(TO_DATE('20160701','YYYYMMDD'),'DD-MON-RRRR')
AND D.XDATE <= TO_CHAR(TO_DATE('20161105','YYYYMMDD'),'DD-MON-RRRR')
AND D.FLG='Y' AND D.TYPE IN('D','C')
AND
D.ACI = 'CO6'
ORDER BY D.DATE DESC
)
a where rownum <= 40 ) where rnum >= 21;
You can add a count(*) over () total_rows to your inner select. That will tell you how many rows the inner query would return without the rownum predicates. That is going to mean, however, that every time you ask for a page of results, Oracle has to execute the inner query completely and then discard all the rows you aren't fetching. That's going to be more expensive than what you are currently doing
select *
from ( select rownum rnum, a.*
from(SELECT count(1) over () total_rows,
<<the rest of your query>>
If I say:
select * from table order by col1 where rownum < 100
If the table has 10 million records, will Oracle bring all 10 million, sort it and then show me the first 10? Or is there a way it will optimise it?
If you do this
select * from table order by col1 where rownum < 100
then Oracle will throw an error as the WHERE clause comes before the ORDER BY.
If you do this
select * from table where rownum < 100 order by col1
then Oracle will return a random 99 records as the WHERE clause comes before the ORDER BY.
If you want to return a the first 100 records, ordered by a column, you must put the order by in a sub-select.
select *
from ( select * from table order by col1 )
where rownum <= 100
Oracle will do the sort, how else will it know the records you want? However, it will be a sort with a stopkey because of the ROWNUM. Oracle doesn't actually sort the entire result set, as some optimisation goes on under the hood, but this is what you can assume takes place.
Please see this article by Tom Kyte.
How can one filter a grouped resultset for only those groups that meet some criterion compared against the other groups? For example, only those groups that have the maximum number of constituent records?
I had thought that a subquery as follows should do the trick:
SELECT * FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t HAVING Records = MAX(Records);
However the addition of the final HAVING clause results in an empty recordset... what's going on?
In MySQL (Which I assume you are using since you have posted SELECT *, COUNT(*) FROM T GROUP BY X Which would fail in all RDBMS that I know of). You can use:
SELECT T.*
FROM T
INNER JOIN
( SELECT X, COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
) T2
ON T2.X = T.X
This has been tested in MySQL and removes the implicit grouping/aggregation.
If you can use windowed functions and one of TOP/LIMIT with Ties or Common Table expressions it becomes even shorter:
Windowed function + CTE: (MS SQL-Server & PostgreSQL Tested)
WITH CTE AS
( SELECT *, COUNT(*) OVER(PARTITION BY X) AS Records
FROM T
)
SELECT *
FROM CTE
WHERE Records = (SELECT MAX(Records) FROM CTE)
Windowed Function with TOP (MS SQL-Server Tested)
SELECT TOP 1 WITH TIES *
FROM ( SELECT *, COUNT(*) OVER(PARTITION BY X) [Records]
FROM T
)
ORDER BY Records DESC
Lastly, I have never used oracle so apolgies for not adding a solution that works on oracle...
EDIT
My Solution for MySQL did not take into account ties, and my suggestion for a solution to this kind of steps on the toes of what you have said you want to avoid (duplicate subqueries) so I am not sure I can help after all, however just in case it is preferable here is a version that will work as required on your fiddle:
SELECT T.*
FROM T
INNER JOIN
( SELECT X
FROM T
GROUP BY X
HAVING COUNT(*) =
( SELECT COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
)
) T2
ON T2.X = T.X
For the exact question you give, one way to look at it is that you want the group of records where there is no other group that has more records. So if you say
SELECT taxid, COUNT(*) as howMany
GROUP by taxid
You get all counties and their counts
Then you can treat that expressions as a table by making it a subquery, and give it an alias. Below I assign two "copies" of the query the names X and Y and ask for taxids that don't have any more in one table. If there are two with the same number I'd get two or more. Different databases have proprietary syntax, notably TOP and LIMIT, that make this kind of query simpler, easier to understand.
SELECT taxid FROM
(select taxid, count(*) as HowMany from flats
GROUP by taxid) as X
WHERE NOT EXISTS
(
SELECT * from
(
SELECT taxid, count(*) as HowMany FROM
flats
GROUP by taxid
) AS Y
WHERE Y.howmany > X.howmany
)
Try this:
SELECT * FROM (
SELECT *, MAX(Records) as max_records FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t
) WHERE Records = max_records
I'm sorry that I can't test the validity of this query right now.
In my SP I have the following:
with Paging(RowNo, ID, Name, TotalOccurrences) as
(
ROW_NUMBER() over (order by TotalOccurrences desc) as RowNo, V.ID, V.Name, R.TotalOccurrences FROM dbo.Videos V INNER JOIN ....
)
SELECT * FROM Paging WHERE RowNo BETWEEN 1 and 50
SELECT COUNT(*) FROM Paging
The result is that I get the error: invalid object name 'Paging'.
Can I query again the Paging table? I don't want to include the count for all results as a new column ... I would prefer to return as another data set. Is that possible?
Thanks, Radu
After more research I fond another way of doing this:
with Paging(RowNo, ID, Name, TotalOccurrences) AS
(
ROW_NUMBER() over (order by TotalOccurrences desc) as RowNo, V.ID, V.Name, R.TotalOccurrences FROM dbo.Videos V INNER JOIN ....
)
select RowNo, ID, Name, TotalOccurrences, (select COUNT(*) from Paging) as TotalResults from Paging where RowNo between (#PageNumber - 1 )* #PageSize + 1 and #PageNumber * #PageSize;
I think that this has better performance than calling two times the query.
You can't do that because the CTE you are defining will only be available to the FIRST query that appears after it's been defined. So when you run the COUNT(*) query, the CTE is no longer available to reference. That's just a limitation of CTEs.
So to do the COUNT as a separate step, you'd need to not use the CTE and instead use the full query to COUNT on.
Or, you could wrap the CTE up in an inline table valued function and use that instead, to save repeating the main query, something like this:
CREATE FUNCTION dbo.ufnExample()
RETURNS TABLE
AS
RETURN
(
with Paging(RowNo, ID, Name, TotalOccurrences) as
(
ROW_NUMBER() over (order by TotalOccurrences desc) as RowNo, V.ID, V.Name, R.TotalOccurrences FROM dbo.Videos V INNER JOIN ....
)
SELECT * FROM Paging
)
SELECT * FROM dbo.ufnExample() x WHERE RowNo BETWEEN 1 AND 50
SELECT COUNT(*) FROM dbo.ufnExample() x
Please be aware that Radu D's solution's query plan shows double hits to those tables. It is doing two executions under the covers. However, this still may be the best way as I haven't found a truly scalable 1-query design.
A less scalable 1-query design is to dump a completed ordered list into a #tablevariable , SELECT ##ROWCOUNT to get the full count, and select from #tablevariable where row number between X and Y. This works well for <10000 rows, but with results in the millions of rows, populating that #tablevariable gets expensive.
A hybrid approach is to populate this temp/variable up to 10000 rows. If not all 10000 rows are filled up, you're set. If 10000 rows are filled up, you'll need to rerun the search to get the full count. This works well if most of your queries return well under 10000 rows. The 10000 limit is a rough approximation, you can play around with this threshold for your case.
Write "AS" after the CTE table name Paging as below:
with Paging AS (RowNo, ID, Name, TotalOccurrences) as
(
ROW_NUMBER() over (order by TotalOccurrences desc) as RowNo, V.ID, V.Name, R.TotalOccurrences FROM dbo.Videos V INNER JOIN ....
)
SELECT * FROM Paging WHERE RowNo BETWEEN 1 and 50
SELECT COUNT(*) FROM Paging