I need to write a query that selects a minimum value and it's second most minimum value from a list of integers.
Grabbing the smallest value is obvious:
select min(value) from table;
But the second smallest is not so obvious.
For the record, this list of integers is not sequential -- the min can be 1000, and the second most min can be 10000.
Use an analytic function
SELECT value
FROM (SELECT value,
dense_rank() over (order by value asc) rnk
FROM table)
WHERE rnk = 2
The analytic functions RANK, DENSE_RANK, and ROW_NUMBER are identical except for how they handle ties. RANK uses a sports-style process of breaking ties so if two rows tie for a rank of 1, the next row has a rank of 3. DENSE_RANK gives both of the rows tied for first place a rank of 1 and then assigns the next row a rank of 2. ROW_NUMBER arbitrarily breaks the tie and gives one of the two rows with the lowest value a rank of 1 and the other a rank of 2.
select
value
from
(select
value,
dense_rank() over (order by value) rank
from
table)
where
rank = 2
Advantage: You can get the third value just as easy, or the bottom 10 rows (rank <= 10).
Note that the performance of this query will benefit from a proper index on 'value'.
SELECT MIN(value)
FROM TABLE
WHERE Value > (SELECT MIN(value) FROM TABLE)
Related
I have a problem with GROUP BY one column and choose second column that is string depends on Count number from column three.
So I have a table with ID's in column one, string in column two and Count in column three. I have ordered that by ID's and Count descending.
Most of the ID's are unique but sometimes id's occurs more than once. In this case I would like to choose only string with bigger count number. How can I do that?
SELECT id, string, count
FROM ...
ORDER BY id, count DESC
In BigQuery, you can use aggregation:
select array_agg(t order by count desc limit 1)[ordinal(1)].*
from t
group by id;
What this does is construct an array of the full records for each id. But this array is ordered by the largest count first -- and only the first element of the array is used. The [ordinal(1)].* is just a convenient way to return the record fields as separate columns.
The more canonical method in SQL would be:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by id order by count desc) as seqnum
from t
) t
where seqnum = 1;
Given;
CREATE TABLE T1 (ID INTEGER, DESCRIPTION VARCHAR2(20));
INSERT INTO T1 VALUES (1,'ONE');
INSERT INTO T1 VALUES (2,'TWO');
INSERT INTO T1 VALUES (3,'THREE');
INSERT INTO T1 VALUES (4,'FOUR');
INSERT INTO T1 VALUES (5,'FIVE');
COMMIT;
Why does;
SELECT * FROM
( SELECT ROWNUM, ID, DESCRIPTION
FROM T1)
WHERE MOD(ROWNUM,1)=0;
Return
ROWNUM ID DESCRIPTION
------ -------------------------------------- --------------------
1 1 ONE
2 2 TWO
3 3 THREE
4 4 FOUR
5 5 FIVE
Whereas;
SELECT * FROM
( SELECT ROWNUM, ID, DESCRIPTION
FROM T1)
WHERE MOD(ROWNUM,2)=0;
Return zero rows ???
Confused, expected ROWNUM=(2,4) to be returned...
SELECT B.* FROM
( SELECT ROWNUM a, ID, DESCRIPTION
FROM T1) B
WHERE MOD(A,2)=0;
Reason: Your approach involves running rownum twice. You don't need to; nor really do you want to. Based on order of operations, the where clause will execute before the the outer select; which means the select hasn't determined the values for each row, and the number of rows is not known yet.
Additional:
I would recommend adding an order by to the inline view so the rownumbers are in a expected specific order as opposed to what the engine derives.
You have 2 operations of ROWNUM.
The 1st ROWNUM generates the numbers 1 through 5.
The 2nd ROWNUM doesn't generate anything because for the row the ROWNUM value is 1, but since MOD(1,2)=0 is false, the record is not being outputted and the ROWNUM is not being incremented, failing the condition again and again.
This query, using alias, returns exactly what you have expected:
SELECT * FROM
( SELECT ROWNUM as rn, ID, DESCRIPTION
FROM T1)
WHERE MOD(rn,2)=0;
Some facts about the ROWNUM pseudo column in Oracle:
The ROWNUM assigned to each row is determined by the order in which Oracle retrieves the row from the DB.
The order in which rows are returned is non deterministic, such that running it once may return rows in one ordering, and a second time around may have a different ordering if the base tables have been reorganized, or Oracle uses a different query plan.
The order in which ROWNUMs are assigned to rows is not necessarily correlated with the that of an order by clause (the order by clause may affect the ROWNUM order since it may cause a different query plan to be used, but the ROWNUMbers are unlikely to match the sort order).
ROWNUMbers are assigned after the records are filtered by the WHERE clause, so if you filter out ROWNUM 1 you will never return any records.
Filtering a subquery that returns an aliased ROWNUM column works because the entire subquery is returned to the outer query before the outer query filters the rows, but the ROWNUMs will still have a non deterministic order.
To successfully return a top N or Nth row query in a deterministic fashion you need to assign row numbers in a deterministic way. One such way is to use the the `ROW_NUMBER' analytic function in a subquery:
select * from
(select ROW_NUMBER() over (order by ID) rn
, ID
, DESCRIPTION
from T1)
where rn <= 4 -- Top N
or
where rn = 4 -- 1st Nth row
or even
WHERE MOD(rn,2)=0 -- every Nth row
In either case the ORDER BY clause in the ROW_NUMBER analytic function needs to match the granularity of the data otherwise ties in the ordering will again be non deterministic, most likely matching the current ROWNUM ordering.
I'm trying to select the estimated hours of a row with the lowest date from a table.
SELECT prev_est_hrs
FROM ( SELECT MIN(change_date), prev_est_hrs
FROM task_history
WHERE task_id = 5
GROUP BY prev_est_hrs
);
However this is returning two rows, why? I thought MIN was supposed to return the lowest only?
Help much appreciated.
You have a GROUP BY clause. The MIN will return the minimum value in each group.
Also, you are only returning the group by value from the outer SELECT.
Mitch is right. One way to get the prev_est_hrs for the record with the earliest change_date, which seems to be what you're trying to find, is with an analytic function:
SELECT prev_est_hrs
FROM (
SELECT prev_est_hrs, ROW_NUMBER() OVER (ORDER BY change_date) AS rn
FROM task_history
WHERE task_id = 5
)
WHERE rn = 1;
You need to consider what should happen if you have two rows with the same date. This would pick one of them at random. If there is some other criteria you could use to break the tie you could add that to the order by clause. If you wanted all matching rows in that case you could use rank() instead; look at dense_rank() as well, they all have their place.
Use this query to do that:
SELECT max(prev_est_hrs) keep (dense_rank first order by change_date)
FROM task_history
WHERE task_id = 5;
I have a table of data and want to retrieve the penultimate record.
How is this done?
TABLE: results
-------
30
31
35
I need to get 31.
I've been trying with rownum but it doesn't seem to work.
Assuming you want the second highest number and there are no ties
SELECT results
FROM (SELECT results,
rank() over (order by results desc) rnk
FROM your_table_name)
WHERE rnk = 2
Depending on how you want to handle ties, you may want either the rank, dense_rank, or row_number analytic function. If there are two 35's for example, would you want 35 returned? Or 31? If there are two 31's, would you want a single row returned? Or would you want both 31's returned.
This can use for n th rank ##
select Total_amount from (select Total_amount, rank() over (order by Total_amount desc) Rank from tbl_booking)tbl_booking where Rank=3
SELECT id
FROM (
SELECT id
FROM table
WHERE
PROCS_DT is null
ORDER BY prty desc, cret_dt ) where rownum >0 and rownum <=100
The above query is giving me back 100 records as expected
SELECT id
FROM (
SELECT id
FROM table
WHERE
PROCS_DT is null
ORDER BY prty desc, cret_dt ) where rownum >101 and rownum <=200
why is the above query returning me zero records?
Can some one help me how i can keep on. I am dumb in oracle...
Try this:
SELECT id
FROM
(SELECT id,
rownum AS rn
FROM
(SELECT id
FROM TABLE
WHERE PROCS_DT IS NULL
ORDER BY prty DESC, cret_dt) )
WHERE rn >101
AND rn <=200
If you are comfortable using the ANALYTIC functions, try this:
SELECT id
FROM
(
SELECT id,
ROW_NUMBER() OVER(ORDER BY prty DESC, cret_dt ) rn
FROM table
WHERE procs_dt IS NULL
)
WHERE rn >101 and rn <=200
ROWNUM values are assigned to rows as they are returned from a query (or subquery). If a row is not returned, it is not assigned a ROWNUM value at all; so the ROWNUM values returned always begin at 1 and increment by 1 for each row.
(Note that these values are assigned prior to any sorting indicated by the ORDER BY clause. This is why in your case you need to check rownum outside the subquery.)
The odd bit of logic you have to understand is that when you have a predicate on ROWNUM, you are filtering on a value that will only exist if the row passes the filter. Conceptually, Oracle applies any other filters in the query first, then tentatively assigns ROWNUM 1 to the first matching row and checks it against the filter on ROWNUM. If it passes this check, it will be returned with that ROWNUM value, and the next row will be tentatively assigned ROWNUM 2. But if it does not pass the check, the row is discarded, and the same ROWNUM value is tentatively assigned to the next row.
Therefore, if the filter on ROWNUM does not accept a value of 1, no rows will ever pass the filter.
The use of the analytic function ROW_NUMBER() shown in the other answers is one way around this. This function explicitly assigns row numbers (distinct from ROWNUM) based on a given ordering. However, this can change performance significantly, as the optimizer does not necessarily realize that it can avoid assigning numbers to ever possible row in order to complete the query.
The traditional ROWNUM-based way of doing what you want is:
SELECT id
FROM (
SELECT rownum rn, id
FROM (
SELECT id
FROM table
WHERE
PROCS_DT is null
ORDER BY prty desc, cret_dt
) where rownum <=200
) where rn > 101
The innermost query conceptually finds all matching rows and sorts them. The next layer assigns ROWNUMs to these and returns only the first 200 matches. (And actually, the Oracle optimizer understands the significance of a sort followed by a ROWNUM filter, and will usually do the sort in such a way as to identify the top 200 rows without caring about the specific ordering of the other rows.)
The middle layer also takes the ROWNUMs that it assigns and returns them as part of its result set with the alias "rn". This allows the outermost layer to filter on that value to establish the lower limit.
I would experiment with this variant and the analytic function to see which performs better in your case.