Oracle 10g performance between rownum or with clause - sql

I am using Oracle Database 10g Enterprise Edition 10.2.0.4.0 64bit
and I would like to know, what is the best way to write the following query?
1. With rownum
SELECT * FROM
(
SELECT ID_DONNEE_H, DATE_DONNEE
FROM DONNEE_H d
WHERE d.DATE_DONNEE > sysdate -50000
AND d.ID_SC = 38648
ORDER BY DATE_DONNEE DESC
)
WHERE rownum=1;
2. With a WITH clause
with req as (
select d.ID_DONNEE_H, row_number() over (order by DATE_DONNEE desc) as seqnum
from DONNEE_H d
where d.DATE_DONNEE > sysdate -50000
AND d.ID_SC = 38648 )
select * from req where seqnum = 1;
3. With row_number() in an inline view
select * from (select d.ID_DONNEE_H, row_number() over (order by DATE_DONNEE desc) as seqnum
from DONNEE_H d
where d.DATE_DONNEE > sysdate -50000
AND d.ID_SC = 38648) test
where seqnum = 1;
I think 2 and 3 are similar, but which is the fastest, 1, 2 or 3?

I don't think you can particularly generalise as to which query is the "best" in all situations. As with most IT questions, the answer is: "it depends"! You would need to investigate each query on its own merits.
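One way to investigate each query on its own merits is to compare execution plans. The following is a minimal sketch using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY against query 1 from the question; the same two statements can be repeated for queries 2 and 3. The plan you get depends on your indexes and statistics, so treat it as a starting point rather than a verdict.

```sql
-- Generate the plan for query 1 (the ROWNUM version) without executing it.
EXPLAIN PLAN FOR
SELECT * FROM
(
  SELECT ID_DONNEE_H, DATE_DONNEE
  FROM DONNEE_H d
  WHERE d.DATE_DONNEE > SYSDATE - 50000
  AND d.ID_SC = 38648
  ORDER BY DATE_DONNEE DESC
)
WHERE ROWNUM = 1;

-- Display the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

When comparing, it is worth checking whether the ROWNUM version shows a stop-key sort and whether the row_number() versions show a window sort, and then timing the queries on realistic data.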
As an aside, you've missed another alternative, assuming that you only need a single column rather than the whole row that has the highest value of the column you're ordering by:
with sample_data as (select 10 col1, 3 col2 from dual union all
select 20 col1, 3 col2 from dual union all
select 30 col1, 1 col2 from dual union all
select 40 col1, 2 col2 from dual)
select max(col1) keep (dense_rank first order by col2 desc) col1_val_of_max_col2
from sample_data;
COL1_VAL_OF_MAX_COL2
--------------------
20
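Applied to the table in the question, the same technique might look like the sketch below. It assumes you only need ID_DONNEE_H and DATE_DONNEE for the most recent row. Note two differences from the ROWNUM version: if several rows share the latest DATE_DONNEE, this returns the largest ID_DONNEE_H among them, and if no rows match the filter it still returns one row of NULLs.

```sql
-- Sketch: latest row for ID_SC = 38648 using FIRST/KEEP instead of a sort + filter.
SELECT MAX(d.ID_DONNEE_H) KEEP (DENSE_RANK FIRST ORDER BY d.DATE_DONNEE DESC) AS ID_DONNEE_H,
       MAX(d.DATE_DONNEE) AS DATE_DONNEE
FROM DONNEE_H d
WHERE d.DATE_DONNEE > SYSDATE - 50000
AND d.ID_SC = 38648;
```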

Related

Oracle get rank for only latest date

I have origin table A:
dt c1 value
2022/10/1 1 1
2022/10/2 1 2
2022/10/3 1 3
2022/10/1 2 4
2022/10/2 2 6
2022/10/3 2 5
Currently I got the latest dt's percent_rank by:
select * from
(
select
*,
percent_rank() over (partition by c1 order by value) as prank
from A
) as pt
where pt.dt = Date'2022-10-3'
Demo: https://www.db-fiddle.com/f/rXynTaD5nmLqFJdjDSCZpL/0
The expected result looks like:
dt c1 value prank
2022/10/3 1 3 1
2022/10/3 2 5 0.5
This means that on 2022-10-3 the value in the c1 = 1 group has a percent_rank of 100% over its history, while the value in the c1 = 2 group has 50%.
But this SQL sorts every partition, which I think has a time complexity of O(n log n).
I just need the latest date's rank, and I thought I could get that by calculating count(last_value > value)/count(), which costs O(n).
Any suggestions?
Rather than hard-coding the maximum date, you can use the ROW_NUMBER() analytic function:
SELECT *
FROM (
SELECT t.*,
PERCENT_RANK() OVER (PARTITION BY c1 ORDER BY value) AS prank,
ROW_NUMBER() OVER (PARTITION BY c1 ORDER BY dt DESC) AS rn
FROM table_name t
) t
WHERE rn = 1
Which, for the sample data:
CREATE TABLE table_name (dt, c1, value) AS
SELECT DATE '2022-10-01', 1, 1 FROM DUAL UNION ALL
SELECT DATE '2022-10-02', 1, 2 FROM DUAL UNION ALL
SELECT DATE '2022-10-03', 1, 3 FROM DUAL UNION ALL
SELECT DATE '2022-10-01', 2, 4 FROM DUAL UNION ALL
SELECT DATE '2022-10-02', 2, 6 FROM DUAL UNION ALL
SELECT DATE '2022-10-03', 2, 5 FROM DUAL;
Outputs:
DT                   C1  VALUE  PRANK  RN
2022-10-03 00:00:00  1   3      1      1
2022-10-03 00:00:00  2   5      .5     1
Regarding "But this sql will sort every partition which I thought it's time complexity is O(n log n)": whatever you do, you will need to iterate over the entire result set.
Regarding "I just need the latest date's rank and I thought I could do that by calculating count(last_value > value)/count()": you would then need to find the last value, which (unless you hard-code the last date) involves an index or table scan over all the values in each partition plus a sort, and counting the greater values then requires a second index or table scan. You can profile both solutions, but I expect you would find that analytic functions are as efficient as, if not better than, aggregation functions.
For example:
SELECT c1,
dt,
value,
( SELECT ( COUNT(CASE WHEN value <= t.value THEN 1 END) - 1 )
/ ( COUNT(*) - 1 )
FROM table_name c
WHERE c.c1 = t.c1
) AS prank
FROM table_name t
WHERE dt = DATE '2022-10-03'
This is going to access the table twice, and you are likely to find that the I/O costs of table access far outweigh any potential savings from using a different method. In any case, if you look at the explain plan, the query still performs an aggregate sort, so there are no cost savings, only additional costs from this method.
Try this
select t.c1, t.dt, t.value
from TABLENAME t
inner join (
select c1, max(dt) as MaxDate
from TABLENAME
group by c1
) tm on t.c1 = tm.c1 and t.dt = tm.MaxDate ORDER BY dt DESC;
Or, if listing every row newest-first is enough, as simple as
SELECT * from TABLENAME ORDER BY dt DESC;
I fiddled with it a bit; this is almost the same answer MT0 already gave.
select dt, c1, val, prank * 100 as percent_rank
from (
select t1.*,
percent_rank() over (partition by c1 order by val) as prank,
row_number() over (partition by c1 order by dt desc) as rn
from t1
)
where rn = 1;
result
DT C1 VAL PERCENT_RANK
2022-10-03 1 3 100
2022-10-03 2 5 50
http://sqlfiddle.com/#!4/ec60a/23
I used row_number = 1 to get the latest date,
and also expressed the percent_rank as a percentage.
Is this what you desire?

sql - single line per distinct values in a given column

Is there a way using SQL, in BigQuery more specifically, to get one line per unique value in a given column?
I know that this is possible using a sequence of UNION queries, with as many unions as there are distinct values in the column of interest, but I'm wondering if there is a better way to do it.
You can use row_number():
select t.* except (seqnum)
from (select t.*, row_number() over (partition by col order by col) as seqnum
from t
) t
where seqnum = 1;
This returns an arbitrary row. You can control which row by adjusting the order by.
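As a rough illustration of controlling which row is kept, the sketch below picks the row with the highest id for each col. The CTE src and the columns id and col are hypothetical sample data, not part of the question.

```sql
-- Hypothetical sample data: id and col are illustrative names.
with src as (
  select 1 as id, 1 as col union all
  select 2, 1 union all
  select 3, 1 union all
  select 4, 2 union all
  select 5, 2 union all
  select 6, 3
)
select s.* except (seqnum)
from (select src.*,
             -- order by id desc so that seqnum = 1 is the highest id per col
             row_number() over (partition by col order by id desc) as seqnum
      from src
     ) s
where seqnum = 1;
-- keeps ids 3, 5 and 6 (output order is not guaranteed without an outer order by)
```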
Another fun solution in BigQuery uses structs:
select array_agg(t limit 1)[ordinal(1)].*
from t
group by col;
You can add an order by (order by X limit 1) if you want a particular row.
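For instance, the struct version with an explicit ordering could be sketched like this. Again, src, id and col are hypothetical sample names; ordering by id ascending keeps the lowest id per col.

```sql
-- Hypothetical sample data: id and col are illustrative names.
with src as (
  select 1 as id, 1 as col union all
  select 2, 1 union all
  select 3, 1 union all
  select 4, 2 union all
  select 5, 2 union all
  select 6, 3
)
-- For each col, aggregate the whole row as a struct, keep only the first
-- one by id, then expand it back into columns.
select array_agg(s order by id limit 1)[ordinal(1)].*
from src s
group by col;
-- keeps ids 1, 4 and 6 (output order is not guaranteed without an outer order by)
```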
Here is just a more readable formatting of the same query:
select tab.* except(seqnum)
from (
select *, row_number() over (partition by column_x order by column_x) as seqnum
from `project.dataset.table`
) as tab
where seqnum = 1
Below is for BigQuery Standard SQL
#standardSQL
SELECT AS VALUE ANY_VALUE(t)
FROM `project.dataset.table` t
GROUP BY col
You can test and play with the above using dummy data, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, 1 col UNION ALL
SELECT 2, 1 UNION ALL
SELECT 3, 1 UNION ALL
SELECT 4, 2 UNION ALL
SELECT 5, 2 UNION ALL
SELECT 6, 3
)
SELECT AS VALUE ANY_VALUE(t)
FROM `project.dataset.table` t
GROUP BY col
with result
Row id col
1 1 1
2 4 2
3 6 3

Add rownum from specific number - Oracle SQL

I have a table:
table1
col1 col2
1 a
1 b
1 c
I want to add rownum but from a specific number, for ex. starting from 100, so it would look like:
col1 col2 rn
1 a 100
1 b 101
1 c 102
I know how to add rownum like below:
select a.*, rownum as rn from table1 a;
But I don't know how to add from a specific number. How to do it in Oracle SQL?
The ANSI SQL way of doing this would be to use ROW_NUMBER:
SELECT col1, col2, 99 + ROW_NUMBER() OVER (ORDER BY col2) rn
FROM table1;
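To try this without creating a table, here is a self-contained sketch that rebuilds the three rows from the question inline with DUAL. The offset 99 is simply the desired starting number minus one.

```sql
-- Inline copy of the question's sample rows.
WITH table1 AS (
  SELECT 1 AS col1, 'a' AS col2 FROM dual UNION ALL
  SELECT 1, 'b' FROM dual UNION ALL
  SELECT 1, 'c' FROM dual
)
SELECT col1, col2,
       99 + ROW_NUMBER() OVER (ORDER BY col2) AS rn  -- numbering starts at 100
FROM table1
ORDER BY rn;
-- rows: (1, a, 100), (1, b, 101), (1, c, 102)
```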
You can also use Oracle's ROWNUM pseudocolumn here, but ROWNUM is assigned before ORDER BY is applied, so to number the rows in a defined order you need to sort in a subquery first:
SELECT col1, col2, 99 + ROWNUM AS rn
FROM (SELECT col1, col2 FROM table1 ORDER BY col2);
I don't think you need anything more elaborate for this; if the order in which the numbers are assigned does not matter, you can simply add an offset to rownum, for example:
select a.*, 99+rownum as rn from table1 a;

Extract specific rows from a table in Oracle

I want to write a query which will fetch the first and last 3 records from the table.
Below are the table details:
select * from employee_src
Now, to get the above result, I am using the query below:
select fname,lname,ssn,salary,dno from employee_src where rownum <=3
union all
select fname,lname,ssn,salary,dno from (select fname,lname,ssn,salary,dno from employee_src order by rownum desc) where rownum <=3
On running this query I am getting the below result
I am getting the first 3 and last 3 rows, but the last 3 rows are not in the same order as in the original table. How can I fix this?
Try this:
select * from (select * from employee_src order by rownum asc) where rownum <= 3
union all
select * from (select * from employee_src order by rownum desc) employee_src_last3
where rownum <= 3
Try now...
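If the last 3 rows also need to come back in their original order, one possible approach is to number the rows once and filter on both ends. This is only a sketch: a table has no inherent order, so it assumes that ssn defines the order you consider "original"; replace it with whatever column actually does.

```sql
SELECT fname, lname, ssn, salary, dno
FROM (
  SELECT e.*,
         ROW_NUMBER() OVER (ORDER BY ssn) AS rn,  -- assumption: ssn defines the row order
         COUNT(*) OVER () AS cnt                  -- total number of rows
  FROM employee_src e
)
WHERE rn <= 3        -- first 3 rows
   OR rn > cnt - 3   -- last 3 rows
ORDER BY rn;
```

Unlike the UNION ALL version, this does not return the same row twice when the table has fewer than 6 rows.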

How to group rows, select one by a number - ext with other columns - in SQL Oracle?

I had an issue with writing a query that would gather groups in a column, and then select one of them by a number.
A good person (#sstan) gave me this:
select your_col
from (select your_col,
row_number() over (order by your_col) as rn
from your_table
group by your_col)
where rn = 2
And it works. However, it appears that my query needs to consider other columns. For now, it looks like this:
select MAINCOL, sum(some_col+other_col) as together_col, count(another_col)
from my_table
where date_col >= next_day(trunc(sysdate), 'MONDAY') - 14
and date_col < next_day(trunc(sysdate), 'MONDAY') - 7
group by MAINCOL, other_col, together_col
order by MAINCOL
So the challenge is to extend the first query with the second one. It seems simple, but I couldn't make it work.
You could try an inline view with a table alias:
SELECT t.your_col, t.your_col2, t.your_col3
FROM (SELECT your_col, your_col2, your_col3,
row_number() OVER (ORDER BY your_col) AS rn
FROM your_table
GROUP BY your_col, your_col2, your_col3) t
WHERE t.rn = 2
Got it!
With help of Stack, of course.
select t.*
from (select MAINCOL, col1, col2, col3, col4, DENSE_RANK()OVER(ORDER BY MAINCOL) GROUPID
from tab_1
group by MAINCOL, col1, col2, col3, col4
) t
where GROUPID = 1;
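For reference, combining the aggregate query from the question with the row-numbering idea might look like the sketch below. It assumes you want one aggregated row per MAINCOL and then the second of those rows; the alias another_cnt is made up for illustration.

```sql
SELECT MAINCOL, together_col, another_cnt
FROM (
  SELECT MAINCOL,
         SUM(some_col + other_col) AS together_col,
         COUNT(another_col) AS another_cnt,
         -- analytic functions run after GROUP BY, so this numbers the groups
         ROW_NUMBER() OVER (ORDER BY MAINCOL) AS rn
  FROM my_table
  WHERE date_col >= NEXT_DAY(TRUNC(SYSDATE), 'MONDAY') - 14
  AND date_col < NEXT_DAY(TRUNC(SYSDATE), 'MONDAY') - 7
  GROUP BY MAINCOL
)
WHERE rn = 2;
```

Grouping only by MAINCOL keeps one row per group; adding other columns to the GROUP BY would split each MAINCOL into several rows, in which case DENSE_RANK (as in the answer above) is the better fit.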