How to get Nth big item [duplicate] - sql

I am getting the age of the 11th person, ordered by age ascending, from the users table with the following SQL statement:
select MAX(age)
from ( select *
from (select *
from users
order by age asc)
where rownum <12)
Is there a simpler and more efficient query to get that 11th person with full information (the whole row)?
I am using Oracle 11g.

WITH AgeOrderedPersons AS (
SELECT usr.*
,ROW_NUMBER() OVER (ORDER BY Age) AS rn -- NUMBER is a reserved word in Oracle, so it cannot be an unquoted alias
FROM Users usr
)
SELECT *
FROM AgeOrderedPersons
WHERE rn = 11
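ROW_NUMBER() returns exactly one row even when several users share that age. If you want all users with the same age, use DENSE_RANK() instead of ROW_NUMBER(); the 11 then counts distinct ages rather than rows.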
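To make the difference concrete, here is a minimal sketch on made-up data (the four rows and the name column are assumptions for illustration, not part of the question). It computes both functions side by side and filters on the second distinct age:

```sql
-- Hypothetical sample data: two users share the age 20.
WITH users (name, age) AS (
  SELECT 'A', 10 FROM DUAL UNION ALL
  SELECT 'B', 20 FROM DUAL UNION ALL
  SELECT 'C', 20 FROM DUAL UNION ALL
  SELECT 'D', 30 FROM DUAL
),
ranked AS (
  SELECT u.*,
         ROW_NUMBER() OVER (ORDER BY age) AS rn,   -- unique per row; order among ties is arbitrary
         DENSE_RANK() OVER (ORDER BY age) AS drnk  -- same value for equal ages, no gaps
  FROM users u
)
SELECT name, age, rn, drnk
FROM ranked
WHERE drnk = 2;  -- returns both B and C; "WHERE rn = 2" would return only one of them
```

With ROW_NUMBER(), which of the tied rows gets rn = 2 is not guaranteed unless the ORDER BY includes a tie-breaking column.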

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE users ( age, x ) AS
SELECT 20, 1 FROM DUAL
UNION ALL SELECT 80, 2 FROM DUAL
UNION ALL SELECT 47, 3 FROM DUAL
UNION ALL SELECT 33, 4 FROM DUAL
UNION ALL SELECT 24, 5 FROM DUAL
UNION ALL SELECT 7, 6 FROM DUAL
UNION ALL SELECT 102, 7 FROM DUAL
UNION ALL SELECT 99, 8 FROM DUAL
UNION ALL SELECT 90, 9 FROM DUAL
UNION ALL SELECT 28, 10 FROM DUAL
UNION ALL SELECT 46, 11 FROM DUAL
UNION ALL SELECT 54, 12 FROM DUAL
UNION ALL SELECT 67, 13 FROM DUAL
UNION ALL SELECT 17, 14 FROM DUAL
UNION ALL SELECT 34, 15 FROM DUAL
UNION ALL SELECT 32, 16 FROM DUAL
UNION ALL SELECT 39, 17 FROM DUAL
UNION ALL SELECT 26, 18 FROM DUAL
UNION ALL SELECT 15, 19 FROM DUAL
UNION ALL SELECT 12, 20 FROM DUAL;
Query 1:
SELECT DISTINCT
NTH_VALUE( age, 11 ) IGNORE NULLS OVER ( ORDER BY age ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS age,
NTH_VALUE( x, 11 ) IGNORE NULLS OVER ( ORDER BY age ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS x
FROM users
Results:
| AGE | X |
|-----|----|
| 34 | 15 |
Query 2:
Include an ordering clause in the statement with ROW_NUMBER():
WITH ranked AS (
SELECT u.*,
ROW_NUMBER() OVER ( ORDER BY age ) AS rn
FROM users u
ORDER BY age
)
SELECT age, x
FROM ranked
WHERE rn = 11
Results:
| AGE | X |
|-----|----|
| 34 | 15 |
Query 3:
WITH ordered AS (
SELECT *
FROM users
ORDER BY age
),
ranked AS (
SELECT o.*,
ROWNUM AS rn
FROM ordered o
WHERE ROWNUM <= 11
)
SELECT age, x
FROM ranked
WHERE rn = 11
Results:
| AGE | X |
|-----|----|
| 34 | 15 |

Related

Select greatest n per group using EXISTS

I have a RoadInsp table in Oracle 18c. I've put the data in a CTE for purpose of this question:
with roadinsp (objectid, asset_id, date_) as (
select 1, 1, to_date('2016-04-01','YYYY-MM-DD') from dual union all
select 2, 1, to_date('2019-03-01','YYYY-MM-DD') from dual union all
select 3, 1, to_date('2022-01-01','YYYY-MM-DD') from dual union all
select 4, 2, to_date('2016-04-01','YYYY-MM-DD') from dual union all
select 5, 2, to_date('2021-01-01','YYYY-MM-DD') from dual union all
select 6, 3, to_date('2022-03-01','YYYY-MM-DD') from dual union all
select 7, 3, to_date('2016-04-01','YYYY-MM-DD') from dual union all
select 8, 3, to_date('2018-03-01','YYYY-MM-DD') from dual union all
select 9, 3, to_date('2013-03-01','YYYY-MM-DD') from dual union all
select 10, 3, to_date('2010-06-01','YYYY-MM-DD') from dual
)
select * from roadinsp
OBJECTID ASSET_ID DATE_
---------- ---------- ----------
1 1 2016-04-01
2 1 2019-03-01
3 1 2022-01-01 --select this row
4 2 2016-04-01
5 2 2021-01-01 --select this row
6 3 2022-03-01 --select this row
7 3 2016-04-01
8 3 2018-03-01
9 3 2013-03-01
10 3 2010-06-01
I'm using GIS software that only lets me use SQL in a WHERE clause/SQL expression, not a full SELECT query.
I want to select the greatest n per group using the WHERE clause. In other words, for each ASSET_ID, I want to select the row that has the latest date.
As an experiment, I want to make the selection specifically using the EXISTS operator.
The reason being: While this post technically pertains to Oracle (since that's what S.O. community members would have access to), in practice, I want to use the logic in a proprietary database called a file geodatabase. The file geodatabase has very limited SQL support; a small subset of SQL-92 syntax. But it does seem to support EXISTS and subqueries, although not correlated subqueries, joins, or any modern SQL syntax. Very frustrating.
SQL reference for query expressions used in ArcGIS
Subquery support in file geodatabases is limited to the following:
Scalar subqueries with comparison operators. A scalar subquery returns a single value, for example:
GDP2006 > (SELECT MAX(GDP2005) FROM countries)
For file geodatabases, the set functions AVG, COUNT, MIN, MAX, and
SUM can only be used in scalar subqueries.
EXISTS predicate, for example:
EXISTS (SELECT * FROM indep_countries WHERE COUNTRY_NAME = 'Mexico')
Question:
Using the EXISTS operator, is there a way to select the greatest n per group? (keeping in mind the limitations mentioned above)
Edit:
If an asset has multiple rows with the same top date, then only one of those rows should be selected.
The RANK analytic function does the job, if it is available to you (Oracle 18c certainly supports it).
Sample data:
with roadinsp (objectid, asset_id, date_) as (
  select 1, 1, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 2, 1, to_date('2019-03-01','YYYY-MM-DD') from dual union all
  select 3, 1, to_date('2022-01-01','YYYY-MM-DD') from dual union all
  select 4, 2, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 5, 2, to_date('2021-01-01','YYYY-MM-DD') from dual union all
  select 6, 3, to_date('2022-03-01','YYYY-MM-DD') from dual union all
  select 7, 3, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 8, 3, to_date('2018-03-01','YYYY-MM-DD') from dual union all
  select 9, 3, to_date('2013-03-01','YYYY-MM-DD') from dual union all
  select 10, 3, to_date('2010-06-01','YYYY-MM-DD') from dual
),
Query begins here: first rank rows per asset_id by date in descending order:
temp as
  (select r.*,
          rank() over (partition by asset_id order by date_ desc) rnk
   from roadinsp r
  )
Finally, fetch rows that rank as the highest:
select *
from temp
where rnk = 1;
OBJECTID   ASSET_ID   DATE_      RNK
---------- ---------- ---------- ----------
3          1          2022-01-01 1
5          2          2021-01-01 1
6          3          2022-03-01 1
If you can't use that, how about a subquery?
<snip>
select r.objectid, r.asset_id, r.date_
from roadinsp r
where (r.asset_id, r.date_) in (select t.asset_id, t.max_date
                                from (select a.asset_id, max(a.date_) max_date
                                      from roadinsp a
                                      group by a.asset_id
                                     ) t
                               );
OBJECTID   ASSET_ID   DATE_
---------- ---------- ----------
6          3          2022-03-01
5          2          2021-01-01
3          1          2022-01-01
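Since the question asks specifically about EXISTS, here is a hedged sketch of how the same selection could be written with NOT EXISTS in Oracle. It relies on a correlated subquery, which the question says the file geodatabase does not support, so treat it as an Oracle-only illustration rather than a solution for that environment. The tie-break on objectid is an assumption, chosen so that only one row per asset is kept when several rows share the latest date.

```sql
with roadinsp (objectid, asset_id, date_) as (
  select 1, 1, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 2, 1, to_date('2019-03-01','YYYY-MM-DD') from dual union all
  select 3, 1, to_date('2022-01-01','YYYY-MM-DD') from dual union all
  select 4, 2, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 5, 2, to_date('2021-01-01','YYYY-MM-DD') from dual union all
  select 6, 3, to_date('2022-03-01','YYYY-MM-DD') from dual union all
  select 7, 3, to_date('2016-04-01','YYYY-MM-DD') from dual union all
  select 8, 3, to_date('2018-03-01','YYYY-MM-DD') from dual union all
  select 9, 3, to_date('2013-03-01','YYYY-MM-DD') from dual union all
  select 10, 3, to_date('2010-06-01','YYYY-MM-DD') from dual
)
select r.objectid, r.asset_id, r.date_
from roadinsp r
where not exists (
        -- Keep a row only if no other row of the same asset "beats" it:
        -- a later date, or the same date with a higher objectid (assumed tie-break).
        select 1
        from roadinsp r2
        where r2.asset_id = r.asset_id
          and (   r2.date_ > r.date_
               or (r2.date_ = r.date_ and r2.objectid > r.objectid))
      )
order by r.asset_id;
```

On the sample data this keeps objectid 3, 5 and 6, the same rows as the RANK and IN versions above.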

GROUP BY ID and select MAX

Good Evening,
I am working on a table like this in Oracle:
ID BALANCE SEQ
-- ------- ---
1  102     13
1  119     15
2  50      4
3  20      11
3  15      10
3  45      9
4  90      5
5  67      20
5  12      19
6  20      1
I want to select, for each ID, the BALANCE having MAX(SEQ).
So final result would be:
ID BALANCE SEQ
-- ------- ---
1  119     15
2  50      4
3  20      11
4  90      5
5  67      20
6  20      1
How can I do that?
I've tried several Group by queries but with no success.
Thanks for any help
One method is aggregation using keep:
select id,
max(balance) keep (dense_rank first order by seq desc) as balance,
max(seq)
from t
group by id;
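As a sketch of how that aggregate behaves, the query below runs it against the question's sample rows, supplied through a CTE named t (the CTE is only a stand-in for the real table). KEEP (DENSE_RANK FIRST ORDER BY seq DESC) restricts the aggregate to the rows with the highest SEQ in each ID group; MAX then only matters if several rows tie for that top SEQ.

```sql
WITH t (id, balance, seq) AS (
  SELECT 1, 102, 13 FROM dual UNION ALL
  SELECT 1, 119, 15 FROM dual UNION ALL
  SELECT 2, 50, 4 FROM dual UNION ALL
  SELECT 3, 20, 11 FROM dual UNION ALL
  SELECT 3, 15, 10 FROM dual UNION ALL
  SELECT 3, 45, 9 FROM dual UNION ALL
  SELECT 4, 90, 5 FROM dual UNION ALL
  SELECT 5, 67, 20 FROM dual UNION ALL
  SELECT 5, 12, 19 FROM dual UNION ALL
  SELECT 6, 20, 1 FROM dual
)
SELECT id,
       -- balance taken from the row(s) with the highest seq per id
       MAX(balance) KEEP (DENSE_RANK FIRST ORDER BY seq DESC) AS balance,
       MAX(seq) AS seq
FROM t
GROUP BY id
ORDER BY id;
-- ID BALANCE SEQ
--  1     119  15
--  2      50   4
--  3      20  11
--  4      90   5
--  5      67  20
--  6      20   1
```

Unlike the RANK() approach, this always returns one row per ID, even when two rows share the maximum SEQ.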
You can also use the ordinary RANK() analytic function:
SELECT ID, BALANCE, SEQ FROM (
select
ID, BALANCE, SEQ, RANK() OVER (PARTITION BY ID ORDER BY SEQ DESC) ranks
from t
) WHERE ranks = 1
Sample demo:
SELECT ID, BALANCE, SEQ FROM (
SELECT ID, BALANCE, SEQ, RANK() OVER (PARTITION BY ID ORDER BY SEQ DESC) ranks
FROM (
SELECT 1 ID, 102 BALANCE, 13 SEQ FROM dual UNION all
SELECT 1, 119, 15 FROM dual UNION all
SELECT 2, 50, 4 FROM dual UNION all
SELECT 3, 20, 11 FROM dual UNION all
SELECT 3, 15, 10 FROM dual UNION all
SELECT 3, 45, 9 FROM dual UNION all
SELECT 4, 90, 5 FROM dual UNION all
SELECT 5, 67, 20 FROM dual UNION all
SELECT 5, 12, 19 FROM dual UNION all
SELECT 6, 20, 1 FROM dual
)
) WHERE ranks = 1
You can wrap your larger query in the same way:
SELECT ID, BALANCE, SEQ FROM (
select
ID, BALANCE, SEQ, RANK() OVER (PARTITION BY ID ORDER BY SEQ DESC) ranks
from (**YOUR BIG QUERY HERE**)
) WHERE ranks = 1

SQL report - Matching percent value with a number

I have a small issue with my report and I need to know whether it is even possible to solve it.
I'm using Oracle 12c and OBIEE. I'm trying to create a custom column with numeric values (1 and 2) derived from my "Percent" column in the way described below.
Here are my results in a table:
Here is an example of how it should work:
Emilian is the owner of a few customers. Each customer has an annual revenue listed, and the column next to it is that customer's percentage of Emilian's total customer revenue. In my custom column I need to show "1" for the top customers that together contribute at least 80% of the owner's total, and "2" for the rest. In Emilian's case, his first two customers will be "1", since 78% + 14% is already above 80%, and the rest will be "2". For owners that have only one customer, that customer is logically "1", since its contribution is 100%.
I hope I made this clear. I would be very grateful for help with coding it :)
Alex
There's probably a much more efficient way to do this. I built up what you need to get at with a series of sub-selects. This still doesn't handle the equal percents, but you said that isn't an expected problem. I'd still watch out for it.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE t1 ( ownerId int, customerId int, revenue int ) ;
INSERT INTO t1 ( ownerid, customerid, revenue )
SELECT 1, 1, 99 FROM dual UNION ALL
SELECT 1, 2, 200 FROM dual UNION ALL
SELECT 1, 3, 300 FROM dual UNION ALL
SELECT 1, 4, 400 FROM dual UNION ALL
SELECT 2, 5, 100 FROM dual UNION ALL
SELECT 2, 6, 100 FROM dual UNION ALL
SELECT 2, 7, 200 FROM dual UNION ALL
SELECT 2, 8, 600 FROM dual UNION ALL
SELECT 3, 9, 100 FROM dual UNION ALL
SELECT 3, 10, 900 FROM dual UNION ALL
SELECT 4, 11, 1000 FROM dual UNION ALL
SELECT 5, 12, 1000 FROM dual UNION ALL
SELECT 6, 13, 200 FROM dual UNION ALL
SELECT 6, 14, 200 FROM dual UNION ALL
SELECT 6, 15, 200 FROM dual UNION ALL
SELECT 6, 16, 200 FROM dual UNION ALL
SELECT 6, 17, 200 FROM dual UNION ALL
SELECT 42, 736784, 1480000 FROM dual UNION ALL
SELECT 42, 736580, 280160 FROM dual UNION ALL
SELECT 42, 1040137, 112486 FROM dual UNION ALL
SELECT 42, 738685, 22903 FROM dual UNION ALL
SELECT 42, 736781, 56 FROM dual
;
Query 1:
SELECT s3.ownerID, s3.customerID, s3.revenue, s3.OwnerRevenue
, CAST(s3.customerRevPct AS decimal(5,2)) AS customerRevPct
, CASE WHEN s3.PctRT < 80 OR s3.custCount = 1 THEN 1 ELSE 2 END AS customCol
/* Do the running pcts add up to 80+? 1 customer = 100% == 1. What if all pcts are equal? */
FROM (
SELECT s2.*
, 100-SUM(nvl(s2.customerRevPct,0)) OVER (PARTITION BY s2.ownerID ORDER BY s2.customerRevPct, s2.customerID RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS pctRT
, COUNT(*) OVER (PARTITION BY s2.ownerID ORDER BY (s2.ownerID) ) AS custCount /* Is there only 1 customer? */
FROM (
SELECT s1.*
, ( ( ( s1.revenue * 1.0 ) / s1.ownerRevenue ) * 100 ) AS customerRevPct
FROM (
SELECT t1.ownerID, t1.customerID, t1.revenue
, SUM(t1.revenue) OVER ( PARTITION BY t1.ownerID ) AS ownerRevenue
FROM t1
) s1
) s2
) s3
WHERE ownerID = 42 /* REMOVE THIS LINE - TESTING ONLY */
ORDER BY s3.ownerID, s3.customerRevPct DESC
Results:
| OWNERID | CUSTOMERID | REVENUE | OWNERREVENUE | CUSTOMERREVPCT | CUSTOMCOL |
|---------|------------|---------|--------------|----------------|-----------|
| 42 | 736784 | 1480000 | 1895605 | 78.08 | 1 |
| 42 | 736580 | 280160 | 1895605 | 14.78 | 1 |
| 42 | 1040137 | 112486 | 1895605 | 5.93 | 2 |
| 42 | 738685 | 22903 | 1895605 | 1.21 | 2 |
| 42 | 736781 | 56 | 1895605 | 0 | 2 |
EDIT: I changed the Fiddle to illustrate your data example.
create table custrev(owner varchar2(100), cust_id number, revenue number);
insert into custrev values('Emilian',1,1480000);
insert into custrev values('Emilian',2,280160);
insert into custrev values('Emilian',3,112486);
insert into custrev values('Emilian',4,22903);
insert into custrev values('Emilian',5,56);
insert into custrev values('Andy',6,1378);
insert into custrev values('Sandy',7,560000);
commit;
Below is the SQL for your requirement.
select owner, cust_id, revenue, pct,
case when pct = 100 then 1
when flg is null or flg < 80 then 1
else 2 end flag_col
from (select owner, cust_id, revenue, pct,--cumulative_sum,
lag(cumulative_sum) over(partition by owner
order by revenue desc) flg
from (select owner, cust_id, revenue, pct,
sum(pct) over(partition by owner
order by revenue desc
rows between unbounded preceding
and current row) cumulative_sum
from (select owner, cust_id, revenue,
round(ratio_to_report(revenue) over(partition by owner)*100,2) pct
from custrev)
)
)
order by owner, revenue desc;
Output:
OWNER CUST_ID REVENUE PCT FLAG_COL
Andy 6 1378 100 1
Emilian 1 1480000 78.08 1
Emilian 2 280160 14.78 1
Emilian 3 112486 5.93 2
Emilian 4 22903 1.21 2
Emilian 5 56 0 2
Sandy 7 560000 100 1
Alex,
OBIEE is based on models. Not on SQL.
So sorry to say this but the SQL code will help you exactly zero...

Need to fetch rows with a value above the lowest plus some number, without using an inner query or subquery

CREATE TABLE "User" ( Name, Age ) AS
SELECT 'Ira1', 10 FROM DUAL
UNION ALL SELECT 'Ira2', 11 FROM DUAL
UNION ALL SELECT 'Ira3', 15 FROM DUAL
UNION ALL SELECT 'Ira4', 16 FROM DUAL
UNION ALL SELECT 'Ira5', 17 FROM DUAL
I want the rows whose Age is greater than the lowest Age + 5. The lowest Age is 10,
so I want all rows with an Age greater than 15.
The query I have, which uses an inner query, is:
select * from "User" -- the table was created with a quoted name; unquoted USER is a reserved word in Oracle
where age > (select age+5 from (select age from "User" order by age asc) where rownum=1);
Which returns:
Ira4 16
Ira5 17
Is there a way to do this with a single query, that is, with no inner query or subquery?
You can simplify the code slightly by using the MIN aggregation function (2 table scans):
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE "User" ( Name, Age ) AS
SELECT 'Ira1', 10 FROM DUAL
UNION ALL SELECT 'Ira2', 11 FROM DUAL
UNION ALL SELECT 'Ira3', 15 FROM DUAL
UNION ALL SELECT 'Ira4', 16 FROM DUAL
UNION ALL SELECT 'Ira5', 17 FROM DUAL
Query 1:
SELECT *
FROM "User"
WHERE Age > ( SELECT MIN( Age ) + 5 FROM "User" )
Results:
| NAME | AGE |
|------|-----|
| Ira4 | 16 |
| Ira5 | 17 |
Query 2:
And you can get a completely different explain plan using an analytic function (only 1 table scan):
SELECT Name, Age
FROM (
SELECT u.*,
MIN( Age ) OVER ( ORDER BY Age ) AS min_age
FROM "User" u
)
WHERE Age > Min_Age + 5
Results:
| NAME | AGE |
|------|-----|
| Ira4 | 16 |
| Ira5 | 17 |
You could use an analytic function to get the minimum age, but you'd still need a subquery. It would only do one pass through the table though.
with usr ( Name, Age ) AS ( SELECT 'Ira1', 10 FROM DUAL
UNION ALL SELECT 'Ira2', 11 FROM DUAL
UNION ALL SELECT 'Ira3', 15 FROM DUAL
UNION ALL SELECT 'Ira4', 16 FROM DUAL
UNION ALL SELECT 'Ira5', 17 FROM DUAL)
select name,
age
from (select name,
age,
min(age) over () min_age
from usr)
where age > min_age + 5;
NAME AGE
---- ----------
Ira4 16
Ira5 17

SQL Grouping by Ranges

I have a data set that has timestamped entries over various sets of groups.
Timestamp -- Group -- Value
---------------------------
1 -- A -- 10
2 -- A -- 20
3 -- B -- 15
4 -- B -- 25
5 -- C -- 5
6 -- A -- 5
7 -- A -- 10
I want to sum these values by the Group field, but parsed as it appears in the data. For example, the above data would result in the following output:
Group -- Sum
A -- 30
B -- 40
C -- 5
A -- 15
I do not want this, which is all I've been able to come up with on my own so far:
Group -- Sum
A -- 45
B -- 40
C -- 5
Using Oracle 11g, this is what I've cobbled together so far. I know that this is wrong, but I'm hoping I'm at least on the right track with RANK(). In the real data, entries with the same group could be 2 timestamps apart, or 100; there could be one entry in a group, or 100 consecutive ones. It does not matter; I need them separated.
WITH SUB_Q AS
(SELECT K_ID
, GRP
, VAL
-- GET THE RANK FROM TIMESTAMP TO SEPARATE GROUPS WITH SAME NAME
, RANK() OVER(PARTITION BY K_ID ORDER BY TMSTAMP) AS RNK
FROM MY_TABLE
WHERE K_ID = 123)
SELECT T1.K_ID
, T1.GRP
, SUM(CASE
WHEN T1.GRP = T2.GRP THEN
T1.VAL
ELSE
0
END) AS TOTAL_VALUE
FROM SUB_Q T1 -- MAIN VALUE
INNER JOIN SUB_Q T2 -- TIMSTAMP AFTER
ON T1.K_ID = T2.K_ID
AND T1.RNK = T2.RNK - 1
GROUP BY T1.K_ID
, T1.GRP
Is it possible to group in this way? How would I go about doing this?
I approach this problem by defining a grouping value that is the difference of two row_number() values:
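-- Uses the column names from the question's query (grp, val, tmstamp);
-- GROUP is a reserved word in Oracle and cannot be used as an unquoted column name.
select grp, sum(val)
from (select t.*,
             (row_number() over (order by tmstamp) -
              row_number() over (partition by grp order by tmstamp)
             ) as seq_grp
      from my_table t
     ) t
group by grp, seq_grp
order by min(tmstamp);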
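The difference of the two row numbers is constant within each run of adjacent rows that share the same group value, so together with the group value it identifies the run.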
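To see why the difference stays constant, the following sketch lists both row numbers and their difference for the question's seven sample rows. The CTE and the column names tmstamp, grp, val, rn_all, rn_in_grp and run_id are illustrative choices, not names from the original answer.

```sql
WITH my_table (tmstamp, grp, val) AS (
  SELECT 1, 'A', 10 FROM DUAL UNION ALL
  SELECT 2, 'A', 20 FROM DUAL UNION ALL
  SELECT 3, 'B', 15 FROM DUAL UNION ALL
  SELECT 4, 'B', 25 FROM DUAL UNION ALL
  SELECT 5, 'C', 5 FROM DUAL UNION ALL
  SELECT 6, 'A', 5 FROM DUAL UNION ALL
  SELECT 7, 'A', 10 FROM DUAL
)
SELECT tmstamp, grp, val,
       ROW_NUMBER() OVER (ORDER BY tmstamp) AS rn_all,                     -- position in the whole set
       ROW_NUMBER() OVER (PARTITION BY grp ORDER BY tmstamp) AS rn_in_grp, -- position within its group
       ROW_NUMBER() OVER (ORDER BY tmstamp)
         - ROW_NUMBER() OVER (PARTITION BY grp ORDER BY tmstamp) AS run_id -- constant per run
FROM my_table
ORDER BY tmstamp;
-- TMSTAMP GRP VAL RN_ALL RN_IN_GRP RUN_ID
--       1 A    10      1         1      0
--       2 A    20      2         2      0
--       3 B    15      3         1      2
--       4 B    25      4         2      2
--       5 C     5      5         1      4
--       6 A     5      6         3      3
--       7 A    10      7         4      3
```

Within a run both counters advance by one per row, so the difference does not change; when another group interrupts, only rn_all keeps counting, which shifts the difference for the next run of the same group. The difference alone is not guaranteed to be unique across different groups, which is why the aggregation groups by both the group column and the difference.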
A solution using LAG and windowed analytic functions:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TEST ( "Timestamp", "Group", Value ) AS
SELECT 1, 'A', 10 FROM DUAL
UNION ALL SELECT 2, 'A', 20 FROM DUAL
UNION ALL SELECT 3, 'B', 15 FROM DUAL
UNION ALL SELECT 4, 'B', 25 FROM DUAL
UNION ALL SELECT 5, 'C', 5 FROM DUAL
UNION ALL SELECT 6, 'A', 5 FROM DUAL
UNION ALL SELECT 7, 'A', 10 FROM DUAL;
Query 1:
WITH changes AS (
SELECT t.*,
CASE WHEN LAG( "Group" ) OVER ( ORDER BY "Timestamp" ) = "Group" THEN 0 ELSE 1 END AS hasChangedGroup
FROM TEST t
),
groups AS (
SELECT "Group",
VALUE,
SUM( hasChangedGroup ) OVER ( ORDER BY "Timestamp" ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS grp
FROM changes
)
SELECT "Group",
SUM( VALUE )
FROM Groups
GROUP BY "Group", grp
ORDER BY grp
Results:
| Group | SUM(VALUE) |
|-------|------------|
| A | 30 |
| B | 40 |
| C | 5 |
| A | 15 |
This is a typical "start_of_group" problem (see here: https://timurakhmadeev.wordpress.com/2013/07/21/start_of_group/)
In your case, it would be as follows:
with t as (
select 1 timestamp, 'A' grp, 10 value from dual union all
select 2, 'A', 20 from dual union all
select 3, 'B', 15 from dual union all
select 4, 'B', 25 from dual union all
select 5, 'C', 5 from dual union all
select 6, 'A', 5 from dual union all
select 7, 'A', 10 from dual
)
select min(timestamp), grp, sum(value) sum_value
from (
select t.*
, sum(start_of_group) over (order by timestamp) grp_id
from (
select t.*
, case when grp = lag(grp) over (order by timestamp) then 0 else 1 end
start_of_group
from t
) t
)
group by grp_id, grp
order by min(timestamp)
;