In Oracle, how to select specific row while aggregating all rows - sql

I have a requirement that I need to both aggregate all rows by id, and find 1 specific row among the rows of the same id. It's like 2 SQL queries, but I want to make it in 1 SQL query. I'm using Oracle database.
for example,table t1 whose data looks like:
id | name | num
----- -------- -------
1 | 'a' | 1
2 | 'b' | 3
2 | 'c' | 6
2 | 'd' | 6
I want to aggregate the data by the id, find the 'name' with the highest 'count', and sum all count of the id to 'total_count'.
There are 2 rows with same num, pick up the first one.
id | highest_num | name_of_highest_num | total_num | avg_num
----- ------------- --------------------- ------------ -------------------
1 | 1 | 'a' | 1 | 1
2 | 6 | 'c' | 15 | 5
Can I get this result by 1 Oracle SQL query?
Thanks in advance for any replies.

Oracle Setup:
CREATE TABLE table_name ( id, name, num ) AS
SELECT 1, 'a', 1 FROM DUAL UNION ALL
SELECT 2, 'b', 3 FROM DUAL UNION ALL
SELECT 2, 'c', 6 FROM DUAL UNION ALL
SELECT 2, 'd', 6 FROM DUAL;
Query:
SELECT id,
MAX( num ) AS highest_num,
MAX( name ) KEEP ( DENSE_RANK LAST ORDER BY num ) AS name_of_highest_num,
SUM( num ) AS total_num,
AVG( num ) AS avg_num
FROM table_name
GROUP BY id
Output:
ID HIGHEST_NUM NAME_OF_HIGHEST_NUM TOTAL_NUM AVG_NUM
-- ----------- ------------------- --------- -------
1 1 a 1 1
2 6 d 15 5

Here's one option using row_number in a subquery with conditional aggregation:
select id,
max(num) as highest_num,
max(case when rn = 1 then name end) as name_of_highest_num,
sum(num) as total_num,
avg(num) as avg_num
from (
select id, name, num,
row_number() over (partition by id order by num desc) rn
from a
) t
group by id
SQL Fiddle Demo

Sounds like you want to use some analytic functions. Something like this should work
select id,
num highest_num,
name name_of_highest_num,
total total_num,
average avg_num
from (select id,
num,
name,
rank() over (partition by id
order by num desc, name asc) rnk,
sum(num) over (partition by id) total,
avg(num) over (partition by id) average
from table t1)
where rnk = 1

Related

Does Oracle LISTAGG work with GROUP BY and PIVOT?

I have this table:
ID CATEGORY SCORE
-------------------------
1 A 2
1 B 1
and am trying to get the avg(score) for each category:
ID SCORE_A_AVG SCORE_B_AVG
------------------------------
1 1.5 1.5
I've tried this but all results are null
select p.* from (
select ID, LISTAGG(CATEGORY, ',') within group (order by CATEGORY), avg(score) score
from foo
group by ID)
PIVOT (avg(score) score
FOR category IN ('A' as A_AVG, 'B' as B_AVG)) p;
I can get results for one or the other category by NOT using LISTAGG:
... MIN(CATEGORY) ...
Yes, it does work but you need the aggregation in the PIVOT:
SELECT *
FROM test_data
PIVOT (
AVG( score ) AS avg_score,
LISTAGG( score, ',' ) WITHIN GROUP ( ORDER BY score ) AS scores
FOR category IN (
'A' AS a,
'B' AS b
)
)
So for the test data:
CREATE TABLE test_data ( ID, CATEGORY, SCORE ) AS
SELECT 1, 'A', 2 FROM DUAL UNION ALL
SELECT 1, 'A', 3 FROM DUAL UNION ALL
SELECT 1, 'B', 1 FROM DUAL UNION ALL
SELECT 1, 'B', 2 FROM DUAL
this outputs:
ID | A_AVG_SCORE | A_SCORES | B_AVG_SCORE | B_SCORES
-: | ----------: | :------- | ----------: | :-------
1 | 2.5 | 2,3 | 1.5 | 1,2
db<>fiddle here
I would just use conditional aggregation:
select id,
avg(case when category = 'A' then score end) as score_a,
avg(case when category = 'B' then score end) as score_b
from foo
group by id;
I have no idea why you are thinking listagg() rather than avg().

How can I use the self join to find equal values within a group?

I am looking to find instances of GROUPID where all price values are 0. The following is a simplified version of what I am looking at
--------------------------------
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 002 | 4 | 4 |
| 002 | 4 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
| 004 | 4 | 4 |
| 004 | 7 | 4 |
--------------------------------
I am attempting to use the following query to find all GROUPID where both PRICE values for that particular group = 0.
SELECT * FROM MYTABLE WHERE GROUPID IN
(SELECT TB1.GROUPID FROM MYTABLE TB1 JOIN MYTABLE TB2 ON TB1.GROUPID = TB2.GROUPID
AND TB1.PRICE = 0 AND TB2.PRICE = 0)
and CUSTOMER = 4
ORDER BY GROUPID;
This query returns:
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
--------------------------------
In my case, I only need it to return GROUPID 003.
I'd also like to ask for assistance in modifying the query to return all non 0 equal PRICE values within a groupid. It doesn't have to be in the same query as above. For example the return would be:
| Groupid | Price | Customer|
--------------------------------
| 002 | 4 | 4 |
| 002 | 4 | 4 |
Any help would be appreciated. Thank you for your time.
If all the prices are zero, then look at the minimum and maximum price for the groupid:
select groupid
from mytable t
group by groupid
having min(price) = 0 and max(price) = 0;
I should point out that no self-join is required for this.
Try this:
SELECT * FROM MYTABLE as m1 where Price = 0
and not exists (
select 1
from MYTABLE as m2
where m2.Groupid = m1.Groupid
and m2.Price <> 0
)
You can list the the rows, whose group_id has no rows with non-zero price
select groupid, price, customer from mytable t
where not exists (
select 1 from mytable where group_id = t.group_id
and price != 0
);
I would do it by counting the number of distinct prices over each group (and I'm assuming that each group will have the same customer) and then filtering on the price(s) you're interested in.
For example, for prices that are 0 for all rows in each groupid+customer:
WITH mytable AS (SELECT '001' groupid, 9 price, 4 customer FROM dual UNION ALL
SELECT '001' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 7 price, 4 customer FROM dual)
SELECT groupid,
customer,
price
FROM (SELECT groupid,
customer,
price,
COUNT(DISTINCT price) OVER (PARTITION BY groupid, customer) num_distinct_prices
FROM mytable)
WHERE num_distinct_prices = 1
AND price = 0;
GROUPID CUSTOMER PRICE
------- ---------- ----------
003 4 0
003 4 0
Just change the and price = 0 to and price != 0 if you want the groups which have the same non-zero price for all rows. Or simply remove that predicate altogether.
EDIT
Gordon's is the best solution for the first part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING (MAX(price) = 0 and MIN(price) = 0)
And for the second part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING MIN(price) <> 0 AND (MAX(price) = MIN(price))
My original one:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING SUM(Price) =0
This assumes, there are no negative prices.
To the second part of your question:
SELECT groupid
FROM mytable t
WHERE Price > 0
GROUP BY GroupID, Price
HAVING COUNT(price) > 1
In your sample data, you have only one customer. I assume if you had more than one customer, you would still want to return the rows where the groupid has the same price, across all rows and all customers. If so, you could use the query below. It is almost the same as Boneist's - I just use min(price) and max(price) instead of count(distinct), and I don't include customer in partition by.
If the price may be NULL, it will be ignored in the computation of max price and min price; if all the NON-NULL prices are equal for a groupid, all the rows for that group will be returned. If price can be NULL and this is NOT the desired behavior, that can be changed easily - but the OP will have to clarify.
The query below retrieves all the cases when there is a single price for the groupid. To retrieve only the groups where the price is 0 (an additional condition), add and price = 0 to the WHERE clause of the outer query. I added more test data to illustrate some of the cases the query covers.
with
test_data ( groupid, price, customer ) as (
select '001', 9, 4 from dual union all
select '001', 0, 4 from dual union all
select '002', 4, 4 from dual union all
select '002', 4, 4 from dual union all
select '003', 0, 4 from dual union all
select '003', 0, 4 from dual union all
select '004', 4, 4 from dual union all
select '004', 7, 4 from dual union all
select '002', 4, 8 from dual union all
select '005', 2, 8 from dual union all
select '005', null, 8 from dual
),
prep ( groupid, price, customer, min_price, max_price) as (
select groupid, price, customer,
min(price) over (partition by groupid),
max(price) over (partition by groupid)
from test_data
)
select groupid, price, customer
from prep
where min_price = max_price
;
GROUPID PRICE CUSTOMER
------- --------- ---------
002 4 8
002 4 4
002 4 4
003 0 4
003 0 4
005 8
005 2 8
This may be what you want:
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price <> 0)
and just change the last line for the other query:
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price = 0)
I would do it very similarly to what Gordon posted
SELECT groupId
FROM MyTable
GROUP BY groupId
HAVING SUM(price) = 0

SQL theory: Filtering out duplicates in one column, picking lowest value in other column

I am trying to figure out the best way to remove rows from a result set where either the value in one column or the value in a different column has a duplicate in the result set.
Imagine the results of a query are as follows:
a_value | b_value
-----------------
1 | 1
2 | 1
2 | 2
3 | 1
4 | 3
5 | 2
6 | 4
6 | 5
What I want to do is:
Eliminate all rows that have duplicate values in a_value
Pick only 1 row for a given b_value
So I'd want the filtered results to end up like this after eliminating a_value duplicates:
a_value | b_value
-----------------
1 | 1
3 | 1
4 | 3
5 | 2
And then like this after picking only a single b_value:
a_value | b_value
-----------------
1 | 1
4 | 3
5 | 2
I'd appreciate suggestions on how to accomplish this task in an efficient way via SQL.
with
q_res ( a_value, b_value ) as (
select 1, 1 from dual union all
select 2, 1 from dual union all
select 2, 2 from dual union all
select 3, 1 from dual union all
select 4, 3 from dual union all
select 5, 2 from dual union all
select 6, 4 from dual union all
select 6, 5 from dual
)
-- end test data; solution begins below
select min(a_value) as a_value, b_value
from (
select a_value, min(b_value) as b_value
from q_res
group by a_value
having count(*) = 1
)
group by b_value
order by a_value -- ORDER BY is optional
;
A_VALUE B_VALUE
------- -------
1 1
4 3
5 2
1) In the inner query I am avoiding all duplicates which are present in a_value
column and getting all the remaining rows from input table and storing them
as t2. By joining t2 with t1 there would be full data without any dups as per
your #1 in requirement.
SELECT t1.*
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value;
2) Once the filtered data is obtained, I am assigning rank to each row in the filtered dataset obtained in step-1 and I am selecting only rows with rank=1.
SELECT X.a_value,
X.b_value
FROM
(
SELECT t1.*,
ROW_NUMBER() OVER ( PARTITION BY t1.b_value ORDER BY t1.a_value,t1.b_value ) AS rn
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value
) X
WHERE X.rn = 1;

Union merging different columns

I have two tables:
Table1:
DATAID| NAME | FACTOR
1 | Ann | 1
2 | Kate | 1
3 | Piter | 1
Table2:
DATAID| NAME | FACTOR
1 | John | 2
6 | Arse | 2
3 | Garry | 2
I would like UNION those tables and get this result:
DATAID| NAME | FACTOR
1 | Ann | 1,2
2 | Kate | 1
3 | Piter | 1,2
6 | Arse | 2
So when there's 2 rows with same dataid, I would like to get 'NAME' column from Table1 and some kind of aggregated FACTOR, for example '1,2' or 3
One method uses listagg():
select dataid, name,
listagg(factor, ',') within group (order by factor) as factors
from ((select dataid, name, factor from table1 t1
) union all
(select dataid, name, factor from table2 t2
)
) t
group by dataid, name;
Note: I notice that the names are not the same for a given id. You can choose one by using aggregation functions.
Or, if you only have one row in each table, you can use a full outer join:
select coalesce(t1.dataid, t2.dataid) as dataid,
coalesce(t1.name, t2.name) as name,
trim(leading ',' from coalesce(',' || t1.factor, ',') || coalesce(',' || t2.factor, '') as factors
from t1 full outer join
t2
on t1.dataid = t2.dataid;
Something like this should work. In your actual situation you will not need the first two CTE's (the subqueries in the WITH clause I added for testing).
with
table1 ( dataid, name, factor ) as (
select 1, 'Ann' , 1 from dual union all
select 2, 'Kate' , 1 from dual union all
select 3, 'Piter', 1 from dual
),
table2 ( dataid, name, factor ) as (
select 1, 'John' , 2 from dual union all
select 6, 'Arse' , 2 from dual union all
select 3, 'Garry', 2 from dual
),
u ( dataid, name, factor, source ) as (
select dataid, name, factor, 1 from table1
union all
select dataid, name, factor, 2 from table2
),
z ( dataid, name, factor ) as (
select dataid, first_value(name) over (partition by dataid order by source),
factor
from u
)
select dataid, name,
listagg(factor, ',') within group (order by factor) as factor
from z
group by dataid, name
order by dataid
;
Output:
DATAID NAME FACTOR
------- ----- ---------
1 Ann 1,2
2 Kate 1
3 Piter 1,2
6 Arse 2
4 rows selected.

Use SUM function in oracle

I have a table in Oracle which contains :
id | month | payment | rev
----------------------------
A | 1 | 10 | 0
A | 2 | 20 | 0
A | 2 | 30 | 1
A | 3 | 40 | 0
A | 4 | 50 | 0
A | 4 | 60 | 1
A | 4 | 70 | 2
I want to calculate the payment column (SUM(payment)). For (id=A month=2) and (id=A month=4), I just want to take the greatest value from REV column. So that the sum is (10+30+40+70)=150. How to do it?
You can also use below.
select id,sum(payment) as value
from
(
select id,month,max(payment) from table1
group by id,month
)
group by id
Edit: for checking greatest rev value
select id,sum(payment) as value
from (
select id,month,rev,payment ,row_number() over (partition by id,month order by rev desc) as rno from table1
) where rno=1
group by id
This presupposes you don't have more than one value per rev. If that's not the case, then you probably want a row_number analytic instead of max.
with latest as (
select
id, month, payment, rev,
max (rev) over (partition by id, month) as max_rev
from table1
)
select sum (payment)
from latest
where rev = max_rev
Or there's this, if I've understood the requirement right:
with demo as (
select 'A'as id, 1 as month, 10 as payment, 0 as rev from dual
union all select 'A',2,20,0 from dual
union all select 'A',2,30,1 from dual
union all select 'A',3,40,0 from dual
union all select 'A',4,50,0 from dual
union all select 'A',4,60,1 from dual
union all select 'A',4,70,2 from dual
)
select sum(payment) keep (dense_rank last order by rev)
from demo;
You can check the breakdown by including the key columns:
with demo as (
select 'A'as id, 1 as month, 10 as payment, 0 as rev from dual
union all select 'A',2,20,0 from dual
union all select 'A',2,30,1 from dual
union all select 'A',3,40,0 from dual
union all select 'A',4,50,0 from dual
union all select 'A',4,60,1 from dual
union all select 'A',4,70,2 from dual
)
select id, month, max(rev)
, sum(payment) keep (dense_rank last order by rev)
from demo
group by id, month;
select sum(payment) from tableName where id='A' and month=2 OR month=4 order by payment asc;