Oracle: MAX of SUM of Each Group - sql

My scenario is to show the hotel room with the highest maintenance cost for each hotel branch by using subqueries. I have three separate tables: branch, room, and maintenance.
Table branch
id NUMBER(3) PRIMARY KEY
location VARCHAR2(20)
Table room
id NUMBER(3) PRIMARY KEY
room_number CHAR(4)
branch_id NUMBER(3)
Table maintenance
id NUMBER(3) PRIMARY KEY
room_id NUMBER(3)
cost NUMBER(4)
With my desired output being in the format
location | room_number | cost
-------------------------------
| |
| |
| |
I'm not sure how to select the max value per branch after adding the total costs of each room. Please advise.

You can use window functions:
select *
from (
select b.location, r.room_number, m.cost,
rank() over(partition by b.id order by m.cost desc) rn
from branch b
inner join room r on r.branch_id = b.id
inner join maintenance m on m.room_id = r.id
) t
where rn = 1
If a room can have several maintenance records, we need to aggregate first:
select *
from (
select b.location, r.room_number, sum(m.cost) as cost,
rank() over(partition by b.id order by sum(m.cost) desc) rn
from branch b
inner join room r on r.branch_id = b.id
inner join maintenance m on m.room_id = r.id
group by b.id, b.location, r.room_number
) t
where rn = 1

You can use the ROW_NUMBER() analytic function after joining those tables:
SELECT location, room_number, cost
FROM (SELECT b.location,
r.room_number,
m.cost,
ROW_NUMBER() OVER(PARTITION BY r.branch_id ORDER BY m.cost DESC) AS rn
FROM branch b
JOIN room r
ON r.branch_id = b.id
JOIN maintenance m
ON m.room_id = r.id)
WHERE rn = 1
P.S. If ties (equal cost values) matter and should all be returned, even if that produces extra rows for the maximum cost, you can replace ROW_NUMBER() with DENSE_RANK().
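To make the tie behaviour concrete, here is a minimal sketch that runs the "sum per room, then rank per branch" idea on made-up data. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption, needs SQLite 3.25+ for window functions), the sample rows are hypothetical, and the per-room totals are computed in a CTE first rather than inside the window's ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (id INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE room (id INTEGER PRIMARY KEY, room_number TEXT, branch_id INTEGER);
CREATE TABLE maintenance (id INTEGER PRIMARY KEY, room_id INTEGER, cost INTEGER);
""")
# Hypothetical sample data, not from the original question
conn.executemany("INSERT INTO branch VALUES (?, ?)", [(1, "Paris"), (2, "Rome")])
conn.executemany("INSERT INTO room VALUES (?, ?, ?)",
                 [(1, "101", 1), (2, "102", 1), (3, "201", 2), (4, "202", 2)])
conn.executemany("INSERT INTO maintenance VALUES (?, ?, ?)", [
    (1, 1, 100), (2, 1, 50),  # room 101 totals 150
    (3, 2, 150),              # room 102 totals 150, tying with 101
    (4, 3, 80),               # room 201 totals 80
    (5, 4, 30), (6, 4, 20),   # room 202 totals 50
])

# {fn} is swapped for RANK or ROW_NUMBER below
QUERY = """
WITH room_total AS (
    SELECT b.id AS branch_id, b.location, r.room_number, SUM(m.cost) AS cost
    FROM branch b
    JOIN room r ON r.branch_id = b.id
    JOIN maintenance m ON m.room_id = r.id
    GROUP BY b.id, b.location, r.id, r.room_number
)
SELECT location, room_number, cost
FROM (
    SELECT location, room_number, cost,
           {fn}() OVER (PARTITION BY branch_id ORDER BY cost DESC) AS rn
    FROM room_total
) AS ranked
WHERE rn = 1
ORDER BY location, room_number
"""

with_ties = conn.execute(QUERY.format(fn="RANK")).fetchall()
one_per_branch = conn.execute(QUERY.format(fn="ROW_NUMBER")).fetchall()
print(with_ties)            # both tied Paris rooms are kept
print(len(one_per_branch))  # exactly one row per branch
```

With ROW_NUMBER() the room that survives a tie is arbitrary unless you add a tie-breaker column to the ORDER BY.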

Related

Average of top 2

I would like to get the average of the top 2 limit1 values per policyid. The resulting table also needs to include objectid.
Limit1 and objectid come from the table p_coverage.
Policyid comes from the table p_risk.
The table p_item is a linking table between p_risk and p_coverage.
The way I thought I should build my query is: create a ranking of limit1 within each policyid, then take the average of the top 2.
However, the ranking doesn't work and gives wrong results. My query works if I take columns from ONE table, but as soon as I add joins between the tables it gives a false ranking.
SELECT policyid, limit1, /*pcob,*/ RANK() OVER(PARTITION BY policyid ORDER BY limit1 DESC) AS rn
FROM (SELECT policyid, limit1/*, pc.objectid ASpcob*/
FROM p_risk pr
LEFT JOIN p_item
ON pr.objectid=p_item.riskobjectid
LEFT JOIN p_coverage pc
ON p_item.objectid=pc.insuranceitemid) AS s
) AS SubQueryAlias
GROUP BY
policyid, limit1/*, pcob*/, rn
ORDER BY rn,policyid,limit1 DESC
The table at the end of the picture is what I'd like to have. The first table is the result of the query of Golden Linoff
If I understand correctly, you want the ranking (DENSE_RANK() here) in the subquery and then to filter and aggregate in the outer query:
SELECT policyid, AVG(limit1) as avg_top2_limit1
FROM (SELECT policyid, limit1,
DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid
) s
WHERE seqnum <= 2
GROUP BY policyid
Thanks to the previous answer, I managed to do what I wanted. Here is the query:
select b.policyid, avg(b.limit1) as avg_top2_limit1 from(
SELECT distinct(policyid) policyid, limit1
FROM (SELECT policyid, limit1,
Dense_rank() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as
seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
WHERE seqnum <= 2 ) as b
GROUP BY policyid
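Why the extra DISTINCT matters: DENSE_RANK() gives tied values the same rank, so duplicated limit1 rows (which the joins can easily produce) all pass the seqnum <= 2 filter and skew the average. Here is a minimal sketch of that effect, assuming the three joined tables are flattened into one hypothetical table named coverage, with made-up values, and using SQLite via Python's sqlite3 as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened result of the p_risk / p_item / p_coverage joins
conn.execute("CREATE TABLE coverage (policyid INTEGER, limit1 INTEGER)")
conn.executemany("INSERT INTO coverage VALUES (?, ?)",
                 [(1, 100), (1, 100), (1, 80), (1, 50), (2, 40)])

RANKED = """
SELECT policyid, limit1,
       DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) AS seqnum
FROM coverage
"""

# Averages every row whose rank is 1 or 2, duplicates included
plain = conn.execute(f"""
SELECT policyid, AVG(limit1) FROM ({RANKED}) AS s
WHERE seqnum <= 2 GROUP BY policyid ORDER BY policyid
""").fetchall()

# Removes duplicate (policyid, limit1) pairs before averaging
deduplicated = conn.execute(f"""
SELECT policyid, AVG(limit1)
FROM (SELECT DISTINCT policyid, limit1 FROM ({RANKED}) AS s WHERE seqnum <= 2) AS b
GROUP BY policyid ORDER BY policyid
""").fetchall()

print(plain)         # policy 1 averages 100, 100 and 80
print(deduplicated)  # policy 1 averages 100 and 80 only
```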

How to find the three greatest values in each category in PostgreSQL?

I am a SQL beginner and I'm having trouble finding the top 3 values in each category. The question was:
"For order_ids in January 2006, what were the top (by revenue) 3 product_ids for each category_id? "
Table A:
(Column name)
customer_id
order_id
order_date
revenue
product_id
Table B:
product_id
category_id
I tried to combine table B and A using an Inner Join and filtered by the order_date. But then I am stuck on how to find the top 3 max values in each category_id.
Thanks.
This is so far what I can think of
SELECT B.product_id, category_id FROM A
JOIN B ON B.product_id = A.product_id
WHERE order_date BETWEEN '2006-01-01' AND '2006-01-31'
ORDER BY revenue DESC
LIMIT 3;
This kind of query is typically solved using window functions
select *
from (
SELECT b.product_id,
b.category_id,
a.revenue,
dense_rank() over (partition by b.category_id order by a.revenue desc) as rnk
from A
join b ON B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
) as t
where rnk <= 3
order by product_id, category_id, revenue desc;
dense_rank() also handles ties (products with the same revenue in the same category), so you might actually get more than 3 rows per category.
If the same product can show up more than once in table A for the period (for example, because it was ordered several times), you need to combine this with a GROUP BY to rank on the sum of all revenues:
select *
from (
SELECT b.product_id,
b.category_id,
sum(a.revenue) as total_revenue,
dense_rank() over (partition by b.category_id order by sum(a.revenue) desc) as rnk
from a
join b on B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
group by b.product_id, b.category_id
) as t
where rnk <= 3
order by product_id, category_id, total_revenue desc;
When combining window functions and GROUP BY, the window function will be applied after the GROUP BY.
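As a rough illustration of that evaluation order, the sketch below computes the grouped totals in a CTE first and only then ranks them, which is the same order the database uses when GROUP BY and a window function appear in one SELECT. It runs on SQLite through Python's sqlite3 rather than PostgreSQL (an assumption made so it is self-contained), the rows are made up, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (order_id INTEGER, order_date TEXT, revenue INTEGER, product_id INTEGER);
CREATE TABLE b (product_id INTEGER, category_id INTEGER);
""")
# Hypothetical sample data
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(1, 10), (2, 10), (3, 10), (4, 10), (5, 20)])
conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", [
    (1, "2006-01-05", 100, 1),
    (2, "2006-01-10", 50, 1),    # product 1 totals 150 in January
    (3, "2006-01-11", 120, 2),
    (4, "2006-01-12", 90, 3),
    (5, "2006-01-20", 10, 4),
    (6, "2006-02-01", 999, 4),   # outside January, must be ignored
    (7, "2006-01-15", 70, 5),
])

top3 = conn.execute("""
WITH totals AS (
    -- Step 1: aggregate revenue per product within each category
    SELECT b.category_id, b.product_id, SUM(a.revenue) AS total_revenue
    FROM a JOIN b ON b.product_id = a.product_id
    WHERE a.order_date BETWEEN '2006-01-01' AND '2006-01-31'
    GROUP BY b.category_id, b.product_id
)
SELECT category_id, product_id, total_revenue, rnk
FROM (
    -- Step 2: rank the aggregated rows, partitioned by category only
    SELECT category_id, product_id, total_revenue,
           DENSE_RANK() OVER (PARTITION BY category_id
                              ORDER BY total_revenue DESC) AS rnk
    FROM totals
) AS t
WHERE rnk <= 3
ORDER BY category_id, rnk
""").fetchall()

for row in top3:
    print(row)
```

Note that the partition uses category_id alone. Adding product_id to the partition would leave one row per partition after grouping, so every row would get rank 1.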
You can use window functions to rank the grouped revenue and then pull the top X in the outer query. I have not worked in PostgreSQL in a bit so I may be missing a shortcut function below.
WITH ByRevenue AS
(
-- This CTE acts like a named result set that the statements below can query like a table
SELECT
B.category_id,
B.product_id,
MAX(revenue) as max_revenue
FROM
A
JOIN B ON B.product_id = A.product_id
WHERE
order_date BETWEEN '2018-01-01' AND '2018-01-31'
GROUP BY
B.category_id, B.product_id
)
,Normalized AS
(
-- Read from the CTE above and number the products within each category by revenue so the outer query can apply the limit
SELECT
category_id,
product_id,
max_revenue,
ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY max_revenue DESC) as rn
FROM
ByRevenue
)
--Final query from stuff above with each category/product ranked by revenue
SELECT *
FROM Normalized
WHERE RN<=3;
For top-n queries, the first thing to try is usually the lateral join:
WITH categories as (
SELECT DISTINCT category_id
FROM B
)
SELECT categories.category_id, sub.product_id
FROM categories
JOIN LATERAL (
SELECT a.product_id
FROM B
JOIN A ON (a.product_id = b.product_id)
WHERE b.category_id = categories.category_id
AND order_date BETWEEN '2006-01-01' AND '2006-01-31'
GROUP BY a.product_id
ORDER BY sum(revenue) desc
LIMIT 3
) sub on true;
Try using FETCH FIRST n ROWS ONLY?
Note: I assumed product_id is the key that links the two tables, so I joined on it. This returns the three highest-revenue rows overall rather than three per category.
SELECT B.category_id, A.revenue
FROM A
INNER JOIN B ON A.product_id = B.product_id
WHERE A.order_date BETWEEN date '2006-01-01' AND date '2006-01-31'
ORDER BY A.revenue DESC
FETCH FIRST 3 ROWS ONLY;

PARTITION BY duplicated id and JOIN with the ID with the least value

I need to JOIN, through a view in SQL Server 2008, the tables hstT and hstD. The main table contains data about employees and their "logins" (so multiple records are associated with a given employee in a given month), and the second table has info about their area by month. I need to join both tables, using the earliest record as the reference for the join and applying it to the rest of the records associated with that id.
So hstT is something like:
id id2 period name
----------------------
x 1 0718 john
x 1 0818 john
y 2 0718 jane
And hstD:
id2 period area
----------------------
1 0718 sales
1 0818 hr
2 0707 mng
With an OUTER JOIN I managed to merge all data based on ID2 (user id) and the period, BUT as I mentioned I need to join the other table based on the earliest record for each ID (which I could use as the criterion), so it would look like this:
id id2 period name area
---------------------------
x 1 0718 john sales
x 1 0818 john sales
y 2 0718 jane mng
I know I could use ROW_number but I don't know how to use it in a view and JOIN it on those conditions:
SELECT T.*,D.*, ROW_NUMBER() OVER (PARTITION BY T.ID ORDER BY T.PERIOD ASC) AS ORID
FROM dbo.hstT AS T LEFT OUTER JOIN
dbo.hstD AS D ON T.period = D.period AND T.id2 = D.id2
WHERE ORID = 1
--prompts error as orid doesn't exist in any table
You can use apply for this:
select t.*, d.area
from hstT t outer apply
(select top (1) d.*
from hstD d
where d.id2 = t.id2 and d.period <= t.period
order by d.period asc
) d;
Actually, if you just want the earliest period, then you can filter and join:
select t.*, d.area
from hstT t left join
(select d.*, row_number() over (partition by id2 order by period asc) as seqnum
from hstD d
) d
on d.id2 = t.id2 and d.seqnum = 1;
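Since the question asks for this inside a view, here is a small sketch of the filter-and-join variant wrapped in one, using the sample rows from the question. It runs on SQLite via Python's sqlite3 instead of SQL Server (an assumption for the sake of a self-contained example), and the view name hst_with_first_area is made up. Be aware that period is compared as text here, so "earliest" is only chronological if the stored format sorts that way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hstT (id TEXT, id2 INTEGER, period TEXT, name TEXT);
CREATE TABLE hstD (id2 INTEGER, period TEXT, area TEXT);

-- Hypothetical view name; the ranking lives in a derived table so the
-- seqnum = 1 filter can be part of the join condition
CREATE VIEW hst_with_first_area AS
SELECT t.id, t.id2, t.period, t.name, d.area
FROM hstT t
LEFT JOIN (
    SELECT id2, area,
           ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY period ASC) AS seqnum
    FROM hstD
) d ON d.id2 = t.id2 AND d.seqnum = 1;
""")
conn.executemany("INSERT INTO hstT VALUES (?, ?, ?, ?)", [
    ("x", 1, "0718", "john"), ("x", 1, "0818", "john"), ("y", 2, "0718", "jane"),
])
conn.executemany("INSERT INTO hstD VALUES (?, ?, ?)", [
    (1, "0718", "sales"), (1, "0818", "hr"), (2, "0707", "mng"),
])

rows = conn.execute(
    "SELECT * FROM hst_with_first_area ORDER BY id, period"
).fetchall()
for row in rows:
    print(row)
```

The original attempt failed because a window function alias cannot be referenced in the WHERE clause of the same SELECT; pushing it into a derived table avoids that.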

Max value of count in oracle

I have these tables, Orders Table:
Name Null? Type
ORDER_ID NOT NULL NUMBER(5)
CUSTOMER_ID NUMBER(8)
SHIPMENT_METHOD_ID NUMBER(2)
and Shipment_method Table:
Name Null? Type
SHIPMENT_METHOD_ID NOT NULL NUMBER(2)
SHIPMENT_DESCRIPTION VARCHAR2(80)
I'm trying to get the most used shipping method based on the orders, and I'm kind of a beginner here so I need some help.
I'm thinking if it's possible to have MAX(count(order_id)) but how can I do that for each shipment_method_id?
This is another approach:
select shipment_method_id, shipment_description, count(*) as num_orders
from orders
join shipment_method
using (shipment_method_id)
group by shipment_method_id, shipment_description
having count(*) = (select max(count(order_id))
from orders
group by shipment_method_id)
You don't need MAX, you just need to return the top row
SELECT Shipment_Description
FROM (
SELECT s.Shipment_Method_ID, s.Shipment_Description, COUNT(*) AS ct
FROM Shipment_Method s
JOIN Orders o ON s.Shipment_Method_ID = o.Shipment_Method_ID
GROUP BY s.Shipment_Method_ID, s.Shipment_Description
ORDER BY ct DESC)
WHERE ROWNUM = 1
If you're using Oracle 12c or newer, you can use the row limiting clause instead of the subquery:
SELECT s.Shipment_Method_ID, s.Shipment_Description, COUNT(*) AS ct
FROM Shipment_Method s
JOIN Orders o ON s.Shipment_Method_ID = o.Shipment_Method_ID
GROUP BY s.Shipment_Method_ID, s.Shipment_Description
ORDER BY ct DESC
FETCH FIRST 1 ROW ONLY
Here is a method that allows for more than one Shipment Method having the same maximum number of Orders.
SELECT shipment_method_id
,shipment_description
,orders
FROM
(SELECT shipment_method_id
,shipment_description
,orders
,rank() OVER (ORDER BY orders DESC) orders_rank
FROM
(SELECT smm.shipment_method_id
,smm.shipment_description
,count(*) orders
FROM orders odr
INNER JOIN shipment_method smm
ON (smm.shipment_method_id = odr.shipment_method_id)
GROUP BY smm.shipment_method_id
,smm.shipment_description
)
)
WHERE orders_rank = 1
As a beginner, you may find the WITH clause useful, since it lets you name intermediate results:
with STATS as (select SHIPMENT_METHOD_ID, count(*) as N
from ORDERS group by SHIPMENT_METHOD_ID)
, MAXIMUM as (select max(N) as N from STATS)
select SHIPMENT_METHOD_ID, SHIPMENT_DESCRIPTION
from STATS
join MAXIMUM on STATS.N = MAXIMUM.N
natural join SHIPMENT_METHOD
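For anyone who wants to try the "count per method, then keep the maximum" idea without an Oracle instance, here is a minimal sketch on SQLite via Python's sqlite3 (an assumption, as are the sample rows). It follows the WITH approach but uses explicit joins instead of NATURAL JOIN, and it keeps every method that ties for the highest count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shipment_method (
    shipment_method_id INTEGER PRIMARY KEY,
    shipment_description TEXT
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    shipment_method_id INTEGER
);
""")
# Hypothetical sample data: Air and Ground tie with two orders each
conn.executemany("INSERT INTO shipment_method VALUES (?, ?)",
                 [(1, "Air"), (2, "Ground"), (3, "Sea")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 100, 1), (2, 101, 1), (3, 102, 2), (4, 103, 2), (5, 104, 3)])

most_used = conn.execute("""
WITH stats AS (
    -- number of orders per shipping method
    SELECT shipment_method_id, COUNT(*) AS n
    FROM orders
    GROUP BY shipment_method_id
),
maximum AS (
    -- the highest of those counts
    SELECT MAX(n) AS n FROM stats
)
SELECT s.shipment_method_id, sm.shipment_description, s.n
FROM stats s
JOIN maximum mx ON s.n = mx.n
JOIN shipment_method sm ON sm.shipment_method_id = s.shipment_method_id
ORDER BY s.shipment_method_id
""").fetchall()

print(most_used)
```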

Teradata Rank Over Query (Getting one row to left join)

Hi, I am new to Teradata and am stuck on a problem.
There is an ID table which stores a unique ID given to each person:
CREATE TABLE IDS(
ID VARCHAR(8),
UPDATED_DATE DATE)
Then we have name and address tables, which have no primary keys and store demographic information for the IDs:
CREATE TABLE NAMES(
ID VARCHAR(8),
NAME VARCHAR(50))
CREATE TABLE ADDRESSES(
ID VARCHAR(8),
ADDRESS VARCHAR(200))
Now each ID can have multiple names and addresses. However, for names and addresses I want to use the ones with the highest count. If two names have the same COUNT, I just want the first row:
ID NAME COUNT
1234 John Smith 6
1234 Johnnie Smith 6
1234 J Smith 2
In the above example I want the name John Smith. Here is the left Join I am performing since an ID may not have a name or address. Here is what I am trying
SELECT * FROM
(SELECT ID as V_ID from IDS) a
LEFT JOIN
(SELECT ID, NAME, COUNT(*) AS COUNTER,(RANK() OVER(ORDER BY COUNTER DESC)) AS RNK
FROM NAMES
GROUP BY ID)b
ON a.ID = b.ID
AND b.RNK = 1 -- Should give me only the first row
LEFT JOIN
(SELECT ID, ADDRESS, COUNT(*) AS COUNTER, (RANK() OVER (ORDER BY COUNTER DESC) ) AS RNK
FROM ADDRESSES
GROUP BY ID) c
ON c.ID = a.ID
And c.RNK = 1
However, this is not getting me the desired result. I also tried using ROW_NUMBER instead of RANK, but still no results. How should I write this query in Teradata?
I solved it. I needed a QUALIFY and a PARTITION BY:
SELECT * FROM
(SELECT ID as V_ID from IDS) a
LEFT JOIN
(SELECT ID, NAME, COUNT(*) AS COUNTER
FROM NAMES
GROUP BY ID, NAME
qualify ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COUNTER DESC) = 1
)b
ON a.V_ID = b.ID
LEFT JOIN
(SELECT ID, ADDRESS, COUNT(*) AS COUNTER
FROM ADDRESSES
GROUP BY ID, ADDRESS
qualify ROW_NUMBER() OVER(PARTITION BY ID ORDER BY COUNTER DESC) = 1
) c
ON c.ID = a.V_ID
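QUALIFY is not available in every engine, so here is a hedged sketch of the same "most frequent name per ID" logic written with a plain derived table and a row-number filter. It runs on SQLite via Python's sqlite3 (an assumption; it is not Teradata), uses made-up rows, covers only the names table, and adds name as a tie-breaker in the ORDER BY so that equal counts resolve deterministically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ids (id TEXT, updated_date TEXT);
CREATE TABLE names (id TEXT, name TEXT);
""")
# Hypothetical sample data: two names tie with a count of 2, and ID 9999 has no name
conn.executemany("INSERT INTO ids VALUES (?, ?)",
                 [("1234", "2020-01-01"), ("9999", "2020-01-01")])
conn.executemany("INSERT INTO names VALUES (?, ?)", [
    ("1234", "John Smith"), ("1234", "John Smith"),
    ("1234", "Johnnie Smith"), ("1234", "Johnnie Smith"),
    ("1234", "J Smith"),
])

rows = conn.execute("""
SELECT a.id, b.name, b.counter
FROM ids a
LEFT JOIN (
    -- number the names per ID by descending count, name as tie-breaker
    SELECT id, name, counter,
           ROW_NUMBER() OVER (PARTITION BY id
                              ORDER BY counter DESC, name) AS rn
    FROM (SELECT id, name, COUNT(*) AS counter
          FROM names
          GROUP BY id, name) AS counted
) b ON b.id = a.id AND b.rn = 1
ORDER BY a.id
""").fetchall()

print(rows)
```

The LEFT JOIN keeps IDs that have no name at all, which is the reason the original query used it.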