Max value of count in oracle - sql

I have these tables, Orders Table:
Name Null? Type
ORDER_ID NOT NULL NUMBER(5)
CUSTOMER_ID NUMBER(8)
SHIPMENT_METHOD_ID NUMBER(2)
and Shipment_method Table:
Name Null? Type
SHIPMENT_METHOD_ID NOT NULL NUMBER(2)
SHIPMENT_DESCRIPTION VARCHAR2(80)
I'm trying to get the most used shipping method based on the orders. I'm a beginner, so I need some help.
I'm wondering whether something like MAX(count(order_id)) is possible, but how can I do that for each shipment_method_id?

This is another approach:
select shipment_method_id, shipment_description, count(*) as num_orders
from orders
join shipment_method
using (shipment_method_id)
group by shipment_method_id, shipment_description
having count(*) = (select max(count(order_id))
from orders
group by shipment_method_id)

You don't need MAX; you just need to return the top row:
SELECT Shipment_Description
FROM (
SELECT s.Shipment_Method_ID, s.Shipment_Description, COUNT(*) AS ct
FROM Shipment_Method s
JOIN Orders o ON s.Shipment_Method_ID = o.Shipment_Method_ID
GROUP BY s.Shipment_Method_ID, s.Shipment_Description
ORDER BY ct DESC)
WHERE ROWNUM = 1
If you're using Oracle 12c or newer, you can use the row limiting clause instead of the subquery:
SELECT s.Shipment_Method_ID, s.Shipment_Description, COUNT(*) AS ct
FROM Shipment_Method s
JOIN Orders o ON s.Shipment_Method_ID = o.Shipment_Method_ID
GROUP BY s.Shipment_Method_ID, s.Shipment_Description
ORDER BY ct DESC
FETCH FIRST 1 ROW ONLY

Here is a method that allows for more than one Shipment Method having the same maximum number of Orders.
SELECT shipment_method_id
,shipment_description
,orders
FROM
(SELECT shipment_method_id
,shipment_description
,orders
,rank() OVER (ORDER BY orders DESC) orders_rank
FROM
(SELECT smm.shipment_method_id
,smm.shipment_description
,count(*) orders
FROM orders odr
INNER JOIN shipment_method smm
ON (smm.shipment_method_id = odr.shipment_method_id)
GROUP BY smm.shipment_method_id
,smm.shipment_description
)
)
WHERE orders_rank = 1

As a beginner, you may find the WITH clause useful; it lets you name intermediate results:
with STATS as (select SHIPMENT_METHOD_ID, count(*) as N
from ORDERS group by SHIPMENT_METHOD_ID)
, MAXIMUM as (select max(N) as N from STATS)
select SHIPMENT_METHOD_ID, SHIPMENT_DESCRIPTION
from STATS
join MAXIMUM on STATS.N = MAXIMUM.N
natural join SHIPMENT_METHOD
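To try the named-intermediate-results idea without an Oracle instance, here is a minimal sketch that runs the same shape of query in SQLite through Python's sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for Oracle here, so treat it as a demonstration of the logic rather than of Oracle-specific behaviour. Two methods are tied on purpose to show that both come back.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment_method (
        shipment_method_id   INTEGER PRIMARY KEY,
        shipment_description TEXT
    );
    CREATE TABLE orders (
        order_id           INTEGER PRIMARY KEY,
        customer_id        INTEGER,
        shipment_method_id INTEGER
    );
    INSERT INTO shipment_method VALUES (1, 'Ground'), (2, 'Air'), (3, 'Sea');
    INSERT INTO orders VALUES
        (1, 10, 1), (2, 11, 1), (3, 12, 2), (4, 13, 2), (5, 14, 3);
""")

most_used = conn.execute("""
    WITH stats AS (              -- orders per shipment method
        SELECT shipment_method_id, COUNT(*) AS n
        FROM orders
        GROUP BY shipment_method_id
    ),
    maximum AS (                 -- the highest of those counts
        SELECT MAX(n) AS n FROM stats
    )
    SELECT sm.shipment_method_id, sm.shipment_description, stats.n
    FROM stats
    JOIN maximum ON stats.n = maximum.n
    JOIN shipment_method sm
      ON sm.shipment_method_id = stats.shipment_method_id
    ORDER BY sm.shipment_method_id
""").fetchall()

print(most_used)
```

With this data, 'Ground' and 'Air' both have two orders, so the query returns both rows instead of picking one arbitrarily.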

Related

Average of top 2

I would like to get the average of the top 2 limit1 values per policyid. I need my resulting table to also have objectid.
Limit1 and objectid come from the table p_coverage.
Policyid comes from the table p_risk.
The table p_item is a linking table between p_risk and p_coverage.
The way I thought I should build my query is: create a ranking of limit1 within each policyid, then take the average of the top 2.
However, the ranking doesn't work and gives wrong results. My query works if I take columns from ONE table, but as soon as I add joins between them, it gives a false ranking.
SELECT policyid, limit1, /*pcob,*/ RANK() OVER(PARTITION BY policyid ORDER BY limit1 DESC) AS rn
FROM (SELECT policyid, limit1/*, pc.objectid AS pcob*/
FROM p_risk pr
LEFT JOIN p_item
ON pr.objectid=p_item.riskobjectid
LEFT JOIN p_coverage pc
ON p_item.objectid=pc.insuranceitemid) AS s
GROUP BY
policyid, limit1/*, pcob*/, rn
ORDER BY rn,policyid,limit1 DESC
The table at the end of the picture is what I'd like to have. The first table is the result of the query of Golden Linoff
If I understand correctly, you want DENSE_RANK() in the subquery and then to filter and aggregate in the outer query:
SELECT policyid, AVG(limit1) as avg_top2_limit1
FROM (SELECT policyid, limit1,
DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid
) p
WHERE seqnum <= 2
GROUP BY policyid
Thanks to the previous answer, I managed to do what I wanted. Here is the query:
select b.policyid, avg(b.limit1) as avg_top2_limit1 from(
SELECT distinct(policyid) policyid, limit1
FROM (SELECT policyid, limit1,
Dense_rank() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
WHERE seqnum <= 2 ) as b
GROUP BY policyid
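The reason the DISTINCT matters is easiest to see with tied values. The sketch below is an illustration only: it uses SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or newer), and a hypothetical flattened table policy_limits stands in for the result of joining p_risk, p_item and p_coverage. Policy 1 has the top limit twice, so DENSE_RANK() lets three rows through the seqnum <= 2 filter.

```python
import sqlite3

# policy_limits is a hypothetical stand-in for the three-table join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy_limits (policyid INTEGER, limit1 INTEGER);
    INSERT INTO policy_limits VALUES
        (1, 100), (1, 100), (1, 50), (1, 10),
        (2, 40), (2, 20);
""")

# Rank the limits inside each policy; equal limits share a rank.
ranked = """
    SELECT policyid, limit1,
           DENSE_RANK() OVER (PARTITION BY policyid
                              ORDER BY limit1 DESC) AS seqnum
    FROM policy_limits
"""

# Averaging straight after the filter counts the tied 100 twice.
with_duplicates = conn.execute(f"""
    SELECT policyid, AVG(limit1)
    FROM ({ranked}) AS s
    WHERE seqnum <= 2
    GROUP BY policyid
    ORDER BY policyid
""").fetchall()

# De-duplicating first averages the two distinct top values.
deduplicated = conn.execute(f"""
    SELECT policyid, AVG(limit1)
    FROM (SELECT DISTINCT policyid, limit1
          FROM ({ranked}) AS s
          WHERE seqnum <= 2) AS b
    GROUP BY policyid
    ORDER BY policyid
""").fetchall()

print(with_duplicates)
print(deduplicated)
```

For policy 1 the first query averages 100, 100 and 50, while the second averages 100 and 50. Which of the two is "right" depends on whether tied rows should count once or several times.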

How to find the three greatest values in each category in PostgreSQL?

I am a SQL beginner. I am having trouble finding the top 3 values in each category. The question was
"For order_ids in January 2006, what were the top (by revenue) 3 product_ids for each category_id? "
Table A:
(Column name)
customer_id
order_id
order_date
revenue
product_id
Table B:
product_id
category_id
I tried to combine table B and A using an Inner Join and filtered by the order_date. But then I am stuck on how to find the top 3 max values in each category_id.
Thanks.
This is what I have come up with so far:
SELECT B.product_id, category_id FROM A
JOIN B ON B.product_id = A.product_id
WHERE order_date BETWEEN '2006-01-01' AND '2006-01-31'
ORDER BY revenue DESC
LIMIT 3;
This kind of query is typically solved using window functions:
select *
from (
SELECT b.product_id,
b.category_id,
a.revenue,
dense_rank() over (partition by b.category_id order by a.revenue desc) as rnk
from A
join b ON B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
) as t
where rnk <= 3
order by category_id, revenue desc, product_id;
dense_rank() also handles ties (products with the same revenue in the same category), so you might actually get more than 3 rows per category.
If the same product can show up more than once in table a (for example, several orders in January), you need to combine this with a GROUP BY to rank by the sum of all revenues:
select *
from (
SELECT b.product_id,
b.category_id,
sum(a.revenue) as total_revenue,
dense_rank() over (partition by b.category_id order by sum(a.revenue) desc) as rnk
from a
join b on B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
group by b.product_id, b.category_id
) as t
where rnk <= 3
order by category_id, total_revenue desc, product_id;
When combining window functions and GROUP BY, the window function will be applied after the GROUP BY.
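To make that order of evaluation visible, here is a small sketch that does the aggregation in one named step and the ranking in the next. It is an illustration under assumptions: it runs in SQLite through Python's sqlite3 module rather than PostgreSQL (window functions need SQLite 3.25 or newer), order_date is stored as ISO text, and all rows are invented sample data.

```python
import sqlite3

# Invented sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (customer_id INTEGER, order_id INTEGER, order_date TEXT,
                    revenue INTEGER, product_id INTEGER);
    CREATE TABLE b (product_id INTEGER, category_id INTEGER);
    INSERT INTO b VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2);
    INSERT INTO a VALUES
        (1, 1, '2006-01-05',  50, 1),
        (1, 2, '2006-01-10',  70, 1),
        (2, 3, '2006-01-11', 100, 2),
        (3, 4, '2006-01-15',  90, 3),
        (4, 5, '2006-01-20',  10, 4),
        (5, 6, '2006-01-21',  30, 5),
        (6, 7, '2006-02-02', 999, 4);   -- outside January, ignored
""")

top3 = conn.execute("""
    WITH totals AS (          -- step 1: GROUP BY collapses orders per product
        SELECT b.category_id, b.product_id, SUM(a.revenue) AS total_revenue
        FROM a
        JOIN b ON b.product_id = a.product_id
        WHERE a.order_date BETWEEN '2006-01-01' AND '2006-01-31'
        GROUP BY b.category_id, b.product_id
    ),
    ranked AS (               -- step 2: the window function ranks the groups
        SELECT category_id, product_id, total_revenue,
               DENSE_RANK() OVER (PARTITION BY category_id
                                  ORDER BY total_revenue DESC) AS rnk
        FROM totals
    )
    SELECT category_id, product_id, total_revenue, rnk
    FROM ranked
    WHERE rnk <= 3
    ORDER BY category_id, rnk
""").fetchall()

for row in top3:
    print(row)
```

Product 1 has two January orders (50 and 70), so it is ranked on their sum of 120, and product 4 drops out of category 1 because only its January order counts.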
You can use window functions to rank the grouped revenue and then pull the top X in the outer query. I have not worked in PostgreSQL in a bit, so I may be missing a shortcut function below.
WITH ByRevenue AS
(
--Named intermediate result that the statements below can query like a regular table
SELECT
B.category_id,
B.product_id,
MAX(A.revenue) as max_revenue
FROM
A
JOIN B ON B.product_id = A.product_id
WHERE
A.order_date BETWEEN '2006-01-01' AND '2006-01-31'
GROUP BY
B.category_id, B.product_id
)
,Normalized AS
(
--Number the products within each category by revenue so the outer query can apply the limit
SELECT
category_id,
product_id,
max_revenue,
ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY max_revenue DESC) as rn
FROM
ByRevenue
)
--Final query: the top 3 products per category by revenue
SELECT *
FROM Normalized
WHERE rn <= 3;
For top-n queries, the first thing to try is usually the lateral join:
WITH categories as (
SELECT DISTINCT category_id
FROM B
)
SELECT categories.category_id, sub.product_id
FROM categories
JOIN LATERAL (
SELECT a.product_id
FROM B
JOIN A ON (a.product_id = b.product_id)
WHERE b.category_id = categories.category_id
AND order_date BETWEEN '2006-01-01' AND '2006-01-31'
GROUP BY a.product_id
ORDER BY sum(revenue) desc
LIMIT 3
) sub on true;
Try using FETCH FIRST n ROWS ONLY?
Note: I assume product_id is the key that links the two tables, so I used it to join them.
SELECT B.category_id, A.revenue FROM A
INNER JOIN B ON A.product_id = B.product_id
WHERE A.order_date BETWEEN (from date) AND (to date)
ORDER BY A.revenue DESC
FETCH FIRST 3 ROWS ONLY

Get the second last record in a date column within a inner join

I need to pull the second-to-last record in a date column called OrderDate. However, I need to bring back only one date (I am searching a table with all the purchase orders, dates and costs, from which I have to bring back only the second-to-last order and its cost). The query as written today (and working) pulls the newest date.
select distinct
a.PurchaseNum, a.ItemID, a.SupplierNum, a.Location, a.OrderDate, a.Cost
from
PurchaseOrder a
inner join
(select
l.SupplierNum, l.ItemID, l.Location, maxdate = max(l.OrderDate)
from
PurchaseOrder l
where
l.Cost <> 0
group by
l.SupplierNum, l.itemid, l.Location) l on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid
and l.Location = a.Location
and a.OrderDate = l.maxdate
I have tried lag() and offset, but they have limitations inside a join: they force me to use ORDER BY and include the OrderDate column, which is not what I want because we need only one date.
A bit of context: I have a report in which I need to show the last and second-to-last cost of a purchase order for each supplier. Getting the last cost of an order is easy; the problem is going back to the second-to-last one, and that is where I am stuck right now.
Any thoughts?
If I'm understanding you correctly, here's one option using row_number to return the 2 highest orderdate records:
select *
from (
select *,
row_number() over (partition by SupplierNum, ItemID, Location
order by OrderDate desc) rn
from PurchaseOrder
where cost <> 0
) t
where rn <= 2
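Since the report needs the last and the second-to-last cost side by side, the two numbered rows can be pivoted into one row per group with conditional aggregation. The following is only a sketch of that idea: it runs in SQLite through Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer), the sample rows are invented, and the output column names last_cost and second_last_cost are hypothetical.

```python
import sqlite3

# Invented sample rows; column names follow the question's PurchaseOrder table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PurchaseOrder (
        PurchaseNum INTEGER, ItemID TEXT, SupplierNum INTEGER,
        Location TEXT, OrderDate TEXT, Cost REAL
    );
    INSERT INTO PurchaseOrder VALUES
        (1, 'A', 100, 'X', '2020-01-01', 5.0),
        (2, 'A', 100, 'X', '2020-02-01', 6.0),
        (3, 'A', 100, 'X', '2020-03-01', 7.0),
        (4, 'B', 200, 'Y', '2020-01-15', 9.0),
        (5, 'A', 100, 'X', '2020-04-01', 0.0);  -- zero cost, filtered out
""")

costs = conn.execute("""
    SELECT SupplierNum, ItemID, Location,
           MAX(CASE WHEN rn = 1 THEN Cost END) AS last_cost,
           MAX(CASE WHEN rn = 2 THEN Cost END) AS second_last_cost
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY SupplierNum, ItemID, Location
                                  ORDER BY OrderDate DESC) AS rn
        FROM PurchaseOrder
        WHERE Cost <> 0
    ) AS t
    WHERE rn <= 2
    GROUP BY SupplierNum, ItemID, Location
    ORDER BY SupplierNum
""").fetchall()

for row in costs:
    print(row)
```

A group with only one qualifying order, such as supplier 200 here, gets NULL (None in Python) for second_last_cost. Filtering on rn = 2 alone would instead return just the second-to-last row and drop such groups.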
The inner query orders by date descending and the outer query orders ascending.
select distinct top 1 a.*
from PurchaseOrder a
inner join
(
select top 2 l.*
from PurchaseOrder l
where
l.Cost <> 0
order by l.OrderDate desc) l
on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid and l.Location = a.Location and a.OrderDate = l.OrderDate
order by a.OrderDate
or
SELECT TOP 1 * FROM (SELECT * FROM PurchaseOrder a
EXCEPT SELECT TOP (SELECT (COUNT(*)-2) FROM PurchaseOrder a where
l.Cost <> 0
group by l.SupplierNum, l.itemid, l.Location) * FROM PurchaseOrder) A
or
SELECT *
FROM PurchaseOrder a
WHERE OrderDate = ( SELECT MAX(OrderDate)
FROM PurchaseOrder
WHERE Orderdate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder l where
l.Cost <> 0
group by l.SupplierNum, l.itemid, l.Location
)
) ;
or
SELECT TOP (1) *
FROM PurchaseOrder
WHERE OrderDate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder where ....
)
ORDER BY OrderDate DESC ;

Last order item in Oracle SQL

I need to list the columns from the customer table, the date of the first order, and all data from the last one, in a 1:N relationship between the customer and order tables. I'm using Oracle 10g.
What is the best way to do that?
TABLE CUSTOMER
---------------
id NUMBER
name VARCHAR2(200)
subscribe_date DATE
TABLE ORDER
---------------
id NUMBER
id_order NUMBER
purchase_date DATE
purchase_value NUMBER
Here is one way of doing it, using the row_number function, one join, and one aggregation:
select c.id, c.name, c.subscribe_date,
min(o.purchase_date) as FirstPurchaseDate,
min(case when seqnum = 1 then o.id_order end) as Last_IdOrder,
min(case when seqnum = 1 then o.purchase_date end) as Last_PurchaseDate,
min(case when seqnum = 1 then o.purchase_value end) as Last_PurchaseValue
from Customer c join
(select o.*,
row_number() over (partition by o.id order by purchase_date desc) as seqnum
from orders o
) o
on c.id = o.id
group by c.id, c.name, c.subscribe_date
It's not obvious how to join the customer table to the orders table (ORDER is a reserved word in Oracle, so your table can't be named order). If we assume that id_order in orders joins to id in customer:
SELECT c.id customer_id,
c.name name,
c.subscribe_date,
o.first_purchase_date,
o.id last_order_id,
o.purchase_date last_order_purchase_date,
o.purchase_value last_order_purchase_value
FROM customer c
JOIN (SELECT o.*,
min(o.purchase_date) over (partition by id_order) first_purchase_date,
rank() over (partition by id_order order by purchase_date desc) rnk
FROM orders o) o ON (c.id = o.id_order)
WHERE rnk = 1
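Here is a small runnable sketch of the same pattern: one analytic MIN for the first purchase date and a row number to pick the latest order. It runs in SQLite through Python's sqlite3 module rather than Oracle (window functions need SQLite 3.25 or newer). To sidestep the ambiguous column names in the question, it assumes a hypothetical orders table with explicit order_id and customer_id columns, and the rows are invented.

```python
import sqlite3

# Hypothetical schema: orders.customer_id references customer.id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, name TEXT, subscribe_date TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER,
                         purchase_date TEXT, purchase_value INTEGER);
    INSERT INTO customer VALUES
        (1, 'Ann', '2019-01-01'), (2, 'Bob', '2019-06-01');
    INSERT INTO orders VALUES
        (10, 1, '2020-01-01', 5),
        (11, 1, '2020-03-01', 8),
        (12, 2, '2020-02-01', 3);
""")

summary = conn.execute("""
    SELECT c.id, c.name,
           o.first_purchase_date,
           o.order_id, o.purchase_date, o.purchase_value
    FROM customer c
    JOIN (
        SELECT x.*,
               -- earliest purchase date, repeated on every row of the customer
               MIN(purchase_date) OVER (PARTITION BY customer_id)
                   AS first_purchase_date,
               -- 1 marks the most recent order of the customer
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY purchase_date DESC) AS rn
        FROM orders x
    ) o ON c.id = o.customer_id
    WHERE o.rn = 1
    ORDER BY c.id
""").fetchall()

for row in summary:
    print(row)
```

ROW_NUMBER() always keeps exactly one order per customer, whereas rank() keeps every order that shares the latest purchase date. Which one fits depends on whether such ties can occur in the data.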
I'm confused by your field names, but I'm going to assume that ORDER.id is the id in the CUSTOMER table.
The earliest order date is easy.
select CUSTOMER.id, CUSTOMER.name, CUSTOMER.subscribe_date, min(ORDERS.purchase_date)
from CUSTOMER
inner join ORDERS on CUSTOMER.id = ORDERS.id
group by CUSTOMER.id, CUSTOMER.name, CUSTOMER.subscribe_date
To get the last order data, join this to the ORDER table again.
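select CUSTOMER.id, CUSTOMER.name, CUSTOMER.subscribe_date, min(ORD_FIRST.purchase_date),
ORD_LAST.id_order, ORD_LAST.purchase_date, ORD_LAST.purchase_value
from CUSTOMER
inner join ORDERS ORD_FIRST on CUSTOMER.id = ORD_FIRST.id
inner join ORDERS ORD_LAST on CUSTOMER.id = ORD_LAST.id
group by CUSTOMER.id, CUSTOMER.name, CUSTOMER.subscribe_date,
ORD_LAST.id_order, ORD_LAST.purchase_date, ORD_LAST.purchase_value
having ORD_LAST.purchase_date = max(ORD_FIRST.purchase_date)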
Maybe something like this assuming the ID field in the Order table is actually the Customer ID:
SELECT C.*, O1.*, O2.purchase_Date as FirstPurchaseDate
FROM Customer C
LEFT JOIN
(
SELECT Max(purchase_date) as pdate, id
FROM Orders
GROUP BY id
) MaxPurchaseOrder
ON C.Id = MaxPurchaseOrder.Id
LEFT JOIN Orders O1
ON MaxPurchaseOrder.pdate = O1.purchase_date
AND MaxPurchaseOrder.id = O1.id
LEFT JOIN
(
SELECT Min(purchase_date) as pdate, id
FROM Orders
GROUP BY id
) MinPurchaseOrder
ON C.Id = MinPurchaseOrder.Id
LEFT JOIN Orders O2
ON MinPurchaseOrder.pdate = O2.purchase_date
AND MinPurchaseOrder.id = O2.id

SQL: improving join efficiency

If I turn this sub-query which selects sales persons and their highest price paid for any item they sell:
select *,
(select top 1 highestProductPrice
from orders o
where o.salespersonid = s.id
order by highestProductPrice desc ) as highestProductPrice
from salespersons s
into this join in order to improve efficiency:
select *, highestProductPrice
from salespersons s join (
select salespersonid, highestProductPrice, row_number() over (
partition by salespersonid
order by salespersonid, highestProductPrice) as rank
from orders ) o on s.id = o.salespersonid
It still touches every order record (it seems to enumerate the entire table before filtering by salespersonid). However, you cannot do this:
select *, highestProductPrice
from salespersons s join (
select salespersonid, highestProductPrice, row_number() over (
partition by salespersonid
order by salespersonid, highestProductPrice) as rank
from orders
where orders.salespersonid = s.id) o on s.id = o.salespersonid
The where clause in the join causes the error: multi-part identifier "s.id" could not be bound.
Is there any way to join the top 1 out of each order group with a join but without touching each record in orders?
Try
SELECT
S.*,
T.HighestProductPrice
FROM
SalesPersons S
CROSS APPLY
(
SELECT TOP 1 O.HighestProductPrice
FROM Orders O
WHERE O.SalesPersonid = S.Id
ORDER BY O.SalesPersonid, O.HighestProductPrice DESC
) T
would
select s.*, max(highestProductPrice)
from salespersons s
join orders o on o.salespersonid = s.id
group by s.*
or
select s.*, highestProductPrice
from salespersons s join (select salespersonid,
max(highestProductPrice) as highestProductPrice
from orders o
group by salespersonid) as o on o.salespersonid = s.id
work?