SQL bestsellers query

I have a problem finding the bestselling book in each category. With the code below I can find only the categories and the top quantity per category, but how do I find the books themselves?
SELECT b.category_id, max(b.total_quantity) as max_quantity
FROM (
SELECT books.id, books.category_id, sum(order_items.quantity) as total_quantity
FROM order_items
INNER JOIN orders ON order_items.order_id = orders.id
INNER JOIN books on order_items.book_id = books.id
WHERE orders.status in (2, 3)
GROUP BY books.id
) as b
GROUP BY b.category_id

In Postgres, use DISTINCT ON:
SELECT DISTINCT ON (b.category_id) b.*, SUM(oi.quantity) as total_quantity
FROM order_items oi JOIN
orders o
ON oi.order_id = o.id JOIN
books b
ON oi.book_id = b.id
WHERE o.status in (2, 3)
GROUP BY b.id
ORDER BY b.category_id, total_quantity DESC

You could filter in the having clause with a correlated, aggregate query that returns the top selling quantity for the related category:
select b.id, b.category_id, sum(oi.quantity) as total_quantity
from order_items oi
inner join orders o on oi.order_id = o.id and o.status in (2, 3)
inner join books b on oi.book_id = b.id
group by b.id, b.category_id
having sum(oi.quantity) = (
select sum(oi1.quantity)
from order_items oi1
inner join orders o1 on oi1.order_id = o1.id and o1.status in (2, 3)
inner join books b1 on oi1.book_id = b1.id
where b1.category_id = b.category_id
group by b1.id
order by sum(oi1.quantity) desc
limit 1
)
Or, if your RDBMS supports window functions (and allows mixing them with aggregation):
select id, category_id, total_quantity
from (
select
b.id,
b.category_id,
sum(oi.quantity) as total_quantity,
rank() over(partition by b.category_id order by sum(oi.quantity) desc) rn
from order_items oi
inner join orders o on oi.order_id = o.id and o.status in (2, 3)
inner join books b on oi.book_id = b.id
group by b.id, b.category_id
) t
where rn = 1
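As a rough, self-contained check of the ranking approach, the sketch below runs it against an in-memory SQLite database from Python. The sample rows and the helper name bestsellers_per_category are made up for illustration, the schema is reduced to the columns the query touches, and window functions assume SQLite 3.25 or later. The totals are computed in a CTE first, so the window function does not have to be mixed with aggregation.

```python
import sqlite3


def bestsellers_per_category(conn):
    """Return (book_id, category_id, total_quantity) for the top seller(s) of each category."""
    return conn.execute("""
        WITH totals AS (
            SELECT b.id, b.category_id, SUM(oi.quantity) AS total_quantity
            FROM order_items oi
            JOIN orders o ON oi.order_id = o.id AND o.status IN (2, 3)
            JOIN books b ON oi.book_id = b.id
            GROUP BY b.id, b.category_id
        )
        SELECT id, category_id, total_quantity
        FROM (
            SELECT t.*,
                   RANK() OVER (PARTITION BY category_id
                                ORDER BY total_quantity DESC) AS rn
            FROM totals t
        )
        WHERE rn = 1
        ORDER BY category_id, id
    """).fetchall()


# Hypothetical sample data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE order_items (order_id INTEGER, book_id INTEGER, quantity INTEGER);
    INSERT INTO books VALUES (1, 10), (2, 10), (3, 20), (4, 20);
    INSERT INTO orders VALUES (1, 2), (2, 3), (3, 1);
    INSERT INTO order_items VALUES
        (1, 1, 5), (2, 1, 1),   -- book 1: 6 sold
        (1, 2, 4), (3, 2, 50),  -- book 2: 4 counted; order 3 has status 1 and is ignored
        (1, 3, 2), (2, 4, 2);   -- books 3 and 4 tie at 2
""")
print(bestsellers_per_category(conn))
```

With this sample data, books 3 and 4 tie in category 20, and rank() returns both of them.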

I would use DENSE_RANK over the SUM, if your RDBMS supports it.
SELECT *
FROM
(
SELECT books.id, books.category_id
, SUM(orditm.quantity) AS total_quantity
, DENSE_RANK() OVER (PARTITION BY books.category_id ORDER BY SUM(orditm.quantity) DESC) AS Rnk
FROM order_items AS orditm
JOIN orders ON orditm.order_id = orders.id
JOIN books ON orditm.book_id = books.id
WHERE orders.status IN (2, 3)
GROUP BY books.id, books.category_id
) q
WHERE Rnk = 1

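I guess you have a book_name column in your books table, so you can try the query below. It joins the per-book totals back to the maximum per category, so the book name belongs to the row that actually holds the maximum:
WITH b AS (
SELECT books.id, books.book_name, books.category_id, sum(order_items.quantity) as total_quantity
FROM order_items
INNER JOIN orders ON order_items.order_id = orders.id
INNER JOIN books on order_items.book_id = books.id
WHERE orders.status in (2, 3)
GROUP BY books.id, books.book_name, books.category_id
)
SELECT b.book_name, b.category_id, b.total_quantity as max_quantity
FROM b
INNER JOIN (
SELECT category_id, max(total_quantity) as max_quantity
FROM b
GROUP BY category_id
) m ON m.category_id = b.category_id AND m.max_quantity = b.total_quantity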

Related

Multiple Tables Query

I have these tables and columns:
order_items: order_id, item_id, product_id, quantity, unit_price
product_categories: category_id, category_name
products: product_id, product_name, description, standard_cost, list_price, category_id
I need to write a SQL query to show the total and the average sale amount for each category.
So far, I have this but I do not know how to get the total and the average of all products per category:
SELECT
p.product_name,
oi.product_id,
pc.category_id, pc.category_name,
(oi.quantity * oi.unit_price) AS total_sale_amount,
AVG(oi.quantity * oi.unit_price) AS average_sale_amount
FROM
products p
INNER JOIN
product_categories pc ON p.category_id = pc.category_id
INNER JOIN
order_items oi ON oi.product_id = p.product_id
GROUP BY
p.product_name, oi.product_id, pc.category_id, pc.category_name,
oi.quantity * oi.unit_price;
Maybe it could be done with WHERE and AND clauses that list every category ID, but I do not know how to implement that.
category_id: 1, 2, 3, 4, 5
Thanks in advance.
I figured it out: I was selecting and grouping by more columns than I actually needed.
SELECT
p.category_id,
pc.category_name,
SUM(oi.quantity * unit_price) AS total_sale_amount,
ROUND(AVG(oi.quantity * unit_price), 2) AS average_sale_amount
FROM
products p
INNER JOIN
order_items oi ON p.product_id = oi.product_id
INNER JOIN
product_categories pc ON pc.category_id = p.category_id
GROUP BY
p.category_id, pc.category_name;
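A minimal sketch of the per-category grouping, run against an in-memory SQLite database from Python. The sample rows and the helper name sales_per_category are invented for illustration, and the products table is cut down to the columns the query uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT,
                           category_id INTEGER);
    CREATE TABLE order_items (order_id INTEGER, item_id INTEGER, product_id INTEGER,
                              quantity INTEGER, unit_price REAL);
    -- Hypothetical sample data
    INSERT INTO product_categories VALUES (1, 'CPU'), (2, 'Storage');
    INSERT INTO products VALUES (1, 'cpu-a', 1), (2, 'cpu-b', 1), (3, 'disk-a', 2);
    INSERT INTO order_items VALUES
        (1, 1, 1, 2, 100.0),   -- sale amount 200.0, category 1
        (1, 2, 2, 1, 300.0),   -- sale amount 300.0, category 1
        (2, 1, 3, 4, 50.0);    -- sale amount 200.0, category 2
""")


def sales_per_category(conn):
    """Total and average line-item sale amount, one row per category."""
    return conn.execute("""
        SELECT p.category_id,
               pc.category_name,
               SUM(oi.quantity * oi.unit_price) AS total_sale_amount,
               ROUND(AVG(oi.quantity * oi.unit_price), 2) AS average_sale_amount
        FROM products p
        INNER JOIN order_items oi ON p.product_id = oi.product_id
        INNER JOIN product_categories pc ON pc.category_id = p.category_id
        GROUP BY p.category_id, pc.category_name
        ORDER BY p.category_id
    """).fetchall()


print(sales_per_category(conn))
```

Grouping only by the category columns is what collapses the rows: every extra column in GROUP BY (product name, the per-row amount) splits a category into more groups.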
If you're using MySQL you'd need WITH ROLLUP:
SELECT p.product_name,
oi.product_id,
pc.category_id,
pc.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sale_amount,
AVG(oi.quantity * oi.unit_price) AS average_sale_amount
FROM products p
INNER JOIN product_categories pc
ON p.category_id = pc.category_id
INNER JOIN order_items oi
ON oi.product_id = p.product_id
GROUP BY p.product_name,
oi.product_id,
pc.category_id,
pc.category_name
WITH ROLLUP
If you're using Oracle it's a little different:
SELECT p.product_name,
oi.product_id,
pc.category_id,
pc.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sale_amount,
AVG(oi.quantity * oi.unit_price) AS average_sale_amount
FROM products p
INNER JOIN product_categories pc
ON p.category_id = pc.category_id
INNER JOIN order_items oi
ON oi.product_id = p.product_id
GROUP BY ROLLUP(p.product_name,
oi.product_id,
pc.category_id,
pc.category_name)

Left join when there are lots of matched rows from right table

I have three tables.
Product(id, name)
LineItem(id, product_id, order_id)
Order(id, state)
An order can have many products, and one product can belong to many orders at the same time.
I would like to select products that don't have orders with specific statuses (e.g. 1, 2).
My query is
SELECT products.id, products.price
FROM "products"
LEFT OUTER JOIN line_items ON line_items.product_id = products.id
LEFT OUTER JOIN orders ON orders.id = line_items.order_id AND orders.status IN (1, 2)
WHERE (products.price > 0) AND (orders.id IS NULL) AND "products"."id" = $1
GROUP BY products.id, products.price -- bind parameter: $1 = 11
11 is the id of a product that should not appear in the result, but it does.
I would like to select Products, which don't have orders with specific statuses (i.e. 1, 2).
SELECT * FROM products p -- I would like to select Products
WHERE NOT EXISTS( -- , which don't have
SELECT *
FROM orders o -- orders
JOIN line_items li ON li.order_id = o.id
WHERE li.product_id = p.id
AND o.status IN (1,2) -- with specific statuses(i.e. 1, 2).
);
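To see why the chained LEFT JOIN from the question lets product 11 through while NOT EXISTS filters it out, here is a small sketch using an in-memory SQLite database from Python. The rows are hypothetical; the assumption is that product 11 sits on one order with status 1 and on another order with a different status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE line_items (id INTEGER PRIMARY KEY, product_id INTEGER, order_id INTEGER);
    -- Hypothetical sample data
    INSERT INTO products VALUES (11, 9.5), (12, 4.0), (13, 7.0);
    INSERT INTO orders VALUES (1, 1), (2, 3);
    INSERT INTO line_items VALUES
        (1, 11, 1),  -- product 11 is on an order with status 1 ...
        (2, 11, 2),  -- ... and also on an order with status 3
        (3, 12, 2);  -- product 12 is only on a status-3 order; 13 has no orders
""")

# Chained left joins: the status-3 line item of product 11 finds no matching
# order row, so it yields o.id IS NULL and product 11 survives the filter.
left_join_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM products p
    LEFT JOIN line_items li ON li.product_id = p.id
    LEFT JOIN orders o ON o.id = li.order_id AND o.status IN (1, 2)
    WHERE o.id IS NULL
    GROUP BY p.id
    ORDER BY p.id
""")]

# NOT EXISTS asks the question per product, not per line item.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM products p
    WHERE NOT EXISTS (
        SELECT 1
        FROM orders o
        JOIN line_items li ON li.order_id = o.id
        WHERE li.product_id = p.id
          AND o.status IN (1, 2)
    )
    ORDER BY p.id
""")]

print(left_join_ids)
print(not_exists_ids)
```

The left-join version tests each line item separately, so one "harmless" line item is enough to keep the product; NOT EXISTS rejects the product as soon as any matching order exists.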
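Another option is conditional aggregation. The left joins keep products that have no orders at all:
select p.id, p.name
from products p
left join lineitem l on l.product_id = p.id
left join orders o on l.order_id = o.id
group by p.id, p.name
having sum(case when o.state in (1,2) then 1 else 0 end) = 0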
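The idea is to start with the products table and use a left join to look for line items on orders in state 1 or 2. If none exist, then you want the product. The inner join between lineitem and orders has to be applied first (hence the parentheses); otherwise a product that also appears on an order in another state would still produce a row where o.id is null:
select p.id, p.name
from product p left join
(lineitem li join
orders o -- a better name for the table
on li.order_id = o.id and
o.state in (1, 2)
)
on li.product_id = p.id
where li.product_id is null;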

Postgresql returning the most popular genre of product per customer

I need a query that returns a list of customers with the most popular product type for each customer. So far I have a query that sums the quantity purchased per product type for each customer and lists the totals in descending order per customer:
SELECT c.customer_name as cname, ptr.product_type as pop_gen, sum(od.quantity) as li
FROM product_type_ref as ptr
INNER JOIN product as p
on p.product_type_ref_id = ptr.product_type_ref_id
INNER JOIN order_detail as od
on od.product_id = p.product_id
INNER JOIN order as o
on o.order_id = od.order_id
INNER JOIN customer as c
on c.customer_id = o.customer_id
GROUP BY cname, pop_gen
ORDER BY cname, li DESC
which returns this data:
'andy','Drama',1000
'andy','Action',250
'andy','Comedy',100
'bebe','Drama',250
'bebe','Action',100
'bebe','Comedy',25
'buster','Action',825
'buster','Comedy',768
'buster','Drama',721
'buster','Romance',100
'ron','Romance',50
'ron','Comedy',10
How could I return this instead:
andy, Drama
bebe, Drama
buster, Action
ron, Romance
In Postgres, you can just use distinct on:
SELECT DISTINCT ON (c.customer_name) c.customer_name as cname,
ptr.product_type as pop_gen, sum(od.quantity) as li
FROM product_type_ref as ptr
INNER JOIN product as p
on p.product_type_ref_id = ptr.product_type_ref_id
INNER JOIN order_detail as od
on od.product_id = p.product_id
INNER JOIN order as o
on o.order_id = od.order_id
INNER JOIN customer as c
on c.customer_id = o.customer_id
GROUP BY cname, pop_gen
ORDER BY cname, li DESC;
Classic greatest-n-per-group. One possible solution is to use ROW_NUMBER():
WITH
CTE
AS
(
SELECT
c.customer_name as cname, ptr.product_type as pop_gen, sum(od.quantity) as li
,ROW_NUMBER() OVER(PARTITION BY c.customer_name ORDER BY sum(od.quantity) DESC) AS rn
FROM
product_type_ref as ptr
INNER JOIN product as p on p.product_type_ref_id = ptr.product_type_ref_id
INNER JOIN order_detail as od on od.product_id = p.product_id
INNER JOIN order as o on o.order_id = od.order_id
INNER JOIN customer as c on c.customer_id = o.customer_id
GROUP BY
cname, pop_gen
)
SELECT
cname, pop_gen, li
FROM CTE
WHERE rn = 1
ORDER BY cname;
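The ROW_NUMBER() step can be tried in isolation on the totals posted in the question. This is only a sketch: it loads those rows into a hypothetical one-table stand-in named totals (instead of the real five-table join) in an in-memory SQLite database, and assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

# The per-customer totals exactly as listed in the question.
rows = [
    ("andy", "Drama", 1000), ("andy", "Action", 250), ("andy", "Comedy", 100),
    ("bebe", "Drama", 250), ("bebe", "Action", 100), ("bebe", "Comedy", 25),
    ("buster", "Action", 825), ("buster", "Comedy", 768),
    ("buster", "Drama", 721), ("buster", "Romance", 100),
    ("ron", "Romance", 50), ("ron", "Comedy", 10),
]

conn = sqlite3.connect(":memory:")
# "totals" is a stand-in for the aggregated result of the original joins.
conn.execute("CREATE TABLE totals (cname TEXT, pop_gen TEXT, li INTEGER)")
conn.executemany("INSERT INTO totals VALUES (?, ?, ?)", rows)

top_genre = conn.execute("""
    SELECT cname, pop_gen
    FROM (
        SELECT cname, pop_gen,
               ROW_NUMBER() OVER (PARTITION BY cname ORDER BY li DESC) AS rn
        FROM totals
    )
    WHERE rn = 1
    ORDER BY cname
""").fetchall()

print(top_genre)
```

Note that ROW_NUMBER() keeps exactly one row per customer even when two genres tie; RANK() would keep all tied rows.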
Add ROW_NUMBER():
SELECT *
FROM (
SELECT c.customer_name as cname,
ptr.product_type as pop_gen,
sum(od.quantity) as li,
ROW_NUMBER() OVER (PARTITION BY c.customer_name
ORDER BY sum(od.quantity) DESC) as rn
......
) as T
WHERE T.rn = 1

Select maximum value of one table depending on 2 other tables

I have 3 tables
Orders (orderID, CustomerID)
Orderlines (orderID, ProdID)
Products (ProdID, CategoryID)
I want to find the customerID that has the most distinct CategoryID values in a single order.
To get you there, start with the basic query to get your info:
SELECT o.customerid
,l.orderid
,COUNT(DISTINCT p.categoryid) category_cnt
FROM orders o
JOIN orderlines l on l.orderid = o.orderid
JOIN products p ON l.prodid = p.prodid
GROUP BY o.customerid, l.orderid
order by COUNT(DISTINCT p.categoryid) desc;
Once you see that this works, add the analytic rank() function to the query:
SELECT o.customerid
,l.orderid
,COUNT(DISTINCT p.categoryid) category_cnt
,rank() over (order by COUNT(DISTINCT p.categoryid) desc) as count_rank
FROM orders o
JOIN orderlines l on l.orderid = o.orderid
JOIN products p ON l.prodid = p.prodid
GROUP BY o.customerid, l.orderid
order by COUNT(DISTINCT p.categoryid) desc;
Following so far? OK, so now we just need to push this down into a subquery to get the record(s) ranked #1 (in case more than one customer order matches the top count):
SELECT customerid, orderid, category_cnt
FROM (
SELECT o.customerid
,l.orderid
,COUNT(DISTINCT p.categoryid) category_cnt
,rank() over (order by COUNT(DISTINCT p.categoryid) desc) as count_rank
FROM orders o
JOIN orderlines l on l.orderid = o.orderid
JOIN products p ON l.prodid = p.prodid
GROUP BY o.customerid, l.orderid) ranked
WHERE count_rank = 1;
Try:
with data_a as ( --distinct CategoryID cnt
select
o.orderID,
o.customerID,
count(DISTINCT p.CategoryID) cnt
from Orders o
join Orderlines ol on ol.orderID = o.orderID
join Products p on p.ProdID = ol.ProdID
group by o.orderID, o.customerID
),
data as ( --get all count rnk
select
orderID,
customerID,
rank() over (order by cnt desc) rnk -- no partition: rank all orders against each other
from data_a
)
select
orderID,
customerID
from data
where rnk = 1
Step by step: Count distinct categories per order first. Then rank your orders, so that the orders with the most categories get rank #1. Then find customers for all orders ranked #1.
select distinct customerid
from orders
where orderid in
(
select orderid
from
(
select orderid, rank() over (order by category_count desc) as rnk
from
(
select ol.orderid, count(distinct p.categoryid) as category_count
from orderlines ol
join products p on p.prodid = ol.prodid
group by ol.orderid
) counted
) ranked
where rnk = 1
);
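The three steps above (count, rank, look up the customers) can be sketched end to end with an in-memory SQLite database from Python. The sample rows and the helper name customers_with_most_categories are hypothetical, and window functions assume SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, customerid INTEGER);
    CREATE TABLE orderlines (orderid INTEGER, prodid INTEGER);
    CREATE TABLE products (prodid INTEGER PRIMARY KEY, categoryid INTEGER);
    -- Hypothetical sample data
    INSERT INTO products VALUES (1, 1), (2, 2), (3, 3), (4, 2), (5, 4), (6, 5), (7, 6);
    INSERT INTO orders VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO orderlines VALUES
        (1, 1), (1, 2), (1, 3),   -- order 1: categories 1, 2, 3 -> 3 distinct
        (2, 1), (2, 2), (2, 4),   -- order 2: categories 1, 2, 2 -> 2 distinct
        (3, 5), (3, 6), (3, 7);   -- order 3: categories 4, 5, 6 -> 3 distinct
""")


def customers_with_most_categories(conn):
    """Customers owning an order with the highest distinct-category count."""
    rows = conn.execute("""
        SELECT DISTINCT customerid
        FROM orders
        WHERE orderid IN (
            SELECT orderid
            FROM (
                SELECT orderid,
                       RANK() OVER (ORDER BY category_count DESC) AS rnk
                FROM (
                    SELECT ol.orderid,
                           COUNT(DISTINCT p.categoryid) AS category_count
                    FROM orderlines ol
                    JOIN products p ON p.prodid = ol.prodid
                    GROUP BY ol.orderid
                ) counted
            ) ranked
            WHERE rnk = 1
        )
        ORDER BY customerid
    """).fetchall()
    return [row[0] for row in rows]


print(customers_with_most_categories(conn))
```

Because rank() has no PARTITION BY here, all orders compete in one ranking, and tied orders (1 and 3 in this sample) both come back.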
Something like this, I guess:
SELECT o.customerID, t.category_cnt
FROM (SELECT l.orderid, COUNT(DISTINCT categoryid) category_cnt
FROM orderlines l
JOIN products p ON l.prodid = p.prodid
GROUP BY l.orderid
ORDER BY category_cnt DESC) t
JOIN orders o ON o.orderid = t.orderid
WHERE rownum < 2

Missing Right Parenthesis issue

This is the query I have written:
Select C.CUST_NAME,P.PROD_DESCRIP from Customer C
JOIN (Ord O,OrderItem OT, Product P) ON (C.CUST_ID = O.CUST_ID AND O.ORD_ID = OT.ORD_ID AND OT.PROD_ID = P.PROD_ID) GROUP BY C.CUST_NAME ORDER BY OT.ORDITEM_QTY DESC
But it gives me a "missing right parenthesis" error.
Although that join syntax is allowed in some databases, it is really much clearer to split out the joins:
Select C.CUST_NAME, P.PROD_DESCRIP
from Customer C JOIN
Ord O
on C.CUST_ID = O.CUST_ID JOIN
OrderItem OT
on O.ORD_ID = OT.ORD_ID JOIN
Product P
ON OT.PROD_ID = P.PROD_ID
GROUP BY C.CUST_NAME
ORDER BY OT.ORDITEM_QTY DESC;
By the way, this probably isn't doing what you think it does. It is returning a customer name along with an arbitrary prod_descrip. It is then ordering this result by an arbitrary quantity -- perhaps from the same or a different row.
If you want to get the customer name along with the product with the maximum quantity for that customer, you can do this:
Select C.CUST_NAME,
substring_index(group_concat(P.PROD_DESCRIP order by OT.ORDITEM_QTY desc), ',', 1) as PROD_DESCRIP
from Customer C JOIN
Ord O
on C.CUST_ID = O.CUST_ID JOIN
OrderItem OT
on O.ORD_ID = OT.ORD_ID JOIN
Product P
ON OT.PROD_ID = P.PROD_ID
GROUP BY C.CUST_NAME;
Note: If PROD_DESCRIP could have a comma then you will want to use a different separator character.
EDIT:
The above is the MySQL solution. In Oracle, you would do:
select CUST_NAME, PROD_DESCRIP
from (Select C.CUST_NAME, P.PROD_DESCRIP,
row_number() over (partition by C.CUST_NAME order by OT.ORDITEM_QTY desc) as seqnum
from Customer C JOIN
Ord O
on C.CUST_ID = O.CUST_ID JOIN
OrderItem OT
on O.ORD_ID = OT.ORD_ID JOIN
Product P
ON OT.PROD_ID = P.PROD_ID
) t
where seqnum = 1;
This is actually the preferred standard SQL solution. It will work in most databases (SQL Server, Oracle, Postgres, DB2, and Teradata).
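A quick way to try the row_number() approach without an Oracle instance is an in-memory SQLite database driven from Python. This is only a sketch: the rows, the reduced column lists, and the helper name top_product_per_customer are assumptions made for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CUST_ID INTEGER PRIMARY KEY, CUST_NAME TEXT);
    CREATE TABLE Ord (ORD_ID INTEGER PRIMARY KEY, CUST_ID INTEGER);
    CREATE TABLE Product (PROD_ID INTEGER PRIMARY KEY, PROD_DESCRIP TEXT);
    CREATE TABLE OrderItem (ORD_ID INTEGER, PROD_ID INTEGER, ORDITEM_QTY INTEGER);
    -- Hypothetical sample data
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Product VALUES (1, 'Pen'), (2, 'Ink'), (3, 'Pad');
    INSERT INTO Ord VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO OrderItem VALUES
        (1, 1, 3), (2, 2, 7),   -- Ann: Pen x3, Ink x7
        (3, 3, 1), (3, 1, 4);   -- Bob: Pad x1, Pen x4
""")


def top_product_per_customer(conn):
    """For each customer, the product from the order item with the largest quantity."""
    return conn.execute("""
        SELECT CUST_NAME, PROD_DESCRIP
        FROM (
            SELECT C.CUST_NAME, P.PROD_DESCRIP,
                   ROW_NUMBER() OVER (PARTITION BY C.CUST_NAME
                                      ORDER BY OT.ORDITEM_QTY DESC) AS seqnum
            FROM Customer C
            JOIN Ord O ON C.CUST_ID = O.CUST_ID
            JOIN OrderItem OT ON O.ORD_ID = OT.ORD_ID
            JOIN Product P ON OT.PROD_ID = P.PROD_ID
        ) t
        WHERE seqnum = 1
        ORDER BY CUST_NAME
    """).fetchall()


print(top_product_per_customer(conn))
```

Unlike the GROUP BY version, no row is picked arbitrarily: the window function numbers each customer's rows by quantity and the outer query keeps the first one.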
SELECT C.CUST_NAME, P.PROD_DESCRIP
FROM Customer C
INNER JOIN Ord O ON C.CUST_ID = O.CUST_ID
INNER JOIN OrderItem OT ON O.ORD_ID = OT.ORD_ID
INNER JOIN Product P ON OT.PROD_ID = P.PROD_ID
GROUP BY C.CUST_NAME
ORDER BY OT.ORDITEM_QTY DESC
SELECT C.CUST_NAME,P.PROD_DESCRIP
FROM Customer C
JOIN Ord O
ON C.CUST_ID = O.CUST_ID
JOIN OrderItem OT
ON O.ORD_ID = OT.ORD_ID
JOIN Product P
ON OT.PROD_ID = P.PROD_ID
GROUP BY C.CUST_NAME
ORDER BY OT.ORDITEM_QTY DESC