Calculate top two performing product categories from Sales data - sql

I am trying to build a KPI of top 2 performing product categories for each customer.
I have sales data with the following relevant columns:
customerid, product, product_category, order_qty, product_amt , order_date
I am using legacy SQL syntax in BQ.

This is a possible solution...
SELECT
customerid,
product_category,
order_qty
FROM (
SELECT
customerid,
product_category,
SUM(order_qty) AS order_qty,
ROW_NUMBER() OVER(PARTITION BY customerid ORDER BY order_qty DESC) AS rn
FROM
[project:dataset.table]
GROUP BY
1, 2
)
WHERE
rn <= 2
ORDER BY
1, 3 DESC
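For anyone who wants to try the ranking logic outside BigQuery, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table name `sales` and the sample rows are made up for illustration. It aggregates in an inner subquery before ranking, so it is a standard-SQL variation on the query above, not legacy SQL syntax.

```python
import sqlite3

# Hypothetical sample rows: (customerid, product_category, order_qty).
rows = [
    ("c1", "books", 5),
    ("c1", "toys", 2),
    ("c1", "books", 1),
    ("c1", "games", 4),
    ("c2", "toys", 7),
    ("c2", "books", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customerid TEXT, product_category TEXT, order_qty INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

query = """
SELECT customerid, product_category, total_qty
FROM (
    -- Rank each customer's categories by their summed quantity.
    SELECT customerid, product_category, total_qty,
           ROW_NUMBER() OVER (PARTITION BY customerid
                              ORDER BY total_qty DESC) AS rn
    FROM (
        -- One row per customer and category.
        SELECT customerid, product_category, SUM(order_qty) AS total_qty
        FROM sales
        GROUP BY customerid, product_category
    )
)
WHERE rn <= 2
ORDER BY customerid, total_qty DESC
"""
top_two = conn.execute(query).fetchall()
for row in top_two:
    print(row)
```

Note that ROW_NUMBER() breaks ties arbitrarily; if two categories can have the same total, add a second ORDER BY key inside the window to make the result deterministic.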

Related

Averaging and Grouping In google Big Query

I have the table as shown in Google BigQuery.
I just want to do the following:
Calculate the category-wise total units sold
Calculate the category-wise average selling price
Consider the approach below:
select 'category' type, category name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by category
union all
select * from (
select 'product' type, product name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by product
order by total_sale desc limit 10
)
union all
select * from (
select 'order_date' type, '' || order_date name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by order_date
order by total_sale desc limit 5
)
order by type
Applied to sample/dummy data, this returns one row per category, followed by the top 10 products and the top 5 order dates by total sale.
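A small sketch of just the category-level part, run against SQLite through Python's sqlite3 module. The rows are made up (one row per unit sold, which is what count(1) in the answer assumes), and the column types are my guess.

```python
import sqlite3

# Hypothetical rows: (category, product, sale_price), one row per unit sold.
rows = [
    ("fruit", "apple", 1.0),
    ("fruit", "apple", 1.5),
    ("fruit", "pear", 2.0),
    ("dairy", "milk", 0.5),
    ("dairy", "cheese", 4.5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (category TEXT, product TEXT, sale_price REAL)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

query = """
SELECT category,
       COUNT(*) AS units_sold,
       SUM(sale_price) AS total_sale,
       ROUND(AVG(sale_price), 2) AS average_selling_price
FROM your_table
GROUP BY category
ORDER BY category
"""
per_category = conn.execute(query).fetchall()
for row in per_category:
    print(row)
```

If the table stored a quantity per row instead of one row per unit, SUM(quantity) would replace COUNT(*), and the average would need to be weighted by quantity.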

Execution orders of SQL aggregate functions

I have a sales table in SQLite:
purchase_date  units_sold  customer_id
15             1           1
17             1           1
30             3           1
I want to get the total units_sold for each customer on the first and last dates of their purchases. My query is:
select customer_id,
sum(units_sold) total_units_sold
from sales
group by customer_id
having purchase_date = min(purchase_date)
or purchase_date = max(purchase_date)
I was expecting results like:
customer_id  total_units_sold
1            4
but I got:
customer_id  total_units_sold
1            5
I would like to know why this solution doesn't work.
The filter is applied at the wrong stage.
Note: HAVING is evaluated after GROUP BY has collapsed each customer's rows into one group, so it can only keep or drop the whole group. By that point SUM(units_sold) has already counted every row.
You need to filter the rows first, in a subquery, and aggregate afterwards.
For example, number each customer's rows by purchase_date ascending to find the first purchase,
and by purchase_date descending to find the last one,
then keep only the rows numbered 1 in either ordering and group them.
The complete example:
select customer_id,sum(units_sold) from (
select customer_id, units_sold,purchase_date,
ROW_NUMBER() over(partition by customer_id order by purchase_date) As RowDatefirst,
ROW_NUMBER() over(partition by customer_id order by purchase_date desc)As RowDatelast
from sales
) t where t.RowDatefirst = 1 or t.RowDatelast=1
group by customer_id
Try this:
SELECT a.customer_id, SUM(a.units_sold) as total_units_sold
FROM sales a
INNER JOIN (
SELECT customer_id, MIN(purchase_date) as _first ,MAX(purchase_date) as _last
FROM sales
GROUP BY customer_id
) b ON a.customer_id = b.customer_id AND
(a.purchase_date = b._first OR a.purchase_date = b._last)
GROUP BY a.customer_id
http://sqlfiddle.com/#!7/0a4a4/7
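To see the difference between aggregating everything and filtering rows before aggregating, here is a quick sketch with the question's three rows, run through Python's built-in sqlite3 module. It uses the join approach with MIN/MAX; the aliases first_date and last_date are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (purchase_date INTEGER, units_sold INTEGER, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(15, 1, 1), (17, 1, 1), (30, 3, 1)],
)

# No row filter: SUM sees every row of the group.
unfiltered = conn.execute(
    "SELECT customer_id, SUM(units_sold) FROM sales GROUP BY customer_id"
).fetchall()

# Row filter first (first and last purchase dates), then aggregate.
filtered = conn.execute("""
    SELECT s.customer_id, SUM(s.units_sold) AS total_units_sold
    FROM sales s
    JOIN (
        SELECT customer_id,
               MIN(purchase_date) AS first_date,
               MAX(purchase_date) AS last_date
        FROM sales
        GROUP BY customer_id
    ) b ON s.customer_id = b.customer_id
    WHERE s.purchase_date IN (b.first_date, b.last_date)
    GROUP BY s.customer_id
""").fetchall()

print(unfiltered)  # every row counted
print(filtered)    # only the rows dated 15 and 30 counted
```

Unlike the ROW_NUMBER() version, the join keeps every row that falls on the first or last date, so a customer with several purchases on the same day has all of them summed.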

SQL get top 3 values / bottom 3 values with group by and sum

I am working on a restaurant management system. There I have two tables
order_details(orderId,dishId,createdAt)
dishes(id,name,imageUrl)
My customer wants a report of the top 3 selling items and the 3 least selling items for each month.
For the moment I did something like this
SELECT
*
FROM
(SELECT
SUM(qty) AS qty,
order_details.dishId,
MONTHNAME(order_details.createdAt) AS mon,
dishes.name,
dishes.imageUrl
FROM
rms.order_details
INNER JOIN dishes ON order_details.dishId = dishes.id
GROUP BY order_details.dishId , MONTHNAME(order_details.createdAt)) t
ORDER BY t.qty
This gives me all the dishes sold count order by qty.
I have to manually filter max 3 records and reject the rest. There should be a SQL way of doing this. How do I do this in SQL?
You would use row_number() for this purpose. You don't specify the database you are using, so I am guessing at the appropriate date functions. I also assume that you mean a month within a year, so you need to take the year into account as well:
SELECT ym.*
FROM (SELECT YEAR(od.CreatedAt) as yyyy,
MONTH(od.createdAt) as mm,
SUM(qty) AS qty,
od.dishId, d.name, d.imageUrl,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_desc,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) ASC) as seqnum_asc
FROM rms.order_details od INNER JOIN
dishes d
ON od.dishId = d.id
GROUP BY YEAR(od.CreatedAt), MONTH(od.CreatedAt), od.dishId
) ym
WHERE seqnum_asc <= 3 OR
seqnum_desc <= 3;
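Here is a runnable sketch of the same two-ranking idea on SQLite (3.25 or newer) via Python's sqlite3 module. The sample rows are invented, strftime stands in for YEAR()/MONTH(), the join to dishes is left out, and the cutoff is a parameter so the tiny data set can use 2 instead of 3. The aggregation is done in an inner subquery before ranking, which is a restructuring of the query above and not the same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_details (dishId INTEGER, qty INTEGER, createdAt TEXT)")
# Hypothetical orders; dish 1 has two rows in January to exercise SUM.
conn.executemany(
    "INSERT INTO order_details VALUES (?, ?, ?)",
    [
        (1, 6, "2024-01-05"), (1, 4, "2024-01-20"),
        (2, 7, "2024-01-07"), (3, 5, "2024-01-09"),
        (4, 2, "2024-01-11"), (5, 1, "2024-01-15"),
        (1, 3, "2024-02-02"), (2, 8, "2024-02-03"),
    ],
)

query = """
SELECT mon, dishId, total_qty
FROM (
    -- Rank dishes within each month from both ends.
    SELECT mon, dishId, total_qty,
           ROW_NUMBER() OVER (PARTITION BY mon ORDER BY total_qty DESC) AS seqnum_desc,
           ROW_NUMBER() OVER (PARTITION BY mon ORDER BY total_qty ASC)  AS seqnum_asc
    FROM (
        -- One row per month and dish.
        SELECT strftime('%Y-%m', createdAt) AS mon, dishId, SUM(qty) AS total_qty
        FROM order_details
        GROUP BY strftime('%Y-%m', createdAt), dishId
    )
)
WHERE seqnum_desc <= :n OR seqnum_asc <= :n
ORDER BY mon, total_qty DESC
"""
report = conn.execute(query, {"n": 2}).fetchall()
for row in report:
    print(row)
```

When a month has fewer dishes than twice the cutoff, the same dish qualifies for both lists but still appears only once, because the OR filters rows and does not duplicate them.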
Using the above info, I used a combination of GROUP BY, ORDER BY and LIMIT,
as shown below. Note that LIMIT 3 applies to the whole result, not to each month. I hope this is what you are looking for:
SELECT
t.qty,
t.dishId,
t.month,
d.name,
d.imageUrl
from
(
SELECT
od.dishId,
count(od.dishId) AS 'qty',
date_format(od.createdAt,'%Y-%m') as 'month'
FROM
rms.order_details od
group by date_format(od.createdAt,'%Y-%m'),od.dishId
order by qty desc
limit 3) t
join rms.dishes d on (t.dishId = d.id)

getting difference between two invoices by ranking and subtracting one from the other

I am trying to get the difference between two invoices.
I attempted to use CTEs for ranks 1 and 2, but each one has a subquery in it and I can't get that to work.
The second query looks the same as the one below, but with rank = 2.
select *
from (
SELECT i.id, i.subtotal/100 as subtotal, i.created_at, i.paid_at
,RANK() OVER (PARTITION BY i.subscription_id ORDER BY i.created_at DESC) AS Rank
From Invoices i
) as r
where r.rank = 1
order by r.created_at desc;
Following the path that you are on (using row_number()/rank()), you can use conditional aggregation. Assuming you want the difference of the subtotal, then:
-- one row per subscription: latest subtotal minus the previous one
select i.subscription_id,
sum(case when seqnum = 1 then subtotal
else - subtotal
end) as difference
from (select i.subscription_id, i.subtotal/100 as subtotal,
row_number() over (partition by i.subscription_id order by i.created_at desc) as seqnum
from Invoices i
) i
where seqnum in (1, 2)
group by i.subscription_id;
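A minimal sketch of the conditional-aggregation idea, run on SQLite (3.25 or newer) through Python's sqlite3 module. The invoice rows are invented, subtotal is assumed to be stored in cents, and the result is grouped per subscription, which is my reading of what the question wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices (id INTEGER, subscription_id INTEGER, "
    "subtotal INTEGER, created_at TEXT)"
)
# Hypothetical invoices; subtotal is in cents.
conn.executemany(
    "INSERT INTO Invoices VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1000, "2024-01-01"),
        (2, 1, 1500, "2024-02-01"),
        (3, 1, 2100, "2024-03-01"),
        (4, 2, 500, "2024-01-10"),
        (5, 2, 800, "2024-02-10"),
    ],
)

query = """
SELECT subscription_id,
       -- Latest invoice counts positive, the one before it negative.
       SUM(CASE WHEN seqnum = 1 THEN amount ELSE -amount END) AS difference
FROM (
    SELECT subscription_id,
           subtotal / 100.0 AS amount,
           ROW_NUMBER() OVER (PARTITION BY subscription_id
                              ORDER BY created_at DESC) AS seqnum
    FROM Invoices
)
WHERE seqnum IN (1, 2)
GROUP BY subscription_id
ORDER BY subscription_id
"""
differences = conn.execute(query).fetchall()
for row in differences:
    print(row)
```

A subscription with only one invoice has no second row to subtract, so its difference is simply that invoice's amount; filter on COUNT(*) = 2 in a HAVING clause if those should be excluded.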

SQL - Remove duplicates to show the latest date record

I have a view from which I ultimately want to return 1 row per customer.
Currently it's a SELECT as follows:
SELECT
Customerid,
MAX(purchasedate) AS purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
Customer
GROUP BY
Customerid,
paymenttype,
delivery,
amount,
discountrate
I was hoping MAX(purchasedate) would work, but it breaks once I add the other groupings: discountrate is sometimes set and sometimes NULL, and paymenttype can also differ between a customer's purchases. Is there any way to show just the last purchase each customer made?
Since SQL Server 2008 R2 supports window functions, you can use ROW_NUMBER():
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
) derivedTable
WHERE derivedTable.rn = 1
Or use a common table expression:
WITH derivedTable
AS
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
)
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM derivedTable
WHERE derivedTable.rn = 1
Or join to a subquery, which also works in other DBMSs:
SELECT a.*
FROM Customer a
INNER JOIN
(
SELECT CustomerID, MAX(purchasedate) maxDate
FROM Customer
GROUP BY CustomerID
) b ON a.CustomerID = b.CustomerID AND
a.purchasedate = b.maxDate
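To compare the ROW_NUMBER() and join approaches side by side, here is a small sketch on SQLite (3.25 or newer) through Python's sqlite3 module. The rows and column types are made up, and the delivery column is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Customerid INTEGER, purchasedate TEXT, "
    "paymenttype TEXT, amount REAL, discountrate REAL)"
)
# Hypothetical purchases; discountrate is NULL for some rows.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", "card", 10.0, None),
        (1, "2024-03-01", "cash", 20.0, 0.1),
        (2, "2024-02-01", "card", 5.0, None),
    ],
)

# Approach 1: number each customer's rows, newest first, and keep row 1.
by_row_number = conn.execute("""
    SELECT Customerid, purchasedate, paymenttype, amount, discountrate
    FROM (
        SELECT c.*,
               ROW_NUMBER() OVER (PARTITION BY Customerid
                                  ORDER BY purchasedate DESC) AS rn
        FROM Customer c
    )
    WHERE rn = 1
    ORDER BY Customerid
""").fetchall()

# Approach 2: join back to each customer's latest purchase date.
by_join = conn.execute("""
    SELECT a.*
    FROM Customer a
    INNER JOIN (
        SELECT Customerid, MAX(purchasedate) AS maxDate
        FROM Customer
        GROUP BY Customerid
    ) b ON a.Customerid = b.Customerid AND a.purchasedate = b.maxDate
    ORDER BY a.Customerid
""").fetchall()

print(by_row_number)
print(by_join)
```

The two agree here because no customer has two purchases with the same latest purchasedate. With such a tie, the join returns every tied row, while ROW_NUMBER() keeps exactly one of them (an arbitrary one unless the ORDER BY has a tiebreaker).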