Calculating multiple averages across different parts of the table? - sql

I have the following transactions table:
customer_id purchase_date product category department quantity store_id
1 2020-10-01 Kit Kat Candy Food 2 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
2 2020-10-01 Snickers Candy Food 2 store_A
2 2020-10-01 Baguette Bread Food 5 store_A
2 2020-10-01 iPhone Cell phones Electronics 2 store_A
3 2020-10-01 Sony PS5 Games Electronics 1 store_A
I would like to calculate the average number of products purchased (for each product in the table). I'm also looking to calculate averages across each category and each department by accounting for all products within the same category or department respectively. Care should be taken to divide over unique customers AND the product quantity being greater than 0 (a 0 quantity indicates a refund, and should not be accounted for).
So basically, the output table would like below:
...where store_id and average_level_type are partition columns.
Is there a way to achieve this in a single pass over the transactions table? or do I need to break down my approach into multiple steps?
Thanks!

How about using “union all” as below -
Select store_id, 'product' as average_level_type,product as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,product
Union all
Select store_id, 'category' as average_level_type, category as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,category
Union all
Select store_id, 'department' as average_level_type,department as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,department;
If you want to avoid using union all in that case you can use something like rollup() or group by grouping sets() to achieve the same but the query would be a little more complicated to get the output in the exact format which you have shown in the question.
EDIT : Below is how you can use grouping sets to get the same output -
Select store_id,
case when G_ID = 3 then 'product'
when G_ID = 5 then 'category'
when G_ID = 6 then 'department' end As average_level_type,
case when G_ID = 3 then product
when G_ID = 5 then category
when G_ID = 6 then department end As id,
total_quantity,
unique_customer_count,
average
from
(select store_id, product, category, department, sum(quantity) as total_quantity, Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average, GROUPING__ID As G_ID
from transactions
group by store_id,product,category,department
grouping sets((store_id,product),(store_id,category),(store_id,department))
) Tab
order by 2
;

Related

Find the products contributing to the 50% of the total sales using SQL SUM window Function

There are two Tables - orders and item_line
orders
order_id
created_at
total_amount
123
2022-11-11 13:40:50
450.00
124
2022-10-30 00:40:50
1500.00
item_line
order_id
product_id
product_name
quantity
unit_price
123
a1b
milo
4
100.00
123
c2d
coke
5
10.00
124
c2d
coke
150
10.00
The question is:
Find the products contributing to the 50% of the total sales.
My take on this is -
SELECT i.product_name,SUM(o.total_amount)AS 'Net Sales'
FROM item_line i
JOIN orders o on o.order_id = i.order_id
GROUP BY i.product_name
HAVING SUM(o.total_amount) = (SUM(o.total_amount)*0.5);
But this is not correct. SUM windows functions need to be used, but how?
Try the following, explanation is within the query comments:
-- find the the total sales for each product
WITH product_sales AS
(
SELECT product_id, product_name,
SUM(quantity * unit_price) AS product_tot_sales
FROM item_line
GROUP BY product_id, product_name
),
-- find the running sales percentage for each product starting from porduct with highest sales value
running_percentage AS
(
SELECT product_id, product_name, product_tot_sales,
SUM(product_tot_sales) OVER (ORDER BY product_tot_sales DESC) /
SUM(product_tot_sales) OVER () AS running_sales_percentage,
SUM(product_tot_sales) OVER () AS tot_sales
FROM product_sales
)
-- select products that have a running sales percentage less than the min(running_sales_percentage) where running_sales_percentage >= 0.5
-- this will select all of products that contributes of 0.5 of the total sales
SELECT product_id, product_name, product_tot_sales,
tot_sales,
running_sales_percentage
FROM running_percentage
WHERE running_sales_percentage <=
(
SELECT MIN(running_sales_percentage)
FROM running_percentage
WHERE running_sales_percentage >= 0.5
)
You don't need a join with orders table, all data you need is existed in the item_line table.
See demo.

How to find out first product item client purchased whose bought specific products?

I want to write a query to locate a group of clients whose purchased specific 2 product categories, at the same time, getting the information of first transaction date and first item they purchased. Since I used group by function, I could only get customer id but not first item purchase due to the nature of group by. Any thoughts to solve this problem?
What I have are transaction tables(t), customer_id tables(c) and product tables(p). Mine is SQL server 2008.
Update
SELECT t.customer_id
,t.product_category
,MIN(t.transaction_date) AS FIRST_TRANSACTION_DATE
,SUM(t.quantity) AS TOTAL_QTY
,SUM(t.sales) AS TOTAL_SALES
FROM transaction t
WHERE t.product_category IN ('VEGETABLES', 'FRUITS')
AND t.transaction_date BETWEEN '2020/01/01' AND '2022/09/30'
GROUP BY t.customer_id
HAVING COUNT(DISTINCT t.product_category) = 2
**Customer_id** **transaction_date** **product_category** **quantity** **sales**
1 2022-05-30 VEGETABLES 1 100
1 2022-08-30 VEGETABLES 1 100
2 2022-07-30 VEGETABLES 1 100
2 2022-07-30 FRUITS 1 50
2 2022-07-30 VEGETABLES 2 200
3 2022-07-30 VEGETABLES 3 300
3 2022-08-01 FRUITS 1 50
3 2022-08-05 FRUITS 1 50
4 2022-08-07 FRUITS 1 50
4 2022-09-05 FRUITS 2 100
In the above, what I want to show after executing the SQL query is
**Customer_id** **FIRST_TRANSACTION_DATE** **first_product_category** **TOTAL_QUANTITY** **TOTAL_SALES**
2 2022-07-30 VEGETABLES, FRUITS 4 350
3 2022-07-30 VEGETABLES 5 400
Customer_id 1 and 4 will not be shown as they only purchased either vegetables or fruits but not both
Check now, BTW need find logic with product_category
select CustomerId, transaction_date, product_category, quantity, sales
from(
select CustomerId, transaction_date, product_category , sum(quantity) over(partition by CustomerId ) as quantity , sum(sales) over(partition by CustomerId ) as sales, row_number() over(partition by CustomerId order by transaction_date ASC) rn
from(
select CustomerId, transaction_date, product_category, quantity, sales
from tablee t
where (product_category = 'FRUITS' and
EXISTS (select CustomerId
from tablee tt
where product_category = 'VEGETABLES'
and t.CustomerId = tt.CustomerId)) OR
(product_category = 'VEGETABLES' and
EXISTS (select CustomerId
from tablee tt
where product_category = 'FRUITS'
and t.CustomerId = tt.CustomerId)))x)over_all
where rn = 1;
HERE is FIDDLE

Select the best selling product ID

What if I have table like this and I want to select the best selling product_id.
id
transaction_id
product_id
qty_sold
1
21
2
5
2
22
3
2
3
23
4
2
3
24
2
1
3
25
2
4
I want the best selling product_id with the highest qty_sold
Using SQLS, you can group by the productID, add up the number of sold, and order by the total descending. If we also take the minimum transaction ID per product, if two products come out to have the same total qty, we can take the minimum tran ID to split the tie
SELECT TOP 1 product_id, SUM(qty_sold) as sellcount, MIN(transaction_id) as firsttran
FROM t
GROUP BY product_id
ORDER BY SUM(qty_sold) DESC, MIN(transaction_id)
Once you're happy the sums are right etc, you can remove the , SUM(qty_sold) as sellcount, MIN(transaction_id) from the SELECT if you want/if you only need the prod ID

table stores has two columns like store_id and products. Write a SQL query to find store_ids which sell only shampoo and biscuit

Store should sell only two products and that too only shampoo and biscuit
For example
Output:
my query: -
SELECT *
FROM (SELECT store_id AS gp_str
FROM (SELECT store_id,
COUNT(product) AS prd_cnt
FROM stores
GROUP BY store_id) x
WHERE x.prd_cnt = 2) y
LEFT JOIN stores ON y.gp_str = stores.store_id;
My query gave me result of stores which sell only two products but I wanted stores which sell two products which are shampoo and biscuit only.
SELECT store_id
FROM stores
GROUP BY store_id
HAVING COUNT(DISTINCT product) = 2
AND COUNT(*) = SUM(CASE WHEN product IN ('shampoo','biscuit') THEN 1 ELSE 0 END);

how to calculate Sum for two different table with different containing table field

I have two table "tbl_In_Details" and "tbl_Out_Details"
Sample data for tbl_In_Details
ID Item_Name Rate Quantity Source_company
1 wire 10 4 2020-04-21 22:47:29.083
2 Tea 4 20 2020-04-21 22:47:52.823
Sample data for tbl_Out_Details
ID Item_Name Quantity Created_Date
1 wire 1 2020-04-21 22:48:48.233
2 wire 2 2020-04-21 22:50:16.367
3 Tea 2 2020-04-21 23:48:39.943
Now i want to calculate SUM of price and price (RATE*QUANTITY) from out going table
i tried in such a way but not getting result
select o.Item_Name, SUM(o.quantity) as Total_quantity ,SUM(o.quantity * i.Rate) as Expenses
from tbl_In_Details i inner join tbl_Out_Details o
ON i.Item_Name = o.Item_Name group by o.Item_Name,o.quantity, i.Rate
My Output should be
Item_Name Total_quantity Expenses
Tea 2 8
wire 3 30
Your query is completely fine. The only thing you need to get the desired result is to delete grouping by 'quantity' and 'rate'.
select o.Item_Name, SUM(o.quantity) as Total_quantity ,SUM(o.quantity * i.Rate) as Expenses
from tbl_In_Details i inner join tbl_Out_Details o
ON i.Item_Name = o.Item_Name group by o.Item_Name;
One method uses union all and group by:
Aggregate before joining:
select i.item_name, o.quantity, i.expenses
from (select id, item_name, sum(rate*quantity) as expenses
from tbl_In_Details
group by id, item_name
) i join
(select id, item_name, sum(quantity) as quantity
from tbl_Out_Details
group by id, item_name
) o
on i.id = o.id