Select the best selling product ID - sql

Suppose I have a table like this and I want to select the best-selling product_id.
id transaction_id product_id qty_sold
1 21 2 5
2 22 3 2
3 23 4 2
3 24 2 1
3 25 2 4
I want the best selling product_id with the highest qty_sold

Using SQL Server, you can group by product_id, sum qty_sold, and order by the total descending. If you also take the minimum transaction_id per product, it can break ties: when two products have the same total quantity, the one with the earliest transaction ID is returned.
SELECT TOP 1 product_id, SUM(qty_sold) as sellcount, MIN(transaction_id) as firsttran
FROM t
GROUP BY product_id
ORDER BY SUM(qty_sold) DESC, MIN(transaction_id)
Once you're happy that the sums are right, you can remove SUM(qty_sold) AS sellcount and MIN(transaction_id) AS firsttran from the SELECT list if you only need the product ID.
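To check the query above against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption for illustration only: the answer targets SQL Server, and SQLite has no TOP, so LIMIT 1 stands in for TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, transaction_id INTEGER, "
    "product_id INTEGER, qty_sold INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 21, 2, 5), (2, 22, 3, 2), (3, 23, 4, 2), (3, 24, 2, 1), (3, 25, 2, 4)],
)

# Full ranking: highest total first, earliest transaction breaks ties.
ranking = conn.execute(
    """
    SELECT product_id, SUM(qty_sold) AS sellcount, MIN(transaction_id) AS firsttran
    FROM t
    GROUP BY product_id
    ORDER BY SUM(qty_sold) DESC, MIN(transaction_id)
    """
).fetchall()

best = ranking[0]  # equivalent to TOP 1 / LIMIT 1
print(ranking)     # [(2, 10, 21), (3, 2, 22), (4, 2, 23)]
print(best)        # (2, 10, 21)
```

Products 3 and 4 tie on a total of 2, and the tie-break puts product 3 first because its earliest transaction (22) precedes product 4's (23).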

Related

How to find the first product purchased by clients who bought specific products?

I want to write a query that finds the clients who purchased two specific product categories and, at the same time, returns their first transaction date and the first item they purchased. Because I use GROUP BY, I can only get the customer ID, not the first item purchased. Any thoughts on how to solve this?
I have a transaction table (t), a customer table (c) and a product table (p). I'm on SQL Server 2008.
Update
SELECT t.customer_id
,t.product_category
,MIN(t.transaction_date) AS FIRST_TRANSACTION_DATE
,SUM(t.quantity) AS TOTAL_QTY
,SUM(t.sales) AS TOTAL_SALES
FROM transaction t
WHERE t.product_category IN ('VEGETABLES', 'FRUITS')
AND t.transaction_date BETWEEN '2020/01/01' AND '2022/09/30'
GROUP BY t.customer_id
HAVING COUNT(DISTINCT t.product_category) = 2
Customer_id transaction_date product_category quantity sales
1 2022-05-30 VEGETABLES 1 100
1 2022-08-30 VEGETABLES 1 100
2 2022-07-30 VEGETABLES 1 100
2 2022-07-30 FRUITS 1 50
2 2022-07-30 VEGETABLES 2 200
3 2022-07-30 VEGETABLES 3 300
3 2022-08-01 FRUITS 1 50
3 2022-08-05 FRUITS 1 50
4 2022-08-07 FRUITS 1 50
4 2022-09-05 FRUITS 2 100
In the above, what I want to show after executing the SQL query is
Customer_id FIRST_TRANSACTION_DATE first_product_category TOTAL_QUANTITY TOTAL_SALES
2 2022-07-30 VEGETABLES, FRUITS 4 350
3 2022-07-30 VEGETABLES 5 400
Customers 1 and 4 should not be shown, because they purchased only vegetables or only fruits, not both.
Check this now. By the way, you still need to work out the logic for product_category.
select CustomerId, transaction_date, product_category, quantity, sales
from (
    select CustomerId, transaction_date, product_category,
           sum(quantity) over (partition by CustomerId) as quantity,
           sum(sales) over (partition by CustomerId) as sales,
           row_number() over (partition by CustomerId order by transaction_date asc) as rn
    from (
        select CustomerId, transaction_date, product_category, quantity, sales
        from tablee t
        where (product_category = 'FRUITS'
               and exists (select CustomerId
                           from tablee tt
                           where product_category = 'VEGETABLES'
                             and t.CustomerId = tt.CustomerId))
           or (product_category = 'VEGETABLES'
               and exists (select CustomerId
                           from tablee tt
                           where product_category = 'FRUITS'
                             and t.CustomerId = tt.CustomerId))
    ) x
) over_all
where rn = 1;
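One possible way to also report every category bought on the first transaction date is to aggregate per customer first and then join back to the rows on that date. The sketch below runs this idea on the question's sample data with Python's sqlite3 module. The table name purchases, the CTE name qualifying, and the ISO date strings are assumptions made for the illustration, and this is not the answerer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (customer_id INTEGER, transaction_date TEXT, "
    "product_category TEXT, quantity INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2022-05-30", "VEGETABLES", 1, 100),
        (1, "2022-08-30", "VEGETABLES", 1, 100),
        (2, "2022-07-30", "VEGETABLES", 1, 100),
        (2, "2022-07-30", "FRUITS", 1, 50),
        (2, "2022-07-30", "VEGETABLES", 2, 200),
        (3, "2022-07-30", "VEGETABLES", 3, 300),
        (3, "2022-08-01", "FRUITS", 1, 50),
        (3, "2022-08-05", "FRUITS", 1, 50),
        (4, "2022-08-07", "FRUITS", 1, 50),
        (4, "2022-09-05", "FRUITS", 2, 100),
    ],
)

rows = conn.execute(
    """
    WITH qualifying AS (
        -- Customers who bought both categories, with their totals.
        SELECT customer_id,
               MIN(transaction_date) AS first_date,
               SUM(quantity) AS total_qty,
               SUM(sales) AS total_sales
        FROM purchases
        WHERE product_category IN ('VEGETABLES', 'FRUITS')
          AND transaction_date BETWEEN '2020-01-01' AND '2022-09-30'
        GROUP BY customer_id
        HAVING COUNT(DISTINCT product_category) = 2
    )
    -- Join back to pick up the categories bought on the first date.
    SELECT DISTINCT q.customer_id, q.first_date, p.product_category,
           q.total_qty, q.total_sales
    FROM qualifying q
    JOIN purchases p
      ON p.customer_id = q.customer_id
     AND p.transaction_date = q.first_date
     AND p.product_category IN ('VEGETABLES', 'FRUITS')
    ORDER BY q.customer_id, p.product_category
    """
).fetchall()

for row in rows:
    print(row)
```

Customer 2 bought both categories on the first date, so it appears once per category; combining those rows into a single "VEGETABLES, FRUITS" value would need a string aggregation step that differs between database engines.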

How to merge two rows and sum the columns

product quantity price
milk 3 10
bread 7 3
bread 5 2
And my output table should be
product total_price
milk 30
bread 31
I can't seem to get my code to work. Here is my code
SELECT product, (SELECT (quantity*unit_price)
FROM shopping_history AS sh ) AS total_price
FROM shopping_history
GROUP BY product
You are looking for the aggregate function SUM, which doesn't require a subquery. For example:
SELECT product, SUM(quantity*unit_price) AS Total_Price
FROM shopping_history
GROUP BY product
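As a quick check of the SUM query, the following sketch loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (an assumption for illustration; any SQL engine should behave the same way here). The column is named unit_price to match the query rather than the price header in the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shopping_history (product TEXT, quantity INTEGER, unit_price INTEGER)"
)
conn.executemany(
    "INSERT INTO shopping_history VALUES (?, ?, ?)",
    [("milk", 3, 10), ("bread", 7, 3), ("bread", 5, 2)],
)

# The product is computed per row, then summed within each group.
totals = conn.execute(
    """
    SELECT product, SUM(quantity * unit_price) AS total_price
    FROM shopping_history
    GROUP BY product
    ORDER BY product
    """
).fetchall()
print(totals)  # [('bread', 31), ('milk', 30)]
```

For bread, the two rows contribute 7 * 3 = 21 and 5 * 2 = 10, which add up to 31.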

Calculating multiple averages across different parts of the table?

I have the following transactions table:
customer_id purchase_date product category department quantity store_id
1 2020-10-01 Kit Kat Candy Food 2 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
2 2020-10-01 Snickers Candy Food 2 store_A
2 2020-10-01 Baguette Bread Food 5 store_A
2 2020-10-01 iPhone Cell phones Electronics 2 store_A
3 2020-10-01 Sony PS5 Games Electronics 1 store_A
I would like to calculate the average number of products purchased (for each product in the table). I'm also looking to calculate averages across each category and each department by accounting for all products within the same category or department respectively. Care should be taken to divide over unique customers AND the product quantity being greater than 0 (a 0 quantity indicates a refund, and should not be accounted for).
So basically, the output table would look like the one below:
...where store_id and average_level_type are partition columns.
Is there a way to achieve this in a single pass over the transactions table? or do I need to break down my approach into multiple steps?
Thanks!
How about using "union all", as below:
Select store_id, 'product' as average_level_type,product as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,product
Union all
Select store_id, 'category' as average_level_type, category as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,category
Union all
Select store_id, 'department' as average_level_type,department as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,department;
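The UNION ALL query above can be tried on the sample rows with the sketch below, which uses Python's sqlite3 module. Two things are assumptions of the sketch rather than part of the answer: the three branches are generated from one template to keep it short, and the sum is multiplied by 1.0 because SQLite performs integer division on integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id INTEGER, purchase_date TEXT, "
    "product TEXT, category TEXT, department TEXT, quantity INTEGER, store_id TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2020-10-01", "Kit Kat", "Candy", "Food", 2, "store_A"),
        (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
        (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
        (2, "2020-10-01", "Snickers", "Candy", "Food", 2, "store_A"),
        (2, "2020-10-01", "Baguette", "Bread", "Food", 5, "store_A"),
        (2, "2020-10-01", "iPhone", "Cell phones", "Electronics", 2, "store_A"),
        (3, "2020-10-01", "Sony PS5", "Games", "Electronics", 1, "store_A"),
    ],
)

# One SELECT per aggregation level; the level names are fixed column names.
parts = [
    f"""
    SELECT store_id, '{level}' AS average_level_type, {level} AS id,
           SUM(quantity) AS total_quantity,
           COUNT(DISTINCT customer_id) AS unique_customer_count,
           SUM(quantity) * 1.0 / COUNT(DISTINCT customer_id) AS average
    FROM transactions
    WHERE quantity > 0
    GROUP BY store_id, {level}
    """
    for level in ("product", "category", "department")
]
rows = conn.execute(" UNION ALL ".join(parts)).fetchall()

# Index the averages by (level, id) for easy lookup.
averages = {(r[1], r[2]): r[5] for r in rows}
print(averages[("product", "Snickers")])   # 2.0
print(averages[("category", "Candy")])     # 3.0
print(averages[("department", "Food")])    # 5.5
```

Snickers totals 4 units across 2 distinct customers, Candy totals 6 across the same 2 customers, and Food totals 11 across 2 customers, which is where 2.0, 3.0 and 5.5 come from.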
If you want to avoid UNION ALL, you can use something like ROLLUP or GROUP BY ... GROUPING SETS to achieve the same result, but the query gets a little more complicated if the output must match the exact format shown in the question.
EDIT: Below is how you can use grouping sets to get the same output:
Select store_id,
case when G_ID = 3 then 'product'
when G_ID = 5 then 'category'
when G_ID = 6 then 'department' end As average_level_type,
case when G_ID = 3 then product
when G_ID = 5 then category
when G_ID = 6 then department end As id,
total_quantity,
unique_customer_count,
average
from
(select store_id, product, category, department, sum(quantity) as total_quantity, Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average, GROUPING__ID As G_ID
from transactions
where quantity > 0
group by store_id,product,category,department
grouping sets((store_id,product),(store_id,category),(store_id,department))
) Tab
order by 2
;

rank function only returns 1 with date in redshift

I'm running the code below in Redshift. I want to get a ranking of the order in which a customer purchased a product, based on the date. Each purchase has a unique ticketid, each customer has a unique customer_uuid, and each product has a unique product_id. The code below returns 1 for all rankings and I'm not sure why. Is there an error in my code, or is there a problem with ranking by a date field in Redshift? Does anyone see how to modify this code to correct the issue?
code:
select customer_uuid,
product_id,
date,
ticketid,
rank()
over(partition by customer_uuid,
product_id,
ticketid order by date asc) as rank
from table
order by customer_uuid, product_id
data:
customer_uuid product_id ticketid date
1 2 1 1/1/18
1 2 2 1/2/18
1 2 3 1/3/18
output:
customer_uuid product_id ticketid date rank
1 2 1 1/1/18 1
1 2 2 1/2/18 1
1 2 3 1/3/18 1
desired output:
customer_uuid product_id ticketid date rank
1 2 1 1/1/18 1
1 2 2 1/2/18 2
1 2 3 1/3/18 3
First, you have ticketid in the partition by, which makes each row its own partition.
Second, you are using rank(). If you want an enumeration, you probably want row_number() instead:
row_number() over(partition by customer_uuid, product_id order by date asc) as rank
I want to get a ranking of the order when a customer purchased a product based on the date. Each purchase has a unique ticketid, each customer has a unique customer_uuid, and each product has a unique product_id.
Basically, your (customer_uuid, product_id, ticketid) tuples are unique. If you use those as a partition, the rank will always be 1, since there is only one record per partition.
You just need to remove ticketid from the partition:
rank() over(
partition by customer_uuid, product_id
order by date
) as rank
Note: rank() will give an equal position to records that share the same (customer_uuid, product_id, date).
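The effect of the partition on the rank can be seen side by side in the sketch below. It uses Python's sqlite3 module instead of Redshift, which is an assumption made so the example is self-contained; it needs SQLite 3.25 or later for window functions. The table name purchases and the ISO-formatted dates are also made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (customer_uuid INTEGER, product_id INTEGER, "
    "ticketid INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [(1, 2, 1, "2018-01-01"), (1, 2, 2, "2018-01-02"), (1, 2, 3, "2018-01-03")],
)

rows = conn.execute(
    """
    SELECT ticketid,
           -- ticketid in the partition: every partition holds a single row
           RANK() OVER (PARTITION BY customer_uuid, product_id, ticketid
                        ORDER BY date) AS rank_with_ticket,
           -- ticketid removed: rows are ranked within customer and product
           RANK() OVER (PARTITION BY customer_uuid, product_id
                        ORDER BY date) AS rank_without_ticket
    FROM purchases
    ORDER BY ticketid
    """
).fetchall()
print(rows)  # [(1, 1, 1), (2, 1, 2), (3, 1, 3)]
```

The middle column reproduces the all-ones output from the question, and the last column is the desired 1, 2, 3.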

Min per group in SQL but with a caveat

I've got the table below in SQL and I need to return "the car vendors that will never be used if the car purchaser is a rational person" or, put differently, "the vendor whose car prices are all more expensive than the alternatives". I've tried joining the table with itself, but I am unable to get it to work. The expected output is vendor 3, since its prices for cars 3 and 4 are higher than the other option.
id car_vendor_id vendor_name car_id price
---------------------------------------------
1 1 Vendor 1 1 25000
2 1 Vendor 1 2 40000
3 2 Vendor 2 2 35000
4 2 Vendor 2 3 25000
5 3 Vendor 3 3 28000
6 3 Vendor 3 4 40000
7 4 Vendor 4 4 35000
8 4 Vendor 4 5 20000
9 5 Vendor 5 5 18000
10 5 Vendor 5 6 32000
11 6 Vendor 6 6 30000
12 6 Vendor 6 7 20000
One method is rank() and aggregation:
select car_vendor_id, vendor_name
from (select t.*,
rank() over (partition by car_id order by price) as seqnum
from t
) t
group by car_vendor_id, vendor_name
having min(seqnum) > 1;
The having clause keeps only vendors that have no car ranked "first" (cheapest) on price.
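To see the rank-and-aggregate query at work on the question's data, here is a sketch using Python's sqlite3 module (an assumption for illustration; it needs SQLite 3.25 or later for window functions). The derived table is aliased ranked here, a made-up name, to keep it distinct from the base table t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, car_vendor_id INTEGER, vendor_name TEXT, "
    "car_id INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "Vendor 1", 1, 25000), (2, 1, "Vendor 1", 2, 40000),
        (3, 2, "Vendor 2", 2, 35000), (4, 2, "Vendor 2", 3, 25000),
        (5, 3, "Vendor 3", 3, 28000), (6, 3, "Vendor 3", 4, 40000),
        (7, 4, "Vendor 4", 4, 35000), (8, 4, "Vendor 4", 5, 20000),
        (9, 5, "Vendor 5", 5, 18000), (10, 5, "Vendor 5", 6, 32000),
        (11, 6, "Vendor 6", 6, 30000), (12, 6, "Vendor 6", 7, 20000),
    ],
)

never_chosen = conn.execute(
    """
    SELECT car_vendor_id, vendor_name
    FROM (SELECT t.*,
                 -- 1 = cheapest offer for this car
                 RANK() OVER (PARTITION BY car_id ORDER BY price) AS seqnum
          FROM t) AS ranked
    GROUP BY car_vendor_id, vendor_name
    -- keep vendors whose best position on any car is worse than first
    HAVING MIN(seqnum) > 1
    """
).fetchall()
print(never_chosen)  # [(3, 'Vendor 3')]
```

Vendor 3 is ranked second on both car 3 and car 4, while every other vendor is cheapest (or the only seller) for at least one car.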
The following query uses a CTE to rank the prices for each car, so that the most expensive is 1.
It then excludes any vendor that has a row where it is not the most expensive, and lastly checks that the vendor is not the only vendor for a car.
declare @Car table(Vendor int, Car int, Price int)
insert @Car values (1,1,25000),(1,2,40000),(2,2,35000),(2,3,25000),(3,3,28000),(3,4,40000),(4,4,35000),(4,5,20000),(5,5,18000),(5,6,32000),(6,6,30000),(6,7,20000)
;with Price as (
select *, row_number() over(partition by Car order by Price desc) as r from @Car Car
)
select * from Price
where not exists(select * from Price p2 where p2.Vendor=Price.Vendor and p2.r>1)
and Vendor not in (
select Vendor from @Car where Car in (select Car from @Car group by Car having count(*)=1)
)
Try the following query:
declare @car table(Vendor int, Car int, Price int);
insert @car
values
(1,1,25000),(1,2,40000),(2,2,35000),(2,3,25000),
(3,3,28000),(3,4,40000),(4,4,35000),(4,5,20000),
(5,5,18000),(5,6,32000),(6,6,30000),(6,7,20000);
with
a as (
select
vendor, price,
count(*) over(partition by car) cq,
count(*) over(partition by vendor) vcq,
max(price) over(partition by car) xcp
from @car
)
select vendor
from a
where cq > 1 and xcp = price
group by vendor, vcq
having count(*) = vcq;