Transforming table with aggregation over rows and new column creation - sql

I am too new to post table views, so I will try to explain how my data looks:
I have the customer id, order id, the sales date, the product bought and also the price of the product. We sell 3 products: K, N and E. Every row shows the product the customer bought and its price. Customers can buy the same product several times in the same order, and can also buy other products. Below I am trying to aggregate the prices per product, so that in the end I have one row per customer and order, with a new price column for each product.
Currently writing CTEs:
with N as (select Customer_ID, Order_ID, Sales_Date,
sum(Price) as Price
from orders
where product = 'N'
group by 1,2,3),
K as (select Customer_ID, Order_ID, Sales_Date,
sum(Price) as Price
from orders
where product = 'K'
group by 1,2,3),
E as (select Customer_ID, Order_ID, Sales_Date,
sum(Price) as Price
from orders
where product = 'E'
group by 1,2,3)
select N.*,
K.Price as K_Price,
E.Price as E_Price
from N as N
left join K as K on K.Customer_ID=N.Customer_ID and K.Order_ID=N.Order_ID
left join E as E on E.Customer_ID=N.Customer_ID and E.Order_ID=N.Order_ID
Is there a more efficient way to do this? If product options increase from 3 to 20 - I will have 20 CTEs, maybe it's better to write the query in a different way?

You can use conditional aggregation:
select Customer_ID, Order_ID, Sales_Date,
sum(price) filter (where product = 'N') as n_price,
sum(price) filter (where product = 'K') as k_price,
sum(price) filter (where product = 'E') as e_price
from orders o
group by Customer_ID, Order_ID, Sales_Date;
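The filter clause is not available in every database. As a rough sketch, the same conditional aggregation can be written with case expressions, which most engines accept. The snippet below runs it against an in-memory SQLite database from Python; the sample rows are made up for illustration and only the column names come from the question.

```python
import sqlite3

# Hypothetical sample rows; only the column names come from the question.
rows = [
    (1, 100, "2021-01-05", "N", 10.0),
    (1, 100, "2021-01-05", "N", 5.0),
    (1, 100, "2021-01-05", "K", 7.0),
    (2, 200, "2021-01-06", "E", 3.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders "
    "(customer_id int, order_id int, sales_date text, product text, price real)"
)
conn.executemany("insert into orders values (?, ?, ?, ?, ?)", rows)

# One sum per product; rows of other products contribute NULL and are ignored.
query = """
select customer_id, order_id, sales_date,
       sum(case when product = 'N' then price end) as n_price,
       sum(case when product = 'K' then price end) as k_price,
       sum(case when product = 'E' then price end) as e_price
from orders
group by customer_id, order_id, sales_date
order by customer_id, order_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

A product that does not appear in an order comes back as NULL (None in Python) rather than 0; wrap the sum in coalesce(..., 0) if zeros are preferred.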

Related

Aggregation level is off (Postgresql)

I have order data for 2 customers and their orders. I am trying to calculate the sum of the price for every customer for that specific order, only for product N.
Table:
This is my query:
select Customer_ID, Order_ID, Sales_Date,
sum(Price) over (partition by Customer_ID, Order_ID order by Customer_ID, Order_ID)
from orders
group by 1,2,3, Price
order by;
For some reason I do not understand it gives me several rows per same customer. I am trying to get only one row generated per customer and order for product N
This is my current Output:
Desired Outcome:
Why are you using window functions? I think you just want aggregation:
select Customer_ID, Order_ID, Sales_Date,
sum(Price)
from orders
group by 1,2,3;
If you only want one product, add where product = 'N'.
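To see why the extra Price in the group by matters, here is a small sketch using SQLite from Python. The rows are invented for illustration; the point is only that grouping by the price as well produces one row per distinct price, while grouping by customer, order and date gives a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders "
    "(customer_id int, order_id int, sales_date text, product text, price int)"
)
# Hypothetical data: one order containing product N twice and product K once.
conn.executemany(
    "insert into orders values (?, ?, ?, ?, ?)",
    [
        (1, 100, "2021-01-05", "N", 10),
        (1, 100, "2021-01-05", "N", 5),
        (1, 100, "2021-01-05", "K", 7),
    ],
)

# Grouping only by customer, order and date: one row per order.
one_row = conn.execute(
    "select customer_id, order_id, sales_date, sum(price) "
    "from orders where product = 'N' group by 1, 2, 3"
).fetchall()

# Adding price to the group by splits the order into one row per distinct price.
split_rows = conn.execute(
    "select customer_id, order_id, sales_date, sum(price) "
    "from orders where product = 'N' group by 1, 2, 3, price order by price"
).fetchall()

print(one_row)
print(split_rows)
```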

Select top 10 products sold in each year

I have two tables :
Sales
columns: (Sales_id, Date , Customer_id, Product_id, Purchase_amount):
Product
columns: ( Product_id, Product_Name, Brand_id,Brand_name)
I have to write a query to find the top 10 products sold every year. The query I have right now is :
WITH PH AS
(SELECT P.Product_Name, LEFT(S.Date,4) "SYEAR", COUNT(S.Product_id) "Product_Count"
FROM Sales S LEFT JOIN Product P
ON S.Product_Id=P.Product_Id
GROUP BY P.Product_Name, LEFT(S.Date,4))
SELECT Product_Name, "SYEAR", "Product_Count"
FROM (SELECT Product_Name, "SYEAR", "Product_Count",
RANK() OVER (PARTITION BY "SYEAR" ORDER BY "Product_Count" DESC) "TEMP"
FROM PH) R
WHERE "TEMP"<=10
This doesn't seem like the most optimized query. Can you please help me with that? Can there be an alternate version to obtain the required result?
Notes
The main reason for the repetition of the code is to enable grouping by the year. There's no field for the year in the given table.
The date format is: YYYYMMDD (example: 20200630)
Any help will be appreciated. Thanks in advance
You can combine the window functions with the aggregation:
SELECT PY.*
FROM (SELECT P.Product_Name, LEFT(S.Date,4) AS YEAR, COUNT(*) AS CNT,
RANK() OVER (PARTITION BY LEFT(S.Date, 4) ORDER BY COUNT(*) DESC) AS SEQNUM
FROM Sales S LEFT JOIN
Product P
ON S.Product_Id = P.Product_Id
GROUP BY P.Product_Name, LEFT(S.Date, 4)
) PY
WHERE SEQNUM <= 10;
From a performance perspective, this probably generates an execution plan very similar to your query. It is however simpler to follow.
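For anyone who wants to try the pattern, here is a minimal sketch that runs it in SQLite from Python. It assumes SQLite 3.25 or later (window functions), uses substr instead of LEFT because SQLite has no LEFT function, and asks for the top 1 per year instead of the top 10 so the made-up sample data stays small. The tables are cut down to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sales (sales_id int, date text, product_id int)")
conn.execute("create table product (product_id int, product_name text)")
# Hypothetical sample data; dates use the YYYYMMDD format from the question.
conn.executemany(
    "insert into product values (?, ?)", [(1, "A"), (2, "B"), (3, "C")]
)
conn.executemany(
    "insert into sales values (?, ?, ?)",
    [
        (1, "20200630", 1),
        (2, "20200701", 1),
        (3, "20200702", 2),
        (4, "20210101", 3),
        (5, "20210102", 3),
        (6, "20210103", 2),
    ],
)

# rank() is evaluated over the grouped rows, so count(*) can be used inside it.
query = """
select syear, product_name, cnt
from (select substr(s.date, 1, 4) as syear, p.product_name, count(*) as cnt,
             rank() over (partition by substr(s.date, 1, 4)
                          order by count(*) desc) as seqnum
      from sales s left join product p on s.product_id = p.product_id
      group by p.product_name, substr(s.date, 1, 4)) py
where seqnum <= ?
order by syear, product_name
"""
top_n = 1  # the question asks for 10
top_products = conn.execute(query, (top_n,)).fetchall()
print(top_products)
```

Because rank() is used, products tied at the cut-off are all returned, so a year can yield more than the requested number of rows.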

SQL Select Group By Min() - but select other

I want to select the ID of the Table Products with the lowest Price Grouped By Product.
ID Product Price
1 123 10
2 123 11
3 234 20
4 234 21
Which by logic would look like this:
SELECT
ID,
Min(Price)
FROM
Products
GROUP BY
Product
But I don't want to select the Price itself, just the ID.
Resulting in
1
3
EDIT: The DBMSes used are Firebird and Filemaker
You didn't specify your DBMS, so this is ANSI standard SQL:
select id
from (
select id,
row_number() over (partition by product order by price) as rn
from products
) t
where rn = 1
order by id;
If your DBMS doesn't support window functions, you can do that with joining against a derived table:
select o.id
from products o
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = o.product and t.min_price = o.price;
Note that this will return a slightly different result than the first solution: if the minimum price for a product occurs more than once, all those IDs will be returned. The first solution will only return one of them. If you don't want that, you need to group again in the outer query:
select min(o.id)
from products o
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = o.product and t.min_price = o.price
group by o.product;
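The difference between the three variants only shows up when there is a tie. The sketch below loads the four rows from the question plus one hypothetical extra row (id 5) that ties with id 1 on price, and runs all three queries in SQLite from Python (3.25 or later for row_number). The table is called products here as in the question, and id is added as a tie-breaker in the window ordering so the result is deterministic; both are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table products (id int, product int, price int)")
conn.executemany(
    "insert into products values (?, ?, ?)",
    [
        (1, 123, 10), (2, 123, 11), (3, 234, 20), (4, 234, 21),
        (5, 123, 10),  # hypothetical extra row that ties with id 1
    ],
)

def ids(sql):
    """Run a query and return its first column as a list."""
    return [r[0] for r in conn.execute(sql)]

# Window version: exactly one id per product (id breaks the price tie).
window_ids = ids("""
select id
from (select id,
             row_number() over (partition by product order by price, id) as rn
      from products) t
where rn = 1
order by id
""")

# Shared join against the per-product minimum price.
join_sql = """
from products o
join (select product, min(price) as min_price
      from products
      group by product) t
  on t.product = o.product and t.min_price = o.price
"""
join_ids = ids("select o.id" + join_sql + "order by o.id")           # keeps ties
grouped_ids = ids("select min(o.id)" + join_sql + "group by o.product order by 1")

print(window_ids, join_ids, grouped_ids)
```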
SELECT ID
FROM Products as A
where price = ( select Min(Price)
from Products as B
where B.Product = A.Product )
GROUP BY id
This will show the IDs, which in this case are 1 and 3.

SQL query for table with multiple keys?

I am sorry if this seems too easy but I was asked this question and I couldn't answer even after preparing SQL thoroughly :(. Can someone answer this?
There's a table - Seller id, product id, warehouse id, quantity of products at each warehouse for each product as per each seller.
We have to list the Product Ids with Seller Id who has highest number of products for that product and the total number of units he has for that product.
I think I got confused because there were 3 keys in the table.
It's not quite clear which DBMS you are using. The query below should work if your DBMS supports window functions.
You can compute the row count for each product and seller, rank the sellers within each product using the window function rank, and then filter to keep only the top-ranked sellers for each product along with their counts.
select
product_id,
seller_id,
no_of_products
from (
select
product_id,
seller_id,
count(*) no_of_products,
rank() over (partition by product_id order by count(*) desc) rnk
from your_table
group by
product_id,
seller_id
) t where rnk = 1;
If window functions are not supported, you can use a correlated subquery to achieve the same effect:
select
product_id,
seller_id,
count(*) no_of_products
from your_table a
group by
product_id,
seller_id
having count(*) = (
select max(cnt)
from (
select count(*) cnt
from your_table b
where b.product_id = a.product_id
group by seller_id
) t
);
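Note that count(*) counts rows (one per warehouse), while the question asks for the total number of units. If the table has a quantity column, the same ranking pattern can be applied to sum(quantity) instead. A minimal sketch in SQLite from Python follows; the table name stock, the column names and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; the question does not give them.
conn.execute(
    "create table stock "
    "(seller_id int, product_id int, warehouse_id int, quantity int)"
)
conn.executemany(
    "insert into stock values (?, ?, ?, ?)",
    [
        (1, 10, 1, 5), (1, 10, 2, 7),   # seller 1 holds 12 units of product 10
        (2, 10, 1, 9),                  # seller 2 holds 9 units of product 10
        (2, 20, 1, 4),                  # seller 2 holds 4 units of product 20
        (3, 20, 1, 2), (3, 20, 2, 2),   # seller 3 also holds 4 units of product 20
    ],
)

# Sum units per (product, seller), then rank sellers within each product.
query = """
select product_id, seller_id, total_units
from (select product_id, seller_id, sum(quantity) as total_units,
             rank() over (partition by product_id
                          order by sum(quantity) desc) as rnk
      from stock
      group by product_id, seller_id) t
where rnk = 1
order by product_id, seller_id
"""
top_sellers = conn.execute(query).fetchall()
print(top_sellers)
```

With rank, sellers tied for the highest total are all returned, as happens for product 20 in this sample data.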
Don't know why having id columns would mess you up... group by the right columns, sum up the totals and just return the first row:
select *
from (
select sellerid, productid, sum(quantity) as total_sold
from theres_a_table
group by sellerid, productid
) x
order by total_sold desc
fetch first 1 row only
If I do not think about optimization, the straightforward answer looks like this:
select *
from
(
select seller_id, product_id, sum(product_qty) as seller_prod_qty
from your_table
group by seller_id, product_id
) spqo
inner join
(
select product_id, max(seller_prod_qty) as max_prod_qty
from
(
select seller_id, product_id, sum(product_qty) as seller_prod_qty
from your_table
group by seller_id, product_id
) spqi
group by product_id
) pmaxq
on spqo.product_id = pmaxq.product_id
and spqo.seller_prod_qty = pmaxq.max_prod_qty
Both spqi (inner) and spqo (outer) give you seller, product, and the sum of quantity across warehouses. pmaxq gives you the maximum for each product, again across warehouses, and the final inner join picks up the sum of quantities where the seller has the highest (max) quantity of the product (there could be multiple sellers with the same quantity). I think this is the answer you are looking for. However, I'm sure the query can be improved, since what I'm posting is the "conceptual" one :)

Finding the correct 'record' and using ONLY the data from that record

I have a list of products and suppliers.
I need to make sure that the Quantity is larger than zero.
If so, I need to find the product with the lowest price and list the supplier, the product (SKU), quantity and price.
My test data schema is:
create table products(merchant varchar(100), name varchar(150), quantity int, totalprice int);
insert into products values
('Supplier A', 'APC-SMT1000I', 10, 150),
('Supplier B', 'APC-SMT1000I', 15, 250),
('Supplier C', 'APC-SMT1000I', 15, 350),
('Supplier D', 'DEF-SMT1000I', 10, 500),
('Supplier E', 'DEF-SMT1000I', 35, 350),
('Supplier G', 'GHI-SMT1000I', 75, 70)
Logically, I would expect the result to read:
SUPPLIER SKU QTY PRICE
Supplier A APC-SMT1000I 10 150
Supplier E DEF-SMT1000I 35 350
Supplier G GHI-SMT1000I 75 70
My SQL Statement reads:
SELECT merchant AS Supplier, name AS sku,quantity AS Qty,
min(totalprice) AS Price FROM products where quantity > 0 group by name;
My results are:
SUPPLIER SKU QTY PRICE
Supplier A APC-SMT1000I 10 150
Supplier D DEF-SMT1000I 10 350
Supplier G GHI-SMT1000I 75 70
Obviously, the coding is finding the lowest price and displaying it, but not with the correct data.
My Question?
How can I group the data, find the record with the lowest price and make sure the programme uses ONLY the data from that record?
The easiest way to do this is using window/analytic functions. You don't specify the database you are using, but this is ANSI standard functionality available in most (but not all) databases.
Here is the syntax:
select merchant AS Supplier, name AS sku, quantity AS Qty,
totalprice AS Price
from (select p.*,
row_number() over (partition by name
order by totalprice
) as seqnum
from products p
where quantity > 0
) p
where seqnum = 1;
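As a quick check, the query can be run against the test data from the question. The sketch below does that in an in-memory SQLite database from Python (SQLite 3.25 or later is assumed for row_number); an order by name is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table products"
    "(merchant varchar(100), name varchar(150), quantity int, totalprice int)"
)
# Test data copied from the question.
conn.executemany(
    "insert into products values (?, ?, ?, ?)",
    [
        ("Supplier A", "APC-SMT1000I", 10, 150),
        ("Supplier B", "APC-SMT1000I", 15, 250),
        ("Supplier C", "APC-SMT1000I", 15, 350),
        ("Supplier D", "DEF-SMT1000I", 10, 500),
        ("Supplier E", "DEF-SMT1000I", 35, 350),
        ("Supplier G", "GHI-SMT1000I", 75, 70),
    ],
)

# Number the in-stock rows of each SKU by price and keep the cheapest one,
# so supplier, quantity and price all come from the same record.
cheapest = conn.execute("""
select merchant as Supplier, name as sku, quantity as Qty, totalprice as Price
from (select p.*,
             row_number() over (partition by name order by totalprice) as seqnum
      from products p
      where quantity > 0) p
where seqnum = 1
order by name
""").fetchall()
for row in cheapest:
    print(row)
```

With the rows as posted, the cheapest DEF-SMT1000I record belongs to Supplier E (quantity 35, price 350).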
You could use the following query:
SELECT products.*
FROM
products INNER JOIN
(SELECT name, MIN(totalprice) min_price
FROM products
WHERE quantity>0
GROUP BY name) m
ON products.name=m.name AND products.totalprice=min_price
In the subquery I calculate the minimum total price for every name, then I join this subquery with the products table to return only the rows that have the minimum total price for that name. If more than one row has the minimum price, they will all be shown.
You haven't specified your RDBMS, so I'll provide a few queries.
This one should work in any database (but needs 2 table scans):
select
p.merchant as Supplier,
p.name as sku,
p.quantity as Qty,
p.totalprice as Price
from products as p
where
p.totalprice in
(
select min(t.totalprice)
from products as t
where t.name = p.name
)
This one should work for any RDBMS that has the row_number window function:
with cte as (
select *, row_number() over(partition by name order by totalprice) as rn
from products
)
select
p.merchant as Supplier,
p.name as sku,
p.quantity as Qty,
p.totalprice as Price
from cte as p
where rn = 1
This one is for PostgreSQL:
select distinct on (p.name)
p.merchant as Supplier,
p.name as sku,
p.quantity as Qty,
p.totalprice as Price
from products as p
order by p.name, p.totalprice