Select column based on sum of another column


Let's say I have
SalesManagerId, SaleAmount, ProductId
I want to sum up the SaleAmount for each (SalesManagerId, ProductId) and grab the ProductId with the maximum sum(SaleAmount).
Is this possible in one query?
Example:
1, 100, 1
1, 200, 1
1, 600, 1
1, 400, 2
2, 100, 3
3, 100, 4
3, 100, 4
2, 500, 6
3, 100, 5
result:
1, 900, 1
2, 500, 6
3, 200, 4

If analytic (window) functions are available, you can use RANK().
Something like:
SELECT SalesManagerId, ProductId, Total
FROM (
SELECT SalesManagerId,
ProductId,
SUM(SaleAmount) as Total,
RANK() OVER(PARTITION BY SalesManagerId
ORDER BY SUM(SaleAmount) DESC) as R
FROM <Table name>
GROUP BY SalesManagerId, ProductId) as InnerQuery
WHERE InnerQuery.R = 1
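
As a quick check of the RANK() approach against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name `sales` and the in-memory SQLite database are assumptions for illustration (the question names neither a table nor an engine), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table name `sales`; the rows are the sample data from the question
# in the order (SalesManagerId, SaleAmount, ProductId).
ROWS = [
    (1, 100, 1), (1, 200, 1), (1, 600, 1), (1, 400, 2),
    (2, 100, 3), (3, 100, 4), (3, 100, 4), (2, 500, 6), (3, 100, 5),
]

QUERY = """
SELECT SalesManagerId, Total, ProductId
FROM (
    SELECT SalesManagerId,
           ProductId,
           SUM(SaleAmount) AS Total,
           RANK() OVER (PARTITION BY SalesManagerId
                        ORDER BY SUM(SaleAmount) DESC) AS R
    FROM sales
    GROUP BY SalesManagerId, ProductId
) AS InnerQuery
WHERE R = 1
ORDER BY SalesManagerId
"""

def top_product_per_manager(rows):
    """Load the rows into an in-memory table and run the ranking query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (SalesManagerId INT, SaleAmount INT, ProductId INT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

top_rows = top_product_per_manager(ROWS)
print(top_rows)
```

Because RANK() gives tied rows the same rank, a manager whose two best products have equal sums would get both rows back; ROW_NUMBER() would keep exactly one.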

Assuming at least SQL Server 2005, so you can use a CTE:
;with cteTotalSales as (
select SalesManagerId, ProductId, SUM(SaleAmount) as TotalSales
from YourSalesTable
group by SalesManagerId, ProductId
),
cteMaxSales as (
select SalesManagerId, MAX(TotalSales) as MaxSale
from cteTotalSales
group by SalesManagerId
)
select ts.SalesManagerId, ms.MaxSale, ts.ProductId
from cteMaxSales ms
inner join cteTotalSales ts
on ms.SalesManagerId = ts.SalesManagerId
and ms.MaxSale = ts.TotalSales
order by ts.SalesManagerId

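Use GROUP BY and ORDER BY. This lists every (SalesManagerId, ProductId) total, largest first, so the top product for each manager still has to be picked out of the result: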
SELECT SalesManagerId, SUM(SaleAmount) AS SaleSum, ProductId FROM [table-name] GROUP BY SalesManagerId, ProductId ORDER BY SaleSum DESC

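Very good question!
Try this (most engines reject MAX(SUM(...)) nested directly, so the sums are computed in a derived table first):
SELECT SalesManagerId, MAX(Total) AS MaxTotal FROM (SELECT SalesManagerId, ProductId, SUM(SaleAmount) AS Total FROM [table-name] GROUP BY SalesManagerId, ProductId) AS t GROUP BY SalesManagerId;
Or alternatively:
SELECT SalesManagerId, SUM(SaleAmount) AS SaleSum, ProductId FROM [table-name] GROUP BY SalesManagerId, ProductId ORDER BY SaleSum DESC;
The first query returns only the maximum total per manager; join it back to the per-product sums to recover the ProductId.
You can't just drop the sum column and get ONLY the product id.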

Related

Product Price at a Given Date from LeetCode

What is wrong with my code?
QUESTION:
Table: Products: product_id | new_price | change_date
(product_id, change_date) is the primary key of this table.
Each row of this table indicates that the price of some product was changed to a new price at some date.
Write an SQL query to find the prices of all products on 2019-08-16. Assume the price of all products before any change is 10.
MY SOLUTION:
WITH cte1 AS (
SELECT product_id,
new_price AS price,
MAX(change_date) AS new_change_date
FROM Products
WHERE change_date <= CAST('2019-08-16' AS DATE)
GROUP BY product_id
),
cte2 AS (
SELECT product_id,
(new_price - 10) AS price,
MIN(change_date) AS new_change_date
FROM Products
WHERE change_date > CAST('2019-08-16' AS DATE)
GROUP BY product_id
)
SELECT DISTINCT product_id,
price
FROM cte1
UNION ALL
SELECT DISTINCT product_id,
price
FROM cte2
WHERE NOT EXISTS (SELECT product_id from cte1
WHERE cte2.product_id = cte1.product_id)
MY OUTPUT (the first row is wrong):
{"headers": ["product_id", "price"], "values": [[1, 20], [2, 50], [3, 10]]}
EXPECTED:
{"headers": ["product_id", "price"], "values": [[1, 35], [2, 50], [3, 10]]}

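In your cte1, new_price is neither aggregated nor listed in the GROUP BY, so it is taken from an arbitrary row of each product rather than from the row with MAX(change_date). That is why product 1 comes back as 20 instead of 35.
One method is union all. The first part fetches the most recent price as of the specified date. The second gets everything else:
select p.product_id, p.new_price as price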
from products p
where change_date = (select max(p2.change_date)
from products p2
where p2.product_id = p.product_id and
p2.change_date <= '2019-08-16'
)
union all
select distinct p.product_id, 10
from products p
where not exists (select 1
from products p2
where p2.product_id = p.product_id and
p2.change_date <= '2019-08-16'
);
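
To check the union all approach end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are an assumption (the question does not include its input data); they were chosen to be consistent with the expected output shown above. The default price of 10 and the cutoff date come from the problem statement.

```python
import sqlite3

# Assumed sample rows (product_id, new_price, change_date); not part of the question.
PRODUCTS = [
    (1, 20, "2019-08-14"), (2, 50, "2019-08-14"), (1, 30, "2019-08-15"),
    (1, 35, "2019-08-16"), (2, 65, "2019-08-17"), (3, 20, "2019-08-18"),
]

QUERY = """
SELECT p.product_id, p.new_price AS price
FROM products p
WHERE p.change_date = (SELECT MAX(p2.change_date)
                       FROM products p2
                       WHERE p2.product_id = p.product_id
                         AND p2.change_date <= :as_of)
UNION ALL
SELECT DISTINCT p.product_id, 10
FROM products p
WHERE NOT EXISTS (SELECT 1
                  FROM products p2
                  WHERE p2.product_id = p.product_id
                    AND p2.change_date <= :as_of)
ORDER BY 1
"""

def prices_as_of(rows, as_of):
    """Return (product_id, price) for every product as of the given ISO date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (product_id INT, new_price INT, change_date TEXT)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY, {"as_of": as_of}).fetchall()
    conn.close()
    return result

prices = prices_as_of(PRODUCTS, "2019-08-16")
print(prices)
```

Dates are stored as ISO-8601 text here, which compares correctly as strings in SQLite; an engine with a real DATE type would not need that trick.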

how to simplify this multiple-CTE solution to this sql question?

DDL:
create table transactions
(
product_id int,
store_id int,
quantity int,
price numeric
);
DML:
insert into transactions values
(1, 1, 10, 2),
(2, 1, 5, 2),
(1, 2, 5, 4),
(2, 2, 2, 4),
(2, 3, 1, 20),
(1, 3, 1, 8),
(2, 4, 2, 10),
(1, 5, 2, 5),
(2, 5, 1, 3),
(2, 6, 4, 8);
I'm trying to find the top 3 products of the top 3 stores, both based on sale amount. My solution uses CTEs, as below:
with cte as
(
select store_id, rank_store
from
(select
*,
dense_rank() over(order by sale desc) as rank_store
from
(select
store_id, sum(quantity * price) as sale
from transactions
group by 1) t) t2
where
rank_store <= 3
),
cte2 as
(
select
a.store_id, a.product_id,
sum(a.quantity * a.price) as sale_store_product
from
transactions as a
join
cte as b on a.store_id = b.store_id
group by
1, 2
order by
1, 2
),
cte3 as
(
select
*,
dense_rank() over (partition by store_id order by sale_store_product desc) as rank_product
from
cte2
)
select *
from cte3
where rank_product <= 3;
Here is the expected result:
Basically, the first CTE gets the top 3 stores based on sale amount; I use the dense_rank() window function to handle ties. The second CTE gets the products of those top 3 stores and their total sale amounts. The last CTE uses dense_rank() again to rank the products within each store by sale amount. The final query then keeps the top 3 products in each store.
I'm wondering if this can be improved, since three CTEs feels too complicated. I'd appreciate any solutions or ideas. Thanks.

I'm trying to find the top 3 products of the top 3 stores
How can this be done without aggregating the data twice -- once for store/products and once for stores? This is possible using window functions along with aggregation:
select sp.*
from (select sp.*,
dense_rank() over (order by store_sales desc, store_id) as store_seqnum
from (select t.store_id, t.product_id,
sum(quantity * price) as sp_sales,
sum(sum(quantity * price)) over (partition by store_id) as store_sales,
row_number() over (partition by t.store_id order by sum(quantity * price) desc) as sp_seqnum
from transactions t
group by t.store_id, t.product_id
) sp
) sp
where store_seqnum <= 3 and sp_seqnum <= 3;
The inner subquery calculates the product/store information. The next level ranks the stores; note that ties are broken using store_id.
Here is a db<>fiddle.
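
The single-aggregation idea can be tried against the DDL and DML from the question with a minimal sketch using Python's built-in sqlite3 module. Running it on SQLite (3.25 or newer) is an assumption, as the question does not name an engine. Both rankings sort in descending order so that the highest sales come first.

```python
import sqlite3

# Rows from the question's DML: (product_id, store_id, quantity, price).
TRANSACTIONS = [
    (1, 1, 10, 2), (2, 1, 5, 2), (1, 2, 5, 4), (2, 2, 2, 4), (2, 3, 1, 20),
    (1, 3, 1, 8), (2, 4, 2, 10), (1, 5, 2, 5), (2, 5, 1, 3), (2, 6, 4, 8),
]

QUERY = """
SELECT store_id, product_id, sp_sales
FROM (SELECT sp.*,
             DENSE_RANK() OVER (ORDER BY store_sales DESC, store_id) AS store_seqnum
      FROM (SELECT t.store_id, t.product_id,
                   SUM(quantity * price) AS sp_sales,
                   SUM(SUM(quantity * price)) OVER (PARTITION BY t.store_id) AS store_sales,
                   ROW_NUMBER() OVER (PARTITION BY t.store_id
                                      ORDER BY SUM(quantity * price) DESC) AS sp_seqnum
            FROM transactions t
            GROUP BY t.store_id, t.product_id
           ) sp
     ) ranked
WHERE store_seqnum <= 3 AND sp_seqnum <= 3
ORDER BY store_seqnum, sp_seqnum
"""

def top_products_of_top_stores(rows):
    """Aggregate once per (store, product), then rank stores and products."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions "
        "(product_id INT, store_id INT, quantity INT, price NUMERIC)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

top3 = top_products_of_top_stores(TRANSACTIONS)
print(top3)
```

Stores 2 and 3 both total 28. Because store_id breaks the tie, only store 2 makes the top 3 here, whereas the dense_rank() in the question's first CTE would keep both.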

Nth result in BigQuery Group By

I have a derived table like:
id, desc, total, account
1, one, 10, a
1, one, 9, b
1, one, 3, c
2, two, 27, c
I can do a simple
select id, `desc`, sum(total) as total from mytable group by id, `desc`
but I want to add the equivalent first(account), first(total), second(account), second(total) to the output so it'd be:
id, desc, total, first_account, first_account_total, second_account, second_account_total
1, one, 22, a, 10, b, 9
2, two, 27, c, 27, null, 0
Any pointers?
Thanks in advance!

Below is for BigQuery Standard SQL
#standardSQL
SELECT id, `desc`, total,
arr[OFFSET(0)].account AS first_account,
arr[OFFSET(0)].total AS first_account_total,
arr[SAFE_OFFSET(1)].account AS second_account,
arr[SAFE_OFFSET(1)].total AS second_account_total
FROM (
SELECT id, `desc`, SUM(total) total,
ARRAY_AGG(STRUCT(account, total) ORDER BY total DESC LIMIT 2) arr
FROM `project.dataset.table`
GROUP BY id, `desc`
)
When more than the first two bins are required, I would use the pattern below, which avoids repeating lines like arr[SAFE_OFFSET(1)].total AS second_account_total:
#standardSQL
SELECT * FROM (SELECT NULL id, '' `desc`, NULL total, '' first_account, NULL first_account_total, '' second_account, NULL second_account_total) WHERE FALSE
UNION ALL
SELECT id, `desc`, total, arr[OFFSET(0)].*, arr[SAFE_OFFSET(1)].*
FROM (
SELECT id, `desc`, SUM(total) total,
ARRAY_AGG(STRUCT(account, total) ORDER BY total DESC LIMIT 2) arr
FROM `project.dataset.table`
GROUP BY id, `desc`
)
Above, the first SELECT sets the layout of the output while returning no rows at all because of WHERE FALSE, so I don't need to explicitly unpack the struct's elements and provide aliases.
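
ARRAY_AGG with STRUCT is specific to BigQuery. For engines without it, a similar first/second layout can be sketched with ROW_NUMBER() plus conditional aggregation, which is a different technique from the answer above. The example below is a minimal sketch using Python's built-in sqlite3 module; the table name `mytable` comes from the question, while running it on SQLite is an assumption. For the sample rows the totals for id 1 add up to 22, and a missing second account comes back as NULL rather than 0.

```python
import sqlite3

# Sample rows from the question: (id, desc, total, account).
ROWS = [(1, "one", 10, "a"), (1, "one", 9, "b"), (1, "one", 3, "c"), (2, "two", 27, "c")]

QUERY = """
WITH ranked AS (
    SELECT id, "desc", total, account,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY total DESC) AS rn
    FROM mytable
)
SELECT id, "desc",
       SUM(total) AS total,
       MAX(CASE WHEN rn = 1 THEN account END) AS first_account,
       MAX(CASE WHEN rn = 1 THEN total END)   AS first_account_total,
       MAX(CASE WHEN rn = 2 THEN account END) AS second_account,
       MAX(CASE WHEN rn = 2 THEN total END)   AS second_account_total
FROM ranked
GROUP BY id, "desc"
ORDER BY id
"""

def first_and_second_accounts(rows):
    """Number the accounts per id by total, then pivot ranks 1 and 2 into columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE mytable (id INT, "desc" TEXT, total INT, account TEXT)')
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

pivoted = first_and_second_accounts(ROWS)
print(pivoted)
```

Wrapping second_account_total in COALESCE(..., 0) would reproduce the 0 shown in the question's desired output.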

Use CTE and UNION ALL with SQL Server 2014

My problem is that:
Create a view that shows the top 5 selling products as well as an aggregated row that shows the total sales for all other products and a Grand total row that sums all of the above.
WITH ProductTop5 AS
(
SELECT [dbo].[Product].[ProductName] AS ProductName, SUM([dbo].[SalesOrderDetail].[LineTotal]) AS TotalAmount
FROM [dbo].[Product]
JOIN [dbo].[SalesOrderDetail] ON [dbo].[Product].[ProductID] = [dbo].[SalesOrderDetail].[ProductID]
GROUP BY [dbo].[Product].[ProductName]
)

You could use ROW_NUMBER/RANK to calculate ranking of product:
WITH Product AS
(
SELECT p.[ProductName] AS ProductName,
SUM(sod.[LineTotal]) AS TotalAmount
FROM [dbo].[Product] p
JOIN [dbo].[SalesOrderDetail] sod
ON p.[ProductID] = sod.[ProductID]
GROUP BY p.[ProductName]
), ProductWithRank AS (
SELECT ProductName, TotalAmount,
ROW_NUMBER() OVER(ORDER BY TotalAmount DESC) AS rn
FROM Product
)
SELECT ProductName, TotalAmount
FROM ProductWithRank
WHERE rn <= 5
UNION ALL
SELECT 'All Others', SUM(TotalAmount)
FROM ProductWithRank
WHERE rn > 5
UNION ALL
SELECT 'Grand Total', SUM(TotalAmount)
FROM ProductWithRank;
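
The shape of that result (five ranked rows, an "All Others" row, and a "Grand Total" row) can be checked with a minimal sketch using Python's built-in sqlite3 module. Everything about the data is an assumption: the seven products, their line totals, and the use of SQLite instead of SQL Server 2014. The `sort_key` column is also an addition, since UNION ALL on its own does not guarantee row order.

```python
import sqlite3

# Hypothetical data: seven products and their order lines (ProductID, LineTotal).
PRODUCTS = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F"), (7, "G")]
ORDER_LINES = [(1, 100), (1, 50), (2, 120), (3, 90), (4, 80), (5, 70), (6, 30), (7, 10)]

QUERY = """
WITH ProductTotals AS (
    SELECT p.ProductName, SUM(sod.LineTotal) AS TotalAmount
    FROM Product p
    JOIN SalesOrderDetail sod ON p.ProductID = sod.ProductID
    GROUP BY p.ProductName
), ProductWithRank AS (
    SELECT ProductName, TotalAmount,
           ROW_NUMBER() OVER (ORDER BY TotalAmount DESC) AS rn
    FROM ProductTotals
)
SELECT ProductName, TotalAmount
FROM (
    SELECT ProductName, TotalAmount, rn AS sort_key
    FROM ProductWithRank WHERE rn <= 5
    UNION ALL
    SELECT 'All Others', SUM(TotalAmount), 6
    FROM ProductWithRank WHERE rn > 5
    UNION ALL
    SELECT 'Grand Total', SUM(TotalAmount), 7
    FROM ProductWithRank
) AS combined
ORDER BY sort_key
"""

def top5_with_totals(products, order_lines):
    """Return the top 5 products plus 'All Others' and 'Grand Total' rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (ProductID INT, ProductName TEXT)")
    conn.execute("CREATE TABLE SalesOrderDetail (ProductID INT, LineTotal INT)")
    conn.executemany("INSERT INTO Product VALUES (?, ?)", products)
    conn.executemany("INSERT INTO SalesOrderDetail VALUES (?, ?)", order_lines)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

report = top5_with_totals(PRODUCTS, ORDER_LINES)
print(report)
```

In SQL Server the same SELECT could follow CREATE VIEW ... AS, with the ordering applied by whoever queries the view.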

creating a pseudo linked list in sql

I have a table that has the following columns
table: route
columns: id, location, order_id
and it has values such as
id, location, order_id
1, London, 12
2, Amsterdam, 102
3, Berlin, 90
5, Paris, 19
Is it possible to do a sql select statement in postgres that will return each row along with the id with the next highest order_id? So I want something like...
id, location, order_id, next_id
1, London, 12, 5
2, Amsterdam, 102, NULL
3, Berlin, 90, 2
5, Paris, 19, 3
Thanks

select
id,
location,
order_id,
lag(id) over (order by order_id desc) as next_id
from your_table
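
To see why lag() over a descending order gives the "next highest" row, here is a minimal sketch using Python's built-in sqlite3 module with the rows from the question. Running it on SQLite rather than Postgres is an assumption; lag() and lead() behave the same way in both. The second query uses the mirror-image form, lead() over ascending order, and should return the same rows.

```python
import sqlite3

# Rows from the question: (id, location, order_id).
ROUTE = [(1, "London", 12), (2, "Amsterdam", 102), (3, "Berlin", 90), (5, "Paris", 19)]

LAG_QUERY = """
SELECT id, location, order_id,
       LAG(id) OVER (ORDER BY order_id DESC) AS next_id
FROM route
ORDER BY id
"""

LEAD_QUERY = """
SELECT id, location, order_id,
       LEAD(id) OVER (ORDER BY order_id) AS next_id
FROM route
ORDER BY id
"""

def run(query, rows):
    """Load the route rows into an in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE route (id INT, location TEXT, order_id INT)")
    conn.executemany("INSERT INTO route VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result

with_lag = run(LAG_QUERY, ROUTE)
with_lead = run(LEAD_QUERY, ROUTE)
print(with_lag)
```

Sorted by order_id descending, the row just before each row is the one with the next larger order_id, which is exactly what lag() returns; the row with the largest order_id has no predecessor and gets NULL.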

Creating testbed first:
CREATE TABLE route (id int4, location varchar(20), order_id int4);
INSERT INTO route VALUES
(1,'London',12),(2,'Amsterdam',102),
(3,'Berlin',90),(5,'Paris',19);
The query:
WITH ranked AS (
SELECT id,location,order_id,rank() OVER (ORDER BY order_id)
FROM route)
SELECT b.id, b.location, b.order_id, n.id AS next_id
FROM ranked b
LEFT JOIN ranked n ON b.rank+1=n.rank
ORDER BY b.id;
You can read more on the window functions in the documentation.

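Yes, with a correlated subquery (LIMIT rather than TOP, since this is Postgres, and ordered by order_id so the closest higher value comes first):
select main.*,
(select r.id from routes_table r where r.order_id > main.order_id order by r.order_id limit 1) as next_id
from routes_table main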
