SQL latest record per group with an aggregated column

I have a table similar to this:
STOCK_ID TRADE_TIME PRICE VOLUME
123 1 5 100
123 2 6 150
456 1 7 200
456 2 8 250
For each stock I want to get latest price (where latest is just the max trade_time) and aggregated volume, so for the above table I want to see:
123 6 250
456 8 450
I've just discovered that my current query doesn't (always) work, i.e. there's no guarantee that the price selected is always the latest:
select stock_id, price, sum(volume) group by stock_id
Is this possible to do without subqueries? Thanks!

As you didn't specify the database you are using, here is some generic SQL that will do what you want.
SELECT
b.stock_id,
b.trade_time,
b.price,
a.sumVolume
FROM (SELECT
stock_id,
max(trade_time) AS maxtime,
sum(volume) as sumVolume
FROM stocktable
GROUP BY stock_id) a
INNER JOIN stocktable b
ON b.stock_id = a.stock_id and b.trade_time = a.maxtime
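The derived-table join above can be checked against the sample data from the question. This is a minimal sketch that assumes an in-memory SQLite database driven from Python's sqlite3 module (the question does not name a DBMS); the table name stocktable comes from the answer, everything else is setup for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the unspecified DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stocktable (stock_id INT, trade_time INT, price INT, volume INT)"
)
conn.executemany(
    "INSERT INTO stocktable VALUES (?, ?, ?, ?)",
    [(123, 1, 5, 100), (123, 2, 6, 150), (456, 1, 7, 200), (456, 2, 8, 250)],
)

# Aggregate once per stock, then join back to pick up the price of the latest trade.
latest_with_volume = conn.execute("""
    SELECT b.stock_id, b.price, a.sumVolume
    FROM (SELECT stock_id,
                 MAX(trade_time) AS maxtime,
                 SUM(volume)     AS sumVolume
          FROM stocktable
          GROUP BY stock_id) a
    INNER JOIN stocktable b
            ON b.stock_id = a.stock_id AND b.trade_time = a.maxtime
    ORDER BY b.stock_id
""").fetchall()

print(latest_with_volume)  # [(123, 6, 250), (456, 8, 450)]
```

If two trades of the same stock share the maximum trade_time, the join returns one row per tied trade.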

In SQL Server 2005 and up, you could use a CTE (Common Table Expression) to get what you're looking for:
;WITH MaxStocks AS
(
SELECT
stock_id, price, trade_time, volume,
ROW_NUMBER() OVER(PARTITION BY stock_id ORDER BY trade_time DESC) AS RowNo
FROM
#stocks
)
SELECT
m.stock_id, m.price,
(SELECT SUM(volume)
FROM MaxStocks m2
WHERE m2.stock_id = m.stock_id) AS TotalVolume
FROM MaxStocks m
WHERE m.RowNo = 1
Since you want both the last trade as well as the volume of all trades for each stock, I don't see how you could do this totally without subqueries, however....

declare @Stock table(STOCK_ID int,TRADE_TIME int,PRICE int,VOLUME int)
insert into @Stock values(123,1,5,100),(123,2,6,150),(456,1,7,200),(456,2,8,250)
Select Stock_ID,Price,(Select sum(Volume) from @Stock B where B.Stock_ID=A.Stock_ID) Volume from @Stock A
where A.Trade_Time=(Select max(Trade_Time) from @Stock C where C.Stock_ID=A.Stock_ID)

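select a.stock_id, b.price, sum(a.volume)
from tablename a
join (select t.stock_id, t.price from tablename t
join (select stock_id, max(trade_time) as maxtime from tablename
group by stock_id) m
on m.stock_id = t.stock_id and m.maxtime = t.trade_time) b
on a.stock_id = b.stock_id
group by a.stock_id, b.price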

Related

How to choose max of one column per other column

I am using SQL Server and I have a table "a"
month segment_id price
-----------------------------
1 1 100
1 2 200
2 3 50
2 4 80
3 5 10
I want to make a query which presents the original columns where the price will be the max per month
The result should be:
month segment_id price
----------------------------
1 2 200
2 4 80
3 5 10
I tried to write SQL code:
Select
month, segment_id, max(price) as MaxPrice
from
a
but I got an error:
Column segment_id is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
I tried to fix it in many ways but didn't find how to fix it
Because you need a group by clause without segment_id
Select month, max(price) as MaxPrice
from a
Group By month
as you want results per each month, and segment_id is non-aggregated in your original select statement.
If you want to have segment_id with maximum price repeating per each month for each row, you need to use max() function as window analytic function without Group by clause
Select month, segment_id,
max(price) over ( partition by month ) as MaxPrice
from a
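One detail worth checking with a windowed max(): adding an ORDER BY inside the OVER clause switches to the default running frame, so the result becomes a running maximum rather than the maximum of the whole partition. A small sketch with the question's data, assuming SQLite 3.25 or later (for window functions) driven from Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (month INT, segment_id INT, price INT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 200), (2, 3, 50), (2, 4, 80), (3, 5, 10)],
)

# No ORDER BY in the window: every row sees the max of its whole month.
partition_max = conn.execute("""
    SELECT month, segment_id,
           MAX(price) OVER (PARTITION BY month) AS MaxPrice
    FROM a
    ORDER BY month, segment_id
""").fetchall()

# ORDER BY in the window: each row only sees rows up to itself (running max).
running_max = conn.execute("""
    SELECT month, segment_id,
           MAX(price) OVER (PARTITION BY month ORDER BY segment_id) AS MaxPrice
    FROM a
    ORDER BY month, segment_id
""").fetchall()

print(partition_max)  # [(1, 1, 200), (1, 2, 200), (2, 3, 80), (2, 4, 80), (3, 5, 10)]
print(running_max)    # [(1, 1, 100), (1, 2, 200), (2, 3, 50), (2, 4, 80), (3, 5, 10)]
```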
Edit (due to your latest edit of the desired results): you need one more window function, row_number(), as @Gordon already mentioned:
Select month, segment_id, price From
(
Select a.*,
row_number() over ( partition by month order by price desc ) as Rn
from a
) q
Where rn = 1
I would recommend a correlated subquery:
select t.*
from t
where t.price = (select max(t2.price) from t t2 where t2.month = t.month);
The "canonical" solution is to use row_number():
select t.*
from (select t.*,
row_number() over (partition by month order by price desc) as seqnum
from t
) t
where seqnum = 1;
With the right indexes, the correlated subquery often performs better.
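The two queries are not interchangeable when a month has tied maximum prices: the correlated subquery keeps every tied row, while row_number() keeps exactly one. A sketch of that difference, assuming SQLite 3.25+ via Python and hypothetical data in which month 1 has a tie:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month INT, segment_id INT, price INT)")
# Hypothetical data: segments 1 and 2 tie for the highest price in month 1.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 200), (1, 2, 200), (2, 3, 50)],
)

# Correlated subquery: all rows whose price equals the month's max.
correlated = conn.execute("""
    SELECT t.month, t.segment_id, t.price
    FROM t
    WHERE t.price = (SELECT MAX(t2.price) FROM t t2 WHERE t2.month = t.month)
    ORDER BY t.month, t.segment_id
""").fetchall()

# row_number(): exactly one row per month; which tied row wins is unspecified.
numbered = conn.execute("""
    SELECT month, segment_id, price
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY month ORDER BY price DESC) AS seqnum
          FROM t) s
    WHERE seqnum = 1
    ORDER BY month
""").fetchall()

print(correlated)     # [(1, 1, 200), (1, 2, 200), (2, 3, 50)]
print(len(numbered))  # 2
```

Adding a tie-breaker column to the ORDER BY inside the window makes the row_number() choice deterministic.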
Only because it was not mentioned.
Yet another option is the WITH TIES clause.
To be clear, the approach by Gordon and Barbaros would be a nudge more performant, but this technique does not require or generate an extra column.
Select Top 1 with ties *
From YourTable
Order By row_number() over (partition by month order by price desc)
With not exists:
select t.*
from tablename t
where not exists (
select 1 from tablename
where month = t.month and price > t.price
)
or:
select t.*
from tablename t inner join (
select month, max(price) as price
from tablename
group By month
) g on g.month = t.month and g.price = t.price

Getting the latest entry grouped by ID

I have a table with stock for products. The problem is that every time there is a stock change, the new value is stored, together with the new Quantity. Example:
ProductID | Quantity | LastUpdate
1 123 2019.01.01
2 234 2019.01.01
1 444 2019.01.02
2 222 2019.01.02
I therefore need to get the latest stock update for every Product and return this:
ProductID | Quantity
1 444
2 222
The following SQL works, but is slow.
SELECT ProductID, Quantity
FROM (
SELECT ProductID, Quantity
FROM Stock
WHERE LastUpdate
IN (SELECT MAX(LastUpdate) FROM Stock GROUP BY ProductID)
)
Since the query is slow and supposed to be left joined into another query, I really would like some input on how to do this better.
Is there another way?
Use analytic functions. row_number can be used in this case.
SELECT ProductID, Quantity
FROM (SELECT ProductID, Quantity, row_number() over(partition by ProductID order by LastUpdate desc) as rnum
FROM Stock
) s
WHERE RNUM = 1
Or with first_value.
SELECT DISTINCT ProductID, FIRST_VALUE(Quantity) OVER(partition by ProductID order by LastUpdate desc) as Quantity
FROM Stock
Just another option is using WITH TIES in concert with Row_Number()
Full Disclosure: Vamsi's answer will be a nudge more performant.
Example
Select Top 1 with ties *
From YourTable
Order by Row_Number() over (Partition By ProductID Order by LastUpdate Desc)
Returns
ProductID Quantity LastUpdate
1 444 2019-01-02
2 222 2019-01-02
You could use a CTE (Common Table Expression).
Base Data:
SELECT 1 AS ProductID
,123 AS Quantity
,'2019-01-01' as LastUpdate
INTO #table
UNION
SELECT 2 AS ProductID
,234 AS Quantity
,'2019-01-01' as LastUpdate
UNION
SELECT 1 AS ProductID
,444 AS Quantity
,'2019-01-02' as LastUpdate
UNION
SELECT 2 AS ProductID
,222 AS Quantity
,'2019-01-02' as LastUpdate
Here is the code using a Common Table Expression.
WITH CTE (ProductID, Quantity, LastUpdate, Rnk)
AS
(
SELECT ProductID
,Quantity
,LastUpdate
,ROW_NUMBER() OVER(PARTITION BY ProductID ORDER BY LastUpdate DESC) AS Rnk
FROM #table
)
SELECT ProductID, Quantity, LastUpdate
FROM CTE
WHERE rnk = 1
This returns the latest row for each product. You could then join the CTE to whatever table you need.
The row_number() function might be the most efficient option, but the big slowdown in your query is the IN applied to a subquery. It's a slightly tricky one, but a join is typically faster. This query should return what you want and be much faster.
SELECT
a.ProductID
,a.Quantity
FROM stock as a
INNER JOIN (
SELECT
ProductID
,MAX(LastUpdate) as LastUpdate
FROM stock
GROUP BY ProductID
) b
ON a.ProductID = b.ProductId AND
a.LastUpdate = b.LastUpdate
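Beyond speed, the join also matches on ProductID, which the uncorrelated IN filter from the question does not. The sketch below, assuming SQLite via Python and hypothetical data in which the two products were last updated on different dates, suggests that the IN version can return stale rows in that situation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stock (ProductID INT, Quantity INT, LastUpdate TEXT)")
# Hypothetical data: product 2 has no update on 2019-01-02.
conn.executemany(
    "INSERT INTO Stock VALUES (?, ?, ?)",
    [(1, 123, "2019-01-01"), (1, 444, "2019-01-02"), (2, 234, "2019-01-01")],
)

# Uncorrelated IN: any row whose date is the latest date of *some* product matches.
in_rows = conn.execute("""
    SELECT ProductID, Quantity
    FROM Stock
    WHERE LastUpdate IN (SELECT MAX(LastUpdate) FROM Stock GROUP BY ProductID)
    ORDER BY ProductID, Quantity
""").fetchall()

# Join on both ProductID and the per-product latest date.
join_rows = conn.execute("""
    SELECT a.ProductID, a.Quantity
    FROM Stock AS a
    INNER JOIN (SELECT ProductID, MAX(LastUpdate) AS LastUpdate
                FROM Stock
                GROUP BY ProductID) b
            ON a.ProductID = b.ProductID AND a.LastUpdate = b.LastUpdate
    ORDER BY a.ProductID
""").fetchall()

print(in_rows)    # [(1, 123), (1, 444), (2, 234)]  (1, 123) is stale
print(join_rows)  # [(1, 444), (2, 234)]
```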

SQL Server : SELECT Highest Price and add qty's from table

I have the following table TableAllProds:
ProdName ManuPartNo Price Qty Supplier
--------------------------------------------------
Part1 R10001 100.00 2 Supp1
Part2 R10002 500.00 2 Supp2
Part3 R30023 50.00 1 Supp3
Part2again R10002 100.00 5 Supp4
Part2Again R10002 300.00 10 Supp5
Part1again R10001 200.00 5 Supp3
I have a select statement to bring me back the highest price which works fine if there are duplicate products from different suppliers.
SELECT
ProdName, ManuPartNo, Price, Qty, Supplier
FROM
(SELECT
dbo.TableAllProds.*,
ROW_NUMBER() OVER (PARTITION BY ManuPartNo ORDER BY Price ASC) AS RN
FROM
dbo.TableAllProds) AS t
WHERE
RN = 1
ORDER BY
ManuPartNo
However, I would also like to total the quantities across all suppliers. For example, for ManuPartNo R10001 I would like to return R10001 - 200.00 - 7 (qty), along with the supplier of the highest price if possible.
Not sure how to google this, I can either return the highest/Lowest price easily and also return a sum of the qty for each part but am not sure about how to perform both queries at once.
Thanks for any help.
You can use SUM as a windowed function:
SELECT ProdName, ManuPartNo, Price, Qty, TotalQty, Supplier
FROM ( SELECT *,
ROW_NUMBER() OVER(PARTITION BY ManuPartNo ORDER BY Price DESC) AS RN,
SUM(Qty) OVER(PARTITION BY ManuPartNo) AS TotalQty
FROM dbo.TableAllProds) AS t
WHERE RN = 1
ORDER BY ManuPartNo;
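As a check of the windowed SUM idea against the question's data, here is a minimal sketch assuming SQLite 3.25+ via Python rather than SQL Server. The row number is ordered by price descending because the question asks for the highest-priced supplier; prices are stored as REAL for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableAllProds "
    "(ProdName TEXT, ManuPartNo TEXT, Price REAL, Qty INT, Supplier TEXT)"
)
conn.executemany(
    "INSERT INTO TableAllProds VALUES (?, ?, ?, ?, ?)",
    [
        ("Part1", "R10001", 100.00, 2, "Supp1"),
        ("Part2", "R10002", 500.00, 2, "Supp2"),
        ("Part3", "R30023", 50.00, 1, "Supp3"),
        ("Part2again", "R10002", 100.00, 5, "Supp4"),
        ("Part2Again", "R10002", 300.00, 10, "Supp5"),
        ("Part1again", "R10001", 200.00, 5, "Supp3"),
    ],
)

# RN = 1 marks the highest-priced row per part; TotalQty sums over the whole part.
top_with_total = conn.execute("""
    SELECT ManuPartNo, Price, TotalQty, Supplier
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY ManuPartNo ORDER BY Price DESC) AS RN,
                 SUM(Qty)     OVER (PARTITION BY ManuPartNo)                     AS TotalQty
          FROM TableAllProds) AS t
    WHERE RN = 1
    ORDER BY ManuPartNo
""").fetchall()

for row in top_with_total:
    print(row)
# ('R10001', 200.0, 7, 'Supp3')
# ('R10002', 500.0, 17, 'Supp2')
# ('R30023', 50.0, 1, 'Supp3')
```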
This seems to be what you want... uncomment the where clause if you only want that supplier.
declare @TableAllProds table (ProdName varchar(16), ManuPartNo varchar(16), Price decimal (5,2), Qty int, Supplier varchar(16))
insert into @TableAllProds
values
('Part1','R10001',100.00,2,'Supp1'),
('Part2','R10002',500.00,2,'Supp2'),
('Part3','R30023',50.00,1,'Supp3'),
('Part2again','R10002',100.00,5,'Supp4'),
('Part2Again','R10002',300.00,10,'Supp5'),
('Part1again','R10001',200.00,5,'Supp3')
;WITH CTE AS(
SELECT
ProdName,
ManuPartNo,
Price,
Supplier,
sum(Qty) over (partition by ManuPartNo) TotalOverAllSuppliers,
case when Price = max(price) over (partition by ManuPartNo) then Supplier end HighestPricedSupplier
FROM
@TableAllProds)
select
*
from cte
--where HighestPricedSupplier is not null
SELECT a.ManuPartNo, a.Price, a.QTY, b.Supplier
FROM (SELECT t1.ManuPartNo, MAX(t1.Price) AS Price, SUM(t1.Qty) AS QTY
FROM dbo.TableAllProds t1
GROUP BY t1.ManuPartNo) a
JOIN (SELECT t2.ManuPartNo, t2.Price, t2.Supplier,
ROW_NUMBER() OVER (PARTITION BY t2.ManuPartNo ORDER BY t2.Price DESC) AS RN
FROM dbo.TableAllProds t2
GROUP BY t2.ManuPartNo, t2.Price, t2.Supplier) b
ON a.ManuPartNo = b.ManuPartNo
WHERE b.RN = 1
Using this will return
R10001 200.00 7 Supp3
R10002 500.00 17 Supp2
R30023 50.00 1 Supp3
I have a question though. Is it possible for more than one supplier to have the same part at the same price? If so, this will still work, but it will just grab whichever matching supplier it finds first.
You can query using row_number as below:
Select * from (
Select *,
RowN = Row_Number() over(Partition by ManuPartNo order by Price desc),
SmQty = Sum(Qty) over(Partition by ManuPartNo)
from dbo.TableAllProds ) a
where a.RowN = 1

SQL Select Group By Min() - but select other

I want to select the ID of the Table Products with the lowest Price Grouped By Product.
ID Product Price
1 123 10
2 123 11
3 234 20
4 234 21
Which by logic would look like this:
SELECT
ID,
Min(Price)
FROM
Products
GROUP BY
Product
But I don't want to select the Price itself, just the ID.
Resulting in
1
3
EDIT: The DBMSes used are Firebird and Filemaker
You didn't specify your DBMS, so this is ANSI standard SQL:
select id
from (
select id,
row_number() over (partition by product order by price) as rn
from products
) t
where rn = 1
order by id;
If your DBMS doesn't support window functions, you can do that with joining against a derived table:
select p.id
from products p
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = p.product and t.min_price = p.price;
Note that this will return a slightly different result than the first solution: if the minimum price for a product occurs more than once, all those IDs will be returned. The first solution will only return one of them. If you don't want that, you need to group again in the outer query:
select min(p.id)
from products p
join (
select product,
min(price) as min_price
from products
group by product
) t on t.product = p.product and t.min_price = p.price
group by p.product;
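The tie behaviour described above can be seen in a small sketch, assuming SQLite via Python and hypothetical data in which product 123 has two rows at its minimum price (the question's own sample has no ties):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INT, product INT, price INT)")
# Hypothetical data: ids 1 and 2 share the lowest price for product 123.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 123, 10), (2, 123, 10), (3, 234, 20), (4, 234, 21)],
)

# Shared derived table: the minimum price per product.
min_price_sql = """
    SELECT product, MIN(price) AS min_price
    FROM products
    GROUP BY product
"""

# Plain join: every row at the minimum price is returned, ties included.
all_min_ids = [r[0] for r in conn.execute(f"""
    SELECT p.id
    FROM products p
    JOIN ({min_price_sql}) t
      ON t.product = p.product AND t.min_price = p.price
    ORDER BY p.id
""")]

# Grouping again in the outer query collapses ties to the smallest id.
one_id_per_product = [r[0] for r in conn.execute(f"""
    SELECT MIN(p.id)
    FROM products p
    JOIN ({min_price_sql}) t
      ON t.product = p.product AND t.min_price = p.price
    GROUP BY p.product
    ORDER BY 1
""")]

print(all_min_ids)         # [1, 2, 3]
print(one_id_per_product)  # [1, 3]
```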
SELECT ID
FROM Products as A
where price = ( select Min(Price)
from Products as B
where B.Product = A.Product )
GROUP BY id
This returns the ID of the lowest-priced row for each product, which in this case is 1 and 3.

How to query specific values for some columns and sum of values in others SQL

I'm trying to query some data from SQL such that it sums some columns, gets the max of another column and the corresponding row for a third column. For example,
|dataset|
|shares| |date| |price|
100 05/13/16 20.4
200 05/15/16 21.2
300 06/12/16 19.3
400 02/22/16 20.0
I want my output to be:
|shares| |date| |price|
1000 06/12/16 19.3
The shares have been summed up, the date is max(date), and the price is the price at max(date).
So far, I have:
select sum(shares), max(date), max(price)
but that gives me an incorrect price.
EDIT:
I realize I was unclear in my OP: all the other relevant data is in one table, and the price is in another. My full code is:
select id, stock, side, exchange, max(startdate), max(enddate),
sum(shares), sum(execution_price*shares)/sum(shares), max(limitprice), max(price)
from table1 t1
INNER JOIN table2 t2 on t2.id = t1.id
where location = 'CHICAGO' and startdate > '1/1/2016' and order_type = 'limit'
group by id, stock, side, exchange
You can do this with window functions and aggregation. Here is an example:
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
from t
) t;
EDIT:
If the results that you are looking at are in fact the result of a query, you can do:
with t as (<your query here>)
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
from t
) t;
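The conditional-aggregation pattern can be tried on the question's sample rows. This sketch assumes SQLite 3.25+ via Python; the column name trade_date and the ISO date strings are assumptions made here so that the dates sort correctly as text (the question shows MM/DD/YY values in a column called date).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: ISO-8601 text dates in a column named trade_date.
conn.execute("CREATE TABLE t (shares INT, trade_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (100, "2016-05-13", 20.4),
        (200, "2016-05-15", 21.2),
        (300, "2016-06-12", 19.3),
        (400, "2016-02-22", 20.0),
    ],
)

# seqnum = 1 tags the newest row; the CASE keeps only that row's price,
# so MAX() over a single non-NULL value returns the price at the latest date.
summary = conn.execute("""
    SELECT SUM(shares),
           MAX(trade_date),
           MAX(CASE WHEN seqnum = 1 THEN price END) AS price
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (ORDER BY trade_date DESC) AS seqnum
          FROM t) s
""").fetchone()

print(summary)  # (1000, '2016-06-12', 19.3)
```

For per-group results, the same idea extends by adding PARTITION BY to the window and a matching GROUP BY to the outer query.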
Here's one way to do it. The join would also include the ticker symbol for the share:
select
a.sum_share,
a.max_date,
b.price
FROM
(
select ticker, sum(shares) sum_share, max(date) max_date from table where ticker = 'MSFT' group by ticker
) a
inner join table b on a.max_date = b.date and a.ticker = b.ticker