Join two tables but only get most recent associated record

I am having a hard time constructing an SQL query that joins a table to its associated table but returns only the latest (most recent) associated record for each row.
The image below describes my two tables, Inventory and Sales. The Inventory table contains all the items and the Sales table contains all the transaction records. Inventory.Id is related to Sales.Inventory_Id. The "Wanted result" is the output that I am trying to produce.
My objective is to associate the sales records with the inventory but get only the most recent transaction for each item.
A plain join (left, right or inner) doesn't produce the result I am looking for, because I don't know how to add a condition that restricts the join to the most recent row. Is this doable, or should I change my table schema?
Thanks.

You can use APPLY
Select Item,Sales.Price
From Inventory I
Cross Apply(Select top 1 Price
From Sales S
Where I.id = S.Inventory_Id
Order By Date Desc) as Sales

WITH Sales_Latest AS (
SELECT *,
MAX(Date) OVER(PARTITION BY Inventory_Id) Latest_Date
FROM Sales
)
SELECT i.Item, s.Price
FROM Inventory i
INNER JOIN Sales_Latest s ON (i.Id = s.Inventory_Id)
WHERE s.Date = s.Latest_Date
Think carefully about what results you expect if there are two prices in Sales for the same date.
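To make the warning about ties concrete, here is a minimal sketch that runs the same idea in SQLite through Python's sqlite3 module. The sample rows and the Sales.Id column are assumptions made up for illustration, and SQLite's LIMIT 1 stands in for TOP 1. Filtering on the maximum date keeps every row that shares that date, while an explicit tie-breaker keeps exactly one row per item.

```python
import sqlite3

# Hypothetical schema and data; Sales.Id is an assumed surrogate key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (Id INTEGER PRIMARY KEY, Item TEXT);
CREATE TABLE Sales (Id INTEGER PRIMARY KEY, Inventory_Id INTEGER, Price REAL, Date TEXT);
INSERT INTO Inventory VALUES (1, 'Pen'), (2, 'Ink');
INSERT INTO Sales VALUES
  (1, 1, 1.00, '2020-01-01'),
  (2, 1, 1.50, '2020-01-02'),
  (3, 1, 1.75, '2020-01-02'),  -- same date as row 2: a tie
  (4, 2, 9.00, '2020-01-01');
""")

# Filtering on MAX(Date) keeps every row that shares the latest date.
max_date_rows = conn.execute("""
SELECT i.Item, s.Price
FROM Inventory i
JOIN Sales s ON s.Inventory_Id = i.Id
WHERE s.Date = (SELECT MAX(Date) FROM Sales WHERE Inventory_Id = i.Id)
ORDER BY i.Item, s.Price
""").fetchall()

# A tie-breaker (here: the highest Sales.Id) keeps exactly one row per item.
one_per_item = conn.execute("""
SELECT i.Item, s.Price
FROM Inventory i
JOIN Sales s ON s.Id = (
    SELECT Id FROM Sales
    WHERE Inventory_Id = i.Id
    ORDER BY Date DESC, Id DESC
    LIMIT 1)
ORDER BY i.Item
""").fetchall()

print(max_date_rows)  # 'Pen' appears twice because of the tie
print(one_per_item)   # one row per item
```

Which row should win a tie is a business decision; the highest Id is only one possible choice.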

I would just use a correlated subquery:
select Item, Price
from Inventory i
inner join Sales s
on i.id = s.Inventory_Id
and s.Date = (select max(Date) from Sales where Inventory_Id = i.id)

select * from
(
select i.name,
row_number() over (partition by i.id order by s.date desc) as rownum,
s.price,
s.date
from inventory i
left join sales s on i.id = s.inventory_id
) tmp
where rownum = 1

Related

DISTINCT is not working properly in left join

I have two tables StockIn and StockOut.
The StockIn table has an Id number for each product. This Id is unique in the StockIn table, whereas in the StockOut table it is repeated once for every sale of that stock.
Now I want to create a view that shows the stock in hand, where StockIn.Id = StockOut.Id. My query returns a result so far, but it fails when there are multiple orders for the same product in the StockOut table, because of the duplicate Id numbers.
Below is my query:
select DISTINCT
i.Id,
i.AssetsName,
i.Rate,
i.Qty,
So.QtyOut,
Balance = sum( COALESCE(i.Qty,0)- COALESCE(so.QtyOut,0)) OVER(PARTITION BY i.id)
from dbo.StockIn i
LEFT Join StockOut So
on i.Id = So.Id
GO
A simple GROUP BY with normal SUM's should do.
SELECT
si.Id,
si.AssetsName,
si.Rate,
si.Qty,
SUM(so.QtyOut) AS QtyOut,
COALESCE(si.Qty, 0) - COALESCE(SUM(so.QtyOut), 0) AS Balance -- inner COALESCE avoids a NULL balance when nothing was sold
FROM dbo.StockIn si
LEFT JOIN dbo.StockOut so ON so.Id = si.Id
GROUP BY si.Id, si.AssetsName, si.Rate, si.Qty;
You'd be better off collapsing your stock out to single row before you join
Remove the distinct - it doesn't help and isn't needed
select
si.id,
si.Qty - coalesce(so.qtyout, 0) as balance -- Qty is the stock-in quantity column from the question
from
stockin si
left join
(select id, sum (qtyout) as qtyout from stockout group by id) so
on so.id = si.id
That means your joins will only ever be 1:0 (if no sales) or 1:1 (all sales summed before join)
What happens when you order more stuff? If you create another stockin record with the same ID then the same problem will happen for products that have been re-stocked. Use the same technique to collapse stockin to a single row before you join:
select
si.id,
si.qtyin - coalesce(so.qtyout, 0) as balance
from
(select id, sum(Qty) as qtyin from stockin group by id) si
left join
(select id, sum (qtyout) as qtyout from stockout group by id) so
on so.id = si.id
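The collapse-before-join idea can be checked with a small sketch in SQLite via Python's sqlite3 module. The sample rows are invented for illustration; the column names Id, Qty and QtyOut follow the question. The plain join repeats the product once per sale, while the pre-aggregated join returns one row per product.

```python
import sqlite3

# Hypothetical data: two sales of product 1, no sales of product 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StockIn (Id INTEGER, AssetsName TEXT, Qty INTEGER);
CREATE TABLE StockOut (Id INTEGER, QtyOut INTEGER);
INSERT INTO StockIn VALUES (1, 'Chair', 10), (2, 'Desk', 5);
INSERT INTO StockOut VALUES (1, 3), (1, 4);
""")

# Plain join: product 1 shows up once per StockOut row.
joined = conn.execute("""
SELECT si.Id, si.Qty, so.QtyOut
FROM StockIn si
LEFT JOIN StockOut so ON so.Id = si.Id
ORDER BY si.Id, so.QtyOut
""").fetchall()

# Collapse StockOut to one row per Id first, then join (1:0 or 1:1).
balance = conn.execute("""
SELECT si.Id, si.Qty - COALESCE(so.QtyOut, 0) AS Balance
FROM StockIn si
LEFT JOIN (SELECT Id, SUM(QtyOut) AS QtyOut
           FROM StockOut
           GROUP BY Id) so ON so.Id = si.Id
ORDER BY si.Id
""").fetchall()

print(joined)
print(balance)
```

The COALESCE matters for products with no sales: without it the subtraction against NULL would yield a NULL balance.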
Try this, using a correlated subquery to total the stock out for each product:
select
i.Id,
i.AssetsName,
i.Rate,
i.Qty,
COALESCE(i.Qty,0) - COALESCE((Select sum(so.QtyOut) from dbo.StockOut so Where so.Id = i.Id),0) as Balance
from dbo.StockIn i

SQL: How to group by with two tables?

I have the tables products and history and I need to group by name:
products = (id_product, name)
history = (id_history, id_product, amount)
I tried this SQL query but it isn't grouped by name:
SELECT
products.name,
sum(history.amount)
FROM history
INNER JOIN products ON history.id_product = products.id_product
GROUP BY
products.name,
history.amount,
history.id_history;
This is the result:
You should group only by the columns that define each result row, not by the columns you aggregate. In this case, you need only products.name.
SELECT
products.name,
sum(history.amount) AS [Amount]
FROM history
INNER JOIN products ON history.id_product = products.id_product
GROUP BY
products.name;
If you need to include products without history (assuming sum should be 0 instead of null in this case), then you can use an OUTER JOIN instead of INNER JOIN to include all products:
SELECT
products.name,
COALESCE(sum(history.amount), 0) AS [Amount]
FROM history
RIGHT OUTER JOIN products ON history.id_product = products.id_product
GROUP BY
products.name;
This is no answer, but too long for a comment.
For readability's sake the product table should be first. After all it is products that we select from, plus a history sum that we can access via [left] join history ... followed by an aggregation, or [left] join (<history aggregation query>), or a subselect in the select clause.
Another step to enhance readability is the use of alias names.
Join the table, then aggregate
select p.name, coalesce(sum(h.amount), 0) as total
from products p
left join history h on h.id_product = p.id_product
group by p.name
order by p.name;
Aggregate, then join
select p.name, coalesce(h.sum_amount, 0) as total
from products p
left join
(
select id_product, sum(amount) as sum_amount
from history
group by id_product
) h on h.id_product = p.id_product
order by p.name;
Get the sum in the select clause
select
name,
(select sum(amount) from history h where h.id_product = p.id_product) as total
from products p
order by p.name;
And as you were confused on how to use GROUP BY, here is an explanation: GROUP BY ___ means you want one result row per ___. In your original query you had GROUP BY products.name, history.amount, history.id_history saying you wanted one result row per name, amount, and id, while you actually wanted one row per name only, i.e. GROUP BY products.name.
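The "one result row per ___" rule is easy to verify with a small sketch in SQLite via Python's sqlite3 module. The sample rows are made up for illustration; the table and column names follow the question.

```python
import sqlite3

# Hypothetical data: two history rows for 'apple', one for 'pear'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id_product INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE history (id_history INTEGER PRIMARY KEY, id_product INTEGER, amount INTEGER);
INSERT INTO products VALUES (1, 'apple'), (2, 'pear');
INSERT INTO history VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
""")

# One row per (name, amount, id_history): nothing is actually summed.
too_fine = conn.execute("""
SELECT p.name, SUM(h.amount)
FROM history h
JOIN products p ON h.id_product = p.id_product
GROUP BY p.name, h.amount, h.id_history
ORDER BY h.id_history
""").fetchall()

# One row per name: amounts are summed within each product.
per_name = conn.execute("""
SELECT p.name, SUM(h.amount)
FROM history h
JOIN products p ON h.id_product = p.id_product
GROUP BY p.name
ORDER BY p.name
""").fetchall()

print(too_fine)
print(per_name)
```

Because id_history is unique, grouping by it puts every history row in its own group, which is why the original query appeared not to group at all.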

Getting a SUM of the values in INNER JOIN adds up duplicate values

I am running a query that counts the records in a table on a monthly basis.
I am trying to add one extra column called "TotalPrice", which should be the sum of all the prices from the 'settle' table.
The problem I am facing is that, because of the INNER JOIN, SUM adds the same price several times: the join returns duplicate records. Is there a way to avoid this and get a SUM of the prices from unique records only?
SELECT
CONCAT(year(datetime), '-', month(datetime)) AS YearMonth,
COUNT (DISTINCT a.id) AS TOTAL, SUM(total_price) AS TotalPrice
FROM settle AS a with (nolock)
INNER JOIN transfers b with (nolock) ON b.settleId = a.id
INNER JOIN Fdata AS c with (nolock) ON c.id= b.data
GROUP BY CONCAT(year(datetime), '-', month(datetime))
Thanks in advance.
SQL Server 2008 onwards:
with CTE as -- A CTE allows us to manipulate the data before we use it, like a derived table
(
select datetime, a.id, total_price, -- a.id, because a bare id is ambiguous between settle and Fdata
row_number() over(partition by a.id, datetime order by total_price) as rn -- This creates a row number for each combo of id and datetime that appears
FROM settle AS a with (nolock)
INNER JOIN transfers b with (nolock) ON b.settleId = a.id
INNER JOIN Fdata AS c with (nolock) ON c.id= b.data
)
SELECT CONCAT(year(datetime), '-', month(datetime)) AS YearMonth,
COUNT (DISTINCT id) AS TOTAL, -- the alias a is not visible outside the CTE
SUM(total_price) AS TotalPrice
from CTE
where rn = 1 -- that row_number we created? This selects only the first one, removing duplicates
group by CONCAT(year(datetime), '-', month(datetime))
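A minimal sketch of the effect, run in SQLite through Python's sqlite3 module. The sample rows are invented, strftime replaces the year/month CONCAT because the sketch targets SQLite rather than SQL Server, and the row number is partitioned by the settle id only; all of these are assumptions for illustration. It needs a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical data: settle 1 has two transfers, settle 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE settle (id INTEGER PRIMARY KEY, datetime TEXT, total_price REAL);
CREATE TABLE transfers (settleId INTEGER, data INTEGER);
CREATE TABLE Fdata (id INTEGER PRIMARY KEY);
INSERT INTO settle VALUES (1, '2020-01-05', 100.0), (2, '2020-01-20', 50.0);
INSERT INTO Fdata VALUES (10), (11), (12);
INSERT INTO transfers VALUES (1, 10), (1, 11), (2, 12);
""")

# The join repeats settle 1, so its price is counted twice.
inflated = conn.execute("""
SELECT strftime('%Y-%m', a.datetime) AS YearMonth,
       COUNT(DISTINCT a.id) AS Total,
       SUM(a.total_price) AS TotalPrice
FROM settle a
JOIN transfers b ON b.settleId = a.id
JOIN Fdata c ON c.id = b.data
GROUP BY YearMonth
""").fetchall()

# Number the joined rows per settle id and keep only the first one.
deduped = conn.execute("""
WITH cte AS (
    SELECT a.id, a.datetime, a.total_price,
           ROW_NUMBER() OVER (PARTITION BY a.id ORDER BY b.data) AS rn
    FROM settle a
    JOIN transfers b ON b.settleId = a.id
    JOIN Fdata c ON c.id = b.data
)
SELECT strftime('%Y-%m', datetime) AS YearMonth,
       COUNT(DISTINCT id) AS Total,
       SUM(total_price) AS TotalPrice
FROM cte
WHERE rn = 1
GROUP BY YearMonth
""").fetchall()

print(inflated)
print(deduped)
```

COUNT(DISTINCT ...) already hides the duplicates in the count, which is why only the SUM looked wrong in the original query.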

Fetch most recent records as part of Joins

I am joining two tables, customer and profile, on the column cust_id. The profile table can have more than one entry per customer. When joining the tables, I want to select only the most recent profile entry by start_ts. As a result I would like one row per customer: the row from customer plus the most recent row from profile. Is there a way to do this in Oracle SQL?
I would use window functions:
select . . .
from customer c join
(select p.*,
row_number() over (partition by cust_id order by start_ts desc) as seqnum
from profile p
) p
on c.cust_id = p.cust_id and p.seqnum = 1;
You can use a left join if you like to get customers that don't have profiles as well.
One way (which works for all DB engines) is to join the tables you want to select data from and then join against the specific max-record of profile to filter out the data
select c.*, p.*
from customer c
join profile p on c.cust_id = p.cust_id
join
(
select cust_id, max(start_ts) as maxts
from profile
group by cust_id
) p2 on p.cust_id = p2.cust_id and p.start_ts = p2.maxts
Here is another way (if there exists no newer entry then it's the newest):
select
c.*,
p.*
from
customer c inner join
profile p on p.cust_id = c.cust_id and not exists(
select *
from profile
where cust_id = c.cust_id and start_ts > p.start_ts
)
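As a quick check that the max-join and NOT EXISTS approaches agree, here is a small sketch in SQLite via Python's sqlite3 module. The sample rows and the name and tier columns are assumptions for illustration; cust_id and start_ts follow the question.

```python
import sqlite3

# Hypothetical data: Ann has two profile rows, Bob has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE profile (cust_id INTEGER, start_ts TEXT, tier TEXT);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO profile VALUES
  (1, '2020-01-01', 'basic'),
  (1, '2021-06-01', 'pro'),
  (2, '2019-03-01', 'basic');
""")

# Join against the per-customer maximum start_ts.
via_max = conn.execute("""
SELECT c.name, p.tier
FROM customer c
JOIN profile p ON c.cust_id = p.cust_id
JOIN (SELECT cust_id, MAX(start_ts) AS maxts
      FROM profile
      GROUP BY cust_id) p2
  ON p.cust_id = p2.cust_id AND p.start_ts = p2.maxts
ORDER BY c.name
""").fetchall()

# Keep a profile row only if no newer row exists for the same customer.
via_not_exists = conn.execute("""
SELECT c.name, p.tier
FROM customer c
JOIN profile p
  ON p.cust_id = c.cust_id
 AND NOT EXISTS (SELECT 1
                 FROM profile p3
                 WHERE p3.cust_id = c.cust_id
                   AND p3.start_ts > p.start_ts)
ORDER BY c.name
""").fetchall()

print(via_max)
print(via_not_exists)
```

Both variants return more than one row for a customer whose two newest profile rows share the same start_ts, so a tie-breaker is still needed if that can happen.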

SQL strategy to fetch maximum

Suppose I have these three tables:
I want to get, for every product, its product_id and the client that bought it the most times (the biggest client of the product).
I solved it like this:
SELECT
product_id AS product,
(SELECT TOP 1 client_id FROM Bill_Item, Bill
WHERE Bill_Item.product_id = p.product_id
and Bill_Item.bill_id = Bill.bill_id
GROUP BY
client_id
ORDER BY
COUNT(*) DESC
) AS client
FROM Product p
Do you know a better way?
The inner query gives you the ranking. The outer query gives you the client that purchased the most of each product.
SELECT *
FROM
(
SELECT i.product_id, b.client_id,
r = row_number() over (partition by i.product_id
order by count(*) desc)
FROM Bill b
INNER JOIN Bill_Item i ON b.bill_id = i.bill_id
GROUP BY i.product_id, b.client_id
) d
WHERE r = 1
I was going to submit pretty much the same thing as @Squirrel, only with a Common Table Expression [CTE] rather than a derived table, so I won't duplicate that. But there are some learning points concerning your query. First, IMPLICIT JOINS such as FROM Bill_Item, Bill make it really easy to get unintended consequences (one of many questions: Queries that implicit SQL joins can't do?). Next, for the calculated column you can actually do this in an OUTER APPLY or CROSS APPLY, which is a very useful technique.
So you could re-write your method as follows:
SELECT *
FROM
Product p
OUTER APPLY (SELECT TOP 1 b.client_id
FROM
Bill_Item bi
INNER JOIN Bill b
ON bi.bill_id = b.bill_id
WHERE
bi.product_id = p.product_id
GROUP BY
b.client_id
ORDER BY
COUNT(*) DESC) c
And to show you how Squirrel's answer can still include products that have never been sold, all you need to do is start from Product and LEFT JOIN to the other tables:
;WITH cte AS (
SELECT
p.product_id
,b.client_id
,ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY COUNT(*) DESC) as RowNumber
FROM
Product p
LEFT JOIN Bill_Item bi
ON p.product_id = bi.product_id
LEFT JOIN Bill b
ON bi.bill_id = b.bill_id
GROUP BY
p.product_id
,b.client_id
)
SELECT *
FROM
cte
WHERE
RowNumber = 1
Techniques used in some of these that are useful.
CTE
APPLY (Outer & Cross)
Window Functions
Squirrel's answer doesn't return products that have never been sold. If you want to include those, then your approach is ok, although I would write the query as:
SELECT product_id as product,
(SELECT TOP 1 b.client_id
FROM Bill_Item bi JOIN
Bill b
ON bi.bill_id = b.bill_id
WHERE bi.product_id = p.product_id
GROUP BY client_id
ORDER BY COUNT(*) DESC
) as client
FROM Product p;
You can also express this using APPLY, but a correlated subquery is also fine.
Note the correct use of the explicit JOIN syntax.
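For readers without SQL Server at hand, the correlated-subquery version can be tried in SQLite through Python's sqlite3 module. In this sketch LIMIT 1 stands in for TOP 1, the sample rows are invented, and ties are broken by client_id; all three are assumptions for illustration. Product 3 has never been sold and still appears, with a NULL client.

```python
import sqlite3

# Hypothetical data: client A bought product 1 twice, client B once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (product_id INTEGER PRIMARY KEY);
CREATE TABLE Bill (bill_id INTEGER PRIMARY KEY, client_id TEXT);
CREATE TABLE Bill_Item (bill_id INTEGER, product_id INTEGER);
INSERT INTO Product VALUES (1), (2), (3);
INSERT INTO Bill VALUES (1, 'A'), (2, 'A'), (3, 'B');
INSERT INTO Bill_Item VALUES (1, 1), (2, 1), (3, 1), (3, 2);
""")

# For each product, pick the client with the most bill items.
top_clients = conn.execute("""
SELECT p.product_id,
       (SELECT b.client_id
        FROM Bill_Item bi
        JOIN Bill b ON bi.bill_id = b.bill_id
        WHERE bi.product_id = p.product_id
        GROUP BY b.client_id
        ORDER BY COUNT(*) DESC, b.client_id  -- assumed tie-breaker
        LIMIT 1) AS client
FROM Product p
ORDER BY p.product_id
""").fetchall()

print(top_clients)
```

Without a second ORDER BY key, the winner among clients with equal counts is arbitrary, in SQL Server's TOP 1 as much as here.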