SQL return limited rows based on aggregating sum

I want to return a number of rows from one table whose sum is dependent on a value from a row in another table:
Scenario: a sales order for a quantity of a particular item. The item is found in a number of bin locations. The storeman needs to be directed to the oldest material.
I can create a query that lists the bin, the quantity in the bin, and orders them by age (oldest to youngest) - all good so far. But say the order is for 100 units, there are 50 or so units in each bin, and there are 40 bins: then I don't want to list all the bins, just the oldest two - just enough to be able to fulfill the order.
How do I do that?
Just some more info as requested
DB = MS SQL 2016
Sample Data:
The following is the data for a particular item showing the Bin, the qty in that bin and ageing date:
Bin#, Qty, Date
1,40,2018-05-15
3,45,2018-05-15
8,45,2018-02-10
12,45,2017-11-11
13,45,2018-02-10
15,45,2017-09-02
18,20,2017-09-02
The sales order is for 100 of these items. We want to pick FIFO (First-In-First-Out), so the results I want to return are:
18,20,2017-09-02
15,45,2017-09-02
12,45,2017-11-11
These three bins contain a total of 110 units, which is enough to satisfy the sales order. Note that the ordering is by Date, then Qty.
The actual query is currently:
select
[OrderHed].[OrderNum] as [OrderHed_OrderNum],
[OrderRel].[OrderLine] as [OrderRel_OrderLine],
[Part].[PartNum] as [Part_PartNum],
[Part].[PartDescription] as [Part_PartDescription],
[OrderRel].[OurReqQty] as [OrderRel_OurReqQty],
[PartBin].[BinNum] as [PartBin_BinNum],
[PartBin].[OnhandQty] as [PartBin_OnhandQty],
[PartLot].[FirstRefDate] as [PartLot_FirstRefDate]
from Erp.OrderHed as OrderHed
inner join Erp.OrderDtl as OrderDtl on
OrderHed.Company = OrderDtl.Company
and OrderHed.OrderNum = OrderDtl.OrderNum
inner join Erp.OrderRel as OrderRel on
OrderDtl.Company = OrderRel.Company
and OrderDtl.OrderNum = OrderRel.OrderNum
and OrderDtl.OrderLine = OrderRel.OrderLine
and ( OrderRel.OpenRelease = 1 )
left outer join Erp.PartBin as PartBin on
OrderRel.Company = PartBin.Company
and OrderRel.WarehouseCode = PartBin.WarehouseCode
and ( not PartBin.BinNum like 'Q' )
inner join Erp.Part as Part on
OrderDtl.Company = Part.Company
and OrderDtl.PartNum = Part.PartNum
and PartBin.Company = Part.Company
and PartBin.PartNum = Part.PartNum
inner join Erp.PartLot as PartLot on
PartBin.Company = PartLot.Company
and PartBin.PartNum = PartLot.PartNum
and PartBin.LotNum = PartLot.LotNum
where (OrderHed.OrderNum = #SalesOrder)
order by OrderDtl.OrderLine, PartLot.FirstRefDate, PartBin.OnhandQty

You can select the bins whose date is less than or equal to the minimum date for which the sum of the quantities of all bins with a date less than or equal to it reaches your target quantity (e.g. 50).
SELECT *
FROM   bin b
WHERE  b.date <= (SELECT min(bb.date)
                  FROM   bin bb
                  WHERE  (SELECT sum(bbb.qty)
                          FROM   bin bbb
                          WHERE  bbb.date <= bb.date) >= 50)
ORDER BY b.date,
         b.bin#;
This approach, however, can include more bins than necessary. If the cut-off date is shared by more bins than are needed to satisfy the target quantity, all of them are included anyway, so the person who picks the items for the order would have to choose among these bins. But at least the FIFO rule is kept that way, and the picker has to count the items anyway and cannot just blindly pick everything from the returned bins.
SQL Fiddle (Note that I added bin 20 to demonstrate the above-mentioned problem.)
The problem mentioned for the first approach can be circumvented if you give all the bins a number ordered by date; unlike the date, this number has no duplicate values. You can introduce it with ROW_NUMBER() in a CTE, then select from the CTE with the same logic as before, applied to the row number instead of the date.
WITH cte
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY b.date) row#,
           b.*
    FROM   bin b
)
SELECT *
FROM   cte c
WHERE  c.row# <= (SELECT min(cc.row#)
                  FROM   cte cc
                  WHERE  (SELECT sum(ccc.qty)
                          FROM   cte ccc
                          WHERE  ccc.row# <= cc.row#) >= 50)
ORDER BY c.date,
         c.bin#;
SQL Fiddle (Note that I added bin 20 again to demonstrate that the problem mentioned above is tackled.)
Both methods, however, won't necessarily yield the "optimal" set of bins. For example, there might be a set of bins with the right dates that holds exactly the quantity ordered, but such a set is only returned by chance. There might also be a valid set of bins with a smaller cardinality than the returned one.
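Since the question mentions SQL Server 2016, the same cut-off logic can also be written with a windowed running total instead of the correlated subqueries above. This is only a sketch against the simplified bin table used in the fiddles, with the order quantity of 100 hard-coded:
-- Sketch only: simplified bin(bin#, qty, date) table as in the fiddles above,
-- target quantity 100. running_qty is the cumulative quantity in FIFO order;
-- a bin is picked as long as the bins picked before it are still short of
-- the target.
WITH ordered AS
(
    SELECT b.*,
           SUM(b.qty) OVER (ORDER BY b.date, b.qty, b.bin#
                            ROWS UNBOUNDED PRECEDING) AS running_qty
    FROM bin b
)
SELECT *
FROM ordered
WHERE running_qty - qty < 100
ORDER BY date, qty, bin#;
For the sample data this returns bins 18, 15 and 12 (110 units), and it stops adding bins as soon as the previously selected ones already cover the order.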

SQL SELECT filtering out combinations where another column contains empty cells, then returning records based on max date

I have run into an issue I don't know how to solve. I'm working with an MS Access DB.
I have this data:
I want to write a SELECT statement that gives the following result:
For each combination of Project and Invoice, I want to return the record containing the maximum date, conditional on all records for that combination of Project and Invoice being Signed (i.e. Signed or Date column not empty).
In my head, first I would sort the irrelevant records out, and then return the max date for the remaining records. I'm stuck on the first part.
Could anyone point me in the right direction?
Thanks,
Hulu
Start with an initial query which fetches the combinations of Project, Invoice, Date from the rows you want returned by your final query.
SELECT
y0.Project,
y0.Invoice,
Max(y0.Date) AS MaxOfDate
FROM YourTable AS y0
GROUP BY y0.Project, y0.Invoice
HAVING Sum(IIf(y0.Signed Is Null,1,0))=0;
The HAVING clause discards any Project/Invoice groups which include a row with a Null in the Signed column.
If you save that query as qryTargetRows, you can then join it back to your original table to select the matching rows.
SELECT
y1.Project,
y1.Invoice,
y1.Desc,
y1.Value,
y1.Signed,
y1.Date
FROM
YourTable AS y1
INNER JOIN qryTargetRows AS sub
ON (y1.Project = sub.Project)
AND (y1.Invoice = sub.Invoice)
AND (y1.Date = sub.MaxOfDate);
Or you can do it without the saved query by directly including its SQL as a subquery.
SELECT
y1.Project,
y1.Invoice,
y1.Desc,
y1.Value,
y1.Signed,
y1.Date
FROM
YourTable AS y1
INNER JOIN
(
SELECT y0.Project, y0.Invoice, Max(y0.Date) AS MaxOfDate
FROM YourTable AS y0
GROUP BY y0.Project, y0.Invoice
HAVING Sum(IIf(y0.Signed Is Null,1,0))=0
) AS sub
ON (y1.Project = sub.Project)
AND (y1.Invoice = sub.Invoice)
AND (y1.Date = sub.MaxOfDate);
Write a SQL query, which should also be possible in MS Access, like this:
SELECT
Project,
Invoice,
MIN([Desc]) Descriptions,
SUM(Value) Value,
MIN(Signed) Signed,
MAX([Date]) "Date"
FROM data
WHERE Signed<>'' AND [Date]<>''
GROUP BY
Project,
Invoice
output:
Project | Invoice | Descriptions | Value | Signed | Date
--------|---------|--------------|-------|--------|-----------
A       | 1       | Ball         | 100   | J.D.   | 2022-09-20
B       | 1       | Sofa         | 300   | J.D.   | 2022-09-22
B       | 2       | Desk         | 100   | J.D.   | 2022-09-23
Note: for invoice 1 on project A, you will see a value of 300, which is the total for that invoice (when grouping on Project='A' and Invoice=1).
Maybe I should have used DCONCAT (see: Concatenation in between records in Access Query ) for the Description, to include 'TV' in it. But I am unable to test that so I am only referring to this answer.
Try joining a second query:
Select *
From YourTable As T
Inner Join
(Select Project, Invoice, Max([Date]) As MaxDate
From YourTable
Group By Project, Invoice) As S
On T.Project = S.Project And T.Invoice = S.Invoice And T.Date = S.MaxDate

Selecting Top Row in Calculated Column

I need to subtract the top row in a table that has multiple records from another table that has one row. One table has assets with one date and the other has multiple assets grouped by older dates. I am also limiting the results to cases where the newer asset differs from the older asset by 40% or more in either direction.
I have already tried using the row_number function to pull the top row from the second table but am having trouble with the subquery.
Select
p.pid, e.coname, p.seq, p.valmo, p.valyr, p.assets,
(case
when ((p.assets-p1.assets)/p.assets) * 100 <= -40
or ((p.assets-p1.assets)/p.assets) * 100 >=40
and p.assets <> p1.assets
then ((p.assets - p1.assets) / p.assets) * 100
end) as "PercentDiff"
from
pen_plans p
join
pen_plans_archive p1 on p.pid = p1.pid and p.seq = p1.seq
join
entities e on p.pid = e.pid
where
p.assets > 500000 and e.mmd = 'A'
order by
VALYR desc
So I need to subtract the top row in "pen_plans_archive" from the assets in "pen_plans". I've tried to combine something like this in a subquery into the above:
select assets from (select assets, valyr,
row_number() over (partition by assets
order by valyr DESC) as R
from pen_plans_archive) RS
where R=1 order by valyr DESC
The "assets" column definition is Number(12,0).
I expect the query to produce the columns, PID, CONAME, SEQ, VALMO, VALYR, ASSETS, and the Calculated Column PERCENTDIFF with no null values.
The first query produces null values and also subtracts every asset figure in pen_plans_archive from pen_plans, which is not what I need.
Are you just trying to do the Top function?
Select TOP 1 <column>
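If TOP is the direction you want, one way to wire it into your query is to fetch only the newest archive row per plan with OUTER APPLY. This is just a sketch, assuming SQL Server (which the TOP suggestion implies) and the pen_plans / pen_plans_archive columns named in the question:
-- Sketch only: assumes SQL Server and the table/column names from the question.
-- OUTER APPLY runs the TOP 1 subquery once per pen_plans row and keeps only the
-- newest archive row (highest valyr) for the same pid and seq.
select
    p.pid, p.seq, p.valmo, p.valyr, p.assets,
    a.assets as archive_assets,
    (p.assets - a.assets) * 100.0 / p.assets as PercentDiff  -- 100.0 avoids integer division
from pen_plans p
outer apply
    (select top 1 p1.assets
     from pen_plans_archive p1
     where p1.pid = p.pid
       and p1.seq = p.seq
     order by p1.valyr desc) a
where p.assets > 500000
order by p.valyr desc;
The join to entities and the +/-40% filter from your query can be added back unchanged.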

getting avg of column based on the result set

I have a select statement that breaks the count of sales down by country and price banding (see example below).
The select statement looks like follows:
SELECT p.[Price Band]
,t.[Country]
,COUNT(o.[Order]) as [Order Count]
FROM #price p -- temp table
INNER JOIN country t ON p.CountryCode = t.countryCode
INNER JOIN sales o ON o.salesValue >= p.startPrice and o.salesValue < p.endPrice
GROUP BY p.[Price Band], t.[Country]
What I want to be able to do, based on this result, is get an average of the unit count, i.e. for all orders that are under 20, what is the average unit count, and the same for the other bands. How can I do this?
It's most likely simple, but I can't think it through.
What I am after:
So as you can see, in the price band <20 in the UK the order count is 50, and the avg units for it is 2. As I mentioned earlier, I want the average units of all orders that are under 20 (which is 50 in the picture).
Is that clearer?
Thanks in advance.
EDIT:
The first table: assume it to be the source. The second table shows the avg; that's what I am after.
Wouldn't you just use avg()?
SELECT p.[Price Band], t.[Country],
COUNT(*) as [Order Count],
AVG(Items)
FROM #price p INNER JOIN
country t
ON p.CountryCode = t.countryCode INNER JOIN
sales o
ON o.salesValue >= p.startPrice and o.salesValue < p.endPrice
GROUP BY p.[Price Band], t.[Country]
ORDER BY t.[Country], p.[Price Band]
Note: SQL Server does integer division on integers (so 3/2 = 1, not 1.5), and AVG() over an integer column behaves the same way. It is more accurate to use a decimal number; an easy way is to use AVG(Items * 1.0).
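A tiny, self-contained illustration of that caveat (the values are made up, not from the question's data):
-- Hypothetical values, just to show the truncation: AVG over an int column
-- truncates, while multiplying by 1.0 first promotes the result to decimal.
SELECT AVG(Items)       AS avg_int,      -- returns 2
       AVG(Items * 1.0) AS avg_decimal   -- returns 2.500000
FROM (VALUES (2), (3)) AS v(Items);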

SQL select query with two tables

I'm struggling with a task. I need to create a select query which:
For each listed date, it shows the date and the revenue, where revenue is the number of sold units multiplied by the unit price (but ONLY if revenue is greater than or equal to 10,000).
There are two tables: product & order.
Product contains the columns unittype and price; order contains the columns unittype, date and number (the number of units sold).
This is my try on the select query:
SELECT
order.date,
product.price*order.number AS revenue
FROM product
INNER JOIN
order
ON product.unittype = order.unittype
WHERE product.price*order.number >= 10000;
None of my results are even close to 10k (they range between 39 and 1.3k), so I'm wondering if I've written it wrong or if there is a better way to write it.
If this is meant to be for the total for the day (and not the individual line), you need an aggregate and a having clause:
SELECT
order.date,
SUM(product.price*order.number) AS revenue
FROM product
INNER JOIN
order
ON product.unittype = order.unittype
GROUP BY order.date
HAVING SUM(product.price*order.number) >= 10000

Count the number of occurrences grouped by some rows

I have made a query that gives me the number of products that have not been in stock (I know that by looking at the orders which the manufacturer returned with a certain status code), by product, date and storage. It looks like this:
SELECT count(*) as out_of_stock,
prod.id as product_id,
ped.data_envio::date as date,
opl.id as storage_id
from sub_produtos_pedidos spp
left join cad_produtos prod ON spp.ean_produto = prod.cod_ean
left join sub_pedidos sp ON spp.id_pedido = sp.id
left join pedidos ped ON sp.id_pedido = ped.id
left join op_logisticos opl ON sp.id_op_logistico = opl.id
where spp.motivo = '201' -- this is the code that means 'not in inventory'
group by storage_id,product_id,date
That produces an answer like this:
out_of_stock | product_id | date | storage_id
--------------|------------|-------------|-------------
1 | 5 | 2012-10-16 | 1
5 | 4 | 2012-10-16 | 2
Now I need to get the number of occurrences, by product and storage, of products that have been out of stock for 2 or more days, 5 or more days and so on.
So I guess I need to do a new count on the first query, aggregating the resultant rows in some defined day intervals.
I tried looking at the datetime functions in Postgres (http://www.postgresql.org/docs/7.3/static/functions-datetime.html), but couldn't find what I need.
Maybe I didn't understand your question correctly, but it looks like you need to leverage a subquery.
Now I need to get the number of occurrences, by product and storage, of products that have been out of stock for 2 or more days
So:
SELECT COUNT(*), date, product_id FROM ( YOUR BIG QUERY IS THERE ) a
WHERE a.date < (CURRENT_DATE - interval '2' day)
GROUP BY date, product_id
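A slightly more concrete sketch of the same idea, with the query from the question inlined as a CTE and two example intervals (2 or more and 5 or more days) counted per product and storage; the column names are the ones from the question, the bucket boundaries are just examples:
-- Sketch only: the inner query is the one from the question; the outer query
-- counts, per product and storage, how many of its out-of-stock dates lie at
-- least 2 or at least 5 days in the past.
WITH out_of_stock_days AS (
    SELECT count(*)             AS out_of_stock,
           prod.id              AS product_id,
           ped.data_envio::date AS date,
           opl.id               AS storage_id
    FROM sub_produtos_pedidos spp
    LEFT JOIN cad_produtos prod ON spp.ean_produto = prod.cod_ean
    LEFT JOIN sub_pedidos sp    ON spp.id_pedido = sp.id
    LEFT JOIN pedidos ped       ON sp.id_pedido = ped.id
    LEFT JOIN op_logisticos opl ON sp.id_op_logistico = opl.id
    WHERE spp.motivo = '201'
    GROUP BY storage_id, product_id, date
)
SELECT product_id,
       storage_id,
       sum(CASE WHEN date <= current_date - 2 THEN 1 ELSE 0 END) AS days_2_or_more,
       sum(CASE WHEN date <= current_date - 5 THEN 1 ELSE 0 END) AS days_5_or_more
FROM out_of_stock_days
GROUP BY product_id, storage_id;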
Since you seem to want every row in the result individually, you cannot aggregate. Use a window function instead to get the count per day. The well known aggregate function count() can also serve as window aggregate function:
SELECT current_date - ped.data_envio::date AS days_out_of_stock
,count(*) OVER (PARTITION BY ped.data_envio::date)
AS count_per_days_out_of_stock
,ped.data_envio::date AS date
,p.id AS product_id
,opl.id AS storage_id
FROM sub_produtos_pedidos spp
LEFT JOIN cad_produtos p ON p.cod_ean = spp.ean_produto
LEFT JOIN sub_pedidos sp ON sp.id = spp.id_pedido
LEFT JOIN op_logisticos opl ON opl.id = sp.id_op_logistico
LEFT JOIN pedidos ped ON ped.id = sp.id_pedido
WHERE spp.motivo = '201' -- code for 'not in inventory'
ORDER BY ped.data_envio::date, p.id, opl.id
Sort order: Products having been out of stock for the longest time first.
Note that you can just subtract dates to get an integer in Postgres.
If you want a running count in the sense of "n rows have been out of stock for this number of days or more", use:
count(*) OVER (ORDER BY ped.data_envio::date) -- ascending order!
AS running_count_per_days_out_of_stock
You get the same count for the same day; peers are lumped together.