How do I join 2 tables to allocate items? - sql

I've created 2 tables that have inventory information (item, location, qty). One of them NeedInv has item/location(s) that need X number of items. The other HaveInv has item/locations(s) with excess X number of items.
I'm trying to join or combine the 2 tables to output which items should be transferred between which locations. I have code that does this for a single distribution location & I've attempted to modify it and add logic to have it work with multiple distribution locations, but it still fails in certain situations.
I've created a sqlfiddle, but the sample data is like so:
CREATE TABLE NeedInv
(item int, location varchar(1), need int)
INSERT INTO NeedInv
(item, location, need)
VALUES
(100, 'A', 4), (100, 'B', 0), (100, 'C', 2), (200, 'A', 0), (200, 'B', 1), (200, 'C', 1), (300, 'A', 3), (300, 'B', 5), (300, 'C', 0)
CREATE TABLE HaveInv
(item int, location varchar(1), have int)
INSERT INTO HaveInv
(item, location, have)
VALUES
(100, 'A', 0), (100, 'B', 3), (100, 'C', 0), (100, 'D', 3), (200, 'A', 1), (200, 'B', 0), (200, 'C', 0), (200, 'D', 1), (300, 'A', 0), (300, 'B', 0), (300, 'C', 20), (300, 'D', 5)
CREATE TABLE DesiredOutput
(item int, SourceLocation varchar(1), TargetLocation varchar(1), Qty int)
INSERT INTO DesiredOutput
(item, SourceLocation, TargetLocation, Qty)
VALUES
(100, 'B', 'A', 3), (100, 'D', 'A', 1), (100, 'D', 'C', 2), (200, 'A', 'B', 2), (200, 'A', 'C', 3), (200, 'D', 'C', 1), (300, 'C', 'A', 3), (300, 'C', 'B', 3)
I was trying to output something like this as a result of joining the tables:
+------+----------------+----------------+-----+
| item | SourceLocation | TargetLocation | Qty |
+------+----------------+----------------+-----+
|  100 | B              | A              |   3 |
|  100 | D              | A              |   1 |
|  100 | D              | C              |   2 |
|  200 | A              | B              |   2 |
|  200 | A              | C              |   3 |
|  200 | D              | C              |   1 |
|  300 | C              | A              |   3 |
|  300 | C              | B              |   3 |
+------+----------------+----------------+-----+
My current query to join the 2 tables looks like so:
select
    n.*,
    (case when Ord <= Remainder and (RemainingNeed > 0 and RemainingNeed < RemainingInv)
          then Allocated + RemainingNeed
          else case when RemainingNeed < 0 then 0 else Allocated end
     end) as NeedToFill
from (
    select
        n.*,
        row_number() over (partition by item order by RN, (case when need > Allocated then 0 else 1 end)) as Ord,
        n.TotalAvail - sum(n.Allocated) over (partition by item) as Remainder
    from (
        select
            n.*,
            n.TotalAvail - sum(n.Allocated) over (partition by item order by RN) as RemainingInv,
            n.need - sum(n.Allocated) over (partition by item, location order by RN) as RemainingNeed
        from (
            select
                n.*,
                case when Proportional > need then need else Proportional end as Allocated
            from (
                select
                    row_number() over (order by need desc) as RN,
                    n.*,
                    h.location as Source,
                    h.have,
                    h.TotalAvail,
                    convert(int, floor(h.have * n.need * 1.0 / n.TotalNeed), 0) as Proportional
                from (
                    select n.*, sum(need) over (partition by item) as TotalNeed
                    from NeedInv n
                ) n
                join (
                    select h.*, sum(have) over (partition by item) as TotalAvail
                    from HaveInv h
                ) h
                    on n.item = h.item
                    and h.have > 0
            ) n
        ) n
    ) n
) n
where n.need > 0
It seems to work for most cases, except when Allocated is set to zero but there are still items that could be transferred. This can be seen for item 200, where location B only needs 1 but is going to receive 2 items, while location C, which also needs 1 item, will receive 0.
Any help/guidance would be appreciated!

Your query looks a little complicated for what it needs to do, IMO.
As far as I can tell, this is just a simple matter of building the logic into a query using running totals of inventory. Essentially, it's just a matter of building in rules such that if what you need can be taken from a source location, you take it, otherwise you take as much as possible.
For example, I believe the following query contains the logic required:
SELECT N.Item,
SourceLocation = H.Location,
TargetLocation = N.Location,
Qty =
CASE
WHEN N.TotalRunningRequirement <= H.TotalRunningInventory -- If the current source location has enough stock to fill the request.
THEN
CASE
WHEN N.TotalRunningRequirement - N.Need < H.TotalRunningInventory - H.Have -- If stock required has already been allocated from elsewhere.
THEN N.TotalRunningRequirement - (H.TotalRunningInventory - H.Have) -- Get the total running requirement minus stock allocated from elsewhere.
ELSE N.Need -- Otherwise just take how much is needed.
END
ELSE N.Need - (N.TotalRunningRequirement - H.TotalRunningInventory) -- Current source doesn't have enough stock to fulfil need, so take as much as possible.
END
FROM
(
SELECT *, TotalRunningRequirement = SUM(need) OVER (PARTITION BY item ORDER BY location)
FROM NeedInv
WHERE need > 0
) AS N
JOIN
(
SELECT *, TotalRunningInventory = SUM(have) OVER (PARTITION BY item ORDER BY location)
FROM HaveInv
WHERE have > 0
) AS H
ON H.Item = N.Item
AND H.TotalRunningInventory - (N.TotalRunningRequirement - N.need) > 0 -- Join if stock in source location can be taken
AND H.TotalRunningInventory - H.Have - (N.TotalRunningRequirement - N.need) < N.TotalRunningRequirement
;
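Outside SQL, the same rule can be sanity-checked with a short illustrative sketch (Python here, not part of the answer's query): each target's need occupies an interval of the running requirement, each source an interval of the running inventory, and the quantity transferred is just the overlap of the two intervals, which is what the join conditions and the CASE above compute.

```python
# Illustrative sketch only: transfers as overlaps of running-total intervals.
def allocate(needs, haves):
    # needs/haves: lists of (location, qty), in the same order the SQL uses
    transfers = []
    need_start = 0
    for tloc, need in needs:
        need_end = need_start + need          # running requirement interval
        have_start = 0
        for sloc, have in haves:
            have_end = have_start + have      # running inventory interval
            qty = min(need_end, have_end) - max(need_start, have_start)
            if qty > 0:
                transfers.append((sloc, tloc, qty))
            have_start = have_end
        need_start = need_end
    return transfers

# Item 100 from the sample data: A needs 4, C needs 2; B has 3, D has 3.
print(allocate([("A", 4), ("C", 2)], [("B", 3), ("D", 3)]))
# [('B', 'A', 3), ('D', 'A', 1), ('D', 'C', 2)]
```

For item 200 (B and C each need 1; A and D each have 1) this gives A→B 1 and D→C 1, which is consistent with the note below about the desired output.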
Note: Your desired output doesn't seem to match your sample data for Item 200 as far as I can tell.

I was wondering if a Recursive CTE could be used for allocations.
But it turned out a bit more complicated.
The result doesn't completely align with the expected results in the question.
But since the other answer returns the same results, I guess that's fine.
So see it as just an extra method.
Test on db<>fiddle here
It basically loops through the haves and needs in the order of the calculated row_numbers.
And assigns what's still available for what's still needed.
declare @HaveNeedInv table (
    item int,
    rn int,
    loc varchar(1),
    have int,
    need int,
    primary key (item, rn, loc, have, need)
);

insert into @HaveNeedInv (item, loc, have, need, rn)
select item, location, sum(have), 0 as need,
       row_number() over (partition by item order by sum(have) desc)
from HaveInv
where have > 0
group by item, location;

insert into @HaveNeedInv (item, loc, have, need, rn)
select item, location, 0 as have, sum(need),
       row_number() over (partition by item order by sum(need) desc)
from NeedInv
where need > 0
group by item, location;
;with ASSIGN as
(
select h.item, 0 as lvl,
h.rn as hrn, n.rn as nrn,
h.loc as hloc, n.loc as nloc,
h.have, n.need,
iif(h.have<=n.need,h.have,n.need) as assign
from @HaveNeedInv h
join @HaveNeedInv n on (n.item = h.item and n.need > 0 and n.rn = 1)
where h.have > 0 and h.rn = 1
union all
select t.item, a.lvl + 1,
iif(t.have>0,t.rn,a.hrn),
iif(t.need>0,t.rn,a.nrn),
iif(t.have>0,t.loc,a.hloc),
iif(t.need>0,t.loc,a.nloc),
iif(a.have>a.assign,a.have-a.assign,t.have),
iif(a.need>a.assign,a.need-a.assign,t.need),
case
when t.have > 0
then case
when t.have > (a.need - a.assign) then a.need - a.assign
else t.have
end
else case
when t.need > (a.have - a.assign) then a.have - a.assign
else t.need
end
end
from ASSIGN a
join @HaveNeedInv t
on t.item = a.item
and iif(a.have>a.assign,t.need,t.have) > 0
and t.rn = iif(a.have>a.assign,a.nrn,a.hrn) + 1
)
select
item,
hloc as SourceLocation,
nloc as TargetLocation,
assign as Qty
from ASSIGN
where assign > 0
order by item, hloc, nloc
option (maxrecursion 1000);
Result:
100 B A 3
100 D A 1
100 D C 2
200 A B 1
200 D C 1
300 C A 3
300 C B 5
Changing the order in the row_numbers (used to fill @HaveNeedInv) will change the priority, and could return a different result.
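For what it's worth, the recursion is essentially a two-pointer walk over the two lists; a Python sketch of the same loop (illustrative only, assuming the same descending-quantity order used for the row_numbers):

```python
# Sketch of the recursive CTE's walk: both lists ordered by descending
# quantity, each step assigns min(still-have, still-need) and advances
# whichever side was exhausted.
def assign(haves, needs):
    haves = sorted(haves, key=lambda x: -x[1])   # (location, have)
    needs = sorted(needs, key=lambda x: -x[1])   # (location, need)
    out, i, j = [], 0, 0
    have = haves[0][1] if haves else 0
    need = needs[0][1] if needs else 0
    while i < len(haves) and j < len(needs):
        qty = min(have, need)
        if qty > 0:
            out.append((haves[i][0], needs[j][0], qty))
        have -= qty
        need -= qty
        if have == 0:
            i += 1
            have = haves[i][1] if i < len(haves) else 0
        if need == 0:
            j += 1
            need = needs[j][1] if j < len(needs) else 0
    return out

# Item 300 from the sample data: C has 20, D has 5; B needs 5, A needs 3.
print(assign([("C", 20), ("D", 5)], [("A", 3), ("B", 5)]))
# [('C', 'B', 5), ('C', 'A', 3)]
```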

Related

Group by range of values in bigquery

Is there any way in Bigquery to group by not the absolute value but a range of values?
I have a query that looks in a product table with 4 different numeric group by's.
What I am looking for is an efficient way to group by in a way like:
group by "A±1000" etc. or "A±10%ofA".
thanks in advance,
You can generate a column as a "named range" then group by the column. As an example for your A+-1000 case:
with data as (
select 100 as v union all
select 200 union all
select 2000 union all
select 2100 union all
select 2200 union all
select 4100 union all
select 8000 union all
select 8000
)
select count(v), ARRAY_AGG(v), ranges
FROM data, unnest([0, 2000, 4000, 6000, 8000]) ranges
WHERE data.v >= ranges - 1000 AND data.v < ranges + 1000
GROUP BY ranges
Output:
+-----+------------------------+--------+
| f0_ | f1_                    | ranges |
+-----+------------------------+--------+
| 2   | ["100","200"]          | 0      |
| 3   | ["2000","2100","2200"] | 2000   |
| 1   | ["4100"]               | 4000   |
| 2   | ["8000","8000"]        | 8000   |
+-----+------------------------+--------+
Below example is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.example` AS (
SELECT * FROM
UNNEST([STRUCT<id INT64, price FLOAT64>
(1, 15), (2, 50), (3, 125), (4, 150), (5, 175), (6, 250)
])
)
SELECT
CASE
WHEN price > 0 AND price <= 100 THEN ' 0 - 100'
WHEN price > 100 AND price <= 200 THEN '100 - 200'
ELSE '200+'
END AS range_group,
COUNT(1) AS cnt
FROM `project.dataset.example`
GROUP BY range_group
-- ORDER BY range_group
with result
Row | range_group | cnt
1   | 0 - 100     | 2
2   | 100 - 200   | 3
3   | 200+        | 1
As you can see, in the above solution you need to construct a CASE statement to reflect your ranges - if you have many, this can be quite tedious - so below is a more generic (but more verbose) solution, which uses the recently introduced RANGE_BUCKET function
#standardSQL
WITH `project.dataset.example` AS (
SELECT * FROM
UNNEST([STRUCT<id INT64, price FLOAT64>
(1, 15), (2, 50), (3, 125), (4, 150), (5, 175), (6, 250)
])
), ranges AS (
SELECT [100.0, 200.0] ranges_array
), temp AS (
SELECT OFFSET, IF(prev_val = val, CONCAT(prev_val, ' - '), CONCAT(prev_val, ' - ', val)) rng FROM (
SELECT OFFSET, IFNULL(CAST(LAG(val) OVER(ORDER BY OFFSET) AS STRING), '') prev_val, CAST(val AS STRING) AS val
FROM ranges, UNNEST(ARRAY_CONCAT(ranges_array, [ARRAY_REVERSE(ranges_array)[OFFSET(0)]])) val WITH OFFSET
)
)
SELECT
RANGE_BUCKET(price, ranges_array) range_group,
rng,
COUNT(1) AS cnt
FROM `project.dataset.example`, ranges
JOIN temp ON RANGE_BUCKET(price, ranges_array) = OFFSET
GROUP BY range_group, rng
-- ORDER BY range_group
with result
Row | range_group | rng       | cnt
1   | 0           | - 100     | 2
2   | 1           | 100 - 200 | 3
3   | 2           | 200 -     | 1
As you can see, in the second solution you only need to define your ranges in the ranges CTE as a simple array listing your boundaries: SELECT [100.0, 200.0] ranges_array.
The temp CTE then does all the needed calculations.
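For reference, RANGE_BUCKET(point, boundaries) returns the 0-based number of boundaries that are less than or equal to the point (per BigQuery's documentation), so with boundaries [100.0, 200.0] prices below 100 fall in bucket 0, prices in [100, 200) in bucket 1, and the rest in bucket 2. A minimal sketch of those semantics (Python, illustrative only):

```python
import bisect

def range_bucket(point, boundaries):
    # bisect_right counts boundaries <= point, mirroring RANGE_BUCKET
    return bisect.bisect_right(boundaries, point)

bounds = [100.0, 200.0]
print([range_bucket(p, bounds) for p in (15, 50, 125, 150, 175, 250)])
# [0, 0, 1, 1, 1, 2]
```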
You can do math operations on the GROUP BY, creating groups by any arbitrary criteria.
For example:
WITH data AS (
SELECT repo.name, COUNT(*) price
FROM `githubarchive.month.201909`
GROUP BY 1
HAVING price>100
)
SELECT FORMAT('range %i-%i', MIN(price), MAX(price)) price_range, COUNT(*) c
FROM data
GROUP BY CAST(LOG(price) AS INT64)
ORDER BY MIN(price)

Issue with Row_Number() Over Partition

I've been trying to reset the row_number when the value changes in the Value column, and I have no idea how to do this.
This is my SQL snippet:
WITH Sch(SubjectID, VisitID, Scheduled,Actual,UserId,RLev,SubjectTransactionID,SubjectTransactionTypeID,TransactionDateUTC,MissedVisit,FieldId,Value) as
(
select
svs.*,
CASE WHEN stdp.FieldID = 'FrequencyRegime' and svs.SubjectTransactionTypeID in (2,3) THEN
stdp.FieldID
WHEN stdp.FieldID is NULL and svs.SubjectTransactionTypeID = 1
THEN NULL
WHEN stdp.FieldID is NULL
THEN 'FrequencyRegime'
ELSE stdp.FieldID
END AS [FieldID],
CASE WHEN stdp.Value is NULL and svs.SubjectTransactionTypeID = 1
THEN NULL
WHEN stdp.Value IS NULL THEN
(SELECT TOP 1 stdp.Value from SubjectTransaction st
JOIN SubjectTransactionDataPoint STDP on stdp.SubjectTransactionID = st.SubjectTransactionID and stdp.FieldID = 'FrequencyRegime'
where st.SubjectID = svs.SubjectID
order by st.ServerDateST desc)
ELSE stdp.Value END AS [Value]
from SubjectVisitSchedule svs
left join SubjectTransactionDataPoint stdp on svs.SubjectTransactionID = stdp.SubjectTransactionID and stdp.FieldID = 'FrequencyRegime'
)
select
Sch.*,
CASE WHEN sch.Value is not NULL THEN
ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID)
ELSE NULL
END as [FrequencyCounter],
CASE WHEN Sch.Value = 1 THEN 1--v.Quantity
WHEN Sch.Value = 2 and (ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID) % 2) <> 0
THEN 0
WHEN Sch.Value = 2 and (ROW_NUMBER() over(partition by Sch.Value, Sch.SubjectID order by Sch.SubjectID, Sch.VisitID) % 2) = 0
THEN 1
ELSE NULL
END AS [DispenseQuantity]
from Sch
--left join VisitDrugAssignment v on v.VisitID = Sch.VisitID
where SubjectID = '4E80718E-D0D8-4250-B5CF-02B7A259CAC4'
order by SubjectID, VisitID
This is my Dataset:
Based on the dataset, I am trying to reset the FrequencyCounter to 1 every time the value changes for each subject. Right now it does 50% of what I want: it counts when value 1 or 2 is found, but when value 1 comes again after value 2, it continues the count from where it left off. I want the count to restart from the beginning every time the value changes.
It's difficult to reproduce and test without sample data, but if you want to know how to number rows based on a change in column value, the next approach may help. It's probably not the best one, but at least it will give you a good start. Of course, I hope I understand your question correctly.
Data:
CREATE TABLE #Data (
[Id] int,
[Subject] varchar(3),
[Value] int
)
INSERT INTO #Data
([Id], [Subject], [Value])
VALUES
(1, '801', 1),
(2, '801', 2),
(3, '801', 2),
(4, '801', 2),
(5, '801', 1),
(6, '801', 2),
(7, '801', 2),
(8, '801', 2)
Statement:
;WITH ChangesCTE AS (
SELECT
*,
CASE
WHEN LAG([Value]) OVER (PARTITION BY [Subject] ORDER BY [Id]) <> [Value] THEN 1
ELSE 0
END AS [Change]
FROM #Data
), GroupsCTE AS (
SELECT
*,
SUM([Change]) OVER (PARTITION BY [Subject] ORDER BY [Id]) AS [GroupID]
FROM ChangesCTE
)
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY [GroupID] ORDER BY [Id]) AS Rn
FROM GroupsCTE
Result:
--------------------------------------
Id Subject Value Change GroupID Rn
--------------------------------------
1 801 1 0 0 1
2 801 2 1 1 1
3 801 2 0 1 2
4 801 2 0 1 3
5 801 1 1 2 1
6 801 2 1 3 1
7 801 2 0 3 2
8 801 2 0 3 3
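The same change-flag / running-sum idea can be sketched outside SQL (Python, purely to illustrate the grouping):

```python
# Sketch of the ChangesCTE/GroupsCTE logic: flag each change (LAG <> value),
# running-sum the flags into group ids, then number rows within each group.
def number_by_change(values):
    rows, group, prev, rn = [], 0, None, 0
    for v in values:
        if prev is not None and v != prev:
            group += 1            # change flag -> new group id
            rn = 0
        rn += 1                   # ROW_NUMBER() within the group
        rows.append((v, group, rn))
        prev = v
    return rows

print(number_by_change([1, 2, 2, 2, 1, 2, 2, 2]))
# [(1, 0, 1), (2, 1, 1), (2, 1, 2), (2, 1, 3), (1, 2, 1), (2, 3, 1), (2, 3, 2), (2, 3, 3)]
```

The (value, GroupID, Rn) triples match the Value, GroupID and Rn columns in the result table above.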
As per my understanding, you need DENSE_RANK, as you are looking for a row number that only changes when the value changes. The syntax is as below:
WITH your_table(your_column)
AS
(
SELECT 2 UNION ALL
SELECT 10 UNION ALL
SELECT 2 UNION ALL
SELECT 11
)
SELECT *,DENSE_RANK() OVER (ORDER BY your_column)
FROM your_table

How to calculate a third column based on the values of two previous columns?

My sample data is as follows:
Table T1:
+------+-------+
| Item | Order |
+------+-------+
| A    |    30 |
| B    |     3 |
| C    |    15 |
| A    |    10 |
| B    |     2 |
| C    |    15 |
+------+-------+
Table T2:
+------+-------+----------+--------+
| Item | Stock | Released | Packed |
+------+-------+----------+--------+
| A    |    30 |       10 |      0 |
| B    |    20 |        0 |      5 |
| C    |    10 |        5 |      5 |
+------+-------+----------+--------+
Now, my requirement is to fetch the data in the following form:
+------+-------+-----------+----------------+
| Item | Order | Available | Availability % |
+------+-------+-----------+----------------+
| A    |    40 |        20 |          50.00 |
| B    |     5 |        15 |         100.00 |
| C    |    30 |         0 |           0.00 |
+------+-------+-----------+----------------+
I am able to get the data of the first three columns using:
SELECT
T1.Item AS Item, SUM(T1.Order) AS Order, T2.Stock - T2.Released - T2.Packed AS Available
FROM T1 INNER JOIN T2 ON T1.Item = T2.Item
GROUP BY T1.Item, T2.Stock, T2.Released, T2.Packed
My question is: Is there a way to calculate the third column using the calculated values of columns 2 and 3, instead of writing out the entire formulas used to calculate those two columns? The reason is that the formula for the third column is not small and uses the values of columns 2 and 3 multiple times.
Is there a way to do something like:
(CASE WHEN Available = 0 THEN 0
ELSE (CASE WHEN Available > Order THEN 100 ELSE Available/Order END) END) AS [Availability %]
What would you suggest?
Note: Please ignore the syntax used in the CASE expressions used above, I have used it just to explain the formula.
By using a sub-query (here a CTE) you can do that:
with cte as
(
SELECT
T1.Item AS Item,
SUM(T1.Order) AS Order,
T2.Stock - T2.Released - T2.Packed AS Available
FROM T1 INNER JOIN T2 ON T1.Item = T2.Item
GROUP BY T1.Item, T2.Stock, T2.Released, T2.Packed
) select cte.*,
(
CASE WHEN Available = 0 THEN 0
ELSE (CASE WHEN Available > Order THEN 100 ELSE
100/(Order/Available)*1.00 END
) END) AS [Availability %] from cte
If you don't want to use a CTE, you can do it like this:
declare @t1 table (item varchar(1), orderqty int)
declare @t2 table (item varchar(1), stock int, released int, packed int)
insert into @t1 values ('A', 30), ('B', 3), ('C', 15), ('A', 10), ('B', 2), ('C', 15)
insert into @t2 values ('A', 30, 10, 0), ('B', 20, 0, 5), ('C', 10, 5, 5)
select q.Item,
q.orderqty,
q.available,
case when q.available = 0 then 0
when q.available > q.orderqty then 100
else 100 / (q.orderqty / q.available) -- or whatever formula you need
end as [availability %]
from ( select t1.Item,
sum(t1.orderqty) as orderqty,
t2.Stock - t2.Released - t2.Packed as available
from @t1 t1
left outer join @t2 t2 on t1.Item = t2.Item
group by t1.Item, t2.Stock, t2.Released, t2.Packed
) q
The result is
Item orderqty available availability %
---- -------- --------- --------------
A 40 20 50
B 5 15 100
C 30 0 0
I think that your result table has some mistakes, but you can get the required result by typing:
Select final_tab.Item,
final_tab.ordered,
final_tab.Available,
CASE WHEN final_tab.Available = 0 THEN 0
ELSE
(CASE WHEN final_tab.Available > final_tab.ordered THEN 100
ELSE convert(float,final_tab.Available)/convert(float,final_tab.ordered)*100 END)
END AS [Availability %]
from
(Select tab1.Item,tab1.ordered,
(Table_2.Stock-Table_2.Released-Table_2.Packed)as Available
from
( SELECT Item,sum([Order]) as ordered
FROM Table_1
group by Item )as tab1
left join
Table_2
on tab1.Item=Table_2.Item)as final_tab
You can try playing with the below as well; make sure you test the output.
declare @t1 table ([item] char(1), [order] int)
insert into @t1
values ('A', 30),
       ('B', 3),
       ('C', 30),
       ('A', 15),
       ('A', 10),
       ('B', 2),
       ('C', 15)

declare @t2 table ([item] char(1), [stock] int, [released] int, [packed] int)
insert into @t2
values ('A', 30, 10, 0),
       ('B', 20, 0, 5),
       ('C', 10, 5, 5)
SELECT
T1.Item AS Item,
SUM(T1.[Order]) AS [Order],
T2.Stock - T2.Released as Available,
case when SUM(T1.[Order]) < (T2.Stock - T2.Released) then 100
else cast(cast((T2.Stock - T2.Released) as decimal) / cast(SUM(T1.[Order]) as decimal) * 100 as decimal(4,2))
end AS AvailablePercentage
FROM @t1 t1
INNER JOIN @t2 t2 ON t1.Item = t2.Item
GROUP BY T1.Item, T2.Stock, T2.Released, T2.Packed

GroupBy with respect to record intervals on another table

I prepared a sql fiddle for my question; it contains working code. I am asking whether there exists an alternative solution that I did not think of.
CREATE TABLE [Product]
([Timestamp] bigint NOT NULL PRIMARY KEY,
[Value] float NOT NULL
)
;
CREATE TABLE [PriceTable]
([Timestamp] bigint NOT NULL PRIMARY KEY,
[Price] float NOT NULL
)
;
INSERT INTO [Product]
([Timestamp], [Value])
VALUES
(1, 5),
(2, 3),
(4, 9),
(5, 2),
(7, 11),
(9, 3)
;
INSERT INTO [PriceTable]
([Timestamp], [Price])
VALUES
(1, 1),
(3, 4),
(7, 2.5),
(10, 3)
;
Query:
SELECT [Totals].*, [PriceTable].[Price]
FROM
(
SELECT [PriceTable].[Timestamp]
,SUM([Value]) AS [TotalValue]
FROM [Product],
[PriceTable]
WHERE [PriceTable].[Timestamp] <= [Product].[Timestamp]
AND NOT EXISTS (SELECT * FROM [dbo].[PriceTable] pt
WHERE pt.[Timestamp] <= [Product].[Timestamp]
AND pt.[Timestamp] > [PriceTable].[Timestamp])
GROUP BY [PriceTable].[Timestamp]
) AS [Totals]
INNER JOIN [dbo].[PriceTable]
ON [PriceTable].[Timestamp] = [Totals].[Timestamp]
ORDER BY [PriceTable].[Timestamp]
Result
| Timestamp | TotalValue | Price |
|-----------|------------|-------|
|         1 |          8 |     1 |
|         3 |         11 |     4 |
|         7 |         14 |   2.5 |
Here, my first table [Product] contains the product values for different timestamps. And second table [PriceTable] contains the prices for different time intervals. A given price is valid until a new price is set. Therefore the price with timestamp 1 is valid for Products with timestamps 1 and 2.
I am trying to get the total number of products with respect to given prices. The SQL on the fiddle produces what I expect.
Is there a smarter way to get the same result?
By the way, I am using SQLServer 2014.
DECLARE @Product TABLE
(
    [Timestamp] BIGINT NOT NULL PRIMARY KEY,
    [Value] FLOAT NOT NULL
);
DECLARE @PriceTable TABLE
(
    [Timestamp] BIGINT NOT NULL PRIMARY KEY,
    [Price] FLOAT NOT NULL
);

INSERT INTO @Product ( [Timestamp], [Value] )
VALUES ( 1, 5 ), ( 2, 3 ), ( 4, 9 ), ( 5, 2 ), ( 7, 11 ), ( 9, 3 );

INSERT INTO @PriceTable ( [Timestamp], [Price] )
VALUES ( 1, 1 ), ( 3, 4 ), ( 7, 2.5 ), ( 10, 3 );
WITH cte
AS ( SELECT * ,
LEAD(pt.[Timestamp]) OVER ( ORDER BY pt.[Timestamp] ) AS [lTimestamp]
FROM @PriceTable pt
)
SELECT cte.[Timestamp] ,
( SELECT SUM(Value)
FROM @Product
WHERE [Timestamp] >= cte.[Timestamp]
AND [Timestamp] < cte.[lTimestamp]
) AS [TotalValue],
cte.[Price]
FROM cte
Idea is to generate intervals from price table like:
1 - 3
3 - 7
7 - 10
and sum up all values in those intervals.
Output:
Timestamp TotalValue Price
1 8 1
3 11 4
7 14 2.5
10 NULL 3
You can simply add a WHERE clause if you want to filter out rows where no products were sold.
Also you can indicate the default value for LEAD window function if you want to close the last interval like:
LEAD(pt.[Timestamp], 1, 100)
and I guess it would be something like this in production:
LEAD(pt.[Timestamp], 1, GETDATE())
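The interval logic (each price row is valid from its timestamp until the next price timestamp) can also be sketched outside SQL to check the totals; a small illustrative Python version, with 0 where the SQL returns NULL:

```python
import bisect

# Sketch: assign each product timestamp to the latest price timestamp <= it,
# mirroring the half-open intervals [ts, next_ts) the LEAD query builds.
def total_per_price(prices, products):
    starts = [ts for ts, _ in prices]          # sorted price timestamps
    totals = {ts: 0 for ts in starts}
    for ts, value in products:
        i = bisect.bisect_right(starts, ts) - 1
        if i >= 0:
            totals[starts[i]] += value
    return [(ts, totals[ts], price) for ts, price in prices]

prices = [(1, 1), (3, 4), (7, 2.5), (10, 3)]
products = [(1, 5), (2, 3), (4, 9), (5, 2), (7, 11), (9, 3)]
print(total_per_price(prices, products))
# [(1, 8, 1), (3, 11, 4), (7, 14, 2.5), (10, 0, 3)]
```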
I think I've got a query which is easier to read. Does this work for you?
select pt.*,
(select sum(P.Value) from Product P where
P.TimeStamp between pt.TimeStamp and (
--get the next time stamp
select min(TimeStamp)-1 from PriceTable where TimeStamp > pt.TimeStamp
)) as TotalValue from PriceTable pt
--exclude entries with timestamps greater than those in Product table
where pt.TimeStamp < (select max(TimeStamp) from Product)
Very detailed question BTW
You could use a cte
;with cte as
(
select p1.[timestamp] as lowval,
case
when p2.[timestamp] is not null then p2.[timestamp] - 1
else 999999
end hival,
p1.price
from
(
select p1.[timestamp],p1.price,
row_number() over (order by p1.[timestamp]) rn
from pricetable p1 ) p1
left outer join
(select p1.[timestamp],p1.price,
row_number() over (order by p1.[timestamp]) rn
from pricetable p1) p2
on p2.rn = p1.rn + 1
)
select cte.lowval as 'timestamp',sum(p1.value) TotalValue,cte.price
from product p1
join cte on p1.[Timestamp] between cte.lowval and cte.hival
group by cte.lowval,cte.price
order by cte.lowval
It's a lot easier to understand, and the execution plan compares favourably with your query (about 10% cheaper).

Values present in a group on a range of numbers (SQL)

I would like to know how much of a model (let's say t-shirts) with a given range of sizes (let's say 3: X, Y, Z) I have in stock on a given date and in a given store (let's say 3: A, B, C).
Where:
X = between 40 and 50
Y = between 30 and 60
Z = between 20 and 70
The final output would look something like this (but with a lot of results):
Date | Store | Model | Availability X | Availability Y | Availability Z
02/26 | A | shirt | Yes | Yes | No
02/26 | B | shirt | Yes | No | No
02/26 | C | shirt | Yes | Yes | Yes
Availability means I have to have ALL the sizes within the given range in stock.
I'm still trying to figure out a way to do that. The tables I have right now are designed like this (some illustrative info):
Table "sets"
id | name | initial_value | final_value
1 | X | 40 | 50
2 | Y | 30 | 60
3 | Z | 20 | 70
Table "items"
id | date | store | model | size | in_stock
1 | 02/26 | A | shirt | 40 | 1
2 | 02/26 | A | shirt | 50 | 2
3 | 02/26 | A | shirt | 30 | 0
4 | 02/26 | B | shirt | 30 | 1
I appreciate any help! Thanks.
Here is a solution for SQL Server; I don't know about PostgreSQL.
-- Create SETS
create table dbo.test_sets
(
id int not null,
name varchar(255),
initial_value int not null default (0),
final_value int not null default(0)
)
go
insert into dbo.test_sets( id, name, initial_value, final_value)
values (1, 'X', 40, 50)
insert into dbo.test_sets( id, name, initial_value, final_value)
values (2, 'Y', 30, 60)
insert into dbo.test_sets( id, name, initial_value, final_value)
values (3, 'Z', 20, 70)
go
-- Create ITEMS
create table dbo.test_items
(
id int not null,
[date] date,
store varchar(255) not null,
model varchar(255) not null,
size int not null default (0),
in_stock int not null default(0)
)
go
insert into dbo.test_items( id, [date], store, model, size, in_stock)
values (1, '02/26/2016', 'A', 'shirt', 40, 1)
insert into dbo.test_items( id, [date], store, model, size, in_stock)
values (2, '02/26/2016', 'A', 'shirt', 50, 2)
insert into dbo.test_items( id, [date], store, model, size, in_stock)
values (3, '02/26/2016', 'A', 'shirt', 30, 0)
insert into dbo.test_items( id, [date], store, model, size, in_stock)
values (4, '02/26/2016', 'B', 'shirt', 30, 1)
insert into dbo.test_items( id, [date], store, model, size, in_stock)
values (5, '02/26/2016', 'C', 'shirt', 80, 1)
go
-- Create NUMBERS LOOKUP
create table dbo.test_numbers
(
id int not null
)
go
declare @first as int
declare @step as int
declare @last as int

select @first = 1, @step = 1, @last = 100

BEGIN TRANSACTION
WHILE (@first <= @last)
BEGIN
    INSERT INTO dbo.test_numbers VALUES (@first)
    SET @first += @step
END
COMMIT TRANSACTION
go
-- Query to provide required output
;with unique_store_models as
(
select distinct store, model from dbo.test_items
),
set_sizes as
(
select ts.id, ts.name as size_group, tn.id as size
from
dbo.test_sets ts
inner join dbo.test_numbers tn on
tn.id between ts.initial_Value and ts.final_value
),
items_by_sizes_flat as
(
select
ti.[date],
usm.store,
usm.model,
ss.size_group,
ss.size,
ti.in_stock
from
unique_store_models usm
left outer join dbo.test_items ti on
ti.store = usm.store
and ti.model = usm.model
left outer join set_sizes ss on
ss.size = ti.size
),
items_by_sizes_pivoted as
(
select
*
from
(
select
[date],
store,
model,
size_group,
--size,
in_stock
from
items_by_sizes_flat
) as p
PIVOT
(
count(in_stock) for size_group in ([X], [Y], [Z])
) as pv
)
select
[date],
store,
model,
case
when [X] > 0 then 'Yes' else 'No'
end as [Availability X],
case
when [Y] > 0 then 'Yes' else 'No'
end as [Availability Y],
case
when [Z] > 0 then 'Yes' else 'No'
end as [Availability Z]
from
items_by_sizes_pivoted
Maybe you could try something like this:
SELECT date,
store,
model,
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 1) AND (SELECT final_value FROM Table "sets" WHERE id = 1)
THEN in_stock
ELSE 0
END
) as "Availability X",
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 2) AND (SELECT final_value FROM Table "sets" WHERE id = 2)
THEN in_stock
ELSE 0
END
) as "Availability Y",
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 3) AND (SELECT final_value FROM Table "sets" WHERE id = 3)
THEN in_stock
ELSE 0
END
) as "Availability Z"
FROM Table "items"
WHERE date >= '02/26/2016'
AND
date <= '02/26/2016'
AND
Model = 'shirt'
GROUP BY date, store, model
I think that would give you the information you're after, though if you wanted output exactly as you have it, then you could wrap another case statement around each availability case statement, or use a CTE like below:
WITH
data AS
(
SELECT date,
store,
model,
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 1) AND (SELECT final_value FROM Table "sets" WHERE id = 1)
THEN in_stock
ELSE 0
END
) as availability_x,
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 2) AND (SELECT final_value FROM Table "sets" WHERE id = 2)
THEN in_stock
ELSE 0
END
) as availability_y,
SUM(
CASE
WHEN size BETWEEN (SELECT initial_value FROM Table "sets" WHERE id = 3) AND (SELECT final_value FROM Table "sets" WHERE id = 3)
THEN in_stock
ELSE 0
END
) as availability_z
FROM Table "items"
WHERE date >= '02/26/2016'
AND
date <= '02/26/2016'
AND
Model = 'shirt'
GROUP BY date, store, model
)
SELECT date, store, model,
CASE
WHEN availability_x > 0 THEN "Yes"
ELSE "No"
END as "Availability X",
CASE
WHEN availability_y > 0 THEN "Yes"
ELSE "No"
END as "Availability Y",
CASE
WHEN availability_z > 0 THEN "Yes"
ELSE "No"
END as "Availability Z"
FROM data
This is a classic crosstab() use case.
Sample data:
-- DDL and data
CREATE TABLE items(
id SERIAL PRIMARY KEY,
"Date" DATE,
store TEXT,
model TEXT,
size INTEGER,
in_stock INTEGER
);
INSERT INTO items VALUES
(1, '02/26/2016':: DATE, 'A', 'shirt', 40, 1),
(2, '02/26/2016':: DATE, 'A', 'shirt', 50, 2),
(3, '02/26/2016':: DATE, 'A', 'shirt', 30, 0),
(4, '02/26/2016':: DATE, 'B', 'shirt', 30, 1);
CREATE TABLE sets(
id SERIAL PRIMARY KEY,
name TEXT,
initial_value INTEGER,
final_value INTEGER
);
INSERT INTO sets VALUES
(1, 'X', 40, 50),
(2, 'Y', 30, 60),
(3, 'Z', 20, 70);
For selecting, I used the int4range(start, end, inclusion) function to configure size-range inclusion. The query itself:
SELECT * FROM crosstab(
'SELECT i.store,i."Date",i.model,s.name,
bool_or(CASE WHEN size_range #> i.size
THEN TRUE
ELSE FALSE
END)
FROM items i,sets s,int4range(s.initial_value, s.final_value, ''[)'') AS size_range
WHERE i.in_stock > 0
GROUP BY 1,2,3,4
ORDER BY 1,2',
'SELECT DISTINCT(name) FROM sets ORDER BY 1')
AS output(store TEXT,"Date" DATE,model TEXT,"Availability X" BOOLEAN,"Availability Y" BOOLEAN,"Availability Z" BOOLEAN);
Result:
store | Date | model | Availability X | Availability Y | Availability Z
-------+------------+-------+----------------+----------------+----------------
A | 2016-02-26 | shirt | t | t | t
B | 2016-02-26 | shirt | f | t | t
(2 rows)