T-SQL: Loop alternative to reduce exec time

This is my 1st post here, glad to become part of a community I've been following for a long time! I haven't found information on this specific topic despite my research, so here I am.
I'm analyzing transaction data to understand financial margin development, separating different effects such as volume, mix, price, cost etc.
This analysis needs to be done at various levels of granularity, e.g. restricted to a certain geographical scope or a given product category. (80k+ analysis levels possible)
I need to compute those results in one batch to store them into an Excel tool.
To do so:
I have a 1st procedure (450 lines with indexes, joins, etc.) with 10 input variables, returning a single-row table containing the output I need for 1 level of analysis (out of the 80k+ mentioned above). It currently takes 45 seconds to run. [Working on reducing that time as well; separate problem.]
I want to execute this 1st procedure for 80k+ input permutations. I have the 2nd portion of code below, which runs on a simple input set. However, that would take too long to process all permutations...
Is there any other approach which would be faster? Such as a join between the possible-permutations table and the transaction data? (See the sketch after the sample output below.)
What would you do to improve performance?
declare @TimeframeVar varchar(55)
declare @EndMonthVar float
declare @EndPeriodVar float
declare @BUVar varchar(55)
...
declare @RegionVar varchar(55)
declare @BranchVar varchar(55)
...
declare @ProductCategoryVar varchar(55)
...
if object_id('tempdb..#Variables') is not null drop table #Variables
select *
into #Variables
from xx.Variables
while exists (select 1 from #Variables)
BEGIN
    -- read all values from a single row in one statement so they stay consistent
    -- (separate "select top 1" calls without an order by are not guaranteed
    -- to hit the same row each time)
    select top 1
        @TimeframeVar = Timeframe
        ,@EndMonthVar = EndMonth
        ,@EndPeriodVar = EndPeriod
        ,@BUVar = BU
        ...
        ,@RegionVar = Region
        ,@BranchVar = Branch
        ...
        ,@ProductCategoryVar = ProductCategory
        ...
    from #Variables

    exec xx.MarginAnalysis_XX
        @TimeframeVar
        ,@EndMonthVar
        ,@EndPeriodVar
        ,@BUVar
        ...
        ,@RegionVar
        ,@BranchVar
        ...
        ,@ProductCategoryVar
        ...

    delete from #Variables
    where Timeframe = @TimeframeVar
      and EndMonth = @EndMonthVar
      and EndPeriod = @EndPeriodVar
      and BU = @BUVar
      ...
      and Region = @RegionVar
      and Branch = @BranchVar
      ...
      and ProductCategory = @ProductCategoryVar
      and OrderSize = @OrderSizeVar
      ...
END
Thanks a lot for your help!
EDIT :
EXPECTED FINAL OUTPUT (with 80k+ rows)
+-------------+-----------+--------+-------+-----+-------------+
| Granularity | MarginIni | Volume | Price | ... | MarginFinal |
+-------------+-----------+--------+-------+-----+-------------+
| A | 100 | +20 | -30 | | 90 |
+-------------+-----------+--------+-------+-----+-------------+
| B | 200 | 150 | -30 | | 320 |
+-------------+-----------+--------+-------+-----+-------------+
| C | .. | ... | ... | | ... |
+-------------+-----------+--------+-------+-----+-------------+
INPUT FROM 1st PROC (with 1 row)
+-------------+-----------+--------+-------+-----+-------------+
| Granularity | MarginIni | Volume | Price | ... | MarginFinal |
+-------------+-----------+--------+-------+-----+-------------+
| A | 100 | +20 | -30 | | 90 |
+-------------+-----------+--------+-------+-----+-------------+
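For reference, a minimal sketch of the join-based idea raised above: instead of calling the procedure once per permutation, join the permutation table to the transaction data and aggregate every analysis level in a single pass. The transaction table name and its columns below are assumptions, and the lone SUM stands in for the full margin/effect logic:
-- Hedged sketch only: xx.Transactions and its column names are assumptions;
-- sum(t.Margin) stands in for the 450-line margin calculation.
select v.Timeframe, v.BU, v.Region, v.Branch, v.ProductCategory
    ,sum(t.Margin) as MarginIni -- one expression per effect in the real version
into #Results -- one row per analysis level
from xx.Variables v
join xx.Transactions t
    on t.BU = v.BU
    and t.Region = v.Region
    and t.Branch = v.Branch
    and t.ProductCategory = v.ProductCategory
    and t.PeriodMonth <= v.EndMonth
group by v.Timeframe, v.BU, v.Region, v.Branch, v.ProductCategory
A single set-based pass like this usually beats tens of thousands of procedure calls, because the transaction data is scanned once instead of once per permutation. If some permutations mean "all values" at a level, GROUP BY GROUPING SETS or a join condition such as (t.Region = v.Region or v.Region = 'ALL') can cover those rollups.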

Related

SQL - Set column value to the SUM of all references

I want the column "CurrentCapacity" to be the SUM of the referencing rows' Size column.
Let's say there are three rows in SecTable which all have FirstTableID = 1, with Size values 1, 1 and 3.
The row in FirstTable which has ID = 1 should then have a value of 5 in the CurrentCapacity column.
How can I do this, and how can it happen automatically on insert, update and delete?
Thanks!
FirstTable
+----+-------------+-------------------------+
| ID | MaxCapacity | CurrentCapacity |
+----+-------------+-------------------------+
| 1 | 5 | 0 (desired result = 5) |
+----+-------------+-------------------------+
| 2 | 5 | 0 |
+----+-------------+-------------------------+
| 3 | 5 | 0 |
+----+-------------+-------------------------+
SecTable
+----+-------------------+------+
| ID | FirstTableID (FK) | Size |
+----+-------------------+------+
| 1 | 1 | 2 |
+----+-------------------+------+
| 2 | 1 | 3 |
+----+-------------------+------+
In general, a view is a better solution than trying to keep a calculated column up-to-date. For your example, you could use this:
CREATE VIEW capacity AS
SELECT f.ID, f.MaxCapacity, COALESCE(SUM(s.Size), 0) AS CurrentCapacity
FROM FirstTable f
LEFT JOIN SecTable s ON s.FirstTableID = f.ID
GROUP BY f.ID, f.MaxCapacity
Then you can simply
SELECT *
FROM capacity
to get the results you desire. For your sample data:
ID MaxCapacity CurrentCapacity
1 5 5
2 5 0
3 5 0
Got this to work with this trigger:
CREATE TRIGGER UpdateCurrentCapacity
ON SecTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
SET NOCOUNT ON
-- Note: this recalculates IDs 1 through 100 on every change,
-- regardless of which rows were actually affected.
DECLARE @Iteration INT
SET @Iteration = 1
WHILE @Iteration <= 100
BEGIN
UPDATE FirstTable
SET FirstTable.CurrentCapacity = (SELECT COALESCE(SUM(SecTable.Size), 0)
                                  FROM SecTable
                                  WHERE FirstTableID = @Iteration)
WHERE ID = @Iteration;
SET @Iteration = @Iteration + 1
END
END
GO
GO
Personally, I would not use a trigger or store CurrentCapacity as a value either, since it breaks normalization rules for database design. You have a relation and can already get the results by creating a view or by making CurrentCapacity a computed column.
Your view can look like this:
SELECT FT.Id, FT.MaxCapacity, ISNULL(O.SumSize, 0) AS CurrentCapacity
FROM dbo.FirstTable FT
OUTER APPLY
(
    SELECT ST.FirstTableId, SUM(ST.Size) AS SumSize
    FROM SecTable ST
    WHERE ST.FirstTableId = FT.Id
    GROUP BY ST.FirstTableId
) O
Sure, you could fire a proc every time a row is inserted, updated, or deleted in the second table and recalculate the column, but you might as well calculate it on the fly. If the column isn't required to be accurate at all times, you can have a job update the values every X hours. You could combine this with your view to have both a "live" and a "cached" version of the capacity data.
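If you go the scheduled-job route, a minimal sketch of the refresh statement such a job might run, using the table and column names from the question:
-- Recompute the cached column for every row in one set-based pass.
UPDATE f
SET f.CurrentCapacity = COALESCE(s.SumSize, 0)
FROM FirstTable f
LEFT JOIN (
    SELECT FirstTableID, SUM(Size) AS SumSize
    FROM SecTable
    GROUP BY FirstTableID
) s ON s.FirstTableID = f.ID;
Running that on a schedule gives you the "cached" version, while the view above stays the "live" one.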

Update an column in SQL Server with values from a different lookup table?

I'm working on a project where I need to update a Mailing Rate column that is currently blank. I have a few different tables of rates/costs based on the type of delivery service. Each is essentially a table of weights and zones where the cost depends on weight (y axis) and distance travelled (x axis).
The table I need to update lists the type of delivery service, associated weight, and associated distance travelled for each piece of mail. I'm having trouble figuring out the logic for a query that updates the rates column in this table from the corresponding table's lookup of weight/travel distance for each service.
Below is an example of the tables:
| Type of Service | Weight | Distance | Cost |
+-----------------+--------+----------+------+
| A               | 1      | 15       | ?    |
| B               | 2      | 20       | ?    |
| C               | 3      | 10       | ?    |
+-----------------+--------+----------+------+

Service A Table
+--------+--------+--------+--------+
| Weight | 10 km  | 15 km  | 20 km  |
+--------+--------+--------+--------+
| 1      | $25.00 | $30.00 | $40.00 |
| 2      | $27.00 | $32.00 | $41.00 |
| 3      | $28.00 | $34.00 | $43.00 |
+--------+--------+--------+--------+

Service B Table
+--------+--------+--------+--------+
| Weight | 10 km  | 15 km  | 20 km  |
+--------+--------+--------+--------+
| 1      | $28.00 | $32.00 | $41.00 |
| 2      | $29.00 | $35.00 | $44.00 |
| 3      | $30.00 | $37.00 | $47.00 |
+--------+--------+--------+--------+
You can use left join to bring in all the tables . . . and a lot of case logic to choose the columns:
update t
set cost = (case when distance = '10 km'
                 then coalesce(sa.[10 km], sb.[10 km], sc.[10 km])
                 when distance = '15 km'
                 then coalesce(sa.[15 km], sb.[15 km], sc.[15 km])
                 when distance = '20 km'
                 then coalesce(sa.[20 km], sb.[20 km], sc.[20 km])
            end)
from t left join
     servicea sa
     on sa.weight = t.weight and t.service = 'A' left join
     serviceb sb
     on sb.weight = t.weight and t.service = 'B' left join
     servicec sc
     on sc.weight = t.weight and t.service = 'C';
Since you have different tables for each type of service it is going to be hard for you if you add new types of service. Optimally the type of service would be a column in one table for the costs. You could create views for each type of service if for some reason you want to view your data that way. I would organize the data like this for easy lookup:
create table ServiceCosts(
    ServiceType char(1),
    MaxWeight decimal(9, 2),
    MaxDistance decimal(9, 2),
    Cost decimal(9, 2),
    constraint pk_ServiceCosts primary key clustered (ServiceType, MaxWeight, MaxDistance)
)
Then you could update the table using an outer apply:
update s
set s.Cost = oa.Cost
from SourceTable s
outer apply (
    select top(1) Cost
    from ServiceCosts sc
    where sc.ServiceType = s.ServiceType
      and sc.MaxWeight >= s.Weight
      and sc.MaxDistance >= s.Distance
    order by sc.Cost -- cheapest that fits under weight and distance
) oa
There are a few options you can use; which one is best depends on the exact criteria of your data. Does every table have every weight? Every distance? What do you do if there is no record meeting your criteria (Cost would be null here)?
Another option would be to create a function...
create function dbo.GetCost(
    @ServiceType char(1),
    @Weight decimal(9, 2),
    @Distance decimal(9, 2)
) returns decimal(9, 2)
as
begin
    declare @Cost decimal(9, 2) = null;
    if @ServiceType = 'A'
        select top(1) @Cost = case
            when @Distance <= 10 then [10 km]
            when @Distance <= 15 then [15 km]
            when @Distance <= 20 then [20 km]
            else null
        end
        from ServiceATable
        where Weight >= @Weight order by Weight -- least weight at or above limit
    -- else if @ServiceType = 'B' ... repeat for each service table
    return @Cost;
end
Then update:
update ItemTable
set Cost = dbo.GetCost(ServiceType, Weight, Distance)
where Cost is null; -- only update rows without Cost already
You can use a MERGE statement between the main table you want to calculate the cost for and a rates table: merge on weight and distance, and set the cost based on the service type.
Consult the MERGE documentation; it should be very easy to do. I could write it out here, but it's better learning for you.
Here is an example:
https://www.sqlservertutorial.net/sql-server-basics/sql-server-merge/
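For illustration, a hedged sketch of what that MERGE could look like, assuming the per-service rate tables have first been unpivoted into a single hypothetical Rates table (ServiceType, Weight, Distance, Cost), and calling the table to be priced Mail:
-- Hypothetical names throughout; adjust to your schema.
MERGE Mail AS tgt
USING Rates AS src
    ON src.ServiceType = tgt.[Type of Service]
    AND src.Weight = tgt.Weight
    AND src.Distance = tgt.Distance
WHEN MATCHED THEN
    UPDATE SET tgt.Cost = src.Cost;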

Find best match in tree given a combination of multiple keys

I have a structure / tree that looks similar to this.
CostType is mandatory and can exist by itself, but it can have a parent ProfitType or Unit, and other CostTypes as children.
Only Units can appear multiple times; the other values cannot appear more than once in the structure.
| ID | name    | parent_id | ProfitType | CostType | Unit |
| -: | ------- | --------: | ---------: | -------: | ---: |
|  1 | Root    |    (NULL) |            |          |      |
|  2 | 1       |         1 |        300 |          |      |
|  3 | 1-1     |         2 |            |      111 |      |
|  4 | 1-1-1   |         3 |            |          |    8 |
|  5 | 1-2     |         2 |            |      222 |      |
|  6 | 1-2-1   |         5 |            |      333 |      |
|  7 | 1-2-1-1 |         6 |            |          |    8 |
|  8 | 1-2-1-2 |         6 |            |          |    9 |
| Parameters      | should RETURN  |
| --------------- | -------------- |
| (300,111,8)     | 4              |
| (null,111,8)    | 4              |
| (null,null,8)   | first match, 4 |
| (null,222,8)    | best match, 5  |
| (null,333,null) | 6              |
I am at a loss on how I could create a function that receives (ProfitType, CostType, Unit) and return the best matching ID from the structure.
This isn't giving exactly the answers you provided as examples, but see my comment above: if (null,222,8) should return 7, to match how (null,111,8) returns 4, then this is correct.
Also note that I formatted this using temp tables instead of as a function. I don't want to trip a schema change audit, so I posted what I have as temp tables; I can rewrite it as a function Monday when my DBA is available, but I thought you might need it before the weekend. Just edit the "DECLARE @ProfitType int = ..." lines to the values you want to test.
I also put in quite a few comments because the logic is tricky, but if they aren't enough, leave a comment and I can expand my explanation.
/*
ASSUMPTIONS:
A tree can be of arbitrary depth, but will not exceed the recursion limit (defaults to 100)
All trees will include at least 1 CostType
All trees will have at most 1 ProfitType
CostType can appear multiple times in a traversal from root to leaf (can units?)
*/
SELECT *
INTO #Temp
FROM (VALUES (1,'Root',NULL, NULL, NULL, NULL)
, (2,'1', 1, 300, NULL, NULL)
, (3,'1-1', 2, NULL, 111, NULL)
, (4,'1-1-1', 3, NULL, NULL, 8)
, (5,'1-2', 2, NULL, 222, NULL)
, (6,'1-2-1', 5, NULL, 333, NULL)
, (7,'1-2-1-1', 6, NULL, NULL, 8)
, (8,'1-2-1-2', 6, NULL, NULL, 9)
) as TempTable(ID, RName, Parent_ID, ProfitType, CostType, UnitID)
--SELECT * FROM #Temp
DECLARE @ProfitType int = NULL --300
DECLARE @CostType INT = 333 --NULL --111
DECLARE @UnitID INT = NULL --8
--SELECT * FROM #Temp
;WITH cteMatches as (
--Start with all nodes that match one criteria, default a score of 100
SELECT N.ID as ReportID, *, 100 as Score, 1 as Depth
FROM #Temp AS N
WHERE N.CostType = @CostType OR N.ProfitType = @ProfitType OR N.UnitID = @UnitID
), cteEval as (
--This is a recursive CTE, it has a (default) limit of 100 recursions
--, but that can be raised if your trees are deeper than 100 nodes
--Start with the base case
SELECT M.ReportID, M.RName, M.ID ,M.Parent_ID, M.Score
, M.Depth, M.ProfitType , M.CostType , M.UnitID
FROM cteMatches as M
UNION ALL
--This is the recursive part, add to the list of matches the match when
--its immediate parent is also considered. For that match increase the score
--if the parent contributes another match. Also update the ID of the match
--to the parent's IDs so recursion can keep adding if more matches are found
SELECT M.ReportID, M.RName, N.ID ,N.Parent_ID
, M.Score + CASE WHEN N.CostType = @CostType
OR N.ProfitType = @ProfitType
OR N.UnitID = @UnitID THEN 100 ELSE 0 END as Score
, M.Depth + 1, N.ProfitType , N.CostType , N.UnitID
FROM cteEval as M INNER JOIN #Temp AS N on M.Parent_ID = N.ID
)SELECT TOP 1 * --Drop the "TOP 1 *" to see debugging info (runners up)
FROM cteEval
ORDER BY SCORE DESC, DEPTH
DROP TABLE #Temp
I'm sorry I don't have enough rep to comment.
You'll have to define "best answer" (for example, why isn't the answer to (null,222,8) 7 or null, instead of 5?), but here's the approach I'd use:
Derive a new table where ProfitType and CostType are listed explicitly instead of only by inheritance. I would approach that by using a cursor (how awful, I know) and following parent_id until a ProfitType and CostType are found -- or the root is reached. This presumes an unlimited number of child/grandchild levels for parent_id. If there is a limit, then you can instead use N self-joins, where N is the number of parent_id levels allowed.
Then you run multiple queries against the derived table. The first query would be for an exact match (and then exit if found). The next query would be for the "best" partial match (then exit if found), followed by queries for 2nd best, 3rd best, etc. until you've exhausted your "best" match criteria.
If you need nested parent CostTypes to be part of the "best match" criteria, then I would make duplicate entries in the derived table for each row that has multiple CostTypes with a CostType "level". level 1 is the actual CostType. level 2 is that CostType's parent, level 3 etc. Then your best match queries would return multiple rows and you'd need to pick the row with the lowest level (which is the closest parent/grandparent).
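To make the tiered matching concrete, here is a hedged sketch of a single scored query over the derived table (Derived and its columns are hypothetical names for the result of the first step). Ranking by the number of matched keys collapses the "exact match first, then best partial" sequence of queries into one:
-- Hypothetical: Derived(ID, ProfitType, CostType, Unit) was built by the cursor pass.
SELECT TOP (1) ID
FROM Derived
WHERE (@ProfitType IS NULL OR ProfitType = @ProfitType)
  AND (@CostType IS NULL OR CostType = @CostType)
  AND (@UnitID IS NULL OR Unit = @UnitID)
ORDER BY
      CASE WHEN ProfitType = @ProfitType THEN 1 ELSE 0 END
    + CASE WHEN CostType = @CostType THEN 1 ELSE 0 END
    + CASE WHEN Unit = @UnitID THEN 1 ELSE 0 END DESC;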

Merging two tables with some business logic for overfill

I have a bit of a challenge ahead of me with a report I need to write.
I have an ordered selected list of results, which has the following:
+---------+----------+----------+
| Header | estimate | TargetId |
+---------+----------+----------+
| Task 1 | 80 | 1 |
| Task 2 | 30 | 1 |
| Task 3 | 40 | 2 |
| Task 4 | 10 | 2 |
+---------+----------+----------+
I’d like to join this onto another set of data containing the Target information:
+--------+----------+
| Target | Capacity |
+--------+----------+
| 1 | 100 |
| 2 | 50 |
| 3 | 50 |
+--------+----------+
However, I’d like to do some sort of pivot / cross join to fill each target to capacity, and report this in a way that shows a forecast of when each of the tasks for the target will be met.
+---------+----------+----------+----------+----------+---+---+
| Header | Overfill | Target 1 | Target 2 | Target 3 | … | … |
+---------+----------+----------+----------+----------+---+---+
| Task 1 | No | 80 | 0 | 0 | 0 | 0 |
| Task 2 | Yes | 20 | 10 | 0 | 0 | 0 |
| Task 3 | No | 0 | 40 | 0 | 0 | 0 |
| Task 4 | Yes | 0 | 0 | 10 | 0 | 0 |
+---------+----------+----------+----------+----------+---+---+
Alternatively displayed:
+---------+--------+-----------+
| Header | Target | Overfill% |
+---------+--------+-----------+
| Task 1 | 1 | 0 |
| Task 2 | 1,2 | 33.33 |
| Task 3 | 2 | 0 |
| Task 4 | 3 | 100% |
+---------+--------+-----------+
The actual set of data will involve a few hundred tasks across 20-30 targets. Unfortunately I don’t have any code to show as a demonstration, short of a few simple selects, as I’m not sure how to approach the overfill.
I believe this could be achieved more easily through C#; however, I was hoping this could be completed as a pure stored-procedure operation so I can return the data as I wish to display it.
Any help or a nudge in the right direction to take would be greatly appreciated,
Chris
Doing this in SQL is a bad idea, but it is possible with a recursive CTE. The solution below uses a recursive CTE whose result set maintains the state of the solution as it goes. It queries one record from each source per recursive iteration and updates the state with the results of certain calculations. Depending on the state, it will advance the sequence, the target, or both.
This solution assumes the targets and headers are sequentially ordered. If the targets aren't sequentially ordered, you can use a CTE to add ROW_NUMBER() to the targets. Also, if you have more than 32,767 steps in the solution it will fail, as that is the maximum recursion SQL Server supports; the number of steps should be at most tasks + targets.
One nice thing is that it will handle overfill across multiple targets. For example, if a task has an estimate that will fill up multiple targets, then the next task will start at the next available bucket, not the assigned one. Go ahead and put some crazy numbers in there.
Finally, I didn't know how you were deriving the overfill percentage; I couldn't see how you got the last row's result from your sample data. I doubt the right value would be difficult to derive once the criteria are known.
/** Setup Test Data **/
DECLARE @Tasks TABLE ( Header VARCHAR(20), Estimate INT, TargetId INT );
DECLARE @Targets TABLE ( TargetId INT, Capacity INT );
INSERT INTO @Tasks VALUES
( 'Task 1', 80, 1 ), ( 'Task 2', 30, 1 ), ( 'Task 3', 40, 2 ), ( 'Task 4', 10, 2 );
INSERT INTO @Targets VALUES ( 1, 100 ), ( 2, 50 ), ( 3, 50 );
/** Solution **/
WITH Sequenced AS (
-- Added SequenceId for tasks as it feels janky to order by headers.
SELECT CAST(ROW_NUMBER() OVER (ORDER BY Header) AS INT) [SequenceId], tsk.*
FROM @Tasks tsk
)
, TargetsWithOverflow AS (
SELECT *
FROM @Targets
UNION
SELECT MAX(TargetId) + 1, 99999999 -- overflow target to store excess not handled by targets
FROM @Targets
)
, src AS (
-- intialize state
SELECT 0 [SequenceId], CAST('' AS varchar(20)) [Header], 0 [Estimate], 0 [CurrentTargetId]
, 0 [CurrentTargetFillLevel], 0 [SequenceRemainingEstimate], 0 [OverfillAmt]
UNION ALL
SELECT seq.SequenceId, seq.header, seq.Estimate, tgt.TargetId
, CASE WHEN [Excess] <= 0 THEN TrueFillLevel + TrueEstimate -- capacity meets estimate
ELSE tgt.Capacity -- there is excess estimate
END
, CASE WHEN [Excess] <= 0 THEN 0 -- task complete
ELSE [Excess] -- task is not complete still some of estimate is left
END
, CASE WHEN tgt.TargetId != seq.TargetId THEN
CASE WHEN [Excess] > 0 THEN [TrueEstimate] - [Excess] ELSE [TrueEstimate] END
ELSE 0
END
FROM src
INNER JOIN Sequenced seq ON
(src.SequenceRemainingEstimate = 0 AND seq.SequenceId = src.SequenceId + 1)
OR (src.SequenceRemainingEstimate > 0 AND seq.SequenceId = src.SequenceId)
INNER JOIN TargetsWithOverflow tgt ON
-- Part of target selection is based on if the sequence advanced.
-- If the sequence has advanced then get the target assigned to the sequence
-- Or use the current one if it is GTE to the assigned target.
-- Otherwise get the target after current target.
(tgt.TargetId = seq.TargetId AND tgt.TargetId > src.CurrentTargetId AND seq.SequenceId != src.SequenceId)
OR (tgt.TargetId = src.CurrentTargetId AND tgt.Capacity >= src.CurrentTargetFillLevel AND seq.SequenceId != src.SequenceId)
OR (tgt.TargetId = src.CurrentTargetId + 1 AND seq.SequenceId = src.SequenceId)
CROSS APPLY (
SELECT CASE WHEN tgt.TargetId != src.CurrentTargetId THEN 0 ELSE src.CurrentTargetFillLevel END [TrueFillLevel]
) forFillLevel
CROSS APPLY (
SELECT tgt.Capacity - [TrueFillLevel] [TrueCapacity]
) forCapacity
CROSS APPLY (
SELECT CASE WHEN src.SequenceRemainingEstimate > 0 THEN src.SequenceRemainingEstimate ELSE seq.Estimate END [TrueEstimate]
) forEstimate
CROSS APPLY (
SELECT TrueEstimate - TrueCapacity [Excess]
) forExcess
)
SELECT src.Header
, LEFT(STUFF((SELECT ',' + RTRIM(srcIn.CurrentTargetId)
FROM src srcIn
WHERE srcIn.Header = src.Header
ORDER BY srcIn.CurrentTargetId
FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, ''), 500)
[Target]
, CASE WHEN SUM(OverfillAmt) > 0 THEN 'Yes' ELSE 'No' END [Overfill]
, SUM (OverfillAmt) / (1.0 * AVG(seq.Estimate)) [OverfillPct]
FROM src
INNER JOIN Sequenced seq ON seq.SequenceId = src.SequenceId
WHERE src.SequenceId != 0
GROUP BY src.Header
OPTION (MAXRECURSION 32767)
Output
Header Target Overfill OverfillPct
-------------------- ---------- -------- ----------------
Task 1 1 No 0.00000000000000
Task 2 1,2 Yes 0.33333333333333
Task 3 2 No 0.00000000000000
Task 4 2,3 Yes 1.00000000000000
I just re-read your question and realized that you intend to run this query within a stored procedure. If that's the case, you could take the techniques from this method and adapt them into a solution that uses a cursor. I hate cursors, but I doubt one would work any worse than this solution, and it wouldn't have the recursion limitation. You'd just store the results in a temp table or table variable and return that from the stored procedure.
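If it helps, a skeleton of that cursor variant (the loop body and names are mine, not a finished implementation):
-- Skeleton only: the allocation arithmetic from the CTE would go inside the loop.
DECLARE @Header VARCHAR(20), @Estimate INT, @TargetId INT;
DECLARE task_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT Header, Estimate, TargetId FROM @Tasks ORDER BY Header;
OPEN task_cur;
FETCH NEXT FROM task_cur INTO @Header, @Estimate, @TargetId;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Allocate @Estimate against the current target's remaining capacity,
    -- writing one row per (task, target) slice into a #Results temp table
    -- and advancing to the next target whenever the current one fills up.
    FETCH NEXT FROM task_cur INTO @Header, @Estimate, @TargetId;
END;
CLOSE task_cur;
DEALLOCATE task_cur;
-- Finally, SELECT from #Results to return the pivoted/aggregated result.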

Trouble with Pivot Tables

I know there are a lot of pivot table examples on the internet; however, I'm new to SQL and having a bit of trouble, as all the examples seem to pertain to aggregate functions.
Table 1:
|Date | Tag |Value |
|06/10 2:00pm | A | 65 |
|06/10 2:00pm | B | 44 |
|06/10 2:00pm | C | 33 |
|06/10 2:02pm | A | 12 |
|06/10 2:02pm | B | 55 |
|06/10 2:02pm | C | 21 |
....
|06/10 1:58am | A | 23 |
What I would like it to look like is (table 2):
|Date | A | B | C |
|06/10 2:00pm| 65 | 44 | 33 |
|06/10 2:02pm| 12 | 55 | 21 |
.....
|06/10 1:58am| 23 | etc. | etc. |
(sorry for the format)
Some problems that I encounter (it doesn't work with code I have found online):
I'd like to run this as a stored procedure (rather, a SQL job) every 2 minutes, so that the data from table 1 is constantly being moved to table 2. However, I think I would need to alter the date every single time? (That's the syntax I've seen.)
The pivot table itself seems simple on its own, but the datetime has been causing me grief.
Any code snipets or links would be greatly appreciated.
Thanks.
The pivot itself seems simple:
select *
from table1
pivot (min (Value) for Tag in ([A], [B], [C])) p
As for the stored procedure, I would use the last date saved in table2 as a filter for table1, excluding incomplete groups. (I'm assuming that there will be, at some point, all three tags present, and that only the last date can be incomplete. If not, you will need special processing for the last date to update/insert a row.)
So, in code:
create proc InsertPivotedTags
as
set NoCount ON
set XACT_ABORT ON
begin transaction
declare @startDate datetime
-- Last date from Table2 or start of time
select @startDate = isnull (max ([Date]), '1753-01-01')
from Table2
insert into Table2
select *
from Table1
pivot (min (Value) for Tag in ([A], [B], [C])) p
where [Date] > @startDate
-- exclude incomplete groups
and a is not null
and b is not null
and c is not null
commit transaction
If groups can be incomplete, you should remove the exclude filter and add a delete statement that removes the last date in case it is incomplete, and adjust @startDate to three milliseconds earlier to get the same rows again, but now in a more filled-up state.
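A hedged sketch of that adjustment, which would sit just after @startDate is computed and before the insert in the procedure above (names reuse the ones already in the proc):
-- Handle a possibly incomplete last group: drop its rows and back the
-- cutoff up so the pivot re-reads that group in its more complete state.
delete from Table2 where [Date] = @startDate
set @startDate = dateadd(ms, -3, @startDate)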