Recursive SQL - How can I get this table with a running total?

ID  debit  credit  sum_debit
------------------------------
1   150    0       150
2   100    0       250
3   0      50      200
4   0      100     100
5   50     0       150
I have this table; my problem is how to get the sum_debit column, which is the previous row's sum_debit plus debit minus credit (sum_debit = previous sum_debit + debit - credit).
For each new row I enter either a debit (with credit zero) or a credit (with debit zero). How do I get sum_debit?

In SQL Server 2012, you can use the newly added ROWS or RANGE clause:
SELECT
    ID, debit, credit,
    sum_debit = SUM(debit - credit)
                OVER (ORDER BY ID
                      ROWS BETWEEN UNBOUNDED PRECEDING
                           AND CURRENT ROW
                     )
FROM
    CreditData
ORDER BY
    ID ;
Tested in SQL-Fiddle
We could just use OVER (ORDER BY ID) there and the result would be the same. But then the default frame would be used, which is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and there are efficiency differences (ROWS should be preferred for running totals).
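For comparison, a minimal sketch of that shorthand form, which returns the same result here but runs with the implicit RANGE frame:
SELECT
    ID, debit, credit,
    sum_debit = SUM(debit - credit) OVER (ORDER BY ID)   -- implicit RANGE frame
FROM
    CreditData
ORDER BY
    ID ;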
There is a great article by @Aaron Bertrand that thoroughly tests various methods of calculating a running total: Best approaches for running totals – updated for SQL Server 2012
For previous versions of SQL Server, you'll have to use some other method, such as a self-join, a recursive CTE or a cursor. Here is a cursor solution, blindly copied from Aaron's blog, with tables and columns adjusted to your problem:
DECLARE @cd TABLE
( [ID] int PRIMARY KEY,
  [debit] int,
  [credit] int,
  [sum_debit] int
);

DECLARE
    @ID INT,
    @debit INT,
    @credit INT,
    @RunningTotal INT = 0;

DECLARE c CURSOR
    LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
    SELECT ID, debit, credit
    FROM CreditData
    ORDER BY ID;

OPEN c;

FETCH NEXT FROM c INTO @ID, @debit, @credit;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @RunningTotal = @RunningTotal + (@debit - @credit);

    INSERT @cd (ID, debit, credit, sum_debit)
    SELECT @ID, @debit, @credit, @RunningTotal;

    FETCH NEXT FROM c INTO @ID, @debit, @credit;
END

CLOSE c;
DEALLOCATE c;

SELECT ID, debit, credit, sum_debit
FROM @cd
ORDER BY ID;
Tested in SQL-Fiddle-cursor
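Since the question title mentions recursive SQL and a recursive CTE is listed above as an option without being shown, here is a minimal sketch of that variant. It assumes the IDs are contiguous and start at 1; otherwise number the rows with ROW_NUMBER() first.
WITH rt AS
(
    SELECT ID, debit, credit,
           debit - credit AS sum_debit
    FROM CreditData
    WHERE ID = 1
    UNION ALL
    SELECT c.ID, c.debit, c.credit,
           rt.sum_debit + c.debit - c.credit
    FROM CreditData AS c
    INNER JOIN rt ON c.ID = rt.ID + 1
)
SELECT ID, debit, credit, sum_debit
FROM rt
ORDER BY ID
OPTION (MAXRECURSION 0);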

Assuming "have" is your data table, this should be an ANSI SQL solution:
select h.*, sum(i.debit) as debsum, sum(i.credit) as credsum, sum(i.debit) - sum(i.credit) as rolling_sum
from have h inner join have i
on h.id >= i.id
group by h.id, h.debit, h.credit
order by h.id
In general, the solution is to join each row to all of the rows preceding it, take the sum over those rows, and then group by everything to get back to one row per original row. See this question for an example.
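The same idea can also be written as a correlated scalar subquery instead of a self-join plus GROUP BY; a sketch, again assuming "have" is the data table:
select h.*,
       (select sum(i.debit - i.credit)
        from have i
        where i.id <= h.id) as rolling_sum
from have h
order by h.id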

Related

Running total by date/ID based on latest change to value SQL

I have a unique case where I want to calculate the running total of quantities day over day. I have been searching a lot but couldn't find the right answer. Code-wise there is not much I can share, as it refers to a lot of sensitive data.
Below is the table of dummy data:
As you can see, there are multiple duplicate IDs by date. I want to be able to calculate the running total for a date as follows:
For 2022/03/24 the running total would be 9 + 33 = 42; on 2022/03/26 the running total should be 9 + 31 = 40. Essentially, the running total for any given day should pick the latest value by ID if it changed, or the existing value otherwise. In this case, on 2022/03/26, for ID 2072 we pick 31 and not 33 because that is the latest value available.
Expected Output:
There may be many days spanning across, and the running total needs to be day over day.
Possible related question: SQL Server running total based on change of state of a column
PS: For context, ID is just a unique identifier for an inventory of items. Each item's quantity changes day by day. In this example, ID 1's inventory last changed on 2022/03/24, whereas ID 2072's changed multiple times. The running total for 2022/03/24 would be the quantities of the inventory items on that day. On the 26th there are no changes for ID 1 but ID 2072 changed, so the inventory pool should reflect the current inventory size of ID 2072 plus the current size of ID 1; in this case 40. Essentially, it is just the current size of the inventory with day-over-day changes.
Any help would be really appreciated! Thanks.
I added a few more rows just in case this is what you really wanted.
I used T-SQL.
declare @orig table(
    id int,
    quantity int,
    rundate date
)
insert into @orig
values (1,9,'20220324'),(2072,33,'20220324'),(2072,31,'20220326'),(2072,31,'20220327'),
       (2,10,'20220301'),(2,20,'20220325'),(2,30,'20220327')

declare @dates table (
    runningdate date
)
insert into @dates
select distinct rundate from @orig
order by rundate

declare @result table (
    dates date,
    running_quality int
)

DECLARE @mydate date
DECLARE @sum int

-- CURSOR definition
DECLARE my_cursor CURSOR FOR
    SELECT * FROM @dates
OPEN my_cursor

-- Perform the first fetch
FETCH NEXT FROM my_cursor into @mydate

-- Check @@FETCH_STATUS to see if there are any more rows to fetch
WHILE @@FETCH_STATUS = 0
BEGIN
    ;with cte as (
        select * from @orig
        where rundate <= @mydate
    ), cte2 as (
        select id, max(rundate) as maxrundate
        from cte
        group by id
    ), cte3 as (
        select a.*
        from cte as a join cte2 as b
            on a.id = b.id and a.rundate = b.maxrundate
    )
    select @sum = sum(quantity)
    from cte3

    insert into @result
    select @mydate, @sum

    -- This is executed as long as the previous fetch succeeds
    FETCH NEXT FROM my_cursor into @mydate
END -- cursor

CLOSE my_cursor
DEALLOCATE my_cursor

select * from @result
Result:
dates running_quality
2022-03-01 10
2022-03-24 52
2022-03-25 62
2022-03-26 60
2022-03-27 70
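For reference, a set-based sketch of the same per-date "latest value per ID" logic, reusing the @orig and @dates table variables from above; it should produce the same result without the cursor:
select d.runningdate as dates,
       sum(x.quantity) as running_quality
from @dates d
cross apply (
    -- latest quantity per id as of this date
    select o.quantity,
           row_number() over (partition by o.id order by o.rundate desc) as rn
    from @orig o
    where o.rundate <= d.runningdate
) x
where x.rn = 1
group by d.runningdate
order by d.runningdate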

Query with row by row calculation for running total

I have a problem where jobs become 'due' at the start of a week and each week there are a certain number of 'slots' available to complete any outstanding jobs. If there are not enough slots then the jobs roll over to the next week.
My initial table looks like this:
Week        Slots   Due
--------------------------
23/8/2021   0       1
30/8/2021   2       3
6/9/2021    5       2
13/9/2021   1       4
I want to maintain a running total of the number of 'due' jobs at the end of each week.
Each week the number due would be added to the running total from last week, then the number of slots this week would be subtracted. If there are enough slots to do all the jobs required then the running total will be 0 (never negative).
As an example, the below shows how I would achieve this in JavaScript:
var Total = 0;
data.forEach(function (d) {
    Total += d.Due;
    Total -= d.Slots;
    Total = Total > 0 ? Total : 0;
    d.Total = Total;
});
The result would be as below:
Week        Slots   Due   Total
---------------------------------
23/8/2021   0       1     1
30/8/2021   2       3     2
6/9/2021    5       2     0
13/9/2021   1       4     3
Is it possible for me to achieve this in SQL (specifically SQL Server 2012)?
I have tried various forms of sum(xxx) over (order by yyy).
Closest I managed was:
sum(Due) over (order by Week) - sum(Slots) over (order by Week) as Total
This provided a running total, but it produces a negative total when there are excess slots.
Is the only way to do this with a cursor? If so - any suggestions?
Thanks.
Possible answer(s) to my own question based on suggestions in comments.
Thorsten Kettner suggested a recursive query:
with cte as (
    select [Week], [Due], [Slots]
        ,case when Due > Slots then Due - Slots else 0 end as [Total]
    from [Data]
    where [Week] = (select top 1 [Week] from [Data])
    union all
    select e.[Week], e.[Due], e.[Slots]
        ,case when cte.Total + e.Due - e.Slots > 0 then cte.Total + e.Due - e.Slots else 0 end as [Total]
    from [Data] e
    inner join cte on cte.[Week] = dateadd(day, -7, e.[Week])
)
select * from cte
OPTION (MAXRECURSION 200)
Thorsten - is this what you were suggesting? (If you have any improvements, please post as an answer so I can accept it!)
Presumably I have to ensure that MAXRECURSION is set to something higher than the number of rows I will be dealing with?
I am a little bit nervous about the join on dateadd(day,-7,e.[Week]). Would I be better off doing something with Row_Number() to get the previous record? I may want to use something other than weeks, or weeks may be missing.
George Menoutis suggested a 'while' query and I was looking for ways to implement that when I came across this post: https://stackoverflow.com/a/35471328/1372848
This suggested that a cursor may not be all that bad compared to a while loop.
This is the cursor based version I came up with:
SET NOCOUNT ON;

DECLARE @Week Date,
        @Due Int,
        @Slots Int,
        @Total Int = 0;

DECLARE @Output TABLE ([Week] Date NOT NULL, Due Int NOT NULL, Slots Int NOT NULL, Total Int);

DECLARE crs CURSOR STATIC LOCAL READ_ONLY FORWARD_ONLY
FOR SELECT [Week], Due, Slots
    FROM [Data]
    ORDER BY [Week] ASC;

OPEN crs;

FETCH NEXT
FROM crs
INTO @Week, @Due, @Slots;

WHILE (@@FETCH_STATUS = 0)
BEGIN
    SET @Total = @Total + @Due;
    SET @Total = @Total - @Slots;
    SET @Total = IIF(@Total > 0, @Total, 0);

    INSERT INTO @Output ([Week], [Due], [Slots], [Total])
    VALUES (@Week, @Due, @Slots, @Total);

    FETCH NEXT
    FROM crs
    INTO @Week, @Due, @Slots;
END;

CLOSE crs;
DEALLOCATE crs;

SELECT *
FROM @Output;
Both of these seem to work as intended. The recursive query feels better (cursors = bad, etc.), but is it designed to be used this way, with a recursion for every input row and therefore potentially a very high number of recursions?
Many thanks for everyone's input :-)
Improvement on the previous answer following input from Thorsten:
with numbered as (
    select *, ROW_NUMBER() OVER (ORDER BY [Week]) as RN
    from [Data]
)
, cte as (
    select [Week], [Due], [Slots], [RN]
        ,case when Due > Slots then Due - Slots else 0 end as [Total]
    from numbered
    where RN = 1
    union all
    select e.[Week], e.[Due], e.[Slots], e.[RN]
        ,case when cte.Total + e.Due - e.Slots > 0 then cte.Total + e.Due - e.Slots else 0 end as [Total]
    from numbered e
    inner join cte on cte.[RN] = e.[RN] - 1
)
select * from cte
OPTION (MAXRECURSION 0)
Many thanks Thorsten for all your help.

SQL Server - loop through table and update based on count

I have a SQL Server database. I need to loop through a table to get the count of each value in the column 'RevID'. Each value should only be in the table a certain number of times, for example 125 times. If the count of a value is greater than or less than 125, I need to update the column to ensure all values in RevID (there are over 25 different values) end up within the same range of roughly 125 (it is OK to be a few numbers off).
For example, if the count of RevID = 'A2' is 45 and the count of RevID = 'B2' is 165, then I need to update RevID so the 45 count increases and the 165 count decreases until they are both within the 125 range.
This is what I have so far:
DECLARE @i INT = 1,
        @RevCnt INT = SELECT RevId, COUNT(RevId) FROM MyTable group by RevId
WHILE(@RevCnt >= 50)
BEGIN
    UPDATE MyTable
    SET RevID = (SELECT COUNT(RevID) FROM MyTable)
    WHERE RevID < 50)
    @i = @i + 1
END
I have also played around with a cursor and instead of trigger. Any idea on how to achieve this? Thanks for any input.
Okay, I came back to this because I found it interesting, even though clearly there are some business rules/discussion that you and I and others are not seeing. Anyway, if you want to distribute the values evenly and arbitrarily, there are a few ways you could do it, by building recursive Common Table Expressions [CTE], by building temp tables, and more.
Here is the way I decided to try. I did use one temp table, because SQL was throwing in a little inconsistency with the main logic table as a CTE about every tenth run, but the temp table seems to have cleared that up. This will evenly spread the RevId values, arbitrarily and randomly assigning any remainder (# of records / # of RevIds) to one of the RevIds. The script also doesn't rely on having a unique ID or anything; it works dynamically over row numbers it creates. Here you go, just subtract out the test data etc. and you have what you more than likely want, though rebuilding the table/values would probably be easier.
--Build Some Test Data
DECLARE @Table AS TABLE (RevId VARCHAR(10))
DECLARE @C AS INT = 1

WHILE @C <= 400
BEGIN
    IF @C <= 200
    BEGIN
        INSERT INTO @Table (RevId) VALUES ('A1')
    END
    IF @C <= 170
    BEGIN
        INSERT INTO @Table (RevId) VALUES ('B2')
    END
    IF @C <= 100
    BEGIN
        INSERT INTO @Table (RevId) VALUES ('C3')
    END
    IF @C <= 400
    BEGIN
        INSERT INTO @Table (RevId) VALUES ('D4')
    END
    IF @C <= 1
    BEGIN
        INSERT INTO @Table (RevId) VALUES ('E5')
    END
    SET @C = @C + 1
END

--save starting counts of test data to temp table to compare with later
IF OBJECT_ID('tempdb..#StartingCounts') IS NOT NULL
BEGIN
    DROP TABLE #StartingCounts
END

SELECT
    RevId
    ,COUNT(*) as Occurences
INTO #StartingCounts
FROM
    @Table
GROUP BY
    RevId
ORDER BY
    RevId

/************************ This is the main method **********************************/
--clear temp table that is the main processing logic
IF OBJECT_ID('tempdb..#RowNumsToChange') IS NOT NULL
BEGIN
    DROP TABLE #RowNumsToChange
END

--figure out how many records there are and how many there should be for each RevId
;WITH cteTargetNumbers AS (
    SELECT
        RevId
        --,COUNT(*) as RevIdCount
        --,SUM(COUNT(*)) OVER (PARTITION BY 1) / COUNT(*) OVER (PARTITION BY 1) +
        --CASE
        --    WHEN ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) <=
        --         SUM(COUNT(*)) OVER (PARTITION BY 1) % COUNT(*) OVER (PARTITION BY 1)
        --    THEN 1
        --    ELSE 0
        --END as TargetNumOfRecords
        ,SUM(COUNT(*)) OVER (PARTITION BY 1) / COUNT(*) OVER (PARTITION BY 1) +
        CASE
            WHEN ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) <=
                 SUM(COUNT(*)) OVER (PARTITION BY 1) % COUNT(*) OVER (PARTITION BY 1)
            THEN 1
            ELSE 0
        END - COUNT(*) AS NumRecordsToUpdate
    FROM
        @Table
    GROUP BY
        RevId
)
, cteEndRowNumsToChange AS (
    SELECT *
        ,SUM(CASE WHEN NumRecordsToUpdate > 1 THEN NumRecordsToUpdate ELSE 0 END)
            OVER (PARTITION BY 1 ORDER BY RevId) AS ChangeEndRowNum
    FROM
        cteTargetNumbers
)
SELECT
    *
    ,LAG(ChangeEndRowNum,1,0) OVER (PARTITION BY 1 ORDER BY RevId) as ChangeStartRowNum
INTO #RowNumsToChange
FROM
    cteEndRowNumsToChange

;WITH cteOriginalTableRowNum AS (
    SELECT
        RevId
        ,ROW_NUMBER() OVER (PARTITION BY RevId ORDER BY (SELECT 0)) as RowNumByRevId
    FROM
        @Table t
)
, cteRecordsAllowedToChange AS (
    SELECT
        o.RevId
        ,o.RowNumByRevId
        ,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT 0)) as ChangeRowNum
    FROM
        cteOriginalTableRowNum o
        INNER JOIN #RowNumsToChange t
            ON o.RevId = t.RevId
            AND t.NumRecordsToUpdate < 0
            AND o.RowNumByRevId <= ABS(t.NumRecordsToUpdate)
)
UPDATE o
SET RevId = u.RevId
FROM
    cteOriginalTableRowNum o
    INNER JOIN cteRecordsAllowedToChange c
        ON o.RevId = c.RevId
        AND o.RowNumByRevId = c.RowNumByRevId
    INNER JOIN #RowNumsToChange u
        ON c.ChangeRowNum > u.ChangeStartRowNum
        AND c.ChangeRowNum <= u.ChangeEndRowNum
        AND u.NumRecordsToUpdate > 0

IF OBJECT_ID('tempdb..#RowNumsToChange') IS NOT NULL
BEGIN
    DROP TABLE #RowNumsToChange
END
/***************************** End of Main Method *******************************/

-- Compare the results and clean up
;WITH ctePostUpdateResults AS (
    SELECT
        RevId
        ,COUNT(*) as AfterChangeOccurences
    FROM
        @Table
    GROUP BY
        RevId
)
SELECT *
FROM
    #StartingCounts s
    INNER JOIN ctePostUpdateResults r
        ON s.RevId = r.RevId
ORDER BY
    s.RevId

IF OBJECT_ID('tempdb..#StartingCounts') IS NOT NULL
BEGIN
    DROP TABLE #StartingCounts
END
Since you've given no rules for how you'd like the balance to operate, we're left to speculate. Here's an approach that finds the most overrepresented value and then finds an underrepresented value that can take on the entire overage.
I have no idea how optimal this is, and it will probably run in an infinite loop without more logic.
declare @balance int = 125;
declare @cnt_over int;
declare @cnt_under int;
declare @revID_overrepresented varchar(32);
declare @revID_underrepresented varchar(32);
declare @rowcount int = 1;

while @rowcount > 0
begin
    select top 1 @revID_overrepresented = RevID, @cnt_over = count(*)
    from T
    group by RevID
    having count(*) > @balance
    order by count(*) desc

    select top 1 @revID_underrepresented = RevID, @cnt_under = count(*)
    from T
    group by RevID
    having count(*) < @balance - @cnt_over
    order by count(*) desc

    update top (@cnt_over - @balance) T
    set RevId = @revID_underrepresented
    where RevId = @revID_overrepresented;

    set @rowcount = @@rowcount;
end
The problem is I don't even know what you mean by balance...You say it needs to be evenly represented but it seems like you want it to be 125. 125 is not "even", it is just 125.
I can't tell what you are trying to do, but I'm guessing this is not really an SQL problem. But you can use SQL to help. Here is some helpful SQL for you. You can use this in your language of choice to solve the problem.
Find the rev values and their counts:
SELECT RevID, COUNT(*)
FROM MyTable
GROUP BY RevID
Update @X rows (with RevID of value @RevID) to a new value @NewValue:
UPDATE TOP (@X) MyTable
SET RevID = @NewValue
WHERE RevID = @RevID
Using these two queries you should be able to apply your business rules (which you never specified) in a loop or whatever to change the data.
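For example, a hypothetical sketch of such a loop gluing the two queries together; the @target and @tolerance values and the stopping rule are illustrative assumptions, not something the question specifies:
DECLARE @target int = 125, @tolerance int = 5;   -- illustrative numbers only
DECLARE @fromRev varchar(32), @toRev varchar(32), @excess int, @found int;

WHILE 1 = 1
BEGIN
    -- most overrepresented RevID and how far over target it is
    SELECT TOP (1) @fromRev = RevID, @excess = COUNT(*) - @target
    FROM MyTable
    GROUP BY RevID
    HAVING COUNT(*) > @target + @tolerance
    ORDER BY COUNT(*) DESC;
    SET @found = @@ROWCOUNT;
    IF @found = 0 BREAK;   -- nothing left to rebalance

    -- least represented RevID to receive the excess rows
    SELECT TOP (1) @toRev = RevID
    FROM MyTable
    GROUP BY RevID
    HAVING COUNT(*) < @target
    ORDER BY COUNT(*) ASC;
    SET @found = @@ROWCOUNT;
    IF @found = 0 BREAK;   -- no underrepresented value to move rows to

    -- move the excess rows across
    UPDATE TOP (@excess) MyTable
    SET RevID = @toRev
    WHERE RevID = @fromRev;
END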

Create a random selection weighted on number of points, SQL

I have a table of winners for a prize draw, where each winner has earned a number of points over the year. There are 1300 registered users, with points varying between 50 and 43,000. I need to be able to select a random winner, which is straightforward, but the challenge I am having is building the logic where each point counts as an entry ticket into the prize draw. Would appreciate any help.
John
Your script would look something like this:
Script 1:
DECLARE @Name varchar(100),
        @Points int,
        @i int

DECLARE Cursor1 CURSOR FOR SELECT Name, Points FROM Table1

OPEN Cursor1

FETCH NEXT FROM Cursor1
INTO @Name, @Points

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    WHILE @i < @Points
    BEGIN
        INSERT INTO Table2 (Name)
        VALUES (@Name)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END

DEALLOCATE Cursor1
I have created a table (Table1) with only a Name and a Points column (varchar(100) and int). I have created a cursor in order to loop through all the records within Table1, then loop through the Points and insert each record into another table (Table2).
This then imports the Name depending on the Points column.
Script 2:
DECLARE @Name varchar(100),
        @Points int,
        @i int,
        @Count int

CREATE TABLE #temptable(
    UserEmailID nvarchar(200),
    Points int)

DECLARE Cursor1 CURSOR FOR SELECT UserEmailID, Points FROM Table1_TEST

OPEN Cursor1

FETCH NEXT FROM Cursor1
INTO @Name, @Points

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    WHILE @i < @Points
    BEGIN
        INSERT INTO #temptable (UserEmailID, Points)
        VALUES (@Name, @Points)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END

DEALLOCATE Cursor1

SELECT * FROM #temptable

DROP TABLE #temptable
In Script 2 I have imported the result into a TEMP table as requested.
The script now runs through each record within your Table1 and imports the individual's UserEmailID and Points into the TEMP table depending on how large the Points value is in Table1.
So if John has a total of 3 points and Sarah 2, the script will import John's UserEmailID 3 times into the TEMP table and Sarah's 2 times.
If you apply the random selector on the TEMP table, it will then randomly select an individual.
John would obviously stand a better chance to win because he has 3 records in the TEMP table whereas Sarah only has 2.
Suppose John's UserEmailID is 1 and Sarah's is 2:
The OUTPUT of TEMP table would then be:
UserEmailID | Points
1 | 3
1 | 3
1 | 3
2 | 2
2 | 2
Please let me know if you need any clarity.
Hope this helps.
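The "random selector" step is not shown above; a minimal SQL Server sketch, run against the TEMP table before it is dropped, would be:
-- pick one weighted winner at random from the duplicated rows
SELECT TOP (1) UserEmailID
FROM #temptable
ORDER BY NEWID();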
You can do a weighted draw using the following method:
Calculate the cumulative sum of points.
Divide by the total number of points to get a value between 0 and 1
Each row in the original data will have a range, such as [0, 0.1), [0.1, 0.3), [0.3, 1]
Calculate a random number and choose the row where the value falls in the range
Here is standard'ish SQL for this approach:
with u as (
    select u.*,
           coalesce(lead(rangestart) over (order by points), 1) as rangeend
    from (select u.*,
                 sum(points*1.0) over (order by points) / sum(points) over () as rangestart
          from users u
         ) u
),
r as (
    select random() as rand
)
select u.*
from u cross join r
where r.rand between rangestart and rangeend;
In addition to using window functions (which can be handled by correlated subqueries in many cases), the exact format depends on whether the random number generator is deterministic for a query (such as SQL Server where random() returns one value no matter how often called in a query) or non-deterministic (such as in other databases). This method only requires one value for the random number generator, so it will work with either method.
So you want a winner with 1000 points to have double the chances of another with only 500 points.
Sort the winners by whatever order and create a running total for the points:
id        points
winner1   100
winner2   50
winner3   150
gives:
id        points   from   to
winner1   100      1      100
winner2   50       101    150
winner3   150      151    300
Then compare with a random number from 1 to sum(points), in the example a number between 1 and 300. Find the winner with that number range and you're done.
select winpoints.id_winner
from
(
    select
        id as id_winner,
        coalesce(sum(points) over(order by id rows between unbounded preceding and 1 preceding), 0) + 1 as from_points,
        sum(points) over(order by id rows between unbounded preceding and current row) as to_points
    from winners
) winpoints
where (select floor(rand() * sum(points)) + 1 from winners)
      between winpoints.from_points and winpoints.to_points;
This solution also works with fractional points/weights. It creates a helper table usersum.
create table user (id int primary key, points float);
insert into user values (1, 0.5), (2, 0), (3, 1);

create table usersum (id int primary key, pointsum float);
insert into usersum
select id, (select sum(points) from user b where b.id <= a.id)
from user a;

set @r = rand() * (select max(pointsum) from usersum);
select @r, usersum.* from usersum where pointsum >= @r order by id limit 1;
http://sqlfiddle.com/#!2/ae539e/1

Order by and apply a running total to the same column without using a temporary table

A representation of my table:
CREATE TABLE Sales
(
id int identity primary key,
SaleAmount numeric(10,2)
);
DECLARE @i INT;
SELECT @i = 1;

SET NOCOUNT ON
WHILE @i <= 100
BEGIN
    INSERT INTO Sales VALUES (ABS(CHECKSUM(NEWID()))/10000000.0);
    SELECT @i = @i + 1;
END;
SET NOCOUNT OFF
I need to order my table Sales by SaleAmount and then select all records where a running total of SaleAmount is no greater than X.
To do this I'm currently using a temporary table to first sort the records and then selecting records where the running total is less than or equal to X (in this example 10).
CREATE TABLE #TEMP_TABLE
(
ID integer IDENTITY PRIMARY KEY,
SaleAmount numeric(10,2)
);
INSERT INTO #TEMP_TABLE
(SaleAmount)
SELECT SaleAmount FROM Sales
ORDER BY SaleAmount
SELECT * FROM
(SELECT
Id,
SaleAmount,
(SaleAmount+COALESCE((SELECT SUM(SaleAmount)
FROM #TEMP_TABLE b
WHERE b.Id < a.Id),0))
AS RunningTotal
FROM #TEMP_TABLE a) InnerTable
WHERE RunningTotal <= 10
Is there a way in which I can first order my Sales table without the use of a temporary table?
If you are using SQL Server 2012, then you can just use the window function for cumulative sum:
select s.*,
sum(SaleAmount) over (order by id) as RunningTotal
from Sales s
This is equivalent to the following correlated subquery:
select s.*,
(select sum(SalesAmount) from sales s2 where s2.id <= s.id) as RunningTotal
from Sales s
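Note that the question also asks to keep only rows whose running total stays within X. A column alias cannot be referenced directly in WHERE, so either form above would be wrapped in a CTE or derived table for that. A sketch with X = 10, ordering by SaleAmount as the question describes (id as a tie-breaker):
with rt as (
    select s.*,
           sum(SaleAmount) over (order by SaleAmount, id) as RunningTotal
    from Sales s
)
select *
from rt
where RunningTotal <= 10
order by SaleAmount;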
Following Aaron Bertrand's suggestion of using a cursor method:
DECLARE @st TABLE
(
    Id Int PRIMARY KEY,
    SaleAmount Numeric(10,2),
    RunningTotal Numeric(10,2)
);

DECLARE
    @Id INT,
    @SaleAmount Numeric(10,2),
    @RunningTotal Numeric(10,2) = 0;

DECLARE c CURSOR
    LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
    SELECT id, SaleAmount
    FROM Sales
    ORDER BY SaleAmount;

OPEN c;

FETCH NEXT FROM c INTO @Id, @SaleAmount;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @RunningTotal = @RunningTotal + @SaleAmount;

    INSERT @st(Id, SaleAmount, RunningTotal)
    SELECT @Id, @SaleAmount, @RunningTotal;

    FETCH NEXT FROM c INTO @Id, @SaleAmount;
END

CLOSE c;
DEALLOCATE c;

SELECT Id, SaleAmount, RunningTotal
FROM @st
WHERE RunningTotal <= 10
ORDER BY SaleAmount;
This is an increase in code and still requires a table variable. However, the improvement in performance is significant.
Credit has to go to Aaron Bertrand for the excellent article on running totals he wrote.
One more option with CTE, ROW_NUMBER() ranking function and APPLY() operator
;WITH cte AS
(
SELECT ROW_NUMBER() OVER(ORDER BY SaleAmount) AS rn, SaleAmount
FROM Sales s
)
SELECT *
FROM cte c CROSS APPLY (
SELECT SUM(s2.SaleAmount) AS RunningTotal
FROM Sales s2
WHERE c.SaleAmount >= s2.SaleAmount
) o
WHERE o.RunningTotal <= 10
FYI, to avoid the sort operation you can use this index:
CREATE INDEX ix_SaleAmount_Sales ON Sales(SaleAmount)
After some research, I believe that what you're aiming for is not possible unless you use SQL Server 2012 or Oracle.
Since your solution seems to work, I would advise using a table variable instead of a schema table:
DECLARE @TEMP_TABLE TABLE (
    ID integer IDENTITY PRIMARY KEY,
    SaleAmount numeric(10,2)
);

INSERT INTO @TEMP_TABLE (SaleAmount)
SELECT SaleAmount FROM Sales
ORDER BY SaleAmount

SELECT * FROM
    (SELECT
         Id,
         SaleAmount,
         (SaleAmount + COALESCE((SELECT SUM(SaleAmount)
                                 FROM @TEMP_TABLE b
                                 WHERE b.Id < a.Id), 0)) AS RunningTotal
     FROM @TEMP_TABLE a) InnerTable
WHERE RunningTotal <= 10
When testing side by side, I found some performance improvements.
First of all, you are doing a sub-select and then doing a select * from the sub-select. This is unnecessary.
SELECT
    Id,
    SaleAmount,
    (SaleAmount + COALESCE((SELECT SUM(SaleAmount)
                            FROM #TEMP_TABLE b
                            WHERE b.Id < a.Id), 0)) AS RunningTotal
FROM #TEMP_TABLE a
WHERE RunningTotal <= 10
Now, the temp table is just a query on the Sales table. There is no purpose in ordering the temporary table, because by the rules of SQL the order within the temporary table does not have to be honored, only the ORDER BY clause on the outer query, so:
SELECT
    Id,
    SaleAmount,
    (SaleAmount + COALESCE((SELECT SUM(SaleAmount)
                            FROM Sales b
                            WHERE b.Id < a.Id), 0)) AS RunningTotal
FROM Sales a
WHERE RunningTotal <= 10