Multiple Running Totals with Group By - sql

I am struggling to find a good way to compute running totals with a GROUP BY in the query, or the equivalent. The cursor-based running total below works on a complete table, but I would like to expand it to add a "Client" dimension, so that I get the running totals the code below creates, but for each company (i.e. Company A, Company B, Company C, etc.) in one table.
CREATE TABLE test (tag int, Checks float, AVG_COST float, Check_total float, Check_amount float, Amount_total float, RunningTotal_Check float,
RunningTotal_Amount float)
DECLARE @tag int,
@Checks float,
@AVG_COST float,
@check_total float,
@Check_amount float,
@amount_total float,
@RunningTotal_Check float ,
@RunningTotal_Check_PCT float,
@RunningTotal_Amount float
SET @RunningTotal_Check = 0
SET @RunningTotal_Check_PCT = 0
SET @RunningTotal_Amount = 0
DECLARE aa_cursor CURSOR fast_forward
FOR
SELECT tag, Checks, AVG_COST, check_total, check_amount, amount_total
FROM test_3
OPEN aa_cursor
FETCH NEXT FROM aa_cursor INTO @tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
WHILE @@FETCH_STATUS = 0
BEGIN
SET @RunningTotal_CHeck = @RunningTotal_CHeck + @checks
set @RunningTotal_Amount = @RunningTotal_Amount + @Check_amount
INSERT test VALUES (@tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total, @RunningTotal_check, @RunningTotal_Amount )
FETCH NEXT FROM aa_cursor INTO @tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
END
CLOSE aa_cursor
DEALLOCATE aa_cursor
SELECT *, RunningTotal_Check/Check_total as CHECK_RUN_PCT, round((RunningTotal_Check/Check_total *100),0) as CHECK_PCT_BIN, RunningTotal_Amount/Amount_total as Amount_RUN_PCT, round((RunningTotal_Amount/Amount_total * 100),0) as Amount_PCT_BIN
into test_4
FROM test ORDER BY tag
create clustered index IX_TESTsdsdds3 on test_4(tag)
DROP TABLE test
----------------------------------
I can get the running total for any one company, but I would like to do it for multiple companies to produce something like the results below.
CLIENT COUNT Running Total
Company A 1 6.7%
Company A 2 20.0%
Company A 3 40.0%
Company A 4 66.7%
Company A 5 100.0%
Company B 1 3.6%
Company B 2 10.7%
Company B 3 21.4%
Company B 4 35.7%
Company B 5 53.6%
Company B 6 75.0%
Company B 7 100.0%
Company C 1 3.6%
Company C 2 10.7%
Company C 3 21.4%
Company C 4 35.7%
Company C 5 53.6%
Company C 6 75.0%
Company C 7 100.0%

This is finally simple to do in SQL Server 2012, where SUM and COUNT support OVER clauses that contain ORDER BY. Using Cris's #Checks table definition:
SELECT
CompanyID,
count(*) over (
partition by CompanyID
order by Cleared, ID
) as cnt,
str(100.0*sum(Amount) over (
partition by CompanyID
order by Cleared, ID
)/
sum(Amount) over (
partition by CompanyID
),5,1)+'%' as RunningTotalForThisCompany
FROM #Checks;
SQL Fiddle here.

I originally started posting the SQL Server 2012 equivalent (since you didn't mention what version you were using). Steve has done a great job of showing the simplicity of this calculation in the newest version of SQL Server, so I'll focus on a few methods that work on earlier versions of SQL Server (back to 2005).
I'm going to take some liberties with your schema, since I can't figure out what all these #test and #test_3 and #test_4 temporary tables are supposed to represent. How about:
USE tempdb;
GO
CREATE TABLE dbo.Checks
(
Client VARCHAR(32),
CheckDate DATETIME,
Amount DECIMAL(12,2)
);
INSERT dbo.Checks(Client, CheckDate, Amount)
SELECT 'Company A', '20120101', 50
UNION ALL SELECT 'Company A', '20120102', 75
UNION ALL SELECT 'Company A', '20120103', 120
UNION ALL SELECT 'Company A', '20120104', 40
UNION ALL SELECT 'Company B', '20120101', 75
UNION ALL SELECT 'Company B', '20120105', 200
UNION ALL SELECT 'Company B', '20120107', 90;
Expected output in this case:
Client Count Running Total
--------- ----- -------------
Company A 1 17.54
Company A 2 43.86
Company A 3 85.96
Company A 4 100.00
Company B 1 20.55
Company B 2 75.34
Company B 3 100.00
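To make the arithmetic behind the expected output explicit, here is a small Python sketch (not T-SQL; the function name running_percentages is only illustrative). It assumes the sample rows above, orders each client's checks by date, and divides the running sum by that client's total.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Sample rows copied from the INSERT above: (Client, CheckDate, Amount).
checks = [
    ("Company A", "20120101", 50),
    ("Company A", "20120102", 75),
    ("Company A", "20120103", 120),
    ("Company A", "20120104", 40),
    ("Company B", "20120101", 75),
    ("Company B", "20120105", 200),
    ("Company B", "20120107", 90),
]

def running_percentages(rows):
    """Return (client, count, running total as a percentage of the client total)."""
    ordered = sorted(rows, key=itemgetter(0, 1))  # by client, then by date
    result = []
    for client, group in groupby(ordered, key=itemgetter(0)):
        amounts = [amount for _, _, amount in group]
        total = sum(amounts)
        for count, running in enumerate(accumulate(amounts), start=1):
            result.append((client, count, round(100.0 * running / total, 2)))
    return result

for row in running_percentages(checks):
    print(row)
```

Each client restarts at count 1, which is the same effect PARTITION BY has in the queries that follow.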
One way:
;WITH gt(Client, Totals) AS
(
SELECT Client, SUM(Amount)
FROM dbo.Checks AS c
GROUP BY Client
), n (Client, Amount, rn) AS
(
SELECT c.Client, c.Amount,
ROW_NUMBER() OVER (PARTITION BY c.Client ORDER BY c.CheckDate)
FROM dbo.Checks AS c
)
SELECT n.Client, [Count] = n.rn,
[Running Total] = CONVERT(DECIMAL(5,2), 100.0*(
SELECT SUM(Amount) FROM n AS n2
WHERE Client = n.Client AND rn <= n.rn)/gt.Totals
)
FROM n INNER JOIN gt ON n.Client = gt.Client
ORDER BY n.Client, n.rn;
A slightly faster alternative - more reads but shorter duration and simpler plan:
;WITH x(Client, CheckDate, rn, rt, gt) AS
(
SELECT Client, CheckDate, rn = ROW_NUMBER() OVER
(PARTITION BY Client ORDER BY CheckDate),
(SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client
AND CheckDate <= c.CheckDate),
(SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client)
FROM dbo.Checks AS c
)
SELECT Client, [Count] = rn,
[Running Total] = CONVERT(DECIMAL(5,2), rt * 100.0/gt)
FROM x
ORDER BY Client, [Count];
While I've offered set-based alternatives here, in my experience a cursor is often the fastest supported way to perform running totals. There are other methods, such as the quirky update, which perform marginally faster, but the result is not guaranteed. The set-based approach where you perform a self-join becomes more and more expensive as the source row counts go up, so what seems to perform okay in testing with a small table performs worse and worse as the table gets larger.
I have a blog post almost fully prepared that goes through a slightly simpler performance comparison of various running totals approaches. It is simpler because it is not grouped and it only shows the totals, not the running total percentage. I hope to publish this post soon and will try to remember to update this space.
There is also another alternative to consider that doesn't require reading previous rows multiple times. It's a concept Hugo Kornelis describes as "set-based iteration." I don't recall where I first learned this technique, but it makes a lot of sense in some scenarios.
DECLARE @c TABLE
(
Client VARCHAR(32),
CheckDate DATETIME,
Amount DECIMAL(12,2),
rn INT,
rt DECIMAL(15,2)
);
INSERT @c SELECT Client, CheckDate, Amount,
ROW_NUMBER() OVER (PARTITION BY Client
ORDER BY CheckDate), 0
FROM dbo.Checks;
DECLARE @i INT, @m INT;
SELECT @i = 2, @m = MAX(rn) FROM @c;
UPDATE @c SET rt = Amount WHERE rn = 1;
WHILE @i <= @m
BEGIN
UPDATE c SET c.rt = c2.rt + c.Amount
FROM @c AS c
INNER JOIN @c AS c2
ON c.rn = c2.rn + 1
AND c.Client = c2.Client
WHERE c.rn = @i;
SET @i = @i + 1;
END
SELECT Client, [Count] = rn, [Running Total] = CONVERT(
DECIMAL(5,2), rt*100.0 / (SELECT TOP 1 rt FROM @c
WHERE Client = c.Client ORDER BY rn DESC)) FROM @c AS c;
While this does perform a loop, and everyone tells you that loops and cursors are bad, one gain with this method is that once the previous row's running total has been calculated, we only have to look at the previous row instead of summing all prior rows. The other gain is that in most cursor-based solutions you have to go through each client and then each check. In this case, you go through all clients' 1st checks once, then all clients' 2nd checks once. So instead of (client count * avg check count) iterations, we only do (max check count) iterations. This solution doesn't make much sense for the simple running totals example, but for the grouped running totals example it should be tested against the set-based solutions above. Not a chance it will beat Steve's approach, though, if you are on SQL Server 2012.
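The iteration count argument can be illustrated with a short Python sketch (not T-SQL; set_based_iteration and the pass counter are illustrative names, and the rows are the sample checks already numbered per client). Each pass handles one row number across all clients and reads only the previous row's running total.

```python
def set_based_iteration(rows):
    """rows: (client, rn, amount) tuples, with rn numbered from 1 within each client."""
    amounts = {(client, rn): amount for client, rn, amount in rows}
    max_rn = max(rn for _, rn, _ in rows)
    running = {}
    passes = 0
    for i in range(1, max_rn + 1):
        passes += 1  # one pass per row number, not one per row
        for (client, rn), amount in amounts.items():
            if rn == i:
                # Only the previous row's running total is needed.
                running[(client, rn)] = running.get((client, rn - 1), 0) + amount
    return running, passes

rows = [
    ("Company A", 1, 50), ("Company A", 2, 75),
    ("Company A", 3, 120), ("Company A", 4, 40),
    ("Company B", 1, 75), ("Company B", 2, 200), ("Company B", 3, 90),
]
running, passes = set_based_iteration(rows)
print(passes, running[("Company A", 4)], running[("Company B", 3)])
```

With seven rows across two clients the loop runs four passes, the largest check count of any client. In the T-SQL version each pass is a single UPDATE, whereas this sketch scans a dictionary, so it shows the pass count only and says nothing about relative speed.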
UPDATE
I've blogged about various running totals approaches here:
http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals

I didn't exactly understand the schema you were pulling from, but here is a quick query using a temp table that shows how to do a running total in a set based operation.
CREATE TABLE #Checks
(
ID int IDENTITY(1,1) PRIMARY KEY
,CompanyID int NOT NULL
,Amount float NOT NULL
,Cleared datetime NOT NULL
)
INSERT INTO #Checks
VALUES
(1,5,'4/1/12')
,(1,5,'4/2/12')
,(1,7,'4/5/12')
,(2,10,'4/3/12')
SELECT Info.ID, Info.CompanyID, Info.Amount, RunningTotal.Total, Info.Cleared
FROM
(
SELECT main.ID, SUM(other.Amount) as Total
FROM
#Checks main
JOIN
#Checks other
ON
main.CompanyID = other.CompanyID
AND
main.Cleared >= other.Cleared
GROUP BY
main.ID) RunningTotal
JOIN
#Checks Info
ON
RunningTotal.ID = Info.ID
DROP TABLE #Checks

Related

How to recursively calculate yearly rollover in SQL?

I need to calculate yearly rollover for a system that keeps track of when people have used days off.
The rollover calculation itself is simple: [TOTALDAYSALLOWED] - [USED]
Provided that number is not higher than [MAXROLLOVER] (and > 0)
Where this gets complicated is the [TOTALDAYSALLOWED] column, which is [NUMDAYSALLOWED] combined with the previous year's rollover to get the total number of days that can be used in a current year.
I've tried several different ways of getting this calculation, but all of them have failed to account for the previous year's rollover being a part of the current year's allowed days.
Creating columns for the LAG of days used, joining the data to itself but shifted back a year, etc. I'm not including examples of code I've tried because the approach was wrong in all of the attempts. That would just make this long post even longer.
Here's the data I'm working with:
Here's how it should look after the calculation:
This is a per-person calculation, so there's no need to consider any personal ID here. DAYTYPE only has one value currently, but I want to include it in the calculation in case another is added. The [HOW] column is only for clarity in this post.
Here's some code to generate the sample data (SQL Server or Azure SQL):
IF OBJECT_ID('tempdb..#COUNTS') IS NOT NULL DROP TABLE #COUNTS
CREATE TABLE #COUNTS (USED INT, DAYTYPE VARCHAR(20), THEYEAR INT)
INSERT INTO #COUNTS (USED, DAYTYPE, THEYEAR)
SELECT 1, 'X', 2019
UNION
SELECT 3, 'X', 2020
UNION
SELECT 0, 'X', 2021
IF OBJECT_ID('tempdb..#ALLOWANCES') IS NOT NULL DROP TABLE #ALLOWANCES
CREATE TABLE #ALLOWANCES (THEYEAR INT, DAYTYPE VARCHAR(20), NUMDAYSALLOWED INT, MAXROLLOVER INT)
INSERT INTO #ALLOWANCES (THEYEAR, DAYTYPE, NUMDAYSALLOWED, MAXROLLOVER)
SELECT 2019, 'X', 3, 3
UNION
SELECT 2020, 'X', 3, 3
UNION
SELECT 2021, 'X', 3, 3
SELECT C.*, A.NUMDAYSALLOWED, A.MAXROLLOVER
FROM #COUNTS C
JOIN #ALLOWANCES A ON C.DAYTYPE = A.DAYTYPE AND C.THEYEAR = A.THEYEAR
The tricky part is to limit the rollover amount. This is maybe possible with window functions, but I think this is easier to do with a recursive query:
with
data as (
select c.*, a.numdaysallowed, a.maxrollover,
row_number() over(partition by c.daytype order by c.theyear) rn
from #counts c
inner join #allowances a on a.theyear = c.theyear and a.daytype = c.daytype
),
cte as (
select d.*,
numdaysallowed as totaldaysallowed,
numdaysallowed - used as actualrollover
from data d
where rn = 1
union all
select d.*,
d.numdaysallowed + c.actualrollover,
case when d.numdaysallowed + c.actualrollover - d.used > d.maxrollover
then d.maxrollover
else d.numdaysallowed + c.actualrollover - d.used
end
from cte c
inner join data d on d.rn = c.rn + 1 and d.daytype = c.daytype
)
select * from cte order by theyear
Demo on DB Fiddle
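The year-to-year dependency can also be traced with a small Python sketch (not T-SQL; yearly_rollover is an illustrative name). It assumes the sample rows from the question, and it assumes the rollover is clamped to the range 0..MAXROLLOVER in every year including the first, which is one reading of the rule stated in the question.

```python
def yearly_rollover(years):
    """years: (year, used, num_days_allowed, max_rollover) tuples sorted by year."""
    results = []
    carried = 0  # rollover brought forward from the previous year
    for year, used, allowed, max_rollover in years:
        total_allowed = allowed + carried
        # Assumption: never negative, never above the cap.
        rollover = min(max(total_allowed - used, 0), max_rollover)
        results.append((year, total_allowed, rollover))
        carried = rollover
    return results

sample = [
    (2019, 1, 3, 3),
    (2020, 3, 3, 3),
    (2021, 0, 3, 3),
]
for row in yearly_rollover(sample):
    print(row)
```

Each year's total depends on the previous year's capped result, which is why a plain LAG over the raw columns is not enough and a recursive (or iterative) formulation is needed.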

avg of count incorrect results

I need to get the Brand and Type of all consoles that get repaired less than average, console types that haven't been repaired have to count towards this average, as well.
So I get the brand and type from the
console table.
And I join this to the Items table(artikel table in this DB).
Then I left join the items to the Repairs table,because I also need the console types that haven't been repaired, not sure if this is correct.
So now to get the amount of repairs per console type I did a count on the repaired_items_id in the Repairs table (repareerd_artikel_id in the picture),and I grouped it by the same column, and then I took the average of that count.
This is my syntax, I also tried different group by combinations but the results are always wrong.
select merk,type from console c join artikel a on
a.CONSOLE_ID=c.CONSOLE_ID left join REPARATIE r on
REPAREERD_ARTIKEL_ID=a.ARTIKEL_ID group by MERK,TYPE
HAVING (select avg(A.rcount) from (select
count(repareerd_artikel_id) AS rcount from REPARATIE group by
REPAREERD_ARTIKEL_ID) A) < (select avg(A.rcount) from (select
count(repareerd_artikel_id) AS rcount from REPARATIE group by
REPAREERD_ARTIKEL_ID) A)
And then I also tried starting with a count instead.
HAVING count(repareerd_artikel_id)< (select avg(A.rcount) from
(select count(repareerd_artikel_id) AS rcount from REPARATIE group by
REPAREERD_ARTIKEL_ID) A)
I have no idea what to do anymore now so any help would be much appreciated.
Using your query (and tbh not checking your images, as they are a pain to navigate to from SO) I think you could do something like this, breaking down the problem into three stages:
Work out the number of repairs for each console, including zeroes;
Work out the average number of repairs across all consoles;
List any consoles whose number of repairs is under the average.
WITH AllRepairs AS (
SELECT
merk,
[type],
ISNULL(COUNT(r.repareerd_artikel_id), 0) AS repairs
FROM
console c
INNER JOIN artikel a ON a.CONSOLE_ID = c.CONSOLE_ID
LEFT JOIN REPARATIE r ON r.REPAREERD_ARTIKEL_ID = a.ARTIKEL_ID
GROUP BY
merk,
[type]),
AverageRepairs AS (
SELECT
AVG(repairs) AS average_repairs
FROM
AllRepairs)
SELECT
a.*
FROM
AllRepairs a
CROSS JOIN AverageRepairs ar
WHERE
a.repairs < ar.average_repairs
ORDER BY
a.repairs;
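You might want to worry about comparing integers to a decimal. In SQL Server, AVG over an integer column returns an integer, so if your average number of repairs is really 2.9 it is truncated to 2 and only consoles with fewer than 2 repairs count as below average. I think that's probably what you want? If not, cast repairs to a decimal before averaging.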
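The integer-versus-decimal concern can be seen with a few made-up numbers. This Python sketch (not T-SQL) uses hypothetical console names and repair counts, and uses floor division to stand in for an integer average; for non-negative counts that gives the same result as truncation.

```python
def below_average(repairs, truncate):
    """Return the names whose repair count is below the (optionally truncated) average."""
    counts = list(repairs.values())
    if truncate:
        average = sum(counts) // len(counts)  # integer average, fraction dropped
    else:
        average = sum(counts) / len(counts)   # true decimal average
    return sorted(name for name, n in repairs.items() if n < average)

# Hypothetical data: the true average is 7 / 5 = 1.4.
repairs = {
    "Console A": 0,
    "Console B": 0,
    "Console C": 1,
    "Console D": 2,
    "Console E": 4,
}
print(below_average(repairs, truncate=True))   # compares against 1
print(below_average(repairs, truncate=False))  # compares against 1.4
```

With the truncated average, the console with one repair is not reported as below average; with the decimal average it is.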
The way I am viewing your info it looks like you do not have any repairs for the items in your items table. So, I added 2 repairs to one item and 1 repair to another. In my query, I produce an average across all 11 Merk/Types of 1.5. I then compare this to the number of repairs for each. The record with 2 repairs gets presented in the result.
Create Table #console
(
Console_Id Int,
Merk Varchar(25),
Type VarChar(25),
Kleur VarChar(10),
Jaar_Uitgave Int,
Maat VarChar(10)
)
Insert Into #console Values
(1,'Sony 1','PS4 Slim','Wit',2016,'Slim'),
(2,'Microsoft','XBox','Beige',2004,'Port'),
(3,'Microsoft','XBox 360','Zwart',2011,'Pro'),
(4,'Microsoft','XBox One','Wit',2014,'Pro'),
(5,'Microsoft','XBox One X','Wit',2017,'Pro'),
(6,'Nintendo','NES Classic Edition','Wit',2016,'XL'),
(7,'Nintendo','Switch','Wit',2017,'XL'),
(8,'Nintendo','WII','Wit',2011,'Slim'),
(9,'Nintendo','WII Mini','Wit',2015,'XL'),
(10,'Nintendo','WII U','Wit',2013,'Slim'),
(11,'Sony','PS3','Wit',2013,'Port')
Create Table #Items
(
Artikel_ID Int,
BarCode VarChar(20),
Prijs Float,
Prijs_Per_D Float,
Spel_Of_Console VarChar(25),
Spel_ID Int,
Console_Id Int
)
Insert Into #Items Values
(301,'10000008',300.00,11.00,'Console',Null,3),
(302,'10000017',400.00,15.00,'Console',Null,4),
(303,'10000026',270.00,9.00,'Console',Null,9),
(304,'10000035',200.00,5.00,'Console',Null,6),
(305,'10000044',200.00,5.00,'Console',Null,11),
(306,'10000053',300.00,11.00,'Console',Null,12),
(307,'10000023',60.00,2.00,'Spel',15,Null),
(308,'10000242',36.00,2.00,'Spel',16,Null),
(309,'10000278',35.00,2.00,'Spel',21,Null),
(310,'10000107',66.00,4.00,'Spel',36,Null),
(311,'10000215',45.00,3.00,'Spel',40,Null)
Create Table #Repairs
(
Medewerker_Id Int,
Repareerd_Artikel_Id Int,
Schadenummer Int,
Huurovereenkomst_Id Int,
datum_Gereed DateTime,
Kosten Float,
Reparatiestatus VarChar(25)
)
Insert Into #Repairs Values
(1,259,7,12,'2017-08-03 00:00:00',112.00,'GEREED'),
(2,260,9,14,'2016-09-29 00:00:00',84.00,'GEREED'),
(3,288,19,28,'2017-04-09 00:00:00',96.00,'GEREED'),
(4,292,21,30,'2018-01-27 00:00:00',110.00,'GEREED'),
(5,283,16,24,'2015-12-29 00:00:00',103.00,'GEREED'),
(6,245,1,2,'2017-01-31 00:00:00',160.00,'GEREED'),
(7,245,2,3,'2018-01-18 00:00:00',120.00,'GEREED'),
(8,275,11,19,'2016-04-15 00:00:00',75.00,'GEREED'),
(9,276,12,20,'2015-08-25 00:00:00',174.00,'GEREED'),
(10,283,15,23,'2014-06-10 00:00:00',74.00,'GEREED'),
(11,297,21,34,'2014-07-17 00:00:00',96.00,'GEREED')
Insert Into #Repairs Values
(14,305,21,34,'2014-07-25 00:00:00',96.00,'GEREED'),
(12,301,21,34,'2014-07-17 00:00:00',96.00,'GEREED'),
(13,301,21,34,'2014-07-25 00:00:00',96.00,'GEREED')
Query
;With cte As
(
select
c.Merk, c.Type,
Count(r.REPAREERD_ARTIKEL_ID) As cnt
from
#console c Left join
#Items a on a.CONSOLE_ID=c.CONSOLE_ID left join
#Repairs r on r.REPAREERD_ARTIKEL_ID=a.ARTIKEL_ID
group by
c.merk, c.type
)
Select
*,
(Select Count(*) As totrecs From cte) As cntRecs ,
(Select avg(Cast(cte.cnt As Float)) As avgrecs From cte Where cte.cnt > 0) as avgrecs
From
cte
Where cte.cnt > (Select avg(Cast(cte.cnt As Float)) As avgrecs From cte Where cte.cnt > 0)
Result:
Merk Type cnt cntRecs avgrecs
Microsoft XBox 360 2 11 1.5

sql lowest running balance in a group

I've been trying for days to solve this problem to no solution.
I want to get the lowest running balance in a group.
Here is a sample data
The running balance is imaginary and is not part of the table.
the running balance is also computed dynamically.
the problem is I want to get the lowest running balance in a Specific month (January)
so the output should be 150 for memberid 10001 and 175 for memberid 10002 as highlighted in the image.
my desired out put should be
memberid | balance
10001 | 150
10002 | 175
Is that possible using sql query only?
PS. Using c# to compute lowest running balance is very slow since I have more than 600,000 records in my table.
I've updated the question.
The answer provided by Mihir Shah gave me the idea of how to solve my problem.
His answer takes too much time to process, making it as slow as the computation in my c# program, because his code loops over every record.
Here is my answer to get the lowest running value (running total) in a specific group (a specific month) without sacrificing a lot of performance.
with IniMonth1 as
(
select a.memberid, a.iniDeposit, a.iniWithdrawal,
(cast(a.iniDeposit as decimal(10,2)) - cast(a.iniWithdrawal as decimal(10,2))) as RunningTotal
from
(
select b.memberid, sum(b.depositamt) as iniDeposit, sum(b.withdrawalamt) as iniWithdrawal
from savings b
where trdate < '01/01/2016'
group by b.memberid
) a /*gets all the memberid, sum of deposit amount and withdrawal amt from the beginning of the savings before the specific month */
where cast(a.iniDeposit as decimal(10,2)) - cast(a.iniWithdrawal as decimal(10,2)) > 0 /*filters zero savings */
)
,DetailMonth1 as
(
select a.memberid, a.depositamt,a.withdrawalamt,
(cast(a.depositamt as decimal(10,2)) - cast(a.withdrawalamt as decimal(10,2))) as totalBal,
Row_Number() Over(Partition By a.memberid Order By a.trdate Asc) RowID
from savings a
where
a.trdate >= '01/01/2016'
and
a.trdate <= '01/31/2016'
and (a.depositamt<>0 or a.withdrawalamt<>0)
) /* gets all record within the specific month and gives a no of row as an id for the running value in the next procedure*/
,ComputedDetailMonth1 as
(
select a.memberid, min(a.runningbalance) as MinRunningBal
from
(
select a.rowid, a.memberid, a.totalbal,
(
sum(b.totalbal) +
(case
when c.runningtotal is null then 0
else c.runningtotal
end)
)as runningbalance , c.runningtotal as oldbalance
from DetailMonth1 a
inner join DetailMonth1 b
on b.rowid<=a.rowid
and a.memberid=b.memberid
left join IniMonth1 c
on a.memberid=c.memberid
group by a.rowid,a.memberid,a.totalbal,c.runningtotal
) a
group by a.memberid
) /* the loop is only for the records of the specific month only making it much faster */
/* this gets the running balance of specific month ONLY and ADD the sum total of IniMonth1 using join to get the running balance from the beginning of savings to the specific month */
/* I then get the minimum of the output using the min function*/
, OldBalanceWithNoNewSavingsMonth1 as
(
select a.memberid,a.RunningTotal
from
IniMonth1 a
left join
DetailMonth1 b
on a.memberid = b.memberid
where b.totalbal is null
)/*this gets all the savings that is not zero and has no transaction in the specific month making and this will become the default value as the lowest value if the member has no transaction in the specific month. */
,finalComputedMonth1 as
(
select a.memberid,a.runningTotal as MinRunTotal from OldBalanceWithNoNewSavingsMonth1 a
union
select b.memberid,b.MinRunningBal from ComputedDetailMonth1 b
)/*the record with minimum running total with clients that has a transaction in the specific month Unions with the members with no current transaction in the specific month*/
select * from finalComputedMonth1 order by memberid /* display the final output */
I have more than 600k savings records in my savings table.
Surprisingly, the performance of this code is very good.
It takes almost 2 hours for my c# program to manually compute every record of all the members.
This code takes only 2 secs, and at most 9 secs, to compute everything.
I just display the result in c#, which takes another 2 secs.
The output of this code was tested and compared with the computation from my c# program.
Maybe the query below will help you:
Set Nocount On;
Declare @CashFlow Table
(
savingsid Varchar(50)
,memberid Int
,trdate Date
,deposit Decimal(18,2)
,withdrawal Decimal(18,2)
)
Insert Into @CashFlow(savingsid,memberid,trdate,deposit,withdrawal) Values
('10001-0002',10001,'01/01/2015',1000,0)
,('10001-0003',10001,'01/07/2015',25,0)
,('10001-0004',10001,'01/13/2015',25,0)
,('10001-0005',10001,'01/19/2015',0,900)
,('10001-0006',10001,'01/25/2015',25,0)
,('10001-0007',10001,'01/31/2015',25,0)
,('10001-0008',10001,'02/06/2015',25,0)
,('10001-0009',10001,'02/12/2015',25,0)
,('10001-0010',10001,'02/18/2015',0,200)
,('10002-0001',10002,'01/01/2015',500,0)
,('10002-0002',10002,'01/07/2015',25,0)
,('10002-0003',10002,'01/13/2015',0,200)
,('10002-0004',10002,'01/19/2015',25,0)
,('10002-0005',10002,'01/25/2015',25,0)
,('10002-0006',10002,'01/31/2015',0,200)
,('10002-0007',10002,'02/06/2015',25,0)
,('10002-0008',10002,'02/12/2015',25,0)
,('10002-0009',10002,'02/12/2015',0,200)
;With TrialBalance As
(
Select Row_Number() Over(Partition By cf.memberid Order By cf.trdate Asc) RowNum
,cf.memberid
,cf.deposit
,cf.withdrawal
,cf.trdate
From @CashFlow As cf
)
,RunningBalance As
(
Select tb.RowNum
,tb.memberid
,tb.deposit
,tb.withdrawal
,tb.trdate
From TrialBalance As tb
Where tb.RowNum = 1
Union All
Select tb.RowNum
,rb.memberid
,Cast((rb.deposit + tb.deposit - tb.withdrawal) As Decimal(18,2))
,rb.withdrawal
,tb.trdate
From TrialBalance As tb
Join RunningBalance As rb On tb.RowNum = (rb.Rownum + 1) And tb.memberid = rb.memberid
)
Select rb.memberid
,Min(rb.deposit) As runningBalance
From RunningBalance As rb
Where Year(rb.trdate) = 2015
And Month(rb.trdate) = 1
Group By rb.memberid
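As a cross-check of the logic, here is a Python sketch (not T-SQL; lowest_balance_in_month is an illustrative name) that accumulates the balance per member in date order and keeps the minimum seen within the requested month. It uses the sample rows above and, unlike the asker's query, does not handle members who have an opening balance but no transactions in the month.

```python
from datetime import date
from itertools import accumulate

def lowest_balance_in_month(transactions, year, month):
    """transactions: (memberid, trdate, deposit, withdrawal) tuples."""
    by_member = {}
    for memberid, trdate, deposit, withdrawal in sorted(
            transactions, key=lambda t: (t[0], t[1])):
        by_member.setdefault(memberid, []).append((trdate, deposit - withdrawal))
    lowest = {}
    for memberid, entries in by_member.items():
        balances = accumulate(net for _, net in entries)  # running balance
        in_month = [balance for (trdate, _), balance in zip(entries, balances)
                    if trdate.year == year and trdate.month == month]
        if in_month:
            lowest[memberid] = min(in_month)
    return lowest

cash_flow = [
    (10001, date(2015, 1, 1), 1000, 0), (10001, date(2015, 1, 7), 25, 0),
    (10001, date(2015, 1, 13), 25, 0), (10001, date(2015, 1, 19), 0, 900),
    (10001, date(2015, 1, 25), 25, 0), (10001, date(2015, 1, 31), 25, 0),
    (10001, date(2015, 2, 6), 25, 0), (10001, date(2015, 2, 12), 25, 0),
    (10001, date(2015, 2, 18), 0, 200),
    (10002, date(2015, 1, 1), 500, 0), (10002, date(2015, 1, 7), 25, 0),
    (10002, date(2015, 1, 13), 0, 200), (10002, date(2015, 1, 19), 25, 0),
    (10002, date(2015, 1, 25), 25, 0), (10002, date(2015, 1, 31), 0, 200),
    (10002, date(2015, 2, 6), 25, 0), (10002, date(2015, 2, 12), 25, 0),
    (10002, date(2015, 2, 12), 0, 200),
]
print(lowest_balance_in_month(cash_flow, 2015, 1))
```

For January 2015 this gives 150 for member 10001 and 175 for member 10002, the values the question asks for.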

Find the longest sequence of a value in a table

This is an SQL Question, I think it is difficult one - I'm not sure it is possible to achieve in a simple SQL sentence or a stored procedure:
I want to find the number of the longest sequence of the same (known) number in a column in a table:
example:
TABLE:
DATE SALEDITEMS
1/1/09 4
1/2/09 3
1/3/09 3
1/4/09 4
1/5/09 3
Calling the sp/sentence for 4 will give 1; calling the sp/sentence for 3 will give 2,
as the number 3 appears 2 times in a row.
I'm running SQL server 2008.
UPDATE: I generated a million rows of random data, and abandoned the recursive CTE solution, as its query plan didn't make good use of indexes in the optimizer.
But the non-recursive solution I originally posted turned out to work great, as long as there was an additional non-clustered index on (SALEDITEMS, [DATE]). This makes sense, since the query needs to filter in both directions (both by date and by SALEDITEMS). With this additional index, queries on a million rows return in under 2 seconds on my (not very beefy) desktop machine. Without this index, the query was dog-slow.
BTW, this is a great example of how SQL Server's cost-based query optimization totally breaks down in some cases. The recursive CTE solution has a cost (on my PC) of 42 and takes at least several minutes to finish. The non-recursive solution has a cost of 15,446 (!!!) and completes in 1.5 seconds. Moral of the story: when comparing SQL Server query plans, don't assume that cost necessarily correlates to query performance!
Anyway, here's the solution I'd recommend (the same non-recursive CTE I posted earlier) :
DECLARE @SALEDITEMS INT = 3;
WITH SalesNoMatch ([DATE], SALEDITEMS, NoMatchDate)
AS
(
SELECT [DATE], SALEDITEMS,
(SELECT MIN([DATE]) FROM Sales s2 WHERE s2.SALEDITEMS <> @SALEDITEMS
AND s2.[DATE] > s1.[DATE]) as NoMatchDate
FROM Sales s1
)
, SalesMatchCount ([DATE], ConsecutiveCount) AS
(
SELECT [DATE], 1+(SELECT COUNT(1) FROM Sales s2 WHERE s2.[DATE] > s1.[DATE] AND s2.[DATE] < NoMatchDate)
FROM SalesNoMatch s1
WHERE s1.SALEDITEMS = @SALEDITEMS
)
SELECT MAX(ConsecutiveCount)
FROM SalesMatchCount;
Here's the DDL I used to test this, including indexes you'll need:
CREATE TABLE [Sales](
[DATE] date NOT NULL,
[SALEDITEMS] int NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX IX_Sales ON Sales ([DATE]);
CREATE UNIQUE NONCLUSTERED INDEX IX_Sales2 ON Sales (SALEDITEMS, [DATE]);
And here's how I created my test data-- 1,000,001 rows with ascending dates with SALEDITEMS randomly set between 1 and 10.
INSERT INTO Sales ([DATE], SALEDITEMS)
VALUES ('1/1/09', 5)
DECLARE @i int = 0;
WHILE (@i < 1000000)
BEGIN
INSERT INTO Sales ([DATE], SALEDITEMS)
SELECT DATEADD (d, 1, (SELECT MAX ([DATE]) FROM Sales)), ABS(CHECKSUM(NEWID())) % 10 + 1
SET @i = @i + 1;
END
Here's the recursive-CTE solution that I abandoned:
DECLARE @SALEDITEMS INT = 3;
-- recursive CTE solution (remember to set MAXRECURSION!)
WITH SalesRowNum ([DATE], SALEDITEMS, RowNum)
AS
(
SELECT [DATE], SALEDITEMS, ROW_NUMBER() OVER (ORDER BY s1.[DATE]) as RowNum
FROM Sales s1
)
, SalesCTE (RowNum, [DATE], ConsecutiveCount)
AS
(
SELECT s1.RowNum, s1.[DATE], 1 AS ConsecutiveCount
FROM SalesRowNum s1
WHERE SALEDITEMS = @SALEDITEMS
UNION ALL
SELECT s1.RowNum, s1.[DATE], ConsecutiveCount + 1 AS ConsecutiveCount
FROM SalesRowNum s1
INNER JOIN SalesCTE s2 ON s1.RowNum = s2.RowNum + 1
WHERE SALEDITEMS = @SALEDITEMS
)
SELECT MAX(ConsecutiveCount)
FROM SalesCTE;
Untested, because you did not provide DDL and sample data:
DECLARE @SALEDITEMS INT;
SET @SALEDITEMS=3;
SELECT MAX(cnt) FROM(
SELECT COUNT(*) AS cnt FROM YourTable JOIN (
SELECT y1.[Date] AS d1, y2.[Date] AS d2
FROM YourTable AS y1 JOIN YourTable AS y2
ON y1.SALEDITEMS=@SALEDITEMS AND y2.SALEDITEMS=@SALEDITEMS
AND NOT EXISTS(SELECT 1 FROM YourTable AS y
WHERE y.SALEDITEMS<>@SALEDITEMS
AND y1.[Date] < y.[Date] AND y.[Date] < y2.[Date])
) AS t
ON [Date] BETWEEN t.d1 AND t.d2
GROUP BY t.d1, t.d2
) AS t;
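For reference, the result these queries are after can be stated in a few lines of Python (not T-SQL; longest_run is an illustrative name). The sketch assumes the values are already in date order, as in the question's example table.

```python
from itertools import groupby

def longest_run(values, target):
    """Length of the longest consecutive run of target in values (0 if absent)."""
    return max(
        (sum(1 for _ in run) for value, run in groupby(values) if value == target),
        default=0,
    )

# SALEDITEMS from the question's example, ordered by DATE.
sold = [4, 3, 3, 4, 3]
print(longest_run(sold, 4))
print(longest_run(sold, 3))
```

This gives 1 for the value 4 and 2 for the value 3, matching the example in the question, and can serve as a check on any of the SQL variants against a small data set.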

Get last item in a table - SQL

I have a History Table in SQL Server that basically tracks an item through a process. The item has some fixed fields that don't change throughout the process, but has a few other fields including status and Id which increment as the steps of the process increase.
Basically I want to retrieve the last step for each item given a Batch Reference. So if I do a
Select * from HistoryTable where BatchRef = @BatchRef
It will return all the steps for all the items in the batch - eg
Id Status BatchRef ItemCount
1 1 Batch001 100
1 2 Batch001 110
2 1 Batch001 60
2 2 Batch001 100
But what I really want is:
Id Status BatchRef ItemCount
1 2 Batch001 110
2 2 Batch001 100
Edit: Apologies - can't seem to get the TABLE tags to work with Markdown - followed the help to the letter, and it looks fine in the preview
Assuming you have an identity column in the table...
select
top 1 <fields>
from
HistoryTable
where
BatchRef = @BatchRef
order by
<IdentityColumn> DESC
It's kind of hard to make sense of your table design - I think SO ate your delimiters.
The basic way of handling this is to GROUP BY your fixed fields, and select a MAX (or MIN) for some unique value (a datetime usually works well). In your case, I think that the GROUP BY would be BatchRef and ItemCount, and Id will be your unique column.
Then, join back to the table to get all columns. Something like:
SELECT *
FROM HistoryTable
JOIN (
SELECT
MAX(Id) as Id,
BatchRef,
ItemCount
FROM HistoryTable
WHERE
BatchRef = @BatchRef
GROUP BY
BatchRef,
ItemCount
) as Latest ON
HistoryTable.Id = Latest.Id
Assuming the Item Ids are incrementally numbered:
--Declare a temp table to hold the last step for each item id
DECLARE @LastStepForEach TABLE (
Id int,
Status int,
BatchRef char(10),
ItemCount int)
--Loop counter
DECLARE @count INT;
SET @count = 0;
--Loop through all of the items
WHILE (@count < (SELECT MAX(Id) FROM HistoryTable WHERE BatchRef = @BatchRef))
BEGIN
SET @count = @count + 1;
INSERT INTO @LastStepForEach (Id, Status, BatchRef, ItemCount)
SELECT Id, Status, BatchRef, ItemCount
FROM HistoryTable
WHERE BatchRef = @BatchRef
AND Id = @count
AND Status =
(
SELECT MAX(Status)
FROM HistoryTable
WHERE BatchRef = @BatchRef
AND Id = @count
)
END
SELECT *
FROM @LastStepForEach
SELECT id, status, BatchRef, MAX(itemcount) AS maxItemcount
FROM HistoryTable GROUP BY id, status, BatchRef
HAVING status > 1
It's a bit hard to decipher your data the way WMD has formatted it, but you can pull off the sort of trick you need with common table expressions on SQL 2005:
with LastBatches as (
select Batch, max(Id) as Id
from HistoryTable
group by Batch
)
select *
from HistoryTable h
join LastBatches b on b.Batch = h.Batch and b.Id = h.Id
Or a subquery (assuming the group by in the subquery works - off the top of my head I don't recall):
select *
from HistoryTable h
join (
select Batch, max(Id) as Id
from HistoryTable
group by Batch
) b on b.Batch = h.Batch and b.Id = h.Id
Edit: I was assuming you wanted the last item for every batch. If you just need it for the one batch then the other answers (doing a top 1 and ordering descending) are the way to go.
As already suggested you probably want to reorder your query to sort it in the other direction so you actually fetch the first row. Then you'd probably want to use something like
SELECT TOP 1 ...
if you're using MSSQL 2k or earlier, or the SQL compliant variant
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
columns
FROM tablename
) AS foo
WHERE rownumber = n
for any other version (or for other database systems that support the standard notation), or
SELECT ... LIMIT 1 OFFSET 0
for some other variants without the standard SQL support.
See also this question for some additional discussion around selecting rows. Using the aggregate function max() might or might not be faster depending on whether calculating the value requires a table scan.
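To pin down the result the question asks for, here is a short Python sketch (not T-SQL; last_step_per_item is an illustrative name). It assumes the four sample rows from the question and that the last step of an item is the row with the highest Status for that Id.

```python
def last_step_per_item(rows):
    """rows: (Id, Status, BatchRef, ItemCount) tuples for a single batch."""
    latest = {}
    for row in rows:
        item_id, status = row[0], row[1]
        # Keep the row with the highest Status seen so far for this Id.
        if item_id not in latest or status > latest[item_id][1]:
            latest[item_id] = row
    return [latest[item_id] for item_id in sorted(latest)]

history = [
    (1, 1, "Batch001", 100),
    (1, 2, "Batch001", 110),
    (2, 1, "Batch001", 60),
    (2, 2, "Batch001", 100),
]
for row in last_step_per_item(history):
    print(row)
```

This keeps one row per Id, (1, 2, Batch001, 110) and (2, 2, Batch001, 100), which is the desired output shown in the question.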