How to recursively calculate yearly rollover in SQL? - sql

I need to calculate yearly rollover for a system that keeps track of when people have used days off.
The rollover calculation itself is simple: [TOTALDAYSALLOWED] - [USED]
Provided that number is not higher than [MAXROLLOVER] (and > 0)
Where this gets complicated is the [TOTALDAYSALLOWED] column, which is [NUMDAYSALLOWED] combined with the previous year's rollover to get the total number of days that can be used in a current year.
I've tried several different ways of getting this calculation, but all of them have failed to account for the previous year's rollover being a part of the current year's allowed days.
Creating columns for the LAG of days used, joining the data to itself but shifted back a year, etc. I'm not including examples of code I've tried because the approach was wrong in all of the attempts. That would just make this long post even longer.
Here's the data I'm working with:
Here's how it should look after the calculation:
This is a per-person calculation, so there's no need to consider any personal ID here. DAYTYPE only has one value currently, but I want to include it in the calculation in case another is added. The [HOW] column is only for clarity in this post.
Here's some code to generate the sample data (SQL Server or Azure SQL):
IF OBJECT_ID('tempdb..#COUNTS') IS NOT NULL DROP TABLE #COUNTS
CREATE TABLE #COUNTS (USED INT, DAYTYPE VARCHAR(20), THEYEAR INT)
INSERT INTO #COUNTS (USED, DAYTYPE, THEYEAR)
SELECT 1, 'X', 2019
UNION
SELECT 3, 'X', 2020
UNION
SELECT 0, 'X', 2021
IF OBJECT_ID('tempdb..#ALLOWANCES') IS NOT NULL DROP TABLE #ALLOWANCES
CREATE TABLE #ALLOWANCES (THEYEAR INT, DAYTYPE VARCHAR(20), NUMDAYSALLOWED INT, MAXROLLOVER INT)
INSERT INTO #ALLOWANCES (THEYEAR, DAYTYPE, NUMDAYSALLOWED, MAXROLLOVER)
SELECT 2019, 'X', 3, 3
UNION
SELECT 2020, 'X', 3, 3
UNION
SELECT 2021, 'X', 3, 3
SELECT C.*, A.NUMDAYSALLOWED, A.MAXROLLOVER
FROM #COUNTS C
JOIN #ALLOWANCES A ON C.DAYTYPE = A.DAYTYPE AND C.THEYEAR = A.THEYEAR

The tricky part is to limit the rollover amount. This is maybe possible with window functions, but I think this is easier to do with a recursive query:
with
data as (
select c.*, a.numdaysallowed, a.maxrollover,
row_number() over(partition by c.daytype order by c.theyear) rn
from #counts c
inner join #allowances a on a.theyear = c.theyear and a.daytype = c.daytype
),
cte as (
select d.*,
numdaysallowed as totaldaysallowed,
numdaysallowed - used as actualrollover
from data d
where rn = 1
union all
select d.*,
d.numdaysallowed + c.actualrollover,
case when d.numdaysallowed + c.actualrollover - d.used > d.maxrollover
then 3
else d.numdaysallowed + c.actualrollover - d.used
end
from cte c
inner join data d on d.rn = c.rn + 1 and d.daytype = c.daytype
)
select * from cte order by theyear
Demo on DB Fiddle

Related

SQL Query Help - Negative reporting

Perhaps somebody can help with Ideas or a Solution. A User asked me for a negative report. We have a table with tickets each ticket has a ticket number which would be easy to select but the user wants a list of missing tickets between the first and last ticket in the system.
E.g. Select TicketNr from Ticket order by TicketNr
Result
1,
2,
4,
7,
11
But we actually want the result 3,5,6,8,9,10
CREATE TABLE [dbo].[Ticket](
[pknTicketId] [int] IDENTITY(1,1) NOT NULL,
[TicketNr] [int] NULL
) ON [PRIMARY]
GO
SQL Server 2016 - TSQL
Any ideas ?
So a bit more information is need all solution thus far works on small table. Our production database has over 4 million tickets. Hence why we need to find the missing ones.
First get the minimum and maximum, then generate all posible ticket numbers and finally select the ones that are missing.
;WITH FirstAndLast AS
(
SELECT
MinTicketNr = MIN(T.TicketNr),
MaxTicketNr = MAX(T.TicketNr)
FROM
Ticket AS T
),
AllTickets AS
(
SELECT
TicketNr = MinTicketNr,
MaxTicketNr = T.MaxTicketNr
FROM
FirstAndLast AS T
UNION ALL
SELECT
TicketNr = A.TicketNr + 1,
MaxTicketNr = A.MaxTicketNr
FROM
AllTickets AS A
WHERE
A.TicketNr + 1 <= A.MaxTicketNr
)
SELECT
A.TicketNr
FROM
AllTickets AS A
WHERE
NOT EXISTS (
SELECT
'missing ticket'
FROM
Ticket AS T
WHERE
A.TicketNr = T.TicketNr)
ORDER BY
A.TicketNr
OPTION
(MAXRECURSION 32000)
If you can accept the results in a different format, the following will do what you want:
select TicketNr + 1 as first_missing,
next_TicketNr - 1 as last_missing,
(next_TicketNr - TicketNr - 1) as num_missing
from (select t.*, lead(TicketNr) over (order by TicketNr) as next_TicketNr
from Ticket t
) t
where next_TicketNr <> TicketNr + 1;
This shows each sequence of missing ticket numbers on a single row, rather than a separate row for each of them.
If you do use a recursive CTE, I would recommend doing it only for the missing tickets:
with cte as (
select (TicketNr + 1) as missing_TicketNr
from (select t.*, lead(TicketNr) over (order by TicketNr) as next_ticketNr
from tickets t
) t
where next_TicketNr <> TicketNr + 1
union all
select missing_TicketNr + 1
from cte
where not exists (select 1 from tickets t2 where t2.TicketNr = cte.missing_TicketNr + 1)
)
select *
from cte;
This version starts with the list of missing ticket numbers. It then adds a new one, as the numbers are not found.
One method is to use recursive cte to find the missing ticket numbers :
with missing as (
select min(TicketNr) as mnt, max(TicketNr) as mxt
from ticket t
union all
select mnt+1, mxt
from missing m
where mnt < mxt
)
select m.*
from missing m
where not exists (select 1 from tickets t where t.TicketNr = m.mnt);
This should do the trick: SQL Fiddle
declare #ticketsTable table (ticketNo int not null)
insert #ticketsTable (ticketNo) values (1),(2),(4),(7),(11)
;with cte1(ticketNo, isMissing, sequenceNo) AS
(
select ticketNo
, 0
, row_number() over (order by ticketNo)
from #ticketsTable
)
, cte2(ticketNo, isMissing, sequenceNo) AS
(
select ticketNo, isMissing, sequenceNo
from cte1
union all
select a.ticketNo + 1
, 1
, a.sequenceNo
from cte2 a
inner join cte1 b
on b.sequenceNo = a.sequenceNo + 1
and b.ticketNo != a.ticketNo + 1
)
select *
from cte2
where isMissing = 1
order by ticketNo
It works by collecting all of the existing tickets, marking them as existing, and assigning each a consecutive number giving their order in the original list.
We can then see the gaps in the list by finding any spots where the consecutive order number shows the next record, but the ticket numbers are not consecutive.
Finally, we recursively fill in the gaps; working from the start of a gap and adding new records until that gap's consecutive numbers no longer has a gap between the related ticket numbers.
I think this one give you easiest solution
with cte as(
select max(TicketNr) maxnum,min(TicketNr) minnum from Ticket )
select a.number FROM master..spt_values a,cte
WHERE Type = 'P' and number < cte.maxnum and number > cte.minno
except
select TicketNr FROM Ticket
So After looking at all the solutions
I went with creating a temp table with a full range of number from Starting to Ending ticket and then select from the Temp table where the ticket number not in the ticket table.
The reason being I kept running in MAXRECURSION problems.

Calculating current consecutive days from a table

I have what seems to be a common business request but I can't find no clear solution. I have a daily report (amongst many) that gets generated based on failed criteria and gets saved to a table. Each report has a type id tied to it to signify which report it is, and there is an import event id that signifies the day the imports came in (a date column is added for extra clarification). I've added a sqlfiddle to see the basic schema of the table (renamed for privacy issues).
http://www.sqlfiddle.com/#!3/81945/8
All reports currently generated are working fine, so nothing needs to be modified on the table. However, for one report (type 11), not only I need pull the invoices that showed up today, I also need to add one column that totals the amount of consecutive days from date of run for that invoice (including current day). The result should look like the following, based on the schema provided:
INVOICE MESSAGE EVENT_DATE CONSECUTIVE_DAYS_ON_REPORT
12345 Yes July, 30 2013 6
54355 Yes July, 30 2013 2
644644 Yes July, 30 2013 4
I only need the latest consecutive days, not any other set that may show up. I've tried to run self joins to no avail, and my last attempt is also listed as part of the sqlfiddle file, to no avail. Any suggestions or ideas? I'm quite stuck at the moment.
FYI: I am working in SQL Server 2000! I have seen a lot of neat tricks that have come out in 2005 and 2008, but I can't access them.
Your help is greatly appreciated!
Something like this? http://www.sqlfiddle.com/#!3/81945/14
SELECT
[final].*,
[last].total_rows
FROM
tblEventInfo AS [final]
INNER JOIN
(
SELECT
[first_of_last].type_id,
[first_of_last].invoice,
MAX([all_of_last].event_date) AS event_date,
COUNT(*) AS total_rows
FROM
(
SELECT
[current].type_id,
[current].invoice,
MAX([current].event_date) AS event_date
FROM
tblEventInfo AS [current]
LEFT JOIN
tblEventInfo AS [previous]
ON [previous].type_id = [current].type_id
AND [previous].invoice = [current].invoice
AND [previous].event_date = [current].event_date-1
WHERE
[current].type_id = 11
AND [previous].type_id IS NULL
GROUP BY
[current].type_id,
[current].invoice
)
AS [first_of_last]
INNER JOIN
tblEventInfo AS [all_of_last]
ON [all_of_last].type_id = [first_of_last].type_id
AND [all_of_last].invoice = [first_of_last].invoice
AND [all_of_last].event_date >= [first_of_last].event_date
GROUP BY
[first_of_last].type_id,
[first_of_last].invoice
)
AS [last]
ON [last].type_id = [final].type_id
AND [last].invoice = [final].invoice
AND [last].event_date = [final].event_date
The inner most query looks up the starting record of the last block of consecutive records.
Then that joins on to all the records in that block of consecutive records, giving the final date and the count of rows (consecutive days).
Then that joins on to the row for the last day to get the message, etc.
Make sure that in reality you have an index on (type_id, invoice, event_date).
You have multiple problems. Tackle them separately and build up.
Problems:
1) Identifying consecutive ranges: subtract the row_number from the range column and group by the result
2) No ROW_NUMBER() functions in SQL 2000: Fake it with a correlated subquery.
3) You actually want DENSE_RANK() instead of ROW_NUMBER: Make a list of unique dates first.
Solutions:
3)
SELECT MAX(id) AS id,invoice,event_date FROM tblEventInfo GROUP BY invoice,event_date
2)
SELECT t2.invoice,t2.event_date,t2.id,
DATEDIFF(day,(SELECT COUNT(DISTINCT event_date) FROM (SELECT MAX(id) AS id,invoice,event_date FROM tblEventInfo GROUP BY invoice,event_date) t1 WHERE t2.invoice = t1.invoice AND t2.event_date > t1.event_date),t2.event_date) grp
FROM (SELECT MAX(id) AS id,invoice,event_date FROM tblEventInfo GROUP BY invoice,event_date) t2
ORDER BY invoice,grp,event_date
1)
SELECT
t3.invoice AS INVOICE,
MAX(t3.event_date) AS EVENT_DATE,
COUNT(t3.event_date) AS CONSECUTIVE_DAYS_ON_REPORT
FROM (
SELECT t2.invoice,t2.event_date,t2.id,
DATEDIFF(day,(SELECT COUNT(DISTINCT event_date) FROM (SELECT MAX(id) AS id,invoice,event_date FROM tblEventInfo GROUP BY invoice,event_date) t1 WHERE t2.invoice = t1.invoice AND t2.id > t1.id),t2.event_date) grp
FROM (SELECT MAX(id) AS id,invoice,event_date FROM tblEventInfo GROUP BY invoice,event_date) t2
) t3
GROUP BY t3.invoice,t3.grp
The rest of your question is a little ambiguous. If two ranges are of equal length, do you want both or just the most recent? Should the output MESSAGE be 'Yes' if any message = 'Yes' or only if the most recent message = 'Yes'?
This should give you enough of a breadcrumb though
I had a similar requirement not long ago getting a "Top 5" ranking with a consecutive number of periods in Top 5. The only solution I found was to do it in a cursor. The cursor has a date = #daybefore and inside the cursor if your data does not match quit the loop, otherwise set #daybefore = datediff(dd, -1, #daybefore).
Let me know if you want an example. There just seem to be a large number of enthusiasts, who hit downvote when they see the word "cursor" even if they don't have a better solution...
Here, try a scalar function like this:
CREATE FUNCTION ConsequtiveDays
(
#invoice bigint, #date datetime
)
RETURNS int
AS
BEGIN
DECLARE #ct int = 0, #Count_Date datetime, #Last_Date datetime
SELECT #Last_Date = #date
DECLARE counter CURSOR LOCAL FAST_FORWARD
FOR
SELECT event_date FROM tblEventInfo
WHERE invoice = #invoice
ORDER BY event_date DESC
FETCH NEXT FROM counter
INTO #Count_Date
WHILE ##FETCH_STATUS = 0 AND DATEDIFF(dd,#Last_Date,#Count_Date) < 2
BEGIN
#ct = #ct + 1
END
CLOSE counter
DEALLOCATE counter
RETURN #ct
END
GO

how to insert subtract of each two subsequent rows and inserting it into a new column

I have a question about writing query in sql.
in the picture 1 I want to subtract row 2 from row1 (in column date) and insert it's result in row1 of new column with the title of Recency. and again subtract row3 from row2 and insert it in row2 of the new column, and so on.
picture 1:
in fact I want to calculate the recency of each user's activity. for example in the following picture, I calculated this for one user(manually); I want to do this for all of the users by writing a query in sql.
picture 2:
..........................................................................................
and other question:
I also want to calculate the frequency of activity of each user before the current date. I want to calculate frequency for each row. for example for this example, for user abkqz we have:
user name frequency
abkqz 4
abkqz 3
abkqz 2
abkqz 1
abkqz 0
Assuming the following table structure
CREATE TABLE [15853354] -- Stack Overflow question number
(
[user-name] VARCHAR(20),
[submissions] INT,
[date] DATE,
[score] NUMERIC(9,2),
[points] NUMERIC(9,1)
)
INSERT [15853354]
VALUES
('abkqz', 5, '12 JUL 2010', 83.91, 112.5),
('abkqz', 5, '9 JUN 2010', 77.27, 0),
('abkqz', 5, '17 MAY 2010', 91.87, 315)
Then you could write the following query
;WITH [cte15853354] AS
(
SELECT
[user-name],
[submissions],
[date],
[score],
[points],
ROW_NUMBER() OVER (ORDER BY [user-name], [date] DESC) AS [ROWNUMBER]
FROM [15853354]
)
SELECT
t.[user-name],
t.[submissions],
DATEDIFF(DAY, ISNULL([t-1].[date],t.[date]),t.[date]) AS [recency],
t.[score],
t.[points]
FROM [cte15853354] t
LEFT JOIN [cte15853354] [t-1]
ON [t-1].[user-name] = t.[user-name]
AND [t-1].[ROWNUMBER] = t.[ROWNUMBER] + 1
This uses a Common Table Expression to calculate a row number, and then does a self join to join each row with the next, and then calculates the date difference in days.
This is the result:
Try something like this (untested, since sample data was only posted in a picture). The query users analytical function options that were introduced in SQL Server 2012, so this won't work on an earlier version.
select
[user-name],
submissions,
score,
datediff(day,
lag([date],1) over (
partition by [user-name]
order by [date],
[date]) as recency,
count(*) over (
partition by [user-name]
order by [date] desc) -1 as frequency
from yourTable;

Multiple Running Totals with Group By

I am struggling to find a good way to run running totals with a group by in it, or the equivalent. The below cursor based running total works on a complete table, but I would like to expand this to add a "Client" dimension. So I would get running totals as the below creates but for each company (ie Company A, Company B, Company C, etc.) in one table
CREATE TABLE test (tag int, Checks float, AVG_COST float, Check_total float, Check_amount float, Amount_total float, RunningTotal_Check float,
RunningTotal_Amount float)
DECLARE #tag int,
#Checks float,
#AVG_COST float,
#check_total float,
#Check_amount float,
#amount_total float,
#RunningTotal_Check float ,
#RunningTotal_Check_PCT float,
#RunningTotal_Amount float
SET #RunningTotal_Check = 0
SET #RunningTotal_Check_PCT = 0
SET #RunningTotal_Amount = 0
DECLARE aa_cursor CURSOR fast_forward
FOR
SELECT tag, Checks, AVG_COST, check_total, check_amount, amount_total
FROM test_3
OPEN aa_cursor
FETCH NEXT FROM aa_cursor INTO #tag, #Checks, #AVG_COST, #check_total, #Check_amount, #amount_total
WHILE ##FETCH_STATUS = 0
BEGIN
SET #RunningTotal_CHeck = #RunningTotal_CHeck + #checks
set #RunningTotal_Amount = #RunningTotal_Amount + #Check_amount
INSERT test VALUES (#tag, #Checks, #AVG_COST, #check_total, #Check_amount, #amount_total, #RunningTotal_check, #RunningTotal_Amount )
FETCH NEXT FROM aa_cursor INTO #tag, #Checks, #AVG_COST, #check_total, #Check_amount, #amount_total
END
CLOSE aa_cursor
DEALLOCATE aa_cursor
SELECT *, RunningTotal_Check/Check_total as CHECK_RUN_PCT, round((RunningTotal_Check/Check_total *100),0) as CHECK_PCT_BIN, RunningTotal_Amount/Amount_total as Amount_RUN_PCT, round((RunningTotal_Amount/Amount_total * 100),0) as Amount_PCT_BIN
into test_4
FROM test ORDER BY tag
create clustered index IX_TESTsdsdds3 on test_4(tag)
DROP TABLE test
----------------------------------
I can the the running total for any 1 company but I would like to do it for multiple to produce something like the results below.
CLIENT COUNT Running Total
Company A 1 6.7%
Company A 2 20.0%
Company A 3 40.0%
Company A 4 66.7%
Company A 5 100.0%
Company B 1 3.6%
Company B 2 10.7%
Company B 3 21.4%
Company B 4 35.7%
Company B 5 53.6%
Company B 6 75.0%
Company B 7 100.0%
Company C 1 3.6%
Company C 2 10.7%
Company C 3 21.4%
Company C 4 35.7%
Company C 5 53.6%
Company C 6 75.0%
Company C 7 100.0%
This is finally simple to do in SQL Server 2012, where SUM and COUNT support OVER clauses that contain ORDER BY. Using Cris's #Checks table definition:
SELECT
CompanyID,
count(*) over (
partition by CompanyID
order by Cleared, ID
) as cnt,
str(100.0*sum(Amount) over (
partition by CompanyID
order by Cleared, ID
)/
sum(Amount) over (
partition by CompanyID
),5,1)+'%' as RunningTotalForThisCompany
FROM #Checks;
SQL Fiddle here.
I originally started posting the SQL Server 2012 equivalent (since you didn't mention what version you were using). Steve has done a great job of showing the simplicity of this calculation in the newest version of SQL Server, so I'll focus on a few methods that work on earlier versions of SQL Server (back to 2005).
I'm going to take some liberties with your schema, since I can't figure out what all these #test and #test_3 and #test_4 temporary tables are supposed to represent. How about:
USE tempdb;
GO
CREATE TABLE dbo.Checks
(
Client VARCHAR(32),
CheckDate DATETIME,
Amount DECIMAL(12,2)
);
INSERT dbo.Checks(Client, CheckDate, Amount)
SELECT 'Company A', '20120101', 50
UNION ALL SELECT 'Company A', '20120102', 75
UNION ALL SELECT 'Company A', '20120103', 120
UNION ALL SELECT 'Company A', '20120104', 40
UNION ALL SELECT 'Company B', '20120101', 75
UNION ALL SELECT 'Company B', '20120105', 200
UNION ALL SELECT 'Company B', '20120107', 90;
Expected output in this case:
Client Count Running Total
--------- ----- -------------
Company A 1 17.54
Company A 2 43.86
Company A 3 85.96
Company A 4 100.00
Company B 1 20.55
Company B 2 75.34
Company B 3 100.00
One way:
;WITH gt(Client, Totals) AS
(
SELECT Client, SUM(Amount)
FROM dbo.Checks AS c
GROUP BY Client
), n (Client, Amount, rn) AS
(
SELECT c.Client, c.Amount,
ROW_NUMBER() OVER (PARTITION BY c.Client ORDER BY c.CheckDate)
FROM dbo.Checks AS c
)
SELECT n.Client, [Count] = n.rn,
[Running Total] = CONVERT(DECIMAL(5,2), 100.0*(
SELECT SUM(Amount) FROM n AS n2
WHERE Client = n.Client AND rn <= n.rn)/gt.Totals
)
FROM n INNER JOIN gt ON n.Client = gt.Client
ORDER BY n.Client, n.rn;
A slightly faster alternative - more reads but shorter duration and simpler plan:
;WITH x(Client, CheckDate, rn, rt, gt) AS
(
SELECT Client, CheckDate, rn = ROW_NUMBER() OVER
(PARTITION BY Client ORDER BY CheckDate),
(SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client
AND CheckDate <= c.CheckDate),
(SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client)
FROM dbo.Checks AS c
)
SELECT Client, [Count] = rn,
[Running Total] = CONVERT(DECIMAL(5,2), rt * 100.0/gt)
FROM x
ORDER BY Client, [Count];
While I've offered set-based alternatives here, in my experience I have observed that a cursor is often the fastest supported way to perform running totals. There are other methods such as the quirky update which perform about marginally faster but the result is not guaranteed. The set-based approach where you perform a self-join becomes more and more expensive as the source row counts go up - so what seems to perform okay in testing with a small table, as the table gets larger, the performance goes down.
I have a blog post almost fully prepared that goes through a slightly simpler performance comparison of various running totals approaches. It is simpler because it is not grouped and it only shows the totals, not the running total percentage. I hope to publish this post soon and will try to remember to update this space.
There is also another alternative to consider that doesn't require reading previous rows multiple times. It's a concept Hugo Kornelis describes as "set-based iteration." I don't recall where I first learned this technique, but it makes a lot of sense in some scenarios.
DECLARE #c TABLE
(
Client VARCHAR(32),
CheckDate DATETIME,
Amount DECIMAL(12,2),
rn INT,
rt DECIMAL(15,2)
);
INSERT #c SELECT Client, CheckDate, Amount,
ROW_NUMBER() OVER (PARTITION BY Client
ORDER BY CheckDate), 0
FROM dbo.Checks;
DECLARE #i INT, #m INT;
SELECT #i = 2, #m = MAX(rn) FROM #c;
UPDATE #c SET rt = Amount WHERE rn = 1;
WHILE #i <= #m
BEGIN
UPDATE c SET c.rt = c2.rt + c.Amount
FROM #c AS c
INNER JOIN #c AS c2
ON c.rn = c2.rn + 1
AND c.Client = c2.Client
WHERE c.rn = #i;
SET #i = #i + 1;
END
SELECT Client, [Count] = rn, [Running Total] = CONVERT(
DECIMAL(5,2), rt*100.0 / (SELECT TOP 1 rt FROM #c
WHERE Client = c.Client ORDER BY rn DESC)) FROM #c AS c;
While this does perform a loop, and everyone tells you that loops and cursors are bad, one gain with this method is that once the previous row's running total has been calculated, we only have to look at the previous row instead of summing all prior rows. The other gain is that in most cursor-based solutions you have to go through each client and then each check. In this case, you go through all clients' 1st checks once, then all clients' 2nd checks once. So instead of (client count * avg check count) iterations, we only do (max check count) iterations. This solution doesn't make much sense for the simple running totals example, but for the grouped running totals example it should be tested against the set-based solutions above. Not a chance it will beat Steve's approach, though, if you are on SQL Server 2012.
UPDATE
I've blogged about various running totals approaches here:
http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals
I didn't exactly understand the schema you were pulling from, but here is a quick query using a temp table that shows how to do a running total in a set based operation.
CREATE TABLE #Checks
(
ID int IDENTITY(1,1) PRIMARY KEY
,CompanyID int NOT NULL
,Amount float NOT NULL
,Cleared datetime NOT NULL
)
INSERT INTO #Checks
VALUES
(1,5,'4/1/12')
,(1,5,'4/2/12')
,(1,7,'4/5/12')
,(2,10,'4/3/12')
SELECT Info.ID, Info.CompanyID, Info.Amount, RunningTotal.Total, Info.Cleared
FROM
(
SELECT main.ID, SUM(other.Amount) as Total
FROM
#Checks main
JOIN
#Checks other
ON
main.CompanyID = other.CompanyID
AND
main.Cleared >= other.Cleared
GROUP BY
main.ID) RunningTotal
JOIN
#Checks Info
ON
RunningTotal.ID = Info.ID
DROP TABLE #Checks

Math with previous row in SQL, avoiding nested queries?

I want to do some math on the previous rows in an SQL request in order to avoid doing it in my code.
I have a table representing the sales of two entities (the data represented here is doesn't make much sense and it's just an excerpt) :
YEAR ID SALES PURCHASE MARGIN
2009 1 10796820,57 2662369,19 8134451,38
2009 2 2472271,53 2066312,34 405959,19
2008 1 9641213,19 1223606,68 8417606,51
2008 2 3436363,86 2730035,19 706328,67
I want to know how the sales, purchase, margin... have evolved and compare one year to the previous one.
In short I want an SQL result with the evolutions pre-computed like this :
YEAR ID SALES SALES_EVOLUTION PURCHASE PURCHASE_EVOLUTION MARGIN MARGIN_EVOLUTION
2009 1 10796820,57 11,99 2662369,19 117,58 8134451,38 -3,36
2009 2 2472271,53 -28,06 2066312,34 -24,31 405959,19 -42,53
2008 1 9641213,19 1223606,68 8417606,51
2008 2 3436363,86 2730035,19 706328,67
I could do some ugly stuff :
SELECT *, YEAR, ID, SALES , (SALES/(SELECT SALES FROM TABLE WHERE YEAR = OUTER_TABLE.YEAR-1 AND ID = OUTER_TABLE.ID) -1)*100 as SALES_EVOLUTION (...)
FROM TABLE as OUTER_TABLE
ORDER BY YEAR DESC, ID ASC
But I have arround 20 fields for which I would have to do a nested query, meaning I would have a very huge and ugly query.
Is there a better way to do this, with less SQL ?
Using sql server (but this should work for almost any sql), with the table provided you can use a LEFT JOIN
DECLARE #Table TABLE(
[YEAR] INT,
ID INT,
SALES FLOAT,
PURCHASE FLOAT,
MARGIN FLOAT
)
INSERT INTO #Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2009,1,10796820.57,2662369.19,8134451.38
INSERT INTO #Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2009,2,2472271.53,2066312.34,405959.19
INSERT INTO #Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2008,1,9641213.19,1223606.68,8417606.51
INSERT INTO #Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2008,2,3436363.86,2730035.19,706328.67
SELECT cur.*,
((cur.Sales / prev.SALES) - 1) * 100
FROM #Table cur LEFT JOIN
#Table prev ON cur.ID = prev.ID AND cur.[YEAR] - 1 = prev.[YEAR]
The LEFT JOIN will allow you to still see values from 2008, where an INNER JOIN would not.
Old skool solution:
SELECT c.YEAR, c.ID, c.SALES, c.PURCHASE, c.MARGIN
, p.YEAR, p.ID, p.SALES, p.PURCHASE, p.MARGIN
FROM tab AS c -- current
INNER JOIN tab AS p -- previous
ON c.year = p.year - 1
AND c.id = p.id
If you have a db with analytical functions (MS SQL, Oracle) you can use the LEAD or LAG analytical functions, see http://www.oracle-base.com/articles/misc/LagLeadAnalyticFunctions.php
I think this would be the correct application:
SELECT c.YEAR, c.ID, c.SALES, c.PURCHASE, c.MARGIN
, LAG(c.YEAR, 1, 0) OVER (ORDER BY ID,YEAR)
, LAG(c.ID, 1, 0) OVER (ORDER BY ID,YEAR)
, LAG(c.SALES, 1, 0) OVER (ORDER BY ID,YEAR)
, LAG(c.PURCHASE, 1, 0) OVER (ORDER BY ID,YEAR)
, LAG(c.MARGIN, 1, 0) OVER (ORDER BY ID,YEAR)
FROM tab AS c -- current
(not really sure, haven't played with this enough)
You can do it like this:
SELECT t1.*, t1.YEAR, t1.ID, t1.SALES , ((t1.sales/t2.sales) -1) * 100 as SALES_EVOLUTION
(...)
FROM Table t1 JOIN Table t2 ON t1.Year = (t2.Year + 1) AND t1.Id = t2.Id
ORDER BY t1.YEAR DESC, t1.ID ASC
Now, if you want to compare more years, you'd have to do more joins, so it is a slightly ugly solution.