We have appointment table as shown below. Each appointment need to be categorized as "New" or "Followup". Any appointment (for a patient) within 30 days of first appointment (of that patient) is Followup. After 30 days, appointment is again "New". Any appointment within 30 days become "Followup".
I am currently doing this by typing while loop.
How to achieve this without WHILE loop?
Table
CREATE TABLE #Appt1 (ApptID INT, PatientID INT, ApptDate DATE)
INSERT INTO #Appt1
SELECT 1,101,'2020-01-05' UNION
SELECT 2,505,'2020-01-06' UNION
SELECT 3,505,'2020-01-10' UNION
SELECT 4,505,'2020-01-20' UNION
SELECT 5,101,'2020-01-25' UNION
SELECT 6,101,'2020-02-12' UNION
SELECT 7,101,'2020-02-20' UNION
SELECT 8,101,'2020-03-30' UNION
SELECT 9,303,'2020-01-28' UNION
SELECT 10,303,'2020-02-02'
You need to use recursive query.
The 30days period is counted starting from prev(and no it is not possible to do it without recursion/quirky update/loop). That is why all the existing answer using only ROW_NUMBER failed.
WITH f AS (
SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY PatientId ORDER BY ApptDate)
FROM Appt1
), rec AS (
SELECT Category = CAST('New' AS NVARCHAR(20)), ApptId, PatientId, ApptDate, rn, startDate = ApptDate
FROM f
WHERE rn = 1
UNION ALL
SELECT CAST(CASE WHEN DATEDIFF(DAY, rec.startDate,f.ApptDate) <= 30 THEN N'FollowUp' ELSE N'New' END AS NVARCHAR(20)),
f.ApptId,f.PatientId,f.ApptDate, f.rn,
CASE WHEN DATEDIFF(DAY, rec.startDate, f.ApptDate) <= 30 THEN rec.startDate ELSE f.ApptDate END
FROM rec
JOIN f
ON rec.rn = f.rn - 1
AND rec.PatientId = f.PatientId
)
SELECT ApptId, PatientId, ApptDate, Category
FROM rec
ORDER BY PatientId, ApptDate;
db<>fiddle demo
Output:
+---------+------------+-------------+----------+
| ApptId | PatientId | ApptDate | Category |
+---------+------------+-------------+----------+
| 1 | 101 | 2020-01-05 | New |
| 5 | 101 | 2020-01-25 | FollowUp |
| 6 | 101 | 2020-02-12 | New |
| 7 | 101 | 2020-02-20 | FollowUp |
| 8 | 101 | 2020-03-30 | New |
| 9 | 303 | 2020-01-28 | New |
| 10 | 303 | 2020-02-02 | FollowUp |
| 2 | 505 | 2020-01-06 | New |
| 3 | 505 | 2020-01-10 | FollowUp |
| 4 | 505 | 2020-01-20 | FollowUp |
+---------+------------+-------------+----------+
How it works:
f - get starting point(anchor - per every PatientId)
rec - recursibe part, if the difference between current value and prev is > 30 change the category and starting point, in context of PatientId
Main - display sorted resultset
Similar class:
Conditional SUM on Oracle - Capping a windowed function
Session window (Azure Stream Analytics)
Running Total until specific condition is true - Quirky update
Addendum
Do not ever use this code on production!
But another option, that is worth mentioning besides using cte, is to use temp table and update in "rounds"
It could be done in "single" round(quirky update):
CREATE TABLE Appt_temp (ApptID INT , PatientID INT, ApptDate DATE, Category NVARCHAR(10))
INSERT INTO Appt_temp(ApptId, PatientId, ApptDate)
SELECT ApptId, PatientId, ApptDate
FROM Appt1;
CREATE CLUSTERED INDEX Idx_appt ON Appt_temp(PatientID, ApptDate);
Query:
DECLARE #PatientId INT = 0,
#PrevPatientId INT,
#FirstApptDate DATE = NULL;
UPDATE Appt_temp
SET #PrevPatientId = #PatientId
,#PatientId = PatientID
,#FirstApptDate = CASE WHEN #PrevPatientId <> #PatientId THEN ApptDate
WHEN DATEDIFF(DAY, #FirstApptDate, ApptDate)>30 THEN ApptDate
ELSE #FirstApptDate
END
,Category = CASE WHEN #PrevPatientId <> #PatientId THEN 'New'
WHEN #FirstApptDate = ApptDate THEN 'New'
ELSE 'FollowUp'
END
FROM Appt_temp WITH(INDEX(Idx_appt))
OPTION (MAXDOP 1);
SELECT * FROM Appt_temp ORDER BY PatientId, ApptDate;
db<>fiddle Quirky update
You could do this with a recursive cte. You should first order by apptDate within each patient. That can be accomplished by a run-of-the-mill cte.
Then, in the anchor portion of your recursive cte, select the first ordering for each patient, mark the status as 'new', and also mark the apptDate as the date of the most recent 'new' record.
In the recursive portion of your recursive cte, increment to the next appointment, calculate the difference in days between the present appointment and the most recent 'new' appointment date. If it's greater than 30 days, mark it 'new' and reset the most recent new appointment date. Otherwise mark it as 'follow up' and just pass along the existing days since new appointment date.
Finallly, in the base query, just select the columns you want.
with orderings as (
select *,
rn = row_number() over(
partition by patientId
order by apptDate
)
from #appt1 a
),
markings as (
select apptId,
patientId,
apptDate,
rn,
type = convert(varchar(10),'new'),
dateOfNew = apptDate
from orderings
where rn = 1
union all
select o.apptId, o.patientId, o.apptDate, o.rn,
type = convert(varchar(10),iif(ap.daysSinceNew > 30, 'new', 'follow up')),
dateOfNew = iif(ap.daysSinceNew > 30, o.apptDate, m.dateOfNew)
from markings m
join orderings o
on m.patientId = o.patientId
and m.rn + 1 = o.rn
cross apply (select daysSinceNew = datediff(day, m.dateOfNew, o.apptDate)) ap
)
select apptId, patientId, apptDate, type
from markings
order by patientId, rn;
I should mention that I initially deleted this answer because Abhijeet Khandagale's answer seemed to meet your needs with a simpler query (after reworking it a bit). But with your comment to him about your business requirement and your added sample data, I undeleted mine because believe this one meets your needs.
I'm not sure that it's exactly what you implemented. But another option, that is worth mentioning besides using cte, is to use temp table and update in "rounds". So we are going to update temp table while all statuses are not set correctly and build result in an iterative way. We can control number of iteration using simply local variable.
So we split each iteration into two stages.
Set all Followup values that are near to New records. That's pretty easy to do just using right filter.
For the rest of the records that dont have status set we can select first in group with same PatientID. And say that they are new since they not processed by the first stage.
So
CREATE TABLE #Appt2 (ApptID INT, PatientID INT, ApptDate DATE, AppStatus nvarchar(100))
select * from #Appt1
insert into #Appt2 (ApptID, PatientID, ApptDate, AppStatus)
select a1.ApptID, a1.PatientID, a1.ApptDate, null from #Appt1 a1
declare #limit int = 0;
while (exists(select * from #Appt2 where AppStatus IS NULL) and #limit < 1000)
begin
set #limit = #limit+1;
update a2
set
a2.AppStatus = IIF(exists(
select *
from #Appt2 a
where
0 > DATEDIFF(day, a2.ApptDate, a.ApptDate)
and DATEDIFF(day, a2.ApptDate, a.ApptDate) > -30
and a.ApptID != a2.ApptID
and a.PatientID = a2.PatientID
and a.AppStatus = 'New'
), 'Followup', a2.AppStatus)
from #Appt2 a2
--select * from #Appt2
update a2
set a2.AppStatus = 'New'
from #Appt2 a2 join (select a.*, ROW_NUMBER() over (Partition By PatientId order by ApptId) rn from (select * from #Appt2 where AppStatus IS NULL) a) ar
on a2.ApptID = ar.ApptID
and ar.rn = 1
--select * from #Appt2
end
select * from #Appt2 order by PatientID, ApptDate
drop table #Appt1
drop table #Appt2
Update. Read the comment provided by Lukasz. It's by far smarter way. I leave my answer just as an idea.
I believe the recursive common expression is great way to optimize queries avoiding loops, but in some cases it can lead to bad performance and should be avoided if possible.
I use the code below to solve the issue and test it will more values, but encourage you to test it with your real data, too.
WITH DataSource AS
(
SELECT *
,CEILING(DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30 + 0.000001) AS [GroupID]
FROM #Appt1
)
SELECT *
,IIF(ROW_NUMBER() OVER (PARTITION BY [PatientID], [GroupID] ORDER BY [ApptDate]) = 1, 'New', 'Followup')
FROM DataSource
ORDER BY [PatientID]
,[ApptDate];
The idea is pretty simple - I want separate the records in group (30 days), in which group the smallest record is new, the others are follow ups. Check how the statement is built:
SELECT *
,DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate])
,DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30
,CEILING(DATEDIFF(DAY, MIN([ApptDate]) OVER (PARTITION BY [PatientID]), [ApptDate]) * 1.0 / 30 + 0.000001)
FROM #Appt1
ORDER BY [PatientID]
,[ApptDate];
So:
first, we are getting the first date, for each group and calculating the differences in days with the current one
then, we are want to get groups - * 1.0 / 30 is added
as for 30, 60, 90, etc days we are getting whole number and we wanted to start a new period, I have added + 0.000001; also, we are using ceiling function to get the smallest integer greater than, or equal to, the specified numeric expression
That's it. Having such group we simply use ROW_NUMBER to find our start date and make it as new and leaving the rest as follow ups.
With due respect to everybody and in IMHO,
There is not much difference between While LOOP and Recursive CTE in terms of RBAR
There is not much performance gain when using Recursive CTE and Window Partition function all in one.
Appid should be int identity(1,1) , or it should be ever increasing clustered index.
Apart from other benefit it also ensure that all successive row APPDate of that patient must be greater.
This way you can easily play with APPID in your query which will be more efficient than putting inequality operator like >,< in APPDate.
Putting inequality operator like >,< in APPID will aid Sql Optimizer.
Also there should be two date column in table like
APPDateTime datetime2(0) not null,
Appdate date not null
As these are most important columns in most important table,so not much cast ,convert.
So Non clustered index can be created on Appdate
Create NonClustered index ix_PID_AppDate_App on APP (patientid,APPDate) include(other column which is not i predicate except APPID)
Test my script with other sample data and lemme know for which sample data it not working.
Even if it do not work then I am sure it can be fix in my script logic itself.
CREATE TABLE #Appt1 (ApptID INT, PatientID INT, ApptDate DATE)
INSERT INTO #Appt1
SELECT 1,101,'2020-01-05' UNION ALL
SELECT 2,505,'2020-01-06' UNION ALL
SELECT 3,505,'2020-01-10' UNION ALL
SELECT 4,505,'2020-01-20' UNION ALL
SELECT 5,101,'2020-01-25' UNION ALL
SELECT 6,101,'2020-02-12' UNION ALL
SELECT 7,101,'2020-02-20' UNION ALL
SELECT 8,101,'2020-03-30' UNION ALL
SELECT 9,303,'2020-01-28' UNION ALL
SELECT 10,303,'2020-02-02'
;With CTE as
(
select a1.* ,a2.ApptDate as NewApptDate
from #Appt1 a1
outer apply(select top 1 a2.ApptID ,a2.ApptDate
from #Appt1 A2
where a1.PatientID=a2.PatientID and a1.ApptID>a2.ApptID
and DATEDIFF(day,a2.ApptDate, a1.ApptDate)>30
order by a2.ApptID desc )A2
)
,CTE1 as
(
select a1.*, a2.ApptDate as FollowApptDate
from CTE A1
outer apply(select top 1 a2.ApptID ,a2.ApptDate
from #Appt1 A2
where a1.PatientID=a2.PatientID and a1.ApptID>a2.ApptID
and DATEDIFF(day,a2.ApptDate, a1.ApptDate)<=30
order by a2.ApptID desc )A2
)
select *
,case when FollowApptDate is null then 'New'
when NewApptDate is not null and FollowApptDate is not null
and DATEDIFF(day,NewApptDate, FollowApptDate)<=30 then 'New'
else 'Followup' end
as Category
from cte1 a1
order by a1.PatientID
drop table #Appt1
Although it's not clearly addressed in the question, it's easy to figure out that the appointment dates cannot be simply categorized by 30-day groups. It makes no business sense. And you cannot use the appt id either. One can make a new appointment today for 2020-09-06.
Here is how I address this issue. First, get the first appointment, then calculate the date difference between each appointment and the first appt. If it's 0, set to 'New'. If <= 30 'Followup'. If > 30, set as 'Undecided' and do the next round check until there is no more 'Undecided'. And for that, you really need a while loop, but it does not loop through each appointment date, rather only a few datasets. I checked the execution plan. Even though there are only 10 rows, the query cost is significantly lower than that using recursive CTE, but not as low as Lukasz Szozda's addendum method.
IF OBJECT_ID('tempdb..#TEMPTABLE') IS NOT NULL DROP TABLE #TEMPTABLE
SELECT ApptID, PatientID, ApptDate
,CASE WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) = 0) THEN 'New'
WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) <= 30) THEN 'Followup'
ELSE 'Undecided' END AS Category
INTO #TEMPTABLE
FROM #Appt1
WHILE EXISTS(SELECT TOP 1 * FROM #TEMPTABLE WHERE Category = 'Undecided') BEGIN
;WITH CTE AS (
SELECT ApptID, PatientID, ApptDate
,CASE WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) = 0) THEN 'New'
WHEN (DATEDIFF(DAY, MIN(ApptDate) OVER (PARTITION BY PatientID), ApptDate) <= 30) THEN 'Followup'
ELSE 'Undecided' END AS Category
FROM #TEMPTABLE
WHERE Category = 'Undecided'
)
UPDATE #TEMPTABLE
SET Category = CTE.Category
FROM #TEMPTABLE t
LEFT JOIN CTE ON CTE.ApptID = t.ApptID
WHERE t.Category = 'Undecided'
END
SELECT ApptID, PatientID, ApptDate, Category
FROM #TEMPTABLE
I hope this will help you.
WITH CTE AS
(
SELECT #Appt1.*, RowNum = ROW_NUMBER() OVER (PARTITION BY PatientID ORDER BY ApptDate, ApptID) FROM #Appt1
)
SELECT A.ApptID , A.PatientID , A.ApptDate ,
Expected_Category = CASE WHEN (DATEDIFF(MONTH, B.ApptDate, A.ApptDate) > 0) THEN 'New'
WHEN (DATEDIFF(DAY, B.ApptDate, A.ApptDate) <= 30) then 'Followup'
ELSE 'New' END
FROM CTE A
LEFT OUTER JOIN CTE B on A.PatientID = B.PatientID
AND A.rownum = B.rownum + 1
ORDER BY A.PatientID, A.ApptDate
You could use a Case statement.
select
*,
CASE
WHEN DATEDIFF(d,A1.ApptDate,A2.ApptDate)>30 THEN 'New'
ELSE 'FollowUp'
END 'Category'
from
(SELECT PatientId, MIN(ApptId) 'ApptId', MIN(ApptDate) 'ApptDate' FROM #Appt1 GROUP BY PatientID) A1,
#Appt1 A2
where
A1.PatientID=A2.PatientID AND A1.ApptID<A2.ApptID
The question is, should this category be assigned based off the initial appointment, or the one prior? That is, if a Patient has had three appointments, should we compare the third appointment to the first, or the second?
You problem states the first, which is how I've answered. If that's not the case, you'll want to use lag.
Also, keep in mind that DateDiff makes not exception for weekends. If this should be weekdays only, you'll need to create your own Scalar-Valued function.
using Lag function
select apptID, PatientID , Apptdate ,
case when date_diff IS NULL THEN 'NEW'
when date_diff < 30 and (date_diff_2 IS NULL or date_diff_2 < 30) THEN 'Follow Up'
ELSE 'NEW'
END AS STATUS FROM
(
select
apptID, PatientID , Apptdate ,
DATEDIFF (day,lag(Apptdate) over (PARTITION BY PatientID order by ApptID asc),Apptdate) date_diff ,
DATEDIFF(day,lag(Apptdate,2) over (PARTITION BY PatientID order by ApptID asc),Apptdate) date_diff_2
from #Appt1
) SRC
Demo --> https://rextester.com/TNW43808
with cte
as
(
select
tmp.*,
IsNull(Lag(ApptDate) Over (partition by PatientID Order by PatientID,ApptDate),ApptDate) PriorApptDate
from #Appt1 tmp
)
select
PatientID,
ApptDate,
PriorApptDate,
DateDiff(d,PriorApptDate,ApptDate) Elapsed,
Case when DateDiff(d,PriorApptDate,ApptDate)>30
or DateDiff(d,PriorApptDate,ApptDate)=0 then 'New' else 'Followup' end Category from cte
Mine is correct. The authors was incorrect, see elapsed
My data looks like the following
TicketID OwnedbyTeamT Createddate ClosedDate
1234 A
1234 A 01/01/2019 01/05/2019
1234 A 10/05/2018 10/07/2018
1234 B 10/04/2019 10/08/2018
1234 finance 11/01/2018 11/11/2018
1234 B 12/02/2018
Now, I want to calculate the datediff between the closeddates for teams A, and B, if the max closeddate for team A is greater than max closeddate team B. If it is smaller or null I don't want to see them. So, for example,I want to see only one record like this :
TicketID (Datediff)result-days
1234 86
and for another tickets, display the info. For example, if the conditions aren't met then:
TicketID (Datediff)result-days
2456 -1111111
Data sample for 2456:
TicketID OwnedbyTeamT Createddate ClosedDate
2456 A
2456 A 10/01/2019 10/05/2019
2456 B 08/05/2018 08/07/2018
2456 B 06/04/2019 06/08/2018
2456 finance 11/01/2018 11/11/2018
2456 B 12/02/2018
I want to see the difference in days between 01/05/2019 for team A, and
10/08/2018 for team B.
Here is the query that I wrote, however, all I see is -1111111, any help please?:
SELECT A.incidentid,
( CASE
WHEN Max(B.[build validation]) <> 'No data'
AND Max(A.crfs) <> 'No data'
AND Max(B.[build validation]) < Max(A.crfs) THEN
Datediff(day, Max(B.[build validation]), Max(A.crfs))
ELSE -1111111
END ) AS 'Days-CRF-diff'
FROM (SELECT DISTINCT incidentid,
Iif(( ownedbyteam = 'B'
AND titlet LIKE '%Build validation%' ), Cast(
closeddatetimet AS NVARCHAR(255)), 'No data') AS
'Build Validation'
FROM incidentticketspecifics) B
INNER JOIN (SELECT incidentid,
Iif(( ownedbyteamt = 'B'
OR ownedbyteamt =
'Finance' ),
Cast(
closeddatetimet AS NVARCHAR(255)), 'No data') AS
'CRFS'
FROM incidentticketspecifics
GROUP BY incidentid,
ownedbyteamt,
closeddatetimet) CRF
ON A.incidentid = B.incidentid
GROUP BY A.incidentid
I hope the following answer will be of help.
With two subqueries for the two teams (A and B), the max date for every Ticket is brought. A left join between these two tables is performed to have these information in the same row in order to perform DATEDIFF. The last WHERE clause keeps the row with the dates greater for A team than team B.
Please change [YourDB] and [MytableName] in the following code with your names.
--Select the items to be viewed in the final view along with the difference in days
SELECT A.[TicketID],A.[OwnedbyTeamT], A.[Max_DateA],B.[OwnedbyTeamT], B.[Max_DateB], DATEDIFF(dd,B.[Max_DateB],A.[Max_DateA]) AS My_Diff
FROM
(
--The following subquery creates a table A with the max date for every project for team A
SELECT [TicketID]
,[OwnedbyTeamT]
,MAX([ClosedDate]) AS Max_DateA
FROM [YourDB].[dbo].[MytableName]
GROUP BY [TicketID],[OwnedbyTeamT]
HAVING [OwnedbyTeamT]='A')A
--A join between view A and B to bring the max dates for every project
LEFT JOIN (
--The max date for every project for team B
SELECT [TicketID]
,[OwnedbyTeamT]
,MAX([ClosedDate]) AS Max_DateB
FROM [YourDB].[dbo].[MytableName]
GROUP BY [TicketID],[OwnedbyTeamT]
HAVING [OwnedbyTeamT]='B')B
ON A.[TicketID]=B.[TicketID]
--Fill out the rows on the max dates for the teams
WHERE A.Max_DateA>B.Max_DateB
You might be able to do with a PIVOT. I am leaving a working example.
SELECT [TicketID], "A", "B", DATEDIFF(dd,"B","A") AS My_Date_Diff
FROM
(
SELECT [TicketID],[OwnedbyTeamT],MAX([ClosedDate]) AS My_Max
FROM [YourDB].[dbo].[MytableName]
GROUP BY [TicketID],[OwnedbyTeamT]
)Temp
PIVOT
(
MAX(My_Max)
FOR Temp.[OwnedbyTeamT] in ("A","B")
)PIV
WHERE "A">"B"
Your sample query is quite complicated and has conditions not mentioned in the text. It doesn't really help.
I want to calculate the datediff between the closeddates for teams A, and B, if the max closeddate for team A is greater than max closeddate team B. If it is smaller or null I don't want to see them.
I think you want this per TicketId. You can do this using conditional aggregation:
SELECT TicketId,
DATEDIFF(day,
MAX(CASE WHEN OwnedbyTeamT = 'B' THEN ClosedDate END),
MAX(CASE WHEN OwnedbyTeamT = 'A' THEN ClosedDate END) as diff
)
FROM incidentticketspecifics its
GROUP BY TicketId
HAVING MAX(CASE WHEN OwnedbyTeamT = 'A' THEN ClosedDate END) >
MAX(CASE WHEN OwnedbyTeamT = 'B' THEN ClosedDate END)
My query is as follows
SELECT
LEFT(TimePeriod,6) Period, -- string field with YYYYMMDD
SUM(Value) Value
FROM
f_Trans_GL
WHERE
Account = 228
GROUP BY
TimePeriod
And it returns
Period Value
---------------
201412 80
201501 20
201502 30
201506 50
201509 100
201509 100
I'd like to know the Value difference between rows where the period is 1 month apart. The calculation being [value period] - [value period-1].
The desired output being;
Period Value Calculated
-----------------------------------
201412 80 80 - null = 80
201501 20 20 - 80 = -60
201502 30 30 - 20 = 10
201506 50 50 - null = 50
201509 100 (100 + 100) - null = 200
This illustrates a second challenge, as the period needs to be evaluated if the year changes (the difference between 201501 and 201412 is one month).
And the third challenge being a duplicate Period (201509), in which case the sum of that period needs to be evaluated.
Any indicators on where to begin, if this is possible, would be great!
Thanks in advance
===============================
After I accepted the answer, I tailored this a little to suit my needs, the end result is:
WITH cte
AS (SELECT
ISNULL(CAST(TransactionID AS nvarchar), '_nullTransactionId_') + ISNULL(Description, '_nullDescription_') + CAST(Account AS nvarchar) + Category + Currency + Entity + Scenario AS UID,
LEFT(TimePeriod, 6) Period,
SUM(Value1) Value1,
CAST(LEFT(TimePeriod, 6) + '01' AS date) ord_date
FROM MyTestTable
GROUP BY LEFT(TimePeriod, 6),
TransactionID,
Description,
Account,
Category,
Currency,
Entity,
Scenario,
TimePeriod)
SELECT
a.UID,
a.Period,
--a.Value1,
ISNULL(a.Value1, 0) - ISNULL(b.Value1, 0) Periodic
FROM cte a
LEFT JOIN cte b
ON a.ord_date = DATEADD(MONTH, 1, b.ord_date)
ORDER BY a.UID
I have to get the new value (Periodic) for each UID. This UID must be determined as done here because the PK on the table won't work.
But the issue is that this will return many more rows than I actually have to begin with in my table. If I don't add a GROUP BY and ORDER by UID (as done above), I can tell that the first result for each combination of UID and Period is actually correct, the subsequent rows for that combination, are not.
I'm not sure where to look for a solution, my guess is that the UID is the issue here, and that it will somehow iterate over the field... any direction appreciated.
As pointed by other, first mistake is in Group by you need to Left(timeperiod, 6) instead of timeperiod.
For remaining calculation try something like this
;WITH cte
AS (SELECT LEFT(timeperiod, 6) Period,
Sum(value) Value,
Cast(LEFT(timeperiod, 6) + '01' AS DATE) ord_date
FROM f_trans_gl
WHERE account = 228
GROUP BY LEFT(timeperiod, 6))
SELECT a.period,
a.value,
a.value - Isnull(b.value, 0)
FROM cte a
LEFT JOIN cte b
ON a.ord_date = Dateadd(month, 1, b.ord_date)
If you are using SQL SERVER 2012 then this can be easily done using LAG analytic function
Using a derived table, you can join the data to itself to find rows that are in the preceding period. I have converted your Period to a Date value so you can use SQL Server's dateadd function to check for rows in the previous month:
;WITH cte AS
(
SELECT
LEFT(TimePeriod,6) Period, -- string field with YYYYMMDD
CAST(TimePeriod + '01' AS DATE) PeriodDate
SUM(Value) Value
FROM f_Trans_GL
WHERE Account = 228
GROUP BY LEFT(TimePeriod,6)
)
SELECT c1.Period,
c1.Value,
c1.Value - ISNULL(c2.Value,0) AS Calculation
FROM cte c1
LEFT JOIN cte c2
ON c1.PeriodDate = DATEADD(m,1,c2.PeriodDate)
Without cte, you can also try something like this
SELECT A.Period,A.Value,A.Value-ISNULL(B.Value) Calculated
FROM
(
SELECT LEFT(TimePeriod,6) Period
DATEADD(M,-1,(CONVERT(date,LEFT(TimePeriod,6)+'01'))) PeriodDatePrev,SUM(Value) Value
FROM f_Trans_GL
WHERE Account = 228
GROUP BY LEFT(TimePeriod,6)
) AS A
LEFT OUTER JOIN
(
SELECT LEFT(TimePeriod,6) Period
(CONVERT(date,LEFT(TimePeriod,6)+'01')) PeriodDate,SUM(Value) Value
FROM f_Trans_GL
WHERE Account = 228
GROUP BY LEFT(TimePeriod,6)
) AS B
ON (A.PeriodDatePrev = B.PeriodDate)
ORDER BY 1
**EDIT: Our current server is SQL 2008 R2 so LAG/LEAD functions will not work.
I'm attempting to take multiple streams of data within a table and combine them into 1 stream of data. Given the 3 streams of data below I want the end result to be 1 stream that gives preference to the status 'on'. Recursion seems to be the best option but I've had no luck so far putting together a query that does what i want.
CREATE TABLE #Dates(
id INT IDENTITY,
status VARCHAR(4),
StartDate Datetime,
EndDate Datetime,
booth int)
INSERT #Dates
VALUES
( 'off','2015-01-01 08:00','2015-01-01 08:15',1),
( 'on','2015-01-01 08:15','2015-01-01 09:15',1),
( 'off','2015-01-01 08:50','2015-01-01 09:00',2),
( 'on','2015-01-01 09:00','2015-01-01 09:30',2),
( 'off','2015-01-01 09:30','2015-01-01 09:35',2),
( 'on','2015-01-01 09:35','2015-01-01 10:15',2),
( 'off','2015-01-01 09:30','2015-01-01 10:30',3),
( 'on','2015-01-01 10:30','2015-01-01 11:00',3)
status StartDate EndDate
---------------------------
off 08:00 08:15
on 08:15 09:15
off 08:50 09:00
on 09:00 09:30
off 09:30 09:35
on 09:35 10:15
off 09:30 10:30
on 10:30 11:00
End Result:
status StartDate EndDate
---------------------------
off 8:00 8:15
on 8:15 9:15
on 9:15 9:30
off 9:30 9:35
on 9:35 10:15
off 10:15 10:30
on 10:30 11:00
Essentially, anytime there is a status of 'on' it should override any concurrent 'off' status.
Source:
|----off----||---------on---------|
|---off--||------on----||---off---||--------on------|
|--------------off------------------||------on------|
Result (Either result would work):
|----off----||----------on--------||---on---||---off---||--------on------||-off--||------on------|
|----off----||----------on------------------||---off---||--------on------||-off--||------on------|
Here's the simplest version for 2008 that I was able to figure out:
; with Data (Date) as (
select StartDate from Dates
union
select EndDate from Dates),
Ranges (StartDate, Status) as (
select D.Date, D2.Status
from Data D
outer apply (
select top 1 D2.Status
from Dates D2
where D2.StartDate <= D.Date and D2.EndDate > D.Date
order by case when Status = 'on' then 1 else 2 end
) D2)
select R.StartDate,
(select min(D.Date) from Data D where D.Date > R.StartDate) as EndDate,
Status
from Ranges R
order by R.StartDate
It will return new row starting from each start / end point even if the status is the same as previous. Didn't find any simple way to combine them.
Edit: Changing the first CTE to this will combine the rows:
; with Data (Date) as (
select distinct StartDate from Dates D1
where not exists (Select 1 from Dates D2
where D2.StartDate < D1.StartDate and D2.EndDate > D1.StartDate and
Status = 'on')
union
select distinct EndDate from Dates D1
where not exists (Select 1 from Dates D2
where D2.StartDate < D1.EndDate and D2.EndDate > D1.EndDate and
Status = 'on')
),
So basically every time there's even one "on" record, it is on, otherwise off?
Here's a little different kind of approach to the issue, adding +1 every time an "on" cycle starts, and adding -1 when it ends. Then we can use a running total for the status, and when the status is 0, then it's off, and otherwise it is on:
select Date,
sum(oncounter) over (order by Date) as onstat,
sum(offcounter) over (order by Date) as offstat
from (
select StartDate as Date,
case when status = 'on' then 1 else 0 end oncounter,
case when status = 'off' then 1 else 0 end offcounter
from Dates
union all
select EndDate as Date,
case when status = 'on' then -1 else 0 end oncounter,
case when status = 'off' then -1 else 0 end offcounter
from Dates
) TMP
Edit: Added also counter for off -states. It works the same way as "on" counter and when both are 0, then status is neither on or off.
Final result, it seems it can be done, although it's not looking that nice anymore, but at least it's not recursive :)
select
Date as StartDate,
lead(Date, 1, '21000101') over (order by Date) as EndDate,
case onstat
when 0 then
case when offstat > 0 then 'Off' else 'N/A' end
else 'On' end as State
from (
select
Date,
onstat, prevon,
offstat, prevoff
from (
Select
Date,
onstat,
lag(onstat, 1, 0) over (order by Date) as prevon,
offstat,
lag(offstat, 1, 0) over (order by Date) as prevoff
from (
select
Date,
sum(oncounter) over (order by Date) as onstat,
sum(offcounter) over (order by Date) as offstat
from (
select
StartDate as Date,
case when status = 'on' then 1 else 0 end oncounter,
case when status = 'off' then 1 else 0 end offcounter
from
Dates
union all
select
EndDate as Date,
case when status = 'on' then -1 else 0 end oncounter,
case when status = 'off' then -1 else 0 end offcounter
from
Dates
) TMP
) TMP2
) TMP3
where (onstat = 1 and prevon = 0)
or (onstat = 0 and prevon = 1)
or (onstat = 0 and offstat = 1 and prevoff = 0)
or (onstat = 0 and offstat = 0 and prevoff = 1)
) TMP4
It has quite many derived tables for the window functions and getting only the status changes into the result set so lead can pick up correct dates. It might be possible to get rid of some of them.
SQL Fiddle: http://sqlfiddle.com/#!6/b5cfa/7