First time post. Learning SQL over the past 6 months so help is appreciated. I have data structured as below:
CREATE TABLE #tmp4 (
AccountNumber int,
Date date,
DateRank int
)
INSERT INTO #tmp4
VALUES (001, '11/13/2018' , 1)
, (002, '12/19/2018', 2)
, (003, '1/23/2019' , 3)
, (004, '2/5/2019' , 4)
, (005, '3/10/2019' , 5)
, (006, '3/20/2019' , 6)
, (007, '4/8/2019' , 7)
, (008, '5/20/2019' , 8)
What I need to do with this data is calculate a rolling total that resets to 0 once a threshold of 90 days is reached. I have used the DateDiff function to calculate the DateDiffs between consecutive dates and have tried multiple things using LAG and other window functions but can't make it reset. The goal is to find "index visits" which can only occur once every 90 days. So my plan is to have a field that reads 0 on the first visit and resets to 0 for the next stay after 90 days is up from the first visit then only pull visits with a value of 0.
One solution I tried was correct for most sets but did not return the right values for the above set (rows 4 and 8 should start over as "index visits").
The results I would expect for this query would be:
Account | Date         | DateRank | RollingTotal
001     | '11/13/2018' | 1        | 0
002     | '12/19/2018' | 2        | 36
003     | '1/23/2019'  | 3        | 71
004     | '2/5/2019'   | 4        | 84
005     | '3/10/2019'  | 5        | 0 (not 117)
006     | '3/20/2019'  | 6        | 10
007     | '4/8/2019'   | 7        | 29
008     | '5/20/2019'  | 8        | 71
Thanks for any help.
Here's the code I tried:
CREATE TABLE #tmp2
(EmrNumber varchar(255)
, AdmitDateTime datetime
, DateRank int
, LagDateDiff int
, RunningTotal int
)
INSERT INTO #tmp2
SELECT tmp1.EmrNumber
, tmp1.AdmitDateTime
, tmp1.DateRank
--, LAG(tmp1.AdmitDateTime) OVER(PARTITION BY tmp1.EmrNumber ORDER BY tmp1.DateRank) as NextAdmitDate
, -DATEDIFF(DAY, tmp1.AdmitDateTime, LAG(tmp1.AdmitDateTime) OVER(PARTITION BY tmp1.EmrNumber ORDER BY tmp1.DateRank)) LagDateDiff
, IIF((SELECT SUM(sumt.total)
FROM (
SELECT -DATEDIFF(DAY, tmpsum.AdmitDateTime, LAG(tmpsum.AdmitDateTime) OVER(PARTITION BY tmpsum.EmrNumber ORDER BY tmpsum.DateRank)) total
FROM #tmp tmpsum
WHERE tmp1.EmrNumber = tmpsum.EmrNumber
AND tmpsum.AdmitDateTime <= tmp1.AdmitDateTime
) sumt) IS NULL, 0, (SELECT SUM(sumt.total)
FROM (
SELECT -DATEDIFF(DAY, tmpsum.AdmitDateTime, LAG(tmpsum.AdmitDateTime) OVER(PARTITION BY tmpsum.EmrNumber ORDER BY tmpsum.DateRank)) total
FROM #tmp tmpsum
WHERE tmp1.EmrNumber = tmpsum.EmrNumber
AND tmpsum.AdmitDateTime <= tmp1.AdmitDateTime
) sumt) ) as RunningTotal
FROM #tmp tmp1
SELECT *
, CASE WHEN LagDateDiff >90 THEN 0
WHEN RunningTotal = 0 THEN 0
ELSE LAG(LagDateDiff) OVER(PARTITION BY EmrNumber ORDER BY DateRank) + RunningTotal END AS RollingTotal
FROM #tmp2
You need a recursive query for this, because the running total has to be checked iteratively, row after row:
with cte as (
select
AccountNumber,
Date,
DateRank,
0 RollingTotal
from #tmp4
where DateRank = 1
union all
select
t.AccountNumber,
t.Date,
t.DateRank,
case when RollingTotal + datediff(day, c.Date, t.Date) > 90
then 0
else RollingTotal + datediff(day, c.Date, t.Date)
end
from cte c
inner join #tmp4 t on t.DateRank = c.DateRank + 1
)
select * from cte
The anchor of the CTE selects the first record (as indicated by DateRank). Then the recursive part processes the rows one by one and resets the running total whenever it crosses 90.
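The same idea can be sketched for several patients at once. This is only an illustration under assumptions: the table variable @visits and its PatientId and VisitDate columns are hypothetical, the rank is computed with ROW_NUMBER instead of being stored, and MAXRECURSION 0 lifts the default limit of 100 recursion levels in case a patient has more visits than that.

```sql
-- Hypothetical sample data: patient 1 reuses the dates from the question.
DECLARE @visits TABLE (PatientId int, VisitDate date);
INSERT INTO @visits VALUES
    (1, '20181113'), (1, '20181219'), (1, '20190123'), (1, '20190205'),
    (1, '20190310'), (1, '20190320'), (1, '20190408'), (1, '20190520'),
    (2, '20190101'), (2, '20190501');

WITH ranked AS (
    -- Number the visits per patient so the recursion can step to "rank + 1".
    SELECT PatientId, VisitDate,
           ROW_NUMBER() OVER (PARTITION BY PatientId ORDER BY VisitDate) AS rn
    FROM @visits
),
cte AS (
    -- Anchor: the first visit of every patient is an index visit.
    SELECT PatientId, VisitDate, rn, 0 AS RollingTotal
    FROM ranked
    WHERE rn = 1
    UNION ALL
    -- Recursive step: add the gap to the previous visit, reset once past 90 days.
    SELECT r.PatientId, r.VisitDate, r.rn,
           CASE WHEN c.RollingTotal + DATEDIFF(day, c.VisitDate, r.VisitDate) > 90
                THEN 0
                ELSE c.RollingTotal + DATEDIFF(day, c.VisitDate, r.VisitDate)
           END
    FROM cte c
    JOIN ranked r
      ON r.PatientId = c.PatientId
     AND r.rn = c.rn + 1
)
SELECT PatientId, VisitDate
FROM cte
WHERE RollingTotal = 0          -- keep only the index visits
ORDER BY PatientId, VisitDate
OPTION (MAXRECURSION 0);
```

With this sample, patient 1 should come back with 2018-11-13 and 2019-03-10, and patient 2 with both of its visits, since they are 120 days apart.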
My data looks something like this
ProductNumber | YearMonth | Number
1 201803 1
1 201804 3
1 201810 6
2 201807 -3
2 201809 5
Now what I want to have is add an additional entry "6MSum" which is the sum of the last 6 months per ProductNumber (not the last 6 entries).
Please be aware that the YearMonth data is not complete: for every ProductNumber there are gaps in between, so I can't just use the last 6 entries for the sum. The final result should look something like this.
ProductNumber | YearMonth | Number | 6MSum
1 201803 1 1
1 201804 3 4
1 201810 6 9
2 201807 -3 -3
2 201809 5 2
Additionally I don't want to insert the sum to the table but instead use it in a query like:
SELECT [ProductNumber],[YearMonth],[Number],
6MSum = CONVERT(INT,SUM...)
FROM ...
I found a lot of solutions that use a "sum over period", but only for the last X entries and not for the actual condition "YearMonth within the last 6 months".
Any help would be much appreciated!
It's a SQL database.
EDIT/Answer
It seems the gaps between the months have to be filled with rows first (one row per product and month); after that, something like
sum(Number) OVER (PARTITION BY ProductNumber
ORDER BY YearMonth
ROWS 6 PRECEDING) AS [6MSum]
should work.
Reference to the solution : https://dba.stackexchange.com/questions/181773/sum-of-previous-n-number-of-columns-based-on-some-category
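A minimal sketch of that gap-filling approach, assuming SQL Server 2012 or later (for the ROWS frame) and a month range that is hard-coded here purely for illustration. The table variable @sales and the CTE names are hypothetical.

```sql
DECLARE @sales TABLE (ProductNumber int, YearMonth int, Number int);
INSERT INTO @sales VALUES
    (1, 201803, 1), (1, 201804, 3), (1, 201810, 6),
    (2, 201807, -3), (2, 201809, 5);

WITH months AS (
    -- One row per calendar month in the (assumed) reporting range.
    SELECT CAST('20180301' AS date) AS MonthStart
    UNION ALL
    SELECT DATEADD(month, 1, MonthStart)
    FROM months
    WHERE MonthStart < '20181001'
),
products AS (
    SELECT DISTINCT ProductNumber FROM @sales
),
filled AS (
    -- Every product gets every month; missing months carry a NULL Number.
    SELECT p.ProductNumber,
           YEAR(m.MonthStart) * 100 + MONTH(m.MonthStart) AS YearMonth,
           s.Number
    FROM products p
    CROSS JOIN months m
    LEFT JOIN @sales s
           ON s.ProductNumber = p.ProductNumber
          AND s.YearMonth = YEAR(m.MonthStart) * 100 + MONTH(m.MonthStart)
),
summed AS (
    -- With no gaps left, "6 rows back" is the same as "6 months back".
    SELECT ProductNumber, YearMonth, Number,
           SUM(Number) OVER (PARTITION BY ProductNumber
                             ORDER BY YearMonth
                             ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS [6MSum]
    FROM filled
)
SELECT ProductNumber, YearMonth, Number, [6MSum]
FROM summed
WHERE Number IS NOT NULL        -- hide the filler rows again
ORDER BY ProductNumber, YearMonth;
```

The filler rows exist only so that the window frame counts months rather than entries; they are filtered out after the sum has been computed.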
You could go the OUTER APPLY route. The following produces your required results exactly:
-- prep data
SELECT
ProductNumber , YearMonth , Number
into #t
FROM ( values
(1, 201803 , 1 ),
(1, 201804 , 3 ),
(1, 201810 , 6 ),
(2, 201807 , -3 ),
(2, 201809 , 5 )
) s (ProductNumber , YearMonth , Number)
-- output
SELECT
ProductNumber
,YearMonth
,Number
,[6MSum]
FROM #t t
outer apply (
SELECT
sum(number) as [6MSum]
FROM #t it
where
it.ProductNumber = t.ProductNumber
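-- compare month indexes (year * 12 + month) so the window also works across year boundaries
and (t.YearMonth / 100 * 12 + t.YearMonth % 100)
- (it.YearMonth / 100 * 12 + it.YearMonth % 100) between 0 and 6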
) tt
drop table #t
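One thing to watch with integer YearMonth values of the form YYYYMM: subtracting them directly only counts months correctly within a single calendar year. A small worked example (the literals are just illustrative) shows the difference between the raw subtraction and a month index built as year * 12 + month:

```sql
SELECT 201901 - 201812 AS RawDifference,                      -- 89
       (201901 / 100 * 12 + 201901 % 100)
     - (201812 / 100 * 12 + 201812 % 100) AS MonthDifference; -- 1
```

December 2018 and January 2019 are one month apart, but the raw subtraction reports 89, so a "between 0 and 6" filter on it would drop the row.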
Use outer apply and convert yearmonth to a date, something like this:
with t as (
select t.*,
convert(date, convert(varchar(255), yearmonth) + '01') as ymd
from yourtable t
)
select t.*, t2.sum_6m
from t outer apply
(select sum(t2.number) as sum_6m
from t t2
where t2.productnumber = t.productnumber and
t2.ymd <= t.ymd and
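-- reference the outer row (t.ymd); >= keeps the month exactly 6 back, which the expected output includes
t2.ymd >= dateadd(month, -6, t.ymd)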
) t2;
Just to provide one more option. You can use DATEFROMPARTS to build valid dates from the YearMonth value and then search for values within date ranges.
Testable here: https://rextester.com/APJJ99843
SELECT
ProductNumber , YearMonth , Number
INTO #t
FROM ( values
(1, 201803 , 1 ),
(1, 201804 , 3 ),
(1, 201810 , 6 ),
(2, 201807 , -3 ),
(2, 201809 , 5 )
) s (ProductNumber , YearMonth , Number)
SELECT *
,[6MSum] = (SELECT SUM(number) FROM #t WHERE
ProductNumber = t.ProductNumber
AND DATEFROMPARTS(LEFT(YearMonth,4),RIGHT(YearMonth,2),1) --Build a valid start of month date
BETWEEN
DATEADD(MONTH,-6,DATEFROMPARTS(LEFT(t.YearMonth,4),RIGHT(t.YearMonth,2),1)) --Build a valid start of month date 6 months back
AND DATEFROMPARTS(LEFT(t.YearMonth,4),RIGHT(t.YearMonth,2),1)) --Start-of-month date of the current row (upper bound of the range)
FROM #t t
DROP TABLE #t
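To see what the two range bounds look like for a single value, here is a small sketch. The variable @ym is hypothetical, and integer division and modulo are used instead of LEFT/RIGHT to split the YYYYMM value:

```sql
DECLARE @ym int = 201810;

SELECT DATEFROMPARTS(@ym / 100, @ym % 100, 1) AS MonthStart,        -- 2018-10-01
       DATEADD(month, -6,
               DATEFROMPARTS(@ym / 100, @ym % 100, 1)) AS SixMonthsBack; -- 2018-04-01
```

Because BETWEEN is inclusive on both ends, the row for 201804 falls inside the range for 201810, which is why the expected 6MSum for that row is 9.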
So a working query (provided by a colleague of mine) can look like this:
SELECT [YearMonth]
,[Number]
,[ProductNumber]
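,(SELECT SUM(DPDS_1.Number)
FROM [...] DPDS_1
WHERE DPDS_1.ProductNumber = DPDS.ProductNumber
AND DPDS_1.YearMonth <= DPDS.YearMonth
-- build a yyyymmdd string first so the + '01' is a concatenation even if YearMonth is an int
AND DPDS_1.YearMonth >= CONVERT(int, LEFT(CONVERT(varchar(8), DATEADD(mm, -6, CONVERT(varchar(6), DPDS.YearMonth) + '01'), 112), 6))
) AS [6MSum]
FROM [...] DPDS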
I am using SQL Server 2008 and am trying to increase the speed of my query below. The query assigns points to patients based on readmission dates.
Example: A patient is seen on 1/2, 1/5, 1/7, 1/8, 1/9, 2/4. I want to first group visits within 3 days of each other. 1/2-5 are grouped, 1/7-9 are grouped. 1/5 is NOT grouped with 1/7 because 1/5's actual visit date is 1/2. 1/7 would receive 3 points because it is a readmit from 1/2. 2/4 would also receive 3 points because it is a readmit from 1/7. When the dates are grouped the first date is the actual visit date.
Most articles suggest limiting the data set or adding indexes to increase speed. I have limited the number of rows to about 15,000 and added an index. When running the query with 45 test visit dates / 3 test patients, the query takes 1.5 min to run. With my actual data set it takes > 8 hrs.
How can I get this query to run < 1 hr? Is there a better way to write my query? Does my Index look correct? Any help would be greatly appreciated.
Example expected results below query.
;CREATE TABLE RiskReadmits(MRN INT, VisitDate DATE, Category VARCHAR(15))
;CREATE CLUSTERED INDEX Risk_Readmits_Index ON RiskReadmits(VisitDate)
;INSERT RiskReadmits(MRN,VisitDate,CATEGORY)
VALUES
(1, '1/2/2016','Inpatient'),
(1, '1/5/2016','Inpatient'),
(1, '1/7/2016','Inpatient'),
(1, '1/8/2016','Inpatient'),
(1, '1/9/2016','Inpatient'),
(1, '2/4/2016','Inpatient'),
(1, '6/2/2016','Inpatient'),
(1, '6/3/2016','Inpatient'),
(1, '6/5/2016','Inpatient'),
(1, '6/6/2016','Inpatient'),
(1, '6/8/2016','Inpatient'),
(1, '7/1/2016','Inpatient'),
(1, '8/1/2016','Inpatient'),
(1, '8/4/2016','Inpatient'),
(1, '8/15/2016','Inpatient'),
(1, '8/18/2016','Inpatient'),
(1, '8/28/2016','Inpatient'),
(1, '10/12/2016','Inpatient'),
(1, '10/15/2016','Inpatient'),
(1, '11/17/2016','Inpatient'),
(1, '12/20/2016','Inpatient')
;WITH a AS (
SELECT
z1.VisitDate
, z1.MRN
, (SELECT MIN(VisitDate) FROM RiskReadmits WHERE VisitDate > DATEADD(day, 3, z1.VisitDate)) AS NextDay
FROM
RiskReadmits z1
WHERE
CATEGORY = 'Inpatient'
), a1 AS (
SELECT
MRN
, MIN(VisitDate) AS VisitDate
, MIN(NextDay) AS NextDay
FROM
a
GROUP BY
MRN
), b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a
JOIN b
ON a.VisitDate = b.NextDay
), c AS (
SELECT
MRN,
VisitDate
, (SELECT MAX(VisitDate) FROM b WHERE b1.VisitDate > VisitDate AND b.MRN = b1.MRN) AS PreviousVisitDate
FROM
b b1
)
SELECT distinct
c1.MRN,
c1.VisitDate
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN PreviousVisitDate
ELSE NULL
END AS ReAdmissionFrom
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN 3
ELSE 0
END AS Points
FROM
c c1
ORDER BY c1.MRN
Expected Results:
MRN VisitDate ReAdmissionFrom Points
1 2016-01-02 NULL 0
1 2016-01-07 2016-01-02 3
1 2016-02-04 2016-01-07 3
1 2016-06-02 NULL 0
1 2016-06-06 2016-06-02 3
1 2016-07-01 2016-06-06 3
1 2016-08-01 NULL 0
1 2016-08-15 2016-08-01 3
1 2016-08-28 2016-08-15 3
1 2016-10-12 NULL 0
1 2016-11-17 NULL 0
1 2016-12-20 NULL 0
Oops, I changed the names of a few CTEs (and the post messed up the code formatting).
It should be like this:
b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)
I'm going to take a wild guess here and say you want to change the b CTE to have AND a.MRN = b.MRN as a second condition in the join of the second SELECT, like this:
, b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
firstVisitAndFollowUp
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
visitsDistance3daysOrMore AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)
Please feast your eyes on this current structure of our DB.
Our DBA is currently away for the next two weeks, I have very limited SQL knowledge, I like to stay with the UI and middle tier.
What we are trying to figure out is the following: we need to write a query that calculates the average period (in days) that commissions have taken to move from ‘Verified’ to ‘Paid’ for a single dealer. The current statuses are:
Created
Verified
Rejected
Awaiting Payment
Paid
Refunded
I think this query needs to aim directly at the Commission History Table?
I'm not sure how I would go about writing such query due to the fact my knowledge on SQL is limited...
Any help would be great.
Here's a method to achieve what you're after, although it might not be the most efficient. It seems to me that this is more of a one-off query than something you will run frequently enough to impact database performance.
Test Table Setup:
CREATE TABLE Commission
(
CommissionId INT,
DealerId INT
)
CREATE TABLE CommissionHistory
(
CommissionId INT,
ActionDate DATETIME,
NewPaymentStatusId INT
)
Insert Dummy Data - 5 Commissions for 1 Dealer:
INSERT INTO dbo.Commission
( CommissionId ,
DealerId
)
VALUES ( 1 , 1 ),
( 2 , 1 ),
( 3 , 1 ),
( 4 , 1 ),
( 5 , 1 )
INSERT INTO dbo.CommissionHistory
( CommissionId ,
ActionDate ,
NewPaymentStatusId
)
VALUES ( 1 , GETDATE() -25, 1 ),
( 1 , GETDATE() -21, 2 ),
( 1 , GETDATE() -18, 3 ),
( 1 , GETDATE() -16, 4 ),
( 1 , GETDATE() -5, 5 ),
( 2 , GETDATE() -10, 1 ),
( 2 , GETDATE() -9, 2 ),
( 2 , GETDATE() -8, 3 ),
( 2 , GETDATE() -7, 4 ),
( 2 , GETDATE() -6, 5 ),
( 3 , GETDATE() -10, 1 ),
( 3 , GETDATE() -8, 2 ),
( 3 , GETDATE() -6, 3 ),
( 3 , GETDATE() -4, 4 ),
( 3 , GETDATE() -2, 5 ),
( 3 , GETDATE() -25, 6 ),
( 4 , GETDATE() -10, 1 ),
( 4 , GETDATE() -7, 2 ),
( 4 , GETDATE() -6, 3 ),
( 4 , GETDATE() -4, 4 ),
( 4 , GETDATE() -1, 5 ),
( 5 , GETDATE() -1, 1 ),
( 5 , GETDATE() -1, 2 )
So with the dummy data, Commissions 1, 2 and 4 are classified as valid records, as they have statuses 2 and 5. Commission 3 is excluded because it is refunded, and 5 is excluded because it is not paid.
To generate the averages I wrote the below query:
-- set the required dealer id
DECLARE @DealerId INT = 1
-- return all CommissionIds that have statuses 2 and 5, but not 6, into a temp table
SELECT DISTINCT CommissionId
INTO #DealerCommissions
FROM dbo.CommissionHistory t1
WHERE CommissionId IN (SELECT CommissionId
FROM dbo.Commission
WHERE DealerId = @DealerId)
AND NOT EXISTS (SELECT CommissionId
FROM dbo.CommissionHistory t2
WHERE t2.NewPaymentStatusId = 6 AND t2.CommissionId = t1.CommissionId)
AND EXISTS (SELECT CommissionId
FROM dbo.CommissionHistory t2
WHERE t2.NewPaymentStatusId = 2 AND t2.CommissionId = t1.CommissionId)
AND EXISTS (SELECT CommissionId
FROM dbo.CommissionHistory t2
WHERE t2.NewPaymentStatusId = 5 AND t2.CommissionId = t1.CommissionId)
-- use the temp table to return the average number of days between Verified (2) and Paid (5)
;WITH cte AS (
SELECT CommissionId FROM #DealerCommissions
)
SELECT AVG(CAST(DaysToCompletion AS DECIMAL(10,2)))
FROM (
SELECT DATEDIFF(DAY, MIN(ch.ActionDate), MAX(ch.ActionDate)) DaysToCompletion
FROM cte
INNER JOIN dbo.CommissionHistory ch ON ch.CommissionId = cte.CommissionId
WHERE ch.NewPaymentStatusId IN (2, 5) -- ignore the Created / Rejected / Awaiting Payment rows
GROUP BY ch.CommissionId
) AS averageDays
-- remove temp table
DROP TABLE #DealerCommissions
For every commission in the history table you can get the latest Verified date and the earliest Paid date, assuming the paid date is always later than the verified date. Then you can join the Commission table and group by dealer id to get the average duration in days.
with comm as(
select
commissionid,
max(case NewPaymentStatus when 'Verified' then ActionDate else null end) as verified_date,
min(case NewPaymentStatus when 'Paid' then ActionDate else null end) as paid_date
--using max or min just in case the same status is recorded more than once.
from
CommissionHistory
group by
commissionid
)
select
c.DealerId,
avg(datediff(day,comm.verified_date,comm.paid_date))
from
comm
inner join
commission c
on c.commissionid = comm.commissionid
where
datediff(day,comm.verified_date,comm.paid_date)>0
-- to get rid of the commissions with a paid date before the verified date or on the same day
group by
c.DealerId
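A self-contained sketch of the same conditional-aggregation idea, for trying it out without the real schema. Everything here is an assumption made for illustration: the table variables, the sample dates, and the mapping of status id 2 to Verified and 5 to Paid (taken from the order of the status list in the question). The cast to decimal avoids the integer truncation that AVG applies to whole-number day counts.

```sql
DECLARE @Commission TABLE (CommissionId int, DealerId int);
DECLARE @History TABLE (CommissionId int, ActionDate datetime, NewPaymentStatusId int);

INSERT INTO @Commission VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO @History VALUES
    (1, '20240101', 2), (1, '20240117', 5),   -- 16 days from Verified to Paid
    (2, '20240110', 2), (2, '20240113', 5),   -- 3 days
    (3, '20240105', 2);                       -- verified but never paid

WITH span AS (
    -- One row per commission with its Verified and Paid dates side by side.
    SELECT h.CommissionId,
           MAX(CASE WHEN h.NewPaymentStatusId = 2 THEN h.ActionDate END) AS VerifiedDate,
           MIN(CASE WHEN h.NewPaymentStatusId = 5 THEN h.ActionDate END) AS PaidDate
    FROM @History h
    GROUP BY h.CommissionId
)
SELECT c.DealerId,
       AVG(CAST(DATEDIFF(day, s.VerifiedDate, s.PaidDate) AS decimal(10, 2))) AS AvgDaysVerifiedToPaid
FROM span s
JOIN @Commission c
  ON c.CommissionId = s.CommissionId
WHERE s.VerifiedDate IS NOT NULL
  AND s.PaidDate IS NOT NULL      -- skip commissions that never reached Paid
GROUP BY c.DealerId;
```

For this sample the average should come out at 9.5 days for dealer 1, since commission 3 is left out.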
I need some help producing a MS SQL 2012 query that will match the desired stair-step output. The rows summarize data by one date range (account submission date month), and the columns summarize it by another date range (payment date month)
Table 1: Accounts tracks accounts placed for collections.
CREATE TABLE [dbo].[Accounts](
[AccountID] [nchar](10) NOT NULL,
[SubmissionDate] [date] NOT NULL,
[Amount] [money] NOT NULL,
CONSTRAINT [PK_Accounts] PRIMARY KEY CLUSTERED (AccountID ASC))
INSERT INTO [dbo].[Accounts] VALUES ('1000', '2012-01-01', 1999.00)
INSERT INTO [dbo].[Accounts] VALUES ('1001', '2012-01-02', 100.00)
INSERT INTO [dbo].[Accounts] VALUES ('1002', '2012-02-05', 350.00)
INSERT INTO [dbo].[Accounts] VALUES ('1003', '2012-03-01', 625.00)
INSERT INTO [dbo].[Accounts] VALUES ('1004', '2012-03-10', 50.00)
INSERT INTO [dbo].[Accounts] VALUES ('1005', '2012-03-10', 10.00)
Table 2: Trans tracks payments made
CREATE TABLE [dbo].[Trans](
[TranID] [int] IDENTITY(1,1) NOT NULL,
[AccountID] [nchar](10) NOT NULL,
[TranDate] [date] NOT NULL,
[TranAmount] [money] NOT NULL,
CONSTRAINT [PK_Trans] PRIMARY KEY CLUSTERED (TranID ASC))
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-01-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-02-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-03-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1002, '2012-02-20', 325.00)
INSERT INTO [dbo].[Trans] VALUES (1002, '2012-04-20', 25.00)
INSERT INTO [dbo].[Trans] VALUES (1003, '2012-03-24', 625.00)
INSERT INTO [dbo].[Trans] VALUES (1004, '2012-03-28', 31.00)
INSERT INTO [dbo].[Trans] VALUES (1004, '2012-04-12', 5.00)
INSERT INTO [dbo].[Trans] VALUES (1005, '2012-04-08', 7.00)
INSERT INTO [dbo].[Trans] VALUES (1005, '2012-04-28', 3.00)
Here's what the desired output should look like
*Total Payments in Each Month*
SubmissionYearMonth  TotalAmount | 2012-01  2012-02  2012-03  2012-04
---------------------------------------------------------------------
2012-01              2099.00     | 300.00   300.00   300.00   0.00
2012-02              350.00      |          325.00   0.00     25.00
2012-03              685.00      |                   656.00   15.00
The first two columns sum Account.Amount grouping by month.
The last 4 columns sum the Tran.TranAmount, by month, for Accounts placed in the given month of the current row.
The query I've been working with feels close. I just don't have the lag correct.
Here's the query I'm working with thus far:
Select SubmissionYearMonth,
TotalAmount,
pt.[0] AS MonthOld0,
pt.[1] AS MonthOld1,
pt.[2] AS MonthOld2,
pt.[3] AS MonthOld3,
pt.[4] AS MonthOld4,
pt.[5] AS MonthOld5,
pt.[6] AS MonthOld6,
pt.[7] AS MonthOld7,
pt.[8] AS MonthOld8,
pt.[9] AS MonthOld9,
pt.[10] AS MonthOld10,
pt.[11] AS MonthOld11,
pt.[12] AS MonthOld12,
pt.[13] AS MonthOld13
From (
SELECT Convert(Char(4),Year(SubmissionDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, SubmissionDate)),2) AS SubmissionYearMonth,
SUM(Amount) AS TotalAmount
FROM Accounts
GROUP BY Convert(Char(4),Year(SubmissionDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, SubmissionDate)),2)
)
AS AccountSummary
OUTER APPLY
(
SELECT *
FROM (
SELECT CASE WHEN DATEDIFF(Month, SubmissionDate, TranDate) < 13
THEN DATEDIFF(Month, SubmissionDate, TranDate)
ELSE 13
END AS PaymentMonthAge,
TranAmount
FROM Trans INNER JOIN Accounts ON Trans.AccountID = Accounts.AccountID
Where Convert(Char(4),Year(TranDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, TranDate)),2)
= AccountSummary.SubmissionYearMonth
) as TransTemp
PIVOT (SUM(TranAmount)
FOR PaymentMonthAge IN ([0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9],
[10],
[11],
[12],
[13])) as TransPivot
) as pt
It's producing the following output:
SubmissionYearMonth TotalAmount MonthOld0 MonthOld1 MonthOld2 MonthOld3 ...
2012-01 2099.00 300.00 NULL NULL NULL ...
2012-02 350.00 325.00 300.00 NULL NULL ...
2012-03 685.00 656.00 NULL 300.00 NULL ...
As for the column date headers. I'm not sure what the best option is here. I could add an additional set of columns and create a calculated value that I could use in the resulting report.
SQL Fiddle: http://www.sqlfiddle.com/#!6/272e5/1/0
Since you are using SQL Server 2012, we can use the Format function to make the date pretty. There is no need to group by the strings. Instead, I find it useful to use the proper data type for as long as I can and only use Format or Convert on display (or not at all and let the middle tier handle the display).
In this solution, I arbitrarily took the earliest TranDate and extracted the first day of that month from it. However, one could easily replace that expression with a static start date, and the solution would then cover the 12 months starting from it.
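As a quick illustration of the two building blocks mentioned above (the variable @d is hypothetical), this is how a date can be truncated to the first day of its month while staying a date, and how Format turns it into a label only at display time:

```sql
DECLARE @d date = '20120315';

SELECT DATEADD(day, 1 - DAY(@d), @d) AS MonthStart,      -- 2012-03-01
       FORMAT(@d, 'yyyy-MM')        AS YearMonthLabel;   -- 2012-03
```

Grouping and joining on MonthStart keeps the comparisons on a real date type; the string label is only needed in the final SELECT.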
With SubmissionMonths As
(
Select DateAdd(d, -Day(A.SubmissionDate) + 1, A.SubmissionDate) As SubmissionMonth
, A.Amount
From dbo.Accounts As A
)
, TranMonths As
(
Select DateAdd(d, -Day(Min( T.TranDate )) + 1, Min( T.TranDate )) As TranMonth
, 1 As MonthNum
From dbo.Accounts As A
Join dbo.Trans As T
On T.AccountId = A.AccountId
Join SubmissionMonths As M
On A.SubmissionDate >= M.SubmissionMonth
And A.SubmissionDate < DateAdd(m,1,SubmissionMonth)
Union All
Select DateAdd(m, 1, TranMonth), MonthNum + 1
From TranMonths
Where MonthNum < 12
)
, TotalBySubmissionMonth As
(
Select M.SubmissionMonth, Sum( M.Amount ) As Total
From SubmissionMonths As M
Group By M.SubmissionMonth
)
Select Format(SMT.SubmissionMonth,'yyyy-MM') As SubmissionMonth, SMT.Total
, Sum( Case When TM.MonthNum = 1 Then T.TranAmount End ) As Month1
, Sum( Case When TM.MonthNum = 2 Then T.TranAmount End ) As Month2
, Sum( Case When TM.MonthNum = 3 Then T.TranAmount End ) As Month3
, Sum( Case When TM.MonthNum = 4 Then T.TranAmount End ) As Month4
, Sum( Case When TM.MonthNum = 5 Then T.TranAmount End ) As Month5
, Sum( Case When TM.MonthNum = 6 Then T.TranAmount End ) As Month6
, Sum( Case When TM.MonthNum = 7 Then T.TranAmount End ) As Month7
, Sum( Case When TM.MonthNum = 8 Then T.TranAmount End ) As Month8
, Sum( Case When TM.MonthNum = 9 Then T.TranAmount End ) As Month9
, Sum( Case When TM.MonthNum = 10 Then T.TranAmount End ) As Month10
, Sum( Case When TM.MonthNum = 11 Then T.TranAmount End ) As Month11
, Sum( Case When TM.MonthNum = 12 Then T.TranAmount End ) As Month12
From TotalBySubmissionMonth As SMT
Join dbo.Accounts As A
On A.SubmissionDate >= SMT.SubmissionMonth
And A.SubmissionDate < DateAdd(m,1,SMT.SubmissionMonth)
Join dbo.Trans As T
On T.AccountId = A.AccountId
Join TranMonths As TM
On T.TranDate >= TM.TranMonth
And T.TranDate < DateAdd(m,1,TM.TranMonth)
Group By SMT.SubmissionMonth, SMT.Total
SQL Fiddle version
The following query pretty much returns what you want. You need to do the two operations separately. I just join the results together:
select a.yyyymm, a.Amount,
t201201, t201202, t201203, t201204
from (select LEFT(convert(varchar(255), a.submissiondate, 121), 7) as yyyymm,
SUM(a.Amount) as amount
from Accounts a
group by LEFT(convert(varchar(255), a.submissiondate, 121), 7)
) a left outer join
(select LEFT(convert(varchar(255), a.submissiondate, 121), 7) as yyyymm,
sum(case when trans_yyyymm = '2012-01' then tranamount end) as t201201,
sum(case when trans_yyyymm = '2012-02' then tranamount end) as t201202,
sum(case when trans_yyyymm = '2012-03' then tranamount end) as t201203,
sum(case when trans_yyyymm = '2012-04' then tranamount end) as t201204
from Accounts a join
(select t.*, LEFT(convert(varchar(255), t.trandate, 121), 7) as trans_yyyymm
from trans t
) t
on a.accountid = t.accountid
group by LEFT(convert(varchar(255), a.submissiondate, 121), 7)
) t
on a.yyyymm = t.yyyymm
order by 1
I am getting a NULL where you have a 0.00 in two cells.
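If a zero is preferred over NULL, one option is to wrap the conditional sum in COALESCE. The snippet below is only a sketch on inline sample values (the derived table v and its column names are made up for the example):

```sql
-- No row matches '2012-03', so the inner SUM is NULL and COALESCE turns it into 0.
SELECT COALESCE(SUM(CASE WHEN v.TranMonth = '2012-03' THEN v.TranAmount END), 0) AS t201203
FROM (VALUES ('2012-02', 325.00),
             ('2012-04', 25.00)) AS v (TranMonth, TranAmount);
```

Note that applying this to every column would also put zeros into the cells that the desired stair-step output leaves blank (payment months before the submission month), so those cells would still need separate handling.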
Thomas, I used your response as inspiration for the following solution I ended up using.
I first create a SubmissionDate, TranDate cross join skeleton date matrix, that I later use to join on the AccountSummary and TranSummary data.
The resulting query output isn't formatted in columns, per TranDate month. Rather I'm using output in a SQL Server Reporting Services matrix, and using a column grouping, based off the TranSummaryMonthNum column, to get the desired formatted output.
SQL Fiddle version
;
WITH
--Generate a list of Dates, from the first SubmissionDate, through today.
--Note: Requires the use of: 'OPTION (MAXRECURSION 0)' to generate a list with more than 100 dates.
CTE_AutoDates AS
( Select Min(SubmissionDate) as FiscalDate
From Accounts
UNION ALL
SELECT DATEADD(Day, 1, FiscalDate)
FROM CTE_AutoDates
WHERE DATEADD(Day, 1, FiscalDate) <= GetDate()
),
FiscalDates As
( SELECT FiscalDate,
DATEFROMPARTS(Year(FiscalDate), Month(FiscalDate), 1) as FiscalMonthStartDate
FROM CTE_AutoDates
--Optionally filter Fiscal Dates by the last known Math.Max(SubmissionDate, TranDate)
Where FiscalDate <= (Select Max(MaxDate)
From (Select Max(SubmissionDate) as MaxDate From Accounts
Union All
Select Max(TranDate) as MaxDate From Trans
) as MaxDates
)
),
FiscalMonths as
( SELECT Distinct FiscalMonthStartDate
FROM FiscalDates
),
--Matrix to store the reporting date groupings for the Account submission and payment periods.
SubmissionAndTranMonths AS
( Select AM.FiscalMonthStartDate as SubmissionMonthStartDate,
TM.FiscalMonthStartDate as TransMonthStartDate,
DateDiff(Month, (Select Min(FiscalMonthStartDate) From FiscalMonths), TM.FiscalMonthStartDate) as TranSummaryMonthNum
From FiscalMonths AS AM
Join FiscalMonths AS TM
ON TM.FiscalMonthStartDate >= AM.FiscalMonthStartDate
),
AccountData as
( Select A.AccountID,
A.Amount,
FD.FiscalMonthStartDate as SubmissionMonthStartDate
From Accounts as A
Inner Join FiscalDates as FD
ON A.SubmissionDate = FD.FiscalDate
),
TranData as
( Select T.AccountID,
T.TranAmount,
AD.SubmissionMonthStartDate,
FD.FiscalMonthStartDate as TranMonthStartDate
From Trans as T
Inner Join AccountData as AD
ON T.AccountID = AD.AccountID
Inner Join FiscalDates AS FD
ON T.TranDate = FD.FiscalDate
),
AccountSummaryByMonth As
( Select ASM.FiscalMonthStartDate,
Sum(AD.Amount) as TotalSubmissionAmount
From FiscalMonths as ASM
Inner Join AccountData as AD
ON ASM.FiscalMonthStartDate = AD.SubmissionMonthStartDate
Group By
ASM.FiscalMonthStartDate
),
TranSummaryByMonth As
( Select STM.SubmissionMonthStartDate,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum,
Sum(TD.TranAmount) as TotalTranAmount
From SubmissionAndTranMonths as STM
Inner Join TranData as TD
ON STM.SubmissionMonthStartDate = TD.SubmissionMonthStartDate
AND STM.TransMonthStartDate = TD.TranMonthStartDate
Group By
STM.SubmissionMonthStartDate,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum
)
--#Inspect 1
--Select * From SubmissionAndTranMonths
--OPTION (MAXRECURSION 0)
--#Inspect 1 Results
--SubmissionMonthStartDate TransMonthStartDate TranSummaryMonthNum
--2012-01-01 2012-01-01 0
--2012-01-01 2012-02-01 1
--2012-01-01 2012-03-01 2
--2012-01-01 2012-04-01 3
--2012-02-01 2012-02-01 1
--2012-02-01 2012-03-01 2
--2012-02-01 2012-04-01 3
--2012-03-01 2012-03-01 2
--2012-03-01 2012-04-01 3
--2012-04-01 2012-04-01 3
--#Inspect 2
--Select * From AccountSummaryByMonth
--OPTION (MAXRECURSION 0)
--#Inspect 2 Results
--FiscalMonthStartDate TotalSubmissionAmount
--2012-01-01 2099.00
--2012-02-01 350.00
--2012-03-01 685.00
--#Inspect 3
--Select * From TranSummaryByMonth
--OPTION (MAXRECURSION 0)
--#Inspect 3 Results
--SubmissionMonthStartDate TransMonthStartDate TranSummaryMonthNum TotalTranAmount
--2012-01-01 2012-01-01 0 300.00
--2012-01-01 2012-02-01 1 300.00
--2012-01-01 2012-03-01 2 300.00
--2012-02-01 2012-02-01 1 325.00
--2012-02-01 2012-04-01 3 25.00
--2012-03-01 2012-03-01 2 656.00
--2012-03-01 2012-04-01 3 15.00
Select STM.SubmissionMonthStartDate,
ASM.TotalSubmissionAmount,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum,
TSM.TotalTranAmount
From SubmissionAndTranMonths as STM
Inner Join AccountSummaryByMonth as ASM
ON STM.SubmissionMonthStartDate = ASM.FiscalMonthStartDate
Left Join TranSummaryByMonth AS TSM
ON STM.SubmissionMonthStartDate = TSM.SubmissionMonthStartDate
AND STM.TransMonthStartDate = TSM.TransMonthStartDate
Order By STM.SubmissionMonthStartDate, STM.TranSummaryMonthNum
OPTION (MAXRECURSION 0)
--#Results
--SubmissionMonthStartDate TotalSubmissionAmount TransMonthStartDate TranSummaryMonthNum TotalTranAmount
--2012-01-01 2099.00 2012-01-01 0 300.00
--2012-01-01 2099.00 2012-02-01 1 300.00
--2012-01-01 2099.00 2012-03-01 2 300.00
--2012-01-01 2099.00 2012-04-01 3 NULL
--2012-02-01 350.00 2012-02-01 1 325.00
--2012-02-01 350.00 2012-03-01 2 NULL
--2012-02-01 350.00 2012-04-01 3 25.00
--2012-03-01 685.00 2012-03-01 2 656.00
--2012-03-01 685.00 2012-04-01 3 15.00
The following query exactly duplicates the results of your final query in your own answer but takes no more than 1/30th the CPU (or better), plus is a whole lot simpler.
If I had the time & energy I am sure I could find even more improvements... my gut tells me I might not have to hit the Accounts table so many times. But in any case, it's a huge improvement and should perform very well even for very large result sets.
See the SqlFiddle for it.
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L2 B),
Nums AS (SELECT N = Row_Number() OVER (ORDER BY (SELECT 1)) FROM L4),
Anchor AS (
SELECT MinDate = DateAdd(month, DateDiff(month, '20000101', Min(SubmissionDate)), '20000101')
FROM dbo.Accounts
),
MNums AS (
SELECT N
FROM Nums
WHERE
N <= DateDiff(month,
(SELECT MinDate FROM Anchor),
(SELECT Max(TranDate) FROM dbo.Trans)
) + 1
),
A AS (
SELECT
AM.AccountMo,
Amount = Sum(A.Amount)
FROM
dbo.Accounts A
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', A.SubmissionDate), '20000101')
) AM (AccountMo)
GROUP BY
AM.AccountMo
), T AS (
SELECT
AM.AccountMo,
TM.TranMo,
TotalTranAmount = Sum(T.TranAmount)
FROM
dbo.Accounts A
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', A.SubmissionDate), '20000101')
) AM (AccountMo)
INNER JOIN dbo.Trans T
ON A.AccountID = T.AccountID
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', T.TranDate), '20000101')
) TM (TranMo)
GROUP BY
AM.AccountMo,
TM.TranMo
)
SELECT
SubmissionStartMonth = A.AccountMo,
TotalSubmissionAmount = A.Amount,
M.TransMonth,
TransMonthNum = N.N - 1,
T.TotalTranAmount
FROM
A
INNER JOIN MNums N
ON N.N >= DateDiff(month, (SELECT MinDate FROM Anchor), A.AccountMo) + 1
CROSS APPLY (
SELECT TransMonth = DateAdd(month, N.N - 1, (SELECT MinDate FROM Anchor))
) M
LEFT JOIN T
ON A.AccountMo = T.AccountMo
AND M.TransMonth = T.TranMo
ORDER BY
A.AccountMo,
M.TransMonth;
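For readers unfamiliar with the L0 to L4 CTEs at the top of that query: they appear to be a numbers ("tally") generator, where each level cross joins the previous one with itself to multiply the row count, and Row_Number then turns the rows into 1, 2, 3, and so on. A reduced sketch of the pattern, with a hard-coded start date chosen only for illustration:

```sql
WITH L0 AS (SELECT 1 AS N UNION ALL SELECT 1),                  -- 2 rows
     L1 AS (SELECT 1 AS N FROM L0 AS A CROSS JOIN L0 AS B),     -- 4 rows
     L2 AS (SELECT 1 AS N FROM L1 AS A CROSS JOIN L1 AS B),     -- 16 rows
     Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS N FROM L2)
SELECT N,
       DATEADD(month, N - 1, CAST('20120101' AS date)) AS MonthStart
FROM Nums
WHERE N <= 4                                                    -- keep four months
ORDER BY N;
```

This should list the month starts from 2012-01-01 through 2012-04-01. Unlike a recursive date CTE, the generator needs no MAXRECURSION hint, which is one plausible reason it performs better here.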