Calculate the amount between two dates in the table - sql

I have these 2 tables. I need to merge into one table. Where should I put in the column the amount of expenses between two dates. How can I do it?
Profits:
Id
Date
Money
1
01.01.2022
100
2
15.01.2022
50
3
25.01.2022
30
Expenses:
Id
Date
Money
1
01.01.2022
20
2
03.01.2022
30
3
30.01.2022
40
Result:
Id
Date
Profits
Expenses(Sum)
1
01.01.2022
100
50
2
15.01.2022
50
3
25.01.2022
30
40

The following statement is a possible option:
Data:
SELECT *
INTO Profits
FROM (VALUES
(1, CONVERT(date, '01.01.2022', 104), 100),
(2, CONVERT(date, '15.01.2022', 104), 50),
(3, CONVERT(date, '25.01.2022', 104), 30)
) v (Id, [Date], [Money])
SELECT *
INTO Expenses
FROM (VALUES
(1, CONVERT(date, '01.01.2022', 104), 20),
(2, CONVERT(date, '03.01.2022', 104), 30),
(3, CONVERT(date, '30.01.2022', 104), 40)
) v (Id, [Date], [Money])
Statement:
SELECT p.Id, p.Date, p.Money AS Profits, SUM(e.Money) AS Expenses
FROM (
SELECT *, LEAD(Date) OVER (ORDER BY Date) AS NextDate
FROM Profits
) p
LEFT JOIN Expenses e ON p.Date <= e.Date AND (e.Date <= p.NextDate OR p.NextDate IS NULL)
GROUP BY p.Id, p.Date, p.Money
Result:
Id
Date
Profits
Expenses
1
2022-01-01
100
50
2
2022-01-15
50
3
2022-01-25
30
40
For SQL Server 2008 you have to replace LEAD() with a self-join (a simplified approach when there are no gaps in the Id column):
SELECT p.Id, p.Date, p.Money AS Profits, SUM(e.Money) AS Expenses
FROM (
SELECT p1.*, p2.Date AS NextDate
FROM Profits p1
LEFT JOIN Profits p2 ON p1.Id = p2.Id - 1
) p
LEFT JOIN Expenses e ON p.Date <= e.Date AND (e.Date <= p.NextDate OR p.NextDate IS NULL)
GROUP BY p.Id, p.Date, p.Money

Related

subquery sum and newest record by different type

table employee {
id,
name
}
table payment_record {
id,
type, // 1 is salary, 2-4 is bonus
employee_id,
date_paid,
amount
}
i want to query employee's newest salary and sum(bonus) from some date.
like my payments is like
id, type, employee_id, date_paid, amount
1 1 1 2022-10-01 5000
2 2 1 2022-10-01 1000
3 3 1 2022-10-01 1000
4 1 1 2022-11-01 3000
5 1 2 2022-10-01 1000
6 1 2 2022-11-01 2000
7 2 2 2022-11-01 3000
query date in ['2022-10-01', '2022-11-01']
show me
employee_id, employee_name, newest_salary, sum(bonus)
1 Jeff 3000 2000
2 Alex 2500 3000
which jeff's newest_salary is 3000 becuase there is 2 type = 1(salary) record 5000 and 3000, the newest one is 3000.
and jeff's bonus sum is 1000(type 2) + 1000(type 3) = 2000
the current sql i try is like
select
e.employee_id,
employee.name,
e.newest_salary,
e.bonus
from
(
select
payment_record.employee_id,
SUM(case when type in ('2', '3', '4') then amount end) as bonus,
Max(case when type = '1' then amount end) as newest_salary
from
payment_record
where
date_paid in ('2022-10-01', '2022-11-01')
group by
employee_id
) as e
join
employee
on
employee.id = e.employee_id
order by
employee_id
it's almost done, but the rule of newest_salary is not correct, i just get the max value althought usually the max value is newest record.
The query is below:
SELECT
t1.id employee_id,
t1.name employee_name,
t3.amount newest_salary,
t2.bonus bonus
FROM employee t1
LEFT JOIN
(
SELECT
employee_id,
MAX(CASE WHEN type=1 THEN date_paid END) date_paid,
SUM(CASE WHEN type IN (2,3,4) THEN amount END) bonus
FROM payment_record
WHERE date_paid BETWEEN '2022-10-01' AND '2022-11-01'
GROUP BY employee_id
) t2
ON t1.id=t2.employee_id
LEFT JOIN payment_record t3
ON t3.type=1 AND
t2.employee_id=t3.employee_id AND
t2.date_paid=t3.date_paid
ORDER BY t1.id
db fiddle
I think Postgres is close enough to work with this solution I tested in sql-server, but it should at least be close enough to translate
My approach is to split the payments in the desired range into salary vs bonus, and sum the bonus but use a partitioned row number to identify the newest salary payment for each employee in the desired date range and only join that one to the bonus totals. Note that I used a LEFT JOIN because an employee might not get a bonus.
DECLARE #StartDate DATE = '2022-10-01';
DECLARE #EndDate DATE = '2022-11-01';
with cteSample as ( --BEGIN sample data
SELECT * FROM ( VALUES
(1, 1, 1, CONVERT(DATE,'2022-10-01'), 5000)
, (2, 2, 1, '2022-10-01', 1000)
, (3, 3, 1, '2022-10-01', 1000)
, (4, 1, 1, '2022-11-01', 3000)
, (5, 1, 2, '2022-10-01', 1000)
, (6, 1, 2, '2022-11-01', 2000)
, (7, 2, 2, '2022-11-01', 3000)
) as TabA(Pay_ID, Pay_Type, employee_id, date_paid, amount)
) --END Sample data
, ctePayments as ( --Filter down to just the payments in the date range you are interested in
SELECT Pay_ID, Pay_Type, employee_id, date_paid, amount
FROM cteSample --Replace this with your real table of payments
WHERE date_paid >= #StartDate AND date_paid <= #EndDate
), cteSalary as ( --Identify salary payments in range and order them newest first
SELECT employee_id, amount
, ROW_NUMBER() over (PARTITION BY employee_id ORDER BY date_paid DESC) as Newness
FROM ctePayments as S
WHERE S.Pay_Type = 1
), cteBonus as ( --Identify bonus payments in range and sum them
SELECT employee_id, SUM(amount) as BonusPaid
FROM ctePayments as S
WHERE S.Pay_Type in (2,3,4)
GROUP BY employee_id
)
SELECT S.employee_id, S.amount as SalaryNewest
, COALESCE(B.BonusPaid, 0) as BonusTotal
FROM cteSalary as S --Join the salary list to the bonusa list
LEFT OUTER JOIN cteBonus as B ON S.employee_id = B.employee_id
WHERE S.Newness = 1 --Keep only the newest
Result:
employee_id
SalaryNewest
BonusTotal
1
3000
2000
2
2000
3000

rolling three day average in sql server

I want to find three day average of values. Expected output:
dt
amt
running_avg
2022-05-1
100
0
2022-05-2
150
0
2022-05-3
50
100
2022-05-14
250
150
2022-05-15
0
100
Average should be calculated for 3 day window. My query is:
select a.dt, avg(b.amt) over(order by a.dt ) as running_avg,b.amt
from trans a
left join trans b on b.dt = a.dt
where a.dt between DATEADD(day,-3, a.dt) and getdate()
My query is just giving normal running average and not average for 3 days. Let me know how this can be done in SQL Server.
Thanks!
It seems you need a ROWS/RANGE argument in the windowed AVG():
Data:
SELECT *
INTO trans
FROM (VALUES
(CONVERT(date, '20220501'), 100),
(CONVERT(date, '20220502'), 150),
(CONVERT(date, '20220503'), 50),
(CONVERT(date, '20220514'), 250),
(CONVERT(date, '20220515'), 0)
) v (dt, amt)
Statement:
SELECT
dt,
amt,
AVG(amt) OVER (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_avg,
CASE WHEN COUNT(*) OVER (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) = 3
THEN AVG(amt) OVER (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ELSE 0
END AS running_avg2
FROM trans
ORDER BY dt
Results:
dt
amt
running_avg
running_avg2
2022-05-01
100
100
0
2022-05-02
150
125
0
2022-05-03
50
100
100
2022-05-14
250
150
150
2022-05-15
0
100
100

Rolling sum for historical dates

I have a table orders that looks like this:
order_id
sales_amount
order_time
store_id
1412412
30
2022/03/28
456
1551211
5
2022/03/27
145
I am interested in calculating the sales from stores that had their first order in the last 28 days, on a rolling basis. The following will give me this for the most recent day:
with first_order_dates AS (
select
min(order_time) as first_order_time,
store_id
from Orders
group by store_id
)
select
dateadd(day,-1, cast(getdate() as date)) AS date,
sum(sales_amount) AS new_revenue_last_28d
from Orders
left join first_order_dates
on first_order_dates.store_id = Orders.store_id
where first_order_time between dateadd(day,-29, cast(getdate() as date)) and dateadd(day,-1, cast(getdate() as date))
group by dateadd(day,-1, cast(getdate() as date))
Resulting in:
Date
new_revenue_last_28d
2022/04/06
5400
What I want is to go back and calculate this for every historical day, i.e to end up with
Date
new_revenue_last_28d
2022/04/06
5400
2022/04/05
5732
2022/04/04
4300
and so on so I can chart this. I have run out of ideas - how can I do this with only the info I have available? Using Snowflake ideally
So if you want to only show sales for shops that have their first sale in the last 28 days, and for "those 28 days, have a rolling window of the sum of those sales"
WITH data as (
select * from values
(100, '2022-04-07'::date, 10),
(100, '2022-04-06'::date, 8),
(100, '2022-04-05'::date, 11),
(100, '2022-04-01'::date, 12),
(101, '2022-04-02'::date, 110),
(101, '2022-04-01'::date, 120)
t(store_id, order_date, sales_amount)
), store_valid_orders as (
select
store_id
,order_date
,sales_amount
from data
qualify min(order_date) over(partition by store_id) >= current_date() - 28
), those_28_days as (
select current_date() - row_number()over(order by null) + 1 as date
from table(generator(ROWCOUNT => 29))
), day_join_sales as (
select
d.date
,s.store_id
,sum(s.sales_amount) as sales_amount
from those_28_days as d
left join store_valid_orders as s on d.date = s.order_date
group by 1,2
)
select
date
,store_id
,sum(sales_amount) over(partition by store_id order by date rows between 28 preceding and current row ) as prior_28_days_sales
from day_join_sales
qualify store_id is not null;
gives:
DATE
STORE_ID
PRIOR_28_DAYS_SALES
2022-04-01
100
12
2022-04-05
100
23
2022-04-06
100
31
2022-04-07
100
41
2022-04-01
101
120
2022-04-02
101
230
that is actually more complex that it needs to be.. but I half have the concept for solving rolling windows of days, which include the first sales with respect to rolling date. Which is more complex, but the above might be enough to answer your question. So I will stop here.
Take 2:
with daily 28 days of sales per store, rolled into single daily total:
WITH data as (
select * from values
(100, '2022-04-07'::date, 10),
(100, '2022-04-06'::date, 8),
(100, '2022-04-05'::date, 11),
(100, '2022-04-01'::date, 12),
(101, '2022-04-02'::date, 110),
(101, '2022-04-01'::date, 120)
t(store_id, order_date, sales_amount)
), store_first_orders as (
select
store_id
,min(order_date) as first_order
from data
group by 1
), _29_rows as (
select
row_number()over(order by null) - 1 as rn
from table(generator(ROWCOUNT => 29))
), those_29_rows as (
select
v.store_id
,dateadd(day, r.rn, v.first_order) as date
from _29_rows as r
full join store_first_orders as v
), first_28_days_of_data as (
select
r.store_id
,r.date
,d.sales_amount
from those_29_rows r
left join data as d
on d.store_id = r.store_id AND d.order_date = r.date
), per_site_dailies as (
select
store_id
,date
,sum(sales_amount) over(partition by store_id order by date) as roll_sales
from first_28_days_of_data
order by 2,1
)
select
date,
sum(roll_sales) as new_revenue_last_28d
from per_site_dailies
group by 1
having date <= current_date()
order by 1;
gives:
DATE
NEW_REVENUE_LAST_28D
2022-04-01
132
2022-04-02
242
2022-04-03
242
2022-04-04
242
2022-04-05
253
2022-04-06
261
2022-04-07
271
2022-04-08
271

How to aggregate (counting distinct items) over a sliding window in SQL Server?

I am currently using this query (in SQL Server) to count the number of unique item each day:
SELECT Date, COUNT(DISTINCT item)
FROM myTable
GROUP BY Date
ORDER BY Date
How can I transform this to get for each date the number of unique item over the past 3 days (including the current day)?
The output should be a table with 2 columns:
one columns with all dates in the original table. On the second column, we have the number of unique item per date.
for instance if original table is:
Date Item
01/01/2018 A
01/01/2018 B
02/01/2018 C
03/01/2018 C
04/01/2018 C
With my query above I currently get the unique count for each day:
Date count
01/01/2018 2
02/01/2018 1
03/01/2018 1
04/01/2018 1
and I am looking to get as result the unique count over 3 days rolling window:
Date count
01/01/2018 2
02/01/2018 3 (because items ABC on 1st and 2nd Jan)
03/01/2018 3 (because items ABC on 1st,2nd,3rd Jan)
04/01/2018 1 (because only item C on 2nd,3rd,4th Jan)
Using an apply provides a convenient way to form sliding windows
CREATE TABLE myTable
([DateCol] datetime, [Item] varchar(1))
;
INSERT INTO myTable
([DateCol], [Item])
VALUES
('2018-01-01 00:00:00', 'A'),
('2018-01-01 00:00:00', 'B'),
('2018-01-02 00:00:00', 'C'),
('2018-01-03 00:00:00', 'C'),
('2018-01-04 00:00:00', 'C')
;
CREATE NONCLUSTERED INDEX IX_DateCol
ON MyTable([Date])
;
Query:
select distinct
t1.dateCol
, oa.ItemCount
from myTable t1
outer apply (
select count(distinct t2.item) as ItemCount
from myTable t2
where t2.DateCol between dateadd(day,-2,t1.DateCol) and t1.DateCol
) oa
order by t1.dateCol ASC
Results:
| dateCol | ItemCount |
|----------------------|-----------|
| 2018-01-01T00:00:00Z | 2 |
| 2018-01-02T00:00:00Z | 3 |
| 2018-01-03T00:00:00Z | 3 |
| 2018-01-04T00:00:00Z | 1 |
There may be some performance gains by reducing the date column prior to using the apply, like so:
select
d.date
, oa.ItemCount
from (
select distinct t1.date
from myTable t1
) d
outer apply (
select count(distinct t2.item) as ItemCount
from myTable t2
where t2.Date between dateadd(day,-2,d.Date) and d.Date
) oa
order by d.date ASC
;
Instead of using select distinct in that subquery you could use group by instead but the execution plan will remain the same.
Demo at SQL Fiddle
The most straight forward solution is to join the table with itself based on dates:
SELECT t1.DateCol, COUNT(DISTINCT t2.Item) AS C
FROM testdata AS t1
LEFT JOIN testdata AS t2 ON t2.DateCol BETWEEN DATEADD(dd, -2, t1.DateCol) AND t1.DateCol
GROUP BY t1.DateCol
ORDER BY t1.DateCol
Output:
| DateCol | C |
|-------------------------|---|
| 2018-01-01 00:00:00.000 | 2 |
| 2018-01-02 00:00:00.000 | 3 |
| 2018-01-03 00:00:00.000 | 3 |
| 2018-01-04 00:00:00.000 | 1 |
GROUP BY should be faster then DISTINCT (make sure to have an index on your Date column)
DECLARE #tbl TABLE([Date] DATE, [Item] VARCHAR(100))
;
INSERT INTO #tbl VALUES
('2018-01-01 00:00:00', 'A'),
('2018-01-01 00:00:00', 'B'),
('2018-01-02 00:00:00', 'C'),
('2018-01-03 00:00:00', 'C'),
('2018-01-04 00:00:00', 'C');
SELECT t.[Date]
--Just for control. You can take this part away
,(SELECT DISTINCT t2.[Item] AS [*]
FROM #tbl AS t2
WHERE t2.[Date]<=t.[Date]
AND t2.[Date]>=DATEADD(DAY,-2,t.[Date]) FOR XML PATH('')) AS CountedItems
--This sub-select comes back with your counts
,(SELECT COUNT(DISTINCT t2.[Item])
FROM #tbl AS t2
WHERE t2.[Date]<=t.[Date]
AND t2.[Date]>=DATEADD(DAY,-2,t.[Date])) AS ItemCount
FROM #tbl AS t
GROUP BY t.[Date];
The result
Date CountedItems ItemCount
2018-01-01 AB 2
2018-01-02 ABC 3
2018-01-03 ABC 3
2018-01-04 C 1
This solution is different from other solutions. Can you check performance of this query on real data with comparison to other answers?
The basic idea is that each row can participate in the window for its own date, the day after, or the day after that. So this first expands the row out into three rows with those different dates attached and then it can just use a regular COUNT(DISTINCT) aggregating on the computed date. The HAVING clause is just to avoid returning results for dates that were solely computed and not present in the base data.
with cte(Date, Item) as (
select cast(a as datetime), b
from (values
('01/01/2018','A')
,('01/01/2018','B')
,('02/01/2018','C')
,('03/01/2018','C')
,('04/01/2018','C')) t(a,b)
)
select
[Date] = dateadd(dd, n, Date), [Count] = count(distinct Item)
from
cte
cross join (values (0),(1),(2)) t(n)
group by dateadd(dd, n, Date)
having max(iif(n = 0, 1, 0)) = 1
option (force order)
Output:
| Date | Count |
|-------------------------|-------|
| 2018-01-01 00:00:00.000 | 2 |
| 2018-01-02 00:00:00.000 | 3 |
| 2018-01-03 00:00:00.000 | 3 |
| 2018-01-04 00:00:00.000 | 1 |
It might be faster if you have many duplicate rows:
select
[Date] = dateadd(dd, n, Date), [Count] = count(distinct Item)
from
(select distinct Date, Item from cte) c
cross join (values (0),(1),(2)) t(n)
group by dateadd(dd, n, Date)
having max(iif(n = 0, 1, 0)) = 1
option (force order)
Use GETDATE() function to get current date, and DATEADD() to get the last 3 days
SELECT Date, count(DISTINCT item)
FROM myTable
WHERE [Date] >= DATEADD(day,-3, GETDATE())
GROUP BY Date
ORDER BY Date
SQL
SELECT DISTINCT Date,
(SELECT COUNT(DISTINCT item)
FROM myTable t2
WHERE t2.Date BETWEEN DATEADD(day, -2, t1.Date) AND t1.Date) AS count
FROM myTable t1
ORDER BY Date;
Demo
Rextester demo: http://rextester.com/ZRDQ22190
Since COUNT(DISTINCT item) OVER (PARTITION BY [Date]) is not supported you can use dense_rank to emulate that:
SELECT Date, dense_rank() over (partition by [Date] order by [item])
+ dense_rank() over (partition by [Date] order by [item] desc)
- 1 as count_distinct_item
FROM myTable
One thing to note is that dense_rank will count null as whereas COUNT will not.
Refer this post for more details.
Here is a simple solution that uses myTable itself as the source of grouping dates (edited for SQLServer dateadd). Note that this query assumes there will be at least one record in myTable for every date; if any date is absent, it will not appear in the query results, even if there are records for the 2 days prior:
select
date,
(select
count(distinct item)
from (select distinct date, item from myTable) as d2
where
d2.date between dateadd(day,-2,d.date) and d.date
) as count
from (select distinct date from myTable) as d
I solve this question with Math.
z (any day) = 3x + y (y is mode 3 value)
I need from 3 * (x - 1) + y + 1 to 3 * (x - 1) + y + 3
3 * (x- 1) + y + 1 = 3* (z / 3 - 1) + z % 3 + 1
In that case; I can use group by (between 3* (z / 3 - 1) + z % 3 + 1 and z)
SELECT iif(OrderDate between 3 * (cast(OrderDate as int) / 3 - 1) + (cast(OrderDate as int) % 3) + 1
and orderdate, Orderdate, 0)
, count(sh.SalesOrderID) FROM Sales.SalesOrderDetail shd
JOIN Sales.SalesOrderHeader sh on sh.SalesOrderID = shd.SalesOrderID
group by iif(OrderDate between 3 * (cast(OrderDate as int) / 3 - 1) + (cast(OrderDate as int) % 3) + 1
and orderdate, Orderdate, 0)
order by iif(OrderDate between 3 * (cast(OrderDate as int) / 3 - 1) + (cast(OrderDate as int) % 3) + 1
and orderdate, Orderdate, 0)
If you need else day group, you can use;
declare #n int = 4 (another day count)
SELECT iif(OrderDate between #n * (cast(OrderDate as int) / #n - 1) + (cast(OrderDate as int) % #n) + 1
and orderdate, Orderdate, 0)
, count(sh.SalesOrderID) FROM Sales.SalesOrderDetail shd
JOIN Sales.SalesOrderHeader sh on sh.SalesOrderID = shd.SalesOrderID
group by iif(OrderDate between #n * (cast(OrderDate as int) / #n - 1) + (cast(OrderDate as int) % #n) + 1
and orderdate, Orderdate, 0)
order by iif(OrderDate between #n * (cast(OrderDate as int) / #n - 1) + (cast(OrderDate as int) % #n) + 1
and orderdate, Orderdate, 0)

SQL Stairstep Query

I need some help producing a MS SQL 2012 query that will match the desired stair-step output. The rows summarize data by one date range (account submission date month), and the columns summarize it by another date range (payment date month)
Table 1: Accounts tracks accounts placed for collections.
CREATE TABLE [dbo].[Accounts](
[AccountID] [nchar](10) NOT NULL,
[SubmissionDate] [date] NOT NULL,
[Amount] [money] NOT NULL,
CONSTRAINT [PK_Accounts] PRIMARY KEY CLUSTERED (AccountID ASC))
INSERT INTO [dbo].[Accounts] VALUES ('1000', '2012-01-01', 1999.00)
INSERT INTO [dbo].[Accounts] VALUES ('1001', '2012-01-02', 100.00)
INSERT INTO [dbo].[Accounts] VALUES ('1002', '2012-02-05', 350.00)
INSERT INTO [dbo].[Accounts] VALUES ('1003', '2012-03-01', 625.00)
INSERT INTO [dbo].[Accounts] VALUES ('1004', '2012-03-10', 50.00)
INSERT INTO [dbo].[Accounts] VALUES ('1005', '2012-03-10', 10.00)
Table 2: Trans tracks payments made
CREATE TABLE [dbo].[Trans](
[TranID] [int] IDENTITY(1,1) NOT NULL,
[AccountID] [nchar](10) NOT NULL,
[TranDate] [date] NOT NULL,
[TranAmount] [money] NOT NULL,
CONSTRAINT [PK_Trans] PRIMARY KEY CLUSTERED (TranID ASC))
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-01-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-02-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1000, '2012-03-15', 300.00)
INSERT INTO [dbo].[Trans] VALUES (1002, '2012-02-20', 325.00)
INSERT INTO [dbo].[Trans] VALUES (1002, '2012-04-20', 25.00)
INSERT INTO [dbo].[Trans] VALUES (1003, '2012-03-24', 625.00)
INSERT INTO [dbo].[Trans] VALUES (1004, '2012-03-28', 31.00)
INSERT INTO [dbo].[Trans] VALUES (1004, '2012-04-12', 5.00)
INSERT INTO [dbo].[Trans] VALUES (1005, '2012-04-08', 7.00)
INSERT INTO [dbo].[Trans] VALUES (1005, '2012-04-28', 3.00)
Here's what the desired output should look like
*Total Payments in Each Month*
SubmissionYearMonth TotalAmount | 2012-01 2012-02 2012-03 2012-04
--------------------------------------------------------------------
2012-01 2099.00 | 300.00 300.00 300.00 0.00
2012-02 350.00 | 325.00 0.00 25.00
2012-03 685.00 | 656.00 15.00
The first two columns sum Account.Amount grouping by month.
The last 4 columns sum the Tran.TranAmount, by month, for Accounts placed in the given month of the current row.
The query I've been working with feel close. I just don't have the lag correct.
Here's the query I'm working with thus far:
Select SubmissionYearMonth,
TotalAmount,
pt.[0] AS MonthOld0,
pt.[1] AS MonthOld1,
pt.[2] AS MonthOld2,
pt.[3] AS MonthOld3,
pt.[4] AS MonthOld4,
pt.[5] AS MonthOld5,
pt.[6] AS MonthOld6,
pt.[7] AS MonthOld7,
pt.[8] AS MonthOld8,
pt.[9] AS MonthOld9,
pt.[10] AS MonthOld10,
pt.[11] AS MonthOld11,
pt.[12] AS MonthOld12,
pt.[13] AS MonthOld13
From (
SELECT Convert(Char(4),Year(SubmissionDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, SubmissionDate)),2) AS SubmissionYearMonth,
SUM(Amount) AS TotalAmount
FROM Accounts
GROUP BY Convert(Char(4),Year(SubmissionDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, SubmissionDate)),2)
)
AS AccountSummary
OUTER APPLY
(
SELECT *
FROM (
SELECT CASE WHEN DATEDIFF(Month, SubmissionDate, TranDate) < 13
THEN DATEDIFF(Month, SubmissionDate, TranDate)
ELSE 13
END AS PaymentMonthAge,
TranAmount
FROM Trans INNER JOIN Accounts ON Trans.AccountID = Accounts.AccountID
Where Convert(Char(4),Year(TranDate)) + '-' + Right('00' + Convert(VarChar(2), DatePart(Month, TranDate)),2)
= AccountSummary.SubmissionYearMonth
) as TransTemp
PIVOT (SUM(TranAmount)
FOR PaymentMonthAge IN ([0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9],
[10],
[11],
[12],
[13])) as TransPivot
) as pt
It's producing the following output:
SubmissionYearMonth TotalAmount MonthOld0 MonthOld1 MonthOld2 MonthOld3 ...
2012-01 2099.00 300.00 NULL NULL NULL ...
2012-02 350.00 325.00 300.00 NULL NULL ...
2012-03 685.00 656.00 NULL 300.00 NULL ...
As for the column date headers. I'm not sure what the best option is here. I could add an additional set of columns and create a calculated value that I could use in the resulting report.
SQL Fiddle: http://www.sqlfiddle.com/#!6/272e5/1/0
Since you are using SQL Server 2012, we can use the Format function to make the date pretty. There is no need to group by the strings. Instead, I find it useful to use the proper data type for as long as I can and only use Format or Convert on display (or not at all and let the middle tier handle the display).
In this solution, I arbitrarily assumed the earliest TransDate and extract from it, the first day of that month. However, one could easily replace that expression with a static value of the start date desired and this solution would take that and the next 12 months.
With SubmissionMonths As
(
Select DateAdd(d, -Day(A.SubmissionDate) + 1, A.SubmissionDate) As SubmissionMonth
, A.Amount
From dbo.Accounts As A
)
, TranMonths As
(
Select DateAdd(d, -Day(Min( T.TranDate )) + 1, Min( T.TranDate )) As TranMonth
, 1 As MonthNum
From dbo.Accounts As A
Join dbo.Trans As T
On T.AccountId = A.AccountId
Join SubmissionMonths As M
On A.SubmissionDate >= M.SubmissionMonth
And A.SubmissionDate < DateAdd(m,1,SubmissionMonth)
Union All
Select DateAdd(m, 1, TranMonth), MonthNum + 1
From TranMonths
Where MonthNum < 12
)
, TotalBySubmissionMonth As
(
Select M.SubmissionMonth, Sum( M.Amount ) As Total
From SubmissionMonths As M
Group By M.SubmissionMonth
)
Select Format(SMT.SubmissionMonth,'yyyy-MM') As SubmissionMonth, SMT.Total
, Sum( Case When TM.MonthNum = 1 Then T.TranAmount End ) As Month1
, Sum( Case When TM.MonthNum = 2 Then T.TranAmount End ) As Month2
, Sum( Case When TM.MonthNum = 3 Then T.TranAmount End ) As Month3
, Sum( Case When TM.MonthNum = 4 Then T.TranAmount End ) As Month4
, Sum( Case When TM.MonthNum = 5 Then T.TranAmount End ) As Month5
, Sum( Case When TM.MonthNum = 6 Then T.TranAmount End ) As Month6
, Sum( Case When TM.MonthNum = 7 Then T.TranAmount End ) As Month7
, Sum( Case When TM.MonthNum = 8 Then T.TranAmount End ) As Month8
, Sum( Case When TM.MonthNum = 9 Then T.TranAmount End ) As Month9
, Sum( Case When TM.MonthNum = 10 Then T.TranAmount End ) As Month10
, Sum( Case When TM.MonthNum = 11 Then T.TranAmount End ) As Month11
, Sum( Case When TM.MonthNum = 12 Then T.TranAmount End ) As Month12
From TotalBySubmissionMonth As SMT
Join dbo.Accounts As A
On A.SubmissionDate >= SMT.SubmissionMonth
And A.SubmissionDate < DateAdd(m,1,SMT.SubmissionMonth)
Join dbo.Trans As T
On T.AccountId = A.AccountId
Join TranMonths As TM
On T.TranDate >= TM.TranMonth
And T.TranDate < DateAdd(m,1,TM.TranMonth)
Group By SMT.SubmissionMonth, SMT.Total
SQL Fiddle version
The following query pretty much returns what you want. You need to do the to operations separately. I just join the results together:
select a.yyyymm, a.Amount,
t201201, t201202, t201203, t201204
from (select LEFT(convert(varchar(255), a.submissiondate, 121), 7) as yyyymm,
SUM(a.Amount) as amount
from Accounts a
group by LEFT(convert(varchar(255), a.submissiondate, 121), 7)
) a left outer join
(select LEFT(convert(varchar(255), a.submissiondate, 121), 7) as yyyymm,
sum(case when trans_yyyymm = '2012-01' then tranamount end) as t201201,
sum(case when trans_yyyymm = '2012-02' then tranamount end) as t201202,
sum(case when trans_yyyymm = '2012-03' then tranamount end) as t201203,
sum(case when trans_yyyymm = '2012-04' then tranamount end) as t201204
from Accounts a join
(select t.*, LEFT(convert(varchar(255), t.trandate, 121), 7) as trans_yyyymm
from trans t
) t
on a.accountid = t.accountid
group by LEFT(convert(varchar(255), a.submissiondate, 121), 7)
) t
on a.yyyymm = t.yyyymm
order by 1
I am getting a NULL where you have a 0.00 in two cells.
Thomas, I used your response as inspiration for the following solution I ended up using.
I first create a SubmissionDate, TranDate cross join skeleton date matrix, that I later use to join on the AccountSummary and TranSummary data.
The resulting query output isn't formatted in columns, per TranDate month. Rather I'm using output in a SQL Server Reporting Services matrix, and using a column grouping, based off the TranSummaryMonthNum column, to get the desired formatted output.
SQL Fiddle version
;
WITH
--Generate a list of Dates, from the first SubmissionDate, through today.
--Note: Requires the use of: 'OPTION (MAXRECURSION 0)' to generate a list with more than 100 dates.
CTE_AutoDates AS
( Select Min(SubmissionDate) as FiscalDate
From Accounts
UNION ALL
SELECT DATEADD(Day, 1, FiscalDate)
FROM CTE_AutoDates
WHERE DATEADD(Day, 1, FiscalDate) <= GetDate()
),
FiscalDates As
( SELECT FiscalDate,
DATEFROMPARTS(Year(FiscalDate), Month(FiscalDate), 1) as FiscalMonthStartDate
FROM CTE_AutoDates
--Optionaly filter Fiscal Dates by the last known Math.Max(SubmissionDate, TranDate)
Where FiscalDate <= (Select Max(MaxDate)
From (Select Max(SubmissionDate) as MaxDate From Accounts
Union All
Select Max(TranDate) as MaxDate From Trans
) as MaxDates
)
),
FiscalMonths as
( SELECT Distinct FiscalMonthStartDate
FROM FiscalDates
),
--Matrix to store the reporting date groupings for the Account submission and payment periods.
SubmissionAndTranMonths AS
( Select AM.FiscalMonthStartDate as SubmissionMonthStartDate,
TM.FiscalMonthStartDate as TransMonthStartDate,
DateDiff(Month, (Select Min(FiscalMonthStartDate) From FiscalMonths), TM.FiscalMonthStartDate) as TranSummaryMonthNum
From FiscalMonths AS AM
Join FiscalMonths AS TM
ON TM.FiscalMonthStartDate >= AM.FiscalMonthStartDate
),
AccountData as
( Select A.AccountID,
A.Amount,
FD.FiscalMonthStartDate as SubmissionMonthStartDate
From Accounts as A
Inner Join FiscalDates as FD
ON A.SubmissionDate = FD.FiscalDate
),
TranData as
( Select T.AccountID,
T.TranAmount,
AD.SubmissionMonthStartDate,
FD.FiscalMonthStartDate as TranMonthStartDate
From Trans as T
Inner Join AccountData as AD
ON T.AccountID = AD.AccountID
Inner Join FiscalDates AS FD
ON T.TranDate = FD.FiscalDate
),
AccountSummaryByMonth As
( Select ASM.FiscalMonthStartDate,
Sum(AD.Amount) as TotalSubmissionAmount
From FiscalMonths as ASM
Inner Join AccountData as AD
ON ASM.FiscalMonthStartDate = AD.SubmissionMonthStartDate
Group By
ASM.FiscalMonthStartDate
),
TranSummaryByMonth As
( Select STM.SubmissionMonthStartDate,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum,
Sum(TD.TranAmount) as TotalTranAmount
From SubmissionAndTranMonths as STM
Inner Join TranData as TD
ON STM.SubmissionMonthStartDate = TD.SubmissionMonthStartDate
AND STM.TransMonthStartDate = TD.TranMonthStartDate
Group By
STM.SubmissionMonthStartDate,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum
)
--#Inspect 1
--Select * From SubmissionAndTranMonths
--OPTION (MAXRECURSION 0)
--#Inspect 1 Results
--SubmissionMonthStartDate TransMonthStartDate TranSummaryMonthNum
--2012-01-01 2012-01-01 0
--2012-01-01 2012-02-01 1
--2012-01-01 2012-03-01 2
--2012-01-01 2012-04-01 3
--2012-02-01 2012-02-01 1
--2012-02-01 2012-03-01 2
--2012-02-01 2012-04-01 3
--2012-03-01 2012-03-01 2
--2012-03-01 2012-04-01 3
--2012-04-01 2012-04-01 3
--#Inspect 2
--Select * From AccountSummaryByMonth
--OPTION (MAXRECURSION 0)
--#Inspect 2 Results
--FiscalMonthStartDate TotalSubmissionAmount
--2012-01-01 2099.00
--2012-02-01 350.00
--2012-03-01 685.00
--#Inspect 3
--Select * From TranSummaryByMonth
--OPTION (MAXRECURSION 0)
--#Inspect 3 Results
--SubmissionMonthStartDate TransMonthStartDate TranSummaryMonthNum TotalTranAmount
--2012-01-01 2012-01-01 0 300.00
--2012-01-01 2012-02-01 1 300.00
--2012-01-01 2012-03-01 2 300.00
--2012-02-01 2012-02-01 1 325.00
--2012-02-01 2012-04-01 3 25.00
--2012-03-01 2012-03-01 2 656.00
--2012-03-01 2012-04-01 3 15.00
Select STM.SubmissionMonthStartDate,
ASM.TotalSubmissionAmount,
STM.TransMonthStartDate,
STM.TranSummaryMonthNum,
TSM.TotalTranAmount
From SubmissionAndTranMonths as STM
Inner Join AccountSummaryByMonth as ASM
ON STM.SubmissionMonthStartDate = ASM.FiscalMonthStartDate
Left Join TranSummaryByMonth AS TSM
ON STM.SubmissionMonthStartDate = TSM.SubmissionMonthStartDate
AND STM.TransMonthStartDate = TSM.TransMonthStartDate
Order By STM.SubmissionMonthStartDate, STM.TranSummaryMonthNum
OPTION (MAXRECURSION 0)
--#Results
--SubmissionMonthStartDate TotalSubmissionAmount TransMonthStartDate TranSummaryMonthNum TotalTranAmount
--2012-01-01 2099.00 2012-01-01 0 300.00
--2012-01-01 2099.00 2012-02-01 1 300.00
--2012-01-01 2099.00 2012-03-01 2 300.00
--2012-01-01 2099.00 2012-04-01 3 NULL
--2012-02-01 350.00 2012-02-01 1 325.00
--2012-02-01 350.00 2012-03-01 2 NULL
--2012-02-01 350.00 2012-04-01 3 25.00
--2012-03-01 685.00 2012-03-01 2 656.00
--2012-03-01 685.00 2012-04-01 3 15.00
The following query exactly duplicates the results of your final query in your own answer but takes no more than 1/30th the CPU (or better), plus is a whole lot simpler.
If I had the time & energy I am sure I could find even more improvements... my gut tells me I might not have to hit the Accounts table so many times. But in any case, it's a huge improvement and should perform very well even for very large result sets.
See the SqlFiddle for it.
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L2 B),
Nums AS (SELECT N = Row_Number() OVER (ORDER BY (SELECT 1)) FROM L4),
Anchor AS (
SELECT MinDate = DateAdd(month, DateDiff(month, '20000101', Min(SubmissionDate)), '20000101')
FROM dbo.Accounts
),
MNums AS (
SELECT N
FROM Nums
WHERE
N <= DateDiff(month,
(SELECT MinDate FROM Anchor),
(SELECT Max(TranDate) FROM dbo.Trans)
) + 1
),
A AS (
SELECT
AM.AccountMo,
Amount = Sum(A.Amount)
FROM
dbo.Accounts A
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', A.SubmissionDate), '20000101')
) AM (AccountMo)
GROUP BY
AM.AccountMo
), T AS (
SELECT
AM.AccountMo,
TM.TranMo,
TotalTranAmount = Sum(T.TranAmount)
FROM
dbo.Accounts A
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', A.SubmissionDate), '20000101')
) AM (AccountMo)
INNER JOIN dbo.Trans T
ON A.AccountID = T.AccountID
CROSS APPLY (
SELECT DateAdd(month, DateDiff(month, '20000101', T.TranDate), '20000101')
) TM (TranMo)
GROUP BY
AM.AccountMo,
TM.TranMo
)
SELECT
SubmissionStartMonth = A.AccountMo,
TotalSubmissionAmount = A.Amount,
M.TransMonth,
TransMonthNum = N.N - 1,
T.TotalTranAmount
FROM
A
INNER JOIN MNums N
ON N.N >= DateDiff(month, (SELECT MinDate FROM Anchor), A.AccountMo) + 1
CROSS APPLY (
SELECT TransMonth = DateAdd(month, N.N - 1, (SELECT MinDate FROM Anchor))
) M
LEFT JOIN T
ON A.AccountMo = T.AccountMo
AND M.TransMonth = T.TranMo
ORDER BY
A.AccountMo,
M.TransMonth;