Recurring period in SQL script

The situation:
The user creates a case record that includes a date field (DateOpened), and wants to send the client a follow up every 30 days until the case is closed.
The user will run the query periodically (probably weekly) and provide a 'From' and 'To' date range. A record qualifies when a multiple of 30 days since DateOpened falls within that range.
The request:
I need a method to identify the records for which the user-specified date range includes a date that is a multiple of 30 days after the DateOpened date.
UPDATE
This is what came to me all of a sudden while watching a third rate TV show last night!!!
SELECT
....
FROM
....
WHERE
(CAST((DATEDIFF(dd, Invoice.DateOpened, @EndDate)/30) AS INT) - CAST((DATEDIFF(dd, Invoice.DateOpened, @StartDate)/30) AS INT)) >=1
OR DATEDIFF(dd, Invoice.DateOpened, @StartDate) % 30 = 0 --this line to capture valid records but where From and To dates are the same

Is this Microsoft SQL? Is this Express edition? As long as it's not Express, you may want to look into using the SQL Agent service, which lets you schedule tasks that can run against the database. What do you want it to do with the record once it hits 30 days?

You can use the DATEDIFF function to calculate the difference between dates in days. You can use the modulus (%) operator to get the "remainder" of a division operation. Combining the two gives you:
SELECT
....
FROM
....
WHERE
--In MS T-SQL, BETWEEN is inclusive.
DateOpened BETWEEN @UserSuppliedFromDate AND @UserSuppliedToDate
AND DATEDIFF(dd, DateOpened, getdate()) % 30 = 0
which should give you the desired result.
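To sanity-check the range version of this idea outside the database, here is a minimal Python sketch. It assumes a follow-up is due on every day that is a positive multiple of 30 days after DateOpened, and asks whether any such day falls inside the user's From/To range. The name `followup_due` and its parameters are hypothetical and not part of the answer above.

```python
from datetime import date

def followup_due(opened: date, start: date, end: date, period: int = 30) -> bool:
    """True if some day in [start, end] is a positive multiple of `period` days after `opened`."""
    if end < start:
        return False
    # Day offsets relative to the opening date; offset 0 (the opening day) never counts.
    lo = max((start - opened).days, 1)
    hi = (end - opened).days
    if hi < lo:
        return False
    # Number of multiples of `period` in [lo, hi], using floor division.
    return hi // period - (lo - 1) // period >= 1

# Hypothetical example: a case opened on 1 Jan 2012 is due again on 31 Jan 2012.
print(followup_due(date(2012, 1, 1), date(2012, 1, 29), date(2012, 2, 2)))
```

Comparing the two floor divisions also handles the case where From and To are the same day, so no separate modulus check should be needed under these assumptions.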
Edit (Give this example a try in MSSQL):
DECLARE @Table TABLE
(
ID integer,
DateOpened datetime
)
DECLARE @FromDate as datetime = '1/1/2012'
DECLARE @ToDate as datetime = '12/31/2012'
INSERT INTO @Table VALUES (0, '1/1/1982')
INSERT INTO @Table values (1, '1/1/2012')
INSERT INTO @Table VALUES (2, '2/17/2012')
INSERT INTO @Table VALUES (3, '3/16/2012')
INSERT INTO @Table VALUES (4, '4/16/2012')
INSERT INTO @Table VALUES (5, '5/28/2012')
INSERT INTO @Table VALUES (6, '1/31/2012')
INSERT INTO @Table VALUES (7, '12/12/2013')
DECLARE @DateLoop as datetime
DECLARE @ResultIDs as table ( ID integer, DateLoopAtTheTime datetime, DaysDifference integer )
--Initialize to lowest possible value
SELECT @DateLoop = @FromDate
--Loop until we hit the maximum date to check
WHILE @DateLoop <= @ToDate
BEGIN
INSERT INTO @ResultIDs (ID,DateLoopAtTheTime, DaysDifference)
-- DATEDIFF(dd, start, end) returns end minus start, so DateOpened goes first
SELECT ID, @DateLoop, DATEDIFF(dd, DateOpened, @DateLoop)
FROM @Table
WHERE
DATEDIFF(dd, DateOpened, @DateLoop) % 30 = 0
AND DATEDIFF(dd, DateOpened, @DateLoop) > 0 -- Avoids false positives when @DateLoop and DateOpened are the same
AND DateOpened <= @ToDate
SELECT @DateLoop = DATEADD(dd, 1, @DateLoop) -- Increment the iterator
END
SELECT distinct * From @ResultIDs

Related

How to create a temp table with values from another table aggregated weekly?

It is a bit difficult to explain but basically I need to create a temporary table (date datetime, #customers int) where #customers is the number of weekly customers pulled from another table. Here's my code.
declare @date datetime
declare @temptable table (date datetime not null,#customers int)
set @date='2018-02-13'
while @date<getdate()
begin
insert into @temptable values
(@date,
(select count(*) from in_ft_conversion
where u4='cfa' and sales_date between @date and @date-7))
set @date=@date+7
end
The result is a table with all the correct date entries but 0 in the customer column... Does anybody know what I'm doing wrong? Thanks!
Your date range is wrong: swap the date values in the BETWEEN so that you have BETWEEN <earlier date> AND <later date>:
where u4='cfa' and sales_date between @date-7 and @date))
Why would you use a while loop for this? I think you want something like this:
insert into @temptable (date, num_customers)
-- DATEADD takes (datepart, number, date), in that order
select dateadd(day, v.weekno * 7, '2018-02-08'),
count(*)
from in_ft_conversion cross apply
(values (datediff(day, '2018-02-08', sales_date) / 7)
) v(weekno)
where u4 = 'cfa' and sales_date >= '2018-02-08'
group by v.weekno;
No loop is necessary.
Your problem is specifically the between comparison:
sales_date between @date and @date-7
The dates are backwards -- the lower bound needs to go first.
But, I also doubt that you want to count weeks with 8 days and have one day overlap on each week. I think the above logic does what you want, but you can adjust the date arithmetic to get the exact dates you want.
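The week-number trick can be checked outside SQL with a small Python sketch. It assumes each sale is assigned to a non-overlapping 7-day bucket by integer-dividing its day offset from an anchor date. The name `weekly_counts` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

def weekly_counts(sales_dates, anchor: date) -> dict:
    """Count sales per 7-day bucket, keyed by the first day of each bucket."""
    # Day offset // 7 gives the bucket number; dates before the anchor are ignored.
    buckets = Counter((d - anchor).days // 7 for d in sales_dates if d >= anchor)
    return {anchor + timedelta(days=7 * week): n for week, n in sorted(buckets.items())}

# Hypothetical sample data
sales = [date(2018, 2, 8), date(2018, 2, 14), date(2018, 2, 15), date(2018, 2, 22)]
for week_start, n in weekly_counts(sales, date(2018, 2, 8)).items():
    print(week_start, n)
```

Because every date maps to exactly one bucket, no day is counted twice, which is the overlap problem the loop with BETWEEN runs into.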

tsql grouping with duplication based on variable

I want to create some aggregations from a table but I am not able to figure out a solution.
Example table:
DECLARE #MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO #MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
For each person existing at that time, I want to average the value for the last x (#months_back) months given some starting date (#start_date):
DECLARE #months_back int, #start_date date
set #months_back = 3
set #start_date = '2017-05-01'
SELECT person, avg(the_value) as avg_the_value
FROM #MyTable
where the_date <= #start_date and the_date >= dateadd(month, -#months_back, #start_date)
group by person
This works. I now want to do the same thing again but skip back some months (#month_skip) from the starting date. Then I want to union those two tables together. Then, I again want to skip back #month_skip months from this date and do the same thing. I want to continue doing this until I have skipped back to some specified date (#min_date).
DECLARE #months_back int, #month_skip int, #start_date date, #min_date date
set #months_back = 3
set #month_skip = 2
set #start_date = '2017-05-01'
set #min_date = '2017-03-01'
Using the above variables and the table #MyTable the result should be:
person | avg_the_value
1 | 5
2 | 6
1 | 6
3 | 2
Only one skip is made here since #min_date is 2 months back but I would like to be able to do multiple skips based on what #min_date is.
This example table is simple but the real one has many more automatically created columns and therefore it is not feasible to use a table variable where I would have to declare the scheme of the resulting table.
I asked a related question Here but did not manage to get any of the answers to work for this problem.
It sounds like what you're trying to do is the following:
Starting with a date (e.g. 2017-05-01), look back #months_back months and define a range of dates. For example, if we go 3 months back, we're defining a range from 2017-02-01 through 2017-05-01.
After we define this range, we go back to our starting date and define a new starting date, going back #month_skip months. For example, with an initial starting date of 2017-05-01, we might skip back 2 months, giving us a new starting date of 2017-03-01.
We take this new starting date, and define a range of corresponding dates (as we did above). This produces the range 2016-12-01 through 2017-03-01.
We repeat this as needed through the minimum date specified, to produce a list of date ranges we want to do calculations for:
2017-02-01 through 2017-05-01
2016-12-01 through 2017-03-01
... etc ...
For each of these periods, look at a person and calculate the average of their value.
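The steps above can be sketched in Python to check the expected numbers. This is a rough model, assuming all dates fall on the first of a month, inclusive period bounds, and integer averages (as AVG over an int column returns). The helper names `shift_months`, `build_periods` and `averages` are hypothetical.

```python
from datetime import date

def shift_months(d: date, months: int) -> date:
    """Shift a first-of-month date by whole months (assumes d.day == 1)."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)

def build_periods(start_date, min_date, months_back, month_skip):
    """Return (start, end) pairs, newest first, while the end date is >= min_date."""
    periods = []
    end = start_date
    while end >= min_date:
        periods.append((shift_months(end, -months_back), end))
        end = shift_months(end, -month_skip)  # skip back to the next period's end
    return periods

def averages(rows, periods):
    """Integer average per (period, person); rows are (person, the_date, the_value)."""
    out = []
    for start, end in periods:
        totals = {}
        for person, the_date, value in rows:
            if start <= the_date <= end:
                s, n = totals.get(person, (0, 0))
                totals[person] = (s + value, n + 1)
        out.extend((person, s // n) for person, (s, n) in sorted(totals.items()))
    return out

print(build_periods(date(2017, 5, 1), date(2017, 3, 1), 3, 2))
```

With the question's sample rows and variables (3 months back, skip 2, minimum date 2017-03-01), this model yields the four rows listed in the question's expected result.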
The query below should do what is described above: rather than taking a value and iterating back to calculate previous values, we use a numbers table to calculate offsets on an interval, which is used to determine the ending and starting dates for each interval/period. This query was built using SQL Server 2008 R2 and should be compatible with future versions.
/* Table, data, variable declarations */
DECLARE #MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO #MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
DECLARE #months_back int, #month_skip int, #start_date date, #min_date date
set #months_back = 3
set #month_skip = 2
set #start_date = '2017-05-01'
set #min_date = '2017-01-01'
/* Common table expression to build list of Integers */
/* reference http://www.itprotoday.com/software-development/build-numbers-table-you-need if you want more info */
declare #end_int bigint = 50
; WITH IntegersTableFill (ints) AS
(
SELECT
CAST(0 AS BIGINT) AS 'ints'
UNION ALL
SELECT (T.ints + 1) AS 'ints'
FROM IntegersTableFill T
WHERE ints <= (
CASE
WHEN (#end_int <= 32767) THEN #end_int
ELSE 32767
END
)
)
/* What we're going to do is define a series of periods.
These periods have a start date and an end date, and will simplify grouping
(in place of the calculate-and-union approach)
*/
/* Now, we start defining the periods
#months_Back_start defines the end of the range we need to calculate for.
#month_skip defines the amount of time we have to jump back for each period
*/
/* Using the number table we defined above and the data in our variables, calculate start and end dates */
,periodEndDates as
(
select ints as Period
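,DATEADD(month, -(#month_skip*ints), #start_date) as endOfPeriod -- each period ends #month_skip months before the previous one
from IntegersTableFill itf
)
,periodStartDates as
(
select *
,DATEADD(month, -(#months_back), endOfPeriod) as startOfPeriod -- each period looks back #months_back months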
from periodEndDates
)
,finalPeriodData as
(
select (period) as period, startOfPeriod, endOfPeriod from periodStartDates
)
/* Link the entries in our original data to the periods they fall into */
/* NOTE: The join criteria originally specified allows values to fall into multiple periods.
You may want to fix this?
*/
,periodTableJoin as
(
select * from finalPeriodData fpd
inner join #MyTable mt
on mt.the_date >= fpd.startOfPeriod
and mt.the_date <= fpd.endOfPeriod
and mt.the_date >= #min_date
and mt.the_date <= #start_date
)
/* Calculate averages, grouping by period and person */
,periodValueAggregate as
(
select person, avg(the_value) as avg_the_value from
periodTableJoin
group by period, person
)
select * from periodValueAggregate
The method I propose is set-based, not iterative.
(I am not following your problem exactly, but please follow along and we can iron out any discrepancies)
Essentially, you are looking to divide a calendar up in to periods of interest. The periods are all equal in width and are sequential.
For this, I propose you build a calendar table and mark the periods using division as illustrated in the code;
DECLARE #CalStart DATE = '2017-01-01'
,#CalEnd DATE = '2018-01-01'
,#CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, #CalStart, #CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
SELECT CalWindow = N / #CalWindowSize
,CalDate = DATEADD(MONTH, N, #CalStart)
FROM Numbers
Once you have correctly configured the variables, you should have a calendar that represents the windows of interest.
It is then a matter of affixing this calendar to your dataset and grouping by not only the person but the CalWindow too;
DECLARE #MyTable TABLE(person INT, the_date date, the_value int)
INSERT INTO #MyTable VALUES
(1,'2017-01-01', 10),
(1,'2017-02-01', 5),
(1,'2017-03-01', 5),
(1,'2017-04-01', 10),
(1,'2017-05-01', 2),
(2,'2017-04-01', 10),
(2,'2017-05-01', 10),
(2,'2017-05-01', 0),
(3,'2017-01-01', 2)
----------------------------------
-- Build Calendar
----------------------------------
DECLARE #CalStart DATE = '2017-01-01'
,#CalEnd DATE = '2018-01-01'
,#CalWindowSize INT = 2
;WITH Numbers AS
(
SELECT TOP (DATEDIFF(MONTH, #CalStart, #CalEnd)) N = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) - 1
FROM syscolumns
)
,Calendar AS
(
SELECT CalWindow = N / #CalWindowSize
,CalDate = DATEADD(MONTH, N, #CalStart)
FROM Numbers
)
SELECT TB.Person
,AVG(TB.the_value)
FROM #MyTable TB
JOIN Calendar CL ON TB.the_date = CL.CalDate
GROUP BY CL.CalWindow, TB.person
Hope I have understood your problem.

SQL Server : 5 days moving average for last month

I have a view with two columns TOTAL and DATE, the latter one excludes Saturdays and Sundays, i.e.
TOTAL DATE
0 1-1-2014
33 2-1-2014
11 3-1-2014
55 5-1-2014
...
25 15-1-2014
35 16-1-2014
17 17-1-2014
40 20-1-2014
33 21-1-2014
...
The task I'm trying to complete is computing a 5-day average of TOTAL across the whole month, i.e. between the 13th and 17th, the 14th and 20th (we skip weekends), the 15th and 21st, etc., up to the current date.
And YES, they ARE OVERLAPPING RANGES.
Any idea how to achieve it in SQL?
Example of the output (starting from the 6th and using fake numbers)
5daysAVG Start_day
22 1-01-2014 <-counted between 1st to 6th Jan excl 4 and 5 of Jan
25 2-01-2014 <- Counted between 2nd to 7th excluding 4 and 5
27 3-01-2014 <- 3rd to 8th excluding 4/5
24 6-01-2014 <-6th to 10th
...
33 today-5
Okay, I usually set up some test data to play with.
Here is some code to create a [work] table in tempdb. I am skipping weekends. The total is a random number from 0 to 40.
-- Just playing
use tempdb;
go
-- drop existing
if object_id ('work') > 0
drop table work
go
-- create new
create table work
(
my_date date,
my_total int
);
go
-- clear data
truncate table work;
go
-- Monday = 1
SET DATEFIRST 1;
GO
-- insert data
declare #dt date = '20131231';
declare #hr int;
while (#dt < '20140201')
begin
set #hr = floor(rand(checksum(newid())) * 40);
set #dt = dateadd(d, 1, #dt);
if (datepart(dw, #dt) < 6)
insert into work values (#dt, #hr);
end
go
This becomes real easy in SQL SERVER 2012 with the new LEAD() window function.
-- show data
with cte_summary as
(
select
row_number() over (order by my_date) as my_num,
my_date,
my_total,
LEAD(my_total, 0, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 1, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 2, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 3, 0) OVER (ORDER BY my_date) +
LEAD(my_total, 4, 0) OVER (ORDER BY my_date) as my_sum,
(select count(*) from work) as my_cnt
from work
)
select * from cte_summary
where my_num <= my_cnt - 4
Basically, we give a row number to each row, calculate the sum for rows 0 (current) to row 4 (4 away) and a total count.
Since this is a running total for five periods, the last four dates do not have a full window of data. Therefore, we toss them out: my_num <= my_cnt - 4
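For readers who want to verify the windowing logic without SQL Server 2012, here is a minimal Python sketch of the same idea: order the rows by date, sum each run of five consecutive rows, and drop the trailing rows that lack a full window. The function name `moving_averages` and the sample rows are hypothetical.

```python
def moving_averages(rows, window=5):
    """rows: iterable of (day, total). Returns (start_day, average) for each full window."""
    ordered = sorted(rows)  # order by day, like ORDER BY my_date
    result = []
    # Only positions with `window` rows available are kept (my_num <= my_cnt - 4).
    for i in range(len(ordered) - window + 1):
        chunk = ordered[i:i + window]
        result.append((chunk[0][0], sum(total for _, total in chunk) / window))
    return result

# Hypothetical weekday-only sample (ISO date strings sort chronologically)
sample = [("2014-01-01", 0), ("2014-01-02", 33), ("2014-01-03", 11),
          ("2014-01-06", 55), ("2014-01-07", 25), ("2014-01-08", 35)]
for start_day, avg in moving_averages(sample):
    print(start_day, avg)
```

Because weekends are simply absent from the data, counting rows rather than calendar days gives the overlapping five-working-day windows the question asks for.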
I hope this solves your problem!
If you are only caring about one number for the month, change the select to the following. I left the other rows in for you to get an understanding of what is going on.
select avg(my_sum/5) as my_stat
from cte_summary
where my_num <= my_cnt - 4
FOR SQL SERVER < 2012 & >= 2005
Like anything in this world, there is always a way to do it. I used a small tally table to loop thru the data and collect sets of 5 data points for averages.
-- show data
with
cte_tally as
(
select
row_number() over (order by (select 1)) as n
from
sys.all_columns x
),
cte_data as
(
select
row_number() over (order by my_date) as my_num,
my_date,
my_total
from
work
)
select
(select my_date from cte_data where my_num = n) as the_date,
(
select sum(my_total) / 5
from cte_data
where my_num >= n and my_num < n+5
) as the_average
from cte_tally
where n <= (select count(*)-4 from work)
Here is an explanation of the common table expressions (CTE).
cte_data = order data by date and give row numbers
cte_tally = a set based counting algorithm
For groups of five calculate an average and show the date.
This solution does not depend on holidays or weekends. If data is there, it just partitions by groups of five order by date.
If you need to filter out holidays and weekends, create a holiday table. Add a where clause to cte_data that checks for NOT IN (SELECT DATE FROM HOLIDAY TABLE).
Good luck!
SQL Server offers the datepart(wk, ...) function to get the week of the year. Unfortunately, it uses the first day of the year to define the year.
Instead, you can find sequences of consecutive values and group them together:
select min(date), max(date), avg(total*1.0)
from (select v.*, row_number() over (order by date) as seqnum
from view
) v
group by dateadd(day, -seqnum, date);
The idea is that subtracting a sequence of numbers from a sequence of consecutive days yields a constant.
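A small Python sketch may make the "constant" easier to see. It assumes one row per date and groups dates into runs of consecutive days by subtracting each date's 1-based position from the date itself. The name `consecutive_runs` is hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby

def consecutive_runs(dates: list[date]) -> list[tuple[date, date]]:
    """Group dates into runs of consecutive days; returns (first, last) per run."""
    ordered = sorted(set(dates))
    # date minus row number is identical for every date in an unbroken run of days
    keyed = [(d - timedelta(days=seq), d) for seq, d in enumerate(ordered, start=1)]
    runs = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        members = [d for _, d in group]
        runs.append((members[0], members[-1]))
    return runs

# Hypothetical sample: two working weeks separated by a weekend
days = [date(2014, 1, d) for d in (1, 2, 3, 6, 7, 8, 9, 10)]
print(consecutive_runs(days))
```

A weekend gap bumps the date forward by more than the row number advances, so the key changes and a new group starts.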
You can also do this by using a canonical date and dividing by 7:
select min(date), max(date), avg(total*1.0)
from view v
group by datediff(day, '2000-01-03', date) / 7;
The date '2000-01-03' is an arbitrary Monday.
EDIT:
You seem to want a 5-day moving average. Because there is missing data for the weekends, avg() should just work:
select v1.date, avg(v2.total*1.0)
from view v1 join
view v2
on v2.date >= v1.date and v2.date < dateadd(day, 7, v1.date)
group by v1.date;
Here's a solution that works in SQL 2008;
The concept here is to use a table variable to normalize the data first; the rest is simple math to count and average the days.
By normalizing the data, I mean, get rid of weekend days, and assign ID's in a temporary table variable that can be used to identify the rows;
Check it out:
-- This represents your original source table
Declare #YourSourceTable Table
(
Total Int,
CreatedDate DateTime
)
-- This represents some test data in your table with 2 weekends
Insert Into #YourSourceTable Values (0, '1-1-2014')
Insert Into #YourSourceTable Values (33, '1-2-2014')
Insert Into #YourSourceTable Values (11, '1-3-2014')
Insert Into #YourSourceTable Values (55, '1-4-2014')
Insert Into #YourSourceTable Values (25, '1-5-2014')
Insert Into #YourSourceTable Values (35, '1-6-2014')
Insert Into #YourSourceTable Values (17, '1-7-2014')
Insert Into #YourSourceTable Values (40, '1-8-2014')
Insert Into #YourSourceTable Values (33, '1-9-2014')
Insert Into #YourSourceTable Values (43, '1-10-2014')
Insert Into #YourSourceTable Values (21, '1-11-2014')
Insert Into #YourSourceTable Values (5, '1-12-2014')
Insert Into #YourSourceTable Values (12, '1-13-2014')
Insert Into #YourSourceTable Values (16, '1-14-2014')
-- Just a quick test to see the source data
Select * From #YourSourceTable
/* Now we need to normalize the data;
Let's just remove the weekends and get some consistent ID's to use in a separate table variable
We will use DateName SQL Function to exclude weekend days while also giving
sequential ID's to the remaining data in our temporary table variable,
which are easier to query later
*/
Declare #WorkingTable Table
(
TempID Int Identity,
Total Int,
CreatedDate DateTime
)
-- Let's get the data normalized:
Insert Into
#WorkingTable
Select
Total,
CreatedDate
From #YourSourceTable
Where DateName(Weekday, CreatedDate) != 'Saturday'
And DateName(Weekday, CreatedDate) != 'Sunday'
-- Let's run a 2nd quick sanity check to see our normalized data
Select * From #WorkingTable
/* Now that data is normalized, we can just use the ID's to get each 5 day range and
perform simple average function on the columns; I chose to use a CTE here just to
be able to query it and drop the NULL ranges (where there wasn't 5 days of data)
without having to recalculate each average
*/
; With rangeCte (StartDate, TotalAverage)
As
(
Select
wt.createddate As StartDate,
(
wt.Total +
(Select Total From #WorkingTable Where TempID = wt.TempID + 1) +
(Select Total From #WorkingTable Where TempID = wt.TempID + 2) +
(Select Total From #WorkingTable Where TempID = wt.TempID + 3) +
(Select Total From #WorkingTable Where TempID = wt.TempID + 4)
) / 5
As TotalAverage
From
#WorkingTable wt
)
Select
StartDate,
TotalAverage
From rangeCte
Where TotalAverage
Is Not Null

UPDATE field based on another row's field value

I have a table stored on SQL Server 2008 that associates a value with a date range.
DateFrom DateTo Value
2012-01-01 2012-02-01 10
2012-02-02 2012-02-15 15
The application that deals with this table can insert a new range between the existing ones.
For example, if I insert
DateFrom DateTo Value
2012-02-07 2012-02-10 12
The result must be
DateFrom DateTo Value
2012-01-01 2012-02-01 10
2012-02-02 2012-02-06 15
2012-02-07 2012-02-10 12
2012-02-11 2012-02-15 15
I can do that programmatically from the application, but I wonder if there is some fast SQL statement that lets me set the values by referencing another row's fields and performing operations on them.
A MUST requirement is that the date ranges represent a time sequence: two ranges cannot overlap.
I've written an example based on the example I gave you in a comment, it may do what you want. Since, in general terms, there might be multiple rows to insert/delete, it's best to define them all separately, then use a MERGE to perform the overall change.
I've also assumed that it's okay to delete/insert to achieve the splitting - you can't update and produce 2 rows from 1, so you'd always have to do an insert, and the symmetry is cleaner if I do both:
declare #T table (DateFrom datetime2, DateTo datetime2,Value int)
insert into #T(DateFrom , DateTo , Value) VALUES
('20120101', '20120201', 10),
('20120202', '20120206', 15),
('20120207', '20120210', 12),
('20120211', '20120215', 15)
select * from #t order by DateFrom
declare #NewFrom datetime2 = '20120205'
declare #NewTo datetime2 = '20120208'
declare #NewValue int = 8
--We need to identify a) rows to delete, b) new sliced rows to create, and c) the new row itself
;With AlteredRows as (
select #NewFrom as DateFrom,#NewTo as DateTo,#NewValue as Value,1 as toInsert
union all
select DateFrom,DATEADD(day,-1,#NewFrom),Value,1 from #t where #NewFrom between DATEADD(day,1,DateFrom) and DateTo
union all
select DATEADD(day,1,#NewTo),DateTo,Value,1 from #t where #NewTo between DateFrom and DATEADD(day,-1,DateTo)
union all
select DateFrom,DateTo,0,0 from #t where DateTo > #NewFrom and DateFrom < #NewTo
)
merge into #t t using AlteredRows ar on t.DateFrom = ar.DateFrom and t.DateTo = ar.DateTo
when matched and toInsert=0 then delete
when not matched then insert (DateFrom,DateTo,Value) values (ar.DateFrom,ar.DateTo,ar.Value);
select * from #t order by DateFrom
It may be possible to re-write the CTE so that it's a single scan of #t - but I only think it's worth doing that if performance is critical.
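The slicing rules can be prototyped in a few lines of Python before committing to the MERGE. This is only a sketch, assuming inclusive whole-day ranges that do not overlap to begin with. The name `insert_range` is hypothetical.

```python
from datetime import date, timedelta

def insert_range(ranges, new_from: date, new_to: date, new_value):
    """ranges: list of (date_from, date_to, value). Returns the list after inserting
    the new range, trimming or splitting any existing range it overlaps."""
    one_day = timedelta(days=1)
    out = []
    for d_from, d_to, value in ranges:
        if d_to < new_from or d_from > new_to:
            out.append((d_from, d_to, value))         # untouched range
            continue
        if d_from < new_from:
            out.append((d_from, new_from - one_day, value))  # left remainder
        if d_to > new_to:
            out.append((new_to + one_day, d_to, value))      # right remainder
    out.append((new_from, new_to, new_value))
    return sorted(out)

existing = [(date(2012, 1, 1), date(2012, 2, 1), 10),
            (date(2012, 2, 2), date(2012, 2, 15), 15)]
for row in insert_range(existing, date(2012, 2, 7), date(2012, 2, 10), 12):
    print(row)
```

With the question's example, the range valued 15 is split into 2012-02-02 to 2012-02-06 and 2012-02-11 to 2012-02-15, with the new range in between.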
I've had similar problems in the past, and found that if the range needs to be continuous the best approach is to do away with the End Date of the range, and calculate this as the Next start date. Then if needs be create a view as follows:
SELECT DateFrom,
( SELECT DATEADD(DAY, -1, MIN(b.DateFrom))
FROM YourTable b
WHERE b.DateFrom > a.DateFrom
) [DateTo],
Value
FROM YourTable a
This ensures that 2 ranges can never cross, however does not necessarily ensure no work is required upon insert to get the desired result, but it should be more maintainable and have less scope for error than storing both the start and end date.
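As a rough illustration of deriving the end date from the next start date, here is a Python sketch. It assumes whole-day granularity and leaves the last range open-ended (None). The name `with_end_dates` is hypothetical.

```python
from datetime import date, timedelta

def with_end_dates(rows):
    """rows: list of (date_from, value). Returns (date_from, date_to, value), where
    date_to is the day before the next start, or None for the last row."""
    ordered = sorted(rows)
    out = []
    for i, (start, value) in enumerate(ordered):
        next_start = ordered[i + 1][0] if i + 1 < len(ordered) else None
        end = next_start - timedelta(days=1) if next_start else None
        out.append((start, end, value))
    return out

# Hypothetical sample data
rows = [(date(2012, 1, 1), 10), (date(2012, 2, 2), 15), (date(2012, 2, 7), 12)]
for row in with_end_dates(rows):
    print(row)
```

Since each end date is computed from its neighbour, two ranges cannot overlap by construction.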
ADDENDUM
Once I had written out all of the below I realised it does not improve maintainability that much to do away with the DateTo Field, it still requires a fair amount of code for the validation, but here's how I would do it anyway.
DECLARE #T table (DateFrom DATE, Value INT)
INSERT INTO #T VALUES ('20120101', 10), ('20120202', 15), ('20120207', 12), ('20120211', 15)
DECLARE #NewFrom DATE = '20120209',
#NewTo DATE = '20120210',
#NewValue INT = 8
-- SHOW INITIAL VALUES FOR DEMONSTRATIVE PURPOSES --
SELECT DateFrom,
ISNULL(( SELECT DATEADD(DAY, -1, MIN(DateFrom))
FROM #t b
WHERE b.DateFrom > a.DateFrom
), CAST(GETDATE() AS DATE)) [DateTo],
Value
FROM #t a
ORDER BY DateFrom
;WITH CTE AS
( SELECT DateFrom,
( SELECT DATEADD(DAY, -1, MIN(DateFrom))
FROM #t b
WHERE b.DateFrom > a.DateFrom
) [DateTo],
Value
FROM #t a
),
MergeCTE AS
( SELECT #NewFrom [DateFrom], #NewValue [Value], 'INSERT' [RowAction]
WHERE #NewFrom < #NewTo -- ENSURE A VALID RANGE IS ENTERED
UNION ALL
-- INSERT A ROW WHERE THE NEW DATE TO SLICES AN EXISTING PERIOD
SELECT DATEADD(DAY, 1, #NewTo), Value, 'INSERT'
FROM CTE
WHERE #NewTo BETWEEN DateFrom AND DateTo
UNION ALL
-- DELETE ALL ENTRIES STARTING WITHIN THE DEFINED PERIOD
SELECT DateFrom, Value, 'DELETE'
FROM CTE
WHERE DateFrom BETWEEN #NewFrom AND #NewTo
)
MERGE INTO #t t USING MergeCTE c ON t.DateFrom = c.DateFrom AND t.Value = c.Value
WHEN MATCHED AND RowAction = 'DELETE' THEN DELETE
WHEN NOT MATCHED THEN INSERT VALUES (c.DateFrom, c.Value);
SELECT DateFrom,
ISNULL(( SELECT DATEADD(DAY, -1, MIN(DateFrom))
FROM #t b
WHERE b.DateFrom > a.DateFrom
), CAST(GETDATE() AS DATE)) [DateTo],
Value
FROM #t a
ORDER BY DateFrom
You can use a cursor to fetch one row at a time from the table and then do the necessary calculations.
If NewDateFrom >= RowDateFrom and NewDateFrom <= RowDateTo ...
Check this article to see how to make a cursor.

How to Determine Values for Missing Months based on Data of Previous Months in T-SQL

I have a set of transactions occurring at specific points in time:
CREATE TABLE Transactions (
TransactionDate Date NOT NULL,
TransactionValue Integer NOT NULL
)
The data might be:
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('1/1/2009', 1)
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('3/1/2009', 2)
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('6/1/2009', 3)
Assuming that the TransactionValue sets some kind of level, I need to know what the level was between the transactions. I need this in the context of a set of T-SQL queries, so it would be best if I could get a result set like this:
Month Value
1/2009 1
2/2009 1
3/2009 2
4/2009 2
5/2009 2
6/2009 3
Note how, for each month, we either get the value specified in the transaction, or we get the most recent non-null value.
My problem is that I have little idea how to do this! I'm only an "intermediate" level SQL Developer, and I don't remember ever seeing anything like this before. Naturally, I could create the data I want in a program, or using cursors, but I'd like to know if there's a better, set-oriented way to do this.
I'm using SQL Server 2008, so if any of the new features will help, I'd like to hear about it.
P.S. If anyone can think of a better way to state this question, or even a better subject line, I'd greatly appreciate it. It took me quite a while to decide that "spread", while lame, was the best I could come up with. "Smear" sounded worse.
I'd start by building a Numbers table holding sequential integers from 1 to a million or so. They come in really handy once you get the hang of it.
For example, here is how to get the 1st of every month in 2008:
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12;
Now, you can put that together using OUTER APPLY to find the most recent transaction for each date like so:
with Dates as (
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12
)
select d.firstOfMonth, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t;
This should give you what you're looking for, but you might have to Google around a little to find the best way to create the Numbers table.
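The "most recent transaction on or before each month start" lookup can be modelled in Python to check the expected output. This is a sketch under the assumption that the month range is given as first-of-month dates. The names `month_starts` and `levels_by_month` are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def month_starts(first: date, last: date):
    """Yield the first day of each month from `first` to `last`, inclusive."""
    start = first.year * 12 + first.month - 1
    stop = last.year * 12 + last.month - 1
    for idx in range(start, stop + 1):
        yield date(idx // 12, idx % 12 + 1, 1)

def levels_by_month(transactions, first: date, last: date):
    """transactions: list of (date, value). Returns (month_start, value) using the
    latest transaction on or before each month start (None if there is none)."""
    ordered = sorted(transactions)
    keys = [d for d, _ in ordered]
    out = []
    for month in month_starts(first, last):
        pos = bisect_right(keys, month)  # number of transactions dated <= month
        out.append((month, ordered[pos - 1][1] if pos else None))
    return out

tx = [(date(2009, 1, 1), 1), (date(2009, 3, 1), 2), (date(2009, 6, 1), 3)]
for month, value in levels_by_month(tx, date(2009, 1, 1), date(2009, 6, 1)):
    print(f"{month.month}/{month.year}", value)
```

The binary search plays the role of the `top 1 ... order by TransactionDate desc` subquery inside OUTER APPLY.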
Here's what I came up with:
declare #Transactions table (TransactionDate datetime, TransactionValue int)
declare #MinDate datetime
declare #MaxDate datetime
declare #iDate datetime
declare #Month int
declare #count int
declare #i int
declare #PrevLvl int
insert into #Transactions (TransactionDate, TransactionValue)
select '1/1/09',1
insert into #Transactions (TransactionDate, TransactionValue)
select '3/1/09',2
insert into #Transactions (TransactionDate, TransactionValue)
select '5/1/09',3
select #MinDate = min(TransactionDate) from #Transactions
select #MaxDate = max(TransactionDate) from #Transactions
set #count=datediff(mm,#MinDate,#MaxDate)
set #i=1
set #iDate=#MinDate
while (#i<=#count)
begin
set #iDate=dateadd(mm,1,#iDate)
if (select count(*) from #Transactions where TransactionDate=#iDate) < 1
begin
select #PrevLvl = TransactionValue from #Transactions where TransactionDate=dateadd(mm,-1,#iDate)
insert into #Transactions (TransactionDate, TransactionValue)
select #iDate, #prevLvl
end
set #i=#i+1
end
select *
from #Transactions
order by TransactionDate
To do it in a set-based way, you need sets for all of your data or information. In this case there's the overlooked data of "What months are there?" It's very useful to have a "Calendar" table as well as a "Number" table in databases as utility tables.
Here's a solution using one of these methods. The first bit of code sets up your calendar table. You can fill it using a cursor or manually or whatever and you can limit it to whatever date range is needed for your business (back to 1900-01-01 or just back to 1970-01-01 and as far into the future as you want). You can also add any other columns that are useful for your business.
CREATE TABLE dbo.Calendar
(
date DATETIME NOT NULL,
is_holiday BIT NOT NULL,
CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED (date)
)
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-01', 1) -- New Year
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-02', 1)
...
Now, using this table your question becomes trivial:
SELECT
CAST(MONTH(date) AS VARCHAR) + '/' + CAST(YEAR(date) AS VARCHAR) AS [Month],
T1.TransactionValue AS [Value]
FROM
dbo.Calendar C
LEFT OUTER JOIN dbo.Transactions T1 ON
T1.TransactionDate <= C.date
LEFT OUTER JOIN dbo.Transactions T2 ON
T2.TransactionDate > T1.TransactionDate AND
T2.TransactionDate <= C.date
WHERE
DAY(C.date) = 1 AND
T2.TransactionDate IS NULL AND
C.date BETWEEN '2009-01-01' AND '2009-12-31' -- You can use whatever range you want
John Gibb posted a fine answer, already accepted, but I wanted to expand on it a bit to:
eliminate the one-year limitation,
expose the date range in a more explicit manner, and
eliminate the need for a separate numbers table.
This slight variation uses a recursive common table expression to establish the set of Dates representing the first of each month between the from and to dates defined in DateRange, starting with the first month start on or after the from date. Note the use of the MAXRECURSION option, which raises the recursion limit above the default of 100 levels; adjust as necessary to accommodate the maximum number of months expected. Also, consider adding alternative Dates assembly logic to support weeks, quarters, or even individual days.
with
DateRange(FromDate, ToDate) as (
select
Cast('11/1/2008' as DateTime),
Cast('2/15/2010' as DateTime)
),
Dates(Date) as (
select
Case Day(FromDate)
When 1 Then FromDate
Else DateAdd(month, 1, DateAdd(month, ((Year(FromDate)-1900)*12)+Month(FromDate)-1, 0))
End
from DateRange
union all
select DateAdd(month, 1, Date)
from Dates
where Date < (select ToDate from DateRange)
)
select
d.Date, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.Date
order by TransactionDate desc
) t
option (maxrecursion 120);
If you do this type of analysis often, you might be interested in this SQL Server function I put together for exactly this purpose:
if exists (select * from dbo.sysobjects where name = 'fn_daterange') drop function fn_daterange;
go
create function fn_daterange
(
#MinDate as datetime,
#MaxDate as datetime,
#intval as datetime
)
returns table
--**************************************************************************
-- Procedure: fn_daterange()
-- Author: Ron Savage
-- Date: 12/16/2008
--
-- Description:
-- This function takes a starting and ending date and an interval, then
-- returns a table of all the dates in that range at the specified interval.
--
-- Change History:
-- Date Init. Description
-- 12/16/2008 RS Created.
-- **************************************************************************
as
return
WITH times (startdate, enddate, intervl) AS
(
SELECT #MinDate as startdate, #MinDate + #intval - .0000001 as enddate, #intval as intervl
UNION ALL
SELECT startdate + intervl as startdate, enddate + intervl as enddate, intervl as intervl
FROM times
WHERE startdate + intervl <= #MaxDate
)
select startdate, enddate from times;
go
It was an answer to this question, which also has some sample output from it.
I don't have access to BOL from my phone so this is a rough guide...
First, you need to generate the missing rows for the months you have no data for. You can either use an OUTER join to a fixed table or temp table covering the timespan you want, or join to a programmatically created dataset (stored proc or suchlike).
Second, you should look at the new SQL 2008 'analytic' functions, like MAX(value) OVER ( partition clause ) to get the previous value.
(I KNOW Oracle can do this 'cause I needed it to calculate compounded interest calcs between transaction dates - same problem really)
Hope this points you in the right direction...
(Avoid throwing it into a temp table and cursoring over it. Too crude!!!)
-----Alternative way------
select
d.firstOfMonth,
MONTH(d.firstOfMonth) as Mon,
YEAR(d.firstOfMonth) as Yr,
t.TransactionValue
from (
select
dateadd( month, inMonths - 1, '1/1/2009') as firstOfMonth
from (
values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)
) Dates(inMonths)
) d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t