MS SQL GROUPED SUM

I currently have an MS SQL query which calculates the length of time that each User has been logged in to a system during one day. The table I am extracting this information from records each log in/log out as a separate record. Currently, my MS SQL code is as follows:
SELECT
CAST(DateTime As Date),
UserID,
MIN(DateTime),
MAX(DateTime),
DATEDIFF(SS, MIN(DateTime), MAX(DateTime))
FROM
LoginLogoutData
WHERE
CAST(DateTime AS DATE) = '01/01/2015'
GROUP BY
CAST(DateTime As Date),
UserID
This works as required and creates a table similar to the below.
Date UserID FirstLogIn FinalLogOut LoggedInTime
......... ...... .......... ............ ............
01/01/2015 ABC 07:42:57 14:57:13 26056
01/01/2015 DEF 07:45:49 13:57:56 22326
This works fine for one day's worth of data. However, if I wanted to calculate the length of time that someone was logged into the system during a larger date range, e.g. a week or month, this would not work; it would calculate the length of time between the user's log in on the first day and their log out on the final day.
Basically, I would like my code to calculate (Max(DateTime) - Min(DateTime)) FOR EACH DAY then sum all these values together into one simple table grouped only by UserId. I would then be able to set my date range as I please and receive the correct results.
So I would have a table as follows:
UserId LoggedInTime
........ .............
ABC 563287
DEF 485823
GEH 126789
I assume I need to use a GROUP BY within the MIN() function but I don't have much experience with this yet.
Does anyone have any experience with this? Any help would be greatly appreciated.
Thank you.

First you need to aggregate by date, and then by larger units of time. For instance, for the year to date:
SELECT UserId, SUM(diffsecs)
FROM (SELECT CAST(DateTime As Date) as thedate, UserID,
DATEDIFF(second, MIN(DateTime), MAX(DateTime)) as diffsecs
FROM LoginLogoutData
GROUP BY CAST(DateTime As Date), UserID
) ud
WHERE thedate between '2015-01-01' and getdate()
GROUP BY UserId;

You can use another group by statement to the first query like below:
Select UserID,SUM(LoggedTime)
FROM
(
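-- every column of a derived table needs a name, so alias each expression
SELECT CAST(DateTime As Date) AS LogDate,
UserID, MIN(DateTime) AS FirstLogIn,
MAX(DateTime) AS FinalLogOut,
DATEDIFF(SS, MIN(DateTime), MAX(DateTime)) AS LoggedTime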
FROM LoginLogoutData
WHERE CAST(DateTime AS DATE) = '01/01/2015'
GROUP BY CAST(DateTime As Date), UserID
) As temp
GROUP BY UserID
Here you can change the WHERE clause to match the date range. It will first calculate the logged time for each day and then get the sum of all days per user.
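As a minimal sketch of widening the date range, the filter can be written as a half-open range on the raw DateTime column, which also avoids the ambiguous '01/01/2015' literal format. The variable names @FromDate and @ToDate and the aliases LogDate and PerDay are illustrative assumptions, not part of the original query:

```sql
-- Hypothetical reporting window; adjust as needed (requires SQL Server 2008+).
DECLARE @FromDate date = '20150101';
DECLARE @ToDate   date = '20150201';  -- exclusive upper bound

SELECT UserID, SUM(LoggedTime) AS LoggedInTime
FROM (
    -- Inner query: one row per user per day (last event minus first event)
    SELECT UserID,
           CAST([DateTime] AS date) AS LogDate,
           DATEDIFF(second, MIN([DateTime]), MAX([DateTime])) AS LoggedTime
    FROM LoginLogoutData
    WHERE [DateTime] >= @FromDate
      AND [DateTime] <  @ToDate
    GROUP BY UserID, CAST([DateTime] AS date)
) AS PerDay
-- Outer query: add the daily totals together per user
GROUP BY UserID;
```

Like the queries above, this measures first login to last logout within each day, so gaps between sessions on the same day are counted as logged-in time.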

Here is an example of how to do this with working sample data and the T-SQL included.
-- original table you described
CREATE TABLE LoginLogoutData (UserID int, DateTime DateTime)
GO
-- clean any previous sample records
TRUNCATE TABLE LoginLogOutData
/*local variables for iteration*/
DECLARE @i int = 1
DECLARE @n int
DECLARE @entryDate DateTime = GETDATE()
--populate the table with some sample data
/* for each of the five sample users, generate sample login and logout
data for 30 days. Each login and logout are simply an hour apart for demo purposes. */
SET NOCOUNT ON
-- iterate over 5 users (userid)
WHILE (@i <= 5)
BEGIN
--set the initial counter for the date loop
SET @n = 1
--dated entry loop
WHILE (@n <= 30)
BEGIN
-- increment to the next day
SET @entryDate = DateAdd(dd,@n,GETDATE())
--logged in entry
INSERT INTO LoginLogoutData (DateTime, UserID)
SELECT @entryDate,@i
-- logged out entry
INSERT INTO LoginLogoutData (DateTime, UserID)
SELECT DateAdd(hh,1,@entryDate),@i
--increment counter
SET @n = @n+1
END
--increment counter
SET @i=@i+1
END
GO
/* demonstrate that for each user each day has entries and that
the code calculates (Max(DateTime) - Min(DateTime)) FOR EACH DAY
*/
SELECT UserID,
MIN(DateTime) AS LoggedIn,
MAX(DateTime) AS LoggedOut,
DATEDIFF(SS, MIN(DateTime), MAX(DateTime)) AS LoginTime
FROM LoginLogoutData
GROUP BY CAST(DateTime As Date), UserID
/*this is a table variable used to support the "sum all these values together into one
simple table grouped only by UserId" requirement*/
DECLARE @SummedUserActivity AS TABLE (UserID int, DailyActivity int)
-- group the subtotals from each day per user
INSERT INTO @SummedUserActivity (UserID, DailyActivity)
SELECT UserID, DATEDIFF(SS, MIN(DateTime), MAX(DateTime))
FROM LoginLogoutData
GROUP BY CAST(DateTime As Date), UserID
-- demonstrate the sum of the subtotals grouped by userid
SELECT UserID, SUM(DailyActivity) AS TotalActivity
FROM @SummedUserActivity
GROUP BY UserID

Since you are already calculating LoggedInTime for each date, the following query is all that is needed:
SELECT USERID,SUM(LoggedInTime) LoggedInTime
FROM YOURTABLE
GROUP BY USERID
UPDATE
Since you have one record for login and the next record for logout (irrespective of date), we can use a SELF JOIN (a table joined to itself) to get the logout datetime for the corresponding login time.
DECLARE @FROMDATE DATETIME='2014-01-01 07:42:57.000'
DECLARE @TODATE DATETIME='2015-02-01 07:42:57.000'
;WITH CTE AS
(
-- ROW_NUMBER() is taken as logic for self joining.
SELECT ROW_NUMBER() OVER(ORDER BY USERID,[DATETIME]) RNO,*
FROM #TEMP
WHERE [DATETIME] >= @FROMDATE AND [DATETIME] < @TODATE
)
,CTE2 AS
(
-- Since we are using SELF JOIN,we will get the next row's
-- datetime(ie, Logout time for corresponding login time)
SELECT C1.*,C2.[DATETIME] [DATETIME2],
DATEDIFF(SS, C1.[DateTime], C2.[DateTime]) SECONDSDIFF
FROM CTE C1
LEFT JOIN CTE C2 ON C1.RNO=C2.RNO-1
WHERE C1.RNO%2 <> 0
-- Since we get the next row's datetime in current row,
-- we omit each logout time's row
)
SELECT USERID,SUM(SECONDSDIFF) SECONDSDIFF
FROM CTE2
GROUP BY USERID

Related

How to create a temp table with values from another table aggregated weekly?

It is a bit difficult to explain but basically I need to create a temporary table (date datetime, #customers int) where #customers is the number of weekly customers pulled from another table. Here's my code.
declare @date datetime
declare @temptable table (date datetime not null,#customers int)
set @date='2018-02-13'
while @date<getdate()
begin
insert into @temptable values
(@date,
(select count(*) from in_ft_conversion
where u4='cfa' and sales_date between @date and @date-7))
set @date=@date+7
end
The result is a table with all the correct date entries but 0 in the customer column... Does anybody know what I'm doing wrong? Thanks!
Your date range is wrong; swap the date values in the BETWEEN so you have BETWEEN <earlier date> AND <later date>:
where u4='cfa' and sales_date between @date-7 and @date))
Why would you use a while loop for this? I think you want something like this:
insert into @temptable (date, num_customers)
select dateadd(day, weekno * 7, '2018-02-08'),
count(*)
from in_ft_conversion cross apply
(values (datediff(day, '2018-02-08', sales_date) / 7)
) v(weekno)
where u4 = 'cfa' and sales_date >= '2018-02-08'
group by v.weekno;
No loop is necessary.
Your problem is specifically the between comparison:
sales_date between @date and @date-7
The dates are backwards -- the lower bound needs to go first.
But, I also doubt that you want to count weeks with 8 days and have one day overlap on each week. I think the above logic does what you want, but you can adjust the date arithmetic to get the exact dates you want.
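The effect of the bound order can be checked in isolation. The following is a small sketch with a hypothetical variable @d; it relies only on the fact that `x BETWEEN a AND b` is evaluated as `x >= a AND x <= b`, so nothing can match when the first bound is the later date:

```sql
-- Hypothetical test date; @d - 3 lies inside the week that ends on @d.
DECLARE @d datetime = '20180213';

SELECT
    -- Later date first: the condition can never be true
    CASE WHEN @d - 3 BETWEEN @d AND @d - 7
         THEN 'match' ELSE 'no match' END AS reversed_bounds,
    -- Earlier date first: the condition is true for dates in the range
    CASE WHEN @d - 3 BETWEEN @d - 7 AND @d
         THEN 'match' ELSE 'no match' END AS correct_bounds;
-- reversed_bounds = 'no match', correct_bounds = 'match'
```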

SQL Server Patient Census Average By Day and Hour

I need to create a patient census report that shows average number patients present per hour and per day of a week over a given time period. This would allow me to show, for example, over the last 6 months there was an average of 4 people in the ER on Mondays. I have a table valued function that will show the following for patients:
VisitID, FromDateTime, ThruDateTime, LocationID.
I was able to show the number of patients in, for example, the ER for a given day using the code below. But it is limited to one day only. (Adapted from http://www.sqlservercentral.com/Forums/Topic939818-338-1.aspx).
--Census Count by Date Range--
DECLARE @BeginDateParameter DateTime, @EndDateParameter DateTime
SET @BeginDateParameter = '20160201'
SET @EndDateParameter = '2016-02-01 23:59:59.000'
----------------------------------------------------
-- Create a temp table to hold the necessary values
-- plus an extra "x" field to track processing
----------------------------------------------------
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL DROP TABLE #Temp
CREATE TABLE #Temp (ID INT Identity NOT NULL, VisitID VarChar(100), SourceID VarChar(100),
FromDateTime DateTime, ThruDateTime DateTime, x INT)
----------------------------------------------------
-- Populate the temp table with values from the
-- the actual table in the database
----------------------------------------------------
INSERT INTO #Temp
SELECT VisitID, FromDateTime, ThruDateTime
FROM PatientFlowTable(@BeginDateParameter,@EndDateParameter)
WHERE (FromDateTime BETWEEN @BeginDateParameter AND @EndDateParameter +1
OR ThruDateTime BETWEEN @BeginDateParameter AND @EndDateParameter +1)
AND LocationID = 'ER'
-- Given Period is taken as inclusive of given hours in the input (eg. 15:25:30 will be taken as 15:00:00)
-- first make sure that the minutes, seconds and milliseconds are removed from input range for clarity
set @BeginDateParameter = dateadd(hh, datepart(hh,@BeginDateParameter), convert(varchar(12),@BeginDateParameter,112))
set @EndDateParameter = dateadd(hh, datepart(hh,@EndDateParameter), convert(varchar(12),@EndDateParameter,112))
-- you may create this CTE by other ways (eg. from permanent Tally table)...
;with dh
as
(
select top 24
DATEADD(hour,ROW_NUMBER() OVER (ORDER BY [Object_id])-1,convert(varchar(12),@BeginDateParameter,112)) as HoDstart
,DATEADD(hour,ROW_NUMBER() OVER (ORDER BY [Object_id]),convert(varchar(12),@BeginDateParameter,112)) as HoDend
,ROW_NUMBER() OVER (ORDER BY Object_id)-1 as DayHour
from sys.columns -- or any other (not very big) table which has more than 24 rows; just remember to change
-- [Object_id] in OVER (ORDER BY [Object_id]... to some existing column
)
select d.DayHour, count(w.VisitID) as PatientCount
from dh d
left join #Temp w
on w.[FromDateTime] < d.HoDend
and w.[ThruDateTime] >= d.HoDstart
where d.HoDstart between @BeginDateParameter and @EndDateParameter
group by d.DayHour
order by d.DayHour
SELECT VisitID, FromDateTime, ThruDateTime
FROM PatientFlowTable(@BeginDateParameter,@EndDateParameter)
WHERE (FromDateTime BETWEEN @BeginDateParameter AND @EndDateParameter +1
OR ThruDateTime BETWEEN @BeginDateParameter AND @EndDateParameter +1)
AND LocationID = 'ER'
Output example for the first three hours show patients that were present in the ER by taking into account their departure time.
Hour PatientCount
0 2
1 3
2 3
For querying short time periods, I would create a table-valued function that generates the hour entries. The results table can be joined into your query.
CREATE FUNCTION [dbo].[f_hours] (@startDateTime DATETIME,
@endDateTime DATETIME)
RETURNS @result TABLE (
[dateTime] DATETIME PRIMARY KEY
)
AS
BEGIN
DECLARE
@dateTime DATETIME = @startDateTime,
@hours INT = DATEDIFF(hour, @startDateTime, @endDateTime)
WHILE (@dateTime <= @endDateTime)
BEGIN
INSERT
INTO @result
VALUES (@dateTime)
SET @dateTime = DATEADD(hour, 1, @dateTime)
END
RETURN
END
GO
The time required by the function can be output with SET STATISTICS TIME ON. Generating over 6000 records takes 53 ms on my computer.
SET STATISTICS TIME ON
SELECT *
FROM [dbo].[f_hours]('2016-02-01', '2016-02-10 16:00')
SET STATISTICS TIME OFF
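One way the hour list could be joined back to the visits to get the requested averages is sketched below. It assumes the dbo.f_hours function above and the #Temp table from the question, already loaded with every visit overlapping the window. The variables @From and @To and the aliases HourStart, PatientCount and AvgPatients are hypothetical:

```sql
-- Hypothetical four-week window, so each weekday occurs the same number of times.
DECLARE @From datetime = '20160201';
DECLARE @To   datetime = '20160228 23:00';  -- start of the last hour slot

SELECT DATENAME(weekday, c.HourStart) AS DayOfWeek,
       DATEPART(hour, c.HourStart)    AS HourOfDay,
       AVG(CAST(c.PatientCount AS decimal(10, 2))) AS AvgPatients
FROM (
    -- One row per hour slot with the number of visits overlapping that slot
    SELECT h.[dateTime]     AS HourStart,
           COUNT(t.VisitID) AS PatientCount
    FROM dbo.f_hours(@From, @To) AS h
    LEFT JOIN #Temp AS t
           ON t.FromDateTime <  DATEADD(hour, 1, h.[dateTime])
          AND t.ThruDateTime >= h.[dateTime]
    GROUP BY h.[dateTime]
) AS c
-- Average the hourly counts across all occurrences of the same weekday and hour
GROUP BY DATENAME(weekday, c.HourStart), DATEPART(hour, c.HourStart)
ORDER BY MIN(DATEPART(weekday, c.HourStart)), HourOfDay;
```

The overlap test mirrors the one in the question (a visit counts for an hour if it starts before the hour ends and ends at or after the hour starts). Hours with no patients still contribute a zero to the average because of the LEFT JOIN.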

How to find the total playing time per day for all the users in my sql server database

I have a table which contains following columns
userid,
game,
gameStarttime datetime,
gameEndtime datetime,
startdate datetime,
currentdate datetime
I can retrieve all the playing times but I want to count the total playing time per DAY and 0 or null if game not played on a specific day.
Take a look at DATEDIFF to do the time calculations. Your requirements are not very clear, but it should work for whatever you're looking to do.
Your end result would probably look something like this:
SELECT
userid,
game,
SUM(DATEDIFF(SS, gameStarttime, gameEndtime)) AS [TotalSeconds]
FROM [source]
GROUP BY
userid,
game
In the example query above, the SS counts the seconds between the 2 dates (assuming both are not null). If you need just minutes, then MI will provide the total minutes. However, I imagine total seconds is best so that you can convert accurately to whatever unit of measure you need, such as hours that might be "1.23" or something like that.
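As a small sketch of that conversion, with hypothetical variables @start and @end: dividing the second count by 3600.0 gives fractional hours, whereas DATEDIFF with the hour datepart only counts how many hour boundaries were crossed.

```sql
-- Hypothetical sample interval of 1 h 13 min 48 s
DECLARE @start datetime = '2015-04-22T16:00:00';
DECLARE @end   datetime = '2015-04-22T17:13:48';

SELECT DATEDIFF(second, @start, @end) AS TotalSeconds,   -- 4428
       CAST(DATEDIFF(second, @start, @end) / 3600.0
            AS decimal(10, 2))        AS TotalHours,     -- 1.23
       DATEDIFF(hour, @start, @end)   AS HourBoundaries; -- 1
```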
Again, most of this is speculation based on assumptions and what you seem to be looking for. Hope that helps.
MSDN Docs for DATEDIFF: https://msdn.microsoft.com/en-us/library/ms189794.aspx
You may also look up DATEPART if you want the minutes and seconds separately.
UPDATED BASED ON FEEDBACK
The query below breaks out the hour breakdowns by day, splits time across multiple days, and shows "0" for days where no games are played. Also, for your output, I have to assume you have a separate table of users (so you can show users who have no time in your date range).
-- Define start date
DECLARE @BeginDate DATE = '4/21/2015'
-- Create sample data
DECLARE @Usage TABLE (
userid int,
game nvarchar(50),
gameStartTime datetime,
gameEndTime datetime
)
DECLARE @Users TABLE (
userid int
)
INSERT @Users VALUES (1)
INSERT @Usage VALUES
(1, 'sample', '4/25/2015 10pm', '4/26/2015 2:30am'),
(1, 'sample', '4/22/2015 4pm', '4/22/2015 4:30pm')
-- Generate list of days in range
DECLARE @DayCount INT = DATEDIFF(DD, @BeginDate, GETDATE()) + 1
;WITH CTE AS (
SELECT TOP (225) [object_id] FROM sys.all_objects
), [Days] AS (
SELECT TOP (@DayCount)
DATEADD(DD, ROW_NUMBER() OVER (ORDER BY x.[object_id]) - 1, @BeginDate) AS [Day]
FROM CTE x
CROSS JOIN CTE y
ORDER BY
[Day]
)
SELECT
[Days].[Day],
Users.userid,
SUM(COALESCE(CONVERT(MONEY, DATEDIFF(SS, CASE WHEN CONVERT(DATE, Usage.gameStartTime) < [Day] THEN [Day] ELSE Usage.gameStartTime END,
CASE WHEN CONVERT(DATE, Usage.gameEndTime) > [Day] THEN DATEADD(DD, 1, [Days].[Day]) ELSE Usage.gameEndTime END)) / 3600, 0)) AS [Hours]
FROM [Days]
CROSS JOIN @Users Users
LEFT OUTER JOIN @Usage Usage
ON Usage.userid = Users.userid
AND [Days].[Day] BETWEEN CONVERT(DATE, Usage.gameStartTime) AND CONVERT(DATE, Usage.gameEndTime)
GROUP BY
[Days].[Day],
Users.userid
The query above yields the output below for the sample data:
Day userid Hours
---------- ----------- ---------------------
2015-04-21 1 0.00
2015-04-22 1 0.50
2015-04-23 1 0.00
2015-04-24 1 0.00
2015-04-25 1 2.00
2015-04-26 1 2.50
2015-04-27 1 0.00
I've edited my SQL on SQL Fiddle and I think this might get you what you asked for. To me it looks a little simpler than the answer you've accepted.
DECLARE @FromDate datetime, @ToDate datetime
SELECT @Fromdate = MIN(StartDate), @ToDate = MAX(currentDate)
FROM Games
-- This recursive CTE will get you all dates
-- between the first StartDate and the last CurrentDate on your table
;WITH AllDates AS(
SELECT @Fromdate As TheDate
UNION ALL
SELECT TheDate + 1
FROM AllDates
WHERE TheDate + 1 <= @ToDate
)
SELECT UserId,
TheDate,
COALESCE(
SUM(
-- When the game starts and ends in the same date
CASE WHEN DATEDIFF(DAY, GameStartTime, GameEndTime) = 0 THEN
DATEDIFF(HOUR, GameStartTime, GameEndTime)
ELSE
-- when the game starts in the current date
CASE WHEN DATEDIFF(DAY, GameStartTime, TheDate) = 0 THEN
DATEDIFF(HOUR, GameStartTime, DATEADD(Day, 1, TheDate))
ELSE -- meaning the game ends in the current date
DATEDIFF(HOUR, TheDate, GameEndTime)
END
END
),
0) As HoursPerDay
FROM (
SELECT DISTINCT UserId,
TheDate,
CASE
WHEN CAST(GameStartTime as Date) = TheDate
THEN GameStartTime
ELSE NULL
END As GameStartTime, -- return null if no game started that day
CASE
WHEN CAST(GameEndTime as Date) = TheDate
THEN GameEndTime
ELSE NULL
END As GameEndTime -- return null if no game ended that day
FROM Games CROSS APPLY AllDates -- This is where the magic happens :-)
) InnerSelect
GROUP BY UserId, TheDate
ORDER BY UserId, TheDate
OPTION (MAXRECURSION 0)
Play with it yourself on SQL Fiddle.

Grouping by contiguous dates, ignoring weekends in SQL

I'm attempting to group contiguous date ranges to show the minimum and maximum date for each range. So far I've used a solution similar to this one: http://www.sqlservercentral.com/articles/T-SQL/71550/ however I'm on SQL 2000 so I had to make some changes. This is my procedure so far:
create table #tmp
(
date smalldatetime,
rownum int identity
)
insert into #tmp
select distinct date from testDates order by date
select
min(date) as dateRangeStart,
max(date) as dateRangeEnd,
count(*) as dates,
dateadd(dd,-1*rownum, date) as GroupID
from #tmp
group by dateadd(dd,-1*rownum, date)
drop table #tmp
It works exactly how I want except for one issue: weekends. My data sets have no records for weekend dates, which means any group found is at most 5 days. For instance, in the results below, I would like the last 3 groups to show up as a single record, with a dateRangeStart of 10/6 and a dateRangeEnd of 10/20:
Is there some way I can set this up to ignore a break in the date range if that break is just a weekend?
Thanks for the help.
EDITED
I didn't like my previous idea very much. Here's a better one, I think:
1. Based on the first and the last dates from the set of those to be grouped, prepare the list of all the intermediate weekend dates.
2. Insert the working dates together with weekend dates, ordered, so they would all be assigned rownum values according to their normal order.
3. Use your method of finding contiguous ranges with the following modifications:
   1) when calculating dateRangeStart, if it's a weekend date, pick the nearest following weekday;
   2) accordingly for dateRangeEnd, if it's a weekend date, pick the nearest preceding weekday;
   3) when counting dates for the group, pick only weekdays.
4. Select from the resulting set only those rows where dates > 0, thus eliminating the groups formed only of the weekends.
And here's an implementation of the method, where it is assumed that a week starts on Sunday (DATEPART returns 1) and weekend days are Sunday and Saturday:
DECLARE @tmp TABLE (date smalldatetime, rownum int IDENTITY);
DECLARE @weekends TABLE (date smalldatetime);
DECLARE @minDate smalldatetime, @maxDate smalldatetime, @date smalldatetime;
/* #1 */
SELECT @minDate = MIN(date), @maxDate = MAX(date)
FROM testDates;
SET @date = @minDate - DATEPART(dw, @minDate) + 7;
WHILE @date < @maxDate BEGIN
INSERT INTO @weekends
SELECT @date UNION ALL
SELECT @date + 1;
SET @date = @date + 7;
END;
/* #2 */
INSERT INTO @tmp
SELECT date FROM testDates
UNION
SELECT date FROM @weekends
ORDER BY date;
/* #3 & #4 */
SELECT *
FROM (
SELECT
MIN(date + CASE DATEPART(dw, date) WHEN 1 THEN 1 WHEN 7 THEN 2 ELSE 0 END)
AS dateRangeStart,
MAX(date - CASE DATEPART(dw, date) WHEN 1 THEN 2 WHEN 7 THEN 1 ELSE 0 END)
AS dateRangeEnd,
COUNT(CASE WHEN DATEPART(dw, date) NOT IN (1, 7) THEN date END) AS dates,
DATEADD(d, -rownum, date) AS GroupID
FROM @tmp
GROUP BY DATEADD(d, -rownum, date)
) s
WHERE dates > 0;

How to Determine Values for Missing Months based on Data of Previous Months in T-SQL

I have a set of transactions occurring at specific points in time:
CREATE TABLE Transactions (
TransactionDate Date NOT NULL,
TransactionValue Integer NOT NULL
)
The data might be:
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('1/1/2009', 1)
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('3/1/2009', 2)
INSERT INTO Transactions (TransactionDate, TransactionValue)
VALUES ('6/1/2009', 3)
Assuming that the TransactionValue sets some kind of level, I need to know what the level was between the transactions. I need this in the context of a set of T-SQL queries, so it would be best if I could get a result set like this:
Month Value
1/2009 1
2/2009 1
3/2009 2
4/2009 2
5/2009 2
6/2009 3
Note how, for each month, we either get the value specified in the transaction, or we get the most recent non-null value.
My problem is that I have little idea how to do this! I'm only an "intermediate" level SQL Developer, and I don't remember ever seeing anything like this before. Naturally, I could create the data I want in a program, or using cursors, but I'd like to know if there's a better, set-oriented way to do this.
I'm using SQL Server 2008, so if any of the new features will help, I'd like to hear about it.
P.S. If anyone can think of a better way to state this question, or even a better subject line, I'd greatly appreciate it. It took me quite a while to decide that "spread", while lame, was the best I could come up with. "Smear" sounded worse.
I'd start by building a Numbers table holding sequential integers from 1 to a million or so. They come in really handy once you get the hang of it.
For example, here is how to get the 1st of every month in 2008:
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12;
Now, you can put that together using OUTER APPLY to find the most recent transaction for each date like so:
with Dates as (
select firstOfMonth = dateadd( month, n - 1, '1/1/2008')
from Numbers
where n <= 12
)
select d.firstOfMonth, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t;
This should give you what you're looking for, but you might have to Google around a little to find the best way to create the Numbers table.
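One common way to build such a table is sketched below. The table name dbo.Numbers and column n match the queries above. Using a cross join of sys.all_objects as the row source is an assumption, and it relies on that catalog view holding at least 1,000 rows, which is typical but not guaranteed:

```sql
-- Utility table of sequential integers, 1 to 1,000,000
CREATE TABLE dbo.Numbers (n int NOT NULL PRIMARY KEY);

INSERT INTO dbo.Numbers (n)
SELECT TOP (1000000)
       -- Number the rows of the cross join; the ordering itself is irrelevant
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
```

If the cross join yields fewer than a million rows on a given server, the table simply ends at a lower maximum, so it is worth checking MAX(n) after loading.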
here's what i came up with
declare @Transactions table (TransactionDate datetime, TransactionValue int)
declare @MinDate datetime
declare @MaxDate datetime
declare @iDate datetime
declare @Month int
declare @count int
declare @i int
declare @PrevLvl int
insert into @Transactions (TransactionDate, TransactionValue)
select '1/1/09',1
insert into @Transactions (TransactionDate, TransactionValue)
select '3/1/09',2
insert into @Transactions (TransactionDate, TransactionValue)
select '5/1/09',3
select @MinDate = min(TransactionDate) from @Transactions
select @MaxDate = max(TransactionDate) from @Transactions
set @count=datediff(mm,@MinDate,@MaxDate)
set @i=1
set @iDate=@MinDate
while (@i<=@count)
begin
set @iDate=dateadd(mm,1,@iDate)
if (select count(*) from @Transactions where TransactionDate=@iDate) < 1
begin
select @PrevLvl = TransactionValue from @Transactions where TransactionDate=dateadd(mm,-1,@iDate)
insert into @Transactions (TransactionDate, TransactionValue)
select @iDate, @PrevLvl
end
set @i=@i+1
end
select *
from @Transactions
order by TransactionDate
To do it in a set-based way, you need sets for all of your data or information. In this case there's the overlooked data of "What months are there?" It's very useful to have a "Calendar" table as well as a "Number" table in databases as utility tables.
Here's a solution using one of these methods. The first bit of code sets up your calendar table. You can fill it using a cursor or manually or whatever and you can limit it to whatever date range is needed for your business (back to 1900-01-01 or just back to 1970-01-01 and as far into the future as you want). You can also add any other columns that are useful for your business.
CREATE TABLE dbo.Calendar
(
date DATETIME NOT NULL,
is_holiday BIT NOT NULL,
CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED (date)
)
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-01', 1) -- New Year
INSERT INTO dbo.Calendar (date, is_holiday) VALUES ('2009-01-02', 1)
...
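As an alternative to a cursor or manual inserts, the table could be filled in one statement. The following is a sketch that uses a recursive CTE to generate every day of 2009. The CTE name d, the date range, and the placeholder is_holiday value of 0 are assumptions; holidays would still need to be flagged separately:

```sql
;WITH d AS (
    -- Anchor: first day of the range
    SELECT CAST('20090101' AS datetime) AS [date]
    UNION ALL
    -- Recursive step: add one day until the end of the range
    SELECT DATEADD(day, 1, [date])
    FROM d
    WHERE [date] < '20091231'
)
INSERT INTO dbo.Calendar ([date], is_holiday)
SELECT d.[date], 0            -- placeholder: not a holiday
FROM d
-- Skip dates that were already inserted by hand
WHERE NOT EXISTS (SELECT 1 FROM dbo.Calendar AS c WHERE c.[date] = d.[date])
OPTION (MAXRECURSION 366);    -- the default limit of 100 is too low for a full year
```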
Now, using this table your question becomes trivial:
SELECT
CAST(MONTH(date) AS VARCHAR) + '/' + CAST(YEAR(date) AS VARCHAR) AS [Month],
T1.TransactionValue AS [Value]
FROM
dbo.Calendar C
LEFT OUTER JOIN dbo.Transactions T1 ON
T1.TransactionDate <= C.date
LEFT OUTER JOIN dbo.Transactions T2 ON
T2.TransactionDate > T1.TransactionDate AND
T2.TransactionDate <= C.date
WHERE
DAY(C.date) = 1 AND
T2.TransactionDate IS NULL AND
C.date BETWEEN '2009-01-01' AND '2009-12-31' -- You can use whatever range you want
John Gibb posted a fine answer, already accepted, but I wanted to expand on it a bit to:
eliminate the one year limitation,
expose the date range in a more explicit manner, and
eliminate the need for a separate numbers table.
This slight variation uses a recursive common table expression to establish the set of Dates representing the first of each month between the from and to dates defined in DateRange. Note the MAXRECURSION option, which sets the maximum number of recursion levels (the default is 100); adjust as necessary to accommodate the maximum number of months expected. Also, consider adding alternative Dates assembly logic to support weeks, quarters, even day-to-day.
with
DateRange(FromDate, ToDate) as (
select
Cast('11/1/2008' as DateTime),
Cast('2/15/2010' as DateTime)
),
Dates(Date) as (
select
Case Day(FromDate)
When 1 Then FromDate
Else DateAdd(month, 1, DateAdd(month, ((Year(FromDate)-1900)*12)+Month(FromDate)-1, 0))
End
from DateRange
union all
select DateAdd(month, 1, Date)
from Dates
where Date < (select ToDate from DateRange)
)
select
d.Date, t.TransactionValue
from Dates d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.Date
order by TransactionDate desc
) t
option (maxrecursion 120);
If you do this type of analysis often, you might be interested in this SQL Server function I put together for exactly this purpose:
if exists (select * from dbo.sysobjects where name = 'fn_daterange') drop function fn_daterange;
go
create function fn_daterange
(
@MinDate as datetime,
@MaxDate as datetime,
@intval as datetime
)
returns table
--**************************************************************************
-- Procedure: fn_daterange()
-- Author: Ron Savage
-- Date: 12/16/2008
--
-- Description:
-- This function takes a starting and ending date and an interval, then
-- returns a table of all the dates in that range at the specified interval.
--
-- Change History:
-- Date Init. Description
-- 12/16/2008 RS Created.
-- **************************************************************************
as
return
WITH times (startdate, enddate, intervl) AS
(
SELECT @MinDate as startdate, @MinDate + @intval - .0000001 as enddate, @intval as intervl
UNION ALL
SELECT startdate + intervl as startdate, enddate + intervl as enddate, intervl as intervl
FROM times
WHERE startdate + intervl <= @MaxDate
)
select startdate, enddate from times;
go
It was an answer to this question, which also has some sample output from it.
I don't have access to BOL from my phone so this is a rough guide...
First, you need to generate the missing rows for the months you have no data. You can either use an OUTER join to a fixed table or temp table with the timespan you want, or to a programmatically created dataset (stored proc or suchlike).
Second, you should look at the new SQL 2008 'analytic' functions, like MAX(value) OVER ( partition clause ) to get the previous value.
(I KNOW Oracle can do this 'cause I needed it to calculate compounded interest calcs between transaction dates - same problem really)
Hope this points you in the right direction...
(Avoid throwing it into a temp table and cursoring over it. Too crude!!!)
-----Alternative way------
select
d.firstOfMonth,
MONTH(d.firstOfMonth) as Mon,
YEAR(d.firstOfMonth) as Yr,
t.TransactionValue
from (
select
dateadd( month, inMonths - 1, '1/1/2009') as firstOfMonth
from (
values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)
) Dates(inMonths)
) d
outer apply (
select top 1 TransactionValue
from Transactions
where TransactionDate <= d.firstOfMonth
order by TransactionDate desc
) t