SQL Server 2012: GROUP BY seconds, minutes, hours - sql

I am trying to group records in SQL Server 2012 by DateTime. I found an example on Stack Overflow that does part of what I want, but it does not group correctly once the interval exceeds the range of the date part. For example, if I group on minutes in blocks of 30 minutes, the result is correct (see query and result below). But if I group on blocks of 120 minutes, I get the exact same result: the grouping is capped at the 60 minutes in an hour (result below).
The problem is that the grouping cannot carry over into its parent DateTime element (seconds to minutes, minutes to hours, ..., even seconds to hours). That makes sense, because I only look at the minutes, but I would like it to cross into the hours as well. Maybe DATEADD() could do it, but I can't get it working.
Any ideas?
A (small) example to show what I mean:
Query:
DECLARE @TimeInterval int, @StartTime DateTime, @EndTime Datetime
SET @TimeInterval = 30
SET @StartTime='2015-01-01T08:00:00Z'
SET @EndTime = '2015-01-05T10:00:00Z'
declare @T table
(
Value datetime
);
insert into @T values ('2015-01-01T08:00:00');
insert into @T values ('2015-01-01T08:03:00');
insert into @T values ('2015-01-01T08:06:00');
insert into @T values ('2015-01-01T08:14:00');
insert into @T values ('2015-01-01T09:06:00');
insert into @T values ('2015-01-01T09:07:00');
insert into @T values ('2015-01-01T09:08:00');
insert into @T values ('2015-01-01T11:09:00');
insert into @T values ('2015-01-01T12:10:00');
insert into @T values ('2015-01-01T13:11:00');
insert into @T values ('2015-01-02T08:08:00');
insert into @T values ('2015-01-02T08:09:00');
insert into @T values ('2015-01-03T08:10:00');
insert into @T values ('2015-01-04T08:11:00');
SELECT DATEADD(MINUTE, @TimeInterval, Convert(Datetime,CONCAT(DATEPART(YEAR, Value),'-', DATEPART(MONTH, Value),
'-', DATEPART(DAY, Value),' ', DATEPART(HOUR, Value),':', ((DATEPART(MINUTE, Value) / @TimeInterval) * @TimeInterval),':00'),120)) as Period,
ISNULL(COUNT(*), 0) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY Convert(Datetime,CONCAT(DATEPART(YEAR, Value),'-', DATEPART(MONTH, Value), '-', DATEPART(DAY, Value),' ',
DATEPART(HOUR, Value),':',((DATEPART(MINUTE, Value) / @TimeInterval) * @TimeInterval),':00'),120)
ORDER BY Period
Result for 30 min
2015-01-01 08:30:00.000 | 4
2015-01-01 09:30:00.000 | 3
2015-01-01 11:30:00.000 | 1
2015-01-01 12:30:00.000 | 1
2015-01-01 13:30:00.000 | 1
2015-01-02 08:30:00.000 | 2
2015-01-03 08:30:00.000 | 1
2015-01-04 08:30:00.000 | 1
Result for 60 min
2015-01-01 08:30:00.000 | 4
2015-01-01 09:30:00.000 | 3
2015-01-01 11:30:00.000 | 1
2015-01-01 12:30:00.000 | 1
2015-01-01 13:30:00.000 | 1
2015-01-02 08:30:00.000 | 2
2015-01-03 08:30:00.000 | 1
2015-01-04 08:30:00.000 | 1
Thanks in advance!

You don't want datepart() for this purpose. You want a minutes count. One way is to use datediff():
SELECT datediff(minute, @StartTime, value) / @TimeInterval as minutes,
COUNT(*) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY datediff(minute, @StartTime, value) / @TimeInterval
ORDER BY minutes ;
SQL Server does integer division, so you don't have to worry about remainders in this case. Also, COUNT(*) cannot return NULL, so neither ISNULL() nor COALESCE() is appropriate.
Or, to get a date/time value:
SELECT dateadd(minute,
(datediff(minute, @StartTime, value) / @TimeInterval) * @TimeInterval,
@StartTime) as period,
COUNT(*) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY dateadd(minute,
(datediff(minute, @StartTime, value) / @TimeInterval) * @TimeInterval,
@StartTime)
ORDER BY period ;
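To make the arithmetic concrete, here is a minimal sketch with 120-minute blocks on a few of the sample rows. The column names PeriodStart and PeriodEnd and the CROSS APPLY (used only to avoid repeating the expression) are my own choices, not part of the question.

```sql
-- Sketch: bucket rows into @TimeInterval-minute blocks counted from @StartTime.
DECLARE @TimeInterval int = 120,
        @StartTime datetime = '2015-01-01T08:00:00',
        @EndTime   datetime = '2015-01-05T10:00:00';

DECLARE @T table (Value datetime);
INSERT INTO @T (Value) VALUES
    ('2015-01-01T08:00:00'), ('2015-01-01T08:14:00'),
    ('2015-01-01T09:06:00'), ('2015-01-01T11:09:00'),
    ('2015-01-01T12:10:00');

SELECT b.PeriodStart,
       DATEADD(minute, @TimeInterval, b.PeriodStart) AS PeriodEnd,
       COUNT(*) AS NumberOfVisitors
FROM @T AS t
CROSS APPLY (
    -- minutes since @StartTime, rounded down to a multiple of the interval
    SELECT DATEADD(minute,
                   (DATEDIFF(minute, @StartTime, t.Value) / @TimeInterval) * @TimeInterval,
                   @StartTime) AS PeriodStart
) AS b
WHERE t.Value >= @StartTime AND t.Value < @EndTime
GROUP BY b.PeriodStart
ORDER BY b.PeriodStart;
-- PeriodStart 08:00 -> 3 rows, 10:00 -> 1 row, 12:00 -> 1 row
```

The question's query reports the end of each block rather than the start, so PeriodEnd is probably the column to compare against the original output.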

Related

SQL how to count census points occurring between date records

I’m using MS-SQL-2008 R2 trying to write a script that calculates the Number of Hospital Beds occupied on any given day, at 2 census points: midnight, and 09:00.
I’m working from a data set of patient Ward Stays. Basically, each row in the table is a record of an individual patient's stay on a single ward, and records the date/time the patient is admitted onto the ward, and the date/time the patient leaves the ward.
A sample of this table is below:
Ward_Stay_Primary_Key | Ward_Start_Date_Time | Ward_End_Date_Time
1 | 2017-09-03 15:04:00.000 | 2017-09-27 16:55:00.000
2 | 2017-09-04 18:08:00.000 | 2017-09-06 18:00:00.000
3 | 2017-09-04 13:00:00.000 | 2017-09-04 22:00:00.000
4 | 2017-09-04 20:54:00.000 | 2017-09-08 14:30:00.000
5 | 2017-09-04 20:52:00.000 | 2017-09-13 11:50:00.000
6 | 2017-09-05 13:32:00.000 | 2017-09-11 14:49:00.000
7 | 2017-09-05 13:17:00.000 | 2017-09-12 21:00:00.000
8 | 2017-09-05 23:11:00.000 | 2017-09-06 17:38:00.000
9 | 2017-09-05 11:35:00.000 | 2017-09-14 16:12:00.000
10 | 2017-09-05 14:05:00.000 | 2017-09-11 16:30:00.000
The key thing to note here is that a patient’s Ward Stay can span any length of time, from a few hours to many days.
The following code enables me to calculate the number of beds at both census points for any given day, by specifying the date in the case statement:
SELECT
'05/09/2017' [Date]
,SUM(case when Ward_Start_Date_Time <= '05/09/2017 00:00:00.000' AND (Ward_End_Date_Time >= '05/09/2017 00:00:00.000' OR Ward_End_Date_Time IS NULL)then 1 else 0 end)[No. Beds Occupied at 00:00]
,SUM(case when Ward_Start_Date_Time <= '05/09/2017 09:00:00.000' AND (Ward_End_Date_Time >= '05/09/2017 09:00:00.000' OR Ward_End_Date_Time IS NULL)then 1 else 0 end)[No. Beds Occupied at 09:00]
FROM
WardStaysTable
And, based on the sample 10 records above, generates this output:
Date | No. Beds Occupied at 00:00 | No. Beds Occupied at 09:00
05/09/2017 | 4 | 4
To perform this for any number of days is obviously onerous, so what I’m looking to create is a query where I can specify a start/end date parameter (e.g. 1st-5th Sept), and for the query to then evaluate the Ward_Start_Date_Time and Ward_End_Date_Time variables for each record, and – grouping by the dates defined in the date parameter – count each time the 00:00:00.000 and 09:00:00.000 census points fall between these 2 variables, to give an output something along these lines (based on the above 10 records):
Date | No. Beds Occupied at 00:00 | No. Beds Occupied at 09:00
01/09/2017 | 0 | 0
02/09/2017 | 0 | 0
03/09/2017 | 0 | 0
04/09/2017 | 1 | 1
05/09/2017 | 4 | 4
I’ve approached this (perhaps naively) thinking that if I use a cte to create a table of dates (defined by the input parameters), along with associated midnight and 9am census date/time points, then I could use these variables to group and evaluate the dataset.
So, this code generates the grouping dates and census date/time points:
DECLARE
@StartDate DATE = '01/09/2017'
,@EndDate DATE = '05/09/2017'
,@0900 INT = 540
SELECT
DATEADD(DAY, nbr - 1, @StartDate) [Date]
,CONVERT(DATETIME,(DATEADD(DAY, nbr - 1, @StartDate))) [MidnightDate]
,DATEADD(mi, @0900,(CONVERT(DATETIME,(DATEADD(DAY, nbr - 1, @StartDate))))) [0900Date]
FROM
(
SELECT
ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS nbr
FROM sys.columns c
) nbrs
WHERE nbr - 1 <= DATEDIFF(DAY, @StartDate, @EndDate)
The stumbling block I’ve hit is how to join the cte to the WardStays dataset, because there’s no appropriate key… I’ve tried a few iterations of using a subquery to make this work, but either I’m taking the wrong approach or I’m getting my syntax in a mess.
In simple terms, the logic I’m trying to create to get the output is something like:
SELECT
[Date]
,SUM (case when WST.Ward_Start_Date_Time <= [MidnightDate] AND (WST.Ward_End_Date_Time >= [MidnightDate] OR WST.Ward_End_Date_Time IS NULL) then 1 else 0 end) [No. Beds Occupied at 00:00]
,SUM (case when WST.Ward_Start_Date_Time <= [0900Date] AND (WST.Ward_End_Date_Time >= [0900Date] OR WST.Ward_End_Date_Time IS NULL) then 1 else 0 end) [No. Beds Occupied at 09:00]
FROM WardStaysTable WST
GROUP BY [Date]
Is the above somehow possible, or am I barking up the wrong tree and need to take a different approach altogether? Appreciate any advice.
I would expect something like this:
WITH dates as (
SELECT CAST(@StartDate as DATETIME) as dte
UNION ALL
SELECT DATEADD(DAY, 1, dte)
FROM dates
WHERE dte < @EndDate
)
SELECT dates.dte [Date],
SUM(CASE WHEN Ward_Start_Date_Time <= dte AND
Ward_END_Date_Time >= dte
THEN 1 ELSE 0
END) as num_beds_0000,
SUM(CASE WHEN Ward_Start_Date_Time <= dte + CAST('09:00' as DATETIME) AND
Ward_END_Date_Time >= dte + CAST('09:00' as DATETIME)
THEN 1 ELSE 0
END) as num_beds_0900
FROM dates LEFT JOIN
WardStaysTable wt
ON wt.Ward_Start_Date_Time <= DATEADD(day, 1, dates.dte) AND
wt.Ward_END_Date_Time >= dates.dte
GROUP BY dates.dte
ORDER BY dates.dte;
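The CTE just creates the list of dates. Note that a recursive CTE stops after 100 levels by default, so a range longer than about 100 days needs OPTION (MAXRECURSION n) at the end of the query.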
What a cool exercise. Here is what I came up with:
CREATE TABLE #tmp (ID int, StartDte datetime, EndDte datetime)
INSERT INTO #tmp values(1,'2017-09-03 15:04:00.000','2017-09-27 06:55:00.000')
INSERT INTO #tmp values(2,'2017-09-04 08:08:00.000','2017-09-06 18:00:00.000')
INSERT INTO #tmp values(3,'2017-09-04 13:00:00.000','2017-09-04 22:00:00.000')
INSERT INTO #tmp values(4,'2017-09-04 20:54:00.000','2017-09-08 14:30:00.000')
INSERT INTO #tmp values(5,'2017-09-04 20:52:00.000','2017-09-13 11:50:00.000')
INSERT INTO #tmp values(6,'2017-09-05 13:32:00.000','2017-09-11 14:49:00.000')
INSERT INTO #tmp values(7,'2017-09-05 13:17:00.000','2017-09-12 21:00:00.000')
INSERT INTO #tmp values(8,'2017-09-05 23:11:00.000','2017-09-06 07:38:00.000')
INSERT INTO #tmp values(9,'2017-09-05 11:35:00.000','2017-09-14 16:12:00.000')
INSERT INTO #tmp values(10,'2017-09-05 14:05:00.000','2017-09-11 16:30:00.000')
DECLARE
@StartDate DATE = '09/01/2017'
,@EndDate DATE = '10/01/2017'
, @nHours INT = 9
;WITH d(OrderDate) AS
(
SELECT DATEADD(DAY, n-1, @StartDate)
FROM (SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
ROW_NUMBER() OVER (ORDER BY [object_id]) FROM sys.all_objects) AS x(n)
)
, CTE AS(
select OrderDate, t2.*
from #tmp t2
cross apply(select orderdate from d ) d
where StartDte >= @StartDate and EndDte <= @EndDate)
select OrderDate,
SUM(CASE WHEN OrderDate >= StartDte and OrderDate <= EndDte THEN 1 ELSE 0 END) [No. Beds Occupied at 00:00],
SUM(CASE WHEN StartDTE <= DateAdd(hour,@nHours,CAST(OrderDate as datetime)) and DateAdd(hour,@nHours,CAST(OrderDate as datetime)) <= EndDte THEN 1 ELSE 0 END) [No. Beds Occupied at 09:00]
from CTE
GROUP BY OrderDate
This should allow you to check for any hour of the day using the @nHours parameter if you so choose. If you only want to see records that actually fall within your date range then you can filter the cross apply on start and end dates.
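Neither query above treats a NULL end date as "still on the ward", which the question asked for. Here is a rough sketch of one way to fold that in. The table variable @Stays, the column names, and the fourth row (an open stay) are made up for illustration.

```sql
-- Sketch: beds occupied at 00:00 and 09:00 per day, counting open stays (NULL end).
DECLARE @StartDate date = '2017-09-01',
        @EndDate   date = '2017-09-05';

DECLARE @Stays table (StartDte datetime NOT NULL, EndDte datetime NULL);
INSERT INTO @Stays (StartDte, EndDte) VALUES
    ('2017-09-03T15:04:00', '2017-09-27T16:55:00'),
    ('2017-09-04T18:08:00', '2017-09-06T18:00:00'),
    ('2017-09-04T13:00:00', '2017-09-04T22:00:00'),
    ('2017-09-04T20:54:00', NULL);   -- hypothetical patient still on the ward

WITH d AS (
    -- one row per day in the range, as a datetime at midnight
    SELECT CAST(@StartDate AS datetime) AS Midnight
    UNION ALL
    SELECT DATEADD(day, 1, Midnight) FROM d WHERE Midnight < @EndDate
)
SELECT CAST(d.Midnight AS date) AS [Date],
       SUM(CASE WHEN s.StartDte <= d.Midnight
                 AND (s.EndDte >= d.Midnight OR s.EndDte IS NULL)
                THEN 1 ELSE 0 END) AS BedsAt0000,
       SUM(CASE WHEN s.StartDte <= c.NineAm
                 AND (s.EndDte >= c.NineAm OR s.EndDte IS NULL)
                THEN 1 ELSE 0 END) AS BedsAt0900
FROM d
CROSS APPLY (SELECT DATEADD(hour, 9, d.Midnight) AS NineAm) AS c
LEFT JOIN @Stays AS s
       ON s.StartDte <= c.NineAm                              -- started by the later census point
      AND (s.EndDte >= d.Midnight OR s.EndDte IS NULL)        -- not yet ended at the earlier one
GROUP BY d.Midnight
ORDER BY d.Midnight;
-- 1st to 3rd Sept: 0 / 0, 4th Sept: 1 / 1, 5th Sept: 3 / 3 for these four rows
```

The LEFT JOIN keeps days with no stays in the output, and the join condition only narrows the candidates; the CASE expressions still do the actual census test.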

Converting INT to DATE then using GETDATE on conversion?

I am trying to convert the results from an INT column to DATE so the GETDATE function will be compatible with this column. The date is currently in the format yyyymmdd
This is what I have so far based on what I could find but I am sure it is completely wrong
...AND (dbo.V_HEAD.LF_DATE CONVERT(DATE,(CONVERT(INT, LF_DATE)) >= GETDATE-28)
AND (dbo.V_HEAD.LF_DATE CONVERT(DATE,(CONVERT(INT, LF_DATE)) <= GETDATE)...
I want the results qualified on LF_DATE for the last 28 days too
The rest of the script runs correctly.
Where am I going wrong and how can I correct it?
Update
Following your comments, I've created some sample data to test my answer:
Create and populate sample data (Please save us this step in your future questions)
DECLARE @T as TABLE
(
Id int,
ActualDate Date,
LF_Date int
)
INSERT INTO @T (Id, ActualDate) VALUES
(10, DATEADD(DAY, -5, GETDATE())),
(9, DATEADD(DAY, -10, GETDATE())),
(8, DATEADD(DAY, -15, GETDATE())),
(7, DATEADD(DAY, -20, GETDATE())),
(6, DATEADD(DAY, -25, GETDATE())),
(5, DATEADD(DAY, -30, GETDATE())),
(4, DATEADD(DAY, -35, GETDATE())),
(3, DATEADD(DAY, -40, GETDATE())),
(2, DATEADD(DAY, -45, GETDATE())),
(1, DATEADD(DAY, -50, GETDATE()))
UPDATE @T
SET LF_Date = YEAR(ActualDate) * 10000 + MONTH(ActualDate) * 100 + DAY(ActualDate)
Test sample data:
SELECT *
FROM @T
Results:
Id ActualDate LF_Date
----------- ---------- -----------
10 2016-08-09 20160809
9 2016-08-04 20160804
8 2016-07-30 20160730
7 2016-07-25 20160725
6 2016-07-20 20160720
5 2016-07-15 20160715
4 2016-07-10 20160710
3 2016-07-05 20160705
2 2016-06-30 20160630
1 2016-06-25 20160625
As you can see, the sample table's LF_Date column is an int that keeps the date as yyyyMMdd, just like in the question.
The query:
DECLARE @DateAsInt int,
@Date date = GETDATE();
SELECT @DateAsInt = YEAR(@Date) * 10000 + MONTH(@Date) * 100 + DAY(@Date);
SELECT *
FROM @T
WHERE LF_DATE >= @DateAsInt - 28
AND LF_DATE <= @DateAsInt
Results:
Id ActualDate LF_Date
----------- ---------- -----------
10 2016-08-09 20160809
9 2016-08-04 20160804
Conclusion:
as far as the sample data goes, the answer is fine. You need to test your data to see what's stopping you from getting the results from the previous month, but I seriously doubt that it's my suggestion.
First version
Assuming your Sql server version is 2012 or higher, you can use some math and the DATEFROMPARTS built in function:
DECLARE @IntDate int = 20160322
SELECT DATEFROMPARTS (
(@IntDate - (@IntDate % 10000)) / 10000,
(@IntDate % 10000) / 100,
@IntDate % 100
) As [Date]
Results:
Date
2016-03-22
However, It will be simpler and probably have a better performance to convert the date to int:
DECLARE @Date date = '2016-03-22'
SELECT YEAR(@Date) * 10000 +
MONTH(@Date) * 100 +
DAY(@Date) As [Int]
Results:
Int
20160322
To put that in context of your question - calculate the int value of the current date before your query:
DECLARE @DateAsInt int,
@Date date = GETDATE();
SELECT @DateAsInt = YEAR(@Date) * 10000 + MONTH(@Date) * 100 + DAY(@Date);
And then, in your where clause you simply write this:
...
AND LF_DATE >= @DateAsInt - 28
AND LF_DATE <= @DateAsInt
...
In any case, you will be better off if you could change your table structure and replace that int column with a date column.
Read Aaron Bertrand's Bad habits to kick : choosing the wrong data type.
Perhaps this may help
Select DateAdd(DD,-28,cast(cast(20160809 as varchar(8)) as date))
Returns 2016-07-12
However, since your data is an int, I think it would be more efficient to convert the desired date range into an int rather than performing row level calculations
Declare @DateR1 int = Format(DateAdd(DD,-28,GetDate()),'yyyyMMdd')
Declare @DateR2 int = Format(GetDate(),'yyyyMMdd')
Select DateR1=@DateR1,DateR2=@DateR2
Returns
DateR1 DateR2
20160712 20160809
@Zohar Peled, I think I have cracked it! It is subtracting 28 as an integer, not as days.
The problem is 20160809 - 28 = 20160781, which is no good.
The desired results would be
SELECT *
FROM @T
WHERE LF_DATE >= @DateAsInt - 28 (DAYS)
AND LF_DATE <= @DateAsInt
Id ActualDate LF_Date
10 2016-08-09 20160809
9 2016-08-04 20160804
8 2016-07-30 20160730
7 2016-07-25 20160725
6 2016-07-20 20160720
5 2016-07-15 20160715
As 20160809 - 28 DAYS would include dates from 20160712
The way around this was to subtract 97 instead of 28.
This is not very clean, there must be a better way...
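One way around the magic number is to do the subtraction on a real date first and only then build the yyyymmdd integer for each end of the range. A small sketch, with the date fixed at 2016-08-09 so the numbers match the thread (in practice @Today would come from GETDATE()); the variable names are mine.

```sql
-- Sketch: subtract 28 days as a date, then convert both bounds to yyyymmdd ints.
DECLARE @Today date = '2016-08-09';            -- fixed for illustration; normally GETDATE()
DECLARE @From  date = DATEADD(day, -28, @Today);

DECLARE @FromInt int = YEAR(@From)  * 10000 + MONTH(@From)  * 100 + DAY(@From);
DECLARE @ToInt   int = YEAR(@Today) * 10000 + MONTH(@Today) * 100 + DAY(@Today);

SELECT @FromInt AS FromInt, @ToInt AS ToInt;   -- 20160712, 20160809

-- In the WHERE clause from the question this would then read:
--   ... AND LF_DATE >= @FromInt AND LF_DATE <= @ToInt
```

Because only the two bounds are converted, the int column itself is compared as-is and nothing has to be calculated per row.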

SQL Server grouping a timestamp by hour but keep as date format (don't want to pull hour out)

When I want to group a bunch of time stamps by day, by
CONVERT (datetime, CONVERT (varchar, dbo.MEASUREMENT_Battery.STAMP, 101))
it produces for me a "day" stamp that SQL Server still views as a date and can be sorted and used as such.
What I'm trying to figure out is if it's possible to do the same thing by hour. I tried
CAST(DATEPART(Month, STAMP) AS varchar) + '/' + CAST(DATEPART(Day, STAMP) AS varchar) + '/' + CAST(DATEPART(Year, STAMP) AS varchar) + ' ' + CAST(DATEPART(Hour, STAMP) AS varchar) + ':00:00.000'
and this "works" but SQL Server doesn't view this as a date anymore so I can't sort properly.
The end result I want is right though: ex: 9/9/2015 9:00:00.000
Do NOT convert to a string until you absolutely have to "present" the result.
CONVERT() to a character type, or FORMAT(), returns a string representation of the temporal value, and a string sorts as text, not as a date.
The following method returns a datetime value truncated to the hour without resorting to string manipulation, which also makes it fast.
select
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
, count(*)
from dbo.MEASUREMENT_Battery
group by
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE MEASUREMENT_Battery
([STAMP] datetime)
;
INSERT INTO MEASUREMENT_Battery
([STAMP])
VALUES
('2015-11-12 07:40:15'),
('2015-11-12 08:40:15'),
('2015-11-12 09:40:15'),
('2015-11-12 10:40:15'),
('2015-11-12 11:40:15'),
('2015-11-12 12:40:15'),
('2015-11-12 13:40:15'),
('2015-11-12 14:40:15')
;
NOTE: the output below for column [Stamp] is the default display
Results:
| | |
|----------------------------|---|
| November, 12 2015 07:00:00 | 1 |
| November, 12 2015 08:00:00 | 1 |
| November, 12 2015 09:00:00 | 1 |
| November, 12 2015 10:00:00 | 1 |
| November, 12 2015 11:00:00 | 1 |
| November, 12 2015 12:00:00 | 1 |
| November, 12 2015 13:00:00 | 1 |
| November, 12 2015 14:00:00 | 1 |
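In case the dateadd/datediff pattern looks like magic, this small sketch shows the two steps separately. The literal 0 is implicitly converted to the datetime 1900-01-01 00:00:00; the variable @Stamp and the column aliases are just for illustration.

```sql
-- Sketch: truncate a datetime by counting whole units since 1900-01-01 and adding them back.
DECLARE @Stamp datetime = '2015-09-09T09:41:27';

SELECT DATEDIFF(hour, 0, @Stamp)                    AS HoursSince1900,   -- whole hour boundaries crossed
       DATEADD(hour, DATEDIFF(hour, 0, @Stamp), 0)  AS TruncatedToHour,  -- 2015-09-09 09:00:00.000
       DATEADD(day,  DATEDIFF(day,  0, @Stamp), 0)  AS TruncatedToDay;   -- 2015-09-09 00:00:00.000
```

The same pattern works for other date parts by swapping hour for day, month and so on, and the result stays a datetime, so it sorts and groups as one.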
If you absolutely insist on displaying a date/time value in a particular way, then you may add the display format in the select clause (it is not needed in the group by clause!)
select
FORMAT(dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0) , 'MM/dd/yyyy HH')
, count(*)
from dbo.MEASUREMENT_Battery
group by
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
What happens is that when you use date style 101 (at the end of the second CONVERT), the value is converted to a mm/dd/yyyy string, so the time part is dropped and always comes back as 00:00:00.000, as documented here:
https://msdn.microsoft.com/en-us/library/ms187928.aspx
Now, from what I understand from your question is that you would like to include the hour as well and this can be done like this:
SELECT FORMAT(STAMP , 'MM/dd/yyyy HH') + ':00:00.000'
Note:
':00:00.000' is optional and is just for a nicer output.
This only works in SQL Server 2012 and later versions.
Testing with some test data, we see that we get the expected result:
-- Drop temp table if it exists
IF OBJECT_ID('tempdb..#T') IS NOT NULL DROP TABLE #T
-- Create temp table
CREATE TABLE #T ( myDate DATETIME )
-- Insert dummy values
INSERT INTO #T VALUES ( '2015-12-25 14:00:00.000' ) -- 14
INSERT INTO #T VALUES ( '2015-12-25 14:00:00.000' ) -- 14
INSERT INTO #T VALUES ( '2015-12-25 15:00:00.000' )
INSERT INTO #T VALUES ( '2015-12-25 16:00:00.000' )
INSERT INTO #T VALUES ( '2015-12-25 17:00:00.000' ) -- 17
INSERT INTO #T VALUES ( '2015-12-25 17:00:00.000' ) -- 17
-- Select query
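-- Group on the datetime truncated to the hour; DATEPART( hour, ... ) alone would merge the same hour from different days
SELECT COUNT( myDate ), FORMAT( DATEADD( hour, DATEDIFF( hour, 0, myDate ), 0 ) , 'MM/dd/yyyy HH') + ':00:00.000' FROM #T
GROUP BY DATEADD( hour, DATEDIFF( hour, 0, myDate ), 0 )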
Output:
2 12/25/2015 14:00:00.000
1 12/25/2015 15:00:00.000
1 12/25/2015 16:00:00.000
2 12/25/2015 17:00:00.000

Returning Distinct Dates

Morning
I am trying to return the distinct dates of an outcome by a unique identifier.
For example:
ID|Date
1 | 2011-10-01 23:57:59
1 | 2011-10-01 23:58:59
1 | 2010-10-01 23:59:59
2 | 2010-09-01 23:59:59
2 | 2010-09-01 23:58:29
3 | 2010-09-01 23:58:39
3 | 2010-10-01 23:59:14
3 | 2010-10-01 23:59:36
The times are not important just the dates. So for example on ID 1 I can't do a distinct on the ID as that would return only one of my dates. So I would want to return:
1|2011-10-01
1|2010-10-01
I Have tried the following query:
Drop Table #Temp
select Distinct DateAdd(dd, DateDiff(DD,0, Date),0) as DateOnly
,ID
Into #Temp
From Table
Select Distinct (Date)
,ID
From #Temp
I am getting the following results however:
ID|Date
1 | 2011-10-01 00:00:00
1 | 2011-10-01 00:00:00
1 | 2010-10-01 00:00:00
I'm new to SQL so apologies I may have made a glaring mistake. I have got so far by searching through the previously asked questions.
As always any help and pointers is greatly appreciated.
You can use the T-SQL convert function to extract the Date.
Try
CONVERT(char(10), GetDate(),126)
so, in your case, do
Drop Table #Temp
select Distinct CONVERT(char(10), DateAdd(dd, DateDiff(DD,0, Date),0), 126) as DateOnly
,ID
Into #Temp
From Table
Select Distinct DateOnly
,ID
From #Temp
Further information: Getting the Date Portion of a SQL Server Datetime field
hope this helps!
If you are using SQL Server 2008 or later, you can cast the DateTime column to the built-in Date type. Otherwise, to get rid of the time, cast only the day/month/year parts to VARCHAR and then convert back to datetime so that the time part is zeroed:
declare @dates table(id int, dt datetime)
INSERT INTO @dates VALUES(1, '2011-10-01 23:57:49')
INSERT INTO @dates VALUES(2, '2011-10-02 23:57:59')
INSERT INTO @dates VALUES(2, '2011-10-02 23:57:39')
SELECT stripped.id, stripped.dateOnly
FROM
(
-- this will return dates with zeroed time part 2011-10-01 00:00:00.000
SELECT id,
CONVERT(datetime,
CAST(YEAR(dt) as VARCHAR(4)) + '-' +
CAST(MONTH(dt) AS VARCHAR(2)) + '-' +
CAST(DAY(dt) AS VARCHAR(2))) as dateOnly
FROM @dates
) stripped
GROUP BY stripped.id, stripped.dateOnly
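For the SQL Server 2008+ route mentioned above, a minimal sketch using the Date type could look like this. The table variable and the rows are taken loosely from the question's example, and the alias DateOnly is my own.

```sql
-- Sketch: distinct id/date pairs by casting datetime to the date type (SQL Server 2008 or later).
DECLARE @dates table (id int, dt datetime);
INSERT INTO @dates (id, dt) VALUES
    (1, '2011-10-01T23:57:59'), (1, '2011-10-01T23:58:59'), (1, '2010-10-01T23:59:59'),
    (2, '2010-09-01T23:59:59'), (2, '2010-09-01T23:58:29');

SELECT DISTINCT id, CAST(dt AS date) AS DateOnly
FROM @dates
ORDER BY id, DateOnly;
-- (1, 2010-10-01), (1, 2011-10-01), (2, 2010-09-01)
```

The result column is a real date, so no string conversion is involved and it can be sorted or compared directly.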

SQL query for cumulative frequency of list of datetimes

I have a list of times in a database column (representing visits to a website).
I need to group them in intervals and then get a 'cumulative frequency' table of those dates.
For instance I might have:
9:01
9:04
9:11
9:13
9:22
9:24
9:28
and I want to convert that into
9:05 - 2
9:15 - 4
9:25 - 6
9:30 - 7
How can I do that? Can I even achieve this easily in SQL? I can quite easily do it in C#.
create table accu_times (time_val datetime not null, constraint pk_accu_times primary key (time_val));
go
insert into accu_times values ('9:01');
insert into accu_times values ('9:05');
insert into accu_times values ('9:11');
insert into accu_times values ('9:13');
insert into accu_times values ('9:22');
insert into accu_times values ('9:24');
insert into accu_times values ('9:28');
go
select rounded_time,
(
select count(*)
from accu_times as at2
where at2.time_val <= rt.rounded_time
) as accu_count
from (
select distinct
dateadd(minute, round((datepart(minute, at.time_val) + 2)*2, -1)/2,
dateadd(hour, datepart(hour, at.time_val), 0)
) as rounded_time
from accu_times as at
) as rt
go
drop table accu_times
Results in:
rounded_time accu_count
----------------------- -----------
1900-01-01 09:05:00.000 2
1900-01-01 09:15:00.000 4
1900-01-01 09:25:00.000 6
1900-01-01 09:30:00.000 7
I should point out that based on the stated "intent" of the problem, to do analysis on visitor traffic - I wrote this statement to summarize the counts in uniform groups.
To do otherwise (as in the "example" groups) would be comparing the counts during a 5 minute interval to counts in a 10 minute interval - which doesn't make sense.
You have to grok to the "intent" of the user requirement, not the literal "reading" of it. :-)
create table #myDates
(
myDate datetime
);
go
insert into #myDates values ('10/02/2008 09:01:23');
insert into #myDates values ('10/02/2008 09:03:23');
insert into #myDates values ('10/02/2008 09:05:23');
insert into #myDates values ('10/02/2008 09:07:23');
insert into #myDates values ('10/02/2008 09:11:23');
insert into #myDates values ('10/02/2008 09:14:23');
insert into #myDates values ('10/02/2008 09:19:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:26:23');
insert into #myDates values ('10/02/2008 09:27:23');
insert into #myDates values ('10/02/2008 09:29:23');
go
declare @interval int;
set @interval = 10;
select
convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108) timeGroup,
count(*)
from
#myDates
group by
convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108)
returns:
timeGroup
--------- -----------
09:10 4
09:20 3
09:30 8
ooh, way too complicated all of that stuff.
Normalise to seconds, divide by your bucket interval, truncate and remultiply:
select sec_to_time(floor(time_to_sec(d)/300)*300), count(*)
from d
group by sec_to_time(floor(time_to_sec(d)/300)*300)
Using Ron Savage's data, I get
+----------+----------+
| i | count(*) |
+----------+----------+
| 09:00:00 | 1 |
| 09:05:00 | 3 |
| 09:10:00 | 1 |
| 09:15:00 | 1 |
| 09:20:00 | 6 |
| 09:25:00 | 2 |
| 09:30:00 | 1 |
+----------+----------+
You may wish to use ceil() or round() instead of floor().
Update: for a table created with
create table d (
d datetime
);
Create a table periods describing the periods you wish to divide the day up into.
SELECT periods.name, count(time)
FROM periods, times
WHERE periods.start <= times.time
AND times.time < periods.end
GROUP BY periods.name
Create a table containing what intervals you want to be getting totals at then join the two tables together.
Such as:
time_entry.time_entry
-----------------------
2008-10-02 09:01:00.000
2008-10-02 09:04:00.000
2008-10-02 09:11:00.000
2008-10-02 09:13:00.000
2008-10-02 09:22:00.000
2008-10-02 09:24:00.000
2008-10-02 09:28:00.000
time_interval.time_end
-----------------------
2008-10-02 09:05:00.000
2008-10-02 09:15:00.000
2008-10-02 09:25:00.000
2008-10-02 09:30:00.000
SELECT
ti.time_end,
COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
ON te.time_entry < ti.time_end
GROUP BY ti.time_end;
time_end interval_total
----------------------- -------------
2008-10-02 09:05:00.000 2
2008-10-02 09:15:00.000 4
2008-10-02 09:25:00.000 6
2008-10-02 09:30:00.000 7
If instead of wanting cumulative totals you wanted totals within a range, then you add a time_start column to the time_interval table and change the query to
SELECT
ti.time_end,
COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
ON te.time_entry >= ti.time_start
AND te.time_entry < ti.time_end
GROUP BY ti.time_end;
This uses quite a few SQL tricks (SQL Server 2005):
CREATE TABLE [dbo].[stackoverflow_165571](
[visit] [datetime] NOT NULL
) ON [PRIMARY]
GO
;WITH buckets AS (
SELECT dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0) AS visit_bucket
,COUNT(*) AS visit_count
FROM stackoverflow_165571
GROUP BY dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0)
)
SELECT LEFT(CONVERT(varchar, l.visit_bucket, 8), 5) + ' - ' + CONVERT(varchar, SUM(r.visit_count))
FROM buckets l
LEFT JOIN buckets r
ON r.visit_bucket <= l.visit_bucket
GROUP BY l.visit_bucket
ORDER BY l.visit_bucket
Note that it puts all the times on the same day, and assumes they are in a datetime column. The only thing it doesn't do as your example does is strip the leading zeroes from the time representation.
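On SQL Server 2012 or later, a windowed SUM over the per-bucket counts gives the running total without a self-join. This is a different technique from the answers above, sketched here on the question's sample times. The table variable @visits, the 5-minute @interval and the column names are assumptions for illustration.

```sql
-- Sketch: cumulative visit counts per bucket using a window function (SQL Server 2012+).
DECLARE @visits table (visit datetime NOT NULL);
INSERT INTO @visits (visit) VALUES
    ('2008-10-02T09:01:00'), ('2008-10-02T09:04:00'), ('2008-10-02T09:11:00'),
    ('2008-10-02T09:13:00'), ('2008-10-02T09:22:00'), ('2008-10-02T09:24:00'),
    ('2008-10-02T09:28:00');

DECLARE @interval int = 5;   -- bucket width in minutes

SELECT b.bucket_end,
       SUM(COUNT(*)) OVER (ORDER BY b.bucket_end
                           ROWS UNBOUNDED PRECEDING) AS cumulative_visits
FROM @visits AS v
CROSS APPLY (
    -- end of the bucket containing the visit: round down to the interval, then add one interval
    SELECT DATEADD(minute,
                   (DATEDIFF(minute, 0, v.visit) / @interval + 1) * @interval,
                   0) AS bucket_end
) AS b
GROUP BY b.bucket_end
ORDER BY b.bucket_end;
-- 09:05 -> 2, 09:15 -> 4, 09:25 -> 6, 09:30 -> 7
```

Only buckets that contain at least one visit appear in the output, and a visit exactly on a boundary (say 09:05:00) is counted in the following bucket, so adjust the rounding if you want boundaries to be inclusive.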