Returning Distinct Dates - sql

Morning
I am trying to return the distinct dates of an outcome by a unique identifier.
For example:
ID|Date
1 | 2011-10-01 23:57:59
1 | 2011-10-01 23:58:59
1 | 2010-10-01 23:59:59
2 | 2010-09-01 23:59:59
2 | 2010-09-01 23:58:29
3 | 2010-09-01 23:58:39
3 | 2010-10-01 23:59:14
3 | 2010-10-01 23:59:36
The times are not important, just the dates. For example, for ID 1 I can't do a DISTINCT on the ID, as that would return only one of my dates. So I would want to return:
1|2011-10-01
1|2010-10-01
I have tried the following query:
Drop Table #Temp
select Distinct DateAdd(dd, DateDiff(DD,0, Date),0) as DateOnly
,ID
Into #Temp
From Table
Select Distinct (Date)
,ID
From #Temp
I am getting the following results however:
ID|Date
1 | 2011-10-01 00:00:00
1 | 2011-10-01 00:00:00
1 | 2010-10-01 00:00:00
I'm new to SQL, so apologies if I have made a glaring mistake. I got this far by searching through previously asked questions.
As always, any help and pointers are greatly appreciated.

You can use the T-SQL convert function to extract the Date.
Try
CONVERT(char(10), GetDate(),126)
so, in your case, do
Drop Table #Temp
select Distinct CONVERT(char(10), DateAdd(dd, DateDiff(DD,0, Date),0), 126) as DateOnly
,ID
Into #Temp
From Table
Select Distinct DateOnly
,ID
From #Temp
Further information: Getting the Date Portion of a SQL Server Datetime field
Hope this helps!

If you are using SQL Server 2008 or later, you can cast the DateTime column to the built-in Date type. Otherwise, to get rid of the time, cast only the year/month/day parts to VARCHAR() and then convert back to datetime, so that the time part is zeroed:
declare @dates table(id int, dt datetime)
INSERT INTO @dates VALUES(1, '2011-10-01 23:57:49')
INSERT INTO @dates VALUES(2, '2011-10-02 23:57:59')
INSERT INTO @dates VALUES(2, '2011-10-02 23:57:39')
SELECT stripped.id, stripped.dateOnly
FROM
(
-- this will return dates with zeroed time part 2011-10-01 00:00:00.000
SELECT id,
CONVERT(datetime,
CAST(YEAR(dt) as VARCHAR(4)) + '-' +
CAST(MONTH(dt) AS VARCHAR(2)) + '-' +
CAST(DAY(dt) AS VARCHAR(2))) as dateOnly
FROM @dates
) stripped
GROUP BY stripped.id, stripped.dateOnly
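For SQL Server 2008 and later, the cast to the built-in date type mentioned above might look like the following minimal sketch. The table variable @outcomes and its contents are assumptions that mirror the rows for ID 1 in the question.

```sql
-- Hypothetical sample data mirroring ID 1 from the question.
DECLARE @outcomes TABLE (ID int, [Date] datetime);

INSERT INTO @outcomes (ID, [Date]) VALUES
    (1, '2011-10-01T23:57:59'),
    (1, '2011-10-01T23:58:59'),
    (1, '2010-10-01T23:59:59');

-- CAST(... AS date) drops the time part, so DISTINCT collapses rows
-- that fall on the same day for the same ID.
SELECT DISTINCT ID, CAST([Date] AS date) AS DateOnly
FROM @outcomes
ORDER BY ID, DateOnly DESC;
```

With this data the query should return two rows for ID 1: 2011-10-01 and 2010-10-01.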

Related

Grouping sql rows by weeks

I have a table
DATE Val
01-01-2020 1
01-02-2020 3
01-05-2020 2
01-07-2020 8
01-13-2020 3
...
I want to summarize these values by the following Sunday. For example, in the above example:
1-05-2020, 1-12-2020, and 1-19-2020 are Sundays, so I want to summarize these by those dates.
The final result should be something like
DATE SUM
1-05-2020 6 //(01-01-2020 + 01-02-2020 + 01-05-2020)
1-12-2020 8
1-19-2020 3
I wasn't certain if the best place to start would be to create a temp calendar table, and then try to join backwards based on that? Or if there was an easier way involving DATEDIFF. Any help would be appreciated! Thanks!
Here's a solution that uses DATEADD and DATEPART to calculate the closest following Sunday.
It includes a correction for different settings of @@datefirst.
(The DATEPART weekday values differ depending on the DATEFIRST setting.)
Sample data:
create table #TestTable
(
Id int identity(1,1) primary key,
[Date] date,
Val int
);
insert into #TestTable
([Date], Val)
VALUES
('2020-01-01', 1)
, ('2020-01-02', 3)
, ('2020-01-05', 2)
, ('2020-01-07', 8)
, ('2020-01-13', 3)
;
Query:
WITH CTE_DATA AS
(
SELECT [Date], Val
, DATEADD(day,
((7-(@@datefirst+datepart(weekday, [Date])-1)%7)%7),
[Date]) AS Sunday
FROM #TestTable
)
SELECT
Sunday AS [Date],
SUM(Val) AS [Sum]
FROM CTE_DATA
GROUP BY Sunday
ORDER BY Sunday;
Date | Sum
:--------- | --:
2020-01-05 | 6
2020-01-12 | 8
2020-01-19 | 3
Extra:
Apparently the trick of adding the difference of weeks from day 0 to day 6 also works independently from the DATEFIRST setting.
So this query will return the same result for the sample data.
WITH CTE_DATA AS
(
SELECT [Date], Val
, CAST(DATEADD(week, DATEDIFF(week, 0, DATEADD(day, -1, [Date])), 6) AS DATE) AS Sunday
FROM #TestTable
)
SELECT
Sunday AS [Date],
SUM(Val) AS [Sum]
FROM CTE_DATA
GROUP BY Sunday
ORDER BY Sunday;
The subtraction of 1 day ensures that a date that is already a Sunday isn't moved to the next Sunday.
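To see why the first query adds @@DATEFIRST before taking the modulus, the following sketch compares the raw DATEPART(weekday, ...) value with the normalized one under two DATEFIRST settings. The variable @d and the column aliases are assumptions for illustration; 2020-01-05 is one of the Sundays from the question.

```sql
DECLARE @d date = '20200105';  -- a Sunday

SET DATEFIRST 1;  -- Monday counts as day 1
SELECT DATEPART(weekday, @d) AS RawWeekday,                               -- 7
       (@@DATEFIRST + DATEPART(weekday, @d) - 1) % 7 AS DaysSinceSunday;  -- 0

SET DATEFIRST 7;  -- Sunday counts as day 1 (the us_english default)
SELECT DATEPART(weekday, @d) AS RawWeekday,                               -- 1
       (@@DATEFIRST + DATEPART(weekday, @d) - 1) % 7 AS DaysSinceSunday;  -- 0
```

The raw weekday changes with the setting, while the normalized value stays at 0 for a Sunday. The query then uses (7 - DaysSinceSunday) % 7 as the number of days to add to reach the following Sunday. Note that SET DATEFIRST changes the session setting, so this batch leaves it at 7.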
Here is a way to do it:
Note: 1-13-2020 won't show because it's not a Sunday.
with cte as
(
select cast('01-01-2020'as Date) as Date, 1 as Val
union select '01-02-2020' , 3
union select '01-05-2020' , 2
union select '01-07-2020' , 8
)
select Date, max(dateadd(dd,number,Date)), sum(distinct Val) as SUM
from master..spt_values a inner join cte on Date <= dateadd(dd,number,Date)
where type = 'p'
and year(dateadd(dd,number,Date))=year(Date)
and DATEPART(dw,dateadd(dd,number,Date)) = 7
group by Date
Output:
Date (No column name) SUM
2020-01-01 2020-12-26 1
2020-01-02 2020-12-26 3
2020-01-05 2020-12-26 2
2020-01-07 2020-12-26 8
Here is a simple solution. Put your values into a table variable and view the results from that table:
DECLARE @dates TABLE
(
mDATE DATE,
Val INT,
Sunday DATE
)
INSERT INTO @dates (mDATE,Val) VALUES
('01-01-2020',1),('01-02-2020',3),('01-05-2020',2),('01-07-2020',8),('01-13-2020',3)
UPDATE @dates
SET Sunday = dateadd(week, datediff(week, 0, mDATE), 6)
SELECT Sunday,SUM(Val) AS Val FROM @dates
GROUP BY Sunday
OUTPUT:
Sunday Val
2020-01-05 4
2020-01-12 10
2020-01-19 3

SQL Server 2012: GROUP BY seconds, minutes, hours

I am trying to group records in SQL Server 2012 by DateTime. I found an example on Stack Overflow that does partially what I want, but the problem is that it does not group correctly when the interval exceeds the range of the date part. If, for example, I group on minutes in blocks of 30 minutes, the result is correct (see query and result below). But if I group on blocks of 120 minutes, I get the exact same result. It keeps grouping at its maximum of 60 minutes in an hour (result below).
The problem is that the grouping cannot carry over into its parent DateTime element (seconds to minutes, minutes to hours, ..., even seconds to hours). That is logical, since I only look at the minutes, but I would like it to span hours as well. Maybe with DATEADD(), but I haven't managed to get it working.
Any ideas?
A (small) example to show what I mean:
Query:
DECLARE @TimeInterval int, @StartTime DateTime, @EndTime Datetime
SET @TimeInterval = 30
SET @StartTime='2015-01-01T08:00:00Z'
SET @EndTime = '2015-01-05T10:00:00Z'
declare @T table
(
Value datetime
);
insert into @T values ('2015-01-01T08:00:00');
insert into @T values ('2015-01-01T08:03:00');
insert into @T values ('2015-01-01T08:06:00');
insert into @T values ('2015-01-01T08:14:00');
insert into @T values ('2015-01-01T09:06:00');
insert into @T values ('2015-01-01T09:07:00');
insert into @T values ('2015-01-01T09:08:00');
insert into @T values ('2015-01-01T11:09:00');
insert into @T values ('2015-01-01T12:10:00');
insert into @T values ('2015-01-01T13:11:00');
insert into @T values ('2015-01-02T08:08:00');
insert into @T values ('2015-01-02T08:09:00');
insert into @T values ('2015-01-03T08:10:00');
insert into @T values ('2015-01-04T08:11:00');
SELECT DATEADD(MINUTE, @TimeInterval, Convert(Datetime,CONCAT(DATEPART(YEAR, Value),'-', DATEPART(MONTH, Value),
'-', DATEPART(DAY, Value),' ', DATEPART(HOUR, Value),':', ((DATEPART(MINUTE, Value) / @TimeInterval) * @TimeInterval),':00'),120)) as Period,
ISNULL(COUNT(*), 0) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY Convert(Datetime,CONCAT(DATEPART(YEAR, Value),'-', DATEPART(MONTH, Value), '-', DATEPART(DAY, Value),' ',
DATEPART(HOUR, Value),':',((DATEPART(MINUTE, Value) / @TimeInterval) * @TimeInterval),':00'),120)
ORDER BY Period
Result for 30 min
2015-01-01 08:30:00.000 | 4
2015-01-01 09:30:00.000 | 3
2015-01-01 11:30:00.000 | 1
2015-01-01 12:30:00.000 | 1
2015-01-01 13:30:00.000 | 1
2015-01-02 08:30:00.000 | 2
2015-01-03 08:30:00.000 | 1
2015-01-04 08:30:00.000 | 1
Result for 60 min
2015-01-01 08:30:00.000 | 4
2015-01-01 09:30:00.000 | 3
2015-01-01 11:30:00.000 | 1
2015-01-01 12:30:00.000 | 1
2015-01-01 13:30:00.000 | 1
2015-01-02 08:30:00.000 | 2
2015-01-03 08:30:00.000 | 1
2015-01-04 08:30:00.000 | 1
Thanks in advance!
You don't want datepart() for this purpose. You want a minutes count. One way is to use datediff():
SELECT datediff(minute, @StartTime, value) / @TimeInterval as minutes,
COUNT(*) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY datediff(minute, @StartTime, value) / @TimeInterval
ORDER BY minutes ;
SQL Server does integer division, so you don't have to worry about remainders in this case. Also, COUNT(*) cannot return NULL, so neither ISNULL() nor COALESCE() is appropriate.
Or, to get a date/time value:
SELECT dateadd(minute,
(datediff(minute, @StartTime, value) / @TimeInterval) * @TimeInterval,
@StartTime) as period,
COUNT(*) AS NumberOfVisitors
FROM @T
WHERE Value >= @StartTime AND Value < @EndTime
GROUP BY dateadd(minute,
(datediff(minute, @StartTime, value) / @TimeInterval) * @TimeInterval,
@StartTime)
ORDER BY period ;
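As a quick check of the integer-division idea, the sketch below computes the bucket index and the bucket start for three of the question's timestamps with a 120-minute interval. The derived table v and the column aliases are assumptions for illustration. Multiplying the index back by the interval and adding it to @StartTime as minutes yields a datetime for each bucket.

```sql
DECLARE @TimeInterval int = 120;
DECLARE @StartTime datetime = '2015-01-01T08:00:00';

SELECT v.Value,
       -- Integer division: whole intervals elapsed since @StartTime.
       DATEDIFF(minute, @StartTime, v.Value) / @TimeInterval AS BucketIndex,
       -- Multiply back to get the start of the bucket as a datetime.
       DATEADD(minute,
               (DATEDIFF(minute, @StartTime, v.Value) / @TimeInterval) * @TimeInterval,
               @StartTime) AS BucketStart
FROM (VALUES (CAST('2015-01-01T08:14:00' AS datetime)),
             (CAST('2015-01-01T09:06:00' AS datetime)),
             (CAST('2015-01-01T11:09:00' AS datetime))) AS v(Value);
```

Here 08:14 and 09:06 both land in bucket 0 (starting 08:00), while 11:09 lands in bucket 1 (starting 10:00), so the grouping crosses the hour boundary as the question asks.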

SQL Server grouping a timestamp by hour but keep as date format (don't want to pull hour out)

When I want to group a bunch of time stamps by day, by
CONVERT (datetime, CONVERT (varchar, dbo.MEASUREMENT_Battery.STAMP, 101))
it produces for me a "day" stamp that SQL Server still views as a date and can be sorted and used as such.
What I'm trying to figure out is if it's possible to do the same thing by hour. I tried
CAST(DATEPART(Month, STAMP) AS varchar) + '/' + CAST(DATEPART(Day, STAMP) AS varchar) + '/' + CAST(DATEPART(Year, STAMP) AS varchar) + ' ' + CAST(DATEPART(Hour, STAMP) AS varchar) + ':00:00.000'
and this "works" but SQL Server doesn't view this as a date anymore so I can't sort properly.
The end result I want is right though: ex: 9/9/2015 9:00:00.000
Do NOT convert to a string until you absolutely have to "present" the result.
CONVERT() and FORMAT() return string representations of temporal information.
The following method returns a datetime value truncated to the hour without resorting to string manipulation (and is therefore fast).
select
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
, count(*)
from dbo.MEASUREMENT_Battery
group by
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
MS SQL Server 2014 Schema Setup:
CREATE TABLE MEASUREMENT_Battery
([STAMP] datetime)
;
INSERT INTO MEASUREMENT_Battery
([STAMP])
VALUES
('2015-11-12 07:40:15'),
('2015-11-12 08:40:15'),
('2015-11-12 09:40:15'),
('2015-11-12 10:40:15'),
('2015-11-12 11:40:15'),
('2015-11-12 12:40:15'),
('2015-11-12 13:40:15'),
('2015-11-12 14:40:15')
;
NOTE: the output below for column [Stamp] is the default display
Results:
| | |
|----------------------------|---|
| November, 12 2015 07:00:00 | 1 |
| November, 12 2015 08:00:00 | 1 |
| November, 12 2015 09:00:00 | 1 |
| November, 12 2015 10:00:00 | 1 |
| November, 12 2015 11:00:00 | 1 |
| November, 12 2015 12:00:00 | 1 |
| November, 12 2015 13:00:00 | 1 |
| November, 12 2015 14:00:00 | 1 |
If you absolutely insist on displaying a date/time value in a particular way, you may add the display format in the SELECT clause (it is not needed in the GROUP BY clause):
select
FORMAT(dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0) , 'MM/dd/yyyy HH')
, count(*)
from dbo.MEASUREMENT_Battery
group by
dateadd(hour, datediff(hour,0, dbo.MEASUREMENT_Battery.STAMP ), 0)
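The same DATEADD/DATEDIFF pattern should work for other units as well. A minimal sketch, assuming a hypothetical variable @stamp in place of the STAMP column:

```sql
DECLARE @stamp datetime = '2015-09-09T09:41:27';

-- Count whole units since day 0 (1900-01-01), then add them back to day 0.
SELECT DATEADD(day,    DATEDIFF(day,    0, @stamp), 0) AS TruncatedToDay,     -- 2015-09-09 00:00:00.000
       DATEADD(hour,   DATEDIFF(hour,   0, @stamp), 0) AS TruncatedToHour,    -- 2015-09-09 09:00:00.000
       DATEADD(minute, DATEDIFF(minute, 0, @stamp), 0) AS TruncatedToMinute;  -- 2015-09-09 09:41:00.000
```

DATEDIFF returns an int, so using second with base date 0 overflows for present-day values; a later base date would be needed for that unit.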
What happens is that when you use DateTime style 101 (at the end of the second CONVERT), the date is converted to mm/dd/yyyy and the time is always 00:00:00.000, as stated here:
https://msdn.microsoft.com/en-us/library/ms187928.aspx
Now, from what I understand of your question, you would like to include the hour as well, which can be done like this:
SELECT FORMAT(STAMP , 'MM/dd/yyyy HH') + ':00:00.000'
Note:
':00:00.000' is optional and is just for a nicer output.
This only works in SQL Server 2012 and later versions.
Testing with some test data, we see that we get the expected result:
-- Drop temp table if it exists
IF OBJECT_ID('tempdb..#T') IS NOT NULL DROP TABLE #T
-- Create temp table
CREATE TABLE #T ( myDate DATETIME )
-- Insert dummy values
INSERT INTO #T VALUES ( '2015-12-25 14:00:00.000' ) -- 14
INSERT INTO #T VALUES ( '2015-12-25 14:00:00.000' ) -- 14
INSERT INTO #T VALUES ( '2015-12-25 15:00:00.000' )
INSERT INTO #T VALUES ( '2015-12-25 16:00:00.000' )
INSERT INTO #T VALUES ( '2015-12-25 17:00:00.000' ) -- 17
INSERT INTO #T VALUES ( '2015-12-25 17:00:00.000' ) -- 17
-- Select query
SELECT COUNT( myDate ), MAX( FORMAT( myDate , 'MM/dd/yyyy HH') + ':00:00.000' ) FROM #T
GROUP BY DATEPART( hour, myDate )
Output:
2 12/25/2015 14:00:00.000
1 12/25/2015 15:00:00.000
1 12/25/2015 16:00:00.000
2 12/25/2015 17:00:00.000

Sql Query create date range with single date column

I have a table containing a single date column, say special date (stored as yyyymmdd).
How do I create date ranges among a small subset of rows?
For example, the table contains a date column with the following values: 01-jan-2010, 01-feb-2010, 01-mar-2010
I need
01-jan-2010 - 01-feb-2010
01-feb-2010 - 01-mar-2010
....
....
Please help.
You can try something like
DECLARE @Table TABLE(
DateVal DATETIME
)
INSERT INTO @Table SELECT '01 Jan 2010'
INSERT INTO @Table SELECT '01 Feb 2010'
INSERT INTO @Table SELECT '01 Mar 2010'
;WITH DateVals AS (
SELECT *,
ROW_NUMBER() OVER(ORDER BY DateVal) RowID
FROM @Table
)
SELECT s.DateVal StartDate,
e.DateVal EndDate
FROM DateVals s INNER JOIN
DateVals e ON s.RowID + 1 = e.RowID
Output
StartDate EndDate
2010-01-01 00:00:00.000 2010-02-01 00:00:00.000
2010-02-01 00:00:00.000 2010-03-01 00:00:00.000
You can avoid the CTE by using
SELECT s.DateVal StartDate,
MIN(e.DateVal) EndDate
FROM @Table s LEFT JOIN
@Table e ON s.DateVal < e.DateVal
GROUP BY s.DateVal
HAVING MIN(e.DateVal) IS NOT NULL
But I do not see why you wish to do so.
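On SQL Server 2012 or later, another option (not part of the answer above) is the LEAD window function, which reads the next row's value without a self-join. A minimal sketch, reusing the three sample dates; the derived-table alias pairs is an assumption:

```sql
DECLARE @Table TABLE (DateVal datetime);
INSERT INTO @Table (DateVal) VALUES ('20100101'), ('20100201'), ('20100301');

SELECT StartDate, EndDate
FROM (
    -- LEAD returns the DateVal of the following row in DateVal order,
    -- or NULL for the last row.
    SELECT DateVal AS StartDate,
           LEAD(DateVal) OVER (ORDER BY DateVal) AS EndDate
    FROM @Table
) AS pairs
WHERE EndDate IS NOT NULL  -- drop the last date, which has no successor
ORDER BY StartDate;
```

For the sample data this should give the same two ranges as the queries above: January to February and February to March.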

SQL query for cumulative frequency of list of datetimes

I have a list of times in a database column (representing visits to a website).
I need to group them in intervals and then get a 'cumulative frequency' table of those dates.
For instance I might have:
9:01
9:04
9:11
9:13
9:22
9:24
9:28
and I want to convert that into
9:05 - 2
9:15 - 4
9:25 - 6
9:30 - 7
How can I do that? Can I even easily achieve this in SQL? I can quite easily do it in C#.
create table accu_times (time_val datetime not null, constraint pk_accu_times primary key (time_val));
go
insert into accu_times values ('9:01');
insert into accu_times values ('9:05');
insert into accu_times values ('9:11');
insert into accu_times values ('9:13');
insert into accu_times values ('9:22');
insert into accu_times values ('9:24');
insert into accu_times values ('9:28');
go
select rounded_time,
(
select count(*)
from accu_times as at2
where at2.time_val <= rt.rounded_time
) as accu_count
from (
select distinct
dateadd(minute, round((datepart(minute, at.time_val) + 2)*2, -1)/2,
dateadd(hour, datepart(hour, at.time_val), 0)
) as rounded_time
from accu_times as at
) as rt
go
drop table accu_times
Results in:
rounded_time accu_count
----------------------- -----------
1900-01-01 09:05:00.000 2
1900-01-01 09:15:00.000 4
1900-01-01 09:25:00.000 6
1900-01-01 09:30:00.000 7
I should point out that based on the stated "intent" of the problem, to do analysis on visitor traffic - I wrote this statement to summarize the counts in uniform groups.
To do otherwise (as in the "example" groups) would be comparing the counts during a 5 minute interval to counts in a 10 minute interval - which doesn't make sense.
You have to grok to the "intent" of the user requirement, not the literal "reading" of it. :-)
create table #myDates
(
myDate datetime
);
go
insert into #myDates values ('10/02/2008 09:01:23');
insert into #myDates values ('10/02/2008 09:03:23');
insert into #myDates values ('10/02/2008 09:05:23');
insert into #myDates values ('10/02/2008 09:07:23');
insert into #myDates values ('10/02/2008 09:11:23');
insert into #myDates values ('10/02/2008 09:14:23');
insert into #myDates values ('10/02/2008 09:19:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:21:23');
insert into #myDates values ('10/02/2008 09:26:23');
insert into #myDates values ('10/02/2008 09:27:23');
insert into #myDates values ('10/02/2008 09:29:23');
go
declare @interval int;
set @interval = 10;
select
convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108) timeGroup,
count(*)
from
#myDates
group by
convert(varchar(5), dateadd(minute,@interval - datepart(minute, myDate) % @interval, myDate), 108)
returns:
timeGroup
--------- -----------
09:10 4
09:20 3
09:30 8
ooh, way too complicated all of that stuff.
Normalise to seconds, divide by your bucket interval, truncate and remultiply:
select sec_to_time(floor(time_to_sec(d)/300)*300) as i, count(*)
from d
group by sec_to_time(floor(time_to_sec(d)/300)*300)
Using Ron Savage's data, I get
+----------+----------+
| i | count(*) |
+----------+----------+
| 09:00:00 | 1 |
| 09:05:00 | 3 |
| 09:10:00 | 1 |
| 09:15:00 | 1 |
| 09:20:00 | 6 |
| 09:25:00 | 2 |
| 09:30:00 | 1 |
+----------+----------+
You may wish to use ceil() or round() instead of floor().
Update: for a table created with
create table d (
d datetime
);
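The query above uses MySQL functions. As a rough T-SQL analogue of the same normalise, divide, truncate and remultiply steps (not part of the answer), seconds since midnight can be computed with DATEDIFF and added back with DATEADD. The variables @t and @midnight are hypothetical, and the date type assumes SQL Server 2008 or later:

```sql
DECLARE @t datetime = '2008-10-02T09:07:23';
DECLARE @midnight datetime = CAST(CAST(@t AS date) AS datetime);

-- Seconds since midnight, floored to a multiple of 300 (5 minutes)
-- by integer division, then added back to midnight.
SELECT DATEADD(second,
               (DATEDIFF(second, @midnight, @t) / 300) * 300,
               @midnight) AS BucketStart;  -- 2008-10-02 09:05:00.000
```

Measuring from midnight keeps the second count below 86,400, which avoids the int overflow that DATEDIFF in seconds hits over long spans.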
Create a table periods describing the periods you wish to divide the day up into.
SELECT periods.name, count(time)
FROM periods, times
WHERE periods.start <= times.time
AND times.time < periods.end
GROUP BY periods.name
Create a table containing the intervals at which you want totals, then join the two tables together.
Such as:
time_entry.time_entry
-----------------------
2008-10-02 09:01:00.000
2008-10-02 09:04:00.000
2008-10-02 09:11:00.000
2008-10-02 09:13:00.000
2008-10-02 09:22:00.000
2008-10-02 09:24:00.000
2008-10-02 09:28:00.000
time_interval.time_end
-----------------------
2008-10-02 09:05:00.000
2008-10-02 09:15:00.000
2008-10-02 09:25:00.000
2008-10-02 09:30:00.000
SELECT
ti.time_end,
COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
ON te.time_entry < ti.time_end
GROUP BY ti.time_end;
time_end interval_total
----------------------- -------------
2008-10-02 09:05:00.000 2
2008-10-02 09:15:00.000 4
2008-10-02 09:25:00.000 6
2008-10-02 09:30:00.000 7
If instead of wanting cumulative totals you wanted totals within a range, then you add a time_start column to the time_interval table and change the query to
SELECT
ti.time_end,
COUNT(*) AS 'interval_total'
FROM time_interval ti
INNER JOIN time_entry te
ON te.time_entry >= ti.time_start
AND te.time_entry < ti.time_end
GROUP BY ti.time_end;
This uses quite a few SQL tricks (SQL Server 2005):
CREATE TABLE [dbo].[stackoverflow_165571](
[visit] [datetime] NOT NULL
) ON [PRIMARY]
GO
;WITH buckets AS (
SELECT dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0) AS visit_bucket
,COUNT(*) AS visit_count
FROM stackoverflow_165571
GROUP BY dateadd(mi, (1 + datediff(mi, 0, visit - 1 - dateadd(dd, 0, datediff(dd, 0, visit))) / 5) * 5, 0)
)
SELECT LEFT(CONVERT(varchar, l.visit_bucket, 8), 5) + ' - ' + CONVERT(varchar, SUM(r.visit_count))
FROM buckets l
LEFT JOIN buckets r
ON r.visit_bucket <= l.visit_bucket
GROUP BY l.visit_bucket
ORDER BY l.visit_bucket
Note that it puts all the times on the same day, and assumes they are in a datetime column. The only thing it doesn't do as your example does is strip the leading zeroes from the time representation.
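On SQL Server 2012 or later, the running total could also be sketched with a windowed SUM instead of the self-join. This is an alternative under stated assumptions, not the method above: the table variable @visits, the uniform 5-minute buckets and the column names are all hypothetical, and the data is the question's seven times placed on an arbitrary day.

```sql
DECLARE @visits TABLE (visit datetime);
INSERT INTO @visits (visit) VALUES
    ('2008-10-02T09:01:00'), ('2008-10-02T09:04:00'), ('2008-10-02T09:11:00'),
    ('2008-10-02T09:13:00'), ('2008-10-02T09:22:00'), ('2008-10-02T09:24:00'),
    ('2008-10-02T09:28:00');

WITH buckets AS (
    -- Floor each visit to a 5-minute boundary, then label it with the bucket end.
    SELECT DATEADD(minute, (DATEDIFF(minute, 0, visit) / 5 + 1) * 5, 0) AS bucket_end,
           COUNT(*) AS visit_count
    FROM @visits
    GROUP BY DATEADD(minute, (DATEDIFF(minute, 0, visit) / 5 + 1) * 5, 0)
)
SELECT bucket_end,
       -- Running total of the per-bucket counts in bucket order.
       SUM(visit_count) OVER (ORDER BY bucket_end
                              ROWS UNBOUNDED PRECEDING) AS cumulative_count
FROM buckets
ORDER BY bucket_end;
```

For this data the cumulative counts should come out as 2, 4, 6 and 7 for the buckets ending 09:05, 09:15, 09:25 and 09:30. Buckets with no visits (09:10, 09:20) do not appear, and a visit exactly on a boundary is counted in the following bucket.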