SQL Server / T-SQL: Selecting a specific interval (group by)

I want to write a select that aggregates over data (which has a DATETIME column as ID) with ANY interval theoretically possible (like 1hr, 1hr and 22seconds, 1year and 3minutes, etc. ).
This select should be able to aggregate by 1hr, 12min, 14seconds and should return 3 rows
SELECT DATEPART(YEAR,id) as year,
DATEPART(MONTH,id) as month,
DATEPART(DAY,id) as day,
DATEPART(HOUR,id) as hour,
DATEPART(MINUTE,id) as minute,
AVG([Open]),
AVG([Close]),
AVG([Min]),
AVG([Max])
FROM QuoteHistory
where id between '2000-02-06 17:00:00.000' and '2000-02-06 20:36:42.000'
GROUP BY
DATEPART(YEAR,id),
DATEPART(MONTH,id),
DATEPART(DAY,id),
DATEPART(HOUR,id),
DATEPART(MINUTE,id)
ORDER BY 1,2,3,4,5;
I am kind of stuck here and can't get my head around this problem. For "simple intervals" like "30 minutes" I could just add a modulo
DATEPART(MINUTE,id)%2
but when the interval "touches" more than 1 part of the date, I'm stuck.
Any help appreciated, thx!

Assuming some parameters here:
;WITH Date_Ranges AS (
SELECT
@min_datetime AS start_datetime,
DATEADD(SECOND, @seconds,
DATEADD(MINUTE, @minutes,
DATEADD(HOUR, @hours,
DATEADD(DAY, @days,
DATEADD(WEEK, @weeks,
DATEADD(MONTH, @months,
DATEADD(YEAR, @years, @min_datetime))))))) AS end_datetime
UNION ALL
SELECT
DATEADD(SECOND, 1, end_datetime),
DATEADD(SECOND, @seconds,
DATEADD(MINUTE, @minutes,
DATEADD(HOUR, @hours,
DATEADD(DAY, @days,
DATEADD(WEEK, @weeks,
DATEADD(MONTH, @months,
DATEADD(YEAR, @years, end_datetime)))))))
FROM
Date_Ranges
WHERE
DATEADD(SECOND, 1, end_datetime) < @max_datetime
)
SELECT
DR.start_datetime,
DR.end_datetime,
AVG([Open]),
AVG([Close]),
AVG([Min]),
AVG([Max])
FROM
Date_Ranges DR
LEFT OUTER JOIN QuoteHistory QH ON
QH.id BETWEEN DR.start_datetime AND DR.end_datetime
GROUP BY
DR.start_datetime,
DR.end_datetime
ORDER BY
DR.start_datetime,
DR.end_datetime
You might need to fiddle with how to handle the edge cases (that 1 second range between date ranges could be a problem depending on your data). This should hopefully point you in the right direction though.
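For a fixed-length interval (no months or years, whose lengths vary), the same grouping can also be expressed as plain arithmetic: the bucket number is the elapsed time since the range start, integer-divided by the interval length. The following is only a sketch of that arithmetic in Python, not T-SQL; the function name `bucket_averages` and the sample rows are made up for illustration. Buckets are half-open, which sidesteps the one-second gap mentioned above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def bucket_averages(rows, start, interval):
    """Average the values per fixed-width bucket, counting buckets from `start`."""
    groups = defaultdict(list)
    for ts, value in rows:
        # timedelta // timedelta gives the whole number of intervals elapsed,
        # so bucket i covers [start + i*interval, start + (i+1)*interval).
        index = (ts - start) // interval
        groups[index].append(value)
    return {start + i * interval: sum(v) / len(v) for i, v in sorted(groups.items())}

# Hypothetical sample data inside the range used in the question.
start = datetime(2000, 2, 6, 17, 0, 0)
interval = timedelta(hours=1, minutes=12, seconds=14)
rows = [
    (datetime(2000, 2, 6, 17, 5, 0), 10.0),
    (datetime(2000, 2, 6, 18, 0, 0), 20.0),
    (datetime(2000, 2, 6, 18, 30, 0), 30.0),
    (datetime(2000, 2, 6, 20, 0, 0), 40.0),
]
result = bucket_averages(rows, start, interval)
for bucket_start, average in result.items():
    print(bucket_start, average)
```

The range 17:00:00 to 20:36:42 is exactly three intervals of 1h 12min 14s, so the sample rows land in three buckets, matching the three rows the question expects.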

Related

How can I return an event duration divided over multiple days in SQL?

I have a database that contains Event data. These Events represent a warning state in a system. They have a StartTime, EndTime, Duration (in seconds) and VarName (and some other attributes that are less relevant for the question).
I am trying to write a query that will allow me to represent the amount of time that a certain warning was active per day. This way, service engineers can easily see if certain changes/fixes caused a warning to decrease or disappear over time.
A quick-and-dirty first attempt is shown below.
SELECT
[VarName] AS metric,
SUM([Duration]) AS value,
Convert(date, [StartTime]) AS date
FROM [dbo].[Events]
WHERE VarName LIKE 'WARN%'
GROUP BY Convert(date, [StartTime]), VarName
ORDER BY date
This works well enough when Events last only a short time and handles multiple Events of the same type in a day. But it breaks down when Events span multiple days (or even weeks).
Example:
VarName  StartTime                EndTime                 Duration
WARN_1   2021-06-28 23:00:00.000  2021-06-29 02:00:00.00  10800
What I get:
metric  date        value
WARN_1  2021-06-28  10800
What I want:
metric  date        value
WARN_1  2021-06-28  3600
WARN_1  2021-06-29  7200
Taking into account that:
- An event can occur multiple times in the same day
- An event can span multiple days
I'll be fiddling with this today and if I come up with a working solution I'll append it to this post. But I don't work with SQL all that often, and it feels like this may require some more advanced trickery. Any help is appreciated!
You can use a recursive CTE to break the periods into days:
with cte as (
select varname, starttime,
(case when datediff(day, starttime, endtime) = 0
then endtime
else dateadd(day, 1, convert(date, starttime))
end) as day_endtime,
endtime
from t
union all
select varname, day_endtime,
(case when datediff(day, starttime, endtime) = 1
then endtime
else dateadd(day, 1, convert(date, day_endtime))
end) as day_endtime,
endtime
from cte
where datediff(day, starttime, endtime) > 0
)
select *
from cte;
To aggregate, change the last part to:
select varname, convert(date, starttime),
sum(datediff(second, starttime, day_endtime))
from cte
group by varname, convert(date, starttime);
Here is a db<>fiddle.
If you need timespans longer than 100 days, add option (maxrecursion 0) or any number larger than about 732.
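To make the splitting logic easier to follow outside SQL, here is a small Python sketch of the same idea: walk from the start time to the end time, cutting at each midnight. It is an illustration only, not a translation of the CTE; the function name `split_by_day` is hypothetical.

```python
from datetime import datetime, time, timedelta

def split_by_day(start, end):
    """Return [(date, seconds)] for the part of [start, end) on each calendar day."""
    parts = []
    cursor = start
    while cursor < end:
        # Midnight that ends the calendar day containing `cursor`.
        next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
        piece_end = min(end, next_midnight)
        parts.append((cursor.date(), int((piece_end - cursor).total_seconds())))
        cursor = piece_end
    return parts

# The WARN_1 example from the question: 23:00 on the 28th to 02:00 on the 29th.
parts = split_by_day(datetime(2021, 6, 28, 23), datetime(2021, 6, 29, 2))
for day, seconds in parts:
    print(day, seconds)
```

Summing the seconds per (VarName, date) afterwards gives the per-day totals the question asks for.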
If you have a numbers table, you can join to that based on the difference in days which will give you n number of rows for each event based on the days. You then just need to calculate how much of each event falls in that day, which I've done with a case expression:
SELECT t.VarName,
StartDate = CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)),
Duration = CASE -- starts and ends on same date
WHEN CONVERT(DATE, t.StartTime) = CONVERT(DATE, t.EndTime) THEN t.Duration
-- First Day
WHEN n.Number = 0 THEN DATEDIFF(SECOND, t.StartTime, CONVERT(DATE, DATEADD(DAY, 1, t.StartTime)))
--Last Day
WHEN CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)) = CONVERT(DATE, t.EndTime)
THEN DATEDIFF(SECOND, CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)), t.EndTime)
-- Middle Day
ELSE 86400
END
FROM @t AS t
INNER JOIN Numbers AS n
ON n.Number <= DATEDIFF(DAY, t.StartTime, t.EndTime);
If you don't have a numbers table, you can very easily create this on the fly:
DECLARE @T TABLE (VarName VARCHAR(10), StartTime DATETIME, EndTime DATETIME, Duration AS DATEDIFF(SECOND, StartTime, EndTime));
INSERT @T (VarName, StartTime, EndTime)
VALUES
('WARN_1', '20210628 23:00:00.000', '20210629 02:00:00.00'),
('WARN_2', '20210629 11:00:00.000', '20210629 14:00:00.00'),
('WARN_3', '20210630 23:00:00.000', '20210704 02:00:00.00');
-- This will do numbers 0-99, add more cross joins if necessary
WITH Numbers (Number) AS
( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n1 (N)
CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n2 (N)
)
SELECT t.VarName,
StartDate = CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)),
Duration = CASE -- starts and ends on same date
WHEN CONVERT(DATE, t.StartTime) = CONVERT(DATE, t.EndTime) THEN t.Duration
-- First Day
WHEN n.Number = 0 THEN DATEDIFF(SECOND, t.StartTime, CONVERT(DATE, DATEADD(DAY, 1, t.StartTime)))
--Last Day
WHEN CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)) = CONVERT(DATE, t.EndTime)
THEN DATEDIFF(SECOND, CONVERT(DATE, DATEADD(DAY, n.Number, t.StartTime)), t.EndTime)
-- Middle Day
ELSE 86400
END
FROM @T AS t
INNER JOIN Numbers AS n
ON n.Number <= DATEDIFF(DAY, t.StartTime, t.EndTime);

SQL Server, filter for max date and max date minus 7 days

I'm trying to design a view and apply several conditions on my timestamp (datetime): last date and last date minus 7 days.
This works fine for the last date:
SELECT *
FROM table
WHERE timestamp = (SELECT MAX(timestamp) FROM table)
I couldn't figure out the way to add minus 7 days so far.
I tried, for instance
SELECT *
FROM table
WHERE (timestamp = (SELECT MAX(timestamp) FROM table)) OR (timestamp = (SELECT DATEADD(DAY, -7, MAX(timestamp)) FROM table))
and some other variations, including GETDATE() instead of MAX, however, I'm getting the execution timeout messages.
Please let me know what logic should I follow in this case.
Data looks like this, but there's more of it :)
So I want to get data only for rows with 29/11/2019 and 22/11/2019. I have an additional requirement for filtering for factors, but it's a simple one.
If you care about dates, then perhaps you want:
select t.*
from t cross join
(select max(timestamp) as max_timestamp from t) tt
where (t.timestamp >= convert(date, max_timestamp) and
t.timestamp < dateadd(day, 1, convert(date, max_timestamp))
) or
(t.timestamp >= dateadd(day, -7, convert(date, max_timestamp)) and
t.timestamp < dateadd(day, -6, convert(date, max_timestamp))
);
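The filter above keeps two half-open one-day windows: the calendar day of the latest timestamp, and the day seven days earlier. As a rough illustration of that window logic (in Python rather than T-SQL, with a hypothetical function name and made-up sample timestamps):

```python
from datetime import datetime, time, timedelta

def latest_day_and_week_before(timestamps):
    """Keep timestamps on the latest calendar day or on the day 7 days earlier."""
    day_start = datetime.combine(max(timestamps).date(), time.min)
    # Half-open windows [lo, hi), mirroring the >= / < comparisons in the query.
    windows = [
        (day_start, day_start + timedelta(days=1)),
        (day_start - timedelta(days=7), day_start - timedelta(days=6)),
    ]
    return [ts for ts in timestamps if any(lo <= ts < hi for lo, hi in windows)]

stamps = [
    datetime(2019, 11, 22, 8, 0),
    datetime(2019, 11, 25, 12, 0),
    datetime(2019, 11, 29, 0, 0),
    datetime(2019, 11, 29, 17, 45),
]
kept = latest_day_and_week_before(stamps)
print(kept)
```

Only the rows from 22/11/2019 and 29/11/2019 survive, regardless of their time of day.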
So I ended up with the following code:
SELECT *
FROM table
WHERE (timestamp >= CAST(DATEADD(DAY, - 1, GETDATE()) AS datetime)) AND (timestamp < CAST(GETDATE() AS DATETIME)) OR
(timestamp >= CAST(DATEADD(DAY, - 8, GETDATE()) AS datetime)) AND (timestamp < CAST(DATEADD(day, - 7, GETDATE()) AS DATETIME)) AND (Factor1 = 'Criteria1' OR
Factor2 = 'Criteria2')
Not sure if it's the best or the most elegant solution, but it works for me.

Roll weekend counts into monday counts

I have a query like this:
select date, count(*)
from inflow
where date >= dateadd(year, -2, getdate())
group by date
order by date
I need to exclude Saturday and Sunday dates, and instead add their counts into the following Monday. What would be the best way to do this? Do I need to exclude Saturday, Sunday, and Mondays, then add them on with a join to a different query? The query above is a simplified version, this is a relatively big query, so I need to keep efficiency in mind.
Well, this is a somewhat brute-force approach:
select date,
(case when datename(weekday, date) = 'Monday'
then cnt + prev_cnt + prev_cnt2
else cnt
end) as cnt
from (select date, count(*) as cnt,
lag(count(*), 1, 0) over (order by date) as prev_cnt,
lag(count(*), 2, 0) over (order by date) as prev_cnt2
from inflow
where date >= dateadd(year, -2, getdate())
group by date
) d
where datename(weekday, date) not in ('Saturday', 'Sunday')
order by date;
Note: This is assuming English-language settings so the datename() logic works.
An alternative method without subqueries:
select v.dte, count(*) as cnt
from inflow i cross apply
(values (case when datename(weekday, i.date) = 'Saturday'
then dateadd(day, 2, i.date)
when datename(weekday, i.date) = 'Sunday'
then dateadd(day, 1, i.date)
else i.date
end)
) v(dte)
where i.date >= dateadd(year, -2, getdate())
group by v.dte
order by v.dte;
You mention performance; however, without knowing the full picture it's quite hard to know how to optimise the query.
While I was working on this, I noticed Gordon Linoff's answer. I'll still write my version up as well: we both follow the same path, but get to the answer a little differently.
WITH DateData (date, datetoapply)
AS
(
SELECT
[date],
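-- DATEPART(w, ...) depends on SET DATEFIRST; adding @@DATEFIRST and taking
-- modulo 7 gives 0 for Saturday and 1 for Sunday under any setting.
CASE (DATEPART(w, [date]) + @@DATEFIRST) % 7
WHEN 0 THEN DATEADD(d, 2, [date])
WHEN 1 THEN DATEADD(d, 1, [date])
ELSE [date]
END as 'datetoapply'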
FROM inflow
WHERE [date] >= dateadd(year, -2, getdate())
)
SELECT datetoapply, count(*)
FROM DateData
GROUP BY datetoapply
ORDER BY datetoapply
While I could not get Gordon's query working as expected, I can confirm that "DATEPART(w, [date])" performs much better than "DATENAME(weekday, [date])", which if replaced in the query above increases the server processing time from 87ms to 181ms based on a table populated with 10k rows in Azure.
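Both answers rest on the same mapping: Saturday moves forward two days, Sunday one day, and every other date stays put, after which a plain group-by count does the rest. A minimal Python sketch of that mapping follows (illustrative only; `roll_weekend_to_monday` and the sample dates are assumptions, not part of the original query):

```python
from collections import Counter
from datetime import date, timedelta

def roll_weekend_to_monday(d):
    """Map Saturday and Sunday to the following Monday; leave weekdays unchanged."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if d.weekday() == 5:
        return d + timedelta(days=2)
    if d.weekday() == 6:
        return d + timedelta(days=1)
    return d

# Hypothetical inflow rows: one Friday, two Saturday, one Sunday, one Monday.
inflow_dates = [
    date(2021, 6, 25),
    date(2021, 6, 26),
    date(2021, 6, 26),
    date(2021, 6, 27),
    date(2021, 6, 28),
]
counts = Counter(roll_weekend_to_monday(d) for d in inflow_dates)
for day, cnt in sorted(counts.items()):
    print(day, cnt)
```

Unlike the lag-based query, this mapping does not require every calendar date to be present in the data.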

select all from table using date with time

How can I get the last recorded data at 23:59 from yesterday and the day before?
My code doesn't filter on the time yet, so it shows all the data from yesterday and the day before.
select *
from tbl_Total
where date between DATEADD(day, -3, GETDATE()) AND DATEADD(day, -1, GETDATE())
In your case,
select * from tbl_Total as of timestamp timestamp '2017-07-19 23:59:59'
and
select * from tbl_Total as of timestamp timestamp '2017-07-18 23:59:59'
Try this
select *
from tbl_Total
where date between dateadd(day,-3,convert(varchar(10),getdate(),112)) AND dateadd(day,-3,convert(varchar(10),getdate(),112)+ ' 23:59:59:997' )
This query will return yesterday's date with time 23:59:59.
SELECT CAST(CAST(CAST(DATEADD(day, -1, GETDATE()) as DATE) as varchar(12)) +' 23:59:59' as datetime2)
So you can use it in your query:
select *
from tbl_Total
where date between DATEADD(day, -3, GETDATE()) AND CAST(CAST(CAST(DATEADD(day, -1, GETDATE()) as DATE) as varchar(12)) +' 23:59:59' as datetime2)
EDIT: More elegant way:
SELECT DATEADD(second, -1, DATEADD(dd, DATEDIFF(dd,0,GETDATE()),0))
This query returns yesterday's date with time 23:59:59.
EDIT2: If you want to return the day before with time 23:59:59 you need to use this query:
SELECT DATEADD(second, -1, DATEADD(dd, DATEDIFF(dd,1,GETDATE()),0))
If you want to obtain any other day, you can change the second argument of DATEDIFF and test it.
Assuming you don't know the exact time you can get the latest rows using ROW_NUMBER:
with cte as
( select *,
row_number() -- for each day sorted descending
over (partition by DATEADD(dd, DATEDIFF(dd,0,date),0)
order by date desc) as rn
from tbl_Total
where -- yesterday between 23:59 and 23:59:59.999
( date >= DATEADD(dd, DATEDIFF(dd,0,GETDATE()),0) - (1.0/1440)
and date < DATEADD(dd, DATEDIFF(dd,0,GETDATE()),0)
)
or -- day before yesterday between 23:59 and 23:59:59.999
( date >= DATEADD(dd, DATEDIFF(dd,1,GETDATE()),0) - (1.0/1440)
and date < DATEADD(dd, DATEDIFF(dd,1,GETDATE()),0)
)
)
select * from cte
where rn = 1 --latest row only

Get Sum of all Distinct in a week

I am wondering where exactly I am going wrong with this query. I want to get the count of all distinct RepIDs that worked in a particular week. This is in SQL Server 2005. Thank you!
This query gives me distinct RepIDs for the whole week. I want to count a RepID twice if he has records on 2 different days, but only once for a given day even if he has more than 1 record on that particular day. I hope that is clear; I am sorry that I was not clear before. Thank you!
Select count(distinct(RepID)) as SalesPeople from DailyInfo
where Date > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date)
and Date < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date)
You can make unique combinations of the RepID+Date to make it unique (SQLFiddle):
SELECT COUNT(distinct RIGHT(DateDiff(d,0,Date),10)
+RIGHT(RepID,10)) as SalesPeople
FROM DailyInfo
WHERE Date > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date)
AND Date < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date);
I have assumed DailyInfo.Date can contain time information. If it does not, you can swap DateDiff(d,0,Date) above for just Date. Similarly, CAST(DateDiff(d,0,Date) as datetime) below can be just Date.
Below is the query if you needed to see the breakdown for each day.
SELECT CAST(DateDiff(d,0,Date) as datetime) TheDay,
COUNT(distinct RepID) as SalesPeople
FROM DailyInfo
WHERE Date > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date)
AND Date < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date)
GROUP BY CAST(DateDiff(d,0,Date) as datetime) -- by day
ORDER BY TheDay
Let me answer this by suggesting how you should think about the problem. You are looking for the number of reps per day. So, your query should have a summary (subquery) at this level. Then, you can count the number of days per week.
Assuming that your date does not have any time component, you can use the following:
select count(*)
from (select RepId, date as thedate, count(*) as NumOnDay
from DailyInfo
where Date > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date)
and Date < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date)
group by RepId, date
) rd
Alternatively, you could count the number of days that a rep worked during a week and then add these up:
select sum(numdates)
from (select RepId, count(distinct date) as numdates
from DailyInfo
where Date > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date)
and Date < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date)
group by RepId
) rd
If your date field has a time component, then you need to remove the time component for this to work. Or use some trick such as day(date), since the day function returns a different value for each date in a week. In later versions of SQL Server, you can just cast(date as date), if the original date is datetime.
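Both queries above count distinct (RepID, day) pairs within the week. A short Python sketch of that counting rule, with the time component stripped first (illustrative only; `rep_days` and the sample records are made up):

```python
from datetime import datetime

def rep_days(records):
    """Count distinct (rep, calendar day) pairs, ignoring the time of day."""
    return len({(rep_id, ts.date()) for rep_id, ts in records})

records = [
    (1, datetime(2021, 6, 28, 9, 0)),
    (1, datetime(2021, 6, 28, 15, 30)),  # same rep, same day: counted once
    (1, datetime(2021, 6, 29, 10, 0)),   # same rep, another day: counted again
    (2, datetime(2021, 6, 28, 11, 0)),
]
total = rep_days(records)
print(total)
```

Rep 1 contributes two rep-days and rep 2 contributes one, which is the behaviour the question describes.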
Select count(1)
from DailyInfo
where [put your condition here]
group by convert(varchar(10),[date], 120)
You could use a CTE, though it might be overkill: group by day inside the CTE, and then all you need to do is sum the totals.
WITH cte ([day], total) as
(
Select DATENAME(DW,[Date]), count(distinct(RepID)) as SalesPeople from DailyInfo
where [Date] > DATEADD(dd, -(DATEPART(dw, @Date)-1), @Date) and [Date] < DATEADD(dd, 7-(DATEPART(dw, @Date)), @Date)
GROUP BY DATENAME(DW,[Date])
)
select SUM(total) FROM cte;
To do what I think you want you need to group by day, and filter on the week, and then do a distinct on the result:
Select Distinct RepID
From (Select RepID, DateDiff(day, 0, Date) As TheDay
From DailyInfo
Where Date > DateAdd(dd, -(DATEPART(dw, @Date)-1), @Date)
And Date < DateAdd(dd, 7-(DATEPART(dw, @Date)), @Date)
Group By RepID, DateDiff(day, 0, Date)) d