SQL Server : average count of alerts per day, not including days with no alerts - sql

I have a table that acts as a message log, with the two key tables being TIMESTAMP and TEXT. I'm working on a query that grabs all alerts (from TEXT) for the past 30 days (based on TIMESTAMP) and gives a daily average for those alerts.
Here is the query so far:
--goback 30 days start at midnight
declare #olderdate as datetime
set #olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))
--today at 11:59pm
declare #today as datetime
set #today = dateadd(ms, -3, (dateadd(day, +1, convert(varchar, GETDATE(), 101))))
print #today
--Grab average alerts per day over 30 days
select
avg(x.Alerts * 1.0 / 30)
from
(select count(*) as Alerts
from MESSAGE_LOG
where text like 'The process%'
and text like '%has alerted%'
and TIMESTAMP between #olderdate and #today) X
However, I want to add something that checks whether there were any alerts for a day and, if there are no alerts for that day, doesn't include it in the average. For example, if there are 90 alerts for a month but they're all in one day, I wouldn't want the average to be 3 alerts per day since that's clearly misleading.
Is there a way I can incorporate this into my query? I've searched for other solutions to this but haven't been able to get any to work.

This isn't written for your query, as I don't have any DDL or sample data, thus I'm going to provide a very simple example instead of how you would do this.
USE Sandbox;
GO
CREATE TABLE dbo.AlertMessage (ID int IDENTITY(1,1),
AlertDate date);
INSERT INTO dbo.AlertMessage (AlertDate)
VALUES('20190101'),('20190101'),('20190105'),('20190110'),('20190115'),('20190115'),('20190115');
GO
--Use a CTE to count per day:
WITH Tots AS (
SELECT AlertDate,
COUNT(ID) AS Alerts
FROM dbo.AlertMessage
GROUP BY AlertDate)
--Now the average
SELECT AVG(Alerts*1.0) AS DayAverage
FROM Tots;
GO
--Clean up
DROP TABLE dbo.AlertMessage;

You're trying to compute a double-aggregate: The average of daily totals.
Without using a CTE, you can try this as well, which is generalized a bit more to work for multiple months.
--get a list of events per day
DECLARE #Event TABLE
(
ID INT NOT NULL IDENTITY(1, 1)
,DateLocalTz DATE NOT NULL--make sure to handle time zones
,YearLocalTz AS DATEPART(YEAR, DateLocalTz) PERSISTED
,MonthLocalTz AS DATEPART(MONTH, DateLocalTz) PERSISTED
)
/*
INSERT INTO #Event(EntryDateLocalTz)
SELECT DISTINCT CONVERT(DATE, TIMESTAMP)--presumed to be in your local time zone because you did not specify
FROM dbo.MESSAGE_LOG
WHERE UPPER([TEXT]) LIKE 'THE PROCESS%' AND UPPER([TEXT]) LIKE '%HAS ALERTED%'--case insenitive
*/
INSERT INTO #Event(DateLocalTz)
VALUES ('2018-12-31'), ('2019-01-01'), ('2019-01-01'), ('2019-01-01'), ('2019-01-12'), ('2019-01-13')
--get average number of alerts per alerting day each month
-- (this will not return months with no alerts,
-- use a LEFT OUTER JOIN against a month list table if you need to include uneventful months)
SELECT
YearLocalTz
,MonthLocalTz
,AvgAlertsOfAlertingDays = AVG(CONVERT(REAL, NumDailyAlerts))
FROM
(
SELECT
YearLocalTz
,MonthLocalTz
,DateLocalTz
,NumDailyAlerts = COUNT(*)
FROM #Event
GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
) AS X
GROUP BY YearLocalTz, MonthLocalTz
ORDER BY YearLocalTz ASC, MonthLocalTz ASC
Some things to note in my code:
I use PERSISTED columns to get the month and year date parts (because I'm lazy when populating tables)
Use explicit CONVERT to escape integer math that rounds down decimals. Multiplying by 1.0 is a less-readable hack.
Use CONVERT(DATE, ...) to round down to midnight instead of converting back and forth between strings
Do case-insensitive string searching by making everything uppercase (or lowercase, your preference)
Don't subtract 3 milliseconds to get the very last moment before midnight. Change your semantics to interpret the end of a time range as exclusive, instead of dealing with the precision of your datatypes. The only difference is using explicit comparators (i.e. use < instead of <=). Also, DATETIME resolution is 1/300th of a second, not 3 milliseconds.
Avoid using built-in keywords as column names (i.e. "TEXT"). If you do, wrap them in square brackets to avoid ambiguity.

Instead of dividing by 30 to get the average, divide by the count of distinct days in your results.
select
avg(x.Alerts * 1.0 / x.dd)
from
(select count(*) as Alerts, count(distinct CAST([TIMESTAMP] AS date)) AS dd
...

Related

Adding x work days onto a date in SQL Server?

I'm a bit confused if there is a simple way to do this.
I have a field called receipt_date in my data table and I wish to add 10 working days to this (with bank holidays).
I'm not sure if there is any sort of query I could use to join onto this table from my original to calculate 10 working days from this, I've tried a few sub queries but I couldn't get it right or perhaps its not possible to do this. I didn't know if there was a way to extract the 10th rowcount after the receipt date to get the calendar date if I only include 'Y' into the WHERE?
Any help appreciated.
This is making several assumptions about your data, because we have none. One method, however, would be to create a function, I use a inline table value function here, to return the relevant row from your calendar table. Note that this assumes that the number of days must always be positive, and that if you provide a date that isn't a working day that day 0 would be the next working day. I.e. adding zero working days to 2021-09-05 would return 2021-09-06, or adding 3 would return 2021-09-09. If that isn't what you want, this should be more than enough for you to get there yourself.
CREATE FUNCTION dbo.AddWorkingDays (#Days int, #Date date)
RETURNS TABLE AS
RETURN
WITH Dates AS(
SELECT CalendarDate,
WorkingDay
FROM dbo.CalendarTable
WHERE CalendarDate >= #Date)
SELECT CalendarDate
FROM Dates
WHERE WorkingDay = 1
ORDER BY CalendarDate
OFFSET #Days ROWS FETCH NEXT 1 ROW ONLY;
GO
--Using the function
SELECT YT.DateColumn,
AWD.CalendarDate AS AddedWorkingDays
FROM dbo.YourTable YT
CROSS APPLY dbo.AddWorkingDays(10,YT.DateColumn) AWD;

How to get count of average calls every 5 minutes using datetime sql

I am running the following query
select DateTime
from Calls
where DateTime > '17 Oct 2018 00:00:00.000' and
DialedNumberID = '1234'
What would this give me is a list of all the times that this number was dialled on the specific date.
Essentially what I am looking for is a query that would give me the average calls that take place every X minutes and would like to run the query for the whole year.
Thanks
I guess you have a table named Calls with the columns DateTime and DialedNumberID.
You can summarize the information in that table year-by-year using the kind of pattern.
SELECT YEAR(`DateTime`),
DialedNumberID,
COUNT(*) call_count
FROM Calls
GROUP BY YEAR(`DateTime`), DialedNumberID
The trick in this pattern is to GROUP BY f(date) . The function f() reduces any date to the year in which it occures.
Summary by five minute intervals, you need f(date) that reduces datestamps to five minute intervals. That function is a good deal more complex than YEAR().
DATE_FORMAT(datestamp,'%Y-%m-%d %H:00') + INTERVAL (MINUTE(datestamp) - MINUTE(datestamp) MOD 5)
Given, for example, 2001-09-11 08:43:00, this gives back 2001-09-11 08:40:00.
So, here's your summary by five minute intervals.
SELECT DATE_FORMAT(`DateTime`,'%Y-%m-%d %H:00')+INTERVAL(MINUTE(`DateTime`)-MINUTE(datestamp) MOD 5) interval_beginning,
DialedNumberID,
COUNT(*) call_count
FROM Calls
GROUP BY DATE_FORMAT(`DateTime`,'%Y-%m-%d %H:00')+INTERVAL(MINUTE(`DateTime`)-MINUTE(datestamp) MOD 5),
DialedNumberID
You can make this query clearer and less repetitive by defining a stored function for that ugly DATE_FORMAT() expression. But that's a topic for another day.
Finally, append
WHERE YEAR(`DateTime`) = YEAR(NOW())
AND DialedNumberID = '1234'
to the query to filter by the current year and a particular id.
This query will need work to make it as efficient as possible. That too is a topic for another day.
Pro tip: DATETIME is a reserved word in MySQL. Column names are generally case-insensitive. Avoid naming your columns, in this case DateTime, the same as a reserved word.
The average amount of calls per interval is the number of calls (COUNT(*)) divided by the minutes between the start and end of of the monitored period (TIMESTAMPDIFF(minute, period_start, period_end)) multiplied with the number of minutes in the desired interval (five in your example).
For MySQL:
select count(*) / timestampdiff(minute, date '2018-01-01', now()) * 5 as avg_calls
from calls
where `datetime` >= date '2018-01-01'
and dialednumberid = 1234;
For SQL Server:
select count(*) * 1.0 / datediff(minute, '20180101', getdate()) * 5 as avg_calls
from calls
where [datetime] >= '20180101'
and dialednumberid = 1234;
This forces the call time into 5 minute intervals. Use 'count' and 'group by' on these intervals. Using DateTime as a column name is confusing
SELECT DATEADD(MINUTE, CAST(DATEPART(MINUTE, [DateTime] AS INTEGER)%5 * - 1,CAST(FORMAT([DateTime], 'MM/dd/yyyy hh:mm') AS DATETIME)) AS CallInterval, COUNT(*)
FROM Calls
GROUP BY DATEADD(MINUTE, CAST(DATEPART(MINUTE, [DateTime]) AS INTEGER)%5 * - 1,CAST(FORMAT([DateTime], 'MM/dd/yyyy hh:mm') AS DATETIME))

Converting a time excel formula to t-sql

So I'm new to SQL (I believe it's T-SQL) and I'm trying to convert a function I used in Excel to SQL.
L2 becomes Column 1
G2 becomes Column 2
=(INT(L2)-INT(G2))*("17:00"-"08:45")+MEDIAN(MOD(L2,1),"17:00","08:45")-MEDIAN(MOD(G2,1),"17:00","08:45")
What this does is calculate the business hours worked between 8:45AM and 05:00PM.
If work goes from 4:00PM to 9:00AM the next day, the result should be 01:15:00.
If it goes over several days (4:00PM on the 1st to 9:00AM on the 4th) it should be 17:45:00.
I'd prefer not to have a separate function because I don't know how to use them as I'm quite new to this - I'd prefer to have it as something I can write within the SELECT * , <code here here> FROM db.name section.
Thanks in advance
I know you said you don't want this in a function, but they really aren't hard to use and the logic you require for this is too complex in SQL Server to be sensibly contained inline (Though it can be, if you really want to be that guy).
This function has no error handling if any of your parameters are not suitable, though I will leave that up to you as a learning exercise on NULL values, process flows and fully thinking through all the possibilities that you may need to deal with:
-- This bit creates your function. You can rename the function from fnWorkingDays to anything you want, though try to keep your naming conventions sensible:
create function fnWorkingDays(#Start datetime
,#End datetime
)
returns decimal(10,2)
as
begin
-- Declare the start and end times of your working day:
declare #WorkingStart time = '08:45:00.000'
declare #WorkingEnd time = '17:00:00.000'
-- Work out the number of minutes outside the working day in 24 Hour Notation:
declare #OvernightMinutes int = datediff(minute -- Work out the difference in minutes,
,cast(#workingend as datetime) -- between the end of the working day (CASTing a TIME as DATETIME gives you 1900-01-01 17:00:00)
,dateadd(d,1,cast(#WorkingStart as datetime)) -- and the start of the next working day (CAST the TIME value as DATETIME [1900-01-01 08:45:00] and then add a day to it [1900-01-02 08:45:00])
)
-- There is no need to retain the minutes that fall outside your Working Day, to if the very start or very end of your given period fall outside your Working Day, discard those minutes:
declare #TrueStart datetime = (select case when cast(#Start as time) < #WorkingStart
then dateadd(d,datediff(d,0,#Start),0) + cast(#WorkingStart as datetime)
else #Start
end
)
declare #TrueEnd datetime = (select case when cast(#End as time) > #WorkingEnd
then dateadd(d,datediff(d,0,#End),0) + cast(#WorkingEnd as datetime)
else #End
end
)
-- You can now calculate the number of minutes in your true working period, and then subtract the total overnight periods in minutes to get your final value.
-- So firstly, if your Working Period is not long enough to stretch over two days, there is not need to do any more than calculate the difference between the True Start and End:
return (select case when datediff(minute,#Start,#End) < #OvernightMinutes
then datediff(minute,#TrueStart,#TrueEnd)
-- If you do need to calculate over more than one day, calculate the total minutes between your True Start and End, then subtract the number of Overnight Minutes multiplied by the number of nights.
-- This works because DATEDIFF calculated the number of boundaries crossed, so when using DAYS, it actually counts the number of midnights between your two dates:
else (datediff(minute,#TrueStart,#TrueEnd) - (datediff(d,#TrueStart,#TrueEnd) * #OvernightMinutes))/1440.
-- If you want to return your value in a slightly different format, you could use variations of these two, though you will need to change the RETURNS DECIMAL(10,2) at the top to RETURNS NVARCHAR(25) if you use the last one:
-- else datediff(minute,#TrueStart,#TrueEnd) - (datediff(d,#TrueStart,#TrueEnd) * #OvernightMinutes)
-- else cast((datediff(minute,#TrueStart,#TrueEnd) - (datediff(d,#TrueStart,#TrueEnd) * #OvernightMinutes))/60 as nvarchar(5)) + ' Hours ' + cast((datediff(minute,#TrueStart,#TrueEnd) - (datediff(d,#TrueStart,#TrueEnd) * #OvernightMinutes))%60 as nvarchar(5)) + ' Minutes'
end
)
end
go
And this is how you call the function:
select dbo.fnWorkingDays('2016-09-04 12:00:00.000', '2016-09-06 12:10:00.000') as WorkingDays
You can replace the two DATETIME values about with the appropriate column names to get your desired result inline:
select dbo.fnWorkingDays(Dates.StartDate, Dates.EndDate) as WorkingDays
from (select '2016-09-04 12:00:00.000' as StartDate
,'2016-09-06 12:10:00.000' as EndDate
) as Dates

How to rearrange the value of column

I have a table (tblDates). In this table I have two column (Date,Age) . Now I want If I add new date in this table then Age column rearranged there values.
Table - tblDates
Date Age
--------------------
12/01/14 5
12/02/14 4
12/03/14 3
12/04/14 2
12/05/14 1
If I add New date i.e., 12/06/14 then I want result like this
Table - tblDates
Date Age
--------------------
12/01/14 6
12/02/14 5
12/03/14 4
12/04/14 3
12/05/14 2
12/06/14 1
I may be reading too much into your question, but if your goal is to compute the age (in days) from a given date (today?) to the date stored in your tables, then you'll be better off using the DATEDIFF function and computing the value when you query it each time.
For example:
-- Option 1: Compute when you query it each time in the query you require it
SELECT d.[Date], DATEDIFF(dd, d.[Date], CONVERT(DATE, GETDATE())) as [Age]
FROM tblDates AS d
You can also define the Age column on your table as a Computed Column if it will be used frequently enough, or wrap the table in a View to embed this computation:
-- Option 2: Compute at query time, but build the computation into the table definition
CREATE TABLE [dbo].[tblDates] (
[Date] DATE NOT NULL,
[AgeInDaysComputed] AS (DATEDIFF(dd, [Date], CONVERT(DATE, GETDATE())) )
)
GO
-- Option 3: Compute at query time, but require caller interact with a different object
-- (view) to get the computation
CREATE VIEW [dbo].[vwDates]
AS
SELECT d.[Date], DATEDIFF(dd, d.[Date], CONVERT(DATE, GETDATE())) as [AgeInDays]
FROM dbo.tblDates AS D
GO
One note regarding the GETDATE function: you need to be aware of your server timezone, as GETDATE returns the date according to your server's local timezone. As long as your server configuration and user's configurations are all in the same timezone, this should provide the correct result.
(If the age in days is what you're trying to compute, you may want to edit your question to better reflect this intent for the benefit of future readers, as it is quite different from "rearranging the value of columns")
Pull the values that you want out when you query, not when you insert data. You seem to want:
select d.*, row_number() over (order by date desc) as age
from tblDates d;
Otherwise, your insert operation will become very cumbersome, requiring changes to all the rows in the table.

How to store absolute and relative date ranges in SQL database?

I'm trying to model a DateRange concept for a reporting application. Some date ranges need to be absolute, March 1, 2011 - March 31, 2011. Others are relative to current date, Last 30 Days, Next Week, etc. What's the best way to store that data in SQL table?
Obviously for absolute ranges, I can have a BeginDate and EndDate. For relative ranges, having an InceptionDate and an integer RelativeDays column makes sense. How do I incorporate both of these ideas into a single table without implementing context into it, ie have all four columns mentioned and use XOR logic to populate 2 of the 4.
Two possible schemas I rejected due to having context-driven columns:
CREATE TABLE DateRange
(
BeginDate DATETIME NULL,
EndDate DATETIME NULL,
InceptionDate DATETIME NULL,
RelativeDays INT NULL
)
OR
CREATE TABLE DateRange
(
InceptionDate DATETIME NULL,
BeginDaysRelative INT NULL,
EndDaysRelative INT NULL
)
Thanks for any advice!
I don't see why your second design doesn't meet your needs unless you're in the "no NULLs never" camp. Just leave InceptionDate NULL for "relative to current date" choices so that your application can tell them apart from fixed date ranges.
(Note: not knowing your DB engine, I've left date math and current date issues in pseudocode. Also, as in your question, I've left out any text description and primary key columns).
Then, either create a view like this:
CREATE VIEW DateRangesSolved (Inception, BeginDays, EndDays) AS
SELECT CASE WHEN Inception IS NULL THEN Date() ELSE Inception END,
BeginDays,
EndDays,
FROM DateRanges
or just use that logic when you SELECT from the table directly.
You can even take it one step further:
CREATE VIEW DateRangesSolved (BeginDate, EndDate) AS
SELECT (CASE WHEN Inception IS NULL THEN Date() ELSE Inception END + BeginDays),
(CASE WHEN Inception IS NULL THEN Date() ELSE Inception END + EndDays)
FROM DateRanges
Others are relative to current date, Last 30 Days, Next Week, etc.
What's the best way to store that data in SQL table?
If you store those ranges in a table, you have to update them every day. In this case, you have to update each row differently every day. That might be a big problem; it might not.
There usually aren't many rows in that kind of table, often less than 50. The table structure is obvious. Updating should be driven by a cron job (or its equivalent), and you should run very picky exception reports every day to make sure things have been updated correctly.
Normally, these kinds of reports should produce no output if things are fine. You have the added complication that driving such a report from cron will produce no output if cron isn't running. And that's not fine.
You can also create a view, which doesn't require any maintenance. With a few dozen rows, it might be slower than a physical table, but it might still fast enough. And it eliminates all maintenance and administrative work for these ranges. (Check for off-by-one errors, because I didn't.)
create view relative_date_ranges as
select 'Last 30 days' as range_name,
(current_date - interval '30' day)::date as range_start,
current_date as range_end
union all
select 'Last week' as range_name,
(current_date - interval '7' day)::date as range_start,
current_date as range_end
union all
select 'Next week' as range_name,
(current_date + interval '7' day)::date as range_start,
current_date as range_end
Depending on the app, you might be able to treat your "absolute" ranges the same way.
...
union all
select 'March this year' as range_name,
(extract(year from current_date) || '-03-01')::date as range_start,
(extract(year from current_date) || '-03-31')::date as range_end
Put them in separate tables. There is absolutely no reason to have them in a single table.
For the relative dates, I would go so far as to simply make the table the parameters you need for the date functions, i.e.
CREATE TABLE RelativeDate
(
Id INT Identity,
Date_Part varchar(25),
DatePart_Count int
)
Then you can know that it is -2 WEEK or 30 DAY variance and use that in your logic.
If you need to see them both at the same time, you can combine them LOGICALLY in a query or view without needing to mess up your data structure by cramming different data elements into the same table.
Create a table containing the begindate and the offset. The precision of the offset is up to you to decide.
CREATE TABLE DateRange(
BeginDate DATETIME NOT NULL,
Offset int NOT NULL,
OffsetLabel varchar(100)
)
to insert into it:
INSERT INTO DateRange (BeginDate, Offset, OffsetLabel)
select '20110301', DATEDIFF(sec, '20110301', '20110331'), 'March 1, 2011 - March 31, 2011'
Last 30 days
INSERT INTO DateRange (BeginDate, Duration, OffsetLabel)
select '20110301', DATEDIFF(sec, current_timestamp, DATEADD(day, -30, current_timestamp)), 'Last 30 Days'
To display the values later:
select BeginDate, EndDate = DATEADD(sec, Offset, BeginDate), OffsetLabel
from DateRange
If you want to be able to parse the "original" vague descriptions you will have to look for a "Fuzzy Date" or "Approxidate" function. (There exists something like this in the git source code. )