Query for the average call length for all users for a day - sql

A person uses their cell phone multiple times per day, and the length of their calls varies. I am tracking the length of the calls in a table:
Calls [callID, memberID, startTime, duration]
I need a query to return the average call length for users per day. "Per day" means that if a user used the phone 3 times, the first time for 5 minutes, the second for 10 minutes and the last for 7 minutes, the calculation is: (5 + 10 + 7) / 3 = ...
Note:
People don't use the phone every day, so we have to get the latest day's average per person and use this to get the overall average call duration.
We don't want to count anyone twice in the average, so only 1 row per user will go into calculating the average daily call duration.
Some clarifications...
I need an overall per-day average, based on the per-user per-day average, using each user's latest day's numbers (since we are only counting a given user ONCE in the query). This means we will be using averages from different days, since people might not use the phone every day, or even on the same day.

The following query will get you the desired end results.
SELECT AVG(rt.UserDuration) AS AveragePerDay
FROM
(
SELECT
c1.MemberId,
AVG(c1.Duration) AS "UserDuration"
FROM Calls c1
WHERE CONVERT(VARCHAR, c1.StartTime, 102) =
(SELECT CONVERT(VARCHAR, MAX(c2.StartTime), 102)
FROM Calls c2
WHERE c2.MemberId = c1.MemberId)
GROUP By MemberId
) AS rt
This works by first creating a derived table with 1 record for each member, holding the average duration of their calls for their most recent day. It then simply averages all of those values to get the final "average call duration". If you want to see a specific user, you can run just the inner SELECT to get the per-member list.
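As a rough illustration of the two-step aggregate described above, here is a minimal sketch in Python using the standard sqlite3 module. It is not the SQL Server query itself: SQLite's date() stands in for CONVERT(VARCHAR, ..., 102), and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Calls (callID INTEGER PRIMARY KEY, memberID INTEGER, "
    "startTime TEXT, duration REAL)"
)
# Made-up sample data: (memberID, startTime, duration in minutes)
rows = [
    (1, "2024-01-01 09:00:00", 20),  # older day for member 1, must be ignored
    (1, "2024-01-02 08:00:00", 5),
    (1, "2024-01-02 12:00:00", 10),
    (1, "2024-01-02 18:00:00", 7),
    (2, "2024-01-01 10:00:00", 4),
    (2, "2024-01-01 11:00:00", 6),
]
conn.executemany(
    "INSERT INTO Calls (memberID, startTime, duration) VALUES (?, ?, ?)", rows
)

# Step 1: one row per member, averaging only that member's most recent day.
per_member_sql = """
SELECT c1.memberID, AVG(c1.duration) AS UserDuration
FROM Calls c1
WHERE date(c1.startTime) =
      (SELECT date(MAX(c2.startTime)) FROM Calls c2
       WHERE c2.memberID = c1.memberID)
GROUP BY c1.memberID
"""
per_member = dict(conn.execute(per_member_sql).fetchall())

# Step 2: average the per-member averages, so each member counts once.
overall = conn.execute(
    f"SELECT AVG(UserDuration) FROM ({per_member_sql}) AS rt"
).fetchone()[0]

print(per_member, overall)
```

Member 1 contributes (5 + 10 + 7) / 3 from their latest day only, and member 2 contributes (4 + 6) / 2, so each member has equal weight in the overall figure regardless of how many calls they made.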

You need to convert the DATETIME to something you can make "per day" groups on. Style 102 produces "yyyy.mm.dd".
SELECT
memberId,
CONVERT(VARCHAR, startTime, 102) Day,
AVG(Duration) AvgDuration
FROM
Calls
WHERE
CONVERT(VARCHAR, startTime, 102) =
(
SELECT
CONVERT(VARCHAR, MAX(startTime), 102)
FROM
Calls i WHERE i.memberId = Calls.memberId
)
GROUP BY
memberId,
CONVERT(VARCHAR, startTime, 102)
Use LEFT(CONVERT(VARCHAR, startTime, 120), 10) to produce "yyyy-mm-dd".
For this kind of query it would be helpful to have a dedicated "day only" column. That avoids the whole conversion business and, as a side effect, makes the query more readable.

select avg(duration) from calls group by date(startTime);

Related

SQL Average or MAX of one column in query for pivot table

I have a query that shows every time an employee clocks in or out at a customer site. I've already worked out the math for calculating their hours for each visit to a site. Is it possible to just get the average or max value of the invoice amount column for CompanyA?
I'm trying to use this data in a pivot table, but it will SUM the Invoice Amount for that customer showing '690.0000' and then it totals in the "Grand Total". This makes the math wrong for the grand total. I want it to just account for the one value of 345.00. The InvoiceAmount is always the same no matter how many times they clocked in and out there.
Is there something I can do in my query before it's pulled into the pivot table?
ProjectName   InvoiceAmount   HoursPerRow
CompanyA      345.00          19
CompanyA      345.00          2
declare @current_cst datetimeoffset;
set @current_cst = (SELECT getdate() AT TIME ZONE 'UTC' AT TIME ZONE 'Central Standard Time')
SELECT
dbo.WorkOrderEmployeeTimeTracker.EmployeeID,
dbo.Employees.FirstName + ' ' + dbo.Employees.LastName AS EmployeeFullName,
dbo.Projects.ProjectName,
MAX (dbo.WorkOrderProjects.InvoiceAmount) AS InvoiceAmount,
dbo.WorkOrders.WorkOrderID,
AVG (dbo.WorkOrderEmployeeTimeTracker.Wage) AS Wage,
convert(datetime,switchoffset(convert(datetimeoffset, dbo.WorkOrderEmployeeTimeTracker.ClockIn),datename(tzoffset,@current_cst))) AS ClockIn,
convert(datetime,switchoffset(convert(datetimeoffset, dbo.WorkOrderEmployeeTimeTracker.ClockOut),datename(tzoffset,@current_cst))) AS ClockOut,
CONVERT(int,DATEDIFF(mi, ClockIn, ClockOut) / 60) as Hrs_Difference,
CONVERT(int,DATEDIFF(mi, ClockIn, ClockOut) % 60) as Mins_Difference,
DATEDIFF(second, dbo.WorkOrderEmployeeTimeTracker.ClockIn, dbo.WorkOrderEmployeeTimeTracker.ClockOut) / 3600.0 as HoursPerRow
FROM
dbo.WorkOrderEmployeeTimeTracker INNER JOIN
dbo.WorkOrders ON dbo.WorkOrderEmployeeTimeTracker.WorkOrderID = dbo.WorkOrders.WorkOrderID INNER JOIN
dbo.WorkOrderProjects ON dbo.WorkOrders.WorkOrderProjectID = dbo.WorkOrderProjects.WorkOrderProjectID INNER JOIN
dbo.Employees ON dbo.WorkOrderEmployeeTimeTracker.EmployeeID = dbo.Employees.EmployeeID INNER JOIN
dbo.Projects ON dbo.WorkOrderProjects.ProjectID = dbo.Projects.ProjectID
where dbo.WorkOrderProjects.IsPaid = 'True'
group by
dbo.WorkOrderEmployeeTimeTracker.EmployeeID,
dbo.Employees.FirstName,
dbo.Employees.LastName,
dbo.WorkOrderProjects.InvoiceAmount,
dbo.Projects.ProjectName,
dbo.WorkOrders.WorkOrderID,
dbo.WorkOrderEmployeeTimeTracker.Wage,
WorkOrderEmployeeTimeTracker.ClockIn,
dbo.WorkOrderEmployeeTimeTracker.ClockOut
order by 1,2,3
I only want the Invoice Amount once when there is a duplicate. I still need the other information from each row though, as employees have different clock times when they have clocked in and out more than once for the same project.
UPDATE: It would seem that the conversions on ClockIn and ClockOut were throwing everything off. Removing them solved everything. I will have to find another way to convert these times to Central Standard Time.
If anyone has proper ideas on that, please let me know.
There are aggregate functions that will help, such as MAX.
You can write
SELECT MAX(InvoiceAmount) FROM TableName
There are other aggregate functions, such as AVG, that can be used on a column in the same way.
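To make the MAX suggestion concrete, here is a small sketch in Python with the standard sqlite3 module. The Visits table is a hypothetical flattened version of the asker's result set, and the CompanyB row is invented so that the grand total has more than one project in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per clock-in/clock-out visit.
conn.execute(
    "CREATE TABLE Visits (ProjectName TEXT, InvoiceAmount REAL, HoursPerRow REAL)"
)
conn.executemany(
    "INSERT INTO Visits VALUES (?, ?, ?)",
    [
        ("CompanyA", 345.00, 19),
        ("CompanyA", 345.00, 2),
        ("CompanyB", 100.00, 4),  # invented row for illustration
    ],
)

# MAX collapses the repeated invoice amount to a single value per project,
# while SUM still adds up the hours from every visit.
per_project = conn.execute(
    """
    SELECT ProjectName,
           MAX(InvoiceAmount) AS InvoiceAmount,
           SUM(HoursPerRow)   AS TotalHours
    FROM Visits
    GROUP BY ProjectName
    ORDER BY ProjectName
    """
).fetchall()

# The grand total now counts each project's invoice exactly once.
grand_total = sum(amount for _, amount, _ in per_project)
print(per_project, grand_total)
```

This only works because the invoice amount is the same on every row of a project, as stated in the question. If the per-visit rows are still needed, the collapsed amounts would have to be fed to the pivot table separately from the detail rows.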

SQL Server : average count of alerts per day, not including days with no alerts

I have a table that acts as a message log, with the two key columns being TIMESTAMP and TEXT. I'm working on a query that grabs all alerts (from TEXT) for the past 30 days (based on TIMESTAMP) and gives a daily average for those alerts.
Here is the query so far:
--go back 30 days, starting at midnight
declare @olderdate as datetime
set @olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))
--today at 11:59pm
declare @today as datetime
set @today = dateadd(ms, -3, (dateadd(day, +1, convert(varchar, GETDATE(), 101))))
print @today
--Grab average alerts per day over 30 days
select
avg(x.Alerts * 1.0 / 30)
from
(select count(*) as Alerts
from MESSAGE_LOG
where text like 'The process%'
and text like '%has alerted%'
and TIMESTAMP between @olderdate and @today) X
However, I want to add something that checks whether there were any alerts for a day and, if there are no alerts for that day, doesn't include it in the average. For example, if there are 90 alerts for a month but they're all in one day, I wouldn't want the average to be 3 alerts per day since that's clearly misleading.
Is there a way I can incorporate this into my query? I've searched for other solutions to this but haven't been able to get any to work.
This isn't written against your query, as I don't have any DDL or sample data, so I'm going to provide a very simple example of how you would do this instead.
USE Sandbox;
GO
CREATE TABLE dbo.AlertMessage (ID int IDENTITY(1,1),
AlertDate date);
INSERT INTO dbo.AlertMessage (AlertDate)
VALUES('20190101'),('20190101'),('20190105'),('20190110'),('20190115'),('20190115'),('20190115');
GO
--Use a CTE to count per day:
WITH Tots AS (
SELECT AlertDate,
COUNT(ID) AS Alerts
FROM dbo.AlertMessage
GROUP BY AlertDate)
--Now the average
SELECT AVG(Alerts*1.0) AS DayAverage
FROM Tots;
GO
--Clean up
DROP TABLE dbo.AlertMessage;
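For anyone without a SQL Server sandbox, the same count-per-day-then-average idea can be sketched in Python with the standard sqlite3 module. The table and sample dates mirror the example above; the naive "divide by 30" figure is included only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AlertMessage (ID INTEGER PRIMARY KEY, AlertDate TEXT)")
dates = [
    "2019-01-01", "2019-01-01", "2019-01-05", "2019-01-10",
    "2019-01-15", "2019-01-15", "2019-01-15",
]
conn.executemany(
    "INSERT INTO AlertMessage (AlertDate) VALUES (?)", [(d,) for d in dates]
)

# Inner step: one row per day that actually had alerts.
# Outer step: average those daily counts; days without alerts never appear.
day_average = conn.execute(
    """
    WITH Tots AS (
        SELECT AlertDate, COUNT(ID) AS Alerts
        FROM AlertMessage
        GROUP BY AlertDate
    )
    SELECT AVG(Alerts * 1.0) FROM Tots
    """
).fetchone()[0]

# For comparison: spreading the same alerts over a fixed 30-day window.
naive_average = len(dates) / 30
print(day_average, naive_average)
```

With 7 alerts spread over 4 distinct days, the per-alerting-day average is 7 / 4, while dividing by 30 gives a much smaller and arguably misleading number.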
You're trying to compute a double-aggregate: The average of daily totals.
Without using a CTE, you can try this as well, which is generalized a bit more to work for multiple months.
--get a list of events per day
DECLARE @Event TABLE
(
ID INT NOT NULL IDENTITY(1, 1)
,DateLocalTz DATE NOT NULL--make sure to handle time zones
,YearLocalTz AS DATEPART(YEAR, DateLocalTz) PERSISTED
,MonthLocalTz AS DATEPART(MONTH, DateLocalTz) PERSISTED
)
/*
INSERT INTO @Event(DateLocalTz)
SELECT DISTINCT CONVERT(DATE, TIMESTAMP)--presumed to be in your local time zone because you did not specify
FROM dbo.MESSAGE_LOG
WHERE UPPER([TEXT]) LIKE 'THE PROCESS%' AND UPPER([TEXT]) LIKE '%HAS ALERTED%'--case insensitive
*/
INSERT INTO @Event(DateLocalTz)
VALUES ('2018-12-31'), ('2019-01-01'), ('2019-01-01'), ('2019-01-01'), ('2019-01-12'), ('2019-01-13')
--get average number of alerts per alerting day each month
-- (this will not return months with no alerts,
-- use a LEFT OUTER JOIN against a month list table if you need to include uneventful months)
SELECT
YearLocalTz
,MonthLocalTz
,AvgAlertsOfAlertingDays = AVG(CONVERT(REAL, NumDailyAlerts))
FROM
(
SELECT
YearLocalTz
,MonthLocalTz
,DateLocalTz
,NumDailyAlerts = COUNT(*)
FROM @Event
GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
) AS X
GROUP BY YearLocalTz, MonthLocalTz
GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
) AS X
GROUP BY YearLocalTz, MonthLocalTz
ORDER BY YearLocalTz ASC, MonthLocalTz ASC
Some things to note in my code:
I use PERSISTED columns to get the month and year date parts (because I'm lazy when populating tables)
Use explicit CONVERT to escape integer math that rounds down decimals. Multiplying by 1.0 is a less-readable hack.
Use CONVERT(DATE, ...) to round down to midnight instead of converting back and forth between strings
Do case-insensitive string searching by making everything uppercase (or lowercase, your preference)
Don't subtract 3 milliseconds to get the very last moment before midnight. Change your semantics to interpret the end of a time range as exclusive, instead of dealing with the precision of your datatypes. The only difference is using explicit comparators (i.e. use < instead of <=). Also, DATETIME resolution is 1/300th of a second, not 3 milliseconds.
Avoid using built-in keywords as column names (i.e. "TEXT"). If you do, wrap them in square brackets to avoid ambiguity.
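The point about exclusive end bounds can be illustrated outside SQL. The following is a small Python sketch; the helper names last_n_days_window and in_window are made up for this example. The window runs from midnight N days ago (inclusive) to midnight tomorrow (exclusive), so no "minus 3 milliseconds" adjustment is needed.

```python
from datetime import datetime, timedelta


def last_n_days_window(now, days=30):
    """Return (start, end): start is inclusive, end is exclusive."""
    midnight_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = midnight_today - timedelta(days=days)
    end = midnight_today + timedelta(days=1)  # midnight at the start of tomorrow
    return start, end


def in_window(ts, start, end):
    # Half-open comparison: >= on the lower bound, < on the upper bound.
    return start <= ts < end


now = datetime(2019, 1, 31, 15, 30)
start, end = last_n_days_window(now)

last_moment_today = datetime(2019, 1, 31, 23, 59, 59, 999999)
print(start, end, in_window(last_moment_today, start, end), in_window(end, start, end))
```

Because the upper bound is compared with <, the result does not depend on the precision of the timestamp type: any moment before midnight is included and midnight itself belongs to the next day.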
Instead of dividing by 30 to get the average, divide by the count of distinct days in your results.
select
avg(x.Alerts * 1.0 / x.dd)
from
(select count(*) as Alerts, count(distinct CAST([TIMESTAMP] AS date)) AS dd
...

SQL question: count of occurrence greater than N in any given hour

I'm looking through login logs (in Netezza) and trying to find users who have greater than a certain number of logins in any 1 hour time period (any consecutive 60 minute period, as opposed to strictly a clock hour) since December 1st. I've viewed the following posts, but most seem to address searching within a specific time range, not ANY given time period. Thanks.
https://dba.stackexchange.com/questions/137660/counting-number-of-occurences-in-a-time-period
https://dba.stackexchange.com/questions/67881/calculating-the-maximum-seen-so-far-for-each-point-in-time
Count records per hour within a time span
You could use the analytic function lag to look back in a sorted sequence of timestamps and check whether the record that came 19 entries earlier is less than an hour older (this example assumes a threshold of 20 logins):
with cte as (
select user_id,
login_time,
lag(login_time, 19) over (partition by user_id order by login_time) as lag_time
from userlog
order by user_id,
login_time
)
select user_id,
min(login_time) as login_time
from cte
where extract(epoch from (login_time - lag_time)) < 3600
group by user_id
The output will show the matching users with the first occurrence when they logged a twentieth time within an hour.
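The lag idea is easy to check on a small data set outside the database. Here is a minimal Python sketch of the same logic for a single user; the function name first_burst and the sample login times are invented for illustration.

```python
from datetime import datetime, timedelta


def first_burst(login_times, threshold=20, window=timedelta(hours=1)):
    """Return the time of the first login that completes `threshold` logins
    within `window`, or None if that never happens."""
    times = sorted(login_times)
    lag = threshold - 1  # same role as lag(login_time, 19) in the query
    for i in range(lag, len(times)):
        # If the login `lag` entries back is less than `window` older,
        # then `threshold` logins fall inside one sliding window.
        if times[i] - times[i - lag] < window:
            return times[i]
    return None


base = datetime(2018, 12, 1, 8, 0)
busy_user = [base + timedelta(minutes=2 * i) for i in range(20)]    # 20 logins in 38 minutes
quiet_user = [base + timedelta(minutes=10 * i) for i in range(20)]  # 20 logins in 190 minutes

print(first_burst(busy_user), first_burst(quiet_user))
```

Sorting once and comparing each login with the one 19 positions earlier checks every consecutive 60-minute window without counting rows per window, which is why the lag approach avoids a correlated subquery.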
I think you could do something like this (I'll use a login table with just user and datetime columns, for the sake of simplicity):
with connections as (
select ua.user
, ua.datetime
from user_logons ua
where ua.datetime >= timestamp'2018-12-01 00:00:00'
)
select ua.user
, ua.datetime
, (select count(*)
from connections ut
where ut.user = ua.user
and ut.datetime between ua.datetime and (ua.datetime + 1 hour)
) as consecutive_logons
from connections ua
It is up to you to adapt this to your own columns (user, datetime).
You will need to find the date arithmetic facilities of your database (ua.datetime + 1 hour won't work as written); this depends on the DB implementation, for example it is DATE_ADD in MySQL (https://www.w3schools.com/SQl/func_mysql_date_add.asp).
Because of the subquery (select count(*) ...), the whole query will not be the fastest: it is a correlated subquery, so it needs to be re-evaluated for each row.
The WITH clause simply computes a subset of user_logons to reduce the cost. It might not be necessary, but it reduces the complexity of the query.
You might get better performance using a stored function or a function written in an application language (e.g. Java, PHP, ...).

How to get count of average calls every 5 minutes using datetime sql

I am running the following query
select DateTime
from Calls
where DateTime > '17 Oct 2018 00:00:00.000' and
DialedNumberID = '1234'
What this gives me is a list of all the times that this number was dialled from the specified date onwards.
Essentially, what I am looking for is a query that gives me the average number of calls that take place every X minutes, and I would like to run it for the whole year.
Thanks
I guess you have a table named Calls with the columns DateTime and DialedNumberID.
You can summarize the information in that table year-by-year using this kind of pattern.
SELECT YEAR(`DateTime`),
DialedNumberID,
COUNT(*) call_count
FROM Calls
GROUP BY YEAR(`DateTime`), DialedNumberID
The trick in this pattern is to GROUP BY f(date). The function f() reduces any date to the year in which it occurs.
To summarize by five-minute intervals, you need an f(date) that reduces datestamps to five-minute intervals. That function is a good deal more complex than YEAR().
DATE_FORMAT(datestamp,'%Y-%m-%d %H:00') + INTERVAL (MINUTE(datestamp) - MINUTE(datestamp) MOD 5) MINUTE
Given, for example, 2001-09-11 08:43:00, this gives back 2001-09-11 08:40:00.
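The same "round down to the start of the interval" function can be sketched in Python to make the arithmetic easier to follow. The name floor_to_interval and the sample call times are made up for this illustration, and the helper assumes an interval that divides 60 evenly.

```python
from collections import Counter
from datetime import datetime


def floor_to_interval(ts, minutes=5):
    """Round a datetime down to the start of its N-minute interval.
    Assumes `minutes` divides 60 evenly (5, 10, 15, ...)."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


# The example from the text: 08:43 falls in the interval starting at 08:40.
floored = floor_to_interval(datetime(2001, 9, 11, 8, 43, 0))

# Grouping by the floored value is the Python analogue of GROUP BY f(date).
calls = [
    datetime(2018, 10, 17, 8, 41),
    datetime(2018, 10, 17, 8, 43),
    datetime(2018, 10, 17, 8, 47),
]
call_counts = Counter(floor_to_interval(c) for c in calls)
print(floored, call_counts)
```

The expression minute - minute % 5 is exactly the MINUTE(datestamp) - MINUTE(datestamp) MOD 5 term in the SQL expression.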
So, here's your summary by five minute intervals.
SELECT DATE_FORMAT(`DateTime`,'%Y-%m-%d %H:00') + INTERVAL (MINUTE(`DateTime`) - MINUTE(`DateTime`) MOD 5) MINUTE interval_beginning,
DialedNumberID,
COUNT(*) call_count
FROM Calls
GROUP BY DATE_FORMAT(`DateTime`,'%Y-%m-%d %H:00') + INTERVAL (MINUTE(`DateTime`) - MINUTE(`DateTime`) MOD 5) MINUTE,
DialedNumberID
You can make this query clearer and less repetitive by defining a stored function for that ugly DATE_FORMAT() expression. But that's a topic for another day.
Finally, add
WHERE YEAR(`DateTime`) = YEAR(NOW())
AND DialedNumberID = '1234'
to the query, between the FROM and GROUP BY clauses, to filter by the current year and a particular id.
This query will need work to make it as efficient as possible. That too is a topic for another day.
Pro tip: DATETIME is a keyword (a data type name) in MySQL. Column names are generally case-insensitive. Avoid giving your columns, in this case DateTime, the same name as a keyword.
The average number of calls per interval is the number of calls (COUNT(*)) divided by the number of minutes between the start and end of the monitored period (TIMESTAMPDIFF(minute, period_start, period_end)), multiplied by the number of minutes in the desired interval (five in your example).
For MySQL:
select count(*) / timestampdiff(minute, date '2018-01-01', now()) * 5 as avg_calls
from calls
where `datetime` >= date '2018-01-01'
and dialednumberid = 1234;
For SQL Server:
select count(*) * 1.0 / datediff(minute, '20180101', getdate()) * 5 as avg_calls
from calls
where [datetime] >= '20180101'
and dialednumberid = 1234;
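The formula behind both queries (call count, divided by minutes in the period, times minutes per interval) can be checked with a few lines of Python. The function name and the numbers below are invented for the example.

```python
from datetime import datetime


def avg_calls_per_interval(call_count, period_start, period_end, interval_minutes=5):
    """Average number of calls per N-minute interval over the whole period,
    including intervals in which no calls happened."""
    total_minutes = (period_end - period_start).total_seconds() / 60
    return call_count / total_minutes * interval_minutes


# Hypothetical numbers: 288 calls over exactly one day (1440 minutes).
avg_calls = avg_calls_per_interval(
    288, datetime(2018, 1, 1), datetime(2018, 1, 2), interval_minutes=5
)
print(avg_calls)
```

One day contains 288 five-minute intervals, so 288 calls average out to one call per interval. Note that this averages over every interval in the period, whereas the GROUP BY approach only produces rows for intervals that contain at least one call.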
This forces the call time into 5-minute intervals. Use COUNT and GROUP BY on these intervals. Using DateTime as a column name is confusing.
SELECT DATEADD(MINUTE, CAST(DATEPART(MINUTE, [DateTime]) AS INTEGER)%5 * - 1,CAST(FORMAT([DateTime], 'MM/dd/yyyy HH:mm') AS DATETIME)) AS CallInterval, COUNT(*)
FROM Calls
GROUP BY DATEADD(MINUTE, CAST(DATEPART(MINUTE, [DateTime]) AS INTEGER)%5 * - 1,CAST(FORMAT([DateTime], 'MM/dd/yyyy HH:mm') AS DATETIME))

sql select number divided aggregate sum function

I have this schema
and I want to have a query to calculate the cost per consultant per hour per month. In other words, a consultant has a salary per month, I want to divide the amount of the salary between the hours that he/she worked that month.
SELECT
concat_ws(' ', consultants.first_name::text, consultants.last_name::text) as name,
EXTRACT(MONTH FROM tasks.init_time) as task_month,
SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time) as duration,
EXTRACT(MONTH FROM salaries.payment_date) as salary_month,
salaries.payment
FROM consultants
INNER JOIN tasks ON consultants.id = tasks.consultant_id
INNER JOIN salaries ON consultants.id = salaries.consultant_id
WHERE EXTRACT(MONTH FROM tasks.init_time) = EXTRACT(MONTH FROM salaries.payment_date)
GROUP BY (consultants.id, EXTRACT(MONTH FROM tasks.init_time), EXTRACT(MONTH FROM salaries.payment_date), salaries.payment);
It is not possible to do this in the select
salaries.payment / SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time)
Is there another way to do it? Is it possible to solve it in one query?
Assumptions made for this answer:
The model is not entirely clear to me, so I am assuming the following:
you are using PostgreSQL
salaries.payment_date is defined as a date column that stores the day when a consultant was paid
tasks.init_time and tasks.finish_time are defined as timestamp columns storing the date & time when a consultant started and finished work on a specific task.
Your join on only the month is wrong as far as I can tell. For one, because it would also include months from different years, but more importantly because this would lead to a result where the same row from salaries appeared several times. I think you need to join on the complete date:
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id
AND t.init_time::date = s.payment_date --<< here
If my assumptions about the data types are correct, the cast to a timestamp and then back to a time is useless and wrong. Useless because you can simply subtract two timestamps, and wrong because you are ignoring the actual date in the timestamp, so if init_time and finish_time are not on the same day (although unlikely), the result is wrong.
So the calculation of the duration can be simplified to:
t.finish_time - t.init_time
To get the cost per hour per month, you need to convert the interval (which is the result of subtracting one timestamp from another) to a decimal number of hours. You can do this by extracting the seconds from the interval and then dividing by 3600, e.g.
extract(epoch from sum(t.finish_time - t.init_time)) / 3600
If you divide the sum of the payments by that number you get your cost per hour per month:
SELECT concat_ws(' ', c.first_name, c.last_name) as name,
to_char(s.payment_date, 'yyyy-mm') as salary_month,
extract(epoch from sum(t.finish_time - t.init_time)) / 3600 as worked_hours,
sum(s.payment) / (extract(epoch from sum(t.finish_time - t.init_time)) / 3600) as cost_per_hour
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id AND t.init_time::date = s.payment_date
GROUP BY c.id, to_char(s.payment_date, 'yyyy-mm') --<< no parentheses!
order by name, salary_month;
As you want the report broken down by month you should convert the month into something that contains the year as well. I used to_char() to get a string with only year and month. You also need to remove salaries.payment from the group by clause.
You also don't need the "payment month" and "salary month" because both will always be the same as that is the join condition.
And finally you don't need the cast to ::text for the name columns because they are most certainly defined as varchar or text anyway.
The sample data I made up for this: http://sqlfiddle.com/#!15/ae0c9
Somewhat unrelated, but:
You should also not put the column list of the group by in parentheses. Putting a column list in parentheses in Postgres creates an anonymous record, which is something completely different than having multiple columns. This is also true for the columns in the select list.
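The interval-to-hours step from this answer, and the reason the ::time cast is wrong, can be illustrated with Python's datetime and timedelta types, which behave much like Postgres timestamps and intervals here. The task times and the salary of 1200 are made-up values.

```python
from datetime import datetime, timedelta


def worked_hours(tasks):
    """Sum (finish - init) over all tasks and convert the total to decimal hours.
    Mirrors extract(epoch from sum(finish_time - init_time)) / 3600."""
    total = sum((finish - init for init, finish in tasks), timedelta())
    return total.total_seconds() / 3600


# Hypothetical tasks for one consultant in one month.
tasks = [
    (datetime(2015, 3, 2, 9, 0), datetime(2015, 3, 2, 17, 0)),  # 8 hours
    (datetime(2015, 3, 3, 22, 0), datetime(2015, 3, 4, 2, 0)),  # 4 hours, crosses midnight
]
hours = worked_hours(tasks)

salary = 1200  # hypothetical monthly payment
cost_per_hour = salary / hours
print(hours, cost_per_hour)
```

The second task crosses midnight. Subtracting full timestamps gives the correct 4 hours, whereas subtracting only the time-of-day parts (02:00 minus 22:00) would give a negative duration, which is the problem with casting to time first.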
If the goal is to do it all in one query, have you tried using a CTE?
Like
;WITH cte_pymt
AS
(
-- Your existing query 1
)
SELECT <your required data> FROM cte_pymt