I want to find the average age of Ticket.opened. opened is a scope on my Ticket model.
This returns a nice result for one ticket.
time_ago_in_words(Ticket.last.created_at.to_time)
What I want to do is something like this:
age = []
Ticket.opened.each do |t|
  age << time_ago_in_words(t.created_at.to_time)
end
average_age = age/Ticket.opened.count
I am aware that this code is awful, but it is my best attempt at explaining what I am looking for.
If you use PostgreSQL then you could fetch average created_at time in a single SQL query:
avg_created_at = Ticket.opened.average('extract(epoch from tickets.created_at)')
time_ago_in_words(Time.at(avg_created_at))
The EXTRACT(EPOCH FROM ...) function returns the created_at value as the number of seconds since 1970-01-01 00:00:00 UTC, which can be used in an aggregate function such as AVG.
I'm sure MySQL allows such queries too.
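For example (a sketch, assuming MySQL's UNIX_TIMESTAMP function, which also returns seconds since the epoch):
# UNIX_TIMESTAMP converts a DATETIME to seconds since 1970-01-01 UTC,
# playing the same role as extract(epoch from ...) above
avg_created_at = Ticket.opened.average('UNIX_TIMESTAMP(tickets.created_at)')
time_ago_in_words(Time.at(avg_created_at))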
I believe you can get this by converting the times to integers and calculating the average. Roughly speaking:
times = [2.days.ago, 3.days.ago, 4.days.ago]
average_ticket_time = Time.at(times.map(&:to_i).sum / times.count) # convert to ints and get average
time_ago = Time.now - average_ticket_time # time ago from now
readable_time_ago = (time_ago / 60 / 60).round # divide by 60 secs then 60 mins; round to get closest hour
# => 72 (hours ago)
Or in one line:
((Time.now - (Time.at(times.map(&:to_i).sum / times.count))) / 60 / 60).round
So, on average, 72 hours ago.
You can also divide by an additional 24 to get the time in days: 3 days in this case, which is what you'd expect in this simple example.
In your example, your times array would be the following:
times = Ticket.opened.map { |ticket| ticket.created_at.to_i }
# or perhaps the following; check the performance:
times = Ticket.opened.pluck(:created_at).map(&:to_i)
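Putting it together (a sketch; guard against an empty scope before dividing):
# average age of open tickets, expressed in hours
times = Ticket.opened.pluck(:created_at).map(&:to_i)
average_time = Time.at(times.sum / times.count) # integer division is fine here
hours_ago = ((Time.now - average_time) / 60 / 60).round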
Hope that helps - let me know how you get on.
I am working with a table that has a variety of column types in rather specific and strange formats. Specifically I have a column, 'Total_Time' that measures a duration in the format:
days:hours:minutes (d:hh:mm)
e.g. 200:10:03 represents 200 days, 10 hours and 3 minutes.
I want to be able to run queries against this duration in order to filter upon time durations such as
SELECT * FROM [TestDB].[dbo].[myData] WHERE Total_Time < 0:1:20
Ideally this would provide me with a list of entries whose total time duration is less than 1 hour and 20 minutes. I'm not aware of how this is possible in an nvarchar format so I would appreciate any advice on how to approach this problem. Thanks in advance...
I would suggest converting that value to minutes, and then passing the parametrised value as minutes as well.
If we can assume that there will always be a days, hours, and minutes section (so N'0:0:10' would be used to represent 10 minutes) you could do something like this:
SELECT *
FROM (VALUES(N'200:10:03'))V(Duration)
CROSS APPLY (VALUES(CHARINDEX(':',V.Duration)))H(CI)
CROSS APPLY (VALUES(CHARINDEX(':',V.Duration,H.CI+1)))M(CI)
CROSS APPLY (VALUES(TRY_CONVERT(int, LEFT(V.Duration, H.CI-1)),
                    TRY_CONVERT(int, SUBSTRING(V.Duration, H.CI+1, M.CI - H.CI - 1)),
                    TRY_CONVERT(int, STUFF(V.Duration, 1, M.CI, ''))))DHM(Days,Hours,Minutes)
CROSS APPLY (VALUES((DHM.Days*60*24) + (DHM.Hours * 60) + DHM.Minutes))D(Minutes)
WHERE D.[Minutes] < 80; --1 hour 20 minutes = 80 minutes
If you can, then ideally you should be fixing your design and just storing the value as a consumable value (like an int representing the number of minutes), or at least adding a computed column (likely PERSISTED and indexed appropriately) so that you can just reference that.
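For example (a sketch, assuming the column is Total_Time on dbo.myData; converting the string parts to int should be deterministic, which allows PERSISTED):
ALTER TABLE dbo.myData
ADD Total_Minutes AS
      TRY_CONVERT(int, LEFT(Total_Time, CHARINDEX(':', Total_Time) - 1)) * 60 * 24 -- days
    + TRY_CONVERT(int, SUBSTRING(Total_Time,
          CHARINDEX(':', Total_Time) + 1,
          CHARINDEX(':', Total_Time, CHARINDEX(':', Total_Time) + 1) - CHARINDEX(':', Total_Time) - 1)) * 60 -- hours
    + TRY_CONVERT(int, STUFF(Total_Time, 1,
          CHARINDEX(':', Total_Time, CHARINDEX(':', Total_Time) + 1), '')) -- minutes
PERSISTED;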
If you're on SQL Server 2022+, you could do something like this, which is less "awful" to look at:
SELECT *
FROM (VALUES(N'200:10:03'))V(Duration)
CROSS APPLY(SELECT SUM(CASE SS.ordinal WHEN 1 THEN TRY_CONVERT(int,SS.[value]) * 60 * 24
WHEN 2 THEN TRY_CONVERT(int,SS.[value]) * 60
WHEN 3 THEN TRY_CONVERT(int,SS.[value])
END) AS Minutes
FROM STRING_SPLIT(V.Duration,':',1) SS)D
WHERE D.[Minutes] < 80; --1 hour 20 minutes = 80 minutes;
I am trying to improve my query writing and need help with the following...
I have one table with multiple columns, including Operation_Code, Operation_Category, Downtime_In_Minutes, and Downtime (as a percentage of the last 24 hours). Each line of my result set needs to SUM(Downtime_In_Minutes) for each Operation_Code and count the occurrences of that Operation_Code. StopDate will always be yesterday; the date functions and formatting that return yesterday's date work, but are not shown in the query below to keep the code short. So, each line in the results should look like:
StopDate
Operation_Code
Operation_Category
Count (# of occurrences of each Op_Code)
SUM (in minutes) of all downtime for each Operation_Code
% of Last 24 hours
Example Results:
StopDate   Op_Code  OP_Category  Count  Downtime (Minutes)  % of Last 24
7/18/2021  X123     Grinder      10     720                 50%
7/18/2021  A800     Cutter       12     360                 25%
7/18/2021  O225     Polisher     5      60                  4%
My query without attempting any aggregations is basically:
Select StopDate,
OpCode,
OpCat
From DTS
Where StopDate = yesterday
The basic question is: how do I count the occurrences and SUM the total time in minutes for each unique Operation_Code?
Thanks in advance!
Are you just looking for aggregation? Then you can use a window function to get the ratio, which I am guessing is based on the downtime:
Select StopDate, OpCode, OpCat, count(*) as cnt,
       sum(Downtime_In_Minutes) as total_minutes,
       sum(Downtime_In_Minutes) * 1.0 / nullif(sum(sum(Downtime_In_Minutes)) over (), 0) as ratio
From DTS
Where StopDate = yesterday
Group By StopDate, OpCode, OpCat;
I assume you know how to deal with "yesterday", because you say that you already have a query.
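Judging by the sample results in the question (720 minutes shows as 50%), the percentage may be a share of a 24-hour day rather than of the total downtime; if so, a sketch dividing by 1440 minutes instead:
Select StopDate, OpCode, OpCat, count(*) as cnt,
       sum(Downtime_In_Minutes) as total_minutes,
       sum(Downtime_In_Minutes) * 100.0 / 1440 as pct_of_last_24 -- 1440 minutes in 24 hours; 720 -> 50
From DTS
Where StopDate = yesterday
Group By StopDate, OpCode, OpCat;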
I wonder if anyone here can help with a BigQuery piece I am working on.
I'm trying to pull the date, email and last interaction time from a dataset when the last interaction time is equal to or greater than 90 days ago.
I have the following query:
SELECT
date,
user_email,
DATE_FROM_UNIX_DATE(gmail.last_interaction_time) AS Last_Interaction_Date,
DATE_ADD(CURRENT_DATE(), INTERVAL -90 DAY) AS Days_ago
FROM
`bqadminreporting.adminlogtracking.usage`
WHERE
'Last_Interaction_Date' >= 'Days_ago'
However, I run into the following error:
DATE value is out of allowed range: from 0001-01-01 to 9999-12-31
As far as I can see, it makes sense, so I'm not entirely sure why it's throwing an error.
Looks like you have some inconsistent values (data) in the field gmail.last_interaction_time, which you need to handle to avoid the error.
Moreover, the above query will not work as per your expected WHERE condition: 'Last_Interaction_Date' >= 'Days_ago' compares two string literals, not the column aliases. You should use the following query to get the expected output.
SELECT * FROM
(SELECT
date,
user_email,
DATE_FROM_UNIX_DATE(gmail.last_interaction_time) AS Last_Interaction_Date,
DATE_ADD(CURRENT_DATE(), INTERVAL -90 DAY) AS Days_ago
FROM
`bqadminreporting.adminlogtracking.usage`)
WHERE
Last_Interaction_Date >= Days_ago
Presumably, your problem is DATE_FROM_UNIX_DATE(). Without sample data, it is not really possible to determine what the issue is.
However, you don't need to convert to a date to do this. You can do all the work in the Unix seconds space:
select u.*
from `bqadminreporting.adminlogtracking.usage` u
where gmail.last_interaction_time >= unix_seconds(timestamp(current_date)) - 90 * 60 * 60 * 24
Note that I suspect that the issue is that last_interaction_time is really measured in milliseconds or microseconds or some other unit. This will prevent your error, but it might not do what you want.
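If the column really does contain out-of-range values, one way to keep the date conversion while tolerating the bad rows (a sketch; the SAFE. prefix makes the function return NULL instead of raising an error):
SELECT
  date,
  user_email,
  SAFE.DATE_FROM_UNIX_DATE(gmail.last_interaction_time) AS Last_Interaction_Date
FROM
  `bqadminreporting.adminlogtracking.usage`
WHERE
  SAFE.DATE_FROM_UNIX_DATE(gmail.last_interaction_time) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)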
I have one query:
SELECT CAST(((stats.ts_spawn - 1427835600) / 86400) * 86400 + 1427835600 AS INTEGER) AS anon_1
FROM stats
WHERE stats.ts_spawn > 1427835600
  AND stats.ts_spawn < 1428440399
GROUP BY anon_1
ORDER BY anon_1;
I'm expecting to get the start of each day in a week.
Result in PostgreSQL:
1427835600
1427922000
1428008400
1428094800
1428181200
1428267600
1428354000
Vertica returns start of each hour of each day of the week:
1427839200
1427842800
1427846400
1427850000
... and so on, for a total of 167 records (24 * 7 - 1).
I have no idea how to modify this query.
The Vertica query is obviously producing a float, not an integer, from the division. In the Vertica documentation we can read this:
the Vertica 6 release introduced a behavior change when dividing integers using the / operator
If you want the query to behave the same on both systems, either change the configuration option mentioned in that doc or use the FLOOR() function on the result of the division.
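For example, the original query with FLOOR() applied to the division, which should return the same day boundaries on both systems:
SELECT CAST(FLOOR((stats.ts_spawn - 1427835600) / 86400) * 86400 + 1427835600 AS INTEGER) AS anon_1
FROM stats
WHERE stats.ts_spawn > 1427835600
  AND stats.ts_spawn < 1428440399
GROUP BY anon_1
ORDER BY anon_1;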
I'm trying to get an estimate of how many hours people worked during a set period of time. I want to show this by department and by what area they were working in. Right now I have this:
SELECT M.MemberDepartmentID,T.TaskName,
COUNT(DATEDIFF(HOUR, TT.StartTime, TT.EndTime)) 'Hours',
AVG(DATEDIFF(HOUR, TT.StartTime, TT.EndTime)) Average
FROM Member.TaskTracking TT
LEFT OUTER JOIN Member.Task T
ON TT.TaskID=T.TaskID
JOIN dbo.tblMember M
ON TT.MemberID=M.MemberID
WHERE M.FullTime=1
AND M.EmployeeSalary=1
AND (TT.StartTime >= '2013-10-01'
AND TT.EndTime < '2013-11-01')
GROUP BY M.MemberDepartmentID,T.TaskName
ORDER BY M.MemberDepartmentID,T.TaskName
I don't know how to confirm if it's correct, but some are definitely showing averages of zero even if there were hours worked. And some averages are way higher than the hours worked. For instance, here are some of my results:
MemberDepartmentID  TaskName     Hours  Average
-----------------------------------------------
1                   Packing      25     0
1                   Picking      6      0
1                   PreScanning  38     7
4                   Picking      2      104
Suggestions?
First, it is important to note that DATEDIFF(HOUR) returns an integer, and it does not necessarily give a good reflection of how much time has actually passed. For example, these both yield 1:
SELECT DATEDIFF(HOUR, '03:59', '04:01'); -- 2 minutes (0.033333 hours)
SELECT DATEDIFF(HOUR, '03:01', '04:59'); -- 118 minutes (1.966666 hours)
And these both yield 0:
SELECT DATEDIFF(HOUR, '03:01', '03:59'); -- 58 minutes (0.966666 hours)
SELECT DATEDIFF(HOUR, '03:01', '03:02'); -- 1 minute (0.016666 hours)
Next, if you give SQL Server integers to divide, it's going to perform integer math. Meaning it will divide, but it will discard any remainder. This yields 0:
SELECT 3/4;
Even though the real answer is 0.75, and rounding would give 1 (not that either of those results is particularly meaningful). Now, extend that to averages.
DECLARE @d1 TABLE(a INT);
INSERT @d1 VALUES(3),(4);
SELECT AVG(a) FROM @d1;
This yields 3, not the 3.5 you would probably expect, for the same reasons as above.
Remembering that some of your tasks may have lasted up to 59 minutes, but would still yield an hour differential of 0, you could have, say, 4 tasks, three that lasted > 1 hour, and one that lasted < 1 hour. So your average calculation would essentially be:
SELECT (1+1+1+0)/4;
Which, as above, still yields 0.
If you want a meaningful average there, you should calculate the time spent more granularly than by hours. For example, you could perform the datediff in minutes:
SELECT DATEDIFF(MINUTE, '03:01', '04:59');
This yields 118. If you want to express that in hours, you could divide by 60.0 (the decimal is important) or multiply by 1.0:
SELECT DATEDIFF(MINUTE, '03:01', '04:59')/60.0;
SELECT 1.0*DATEDIFF(MINUTE, '03:01', '04:59')/60;
These both yield 1.966666. Much more meaningful to average such a result. So perhaps change your expression to:
Average = AVG(1.0*DATEDIFF(MINUTE, TT.StartTime, TT.EndTime)/60)
About the count, not sure what you're attempting to do there, but you may want to make similar adjustments to the calculation and probably consider using SUM. If you show some sample data and the results you expect, we can help more.
Also I recommend not escaping keyword aliases using 'single quotes' - some forms of this syntax are deprecated, and it makes your alias look like a string literal. First, try not to use keywords or otherwise invalid identifiers as aliases; but if you must, escape them with [square brackets].
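Putting those pieces together, a sketch of the revised query (assuming [Hours] should be the total time worked, expressed in hours):
SELECT M.MemberDepartmentID, T.TaskName,
       SUM(DATEDIFF(MINUTE, TT.StartTime, TT.EndTime)) / 60.0 AS [Hours],
       AVG(1.0 * DATEDIFF(MINUTE, TT.StartTime, TT.EndTime) / 60) AS Average
FROM Member.TaskTracking TT
LEFT OUTER JOIN Member.Task T
  ON TT.TaskID = T.TaskID
JOIN dbo.tblMember M
  ON TT.MemberID = M.MemberID
WHERE M.FullTime = 1
  AND M.EmployeeSalary = 1
  AND TT.StartTime >= '2013-10-01'
  AND TT.EndTime < '2013-11-01'
GROUP BY M.MemberDepartmentID, T.TaskName
ORDER BY M.MemberDepartmentID, T.TaskName;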