pandas: convert a timedelta column into a single time unit - pandas

I am working with a pandas DataFrame that has a deltatime column with values like:
{'deltatime': 0 days 09:06:30, 0 days 00:30:34, 2 days 23:07:14}
How can I convert those times into a single unit, such as minutes or hours, so that they are easier to visualize in a graph?
Any ideas?
The user guide here:
https://pandas.pydata.org/pandas-docs/stable/user_guide/timedeltas.html
does not make it clear how to simply change units.

You can use the total_seconds() method of a timedelta to get its duration in seconds:
seconds = [t.total_seconds() for t in df['deltatime']]
And then convert the units if desired:
# to hours: 1 hour = 3600 seconds
hours = [t.total_seconds()/3600 for t in df['deltatime']]

You can do:
df['deltatime'] = pd.to_timedelta(df['deltatime']).dt.total_seconds()
Output:
deltatime
0 32790.0
1 1834.0
2 256034.0
You can also perform arithmetic directly on timedeltas, for example by dividing the original timedelta column by a one-hour Timedelta:
# convert to hours
pd.to_timedelta(df['deltatime']) / pd.Timedelta(hours=1)
Output:
0 9.108333
1 0.509444
2 71.120556
Name: deltatime, dtype: float64
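As a small end-to-end sketch of the two approaches above, the following rebuilds the three sample values from the question and expresses them in minutes and in hours. The column names minutes and hours are only illustrative and do not come from the original posts.

```python
import pandas as pd

# Sample values taken from the question.
df = pd.DataFrame({
    "deltatime": pd.to_timedelta(
        ["0 days 09:06:30", "0 days 00:30:34", "2 days 23:07:14"]
    )
})

# Dividing by a Timedelta of one unit yields plain floats in that unit.
df["minutes"] = df["deltatime"] / pd.Timedelta(minutes=1)

# Equivalent route via total_seconds(), then scaling to hours.
df["hours"] = df["deltatime"].dt.total_seconds() / 3600

print(df)
```

Either float column can then be passed straight to a plotting call, since it no longer has a timedelta dtype.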

Related

Converting an nvarchar into a custom time format to query upon duration

I am working with a table that has a variety of column types in rather specific and strange formats. Specifically I have a column, 'Total_Time' that measures a duration in the format:
days:hours:minutes (d:hh:mm)
e.g. 200:10:03 represents 200 days, 10 hours and 3 minutes.
I want to be able to run queries against this duration in order to filter upon time durations such as
SELECT * FROM [TestDB].[dbo].[myData] WHERE Total_Time < 0:1:20
Ideally this would provide me with a list of entries whose total time duration is less than 1 hour and 20 minutes. I'm not aware of how this is possible in an nvarchar format so I would appreciate any advice on how to approach this problem. Thanks in advance...
I would suggest converting that value to minutes, and then passing the parametrised value as minutes as well.
If we can assume that there will always be a days, hours, and minutes section (so N'0:0:10' would be used to represent 10 minutes) you could do something like this:
SELECT *
FROM (VALUES(N'200:10:03'))V(Duration)
CROSS APPLY (VALUES(CHARINDEX(':',V.Duration)))H(CI)
CROSS APPLY (VALUES(CHARINDEX(':',V.Duration,H.CI+1)))M(CI)
CROSS APPLY (VALUES(TRY_CONVERT(int,LEFT(V.Duration,H.CI-1)),TRY_CONVERT(int,SUBSTRING(V.Duration,H.CI+1, M.CI - H.CI-1)),TRY_CONVERT(int, STUFF(V.Duration,1,M.CI,''))))DHM(Days,Hours,Minutes)
CROSS APPLY (VALUES((DHM.Days*60*24) + (DHM.Hours * 60) + DHM.Minutes))D(Minutes)
WHERE D.[Minutes] < 80; --1 hour 20 minutes = 80 minutes
If you can, then ideally you should be fixing your design and just storing the value as a consumable value (like an int representing the number of minutes), or at least adding a computed column (likely PERSISTED and indexed appropriately) so that you can just reference that.
If you're on SQL Server 2022+, you could do something like this, which is less "awful" to look at:
SELECT *
FROM (VALUES(N'200:10:03'))V(Duration)
CROSS APPLY(SELECT SUM(CASE SS.ordinal WHEN 1 THEN TRY_CONVERT(int,SS.[value]) * 60 * 24
WHEN 2 THEN TRY_CONVERT(int,SS.[value]) * 60
WHEN 3 THEN TRY_CONVERT(int,SS.[value])
END) AS Minutes
FROM STRING_SPLIT(V.Duration,':',1) SS)D
WHERE D.[Minutes] < 80; --1 hour 20 minutes = 80 minutes;
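The day/hour/minute arithmetic that both queries rely on can be checked outside the database. The following is only a sketch in Python, not part of the SQL solution; the function name dhm_to_minutes is hypothetical, and it assumes all three parts are always present, as stated above.

```python
def dhm_to_minutes(duration: str) -> int:
    """Convert a 'd:hh:mm' string into a total number of minutes."""
    days, hours, minutes = (int(part) for part in duration.split(":"))
    return days * 24 * 60 + hours * 60 + minutes


# The example value from the question: 200 days, 10 hours, 3 minutes.
print(dhm_to_minutes("200:10:03"))

# The filter threshold: 1 hour 20 minutes.
print(dhm_to_minutes("0:1:20"))
```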

Combining data from one column into one with multiplication

I am trying to find a way to add up amounts that share the same ID but have different units, performing a multiplication or division on each amount before adding them together.
The column time describes the amount of time spent doing a certain task.
There are four different values the time column can have which are:
- Uur (which stands for Hours)
- Minuut (which stands for Minutes)
- Etmaal (which stands for 24 hours)
- Dagdeel (which stands for 4 hours)
What I'd like is to transform them all into hours which should eventually return the row:
ID | Amount | Time |
---------------------
82 | 1690634 | Uur |
So only one unit remains.
This means rows that contain minuut will have their amount divided by 60
Rows that contain etmaal will have their amount multiplied by 24
and rows that contain dagdeel will have their amount multiplied by 4
I think the logic is:
select id,
(case when time = 'minuut' then amount / 60
when time = 'etmaal' then amount * 24
when time = 'dagdeel' then amount * 4
when time = 'uur' then amount
end) as amount,
time
from table
Please use the query below:
select id,
case when time = 'minuut' then amount/60
when time = 'etmaal' then amount*24
when time = 'dagdeel' then amount*4
when time = 'uur' then amount end as amount,
time
from table;
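Neither query above adds the converted amounts together per ID, which is what the question ultimately asks for. As a sketch of the full convert-then-sum logic, here it is in pandas rather than SQL. The sample rows and the HOURS_PER_UNIT name are made up for illustration; only the unit names and conversion factors come from the question.

```python
import pandas as pd

# Hypothetical rows; only the unit names come from the question.
df = pd.DataFrame({
    "id": [82, 82, 82, 82, 83],
    "amount": [120, 1, 2, 5, 30],
    "time": ["Minuut", "Etmaal", "Dagdeel", "Uur", "Minuut"],
})

# Hours represented by one unit of each kind.
HOURS_PER_UNIT = {"uur": 1, "minuut": 1 / 60, "etmaal": 24, "dagdeel": 4}

# Lower-case the unit so the lookup does not depend on capitalisation.
factor = df["time"].str.lower().map(HOURS_PER_UNIT)
df["hours"] = df["amount"] * factor

# One row per id, everything expressed in hours.
totals = df.groupby("id", as_index=False)["hours"].sum()
print(totals)
```

In SQL the same idea would be the CASE expression wrapped in SUM(...) with GROUP BY id.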

Querying Timedelta df column

# return rows that are < 1 hour
df = df[df['timedeltas'].dt.total_seconds() < 3600]
After this query I am left with rows that are Timedelta(-1 days, +23:00:00)
And even when I compare pd.Timedelta(days=-1,hours=23) <= pd.Timedelta(hours=1) I get True. How come?
I am looking for a way to exclude any rows that are >= 1 hour (01:00:00). Thanks
Those rows really do have a total of -3600 seconds. It looks strange, but it makes sense once you note that pd.Timedelta(days=-1,hours=23) is the same as pd.Timedelta(hours=-1): a negative timedelta is displayed as a negative number of days plus a positive time. For example, if your origin is the midnight between 01/04/2020 and 02/04/2020, then pd.Timedelta(days=-1,hours=23) points to 23:00 on 01/04/2020. If you want to remove such rows, also require the value to be non-negative:
df = df[(df['timedeltas'].dt.total_seconds() < 3600) & (df['timedeltas'].dt.total_seconds() >= 0)]
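A minimal sketch that reproduces the behaviour, using a made-up Series instead of the asker's data. It compares against Timedelta objects directly, which is an alternative to going through total_seconds().

```python
import pandas as pd

# Made-up values: one negative, one inside the range, two at or over 1 hour.
s = pd.Series([
    pd.Timedelta(hours=-1),
    pd.Timedelta(minutes=30),
    pd.Timedelta(hours=1),
    pd.Timedelta(hours=2),
])

# "-1 days +23:00:00" is just how minus one hour is displayed.
same = pd.Timedelta(days=-1, hours=23) == pd.Timedelta(hours=-1)
print(same, s.iloc[0].total_seconds())

# Keep only non-negative durations strictly shorter than one hour.
mask = (s >= pd.Timedelta(0)) & (s < pd.Timedelta(hours=1))
kept = s[mask]
print(kept)
```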

Postgresql - Query to multiply time by number

I'm creating a query to multiply a time by a number.
The time is the result of a query for a person's annual working hours. The goal is to calculate 25% of it, so I want to multiply the time by 0.25.
Example:
Time1: 100:00:00
[format H:M:s]
For this I've tried this query:
SELECT '100:00:00'::time * 0.25
PostgreSQL returns:
ERROR: the time / date value is out of range: «100:00:00»
LINE 2: SELECT '100:00:00'::time * 0.25
The error is clear: the conversion is out of range because a time value is only valid up to 24 hours. This case works fine:
SELECT '24:00:00'::time * 0.25
Result:
"06:00:00"
Does anyone have an idea?
Time values are limited to 24 hours. You can do the calculation using other units. For instance, the following translates this to seconds:
select 0.25 * ( (string_to_array('100:00:00', ':'))[1]::int*60*60 +
(string_to_array('100:00:00', ':'))[2]::int*60 +
(string_to_array('100:00:00', ':'))[3]::int
) as num_seconds
Note that you also cannot represent the result as a time, because 25 hours is again past the 24-hour limit.
You can use an INTERVAL, as in:
select interval '100 hours 0 minutes 0 seconds' * 0.25
or:
select interval '100:00:00' * 0.25
or:
select '100:00:00'::interval * 0.25
The result (for all of them):
?column?
--------
25:00:00
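For comparison, the same calculation can be sketched outside PostgreSQL with Python's datetime.timedelta, which, like INTERVAL, is not capped at 24 hours. The helper name parse_hms is hypothetical, and this is only a cross-check of the arithmetic, not a replacement for the queries above.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'H:M:S' string whose hour part may exceed 24."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


quarter = parse_hms("100:00:00") * 0.25

# Same figure as the num_seconds query: 0.25 * 360000 seconds.
print(quarter.total_seconds())

# Same figure as the interval result, expressed in hours.
print(quarter.total_seconds() / 3600)
```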

Calculate average age of items in ruby on rails

I want to find the average age of Ticket.opened, where opened is a scope on my Ticket model.
This returns a nice result for one ticket:
time_ago_in_words(Ticket.last.created_at.to_time)
What I want is to do something like this:
age = []
Ticket.opened.each do |t|
age = time_ago_in_words(t.created_at.to_time)+[]
end
average_age = age/Ticket.opened.count
I am aware that this code is awful, but it is my best attempt at explaining what I am looking for.
If you use PostgreSQL then you could fetch average created_at time in a single SQL query:
avg_created_at = Ticket.opened.average('extract(epoch from tickets.created_at)')
time_ago_in_words(Time.at(avg_created_at))
The EXTRACT(EPOCH FROM ...) function returns the created_at value as the number of seconds since 1970-01-01 00:00:00 UTC, which can then be used in an aggregate function such as AVG.
I'm sure MySQL allows such queries too.
I believe you can get this by converting the times to integers and calculating the average. Roughly speaking:
times = [2.days.ago, 3.days.ago, 4.days.ago]
average_ticket_time = Time.at(times.map(&:to_i).sum / times.count) # convert to ints and get average
time_ago = Time.now - average_ticket_time # time ago from now
readable_time_ago = (time_ago / 60 / 60).round # divide by 60 secs then 60 mins; round to get closest hour
# => 72 (hours ago)
Or in one line:
((Time.now - (Time.at(times.map(&:to_i).sum / times.count))) / 60 / 60).round
So, on average, 72 hours ago.
You can also divide by an additional 24 to get the time in days: 3 days in this case, which is what you'd expect in this simple example.
In your example, your times array would be the following:
times = Ticket.opened.map { |ticket| ticket.created_at.to_i }
# or perhaps the following, check the performance:
times = Ticket.opened.pluck(:created_at).map(&:to_i)
Hope that helps - let me know how you get on.
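The underlying idea (turn each timestamp into epoch seconds, average those, then measure the distance from now) is not specific to Ruby. As a language-neutral illustration, here is the same calculation sketched in Python with a fixed, made-up value for now so that the result is reproducible. None of the variable names come from the Rails code above.

```python
from datetime import datetime, timedelta, timezone

# Fixed reference point (an assumption, chosen only for reproducibility).
now = datetime(2020, 1, 10, tzinfo=timezone.utc)

# Stand-ins for created_at values: 2, 3 and 4 days before `now`.
created = [now - timedelta(days=d) for d in (2, 3, 4)]

# Average the timestamps as seconds since the Unix epoch.
epochs = [t.timestamp() for t in created]
avg_epoch = sum(epochs) / len(epochs)
avg_created = datetime.fromtimestamp(avg_epoch, tz=timezone.utc)

# Age of the average timestamp, rounded to the nearest hour.
age_hours = round((now - avg_created).total_seconds() / 3600)
print(age_hours)
```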