How to convert a timedelta into a numeric type so that I can apply the mean() operation? - data-science

I compute df["Request_Closing_Time"] = (df['Closed Date'] - df['Created Date']). As "Request_Closing_Time" is a timedelta type, how do I convert it into an int so that I can find the average Request_Closing_Time of each complaint type?
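One possible approach, sketched under assumptions: a timedelta column can be turned into a float number of seconds with .dt.total_seconds(), after which mean() works per group. The sample rows and the column name 'Complaint Type' are made up for illustration; only 'Created Date', 'Closed Date' and 'Request_Closing_Time' come from the question.

```python
import pandas as pd

# Made-up sample data; 'Complaint Type' is an assumed column name.
df = pd.DataFrame({
    "Complaint Type": ["Noise", "Noise", "Parking"],
    "Created Date": pd.to_datetime(
        ["2015-01-01 10:00", "2015-01-01 12:00", "2015-01-02 08:00"]),
    "Closed Date": pd.to_datetime(
        ["2015-01-01 11:00", "2015-01-01 15:00", "2015-01-02 08:30"]),
})

# The subtraction yields a timedelta column, as in the question.
df["Request_Closing_Time"] = df["Closed Date"] - df["Created Date"]

# Convert the timedelta to plain float seconds so ordinary numeric
# aggregation applies.
df["closing_seconds"] = df["Request_Closing_Time"].dt.total_seconds()

# Average closing time (in seconds) for each complaint type.
avg_seconds = df.groupby("Complaint Type")["closing_seconds"].mean()
print(avg_seconds)
```

If a timedelta is preferred for display, pd.to_timedelta(avg_seconds, unit='s') converts the averages back.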

Related

Pandas DataFrame Time Conversion

How do I convert a value shown on a stopwatch, for example '01:31:41', into seconds? E.g. this is 1 minute and 31 seconds, so about 91 seconds.
Split the string into a list object.
Create a timedelta object without the milliseconds, as they aren't needed.
Call the timedelta method total_seconds().
from datetime import timedelta
original_time_string = "01:30:12"
list_string = original_time_string.split(":")
print((timedelta(minutes=int(list_string[0]),seconds=int(list_string[1])).total_seconds()))
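Since the question is about a pandas DataFrame, the same idea can be applied to a whole column. This is only a sketch: the column name 'lap' and the sample values are assumptions, and the third field is ignored just as in the snippet above.

```python
import pandas as pd

# Hypothetical column of stopwatch readings in "MM:SS:xx" form.
df = pd.DataFrame({"lap": ["01:31:41", "01:30:12", "00:45:99"]})

# Split on ":" into three columns and cast each piece to int.
parts = df["lap"].str.split(":", expand=True).astype(int)

# Minutes * 60 + seconds; the third field is dropped on purpose.
df["seconds"] = parts[0] * 60 + parts[1]
print(df)
```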

TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]

I have a column of years from the sunspots dataset.
I want to convert the integer column 'year' (e.g. 1992) to datetime format, then find the time delta, and eventually compute the cumulative total seconds to use as the time index column of a time series.
I am trying to use the following code but I get the error
TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]
sunspots_df['year'] = pd.to_timedelta(pd.to_datetime(sunspots_df['year'], format='%Y') ).dt.total_seconds()
pandas.Timedelta "[r]epresents a duration, the difference between two dates or times." So you're trying to get Python to tell you the difference between a particular datetime and...nothing. That's why it's failing.
If it's important that you store your index this way (and there may be better ways), then you need to pick a start datetime and compute the difference to get a timedelta.
For example, this code...
import pandas as pd
df = pd.DataFrame({'year': [1990,1991,1992]})
diff = (pd.to_datetime(df['year'], format='%Y') - pd.to_datetime('1990', format='%Y'))\
.dt.total_seconds()
...returns a Series whose values are seconds from January 1st, 1990. You'll note that it doesn't invoke pd.to_timedelta(), because it doesn't need to: the result of the subtraction is automatically a timedelta64[ns] Series.
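A small variation, offered as a sketch rather than the answerer's method: instead of hard-coding 1990 as the start, the first row of the column can serve as the origin. The years are cast to strings before parsing, which is an extra precaution added here.

```python
import pandas as pd

df = pd.DataFrame({"year": [1990, 1991, 1992]})

# Parse each year as January 1st of that year.
dates = pd.to_datetime(df["year"].astype(str), format="%Y")

# Subtract the first date so the series starts at zero seconds.
seconds = (dates - dates.iloc[0]).dt.total_seconds()
print(seconds)
```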

Convert DateTime to TimeStamp Pandas

The objective of this post is to convert the columns ['Open Date', 'Close date'] to timestamp format.
I have tried the functions / examples from these links without any results.
Convert datetime to timestamp in Neo4j
Convert datetime pandas
Pandas to_dict() converts datetime to Timestamp
Really appreciate any ideas / comments / examples on how to do so.
Column Characteristics:
Open Date datetime64[ns] and pandas.core.series.Series
Close date datetime64[ns] and pandas.core.series.Series
Finally, I have been using these libraries:
import pandas as pd
import numpy as np
from datetime import datetime, date, time, timedelta
First convert the column to a NumPy array with .values and cast it to int64. The output is in nanoseconds, so floor-divide by 10 ** 9 to get seconds:
df['open_ts'] = df['Open_Date'].values.astype(np.int64) // 10 ** 9
df['close_ts'] = df['Close_Date'].values.astype(np.int64) // 10 ** 9
OR
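If you want to avoid using numpy, you can subtract the Unix epoch and convert the resulting timedelta to seconds (passing a datetime column directly to pd.to_timedelta raises a TypeError):
df['open_ts'] = (df['Open_Date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype(int)
df['close_ts'] = (df['Close_Date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype(int)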
Try them and report it back here
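For reference, here is a self-contained sketch of the two routes to Unix epoch seconds, one through NumPy and one through pandas only. The sample dates are invented, and casting to datetime64[s] before int64 is a choice made here so the result does not depend on the column's stored resolution.

```python
import numpy as np
import pandas as pd

# Invented sample data using the answer's column name.
df = pd.DataFrame({
    "Open_Date": pd.to_datetime(["2021-01-01 00:00:00",
                                 "2021-01-01 00:01:00"]),
})

# NumPy route: truncate to whole seconds, then view as integers.
via_numpy = df["Open_Date"].values.astype("datetime64[s]").astype(np.int64)

# pandas-only route: subtract the epoch and floor-divide by one second.
via_pandas = (df["Open_Date"] - pd.Timestamp("1970-01-01")) // pd.Timedelta(seconds=1)

print(via_numpy.tolist())
print(via_pandas.tolist())
```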

Convert TZ datetime to timestamp

I have column of dates in the format below in a pandas dataframe.
What is the most effective way to convert
2021-11-06T21:54:35.825Z
to
2021-11-6 21:54:35
pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S') only returns 2021-11-06 without the timestamp
You can use the .dt accessor on a pandas Series followed by the .strftime method (dt.strftime) to format the datetimes into the desired string representation.
import pandas as pd
import datetime
df = pd.DataFrame({'date': ["2021-11-06T21:54:35.825Z"]})
fmt = '%Y-%m-%d %H:%M:%S'
pd.to_datetime(df['date']).dt.strftime(fmt)
returns
0 2021-11-06 21:54:35
Name: date, dtype: object
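Or, if you don't want zero padding before the day, you can use fmt="%Y-%m-%-d %H:%M:%S" (notice the hyphen between % and d). This results in 2021-11-6 21:54:35. Note that %-d is a platform-specific extension: it works on Linux and macOS, while Windows uses %#d instead.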
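If the code has to run on more than one platform, one way to get an unpadded day without relying on platform-specific format codes is to assemble the string from its parts. This is a sketch, not the only option; the variable names are made up.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2021-11-06T21:54:35.825Z"]})
ts = pd.to_datetime(df["date"])

# Year and month via strftime, the day as a plain integer (no padding),
# then the time of day via strftime again.
formatted = (
    ts.dt.strftime("%Y-%m-")
    + ts.dt.day.astype(str)
    + ts.dt.strftime(" %H:%M:%S")
)
print(formatted.iloc[0])
```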

Timestamp String in Zulu Format To Datetime

I am dealing with timestamps that, according to Google's documentation, are:
A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".
So, for example, if the string is '2019-11-06T06:24:42.558008Z', then pd.to_datetime('2019-11-06T06:24:42.558008Z',infer_datetime_format=True) works and returns Timestamp('2019-11-06 06:24:42.558008').
However, letting pandas infer the format is slow, and I have many rows of data. What should I pass as the format parameter to speed up the processing?
You could use to_datetime with utc=True + tz_convert:
import pandas as pd
utc = pd.to_datetime('2019-11-06T06:24:42.558008Z', utc=True).tz_convert(None)
inferred = pd.to_datetime('2019-11-06T06:24:42.558008Z', infer_datetime_format=True)
print(utc == inferred)
Output
True
From the documentation on tz_convert:
A tz of None will convert to UTC and remove the timezone information.
Note that only doing:
utc = pd.to_datetime('2019-11-06T06:24:42.558008Z', utc=True) # or pd.to_datetime('2019-11-06T06:24:42.558008Z')
throws a TypeError exception when comparing with inferred:
TypeError: Cannot compare tz-naive and tz-aware timestamps
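To address the format parameter directly, here is a hedged sketch that passes an explicit format string and then drops the timezone as described above. It assumes every value has a fractional-seconds part of at most microsecond precision and ends in a literal Z; strings without a fraction would not match this format. The second sample value is invented.

```python
import pandas as pd

stamps = pd.Series([
    "2019-11-06T06:24:42.558008Z",
    "2014-10-02T15:01:23.045123Z",  # invented second sample
])

# Explicit format: the trailing Z is matched as a literal character.
fmt = "%Y-%m-%dT%H:%M:%S.%fZ"

# utc=True marks the values as UTC; tz_convert(None) then removes the
# timezone so the result is tz-naive.
parsed = pd.to_datetime(stamps, format=fmt, utc=True).dt.tz_convert(None)
print(parsed)
```

Whether this is faster than inference depends on the pandas version and the data, so it is worth timing on a sample of the real rows.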