I am dealing with time values saved as strings in the form 57:44.6 (seconds, minutes, hours). I am trying to convert the column elements to datetime using pd.to_datetime, but the results are NaT. How can I change the format of the string to HH:MM:SS (6:44:57) before converting?
Provide the appropriate format specifier, '%S:%M.%H'. For example:
import pandas as pd
s = pd.Series(['57:44.6'])
dts = pd.to_datetime(s, format='%S:%M.%H')
# dts
# 0 1900-01-01 06:44:57
# dtype: datetime64[ns]
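If the goal is literally to rewrite the string as H:MM:SS before parsing, a regex replace can reorder the fields. This is only a minimal sketch, assuming every value has the shape seconds:minutes.hours; the names reordered and parsed are illustrative, not from the original post.

```python
import pandas as pd

s = pd.Series(['57:44.6'])

# Capture seconds, minutes and hours, then emit them as hours:minutes:seconds.
reordered = s.str.replace(r'^(\d+):(\d+)\.(\d+)$', r'\3:\2:\1', regex=True)

# The reordered strings can then be parsed with a conventional format.
parsed = pd.to_datetime(reordered, format='%H:%M:%S')
```

Values that do not match the pattern are left unchanged by the replace, so they would still need separate handling.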
Related
The objective of this post is to convert the columns ['Open Date', 'Close date'] to timestamp format.
I have tried the functions and examples from these links without any results.
Convert datetime to timestamp in Neo4j
Convert datetime pandas
Pandas to_dict() converts datetime to Timestamp
Really appreciate any ideas / comments / examples on how to do so.
Data Base Image
Column Characteristics:
Open Date datetime64[ns] and pandas.core.series.Series
Close date datetime64[ns] and pandas.core.series.Series
Finally, I have been using these libraries:
import pandas as pd
import numpy as np
from datetime import datetime, date, time, timedelta
First convert to a NumPy array with .values and cast to int64. The output is in nanoseconds, so floor-divide by 10 ** 9 to get seconds:
df['open_ts'] = df['Open_Date'].values.astype(np.int64) // 10 ** 9
df['close_ts'] = df['Close_Date'].values.astype(np.int64) // 10 ** 9
OR
If you want to avoid using numpy, you can also try:
df['open_ts'] = (df['Open_Date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype(int)
df['close_ts'] = (df['Close_Date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype(int)
Try them and report back here.
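For reference, here is a small self-contained sketch of the epoch conversion. It follows the approach described in the pandas time series guide (subtract the Unix epoch, then floor-divide by a one-second Timedelta). The sample dates are made up for illustration, and the column name Open_Date is taken from the answer above rather than from the asker's real data.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({
    'Open_Date': pd.to_datetime(['2020-01-01 00:00:00', '2020-01-02 00:00:00']),
})

# Subtract the Unix epoch, then floor-divide by one second to get whole seconds.
epoch = pd.Timestamp('1970-01-01')
df['open_ts'] = (df['Open_Date'] - epoch) // pd.Timedelta('1s')
```

This form does not depend on the column being stored at nanosecond resolution, which the int64 cast does assume.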
I have a column of dates in the format below in a pandas DataFrame.
What is the most effective way to convert
2021-11-06T21:54:35.825Z
to
2021-11-6 21:54:35
pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S') only returns 2021-11-06 without the timestamp
You can use the .dt accessor on a pandas Series followed by the .strftime method (dt.strftime) to format datetimes into the desired string representation.
import pandas as pd
import datetime
df = pd.DataFrame({'date': ["2021-11-06T21:54:35.825Z"]})
fmt = '%Y-%m-%d %H:%M:%S'
pd.to_datetime(df['date']).dt.strftime(fmt)
returns
0 2021-11-06 21:54:35
Name: date, dtype: object
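Or, if you don't want zero padding before the day, you can use fmt = '%Y-%m-%-d %H:%M:%S' (notice the hyphen between % and d). This results in 2021-11-6 21:54:35. Note that %-d is a platform-specific extension: it works on Linux and macOS, while on Windows the equivalent is %#d.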
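For a result that does not rely on platform-specific strftime flags such as %-d, one option is to assemble the string from pieces and take the day from .dt.day. This is a rough sketch under the same sample data as above; the variable names ts and formatted are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'date': ['2021-11-06T21:54:35.825Z']})
ts = pd.to_datetime(df['date'])

# Build the string in three pieces so the day comes from .dt.day (an integer)
# instead of a platform-specific strftime directive.
formatted = (
    ts.dt.strftime('%Y-%m-')
    + ts.dt.day.astype(str)
    + ts.dt.strftime(' %H:%M:%S')
)
```

The result is a Series of plain strings, just like the output of dt.strftime alone.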
I am dealing with timestamps that, according to Google's documentation, are:
A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".
So, for example, if the string is '2019-11-06T06:24:42.558008Z', then pd.to_datetime('2019-11-06T06:24:42.558008Z',infer_datetime_format=True) works and returns Timestamp('2019-11-06 06:24:42.558008').
However, letting pandas infer the format is slow, and I have many rows of data. What should I pass as the format parameter to speed up the processing?
You could use to_datetime with utc=True + tz_convert:
import pandas as pd
utc = pd.to_datetime('2019-11-06T06:24:42.558008Z', utc=True).tz_convert(None)
inferred = pd.to_datetime('2019-11-06T06:24:42.558008Z', infer_datetime_format=True)
print(utc == inferred)
Output
True
From the documentation on tz_convert:
A tz of None will convert to UTC and remove the timezone information.
Note that only doing:
utc = pd.to_datetime('2019-11-06T06:24:42.558008Z', utc=True) # or pd.to_datetime('2019-11-06T06:24:42.558008Z')
throws a TypeError exception when comparing with inferred:
TypeError: Cannot compare tz-naive and tz-aware timestamps
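To address the format parameter directly, an explicit format string for this timestamp shape might look like the sketch below. It assumes every value ends in a literal Z and has a fractional-seconds part; whether it is actually faster than inference depends on the pandas version, so it is worth timing on real data. The names raw and parsed are illustrative.

```python
import pandas as pd

raw = pd.Series(['2019-11-06T06:24:42.558008Z'])

# 'T' and 'Z' are matched as literal characters; %f consumes the fractional seconds.
parsed = pd.to_datetime(raw, format='%Y-%m-%dT%H:%M:%S.%fZ')
```

Because Z is treated as a literal here rather than as a UTC offset, passing utc=True as in the answer above is one way to get timezone-aware values if they are needed.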
Cannot convert a 'YYYY-MM' string to a YYYY-MM datetime using datetime64 for data in a pandas DataFrame.
np.datetime64 converts 'YYYY-MM' date strings to datetime when they are stored in a scalar or array, but not when the same data is accessed via a DataFrame.
What I want to do is convert a column of dates (format: YYYY-MM) to datetime data (with or without adding another column).
csv file data:
month, volume
2019-01, 100
2019-02, 110
Sample Code:
import pandas as pd
import numpy as np
df=pd.read_csv (r'file location')
df["date"]=df["month"].apply(np.datetime64)
# Input (month): 2013-01
# Expected output (date): 2013-01
# Actual output (date): 2013-01-01
So, datetime64 changes YYYY-MM to YYYY-MM-01.
(Also, YYYY is converted to YYYY-01-01)
Perhaps you're looking for pd.Period:
In [11]: df.date.apply(pd.Period, freq='M')
Out[11]:
0 2019-01
1 2019-02
Name: date, dtype: object
Similarly, but without the apply:
In [12]: pd.to_datetime(df.date).dt.to_period(freq='M')
Out[12]:
0 2019-01
1 2019-02
Name: date, dtype: object
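Put together with the CSV from the question, the Period approach could look like the following sketch. An in-memory string stands in for the file, and skipinitialspace=True is an assumption added here to cope with the spaces after the commas in the sample data.

```python
import io
import pandas as pd

# Stand-in for the CSV file in the question.
csv_text = "month, volume\n2019-01, 100\n2019-02, 110\n"
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Parse the YYYY-MM strings, then keep only month resolution as a Period.
df['date'] = pd.to_datetime(df['month'], format='%Y-%m').dt.to_period('M')
```

A Period column displays as 2019-01 rather than 2019-01-01, which is the output the question asks for.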
How do I convert a string in this format to a Pandas timestamp?
00:55:02:285
hours:minutes:seconds:milliseconds
I have a dataframe already with several columns in this format.
Pandas doesn't seem to recognize this format as a timestamp when I use any of the conversion functions, e.g. to_datetime().
Many Thanks.
I think you need the format parameter in to_datetime:
df = pd.DataFrame({'times':['00:55:02:285','00:55:02:285']})
print (df)
times
0 00:55:02:285
1 00:55:02:285
print (pd.to_datetime(df.times, format='%H:%M:%S:%f'))
0 1900-01-01 00:55:02.285
1 1900-01-01 00:55:02.285
Name: times, dtype: datetime64[ns]
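The 1900-01-01 date in that output is just the default filled in when the string carries no date. If the values are really durations, one possible follow-up is to subtract that default date to get timedeltas. This is a sketch, with durations as an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'times': ['00:55:02:285', '00:55:02:285']})
parsed = pd.to_datetime(df['times'], format='%H:%M:%S:%f')

# Parsing without a date fills in 1900-01-01; subtracting it leaves a duration.
durations = parsed - pd.Timestamp('1900-01-01')
```

Note that %f reads 285 as a fraction of a second (0.285 s), which matches the milliseconds meaning in the question.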