Pandas Time Series Conversion and Formatting - pandas

How do I convert a string in this format to a Pandas timestamp?
00:55:02:285
hours:minutes:seconds:milliseconds
I have a dataframe already with several columns in this format.
Pandas doesn't seem to recognize this format as a timestamp when I use any of the conversion functions, e.g. to_datetime().
Many Thanks.

I think you need the format parameter in to_datetime:
df = pd.DataFrame({'times': ['00:55:02:285', '00:55:02:285']})
print(df)
          times
0  00:55:02:285
1  00:55:02:285
print(pd.to_datetime(df.times, format='%H:%M:%S:%f'))
0   1900-01-01 00:55:02.285
1   1900-01-01 00:55:02.285
Name: times, dtype: datetime64[ns]
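Since the question mentions several columns in this format, the same format string can be applied to all of them in one step. A minimal sketch, assuming two hypothetical column names (start and end) that do not appear in the original post:

```python
import pandas as pd

# Hypothetical frame: the column names 'start' and 'end' are assumptions.
df = pd.DataFrame({
    'start': ['00:55:02:285', '01:10:00:000'],
    'end': ['00:56:12:500', '01:15:30:250'],
})

time_cols = ['start', 'end']
fmt = '%H:%M:%S:%f'  # hours:minutes:seconds:fraction of a second

# DataFrame.apply calls pd.to_datetime once per column and forwards `format`.
parsed = df[time_cols].apply(pd.to_datetime, format=fmt)

# If only the time of day matters, drop the placeholder 1900-01-01 date.
times_only = parsed['start'].dt.time

print(parsed.dtypes)
print(times_only)
```

The 1900-01-01 date is just the default that to_datetime fills in when the format carries no date fields, so .dt.time is one way to discard it.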

Related

convert pandas datetime64[ns] to julian day

I am confused by the number of data type conversions and seemingly very different solutions to this, none of which I can get to work.
What is the best way to convert a pandas datetime column (datetime64[ns], e.g. 2017-01-01 03:15:00) to another column in the same pandas dataframe, converted to Julian day, e.g. 2458971.8234259?
Many thanks
Create a DatetimeIndex and convert it to Julian dates:
df = pd.DataFrame({'dates':['2017-01-01 03:15:00','2017-01-01 03:15:00']})
df['dates'] = pd.to_datetime(df['dates'])
df['jul1'] = pd.DatetimeIndex(df['dates']).to_julian_date()
# if you need to remove the time part
df['jul2'] = pd.DatetimeIndex(df['dates']).floor('d').to_julian_date()
print (df)
dates jul1 jul2
0 2017-01-01 03:15:00 2.457755e+06 2457754.5
1 2017-01-01 03:15:00 2.457755e+06 2457754.5
The DatetimeIndex is needed because the Series .dt accessor does not provide to_julian_date:
df['jul'] = df['dates'].dt.to_julian_date()
AttributeError: 'DatetimeProperties' object has no attribute 'to_julian_date'
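The printed output above rounds the Julian dates to 2.457755e+06, which hides the fractional day. The sketch below is an illustration rather than part of the original answer: it cross-checks to_julian_date() against the number of days since the Unix epoch, using the standard offset 2440587.5 (the Julian date of 1970-01-01 00:00), and prints the full values. The column names jul_index and jul_manual are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'dates': pd.to_datetime(['2017-01-01 00:00:00',
                                            '2017-01-01 03:15:00'])})

# Route used in the answer: DatetimeIndex provides to_julian_date().
df['jul_index'] = pd.DatetimeIndex(df['dates']).to_julian_date()

# Cross-check: days elapsed since the Unix epoch plus the epoch's Julian date.
UNIX_EPOCH_JD = 2440587.5  # Julian date of 1970-01-01 00:00:00
elapsed_days = (df['dates'] - pd.Timestamp('1970-01-01')) / pd.Timedelta(days=1)
df['jul_manual'] = elapsed_days + UNIX_EPOCH_JD

# Show all digits instead of the rounded 2.457755e+06 notation.
with pd.option_context('display.float_format', '{:.6f}'.format):
    print(df)
```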

Convert MM:SS column to HH:MM:SS column in Pandas?

Convert MM:SS column to HH:MM:SS column in Pandas. I tried every possible way, like changing the datatype and using to_datetime and to_timedelta, but I couldn't convert the series. Please help. I am getting errors like:
(here ChipTime is in MM:SS format, which I want to change to HH:MM:SS)
df2["ChipTime"]=pd.to_datetime(df2.ChipTime, unit="hour").dt.strftime('%H:%M:%S')
ValueError: cannot cast unit hour
df2["ChipTime"]=pd.to_timedelta(df2["ChipTime"])
ValueError: expected hh:mm:ss format
df2["ChipTime"]=df2["ChipTime"].astype(int)
ValueError: invalid literal for int() with base 10: '16:48'
I have tried more methods; the above are some of them. I am a beginner in Pandas, so please excuse me if I have made a blunder. Thanks
If you convert the values to datetimes with the format parameter of to_datetime, a default year, month and day are added. If necessary, you can then convert the values to times with Series.dt.time:
df2 = pd.DataFrame({'ChipTime':['16:48','10:48']})
df2["ChipTime1"]=pd.to_datetime(df2.ChipTime, format="%M:%S")
df2["ChipTime11"]=pd.to_datetime(df2.ChipTime, format="%M:%S").dt.time
Or, for timedeltas, prepend 00: as the default hour before calling to_timedelta:
df2["ChipTime2"]=pd.to_timedelta('00:' + df2["ChipTime"])
print (df2)
ChipTime ChipTime1 ChipTime11 ChipTime2
0 16:48 1900-01-01 00:16:48 00:16:48 00:16:48
1 10:48 1900-01-01 00:10:48 00:10:48 00:10:48
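If the goal is an HH:MM:SS text column or a number to do arithmetic on, the two routes above can be taken one step further. This is a hedged sketch, not part of the original answer; the column names ChipTimeStr and ChipSeconds are invented for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({'ChipTime': ['16:48', '10:48']})

# Parse as minutes:seconds, then format back to text with a zero hour field.
parsed = pd.to_datetime(df2['ChipTime'], format='%M:%S')
df2['ChipTimeStr'] = parsed.dt.strftime('%H:%M:%S')

# Timedelta route: durations can be turned into plain seconds for arithmetic.
durations = pd.to_timedelta('00:' + df2['ChipTime'])
df2['ChipSeconds'] = durations.dt.total_seconds()

print(df2)
```

The timedelta route is usually the better fit for race times, because timedeltas can be added, averaged and compared without a placeholder date getting in the way.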

pandas string to date type conversion in proper format

I am getting date data in string format in pandas, like 10-Oct, 11-Oct, but I want to convert it to a date data type in this format: 2019-10-10, 2019-10-11.
Is there an easy way to do this in pandas?
Use to_datetime with the year appended to each string and the format parameter:
df = pd.DataFrame({'date':['10-Oct', '11-Oct']})
df['date'] = pd.to_datetime(df['date'] + '-2019', format='%d-%b-%Y')
print (df)
date
0 2019-10-10
1 2019-10-11
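The same idea with the year kept in a variable, plus an optional text column in YYYY-MM-DD form. This is only a sketch: the assumption that every row belongs to one known year, and the date_str column name, are not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'date': ['10-Oct', '11-Oct']})

year = 2019  # assumption: every row belongs to this year

# %b matches abbreviated month names, which may depend on the active locale.
df['date'] = pd.to_datetime(df['date'] + f'-{year}', format='%d-%b-%Y')

# Optional: a text column in YYYY-MM-DD form, e.g. for export.
df['date_str'] = df['date'].dt.strftime('%Y-%m-%d')

print(df.dtypes)
print(df)
```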

convert pandas datetime field with NaT entries to date

I have a Pandas dataframe with a field that is datetime datatype. Most of the values in the field are valid datetime values, but some are NaT.
I need to drop the time part of the datetime values for each value in the field, keeping the field as date datatype (not str). I tried the following:
df['mydate'] = df['mydate'].dt.date
It works fine if there are no NaT values in the column. However, if there are NaT values, it throws this error:
{AttributeError}Can only use .dt accessor with datetimelike values
I tried this alternative to skip over the NaT values:
df['mydate'] = [d.date if not pd.isnull(d) else None for d in df['mydate']]
but this converted the values in the column to:
<built-in method date of Timestamp object at 0x000002A06F6501C8>
Please advise how to ignore or skip the NaT values in the field when converting. I've had no luck googling for an answer, and I am trying to avoid looping over the entire dataframe with iterrows().
First convert the values to datetimes with to_datetime; after that, dt.date works as expected:
df = pd.DataFrame({'mydate':['2015-04-04','2018-09-10', np.nan]})
df['new'] = pd.to_datetime(df['mydate'], errors='coerce').dt.date
print (df)
mydate new
0 2015-04-04 2015-04-04
1 2018-09-10 2018-09-10
2 NaN NaT
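Two related variations, offered as a sketch rather than as part of the original answer. The list comprehension in the question returned bound methods because d.date was never called; adding parentheses fixes that. Separately, dt.normalize() sets the time to midnight while keeping the datetime64 dtype, which may be preferable to a column of Python date objects. The sample timestamps and the column names midnight and as_date are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'mydate': ['2015-04-04 10:30:00',
                              '2018-09-10 23:59:59',
                              np.nan]})
df['mydate'] = pd.to_datetime(df['mydate'], errors='coerce')

# Keeps dtype datetime64[ns]; the time part becomes 00:00:00, NaT stays NaT.
df['midnight'] = df['mydate'].dt.normalize()

# The question's comprehension, with date() actually called.
df['as_date'] = [d.date() if pd.notna(d) else None for d in df['mydate']]

print(df.dtypes)
print(df)
```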

Anomaly using NumPy datetime64 to work with dates in a pandas DataFrame column

Cannot convert 'YYYY-MM' string to YYYY-MM datetime using datetime64 for data in pandas DataFrame.
np.datetime64 works to convert date string(s) of 'YYYY-MM' to datetime when stored in a scalar or array, but not when the same data is accessed via a DataFrame.
What I want to do is convert a column of dates (format: YYYY-MM) to datetime data (with or without adding another column).
csv file data:
month, volume
2019-01, 100
2019-02, 110
Sample Code:
import pandas as pd
import numpy as np
df=pd.read_csv (r'file location')
df["date"]=df["month"].apply(np.datetime64)
# Input (month): 2013-01
# Expected output (date): 2013-01
# Actual output (date): 2013-01-01
So, datetime64 changes YYYY-MM to YYYY-MM-01
(Also, YYYY is converted to YYYY-01-01)
Perhaps you're looking for pd.Period:
In [11]: df.date.apply(pd.Period, freq='M')
Out[11]:
0 2019-01
1 2019-02
Name: date, dtype: object
Similarly, but without the apply:
In [12]: pd.to_datetime(df.date).dt.to_period(freq='M')
Out[12]:
0 2019-01
1 2019-02
Name: date, dtype: object
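For completeness, a self-contained sketch that starts from the CSV shown in the question. It reads the data from an in-memory string instead of a file, and it assumes the blank after each comma should be ignored (skipinitialspace=True); the column names period and month_start are invented for illustration.

```python
import io

import pandas as pd

# Inline copy of the CSV from the question; skipinitialspace drops the
# blank after each comma so the columns are named 'month' and 'volume'.
csv_text = "month, volume\n2019-01, 100\n2019-02, 110\n"
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Monthly periods print as YYYY-MM with no day component.
df['period'] = pd.to_datetime(df['month'], format='%Y-%m').dt.to_period('M')

# A period can be expanded back to the first instant of its month.
df['month_start'] = df['period'].dt.to_timestamp()

print(df)
```

A datetime64 value always denotes a single instant, so it has to carry a day; a Period represents the whole month, which is why it can display as YYYY-MM.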