I have a Pandas dataframe that looks like:
YYYYMMDD HH
0 19900101 1
1 19900101 2
2 19900101 3
3 19900101 4
4 19900101 5
With
YYYYMMDD: year-month-day values as integers.
HH: hourly values as integers.
I want to add a 'DateTime' column of datetime dtype, formatted as yyyy-mm-dd hh:mm:ss (so mm:ss = 00:00).
df['DateTime'] = df['YYYYMMDD'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))
This gives me yyyy-mm-dd as a datetime. Is there a neat way to add the 'HH' values to it?
Use pd.to_datetime and pd.to_timedelta:
df['Datetime'] = (pd.to_datetime(df['YYYYMMDD'],format='%Y%m%d')
.add(pd.to_timedelta(df['HH'], 'h')))
print(df)
YYYYMMDD HH Datetime
0 19900101 1 1990-01-01 01:00:00
1 19900101 2 1990-01-01 02:00:00
2 19900101 3 1990-01-01 03:00:00
3 19900101 4 1990-01-01 04:00:00
4 19900101 5 1990-01-01 05:00:00
df['DateTime'] = df['YYYYMMDD'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d')) + pd.to_timedelta(df.HH, unit='h')
Solved it.
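For reference, here is a minimal self-contained sketch of the same idea. The sample frame is made up for illustration, and the row with HH = 24 is an assumption (the question only shows hours 1 to 5); it is included to show that adding a timedelta simply rolls over to the next day.

```python
import pandas as pd

# Illustrative data; the HH = 24 row is an assumption, not from the question.
df = pd.DataFrame({
    "YYYYMMDD": [19900101, 19900101, 19900101],
    "HH": [1, 2, 24],
})

# Parse the integer dates via their string form, then add the hours as timedeltas.
dates = pd.to_datetime(df["YYYYMMDD"].astype(str), format="%Y%m%d")
df["DateTime"] = dates + pd.to_timedelta(df["HH"], unit="h")

print(df)
```

Because the whole column is converted at once, no apply/lambda is needed.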
Related
I have data that looks like this stored in a DataFrame, and I'm trying to convert the "DATE" column so that all dates are in yyyy-mm-dd format instead of yyyy-dd-mm. You can see the yyyy-dd-mm format where the "TIME" column rolls over to a new day. (Some of the dates not shown are already in YYYY-MM-DD format, but I want all of them in YYYY-MM-DD format.)
DATE TIME BAFFIN BAY GATUN II GATUN I KLONDIKE IIIG \
8778 2016-01-01 1900 8.926278 8.046583 7.649784 7.333993
8779 2016-01-01 2000 8.817666 4.395097 4.748931 6.672631
8780 2016-01-01 2100 8.704014 6.384826 7.128692 6.115349
8781 2016-01-01 2200 8.496358 8.261933 8.166153 6.242737
8782 2016-01-01 2300 8.434297 4.656991 5.894877 5.781445
8783 2016-02-01 0000 8.528372 3.056838 3.086056 5.023564
8784 2016-02-01 0100 8.783731 4.614589 4.894076 5.042875
8785 2016-02-01 0200 8.572500 3.860174 4.641366 5.174426
8786 2016-02-01 0300 8.279557 2.076971 2.644479 5.492729
8787 2016-02-01 0400 8.378920 3.562210 2.806703 5.356025
I'm trying to convert the "DATE" column to a datetime column by specifying the format, but it does nothing:
df2['DATE'] = pd.to_datetime(df2['DATE'],format='%Y-%m-%d')
thank you in advance for your help!
Can you try this?
pd.to_datetime(df['DATE'], dayfirst=True)
0 2016-01-01
1 2016-01-01
2 2016-01-01
3 2016-01-01
4 2016-01-01
5 2016-01-02
6 2016-01-02
7 2016-01-02
8 2016-01-02
9 2016-01-02
Consider joining 'DATE' and 'TIME' to get a complete datetime column. Assuming both columns are of dtype object (string), you can concatenate them with the + operator and then call pd.to_datetime with an explicit format. Example:
import pandas as pd
df = pd.DataFrame({'DATE': ['2016-01-01', '2016-02-01'],
'TIME': ['1900', '0000']})
df['DateTime'] = pd.to_datetime(df['DATE']+df['TIME'], format='%Y-%d-%m%H%M')
# df['DateTime']
# 0 2016-01-01 19:00:00
# 1 2016-01-02 00:00:00
# Name: DateTime, dtype: datetime64[ns]
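As a hedged illustration of why the original call with format='%Y-%m-%d' seemed to do nothing: the strings are already valid under that format, so they parse without error, just with month and day swapped relative to what is intended. The two sample strings below are taken from the question; the variable names are only for this sketch.

```python
import pandas as pd

raw = pd.Series(["2016-01-01", "2016-02-01"])

# Reads the middle field as the month: second row becomes 1 February.
month_first = pd.to_datetime(raw, format="%Y-%m-%d")

# Reads the middle field as the day: second row becomes 2 January.
day_first = pd.to_datetime(raw, format="%Y-%d-%m")

print(pd.DataFrame({"raw": raw, "Y-m-d": month_first, "Y-d-m": day_first}))
```

Note that a datetime64 column is always displayed as YYYY-MM-DD regardless of the format used to parse it; the format string only controls how the input text is interpreted.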
I have an 'hour' column in a pandas dataframe that is simply a list of numbers from 0 to 23 representing hours. How can I convert them to an hour format such as 01:00, both when the number has a single digit (like 1) and when it has two digits (like 18)? Single-digit numbers need a leading zero, a colon and two trailing zeros; double-digit numbers need only the colon and two trailing zeros. How can this be done in a dataframe? I also have a 'date' column that needs to be merged with the hour column once the hour column is converted.
e.g. date hour
2018-07-01 0
2018-07-01 1
2018-07-01 3
...
2018-07-01 21
2018-07-01 22
2018-07-01 23
Needs to look like:
date
2018-07-01 01:00
...
2018-07-01 23:00
The source of the data is a .csv file.
Thanks for your consideration. I'm new to pandas and I can't find in their documentation how to do this considering the single and double digit numbers.
Convert the hours to timedeltas with to_timedelta and add them to the dates, converting those with to_datetime first if necessary:
df['date'] = pd.to_datetime(df['date']) + pd.to_timedelta(df['hour'], unit='h')
print (df)
date hour
0 2018-07-01 00:00:00 0
1 2018-07-01 01:00:00 1
2 2018-07-01 03:00:00 3
3 2018-07-01 21:00:00 21
4 2018-07-01 22:00:00 22
5 2018-07-01 23:00:00 23
If you also need to remove the hour column, use DataFrame.pop:
df['date'] = pd.to_datetime(df['date']) + pd.to_timedelta(df.pop('hour'), unit='h')
print (df)
date
0 2018-07-01 00:00:00
1 2018-07-01 01:00:00
2 2018-07-01 03:00:00
3 2018-07-01 21:00:00
4 2018-07-01 22:00:00
5 2018-07-01 23:00:00
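If the zero-padded text form from the question (01:00, 23:00) is what you actually need, for display or export, a sketch along these lines may help. The column names hour_str, datetime and label are hypothetical, chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2018-07-01"] * 3, "hour": [0, 1, 23]})

# Text form of the hour: pad to two digits and append ":00".
df["hour_str"] = df["hour"].astype(str).str.zfill(2) + ":00"

# Real datetimes, then a string label without the seconds.
df["datetime"] = pd.to_datetime(df["date"]) + pd.to_timedelta(df["hour"], unit="h")
df["label"] = df["datetime"].dt.strftime("%Y-%m-%d %H:%M")

print(df)
```

The label column is plain text, so keep the datetime column around if you still need to sort, resample or do arithmetic on the values.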
I have a Pandas dataframe containing a datetime column, in which all the values are formatted like this:
25/09/15 12:00:00. I'd like to reformat this field in all the rows, in order to match this format: 25.09.15 12:00.
Here some sample data:
Date | Value
25/08/15 12:00:00 | 49.0
25/08/15 13:00:00 | 49.5
The date column datatype is string.
Thank you in advance
Use Series.dt.strftime to format the datetime column:
df
Date Value
0 2015-08-25 12:00:00 49.0
1 2015-08-25 13:00:00 49.5
df['Date'] = df['Date'].dt.strftime('%Y.%m.%d %H:%M')
df
Date Value
0 2015.08.25 12:00 49.0
1 2015.08.25 13:00 49.5
If the column type is str, then you need to convert it to datetime first. Passing an explicit format avoids day/month ambiguity:
df.Date = pd.to_datetime(df.Date, format='%d/%m/%y %H:%M:%S')
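The question asks for the dd.mm.yy layout (25.09.15 12:00), so here is a small sketch of the full round trip with that target format, assuming the input strings always look like the sample rows:

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["25/08/15 12:00:00", "25/08/15 13:00:00"],
    "Value": [49.0, 49.5],
})

# Parse day/month/two-digit-year strings explicitly.
parsed = pd.to_datetime(df["Date"], format="%d/%m/%y %H:%M:%S")

# Write them back as text in the requested dd.mm.yy HH:MM layout.
df["Date"] = parsed.dt.strftime("%d.%m.%y %H:%M")

print(df)
```

After strftime the column holds strings again, so do any date arithmetic on the parsed values before formatting.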
I have a dataframe in 1 column with all different times.
Time
-----
10:00
11:30
12:30
14:10
...
I need to do a quantile range on this dataframe with the code below:
df.quantile([0,0.5,1],numeric_only=False)
According to the documentation linked below, quantile should work:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.quantile.html
Because my column has object dtype, I need to convert it to a datetime type such as pd.Timestamp.
When I convert it with pd.to_datetime, a date is added to every time value.
If I then format it as %H:%M, the column turns back into object, which quantile cannot handle.
How can I convert to datetime format in %H:%M and still stick to datetime format?
Below was the code I used:
df = pd.DataFrame({"Time":["10:10","09:10","12:00","13:23","15:23","17:00","17:30"]})
df['Time2'] = pd.to_datetime(df['Time']).dt.strftime('%H:%M')
df['Time2'] = df['Time2'].astype('datetime64[ns]')
How can I convert to datetime format in %H:%M and still stick to datetime format?
That is not possible in pandas, because a datetime always carries a date. The closest alternative is to use timedeltas:
df = pd.DataFrame({"Time":["10:10","09:10","12:00","13:23","15:23","17:00","17:30"]})
df['Time2'] = pd.to_timedelta(df['Time'].add(':00'))
print (df)
Time Time2
0 10:10 10:10:00
1 09:10 09:10:00
2 12:00 12:00:00
3 13:23 13:23:00
4 15:23 15:23:00
5 17:00 17:00:00
6 17:30 17:30:00
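A sketch of how the timedelta column can then be fed to quantile, which was the original goal. It uses Series.quantile on the sample data above; converting the result back to HH:MM text by adding an arbitrary reference date is just one possible approach, not the only one.

```python
import pandas as pd

df = pd.DataFrame({"Time": ["10:10", "09:10", "12:00", "13:23", "15:23", "17:00", "17:30"]})
df["Time2"] = pd.to_timedelta(df["Time"] + ":00")

# Quantiles of a timedelta column are themselves timedeltas.
q = df["Time2"].quantile([0, 0.5, 1])

# For display only: anchor on an arbitrary date and keep just the HH:MM part.
q_text = (q + pd.Timestamp("1970-01-01")).dt.strftime("%H:%M")

print(q)
print(q_text)
```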
I have the following dataframe:
Date = ['01-Jan','01-Jan','01-Jan','01-Jan']
Heure = ['00:00','01:00','02:00','03:00']
value =[1,2,3,4]
df = pd.DataFrame({'value':value,'Date':Date,'Hour':Heure})
print(df)
Date Hour value
0 01-Jan 00:00 1
1 01-Jan 01:00 2
2 01-Jan 02:00 3
3 01-Jan 03:00 4
I am trying to create a datetime index, knowing that the file I am working with is for 2015. I have tried a lot of things but can't get it to work! I tried converting only the day and the month, but even that does not work:
df.index = pd.to_datetime(df['Date'],format='%d-%m')
I expect the following result:
Date Hour value
2015-01-01 00:00:00 01-Jan 00:00 1
2015-01-01 01:00:00 01-Jan 01:00 2
2015-01-01 02:00:00 01-Jan 02:00 3
2015-01-01 03:00:00 01-Jan 03:00 4
Does anyone know how to do it?
Thanks,
You need to explicitly add 2015 somehow, and include the Hour column as well. I would do something like this:
df.index = pd.to_datetime(df.Date + '-2015 ' + df.Hour, format='%d-%b-%Y %H:%M')
>>> df
Date Hour value
2015-01-01 00:00:00 01-Jan 00:00 1
2015-01-01 01:00:00 01-Jan 01:00 2
2015-01-01 02:00:00 01-Jan 02:00 3
2015-01-01 03:00:00 01-Jan 03:00 4
You can parse without a year, which defaults to 1900, and then set the year with replace:
s=pd.to_datetime(df['Date']+df['Hour'],format='%d-%b%H:%M').apply(lambda x : x.replace(year=2015))
s
Out[131]:
0 2015-01-01 00:00:00
1 2015-01-01 01:00:00
2 2015-01-01 02:00:00
3 2015-01-01 03:00:00
dtype: datetime64[ns]
df.index=s
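A self-contained sketch of the first approach, with the year pulled out into a variable (the name year is hypothetical). It assumes English month abbreviations such as Jan, since %b depends on the locale.

```python
import pandas as pd

df = pd.DataFrame({
    "value": [1, 2, 3, 4],
    "Date": ["01-Jan"] * 4,
    "Hour": ["00:00", "01:00", "02:00", "03:00"],
})

year = 2015  # known from context; the file itself carries no year

# Build strings like "01-Jan-2015 00:00" and parse them in one pass.
stamps = df["Date"] + f"-{year} " + df["Hour"]
df.index = pd.to_datetime(stamps, format="%d-%b-%Y %H:%M")

print(df)
```

Putting the year into the string before parsing keeps everything vectorised and avoids the per-row apply needed when the year is replaced afterwards.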