Combine date column and time column into datetime - pandas

I have two columns (both text objects), one date, the other hour-ending.
df = pd.DataFrame({'Date' : ['2018-10-01', '2018-10-01', '2018-10-01'],
'Hour_Ending': ['1.0', '2.0', '3.0']})
How do I add the two columns together to get a datetime object that looks like this?
2018-10-01 01:00
As a bonus, how do I change Hour_Ending to Hour_Starting?

Using to_datetime and to_timedelta:
pd.to_datetime(df.Date)+pd.to_timedelta(df.Hour_Ending.astype('float'), unit='h')
Out[122]:
0 2018-10-01 01:00:00
1 2018-10-01 02:00:00
2 2018-10-01 03:00:00
dtype: datetime64[ns]
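The bonus part of the question (Hour_Ending to Hour_Starting) is not covered by the answer above. A minimal sketch, assuming an hour-ending stamp labels the hour that finished at that time, so the hour-starting stamp is simply one hour earlier. The column name Hour_Starting is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['2018-10-01', '2018-10-01', '2018-10-01'],
                   'Hour_Ending': ['1.0', '2.0', '3.0']})

# Timestamp marking the end of each hour (same expression as in the answer)
hour_ending = pd.to_datetime(df['Date']) + pd.to_timedelta(
    df['Hour_Ending'].astype(float), unit='h')

# Assumption: hour-starting = hour-ending minus one hour
df['Hour_Starting'] = hour_ending - pd.Timedelta(hours=1)
print(df['Hour_Starting'])
```

Subtracting after the conversion, rather than editing the text values, also handles an hour ending of 24.0 without special cases.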

Related

change multiple date time formats to single format in pandas dataframe

I have a DataFrame with multiple formats as shown below
0 07-04-2021
1 06-03-1991
2 12-10-2020
3 07/04/2021
4 05/12/1996
What I want is a single format after applying a pandas function to the entire column, so that all the dates are in the format
day/month/year
What I tried is the following
date1 = pd.to_datetime(df['Date_Reported'], errors='coerce', format='%d/%m/%Y')
But it is not working out. Can this be done? Thank you
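The explicit format='%d/%m/%Y' only matches the rows that use slashes, so the hyphenated rows are coerced to NaT. Try with dayfirst=True instead: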
date1=pd.to_datetime(df['Date_Reported'], errors='coerce',dayfirst=True)
output of date1:
0 2021-04-07
1 1991-03-06
2 2020-10-12
3 2021-04-07
4 1996-12-05
Name: Date_Reported, dtype: datetime64[ns]
If needed:
date1=date1.dt.strftime('%d/%m/%Y')
output of date1:
0 07/04/2021
1 06/03/1991
2 12/10/2020
3 07/04/2021
4 05/12/1996
Name: Date_Reported, dtype: object
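How pandas infers formats for mixed inputs has changed between releases, so the dayfirst approach may not behave identically everywhere. A version-independent sketch, assuming the only difference between rows is the separator ('-' versus '/'), is to normalise the separator first and then parse with one explicit format:

```python
import pandas as pd

df = pd.DataFrame({'Date_Reported': ['07-04-2021', '06-03-1991', '12-10-2020',
                                     '07/04/2021', '05/12/1996']})

# Replace '-' with '/' so that one explicit format fits every row
normalised = df['Date_Reported'].str.replace('-', '/', regex=False)
date1 = pd.to_datetime(normalised, format='%d/%m/%Y', errors='coerce')

# Back to day/month/year strings
formatted = date1.dt.strftime('%d/%m/%Y')
print(formatted)
```

With an explicit format there is no ambiguity about day versus month, and any row that still fails to parse shows up as NaT instead of being silently reinterpreted.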

Pandas add row to datetime indexed dataframe

I cannot find a solution for this problem. I would like to add future dates to a datetime indexed Pandas dataframe for model prediction purposes.
Here is where I am right now:
new_datetime = df2.index[-1:] # current end of datetime index
increment = '1 days' # string for increment - eventually will be in a for loop to add add'l days
new_datetime = new_datetime+pd.Timedelta(increment)
And this is where I am stuck. The append examples online always seem to use ignore_index=True, and in my case I want to keep the proper datetime indexing.
Suppose you have this df:
date value
0 2020-01-31 00:00:00 1
1 2020-02-01 00:00:00 2
2 2020-02-02 00:00:00 3
then an alternative for adding future days is
df.append(pd.DataFrame({'date': pd.date_range(start=df.date.iloc[-1], periods=6, freq='D', closed='right')}))
which returns
date value
0 2020-01-31 00:00:00 1.0
1 2020-02-01 00:00:00 2.0
2 2020-02-02 00:00:00 3.0
0 2020-02-03 00:00:00 NaN
1 2020-02-04 00:00:00 NaN
2 2020-02-05 00:00:00 NaN
3 2020-02-06 00:00:00 NaN
4 2020-02-07 00:00:00 NaN
where the frequency is D (daily) and periods is 6. Because closed='right' drops the start date, which is already in the frame, five new rows are added.
I think I was making this more difficult than necessary because I was using a datetime index instead of the typical integer index. By leaving the 'date' field as a regular column instead of an index, adding the rows is straightforward.
One thing I did do was add a reset_index call so I did not end up with wonky duplicate index values:
df = df.append(pd.DataFrame({'date': pd.date_range(start=df.date.iloc[-1], periods=21, freq='D', closed='right')}))
df = df.reset_index(drop=True) # renumber rows 0..n-1 and discard the old, duplicated index
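Recent pandas releases no longer provide DataFrame.append, and the closed argument of date_range has been replaced as well. A sketch of the same idea with pd.concat, which avoids both; it starts the range one day after the last existing date rather than relying on closed='right':

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2020-01-31', '2020-02-01', '2020-02-02']),
                   'value': [1, 2, 3]})

# Five future days, starting the day after the last existing date
future = pd.DataFrame({
    'date': pd.date_range(start=df['date'].iloc[-1] + pd.Timedelta(days=1),
                          periods=5, freq='D')
})

# ignore_index=True renumbers the rows 0..n-1, so no duplicate labels remain
df = pd.concat([df, future], ignore_index=True)
print(df)
```

The new rows get NaN in 'value', as in the output shown earlier, and the index needs no separate reset.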
I also needed this, and I solved it by merging the code you shared with the code from another answer ("add to a dataframe as I go with datetime index"). I ended up with the following code, which works for me.
data=raw.copy()
new_datetime = data.index[-1:] # current end of datetime index
increment = '1 days' # string for increment - eventually will be in a for loop to add add'l days
new_datetime = new_datetime+pd.Timedelta(increment)
today_df = pd.DataFrame({'value': 301.124},index=new_datetime)
data = data.append(today_df)
data.tail()
Here 'value' stands for the column name in your own dataframe.
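For pandas versions without DataFrame.append, the same step can be sketched with pd.concat, which keeps the datetime index of both pieces. The contents of raw below are a hypothetical stand-in, since the original frame is not shown:

```python
import pandas as pd

# Hypothetical stand-in for `raw`: a DatetimeIndex and a 'value' column
raw = pd.DataFrame({'value': [1.0, 2.0, 3.0]},
                   index=pd.to_datetime(['2020-01-31', '2020-02-01', '2020-02-02']))

data = raw.copy()
new_datetime = data.index[-1:] + pd.Timedelta('1 days')  # one-element DatetimeIndex
today_df = pd.DataFrame({'value': 301.124}, index=new_datetime)

# No ignore_index here, so the datetime labels are preserved
data = pd.concat([data, today_df])
print(data.tail())
```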

Date object and time integer to datetime

All, I have a dataframe with a date column and an hour column. I am trying to combine those into a single timestamp. I tried many of the available solutions, such as datetime.datetime.combine and extracting the month, day and year to build a datetime from them, but they all lead to some error.
idOnController date eventTime Energy hour
0 5014 2018-05-31 2018-05-31 01:00:00 26.619 0
2 5014 2018-06-02 2018-06-02 02:00:00 29.251 0
3 5014 2018-06-03 2018-06-03 03:00:00 30.635 0
The datatypes are as follows
idOnController int64
date object
eventTime datetime64[ns]
Energy float64
hour int64
dtype: object
I am looking to combine date and hour into a timestamp that looks like eventTime and then replace eventTime with that value.
You can do:
df['new_date'] = pd.to_datetime(df['date']) + df['hour'] * pd.to_timedelta('1H')
Output of df.dtypes:
idOnController int64
date object
eventTime datetime64[ns]
Energy float64
hour int64
new_date datetime64[ns]
dtype: object
If you want to have the string timestamps you can do
df['new_date'] = df['new_date'].dt.strftime('%Y-%m-%d %H:%M:%S')
Another way of doing this would be (a bit more verbose though!):
df['date'] = pd.to_datetime(df['date'])
df['year'] = df.date.dt.year
df['month'] = df.date.dt.month
df['day'] = df.date.dt.day
df['date'] = pd.to_datetime(df[['year','month','day','hour']])
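The question also asks to replace eventTime with the combined value. A minimal sketch of that step, using pd.to_timedelta with unit='h' on the integer column (an equivalent spelling of the multiplication above that avoids the uppercase 'H' alias). The hour values are illustrative, not taken from the question's table:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2018-05-31', '2018-06-02', '2018-06-03'],
                   'hour': [1, 2, 3]})  # illustrative hour values

# Midnight of each date plus the hour offset, written straight into eventTime
df['eventTime'] = pd.to_datetime(df['date']) + pd.to_timedelta(df['hour'], unit='h')
print(df)
```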

Add random datetimes to timestamps

I have a column of timestamps that span over 24 hours. I want to convert these to differentiate between days. I've done this by converting to timedelta. The result is displayed below.
My question is: can these be converted or rearranged again to provide random datetimes, e.g. dd/mm/yyyy hh:mm:ss?
import pandas as pd
df = pd.DataFrame({
'Time' : ['8:00','18:00','28:00'],
})
df['Time'] = [x + ':00' for x in df['Time']]
df['Time'] = pd.to_timedelta(df['Time'])
Out:
Time
0 0 days 08:00:00
1 0 days 18:00:00
2 1 days 04:00:00
Intended Output:
Time
0 1/01/1904 08:00:00 AM
1 1/01/1904 18:00:00 PM
2 2/01/1904 04:00:00 AM
The input timestamps will never span more than 2 days. Is there a package that can achieve this, or would dummy start and end dates work?
After you convert Time to a timedelta, just add the date part:
df.Time+pd.to_datetime('1904-01-01')
0 1904-01-01 08:00:00
1 1904-01-01 18:00:00
2 1904-01-02 04:00:00
Name: Time, dtype: datetime64[ns]
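Putting the question's setup and the answer together, with an optional formatting step toward the intended output. This is a sketch: the base date 1904-01-01 is the dummy date from the answer, and the format string uses a 12-hour clock with AM/PM, which is an assumption about what the intended output was meant to show:

```python
import pandas as pd

df = pd.DataFrame({'Time': ['8:00', '18:00', '28:00']})

# 'H:MM' -> 'H:MM:00' -> timedelta; 28:00 becomes 1 day 04:00:00
df['Time'] = pd.to_timedelta(df['Time'] + ':00')

# Anchor the offsets to a dummy start date
base = pd.Timestamp('1904-01-01')
stamps = base + df['Time']

# Optional: day/month/year strings with a 12-hour clock
formatted = stamps.dt.strftime('%d/%m/%Y %I:%M:%S %p')
print(formatted)
```

Keeping stamps as datetime64 and formatting only for display means the values can still be sorted and subtracted later.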

Converting to correct data format

I need your help guys
I have data in the wrong time format.
For example:
it shows 1245 or 1837 etc. I want them to be in the correct format:
12:45 PM or 6:37 PM.
How can I convert it?
Thanks!
I think you need to convert with to_datetime and then use strftime or dt.time.
See also http://strftime.org/.
df = pd.DataFrame({'date':[1245, 1837]})
print (df)
date
0 1245
1 1837
print (pd.to_datetime(df['date'], format='%H%M'))
0 1900-01-01 12:45:00
1 1900-01-01 18:37:00
Name: date, dtype: datetime64[ns]
#for string output
print (pd.to_datetime(df['date'], format='%H%M').dt.strftime('%I:%M %p'))
0 12:45 PM
1 06:37 PM
Name: date, dtype: object
#for time output
print (pd.to_datetime(df['date'], format='%H%M').dt.time)
0 12:45:00
1 18:37:00
Name: date, dtype: object
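The sample values all have four digits. If the column can also hold shorter values such as 905 or 5 (an assumption, the question does not say), parsing them directly with '%H%M' may misread or reject them, so a cautious sketch is to zero-pad to four characters first:

```python
import pandas as pd

df = pd.DataFrame({'date': [1245, 1837, 905, 5]})

# Zero-pad to HHMM so that 905 -> '0905' and 5 -> '0005'
padded = df['date'].astype(str).str.zfill(4)
times = pd.to_datetime(padded, format='%H%M')

# 12-hour clock with AM/PM for display
print(times.dt.strftime('%I:%M %p'))
```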