pandas get days in a column from start date?
start_date = '01/01/2021' (dd/mm/yyyy)
df
dates
2021-01-01
2021-01-02
.
.
.
2021-02-01
.
.
.
2021-06-01 (end date should be current date)
If the start date is always 1 January, day and month are the same, so it does not matter that pandas parses such strings as mm/dd/YYYY by default. In that case you can pass the string straight to date_range together with to_datetime('now'); the default freq='D' can be omitted:
df = pd.DataFrame({'dates':pd.date_range(start_date, pd.to_datetime('now'))})
As a general solution for a start_date in dd/mm/YYYY format, parse start_date as well, using the format parameter:
start_date = '01/05/2021'
df = pd.DataFrame({'dates': pd.date_range(pd.to_datetime(start_date, format='%d/%m/%Y'),
pd.to_datetime('now'))})
If you want a dataframe output:
d = pd.date_range(start_date, pd.to_datetime('now'))
df = pd.DataFrame({'dates': d})
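To tie the answers above together, here is a minimal sketch of a helper that parses a dd/mm/YYYY start date and builds the daily column. The name daily_dates and its end and fmt parameters are made up for illustration and are not part of pandas; a fixed end date is passed only so the example is reproducible.

```python
import pandas as pd

def daily_dates(start, end=None, fmt='%d/%m/%Y'):
    """Return a DataFrame with one row per day from start to end (inclusive).

    daily_dates and its parameters are hypothetical names, not a pandas API.
    """
    # Parse the start string with an explicit format so '01/05/2021' is 1 May.
    start_ts = pd.to_datetime(start, format=fmt)
    # Default to today at midnight when no end date is given.
    end_ts = pd.Timestamp.now().normalize() if end is None else pd.Timestamp(end)
    return pd.DataFrame({'dates': pd.date_range(start_ts, end_ts, freq='D')})

# A fixed end date keeps the example reproducible.
df = daily_dates('01/05/2021', end='2021-06-01')
print(df.head())
```

Calling daily_dates('01/05/2021') without end should give the range up to the current date, which is what the question asks for.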
Related
I have a data frame with date columns in the format: day / month / year
They are in string/object format.
I want to convert them to datetime.
Sample date, 5th of January 2016: '05/01/2016'
However pd.to_datetime is confusing the day and month.
Here is what I've tried:
pd.to_datetime('05/01/2016')
Timestamp('2016-05-01 00:00:00')
This has given me Year - Month - Day
I want Day - Month - Year as in: 05-01-2016
What I have tried:
pd.to_datetime('05/01/2016',dayfirst=True)
Timestamp('2016-01-05 00:00:00')
This is correct, but it's not the format I want, which is '05-01-2016'
So I tried this:
pd.to_datetime('05/01/2016',dayfirst=True,format='%d/%m/%Y')
Timestamp('2016-01-05 00:00:00')
There's no difference.
How can I do it? How can I force it to display the datetime as '05-01-2016'
The only way I know is to change the display options:
pd.set_option("display.date_dayfirst", True)
https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html#available-options
but that does not seem to work here. Otherwise, convert the datetime to a string:
ts = pd.to_datetime('05/01/2016', format='%d/%m/%Y')
print(ts)
# 2016-01-05 00:00:00
ts = ts.strftime('%d-%m-%Y')
print(ts)
# 05-01-2016
Or just replace '/' by '-':
print('05/01/2016'.replace('/', '-'))
# '05-01-2016'
As far as I know, you can't change how a Timestamp is displayed, but you can convert it to a string in the desired format:
>>> import pandas as pd
>>> pd.to_datetime('05/01/2016', dayfirst=True, format='%d/%m/%Y').strftime('%d-%m-%Y')
'05-01-2016'
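Since the question is about whole date columns rather than a single value, the same idea can be sketched for a DataFrame. This assumes a column named date holding day/month/year strings; the extra column name date_str is only illustrative.

```python
import pandas as pd

# Sample day/month/year strings, as in the question.
df = pd.DataFrame({'date': ['05/01/2016', '17/02/2016']})

# Parse with an explicit format; the column now holds real datetimes.
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')

# Build a separate string column for display, keeping the datetimes intact.
df['date_str'] = df['date'].dt.strftime('%d-%m-%Y')
print(df)
```

Keeping the datetime column and adding a string column only for display means sorting and date arithmetic still work on the original column.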
I have a dataframe in pandas with some columns with dates in the following format
dates
202001
202002
I want to convert them to the following format
dates
2020-01-01
2020-02-01
Could anyone assist with converting the date format? Thanks
If you need datetimes, use to_datetime with format='%Y%m':
df['dates'] = pd.to_datetime(df['dates'], format='%Y%m')
You may use to_datetime here:
df["dates"] = pd.to_datetime(df["dates"].astype(str) + '01', format='%Y%m%d')
Note that your current text dates are year month only, so I concatenate 01 to the end of each one to form the first of the month, for each date.
Try this:
df['dates'] = df['dates'].astype(str)
df['dates'] = pd.to_datetime(df['dates'].str[:4] + ' ' + df['dates'].str[4:])
print(df)
Output:
dates
0 2020-01-01
1 2020-02-01
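For completeness, a small self-contained sketch of the format='%Y%m' approach. It assumes the values were read in as integers, so they are cast to strings before parsing.

```python
import pandas as pd

# Year-month values stored as integers, as they often are after reading a CSV.
df = pd.DataFrame({'dates': [202001, 202002]})

# Cast to str first so the format string is matched against text like '202001'.
df['dates'] = pd.to_datetime(df['dates'].astype(str), format='%Y%m')
print(df)
```

Because the format has no day component, each value is parsed as the first day of its month.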
I want to subtract 30 days from current date and get date in following format.
final_date = 2019-12-24
I am doing following thing in pandas, but getting timestamp object in return
final_date = pd.to_datetime(pd.datetime.now().date() - timedelta(30))
How can I do it in pandas?
There are several ways to do this: floor the current timestamp to midnight with Timestamp.floor, then subtract a timedelta or an offset:
final_date = pd.Timestamp.now().floor('d') - pd.Timedelta(30, unit='d')
final_date = pd.to_datetime('now').floor('d') - pd.DateOffset(days=30)
final_date = pd.to_datetime('now').floor('d') - pd.offsets.Day(30)
print (final_date)
2019-12-24 00:00:00
Finally, convert the output to a Python date object:
print (final_date.date())
2019-12-24
Or to strings:
print (final_date.strftime('%Y-%m-%d'))
2019-12-24
Use strftime (pd.datetime has been removed from recent pandas versions, so pd.Timestamp.now() is used here):
final_date = (pd.Timestamp.now() - pd.Timedelta(days=30)).strftime('%Y-%m-%d')
#'2019-12-24'
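The approaches above could be wrapped up roughly like this. The function name days_ago and its today parameter are hypothetical, added only so the result can be checked against a fixed reference date.

```python
import pandas as pd

def days_ago(n, today=None):
    """Return the date n days before today as a 'YYYY-MM-DD' string.

    days_ago and the today parameter are illustrative, not a pandas API.
    """
    # Use the supplied reference date if given, otherwise the current time.
    base = pd.Timestamp.now() if today is None else pd.Timestamp(today)
    # normalize() drops the time of day before subtracting whole days.
    return (base.normalize() - pd.Timedelta(days=n)).strftime('%Y-%m-%d')

# A fixed reference date keeps the example reproducible.
final_date = days_ago(30, today='2020-01-23')
print(final_date)  # 2019-12-24
```

In normal use you would call days_ago(30) and let it pick up the current date.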
I am working with time series data in a pandas df that doesn't have a real calendar date, only an index value that indicates an equal time interval between each value. I'm trying to convert it into a datetime type with daily or weekly frequency. Is there a way to keep the values the same while changing the type (i.e. without setting an actual calendar date)?
Index,Col1,Col2
1,6.5,0.7
2,6.2,0.3
3,0.4,2.1
pd.to_datetime can create dates when given time units relative to some origin. The default is the POSIX origin 1970-01-01 00:00:00 and time in nanoseconds.
import pandas as pd
df['date1'] = pd.to_datetime(df.index, unit='D', origin='2010-01-01')
df['date2'] = pd.to_datetime(df.index, unit='W')
Output:
# Col1 Col2 date1 date2
#Index
#1 6.5 0.7 2010-01-02 1970-01-08
#2 6.2 0.3 2010-01-03 1970-01-15
#3 0.4 2.1 2010-01-04 1970-01-22
Alternatively, you can add timedeltas to the specified start:
pd.to_datetime('2010-01-01') + pd.to_timedelta(df.index, unit='D')
or just keep them as a timedelta:
pd.to_timedelta(df.index, unit='D')
#TimedeltaIndex(['1 days', '2 days', '3 days'], dtype='timedelta64[ns]', name='Index', freq=None)
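Here is a self-contained sketch of the timedelta variant applied to the sample data from the question. The origin 2010-01-01 is an arbitrary assumption, and the column names daily and weekly are only illustrative; the weekly spacing is written as 7 days per step.

```python
import pandas as pd

# Sample data from the question, with the integer step as the index.
df = pd.DataFrame({'Col1': [6.5, 6.2, 0.4], 'Col2': [0.7, 0.3, 2.1]},
                  index=pd.Index([1, 2, 3], name='Index'))

# Arbitrary origin, chosen only to anchor the series (an assumption).
origin = pd.Timestamp('2010-01-01')

# Daily spacing: step k maps to origin + k days.
df['daily'] = origin + pd.to_timedelta(df.index, unit='D')

# Weekly spacing: step k maps to origin + 7*k days.
df['weekly'] = origin + pd.to_timedelta(df.index * 7, unit='D')
print(df)
```

If no calendar anchor makes sense for the data, keeping the plain timedeltas avoids inventing one.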
How to generate a time series column from today to the next 600 days in pandas?
I'm a new pandas learner. I can generate a new column as follows:
dates = pd.date_range('2010-01-01', '2011-8-23', freq='D')
Output:
DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
'2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08',
'2010-01-09', '2010-01-10',
...
'2011-08-14', '2011-08-15', '2011-08-16', '2011-08-17',
'2011-08-18', '2011-08-19', '2011-08-20', '2011-08-21',
'2011-08-22', '2011-08-23'],
dtype='datetime64[ns]', length=600, freq='D')
My question is: what should we do if we only know the starting date and the period of 600 days, but not the ending date? How should the code be modified?
A follow-up question: how do I set the starting date to the current date or yesterday's date?
Pass periods instead of an end date. Set periods=600 and you should get your output (shown here with periods=5 for brevity):
pd.date_range(start='2010-01-01', periods=5, freq='D')
Out[335]:
DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
'2010-01-05'],
dtype='datetime64[ns]', freq='D')
To get today's date:
pd.to_datetime('today')
Out[338]: Timestamp('2017-09-29 00:00:00')
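Putting both parts together, a minimal sketch for a 600-day column starting from yesterday might look like this. It assumes that "yesterday" means the local date at midnight; drop the subtraction to start from today instead.

```python
import pandas as pd

# Yesterday at midnight; normalize() strips the time of day.
start = pd.Timestamp.now().normalize() - pd.Timedelta(days=1)

# periods counts the start date itself, so the last date is start + 599 days.
dates = pd.date_range(start=start, periods=600, freq='D')
df = pd.DataFrame({'dates': dates})

# Show the first and last rows of the column.
print(df.iloc[[0, -1]])
```

Note that with periods=600 the range spans 600 calendar days including the start, which matches the length=600 output in the question.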
First, import the standard-library datetime module:
import datetime
Then you can create a datetime object and add 600 days using timedelta:
start_date = "2010-01-01"
start_date = datetime.datetime.strptime(start_date, "%Y-%m-%d")
end_date = start_date + datetime.timedelta(days=600)
To get the string back, use strftime():
end_date = end_date.strftime("%Y-%m-%d")
> '2011-08-24'