Comparison between date and integer in pandas

I have a dataset df with a date column that holds dates from 2020-01-01 to 2021-03-30. I also have a variable a = 20210130, which is an integer that actually represents a date. I need to take the rows of df whose date is <= a.

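The first idea is to convert a to a datetime and compare, then filter by boolean indexing. Pass an explicit format, because a bare integer given to pd.to_datetime is interpreted as nanoseconds since 1970-01-01:
df['date'] = pd.to_datetime(df['date'])
a = 20210130
df = df[df['date'] <= pd.to_datetime(str(a), format='%Y%m%d')]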
Or convert the column to integers and compare:
a = 20210130
df = df[df['date'].dt.strftime('%Y%m%d').astype(int) <= a]
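The datetime comparison can be checked on a small frame. This is a minimal sketch with made-up sample data covering the date range from the question; the names cutoff, naive and filtered are only illustrative. It also shows what happens when the integer is passed to pd.to_datetime without a format.

```python
import pandas as pd

# Hypothetical sample data covering the range described in the question
df = pd.DataFrame({"date": pd.date_range("2020-01-01", "2021-03-30", freq="D")})
a = 20210130

# An explicit format makes pandas read the digits as year, month, day
cutoff = pd.to_datetime(str(a), format="%Y%m%d")

# Without a format, the integer is read as nanoseconds since 1970-01-01
naive = pd.to_datetime(a)

# Boolean indexing keeps only the rows on or before the cutoff
filtered = df[df["date"] <= cutoff]

print(cutoff)
print(naive)
print(filtered["date"].max())
```

Comparing against naive instead of cutoff would return an empty frame here, since every date in the column is later than 1970.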

Related

Pandas - Converting datetime field to a specified format

I am trying to get a datetime field in pandas in the format below.
df['date'] = pd.to_datetime(df['date'])
The code above returns the datetime column in this format:
2021-11-27 03:30:00
I would like the output to be 27/11/2021 (dd/mm/yyyy), and the column's data type needs to be datetime, not object.
If your column is a string, first convert it with pd.to_datetime:
df['Date'] = pd.to_datetime(df['Date'])
Then, use .dt datetime accessor with strftime:
df = pd.DataFrame({'Date':pd.date_range('2017-01-01', periods = 60, freq='D')})
df.Date.dt.strftime('%Y%m%d').astype(int)
Or use a lambda function:
df.Date.apply(lambda x: x.strftime('%Y%m%d')).astype(int)
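For the dd/mm/yyyy layout asked about in the question, the same strftime approach applies with a different format string. A datetime64 column stores only the timestamp and carries no display format, so the formatted values are necessarily strings. A minimal sketch, with made-up sample values and a hypothetical Date_str column that keeps the original datetime column alongside the formatted text:

```python
import pandas as pd

# Hypothetical sample values in the shape shown in the question
df = pd.DataFrame({"Date": ["2021-11-27 03:30:00", "2021-11-28 04:00:00"]})
df["Date"] = pd.to_datetime(df["Date"])

# Formatting produces strings; the original column keeps its datetime64 dtype
df["Date_str"] = df["Date"].dt.strftime("%d/%m/%Y")

print(df["Date"].dtype)
print(df["Date_str"].tolist())
```

Keeping both columns lets sorting and filtering use the datetime values while the string column is used only for display or export.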

How to drop rows having a specific time in a datetime index

How can I remove every row from the dataframe where the time in the datetime column is 07:15?
Suppose your first column is named 'Datetime' and has a datetime64 dtype, use:
from datetime import time
# Not needed if Datetime is already datetime64
# df['Datetime'] = pd.to_datetime(df['Datetime'])
out = df[df['Datetime'].dt.time != time(7, 15)]
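A self-contained sketch of the same filter, using made-up sample rows and a hypothetical value column. Since the title mentions a datetime index, it also shows the equivalent filter when the datetimes are the index rather than a column.

```python
from datetime import time

import pandas as pd

# Hypothetical sample rows; two of them fall at 07:15
df = pd.DataFrame({
    "Datetime": pd.to_datetime([
        "2021-01-01 07:15:00",
        "2021-01-01 07:30:00",
        "2021-01-02 07:15:00",
    ]),
    "value": [1, 2, 3],
})

# Column case: compare the time-of-day component against 07:15
out = df[df["Datetime"].dt.time != time(7, 15)]

# Index case: DatetimeIndex exposes .time directly, without the .dt accessor
indexed = df.set_index("Datetime")
out_idx = indexed[indexed.index.time != time(7, 15)]

print(out)
print(out_idx)
```

The comparison matches 07:15:00 exactly, so a row at 07:15:30 would be kept.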

convert pandas datetime64[ns] to julian day

I am confused by the number of data type conversions and seemingly very different solutions to this, none of which I can get to work.
What is the best way to convert a pandas datetime column (datetime64[ns], e.g. 2017-01-01 03:15:00) into another column in the same dataframe holding the Julian day (e.g. 2458971.8234259)?
Many thanks
Create a DatetimeIndex and convert it to Julian dates:
df = pd.DataFrame({'dates':['2017-01-01 03:15:00','2017-01-01 03:15:00']})
df['dates'] = pd.to_datetime(df['dates'])
df['jul1'] = pd.DatetimeIndex(df['dates']).to_julian_date()
# if you need to remove the times
df['jul2'] = pd.DatetimeIndex(df['dates']).floor('d').to_julian_date()
print (df)
                dates          jul1       jul2
0 2017-01-01 03:15:00  2.457755e+06  2457754.5
1 2017-01-01 03:15:00  2.457755e+06  2457754.5
The DatetimeIndex wrapper is needed because the .dt accessor on a Series has no to_julian_date method:
df['jul'] = df['dates'].dt.to_julian_date()
AttributeError: 'DatetimeProperties' object has no attribute 'to_julian_date'
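The Julian date can also be computed with plain Series arithmetic, which avoids the DatetimeIndex round trip. This is a sketch of an alternative, not the method from the answer: it relies on the fact that 1970-01-01 00:00 corresponds to Julian date 2440587.5 and assumes timezone-naive timestamps. The column names jul_manual and jul_index are illustrative.

```python
import pandas as pd

# Hypothetical sample values; 2000-01-01 12:00 is Julian date 2451545.0
df = pd.DataFrame({
    "dates": pd.to_datetime(["2017-01-01 03:15:00", "2000-01-01 12:00:00"])
})

# Fractional days elapsed since the Unix epoch
days_since_epoch = (df["dates"] - pd.Timestamp("1970-01-01")) / pd.Timedelta(days=1)

# Shift by the Julian date of the Unix epoch
df["jul_manual"] = days_since_epoch + 2440587.5

# Reference value from the DatetimeIndex method for comparison
df["jul_index"] = pd.DatetimeIndex(df["dates"]).to_julian_date()

print(df)
```

Both columns should agree to within floating-point rounding.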

Convert string date column to int column for merge in python

I have two dataframes that I have to merge on a date column.
The column in the first dataframe is an integer (year, month and day) and the column in the second is a str (%d/%m/%Y).
How can I convert the str column so that I can join them?
Convert both columns to datetime, using a format that matches each one (day first for the string column, as described in the question):
df1.Date=pd.to_datetime(df1.Date,format='%Y%m%d')
df2.Date=pd.to_datetime(df2.Date,format='%d/%m/%Y')
Then join or merge:
df1.merge(df2, on = 'Date')# df1.join(df2) when the Date is index
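An end-to-end sketch of this merge on made-up data. It assumes the string column is day-first (dd/mm/yyyy), as the question describes, and casts the integer column to str before parsing so the digits are read literally. The x and y columns are hypothetical.

```python
import pandas as pd

# Hypothetical frames: integer dates in one, day-first strings in the other
df1 = pd.DataFrame({"Date": [20210130, 20210131], "x": [1, 2]})
df2 = pd.DataFrame({"Date": ["30/01/2021", "31/01/2021"], "y": [10, 20]})

# Bring both key columns to datetime64 so the merge keys are comparable
df1["Date"] = pd.to_datetime(df1["Date"].astype(str), format="%Y%m%d")
df2["Date"] = pd.to_datetime(df2["Date"], format="%d/%m/%Y")

merged = df1.merge(df2, on="Date")
print(merged)
```

If the strings were month-first instead, only the second format string would need to change.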

pandas string to date type conversion in proper format

I am getting date data in pandas as strings like 10-Oct, 11-Oct, but I want to convert it to a date data type in a format like 2019-10-10, 2019-10-11.
Is there an easy way to do this in pandas?
Use to_datetime with the year appended to each string and the format parameter:
df = pd.DataFrame({'date':['10-Oct', '11-Oct']})
df['date'] = pd.to_datetime(df['date'] + '-2019', format='%d-%b-%Y')
print (df)
        date
0 2019-10-10
1 2019-10-11