I am trying to create a new column in a dataframe that shows the number of days between now and a past date. So far I have the code below, but it returns 'days' plus a time component. How can I get just the number of days?
import datetime
import pytz
now = datetime.datetime.now(pytz.utc)
excel1['days_old'] = now - excel1['Start Time']
Returns:
92 days 08:08:06.667518
excel1['days_old'] will hold timedeltas. To get the whole-day difference, use the ".dt.days" accessor on the Series like this:
import datetime
import pytz
now = datetime.datetime.now(pytz.utc)
excel1['days_timedelta'] = now - excel1['Start Time']
excel1['days_old'] = excel1['days_timedelta'].dt.days
Assuming that the Start Time column is of datetime type and timezone-naive (pd.Timestamp.now() returns a naive timestamp), run:
(pd.Timestamp.now() - df['Start Time']).dt.days
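This also worked for me (note that pandas 2.0 and later reject the 'timedelta64[D]' cast, so on recent versions use .dt.days instead):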
import datetime
import pytz
now = datetime.datetime.now(pytz.utc)
excel1['days_old'] = (now - excel1['Start Time']).astype('timedelta64[D]')
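To make the ".dt.days" approach concrete, here is a minimal sketch. The sample dates and the fixed value of now are assumptions chosen only so the result is reproducible; in real code now would come from the current time. Both sides are timezone-aware (UTC), because subtracting a naive timestamp from an aware one raises a TypeError.

```python
import pandas as pd

# Hypothetical sample data: a timezone-aware (UTC) 'Start Time' column
excel1 = pd.DataFrame({
    "Start Time": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-09 18:30"], utc=True
    )
})

# Fixed "now" for reproducibility; normally pd.Timestamp.now(tz="UTC")
now = pd.Timestamp("2024-01-10 12:00", tz="UTC")

# The subtraction yields a timedelta Series; .dt.days keeps only whole days
excel1["days_old"] = (now - excel1["Start Time"]).dt.days

print(excel1)
```

Note that .dt.days truncates towards the whole-day component, so a difference of 17.5 hours counts as 0 days.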
The objective of this post is to convert the columns ['Open Date', 'Close date'] to timestamp format.
I have tried the functions / examples from these links without any results.
Convert datetime to timestamp in Neo4j
Convert datetime pandas
Pandas to_dict() converts datetime to Timestamp
Really appreciate any ideas / comments / examples on how to do so.
Column Characteristics:
Open Date datetime64[ns] and pandas.core.series.Series
Close date datetime64[ns] and pandas.core.series.Series
Finally, I have been using these libraries:
import pandas as pd
import numpy as np
from datetime import datetime, date, time, timedelta
First convert to a numpy array with .values and cast to int64. The output is in nanoseconds, so divide by 10 ** 9 to get seconds:
df['open_ts'] = df['Open Date'].values.astype(np.int64) // 10 ** 9
df['close_ts'] = df['Close date'].values.astype(np.int64) // 10 ** 9
OR
If you want to avoid using numpy, you can also subtract the Unix epoch and take the total seconds:
df['open_ts'] = (df['Open Date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype('int64')
df['close_ts'] = (df['Close date'] - pd.Timestamp('1970-01-01')).dt.total_seconds().astype('int64')
Try them and report it back here
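As a quick check of the epoch-based conversion, here is a small self-contained sketch. The sample dates are assumptions picked because their Unix timestamps are easy to verify; the column is treated as timezone-naive UTC.

```python
import pandas as pd

# Hypothetical sample data using the column name from the question
df = pd.DataFrame({
    "Open Date": pd.to_datetime(["1970-01-02 00:00:00", "2020-01-01 00:00:00"])
})

# Seconds since the Unix epoch: subtract the epoch, then floor-divide
# the resulting timedeltas by one second to get plain integers
epoch = pd.Timestamp("1970-01-01")
df["open_ts"] = (df["Open Date"] - epoch) // pd.Timedelta(seconds=1)

print(df)
```

Floor-dividing by pd.Timedelta(seconds=1) avoids depending on the internal resolution of the datetime column.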
I am looking to convert datetime to date for a pandas datetime series.
I have listed the code below:
df = pd.DataFrame()
df = pandas.io.parsers.read_csv("TestData.csv", low_memory=False)
df['PUDATE'] = pd.Series([pd.to_datetime(date) for date in df['DATE_TIME']])
df['PUDATE2'] = datetime.datetime.date(df['PUDATE']) #Does not work
Can anyone guide me in right direction?
You can access the datetime methods of a pandas Series through the .dt accessor (in a similar way to how you would access string methods using .str). For your case, you can extract the date of your datetime column as:
df['PUDATE'].dt.date
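A minimal sketch of the .dt.date approach, with made-up sample values standing in for TestData.csv. The PUDATE_MIDNIGHT column is a hypothetical addition showing .dt.normalize(), which may be preferable if you want to keep a datetime64 dtype rather than Python date objects.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents
df = pd.DataFrame({"DATE_TIME": ["2015-03-01 14:30:00", "2015-03-02 09:15:00"]})

# Parse the whole column at once instead of looping over it
df["PUDATE"] = pd.to_datetime(df["DATE_TIME"])

# .dt.date gives Python datetime.date objects (object dtype)
df["PUDATE2"] = df["PUDATE"].dt.date

# .dt.normalize() keeps datetime64 dtype and sets the time to midnight
df["PUDATE_MIDNIGHT"] = df["PUDATE"].dt.normalize()

print(df.dtypes)
```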
This is a simple way to get the day of the month (and the month and year) from a pandas datetime column:
#create a dataframe with dates as a string
test_df = pd.DataFrame({'dob':['2001-01-01', '2002-02-02', '2003-03-03', '2004-04-04']})
#convert column to type datetime
test_df['dob']= pd.to_datetime(test_df['dob'])
# Extract day, month , year using dt accessor
test_df['DayOfMonth']=test_df['dob'].dt.day
test_df['Month']=test_df['dob'].dt.month
test_df['Year']=test_df['dob'].dt.year
I think you need to specify the format when parsing, for example:
df['PUDATE2'] = pd.to_datetime(df['DATE_TIME'], format='%Y%m%d%H%M%S').dt.date
So you just need to know what format your strings are in ('%Y%m%d%H%M%S' here is only an example).
I am trying to convert the following string in datetime format
"14DEC2014"
Does anyone have any advice on how to do this? I have been stuck on this one for a day or two now.
import pandas as pd
test = '14DEC2014'
test = pd.to_datetime(test)
print(test)
output:
2014-12-14 00:00:00
If you would like to only have the date:
test = pd.to_datetime(test).date()
output:
2014-12-14
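If you would rather not rely on automatic format detection, the format can be spelled out explicitly. The sketch below uses the standard library's datetime.strptime with '%d%b%Y'; it assumes an English locale, since %b matches locale-dependent abbreviated month names.

```python
from datetime import datetime

raw = "14DEC2014"

# %d = day of month, %b = abbreviated month name, %Y = four-digit year
# (assumes an English locale for the month abbreviation)
parsed = datetime.strptime(raw, "%d%b%Y")

# Keep only the date part
only_date = parsed.date()

print(parsed)
print(only_date)
```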
My date-time format in Excel is 01-12-2010 08:26 (day = 01, month = 12). When I import it into pandas and change the dtype to datetime, the month and day get swapped. I am new to this, please help.
Output of pandas is
x.day
12
x.month
1
Excel
Invoice date = 01/12/2010 08:26
PANDAS
When import using sales = pd.read_csv()
sales["InvoiceDate"] = sales["InvoiceDate"].astype("datetime64[ns]")
[In] y["InvoiceDate"].loc[0]
[Out] Timestamp('2010-01-12 08:26:00')
[In] y["InvoiceDate"].loc[0].day
[Out] 12
The output of this should be 1 instead of 12.
Where am I going wrong?
Please help.
You can use pd.to_datetime with the dayfirst parameter, like below:
pd.to_datetime("01/12/2010 08:26", dayfirst=True)
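Applied to a whole column, the same idea might look like the sketch below. The sample rows are assumptions, and it passes an explicit format string instead of dayfirst=True, which removes any guessing when every value has the same day-month-year layout.

```python
import pandas as pd

# Hypothetical sample of the imported column (still strings at this point)
sales = pd.DataFrame({"InvoiceDate": ["01-12-2010 08:26", "13-12-2010 09:00"]})

# Explicit day-month-year format, so 01-12-2010 is read as 1 December 2010
sales["InvoiceDate"] = pd.to_datetime(sales["InvoiceDate"], format="%d-%m-%Y %H:%M")

first = sales["InvoiceDate"].loc[0]
print(first.day, first.month)
```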
I would like to subset a data frame based on a date column, which originally has this format:
3/22/13
After I transform it to a date:
df['date']=pd.to_datetime(df['date'], format='%m/%d/%y')
I get this:
2013-03-22 00:00:00
Now I would like to subset it with something like this:
df.loc[(df['date']>'2014-06-22')]
But that either gives me an empty data frame or full data frame, that is no filtering.
Any suggestions how I can get this to work?
remark: I am well aware that similar questions have been asked in other forums but I could not figure out a solution since my date column looks different.
First you have to convert your start date and end date into datetime objects. Then you can apply multiple conditions inside df.loc. Do not forget to assign the filtered result back to df:
import pandas as pd
from datetime import datetime
df['date']=pd.to_datetime(df['date'], format='%m/%d/%y')
date1 = datetime.strptime('2013-03-23', '%Y-%m-%d')
date2 = datetime.strptime('2013-03-25', '%Y-%m-%d')
df = df.loc[(df['date']>date1) & (df['date']<date2)]
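Here is a self-contained sketch of the same filtering on made-up data, so the result can be checked by eye. The rows and the value column are assumptions. It compares against pd.Timestamp objects and also shows Series.between, which is inclusive at both ends by default (unlike the strict > and < above).

```python
import pandas as pd

# Hypothetical sample data in the original m/d/yy string format
df = pd.DataFrame({
    "date": ["3/22/13", "3/24/13", "6/23/14"],
    "value": [1, 2, 3],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%y")

# Rows strictly after a given date
after = df.loc[df["date"] > pd.Timestamp("2014-06-22")]

# Rows inside a date range (both endpoints included)
between = df.loc[
    df["date"].between(pd.Timestamp("2013-03-23"), pd.Timestamp("2013-03-25"))
]

print(after)
print(between)
```

If a filter like this returns everything or nothing, it is worth checking df.dtypes first: a column that is still of object (string) dtype is compared differently from a datetime64 column.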