I made a file that had three date columns:
df = pd.DataFrame({'yyyymm':[199501],'yyyy':[1995],'mm':[1],'Address':['AL1'],'Number':[12]})
yyyymm yyyy mm Address Number
0 199501 1995 1 AL1 12
and saved it as a file:
df.to_csv('complete.csv')
I read in the file with:
df=pd.read_csv('complete.csv')
and my 3 date columns are converted to ints, not dates.
I tried to convert them back to dates with:
df['yyyymm']=df['yyyymm'].astype(str).dt.strftime('%Y%m')
df['yyyy']=df['yyyy'].dt.strftime('%Y')
df['mm']=df['mm'].dt.strftime('%m')
with the error:
AttributeError: Can only use .dt accessor with datetimelike values
Very odd, as the command I used to make the datetime column was:
df['yyyymm']=df['col2'].dt.strftime('%Y%m')
Am I missing something? How can I convert the 6-digit column back to a yyyymm datetime, the 4-digit column to a yyyy datetime, and the mm column back to a datetime?
The columns yyyymm, yyyy and mm were read back as integers. Calling .astype(str) turns them into strings, but neither integers nor strings have a .dt accessor; it exists only on datetime-like values.
You can use pd.to_datetime(..) [pandas-doc] to convert these to a datetime object:
df['yyyymm'] = pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')
Indeed, this gives us:
>>> pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')
0 1995-01-01
Name: yyyymm, dtype: datetime64[ns]
The same can be done for the yyyy and mm columns:
>>> pd.to_datetime(df['yyyy'].astype(str), format='%Y')
0 1995-01-01
Name: yyyy, dtype: datetime64[ns]
>>> pd.to_datetime(df['mm'].astype(str), format='%m')
0 1900-01-01
Name: mm, dtype: datetime64[ns]
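For completeness, here is a minimal sketch of the whole round trip, assuming the one-row frame from the question. The in-memory buffer stands in for complete.csv, and the column names date, date_from_parts and yyyymm_text are made up for illustration. It also shows one possible way to build a single date from the separate yyyy and mm columns, which avoids the 1900 default year seen above.

```python
import io
import pandas as pd

# Same one-row frame as in the question.
df = pd.DataFrame({'yyyymm': [199501], 'yyyy': [1995], 'mm': [1],
                   'Address': ['AL1'], 'Number': [12]})

# Round trip through CSV text held in memory (a stand-in for complete.csv).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df = pd.read_csv(buffer)

# The values come back as int64, so parse them explicitly.
df['date'] = pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')

# Alternative: assemble the date from the separate year and month columns.
# pd.to_datetime accepts a frame with 'year', 'month' and 'day' columns.
parts = df[['yyyy', 'mm']].rename(columns={'yyyy': 'year', 'mm': 'month'})
df['date_from_parts'] = pd.to_datetime(parts.assign(day=1))

# .dt.strftime works now because 'date' is datetime-like.
df['yyyymm_text'] = df['date'].dt.strftime('%Y%m')
print(df.dtypes)
```

Note that strftime always returns strings, so a column can be either a real datetime or a yyyymm-looking value, not both. Keeping the datetime column and formatting only for display is usually the less surprising choice.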
I'm a beginner at coding and started a project for my small business.
I imported an xlsx file using pandas and got a table with dtype('O') for every column.
But I can't find a way to get only the date in the format DD/MM/YYYY.
Any tips?
I have tried this code:
tabela['dt_nasc'] = pd.to_datetime(tabela['dt_nasc'], format='%m/%d/%Y')
but the result was:
ValueError: time data '1988-10-24 00:00:00' does not match format '%m/%d/%Y' (match)
I also tried this code:
import datetime
def convert_to_date(x):
    return datetime.datetime.strptime(x , '%Y-%m-%d %H:%M:%S')
tabela.loc[:, 'dt_nasc'] = tabela['dt_nasc'].apply(convert_to_date)
# better to use a lambda function
tabela.loc[:, 'dt_nasc'] = tabela['dt_nasc'].apply(lambda x:datetime.datetime.strptime(x , '%Y-%m-%d %H:%M:%S'))
but I couldn't find a way to print the dates in DD/MM/YYYY format.
Example
s = pd.Series({0: '2022-11-29 13:00:00', 1: '2022-11-30 13:48:00'})
s
0 2022-11-29 13:00:00
1 2022-11-30 13:48:00
dtype: object <-- chk dtype
Code
object to datetime
pd.to_datetime(s)
result
0 2022-11-29 13:00:00
1 2022-11-30 13:48:00
dtype: datetime64[ns] <--chk dtype
convert result to DD-MM-YYYY HH:MM:SS
pd.to_datetime(s).dt.strftime('%d-%m-%Y %H:%M:%S')
0 29-11-2022 13:00:00
1 30-11-2022 13:48:00
dtype: object <--chk dtype
convert result to DD/MM/YYYY
pd.to_datetime(s).dt.strftime('%d/%m/%Y')
0 29/11/2022
1 30/11/2022
dtype: object <-- chk dtype
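Applied to the question's own column, the same two steps might look like the sketch below. The sample frame is a hypothetical stand-in for the imported sheet, with text shaped like the value in the ValueError above, and dt_nasc_br is an invented column name.

```python
import pandas as pd

# Hypothetical stand-in for the imported sheet; 'dt_nasc' holds text shaped
# like the value shown in the ValueError from the question.
tabela = pd.DataFrame({'dt_nasc': ['1988-10-24 00:00:00', '1990-03-05 00:00:00']})

# Parse with a format that matches how the text is actually stored.
tabela['dt_nasc'] = pd.to_datetime(tabela['dt_nasc'], format='%Y-%m-%d %H:%M:%S')

# Keep the datetime column for calculations and build a text column for display.
tabela['dt_nasc_br'] = tabela['dt_nasc'].dt.strftime('%d/%m/%Y')
print(tabela)
```

The format argument describes the input text, not the desired output, which is why '%m/%d/%Y' failed in the question. The output layout is chosen later with strftime.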
I have a pandas DataFrame with a column of timestamps in the form yyyy-mm-dd HH:MM:SS plus a timezone offset (e.g. 2020-06-01 04:26:00-05:00). How can I extract a new column containing only yyyy-mm-dd HH?
Tried:
df.index = df.Time.to_period(freq='T').index
Result: yyyy-mm-dd HH:MM
You can use:
df['new_date']=df['your_date_columns'].dt.strftime('%Y-%m-%d %H')
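A small self-contained sketch of this, assuming a timezone-aware column named Time as in the question (the sample rows and the column names hour_text and hour_start are made up). It also shows dt.floor as a possible alternative when the result should stay a datetime rather than become text.

```python
import pandas as pd

# Sample timezone-aware timestamps in the shape given in the question.
df = pd.DataFrame({'Time': pd.to_datetime(['2020-06-01 04:26:00-05:00',
                                           '2020-06-01 05:59:30-05:00'])})

# Text column truncated to the hour, in local (offset) time.
df['hour_text'] = df['Time'].dt.strftime('%Y-%m-%d %H')

# Alternative: round down to the hour but keep a timezone-aware datetime.
df['hour_start'] = df['Time'].dt.floor('h')
print(df)
```

The strftime result has dtype object, so it is good for labels and grouping keys, while the floored column still supports datetime arithmetic.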
Please help me convert all elements of a column to datetime format in a pandas DataFrame. Below is one such element:
1-July-2020 7.30 PM
DateTime
Use pd.to_datetime and the format argument:
>>> df
DateTime
0 1-July-2020 7.30 PM
>>> df.dtypes
DateTime object
dtype: object
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%d-%B-%Y %I.%M %p')
Output result:
>>> df
DateTime
0 2020-07-01 19:30:00
>>> df.dtypes
DateTime datetime64[ns]
dtype: object
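If some rows in a real column do not follow this pattern, the call above raises an error. One way to find them, sketched here with an invented second row, is errors='coerce', which turns unparseable values into NaT. Month names via %B are assumed to be English, which depends on the locale.

```python
import pandas as pd

# First row is the value from the question; the second is a made-up bad entry.
df = pd.DataFrame({'DateTime': ['1-July-2020 7.30 PM', 'not a date']})

# errors='coerce' yields NaT instead of raising on values that do not match.
parsed = pd.to_datetime(df['DateTime'], format='%d-%B-%Y %I.%M %p', errors='coerce')

# Rows that failed to parse, for inspection before overwriting the column.
bad_rows = df[parsed.isna()]
print(parsed)
print(bad_rows)
```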
I have a date column of format YYYY-MM-DD and want to convert it to an int type, consecutively, where 1= Jan 1, 2000. So if I have a date 2000-01-31, it will convert to 31. If I have a date 2020-01-31 it will convert to (365*20yrs + 5 leap days), etc.
Is this possible to do in pandas?
I looked at Pandas: convert date 'object' to int, but this solution converts to an int 8 digits long.
First subtract a Timestamp from the column, convert the timedeltas to days with Series.dt.days, and finally add 1:
df = pd.DataFrame({"Date": ["2000-01-29", "2000-01-01", "2014-03-31"]})
d = '2000-01-01'
df["new"] = pd.to_datetime(df["Date"]).sub(pd.Timestamp(d)).dt.days + 1
print( df )
Date new
0 2000-01-29 29
1 2000-01-01 1
2 2014-03-31 5204
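As a quick check against the two dates mentioned in the question, the same expression can be wrapped in a small helper. The function name day_number and its origin parameter are made up for this sketch.

```python
import pandas as pd

def day_number(dates, origin='2000-01-01'):
    """Return 1-based day counts relative to origin (origin itself maps to 1)."""
    return (pd.to_datetime(dates) - pd.Timestamp(origin)).dt.days + 1

dates = pd.Series(['2000-01-31', '2020-01-31'])
numbers = day_number(dates)
print(numbers.tolist())  # [31, 7336]
```

The second value matches the arithmetic in the question: 365*20 + 5 leap days (2000, 2004, 2008, 2012, 2016) gives 7305 days before 2020-01-01, and adding the 31 days of January gives 7336.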
I'm trying to cast a pandas datetime64 object to a string without printing the index.
I have a csv file with the following
Dates
2019-06-01
2019-06-02
2019-06-03
When I import the csv file via pandas, I have a normal pandas object in the column.
df['Dates'] = pd.to_datetime(df['Dates'], format='%Y-%m-%d')
This provides a datetime64[ns] object. I tried printing this object with the following output.
>>> What is the date 0 2019-06-01
Name: Dates, dtype: datetime64[ns]
So I have to cast this object to a string. The documentation suggests I use dt.strftime().
s=df["Dates"].dt.strftime("%Y-%m-%d")
print(f"What is the date {s}")
The output for the above is:
>>> What is the date 0 2019-06-01
How do I remove the index from the output?
file = r'test.csv'
df = pd.read_csv(file)
df['Dates'] = pd.to_datetime(df['Dates'], format='%Y-%m-%d')
s = df[df["Dates"] < "2019-06-02"]
print(f"What is the date {s['Dates']}")
print(s["Dates"])
The expected output is the following:
>>> What is the date 2019-06-01
However I am getting the following
>>> What is the date 0 2019-06-01
You can try:
for x in s['Dates'].astype('str'):
    print(f"What is the date {x}")
gives:
What is the date 2019-06-01
What is the date 2019-06-02
What is the date 2019-06-03
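The index shows up because a whole Series is being formatted. If only one value is wanted, as in the expected output of the question, another option is to pull out a scalar first. This sketch builds the frame in memory instead of reading test.csv, and the names first, message and listing are illustrative.

```python
import pandas as pd

# In-memory stand-in for test.csv from the question.
df = pd.DataFrame({'Dates': ['2019-06-01', '2019-06-02', '2019-06-03']})
df['Dates'] = pd.to_datetime(df['Dates'], format='%Y-%m-%d')
s = df[df['Dates'] < '2019-06-02']

# .iloc[0] returns a single string, so no index or dtype line is printed.
first = s['Dates'].dt.strftime('%Y-%m-%d').iloc[0]
message = f"What is the date {first}"
print(message)

# For several rows, to_string(index=False) drops the index column instead.
listing = s['Dates'].dt.strftime('%Y-%m-%d').to_string(index=False)
print(listing)
```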