I have a data frame and I am trying to figure out the day of the week for each date in a data set.
df['day_of_week'] = df['CMPLNT_TO_DT'].dt.day_name()
TypeError: 'Timestamp' object is not subscriptable
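Your problem is an incorrect assignment: somewhere df has been overwritten with a single Timestamp, so it is no longer a DataFrame. With a real DataFrame the call works: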
df['date']=pd.date_range('2021/5/9', '2021/5/14')
df['date'].dt.day_name()
Output:
and:
df = pd.Timestamp('2017-01-01T12')
df['CMPLNT_TO_DT'].dt.day_name()
Output:
TypeError: 'Timestamp' object is not subscriptable
The problem is not with the .dt.day_name():
df = pd.Timestamp('2017-01-01T12')
df['CMPLNT_TO_DT']
raises the same error again:
TypeError: 'Timestamp' object is not subscriptable
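To avoid this, df has to stay a DataFrame and only the column gets converted. Here is a minimal sketch of that; the three sample dates are made up for illustration, and only the column name CMPLNT_TO_DT comes from the question:

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"CMPLNT_TO_DT": ["2021-05-09", "2021-05-10", "2021-05-11"]})

# Convert the column in place instead of overwriting df itself.
df["CMPLNT_TO_DT"] = pd.to_datetime(df["CMPLNT_TO_DT"])
df["day_of_week"] = df["CMPLNT_TO_DT"].dt.day_name()

# A single Timestamp has day_name() directly; it has no columns and no .dt accessor.
ts = pd.Timestamp("2017-01-01T12")
single_day = ts.day_name()
```

If the error still shows up, checking type(df) just before the failing line should reveal where df was reassigned.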
When I try to get all years from my database, Python throws an error:
df = pd.to_datetime(df.index)
df['year'] = [i.year for i in df.index]
Output:
AttributeError: 'DatetimeIndex' object has no attribute 'index'
Use DatetimeIndex.year:
df['year'] = pd.to_datetime(df.index).year
Or, in your solution, assign the converted index back to df.index instead of overwriting df:
df.index = pd.to_datetime(df.index)
df['year'] = [i.year for i in df.index]
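As a quick check, here is a small sketch of the DatetimeIndex.year approach on a made-up frame (the index strings and the "value" column are assumptions, not from the question):

```python
import pandas as pd

# Hypothetical frame whose index holds date strings.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=["2019-12-31", "2020-01-01", "2021-06-15"],
)

# Convert only the index; df itself stays a DataFrame.
df.index = pd.to_datetime(df.index)

# DatetimeIndex.year is vectorized, so no list comprehension is needed.
df["year"] = df.index.year
```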
import numpy as np
def answer_seven():
    counties = census_df[['POPESTIMATE2010','POPESTIMATE2011','POPESTIMATE2012','POPESTIMATE2013','POPESTIMATE2014','POPESTIMATE2015']]
    return counties[[counties.max(axis=1)]-[counties.min(axis=1)]].abs().idxmax()
TypeError: unsupported operand type(s) for -: 'list' and 'list'
Above is my code which does not work and I got this error message.
But the code below does work.
import numpy as np
def answer_seven():
    counties_df = census_df[census_df['SUMLEV'] == 50][['CTYNAME','POPESTIMATE2010','POPESTIMATE2011','POPESTIMATE2012','POPESTIMATE2013',
                                                        'POPESTIMATE2014','POPESTIMATE2015']]
    counties_df["MaxDiff"] = abs(counties_df.max(axis=1) - counties_df.min(axis=1))
    most_change = counties_df.sort_values(by=["MaxDiff"], ascending = False)
    return most_change.iloc[0][0]
It also uses the max and min functions to get the maximum difference, which looks to me like subtracting one list from another. Could someone explain why my code does not work but this one does? Thanks!
The problem is here -
return counties[[counties.max(axis=1)]-[counties.min(axis=1)]]
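Wrapping each result in [] turns it into a one-element Python list, and lists do not support -. The working code subtracts two pandas Series, which is done element-wise. Subtract the Series directly; the outer counties[...] indexing is not needed either. I think the edit below should make it work:
return (counties.max(axis=1) - counties.min(axis=1)).abs().idxmax()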
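The difference between subtracting lists and subtracting Series can be tried on a tiny stand-in for census_df. The numbers below are invented, only three of the POPESTIMATE columns are used, and setting CTYNAME as the index is my own choice so that idxmax returns the county name:

```python
import pandas as pd

# Made-up stand-in for census_df; the values are not real census data.
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50],
    "CTYNAME": ["Alabama", "Autauga County", "Baldwin County", "Barbour County"],
    "POPESTIMATE2010": [100, 10, 50, 30],
    "POPESTIMATE2011": [110, 12, 80, 29],
    "POPESTIMATE2012": [120, 11, 95, 28],
})

def answer_seven():
    cols = ["POPESTIMATE2010", "POPESTIMATE2011", "POPESTIMATE2012"]
    # Keep county rows only and index by name so idxmax returns the name.
    counties = census_df[census_df["SUMLEV"] == 50].set_index("CTYNAME")[cols]
    # Series minus Series is element-wise; no brackets around either side.
    spread = counties.max(axis=1) - counties.min(axis=1)
    return spread.idxmax()

result = answer_seven()
```

Since max is never smaller than min, the spread is already non-negative and .abs() is optional here.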
I want to turn my date strings into the day of the year. I tried this code:
import pandas as pd
import datetime
data = pd.DataFrame()
data = pd.read_csv(xFilename, sep=",")
and get this DataFrame
Index Date Tmin Tmax
0 1950-01-02 -16.508 -2.096
1 1950-01-03 -6.769 0.875
2 1950-01-04 -1.795 8.859
3 1950-01-05 1.995 9.487
4 1950-01-06 -17.738 -9.766
Then I tried this:
convert = lambda x: x.DatetimeIndex.dayofyear
data['Date'].map(convert)
and got this error:
AttributeError: 'str' object has no attribute 'DatetimeIndex'
I expect to get a new column where 1950-01-02 = 2, 1950-01-03 = 3, and so on.
Thanks for your help, and sorry, I'm new to Python.
I think you need to pass the parse_dates parameter to read_csv so that the Date column is parsed as datetimes rather than strings, and then use Series.dt.dayofyear:
data = pd.read_csv(xFilename, parse_dates=["Date"])
data['dayofyear'] = data['Date'].dt.dayofyear
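Here is a self-contained sketch of the same idea using the five rows from the question. Reading from io.StringIO instead of xFilename is an assumption made only so the snippet runs without a file:

```python
import io
import pandas as pd

# The sample rows from the question, inlined so no file is needed.
csv_text = """Date,Tmin,Tmax
1950-01-02,-16.508,-2.096
1950-01-03,-6.769,0.875
1950-01-04,-1.795,8.859
1950-01-05,1.995,9.487
1950-01-06,-17.738,-9.766
"""

# parse_dates turns the Date column into datetime64 values on load.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# The .dt accessor only works on datetime columns, not on strings.
data["dayofyear"] = data["Date"].dt.dayofyear
```

If the file has already been loaded with string dates, pd.to_datetime(data['Date']) should give the same datetime column.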
I have a problem when I try to concatenate multiple DataFrames (a data structure from the DataFrames package!) with the same columns but different numbers of rows. Here's my code:
using(DataFrames)
DF = DataFrame()
DF[:x1] = 1:1000
DF[:x2] = rand(1000)
DF[:time] = append!( [0] , cumsum( diff(DF[:x1]).<0 ) ) + 1
DF1 = DF[DF[:time] .==1,:]
DF2 = DF[DF[:time] .==round(maximum(DF[:time])),:]
DF3 = DF[DF[:time] .==round(maximum(DF[:time])/4),:]
DF4 = DF[DF[:time] .==round(maximum(DF[:time])/2),:]
DF1[:T] = "initial"
DF2[:T] = "final"
DF3[:T] = "1/4"
DF4[:T] = "1/2"
DF = [DF1;DF2;DF3;DF4]
The last line gives me the error
MethodError: Cannot `convert` an object of type DataFrames.DataFrame to an object of type LastMain.LastMain.LastMain.DataFrames.AbstractDataFrame
This may have arisen from a call to the constructor LastMain.LastMain.LastMain.DataFrames.AbstractDataFrame(...),
since type constructors fall back to convert methods.
I don't understand this error message. Can you help me out? Thanks!
I just ran into this exact problem on Julia 0.5.0 x86_64-linux-gnu, DataFrames 0.8.5, with both hcat and vcat.
Neither clearing the workspace nor reloading DataFrames solved the problem, but restarting the REPL fixed it immediately.
I am new to Pandas. I have grouped a dataframe by date and applied a function to different columns of the dataframe as shown below
def func(x):
    questionID = x['questionID'].size()
    is_true = x['is_bounty'].sum()
    is_closed = x['is_closed'].sum()
    flag = True
    return pd.Series([questionID, is_true, is_closed, flag], index=['questionID', 'is_true', 'is_closed', 'flag'])
df_grouped = df1.groupby(['date'], as_index = False)
df_grouped = df_grouped.apply(func)
But when I run this I get an error saying
questionID = x['questionID'].size()
TypeError: 'int' object is not callable.
When I do the same thing this way it doesn't give any error.
df_grouped1 = df_grouped['questionID'].size()
I don't understand where I am going wrong.
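"'int' object is not callable" means you have to use size without ():
x['questionID'].size
Inside func, x['questionID'] is a plain Series, and Series.size is an attribute holding an int. df_grouped['questionID'] is a GroupBy object, where size() is a method, which is why your second version works.
The same can happen with other names: depending on the object, they may be attributes or methods.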
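A small sketch of both cases side by side. The three rows of df1 are invented, and selecting the columns before apply is my own addition so the function only sees the columns it uses:

```python
import pandas as pd

# Hypothetical data with the column names used in the question.
df1 = pd.DataFrame({
    "date": ["d1", "d1", "d2"],
    "questionID": [1, 2, 3],
    "is_bounty": [True, False, True],
    "is_closed": [False, False, True],
})

def func(x):
    return pd.Series({
        "questionID": x["questionID"].size,  # Series.size: attribute, no ()
        "is_true": x["is_bounty"].sum(),
        "is_closed": x["is_closed"].sum(),
        "flag": True,
    })

cols = ["questionID", "is_bounty", "is_closed"]
df_grouped = df1.groupby("date")[cols].apply(func)

# On a GroupBy object, size is a method and does need ().
group_sizes = df1.groupby("date")["questionID"].size()
```

len(x) would also work inside func and sidesteps the attribute-versus-method question.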