AttributeError: 'Styler' object has no attribute 'plot' - pandas

I worked on a database of skilled workers in different courses, and now I want to evaluate success in three different courses. After a groupby on the database I get the result below:
Courses:
1 0.5882
2 0.2195
3 0.2857
I used style.format to display the values as percentages:
style.format({'Courses': '{:,.2%}'.format})
Courses:
1 58.82%
2 21.95%
3 28.57%
Now, when I try to plot these values, I get the error below. How can I solve this problem?
AttributeError: 'Styler' object has no attribute 'plot'
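One way to read the error: style returns a Styler, which only controls how the table is rendered and has no plotting methods. A minimal sketch, assuming the grouped result lives in a DataFrame (called df here, which is an assumed name) and that matplotlib is installed: plot from the numeric DataFrame and format the axis as percentages separately.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
from matplotlib.ticker import PercentFormatter

# Values copied from the question; the name `df` is an assumption.
df = pd.DataFrame({"Courses": [0.5882, 0.2195, 0.2857]}, index=[1, 2, 3])

# Plot the numeric data itself, not the Styler returned by df.style.
ax = df["Courses"].plot(kind="bar")

# Show the y-axis ticks as percentages (the values are fractions of 1).
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
```

Keeping the styled view for display and the plain DataFrame for plotting means the underlying numbers are never turned into strings.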

Related

[pandas]Dividing all elements of columns in df with elements in another column (Same df)

I'm sorry, I know this is basic but I've tried to figure it out myself for 2 days by sifting through documentation to no avail.
My code:
import numpy as np
import pandas as pd
name = ["bob","bobby","bombastic"]
age = [10,20,30]
price = [111,222,333]
share = [3,6,9]
list = [name,age,price,share]
list2 = np.transpose(list)
dftest = pd.DataFrame(list2, columns = ["name","age","price","share"])
print(dftest)
name age price share
0 bob 10 111 3
1 bobby 20 222 6
2 bombastic 30 333 9
Want to divide all elements in 'price' column with all elements in 'share' column. I've tried:
print(dftest[['price']/['share']]) - Failed
dftest['price']/dftest['share'] - Failed, unsupported operand type
dftest.loc[:,'price']/dftest.loc[:,'share'] - Failed
Wondering if I could just change everything to int or float, I tried:
dftest.astype(float) - cant convert from str to float
I've tried the iter and items methods but could not understand the printouts...
My only suspicion is to use something called iterate, which I am unable to wrap my head around despite reading other old posts...
Please help me T_T
Apologies in advance for the somewhat protracted answer, but the question is somewhat unclear with regards to what exactly you're attempting to accomplish.
If you simply want price[0]/share[0], price[1]/share[1], etc. you can just do:
dftest['price_div_share'] = dftest['price'] / dftest['share']
The issue with the operand types can be solved by:
dftest['price_div_share'] = dftest['price'].astype(float) / dftest['share'].astype(float)
You're getting the cant convert from str to float error because you're trying to call astype(float) on the ENTIRE dataframe which contains string columns.
If you want to divide each item by each item, i.e. price[0] / share[0], price[1] / share[0], price[2] / share[0], price[0] / share[1], etc. You would need to iterate through each item and append the result to a new list. You can do that pretty easily with a for loop, although it may take some time if you're working with a large dataset. It would look something like this if you simply want the result:
new_list = []
for p in dftest['price'].astype(float):
    for s in dftest['share'].astype(float):
        new_list.append(p/s)
If you want to get this in a new dataframe you can simply pass the list to pd.DataFrame():
new_df = pd.DataFrame(new_list, columns=['price_divided_by_share'])
This new dataframe would only have one column (the result, as mentioned above). If you want the information from the original dataframe as well, then you would do something like the following:
new_list = []
for n, a, p in zip(dftest['name'], dftest['age'], dftest['price'].astype(float)):
    for s in dftest['share'].astype(float):
        new_list.append([n, a, p, s, p/s])
new_df = pd.DataFrame(new_list, columns=['name', 'age', 'price', 'share', 'price_div_by_share'])
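The every-price-by-every-share case can also be written without explicit loops. A sketch, assuming NumPy's ufunc outer method is acceptable here; the names ratios and pairs are illustrative and not from the original answer:

```python
import numpy as np
import pandas as pd

# String-typed columns, mimicking the DataFrame in the question.
dftest = pd.DataFrame({
    "name": ["bob", "bobby", "bombastic"],
    "price": ["111", "222", "333"],
    "share": ["3", "6", "9"],
})

price = dftest["price"].astype(float).to_numpy()
share = dftest["share"].astype(float).to_numpy()

# ratios[i, j] == price[i] / share[j] for every pair (i, j)
ratios = np.divide.outer(price, share)

# Label rows by name and columns by share value for readability.
pairs = pd.DataFrame(ratios, index=dftest["name"], columns=share)
print(pairs)
```

This produces a 3x3 table rather than a flat list, which may or may not suit what you need downstream.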
If you check the data types of your dataframe, you will realise that they are all strings/object type :
dftest.dtypes
name object
age object
price object
share object
dtype: object
The first step is to convert the relevant columns to numbers. This is one way:
dftest = dftest.set_index("name").astype(float)
dftest.dtypes
age float64
price float64
share float64
dtype: object
This way you make the names a useful index and separate them from the numeric data. This is just a suggestion; you may have other reasons to leave the names as a column. In that case, you have to change the data type of each numeric column individually.
Once that is done, you can safely execute your code :
dftest.div(dftest.share,axis=0)
age price share
name
bob 3.333333 37.0 1.0
bobby 3.333333 37.0 1.0
bombastic 3.333333 37.0 1.0
I assume this is what you expect as your outcome. If not, you can tweak it. The main point is to get your data types as numbers before any computation/division can occur.
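If the names should stay as a regular column, the per-column conversion mentioned above could look like the sketch below. It assumes the all-string DataFrame from the question and uses astype with a column-to-dtype mapping; the variable names numeric_cols and result are illustrative.

```python
import pandas as pd

# All-string frame, mimicking what the question ends up with.
dftest = pd.DataFrame({
    "name": ["bob", "bobby", "bombastic"],
    "age": ["10", "20", "30"],
    "price": ["111", "222", "333"],
    "share": ["3", "6", "9"],
})

numeric_cols = ["age", "price", "share"]

# Convert only the numeric columns; "name" stays a regular string column.
dftest = dftest.astype({col: float for col in numeric_cols})

# Divide each numeric column by "share", row by row.
result = dftest[numeric_cols].div(dftest["share"], axis=0)
print(result)
```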

AttributeError: 'int' object has no attribute 'count' while using itertuples() method with dataframes

I am trying to iterate over rows in a Pandas DataFrame using the itertuples() method, which works quite fine for my case. Now I want to check if a specific value ('x') is in a specific tuple. I used the count() method for that, as I need to use the number of occurrences of x later.
The weird part is that for some tuple fields this works just fine (in my case, (namedtuple[7].count('x')) + (namedtuple[8].count('x'))), but for others (e.g. namedtuple[9].count('x')) I get AttributeError: 'int' object has no attribute 'count'.
Would appreciate your help very much!
Apparently, some columns of your DataFrame are of object type (actually a string)
and some of them are of int type (more generally - numbers).
To count occurrences of x in each row, you should:
Apply a function to each row which:
checks whether the type of the current element is str,
if it is, return count('x'),
if not, return 0 (don't attempt to look for x in a number).
So far this function returns a Series, with a number of x in each column
(separately), so to compute the total for the whole row, this Series should
be summed.
Example of working code:
Test DataFrame:
C1 C2 C3
0 axxv bxy 10
1 vx cy 20
2 vv vx 30
Code:
for ind, row in df.iterrows():
    print(ind, row.apply(lambda it:
        it.count('x') if isinstance(it, str) else 0).sum())
(in my opinion, iterrows is more convenient here).
The result is:
0 3
1 1
2 1
So as you can see, it is possible to count occurrences of x,
even when some columns are not strings.
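Since the question uses itertuples(), the same type check can be sketched there as well. This assumes a DataFrame shaped like the test data above; count_x is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Same test data as above: two string columns and one integer column.
df = pd.DataFrame({
    "C1": ["axxv", "vx", "vv"],
    "C2": ["bxy", "cy", "vx"],
    "C3": [10, 20, 30],
})

def count_x(values, char="x"):
    """Count occurrences of char across the string fields only."""
    return sum(v.count(char) for v in values if isinstance(v, str))

# index=False leaves the row label out of each tuple.
counts = [count_x(row) for row in df.itertuples(index=False)]
print(counts)  # [3, 1, 1]
```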

how does the python code df['A'][df['B']] work when the column 'A' is a list of numbers and column 'B' is the target variable?

I am trying to run a shapiro test:
stats.shapiro(dataframe_iris_new['sepalWidth'][dataframe_iris_new['target']])
I am confused with how the above code works.
When you do dataframe['columnA'] it will return only that column, so it should throw error. Can you paste the output here?
Let's say you have a Series like this:
> s = pd.Series({
2: 'Alan',
4: 'Mary',
6: 'Sophie',
8: 'Jack'
})
2 Alan
4 Mary
6 Sophie
8 Jack
dtype: object
You can slice it by a single label or a list of labels:
> s[[2]]
2 Alan
dtype: object
> s[[6,2]]
6 Sophie
2 Alan
You can also slice it with a boolean list:
> s[[False, True, True, False]]
4 Mary
6 Sophie
dtype: object
So how does this fit into your question?
dataframe_iris_new['sepalWidth'] returns a series (let's call it s)
dataframe_iris_new['target'] returns another series (let's call it t)
s[t] is a slicing operation: cutting s according to the values in t:
If t is a list of labels, it will extract matching labels in s
If t is a boolean list, it will select the items where the value of t is True. This also requires that s and t are of the same length.
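Applied to the question, a sketch with made-up numbers (the real iris columns are not shown, so the integer-coded target here is an assumption):

```python
import pandas as pd

# Made-up values standing in for the iris columns.
sepal_width = pd.Series([3.5, 3.0, 3.2, 2.8, 3.3])
target = pd.Series([0, 0, 1, 1, 1])

# Integer target: the values are used as index labels, so only
# rows 0 and 1 come back, repeated once per entry in target.
by_label = sepal_width[target]

# Boolean mask: keeps the rows where the condition is True.
class_one = sepal_width[target == 1]

print(by_label.tolist())   # [3.5, 3.5, 3.0, 3.0, 3.0]
print(class_one.tolist())  # [3.2, 2.8, 3.3]
```

If target really holds integer class codes, s[t] is a label lookup rather than a filter, so a comparison such as target == 1 is probably closer to what a per-class Shapiro test needs.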

Error exceptions.IndexError while importing in Odoo products

I'm trying to import 8500 products, and I split the CSV into files of 1000 rows each. Everything goes fine, but when I reach 2500, I get this error:
Unknown error during import: : list index out of range at row 2
name,categ_id,standard_price,list_price,Public Price,default_code,description_purchase,Main Supplier,sale_delay,taxes_id,Id. Externo,property_account_expense,route_ids/id,Acabado,product_variant_ids/attribute_line_ids/attribute_id,product_variant_ids/attribute_line_ids/value_ids
Mueble Base Encajonada con Estante Metal,Category / Subcategory,999.00,999.24,999.24,A037073000,MOBILETTO BASE SCATOLATA,Provider,35,IVA 21%,A037073000,400000080,"purchase.route_warehouse0_buy,stock.route_warehouse0_mto",A03,Color,D7
Any idea where the problem is?
This error occurs because a row contains more values than there are column labels.
For example:
If you have 16 columns, you must supply exactly 16 values per row. If a row supplies 17 values, you get the list index out of range error.
Solution:
Recheck each row so that its number of values matches the number of column labels.
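A quick way to run that check before importing is sketched below with Python's standard csv module. The helper name find_bad_rows and the sample data are invented for illustration; the sketch only compares field counts and does not reproduce Odoo's importer.

```python
import csv
import io

def find_bad_rows(csv_text):
    """Return (line_number, field_count) for rows whose length differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]

# Three header columns; the last data row has one value too many.
sample = "name,price,code\nChair,10.0,A1\nTable,20.0,B2,extra\n"
print(find_bad_rows(sample))  # [(3, 4)]
```

Using csv.reader rather than splitting on commas matters here, because quoted fields such as the route_ids/id value in the question contain commas of their own.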

Stata - Spin on Reshape

I was working through reshaping a file and was wondering how Stata handled a file in the below format. Using data from a race, for example.
Race_Number Race_Date Racer_1_Name Racer_2_Name Racer_3_Name Racer_1_Position Racer_2_Position Racer_3_Position
Is it possible to transform this to the following?
Race_Number Race_Date Racer_Name Racer Position
Out of curiosity I created the above dataset and reshape did not work and I had to manually manipulate.
We would appreciate it if you showed us exactly what your input and output were. Things like
...reshape did not work and I had to manually manipulate.
don't tell us much.
Also, a complete toy data set would have helped. I assume you mean Race_Date where you typed Race Date (first code line) and Racer_Position where you typed Racer Position (second code line).
You can try
clear all
set more off
*----- example dataset -----
input ///
Race_Num Race_Dat str5(R1_Name R2_Name R3_Name) R1_Pos R2_Pos R3_Pos
1 5 "Al" "Bob" "Carl" 3 2 1
2 7 "Al" "Bob" "Carl" 3 1 2
3 15 "Al" "Bob" "Carl" 1 2 3
end
format Race_Dat %td
list
*----- what you want -----
forvalues i = 1/3 {
rename R`i'_Name Nam_R`i'
rename R`i'_Pos Pos_R`i'
}
list
reshape long Pos_R Nam_R, i(Race_Num) j(Racer)
order Race_Num Race_Dat
list, sepby(Race_Num)
All I did was change variable names before the reshape.
A better way is to use the @ placeholder, and then there's no need for renaming variables:
reshape long R@_Pos R@_Name, i(Race_Num) j(Racer)