Hi everyone,
I'm relatively new to Python. I want to plot a pie chart by department, i.e., what percentage of the employees come from Sales & Marketing, Operations, Technology, etc.
I'm not sure how to count the number of employees in each department first and then feed those counts into the pie chart. Any help will be greatly appreciated!
You can use pandas.Series.value_counts with pandas.Series.plot.pie:
import pandas as pd

df = pd.DataFrame({"department": ["Sales", "Operations",
                                  "Sales", "Sales", "Technology"]})
# Count the rows per department, then plot the counts as a pie chart
df["department"].value_counts().plot.pie()
Output:
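Since the question asks for percentages, here is a minimal sketch that extends the snippet above. It assumes the same toy `department` column; the chart title and the variable names `counts` and `percentages` are only illustrative. `value_counts(normalize=True)` returns the shares as fractions, and the `autopct` argument (passed through to matplotlib's pie function) prints each wedge's percentage on the chart.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"department": ["Sales", "Operations",
                                  "Sales", "Sales", "Technology"]})

# Number of employees per department
counts = df["department"].value_counts()

# Share of employees per department, as a percentage
percentages = df["department"].value_counts(normalize=True) * 100

# autopct labels each wedge with its percentage of the total
ax = counts.plot.pie(autopct="%1.1f%%")
ax.set_ylabel("")  # drop the default y-axis label
ax.set_title("Employees by department")  # illustrative title
# Call plt.show() or plt.savefig(...) to display or save the figure
```

If you only need the numbers and not the chart, `percentages` already holds them.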
I have the following datasets:
df1 = {'lower':[3.99,4.99,5.99,1700], 'percentile':[1,2,5,10,50,100]}
df2 = {'lower':[2.99,4.50,5,1850], 'percentile':[2,4,7,15,55,100]}
The data:
The percentile refers to the percentage of the data that falls at or below a particular price, e.g. 3.99 would represent 1% of the data, while all values under 5.99 would represent 5% of the data.
Each dataset has 100 rows, since we are showing percentiles, but the prices differ between the two datasets.
What I have done so far:
What I need help with:
As you can see in the third graph, I can plot the two datasets overlaid, which is what I need, but I have been unable to change the legend and the odd x-axis tick values on that graph. The x axis is not showing the percentile, or any other metric I might want to use there.
Any help?
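One possible approach, as a rough sketch only: the asker's plotting code and graphs are not shown, so this assumes the goal is two overlaid lines with the percentile on the x axis and a custom legend. The values below are made up so that both columns have the same length, and the labels "Dataset 1" and "Dataset 2" are placeholders.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: these values are invented so that both columns have
# equal length; they are not the real datasets from the question.
df1 = pd.DataFrame({"percentile": [1, 2, 5, 10, 50, 100],
                    "lower": [3.99, 4.99, 5.99, 7.50, 20.00, 1700]})
df2 = pd.DataFrame({"percentile": [2, 4, 7, 15, 55, 100],
                    "lower": [2.99, 4.50, 5.00, 8.00, 25.00, 1850]})

fig, ax = plt.subplots()
# Passing x= explicitly puts the percentile on the x axis instead of the row index;
# sharing ax= draws both lines on the same axes
df1.plot(x="percentile", y="lower", ax=ax, label="Dataset 1")
df2.plot(x="percentile", y="lower", ax=ax, label="Dataset 2")

ax.set_xlabel("Percentile")
ax.set_ylabel("Price")
ax.set_yscale("log")  # optional: the prices span several orders of magnitude
ax.legend(title="Source")
```

If the x ticks look odd, a common cause is plotting against the default row index rather than the percentile column, which the explicit `x=` argument avoids.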
I have a dataframe with 5 columns: salesman, vendor, product1, product2 and product3. The dataframe is similar to below:
I want to group by salesman, loop over the groups, and create a separate stacked bar chart for each salesman. For each chart, the x-axis should be the vendor and the y-axis should stack the column values (i.e. the product1/2/3 sales amounts). Desired output:
I have tried several methods combined with a for loop but keep getting errors. I'd appreciate any guidance on this one! Thank you.
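A minimal sketch of one way to do this, assuming the column names from the question. The rows are hypothetical, since the original DataFrame was not included in the post, and the names `product_cols` and `axes` are only illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data; the original DataFrame was not included in the post
df = pd.DataFrame({
    "salesman": ["Ann", "Ann", "Bob", "Bob"],
    "vendor": ["V1", "V2", "V1", "V3"],
    "product1": [10, 20, 5, 8],
    "product2": [3, 7, 12, 4],
    "product3": [6, 1, 9, 15],
})

product_cols = ["product1", "product2", "product3"]
axes = {}

# Iterating over a groupby yields one (name, sub-DataFrame) pair per salesman
for salesman, group in df.groupby("salesman"):
    # Sum in case a vendor appears more than once for the same salesman
    totals = group.groupby("vendor")[product_cols].sum()
    # Each call without ax= opens a new figure, so every salesman gets a separate chart
    ax = totals.plot(kind="bar", stacked=True, title=f"Sales for {salesman}")
    ax.set_xlabel("Vendor")
    ax.set_ylabel("Sales amount")
    axes[salesman] = ax
```

Keeping the axes in a dictionary is optional; it just makes it easy to tweak or save an individual chart afterwards.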
The DataFrame can be viewed here: Global Suicide Dataset
I have made a pivot table with country and year as indices using the following code:
import numpy as np
import pandas as pd

df1 = pd.pivot_table(df, index=['country', 'year'],
                     values=['suicides_no', 'gdp_per_capita ($)', 'population', 'suicides/100k pop'],
                     aggfunc={"suicides_no": np.sum,
                              "gdp_per_capita ($)": np.mean,
                              "population": np.mean,
                              "suicides/100k pop": np.mean})
Output:
For my project, I want to visualize how suicides_no varies with gdp_per_capita for a country over the years, but I am unable to plot it. Can somebody please help me out?
First, let's convert the index levels to columns using df1.reset_index(inplace=True)
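To show what that step changes, here is a small sketch on made-up rows (not the real dataset) with only two of the value columns. It uses the string aggregators "sum" and "mean", which behave like np.sum and np.mean here.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset
df = pd.DataFrame({
    "country": ["A", "A", "A", "B"],
    "year": [2000, 2000, 2001, 2000],
    "suicides_no": [10, 20, 5, 7],
    "population": [1000, 3000, 2000, 500],
})

df1 = pd.pivot_table(df, index=["country", "year"],
                     values=["suicides_no", "population"],
                     aggfunc={"suicides_no": "sum", "population": "mean"})

# Before: 'country' and 'year' live in the MultiIndex, not in the columns
print(df1.columns.tolist())

df1 = df1.reset_index()

# After: they are ordinary columns that plotting functions can reference by name
print(df1.columns.tolist())
```

Plotting functions that take column names through a `data=` argument need `country` and `year` to be regular columns, which is why the reset comes first.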
Now you can draw this as a scatter plot where the main features are the year (preferably on the x-axis) and suicides_no (on the y-axis). The gdp_per_capita can then be encoded as the color (hue) or the size of the dots.
In this case you have two options:
1. Draw a separate plot for each country (GDP is shown as hue):
sns.catplot(x='year', y='suicides_no', row='country', hue='gdp_per_capita ($)', data=df1)
2. Draw everything in a single plot: a scatter plot with GDP as the dot size and country as the color (hue):
sns.scatterplot(x='year', y='suicides_no', hue='country', size='gdp_per_capita ($)', data=df1)
I have a DataFrame that records the number of persons affected by two different issues for each year from 1999 to 2005, as in the image below.
[image: dataframe]
Can I create a bar chart that compares the number of persons affected by cancer and by heart disease for each year?
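A grouped bar chart is one way to get this comparison. The sketch below is only illustrative: the original data was shown as an image, so the column names (`Year`, `Cancer`, `Heart disease`) and all values here are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical column names and values; the original data was only shown as an image
df = pd.DataFrame({
    "Year": [1999, 2000, 2001],
    "Cancer": [120, 135, 128],
    "Heart disease": [200, 190, 210],
})

# With the year as the index, each remaining column becomes one bar per year,
# drawn side by side so the two causes can be compared directly
ax = df.set_index("Year").plot(kind="bar", rot=0)
ax.set_ylabel("Persons affected")
```

Passing `stacked=True` to `plot` would stack the two bars instead of placing them side by side.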
I am reading a huge CSV file using the pandas module.
filename = pd.read_csv(filepath)
Then I converted it to a DataFrame:
df = pd.DataFrame(filename, index=None)
From the CSV file, I am interested in three columns: country, year, and value.
I grouped the rows by country, summed the values, and plotted the result as a bar graph with the following code:
df.groupby('country').value.sum().plot(kind='bar')
Here the x axis is the country and the y axis is the value.
Now I want to turn this bar graph into a stacked bar chart, using the third column, year, so that differently colored segments represent each year. I'm looking for an easy way to do this.
Note that the year column contains years from 2000 to 2019.
Thanks.
From what I understand, you should try something like:
df.groupby(['country', 'year']).value.sum().unstack().plot(kind='bar', stacked=True)
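To see what the unstack step produces, here is a small sketch on made-up data with the column names from the question (`country`, `year`, `value`). The `fill_value=0` argument and the two-column legend are optional additions, not part of the answer above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows; the real CSV was not included in the post
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "year": [2000, 2000, 2001, 2000, 2001],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Rows: country; columns: year; cells: summed value.
# fill_value=0 covers country/year combinations that have no rows.
table = df.groupby(["country", "year"])["value"].sum().unstack(fill_value=0)

# Each column (year) becomes one colored segment of the country's bar
ax = table.plot(kind="bar", stacked=True)
ax.legend(title="year", ncol=2)  # with 20 years, several legend columns save space
```

The unstacked table has one column per year, so with years from 2000 to 2019 each bar is split into up to 20 segments.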