How to remove a specific legend label in DataFrame.plot? - pandas

I am trying to plot two pandas objects together, one in 'bar' style and the other in 'line' style, but I have trouble showing the legend for the bars only, excluding the line.
Here is my code:
import numpy as np
import pandas as pd
np.random.seed(5)
df = pd.DataFrame({'2012':np.random.random_sample((4,)),'2014':np.random.random_sample((4,))})
df.index = ['A','B','C','D']
sumdf = df.T.apply(np.sum,axis=1)
ax = df.T.plot.bar(stacked=True)
sumdf.plot(ax=ax)
ax.set_xlim([-0.5,1.5])
ax.set_ylim([0,3])
ax.legend(loc='upper center',ncol=3,framealpha=0,labelspacing=0,handlelength=4,borderaxespad=0)
Annoyingly, I got a figure where the line's entry also appears in the legend box. I want to remove it rather than make it invisible, but I cannot find a way to do that.
Thank you!

If an artist's label starts with an underscore, matplotlib leaves it out of the legend by default.
You can simply change
sumdf.plot(ax=ax)
to
sumdf.plot(ax=ax, label='_')
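For reference, here is a minimal end-to-end sketch of that change applied to the question's code. The label '_total' and the Agg backend line are illustrative choices rather than part of the original answer; any label beginning with an underscore should behave the same way. df.sum() is used as a shorter equivalent of the question's df.T.apply(np.sum, axis=1).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(5)
df = pd.DataFrame({'2012': np.random.random_sample((4,)),
                   '2014': np.random.random_sample((4,))},
                  index=['A', 'B', 'C', 'D'])
sumdf = df.sum()  # column totals, one value per year

ax = df.T.plot.bar(stacked=True)
# The leading underscore tells matplotlib to skip this line when building the legend.
sumdf.plot(ax=ax, label='_total')
ax.legend(loc='upper center', ncol=4, framealpha=0)

# Collect the labels that actually ended up in the legend.
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

The line is still drawn on the axes; only its legend entry is omitted.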

Related

python pandas plot line chart in pandas.plot hbar

I have a horizontal bar chart created with
df.plot(kind='barh', ax=ax)
and now I would like to plot a horizontal line chart on the same axes. How can I do that? There seems to be no equivalent 'lineh' kind.
I tried to just flip the axes when plotting a regular line:
df=pd.DataFrame(dict(k=['A','B','C','D'], v=[1,3,2,3]))
df.plot(x='v', y='k')
but then pandas complains that there is no numerical data to plot.
If you want to use matplotlib, you can do the following. Here xticks() is used to place x-tick labels only at integer values.
import pandas as pd
import matplotlib.pyplot as plt
df=pd.DataFrame(dict(k=['A','B','C','D'], v=[1,3,2,3]))
plt.plot(df.v, df.k)
plt.xticks(range(1, max(df.v)+1))
plt.show()
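To overlay the line on the pandas barh chart from the question, one possible sketch is to draw it with matplotlib on the same axes. This assumes pandas places the horizontal bars at y positions 0, 1, 2, ..., which is how its categorical bar plots are laid out; the color and marker are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(k=['A', 'B', 'C', 'D'], v=[1, 3, 2, 3]))

# Horizontal bars: pandas puts one bar per row at y = 0, 1, 2, ...
ax = df.plot(kind='barh', x='k', y='v', legend=False)

# "Horizontal" line: values on the x axis, bar positions on the y axis.
(line,) = ax.plot(df['v'], np.arange(len(df)), color='black', marker='o')
```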

How to plot Series with selective ticks?

I have a Series that I would like to plot as a bar chart: pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts()
Since I have many bars I only want to display some (equidistant) ticks.
However, unless I actively work against it, pyplot prints the wrong labels. E.g., if I leave out set_xticklabels in the code below, I get a plot where the index elements are taken one after another and simply placed at the specified distance.
This code does what I want:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
s = pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts().sort_index()
mi,ma = min(s.index), max(s.index)
s = s.reindex(range(mi,ma+1,1), fill_value=0)
distance = 10
a = s.plot(kind='bar')
condition = lambda t: int(t[1].get_text()) % distance == 0
ticks_,labels_=zip(*filter(condition, zip(a.get_xticks(), a.get_xticklabels())))
a.set_xticks(ticks_)
a.set_xticklabels(labels_)
plt.show()
But I still feel like I'm being unnecessarily clever here. Am I missing a function? Is this the best way of doing that?
Consider not using a pandas bar plot if you intend to plot numeric values, because pandas bar plots are categorical in nature.
If you use a matplotlib bar plot instead, which is numeric in nature, there is no need to tinker with any ticks at all.
s = pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts().sort_index()
plt.bar(s.index, s)
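A self-contained version of this idea might look like the sketch below. The MultipleLocator line is an optional addition, not part of the answer above, for forcing a tick at every multiple of 10.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
import pandas as pd

s = pd.Series([-4, 2, 3, 3, 4, 5, 9, 20]).value_counts().sort_index()

fig, ax = plt.subplots()
# Numeric x axis: each bar is centered on its actual value.
ax.bar(s.index, s.values)
# Optional: place a major tick at every multiple of 10.
ax.xaxis.set_major_locator(MultipleLocator(10))

# Horizontal center of each bar, to confirm they sit at the index values.
bar_centers = [p.get_x() + p.get_width() / 2 for p in ax.patches]
```

Because the axis is numeric, gaps between values such as 9 and 20 are shown to scale, and no reindexing is needed.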
I think you overcomplicated it. You can simply use the following; you just need to find the relationship between the tick positions and the tick labels.
a = s.plot(kind='bar')
xticks = np.arange(0, ma + 1, 10)  # labels to show: 0, 10, 20
plt.xticks(xticks - mi, xticks)  # the bar for value v sits at position v - mi
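If you want to stay with the pandas bar plot, another way to express the position-to-label relationship is to pick the positions directly from the reindexed index. This is only a sketch; the names positions and distance are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([-4, 2, 3, 3, 4, 5, 9, 20]).value_counts().sort_index()
# Fill the gaps so that bar position i corresponds to value s.index[i].
s = s.reindex(range(s.index.min(), s.index.max() + 1), fill_value=0)

distance = 10
ax = s.plot(kind='bar')

# Keep only the bar positions whose value is a multiple of `distance`.
positions = [i for i, value in enumerate(s.index) if value % distance == 0]
ax.set_xticks(positions)
ax.set_xticklabels([str(s.index[i]) for i in positions], rotation=0)
```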

How can I remove or change text from a matplotlib plot?

I have a bar plot which is being returned to me (I have access to the AxesSubplot object) and which already has some labels on the bars. The issue is that they are illegible, and I would like to enlarge them (or clear and reset them). Take the following code for example:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'a':['red','green','blue'], 'b':[4,8,12]})
plot = df.plot(kind='barh')
for i in plot.patches:
    plot.text(i.get_width()+.01, i.get_y()+.38, str(i.get_width()), fontsize=31)
This generates a nice bar plot with labels on the bars. But let's say I want to remove or change those labels. How can this be done?
You can access the text objects using plot.texts. In your example, you get:
>>> plot.texts
[Text(4.01,0.13,'4'), Text(8.01,1.13,'8'), Text(12.01,2.13,'12')]
You can hide them all in a loop:
for t in plot.texts:
    t.set_visible(False)
Or change attributes (fontsize, for example) in a similar manner:
for t in plot.texts:
    # Reduce fontsize to 10:
    t.set_fontsize(10)
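Note that set_visible(False) only hides the Text objects; they stay attached to the axes. If you want to delete them and write new labels, a sketch along these lines may help. It relies on Artist.remove(), and it iterates over a copy of the list so that removal does not disturb the loop. The new offsets and font size are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'a': ['red', 'green', 'blue'], 'b': [4, 8, 12]})
ax = df.plot(kind='barh')
for bar in ax.patches:
    ax.text(bar.get_width() + .01, bar.get_y() + .38, str(bar.get_width()), fontsize=31)

# Delete the existing labels; iterate over a copy because remove() changes ax.texts.
for t in list(ax.texts):
    t.remove()
texts_after_removal = len(ax.texts)

# Write fresh, smaller labels, vertically centered on each bar.
for bar in ax.patches:
    ax.text(bar.get_width() + .1, bar.get_y() + bar.get_height() / 2,
            str(bar.get_width()), fontsize=12, va='center')
```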

Seaborn FacetGrid plots are empty if any of the sub-plots have no data

I have a dataset that I want to plot with FacetGrids using the seaborn library. The problem is my data is "sparse"; some of the individual subplots don't exist (ie. there are zero data points). I would like those cells to either not show up, or just show up and be blank, but still see the subplots that have data. Here's a simple example:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(columns=['a','b','c','d'],
                  data=[[1,1,1,4],[1,2,2,8],[2,1,2,12],[2,1,3,14]])
print(df)
g = sns.FacetGrid(df, col='a', row='b', hue='c')
g.map(plt.scatter, 'c', 'd', marker='o')
Unfortunately, when I plot this, I just get four empty plots instead of 3 filled plots and one empty one. If I change the last row of data to [2,2,3,14] instead, then all four plots appear as expected. Is this a bug in seaborn? Can I work around it somehow?

Creating a bar plot using Seaborn

I am trying to plot a bar chart using seaborn. Sample data:
import pandas as pd
import seaborn as sns
x=[1,1000,1001]
y=[200,300,400]
cat=['first','second','third']
df = pd.DataFrame(dict(x=x, y=y,cat=cat))
When I use:
sns.factorplot("x","y", data=df,kind="bar",palette="Blues",size=6,aspect=2,legend_out=False);
The figure produced is
When I add the legend
sns.factorplot("x","y", data=df,hue="cat",kind="bar",palette="Blues",size=6,aspect=2,legend_out=False);
The resulting figure looks like this
As you can see, the bars are shifted away from their x values. I don't know how to get the same layout as in the first figure and add the legend.
I am not necessarily tied to seaborn, I like the color palette, but any other approach is fine with me. The only requirement is that the figure looks like the first one and has the legend.
It looks like this issue arises from the following, quoted from the seaborn.factorplot docs:
hue : string, optional
Variable name in data for splitting the plot by color. In the case of kind="bar", this also influences the placement on the x axis.
So, since seaborn uses matplotlib, you can do it like this:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
x=[1,1000,1001]
y=[200,300,400]
sns.set_context(rc={"figure.figsize": (8, 4)})
nd = np.arange(3)
width=0.8
plt.xticks(nd+width/2., ('1','1000','1001'))
plt.xlim(-0.15,3)
fig = plt.bar(nd, y, width, align='edge', color=sns.color_palette("Blues",3))  # edge-aligned so the ticks at nd+width/2 are centered
plt.legend(fig, ['First','Second','Third'], loc = "upper left", title = "cat")
plt.show()
Added @mwaskom's method to get the three seaborn colors.
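If you would rather stay within seaborn, a possible alternative (not from the answer above) is to call sns.barplot with hue and dodge=False, which should keep each bar centered on its x tick while still producing a legend. This sketch assumes a seaborn version in which barplot accepts the dodge parameter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame(dict(x=[1, 1000, 1001],
                       y=[200, 300, 400],
                       cat=['first', 'second', 'third']))

fig, ax = plt.subplots(figsize=(8, 4))
# dodge=False stops seaborn from shifting bars sideways to make room for hue levels.
sns.barplot(data=df, x='x', y='y', hue='cat', dodge=False, palette='Blues', ax=ax)

# Labels that ended up in the legend seaborn created.
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```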