How to set common labels with matplotlib - matplotlib

I have a plot obtained in this way:
f, ((ax1, ax2, ax3, ax4), (ax5, ax6, ax7, ax8), (ax9, ax10, ax11, ax12)) = plt.subplots(3, 4, sharex = 'col', sharey = 'row')
ax1.set_title('column1')
ax1.plot([x], [y])
ax5.plot([x1],[y1])
ax9.plot([x2],[y2])
.....
So I essentially have 3 rows and 4 columns.
I would like to know how to put common labels on the x and y axes.
I tried to write
plt.xlabel('x')
plt.ylabel('y')
or
set.xlabel('x')
set.ylabel('y')
but it doesn't work. Can you help me? Is it also possible to put text on the right-hand side of the plot?

You can do this by iterating over your list of axes:
f, ax_lst = plt.subplots(3, 4, sharex = 'col', sharey = 'row')
for ax_l in ax_lst:
    for ax in ax_l:
        ax.set_xlabel('x')
        ax.set_ylabel('y')
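If the goal is a single label for the whole figure rather than one per subplot, a minimal sketch could look like the following. It assumes Matplotlib 3.4 or newer, where `Figure.supxlabel` and `Figure.supylabel` are available. The plotted data and the right-hand text are placeholders, and `fig.text` is just one way to place the text on the right-hand side that the question asks about.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(3, 4, sharex="col", sharey="row")

# Placeholder data: one short line per subplot.
for ax in axes.flat:
    ax.plot([0, 1, 2], [0, 1, 4])

# One label for the whole figure instead of one per subplot (Matplotlib >= 3.4).
x_label = fig.supxlabel("x")
y_label = fig.supylabel("y")

# Leave room on the right, then place rotated text in figure coordinates (0..1).
fig.subplots_adjust(right=0.9)
right_text = fig.text(0.95, 0.5, "right-hand text",
                      va="center", ha="center", rotation=270)
```

On older Matplotlib versions, `fig.text` with suitable coordinates can stand in for the two `sup*label` calls.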

Related

Showing Matplotlib pie chart only top 3 item's percentage [duplicate]

I have the following code:
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(123456)
import pandas as pd
df = pd.DataFrame(3 * np.random.rand(4, 4), index=['a', 'b', 'c', 'd'],
                  columns=['x', 'y','z','w'])
plt.style.use('ggplot')
# 'axes.color_cycle' was removed from newer Matplotlib releases; use the prop cycle instead
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, axes = plt.subplots(nrows=2, ncols=3)
for ax in axes.flat:
    ax.axis('off')
for ax, col in zip(axes.flat, df.columns):
    ax.pie(df[col], labels=df.index, autopct='%.2f', colors=colors)
    ax.set(ylabel='', title=col, aspect='equal')
axes[0, 0].legend(bbox_to_anchor=(0, 0.5))
fig.savefig('your_file.png') # Or whichever format you'd like
plt.show()
Which produces the following:
My question is: how can I remove a label based on a condition? For example, I only want to display labels with a percentage > 20%, so that the labels and values of a, c, d won't be displayed in X, etc.
The autopct argument of pie can be a callable, which receives the current percentage. So you only need to provide a function that returns an empty string for the values whose percentage you want to omit.
Function
def my_autopct(pct):
    return ('%.2f' % pct) if pct > 20 else ''
Plot with matplotlib.axes.Axes.pie
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))
for ax, col in zip(axes.flat, df.columns):
    ax.pie(df[col], labels=df.index, autopct=my_autopct)
    ax.set(ylabel='', title=col, aspect='equal')
fig.tight_layout()
Plot directly with the dataframe
axes = df.plot(kind='pie', autopct=my_autopct, figsize=(8, 6), subplots=True, layout=(2, 2), legend=False)
for ax in axes.flat:
    yl = ax.get_ylabel()
    ax.set(ylabel='', title=yl)
fig = axes[0, 0].get_figure()
fig.tight_layout()
If you need to parametrize the value on the autopct argument, you'll need a function that returns a function, like:
def autopct_generator(limit):
    def inner_autopct(pct):
        return ('%.2f' % pct) if pct > limit else ''
    return inner_autopct
ax.pie(df[col], labels=df.index, autopct=autopct_generator(20), colors=colors)
For the labels, the best thing I can come up with is using list comprehension:
for ax, col in zip(axes.flat, df.columns):
    data = df[col]
    labels = [n if v > data.sum() * 0.2 else ''
              for n, v in zip(df.index, data)]
    ax.pie(data, autopct=my_autopct, colors=colors, labels=labels)
Note, however, that by default the legend is generated from the labels passed to pie, so you'll need to pass all values explicitly to keep it intact.
axes[0, 0].legend(df.index, bbox_to_anchor=(0, 0.5))
For labels I have used:
def my_level_list(data):
    labels = []  # avoid shadowing the built-in name `list`
    for i in range(len(data)):
        if (data[i]*100/np.sum(data)) > 2:  # 2%
            labels.append('Label '+str(i+1))
        else:
            labels.append('')
    return labels
patches, texts, autotexts = plt.pie(data, radius = 1, labels=my_level_list(data), autopct=my_autopct, shadow=True)
You can make the labels function a little shorter using list comprehension:
def my_autopct(pct):
    return ('%1.1f' % pct) if pct > 1 else ''
def get_new_labels(sizes, labels):
    new_labels = [label if size > 1 else '' for size, label in zip(sizes, labels)]
    return new_labels
fig, ax = plt.subplots()
_,_,_ = ax.pie(sizes, labels=get_new_labels(sizes, labels), colors=colors, autopct=my_autopct, startangle=90, rotatelabels=False)

Problem with subplots in matplotlib

I have the following code which works just fine:
plt.rcParams["figure.figsize"] = (5,5) # V1.0b
fig, axes = plt.subplots(ncols = 2, nrows = 2) # V1.0b
ax1, ax2, ax3, ax4 = axes.flatten()
plt.subplot(2, 2, 1)
ax1.plot(x1, y1)
ax1.plot(x2, y2)
(etc)
Exactly as expected, I get 2 plots in row 1, 2 plots in row 2.
Now, I want 2 rows by 3 cols and 4 plots (from exactly the same data):
plt.rcParams["figure.figsize"] = (6,4)
fig, axes = plt.subplots(ncols = 3, nrows = 2)
ax1, ax2, ax3, ax4 = axes.flatten()
plt.subplot(2, 3, 1)
ax1.plot(x1, y1)
(etc)
And I get an error from the line:
---> 12 ax1, ax2, ax3, ax4 = axes.flatten()
The error message is:
ValueError: too many values to unpack (expected 4)
Surely ax1, ax2, ax3, ax4 are the 4 values? But, evidently not; what's going wrong here?
I've found this works. As you say, no need for subplots:
figure, axis = plt.subplots(3, 3)
axis[0, 0].set_title("NGC0628")
axis[0, 0].plot(x0,y0)
axis[0, 1].plot(x1,y1)
axis[0, 2].plot(x2,y2)
(etc)
BTW I need control over each plot, i.e. as in
axis[0, 0].set_title("NGC0628")
Thanks for the steer
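The error in the question comes from the grid size: a 2 x 3 grid holds six axes, so `axes.flatten()` returns six values, which cannot be unpacked into four names. A minimal sketch of one way to keep four named axes and hide the rest follows. The starred name `unused` and the plotted data are illustrative assumptions, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(6, 4))

# Six axes come back; the starred name collects the two that are not plotted into.
ax1, ax2, ax3, ax4, *unused = axes.flatten()

for ax in (ax1, ax2, ax3, ax4):
    ax.plot([0, 1, 2], [0, 1, 4])  # placeholder data

for ax in unused:
    ax.set_visible(False)  # hide the empty panels
```

Each named axes object can still be controlled individually, for example with `ax1.set_title("NGC0628")`.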

change color of bar for data selection in seaborn histogram (or plt)

Let's say I have a dataframe like:
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
and I am doing the following plotting routine:
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
plt.tight_layout()
plt.show()
Which yields:
Now I want to be able to perform a data selection on the dataframe a. Let's say something like:
b = a[(a['X2'] <4)]
and highlight the selection from b in the posted histograms.
For example, if the first row of b is [32:0] for X3 and [0:5] for X2, the desired output would be:
Is it possible to do this with the above for loop and with sns? Many thanks!
EDIT: I am also happy with a matplotlib solution, if easier.
EDIT2:
If it helps, it would be similar to do the following:
b = a[(a['X3'] >38)]
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    sns.distplot(b[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
plt.tight_layout()
plt.show()
which yields the following:
However, I would like to be able to just colour those bars in the first plot in a different colour!
I also thought about setting the ylim to only the size of the blue plot so that the orange won't distort the shape of the blue distribution, but that still wouldn't be feasible: in reality I have about 10 histograms to show, and setting ylim would be pretty much the same as sharey=True, which I'm trying to avoid so that I can show the true shape of the distributions.
I think I found the solution for this using the inspiration from the previous answer and this video:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(2021)
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
b = a[(a['X3'] < 30)]
hist_idx=[]
for i, c in enumerate(a.columns):
    bin_ = np.histogram(a[c], bins=20)[1]
    hist = np.where(np.logical_and(bin_<=max(b[c]), bin_>min(b[c])))
    hist_idx.append(hist)
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    axes[1, i].hist(a[c], bins = 20)
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
for it, index in enumerate(hist_idx):
    length = len(index[0])
    for r in range(length):
        try:
            axes[1, it].patches[index[0][r]-1].set_fc("red")
        except IndexError:  # skip bin indices that have no matching bar
            pass
plt.tight_layout()
plt.show()
which yields the following for b = a[(a['X3'] < 30)] :
or for b = a[(a['X3'] > 36)]:
Thought I'd leave it here - although niche, might help someone in the future!
I created the following code with the understanding that the intent of your question is to add a different color to the histogram based on the data extracted under certain conditions.
Use np.histogram() to get an array of frequencies and an array of bins. Get the index of the value closest to the value of the first row of data extracted for a certain condition. Change the color of the histogram with that retrieved index. The same method can be used to deal with the other graph.
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(2021)
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
b = a[(a['X2'] <4)]
hist3, bins3 = np.histogram(X3)
idx = np.abs(np.asarray(hist3) - b['X3'].head(1).values[0]).argmin()
for k in range(idx):
    axes[1,0].get_children()[k].set_color("red")
plt.tight_layout()
plt.show()
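An alternative sketch, assuming a plain Matplotlib histogram is acceptable: `Axes.hist` returns the bin edges and the bar patches, so bars can be coloured by comparing each bin's edges with the range of the selection. The threshold `30`, the names `selected` and `highlighted`, and the rule "colour any bin that overlaps the selected range" are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2021)
x3 = rng.normal(34, 2, 200)
selected = x3[x3 < 30]  # hypothetical selection, analogous to b in the question

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(x3, bins=20)

highlighted = 0
if selected.size:
    sel_min, sel_max = selected.min(), selected.max()
    # Bar k spans edges[k]..edges[k + 1]; colour it if that span overlaps the selection.
    for left, right, patch in zip(edges[:-1], edges[1:], patches):
        if left <= sel_max and right >= sel_min:
            patch.set_facecolor("red")
            highlighted += 1
```

Because the bars are matched through the bin edges rather than through patch indices, no index offset or try/except is needed.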

y and x axis subplots matplotlib

A quite basic question about ticks' labels for x and y-axis. According to this code
fig, axes = plt.subplots(6,12, figsize=(50, 24), constrained_layout=True, sharex=True , sharey=True)
fig.subplots_adjust(hspace = .5, wspace=.5)
custom_xlim = (-1, 1)
custom_ylim = (-0.2,0.2)
for i in range(72):
    x_data = ctheta[i]
    y_data = phi[i]
    y_err = err_phi[i]
    ax = fig.add_subplot(6, 12, i+1)
    ax.plot(x_data_new, bspl(x_data_new))
    ax.axis('off')
    ax.errorbar(x_data,y_data, yerr=y_err, fmt="o")
    ax.set_xlim(custom_xlim)
    ax.set_ylim(custom_ylim)
I get the following output:
With y labels for the plots in the first column and x labels for the ones along the last row, although I turn the axes off.
Any idea?
As @BigBen wrote in their comment, your issue is caused by adding axes to your figure twice: once via fig, axes = plt.subplots() and then again within your loop via fig.add_subplot(). As a result, the first set of axes is still visible even after you apply .axis('off') to the second set.
Instead of the latter, you could change your loop to:
for i in range(6):
    for j in range(12):
        ax = axes[i,j] # these are the axes created via plt.subplots(6,12,...)
        ax.axis('off')
        # … your other code here
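A self-contained sketch of that fix, using a smaller 2 x 3 grid and placeholder data in place of `ctheta`, `phi` and `err_phi` (all values below are made up for illustration). Iterating over `axes.flat` reuses the axes created by `plt.subplots`, so no second set of axes is added to the figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

n_rows, n_cols = 2, 3  # smaller than the 6 x 12 grid in the question
fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True,
                         constrained_layout=True)

x = np.linspace(-1, 1, 20)
for i, ax in enumerate(axes.flat):
    y = 0.1 * np.sin((i + 1) * x)  # placeholder data
    ax.errorbar(x, y, yerr=0.02, fmt="o")
    ax.set_xlim(-1, 1)
    ax.set_ylim(-0.2, 0.2)
    ax.axis("off")  # applies to the only axes in this grid cell

n_axes = len(fig.axes)  # number of axes the figure actually holds
```

Checking `len(fig.axes)` is a quick way to spot accidental duplicates: with the original `fig.add_subplot` call inside the loop, the figure would hold twice as many axes as grid cells.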

plotting histograms next to each other

I have 4 histograms from 4 different df. I can plot them one by one but can't figure out a way to plot them all next to each other, let's say 2 on top, 2 at the bottom.
hist_1 = df1.hist(bins=50,range=[0,1])
hist_2 = df2.hist(bins=50,range=[0,1])
hist_3 = df3.hist(bins=50,range=[0,1])
hist_4 = df4.hist(bins=50,range=[0,1])
I have tried different things but it is always showing them overlapped on the same figure.
Is this what you are expecting? Use subplots to divide the figure into 4 axes (2 at the top and 2 at the bottom) and then plot the histogram for each df in its own subplot.
Code:
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex=True)
axs = [ax1, ax2, ax3, ax4]
dfs = [df1, df2, df3, df4]
for n in range(len(axs)):
    axs[n].hist(dfs[n], bins=50, range=[0,1])
plt.show()