Text in matplotlib subplots in wrong positions - matplotlib

I'm trying to get some text into a matplotlib subplot, but I get it into weird locations.
The code I use is like the following:
limSX_list = [0, 0, 0]
limDX_list = [10, 10, 10]
fig, axs = plt.subplots(3, 3)
for xxx in range(3):
    for yyy in range(3):
        if xxx==yyy:
            #axs[xxx,yyy].hist( Deltas_list[xxx], range=[limSX_list[xxx], limDX_list[xxx]], bins=100, color='red' )
            axs[xxx,yyy].hist( Deltas_list[xxx], bins=100, color='red' )
        else:
            #axs[xxx,yyy].hist2d( Deltas_list[xxx], Deltas_list[yyy], bins=(100, 100), cmap=plt.cm.viridis, range=[[limSX_list[xxx],limDX_list[xxx]],[limSX_list[yyy],limDX_list[yyy]]], norm=LogNorm() )
            axs[xxx,yyy].hist2d( Deltas_list[xxx], Deltas_list[yyy], bins=(100, 100), cmap=plt.cm.viridis, norm=LogNorm() )
for row in range(3):
    for col in range(3):
        axs[row, col].text(0.5, 0.5, str((row, col)),color='blue', fontsize=18, ha='center')
plt.show()
And I get this weird output:
And if I set a x/y range (i.e. I uncomment the commented lines, and comment the ones below), I get the text in a different, but still wrong, position.
If I try with an even more minimal code, like the following, instead, everything goes fine:
fig, ax = plt.subplots(3, 3)
for row in range(3):
    for col in range(3):
        ax[row, col].text(0.5, 0.5, str((row, col)), color='blue', fontsize=18, ha='center')
plt.show()
Any guess why? Thanks! :)
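A likely explanation, offered as a sketch rather than a confirmed diagnosis: Axes.text() interprets its x and y in data coordinates by default, so (0.5, 0.5) is the centre of a subplot only when both axes happen to run from 0 to 1, as they do in the empty-subplot example. Passing transform=ax.transAxes makes the position a fraction of the axes box instead. In the sketch below, deltas_list is random stand-in data (an assumption, since Deltas_list is not shown in the question) and the labels dict exists only to keep references to the created text objects.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for Deltas_list, which the question does not show.
deltas_list = [rng.normal(5, 2, 1000) for _ in range(3)]

fig, axs = plt.subplots(3, 3)
labels = {}
for row in range(3):
    for col in range(3):
        ax = axs[row, col]
        if row == col:
            ax.hist(deltas_list[row], bins=100, color="red")
        else:
            ax.hist2d(deltas_list[row], deltas_list[col], bins=(100, 100))
        # In transAxes coordinates (0, 0) is the lower-left corner of the
        # axes and (1, 1) the upper-right, whatever the data limits are.
        labels[row, col] = ax.text(
            0.5, 0.5, str((row, col)), color="blue", fontsize=18,
            ha="center", va="center", transform=ax.transAxes,
        )
```

With the transform set, changing the x/y range of the histograms should no longer move the labels.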

Related

change color of bar for data selection in seaborn histogram (or plt)

Let's say I have a dataframe like:
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
and I am doing the following plotting routine:
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
plt.tight_layout()
plt.show()
Which yields:
Now I want to be able to perform a data selection on the dataframe a. Let's say something like:
b = a[(a['X2'] <4)]
and highlight the selection from b in the posted histograms.
For example, if the first row of b is [32:0] for X3 and [0:5] for X2, the desired output would be:
Is it possible to do this with the above for loop and with sns? Many thanks!
EDIT: I am also happy with a matplotlib solution, if easier.
EDIT2:
If it helps, it would be similar to do the following:
b = a[(a['X3'] >38)]
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    sns.distplot(b[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
plt.tight_layout()
plt.show()
which yields the following:
However, I would like to be able to just colour those bars in the first plot in a different colour!
I also thought about setting the ylim to just the height of the blue plot, so that the orange one won't distort the shape of the blue distribution. That still wouldn't be feasible, though: in reality I have about 10 histograms to show, and setting ylim would be pretty much the same as sharey=True, which I'm trying to avoid so that I can show the true shape of each distribution.
I think I found the solution for this using the inspiration from the previous answer and this video:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(2021)
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
b = a[(a['X3'] < 30)]
hist_idx=[]
for i, c in enumerate(a.columns):
    # bin edges of the full data, using the same number of bins as the plot below
    bin_ = np.histogram(a[c], bins=20)[1]
    # indices of the bin edges that fall inside the range of the selection
    hist = np.where(np.logical_and(bin_<=max(b[c]), bin_>min(b[c])))
    hist_idx.append(hist)
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    axes[1, i].hist(a[c], bins = 20)
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
for it, index in enumerate(hist_idx):
    length = len(index[0])
    for r in range(length):
        try:
            # recolour the bar that ends at the selected bin edge
            axes[1, it].patches[index[0][r]-1].set_fc("red")
        except IndexError:
            pass
plt.tight_layout()
plt.show()
which yields the following for b = a[(a['X3'] < 30)] :
or for b = a[(a['X3'] > 36)]:
Thought I'd leave it here - although niche, might help someone in the future!
I created the following code with the understanding that the intent of your question is to add a different color to the histogram based on the data extracted under certain conditions.
Use np.histogram() to get an array of frequencies and an array of bins. Get the index of the value closest to the value of the first row of data extracted for a certain condition. Change the color of the histogram with that retrieved index. The same method can be used to deal with the other graph.
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(2021)
X2 = np.random.normal(10, 3, 200)
X3 = np.random.normal(34, 2, 200)
a = pd.DataFrame({"X3": X3, "X2":X2})
f, axes = plt.subplots(2, 2, gridspec_kw={"height_ratios":(.10, .30)}, figsize = (13, 4))
for i, c in enumerate(a.columns):
    sns.boxplot(a[c], ax=axes[0,i])
    sns.distplot(a[c], ax = axes[1,i])
    axes[1, i].set(yticklabels=[])
    axes[1, i].set(xlabel='')
    axes[1, i].set(ylabel='')
b = a[(a['X2'] <4)]
hist3, bins3 = np.histogram(X3)
idx = np.abs(np.asarray(hist3) - b['X3'].head(1).values[0]).argmin()
for k in range(idx):
    axes[1,0].get_children()[k].set_color("red")
plt.tight_layout()
plt.show()
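For comparison, here is a minimal matplotlib-only sketch of the same idea. It is a variation on the approaches above, not a copy of either: it relies on Axes.hist() returning the bin edges together with the bar patches, and it colours every bar whose bin interval overlaps the range of the selected values. The variable names and the single-histogram setup are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2021)
x3 = rng.normal(34, 2, 200)      # stand-in for the X3 column
selected = x3[x3 > 36]           # same kind of selection as b = a[a['X3'] > 36]

fig, ax = plt.subplots()
# hist() returns the counts, the bin edges and one patch per bar
counts, edges, patches = ax.hist(x3, bins=20)

if selected.size:
    lo, hi = selected.min(), selected.max()
    for left, right, patch in zip(edges[:-1], edges[1:], patches):
        # highlight a bar when its bin interval overlaps [lo, hi]
        if right > lo and left <= hi:
            patch.set_facecolor("red")
```

Because the bars are recoloured in place, the y-axis and the shape of the original distribution stay unchanged.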

y and x axis subplots matplotlib

A quite basic question about ticks' labels for x and y-axis. According to this code
fig, axes = plt.subplots(6,12, figsize=(50, 24), constrained_layout=True, sharex=True , sharey=True)
fig.subplots_adjust(hspace = .5, wspace=.5)
custom_xlim = (-1, 1)
custom_ylim = (-0.2,0.2)
for i in range(72):
    x_data = ctheta[i]
    y_data = phi[i]
    y_err = err_phi[i]
    ax = fig.add_subplot(6, 12, i+1)
    ax.plot(x_data_new, bspl(x_data_new))
    ax.axis('off')
    ax.errorbar(x_data,y_data, yerr=y_err, fmt="o")
    ax.set_xlim(custom_xlim)
    ax.set_ylim(custom_ylim)
I get the following output:
The plots in the first column still show y labels and the ones along the last row still show x labels, even though I turn the axes off.
Any idea?
As @BigBen wrote in their comment, your issue is caused by adding axes to your figure twice: once via fig, axes = plt.subplots() and then again inside your loop via fig.add_subplot(). As a result, the first set of axes is still visible even after you apply .axis('off') to the second set.
Instead of the latter, you could change your loop to:
for i in range(6):
    for j in range(12):
        ax = axes[i,j] # these are the axes created via plt.subplots(6,12,...)
        ax.axis('off')
        # … your other code here
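The same idea as a small self-contained sketch. The 2x3 grid and the sine curves are assumptions, standing in for the 6x12 grid and the ctheta/phi arrays that the question does not show. The loop iterates over the axes that plt.subplots() already created instead of adding new ones, and n_axes is only there to show that no extra axes end up in the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 3, sharex=True, sharey=True)
x = np.linspace(-1, 1, 20)  # placeholder data, not from the question

# axes.flat walks the 2D array of axes row by row
for i, ax in enumerate(axes.flat):
    ax.plot(x, 0.1 * np.sin((i + 1) * x))
    ax.set_xlim(-1, 1)
    ax.set_ylim(-0.2, 0.2)
    ax.axis("off")  # hides ticks, labels and spines of this subplot

n_axes = len(fig.axes)  # stays at 2 * 3 because nothing was added twice
```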

mouse-over only on actual data points

Here's a really simple line chart.
%matplotlib notebook
import matplotlib.pyplot as plt
lines = plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.setp(lines,marker='D')
plt.ylabel('foo')
plt.xlabel('bar')
plt.show()
If I move my mouse over the chart, I get the x and y values for wherever the pointer is. Is there any way to get values only when I'm actually over a data point?
I understood you wanted to modify the behavior of the coordinates displayed in the status bar at the bottom right of the plot, is that right?
If so, you can "hijack" the Axes.format_coord() function to make it display whatever you want. You can see an example of this on matplotlib's example gallery.
In your case, something like this seems to do the trick:
import numpy as np
import matplotlib.pyplot as plt
my_x = np.array([1, 2, 3, 4])
my_y = np.array([1, 4, 9, 16])
eps = 0.1  # tolerance in data units
def format_coord(x, y):
    # a point matches only if the cursor is close to both of its coordinates
    close = np.isclose(my_x, x, atol=eps) & np.isclose(my_y, y, atol=eps)
    if np.any(close):
        i = np.argmax(close)  # index of the first matching point
        return 'x=%s y=%s' % (ax.format_xdata(my_x[i]), ax.format_ydata(my_y[i]))
    else:
        return ''
fig, ax = plt.subplots()
ax.plot(my_x, my_y, 'D-')
ax.set_ylabel('foo')
ax.set_xlabel('bar')
ax.format_coord = format_coord
plt.show()
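One possible refinement, sketched under assumptions rather than taken from the answer above: a tolerance in data units behaves differently on the x and y axes when their scales differ, so the distance can be measured in screen pixels instead by converting both the cursor and the points with ax.transData. The max_pixels radius is a hypothetical value chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

my_x = np.array([1, 2, 3, 4])
my_y = np.array([1, 4, 9, 16])
max_pixels = 5  # hypothetical pick radius in screen pixels

fig, ax = plt.subplots()
ax.plot(my_x, my_y, "D-")

def format_coord(x, y):
    # convert the data points and the cursor to display (pixel) coordinates
    points_px = ax.transData.transform(np.column_stack([my_x, my_y]))
    cursor_px = ax.transData.transform((x, y))
    # Euclidean distance from the cursor to every point, in pixels
    dist = np.hypot(*(points_px - cursor_px).T)
    nearest = dist.argmin()
    if dist[nearest] <= max_pixels:
        return f"x={my_x[nearest]} y={my_y[nearest]}"
    return ""

ax.format_coord = format_coord
```

With this version the status bar stays empty unless the pointer is within a few pixels of a marker, regardless of how the axes are zoomed.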

Bar chart remove space through aspect or axis limits constrains in or between subplots

I am struggling to remove the empty space in or between subplots. I already read a lot of answers here, but I am not getting anywhere.
I want to make horizontal bar plots with several subplots:
My example is:
import matplotlib.pyplot as plt
x1 = [5]
y1 = [-10]
x2 = [30, 35]
y2 = [-15, -20]
x3 = [15, 5, 20]
y3 = [-10, -15, -30]
xlimits = [-30, 35]
ylimits = [-0.5, 2.5]
fig = plt.figure(figsize=(12,6))
ax1 = fig.add_subplot(3,1,1)
ax1.barh(0, x1, height = 1)
ax1.barh(0, y1, height = 1)
ax2 = fig.add_subplot(3,1,2)
ax2.barh([0, 1], x2, height = 1)
ax2.barh([0, 1], y2, height = 1)
ax3 = fig.add_subplot(3,1,3)
ax3.barh([0, 1, 2], x3, height = 1)
ax3.barh([0, 1, 2], y3, height = 1)
for ax in fig.axes:
    ax.set_ylim(ylimits)
    ax.set_xlim(xlimits)
plt.show()
will result in:
I used ax.set_ylim(ylimits) to have an equal height of all bars and ax.set_xlim(xlimits) to have "0" in one vertical line.
Now I would like to adjust the bbox to remove the empty space inside the top and middle subplots, but I have no idea how to achieve this. I also tried ax.set_aspect(), but in that case I get empty space between the subplots instead.
I would like to do it with subplots to easily add description, swap stuff and so on.
Thanks in advance for any suggestions.
If I understood you correctly, you could try adding this to your code:
fig.subplots_adjust(wspace=0, hspace=0)
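If the gap inside the top and middle subplots is what matters, a different technique may help. This is an assumption about the goal, not part of the suggestion above: give each subplot a height proportional to its number of bars via the height_ratios grid option, and limit each y-axis to the bars it actually contains. The pos/neg lists below just regroup the x1..x3 and y1..y3 values from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

pos = [[5], [30, 35], [15, 5, 20]]             # x1, x2, x3 from the question
neg = [[-10], [-15, -20], [-10, -15, -30]]     # y1, y2, y3 from the question
n_bars = [len(p) for p in pos]

# each row gets a height proportional to its bar count; hspace=0 removes gaps
fig, axes = plt.subplots(
    3, 1, figsize=(12, 6), sharex=True,
    gridspec_kw={"height_ratios": n_bars, "hspace": 0},
)
for ax, p, n in zip(axes, pos, neg):
    rows = list(range(len(p)))
    ax.barh(rows, p, height=1)
    ax.barh(rows, n, height=1)
    ax.set_ylim(-0.5, len(p) - 0.5)  # y-range covers exactly the bars drawn
    ax.set_xlim(-30, 35)             # keeps "0" on one vertical line
```

Since every subplot spans exactly as many y-units as it has bars, and its height scales with that number, the bars keep the same thickness across subplots.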

Matplotlib / Seaborn: Make a vertical distplot and a barplot share the Y axis

I'm trying to create a plot with two subplots (one row, two columns), in which a vertical distplot and a vertical barplot (both from seaborn) share the Y axis. The result should look somewhat like an asymmetric violin plot.
The data for the bar plot is of this form:
In[8]: barplot_data[0:5]
Out[8]:
[{'time': 0, 'val': 171.19374169863295},
{'time': 50, 'val': 2313.8459788903383},
{'time': 100, 'val': 1518.687964071397},
{'time': 150, 'val': 1355.8373488876694},
{'time': 200, 'val': 1558.7682098705088}]
I.e., for every time step (in steps of 50), I know the height of the bar. The data for the dist plot is of the form:
In[9]: distplot_data[0:5]
Out[9]: [605, 477, 51, 337, 332]
I.e., a series of time points of which I'd like the distribution to be drawn.
Here's how I create the bar plot in the right subplot:
barplot_df = pd.DataFrame(barplot_data)
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.barplot(y='time', x='val',
            data=barplot_df,
            orient='h',
            ax = right_ax)
The result is pretty much what I want on the right side:
Similarly, I can put the dist plot on the left side:
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.distplot(distplot_data, ax=left_ax, vertical=True)
This also works. I think it's kind of strange that the direction of the Y axis is reversed, but whatever:
However, now I'm just trying to plot them both into the same figure and it wreaks havoc on the dist plot:
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.barplot(y='time', x='val',
            data=barplot_df,
            orient='h',
            ax = right_ax)
sns.distplot(distplot_data, ax=left_ax, vertical=True)
I can only imagine that this is because of the axis of the distplot somehow being distorted or something? Does someone know what's going on here?