Matplotlib / Seaborn: Make a vertical distplot and a barplot share the Y axis

I'm trying to create a plot with two subplots (one row, two columns), in which a vertical distplot and a vertical barplot (both from seaborn) share the Y axis. The result should look somewhat like an asymmetric violin plot.
The data for the bar plot is of this form:
In[8]: barplot_data[0:5]
Out[8]:
[{'time': 0, 'val': 171.19374169863295},
{'time': 50, 'val': 2313.8459788903383},
{'time': 100, 'val': 1518.687964071397},
{'time': 150, 'val': 1355.8373488876694},
{'time': 200, 'val': 1558.7682098705088}]
I.e., for every time step (in steps of 50), I know the height of the bar. The data for the dist plot is of the form:
In[9]: distplot_data[0:5]
Out[9]: [605, 477, 51, 337, 332]
I.e., a series of time points of which I'd like the distribution to be drawn.
Here's how I create the bar plot in the right subplot:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
barplot_df = pd.DataFrame(barplot_data)
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.barplot(y='time', x='val',
            data=barplot_df,
            orient='h',
            ax=right_ax)
The result is pretty much what I want on the right side:
Similarly, I can put the dist plot on the left side:
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.distplot(distplot_data, ax=left_ax, vertical=True)
This also works. I think it's kind of strange that the direction of the Y axis is reversed, but whatever:
However, now I'm just trying to plot them both into the same figure and it wreaks havoc on the dist plot:
fig, axes = plt.subplots(1, 2, sharex=False, sharey=True, squeeze=False)
left_ax = axes[0][0]
right_ax = axes[0][1]
sns.barplot(y='time', x='val',
            data=barplot_df,
            orient='h',
            ax=right_ax)
sns.distplot(distplot_data, ax=left_ax, vertical=True)
I can only imagine that this is because of the axis of the distplot somehow being distorted or something? Does someone know what's going on here?
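A likely cause, offered as an assumption rather than a confirmed diagnosis: sns.barplot treats time as a categorical variable and draws the bars at positions 0, 1, 2, ... instead of at 0, 50, 100, ..., so with sharey=True the distribution (which lives on the real time scale) ends up on an axis that only spans the number of bars. Below is a minimal sketch that sidesteps this by using plain matplotlib calls with numeric positions. The data are random stand-ins, not the values from the question, and the histogram replaces the KDE curve that distplot would draw.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumed): one bar height every 50 time units, plus raw time points.
times = np.arange(0, 1000, 50)
vals = rng.uniform(100, 2500, size=times.size)
samples = rng.integers(0, 1000, size=500)

fig, (left_ax, right_ax) = plt.subplots(1, 2, sharey=True)

# Bars are placed at their real time values, not at category indices.
right_ax.barh(times, vals, height=40)

# Horizontal histogram of the time points on the same numeric y axis.
left_ax.hist(samples, bins=20, density=True, orientation="horizontal")
left_ax.invert_xaxis()  # grow leftwards so the two halves sit back to back
```

Because both Axes now use the same numeric scale, sharey=True has a consistent meaning for the two plots.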

Related

How to access and remove all unwanted objects in a matplotlib figure manually?

I am trying to understand the underlying concepts of matplotlib, especially Axes and Figure. Therefore I am trying to plot two scatters and then remove any superfluous space (the red one below) by accessing different APIs & objects in the hierarchy.
Yet I fail to understand where the remaining red space is coming from. This is the code:
# Random data
df = pd.DataFrame(np.random.randint(0,100,size=(100, 2)), columns=list('AB'))
# Create a pair of Axes and preconfigure the figure with red facecolor.
# Then plot a scatter
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10,5), facecolor='r')
ax1 = df.plot(kind='scatter', x='A', y='B', ax=axes[0])
ax2 = df.plot(kind='scatter', x='B', y='A', ax=axes[1])
# Remove everything except the scatter
for a in [ax1, ax2]:
    # Remove x and y labels
    a.set_xlabel('')
    a.set_ylabel('')
    # Remove spines
    for loc in ['left', 'right', 'bottom', 'top']:
        a.spines[loc].set_visible(False)
    # Remove ticks
    a.set_xticks([])
    a.set_yticks([])
    # No margin beyond outer values
    a.set_xmargin(0)
    a.set_ymargin(0)
# On figure-level we can make it more tight
fig.tight_layout()
It produces the following figure:
I saw that there is something like..
a.set_axis_off()
.. but this doesn't seem to be the right solution. Somewhere there seems to be some kind of padding that remains. It doesn't look like it's from some X/Y axis as it's the same for all four edges in both subplots.
Any help appreciated.
Solution
Two things are needed:
First we need to initialize the Figure with frameon=False:
fig, axes = plt.subplots(
    # ...
    frameon=False)
The space between the subplots can be removed using the subplot layout:
plt.subplots_adjust(wspace=.0, hspace=.0)
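Putting the two steps together, a minimal sketch might look like the following. It uses random stand-in data and plain matplotlib scatter calls instead of the pandas plotting from the question. Setting left, right, bottom and top is an addition that goes beyond the two steps above, on the assumption that the outer margins are unwanted too.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.integers(0, 100, size=(2, 100))  # stand-in data (assumed)

# frameon=False is forwarded to the Figure, so its background patch is not drawn.
fig, axes = plt.subplots(1, 2, figsize=(10, 5), facecolor="r", frameon=False)
axes[0].scatter(a, b)
axes[1].scatter(b, a)
for ax in axes:
    ax.set_xticks([])
    ax.set_yticks([])

# Let the two Axes fill the whole figure with no gap between them.
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, wspace=0, hspace=0)
```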
For the finest level of layout control, you can position your axes manually instead of relying on matplotlib to do it for you. There are a couple of ways of doing this.
One option is Axes.set_position
# Random data
df = pd.DataFrame(np.random.randint(0,100,size=(100, 2)), columns=list('AB'))
# Create a pair of Axes and preconfigure the figure with red facecolor.
# Then plot a scatter
fig, axes = plt.subplots(1, 2, figsize=(10, 5), facecolor='r')
# The position is [left, bottom, width, height] in figure fractions
df.plot(kind='scatter', x='A', y='B', ax=axes[0]).set_position([0, 0, 0.5, 1])
df.plot(kind='scatter', x='B', y='A', ax=axes[1]).set_position([0.5, 0, 0.5, 1])
You could also use the old-fashioned Figure.add_axes method:
# Random data
df = pd.DataFrame(np.random.randint(0,100,size=(100, 2)), columns=list('AB'))
# Create a pair of Axes and preconfigure the figure with red facecolor.
# Then plot a scatter
fig = plt.figure(figsize=(10, 5), facecolor='r')
# The rect is [left, bottom, width, height] in figure fractions
df.plot(kind='scatter', x='A', y='B', ax=fig.add_axes([0, 0, 0.5, 1]))
df.plot(kind='scatter', x='B', y='A', ax=fig.add_axes([0.5, 0, 0.5, 1]))
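For reference, the rectangle passed to Figure.add_axes or Axes.set_position is [left, bottom, width, height], each given as a fraction of the figure size. A small sketch that places two empty Axes side by side and reads the positions back (the names left_half and right_half are just for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 5), facecolor="r")

# rect = [left, bottom, width, height], all as fractions of the figure size
left_half = fig.add_axes([0.0, 0.0, 0.5, 1.0])
right_half = fig.add_axes([0.5, 0.0, 0.5, 1.0])

# .bounds reports the same (left, bottom, width, height) values back
print(left_half.get_position().bounds)
print(right_half.get_position().bounds)
```

So a left value of 0.5 with a width of 0.5 puts the second Axes in the right half of the figure.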

Align bar and line plot on x axis without the use of rank and pointplot

Please note: I've looked at other, similar questions; my problem is different and not a duplicate!
I would like to have two plots, with the same x axis in matplotlib. I thought this should be achieved via constrained_layout, but apparently this is not the case. Here is an example code.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.gridspec as grd
x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
"x_bar": [1, 7, 10, 20, 30],
"y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
fig = plt.figure(constrained_layout=True)
gs = grd.GridSpec(2, 1, height_ratios=[3, 2], wspace=0.1)
ax1 = plt.subplot(gs[0])
sns.lineplot(data=df_line, x=df_line["x"], y=df_line["y"], ax=ax1)
ax1.set_xlabel("time", fontsize="22")
ax1.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
legend = ax1.get_legend()  # None when the plot has no legend
if legend is not None:
    plt.setp(legend.get_texts(), fontsize="22")
ax2 = plt.subplot(gs[1])
sns.barplot(data=df_bar, x="x_bar", y="y_bar", ax=ax2)
ax2.set_xlabel("time", fontsize="22")
ax2.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
this leads to the following figure.
However, I would like the corresponding x values of both plots to be aligned. How can I achieve this? Note that I've tried the approach from a related question, but it doesn't fully apply to my situation. First, with the high number of x points (which I need in reality), a point plot makes the picture too big and slow to load. On top of that, I can't use the rank method, as my categories for the barplot are not evenly distributed: they are specific points on the x axis that should be aligned with the corresponding points on the lineplot.
x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
"x_bar": [1, 7, 10, 20, 30],
"y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
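fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)  # sharex keeps both x axes aligned
ax1.plot(df_line['x'], df_line['y'])
for i in range(len(df_bar['x_bar'])):
    # ymin/ymax are fractions of the axes height (0 to 1), not data values
    ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i])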
Output:
---edit---
I incorporated @mozway's advice on the line width:
lw = 300 / ax1.get_xlim()[1]
# inside the loop, in place of the axvline call above:
ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i], solid_capstyle='butt', lw=lw)
Output:
or:
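As a different route from the axvline approach (and not part of the answer above): seaborn's barplot draws bars at category positions 0, 1, 2, ..., whereas matplotlib's Axes.bar accepts numeric x positions, so the bars can sit on the same scale as the line. A minimal sketch, assuming plain matplotlib is acceptable here; the bar width of 0.5 is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 30, 0.001)
x_bar = np.array([1, 7, 10, 20, 30])
y_bar = np.array([0.0, 0.3, 0.4, 0.1, 0.2])

# sharex=True makes both Axes use one x range; height_ratios mirrors the 3:2 split.
fig, (ax1, ax2) = plt.subplots(
    2, 1, sharex=True, constrained_layout=True,
    gridspec_kw={"height_ratios": [3, 2]},
)
ax1.plot(x, np.sin(x))
ax1.set_ylabel("y values")

# Bars are centred on their numeric x values; width is in x data units.
ax2.bar(x_bar, y_bar, width=0.5)
ax2.set_xlabel("time")
ax2.set_ylabel("y values")
```

Unlike axvline, whose ymin and ymax are fractions of the axes height, the bar heights here are in data units, so the y axis of the lower plot shows the actual y_bar values.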

Pyplot: How to add mirrored second y-axis instead of negative values?

In pyplot, how could I add a second y-axis where normally the negative values would be? The y-values should be increasing away from the shared x-axis.
Above the x-axis I want to plot time consumption and below the x-axis space consumption. The x-axis is basically the current iteration.
This can be achieved using sharex=True and invert_yaxis():
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(5, 10))
xdata = [2, 1, 5]
data1 = [1, 3, 5]
data2 = [1, 4, 7]
l1 = ax1.plot(xdata, data1, "ro")
l2 = ax2.plot(xdata, data2, "bx")
l3 = ax2.plot(xdata, data1, "g")
#make the appearance more coherent
#remove space between subplots
plt.subplots_adjust(hspace=0)
#invert the y-axis
ax2.invert_yaxis()
#set both plots to the same y-limit
lim = max(max(ax1.get_ylim()), max(ax2.get_ylim()))
ax1.set_ylim(0, lim)
ax2.set_ylim(lim, 0)
#move the x-axis label to the center
ax2.xaxis.tick_top()
#remove the double zero label
yticks2 = ax2.yaxis.get_major_ticks()
yticks2[0].set_visible(False)
#create a common legend
ax1.legend(l1+l2+l3, ["data1", "data2", "again data1"])
plt.show()
Sample output:

matplotlib bar chart with labels: space out the bars

When I run this code in Jupyter notebook the labels overlap and are unreadable.
import matplotlib.pyplot as plt
import numpy as np
y = [72, 21, 114, 52, 114, 12, 101, 16, 68, 118]
x = np.arange(len(y))
columns = ['MAHC_A', 'MAHC_B', 'MAHC_C', 'MAHC_D', 'MAHC_E', 'MAHC_F', 'MAHC_G', 'MAHC_H', 'MAHC_I', 'MAHC_J']
bar_width = 0.8  # assumed value (matplotlib's default); not defined in the original snippet
fig, ax = plt.subplots()
ax.bar(x, y, width=bar_width)
ax.set_xticks(x)
ax.set_xticklabels(columns)
plt.show()
Is there a way to space these out?
There are several options:
Make the figure larger in the horizontal direction.
fig, ax = plt.subplots(figsize=(10,4))
Make the font size smaller.
ax.set_xticklabels(columns, fontsize=8)
Rotate the labels so that they no longer overlap.
ax.set_xticklabels(columns, rotation=45)
Or, of course, any combination of those.
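A sketch that combines all three options in one place. The ha="right" alignment and the tight_layout call are additions beyond the options listed above, included on the assumption that rotated labels should end under their bars and not be clipped:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

y = [72, 21, 114, 52, 114, 12, 101, 16, 68, 118]
x = np.arange(len(y))
columns = ["MAHC_" + letter for letter in "ABCDEFGHIJ"]

fig, ax = plt.subplots(figsize=(10, 4))  # wider figure
ax.bar(x, y)
ax.set_xticks(x)
# Smaller font, rotated, with the right end of each label anchored at its tick.
ax.set_xticklabels(columns, rotation=45, ha="right", fontsize=8)
fig.tight_layout()  # leave room for the rotated labels
```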

Get desired wspace and subplots appropriately sized?

I'm trying to make a plot with one panel up top (colspan = 2) and two plots below, with a controlled amount of space between them. I'd like the bounds of the plots to be in alignment. Here's what I'm starting with:
import cartopy
from matplotlib import pyplot
from matplotlib.gridspec import GridSpec
gs = GridSpec(2, 2, height_ratios=[2, 1], hspace=0, wspace=0)
ax0 = pyplot.subplot(gs[0, :], projection=cartopy.crs.LambertConformal())
ax0.add_feature(cartopy.feature.COASTLINE)
ax0.set_extent([-120, -75, 20, 52], cartopy.crs.Geodetic())
ax1 = pyplot.subplot(gs[1, 0], projection=cartopy.crs.LambertConformal())
ax1.add_feature(cartopy.feature.COASTLINE)
ax1.set_extent([-90, -75, 20, 30], cartopy.crs.Geodetic())
ax2 = pyplot.subplot(gs[1, 1], projection=cartopy.crs.LambertConformal())
ax2.add_feature(cartopy.feature.COASTLINE)
ax2.set_extent([-90, -75, 20, 30], cartopy.crs.Geodetic())
pyplot.show()
The first problem is that the wspace=0 parameter has no effect.
The second problem (at least this is my guess on how to proceed) is calculating a height ratio that makes the width of the upper subplot equal to the combined width of the lower subplots (plus any wspace).
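One way to reason about the height ratio, sketched here without cartopy: as far as I understand, map axes behave like fixed-aspect axes, so each one shrinks inside its grid cell until its height/width ratio matches the map, and that shrinking is what leaves a gap even with wspace=0. If each cell already has the aspect its panel needs, nothing has to shrink. The aspect values below are made up for illustration; for real maps they would have to come from the projected extents.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

# Height/width ratio of each panel's content (assumed values, not from cartopy).
aspect_top = 0.6
aspect_bottom = 0.8

fig_width = 8.0
# The top panel spans the full width, each bottom panel spans half of it.
top_height = aspect_top * fig_width
bottom_height = aspect_bottom * fig_width / 2

fig = plt.figure(figsize=(fig_width, top_height + bottom_height))
gs = GridSpec(2, 2, height_ratios=[top_height, bottom_height],
              left=0, right=1, bottom=0, top=1, hspace=0, wspace=0)
ax0 = fig.add_subplot(gs[0, :])
ax1 = fig.add_subplot(gs[1, 0])
ax2 = fig.add_subplot(gs[1, 1])

# Mimic fixed-aspect map panels with equal-aspect data boxes.
for ax, aspect in [(ax0, aspect_top), (ax1, aspect_bottom), (ax2, aspect_bottom)]:
    ax.set_xlim(0, 1)
    ax.set_ylim(0, aspect)
    ax.set_aspect("equal", adjustable="box")

fig.canvas.draw()  # the aspect adjustment is applied when the figure is drawn
```

With these numbers the height ratio is aspect_top : aspect_bottom / 2 and the figure's height/width ratio is their sum. Any outer margins or non-zero wspace would need to be included in that arithmetic.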