Align bar and line plot on x axis without the use of rank and pointplot - matplotlib

Please note: I've looked at other questions, such as the related question mentioned below, and my problem is different and not a duplicate!
I would like to have two plots with the same x axis in matplotlib. I thought this could be achieved via constrained_layout, but apparently this is not the case. Here is some example code.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.gridspec as grd
x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
"x_bar": [1, 7, 10, 20, 30],
"y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
fig = plt.figure(constrained_layout=True)
gs = grd.GridSpec(2, 1, height_ratios=[3, 2], wspace=0.1)
ax1 = plt.subplot(gs[0])
sns.lineplot(data=df_line, x=df_line["x"], y=df_line["y"], ax=ax1)
ax1.set_xlabel("time", fontsize="22")
ax1.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
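# No legend is created here (no hue/label), so ax1.get_legend() returns None; guard the call
if ax1.get_legend() is not None:
    plt.setp(ax1.get_legend().get_texts(), fontsize="22")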
ax2 = plt.subplot(gs[1])
sns.barplot(data=df_bar, x="x_bar", y="y_bar", ax=ax2)
ax2.set_xlabel("time", fontsize="22")
ax2.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
This leads to the following figure.
However, I would like the corresponding x values of both plots to be aligned. How can I achieve this? Note that I've tried the approach from the following related question, but it doesn't fully apply to my situation. First, with the high number of x points (which I need in reality), a point plot makes the picture too big and too slow to load. On top of that, I can't use the rank method because the categories of my bar plot are not evenly spaced. They are specific points on the x axis which should be aligned with the corresponding points on the line plot.

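import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
    "x_bar": [1, 7, 10, 20, 30],
    "y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(df_line['x'], df_line['y'])
# Draw each "bar" as a vertical line at its numeric x position.
# Note: ymin/ymax of axvline are fractions of the axes height (0 to 1), not data values.
for i in range(len(df_bar['x_bar'])):
    ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i])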
Output:
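---edit---
I incorporated @mozway's advice on the line width:
# Scale the line width with the x range so the lines look like bars
lw = (300/ax1.get_xlim()[1])
for i in range(len(df_bar['x_bar'])):
    ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i], solid_capstyle='butt', lw=lw)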
Output:
or:
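Another option, offered as a minimal sketch rather than a definitive fix: create both axes with sharex=True so they use the same x limits, and draw the bars with ax.bar, which places each bar at its numeric x value instead of treating x as a category (as sns.barplot does). The width=0.5 and the height ratios are arbitrary choices for illustration, and plain lists stand in for the DataFrames from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same data as in the question, kept as plain arrays/lists
x = np.arange(0, 30, 0.001)
y = np.sin(x)
x_bar = [1, 7, 10, 20, 30]
y_bar = [0.0, 0.3, 0.4, 0.1, 0.2]

# sharex=True ties the x limits of both axes together
fig, (ax1, ax2) = plt.subplots(
    2, 1, sharex=True, constrained_layout=True,
    gridspec_kw={"height_ratios": [3, 2]},
)
ax1.plot(x, y)
ax1.set_ylabel("y values")

# ax.bar positions every bar in data coordinates;
# width=0.5 is an assumption chosen only for readability
ax2.bar(x_bar, y_bar, width=0.5)
ax2.set_xlabel("time")
ax2.set_ylabel("y values")
```

Call plt.show() or fig.savefig(...) to render the figure. Because the x axis is shared, changing the limits on one axes changes them on the other as well.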

Related

Matplotlib--scatter plot with half filled markers

Question: Using a scatter plot in matplotlib, is there a simple way to get a half-filled marker?
I know half-filled markers can easily be done using a line plot, but I would like to use 'scatter' because I want to use marker size and color (i.e., alternate marker face color) to represent other data. (I believe this will be easier with a scatter plot since I want to automate making a large number of plots from a large data set.)
I can't seem to make half-filled markers properly using a scatter plot. That is to say, instead of a half-filled marker, the plot shows half of a marker. I've been using matplotlib.markers.MarkerStyle, but that seems to only get me halfway there. I'm able to get following output using the code below.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.markers import MarkerStyle
plt.scatter(1, 1, marker=MarkerStyle('o', fillstyle='full'), edgecolors='k', s=500)
plt.scatter(2, 2, marker=MarkerStyle('o', fillstyle='left'), edgecolors='k', s=500)
plt.scatter(3, 3, marker=MarkerStyle('o', fillstyle='right'), edgecolors='k', s=500)
plt.scatter(4, 4, marker=MarkerStyle('o', fillstyle='top'), edgecolors='k', s=500)
plt.scatter(5, 5, marker=MarkerStyle('o', fillstyle='bottom'), edgecolors='k', s=500)
plt.show()
As mentioned in the comments, I don't see why you have to use plt.scatter but if you want to, you can fake a combined marker:
from matplotlib.markers import MarkerStyle
from matplotlib import pyplot as plt
#data generation
import pandas as pd
import numpy as np
np.random.seed(123)
n = 10
df = pd.DataFrame({"X": np.random.randint(1, 20, n),
"Y": np.random.randint(10, 30, n),
"S": np.random.randint(50, 500, n),
"C1": np.random.choice(["red", "blue", "green"], n),
"C2": np.random.choice(["yellow", "grey"], n)})
fig, ax = plt.subplots()
ax.scatter(df.X, df.Y, s=df.S, c=df.C1, edgecolor="black", marker=MarkerStyle("o", fillstyle="right"))
ax.scatter(df.X, df.Y, s=df.S, c=df.C2, edgecolor="black", marker=MarkerStyle("o", fillstyle="left"))
plt.show()
Sample output:
This works, of course, also for continuous data:
from matplotlib import pyplot as plt
from matplotlib.markers import MarkerStyle
import pandas as pd
import numpy as np
np.random.seed(123)
n = 10
df = pd.DataFrame({"X": np.random.randint(1, 20, n),
"Y": np.random.randint(10, 30, n),
"S": np.random.randint(100, 1000, n),
"C1": np.random.randint(1, 100, n),
"C2": np.random.random(n)})
fig, ax = plt.subplots(figsize=(10,8))
im1 = ax.scatter(df.X, df.Y, s=df.S, c=df.C1, edgecolor="black", marker=MarkerStyle("o", fillstyle="right"), cmap="autumn")
im2 = ax.scatter(df.X, df.Y, s=df.S, c=df.C2, edgecolor="black", marker=MarkerStyle("o", fillstyle="left"), cmap="winter")
cbar1 = plt.colorbar(im1, ax=ax)
cbar1.set_label("right half", rotation=90)
cbar2 = plt.colorbar(im2, ax=ax)
cbar2.set_label("left half", rotation=90)
plt.show()
Sample output:
Keep in mind, though, that plt.plot with marker definitions might be faster for large datasets: the plot function will be faster for scatter plots where markers don't vary in size or color.
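For the case where marker size and color do not vary per point, a minimal sketch of the plt.plot route could look like this. It relies on the Line2D properties fillstyle and markerfacecoloralt; the data points and colors are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# One plot call draws all markers; linestyle="none" suppresses the connecting line
(line,) = ax.plot(
    [1, 2, 3, 4], [1, 2, 3, 4],
    linestyle="none", marker="o", markersize=20,
    fillstyle="left",             # which half takes markerfacecolor
    markerfacecolor="red",        # color of the "left" half
    markerfacecoloralt="yellow",  # color of the other half
    markeredgecolor="black",
)
```

All markers drawn by one plot call share the same size and colors, which is exactly the restriction that makes scatter necessary in the question.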

Coloring minimum bars in seaborn FacetGrid barplot

Is there an easy way to automatically color (or mark in any way) the minimum/maximum bars for each plot of a FacetGrid?
For example, how to mark the minimal Z value on each one of the following 16 plots?
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'A':[10, 20, 30, 40]*4, 'Y':[1,2,3,4]*4, 'W':range(16), 'Z':range(16)})
g = sns.FacetGrid(df, row="A", col="Y", sharey=False)
g.map(sns.barplot, "W", "Z")
plt.show()
The following approach loops through the diagonal axes (with this data, only the facets on the diagonal contain bars). For each ax, it finds the minimum bar height and then colors the bars of that height:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'A': [10, 20, 30, 40] * 4, 'Y': [1, 2, 3, 4] * 4, 'W': range(16), 'Z': range(16)})
g = sns.FacetGrid(df, row="A", col="Y", sharey=False)
g.map(sns.barplot, "W", "Z")
for i in range(len(g.axes)):
    ax = g.axes[i, i]
    min_height = min([p.get_height() for p in ax.patches])
    for p in ax.patches:
        if p.get_height() == min_height:
            p.set_color('red')
plt.tight_layout()
plt.show()
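If bars are not guaranteed to sit only on the diagonal, the same idea can be wrapped in a small helper and applied to every axes. This is only a sketch: highlight_min_bars is a hypothetical name, and plain matplotlib bars stand in for the FacetGrid to keep the example self-contained.

```python
import matplotlib.pyplot as plt


def highlight_min_bars(ax, color="red"):
    """Color every bar in `ax` whose height equals the smallest bar height."""
    heights = [p.get_height() for p in ax.patches]
    if not heights:  # empty axes: nothing to color
        return []
    min_height = min(heights)
    marked = [p for p in ax.patches if p.get_height() == min_height]
    for p in marked:
        p.set_color(color)
    return marked


# Two stand-in axes with ordinary bars (not the FacetGrid from above)
fig, axes = plt.subplots(1, 2)
axes[0].bar([0, 1, 2], [3, 1, 2])
axes[1].bar([0, 1, 2], [5, 6, 4])
for ax in axes.flat:
    highlight_min_bars(ax)
```

With the FacetGrid from above, the loop would presumably become `for ax in g.axes.flat: highlight_min_bars(ax)`; facets without bars are skipped because they have no patches. Swapping min for max gives the maximum bars instead.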

Matplotlib `fill_between`: Remove thin boundary

Consider the following code:
import matplotlib.pyplot as plt
import numpy as np
from pylab import *
graph_data = [[0, 1, 2, 3], [5, 8, 7, 9]]
x = range(len(graph_data[0]))
y = graph_data[1]
fig, ax = plt.subplots()
alpha = 0.5
plt.plot(x, y, '-o',markersize=3, color=[1., alpha, alpha], markeredgewidth=0.0)
ax.fill_between(x, 0, y, facecolor=[1., alpha, alpha], interpolate=False)
plt.show()
filename = 'test1.pdf'
fig.savefig(filename, bbox_inches='tight')
It works fine. However, when I zoom in on the generated PDF, I can see two thin gray/black boundaries that separate the line:
I can see this when viewing the PDF in both Edge and Chrome. My question is: how can I get rid of the boundaries?
UPDATE: I forgot to mention that I was using Sage to generate the graph. It now seems to be a problem specific to Sage (and not to Python in general). This time I used native Python and got the correct result.
I could not reproduce it, but maybe you can try not drawing the connecting line (markers only):
import matplotlib.pyplot as plt
import numpy as np
from pylab import *
graph_data = [[0, 1, 2, 3], [5, 8, 7, 9]]
x = range(len(graph_data[0]))
y = graph_data[1]
fig, ax = plt.subplots()
alpha = 0.5
plt.plot(x, y, 'o',markersize=3, color=[1., alpha, alpha])
ax.fill_between(x, 0, y, facecolor=[1., alpha, alpha], interpolate=False)
plt.show()
filename = 'test1.pdf'
fig.savefig(filename, bbox_inches='tight')
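If you want to keep the line, another thing that might be worth trying is to remove the outline of the filled polygon explicitly. This is an untested guess with respect to the Sage setup from the question; it only uses the standard edgecolor and linewidth arguments that fill_between passes on to the polygon collection.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [5, 8, 7, 9]
shade = [1.0, 0.5, 0.5]

fig, ax = plt.subplots()
ax.plot(x, y, "-o", markersize=3, color=shade, markeredgewidth=0.0)
# edgecolor="none" and linewidth=0 ask for a fill without any stroked outline
poly = ax.fill_between(x, 0, y, facecolor=shade, edgecolor="none", linewidth=0)
```

Saving works as before with fig.savefig('test1.pdf', bbox_inches='tight').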

Get desired wspace and subplots appropriately sized?

I'm trying to make a plot with one panel up top (colspan = 2) and two plots below, with a controlled amount of space between them. I'd like the bounds of the plots to be in alignment. Here's what I'm starting with:
import cartopy
from matplotlib import pyplot
from matplotlib.gridspec import GridSpec
gs = GridSpec(2, 2, height_ratios=[2, 1], hspace=0, wspace=0)
ax0 = pyplot.subplot(gs[0, :], projection=cartopy.crs.LambertConformal())
ax0.add_feature(cartopy.feature.COASTLINE)
ax0.set_extent([-120, -75, 20, 52], cartopy.crs.Geodetic())
ax1 = pyplot.subplot(gs[1, 0], projection=cartopy.crs.LambertConformal())
ax1.add_feature(cartopy.feature.COASTLINE)
ax1.set_extent([-90, -75, 20, 30], cartopy.crs.Geodetic())
ax2 = pyplot.subplot(gs[1, 1], projection=cartopy.crs.LambertConformal())
ax2.add_feature(cartopy.feature.COASTLINE)
ax2.set_extent([-90, -75, 20, 30], cartopy.crs.Geodetic())
pyplot.show()
The first problem is that the wspace=0 parameter doesn't take effect.
Second problem is (at least this is my guess on how to proceed) calculating a height ratio that will make the width of the upper subplot equal the combined width of the lower subplots (plus any wspace).

Matplotlib: Don't show errorbars in legend

I'm plotting a series of data points with x and y error but do NOT want the errorbars to be included in the legend (only the marker). Is there a way to do so?
Example:
import matplotlib.pyplot as plt
import numpy as np
subs=['one','two','three']
x=[1,2,3]
y=[1,2,3]
yerr=[2,3,1]
xerr=[0.5,1,1]
fig,(ax1)=plt.subplots(1,1)
for i in np.arange(len(x)):
    ax1.errorbar(x[i],y[i],yerr=yerr[i],xerr=xerr[i],label=subs[i],ecolor='black',marker='o',ls='')
ax1.legend(loc='upper left', numpoints=1)
fig.savefig('test.pdf', bbox_inches=0)
You can modify the legend handler. See the legend guide of matplotlib.
Adapting your example, this could read:
import matplotlib.pyplot as plt
import numpy as np
subs=['one','two','three']
x=[1,2,3]
y=[1,2,3]
yerr=[2,3,1]
xerr=[0.5,1,1]
fig,(ax1)=plt.subplots(1,1)
for i in np.arange(len(x)):
    ax1.errorbar(x[i],y[i],yerr=yerr[i],xerr=xerr[i],label=subs[i],ecolor='black',marker='o',ls='')
# get handles
handles, labels = ax1.get_legend_handles_labels()
# remove the errorbars
handles = [h[0] for h in handles]
# use them in the legend
ax1.legend(handles, labels, loc='upper left',numpoints=1)
plt.show()
This produces
Here is an ugly patch:
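# Assumes x, ys, yerrs and labels are already defined
pp = []
colors = ['r', 'b', 'g']
for i, (y, yerr) in enumerate(zip(ys, yerrs)):
    # Plot a plain line first and keep its handle for the legend
    p = plt.plot(x, y, '-', color=colors[i])
    pp.append(p[0])
    # Draw the error bars separately; their handles are not passed to the legend
    plt.errorbar(x, y, yerr, color=colors[i])
plt.legend(pp, labels, numpoints=1)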
Here is a figure for example:
The accepted solution works in simple cases but not in general. In particular, it did not work in my own more complex situation.
I found a more robust solution that tests for ErrorbarContainer, and it did work for me. It was proposed by Stuart W D Grieve and I copy it here for completeness:
import matplotlib.pyplot as plt
from matplotlib import container
label = ['one', 'two', 'three']
color = ['red', 'blue', 'green']
x = [1, 2, 3]
y = [1, 2, 3]
yerr = [2, 3, 1]
xerr = [0.5, 1, 1]
fig, (ax1) = plt.subplots(1, 1)
for i in range(len(x)):
    ax1.errorbar(x[i], y[i], yerr=yerr[i], xerr=xerr[i], label=label[i], color=color[i], ecolor='black', marker='o', ls='')
handles, labels = ax1.get_legend_handles_labels()
handles = [h[0] if isinstance(h, container.ErrorbarContainer) else h for h in handles]
ax1.legend(handles, labels)
plt.show()
It produces the following plot (on Matplotlib 3.1)
It works for me if I set the label argument to None.
plt.errorbar(x, y, yerr, label=None)
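Note that with label=None the errorbar call gets no legend entry at all. If you still want the marker in the legend, one possible sketch is to draw the marker with a separate labelled plot call and let errorbar draw only the bars (fmt='none'). The data is taken from the question; splitting the drawing into two calls is my own assumption about what is wanted.

```python
import matplotlib.pyplot as plt

subs = ["one", "two", "three"]
x = [1, 2, 3]
y = [1, 2, 3]
yerr = [2, 3, 1]
xerr = [0.5, 1, 1]

fig, ax = plt.subplots()
for xi, yi, xe, ye, name in zip(x, y, xerr, yerr, subs):
    # Marker-only line: this is what carries the legend label
    ax.plot(xi, yi, marker="o", ls="", label=name)
    # Error bars only (fmt="none" draws no marker), without a label
    ax.errorbar(xi, yi, xerr=xe, yerr=ye, fmt="none", ecolor="black", label=None)

handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc="upper left", numpoints=1)
```

The legend then lists only the three markers, since the unlabelled error bars are not collected by get_legend_handles_labels.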