matplotlib shared row label (not y label) in plot containing subplots - matplotlib

I have a trellis-like plot I am trying to produce in matplotlib. Here is a sketch of what I'm going for:
One thing I am having trouble with is getting a shared row label for each row. I.e. in my plot, I have four rows for four different sets of experiments, so I want row labels "1 source node, 2 source nodes, 4 source nodes and 8 source nodes".
Note that I am not referring to the y axis label, which is being used to label the dependent variable. The dependent variable is the same in all subplots, but the row labels I am after are to describe the four categories of experiments conducted, one for each row.
At the moment, I'm generating the plot with:
fig, axes = plt.subplots(4, 5, sharey=True)
While I've found plenty of information on sharing the y-axis label, I haven't found anything on adding a single shared row label.

As far as I know, matplotlib has no built-in row title ("ytitle") function. You can use text to place arbitrary text instead. Its x and y arguments are in data coordinates; ha and va are the horizontal and vertical alignment, respectively.
import numpy
import matplotlib
import matplotlib.pyplot as plt
n_rows = 4
n_cols = 5
fig, axes = plt.subplots(n_rows, n_cols, sharey = True)
axes[0][0].set_ylim(0,10)
for i in range(n_cols):
    axes[0][i].text(x = 0.5, y = 12, s = "column label", ha = "center")
    axes[n_rows-1][i].set_xlabel("xlabel")
for i in range(n_rows):
    axes[i][0].text(x = -0.8, y = 5, s = "row label", rotation = 90, va = "center")
    axes[i][0].set_ylabel("ylabel")
plt.show()
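Because the x and y passed to text are in data coordinates, the label positions above shift whenever the axis limits change. One possible alternative, sketched here, is annotate with the anchor given as a fraction of the axes plus a fixed offset in points. The label strings come from the question; the 55-point offset, the figure size and the left margin are illustrative assumptions that will likely need tuning.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

n_rows, n_cols = 4, 5
row_labels = ["1 source node", "2 source nodes", "4 source nodes", "8 source nodes"]

fig, axes = plt.subplots(n_rows, n_cols, sharey=True, figsize=(10, 8))

row_texts = []
for ax, label in zip(axes[:, 0], row_labels):
    ax.set_ylabel("ylabel")
    # Anchor at the left-centre of the axes (axes fraction), then shift
    # 55 points further left so the row label sits outside the y label.
    txt = ax.annotate(label, xy=(0, 0.5), xytext=(-55, 0),
                      xycoords="axes fraction", textcoords="offset points",
                      ha="right", va="center", rotation=90)
    row_texts.append(txt)

fig.subplots_adjust(left=0.15)  # leave room for the row labels
fig.savefig("row_labels.png")
```

Since the offset is in points rather than data units, the row labels keep their distance from the axes even after set_ylim or set_xlim is called.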

You could give titles to subplots on the top row like Robbert suggested
fig, axes = plt.subplots(4,3)
for i, ax in enumerate(axes[0,:]):
    ax.set_title('col%i'%i)

Related

Apply `ListedColormap` on bar chart [duplicate]

I have a df with two columns:
y: different numeric values for the y axis
days: the names of four different days (Monday, Tuesday, Wednesday, Thursday)
I also have a colormap with four different colors that I made myself; it is a ListedColormap object.
I want to create a bar chart with the four categories (days of the week) in the x axis and their corresponding values in the y axis. At the same time, I want each bar to have a different color using my colormap.
This is the code I used to build my bar chart:
def my_barchart(my_df, my_cmap):
    fig = plt.figure()
    ax = fig.add_axes([0,0,1,1])
    ax.bar(my_df['days'], my_df['y'], color=my_cmap)
    return fig
However, I get the following error: "object of type 'ListedColormap' has no len()", so it seems that I'm not using my_cmap correctly.
If I remove that from the function and run it, my bar chart looks ok, except that all bars have the same color. So my question is: what is the right way to use a colormap with a bar chart?
The color argument wants either a string or an RGB[A] value (it can be a single colour, or a sequence of colours with one for each data point you are plotting). Colour maps are typically callable with floats in the range [0, 1].
So what you want to do is take the values you want for the colours for each bar, scale them to the range [0, 1], and then call my_cmap with those rescaled values.
So, say for example you wanted the colours to correspond to the y values (heights of the bars), then you should modify your code like this (assumes you have called import numpy as np earlier on):
def my_barchart(my_df, my_cmap):
    rescale = lambda y: (y - np.min(y)) / (np.max(y) - np.min(y))
    fig = plt.figure()
    ax = fig.add_axes([0,0,1,1])
    ax.bar(my_df['days'], my_df['y'], color=my_cmap(rescale(my_df['y'])))
    return fig
Here is a self-contained minimal example of using the color argument with the output from a cmap:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
my_cmap = plt.get_cmap("viridis")
rescale = lambda y: (y - np.min(y)) / (np.max(y) - np.min(y))
plt.bar(x, y, color=my_cmap(rescale(y)))
plt.savefig("temp")
Okay, I found a way to do this without having to scale my values:
def my_barchart(my_df, my_cmap):
    fig = plt.figure()
    ax = fig.add_axes([0,0,1,1])
    ax.bar(my_df['days'], my_df['y'], color=my_cmap.colors)
    return fig
Simply adding .colors after my_cmap works!
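Both approaches can be tried side by side in a small sketch. The four colour names and the y values below are made-up placeholders, not the asker's data. Note that .colors is an attribute of ListedColormap specifically; other colormap types such as LinearSegmentedColormap do not provide it, so calling the colormap with integer indices may be the more general option.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap, to_rgba

# Placeholder colormap and data (assumptions for illustration only)
my_cmap = ListedColormap(["tab:blue", "tab:orange", "tab:green", "tab:red"])
days = ["Monday", "Tuesday", "Wednesday", "Thursday"]
y = [3, 7, 5, 2]

# Option 1: use the list of colours stored on the ListedColormap
colors_attr = my_cmap.colors

# Option 2: call the colormap with integer indices; integers look up
# entries of the colour table directly and return an (n, 4) RGBA array
colors_call = my_cmap(np.arange(len(days)))

fig, ax = plt.subplots()
bars = ax.bar(days, y, color=colors_call)
fig.savefig("bars_by_day.png")
```

Either colors_attr or colors_call can be passed as the color argument, because both are sequences with one colour per bar.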

Tick labels in between colors (discrete colorbar)

Hi, I want to put the tick labels between colors (at the center of each interval) on a colorbar drawn with discrete colors. However, the minimum value of my data is not 0. How can I write the code to do that?
I used the following code, but the result is wrong:
n_clusters = len(cbar_tick_label)
tick_locs = (np.arange(n_clusters)+0.5)*(n_clusters-1)/(n_clusters)
cbar.set_ticks(tick_locs)
cbar.set_ticklabels(cbar_tick_label)
This code is from question: Discrete Color Bar with Tick labels in between colors. But it does not work when the min value of data is not zero.
Thanks!
Suppose there are N (e.g. 6) clusters. If you subdivide the range from the lowest number (e.g. 5) to the highest number (e.g. 10) into N equal parts, there will be a tick at every border between color cells (N+1 ticks). Subdividing into 2*N equal parts instead gives 2*N+1 ticks, which adds a tick in the center of each color cell. Skipping every other one of these 2*N+1 ticks, starting from the second, leaves only the cell centers. So, np.linspace(5, 10, 6*2+1) gives the ticks for borders and centers; np.linspace(5, 10, 6*2+1)[1::2] gives only the centers.
import numpy as np
import matplotlib.pyplot as plt
x, y = np.random.rand(2, 100)
c = np.random.randint(5, 11, x.shape)
n_clusters = c.max() - c.min() + 1
fig, ax = plt.subplots()
cmap = plt.get_cmap('inferno_r', n_clusters)
scat = ax.scatter(x, y, c=c, cmap=cmap)
cbar = plt.colorbar(scat)
tick_locs = np.linspace(c.min(), c.max(), 2 * n_clusters + 1)[1::2]
cbar_tick_label = np.arange(c.min(), c.max() + 1)
cbar.set_ticks(tick_locs)
cbar.set_ticklabels(cbar_tick_label)
plt.show()
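The tick positions can be checked numerically without drawing anything. The helper name cell_centers is hypothetical, introduced only for this sketch; it wraps the linspace slicing described above and compares it with the equivalent closed form.

```python
import numpy as np

def cell_centers(vmin, vmax, n_cells):
    """Return the centre of each of n_cells equal-width cells spanning [vmin, vmax]."""
    # 2 * n_cells + 1 points are borders and centres interleaved;
    # every second point, starting from index 1, is a centre.
    return np.linspace(vmin, vmax, 2 * n_cells + 1)[1::2]

centers = cell_centers(5, 10, 6)

# Equivalent closed form: vmin + (k + 0.5) * cell_width for k = 0 .. n_cells - 1
closed_form = 5 + (np.arange(6) + 0.5) * (10 - 5) / 6

print(np.round(centers, 3))
print(np.allclose(centers, closed_form))
```

The closed form makes the difference from the question's formula visible: the offset vmin has to be added, and the cell width is (vmax - vmin) / N rather than (N - 1) / N.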

Display the value of the bar on each bar, wrong place

I have a DF like that:
   Day Destiny  Flight  Year
0   10     AJU    1504  2019
1   10     AJU    1502  2020
2   10     FOR    1524  2019
3   10     FOR    1522  2020
4   10     FOR    1528  2019
I am using this code to plot a chart that compares the years side by side for each destination. It's working well.
df.groupby(["Destiny","Year"])["Flight"].count().unstack().plot.bar(figsize=(12, 3))
I have this other one to plot values on top of the bars. But it is plotting in the wrong place.
a = df.groupby(["Destiny","Year"])["Flight"].count().unstack().plot.bar(figsize=(12, 3))
for i, v in enumerate(df.groupby(["Destiny","Year"])["Flight"].count()):
    a.text(v, i, str(v))
How to display the value of the bar on each bar correctly?
I've been looking for something like that, but I haven't found it.
Update:
Matplotlib 3.4 added the bar_label function, which can be incorporated into the code below as follows:
for bar_group in ax.containers:
    ax.bar_label(bar_group, fmt='%.0f', size=18)
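A self-contained sketch of the bar_label approach, assuming matplotlib 3.4 or newer and rebuilding the question's table by hand. The names counts and label_texts are introduced here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Destiny": ["AJU", "AJU", "FOR", "FOR", "FOR"],
                   "Flight": [1504, 1502, 1524, 1522, 1528],
                   "Year": [2019, 2020, 2019, 2020, 2019]})

# Rows are destinations, columns are years, values are flight counts
counts = df.groupby(["Destiny", "Year"])["Flight"].count().unstack()
ax = counts.plot.bar(figsize=(12, 3))

label_texts = []
for bar_group in ax.containers:  # one container per year
    texts = ax.bar_label(bar_group, fmt="%.0f")
    label_texts.append([t.get_text() for t in texts])

ax.margins(y=0.2)  # headroom so the labels are not clipped
plt.tight_layout()
plt.savefig("flights_per_destination.png")
```

Because bar_label reads the values from each bar container, there is no need to compute x and y positions by hand.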
Old answer:
You can loop through the generated bars, and use their x, height and width to position the text. Adding an empty line into the string helps position the text independent of the scale. ax.margins() can add some space above the bars to make the text fit.
from matplotlib import pyplot as plt
import pandas as pd
df = pd.DataFrame({'Destiny': ['AJU', 'AJU', 'FOR', 'FOR', 'FOR'],
                   'Flight': range(1501, 1506),
                   'Year': [2019, 2020, 2019, 2020, 2019]})
ax = df.groupby(["Destiny","Year"])["Flight"].count().unstack().plot.bar(figsize=(12, 3))
for p in ax.patches:
    x = p.get_x()
    h = p.get_height()
    w = p.get_width()
    ax.annotate(f'{h:.0f}\n', (x + w/2, h), ha='center', va='center', size=18)
plt.margins(y=0.2)
plt.tight_layout()
plt.show()
The add_value_labels function below is from justfortherec. It is easy to use: just pass a matplotlib.axes.Axes object to it:
import pandas as pd
import matplotlib.pyplot as plt
def add_value_labels(ax, spacing=5):
    """Add labels to the end of each bar in a bar chart.
    Arguments:
        ax (matplotlib.axes.Axes): The matplotlib object containing the axes
            of the plot to annotate.
        spacing (int): The distance between the labels and the bars.
    """
    # For each bar: Place a label
    for rect in ax.patches:
        # Get X and Y placement of label from rect.
        y_value = rect.get_height()
        x_value = rect.get_x() + rect.get_width() / 2
        # Number of points between bar and label. Change to your liking.
        space = spacing
        # Vertical alignment for positive values
        va = 'bottom'
        # If value of bar is negative: Place label below bar
        if y_value < 0:
            # Invert space to place label below
            space *= -1
            # Vertically align label at top
            va = 'top'
        # Use Y value as label and format number with one decimal place
        label = "{:.1f}".format(y_value)
        # Create annotation
        ax.annotate(
            label,                      # Use `label` as label
            (x_value, y_value),         # Place label at end of the bar
            xytext=(0, space),          # Vertically shift label by `space`
            textcoords="offset points", # Interpret `xytext` as offset in points
            ha='center',                # Horizontally center label
            va=va)                      # Vertically align label differently for
                                        # positive and negative values.
df = pd.read_csv("1.csv")
ax = df.groupby(["Destiny","Year"])["Flight"].count().unstack().plot.bar(figsize=(12, 3))
# Call the function above. All the magic happens there.
add_value_labels(ax)
plt.show()
I think we can adapt the answer referenced by @JohanC to fit your problem.
import pandas as pd
import matplotlib.pyplot as plt
from decimal import Decimal
df = pd.DataFrame({'Day':[10]*5, 'Destiny':['AJU']*2+['FOR']*3, 'Flight':[1504,1502,1524,1522,1528],'Year':[2019,2020,2019,2020,2019]})
a = df.groupby(["Destiny","Year"])["Flight"].count().unstack().plot.bar(figsize=(12, 3))
for p in a.patches:
    a.annotate('{}'.format(Decimal(str(p.get_height()))), (p.get_x(), p.get_height()))
plt.show()

Scatterplot with marginal KDE plots and multiple categories in Matplotlib

I'd like a function in Matplotlib similar to the Matlab 'scatterhist' function which takes continuous values for 'x' and 'y' axes, plus a categorical variable as input; and produces a scatter plot with marginal KDE plots and two or more categorical variables in different colours as output:
I've found examples of scatter plots with marginal histograms in Matplotlib, marginal histograms in Seaborn jointplot, overlapping histograms in Matplotlib and marginal KDE plots in Matplotlib; but I haven't found any examples which combine scatter plots with marginal KDE plots and are colour coded to indicate different categories.
If possible, I'd like a solution which uses 'vanilla' Matplotlib without Seaborn, as this will avoid dependencies and allow complete control and customisation of the plot appearance using standard Matplotlib commands.
I was going to try to write something based on the above examples; but before doing so wanted to check whether a similar function was already available, and if not then would be grateful for any guidance on the best approach to use.
@ImportanceOfBeingEarnest: Many thanks for your help.
Here's my first attempt at a solution.
It's a bit hacky but achieves my objectives, and is fully customisable using standard matplotlib commands. I'm posting the code here with annotations in case anyone else wishes to use it or develop it further. If there are any improvements or neater ways of writing the code I'm always keen to learn and would be grateful for guidance.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy import stats
label = ['Setosa','Versicolor','Virginica'] # List of labels for categories
cl = ['b','r','y'] # List of colours for categories
categories = len(label)
sample_size = 20 # Number of samples in each category
# Create numpy arrays for dummy x and y data:
x = np.zeros(shape=(categories, sample_size))
y = np.zeros(shape=(categories, sample_size))
# Generate random data for each categorical variable:
for n in range(0, categories):
    x[n,:] = np.array(np.random.randn(sample_size)) + 4 + n
    y[n,:] = np.array(np.random.randn(sample_size)) + 6 - n
# Set up 4 subplots as axis objects using GridSpec:
gs = gridspec.GridSpec(2, 2, width_ratios=[1,3], height_ratios=[3,1])
# Add space between scatter plot and KDE plots to accommodate axis labels:
gs.update(hspace=0.3, wspace=0.3)
# Set background canvas colour to White instead of grey default
fig = plt.figure()
fig.patch.set_facecolor('white')
ax = plt.subplot(gs[0,1]) # Instantiate scatter plot area and axis range
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min(), y.max())
ax.set_xlabel('x')
ax.set_ylabel('y')
axl = plt.subplot(gs[0,0], sharey=ax) # Instantiate left KDE plot area
axl.get_xaxis().set_visible(False) # Hide tick marks and spines
axl.get_yaxis().set_visible(False)
axl.spines["right"].set_visible(False)
axl.spines["top"].set_visible(False)
axl.spines["bottom"].set_visible(False)
axb = plt.subplot(gs[1,1], sharex=ax) # Instantiate bottom KDE plot area
axb.get_xaxis().set_visible(False) # Hide tick marks and spines
axb.get_yaxis().set_visible(False)
axb.spines["right"].set_visible(False)
axb.spines["top"].set_visible(False)
axb.spines["left"].set_visible(False)
axc = plt.subplot(gs[1,0]) # Instantiate legend plot area
axc.axis('off') # Hide tick marks and spines
# Plot data for each categorical variable as scatter and marginal KDE plots:
for n in range(0, categories):
    ax.scatter(x[n],y[n], color='none', label=label[n], s=100, edgecolor= cl[n])
    kde = stats.gaussian_kde(x[n,:])
    xx = np.linspace(x.min(), x.max(), 1000)
    axb.plot(xx, kde(xx), color=cl[n])
    kde = stats.gaussian_kde(y[n,:])
    yy = np.linspace(y.min(), y.max(), 1000)
    axl.plot(kde(yy), yy, color=cl[n])
# Copy legend object from scatter plot to lower left subplot and display:
# NB 'scatterpoints = 1' customises legend box to show only 1 handle (icon) per label
handles, labels = ax.get_legend_handles_labels()
axc.legend(handles, labels, scatterpoints = 1, loc = 'center', fontsize = 12)
plt.show()
Version 2, using Pandas to import 'real' data from a csv file, with a different number of entries in each category. (csv file format: row 0 = headers; col 0 = x values, col 1 = y values, col 2 = category labels). Scatterplot axis and legend labels are generated from column headers.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy import stats
import pandas as pd
"""
Create scatter plot with marginal KDE plots
from csv file with 3 cols of data
formatted as following example (first row of
data are headers):
'x_label', 'y_label', 'category_label'
4,5,'virginica'
3,6,'setosa'
4,6, 'virginica' etc...
"""
df = pd.read_csv('iris_2.csv') # enter filename for csv file to be imported (within current working directory)
cl = ['b','r','y', 'g', 'm', 'k'] # Custom list of colours for each categories - increase as needed...
headers = list(df.columns) # Extract list of column headers
# Find min and max values for all x (= col [0]) and y (= col [1]) in dataframe.
# Positional access uses .iloc, since .ix was removed in pandas 1.0:
xmin, xmax = df.iloc[:, 0].min(), df.iloc[:, 0].max()
ymin, ymax = df.iloc[:, 1].min(), df.iloc[:, 1].max()
# Create a list of all unique categories which occur in the right hand column (ie index '2'):
category_list = df.iloc[:, 2].unique()
# Set up 4 subplots and aspect ratios as axis objects using GridSpec:
gs = gridspec.GridSpec(2, 2, width_ratios=[1,3], height_ratios=[3,1])
# Add space between scatter plot and KDE plots to accommodate axis labels:
gs.update(hspace=0.3, wspace=0.3)
fig = plt.figure() # Set background canvas colour to White instead of grey default
fig.patch.set_facecolor('white')
ax = plt.subplot(gs[0,1]) # Instantiate scatter plot area and axis range
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)
ax.set_xlabel(headers[0], fontsize = 14)
ax.set_ylabel(headers[1], fontsize = 14)
ax.yaxis.labelpad = 10 # adjust space between x and y axes and their labels if needed
axl = plt.subplot(gs[0,0], sharey=ax) # Instantiate left KDE plot area
axl.get_xaxis().set_visible(False) # Hide tick marks and spines
axl.get_yaxis().set_visible(False)
axl.spines["right"].set_visible(False)
axl.spines["top"].set_visible(False)
axl.spines["bottom"].set_visible(False)
axb = plt.subplot(gs[1,1], sharex=ax) # Instantiate bottom KDE plot area
axb.get_xaxis().set_visible(False) # Hide tick marks and spines
axb.get_yaxis().set_visible(False)
axb.spines["right"].set_visible(False)
axb.spines["top"].set_visible(False)
axb.spines["left"].set_visible(False)
axc = plt.subplot(gs[1,0]) # Instantiate legend plot area
axc.axis('off') # Hide tick marks and spines
# For each category in the list...
for n in range(0, len(category_list)):
    # Create a sub-table containing only entries matching current category:
    st = df.loc[df[headers[2]] == category_list[n]]
    # Select first two columns of sub-table as x and y values to be plotted:
    x = st[headers[0]]
    y = st[headers[1]]
    # Plot data for each categorical variable as scatter and marginal KDE plots:
    ax.scatter(x,y, color='none', s=100, edgecolor= cl[n], label = category_list[n])
    kde = stats.gaussian_kde(x)
    xx = np.linspace(xmin, xmax, 1000)
    axb.plot(xx, kde(xx), color=cl[n])
    kde = stats.gaussian_kde(y)
    yy = np.linspace(ymin, ymax, 1000)
    axl.plot(kde(yy), yy, color=cl[n])
# Copy legend object from scatter plot to lower left subplot and display:
# NB 'scatterpoints = 1' customises legend box to show only 1 handle (icon) per label
handles, labels = ax.get_legend_handles_labels()
axc.legend(handles, labels, title = headers[2], scatterpoints = 1, loc = 'center', fontsize = 12)
plt.show()

matplotlib: Using append_axes multiple times

I'm new to matplotlib, so I do not have strong enough command of the language to know if I'm going about this the right way, but I've been searching for the answer for a while now, and I just cannot find anything one way or the other on this.
I know how to use matplotlib's append_axes locator function to append histograms alongside 2D plots, e.g.:
axMain= fig1.add_subplot(111)
cax = plt.contourf(x1,y1,z1)
divider = make_axes_locatable(axMain)
axHisty = divider.append_axes("right", 1.2, pad=0.1, sharey=axMain)
axHisty.plot(x,y)
and I also know how to append a colorbar in a similar manner:
divider = make_axes_locatable(axMain)
ax_cb = divider.new_horizontal(size='5%', pad=0.3)
fig1.add_axes(ax_cb)
fig1.colorbar(cax, cax=ax_cb)
What I am not clear on is how to do both in the same subplot without the two appended axes overlapping. To be clear, I want the histogram to have the same y-axis ticks and height as axMain, and I want the colorbar to have the same height as axMain. ImageGrid doesn't seem to be quite what I want because I do not want to fix the size of my plot. It would be better for me if I could add or remove these figure "embellishments" interactively, but maybe that is not possible. Let me know!
You are already fixing the size of your plot with divider.append_axes("right", 1.2, pad=0.1, sharey=axMain). 1.2 is the size of the new axis. Below is a way of plotting three axes using gridspec.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.gridspec as grd
from numpy.random import rand
# add axes
fig1 = plt.figure(1)
gs = grd.GridSpec(1, 3, width_ratios=[5,1, 1], wspace=0.3)
axMain = plt.subplot(gs[0])
axHisty = plt.subplot(gs[1])
ax_cb = plt.subplot(gs[2])
# some things to plot
x = [1,2,3,4]
y = [1,2,3,4]
x1 = [1,2,3,4]
y1 = [1,2,3,4]
z1 = rand(4,4)
# make plots
h = axMain.contourf(x1,y1,z1)
axHisty.plot(x,y)
cb = plt.colorbar(h, cax = ax_cb)
plt.show()
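For comparison with the gridspec layout, here is a sketch that stays with the divider from the question and calls append_axes twice on the same divider object. As far as I know, each call places the new axes outside the previously appended ones, so the two should not overlap. The random data and the row-sum profile drawn in the side panel are placeholders, not the asker's data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Placeholder data (assumption for illustration only)
rng = np.random.default_rng(0)
x1 = y1 = np.arange(1, 5)
z1 = rng.random((4, 4))

fig1, axMain = plt.subplots()
contours = axMain.contourf(x1, y1, z1)

# One divider, reused for both appended axes
divider = make_axes_locatable(axMain)

# First call: side panel that shares the y axis with the main plot
axHisty = divider.append_axes("right", size=1.2, pad=0.1, sharey=axMain)
axHisty.plot(z1.sum(axis=1), y1)
axHisty.tick_params(labelleft=False)  # avoid duplicate y tick labels

# Second call: colorbar axes, appended to the right of the side panel
ax_cb = divider.append_axes("right", size="5%", pad=0.3)
fig1.colorbar(contours, cax=ax_cb)

fig1.savefig("contour_with_side_axes.png")
```

Creating a second divider with make_axes_locatable for the colorbar, as in the question's two snippets, is the likely cause of the overlap, since each divider lays out its axes independently.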