pylab 2d plotting - matplotlib

I'm using pcolor in pylab to plot a two-dimensional array with values between 0 and 1.
Some array entries are not valid, and I want them drawn in a different color in the plot to show that.
I tried calling pcolor twice with a different color map, hoping pylab would overlay the two plots, but that didn't work.
How could I do this?
Thanks a lot,
Thomas

You can use masked arrays with pcolor. Masked cells are not drawn, so the axes background (white by default) shows through them. Setting the background colour to something else changes how the masked cells appear.
For example:
import matplotlib
import numpy as np
from pylab import pcolor, show
# Set the background colour for the axes (green in this example)
matplotlib.rcParams['axes.facecolor'] = 'green'
# Make an identity array
data = np.eye(3)
# Convert to a masked array
masked_data = np.ma.array(data)
# Initialise the mask
masked_data.mask = False
# Now mask the middle data point
masked_data.mask[1,1] = True
# Plot
pcolor(masked_data)
show()
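A variation on the same idea, as a minimal sketch: it assumes the invalid entries can be recognised because they fall outside the valid 0 to 1 range (the -1.0 sentinel below is made up for illustration), and it sets the background on a single axes with ax.set_facecolor rather than globally through rcParams.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Example data in [0, 1) with one invalid entry.
# The -1.0 sentinel is an assumption made for this sketch.
rng = np.random.default_rng(0)
data = rng.random((4, 5))
data[1, 2] = -1.0

# Mask everything outside the valid range [0, 1]
masked = np.ma.masked_outside(data, 0.0, 1.0)

fig, ax = plt.subplots()
# Masked cells are not drawn, so this colour shows through them
ax.set_facecolor("green")
mesh = ax.pcolor(masked, vmin=0, vmax=1)
fig.colorbar(mesh, ax=ax)
```

Setting the face colour per axes keeps other figures in the same session unaffected.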

Related

hvplot quadmesh custom dynamic cmap

I am trying to create a rangeSlider that controls the colorbar and bins of an hvplot quadmesh plot of gridded data. Right now I am using cmap, which works well, but I need a way to bin the data and color it with a three-color scheme, namely:
(min, rangeSlider[0]) = Green, labeled Good
(rangeSlider[0], rangeSlider[1]) = Yellow, labeled Caution
(rangeSlider[1], max) = Red, labeled Dangerous
I have made a couple of attempts, but I am not sure how to pass a ListedColormap from matplotlib.colors, together with the labels, to a "binning" function of the quadmesh hvplot object.
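The three-bin scheme described above can be sketched on the Matplotlib side with a ListedColormap and a BoundaryNorm. This is only an illustration of the binning itself: the helper name three_bin_scheme and the example boundary values are hypothetical, and how the result would be handed to hvplot is not shown here.

```python
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap


def three_bin_scheme(vmin, low, high, vmax):
    """Return a 3-colour map, a norm that bins into it, and the bin labels.

    Hypothetical helper: low and high stand in for the two slider values.
    """
    cmap = ListedColormap(["green", "yellow", "red"])
    # Three intervals: [vmin, low), [low, high), [high, vmax)
    norm = BoundaryNorm([vmin, low, high, vmax], ncolors=cmap.N)
    labels = ["Good", "Caution", "Dangerous"]
    return cmap, norm, labels


cmap, norm, labels = three_bin_scheme(0.0, 0.3, 0.7, 1.0)
values = np.array([0.1, 0.5, 0.9])
bin_index = np.asarray(norm(values))      # which bin each value falls in
value_labels = [labels[i] for i in bin_index]
value_colors = cmap(norm(values))         # RGBA colour for each value
print(value_labels)
```

Rebuilding the norm whenever the slider moves keeps the colours fixed while only the bin edges change.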

Specify Matplotlib's kwargs to Seaborn's displot when hue is used

Suppose we have this:
import seaborn as sns
import pandas as pd
import numpy as np
samples = 2**13
data = pd.DataFrame({'Values': list(np.random.normal(size=samples)) + list(np.random.uniform(size=samples)),
'Kind': ['Normal'] * samples + ['Uniform'] * samples})
sns.displot(data, hue='Kind', x='Values', fill=True)
I want the Normal histogram (or KDE) emphasized: red, opaque, and drawn in the background. Uniform should have alpha = .5.
How do I specify these style parameters in a "per hue" manner?
It's possible to do it with two separate histplots on the same Axes, as @Redox suggested. We can basically recreate the same plot, but with fine-grained control over colours and alpha. However, I had to explicitly pass in the number of bins to get the same plot as yours. I also needed to define the colour for Uniform, otherwise a ghost element would be added to the legend! I used C1, the second colour in the default colour cycle.
import matplotlib.pyplot as plt
_, ax = plt.subplots()
sns.histplot(data=data[data.Kind=='Normal'], x="Values", ax=ax, label='Normal', color='tab:red', bins=130, alpha=1)
sns.histplot(data=data[data.Kind=='Uniform'], x="Values", ax=ax, label='Uniform', color='C1', bins=17, alpha=.5)
ax.set_xlabel('')
ax.legend()
Note that if you just want to set the colour without alpha you can already do this on a displot via the palette argument - pass in a dictionary of your unique hue values to colour names. However, the alpha that you pass in must be a scalar. I tried to use this clever answer to set colours as RGBA colours which include alpha, which seems to work with other figure level plots in Seaborn. However, displot overrides this and sets the alpha separately!

How to pass different scatter kwargs to facets in lmplot in Seaborn

I'm trying to map a 3rd variable to the scatter point colour in the Seaborn lmplot. So total_bill on x, tip on y and point colour as function of size.
It works when no faceting is enabled but fails when col is used because the colour array size does not match the size of the data plotted in each facet.
This is my code
import matplotlib as mpl
import seaborn as sns
sns.set(color_codes=True)
# load data
data = sns.load_dataset("tips")
# size of data
print(len(data.index))
### we want to plot scatter point colour as function of variable 'size'
# first, sort the data by 'size' so that high 'size' values are plotted
# over the smaller sizes (so they are more visible)
data = data.sort_values(by=['size'], ascending=True)
scatter_kws = dict()
# mpl.cm.get_cmap was removed in recent Matplotlib; use the colormap registry
cmap = mpl.colormaps['Blues']
# normalise 'size' variable as float range needs to be
# between 0 and 1 to map to a valid colour
scatter_kws['c'] = data['size'] / data['size'].max()
# map normalised values to colours
scatter_kws['c'] = cmap(scatter_kws['c'].values)
# colour array has same size as data
print(len(scatter_kws['c']))
# this works as intended
g = sns.lmplot(data=data, x="total_bill", y="tip", scatter_kws=scatter_kws)
The above works well and produces the following (not allowed to include images yet, so here's the link):
lmplot with point colour as function of size
However, when I add col='sex' to lmplot (see the code below), the colour array still has the size of the original dataset, which is larger than the number of points plotted in each facet. For example, the Male facet has 157 data points, so the first 157 values from the colour array are mapped to its points (and these aren't even the correct ones). See below:
lmplot with point colour as function of size with col=sex
g = sns.lmplot(data=data, x="total_bill", y="tip", col="sex", scatter_kws=scatter_kws)
Ideally, I'd like to pass an array of scatter_kws to the lmplot so that each facet uses the correct colour array (which I'd calculate before passing to lmplot). But that doesn't seem to be an option.
Are there any other ideas or workarounds that still allow me to use the functionality of Seaborn's lmplot (meaning, without resorting to re-creating lmplot functionality from FacetGrid)?
In principle, an lmplot with different cols seems to be just a wrapper around several regplots. So instead of one lmplot we could use two regplots, one for each sex.
We therefore need to split the original dataframe into male and female; the rest is rather straightforward.
import matplotlib.pyplot as plt
import seaborn as sns
data = sns.load_dataset("tips")
data = data.sort_values(by=['size'], ascending=True)
# make a new dataframe for males and females
male = data[data["sex"] == "Male"]
female = data[data["sex"] == "Female"]
# get normalized colors for all data
colors = data['size'].values / float(data['size'].max())
# get colors for males / females
colors_male = colors[data["sex"].values == "Male"]
colors_female = colors[data["sex"].values == "Female"]
# colors are values in [0,1] range
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(9,4))
# create regplot for males, put it on the left axes
# use colors_male to color the points with the Blues cmap
sns.regplot(data=male, x="total_bill", y="tip", ax=ax1,
            scatter_kws={"c": colors_male, "cmap": "Blues"})
# same for females
sns.regplot(data=female, x="total_bill", y="tip", ax=ax2,
            scatter_kws={"c": colors_female, "cmap": "Greens"})
ax1.set_title("Males")
ax2.set_title("Females")
for ax in [ax1, ax2]:
    ax.set_xlim([0,60])
    ax.set_ylim([0,12])
plt.tight_layout()
plt.show()

colorbars for grid of line (not contour) plots in matplotlib

I'm having trouble giving colorbars to a grid of line plots in Matplotlib.
I have a grid of plots, which each shows 64 lines. The lines depict the penalty value vs time when optimizing the same system under 64 different values of a certain hyperparameter h.
Since there are so many lines, instead of using a standard legend, I'd like to use a colorbar, and color the lines by the value of h. In other words, I'd like something that looks like this:
The above was done by adding a new axis to hold the colorbar, by calling figure.add_axes([0.95, 0.2, 0.02, 0.6]), passing in the axis position explicitly as parameters to that method. The colorbar was then created as in the example code here, by instantiating a ColorbarBase(). That's fine for single plots, but I'd like to make a grid of plots like the one above.
To do this, I tried doubling the number of subplots, and using every other subplot axis for the colorbar. Unfortunately, this led to the colorbars having the same size/shape as the plots:
Is there a way to shrink just the colorbar subplots in a grid of subplots like the 1x2 grid above?
Ideally, it'd be great if the colorbar just shared the same axis as the line plot it describes. I saw that the colorbar.colorbar() function has an ax parameter:
ax
parent axes object from which space for a new colorbar axes will be stolen.
That sounds great, except that colorbar.colorbar() requires you to pass in an imshow image or a ContourSet, but my plot is neither an image nor a contour plot. Can I achieve the same (axis-sharing) effect using ColorbarBase?
It turns out you can have different-shaped subplots, so long as all the plots in a given row have the same height, and all the plots in a given column have the same width.
You can do this using gridspec.GridSpec, as described in this answer.
So I set the columns with line plots to be 20x wider than the columns with color bars. The code looks like:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm, colorbar, gridspec

# num_rows, num_columns, x_vec_lists, y_vec_lists and color_hyperparam_vecs
# are assumed to be defined already, with one list entry per plot.
grid_spec = gridspec.GridSpec(num_rows,
                              num_columns * 2,
                              width_ratios=[20, 1] * num_columns)
colormap_type = cm.cool
for (x_vec_list,
     y_vec_list,
     color_hyperparam_vec,
     plot_index) in zip(x_vec_lists,
                        y_vec_lists,
                        color_hyperparam_vecs,
                        range(len(x_vec_lists))):
    # Even grid cells hold the line plots, odd cells hold the colorbars
    line_axis = plt.subplot(grid_spec[plot_index * 2])
    colorbar_axis = plt.subplot(grid_spec[plot_index * 2 + 1])
    colormap_normalizer = mpl.colors.Normalize(vmin=color_hyperparam_vec.min(),
                                               vmax=color_hyperparam_vec.max())
    scalar_to_color_map = mpl.cm.ScalarMappable(norm=colormap_normalizer,
                                                cmap=colormap_type)
    colorbar.ColorbarBase(colorbar_axis,
                          cmap=colormap_type,
                          norm=colormap_normalizer)
    for (line_index,
         x_vec,
         y_vec) in zip(range(len(x_vec_list)),
                       x_vec_list,
                       y_vec_list):
        # Colour each line by its hyperparameter value
        hyperparam = color_hyperparam_vec[line_index]
        line_color = scalar_to_color_map.to_rgba(hyperparam)
        line_axis.plot(x_vec, y_vec, color=line_color, alpha=0.5)
For num_rows=1 and num_columns=1, this looks like:
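For reference, here is a self-contained sketch of the 1x1 case that can be run as-is. The curves are synthetic stand-ins (64 exponential decays, one per value of h), not the data from the question, and it passes a ScalarMappable to fig.colorbar with an explicit cax, which is an alternative to instantiating ColorbarBase directly.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib import gridspec
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Synthetic stand-in data: 64 decaying curves, one per hyperparameter value h
h_values = np.linspace(0.1, 2.0, 64)
t = np.linspace(0.0, 5.0, 100)
curves = [np.exp(-h * t) for h in h_values]

fig = plt.figure(figsize=(6, 4))
# One row, two columns: the line plot is 20x wider than the colorbar
grid = gridspec.GridSpec(1, 2, width_ratios=[20, 1])
line_ax = fig.add_subplot(grid[0])
cbar_ax = fig.add_subplot(grid[1])

norm = Normalize(vmin=h_values.min(), vmax=h_values.max())
mappable = ScalarMappable(norm=norm, cmap="cool")

# Colour each line by its value of h
for h, y in zip(h_values, curves):
    line_ax.plot(t, y, color=mappable.to_rgba(h), alpha=0.5)

# Draw the colorbar into the narrow axes reserved for it
cbar = fig.colorbar(mappable, cax=cbar_ax)
cbar.set_label("h")
```

The same norm and colormap drive both the line colours and the colorbar, so the two cannot drift apart.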

Problems with zeros in matplotlib.colors.LogNorm

I am plotting a histogram using
plt.imshow(hist2d, norm=LogNorm(), cmap='gray')
where hist2d is a matrix of histogram values. This works fine except for elements in hist2d that are zero. In particular, I obtain the following image
but would like the white patches to be black.
Thank you!
Here's an alternative method that does not require you to muck with your data: set an RGB value for bad pixels on the colormap instead.
import copy
import numpy as np
import matplotlib.colors
import matplotlib.pyplot as plt
data = np.arange(25).reshape((5,5))
# copy the default cmap so the registered one is left untouched
my_cmap = copy.copy(plt.get_cmap('gray'))
my_cmap.set_bad((0,0,0))
plt.imshow(data,
           norm=matplotlib.colors.LogNorm(),
           interpolation='nearest',
           cmap=my_cmap)
The problem is that bins containing 0 cannot be log-normalized, so they are flagged as 'bad' and are mapped separately from the rest of the data. The default behavior is to not draw anything on those pixels. You can also specify what color to use for pixels that are over or under the limits of the color map (the default is to draw them in the highest/lowest color).
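The three special cases (bad, under, over) can be checked without drawing anything, by pushing a few values through the norm and the colormap by hand. This is a small sketch with made-up values and limits; blue and red are arbitrary choices used only so the under and over colours are easy to tell apart.

```python
import copy
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Made-up values: 0 cannot be log-scaled, 0.5 is below vmin, 1000 is above vmax
data = np.array([[0.0, 0.5],
                 [10.0, 1000.0]])

cmap = copy.copy(plt.get_cmap("gray"))
cmap.set_bad("black")    # masked / non-positive values
cmap.set_under("blue")   # values below vmin
cmap.set_over("red")     # values above vmax

norm = LogNorm(vmin=1, vmax=100)
# Mask the non-positive entries explicitly so they are treated as 'bad'
masked = np.ma.masked_less_equal(data, 0)
rgba = cmap(norm(masked))  # RGBA colour chosen for each cell
```

Passing the same cmap and norm to imshow applies these colours in the image.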
If you're happy with the colour scaling as is, and simply want the 0 values to be black, I'd simply change the input matrix so that the 0s are replaced by the next smallest value:
import matplotlib.pyplot as plt
import matplotlib.cm, matplotlib.colors
import numpy
hist2d = numpy.arange(9).reshape(3,3)
plt.imshow(numpy.maximum(hist2d, sorted(hist2d.flat)[1]),
           interpolation='nearest',
           norm=matplotlib.colors.LogNorm(),
           cmap=matplotlib.cm.gray)
produces
Was this generated with the matplotlib hist2d function?
All you need to do is go through the matrix and add some arbitrary floor value, then make sure to plot it with fixed limits:
for f in hist2d:
    f += 1e-3
Then, when you show the figure, all of the whitespace will be at the floor value and will show up on the log-normalized plot. However, if you let hist2d pick the scaling automatically, it will use the 1e-3 floor value as the minimum. To avoid this, you need to fix the limits when calling hist2d. Current Matplotlib versions do not accept vmin and vmax alongside a norm, so set them on the norm itself:
hist2d(x, y, bins=40, norm=LogNorm(vmin=1, vmax=1e4))
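The floor-value approach can be sketched end to end as follows. The random samples are stand-ins for the real x and y, the histogram is computed with numpy.histogram2d rather than plt.hist2d so the counts can be modified before plotting, and the upper limit is taken from the data instead of the 1e4 above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Stand-in samples; replace with the real x and y
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

counts, xedges, yedges = np.histogram2d(x, y, bins=40)
floored = counts + 1e-3  # empty bins become 1e-3 instead of 0

# Fixed limits: the floor value lies below vmin, so empty bins get the lowest colour
norm = LogNorm(vmin=1, vmax=counts.max())

fig, ax = plt.subplots()
image = ax.imshow(floored.T, origin="lower", norm=norm, cmap="gray",
                  extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],
                  interpolation="nearest", aspect="auto")
fig.colorbar(image, ax=ax)

# Colours the gray colormap assigns to the empty bins
empty_rgba = plt.get_cmap("gray")(norm(floored))[counts == 0]
```

With the gray colormap the lowest colour is black, so the previously white patches are drawn black.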