Matplotlib: Discrete colorbar fails for custom labels

I ran into a problem when adding a colorbar to a scatter plot to show which class each sample belongs to. The code works perfectly when the classes are [0,1,2], but when the classes are, for example, [4,5,6], the colorbar only takes colors from the end of the colormap and looks like a solid blue color. I'm probably missing something obvious, but I can't figure out what it is.
Here is example code that reproduces the problem:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1 , figsize=(6, 6))
plt.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
plt.setp(ax, xticks=[], yticks=[])
cbar = plt.colorbar(boundaries=np.arange(len(classes)+1)-0.5)
cbar.set_ticks(np.arange(len(classes)))
cbar.set_ticklabels(classes)
plt.show()
The variables can be, for example:
datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])
The correct result is obtained when
labels = np.array([0,1,2,0,1,2,0])
In my case, I want it to work for classes [4,5,6] as well.

The boundaries need to be in data units. That is, if your classes are 4, 5, 6, you probably want to use boundaries of 3.5, 4.5, 5.5, 6.5.
import matplotlib.pyplot as plt
import numpy as np
datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])
fig, ax = plt.subplots(1 , figsize=(6, 6))
sc = ax.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
ax.set(xticks=[], yticks=[])
cbar = plt.colorbar(sc, ticks=classes, boundaries=np.arange(4,8)-0.5)
plt.show()
If you want the boundaries to be determined automatically from the classes, some assumption must be made. E.g. if all classes are consecutive integers,
boundaries=np.arange(classes.min(), classes.max()+2)-0.5
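As a quick check of that expression, the following sketch computes the boundaries for the classes from the question. The helper name integer_class_boundaries is made up for illustration, and it only makes sense under the assumption stated above (consecutive integer classes).

```python
import numpy as np

def integer_class_boundaries(classes):
    """Bin edges that put each class value at the centre of its own bin.

    Hypothetical helper; assumes the classes are consecutive integers.
    """
    classes = np.asarray(classes)
    return np.arange(classes.min(), classes.max() + 2) - 0.5

boundaries = integer_class_boundaries([4, 5, 6])
print(boundaries)  # [3.5 4.5 5.5 6.5]
```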
In general, an alternative would be to use a BoundaryNorm, as shown e.g. in "Create a discrete colorbar in matplotlib", "How to specify different color for a specific year value range in a single figure? (Python)" or "python colormap quantisation (matplotlib)".
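For the data in this question, the BoundaryNorm route might look roughly like the sketch below. It is one possible arrangement, not the code from the linked answers: resampling 'jet' to one color per class and the choice of the Agg backend are assumptions made here so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm

datapoints = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 7]])
labels = np.array([4, 5, 6, 4, 5, 6, 4])
classes = np.array([4, 5, 6])

# One bin per class, with edges halfway between consecutive integers.
boundaries = np.arange(classes.min(), classes.max() + 2) - 0.5
# Resample 'jet' so that it holds exactly one color per class.
cmap = plt.get_cmap("jet", len(classes))
# The norm maps each data value to the index of the bin it falls into.
norm = BoundaryNorm(boundaries, cmap.N)

fig, ax = plt.subplots(figsize=(6, 6))
sc = ax.scatter(datapoints[:, 0], datapoints[:, 1], s=20, c=labels,
                cmap=cmap, norm=norm)
ax.set(xticks=[], yticks=[])
cbar = fig.colorbar(sc, ticks=classes)
# In an interactive session, finish with plt.show().
```

Because the norm carries the bin edges, the colorbar is drawn in discrete steps without passing boundaries to colorbar itself.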

Related

changing the size of subplots with matplotlib

I am trying to plot multiple RGB images with matplotlib.
The code I am using is:
import numpy as np
import matplotlib.pyplot as plt
images = 4  # number of images (not given in the question; value assumed)
for i in range(0, images):
    test = np.random.rand(1080, 720,3)
    plt.subplot(images,2,i+1)
    plt.imshow(test, interpolation='none')
The subplots appear tiny, though, like thumbnails.
How can I make them bigger?
I have seen solutions using the
fig, ax = plt.subplots()
syntax before, but not with plt.subplot.
plt.subplots initiates a subplot grid, while plt.subplot adds a single subplot. So the difference is whether you want to set up your whole plot right away or fill it over time. Since it seems that you know beforehand how many images to plot, I would also recommend going with subplots.
Also notice that plt.subplot(images,2,i+1) defines a grid of images rows by 2 columns, of which you only fill the first half. The remaining cells stay empty, which is another reason the subplots are so small.
import numpy as np
import matplotlib.pyplot as plt
images = 4
fig, axes = plt.subplots(images, 1,  # Puts subplots in the axes variable
                         figsize=(4, 10),  # Use figsize to set the size of the whole plot
                         dpi=200,  # Further refine size with dpi setting
                         tight_layout=True)  # Makes enough room between plots for labels
for i, ax in enumerate(axes):
    y = np.random.randn(512, 512)
    ax.imshow(y)
    ax.set_title(str(i), fontweight='bold')
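If you would rather stay with plt.subplot, as the question asks, a minimal sketch could look like this. It assumes a single column and a figure size of 4 by 10 inches (both chosen here for illustration), and it uses small random images to keep the example quick.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

images = 4  # assumed number of images
# The figure size limits how large each subplot can become.
fig = plt.figure(figsize=(4, 10))
for i in range(images):
    test = np.random.rand(108, 72, 3)  # small random RGB image, for illustration only
    # A grid of images rows by 1 column, so no cell of the grid stays empty.
    ax = plt.subplot(images, 1, i + 1)
    ax.imshow(test, interpolation='none')
fig.tight_layout()
```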

Stylizing only some boxes with boxplots in matplotlib

How would you go about changing the style of only some boxes in a matplotlib boxplot? Below, you can see an example of styling, but I would like the style to only apply to one of the boxes.
The same question has been asked for seaborn boxplots already. For matplotlib boxplots this is even easier, since the boxplot directly returns a dictionary of the involved artists, see boxplot documentation.
This means that if bplot = ax.boxplot(..) is your boxplot, you may access the boxes via bplot['boxes'], select one of them and set its linestyle as desired. E.g.
bplot['boxes'][2].set_linestyle("-.")
Modifying the boxplot_color example
import matplotlib.pyplot as plt
import numpy as np
# Random test data
np.random.seed(19680801)
all_data = [np.random.normal(0, std, size=100) for std in range(1, 4)]
labels = ['x1', 'x2', 'x3']
fig, ax = plt.subplots()
# Box plot with filled boxes (patch_artist=True)
# Note: Matplotlib 3.9 renamed the labels parameter to tick_labels
bplot = ax.boxplot(all_data, vert=True, patch_artist=True, labels=labels)
# Loop through boxes and colorize them individually
colors = ['pink', 'lightblue', 'lightgreen']
for patch, color in zip(bplot['boxes'], colors):
    patch.set_facecolor(color)
# Give the third box a dash-dot outline
bplot['boxes'][2].set_linestyle("-.")
plt.show()
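The same dictionary also gives access to the other artists of a box. The sketch below restyles the median, whiskers and caps of one box only; the particular style choices are arbitrary and only meant as an illustration. It relies on the fact that the dictionary holds two whiskers and two caps per box, stored consecutively.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(19680801)
all_data = [rng.normal(0, std, size=100) for std in range(1, 4)]

fig, ax = plt.subplots()
bplot = ax.boxplot(all_data, patch_artist=True)

i = 2  # index of the box to restyle (the third box)
bplot['boxes'][i].set_linestyle("-.")
bplot['medians'][i].set_linewidth(3)
# Whiskers and caps come in pairs: entries 2*i and 2*i + 1 belong to box i.
for line in bplot['whiskers'][2 * i:2 * i + 2] + bplot['caps'][2 * i:2 * i + 2]:
    line.set_linestyle("-.")
```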

Customize the axis label in seaborn jointplot

I seem to be stuck on a relatively simple problem, and I couldn't fix it after searching for the last hour and a lot of experimenting.
I have two numpy arrays x and y and I am using seaborn's jointplot to plot them:
sns.jointplot(x=x, y=y)
Now I want to label the x-axis and y-axis as "X-axis label" and "Y-axis label" respectively. If I use plt.xlabel, the label goes to the marginal distribution. How can I make the labels appear on the joint axes?
sns.jointplot returns a JointGrid object, which gives you access to the matplotlib axes and you can then manipulate from there.
import seaborn as sns
import numpy as np
# example data
X = np.random.randn(1000,)
Y = 0.2 * np.random.randn(1000) + 0.5
h = sns.jointplot(x=X, y=Y)  # seaborn 0.12+ requires x and y as keyword arguments
# JointGrid has a convenience function
h.set_axis_labels('x', 'y', fontsize=16)
# or set labels via the axes objects
h.ax_joint.set_xlabel('new x label', fontweight='bold')
# also possible to manipulate the histogram plots this way, e.g.
h.ax_marg_y.grid(True) # with ugly consequences...
# labels appear outside of plot area, so auto-adjust
h.figure.tight_layout()
(The problem with your attempt is that functions such as plt.xlabel("text") operate on the current axis, which is not the central one in sns.jointplot; but the object-oriented interface is more specific as to what it will operate on).
Note that the last command uses the figure attribute of the JointGrid. The initial version of this answer used the simpler - but not object-oriented - approach via the matplotlib.pyplot interface.
To use the pyplot interface:
import matplotlib.pyplot as plt
plt.tight_layout()
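The remark about plt.xlabel acting on the current axes can be checked without seaborn. In the sketch below, two plain matplotlib axes stand in for the joint and marginal axes; the names ax_joint and ax_marg and the axes positions are chosen here purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure()
ax_joint = fig.add_axes([0.1, 0.1, 0.6, 0.6])   # stands in for the central axes
ax_marg = fig.add_axes([0.75, 0.1, 0.2, 0.6])   # stands in for a marginal axes, created last

plt.xlabel("pyplot label")          # goes to the current axes, i.e. the one created last
ax_joint.set_xlabel("joint label")  # goes to exactly the axes it is called on

print(ax_marg.get_xlabel())   # pyplot label
print(ax_joint.get_xlabel())  # joint label
```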
Alternatively, you can specify the axes labels in a pandas DataFrame in the call to jointplot.
import pandas as pd
import seaborn as sns
x = ...
y = ...
data = pd.DataFrame({
    'X-axis label': x,
    'Y-axis label': y,
})
sns.jointplot(x='X-axis label', y='Y-axis label', data=data)

Correct legend color for intersecting transparent layers in Matplotlib

I often need to indicate the distribution of some data in a concise plot, as in the below figure. I do this by plotting several fill_between areas, limited by the quantiles of the distribution.
ax.fill_between(x, quantile1, quantile2, alpha=0.2)
In a for loop, I make plots like this by calculating quantiles 1 and 2 (as indicated by the legend) as the 0% to 100% quantiles, then 10% to 90% and so on, each fill_between plotting on top of the previous "layer".
Here is the output with three layers of transparent colors along with the median line (0.5):
However, the legend colors are not what I would like them to be, since they (naturally) use the color of each individual layer, not taking into account the combined effect of several layers.
ax.legend([0.5]+[['0.0%', '100.0%'], ['10.0%', '90.0%'], ['30.0%', '70.0%']])
What is the best way to overwrite the face color value within the legend command?
I would like to avoid doing this by first plotting 0% to 10% with transparency "0.2", then 10% to 30% with transparency "0.4" and so on, as this will take twice the amount of time to compute and will make the code more complicated.
You can use proxy artists to place in the legend which have the exact same transparency as the resulting overlay from the plot.
As a proxy artist you can use a simple rectangle. The transparency, however, needs to be calculated, since two overlapping objects with alpha 0.2 appear as a single object with alpha 0.36 (and not 0.4!), because 1 - (1 - 0.2)^2 = 0.36.
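The 0.36 figure can be reproduced with a few lines of plain Python. This is a sketch under the assumption that all layers have the same color and are blended with ordinary alpha compositing; the function names are made up for illustration. The closed form and the layer-by-layer recurrence give the same values.

```python
def stacked_alpha(n, base=0.2):
    """Effective alpha of n overlapping layers that each have alpha `base`."""
    return 1 - (1 - base) ** n

def stacked_alpha_recursive(n, base=0.2):
    """Same value, built up one layer at a time."""
    a = 0.0
    for _ in range(n):
        a = a + base - a * base  # put one more layer on top
    return a

for n in (1, 2, 3):
    print(n, round(stacked_alpha(n), 3))
# 1 0.2
# 2 0.36
# 3 0.488
```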
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches
a = np.sort(np.random.rand(6,18), axis=0)
x = np.arange(len(a[0]))
def alpha(i, base=0.2):
    # Effective alpha after stacking i+1 layers that each have alpha `base`
    l = lambda x: x+base-x*base
    ar = [l(0)]
    for j in range(i):
        ar.append(l(ar[-1]))
    return ar[-1]
fig, ax = plt.subplots(figsize=(4,2))
handles = []
labels=[]
for i in range(len(a)//2):  # integer division, so that range() gets an int in Python 3
    ax.fill_between(x, a[i, :], a[len(a)-1-i, :], color="blue", alpha=0.2)
    handle = matplotlib.patches.Rectangle((0,0),1,1,color="blue", alpha=alpha(i, base=0.2))
    handles.append(handle)
    label = "quant {:.1f} to {:.1f}".format(float(i)/len(a)*100, 100-float(i)/len(a)*100)
    labels.append(label)
plt.legend(handles=handles, labels=labels, framealpha=1)
plt.show()
One has to decide if this is really worth the effort. A solution without transparency but with the very same result can be achieved with much shorter code:
import matplotlib.pyplot as plt
import numpy as np
a = np.sort(np.random.rand(6,18), axis=0)
x = np.arange(len(a[0]))
fig, ax = plt.subplots(figsize=(4,2))
for i in range(len(a)//2):  # integer division, so that range() gets an int in Python 3
    label = "quant {:.1f} to {:.1f}".format(float(i)/len(a)*100, 100-float(i)/len(a)*100)
    c = plt.cm.Blues(0.2+.6*(float(i)/len(a)*2) )
    ax.fill_between(x, a[i, :], a[len(a)-1-i, :], color=c, label=label)
plt.legend( framealpha=1)
plt.show()

heatmap for positive and negative values

I am trying to make a filled contour for a dataset. It should be fairly straightforward:
plt.contourf(x, y, z, label = 'blah', cmap = matplotlib.cm.RdBu)
However, what do I do if my dataset is not symmetric about 0? Let's say I want to go from blue (negative values) to 0 (white), to red (positive values). If my dataset goes from -8 to 3, then the white part of the color bar, which should be at 0, is in fact slightly negative. Is there some way to shift the color bar?
First off, there's more than one way to do this:
1. Pass an instance of TwoSlopeNorm (called DivergingNorm before Matplotlib 3.2) as the norm kwarg.
2. Use the colors kwarg to contourf and manually specify the colors.
3. Use a discrete colormap constructed with matplotlib.colors.from_levels_and_colors.
The simplest way is the first option. It is also the only option that allows you to use a continuous colormap.
The reason to use the first or third options is that they will work for any type of matplotlib plot that uses a colormap (e.g. imshow, scatter, etc).
The third option constructs a discrete colormap and normalization object from specific colors. It's basically identical to the second option, but it will a) work with other types of plots than contour plots, and b) avoids having to manually specify the number of contours.
As an example of the first option (I'll use imshow here because it makes more sense than contourf for random data, but contourf would have identical usage other than the interpolation option.):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm  # named DivergingNorm before Matplotlib 3.2
data = np.random.random((10,10))
data = 10 * (data - 0.8)
fig, ax = plt.subplots()
im = ax.imshow(data, norm=TwoSlopeNorm(vcenter=0), cmap=plt.cm.seismic, interpolation='none')
fig.colorbar(im)
plt.show()
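To see what the norm in the first option does, it can be called directly on a few values. The sketch below uses TwoSlopeNorm, the name of this class in Matplotlib 3.2 and later, with vmin and vmax set to the -8 to 3 range from the question (illustrative values). Values below the center are scaled linearly into [0, 0.5] and values above it into [0.5, 1], so 0 always lands on the middle of the colormap.

```python
from matplotlib.colors import TwoSlopeNorm

# vmin and vmax are illustrative values matching the -8 to 3 range in the question.
norm = TwoSlopeNorm(vcenter=0, vmin=-8, vmax=3)

for value in (-8, -4, 0, 1.5, 3):
    print(value, float(norm(value)))
# -8 0.0
# -4 0.25
# 0 0.5
# 1.5 0.75
# 3 1.0
```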
As an example of the third option (notice that this gives a discrete colormap instead of a continuous colormap):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import from_levels_and_colors
data = np.random.random((10,10))
data = 10 * (data - 0.8)
num_levels = 20
vmin, vmax = data.min(), data.max()
midpoint = 0
levels = np.linspace(vmin, vmax, num_levels)
midp = np.mean(np.c_[levels[:-1], levels[1:]], axis=1)
vals = np.interp(midp, [vmin, midpoint, vmax], [0, 0.5, 1])
colors = plt.cm.seismic(vals)
cmap, norm = from_levels_and_colors(levels, colors)
fig, ax = plt.subplots()
im = ax.imshow(data, cmap=cmap, norm=norm, interpolation='none')
fig.colorbar(im)
plt.show()
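The second option (the colors kwarg of contourf) is not shown above. A rough sketch could look like the following; the data, the number of levels and the use of the RdBu_r colormap (blue for negative, red for positive values) are assumptions made here for illustration. As noted earlier, this approach is specific to contour plots and needs exactly one color per filled band.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data spanning -8 to 3, as in the question.
x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 40)
X, Y = np.meshgrid(x, y)
z = -8 + 11 * X * Y

levels = np.linspace(-8, 3, 12)           # contour levels -8, -7, ..., 3
mid = 0.5 * (levels[:-1] + levels[1:])    # midpoint of each filled band
# Map the band midpoints so that 0 lands in the middle of the colormap.
vals = np.interp(mid, [levels[0], 0, levels[-1]], [0, 0.5, 1])
colors = [tuple(c) for c in plt.cm.RdBu_r(vals)]  # one color per band

fig, ax = plt.subplots()
cs = ax.contourf(x, y, z, levels=levels, colors=colors)
fig.colorbar(cs)
# In an interactive session, finish with plt.show().
```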