heatmap for positive and negative values [duplicate] - matplotlib

I am trying to make a filled contour for a dataset. It should be fairly straightforward:
plt.contourf(x, y, z, label = 'blah', cmap = matplotlib.cm.RdBu)
However, what do I do if my dataset is not symmetric about 0? Let's say I want to go from blue (negative values) to 0 (white), to red (positive values). If my dataset goes from -8 to 3, then the white part of the color bar, which should be at 0, is in fact slightly negative. Is there some way to shift the color bar?

First off, there's more than one way to do this.
1. Pass an instance of DivergingNorm (renamed TwoSlopeNorm in Matplotlib 3.2) as the norm kwarg.
2. Use the colors kwarg to contourf and manually specify the colors.
3. Use a discrete colormap constructed with matplotlib.colors.from_levels_and_colors.
The simplest way is the first option. It is also the only option that allows you to use a continuous colormap.
The reason to use the first or third options is that they will work for any type of matplotlib plot that uses a colormap (e.g. imshow, scatter, etc).
The third option constructs a discrete colormap and normalization object from specific colors. It's basically identical to the second option, but it a) works with plot types other than contour plots, and b) avoids having to manually specify the number of contours.
As an example of the first option (I'll use imshow here because it makes more sense than contourf for random data, but contourf would have identical usage other than the interpolation option.):
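import numpy as np
import matplotlib.pyplot as plt
# DivergingNorm was renamed TwoSlopeNorm in Matplotlib 3.2; the usage is the same
from matplotlib.colors import TwoSlopeNorm
# Random data spanning roughly -8 to 2, i.e. not symmetric about 0
data = np.random.random((10,10))
data = 10 * (data - 0.8)
fig, ax = plt.subplots()
# The norm pins 0 to the center (white) of the colormap
im = ax.imshow(data, norm=TwoSlopeNorm(0), cmap=plt.cm.seismic, interpolation='none')
fig.colorbar(im)
plt.show()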
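Since the question is about contourf, here is a minimal sketch of the same idea with a filled contour plot. It assumes Matplotlib 3.2 or later, where the norm is named TwoSlopeNorm. The synthetic field z, the level spacing, and the choice of RdBu_r (the reversed map, so that negative values come out blue and positive ones red) are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

# Hypothetical smooth field spanning roughly -8 to 3, as in the question.
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
z = 3 - 11 * np.exp(-(X**2 + Y**2) / 4)

# Pin 0 to the middle of the colormap; -8 and 3 map to its two ends.
norm = TwoSlopeNorm(vcenter=0, vmin=-8, vmax=3)
levels = np.linspace(-8, 3, 23)  # step of 0.5, so 0 is one of the levels

fig, ax = plt.subplots()
cs = ax.contourf(X, Y, z, levels=levels, cmap="RdBu_r", norm=norm)
fig.colorbar(cs)
```

Call plt.show() or fig.savefig(...) afterwards to view the result. The white band should sit at 0 even though the data range is lopsided.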
As an example of the third option (notice that this gives a discrete colormap instead of a continuous colormap):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import from_levels_and_colors
data = np.random.random((10,10))
data = 10 * (data - 0.8)
num_levels = 20
vmin, vmax = data.min(), data.max()
midpoint = 0
levels = np.linspace(vmin, vmax, num_levels)
midp = np.mean(np.c_[levels[:-1], levels[1:]], axis=1)
vals = np.interp(midp, [vmin, midpoint, vmax], [0, 0.5, 1])
colors = plt.cm.seismic(vals)
cmap, norm = from_levels_and_colors(levels, colors)
fig, ax = plt.subplots()
im = ax.imshow(data, cmap=cmap, norm=norm, interpolation='none')
fig.colorbar(im)
plt.show()
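For completeness, the second option (the colors kwarg to contourf) could look roughly like the sketch below. It reuses the same midpoint interpolation as the third option. The synthetic field z, the number of levels, and the RdBu_r colormap are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical smooth field spanning roughly -8 to 3.
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
z = 3 - 11 * np.exp(-(X**2 + Y**2) / 4)

vmin, midpoint, vmax = -8, 0, 3
levels = np.linspace(vmin, vmax, 12)             # 12 edges -> 11 filled bands
band_centers = 0.5 * (levels[:-1] + levels[1:])  # one value per band

# Map band centers to colormap positions so that `midpoint` lands on 0.5 (white).
positions = np.interp(band_centers, [vmin, midpoint, vmax], [0, 0.5, 1])
colors = [tuple(rgba) for rgba in plt.cm.RdBu_r(positions)]

fig, ax = plt.subplots()
# One explicit color per band; this only works for contour-type plots.
cs = ax.contourf(X, Y, z, levels=levels, colors=colors)
fig.colorbar(cs)
```

Note that the number of colors has to match the number of bands (len(levels) - 1), which is the manual bookkeeping the third option avoids.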

Ticks position in heatmap with categorical data (seaborn)

I am trying to plot a confusion matrix of my predictions. My data is multi-class (13 different labels) so I'm using a heatmap.
As you can see below, my heat map looks generally okay but the labels are a bit out of position: y ticks should be a little lower and x ticks should be a bit more to the right. I want to move both axis ticks a bit so they are aligned with the center of each square.
my code:
sns.set()
my_mask = np.zeros((con_matrix.shape[0], con_matrix.shape[0]), dtype=int)
for i in range(con_matrix.shape[0]):
    for j in range(con_matrix.shape[0]):
        my_mask[i][j] = con_matrix[i][j] == 0
fig_dims = (10, 10)
plt.subplots(figsize=fig_dims)
ax = sns.heatmap(con_matrix, annot=True, fmt="d", linewidths=.5, cmap="Pastel1", cbar=False, mask=my_mask, vmax=15)
plt.xticks(range(len(party_names)), party_names, rotation=45)
plt.yticks(range(len(party_names)), party_names, rotation='horizontal')
plt.show()
and for reproduction purpose, here are con_matrix and party_names hard-coded:
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
con_matrix = np.array([[55, 0, 0, 0,0, 0, 0,0,0,0,0,0,2], [0,199,0,0,0,0,0,0,0,0,2,0,1],
[0, 0,52,0,0,0,0,0,0,0,0,0,1],
[0,0,0,39,0,0,0,0,0,0,0,0,0],
[0,0,0,0,90,0,0,0,0,0,0,4,3],
[0,0,0,1,0,35,0,0,0,0,0,0,0],
[0,0,0,0,5,0,26,0,0,1,0,1,0],
[0,5,0,0,0,1,0,44,0,0,3,0,1],
[0,1,0,0,0,0,0,0,52,0,0,0,0],
[0,1,0,0,2,0,0,0,0,235,0,1,1],
[1,2,0,0,0,0,0,3,0,0,34,0,3],
[0,0,0,0,5,0,0,0,0,1,0,40,0],
[0,0,0,0,0,0,0,0,0,1,0,0,46]])
party_names = ['Blues', 'Browns', 'Greens', 'Greys', 'Khakis', 'Oranges', 'Pinks', 'Purples', 'Reds', 'Turquoises', 'Violets', 'Whites', 'Yellows']
I already tried working with the position argument of different axes, but it did not turn out well. I could not find an exact answer on this site either (at least not a solution that works for categorical data).
I'm new to visualization with seaborn, so any improvement with explanations would be appreciated (not only for my problem but for my code and visualization as well).
The heatmap draws cell i from i to i+1 along each axis, so the center of each square is at i + 0.5. You can therefore shift both sets of tick positions by 0.5 to get the desired alignment. I have used NumPy's arange, which lets you add 0.5 to the whole array at once.
plt.xticks(np.arange(len(party_names))+0.5, party_names, rotation=45)
plt.yticks(np.arange(len(party_names))+0.5, party_names, rotation='horizontal')
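To see why 0.5 is the right offset, here is a small matplotlib-only sketch. As far as I know, seaborn's heatmap is drawn with pcolormesh, so the cell geometry is the same. The 3x3 matrix and the shortened name list are made-up stand-ins for the data in the question, and the vectorized mask is a suggested replacement for the nested loop.

```python
import numpy as np
import matplotlib.pyplot as plt

# Small stand-in for the confusion matrix (hypothetical values).
con_matrix = np.array([[5, 0, 1],
                       [0, 7, 0],
                       [2, 0, 4]])
names = ["Blues", "Browns", "Greens"]

# Vectorized equivalent of the nested-loop mask: True where the count is zero.
mask = con_matrix == 0

fig, ax = plt.subplots()
# pcolormesh draws cell (row i, column j) over [j, j+1] x [i, i+1].
ax.pcolormesh(np.ma.masked_array(con_matrix, mask=mask), cmap="Pastel1")
ax.invert_yaxis()  # put row 0 at the top, as a heatmap usually does

centers = np.arange(len(names)) + 0.5  # cell centers along each axis
ax.set_xticks(centers)
ax.set_xticklabels(names, rotation=45)
ax.set_yticks(centers)
ax.set_yticklabels(names, rotation="horizontal")
```

With seaborn itself, passing xticklabels=party_names and yticklabels=party_names to sns.heatmap should also place the labels at the cell centers without any manual offset.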

Add a secondary label to a plot x-axis for events

I have an ax.stackplot showing population of different groups over time. The x-axis is time and the y-axis is population. I am showing time at major labels 1 year and minor labels 1 month, however, changes in the data occur more frequently at "events". I'd like to show labels for these events along the x-axis, kind of how I have it sketched out in the image here:
I've attempted adding a second axis with plt.axes(), but this second axis is overwriting the ticks of my first axis for some reason. Does anyone have any suggestions for how to accomplish this?
Thank you!
If you don't have too many points, I think the best way to do this is adding text to your axes using ax.text:
from matplotlib import pyplot
import matplotlib
import numpy as np
# Random plot
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)
fig, ax = pyplot.subplots()
ax.plot(t, s)
# ax.text(x, y, text, rotation)
ax.text(0, -0.35, "Event 1", rotation=90) # rotation=90 is easier to read, for me
ax.text(0.5, -0.35, "Event 2", rotation=-90) # opposite rotation
ax.text(0.75, -0.35, "Event 3", rotation=-90)
# This gives some space at the bottom of the figure
# so that the text is visible
fig.subplots_adjust(bottom=0.2)
pyplot.show()
Check the Axes.text documentation for more info.
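One drawback of a hard-coded y such as -0.35 is that it is in data units, so the labels move if the y-limits change. A possible variation, sketched below, uses the blended transform returned by ax.get_xaxis_transform(), where x is in data coordinates and y is a fraction of the axes height. The events dictionary and the -0.08 offset are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

# Hypothetical event positions (in x data units) and names.
events = {0.0: "Event 1", 0.5: "Event 2", 0.75: "Event 3"}

fig, ax = plt.subplots()
ax.plot(t, s)

# Blended transform: x in data coordinates, y in axes fractions (0 = bottom edge).
trans = ax.get_xaxis_transform()
for x_pos, name in events.items():
    ax.text(x_pos, -0.08, name, transform=trans,
            rotation=90, ha="center", va="top")

# Leave room below the axes so the rotated labels are not cut off.
fig.subplots_adjust(bottom=0.25)
```

Because y is given relative to the axes, the labels stay just below the x-axis regardless of the data range. Depending on the tick label size, the offset may need adjusting to avoid overlap.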
Thank you for the responses, I was able to come up with a solution based on your suggestions. The solution involves using ax.twiny() to create a second axes object, and then specifying the second x-axis data points and labels. Below is a simple example for those interested:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
# Create some meaningless data for testing.
x = np.arange(0, 10)
y = np.full(10, len(x))
# Set up figure and set axes parameters.
fig = plt.figure(num=None, figsize=(10, 8), dpi=80, facecolor='w', edgecolor='k')
ax = plt.axes()
ax.xaxis.set_minor_locator(ticker.FixedLocator([1, 3, 5, 7, 9]))
# Get a second axes (for secondary labels) and set parameters.
axl = ax.twiny()
axl.tick_params(axis='x', bottom=True, labelbottom=True, labeltop=False, top=False, length=15, colors=[.5,.5,.5])
# Plot data on primary axes
ax.bar(x, y)
interval = ax.xaxis.get_view_interval()
# Set label properties on secondary axes (for secondary labels)
axl.xaxis.set_view_interval(*interval)
# Fix the tick positions first, then attach the labels to them
axl_loc = ticker.FixedLocator([0.5, 4.75])
axl.xaxis.set_major_locator(axl_loc)
axl.xaxis.set_ticklabels(['a', 'b'])
plt.show()

Matplotlib: Discrete colorbar fails for custom labels

I ran into a problem when adding a colorbar to a scatter plot to indicate which class each sample belongs to. The code works perfectly when the classes are [0,1,2], but when the classes are, for example, [4,5,6], the colorbar picks its colors from the end of the colormap and appears as a solid blue. I'm missing something obvious but I just can't figure out what it is.
Here is the example code about the problem:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1 , figsize=(6, 6))
plt.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
plt.setp(ax, xticks=[], yticks=[])
cbar = plt.colorbar(boundaries=np.arange(len(classes)+1)-0.5)
cbar.set_ticks(np.arange(len(classes)))
cbar.set_ticklabels(classes)
plt.show()
Variables can be for example
datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])
The correct result is obtained when
labels = np.array([0,1,2,0,1,2,0])
In my case I want it to work for classes [4,5,6] as well.
The boundaries need to be in data units. That is, if your classes are 4, 5, 6, you probably want to use boundaries of 3.5, 4.5, 5.5, 6.5.
import matplotlib.pyplot as plt
import numpy as np
datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])
fig, ax = plt.subplots(1 , figsize=(6, 6))
sc = ax.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
ax.set(xticks=[], yticks=[])
cbar = plt.colorbar(sc, ticks=classes, boundaries=np.arange(4,8)-0.5)
plt.show()
If you want the boundaries to be determined automatically from the classes, some assumption must be made. E.g. if all classes are consecutive integers,
boundaries=np.arange(classes.min(), classes.max()+2)-0.5
In general, an alternative would be to use a BoundaryNorm, as shown e.g. in "Create a discrete colorbar in matplotlib", "How to specify different color for a specific year value range in a single figure? (Python)", or "python colormap quantisation (matplotlib)".
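A minimal sketch of the BoundaryNorm route, using the data from the question, might look like this. It still assumes consecutive integer classes for the bin edges. Resampling jet to one color per class with plt.get_cmap is my own choice here, not something the linked answers necessarily do.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm

datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])

# Bin edges halfway between classes; assumes consecutive integer classes.
boundaries = np.arange(classes.min(), classes.max() + 2) - 0.5
cmap = plt.get_cmap("jet", len(classes))         # resample jet to one color per class
norm = BoundaryNorm(boundaries, ncolors=cmap.N)  # data value -> color index

fig, ax = plt.subplots(figsize=(6, 6))
sc = ax.scatter(datapoints[:, 0], datapoints[:, 1], s=20, c=labels,
                cmap=cmap, norm=norm)
ax.set(xticks=[], yticks=[])
cbar = fig.colorbar(sc, ticks=classes)
```

Here the norm, rather than the colorbar, carries the class boundaries, so the scatter colors and the colorbar stay consistent for any class values.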

Correct legend color for intersecting transparent layers in Matplotlib

I often need to indicate the distribution of some data in a concise plot, as in the below figure. I do this by plotting several fill_between areas, limited by the quantiles of the distribution.
ax.fill_between(x, quantile1, quantile2, alpha=0.2)
In a for loop, I make plots like this by calculating quantiles 1 and 2 (as indicated by the legend) as the 0% to 100% quantiles, then 10% to 90% and so on, each fill_between plotting on top of the previous "layer".
Here is the output with three layers of transparent colors along with the median line (0.5):
However, the legend colors are not what I would like them to be, since they (naturally) use the color of each individual layer, not taking into account the combined effect of several layers.
ax.legend([0.5]+[['0.0%', '100.0%'], ['10.0%', '90.0%'], ['30.0%', '70.0%']])
What is the best way to overwrite the face color value within the legend command?
I would like to avoid doing this by first plotting 0% to 10% with transparency "0.2", then 10% to 30% with transparency "0.4" and so on, as this will take twice the amount of time to compute and will make the code more complicated.
You can use proxy artists to place in the legend which have the exact same transparency as the resulting overlay from the plot.
As a proxy artist you can use a simple rectangle. The transparency however needs to be calculated as two objects with transparency 0.2 together will appear as a single object with transparency 0.36 (and not 0.4!).
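The 0.36 figure follows from standard alpha compositing: each layer with opacity a lets a fraction 1 - a of whatever is behind it show through, so n stacked layers have a combined opacity of 1 - (1 - a)^n. A small sketch of that closed form (the function name stacked_alpha is made up for illustration):

```python
def stacked_alpha(n_layers, base=0.2):
    """Combined opacity of n_layers overlapping layers, each with opacity base."""
    return 1.0 - (1.0 - base) ** n_layers

# Show the combined opacity for one, two and three layers.
for n in (1, 2, 3):
    print(n, round(stacked_alpha(n), 3))
```

For base 0.2 this gives 0.2, 0.36 and 0.488 for one, two and three layers. The iterative alpha(i) helper in the code below computes the same quantity, with alpha(i) corresponding to i + 1 layers.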
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches
a = np.sort(np.random.rand(6,18), axis=0)
x = np.arange(len(a[0]))
def alpha(i, base=0.2):
    # Effective opacity of i+1 stacked layers, each with opacity `base`
    l = lambda x: x+base-x*base
    ar = [l(0)]
    for j in range(i):
        ar.append(l(ar[-1]))
    return ar[-1]
fig, ax = plt.subplots(figsize=(4,2))
handles = []
labels=[]
for i in range(len(a)//2):  # integer division: range() rejects floats in Python 3
    ax.fill_between(x, a[i, :], a[len(a)-1-i, :], color="blue", alpha=0.2)
    # Proxy artist whose opacity matches the stacked layers at this depth
    handle = matplotlib.patches.Rectangle((0,0),1,1,color="blue", alpha=alpha(i, base=0.2))
    handles.append(handle)
    label = "quant {:.1f} to {:.1f}".format(float(i)/len(a)*100, 100-float(i)/len(a)*100)
    labels.append(label)
plt.legend(handles=handles, labels=labels, framealpha=1)
plt.show()
Whether this is worth the effort is a judgment call. Much the same result can be achieved with far less code by skipping transparency and picking opaque colors from a colormap:
import matplotlib.pyplot as plt
import numpy as np
a = np.sort(np.random.rand(6,18), axis=0)
x = np.arange(len(a[0]))
fig, ax = plt.subplots(figsize=(4,2))
for i in range(len(a)//2):  # integer division: range() rejects floats in Python 3
    label = "quant {:.1f} to {:.1f}".format(float(i)/len(a)*100, 100-float(i)/len(a)*100)
    c = plt.cm.Blues(0.2+.6*(float(i)/len(a)*2) )
    ax.fill_between(x, a[i, :], a[len(a)-1-i, :], color=c, label=label)
plt.legend( framealpha=1)
plt.show()

Matplotlib plotting a single line that continuously changes color

I would like to plot a curve in the (x,y) plane, where the color of the curve depends on a value of another variable T. x is a 1D numpy array, y is a 1D numpy array.
T=np.linspace(0,1,np.size(x))**2
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x,y)
I want the line to change from blue to red (using RdBu colormap) depending on the value of T (one value of T exists for every (x,y) pair).
I found this, but I don't know how to adapt it to my simple example. How would I use the LineCollection for my example? http://matplotlib.org/examples/pylab_examples/multicolored_line.html
Thanks.
One idea is to set the color using color=(R,G,B), split your plot into n segments, and continuously vary one of R, G or B (or a combination):
import matplotlib.pyplot as plt  # pyplot is preferred over the pylab namespace
import numpy as np
# Make some data
n=1000
x=np.linspace(0,100,n)
y=np.sin(x)
# Your coloring array
T=np.linspace(0,1,np.size(x))**2
fig = plt.figure()
ax = fig.add_subplot(111)
# Segment plot and color depending on T
s = 10 # Segment length
for i in range(0,n-1,s):  # stop at n-1 so the final segment is drawn too
    ax.plot(x[i:i+s+1],y[i:i+s+1],color=(0.0,0.5,T[i]))
plt.show()
Hope this is helpful
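Since the question asks about LineCollection and the RdBu colormap specifically, here is a rough sketch of that approach. The x and y data are placeholders, and I use RdBu_r (the reversed map) on the assumption that low T should be blue and high T red. Each pair of neighboring points becomes one segment, colored by the value of T at its start.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Placeholder data; T holds one value per (x, y) pair.
x = np.linspace(0, 100, 1000)
y = np.sin(x)
T = np.linspace(0, 1, x.size) ** 2

# Build an (n-1, 2, 2) array: one [start, end] pair of (x, y) points per segment.
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

lc = LineCollection(segments, cmap="RdBu_r",
                    norm=plt.Normalize(T.min(), T.max()))
lc.set_array(T[:-1])  # color each segment by T at its first point
lc.set_linewidth(2)

fig, ax = plt.subplots()
ax.add_collection(lc)
# Set the limits explicitly rather than relying on autoscaling for the collection.
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min() - 0.1, y.max() + 0.1)
fig.colorbar(lc, ax=ax, label="T")
```

Compared with calling ax.plot once per segment, this creates a single artist and gives a colorbar for T directly.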