How to fix lines of axes overlapping imshow plot? - matplotlib

When plotting matrices using matplotlib's imshow function the lines of the axes can overlap the actual plot, see the following minimal example (matshow is just a simple wrapper around imshow):
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(3,3))
ax.matshow(np.random.random((50, 50)), interpolation="none", cmap="Blues")
plt.savefig("example.png", dpi=300)
I would expect every entry of the matrix to be represented by a square, but in the top row it is quite obvious that the axis is hiding a bit of the plot resulting in non-square entries. The same is happening for the last column. Since I want the complete matrix to be seen - every entry with the same importance - is there any way this can be fixed?

To me, this is just a visualisation issue. If I run your code and maximise the window, I do not see the overlapping you are talking about:
Otherwise, remove the spines but without hiding the ticks:
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)
EDIT
Reduce the thickness of the borders:
for spine in ax.spines.values():
    spine.set_linewidth(0.3)
The following is the exported image:
With 0.2 the exported image looks like this:
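Putting the question's example and the thinner borders together, a minimal sketch could look like the following. The Agg backend and the in-memory buffer are assumptions made so that the snippet runs without a display or a writable directory; pass a file name to savefig instead to get a PNG on disk.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(3, 3))
ax.matshow(np.random.random((50, 50)), interpolation="none", cmap="Blues")

# Thin every spine so the frame covers less of the outermost cells.
for spine in ax.spines.values():
    spine.set_linewidth(0.3)

# Save at the same resolution as in the question (buffer used instead of a file).
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=300)
```

The linewidth is given in points, so how much of the outer cells the frame still covers depends on the figure size and dpi; 0.3 is just the value tried above.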

Related

Problem with text and annotation x and y coordinates changing while looping through subplots matplotlib

I would like to iterate through subplots, plot data, and annotate the subplots with either the text function or the annotation function in matplotlib. Both functions ask for x and y coordinates in order to place text or annotations. I can get this to work fine, until I plot data. Then the annotations and the text jump all over the place and I cannot figure out why.
My set up is something like this, which produces well-aligned annotations with no data:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
fig, ax=plt.subplots(nrows=3, ncols=3, sharex=True)
fig.suptitle('Axes ylim unpacking error demonstration')
annotation_colors=["red", "lightblue", "tan", "purple", "lightgreen", "black", "pink", "blue", "magenta"]
for jj, ax in enumerate(ax.flat):
    bott, top = plt.ylim()
    left, right = plt.xlim()
    ax.text(left+0.1*(right-left), bott+0.1*(top-bott), 'Annotation', color=annotation_colors[jj])
plt.show()
When I add random data (or my real data), the annotations jump:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Same as above, but with 9 random data series plotted.
df_cols = ['y' + str(x) for x in range(1,10)]
df=pd.DataFrame(np.random.randint(0,10, size=(10,9)), columns=df_cols)
df['x']=range(0,10)
#Make a few columns much larger in terms of magnitude of mean values
df['y2']=df['y2']*-555
df['y5']=df['y5']*123
fig, ax=plt.subplots(nrows=3, ncols=3, sharex=True)
fig.suptitle('Axes ylim unpacking error demonstration')
annotation_colors=["red", "lightblue", "tan", "purple", "lightgreen", "black", "pink", "blue", "magenta"]
for jj, ax in enumerate(ax.flat):
    ax.plot(df['x'], df['y'+str(jj+1)], color=annotation_colors[jj])
    bott, top = plt.ylim()
    left, right = plt.xlim()
    ax.text(left+0.1*(right-left), bott+0.1*(top-bott), 'Annotation', color=annotation_colors[jj])
plt.show()
This is just to demonstrate the issue, which is likely caused by my lack of understanding of how the ax and fig calls work. It seems to me that the x and y coordinates of the ax.text call may actually apply to the coordinates of the fig, or something similar. The end result is far worse with my actual data! In that case, some of the annotations end up miles above the actual plots and not even within the coordinates of any of the subplot axes. Others completely overlap! What am I misunderstanding?
Edit for more details:
I have tried Stef's solution of using axes coordinates of axes.text(0.1, 0.1, 'Annotation'...)
I get the following plot, which still shows the same problem of moving the text all over the place. Because I am running this example with random numbers, the annotations are moving randomly with every run - i.e. they are not just displaced in the subplots with different axis ranges (y2 and y5).
You can specify the text location in axes coordinates (as opposed to data coordinates, as you did implicitly):
ax.text(.1, .1, 'Annotation', color=annotation_colors[jj], transform=ax.transAxes)
See the Transformations Tutorial for further information.
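As a hedged illustration of the point above: plt.ylim() and plt.xlim() report the limits of pyplot's current axes, which is not necessarily the ax of the loop, and that is a likely reason the data-coordinate positions land in odd places. The sketch below places one label in axes coordinates and one in data coordinates taken from each axes' own limits. The random data, the Agg backend and the name axes for the array of subplots are assumptions for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axes = plt.subplots(nrows=3, ncols=3, sharex=True)

labels = []
for jj, ax in enumerate(axes.flat):
    # Give every subplot a different y range.
    ax.plot(np.arange(10), rng.integers(0, 10, size=10) * (jj + 1))

    # Option 1: axes coordinates, (0, 0) is the lower left corner of this
    # subplot and (1, 1) the upper right, regardless of the data range.
    labels.append(ax.text(0.1, 0.1, "Annotation", transform=ax.transAxes))

    # Option 2: data coordinates, but computed from this axes' own limits
    # rather than from plt.xlim() / plt.ylim().
    left, right = ax.get_xlim()
    bott, top = ax.get_ylim()
    ax.text(left + 0.1 * (right - left), bott + 0.8 * (top - bott), "Data coords")
```

Option 1 stays put even if the limits change later; option 2 is only correct for the limits at the time of the call.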

layout problem of multiple heatmaps in one figure with matplotlib

I put multiple heatmaps in one figure with matplotlib, but I cannot get the layout right. Here is my code.
import matplotlib; matplotlib.use('agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.random.rand(6,240,240)
y = np.random.rand(6,240,240)
t = np.random.rand(6,240,240)
plt.subplots_adjust(wspace=0.2, hspace=0.3)
c=1
for i in range(6):
    ax=plt.subplot(6,3,c)
    plt.imshow(x[i])
    ax.set_title("x"+str(i))
    c+=1
    ax=plt.subplot(6,3,c)
    plt.imshow(y[i])
    ax.set_title("y"+str(i))
    c+=1
    ax=plt.subplot(6,3,c)
    plt.imshow(t[i])
    ax.set_title("t"+str(i))
    c+=1
plt.tight_layout()
plt.savefig("test.png")
test.png looks like this.
I want to
make each heatmap bigger
reduce the margin between the heatmaps in each row.
I tried to adjust this with "subplots_adjust", but it had no effect.
Additional information
According to ImportanceOfBeingErnest's comment, I removed tight_layout(). It generated this.
This makes each heatmap bigger, but the titles overlap the subplots. I would still like to make each heatmap bigger and to reduce the margin between the heatmaps in each row.
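One possible direction, offered as a sketch and not as a confirmed fix: imshow keeps its pixels square by default, so the horizontal gaps depend largely on the shape of the figure. With 3 square images across and 6 down, a figure roughly half as wide as it is tall leaves little spare width. The figsize, the spacing values, the font size and the removal of ticks below are all assumptions to tune.

```python
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

x = np.random.rand(6, 240, 240)
y = np.random.rand(6, 240, 240)
t = np.random.rand(6, 240, 240)

# 3 square images across, 6 down: make the figure about half as wide as tall.
fig, axes = plt.subplots(nrows=6, ncols=3, figsize=(6, 13))

for i in range(6):
    for ax, (name, data) in zip(axes[i], (("x", x), ("y", y), ("t", t))):
        ax.imshow(data[i])
        ax.set_title(name + str(i), fontsize=8)
        # Dropping the ticks frees room that tick labels would otherwise need.
        ax.set_xticks([])
        ax.set_yticks([])

# Small outer margins, narrow gaps between columns, room for titles between rows.
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.01, top=0.97,
                    wspace=0.05, hspace=0.25)

buf = io.BytesIO()  # pass "test.png" instead to write the file from the question
fig.savefig(buf, format="png")
```

Calling subplots_adjust on the figure after the axes exist matters here; in the question it is called before any subplot has been created.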

How do I use colourmaps with variable alpha in a Seaborn kdeplot without seeing the contour lines?

Python version: 3.6.4 (Anaconda on Windows)
Seaborn: 0.8.1
Matplotlib: 2.1.2
I'm trying to create a 2D Kernel Density plot using Seaborn but I want each step in the colourmap to have a different alpha value. I had a look at this question to create a matplotlib colourmap with alpha values: Add alpha to an existing matplotlib colormap.
I have a problem in that the lines between contours are visible. The result I get is here:
I thought that I had found the answer when I found this question: Hide contour linestroke on pyplot.contourf to get only fills. I tried the method outlined in the answer (using set_edgecolor("face")) but it did not work in this case. That question also seemed to be related to vector graphics formats and I am just writing out a PNG.
Here is my script:
import numpy as np
import seaborn as sns
import matplotlib.colors as cols
import matplotlib.pyplot as plt
def alpha_cmap(cmap):
    my_cmap = cmap(np.arange(cmap.N))
    # Set a square root alpha.
    x = np.linspace(0, 1, cmap.N)
    my_cmap[:,-1] = x ** (0.5)
    my_cmap = cols.ListedColormap(my_cmap)
    return my_cmap
xs = np.random.uniform(size=100)
ys = np.random.uniform(size=100)
kplot = sns.kdeplot(data=xs, data2=ys,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=30)
plt.savefig("example_plot.png")
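To confirm that the colormap built by alpha_cmap really carries the square-root alpha ramp, independent of seaborn, a small check along these lines may help. It repeats the function from the script in a condensed form and only inspects the RGBA table; it does not address the contour lines themselves.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.colors as cols
import matplotlib.pyplot as plt
import numpy as np


def alpha_cmap(cmap):
    """Return a copy of cmap whose alpha rises as the square root of the index."""
    rgba = cmap(np.arange(cmap.N))                   # (N, 4) array of RGBA rows
    rgba[:, -1] = np.linspace(0, 1, cmap.N) ** 0.5   # overwrite the alpha column
    return cols.ListedColormap(rgba)


new_cmap = alpha_cmap(plt.cm.viridis)

# Look up every entry again and pull out the alpha channel.
alphas = new_cmap(np.arange(new_cmap.N))[:, -1]
print(alphas[0], alphas[-1])
```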
Guided by some comments on this question I have tried some other methods that have been successful when this problem has come up. Based on this question (Matplotlib Contourf Plots Unwanted Outlines when Alpha < 1) I have tried altering the plot call to:
sns.kdeplot(data=xs, data2=ys,
            cmap=alpha_cmap(plt.cm.viridis),
            shade=True,
            shade_lowest=False,
            n_levels=30,
            antialiased=True)
With antialiased=True the lines between contours are replaced by a narrow white line:
I have also tried an approach similar to this question - Pyplot pcolormesh confused when alpha not 1. This approach is based on looping over the PathCollections in kplot.collections and tuning the parameters of the edges so that they become invisible. I have tried adding this code and tweaking the linewidth -
for thing in kplot.collections:
    thing.set_edgecolor("face")
    thing.set_linewidth(0.01)
# kplot is the Axes returned by sns.kdeplot; redraw the figure it belongs to.
kplot.figure.canvas.draw()
This results in a mix of white and dark lines.
I believe that I will not be able to tune the line width to make the lines disappear because of the variable width of the contour bands.
Using both methods (antialiasing + linewidth) makes this version, which looks cool but isn't quite what I want:
I also found this question - Changing Transparency of/Remove Contour Lines in Matplotlib
This one suggests overplotting a second plot with a different number of contour levels on the same axis, like:
kplot = sns.kdeplot(data=xs, data2=ys,
                    ax=ax,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=30,
                    antialiased=True)
kplot = sns.kdeplot(data=xs, data2=ys,
                    ax=ax,
                    cmap=alpha_cmap(plt.cm.viridis),
                    shade=True,
                    shade_lowest=False,
                    n_levels=35,
                    antialiased=True)
This results in:
This is better, and almost works. The problem here is I need variable (and non-linear) alpha throughout the colourmap. The variable banding and lines seem to be a result of the combinations of alpha when contours are plotted over each other. I also still see some clear/white lines in the result.

How to overlay one pyplot figure on another

Searching easily reveals how to plot multiple charts on one figure, whether using the same plotting axes, a second y axis or subplots. Much harder to uncover is how to overlay one figure onto another, something like this:
That image was prepared using a bitmap editor to overlay the images. I have no difficulty creating the individual plots, but cannot figure out how to combine them. I expect a single line of code will suffice, but what is it? Here is how I imagine it:
bigFig = plt.figure(1, figsize=[5,25])
...
ltlFig = plt.figure(2)
...
bigFig.overlay(ltlFig, pos=[x,y], size=[1,1])
I've established that I can use figure.add_axes, but it is quite challenging getting the position of the overlaid plot correct, since the parameters are fractions, not x,y values from the first plot. It also [it seems to me - am I wrong?] places constraints on the order in which the charts are plotted, since the main plot must be completed before the other plots are added in turn.
What is the pyplot method that achieves this?
To create an inset axes you may use mpl_toolkits.axes_grid1.inset_locator.inset_axes.
Position of inset axes in axes coordinates
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
fig, ax= plt.subplots()
# Use a separate name so the imported inset_axes function is not overwritten.
axins = inset_axes(ax,
                   width=1,  # inch
                   height=1,  # inch
                   bbox_transform=ax.transAxes,  # relative axes coordinates
                   bbox_to_anchor=(0.5,0.5),  # relative axes coordinates
                   loc=3)  # loc=lower left corner
ax.axis([0,500,-.1,.1])
plt.show()
Position of inset axes in data coordinates
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
fig, ax= plt.subplots()
# Use a separate name so the imported inset_axes function is not overwritten.
axins = inset_axes(ax,
                   width=1,  # inch
                   height=1,  # inch
                   bbox_transform=ax.transData,  # data coordinates
                   bbox_to_anchor=(250,0.0),  # data coordinates
                   loc=3)  # loc=lower left corner
ax.axis([0,500,-.1,.1])
plt.show()
Both of the above produce the same plot
(For a possible drawback of this solution see specific location for inset axes)
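As an alternative sketch, and a different API from the one used above, Matplotlib 3.0 and newer also offer the Axes.inset_axes method, which takes the bounds as [x0, y0, width, height] and accepts a transform. Note that the size is then given in the units of that transform (data units here), not in inches. The bounds and the plotted line are arbitrary assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.axis([0, 500, -0.1, 0.1])

# Lower left corner at x=250, y=0.0 in the data coordinates of `ax`;
# 100 data units wide and 0.04 data units tall.
axins = ax.inset_axes([250, 0.0, 100, 0.04], transform=ax.transData)

# The inset is an ordinary Axes and can be plotted into as usual.
axins.plot([0, 1, 2], [0, 1, 0])
```

Because the inset is tied to data coordinates, it moves and resizes with the parent axes' limits.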

Figures with lots of data points in matplotlib

I generated the attached image using matplotlib (png format). I would like to use eps or pdf, but I find that with all the data points, the figure is really slow to render on the screen. Other than just plotting less of the data, is there any way to optimize it so that it loads faster?
I think you have three options:
As you mentioned yourself, you can plot fewer points. For the plot you showed in your question I think it would be fine to only plot every other point.
As @tcaswell stated in his comment, you can use a line instead of points, which will be rendered more efficiently.
You could rasterize the blue dots. Matplotlib allows you to selectively rasterize single artists, so if you pass rasterized=True to the plotting command you will get a bitmapped version of the points in the output file. This will be way faster to load at the price of limited zooming due to the resolution of the bitmap. (Note that the axes and all the other elements of the plot will remain as vector graphics and font elements).
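The third option can be sketched as follows. The random data, the marker size, the dpi and the in-memory buffer are assumptions for the example; only the rasterized=True keyword comes from the suggestion above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.random.uniform(size=20000)
y = np.random.uniform(size=20000)

fig, ax = plt.subplots()

# Only this artist is turned into a bitmap inside the vector file;
# spines, ticks and labels are still written as vector elements.
points = ax.scatter(x, y, marker="x", s=6, rasterized=True)
ax.set_xlabel("x")
ax.set_ylabel("y")

# dpi sets the resolution of the embedded bitmap.
buf = io.BytesIO()  # pass a file name such as "rasterized.pdf" to write to disk
fig.savefig(buf, format="pdf", dpi=300)
```

A higher dpi allows more zooming before the points look pixelated, at the cost of a larger file.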
First, if you want to show a "trend" in your plot, and the x, y arrays you are plotting are huge, you could randomly sub-sample them, keeping only a fraction of your data:
import numpy as np
import matplotlib.pyplot as plt
fraction = 0.50
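# x and y are the full data arrays from the question.
x_resampled = []
y_resampled = []
for k in range(0,len(x)):
    # Keep each point with probability `fraction`.
    if np.random.rand() < fraction:
        x_resampled.append(x[k])
        y_resampled.append(y[k])
plt.scatter(x_resampled,y_resampled , s=6)
plt.show()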
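The same random sub-sampling can be written without an explicit loop by using a boolean mask. This is a sketch with placeholder data; x and y stand in for the real arrays, and the seeded generator is an assumption made for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data standing in for the real x and y arrays.
x = rng.uniform(size=10_000)
y = rng.uniform(size=10_000)

fraction = 0.50

# One random draw per point; keep the point where the draw is below `fraction`.
mask = rng.random(x.size) < fraction
x_resampled = x[mask]
y_resampled = y[mask]
```

Applying the same mask to both arrays keeps the x and y values paired, and the result can be passed to plt.scatter exactly as in the loop version.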
Second, have you considered using log-scale in the x-axis to increase visibility?
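With rasterized=True only the plotting area is rasterized; the axes remain in vector format. The call below uses rasterized=False, which gives the fully vector version for comparison:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(size=400000)
y = np.random.uniform(size=400000)
# rasterized=False keeps every marker as a vector path; set it to True to rasterize only the markers.
plt.scatter(x, y, marker='x', rasterized=False)
plt.savefig("norm.pdf", format='pdf')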