How do 'normalized figure coordinates' work? - matplotlib

In matplotlib, I recently came across the term 'normalized figure coordinates', which is apparently a specification of a rectangle by four parameters.
It is evident that a rectangle can be described by four numbers, and I'm guessing these four numbers somehow describe the dimensions as well as the location of the rectangle. However, I haven't managed to find an answer as to which of these parameters specifies which value.
Additionally, I'm not sure whether this is a matplotlib-specific term or one of general meaning, as the matplotlib documentation does not cite or link any sources with respect to this term.
Can anyone shed some light on this issue, please?

There are several functions where normalized figure coordinates are used.
In general possibilities are
(left, bottom, width, height) (this is called "bounds" in matplotlib); or
(left, bottom, right, top) (called "extent").
Hopefully the documentation will make it clear which 4 tuple is expected in the respective case.
Here you seem to be interested in the GridSpec's tight_layout parameter rect. From its documentation
rect : tuple of 4 floats, optional
(left, bottom, right, top) rectangle in normalized figure coordinates that the whole subplots area (including labels) will fit into. Default is (0, 0, 1, 1).

To answer your last question, the term normalization is not matplotlib-specific you can get a very short intro from wikipedia.
As for Matplotlib: you can have different coordinate systems relative to different objects (e.g. the axis, the figure).
Each of these systems is normalized, in the sense that the 4 corners of the chosen reference object will always have the following coordinates:
(0,1) Top left corner
(1,1) Top right corner
(1,0) Bottom right corner
(0,0) Bottom left corner
Where the first element of each pair refers to x-axis and the second element refers to the y-axis.
This makes, among other things, annotation or placements of artist objects easier as you can specify the position of the element you wish to add using any of the available coordinate systems.
All you need to do is select an appropriate coordinate system by passing a transformation object to the transform parameter.
Some example:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([5.], [2.], 'o')
circle=plt.Circle((0, 0), 0.1, color="g",transform=ax.transAxes) #bottom (y=0) left (x=0) green circle of radius 0.1 (expressed in coord system)
ax.add_artist(circle)
ax.annotate('I am the top (y=1.0) right (x=1.0) Figure corner',
xy=(1, 1), xycoords=fig.transFigure,
xytext=(0.2, 0.2), textcoords='offset points',
)
plt.text( # position text relative to data
5., 2., 'I am the (5,2) data point', # x, y, text,
ha='center', va='bottom', # text alignment
transform=ax.transData # coordinate system transformation
)
plt.text( # position text relative to Axes
1.0, 0.0, 'I am the bottom (y=0.0) right (x=1.0) axis corner',
ha='right', va='bottom',
transform=ax.transAxes
)
plt.text( # position text relative to Figure
0.0, 1.0, 'I am the top (y=1.0) left (x=0.0) figure corner',
ha='left', va='top',
transform=fig.transFigure
)
plt.show()

Related

Line Profile Diagonal

When you make a line profile of all x-values or all y-values the extraction from each pixel is clear. But when you take a line profile along a diagonal, how does DM choose which pixels to use in the one dimensional readout?
Not really a scripting question, but I'm rather certain that it uses bi-linear interpolation between the grid-points along the drawn line. (And if perpendicular integration is enabled, it does so in an integral.) It's the same interpolation you would get for a "rotate" image.
In fact, you can think of it as a rotate-image (bi-linearly interpolated) with a 'cut-out' afterwards, potentially summed/projected onto the new X-axis.
Here is an example
Assume we have a 5 x 4 image, which gives the grid as shown below.
I'm drawing top-left corners to indicate the coordinates system pixel convention used in DigitalMicrgraph, where
(x/y)=(0/0) is the top-left corner of the image
Now extract a LineProfile from (1/1) to (4/3). I have highlighted the pixels for those coordinates.
Note, that a Line drawn from the corners seems to be shifted by half-a-pixel from what feels 'natural', but that is the consequence of the top-left-corner convention. I think, this is why a LineProfile-Marker is shown shifted compared to f.e. LineAnnotations.
In general, this top-left corner convention makes schematics with 'pixels' seem counter-intuitive. It is easier to think of the image simply as grid with values in points at the given coordinates than as square pixels.
Now the maths.
The exact profile has a length of:
As we can only have profiles with integer channels, we actually extract a LineProfile of length = 4, i.e we round up.
The angle of the profile is given by the arc-tangent of dX and dY.
So to extract the profile, we 'rotate' the grid by that angle - done by bilinear interpolation - and then extract the profile as grid of size 4 x 1:
This means the 'values' in the profile are from the four points:
Which are each bi-linearly interpolated values from four closest points of the original image:
In case the LineProfile is averaged over a certain width W, you do the same thing but:
extract a 2D grid of size L x W centered symmetrically over the line.i.e. the grid is shifted by (W-1)/2 perpendicular to the profile direction.
sum the values along W

overlapping constrained 3d subplots

What knobs must I tweak to prevent these problems:
overlapping axes labels
overlapping plots with cropped axes labels
I'm using matplotlib 3.5.1 with the PGF backend. Some solutions for older versions no longer work.
fig, axes = plt.subplots\
(2, 3, constrained_layout=True, subplot_kw=dict(projection="3d"))
#it = np.nditer(axes, flags=["refs_ok","multi_index"])
#for ax in it:
# # Plot the surfaces, add row and column title annotations.
# pass
width = 150 * 0.8 * mm
height = width * 0.65
fig.set_size_inches(width, height)
fig.savefig("something.pgf", dpi=300)
plt.close(fig)
Getting rid of the constrained layout and using plt.subplots_adjust(wspace=<value>, hspace=<value>) worked for me
from matplotlib import pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=3, subplot_kw=dict(projection="3d"))
plt.subplots_adjust(wspace=0.5,hspace=0.5)
labels_x, labels_y, labels_z = [['x-axis']*3]*2, [['y-axis']*3]*2, [['z-axis']*3]*2
for i in range(len(axes)):
for j in range(len(axes[i])):
axes[i,j].set_xlabel(labels_x[i][j])
axes[i,j].set_ylabel(labels_y[i][j])
axes[i,j].set_zlabel(labels_z[i][j])
plt.show()
While both tight and constrained layouts can be used with 3d projection (mplot3d) it seems that constrained layout does not understand how to pad 3d tick labels, leading to overlapping or trimmed labels. Both layout managers adjust subplot padding and axes size given a fixed figure size. Neither can fit the figure size to the contents. To do so with tight layout, extrapolate the desired figure size from the current figure tight bbox over multiple iterations. When using the constrained layout manager, sum the axes tight bboxes and padding to determine any extraneous space. Tight layout adjusts subplotpars (axes size and figure padding) and supports "h_pad" and "w_pad" parameters. Constrained layout adjusts axes size and supports "wspace", "hspace", "w_pad", and "h_pad" parameters. The tight layout squeezes subplots into a tight group which is then centered in the available space. The constrained layout distributes subplots evenly across all available space. Regardless of the layout manager, if the ticks overlap or are clustered too tightly, switch the tick locator to "MaxNLocator" for some smaller "n".
The root problem is that the 3d projection tick labels are empty until a full canvas draw. The "_draw_disabled" draw performed by the constrained layout manager isn't sufficient to trigger the tick labels. If you trace the axes tight bbox you'll notice they don't include the labels until after a call to "fig.canvas.draw", before then the tick labels are just "Text(0, 0, '')". Be sure to include this call after sizing the figure and then the layout will be constrained as expected. Given a fixed figure width, set the height based on aesthetics or sum the axes tight bboxes with padding to determine the minimum possible height.

Matplotlib tick labels

Is there a way to render the tick labels just right inside the axes, i.e, something like the direction property there is on the ticks themself?
Right now I'm setting the x property to a positive value on the ticklabels to draw them inside of the axis, i.e.,
ax2.set_yticklabels(['0', '2500', '5000', '7500'], minor=False, x=0.05)
But this doesn't really work on resizable plots, as the 0.05 figure is absolute (and too big on big plots).
Any ideas?
I'm assuming that ax2 is constructed as ax2 = ax.twinx(), which is to say that it is on the right side of the axes.
You could do something like the following:
ax2.set_yticklabels(['0', '2500', '5000', '7500'], minor=False, horizontalalignment='right')
for tick in ax2.yaxis.get_major_ticks():
tick.set_pad(-8)
If you want the left side axis on the inside too, then you'd simply switch the horizontal alignment to 'left' and change the pad from -8 to -25.
The two numbers might not be exact and could depend on other matplotlib settings you might have (e.g. length of major ticks) so you may want to increase or decrease those values slightly.

Matplotlib difference between two images

I have images (4000x2000 pixels) that are derived from the same image, but with subtle differences in less than 1% of the pixels. I'd like to plot the two images side-by-side and highlight the regions of the array's that are different (by highlight I mean I want the pixels that differ to jump out, but still display the color that matches their value. I've been using rectangles that are unfilled to outline the edges of such pixels so far. I can do this very nicely in small images (~50x50) with:
fig=figure(figsize=(20,15))
ax1=fig.add_subplot(1,2,1)
imshow(image1,interpolation='nearest',origin='lower left')
colorbar()
ax2=fig.add_subplot(122,sharex=ax1, sharey=ax1)
imshow(image2,interpolation='nearest',origin='lower left')
colorbar()
#now show differences
Xspots=im1!=im2
Xx,Xy=nonzero(Xspots)
for x,y in zip(Xx,Xy):
rect=Rectangle((y-.5,x-.5),1,1,color='w',fill=False,ec='w')
ax1.add_patch(rect)
ax2.add_patch(rect)
However this doesn't work so well when the image is very large. Strange things happen, for example when I zoom in the patch disappears. Also, this way sucks because it takes forever to load things when I zoom in/out.
I feel like there must be a better way to do this, maybe one where there is only one patch that determines where all of the things are, rather than a whole bunch of patches. I could do a scatter plot on top of the imshow image, but I don't know how to fix it so that the points will stay exactly the size of the pixel when I zoom in/out.
Any ideas?
I would try something with the alpha channel:
import copy
N, M = 20, 40
test_data = np.random.rand(N, M)
mark_mask = np.random.rand(N, M) < .01 # mask 1%
# this is redundant in this case, but in general you need it
my_norm = matplotlib.colors.Normalize(vmin=0, vmax=1)
# grab a copy of the color map
my_cmap = copy.copy(cm.get_cmap('cubehelix'))
c_data= my_cmap(my_norm(test_data))
c_data[:, :, 3] = .5 # make everything half alpha
c_data[mark_mask, 3] = 1 # reset the marked pixels as full opacity
# plot it
figure()
imshow(c_data, interpolation='none')
No idea if this will work with your data or not.

pyplot scatter plot marker size

In the pyplot document for scatter plot:
matplotlib.pyplot.scatter(x, y, s=20, c='b', marker='o', cmap=None, norm=None,
vmin=None, vmax=None, alpha=None, linewidths=None,
faceted=True, verts=None, hold=None, **kwargs)
The marker size
s:
size in points^2. It is a scalar or an array of the same length as x and y.
What kind of unit is points^2? What does it mean? Does s=100 mean 10 pixel x 10 pixel?
Basically I'm trying to make scatter plots with different marker sizes, and I want to figure out what does the s number mean.
This can be a somewhat confusing way of defining the size but you are basically specifying the area of the marker. This means, to double the width (or height) of the marker you need to increase s by a factor of 4. [because A = WH => (2W)(2H)=4A]
There is a reason, however, that the size of markers is defined in this way. Because of the scaling of area as the square of width, doubling the width actually appears to increase the size by more than a factor 2 (in fact it increases it by a factor of 4). To see this consider the following two examples and the output they produce.
# doubling the width of markers
x = [0,2,4,6,8,10]
y = [0]*len(x)
s = [20*4**n for n in range(len(x))]
plt.scatter(x,y,s=s)
plt.show()
gives
Notice how the size increases very quickly. If instead we have
# doubling the area of markers
x = [0,2,4,6,8,10]
y = [0]*len(x)
s = [20*2**n for n in range(len(x))]
plt.scatter(x,y,s=s)
plt.show()
gives
Now the apparent size of the markers increases roughly linearly in an intuitive fashion.
As for the exact meaning of what a 'point' is, it is fairly arbitrary for plotting purposes, you can just scale all of your sizes by a constant until they look reasonable.
Edit: (In response to comment from #Emma)
It's probably confusing wording on my part. The question asked about doubling the width of a circle so in the first picture for each circle (as we move from left to right) it's width is double the previous one so for the area this is an exponential with base 4. Similarly the second example each circle has area double the last one which gives an exponential with base 2.
However it is the second example (where we are scaling area) that doubling area appears to make the circle twice as big to the eye. Thus if we want a circle to appear a factor of n bigger we would increase the area by a factor n not the radius so the apparent size scales linearly with the area.
Edit to visualize the comment by #TomaszGandor:
This is what it looks like for different functions of the marker size:
x = [0,2,4,6,8,10,12,14,16,18]
s_exp = [20*2**n for n in range(len(x))]
s_square = [20*n**2 for n in range(len(x))]
s_linear = [20*n for n in range(len(x))]
plt.scatter(x,[1]*len(x),s=s_exp, label='$s=2^n$', lw=1)
plt.scatter(x,[0]*len(x),s=s_square, label='$s=n^2$')
plt.scatter(x,[-1]*len(x),s=s_linear, label='$s=n$')
plt.ylim(-1.5,1.5)
plt.legend(loc='center left', bbox_to_anchor=(1.1, 0.5), labelspacing=3)
plt.show()
Because other answers here claim that s denotes the area of the marker, I'm adding this answer to clearify that this is not necessarily the case.
Size in points^2
The argument s in plt.scatter denotes the markersize**2. As the documentation says
s : scalar or array_like, shape (n, ), optional
size in points^2. Default is rcParams['lines.markersize'] ** 2.
This can be taken literally. In order to obtain a marker which is x points large, you need to square that number and give it to the s argument.
So the relationship between the markersize of a line plot and the scatter size argument is the square. In order to produce a scatter marker of the same size as a plot marker of size 10 points you would hence call scatter( .., s=100).
import matplotlib.pyplot as plt
fig,ax = plt.subplots()
ax.plot([0],[0], marker="o", markersize=10)
ax.plot([0.07,0.93],[0,0], linewidth=10)
ax.scatter([1],[0], s=100)
ax.plot([0],[1], marker="o", markersize=22)
ax.plot([0.14,0.86],[1,1], linewidth=22)
ax.scatter([1],[1], s=22**2)
plt.show()
Connection to "area"
So why do other answers and even the documentation speak about "area" when it comes to the s parameter?
Of course the units of points**2 are area units.
For the special case of a square marker, marker="s", the area of the marker is indeed directly the value of the s parameter.
For a circle, the area of the circle is area = pi/4*s.
For other markers there may not even be any obvious relation to the area of the marker.
In all cases however the area of the marker is proportional to the s parameter. This is the motivation to call it "area" even though in most cases it isn't really.
Specifying the size of the scatter markers in terms of some quantity which is proportional to the area of the marker makes in thus far sense as it is the area of the marker that is perceived when comparing different patches rather than its side length or diameter. I.e. doubling the underlying quantity should double the area of the marker.
What are points?
So far the answer to what the size of a scatter marker means is given in units of points. Points are often used in typography, where fonts are specified in points. Also linewidths is often specified in points. The standard size of points in matplotlib is 72 points per inch (ppi) - 1 point is hence 1/72 inches.
It might be useful to be able to specify sizes in pixels instead of points. If the figure dpi is 72 as well, one point is one pixel. If the figure dpi is different (matplotlib default is fig.dpi=100),
1 point == fig.dpi/72. pixels
While the scatter marker's size in points would hence look different for different figure dpi, one could produce a 10 by 10 pixels^2 marker, which would always have the same number of pixels covered:
import matplotlib.pyplot as plt
for dpi in [72,100,144]:
fig,ax = plt.subplots(figsize=(1.5,2), dpi=dpi)
ax.set_title("fig.dpi={}".format(dpi))
ax.set_ylim(-3,3)
ax.set_xlim(-2,2)
ax.scatter([0],[1], s=10**2,
marker="s", linewidth=0, label="100 points^2")
ax.scatter([1],[1], s=(10*72./fig.dpi)**2,
marker="s", linewidth=0, label="100 pixels^2")
ax.legend(loc=8,framealpha=1, fontsize=8)
fig.savefig("fig{}.png".format(dpi), bbox_inches="tight")
plt.show()
If you are interested in a scatter in data units, check this answer.
You can use markersize to specify the size of the circle in plot method
import numpy as np
import matplotlib.pyplot as plt
x1 = np.random.randn(20)
x2 = np.random.randn(20)
plt.figure(1)
# you can specify the marker size two ways directly:
plt.plot(x1, 'bo', markersize=20) # blue circle with size 10
plt.plot(x2, 'ro', ms=10,) # ms is just an alias for markersize
plt.show()
From here
It is the area of the marker. I mean if you have s1 = 1000 and then s2 = 4000, the relation between the radius of each circle is: r_s2 = 2 * r_s1. See the following plot:
plt.scatter(2, 1, s=4000, c='r')
plt.scatter(2, 1, s=1000 ,c='b')
plt.scatter(2, 1, s=10, c='g')
I had the same doubt when I saw the post, so I did this example then I used a ruler on the screen to measure the radii.
I also attempted to use 'scatter' initially for this purpose. After quite a bit of wasted time - I settled on the following solution.
import matplotlib.pyplot as plt
input_list = [{'x':100,'y':200,'radius':50, 'color':(0.1,0.2,0.3)}]
output_list = []
for point in input_list:
output_list.append(plt.Circle((point['x'], point['y']), point['radius'], color=point['color'], fill=False))
ax = plt.gca(aspect='equal')
ax.cla()
ax.set_xlim((0, 1000))
ax.set_ylim((0, 1000))
for circle in output_list:
ax.add_artist(circle)
This is based on an answer to this question
If the size of the circles corresponds to the square of the parameter in s=parameter, then assign a square root to each element you append to your size array, like this: s=[1, 1.414, 1.73, 2.0, 2.24] such that when it takes these values and returns them, their relative size increase will be the square root of the squared progression, which returns a linear progression.
If I were to square each one as it gets output to the plot: output=[1, 2, 3, 4, 5]. Try list interpretation: s=[numpy.sqrt(i) for i in s]