Put pcolormesh and contour onto same grid? - matplotlib

I'm trying to display 2D data with axis labels using both contour and pcolormesh. As has been noted on the matplotlib user list, these functions obey different conventions: pcolormesh expects the x and y values to specify the corners of the individual pixels, while contour expects the centers of the pixels.
What is the best way to make these behave consistently?
One option I've considered is to make a "centers-to-edges" function, assuming evenly spaced data:
def centers_to_edges(arr):
    dx = arr[1]-arr[0]
    newarr = np.linspace(arr.min()-dx/2,arr.max()+dx/2,arr.size+1)
    return newarr
Another option is to use imshow with the extent keyword set.
The first approach doesn't play nicely with 2D axes (e.g., as created by meshgrid or indices), and the second discards the axis numbers entirely.
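For reference, here is a minimal sketch of a 1D variant that does not assume even spacing. It is an illustration only: the name centers_to_edges_1d is hypothetical, and the choice to put interior edges at the midpoints between neighbouring centers (and to extrapolate the two outer edges by half a spacing) is an assumption, not something matplotlib prescribes.

```python
import numpy as np

def centers_to_edges_1d(centers):
    """Return len(centers)+1 cell edges for a 1D array of cell centers."""
    centers = np.asarray(centers, dtype=float)
    if centers.size < 2:
        raise ValueError("need at least two centers to infer a spacing")
    # Interior edges sit halfway between neighbouring centers.
    mid = 0.5 * (centers[:-1] + centers[1:])
    # Outer edges: extrapolate by half of the first/last spacing.
    first = centers[0] - 0.5 * (centers[1] - centers[0])
    last = centers[-1] + 0.5 * (centers[-1] - centers[-2])
    return np.concatenate(([first], mid, [last]))

# Evenly spaced input reproduces the linspace-based version.
x_edges = centers_to_edges_1d([0.0, 1.0, 2.0])
# Unevenly spaced input still yields one more edge than centers.
x_edges_uneven = centers_to_edges_1d([0.0, 1.0, 3.0])
```

With uneven spacing the given points no longer sit exactly in the middle of their cells, and this sketch still does not address 2D coordinate arrays.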

Is your data on a regular mesh? If it isn't, you can use griddata() to resample it onto one. If the data set is very large, sub-sampling or regularizing it is always possible; the output image will usually be small compared with the data, and you can exploit that.
If you use imshow() with extent and interpolation='nearest', you will see that the data is cell-centered while extent specifies the edges (corners) of the cells. contour also assumes that the data is cell-centered, so X and Y must be the cell centers. You therefore need to be careful about the input domain you pass to contour. A trivial example:
x = np.arange(-10,10,1)
X,Y = np.meshgrid(x,x)
P = X**2+Y**2
imshow(P,extent=[-10,10,-10,10],interpolation='nearest',origin='lower')
contour(X+0.5,Y+0.5,P,20,colors='k')
In my tests, pcolormesh() was very slow, so I try to avoid it. griddata() and imshow() have always been a good choice for me.
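An alternative to shifting the contour coordinates by half a cell is to shift the extent instead, so that the values in x are treated as cell centers. A minimal sketch, assuming evenly spaced 1D axes; the helper name extent_from_centers is hypothetical:

```python
import numpy as np

def extent_from_centers(x, y):
    """Outer pixel edges [left, right, bottom, top] for evenly spaced centers."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[1] - x[0]
    dy = y[1] - y[0]
    # Each outer edge lies half a cell beyond the first/last center.
    return [x[0] - dx / 2, x[-1] + dx / 2, y[0] - dy / 2, y[-1] + dy / 2]

x = np.arange(-10, 10, 1)
extent = extent_from_centers(x, x)
```

Passing this extent to imshow(P, extent=extent, interpolation='nearest', origin='lower') should then line up with contour(X, Y, P, 20, colors='k') without the +0.5 offsets.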

Related

Unusual Mesh Outline PColorMesh

I am utilizing the pcolormesh function in Matplotlib to plot a series of gridded data (in parallel) across multiple map domains. The code snippet relevant to this question is as follows:
im = ax2.pcolormesh(xgrid, ygrid, data.variable.data[0], cmap=cmap, norm=norm, alpha=0.90, facecolor=None)
Where: xgrid = array of longitude points, ygrid = array of latitude points, data.variable.data[0] = array of corresponding data values, cmap = defined colormap, & norm = defined value normalization
Consider the following image generated from the provided code:
The undesired result I've found in the image above is what appears to be outlines around each grid square, or perhaps better described as patchwork that stands out slightly as the mesh alpha is reduced below 1.
I've set facecolor=None assuming that would remove these outlines, to no avail. What additions or corrections can I make to remove this feature?

Using matplotlib to plot a matrix with the third variable as source for a color map

Say you have the matrix given by three arrays, being:
x = 1D array with N elements.
y = 1D array with M elements.
And z is a set of "somewhat random" values from -0.3 to 0.3 in a NxM shape. I need to create a plot in which the x values are in the x-axis, y values are in the y-axis and using z as the source to indicate the intensity of each pixel with a color map.
So far, I have tried using
plt.contourf(x,y,z)
and the resulting plot is very nice for me (attached at the end of this paragraph), but a smoothing is automatically applied to the plot! I need to be able to distinguish the pixels and I cannot find a way to do it.
[Image: contourf result]
I have also studied the possibility of using
ax.matshow(z)
in order to successfully see the pixels... but then I am struggling to customize the x and y axes, since only the index of each pixel is shown (see below).
[Image: matshow result]
Would you please give me some ideas? Thank you.
Without more information on your x,y data it's hard to know, but I would guess you are looking for pcolormesh.
plt.pcolormesh(x,y,z)
This would take the x and y data as input and hence shows the z data at the appropriate coordinates.
You can use imshow with the keyword interpolation='nearest'.
plt.imshow(z, interpolation='nearest')
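To make the pcolormesh suggestion concrete, here is a small sketch with made-up data in the shape the question describes (x with N points, y with M points, z with shape N x M). Two things in it are assumptions on my part: the transpose is needed only because z is stored as (N, M) while pcolormesh expects (len(y), len(x)), and shading="nearest" requires a reasonably recent Matplotlib (3.3 or later).

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Made-up stand-ins for the question's data.
N, M = 5, 4
x = np.linspace(0.0, 1.0, N)
y = np.linspace(10.0, 13.0, M)
rng = np.random.default_rng(0)
z = rng.uniform(-0.3, 0.3, size=(N, M))

fig, ax = plt.subplots()
# pcolormesh wants the color array shaped (len(y), len(x)), hence z.T.
# shading="nearest" centers each unsmoothed cell on its (x, y) point.
mesh = ax.pcolormesh(x, y, z.T, shading="nearest", cmap="viridis")
fig.colorbar(mesh, ax=ax)
```

Each value of z is drawn as one flat-colored cell, and the axes show the x and y values rather than pixel indices.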

Interpolating data onto a line of points

I have some irregularly spaced data and need to analyze it. I can successfully interpolate this data onto a regular grid using mlab.griddata (or rather, the natgrid implementation of it). This allows me to use pcolormesh and contour to generate plots, extract levels, etc. Using plt.contour, I then extract a certain level by calling get_paths on the entries of CS.collections.
Now, what I'd like to do is then, with my original irregularly spaced data, interpolate some quantities onto this specific contour line (i.e., NOT onto a regular grid). The similarly named griddata function from Scipy allows for this behavior, and it almost works. However, I find that as I increase the number of original points, I can get odd erratic behavior in the interpolation. I'm wondering if there's a way around this, i.e., another way to interpolate irregularly spaced (or regularly spaced data for that matter, since I can use my regularly spaced data from mlab.griddata) onto a specific line.
Let me show some numerical examples of what I'm talking about. Take a look at this figure:
The top left shows my data as points, and the line shows an extracted level of level=0 from some data D that I have at those points (x,y) [note, I have data 'D', 'Energy', and 'Pressure', all defined in this (x,y) space]. Once I have this curve, I can plot the interpolated quantities of D, Energy, and Pressure onto my specific line. First, note the plot of D (middle, right). It should be zero at all points, but it's not quite zero at all points. The likely cause of this is that the line that corresponds to the 0 level is generated from a uniform set of points that came from mlab.griddata, whereas the plot of 'D' is generated from my ORIGINAL data interpolated onto that level curve. You can also see some unphysical wiggles in 'Energy' and 'Pressure'.
Okay, seems easy enough, right? Maybe I should just get more original data points along my level=0 curve. Getting some more of these points, I then generate the following plots:
First look at the top left. You can see that I've sampled the hell out of the (x,y) space in the vicinity of my level=0 curve. Furthermore, you can see that my new "D" plot (middle, right) now correctly interpolates to zero in the region that it originally didn't. But now I get some wiggles at the start of the curve, as well as getting some other wiggles in the 'Energy' and 'Pressure' in this space! It is far from obvious to me that this should occur, since my original data points are still there and I've only supplemented additional points. Furthermore, some regions where my interpolation is going bad aren't even near the points that I added in the second run -- they are exclusively neighbored by my original points.
So this brings me to my original question. I'm worried that the interpolation that produces the 'Energy', 'D', and 'Pressure' curves is not working correctly (this is scipy.interpolate.griddata, which I import as scigrid). Mlab's griddata only interpolates to a regular grid, whereas I want to interpolate to this specific line shown in the top left plot. What's another way for me to do this?
Thanks for your time!
After posting this, I decided to try scipy.interpolate.SmoothBivariateSpline, which produced the following result:
You can now see that my line is smoothed, so it seems like this will work. I'll mark this as the answer unless someone posts something soon that hints that there may be an even better solution.
Edit: As requested, below is some of the code used to generate these plots. I don't have a minimally working example, and the above plots were generated in a larger framework of code, but I'll write the important parts schematically below with comments.
# x,y,z are lists of data where the first point is x[0],y[0],z[0], and so on
minx=min(x)
maxx=max(x)
miny=min(y)
maxy=max(y)
# convert to numpy arrays
x=np.array(x)
y=np.array(y)
z=np.array(z)
# here we are creating a fine grid to interpolate the data onto
xi=np.linspace(minx,maxx,100)
yi=np.linspace(miny,maxy,100)
# here we interpolate our data from the original x,y,z unstructured grid to the new
# fine, regular grid in xi,yi, returning the values zi
zi=griddata(x,y,z,xi,yi)
# now let's do some plotting
plt.figure()
# returns the CS contour object, from which we'll be able to get the path for the
# level=0 curve
CS=plt.contour(xi,yi,zi,levels=[0])
# can plot the original data if we want
plt.scatter(x,y,alpha=0.5,marker='x')
# now let's get the level=0 curve
for c in CS.collections:
    data=c.get_paths()[0].vertices
# lineX,lineY are simply the x,y coordinates for our level=0 curve, expressed as arrays
lineX=data[:,0]
lineY=data[:,1]
# so it's easy to plot this too
plt.plot(lineX,lineY)
# now what to do if we want to interpolate some other data we have, say z2
# (also at our original x,y positions), onto
# this level=0 curve?
# well, first I tried using scipy.interpolate.griddata == scigrid like so
origdata=np.transpose(np.vstack((x,y))) # just organizing this data like the
# scigrid routine expects
lineZ2=scigrid(origdata,z2,data,method='linear')
# plotting the above curve (as plt.plot(lineZ2)) gave me really bad results, so
# trying a spline approach
Z2spline=SmoothBivariateSpline(x,y,z2)
# the above creates a spline object on our original data. notice we haven't EVALUATED
# it anywhere yet (we'll want to evaluate it on our level curve)
Z2Line=[]
# here we evaluate the spline along all our points on the level curve, and store the
# result as a new list
for i in range(0,len(lineX)):
    # the [0][0] is just to get the value, which is enclosed in some array
    # structure for some reason otherwise
    Z2Line.append(Z2spline(lineX[i],lineY[i])[0][0])
# you can then easily plot this
plt.plot(Z2Line)
Hope this helps someone!
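As a possible simplification of the evaluation loop above, the spline object has an ev method that evaluates at paired points, which avoids both the Python loop and the [0][0] indexing. The sketch below uses synthetic data (a plane) and a made-up curve in place of the real level=0 vertices, so all of the numbers in it are assumptions for illustration only.

```python
import numpy as np
from scipy.interpolate import SmoothBivariateSpline

# Synthetic scattered data standing in for the original x, y, z2.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = rng.uniform(0.0, 1.0, 200)
z2 = x + 2.0 * y

# Hypothetical curve standing in for the level=0 path vertices.
t = np.linspace(0.2, 0.8, 50)
lineX = t
lineY = 1.0 - t

Z2spline = SmoothBivariateSpline(x, y, z2)
# .ev evaluates at the pairs (lineX[i], lineY[i]); calling the spline
# directly with arrays evaluates on the full lineX-by-lineY grid instead.
Z2Line = Z2spline.ev(lineX, lineY)
```

Z2Line is then a plain 1D array with one value per curve vertex, ready for plt.plot(Z2Line).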

Visualizing randomized four dimensional data set

I have a four-dimensional data set. None of the four variables is equally spaced. Right now, I visualize the data using a 3D scatter plot (with the color of the dots indicating the fourth dimension), but this is extremely unwieldy when printed. Had the variables been evenly spaced, a series of pcolor plots would have been an option. Is there some way to represent such data using a series of 2D plots? My data set looks something like this:
x = [3.67, 3.89, 25.6]
y = [4.88, 4.88, 322.9]
z = [1.0, 2.0, 3.0]
b = [300.0,411.0,414.5]
A scatter plot matrix is a common way to plot multiple dimensions. Here's a plot of four continuous variables colored by a fifth categorical variable.
To deal with the uneven spacing, it depends on the nature of the unevenness.
You might plot it as-is if the unevenness is significant.
You might make a second plot with the extreme values excluded.
You might apply a transformation (such as log or quantile) if the data justifies it.
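As one illustration of the transformation option, here is a rough sketch of a rank (quantile-style) transform that spreads unevenly spaced values evenly over [0, 1]. The function name rank_transform is hypothetical, it needs at least two values, and ties are broken by input order, which is a simplification.

```python
import numpy as np

def rank_transform(values):
    """Map values to evenly spaced ranks in [0, 1] (ties broken by order)."""
    values = np.asarray(values, dtype=float)
    # argsort of argsort gives each element's rank in the sorted order.
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    return ranks / (values.size - 1)

# The sample values from the question.
x = [3.67, 3.89, 25.6]
b = [300.0, 411.0, 414.5]
x_even = rank_transform(x)
b_even = rank_transform(b)
```

The transformed values can be plotted on evenly spaced axes, at the cost of losing the original distances between points, so the tick labels should show the original values.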

Contours based on a "label mask"

I have images that have had features extracted with a contouring algorithm (I'm doing astrophysical source extraction). This approach yields a "feature map" that has each pixel "labeled" with an integer (usually ~1000 unique features per map).
I would like to show each individual feature as its own contour.
One way I could accomplish this is:
for ii in range(labelmask.max()):
    contour(labelmask,levels=[ii-0.5])
However, this is very slow, particularly for large images. Is there a better (faster) way?
P.S.
A little testing showed that skimage's find-contours is no faster.
As per @tcaswell's comment, I need to explain why contour(labels, levels=np.unique(levels)+0.5) or something similar doesn't work:
1. Matplotlib spaces each subsequent contour "inward" by a linewidth to avoid overlapping contour lines. This is not the behavior desired for a labelmask.
2. The lowest-level contours encompass the highest-level contours.
3. As a result of the above, the highest-level contours will be surrounded by a miniature version of whatever colormap you're using and will have extra-thick contours compared to the lowest-level contours.
Sorry for answering my own... impatience (and good luck) got the better of me.
The key is to use matplotlib's low-level C routines:
I = imshow(data)
E = I.get_extent()
x,y = np.meshgrid(np.linspace(E[0],E[1],labels.shape[1]), np.linspace(E[2],E[3],labels.shape[0]))
for ii in np.unique(labels):
    if ii == 0: continue
    tracer = matplotlib._cntr.Cntr(x,y,labels*(labels==ii))
    T = tracer.trace(0.5)
    contour_xcoords,contour_ycoords = T[0].T
    # to plot them:
    plot(contour_xcoords, contour_ycoords)
Note that labels*(labels==ii) will put each label's contour at a slightly different location; change it to just labels==ii if you want overlapping contours between adjacent labels.
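The "slightly different location" can be quantified with a small sketch, assuming the tracer places the level by linear interpolation between neighbouring grid points (my understanding of how such contouring routines work, not something checked against the _cntr source). The helper crossing_fraction is hypothetical. Note also that newer Matplotlib releases no longer ship matplotlib._cntr, so the snippet above needs an older version.

```python
def crossing_fraction(label_value, level=0.5):
    """Fraction of the way from a 0-valued pixel center to a neighbouring
    pixel center holding label_value at which linear interpolation
    reaches the contour level."""
    if label_value <= level:
        raise ValueError("label_value must exceed the contour level")
    return level / label_value

# With labels*(labels==ii) the masked pixels hold the value ii, so the
# 0.5 contour moves toward the background as ii grows.
for ii in (1, 2, 10):
    print(ii, crossing_fraction(ii))
```

With the boolean mask labels==ii the pixel values are always 0 or 1, so under the same assumption the contour sits halfway between pixel centers for every label, which is why adjacent labels then share overlapping contours.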