Contours based on a "label mask" - matplotlib

I have images that have had features extracted with a contouring algorithm (I'm doing astrophysical source extraction). This approach yields a "feature map" that has each pixel "labeled" with an integer (usually ~1000 unique features per map).
I would like to show each individual feature as its own contour.
One way I could accomplish this is:
for ii in range(1, labelmask.max() + 1):
    contour(labelmask, levels=[ii - 0.5])
However, this is very slow, particularly for large images. Is there a better (faster) way?
P.S.
A little testing showed that skimage's find_contours is no faster.
As per @tcaswell's comment, I need to explain why contour(labels, levels=np.unique(labels)+0.5) or something similar doesn't work:
1. Matplotlib spaces each subsequent contour "inward" by a linewidth to avoid overlapping contour lines. This is not the behavior desired for a labelmask.
2. The lowest-level contours encompass the highest-level contours.
3. As a result of the above, the highest-level contours will be surrounded by a miniature version of whatever colormap you're using and will have extra-thick contours compared to the lowest-level contours.
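Point 2 can be checked on a toy label mask. The sketch below uses a made-up `labels` array (an assumption for illustration, not data from the question). It compares the region enclosed by a contour at level k+0.5, which is every pixel with a larger value, with the region covered by a single label.

```python
import numpy as np

# Hypothetical toy label mask: 0 is background, 1..3 are features.
labels = np.array([[0, 1, 1, 0, 2, 2, 0, 3, 3, 3]])

enclosed_counts = []
single_counts = []
for k in range(3):
    level = k + 0.5
    # A contour at `level` encloses every pixel whose value exceeds it,
    # i.e. label k+1 and all higher labels.
    enclosed = labels > level
    # Isolating one label first keeps only that feature.
    single = labels == (k + 1)
    enclosed_counts.append(int(enclosed.sum()))
    single_counts.append(int(single.sum()))

print(enclosed_counts)
print(single_counts)
```

The enclosed counts shrink as the level rises, so the lowest level wraps around everything, while the per-label masks stay independent of each other.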

Sorry for answering my own... impatience (and good luck) got the better of me.
The key is to use matplotlib's low-level C routines:
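import numpy as np
import matplotlib._cntr  # private module; not available in Matplotlib 2.2 and later
from pylab import imshow, plot

I = imshow(data)
E = I.get_extent()  # (left, right, bottom, top)
x, y = np.meshgrid(np.linspace(E[0], E[1], labels.shape[1]),
                   np.linspace(E[2], E[3], labels.shape[0]))
for ii in np.unique(labels):
    if ii == 0:
        continue
    # zero out everything except label ii, then trace its outline
    tracer = matplotlib._cntr.Cntr(x, y, labels*(labels==ii))
    T = tracer.trace(0.5)
    contour_xcoords, contour_ycoords = T[0].T
    # to plot them:
    plot(contour_xcoords, contour_ycoords)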
Note that labels*(labels==ii) will put each label's contour at a slightly different location; change it to just labels==ii if you want overlapping contours between adjacent labels.
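To see why the two masks place the contour differently, it may help to look at how a level is located between two neighbouring pixel centres, assuming plain linear interpolation between them. The helper `crossing_fraction` and the label value 3 are hypothetical, introduced only for this sketch.

```python
import numpy as np

def crossing_fraction(v0, v1, level=0.5):
    """Fraction of the way from a pixel with value v0 to its neighbour
    with value v1 at which linear interpolation reaches `level`."""
    return (level - v0) / (v1 - v0)

ii = 3                   # hypothetical label value
row = np.array([0, ii])  # a background pixel next to a pixel of label ii

scaled = row * (row == ii)          # values [0, 3], as in labels*(labels==ii)
binary = (row == ii).astype(float)  # values [0, 1], as in labels==ii

frac_scaled = crossing_fraction(scaled[0], scaled[1])
frac_binary = crossing_fraction(binary[0], binary[1])
print(frac_scaled, frac_binary)
```

With the scaled mask the crossing sits closer to the background pixel the larger the label value is, so each label gets a slightly different offset. With the binary mask it always sits halfway, which is why adjacent labels then share the same line.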

Related

Unusual Mesh Outline PColorMesh

I am utilizing the pcolormesh function in Matplotlib to plot a series of gridded data (in parallel) across multiple map domains. The code snippet relevant to this question is as follows:
im = ax2.pcolormesh(xgrid, ygrid, data.variable.data[0], cmap=cmap, norm=norm, alpha=0.90, facecolor=None)
Where: xgrid = array of longitude points, ygrid = array of latitude points, data.variable.data[0] = array of corresponding data values, cmap = defined colormap, & norm = defined value normalization
Consider the following image generated from the provided code:
The undesired result I've found in the image above is what appears to be outlines around each grid square, or perhaps better described as patchwork that stands out slightly as the mesh alpha is reduced below 1.
I've set facecolor=None assuming that would remove these outlines, to no avail. What additions or corrections can I make to remove this feature?

Holoviews: Format legend and colors of Spread and Curve Overlay

Given a tidy Pandas DataFrame with 4 or more columns, I want an otherwise very straightforward plot: two of the columns should be the x-y axes of a single figure, and one of the columns should index an Overlay of N Curve objects based on the x-y columns, and N Spread objects, using the final column as error. So if N=4 there should be 4 curves and 4 spreads. The curves and spreads with the same index should be the same color, and the legend should attest to this.
Using table.to(hv.Curve,'col1','col2') I can get a HoloMap for the curves, and with some effort I can do the same for the spread. If I then call .overlay() I get a nice figure for the curves including a legend, but when I do the same for the spread the legend vanishes. If I overlay the two, the legend likewise vanishes and the color cycle stops working, making all curves and spreads the same color. If I create a HoloMap of curve*spread objects, then the colors match but the legend is still gone.
This seems like a very standard plot, but I can find very little in the Holoviews docs about pairing different Elements or controlling the legend.
This is a bit difficult to answer without any concrete code; for example, I can't reproduce some of the issues you are describing. However, the first issue is simply that show_legend is not enabled by default for the Spread element. In the case of plotting a Curve and Spread using .to and .overlay, here is what I can confirm works:
%%opts Spread [show_legend=True width=600] Overlay [legend_position='right']
df = pd.DataFrame({
    'index': np.arange(100), 'y': np.random.randn(100).cumsum(),
    'err': np.random.rand(100)+0.1, 'z': np.repeat(np.arange(10), 10)
})
ds = hv.Dataset(df)
ds.to(hv.Curve, 'index', 'y', 'z').overlay() * ds.to(hv.Spread, 'index', ['y', 'err']).overlay()
If I create a HoloMap of curve*spread objects, then the colors match but the legend is still gone.
This is indeed a current limitation, since we recommended against nesting objects in this way in the past. However, I have just opened this PR which will allow this approach as well.

Put pcolormesh and contour onto same grid?

I'm trying to display 2D data with axis labels using both contour and pcolormesh. As has been noted on the matplotlib user list, these functions obey different conventions: pcolormesh expects the x and y values to specify the corners of the individual pixels, while contour expects the centers of the pixels.
What is the best way to make these behave consistently?
One option I've considered is to make a "centers-to-edges" function, assuming evenly spaced data:
def centers_to_edges(arr):
    dx = arr[1]-arr[0]
    newarr = np.linspace(arr.min()-dx/2,arr.max()+dx/2,arr.size+1)
    return newarr
Another option is to use imshow with the extent keyword set.
The first approach doesn't play nicely with 2D axes (e.g., as created by meshgrid or indices) and the second discards the axis numbers entirely.
Is your data on a regular mesh? If it isn't, you can use griddata() to interpolate it onto one. If the data is too big, sub-sampling or regularization is always possible. In that case the output image will probably be small compared with the data anyway, and you can exploit this.
If you use imshow() with "extent" and "interpolation='nearest'", you will see that the data is cell-centered and that extent gives the outer edges (corners) of the cells. On the other hand, contour assumes that the data is cell-centered, and X,Y must be the centers of the cells. So you need to be careful about the input domain for contour. A trivial example is:
x = np.arange(-10,10,1)
X,Y = np.meshgrid(x,x)
P = X**2+Y**2
imshow(P,extent=[-10,10,-10,10],interpolation='nearest',origin='lower')
contour(X+0.5,Y+0.5,P,20,colors='k')
My tests told me that pcolormesh() is a very slow routine, and I always try to avoid it. griddata and imshow() have always been a good choice for me.
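For evenly spaced 1D axes, the two conventions can be tied together by deriving the cell edges from the cell centers once and reusing them. The following is a minimal sketch of that idea, a variant of the centers_to_edges helper from the question. The axis ranges are made up for illustration.

```python
import numpy as np

def centers_to_edges(centers):
    """Return the n+1 cell edges for n evenly spaced cell centers."""
    centers = np.asarray(centers, dtype=float)
    dx = centers[1] - centers[0]
    return np.linspace(centers[0] - dx / 2, centers[-1] + dx / 2, centers.size + 1)

# Hypothetical axes, given as cell centers.
x = np.arange(-10, 10, 1) + 0.5   # -9.5, -8.5, ..., 9.5
y = np.arange(0, 5, 1) + 0.5      # 0.5, 1.5, ..., 4.5

x_edges = centers_to_edges(x)     # 21 edges running from -10 to 10
y_edges = centers_to_edges(y)     # 6 edges running from 0 to 5

# imshow's extent is (left, right, bottom, top), taken from the outer edges.
extent = (x_edges[0], x_edges[-1], y_edges[0], y_edges[-1])
print(extent)
```

With an array P of shape (len(y), len(x)), the edges would go to pcolormesh(x_edges, y_edges, P), the centers to contour(x, y, P), and extent to imshow(P, extent=extent, origin='lower', interpolation='nearest'), so all three draw on the same grid.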

Plot variable size/color-heatmap for multiple occurrences of points in scatter plot

I'm stuck with the following problem and I hope I can explain it coherently.
So, I have a number (about 10) of discrete positions on a coordinate system.
Now, I want to analyse data from a program where users could label each point as somethingA or somethingB.
I extracted the data points for each class. So I have about 60 points for the somethingA class and a little bit less for the other class. One class stands for good points and one for bad points. I want to find the positions which have the most good/bad labels. I do that with machine learning algorithms, I just want to visualize this with plots.
I now want to plot those points. So I make one plot per class. But since in every class every point occurs at least once, the two plots would look exactly the same.
But the number of occurrences has a different distribution throughout the positions.
Maybe point A has 20 occurrences in class A and 1 in class B; both plots would still look the same.
So, my question is: How can I take the number of occurrences for points into account when plotting scatters in Matplotlib?
Either with different colors (like a heatmap?), maybe with a cool legend.
Or with different sizes (e.g. higher amount = bigger circle).
Any help would be appreciated!
I don't know if this helps you, but I had a problem where I wanted a scatterplot to reflect both the positions and two variables attributed to the data points.
Passing variables directly as the color and size in the scatter function, meaning something like
ax.scatter(..., c=whatEverFunction, s=numberOfOccurences, ...)
did not work for me, so I had to specify the color code and size in the usual way.
What I did was to bin the values of the two variables I wanted to visualize. In my case these were the variable nodeMass and another variable.
# first bin: small grey markers
for i in range(Number):
    mask[i] = False
    if lowerBound1 < variableOne[i] < upperBound1:
        mask[i] = True & pmask[i]
if len(positionX[mask]) > 0:
    ax.scatter(positionX[mask], positionY[mask], positionZ[mask], c='#424242', s=10, edgecolors='none')
# second bin: larger markers in a second color
for i in range(Number):
    mask[i] = False
    if lowerBound2 < variableOne[i] < upperBound2:
        mask[i] = True & pmask[i]
if len(positionX[mask]) > 0:
    ax.scatter(positionX[mask], positionY[mask], positionZ[mask], c='#9E0050', s=25, edgecolors='none')
I know it is not very elegant, but it worked for me. I had to write as many for loops as I had bins in my variables. With the if-queries and the masks I could at least avoid redundant or 'unreadable' plots.
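For the question as asked (marker size or color by number of occurrences), a shorter route may be to count the duplicates first. The sketch below assumes the labelled points of one class are available as an (N, 2) array. The sample coordinates and the scale factor are made up for illustration.

```python
import numpy as np

# Hypothetical labelled positions for one class: each row is an (x, y)
# position, repeated once per label it received.
points = np.array([[0, 0], [1, 2], [0, 0], [3, 1], [1, 2], [0, 0]])

# Unique positions (sorted row-wise) and how often each one occurs.
positions, counts = np.unique(points, axis=0, return_counts=True)

# Marker area proportional to the count; the factor 40 is arbitrary.
sizes = 40 * counts

for (px, py), n in zip(positions, counts):
    print(px, py, n)
```

Current Matplotlib accepts arrays for both arguments, so positions[:, 0] and positions[:, 1] can be passed to ax.scatter together with s=sizes and c=counts. A colorbar built from the returned collection can then serve as the legend for the counts.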

Colorbar for imshow, centered on 0 and with symlog scale

I want to generate a grid of plots, of several arrays, with positive and negative values, with log scale, sharing the same colorbar.
I've achieved the sharing part of the colorbar (using ImageGrid and common max and min values), and I know that I could get a logarithmic scale using LogNorm() on the imshow call in the case of only positive values. But given the presence of negative values, I would need a colorbar on symmetric logarithmic scale.
I have found what would be the solution on https://stackoverflow.com/a/7741317/1101750 , but running the sample code Yann provides gives me very different results that are clearly wrong.
Reviewing the code, I'm not able to grasp what's going on.
In addition to that, I've discovered that in Matplotlib 1.2, scale.SymmetricalLogScale.SymmetricalLogTransform asks for a new argument that is not explained in the documentation (linscale; looking at the code of other transforms, I assume that leaving it as 1 is a safe value).
Is the easiest solution subclassing LogNorm?
I've used a pretty simple recipe in the past to do exactly this, without the need to do any subclassing. matplotlib.colors.SymLogNorm provides most of the functionality you need, except that I've found it necessary to generate the tick marks by hand. Note that this solution uses matplotlib 1.3.0, and I may be using features that weren't available with 1.2.
import numpy as np
import matplotlib.colors
from pylab import imshow, colorbar

def imshow_symlog(my_matrix, vmin, vmax, logthresh=5):
    # vmin/vmax go to the norm; newer Matplotlib rejects them in imshow when norm is given
    img = imshow(my_matrix,
                 norm=matplotlib.colors.SymLogNorm(10**-logthresh,
                                                   vmin=float(vmin), vmax=float(vmax)))
    maxlog = int(np.ceil(np.log10(vmax)))
    minlog = int(np.ceil(np.log10(-vmin)))
    # generate logarithmic ticks
    tick_locations = ([-(10**x) for x in range(minlog, -logthresh-1, -1)]
                      + [0.0]
                      + [(10**x) for x in range(-logthresh, maxlog+1)])
    cb = colorbar(ticks=tick_locations)
    return img, cb
Since 1.3 matplotlib has a SymLogNorm. http://matplotlib.org/api/colors_api.html#matplotlib.colors.SymLogNorm
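The hand-made tick list in the recipe above can be tried without any plotting. Here is a small sketch of just that step. The function name `symlog_ticks` and the sample limits are hypothetical, and it assumes vmin < 0 < vmax.

```python
import math

def symlog_ticks(vmin, vmax, logthresh):
    """Decade ticks covering vmin (< 0) to vmax (> 0), with 0 in the middle
    and the smallest decades at +/- 10**-logthresh."""
    maxlog = int(math.ceil(math.log10(vmax)))
    minlog = int(math.ceil(math.log10(-vmin)))
    # negative decades, from the largest magnitude down to 10**-logthresh
    negative = [-(10.0 ** e) for e in range(minlog, -logthresh - 1, -1)]
    # positive decades, from 10**-logthresh up to the decade covering vmax
    positive = [10.0 ** e for e in range(-logthresh, maxlog + 1)]
    return negative + [0.0] + positive

ticks = symlog_ticks(vmin=-50, vmax=500, logthresh=1)
print(ticks)
```

The resulting list is what would be handed to colorbar(ticks=...). The outermost ticks are rounded up to the next decade, so they can lie beyond vmin and vmax.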