pyplot return unexpected contour paths - matplotlib

I want to use pyplot.contour to extract isolines from 2D data.
My problem is that this method returns unexpected results: when I use levels clearly outside the data range, the contour result still contains paths.
Here is an example reproducing the issue:
import numpy
from matplotlib import pyplot
n = 256
x = numpy.linspace(-3., 3., n)
y = numpy.linspace(-3., 3., n)
X, Y = numpy.meshgrid(x, y)
Z = X * numpy.sinc(X ** 2 + Y ** 2)
levels = [1000]
print(f'data min : {Z.min()}')
print(f'data max : {Z.max()}')
print(f'levels : {levels}')
isolines = pyplot.contour(X, Y, Z, levels, colors='red')
for i, collection in enumerate(isolines.collections):
    npaths = len(collection.get_paths())
    print(f'collection[{i}] has {npaths} paths')
pyplot.show()
Which outputs
data min : -0.47993931267102286
data max : 0.47993931267102286
levels : [1000]
/path/to/issue.py:15: UserWarning: No contour levels were found within the data range.
isolines = pyplot.contour(X, Y, Z, levels, colors='red')
collection[0] has 1 paths
I expected the contour to be empty and not contain 1 path. Am I missing something obvious here?

As of 2023/01/11, it is a bug in matplotlib :
https://github.com/matplotlib/matplotlib/issues/23778
As the fix has not landed yet, my temporary workaround is to detect when levels are outside Z value range, and empty the contour collections in that case.
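# levels must be an array so that it can be compared and indexed element-wise
levels = numpy.asarray(levels)
quadcontourset = pyplot.contour(X, Y, Z, levels)
zmin = numpy.min(Z)
zmax = numpy.max(Z)
inside = (levels > zmin) & (levels < zmax)
levels_in = levels[inside]
if levels_in.size == 0:
    quadcontourset.collections.clear()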
I reproduced the issue with matplotlib 3.5.3. It is not fixed in the current 3.6.2 release, but a fix seems to be on track at
https://github.com/matplotlib/matplotlib/pull/24912
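The range check from the workaround can also be sketched as a small reusable helper. The function name `levels_within_range` is hypothetical (it is not part of matplotlib); it uses only NumPy and returns the levels that lie strictly inside the data range, so the caller can decide what to do when nothing remains.

```python
import numpy as np


def levels_within_range(Z, levels):
    """Return the levels lying strictly between the min and max of Z.

    Hypothetical helper for illustration; NaNs in Z are ignored.
    """
    levels = np.atleast_1d(np.asarray(levels, dtype=float))
    zmin, zmax = np.nanmin(Z), np.nanmax(Z)
    return levels[(levels > zmin) & (levels < zmax)]


# Same data as in the question.
x = np.linspace(-3.0, 3.0, 256)
X, Y = np.meshgrid(x, x)
Z = X * np.sinc(X ** 2 + Y ** 2)

print(levels_within_range(Z, [1000]))       # empty: 1000 is outside the data range
print(levels_within_range(Z, [0.0, 1000]))  # only 0.0 survives
```

If the returned array is empty, one option is to skip the `pyplot.contour` call altogether and treat the result as having no isolines.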

Related

traitsui Range() fatal error with slider entry box

Below is some sample code straight from the MayaVI website on using sliders. Try putting in a number outside of the slider range for a fatal error:
from numpy import arange, pi, cos, sin
from traits.api import HasTraits, Range, Instance, \
    on_trait_change
from traitsui.api import View, Item, Group
from mayavi.core.api import PipelineBase
from mayavi.core.ui.api import MayaviScene, SceneEditor, \
    MlabSceneModel
dphi = pi/1000.
phi = arange(0.0, 2*pi + 0.5*dphi, dphi, 'd')
def curve(n_mer, n_long):
    mu = phi*n_mer
    x = cos(mu) * (1 + cos(n_long * mu/n_mer)*0.5)
    y = sin(mu) * (1 + cos(n_long * mu/n_mer)*0.5)
    z = 0.5 * sin(n_long*mu/n_mer)
    t = sin(mu)
    return x, y, z, t
class MyModel(HasTraits):
    n_meridional = Range(0, 30, 6, )#mode='spinner')
    n_longitudinal = Range(0, 30, 11, )#mode='spinner')
    scene = Instance(MlabSceneModel, ())
    plot = Instance(PipelineBase)
    # When the scene is activated, or when the parameters are changed, we
    # update the plot.
    @on_trait_change('n_meridional,n_longitudinal,scene.activated')
    def update_plot(self):
        x, y, z, t = curve(self.n_meridional, self.n_longitudinal)
        if self.plot is None:
            self.plot = self.scene.mlab.plot3d(x, y, z, t,
                                tube_radius=0.025, colormap='Spectral')
        else:
            self.plot.mlab_source.set(x=x, y=y, z=z, scalars=t)
    # The layout of the dialog created
    view = View(Item('scene', editor=SceneEditor(scene_class=MayaviScene),
                     height=250, width=300, show_label=False),
                Group(
                    '_', 'n_meridional', 'n_longitudinal',
                ),
                resizable=True,
                )
my_model = MyModel()
my_model.configure_traits()
How can I improve this code to disallow users from triggering this fatal error? I think a line that could deactivate the entry box (such as setDisabled(True)) could work, or remove it entirely - but I'm not sure how to implement it within the traitsui methods.
After lots of trial and error, this appears to be a bug in the default Range() mode of TraitsUI, at least on Mac OS X (I'm running High Sierra, 10.13.3).
The solution is to alter the default mode to one that looks and acts identical, minus crashing the program:
n_meridional = Range(0, 30, 6, mode='slider')

Plot random points a specified distance apart

I'm trying to come up with a function that plots n points inside the unit circle, but I need them to be sufficiently spread out.
ie. something that looks like this:
Is it possible to write a function with two parameters, n (number of points) and min_d (minimum distance apart) such that the points are:
a) equidistant
b) no pairwise distance falls below a given min_d
The problem with sampling from a uniform distribution is that it could happen that two points are almost on top of each other, which I do not want to happen. I need this kind of input for a network diagram representing node clusters.
EDIT: I have found an answer to a) here: Generator of evenly spaced points in a circle in python, but b) still eludes me.
At the time this answer was provided, the question asked for random numbers. This answer thus gives a solution drawing random numbers. It ignores any edits made to the question afterwards.
One may simply draw random points and check for each one whether the minimum-distance condition is fulfilled. If not, the point is discarded. This is repeated until a list is filled with enough points or some break condition is met.
import numpy as np
import matplotlib.pyplot as plt
class Points():
    def __init__(self,n=10, r=1, center=(0,0), mindist=0.2, maxtrials=1000 ) :
        self.success = False
        self.n = n
        self.r = r
        self.center=np.array(center)
        self.d = mindist
        # placeholder positions far outside the circle
        self.points = np.ones((self.n,2))*10*r+self.center
        self.c = 0
        self.trials = 0
        self.maxtrials = maxtrials
        self.tx = "rad: {}, center: {}, min. dist: {} ".format(self.r, center, self.d)
        self.fill()
    def dist(self, p, x):
        if len(p.shape) >1:
            return np.sqrt(np.sum((p-x)**2, axis=1))
        else:
            return np.sqrt(np.sum((p-x)**2))
    def newpoint(self):
        x = (np.random.rand(2)-0.5)*2
        # scale to the radius and shift to the circle's center
        x = x*self.r+self.center
        if self.dist(self.center, x) < self.r:
            self.trials += 1
            if np.all(self.dist(self.points, x) > self.d):
                self.points[self.c,:] = x
                self.c += 1
    def fill(self):
        while self.trials < self.maxtrials and self.c < self.n:
            self.newpoint()
        # drop the placeholders that were never replaced
        self.points = self.points[self.dist(self.points,self.center) < self.r,:]
        if len(self.points) == self.n:
            self.success = True
        self.tx +="\n{} of {} found ({} trials)".format(len(self.points),self.n,self.trials)
    def __repr__(self):
        return self.tx
center =(0,0)
radius = 1
p = Points(n=40,r=radius, center=center)
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
If the number of points should be fixed, you may try to find this number of points for decreasing minimum distances until the desired number is found.
In the following case, we are looking for 60 points and start with a minimum distance of 0.6, which we decrease stepwise by 0.05 until a solution is found. Note that this will not necessarily be the optimum solution, as there are only maxtrials attempts in each step. Increasing maxtrials will of course bring us closer to the optimum but requires more runtime.
center =(0,0)
radius = 1
mindist = 0.6
step = 0.05
success = False
while not success:
    mindist -= step
    p = Points(n=60,r=radius, center=center, mindist=mindist)
    print(p)
    if p.success:
        break
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
Here the solution is found for a minimum distance of 0.15.
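To confirm that a set of points such as `p.points` really respects the minimum distance, one can compute the smallest pairwise distance directly. This is a minimal sketch; the helper name `min_pairwise_distance` is an assumption made for illustration and is not part of the `Points` class above.

```python
import numpy as np


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two distinct rows of points."""
    points = np.asarray(points, dtype=float)
    if len(points) < 2:
        return np.inf
    # Pairwise differences between all rows, shape (n, n, 2).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Ignore the zero distance of each point to itself.
    np.fill_diagonal(dist, np.inf)
    return dist.min()


pts = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.4]])
print(min_pairwise_distance(pts) >= 0.2)  # True: the closest pair is 0.3 apart
```

The full distance matrix needs memory quadratic in the number of points, which is fine for the few dozen points used here.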

Numpy.ma polyfit function for masked arrays crashes on integer input

The numpy polynomial fit function for masked arrays, ma.polyfit, crashes on integer input:
import numpy as np
import numpy.ma as ma
x = ma.arange(2)
y = ma.arange(2)
p1 = ma.polyfit(np.float32(x), y, deg=1)
p2 = ma.polyfit( x , y, deg=1)
The last line results in an error:
ValueError: data type <type 'numpy.int64'> not inexact
Why can't I fit data with integer x-values (it's no problem with the normal numpy.polyfit function), is this a (known) bug?
It is indeed a bug in numpy.ma: rcond (a parameter used to exclude some values) takes len(x)*np.finfo(x.dtype).eps as its default value, and an integer dtype such as np.int64 has no eps field (because an int does not have a relative precision). Passing rcond explicitly avoids the problem:
import numpy as np
import numpy.ma as ma
eps = np.finfo(np.float32).eps
x = ma.arange(2)
y = ma.arange(2)
p1 = ma.polyfit(np.float32(x), y, deg=1, rcond = len(x)*eps)
p2 = ma.polyfit( x , y, deg=1, rcond = len(x)*eps)
I've looked quickly through numpy's issues, and this bug does not seem to be listed there. It might be a good idea to raise a new issue.
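The root cause can be checked in isolation: `np.finfo` only describes floating-point types. The sketch below shows that check and, as an alternative to passing `rcond`, converts `x` to float before fitting. The variable name `finfo_accepts_int` is made up for this illustration.

```python
import numpy as np
import numpy.ma as ma

# np.finfo is defined for inexact (floating-point) types only,
# so asking for the machine epsilon of an integer type fails.
try:
    np.finfo(np.int64)
    finfo_accepts_int = True
except ValueError:
    finfo_accepts_int = False
print(finfo_accepts_int)

# Alternative workaround: convert x to float so that an epsilon exists.
x = ma.arange(2).astype(float)
y = ma.arange(2)
coeffs = ma.polyfit(x, y, deg=1)
print(coeffs)  # slope close to 1, intercept close to 0
```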

N-D interpolation for equally-spaced data

I'm trying to copy the Scipy Cookbook function:
from scipy import ogrid, sin, mgrid, ndimage, array
x,y = ogrid[-1:1:5j,-1:1:5j]
fvals = sin(x)*sin(y)
newx,newy = mgrid[-1:1:100j,-1:1:100j]
x0 = x[0,0]
y0 = y[0,0]
dx = x[1,0] - x0
dy = y[0,1] - y0
ivals = (newx - x0)/dx
jvals = (newy - y0)/dy
coords = array([ivals, jvals])
newf = ndimage.map_coordinates(fvals, coords)
by using my own function that has to work for many scenarios
import scipy
import numpy as np
"""N-D interpolation for equally-spaced data"""
x = np.c_[plist['modx']]
y = np.transpose(np.c_[plist['mody']])
pdb.set_trace()
#newx,newy = np.meshgrid(plist['newx'],plist['newy'])
newx,newy = scipy.mgrid[plist['modx'][0]:plist['modx'][-1]:-plist['remapto'],
                        plist['mody'][0]:plist['mody'][-1]:-plist['remapto']]
x0 = x[0,0]
y0 = y[0,0]
dx = x[1,0] - x0
dy = y[0,1] - y0
ivals = (newx - x0)/dx
jvals = (newy - y0)/dy
coords = scipy.array([ivals, jvals])
for i in np.arange(ivals.shape[0]):
    nvals[i] = scipy.ndimage.map_coordinates(ivals[i], coords)
I'm having difficulty getting this code to work properly. The problem areas are:
1.) Recreating this line: newx,newy = mgrid[-1:1:100j,-1:1:100j]. In my case I have a dictionary with the grid in vector form. I've tried to recreate this line using np.meshgrid but then I get an error on line coords = scipy.array([ivals, jvals]). I'm looking for some help in recreating this Cookbook function and making it more dynamic
any help is greatly appreciated.
/M
You should have a look at the documentation for map_coordinates. I don't see where the actual data you are trying to interpolate is in your code. What I mean is, presumably you have some data input which is a function of x and y; i.e. input = f(x,y) that you want to interpolate. In the first example you show, this is the array fvals. This should be your first argument to map_coordinates.
For example, if the data you are trying to interpolate is input, which should be a 2-dimensional array of shape (len(x),len(y)), then the interpolated data would be:
interpolated_data = map_coordinates(input, coords)
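When the grid is only available as 1-D vectors (as in the dictionary described in the question), the `coords` array can be built without `mgrid`. The following is a sketch under the assumption that both axes are equally spaced; `fractional_indices` is a hypothetical helper name, and `np.linspace` stands in for the vectors that would come from the dictionary.

```python
import numpy as np


def fractional_indices(axis, new_values):
    """Map positions on an equally spaced axis to fractional array indices."""
    axis = np.asarray(axis, dtype=float)
    step = axis[1] - axis[0]  # assumes constant spacing along the axis
    return (np.asarray(new_values, dtype=float) - axis[0]) / step


# Original grid vectors and the data sampled on them (as in the Cookbook example).
x = np.linspace(-1.0, 1.0, 5)
y = np.linspace(-1.0, 1.0, 5)
fvals = np.sin(x)[:, None] * np.sin(y)[None, :]

# Target grid; indexing="ij" gives the same axis order as mgrid.
newx, newy = np.meshgrid(np.linspace(-1.0, 1.0, 100),
                         np.linspace(-1.0, 1.0, 100), indexing="ij")
coords = np.array([fractional_indices(x, newx), fractional_indices(y, newy)])
print(coords.shape)  # (2, 100, 100)
```

With SciPy available, `scipy.ndimage.map_coordinates(fvals, coords)` would then give the interpolated values on the new grid, with the data array `fvals` as the first argument.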

Storing plot objects in a list

I asked this question yesterday about storing a plot within an object. I tried implementing the first approach (aware that I did not specify that I was using qplot() in my original question) and noticed that it did not work as expected.
library(ggplot2) # add ggplot2
string = "C:/example.pdf" # Setup pdf
pdf(string,height=6,width=9)
x_range <- range(1,50) # Specify Range
# Create a list to hold the plot objects.
pltList <- list()
pltList[]
for(i in 1 : 16){
  # Organise data
  y = (1:50) * i * 1000 # Get y col
  x = (1:50) # get x col
  y = log(y) # Use natural log
  # Regression
  lm.0 = lm(formula = y ~ x) # make linear model
  inter = summary(lm.0)$coefficients[1,1] # Get intercept
  slop = summary(lm.0)$coefficients[2,1] # Get slope
  # Make plot name
  pltName <- paste( 'a', i, sep = '' )
  # make plot object
  p <- qplot(
    x, y,
    xlab = "Radius [km]",
    ylab = "Services [log]",
    xlim = x_range,
    main = paste("Sample",i)
  ) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
  print(p)
  pltList[[pltName]] = p
}
# close the PDF file
dev.off()
I have used sample numbers in this case so the code runs if it is just copied. I did spend a few hours puzzling over this but I cannot figure out what is going wrong. It writes the first set of pdfs without problem, so I have 16 pdfs with the correct plots.
Then when I use this piece of code:
string = "C:/test_tabloid.pdf"
pdf(string, height = 11, width = 17)
grid.newpage()
pushViewport( viewport( layout = grid.layout(3, 3) ) )
vplayout <- function(x, y){viewport(layout.pos.row = x, layout.pos.col = y)}
counter = 1
# Page 1
for (i in 1:3){
  for (j in 1:3){
    pltName <- paste( 'a', counter, sep = '' )
    print( pltList[[pltName]], vp = vplayout(i,j) )
    counter = counter + 1
  }
}
dev.off()
the result I get is the last linear model line (abline) on every graph, but the data does not change. When I check my list of plots, it seems that all of them become overwritten by the most recent plot (with the exception of the abline object).
A less important secondary question was how to generate a multi-page pdf with several plots on each page, but the main goal of my code was to store the plots in a list that I could access at a later date.
Ok, so if your plot command is changed to
p <- qplot(data = data.frame(x = x, y = y),
x, y,
xlab = "Radius [km]",
ylab = "Services [log]",
xlim = x_range,
ylim = c(0,10),
main = paste("Sample",i)
) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
then everything works as expected. Here's what I suspect is happening (although Hadley could probably clarify things). When ggplot2 "saves" the data, what it actually does is save a data frame, and the names of the parameters. So for the command as I have given it, you get
> summary(pltList[["a1"]])
data: x, y [50x2]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
However, if you don't specify a data parameter in qplot, all the variables get evaluated in the current scope, because there is no attached (read: saved) data frame.
data: [0x0]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
So when the plot is generated the second time around, rather than using the original values, it uses the current values of x and y.
I think you should use the data argument in qplot, i.e., store your vectors in a data frame.
See Hadley's book, Section 4.4:
The restriction on the data is simple: it must be a data frame. This is restrictive, and unlike other graphics packages in R. Lattice functions can take an optional data frame or use vectors directly from the global environment. ...
The data is stored in the plot object as a copy, not a reference. This has two
important consequences: if your data changes, the plot will not; and ggplot2 objects are entirely self-contained so that they can be save()d to disk and later load()ed and plotted without needing anything else from that session.
There is a bug in your code concerning list subscripting. It should be
pltList[[pltName]]
not
pltList[pltName]
Note:
class(pltList[1])
[1] "list"
pltList[1] is a list containing the first element of pltList.
class(pltList[[1]])
[1] "ggplot"
pltList[[1]] is the first element of pltList.
For your second question: Multi-page pdfs are easy -- see help(pdf):
onefile: logical: if true (the default) allow multiple figures in one
file. If false, generate a file with name containing the
page number for each page. Defaults to ‘TRUE’.
For your main question, I don't understand if you want to store the plot inputs in a list for later processing, or the plot outputs. If it is the latter, I am not sure that plot() returns an object you can store and retrieve.
Another suggestion regarding your second question would be to use either Sweave or Brew as they will give you complete control over how you display your multi-page pdf.
Have a look at this related question.