Getting data from matplotlib axes object - matplotlib

I'm trying to determine what the data points are on a matplotlib axes. Is there an attribute I'm missing on the Axes object to get the x/y data values?
For example, say my code is passed a line plot, and I want to print out the x/y values that are plotted.

Your plot call will give you a lines.Line2D, which has the get_xdata(orig=True) and the get_ydata(orig=True) methods.
You can check axes.get_children() for Line2D instances.
Note that what you're doing sounds horrible from a software design point of view. You should rather implement something like a wrapper for plot that prints your raw data.
#JRichardSnape adds that iff your plot is only lines, you can use get_lines() rather than filtering the output of get_children().

Related

A plot describing the density of data points in 2D space in Julia

I am trying to use Julia to create a gif animation showing the change of density of data points with time (the data points are at the beginning concentrated at the center, and than spread to the sides, a little bit like 2D Gaussian of variance increasing with time). I have checked a catalogue of available kinds of plots in Julia:
http://docs.juliaplots.org/latest/examples/gr/
And I have tried contour plot, heatmap and 2D histogram. However, it seems that the grids of a heatmap or a contour plot have to be manually specified which is highly inconvenient. A 2D histogram serves the purpose better, but it's more related to the number of data points and when I want the plot to be more continuous by setting more bins, it cannot describe the density of data points well. Are there any good substitutes of the 2D density plot in matplotlib in Julia as the following?
https://python-graph-gallery.com/85-density-plot-with-matplotlib/
You use a package like KernelDensity to calculate the point density, then plot that. Here's an example
using StatsPlots, KernelDensity
a, b = randn(10000), randn(10000)
dens = kde((a,b))
plot(dens)
The philosophy, in the Plots package and other places in Julia, is that you generate the object you are interested in first, and then dispatch takes care of plotting it correctly.
Alternatively, you can always use PyPlot to plot anything using matplotlib directly.

Matplotlib's Figure and Axes explanation

I am really pretty new to matplotlib, though I know that it can be very powerful.
I've been reading number of tutorials and examples and it's a real hassle to understand how does matplotlib's Figure and Axes work. I am illustrating, what I am trying to understand, with the attached figure.
I know how to create a figure instance of certain size in inches. However, what bothers me is how can I create subplots and then axes, within each subplot, with relative coordinates (bottom=0,left=0,top=1,right=1) as illustrated.
So, for example I want to create a "parent" plot area (say (6in,10in)). Then, I want to create two subplot areas, each with size (3in,3in), with 1in space from the top, 2in space between the two vertical subplot areas and 1in from bottom. Then, 1in space on the left and 2in space on the write. In the same time, I would like to be able to get the coordinates of the subplot areas with respect to the main plot area.
Then, inside the first subplot area, I'd like to create 2 axis instances, with Axis 1, having coordinates with respect to Subplot Area1 (0.1,0.7,0.7,0.2) and Axes 2 (0.1,0.2,0.7,0.5). And then of course I'd like to be able to plot on these axes e.g., ax1.plot()....
If you could provide a sample code to achieve that, then I can study it.
Your help will be very much appreciated!
a subplot and an Axes object are really the same thing. There is not really a "subplot" as you describe it in matplotlib. You can just create your three Axes objects using gridspec without the need to put them in your "subplots".
There are a few different ways to create Axes instances within your figure.
fig.add_axes will create an Axes instance at the position given to it (you give it [left,bottom,width,height] in figure coordinates (i.e. 0,0 is bottom left, 1,1 is top right).
fig.add_subplot will also create an Axes instance. In this case, rather than giving it a rectangle to be created in, you give it the number of rows and columns of subplots you would like, and then the plot_number, where plot_number starts at 1, increments across rows first and has a maximum of nrows * ncols.
For example, to create the top-left Axes in a grid of 2 row and 2 columns, you could do the following:
fig.add_subplot(2,2,1)
or the shorthand
fig.add_subplot(221)
There are some more customisable ways to create Axes as well, for example gridspec and subplot2grid which allow for easy creation of many subplots of different shapes and sizes.

Interpolating data onto a line of points

I have some irregularly spaced data and need to analyze it. I can successfully interpolate this data onto a regular grid using mlab.griddata (or rather, the natgrid implementation of it). This allows me to use pcolormesh and contour to generate plots, extract levels, etc. Using plot.contour, I then extract a certain level using get_paths from the contour CS.collections().
Now, what I'd like to do is then, with my original irregularly spaced data, interpolate some quantities onto this specific contour line (i.e., NOT onto a regular grid). The similarly named griddata function from Scipy allows for this behavior, and it almost works. However, I find that as I increase the number of original points, I can get odd erratic behavior in the interpolation. I'm wondering if there's a way around this, i.e., another way to interpolate irregularly spaced (or regularly spaced data for that matter, since I can use my regularly spaced data from mlab.griddata) onto a specific line.
Let me show some numerical examples of what I'm talking about. Take a look at this figure:
The top left shows my data as points, and the line shows an extracted level of level=0 from some data D that I have at those points (x,y) [note, I have data 'D', 'Energy', and 'Pressure', all defined in this (x,y) space]. Once I have this curve, I can plot the interpolated quantities of D, Energy, and Pressure onto my specific line. First, note the plot of D (middle, right). It should be zero at all points, but it's not quite zero at all points. The likely cause of this is that the line that corresponds to the 0 level is generated from a uniform set of points that came from mlab.griddata, whereas the plot of 'D' is generated from my ORIGINAL data interpolated onto that level curve. You can also see some unphysical wiggles in 'Energy' and 'Pressure'.
Okay, seems easy enough, right? Maybe I should just get more original data points along my level=0 curve. Getting some more of these points, I then generate the following plots:
First look at the top left. You can see that I've sampled the hell out of the (x,y) space in the vicinity of my level=0 curve. Furthermore, you can see that my new "D" plot (middle, right) now correctly interpolates to zero in the region that it originally didn't. But now I get some wiggles at the start of the curve, as well as getting some other wiggles in the 'Energy' and 'Pressure' in this space! It is far from obvious to me that this should occur, since my original data points are still there and I've only supplemented additional points. Furthermore, some regions where my interpolation is going bad aren't even near the points that I added in the second run -- they are exclusively neighbored by my original points.
So this brings me to my original question. I'm worried that the interpolation that produces the 'Energy', 'D', and 'Pressure' curves is not working correctly (this is scigrid's griddata). Mlab's griddata only interpolates to a regular grid, whereas I want to interpolate to this specific line shown in the top left plot. What's another way for me to do this?
Thanks for your time!
After posting this, I decided to try scipy.interpolate.SmoothBivariateSpline, which produced the following result:
You can now see that my line is smoothed, so it seems like this will work. I'll mark this as the answer unless someone posts something soon that hints that there may be an even better solution.
Edit: As requested, below is some of the code used to generate these plots. I don't have a minimally working example, and the above plots were generated in a larger framework of code, but I'll write the important parts schematically below with comments.
# x,y,z are lists of data where the first point is x[0],y[0],z[0], and so on
minx=min(x)
maxx=max(x)
miny=min(y)
maxy=max(y)
# convert to numpy arrays
x=np.array(x)
y=np.array(y)
z=np.array(z)
# here we are creating a fine grid to interpolate the data onto
xi=np.linspace(minx,maxx,100)
yi=np.linspace(miny,maxy,100)
# here we interpolate our data from the original x,y,z unstructured grid to the new
# fine, regular grid in xi,yi, returning the values zi
zi=griddata(x,y,z,xi,yi)
# now let's do some plotting
plt.figure()
# returns the CS contour object, from which we'll be able to get the path for the
# level=0 curve
CS=plt.contour(x,y,z,levels=[0])
# can plot the original data if we want
plt.scatter(x,y,alpha=0.5,marker='x')
# now let's get the level=0 curve
for c in CS.collections:
data=c.get_paths()[0].vertices
# lineX,lineY are simply the x,y coordinates for our level=0 curve, expressed as arrays
lineX=data[:,0]
lineY=data[:,1]
# so it's easy to plot this too
plt.plot(lineX,lineY)
# now what to do if we want to interpolate some other data we have, say z2
# (also at our original x,y positions), onto
# this level=0 curve?
# well, first I tried using scipy.interpolate.griddata == scigrid like so
origdata=np.transpose(np.vstack((x,y))) # just organizing this data like the
# scigrid routine expects
lineZ2=scigrid(origdata,z2,data,method='linear')
# plotting the above curve (as plt.plot(lineZ2)) gave me really bad results, so
# trying a spline approach
Z2spline=SmoothBivariateSpline(x,y,z2)
# the above creates a spline object on our original data. notice we haven't EVALUATED
# it anywhere yet (we'll want to evaluate it on our level curve)
Z2Line=[]
# here we evaluate the spline along all our points on the level curve, and store the
# result as a new list
for i in range(0,len(lineX)):
Z2Line.append(Z2spline(lineX[i],lineY[i])[0][0]) # the [0][0] is just to get the
# value, which is enclosed in
# some array structure for some
# reason otherwise
# you can then easily plot this
plt.plot(Z2Line)
Hope this helps someone!

How can I get and set the position of a draggable legend in matplotlib

I'm trying to get and set the position of a draggable legend in matplotlib. My application consists of an interactive GUI, which has a redraw/plot function that should perform the follow steps:
save the position of the current legend.
clear the current axes and perform various plotting operations, which may or may add labels to their plots.
build a new draggable legend (ax.legend().draggable()) and restore the old position of the legend.
In between these steps the user is free to drag the legend around, and the goal is to persist the legend position when the plots are redrawn.
My first approach was to use oldpos = legend.get_bbox_to_anchor() and legend.set_bbox_to_anchor(oldpos) in steps 1 and 3. However this causes to move the legend completely off the visible area.
Note that I have to use ax.legend() and cannot use fig.legend(lines, labels), since step 2 is completely decoupled, i.e., I don't know anything about lines and labels in step 3. According to answers to the question How to position and align a matplotlib figure legend? there seems to be a difference between these two possibilities regarding axes or figure coordinates. Obviously my problem calls for figure coordinates, but I haven't fully understood how to convert the bbox to a "bbox in figure coordinates".
The even more severe problem I just realized is that apparently legend.get_bbox_to_anchor() always seems to return the same values irrespective of the drag position. So maybe the anchor can only be (ab-)used to manipulate the position of static legends? Is there another/proper way to save and restore the position of a draggable legend?
By looking at the implementation of Legend I found out that there is an undocumented property _loc, which exactly does what I want. My solution now looks astonishingly simple:
oldLegPos = ax.get_legend()._loc
# perform all plotting operations...
legend = ax.legend().draggable()
legend._loc = oldLegPos
It looks like _loc automatically stores figure coordinates, since I do not have to convert the coordinates in any way (eg. when the plotting operations completely change the axes ranges/coordinates).

Put pcolormesh and contour onto same grid?

I'm trying to display 2D data with axis labels using both contour and pcolormesh. As has been noted on the matplotlib user list, these functions obey different conventions: pcolormesh expects the x and y values to specify the corners of the individual pixels, while contour expects the centers of the pixels.
What is the best way to make these behave consistently?
One option I've considered is to make a "centers-to-edges" function, assuming evenly spaced data:
def centers_to_edges(arr):
dx = arr[1]-arr[0]
newarr = np.linspace(arr.min()-dx/2,arr.max()+dx/2,arr.size+1)
return newarr
Another option is to use imshow with the extent keyword set.
The first approach doesn't play nicely with 2D axes (e.g., as created by meshgrid or indices) and the second discards the axis numbers entirely
Your data is a regular mesh? If it doesn't, you can use griddata() to obtain it. I think that if your data is too big, a sub-sampling or regularization always is possible. If the data is too big, maybe your output image always will be small compared with it and you can exploit this.
If you use imshow() with "extent" and "interpolation='nearest'", you will see that the data is cell-centered, and extent provided the lower edges of cells (corners). On the other hand, contour assumes that the data is cell-centered, and X,Y must be the center of cells. So, you need to be care about the input domain for contour. The trivial example is:
x = np.arange(-10,10,1)
X,Y = np.meshgrid(x,x)
P = X**2+Y**2
imshow(P,extent=[-10,10,-10,10],interpolation='nearest',origin='lower')
contour(X+0.5,Y+0.5,P,20,colors='k')
My tests told me that pcolormesh() is a very slow routine, and I always try to avoid it. griddata and imshow() always is a good choose for me.