This seems like a fairly straightforward problem, but I'm new to Python and I'm struggling to resolve it. I've got a scatter plot / heatmap generated from two numpy arrays (about 25,000 pieces of information). The y-axis is taken directly from an array and the x-axis is generated from a simple subtraction operation on two arrays.
What I need to do now is slice up the data so that I can work with a selection that falls within certain parameters on the plot. For example, I need to extract all the points that fall within the parallelogram:
I'm able to cut out a rectangle using simple inequalities (see indexing idx_c, idx_h and idx, below) but I really need a way to select the points using a more complex geometry. It looks like this slicing can be done by specifying the vertices of the polygon. This is about the closest I can find to a solution, but I can't figure out how to implement it:
http://matplotlib.org/api/nxutils_api.html#matplotlib.nxutils.points_inside_poly
Ideally, I really need something akin to the indexing below, i.e. something like colorjh[idx]. Ultimately I'll have to plot different quantities (for example, colorjh[idx] vs colorhk[idx]), so the indexing needs to be transferable to all the arrays in the dataset (lots of arrays). Maybe that's obvious, but I would imagine there are solutions that might not be as flexible. In other words, I'll use this plot to select the points I'm interested in, and then I'll need those indices to work for other arrays from the same table.
Here's the code I'm working with:
import numpy as np
from numpy import ndarray
import matplotlib.pyplot as plt
import matplotlib
import atpy
from pylab import *
twomass = atpy.Table()
twomass.read('/IRSA_downloads/2MASS_GCbox1.tbl')
hmag = list([twomass['h_m']])
jmag = list([twomass['j_m']])
kmag = list([twomass['k_m']])
hmag = np.array(hmag)
jmag = np.array(jmag)
kmag = np.array(kmag)
colorjh = np.array(jmag - hmag)
colorhk = np.array(hmag - kmag)
idx_c = (colorjh > -1.01) & (colorjh < 6) # manipulate x-axis slicing here
idx_h = (hmag > 0) & (hmag < 17.01) #manipulate y-axis slicing here
idx = idx_c & idx_h
# heatmap below
heatmap, xedges, yedges = np.histogram2d(hmag[idx], colorjh[idx], bins=200)
extent = [yedges[0], yedges[-1], xedges[-1], xedges[0]]
plt.clf()
plt.imshow(heatmap, extent=extent, aspect=0.65)
plt.xlabel('Color(J-H)', fontsize=15) #adjust axis labels here
plt.ylabel('Magnitude (H)', fontsize=15)
plt.gca().invert_yaxis() #I put this in to recover familiar axis orientation
plt.legend(loc=2)
plt.title('CMD for Galactic Center (2MASS)', fontsize=20)
plt.grid(True)
colorbar()
plt.show()
Like I say, I'm new to Python, so the less jargon-y the explanation the more likely I'll be able to implement it. Thanks for any help y'all can provide.
a = np.random.randint(0,10,(100,100))
x = np.linspace(-1,5.5,100) # tried to mimic your data boundaries
y = np.linspace(8,16,100)
xx, yy = np.meshgrid(x,y)
m = np.all([yy > xx**2, yy < 10* xx, xx < 4, yy > 9], axis = 0)
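The mask above combines inequalities on a regular grid. For scattered points and an arbitrary polygon, one option is `matplotlib.path.Path.contains_points`, which returns a boolean array that can be used just like `idx`. The following is only a sketch: the data are random stand-ins for the 2MASS columns, and the parallelogram vertices are made up for illustration.

```python
import numpy as np
from matplotlib.path import Path

# Stand-in data (assumption): random values in roughly the ranges of the plot.
rng = np.random.default_rng(0)
colorjh = rng.uniform(-1.0, 6.0, 25000)   # x-axis values
hmag = rng.uniform(0.0, 17.0, 25000)      # y-axis values

# Hypothetical parallelogram, vertices listed in order as (x, y) pairs.
vertices = [(1.0, 8.0), (3.0, 8.0), (4.0, 14.0), (2.0, 14.0)]
polygon = Path(vertices)

# contains_points expects an (N, 2) array and returns a boolean array of length N.
points = np.column_stack([colorjh, hmag])
idx_poly = polygon.contains_points(points)

# The mask indexes any array from the same table, just like idx.
selected_color = colorjh[idx_poly]
selected_hmag = hmag[idx_poly]
```

Because `idx_poly` is an ordinary boolean mask, it can be combined with the rectangular masks (for example `idx & idx_poly`) and applied to `colorhk` or any other column of the same length. Note that in the original code the arrays are built with an extra leading dimension, so they may need flattening with `.ravel()` before being stacked into `points`.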
I generate plots like below:
from pylab import *
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
import matplotlib.ticker as ticker
rcParams['axes.linewidth'] = 2 # set the value globally
rcParams['font.size'] = 16# set the value globally
rcParams['font.family'] = ['DejaVu Sans']
rcParams['mathtext.fontset'] = 'stix'
rcParams['legend.fontsize'] = 24
rcParams['axes.prop_cycle'] = cycler(color=['grey','b','g','r','orange'])
rc('lines', linewidth=2, linestyle='-',marker='o')
rcParams['axes.xmargin'] = 0
rcParams['axes.ymargin'] = 0
t = arange(0,21,1)
v = 2.0
s = v*t
plt.figure(figsize=(12, 4))
plt.plot(t,s,label=r'$s=%1.1f\cdot t$'%v)
plt.title(r'Distance versus time $s=v\cdot t$')
plt.xlabel('Time $t$, s')
plt.ylabel('Distance $s$, m')
plt.autoscale(enable=True, axis='both', tight=None)
legend(loc='best')
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
plt.grid()
plt.show()
When I change t = arange(0,21,1) to, for example, t = arange(0,20,1), so that the maximum value on the x-axis is 19.0, the maximum value disappears from the x-axis. The same happens on the y-axis, of course.
My question is: how can I force matplotlib to always produce plots where the maximum values sit right at the ends of the axes, as my use case requires? Or is there at least an option for it?
Image from a program I wrote in Fortran some years ago
Matplotlib is more efficient, so I use it, but there should be an option like that (the picture above).
As things stand, I can always check the max and min in a text window or take additional steps to make sure of them, but I would like to read them from the axes. Is there such a possibility in matplotlib? If not, I will close the post.
The axes I have in mind, more or less
I see two ways to solve the problem.
Set the axes automatic limit mode to round numbers
In the rcParams you can do this with
rcParams['axes.autolimit_mode'] = 'round_numbers'
and remove the manual axis limits set with min and max:
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
This produces the image below. The axes now end at the nearest "round numbers" rather than at the exact extremes, but the reader can still roughly see the data range. If you need the exact values displayed, see the second solution, which cannot be set directly through rcParams.
or – Manually generate axes ticks
This solution means explicitly asking for a given number of ticks. I guess there is a way to automate it depending on the axes size, etc., but if your graphs are more or less always the same size, you can pick a fixed number of ticks manually. This can be done with
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
plt.xticks(np.linspace(t.min(), t.max(), 7)) # arbitrarily chosen
plt.yticks(np.linspace(s.min(), s.max(), 5)) # arbitrarily chosen
This generates the image below, quite similar to your example image.
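For comparison, here is a self-contained sketch of both options on the same data. It uses the object-oriented interface and `plt.rc_context` so the rcParams change stays local. That is a choice made for this example, not something the question requires. The tick counts 7 and 5 are arbitrary, as above.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0, 20, 1)   # maximum is 19, which the default ticks skip
s = 2.0 * t

# Option 1: let autoscaling extend the limits to round tick values.
with plt.rc_context({"axes.autolimit_mode": "round_numbers",
                     "axes.xmargin": 0, "axes.ymargin": 0}):
    fig1, ax1 = plt.subplots(figsize=(12, 4))
    ax1.plot(t, s)
    round_xlim = ax1.get_xlim()   # limits now enclose the data range

# Option 2: clamp the limits to the data and place ticks that include both ends.
fig2, ax2 = plt.subplots(figsize=(12, 4))
ax2.plot(t, s)
ax2.set_xlim(t.min(), t.max())
ax2.set_ylim(s.min(), s.max())
ax2.set_xticks(np.linspace(t.min(), t.max(), 7))   # arbitrary tick count
ax2.set_yticks(np.linspace(s.min(), s.max(), 5))   # arbitrary tick count
exact_xticks = ax2.get_xticks()
```

Call `plt.show()` at the end to display both figures. With option 2 the tick values are generally not round numbers, so a tick formatter may be needed to keep the labels short.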
I'm trying to set xlimits and keep the margins.
In a simplified code, the dataset contains 50 values. When plotting the whole data set, it is fine. However, I only want to plot values 20-40. The plot starts and ends without having any margins.
How do I plot values 20-40 but keep the margins?
Online I found two ways to play with the margin/padding
1) plt.tight_layout(pad=1.08, h_pad=None, w_pad=None, rect=None)
2) ax1.margins(0.05)
Both, however, do not seem to work when using xlimits.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(1, 200, 50)
y = np.random.random(len(x))
fig_1 = plt.figure(figsize=(8, 4))
ax1 = plt.subplot(1,1,1)
ax1.plot(x, y)
ax1.set_xlim(x[19], x[40])
# ax1.plot(x[19:40], y[19:40])
# would create exactly the plot I want. But it is not the solution I am looking for.
# I cannot change/slice the data. I want to change the figure.
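One possible approach, offered only as a sketch, is to compute the padding by hand and fold it into the limits, since explicitly set limits are not widened by `margins`. The helper name `set_xlim_with_margin` and the 5% default are assumptions made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt


def set_xlim_with_margin(ax, left, right, margin=0.05):
    """Set x limits to [left, right], widened by margin * span on each side."""
    pad = margin * (right - left)
    ax.set_xlim(left - pad, right + pad)
    return ax.get_xlim()


x = np.linspace(1, 200, 50)
y = np.random.random(len(x))

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, y)   # the full data set is still plotted, nothing is sliced
limits = set_xlim_with_margin(ax, x[19], x[39])   # values 20 to 40 plus padding
```

Because the whole data set is still drawn, the line continues through the padded region to the edge of the axes instead of stopping at the 20th and 40th values. Only slicing the data would give a true gap.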
I have a Series that I would like to plot as a bar chart: pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts()
Since I have many bars I only want to display some (equidistant) ticks.
However, unless I actively work against it, pyplot will print the wrong labels. E.g. if I leave out set_xticklabels in the code below I get
where every element from the index is taken and just displayed with the specified distance.
This code does what I want:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
s = pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts().sort_index()
mi,ma = min(s.index), max(s.index)
s = s.reindex(range(mi,ma+1,1), fill_value=0)
distance = 10
a = s.plot(kind='bar')
condition = lambda t: int(t[1].get_text()) % distance == 0
ticks_,labels_=zip(*filter(condition, zip(a.get_xticks(), a.get_xticklabels())))
a.set_xticks(ticks_)
a.set_xticklabels(labels_)
plt.show()
But I still feel like I'm being unnecessarily clever here. Am I missing a function? Is this the best way of doing that?
Consider not using a pandas bar plot if you intend to plot numeric values, because pandas bar plots are categorical in nature.
If you use a matplotlib bar plot instead, which is numeric in nature, there is no need to tinker with the ticks at all.
s = pd.Series([-4,2, 3,3, 4,5,9,20]).value_counts().sort_index()
plt.bar(s.index, s)
I think you overcomplicated it. You can simply use the following. You just need to find the relationship between the ticks and the ticklabels.
a = s.plot(kind='bar')
xticks = np.arange(0, ma+1, 10)  # tick range follows the index values (mi, ma from the question), not the counts
plt.xticks(xticks + abs(mi), xticks)
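The relationship between ticks and tick labels can be made explicit: a pandas bar plot draws bar number i at x position i, so the position of a value is its offset in the index, not the value itself. The sketch below computes positions and labels without touching the plot. The helper name `sparse_bar_ticks` is invented for this example.

```python
import pandas as pd


def sparse_bar_ticks(index, step):
    """Return (positions, labels) for the index values that are multiples of step.

    Positions are offsets into the index, which is where a pandas bar plot
    places the corresponding bars.
    """
    pairs = [(pos, val) for pos, val in enumerate(index) if val % step == 0]
    positions = [pos for pos, _ in pairs]
    labels = [val for _, val in pairs]
    return positions, labels


s = pd.Series([-4, 2, 3, 3, 4, 5, 9, 20]).value_counts().sort_index()
s = s.reindex(range(s.index.min(), s.index.max() + 1), fill_value=0)

positions, labels = sparse_bar_ticks(s.index, 10)
# With a = s.plot(kind='bar'), these would be applied as
# a.set_xticks(positions) followed by a.set_xticklabels(labels).
```

For this series the index runs from -4 to 20, so the value 0 sits at position 4, which is the same offset the `abs(mi)` shift above accounts for.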
I'd like to make a streamplot with lines that don't stop when they get too close together. I'd rather each streamline be calculated in both directions until it hits the edge of the window. The result is there'd be some areas where they'd all jumble up. But that's what I want.
Is there any way to do this in matplotlib? If not, is there another tool I can use for this that could interface with python/numpy?
import numpy as np
import matplotlib.pyplot as plt
Y,X = np.mgrid[-10:10:.01, -10:10:.01]
U, V = Y**2, X**2
plt.streamplot(X,Y, U,V, density=1)
plt.show()
Ok, I've figured out I can get mostly what I want by turning up the density a lot and using custom start points. I'm still interested if there is a better or alternate way to do this.
Here's my solution. Doesn't it look so much better?
import numpy as np
import matplotlib.pyplot as plt
Y,X = np.mgrid[-10:10:.01, -10:10:.01]
y,x = Y[:,0], X[0,:]
U, V = Y**2, X**2
# list() is needed on Python 3, where zip returns an iterator
stream_points = np.array(list(zip(np.arange(-9,9,.5), -np.arange(-9,9,.5))))
plt.streamplot(x,y, U,V, start_points=stream_points, density=35)
plt.show()
Edit: By the way, there seems to be some bug in streamplot such that start_points keyword only works if you use 1d arrays for the grid data. See Python Matplotlib Streamplot providing start points
As of Matplotlib version 3.6.0, an optional parameter broken_streamlines has been added for disabling streamline breaks.
Adding it to your snippet produces the following result:
import numpy as np
import matplotlib.pyplot as plt
Y,X = np.mgrid[-10:10:.01, -10:10:.01]
U, V = Y**2, X**2
plt.streamplot(X,Y, U,V, density=1, broken_streamlines=False)
plt.show()
Note
This parameter just extends the streamlines which were originally drawn (as in the question). This means that the streamlines in the modified plot above are much more uneven than the result obtained in the other answer, with custom start_points. The density of streamlines on any stream plot does not represent the magnitude of U or V at that point, only their direction. See the documentation for the density parameter of matplotlib.pyplot.streamplot for more details on how streamline start points are chosen by default, when they aren't specified by the optional start_points parameter.
For accurate streamline density, consider using matplotlib.pyplot.contour, but be aware that contour does not show arrows.
Choosing start points automatically
It may not always be easy to choose a set of good starting points automatically. However, if you know the streamfunction corresponding to the flow you wish to plot, you can use matplotlib.pyplot.contour to produce a contour plot (which can be hidden from the output), and then extract a suitable starting point from each of the plotted contours.
In the following example, psi_expression is the streamfunction corresponding to the flow. When modifying this example for your own needs, make sure to update both the line defining psi_expression, as well as the one defining U and V. Ensure these both correspond to the same flow.
The density of the streamlines can be altered by changing contour_levels. Here, the contours are uniformly distributed.
import numpy as np
import matplotlib.pyplot as plt
import sympy as sy
x, y = sy.symbols("x y")
psi_expression = x**3 - y**3
psi_function = sy.lambdify((x, y), psi_expression)
Y, X = np.mgrid[-10:10:0.01, -10:10:0.01]
psi_evaluated = psi_function(X, Y)
U, V = Y**2, X**2
contour_levels = np.linspace(np.amin(psi_evaluated), np.amax(psi_evaluated), 30)
# Draw a temporary contour plot.
temp_figure = plt.figure()
contour_plot = plt.contour(X, Y, psi_evaluated, contour_levels)
plt.close(temp_figure)
points_list = []
# Iterate over each contour.
for collection in contour_plot.collections:
    # Iterate over each segment in this contour.
    for path in collection.get_paths():
        middle_point = path.vertices[len(path.vertices) // 2]
        points_list.append(middle_point)
# Reshape python list into numpy array of coords.
stream_points = np.reshape(np.array(points_list), (-1, 2))
plt.streamplot(X, Y, U, V, density=1, start_points=stream_points, broken_streamlines=False)
plt.show()
I'm currently writing line and contour plotting functions for my PyQuante quantum chemistry package using matplotlib. I have some great functions that evaluate basis sets along a (npts,3) array of points, e.g.
from somewhere import basisset, line
bfs = basisset(h2) # Generate a basis set
points = line((0,0,-5),(0,0,5)) # Create a line in 3d space
bfmesh = bfs.mesh(points)
for i in range(bfmesh.shape[1]):
    plot(bfmesh[:,i])
This is fast because it evaluates all of the basis functions at once, and I got some great help from stackoverflow here and here to make them extra-nice.
I would now like to update this to do contour plotting as well. The slow way I've done this in the past is to create two one-d vectors using linspace(), mesh these into a 2D grid using meshgrid(), and then iterating over all xyz points and evaluating each one:
f = np.empty((50,50),dtype=float)
xvals = np.linspace(0,10)
yvals = np.linspace(0,20)
z = 0
for i, x in enumerate(xvals):
    for j, y in enumerate(yvals):
        f[j,i] = bf(x,y,z)  # rows follow y, columns follow x, matching meshgrid
X,Y = np.meshgrid(xvals,yvals)
contourplot(X,Y,f)
(this isn't real code -- may have done something dumb)
What I would like to do is to generate the mesh in more or less the same way I do in the contour plot example, "unravel" it to a (npts,3) list of points, evaluate the basis functions using my new fast routines, then "re-ravel" it back to X,Y matrices for plotting with contourplot.
The problem is that I don't have anything that I can simply call .ravel() on: I only have the 1D meshes xvals and yvals, the 2D versions X, Y, and the single z value.
Can anyone think of a nice, pythonic way to do this?
If you can express f as a function of X and Y, you could avoid the Python for-loops this way:
import matplotlib.pyplot as plt
import numpy as np
def bf(x, y):
    return np.sin(np.sqrt(x**2+y**2))
xvals = np.linspace(0,10)
yvals = np.linspace(0,20)
X, Y = np.meshgrid(xvals,yvals)
f = bf(X,Y)
plt.contour(X,Y,f)
plt.show()
This yields a contour plot of f over the grid.
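If the fast routine really does need an (npts, 3) array of points, as in the question, the "unravel, evaluate, re-ravel" round trip can be sketched as below. Here `fake_basis_function` is a hypothetical stand-in for the PyQuante routine, and `evaluate_on_grid` is a helper name invented for this example. The only assumption about the real routine is that it maps an (npts, 3) array to npts values.

```python
import numpy as np


def evaluate_on_grid(func, xvals, yvals, z=0.0):
    """Evaluate func, which maps (npts, 3) points to (npts,) values, on an x-y grid."""
    X, Y = np.meshgrid(xvals, yvals)
    # Flatten the grid into a list of xyz points, one row per grid node.
    points = np.column_stack([X.ravel(), Y.ravel(), np.full(X.size, z)])
    values = func(points)
    # Restore the 2D layout so the result lines up with X and Y.
    return X, Y, values.reshape(X.shape)


def fake_basis_function(points):
    """Hypothetical stand-in for a fast routine that takes (npts, 3) points."""
    return np.exp(-0.1 * np.sum(points**2, axis=1))


xvals = np.linspace(0, 10, 50)
yvals = np.linspace(0, 20, 40)
X, Y, f = evaluate_on_grid(fake_basis_function, xvals, yvals)
```

The result can be passed straight to `plt.contour(X, Y, f)`. If the routine returns one column per basis function, with shape (npts, nbf), reshaping to `X.shape + (nbf,)` should keep the same grid layout, though that depends on how the real routine orders its output.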