How to create volume from point cloud in spherical coordinates? - numpy

I have two sets of discrete points in spherical coordinates, each representing the top and bottom surfaces of an object.
I am trying to create a volume from these points so I can separate the points that lie inside the object from those outside. Any suggestions on where to look or which library to use?
The blue and red points represent the top and bottom surfaces. The red points are generated by shifting the top surface radially downwards by some constant radius.

If I am right, the blue and red surfaces are meshed (and watertight). So for every point you can draw the line from the sphere center and look for intersections with the mesh. This is done by finding the two triangles that the line pierces (which can be done by looking at the angular coordinates only, using a point-in-triangle formula), then finding the intersection points. It is then an easy matter to classify the point as before the red surface, after the blue one, or in between.
An exhaustive search for the triangles can be costly. You can speed it up, for instance, with a hierarchy of bounding boxes or a similar device.
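For illustration, here is a minimal sketch of that triangle-piercing test, assuming each surface is given as (r, theta, phi) vertex arrays sharing one triangulation tri (an n x 3 array of vertex indices); all names are hypothetical, and the angular coordinates are treated as a flat plane, which ignores periodicity:
import numpy as np
def barycentric(p, a, b, c):
    # Barycentric coordinates of the 2D point p in the triangle (a, b, c)
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return 1.0 - v - w, v, w
def radius_at(point_angles, vertex_angles, vertex_r, tri):
    # Find the triangle pierced by the ray through point_angles = (theta, phi)
    # and interpolate the surface radius there; None if no triangle is hit.
    for ia, ib, ic in tri:  # exhaustive search; replace with a spatial index if too slow
        u, v, w = barycentric(point_angles, vertex_angles[ia],
                              vertex_angles[ib], vertex_angles[ic])
        if u >= 0 and v >= 0 and w >= 0:  # the direction falls inside this triangle
            return u*vertex_r[ia] + v*vertex_r[ib] + w*vertex_r[ic]
    return None
A test point (r, theta, phi) is then inside the volume when radius_at(...) on the red (bottom) surface is below r and radius_at(...) on the blue (top) surface is above it.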

Here is a custom tinkered method which may work, on the condition that the average distance between points on the original surface is much smaller than both the thickness of the volume and the irregularities of the surface contour. In other words, that there are a lot of points describing the blue surface.
import matplotlib.pylab as plt
import numpy as np
from scipy.spatial import KDTree
# Generate a test surface:
theta = np.linspace(3, 1, 38)
phi = np.zeros_like(theta)
r = 1 + 0.1*np.sin(8*theta)
surface_points = np.stack((r, theta, phi), axis=1) # n x 3 array
# Generate test points:
x_span, y_span = np.linspace(-1, 0.7, 26), np.linspace(0.1, 1.2, 22)
x_grid, y_grid = np.meshgrid(x_span, y_span)
r_test = np.sqrt(x_grid**2 + y_grid**2).ravel()
theta_test = np.arctan2(y_grid, x_grid).ravel()
phi_test = np.zeros_like(theta_test)
test_points = np.stack((r_test, theta_test, phi_test), axis=1) # n x 3 array
# Determine if the test points are in the volume:
volume_thickness = 0.2 # Distance between the two surfaces
angle_threshold = 0.05 # Angular threshold to determine, for a point,
# if the line from the origin to the point
# goes through the surface
# Get the nearest point (this replaces a proper interpolation):
get_nearest_points = KDTree(surface_points[:, 1:]) # keep only the angles
# This is based on the Cartesian distance,
# and therefore not entirely valid for the angle between points on a sphere.
# It could be better to project the points onto a unit sphere, and convert
# all coordinates to the Cartesian frame in order to do the nearest-point search...
distance, idx = get_nearest_points.query(test_points[:, 1:])
go_through = distance < angle_threshold
nearest_surface_radius = surface_points[idx, 0]
is_in_volume = (go_through) & (nearest_surface_radius > test_points[:, 0]) \
& (nearest_surface_radius - volume_thickness < test_points[:, 0])
not_in_volume = np.logical_not(is_in_volume)
# Graph;
plt.figure(figsize=(10, 7))
plt.polar(test_points[is_in_volume, 1], test_points[is_in_volume, 0], '.r',
label='in volume');
plt.polar(test_points[not_in_volume, 1], test_points[not_in_volume, 0], '.k',
label='not in volume', alpha=0.2);
plt.polar(test_points[go_through, 1], test_points[go_through, 0], '.g',
label='go through', alpha=0.2);
plt.polar(surface_points[:, 1], surface_points[:, 0], '.b',
label='surface');
plt.xlim([0, np.pi]); plt.grid(False);plt.legend();
The resulting graph, for the 2D case, is:
The idea is to look, for each test point, for the nearest point on the surface, considering only the direction and not the radius. Once this "same direction" point is found, it is possible to test both whether the point lies inside the volume along the radial direction (volume_thickness) and whether it is close enough to the surface, using the parameter angle_threshold.
I think it would be better to mesh the (non-convex) blue surface and perform a proper interpolation, but I don't know of a SciPy method for this.
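For the 2D test case above (phi is zero everywhere), a rough sketch of that interpolation idea, reusing the arrays from the snippet, could be based on np.interp; a full 3D surface would need something like scipy.interpolate.LinearNDInterpolator over (theta, phi) instead:
order = np.argsort(surface_points[:, 1])  # np.interp needs increasing theta
r_surf = np.interp(test_points[:, 1],
                   surface_points[order, 1], surface_points[order, 0],
                   left=np.nan, right=np.nan)  # NaN outside the angular span
is_in_volume_interp = (test_points[:, 0] < r_surf) & \
                      (test_points[:, 0] > r_surf - volume_thickness)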

Related

Stratigraphic column in matplotlib

My goal is to create a stratigraphic column (colored stacked rectangles) using matplotlib like the example below.
Data is in this format:
depth = [1,2,3,4,5,6,7,8,9,10] #depth (feet) below ground surface
lithotype = [4,4,4,5,5,5,6,6,6,2] #lithology type. 4 = clay, 6 = sand, 2 = silt
I tried matplotlib.patches.Rectangle but it's cumbersome. Wondering if someone has another suggestion.
IMHO, using Rectangle is neither difficult nor cumbersome.
from numpy import ones
from matplotlib.pyplot import show, subplots
from matplotlib.cm import get_cmap
from matplotlib.patches import Rectangle as r
# a simplification is to use, for the lithology types, a qualitative colormap
# here I use Paired, but other qualitative colormaps are displayed in
# https://matplotlib.org/stable/tutorials/colors/colormaps.html#qualitative
qcm = get_cmap('Paired')
# the data, augmented with type descriptions
# note that depths start from zero
depth = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # depth (feet) below ground surface
lithotype = [4, 4, 4, 5, 5, 5, 6, 1, 6, 2] # lithology type.
types = {1:'swiss cheese', 2:'silt', 4:'clay', 5:'silty sand', 6:'sand'}
# prepare the figure
fig, ax = subplots(figsize = (4, 8))
w = 2 # a conventional width, used to size the x-axis and the rectangles
ax.set(xlim=(0,2), xticks=[]) # size the x-axis, no x ticks
ax.set_ylim(ymin=0, ymax=depth[-1])
ax.invert_yaxis()
fig.suptitle('Soil Behaviour Type')
fig.subplots_adjust(right=0.5)
# plot a series of dots that will eventually be covered by the Rectangles,
# so that we can draw a legend
for lt in set(lithotype):
    ax.scatter(lt, depth[1], color=qcm(lt), label=types[lt], zorder=0)
fig.legend(loc='center right')
ax.plot((1,1), (0,depth[-1]), lw=0)
# do the rectangles
for d0, d1, lt in zip(depth, depth[1:], lithotype):
    ax.add_patch(
        r((0, d0),    # coordinates of upper left corner
          2, d1-d0,   # conventional width on x, thickness of the layer
          facecolor=qcm(lt), edgecolor='k'))
# That's all, folks!
show()
As you can see, placing the rectangles is not complicated; what is indeed cumbersome is properly preparing the Figure and the Axes.
I know that I omitted part of the qualifying details from my solution, but I hope these omissions won't stop you from profiting from my answer.
I made a package called striplog for handling this sort of data and making these kinds of plots.
The tool can read CSV, LAS, and other formats directly (though the expected format is rather particular), but we can also construct a Striplog object manually. First let's set up the basic data:
depth = [1,2,3,4,5,6,7,8,9,10]
lithotype = [4,4,4,5,5,5,6,6,6,2]
KEY = {2: 'silt', 4: 'clay', 5: 'mud', 6: 'sand'}
Now you need to know that a Striplog is composed of Interval objects, each of which can have one or more Component elements:
from striplog import Striplog, Component, Interval
intervals = []
for top, base, lith in zip(depth, depth[1:], lithotype):
    comp = Component({'lithology': KEY[lith]})
    iv = Interval(top, base, components=[comp])
    intervals.append(iv)
s = Striplog(intervals).merge_neighbours() # Merge like with like.
This results in Striplog(3 Intervals, start=1.0, stop=10.0). Now we'd like to make a plot using an appropriate Legend object.
from striplog import Legend
legend_csv = u"""colour, width, component lithology
#F7E9A6, 3, Sand
#A68374, 2.5, Silt
#99994A, 2, Mud
#666666, 1, Clay"""
legend = Legend.from_csv(text=legend_csv)
s.plot(legend=legend, aspect=2, label='lithology')
Which gives:
Admittedly the plotting is a little limited, but it's just matplotlib so you can always add more code. To be honest, if I were to build this tool today, I think I'd probably leave the plotting out entirely; it's often easier for the user to do their own thing.
Why go to all this trouble? Fair question. striplog lets you merge zones, make thickness or lithology histograms, make queries ("show me sandstone beds thicker than 2 m"), make 'flags', export LAS or CSV, and even do Markov chain sequence analysis. But even if it's not what you're looking for, maybe you can recycle some of the plotting code! Good luck.

Dynamically scaling axes during a matplotlib ArtistAnimation

It appears to be impossible to change the y and x axis view limits during an ArtistAnimation, and have the frames replayed with different axis limits.
The limits seem to be fixed to whatever was set last before the animation function is called.
In the code below, I have two plotting stages. The input data in the second plot is a much smaller subset of the data in the 1st frame. The data in the 1st stage has a much wider range.
So, I need to "zoom in" when displaying the second plot (otherwise the plot would be very tiny if the axis limits remain the same).
The two plots are overlaid on two different images (that are of the same size, but different content).
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib.image as mpimg
import random
# sample 640x480 image. Actual frame loops through
# many different images, but of same size
image = mpimg.imread('image_demo.png')
fig = plt.figure()
plt.axis('off')
ax = fig.gca()
artists = []
def plot_stage_1():
    # both x, y axis limits automatically set to 0 - 100
    # when we call ax.imshow with this extent
    im_extent = (0, 100, 0, 100) # (xmin, xmax, ymin, ymax)
    im = ax.imshow(image, extent=im_extent, animated=True)
    # y axis is a list of 100 random numbers between 0 and 100
    p, = ax.plot(range(100), random.choices(range(100), k=100))
    # Text label at 90, 90
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 1")
    artists.append([im, t, p])
def plot_stage_2():
    # axes remain at the 0 - 100 limit from the previous
    # imshow extent, so both the background image and plot are tiny
    im_extent = (0, 10, 0, 10)
    # so let's update the x, y axis limits
    ax.set_xlim(im_extent[0], im_extent[1])
    ax.set_ylim(im_extent[2], im_extent[3])
    im = ax.imshow(image, extent=im_extent, animated=True)
    p, = ax.plot(range(10), random.choices(range(10), k=10))
    # Text label at 9, 9
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 2")
    artists.append([im, t, p])
plot_stage_1()
plot_stage_2()
# clear white space around plot
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=None, hspace=None)
# set figure size
fig.set_size_inches(6.67, 5.0, True)
anim = animation.ArtistAnimation(fig, artists, interval=2000, repeat=False, blit=False)
plt.show()
If I call just one of the two functions above, the plot is fine. However, if I call both, the axis limits in both frames will be 0 - 10, 0 - 10. So frame 1 will be super zoomed in.
Also calling ax.set_xlim(0, 100), ax.set_ylim(0, 100) in plot_stage_1() doesn't help. The last set_xlim(), set_ylim() calls fix the axis limits throughout all frames in the animation.
I could keep the axis bounds fixed and apply a scaling function to the input data.
However, I'm curious to know whether I can simply change the axis limits -- my code will be better this way, because the actual code is complicated with multiple stages, zooming plots across many different ranges.
Or perhaps I have to rejig my code to use FuncAnimation, instead of ArtistAnimation?
FuncAnimation appears to result in the expected behavior. So I'm changing my code to use that instead of ArtistAnimation.
Still curious to know, though, whether this can be done at all using ArtistAnimation.
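For reference, a minimal FuncAnimation sketch along those lines, where each frame sets its own limits (the image loading is omitted and the stage data here is made up):
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import random
fig, ax = plt.subplots()
stages = [(100, 100), (10, 10)]  # (axis limit, number of points) per frame
def update(i):
    limit, n = stages[i]
    ax.clear()
    ax.set_xlim(0, limit)  # limits can now differ from frame to frame
    ax.set_ylim(0, limit)
    ax.plot(range(n), random.choices(range(n), k=n))
    ax.text(limit * 0.9, limit * 0.9, f"Frame {i + 1}")
anim = animation.FuncAnimation(fig, update, frames=len(stages),
                               interval=2000, repeat=False)
plt.show()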

Draw a point at the mean peak of a distplot or kdeplot in Seaborn

I'm interested in automatically plotting a point just above the mean peak of a distribution, represented by a kdeplot or distplot with kde. Plotting points and lines manually is simple, but I'm having difficulty deriving this maximal coordinate point.
For example, the kdeplot generated below should have a point drawn at about (3.5, 1.0):
import seaborn as sns
iris = sns.load_dataset("iris")
setosa = iris.loc[iris.species == "setosa"]
sns.kdeplot(setosa.sepal_width)
This question is serving the ultimate goal to draw a line across to the next peak (two distributions in one graph) with a t-statistic printed above it.
Here is one way to do it. The idea is to first extract the x and y data of the line object in the plot, then get the index of the peak, and finally plot the single (x, y) point corresponding to the peak of the distribution.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset("iris")
setosa = iris.loc[iris.species == "setosa"]
ax = sns.kdeplot(setosa.sepal_width)
x = ax.lines[0].get_xdata() # Get the x data of the distribution
y = ax.lines[0].get_ydata() # Get the y data of the distribution
maxid = np.argmax(y) # The id of the peak (maximum of y data)
plt.plot(x[maxid],y[maxid], 'bo', ms=10)
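As a possible follow-up toward the stated goal (a line between the peaks of two distributions on the same axes), a hedged sketch built on the same idea, leaving the t-statistic annotation aside:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset("iris")
ax = sns.kdeplot(iris.loc[iris.species == "setosa"].sepal_width)
sns.kdeplot(iris.loc[iris.species == "versicolor"].sepal_width, ax=ax)
peaks = []
for line in ax.lines:  # one density line per kdeplot call
    x, y = line.get_xdata(), line.get_ydata()
    i = np.argmax(y)
    peaks.append((x[i], y[i]))
(x0, y0), (x1, y1) = peaks
ax.plot([x0, x1], [y0, y1], 'k--')  # line connecting the two peaks
plt.show()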

Annotate a data point with a graph

For lack of a better term, is there a way to annotate a data point with a graph? I include an example of what I am after below.
A big black data point with a graph corresponding to it. Note that the graph is rotated so that its "x" axis (not shown) is perpendicular to the "y" axis of the scatter plot.
annotation_box (http://matplotlib.org/examples/pylab_examples/demo_annotation_box.html) is the closest thing I can find at the moment, but even knowing the proper term for what I want to do would make my life easier.
If I understood the problem correctly, what you need are floating axes that you can place as annotations over your plot. Unfortunately, this is not easily possible in matplotlib, as far as I know.
An easy solution would be to just plot the points and graphs in the same axis, with the graphs scaled down and shifted close to the points.
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
xp = [5, 1, 3]
yp = [2, 1, 4]
# just generate some curves
curves_x = np.array([np.linspace(0, 10, 100)] * 3)
curves_y = sps.gamma.pdf(curves_x[0], [[2], [5], [7]], 1)
plt.scatter(xp, yp, s=50)
for x, y, cx, cy in zip(xp, yp, curves_x, curves_y):
    plt.plot(x + cy / np.max(cy) + 0.1, y + cx / np.max(cx) - 0.5)
plt.show()
This is a very simplistic example. The numbers will have to be tuned to look nice with varying scale of the data.

Numpy: find mean coordinate of points along line

I have a bunch of points in a 2D space which all reside on a line (polygon). How can I compute the mean coordinate of these points on the line?
I don't mean the centroid of the points in the 2D space (as @rth initially proposed in his answer), but the mean location of the points along the line on which they reside. So basically, I could transform the line to a 1D axis, compute the mean location in 1D, and transform the location of the mean back into the 2D space.
Maybe these are exactly the necessary steps, but I think (or hope) that there is a function in numpy/scipy which allows me to do this in one step.
Edit: The approach you describe in the question is indeed probably the simplest way for solving this problem.
Here is an implementation that calculates the positions of vertices along the line in 1D, takes their mean, and finally calculates the corresponding 2D position with parametric interpolation,
import numpy as np
from scipy.interpolate import splprep, splev
vert = np.random.randn(1000, 2) # vertices definition here
# calculate the Euclidean distances between consecutive vertices
# equivalent to a for loop with
# dl[i] = ((vert[i+1, 0] - vert[i, 0])**2 + (vert[i+1,1] - vert[i,1])**2)**0.5
dl = (np.diff(vert, axis=0)**2).sum(axis=1)**0.5
# pad with 0, so dl.shape[0] == vert.shape[0] for convenience
dl = np.insert(dl, 0, 0.0)
l = np.cumsum(dl) # 1D coordinates along the line
l_mean = np.mean(l) # mean in the line coordinates
# calculate the coordinate of l_mean in 2D space
# with parametric B-spline interpolation
tck, _ = splprep(x=vert.T, u=l, k=3)
res = splev(l_mean, tck)
print(res)
Edit2: Assuming now that you have a high resolution set of points for your path vert_full and some approximate measurements vert_1, vert_2, etc, what you could do is the following.
Project each point of vert_1, etc. onto the exact path. Assuming that vert_full has many more data points than vert_1, we can simply look for the nearest neighbours of vert_1 in vert_full:
from scipy.spatial import cKDTree
tr = cKDTree(vert_full)
d, idx = tr.query(vert_1, k=1)
vert_1_proj = vert_full[idx] # this gives the projected coordinates onto vert_full
# I have not actually run this, so it might require minor changes
Use the above mean calculation with the new vert_1_proj vector.
Meanwhile I've found the answer to my question, although using Shapely instead of Numpy.
from shapely.geometry import LineString, Point
# lists of points as (x,y) tuples
path_xy = [...]
points_xy = [...] # should be on or near path
path = LineString(path_xy) # create path object
pts = [Point(p) for p in points_xy] # create point objects
dist = [path.project(p) for p in pts] # distances along path
mean_dist = np.mean(dist) # mean distance along path
mean = path.interpolate(mean_dist) # mean point
mean_xy = (mean.x,mean.y)
This works perfectly!
(That is also why I have to accept it as the answer, though I highly appreciate @rth's help!)