How to create an adjacency matrix from VTK / STL file? - mesh

I have a .vtk mesh with N points and F polygon (triangle) faces, and i'd like to build an N x N adjacency matrix to represent the connectivity between the points.
I've tried mesh.GetLines().GetData() however, this returns an empty array. I've also tried mesh.GetPolys().GetData() and this gives an flat array of 4 x F elements.
From inspecting the .vtk file, I know that each face is given as 3, point1, point2, point3 where I assume 3 indicates the faces are triangular. From here it is possible to create the adjacency matrix by iterating through the list, however I'd like to know there if there is any inbuilt VTK functions that can do the job for me.
I also have the mesh in .stl format, if that helps.
Thanks

It is possible to create the adj matrix by iterating over the vtk polys faces and adding a 1 in an empty matrix for every edge connection like so:
polygons = vtk_to_numpy(mesh.GetPolys().GetData())
adj = np.zeros((len(coords), len(coords)))
print('ADJACENCY MATRIX SHAPE: ',adj.shape)
for i in range(0, int(len(polygons)), 4):
line = polygons[i:i+4] # 3, point1, point2, point3
face = line[1:] # point1, point2, point3
n1, n2, n3 = face[0], face[1], face[2]
adj[n1,n2] = 1
adj[n2,n1] = 1
adj[n1,n3] = 1
adj[n3,n1] = 1
adj[n2,n3] = 1
adj[n3,n2] = 1
Altenatively, if using .stl files, this can be done with the trimesh and networkx packages, like so:
mesh = trimesh.load_mesh('mesh.stl')
adj = networkx.adjacency_matrix(trimesh.graph.vertex_adjacency_graph(mesh))

Related

Getting single value from the N dim histogram in NumPy or SciPy

Assume I have a data like this:
x = np.random.randn(4, 100000)
and I fit a histogram
hist = np.histogramdd(x, density=True)
What I want is to get the probability of number g, e.g. g=0.1. Assume some hypothetical function foo then.
g = 0.1
prob = foo(hist, g)
print(prob)
>> 0.2223124214
How could I do something like this, where I get probability back for a single or a vector of numbers for a fitted histogram? Especially histogram that is N-dimensional.
histogramdd takes O(r^D) memory, and unless you have a very large dataset or very small dimension you will have a poor estimate. Consider your example data, 100k points in 4-D space, the default histogram will be 10 x 10 x 10 x 10, so it will have 10k bins.
x = np.random.randn(4, 100000)
hist = np.histogramdd(x.transpose(), density=True)
np.mean(hist[0] == 0)
gives something arround 0.77 meaning that 77% of the bins in the histogram have no points.
You probably want to smooth the distribution. Unless you have a good reason to not do, I would suggest you to use Gaussian kernel-density Estimate
x = np.random.randn(4, 100000) # d x n array
f = scipy.stats.gaussian_kde(x) # d-dimensional PDF
f([1,2,3,4]) # evaluate the PDF in a given point

speeding up numpy code involving array slicing and broadcasting

I have the following code:
x = sp.linspace(-2,2,1000)
z = sp.linspace(-1,3,2000)
X,Z = sp.meshgrid(x,z)
X = X[:,:,sp.meshgrid]
Z = Z[:,:,sp.meshgrid]
E = sp.zeros((len(z),len(x),3), dtype=complex)
# e_uvect.shape = (2,N,2,3)
# En.shape = (2,N,2)
# d_cum.shape = (N,)
# pol is either 0 or 1
for n in range(N):
idx = sp.logical_and(Z<d_cum[n], Z>=d_cum[n-1])
E += e_uvect[pol,n,0,:]*En[pol,n,0]*sp.exp(+1j*self.kz[n]*(Z-d_cum[n-1])+1j*self.kx*X)*idx
Basically the above is part of a code to calculate the electric field of an N-layer structures. For each iteration inside for loop, I find the index of the array elements which are within the Nth layer, then after I calculate the electric field I multiply the whole thing by idx to 'filter' out the correct part which satisfies sp.logical_and(Z<d_cum[n], Z>=d_cum[n-1]).
It works fine, but I wonder if there is a more efficient way of doing this using numpy array slicing or other methods, because each multiplication involves a large proportion of array elements which are not accepted in each iteration. I tried something like the following to only work on the relevant part of the coordinates array Z and X
idx = sp.logical_and(Z<d_cum[n], Z>=d_cum[n-1])
Z2 = Z[idx]
X2 = X[idx]
E[???] += e_uvect[pol,n,0,:]*En[pol,n,0]*sp.exp(+1j*self.kz[n]*(Z2-d_cum[n-1])+1j*self.kx*X2)
But then Z2 and X2 becomes a 1d-array, and I'm not sure about the indexing part within E or how to reshape the arrays appropriately.
So are there any ways to speed up the original code?

Difficulty with numpy broadcasting

I have two 2d point clouds (oldPts and newPts) which I whish to combine. They are mx2 and nx2 numpyinteger arrays with m and n of order 2000. newPts contains many duplicates or near duplicates of oldPts and I need to remove these before combining.
So far I have used the histogram2d function to produce a 2d representation of oldPts (H). I then compare each newPt to an NxN area of H and if it is empty I accept the point. This last part I am currently doing with a python loop which i would like to remove. Can anybody show me how to do this with broadcasting or perhaps suggest a completely different method of going about the problem. the working code is below
npzfile = np.load(path+datasetNo+'\\temp.npz')
arrs = npzfile.files
oldPts = npzfile[arrs[0]]
newPts = npzfile[arrs[1]]
# remove all the negative values
oldPts = oldPts[oldPts.min(axis=1)>=0,:]
newPts = newPts[newPts.min(axis=1)>=0,:]
# round to integers
oldPts = np.around(oldPts).astype(int)
newPts = newPts.astype(int)
# put the oldPts into 2d array
H, xedg,yedg= np.histogram2d(oldPts[:,0],oldPts[:,1],
bins = [xMax,yMax],
range = [[0, xMax], [0, yMax]])
finalNewList = []
N = 5
for pt in newPts:
if not H[max(0,pt[0]-N):min(xMax,pt[0]+N),
max(0,pt[1]- N):min(yMax,pt[1]+N)].any():
finalNewList.append(pt)
finalNew = np.array(finalNewList)
The right way to do this is to use linear algebra to compute the distance between each pair of 2-long vectors, and then accept only the new points that are "different enough" from each old point: using scipy.spatial.distance.cdist:
import numpy as np
oldPts = np.random.randn(1000,2)
newPts = np.random.randn(2000,2)
from scipy.spatial.distance import cdist
dist = cdist(oldPts, newPts)
print(dist.shape) # (1000, 2000)
okIndex = np.max(dist, axis=0) > 5
print(np.sum(okIndex)) # prints 1503 for me
finalNew = newPts[okIndex,:]
print(finalNew.shape) # (1503, 2)
Above I use the Euclidean distance of 5 as the threshold for "too close": any point in newPts that's farther than 5 from all points in oldPts is accepted into finalPts. You will have to look at the range of values in dist to find a good threshold, but your histogram can guide you in picking the best one.
(One good way to visualize dist is to use matplotlib.pyplot.imshow(dist).)
This is a more refined version of what you were doing with the histogram. In fact, you ought to be able to get the exact same answer as the histogram by passing in metric='minkowski', p=1 keyword arguments to cdist, assuming your histogram bin widths are the same in both dimensions, and using 5 again as the threshold.
(PS. If you're interested in another useful function in scipy.spatial.distance, check out my answer that uses pdist to find unique rows/columns in an array.)

Transform a numpy 3D ndarray to a symmetric form with respect to a specific index

In the case of a matrix mat n x n, i can do the following
sym = 0.5 * (mat + mat.T)
the operation gives the desired result sym[i,j] = sym[j,i]
Suppose we have a 3D array ndarr[i,j,k], where i,j,k 0,1,...n,
then ndarr is n x n x n. The idea is to obtain the following "symmetric" form
nsym[i,j,k] = nsym[j,i,k] using ndarr. I tried this:
import numpy as np
# Generate some random matrix, n = 5
ndarr = np.random.beta(0.1,1,(5,5,5))
# First attempt to symmetrize
sym1 = np.array([0.5*(ndarr[:,:,k]+ndarr[:,:,k].T) for k in range(5)])
The problem here is that sym1[i,j,k] != sym1[j,i,k] as it is required. In fact I obtain sym1[i,j,k] = sym1[i,k,j], symmetric under the exchange of the last two symbols!
# Second attempt
sym2 = 0.5*(ndarr+ndarr.T)
Same problem here and sym2 is symmetric with respect the second index sym2[i,j,k]=sym2[k,j,i].
To resume, the goal is to find a symmetric form for a 3D array with respect to the third index and to preserve the values in the diagonal for the original ndarr[i,i,i].
The problem here is that you're not using the correct transpose:
sym = 0.5 * (ndarr + np.transpose(ndarr, (1, 0, 2)))
By default, np.transpose and the .T property will reverse the order of the axes. In your case, we want to only flip the first two axes: (0,1,2) -> (1,0,2).
EDIT: The reason your first attempt failed is because you were concatenating each symmetrized matrix along the first axis, not the last. It's more clear if you make ndarr with shape (5, 5, 3):
In [16]: sym = np.array([0.5*(ndarr[:,:,k]+ndarr[:,:,k].T) for k in range(3)])
In [17]: sym.shape
Out[17]: (3L, 5L, 5L)
In any case, the version above with np.transpose is cleaner and more efficient.

finding matrix through optimisation

I am looking for algorithm to solve the following problem :
I have two sets of vectors, and I want to find the matrix that best approximate the transformation from the input vectors to the output vectors.
vectors are 3x1, so matrix is 3x3.
This is the general problem. My particular problem is I have a set of RGB colors, and another set that contains the desired color. I am trying to find an RGB to RGB transformation that would give me colors closer to the desired ones.
There is correspondence between the input and output vectors, so computing an error function that should be minimized is the easy part. But how can I minimize this function ?
This is a classic linear algebra problem, the key phrase to search on is "multiple linear regression".
I've had to code some variation of this many times over the years. For example, code to calibrate a digitizer tablet or stylus touch-screen uses the same math.
Here's the math:
Let p be an input vector and q the corresponding output vector.
The transformation you want is a 3x3 matrix; call it A.
For a single input and output vector p and q, there is an error vector e
e = q - A x p
The square of the magnitude of the error is a scalar value:
eT x e = (q - A x p)T x (q - A x p)
(where the T operator is transpose).
What you really want to minimize is the sum of e values over the sets:
E = sum (e)
This minimum satisfies the matrix equation D = 0 where
D(i,j) = the partial derivative of E with respect to A(i,j)
Say you have N input and output vectors.
Your set of input 3-vectors is a 3xN matrix; call this matrix P.
The ith column of P is the ith input vector.
So is the set of output 3-vectors; call this matrix Q.
When you grind thru all of the algebra, the solution is
A = Q x PT x (P x PT) ^-1
(where ^-1 is the inverse operator -- sorry about no superscripts or subscripts)
Here's the algorithm:
Create the 3xN matrix P from the set of input vectors.
Create the 3xN matrix Q from the set of output vectors.
Matrix Multiply R = P x transpose (P)
Compute the inverseof R
Matrix Multiply A = Q x transpose(P) x inverse (R)
using the matrix multiplication and matrix inversion routines of your linear algebra library of choice.
However, a 3x3 affine transform matrix is capable of scaling and rotating the input vectors, but not doing any translation! This might not be general enough for your problem. It's usually a good idea to append a "1" on the end of each of the 3-vectors to make then a 4-vector, and look for the best 3x4 transform matrix that minimizes the error. This can't hurt; it can only lead to a better fit of the data.
You don't specify a language, but here's how I would approach the problem in Matlab.
v1 is a 3xn matrix, containing your input colors in vertical vectors
v2 is also a 3xn matrix containing your output colors
You want to solve the system
M*v1 = v2
M = v2*inv(v1)
However, v1 is not directly invertible, since it's not a square matrix. Matlab will solve this automatically with the mrdivide operation (M = v2/v1), where M is the best fit solution.
eg:
>> v1 = rand(3,10);
>> M = rand(3,3);
>> v2 = M * v1;
>> v2/v1 - M
ans =
1.0e-15 *
0.4510 0.4441 -0.5551
0.2220 0.1388 -0.3331
0.4441 0.2220 -0.4441
>> (v2 + randn(size(v2))*0.1)/v1 - M
ans =
0.0598 -0.1961 0.0931
-0.1684 0.0509 0.1465
-0.0931 -0.0009 0.0213
This gives a more language-agnostic solution on how to solve the problem.
Some linear algebra should be enough :
Write the average squared difference between inputs and outputs ( the sum of the squares of each difference between each input and output value ). I assume this as definition of "best approximate"
This is a quadratic function of your 9 unknown matrix coefficients.
To minimize it, derive it with respect to each of them.
You will get a linear system of 9 equations you have to solve to get the solution ( unique or a space variety depending on the input set )
When the difference function is not quadratic, you can do the same but you have to use an iterative method to solve the equation system.
This answer is better for beginners in my opinion:
Have the following scenario:
We don't know the matrix M, but we know the vector In and a corresponding output vector On. n can range from 3 and up.
If we had 3 input vectors and 3 output vectors (for 3x3 matrix), we could precisely compute the coefficients αr;c. This way we would have a fully specified system.
But we have more than 3 vectors and thus we have an overdetermined system of equations.
Let's write down these equations. Say that we have these vectors:
We know, that to get the vector On, we must perform matrix multiplication with vector In.In other words: M · I̅n = O̅n
If we expand this operation, we get (normal equations):
We do not know the alphas, but we know all the rest. In fact, there are 9 unknowns, but 12 equations. This is why the system is overdetermined. There are more equations than unknowns. We will approximate the unknowns using all the equations, and we will use the sum of squares to aggregate more equations into less unknowns.
So we will combine the above equations into a matrix form:
And with some least squares algebra magic (regression), we can solve for b̅:
This is what is happening behind that formula:
Transposing a matrix and multiplying it with its non-transposed part creates a square matrix, reduced to lower dimension ([12x9] · [9x12] = [9x9]).
Inverse of this result allows us to solve for b̅.
Multiplying vector y̅ with transposed x reduces the y̅ vector into lower [1x9] dimension. Then, by multiplying [9x9] inverse with [1x9] vector we solved the system for b̅.
Now, we take the [1x9] result vector and create a matrix from it. This is our approximated transformation matrix.
A python code:
import numpy as np
import numpy.linalg
INPUTS = [[5,6,2],[1,7,3],[2,6,5],[1,7,5]]
OUTPUTS = [[3,7,1],[3,7,1],[3,7,2],[3,7,2]]
def get_mat(inputs, outputs, entry_len):
n_of_vectors = inputs.__len__()
noe = n_of_vectors*entry_len# Number of equations
#We need to construct the input matrix.
#We need to linearize the matrix. SO we will flatten the matrix array such as [a11, a12, a21, a22]
#So for each row we combine the row's variables with each input vector.
X_mat = []
for in_n in range(0, n_of_vectors): #For each input vector
#populate all matrix flattened variables. for 2x2 matrix - 4 variables, for 3x3 - 9 variables and so on.
base = 0
for col_n in range(0, entry_len): #Each original unknown matrix's row must be matched to all entries in the input vector
row = [0 for i in range(0, entry_len ** 2)]
for entry in inputs[in_n]:
row[base] = entry
base+=1
X_mat.append(row)
Y_mat = [item for sublist in outputs for item in sublist]
X_np = np.array(X_mat)
Y_np = np.array([Y_mat]).T
solution = np.dot(np.dot(numpy.linalg.inv(np.dot(X_np.T,X_np)),X_np.T),Y_np)
var_mat = solution.reshape(entry_len, entry_len) #create square matrix
return var_mat
transf_mat = get_mat(INPUTS, OUTPUTS, 3) #3 means 3x3 matrix, and in/out vector size 3
print(transf_mat)
for i in range(0,INPUTS.__len__()):
o = np.dot(transf_mat, np.array([INPUTS[i]]).T)
print(f"{INPUTS[i]} x [M] = {o.T} ({OUTPUTS[i]})")
The output is as such:
[[ 0.13654096 0.35890767 0.09530002]
[ 0.31859558 0.83745124 0.22236671]
[ 0.08322497 -0.0526658 0.4417611 ]]
[5, 6, 2] x [M] = [[3.02675088 7.06241873 0.98365224]] ([3, 7, 1])
[1, 7, 3] x [M] = [[2.93479472 6.84785436 1.03984767]] ([3, 7, 1])
[2, 6, 5] x [M] = [[2.90302805 6.77373212 2.05926064]] ([3, 7, 2])
[1, 7, 5] x [M] = [[3.12539476 7.29258778 1.92336987]] ([3, 7, 2])
You can see, that it took all the specified inputs, got the transformed outputs and matched the outputs to the reference vectors. The results are not precise, since we have an approximation from the overspecified system. If we used INPUT and OUTPUT with only 3 vectors, the result would be exact.