Merge masked selection of array with original array - numpy

I'm facing a problem with an assignment at the moment.
So I have an array which contains 400 2D points, i.e. an array of shape 400 x 2.
Then I have a mask that selects m points (rows) that I want to compute some changes on.
As per the assignment I'm supposed to store the points that I want to change in an array of shape m x 2.
Then I do my changes on this resulting array. But now, after the changes, I want to insert these new computed values into my original array at the original indices, and I just have no clue how to do that.
So I basically have:
orig (400 X 2)
mask (400 X 1) (boolean mask selecting the rows to edit)
change (m X 2) (just the changes I want to add)
changed (m X 2) (the original values + the change (with a factor applied) added together)
How do I transform my change or changed arrays with the mask so that I can add/insert the changes into my original array?

Look at this example with 4 rows.
The principle is that the mask used to "extract" rows from orig can also be used to write the sub-array back to its original place.
import numpy as np
x = np.array([[1,2],[3,4],[5,6],[7,8]])
print(x)
mask_ix = np.array([True,False, True, False])
masked = x[mask_ix,:]
masked = masked * 10 # the change
print(masked)
x[mask_ix] = masked # write the changed rows back into x at the masked positions
print(x)
x = [[1 2]
 [3 4]
 [5 6]
 [7 8]]
masked = [[10 20]
 [50 60]]
x = [[10 20]
 [ 3  4]
 [50 60]
 [ 7  8]]
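Mapped back onto the names from the question, the same pattern looks roughly like this (just a sketch with toy stand-ins; factor and the exact shapes are assumptions based on the question):
import numpy as np
# toy stand-ins for the question's arrays
orig = np.random.rand(400, 2)            # the 400 x 2 points
mask = np.zeros(400, dtype=bool)         # 1-D boolean mask selecting m rows
mask[:10] = True
factor = 0.5                             # stand-in for whatever scaling the assignment uses
change = np.random.rand(mask.sum(), 2)   # the m x 2 changes
changed = orig[mask] + factor * change   # m x 2: original values plus the scaled change
orig[mask] = changed                     # write the new values back at the original indices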

select from multi-dimensional arrays based on a given condition

There is an ndarray data with shape, e.g., (5, 10, 2), and two other lists, x1 and x2, both of size 10. I want to select a subset from data based on the following condition,
applied across the second dimension:
if x1[i] <= data[j, i, 0] <= x2[i], then select data[j, i, :].
I tried selected = data[x1 <= data[:, :, 0] <= x2]. It does not work. I am not clear what the efficient (or vectorized) way is to implement this condition-based selection.
The code below selects all values data[i, j, 0] (i.e. the entries with index 0 along the third dimension) whose value is >= the corresponding x1 and <= the corresponding x2:
idx = np.where(np.logical_and(data[:, :, 0] >= np.array(x1), data[:, :, 0] <= np.array(x2)))
# data[idx] contains the full rows of length 2 rather than just the 0th column, so we need to select the 0th column.
selected = data[idx][:, 0]
The code assumes that x1 and x2 are lists with lengths equal to the size of data's second dimension (in this case, 10). Note that the code only returns the values, not the indices of the values.
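For example, with small toy inputs (a sketch using the shapes assumed above):
import numpy as np
data = np.random.rand(5, 10, 2)   # (5, 10, 2) as in the question
x1 = np.random.rand(10) * 0.3     # lower bounds, one per position along the second dimension
x2 = x1 + 0.4                     # upper bounds
idx = np.where(np.logical_and(data[:, :, 0] >= np.array(x1), data[:, :, 0] <= np.array(x2)))
selected = data[idx][:, 0]        # the values data[i, j, 0] satisfying the condition
# data[idx] alone would instead give the full length-2 rows data[i, j, :]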
Let me know if you have any questions.

Converting strings Series to numeric one

I'm doing
X = data['x'].apply(lambda h: [int(h[i:i + 2], 16) for i in (0, 2 ,4)])
Where x has strings of hex colors, and I'd like to map them to RGB arrays (3 values each). After that, X has dtype='object', and X.values is a numpy array of numpy arrays.
My final goal is to make it a 3 * n numpy array and use it with sklearn.cluster.KMeans. What is the best way to achieve this?
After creating X, you can split up the data into 3 columns like this:
X = data['x'].apply(lambda h: [int(h[i:i + 2], 16) for i in (0, 2 ,4)])
data[['R','G','B']] = pd.DataFrame(X.values.tolist(), index=X.index)
so that
data[['R','G','B']]
has the result in three columns for further processing
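If the end goal is just an array to feed to KMeans (which expects one row per sample, i.e. shape (n, 3)), here is a minimal sketch, assuming data['x'] holds hex color strings like 'ff0080' without a leading '#':
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
data = pd.DataFrame({'x': ['ff0000', 'fe0100', '00ff00', '01fe00']})
# parse each hex string into its [R, G, B] integer components
X = data['x'].apply(lambda h: [int(h[i:i + 2], 16) for i in (0, 2, 4)])
rgb = np.array(X.tolist())                      # shape (n, 3)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(rgb)
print(labels)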

tensorflow max preserve mapping which is smooth

How can I make x from y where
x = tf.constant([[1,5,3], [100,20,3]])
y = ([[0,5,0], [100,0,0]])
So it basically preserves only the max values and makes other elements zero. Using tf.argmax we can get the max indices but don't really know how to make y from it.
Could you please help?
And would such a y have a proper gradient (i.e., gradient 1 at the max element and gradient 0 at the others)?
Not sure if this is the most optimized way, but you can do it with tf.gather_nd and tf.scatter_nd: 1) use tf.argmax to construct the indices corresponding to the maximum values; 2) extract the maximum values using tf.gather_nd and the indices; 3) make a new tensor from the indices and updates using tf.scatter_nd.
import tensorflow as tf

x = tf.constant([[1,5,3], [100,20,3]])

with tf.Session() as sess:
    indices = tf.stack([tf.range(x.shape[0], dtype=tf.int64), tf.argmax(x, axis=1)], axis=1)
    updates = tf.gather_nd(x, indices)
    output = tf.scatter_nd(indices, updates, x.shape)
    print(sess.run(output))
    # [[  0   5   0]
    #  [100   0   0]]
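Regarding the gradient sub-question, a different sketch (my own suggestion, not from the answer above) is to compare against the row-wise maximum with tf.where; the selected elements then pass a gradient of 1 and the zeroed ones a gradient of 0. Note that ties would keep every element equal to the max:
import tensorflow as tf

x = tf.constant([[1., 5., 3.], [100., 20., 3.]])   # floats so a gradient is defined
row_max = tf.reduce_max(x, axis=1, keepdims=True)  # per-row maximum, shape (2, 1)
y = tf.where(tf.equal(x, row_max), x, tf.zeros_like(x))

with tf.Session() as sess:
    print(sess.run(y))
    # [[  0.   5.   0.]
    #  [100.   0.   0.]]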

Transform a numpy 3D ndarray to a symmetric form with respect to a specific index

In the case of an n x n matrix mat, I can do the following
sym = 0.5 * (mat + mat.T)
the operation gives the desired result sym[i,j] = sym[j,i]
Suppose we have a 3D array ndarr[i,j,k], where i,j,k = 0,1,...,n-1,
so ndarr is n x n x n. The idea is to obtain the following "symmetric" form
nsym[i,j,k] = nsym[j,i,k] using ndarr. I tried this:
import numpy as np
# Generate some random matrix, n = 5
ndarr = np.random.beta(0.1,1,(5,5,5))
# First attempt to symmetrize
sym1 = np.array([0.5*(ndarr[:,:,k]+ndarr[:,:,k].T) for k in range(5)])
The problem here is that sym1[i,j,k] != sym1[j,i,k] as it is required. In fact I obtain sym1[i,j,k] = sym1[i,k,j], symmetric under the exchange of the last two symbols!
# Second attempt
sym2 = 0.5*(ndarr+ndarr.T)
Same problem here: sym2 is symmetric with respect to the second index, sym2[i,j,k] = sym2[k,j,i].
To summarize, the goal is to find a symmetric form of a 3D array with respect to the third index, while preserving the original diagonal values ndarr[i,i,i].
The problem here is that you're not using the correct transpose:
sym = 0.5 * (ndarr + np.transpose(ndarr, (1, 0, 2)))
By default, np.transpose and the .T property will reverse the order of the axes. In your case, we want to only flip the first two axes: (0,1,2) -> (1,0,2).
EDIT: The reason your first attempt failed is that you were stacking each symmetrized matrix along the first axis, not the last. It's clearer if you make ndarr with shape (5, 5, 3):
In [16]: sym = np.array([0.5*(ndarr[:,:,k]+ndarr[:,:,k].T) for k in range(3)])
In [17]: sym.shape
Out[17]: (3L, 5L, 5L)
In any case, the version above with np.transpose is cleaner and more efficient.
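A quick sanity check (just a sketch) that the transpose-based version has both requested properties, i.e. symmetry in the first two indices and an unchanged diagonal ndarr[i, i, i]:
import numpy as np

ndarr = np.random.beta(0.1, 1, (5, 5, 5))
sym = 0.5 * (ndarr + np.transpose(ndarr, (1, 0, 2)))

i = np.arange(5)
assert np.allclose(sym, np.transpose(sym, (1, 0, 2)))   # sym[i, j, k] == sym[j, i, k]
assert np.allclose(sym[i, i, i], ndarr[i, i, i])        # diagonal entries preserved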

Numpy loop using an index

I'm a newbie and was trying something in Python 2.7.2 with NumPy which wasn't working as expected, so I wanted to check if there was something basic I was misunderstanding.
I was calculating a value per triangle (trinormals) and then updating a value per point of the triangle (vertnormals) using an array of triangle vertex indexes (trivertidx). As a loop I was calculating:
for itri in range(ntriangles):
    vertnormals[trivertidx[itri, 0], :] += trinormals[itri, :]
    vertnormals[trivertidx[itri, 1], :] += trinormals[itri, :]
    vertnormals[trivertidx[itri, 2], :] += trinormals[itri, :]
As this was a little slow I thought it could be modified to :
vertnormals[(trivertidx[:,0]),:] += trinormals[:,:]
vertnormals[(trivertidx[:,1]),:] += trinormals[:,:]
vertnormals[(trivertidx[:,2]),:] += trinormals[:,:]
However this doesn't give the same results. Is there another simpler way to write the loop? Any pointers appreciated. Note the intent here was to get a single value for each entry in vertnormals and then normalise the result.
Numpy has a function bincount that can be very helpful in situations like this. The two lines below are the same when the elements of index are unique, but different when index has repeated values:
A[index] += W
A += np.bincount(index, W, minlength=len(A))
I believe you want the behavior of the second, but your code is a little more complex because A, index, and W are not 1d. You can try something like this,
import numpy as np

N = len(vertnormals)
for j in range(vertnormals.shape[-1]):
    vertnormals[:, j] += np.bincount(trivertidx[:, 0], trinormals[:, j], minlength=N)
    vertnormals[:, j] += np.bincount(trivertidx[:, 1], trinormals[:, j], minlength=N)
    vertnormals[:, j] += np.bincount(trivertidx[:, 2], trinormals[:, j], minlength=N)
Hope that helps.
If I am understanding your question well, you have m points from which you have formed n triangles, and trivertidx is an array of shape (n, 3) holding values in the range [0, m), where trivertidx[j] is the list of the 3 points making up the j-th triangle.
trinormals then is an array of shape (n,) holding a value assigned to each triangle, and you want vertnormals to be an array of shape (m,) holding, for each point, the sum of the values assigned to each triangle that point is a vertex of.
If the above is right, the following example should show why your second code is not working properly:
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a[[1,2,0,2]] += 1
>>> a
array([1, 2, 3, 3, 4])
Even though the element in position 2 shows up twice in the left hand side, what happens is that two copies of the same value have 1 added, and then the incremented value is copied twice to the same position.
To vectorize this summation you would need an array of shape (n, m) where the value at position [j, k] is True if vertex k is part of triangle j, False if not. You could build that array like this:
trivert = np.zeros((n, m), dtype='bool')
trivert[np.arange(n).reshape(n, 1), trivertidx] = 1
Once you have this array, you can get your sums for each vertex as
vertnormals = np.sum(trivert * trinormals.reshape(-1, 1), axis=0)
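As a side note (my own suggestion, not part of the answers above): NumPy's np.add.at performs unbuffered in-place addition, so repeated indices do accumulate. A small sketch with toy data in the question's shapes:
import numpy as np

trivertidx = np.array([[0, 1, 2], [1, 2, 3]])   # 2 triangles over 4 vertices
trinormals = np.ones((2, 3))                    # one length-3 normal per triangle
vertnormals = np.zeros((4, 3))

# accumulate each triangle's normal onto each of its three vertices;
# unlike fancy-index +=, np.add.at handles repeated indices correctly
for c in range(3):
    np.add.at(vertnormals, trivertidx[:, c], trinormals)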