How to add two partially overlapping numpy arrays and extend the non-overlapping parts? - numpy

I'm adding a short audio signal (a 1-D numpy array of a musical note) to roughly the end of a longer signal (the first part of the audio stream constructed so far). I'd like to add the overlapping parts together and append the non-overlapping part. What is the most efficient way to achieve this? I can identify the overlapping part and add it to the main signal while concatenating the non-overlapping part, but I don't think this is efficient enough. I also think making them the same size by padding with zeros would be very memory-inefficient. Is there a numpy or scipy function for achieving this?

NumPy arrays are contiguous memory blocks. Those of a and b are almost guaranteed not to be contiguous with each other, so your only real options are to extend one with a copy of the second or to create a new array containing the result you want.
I don't know your constraints, but I suspect you're trying to prematurely optimize. Just write something clear first and optimize if it doesn't meet whatever needs you have:
import numpy as np

def add_signal(a, b, ai=0, bi=0):
    assert ai >= 0
    assert bi >= 0
    al = len(a)
    bl = len(b)
    cl = max(ai + al, bi + bl)
    c = np.zeros(cl)
    c[ai: ai + al] += a
    c[bi: bi + bl] += b
    return c
Example:
a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40])
add_signal(a, b, bi=len(a)-3)
Output:
array([ 0., 1., 2., 13., 24., 35., 40.])
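If you specifically want the "extend one with a copy of the second" route mentioned above, a concatenation-based variant can be sketched as follows (add_signal_concat is a hypothetical name of mine; it assumes b starts inside a and extends past its end):

```python
import numpy as np

def add_signal_concat(a, b, bi):
    """Add b into a starting at offset bi, assuming b reaches past the end of a."""
    overlap = len(a) - bi          # number of overlapping samples
    head = a[:bi]                  # untouched start of a
    mid = a[bi:] + b[:overlap]     # overlapping region, summed
    tail = b[overlap:]             # part of b past the end of a
    return np.concatenate([head, mid, tail])

a = np.array([0., 1., 2., 3., 4., 5.])
b = np.array([10., 20., 30., 40.])
add_signal_concat(a, b, bi=len(a) - 3)
# array([ 0.,  1.,  2., 13., 24., 35., 40.])
```

This avoids allocating a full zero-padded copy of both signals, but it still copies everything once; for repeated appends to a growing stream, over-allocating the output buffer up front is usually the faster design.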

Related

How to find common members of matrices in NumPy

I have a 2D matrix A and a vector B. I want to find all row indices of elements in A that are also contained in B.
A = np.array([[1,9,5], [8,4,9], [4,9,3], [6,7,5]], dtype=int)
B = np.array([2, 4, 8, 10, 12, 18], dtype=int)
My current solution is only to compare A to one element of B at a time but that is horribly slow:
res = np.array([], dtype=int)
for i in range(B.shape[0]):
    cres, _ = (B[i] == A).nonzero()
    res = np.append(res, cres)
res = np.unique(res)
The following Matlab statement would solve my issue:
find(any(reshape(any(reshape(A, prod(size(A)), 1) == B, 2),size(A, 1),size(A, 2)), 2))
However, comparing a row and a column vector in NumPy does not create a Boolean intersection matrix as it does in Matlab.
Is there a proper way to do this in Numpy?
We can use np.isin masking.
To get all the row numbers, it would be -
np.where(np.isin(A,B).T)[1]
If you need them split based on each element's occurrence -
[np.flatnonzero(i) for i in np.isin(A,B).T if i.any()]
Posted MATLAB code seems to be doing broadcasting. So, an equivalent one would be -
np.where(B[:,None,None]==A)[1]
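With the arrays from the question, the masking approach can be checked like so:

```python
import numpy as np

A = np.array([[1, 9, 5], [8, 4, 9], [4, 9, 3], [6, 7, 5]], dtype=int)
B = np.array([2, 4, 8, 10, 12, 18], dtype=int)

mask = np.isin(A, B)        # True wherever an element of A also occurs in B
rows = np.where(mask.T)[1]  # one row number per match
# rows -> array([1, 2, 1]); np.unique(rows) -> array([1, 2])
```

Rows 1 and 2 are reported because A[1] contains 8 and 4, and A[2] contains 4, all of which occur in B.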

After projecting 3D points to 2D, how to get back to 3D?

Simple question: I used a translation and rotation matrix and camera intrinsics matrix to get a 3x4 matrix used to transform 3d points to 2d points (notated as Tform)
I transformed the point [10,-5,1] with the matrix by appending a 1 to the end; the new point is notated as newpoint.
Now I want to use the newpoint data to transform back to 3D space, where old_est should be equal to old.
I'm looking for the solution to plug into the XXX matrix in my code below
import numpy as np
Tform=np.array([[4000,0,-1600,-8000],[500,5000,868,-8000],[.5,0,.8,-8]])
old=np.array([10,-5,1,1])
newpoint=np.dot(Tform,old)
print(newpoint)
old_est=np.dot(XXX,np.append(newpoint,1))
print(old_est)
Add a 4th row to Tform with the values 0 0 0 1, i.e. the last row of an identity matrix:
>>> m = np.vstack((Tform, np.array([0, 0, 0, 1])))
>>> m
array([[ 4.00e+03,  0.00e+00, -1.60e+03, -8.00e+03],
       [ 5.00e+02,  5.00e+03,  8.68e+02, -8.00e+03],
       [ 5.00e-01,  0.00e+00,  8.00e-01, -8.00e+00],
       [ 0.00e+00,  0.00e+00,  0.00e+00,  1.00e+00]])
Note that you cannot use np.append with its defaults here, because it flattens the input arrays (you would have to pass axis=0).
Observe that, when multiplied with old, the 4th component of the result is 1, i.e. the result is equal to np.append(newpoint, 1):
>>> np.dot(m, old)
array([ 3.0400e+04, -2.7132e+04, -2.2000e+00, 1.0000e+00])
It follows that XXX is the inverse of this new matrix:
>>> XXX = np.linalg.inv(m)
>>> np.dot(XXX, np.append(newpoint, 1))
array([10., -5., 1., 1.])
And we get the components of old back.
Alternatively, you can subtract the 4th column of Tform from newpoint and multiply the result by the inverse of the left 3x3 sub-matrix of Tform, but this is slightly fiddly, so we might as well let numpy do more of the work :)
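For completeness, the "fiddly" alternative from the last paragraph can be sketched like this, using the same Tform and old as above:

```python
import numpy as np

Tform = np.array([[4000, 0, -1600, -8000],
                  [500, 5000, 868, -8000],
                  [.5, 0, .8, -8]])
old = np.array([10, -5, 1, 1])
newpoint = Tform @ old

A = Tform[:, :3]   # left 3x3 sub-matrix (rotation/intrinsics part)
t = Tform[:, 3]    # 4th column (translation part)
old_est = np.linalg.inv(A) @ (newpoint - t)
# old_est is approximately [10, -5, 1]
```

This works because Tform @ [x, y, z, 1] equals A @ [x, y, z] + t, so inverting that relation recovers the 3D point directly.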

How to perform matching between two sequences?

I have two mini-batches of sequences:
a = C.sequence.input_variable((10))
b = C.sequence.input_variable((10))
Both a and b have variable-length sequences.
I want to do matching between them, where matching is defined as: match (e.g. dot product) the token at each time step of a with the token at every time step of b.
How can I do this?
I have mostly answered this on github but to be consistent with SO rules, I am including a response here. In the case of something simple like a dot product, you can take advantage of the fact that it factorizes nicely, so the following code works:
axisa = C.Axis.new_unique_dynamic_axis('a')
axisb = C.Axis.new_unique_dynamic_axis('b')
a = C.sequence.input_variable(1, sequence_axis=axisa)
b = C.sequence.input_variable(1, sequence_axis=axisb)
c = C.sequence.broadcast_as(C.sequence.reduce_sum(a), b) * b
c.eval({a: [[1, 2, 3],[4, 5]], b: [[6, 7], [8]]})
[array([[ 36.],
        [ 42.]], dtype=float32),
 array([[ 72.]], dtype=float32)]
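The factorization that code relies on can be checked in plain NumPy, outside CNTK: because the score is a dot product, reducing a once and scaling each element of b gives the same result as matching every pair of time steps:

```python
import numpy as np

# the two variable-length mini-batches from the example above
a_batch = [np.array([1., 2., 3.]), np.array([4., 5.])]
b_batch = [np.array([6., 7.]), np.array([8.])]

# sum_i (a_i * b_j) == (sum_i a_i) * b_j, so reduce a, then broadcast over b
c = [a.sum() * b for a, b in zip(a_batch, b_batch)]
# c -> [array([36., 42.]), array([72.])]
```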
In the general case you need the following steps
static_b, mask = C.sequence.unpack(b, neutral_value).outputs
scores = your_score(a, static_b)
The first line will convert the b sequence into a static tensor with one more axis than b. Because of packing, some elements of this tensor will be invalid, and those will be indicated by the mask. The neutral_value will be placed as a dummy value in the static_b tensor wherever data was missing. Depending on your score, you might be able to arrange for the neutral_value not to affect the final score (e.g. if your score is a dot product, 0 would be a good choice; if it involves a softmax, -infinity or something close to it would be a good choice).

The second line can now access each element of a and all the elements of b as the first axis of static_b. For a dot product, static_b is a matrix and one element of a is a vector, so a matrix-vector multiplication will result in a sequence whose elements are all inner products between the corresponding element of a and all elements of b.

Numpy index array of unknown dimensions?

I need to compare a bunch of numpy arrays with different dimensions, say:
a = np.array([1,2,3])
b = np.array([1,2,3],[4,5,6])
assert(a == b[0])
How can I do this if I do not know the shape of either a or b, beyond the fact that
len(shape(a)) == len(shape(b)) - 1
and I also don't know which dimension to skip from b? I'd like to use np.index_exp, but that does not seem to help me ...
def compare_arrays(a, b, skip_row):
    u = np.index_exp[ ... ]
    assert(a[:] == b[u])
Edit
Or to put it otherwise, I want to construct the slicing if I know the shape of the array and the dimension I want to skip. How do I dynamically create the np.index_exp, if I know the number of dimensions and the positions where to put ":" and where to put "0"?
I was just looking at the code for apply_along_axis and apply_over_axes, studying how they construct indexing objects.
Lets make a 4d array:
In [355]: b=np.ones((2,3,4,3),int)
Make a list of slices (using list * replicate)
In [356]: ind = [slice(None)] * b.ndim
In [357]: b[tuple(ind)].shape   # same as b[:,:,:,:]
Out[357]: (2, 3, 4, 3)
In [358]: ind[2] = 2            # replace one slice with an index
In [359]: b[tuple(ind)].shape   # indexing on the third dim
Out[359]: (2, 3, 3)
Or with your example
In [361]: b = np.array([1,2,3],[4,5,6]) # missing []
...
TypeError: data type not understood
In [362]: b = np.array([[1,2,3],[4,5,6]])
In [366]: ind = [slice(None)] * b.ndim
In [367]: ind[0] = 0
In [368]: a == b[tuple(ind)]
Out[368]: array([ True,  True,  True], dtype=bool)
This indexing is basically the same as np.take, but the same idea can be extended to other cases.
I don't quite follow your questions about the use of :. Note that when building an indexing list I use slice(None). The interpreter translates all indexing : into slice objects: [start:stop:step] => slice(start, stop, step).
Usually you don't need to use a[:]==b[0]; a==b[0] is sufficient. With lists alist[:] makes a copy, with arrays it does nothing (unless used on the RHS, a[:]=...).
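Putting this together, a working version of the compare_arrays sketched in the question might look like the following (the skip_dim and index parameter names are my own):

```python
import numpy as np

def compare_arrays(a, b, skip_dim, index=0):
    # slice(None) everywhere, except a fixed index at the skipped dimension
    ind = [slice(None)] * b.ndim
    ind[skip_dim] = index
    return np.array_equal(a, b[tuple(ind)])

a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])
compare_arrays(a, b, skip_dim=0)           # True  (a == b[0])
compare_arrays(a, b, skip_dim=0, index=1)  # False (a != b[1])
```

Note the tuple(ind) conversion: modern NumPy requires the index to be a tuple, not a list.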

Indexing the last dimension of a 3D array with a 2D integer array

I have a 3D array data of shape NxMxD, and a 2D integer array idx of shape NxM whose values are in the range [0, D-1]. I want to update each NxM entry of data at the depth given by the idx array at that position.
For example, for N = M = D = 2:
data = np.zeros((2,2,2))
idx = np.array([[0,0],[1, 1]], int)
And I want to perform a simple operation like:
data[..., idx] += 1
My expected output would be:
>>> data
array([[[ 1.,  0.],
        [ 1.,  0.]],

       [[ 0.,  1.],
        [ 0.,  1.]]])
idx indicates, for each 2D coordinate, which depth should be updated. The above operation doesn't work.
I've found this approach in SO which solves the indexing problem by using:
data[np.arange(N)[:, None], np.arange(M)[None, :], idx] += 1
It works fine, but looks pretty horrible, needing to manually index the whole matrices for what seems a pretty simple operation (using one matrix as an index mask for the last axis).
Is there any better solution?
With numpy.ix_ it does not look so horrible, but the underlying idea of using fancy indexing is still the same:
x = np.arange(N)
y = np.arange(M)
xx,yy = np.ix_(x,y)
data[xx,yy,idx] += 1
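With the N = M = D = 2 example from the question, the whole thing fits in a few lines and reproduces the expected output:

```python
import numpy as np

N = M = D = 2
data = np.zeros((N, M, D))
idx = np.array([[0, 0], [1, 1]])

xx, yy = np.ix_(np.arange(N), np.arange(M))
data[xx, yy, idx] += 1
# data[i, j, idx[i, j]] was incremented for every (i, j)
```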
Note
The problem is that you want to change the values of data. If you just wanted to read the values according to idx you could do
out = np.choose(idx, data.transpose(2, 0, 1))
However, this gives you a copy of the values of data and not a view which means that
out += 1
has no effect on your values in data.
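A quick way to see this copy-vs-view caveat (the transpose is needed because np.choose selects along the first axis):

```python
import numpy as np

data = np.zeros((2, 2, 2))
idx = np.array([[0, 0], [1, 1]])

out = np.choose(idx, data.transpose(2, 0, 1))  # values at the idx depths, as a copy
out += 1                                       # modifies only the copy
# data is still all zeros
```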