Tensordot equivalent of einsum 'ij, ijk -> ik' - numpy

I am not using NumPy but the Eigen::Tensor C++ API, which only has contraction operations; this is just to help me think through the implementation from Python.
So 'ij, ijk -> ik' is basically like doing a for loop for each of the first dimensions.
a = np.random.uniform(size=[10, 4])
b = np.random.uniform(size=[10, 4, 4])
vec = []
for i in range(10):
    vec.append(a[i].dot(b[i]))
print(np.stack(vec, axis=0))
## or with einsum
print(np.einsum('ij,ijk->ik', a, b))
This does not seem easy to do with tensordot. Any suggestions?
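One way to look at it, as a rough sketch rather than a definitive answer: tensordot has no notion of a shared (batch) index, so contracting only j yields an array indexed by [i, i', k], and the einsum result is its i == i' diagonal. np.matmul treats the leading axis as a batch axis and avoids computing the off-diagonal blocks. The names full, idx, via_tensordot and via_matmul are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(size=(10, 4))
b = rng.uniform(size=(10, 4, 4))

expected = np.einsum('ij,ijk->ik', a, b)

# tensordot contracts j only; the i of a and the i of b remain separate axes.
full = np.tensordot(a, b, axes=([1], [1]))    # shape (10, 10, 4)
idx = np.arange(a.shape[0])
via_tensordot = full[idx, idx]                # keep the i == i' diagonal, shape (10, 4)

# matmul broadcasts over the leading axis: (10, 1, 4) @ (10, 4, 4) -> (10, 1, 4).
via_matmul = np.matmul(a[:, None, :], b)[:, 0, :]

print(np.allclose(expected, via_tensordot), np.allclose(expected, via_matmul))
```

The diagonal approach does roughly ten times the necessary work here (one block per pair of i values), so with a contraction-only API it is probably only reasonable when the first dimension is small.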

Related

Computing quick convex hull using Numba

I came across this nice implementation of a 2D convex hull computation written with NumPy. I would like to be able to @njit this function so that I can use it inside my other Numba-jitted code. However, I am not able to modify it so that it runs, because it uses recursion and Numba features that are unsupported. Can anybody help me rewrite it?
import numpy as np
from numba import njit
def process(S, P, a, b):
    signed_dist = np.cross(S[P] - S[a], S[b] - S[a])
    K = [i for s, i in zip(signed_dist, P) if s > 0 and i != a and i != b]
    if len(K) == 0:
        return (a, b)
    c = max(zip(signed_dist, P))[1]
    return process(S, K, a, c)[:-1] + process(S, K, c, b)

def quickhull_2d(S: np.ndarray) -> np.ndarray:
    a, b = np.argmin(S[:,0]), np.argmax(S[:,0])
    max_index = np.argmax(S[:,0])
    max_element = S[max_index]
    return process(S, np.arange(S.shape[0]), a, max_index)[:-1] + process(S, np.arange(S.shape[0]), max_index, a)[:-1]
Example data input and output
points = np.array([[0, 0], [1, 1], [0.5, 0.5], [0, 1], [1, 0]])
ch = quickhull_2d(points)
print(ch)
[0, 4, 1, 3]
print(points[ch])
[[0. 0.]
 [1. 0.]
 [1. 1.]
 [0. 1.]]
There are many issues in this code for Numba to be used.
First of all, returning variable-sized tuples is not possible in Numba because the type of a tuple implicitly includes its size. A tuple is basically a structured type and not a list. See this post and this one for more information about this issue. The solution is basically to return a list (slow) or an array (fast).
Moreover, the type of the parameters changes from one call to another. Indeed, process is called in quickhull_2d with P defined as a NumPy array and then called from process itself with P defined as a list. Lists and arrays are completely different things. It is better to use arrays when possible in Numba, unless you need a list to append an unknown number of items (neither small nor bounded).
Additionally, max(zip(signed_dist, P))[1] is apparently unsupported by Numba, and it is not very efficient anyway (nor idiomatic for NumPy code). P[np.argmax(signed_dist)] should be used instead.
Furthermore, np.cross does not seem to be supported for the general case either, and you currently need to use cross2d instead (from numba.np.extensions).
Finally, when you use a recursive function like this, it is better to specify the input types of the parameters so as to avoid weird errors. This can be done with a signature string.
The resulting code is:
import numpy as np
from numba import njit
from numba.np.extensions import cross2d
@njit('(float64[:,:], int64[:], int64, int64)')
def process(S, P, a, b):
    signed_dist = cross2d(S[P] - S[a], S[b] - S[a])
    K = np.array([i for s, i in zip(signed_dist, P) if s > 0 and i != a and i != b], dtype=np.int64)
    if len(K) == 0:
        return [a, b]
    c = P[np.argmax(signed_dist)]
    return process(S, K, a, c)[:-1] + process(S, K, c, b)

@njit('(float64[:,:],)')
def quickhull_2d(S: np.ndarray) -> np.ndarray:
    a, b = np.argmin(S[:,0]), np.argmax(S[:,0])
    max_index = np.argmax(S[:,0])
    max_element = S[max_index]
    return process(S, np.arange(S.shape[0]), a, max_index)[:-1] + process(S, np.arange(S.shape[0]), max_index, a)[:-1]

points = np.array([[0, 0], [1, 1], [0.5, 0.5], [0, 1], [1, 0]])
ch = quickhull_2d(points)
print(ch) # print [0, 4, 1, 3]
Note that the compilation is slow and the execution time is unlikely to be great. This is due to the lists (and the temporary arrays they imply at run time). The next step is simply to use arrays. The bad news is that concatenate is not supported by Numba (because the general case is not easy to implement, though specific cases are trivial). You can create a new array and copy each part into it (or, even better, preallocate an array and slice it during the recursive calls).
Also note that any recursive function can be transformed into a non-recursive one using a manual stack. That being said, it may be slower and make the code more verbose. There are some benefits to this approach, though: it avoids stack overflow when the recursion is deep, and it may be faster if the function is rewritten so that one of the two calls is not pushed on the stack, in the manner of tail-call optimization.
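The manual-stack idea can be sketched as follows. This is a plain NumPy illustration, not tested under Numba, and the names quickhull_2d_iterative and cross2 are hypothetical helpers introduced here (cross2 stands in for np.cross / cross2d). Because each recursive call drops the last element of its result, the hull is simply the start point of every leaf segment, so the loop appends a whenever no point lies outside the segment (a, b).

```python
import numpy as np

def cross2(u, v):
    # z-component of the 2D cross product of each row of u with the vector v
    return u[:, 0] * v[1] - u[:, 1] * v[0]

def quickhull_2d_iterative(S):
    left, right = int(np.argmin(S[:, 0])), int(np.argmax(S[:, 0]))
    everything = np.arange(S.shape[0])
    # The stack is LIFO, so the segment to process first is pushed last.
    stack = [(everything, right, left), (everything, left, right)]
    hull = []
    while stack:
        P, a, b = stack.pop()
        signed_dist = cross2(S[P] - S[a], S[b] - S[a])
        keep = (signed_dist > 0) & (P != a) & (P != b)
        if not keep.any():
            hull.append(a)          # leaf segment: only its start point is emitted
            continue
        K = P[keep]
        c = int(P[np.argmax(signed_dist)])
        stack.append((K, c, b))     # processed second
        stack.append((K, a, c))     # processed first
    return np.array(hull, dtype=np.int64)

points = np.array([[0, 0], [1, 1], [0.5, 0.5], [0, 1], [1, 0]])
ch = quickhull_2d_iterative(points)
print(ch)
```

On the example points this visits the segments in the same order as the recursive version and yields the same indices, 0, 4, 1, 3. The Python list used as a stack would probably need to be replaced by preallocated arrays before the function could be jitted efficiently.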

Keras Custom Merge Two Tensors

I have two tensors of shape [1,4] say,
[1,2,3,4]
[0.2,0.3,0.4,0.5]
Now I want to merge them in merge layer (perhaps using some custom function using Tensorflow backend) so that they become
[1,0.2,2,0.3,3,0.4,4,0.5]
How can I achieve this? The shape of the tensor is fixed. Thank you for your time.
A possible solution is to concatenate the tensors along axis 0 and then gather the values according to the desired indices, like this:
import tensorflow as tf
from itertools import chain
A = tf.constant([1, 2, 3, 4])
B = tf.constant([0.2, 0.3, 0.4, 0.5])
# Cast A to be compatible with B
A = tf.cast(A, tf.float32)
# Concat AB one next to the other
AB = tf.concat([A, B], axis=0)
# Generate the index sequence 0, 4, 1, 5, ... in order to index the
# concatenated tensor, then use gather to collect the values at those positions
NEW = tf.gather(AB,
                list(
                    chain.from_iterable((i, i + A.shape[0].value)
                                        for i in range(A.shape[0].value))))
with tf.Session() as sess:
    print(sess.run([NEW]))
Using Tensorflow, you can use reshape and concat. These operations are also available in the keras backend.
a = tf.constant([1,2,3,4])
b = tf.constant([10,20,30,40])
c = tf.reshape(tf.concat([tf.reshape(a,(-1,1)), tf.reshape(b, (-1,1))], 1), (-1,))
I don't know if there exists a more straightforward way to accomplish this.
Edit: There exists a simpler solution using tf.stack instead of tf.concat.
c = tf.reshape(tf.stack([a, b], 1),(-1,))
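To see why stacking along axis 1 and then flattening interleaves the values, here is the same idea in NumPy (an analogy only; np.stack and reshape are used on the assumption that they mirror tf.stack and tf.reshape for this purpose). Stacking on axis 1 puts each pair (a[i], b[i]) in one row, and a row-major flatten reads the rows one after another.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.float32)
b = np.array([0.2, 0.3, 0.4, 0.5], dtype=np.float32)

pairs = np.stack([a, b], axis=1)   # shape (4, 2), row i holds (a[i], b[i])
merged = pairs.reshape(-1)         # row-major flatten: a0, b0, a1, b1, ...

print(pairs.shape, merged.shape)
print(merged)
```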

Python Memory error on scipy stats. Scipy linalg lstsq <> manual beta

Not sure if this question belongs here or on crossvalidated but since the primary issue is programming language related, I am posting it here.
Inputs:
Y= big 2D numpy array (300000,30)
X= 1D array (30,)
Desired Output:
B= 1D array (300000,), each element of which is the regression coefficient (slope) obtained by regressing one row of Y (30 values) against X
So B[0] = scipy.stats.linregress(X,Y[0])[0]
I tried this first:
B = scipy.stats.linregress(X,Y)[0]
hoping that it will broadcast X according to shape of Y. Next I broadcast X myself to match the shape of Y. But on both occasions, I got this error:
File "C:\...\scipy\stats\stats.py", line 3011, in linregress
  ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
File "C:\...\numpy\lib\function_base.py", line 1766, in cov
  return (dot(X, X.T.conj()) / fact).squeeze()
MemoryError
I used a manual approach to calculate beta and, on Sascha's suggestion below, also used scipy.linalg.lstsq, as follows:
B = lstsq(Y.T, X)[0] # first estimate of beta
Y1=Y-Y.mean(1)[:,None]
X1=X-X.mean()
B1= np.dot(Y1,X1)/np.dot(X1,X1) # second estimate of beta
The two estimates of beta are very different however:
>>> B1
Out[10]: array([0.135623, 0.028919, -0.106278, ..., -0.467340, -0.549543, -0.498500])
>>> B
Out[11]: array([0.000014, -0.000073, -0.000058, ..., 0.000002, -0.000000, 0.000001])
Scipy's linregress will output slope+intercept which defines the regression-line.
If you want to access the coefficients naturally, scipy's lstsq might be more appropriate, which is an equivalent formulation.
Of course you need to feed it with the correct dimensions (your data is not ready; needs preprocessing; swap dims).
Code
import numpy as np
from scipy.linalg import lstsq
Y = np.random.random((300000,30))
X = np.random.random(30)
x, res, rank, s = lstsq(Y.T, X) # Y transposed!
print(x)
print(x.shape)
Output
[ 1.73122781e-05 2.70274135e-05 9.80840639e-06 ..., -1.84597771e-05
5.25035470e-07 2.41275026e-05]
(300000,)
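A note on why the two estimates in the question differ: lstsq(Y.T, X) solves Y.T @ x = X, that is, it looks for weights over the rows of Y that reproduce X, which is not a per-row regression of Y on X. A sketch of the per-row regression with a single least-squares call, assuming an intercept is wanted (as in linregress), uses a two-column design matrix and passes all rows of Y as right-hand sides. The names design, slopes and intercepts are illustrative, np.linalg.lstsq is used here in place of scipy.linalg.lstsq, and the row count is reduced to keep the example quick.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_obs = 1000, 30            # smaller than the 300000 rows in the question
X = rng.random(n_obs)
Y = rng.random((n_rows, n_obs))

# Design matrix with a slope column and an intercept column: shape (30, 2).
design = np.column_stack([X, np.ones_like(X)])

# One solve with many right-hand sides: Y.T has shape (30, n_rows).
coef, *_ = np.linalg.lstsq(design, Y.T, rcond=None)
slopes, intercepts = coef           # each has shape (n_rows,)

# Closed-form slope from the question (the "second estimate of beta").
Y1 = Y - Y.mean(axis=1, keepdims=True)
X1 = X - X.mean()
B1 = Y1 @ X1 / (X1 @ X1)

print(slopes.shape, np.allclose(slopes, B1))
```

With this formulation the least-squares slopes agree with the manual covariance formula, and no large covariance matrix is ever built.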

Slice 3D ndarray with 2D ndarray in numpy?

My apologies if this has been answered many times, but I just can't find a solution.
Assume the following code:
import numpy as np
A,_,_ = np.meshgrid(np.arange(5),np.arange(7),np.arange(10))
B = (np.random.rand(7,10)*5).astype(int)
How can I slice A using B so that B represents the indices in the first and last dimensions of A (i.e., A[magic] = B)?
I have tried
A[:,B,:] which doesn't work due to peculiarities of advanced indexing.
A[:,B,np.arange(10)] generates 7 copies of the matrix I'm after
A[np.arange(7),B,np.arange(10)] gives the error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Any other suggestions?
These both work:
A[0, B, 0]
A[B, B, B]
Really, only the B in axis 1 matters; the others can be any index sequences that broadcast to B.shape and stay within A.shape[0] (for axis 0) and A.shape[2] (for axis 2). For a ridiculous example:
A[list(range(7)) + list(range(3)), B, range(9,-1, -1)]
But you don't want to use : because then you'll get, as you said, 7 or 10 (or both!) "copies" of the array you want.
A, _, _ = np.meshgrid(np.arange(5),np.arange(7),np.arange(10))
B = (np.random.rand(7,10)*A.shape[1]).astype(int)
np.allclose(B, A[0, B, 0])
#True
np.allclose(B, A[B, B, B])
#True
np.allclose(B, A[list(range(7)) + list(range(3)), B, range(9,-1, -1)])
#True
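The ValueError in the question comes from broadcasting: index shapes (7,), (7, 10) and (10,) are aligned on their last axis, so the (7,) array collides with the 10 columns of B. Giving the row index an explicit trailing axis fixes that. A small sketch, with rows, cols, picked and picked2 as illustrative names and np.take_along_axis shown as one possible alternative:

```python
import numpy as np

rng = np.random.default_rng(0)
A, _, _ = np.meshgrid(np.arange(5), np.arange(7), np.arange(10))  # shape (7, 5, 10)
B = rng.integers(0, A.shape[1], size=(7, 10))

rows = np.arange(7)[:, None]     # shape (7, 1)
cols = np.arange(10)[None, :]    # shape (1, 10)

# The three index arrays broadcast to (7, 10): picked[i, k] = A[i, B[i, k], k].
picked = A[rows, B, cols]

# take_along_axis expects an index array with the same number of dimensions as A.
picked2 = np.take_along_axis(A, B[:, None, :], axis=1)[:, 0, :]

print(picked.shape, np.array_equal(picked, B), np.array_equal(picked2, B))
```

Unlike A[0, B, 0], this form pairs each entry of B with its own row and column of A, which matters as soon as A is not constant along axes 0 and 2.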

Convolution along one axis only

I have two 2-D arrays with the same first axis dimensions. In python, I would like to convolve the two matrices along the second axis only. I would like to get C below without computing the convolution along the first axis as well.
import numpy as np
import scipy.signal as sg
M, N, P = 4, 10, 20
A = np.random.randn(M, N)
B = np.random.randn(M, P)
C = sg.convolve(A, B, 'full')[(2*M-1)//2]
Is there a fast way?
You can use np.apply_along_axis to apply np.convolve along the desired axis. Here is an example of applying a boxcar filter to a 2d array:
import numpy as np
a = np.arange(10)
a = np.vstack((a,a)).T
filt = np.ones(3)
np.apply_along_axis(lambda m: np.convolve(m, filt, mode='full'), axis=0, arr=a)
This is an easy way to generalize many functions that don't have an axis argument.
With ndimage.convolve1d, you can specify the axis...
np.apply_along_axis won't really help you, because you're trying to iterate over two arrays. Effectively, you'd have to use a loop, as described here.
Now, loops are fine if your arrays are small, but if N and P are large, then you probably want to use FFT to convolve instead.
However, you need to appropriately zero pad your arrays first, so that your "full" convolution has the expected shape:
M, N, P = 4, 10, 20
A = np.random.randn(M, N)
B = np.random.randn(M, P)
A_ = np.zeros((M, N+P-1), dtype=A.dtype)
A_[:, :N] = A
B_ = np.zeros((M, N+P-1), dtype=B.dtype)
B_[:, :P] = B
A_fft = np.fft.fft(A_, axis=1)
B_fft = np.fft.fft(B_, axis=1)
C_fft = A_fft * B_fft
C = np.real(np.fft.ifft(C_fft))
# Test
C_test = np.zeros((M, N+P-1))
for i in range(M):
    C_test[i, :] = np.convolve(A[i, :], B[i, :], 'full')
assert np.allclose(C, C_test)
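A slightly shorter variant of the same idea, offered as a sketch: since the inputs are real, np.fft.rfft and np.fft.irfft can be used, and their n argument does the zero padding, so the padded copies are not needed. The name n_full is introduced here for the full output length.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P = 4, 10, 20
A = rng.standard_normal((M, N))
B = rng.standard_normal((M, P))

n_full = N + P - 1
# The n argument zero-pads each row to n_full before transforming.
A_f = np.fft.rfft(A, n=n_full, axis=1)
B_f = np.fft.rfft(B, n=n_full, axis=1)
# n must be passed to irfft as well, otherwise an odd length cannot be recovered.
C = np.fft.irfft(A_f * B_f, n=n_full, axis=1)

# Reference: row-by-row direct convolution.
C_ref = np.stack([np.convolve(x, y, 'full') for x, y in zip(A, B)])
print(C.shape, np.allclose(C, C_ref))
```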
For 2D arrays, the function scipy.signal.convolve2d is faster, and scipy.signal.fftconvolve can be even faster (depending on the dimensions of the arrays).
Here is the same code with N = 100000:
import time
import numpy as np
import scipy.signal as sg
M, N, P = 10, 100000, 20
A = np.random.randn(M, N)
B = np.random.randn(M, P)
T1 = time.time()
C = sg.convolve(A, B, 'full')
print(time.time()-T1)
T1 = time.time()
C_2d = sg.convolve2d(A, B, 'full')
print(time.time()-T1)
T1 = time.time()
C_fft = sg.fftconvolve(A, B, 'full')
print(time.time()-T1)
>>> 12.3
>>> 2.1
>>> 0.6
The answers are all the same, with slight differences due to the different computation methods used (e.g., FFT vs. direct multiplication, but I don't know exactly what convolve2d uses):
print(np.max(np.abs(C - C_2d)))
>>>7.81597009336e-14
print(np.max(np.abs(C - C_fft)))
>>>1.84741111298e-13
Late answer, but worth posting for reference. Quoting from comments of the OP:
Each row in A is being filtered by the corresponding row in B. I could
implement it like that, just thought there might be a faster way.
A is on the order of 10s of gigabytes in size and I use overlap-add.
Naive / Straightforward Approach
import numpy as np
import scipy.signal as sg
M, N, P = 4, 10, 20
A = np.random.randn(M, N) # (4, 10)
B = np.random.randn(M, P) # (4, 20)
C = np.vstack([sg.convolve(a, b, 'full') for a, b in zip(A, B)])
>>> C.shape
(4, 29)
Each row in A is convolved with each respective row in B, essentially convolving M 1D arrays/vectors.
No Loop + CUDA Supported Version
It is possible to replicate this operation by using PyTorch's F.conv1d. We have to imagine A as a 4-channel, 1D signal of length 10. We wish to convolve each channel in A with a specific kernel of length 20. This is a special case called a depthwise convolution, often used in deep learning.
Note that torch's conv is implemented as cross-correlation, so we need to flip B in advance to do actual convolution.
import torch
import torch.nn.functional as F
@torch.no_grad()
def torch_conv(A, B):
    M, P = A.shape[0], B.shape[1]
    # padding=P-1 on both sides gives the 'full' output length N+P-1
    C = F.conv1d(A, B[:, None, :], bias=None, stride=1, groups=M, padding=P-1)
    return C.numpy()
# Convert A and B to torch tensors + flip B
X = torch.from_numpy(A) # (4, 10)
W = torch.from_numpy(np.fliplr(B).copy()) # (4, 20)
# Do grouped conv and get np array
Y = torch_conv(X, W)
>>> Y.shape
(4, 29)
>>> np.allclose(C, Y)
True
Advantages of using a depthwise convolution with torch:
No loops!
The above solution can also run on CUDA/GPU, which can really speed things up if A and B are very large matrices. (From OP's comment, this seems to be the case: A is 10GB in size.)
Disadvantages:
Overhead of converting from array to tensor (should be negligible)
Need to flip B once before the operation
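The flip mentioned above can be checked without torch. For real inputs, a convolution is a cross-correlation with the kernel reversed, which this NumPy-only sketch illustrates with np.correlate standing in for the cross-correlation that conv1d computes (the names conv and corr_flipped are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(10)   # one row of A
b = rng.standard_normal(20)   # the matching row of B

conv = np.convolve(a, b, 'full')
# Cross-correlating with the reversed kernel reproduces the convolution.
corr_flipped = np.correlate(a, b[::-1], 'full')

print(conv.shape, np.allclose(conv, corr_flipped))
```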