batched tensor slice, slice B x N x M with B x 1 - numpy

I have an B x M x N tensor, X, and I have and B x 1 tensor, Y, which corresponds to the index of tensor X at dimension=1 that I want to keep. What is the shorthand for this slice so that I can avoid a loop?
Essentially I want to do this:
Z = torch.zeros(B,N)
for i in range(B):
Z[i] = X[i][Y[i]]

the following code is similar to the code in the loop. the difference is that instead of sequentially indexing the array Z,X and Y we are indexing them in parallel using the array i
B, M, N = 13, 7, 19
X = np.random.randint(100, size= [B,M,N])
Y = np.random.randint(M , size= [B,1])
Z = np.random.randint(100, size= [B,N])
i = np.arange(B)
Y = Y.ravel() # reducing array to rank-1, for easy indexing
Z[i] = X[i,Y[i],:]
this code can be further simplified as
-> Z[i] = X[i,Y[i],:]
-> Z[i] = X[i,Y[i]]
-> Z[i] = X[i,Y]
-> Z = X[i,Y]
pytorch equivalent code
B, M, N = 5, 7, 3
X = torch.randint(100, size= [B,M,N])
Y = torch.randint(M , size= [B,1])
Z = torch.randint(100, size= [B,N])
i = torch.arange(B)
Y = Y.ravel()
Z = X[i,Y]

The answer provided by #Hammad is short and perfect for the job. Here's an alternative solution if you're interested in using some less known Pytorch built-ins. We will use torch.gather (similarly you can achieve this with numpy.take).
The idea behind torch.gather is to construct a new tensor-based on two identically shaped tensors containing the indices (here ~ Y) and the values (here ~ X).
The operation performed is Z[i][j][k] = X[i][Y[i][j][k]][k].
Since X's shape is (B, M, N) and Y shape is (B, 1) we are looking to fill in the blanks inside Y such that Y's shape becomes (B, 1, N).
This can be achieved with some axis manipulation:
>>> Y.expand(-1, N)[:, None] # expand to dim=1 to N and unsqueeze dim=1
The actual call to torch.gather will be:
>>> X.gather(dim=1, index=Y.expand(-1, N)[:, None])
Which you can reshape to (B, N) by adding in [:, 0].
This function can be very effective in tricky scenarios...

Related

In Tensorflow, is there a built in function to compute states over time given a transition matrix?

I have a system given by this recursive relationship: xt = At xt-1 + bt. I wish to compute xt for all t, with At, bt and x0 given. Is there are built-in function for that? If I use a loop it would be extremely slow. Thanks!
There is sort of a way. Let's say you have your A matrices in a 3D tensor with shape (T, N, N), where T is the total number of time steps and N is the size of your vector. Similarly, B values are in a 2D tensor (T, N). The first step in the computation would be:
x1 = A[0] # x0 + B[0]
Where # represents matrix product. But you can convert this into a single matrix product. Suppose we add a value 1 at the end of x0, and we call that x0p (for prime):
x0p = tf.concat([x, [1]], axis=0)
And now we build a new 3D tensor Ap with shape (T, N+1, N+1), such that for each A[i] we concatenate B[i] as a new column, and then we add a row with N zeros and a single one at the end:
AwithB = tf.concat([tf.concat([A, tf.expand_dims(B, 2)], axis=2)], axis=1)
AnewRow = tf.concat([tf.zeros((T, 1, N), A.dtype), tf.ones((T, 1, 1), A.dtype)], axis=2)
Ap = tf.concat([AwithB, AnewRow], axis=1)
As it turns out, you can now say:
x1p = Ap[0] # x0p
And therefore:
x2p = Ap[1] # x1p = Ap[1] # Ap[0] # x0p
So we just need to compute all the matrix product of all matrices in Ap across the first dimension. Unfortunately, there does not seem to be a direct operation to compute that with TensorFlow, but you can do it relatively fast with tf.scan:
Ap_prod = tf.scan(tf.matmul, Ap)[-1]
And with that you just have to do:
xtp = Ap_prod # x0p
Here is a proof of concept (the code is tweaked to support single examples and batches, either in the A and B values or in the x)
import tensorflow as tf
def compute_state(a, b, x):
s = tf.shape(a)
t = s[-3]
n = s[-1]
# Add final 1 to x
xp = tf.concat([x, tf.ones_like(x[..., :1])], axis=-1)
# Add B column to A
a_b = tf.concat([tf.concat([a, tf.expand_dims(b, axis=-1)], axis=-1)], axis=-2)
# Make new final row for A
a_row = tf.concat([tf.zeros_like(a[..., :1, :]),
tf.ones_like(a[..., :1, :1])], axis=-1)
# Add new row to A
ap = tf.concat([a_b, a_row], axis=-2)
# Compute matrix product reduction
ap_prod = tf.scan(tf.matmul, ap)[..., -1, :, :]
# Compute final result
outp = tf.linalg.matvec(ap_prod, xp)
return outp[..., :-1]
#Test
tf.random.set_seed(0)
a = tf.random.uniform((10, 5, 5), -1, 1)
b = tf.random.uniform((10, 5), -1, 1)
x = tf.random.uniform((5,), -1, 1)
y = compute_state(a, b, x)
# Also works with batches of (a, b) or x
a = tf.random.uniform((100, 10, 5, 5), -1, 1)
b = tf.random.uniform((100, 10, 5), -1, 1)
x = tf.random.uniform((100, 5), -1, 1)
y = compute_state(a, b, x)

Numpy : multivariate indexing?

I wander, is it possible to index several dimensions at once ? With some broadcasting. Example :
Suppose i have an array A, shaped (n,d). Suppose i have a indexing array, say I with integer values between 0 and d-1. Set B = A[:,I].
If shape(I) == (k,), for whaterver k, then B has shape (n,k) and B[x,y] = A[x,I[y]].
But if shape(I) == (k,p) for whatever (k,p), then i wanted B to be shaped (n,k,p) with B[x,y,z] = A[x,I[y,z]].
1° How can i get this behavior ?
2° Does it have a drawback i did not see ?
You can do it exactly as you described it:
import numpy as np
n = 100
d = 20
k = 10
p = 17
A = np.random.random((n, d))
I = np.random.randint(low=0, high=d, size=(k, p))
B = A[:, I]
print(B.shape) # (n, k, p)
# Testing if the new array B is constructed as expected
x = 3
y = 5
z = 7
print(B[x, y, z])
print(A[x, I[y, z]])
print(B[x, y, z] == A[x, I[y, z]])
Its hard to answer if this is a good implementation or not, without context. But in general it is a good idea to use numpy and vectorization if you have speed in mind.

Faster way to patchify a picture to overlapping blocks

I'm looking for a FAST (and if possible memory afficiant) way to rewrite a function I crerated as part of Visual bag of words algorithm:
def get_pic_patches(pic, l, s): # "s" stands for stride
r, c = pic.shape
i, j = [0, 0]
x_range = list(range(0, r, s ) )
y_range = list(range(0, c , s ) )
patches = []
patches_location = []
for x in x_range: # without last two since it will exceed dimensions
for y in y_range: # without last two since it will exceed dimensions
if x+ l<= r and y+l <= c:
patch = pic[x:x + l , y:y + l ]
patches_location.append([x, y]) # patch location is the upper left pixel
patches.append( patch )
return patches, patches_location
it takes a grayscale image (NOT RGB!), desired patch length and stride value,
and gives back all patches as a list of numpy array.
On other qestions, I found this:
def patchify(img, patch_shape):
img = np.ascontiguousarray(img) # won't make a copy if not needed
X, Y = img.shape
x, y = patch_shape
shape = ((X-x+1), (Y-y+1), x, y) # number of patches, patch_shape
strides = img.itemsize*np.array([Y, 1, Y, 1])
return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
in order to get to return a list, I used it like this:
def patchify(img, patch_shape):
img = np.ascontiguousarray(img) # won't make a copy if not needed
X, Y = img.shape
x, y = patch_shape
shape = ((X-x+1), (Y-y+1), x, y) # number of patches, patch_shape
strides = img.itemsize*np.array([Y, 1, Y, 1])
patches = np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
a,b,c,d = patches.shape
patches = patches.reshape(((a*b),c,d))
patches = patches.tolist()
return
but this was actually much slower than my original function! another problem is that is only works with stride = 1, and I want to be able to use all sorts of stride values.

hessian of a variable returned by tf.concat() is None

Let x and y be vectors of length N, and z is a function z = f(x,y). In Tensorflow v1.0.0, tf.hessians(z,x) and tf.hessians(z,y) both returns an N by N matrix, which is what I expected.
However, when I concatenate the x and y into a vector p of size 2*N using tf.concat, and run tf.hessian(z, p), it returns error "ValueError: None values not supported."
I understand this is because in the computation graph x,y ->z and x,y -> p, so there is no gradient between p and z. To circumvent the problem, I can create p first, slice it into x and y, but I will have to change a ton of my code. Is there a more elegant way?
related question: Slice of a variable returns gradient None
import tensorflow as tf
import numpy as np
N = 2
A = tf.Variable(np.random.rand(N,N).astype(np.float32))
B = tf.Variable(np.random.rand(N,N).astype(np.float32))
x = tf.Variable(tf.random_normal([N]) )
y = tf.Variable(tf.random_normal([N]) )
#reshape to N by 1
x_1 = tf.reshape(x,[N,1])
y_1 = tf.reshape(y,[N,1])
#concat x and y to form a vector with length of 2*N
p = tf.concat([x,y],axis = 0)
#define the function
z = 0.5*tf.matmul(tf.matmul(tf.transpose(x_1), A), x_1) + 0.5*tf.matmul(tf.matmul(tf.transpose(y_1), B), y_1) + 100
#works , hx and hy are both N by N matrix
hx = tf.hessians(z,x)
hy = tf.hessians(z,y)
#this gives error "ValueError: None values not supported."
#expecting a matrix of size 2*N by 2*N
hp = tf.hessians(z,p)
Compute the hessian by its definition.
gxy = tf.gradients(z, [x, y])
gp = tf.concat([gxy[0], gxy[1]], axis=0)
hp = []
for i in range(2*N):
hp.append(tf.gradients(gp[i], [x, y]))
Because tf.gradients computes the sum of (dy/dx), so when computing the second partial derivative, one should slice the vector into scalars and then compute the gradient. Tested on tf1.0 and python2.

solving a sparse non linear system of equations using scipy.optimize.root

I want to solve the following non-linear system of equations.
Notes
the dot between a_k and x represents dot product.
the 0 in the first equation represents 0 vector and 0 in the second equation is scaler 0
all the matrices are sparse if that matters.
Known
K is an n x n (positive definite) matrix
each A_k is a known (symmetric) matrix
each a_k is a known n x 1 vector
N is known (let's say N = 50). But I need a method where I can easily change N.
Unknown (trying to solve for)
x is an n x 1 a vector.
each alpha_k for 1 <= k <= N a scaler
My thinking.
I am thinking of using scipy root to find x and each alpha_k. We essentially have n equations from each row of the first equation and another N equations from the constraint equations to solve for our n + N variables. Therefore we have the required number of equations to have a solution.
I also have a reliable initial guess for x and the alpha_k's.
Toy example.
n = 4
N = 2
K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],[0,0,0,0.23]])
A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],[0.62,0,0,0.45]])
a_1 = np.matrix(scipy.rand(4,1))
a_2 = np.matrix(scipy.rand(4,1))
We are trying to solve for
x = [x1, x2, x3, x4] and alpha_1, alpha_2
Questions:
I can actually brute force this toy problem and feed it to the solver. But how do I do I solve this toy problem in such a way that I can extend it easily to the case when I have let's say n=50 and N=50
I will probably have to explicitly compute the Jacobian for larger matrices??.
Can anyone give me any pointers?
I think the scipy.optimize.root approach holds water, but steering clear of the trivial solution might be the real challenge for this system of equations.
In any event, this function uses root to solve the system of equations.
def solver(x0, alpha0, K, A, a):
'''
x0 - nx1 numpy array. Initial guess on x.
alpha0 - nx1 numpy array. Initial guess on alpha.
K - nxn numpy.array.
A - Length N List of nxn numpy.arrays.
a - Length N list of nx1 numpy.arrays.
'''
# Establish the function that produces the rhs of the system of equations.
n = K.shape[0]
N = len(A)
def lhs(x_alpha):
'''
x_alpha is a concatenation of x and alpha.
'''
x = np.ravel(x_alpha[:n])
alpha = np.ravel(x_alpha[n:])
lhs_top = np.ravel(K.dot(x))
for k in xrange(N):
lhs_top += alpha[k]*(np.ravel(np.dot(A[k], x)) + np.ravel(a[k]))
lhs_bottom = [0.5*x.dot(np.ravel(A[k].dot(x))) + np.ravel(a[k]).dot(x)
for k in xrange(N)]
lhs = np.array(lhs_top.tolist() + lhs_bottom)
return lhs
# Solve the system of equations.
x0.shape = (n, 1)
alpha0.shape = (N, 1)
x_alpha_0 = np.vstack((x0, alpha0))
sol = root(lhs, x_alpha_0)
x_alpha_root = sol['x']
# Compute norm of residual.
res = sol['fun']
res_norm = np.linalg.norm(res)
# Break out the x and alpha components.
x_root = x_alpha_root[:n]
alpha_root = x_alpha_root[n:]
return x_root, alpha_root, res_norm
Running on the toy example, however, only produces the trivial solution.
# Toy example.
n = 4
N = 2
K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],
[0,0,0,0.23]])
A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],
[0.62,0,0,0.45]])
a_1 = np.matrix(scipy.rand(4,1))
a_2 = np.matrix(scipy.rand(4,1))
A = [A_1, A_2]
a = [a_1, a_2]
x0 = scipy.rand(n, 1)
alpha0 = scipy.rand(N, 1)
print 'x0 =', x0
print 'alpha0 =', alpha0
x_root, alpha_root, res_norm = solver(x0, alpha0, K, A, a)
print 'x_root =', x_root
print 'alpha_root =', alpha_root
print 'res_norm =', res_norm
Output is
x0 = [[ 0.00764503]
[ 0.08058471]
[ 0.88300129]
[ 0.85299622]]
alpha0 = [[ 0.67872815]
[ 0.69693346]]
x_root = [ 9.88131292e-324 -4.94065646e-324 0.00000000e+000
0.00000000e+000]
alpha_root = [ -4.94065646e-324 0.00000000e+000]
res_norm = 0.0