Can tensorflow use matrix of matrix? - tensorflow

I know tensorflow can calculate expressions like [ [a,b,c] ] x [ [x],[y],[z] ] when the elements are primitive data type (integer or float).
Is it possible to perform a similar computation when each of a, b and c is a 1x3 matrix and x, y and z are 3x1 matrices?
Can TensorFlow calculate and optimize this formula?

The tf.batch_matmul() operator can perform matrix multiplications on batches of matrices. In this case, you would have a tensor abc of shape (3, 1, 3) (where abc[0, :, :] = a, abc[1, :, :] = b, etc.) and a tensor xyz of shape (3, 3, 1) (where xyz[0, :, :] = x, etc.).
abc = ...
xyz = ...
result = tf.batch_matmul(abc, xyz)
print result.get_shape() # ==> "(3, 1, 1)"
result is a 3-D tensor with contents equivalent to tf.pack([tf.matmul(a, x), tf.matmul(b, y), tf.matmul(c, z)]).

Related

Tabular data: Implementing a custom tensor layer without resorting to iteration

I have an idea for a tensor operation that would not be difficult to implement via iteration, with batch size one. However I would like to parallelize it as much as possible.
I have two tensors with shape (n, 5) called X and Y. X is actually supposed to represent 5 one-dimensional tensors with shape (n, 1): (x_1, ..., x_n). Ditto for Y.
I would like to compute a tensor with shape (n, 25) where each column represents the output of the tensor operation f(x_i, y_j), where f is fixed for all 1 <= i, j <= 5. The operation f has output shape (n, 1), just like x_i and y_i.
I feel it is important to clarify that f is essentially a fully-connected layer from the concatenated [...x_i, ...y_i] tensor with shape (1, 10), to an output layer with shape (1,5).
Again, it is easy to see how to do this manually with iteration and slicing. However this is probably very slow. Performing this operation in batches, where the tensors X, Y now have shape (n, 5, batch_size) is also desirable, particularly for mini-batch gradient descent.
It is difficult to really articulate here why I desire to create this network; I feel it is suited for my domain of 'itemized tabular data' and cuts down significantly on the number of weights per operation, compared to a fully connected network.
Is this possible using tensorflow? Certainly not using just keras.
Below is an example in numpy per AloneTogether's request
import numpy as np
features = 16
batch_size = 256
X_batch = np.random.random((features, 5, batch_size))
Y_batch = np.random.random((features, 5, batch_size))
# one tensor operation to reduce weights in this custom 'layer'
f = np.random.random((features, 2 * features))
for b in range(batch_size):
X = X_batch[:, :, b]
Y = Y_batch[:, :, b]
for i in range(5):
x_i = X[:, i:i+1]
for j in range(5):
y_j = Y[:, j:j+1]
x_i_y_j = np.concatenate([x_i, y_j], axis=0)
# f(x_i, y_j)
# implemented by a fully-connected layer
f_i_j = np.matmul(f, x_i_y_j)
All operations you need (concatenation and matrix multiplication) can be batched.
Difficult part here is, that you want to concatenate features of all items in X with features of all items in Y (all combinations).
My recommended solution is to expand the dimensions of X to [batch, features, 5, 1], expand dimensions of Y to [batch, features, 1, 5]
Than tf.repeat() both tensors so their shapes become [batch, features, 5, 5].
Now you can concatenate X and Y. You will have a tensor of shape [batch, 2*features, 5, 5]. Observe that this way all combinations are built.
Next step is matrix multiplication. tf.matmul() can also do batch matrix multiplication, but I use here tf.einsum() because I want more control over which dimensions are considered as batch.
Full code:
import tensorflow as tf
import numpy as np
batch_size=3
features=6
items=5
x = np.random.uniform(size=[batch_size,features,items])
y = np.random.uniform(size=[batch_size,features,items])
f = np.random.uniform(size=[2*features,features])
x_reps= tf.repeat(x[:,:,:,tf.newaxis], items, axis=3)
y_reps= tf.repeat(y[:,:,tf.newaxis,:], items, axis=2)
xy_conc = tf.concat([x_reps,y_reps], axis=1)
f_i_j = tf.einsum("bfij, fg->bgij", xy_conc,f)
f_i_j = tf.reshape(f_i_j , [batch_size,features,items*items])

computing multiple covariance matrices

I have a matrix X of size (1875, 77). For each column, I want to compute the covariance matrix, i.e., x_1 # x_1.T where x_1 has a shape (1875, 1). Ideally, I want to do this in one-go without a for loop. Is there an easy way to do this?
I was thinking about padding with zeros for each column up and down based on the column index(so x_1 will have 76 zeros columns, x_2 will have one (77, 1) zero column pad on top and 75 zero column pads), but this seems to complicate things more.
You probably want this:
import numpy as np
r, c = 1875, 77
X = np.random.rand(r, c)
covs = X.T[..., None] # X.T[:, None, :]
covs.shape
# (77, 1875, 1875)
This simply performs c number of matrix multiplications, 1 for each column of X. Here X.T[..., None] is of shape (c, r, 1) and X.T[:, None, :] is of shape (c, 1, r) and this makes the matrix multiple between them compatible.

Tensorflow: Mulitiplication of a vector with a Matrix

I have a vector x = <C elements>. I want to multiple it with a matrix Z = <C x C elements>. This will give an output of <1xC> matrix ( <C> vector).
I can just multiply them if I have one samples only for each.
But during training, I have tensors x = <NxC> and Z = <NxCxC>, where 'N' is my mini-batch size. How can I achieve the above computation in this case. Plain tf.matmul() return error complaining about dimensions.
ValueError: Shape must be rank 2 but is rank 3 for 'Channel/MatMul' (op: 'MatMul') with input shapes: [?,2], [?,2,2].
Thanks,
Vishnu Raj
Here is one way to do it:
Expand the last dimension of tensor x. Its shape is now (N, C).
Multiply tensors Z and x with broadcasting. The shape of the resulting tensor is (N, C, C).
Sum over the last dimension of that tensor.
For example:
import tensorflow as tf
N = 2
C = 3
Z = tf.ones((N, C, C)) # Shape=(N, C, C)
x = tf.reshape(tf.range(0, N*C, dtype=tf.float32), shape=(N, C)) # Shape=(N, C)
mul = tf.reduce_sum(Z * x[:, :, None], axis=-1) # Shape=(N, C)
with tf.Session() as sess:
print(sess.run(mul))

Evaluating the pairwise euclidean distance between multi-dimensional inputs in TensorFlow

I have two 2-D tensors of shape say m X d and n X d. What is the optimized(i.e. without for loops) or the tensorflow way of evaluating the pairwise euclidean distance between these two tensors so that I get an output tensor of shape m X n. I need it for creating the squared term of a Gaussian kernel for ultimately having a covariance matrix of size m x n.
The equivalent unoptimized numpy code would look like this
difference_squared = np.zeros((x.shape[0], x_.shape[0]))
for row_iterator in range(difference_squared.shape[0]):
for column_iterator in range(difference_squared.shape[1]):
difference_squared[row_iterator, column_iterator] = np.sum(np.power(x[row_iterator]-x_[column_iterator], 2))
I found the answer by taking help from here. Assuming the two tensors are x1 and x2, and their dimensions are m X d and n X d, their pair-wise Euclidean distance is given by
tile_1 = tf.tile(tf.expand_dims(x1, 0), [n, 1, 1])
tile_2 = tf.tile(tf.expand_dims(x2, 1), [1, m, 1])
pairwise_euclidean_distance = tf.reduce_sum(tf.square(tf.subtract(tile_1, tile_2)), 2))

How does tensorflow batch_matmul work?

Tensorflow has a function called batch_matmul which multiplies higher dimensional tensors. But I'm having a hard time understanding how it works, perhaps partially because I'm having a hard time visualizing it.
What I want to do is multiply a matrix by each slice of a 3D tensor, but I don't quite understand what the shape of tensor a is. Is z the innermost dimension? Which of the following is correct?
I would most prefer the first to be correct -- it's most intuitive to me and easy to see in the .eval() output. But I suspect the second is correct.
Tensorflow says that batch_matmul performs:
out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])
What does that mean? What does that mean in the context of my example? What is being multiplied with with what? And why aren't I getting a 3D tensor the way I expected?
You can imagine it as doing a matmul over each training example in the batch.
For example, if you have two tensors with the following dimensions:
a.shape = [100, 2, 5]
b.shape = [100, 5, 2]
and you do a batch tf.matmul(a, b), your output will have the shape [100, 2, 2].
100 is your batch size, the other two dimensions are the dimensions of your data.
First of all tf.batch_matmul() was removed and no longer available. Now you suppose to use tf.matmul():
The inputs must be matrices (or tensors of rank > 2, representing
batches of matrices), with matching inner dimensions, possibly after
transposition.
So let's assume you have the following code:
import tensorflow as tf
batch_size, n, m, k = 10, 3, 5, 2
A = tf.Variable(tf.random_normal(shape=(batch_size, n, m)))
B = tf.Variable(tf.random_normal(shape=(batch_size, m, k)))
tf.matmul(A, B)
Now you will receive a tensor of the shape (batch_size, n, k). Here is what is going on here. Assume you have batch_size of matrices nxm and batch_size of matrices mxk. Now for each pair of them you calculate nxm X mxk which gives you an nxk matrix. You will have batch_size of them.
Notice that something like this is also valid:
A = tf.Variable(tf.random_normal(shape=(a, b, n, m)))
B = tf.Variable(tf.random_normal(shape=(a, b, m, k)))
tf.matmul(A, B)
and will give you a shape (a, b, n, k)
You can now do it using tf.einsum, starting from Tensorflow 0.11.0rc0.
For example,
M1 = tf.Variable(tf.random_normal([2,3,4]))
M2 = tf.Variable(tf.random_normal([5,4]))
N = tf.einsum('ijk,lk->ijl',M1,M2)
It multiplies the matrix M2 with every frame (3 frames) in every batch (2 batches) in M1.
The output is:
[array([[[ 0.80474716, -1.38590837, -0.3379252 , -1.24965811],
[ 2.57852983, 0.05492432, 0.23039417, -0.74263287],
[-2.42627382, 1.70774114, 1.19503212, 0.43006262]],
[[-1.04652011, -0.32753903, -1.26430523, 0.8810069 ],
[-0.48935518, 0.12831448, -1.30816901, -0.01271309],
[ 2.33260512, -1.22395933, -0.92082584, 0.48991606]]], dtype=float32),
array([[ 1.71076882, 0.79229093, -0.58058828, -0.23246667],
[ 0.20446332, 1.30742455, -0.07969904, 0.9247328 ],
[-0.32047141, 0.66072595, -1.12330854, 0.80426538],
[-0.02781649, -0.29672042, 2.17819595, -0.73862702],
[-0.99663496, 1.3840003 , -1.39621222, 0.77119476]], dtype=float32),
array([[[ 0.76539308, 2.77609682, -1.79906654, 0.57580602, -3.21205115],
[ 4.49365759, -0.10607499, -1.64613271, 0.96234947, -3.38823152],
[-3.59156275, 2.03910899, 0.90939498, 1.84612727, 3.44476724]],
[[-1.52062428, 0.27325237, 2.24773455, -3.27834225, 3.03435063],
[ 0.02695178, 0.16020992, 1.70085776, -2.8645196 , 2.48197317],
[ 3.44154787, -0.59687197, -0.12784094, -2.06931567, -2.35522676]]], dtype=float32)]
I have verified, the arithmetic is correct.
tf.tensordot should solve this problem. It supports batch operations, e.g., if you want to contract a 2D tensor with a 3D tensor, with the latter having a batch dimension.
If a is shape [n,m] b is shape [?,m,l], then
y = tf.tensordot(b, a, axes=[1, 1]) will produce a tensor of shape [?,n,l]
https://www.tensorflow.org/api_docs/python/tf/tensordot
It is simply like splitting on the first dimension respectively, multiply and concat them back. If you want to do 3D by 2D, you can reshape, multiply, and reshape it back. I.e. [100, 2, 5] -> [200, 5] -> [200, 2] -> [100, 2, 2]
The answer to this particular answer is using tf.scan function.
If a = [5,3,2] #dimension of 5 batch, with 3X2 mat in each batch
and b = [2,3] # a constant matrix to be multiplied with each sample
then let def fn(a,x):
return tf.matmul(x,b)
initializer = tf.Variable(tf.random_number(3,3))
h = tf.scan(fn,outputs,initializer)
this h will store all the outputs.