How to multiply a very large sparse matrix in TensorFlow?

I have a very large sparse tensor input_a with shape [20k, 20k] and a dense tensor input_b with shape [20k, 512]. nnz(input_a) is about 400k (there are 400k nonzero values in input_a). I would like to multiply these two tensors to get an output tensor with shape [20k, 512]. If I use tf.sparse_tensor_dense_matmul to multiply input_a and input_b, I get the following error (with TensorFlow v1.8):
Cannot use GPU when output.shape[1] * nnz(a) > 2^31
If output.shape[1] is 20k, it might make sense, because 20k * 400k = 8000000000 > 2^31 = 2147483648. But, I think output.shape[1] should be 512 in this case?
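The arithmetic behind that reading of the message can be checked directly. This is a minimal sketch in plain Python, assuming "20k" and "400k" mean 20,000 and 400,000; the variable names are illustrative and not part of any TensorFlow API.

```python
# Sizes taken from the question; "k" is assumed to mean 1,000.
nnz = 400_000          # nonzero entries in input_a
limit = 2 ** 31        # bound quoted in the error message

# Product named in the message, for the two candidate column counts.
for cols in (20_000, 512):
    product = cols * nnz
    print(cols, product, product > limit)
# 20000 8000000000 True
# 512 204800000 False
```

With 512 columns the product is roughly 2.05e8, well below 2^31, which is why the message looks surprising for these shapes.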
My code:
input_a: shape [20k, 20k]
input_b: shape [20k, 512]
output: shape [20k, 512]
output = tf.sparse_tensor_dense_matmul(input_a, input_b)
Then I will get the error message.
If I transform input_a to a dense tensor and then use tf.matmul, it works, but it might cost a lot of memory and time?
dense_input_a = tf.sparse_tensor_to_dense(input_a)
output = tf.matmul(dense_input_a, input_b, a_is_sparse = True)
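The memory concern can be put into rough numbers. The estimate below is a sketch that assumes float32 values and the int64 index pairs a rank-2 tf.SparseTensor stores; it ignores any framework overhead.

```python
# Assumptions: float32 values (4 bytes) and int64 indices (8 bytes each,
# two per nonzero entry, as in a rank-2 tf.SparseTensor).
rows = cols = 20_000
nnz = 400_000

dense_bytes = rows * cols * 4
sparse_bytes = nnz * 4 + nnz * 2 * 8

print(dense_bytes / 1e9, "GB dense")    # 1.6 GB dense
print(sparse_bytes / 1e6, "MB sparse")  # 8.0 MB sparse
```

Under these assumptions the dense copy is about 200 times larger than the sparse representation.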
Why did the error occur, and how should I multiply these two tensors in an efficient way? Thanks!

Related

ValueError: Dimensions must be equal in Tensorflow/Keras

My code is as follows:
v = tf.Variable(initial_value=v, trainable=True)
v.shape is (1, 768)
In the model:
inputs_sents = keras.Input(shape=(50,3))
inputs_events = keras.Input(shape=(50,768))
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
But I got an error,
ValueError: Dimensions must be equal, but are 768 and 50 for
'{{node BatchMatMulV2_3}} =
BatchMatMulV2[T=DT_FLOAT,
adj_x=false,
adj_y=false](BatchMatMulV2_3/ReadVariableOp,
Transpose_3)' with input shapes: [1,768], [768,50,?]
I think it is taking the batch dimension into account? How should I deal with this?
v is a trainable vector (or a 2-D array whose first dimension is 1); I want it to be updated during training.
PS: This is the result I got using the code provided in the first answer. I think it is incorrect, because Keras already accounts for the leading batch dimension.
Plus, from the keras documentation,
shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; 'None' elements represent dimensions where the shape is not known.
https://keras.io/api/layers/core_layers/input/
Should I rewrite my code without Keras?
The shape of a batch is denoted by None:
import numpy as np
import tensorflow as tf
from tensorflow import keras
inputs_sents = keras.Input(shape=(None,1,3))
inputs_events = keras.Input(shape=(None,1,768))
v = np.ones(shape=(1,768), dtype=np.float32)
v = tf.Variable(initial_value=v, trainable=True)
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
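The shapes in the error message, [1,768] and [768,50,?], suggest that tf.transpose without a perm argument reversed every axis of inputs_events, including the batch axis. A possible alternative is to swap only the last two axes and let the matrix product broadcast over the batch. The sketch below uses NumPy as a stand-in to show the resulting shapes; the batch size of 4 is an arbitrary assumption, and in TensorFlow the corresponding call should be tf.matmul(v, inputs_events, transpose_b=True).

```python
import numpy as np

batch = 4                                   # arbitrary batch size for the demo
v = np.ones((1, 768), dtype=np.float32)     # stands in for the trainable vector
events = np.ones((batch, 50, 768), dtype=np.float32)
sents = np.ones((batch, 50, 3), dtype=np.float32)

# Swap only the last two axes so the batch axis stays in front.
events_t = np.swapaxes(events, -1, -2)      # (batch, 768, 50)

x_1 = np.matmul(v, events_t)                # (1, 768) broadcasts -> (batch, 1, 50)
x_2 = np.matmul(x_1, sents)                 # (batch, 1, 50) @ (batch, 50, 3)

print(x_1.shape, x_2.shape)
```

With the original Input shapes of (50, 768) and (50, 3), this yields one (1, 3) result per batch element.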

Optimizing Tensorflow for many small matrix-vector multiplications

To build up a capsule network training script, I need to compute many small matrix-vector multiplications.
The size of each weight matrix is at most 20 by 20.
The number of weight matrices is more than 900.
I'm curious whether tf.matmul or tf.linalg.matvec is the best option for this.
Could anybody give me a hint to optimize the training script?
EDIT:
Looking at the notebook that you are referring to, it seems you have the following parameters:
batch_size = 50
caps1_n_caps = 1152
caps1_n_dims = 8
caps2_n_caps = 10
caps2_n_dims = 16
And then you have a tensor w with shape (caps1_n_caps, caps2_n_caps, caps2_n_dims, caps1_n_dims) (in the notebook it has an initial dimension with size 1 that I am skipping) and another tensor caps1_output with shape (batch_size, caps1_n_caps, caps1_n_dims). And you need to combine them to produce caps2_predicted with shape (batch_size, caps1_n_caps, caps2_n_caps, caps2_n_dims).
In the notebook they tile the tensors in order to multiply them with tf.linalg.matmul, but you can compute the same result without any tiling by using tf.einsum:
import tensorflow as tf
batch_size = 50
caps1_n_caps = 1152
caps1_n_dims = 8
caps2_n_caps = 10
caps2_n_dims = 16
w = tf.zeros((caps1_n_caps, caps2_n_caps, caps2_n_dims, caps1_n_dims), dtype=tf.float32)
caps1_output = tf.zeros((batch_size, caps1_n_caps, caps1_n_dims), dtype=tf.float32)
# Contract over l (caps1_n_dims): one matrix-vector product per (i, j) pair.
caps2_predicted = tf.einsum('ijkl,bil->bijk', w, caps1_output)
print(caps2_predicted.shape)
# (50, 1152, 10, 16)
I'm not sure if I have understood exactly what you want, but you say you want to compute something like:
ûij = Wij × ui
For a collection of several matrices W and vectors u. Assuming you have 900 matrices and vectors, matrices have size 20×20 and vectors have size 20, you can represent them as two tensors, ws, with shape (900, 20, 20), and us, with shape (900, 20). If you do that, your result us_hat, with shape (900, 20, 20), would be computed simply as:
us_hat = ws * tf.expand_dims(us, axis=-1)
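If what is wanted is the usual matrix-vector product for each pair, the result has one length-20 vector per matrix, that is, shape (900, 20) rather than (900, 20, 20). The following NumPy sketch illustrates that reading with random placeholder data. In TensorFlow, tf.linalg.matvec(ws, us), which the question mentions, or tf.einsum with the same subscripts should play the same role.

```python
import numpy as np

rng = np.random.default_rng(0)
ws = rng.standard_normal((900, 20, 20))   # placeholder weight matrices
us = rng.standard_normal((900, 20))       # placeholder input vectors

# Batched matrix-vector product: us_hat[n] = ws[n] @ us[n]
us_hat = np.einsum('nij,nj->ni', ws, us)

# Reference computed one pair at a time.
reference = np.stack([ws[n] @ us[n] for n in range(len(ws))])

print(us_hat.shape, np.allclose(us_hat, reference))
```

A single batched call like this avoids launching 900 separate small multiplications.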

Tensorflow: how to apply a conditional operation that is differentiable to a Tensor?

I would like to define a Tensorflow operation that allows me to, given a tensor, return a boolean tensor of the same size where all values in the tensor greater than 0 are set to 1, and all other values are set to 0.
I have tried using tf.cond, tf.where, x>0, but I'm getting the following error:
ValueError: No gradients provided for any variable, check your graph
for ops that do not support gradients, between variables
Are there Tensorflow operation(s) that will allow me to perform this binary thresholding that are also differentiable/have defined gradients?
Here is the code that is causing the error:
x1 and x2 are tensors of shape (32, 128, 128, 1):
diff = tf.abs(x1-x2)
diff = tf.to_float(diff > 0.0)
y = tf.reduce_mean(tf.reduce_sum(diff, axis=[1, 2, 3]))
Thanks
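A hard threshold is piecewise constant, so its derivative is zero wherever it is defined and no gradient can reach the variables through it. One common workaround, offered here as an assumption rather than as an accepted answer, is to replace the step with a steep smooth function such as a sigmoid. The plain-Python sketch below compares the two slopes by finite differences; the function names and the steepness k=50 are illustrative choices.

```python
import math

def hard_step(x):
    """1.0 where x > 0, else 0.0 (what `diff > 0.0` computes)."""
    return 1.0 if x > 0.0 else 0.0

def soft_step(x, k=50.0):
    """Smooth surrogate: sigmoid(k * x). Larger k is closer to the hard step."""
    return 1.0 / (1.0 + math.exp(-k * x))

def slope(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

x = 0.02
print(slope(hard_step, x))   # flat: nothing to backpropagate
print(slope(soft_step, x))   # nonzero slope near the threshold
```

In TensorFlow the same idea would be tf.sigmoid(k * diff). Because diff is non-negative here, tf.tanh(k * diff), which is 0 at diff = 0 and approaches 1 as diff grows, may be a closer fit.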

tensorflow conv1d kernel size dimensionality error

When taking the one dimensional convolution of a one dimensional array, I receive an error which suggests my second dimension is not big enough.
Here is the overview of the relevant code:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=1, strides=1, padding='valid')
I am hoping to increase the kernel size to 3 such that neighbouring points also influence the output of each input node, however I get the following error:
ValueError: Negative dimension size caused by subtracting 3 from 1 for
'conv1d_4/convolution/Conv2D' (op: 'Conv2D') with input shapes:
[?,1,1,45], [1,3,45,64].
My guess is that TensorFlow expects me to reshape my input into two dimensions so that some depth can be used for the kernel multiplication. Why is this the case, and how does the layer's behaviour depend on the input dimensions?
You need to add a channel dimension as the last dimension, even if you only have one channel.
So this code works:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=-1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=3, strides=1, padding='valid')
So basically the error was caused because your tensor looked as if it had a width of 1 with 45 channels. TensorFlow was trying to convolve with a kernel of size 3 along a dimension of size 1.
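The "subtracting 3 from 1" in the error message follows from the usual output-length formula for padding='valid'. A minimal plain-Python sketch, where valid_conv_length is a hypothetical helper and not a TensorFlow function:

```python
def valid_conv_length(width, kernel_size, stride=1):
    """Output length of a 1-D convolution with padding='valid'."""
    return (width - kernel_size) // stride + 1

# expand_dims(axis=1): shape (batch, 1, 45) -> width 1, 45 channels
print(valid_conv_length(1, 3))    # -1, i.e. no valid output position
# expand_dims(axis=-1): shape (batch, 45, 1) -> width 45, 1 channel
print(valid_conv_length(45, 3))   # 43
```

With kernel_size=1 the first layout still gives a length of 1, which is why the original code ran until the kernel size was increased.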

Tensorflow batch sparse multiply

I would like to multiply a sparse tensor by a dense tensor but do so within a batch.
For example I have a sparse tensor with the corresponding dense shape of (20,65536,65536) where 20 is the batch size. I would like to multiply each (65536,65536) in the batch with the corresponding (65536x1) from a tensor shape (20,65536) which has a dense representation. tf.sparse_tensor_dense_matmul only accepts a rank 2 sparse tensor. Is there a way to perform this over a batch?
I would like to avoid converting the sparse matrix to a dense matrix if possible due to memory constraints.
Assuming that a is a sparse tensor with shape (20, 65536, 65536) and b a dense tensor with shape (20, 65536), you could perform the batch sparse-dense matrix multiplication as follows:
y_sparse = tf.sparse.reduce_sum_sparse(a * b[:, None, :], axis=-1)
This solution expands the second dimension of tensor b to enable implicit broadcasting. Then, the batch matrix multiplication takes place by performing a sparse-dense multiplication and a sparse sum along the last axis.
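The broadcast-then-sum identity can be checked on small dense arrays. This is a NumPy sketch with sizes 3 and 5 standing in for 20 and 65536; it demonstrates only the arithmetic, not the sparse TensorFlow kernels.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n = 3, 5                              # small stand-ins for 20 and 65536
a = rng.standard_normal((batch, n, n))
a[rng.random(a.shape) < 0.7] = 0.0           # zero most entries to mimic sparsity
b = rng.standard_normal((batch, n))

# Broadcast b over the rows of each matrix, multiply, sum over columns.
y = (a * b[:, None, :]).sum(axis=-1)

# Reference: one matrix-vector product per batch element.
reference = np.stack([a[i] @ b[i] for i in range(batch)])

print(y.shape, np.allclose(y, reference))
```

Each output entry y[i, r] is the sum over columns c of a[i, r, c] * b[i, c], which is exactly row r of the matrix-vector product for batch element i.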
If b has a third dimension, so that it is a batch of matrices, you can multiply by its columns individually and concatenate the results afterwards:
multiplied_dims = []
for i in range(b.shape[-1]):
    multiplied_dims.append(tf.expand_dims(tf.sparse.reduce_sum(a * b[:, :, i][:, None, :], axis=-1), -1))
result = tf.concat(multiplied_dims, -1)
The answer is simple - you reshape the sparse tensor first and then multiply it by the dense matrix. Something like this would work:
sparse_tensor_rank2 = tf.sparse_reshape(sparse_tensor, [-1, 65536])
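The reshape approach can be sketched on small dense arrays as well. Note the assumption it relies on: the dense operand is a single matrix shared by every batch element. With a different dense vector per batch element, as in the question, it would not apply directly. NumPy is used as a stand-in and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n, k = 3, 5, 2                        # small stand-ins for the real sizes
a = rng.standard_normal((batch, n, n))       # dense stand-in for the sparse tensor
m = rng.standard_normal((n, k))              # one dense matrix shared by all batches

# Stack the batch along the rows, multiply once, then split the batch back out.
a_rank2 = a.reshape(-1, n)                   # (batch * n, n)
y = (a_rank2 @ m).reshape(batch, n, k)

reference = np.stack([a[i] @ m for i in range(batch)])
print(y.shape, np.allclose(y, reference))
```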