einsum for sparse tensor(s) in TensorFlow

I want to multiply two tensors, one sparse and the other dense. The sparse one is 3D and the dense one is 2D. I cannot convert the sparse tensor to a dense tensor (i.e., I must avoid tf.sparse.to_dense(...)).
The multiplication is defined by the following rule:
C[i,j] = \sum_k A[i,k]*B[i,k,j]
where C = A*B and A and B are the dense and sparse tensors described above.
An example of execution in TF would be as follows:
# Example
# inputs
A = tf.constant([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]], tf.float32)
B = tf.sparse.SparseTensor(
    indices=[[0, 0, 1], [0, 1, 2], [0, 2, 0],
             [1, 0, 0], [1, 1, 1], [1, 1, 2], [1, 2, 2],
             [2, 0, 2], [2, 1, 1], [2, 2, 1], [2, 2, 2]],
    values=tf.constant([1, 1, 1, 1, 2, 1, 1, 1, 1, 1, -1], tf.float32),
    dense_shape=[3, 3, 3])
# output
C = tf.constant([[3, 1, 2],
                 [4, 10, 11],
                 [0, 17, -2]], tf.float32)
tf.einsum does not support sparse tensors.
I have a version where I slice the 3D sparse tensor B into a collection of 2D sparse matrices, B[0,:,:], B[1,:,:], B[2,:,:], ..., and multiply each row of the dense matrix A, A[i,:], with the corresponding 2D slice B[i,:,:] using tf.sparse.sparse_dense_matmul(A[i,:],B[i,:,:]) (with the reshapes needed after slicing so that both arguments are 2D tensors). Then I stack all the resulting vectors to assemble the matrix C. This procedure is slow and breaks the tensorial structure of B. I want to perform the same operation using ONLY TensorFlow functions (avoiding for loops that slice the sparse tensor and later reassemble the result by stacking). It should then work in Keras as a layer of a neural network ([A,B] is the batched list of inputs, C = A*B is the batched output of the layer). Breaking up the tensors to compute the multiplications is crazy for training in the compiled graph!
Any ideas? Does there exist any tf.sparse.einsum-like function for sparse tensors?
If I converted B to a dense tensor, it would be super straightforward with tf.einsum('ik,ikj->ij', A, B). However, I cannot afford to lose the sparsity of B.
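One way this contraction might be expressed without densifying B is sketched below. It is an assumption on my part, not a built-in sparse einsum, and the helper name dense_sparse_contract is hypothetical. Each stored entry B[i,k,j] contributes A[i,k]*B[i,k,j] to C[i,j], so the matching A values can be looked up with tf.gather_nd and the products summed into place with tf.scatter_nd, which accumulates duplicate indices. tf.sparse.to_dense appears only to check the result on the small example.

```python
import tensorflow as tf

def dense_sparse_contract(A, B):
    """C[i, j] = sum_k A[i, k] * B[i, k, j] for dense A and sparse rank-3 B."""
    idx = B.indices                         # [nnz, 3], rows are (i, k, j)
    a_vals = tf.gather_nd(A, idx[:, :2])    # A[i, k] for every stored entry of B
    products = a_vals * B.values            # one product per stored entry
    out_idx = tf.stack([idx[:, 0], idx[:, 2]], axis=1)           # (i, j) targets
    out_shape = tf.stack([B.dense_shape[0], B.dense_shape[2]])   # int64, like idx
    # scatter_nd adds up all products that land on the same (i, j).
    return tf.scatter_nd(out_idx, products, out_shape)

A = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]], tf.float32)
B = tf.sparse.SparseTensor(
    indices=[[0, 0, 1], [0, 1, 2], [0, 2, 0],
             [1, 0, 0], [1, 1, 1], [1, 1, 2], [1, 2, 2],
             [2, 0, 2], [2, 1, 1], [2, 2, 1], [2, 2, 2]],
    values=tf.constant([1, 1, 1, 1, 2, 1, 1, 1, 1, 1, -1], tf.float32),
    dense_shape=[3, 3, 3])

C = dense_sparse_contract(A, B)

# Dense reference, used here only to verify the sketch on a tiny input.
reference = tf.einsum('ik,ikj->ij', A, tf.sparse.to_dense(B))
print(bool(tf.reduce_all(tf.equal(C, reference))))
```

The work is proportional to the number of stored entries of B rather than to its dense size, and only ordinary TensorFlow ops are involved, so no Python loop over slices is needed.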
Thank you. Regards,

Related

How to use list of sparse tensors in tf.data.Dataset?

I'm trying to build a model that takes a list of sparse tensors as input (the list length equals the batch size).
The reason I use sparse tensors is that I have to pass an adjacency matrix to my GNN model, and it is very sparse (~99%).
I'm familiar with PyTorch, where it is very easy to feed a sparse tensor into the network.
However, I found that I have to use tf.data.Dataset or keras.utils.Sequence to build a dataset in TensorFlow.
But those methods throw an error when I use a list of sparse tensors as input.
For example, the code below raises a TypeError:
import tensorflow as tf
tf.data.Dataset.from_tensor_slices(sparse_lists)
TypeError: Neither a SparseTensor nor SparseTensorValue:
[<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2e25b5c0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c22ada0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c22a400>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed240>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed390>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed470>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed5c0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed710>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed828>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed940>].
I know that it will work if I concatenate all the sparse tensors in the list into one huge tensor.
However, that is not an option for me because I have to index the sparse tensors later.
(If I concatenate 2D sparse tensors into a 3D sparse tensor, I cannot use indexing like the following.)
Some3DSparseTensor[:10]
It would also take more time, because I would have to slice the 3D tensor for matrix multiplication with other dense networks.
Furthermore, I know it would work if I built a sparse tensor from indices and values for every batch, but that would take too much time per batch.
As a result, because of the indexing and time issues, I want tf.data.Dataset to generate batches from a list of sparse tensors.
Can anybody help me? :)
Long story short,
What I have: a list of sparse tensors (e.g., a list of length 1,000,000)
What I need to do: batch the list of sparse tensors (e.g., a list of length 1024, not a sparse concat)
If the SparseTensors have the same dense_shape, you can create a single SparseTensor instead of a list and pass it to from_tensor_slices.
For example, the following code produces separate SparseTensors from a large SparseTensor s by splitting it along the first dimension:
s = tf.sparse.SparseTensor(
    indices=tf.constant([[0, 0, 0], [1, 0, 0], [1, 0, 1], [2, 1, 1]], dtype=tf.int64),
    values=tf.range(4, dtype=tf.float32),
    dense_shape=(3, 2, 2))
d = tf.data.Dataset.from_tensor_slices(s)
for t in d:
    print(t)
>>> SparseTensor(indices=tf.Tensor([[0 0]], shape=(1, 2), dtype=int64), values=tf.Tensor([0.], shape=(1,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
SparseTensor(indices=tf.Tensor(
[[0 0]
[0 1]], shape=(2, 2), dtype=int64), values=tf.Tensor([1. 2.], shape=(2,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
SparseTensor(indices=tf.Tensor([[1 1]], shape=(1, 2), dtype=int64), values=tf.Tensor([3.], shape=(1,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
To use from_tensor_slices in this way, you need a function to convert the list sparse_lists to a large SparseTensor s (reported below).
To recap, you can do
import tensorflow as tf

def sparse_list_to_sparse_tensor(sparse_lists):
    n = len(sparse_lists)
    shape = sparse_lists[0].dense_shape
    out_shape = (n, *shape)
    out_values = tf.concat([s.values for s in sparse_lists], axis=0)
    out_indices = []
    for i, s in enumerate(sparse_lists):
        # Prepend the list position i as a new leading index column.
        element_idx = tf.cast(tf.fill((s.indices.shape[0], 1), i), dtype=tf.int64)
        out_indices.append(tf.concat([element_idx, s.indices], axis=1))
    out_indices = tf.concat(out_indices, axis=0)
    return tf.sparse.SparseTensor(out_indices, out_values, out_shape)

tf.data.Dataset.from_tensor_slices(sparse_list_to_sparse_tensor(sparse_lists))
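As a hedged alternative to building the indices by hand, the same stacked tensor can probably be obtained by giving each element a leading axis with tf.sparse.expand_dims and joining the results with tf.sparse.concat; the resulting dataset can then be batched directly. The three small tensors and the batch size of 2 below are made-up illustration values, and all elements are assumed to share one dense_shape.

```python
import tensorflow as tf

# Three hypothetical 2x2 sparse matrices sharing one dense_shape.
sparse_lists = [
    tf.sparse.SparseTensor(indices=[[0, 0]], values=[1.0], dense_shape=[2, 2]),
    tf.sparse.SparseTensor(indices=[[0, 1], [1, 0]], values=[2.0, 3.0], dense_shape=[2, 2]),
    tf.sparse.SparseTensor(indices=[[1, 1]], values=[4.0], dense_shape=[2, 2]),
]

# Add a leading axis to each element, then concatenate along it.
stacked = tf.sparse.concat(
    axis=0, sp_inputs=[tf.sparse.expand_dims(s, axis=0) for s in sparse_lists])

# Slicing along the first axis recovers the elements; batch() regroups them
# into sparse batches without converting anything to dense.
dataset = tf.data.Dataset.from_tensor_slices(stacked).batch(2)
batch_shapes = [batch.dense_shape.numpy().tolist() for batch in dataset]
print(stacked.dense_shape.numpy().tolist(), batch_shapes)
```

Each batch is itself a SparseTensor with a leading batch axis, so the batching requirement is met without a per-batch rebuild from indices and values.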
An alternative solution uses from_tensor_slices on every sparse tensor (after adding a dummy batch dimension) to create many single-element datasets, which can then be concatenated into a single dataset.
dataset = None
for sparse_tensor in sparse_lists:
    batched_sparse_tensor = tf.sparse.expand_dims(sparse_tensor, axis=0)
    element_dataset = tf.data.Dataset.from_tensor_slices(batched_sparse_tensor)
    if dataset is None:
        dataset = element_dataset
    else:
        dataset = dataset.concatenate(element_dataset)
Note that with this solution the sparse tensors can have different dense_shapes.

How to slice array in tensorflow

I have a tensor T of size [None, 4], and I want to slice it along the second dimension to give me a size [None] tensor. In NumPy this would be T[:, DIMENSION]. Is there a fast way to do it with TensorFlow commands?
You can use the same indexing syntax as in NumPy.
a = tf.constant([[1,2,3,4],[5,6,7,8],[7,8,9,0]])
a.shape
The shape of a is
Out[]:TensorShape([Dimension(3), Dimension(4)])
Using the slicing operation, select index 2 along the second dimension:
a[:, 2].eval()
and the output is
array([3, 7, 9])
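For reference, a minimal sketch of the same slice in TensorFlow 2.x, where eager execution replaces .eval(). The function name take_column is hypothetical and is only there to show the [None, 4] case from the question.

```python
import tensorflow as tf

a = tf.constant([[1, 2, 3, 4], [5, 6, 7, 8], [7, 8, 9, 0]])
column = a[:, 2]           # same syntax as NumPy; the result has shape [3]
print(column.numpy())      # [3 7 9]

# With an unknown first dimension the slice works the same way.
@tf.function(input_signature=[tf.TensorSpec([None, 4], tf.int32)])
def take_column(t):
    return t[:, 2]

print(take_column(a).numpy())  # [3 7 9]
```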

shape of a sparse tensor without invoking run()

The shape attribute of a sparse tensor returns a Tensor object, which seems to be of no use for extracting the actual shape of the sparse tensor without resorting to run().
To clarify what I mean, first consider a sparse tensor:
a = tf.SparseTensor(indices=[[0, 0, 0], [1, 2, 1]], values=[1.0+2j, 2.0], shape=[3, 4, 2])
a.shape returns:
<tf.Tensor 'SparseTensor_1/shape:0' shape=(3,) dtype=int64>
This is kind of no use.
Now, consider a dense tensor:
a = tf.constant(np.random.normal(0.0, 1.0, (4, 4)).astype(dtype=np.complex128))
a.get_shape() returns:
TensorShape([Dimension(4), Dimension(4)])
I can use this output and cast it into a list or tuple of integers without ever invoking run(). However, I cannot do the same for a sparse tensor unless I first convert it to dense (which is not yet implemented for complex sparse tensors) and then call get_shape() on the result. That is redundant, defeats the purpose of using a sparse tensor in the first place, and also leads to an error down the road if the input sparse tensor is complex.
Is there a way to obtain the shape of a sparse tensor without invoking run() or converting it to a dense tensor first?
tf.SparseTensor is implemented as a triple of dense Tensors under the hood. The shape of a SparseTensor is just a Tensor; if you want to know its value, your best bet is to evaluate it using session.run:
print(sess.run(a.shape))
In general, Tensorflow does not promise to compute an exact shape even for dense tensors at graph construction time; shapes are best effort and may not even have a fixed value. So even for a dense Tensor you may have to evaluate the Tensor using run to get a precise shape.
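In TensorFlow 2.x the situation is somewhat easier. The sketch below assumes a 2.x install with eager execution: the dense_shape attribute holds the shape as an int64 tensor, while shape reports a static TensorShape whenever dense_shape is a known constant, so no session is needed. Note that the constructor argument is dense_shape here, not the older shape keyword used in the question.

```python
import tensorflow as tf

a = tf.sparse.SparseTensor(
    indices=[[0, 0, 0], [1, 2, 1]],
    values=tf.constant([1.0 + 2j, 2.0], dtype=tf.complex128),
    dense_shape=[3, 4, 2])

static_shape = a.shape.as_list()                 # plain Python ints
dynamic_shape = a.dense_shape.numpy().tolist()   # value of the int64 shape tensor
print(static_shape, dynamic_shape)
```

When dense_shape is only known at run time (for example inside a traced function), the static shape can still contain unknown dimensions, which matches the best-effort caveat above.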

regarding reshape a multi-dimensional tensor into [-1, n]

While reading a TensorFlow segmentation implementation, I am trying to figure out what the following code is meant to do.
A tensor x is defined as follows: self.x = tf.placeholder("float", shape=[None, None, None, n_label]).
Later, a function uses a transformed tensor x1, which is defined as x1 = tf.reshape(self.x, [-1, n_label]).
My understanding is that tf.reshape(self.x, [-1, n_label]) should try to reshape the x tensor into a 1-D vector.
But I am confused about x being defined with shape=[None, None, None, n_label] and x1 being transformed this way. What should x1 actually look like, and why do this?
None means that we don't want to fix the dimension when creating the graph, but rather determine it at runtime. For instance, this is useful when you want to use different minibatch sizes during training and inference.
A -1 for some dimension in reshape means 'infer this dimension so that the total size of the tensor is preserved'. For example, tf.reshape(x, [-1, 2]) for x of shape [3, 4, 2] would produce a new tensor of shape [12, 2].
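A small sketch of what this means for the tensor in the question. The concrete sizes (2, 5, 5) and n_label = 3 are made-up stand-ins for the None dimensions. The result x1 is a 2-D tensor of shape [N, n_label], not a 1-D vector: every leading dimension is collapsed into N and the label axis is kept.

```python
import tensorflow as tf

n_label = 3                               # hypothetical number of labels
x = tf.zeros([2, 5, 5, n_label])          # stands in for [None, None, None, n_label]
x1 = tf.reshape(x, [-1, n_label])         # -1 is inferred as 2 * 5 * 5 = 50
print(x1.shape)                           # (50, 3)

y = tf.reshape(tf.zeros([3, 4, 2]), [-1, 2])
print(y.shape)                            # (12, 2)
```

In a segmentation setting this layout puts one pixel per row, which is convenient for a per-pixel loss over n_label classes.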

Tensorflow reshape tensor gives None dimension

I have used the model described here on the 0.6.0 branch. The code can be found here. I have done some minor changes to the linked code.
In my code I create two models, one for training and one for validation, much as is done in the TensorFlow tutorial.
with tf.variable_scope("model", reuse=None, initializer=initializer):
    m = PTBModel_User(is_training=True, config=config, name='Training model')
with tf.variable_scope("model", reuse=True, initializer=initializer):
    mtest = PTBModel_User(is_training=False, config=config_valid, name='Validation model')
The first model, the one for training, seems to be created just fine, but the second, used for validation, is not. The output gets a None dimension! The line I'm referring to is line 134 in the linked code:
output = tf.reshape(tf.concat(1, outputs), [-1, size])
I've added these lines right after the reshape of the output:
output_shape = output.get_shape()
print("Model num_steps:", num_steps)
print("Model batch_size:", batch_size)
print("Output dims", output_shape[0], output_shape[1])
and that gives me this:
Model num_steps: 400
Model batch_size: 1
Output dims Dimension(None) Dimension(650)
This problem only happens with the 'validation model', not with the 'training model'. For the 'training model' I get expected output:
Model num_steps: 400
Model batch_size: 2
Output dims Dimension(800) Dimension(650)
(Note that with the 'validation model' I use a batch_size=1 instead of batch_size=2 that I use for the training model)
From what I understand, passing -1 to the reshape function should figure out the output shape automagically! But then why do I get None? Nothing in the config fed to the model has a None value.
Thank you for all the help and tips!
TL;DR: A dimension being None simply means that shape inference could not determine an exact shape for the output tensor, at graph-building time. When you run the graph, the tensor will have the appropriate run-time shape.
If you're not interested in how shape inference works, you can stop reading now.
Shape inference applies local rules, based on a "shape function" that takes the shapes of the inputs to an operation and computes (possibly incomplete) shapes for the outputs of an operation. To figure out why tf.reshape() gives an incomplete shape, we have to look at its inputs, and work backwards:
The shape argument to tf.reshape() includes a [-1], which means "figure the output shape automagically" based on the shape of the tensor input.
The tensor input is the output of tf.concat() on the same line.
The inputs to tf.concat() are computed by a tf.mul() in BasicLSTMCell.__call__(). The tf.mul() op multiplies the result of a tf.tanh() and a tf.sigmoid() op.
The tf.tanh() op produces an output of size [?, hidden_size], and the tf.sigmoid() op produces an output of size [batch_size, hidden_size].
The tf.mul() op performs NumPy-style broadcasting. A dimension will only be broadcast if it has size 1. Consider three cases where we compute tf.mul(x, y):
1. If x has shape [1, 10], and y has shape [5, 10], then broadcasting will happen, and the output shape will be [5, 10].
2. If x has shape [1, 10], and y has shape [1, 10], then there will be no broadcasting, and the output shape will be [1, 10].
3. However, if x has shape [1, 10], and y has shape [?, 10], there is insufficient static information to tell whether broadcasting will happen (even though we happen to know that case 2 applies at runtime).
Therefore, when batch_size is 1, the tf.mul() op produces an output with the shape [?, hidden_size]; but when batch_size is greater than 1, the output shape is [batch_size, hidden_size].
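The three cases can be reproduced with tf.broadcast_static_shape, which computes the static shape a broadcasting op would report. This is a sketch assuming TensorFlow 2.x; the helper name static_broadcast is hypothetical, and None plays the role of ? above.

```python
import tensorflow as tf

def static_broadcast(x_shape, y_shape):
    """Static shape that broadcasting x and y would produce (None = unknown)."""
    result = tf.broadcast_static_shape(
        tf.TensorShape(x_shape), tf.TensorShape(y_shape))
    return result.as_list()

print(static_broadcast([1, 10], [5, 10]))      # case 1: [5, 10]
print(static_broadcast([1, 10], [1, 10]))      # case 2: [1, 10]
print(static_broadcast([1, 10], [None, 10]))   # case 3: [None, 10]
print(static_broadcast([2, 10], [None, 10]))   # batch_size > 1: [2, 10]
```

The last line shows why the training model is unaffected: a known size greater than 1 can only be matched or broadcast to, so the unknown side does not matter.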
Where shape inference breaks down, it can be appropriate to use the Tensor.set_shape() method to add information. This would potentially be useful in the BasicLSTMCell implementation, where we know more than it is possible to infer about the shapes of the outputs.
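A minimal sketch of adding that information by hand, written for TensorFlow 2.x with tf.function rather than the 0.6.0 graph API used in the question. The function name flatten, the shapes dictionary, and the value 400 are illustrative assumptions; the static shapes are recorded while the function is traced.

```python
import tensorflow as tf

size = 650
shapes = {}  # static shapes recorded during tracing

@tf.function(input_signature=[tf.TensorSpec([None, None, size], tf.float32)])
def flatten(outputs):
    output = tf.reshape(outputs, [-1, size])
    shapes["inferred"] = output.shape.as_list()    # first dimension is unknown
    output.set_shape([400, size])                  # assumption: we know it is 400
    shapes["annotated"] = output.shape.as_list()
    return output

result = flatten(tf.zeros([1, 400, size]))
print(shapes["inferred"], shapes["annotated"])
```

set_shape only updates the static information and is not verified when the graph runs; tf.ensure_shape is the variant that also checks the shape at run time.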