Create a TF Dataset of SparseTensors with from_generator - tensorflow

I have a generator that yields tf.sparse.SparseTensors and want to turn it into a TensorFlow Dataset, but I am running into some issues. I am using TF2. First, unlike regular Tensors, you cannot simply yield them and pass the matching data types as output_types. For a sparse tensor representing [1,0,0,0,5,0], the error looks like
tensorflow.python.framework.errors_impl.InvalidArgumentError: TypeError: `generator` yielded an element that could not be converted to the expected type. The expected type was int64, but the yielded element was SparseTensor(indices=tf.Tensor(
E [[0]
E [4]], shape=(2, 1), dtype=int64), values=tf.Tensor([1 5], shape=(2,), dtype=int64), dense_shape=tf.Tensor([6], shape=(1,), dtype=int64)).
After some searching, I found this open issue and tried something similar: https://github.com/tensorflow/tensorflow/issues/16689. The idea is to read the indices, values, and shape into a TF Dataset as separate tensors, and then map over the dataset to build the sparse tensor. This does not work the way some of the examples in the GitHub issue suggest: tf.sparse.SparseTensor(indices, values, shape) does not seem to accept indices and shape as tf.Tensors. It will happily take a list or NumPy array, but not a Tensor. Since map does not run eagerly, I cannot call .numpy() on the Tensor either. What is the best way to get this to work? I see there is tf.py_function/tf.numpy_function, which could help, but constructing the output type can be tricky (though not impossible) for my use case, because the incoming data is not fixed and can have a mix of sparse and dense tensors.
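One possible route, assuming TensorFlow 2.4 or later (where from_generator accepts an output_signature argument), is to describe each element with a tf.SparseTensorSpec instead of output_types. This is only a sketch: the generator sparse_gen, its second element, and the fixed shape [6] are illustrative assumptions, not part of the question.

```python
import tensorflow as tf

def sparse_gen():
    # Hypothetical generator; the first element is the sparse form of [1, 0, 0, 0, 5, 0].
    yield tf.sparse.SparseTensor(
        indices=[[0], [4]],
        values=tf.constant([1, 5], dtype=tf.int64),
        dense_shape=[6],
    )
    yield tf.sparse.SparseTensor(
        indices=[[2]],
        values=tf.constant([7], dtype=tf.int64),
        dense_shape=[6],
    )

# The spec tells from_generator to expect a SparseTensor rather than a dense int64 Tensor.
dataset = tf.data.Dataset.from_generator(
    sparse_gen,
    output_signature=tf.SparseTensorSpec(shape=[6], dtype=tf.int64),
)

# Densify each element only to inspect the result.
dense_rows = [tf.sparse.to_dense(st).numpy().tolist() for st in dataset]
```

For a mix of sparse and dense elements, output_signature can be a tuple or dict that combines tf.SparseTensorSpec and tf.TensorSpec entries, provided the structure is known before the dataset is built.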

Related

'Tensor' vs 'tf.Tensor' tensorflow

I am exploring TensorFlow internals, and sometimes when I print the value of a tensor, I will see data like the following: Tensor("x/PlaceholderWithDefault:0", shape=(), dtype=int32)
and other times I will see tf.Tensor(0, shape=(), dtype=int32).
What is the difference between these two expressions? Are Tensor and tf.Tensor different? And if not, why are they displayed differently (and seem to have different behavior)?
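The two printouts can be reproduced side by side. The following is a minimal sketch, assuming TF 2.x with eager execution enabled; the names trace_me and captured are hypothetical. It prints a tensor once outside a tf.function (an eager tensor that holds a concrete value) and once while the function is being traced (a symbolic graph tensor that only has a name, shape, and dtype).

```python
import tensorflow as tf

# Outside tf.function: an eager tensor, whose printout includes the value.
eager_value = tf.constant(0)
eager_text = str(eager_value)

captured = {}

@tf.function
def trace_me(x):
    # This Python body runs while the graph is traced; x is symbolic here.
    captured["text"] = str(x)
    captured["eager"] = tf.executing_eagerly()
    return x + 1

result = trace_me(tf.constant(0))

print(eager_text)        # eager form, starts with "tf.Tensor("
print(captured["text"])  # graph form, starts with "Tensor("
```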

Setting shape of RaggedTensor with Known Shape

I'm working with RaggedTensors to manipulate a dense tensor. Something like this:
out_left = tf.ragged.boolean_mask(input, index)
index = tf.math.logical_not(index)
out_right = tf.ragged.boolean_mask(input, index)
reconstructed_tensor = tf.concat([out_left, out_right], axis=-1)
reconstructed_tensor = reconstructed_tensor.to_tensor()
As you can see, in my example I'm just splitting my input tensor using RaggedTensors and reconstructing it (I know it's useless, but it's just for the sake of simplicity).
The problem I'm having is that I'm getting the following warning:
IndexedSlices(IndexedSlices(indices=Tensor("gradient_tape/model_14/channel_roll_13/RaggedToTensor/boolean_mask_1/GatherV2:0", shape=(None,), dtype=int32), values=Tensor("gradient_tape/model_14/channel_roll_13/RaggedToTensor/boolean_mask/GatherV2:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradient_tape/model_14/channel_roll_13/RaggedToTensor/Shape:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory. "shape. This may consume a large amount of memory." % value)
Since I know the shape of the output Tensor, I would have thought TensorFlow would too. Is there any way I can explicitly specify the shape of the output tensor, since TensorFlow doesn't seem to deduce it on its own?
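One option worth trying is the shape argument of RaggedTensor.to_tensor, which fixes the static shape of the dense result. The sketch below assumes TF 2.x; the small inputs and mask tensors are made up for illustration, and whether this silences the warning in a given model may depend on the TensorFlow version.

```python
import tensorflow as tf

# Illustrative data; every row keeps exactly two elements on each side.
inputs = tf.constant([[1.0, 2.0, 3.0, 4.0],
                      [5.0, 6.0, 7.0, 8.0]])
mask = tf.constant([[True, False, True, False],
                    [False, True, False, True]])

out_left = tf.ragged.boolean_mask(inputs, mask)
out_right = tf.ragged.boolean_mask(inputs, tf.math.logical_not(mask))
merged = tf.concat([out_left, out_right], axis=-1)

# Pass the known shape so the dense tensor does not end up with unknown dimensions.
dense = merged.to_tensor(shape=inputs.shape.as_list())
```

tf.ensure_shape(dense, [2, 4]) is an alternative that also checks the shape at run time.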

tf.data.Dataset.reduce of SparseTensor elements

I have a tf.data.Dataset object d, where each element is an integer tf.sparse.SparseTensor, and I would like to sum them, returning a sparse tensor. One way I see is the following:
d.reduce(tf.sparse.SparseTensor(tf.zeros([0, 1], tf.int64),
tf.zeros([0], tf.int32),
dense_shape),
tf.sparse.add)
Problem:
How do I construct the zero for the reduce operation if I do not know the dense_shape ahead of time? I know all the sparse tensors in the dataset will have the same shape, but it is not statically known. It may depend on the data from which the sparse tensors are constructed, and setting it interactively in eager mode is not a viable option.

PyTorch: turning a [1,x] sized tensor into an [x] sized tensor

While trying to load in PyTorch 0.4.0 a model that was probably produced by PyTorch 0.3.1, I keep getting errors like this:
While copying the parameter named "conv1_7x7_s2_bn.bias", whose dimensions in the
model are torch.Size([64]) and whose dimensions in the checkpoint are torch.Size([1, 64]).
I thought that applying transpose to each tensor would work, but it still fails, because the dimension becomes [64, 1] rather than the [64] that I need.
How can I remove the redundant dimension and thus turn the 1-row matrix into a vector?
Note: When calling torch.flatten, I get:
AttributeError: module 'torch' has no attribute 'flatten'
Removing dimensions of size 1 is called "squeezing". NumPy does it, TensorFlow does it, and PyTorch does it.
So the correct command is:
torch.squeeze(tensor)
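As a small illustration (the tensor below is a made-up stand-in for the [1, 64] checkpoint entry), squeezing a specific dimension avoids accidentally dropping other size-1 dimensions:

```python
import torch

# Stand-in for a checkpoint tensor of size [1, 64].
checkpoint_bias = torch.zeros(1, 64)

# Squeeze only dimension 0; other dimensions are left untouched.
vector = torch.squeeze(checkpoint_bias, 0)

print(vector.shape)
```

Without the dimension argument, torch.squeeze removes every dimension of size 1, which matters for tensors such as [1, 1].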

shape of a sparse tensor without invoking run()

The shape attribute of a sparse tensor returns a Tensor object, which seems to be of no use for extracting the actual shape of the sparse tensor without resorting to the run function.
To clarify what I mean, first consider a sparse tensor:
a = tf.SparseTensor(indices=[[0, 0, 0], [1, 2, 1]], values=[1.0+2j, 2.0], shape=[3, 4, 2])
a.shape returns:
tf.Tensor 'SparseTensor_1/shape:0' shape=(3,) dtype=int64
This is not very useful.
Now, consider a dense tensor:
a = tf.constant(np.random.normal(0.0, 1.0, (4, 4)).astype(dtype=np.complex128))
a.get_shape() returns:
TensorShape([Dimension(4), Dimension(4)])
I can take this output and cast it to a list or tuple of integers without ever invoking run(). I cannot do the same for a sparse tensor unless I first convert it to a dense tensor (which is not yet implemented for complex sparse tensors) and then call get_shape() on the result. That is redundant, defeats the purpose of using a sparse tensor in the first place, and leads to an error down the road if the input sparse tensor is complex.
Is there a way to obtain the shape of a sparse tensor without invoking run() or converting it to a dense tensor first?
tf.SparseTensor is implemented as a triple of dense Tensors under the hood. The shape of a SparseTensor is just a Tensor; if you want to know its value, your best bet is to evaluate it using session.run:
print(sess.run(a.shape))
In general, TensorFlow does not promise to compute an exact shape even for dense tensors at graph construction time; shapes are best effort and may not even have a fixed value. So even for a dense Tensor you may have to evaluate the Tensor using run to get a precise shape.
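In more recent releases the situation is somewhat different. The sketch below assumes TF 2.x, where the constructor argument is named dense_shape and SparseTensor.shape returns a TensorShape. It uses real values rather than the complex values from the question, and it only works when the shape is statically known.

```python
import tensorflow as tf

a = tf.SparseTensor(
    indices=[[0, 0, 0], [1, 2, 1]],
    values=[1.0, 2.0],
    dense_shape=[3, 4, 2],
)

# TensorShape converted to a plain Python list, with no session involved.
static_shape = a.shape.as_list()

# Returns a NumPy array when the value can be determined statically, otherwise None.
constant_shape = tf.get_static_value(a.dense_shape)
```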