I have a tensor T of size [None, 4], and I want to slice it along the second dimension to give me a size [None] tensor. In NumPy this would be T[:, DIMENSION]. Is there a fast way to do it with TensorFlow commands?
You can use the same operation as numpy.
a = tf.constant([[1,2,3,4],[5,6,7,8],[7,8,9,0]])
a.shape
the shape of a is
Out[]:TensorShape([Dimension(3), Dimension(4)])
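Using the slicing operation, select column index 2 (the third column) of every row. In TF 1.x, eval() needs an active default session, e.g. tf.InteractiveSession():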
a[:, 2].eval()
and the output is
array([3, 7, 9])
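For comparison, here is a minimal NumPy sketch of the same indexing, only meant to show that the slice drops the second axis and leaves one value per row. The array is the one from the answer above; the variable name col is just illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [7, 8, 9, 0]])

# All rows, column index 2: the result has one entry per row.
col = a[:, 2]
print(col.shape)     # (3,)
print(col.tolist())  # [3, 7, 9]
```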
I'm trying to build a model that takes a list of sparse tensors as input (the list length equals the batch size).
I use sparse tensors because I have to pass an adjacency matrix to my GNN model, and it is very sparse (~99%).
I'm familiar with PyTorch, where it is very easy to feed a sparse tensor into the network.
However, I found that in TensorFlow I have to use tf.data.Dataset or keras.utils.Sequence to build the dataset.
But those methods throw an error when I use a list of sparse tensors as input.
For example, the code below raises a TypeError:
import tensorflow as tf
tf.data.Dataset.from_tensor_slices(sparse_lists)
TypeError: Neither a SparseTensor nor SparseTensorValue:
[<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2e25b5c0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c22ada0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c22a400>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed240>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed390>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed470>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed5c0>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed710>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed828>,
<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fbf2c1ed940>].
I know that it will work if I concatenate all the sparse tensors in the list into one huge tensor.
However, that is not an option for me because I have to index into the sparse tensors later.
(If I concatenate the 2D sparse tensors into a 3D sparse tensor, I cannot use indexing like the line below.)
Some3DSparseTensor[:10]
It would also take more time, because I would have to slice the 3D tensor for matrix multiplication with other dense networks.
Furthermore, I know it would work if I built the sparse tensor from indices and values for every batch, but that would take too much time per batch.
As a result, I want a tf.data.Dataset that can generate batches from a list of sparse tensors, because of the indexing and time issues.
Can anybody help me? :)
Long story short,
What I have: a list of sparse tensors (e.g., a list of length 1000000)
What I need to do: batch the list of sparse tensors (e.g., a list of length 1024, not a sparse concat)
If the SparseTensors all have the same dense_shape, you can build a single SparseTensor instead of a list and pass it to from_tensor_slices.
For example, the following code produces separate SparseTensors from one large SparseTensor s by splitting it along the first dimension:
s = tf.sparse.SparseTensor(
    indices=tf.constant([[0, 0, 0], [1, 0, 0], [1, 0, 1], [2, 1, 1]], dtype=tf.int64),
    values=tf.range(4, dtype=tf.float32),
    dense_shape=(3, 2, 2))
d = tf.data.Dataset.from_tensor_slices(s)
for t in d:
    print(t)
>>> SparseTensor(indices=tf.Tensor([[0 0]], shape=(1, 2), dtype=int64), values=tf.Tensor([0.], shape=(1,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
SparseTensor(indices=tf.Tensor(
[[0 0]
[0 1]], shape=(2, 2), dtype=int64), values=tf.Tensor([1. 2.], shape=(2,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
SparseTensor(indices=tf.Tensor([[1 1]], shape=(1, 2), dtype=int64), values=tf.Tensor([3.], shape=(1,), dtype=float32), dense_shape=tf.Tensor([2 2], shape=(2,), dtype=int64))
To use from_tensor_slices in this way, you need a function to convert the list sparse_lists to a large SparseTensor s (reported below).
To recap, you can do
import tensorflow as tf

def sparse_list_to_sparse_tensor(sparse_lists):
    # Assumes every SparseTensor in the list has the same dense_shape.
    n = len(sparse_lists)
    shape = sparse_lists[0].dense_shape
    out_shape = (n, *shape)
    out_values = tf.concat([s.values for s in sparse_lists], axis=0)
    out_indices = []
    for i, s in enumerate(sparse_lists):
        # Prepend the list position i as a new leading index column.
        element_idx = tf.cast(tf.fill((s.indices.shape[0], 1), i), dtype=tf.int64)
        out_indices.append(tf.concat([element_idx, s.indices], axis=1))
    out_indices = tf.concat(out_indices, axis=0)
    return tf.sparse.SparseTensor(out_indices, out_values, out_shape)

tf.data.Dataset.from_tensor_slices(sparse_list_to_sparse_tensor(sparse_lists))
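To make the index bookkeeping easier to follow, here is a small NumPy sketch of the same idea on plain COO data: each matrix's indices get an extra leading column holding its position in the list. The helper name stack_coo and the list-of-lists inputs are assumptions made for illustration; this is not TensorFlow API.

```python
import numpy as np

def stack_coo(indices_list, values_list):
    """Stack per-matrix COO data into one COO tensor with a leading batch axis."""
    out_indices, out_values = [], []
    for i, (idx, vals) in enumerate(zip(indices_list, values_list)):
        idx = np.asarray(idx, dtype=np.int64).reshape(-1, 2)
        # Column filled with the list position i, one entry per nonzero.
        batch_col = np.full((idx.shape[0], 1), i, dtype=np.int64)
        out_indices.append(np.concatenate([batch_col, idx], axis=1))
        out_values.append(np.asarray(vals, dtype=np.float32))
    return np.concatenate(out_indices, axis=0), np.concatenate(out_values, axis=0)

# Three 2 x 2 matrices holding the same nonzeros as the 3 x 2 x 2 example tensor s.
indices_list = [[[0, 0]], [[0, 0], [0, 1]], [[1, 1]]]
values_list = [[0.0], [1.0, 2.0], [3.0]]

indices, values = stack_coo(indices_list, values_list)
print(indices.tolist())  # [[0, 0, 0], [1, 0, 0], [1, 0, 1], [2, 1, 1]]
print(values.tolist())   # [0.0, 1.0, 2.0, 3.0]
```

The stacked indices and values match the ones used to build s by hand earlier in this answer.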
An alternative solution calls from_tensor_slices on every sparse tensor (after adding a dummy batch dimension) to create many single-element datasets, which can then be concatenated into one dataset.
dataset = None
for sparse_tensor in sparse_lists:
    # Add a leading dimension of size 1 so from_tensor_slices yields one element.
    batched_sparse_tensor = tf.sparse.expand_dims(sparse_tensor, axis=0)
    element_dataset = tf.data.Dataset.from_tensor_slices(batched_sparse_tensor)
    if dataset is None:
        dataset = element_dataset
    else:
        dataset = dataset.concatenate(element_dataset)
Notice that using this solution the sparse tensors can have different dense_shapes.
I want to multiply two tensors, one sparse and the other dense. The sparse one is 3D and the dense one 2D. I cannot convert the sparse tensor to a dense tensor (i.e., avoid using tf.sparse.to_dense(...)).
My multiplication is given by the following law:
C[i,j] = \sum_k A[i,k]*B[i,k,j]
where C = A*B and A and B are the dense and sparse tensors described above.
An example of execution in TF would be as follows:
# Example
# inputs
A = tf.constant([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]], tf.float32)
B = tf.sparse.SparseTensor(values=tf.constant([1,1,1,1,2,1,1,1,1,1,-1], tf.float32),indices=[[0,0,1],[0,1,2],[0,2,0],[1,0,0],[1,1,1],[1,1,2],[1,2,2],[2,0,2],[2,1,1],[2,2,1],[2,2,2]], dense_shape=[3,3,3])
# output
C = tf.constant([[3, 1, 2],
                 [4, 10, 11],
                 [0, 17, -2]], tf.float32)
tf.einsum does not support sparse tensors.
I have a version where I slice the 3D sparse tensor B into a collection of 2D sparse matrices, B[0,:,:], B[1,:,:], B[2,:,:], ..., and multiply each row of the dense matrix A, A[i,:], by the corresponding slice B[i,:,:] using tf.sparse.sparse_dense_matmul(A[i,:],B[i,:,:]) (with the reshapes needed after slicing so that both arguments are 2D tensors). Then I stack all the resulting vectors to assemble the matrix C. This procedure is slow and breaks the tensorial structure of B. I want to perform the same operation using ONLY TensorFlow functions (avoiding for loops that slice and break up the sparse tensor and then reassemble the result by stacking). It should then work in Keras as a layer of a neural network ([A,B] is the batched list of inputs, C = A*B is the batched output of the layer). Breaking up the tensors to compute the multiplications is crazy for training in the compiled graph!
Any ideas? Does there exist any tf.sparse.einsum-like function for sparse tensors?
If I converted B to a dense tensor, it would be straightforward with tf.einsum('ik,ikj->ij', A, B). However, I cannot afford to lose the sparsity of B.
Thank you. Regards,
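One loop-free way to think about this contraction is to work directly on the COO representation of B: every nonzero B[i,k,j] contributes A[i,k]*B[i,k,j] to C[i,j], so the whole product is a gather followed by a scatter-add. The sketch below uses NumPy as a stand-in so it can be checked against a dense einsum. In TensorFlow, tf.gather_nd (on B.indices[:, :2]) and tf.scatter_nd (which sums duplicate indices) look like the natural counterparts, but that mapping is an assumption and is not tested here.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=np.float32)

# COO representation of B (same indices and values as in the question).
B_indices = np.array([[0, 0, 1], [0, 1, 2], [0, 2, 0],
                      [1, 0, 0], [1, 1, 1], [1, 1, 2], [1, 2, 2],
                      [2, 0, 2], [2, 1, 1], [2, 2, 1], [2, 2, 2]])
B_values = np.array([1, 1, 1, 1, 2, 1, 1, 1, 1, 1, -1], dtype=np.float32)
B_shape = (3, 3, 3)

i, k, j = B_indices.T

# Gather: one A[i, k] per nonzero of B, multiplied by that nonzero.
contrib = A[i, k] * B_values

# Scatter-add: accumulate every contribution into C[i, j].
C = np.zeros((B_shape[0], B_shape[2]), dtype=np.float32)
np.add.at(C, (i, j), contrib)

# Sanity check against the dense contraction (only feasible for small examples).
B_dense = np.zeros(B_shape, dtype=np.float32)
B_dense[i, k, j] = B_values
assert np.array_equal(C, np.einsum('ik,ikj->ij', A, B_dense))
print(C)
```

The work is proportional to the number of nonzeros in B, and B is never densified except in the final check.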
I'm trying to use the dynamic_rnn function in Tensorflow to speed up training. After doing some reading, my understanding is that one way to speed up training is to explicitly pass a value to the sequence_length parameter in this function. After a bit more reading, and finding this SO explanation, it seems like what I need to pass is a vector (maybe defined by a tf.placeholder) that contains the length of each sequence within a batch.
Here's where I'm confused: in order to take advantage of this, should I pad each of my batches to the longest-length sequence within the batch instead of the longest-length sequence in the training set? How does Tensorflow handle the remaining zeros/pad-tokens in any of the shorter sequences? Also, is the main advantage here really speed, or just extra assurance that we're masking pad-tokens during training? Any help/context would be appreciated.
should I pad each of my batches to the longest-length sequence within the batch instead of the longest-length sequence in the training set?
The sequences within a batch must be aligned, i.e., they must have the same length. So the general answer to your question is "yes". But different batches don't have to be of the same length, so you can stratify input sequences into groups that have roughly the same size and pad them accordingly. This technique is called bucketing and you can read about it in this tutorial.
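A minimal pure-Python sketch of bucketing, assuming sequences are plain lists and that grouping by fixed-width length ranges is good enough. The function name bucket_and_pad and the parameters bucket_width and pad_value are hypothetical, not part of any TensorFlow API.

```python
from collections import defaultdict

def bucket_and_pad(sequences, bucket_width=5, pad_value=0):
    """Group sequences by length range, then pad each group to its own max length."""
    buckets = defaultdict(list)
    for seq in sequences:
        # Lengths 1..5 share bucket 0, lengths 6..10 share bucket 1, and so on.
        buckets[(len(seq) - 1) // bucket_width].append(seq)
    batches = []
    for key in sorted(buckets):
        group = buckets[key]
        max_len = max(len(s) for s in group)
        padded = [list(s) + [pad_value] * (max_len - len(s)) for s in group]
        lengths = [len(s) for s in group]
        batches.append((padded, lengths))
    return batches

sequences = [[1, 2], [3, 4, 5], [6],
             [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18, 19]]
batches = bucket_and_pad(sequences)
for padded, lengths in batches:
    print(lengths, padded)
```

Each batch comes with its true lengths, which is what would be fed to sequence_length.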
How does Tensorflow handle the remaining zeros/pad-tokens in any of the shorter sequences?
Pretty much intuitively. tf.nn.dynamic_rnn returns two tensors: output and states. Suppose the actual sequence length is t and the padded sequence length is T.
Then the output will contain zeros for every time step after the t-th one, and states will contain the cell state after step t, ignoring the trailing padded steps.
Here's an example:
import numpy as np
import tensorflow as tf

n_steps = 2
n_inputs = 3
n_neurons = 5

X = tf.placeholder(dtype=tf.float32, shape=[None, n_steps, n_inputs])
seq_length = tf.placeholder(tf.int32, [None])

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X,
                                    sequence_length=seq_length, dtype=tf.float32)

X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 0
    [[3, 4, 5], [0, 0, 0]],  # instance 1
    [[6, 7, 8], [6, 5, 4]],  # instance 2
])
seq_length_batch = np.array([2, 1, 2])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs_val, states_val = sess.run([outputs, states], feed_dict={
        X: X_batch,
        seq_length: seq_length_batch
    })
    print(outputs_val)
    print()
    print(states_val)
Note that instance 1 is padded, so outputs_val[1,1] is a zero vector and states_val[1] == outputs_val[1,0]:
[[[ 0.76686853 0.8707901 -0.79509073 0.7430128 0.63775384]
[ 1. 0.7427926 -0.9452815 -0.93113345 -0.94975543]]
[[ 0.9998851 0.98436266 -0.9620067 0.61259484 0.43135557]
[ 0. 0. 0. 0. 0. ]]
[[ 0.99999994 0.9982034 -0.9934515 0.43735617 0.1671598 ]
[ 0.99999785 -0.5612586 -0.57177305 -0.9255771 -0.83750355]]]
[[ 1. 0.7427926 -0.9452815 -0.93113345 -0.94975543]
[ 0.9998851 0.98436266 -0.9620067 0.61259484 0.43135557]
[ 0.99999785 -0.5612586 -0.57177305 -0.9255771 -0.83750355]]
Also, is the main advantage here really speed, or just extra assurance that we're masking pad-tokens during training?
Of course, batch processing is more efficient than feeding the sequences one by one. But the main advantage of specifying the length is that you get a reasonable state out of the RNN, i.e., padded items don't affect the result tensor. You will get exactly the same result (and the same speed) if you don't set the length but select the right states manually.
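"Selecting the right states manually" could look roughly like the NumPy sketch below. It assumes a cell whose state equals its output (as with BasicRNNCell) and uses made-up output values; the array names are illustrative only.

```python
import numpy as np

# Hypothetical RNN outputs with shape [batch, time, units]; the values are made up.
outputs = np.arange(2 * 3 * 2, dtype=np.float32).reshape(2, 3, 2)
seq_length = np.array([3, 1])

# Last valid output per instance sits at time index seq_length - 1.
batch_idx = np.arange(outputs.shape[0])
last_valid = outputs[batch_idx, seq_length - 1]

# Zero out the padded steps, mirroring the zeros dynamic_rnn writes to its outputs.
mask = np.arange(outputs.shape[1])[None, :] < seq_length[:, None]
masked = outputs * mask[:, :, None]

print(last_valid)
print(masked)
```

Here instance 1 has length 1, so its selected state is its first output and its later steps are zeroed.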
Does anyone know how to use map_fn or any other TensorFlow function to do a computation on every combination of two input tensors?
So what I want is something like this:
Given two arrays ([1,2] and [4,5]), I want as a result a matrix with the output of the computation (e.g. add) on every possible combination of elements from the two arrays. So the result would be:
[[5,6],
[6,7]]
I used map_fn, but it only pairs the elements index-wise:
[[5]
[7]]
Does anyone have an idea how to implement this?
Thanks
You can add new unit dimensions to each Tensor, then rely on broadcasting addition:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()
first = tf.constant([1, 2])
second = tf.constant([4, 5])
print(first[None, :] + second[:, None])
Prints:
tf.Tensor(
[[5 6]
[6 7]], shape=(2, 2), dtype=int32)
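The same trick should carry over to any elementwise operation that broadcasts, not just addition. A small NumPy sketch of the shapes involved (NumPy is used here only because its broadcasting rules match; the names row, col, pair_sum and pair_prod are illustrative):

```python
import numpy as np

first = np.array([1, 2])
second = np.array([4, 5])

row = first[None, :]    # shape (1, 2)
col = second[:, None]   # shape (2, 1)

# Broadcasting expands both operands to shape (2, 2):
# pair_sum[i, j] = second[i] + first[j]
pair_sum = row + col
# Same pairing with a different elementwise operation.
pair_prod = row * col

print(pair_sum.tolist())   # [[5, 6], [6, 7]]
print(pair_prod.tolist())  # [[4, 8], [5, 10]]
```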
While reading a TensorFlow segmentation implementation, I am trying to figure out what the following code is meant to do.
A tensor x is defined as follows: self.x = tf.placeholder("float", shape=[None, None, None, n_label]).
Later, a function uses a transformed tensor x1, defined as x1 = tf.reshape(self.x, [-1, n_label]).
My understanding is that tf.reshape(self.x, [-1, n_label]) should reshape the x tensor into a 1-D vector.
But I am confused about x being defined with shape=[None, None, None, n_label] and x1 being transformed this way. What should x1 really look like, and why do this?
None means we don't want to specify that dimension when creating the graph, but rather determine it at runtime. For instance, this is useful when you want to use different minibatch sizes for training and inference.
Reshape with -1 for a dimension means 'infer this dimension so that the total size of the tensor is preserved'. For example, tf.reshape(x, [-1, 2]) for x of shape [3, 4, 2] would produce a new tensor of shape [12, 2]. So x1 is not a 1-D vector but a 2-D tensor of shape [N, n_label], where N is the product of the first three dimensions of x.
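A short NumPy sketch of the same reshape, assuming concrete sizes 3, 4, 5 for the three unknown dimensions and n_label = 2 (all made-up values). NumPy's reshape treats -1 the same way, so it shows what x1 would look like.

```python
import numpy as np

n_label = 2
# Stand-in for x with concrete sizes in place of the three None dimensions.
x = np.arange(3 * 4 * 5 * n_label).reshape(3, 4, 5, n_label)

# -1 is inferred as 3 * 4 * 5 = 60, so every row of x1 is one label vector.
x1 = x.reshape(-1, n_label)
print(x1.shape)  # (60, 2)

# Rows follow row-major order over the first three axes.
b, h, w = 1, 2, 3
r = (b * 4 + h) * 5 + w
assert np.array_equal(x1[r], x[b, h, w])
```

Flattening to [N, n_label] like this is a common step before a per-position loss, since each row can then be treated as an independent example.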