I need to slice a window of constant size for each batch element, but starting at different locations. For example, for windows of length two, I want to be able to do something like:
batch = tf.constant([[1, 2, 3],
                     [4, 5, 6]])
window_size = 2
window_starts = tf.constant([1, 0])  # The index from which to start slicing for each batch element.
slice_windows(batch, window_size, window_starts)
# this should return [[2, 3],
#                     [4, 5]]
I won’t know what the window_starts are beforehand (they come from data), so I can’t just enumerate all of the indices I need and use tf.gather_nd.
Furthermore, after doing computations on the windows, I then need to pad them back into place with 0s (so a different amount of padding for each batch element):
...computation on windows...
restore_window_positions(windows, window_starts, original_size=3)
# this should return [[0, 2, 3],
#                     [4, 5, 0]]
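For what it's worth, one way this could be implemented is to build the gather/scatter indices from window_starts at run time; the following is only a sketch of that idea (it assumes a 2-D batch and a TF version with batch_dims support in tf.gather):

import tensorflow as tf

def slice_windows(batch, window_size, window_starts):
    # Per-element column indices: window_starts[:, None] + [0, ..., window_size - 1].
    cols = window_starts[:, None] + tf.range(window_size)[None, :]   # [batch, window_size]
    return tf.gather(batch, cols, batch_dims=1)                      # [batch, window_size]

def restore_window_positions(windows, window_starts, original_size):
    batch_size = tf.shape(windows)[0]
    window_size = tf.shape(windows)[1]
    cols = window_starts[:, None] + tf.range(window_size)[None, :]
    rows = tf.broadcast_to(tf.range(batch_size)[:, None], tf.shape(cols))
    indices = tf.stack([rows, cols], axis=-1)                        # [batch, window_size, 2]
    return tf.scatter_nd(indices, windows,
                         tf.stack([batch_size, original_size]))

With the example above, slice_windows(batch, 2, window_starts) should produce [[2, 3], [4, 5]], and restore_window_positions(windows, window_starts, 3) scatters the windows back into place with zeros everywhere else.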
Related
I find slice-and-assign awkward in TF 2.0. Suppose array is a 2D array. In NumPy, I can do
array[:, [1, 3]] = a_num - array[:, [3, 1]]
But how can I achieve the same operation in TF 2.0?
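One possibility (a sketch of one workaround, with a made-up 2x4 array and value of a_num) is to express the column assignment with tf.tensor_scatter_nd_update:

import tensorflow as tf

array = tf.constant([[1., 2., 3., 4.],
                     [5., 6., 7., 8.]])
a_num = 10.0

# New values for columns 1 and 3, computed from columns 3 and 1.
new_vals = a_num - tf.gather(array, [3, 1], axis=1)          # shape [2, 2]

# (row, col) pairs addressing columns 1 and 3 of every row.
indices = tf.constant([[0, 1], [0, 3],
                       [1, 1], [1, 3]])

array = tf.tensor_scatter_nd_update(array, indices, tf.reshape(new_vals, [-1]))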
I was using Tensorflow 2.0 to build a super resolution model. During pre-processing, I wanted to crop both the low and high resolution images by a given patch size. In order to do so, I wanted to get the height and width of the low and high resolution images. But tf.shape(image) is returning None.
Is there a better approach?
Currently I am just resizing every image to a fixed size before using tf.shape, but since not all images have the same size, this is affecting the quality of the images. Looking forward to your suggestions.
Edited part:
Here are some parts of the code:
low_r = tf.io.decode_jpeg(lr_filename, channels=3)
low_r = tf.cast(low_r, dtype=tf.float32)
print(low_r.shape)
The print statement prints (None, None, 3)
What I wanted was to get the height and width, like (240, 360, 3).
I'm not sure if this is also your case, but in my TensorFlow (v2.4.0rc2), my_tensor.shape also returns TensorShape([None, None, None, None]). This is because the static TensorShape is determined when the graph is built (traced), not when it is executed.
Using tf.shape() (mentioned in your question, but not actually used in your code snippet) solves it for me.
> my_tensor.shape
TensorShape([None, None, None, None])
> tf.shape(my_tensor)
[10 512 512 8]
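Inside a tf.data pipeline, the dynamic values from tf.shape can then drive the cropping described in the question. A rough sketch (the patch size, filename handling and random-crop strategy here are only placeholders for illustration):

import tensorflow as tf

patch_size = 96  # hypothetical patch size

def random_patch(lr_filename):
    low_r = tf.io.decode_jpeg(tf.io.read_file(lr_filename), channels=3)
    low_r = tf.cast(low_r, dtype=tf.float32)
    shape = tf.shape(low_r)                   # dynamic [height, width, 3]
    y = tf.random.uniform([], 0, shape[0] - patch_size + 1, dtype=tf.int32)
    x = tf.random.uniform([], 0, shape[1] - patch_size + 1, dtype=tf.int32)
    return low_r[y:y + patch_size, x:x + patch_size, :]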
I'm unable to reproduce your issue, but this should give you a way to test your TensorFlow 2.0 install and compare with the results you're currently getting.
Create a tensor and check its shape:
import tensorflow as tf
t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
tf.shape(t) # [2, 2, 3]
Out[1]: <tf.Tensor: id=1, shape=(3,), dtype=int32, numpy=array([2, 2, 3])>
Next, check what the function returns when called:
tf_shape_var = tf.shape(t)
print(tf_shape_var)
Output:
tf.Tensor([2 2 3], shape=(3,), dtype=int32)
Finally, call it on an int and a string; both give back a valid (empty) shape:
tf.shape(1)
Out[10]: <tf.Tensor: id=12, shape=(0,), dtype=int32, numpy=array([], dtype=int32)>
tf.shape('asd')
Out[11]: <tf.Tensor: id=15, shape=(0,), dtype=int32, numpy=array([], dtype=int32)>
And the print statements:
print(tf.shape(1))
print(tf.shape('asd'))
Output:
tf.Tensor([], shape=(0,), dtype=int32)
tf.Tensor([], shape=(0,), dtype=int32)
Link for tf.shape() https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/shape
I have a model that extracts 512 features from an image (numbers between -1 and 1).
I converted this model to TFLite float format using the instructions here:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite
I ran inference on the same image with the original model and with the TFLite model.
I am getting different results for the feature vector. I was expecting very similar results, since I didn't use a quantized format, and from what I understand TF-Lite should only improve inference time and not affect the computed features.
My question: is this normal? Has anyone else encountered this?
I didn't find any topics on this anywhere.
Updated with code.
I have this network that I trained (many items removed, as I can't share the full network):
import tensorflow as tf
import tensorflow.contrib.slim as slim

feature_vector_size = 512  # 512, per the description above

placeholder = tf.placeholder(name='input', dtype=tf.float32, shape=[None, 128, 128, 1])

with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
                    activation_fn=tf.nn.relu, normalizer_fn=slim.batch_norm):
    net = tf.identity(placeholder)
    net = slim.conv2d(net, 32, [3, 3], scope='conv11')
    net = slim.separable_conv2d(net, 64, [3, 3], scope='conv12')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')   # 64x64
    net = slim.separable_conv2d(net, 128, [3, 3], scope='conv21')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')   # 32x32
    net = slim.separable_conv2d(net, 256, [3, 3], scope='conv31')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')   # 16x16
    net = slim.separable_conv2d(net, 512, [3, 3], scope='conv41')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')   # 8x8
    net = slim.separable_conv2d(net, 1024, [3, 3], scope='conv51')
    net = slim.avg_pool2d(net, [8, 8], scope='pool5')   # 1x1
    net = slim.dropout(net)
    net = slim.conv2d(net, feature_vector_size, [1, 1], activation_fn=None,
                      normalizer_fn=None, scope='features')
    embeddings = tf.nn.l2_normalize(net, 3, 1e-10, name='embeddings')
bazel-bin/tensorflow/contrib/lite/toco/toco --input_file=/tmp/network_512.pb \
    --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
    --output_file=/tmp/tffiles/network_512.tflite \
    --inference_type=FLOAT --input_type=FLOAT --input_arrays=input \
    --output_arrays=embeddings --input_shapes=1,128,128,1
I run network_512.pb using TensorFlow in Python, and network_512.tflite using the code from https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo, which I modified to load my network and run it.
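For reference, a quick Python-only comparison of the two runtimes on the same input could look roughly like the sketch below (it assumes the tensor names from the graph above, and uses tf.lite.Interpreter, which lived under tf.contrib.lite at the time):

import numpy as np
import tensorflow as tf

# One arbitrary input, just to compare the two runtimes on identical data.
image = np.random.rand(1, 128, 128, 1).astype(np.float32)

# Run the frozen graph.
graph_def = tf.GraphDef()
with open('/tmp/network_512.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
with tf.Session() as sess:
    tf.import_graph_def(graph_def, name='')
    tf_features = sess.run('embeddings:0', feed_dict={'input:0': image})

# Run the converted TFLite model.
interpreter = tf.lite.Interpreter(model_path='/tmp/tffiles/network_512.tflite')
interpreter.allocate_tensors()
interpreter.set_tensor(interpreter.get_input_details()[0]['index'], image)
interpreter.invoke()
tflite_features = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])

print(np.max(np.abs(tf_features.reshape(-1) - tflite_features.reshape(-1))))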
An update on what I have found: the test I did used the demo app TensorFlow provides, modified to use my custom model and extract the features, and that is where I noticed the difference in the feature values.
Once I compiled the TF-Lite C++ library manually on the latest Android and ran it with the same flow I have been using (the TF C API until now), I got almost the same results for the features.
I didn't have time to investigate where the difference comes from, but I am happy now.
I'm trying to use the dynamic_rnn function in Tensorflow to speed up training. After doing some reading, my understanding is that one way to speed up training is to explicitly pass a value to the sequence_length parameter in this function. After a bit more reading, and finding this SO explanation, it seems like what I need to pass is a vector (maybe defined by a tf.placeholder) that contains the length of each sequence within a batch.
Here's where I'm confused: in order to take advantage of this, should I pad each of my batches to the longest-length sequence within the batch instead of the longest-length sequence in the training set? How does Tensorflow handle the remaining zeros/pad-tokens in any of the shorter sequences? Also, is the main advantage here really speed, or just extra assurance that we're masking pad-tokens during training? Any help/context would be appreciated.
should I pad each of my batches to the longest-length sequence within the batch instead of the longest-length sequence in the training set?
The sequences within a batch must be aligned, i.e., have to have the same length. So the general answer to your question is "yes". But different batches don't have to be the same length, so you can stratify your input sequences into groups of roughly the same size and pad each group accordingly. This technique is called bucketing, and you can read about it in this tutorial.
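For illustration, with the tf.data API bucketing can be expressed roughly like this (the toy dataset, boundaries and batch sizes are made up; older releases had a similar tf.contrib.training.bucket_by_sequence_length):

import tensorflow as tf

# Toy dataset of variable-length integer sequences.
dataset = tf.data.Dataset.from_generator(
    lambda: ([1, 2, 3], [4, 5], [6, 7, 8, 9], [1]),
    output_types=tf.int32, output_shapes=[None])

bucketed = dataset.apply(
    tf.data.experimental.bucket_by_sequence_length(
        element_length_func=lambda seq: tf.shape(seq)[0],
        bucket_boundaries=[3, 5],       # buckets: len < 3, 3 <= len < 5, len >= 5
        bucket_batch_sizes=[2, 2, 2],   # one batch size per bucket
        padded_shapes=[None]))          # pad to the longest sequence in each batch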
How does Tensorflow handle the remaining zeros/pad-tokens in any of the shorter sequences?
Pretty much intuitive. tf.nn.dynamic_rnn returns two tensors: output and states. Suppose the actual sequence length is t and the padded sequence length is T.
Then output will contain zeros for the time steps beyond t, and states will contain the state of the t-th cell, ignoring the states of the trailing padded cells.
Here's an example:
import numpy as np
import tensorflow as tf

n_steps = 2
n_inputs = 3
n_neurons = 5

X = tf.placeholder(dtype=tf.float32, shape=[None, n_steps, n_inputs])
seq_length = tf.placeholder(tf.int32, [None])

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X,
                                    sequence_length=seq_length,
                                    dtype=tf.float32)

X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 0
    [[3, 4, 5], [0, 0, 0]],  # instance 1
    [[6, 7, 8], [6, 5, 4]],  # instance 2
])
seq_length_batch = np.array([2, 1, 2])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs_val, states_val = sess.run([outputs, states], feed_dict={
        X: X_batch,
        seq_length: seq_length_batch,
    })
    print(outputs_val)
    print()
    print(states_val)
Note that instance 1 is padded, so outputs_val[1,1] is a zero vector and states_val[1] == outputs_val[1,0]:
[[[ 0.76686853  0.8707901  -0.79509073  0.7430128   0.63775384]
  [ 1.          0.7427926  -0.9452815  -0.93113345 -0.94975543]]

 [[ 0.9998851   0.98436266 -0.9620067   0.61259484  0.43135557]
  [ 0.          0.          0.          0.          0.        ]]

 [[ 0.99999994  0.9982034  -0.9934515   0.43735617  0.1671598 ]
  [ 0.99999785 -0.5612586  -0.57177305 -0.9255771  -0.83750355]]]

[[ 1.          0.7427926  -0.9452815  -0.93113345 -0.94975543]
 [ 0.9998851   0.98436266 -0.9620067   0.61259484  0.43135557]
 [ 0.99999785 -0.5612586  -0.57177305 -0.9255771  -0.83750355]]
Also, is the main advantage here really speed, or just extra assurance that we're masking pad-tokens during training?
Of course, batch processing is more efficient than feeding the sequences one by one. But the main advantage of specifying the length is that you get the correct final state out of the RNN, i.e., the padded items don't affect the result tensor. You will get exactly the same result (and the same speed) if you don't set the length but select the right states manually.
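To make that last point concrete, "selecting the right states manually" could look roughly like this, in the same TF1 style as the example above (assuming outputs of shape [batch, T, n_neurons] and the seq_length vector):

# Index of the last valid time step for each sequence.
batch_size = tf.shape(outputs)[0]
last_step = tf.stack([tf.range(batch_size), seq_length - 1], axis=1)  # [batch, 2]
last_outputs = tf.gather_nd(outputs, last_step)                       # [batch, n_neurons]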
I'm trying to use the tf.image_summary function in TensorFlow to visualize a convolutional layer's filters. The filter is defined as tf.Variable(tf.constant(0.1, shape=[5, 5, 16, 32])).
But here, since I only want to see the final filters, I want to find a way to get a filter of size [5, 5, 32] by just taking the first index of the dimension that was 16. If I use [:, :, 0, :] then I assume I would get a [5, 5, 1, 32] filter instead of the [5, 5, 32] I want.
What should I do?
tf.image_summary takes a batch of images as input, but it expects 1, 3, or 4 color channels.
So you'd have to pass something like this to tf.image_summary:
# Move the 16 input channels to the batch dimension ([16, 5, 5, 32]), then take 3 output filters at a time as image channels.
filters = tf.transpose(filter, [2, 0, 1, 3])
for i in range(int(filter.get_shape()[3]) // 3):
    tf.image_summary('filters_%d' % i, filters[:, :, :, 3 * i:3 * i + 3])