Extract certain dimensions from a tensor - tensorflow

I'm trying to use the tf.image_summary function in TensorFlow to visualize a convolutional layer's filter. The filter is defined as tf.Variable(tf.constant(0.1, shape=[5, 5, 16, 32])).
But here, since I only want to see the final filters, I want to find a way to get a filter of size [5, 5, 32] by just taking the first index of the dimension that was 16. If I use [:, :, 0, :] then I assume I would get a [5, 5, 1, 32] filter instead of the [5, 5, 32] I want.
What should I do?

So tf.image_summary takes a batch of images as input, and it expects the number of color channels to be 1, 3, or 4.
So you'd have to pass tf.image_summary something like this:
for i in range(int(filter.get_shape()[3]) // 3):  # shape index 3 (not 4) holds the 32 filters
    tf.image_summary('filter_%d' % i, filter[:, :, :, 3*i:3*i+3])  # image_summary needs a name tag
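As for the [5, 5, 32] tensor the question actually asks for: in current TensorFlow, indexing an axis with a plain integer drops that axis, so [:, :, 0, :] should already give [5, 5, 32] rather than [5, 5, 1, 32]; tf.squeeze is an equivalent route if your version keeps the axis. A minimal sketch:

final_filters = filter[:, :, 0, :]  # integer index drops the size-16 axis -> [5, 5, 32]
final_filters = tf.squeeze(filter[:, :, 0:1, :], [2])  # equivalent: slice keeps axis 2 as size 1, then squeeze it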

Related

Why does TensorFlow seq2seq pad all sequences to the same fixed length?

I am trying to implement a Chatbot using Tensorflow and its implementation of seq2seq.
After reading different tutorials (Chatbots with Seq2Seq, Neural Machine Translation (seq2seq) Tutorial, Unsupervised Deep Learning for Vertical Conversational Chatbots), and the original paper Sequence to Sequence Learning with Neural Networks, I could not find an explanation as to why the Tensorflow seq2seq implementation pads all sequences (both input and output) to the same fixed length.
Example:
Input data consists of sequences of integers:
x = [[5, 7, 8], [6, 3], [3], [1]]
RNNs need a different layout: sequences shorter than the longest one are padded with zeros at the end, and the array is transposed so that each row holds one time step across all sequences. This layout is called time-major.
x is now array([[5, 6, 3, 1],
                [7, 3, 0, 0],
                [8, 0, 0, 0]])
Why is this padding required?
Source of this tutorial.
If I am missing something, please let me know.
You need to pad the sequence (with some id, in your case 0) to the maximum sequence length. The reason you want to do this is so that sequences can fit in an array (tensor) object and be processed in the same step.
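A minimal NumPy sketch of the padding and time-major transpose from the example above:

import numpy as np

x = [[5, 7, 8], [6, 3], [3], [1]]
max_len = max(len(seq) for seq in x)
# Pad every sequence with zeros up to the longest length...
padded = np.array([seq + [0] * (max_len - len(seq)) for seq in x])
# ...then transpose so each row is one time step across all sequences.
print(padded.T)
# [[5 6 3 1]
#  [7 3 0 0]
#  [8 0 0 0]]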

TensorFlow – Slice at different position for each batch element

I need to slice a window of constant size for each batch element, but starting at different locations. For example, for windows of length two, I want to be able to do something like:
batch = tf.constant([[1, 2, 3],
                     [4, 5, 6]])
window_size = 2
window_starts = tf.constant([1, 0])  # The index from which to start slicing for each batch element.
slice_windows(batch, window_size, window_starts)
# this should return [[2, 3],
#                     [4, 5]]
I won’t know what the window_starts are beforehand (they come from data), so I can’t just enumerate all of the indices I need and use tf.gather_nd.
Furthermore, after doing computations on the windows, I then need to pad them back into place with 0s (so a different amount of padding for each batch element):
...computation on windows...
restore_window_positions(windows, window_starts, original_size=3)
# this should return [[0, 2, 3],
#                     [4, 5, 0]]
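One possible approach, sketched under the assumption that batch is 2-D and every window stays in bounds: the tf.gather_nd indices do not have to be enumerated by hand, since they can be built as tensors from window_starts, and tf.scatter_nd can put the windows back with zeros elsewhere. The helpers below (_window_indices, slice_windows, restore_window_positions) are hypothetical, not library functions.

import tensorflow as tf

def _window_indices(window_size, window_starts):
    # Row r addresses columns window_starts[r] .. window_starts[r] + window_size - 1.
    cols = tf.expand_dims(window_starts, 1) + tf.range(window_size)  # [batch, window]
    rows = tf.expand_dims(tf.range(tf.shape(window_starts)[0]), 1) + tf.zeros_like(cols)
    return tf.stack([rows, cols], axis=2)  # [batch, window, 2]

def slice_windows(batch, window_size, window_starts):
    return tf.gather_nd(batch, _window_indices(window_size, window_starts))

def restore_window_positions(windows, window_starts, original_size):
    # Scatter each window back to its original columns; all other entries are 0.
    out_shape = tf.stack([tf.shape(windows)[0], original_size])
    return tf.scatter_nd(_window_indices(tf.shape(windows)[1], window_starts),
                         windows, out_shape)

batch = tf.constant([[1, 2, 3],
                     [4, 5, 6]])
starts = tf.constant([1, 0])
windows = slice_windows(batch, 2, starts)                # [[2, 3], [4, 5]]
restored = restore_window_positions(windows, starts, 3)  # [[0, 2, 3], [4, 5, 0]]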

Deconvolutions/Transpose_Convolutions with tensorflow

I am attempting to use tf.nn.conv3d_transpose; however, I am getting an error indicating that my filter and output shapes are not compatible.
I have a tensor of size [1,16,16,4,192]
I am attempting to use a filter of [1,1,1,192,192]
I believe that the output shape would be [1,16,16,4,192]
I am using "same" padding and a stride of 1.
Eventually, I want to have an output shape of [1,32,32,7,"does not matter"], but I am attempting to get a simple case to work first.
Since these tensors are compatible in a regular convolution, I believed that the opposite, a deconvolution, would also be possible.
Why is it not possible to perform a deconvolution on these tensors? Could I get an example of a valid filter size and output shape for a deconvolution on a tensor of shape [1,16,16,4,192]?
Thank you.
Yes, the output shape will be [1,16,16,4,192].
Here is a simple example showing that the dimensions are compatible:
import tensorflow as tf

# Input: [batch, depth, height, width, in_channels].
i = tf.Variable(tf.constant(1., shape=[1, 16, 16, 4, 192]))
# Filter for conv3d_transpose: [depth, height, width, output_channels, in_channels].
w = tf.Variable(tf.constant(1., shape=[1, 1, 1, 192, 192]))
# SAME padding (the default) with stride 1 keeps the spatial shape unchanged.
o = tf.nn.conv3d_transpose(i, w, [1, 16, 16, 4, 192], strides=[1, 1, 1, 1, 1])
print(o.get_shape())  # (1, 16, 16, 4, 192)
There must be some problem in your implementation other than the dimensions.
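Continuing the snippet above (reusing i), here is a hedged sketch toward the eventual [1,32,32,7,"does not matter"] shape; the [3, 3, 3, 64, 192] filter is an assumed choice, with 64 standing in for the "does not matter" channel count:

# Assumed filter: [depth, height, width, output_channels, in_channels].
w2 = tf.Variable(tf.constant(1., shape=[3, 3, 3, 64, 192]))
# With SAME padding, each output dim only needs ceil(out / stride) == in,
# so the odd depth 7 is valid with stride 2 (ceil(7 / 2) == 4 matches the input depth).
o2 = tf.nn.conv3d_transpose(i, w2, [1, 32, 32, 7, 64], strides=[1, 2, 2, 2, 1])
print(o2.get_shape())  # (1, 32, 32, 7, 64)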

Convolution-Deconvolution pair gives slightly different dimensionality

I am using a convolution layer followed by a deconvolution layer like so:
slim.conv2d(num_outputs=1, kernel_size=[21, 11], stride=[2, 2], padding="SAME", rate=1)
slim.conv2d_transpose(num_outputs=1, kernel_size=[21, 11], stride=[2, 2], padding="SAME")
My idea is to make the initial image smaller, then bring it back to its original size with the deconvolution. I am actually using the tf.slim functions, so the calls above are written with their argument names.
When I look at the input and output, I have a small difference:
Input shape : (16, 161, 511, 1)
Output shape: (16, 162, 512, 1)
I think it could be due to my stride size or kernel size. I've tried multiple values but none seem to reproduce the original dimensions.
A popular approach is to pad the input image so that the output after the convolution and deconvolution is the same size as the padded input, and then to crop the output back to the original unpadded size.
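For the concrete numbers above, the mismatch is just ceiling arithmetic: with stride 2 and SAME padding the convolution produces ceil(d / 2) along each spatial dimension, and the transpose (whose SAME output size slim infers as input times stride) doubles it, so odd sizes come back one larger. A quick check:

for d in (161, 511):
    down = (d + 1) // 2  # conv2d, stride 2, SAME: ceil(d / 2) -> 161 -> 81, 511 -> 256
    up = down * 2        # conv2d_transpose, stride 2:           81 -> 162, 256 -> 512
    print('%d -> %d -> %d' % (d, down, up))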

How is the input tensor for TensorFlow's tf.nn.dynamic_rnn operator structured?

I am trying to write a language model using word embeddings and recurrent neural networks in TensorFlow 0.9.0 using the tf.nn.dynamic_rnn graph operation, but I don't understand how the input tensor is structured.
Let's say I have a corpus of n words. I embed each word in a vector of length e, and I want my RNN to unroll to t time steps. Assuming I use the default time_major = False parameter, what shape would my input tensor [batch_size, max_time, input_size] have?
Maybe a specific tiny example will make this question clearer. Say I have a corpus consisting of n=8 words that looks like this.
1, 2, 3, 3, 2, 1, 1, 2
Say I embed it in a vector of size e=3 with the embeddings 1 -> [10, 10, 10], 2 -> [20, 20, 20], and 3 -> [30, 30, 30], what would my input tensor look like?
I've read the TensorFlow Recurrent Neural Network tutorial, but that doesn't use tf.nn.dynamic_rnn. I've also read the documentation for tf.nn.dynamic_rnn, but find it confusing. In particular I'm not sure what "max_time" and "input_size" mean here.
Can anyone give the shape of the input tensor in terms of n, t, and e, and/or an example of what that tensor would look like initialized with data from the small corpus I describe?
TensorFlow 0.9.0, Python 3.5.1, OS X 10.11.5
In your case, it looks like batch_size = 1, since you're looking at a single example. So max_time is n = 8 and input_size is the input depth, in your case e = 3. You would therefore construct an input tensor of shape [1, 8, 3]. It's batch-major, so the first (batch) dimension is 1. If, say, you had a second input at the same time with n = 6 words, you would combine the two by padding the second example to 8 words (zeros for the last 2 word embeddings), giving an input of shape [2, 8, 3].
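A small sketch (using NumPy for the embedding lookup) that builds that [1, 8, 3] tensor from the corpus and embeddings in the question:

import numpy as np

corpus = [1, 2, 3, 3, 2, 1, 1, 2]  # n = 8 words
embed = {1: [10, 10, 10], 2: [20, 20, 20], 3: [30, 30, 30]}  # e = 3

# Batch-major input for tf.nn.dynamic_rnn: [batch_size, max_time, input_size].
inputs = np.array([[embed[w] for w in corpus]])
print(inputs.shape)  # (1, 8, 3)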