Tensorflow tf.expand_dims

The original Tensorflow tutorial includes the following code:
batch_size = tf.size(labels)
labels = tf.expand_dims(labels, 1)
indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
concated = tf.concat(1, [indices, labels])
onehot_labels = tf.sparse_to_dense(concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
The second line adds a dimension to the labels tensor. However, labels was fed in via a feed dictionary, so it should already have shape [batch_size, NUM_CLASSES]. If so, why is expand_dims used here?

That tutorial is pretty old. You're referencing version 0.6, whereas TensorFlow is at 0.11 as of this post (11-20-2016), so many functions were different back in v0.6.
Anyway, to answer your question:
The labels in MNIST were just encoded as the digits 0-9. However, the loss function expected the labels to be encoded as a one-hot vector.
The labels are not already [batch_size, NUM_CLASSES]; in that example they were just [batch_size].
This could have been done via similar numpy functions. They have also since provided functions to get the labels from the MNIST dataset in TensorFlow as one-hot vectors, which do already have the shape you stated.
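For reference, a minimal sketch of the same one-hot construction using the later tf.one_hot op (shapes assumed as in the tutorial, with integer labels of shape [batch_size]):
import tensorflow as tf

NUM_CLASSES = 10                              # MNIST digits 0-9
labels = tf.placeholder(tf.int64, [None])     # integer labels, shape [batch_size]
# tf.one_hot produces the [batch_size, NUM_CLASSES] tensor the question expected
onehot_labels = tf.one_hot(labels, depth=NUM_CLASSES, on_value=1.0, off_value=0.0)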

Related

How to properly use tf.metrics.mean_iou in Tensorflow to show confusion matrix on Tensorboard?

I found evaluation script in Tensorflow official implementation of DeeplabV3+ (eval.py) uses tf.metrics.mean_iou to update mean IOU, and adds it to Tensorboard for record.
tf.metrics.mean_iou actually returns 2 tensors: one is the calculated mean IOU, and the other is an update_op, which according to the official doc is an operation that increments the confusion matrix. It seems that every time you want to get the calculated mean_iou, you have to call that update_op first.
I am trying to add this update_op into the summary as a tensor, but it does not work. My question is: how can I add this confusion matrix to Tensorboard?
I saw some other threads on how to calculate the confusion matrix and add it to Tensorboard with extra operations. I would just like to know whether this can be done without those extra operations.
Any help would be appreciated.
I will post my answer here since someone upvoted it.
Let's say you defined mean_iou op in the following manner:
miou, update_op = tf.metrics.mean_iou(
    predictions, labels, dataset.num_of_classes, weights=weights)
tf.summary.scalar(predictions_tag, miou)
If you look at your graph in Tensorboard, you will find a node named 'mean_iou', and after expanding this node you will find an op called 'total_confusion_matrix'. This is what you need to calculate recall and precision for each class.
After you get the node name, you can add it to your Tensorboard via tf.summary.text or print it in your terminal with the tf.print function. An example is posted below:
import sys  # needed for output_stream below

miou, update_op = tf.metrics.mean_iou(
    predictions, labels, dataset.num_of_classes, weights=weights)
tf.summary.scalar(predictions_tag, miou)

# Get the correct tensor name of the confusion matrix; it may vary between graphs
confusion_matrix = tf.get_default_graph().get_tensor_by_name('mean_iou/total_confusion_matrix:0')

# Calculate precision and recall matrices
precision = confusion_matrix / tf.reshape(tf.reduce_sum(confusion_matrix, 1), [-1, 1])
# the column sums must be reshaped to a row vector so broadcasting divides column-wise
recall = confusion_matrix / tf.reshape(tf.reduce_sum(confusion_matrix, 0), [1, -1])

# Print precision, recall and miou in the terminal
precision_op = tf.print("Precision:\n", precision,
                        output_stream=sys.stdout)
recall_op = tf.print("Recall:\n", recall,
                     output_stream=sys.stdout)
miou_op = tf.print("Miou:\n", miou,
                   output_stream=sys.stdout)

# Add the precision and recall matrices to Tensorboard
tf.summary.text('recall_matrix', tf.dtypes.as_string(recall, precision=4))
tf.summary.text('precision_matrix', tf.dtypes.as_string(precision, precision=4))

# Create summary hooks
summary_op = tf.summary.merge_all()
summary_hook = tf.contrib.training.SummaryAtEndHook(
    log_dir=FLAGS.eval_logdir, summary_op=summary_op)
precision_op_hook = tf.train.FinalOpsHook(precision_op)
recall_op_hook = tf.train.FinalOpsHook(recall_op)
miou_op_hook = tf.train.FinalOpsHook(miou_op)
hooks = [summary_hook, precision_op_hook, recall_op_hook, miou_op_hook]

num_eval_iters = None
if FLAGS.max_number_of_evaluations > 0:
    num_eval_iters = FLAGS.max_number_of_evaluations

if FLAGS.quantize_delay_step >= 0:
    tf.contrib.quantize.create_eval_graph()

tf.contrib.training.evaluate_repeatedly(
    master=FLAGS.master,
    checkpoint_dir=FLAGS.checkpoint_dir,
    eval_ops=[update_op],
    max_number_of_evaluations=num_eval_iters,
    hooks=hooks,
    eval_interval_secs=FLAGS.eval_interval_secs)
Then you will have your precision and recall matrices summarised in your Tensorboard.

autocorrelation of the input in tensorflow/keras

I have a 1D input signal. I want to compute the autocorrelation as part of the neural net, for further use inside the network.
I need to perform a convolution of the input with the input itself.
To perform a convolution in a Keras custom layer / TensorFlow, we need the following parameters:
data shape is "[batch, in_height, in_width, in_channels]",
filter shape is "[filter_height, filter_width, in_channels, out_channels]"
There is no batch dimension in the filter shape, but in my case the filter needs to be the input.
TensorFlow now has an auto_correlation function. It should be in release 1.6. If you build from source you can use it right now (see e.g. the github code).
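As a hedged sketch of how that might look once the function is available (the exact module path is an assumption and has moved between releases; in recent versions it lives in TensorFlow Probability as tfp.stats.auto_correlation):
import tensorflow as tf

# Assumes TF >= 1.6, where auto_correlation is exposed under tf.contrib.distributions
signal = tf.placeholder(tf.float32, [None, 1024])   # hypothetical [batch, time] input
acf = tf.contrib.distributions.auto_correlation(signal, axis=-1, max_lags=128)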
Here is a possible solution.
By self convolution, I understood a regular convolution where the filter is exactly the same as the input (if it's not that, sorry for my misunderstanding).
We need a custom function for that, and a Lambda layer.
At first I used padding = 'same', which gives outputs with the same length as the inputs. I'm not sure exactly what output length you want, but if you want more, you should add the padding yourself before doing the convolution (see the sketch just below). In the example with length 7, for a complete convolution from one end to the other, this manual padding would mean 6 zeros before and 6 zeros after the input, together with padding = 'valid'. Find the backend functions here.
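A small sketch of that manual padding option, assuming the Keras backend's temporal_padding (which pads the length axis of a 3D tensor) is what you'd reach for:
import keras.backend as K

def pad_for_full_conv(x, length):
    # pad length-1 zeros on each side of the length axis,
    # then use padding='valid' in K.conv1d for the full convolution
    return K.temporal_padding(x, padding=(length - 1, length - 1))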
Working example - Input (5,7,2)
import numpy as np
from keras.models import Model
from keras.layers import *
import keras.backend as K

batch_size = 5
length = 7
channels = 2
channels_batch = batch_size * channels

def selfConv1D(x):
    # this function unfortunately needs to know the shapes beforehand,
    # mainly because of the for loop; for the other lines there are workarounds,
    # but those workarounds are not necessary since we have this limitation anyway

    # original x: (batch_size, length, channels)
    # bring channels to the batch position:
    x = K.permute_dimensions(x, [2, 0, 1])  # (channels, batch_size, length)

    # treat channels as individual samples (since we don't mix channels)
    x = K.reshape(x, (channels_batch, length, 1))

    # here we get a copy of x reshaped to match the filter shape:
    filters = K.permute_dimensions(x, [1, 2, 0])  # (length, 1, channels_batch)

    # now, lacking a suitable batched conv function, we make a loop
    allChannels = []
    for i in range(channels_batch):
        f = filters[:, :, i:i+1]
        allChannels.append(
            K.conv1d(
                x[i:i+1],
                f,
                padding='same',
                data_format='channels_last'))
        # although channels_last is my default config, I found this bug:
        # https://github.com/fchollet/keras/issues/8183

    # each convolution output: (1, length, 1)
    # concatenate all results as samples
    x = K.concatenate(allChannels, axis=0)  # (channels_batch, length, 1)

    # restore the original form (passing channels to the end)
    x = K.reshape(x, (channels, batch_size, length))
    return K.permute_dimensions(x, [1, 2, 0])  # (batch_size, length, channels)

# input data for the test:
x = np.array(range(70)).reshape((5, 7, 2))

# little model that just performs the convolution
inp = Input((7, 2))
out = Lambda(selfConv1D)(inp)
model = Model(inp, out)

# checking results
p = model.predict(x)
for i in range(5):
    print("x", x[i])
    print("p", p[i])
You can just use tf.nn.conv3d by treating the "batch size" as "depth":
# treat the batch size as depth
data = tf.reshape(input_data, [1, batch, in_height, in_width, in_channels])
# the kernel must be an actual 5-D filter tensor with this shape (filter_data is a
# hypothetical tensor holding the filter values, not part of the original answer)
kernel = tf.reshape(filter_data,
                    [filter_depth, filter_height, filter_width, in_channels, out_channels])
out = tf.nn.conv3d(data, kernel, strides=[1, 1, 1, 1, 1], padding='SAME')

RNN and LSTM implementation in tensorflow

I have been trying to learn how to code up an RNN and LSTM in tensorflow. I found an example online on this blog post
http://r2rt.com/recurrent-neural-networks-in-tensorflow-ii.html
Below are the snippets I am having trouble understanding, for an LSTM network to be used eventually for char-rnn generation:
x = tf.placeholder(tf.int32, [batch_size, num_steps], name='input_placeholder')
y = tf.placeholder(tf.int32, [batch_size, num_steps], name='labels_placeholder')
embeddings = tf.get_variable('embedding_matrix', [num_classes, state_size])
rnn_inputs = [tf.squeeze(i) for i in tf.split(
    1, num_steps, tf.nn.embedding_lookup(embeddings, x))]
Different Section of the Code Now where the weights are defined
with tf.variable_scope('softmax'):
    W = tf.get_variable('W', [state_size, num_classes])
    b = tf.get_variable('b', [num_classes], initializer=tf.constant_initializer(0.0))
logits = [tf.matmul(rnn_output, W) + b for rnn_output in rnn_outputs]
y_as_list = [tf.squeeze(i, squeeze_dims=[1]) for i in tf.split(1, num_steps, y)]
x is the data to be fed, and y is the set of labels. In the LSTM equations we have a series of gates: x(t) gets multiplied by one set of weights, prev_hidden_state gets multiplied by another set, biases are added, and non-linearities are applied.
Here are the doubts I have:
In this case only one weight matrix is defined. Does that mean it works for both x(t) and prev_hidden_state as well?
For the embeddings matrix I know it has to be multiplied by the weight matrix, but why is the first dimension num_classes?
For the rnn_inputs we are using squeeze, which removes dimensions of 1, but why would I want to do that with a one-hot encoding?
Also, from the splits I understand that we are unrolling the x of dimension (batch_size X num_steps) into discrete (batch_size X 1) vectors and then passing these values through the network. Is this right?
Maybe I can help.
In this case only one weight matrix is defined. Does that mean it works for both x(t) and prev_hidden_state as well?
There are more weights: they come from tf.nn.rnn_cell.LSTMCell. They are the internal weights of the RNN cell, which TensorFlow creates implicitly when the cell is used.
The weight matrix you explicitly defined is the transform from the hidden state to the vocabulary space.
You can view the implicit weights as handling the recurrent part: they take the previous hidden state and the current input and produce the new hidden state. The weight matrix you defined transforms the hidden states (i.e. state_size = 200) to the larger vocabulary space (i.e. vocab_size = 2000).
For further information, maybe you can view this tutorial : http://colah.github.io/posts/2015-08-Understanding-LSTMs/
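As a minimal sketch (with hypothetical sizes, not the blog's exact code), you can see where those implicit weights come from by building a cell once and listing the trainable variables:
import tensorflow as tf

state_size = 200      # hypothetical
input_size = 200      # hypothetical (the embedding size in the post)
batch_size = 32       # hypothetical

cell = tf.nn.rnn_cell.LSTMCell(state_size)
inputs = tf.placeholder(tf.float32, [batch_size, input_size])
state = cell.zero_state(batch_size, tf.float32)
output, new_state = cell(inputs, state)   # this call creates the cell's internal kernel/bias

# The recurrent weights live inside the cell's scope, separate from the 'softmax' W and b
print([v.name for v in tf.trainable_variables()])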
For the embeddings matrix I know it has to be multiplied by the weight matrix, but why is the first dimension num_classes?
num_classes corresponds to the vocab_size; the embedding matrix transforms the vocabulary into the required embedding size (in this example equal to the state_size).
For the rnn_inputs we are using squeeze, which removes dimensions of 1, but why would I want to do that with a one-hot encoding?
You need to get rid of the extra dimension because tf.nn.rnn takes inputs as (batch_size, input_size) instead of (batch_size, 1, input_size).
Also, from the splits I understand that we are unrolling the x of dimension (batch_size X num_steps) into discrete (batch_size X 1) vectors and then passing these values through the network. Is this right?
To be more precise: after embedding, (batch_size, num_steps, state_size) turns into a list of num_steps elements, each of size (batch_size, 1, state_size).
The flow goes like this (see the sketch after this list):
The embedding matrix embeds each word as a state_size-dimensional vector (a row of the matrix), so its size is (vocab_size, state_size).
Retrieve the rows indexed by the x placeholder to get the RNN input, which has size (batch_size, num_steps, state_size).
tf.split splits the inputs into num_steps tensors of shape (batch_size, 1, state_size).
tf.squeeze squeezes them to (batch_size, state_size), forming the desired input format for tf.nn.rnn.
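Here is that sketch, written with the newer tf.split/tf.squeeze argument order (the blog post uses the older 0.x order) and hypothetical sizes:
import tensorflow as tf

batch_size, num_steps = 32, 10          # hypothetical
num_classes, state_size = 2000, 200     # vocab size and embedding/state size from the discussion

x = tf.placeholder(tf.int32, [batch_size, num_steps])
embeddings = tf.get_variable('embedding_matrix', [num_classes, state_size])

embedded = tf.nn.embedding_lookup(embeddings, x)        # (batch_size, num_steps, state_size)
pieces = tf.split(embedded, num_steps, axis=1)          # num_steps x (batch_size, 1, state_size)
rnn_inputs = [tf.squeeze(p, axis=[1]) for p in pieces]  # each (batch_size, state_size)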
If there's any problem with the tensorflow methods, maybe you can search them in the tensorflow API for more detailed introduction.

Confused on how tensorflow feed_dict works

I recently started using tensorflow and I'm really confused about the functionality of feed_dict.
Looking at the MNIST example from the tensorflow website, x is a symbolic placeholder that will be filled with a new batch of images every training iteration, so 'None' here could also be 'batch_size':
x = tf.placeholder(tf.float32, shape=[None, 784])
When looking at the convolutional part of this tutorial, there's a command to reshape x from its flattened 1x784 shape back to a 2D 28x28 image shape:
x_image = tf.reshape(x, [-1,28,28,1])
During the training loop, x is fed through the command:
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
My question is: when we feed values into x, does tensorflow automatically vectorize every op involving x? So for example, when we define the op
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
will this automatically work across the entire batch?
If x is an ndarray with each row being a flattened image, does tensorflow, because we specified shape 'None' in the x placeholder, automatically know to use each row as an individual training sample and vectorize all subsequent ops?
The shape argument is used for static shape inference (i.e., tensor.get_shape) and is optional. TensorFlow doesn't vectorize anything automatically, but for binary element-wise ops it uses broadcasting, which looks a bit like that. In your example, tf.nn.conv2d is an operation that treats each row as an example, so it works with batches but not with individual examples. Also, batch[0] is a batch of inputs and batch[1] is a batch of labels.
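A minimal sketch (not from the tutorial) of how a [None, 784] placeholder accepts batches of any size:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
y = tf.reduce_sum(x, axis=1)     # an op defined once over the whole batch

with tf.Session() as sess:
    # the same graph accepts any leading batch dimension because shape[0] is None
    print(sess.run(y, feed_dict={x: np.zeros((5, 784), np.float32)}).shape)   # (5,)
    print(sess.run(y, feed_dict={x: np.zeros((50, 784), np.float32)}).shape)  # (50,)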

Per pixel softmax for fully convolutional network

I'm trying to implement something like a fully convolutional network, where the last convolution layer uses filter size 1x1 and outputs a 'score' tensor. The score tensor has shape [Batch, height, width, num_classes].
My question is: what function in tensorflow can apply the softmax operation to each pixel, independently of the other pixels? The tf.nn.softmax op doesn't seem to be meant for such a purpose.
If there is no such op available, I guess I have to write one myself.
Thanks!
UPDATE: if I do have to implement it myself, I think I may need to reshape the input tensor to [N, num_classes] where N = Batch x width x height, apply tf.nn.softmax, and then reshape it back. Does that make sense?
Reshaping it to 2d and then reshaping it back, like you guessed, is the right approach.
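A minimal sketch of that reshape approach (the names and num_classes value here are illustrative, not from the question):
import tensorflow as tf

num_classes = 21                                                        # hypothetical
scores = tf.placeholder(tf.float32, [None, None, None, num_classes])   # [batch, height, width, num_classes]

flat = tf.reshape(scores, [-1, num_classes])      # [batch*height*width, num_classes]
flat_probs = tf.nn.softmax(flat)                  # softmax over the class dimension
probs = tf.reshape(flat_probs, tf.shape(scores))  # back to [batch, height, width, num_classes]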
You can use this function.
I found it by searching GitHub.
import tensorflow as tf
"""
Multi dimensional softmax,
refer to https://github.com/tensorflow/tensorflow/issues/210
compute softmax along the dimension of target
the native softmax only supports batch_size x dimension
"""
def softmax(target, axis, name=None):
    with tf.name_scope(name, 'softmax', values=[target]):
        max_axis = tf.reduce_max(target, axis, keep_dims=True)
        target_exp = tf.exp(target - max_axis)
        normalize = tf.reduce_sum(target_exp, axis, keep_dims=True)
        softmax = target_exp / normalize
        return softmax
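For the per-pixel case above, a call along these lines (variable name assumed) would apply it over the class dimension of the [Batch, height, width, num_classes] score tensor:
probs = softmax(score, axis=3)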