I'm building a stacked convolutional autoencoder with TensorFlow Core (no high-level API, pure TensorFlow). I want to add non-trainable layers between the encoder and the decoder. Does anybody know how to add non-trainable layers to a TensorFlow graph? The TensorBoard graph picture is attached; the ops that appear in the blue marked box are the ones I want to make non-trainable, or one could say I do not want gradient computation on them.
TF Version: 1.15
I've tried the tf.stop_gradient() method, but it prevents the contribution of everything that comes before it. (TensorBoard graph attached.)
You have two options:
When you define the weights variable with tf.Variable or tf.get_variable, pass trainable=False. This will stop the variable from being added to the trainable variables collection (accessible through tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)), which is used by default as the list of variables to train by the optimizer.
When you define the optimization step with minimize or compute_gradients, pass a var_list argument with the list of variables that you want to train. The optimizer will then ignore the trainable variables collection and will only affect the listed variables.
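A minimal TF 1.x sketch of both options on a made-up two-variable model (the names w_frozen and w_out are purely illustrative, not from your graph):
import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])

# Option 1: mark a variable as non-trainable when you create it
w_frozen = tf.get_variable("w_frozen", shape=[4, 4], trainable=False)
h = tf.matmul(x, w_frozen)

# This variable stays trainable
w_out = tf.get_variable("w_out", shape=[4, 1])
pred = tf.matmul(h, w_out)
loss = tf.reduce_mean(tf.square(pred - y))

# Option 2: hand the optimizer an explicit var_list; everything else is left untouched
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=[w_out])
Either option alone is enough; option 2 is handy when you cannot change how the frozen variables were created.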
Related
In TensorFlow's official documentation, they always pass training=True when calling a Keras model in a training loop, for example logits = mnist_model(images, training=True).
I tried help(tf.keras.Model.call), and it shows:
Help on function call in module tensorflow.python.keras.engine.network:
call(self, inputs, training=None, mask=None)
Calls the model on new inputs.
In this case `call` just reapplies
all ops in the graph to the new inputs
(e.g. build a new computational graph from the provided inputs).
Arguments:
inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run
the `Network` in training mode or inference mode.
mask: A mask or list of masks. A mask can be
either a tensor or None (no mask).
Returns:
A tensor if there is a single output, or
a list of tensors if there are more than one outputs.
It says that training is a "Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode", but I didn't find any information about these two modes.
In a nutshell, I don't know what the influence of this argument is. And what happens if I leave this argument out when training?
Some neural network layers behave differently during training and inference, for example Dropout and BatchNormalization layers. Take Dropout, for example:
During training, dropout will randomly drop out units and correspondingly scale up activations of the remaining units.
During inference, it does nothing (since you usually don't want the randomness of dropping out units here).
The training argument lets the layer know which of the two "paths" it should take. If you set this incorrectly, your network might not behave as expected.
For BatchNormalization, the training argument indicates whether the layer should behave in training mode or in inference mode:
training=True: The layer will normalize its inputs using the mean and variance of the current batch of inputs.
training=False: The layer will normalize its inputs using the mean and variance of its moving statistics, learned during training.
Usually training=False is used at inference time, but in some networks, such as the pix2pix cGAN, training=True is used at both training and inference time.
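As a small illustration, here is a made-up toy model (not the mnist_model from the docs) called in both modes:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # active only when training=True
    tf.keras.layers.BatchNormalization(),  # batch statistics vs. moving statistics
])

images = tf.random.uniform([2, 4])
train_logits = model(images, training=True)   # dropout applied, batch statistics used
infer_logits = model(images, training=False)  # dropout is a no-op, moving statistics used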
Good morning everyone,
I'm trying to implement this model, where the neural network's inputs are based on a trainable vocabulary matrix (each row in the matrix represents a word entry in the vocabulary). I'm using Keras (TensorFlow backend), and I was wondering if it's possible to define a trainable variable (without adding a custom layer) such that this variable is trained along with the rest of the neural network, like a TensorFlow variable?
Could you please give a short example of how I can do it?
Thanks in advance.
The neural network's inputs are based on a trainable vocabulary matrix (each row in the matrix represents a word entry in the vocabulary)
This is the definition of a word embedding.
There is already an embedding layer in Keras, you don't have to reimplement it.
You can find an easy example of how to use it here.
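For instance, a minimal sketch (the vocabulary size, embedding dimension, and sequence length below are made-up numbers):
from tensorflow import keras

vocab_size = 1000   # number of distinct words (made up)
embed_dim = 64      # size of each word vector (made up)
seq_len = 10        # words per input sample (made up)

model = keras.Sequential([
    # The embedding matrix (vocab_size x embed_dim) is a trainable weight of the model;
    # each row is the learned vector for one word index.
    keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim, input_length=seq_len),
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
The embedding weights are updated by backpropagation together with the rest of the network, which is exactly the "trainable vocabulary matrix" behaviour you describe.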
When I try to fine-tune a VGG network, I only want to update the weights after the 5th convolutional layer. In Caffe, we can cancel backpropagation in the configuration file. What should I do in TensorFlow? Thanks!
Just use tf.stop_gradient() on the input of your 5th layer. TensorFlow will not backpropagate the error below it. tf.stop_gradient() is an operation that acts as the identity function in the forward direction but stops the gradient in the backward direction.
From documentation:
tf.stop_gradient
Stops gradient computation.
When executed in a graph, this op outputs its input tensor as-is.
When building ops to compute gradients, this op prevents the contribution of its inputs from being taken into account. Normally, the gradient generator adds ops to a graph to compute the derivatives of a specified 'loss' by recursively finding out inputs that contributed to its computation. If you insert this op in the graph, its inputs are masked from the gradient generator. They are not taken into account for computing gradients.
Otherwise you can use optimizer.minimize(loss, var_list=variables_of_fifth_layer). Here you run backpropagation but update only the variables of your 5th layer.
For a fast selection of the variables of interest you could:
Define all the variables that you don't want to update with trainable=False, and use variables_of_fifth_layer = tf.trainable_variables().
Divide the layers into specific variable scopes and then use variables_of_fifth_layer = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "scope/of/fifth/layer").
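A rough TF 1.x sketch on a toy two-layer stand-in for VGG (the scope names "early" and "late" are illustrative); both tricks are shown in one graph for brevity, but either one alone is enough:
import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 8])
y = tf.placeholder(tf.float32, [None, 1])

# Early layers (stand-in for conv1-conv4): freeze them by cutting the gradient flow
with tf.variable_scope("early"):
    h = tf.layers.dense(x, 8, activation=tf.nn.relu)
h = tf.stop_gradient(h)  # identity in the forward pass, no gradient flows back into "early"

# Later layers (stand-in for conv5 and the classifier): these get fine-tuned
with tf.variable_scope("late"):
    pred = tf.layers.dense(h, 1)
loss = tf.reduce_mean(tf.square(pred - y))

# Scope-based selection: hand the optimizer only the "late" variables
late_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="late")
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=late_vars)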
I am trying to fine tune the last few layers in the tensorflow/slim resnet-v2-50 model for a dataset that I have.
I am struggling to find the names of the layers that I can train. In a TensorFlow model, is there a way to find the names of the layers that are trainable? Is there a way to get these names in an ordered way so that I can select the last few layers to train? Is there a way to get this information from TensorBoard?
Just type
print(tf.trainable_variables())
This will print all the trainable variables.
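If you also want them in an ordered way so you can pick the last few, something along these lines should work, since the trainable-variables collection is populated in creation order (the slice of 4 below is just an example, assuming each layer contributes a kernel and a bias):
import tensorflow as tf  # TF 1.x

# The collection is populated in creation order, so the last entries
# belong to the last layers that were built.
for v in tf.trainable_variables():
    print(v.name, v.shape)

# e.g. keep only the variables of the last two layers
last_vars = tf.trainable_variables()[-4:]
# then pass them to the optimizer, e.g. optimizer.minimize(loss, var_list=last_vars)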
When you want to train or optimize only certain layers of a pre-trained network, this is what you need to know.
TensorFlow's minimize method takes an optional argument var_list, a list of variables to be adjusted through back-propagation.
If you don't specify var_list, any TF variable in the graph could be adjusted by the optimizer. When you specify some variables in var_list, TF holds all other variables constant.
Here's an example of a script which jonbruner and his collaborator have used.
import tensorflow as tf

tvars = tf.trainable_variables()
# keep only the variables whose names contain 'g_' (the generator variables)
g_vars = [var for var in tvars if 'g_' in var.name]
# run the optimizer on those variables only
g_trainer = tf.train.AdamOptimizer(0.0001).minimize(g_loss, var_list=g_vars)
This finds all the variables they defined earlier that have "g_" in the variable name, puts them into a list, and runs the ADAM optimizer on them.
You can find the related answers here on Quora
Is it possible to get the gradients with respect to each layer in Caffe CNNs, edit them, and then apply the new gradients in the training process? If possible, using the pycaffe interface.
For example, in TensorFlow this can be done with the functions:
given_optimizer.compute_gradients(total_loss)
given_optimizer.apply_gradients(grads)
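For concreteness, this is roughly how it looks in TensorFlow on a toy model (the clipping step just stands in for whatever gradient edit is wanted):
import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
pred = tf.layers.dense(x, 1)
total_loss = tf.reduce_mean(tf.square(pred - y))

given_optimizer = tf.train.GradientDescentOptimizer(0.01)

# compute_gradients returns a list of (gradient, variable) pairs
grads_and_vars = given_optimizer.compute_gradients(total_loss)

# edit the gradients (here: clip them), then apply the edited versions
edited = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads_and_vars if g is not None]
train_op = given_optimizer.apply_gradients(edited)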
I'm not sure what you mean by "apply the new gradients in the training process", but you can access the gradients in the pycaffe interface:
import caffe
net = caffe.Net('/path/to/net.prototxt', '/path/to/weights.caffemodel', caffe.TEST)
# provide inputs to the net, do a pass so that meaningful data/gradients propagate to all the layers
net.forward_backward_all()
# once data/gradients are updated, you can access them
net.blobs['blob_name'].diff # access the gradient of blob 'blob_name'
net.layers[5].blobs[0].diff # access the gradient of the first parameter blob of the 6th layer
To map between layer names and layer indices, you can use this code:
list(net._layer_names).index('layer_name')
This will return the index of layer 'layer_name'.