Is it possible to make a trainable variable not trainable? - tensorflow

I created a trainable variable in a scope. Later, I entered the same scope, set the scope to reuse_variables, and used get_variable to retrieve the same variable. However, I cannot set the variable's trainable property to False. My get_variable line is like:
weight_var = tf.get_variable('weights', trainable = False)
But the variable 'weights' is still in the output of tf.trainable_variables.
Can I set a shared variable's trainable flag to False by using get_variable?
The reason I want to do this is that I'm trying to reuse low-level filters pre-trained on VGG in my model. I want to build the graph as before, retrieve the weights variable, assign the VGG filter values to it, and then keep them fixed during the subsequent training steps.

After looking at the documentation and the code, I was not able to find a way to remove a Variable from the TRAINABLE_VARIABLES.
Here is what happens:
The first time tf.get_variable('weights', trainable=True) is called, the variable is added to the list of TRAINABLE_VARIABLES.
The second time you call tf.get_variable('weights', trainable=False), you get the same variable, but the argument trainable=False has no effect because the variable is already present in the list of TRAINABLE_VARIABLES (and there is no way to remove it from there).
First solution
When calling the minimize method of the optimizer (see doc.), you can pass a var_list=[...] argument with the variables you want to optimize.
For instance, if you want to freeze all the layers of VGG except the last two, you can pass the weights of the last two layers in var_list.
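A minimal sketch of this approach; the scope names 'vgg/fc7' and 'vgg/fc8', the loss tensor, and the learning rate are placeholders, not taken from the question:
import tensorflow as tf

all_vars = tf.trainable_variables()
# keep only the variables belonging to the last two layers (names are assumed)
last_two = [v for v in all_vars
            if v.name.startswith('vgg/fc7') or v.name.startswith('vgg/fc8')]
# only the variables in var_list receive gradient updates; everything else stays frozen
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=last_two)  # loss is your model's loss tensor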
Second solution
You can use a tf.train.Saver() to save variables and restore them later (see this tutorial).
First you train your entire VGG model with all trainable variables. You save them in a checkpoint file by calling saver.save(sess, "/path/to/dir/model.ckpt").
Then (in another file) you train the second version with non-trainable variables. You load the variables previously stored with saver.restore(sess, "/path/to/dir/model.ckpt").
Optionally, you can decide to save only some of the variables in your checkpoint file. See the doc for more info.
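A rough outline of that two-phase workflow, assuming the checkpoint path from above and a graph already built in each script:
import tensorflow as tf

# phase 1: train the fully trainable model and save everything
saver = tf.train.Saver()  # saves all variables by default
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training loop ...
    saver.save(sess, "/path/to/dir/model.ckpt")

# phase 2 (another script): rebuild the graph with trainable=False where needed,
# then load the saved values before continuing training
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/path/to/dir/model.ckpt")
    # ... training loop; the frozen variables keep their restored values ...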

When you want to train or optimize only certain layers of a pre-trained network, this is what you need to know.
TensorFlow's minimize method takes an optional argument var_list, a list of variables to be adjusted through back-propagation.
If you don't specify var_list, any TF variable in the graph could be adjusted by the optimizer. When you specify some variables in var_list, TF holds all other variables constant.
Here's an example of a script which jonbruner and his collaborator have used.
tvars = tf.trainable_variables()
g_vars = [var for var in tvars if 'g_' in var.name]
g_trainer = tf.train.AdamOptimizer(0.0001).minimize(g_loss, var_list=g_vars)
This finds all the variables they defined earlier that have "g_" in the variable name, puts them into a list, and runs the ADAM optimizer on them.
You can find the related answers on Quora.

In order to remove a variable from the list of trainable variables, you can first access the collection through:
trainable_collection = tf.get_collection_ref(tf.GraphKeys.TRAINABLE_VARIABLES)
Here, trainable_collection contains a reference to the collection of trainable variables. If you pop elements from this list, for example trainable_collection.pop(0), you will remove the corresponding variable from the trainable variables, and thus that variable will not be trained.
Although this works with pop, I am still struggling to find a way to correctly use remove with the correct argument, so we don't depend on the index of the variables.
EDIT: Given that you have the name of the variables in the graph (you can obtain that by inspecting the graph protobuf or, what is easier, using Tensorboard), you can use it to loop through the list of trainable variables and then remove the variables from the trainable collection.
Example: say that I want the variables with names "batch_normalization/gamma:0" and "batch_normalization/beta:0" NOT to be trained, but they are already added to the TRAINABLE_VARIABLES collection. What I can do is:
# gets a reference to the list containing the trainable variables
trainable_collection = tf.get_collection_ref(tf.GraphKeys.TRAINABLE_VARIABLES)
variables_to_remove = list()
for vari in trainable_collection:
    # uses the attribute 'name' of the variable
    if vari.name == "batch_normalization/gamma:0" or vari.name == "batch_normalization/beta:0":
        variables_to_remove.append(vari)
for rem in variables_to_remove:
    trainable_collection.remove(rem)
This will successfully remove the two variables from the collection, and they will not be trained anymore.

You can use tf.get_collection_ref to get a reference to the collection itself, rather than tf.get_collection, which returns a copy of the list.

Related

Is the use of a non-trainable weight equivalent to the use of a Python variable in TensorFlow?

I have a piece of code (not mine) that defines a non-trainable variable that is used to define another property of the layer, which looks something like
initial_weight_val = 1.0
w = my_layer.add_weight(name=layer.name + '/my_weight', shape=(),
                        initializer=tf.initializers.constant(initial_weight_val),
                        trainable=False)
# Use w to set another parameter of the layer.
my_layer.the_parameter = some_function(w)
Please, do not tell me what a non-trainable variable is (of course, I know what it is); this is also discussed in What is the definition of a non-trainable parameter?.
However, given that w will not be changed (I think), I don't get why someone would define such a variable, rather than simply using the Python variable initial_weight_val directly, especially when using TensorFlow 2.0 (which is my case and the only case I am interested in). Of course, one possibility would be that this variable could become trainable, in case one needs it to be trainable later, but why should one anticipate this, anyway?
Can I safely use initial_weight_val to define the_parameter, i.e. pass initial_weight_val to some_function rather than w?
I am concerned with this issue because I cannot save a model with a variable, because I get the error "variable is not JSON serializable" (Keras and TF are so buggy, btw!), so I was trying to understand the equivalence between user-defined non-trainable variables and Python variables.
You must make sure that this value doesn't change at all, and that it's a single value.
Then yes, you can use a Python var (if a python var is compatible with the function that uses this w).
In which case, you'd put that initial_weight_val both in the __init__ and in the get_config methods of the layer in order for it to be properly saved.
Now, if the function only accepts tensors, but you're still sure that this value will not change at all, then you can, in call, make w = tf.constant(self.initial_weight_val). You still keep the value in __init__ and in get_config as a Python var.
Finally, if this value, although non-trainable is changing, or if it's a tensor with many elements, then you'd better let it be a weight. (Non-trainable means "non trainable by backpropagation", but still allowed to be updated here and there).
There should be absolutely no problem saving and loading this weight if you defined it correctly, which should be inside build, with self.add_weight(...), as shown in https://keras.io/layers/writing-your-own-keras-layers/ .
A cool Keras example that uses non-trainable but updatable weights is the BatchNormalization layer. The mean and std of the batches are updated every pass, but not via backpropagation (thus trainable=False)
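For what it's worth, here is a minimal sketch of the first option: keeping the value as a plain Python attribute stored in __init__ and get_config. The layer name ScaledDense and its scaling logic are invented for illustration; they are not from the question:
import tensorflow as tf

class ScaledDense(tf.keras.layers.Layer):  # hypothetical layer, just for illustration
    def __init__(self, units, initial_weight_val=1.0, **kwargs):
        super(ScaledDense, self).__init__(**kwargs)
        self.units = units
        self.initial_weight_val = initial_weight_val  # plain Python attribute, no tf.Variable

    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel',
                                      shape=(int(input_shape[-1]), self.units),
                                      initializer='glorot_uniform',
                                      trainable=True)
        super(ScaledDense, self).build(input_shape)

    def call(self, inputs):
        # the constant value is used directly; it is saved via get_config, not as a weight
        return tf.matmul(inputs, self.kernel) * self.initial_weight_val

    def get_config(self):
        config = super(ScaledDense, self).get_config()
        config.update({'units': self.units,
                       'initial_weight_val': self.initial_weight_val})
        return config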

Why is there a need to use tf.Variable?

In the following code I am unable to understand the need of using tf.Variable? I get the same value whether I use tf.Variable or omit it.
initial = tf.Variable(tf.truncated_normal(shape=[1, 10, 1], mean=0,
                                          stddev=0.1, seed=123))
As I answered in your other post, I will post it again: in TensorFlow, anything created using tf.Variable() will get updated during training through back-propagation, for example a weight matrix.
By default, every tf.Variable() is trainable unless you explicitly mark it as non-trainable.
If you do initial = tf.truncated_normal([5,10], mean=0, stddev=0.1) instead, TensorFlow only creates a tensor, not a variable, so there is nothing for the optimizer to train; the values stay constant throughout training.
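A quick illustration of the difference (TF1-style graph code; the variable name 'w' is just for the example):
import tensorflow as tf

w = tf.Variable(tf.truncated_normal([1, 10, 1], mean=0, stddev=0.1, seed=123), name='w')
t = tf.truncated_normal([1, 10, 1], mean=0, stddev=0.1, seed=123)
print(tf.trainable_variables())  # lists 'w:0'; the plain tensor t never appears here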

How to find the names of the trainable layers in a tensorflow model?

I am trying to fine tune the last few layers in the tensorflow/slim resnet-v2-50 model for a dataset that I have.
I am struggling to find the names of the layers that I can train. In a TensorFlow model, is there a way to find the names of the layers which are trainable? Is there a way to get these names in an ordered way so that I can select the last few layers to train? Is there a way to get this information from TensorBoard?
Just type
print(tf.trainable_variables())
This will print all the trainable variables.
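If you also want the names in order so you can pick the last few layers, something like this sketch works; the loss tensor and the choice of the last two variables are assumptions:
import tensorflow as tf

tvars = tf.trainable_variables()
for v in tvars:
    print(v.name, v.shape)  # printed in creation order, so the last entries are the last layers

last_layers = tvars[-2:]  # fine-tune only the final variables
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=last_layers)  # loss is your model's loss tensor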

How to create a non-trainable variable in Tensorflow?

Does it exist a parameter that specifies a tf.Variable as non-trainable, so that the variable is not included in tf.trainable_variables()?
You can mark variables as "non-trainable" on definition:
v = tf.Variable(tf.zeros([1]), trainable=False)
From the linked documentation (circa TensorFlow v0.11):
trainable: If True, the default, also adds the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. This collection is used as the default list of variables to use by the Optimizer classes.
Other variable-creation APIs, such as tf.get_variable, accept the same trainable flag.
You can create non-trainable variables in two different ways:
tf.Variable(a, trainable=False)
tf.get_variable("a", a, trainable=False)
There is no easy way to change a variable from trainable to non-trainable and vice versa. Also, there is no easy way to check whether a variable is trainable: you need to check whether its name appears in the list returned by tf.trainable_variables().
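As a small sketch, a check by name looks like this (the variable name is illustrative):
import tensorflow as tf

v = tf.get_variable("my_var", shape=[2, 2], trainable=False)
is_trainable = v.name in [x.name for x in tf.trainable_variables()]
print(is_trainable)  # False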

tensorflow restore only variables

I do some training in Tensorflow and save the whole session using a saver:
# ... define model
# add a saver
saver = tf.train.Saver()
# ... run a session
# ....
# save the model
save_path = saver.save(sess,fileSaver)
It works fine, and I can successfully restore the whole session by using the exact same model and calling:
saver.restore(sess, importSaverPath)
Now I want to modify only the optimizer while keeping the rest of the model constant (the computation graph stays the same apart from the optimizer):
# optimizer used before
# optimizer = tf.train.AdamOptimizer(learning_rate=learningRate).minimize(costPrediction)
# the new optimizer I want to use
optimizer = tf.train.RMSPropOptimizer(learning_rate=learningRate, decay=0.9,
                                      momentum=0.1, epsilon=1e-5).minimize(costPrediction)
I also want to continue the training from the last graph state I saved (i.e., I want to restore the state of my variables and continue with another training algorithm). Of course I cannot use:
saver.restore
any longer, because the graph has changed.
So my question is: is there a way to restore only variables using the saver.restore command (or even, maybe for later use, only a subset of variables), when the whole session has been saved? I looked for such feature in the API documentation and online, but could not find any example / detailed enough explanations that could help me get it to work.
It is possible to restore a subset of variables by passing the list of variables as the var_list argument to the Saver constructor. However, when you change the optimizer, additional variables may have been created (momentum accumulators, for instance), and variables associated with the previous optimizer, if any, would have been removed from the model. So simply using the old Saver object to restore will not work, especially if you had constructed it with the default constructor, which uses tf.all_variables as the argument to the var_list parameter. You have to construct the Saver object on the subset of variables that you created in your model, and then restore would work. Note that this would leave the new variables created by the new optimizer uninitialized, so you have to explicitly initialize them.
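A sketch of that idea, assuming the model's own variables live under a 'model' variable scope (the scope name and checkpoint path are placeholders):
import tensorflow as tf

model_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='model')
saver = tf.train.Saver(var_list=model_vars)  # restores only the shared model variables

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())    # also initializes the new optimizer slots
    saver.restore(sess, "/path/to/dir/model.ckpt")  # then overwrite the model variables
    # ... continue training with the new optimizer ...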
I see the same problem. Inspired by keveman's answer, my solution is:
Define your new graph (here only the new optimizer-related variables differ from the old graph).
Get all variables using tf.global_variables(). This returns a list of variables I call g_vars.
Get all optimizer-related variables using tf.contrib.framework.get_variables_by_suffix('some variable filter'). The filter may be 'RMSProp' or 'RMSProp_*'. This function returns a list of variables I call exclude_vars.
Get the variables in g_vars but not in exclude_vars. Simply use
vars = [item for item in g_vars if item not in exclude_vars]
These vars are common to both the new and the old graph, and you can now restore them from the old model.
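Putting those steps together, a possible sketch (the suffix filter and the checkpoint path are assumptions):
import tensorflow as tf

g_vars = tf.global_variables()
exclude_vars = tf.contrib.framework.get_variables_by_suffix('RMSProp')
restore_vars = [item for item in g_vars if item not in exclude_vars]

saver = tf.train.Saver(var_list=restore_vars)
with tf.Session() as sess:
    sess.run(tf.variables_initializer(exclude_vars))  # new optimizer variables start fresh
    saver.restore(sess, "/path/to/dir/model.ckpt")    # everything else comes from the old model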
You could recover the original Saver from a MetaGraph protobuf first and then use that saver to restore all the old variables safely. For a concrete example, take a look at the eval.py script in: TensorFlow: How do I release a model without source code?