Initializing new variables in TensorFlow

I have built and trained a model. In a second phase I want to replace the last two layers and retrain them on different data.
I keep getting uninitialized-variable errors even though I did run an initialization op on the new variables:
var_init_op = tf.initialize_variables(var_list=[fc1_weights, fc1_biases, fc2_weights, fc2_biases])
sess.run(var_init_op)
I understand I have to initialize the new optimizer (AdamOptimizer) as well, but I'm not sure how to do that.
Assuming I want to replace the optimizer (and other variables) mid-training, how do I initialize it without trashing the already trained variables?

You can get all the trainable variables using tf.trainable_variables(), exclude the variables that should be restored from the pretrained model, and then initialize only the remaining ones.
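A minimal sketch of this approach, using hypothetical stand-in variables (the `pretrained_w`/`fc*_weights` names are illustrative, not from the original post):

```python
import tensorflow.compat.v1 as tf  # so this sketch runs on TF1 and TF2
tf.disable_eager_execution()

# Hypothetical stand-ins: one variable restored from a checkpoint and
# two freshly created layer variables.
pretrained_w = tf.Variable(tf.ones([2]), name="pretrained_w")
fc1_weights = tf.Variable(tf.zeros([2]), name="fc1_weights")
fc2_weights = tf.Variable(tf.zeros([2]), name="fc2_weights")

restored = {pretrained_w}  # variables that came from the checkpoint
new_vars = [v for v in tf.trainable_variables() if v not in restored]
init_new_op = tf.variables_initializer(new_vars)

with tf.Session() as sess:
    sess.run(pretrained_w.initializer)  # stands in for saver.restore(...)
    sess.run(init_new_op)               # touches only the new variables
    uninitialized = sess.run(tf.report_uninitialized_variables())
```

tf.report_uninitialized_variables() is a handy sanity check here: if it returns an empty list, every variable (restored or new) is ready, and the restored values were never overwritten. Note that an optimizer's slot variables are not in tf.trainable_variables(), so if you create a new optimizer you should compute the same difference over tf.global_variables() instead.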

Related

Replacing TensorFlow Saver with Checkpoint

I've been using TensorFlow's Saver class to save model parameters, but that class is going away in TensorFlow 2, so I need to replace it with Checkpoint. I can't figure out how to do that. All the examples in the documentation for Checkpoint assume you're saving a tf.keras.Model. I'm not using Keras, so that doesn't apply.
Saver just takes a list of variables to save, so that's what I'm starting from. How do I pass that to Checkpoint? It expects every checkpointable object to be passed as a named argument. I was hoping I could just say variables=[var1, var2, ...], but it doesn't accept lists. I could pass every variable as a separate argument, but what do I use as the names? The variable names? That defeats the whole purpose of Checkpoint, which is to be more robust by not depending on variable names. What is the intended way of writing checkpoints in code that doesn't use Keras?
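For what it's worth, in current TF 2.x a plain Python list is itself a trackable object, so passing a list of variables as one named attribute does work; a small sketch (variable names and values here are illustrative):

```python
import os
import tempfile
import tensorflow as tf  # assumes TF 2.x, eager mode

v1 = tf.Variable(1.0)
v2 = tf.Variable([2.0, 3.0])

# A plain Python list of variables is trackable, so the whole list can
# be passed to Checkpoint as a single named attribute.
ckpt = tf.train.Checkpoint(variables=[v1, v2])
path = ckpt.save(os.path.join(tempfile.mkdtemp(), "ckpt"))

# Clobber the values, then restore them from the checkpoint.
v1.assign(-1.0)
v2.assign([0.0, 0.0])
ckpt.restore(path)  # matched by object graph position, not variable name

restored_v1 = float(v1.numpy())
restored_v2 = v2.numpy().tolist()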

Using the EMA'ed weights for evaluation in Tensorflow

In TensorFlow's tutorial it says that there are two ways to use the EMA'ed weights for evaluation:
1. Build a model that uses the shadow variables instead of the variables. For this, use the average() method, which returns the shadow variable for a given variable.
2. Build a model normally but load the checkpoint files to evaluate by using the shadow variable names. For this use the average_name() method. See the Saver class for more information on restoring saved variables.
I understand how to use the second method to use the EMA'ed weights for evaluation as an example is given. I was wondering if someone could give me a simple example of how to build a model that uses the shadow variables.
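A minimal sketch of the first method, with a hypothetical one-weight "model" standing in for a real network:

```python
import tensorflow.compat.v1 as tf  # TF1-style graph mode
tf.disable_eager_execution()

w = tf.Variable(0.0, name="w")              # hypothetical model weight
ema = tf.train.ExponentialMovingAverage(decay=0.9)
maintain_op = ema.apply([w])                # creates the shadow variable

# Method 1: build the eval path on the shadow variable directly.
w_shadow = ema.average(w)                   # the shadow variable for `w`
prediction = w_shadow * 2.0                 # hypothetical eval computation

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.assign(w, 1.0))
    sess.run(maintain_op)                   # shadow := 0.9*0.0 + 0.1*1.0
    shadow_val, pred_val = sess.run([w_shadow, prediction])
```

The point is that every op in the evaluation graph is wired to ema.average(var) instead of var, so evaluation sees the smoothed value (0.1 here) while training keeps updating the raw variable.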

How to structure the model for training and evaluation on the test set

I want to train a model. Every 1000 steps, I want to evaluate it on the test set and write the result to the TensorBoard log. However, there's a problem. I have code like this:
image_b_train, label_b_train = tf.train.shuffle_batch(...)
out_train = model.inference(image_b_train)
accuracy_train = tf.reduce_mean(...)
image_b_test, label_b_test = tf.train.shuffle_batch(...)
out_test = model.inference(image_b_test)
accuracy_test = tf.reduce_mean(...)
where model.inference declares the variables in the model. The problem is that the test set has its own queue, and I can't swap one queue for another in TensorFlow.
Currently I work around this by creating two graphs, one for training and the other for testing, and copying from one graph to the other with tf.train.Saver. Another solution might be to use tf.get_variable, but that is a global variable mechanism, and I don't like it because my code becomes less reusable.
Yes, you need two graphs, but these graphs can share variables. This can be done by:
- Using Keras layers (from tf.contrib.keras), which let you define the model once and use it to build two inference graphs
- Using slim-style layers (from tf.layers) with tf.get_variable and reuse
- Using tf.make_template to make your own model-like object, which can be called once to build the training graph and once to build the inference graph
- Using tf.estimator.Estimator, which lets you define a model function once and runs it for training and evaluation for you
There are other options, but any of these is well supported and should unblock you.
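As one concrete illustration, here is a sketch of the tf.make_template option with a hypothetical single-layer model (the `model`/`inference` names are made up for the example):

```python
import tensorflow.compat.v1 as tf  # TF1-style graph mode
tf.disable_eager_execution()

def model(images):
    # Hypothetical single-layer "inference"; variables must be created
    # with tf.get_variable so the template can reuse them.
    w = tf.get_variable("w", initializer=tf.ones([3, 2]))
    return tf.matmul(images, w)

inference = tf.make_template("model", model)

train_out = inference(tf.ones([1, 3]))  # first call creates the variables
test_out = inference(tf.ones([4, 3]))   # second call reuses them

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    num_vars = len(tf.trainable_variables())  # 1: the weight is shared
    out = sess.run(train_out)
```

Both calls build separate subgraphs (one per input queue) over the same single weight, which is exactly the train/test sharing asked about.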

What caching model does TensorFlow use?

I read the question TensorFlow - get current value of a Variable, and the answer has left me confused.
On one hand, dga says: "And to be very clear: Running the variable will produce only the current value of the variable; it will not run any assign operations associated with it. It's cheap."
On the other hand, Salvador Dali says: "@dga yes, if the variable depends on n other variables, they also need to be evaluated."
So, which is it? Does evaluating the variable only return its current value, or does it recompute its value from scratch from the variables it depends on?
What happens if I evaluate the same variable twice in a row? Does TensorFlow have any notion of "stale" variables, i.e. variables that need to be recomputed because their dependencies changed (like in a build system)?
I ask because I work with multiple nets where the partial output of one net becomes the partial input of another net. I want to fetch the gradients computed at the input layer of one net and merge and apply them to the output layer of another net. I was hoping to do this by manually retrieving and storing gradients in the variables of a graph, and then running graph operations to backpropagate the gradients, so I need to understand how it all works under the hood.
What I do is similar to this question: How to use Tensorflow Optimizer without recomputing activations in reinforcement learning program that returns control after each iteration? But I can't conclude whether it's possible based on the last answer (is the experimental support in now?).
Thanks!
@dga is correct. If you pass a tf.Variable object to tf.Session.run(), TensorFlow will return the current value of the variable, and it will not perform any computation. It is cheap (the cost of a memory copy, or possibly a network transfer in the case of a distributed TensorFlow setup). TensorFlow does not retain any history* about how the value of a tf.Variable was updated, so it cannot in general recompute its value from scratch.
(* Technically TensorFlow remembers the tf.Tensor that was used to initialize each variable, so it is possible to recompute the initial value of the variable.)
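This behavior is easy to see in a tiny sketch: an assign op only runs when explicitly fetched, and fetching the variable itself just reads the stored value.

```python
import tensorflow.compat.v1 as tf  # TF1-style graph mode
tf.disable_eager_execution()

v = tf.Variable(0.0)
increment = tf.assign_add(v, 1.0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(increment)   # the assign runs only when explicitly fetched
    first = sess.run(v)   # a cheap read of the stored value: 1.0
    second = sess.run(v)  # reading again does NOT rerun the assign
```

If reading the variable re-ran its assign ops, `second` would be 2.0; it stays 1.0, confirming there is no build-system-style "staleness" tracking.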

TensorFlow: restore only variables

I do some training in TensorFlow and save the whole session using a saver:
# ... define model
# add a saver
saver = tf.train.Saver()
# ... run a session
# ....
# save the model
save_path = saver.save(sess,fileSaver)
It works fine, and I can successfully restore the whole session by using the exact same model and calling:
saver.restore(sess, importSaverPath)
Now I want to modify only the optimizer while keeping the rest of the model constant (the computation graph stays the same apart from the optimizer):
# optimizer used before
# optimizer = tf.train.AdamOptimizer(
#     learning_rate=learningRate).minimize(costPrediction)
# the new optimizer I want to use
optimizer = tf.train.RMSPropOptimizer(
    learning_rate=learningRate, decay=0.9, momentum=0.1,
    epsilon=1e-5).minimize(costPrediction)
I also want to continue the training from the last graph state I saved (i.e., I want to restore the state of my variables and continue with another training algorithm). Of course I cannot use:
saver.restore
any longer, because the graph has changed.
So my question is: is there a way to restore only variables using the saver.restore command (or even, maybe for later use, only a subset of variables) when the whole session has been saved? I looked for such a feature in the API documentation and online, but could not find any example or detailed enough explanation that could help me get it to work.
It is possible to restore a subset of variables by passing the list of variables as the var_list argument to the Saver constructor. However, when you change the optimizer, additional variables may have been created (momentum accumulators, for instance), and the variables associated with the previous optimizer, if any, have been removed from the model. So simply using the old Saver object to restore will not work, especially if you had constructed it with the default constructor, which uses tf.all_variables() as the var_list. You have to construct the Saver object on the subset of variables that you created in your model, and then restore will work. Note that this leaves the new variables created by the new optimizer uninitialized, so you have to initialize them explicitly.
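A runnable sketch of this answer, shrunk to one hypothetical model variable `w` (the checkpoint path and loss are illustrative):

```python
import os
import tempfile
import tensorflow.compat.v1 as tf  # TF1-style graph mode
tf.disable_eager_execution()

ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Phase 1: train and save (here just one hypothetical model variable).
w = tf.Variable(3.0, name="w")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, ckpt_path)

# Phase 2: rebuild the same model, but with a new optimizer.
tf.reset_default_graph()
w = tf.Variable(0.0, name="w")
loss = tf.square(w)
train_op = tf.train.RMSPropOptimizer(learning_rate=0.1).minimize(loss)

model_vars = [w]  # only the variables that existed in the old graph
saver = tf.train.Saver(var_list=model_vars)

with tf.Session() as sess:
    saver.restore(sess, ckpt_path)
    # RMSProp's slot variables were never saved; initialize them now.
    new_vars = [v for v in tf.global_variables() if v not in model_vars]
    sess.run(tf.variables_initializer(new_vars))
    restored_val = sess.run(w)  # the trained value survives the swap
```

The Saver built on `model_vars` ignores the RMSProp slot variables entirely, so restore succeeds, and the explicit variables_initializer handles the new optimizer state.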
I see the same problem. Inspired by keveman's answer, my solution is:
1. Define your new graph (here only the new optimizer-related variables differ from the old graph).
2. Get all variables using tf.global_variables(). This returns a variable list I call g_vars.
3. Get all optimizer-related variables using tf.contrib.framework.get_variables_by_suffix('some variable filter'). The filter may be RMSProp or RMSProp_*. This function returns a variable list I call exclude_vars.
4. Get the variables that are in g_vars but not in exclude_vars:
vars = [item for item in g_vars if item not in exclude_vars]
These are the variables common to both the new and the old graph, which you can now restore from the old model.
You could also recover the original Saver from a MetaGraph protobuf first and then use that saver to restore all the old variables safely. For a concrete example, take a look at the eval.py script in TensorFlow: How do I release a model without source code?