I created a model within a variable/name scope and would like to store its graph definition and variables to disk to later load it without defining the graph again. How can I save and load all operations and variables within a given variable/name scope?
Conveniently, I would like to just use tf.train.Saver. Its save() has an option to store the meta graph, but restore() does not seem to import it. Moreover, in the Saver constructor I can specify a list of variables, but not a name scope to control which operations are saved.
There is also tf.train.write_graph(), but I couldn't find an explanation of what it does and how it relates to the Saver class and the meta graph.
Looking at third-party layer implementations, such as tensorflow_addons, I see that each layer is registered as a custom object.
For example, you can see the use of the register_custom_keras_object wrapper here.
This wrapper uses the function tf.keras.utils.get_custom_objects() to do the registering.
My question is: why should this be done for custom layers? What is the benefit of registering a layer as a custom object?
Doing this allows you to refer to your custom object by a string. You see this with default Keras objects all the time. For example:
# You can either compile a model with the Adam optimizer like this
model.compile(optimizer='adam', ...)
# or like this
adam = keras.optimizers.Adam()
model.compile(optimizer=adam, ...)
Taken from the definition of custom_object_scope:
Code within a with statement will be able to access custom objects by name. Changes to global custom objects persist within the enclosing with statement. At end of the with statement, global custom objects are reverted to state at beginning of the with statement.
Example: Consider a custom object MyObject
with custom_object_scope({'MyObject': MyObject}):
    layer = Dense(..., kernel_regularizer='MyObject')
    # save, load, etc. will recognize custom object by name
Defined as
def custom_object_scope(*args)
Arguments:
*args: Variable length list of dictionaries of name, class pairs to add to custom objects.
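For a concrete illustration of the benefit, here is a minimal sketch (the my_activation function and the file path are hypothetical, invented for this example): once the name is registered globally, both layer construction and model loading can resolve the object from the string alone, with no custom_objects argument needed at load time.
import tensorflow as tf

# Hypothetical custom activation, used only for illustration.
def my_activation(x):
    return tf.nn.relu(x) - 0.1

# Register it under a string name, much like the tensorflow_addons wrapper does:
tf.keras.utils.get_custom_objects()['my_activation'] = my_activation

# The string now works anywhere an activation identifier is accepted...
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation='my_activation', input_shape=(8,))
])
model.save('/tmp/model_with_custom_activation.h5')

# ...including during deserialization, without passing custom_objects=.
restored = tf.keras.models.load_model('/tmp/model_with_custom_activation.h5')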
I am trying to read a big TensorFlow project. In a project where the nodes of the computation graph are scattered around the codebase, is there a way to store a Tensor node of the computation graph and later add that node to the fetch list in sess.run?
For example, if I want to add probs at line 615 of
https://github.com/allenai/document-qa/blob/master/docqa/nn/span_prediction.py to a global namespace, is there a method like tf.add_node(probs, "probs"), such that I could later call tf.get_node("probs"), just for the sake of conveniently passing nodes around the project?
A more general question: what would be a better way to structure TensorFlow code so as to improve the efficiency of experimenting with different models?
Of course you can. To retrieve a tensor later, you have to give it a name. Take probs in your code as an example. It's created with the tf.nn.softmax() function, whose API is shown below.
tf.nn.softmax(
    logits,
    axis=None,
    name=None,
    dim=None
)
See the name parameter? You can add it to line 615 like this:
probs = tf.nn.softmax(all_logits, name='my_tensor')
Later when you need it, you can call tf.Graph.get_tensor_by_name(name) to retrieve this tensor.
graph = tf.get_default_graph()
retrieved_probs = graph.get_tensor_by_name('my_tensor:0')
'my_tensor' is the name of the softmax operation, and ':0' must be appended to it, meaning that you're retrieving the tensor rather than the operation. When calling Graph.get_operation_by_name(), no ':0' should be added.
You'll have to make sure that the tensor exists (it might be created by code executed before this line, or it might be restored from a meta graph file). If it's created within a variable scope, you'll also have to prepend the scope name and a '/' to the name parameter, e.g. 'my_scope/my_tensor:0'.
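To make the scoping rule concrete, here is a minimal sketch (assuming TF 1.x graph mode; 'my_scope' and 'my_tensor' are illustrative names):
import tensorflow as tf

with tf.variable_scope('my_scope'):
    logits = tf.constant([[1.0, 2.0]])
    probs = tf.nn.softmax(logits, name='my_tensor')

graph = tf.get_default_graph()
# The enclosing scope becomes part of the name, hence the 'my_scope/' prefix.
retrieved_probs = graph.get_tensor_by_name('my_scope/my_tensor:0')

with tf.Session() as sess:
    print(sess.run(retrieved_probs))  # fetched purely by name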
tfe.Checkpoint seems to require that the objects being checkpointed implement CheckpointableBase, which EagerVariableStore doesn't.
What, then, is the right way to use EagerVariableStore to "eagerify" the functional parts of TensorFlow while keeping the ability to checkpoint?
Providing some working code would be appreciated.
For eagerifying functional code, I'd suggest tf.make_template rather than EagerVariableStore directly. When executing eagerly, this will create a variable store automatically (allowing variable reuse with tf.get_variable), and the object tf.make_template returns is checkpointable.
import tensorflow as tf
tf.enable_eager_execution()

def uses_functional_layers(x):
    # A functional layer; its variables are created via tf.get_variable.
    return tf.layers.dense(inputs=x, units=1)

save_template = tf.make_template("save_template", uses_functional_layers)
save_checkpoint = tf.train.Checkpoint(model=save_template)
save_template(tf.ones([1, 1]))  # the first call creates the variables
save_template.variables[0].assign([42.])  # set a variable to a known value
save_output = save_template(tf.ones([1, 1]))
save_path = save_checkpoint.save('/tmp/tf_template_ckpt')
So we make a function which wraps our functional layers / tf.get_variable usage, then make a template object out of that with tf.make_template, and finally can checkpoint that template object after it has been called once to create its variables.
An advantage of doing it this way is that we get restore-on-create for variables in the template, meaning the template is evaluated with the restored values the first time it is called:
import numpy
# Create a second template to load the checkpoint into
restore_template = tf.make_template("save_template", uses_functional_layers)
tf.train.Checkpoint(model=restore_template).restore(save_path)
numpy.testing.assert_allclose(
    save_output,
    restore_template(tf.ones([1, 1])))  # Variables are restored on creation
numpy.testing.assert_equal([42.], restore_template.variables[0].numpy())
Nested templates work too. Note that the template object strips its own variable_scope from the variables created within it, but otherwise uses the full variable names, which may be more fragile than usual object-based checkpointing.
Looking up variables repeatedly with tf.get_variable (done each time the template is evaluated) is also quite slow, which is one reason TensorFlow is moving toward object-oriented Keras-style layers instead of functional layers.
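For comparison, here is a minimal sketch of the object-oriented style (assuming a TF version where Keras layers are checkpointable and eager execution is enabled as above; the checkpoint path is illustrative). A layer object owns its variables as attributes, so it can be checkpointed directly without name-based lookup:
layer = tf.keras.layers.Dense(units=1)
checkpoint = tf.train.Checkpoint(model=layer)
layer(tf.ones([1, 1]))  # variables are created on the first call
path = checkpoint.save('/tmp/tf_oo_ckpt')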
I have found a "hackish" way, but it works!
The main problem is:
tfe.EagerVariableStore doesn't inherit from CheckpointableBase, hence it can't be saved with tfe.Checkpoint
The big idea is:
We are going to create a CheckpointableBase object that "points" to every variable stored in the tfe.EagerVariableStore
How to know what is stored in EagerVariableStore?
Reference: https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/variable_scope.py
It shows that EagerVariableStore uses a _VariableStore, held in its _store attribute, to store all the variables.
The _VariableStore keeps the variables in self._vars as a dictionary.
So if we have container = tfe.EagerVariableStore(), we can get all the variables via the container._store._vars dictionary.
How to create a CheckpointableBase that points to every variable then?
We will use tfe.Checkpointable since it has __setattr__.
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)
How to combine the two?
As we have a tfe.Checkpoint for saving, all we need to do is this:
saver = tfe.Checkpoint(checkpointable=checkpointable)
saver.save(...)
And saver.restore(...) to restore.
Your tfe.EagerVariableStore need not be changed; after the checkpointable is restored via tfe.Checkpoint, it will "replace" the values in the tfe.EagerVariableStore automagically!
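Putting the pieces together, here is a minimal end-to-end sketch (assuming TF 1.8-era tf.contrib.eager; the dense layer and the checkpoint path are illustrative):
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

container = tfe.EagerVariableStore()
with container.as_default():
    # Variables created via tf.get_variable land in the container.
    y = tf.layers.dense(tf.ones([1, 1]), units=1)

# Mirror every stored variable onto a Checkpointable object.
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)

saver = tfe.Checkpoint(checkpointable=checkpointable)
save_path = saver.save('/tmp/eager_store_ckpt')
saver.restore(save_path)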
In TensorFlow there is a class GraphKeys. I came across many codebases where it is used, but its purpose is not explained very well, either in the TensorFlow documentation or in the code that uses it.
Can someone please explain the usage of tf.GraphKeys?
Thank you!
As far as I know, tf.GraphKeys is a collection of the standard names (keys) used for the graph's collections of variables and ops. The usage (just as with keys into common Python dictionaries) is to retrieve those variables and ops.
That said, here are some members of tf.GraphKeys I came across:
GLOBAL_VARIABLES and LOCAL_VARIABLES contain all variables of the graph, which need to be initialized before training. tf.global_variables() returns the global variables in a list and can be used with tf.variables_initializer for initialization.
Variables created with option trainable=True will be added to TRAINABLE_VARIABLES and will be fetched and updated by any optimizer under tf.train during training.
SUMMARIES contains keys for all summaries added by tf.summary (scalar, image, histogram, text, etc). tf.summary.merge_all gathers all such keys and returns an op to be run and written to file so that you can visualize them on tensorboard.
Custom ops that update some variables can be added to UPDATE_OPS and run separately at each iteration using sess.run(tf.get_collection(tf.GraphKeys.UPDATE_OPS)). In this case, those variables are set trainable=False to avoid being updated by gradient descent.
You may create your own collections using tf.add_to_collection(some_name, var_or_op) and retrieve the variable or op later with tf.get_collection(), optionally restricting it by scope; see the sketch below.
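Here is a minimal sketch of built-in and custom collections in use (assuming TF 1.x graph mode; the variable and collection names are illustrative):
import tensorflow as tf

v = tf.Variable(0.0, name='v')                   # added to GLOBAL_VARIABLES and TRAINABLE_VARIABLES
w = tf.Variable(1.0, name='w', trainable=False)  # GLOBAL_VARIABLES only

print([x.name for x in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)])
print([x.name for x in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)])

# A custom collection:
tf.add_to_collection('my_counters', w)
print(tf.get_collection('my_counters'))  # [<tf.Variable 'w:0' ...>]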
This looks like it should work, but it just prints an empty list after importing the pretrained Inception network: https://gist.github.com/tachim/6d44136171be86430dba16fecafa5872.
That gist prints the contents of the VARIABLES collection, which does not get restored by import_graph_def. It does, however, get restored when you import a meta graph.
Alternatively, you could walk the Graph and list the names of all ops of type Variable:
[op.name for op in tf.get_default_graph().get_operations() if op.op_def and op.op_def.name == 'Variable']
This gives you the name of the variable op in the Graph, rather than the Python object wrapping it, so you'd be limited to the low-level Graph-based API. That is, you can fetch the variable's value by passing this name to sess.run, but there's no convenient way to get at its initializer or assign op.
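For example, here is a minimal sketch of fetching a variable's value through its Graph-level name (assuming TF 1.x; 'my_var' is an illustrative name):
import tensorflow as tf

v = tf.Variable(3.0, name='my_var')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # sess.run accepts a tensor name, so no Python handle is needed:
    print(sess.run('my_var:0'))  # 3.0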