Find all variables that a TensorFlow op depends upon

Is there a way to find all variables that a given operation (usually a loss) depends upon?
I would like to use this to then pass this collection into optimizer.minimize() or tf.gradients() using various set().intersection() combinations.
So far I have found op.op.inputs and tried a simple BFS on that, but I never encounter the Variable objects returned by tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES) or slim.get_variables().
There does seem to be a correspondence between the Tensor.op._id and Variables.op._id fields, but I'm not sure that's something I should rely upon.
Or maybe I shouldn't want to do this in the first place?
I could of course construct my disjoint sets of variables meticulously while building my graph, but then it would be easy to miss something if I change the model.

The documentation for tf.Variable.op is not particularly clear, but it does refer to the crucial tf.Operation used in the implementation of a tf.Variable: any op that depends on a tf.Variable will be on a path from that operation. Since tf.Operation objects are hashable, you can use them as keys of a dict that maps each variable's tf.Operation to the corresponding tf.Variable object, and then perform the BFS as before:
import collections
import tensorflow as tf

# Map each variable's underlying `tf.Operation` to the variable itself.
op_to_var = {var.op: var for var in tf.trainable_variables()}

starting_op = ...
dependent_vars = []
queue = collections.deque()
queue.append(starting_op)
visited = set([starting_op])
while queue:
    op = queue.popleft()
    try:
        dependent_vars.append(op_to_var[op])
    except KeyError:
        # `op` is not a variable, so search its inputs (if any).
        for op_input in op.inputs:
            if op_input.op not in visited:
                queue.append(op_input.op)
                visited.add(op_input.op)
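For illustration, here is a minimal end-to-end sketch that ties this back to the original goal of passing the result to optimizer.minimize(); the toy variables x and y and the loss are invented for the example:

import collections
import tensorflow as tf

# Toy graph: `loss` depends on `x` but not on `y`.
x = tf.get_variable("x", shape=[], initializer=tf.zeros_initializer())
y = tf.get_variable("y", shape=[], initializer=tf.zeros_initializer())
loss = tf.square(x - 1.0)

op_to_var = {var.op: var for var in tf.trainable_variables()}
queue = collections.deque([loss.op])
visited = set([loss.op])
dependent_vars = []
while queue:
    op = queue.popleft()
    if op in op_to_var:
        dependent_vars.append(op_to_var[op])
    else:
        for op_input in op.inputs:
            if op_input.op not in visited:
                queue.append(op_input.op)
                visited.add(op_input.op)

# dependent_vars should be [x]; minimize with respect to those only.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
    loss, var_list=dependent_vars)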

Related

Kernel's hyper-parameters; initialization and setting bounds

I think many other people like me might be interested in how they can use GPflow for their particular problems. The key question is how customizable GPflow is, and a good example would be very helpful.
In my case, I read and tried many suggestions from the raised issues without any real success. Setting kernel model parameters is not straightforward (creating them with default values and then changing them via the delete-object method), and the transform method is vague.
It would be really helpful if you could add an example showing how one can initialize and set bounds on an anisotropic kernel (lengthscale values and bounds, variances, ...), and especially how to add observation error (as an array-like alpha parameter).
If you just want to set a value, then you can do
import gpflow
import numpy as np

model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1, lengthscales=0.2))
Alternatively
model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1))
model.kern.lengthscales = 0.2
If you want to change the transform, you either need to subclass the kernel, or you can do the following:
with gpflow.defer_build():
    model = gpflow.models.GPR(np.zeros((1, 1)),
                              np.zeros((1, 1)),
                              gpflow.kernels.RBF(1))
    transform = gpflow.transforms.Logistic(0.1, 1.)
    model.kern.lengthscales = gpflow.params.Parameter(0.3, transform=transform)
model.compile()
You need defer_build() to stop the graph from being compiled before you've changed the transform. With the approach above, compilation of the TensorFlow graph is delayed until the explicit model.compile(), so the graph is built with the intended bounding transform.
Using an array parameter for the likelihood variance is outside the scope of GPflow. For what it's worth (and because it has been asked about before), that particular model is especially problematic, as it is not clear how such a per-point noise term would be defined at test points.
Setting kernel parameters can be done using the .assign() function, or through direct assignment. See the notebook https://github.com/GPflow/GPflow/blob/develop/doc/source/notebooks/understanding/tf_graphs_and_sessions.ipynb. You do not need to delete a parameter to assign a new value to it.
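For example, assuming the model built above (a quick sketch, GPflow 1.x style):

model.kern.lengthscales.assign(0.5)   # via the .assign() method
model.kern.lengthscales = 0.5         # or via direct assignment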
If you want to have per-datapoint noise, you will need to implement your own custom likelihood, which you can do by taking the Gaussian likelihood in likelihoods.py as an example.
If by "bounds" you mean limiting the optimisation range for a parameter, you can use the Logistic transform. If you want to pass in a custom transformation for a parameter, you can pass a constructed Parameter object into constructors with a custom transform. Alternatively you can assign a newly created Parameter with a new transform to the model.
Here is more information on how to access and change GPflow parameters: see the viewing, getting and setting parameters documentation.
An extra bit for #user1018464's answer about replacing the transform on an existing parameter: changing the transform is a bit tricky, because you can't change it once the model has been compiled in TensorFlow.
E.g.
likelihood = gpflow.likelihoods.Gaussian()
likelihood.variance.transform = gpflow.transforms.Logistic(1., 10.)
----
GPflowError: Parameter "Gaussian/variance" has already been compiled.
Instead, you have to reset the GPflow object:
likelihood = gpflow.likelihoods.Gaussian() # All tensors compiled
likelihood.clear()
likelihood.variance.transform = gpflow.transforms.Logistic(2, 5)
likelihood.variance = 2.5
likelihood.compile()

How do I store and retrieve Tensors in a global namespace in Tensorflow?

I am trying to read a big TensorFlow project. For a project where the nodes of the computation graph are scattered around the code, is there a way to store a Tensor node of the computation graph and later add that node to the fetch list in sess.run()?
For example, if I want to add probs at line 615 of
https://github.com/allenai/document-qa/blob/master/docqa/nn/span_prediction.py to a global namespace, is there a method like tf.add_node(probs, "probs") so that I could later call tf.get_node("probs"), just for the sake of conveniently passing nodes around the project?
A more general question: what is a good way to structure TensorFlow code so as to improve the efficiency of experimenting with different models?
Of course you can. To retrieve a tensor later, you have to give it a name so that you can look it up by name. Take probs in your code as an example: it's created with the tf.nn.softmax() function, whose API is shown below.
tf.nn.softmax(
    logits,
    axis=None,
    name=None,
    dim=None
)
See the parameter name? You can add this parameter to line 615 like this:
probs = tf.nn.softmax(all_logits, name='my_tensor')
Later when you need it, you can call tf.Graph.get_tensor_by_name(name) to retrieve this tensor.
graph = tf.get_default_graph()
retrieved_probs = graph.get_tensor_by_name('my_tensor:0')
'my_tensor' is the name of the softmax operation, and ':0' must be appended to it, meaning that you're retrieving the tensor rather than the operation. When calling tf.Graph.get_operation_by_name(), no ':0' should be added.
You'll have to make sure that the tensor exists (it might be created in the code executed before this line, or it might be restored from a meta graph file). If it was created in a variable scope, you'll also have to prepend the scope name and a '/' to the name param, e.g. 'my_scope/my_tensor:0'.
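To illustrate the scoping rule, here is a small self-contained sketch (the scope and tensor names are made up for the example):

import tensorflow as tf

with tf.variable_scope('my_scope'):
    logits = tf.constant([[1.0, 2.0]])
    probs = tf.nn.softmax(logits, name='my_tensor')

graph = tf.get_default_graph()
# Tensor lookup: scope prefix plus ':0' suffix.
retrieved_probs = graph.get_tensor_by_name('my_scope/my_tensor:0')
# Operation lookup: same name, but without ':0'.
softmax_op = graph.get_operation_by_name('my_scope/my_tensor')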

Integrating tfe.EagerVariableStore with tfe.Checkpoint?

tfe.Checkpoint seems to require that the objects being checkpointed implement CheckpointableBase, which EagerVariableStore doesn't.
What, then, is the right way to use EagerVariableStore to "eagerify" the functional parts of TensorFlow while keeping the ability to checkpoint?
Providing some working code would be appreciated.
For eagerifying functional code, I'd suggest tf.make_template rather than EagerVariableStore directly. When executing eagerly, this will create a variable store automatically (allowing variable reuse with tf.get_variable), and the object tf.make_template returns is checkpointable.
import tensorflow as tf
tf.enable_eager_execution()

def uses_functional_layers(x):
    return tf.layers.dense(inputs=x, units=1)

save_template = tf.make_template("save_template", uses_functional_layers)
save_checkpoint = tf.train.Checkpoint(model=save_template)
save_template(tf.ones([1, 1]))  # the first call creates the variables
save_template.variables[0].assign([42.])
save_output = save_template(tf.ones([1, 1]))
save_path = save_checkpoint.save('/tmp/tf_template_ckpt')
So we make a function which wraps our functional layers / tf.get_variable usage, then make a template object out of that with tf.make_template, and finally can checkpoint that template object after it has been called once to create its variables.
An advantage of doing it this way is that we get restore-on-create for variables in the template, meaning the template is evaluated with the restored values the first time it is called:
import numpy

# Create a second template to load the checkpoint into
restore_template = tf.make_template("save_template", uses_functional_layers)
tf.train.Checkpoint(model=restore_template).restore(save_path)
numpy.testing.assert_allclose(
    save_output,
    restore_template(tf.ones([1, 1])))  # Variables are restored on creation
numpy.testing.assert_equal([42.], restore_template.variables[0].numpy())
Nested templates work too. Note that the template object strips its own variable_scope from variables created within it, but otherwise uses the full variable names, which may be more fragile than usual object-based checkpointing.
Looking up variables repeatedly with tf.get_variable (done each time the template is evaluated) is also quite slow, which is one reason TensorFlow is moving toward object-oriented Keras-style layers instead of functional layers.
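For comparison (this is not part of the original answer, just a sketch of the object-oriented alternative it mentions), checkpointing a Keras-style layer directly looks like this:

import tensorflow as tf
tf.enable_eager_execution()

layer = tf.keras.layers.Dense(units=1)
layer(tf.ones([1, 1]))  # the first call creates the layer's variables
checkpoint = tf.train.Checkpoint(model=layer)
save_path = checkpoint.save('/tmp/keras_layer_ckpt')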
I have found a "hackish" way, but it works!
The main problem is:
tfe.EagerVariableStore doesn't inherit from CheckpointableBase, so it can't be saved with tfe.Checkpoint.
The big idea is:
We are going to create a CheckpointableBase object that "points" to every variable stored in the tfe.EagerVariableStore
How do we know what is stored in an EagerVariableStore?
Reference: https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/variable_scope.py
It shows that EagerVariableStore keeps all the variables in a _VariableStore, held in its _store attribute.
Now, the _VariableStore stores the variables in self._vars as a dictionary.
If we have a container = tfe.EagerVariableStore(), we can get all the variables via container._store._vars as a dictionary.
How do we create a CheckpointableBase object that points to every variable?
We will use tfe.Checkpointable, since its __setattr__ tracks each attribute we set as a checkpointable dependency.
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)
How to combine the two?
As we have a tfe.Checkpoint for saving, all we need to do is this:
saver = tfe.Checkpoint(checkpointable=checkpointable)
saver.save(...)
And saver.restore(...) to restore.
Your tfe.EagerVariableStore need not be changed; once checkpointable is restored via tfe.Checkpoint, the values in the tfe.EagerVariableStore are "replaced" automagically!
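Putting the pieces together, a minimal end-to-end sketch of this approach (TF 1.8-era APIs; the layer, values, and paths are illustrative):

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

container = tfe.EagerVariableStore()
with container.as_default():
    tf.layers.dense(tf.ones([1, 1]), units=1)  # variables land in `container`

# Point a Checkpointable object at every variable in the store.
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)

saver = tfe.Checkpoint(checkpointable=checkpointable)
save_path = saver.save('/tmp/eager_store_ckpt')
# ... later, after rebuilding `container` and `checkpointable` the same way:
saver.restore(save_path)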

Usage of tf.GraphKeys

In TensorFlow there's a class GraphKeys. I came across a lot of code where it's used, but its usage is not explained very well, either in the TensorFlow documentation or in the code where it appears.
Can someone please explain the usage of tf.GraphKeys?
Thank you!
As far as I know, tf.GraphKeys is a collection of standard names (keys) for the collections of variables and ops in the graph. Much as with an ordinary Python dictionary, the usage is to retrieve those variables and ops by key.
That said, here are some of the tf.GraphKeys collections I have come across:
GLOBAL_VARIABLES and LOCAL_VARIABLES contain all variables of the graph, which need to be initialized before training. tf.global_variables() returns the global variables in a list, which can be passed to tf.variables_initializer for initialization.
Variables created with the option trainable=True will be added to TRAINABLE_VARIABLES and will be fetched and updated by any optimizer under tf.train during training.
SUMMARIES contains all summaries added via tf.summary (scalar, image, histogram, text, etc.). tf.summary.merge_all gathers all of them and returns a single op to be run and written to file, so that you can visualize them on TensorBoard.
Custom ops that update some variables can be added to UPDATE_OPS and run separately at each iteration using sess.run(tf.get_collection(tf.GraphKeys.UPDATE_OPS)). In this case, those variables are created with trainable=False to avoid being updated by gradient descent.
You may also create your own collections using tf.add_to_collection(some_name, var_or_op) and retrieve the variable or op later via tf.get_collection(), optionally restricting the scope.
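Putting a few of these together in a short sketch (the variable and collection names are invented for the example):

import tensorflow as tf

v = tf.Variable(0.0)  # lands in GLOBAL_VARIABLES and TRAINABLE_VARIABLES
print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))  # -> [v]

# Initialize everything registered under GLOBAL_VARIABLES.
init_op = tf.variables_initializer(
    tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))

# Custom collections work the same way.
tf.add_to_collection('my_update_ops', v.assign_add(1.0))
my_update_ops = tf.get_collection('my_update_ops')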

tensorflow add new op: could attr accept scalar tensor?

I can't find detailed info about this in the official docs.
Could anyone give more details?
TensorFlow uses attrs as "compile-time constants" that determine the behavior and type (number of inputs and outputs) of an op.
You can define an op that has a TensorProto as one of its attrs. For example the tf.constant() op takes its value as an attr, which is defined here in the corresponding op registration.
There are a few limitations to this feature:
It is not currently possible to constrain the shape of the tensor statically. You would need to validate this in the constructor for the op (where GetAttr is typically called).
Similarly, it is not currently possible to constrain the element type of the tensor statically, so you will need to check this at runtime as well.
In the Python wrapper for your op, you will need to pass the attr's value as a TensorProto, e.g. by calling tf.contrib.util.make_tensor_proto() to do the conversion.
In general, you may find it much easier to use a simple int, float, bool, or string attr instead of a scalar TensorProto, but the TensorProto option is available if you need to encode a less common type.
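To illustrate the Python-side conversion mentioned above (my_module.my_op is a hypothetical generated wrapper for a custom op):

import tensorflow as tf

# Convert a scalar Python value into a TensorProto suitable for a `tensor` attr.
value_proto = tf.contrib.util.make_tensor_proto(42.0, dtype=tf.float32)

# The generated wrapper for the custom op would then be called as:
# output = my_module.my_op(value=value_proto)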