What is the purpose of Keras custom objects when creating custom layers? - tensorflow

Looking at third-party layer implementations, such as tensorflow_addons, I see that each layer is registered as a custom object.
For example, you can see the use of the wrapper register_custom_keras_object call here.
This wrapper uses the function tf.keras.utils.get_custom_objects() to do the registering.
My question is: why should this be done for custom layers? What is the benefit of registering a layer as a custom object?

Doing this allows you to refer to your custom object via a string. You see this with Keras default objects all the time. For example:
# You can either compile a model with the Adam optimizer like this
model.compile(optimizer='adam', ...)
# or like this
adam = keras.optimizers.Adam()
model.compile(optimizer=adam, ...)
Taken from the definition of custom_object_scope:
Code within a with statement will be able to access custom objects by name. Changes to global custom objects persist within the enclosing with statement. At end of the with statement, global custom objects are reverted to state at beginning of the with statement.
Example: consider a custom object MyObject:
with custom_object_scope({'MyObject': MyObject}):
    layer = Dense(..., kernel_regularizer='MyObject')
    # save, load, etc. will recognize custom object by name
Defined as
def custom_object_scope(*args)
Arguments:
*args: Variable length list of dictionaries of name, class pairs to add to custom objects.
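To make the benefit concrete, here is a minimal sketch of registering a custom layer globally with tf.keras.utils.get_custom_objects(); the layer class MyScaleLayer is a hypothetical example, not something from tensorflow_addons:
import tensorflow as tf

class MyScaleLayer(tf.keras.layers.Layer):
    """Hypothetical layer, used only for illustration."""
    def __init__(self, scale=2.0, **kwargs):
        super(MyScaleLayer, self).__init__(**kwargs)
        self.scale = scale

    def call(self, inputs):
        return inputs * self.scale

    def get_config(self):
        config = super(MyScaleLayer, self).get_config()
        config.update({'scale': self.scale})
        return config

# Register globally, which is what the tensorflow_addons wrapper
# effectively does under the hood.
tf.keras.utils.get_custom_objects()['MyScaleLayer'] = MyScaleLayer
Once registered, tf.keras.models.load_model() can resolve 'MyScaleLayer' by name without passing custom_objects at every call site.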


Kernel's hyper-parameters; initialization and setting bounds

I think many other people like me might be interested in how they can use GPflow for their particular problems. The key is how customizable GPflow is, and a good example would be very helpful.
In my case, I read and tried lots of comments in raised issues without any real success. Setting kernel model parameters is not straightforward (creating them with default values and then changing them via the delete-object method). The transform method is vague.
It would be really helpful if you could add an example showing how one can initialize and set the bounds of an anisotropic kernel (length-scale values and bounds, variances, ...), and especially how to add observation error (as an array-like alpha parameter).
If you just want to set a value, then you can do:
import numpy as np
import gpflow

model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1, lengthscales=0.2))
Alternatively:
model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1))
model.kern.lengthscales = 0.2
If you want to change the transform, you either need to subclass the kernel, or you can do:
with gpflow.defer_build():
    model = gpflow.models.GPR(np.zeros((1, 1)),
                              np.zeros((1, 1)),
                              gpflow.kernels.RBF(1))
    transform = gpflow.transforms.Logistic(0.1, 1.0)
    model.kern.lengthscales = gpflow.params.Parameter(0.3, transform=transform)
model.compile()
You need defer_build to stop the graph being compiled before you've changed the transform. With the approach above, compilation of the TensorFlow graph is delayed until the explicit model.compile(), so the graph is built with the intended bounding transform.
Using an array parameter for the likelihood variance is outside the scope of GPflow. For what it's worth (and because it has been asked about before), that particular model is especially problematic, as it is not clear how test points are defined.
Setting kernel parameters can be done using the .assign() function, or through direct assignment. See the notebook https://github.com/GPflow/GPflow/blob/develop/doc/source/notebooks/understanding/tf_graphs_and_sessions.ipynb. You do not need to delete a parameter to assign a new value to it.
If you want to have per-datapoint noise, you will need to implement your own custom likelihood, which you can do by taking Gaussian likelihood in likelihoods.py as an example.
If by "bounds" you mean limiting the optimisation range for a parameter, you can use the Logistic transform. If you want to pass in a custom transformation for a parameter, you can pass a constructed Parameter object into constructors with a custom transform. Alternatively you can assign a newly created Parameter with a new transform to the model.
Here is more information on how to access and change GPflow parameters: viewing, getting and settings parameters documentation.
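To make the two assignment styles concrete, here is a minimal sketch under GPflow 1.x (the zero placeholder data is just an assumption for illustration):
import numpy as np
import gpflow

model = gpflow.models.GPR(np.zeros((1, 1)), np.zeros((1, 1)),
                          gpflow.kernels.RBF(1))

# Direct assignment of a new value...
model.kern.lengthscales = 0.5
# ...or the equivalent .assign() call; neither requires deleting
# the parameter first.
model.kern.lengthscales.assign(0.5)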
An extra bit for #user1018464's answer about replacing the transform on an existing parameter: changing the transformation is a bit tricky; you can't change the transformation once the model has been compiled in TensorFlow.
E.g.
likelihood = gpflow.likelihoods.Gaussian()
likelihood.variance.transform = gpflow.transforms.Logistic(1., 10.)
----
GPflowError: Parameter "Gaussian/variance" has already been compiled.
Instead you have to reset GPflow object:
likelihood = gpflow.likelihoods.Gaussian() # All tensors compiled
likelihood.clear()
likelihood.variance.transform = gpflow.transforms.Logistic(2, 5)
likelihood.variance = 2.5
likelihood.compile()

How do I store and retrieve Tensors in a global namespace in Tensorflow?

I am trying to read a big TensorFlow project. For a project in which the nodes of the computation graph are scattered around the codebase, I wonder if there is a way to store a Tensor node of the computation graph and later add that node to the fetch list in sess.run?
For example, if I want to add probs at line 615 of
https://github.com/allenai/document-qa/blob/master/docqa/nn/span_prediction.py to a global namespace, is there a method like tf.add_node(probs, "probs"), so that later I could call tf.get_node("probs"), just for the sake of conveniently passing nodes around the project?
A more general question would be: what is a better way to structure TensorFlow code and improve the efficiency of experimenting with different models?
Of course you can. To retrieve it later, you'll have to give it a name so that you can retrieve it by name. Take probs in your code as an example. It's created with the tf.nn.softmax() function, whose API is shown below.
tf.nn.softmax(
    logits,
    axis=None,
    name=None,
    dim=None
)
See the parameter name? You can add this parameter to line 615 like this:
probs = tf.nn.softmax(all_logits, name='my_tensor')
Later when you need it, you can call tf.Graph.get_tensor_by_name(name) to retrieve this tensor.
graph = tf.get_default_graph()
retrieved_probs = graph.get_tensor_by_name('my_tensor:0')
'my_tensor' is the name of the softmax operation, and ':0' should be added to the end of it meaning that you're retrieving the tensor instead of the operation. When calling Graph.get_operation_by_name(), no ':0' should be added.
You'll have to make sure that the tensor exists (it might be created in code executed before this line, or it might be restored from a meta graph file). If it's created in a variable scope, you'll also have to prepend the scope name and a '/' to the name param, for example 'my_scope/my_tensor:0'.
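Putting the pieces together, here is a minimal self-contained sketch (the scope and tensor names are arbitrary choices):
import tensorflow as tf

with tf.variable_scope('my_scope'):
    logits = tf.constant([[1.0, 2.0, 3.0]])
    probs = tf.nn.softmax(logits, name='my_tensor')

graph = tf.get_default_graph()
# Both the scope prefix and the ':0' output index are required.
retrieved_probs = graph.get_tensor_by_name('my_scope/my_tensor:0')

with tf.Session() as sess:
    print(sess.run(retrieved_probs))  # same values as sess.run(probs)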

Integrating tfe.EagerVariableStore with tfe.Checkpoint?

tfe.Checkpoint seems to require the things being checkpointed to implement CheckpointableBase, which EagerVariableStore doesn't.
What is the right way then to use EagerVariableStore to "eagerify" the functional parts of Tensorflow with ability to checkpoint?
Providing some working code would be appreciated.
For eagerifying functional code, I'd suggest tf.make_template rather than EagerVariableStore directly. When executing eagerly, this will create a variable store automatically (allowing variable reuse with tf.get_variable), and the object tf.make_template returns is checkpointable.
import tensorflow as tf
tf.enable_eager_execution()

def uses_functional_layers(x):
    return tf.layers.dense(inputs=x, units=1)

save_template = tf.make_template("save_template", uses_functional_layers)
save_checkpoint = tf.train.Checkpoint(model=save_template)
save_template(tf.ones([1, 1]))  # the first call creates the variables
save_template.variables[0].assign([42.])
save_output = save_template(tf.ones([1, 1]))
save_path = save_checkpoint.save('/tmp/tf_template_ckpt')
So we make a function which wraps our functional layers / tf.get_variable usage, then make a template object out of that with tf.make_template, and finally can checkpoint that template object after it has been called once to create its variables.
An advantage of doing it this way is that we get restore-on-create for variables in the template, meaning the template is evaluated with the restored values the first time it is called:
import numpy
# Create a second template to load the checkpoint into
restore_template = tf.make_template("save_template", uses_functional_layers)
tf.train.Checkpoint(model=restore_template).restore(save_path)
numpy.testing.assert_allclose(
    save_output,
    restore_template(tf.ones([1, 1])))  # Variables are restored on creation
numpy.testing.assert_equal([42.], restore_template.variables[0].numpy())
Nested templates work too. Note that the template object strips its own variable_scope from variables created within it, but otherwise uses the full variable names, which may be more fragile than the usual object-based checkpointing.
Looking up variables repeatedly with tf.get_variable (done each time the template is evaluated) is also quite slow, which is one reason TensorFlow is moving toward object-oriented Keras-style layers instead of functional layers.
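For comparison, here is a minimal sketch of that object-oriented style, in which the layer object owns its variables and is directly checkpointable (the checkpoint path is an arbitrary assumption):
import tensorflow as tf
tf.enable_eager_execution()

# The layer object holds its variables itself, so no per-call name
# lookup through a variable store is needed.
layer = tf.keras.layers.Dense(units=1)
checkpoint = tf.train.Checkpoint(model=layer)
layer(tf.ones([1, 1]))  # variables are created on the first call
save_path = checkpoint.save('/tmp/oo_layer_ckpt')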
I have found a "hackish" way, but it works!
The main problem is:
tfe.EagerVariableStore doesn't inherit from CheckpointableBase, hence it can't be saved with tfe.Checkpoint.
The big idea is:
We are going to create a CheckpointableBase object that "points" to every variable stored in the tfe.EagerVariableStore
How do we know what is stored in an EagerVariableStore?
Reference: https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/variable_scope.py
It shows that EagerVariableStore uses a _VariableStore to store all the variables, via its _store attribute.
Now, the _VariableStore keeps the variables in self._vars as a dictionary.
If we have container = tfe.EagerVariableStore(), we can get all the variables via container._store._vars as a dictionary.
How do we then create a checkpointable object that points to every variable?
We will use tfe.Checkpointable since it has __setattr__:
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)
How to combine the two?
As we have a tfe.Checkpoint for saving, all we need to do is this:
saver = tfe.Checkpoint(checkpointable=checkpointable)
saver.save(...)
And saver.restore(...) to restore.
Your tfe.EagerVariableStore need not be changed; after the checkpointable is restored via tfe.Checkpoint, it will "replace" the values in the tfe.EagerVariableStore automagically!
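Putting the whole hack together, here is a minimal end-to-end sketch (assuming TF 1.8-era contrib APIs; the dense layer and checkpoint path are placeholders):
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

container = tfe.EagerVariableStore()
with container.as_default():
    # Variables created by functional layers are captured by `container`.
    y = tf.layers.dense(tf.ones([1, 1]), units=1)

# Mirror every stored variable onto a checkpointable object.
checkpointable = tfe.Checkpointable()
for k, v in container._store._vars.items():
    setattr(checkpointable, k, v)

saver = tfe.Checkpoint(checkpointable=checkpointable)
save_path = saver.save('/tmp/eager_store_ckpt')
saver.restore(save_path)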

How to make two conv nets from a single class and do weight sharing [Siamese net]

I am trying to implement a siamese network, similar to the standard two-tower architecture (reference image omitted).
In this, I have made a class SiameseNet which implements one CNN's output. What I am trying to do is create two instances of this class to make two different neural nets, with the compulsory condition that they both have the same weights. This is what I have tried so far, but I haven't reached a working solution due to some misconceptions regarding how I should vary the scopes and still manage weight sharing, or whatever else I am missing here.
class SiameseNet():
    def __init__(self, X):
        self.input_layer = X

    def model(self):
        with tf.variable_scope('layer1', reuse=True):
            layer1 = tf.layers.conv2d(inputs=self.input_layer, filters=8,
                                      kernel_size=[1, 1], padding='same',
                                      activation=tf.nn.relu)
            batch_layer1 = tf.layers.batch_normalization(inputs=layer1, axis=-1)
            dropout_layer1 = tf.layers.dropout(inputs=batch_layer1, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN
        with tf.variable_scope('layer2', reuse=True):
            layer2 = tf.layers.conv2d(inputs=dropout_layer1, filters=8,
                                      kernel_size=[4, 4], padding='same',
                                      activation=tf.nn.relu)
            batch_layer2 = tf.layers.batch_normalization(inputs=layer2, axis=-1)
            dropout_layer2 = tf.layers.dropout(inputs=batch_layer2, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN
        with tf.variable_scope('layer3', reuse=True):
            layer3 = tf.layers.conv2d(inputs=dropout_layer2, filters=16,
                                      kernel_size=[4, 4], padding='same',
                                      activation=tf.nn.relu)
            batch_layer3 = tf.layers.batch_normalization(inputs=layer3, axis=-1)
            dropout_layer3 = tf.layers.dropout(inputs=batch_layer3, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN
        with tf.variable_scope('logits', reuse=True):
            flatten_layer3 = tf.layers.flatten(dropout_layer3)
            dense_layer4 = tf.layers.dense(inputs=flatten_layer3, units=1000,
                                           activation=tf.nn.relu)
            logits = tf.layers.dense(inputs=dense_layer4, units=500)
        return logits
Here is how I was intending to use it, to make two convnets with shared weights, both receiving different images as input:
with tf.Session() as sess:
    net1 = SiameseNet(x1).model()  # x1 = Image1
    net2 = SiameseNet(x3).model()  # x2 = Image2
    loss = Loss(2)
    optimiser = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
    for i in range(4):
        l = sess.run(loss.contrastive_loss(1, net1, net2))
        print(l)
This gives me the following error:
ValueError: Variable layer1/conv2d/kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
What I am seeking is any clarity on where I am going wrong in terms of the correct usage of TensorFlow to make two neural nets from a single class SiameseNet, and on how to use variable scopes here so that both nets share the same weights. Also, if I don't use a variable scope, would that mean there will be some duplication of variables?
Thanks for your time.
When you set the reuse argument of tf.variable_scope to True, TensorFlow expects the variables (with the names that you provide) to already exist within the scope, and that is not the case when you define your first network. Instead, you could set reuse=tf.AUTO_REUSE, and the variables will be created if they don't already exist.
If you don't use a variable scope, two different networks with non-shared weights will be created. If you want to avoid using a variable scope directly, there is also the option of setting reuse=tf.AUTO_REUSE in tf.layers.dense and tf.layers.conv2d.
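To make that concrete, here is a minimal sketch of a shared-weight tower built with reuse=tf.AUTO_REUSE (the layer sizes and placeholder shapes are arbitrary):
import tensorflow as tf

def tower(x):
    # The same scope name plus AUTO_REUSE means the second call
    # reuses the variables created by the first call.
    with tf.variable_scope('siamese', reuse=tf.AUTO_REUSE):
        h = tf.layers.conv2d(x, filters=8, kernel_size=[4, 4],
                             padding='same', activation=tf.nn.relu)
        h = tf.layers.flatten(h)
        return tf.layers.dense(h, units=500)

x1 = tf.placeholder(tf.float32, [None, 28, 28, 1])
x2 = tf.placeholder(tf.float32, [None, 28, 28, 1])
net1 = tower(x1)
net2 = tower(x2)

# Only one set of weights exists, shared by both towers.
print(len(tf.trainable_variables(scope='siamese')))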

Example to save and load a sub graph in TensorFlow?

I created a model within a variable/name scope and would like to store its graph definition and variables to disk to later load it without defining the graph again. How can I save and load all operations and variables within a given variable/name scope?
Conveniently, I would like to just use tf.train.Saver. Its save() has an option to store the meta graph, but restore() does not seem to import it. Moreover, in the saver constructor I can specify a list of variables, but not a name scope to control which operations are saved.
There is also tf.train.write_graph(), but I couldn't find an explanation of what it does and how it relates to the Saver class and the meta graph.