Custom optimiser implementation in Keras - tensorflow

I am trying to use a custom optimiser to train a NN in Keras. The original algorithm was developed to train CNNs and is based on maintaining a different adaptive learning rate for every weight of the network. The algorithm is called WAME (weight-wise adaptive learning rates with moving average estimator).
The optimiser has been developed by a former University of London student and can be found in this GitHub repo (lines 54 to 153).
As you can see in the code, it is built as a subclass of the optimizer_v2 superclass available in TensorFlow. This class is called WAMEprop.
What I am trying to do is simply:
1. pasting the code that defines the class into my Google Colab notebook;
2. using the new optimizer as follows:
#building the model
wame_model = models.Sequential()
wame_model.add(layers.Dense(32, activation='relu', input_shape=(11,)))
wame_model.add(layers.Dense(32, activation='relu'))
wame_model.add(layers.Dense(1))
#compiling the model using WAMEprop as optimizer
wame_model.compile(optimizer=WAMEprop(name='wame'), loss='mse', metrics=['mae'])
#fitting the model
history = wame_model.fit(train_features, train_targets,
                         validation_data=(test_features, test_targets),
                         epochs=50, batch_size=1, verbose=1)
Now, I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-55-a13c54fc61f2> in <module>()
5 wame_model.add(layers.Dense(32, activation='relu'))
6 wame_model.add(layers.Dense(1))
----> 7 wame_model.compile(optimizer=WAMEprop(name='wame'), loss='mse', metrics=['mae'])
8
9 history = wame_model.fit(white_train_features_df_s.to_numpy(), white_train_targets.to_numpy(),
5 frames
/usr/local/lib/python3.7/dist-packages/keras/optimizers.py in get(identifier)
131 else:
132 raise ValueError(
--> 133 'Could not interpret optimizer identifier: {}'.format(identifier))
ValueError: Could not interpret optimizer identifier: <__main__.WAMEprop object at 0x7fd4bc1d01d0>
Since I am new to Keras and, unfortunately, don't know TensorFlow, I am not sure what exactly it is unable to interpret.
Did I get some import wrong?
Did I use the keras compile() method in the wrong way?
Also, if I don't pass the parameter name='wame' to the WAMEprop() call, I get an error message saying that the positional argument 'name' is required. Strangely enough, there is no parameter 'name' in the class constructor. Does this depend on the interaction with the superclass?
Thank you very much in advance for any help you could offer!
Cheers!
UPDATE:
the error message refers to a method get() that takes an identifier as input, in the optimizers.py file that must have been installed with TensorFlow. This function expects to get a string (I guess for the readily available optimizers), a configuration dictionary, an optimizer_v2.OptimizerV2 object, or a tf.compat.v1.train.Optimizer object.
I think the object I am passing as an optimizer is none of these.
If I run:
my_optimizer = WAMEprop(name='wame')
print(type(my_optimizer))
I get <class '__main__.WAMEprop'>.
So, I suspect the object I am dealing with is something different from what Keras is expecting.
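As a further check (a sketch, assuming WAMEprop subclasses the optimizer_v2 superclass as in the repo), the inheritance can be verified explicitly:
from tensorflow.python.keras.optimizer_v2 import optimizer_v2

# If this prints False, the get() function in optimizers.py will reject
# the object; if it prints True, the check itself is failing elsewhere.
print(isinstance(WAMEprop(name='wame'), optimizer_v2.OptimizerV2))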
UPDATE 2: the same code runs on my laptop, where I have TensorFlow installed within an Anaconda environment. Now I am convinced there is some installation or import problem in Google Colab.

Problem solved. I needed to uninstall tensorflow 2.6.0 from Google Colab and install tensorflow 2.0.0 instead.
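For reference, a minimal sketch of that downgrade in a Colab cell (the runtime has to be restarted afterwards for the new version to be picked up):
# Run in a Colab cell, then restart the runtime:
!pip uninstall -y tensorflow
!pip install tensorflow==2.0.0

import tensorflow as tf
print(tf.__version__)  # should now print 2.0.0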

Related

How to fix this ValueError: 'decay is deprecated in the new Keras optimizer'?

I'm new to deep learning and I am following a tutorial about face detection.
model = canaro.models.createSimpsonsModel(IMG_SIZE=IMG_SIZE, channels=channels, output_dim=len(characters),
                                          loss='binary_crossentropy', decay=1e-7, learning_rate=0.001,
                                          momentum=0.9, nesterov=True)
ValueError Traceback (most recent call last)
WARNING:absl:lr is deprecated, please use learning_rate instead, or use the legacy optimizer, e.g., tf.keras.optimizers.legacy.SGD.
ValueError: decay is deprecated in the new Keras optimizer, please check the docstring for valid arguments, or use the legacy optimizer, e.g., tf.keras.optimizers.legacy.SGD.
I have already tried to follow some steps, but I don't know how to fix it.
It looks like you are trying to use a quite old deep learning library which is clearly outdated and no longer maintained. The error you see is raised because the API used by that specific library was written for an older version of TensorFlow (<= 2.3), which is now deprecated. If you want to fix it, you have to either manually downgrade your TensorFlow or modify the source code of the canaro library.
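The error message itself also points at a workaround if you build the model yourself: the legacy optimizer namespace still accepts decay. A minimal sketch (assuming TF >= 2.11, where tf.keras.optimizers.legacy exists; the hyperparameters mirror the createSimpsonsModel call above):
import tensorflow as tf

# The legacy SGD still accepts the deprecated `decay` argument.
optimizer = tf.keras.optimizers.legacy.SGD(
    learning_rate=0.001, decay=1e-7, momentum=0.9, nesterov=True
)
# model.compile(loss='binary_crossentropy', optimizer=optimizer,
#               metrics=['accuracy'])
Note that canaro constructs its optimizer internally, so this only helps if you bypass createSimpsonsModel and build the model with plain Keras; otherwise the downgrade-or-patch advice above stands.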

"ValueError: Your Layer or Model is in an invalid state." after upgrading to tensorflow federated 0.17.0 from 0.16.1

I am running into an error after upgrading to TFF 0.17.0. The same code works perfectly in TFF 0.16.1. Training works just fine in both versions; however, when I try to copy weights from the FL state to the model to evaluate it on the test dataset, I get the following error:
File "fl/main_fl.py", line 166, in keras_evaluate
loss, accuracy = self.model.evaluate(test_dataset, verbose=0)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_v1.py", line 905, in evaluate
self._assert_built_as_v1()
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 852, i$ _assert_built_as_v1
(type(self),))
ValueError: Your Layer or Model is in an invalid state. This can happen for the following cases:
1. You might be interleaving estimator/non-estimator models or interleaving models/layers made in tf.compat.v1.Graph.as_default() with model$/layers created outside of it. Converting a model to an estimator (via model_to_estimator) invalidates all models/layers made before the conv$rsion (even if they were not the model converted to an estimator). Similarly, making a layer or a model inside a a tf.compat.v1.Graph invalid$tes all layers/models you previously made outside of the graph.
2. You might be using a custom keras layer implementation with custom __init__ which didn't call super().__init__. Please check the impleme$tation of <class 'tensorflow.python.keras.engine.functional.Functional'> and its bases.
Below is my keras_evaluate method:
def keras_evaluate(self, test_dataset, mode='test', step=0):
    self.state.model.assign_weights_to(self.model)
    loss, accuracy = self.model.evaluate(test_dataset, verbose=0)
    print('Mode={}, Loss={}, Accuracy={}'.format(mode, loss, accuracy))
self.state is the state returned by tff.learning.build_federated_averaging_process (i.e. a tff.templates.IterativeProcess), test_dataset is of type tf.data.Dataset, and self.model is of type tf.keras.Model, i.e. a Keras functional model. I have one custom layer; however, it does call super().__init__(), so point 2 in the error is misleading me.
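For reference, my layer follows the usual pattern that point 2 asks about (a minimal sketch of that pattern with a placeholder name, not my actual layer):
import tensorflow as tf

class MyCustomLayer(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        # Forwarding **kwargs to the base class is exactly what point 2
        # of the error message checks for.
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.kernel = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer='glorot_uniform',
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)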
Any help will be appreciated.

Tf 2.0 MirroredStrategy on Albert TF Hub model (multi gpu)

I'm trying to run the TensorFlow Hub version of ALBERT on multiple GPUs on the same machine. The model works perfectly on a single GPU.
This is the structure of my code:
strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))  # it prints 2 .. correct

if __name__ == "__main__":
    with strategy.scope():
        run()
In the run() function, I read the data, build the model, and fit it.
I'm getting this error:
Traceback (most recent call last):
File "Albert.py", line 130, in <module>
run()
File "Albert.py", line 88, in run
model = build_model(bert_max_seq_length)
File "Albert.py", line 55, in build_model
model.compile(loss="categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
File "/home/****/py_transformers/lib/python3.5/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/bighanem/py_transformers/lib/python3.5/site-packages/tensorflow_core/python/keras/engine/training.py", line 471, in compile
' model.compile(...)'% (v, strategy))
ValueError: Variable (<tf.Variable 'bert/embeddings/word_embeddings:0' shape=(30000, 128) dtype=float32>) was not created in the distribution strategy scope of (<tensorflow.python.distribute.mirrored_strategy.MirroredStrategy object at 0x7f62e399df60>). It is most likely due to not all layers or the model or optimizer being created outside the distribution strategy scope. Try to make sure your code looks similar to the following.
with strategy.scope():
    model = _create_model()
    model.compile(...)
Is it possible that this error occurs because the Albert model was prepared (built and compiled) beforehand by the TensorFlow team?
Edited:
To be precise, the TensorFlow version is 2.1.
Also, this is the way I load Albert pretrained model:
features = {"input_ids": in_id, "input_mask": in_mask, "segment_ids": in_segment, }
albert = hub.KerasLayer(
"https://tfhub.dev/google/albert_xxlarge/3",
trainable=False, signature="tokens", output_key="pooled_output",
)
x = albert(features)
Following this tutorial: SavedModels from TF Hub in TensorFlow 2
Two-part answer:
1) TF Hub hosts two versions of ALBERT (each in several sizes):
https://tfhub.dev/google/albert_base/3 etc. from the Google research team that originally developed ALBERT comes in the hub.Module format for TF1. This will likely not work with a TF2 distribution strategy.
https://tfhub.dev/tensorflow/albert_en_base/1 etc. from the TensorFlow Model Garden comes in the revised TF2 SavedModel format. Please try this one for use in TF2 with a distribution strategy.
2) That said, the immediate problem appears to be what is explained in the error message (abridged):
Variable 'bert/embeddings/word_embeddings' was not created in the distribution strategy scope ... Try to make sure your code looks similar to the following.
with strategy.scope():
    model = _create_model()
    model.compile(...)
For a SavedModel (from TF Hub or otherwise), it's the loading that needs to happen under the distribution strategy scope, because that is what re-creates the tf.Variable objects in the current program. Specifically, any of the following ways to load a TF2 SavedModel from TF Hub have to occur under the distribution strategy scope for distribution to work (see the sketch after the list):
tf.saved_model.load();
hub.load(), which just calls tf.saved_model.load() (after downloading if necessary);
hub.KerasLayer when used with a string-valued model handle, on which it then calls hub.load().
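A minimal sketch of that fix (using the TF2 SavedModel from part 1; building and compiling the rest of the model would move under the same scope):
import tensorflow as tf
import tensorflow_hub as hub

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Loading under the scope makes the re-created variables strategy-aware.
    albert = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/albert_en_base/1",
        trainable=False,
    )
    # ... build and compile the full model here as well ...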

Keras + Tensorflow model convert to coreml exits NameError: global name ... is not defined

I've adapted the VAE example from the keras site to train on my data, and everything runs fine. But I'm unable to convert to coreml. The error is:
NameError: global name 'batch_size' is not defined
Since batch_size clearly is defined in the python source, I'm guessing it has to do with how the conversion tool captures variable names. Does anyone know how I can fix it (or whether it is, indeed, possible to fix)?
Many thanks,
J.
I ran into a similar message when using parameters to construct the neural net. This should work:
from keras import models

# Make the name visible to the loader via custom_objects so the
# deserialiser can resolve it:
batch_size = 50
model = models.load_model(filename, custom_objects={'batch_size': batch_size})
See also documentation: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
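For context, a sketch of the full round trip this enables (assuming the old Keras converter in coremltools 3.x; 'vae.h5' is a placeholder filename):
import coremltools
from keras import models

batch_size = 50
model = models.load_model('vae.h5', custom_objects={'batch_size': batch_size})

# Convert the restored model; the exact arguments depend on your inputs/outputs.
mlmodel = coremltools.converters.keras.convert(model)
mlmodel.save('vae.mlmodel')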

Cannot run Tensorflow code multiple times in Jupyter Notebook

I'm struggling to run TensorFlow (v1.1) code multiple times in a Jupyter Notebook.
For example, I execute this simple code snippet that creates an encoding layer for a seq2seq model:
# Construct encoder layer (LSTM)
encoder_cell = tf.contrib.rnn.LSTMCell(encoder_hidden_units)
encoder_outputs, encoder_final_state = tf.nn.dynamic_rnn(
    encoder_cell, encoder_inputs_embedded,
    dtype=tf.float32, time_major=False
)
The first time it runs totally fine and my encoder is created.
However, if I rerun it (no matter the changes I've applied), I get this error:
Attempt to have a second RNNCell use the weights of a variable scope that already has weights
It's very annoying as it forces me to restart the kernel every time I want to change a layer.
Can someone explain why this happens and how I can fix it?
Thanks!
You are trying to build the exact same graph twice, so TensorFlow complains because the variables already exist in the default graph.
What you could do is call tf.reset_default_graph() before trying to call the method a second time, to ensure you create a new graph when required.
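A minimal sketch of that in a notebook cell (TF 1.x):
import tensorflow as tf

# Clear the default graph so the RNN variables can be re-created:
tf.reset_default_graph()

# ... then re-run the encoder construction from the question ...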
Just in case, I would also suggest using an interactive session as described here in the Start TensorFlow InteractiveSession section:
import tensorflow as tf
sess = tf.InteractiveSession()