Keras model compilation settings (view/change) - tensorflow

Suppose I have the following Keras model:
model = Sequential()
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=10, activation='softmax'))
model.compile(
    loss=CategoricalCrossentropy(label_smoothing=0.01),
    optimizer=RMSprop(learning_rate=0.001, momentum=0.0),
    metrics=[Accuracy()]
)
I have two questions:
How can I view compilation settings (like the learning_rate)?
How can I change compilation settings (like the learning_rate)?
Remarks:
I noticed I can view layer settings using model.summary() or model.get_config() but that does not show compilation settings.
I know I can change the learning_rate by running the compile statement again with a different learning_rate, but I would like a "cleaner"/more readable way to do this. Something like: model['compilation']['optimizer']['learning_rate'] = xxx. (Many sklearn models can be adjusted this way.)

Use the optimizer's .lr attribute:
rate = model.optimizer.lr
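In TF 2.x the learning rate is stored as a variable on the optimizer, so it can be read and assigned in place without recompiling. A short sketch, assuming the compiled model from the question and import tensorflow as tf (the attribute is spelled lr or learning_rate depending on the version):
# view every optimizer setting chosen at compile time
print(model.optimizer.get_config())        # includes the learning rate

# view just the learning rate
print(float(model.optimizer.lr.numpy()))   # e.g. 0.001

# change it in place, no recompile needed
model.optimizer.lr.assign(0.0005)
# or equivalently
tf.keras.backend.set_value(model.optimizer.lr, 0.0005)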


How to save a model with DenseVariational layer?

I'm trying to build a model with DenseVariational layer so that it can report epistemic uncertainties. Something like https://www.tensorflow.org/probability/examples/Probabilistic_Layers_Regression#figure_3_epistemic_uncertainty
The model training works just fine and now I would like to save the model and load it in a production environment. However, when I tried model.save('path/model.h5'), I got
Layer DenseVariational has arguments in `__init__` and therefore must override `get_config`.
Then I added
class CustomVariational(tfp.layers.DenseVariational):
    def get_config(self):
        config = super().get_config().copy()
        config.update({
            'units': self.units,
            'make_posterior_fn': self._make_posterior_fn,
            'make_prior_fn': self._make_prior_fn
        })
        return config
but it failed with a new error
Unable to create link (name already exists)
Is the DenseVariational layer for research only?
I think we can circumvent this problem by using the save_weights method.
When you wrap the prior and posterior functions with tf.name_scope(...), the issue should be resolved; otherwise both end up creating tensors with the same name (a sketch follows below).
We're also fixing the example tutorial colab, should be online soon, thanks.
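For reference, a rough sketch of that tf.name_scope suggestion, based on the posterior/prior helpers from the linked TFP regression tutorial (the scope names 'posterior' and 'prior' are just illustrative):
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

def posterior_mean_field(kernel_size, bias_size=0, dtype=None):
    n = kernel_size + bias_size
    c = np.log(np.expm1(1.))
    with tf.name_scope('posterior'):  # distinct scope -> distinct variable names
        return tf.keras.Sequential([
            tfp.layers.VariableLayer(2 * n, dtype=dtype),
            tfp.layers.DistributionLambda(lambda t: tfd.Independent(
                tfd.Normal(loc=t[..., :n],
                           scale=1e-5 + tf.nn.softplus(c + t[..., n:])),
                reinterpreted_batch_ndims=1)),
        ])

def prior_trainable(kernel_size, bias_size=0, dtype=None):
    n = kernel_size + bias_size
    with tf.name_scope('prior'):      # distinct scope -> distinct variable names
        return tf.keras.Sequential([
            tfp.layers.VariableLayer(n, dtype=dtype),
            tfp.layers.DistributionLambda(lambda t: tfd.Independent(
                tfd.Normal(loc=t, scale=1),
                reinterpreted_batch_ndims=1)),
        ])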
Update:
Instead of fixing it at the applications, we fixed it in the library instead: https://github.com/tensorflow/probability/commit/0ca065fb526b50ce38b68f7d5b803f02c78c8f16. Once it is updated, the duplicate tensor names should be resolved. Thanks.
It's been almost 2 years, and the problem is still going on.
A workaround is to store only the weights:
tf.keras.Model.save_weights(filepath, overwrite=True)
Then, you can use the same model structure and load the weights.
For example:
# model
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(input_shape,), name="input"),
    tfp.layers.DenseVariational(32, posterior_mean_field, prior_trainable),  # trainable
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1)
])
# save weights after compiling and training your model
model.save_weights('model_weights.h5')
Initialize a new model with the same structure:
# different model, same weights
model2 = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(input_shape,), name="input"),
    tfp.layers.DenseVariational(32, posterior_mean_field, prior_trainable),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1)
])
# load weights
model2.load_weights('model_weights.h5')
I hope this helps!

Tune a pre-existing model with Keras Tuner

I am looking at Keras Tuner as a way of doing hyperparameter optimization, but all of the examples I have seen show an entirely fresh model being defined. For example, from the Keras Tuner Hello World:
def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
I already have a model that I would like to tune, but does that mean I have to rewrite it with the hyperparameters spliced into the body, as above, or can I simply pass the hyperparameters into the model at the top? For example, like this:
def build_model(hp):
    model = MyExistingModel(
        batch_size=hp['batch_size'],
        seq_len=hp['seq_len'],
        rnn_hidden_units=hp['hidden_units'],
        rnn_type='gru',
        num_rnn_layers=hp['num_rnn_layers']
    )
    optimizer = optimizer_factory['adam'](
        learning_rate=hp['learning_rate'],
        momentum=0.9,
    )
    model.compile(
        optimizer=optimizer,
        loss='sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy'],
    )
    return model
The above seems to work, as far as I can see. The model initialization args are all passed to the internal TF layers through a HyperParameters instance and accessed from there, although I'm not really sure how to pass that instance in. I think it can be done by predefining a HyperParameters object and passing it to the tuner, so that it then gets passed to build_model:
hp = HyperParameters()
hp.Choice('learning_rate', [1e-1, 1e-3])

tuner = RandomSearch(
    build_model,
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=False,
    objective='val_accuracy')
Internally my model has two RNNs (LSTM or GRU) and an MLP. But I have yet to come across a Keras Tuner build_model that takes an existing model like this and simply passes in the hyperparameters. The model is quite complex, and I would like to avoid having to redefine it (as well as avoiding code duplication).
Indeed this is possible, as this GitHub issue makes clear...
However, rather than passing the hp object through the hyperparameters arg to the Tuner, I instead override the Tuner's run_trial method in the manner suggested here.
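A rough sketch of what that override can look like (MyExistingModel stands in for the asker's own model class; the reporting calls oracle.update_trial and save_model match older keras-tuner releases, and newer versions let run_trial simply return the objective value instead):
import tensorflow as tf
import kerastuner as kt  # newer releases: import keras_tuner as kt

class ExistingModelTuner(kt.Tuner):
    # Build the pre-existing model inside run_trial instead of in a build_model function.
    def run_trial(self, trial, x, y, validation_data, epochs=5):
        hp = trial.hyperparameters
        model = MyExistingModel(                       # hypothetical: your already-written model
            rnn_hidden_units=hp.Int('hidden_units', 32, 256, step=32),
            num_rnn_layers=hp.Int('num_rnn_layers', 1, 3),
            rnn_type='gru',
        )
        model.compile(
            optimizer=tf.keras.optimizers.Adam(hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['sparse_categorical_accuracy'],
        )
        history = model.fit(x, y, validation_data=validation_data, epochs=epochs)
        # report the objective for this trial and save the trained model
        self.oracle.update_trial(
            trial.trial_id,
            {'val_sparse_categorical_accuracy': history.history['val_sparse_categorical_accuracy'][-1]},
        )
        self.save_model(trial.trial_id, model)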

Different results when using Manual KFold-Cross validation vs. KerasClassifier-KFold Cross Validation

I've been struggling to understand why two similar KFold cross-validations result in two different averages.
When I use a manual KFold approach (with TensorFlow and Keras)
cvscores = []
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=3)
for train, test in kfold.split(X, y):
    model = create_baseline()
    model.fit(X[train], y[train], epochs=50, batch_size=32, verbose=0)
    scores = model.evaluate(X[test], y[test], verbose=0)
    # print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))
I get
65.89% (+/- 3.77%)
When I use the KerasClassifier wrapper from scikit-learn
estimator = KerasClassifier(build_fn=create_baseline, epochs=50, batch_size=32, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=3)
results = cross_val_score(estimator, X, y, cv=kfold, scoring='accuracy')
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
I get
63.82% (5.37%)
Additionally, when using KerasClassifier the following warning appears
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/wrappers/scikit_learn.py:241: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:
* `np.argmax(model.predict(x), axis=-1)`, if your model does multi-class classification (e.g. if it uses a `softmax` last-layer activation).
* `(model.predict(x) > 0.5).astype("int32")`, if your model does binary classification (e.g. if it uses a `sigmoid` last-layer activation).
Do the results differ because KerasClassifier uses predict_classes() while the manual Tensorflow/Keras approach uses just predict()? If so, which approach is more reasonable?
My model looks like this
def create_baseline():
    model = tf.keras.models.Sequential()
    model.add(Dense(8, activation='relu', input_shape=(12,)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
The two CV results do not look too different; they are both within each other's standard deviation.
You fixed the seed for the StratifiedKFold class, which is good. However, there is additional randomness you should take control of, and it comes from the weight initialization. Make sure you initialize your model for each CV run with different weights, but use the same 10 initializations for both cross-validations, manual and automatic. You can pass an initializer to each layer; they have a seed argument as well. In general you should fix all possible seeds (np.random.seed(3), tf.set_random_seed(3)).
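A minimal sketch of seeded initialization for the create_baseline above (GlorotUniform is just one seedable initializer; pass a different seed per fold, but the same sequence of seeds to both CV setups):
def create_baseline(seed=3):
    # same architecture as above, but with deterministic weight initialization
    init = tf.keras.initializers.GlorotUniform(seed=seed)
    model = tf.keras.models.Sequential()
    model.add(Dense(8, activation='relu', input_shape=(12,), kernel_initializer=init))
    model.add(Dense(1, activation='sigmoid', kernel_initializer=init))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model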
What happens if you run cross_val_score() or your manual version twice? Do you get the same results / numbers?

TensorFlow 2.0: load/save model without optimizer

I trained my model and saved it like this:
network.save(os.path.join(args.logdir, "cifar_model.h5"), include_optimizer=False)
Now I would like to load it and continue training like this, but that doesn't work:
model = tf.keras.models.load_model("...\\cifar_model.h5", compile="False")
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, decay=1e-6),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy(name="accuracy")],
)

model.tb_callback = tf.keras.callbacks.TensorBoard(args.logdir, update_freq=1000, profile_batch=1)
model.tb_callback.on_train_end = lambda *_: None

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)
datagen.fit(cifar.train.data["images"])

model.fit_generator(
    # cifar.train.data["images"], cifar.train.data["labels"],
    datagen.flow(cifar.train.data["images"], cifar.train.data["labels"], batch_size=args.batch_size),
    # batch_size=args.batch_size,
    steps_per_epoch=200,
    epochs=args.epochs,
    validation_data=(cifar.dev.data["images"], cifar.dev.data["labels"]),
    callbacks=[model.tb_callback],
)
It throws an error:
AttributeError: 'Network' object has no attribute 'compile'
This should work according to https://www.tensorflow.org/alpha/guide/keras/saving_and_serializing
Note that I'm saving without the optimizer so I can avoid a bug with loading the optimizer.
UPDATE:
I found out how to do this when I know the exact structure of the layers (which I do). I can then recreate the model and use the weights from the loaded model like this:
load = tf.keras.models.load_model("...\\cifar_model.h5", compile="False")
model.set_weights(load.get_weights())
But I couldn't apply the same approach to load.layers; I think it's possible if you don't have sequential layers.
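For reference, a sketch of that workaround end to end (build_cifar_model is a hypothetical function that recreates the original architecture in code; note also that load_model's compile argument is a boolean, so compile=False rather than the string "False"):
import tensorflow as tf

# load architecture + weights only (the file was saved with include_optimizer=False)
loaded = tf.keras.models.load_model("cifar_model.h5", compile=False)

# rebuild the same architecture, copy the weights across, then recompile
model = build_cifar_model()                    # hypothetical: returns the same layer structure
model.set_weights(loaded.get_weights())
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, decay=1e-6),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy(name="accuracy")],
)
# training can then continue as usual with model.fit(...)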

Eager mode optimizers

Only TF native optimizers are supported in Eager mode
I'm getting this error with every optimiser I have tried in the following:
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    opt = tf.train.GradientDescentOptimizer
    model.compile(optimizer=opt,
                  loss=tf.keras.losses.sparse_categorical_crossentropy,
                  metrics=['accuracy'])
    return model
So my question is, what is a 'TF native optimizer' please?
Thanks.
Short answer: Change from opt = tf.train.GradientDescentOptimizer to opt = tf.train.GradientDescentOptimizer(<your desired learning rate>).
Longer answer: In the snippet provided above, you're passing a class (tf.train.GradientDescentOptimizer) instead of an object to model.compile. The error message is thus complaining that the type of the opt argument is incorrect.
Hope that helps.
(A recent commit will hopefully result in a better error message in a future release)
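For concreteness, here is the snippet above with the optimizer instantiated (the 0.01 learning rate is just an example value):
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    # pass an optimizer instance, not the class itself
    opt = tf.train.GradientDescentOptimizer(0.01)
    model.compile(optimizer=opt,
                  loss=tf.keras.losses.sparse_categorical_crossentropy,
                  metrics=['accuracy'])
    return model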
In addition to ash's answer, another possible cause of the "Only TF native optimizers are supported in Eager mode" error is using a tf.keras optimizer rather than a tf.train optimizer.
For example:
# Gives error
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Also gives error
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.RMSprop(), metrics=['accuracy'])
# Correct
model.compile(loss='binary_crossentropy', optimizer=tf.train.RMSPropOptimizer(learning_rate=0.001), metrics=['accuracy'])