Only TF native optimizers are supported in Eager mode
I'm getting this error with every optimizer I have tried in the following code:
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    opt = tf.train.GradientDescentOptimizer
    model.compile(optimizer=opt,
                  loss=tf.keras.losses.sparse_categorical_crossentropy,
                  metrics=['accuracy'])
    return model
So my question is, what is a 'TF native optimizer' please?
Thanks.
Short answer: Change from opt = tf.train.GradientDescentOptimizer to opt = tf.train.GradientDescentOptimizer(<your desired learning rate>).
Longer answer: In the snippet provided above, you're passing a class (tf.train.GradientDescentOptimizer) instead of an object to model.compile. The error message is thus complaining that the type of the opt argument is incorrect.
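For example (the learning rate value here is arbitrary):
opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)  # an instance, not the class
model.compile(optimizer=opt,
              loss=tf.keras.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])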
Hope that helps.
(A recent commit will hopefully result in a better error message in a future release)
In addition to ash's answer, another possible cause of the "Only TF native optimizers are supported in Eager mode" error is using a tf.keras optimizer rather than a tf.train optimizer.
For example:
# Gives error
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Also gives error
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.RMSprop(), metrics=['accuracy'])
# Correct
model.compile(loss='binary_crossentropy', optimizer=tf.train.RMSPropOptimizer(learning_rate=0.001), metrics=['accuracy'])
I'm trying to visualize the model in TensorBoard without training it.
I checked this and that, but it still doesn't work even for the simplest model.
import tensorflow as tf
import tensorflow.keras as keras

# Both tf.__version__ and tensorboard.__version__ are 2.5.0
s_model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

logdir = '.../logs'
_callbacks = keras.callbacks.TensorBoard(log_dir=logdir)
_callbacks.set_model(s_model)  # This is exactly what the link suggests
When I run the above, I get the following error message:
Graph visualization failed.
Error: Malformed GraphDef. This can sometimes be caused by a bad
network connection or difficulty reconciling multiple GraphDefs; for
the latter case, please refer to
https://github.com/tensorflow/tensorboard/issues/1929.
I don't think this is a reconciliation problem, because no custom function is involved; and if I compile and train the model, I can get the graph visualization I wanted.
s_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])

(train_images, train_labels), _ = keras.datasets.fashion_mnist.load_data()
train_images = train_images / 255.0

logdir = '.../logs'
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
s_model.fit(
    train_images,
    train_labels,
    batch_size=64,
    epochs=5,
    callbacks=[tensorboard_callback])
This gives the desired graph visualization. But is there any other way to get the graph visualization in TensorBoard without training?
Of course, I'm aware that a workaround, i.e. training on tf.random.normal() input for a while, would do the trick, but I'm looking for a neat approach like _callbacks.set_model(s_model)...
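One approach that avoids training entirely is to trace a single forward pass with the tf.summary tracing API instead of the TensorBoard callback. A minimal sketch, where the log directory and the dummy input shape are assumptions:
import tensorflow as tf
import tensorflow.keras as keras

# the same untrained model as above
s_model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

@tf.function
def trace_fn(x):
    return s_model(x)

writer = tf.summary.create_file_writer('logs/graph')  # assumed log directory
tf.summary.trace_on(graph=True)
trace_fn(tf.zeros((1, 28, 28)))  # one dummy forward pass records the graph
with writer.as_default():
    tf.summary.trace_export(name='model_graph', step=0)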
I am using optimizer.get_config() to get the final state of my Adam optimizer (as in https://stackoverflow.com/a/60077159/607528); however, .get_config() is returning the initial state. I assume this means one of the following:
- .get_config() is supposed to return the initial state
- my optimizer is not updating because I've set something up wrong
- my optimizer is not updating because tf's Adam is broken (highly unlikely)
- my optimizer is updating but is being reset somewhere before I call .get_config()
- something else?
Of course, I originally noticed the issue in a proper project with training and validation sets etc., but here is a really simple snippet that seems to reproduce it:
import tensorflow as tf
import numpy as np

x = np.random.rand(100)
y = (x * 3).round()

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=500)
model.evaluate(x, y)
model.optimizer.get_config()
That is expected: .get_config() returns the optimizer's hyperparameter configuration, which is why you see the initial values; the mutable state lives in the optimizer weights. If you want to restore your training, you should save the optimizer weights, not the config:
import pickle

weight_values = optimizer.get_weights()
with open(self.output_path + 'optimizer.pkl', 'wb') as f:
    pickle.dump(weight_values, f)
And then load them:
# Build the optimizer's state by fitting the model on dummy input,
# e.g. random tensors with the simplest valid shape
model.fit(dummy_x, dummy_y, epochs=500)

with open(self.path_to_saved_model + 'optimizer.pkl', 'rb') as f:
    weight_values = pickle.load(f)
optimizer.set_weights(weight_values)
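To see the difference, compare the two accessors after training. A sketch, assuming tf.keras Adam, where the first entry of get_weights() is the iteration count, followed by the m and v slot variables:
print(model.optimizer.get_config())        # hyperparameters; unchanged by training
print(len(model.optimizer.get_weights()))  # 1 + 2 * len(model.trainable_weights) for Adam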
Suppose I have the following Keras model:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.metrics import Accuracy

model = Sequential()
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=10, activation='softmax'))
model.compile(
    loss=CategoricalCrossentropy(label_smoothing=0.01),
    optimizer=RMSprop(learning_rate=0.001, momentum=0.0),
    metrics=[Accuracy()]
)
I have two questions:
How can I view compilation settings (like the learning_rate)?
How can I change compilation settings (like the learning_rate)?
Remarks:
I noticed I can view layer settings using model.summary() or model.get_config() but that does not show compilation settings.
I know I can change the learning_rate by running the compile statement again with a different learning_rate. But I would like a "cleaner", more readable way to do this. Something like: model['compilation']['optimizer']['learning_rate'] = xxx. (Many sklearn models can be adjusted this way.)
Use .lr:
rate = model.optimizer.lr
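In TF 2.x, lr on a tf.keras optimizer is a tf.Variable, which answers both questions: you can read it and assign to it without recompiling. A minimal sketch (the new learning rate value is arbitrary):
print(float(model.optimizer.lr.numpy()))  # view the current learning rate
model.optimizer.lr.assign(1e-4)           # change it in place, no recompile needed
print(model.optimizer.get_config())       # view all optimizer hyperparameters set at compile time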
I am looking at Keras Tuner as a way of doing hyperparameter optimization, but all of the examples I have seen show an entirely fresh model being defined. For example, from the Keras Tuner Hello World:
def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
I already have a model that I would like to tune, but does that mean I have to rewrite it with the hyperparameters spliced into the body, as above, or can I simply pass the hyperparameters into the model at the top? For example, like this:
def build_model(hp):
    model = MyExistingModel(
        batch_size=hp['batch_size'],
        seq_len=hp['seq_len'],
        rnn_hidden_units=hp['hidden_units'],
        rnn_type='gru',
        num_rnn_layers=hp['num_rnn_layers']
    )
    optimizer = optimizer_factory['adam'](
        learning_rate=hp['learning_rate'],
        momentum=0.9,
    )
    model.compile(
        optimizer=optimizer,
        loss='sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy'],
    )
    return model
The above seems to work, as far as I can see. The model initialization args are all passed to the internal TF layers through a HyperParameters instance and accessed from there, although I'm not entirely sure how that instance gets passed in. I think it can be done by predefining a HyperParameters object and passing it to the tuner, so that it then gets passed to build_model:
from kerastuner import HyperParameters
from kerastuner.tuners import RandomSearch

hp = HyperParameters()
hp.Choice('learning_rate', [1e-1, 1e-3])

tuner = RandomSearch(
    build_model,
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=False,
    objective='val_accuracy')
Internally my model has two RNNs (LSTM or GRU) and an MLP. But I have yet to come across a Keras Tuner build_model that takes an existing model like this and simply passes in the hyperparameters. The model is quite complex, and I would like to avoid having to redefine it (as well as to avoid code duplication).
Indeed this is possible, as this GitHub issue makes clear...
However, rather than passing the hp object through the hyperparameters arg to the Tuner, I override the Tuner's run_trial method in the manner suggested here.
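A rough sketch of that override pattern (hedged: the exact run_trial signature differs between keras-tuner versions, and the names and values fixed here are hypothetical placeholders):
from kerastuner.tuners import RandomSearch

class ExistingModelTuner(RandomSearch):
    def run_trial(self, trial, *args, **kwargs):
        hp = trial.hyperparameters
        # Register values that build_model reads via hp[...] but that
        # should not be searched; these names and values are hypothetical.
        hp.Fixed('batch_size', 64)
        hp.Fixed('seq_len', 128)
        super().run_trial(trial, *args, **kwargs)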
I've been struggling to understand why two similar K-fold cross-validations result in two different averages.
When I use a manual KFold approach (with TensorFlow and Keras):
from sklearn.model_selection import StratifiedKFold
import numpy as np

cvscores = []
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=3)
for train, test in kfold.split(X, y):
    model = create_baseline()
    model.fit(X[train], y[train], epochs=50, batch_size=32, verbose=0)
    scores = model.evaluate(X[test], y[test], verbose=0)
    # print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))
I get
65.89% (+/- 3.77%)
When I use the KerasClassifier wrapper from scikit-learn:
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

estimator = KerasClassifier(build_fn=create_baseline, epochs=50, batch_size=32, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=3)
results = cross_val_score(estimator, X, y, cv=kfold, scoring='accuracy')
print("Baseline: %.2f%% (%.2f%%)" % (results.mean() * 100, results.std() * 100))
I get
63.82% (5.37%)
Additionally, when using KerasClassifier, the following warning appears:
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/wrappers/scikit_learn.py:241: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:
* `np.argmax(model.predict(x), axis=-1)`, if your model does multi-class classification (e.g. if it uses a `softmax` last-layer activation).
* `(model.predict(x) > 0.5).astype("int32")`, if your model does binary classification (e.g. if it uses a `sigmoid` last-layer activation).
Do the results differ because KerasClassifier uses predict_classes() while the manual Tensorflow/Keras approach uses just predict()? If so, which approach is more reasonable?
My model looks like this:
from tensorflow.keras.layers import Dense

def create_baseline():
    model = tf.keras.models.Sequential()
    model.add(Dense(8, activation='relu', input_shape=(12,)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
The two CV results do not look too different; each lies within the other's standard deviation.
You fixed the seed for the StratifiedKFold class; that's good. However, there is additional randomness you should take control of, coming from the weight initialization. Make sure you initialize your model for each CV run with different weights, but use the same 10 initializations for both cross-validations, manual and automatic. You can pass an initializer to each layer; initializers have a seed argument as well. In general you should fix all possible seeds (np.random.seed(3), tf.set_random_seed(3), or tf.random.set_seed(3) in TF 2.x).
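For example, a minimal sketch of create_baseline with seeded initializers (assuming TF 2.x; the seed values are arbitrary, and for simplicity every fold starts from the same weights here, whereas per-fold distinct seeds would need a counter shared by both versions):
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

np.random.seed(3)
tf.random.set_seed(3)

def create_baseline(seed=3):
    # Seeded initializers make each fold start from reproducible weights
    # in both the manual loop and the KerasClassifier version.
    model = tf.keras.models.Sequential()
    model.add(Dense(8, activation='relu', input_shape=(12,),
                    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=seed)))
    model.add(Dense(1, activation='sigmoid',
                    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=seed + 1)))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model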
What happens if you run cross_val_score() or your manual version twice? Do you get the same results / numbers?