Keras neural network "Function call stack: keras_scratch_graph" - tensorflow

I am trying to build a machine learning model using Keras, and I am getting the error below:
InvalidArgumentError: indices[1,1478] = 92260 is not in [0, 22000)
[[node embedding_3/embedding_lookup (defined at C:\Users\username\AppData\Local\Continuum\Anaconda3-5.2.0\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_10071]
Function call stack:
keras_scratch_graph
Code:
filepath = ".../weights-simple.hdf5"
checkpointer = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
history = model.fit([X_train], batch_size=32, y=to_categorical(y_train), verbose=1, validation_split=0.25,
                    shuffle=True, epochs=3, callbacks=[checkpointer])
I even tried the code below, but I am still getting the error:
tf.config.experimental.set_visible_devices([], 'GPU')
Can you please help me with this issue?
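Judging from the message alone, the lookup at embedding_3/embedding_lookup received the index 92260 while the embedding table only has 22000 rows, i.e. the layer's input_dim is smaller than the vocabulary actually present in the data. A minimal sketch of the usual fix (output_dim and the variable names here are illustrative, not from the question):

import numpy as np

# An Embedding lookup requires 0 <= index < input_dim, so size the table
# by the largest token index actually present in the data.
vocab_size = int(np.max(X_train)) + 1   # here that would need to be > 92260
embedding_layer = keras.layers.Embedding(input_dim=vocab_size, output_dim=128)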

Related

How to configure checkpoint and earlystopping so that no warning will be issued?

My model definition is below. Two callbacks are used: checkpointing monitors val_accuracy, and early stopping is based on val_loss.
checkpoint_filepath = '/tmp/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True,
    verbose=1)
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1)
history = model.fit(train_ds, epochs=epochs, validation_data=val_ds,
                    callbacks=[early_stopping, model_checkpoint_callback])
Why does it always complain: "WARNING:tensorflow:Can save best model only with val_accuracy available, skipping."?
Compile your model with the metrics=["accuracy"] parameter, like this:
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
Then it should work.
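With that metric compiled in and validation data passed to fit, Keras records val_accuracy each epoch, which is exactly the key ModelCheckpoint looks up. A quick sanity check of the metric names available to callbacks (a sketch reusing train_ds/val_ds from the question):

history = model.fit(train_ds, epochs=1, validation_data=val_ds)
print(history.history.keys())  # should include 'accuracy' and 'val_accuracy'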

Tensorboard not updating by batch in google colab

I'm using TensorBoard in Google Colab. It works fine if I want to track metrics by epoch; however, I want to track the accuracy/loss by batch. I'm following the getting-started documentation at https://www.tensorflow.org/tensorboard/get_started, but if I change the argument update_freq to update_freq="batch", it doesn't work. I have tried it on my local PC and it works there. Any idea what is happening?
Using tensorboard 2.8.0 and tensorflow 2.8.0.
Code (running in Colab):
%load_ext tensorboard
import tensorflow as tf
import datetime

!rm -rf ./logs/

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

log_dir = "logs/fit_2/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, update_freq="batch")
model.fit(x=x_train,
          y=y_train,
          epochs=5,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback])
I've also tried passing an integer and it doesn't work either. On my local computer I have no problems.
A change after TensorFlow 2.3 made the batch-level summaries part of Model.train_function rather than something the TensorBoard callback creates itself. This resulted in a 2x speed improvement for many small models in Model.fit, but it has the side effect that calling TensorBoard.on_train_batch_end(my_batch, my_metrics) in a custom training loop no longer logs batch-level metrics.
This was discussed in one of the GitHub issues.
A workaround is to create a custom callback, for example with LambdaCallback.
I have modified the last part of your code to explicitly log scalar values of batch_loss and batch_accuracy with tf.summary.scalar(), so they show up in the TensorBoard logs. The code is as follows:
model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

from tensorflow.keras.callbacks import LambdaCallback

def batchOutput(batch, logs):
    # Writes to whichever summary writer is currently set as default.
    tf.summary.scalar('batch_loss', data=logs['loss'], step=batch)
    tf.summary.scalar('batch_accuracy', data=logs['accuracy'], step=batch)
    return batch

batchLogCallback = LambdaCallback(on_batch_end=batchOutput)

log_dir = "logs/fit_2/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, update_freq='batch')
model.fit(x=x_train,
          y=y_train,
          epochs=1,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback, batchLogCallback])
I tried this in Colab as well, and it worked.
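If the scalars still do not appear (tf.summary.scalar silently does nothing when no default summary writer is active at batch time), a variant of the same workaround is to create the file writer explicitly inside a custom callback. This is a sketch of my own, not from the original answer:

import tensorflow as tf

class BatchLogger(tf.keras.callbacks.Callback):
    """Log per-batch loss/accuracy through an explicitly created writer."""
    def __init__(self, log_dir):
        super().__init__()
        self.writer = tf.summary.create_file_writer(log_dir + '/batch')
        self.step = 0  # monotonic step across epochs

    def on_train_batch_end(self, batch, logs=None):
        logs = logs or {}
        with self.writer.as_default():
            tf.summary.scalar('batch_loss', logs['loss'], step=self.step)
            tf.summary.scalar('batch_accuracy', logs['accuracy'], step=self.step)
        self.step += 1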

AttributeError: 'Sequential' object has no attribute 'predict_proba'

predict_proba raises the error below in the neural network.
I saw the examples at these links: https://machinelearningmastery.com/how-to-make-classification-and-regression-predictions-for-deep-learning-models-in-keras/ and
https://faroit.com/keras-docs/1.0.0/models/sequential/#the-sequential-model-api
I am using TensorFlow version 2.6.0.
Code:
# creating the object (initializing the ANN)
import tensorflow as tf
from tensorflow import keras

LAYERS = [
    tf.keras.layers.Dense(50, activation="relu", input_shape=X_train.shape[1:]),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(5, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')
]
LOSS = "binary_crossentropy"
OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=1e-3)

model_cEXT = tf.keras.models.Sequential(LAYERS)
model_cEXT.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=['accuracy'])

EPOCHS = 100
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("model_cEXT.h5", save_best_only=True)
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs")
CALLBACKS = [checkpoint_cb, early_stopping_cb, tensorboard_cb]

model_cEXT.fit(X_train, y_train['cEXT'], epochs=EPOCHS, validation_data=(X_test, y_test['cEXT']), callbacks=CALLBACKS)
model_cEXT.predict_proba(X_test)
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-72-8f06353cf345> in <module>()
----> 1 model_cEXT.predict_proba(X_test)
AttributeError: 'Sequential' object has no attribute 'predict_proba'
Edit: I need sklearn-style predict_proba output; it is needed for visualization:
skplt.metrics.plot_precision_recall_curve(y_test['cEXT'].values, y_prob)
plt.title('Precision-Recall Curve - cEXT')
plt.show()
Use model.predict instead. Note that np.argmax(predict_prob, axis=1) is for multi-class softmax outputs; this model ends in a single sigmoid unit, so threshold at 0.5 to get classes:
predict_prob = model_cEXT.predict(X_test)
predict_classes = (predict_prob > 0.5).astype("int32")
Newer versions do not have the predict_proba method, so I created my own using the .predict method:
import numpy as np

def predict_prob(number):
    # sklearn orders columns as [P(class 0), P(class 1)]; the sigmoid output is P(class 1)
    return [1 - number[0], number[0]]

y_prob = np.array(list(map(predict_prob, model_cEXT.predict(X_test))))
y_prob
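A vectorized equivalent of the same idea (a sketch, under the same assumption that the sigmoid output is P(class 1)):

import numpy as np

p1 = model_cEXT.predict(X_test).ravel()     # P(class 1) from the single sigmoid unit
y_prob = np.column_stack([1.0 - p1, p1])    # sklearn-style [P(class 0), P(class 1)] columns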

Keras model.fit causes InvalidArgumentError when training on TPU

When running the model.fit function, an error is thrown. The main question is: what does this error mean? The code is run on a TPU v3-8 and uses Google Cloud for data retrieval. I did try to look up the error on the web, but I could not find a single case of someone else getting it.
model.fit(
    dataset,
    steps_per_epoch=N_IMGS // BATCH_SIZE,
    epochs=EPOCHS,
)
This throws the error:
InvalidArgumentError: {{function_node __inference_train_function_528542}} Compilation failure: Depth of output must be a multiple of the number of groups: 3 vs 2
[[{{node sequential/conv2d/Conv2D}}]]
TPU compilation failed
[[tpu_compile_succeeded_assert/_15965336225898828069/_5]]
The error message is not clear to me; what exactly is going wrong? The following model is used:
def get_model():
    # reset to free memory and training variables
    tf.keras.backend.clear_session()
    with strategy.scope():
        net = efn.EfficientNetB0(include_top=False, weights='noisy-student', input_shape=(HEIGHT, WIDTH, 3))
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(3, (3, 3), padding='same', input_shape=(HEIGHT, WIDTH, 1)),
            net,
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dropout(0.25),
            tf.keras.layers.Dense(N_LABELS, activation='softmax', dtype='float32'),
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

model = get_model()
tf.keras.utils.plot_model(model, 'model.png', show_shapes=True)
The dataset gives the following output:
for images, labels in dataset.take(1):  # only take the first element of the dataset
    print(f'images.shape: {images.shape}, images.dtype: {images.dtype}, labels.shape: {labels.shape}, labels.dtype: {labels.dtype}')

images.shape: (64, 224, 400, 1), images.dtype: <dtype: 'float32'>, labels.shape: (64,), labels.dtype: <dtype: 'int32'>

AlreadyExistsError while training a network on colab

I'm trying to train an LSTM network on Google Colab. However, this error occurs:
AlreadyExistsError: Resource __per_step_116/training_4/Adam/gradients/bidirectional_4/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
[[{{node training_4/Adam/gradients/bidirectional_4/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var}}]]
I don't know where the issue could be. This is the model of the network:
sl_model = keras.models.Sequential()
sl_model.add(keras.layers.Embedding(max_index + 1, hidden_size, mask_zero=True))
sl_model.add(keras.layers.Bidirectional(keras.layers.LSTM(
    hidden_size, activation='tanh', dropout=0.2, recurrent_dropout=0.2, return_sequences=True)))
sl_model.add(keras.layers.Bidirectional(keras.layers.LSTM(
    hidden_size, activation='tanh', dropout=0.2, recurrent_dropout=0.2, return_sequences=False)))
sl_model.add(keras.layers.Dense(max_length, activation='softsign'))

optimizer = keras.optimizers.Adam()
sl_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['acc'])

batch_size = 128
epochs = 3
cbk = keras.callbacks.TensorBoard("logging/keras_model")
print("\nStarting training...")
sl_model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
             shuffle=True, validation_data=(x_dev, y_dev), callbacks=[cbk])
Thank you so much!
You need to restart your runtime -- this happens when multiple graphs have been built in a single Jupyter (Colaboratory) runtime.
Calling tf.reset_default_graph() may also help, but depending on whether you are using eager execution and how you've defined your sessions, this may or may not work.
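For reference, a minimal sketch of that reset (assuming TF 1.x graph mode; under TensorFlow 2.x the same function lives in the v1 compatibility module):

import tensorflow as tf

tf.reset_default_graph()              # TF 1.x
# tf.compat.v1.reset_default_graph()  # TF 2.x equivalent, for v1-style graph code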