How to get history over one epoch after every batch - tensorflow

I am training a model with keras for one epoch only:
history = model.fit([x], y,
                    validation_split=0.2, epochs=1, batch_size=2)
print(history.history['accuracy'])
The history now obviously only contains one value from the end of the epoch. How can I evaluate the training accuracy or loss during the epoch? I expect these to be the values that are shown in the console during training.
To be clear: I want a history to be written after every batch (not after every epoch, as per usual).

I assume you want to save the accuracy and loss at the end of each batch. To do that you need to create a custom callback as shown below:
from tensorflow import keras

class BCP(keras.callbacks.Callback):
    batch_accuracy = []  # accuracy at the end of each batch
    batch_loss = []      # loss at the end of each batch
    def __init__(self):
        super(BCP, self).__init__()
    def on_train_batch_end(self, batch, logs=None):
        BCP.batch_accuracy.append(logs.get('accuracy'))
        BCP.batch_loss.append(logs.get('loss'))
Now in model.fit include
callbacks = [BCP()]
Now train for 1 epoch. At the end of the epoch the accuracy and loss values for each batch are stored in BCP.batch_accuracy and BCP.batch_loss. You can print them out as follows:
print('{0:^4s}{1:^22s}{2:^10s}'.format('Batch', 'Loss', 'Accuracy'))
for i in range(len(BCP.batch_accuracy)):
    print('{0:^4s}{1:15.5f}{2:15.2f}'.format(str(i), BCP.batch_loss[i], BCP.batch_accuracy[i] * 100))
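For completeness, here is a minimal usage sketch; the fit call mirrors the one in the question, and the matplotlib plot at the end is just one optional way to visualize the per-batch values:
history = model.fit([x], y, validation_split=0.2, epochs=1, batch_size=2,
                    callbacks=[BCP()])

# optional: plot the per-batch curves collected by the callback
import matplotlib.pyplot as plt
plt.plot(BCP.batch_loss, label='loss')
plt.plot(BCP.batch_accuracy, label='accuracy')
plt.xlabel('batch')
plt.legend()
plt.show()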


Keras Early Stop and Monitor

How can I activate keras.EarlyStopping only when the monitored value is greater than a threshold? For example, how can I trigger earlystop = EarlyStopping(monitor='val_accuracy', min_delta=0.0001, patience=5, verbose=1, mode='auto') only when val_accuracy > 0.9? Also, how should I properly export the intermediate model, for example every 50 epochs?
I don't have much experience, and the baseline argument for EarlyStopping seems to mean something other than a threshold.
The best way to stop on a metric threshold is to use a Keras custom callback. Below is the code for a custom callback (SOMT - stop on metric threshold) that will do the job. The SOMT callback is useful to end training based on the value of the training accuracy, the validation accuracy, or both.
The form of use is callbacks=[SOMT(model, train_thold, valid_thold)] where
model is the name of your compiled model
train_thold is a float. It is the training accuracy (as a fraction, e.g. 0.95 for 95%) that must be achieved by the model in order to conditionally stop training
valid_thold is a float. It is the validation accuracy (as a fraction) that must be achieved by the model in order to conditionally stop training
Note: to stop training, BOTH train_thold and valid_thold must be exceeded in the SAME epoch.
If you want to stop training based solely on the training accuracy, set valid_thold to 0.0.
Similarly, if you want to stop training on just the validation accuracy, set train_thold to 0.0.
Note: if both thresholds are not achieved in the same epoch, training will continue until the specified number of epochs is reached. If both thresholds are reached in the same epoch, training is halted and your model weights are left at the weights of that epoch.
As an example, let's take the case that you want to stop training when the training accuracy has reached or exceeded 95% and the validation accuracy has reached at least 85%; then the code would be callbacks=[SOMT(my_model, .95, .85)]
# the callback uses the time module so
import time
from tensorflow import keras

class SOMT(keras.callbacks.Callback):
    def __init__(self, model, train_thold, valid_thold):
        super(SOMT, self).__init__()
        self.model = model
        self.train_thold = train_thold
        self.valid_thold = valid_thold

    def on_train_begin(self, logs=None):
        print('Starting Training - training will halt if training accuracy achieves or exceeds ', self.train_thold)
        print('and validation accuracy meets or exceeds ', self.valid_thold)
        msg = '{0:^8s}{1:^12s}{2:^12s}{3:^12s}{4:^12s}{5:^12s}'.format('Epoch', 'Train Acc', 'Train Loss', 'Valid Acc', 'Valid_Loss', 'Duration')
        print(msg)

    def on_train_batch_end(self, batch, logs=None):
        acc = logs.get('accuracy') * 100  # get training accuracy
        loss = logs.get('loss')
        msg = '{0:1s}processed batch {1:4s} training accuracy= {2:8.3f} loss: {3:8.5f}'.format(' ', str(batch), acc, loss)
        print(msg, '\r', end='')  # prints over the same line to show a running batch count

    def on_epoch_begin(self, epoch, logs=None):
        self.now = time.time()

    def on_epoch_end(self, epoch, logs=None):
        later = time.time()
        duration = later - self.now
        tacc = logs.get('accuracy')
        vacc = logs.get('val_accuracy')
        tr_loss = logs.get('loss')
        v_loss = logs.get('val_loss')
        ep = epoch + 1
        print(f'{ep:^8.0f} {tacc:^12.2f}{tr_loss:^12.4f}{vacc:^12.2f}{v_loss:^12.4f}{duration:^12.2f}')
        if tacc >= self.train_thold and vacc >= self.valid_thold:
            print(f'\ntraining accuracy and validation accuracy reached the thresholds on epoch {epoch + 1}')
            self.model.stop_training = True  # stop training
Note: include this code after compiling your model and prior to fitting it.
train_thold = .98
valid_thold = .95
callbacks = [SOMT(model, train_thold, valid_thold)]
# training will halt if train accuracy meets or exceeds train_thold
# AND validation accuracy meets or exceeds valid_thold in the SAME epoch
In model.fit include callbacks=callbacks, verbose=0.
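For example (x_train, y_train, x_val, y_val, epochs and batch_size below are placeholders for your own data and settings):
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=20, batch_size=32,
                    callbacks=callbacks, verbose=0)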
At the end of each epoch the callback produces a spreadsheet-like printout of the form:
 Epoch   Train Acc   Train Loss   Valid Acc   Valid_Loss   Duration
   1        0.90       4.3578        0.95       2.3982       84.16
   2        0.95       1.6816        0.96       1.1039       63.13
   3        0.97       0.7794        0.95       0.5765       63.40
training accuracy and validation accuracy reached the thresholds on epoch 3

didn't assign model.fit, can I plot the history?

Can I plot accuracy, loss, etc. if I didn't assign the result of model.fit to anything? I just called model.fit and trained the model.
Thanks
There are 2 ways to do this without a history in Keras:
1. Take the text output of the Keras training, manually collect the loss values of every epoch, and do the plot by hand by filling 2 numpy arrays with the values (one for the loss and another for the validation loss). This seems like a long task, but with a text editor like Visual Studio Code it's a matter of seconds.
2. Write a callback that, after every epoch, writes the result of the epoch to an external text file, and then take the values in a similar way as in point 1. Something like this:
import numpy as np
from contextlib import redirect_stdout
from tensorflow.keras.callbacks import Callback

class print_log_Callback(Callback):
    def __init__(self, logpath, steps):
        super().__init__()
        self.logpath = logpath
        self.losslst = np.zeros(steps)
    def on_train_batch_end(self, batch, logs=None):
        self.losslst[batch] = logs["loss"]  # record the loss of each batch so the mean can be reported
    def on_epoch_end(self, epoch, logs=None):
        with open(self.logpath, 'a') as writefile:
            with redirect_stdout(writefile):
                print("The average loss for epoch {} is {:7.2f} and val_loss is {:7.2f}.".format(epoch, logs["loss"], logs['val_loss']))
                writefile.write("\n")
                print("The mean train loss is: ", np.mean(self.losslst))
                writefile.write("\n")
                writefile.write("\n")
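A possible usage sketch follows; the file name 'train_log.txt', the steps value, and the regular expression are illustrative assumptions, not part of the answer above:
# hypothetical usage -- 'train_log.txt' and steps=500 are example values
cb = print_log_Callback('train_log.txt', steps=500)
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=10, callbacks=[cb])

# afterwards, pull the loss values back out of the log file and plot them
import re
import matplotlib.pyplot as plt

losses, val_losses = [], []
with open('train_log.txt') as f:
    for line in f:
        m = re.search(r'is\s+(\d+\.\d+) and val_loss is\s+(\d+\.\d+)', line)
        if m:
            losses.append(float(m.group(1)))
            val_losses.append(float(m.group(2)))

plt.plot(losses, label='loss')
plt.plot(val_losses, label='val_loss')
plt.legend()
plt.show()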

Why doesn't custom training loop average loss over batch_size?

The code snippet below is the custom training loop from the official TensorFlow tutorial: https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch. Another tutorial also does not average the loss over batch_size, as shown here: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Why is loss_value not averaged over batch_size at the line loss_value = loss_fn(y_batch_train, logits)? Is this a bug? According to another question here, Loss function works with reduce_mean but not reduce_sum, reduce_mean is indeed needed to average the loss over batch_size.
The loss_fn is defined in the tutorial as below, and it does not appear to average over batch_size:
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
From the documentation, it seemed that keras.losses.SparseCategoricalCrossentropy sums the loss over the batch without averaging, which would essentially be reduce_sum instead of reduce_mean! The documentation for the reduction argument says:
Type of tf.keras.losses.Reduction to apply to loss. Default value is AUTO. AUTO indicates that the reduction option will be determined by the usage context. For almost all cases this defaults to SUM_OVER_BATCH_SIZE.
The code is shown below.
epochs = 2
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):

        # Open a GradientTape to record the operations run
        # during the forward pass, which enables auto-differentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass of the layer.
            # The operations that the layer applies
            # to its inputs are going to be recorded
            # on the GradientTape.
            logits = model(x_batch_train, training=True)  # Logits for this minibatch

            # Compute the loss value for this minibatch.
            loss_value = loss_fn(y_batch_train, logits)

        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        grads = tape.gradient(loss_value, model.trainable_weights)

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        # Log every 200 batches.
        if step % 200 == 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %s samples" % ((step + 1) * 64))
I've figured it out: loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True) does indeed average the loss over batch_size by default, since the AUTO reduction resolves to SUM_OVER_BATCH_SIZE.
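A quick standalone sanity check (not taken from the tutorial) that compares the default reduction with an explicit per-sample mean:
# standalone check: the default reduction of SparseCategoricalCrossentropy
# equals the mean of the per-sample losses (i.e. SUM_OVER_BATCH_SIZE averages)
import numpy as np
import tensorflow as tf
from tensorflow import keras

y_true = np.array([0, 1, 2, 1])                    # 4 samples
logits = np.random.randn(4, 3).astype("float32")   # 3 classes

default_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
per_sample_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction="none")

batch_loss = default_fn(y_true, logits)                    # scalar
mean_loss = tf.reduce_mean(per_sample_fn(y_true, logits))  # explicit average

print(float(batch_loss), float(mean_loss))                 # the two values match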

Keras evaluate the validation data before the epoch ends

I want to train my model with Keras. I'm using a huge dataset where one training epoch has more than 30,000 steps. My problem is that I don't want to wait for a whole epoch before checking the model's improvement on the validation dataset. Is there any way to make Keras evaluate the validation data every 1000 steps of the training data? I think one option would be to use a callback, but is there any built-in solution in Keras?
if train:
    log('Start training')
    history = model.fit(train_dataset,
                        steps_per_epoch=train_steps,
                        epochs=50,
                        validation_data=val_dataset,
                        validation_steps=val_steps,
                        callbacks=[
                            keras.callbacks.EarlyStopping(
                                monitor='loss',
                                patience=10,
                                restore_best_weights=True,
                            ),
                            keras.callbacks.ModelCheckpoint(
                                filepath=f'model.h5',
                                monitor='val_loss',
                                save_best_only=True,
                                save_weights_only=True,
                            ),
                            keras.callbacks.ReduceLROnPlateau(
                                monitor="val_loss",
                                factor=0.5,
                                patience=3,
                                min_lr=0.001,
                            ),
                        ],
                        )
With the built-in callbacks, you cannot do that. What you need is to implement a custom callback.
import datetime
import tensorflow as tf

class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_train_batch_begin(self, batch, logs=None):
        print('Training: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))
    def on_train_batch_end(self, batch, logs=None):
        print('Training: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))
    def on_test_batch_begin(self, batch, logs=None):
        print('Evaluating: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))
    def on_test_batch_end(self, batch, logs=None):
        print('Evaluating: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))
This is taken from the TensorFlow documentation.
You can override the on_train_batch_end() function and, since the batch parameter is an integer, check for example batch % 100 == 0 and then call self.model.predict(val_data), etc., to suit your needs; a sketch is given below.
Please check my answer here: How to get other metrics in Tensorflow 2.0 (not only accuracy)? to get a good overview of how to write a custom callback. Please note that in your case it is on_train_batch_end(), not on_epoch_end(), that is important.
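A minimal sketch of such a callback follows; the val_dataset argument, the every_n value of 1000, and the use of model.evaluate are assumptions based on the question, not part of the linked answer:
import tensorflow as tf

class EvaluateEveryNBatches(tf.keras.callbacks.Callback):
    # assumes val_dataset is a tf.data.Dataset (or arrays) accepted by model.evaluate
    def __init__(self, val_dataset, every_n=1000, val_steps=None):
        super().__init__()
        self.val_dataset = val_dataset
        self.every_n = every_n
        self.val_steps = val_steps

    def on_train_batch_end(self, batch, logs=None):
        if batch > 0 and batch % self.every_n == 0:
            # run a full validation pass in the middle of the epoch
            results = self.model.evaluate(self.val_dataset,
                                          steps=self.val_steps,
                                          verbose=0,
                                          return_dict=True)
            print(f'\nbatch {batch}: validation results {results}')

# usage: model.fit(train_dataset, epochs=50,
#                  callbacks=[EvaluateEveryNBatches(val_dataset, every_n=1000, val_steps=val_steps)])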

Validation loss is inconsistent if I predict same results and calculate loss afterwards

I have an LSTM model that predicts the weather. When I run the model with model.fit, it reports around 20% MAPE.
When I predict the same data that was given to model.fit and calculate the loss afterwards, I get around 60% MAPE. What might be causing this difference? I would have ignored it, but the difference is too large.
Here is my code in main:
# preparing the data and building the model first
regressor.fit(x_train, y_train, epochs=100, batch_size=32,
              validation_data=(x_test, y_test))
results = regressor.predict(x_test)
print(bm.mean_absolute_percentage_error(y_test, results))
in bm:
def mean_absolute_percentage_error(real, est):
    """Calculates the mean absolute percentage error."""
    sess = Session()
    with sess.as_default():
        tensor = losses.mean_absolute_percentage_error(real, est)
        return tensor.eval()[-1]
I used the same function that keras uses for calculating MAPE. Even if I made a mistake when preparing test data, they both should be consistently wrong because they take the same set as argument.
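For reference, one way to cross-check the number outside a TensorFlow session is a plain numpy computation; this sketch assumes y_test and results are arrays of the same shape and mirrors the Keras formula (absolute percentage error per element, averaged):
# numpy cross-check sketch -- y_test and results are assumed to be plain arrays
import numpy as np

def mape_numpy(real, est, eps=1e-7):
    real = np.asarray(real, dtype=np.float64)
    est = np.asarray(est, dtype=np.float64)
    return 100.0 * np.mean(np.abs((real - est) / np.clip(np.abs(real), eps, None)))

# print(mape_numpy(y_test, results))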