Keras train and validation plotting - TensorFlow

I have a project to detect faces using Tiny YOLOv1 with Keras and TensorFlow, and I have to train the model from scratch. When I train the model using the dataset https://pixeldrain.com/u/wnUrWG2k, the loss value does not decrease much in each epoch, and when I plot y_pred and y_validation they are both straight lines.
model = Model(inputs=inputs, outputs=yolo_outputs)
model.compile(loss=yolo_loss, optimizer='adam')
history = model.fit(X_train, y_train, validation_split=0.33, epochs=5, batch_size=10)
This is my model loss plot:
(image: loss plot)

Did you
normalize the input features?
double check your learning rate?
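
As a minimal sketch of the first suggestion, assuming the images are stored as uint8 arrays with values in [0, 255] (the array below is a made-up stand-in, not the asker's dataset), the inputs can be scaled to [0, 1] before calling model.fit:

```python
import numpy as np


def normalize_images(images):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0.0, 1.0]."""
    return images.astype("float32") / 255.0


# Hypothetical stand-in for the raw training images: 4 RGB images of 8x8 pixels.
rng = np.random.default_rng(0)
X_raw = rng.integers(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)

X_train = normalize_images(X_raw)
print(X_train.dtype, X_train.min(), X_train.max())
```

For the second suggestion, one option is to pass an explicit optimizer instead of the string 'adam', for example tf.keras.optimizers.Adam(learning_rate=1e-4). The value 1e-4 is only an illustrative starting point, not a recommendation from the thread.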

Related

Validation loss reported seems to be wrong, can preprocessing be the reason?

I'm training a ResNet model with Keras, fine-tuned on my own images. While training, TensorBoard constantly reports a validation loss that seems unrelated to the training loss (much higher, see image below, where train is the orange line and validation the blue line). Furthermore, when training is finished (for example, the final losses reported by TensorBoard could be 0.06 and 0.57, respectively), I evaluate the model "manually" and the validation loss seems to be in the same range as the training loss (e.g. 0.07).
I suspect that preprocessing could be the reason for this strange result. Essentially, the inputs and the outputs of the model are created like this:
inp = tf.keras.Input(input_shape)
resnet = tf.keras.applications.ResNet50V2(include_top=False, input_shape=input_shape, input_tensor=inp,pooling="avg")
# Add ResNet50V2 specific preprocessing method into the model.
preprocessed = tf.keras.layers.Lambda(lambda x: tf.keras.applications.resnet_v2.preprocess_input(x))(inp)
out = resnet(preprocessed)
out = tf.keras.layers.Dense(num_outputs, activation=None)(out)
and the training:
model.compile(
    optimizer=tf.keras.optimizers.Adam(lrate),
    loss='mse',
    metrics=[tf.keras.metrics.MeanSquaredError()],
)
model.fit(
    train_dataset,
    epochs=epochs,
    validation_data=val_dataset,
    callbacks=callbacks
)
It's as if preprocessing does not occur when the validation loss is calculated, but I don't know why.

Why is my val_loss starting low and increasing, and does it even matter with transfer learning?

Hey guys, I am new to machine learning and I am running this code from https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-to-classify-photos-of-dogs-and-cats/
I want to understand why my val_loss starts low and then increases. Is this overfitting or underfitting? Also, what can I use to improve the val_loss so that it gives a better fit? In the blog post, his cross-entropy plot is very different from mine.
def define_model():
    # load model
    model = VGG16(include_top=False, input_shape=(224, 224, 3))
    # mark loaded layers as not trainable
    for layer in model.layers:
        layer.trainable = False
    # add new classifier layers
    flat1 = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    output = Dense(1, activation='sigmoid')(class1)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    opt = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
    return model

# plot diagnostic learning curves
def summarize_diagnostics(history):
    # plot loss
    pyplot.subplot(211)
    pyplot.title('Cross Entropy Loss')
    pyplot.plot(history.history['loss'], color='blue', label='train')
    pyplot.plot(history.history['val_loss'], color='orange', label='test')
    # plot accuracy
    pyplot.subplot(212)
    pyplot.title('Classification Accuracy')
    pyplot.plot(history.history['accuracy'], color='blue', label='train')
    pyplot.plot(history.history['val_accuracy'], color='orange', label='test')
    # save plot to file
    filename = sys.argv[0].split('/')[-1]
    pyplot.savefig(filename + '_plot.png')
    pyplot.close()

# run the test harness for evaluating a model
def run_test_harness():
    # define model
    model = define_model()
    # create data generator
    datagen = ImageDataGenerator(featurewise_center=True)
    # specify imagenet mean values for centering
    datagen.mean = [123.68, 116.779, 103.939]
    # prepare iterator
    train_it = datagen.flow_from_directory('dataset_dogs_vs_cats/train/',
        class_mode='binary', batch_size=64, target_size=(224, 224))
    test_it = datagen.flow_from_directory('dataset_dogs_vs_cats/test/',
        class_mode='binary', batch_size=64, target_size=(224, 224))
    # fit model
    history = model.fit_generator(train_it, steps_per_epoch=len(train_it),
        validation_data=test_it, validation_steps=len(test_it), epochs=10, verbose=1)
    # evaluate model
    _, acc = model.evaluate_generator(test_it, steps=len(test_it), verbose=1)
    print('> %.3f' % (acc * 100.0))
    # learning curves
    summarize_diagnostics(history)
    model.save('Transfer_Learning_Model.h5')

# entry point, run the test harness
run_test_harness()
A number of issues. The VGG model was trained with pixel values from -1 to +1 so you need to add the following
def scaler(x):
    y = x / 127.5 - 1
    return y
datagen = ImageDataGenerator(preprocessing_function=scaler)
I would remove featurewise_center=True. If you use it, you have to run fit on the training set.
In model.fit you have steps_per_epoch=len(train_it) and validation_steps=len(test_it).
Since you set a batch_size of 64 in the generators, these values should be
steps_per_epoch=len(train_it.labels)//64 and validation_steps=len(test_it.labels)//64. Actually, in model.fit you can leave these parameters out; model.fit will calculate the right values internally. Your validation loss curve indicates a small degree of overfitting. Add a Dropout layer after the class1 layer, with the dropout rate set to something like 0.2.
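
The step counts mentioned above come down to simple batch arithmetic. The following is a plain-Python sketch with hypothetical sample counts (steps_for is an illustrative helper, not a Keras function): floor division drops a final partial batch, while rounding up covers every sample once.

```python
import math


def steps_for(num_samples, batch_size, keep_partial_batch=True):
    """Number of batches needed to pass over num_samples once."""
    if keep_partial_batch:
        # Round up so the last, smaller batch is included.
        return math.ceil(num_samples / batch_size)
    # Floor division: the last partial batch is skipped.
    return num_samples // batch_size


# Hypothetical sample counts, chosen only for illustration.
n_train, n_val, batch_size = 1000, 250, 64

steps_per_epoch = steps_for(n_train, batch_size)
validation_steps = steps_for(n_val, batch_size)
print(steps_per_epoch, validation_steps)
```

Leaving both parameters out, as suggested, avoids having to choose between the two conventions by hand.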
If you want to push for the highest accuracy, I recommend you incorporate two Keras callbacks. The EarlyStopping callback monitors validation loss and halts training if the loss fails to decrease for 'patience' consecutive epochs. Setting restore_best_weights=True will load the weights from the epoch with the lowest validation loss, so you don't have to save and then reload the weights. Set epochs to a large number to ensure this callback activates. Use the Keras callback ReduceLROnPlateau to automatically adjust the learning rate based on validation loss. The documentation for the callbacks is located here. The code I use is shown below.
es=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=3,
verbose=1, restore_best_weights=True)
rlronp=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5, patience=1,
verbose=1)
callbacks=[es, rlronp]
In model.fit, add callbacks=callbacks.
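
To make the role of patience concrete, here is a simplified plain-Python sketch of the early-stopping rule described above. It is not the Keras implementation: it ignores options such as min_delta and baseline, and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Simplified rule: an epoch is an improvement only if its loss is
    strictly lower than the best loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # Training would halt here; with restore_best_weights=True
                # the weights from best_epoch would be restored.
                return epoch, best_epoch
    # Patience was never exhausted: training runs to the last epoch.
    return len(val_losses) - 1, best_epoch


val_losses = [0.50, 0.40, 0.45, 0.43, 0.44, 0.30]  # made-up values
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)
```

In this made-up run, training stops before the last epoch, which would have been the best one. A larger patience trades extra training time for a lower chance of stopping too early.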

The validation loss of my LSTM model is very volatile

I'm creating an LSTM model to predict the closing price of bitcoin. However, when I start training, my validation loss becomes very volatile and my test_prediction becomes inaccurate.
Here's my model:
model = Sequential()
model.add(LSTM(80, input_shape=(1,look_back)))
model.add(LSTM(60))
model.compile(optimizer='adam', loss='mean_squared_error')
Fitting the model:
from keras.callbacks import ModelCheckpoint
callbacks = [ModelCheckpoint(save_best_only = True, filepath='btc_close_prediction.h5')]
history = model.fit(xTrain, yTrain, batch_size=10, epochs=30, callbacks=callbacks, validation_split=0.2)
loss graph:
Prediction Plot:
Please advise how I can adjust my model for a better val_loss and better prediction accuracy.
Your validation dataset should comprise about 5000 samples for the validation loss to be smooth.
Try a transformer model; it requires less training data.

Approximating a smooth multidimensional function using Keras to an error of 1e-4

I am trying to approximate a function that smoothly maps five inputs to a single probability using Keras, but seem to have hit a limit. A similar problem was posed here (Keras Regression to approximate function (goal: loss < 1e-7)) for a ten-dimensional function and I have found that the architecture proposed there, namely:
model = Sequential()
model.add(Dense(128,input_shape=(5,), activation='tanh'))
model.add(Dense(64,activation='tanh'))
model.add(Dense(1,activation='sigmoid'))
model.compile(optimizer='adam', loss='mae')
gives me my best results, converging to a best loss of around 7e-4 on my validation data when the batch size is 1000. Adding or removing neurons or layers seems to reduce the accuracy. Dropout regularisation also reduces accuracy. I am currently using 1e7 training samples, which took two days to generate (hence the desire to approximate this function). I would like to reduce the MAE by another order of magnitude. Does anyone have any suggestions on how to do this?
I recommend using the Keras callbacks ReduceLROnPlateau (documentation [here][1]) and ModelCheckpoint (documentation [here][2]). For the first, set it to monitor validation loss; it will reduce the learning rate by a multiplicative factor (factor) if the loss fails to decrease after a fixed number (patience) of consecutive epochs. For the second, also monitor validation loss and save the weights of the model with the lowest validation loss to a directory. After training, load the weights and use them to evaluate or predict on the test set. My code implementation is shown below.
checkpoint=tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1, save_best_only=True,
save_weights_only=True, mode='auto', save_freq='epoch', options=None)
lr_adjust=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5, patience=1, verbose=1, mode="auto",
min_delta=0.00001, cooldown=0, min_lr=0)
callbacks=[checkpoint, lr_adjust]
history = model.fit_generator( train_generator, epochs=EPOCHS,
steps_per_epoch=STEPS_PER_EPOCH,validation_data=validation_generator,
validation_steps=VALIDATION_STEPS, callbacks=callbacks)
model.load_weights(save_loc) # load the saved weights
# after this use the model to evaluate or predict on the test set.
# if you are satisfied with the results you can then save the entire model with
model.save(save_loc)
[1]: https://keras.io/api/callbacks/reduce_lr_on_plateau/
[2]: https://keras.io/api/callbacks/model_checkpoint/
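
As a rough illustration of how the factor, patience and min_delta settings above interact, here is a simplified plain-Python simulation of a reduce-on-plateau schedule. It is a sketch under stated assumptions (no cooldown, made-up loss values, a hypothetical starting learning rate of 1e-3), not the Keras source.

```python
def simulate_reduce_lr(val_losses, lr, factor=0.5, patience=1,
                       min_delta=1e-5, min_lr=0.0):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best - min_delta:
            # Improvement larger than min_delta: reset the counter.
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: shrink the learning rate, never below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


val_losses = [1.0, 0.9, 0.95, 0.94, 0.8]  # made-up values
lr_history = simulate_reduce_lr(val_losses, lr=1e-3)
print(lr_history)
```

With patience=1, a single epoch without improvement already halves the learning rate, so noisy validation losses can drive the rate down quickly. A larger patience makes the schedule less sensitive to noise.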

Why do you need to compile a Keras / TensorFlow model that was loaded from a saved copy?

I understand how you save model and weights files and load them back up. I don't understand why compilation with an optimizer is needed:
loaded_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
score = loaded_model.evaluate(X_test, Y_test, verbose=0)
Since you are no longer doing gradient descent, why do you need the loss function and optimizer?