The validation loss of my LSTM model is very volatile - tensorflow

I'm creating an LSTM model to predict the closing price of Bitcoin. However, once training starts, the validation loss becomes very volatile and my test predictions turn out inaccurate.
Here's my model:
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(80, input_shape=(1, look_back), return_sequences=True))  # return sequences so the stacked LSTM receives 3D input
model.add(LSTM(60))
model.add(Dense(1))  # single output: the predicted closing price
model.compile(optimizer='adam', loss='mean_squared_error')
Fitting the model:
from keras.callbacks import ModelCheckpoint

callbacks = [ModelCheckpoint(save_best_only=True, filepath='btc_close_prediction.h5')]
history = model.fit(xTrain, yTrain, batch_size=10, epochs=30, callbacks=callbacks, validation_split=0.2)
Loss graph: [image]
Prediction plot: [image]
Please advise how I can adjust my model for a smoother val_loss and better prediction accuracy.

Your validation dataset should comprise around 5,000 samples for the validation loss to become smooth.
Alternatively, try a Transformer model; it can require less training data.
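For illustration, here is a minimal sketch of a Transformer-style encoder for the same regression task. It assumes the input series is reshaped to (look_back, 1) instead of (1, look_back); the layer sizes and head count are untuned assumptions, not recommended values.

import tensorflow as tf
from tensorflow.keras import layers

def build_transformer(look_back, d_model=64, num_heads=4):
    # Hypothetical builder; shapes follow the (look_back, 1) assumption above.
    inputs = layers.Input(shape=(look_back, 1))
    x = layers.Dense(d_model)(inputs)  # project each timestep to the model width
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)(x, x)
    x = layers.LayerNormalization()(x + attn)  # residual connection + layer norm
    ff = layers.Dense(d_model, activation='relu')(x)
    x = layers.LayerNormalization()(x + ff)  # second residual block
    x = layers.GlobalAveragePooling1D()(x)  # pool over timesteps
    outputs = layers.Dense(1)(x)  # predicted closing price
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model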

Related

Validation loss reported seems to be wrong, can preprocessing be the reason?

I'm training a ResNet model with Keras, fine-tuned on my own images. While training, TensorBoard constantly reports a validation loss that seems unrelated to the training loss (much higher; see the image below, where train is the orange line and validation the blue line). Furthermore, when training is finished (for example, with final losses as reported by TensorBoard of 0.06 and 0.57 respectively), I evaluate the model "manually" and the validation loss turns out to be in the same range as the training loss (e.g. 0.07).
I suspect that preprocessing could be the reason for this strange result. Essentially, the inputs and outputs of the model are created like this:
import tensorflow as tf

inp = tf.keras.Input(input_shape)
resnet = tf.keras.applications.ResNet50V2(include_top=False, input_shape=input_shape, input_tensor=inp, pooling="avg")
# Add the ResNet50V2-specific preprocessing into the model itself.
preprocessed = tf.keras.layers.Lambda(lambda x: tf.keras.applications.resnet_v2.preprocess_input(x))(inp)
out = resnet(preprocessed)
out = tf.keras.layers.Dense(num_outputs, activation=None)(out)
and the training :
model.compile(
    optimizer=tf.keras.optimizers.Adam(lrate),
    loss='mse',
    metrics=[tf.keras.metrics.MeanSquaredError()],
)
model.fit(
    train_dataset,
    epochs=epochs,
    validation_data=val_dataset,
    callbacks=callbacks,
)
It seems as if the preprocessing does not happen when the validation loss is calculated, but I don't know why.
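For comparison, a minimal sketch of the kind of "manual" evaluation described above might look like the following; it assumes val_dataset yields (images, targets) batches, and the per-batch averaging is a simplification.

import numpy as np

losses = []
for images, targets in val_dataset:
    preds = model(images, training=False)  # full forward pass, Lambda preprocessing included
    losses.append(np.mean((preds.numpy() - targets.numpy()) ** 2))
print("manual validation MSE:", np.mean(losses))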

Keras - train and validation plotting

I have a project to detect faces using tiny YOLOv1 with Keras and TensorFlow, and I have to train the model from scratch. When I train the model using the dataset https://pixeldrain.com/u/wnUrWG2k, the loss value does not decrease much in each epoch, and when I plot y_pred and y_validation they are both straight lines.
model = Model(inputs=inputs, outputs=yolo_outputs)
model.compile(loss=yolo_loss, optimizer='adam')
history = model.fit(X_train, y_train, validation_split=0.33, epochs=5, batch_size=10)
This is my model loss plot: [image]
Did you normalize the input features, and did you double-check your learning rate? A minimal sketch of both checks follows.
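For example (reusing X_train, y_train, and yolo_loss from the question; the scaling constant and learning rate are illustrative assumptions):

from tensorflow.keras.optimizers import Adam

# Scale 8-bit pixel values into [0, 1] before training.
X_train = X_train.astype('float32') / 255.0

# Try an explicit, smaller learning rate than the Adam default of 1e-3.
model.compile(loss=yolo_loss, optimizer=Adam(learning_rate=1e-4))
history = model.fit(X_train, y_train, validation_split=0.33, epochs=5, batch_size=10)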

Keras model not learning and predicting only one class out of three classes

I'm new to the field of deep learning and currently working on this competition for predicting earthquake damage to buildings.
The model I created starts at an accuracy of 0.56 and stays there no matter how many epochs I let it run. When finished, the model only predicts one of the three classes (which I one-hot encoded into a dataframe with three columns). Changing the number of layers, the optimizer, the data preparation, or the dropout doesn't change anything. Even trying to overfit the model by over-parameterizing the network still yields the same accuracy and a non-learning model.
What am I doing wrong?
This is my code:
model = keras.models.Sequential()
model.add(keras.layers.Dense(64, input_dim=85, activation="relu"))
model.add(keras.layers.Dropout(0.3))  # Dropout layers must be passed to model.add(), not just instantiated
model.add(keras.layers.Dense(128, activation="relu"))
model.add(keras.layers.Dropout(0.3))
model.add(keras.layers.Dense(256, activation="relu"))
model.add(keras.layers.Dropout(0.3))
model.add(keras.layers.Dense(512, activation="relu"))
model.add(keras.layers.Dense(3, activation="softmax"))
adam = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(traindata, trainlabels,
                    epochs=5,
                    validation_split=0.2,
                    verbose=1)
There's nothing visually wrong with your model, but it may be too heavy to learn any useful features.
Try normalizing your input with https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html (a sketch follows below).
Start with only 2 layers and a small number of neurons.
Increase batch_size and try learning-rate scheduling.
Observe the validation accuracy and stop when it starts to overfit.
Remember that for a 3-class classification, 56% accuracy is better than the baseline; this is a competition, so the data is not dummy playground data on which you can expect 90% accuracy from an MLP on the first try.
Finally, try hyperparameter optimization with a tuner such as Keras Tuner.
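A minimal sketch of the normalization and learning-rate-scheduling suggestions, assuming traindata is the (n_samples, 85) feature matrix from the question (the batch size and scheduler settings are illustrative):

from sklearn.preprocessing import StandardScaler
from keras.callbacks import ReduceLROnPlateau

# Standardize each of the 85 features to zero mean and unit variance.
scaler = StandardScaler()
traindata = scaler.fit_transform(traindata)

# Halve the learning rate whenever the validation loss plateaus.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2)
history = model.fit(traindata, trainlabels,
                    epochs=5,
                    batch_size=64,  # larger than the Keras default of 32
                    validation_split=0.2,
                    callbacks=[reduce_lr],
                    verbose=1)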

The reason that the training loss does not change at all

I have built the following model to perform sequence prediction:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, GlobalAveragePooling1D, Dropout, Dense

model = Sequential()
model.add(Conv1D(64, 12, activation='relu', input_shape=(64, 1), padding='causal'))
model.add(Conv1D(64, 12, activation='relu', padding='causal'))
model.add(MaxPooling1D(2))
model.add(Conv1D(128, 12, activation='relu', padding='causal'))
model.add(Conv1D(128, 12, activation='relu', padding='causal'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(dense_expansion, activation='relu'))  # dense_expansion is defined elsewhere
model.add(Dense(1, activation='relu'))
model.compile(loss=loss_function, optimizer=optimizer, metrics=['mse', 'mae'])
model.fit(X_train, Y_train, batch_size=batch_size, validation_data=(X_val, Y_val), epochs=nr_of_epochs, verbose=2)
The model architecture is shown here: [image]
However, the training result shows essentially no change; what are the possible reasons? The training data has shape (1496000, 64, 1) with targets of shape (1496000, 1); the validation data has shape (374000, 64, 1) with targets of shape (374000, 1).

Keras model.load_weights(WEIGHTS) provides inaccurate results

I'm training an LSTM RNN for description generation using Keras (TensorFlow backend) with the MSCOCO dataset. While training, the model reached 92% accuracy with a loss of 0.79, and when I tested description generation at each epoch, it produced very good predictions with a meaningful description when given a random seed word.
However, after training, I loaded the model using the model.load_weights(WEIGHTS) method in Keras and tried to create a description by giving a random word as before. But now the model does not produce a meaningful description; it just outputs random words that have no meaning at all.
Can anyone tell me what could be the issue?
My model parameters are:
10 LSTM layers, Learning rate: 0.04, Activation: Softmax, Loss Function: Categorical Cross entropy, Optimizer: rmsprop
UPDATE:
This is my model:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation, Dropout, TimeDistributed

model = Sequential()
model.add(LSTM(HIDDEN_DIM, input_shape=(None, VOCAB_SIZE), return_sequences=True))
for i in range(LAYER_NUM - 1):
    model.add(LSTM(HIDDEN_DIM, return_sequences=True))
model.add(TimeDistributed(Dense(VOCAB_SIZE)))
model.add(Activation('softmax'))
model.add(Dropout(0.04))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=['accuracy'])
This is how I train & save my model weights (I generate a description at each epoch to test the accuracy):
model.fit(X, Y, batch_size=BATCH_SIZE, verbose=1, epochs=EPOCHS)
EPOCHS += 1
generate_description(model, GENERATE_LENGTH, VOCAB_SIZE, index_to_word)
model.save_weights('checkpoint_layer_{}_hidden_{}_epoch_{}.hdf5'.format(LAYER_NUM, HIDDEN_DIM, EPOCHS))
This is how I load my model (WEIGHTS = my saved model):
model.load_weights(WEIGHTS)
desc = generate_description(model, GENERATE_LENGTH, VOCAB_SIZE, index_to_word)
print(desc)
I provide a randomly generated vector to my model for testing. This is how I generate the description:
import numpy as np

def generate_description(model, length, vocab_size, index_to_word):
    index = [np.random.randint(vocab_size)]
    Y_word = [index_to_word[index[-1]]]
    X = np.zeros((1, length, vocab_size))
    for i in range(length):
        # Append the last predicted word to the next timestep
        X[0, i, :][index[-1]] = 1
        print(index_to_word[index[-1]])
        index = np.argmax(model.predict(X[:, :i + 1, :])[0], 1)
        Y_word.append(index_to_word[index[-1]])
        Y_word.append(' ')
    return ('').join(Y_word)
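Not necessarily the cause here, but worth noting: load_weights assumes the model has been rebuilt with exactly the same architecture and layer order before loading. A minimal sketch of the full-model alternative, which stores the architecture along with the weights (the file name is illustrative):

from keras.models import load_model

# Save the full model: architecture, weights, and optimizer state.
model.save('checkpoint_full_model.hdf5')

# Later, restore it without rebuilding the architecture by hand.
model = load_model('checkpoint_full_model.hdf5')
desc = generate_description(model, GENERATE_LENGTH, VOCAB_SIZE, index_to_word)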