I have a question about the EDSR code posted on keras.io. Why is there a contradiction between val_loss and val_PSNR? From one epoch to the next, when val_loss decreases we expect val_PSNR to increase, but in some epochs this does not happen, and vice versa. Is this normal? What is the reason? If not, what is the solution?
I tried to implement the PSNR metric and the loss function so that they would be consistent with each other, but the problem was not solved.
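For context, here is a toy NumPy illustration (my own sketch, not the keras.io code) of why an L1-style loss and PSNR need not move together: PSNR is computed from the mean squared error, so a prediction with a few large errors can have a lower mean absolute error yet a worse PSNR.

```python
import numpy as np

def psnr(y_true, y_pred, max_val=1.0):
    # PSNR is defined via the mean SQUARED error, not the absolute error.
    mse = np.mean((y_true - y_pred) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

target = np.zeros(4)
pred_a = np.full(4, 0.4)                 # many small errors
pred_b = np.array([0.9, 0.0, 0.0, 0.0])  # one large error

for name, pred in [("A", pred_a), ("B", pred_b)]:
    mae = np.mean(np.abs(target - pred))
    print(name, "MAE:", round(mae, 3), "PSNR:", round(psnr(target, pred), 2), "dB")

# A -> MAE 0.400, PSNR ~7.96 dB
# B -> MAE 0.225, PSNR ~6.94 dB   (lower MAE "loss", yet lower PSNR)
```

If the training loss is an L1/MAE-style loss while the metric is PSNR, small epoch-to-epoch disagreements between the two curves are therefore expected and not necessarily a sign that something is broken.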
I have a time-series dataset and I trained an LSTM on it. I trained for 200 epochs, and the resulting loss and val_loss values are pretty good (IMO).
Then I thought the result could be even better with more epochs, so I retrained using 400 epochs, but the loss and val_loss started rising.
Somehow the result is different and even becomes worse. Is it better to stick with the 200-epoch model, or is there really a situation where more epochs can worsen the model?
This is probably because your learning rate (lr) is too large. You could try reducing it. From the graph, the training loss increased as well, so I don't think this is a case of overfitting.
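As a minimal sketch of that suggestion (assuming a Keras LSTM model compiled with the Adam optimizer; the model and values are placeholders, not your actual code):

```python
from tensorflow import keras

# Hypothetical stand-in for your LSTM model.
model = keras.Sequential([
    keras.layers.Input(shape=(100, 1)),  # 100 time steps, 1 feature (placeholder shape)
    keras.layers.LSTM(64),
    keras.layers.Dense(1),
])

# Use a smaller learning rate than the Keras Adam default of 1e-3.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4), loss="mse")

# Optionally shrink the learning rate further whenever val_loss stops improving.
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5)

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=400, callbacks=[reduce_lr])
```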
There are two models that have the same structure and are trained on the same dataset.
Both models record the same lowest validation loss value.
The difference is as follows.
The first model has no regularization and stops training at epoch 15 (the epoch where the lowest validation loss occurs).
The second model has regularization and stops training at epoch 25 (again, the epoch where the lowest validation loss occurs).
As I wrote above, the key point is that both models record the same lowest validation loss value.
In this case, which model is better?
If you used your validation data during training (for selecting hyperparameters), then I highly suggest holding out a separate test set; that will give you the answer. Otherwise, your validation set is effectively your test set, and when you say both models reach the same test loss you are saying their performance is the same. (If you are not sure your test set is a good representation of real-world performance, make your test set better, e.g. make it bigger.) If you want to use one of them for an important task, I suggest running cross-validation to see which of them is better (see the sketch at the end of this answer).
Note that regularization is used to reduce overfitting in order to achieve better validation and test results. Once training is done and you want to use the model, it really doesn't matter whether it used regularization or not; only its performance and speed matter.
In summary, from what you have said, I see no reason to prefer one model over the other.
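A rough sketch of that cross-validation comparison (assuming scikit-learn's KFold, NumPy arrays x and y, and two hypothetical factory functions, build_model_1 and build_model_2, that rebuild each architecture from scratch and compile it with a loss only):

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_val_loss(build_fn, x, y, n_splits=5, epochs=25):
    """Mean validation loss of a freshly built model over K folds."""
    losses = []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(x):
        model = build_fn()  # rebuild from scratch so the folds are independent
        model.fit(x[train_idx], y[train_idx], epochs=epochs, verbose=0)
        losses.append(model.evaluate(x[val_idx], y[val_idx], verbose=0))
    return float(np.mean(losses))

# score_1 = cross_val_loss(build_model_1, x, y)  # without regularization
# score_2 = cross_val_loss(build_model_2, x, y)  # with regularization
# Whichever has the lower mean loss generalizes better on this dataset.
```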
What I would suggest is to go for the second model, i.e. the model with regularization that stops training at epoch 25.
The reason is that, since it uses a regularization technique, the chance of overfitting the training data is much lower.
As per my understanding, the loss should decrease after each epoch, but why does the loss value decrease after each step within an epoch while training a model? Does backpropagation happen only once per epoch, or does it happen at each step within an epoch?
This is probably a silly question, but I couldn't find an answer anywhere. If there is already a question about this, please post the link.
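For reference, in standard mini-batch training the gradients are computed and the weights are updated once per batch, not once per epoch, which is why the reported loss changes after every step. A generic sketch of one epoch (placeholder names, not tied to any particular model):

```python
import tensorflow as tf

def run_one_epoch(model, optimizer, loss_fn, dataset):
    """One pass over the data; `model`, `optimizer`, `loss_fn` and `dataset`
    are placeholders for your own objects."""
    for step, (x_batch, y_batch) in enumerate(dataset):
        with tf.GradientTape() as tape:
            preds = model(x_batch, training=True)
            loss = loss_fn(y_batch, preds)
        # Backpropagation happens here, once PER BATCH: gradients are computed
        # and applied at every step, so the displayed loss can change
        # (usually decrease) after each step within the epoch.
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
```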
For research, I'm working on face anti-spoofing. I have the rose-youtu dataset, whose specs can be found here in detail. The problem is that no matter what architecture I use for the model, the validation loss will not drop below ~0.5. I tried different architectures (3D CNN, 2D CNN, fine-tuning, Inception) with different combinations of regularization (Dropout, L1, L2, L1_L2), but val_loss always ends up at ~0.5, and when I evaluate on the test set I get ~0.7 loss and ~0.85 accuracy. Note that there is no overfitting happening: loss and val_loss stay close and keep converging until they reach ~0.3, after which the loss keeps decreasing while val_loss fluctuates in the range of ~0.4 to 0.6.
What is it that I'm not considering? Does it mean that the results on this dataset cannot be improved any further?
I am training a model using the author's original learning rate (I use their GitHub code too), and I get a validation loss that oscillates a lot: it decreases, then suddenly jumps to a large value, then decreases again, but it never really converges, since the lowest it gets is 2 (while the training loss converges to 0.0-something, far below 1).
At each epoch I get the training accuracy and, at the end of the epoch, the validation accuracy. The validation accuracy is always greater than the training accuracy.
When I test on real test data, I get good results, but I wonder if my model is overfitting. I would expect a good model's validation loss to converge in a similar fashion to the training loss, but this doesn't happen, and the fact that the validation loss sometimes oscillates to very large values worries me.
After adjusting the learning rate, the scheduler, and so on, I got the validation loss and training loss trending downward with less oscillation, but this time my test accuracy stayed low (as did the training and validation accuracies).
I did try a couple of optimizers (Adam, SGD, Adagrad) with a step scheduler and also PyTorch's plateau scheduler, and I played with step sizes, etc., but it didn't really help, and neither did gradient clipping.
Is my model overfitting?
If so, how can I reduce the overfitting besides data augmentation?
If not (I read some people on Quora saying it is nothing to worry about, though I would think it must be overfitting), how can I justify it? Even if I got similar results in a k-fold experiment, would that be good enough? I don't feel it would justify the oscillation. How should I proceed?
The training loss at each epoch is usually computed on the entire training set.
The validation loss at each epoch is usually computed on one minibatch of the validation set, so it is normal for it to be noisier.
Solution: you can report the exponential moving average (EMA) of the validation loss across epochs to reduce the fluctuations, as in the sketch below.
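A minimal sketch of that smoothing in plain Python (assuming you collect one val_loss value per epoch; the smoothing factor 0.9 is an arbitrary choice):

```python
def smooth_ema(values, alpha=0.9):
    """Exponential moving average: each point blends the running average with
    the current value, which damps epoch-to-epoch spikes in the curve."""
    smoothed, running = [], None
    for v in values:
        running = v if running is None else alpha * running + (1 - alpha) * v
        smoothed.append(running)
    return smoothed

# val_losses = [...]                  # e.g. history.history["val_loss"] in Keras,
# plt.plot(smooth_ema(val_losses))    # or your own per-epoch list in PyTorch
```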
It is not overfitting since your validation accuracy is not less than the training accuracy. In fact, it sounds like your model is underfitting since your validation accuracy > training accuracy.