Training a Keras model for object location - tensorflow

I am trying to use Keras to train a model that locates a certain mountain peak in the images of a moving wide-angle camera. This is my training data; I have 329 labeled images:
The position does not change, so except for lens distortion and rotation, it always looks the same. Piece of cake for a human.
These plots are generated from my training data generator function to rule out any issues in that regard:
train_generator = traingui_generator([], frames_labeled, labels, batch_size)
im,la = next(train_generator)
fig = plt.figure(figsize=(16,10))
for n, (i,l) in enumerate(zip(im,la)):
    a = plt.subplot(2,2,n+1)
    plt.imshow(i)
    plt.plot(l[0],l[1],'r+')
    plt.xlim((1440,2880))
    plt.ylim((0,1440))
    a.invert_yaxis()
plt.show()
With the help of the Keras documentation examples, I came up with this model:
from keras import Input
from keras.models import Model
from keras.optimizers import Adam
from keras.layers import Cropping2D, Rescaling, Conv2D, MaxPooling2D, Flatten, Dense, AveragePooling2D,Resizing
inputs = Input(shape=input_shape)
x = Cropping2D(((0,0),(1440,0)))(inputs)
x = Rescaling(scale=1.0 / 255)(x)
x = Resizing(720,720)(x)
x = Conv2D(32, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
x = Dense(64, activation='relu')(x)
location = Dense(2)(x)
# Define the model with the input and output layers
model = Model(inputs=inputs, outputs=location)
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='mse', metrics=['accuracy'])
batch_size = 4
num_train_samples = 329
num_epochs = 10
train_generator = traingui_generator([], frames_labeled, labels, batch_size)
model.fit(train_generator, steps_per_epoch=num_train_samples // batch_size, epochs=num_epochs)
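A note on the geometry of this model: Cropping2D takes ((top, bottom), (left, right)), so the crop above removes the left 1440 columns, while the labels plotted earlier are still in full-frame pixels (x between 1440 and 2880). The sketch below only shows the bookkeeping for moving labels into the cropped square and back. It assumes the remaining region is 1440 by 1440 pixels, which is inferred from the plot limits and not confirmed in the post; the function names are hypothetical. Whether training on rescaled targets helps here is not established by anything above.

```python
# Assumed geometry, inferred from the crop and the plot limits (not confirmed):
CROP_LEFT = 1440   # columns removed by Cropping2D(((0, 0), (1440, 0)))
CROP_SIZE = 1440   # side length of the square region that remains


def label_to_unit(x, y):
    """Map a full-frame pixel label to [0, 1] coordinates of the cropped square."""
    return (x - CROP_LEFT) / CROP_SIZE, y / CROP_SIZE


def unit_to_label(u, v):
    """Inverse mapping: [0, 1] coordinates back to full-frame pixels."""
    return u * CROP_SIZE + CROP_LEFT, v * CROP_SIZE


u, v = label_to_unit(2160, 720)
print(u, v)                  # 0.5 0.5
print(unit_to_label(u, v))   # (2160.0, 720.0)
```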
This is my learning process:
Epoch 1/20
27/27 [==============================] - 87s 3s/step - loss: 1126492.5000 - accuracy: 0.9877
Epoch 2/20
27/27 [==============================] - 87s 3s/step - loss: 750418.5000 - accuracy: 1.0000
Epoch 3/20
27/27 [==============================] - 87s 3s/step - loss: 702527.0625 - accuracy: 1.0000
Epoch 4/20
27/27 [==============================] - 87s 3s/step - loss: 651334.3125 - accuracy: 1.0000
Epoch 5/20
27/27 [==============================] - 87s 3s/step - loss: 591387.7500 - accuracy: 1.0000
Epoch 6/20
27/27 [==============================] - 88s 3s/step - loss: 495730.0625 - accuracy: 0.9722
Epoch 7/20
27/27 [==============================] - 87s 3s/step - loss: 322107.7500 - accuracy: 0.8981
Epoch 8/20
27/27 [==============================] - 87s 3s/step - loss: 213287.5312 - accuracy: 0.8981
Epoch 9/20
27/27 [==============================] - 87s 3s/step - loss: 151553.3281 - accuracy: 0.9475
Epoch 10/20
27/27 [==============================] - 87s 3s/step - loss: 114601.3828 - accuracy: 0.9506
Epoch 11/20
27/27 [==============================] - 87s 3s/step - loss: 96194.7031 - accuracy: 0.9475
Epoch 12/20
27/27 [==============================] - 88s 3s/step - loss: 69348.9922 - accuracy: 0.9321
Epoch 13/20
27/27 [==============================] - 87s 3s/step - loss: 65372.2852 - accuracy: 0.9475
Epoch 14/20
27/27 [==============================] - 88s 3s/step - loss: 58215.0547 - accuracy: 0.9043
Epoch 15/20
27/27 [==============================] - 87s 3s/step - loss: 57038.0078 - accuracy: 0.9475
Epoch 16/20
27/27 [==============================] - 87s 3s/step - loss: 47969.5234 - accuracy: 0.9660
Epoch 17/20
27/27 [==============================] - 87s 3s/step - loss: 45780.5820 - accuracy: 0.9383
Epoch 18/20
27/27 [==============================] - 88s 3s/step - loss: 39562.1836 - accuracy: 0.9660
Epoch 19/20
27/27 [==============================] - 87s 3s/step - loss: 51684.8164 - accuracy: 0.9537
Epoch 20/20
27/27 [==============================] - 88s 3s/step - loss: 45646.8398 - accuracy: 0.9815
I notice that the loss function is quite often increasing during the epochs and the accuracy isn't stable, so I already went down with the learning_rate by one order of magnitude.
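For scale: the loss is a mean squared error in pixel units, so its square root gives a rough error per coordinate. A minimal sketch using the first and last loss values from the log above (the accuracy column is a classification metric and says little about pixel error):

```python
import math


def rmse_from_mse(mse):
    """Root mean squared error, in the same units as the labels (pixels here)."""
    return math.sqrt(mse)


first_epoch = rmse_from_mse(1126492.5)
last_epoch = rmse_from_mse(45646.8398)
print(round(first_epoch, 1))  # 1061.4
print(round(last_epoch, 1))   # 213.7
```

So even after 20 epochs the predictions are, on average, still roughly 200 pixels off per coordinate.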
Unfortunately, the result is not particularly good:
These plots are generated with the same piece of code as above, just replacing la with the output of model.predict(im).
Note that this is on the training data, so I would rather rule out overfitting, if I understand correctly.
I have also tried just two convolutional layers, without success.
Now I read that I should modify the model, but I am lacking guidance in what direction.
Should I just play around randomly, each time waiting for it to finish? Where would I start to adjust? Filter number? Kernel size of convolution or pooling? Number of layers? Number of epochs? Is the training data even enough?
Or is there something fundamentally wrong in what I am doing?
The size of the image is unfortunately not really open for discussion. I would even like to remove the resize, because there will be other, smaller features that I would like to detect, and add the other half of the sphere, such that my final data would have 1440x2880 pixels.
If necessary, I would have access to quite powerful hardware, though, if RAM is an issue, and I would not have too much of a problem if the training took hours or more, if the problem REALLY requires it.
I would really appreciate it if someone experienced could give me a push in the right direction for this concrete problem. I have no idea what to expect.
Edit: Given that my images are much larger than most examples and I read that the first layers are meant to capture low-frequency features, I increased the kernel size of the first layers:
x = Conv2D(32, (15, 15), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(64, (5, 5), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(128, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
If anything, this made things worse.

Related

Fine tuning in CNN using Tensor Flow - 2.0

I am currently working on a defect classification problem for solar panels. It's a multi-class classification problem, currently with 3 classes. I have done the coding part, but my accuracy is very low. How can I improve my accuracy?
Total training images - 900
Testing/validation - 300
Class - 3
My code is given below -
import tensorflow as tf
import keras_preprocessing
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator
TRAINING_DIR = "/content/drive/My Drive/solar_images/solar_images/train/"
training_datagen = ImageDataGenerator(
    rescale = 1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
VALIDATION_DIR = "/content/drive/My Drive/solar_images/solar_images/test/"
validation_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = training_datagen.flow_from_directory(
    TRAINING_DIR,
    target_size=(150,150),
    class_mode='categorical',
    batch_size=64
)
validation_generator = validation_datagen.flow_from_directory(
    VALIDATION_DIR,
    target_size=(150,150),
    class_mode='categorical',
    batch_size=64
)
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 150x150 with 3 bytes color
    # This is the first convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The fourth convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
model.summary()
model.compile(loss = 'categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
batch_size=64
history = model.fit(train_generator,
                    epochs=20,
                    steps_per_epoch=int(894/batch_size),
                    validation_data = validation_generator,
                    verbose = 1,
                    validation_steps=int(289/batch_size))
model.save("solar_images_weight.h5")
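As a rough check of the step counts used in the fit call above, the sketch below repeats the same integer arithmetic. It assumes every batch holds a full 64 images, which the last batch of a pass over the directory need not; the helper name is hypothetical.

```python
def full_steps(num_samples, batch_size):
    """Number of steps as computed in the fit call: int(num_samples / batch_size)."""
    return int(num_samples / batch_size)


batch_size = 64
train_steps = full_steps(894, batch_size)
val_steps = full_steps(289, batch_size)
print(train_steps, val_steps)    # 13 4

# Images that contribute to one validation pass under the full-batch assumption
val_images_used = val_steps * batch_size
print(val_images_used)           # 256
print(1 / val_images_used)       # 0.00390625, the accuracy change from one image
```

With only about 256 validation images per pass, a handful of images flipping class moves val_accuracy by a few percent, so some of the epoch-to-epoch jitter in that column may simply be noise.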
My accuracy is -
Epoch 1/20
13/13 [==============================] - 1107s 85s/step - loss: 1.2893 - accuracy: 0.3470 - val_loss: 1.0926 - val_accuracy: 0.3594
Epoch 2/20
13/13 [==============================] - 1239s 95s/step - loss: 1.1037 - accuracy: 0.3566 - val_loss: 1.0954 - val_accuracy: 0.3125
Epoch 3/20
13/13 [==============================] - 1203s 93s/step - loss: 1.0964 - accuracy: 0.3904 - val_loss: 1.0841 - val_accuracy: 0.5625
Epoch 4/20
13/13 [==============================] - 1182s 91s/step - loss: 1.0980 - accuracy: 0.3750 - val_loss: 1.0894 - val_accuracy: 0.3633
Epoch 5/20
13/13 [==============================] - 1218s 94s/step - loss: 1.1086 - accuracy: 0.3386 - val_loss: 1.0874 - val_accuracy: 0.3125
Epoch 6/20
13/13 [==============================] - 1214s 93s/step - loss: 1.0953 - accuracy: 0.3257 - val_loss: 1.0763 - val_accuracy: 0.6094
Epoch 7/20
13/13 [==============================] - 1136s 87s/step - loss: 1.0851 - accuracy: 0.3831 - val_loss: 1.0754 - val_accuracy: 0.3164
Epoch 8/20
13/13 [==============================] - 1170s 90s/step - loss: 1.1005 - accuracy: 0.3940 - val_loss: 1.0545 - val_accuracy: 0.5039
Epoch 9/20
13/13 [==============================] - 1138s 88s/step - loss: 1.1294 - accuracy: 0.4337 - val_loss: 1.0130 - val_accuracy: 0.5703
Epoch 10/20
13/13 [==============================] - 1131s 87s/step - loss: 1.0250 - accuracy: 0.4531 - val_loss: 0.8911 - val_accuracy: 0.6055
Epoch 11/20
13/13 [==============================] - 1162s 89s/step - loss: 1.0243 - accuracy: 0.4735 - val_loss: 0.9160 - val_accuracy: 0.4727
Epoch 12/20
13/13 [==============================] - 1153s 89s/step - loss: 0.9978 - accuracy: 0.4783 - val_loss: 0.7754 - val_accuracy: 0.6406
Epoch 13/20
13/13 [==============================] - 1187s 91s/step - loss: 1.0080 - accuracy: 0.4687 - val_loss: 0.7701 - val_accuracy: 0.6602
Epoch 14/20
13/13 [==============================] - 1204s 93s/step - loss: 0.9851 - accuracy: 0.5048 - val_loss: 0.7450 - val_accuracy: 0.6367
Epoch 15/20
13/13 [==============================] - 1181s 91s/step - loss: 0.9699 - accuracy: 0.4892 - val_loss: 0.7409 - val_accuracy: 0.6289
Epoch 16/20
13/13 [==============================] - 1187s 91s/step - loss: 0.8884 - accuracy: 0.5241 - val_loss: 0.7169 - val_accuracy: 0.6133
Epoch 17/20
13/13 [==============================] - 1197s 92s/step - loss: 0.9372 - accuracy: 0.5084 - val_loss: 0.7464 - val_accuracy: 0.5859
Epoch 18/20
13/13 [==============================] - 1224s 94s/step - loss: 0.9230 - accuracy: 0.5229 - val_loss: 0.9198 - val_accuracy: 0.5156
Epoch 19/20
13/13 [==============================] - 1270s 98s/step - loss: 0.9161 - accuracy: 0.5192 - val_loss: 0.6785 - val_accuracy: 0.6289
Epoch 20/20
13/13 [==============================] - 1173s 90s/step - loss: 0.8728 - accuracy: 0.5193 - val_loss: 0.6674 - val_accuracy: 0.5781
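For reference, a classifier that assigns equal probability to each of K classes has a categorical cross-entropy of ln K. A small sketch of that arithmetic for the 3 classes here; a training loss hovering near this value, as in the first epochs above, suggests the predictions are still close to uniform:

```python
import math


def chance_level_loss(num_classes):
    """Cross-entropy of a uniform prediction over num_classes classes."""
    return math.log(num_classes)


print(round(chance_level_loss(3), 4))  # 1.0986
```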
Training and validation accuracy plot is given below -
You could use transfer learning: start from a pre-trained model such as MobileNet or Inception and train it further on your dataset. This will likely improve your accuracy significantly.

Validation Loss Increases every iteration

Recently I have been trying to do multi-class classification. My dataset consists of 17 image categories. Previously I was using 3 conv layers and 2 hidden layers. This resulted in my model overfitting, with a huge validation loss of around 11.0 and very low validation accuracy. So I decided to remove one conv layer and one hidden layer. I have also removed dropout, but the problem remains: the validation results still show overfitting, even though my training accuracy and loss keep getting better.
Here is my code for preparing the dataset:
import cv2
import numpy as np
import os
import pickle
import random
CATEGORIES = ["apple_pie", "baklava", "caesar_salad","donuts",
              "fried_calamari", "grilled_salmon", "hamburger",
              "ice_cream", "lasagna", "macaroni_and_cheese", "nachos", "omelette","pizza",
              "risotto", "steak", "tiramisu", "waffles"]
DATALOC = "D:/Foods/Datasets"
IMAGE_SIZE = 50
data_training = []
def create_data_training():
    for category in CATEGORIES:
        path = os.path.join(DATALOC, category)
        class_num = CATEGORIES.index(category)
        for image in os.listdir(path):
            try:
                image_array = cv2.imread(os.path.join(path,image), cv2.IMREAD_GRAYSCALE)
                new_image_array = cv2.resize(image_array, (IMAGE_SIZE,IMAGE_SIZE))
                data_training.append([new_image_array,class_num])
            except Exception as exc:
                pass
create_data_training()
random.shuffle(data_training)
X = []
y = []
for features, label in data_training:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMAGE_SIZE, IMAGE_SIZE, 1)
y = np.array(y)
pickle_out = open("X.pickle", "wb")
pickle.dump(X, pickle_out)
pickle_out.close()
pickle_out = open("y.pickle", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()
pickle_in = open("X.pickle","rb")
X = pickle.load(pickle_in)
Here is the code of my model:
import pickle
import tensorflow as tf
import time
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.layers import Activation, Conv2D, Dense, Dropout, Flatten, MaxPooling2D
NAME = "Foods-Model-{}".format(int(time.time()))
tensorboard = TensorBoard(log_dir='logs/{}'.format(NAME))
X = pickle.load(open("X.pickle","rb"))
y = pickle.load(open("y.pickle","rb"))
X = X/255.0
model = Sequential()
model.add(Conv2D(32,(3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size =(2,2)))
model.add(Conv2D(64,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size =(2,2)))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(17))
model.add(Activation('softmax'))
model.compile(loss = "sparse_categorical_crossentropy", optimizer = "adam", metrics = ['accuracy'])
model.fit(X, y, batch_size = 16, epochs = 20 , validation_split = 0.1, callbacks = [tensorboard])
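A note on validation_split: Keras documents that it holds out the last fraction of the samples, taken before any per-epoch shuffling. The sketch below mimics that tail split on plain Python lists to show where the 7650/850 counts in the log come from. The total of 8500 samples is inferred from that log, and tail_split is a hypothetical helper, not a Keras function.

```python
def tail_split(samples, validation_fraction):
    """Hold out the last validation_fraction of samples, without shuffling."""
    n_val = round(len(samples) * validation_fraction)
    split_at = len(samples) - n_val
    return samples[:split_at], samples[split_at:]


train, val = tail_split(list(range(8500)), 0.1)
print(len(train), len(val))  # 7650 850
print(val[0])                # 7650, the first held-out index
```

Because the data was shuffled with random.shuffle before being pickled, the tail should contain a mix of all categories here; with data sorted by class it would not.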
The result of the trained model:
Train on 7650 samples, validate on 850 samples
Epoch 1/20
7650/7650 [==============================] - 242s 32ms/sample - loss: 2.7826 - accuracy: 0.1024 - val_loss: 2.7018 - val_accuracy: 0.1329
Epoch 2/20
7650/7650 [==============================] - 241s 31ms/sample - loss: 2.5673 - accuracy: 0.1876 - val_loss: 2.5597 - val_accuracy: 0.2059
Epoch 3/20
7650/7650 [==============================] - 234s 31ms/sample - loss: 2.3529 - accuracy: 0.2617 - val_loss: 2.5329 - val_accuracy: 0.2153
Epoch 4/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 2.0707 - accuracy: 0.3510 - val_loss: 2.6628 - val_accuracy: 0.2059
Epoch 5/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 1.6960 - accuracy: 0.4753 - val_loss: 2.8143 - val_accuracy: 0.2047
Epoch 6/20
7650/7650 [==============================] - 230s 30ms/sample - loss: 1.2336 - accuracy: 0.6247 - val_loss: 3.3130 - val_accuracy: 0.1929
Epoch 7/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.7738 - accuracy: 0.7715 - val_loss: 3.9758 - val_accuracy: 0.1776
Epoch 8/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.4271 - accuracy: 0.8827 - val_loss: 4.7325 - val_accuracy: 0.1882
Epoch 9/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.2080 - accuracy: 0.9519 - val_loss: 5.7198 - val_accuracy: 0.1918
Epoch 10/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.1402 - accuracy: 0.9668 - val_loss: 6.0608 - val_accuracy: 0.1835
Epoch 11/20
7650/7650 [==============================] - 236s 31ms/sample - loss: 0.0724 - accuracy: 0.9872 - val_loss: 6.7468 - val_accuracy: 0.1753
Epoch 12/20
7650/7650 [==============================] - 232s 30ms/sample - loss: 0.0549 - accuracy: 0.9895 - val_loss: 7.4844 - val_accuracy: 0.1718
Epoch 13/20
7650/7650 [==============================] - 229s 30ms/sample - loss: 0.1541 - accuracy: 0.9591 - val_loss: 7.3335 - val_accuracy: 0.1553
Epoch 14/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.0477 - accuracy: 0.9905 - val_loss: 7.8453 - val_accuracy: 0.1729
Epoch 15/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.0346 - accuracy: 0.9908 - val_loss: 8.1847 - val_accuracy: 0.1753
Epoch 16/20
7650/7650 [==============================] - 231s 30ms/sample - loss: 0.0657 - accuracy: 0.9833 - val_loss: 7.8582 - val_accuracy: 0.1624
Epoch 17/20
7650/7650 [==============================] - 233s 30ms/sample - loss: 0.0555 - accuracy: 0.9830 - val_loss: 8.2578 - val_accuracy: 0.1553
Epoch 18/20
7650/7650 [==============================] - 230s 30ms/sample - loss: 0.0423 - accuracy: 0.9892 - val_loss: 8.6970 - val_accuracy: 0.1694
Epoch 19/20
7650/7650 [==============================] - 236s 31ms/sample - loss: 0.0291 - accuracy: 0.9927 - val_loss: 8.5275 - val_accuracy: 0.1882
Epoch 20/20
7650/7650 [==============================] - 234s 31ms/sample - loss: 0.0443 - accuracy: 0.9873 - val_loss: 9.2703 - val_accuracy: 0.1812
Thank You for your time. Any help and suggestion will be really appreciated.
Your model suggests early over-fitting.
Get rid of the dense layer completely and use global pooling.
from tensorflow.keras.layers import GlobalAveragePooling2D
model = Sequential()
model.add(Conv2D(32,(3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(Conv2D(64,(3,3)))
model.add(Activation("relu"))
model.add(Conv2D(128,(3,3)))
model.add(Activation("relu"))
model.add(GlobalAveragePooling2D())
model.add(Dense(17))
model.add(Activation('softmax'))
model.summary()
Use SpatialDropout2D after conv layers.
ref: https://www.tensorflow.org/api_docs/python/tf/keras/layers/SpatialDropout2D
Use early stopping to get a balanced model.
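Your output suggests categorical_crossentropy as a better-fit loss; note that it expects one-hot encoded labels, whereas sparse_categorical_crossentropy computes the same cross-entropy from the integer labels used above.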
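To illustrate the early-stopping suggestion, here is a minimal pure-Python sketch of patience-based stopping, fed with the first eight val_loss values from the question's log. The function name and the patience of 3 are assumptions for illustration; in Keras the same idea is provided by the tf.keras.callbacks.EarlyStopping callback (with monitor='val_loss' and a patience argument).

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once val_loss has failed to improve for `patience`
    consecutive epochs; best_epoch is the epoch with the lowest val_loss so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch


val_losses = [2.7018, 2.5597, 2.5329, 2.6628, 2.8143, 3.3130, 3.9758, 4.7325]
print(early_stopping_epoch(val_losses, patience=3))  # (6, 3)
```

With these numbers, training would have stopped at epoch 6 and the weights from epoch 3 would be the ones to keep.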

Keras VGG16 low validation accuracy

I built a very simple Convolutional Neural Network using the pre-trained VGG16.
I am using the Pokemon generation one dataset containing 10,000 images belonging to 149 different classes. I manually split the dataset, 0.7 for training and 0.3 for validation, into different directories.
The problem is that I am getting high training accuracy, but the validation accuracy is not very high.
The code below is the best configuration I found, using the Adam optimizer with a learning rate of 0.0001.
Can someone suggest how I can improve the performance and avoid overfitting?
Code:
import tensorflow as tf
import numpy as np
vgg_model = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape = (224,224,3))
vgg_model.trainable = False
model = tf.keras.models.Sequential()
model.add(vgg_model)
model.add(tf.keras.layers.Flatten(input_shape=vgg_model.output_shape[1:]))
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dense(149, activation='softmax'))
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0001, decay=0.0001/100), loss='categorical_crossentropy', metrics=['accuracy'])
train= tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test= tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
training_set = train.flow_from_directory('datasets/generation/train', target_size=(224,224), class_mode = 'categorical')
val_set = train.flow_from_directory('datasets/generation/test', target_size=(224,224), class_mode = 'categorical')
history = model.fit_generator(training_set, steps_per_epoch = 64, epochs = 100, validation_data = val_set, validation_steps = 64)
Here is the output every 10 epochs:
Epoch 1/100
64/64 [====================] - 57s 891ms/step - loss: 4.8707 - acc: 0.0654 - val_loss: 4.7281 - val_acc: 0.0718
Epoch 10/100
64/64 [====================] - 53s 821ms/step - loss: 2.9540 - acc: 0.4141 - val_loss: 3.2206 - val_acc: 0.3447
Epoch 20/100
64/64 [====================] - 56s 869ms/step - loss: 1.9040 - acc: 0.6279 - val_loss: 2.6155 - val_acc: 0.4577
Epoch 30/100
64/64 [====================] - 50s 781ms/step - loss: 1.2899 - acc: 0.7658 - val_loss: 2.3345 - val_acc: 0.4897
Epoch 40/100
64/64 [====================] - 53s 832ms/step - loss: 1.0192 - acc: 0.8096 - val_loss: 2.1765 - val_acc: 0.5149
Epoch 50/100
64/64 [====================] - 55s 854ms/step - loss: 0.7948 - acc: 0.8672 - val_loss: 2.1082 - val_acc: 0.5359
Epoch 60/100
64/64 [====================] - 52s 816ms/step - loss: 0.5774 - acc: 0.9106 - val_loss: 2.0673 - val_acc: 0.5435
Epoch 70/100
64/64 [====================] - 52s 811ms/step - loss: 0.4383 - acc: 0.9385 - val_loss: 2.0499 - val_acc: 0.5454
Epoch 80/100
64/64 [====================] - 56s 881ms/step - loss: 0.3638 - acc: 0.9473 - val_loss: 1.9849 - val_acc: 0.5501
Epoch 90/100
64/64 [====================] - 55s 860ms/step - loss: 0.2860 - acc: 0.9609 - val_loss: 1.9564 - val_acc: 0.5531
Epoch 100/100
64/64 [====================] - 52s 815ms/step - loss: 0.2328 - acc: 0.9697 - val_loss: 2.0334 - val_acc: 0.5615
As I can see in your output above, you are not overfitting yet, but there is a large gap between the training and validation scores. There are plenty of things you can try to improve your validation score:
You could add more training data (not always possible)
use heavier augmentation
use test-time augmentation (TTA)
and add dropout layers
Add a dropout layer like this:
model.add(tf.keras.layers.Dense(256, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(149, activation='softmax'))
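Test-time augmentation, mentioned in the list above, means predicting on several augmented views of the same image and averaging the class probabilities. Below is a minimal sketch of that averaging on plain nested lists; toy_predict is a hypothetical stand-in for a call to model.predict on a single image, and all names here are illustrative only.

```python
def identity(image):
    return image


def horizontal_flip(image):
    """Flip a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def predict_with_tta(predict_fn, image, augmentations):
    """Average the class probabilities predicted for each augmented view."""
    predictions = [predict_fn(augment(image)) for augment in augmentations]
    num_classes = len(predictions[0])
    return [sum(p[k] for p in predictions) / len(predictions)
            for k in range(num_classes)]


def toy_predict(image):
    """Hypothetical model: scores two classes by left/right half brightness."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    total = left + right
    return [left / total, right / total]


image = [[3, 1], [3, 1]]
print(toy_predict(image))                                                # [0.75, 0.25]
print(predict_with_tta(toy_predict, image, [identity, horizontal_flip]))  # [0.5, 0.5]
```

The toy model is deliberately sensitive to flipping, so the averaged result differs visibly from the single prediction; with a real model, the augmentations should be ones that do not change the true class.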

Why is the output layer simply zero at the end of the network?

I am trying to train a model that takes a 15x15 image and classifies each pixel into two classes (1/0).
This is my loss function:
smooth = 1
def tversky(y_true, y_pred):
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1-y_pred_pos))
    false_pos = K.sum((1-y_true_pos)*y_pred_pos)
    alpha = 0.5
    return (true_pos + smooth)/(true_pos + alpha*false_neg + (1-alpha)*false_pos + smooth)
def tversky_loss2(y_true, y_pred):
    return 1 - tversky(y_true,y_pred)
This is the model:
input_image = layers.Input(shape=(size, size, 1))
b2 = layers.Conv2D(128, (3,3), padding='same', activation='relu')(input_image)
b2 = layers.Conv2D(128, (3,3), padding='same', activation='relu')(b2)
b2 = layers.Conv2D(128, (3,3), padding='same', activation='relu')(b2)
output = layers.Conv2D(1, (1,1), activation='sigmoid', padding='same')(b2)
model = models.Model(input_image, output)
model.compile(optimizer='adam', loss=tversky_loss2, metrics=['accuracy'])
In the figure, the left column is the input, the middle column is the label, and the right column is the prediction, which is always zero:
The training performs really poorly:
Epoch 1/10
100/100 [==============================] - 4s 38ms/step - loss: 0.9269 - acc: 0.1825
Epoch 2/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9277 - acc: 0.0238
Epoch 3/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9276 - acc: 0.0239
Epoch 4/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9270 - acc: 0.0241
Epoch 5/10
100/100 [==============================] - 3s 30ms/step - loss: 0.9274 - acc: 0.0240
Epoch 6/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9269 - acc: 0.0242
Epoch 7/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9270 - acc: 0.0241
Epoch 8/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9271 - acc: 0.0241
Epoch 9/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9276 - acc: 0.0239
Epoch 10/10
100/100 [==============================] - 3s 29ms/step - loss: 0.9266 - acc: 0.0242
This sounds like a very imbalanced dataset with very tiny true regions. This might be hard to train indeed.
You may want to increase alpha to penalize false negatives more than false positives. Anyway, unless alpha is big enough, it's quite normal that your model first goes to all negatives in the beginning, because that is an easy way to decrease the loss.
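To see the effect of alpha on an all-negative prediction, here is a small sketch that evaluates the same Tversky formula on plain counts. The example image with 5 positive pixels is made up for illustration; predicting all zeros then gives TP = 0, FN = 5, FP = 0.

```python
def tversky_index(true_pos, false_neg, false_pos, alpha, smooth=1.0):
    """Tversky index from raw counts, same formula as in the question."""
    return (true_pos + smooth) / (
        true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth
    )


# All-negative prediction on a hypothetical image with 5 positive pixels
for alpha in (0.5, 0.7, 0.9):
    loss = 1 - tversky_index(0, 5, 0, alpha)
    print(alpha, round(loss, 3))
# 0.5 0.714
# 0.7 0.778
# 0.9 0.818
```

The larger alpha is, the more expensive it becomes for the model to miss every positive pixel.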
Now, there is a conceptual mistake in that loss regarding how Keras works. You need to keep the "samples" separate. Otherwise you are calculating the loss as if all images were one image. (Thus, it's probable that images with many positives get a reasonable result while images with few positives don't, and the loss will still treat this as a good solution.)
Fix the loss as:
def tversky(y_true, y_pred):
    y_true_pos = K.batch_flatten(y_true) #keep the batch dimension
    y_pred_pos = K.batch_flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos, axis=-1) #don't sum over the batch dimension
    false_neg = K.sum(y_true_pos * (1-y_pred_pos), axis=-1)
    false_pos = K.sum((1-y_true_pos)*y_pred_pos, axis=-1)
    alpha = 0.5
    return (true_pos + smooth)/(true_pos + alpha*false_neg + (1-alpha)*false_pos + smooth)
This way you have an individual loss value for each image, so the existence of images with many positives doesn't affect the results of images with few positives.
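The difference between the pooled and the per-image computation can be sketched in plain Python on a made-up batch of two flattened images: one with many positives that is predicted perfectly, and one whose single positive pixel is missed. This only illustrates the arithmetic and is not the Keras code itself.

```python
def tversky(y_true, y_pred, alpha=0.5, smooth=1.0):
    """Tversky index of one flattened image (or of a pooled batch)."""
    true_pos = sum(t * p for t, p in zip(y_true, y_pred))
    false_neg = sum(t * (1 - p) for t, p in zip(y_true, y_pred))
    false_pos = sum((1 - t) * p for t, p in zip(y_true, y_pred))
    return (true_pos + smooth) / (
        true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth
    )


batch_true = [[1, 1, 1, 1, 1, 1, 1, 1, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0]]
batch_pred = [[1, 1, 1, 1, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0]]

# Pooled: flatten the whole batch into one long vector, as K.flatten does
pooled_loss = 1 - tversky(sum(batch_true, []), sum(batch_pred, []))

# Per sample: one loss value per image, as with K.batch_flatten and axis=-1
per_sample_losses = [1 - tversky(t, p) for t, p in zip(batch_true, batch_pred)]

print(round(pooled_loss, 3))                      # 0.053
print([round(l, 3) for l in per_sample_losses])   # [0.0, 0.333]
```

Pooled, the missed image barely registers (loss 0.053); per image, it carries a loss of 0.333 of its own.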

Why does a cnn with keras not learn?

I am kind of new to deep learning and especially Keras, and I have an assignment from university to train a CNN and learn about it, using Keras. I am using the MURA dataset (skeletal radiography).
What I have done until now is to go over all images from the dataset and split the training set into train and validation (90/10).
I am using a CNN that has been given in the paper and I am not allowed to modify it until the second task. The first task is to observe and understand the CNN.
def run():
    train_datagen = ImageDataGenerator(rescale=1./255)
    val_datagen = ImageDataGenerator(rescale=1./255)
    train_generator = train_datagen.flow_from_directory('train_data',
                                                        target_size=(227,227),
                                                        batch_size=BATCH_SIZE,
                                                        class_mode='binary',
                                                        color_mode='grayscale'
                                                        )
    val_generator = val_datagen.flow_from_directory('test_data',
                                                    target_size=(227,227),
                                                    batch_size=BATCH_SIZE,
                                                    class_mode='binary',
                                                    color_mode='grayscale'
                                                    )
    classifier = Sequential()
    classifier.add(Conv2D(64,(7,7),strides=2, input_shape=(227,227,1)))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2), strides=2))
    classifier.add(Conv2D(128, (5,5), strides=2 ))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(384, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(Conv2D(256, (3,3), strides=1))
    classifier.add(Activation('relu'))
    classifier.add(MaxPooling2D(pool_size=(2,2),strides=2))
    classifier.add(Flatten())
    classifier.add(Dropout(0.5))
    classifier.add(Dense(units=2048))
    classifier.add(Activation('relu'))
    classifier.add(Dropout(0.5))
    classifier.add(Dense(units=1))
    classifier.add(Activation('sigmoid'))
    classifier.summary()
    # from keras.optimizers import SGD
    # sg = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    classifier.compile(optimizer=keras.optimizers.SGD(),loss='binary_crossentropy', metrics=['accuracy'])
    classifier.fit_generator(train_generator,
                             steps_per_epoch= training_len//BATCH_SIZE,
                             epochs=10,
                             validation_data=val_generator,
                             validation_steps= valid_len//BATCH_SIZE,
                             shuffle=True,
                             verbose=1)
    classifier.save_weights('first_model_weights.h5')
    classifier.save('first_model.h5')
The problem I am having is that if I run this, it just does not learn. Or at least I think it doesn't.
The output looks like this:
Epoch 1/10
575/575 [==============================] - 693s 1s/step - loss: 0.6767 - acc: 0.5958 - val_loss: 0.6751 - val_acc: 0.5966
Epoch 2/10
575/575 [==============================] - 207s 359ms/step - loss: 0.6760 - acc: 0.5948 - val_loss: 0.6752 - val_acc: 0.5958
Epoch 3/10
575/575 [==============================] - 258s 448ms/step - loss: 0.6745 - acc: 0.5983 - val_loss: 0.6748 - val_acc: 0.5958
Epoch 4/10
575/575 [==============================] - 165s 287ms/step - loss: 0.6760 - acc: 0.5950 - val_loss: 0.6757 - val_acc: 0.5947
Epoch 5/10
575/575 [==============================] - 166s 288ms/step - loss: 0.6761 - acc: 0.5948 - val_loss: 0.6731 - val_acc: 0.6016
Epoch 6/10
575/575 [==============================] - 167s 290ms/step - loss: 0.6742 - acc: 0.5990 - val_loss: 0.6778 - val_acc: 0.5875
Epoch 7/10
575/575 [==============================] - 206s 359ms/step - loss: 0.6762 - acc: 0.5938 - val_loss: 0.6721 - val_acc: 0.6038
Epoch 8/10
575/575 [==============================] - 165s 286ms/step - loss: 0.6762 - acc: 0.5938 - val_loss: 0.6763 - val_acc: 0.5947
Epoch 9/10
575/575 [==============================] - 164s 286ms/step - loss: 0.6751 - acc: 0.5972 - val_loss: 0.6787 - val_acc: 0.5897
Epoch 10/10
575/575 [==============================] - 168s 292ms/step - loss: 0.6750 - acc: 0.5971 - val_loss: 0.6722 - val_acc: 0.6022
Am I doing something wrong in the code? Is it the dataset splitting? I am currently in a dark spot and I can't seem to figure it out.
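One way to read the log above: if a binary classifier outputs the same probability for every image, the best it can do is output the class frequency, and its binary cross-entropy is then the entropy of that frequency. The sketch below evaluates this for a majority-class share of 0.596, a value assumed from the accuracy column rather than measured from the dataset.

```python
import math


def constant_prediction_loss(class_fraction):
    """Binary cross-entropy when every sample gets the same probability,
    equal to the fraction of samples in one class."""
    p = class_fraction
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


print(round(constant_prediction_loss(0.596), 3))  # 0.675
print(round(constant_prediction_loss(0.5), 3))    # 0.693
```

A loss near 0.675 together with an accuracy near 0.596 is therefore consistent with a network that predicts roughly the same value for every image, although the log alone does not prove it.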