Preprocessing layers with seed not producing the same data augmentation for images and masks - tensorflow

I'm trying to create a simple preprocessing augmentation layer, following this TensorFlow tutorial. I created this 'simple' example that shows the problem I'm having.
Even though I'm initializing the augmentation class with a seed, the operations applied to the images and to the corresponding masks are not always the same.
What am I doing wrong?
Note: TensorFlow v2.10.0
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import skimage
import rasterio as rio
def normalize(array: np.ndarray):
    """ normalise image to give a meaningful output """
    array_min, array_max = array.min(), array.max()
    return (array - array_min) / (array_max - array_min)
# field
im = rio.open('penguins.tif')
fields = np.zeros((1,im.shape[0],im.shape[1],3))
fields[0,:,:,0] = normalize(im.read(1))
fields[0,:,:,1] = normalize(im.read(2))
fields[0,:,:,2] = normalize(im.read(3))
# mask is a simple contour
masks = skimage.color.rgb2gray(skimage.filters.sobel(fields[0]))
masks = np.expand_dims(masks, [0,3])
In this case the dataset is only one image. We can use this function to visualize the field and the mask.
def show(field:np.ndarray, mask:np.ndarray):
    """Show the field and corresponding mask."""
    fig = plt.figure(figsize=(8,6))
    ax1 = fig.add_subplot(121)
    ax2 = fig.add_subplot(122)
    ax1.imshow(field[:,:,:3])
    ax2.imshow(mask,cmap='binary')
    plt.tight_layout()
    plt.show()
show(fields[0], masks[0])
Now I use the example from the tutorial, which randomly flips the image and the mask horizontally.
class Augment(tf.keras.layers.Layer):
    def __init__(self, seed=42):
        super().__init__()
        # both use the same seed, so they'll make the same random changes.
        self.augment_inputs = tf.keras.layers.RandomFlip(mode="horizontal", seed=seed)
        self.augment_labels = tf.keras.layers.RandomFlip(mode="horizontal", seed=seed)
    def call(self, inputs, labels):
        inputs = self.augment_inputs(inputs)
        labels = self.augment_labels(labels)
        return inputs, labels
Now if I run the following multiple times, I eventually get opposite flips on the field and the mask.
# Create a tf.data.Dataset
ds = tf.data.Dataset.from_tensor_slices((fields, masks))
ds2 = ds.map(Augment())
for f,m in ds2.take(1):
    show(f, m)
I would expect the image and its mask to be flipped the same way, since I set the seed in the Augment class as suggested in the TensorFlow tutorial.

One way around this is to concatenate the image and the mask along the channel axis into a single tensor, augment that tensor once, and then split the result back into image and label, which is shown below:
class Augment(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
        # a single layer augments the concatenated tensor, so image and mask get the same random change.
        self.augment_inputs = tf.keras.layers.RandomRotation(0.3)
    def call(self, inputs, labels):
        # number of image channels (3 for the RGB field in the question)
        num_channels = inputs.shape[-1]
        output = self.augment_inputs(tf.concat([inputs, labels], -1))
        inputs = output[..., :num_channels]
        labels = output[..., num_channels:]
        return inputs, labels
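The concatenate, augment, split idea does not depend on Keras. The following is a minimal NumPy sketch of it, assuming a single unbatched image of shape (H, W, C) and a mask of shape (H, W, 1); the helper name augment_pair and the toy data are made up for illustration, and a random horizontal flip stands in for whatever augmentation layer is used.

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply one random horizontal flip to an image and its mask together."""
    num_channels = image.shape[-1]
    # Stack image and mask along the channel axis: (H, W, C + M).
    stacked = np.concatenate([image, mask], axis=-1)
    # A single random decision is made for the whole stack.
    if rng.random() < 0.5:
        stacked = stacked[:, ::-1, :]
    # Split the stack back into image and mask.
    return stacked[..., :num_channels], stacked[..., num_channels:]

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))
# Toy mask derived per pixel from the image, so it can be recomputed later.
mask = (image.mean(axis=-1, keepdims=True) > 0.5).astype(image.dtype)

for _ in range(20):
    aug_image, aug_mask = augment_pair(image, mask, rng)
    # The mask recomputed from the augmented image must match the augmented mask.
    expected = (aug_image.mean(axis=-1, keepdims=True) > 0.5).astype(image.dtype)
    assert np.array_equal(aug_mask, expected)
```

Because only one random draw is made per pair, image and mask cannot end up with different transforms, which is the property the two separately seeded layers did not give.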

Related

Different results in tensorflow prediction

I cannot understand why the following code gives different results. I'm printing the first 3 components of the prediction array to compare them. my_features and feat are totally different, but they should be the same, since the model and the data are the same. There must be something wrong in the loading and image preprocessing, but I cannot find it. Any help will be appreciated.
import tensorflow as tf
import os
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import MobileNetV3Small
from tensorflow.keras.applications.imagenet_utils import preprocess_input
model= MobileNetV3Small(weights='imagenet', include_top=False, pooling='avg')
DatasetPath= "DB"
imagePathList= sorted(os.listdir(DatasetPath))
imagePathList= [os.path.join(DatasetPath, imagePath) for imagePath in imagePathList]
def read_image(filename):
    image_string = tf.io.read_file(filename)
    image = tf.image.decode_jpeg(image_string, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [224,224])
    image = tf.keras.applications.mobilenet_v3.preprocess_input(image)
    return image
ds_imagePathList= tf.data.Dataset.from_tensor_slices(imagePathList)
dataset = ds_imagePathList.map(read_image, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(32, drop_remainder=False)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
my_features = model.predict(dataset)
my_features[0][:3]
Second snippet
def loadProcessedImage(path):
    #img = image.load_img(path, target_size=model.input_shape[1:3])
    img = image.load_img(path, target_size= (224,224,3))
    imgP = image.img_to_array(img)
    imgP = np.expand_dims(imgP, axis=0)
    imgP = preprocess_input(imgP)
    return img, imgP
img, x = loadProcessedImage(imagePathList[0])
feat = model.predict(x)
feat = feat.flatten()
feat[:3]
The problem is related to the image resize. The second snippet calls load_img, which internally uses Pillow to load and resize the image, while the first snippet uses tf.image.resize. The problem is that tf.image.resize does not behave correctly (see here), and even though this is a 2018 blog post, the problem is still there.
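To give a feel for how two resize routines can disagree on the same input, here is a small NumPy sketch. It is not the code of TensorFlow or Pillow, and it is not a claim about which convention either library uses; resize_linear_1d and its half_pixel_centers flag are hypothetical names that only illustrate two common ways of mapping output pixels back to input coordinates.

```python
import numpy as np

def resize_linear_1d(values, out_size, half_pixel_centers):
    """Linearly resample a 1-D signal under one of two coordinate conventions."""
    values = np.asarray(values, dtype=np.float64)
    in_size = values.shape[0]
    dst = np.arange(out_size, dtype=np.float64)
    if half_pixel_centers:
        # Treat samples as pixel centres: output pixel i maps to
        # (i + 0.5) * scale - 0.5 in input coordinates.
        src = (dst + 0.5) * (in_size / out_size) - 0.5
    else:
        # Map the first and last samples onto each other ("align corners").
        src = dst * (in_size - 1) / max(out_size - 1, 1)
    src = np.clip(src, 0, in_size - 1)
    # np.interp does the linear interpolation between neighbouring samples.
    return np.interp(src, np.arange(in_size), values)

row = np.array([0.0, 10.0, 20.0, 30.0])
a = resize_linear_1d(row, 2, half_pixel_centers=True)   # [ 5. 25.]
b = resize_linear_1d(row, 2, half_pixel_centers=False)  # [ 0. 30.]
print(a, b, np.abs(a - b).max())
```

Differences of this size in the input pixels are enough to change the extracted features, so when comparing two pipelines it helps to diff the preprocessed arrays before looking at the model outputs.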

I want to convert my binary classification model to a multiclass classification model; I am taking labels from directory names

My code below works fine for classifying two categories of images; it takes the labels from the directory names. Whenever I add one more directory, it stops working. Can someone help me?
This is my code for image classification with images from two directories and two labels. When I change it to three labels/directories I get an error, which is posted below.
I have tried removing the NumPy array; I saw somewhere that I just need to pass the data through a CNN, but I couldn't do that.
I am trying to build a classifier for pneumonia caused by a coronavirus and other diseases using frontal chest X-rays.
from tensorflow.keras.preprocessing.image import ImageDataGeneratorfrom
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import cv2
import os
# construct the argument parser and parse the arguments
# initialize the initial learning rate, number of epochs to train for,
# and batch size
INIT_LR = 1e-3
EPOCHS = 40
BS = 66
# grab the list of images in our dataset directory, then initialize
# the list of data (i.e., images) and class images
print("[INFO] loading images...")
imagePaths = list(paths.list_images('/content/drive/My Drive/testset/'))
data = []
labels = []
# loop over the image paths
for imagePath in imagePaths:
    # extract the class label from the filename
    label = imagePath.split(os.path.sep)[-2]
    # load the image, swap color channels, and resize it to be a fixed
    # 224x224 pixels while ignoring aspect ratio
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))
    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)
# convert the data and labels to NumPy arrays while scaling the pixel
# intensities from [0, 255] to the range [0, 1]
data = np.array(data) / 255.0
labels = np.array(labels)
# perform one-hot encoding on the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = to_categorical(labels)
# partition the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
test_size=0.20, stratify=labels, random_state=42)
# initialize the training data augmentation object
trainAug = ImageDataGenerator(
rotation_range=15,
fill_mode="nearest")
# load the VGG16 network, ensuring the head FC layer sets are left
# off
baseModel = VGG16(weights="imagenet", include_top=False,
input_tensor=Input(shape=(224, 224, 3)))
# construct the head of the model that will be placed on top of the
# the base model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(4, 4))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)
# place the head FC model on top of the base model (this will become
# the actual model we will train)
model = Model(inputs=baseModel.input, outputs=headModel)
# loop over all layers in the base model and freeze them so they will
# *not* be updated during the first training process
for layer in baseModel.layers:
    layer.trainable = False
# compile our model
print("[INFO] compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
# train the head of the network
print("[INFO] training head...")
H = model.fit(
trainAug.flow(trainX, trainY, batch_size=BS),
steps_per_epoch=len(trainX) // BS,
validation_data=(testX, testY),
validation_steps=len(testX) // BS,
epochs=EPOCHS)
# make predictions on the testing set
print("[INFO] evaluating network...")
predIdxs = model.predict(testX, batch_size=BS)
# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(predIdxs, axis=1)
# show a nicely formatted classification report
print(classification_report(testY.argmax(axis=1), predIdxs,
target_names=lb.classes_))
# compute the confusion matrix and use it to derive the raw
# accuracy, sensitivity, and specificity
cm = confusion_matrix(testY.argmax(axis=1), predIdxs)
total = sum(sum(cm))
acc = (cm[0, 0] + cm[1, 1]) / total
sensitivity = cm[0, 0] / (cm[0, 0] + cm[0, 1])
specificity = cm[1, 1] / (cm[1, 0] + cm[1, 1])
# show the confusion matrix, accuracy, sensitivity, and specificity
print(cm)
print("acc: {:.4f}".format(acc))
print("sensitivity: {:.4f}".format(sensitivity))
print("specificity: {:.4f}".format(specificity))
# plot the training loss and accuracy
N = EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["accuracy"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_accuracy"], label="val_acc")
plt.title("Training Loss and Accuracy on COVID-19 Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig("plot.png")
# serialize the model to disk
print("[INFO] saving COVID-19 detector model...")
model.save('/content/drive/My Drive/setcovid/model.h5', )
This is the error I got in my program
There are a few changes you need to make for it to work. The error you're getting comes from the one-hot encoding: you're encoding your labels to one-hot twice.
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = to_categorical(labels)
Please remove the last line, to_categorical, from your code. With three classes, LabelBinarizer already gives you the one-hot encoding in the correct format, so this fixes the error you're getting now.
There is another problem I must mention. Your model's output layer has only 2 neurons, but you want to classify 3 classes. Please set the number of output neurons to 3.
headModel = Dense(3, activation="softmax")(headModel)
Since you're now training with 3 classes, the problem is no longer binary, so you have to use a different loss. I recommend categorical cross-entropy.
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
You also forgot the following imports. Add them too.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import *
And you're good to go.
By the way, I'm somewhat worried about the batch size (66) you're using. I don't know which GPU you have, but I would still suggest decreasing the batch size.
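To see why the double encoding only shows up once a third class is added, here is a NumPy-only sketch that mimics the output shapes of LabelBinarizer followed by to_categorical. The helpers binarize_like and to_categorical_like are hypothetical stand-ins written for this illustration, not the library functions, and the class names are made up.

```python
import numpy as np

def binarize_like(labels):
    """Mimic the output shape of sklearn's LabelBinarizer.fit_transform."""
    classes, idx = np.unique(labels, return_inverse=True)
    if len(classes) == 2:
        # Two classes are encoded as a single 0/1 column: shape (n, 1).
        return idx.reshape(-1, 1)
    # Three or more classes are already one-hot: shape (n, n_classes).
    return np.eye(len(classes), dtype=int)[idx]

def to_categorical_like(y):
    """Mimic the output shape of Keras' to_categorical."""
    y = np.asarray(y, dtype=int)
    if y.ndim > 1 and y.shape[-1] == 1:
        y = y.reshape(y.shape[:-1])  # a trailing axis of size 1 is dropped
    num_classes = y.max() + 1
    return np.eye(num_classes, dtype=int)[y]  # appends a new class axis

two = np.array(["covid", "normal", "covid", "normal"])
three = np.array(["covid", "normal", "other", "covid"])

print(to_categorical_like(binarize_like(two)).shape)    # (4, 2)
print(binarize_like(three).shape)                       # (4, 3)
print(to_categorical_like(binarize_like(three)).shape)  # (4, 3, 2)
```

With two classes the second step is what turns the single column into a one-hot matrix, so the original code worked. With three classes the labels are already one-hot after the first step, and the second step adds an extra axis that no longer matches the model output.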

keras - `sample_weight` results in NaN when zero passed - also not efficient for unbalanced data

I am designing a model with two outputs, y and dy, where I have much more training data for y than for dy, while the locations (x) of those data points are the same (please check the image below).
I am handling this issue with sample_weight in keras.model.fit. I have two concerns:
If I pass zero for a sample weight, it results in NaN after the first training. I have to pass a very small number instead, and I am not sure how that affects the training.
This is inefficient if I have multiple outputs and many of them have training data at only a few locations, because all the training data will still be included in the updates. Is there another way to handle this case?
Note that Keras trains the model fine; however, I am looking for a more efficient way that also lets me pass zero for the unwanted weights.
Please check the code below:
import numpy as np
import keras as k
import tensorflow as tf
from matplotlib.pyplot import plot, show, legend
# Note this is needed to handle lambda layers as Keras' gradient does not work in this setup.
def custom_grad(y, x):
    return tf.gradients(y, x, unconnected_gradients='zero', colocate_gradients_with_ops=True)
# Setting up keras model.
x = k.Input((1,), name='x', dtype='float32')
lay = k.layers.Dense(10, activation='tanh')(x)
lay = k.layers.Dense(10, activation='tanh')(lay)
y = k.layers.Dense(1, name='y')(lay)
dy = k.layers.Lambda(lambda f: custom_grad(f, x), name='dy')(y)
model = k.Model(x, [y, dy])
# Preparing training data.
num_samples = 10000
x_true = np.linspace(0.0, np.pi, num_samples)
y_true = np.sin(x_true)
dy_true = np.zeros_like(y_true)
# for dy, we only have values at certain points -
# say 10% of what is available for y, from the start and the end.
percentage = 0.1
dy_ids = np.concatenate((np.arange(0, num_samples*percentage, dtype=int),
np.arange(num_samples*(1-percentage), 10000, dtype=int)))
dy_true[dy_ids] = np.cos(x_true[dy_ids])
# I use sample weight to circumvent unbalanced available data.
y_sample_weight = np.ones_like(y_true)
dy_sample_weight = np.zeros_like(y_true) + 1.0e-8
dy_sample_weight[dy_ids] = num_samples/dy_ids.size
assert abs(dy_sample_weight.sum() - num_samples) <= 1.0e-3
# training the model.
model.compile("adam", loss="mse")
model.fit(x_true, [y_true, dy_true],
sample_weight=[y_sample_weight, dy_sample_weight],
epochs=50, shuffle=True)
[y_pred, dy_pred] = model.predict(x_true)
# expected outputs.
plot(x_true, y_true, '.k', label='y true')
plot(x_true[dy_ids], dy_true[dy_ids], '.r', label='dy true')
plot(x_true, y_pred, '--b', label='y pred')
plot(x_true, dy_pred, '--b', label='dy pred')
legend()
show()
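The intent behind the weights, letting only the locations where dy is known contribute to its loss, can be illustrated outside Keras. The following NumPy sketch computes a weighted mean squared error that accepts exact zeros; masked_mse is a hypothetical helper, it is not how Keras reduces sample weights internally, and it is not a confirmed fix for the NaN described above.

```python
import numpy as np

def masked_mse(y_true, y_pred, weights):
    """Weighted mean squared error that tolerates zero weights."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    weighted_sq_err = weights * (y_true - y_pred) ** 2
    total_weight = weights.sum()
    if total_weight == 0.0:
        # No sample contributes, so avoid dividing by zero.
        return 0.0
    # Normalise by the total weight, not by the number of samples.
    return weighted_sq_err.sum() / total_weight

x = np.linspace(0.0, np.pi, 10)
dy_true = np.cos(x)
dy_pred = np.zeros_like(x)
# dy is only "known" at the first and last location in this toy example.
weights = np.zeros_like(x)
weights[[0, -1]] = 1.0
print(masked_mse(dy_true, dy_pred, weights))
```

Samples with zero weight drop out of both the numerator and the normaliser, so no small placeholder such as 1.0e-8 is needed in this formulation.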

How can I print output (tensor values, shapes) in gpflow?

I am trying to develop a new model within gpflow. In order to debug it I need to know shapes and values of tensors during execution of the graph.
I tried the below based on printing tensor values in tensorflow, but nothing is printed to the console.
import numpy as np
import sys
import tensorflow as tf
import gpflow
from gpflow.mean_functions import MeanFunction
from gpflow.decors import params_as_tensors
class Log(MeanFunction):
    """
    :math:`y_i = \log(x_i)`
    """
    def __init__(self):
        MeanFunction.__init__(self)
    #params_as_tensors
    def __call__(self, X):
        # I want to figure out the shape of X here
        tf.print(tf.shape(X), output_stream=sys.stdout)
        # Returns the natural logarithm of the input
        return tf.log(X)
# Test gpflow implementation
sess = tf.InteractiveSession()
with sess.as_default(), sess.graph.as_default():
    X = np.random.uniform(size=[100, 1])
    y = np.random.uniform(size=[100, 1])
    m = gpflow.models.GPR(X=X, Y=y, mean_function=Log(), kern=gpflow.kernels.RBF(input_dim=1))
You're on the right track. According to the TensorFlow docs [1], when in graph mode you need to wrap tf.print() in a tf.control_dependencies() context manager to make sure it is run. GPflow currently works in graph mode. GPflow 2.0, which is in development, will allow usage in eager mode.
#params_as_tensors
def __call__(self, X):
    # I want to figure out the shape of X here
    print_op = tf.print(tf.shape(X), output_stream=sys.stdout)
    with tf.control_dependencies([print_op]):
        log_calc = tf.log(X)
    # Returns the natural logarithm of the input
    return log_calc
[1] https://www.tensorflow.org/api_docs/python/tf/print

CNN Training in Keras freezes

I am training a CNN model in Keras (TensorFlow backend). I use on-the-fly augmentation with fit_generator(). The model takes images as input and is supposed to predict the steering angle for a self-driving car. The training just freezes after this point. I have tried changing the batch size, learning rate etc., but it doesn't help.
The training freezes at the end of the first epoch.
Please help!
BATCH_SIZE=32
INPUT_IMAGE_ROWS=160
INPUT_IMAGE_COLS=320
INPUT_IMAGE_CHANNELS=3
AUGMENTATION_NUM_BINS=200
NUM_EPOCHS=3
AUGMENTATION_BIN_MAX_PERC=0.5
AUGMENTATION_FACTOR=3
import csv
import cv2
import numpy as np
from random import shuffle
from sklearn.model_selection import train_test_split
import keras
from keras.callbacks import Callback
import math
from keras.preprocessing.image import *
print("\nLoading the dataset from file ...")
def load_dataset(file_path):
    dataset = []
    with open(file_path) as csvfile:
        reader = csv.reader(csvfile)
        for line in reader:
            try:
                dataset.append({'center':line[0], 'left':line[1], 'right':line[2], 'steering':float(line[3]),
                                'throttle':float(line[4]), 'brake':float(line[5]), 'speed':float(line[6])})
            except:
                continue # some images throw error during loading
    return dataset
dataset = load_dataset('C:\\Users\\kiit1\\Documents\\steering angle prediction\\dataset_coldivision\\data\\driving_log.csv')
print("Loaded {} samples from file {}".format(len(dataset),'C:\\Users\\kiit1\\Documents\\steering angle prediction\\dataset_coldivision\\data\\driving_log.csv'))
print("Partioning the dataset:")
shuffle(dataset)
#partitioning data into 80% training, 19% validation and 1% testing
X_train,X_validation=train_test_split(dataset,test_size=0.2)
X_validation,X_test=train_test_split(X_validation,test_size=0.05)
print("X_train has {} elements.".format(len(X_train)))
print("X_validation has {} elements.".format(len(X_validation)))
print("X_test has {} elements.".format(len(X_test)))
print("Partitioning the dataset complete.")
def generate_batch_data(dataset, batch_size = 32):
    global augmented_steering_angles
    global epoch_steering_count
    global epoch_bin_hits
    batch_images = np.zeros((batch_size, INPUT_IMAGE_ROWS, INPUT_IMAGE_COLS, INPUT_IMAGE_CHANNELS))
    batch_steering_angles = np.zeros(batch_size)
    while 1:
        for batch_index in range(batch_size):
            # select a random image from the dataset
            image_index = np.random.randint(len(dataset))
            image_data = dataset[image_index]
            while 1:
                try:
                    image, steering_angle = load_and_augment_image(image_data)
                except:
                    continue
                bin_idx = int (steering_angle * AUGMENTATION_NUM_BINS / 2)
                if( epoch_bin_hits[bin_idx] < epoch_steering_count/AUGMENTATION_NUM_BINS*AUGMENTATION_BIN_MAX_PERC
                    or epoch_steering_count<500 ):
                    batch_images[batch_index] = image
                    batch_steering_angles[batch_index] = steering_angle
                    augmented_steering_angles.append(steering_angle)
                    epoch_bin_hits[bin_idx] = epoch_bin_hits[bin_idx] + 1
                    epoch_steering_count = epoch_steering_count + 1
                    break
        yield batch_images, batch_steering_angles
print("\nTraining the model ...")
class LifecycleCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs={}):
        pass
    def on_epoch_end(self, epoch, logs={}):
        global epoch_steering_count
        global epoch_bin_hits
        global bin_range
        epoch_steering_count = 0
        epoch_bin_hits = {k:0 for k in range(-bin_range, bin_range)}
    def on_batch_begin(self, batch, logs={}):
        pass
    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
    def on_train_begin(self, logs={}):
        print('Beginning training')
        self.losses = []
    def on_train_end(self, logs={}):
        print('Ending training')
# Compute the correct number of samples per epoch based on batch size
def compute_samples_per_epoch(array_size, batch_size):
    num_batches = array_size / batch_size
    samples_per_epoch = math.ceil(num_batches)
    samples_per_epoch = samples_per_epoch * batch_size
    return samples_per_epoch
def load_and_augment_image(image_data, side_camera_offset=0.2):
    # select a value between 0 and 2 to switch between center, left and right image
    index = np.random.randint(3)
    if (index==0):
        image_file = image_data['left'].strip()
        angle_offset = side_camera_offset
    elif (index==1):
        image_file = image_data['center'].strip()
        angle_offset = 0.
    elif (index==2):
        image_file = image_data['right'].strip()
        angle_offset = - side_camera_offset
    steering_angle = image_data['steering'] + angle_offset
    image = cv2.imread(image_file)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # apply a mixture of several augmentation methods
    image, steering_angle = random_transform(image, steering_angle)
    return image, steering_angle
augmented_steering_angles = []
epoch_steering_count = 0
bin_range = int(AUGMENTATION_NUM_BINS / 4 * 3)
epoch_bin_hits = {k:0 for k in range(-bin_range, bin_range)}
#flips image about y-axis
def horizontal_flip(image,steering_angle):
    flipped_image=cv2.flip(image,1)
    steering_angle=-steering_angle
    return flipped_image,steering_angle
def translate(image,steering_angle,width_shift_range=50.0,height_shift_range=5.0):
    tx = width_shift_range * np.random.uniform() - width_shift_range / 2
    ty = height_shift_range * np.random.uniform() - height_shift_range / 2
    # new steering angle
    steering_angle += tx / width_shift_range * 2 * 0.2
    transformed_matrix=np.float32([[1,0,tx],[0,1,ty]])
    rows,cols=(image.shape[0],image.shape[1])
    translated_image=cv2.warpAffine(image,transformed_matrix,(cols,rows))
    return translated_image,steering_angle
def brightness(image,bright_increase=None):
    if(image.shape[2]>1):
        image_hsv=cv2.cvtColor(image,cv2.COLOR_RGB2HSV)
    else:
        image_hsv=image
    if bright_increase:
        image_hsv[:,:,2] += bright_increase
    else:
        bright_increase = int(30 * np.random.uniform(-0.3,1))
        image_hsv[:,:,2] = image[:,:,2] + bright_increase
    image = cv2.cvtColor(image_hsv, cv2.COLOR_HSV2RGB)
    return image
def rotation(image,rotation_range=5):
    image=random_rotation(image,rotation_range)
    return image
# Shift range for each channels
def channel_shift(image, intensity=30, channel_axis=2):
    image = random_channel_shift(image, intensity, channel_axis)
    return image
# Crop and resize the image
def crop_resize_image(image, cols=INPUT_IMAGE_COLS, rows=INPUT_IMAGE_ROWS, top_crop_perc=0.1, bottom_crop_perc=0.2):
    height = image.shape[0]
    width= image.shape[1]
    # crop top and bottom
    top_rows = int(height*top_crop_perc)
    bottom_rows = int(height*bottom_crop_perc)
    image = image[top_rows:height-bottom_rows, 0:width]
    # resize to the final sizes even if the aspect ratio is destroyed
    image = cv2.resize(image, (cols, rows), interpolation=cv2.INTER_LINEAR)
    return image
# Apply a sequence of random transformations for a better generalization and to prevent overfitting
def random_transform(image, steering_angle):
    # all further transformations are done on the smaller image to reduce the processing time
    image = crop_resize_image(image)
    # every second image is flipped horizontally
    if np.random.random() < 0.5:
        image, steering_angle = horizontal_flip(image, steering_angle)
    image, steering_angle = translate(image, steering_angle)
    image = rotation(image)
    image = brightness(image)
    image = channel_shift(image)
    return img_to_array(image), steering_angle
from keras.models import Sequential, Model
from keras.layers.core import Lambda, Dense, Activation, Flatten, Dropout
from keras.layers.convolutional import Cropping2D, Convolution2D
from keras.layers.advanced_activations import ELU
from keras.layers.noise import GaussianNoise
from keras.optimizers import Adam
print("\nBuilding and compiling the model ...")
model = Sequential()
model.add(Lambda(lambda x: (x / 127.5) - 1.0, input_shape=(INPUT_IMAGE_ROWS, INPUT_IMAGE_COLS, INPUT_IMAGE_CHANNELS)))
# Conv Layer1 of 16 filters having size(8, 8) with strides (4,4)
model.add(Convolution2D(16, 8, 8, subsample=(4, 4), border_mode="same"))
model.add(ELU())
# Conv Layer1 of 32 filters having size(5, 5) with strides (2,2)
model.add(Convolution2D(32, 5, 5, subsample=(2, 2), border_mode="same"))
model.add(ELU())
# Conv Layer1 of 64 filters having size(5, 5) with strides (2,2)
model.add(Convolution2D(64, 5, 5, subsample=(2, 2), border_mode="same"))
model.add(Flatten())
model.add(Dropout(.5))
model.add(ELU())
model.add(Dense(512))
model.add(Dropout(.5))
model.add(ELU())
model.add(Dense(1))
model.summary()
adam = Adam(lr=0.0001)
model.compile(loss='mse', optimizer=adam)
lifecycle_callback = LifecycleCallback()
train_generator = generate_batch_data(X_train, BATCH_SIZE)
validation_generator = generate_batch_data(X_validation, BATCH_SIZE)
samples_per_epoch = compute_samples_per_epoch((len(X_train)*AUGMENTATION_FACTOR), BATCH_SIZE)
nb_val_samples = compute_samples_per_epoch((len(X_validation)*AUGMENTATION_FACTOR), BATCH_SIZE)
history = model.fit_generator(train_generator,
validation_data = validation_generator,
samples_per_epoch = ((len(X_train) // BATCH_SIZE ) * BATCH_SIZE) * 2,
nb_val_samples = ((len(X_validation) // BATCH_SIZE ) * BATCH_SIZE) * 2,
nb_epoch = NUM_EPOCHS, verbose=1,
)
print("\nTraining the model ended.")
You have a weird structure for the data generator and that is most likely causing this issue, though I cannot be completely sure.
Your structure is as follows:
while 1:
    ....
    for _ in range(batch_size):
        randomly select an image # this is inefficient, see below for comments
        while 1:
            process image
            if epoch is not done:
                collect images in a list
                break
    yield ...
Now,
Do not choose images randomly at each iteration. Instead, shuffle your dataset once at the start of each epoch and then choose sequentially.
As far as I understand, the break under if epoch is not done looks like a mistake. Did you mean: if the epoch is not done, collect images, otherwise break? Your break is inside the if, which means that the first time the condition holds, execution leaves the innermost while 1 loop. Surely that is not what you intend, right?
The yield is outside the for loop. You should yield each batch, so if the for is iterating over batches, the yield should be inside the for.
The structure of a basic data generator should be:
while 1:
    shuffle entire dataset once # not applicable for massive datasets
    for _ in range(n_batches_per_epoch):
        get a data batch
        Optionally, do some preprocessing # preferably on the entire batch,
        # not one by one; you could also preprocess the entire dataset if it's simple
        # enough, such as mean subtraction.
        yield batches, labels
I would suggest rewriting the data generator. You could look at the myGenerator() function on this page for a basic data generator. Once you have written the generator, test it as a stand-alone function to make sure it outputs data indefinitely and keeps track of epochs.
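The recommended structure can be sketched as a small stand-alone generator. This is a minimal illustration under assumptions: the data fits in memory as NumPy arrays, and batch_generator, its arguments, and the toy arrays are hypothetical names, not part of the original code.

```python
import numpy as np

def batch_generator(images, targets, batch_size, rng):
    """Yield (images, targets) batches forever, reshuffling once per epoch."""
    num_samples = len(images)
    batches_per_epoch = num_samples // batch_size
    while True:
        # Shuffle the sample order once at the start of every epoch.
        order = rng.permutation(num_samples)
        for b in range(batches_per_epoch):
            idx = order[b * batch_size:(b + 1) * batch_size]
            # The yield sits inside the for loop: one batch per iteration.
            yield images[idx], targets[idx]

# Toy data standing in for images and steering angles.
images = np.arange(10 * 2 * 2 * 1, dtype=np.float32).reshape(10, 2, 2, 1)
targets = np.arange(10, dtype=np.float32)

gen = batch_generator(images, targets, batch_size=5, rng=np.random.default_rng(0))
# Two batches of five make up one epoch and should cover every sample once.
epoch_targets = np.concatenate([next(gen)[1] for _ in range(2)])
print(sorted(epoch_targets.tolist()))
```

Pulling a fixed number of batches with next() like this is also a simple way to test the generator on its own before handing it to fit_generator().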
In short, it is hard to say which part is problematic: maybe the data, maybe the model, or something else. So please be patient, and you will resolve the issue eventually.
First of all, you can train a baseline model (baseLine) without data augmentation. If your data augmentation is helpful, you should expect a performance improvement from the model trained with augmentation (augmLine).
If baseLine behaves similarly to augmLine, you may consider changing your network design. For example, in your current design, 1) Conv2D layers without any activation are very rare, and you may want to use relu or tanh, and 2) ELU(alpha) is known to be sensitive to the alpha value.
If baseLine actually works fine, this is an indicator that the data fed to augmLine is problematic. To ensure the correctness of the augmented data, you should plot both the image data and the target values and verify them manually. One common mistake in image data augmentation is that, if the target values depend on the input image, you have to generate new target values according to the augmented image. Sometimes this task is not trivial.
Note that, to have a fair comparison, you need to keep the validation data unchanged for both experiments.
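As a concrete case of a target that depends on the augmented image, a horizontal flip of a driving image has to negate the steering angle. The sketch below checks that pairing on toy data; it uses NumPy slicing in place of cv2.flip(image, 1), and the array and angle values are made up for illustration.

```python
import numpy as np

def horizontal_flip(image, steering_angle):
    """Mirror the image left-right and negate the steering angle to match."""
    return image[:, ::-1, :], -steering_angle

image = np.arange(2 * 3 * 1, dtype=np.float32).reshape(2, 3, 1)
angle = 0.25

flipped_image, flipped_angle = horizontal_flip(image, angle)
restored_image, restored_angle = horizontal_flip(flipped_image, flipped_angle)

# Flipping twice must give back the original pair...
assert np.array_equal(restored_image, image) and restored_angle == angle
# ...and a single flip must change the sign of the target.
assert flipped_angle == -angle
```

Checks like this are cheap to run for every augmentation that touches the target, alongside plotting a few augmented images with their angles.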