I'm using tf.data.Dataset and tf.keras in TF 2.1 to train on a dataset, but I'm seeing strange behavior: the resulting batches are not as random as I expected. I usually see elements from only 2 classes in a single batch, even though my dataset has 4 classes. My code is as follows:
def process_train_sample(file_path):
    sp = tf.strings.regex_replace(file_path, train_data_dir, '')
    cls = tf.math.argmax(tf.cast(tf.math.equal(tf.strings.split(sp, os.path.sep)[0],['A','B','C','D']), tf.int64))
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=3) # RGB
    img = tf.image.resize(img, (224, 224))
    img = tf.cast(img, tf.float32)
    img = img - np.array([123.68, 116.779, 103.939])
    img = img / 255.0
    cls = tf.expand_dims(cls, 0)
    return img, cls
train_data_list = glob.glob(os.path.join(train_data_dir, '**', '*.jpg'), recursive=True)
train_data_list = tf.data.Dataset.from_tensor_slices(train_data_list)
train_ds = train_data_list.map(process_train_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE)
train_ds = train_ds.shuffle(10000)
train_ds = train_ds.batch(batch_size)
for img, cls in train_ds.take(10):
    print('img: ', img.numpy().shape, 'cls: ', cls.numpy())
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
              metrics=['categorical_accuracy', 'categorical_crossentropy'])
model.fit(train_ds, epochs=50)
When I'm training on a dataset with 4 classes - A, B, C, D - I found that the training accuracy does not increase steadily; instead, it fluctuates up and down. I then checked my data input pipeline by printing the labels batch by batch, as in the for loop above, and found that each batch contains elements from only 2 classes instead of 4. It seems the dataset is not shuffled as I expected, which may be why the accuracy does not increase steadily. But I don't see what's wrong in my code.
In .shuffle(10000), the 10000 is the buffer size, which means it samples from the first 10,000 images. As you have ~30,000 images, this results in images only from the first and second classes in the first batches. As training continues, you will start to sample from classes (A, B, C), then only (B, C), then (B, C, D), then (C, D), then (C, D, A), then (D, A), then (D, A, B), then (A, B), then (A, B, C), and so on. Try setting the shuffle buffer size to 30,000 if you have the memory; if you don't, first shuffle your list of paths, and then use a large batch size.
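As a rough sketch of the second option (shuffling the cheap list of path strings before the expensive decode step, so a full-size buffer is affordable), you could do something like this:

import random

train_data_list = glob.glob(os.path.join(train_data_dir, '**', '*.jpg'), recursive=True)
random.shuffle(train_data_list)  # one-time shuffle of the Python list of paths

train_ds = tf.data.Dataset.from_tensor_slices(train_data_list)
train_ds = train_ds.shuffle(len(train_data_list), reshuffle_each_iteration=True)  # full-size buffer is cheap on strings
train_ds = train_ds.map(process_train_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE)
train_ds = train_ds.batch(batch_size)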
I have a problem splitting the MNIST dataset and adding augmented data. I want to take only 22,000 samples in total (training + test set) from the MNIST dataset, which has 70,000 samples and 10 labels. I'm only using shear, rotation, width-shift, and height-shift as augmentation methods.
training set --> 20000(total) --> 20 images + 1980 augmentation images(per label)
test set --> 2000(total) --> 200 images(per label)
I also want to make sure that the class distribution is preserved in the split.
I'm really confused about how to split the data, and would be glad if anyone can provide code.
I have tried this code:
# Load the MNIST dataset
(x_train_full, y_train_full), (x_test_full, y_test_full) = keras.datasets.mnist.load_data()
# Normalize the data
x_train_full = x_train_full / 255.0
x_test_full = x_test_full / 255.0
# Create a data generator for data augmentation
data_gen = ImageDataGenerator(shear_range=0.2, rotation_range=20,
width_shift_range=0.2, height_shift_range=0.2)
# Initialize empty lists for the training and test sets
x_train, y_train, x_test, y_test = [], [], [], []
# Loop through each class/label
for class_n in range(10):
    # Get the indices of the images for this class
    class_indices = np.where(y_train_full == class_n)[0]
    # Select 20 images for training
    train_indices = np.random.choice(class_indices, 20, replace=False)
    # Append the training images and labels to the respective lists
    x_train.append(x_train_full[train_indices])
    y_train.append(y_train_full[train_indices])
    # Select 200 images for test
    test_indices = np.random.choice(class_indices, 200, replace=False)
    # Append the test images and labels to the respective lists
    x_test.append(x_test_full[test_indices])
    y_test.append(y_test_full[test_indices])
    # Generate 100 augmented images for training
    x_augmented = data_gen.flow(x_train_full[train_indices], y_train_full[train_indices], batch_size=100)
    # Append the augmented images and labels to the respective lists
    x_train.append(x_augmented[0])
    y_train.append(x_augmented[1])
# Concatenate the list of images and labels to form the final training and test sets
x_train = np.concatenate(x_train)
y_train = np.concatenate(y_train)
x_test = np.concatenate(x_test)
y_test = np.concatenate(y_test)
print("training set shape: ", x_train.shape)
print("training label shape: ", y_train.shape)
print("test set shape: ", x_test.shape)
print("test label shape: ", y_test.shape)
but it keeps giving an error like this:
IndexError: index 15753 is out of bounds for axis 0 with size 10000
You are mixing up the train and test sets. In the loop, you are getting class_indices from the train set:
# Get the indices of the images for this class
class_indices = np.where(y_train_full == class_n)[0]
but then you are using these train indices (which can be numbers above 10000!) to index into the test set (which has only 10000 samples) a few lines further down:
# Select 200 images for test
test_indices = np.random.choice(class_indices, 200, replace=False)
So you will need to do the same index selection for the test set's labels inside the loop, and it should work out.
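For example, a minimal sketch of that fix (keeping the rest of the loop from the question unchanged) computes a separate set of class indices from the test labels:

for class_n in range(10):
    # Indices of this class in the *training* labels
    train_class_indices = np.where(y_train_full == class_n)[0]
    train_indices = np.random.choice(train_class_indices, 20, replace=False)
    x_train.append(x_train_full[train_indices])
    y_train.append(y_train_full[train_indices])

    # Indices of this class in the *test* labels -- this is the missing step
    test_class_indices = np.where(y_test_full == class_n)[0]
    test_indices = np.random.choice(test_class_indices, 200, replace=False)
    x_test.append(x_test_full[test_indices])
    y_test.append(y_test_full[test_indices])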
I have a large dataset (> 62 GiB) that, after processing, is saved as two numpy.memmap arrays, one for the data and the other for the labels. The dataset has shapes (7390, 60, 224, 224, 3) and (7390,), and it is NOT shuffled, so I need to shuffle it first.
Currently I use TensorFlow 2, and I used this code with a generator to manage the memmap arrays:
def my_generator():
    for i in range(len(numpy_array)):
        yield numpy_array[i,:,:,:,:], np.array(labels[i]).reshape(1)

full_dataset = tf.data.Dataset.from_generator(
    generator=my_generator,
    output_types=(np.uint8, np.int32),
    output_shapes=((60,224,224,3), (1))
)

full_dataset = full_dataset.shuffle(SHUFFLE_BUFFER_SIZE, reshuffle_each_iteration=False)
train_dataset = full_dataset.take(train_size)
test_dataset = full_dataset.skip(train_size)
val_dataset = test_dataset.skip(test_size)
test_dataset = test_dataset.take(test_size)
That way I can train with shuffling and batching without loading the entire dataset into memory.
Now, with the current model and dataset, the VRAM is not enough for more than 2 batches to be loaded as tensors, and I can't train with a batch size of 2.
I thought of gradient accumulation, but I couldn't get it working in TF2, while I found it easy in PyTorch; however, I can't figure out how to handle the memmap arrays with shuffling and splitting in PyTorch the way I do in TensorFlow with generators.
So I need to know how to load the dataset in PyTorch with the same shuffling and batching, or whether someone has ready-made code for gradient accumulation in TF2.
I will just address the shuffle question.
Instead of shuffling with tf.data.Dataset, do it at the generator level. This should work:
class Generator(object):

    def __init__(self, images, labels, batch_size):
        self.images = images
        self.labels = labels
        self.batch_size = batch_size
        self.idxs = np.arange(len(self.images))
        self.on_epoch_end()

    def on_epoch_end(self):
        # Shuffle the indices
        np.random.shuffle(self.idxs)

    def generator(self):
        i = 0
        while i < len(self.idxs):
            idx = self.idxs[i]
            # use the shuffled index for both the image and its label
            yield (self.images[idx], self.labels[idx])
            i += 1
        self.on_epoch_end()

    def batch_generator(self):
        it = self.generator()  # generator() already returns an iterator
        while True:
            vals = [next(it) for i in range(self.batch_size)]
            images, labels = zip(*vals)
            yield images, labels
Then you can use it by
gen = Generator(...)
it = gen.batch_generator()
batch = next(it) # Call this every time you want a new batch
I'm sure PyTorch has built-in methods for this kind of thing, though.
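For what it's worth, a rough PyTorch sketch (illustrative only, reusing the numpy_array and labels memmaps from the question) could wrap them in a torch.utils.data.Dataset and let DataLoader handle the shuffling and batching:

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader, random_split

class MemmapDataset(Dataset):
    # Wraps the two memmap arrays; each sample is read lazily by index.
    def __init__(self, data, labels):
        self.data = data      # memmap of shape (7390, 60, 224, 224, 3)
        self.labels = labels  # memmap of shape (7390,)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x = torch.from_numpy(np.asarray(self.data[idx]))  # copies only this sample into memory
        y = torch.tensor(int(self.labels[idx]))
        return x, y

full_dataset = MemmapDataset(numpy_array, labels)
train_size = int(0.8 * len(full_dataset))
train_ds, test_ds = random_split(full_dataset, [train_size, len(full_dataset) - train_size])

# DataLoader reshuffles indices every epoch and only loads the sampled items.
train_loader = DataLoader(train_ds, batch_size=2, shuffle=True)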
I'm attempting to create a CNN to classify emotions based on facial expressions from an image. I'm using this dataset I found on Kaggle, in csv format with each row having grayscale image data and emotion. It has ~30,000 data points randomly split into 80% training, 10% validation and 10% testing.
While adjusting the settings and structure of my CNN I've experienced at least one of these problems at any given time:
CNN only (or mainly) predicting one or two outputs.
Validation accuracy that does not change at all.
Accuracy fluctuating around the level of random guessing (16.66%), +/- 10%.
Constant accuracy and loss for both training and validation.
I've tried changing the learning rate, optimizer, batch size, number of filters, and number of epochs. I've used a different dataset, tried class weights for the slight imbalance, and varied the number and complexity of hidden layers.
I've also tried training on one sample but got this for accuracy and this for loss.
The following graphs are for the configuration currently in the code, though changing what I've listed above just results in any of the problems I mentioned.
Train/Val Accuracy during training
Train/Val Loss during training
I suspected it might be due to labels being mismatched with the data, so I manually checked whether labels and images matched in train_set, which they did.
However, using the following code to check them in all_sets (after conversion) gave several mismatched samples out of 16.
for num, row in enumerate(all_sets[0]):
    if num < 16:
        newimage = tensorflow.keras.preprocessing.image.array_to_img(row)
        newimage.save("filename.png") # takes type from filename extension
        display(Image('filename.png'))
        print(classes[all_labels[0][num]])
        !rm -rf filename.png
Conversion code in question:
tempSets = [train_set, test_set, val_set] #store all sets in list to iterate through

#panda reads image from CSV as a string rather than an array of floats.
#convert from panda dataframe to numpy array, as easier to feed into image augmentation
set_sizes = [train_set.shape[0], test_set.shape[0], val_set.shape[0]] #Array storing number of records for each set

all_sets = [np.empty([set_sizes[0], 48,48,1], dtype=float), # train
            np.empty([set_sizes[1], 48,48,1], dtype=float), # test
            np.empty([set_sizes[2], 48,48,1], dtype=float)] # validate

all_labels = [np.empty(set_sizes[0], dtype=int), # train
              np.empty(set_sizes[1], dtype=int), # test
              np.empty(set_sizes[2], dtype=int)] # validate

for count, val in enumerate(tempSets): # for each set
    for num, row in enumerate(val.itertuples()): #each row in a set, stored as a tuple, keeping index num
        num_str_list = row._2.split() # split long string into array of strings separated by whitespace (_2 is the pixels column; it gets renamed because of the leading space in the header)
        for convertI in range(len(num_str_list)): # for each element in string array
            num_str_list[convertI] = float(num_str_list[convertI]) #convert to float and store in new (1d) array
        for xPixel in range(48):
            for yPixel in range(48):
                #match indexes to map 1d array contents into 2d
                all_sets[count][num, xPixel, yPixel, 0] = num_str_list[xPixel*48 + yPixel]
        all_labels[count][num] = data['emotion'][num] #assign labels with matching index
I'm afraid I'm lost as to where the issue is, as I can't find problems in my conversion or in displaying the images. I would appreciate any help. Thank you.
My full code:
drive.mount('/content/drive', force_remount=False)
driveContent='/content/drive/My Drive/EmotionClassification'
#Set base_dir as current working directory
base_dir = os.getcwd()
#Check whether training data is present in current working directory
dataSetIsInColab = False
for fname in os.listdir(base_dir): #get all names in current directory
    if fname == 'icml_face_data.csv':
        dataSetIsInColab = True

if dataSetIsInColab == False:
    #dataset.zip is a zip file containing a CSV file which holds the dataset.
    zip_path = os.path.join(driveContent,'dataset.zip') #grab content from drive
    !cp "{zip_path}" . #Copy zip file to current working directory
    !unzip -q dataset.zip # Unzip (-q prevents printing all files)
    !rm dataset.zip #Remove zip file since we now have unzipped data
data = panda.read_csv('icml_face_data.csv') # import data as panda dataframe object
data.pop(' Usage') #remove unnecessary column
data.rename(columns={" pixels":"pixels"}) # remove space from column name as it can cause issues
data[data['emotion']!=1] #remove disgust emotion due to large imbalance
data.loc[data['emotion'] > 1, 'emotion'] = data['emotion'] - 1 #shift emotion values to fill disgust slot
classes = ['anger', 'fear', 'happiness', 'sadness', 'surprise', 'neutral']
#Use panda dataframe methods to randomly split dataset
train_set = data.sample(frac=0.8) # Train set is 80%
temp_set = data.drop(train_set.index) # Temp set is 20%
test_set = temp_set.sample(frac=0.5) #Test set is half of 20%
val_set = temp_set.drop(test_set.index) #Val set is the other half of 20%
#train_set.reset_index(drop=True, inplace=True)
#test_set.reset_index(drop=True, inplace=True)
#val_set.reset_index(drop=True, inplace=True)
tempSets = [train_set, test_set, val_set] #store all sets in list to iterate through

#panda reads image from CSV as a string rather than an array of floats.
#convert from panda dataframe to numpy array, as easier to feed into image augmentation
set_sizes = [train_set.shape[0], test_set.shape[0], val_set.shape[0]] #Array storing number of records for each set

all_sets = [np.empty([set_sizes[0], 48,48,1], dtype=float), # train
            np.empty([set_sizes[1], 48,48,1], dtype=float), # test
            np.empty([set_sizes[2], 48,48,1], dtype=float)] # validate

all_labels = [np.empty(set_sizes[0], dtype=int), # train
              np.empty(set_sizes[1], dtype=int), # test
              np.empty(set_sizes[2], dtype=int)] # validate

for count, val in enumerate(tempSets): # for each set
    for num, row in enumerate(val.itertuples()): #each row in a set, stored as a tuple, keeping index num
        num_str_list = row._2.split() # split long string into array of strings separated by whitespace (_2 is the pixels column; it gets renamed because of the leading space in the header)
        for convertI in range(len(num_str_list)): # for each element in string array
            num_str_list[convertI] = float(num_str_list[convertI]) #convert to float and store in new (1d) array
        for xPixel in range(48):
            for yPixel in range(48):
                #match indexes to map 1d array contents into 2d
                all_sets[count][num, xPixel, yPixel, 0] = num_str_list[xPixel*48 + yPixel]
        all_labels[count][num] = data['emotion'][num] #assign labels with matching index
#Check to use GPU if possible
gpu_name = tensorflow.test.gpu_device_name()
if gpu_name != '/device:GPU:0':
    print('GPU device not found')
print('Found GPU at: {}'.format(gpu_name))
#SET UP IMAGE AUGMENTATION AND INPUT
imgShape=(48,48,1) # input image dimensions (greyscale)
batchSize = 64
#this datagen is applied to training sets
dataGen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=45,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True)
# Set up data generator batches, shapes and class modes
trainingDataGen = dataGen.flow(
    x=all_sets[0],
    y=tensorflow.keras.utils.to_categorical(all_labels[0]),
    batch_size=batchSize)

testGen = ImageDataGenerator(rescale=1./255).flow(
    x=all_sets[1],
    y=tensorflow.keras.utils.to_categorical(all_labels[1]))

validGen = ImageDataGenerator(rescale=1./255).flow(
    x=all_sets[2],
    y=tensorflow.keras.utils.to_categorical(all_labels[2]),
    batch_size=batchSize)
# BUILD THE MODEL
model = Sequential()
model.add(Conv2D(256, (3,3), activation='relu', input_shape = imgShape))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.1))
model.add(Conv2D(512, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(6, activation='softmax'))
model.summary() # Show the structure of the network
opt = tensorflow.keras.optimizers.Adam(learning_rate = 0.0001) # initialise optimizer
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
weights = sklearn.utils.class_weight.compute_class_weight('balanced', np.unique(all_labels[0]), all_labels[0]) # use class weights for imbalanced classes
weights = {i : weights[i] for i in range(6)}
history = model.fit(
    trainingDataGen,
    batch_size = batchSize,
    epochs = 32,
    class_weight = weights,
    validation_data = validGen
)
EDIT: I found the issue in my conversion, I was assigning labels based on the unsplit data instead of each split set individually.
From all_labels[count][num] = data['emotion'][num] to
all_labels[count][num] = row.emotion
Labels are matching now. Training accuracy/loss steadily improves, but validation accuracy and loss are fluctuating a lot. I'm going to do some more tweaking, but is this something I should worry about?
EDIT 2: Made title more fitting
So I left it training for more epochs, and here are the results.
accuracy and loss.
Peaking at 35% isn't ideal
EDIT 3: It's back to fluctuating wildly.
accuracy and loss.
I used some code to test the predictions of my network after training. Here are the results:
Total tests performed: 4068
Total accuracy: 0.17649950835791545
anger : {'Successes': 0, 'Occured': 567} Accuracy: 0.0
fear : {'Successes': 0, 'Occured': 633} Accuracy: 0.0
happiness : {'Successes': 0, 'Occured': 1023} Accuracy: 0.0
sadness : {'Successes': 0, 'Occured': 685} Accuracy: 0.0
surprise : {'Successes': 0, 'Occured': 442} Accuracy: 0.0
neutral : {'Successes': 718, 'Occured': 718} Accuracy: 1.0
Code below:
#Test our model's performance
truth = []
predict = []
#Test a batch of 128 test images against the model
for i in range(128):
    a,b = next(testGen)
    predict.append(model.predict(a))
    truth.append(b)
predict = np.concatenate(predict) #Array of predictions model has made
truth = np.concatenate(truth) #Array of true results
successes = 0
items = []
for i in range(len(classes)):
    items.append({"Successes" : 0, "Occured" : 0})

for i in range(0, len(predict)):
    items[np.argmax(truth[i])]["Occured"] += 1
    if (np.argmax(predict[i]) == np.argmax(truth[i])): # If the model's prediction matches the true value
        successes += 1
        items[np.argmax(truth[i])]["Successes"] += 1
print("Total tests performed: ", len(predict))
print("Total accuracy: ", successes/len(predict))
print()
for i in range(len(classes)):
    print(classes[i], ':', items[i], "Accuracy: ", (items[i]["Successes"]/items[i]["Occured"]))
I am trying to create a preprocessing function so that the training_dataset can be directly fed into a keras sequential neural network. The preprocess function should return features and labels.
def preprocessing_function(data):
features = ...
labels = ...
return features, labels
dataset, info = tfds.load(name='cats_vs_dogs', split=tfds.Split.TRAIN, with_info=True)
training_dataset = dataset.map(preprocessing_function)
How should I write the preprocessing_function? I spent several hours researching and trying to make it happen, but to no avail. Hoping someone can assist.
Here are two functions for preprocessing. The first will be applied to both the training and validation data, to normalize the data and resize it to the size expected by the network. The second, augmentation, will be applied to the training set only. The type of augmentation you want depends on your dataset and application, but I've provided this as an example.
#Fetching, pre-processing & preparing data-pipeline
def preprocess(ds):
    x = tf.image.resize_with_pad(ds['image'], IMG_SIZE_W, IMG_SIZE_H)
    x = tf.cast(x, tf.float32)
    x = (x-MEAN)/(VARIANCE)
    y = tf.one_hot(ds['label'], NUM_CLASSES)
    return x, y

def augmentation(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.resize_with_crop_or_pad(image, IMG_W+4, IMG_W+4) # zero pad each side with 4 pixels
    image = tf.image.random_crop(image, size=[BATCH_SIZE, IMG_W, IMG_H, 3]) # random crop back to IMG_W x IMG_H
    return image, label
and to load training and validation datasets, do something like this:
def get_dataset(dataset_name, shuffle_buff_size=1024, batch_size=BATCH_SIZE, augmented=True):
    train, info_train = tfds.load(dataset_name, split='train[:80%]', with_info=True)
    val, info_val = tfds.load(dataset_name, split='train[80%:]', with_info=True)

    TRAIN_SIZE = info_train.splits['train'].num_examples * 0.8
    VAL_SIZE = info_train.splits['train'].num_examples * 0.2

    train = train.map(preprocess).cache().repeat().shuffle(shuffle_buff_size).batch(batch_size)
    if augmented==True:
        train = train.map(augmentation)
    train = train.prefetch(tf.data.experimental.AUTOTUNE)

    val = val.map(preprocess).cache().repeat().batch(batch_size)
    val = val.prefetch(tf.data.experimental.AUTOTUNE)

    return train, info_train, val, info_val, TRAIN_SIZE, VAL_SIZE
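A possible usage sketch (assuming model is a compiled tf.keras model; the dataset name comes from the question, and the epoch count is illustrative). Because the datasets are repeated, you need to tell fit how many steps make up one epoch:

train, info_train, val, info_val, TRAIN_SIZE, VAL_SIZE = get_dataset('cats_vs_dogs')

model.fit(train,
          steps_per_epoch=int(TRAIN_SIZE // BATCH_SIZE),
          validation_data=val,
          validation_steps=int(VAL_SIZE // BATCH_SIZE),
          epochs=10)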
I am training a CNN model in Keras (TensorFlow backend). I have used on-the-fly augmentation with fit_generator(). The model takes images as input and is supposed to predict the steering angle for a self-driving car. The training just freezes after this point. I have tried changing the batch size, learning rate, etc., but it doesn't work.
The training freezes at the end of the first epoch.
Please help!
BATCH_SIZE=32
INPUT_IMAGE_ROWS=160
INPUT_IMAGE_COLS=320
INPUT_IMAGE_CHANNELS=3
AUGMENTATION_NUM_BINS=200
NUM_EPOCHS=3
AUGMENTATION_BIN_MAX_PERC=0.5
AUGMENTATION_FACTOR=3
import csv
import cv2
import numpy as np
from random import shuffle
from sklearn.model_selection import train_test_split
import keras
from keras.callbacks import Callback
import math
from keras.preprocessing.image import *
print("\nLoading the dataset from file ...")
def load_dataset(file_path):
    dataset = []
    with open(file_path) as csvfile:
        reader = csv.reader(csvfile)
        for line in reader:
            try:
                dataset.append({'center':line[0], 'left':line[1], 'right':line[2], 'steering':float(line[3]),
                                'throttle':float(line[4]), 'brake':float(line[5]), 'speed':float(line[6])})
            except:
                continue # some images throw error during loading
    return dataset
dataset = load_dataset('C:\\Users\\kiit1\\Documents\\steering angle prediction\\dataset_coldivision\\data\\driving_log.csv')
print("Loaded {} samples from file {}".format(len(dataset),'C:\\Users\\kiit1\\Documents\\steering angle prediction\\dataset_coldivision\\data\\driving_log.csv'))
print("Partioning the dataset:")
shuffle(dataset)
#partitioning data into 80% training, 19% validation and 1% testing
X_train,X_validation=train_test_split(dataset,test_size=0.2)
X_validation,X_test=train_test_split(X_validation,test_size=0.05)
print("X_train has {} elements.".format(len(X_train)))
print("X_validation has {} elements.".format(len(X_validation)))
print("X_test has {} elements.".format(len(X_test)))
print("Partitioning the dataset complete.")
def generate_batch_data(dataset, batch_size = 32):
    global augmented_steering_angles
    global epoch_steering_count
    global epoch_bin_hits

    batch_images = np.zeros((batch_size, INPUT_IMAGE_ROWS, INPUT_IMAGE_COLS, INPUT_IMAGE_CHANNELS))
    batch_steering_angles = np.zeros(batch_size)

    while 1:
        for batch_index in range(batch_size):

            # select a random image from the dataset
            image_index = np.random.randint(len(dataset))
            image_data = dataset[image_index]

            while 1:
                try:
                    image, steering_angle = load_and_augment_image(image_data)
                except:
                    continue

                bin_idx = int(steering_angle * AUGMENTATION_NUM_BINS / 2)

                if( epoch_bin_hits[bin_idx] < epoch_steering_count/AUGMENTATION_NUM_BINS*AUGMENTATION_BIN_MAX_PERC
                        or epoch_steering_count<500 ):
                    batch_images[batch_index] = image
                    batch_steering_angles[batch_index] = steering_angle
                    augmented_steering_angles.append(steering_angle)

                    epoch_bin_hits[bin_idx] = epoch_bin_hits[bin_idx] + 1
                    epoch_steering_count = epoch_steering_count + 1
                    break

        yield batch_images, batch_steering_angles
print("\nTraining the model ...")
class LifecycleCallback(keras.callbacks.Callback):

    def on_epoch_begin(self, epoch, logs={}):
        pass

    def on_epoch_end(self, epoch, logs={}):
        global epoch_steering_count
        global epoch_bin_hits
        global bin_range
        epoch_steering_count = 0
        epoch_bin_hits = {k:0 for k in range(-bin_range, bin_range)}

    def on_batch_begin(self, batch, logs={}):
        pass

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

    def on_train_begin(self, logs={}):
        print('Beginning training')
        self.losses = []

    def on_train_end(self, logs={}):
        print('Ending training')
# Compute the correct number of samples per epoch based on batch size
def compute_samples_per_epoch(array_size, batch_size):
    num_batches = array_size / batch_size
    samples_per_epoch = math.ceil(num_batches)
    samples_per_epoch = samples_per_epoch * batch_size
    return samples_per_epoch
def load_and_augment_image(image_data, side_camera_offset=0.2):

    # select a value between 0 and 2 to switch between center, left and right image
    index = np.random.randint(3)

    if (index==0):
        image_file = image_data['left'].strip()
        angle_offset = side_camera_offset
    elif (index==1):
        image_file = image_data['center'].strip()
        angle_offset = 0.
    elif (index==2):
        image_file = image_data['right'].strip()
        angle_offset = - side_camera_offset

    steering_angle = image_data['steering'] + angle_offset

    image = cv2.imread(image_file)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # apply a mixture of several augmentation methods
    image, steering_angle = random_transform(image, steering_angle)

    return image, steering_angle
augmented_steering_angles = []
epoch_steering_count = 0
bin_range = int(AUGMENTATION_NUM_BINS / 4 * 3)
epoch_bin_hits = {k:0 for k in range(-bin_range, bin_range)}
#flips image about y-axis
def horizontal_flip(image, steering_angle):
    flipped_image = cv2.flip(image, 1)
    steering_angle = -steering_angle
    return flipped_image, steering_angle
def translate(image, steering_angle, width_shift_range=50.0, height_shift_range=5.0):
    tx = width_shift_range * np.random.uniform() - width_shift_range / 2
    ty = height_shift_range * np.random.uniform() - height_shift_range / 2

    # new steering angle
    steering_angle += tx / width_shift_range * 2 * 0.2

    transformed_matrix = np.float32([[1,0,tx],[0,1,ty]])
    rows, cols = (image.shape[0], image.shape[1])
    translated_image = cv2.warpAffine(image, transformed_matrix, (cols, rows))
    return translated_image, steering_angle
def brightness(image, bright_increase=None):
    if(image.shape[2] > 1):
        image_hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    else:
        image_hsv = image

    if bright_increase:
        image_hsv[:,:,2] += bright_increase
    else:
        bright_increase = int(30 * np.random.uniform(-0.3,1))
        image_hsv[:,:,2] = image[:,:,2] + bright_increase

    image = cv2.cvtColor(image_hsv, cv2.COLOR_HSV2RGB)
    return image
def rotation(image, rotation_range=5):
    image = random_rotation(image, rotation_range)
    return image
# Shift range for each channel
def channel_shift(image, intensity=30, channel_axis=2):
    image = random_channel_shift(image, intensity, channel_axis)
    return image
# Crop and resize the image
def crop_resize_image(image, cols=INPUT_IMAGE_COLS, rows=INPUT_IMAGE_ROWS, top_crop_perc=0.1, bottom_crop_perc=0.2):

    height = image.shape[0]
    width = image.shape[1]

    # crop top and bottom
    top_rows = int(height*top_crop_perc)
    bottom_rows = int(height*bottom_crop_perc)
    image = image[top_rows:height-bottom_rows, 0:width]

    # resize to the final sizes even if the aspect ratio is distorted
    image = cv2.resize(image, (cols, rows), interpolation=cv2.INTER_LINEAR)

    return image
# Apply a sequence of random transformations for better generalization and to prevent overfitting
def random_transform(image, steering_angle):

    # all further transformations are done on the smaller image to reduce the processing time
    image = crop_resize_image(image)

    # every second image is flipped horizontally
    if np.random.random() < 0.5:
        image, steering_angle = horizontal_flip(image, steering_angle)

    image, steering_angle = translate(image, steering_angle)
    image = rotation(image)
    image = brightness(image)
    image = channel_shift(image)

    return img_to_array(image), steering_angle
from keras.models import Sequential, Model
from keras.layers.core import Lambda, Dense, Activation, Flatten, Dropout
from keras.layers.convolutional import Cropping2D, Convolution2D
from keras.layers.advanced_activations import ELU
from keras.layers.noise import GaussianNoise
from keras.optimizers import Adam
print("\nBuilding and compiling the model ...")
model = Sequential()
model.add(Lambda(lambda x: (x / 127.5) - 1.0, input_shape=(INPUT_IMAGE_ROWS, INPUT_IMAGE_COLS, INPUT_IMAGE_CHANNELS)))
# Conv Layer 1 of 16 filters having size (8, 8) with strides (4, 4)
model.add(Convolution2D(16, 8, 8, subsample=(4, 4), border_mode="same"))
model.add(ELU())

# Conv Layer 2 of 32 filters having size (5, 5) with strides (2, 2)
model.add(Convolution2D(32, 5, 5, subsample=(2, 2), border_mode="same"))
model.add(ELU())

# Conv Layer 3 of 64 filters having size (5, 5) with strides (2, 2)
model.add(Convolution2D(64, 5, 5, subsample=(2, 2), border_mode="same"))
model.add(Flatten())
model.add(Dropout(.5))
model.add(ELU())
model.add(Dense(512))
model.add(Dropout(.5))
model.add(ELU())
model.add(Dense(1))
model.summary()
adam = Adam(lr=0.0001)
model.compile(loss='mse', optimizer=adam)
lifecycle_callback = LifecycleCallback()
train_generator = generate_batch_data(X_train, BATCH_SIZE)
validation_generator = generate_batch_data(X_validation, BATCH_SIZE)
samples_per_epoch = compute_samples_per_epoch((len(X_train)*AUGMENTATION_FACTOR), BATCH_SIZE)
nb_val_samples = compute_samples_per_epoch((len(X_validation)*AUGMENTATION_FACTOR), BATCH_SIZE)
history = model.fit_generator(train_generator,
                              validation_data = validation_generator,
                              samples_per_epoch = ((len(X_train) // BATCH_SIZE ) * BATCH_SIZE) * 2,
                              nb_val_samples = ((len(X_validation) // BATCH_SIZE ) * BATCH_SIZE) * 2,
                              nb_epoch = NUM_EPOCHS, verbose=1,
                              )

print("\nTraining the model ended.")
You have a weird structure for the data generator and that is most likely causing this issue, though I cannot be completely sure.
Your structure is as follows:
while 1:
    ....
    for _ in range(batch_size):
        randomly select an image # this is inefficient, see below for comments
        while 1:
            process image
            if epoch is not done:
                collect images in a list
                break
    yield ...
Now,
Do not choose images randomly at each iteration. Instead, shuffle your dataset once at the start of each epoch and then pick images sequentially.
As far as I understand, "if epoch is not done, then break" is a typo. Did you mean "if epoch is not done, then collect images, otherwise break"? Your break is inside the if, which means that when it enters the if for the first time, it will come out of the innermost while 1 loop. Surely not what you intend to do, right?
The yield is outside the for loop. You should yield each batch, so if for is iterating over batches, then yield should be inside for.
The structure of a basic data generator should be:
while 1:
    shuffle entire dataset once # not applicable for massive datasets
    for _ in range(n_batches_per_epoch):
        get a data batch
        Optionally, do some preprocessing # preferably on the entire batch, not one by one;
                                          # you could also preprocess the entire dataset if
                                          # it's simple enough, such as mean subtraction
        yield batches, labels
I would suggest you rewrite the data generator. You could look at the myGenerator() function on this page for a basic data generator. Once you have written the generator, test it as a stand-alone function to make sure it outputs data indefinitely and keeps track of the epochs.
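Purely as an illustration (not the myGenerator() linked above), a minimal stand-alone generator following this structure, assuming dataset is the list of sample dicts from the question and ignoring the steering-angle binning logic, might look like:

def simple_batch_generator(dataset, batch_size=32):
    n_batches_per_epoch = len(dataset) // batch_size
    while 1:
        shuffle(dataset)  # shuffle the whole epoch once, then read it sequentially
        for b in range(n_batches_per_epoch):
            batch = dataset[b * batch_size:(b + 1) * batch_size]
            images, angles = [], []
            for image_data in batch:
                image, steering_angle = load_and_augment_image(image_data)
                images.append(image)
                angles.append(steering_angle)
            yield np.array(images), np.array(angles)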
In short, it is hard to say which part is problematic: maybe the data, maybe the model, or something else. So please be patient, and you will resolve the issue eventually.
First of all, you can train a baseLine model without data augmentation. If your data augmentation is helpful, you should expect a performance improvement after applying the data augmentation to a new augmLine model.
If baseLine behaves similarly to augmLine, you may consider changing your network design. For example, in your current design, 1) Conv2D layers without any activation are very rare, and you may want to use relu or tanh, and 2) ELU(alpha) is known to be sensitive to the alpha value.
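Purely as an illustration of point 1), a minimal sketch of adding activations to the question's existing conv layers (same old Keras API and parameters as in the question) could be:

model.add(Convolution2D(16, 8, 8, subsample=(4, 4), border_mode="same", activation='relu'))
model.add(Convolution2D(32, 5, 5, subsample=(2, 2), border_mode="same", activation='relu'))
model.add(Convolution2D(64, 5, 5, subsample=(2, 2), border_mode="same", activation='relu'))

If you go this way, the separate ELU() layers after the convolutions would be dropped.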
If baseLine actually works fine, this is an indicator that your augmLine's data is problematic. To ensure the correctness of the augmented data, it is best to plot both the image data and the target values and verify them manually. One common mistake in image data augmentation is that if the target values depend on the input image, you have to generate new target values that match the augmented image. Sometimes this task is not trivial.
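For a quick manual check of the augmented data, a small sketch like this (assuming matplotlib is available and reusing generate_batch_data and X_train from the question) plots a few augmented frames next to their steering angles:

import matplotlib.pyplot as plt

batch_images, batch_angles = next(generate_batch_data(X_train, batch_size=8))
fig, axes = plt.subplots(2, 4, figsize=(12, 6))
for ax, image, angle in zip(axes.flat, batch_images, batch_angles):
    ax.imshow(image.astype('uint8'))
    ax.set_title('steering: {:.3f}'.format(angle))  # the angle should match the flipped/shifted image
    ax.axis('off')
plt.show()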
Note: to have a fair comparison, you need to keep the validation data unchanged for both experiments.