keras epoch runs twice

keras epoch runs twice - tensorflow

When I run the model.fit_generator, the epoch runs twice, the first time it only goes up to 39/40, then the second time 40/40. Any reason why this is happening?
Here is a GIF, you can also see epoch 1/2 actually appears in the epoch 2/2 run. This only happens when I pass validation_data=validation_generator
Update, Here is the code:
The dataset is from here
https://tiny-imagenet.herokuapp.com/
Packages are:
absl-py==0.9.0
astor==0.7.1
attrs==19.3.0
autopep8==1.4.4
backcall==0.1.0
bleach==3.1.4
brotlipy==0.7.0
certifi==2020.4.5.1
cffi==1.14.0
chardet==3.0.4
colorama==0.4.3
cryptography==2.8
cycler==0.10.0
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
future==0.18.2
gast==0.2.2
google-pasta==0.2.0
grpcio==1.23.0
h5py==2.10.0
idna==2.9
imageio==2.8.0
importlib-metadata==1.6.0
ipykernel==5.2.0
ipython==7.13.0
ipython-genutils==0.2.0
jedi==0.17.0
Jinja2==2.11.2
joblib==0.14.1
json5==0.9.0
jsonschema==3.2.0
jupyter-client==6.1.3
jupyter-core==4.6.3
jupyter-tensorboard==0.2.0
jupyterlab==2.1.0
jupyterlab-server==1.1.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.2.0
llvmlite==0.31.0
Markdown==3.2.1
MarkupSafe==1.1.1
matplotlib==3.2.1
mistune==0.8.4
nbconvert==5.6.1
nbformat==5.0.6
notebook==6.0.3
numba==0.48.0
numpy==1.18.1
olefile==0.46
opt-einsum==0+untagged.56.g2664021.dirty
pandas==1.0.3
pandocfilters==1.4.2
parso==0.7.0
pickleshare==0.7.5
Pillow==7.1.1
prometheus-client==0.7.1
prompt-toolkit==3.0.5
protobuf==3.11.4
pycparser==2.20
Pygments==2.6.1
pyOpenSSL==19.1.0
pyparsing==2.4.7
PyQt5==5.12.3
PyQt5-sip==4.19.18
PyQtWebEngine==5.12.1
pyreadline==2.1
pyrsistent==0.16.0
PySocks==1.7.1
python-dateutil==2.8.1
pytz==2019.3
pywin32==227
pywinpty==0.5.7
pyzmq==19.0.0
requests==2.23.0
scikit-learn==0.22.2.post1
scipy==1.2.1
Send2Trash==1.5.0
six==1.14.0
tensorboard==1.15.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
tornado==6.0.4
traitlets==4.3.3
urllib3==1.25.9
wcwidth==0.1.9
webencodings==0.5.1
Werkzeug==0.16.1
win-inet-pton==1.1.0
wincertstore==0.2
wrapt==1.12.1
zipp==3.1.0
Code is
train_datagen = ImageDataGenerator(validation_split=0.9)
train_generator = train_datagen.flow_from_directory(directory= 'tiny-imagenet-200/train/',
target_size=(64, 64),
batch_size=256,
class_mode='categorical',
shuffle=True,
seed=42,
subset ="training"
)
val_data = pd.read_csv('./tiny-imagenet-200/val/val_annotations.txt', sep='\t', header=None, names=['File', 'Class', 'X', 'Y', 'H', 'W'])
val_data.drop(['X', 'Y', 'H', 'W'], axis=1, inplace=True)
valid_datagen = ImageDataGenerator(validation_split=0.9)
validation_generator = valid_datagen.flow_from_dataframe(dataframe=val_data,
directory='./tiny-imagenet-200/val/images/',
x_col='File',
y_col='Class',
target_size=(64, 64),
color_mode='rgb',
class_mode='categorical',
batch_size=256,
shuffle=True,
seed=42,
subset ="training")
history = model.fit_generator(train_generator,
epochs=2,
validation_data=validation_generator,
#callbacks=[tensorboard_callback]
)

You are using validation_split when instancing ImageDataGenerator and setting subset ="training" to validation_generator, but you actually have your validation and training sets separated in different directories. Now, I'm not 100% sure, but I think it may have to do with it.
Also, I would use the same common arguments for both training and validation when calling flow_from_dataframe: x_col, y_col, target_size, color_mode, etc.
Take a look at the examples shown here (official docs):
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
'data/train',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
'data/validation',
target_size=(150, 150),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
steps_per_epoch=2000,
epochs=50,
validation_data=validation_generator,
validation_steps=800) ```

Related

Autoencoder Custom Dataset tensorflow 2.3 ValueError: `y` argument is not supported when using dataset as input

I'm trying to implement an Autoencoder in Tensorflow 2.3. I am taking my own Image dataset stored on disk as input.can someone explain to me how this can be done in a correct way?
I tried loading the data in directory using tf.keras.preprocessing.image_dataset_from_directory() but when I use start training with the data taken from above method I am getting following error.
"ValueError: y argument is not supported when using dataset as input."
PFB the code that I am running
'''
import tensorflow as tf
from convautoencoder import ConvAutoencoder
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np
EPOCHS = 25
batch_size = 1
img_height = 180
img_width = 180
data_dir = "/media/aniruddha/FE47-91B8/Laptop_Backup/Auto-Encoders/Basic/data"
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
data_dir,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width),
batch_size=batch_size)
(encoder, decoder, autoencoder) = ConvAutoencoder.build(224, 224, 3)
opt = Adam(lr=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)
H = autoencoder.fit( train_ds, train_ds, validation_data=(val_ds, val_ds), epochs=EPOCHS, batch_size=batch_size)
'''

I resolved this. I was not feeding the input dataset as a tuple to the model for training. Once I corrected that the training started.

I used generators to feed the input data as tuple to the autoencoder.
Please find my code below.
# initialize the training training data augmentation object
trainAug = ImageDataGenerator(rescale=1. / 255)
valAug = ImageDataGenerator(rescale=1. / 255)
# initialize the training generator
trainGen = trainAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=True,
batch_size=BS)
# initialize the validation generator
valGen = valAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=False,
batch_size=BS)
# initialize the testing generator
testGen = valAug.flow_from_directory(
config.TRAIN_PATH,
class_mode="input",
classes=None,
target_size=(64, 64),
color_mode="grayscale",
shuffle=False,
batch_size=BS)
early_stop = EarlyStopping(monitor='val_loss', patience=20)
mc = ModelCheckpoint('best_model_1.h5', monitor='val_loss', mode='min', save_best_only=True)
# construct our convolutional autoencoder
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = ConvAutoencoder.build(64, 64, 1)
opt = Adam(learning_rate= 0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-04, amsgrad=False)
autoencoder.compile(loss="mse", optimizer=opt)
# train the convolutional autoencoder
H = autoencoder.fit( trainGen, validation_data=valGen, epochs=EPOCHS, batch_size=BS ,callbacks=[ mc , early_stop])

fit is expecting data and labels, but it only accepts a single tf.data.Dataset. To use data as labels for the autoencoder you should provide it twice to the dataset constructor, e.g. :
dataset = tf.data.Dataset.from_tensor_slices((images, images))

Tensorflow: train multiple models in parallel with the same ImageDataGenerator

I'm doing HPO on a small custom CNN. During training the GPU is under-utilised and I'm finding a bottleneck in the CPU: the data augmentation process is too slow. Looking online, I found that I could use multiple CPU cores for the generator and speedup the process. I set up workers=n_cores and this did improve things, but not as much as I'd like.
So I though that I could train multiple models simultaneously on the GPU, and feed the same augmented data to the models. However, I can't come up with some idea on how to do this and I couldn't find any similar question.
Here's a minimal example (I'm leaving out imports for brevity):
# load model and set only last layer as trainable
def create_model(learning_rate, alpha, dropout):
model_path = '/content/drive/My Drive/Progetto Advanced Machine Learning/Model Checkpoints/Custom Model 1 2020-06-01 10:56:21.010759.hdf5'
model = tf.keras.models.load_model(model_path)
x = model.layers[-2].output
x = Dropout(dropout)(x)
predictions = Dense(120, activation='softmax', name='prediction', kernel_regularizer=tf.keras.regularizers.l2(alpha))(x)
model = Model(inputs=model.inputs, outputs=predictions)
for layer in model.layers[:-2]:
layer.trainable = False
model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate), metrics=['accuracy'])
return model
#declare the search space
SEARCH_SPACE = [skopt.space.Real(0.0001, 0.1, name='learning_rate', prior='log-uniform'),
skopt.space.Real(1e-9, 1, name='alpha', prior='log-uniform'),
skopt.space.Real(0.0001, 0.95, name='dropout', prior='log-uniform')]
# declare generator
train_datagenerator = ImageDataGenerator(rescale=1. / 255, rotation_range=30, zoom_range=0.2, horizontal_flip=True, validation_split=0.2, data_format='channels_last')
# training function to be called by the optimiser
#use_named_args(SEARCH_SPACE)
def fitness(learning_rate, alpha, dropout):
model = create_model(learning_rate, alpha, dropout)
#compile generators
train_batches = train_datagenerator.flow_from_directory(train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical" , batch_size=32, subset='training', seed = 20052020)
val_batches = train_datagenerator.flow_from_directory(directory=train_out_path, target_size=image_size, color_mode="rgb", class_mode="categorical" , batch_size=32, subset='validation', shuffle=False, seed = 20052020)
#train
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
training_results = model.fit(train_batches, epochs=5, verbose=1, shuffle=True, validation_data=val_batches, workers=2)
history[hyperpars] = training_results.history
with open(dict_save_path, 'wb') as f:
pickle.dump(history, f)
return training_results.history['val_accuracy'][-1]
# HPO
result = skopt.forest_minimize(fitness, SEARCH_SPACE, n_calls=10, callback=checkpoint_saver)

What's wrong with my ResNet50 on two machines?

Firstly, I trained a ResNet50 to be a six-class classifier from scratch on Kaggle, and got like this.
As you can see, the accuracy of training set and validation set improved steadily.
And after that, I rented a cloud host on the internet for a better GPU(1080ti), and copied my code (I uploaded my Jupyter notebook). And then I runned it. But strange things happened. My validation accuracy is extremely unsteady and always fluctuated widely (around 0.3). Here's the screenshot.
And also, the training on the host is much more difficult than on Kaggle kernel.
Here are the screenshots after some epochs.(actually the host's one is trained over much more epochs than the Kaggle's one)
And here's my codes of ImageDataGenerator.
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.1,
zoom_range=0.1,
width_shift_range=0.1,
height_shift_range=0.1,
horizontal_flip=True,
vertical_flip=True,
validation_split=0.1
)
test_datagen = ImageDataGenerator(
rescale=1./255,
validation_split=0.1
)
train_generator = train_datagen.flow_from_directory(
base_path,
target_size=(300, 300),
batch_size=16,
class_mode='categorical',
subset='training',
seed=0
)
validation_generator = test_datagen.flow_from_directory(
base_path,
target_size=(300, 300),
batch_size=16,
class_mode='categorical',
subset='validation',
seed=0
)

Keras, training with an empty category

Maybe it is a naive question.
I want to try a small experiment for research: train the model with an extra and empty category from the one that I have in the training and validation and see how the prediction for this extra category goes down with the number of samples and epochs.
In particular I added a 5th phantom category in the pandas dataframe.
I am also using an ImageDataGenerator.
train_datagen = ImageDataGenerator(
rotation_range=0,
rescale=1./255,
shear_range=0.0,
zoom_range=0.2,
horizontal_flip=False,
width_shift_range=0.0,
height_shift_range=0.0
)
train_generator = train_datagen.flow_from_dataframe(
train_df,
"/mypath/",
x_col='filename',
y_col='category',
target_size=IMAGE_SIZE,
class_mode='categorical',
batch_size=batch_size
)
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_dataframe(
validate_df,
"/mypath/",
x_col='filename',
y_col='category',
target_size=IMAGE_SIZE,
class_mode='categorical',
batch_size=batch_size
)
history = model.fit_generator(
train_generator,
epochs=epochs,
validation_data=validation_generator,
validation_steps=total_validate//batch_size,
steps_per_epoch=total_train//batch_size,
callbacks=callbacks
)
However when I a try to train the CNN I got the following error:
Error when checking target: expected dense_2 to have shape (5,) but got array with shape (4,)
Someone can suggest a workaround?

Low accurcacy - Transfer learning + bottle-neck keras-tensorflow (resnet50)

I'm trying to do transfer learning / bottle neck with keras/tensorflow on a google Colaboratory notebook. My problem is that the accuracy doesn't go over 6% (Kaggle's dog breed challenge, 120 classes, data generated with datagen.flow_from_directory)
Below is my code, is there something I'm missing?
tr_model=ResNet50(include_top=False,
weights='imagenet',
input_shape = (224, 224, 3),)
datagen = ImageDataGenerator(rescale=1. / 255)
#### Training ####
train_generator = datagen.flow_from_directory(train_data_dir,
target_size=(image_size,image_size),
class_mode=None,
batch_size=batch_size,
shuffle=False)
bottleneck_features_train = tr_model.predict_generator(train_generator)
train_labels = to_categorical(train_generator.classes , num_classes=num_classes)
#### Validation ####
validation_generator = datagen.flow_from_directory(validation_data_dir,
target_size=(image_size,image_size),
class_mode=None,
batch_size=batch_size,
shuffle=False)
bottleneck_features_validation = tr_model.predict_generator(validation_generator)
validation_labels = to_categorical(validation_generator.classes, num_classes=num_classes)
#### Model creation ####
model = Sequential()
model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
model.add(Dense(num_class, activation='softmax'))
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
history = model.fit(bottleneck_features_train, train_labels,
epochs=30,
batch_size=batch_size,
validation_data=(bottleneck_features_validation, validation_labels))
I get a val_acc = 0.0592
When I use ResNet50 with the last layer, I get a score of 82%.
Can anyone spot what's wrong with my code.

Suppress the rescale and add the preprocessing helped a lot.
Those modifications help immensely:
from keras.applications.resnet50 import preprocess_input
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
I now have an accuracy of 80%

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

keras epoch runs twice - tensorflow

Related

Autoencoder Custom Dataset tensorflow 2.3 ValueError: `y` argument is not supported when using dataset as input

Tensorflow: train multiple models in parallel with the same ImageDataGenerator

What's wrong with my ResNet50 on two machines?

Keras, training with an empty category

Low accurcacy - Transfer learning + bottle-neck keras-tensorflow (resnet50)

Categories

Resources