Data-augmentation generators not working with TensorFlow 2.0

I am trying to train a model with image data-augmentation generators on TensorFlow 2.0, after downloading Kaggle's cats_vs_dogs dataset, using the code below.
train_datagen = ImageDataGenerator(rescale=1. / 255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(train_dir,
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(validation_dir,
target_size=(150, 150),
batch_size=32,
class_mode='binary')
history = model.fit_generator(train_generator,
steps_per_epoch=100,
epochs=100,
validation_data=validation_generator,
validation_steps=50)
But during the first epoch, I get this error:
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
WARNING:tensorflow:From <ipython-input-18-e571f2719e1b>:27: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
Train for 100 steps, validate for 50 steps
Epoch 1/100
63/100 [=================>............] - ETA: 59s - loss: 0.7000 - accuracy: 0.5000 WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 10000 batches). You may need to use the repeat() function when building your dataset.
How should I modify the above code base for TensorFlow 2?

The full Kaggle dataset contains 25000 training examples, but your log shows that the generator found only 2000 training images. The warning states:
tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 10000 batches). You may need to use the repeat() function when building your dataset.
This means the data generator is expected to supply steps_per_epoch * epochs = 100 * 100 = 10000 batches. With the current batch size of 32, one pass over 2000 images yields only 2000 / 32 = 62.5, i.e. 63 batches (the last one partial), which is exactly where training stopped (63/100). My suggestion is to reduce steps_per_epoch to at most that number of batches and try again.
You can get rid of the deprecation message by passing the generator object to model.fit(...) instead of model.fit_generator(...).
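A minimal sketch of the arithmetic, assuming the 2000 training and 1000 validation images reported in the question's log and its batch size of 32. The helper name batches_per_pass is hypothetical, not a Keras function:

```python
import math

def batches_per_pass(num_samples, batch_size):
    """Number of batches one full pass over the data yields (the last may be partial)."""
    return math.ceil(num_samples / batch_size)

steps_per_epoch = batches_per_pass(2000, 32)   # training images found in the log
validation_steps = batches_per_pass(1000, 32)  # validation images found in the log
print(steps_per_epoch, validation_steps)

# These values could then be passed to Keras, for example:
# model.fit(train_generator, steps_per_epoch=steps_per_epoch, epochs=100,
#           validation_data=validation_generator, validation_steps=validation_steps)
```

Using floor division instead of math.ceil would skip the final partial batch; either choice keeps the requested steps within what one pass can supply.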

Related

Keras Data Augmentation with ImageDataGenerator (Your input ran out of data)

I am currently learning how to perform data augmentation with the Keras ImageDataGenerator from "Deep Learning with Python" by François Chollet.
I now have 1000 dog and 1000 cat images in the training dataset.
I also have 500 dog and 500 cat images in the validation dataset.
The book sets the batch size to 32 for both the training and validation generators, and passes both steps_per_epoch and epochs when fitting the model.
However, when I train the model, I receive the TensorFlow warning "Your input ran out of data..." and the training process stops.
I searched online, and many solutions say that the step counts should be:
steps_per_epoch = len(train_dataset) // batch_size and validation_steps = len(validation_dataset) // batch_size
I understand the logic above, and with it there is no warning during training.
But I am wondering: originally I have 2000 training samples. This is too few, so I need data augmentation to increase the number of training images.
If steps_per_epoch = len(train_dataset) // batch_size is applied, and len(train_dataset) is only 2000, am I not still using only 2000 samples to train the model, instead of adding more augmented images?
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
train_dir,
target_size=(150, 150),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
validation_dir,
target_size=(150, 150),
batch_size=32,
class_mode='binary')
history = model.fit_generator(
train_generator,
steps_per_epoch=100,
epochs=100,
validation_data=validation_generator,
validation_steps=50)
In fact, ImageDataGenerator does not increase the size of the training set. All augmentations are done in memory: each original image is randomly transformed, and only the augmented version is returned. If you want to look at the augmented images, set these parameters in the call to flow_from_directory:
save_to_dir=path,
save_prefix="",
save_format="png",
You now have 2000 images, and with a batch size of 32 that gives 2000 // 32 = 62 full steps per epoch, but you are asking for 100 steps, which causes the error.
If you have a dataset that does not generate batches and you want to use all data points, then you should set:
steps_per_epoch = len(train_dataset) // batch_size
But flow_from_directory already generates batches, so there is no need to set steps_per_epoch unless you want to use fewer data points than the generated batches contain.
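To illustrate why the dataset size stays the same while the model still sees new variants, here is a rough NumPy-only stand-in for on-the-fly augmentation. It is not how ImageDataGenerator is implemented; the function name augmenting_batches, the random horizontal flip, and the tiny 10-image array are all assumptions made for the sketch:

```python
import numpy as np

def augmenting_batches(images, batch_size, seed=0):
    """Endlessly yield batches of randomly flipped copies; `images` is never modified."""
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:                                   # one loop iteration = one pass over the data
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            # Fancy indexing returns a copy, so the originals stay untouched.
            batch = images[order[start:start + batch_size]]
            flip = rng.random(len(batch)) < 0.5   # pick roughly half of the images
            batch[flip] = batch[flip][:, :, ::-1] # reverse the width axis
            yield batch

# Made-up data: 10 images of shape (height=4, width=4, channels=3).
images = np.arange(10 * 4 * 4 * 3, dtype=np.float32).reshape(10, 4, 4, 3)
gen = augmenting_batches(images, batch_size=4)

batches_per_pass = -(-len(images) // 4)           # ceiling division
first_pass = [next(gen) for _ in range(batches_per_pass)]
print([len(b) for b in first_pass], images.shape)
```

One pass still covers exactly 10 images, but every new pass draws fresh random transforms, which is where the extra variety comes from.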

Stateful LSTM Tensorflow Invalid Input_h Shape Error

I am experimenting with stateful LSTM on a time-series regression problem by using TensorFlow. I apologize that I cannot share the dataset.
Below is my code.
train_feature = train_feature.reshape((train_feature.shape[0], 1, train_feature.shape[1]))
val_feature = val_feature.reshape((val_feature.shape[0], 1, val_feature.shape[1]))
batch_size = 64
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(50, batch_input_shape=(batch_size, train_feature.shape[1], train_feature.shape[2]), stateful=True))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam',
loss='mse',
metrics=[tf.keras.metrics.RootMeanSquaredError()])
model.fit(train_feature, train_label,
epochs=10,
batch_size=batch_size)
When I run the above code, I get the following error at the end of the first epoch.
InvalidArgumentError: [_Derived_] Invalid input_h shape: [1,64,50] [1,49,50]
[[{{node CudnnRNN}}]]
[[sequential_1/lstm_1/StatefulPartitionedCall]] [Op:__inference_train_function_1152847]
Function call stack:
train_function -> train_function -> train_function
However, the model will be successfully trained if I change the batch_size to 1, and change the code for model training to the following.
total_epochs = 10
for i in range(total_epochs):
    model.fit(train_feature, train_label,
              epochs=1,
              validation_data=(val_feature, val_label),
              batch_size=batch_size,
              shuffle=False)
    model.reset_states()
Nevertheless, with a very large dataset (1 million rows), model training will take a very long time because the batch_size is 1.
So, I wonder, how to train a stateful LSTM with a batch size larger than 1 (e.g. 64), without getting the invalid input_h shape error?
Thanks for your answers.
The fix is to ensure batch size never changes between batches. They must all be the same size.
Method 1
One way is to use a batch size that divides your dataset into equal-sized batches with nothing left over. For example, if the total size of the data is 1500 examples, then use a batch size of 50 or 100 or some other proper divisor of 1500.
batch_size = len(data) // proper_divisor
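As a small sketch of Method 1, the candidate batch sizes can be listed by checking which integers divide the dataset size. The helper valid_batch_sizes and the cap of 128 are illustrative assumptions; 1500 is the example size used above:

```python
def valid_batch_sizes(num_samples, max_batch_size=None):
    """Batch sizes that split num_samples into equal batches with nothing left over."""
    limit = max_batch_size or num_samples
    return [b for b in range(1, limit + 1) if num_samples % b == 0]

sizes = valid_batch_sizes(1500, max_batch_size=128)
print(sizes)  # includes 50 and 100, but not 64
```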
Method 2
The other way is to ignore any batch that is less than the specified size, and this can be done using the TensorFlow Dataset API and setting the drop_remainder to True.
batch_size = 64
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
steps_per_epoch = len(train_feature) // batch_size
model.fit(train_data,
epochs=10, steps_per_epoch = steps_per_epoch)
When using the Dataset API as above, you also need to specify how many batches count as one epoch. A tf.data.Dataset instance (the result of tf.data.Dataset.from_tensor_slices) that has been repeated is an endless stream, so Keras cannot tell where an epoch ends; what constitutes one epoch has to be specified manually with steps_per_epoch.
Your new code will look like this:
train_feature = train_feature.reshape((train_feature.shape[0], 1, train_feature.shape[1]))
val_feature = val_feature.reshape((val_feature.shape[0], 1, val_feature.shape[1]))
batch_size = 64
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(50, batch_input_shape=(batch_size, train_feature.shape[1], train_feature.shape[2]), stateful=True))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam',
loss='mse',
metrics=[tf.keras.metrics.RootMeanSquaredError()])
steps_per_epoch = len(train_feature) // batch_size
model.fit(train_data,
epochs=10, steps_per_epoch = steps_per_epoch)
You can also include the validation set as well, like this (not showing other code):
batch_size = 64
val_data = tf.data.Dataset.from_tensor_slices((val_feature, val_label))
val_data = val_data.repeat().batch(batch_size, drop_remainder=True)
validation_steps = len(val_feature) // batch_size
model.fit(train_data, epochs=10,
          steps_per_epoch=steps_per_epoch,
          validation_data=val_data,
          validation_steps=validation_steps)
Caveat: This means a few datapoints will never be seen by the model. To get around that, you can shuffle the dataset on each round of training, so that the datapoints left behind change from epoch to epoch, giving every datapoint a chance to be seen by the model.
buffer_size = 1000 # the bigger the slower but more effective shuffling.
train_data = tf.data.Dataset.from_tensor_slices((train_feature, train_label))
train_data = train_data.shuffle(buffer_size=buffer_size, reshuffle_each_iteration=True)
train_data = train_data.repeat().batch(batch_size, drop_remainder=True)
Why the error occurs
Stateful RNNs and their variants (LSTM, GRU, etc.) require a fixed batch size. The reason is that statefulness is one way to realize truncated backpropagation through time: the final hidden state for one batch is passed on as the initial hidden state of the next batch. The final hidden state for the first batch has to have exactly the same shape as the initial hidden state of the next batch, which requires that the batch size stay the same across batches.
When you set the batch size to 64, model.fit uses the remaining data at the end of an epoch as a batch, and this batch may contain fewer than 64 datapoints. You get the error because that batch size differs from what the stateful LSTM expects. You don't have the problem with a batch size of 1 because every batch, including the last one of an epoch, always contains exactly 1 datapoint. More generally, 1 is a divisor of every integer, so if you pick any other divisor of your data size, you should not get the error either.
In the error message you posted, it appears the last batch has a size of 49 instead of 64. On a side note: the shapes look different from the input because, under the hood, Keras works with the tensors in time-major layout (i.e. the first axis is for the steps of the sequence). When you pass a tensor of shape (10, 15, 2) that represents (batch_size, steps_per_sequence, num_features), Keras transposes it to (15, 10, 2) under the hood.
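The leftover batch can be checked before training with plain integer arithmetic. In this sketch, split_into_batches is a hypothetical helper, and 113 is a made-up dataset size chosen only so that the leftover matches the 49 seen in the error message (the real dataset size was not shared):

```python
def split_into_batches(num_samples, batch_size):
    """Return (number of full batches, size of the trailing partial batch)."""
    return divmod(num_samples, batch_size)

full, leftover = split_into_batches(113, 64)          # made-up size, see note above
print(full, leftover)                                  # a non-zero leftover breaks a stateful LSTM

full_ok, leftover_ok = split_into_batches(1500, 50)   # the Method 1 example
print(full_ok, leftover_ok)                            # leftover of 0 means every batch is full
```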

Obtaining number of images after using imagedatagenerator() in keras and before training

Below is the sample code that I am trying to use. I want to know if there is any way to get the number of images immediately after applying ImageDataGenerator() and before training (i.e., before .fit_generator). The reason is that I want to use these images later for training instead of my original dataset.
train_datagen=ImageDataGenerator(rescale=1./255,
#featurewise_center=True,
samplewise_center=True,
zca_epsilon=1e-06,
#channel_shift_range=100.0,
#samplewise_std_normalization=True,
#featurewise_std_normalization=True,
rotation_range=15,
#width_shift_range=0,
#height_shift_range=0,
shear_range=0.2,
fill_mode='nearest',
zoom_range=0.1,
horizontal_flip= True,
)
val_datagen= ImageDataGenerator(rescale=1./255,
samplewise_center=True,
)
train_generator= train_datagen.flow(X_train, Y_train, batch_size=batch_size,shuffle=True)
val_generator= val_datagen.flow(X_val, Y_val,batch_size=batch_size,shuffle=True)
history= model.fit_generator(train_generator,
batch_size= batch_size,
steps_per_epoch=trainSize,
epochs=10,
validation_data=val_generator,
validation_steps=valSize,
callbacks=[LearningRateScheduler(lr_schedule)]
#callbacks=[es_callback]
)
No, because ImageDataGenerator can generate augmented images indefinitely, and the final number of generated images used for training is a function of the batch_size, steps_per_epoch and number of epochs you train for. You can use the save_to_dir argument of the flow() method of your generator to save the augmented images. The number of images should equal batch_size times steps_per_epoch times epochs.
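As a quick worked example of that product, here is a sketch with made-up values (batch size 32, 62 steps per epoch, 100 epochs); images_drawn is a hypothetical helper. The result is an upper bound whenever some batches are partial:

```python
def images_drawn(batch_size, steps_per_epoch, epochs):
    """Upper bound on the number of augmented images the generator hands to training."""
    return batch_size * steps_per_epoch * epochs

total = images_drawn(batch_size=32, steps_per_epoch=62, epochs=100)
print(total)
```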

Unknown number of steps - Training convolution neural network at Google Colab Pro

I am trying to train my CNN on Google Colab Pro. When I run my code everything works, but Keras does not know the number of steps, so training runs in an infinite loop.
Mounted at /content/drive
2.2.0-rc3
Found 10018 images belonging to 2 classes.
Found 1336 images belonging to 2 classes.
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
Epoch 1/300
8/Unknown - 364s 45s/step - loss: 54.9278 - accuracy: 0.5410
I am using ImageDataGenerator() for loading images. How can I fix it?
An iterator does not store anything; it generates the data dynamically. When you are using a dataset or dataset iterator, you must provide steps_per_epoch, because the length of an iterator is unknown until you iterate through it. You can derive the value from the known number of files (for example from len(datafiles)) and pass it to the .fit function explicitly. So, you need to provide steps_per_epoch as shown below.
model.fit_generator(
train_data_gen,
steps_per_epoch=total_train // batch_size,
epochs=epochs,
validation_data=val_data_gen,
validation_steps=total_val // batch_size
)
The documentation describes the argument as follows:
steps_per_epoch: Integer or None. Total number of steps (batches of
samples) before declaring one epoch finished and starting the next
epoch. When training with input tensors such as TensorFlow data
tensors, the default None is equal to the number of samples in your
dataset divided by the batch size, or 1 if that cannot be determined.
If x is a tf.data dataset, and 'steps_per_epoch' is None, the epoch
will run until the input dataset is exhausted. This argument is not
supported with array inputs.
I notice you are using binary classification. One more thing to remember when you use ImageDataGenerator is to provide class_mode as shown below. Otherwise, there will be a bug (in keras) or 50% accuracy (in tf.keras).
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')

Keras fit freezes at the end of the first epoch

I am currently experimenting with fine tuning the VGG16 network using Keras.
I started tweaking a little bit with the cats and dogs dataset.
However, with the current configuration the training seems to block on the first epoch
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense
img_width, img_height = 224, 224
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 20
model = applications.VGG16(weights='imagenet', include_top=False , input_shape=(224,224,3))
print('Model loaded.')
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu',name='newlayer'))
top_model.add(Dropout(0.5))
top_model.add(Dense(2, activation='softmax'))
model = Model(inputs= model.input, outputs= top_model(model.output))
for layer in model.layers[:19]:
    layer.trainable = False
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.Adam(lr=0.0001),
metrics=['accuracy'])
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_height, img_width),
batch_size=batch_size,
shuffle=True,
class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='categorical')
model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples// batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples)
Last output:
Epoch 1/50
99/100 [============================>.] - ETA: 0s - loss: 0.5174 - acc: 0.7581
Am I missing something?
Shuffle
In my case, I was calling fit(...) with shuffle='batch'. Removing this parameter from the arguments resolved the problem. (I assume it's a TensorFlow bug but I didn't dig into it.)
Validation
Another consideration is that validation is being performed at the end of the epoch... If your validation data isn't being batched, and particularly if you are padding your data, then you could be performing validation on data much larger than your training batch size padded to the maximum sample length of your validation data. This could be a problem of out-of-memory proportions.
I faced this problem in Colab, which provides limited memory (up to 12 GB) in the cloud, and that causes many issues. That is why I used only 300 images for training and testing. When the images were preprocessed at 600x600 with a batch size of 128, the Keras model froze during epoch 1. No error was shown. The actual cause was the runtime memory limit: Colab could not handle the workload within its 12 GB.
I solved the problem by changing the batch size to 4 and reducing the image dimensions to 300x300, because at 600x600 it still did not work.
In conclusion, the recommended solution is to make the image dimensions and batch size smaller, rerunning until there is no runtime error.
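A rough back-of-the-envelope sketch of why those two settings matter. It counts only the raw input batch, assuming 3 colour channels and float32 values (4 bytes each); the model's intermediate activations usually need far more memory than this, so treat the numbers as a lower bound. The helper batch_bytes is hypothetical:

```python
def batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Memory taken by one input batch of float32 images, in bytes."""
    return batch_size * height * width * channels * bytes_per_value

big = batch_bytes(128, 600, 600)   # the configuration that froze
small = batch_bytes(4, 300, 300)   # the configuration that worked
print(big / 1e6, small / 1e6, big // small)  # sizes in MB and their ratio
```

Halving each image side and cutting the batch size from 128 to 4 shrinks the input batch by a factor of 128.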
I faced the same issue.
This is because the model is running on the validation dataset, and this usually takes a lot of time. Try reducing the validation dataset, or just wait for some time; that worked for me. It looks like it is stuck, but it is actually running on the validation dataset.
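One possible contributor in the question's code is that validation_steps is set to the number of validation samples rather than the number of validation batches. A small sketch of the difference, using the question's values (800 validation samples, batch size 20); the helper name is hypothetical:

```python
def images_per_validation(validation_steps, batch_size):
    """How many images Keras evaluates at the end of each epoch."""
    return validation_steps * batch_size

nb_validation_samples = 800
batch_size = 20

as_written = images_per_validation(nb_validation_samples, batch_size)
one_pass_steps = nb_validation_samples // batch_size
one_pass = images_per_validation(one_pass_steps, batch_size)
print(as_written, one_pass_steps, one_pass)
```

With validation_steps=nb_validation_samples, validation processes 20 times more images than a single pass over the validation set, which can look like a freeze at the end of the epoch.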
If you are using from tensorflow.keras.preprocessing.image import ImageDataGenerator, try changing it to from keras.preprocessing.image import ImageDataGenerator, or vice versa. That worked for me. It's said that you should never mix keras and tensorflow.keras imports.
I tried everything posted in here, but they didn't work for me. I found the solution by simply putting the validation set into a numpy.array like this:
numpy.array(validation_x)
Super simple. Works like a charm. I hope this helps someone.
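A minimal sketch of that conversion, assuming validation_x started out as a Python list of equally shaped image arrays (the original post does not say what type it was); the 5 images of shape 8x8x3 are made up:

```python
import numpy as np

# Made-up stand-in for the validation images: a plain list of arrays.
validation_x = [np.zeros((8, 8, 3), dtype=np.float32) for _ in range(5)]

# Stack the list into one contiguous array with a leading sample axis.
validation_x = np.array(validation_x)
print(type(validation_x).__name__, validation_x.shape)
```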