Google Colab not taking complete data from cifar10 - google-colaboratory

from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
from tensorflow.keras.optimizers import SGD
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.datasets import cifar10
print("[INFO] loading CIFAR-10 data")
((trainX, trainY), (testX, testY)) = cifar10.load_data()
trainX = trainX.astype("float") / 255.0
testX = testX.astype("float") / 255.0
print("trainX: {}, testX ={}".format(trainX.shape,testX.shape))
lb=LabelBinarizer()
# convert the labels from integers to vectors
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)
labelNames = ["airplane", "automobile", "bird", "cat", "deer",
"dog", "frog", "horse", "ship", "truck"]
print("[INFO] compiling model")
opt = SGD(lr=0.01, decay=0.01/40, momentum=0.9, nesterov=True)
# MiniVGGNet is the asker's own model class, imported elsewhere (not shown here)
model = MiniVGGNet.build(width=32, height=32, depth=3, classes=10)
model.compile(loss="categorical_crossentropy",
              optimizer=opt, metrics=["accuracy"])
# train the network
print("[INFO] training network..")
H = model.fit(trainX, trainY, validation_data=(testX, testY),
              batch_size=64, epochs=40, verbose=1)
The output is:
[INFO] loading CIFAR-10 data
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170500096/170498071 [==============================] - 4s 0us/step
trainX: (50000, 32, 32, 3), testX =(10000, 32, 32, 3)
[INFO] compiling model
[INFO] training network..
Epoch 1/40
782/782 [==============================] - 10s 12ms/step - loss: 1.6249 - accuracy: 0.4555 - val_loss: 1.3396 - val_accuracy: 0.5357
Epoch 2/40
782/782 [==============================] - 9s 12ms/step - loss: 1.1462
When I download the data from the website above, I get the full CIFAR-10 dataset, but when I run my model, it looks like it only takes 782 images.
I have worked with other models as well, but I get the same result.
This only happens in Google Colab and not on my local PC. What am I missing?

Both the training and testing sets are loaded perfectly fine: the train set has 50000 images and the test set has 10000, so there is no problem with the code that you posted. Consider adding the rest of the code that you used to train the model. You can check the shapes of your sets by executing:
from tensorflow.keras.datasets import cifar10
(train_X, train_y), (test_X, test_y) = cifar10.load_data()
train_X = train_X.astype("float") / 255.0
test_X = test_X.astype("float") / 255.0
print(f"train_X: {train_X.shape}, test_X = {test_X.shape}")
Update:
I tested this in MyBinder, my local Jupyter Notebook and Colab, and came to this conclusion:
MyBinder and the local Notebook either did not separate the CIFAR training set into mini-batches or simply showed the total number of individual data points in the training set, so they reported 50000 steps per epoch.
Google Colab, on the contrary, mini-batches the CIFAR dataset into batches of 64 and trains the model on those, so the total number of steps per epoch is 50000/64, which rounds up to 782.
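As a quick check of that arithmetic, here is a small sketch (the numbers come straight from the output above):
import math
num_train_samples = 50000  # trainX.shape[0] printed above
batch_size = 64            # batch_size passed to model.fit()
print(math.ceil(num_train_samples / batch_size))  # -> 782 mini-batch steps per epoch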
Hope this clears up the confusion. It is just that Colab displays the total number of mini-batches, whereas the Jupyter Notebook showed the total number of individual entries in the set.
PS: You might want to add the missing bracket at the end of line 34 in the code that you shared here.

Related

TensorFlow 2.0: Multiple Inputs of type tf.data.Dataset and ImageDataGenerator Fails

I am predicting house prices using Deep Learning with Tensorflow 2 libraries.
I have data for 3 attributes (baths, bedrooms, area) and an image of each house as my dataset (nearly 2000 samples).
I am building 3 Deep Learning models:
Regression model (model_reg): using 3 attributes -- worked fine
CNN model (img_model) using images -- worked fine
Combining above two models (model_combined) -- erroring out
Regression model:
I built a Deep Neural Network (DNN) model using TensorFlow 2.0 with the 3 attributes as features and prices as labels.
I could fit the DNN and could predict the house prices.
Note: I used tf.data.Dataset combining X_train, y_train while building this model.
model_reg.fit(X_reg_train_dataset, batch_size=BATCH_SIZE, epochs=EPOCHS,
              callbacks=[stop_callback], verbose=1)
63/63 [==============================] - 0s 2ms/step - loss: 0.3993 - mae: 0.4256 - mse: 0.3993
CNN Model:
Next, I built another CNN using the house images as features and prices as labels.
This too worked fine and I could predict the house prices.
Note: I used ImageDataGenerator from tensorflow.keras.preprocessing to build the generator.
hist = img_model.fit(
    train_images_generator,
    steps_per_epoch=train_images_generator.samples // BATCH_SIZE,
    validation_data=test_images_generator,
    validation_steps=test_images_generator.samples // BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[stop_callback], verbose=1)
49/49 [==============================] - 59s 1s/step - loss: 1741.6321 - mae: 20.8221 - mse: 1741.6321 - val_loss: 755833241600.0000 - val_mae: 768718.3125 - val_mse: 755833241600.0000
Lastly, I merged these 2 models appropriately using Concatenate() layer.
Now I have 2 inputs that need to pass to the model.
Hence I defined a model using the Functional API with 2 inputs.
model_combined = Model(inputs=[input_layer_reg, img_input_layer], outputs=[output_layer_combined])
optimizer = tf.keras.optimizers.Adam(0.01)
model_combined.compile(loss='mae', optimizer=optimizer, metrics=['mae', 'mse'])
Up to this point it worked fine and the generated model looks fine.
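For context, a merged two-input model of the kind described above can be sketched roughly as follows; the layer sizes, the image branch, and everything except the final Model(...) line are assumptions, since the post does not show how input_layer_reg and img_input_layer were built:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Conv2D, GlobalAveragePooling2D, Concatenate
from tensorflow.keras.models import Model

# Hypothetical regression branch over the numeric attributes
input_layer_reg = Input(shape=(3,), name='reg_input')
x_reg = Dense(8, activation='relu')(input_layer_reg)

# Hypothetical image branch (input size is a placeholder)
img_input_layer = Input(shape=(64, 64, 3), name='img_input')
x_img = Conv2D(16, 3, activation='relu')(img_input_layer)
x_img = GlobalAveragePooling2D()(x_img)

# Concatenate the two branches and predict a single price
merged = Concatenate()([x_reg, x_img])
output_layer_combined = Dense(1, activation='linear')(merged)
model_combined = Model(inputs=[input_layer_reg, img_input_layer], outputs=[output_layer_combined])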
It errors out while trying to train it using fit:
model_combined.fit([X_reg_train_dataset, train_images_generator], epochs=EPOCHS,
                   callbacks=[stop_callback], verbose=1,
                   batch_size=BATCH_SIZE)
ValueError: Failed to find data adapter that can handle input:
(<class 'list'> containing values of types {"<class 'tensorflow.python.keras.preprocessing.image.DataFrameIterator'>",
"<class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>"}),
<class 'NoneType'>
Question: How to pass 2 inputs: tf.data.Dataset and ImageDataGenerator to Tensorflow model?
model_combined.fit(ta,
                   steps_per_epoch=train_images_generator.samples // train_images_generator.batch_size,
                   validation_data=test_images_generator,
                   validation_steps=test_images_generator.samples // test_images_generator.batch_size,
                   epochs=5,
                   callbacks=[stop_callback], verbose=1)
Epoch 1/5
36/49 [=====================>........] - ETA: 20s - loss: 0.4512 - mae: 0.4512 - mse: 0.5005
I had to write a custom generator and a tf.data.Dataset on top of it to make it work:
ta = tf.data.Dataset.from_generator(train_image_dataset_generator,
    output_signature=(
        (tf.TensorSpec(shape=(None, 5,), dtype=tf.float64),
         tf.TensorSpec(shape=(None, None, None, None), dtype=tf.float64)),
        (tf.TensorSpec(shape=(None,), dtype=tf.float64),
         tf.TensorSpec(shape=(None,), dtype=tf.float64)),
    )).repeat()

def train_image_dataset_generator():
    data_X_reg = enumerate(X_reg_train_dataset)
    a_tuple = next(data_X_reg)
    b_tuple = train_images_generator.next()
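The generator body above is cut off. As a rough sketch of the overall shape it needs, here is one way to yield combined batches matching the output_signature declared earlier, i.e. ((regression features, images), (labels, labels)); pairing one regression batch with one image batch per step, and X_reg_train_dataset yielding (features, labels) batches, are assumptions here:
def train_image_dataset_generator():
    # Pull one batch from the regression tf.data.Dataset and one batch from the
    # ImageDataGenerator per step, and yield them in the declared structure.
    for reg_features, reg_labels in X_reg_train_dataset:
        images, img_labels = train_images_generator.next()
        yield (reg_features, images), (reg_labels, img_labels)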

Keras: custom data validation callback on training data always returns validation data results

I am working on an autoencoder in Keras that contains some dropout layers. To evaluate bias and variance, I'd like to compare the losses of training and test data. However, since dropout is used during training, the losses cannot be compared. (See here for an explanation of why the training data results can be worse than test data results.)
In order to get training data losses that are not influenced by the dropout, I wrote a callback to validate some additional data set (in this case, it would be the training data again).
The strange thing is that I ALWAYS get the same results as for the validation data. Here's a minimal example:
from pprint import pprint
import keras
import numpy as np
import pandas as pd
from numpy.random import seed as np_seed
from tensorflow.random import set_seed as tf_seed
np_seed(1)
tf_seed(2)
# Generation of data sets for training and testing. Random data is only used to showcase the problem.
df_train = pd.DataFrame(data=np.random.random((1000, 10))) # This will be used for training
df_test_1 = pd.DataFrame(data=np.random.random((1000, 10))) # This will be used as validation data set directly
df_test_2 = pd.DataFrame(data=np.random.random((1000, 10))) # This will be used within the callback
np_seed(1)
tf_seed(2)
model = keras.models.Sequential(
    [
        keras.Input(shape=(10, )),
        keras.layers.Dropout(rate=0.01),
        keras.layers.Dense(5, activation='relu'),
        keras.layers.Dropout(rate=0.01),
        keras.layers.Dense(10, activation='linear'),
    ]
)
model.compile(
    loss='mean_squared_error',
    optimizer=keras.optimizers.Adam(),
)
class CustomDataValidation(keras.callbacks.Callback):
    def __init__(self, x=None, y=None):
        self.x = x
        self.y = y

    def on_epoch_end(self, epoch, logs=None):
        result = self.model.evaluate(x=self.x, y=self.y, return_dict=True)
        for loss_name, loss_value in result.items():
            logs["custom_" + loss_name] = loss_value
cdv = CustomDataValidation(df_test_2, df_test_2)
hist = model.fit(df_train, df_train, validation_data=(df_test_1, df_test_1), epochs=2, validation_split=0.1, callbacks=[cdv])
pprint(hist.history)
The output is
Epoch 1/2
4/4 [==============================] - 0s 1ms/step - loss: 0.7625
29/29 [==============================] - 0s 5ms/step - loss: 0.9666 - val_loss: 0.7625
Epoch 2/2
4/4 [==============================] - 0s 1ms/step - loss: 0.5331
29/29 [==============================] - 0s 2ms/step - loss: 0.6638 - val_loss: 0.5331
{'custom_loss': [0.7624925374984741, 0.5331208109855652],
'loss': [0.9665887951850891, 0.6637843251228333],
'val_loss': [0.7624925374984741, 0.5331208109855652]}
'custom_loss' and 'val_loss' are equal although they should be based on totally different data sets.
The question is therefore: How can I evaluate the model performance on custom data within a callback?
Edit: Since I have not yet received an answer on Stack Overflow, I created an issue in TensorFlow's GitHub repo. Also, there is now a notebook available that shows the problem.
It seems that this is a bug in tensorflow versions 2.3.x (tested with 2.3.0 and 2.3.1). In versions 2.4.0-rc0 and 2.2.1, the loss outputs of loss and custom_loss differ, which is the expected behavior:
{'custom_loss': [0.7694963216781616, 0.541864812374115],
'loss': [0.9665887951850891, 0.6637843251228333],
'val_loss': [0.7624925374984741, 0.5331208109855652]}
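If you are stuck on an affected 2.3.x version, one way to sidestep model.evaluate() inside the callback is to compute the training-data loss yourself from an inference-mode forward pass. This is only a minimal sketch for the mean-squared-error setup used above, not part of the original answer:
import numpy as np
import keras

class ManualLossValidation(keras.callbacks.Callback):
    def __init__(self, x, y):
        self.x = np.asarray(x, dtype="float32")
        self.y = np.asarray(y, dtype="float32")

    def on_epoch_end(self, epoch, logs=None):
        # Forward pass with training=False, so dropout is disabled and evaluate() is never called.
        y_pred = self.model(self.x, training=False).numpy()
        logs["custom_loss_manual"] = float(np.mean((self.y - y_pred) ** 2))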

tf keras SparseCategoricalCrossentropy and sparse_categorical_accuracy reporting wrong values during training

This is tf 2.3.0. During training, reported values for SparseCategoricalCrossentropy loss and sparse_categorical_accuracy seemed way off. I looked through my code but couldn't spot any errors yet. Here's the code to reproduce:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
x = np.random.randint(0, 255, size=(64, 224, 224, 3)).astype('float32')
y = np.random.randint(0, 3, (64, 1)).astype('int32')
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)
def create_model():
    input_layer = tf.keras.layers.Input(shape=(224, 224, 3), name='img_input')
    x = tf.keras.layers.experimental.preprocessing.Rescaling(1./255, name='rescale_1_over_255')(input_layer)
    base_model = tf.keras.applications.ResNet50(input_tensor=x, weights='imagenet', include_top=False)
    x = tf.keras.layers.GlobalAveragePooling2D(name='global_avg_pool_2d')(base_model.output)
    output = Dense(3, activation='softmax', name='predictions')(x)
    return tf.keras.models.Model(inputs=input_layer, outputs=output)

model = create_model()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['sparse_categorical_accuracy']
)
model.fit(ds, steps_per_epoch=2, epochs=5)
This is what printed:
Epoch 1/5
2/2 [==============================] - 0s 91ms/step - loss: 1.5160 - sparse_categorical_accuracy: 0.2969
Epoch 2/5
2/2 [==============================] - 0s 85ms/step - loss: 0.0892 - sparse_categorical_accuracy: 1.0000
Epoch 3/5
2/2 [==============================] - 0s 84ms/step - loss: 0.0230 - sparse_categorical_accuracy: 1.0000
Epoch 4/5
2/2 [==============================] - 0s 82ms/step - loss: 0.0109 - sparse_categorical_accuracy: 1.0000
Epoch 5/5
2/2 [==============================] - 0s 82ms/step - loss: 0.0065 - sparse_categorical_accuracy: 1.0000
But if I double-check with model.evaluate, and "manually" check the accuracy:
model.evaluate(ds)
2/2 [==============================] - 0s 25ms/step - loss: 1.2681 - sparse_categorical_accuracy: 0.2188
[1.268101453781128, 0.21875]
y_pred = model.predict(ds)
y_pred = np.argmax(y_pred, axis=-1)
y_pred = y_pred.reshape(-1, 1)
np.sum(y == y_pred)/len(y)
0.21875
The result from model.evaluate(...) agrees with the "manual" check on the metrics. But if you look at the loss/metrics from training, they look way off. It is rather hard to see what's wrong since no error or exception is ever thrown.
Additionally, I created a very simple case to try to reproduce this, but it is not reproducible there. Note that batch_size equals the length of the data, so this isn't mini-batch GD but full-batch GD (to eliminate confusion with mini-batch loss/metrics):
x = np.random.randn(1024, 1).astype('float32')
y = np.random.randint(0, 3, (1024, 1)).astype('int32')
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(1024)
model = Sequential()
model.add(Dense(3, activation='softmax'))
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['sparse_categorical_accuracy']
)
model.fit(ds, epochs=5)
model.evaluate(ds)
As mentioned in my comment, one suspect is the batch norm layer, which I don't have in the case that can't reproduce the issue.
You get different results because fit() displays the training loss as the average of the losses over each batch of training data for the current epoch. Because the model changes during the epoch, the losses of the earlier batches skew that running average, and the computed loss is also used to further update the model. evaluate(), on the other hand, is computed with the model as it is at the end of training, resulting in a different loss. You can check the official Keras FAQ and the related StackOverflow post.
Also, try increasing the learning rate.
The big discrepancy seen in the metrics can be explained (at least partially) by the presence of batch norm in the model. I will present two cases: one that is not reproducible, and another that becomes reproducible once batch norm is introduced. In both cases, batch_size is equal to the full length of the data (i.e. full gradient descent without the 'stochastic' part) to minimize confusion over mini-batch statistics.
Not reproducible:
x = np.random.randn(1024, 1).astype('float32')
y = np.random.randint(0, 3, (1024, 1)).astype('int32')
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(1024)
model = Sequential()
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(3, activation='softmax'))
Reproducible:
from tensorflow.keras.layers import BatchNormalization, Activation

model = Sequential()
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(3, activation='softmax'))
In fact, if you try model.predict(x) and model(x, training=True), you will see a large difference in the y_pred. Also, per the Keras docs, this result also depends on what is in the batch: the prediction model(x[0:1], training=True) for x[0] will differ from model(x[0:2], training=True) because of the extra sample.
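A quick way to see that difference for yourself, reusing the model and x from the snippets above (just a sketch):
import numpy as np
# Inference mode uses batch norm's moving statistics; training mode uses the
# statistics of the batch you pass in, so the two outputs differ.
y_infer = model.predict(x)                      # training=False path
y_train_mode = model(x, training=True).numpy()  # same weights, batch statistics
print(np.abs(y_infer - y_train_mode).max())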
It is probably best to go to the Keras docs and the original paper for the details, but I do think you will have to live with this and interpret what you see in the progress bar accordingly. It looks rather fishy if you try to use the training loss/accuracy to see whether you have a bias (not variance) issue. When in doubt, I think we can just run evaluate() on the training set once the model has 'converged' to a good minimum. I sort of overlooked this detail altogether in my prior work, because underfitting (bias) is rare for deep nets, and so I went by the validation loss/metrics to determine when to stop training. But I would probably go back to the same model and evaluate it on the training set, just to see whether the model has the capacity (not bias).
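To make that last suggestion concrete, a one-line sketch (assuming the model has been compiled with the loss and metric shown earlier):
# Inference-mode pass over the training data, comparable to the evaluate()/predict() results above.
train_loss, train_acc = model.evaluate(ds, verbose=0)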

How can I combine two tensors so they are in one dataset?

I'm working with the Titanic data set from the TensorFlow API.
I don't know how to make the feature tensors model-friendly.
Here's the best I got, but it's for one tensor at a time. How do I make it so it can handle all tensors in the features item?
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.optimizers import Adam
data = tfds.load("titanic",split='train', as_supervised=True).map(lambda x,y: (x,y)).prefetch(1)
for i in data.batch(1309):
    xx1 = i[0]['age']
    xx2 = i[0]['fare']
    yyy = tf.convert_to_tensor(tf.one_hot(i[1], 2))
model = tf.keras.models.Sequential([tf.keras.layers.Dense(1),
                                    tf.keras.layers.Dense(13, activation='relu'),
                                    tf.keras.layers.Dense(2, activation='softmax')])
model.compile(
    optimizer=Adam(learning_rate=0.01),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
model.fit(xx1, yyy, epochs=30)
How do I concat the age and fare tensors so that they're in one data set?
I tried concat and stack to no avail.
This should be accomplishable using tf.stack. As the input already uses the dataset API, I've refactored some code to leverage the dataset features for mapping the input format to the goal format you have described. For convenience, here is a colab notebook with the example: https://colab.research.google.com/drive/1dHNe9rYaJSgqbj_QtQ1aJL_7WgKnLKsU?usp=sharing
# Nothing novel here
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.optimizers import Adam
data = tfds.load("titanic",split='train', as_supervised=True).map(lambda x,y: (x,y)).prefetch(1)
Basic demo of intended data restructuring
Take 1 item from the dataset and convert it to a tensor that includes both of the goal datapoints using tf.stack
for item in data.take(1):
    age = item[0]['age']
    fare = item[0]['fare']
    output = tf.stack([age, fare], axis=0)
    print(output)
Output: tf.Tensor([30. 13.], shape=(2,), dtype=float32)
Within the output we can see a single tensor with two values embedded as expected.
Usage as a TensorFlow Dataset
TensorFlow datasets can be provided directly for training, so we can simply create a function that maps from the input data format to the goal format described in the problem. The function below accomplishes this, using the sample code from above.
# Input data and associated label
def transform_data(item, label):
    # Extract values
    age = item['age']
    fare = item['fare']
    # Create output tensor
    output = tf.stack([age, fare], axis=0)
    return output, label
# Create a training dataset from the base dataset - map each element from the input format to the goal format with the mapping function, then batch
train_dataset = data.map(transform_data).batch(1200)

# Model - I made some minor changes to get it to run cleaner
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(2),
    tf.keras.layers.Dense(13, activation='relu'),
    # As we have only two labels, this is really a binary problem, so I've created a single output neuron activated by sigmoid
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compiled with binary_crossentropy to complement the binary classification
model.compile(optimizer=Adam(learning_rate=0.01), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=30)
Output:
Epoch 1/30
2/2 [==============================] - 0s 16ms/step - loss: 11.7881 - accuracy: 0.4385
Epoch 2/30
2/2 [==============================] - 0s 7ms/step - loss: 10.2350 - accuracy: 0.4270
...
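If you want more of the feature columns than just age and fare, the same mapping pattern extends naturally. Here is a sketch; the extra column names ('sibsp', 'parch', 'pclass') are assumptions about the TFDS titanic schema:
def transform_data_multi(item, label):
    # Stack several numeric columns into one feature vector per example.
    columns = ['age', 'fare', 'sibsp', 'parch', 'pclass']
    features = tf.stack([tf.cast(item[name], tf.float32) for name in columns], axis=0)
    return features, label

train_dataset_multi = data.map(transform_data_multi).batch(1200)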

Should Keras with Theano backend be over 18X slower than with Tensorflow backend?

I've just installed keras, tensorflow and theano on my machine and did a quick comparison of keras using tensorflow as a backend to using theano as a backend. The results were more extreme than I expected.
I am using the following versions of the 3 packages:
>>> theano.__version__
'0.8.2'
>>> tensorflow.__version__
'0.12.1'
>>> keras.__version__
'1.2.1'
To compare the two backends, I used the cifar10_cnn.py example. When I use TensorFlow as my backend, I get this result:
deep#deep-Precision-7710:~/Downloads/keras/examples$ python cifar10_cnn.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
X_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
Using real-time data augmentation.
Epoch 1/200
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Quadro M5000M
major: 5 minor: 2 memoryClockRate (GHz) 1.0505
pciBusID 0000:01:00.0
Total memory: 7.93GiB
Free memory: 7.59GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro M5000M, pci bus id: 0000:01:00.0)
50000/50000 [==============================] - 16s - loss: 1.7916 - acc: 0.3350 - val_loss: 1.4998 - val_acc: 0.4490
Epoch 2/200
50000/50000 [==============================] - 15s - loss: 1.4020 - acc: 0.4907 - val_loss: 1.2039 - val_acc: 0.5779
Epoch 3/200
50000/50000 [==============================] - 15s - loss: 1.2460 - acc: 0.5531 - val_loss: 1.0272 - val_acc: 0.6311
and the whole thing takes just over 51 mins to run.
When I use Theano as my backend, I get this result:
deep#deep-Precision-7710:~/Downloads/keras/examples$ python cifar10_cnn.py
Using Theano backend.
X_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
Using real-time data augmentation.
Epoch 1/200
50000/50000 [==============================] - 292s - loss: 1.8008 - acc: 0.3286 - val_loss: 1.4991 - val_acc: 0.4613
Epoch 2/200
50000/50000 [==============================] - 285s - loss: 1.4302 - acc: 0.4774 - val_loss: 1.1840 - val_acc: 0.5737
Epoch 3/200
50000/50000 [==============================] - 288s - loss: 1.2690 - acc: 0.5452 - val_loss: 1.0930 - val_acc: 0.6030
This takes about 15.75 hours to run !?
Should I be surprised that the Keras Theano backend is about 18X-19X slower than the Keras TensorFlow backend? (I'm including the version of cifar10_cnn.py that I was using.) In converting from one backend to the other, I just changed the backend specification in keras.json. I didn't adjust the image_dim_ordering, since that seemed to be specified by the dataset.
'''Train a simple deep CNN on the CIFAR10 small images dataset.
GPU run command with Theano backend (with TensorFlow, the GPU is automatically used):
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python cifar10_cnn.py
It gets down to 0.65 test logloss in 25 epochs, and down to 0.55 after 50 epochs.
(it's still underfitting at that point, though).
'''
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
import datetime
batch_size = 32
nb_classes = 10
nb_epoch = 200
data_augmentation = True
# input image dimensions
img_rows, img_cols = 32, 32
# The CIFAR10 images are RGB.
img_channels = 3
# The data, shuffled and split between train and test sets:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# Convert class vectors to binary class matrices.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
# Let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False)  # randomly flip images

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(X_train)
    start = datetime.datetime.now()
    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(X_train, Y_train,
                                     batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0],
                        nb_epoch=nb_epoch,
                        validation_data=(X_test, Y_test))
    stop = datetime.datetime.now()
    print("\nTime to run:", stop - start)