How to customize AlexNet for different uses - tensorflow

I've been learning some machine learning this past week and have been messing around with my own regression CNN that takes 128x128 color images as input and outputs a rating. Although my dataset is small (400ish images total), I got alright results with a little overfitting (mean_absolute_error of 0.5 for training and 0.9 for testing, on a scale of 1-10) with the model shown below:
model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=(128, 128, 3)),
    keras.layers.Dropout(0.15),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    keras.layers.Conv2D(64, kernel_size=(5, 5), activation='relu'),
    keras.layers.Dropout(0.15),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1)
])
However, not being satisfied with the results, I wanted to try a tried-and-true model, so I used AlexNet:
model = keras.Sequential([
    keras.layers.Conv2D(filters=96, kernel_size=(11, 11), strides=(4, 4), activation='relu', input_shape=(128, 128, 3), padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(filters=256, kernel_size=(11, 11), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1)
])
However, it converged much more slowly and pretty much plateaued at an MAE of 1.2 for train and 0.9 for test. Although this does show less overfitting, I thought it was strange that I still got the same test results. Is my implementation of AlexNet flawed, or is this just not the right application for AlexNet? I understand it is usually used for classification, but I figured it might be worth trying with regression. Any info/suggestions/criticisms help, thanks!

I don't see anything clearly wrong with your AlexNet implementation, but I'd like to point out a few things.
The way dropout is used in the first model
It is not standard to apply Dropout like that after a convolution output. When you apply dropout this way, individual values in the convolution output get randomly switched off. But unlike a fully connected layer, convolution outputs have a "spatial" structure. What I'm trying to say is that it makes more sense to switch off full channels than to switch off random neurons. Think of one channel output as corresponding to a single neuron of a fully connected layer (not the best analogy, but it helps to understand my proposal).
The other option is to get rid of Dropout after convolution outputs and only have Dropout after fully connected layers.
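To make the "switch off full channels" idea concrete, here is a minimal pure-Python sketch of channel-wise dropout. It is only an illustration: the function name `spatial_dropout` and the toy feature map are made up for this example. In Keras, the built-in `keras.layers.SpatialDropout2D` layer is the one that drops entire feature maps in this way.

```python
import random

def spatial_dropout(feature_map, rate, rng):
    """Channel-wise dropout on an H x W x C feature map given as nested lists.

    Each channel is kept or zeroed as a whole. Kept channels are scaled by
    1 / (1 - rate) so the expected activation is unchanged (inverted dropout).
    """
    num_channels = len(feature_map[0][0])
    # One keep/drop decision per channel, shared by every spatial position.
    keep = [rng.random() >= rate for _ in range(num_channels)]
    scale = 1.0 / (1.0 - rate)
    return [
        [[value * scale if keep[c] else 0.0 for c, value in enumerate(pixel)]
         for pixel in row]
        for row in feature_map
    ]

# Hypothetical 2 x 2 feature map with 4 channels, all activations equal to 1.0.
feature_map = [[[1.0] * 4 for _ in range(2)] for _ in range(2)]
dropped = spatial_dropout(feature_map, rate=0.5, rng=random.Random(0))

# Every spatial position shows the same pattern of dropped channels.
for row in dropped:
    for pixel in row:
        print(pixel)
```

With ordinary Dropout the zeros would be scattered independently over all positions and channels. Here a dropped channel is zero everywhere, which respects the spatial structure of the convolution output.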
Time taken to converge for AlexNet
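AlexNet is significantly deeper than model 1 (five convolutional layers and three hidden dense layers versus two and one), so it makes sense for it to take longer to converge. Note that it does not actually have more parameters: the Dense(1000) layer in model 1 sits on a 29x29x64 flattened input, which alone is roughly 53.8 million weights.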
Why is the accuracy low?
One thing I can think of is the size of the output just before the Flatten() layer. In model 1 it is 29x29, whereas with AlexNet it is 4x4, which is very small. So your fully connected layers have very little spatial information coming from the convolution layers. This might be causing AlexNet to underperform (just a speculation).
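The spatial sizes before Flatten() can be checked by hand. The sketch below applies the usual output-size rules for 'valid' and 'same' padding to the layer settings in the question; the helper names are made up for this example, and in practice `model.summary()` reports the same shapes.

```python
import math

def conv_or_pool_out(size, kernel, stride, padding):
    """Output length along one spatial axis for a conv or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    # 'valid' padding: only positions where the window fits completely.
    return (size - kernel) // stride + 1

def final_size(input_size, layers):
    size = input_size
    for kernel, stride, padding in layers:
        size = conv_or_pool_out(size, kernel, stride, padding)
    return size

# (kernel, stride, padding) of each conv / pool layer, read from the question.
model_1 = [(5, 1, "valid"), (2, 2, "valid"), (5, 1, "valid"), (2, 2, "valid")]
alexnet = [(11, 4, "same"), (2, 2, "same"), (11, 1, "same"), (2, 2, "same"),
           (3, 1, "same"), (3, 1, "same"), (3, 1, "same"), (2, 2, "same")]

size_1 = final_size(128, model_1)
size_a = final_size(128, alexnet)
flat_1 = size_1 * size_1 * 64    # last conv in model 1 has 64 filters
flat_a = size_a * size_a * 256   # last conv in the AlexNet variant has 256 filters

print(size_1, flat_1)  # 29 53824
print(size_a, flat_a)  # 4 4096
```

Most of the shrinkage in the AlexNet variant comes from the stride of 4 in the first convolution, which is designed for larger inputs than 128x128.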

Related

Can a convolutional neural network (CNN) be used for feature extraction in unsupervised learning?

I'm doing a side project to learn AI with ANNs. I thought of making an unsupervised model that extracts the features of each frame of a video, so that I can compare them later and detect repeated images.
My idea is to use a CNN to extract the features of each frame, but I can't seem to make it work. As I am still learning, my intuition tells me that there is something I am just not understanding.
How can I create an unsupervised model that extracts features from an array of images?
This is what I got:
img = load_image_func(???) # this loads a video and returns a reshaped, ordered list of frames
input_shape = (150, 150, 3)
# The model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', name='conv_1', input_shape=input_shape))
model.add(MaxPooling2D((2, 2), name='maxpool_1'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='conv_2'))
model.add(MaxPooling2D((2, 2), name='maxpool_2'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='conv_3'))
model.add(MaxPooling2D((2, 2), name='maxpool_3'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='conv_4'))
model.add(MaxPooling2D((2, 2), name='maxpool_4'))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu', name='dense_1'))
model.add(Dense(128, activation='relu', name='dense_2'))
model.add(Dense(67500, activation='sigmoid', name='output'))
optimizer=keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss = 'sparse_categorical_crossentropy', optimizer= optimizer, metrics=['accuracy'])
#model.summary()
model.fit(vidcap, vidcap, batch_size=64, epochs=20)
I have the feeling I should be training the model, but as it is unsupervised I don't have training data.
Also, how many units should I put in the output layer, as I don't know how many features will be detected?
Thanks for your time
Indeed, a CNN model will extract several features of an image (e.g. colors, shapes, edges, patterns, etc.).
However, what are you defining as image repetition? Are you looking for an algorithm that finds similar images? If this is the case, then you might want to look into Siamese networks, which do exactly that:
https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
The main idea here is that there are two neural networks (typically sharing the same weights) that are trained together. After the training is done, you use both networks to extract features of the images separately and compare the results to find the similarity.
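The "compare the results" step can be sketched independently of the network. Assuming each frame has already been turned into a feature vector, one common choice (an assumption here, not something the linked paper prescribes) is cosine similarity with a threshold. The vectors, the threshold of 0.95, and the function names below are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_repeats(features, threshold=0.95):
    """Return index pairs (i, j), i < j, whose feature vectors are similar."""
    repeats = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_similarity(features[i], features[j]) >= threshold:
                repeats.append((i, j))
    return repeats

# Made-up feature vectors standing in for the network's output on four frames.
frame_features = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.88, 0.12, 0.01],  # nearly the same direction as frame 0
    [0.1, 0.0, 1.0],
]

print(find_repeats(frame_features))  # [(0, 2)]
```

The pairwise loop is quadratic in the number of frames, so for long videos an approximate nearest-neighbour index would probably be needed instead.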

Modifying an SSD net in TensorFlow

If I were using Keras, modifying the architecture would be a straightforward modification of the network layers:
x = Conv2D(32, (3, 3), padding="same")(inputs)
x = Activation("relu")(x)
x = Conv2D(32, (3, 3), padding="same")(x)
x = Activation("relu")(x)
x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
But I can't seem to find that structure in TensorFlow:
https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v2_coco.config
Am I looking at the right file? In other words, how would I go about adding extra layers (whether convolutional, maxpool, or fully connected) to my tensorflow model?
It looks like the actual model has been abstracted away to
https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet_v2.py
Though I would imagine there has got to be a simpler way to change the architecture of the neural net than just modifying that file...

Batch normalization destroys validation performance

I'm adding some batch normalization to my model in order to improve the training time, following some tutorials.
This is my model:
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu', input_shape=(64,64,3)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
#NB: adding more parameters increases the probability of overfitting!! Try to cut instead of adding neurons!!
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=20, activation='softmax'))
Without batch normalization, I get around 50% accuracy on my data. Adding batch normalization destroys my performance, with validation accuracy reduced to 10%.
Why is this happening?
I'm not sure if this is what you are asking, but batch normalization is still active during validation. It's just that its parameters (the learned scale and shift, plus the moving mean and variance) are set during training and not altered during validation.
As for why batch normalization is not good for your model/problem in general, it's like any hyperparameter: some work well in some scenarios and not well in others. Do you know if this is the best placement for BN within your network? Other than that, I would need to know more about your data and problem to give any further guesses.
Try using fewer batch normalization layers. It is a general practice to use one at the last convolution layer. Start with just one of them and add more if it improves the validation accuracy.
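The difference between training and validation behaviour can be sketched with a toy layer. This is not Keras's implementation: the class `BatchNorm1D`, its momentum of 0.9, and the input values are assumptions made for illustration, and the learned scale and shift are left out. It shows only one possible reason for a train/validation gap, namely moving statistics that have not yet caught up with the data.

```python
class BatchNorm1D:
    """Toy batch normalization over a single feature (no learned scale/shift)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Training: normalize with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update the running statistics that inference will rely on.
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # Validation: reuse the frozen moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]

bn = BatchNorm1D()
train_batch = [10.0, 12.0, 14.0]

# After a single training step the moving mean has barely left its initial 0.0.
train_out = bn(train_batch, training=True)
early_val_out = bn(train_batch, training=False)

# After many steps the moving statistics approach the batch statistics.
for _ in range(200):
    bn(train_batch, training=True)
late_val_out = bn(train_batch, training=False)

print(train_out)
print(early_val_out)
print(late_val_out)
```

Early on, the same inputs are normalized very differently in the two modes. Once the moving statistics have converged, the validation output matches the training output closely.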

Out of memory (OOM) error of tensorflow/keras model

When I tried to add dropout to the Keras model, it caused an OOM error:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[128,128,176,216]...
The model is supposed to be an autoencoder that produces x4 super resolution.
autoencoder = Sequential()
autoencoder.add(Conv2D(64*comlex, (3, 3), activation='relu',
                       padding='same', input_shape=x_train[0].shape))
autoencoder.add(Dropout(0.25))
autoencoder.add(UpSampling2D((2, 2)))
autoencoder.add(Conv2D(64*comlex, (3, 3), activation='relu', padding='same'))
# autoencoder.add(Dropout(0.25))
autoencoder.add(UpSampling2D((2, 2)))
autoencoder.add(Conv2D(3, (3, 3), activation='sigmoid', padding='same'))
autoencoder.compile(optimizer='adam', loss='binary_crossentropy',metrics=['accuracy'])
The commented-out line causes the OOM when enabled.
Why does dropout take so much memory?
Update:
TensorFlow/Keras OOM (out of memory) errors occur when the model needs more memory than the GPU has, e.g. because of large image sizes or a large number of feature maps combined with a high batch size, all of which increase GPU memory consumption.
Sometimes the error is also related to GPU memory still held by previous processes.
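To get a feel for the numbers, the size of the tensor named in the error message can be estimated directly from its shape. The sketch assumes float32 values (4 bytes each) and that the first dimension is the batch size. Both are assumptions, since the error message shown does not state them.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed to hold one dense tensor (4 bytes assumes float32)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element

shape = (128, 128, 176, 216)   # shape reported in the error message
size = tensor_bytes(shape)
print(size)                    # 2491416576 bytes
print(round(size / 2**30, 2))  # 2.32 (GiB) for this single tensor

# Memory scales linearly with the first dimension: a quarter of it
# needs a quarter of the memory.
smaller = tensor_bytes((32, 128, 176, 216))
print(round(smaller / 2**30, 2))  # 0.58
```

A single activation of this size already takes over 2 GiB, and every additional layer that produces an output of the same shape needs a buffer of its own during training, so reducing the batch size is usually the first thing to try.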

Tensorboard and Dropout Layers

I have a very basic query. I have made 4 almost identical CNNs (the difference being the input shapes) and have merged them, connecting them to a feed-forward network of fully connected layers.
Code for the almost identical CNN(s):
model3 = Sequential()
model3.add(Convolution2D(32, (3, 3), activation='relu', padding='same',
                         input_shape=(batch_size[3], seq_len, channels)))
model3.add(MaxPooling2D(pool_size=(2, 2)))
model3.add(Dropout(0.1))
model3.add(Convolution2D(64, (3, 3), activation='relu', padding='same'))
model3.add(MaxPooling2D(pool_size=(2, 2)))
model3.add(Flatten())
But on TensorBoard I see that all the Dropout layers are interconnected, and Dropout1 is a different color than Dropout2, 3, 4, etc., which are all the same color.
I know this is an old question, but I had the same issue myself and just now realized what's going on.
Dropout is only applied while we're training the model. It should be deactivated when we're evaluating/predicting. For that purpose, Keras creates a learning_phase placeholder, set to 1.0 if we're training the model.
This placeholder is created inside the first Dropout layer you create and is shared across all of them. So that's what you're seeing there!
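The sharing described above can be mimicked in a few lines of plain Python. This is only an analogy, not how Keras is implemented: `LearningPhase` and `ToyDropout` are made-up names, and a single object plays the role of the learning_phase placeholder that every Dropout layer reads.

```python
import random

class LearningPhase:
    """Stand-in for a single shared training/inference flag."""
    def __init__(self):
        self.training = False

class ToyDropout:
    def __init__(self, rate, phase, seed=0):
        self.rate = rate
        self.phase = phase          # every layer holds a reference to the same flag
        self.rng = random.Random(seed)

    def __call__(self, values):
        if not self.phase.training:
            return list(values)     # inference: dropout is a no-op
        scale = 1.0 / (1.0 - self.rate)
        return [v * scale if self.rng.random() >= self.rate else 0.0
                for v in values]

phase = LearningPhase()
dropout_1 = ToyDropout(0.5, phase, seed=1)
dropout_2 = ToyDropout(0.5, phase, seed=2)

x = [1.0, 2.0, 3.0, 4.0]
print(dropout_1(x), dropout_2(x))   # both unchanged while training is False

phase.training = True               # flipping one flag affects both layers
print(dropout_1(x), dropout_2(x))
```

Because both layers depend on the same `phase` object, a graph of their dependencies would draw an edge from each of them to that one node, which is roughly what the connected Dropout layers in TensorBoard show.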