Hello everyone. I am trying to train a model using a CNN and Keras, but the training doesn't finish: I get the warning below and training stops. I don't understand where the problem is. Can anyone give me advice on what I should change in the code?
# assuming the usual tf.keras imports, e.g.:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam

def myModel():
    no_Of_Filters = 60
    size_of_Filter = (5, 5)   # THIS IS THE KERNEL THAT MOVES AROUND THE IMAGE TO GET THE FEATURES.
                              # THIS WOULD REMOVE 2 PIXELS FROM EACH BORDER WHEN USING A 32x32 IMAGE
    size_of_Filter2 = (3, 3)
    size_of_pool = (2, 2)     # SCALE DOWN ALL FEATURE MAPS TO GENERALIZE MORE, TO REDUCE OVERFITTING
    no_Of_Nodes = 500         # NO. OF NODES IN HIDDEN LAYERS
    model = Sequential()
    model.add(Conv2D(no_Of_Filters, size_of_Filter,
                     input_shape=(imageDimesions[0], imageDimesions[1], 1),
                     activation='relu'))                # ADDING MORE CONVOLUTION LAYERS = LESS FEATURES BUT CAN CAUSE ACCURACY TO INCREASE
    model.add(Conv2D(no_Of_Filters, size_of_Filter, activation='relu'))
    model.add(MaxPooling2D(pool_size=size_of_pool))     # DOES NOT AFFECT THE DEPTH/NO OF FILTERS
    model.add(Conv2D(no_Of_Filters // 2, size_of_Filter2, activation='relu'))
    model.add(Conv2D(no_Of_Filters // 2, size_of_Filter2, activation='relu'))
    model.add(MaxPooling2D(pool_size=size_of_pool))
    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(no_Of_Nodes, activation='relu'))
    model.add(Dropout(0.5))                             # FRACTION OF INPUT NODES TO DROP WITH EACH UPDATE: 1 = ALL, 0 = NONE
    model.add(Dense(noOfClasses, activation='softmax')) # OUTPUT LAYER
    # COMPILE MODEL
    model.compile(Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
############################### TRAIN
model = myModel()
print(model.summary())
history = model.fit_generator(dataGen.flow(X_train, y_train, batch_size=batch_size_val),
                              steps_per_epoch=steps_per_epoch_val,
                              epochs=epochs_val,
                              validation_data=(X_validation, y_validation),
                              shuffle=1)
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 20000 batches). You may need to use the repeat() function when building your dataset.
While using generators, you can either run the model without the steps_per_epoch parameter and let the model figure out how many steps there are to cover an epoch:
history = model.fit_generator(dataGen.flow(X_train, y_train, batch_size=batch_size_val),
                              epochs=epochs_val,
                              validation_data=(X_validation, y_validation),
                              shuffle=1)
OR
you'll have to calculate steps_per_epoch and use it while training as follows:
history = model.fit_generator(dataGen.flow(X_train, y_train, batch_size=batch_size_val),
                              steps_per_epoch=data_samples // batch_size,
                              epochs=epochs_val,
                              validation_data=(X_validation, y_validation),
                              shuffle=1)
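For reference, a minimal sketch of computing steps_per_epoch from the training set size (assuming X_train and batch_size_val as in the question):

import math

# the generator must be able to yield at least steps_per_epoch * epochs batches,
# so cap steps_per_epoch at one full pass over the training data
steps_per_epoch_val = math.ceil(len(X_train) / batch_size_val)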
Let us know if the issue still persists. Thanks!
My training and loss curves look like the ones below, and yes, similar graphs have received comments like "classic overfitting", and I get it.
My model looks like below,
input_shape_0 = keras.Input(shape=(3,100, 100, 1), name="img3")
model = tf.keras.layers.TimeDistributed(Conv2D(8, 3, activation="relu"))(input_shape_0)
model = tf.keras.layers.TimeDistributed(Dropout(0.3))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Conv2D(16, 3, activation="relu"))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Conv2D(32, 3, activation="relu"))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Dropout(0.3))(model)
model = tf.keras.layers.TimeDistributed(Flatten())(model)
model = tf.keras.layers.TimeDistributed(Dropout(0.4))(model)
model = LSTM(16, kernel_regularizer=tf.keras.regularizers.l2(0.007))(model)
# model = Dense(100, activation="relu")(model)
# model = Dense(200, activation="relu",kernel_regularizer=tf.keras.regularizers.l2(0.001))(model)
model = Dense(60, activation="relu")(model)
# model = Flatten()(model)
model = Dropout(0.15)(model)
out = Dense(30, activation='softmax')(model)
model = keras.Model(inputs=input_shape_0, outputs = out, name="mergedModel")
def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer.lr
    return lr
opt = tf.keras.optimizers.RMSprop()
lr_metric = get_lr_metric(opt)
# merged.compile(loss='sparse_categorical_crossentropy',
#                optimizer='adam', metrics=['accuracy'])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt, metrics=['accuracy', lr_metric])
model.summary()
In the above model building code, please consider the commented lines as some of the approaches I have tried so far.
I have followed the suggestions given as answers and comments to this kind of question and none seems to be working for me. Maybe I am missing something really important?
Things that I have tried:
Dropouts at different places and different amounts.
Played with the inclusion and exclusion of dense layers and their number of units.
Tried different numbers of units in the LSTM layer (started from as low as 1; at 16 I have the best performance so far).
Came across weight regularization techniques and tried to apply them at different layers, as shown in the code above (I would like to know the principled way of deciding where to use it, rather than the simple trial and error I did, which seems wrong).
Implemented a learning rate scheduler that reduces the learning rate as the epochs progress, after a certain number of epochs.
Tried two LSTM layers with the first one having return_sequences=True.
After all these, I still cannot overcome the overfitting problem.
My data set is properly shuffled and divided in a train/val ratio of 80/20.
Data augmentation is one more thing I found commonly suggested that I have yet to try, but first I want to see whether I am making some mistake so far that I can correct, so I can avoid diving into data augmentation steps for now. My dataset has the following sizes:
Training images: 6780
Validation images: 1484
The numbers shown are samples, and each sample has 3 images. So basically, I input 3 images at once as one sample to my time-distributed CNN, which is then followed by the other layers shown in the model description. That means my training images number 6780 * 3 and my validation images 1484 * 3. Each image is 100 * 100 and has a single channel.
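In other words, the arrays fed to the model should have shapes roughly like this (channels last, assuming the usual X_train / X_validation naming):

X_train.shape        # (6780, 3, 100, 100, 1)
X_validation.shape   # (1484, 3, 100, 100, 1)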
I am using RMSprop as the optimizer, which performed better than Adam in my testing.
UPDATE
I tried some different architectures and some regularizations and dropouts at different places, and I am now able to achieve a val_acc of 59%. Below is the new model.
# kernel_regularizer=tf.keras.regularizers.l2(0.004)
# kernel_constraint=max_norm(3)
model = tf.keras.layers.TimeDistributed(Conv2D(32, 3, activation="relu"))(input_shape_0)
model = tf.keras.layers.TimeDistributed(Dropout(0.3))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Conv2D(64, 3, activation="relu"))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Conv2D(128, 3, activation="relu"))(model)
model = tf.keras.layers.TimeDistributed(MaxPooling2D(2))(model)
model = tf.keras.layers.TimeDistributed(Dropout(0.3))(model)
model = tf.keras.layers.TimeDistributed(GlobalAveragePooling2D())(model)
model = LSTM(128, return_sequences=True, kernel_regularizer=tf.keras.regularizers.l2(0.040))(model)
model = Dropout(0.60)(model)
model = LSTM(128, return_sequences=False)(model)
model = Dropout(0.50)(model)
out = Dense(30, activation='softmax')(model)
Try to perform Data Augmentation as a preprocessing step. Lack of data samples can lead to such curves. You can also try using k-fold Cross Validation.
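A minimal sketch of k-fold cross validation with scikit-learn (the names are placeholders; build_model() is assumed to return a freshly compiled model):

from sklearn.model_selection import KFold
import numpy as np

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
val_scores = []
for train_idx, val_idx in kfold.split(X):
    fold_model = build_model()                          # fresh model for every fold
    fold_model.fit(X[train_idx], y[train_idx],
                   validation_data=(X[val_idx], y[val_idx]),
                   epochs=30, batch_size=32, verbose=0)
    val_scores.append(fold_model.evaluate(X[val_idx], y[val_idx], verbose=0)[1])
print("mean val accuracy:", np.mean(val_scores))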
There are many ways to prevent overfitting, according to the papers below:
Dropout Layers (randomly disabling neurons). https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
Input Noise (e.g. random Gaussian noise on the images). https://arxiv.org/pdf/2010.07532.pdf
Random Data Augmentations (e.g. Rotating, Shifting, Scaling, etc.).
https://arxiv.org/pdf/1906.11052.pdf
Adjusting Number of Layers & Units.
https://clgiles.ist.psu.edu/papers/UMD-CS-TR-3617.what.size.neural.net.to.use.pdf
Regularization Functions (e.g. L1, L2, etc)
https://www.researchgate.net/publication/329150256_A_Comparison_of_Regularization_Techniques_in_Deep_Neural_Networks
Early Stopping: If you notice that for N successive epochs your model's training loss keeps decreasing while the model performs poorly on the validation data set, it is a good sign to stop the training.
Shuffling the training data or K-fold cross validation are also common ways of dealing with overfitting.
I found this great repository, which contains examples of how to implement data augmentations:
https://github.com/kochlisGit/random-data-augmentations
Also, this repository here seems to have examples of CNNs that implement most of the above methods:
https://github.com/kochlisGit/Tensorflow-State-of-the-Art-Neural-Networks
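As a rough sketch (not taken from the repositories above), random augmentations can also be added directly as Keras preprocessing layers; in recent TensorFlow versions they live under tf.keras.layers (older versions expose them under tf.keras.layers.experimental.preprocessing):

import tensorflow as tf

# the ranges are placeholders to tune for your data
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomTranslation(0.1, 0.1),
    tf.keras.layers.RandomZoom(0.1),
])
# applied inside a model, e.g. x = data_augmentation(inputs); it is only active during training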
The goal should be to get the model to predict correctly irrespective of the order in which the 3 images in the sample are arranged.
If the order of the images in each sample is not important for the training, I think your model does the inverse: the TimeDistributed layers followed by the LSTM take into account the order of the three images. As a solution, first, you can add samples by reordering the images of each sample (= augmented data). Second, try to consider the three images as one image with three channels and remove the TimeDistributed layers (I'm not sure that three channels are more efficient, but you can give it a try).
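A minimal sketch of the reordering idea (assuming X has shape (samples, 3, 100, 100, 1) and y holds the labels, as described in the question):

import numpy as np
from itertools import permutations

def augment_by_reordering(X, y):
    # create extra samples by permuting the order of the 3 images in each sample
    X_aug, y_aug = [X], [y]
    for perm in list(permutations(range(3)))[1:]:   # skip the original order
        X_aug.append(X[:, list(perm), ...])
        y_aug.append(y)
    return np.concatenate(X_aug), np.concatenate(y_aug)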
I am working on a Ph.D. project whose objective is to reduce CO2 emissions on Earth.
I have a dataset, and I was able to successfully implement a CNN, which gives 80% accuracy (worst-case scenario). However, the field where I work is very demanding, and I have the impression that I could get better accuracy with a well-optimized CNN.
How do experts design CNN's? How could I choose between Inception Modules, Dropout Regularization, Batch Normalization, convolutional filter size, size and depth of convolutional channels, number of fully-connected layers, activations neurons, etc? How do people navigate this large optimization problem in a scientific manner? The combinations are endless. Are there any real-life examples where this problem is navigated, addressing its full complexity (not just optimizing a few hyper-parameters)?
Note that my dataset is not too large, so the CNN models that I am considering should have very few parameters.
How do experts design CNN's? How could I choose between Inception Modules, Dropout Regularization, Batch Normalization, convolutional filter size, size and depth of convolutional channels, number of fully-connected layers, activations neurons, etc? How do people navigate this large optimization problem in a scientific manner? The combinations are endless.
You said truly that the combinations are huge in number, and without the right approach you may end up nowhere. As someone great said, machine learning is an art, not a science. Results are data-dependent. Here are a few tips regarding your concern.
Log Everything: During training, save the necessary logs of every experiment, such as training loss, validation loss, weight files, execution times, visualizations, etc. Some of these can be saved with CSVLogger, ModelCheckpoint, etc. TensorBoard is a great tool for inspecting both training logs and visualizations, and much more.
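For instance, a minimal sketch of wiring these up (the file paths are placeholders):

import tensorflow as tf

callbacks = [
    tf.keras.callbacks.CSVLogger("training_log.csv"),           # per-epoch loss/metrics
    tf.keras.callbacks.ModelCheckpoint("best_weights.h5",       # best weights seen so far
                                       monitor="val_loss", save_best_only=True),
    tf.keras.callbacks.TensorBoard(log_dir="logs"),             # curves, graphs, histograms
]
# model.fit(..., callbacks=callbacks)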
Strong Validation Strategies: This is very important. To build stable cross-validation (CV), we must have a good understanding of the data and of the challenges faced. We'll check and make sure the validation set has a distribution similar to the training set and the test set, and we'll try to make sure our models improve both on our CV and on the test set (if ground truth is available for the test set). Basically, partitioning the data randomly is usually not enough to satisfy this. Understanding the data and how we can partition it without introducing data leakage into our CV is key to avoiding overfitting.
Change Only One: During experiments, change one thing at a time and save the observations (logs) for those changes. For example, change the image size gradually from 224 (for example) to higher values and observe the results. We should start with a small combination. While experimenting with image size, fix other things like the model architecture, learning rate, etc. The same goes for the learning rate or the model architecture. However, later we may also need to change more than one thing once we get some promising combinations. In Kaggle competitions, these are very common approaches to follow. Below is a very simple example of this, but it is not limited in any way.
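As an assumed illustration (the concrete numbers are placeholders), a set of runs where only the image size changes might be logged like this:

# everything fixed except image size; record val_loss / val_acc for each run
experiments = [
    {"img_size": 224, "lr": 1e-3, "batch_size": 32, "architecture": "baseline"},
    {"img_size": 256, "lr": 1e-3, "batch_size": 32, "architecture": "baseline"},
    {"img_size": 320, "lr": 1e-3, "batch_size": 32, "architecture": "baseline"},
]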
However, as you said, your Ph.D. project is to reduce CO2 emissions on Earth. In my understanding, this is more of an application-specific problem than an algorithm-specific one. So we think it's better to take advantage of well-recognized pre-trained models.
In case we wish to write our CNN on our own, we should spend a decent amount of time on it. Start with a very simple one, for example:
Conv2D(16, 3, 'relu')  -> MaxPool(2)
Conv2D(32, 3, 'relu')  -> MaxPool(2)
Conv2D(64, 3, 'relu')  -> MaxPool(2)
Conv2D(128, 3, 'relu') -> MaxPool(2)
Here we gradually increase the depth while reducing the feature dimension. Toward the final layers, more semantic information emerges. While stacking Conv2D layers, it's common practice to increase the channel depth in an order such as 16, 32, 64, 128, etc. If we want to add an Inception or Residual block inside our network, I think we should do some basic math first about what feature properties will come out of it, etc. Following a concept like this, we may also wish to look at approaches like SENet and ResNeSt. About Dropout: if we observe that our model is getting overfitted during training, then we should add some. In the final layer, we may want to choose GlobalAveragePooling over the Flatten layer (FCC). We can probably now see that there are lots of ablation studies that need to be done to get a satisfactory CNN model.
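A minimal Keras version of the simple stack sketched above (the input size and class count are placeholders):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.GlobalAveragePooling2D(),       # preferred over Flatten, as noted above
    tf.keras.layers.Dense(10, activation='softmax'),
])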
In this regard, we suggest you explore the two most important things: (1) read one of the pre-trained model papers/blogs/videos about the strategies used to build the model (for example, check out this EfficientNet Explained), and (2) explore its source code. That will give you more intuition and encourage you to build your own giant.
We would like to end with one last working example. See the model diagram below: it's a small Inception network (source). If we look closely, we will see that it consists of the following three modules.
Conv Module
Inception Module
Downsample Module
Take a close look at each module's configuration, such as filter size, strides, etc. Let's try to understand and implement these modules. Before that, here are two good references (1, 2) to refresh the Inception concept.
Conv Module
From the diagram we can see that it consists of one convolutional layer, one batch normalization layer, and one ReLU activation. It produces C feature maps with K x K filters and S x S strides. To build it, we will create a class that inherits from tf.keras.layers.Layer:
class ConvModule(tf.keras.layers.Layer):
    def __init__(self, kernel_num, kernel_size, strides, padding='same'):
        super(ConvModule, self).__init__()
        # conv layer
        self.conv = tf.keras.layers.Conv2D(kernel_num,
                                           kernel_size=kernel_size,
                                           strides=strides, padding=padding)
        # batch norm layer
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = self.conv(input_tensor)
        x = self.bn(x, training=training)
        x = tf.nn.relu(x)
        return x
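A quick (assumed) shape check for this module:

x = tf.random.normal((1, 32, 32, 3))          # dummy batch
y = ConvModule(96, (3, 3), (1, 1))(x)
print(y.shape)                                 # (1, 32, 32, 96): 'same' padding and stride 1 keep H and W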
Inception Module
Next comes the Inception module. According to the diagram above, it consists of two convolutional modules whose outputs are then merged. To merge them, we need to ensure that the output feature map dimensions (height and width) are the same.
class InceptionModule(tf.keras.layers.Layer):
    def __init__(self, kernel_size1x1, kernel_size3x3):
        super(InceptionModule, self).__init__()
        # two conv modules: they will take the same input tensor
        self.conv1 = ConvModule(kernel_size1x1, kernel_size=(1,1), strides=(1,1))
        self.conv2 = ConvModule(kernel_size3x3, kernel_size=(3,3), strides=(1,1))
        self.cat = tf.keras.layers.Concatenate()

    def call(self, input_tensor, training=False):
        x_1x1 = self.conv1(input_tensor)
        x_3x3 = self.conv2(input_tensor)
        x = self.cat([x_1x1, x_3x3])
        return x
Here you may notice that we have hard-coded the exact kernel size and strides for both convolutional layers according to the network diagram. Also, in ConvModule we have already set padding to 'same', so that the feature map dimensions will be the same for both self.conv1 and self.conv2, which is required in order to concatenate them at the end.
Again, in this module two variables act as placeholders, kernel_size1x1 and kernel_size3x3. This is on purpose, because we will need different numbers of feature maps at different stages of the entire model. If we look at the model diagram, we will see that InceptionModule takes a different number of filters at different stages of the model.
Downsample Module
Lastly, the downsampling module. The main intuition behind downsampling is that we hope to get more relevant feature information that strongly represents the inputs to the model, as it tends to remove unwanted features so that the model can focus on the most relevant ones. There are many ways to reduce the dimension of the feature maps (or inputs), for example using strides of 2 or using a conventional pooling operation. There are many types of pooling operation, namely MaxPooling, AveragePooling, and GlobalAveragePooling.
From the diagram, we can see that the downsampling module contains one convolutional layer and one max-pooling layer whose outputs are later merged. If we look closely at the diagram (top right), we will see that the convolutional layer uses a 3 x 3 filter with 2 x 2 strides, and the pooling layer (here MaxPooling) uses a 3 x 3 pooling size with 2 x 2 strides. We also have to ensure that the dimensions coming from each of them are the same so they can be merged at the end. Remember that when we designed ConvModule we purposely set the padding argument to 'same'; in this case, however, we need to set it to 'valid'.
class DownsampleModule(tf.keras.layers.Layer):
    def __init__(self, kernel_size):
        super(DownsampleModule, self).__init__()
        # conv layer
        self.conv3 = ConvModule(kernel_size, kernel_size=(3,3),
                                strides=(2,2), padding="valid")
        # pooling layer
        self.pool = tf.keras.layers.MaxPooling2D(pool_size=(3, 3),
                                                 strides=(2,2))
        self.cat = tf.keras.layers.Concatenate()

    def call(self, input_tensor, training=False):
        # forward pass
        conv_x = self.conv3(input_tensor, training=training)
        pool_x = self.pool(input_tensor)
        # merged
        return self.cat([conv_x, pool_x])
Okay, now we have built all three modules, namely ConvModule, InceptionModule, and DownsampleModule. Let's initialize their parameters according to the diagram.
class MiniInception(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(MiniInception, self).__init__()
        # the first conv module
        self.conv_block = ConvModule(96, (3,3), (1,1))
        # 2 inception modules and 1 downsample module
        self.inception_block1 = InceptionModule(32, 32)
        self.inception_block2 = InceptionModule(32, 48)
        self.downsample_block1 = DownsampleModule(80)
        # 4 inception modules and 1 downsample module
        self.inception_block3 = InceptionModule(112, 48)
        self.inception_block4 = InceptionModule(96, 64)
        self.inception_block5 = InceptionModule(80, 80)
        self.inception_block6 = InceptionModule(48, 96)
        self.downsample_block2 = DownsampleModule(96)
        # 2 inception modules
        self.inception_block7 = InceptionModule(176, 160)
        self.inception_block8 = InceptionModule(176, 160)
        # average pooling
        self.avg_pool = tf.keras.layers.AveragePooling2D((7,7))
        # model tail
        self.flat = tf.keras.layers.Flatten()
        self.classifier = tf.keras.layers.Dense(num_classes, activation='softmax')

    def call(self, input_tensor, training=True, **kwargs):
        # forward pass
        x = self.conv_block(input_tensor)
        x = self.inception_block1(x)
        x = self.inception_block2(x)
        x = self.downsample_block1(x)
        x = self.inception_block3(x)
        x = self.inception_block4(x)
        x = self.inception_block5(x)
        x = self.inception_block6(x)
        x = self.downsample_block2(x)
        x = self.inception_block7(x)
        x = self.inception_block8(x)
        x = self.avg_pool(x)
        x = self.flat(x)
        return self.classifier(x)
The number of filters for each computational block is set according to the design of the model (see the diagram). After initializing all the blocks (in the __init__ function), we connect them according to the design (in the call function).
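A short (assumed) usage sketch, e.g. for 32 x 32 inputs such as CIFAR-10, where the spatial size works out to 7 x 7 before the average pooling:

model = MiniInception(num_classes=10)
_ = model(tf.random.normal((1, 32, 32, 3)))    # dummy forward pass to build the weights
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
# model.fit(x_train, y_train, validation_split=0.1, epochs=30, batch_size=64)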
I think you are way off on your estimate of the number of parameters needed. Think more like a few million, which is what you will get if you use transfer learning. You can struggle to make your own model if you wish, but you will probably not do any better (and more likely nowhere near as well) than the results you will get from transfer learning. I highly recommend the MobileNetV2 model. You can make that, or any of the other models, perform better if you use an adjustable learning rate with ReduceLROnPlateau; documentation for that is here. The other thing I recommend is to use the Keras callback EarlyStopping; documentation is here. Set it to monitor validation loss and set restore_best_weights=True. Set the number of epochs to a large number so this callback gets triggered and returns the model with the weights from the epoch with the lowest validation loss. My recommended code is shown below.
height = 224
width = 224
img_shape = (height, width, 3)
dropout = .3
lr = .001
class_count = 156   # number of classes

base_model = tf.keras.applications.MobileNetV2(include_top=False, input_shape=img_shape,
                                               pooling='max', weights='imagenet')
x = base_model.output
x = keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
x = Dense(512, kernel_regularizer=regularizers.l2(l=0.016),
          activity_regularizer=regularizers.l1(0.006),
          bias_regularizer=regularizers.l1(0.006), activation='relu',
          kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123))(x)
x = Dropout(rate=dropout, seed=123)(x)
output = Dense(class_count, activation='softmax',
               kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123))(x)
model = Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(lr=lr), loss='categorical_crossentropy', metrics=['accuracy'])

rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=1,
                                              verbose=1, mode='auto', min_delta=0.0001,
                                              cooldown=0, min_lr=0)
estop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", min_delta=0, patience=4,
                                         verbose=1, mode="auto", baseline=None,
                                         restore_best_weights=True)
callbacks = [rlronp, estop]
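A hedged sketch of how these callbacks might then be passed to training (the generator names are placeholders):

history = model.fit(train_generator,
                    validation_data=valid_generator,
                    epochs=100,              # deliberately large; EarlyStopping will stop earlier
                    callbacks=callbacks,
                    verbose=1)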
Also look at the balance in your data set; that is, compare how many training samples you have for each class. If the ratio of most samples to least samples is greater than 2 or 3, you may want to take action to mitigate that. Numerous methods are available; the simplest is to use the class_weight parameter in model.fit. To do that you need to create a class_weights dictionary. The process to do that is outlined below.
Let's say your class distribution is:
class0 - 500 samples
class1- 2000 samples
class2 - 1500 samples
class3 - 200 samples
Then your dictionary would be:
class_weights={0: 2000/500, 1:2000/2000, 2: 2000/1500, 3: 2000/200}
In model.fit, set class_weight=class_weights.
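The same dictionary can also be built programmatically, e.g. (a sketch assuming y_train holds integer class labels; take an argmax first if the labels are one-hot):

import numpy as np

counts = np.bincount(y_train)                                       # samples per class
class_weights = {i: counts.max() / c for i, c in enumerate(counts)}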
I am still relatively new to the world of deep learning. I wanted to create a deep learning model (preferably using TensorFlow/Keras) for image anomaly detection. By anomaly detection I mean, essentially, something like a OneClassSVM.
I have already tried sklearn's OneClassSVM using HOG features from the images. I was wondering if there is some example of how I can do this with deep learning. I looked it up but couldn't find a single code example that handles this case.
The way of doing this in Keras is with the KerasRegressor wrapper module (it wraps scikit-learn's regressor interface). Useful information can also be found in the source code of that module. Basically, you first have to define your network model, for example:
from keras.layers import Input, Dense
from keras.models import Model

def simple_model():
    # Input layer
    data_in = Input(shape=(13,))
    # First layer, fully connected, ReLU activation
    layer_1 = Dense(13, activation='relu', kernel_initializer='normal')(data_in)
    # Second layer... etc.
    layer_2 = Dense(6, activation='relu', kernel_initializer='normal')(layer_1)
    # Output, single node without activation
    data_out = Dense(1, kernel_initializer='normal')(layer_2)
    # Save and compile model
    model = Model(inputs=data_in, outputs=data_out)
    # you may choose any loss or optimizer function, be careful which you choose
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
Then, pass it to the KerasRegressor builder and fit it with your data:
from keras.wrappers.scikit_learn import KerasRegressor

# choose your epochs and batch size
regressor = KerasRegressor(build_fn=simple_model, epochs=100, batch_size=64)
# fit with your data
regressor.fit(data, labels)
With it you can now make predictions or obtain its score:
p = regressor.predict(data_test) #obtain predicted value
score = regressor.score(data_test, labels_test) #obtain test score
In your case, as you need to detect anomalous images from the ones that are ok, one approach you can take is to train your regressor by passing anomalous images labeled 1 and images that are ok labeled 0.
This will make your model return a value closer to 1 when the input is an anomalous image, enabling you to threshold for the desired results. You can think of this output as its R^2 coefficient to the "anomalous model" you trained, with 1 being a perfect match.
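Thresholding the regressor's output could then look like this (a sketch; the 0.5 cut-off is a placeholder to tune on a validation set):

import numpy as np

p = regressor.predict(data_test)
is_anomalous = np.asarray(p) > 0.5      # True where the model considers the image anomalous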
Also, as you mentioned, Autoencoders are another way to do anomaly detection. For this I suggest you take a look at the Keras Blog post Building Autoencoders in Keras, where they explain in detail about the implementation of them with the Keras library.
It is worth noting that single-class classification is another way of saying regression.
Classification tries to find a probability distribution among the N possible classes, and you usually pick the most probable class as the output (that is why most Classification Networks use Sigmoid activation on their output labels, as it has range [0, 1]). Its output is discrete/categorical.
Similarly, Regression tries to find the best model that represents your data, by minimizing the error or some other metric (like the well-known R^2 metric, or Coefficient of Determination). Its output is a real number/continuous (and the reason why most Regression Networks don't use activations on their outputs). I hope this helps, good luck with your coding.