Possible to train Tensorflow Inception V3 models with images greater than 299x299? - tensorflow

Is it possible to train a TensorFlow Inception V3 model with an image size greater than 299x299? It seems that the Inception V3 CNN is designed for this image size only.

As long as you do not include the fully connected (Dense) layers at the top, it should be fine to use a different image size.
You can do that by passing this argument when loading the model:
base_model = InceptionV3(weights=weights, include_top=False)
The convolutional layer weights are, in general, independent of the image size, so you can reuse them. The FC layer of the pre-trained network with n fully connected nodes has a weight matrix of size [m x n], so it expects its input to be of size m. However, due to the change in image size, you will end up with a different value for m when you feed in images from the new dataset (the convolution filters convolve over a different image size).
After adding new dense layers, you can fine-tune the network by training only the top layers (keeping the weights of the conv blocks below them fixed).
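A minimal sketch of that workflow with tf.keras (the input size, pooling choice, layer widths and number of classes below are illustrative placeholders, not taken from the question):
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

# Convolutional base only; any size >= 75x75 is accepted when include_top=False.
base_model = InceptionV3(weights='imagenet', include_top=False,
                         input_shape=(450, 450, 3))
base_model.trainable = False  # keep the pretrained conv weights fixed

# New head; global average pooling makes it independent of the image size.
x = layers.GlobalAveragePooling2D()(base_model.output)
x = layers.Dense(256, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)  # 10 classes as an example

model = models.Model(base_model.input, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])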

Related

Changing a trained static input shape to a dynamic shape in Keras

I have trained my MobileNetV3Small with 2 dense layers and got my result. But I set (640, 480, 3) as my input shape, and now I need to test some images of different sizes. Since the model starts with convolutional layers, the size should not matter, but I received errors because it requires its defined size (640, 480, 3), and padding and resizing didn't perform well. So I want to change the input shape to (None, None, 3) without retraining from scratch. It took 3 days to train my model and I don't want to train it again just to change the input size.
Maybe you could try this on your Keras.Model instance:
model.input.set_shape((None, None, 3))
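If that alone doesn't relax the graph's input shape, a common workaround is to rebuild the same architecture with an unconstrained input and copy the trained weights over. A sketch, assuming the two dense layers sit on a GlobalAveragePooling2D layer (so their weights don't depend on the spatial size); the layer widths and the name trained_model are placeholders:
import tensorflow as tf

# Rebuild the identical architecture, but with flexible spatial dimensions.
backbone = tf.keras.applications.MobileNetV3Small(
    input_shape=(None, None, 3), include_top=False, weights=None)
x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
x = tf.keras.layers.Dense(128, activation='relu')(x)        # placeholder widths
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)
flexible_model = tf.keras.Model(backbone.input, outputs)

# trained_model is your existing (640, 480, 3) model; this only works if the
# two models match layer-for-layer.
flexible_model.set_weights(trained_model.get_weights())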

SegNet for CT images pretrained weights

I'm trying to train a SegNet for a segmentation task on CT images (with Keras/TF).
I'm using VGG16 pretrained weights, but I had a problem with the first convolutional layer because I'm using grayscale images while VGG was trained on RGB ones.
I solved that using the second method of this (I can't use the first method because it requires too much memory).
However, it didn't help; the results are really bad (trained for 100 epochs).
Should I train the first convolutional layer from scratch?
You can try to add a Conv2D before the VGG. Something like:
> Your Input (shape=(height, width, 1))
> Conv2D(filters=3, kernel_size=1, padding='same', activation='relu')
> The VGG16 pretrained network (input = (height, width, 3))
This is interesting in your case because a 1x1 convolution is usually employed to change the depth (number of channels) of a feature map.
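A runnable sketch of that stack in Keras (height, width and the decoder part are placeholders; only the encoder side is shown):
import tensorflow as tf

height, width = 256, 256  # placeholder CT slice size

inputs = tf.keras.Input(shape=(height, width, 1))              # grayscale input
x = tf.keras.layers.Conv2D(filters=3, kernel_size=1, padding='same',
                           activation='relu')(inputs)          # 1x1 conv: 1 -> 3 channels
vgg = tf.keras.applications.VGG16(include_top=False, weights='imagenet',
                                  input_shape=(height, width, 3))
encoder_output = vgg(x)
# ... attach the SegNet decoder to encoder_output as before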

Retraining InceptionV3: Does not converge with input dimensions 299x299

I'm using Keras with pretrained weights for the Conv Layers and want to train only the dense layer.
The model performs as expected when I use input dimensions of 150x150 or 224x224, but it does not converge with 299x299 (the training loss increases, and the train & validation accuracy remain flat, equivalent to random guessing).
Why does this happen?
Solved: the learning rate needs to be adjusted depending on the input size of the CNN.
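For reference, in Keras that just means recompiling the model from the question with a smaller step size (a sketch; the value below is illustrative, not taken from the answer):
from tensorflow.keras.optimizers import Adam

# Use a smaller learning rate for the larger 299x299 input (value is illustrative).
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])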

How to change number of channels to fine tune VGG16 net in Keras

I would like to fine tune the VGG16 model using my own grayscale images. I know I can fine tune/add my own top layers by doing something like:
base_model = keras.applications.vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(im_height,im_width,channels))
but only when channels = 3 according to the documentation.
I have thought of simply adding two redundant channels to my images, but this seems like a waste of computation and could make the classification worse. I could also replicate the same image across three channels, but I am similarly unsure of how it would perform.
Keras pre-trained models were trained on color images, and if you want to use their full power, you should use color images for fine-tuning. However, if you have grayscale images you can still use these pre-trained models by repeating your grayscale image over three channels. Obviously, it will not work as well as using color images as input.
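Repeating the single channel is a one-liner (a sketch; gray_batch is a placeholder NumPy array of shape (N, H, W, 1)):
import numpy as np

rgb_like = np.repeat(gray_batch, 3, axis=-1)   # (N, H, W, 1) -> (N, H, W, 3)
# tf.image.grayscale_to_rgb does the same for TensorFlow tensors.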
The VGG Keras model uses the function keras.applications.imagenet_utils._obtain_input_shape.
This function was tailored for ImageNet data, so it enforces the number of input channels to be 3. One possible workaround is to copy the VGG16 module and replace the line:
input_shape = _obtain_input_shape(input_shape, default_size=224, min_size=48, data_format=K.image_data_format(), include_top=include_top)
with:
input_shape = (im_height, im_width, 1)
As a side note, you will not be able to load the ImageNet weights, since your input space has changed and the first-layer convolutions will not match.
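As an aside (this is an assumption about more recent Keras versions, not part of the original answer): the 3-channel check is only enforced when weights='imagenet', so passing weights=None may already let you build a single-channel VGG16 without editing the module:
from tensorflow.keras.applications import VGG16

im_height, im_width = 224, 224  # placeholders

# Random initialization; ImageNet weights cannot be used with 1-channel input.
base_model = VGG16(include_top=False, weights=None,
                   input_shape=(im_height, im_width, 1))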

Retrain last inception or mobilenet layer to work with INPUT_SIZE 64x64 or 32x32

I want to retrain the last Inception or MobileNet layer so it classifies my own objects (about 5-15 classes).
I also want this to work with INPUT_SIZE == 64x64 or 32x32 (not 224 like the default Inception model).
I found some articles about retraining models:
https://hackernoon.com/creating-insanely-fast-image-classifiers-with-mobilenet-in-tensorflow-f030ce0a2991
https://medium.com/@daj/creating-an-image-classifier-on-android-using-tensorflow-part-3-215d61cb5fcd
For MobileNet they say:
the input image size, either '224', '192', '160', or '128'
so I can't train with 64 or 32 (which is bad): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py#L80
What about the Inception models? Can I somehow train models to work with small input image sizes (to get results faster)?
The objects I want to classify from such small images will already be cropped out of their parent image (for example, from camera frames); they could be traffic/road signs cropped by fast cascade classifiers (LBP/Haar) trained to detect everything that looks like a sign's shape (triangle, rhombus, circle).
So 64x64 images which fully contain only the object of interest should be enough for classification.
No, you still can: use the smallest option, which would be 128. The retraining script will just scale your 32 or 64 pixel images up, which is fine.
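In other words, you can keep your small crops and just upscale them before (or as part of) feeding the 128-pixel MobileNet (a sketch with placeholder data):
import tensorflow as tf

crops = tf.zeros([8, 64, 64, 3])                 # e.g. cropped traffic-sign patches
upscaled = tf.image.resize(crops, [128, 128])    # feed these to the MobileNet-128 model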
It's not possible for the classifiers,
but it becomes possible with the TensorFlow Object Detection API (where you can set any input size): https://github.com/tensorflow/models/tree/master/research/object_detection