How to add customm layers inside vgg16 when doing transfer learning? - tensorflow

I am trying to use transfer learning using vgg16. My main concept is to train the first few layers of vgg16, and add my own layer, afterwords add the rest of the layers from vgg16, and add my own output layer to the end. To do this I follow this sequence: (1) load layers and freez layers, (2) add my layers, (3) load the rest of layers (except the output layer) [THIS IS WHERE I ENCOUNTER THE FOLLOWING ERROR] and freez the layer, (4) add output layer. Is my approach ok? If not, then where I am doing wrong? Here's the error:
ValueError: Input 0 is incompatible with layer block3_conv1: expected axis -1 of input shape to have value 128 but got shape (None, 64, 56, 64)
The full code is here for better understanding:
vgg16_model= load_model('Fetched_VGG.h5')
vgg16_model.summary()
model= Sequential()
#add vgg layer (inputLayer, block1, block2)
for layer in vgg16_model.layers[0:6]:
model.add(layer)
#frees
# Freezing the layers (Oppose weights to be updated)
for layer in model.layers:
layer.trainable = False
#add custom
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv1_m') )
model.add( Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv2_m') )
model.add( Conv2D(64, (3, 3), activation='relu', padding='same', name='block66_conv3_m') )
model.add( MaxPooling2D((2, 2), strides=(2, 2), name='block66_pool_m'))
# add vgg layer (block 3 to last layer (except the output dense layer))
for layer in vgg16_model.layers[7:-1]:
model.add(layer)
# Freezing the layers (Oppose weights to be updated)
for layer in model.layers:
layer.trainable = False
# add out out layer
model.add(Dense(2, activation='softmax', name='predictions'))
model.summary()

As VGG16 layer 7 is expecting 128 filters you'll need to match this with your final Conv2D
model.add( Conv2D(128, (3, 3), activation='relu', padding='same', name='block66_conv3_m') )
If the dimensions match you should be able to build your model but it's not clear what you're trying to achieve. Your approach of adding to the middle of the VGG16 model will mean that all the downstream layers will need to be retrained

Related

Passing output of 3DCNN layer to LSTM layer

Whilst trying to learn Recurrent Neural Networks(RNNs) am trying to train an Automatic Lip Reading Model using 3DCNN + LSTM. I tried out a code I found for the same on Kaggle.
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.get_output_shape_at(0)
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add((Flatten()))
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()
However, it returns the following error:
11 model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
12
---> 13 shape = model.get_output_shape_at(0)
14 model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
15
RuntimeError: The layer sequential_2 has never been called and thus has no defined output shape.
From my understanding, I see that the author of the code was trying to get the output shape of the first layer and reshape it such as to forward to the LSTM layer.
Found a similar post following which I made the following changes and the error was fixed.
shape = model.layers[-1].output_shape
# shape = model.get_output_shape_at(0)
Still I am confused as to what the code does to forward the input from the CNN layer to LSTM layer. Any help to make me understand the above is appreciated. Thank You!!
When you are passing the code from top to bottom then the inputs are flowing in the graph from top to bottom, you are getting this error because you can't call this function on eager mode, as Tensorflow 2.0 is fully transferred to eager mode, so, once you will fit the function and train it 1 epoch then you can use model.get_output_at(0) otherwise use mode.layers[-1].output.
The CNN Layer will extract the features locally then LSTM will sequentially extract and learn the feature, using CONV with LSTM is a good approach, but I will recommend you directly using tf.keras.layers.ConvLSTM3D. Check it here https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM3D
tf.keras.backend.clear_session()
model = Sequential()
# 1st layer group
model.add(Conv3D(32, (3, 3, 3), strides = 1, input_shape=(22, 100, 100, 1), activation='relu', padding='valid'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(64, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
model.add(Conv3D(128, (3, 3, 3), activation='relu', strides=1))
model.add(MaxPooling3D(pool_size=(2, 2, 2), strides=2))
shape = model.layers[-1].output_shape
model.add(Reshape((shape[-1],shape[1]*shape[2]*shape[3])))
# LSTMS - Recurrent Network Layer
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(.5))
model.add((Flatten()))
# # FC layers group
model.add(Dense(2048, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.summary()

Create my own convolutional network without using Keras

I've just started with AI and implementing my own CONV networks.
I have this conv network:
model = tf.keras.models.Sequential([
# Note the input shape is the desired size of the image 150x150 with 3 bytes color
tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3)),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# Flatten the results to feed into a DNN
tf.keras.layers.Flatten(),
# 512 neuron hidden layer
tf.keras.layers.Dense(512, activation='relu'),
# Only 1 output neuron. It will contain a value from 0-1 where 0 for 1 class ('cats') and 1 for the other ('dogs')
tf.keras.layers.Dense(1, activation='sigmoid')
])
Is there any way to implement the Conv2D layers without using keras?
I want to try to understand how they work implementing them by myself.

Pretrained Tensorflow model RGB -> RGBY channel extension

I am working on the protein analysis project. We receive the images* of proteins with 4 filters (Red, Green, Blue and Yellow). Every of those RGBY channels contains unique data as different cellular structures are visible with different filters.
The idea is to use a pre-trained network e.g. VGG19 and extend the number of channels from default 3 to 4. Something like this:
(My appologies, I am not allowed to add images directly before 10 reputation, please press the "Run code snippet" button to visualize):
<img src="https://i.stack.imgur.com/TZKka.png" alt="Italian Trulli">
Picture: VGG model with RGB extended to RGBY
The Y channel should be the copy of the existing pretrained channel. Then it is possible to make use of the pretrained weights.
Does anyone have an idea of how such extension of a pretrained network can be achieved?
*
Author of the collage - Allunia from Kaggle, "Protein Atlas - Exploration and Baseline" kernel.
Use the layer.get_weights() and layer.set_weights() functions of Keras api.
Create a template structure for 4-layers VGG (set input shape=(width, height, 4)). Then load the weights from 3-channel RGB model into 4-channel as RGBB.
Below is the code that does the procedure. In case of sequential VGG, the only layer that needs to be modified is the first Convolution layer. The structure of the subsequent layers is independent on the number of channels.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from keras.applications.vgg19 import VGG19
from keras.models import Model
vgg19 = VGG19(weights='imagenet')
vgg19.summary() # To check which layers will be omitted in 'pretrained' model
# Load part of the VGG without the top layers into 'pretrained' model
pretrained = Model(inputs=vgg19.input, outputs=vgg19.get_layer('block5_pool').output)
pretrained.summary()
#%% Prepare model template with 4 input channels
config = pretrained.get_config() # run config['layers'][i] for reference
# to restore layer-by layer structure
from keras.layers import Input, Conv2D, MaxPooling2D
from keras import optimizers
# For training from scratch change kernel_initializer to e.g.'VarianceScaling'
inputs = Input(shape=(224, 224, 4), name='input_17')
# block 1
x = Conv2D(64, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block1_conv1')(inputs)
x = Conv2D(64, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block1_conv2')(x)
x = MaxPooling2D(pool_size=(2, 2), name='block1_pool')(x)
# block 2
x = Conv2D(128, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block2_conv1')(x)
x = Conv2D(128, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block2_conv2')(x)
x = MaxPooling2D(pool_size=(2, 2), strides=(2,2), name='block2_pool')(x)
# block 3
x = Conv2D(256, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block3_conv1')(x)
x = Conv2D(256, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block3_conv2')(x)
x = Conv2D(256, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block3_conv3')(x)
x = Conv2D(256, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block3_conv4')(x)
x = MaxPooling2D(pool_size=(2, 2), strides=(2,2), name='block3_pool')(x)
# block 4
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block4_conv1')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block4_conv2')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block4_conv3')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block4_conv4')(x)
x = MaxPooling2D(pool_size=(2, 2), strides=(2,2), name='block4_pool')(x)
# block 5
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block5_conv1')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block5_conv2')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block5_conv3')(x)
x = Conv2D(512, (3,3), padding='same', activation='relu', kernel_initializer='zeros', name='block5_conv4')(x)
x = MaxPooling2D(pool_size=(2, 2), strides=(2,2), name='block5_pool')(x)
vgg_template = Model(inputs=inputs, outputs=x)
vgg_template.compile(optimizer=optimizers.RMSprop(lr=2e-4),
loss='categorical_crossentropy',
metrics=['acc'])
#%% Rewrite the weight loading/modification function
import numpy as np
layers_to_modify = ['block1_conv1'] # Turns out the only layer that changes
# shape due to 4th channel is the first
# convolution layer.
for layer in pretrained.layers: # pretrained Model and template have the same
# layers, so it doesn't matter which to
# iterate over.
if layer.get_weights() != []: # Skip input, pooling and no weights layers
target_layer = vgg_template.get_layer(name=layer.name)
if layer.name in layers_to_modify:
kernels = layer.get_weights()[0]
biases = layer.get_weights()[1]
kernels_extra_channel = np.concatenate((kernels,
kernels[:,:,-1:,:]),
axis=-2) # For channels_last
target_layer.set_weights([kernels_extra_channel, biases])
else:
target_layer.set_weights(layer.get_weights())
#%% Save 4 channel model populated with weights for futher use
vgg_template.save('vgg19_modified_clear.hdf5')
Beyond the RGBY case, the following snippet works generally by copying or removing the layer's weights and/or biases vectors dimensions as needed. Please refer to numpy documentation on what numpy.resize does: in the case of the original question it copies the B-channel weights onto the Y-channel (or more generally onto any higher dimensionality).
import numpy as np
import tensorflow as tf
...
model = ... # your RGBY model is here
pretrained_model = tf.keras.models.load_model(...) # pretrained RGB model
# the following assumes that the layers match with the two models and
# only the shapes of weights and/or biases are different
for pretrained_layer, layer in zip(pretrained_model.layers, model.layers):
pretrained = pretrained_layer.get_weights()
target = layer.get_weights()
if len(pretrained) == 0: # skip input, pooling and other no weights layers
continue
try:
# set the pretrained weights as is whenever possible
layer.set_weights(pretrained)
except:
# numpy.resize to the rescue whenever there is a shape mismatch
for idx, (l1, l2) in enumerate(zip(pretrained, target)):
target[idx] = np.resize(l1, l2.shape)
layer.set_weights(target)

Conversion of a conv layer to fc layer and vice vers sa?

I just wanna know if it's possible to convert a conv layer to a fully connected one and then return back to the conv layer ?
It is just a matter of ensuring that the input is the correct shape. I assume you are using keras.
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Reshape
# Add a convolution to the network (previous layer called some_input)
c1 = Conv2D(32, (3, 3), activation='relu', name='first_conv')(some_input)
# Now reshape using 'Flatten'
f1 = Flatten(name='flat_c1')(c1)
# Now add a dense layer with 10 nodes
dense1 = Dense(10, activation='relu', name='dense1')(f1)
# Now add a dense layer, making sure it has the right number of nodes for my next conreshape8v layer.
dense2 = Dense(784, activation='relu', name='dense2')(dense1)
reshape2 = Reshape((7, 7, 16), name='reshape2')(dense2)
#Now back to convolutions (up or down)
c2 = Conv2D(16, kernel_size=(3, 3), activation='relu',
name='conv2')(reshape2)

Merge layer keras with tensorflow backend

I wanted to merge two sequential models into one using a Merge layer but it is showing me an error. I am working with images, with size 128x128 (RGB image) and batch size is 32.
The error is:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (32, 3, 128, 128)
The model is defined as:
model = Sequential() leftBranch = Sequential()
leftBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
leftBranch.add(Convolution2D(14, 3, 1, activation='relu'))
leftBranch.add(ZeroPadding2D((1, 1)))
leftBranch.add(Flatten())
rightBranch = Sequential()
rightBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
rightBranch.add(Convolution2D(14, 1, 3, activation='relu'))
rightBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
rightBranch.add(Flatten())
centralBranch = Sequential()
centralBranch.add(Reshape((3,128,128), input_shape=(3, img_width, img_height)))
centralBranch.add(Convolution2D(14, 5, 5, activation='relu'))
centralBranch.add(MaxPooling2D((2, 2), strides=(2, 2)))
centralBranch.add(Flatten())
merged = Merge([leftBranch, centralBranch, rightBranch], mode='concat')
model = Sequential()
model.add(merged) model.add(Dense(64))
model.add(Activation('relu')) model.add(Dropout(0.5)) model.add(Dense(1))
model.add(Activation('sigmoid'))
Error coming is :
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (32, 3, 128, 128)
So, whats is the proper way to concatenate two sequential model with convolutional layers. I just want to Merge convolutional layers output like I did it here.