Mobilenet: Transfer learning with Gradcam - tensorflow

I am a newbie to all this so please be kind to this question :)
What I am trying to do is train a MobileNet classifier using transfer learning and then apply Grad-CAM to understand what my model is looking at.
I created the model like this:
input_layer = tf.keras.layers.Input(shape=IMG_SHAPE)
x = preprocess_input(input_layer)
y = base_model(x)
y = tf.keras.layers.GlobalAveragePooling2D()(y)
y = tf.keras.layers.Dropout(0.2)(y)
outputs = tf.keras.layers.Dense(5)(y)
model = tf.keras.Model(inputs=input_layer, outputs=outputs)
Model summary:
Model: "functional_2"
Layer (type)                  Output Shape             Param #
input_3 (InputLayer)          [(None, 224, 224, 3)]    0
tf_op_layer_RealDiv_1 (Tenso  [(None, 224, 224, 3)]    0
tf_op_layer_Sub_1 (TensorFlo  [(None, 224, 224, 3)]    0
mobilenetv2_1.00_224 (Functi  (None, 7, 7, 1280)       2257984
global_average_pooling2d_1 (  (None, 1280)             0
dropout_1 (Dropout)           (None, 1280)             0
dense_1 (Dense)               (None, 5)                6405
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
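As a sanity check on the summary, the trainable count can be reproduced by hand. This is only a sketch: it assumes the base model is frozen so that only the Dense head trains (the code above does not show where base_model.trainable is set), and the variable names are made up for illustration.

```python
# Parameter counts taken from the summary above.
base_params = 2_257_984        # MobileNetV2 feature extractor (assumed frozen)
features, classes = 1280, 5    # pooled feature size and number of Dense units

# A Dense layer has one weight per (input, unit) pair plus one bias per unit.
dense_params = features * classes + classes
total_params = base_params + dense_params

print(dense_params)   # trainable parameters of the head
print(total_params)   # all parameters in the model
```

The two printed values match the "Trainable params" and "Total params" lines of the summary, which confirms that only the classification head is being trained.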
I passed this model to the Grad-CAM algorithm, but the algorithm cannot find the last convolutional layer, because the MobileNet layers are hidden inside the single 'mobilenetv2_1.00_224' layer.
Plausible solution:
If the model contained the unwrapped MobileNet layers instead of the encapsulated 'mobilenetv2_1.00_224' layer, the Grad-CAM algorithm would be able to find that last layer.
However, I am not able to create a model in which the data augmentation and preprocessing layers are added in front of the unwrapped MobileNet layers.
#skruff see if this helps
import tensorflow as tf

def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    # First, we create a model that maps the input image to the activations
    # of the last conv layer as well as the output predictions
    grad_model = tf.keras.models.Model(
        [model.inputs], [model.get_layer(last_conv_layer_name).output, model.output]
    )
    # Then, we compute the gradient of the top predicted class for our input image
    # with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    # This is the gradient of the output neuron (top predicted or chosen)
    # with regard to the output feature map of the last conv layer
    grads = tape.gradient(class_channel, last_conv_layer_output)
    # This is a vector where each entry is the mean intensity of the gradient
    # over a specific feature map channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    # We multiply each channel in the feature map array
    # by "how important this channel is" with regard to the top predicted class
    # then sum all the channels to obtain the heatmap class activation
    last_conv_layer_output = last_conv_layer_output[0]
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    # For visualization purpose, we will also normalize the heatmap between 0 & 1
    heatmap = tf.maximum(heatmap, 0) / tf.math.reduce_max(heatmap)
    return heatmap.numpy()
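The weighting step described in the comments above (mean gradient per channel, a weighted sum over channels, then ReLU and scaling) can be followed on a tiny example. The sketch below is plain Python without TensorFlow; the function name gradcam_heatmap and all numbers are made up for illustration, and it additionally skips the division when the whole map is zero.

```python
def gradcam_heatmap(feature_map, grads):
    """feature_map and grads are H x W x C nested lists for one image."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])

    # Channel importance: mean gradient over all spatial positions.
    pooled = [
        sum(grads[i][j][k] for i in range(height) for j in range(width))
        / (height * width)
        for k in range(channels)
    ]

    # Weighted sum over channels at each position, followed by ReLU.
    heatmap = [
        [max(sum(feature_map[i][j][k] * pooled[k] for k in range(channels)), 0.0)
         for j in range(width)]
        for i in range(height)
    ]

    # Scale to [0, 1]; skip the division if every value is zero.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Toy 2 x 2 feature map with 2 channels (made-up numbers).
feature_map = [[[1.0, 0.0], [2.0, 1.0]],
               [[0.0, 3.0], [4.0, 0.0]]]
grads = [[[1.0, -1.0], [1.0, -1.0]],
         [[1.0, -1.0], [1.0, -1.0]]]
toy_heatmap = gradcam_heatmap(feature_map, grads)
print(toy_heatmap)  # [[0.25, 0.25], [0.0, 1.0]]
```

Positions where the negatively weighted channel dominates are clipped to zero by the ReLU, which is why the bottom-left cell ends up empty.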


Add Augmentation Layers Before keras.applications.EfficientNetB0 and Retain Layer Names

I have a trained EfficientNetB0-based model with its weights saved in H5 format.
I want to add some preprocessing layers before the model, load the weights, and retrain it.
If I create a model like this:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(include_top=False, pooling="max")
features = feature_extractor(noise)
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(features)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
I get this summary:
Layer (type)                  Output Shape             Param #
input_27 (InputLayer)         [(None, 224, 224, 3)]    0
gaussian_noise_13 (GaussianN  (None, 224, 224, 3)      0
efficientnetb0 (Functional)   (None, 1280)             4049571
dense (Dense)                 (None, 100)              128100
dense_1 (Dense)               (None, 10)               1010
and I lose access to intermediate layers. I can't use the tf.keras.Sequential approach because my model has two outputs.
I want to retain the layer names inside EfficientNetB0 so that I can reload my weights. How do I do that?
So it looks like for the toy example I created above the answer is:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(input_tensor=noise, include_top=False, pooling="max")
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(feature_extractor.output)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
However, I'm actually working with a custom model class whose constructor doesn't have that argument...
Without the input_tensor argument, is there another way to do this?

GradientTape returns None

I am trying to use Grad-CAM (I'm following the PyImageSearch implementation) on a CNN that I built with transfer learning.
In particular, I am using a simple CNN for a regression problem. I used MobileNetV2 with an Average Pooling layer and a Dense layer with one unit on top, as shown below:
base_model = MobileNetV2(include_top=False, input_shape=(224, 224, 3), weights='imagenet')
base_model.trainable = False
inputs = keras.Input(shape=(224, 224, 3))
x = base_model(inputs)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1, activation="linear")(x)
model = keras.Model(inputs, outputs)
and the summary is:
Model: "model"
Layer (type)                  Output Shape             Param #
input_2 (InputLayer)          [(None, 224, 224, 3)]    0
mobilenetv2_1.00_224 (Model)  (None, 7, 7, 1280)       2257984
global_average_pooling2d (Gl  (None, 1280)             0
dense (Dense)                 (None, 1)                1281
Total params: 2,259,265
Trainable params: 1,281
Non-trainable params: 2,257,984
I initialize the CAM object with:
pred = 0.35
cam = GradCAM(model, pred, layerName='input_2')
where pred is the predicted output on which I want to inspect the CAM and I also specify the layer name in order to refer to the input layer. Then I compute the heatmap on a sample image "img":
heatmap = cam.compute_heatmap(img)
Now, let's focus on a part of the implementation of the function compute_heatmap from PyImageSearch:
# record operations for automatic differentiation
with tf.GradientTape() as tape:
    # cast the image tensor to a float-32 data type, pass the
    # image through the gradient model, and grab the loss
    # associated with the specific class index
    inputs = tf.cast(image, tf.float32)
    (convOutputs, predictions) = gradModel(inputs)
    # loss = predictions[:, self.classIdx]  # original from PyImageSearch
    loss = predictions[:]  # modified by me as I have only 1 output unit
# use automatic differentiation to compute the gradients
grads = tape.gradient(loss, convOutputs)
The problem here is that the gradient grads is None.
I thought the problem might lie in the network structure (everything works when I reproduce the classification example from the website), but I can't figure out what is wrong with this network used for regression!

Getting intermediate layer output from a nested network - Keras

I have a U-net network with VGG16 encoder architecture with pre-trained imagenet weights. Since my input images are grayscale, I added in a convolutional layer with depth 3 prior to sending the input to the U-net model.
Now, I'm trying to get the output of an intermediate layer within the U-net network. I create an intermediate model whose output is the output of the layer that I'm interested in. Here is my code:
base_model = sm.Unet('vgg16', encoder_weights='imagenet', classes=1, activation='sigmoid')
inp = Input(shape=(448, 224, 1))
l1 = Conv2D(3, (1,1))(inp)
out = base_model(l1)
model = Model(inp, out)
intermediate_layer_model = Model(inputs=model.layers[0].input,
Here is the output:
Layer (type)                  Output Shape             Param #
input_2 (InputLayer)          (None, 448, 224, 1)      0
conv2d_1 (Conv2D)             (None, 448, 224, 3)      6
model_1 (Model)               multiple                 23752273
Total params: 23,752,279
Trainable params: 23,748,247
Non-trainable params: 4,032
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(?, ?, ?, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
It seems to me that the issue is that the U-net model has its own input layer (input_1) and I'm not supplying this information when constructing intermediate_layer_model. However, I expect the intermediate model to take only the grayscale images as input and not to require an additional 3-channel input.

Time distributed layer keras

I am trying to understand the TimeDistributed layer in Keras/TensorFlow.
As far as I understand, it is a kind of wrapper that makes it possible to process, for example, a sequence of images.
Now I am wondering how one would design a time-distributed network without using the TimeDistributed layer.
For example, suppose I have a sequence of 3 images, each with 1 channel and a size of 256x256 px, that should first be processed by a CNN and then by LSTM cells.
My input to the TimeDistributed layer would then be (N,3,256,256,1), where N is the batch size.
The CNN would then have 3 outputs, which are fed to the LSTM cell.
Now, without using the TimeDistributed layer, would it be possible to accomplish the same thing by setting up a network with 3 different inputs and 3 similar CNNs? The outputs of the 3 CNNs could then be flattened and concatenated.
Is that any different from the time-distributed approach?
I created a prototype for you. I used a minimal number of layers and arbitrary units/kernels/filters; change them as you like. It first creates a CNN model that takes inputs of size (256,256,1). It uses the same CNN model 3 times (once for each of the three images in the sequence) to extract features. It stacks the features with a Lambda layer to put them back into a sequence. The sequence then goes through an LSTM layer. I have chosen for the LSTM to return a single feature vector per example, but if you want the output to be a sequence as well, you can set return_sequences=True. You could also add further layers at the end to adapt it to your needs.
from tensorflow.keras.layers import Input, LSTM, Conv2D, Flatten, Lambda
from tensorflow.keras import Model
import tensorflow.keras.backend as K

def create_cnn_model():
    inp = Input(shape=(256,256,1))
    x = Conv2D(filters=16, kernel_size=5, strides=2)(inp)
    x = Flatten()(x)
    model = Model(inputs=inp, outputs=x, name='cnn_Model')
    return model

def combined_model():
    cnn_model = create_cnn_model()
    inp_1 = Input(shape=(256,256,1))
    inp_2 = Input(shape=(256,256,1))
    inp_3 = Input(shape=(256,256,1))
    out_1 = cnn_model(inp_1)
    out_2 = cnn_model(inp_2)
    out_3 = cnn_model(inp_3)
    lstm_inp = [out_1, out_2, out_3]
    lstm_inp = Lambda(lambda x: K.stack(x, axis=-2))(lstm_inp)
    x = LSTM(units=32, return_sequences=False)(lstm_inp)
    model = Model(inputs=[inp_1, inp_2, inp_3], outputs=x)
    return model
Now create the model as such:
model = combined_model()
Check the summary:
model.summary()
which will print:
Model: "model_14"
Layer (type)                  Output Shape             Param #     Connected to
input_53 (InputLayer)         [(None, 256, 256, 1)     0
input_54 (InputLayer)         [(None, 256, 256, 1)     0
input_55 (InputLayer)         [(None, 256, 256, 1)     0
cnn_Model (Model)             (None, 254016)           416         input_53[0][0]
lambda_3 (Lambda)             (None, 3, 254016)        0           cnn_Model[1][0]
lstm_13 (LSTM)                (None, 32)               32518272    lambda_3[0][0]
Total params: 32,518,688
Trainable params: 32,518,688
Non-trainable params: 0
The inner CNN model summary can be printed with:
model.get_layer('cnn_Model').summary()
which currently prints:
Model: "cnn_Model"
Layer (type)                  Output Shape             Param #
input_52 (InputLayer)         [(None, 256, 256, 1)]    0
conv2d_10 (Conv2D)            (None, 126, 126, 16)     416
flatten_6 (Flatten)           (None, 254016)           0
Total params: 416
Trainable params: 416
Non-trainable params: 0
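The shapes and parameter counts in the two summaries can be reproduced by hand, which also shows why the LSTM dominates the model size. This is a minimal sketch in plain Python: it relies on Conv2D's default 'valid' padding, and the helper name conv_output_size is made up for illustration.

```python
def conv_output_size(size, kernel, stride):
    # 'valid' padding: floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1


kernel, stride, in_channels, filters = 5, 2, 1, 16
side = conv_output_size(256, kernel, stride)

# One kernel x kernel x in_channels filter plus one bias per output channel.
conv_params = kernel * kernel * in_channels * filters + filters
flat_features = side * side * filters

units = 32
# LSTM: 4 gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * (units * (flat_features + units) + units)

print(side, conv_params, flat_features)
print(lstm_params, conv_params + lstm_params)
```

Almost all of the parameters come from feeding the 254016-wide flattened feature vector straight into the LSTM, so adding pooling or more strided convolutions before Flatten would shrink the model considerably.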
Your model expects a list as input. The list should have a length of 3 (since there are 3 images in a sequence). Each element of the list should be a numpy array of shape (batch_size, 256, 256, 1). I have worked a dummy example below with a batch size of 1:
import numpy as np
a = np.zeros((256,256,1)) # first image filled with zeros
b = np.zeros((256,256,1)) # second image filled with zeros
c = np.zeros((256,256,1)) # third image filled with zeros
a = np.expand_dims(a, 0) # adding batch dimension to make it (1, 256, 256, 1)
b = np.expand_dims(b, 0) # same here
c = np.expand_dims(c, 0) # same here
model.compile(loss='mse', optimizer='adam')
# train your model with
e = model.predict([a,b,c]) # a,b and c have shape of (1, 256, 256, 1) where the first 1 is the batch size
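On the question of whether three inputs differ from the time-distributed approach: TimeDistributed applies one layer with one set of weights to every time step, so the three-input design gives the same result as long as all three inputs go through the same CNN, as in the prototype above. Three separately created CNNs would each learn their own weights, and that would be a different model. The sketch below illustrates the equivalence in plain Python; shared_cnn is a hypothetical stand-in for the CNN and all numbers are made up.

```python
def shared_cnn(image, weights):
    # Hypothetical stand-in for the CNN: one weighted sum of the pixels per feature.
    return [sum(w * p for w, p in zip(row, image)) for row in weights]


def time_distributed(batch, weights):
    # batch[n][t] is image t of sequence n; the same weights are used at every step.
    return [[shared_cnn(image, weights) for image in sequence] for sequence in batch]


def three_inputs(inp_1, inp_2, inp_3, weights):
    # One input per time step, all routed through the same CNN, then stacked on axis 1.
    outs = [[shared_cnn(image, weights) for image in inp] for inp in (inp_1, inp_2, inp_3)]
    return [list(steps) for steps in zip(*outs)]


weights = [[1, 1], [1, -1]]               # 2 features from 2-pixel "images"
batch = [[[1, 2], [3, 4], [5, 6]],        # sequence 0: three images
         [[0, 1], [1, 0], [2, 2]]]        # sequence 1: three images

# Split the sequence axis into three separate inputs, as in the prototype.
inp_1, inp_2, inp_3 = [[sequence[t] for sequence in batch] for t in range(3)]

td_out = time_distributed(batch, weights)
multi_out = three_inputs(inp_1, inp_2, inp_3, weights)
print(td_out == multi_out)  # True
```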

how to save, restore, make predictions with siamese network (with triplet loss)

I am trying to develop a siamese network for simple face verification (and recognition in the second stage). I have a network in place that I managed to train, but I am a bit puzzled about how to save and restore the model and how to make predictions with the trained model. I am hoping that someone experienced in this domain can help me make progress.
Here is how I create my siamese network, to begin with...
model = ResNet50(weights='imagenet') # get the original ResNet50 model
model.layers.pop() # Remove the last layer
for layer in model.layers:
    layer.trainable = False # do not train any of original layers
x = model.get_layer('flatten_1').output
model_out = Dense(128, activation='relu', name='model_out')(x)
model_out = Lambda(lambda x: K.l2_normalize(x,axis=-1))(model_out)
new_model = Model(inputs=model.input, outputs=model_out)
# At this point, a new layer (with 128 units) added and normalization applied.
# Now create siamese network on top of this
anchor_in = Input(shape=(224, 224, 3))
positive_in = Input(shape=(224, 224, 3))
negative_in = Input(shape=(224, 224, 3))
anchor_out = new_model(anchor_in)
positive_out = new_model(positive_in)
negative_out = new_model(negative_in)
merged_vector = concatenate([anchor_out, positive_out, negative_out], axis=-1)
# Define the trainable model
siamese_model = Model(inputs=[anchor_in, positive_in, negative_in],
And I train the siamese_model. When I train it, if I interpret results right, it is not really training the underlying model, it just trains the new siamese network (essentially, just the last layer is trained).
But this model has 3 input streams. After the training, I need to save this model in a way so that it just takes 1 or 2 inputs so that I can perform predictions by calculating the distance between 2 given images. How do I save this model and reuse it now?
In case you wonder, here is the summary of siamese model.
Layer (type)                  Output Shape             Param #     Connected to
input_2 (InputLayer)          (None, 224, 224, 3)      0
input_3 (InputLayer)          (None, 224, 224, 3)      0
input_4 (InputLayer)          (None, 224, 224, 3)      0
model_1 (Model)               (None, 128)              23849984    input_2[0][0]
concatenate_1 (Concatenate)   (None, 384)              0           model_1[1][0]
Total params: 23,849,984
Trainable params: 262,272
Non-trainable params: 23,587,712
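The 384-wide output is the anchor, positive and negative embeddings placed side by side, so a triplet loss has to slice it back into three 128-wide parts. For prediction, only the shared single-input embedding model (new_model in the code above) is needed: embed each image once and compare distances. The sketch below shows both steps in plain Python. The loss form, the margin, and the threshold are assumptions, since the question does not show them, and the function names are made up for illustration.

```python
def split_embeddings(merged, dim):
    # The siamese output concatenates anchor, positive and negative embeddings.
    return merged[:dim], merged[dim:2 * dim], merged[2 * dim:3 * dim]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(merged, dim, margin=0.2):
    # Assumed loss form: max(d(a, p) - d(a, n) + margin, 0); margin is hypothetical.
    anchor, positive, negative = split_embeddings(merged, dim)
    return max(squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin, 0.0)


def same_person(emb_a, emb_b, threshold=0.5):
    # Verification on two single-image embeddings; threshold is hypothetical.
    return squared_distance(emb_a, emb_b) < threshold


# Toy 2-dimensional embeddings instead of the real 128 dimensions.
merged = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
loss = triplet_loss(merged, dim=2)
print(loss)                                   # 0.0: negative is already far enough
print(same_person([1.0, 0.0], [1.0, 0.0]))    # True
print(same_person([1.0, 0.0], [0.0, 1.0]))    # False
```

In practice the threshold would be tuned on a validation set of known matching and non-matching pairs.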
You can use below code to save your model
And then to load your model you need to use