Is there a way to freeze specific layers in a KerasLayer? - tensorflow

I'm currently building a CNN that uses transfer learning to classify images.
In my model, there is a tensorflow-hub KerasLayer that uses EfficientNet in order to create a feature vector.
My code is here:
model = models.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/efficientnet/b7/feature-vector/1", trainable=True),  # Trainable
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(1, activation="sigmoid")])
I can freeze or unfreeze the entire KerasLayer, but I can't seem to find a way to only freeze the earlier layers and fine-tune the higher-level parts. Can anyone help?

You can freeze an entire layer by setting layer.trainable = False. If you load an entire model or build one from scratch, you can loop over its layers to find the specific layer to freeze.
# load a model or create a model
model = Model(...)
# first you print out your model summary
model.summary()
# you will get something like this
'''
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
inception_resnet_v2 (Model) (None, 2, 2, 1536) 54336736
_________________________________________________________________
flatten_2 (Flatten) (None, 6144) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 6144) 0
_________________________________________________________________
dense_8 (Dense) (None, 2048) 12584960
_________________________________________________________________
dense_9 (Dense) (None, 1024) 2098176
_________________________________________________________________
dense_10 (Dense) (None, 512) 524800
_________________________________________________________________
dense_11 (Dense) (None, 17) 8721
=================================================================
'''
# here is a loop for freezing a particular layer (dense_10 in this example)
for layer in model.layers:
    # selecting layer by name
    if layer.name == 'dense_10':
        layer.trainable = False
# for the hub layer, create the base layer outside your model so it is easy to access
# my inception layer
inception_layer = keras.applications.InceptionResNetV2(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
# create model
model.add(inception_layer)
# same trick
inception_layer.summary()
# here is the same loop as in the example above
for layer in inception_layer.layers:
    # selecting layer by name
    if layer.name == 'block8_10_conv':
        layer.trainable = False
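As far as I know, hub.KerasLayer wraps a SavedModel and does not expose its inner layers through a .layers attribute, which is presumably why the example above switches to a keras.applications model. Below is a minimal sketch of freezing only the early part of such a base model. The tiny `base` model, its layer names, and the `FREEZE_UNTIL` cutoff are hypothetical stand-ins chosen so that nothing has to be downloaded.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for a pretrained base such as InceptionResNetV2.
base = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu", name="early_conv"),
        layers.Conv2D(16, 3, activation="relu", name="middle_conv"),
        layers.Conv2D(32, 3, activation="relu", name="late_conv"),
        layers.GlobalAveragePooling2D(name="pool"),
    ],
    name="base",
)

model = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        base,
        layers.Dense(1, activation="sigmoid", name="head"),
    ]
)

# Freeze every base layer that comes before the cutoff; the cutoff layer
# and everything after it stay trainable.
FREEZE_UNTIL = "late_conv"  # hypothetical cutoff layer name
trainable = False
for layer in base.layers:
    if layer.name == FREEZE_UNTIL:
        trainable = True
    layer.trainable = trainable

frozen = [layer.name for layer in base.layers if not layer.trainable]
print("frozen layers:", frozen)
print("trainable weight tensors:", len(model.trainable_weights))
```

Changes to trainable should be made before calling compile() (or be followed by another compile()) so that training picks them up.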

Related

How does model.weights in tensorflow/keras work?

I have a trained model.
Its summary is as follows:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 256) 2560
dense_1 (Dense) (None, 128) 32896
dropout (Dropout) (None, 128) 0
dense_2 (Dense) (None, 1) 129
=================================================================
Total params: 35,585
Trainable params: 35,585
Non-trainable params: 0
_________________________________________________________________
I extract the weights:
for i, weight in enumerate(Model.weights):
    exec('w{}=np.array(weight)'.format(i))
I have test data to predict on:
x=test_data.iloc[0]
Then I predict with the model:
Model.predict(np.array(x).reshape(1,9))
and get array([[226241.66]], dtype=float32).
Then I predict with the weights:
((x@w0+w1)@w2+w3)@w4+w5
and get array([98039.99664026]).
Can someone explain how the weights in the model work?
And how can I reproduce the Model.predict result from the weights?
Try model.layers, which returns a list of all layers in your model; each layer has a get_weights() method that returns its weights as NumPy arrays. I was able to reproduce the output of a simple 3-layer feed-forward model with this approach.
# keep only the layers that have weights (input and dropout layers have none)
params = {i: layer.get_weights() for i, layer in enumerate(model.layers) if layer.get_weights()}
(w1, b1), (w2, b2), (w4, b4) = params[1], params[2], params[4]  # kernel and bias of each Dense layer
X = np.random.randn(1, 9)
np.allclose(((X@w1 + b1)@w2 + b2)@w4 + b4, model.predict(X))  # True
Note: In my example layer 0 was an input layer (no weights) and layer 3 a dropout layer (no weights). When calling model.predict(), dropout is not applied, so you can ignore it in this case. The plain matrix products only match when the Dense layers have no activation, as in my example; otherwise apply each layer's activation after adding the bias.
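A likely reason for the mismatch in the question is that each Dense layer computes activation(x @ kernel + bias), not just the matrix product. The sketch below illustrates this with NumPy only. The weights are random, hypothetical values with the shapes implied by the summary above (9 -> 256 -> 128 -> 1), and ReLU on the hidden layers is an assumption, since the question does not show the activations.

```python
import numpy as np


def relu(z):
    """Element-wise ReLU."""
    return np.maximum(z, 0.0)


def dense(x, kernel, bias, activation=None):
    """One Dense layer: activation(x @ kernel + bias)."""
    z = x @ kernel + bias
    return activation(z) if activation is not None else z


rng = np.random.default_rng(0)
# Hypothetical weights with the shapes from the summary: 9 -> 256 -> 128 -> 1.
w0, b0 = rng.normal(size=(9, 256)), rng.normal(size=256)
w1, b1 = rng.normal(size=(256, 128)), rng.normal(size=128)
w2, b2 = rng.normal(size=(128, 1)), rng.normal(size=1)

x = rng.normal(size=(1, 9))

# Forward pass that applies the (assumed) ReLU activations of the hidden layers.
with_activation = dense(dense(dense(x, w0, b0, relu), w1, b1, relu), w2, b2)
# Forward pass that only multiplies and adds, as in the question.
linear_only = ((x @ w0 + b0) @ w1 + b1) @ w2 + b2

print(with_activation.shape, linear_only.shape)
print("same result:", np.allclose(with_activation, linear_only))
```

With a real model, the activation of each layer can be read from layer.activation, so the manual forward pass can mirror it exactly.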

Tensorflow: access to see the layer activation (fine-tuning)

I use fine-tuning. How can I see and access the activations of all layers that are inside of the convolutional base?
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(inp_img_h, inp_img_w, 3))
def create_functional_model():
    inp = Input(shape=(inp_img_h, inp_img_w, 3))
    model = conv_base(inp)
    model = Flatten()(model)
    model = Dense(256, activation='relu')(model)
    outp = Dense(1, activation='sigmoid')(model)
    return Model(inputs=inp, outputs=outp)
model = create_functional_model()
model.summary()
The model summary is
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 7, 7, 512) 14714688
_________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_4 (Dense) (None, 256) 6422784
_________________________________________________________________
dense_5 (Dense) (None, 1) 257
=================================================================
Total params: 21,137,729
Trainable params: 21,137,729
Non-trainable params: 0
_________________________________________________________________
Thus, the layers inside the conv_base are not accessible.
As @Frightera said in the comments, you can access the base model summary by:
model.layers[0].summary()
And if you want to access activation functions of its layers you can try this:
print(model.layers[0].layers[index_of_layer].activation)
#or
print(model.layers[0].get_layer("name_of_layer").activation)
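If "activations" means the intermediate outputs rather than the activation functions, one common approach is to build a second model that maps the nested model's input to the outputs of its inner layers. The sketch below assumes a small, hypothetical `conv_base` built without pretrained weights. It looks the nested model up by name, because the index of the base model in model.layers depends on whether an InputLayer is listed first.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for conv_base (e.g. VGG16), without pretrained weights.
base_in = keras.Input(shape=(32, 32, 3))
h = layers.Conv2D(4, 3, activation="relu", name="block1_conv")(base_in)
h = layers.MaxPooling2D(name="block1_pool")(h)
h = layers.Conv2D(8, 3, activation="relu", name="block2_conv")(h)
conv_base = keras.Model(base_in, h, name="conv_base")

inp = keras.Input(shape=(32, 32, 3))
x = conv_base(inp)
x = layers.Flatten()(x)
outp = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inp, outp)

# Look the nested model up by name, then expose every inner layer's output.
nested = model.get_layer("conv_base")
inner_layers = [l for l in nested.layers if not isinstance(l, layers.InputLayer)]
activation_model = keras.Model(nested.input, [l.output for l in inner_layers])

images = np.random.rand(2, 32, 32, 3).astype("float32")
activations = activation_model(images)
for layer, act in zip(inner_layers, activations):
    print(layer.name, tuple(act.shape))
```

Note that activation_model takes the same input as the nested base, so any preprocessing that the outer model applies before the base has to be applied to the images first.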

How to show all layers in a Tensorflow model with nested model?

How to show all layers in a tensorflow model with the model base?
base_model = keras.applications.MobileNetV3Small(
    input_shape=model_input_shape,
    include_top=False,
    weights="imagenet",
)
# =================== build model
model = keras.Sequential(
    [
        keras.Input(shape=image_shape),
        preprocessing.Resizing(*model_input_shape[:2]),
        preprocessing.Rescaling(1.0 / 255),
        base_model,
        layers.GlobalAveragePooling2D(),
        # missing dropout
        layers.Dense(1, activation="sigmoid"),
    ]
)
model.summary()
The output is this:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resizing (Resizing) (None, 224, 224, 3) 0
_________________________________________________________________
rescaling_1 (Rescaling) (None, 224, 224, 3) 0
_________________________________________________________________
MobilenetV3small (Functional (None, 7, 7, 1024) 1529968 <---------- why can't I see all layers here?
_________________________________________________________________
global_average_pooling2d (Gl (None, 1024) 0
_________________________________________________________________
dense (Dense) (None, 1) 1025
How do I show all layers?
for layer in model.layers:
    print(layer)
The above has the same problem. What am I doing wrong?
In such a setup, the base_model acts as a single layer, i.e. it becomes nested. To inspect it, you can try either
model.layers[2].summary()
or
for i, layer in enumerate(model.layers):
    if i == 2:
        for nested_layer in layer.layers:
            print(nested_layer)
or, more intuitively, you can use this solution.
def summary_plus(layer, i=0):
    if hasattr(layer, 'layers'):
        if i != 0:
            layer.summary()
        for l in layer.layers:
            i += 1
            summary_plus(l, i=i)
summary_plus(model)
or you can use the plot_model function:
keras.utils.plot_model(
    model,
    expand_nested=True  # < make it true
)
Update 1: I raised an issue about this: Keras #15239. Hopefully, it will be solved soon.
Update 2: model.summary now has expand_nested parameter. #15251
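A minimal sketch of the expand_nested option mentioned in Update 2, assuming a Keras version recent enough to have the parameter. The tiny `base_model` and all layer names are hypothetical. The print_fn lambda accepts extra keyword arguments only as a precaution, because the exact way Keras calls print_fn has varied between versions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical small base model standing in for MobileNetV3Small.
base_model = keras.Sequential(
    [
        keras.Input(shape=(16, 16, 3)),
        layers.Conv2D(4, 3, name="inner_conv"),
        layers.GlobalAveragePooling2D(name="inner_pool"),
    ],
    name="tiny_base",
)
model = keras.Sequential(
    [
        keras.Input(shape=(16, 16, 3)),
        base_model,
        layers.Dense(1, activation="sigmoid", name="head"),
    ]
)

# Collect the summary text so it can be inspected as well as printed.
lines = []
model.summary(expand_nested=True, print_fn=lambda text, **kwargs: lines.append(text))
report = "\n".join(lines)
print(report)
```

With expand_nested=True the rows for the inner layers of tiny_base should appear in the same table as the outer layers.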

Access to intermediate layers in Keras Functional Model

I am using a transfer learning model in a way very similar to that explained in Chollet's Keras transfer learning guide. To avoid problems with the batch normalization layers, as stated in the guide and many other places, I have to insert the original pretrained base model as a functional model with the training=False option, like this:
inputs = layers.Input(shape=(224,224, 3))
x = img_augmentation(inputs)
baseModel = VGG19(weights="imagenet", include_top=False,input_tensor=x)
x=baseModel(x,training=False)
# construct the head of the model that will be placed on top of the
# the base model
x=Conv2D(32,2)(x)
headModel = AveragePooling2D(pool_size=(4, 4))(x)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(3, activation="softmax")(headModel)
model = Model(inputs, outputs=headModel)
My problem is that I need to use gradcam as in Chollet's gradcam example page. To do this I need access to the basemodel last convolutional layer but when I summarize my model I get:
Model: "model_163"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
img_augmentation (Sequential (None, 224, 224, 3) 0
_________________________________________________________________
vgg19 (Functional) (None, 7, 7, 512) 20024384
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 65568
_________________________________________________________________
average_pooling2d_2 (Average (None, 1, 1, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 32) 0
_________________________________________________________________
dense_4 (Dense) (None, 64) 2112
_________________________________________________________________
dropout_2 (Dropout) (None, 64) 0
_________________________________________________________________
dense_5 (Dense) (None, 3) 195
=================================================================
Total params: 20,092,259
Trainable params: 67,875
Non-trainable params: 20,024,384
__________________________________________
Thus, the outputs I need are inside one of the vgg19 functional model layers. How can I access this layer without having to remove the training=False option?
I generally don't like nesting models in models. Although it encourages modularity and introduces nice structure to complex models, TensorFlow gives trouble when you want to do unconventional things (like computing GradCAM or accessing gradients, etc.). I've found it easier to un-nest the model so that you can access the layer that you like easily.
I recently wrote a tutorial to implement GradCAM on TensorFlow 2 for InceptionNet. It should give you enough context to access the required layer.
So as you see, the VGG model in your case has type Functional. When you iterate through your compound model's layers, you can check the type of each layer like this, find the nested Functional model, and work with its layers:
for layer in model.layers:
    if "Functional" == layer.__class__.__name__:
        # here you can iterate over and choose the layers of your nested model
        for _layer in layer.layers:
            # your logic with the nested model's layers, e.g.:
            print(_layer.name)
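For Grad-CAM the layer of interest is usually the last convolutional layer of the nested model. The sketch below shows one way to locate it, using isinstance checks instead of comparing class names. The tiny `tiny_vgg` base and the `find_last_conv` helper are hypothetical and only illustrate the lookup.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for the nested VGG19 base, without pretrained weights.
base_in = keras.Input(shape=(32, 32, 3))
h = layers.Conv2D(4, 3, activation="relu", name="conv_a")(base_in)
h = layers.Conv2D(8, 3, activation="relu", name="conv_b")(h)
h = layers.MaxPooling2D(name="pool")(h)
base = keras.Model(base_in, h, name="tiny_vgg")

inputs = keras.Input(shape=(32, 32, 3))
x = base(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(3, activation="softmax")(x)
model = keras.Model(inputs, outputs)


def find_last_conv(nested_model):
    """Return the last Conv2D layer of a model, or None if it has none."""
    for layer in reversed(nested_model.layers):
        if isinstance(layer, layers.Conv2D):
            return layer
    return None


# The first layer of the outer model that is itself a Model is the nested base.
nested = next(l for l in model.layers if isinstance(l, keras.Model))
last_conv = find_last_conv(nested)
print(nested.name, "->", last_conv.name)
```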

Grad-CAM in keras, ValueError: Graph disconnected: cannot obtain value for tensor Tensor "input_11_6:0", shape=(None, 150, 150, 3)

How do I perform Grad-CAM on a pretrained custom model?
How do I select last_conv_layer_name and classifier_layer_names?
What is their significance, and how do I select the layers' names?
Should I consider the DenseNet121 sublayers, or treat densenet as one functional layer?
How do I perform Grad-CAM for this trained network?
These are the steps I tried,
#load model and custom metrics
dependencies = {'recall_m': recall_m, 'precision_m' : precision_m, 'f1_m' : f1_m }
model = keras.models.load_model("model_val_acc-73.33.h5", custom_objects = dependencies)
model.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
densenet121 (Functional) (None, 4, 4, 1024) 7037504
_________________________________________________________________
flatten (Flatten) (None, 16384) 0
_________________________________________________________________
dense_encoder (Dense) (None, 1024) 16778240
_________________________________________________________________
dropout_51 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 262400
_________________________________________________________________
dropout_52 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 128) 32896
_________________________________________________________________
dropout_53 (Dropout) (None, 128) 0
_________________________________________________________________
dense_4 (Dense) (None, 64) 8256
_________________________________________________________________
dropout_54 (Dropout) (None, 64) 0
_________________________________________________________________
dense_5 (Dense) (None, 32) 2080
_________________________________________________________________
dropout_55 (Dropout) (None, 32) 0
_________________________________________________________________
Final (Dense) (None, 2) 66
=================================================================
Total params: 24,121,442
Trainable params: 17,083,938
Non-trainable params: 7,037,504
This is the heat map function:
### defining heat map
def make_gradcam_heatmap(img_array, model, last_conv_layer_name, classifier_layer_names):
    # First, we create a model that maps the input image to the activations
    # of the last conv layer
    last_conv_layer = model.get_layer(last_conv_layer_name)
    last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)
    # Second, we create a model that maps the activations of the last conv
    # layer to the final class predictions
    classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
    x = classifier_input
    for layer_name in classifier_layer_names:
        x = model.get_layer(layer_name)(x)
    classifier_model = keras.Model(classifier_input, x)
    # Then, we compute the gradient of the top predicted class for our input image
    # with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        # Compute activations of the last conv layer and make the tape watch it
        last_conv_layer_output = last_conv_layer_model(img_array)
        tape.watch(last_conv_layer_output)
        # Compute class predictions
        preds = classifier_model(last_conv_layer_output)
        top_pred_index = tf.argmax(preds[0])
        top_class_channel = preds[:, top_pred_index]
    # This is the gradient of the top predicted class with regard to
    # the output feature map of the last conv layer
    grads = tape.gradient(top_class_channel, last_conv_layer_output)
    # This is a vector where each entry is the mean intensity of the gradient
    # over a specific feature map channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    # We multiply each channel in the feature map array
    # by "how important this channel is" with regard to the top predicted class
    last_conv_layer_output = last_conv_layer_output.numpy()[0]
    pooled_grads = pooled_grads.numpy()
    for i in range(pooled_grads.shape[-1]):
        last_conv_layer_output[:, :, i] *= pooled_grads[i]
    # The channel-wise mean of the resulting feature map
    # is our heatmap of class activation
    heatmap = np.mean(last_conv_layer_output, axis=-1)
    # For visualization purpose, we will also normalize the heatmap between 0 & 1
    heatmap = np.maximum(heatmap, 0) / np.max(heatmap)
    return heatmap
This is the image input:
img_array = X_test[10] # 10th image sample
X_test[10].shape
#(150, 150, 3)
last_conv_layer_name = "densenet121"
classifier_layer_names = [ "dense_2", "dense_3", "dense_4", "dense_5", "Final" ]
# Generate class activation heatmap
heatmap = make_gradcam_heatmap(
img_array, model, last_conv_layer_name, classifier_layer_names
)  # ===> I'm getting the error here, on this line
So what is wrong with last_conv_layer_name and classifier_layer_names?
Can anyone please explain this?
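One plausible reading of the error: model.get_layer("densenet121").output belongs to the nested model's own graph, not to the outer model's input, which would explain "Graph disconnected". In addition, classifier_layer_names skips flatten and dense_encoder, and img_array has no batch dimension. The sketch below sidesteps these points by calling the nested base and then every remaining layer directly inside the gradient tape. It is an illustration under assumptions, not a drop-in fix: the tiny model, the `gradcam_nested` helper, and the layer names are hypothetical, and it assumes nothing but the input precedes the nested base.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for the saved model: a nested base plus a dense head.
base = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(16, 3, activation="relu"),
    ],
    name="base",
)
model = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        base,
        layers.Flatten(),
        layers.Dense(16, activation="relu"),
        layers.Dense(2, activation="softmax", name="Final"),
    ]
)


def gradcam_nested(img_batch, model, base_name):
    """Grad-CAM on the output feature map of the nested model `base_name`."""
    names = [layer.name for layer in model.layers]
    split = names.index(base_name)
    base_model = model.layers[split]
    head_layers = model.layers[split + 1:]
    with tf.GradientTape() as tape:
        feature_map = base_model(img_batch)
        tape.watch(feature_map)
        x = feature_map
        for layer in head_layers:  # every remaining layer, in order
            x = layer(x)
        top_index = tf.argmax(x[0])
        top_score = tf.gather(x, top_index, axis=1)
    grads = tape.gradient(top_score, feature_map)
    # One importance weight per channel of the feature map.
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    heatmap = tf.reduce_mean(feature_map[0] * pooled_grads, axis=-1).numpy()
    heatmap = np.maximum(heatmap, 0)
    return heatmap / (heatmap.max() + 1e-8)  # guard against division by zero


img = np.random.rand(32, 32, 3).astype("float32")
batch = tf.convert_to_tensor(img[np.newaxis, ...])  # add the batch dimension
heatmap = gradcam_nested(batch, model, "base")
print(heatmap.shape)
```

For the model in the question, base_name would be "densenet121", and the heatmap would have the 4 x 4 spatial size of that layer's output, so it needs to be resized before being overlaid on the 150 x 150 image.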