How to show all layers in a TensorFlow model with a nested model?

How to show all layers in a tensorflow model with the model base?
base_model = keras.applications.MobileNetV3Small(
    input_shape=model_input_shape,
    include_top=False,
    weights="imagenet",
)
# =================== build model
model = keras.Sequential(
    [
        keras.Input(shape=image_shape),
        preprocessing.Resizing(*model_input_shape[:2]),
        preprocessing.Rescaling(1.0 / 255),
        base_model,
        layers.GlobalAveragePooling2D(),
        # missing dropout
        layers.Dense(1, activation="sigmoid"),
    ]
)
model.summary()
The output is this:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
resizing (Resizing) (None, 224, 224, 3) 0
_________________________________________________________________
rescaling_1 (Rescaling) (None, 224, 224, 3) 0
_________________________________________________________________
MobilenetV3small (Functional (None, 7, 7, 1024) 1529968 <---------- why can't I see all layers here?
_________________________________________________________________
global_average_pooling2d (Gl (None, 1024) 0
_________________________________________________________________
dense (Dense) (None, 1) 1025
How do I show all layers?
for layer in model.layers:
    print(layer)
The above has the same problem. What am I doing wrong?

In this setup, base_model acts as a single layer, i.e. it becomes a nested model. To inspect it, you can try either of the following:
model.layers[2].summary()
for i, layer in enumerate(model.layers):
    if i == 2:
        for nested_layer in layer.layers:
            print(nested_layer)
or, more intuitively, you can use this solution.
def summary_plus(layer, i=0):
    if hasattr(layer, 'layers'):
        if i != 0:
            layer.summary()
        for l in layer.layers:
            i += 1
            summary_plus(l, i=i)
summary_plus(model)
Alternatively, you can use the plot_model function:
keras.utils.plot_model(
    model,
    expand_nested=True  # < make it true
)
Update 1: I raised an issue about this: Keras #15239. Hopefully, it will be solved soon.
Update 2: model.summary now has an expand_nested parameter. #15251
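The recursion in summary_plus can also be written as a generator that yields every layer together with its nesting depth, which helps when you want to filter or count layers rather than print summaries. This is only a sketch: iter_layers is a hypothetical helper (not a Keras function), and it assumes nothing beyond nested models exposing a .layers list. The Stub class and its layer names are made up so that the snippet runs without TensorFlow; with a real model you would call iter_layers(model) directly.

```python
def iter_layers(layer, depth=0):
    """Yield (depth, layer) for every layer, descending into nested models."""
    for child in getattr(layer, "layers", []):
        yield depth, child
        # A nested model (for example a base model) has its own .layers list.
        yield from iter_layers(child, depth + 1)


class Stub:
    """Hypothetical stand-in for a Keras layer or model (illustration only)."""

    def __init__(self, name, layers=None):
        self.name = name
        if layers is not None:
            self.layers = layers


# Made-up structure that mimics a Sequential model with a nested base model.
base = Stub("base_model", [Stub("inner_a"), Stub("inner_b")])
model = Stub("sequential", [Stub("resizing"), Stub("rescaling"), base, Stub("dense")])

flat = [(depth, layer.name) for depth, layer in iter_layers(model)]
for depth, name in flat:
    print("  " * depth + name)
```

Nested layers are printed indented under the model that contains them.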

Related

Add Augmentation Layers Before keras.applications.EfficientNetB0 and Retain Layer Names

I have a trained EfficientNetB0-based model with saved weights in an H5 format.
I want to add some preprocessing layers before the model, load the weights, and retrain it.
If I create a model like this:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(include_top=False, pooling="max")
features = feature_extractor(noise)
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(features)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
I get this summary:
Layer (type) Output Shape Param #
=================================================================
input_27 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
gaussian_noise_13 (GaussianN (None, 224, 224, 3) 0
_________________________________________________________________
efficientnetb0 (Functional) (None, 1280) 4049571
_________________________________________________________________
dense (Dense) (None, 100) 128100
_________________________________________________________________
dense_1 (Dense) (None, 10) 1010
and I lose access to intermediate layers. I can't use the tf.keras.Sequential approach because my model has two outputs.
I want to retain the layer names inside EfficientNetB0 so that I can reload my weights. How do I do that?
So it looks like for the toy example I created above the answer is:
inp = tf.keras.layers.Input(shape=[224,224,3])
noise = tf.keras.layers.GaussianNoise(stddev=10.)(inp)
feature_extractor = tf.keras.applications.EfficientNetB0(input_tensor=noise, include_top=False, pooling="max")
output1 = tf.keras.layers.Dense(100, activation="sigmoid")(feature_extractor.output)
output2 = tf.keras.layers.Dense(10, activation="softmax")(output1)
model = tf.keras.models.Model(inp, [output1, output2])
However, I'm actually working with a custom model class that doesn't have that argument in the constructor...
Without the input_tensor argument is there another way to do this?

Access to intermediate layers in Keras Functional Model

I am using a transfer learning model in a way very similar to that explained in Chollet's Keras transfer learning guide. To avoid problems with the batch normalization layer, as stated in the guide and many other places, I have to insert the original pretrained base model as a functional model with the training=False option like this:
inputs = layers.Input(shape=(224, 224, 3))
x = img_augmentation(inputs)
baseModel = VGG19(weights="imagenet", include_top=False, input_tensor=x)
x = baseModel(x, training=False)
# construct the head of the model that will be placed on top of
# the base model
x = Conv2D(32, 2)(x)
headModel = AveragePooling2D(pool_size=(4, 4))(x)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(3, activation="softmax")(headModel)
model = Model(inputs, outputs=headModel)
My problem is that I need to use Grad-CAM as in Chollet's Grad-CAM example page. To do this I need access to the base model's last convolutional layer, but when I summarize my model I get:
Model: "model_163"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
img_augmentation (Sequential (None, 224, 224, 3) 0
_________________________________________________________________
vgg19 (Functional) (None, 7, 7, 512) 20024384
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 65568
_________________________________________________________________
average_pooling2d_2 (Average (None, 1, 1, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 32) 0
_________________________________________________________________
dense_4 (Dense) (None, 64) 2112
_________________________________________________________________
dropout_2 (Dropout) (None, 64) 0
_________________________________________________________________
dense_5 (Dense) (None, 3) 195
=================================================================
Total params: 20,092,259
Trainable params: 67,875
Non-trainable params: 20,024,384
__________________________________________
Thus, the outputs I need are inside one of the vgg19 functional model layers. How can I access this layer without having to remove the training=False option?
I generally don't like nesting models in models. Although it encourages modularity and introduces nice structure to complex models, TensorFlow gives trouble when you want to do unconventional things (like computing Grad-CAM or accessing gradients). I've found it easier to un-nest the model so that you can easily access the layer you need.
I recently wrote a tutorial on implementing Grad-CAM in TensorFlow 2 for InceptionNet. It should give you enough context to access the required layer.
As you can see, the VGG model in your case has type Functional. When you iterate through your compound model's layers, you can check the type of each layer, find the nested Functional model, and work with its layers, like this:
for layer in model.layers:
    if "Functional" == layer.__class__.__name__:
        # here you can iterate and choose the layers of your nested model
        for _layer in layer.layers:
            # your logic with nested model layers
            pass
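As a concrete version of the loop above, the following sketch picks the nested model by name and then walks its layers backwards to find the last convolutional one, which is the layer Grad-CAM typically needs. Both helper functions are hypothetical, and the stub classes only imitate the .name and .layers attributes of Keras objects so that the example runs without TensorFlow. Note that this only locates the layer; it does not build the Grad-CAM model itself.

```python
def nested_model(model, name):
    """Return the nested model called `name` from model.layers."""
    for layer in model.layers:
        if layer.name == name and hasattr(layer, "layers"):
            return layer
    raise ValueError(f"no nested model named {name!r}")


def last_conv_layer(model):
    """Return the last layer whose class name starts with 'Conv'."""
    for layer in reversed(model.layers):
        if layer.__class__.__name__.startswith("Conv"):
            return layer
    raise ValueError("no convolutional layer found")


class _Named:
    """Hypothetical stand-in for Keras objects (illustration only)."""

    def __init__(self, name, layers=None):
        self.name = name
        if layers is not None:
            self.layers = layers


class Conv2D(_Named): pass
class MaxPooling2D(_Named): pass
class Functional(_Named): pass


# Stub structure loosely following the summary in the question.
vgg = Functional("vgg19", [Conv2D("block5_conv3"), Conv2D("block5_conv4"),
                           MaxPooling2D("block5_pool")])
model = Functional("model_163", [_Named("input_3"), vgg, Conv2D("conv2d_2")])

target = last_conv_layer(nested_model(model, "vgg19"))
print(target.name)
```

Searching inside the nested model rather than the outer one matters here, because the outer model has its own Conv2D layer in the head.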

Is there a way to freeze specific layers in a KerasLayer?

I'm currently building a CNN that uses transfer learning to classify images.
In my model, there is a tensorflow-hub KerasLayer that uses EfficientNet in order to create a feature vector.
My code is here:
model = models.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/efficientnet/b7/feature-vector/1", trainable=True),  # Trainable
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(NEURONS_PER_LAYER, kernel_regularizer=tf.keras.regularizers.l2(REG_LAMBDA), activation=ACTIVATION),
    layers.Dropout(DROPOUT),
    layers.Dense(1, activation="sigmoid")])
I can freeze or unfreeze the entire KerasLayer, but I can't seem to find a way to only freeze the earlier layers and fine-tune the higher-level parts. Can anyone help?
You can freeze an entire layer by setting layer.trainable = False. If you load an entire model or create a model from scratch, you can use a loop like this to find a specific layer to freeze.
# load a model or create a model
model = Model(...)
# first you print out your model summary
model.summary()
# you will get something like this
'''
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
inception_resnet_v2 (Model) (None, 2, 2, 1536) 54336736
_________________________________________________________________
flatten_2 (Flatten) (None, 6144) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 6144) 0
_________________________________________________________________
dense_8 (Dense) (None, 2048) 12584960
_________________________________________________________________
dense_9 (Dense) (None, 1024) 2098176
_________________________________________________________________
dense_10 (Dense) (None, 512) 524800
_________________________________________________________________
dense_11 (Dense) (None, 17) 8721
=================================================================
'''
# here is a loop for freezing a particular layer (dense_10 in this example)
for layer in model.layers:
    # selecting layer by name
    if layer.name == 'dense_10':
        layer.trainable = False
# for the hub layer, create the backbone outside your model so that it is easy to access
# my inception layer
inception_layer = keras.applications.InceptionResNetV2(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
# create model
model.add(inception_layer)
# same trick
inception_layer.summary()
# here is the same loop as in the example above
for layer in inception_layer.layers:
    # selecting layer by name
    if layer.name == 'block8_10_conv':
        layer.trainable = False
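To freeze only the earlier layers and fine-tune the later ones, the same loop can be turned into a cutoff: everything up to a chosen layer is frozen and everything after it stays trainable. The sketch below assumes the backbone is a Keras model with a .layers list (as with keras.applications above). freeze_until is a hypothetical helper, and the stub class and layer names exist only so the snippet runs without TensorFlow.

```python
def freeze_until(model, last_frozen_name):
    """Freeze layers up to and including `last_frozen_name`; unfreeze the rest.

    Returns the number of frozen layers.
    """
    names = [layer.name for layer in model.layers]
    if last_frozen_name not in names:
        raise ValueError(f"no layer named {last_frozen_name!r}")
    cutoff = names.index(last_frozen_name)
    for index, layer in enumerate(model.layers):
        layer.trainable = index > cutoff
    return cutoff + 1


class StubLayer:
    """Hypothetical stand-in for a Keras layer (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.trainable = True


class StubModel:
    def __init__(self, layers):
        self.layers = layers


# Made-up layer names; use the names printed by your backbone's summary().
backbone = StubModel([StubLayer(n) for n in ["stem", "block1", "block2", "top_conv"]])
frozen = freeze_until(backbone, "block1")
flags = {layer.name: layer.trainable for layer in backbone.layers}
print(frozen, flags)
```

With a real model you would typically call this before compiling, since changes to trainable are usually picked up at compile time.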

How to read Keras's model structure?

For example:
BUFFER_SIZE = 10000
BATCH_SIZE = 64
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(train_dataset))
test_dataset = test_dataset.padded_batch(BATCH_SIZE, tf.compat.v1.data.get_output_shapes(test_dataset))
def pad_to_size(vec, size):
    zeros = [0] * (size - len(vec))
    vec.extend(zeros)
    return vec
...
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=False)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
print(model.summary())
The print reads as:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 64) 523840
_________________________________________________________________
bidirectional (Bidirectional (None, 128) 66048
_________________________________________________________________
dense (Dense) (None, 64) 8256
_________________________________________________________________
dense_1 (Dense) (None, 1) 65
=================================================================
Total params: 598,209
Trainable params: 598,209
Non-trainable params: 0
I have the following questions:
1) For the embedding layer, why is the output shape (None, None, 64)? I understand that 64 is the vector length. Why are the other two None?
2) Why is the output shape of the bidirectional layer (None, 128)? Why is it 128?
For the embedding layer, why is the output shape (None, None, 64)? I understand that 64 is the vector length. Why are the other two None?
If you don't define the input_shape of the first layer of the Sequential model, Keras falls back to an input of shape (None, None), including the batch dimension (in other words, it uses input_shape=(None,) by default).
If you pass in an input tensor of size (None, None) to an embedding layer, it produces a (None, None, 64) tensor assuming embedding dimension is 64. The first None is the batch dimension and the second is the time dimension (refers to the input_length parameter). So that's why you get a (None, None, 64) sized output.
Why is the output shape of the bidirectional layer (None, 128)? Why is it 128?
Here, you have a Bidirectional LSTM. Your LSTM layer produces a (None, 64) sized output (when return_sequences=False). When you have a Bidirectional layer it is like having two LSTM layers (one going forward, other going backwards). And you have a default merge_mode of concat meaning that the two output states from forward and backward layers will be concatenated. This gives you a (None, 128) sized output.
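The shapes and parameter counts in the summary can be checked by hand. The sketch below uses the standard LSTM parameter formula (four gates, each with input weights, recurrent weights and a bias) and reproduces the numbers printed above. The vocabulary size is not given in the question; it is inferred here from the embedding's parameter count, so treat it as an assumption.

```python
def lstm_params(units, input_dim):
    # 4 gates, each with input weights, recurrent weights and one bias per unit
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    # one weight per input per unit, plus one bias per unit
    return units * input_dim + units


embedding_dim = 64
lstm_units = 64

# Assumption: vocab size inferred from the summary (523,840 embedding params).
vocab_size = 523840 // embedding_dim
embedding = vocab_size * embedding_dim

# merge_mode="concat" joins the forward and backward outputs: 64 + 64.
bidirectional_output = 2 * lstm_units
bidirectional = 2 * lstm_params(lstm_units, embedding_dim)

dense = dense_params(64, bidirectional_output)
dense_1 = dense_params(1, 64)
total = embedding + bidirectional + dense + dense_1

print(bidirectional_output, bidirectional, dense, dense_1, total)
```

The two LSTMs inside the Bidirectional wrapper each contribute 33,024 parameters, which is why the layer reports 66,048.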

Tensorflow keras Sequential .add is different than inline definition?

Keras is giving different results when I define my model via the declarative method instead of the functional method. The two models appear to be equivalent, but using the ".add()" syntax works while using the declarative syntax gives errors. It's a different error each time, but usually something like:
A target array with shape (10, 1) was passed for an output of shape (None, 16) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
There seems to be something going on with auto-conversion of input shapes, but I can't tell what. Does anyone know what I'm doing wrong? Why aren't these two models exactly equivalent?
import tensorflow as tf
import tensorflow.keras
import numpy as np
x = np.arange(10).reshape((-1,1,1))
y = np.arange(10)
#This model works fine
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences = True))
model.add(tf.keras.layers.LSTM(16))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('linear'))
#This model fails. But shouldn't this be equivalent to the above?
model2 = tf.keras.Sequential(
    {
        tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences = True),
        tf.keras.layers.LSTM(16),
        tf.keras.layers.Dense(1),
        tf.keras.layers.Activation('linear')
    })
#This works
model.compile(loss='mean_squared_error', optimizer='adagrad')
model.fit(x, y, epochs=1, batch_size=1, verbose=2)
#But this doesn't! Why not? The error is different each time, but usually
#something about the input size being wrong
model2.compile(loss='mean_squared_error', optimizer='adagrad')
model2.fit(x, y, epochs=1, batch_size=1, verbose=2)
Why aren't those two models equivalent? Why does one handle the input size correctly but the other doesn't? The second model fails with a different error each time (once in a while it even works), so I thought maybe there's some interaction with the first model. But I've tried commenting out the first model and that doesn't help. So why doesn't the second one work?
UPDATE: Here is the model.summary() output for the first and second model. They do seem different but I don't understand why.
For model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 1, 32) 4352
_________________________________________________________________
lstm_1 (LSTM) (None, 16) 3136
_________________________________________________________________
dense (Dense) (None, 1) 17
_________________________________________________________________
activation (Activation) (None, 1) 0
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
For model2.summary():
model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 1, 32) 4352
_________________________________________________________________
activation_1 (Activation) (None, 1, 32) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 16) 3136
_________________________________________________________________
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
When you are creating the model with the inline declarations, you put the layers in curly braces {}, which makes it a set, which is inherently unordered. Change the curly braces to square brackets [] to put them in an ordered list. This will make sure that the layers are in the correct order in your model.
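The difference is easy to see in plain Python, without Keras. In the sketch below, Layer is a hypothetical stand-in class: a list keeps the layers in the order they were written, while a set iterates in an order derived from the objects' hashes. For ordinary objects the default hash is based on identity, which can change from run to run, and that would explain why the error in the question differs each time and occasionally does not appear at all.

```python
class Layer:
    """Hypothetical stand-in for a Keras layer (illustration only)."""

    def __init__(self, name):
        self.name = name


names = ["lstm", "lstm_1", "dense", "activation"]

as_list = [Layer(name) for name in names]
as_set = set(as_list)  # same objects, but no guaranteed order

list_order = [layer.name for layer in as_list]
set_order = [layer.name for layer in as_set]

print("list:", list_order)  # always the order written in the source
print("set: ", set_order)   # may differ from the list and between runs
```

Both containers hold the same four layers; only the list guarantees that they are stacked in the intended order.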