TensorFlow: How to get layers with trainable parameters

I'm using the pretrained Xception model from TensorFlow:
base_model = keras.applications.Xception(
    weights='imagenet',
    input_shape=(150, 150, 3),
    include_top=False
)
It reports 132 layers:
len(base_model.layers)
But only some of them have trainable parameters (the 132 include Activation layers, MaxPool layers, and others, e.g. concatenation layers). So here is my question: is there a way to access only the layers with trainable parameters (there should be about 71 of them)?

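Try this:
for layer in base_model.layers:
    # Note: layer.trainable is True by default for every layer,
    # including layers that have no weights at all
    if layer.trainable:
        print(layer.name)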

I found an answer myself. It's not perfect, but it's the closest to what I wanted.
for layer in base_model.layers:
    # count_params() is a method of every Keras layer, so no extra import is needed
    if layer.count_params() > 0:
        print(layer.name)
It shows 80 layers, with the batch_normalization layers among them. I don't know why; I thought they don't have trainable parameters.
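A possible refinement, sketched on a small stand-in model rather than Xception (the layer names and the helper layers_with_trainable_weights are made up for illustration): filter on layer.trainable_weights instead of count_params(). BatchNormalization still shows up because its scale (gamma) and offset (beta) are trainable; only its moving mean and moving variance are non-trainable.

```python
import tensorflow as tf


def layers_with_trainable_weights(model):
    """Return the layers that own at least one trainable weight."""
    return [layer for layer in model.layers if layer.trainable_weights]


# Small stand-in model; the names are illustrative only
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(4, 3, name="conv")(inputs)
x = tf.keras.layers.BatchNormalization(name="bn")(x)
x = tf.keras.layers.Activation("relu", name="act")(x)
x = tf.keras.layers.MaxPooling2D(name="pool")(x)
demo_model = tf.keras.Model(inputs, x)

names = [layer.name for layer in layers_with_trainable_weights(demo_model)]
print(names)

bn = demo_model.get_layer("bn")
# gamma and beta are trainable; moving mean and variance are not
print(len(bn.trainable_weights), len(bn.non_trainable_weights))
```

Calling layers_with_trainable_weights(base_model) on the Xception model should work the same way, though I have not checked the exact count it returns there.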

Related

How to access a particular layer of Huggingface's pre-trained BERT model?

For experimentation purposes, I need to access an Embedding layer of the encoder. That is, assuming Tensorflow implementation, the layer defined as tf.keras.layers.Embedding(...).
For example, what is a way to set 'embeddings_regularizer=' argument of the Embedding() layer in the encoder part of the transformer?
You can iterate over the BERT model in the same way as any other model, like so:
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.Embedding):
        layer.embeddings_regularizer = argument
isinstance checks the type of the layer, so really you can put any layer type here and change what you need.
I haven't checked specifically whether embeddings_regularizer is available. However, if you want to see which attributes and methods are available on that particular layer, run a debugger and call dir(layer) inside the loop above.
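As a rough illustration of the isinstance and dir(layer) idea, here is a sketch on a toy model (not BERT; toy_model and toy_embedding are made-up names). It passes the regularizer when the layer is created, because I am not sure that assigning the attribute on an already built layer actually adds the penalty to the loss.

```python
import tensorflow as tf

# Toy model standing in for the real one; names are illustrative only
toy_model = tf.keras.Sequential([
    tf.keras.Input(shape=(5,), dtype="int32"),
    tf.keras.layers.Embedding(
        input_dim=100,
        output_dim=8,
        embeddings_regularizer=tf.keras.regularizers.l2(1e-5),
        name="toy_embedding",
    ),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(2),
])

# Pick out the layers of a given type
embedding_layers = [
    layer for layer in toy_model.layers
    if isinstance(layer, tf.keras.layers.Embedding)
]

for layer in embedding_layers:
    # Show the current regularizer and every regularizer-related attribute
    print(layer.name, layer.embeddings_regularizer)
    print([name for name in dir(layer) if "regularizer" in name])
```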
Updated question
The TFBertForSequenceClassification model has 3 layers:
>>> model.summary()
Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
bert (TFBertMainLayer)       multiple                  108310272
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0
_________________________________________________________________
classifier (Dense)           multiple                  1538
=================================================================
Total params: 108,311,810
Trainable params: 108,311,810
Non-trainable params: 0
Similarly, calling model.layers gives:
[<transformers.models.bert.modeling_tf_bert.TFBertMainLayer at 0x7efda85595d0>,
<tensorflow.python.keras.layers.core.Dropout at 0x7efd6000ae10>,
<tensorflow.python.keras.layers.core.Dense at 0x7efd6000afd0>]
We can access the layers inside TFBertMainLayer:
>>> model.layers[0]._layers
[<transformers.models.bert.modeling_tf_bert.TFBertEmbeddings at 0x7efda8080f90>,
<transformers.models.bert.modeling_tf_bert.TFBertEncoder at 0x7efda855ced0>,
<transformers.models.bert.modeling_tf_bert.TFBertPooler at 0x7efda84f0450>,
DictWrapper({'name': 'bert'})]
So from the above we can access the TFBertEmbeddings layer by:
model.layers[0].embeddings
OR
model.layers[0]._layers[0]
If you check the documentation (search for the "TFBertEmbeddings" class) you can see that it inherits from the standard tf.keras.layers.Layer, which means you have access to all the normal regularizer attributes, so you should be able to write something like:
from tensorflow.keras import regularizers
model.layers[0].embeddings.activity_regularizer = regularizers.l2(1e-5)
Or whatever argument / regularizer you need to change. See here for regularizer docs.

Keras remove activation function of last layer

I want to use ResNet50 with Imagenet weights.
The last layer of ResNet50 is (from here)
x = layers.Dense(1000, activation='softmax', name='fc1000')(x)
I need to keep the weights of this layer but remove the softmax function.
I want to manually change it so my last layer looks like this
x = layers.Dense(1000, name='fc1000')(x)
but the weights stay the same.
Currently I call my net like this
resnet = Sequential([
    Input(shape=(224, 224, 3)),
    ResNet50(weights='imagenet', input_shape=(224, 224, 3))
])
I need the Input layer because otherwise the model.compile says that placeholders aren't filled.
Generally there are two ways of achieving this:
Quick way - supported functions:
To change the final layer's activation function, you can pass the classifier_activation argument.
So in order to get rid of the activation altogether, your model can be built like this:
import tensorflow as tf
resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg",
        classifier_activation=None
    )
])
This, however, is not going to work if you want a different function that is not supported by the Keras classifier_activation parameter (e.g. a custom activation function).
To achieve this you can use the workaround solution:
Long way - copy the model's weights
This solution copies the original model's weights onto your custom one. It works because, apart from the activation function, you are not changing the model's architecture.
You need to:
1. Download the original model.
2. Save its weights.
3. Declare your modified version of the model (in your case, without the activation function).
4. Set the weights of the new model.
The snippet below explains this concept in more detail:
import tensorflow as tf
# 1. Download original resnet
resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    )
])
# 2. Hold weights in memory:
imagenet_weights = resnet.get_weights()
# 3. Declare the model, but without softmax
resnet_no_softmax = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        include_top=False,
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    ),
    tf.keras.layers.Dense(1000, name='fc1000')
])
# 4. Pass the imagenet weights onto the second resnet
resnet_no_softmax.set_weights(imagenet_weights)
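If you want to convince yourself that copying the weights really preserves the model, the same idea can be checked on a toy model that needs no download. This is only a sketch with made-up names (original, logits_model, head): applying softmax to the copied model's raw outputs should reproduce the original model's outputs.

```python
import numpy as np
import tensorflow as tf

# Toy "original" model whose last layer applies softmax
original = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax", name="head"),
])

# Same architecture, but the last layer has no activation
logits_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, name="head"),
])

# Copy the weights; this works because the weight shapes are identical
logits_model.set_weights(original.get_weights())

x = np.random.rand(2, 4).astype("float32")
probs = original(x).numpy()
logits = logits_model(x).numpy()

# Applying softmax by hand should give back the original outputs
recovered = tf.nn.softmax(logits).numpy()
print(np.allclose(probs, recovered, atol=1e-5))
```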
Hope this helps!

Build layers with fixed weights in TensorFlow

I want to build a fully-connected (dense) layer for a regression task. I usually do it with TF2, using Keras API like:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, activation='sigmoid', input_shape=(1, )))
model.add(tf.keras.layers.Dense(units=2, activation='linear'))
model.compile(optimizer='adam', loss='mae')
model.fit(inp_data, out_data, epochs=1000)
Now I want to build a custom layer. The layer is composed of, say 10 units, in which 8 units have predefined, fixed, untrainable weights and biases and 2 units have randomly-chosen weights and biases, to be trained by the network. Has anyone any idea how can I define it in Tensorflow?
Keras layers accept a trainable parameter, True by default, that indicates whether you want them to be trained. Non-trainable layers just keep the values given by their initializer. If I understand correctly, you want one layer that is only partially trainable. That is not possible as such with the existing layers. Maybe you could do it with a custom layer class, but you can get equivalent behavior by using two simple layers and concatenating them (as long as your activation works element-wise; and even if it doesn't, as in a softmax layer, you could apply that activation after the concatenation). This is how it could work:
inputs = tf.keras.Input(shape=(1,))
# This is the trainable part of the layer (the 2 trainable units from the question)
layer_train = tf.keras.layers.Dense(units=2, activation='sigmoid')(inputs)
# This is the non-trainable part (the 8 fixed units from the question)
layer_const = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)(inputs)
# Merge both parts
layer = tf.keras.layers.Concatenate()([layer_train, layer_const])
# Make model
model = tf.keras.Model(inputs=inputs, outputs=layer)
# ...
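To give the fixed units predefined values rather than whatever the initializer produced, one option is set_weights on the frozen layer. The sketch below assumes made-up values (fixed_kernel, fixed_bias), a made-up output layer, and toy data; it trains for two epochs and then checks that the fixed part did not move.

```python
import numpy as np
import tensorflow as tf

# Hypothetical predefined values for the 8 fixed units
fixed_kernel = np.linspace(-1.0, 1.0, 8, dtype="float32").reshape(1, 8)
fixed_bias = np.zeros(8, dtype="float32")

inputs = tf.keras.Input(shape=(1,))
fixed_part = tf.keras.layers.Dense(
    8, activation="sigmoid", trainable=False, name="fixed_part")
train_part = tf.keras.layers.Dense(2, activation="sigmoid", name="train_part")
merged = tf.keras.layers.Concatenate()([fixed_part(inputs), train_part(inputs)])
outputs = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model(inputs, outputs)

# The layer is built by now, so its weights can be overwritten
fixed_part.set_weights([fixed_kernel, fixed_bias])

model.compile(optimizer="adam", loss="mae")
x = np.random.rand(16, 1).astype("float32")
y = 2.0 * x
model.fit(x, y, epochs=2, verbose=0)

# The frozen weights should be unchanged after training
kernel_after, bias_after = fixed_part.get_weights()
print(np.allclose(kernel_after, fixed_kernel), np.allclose(bias_after, fixed_bias))
```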

What is the expected behavior and purpose of model.trainable=False in tensorflow keras

It seems that setting model.trainable=False in tensorflow keras does nothing except print a wrong model.summary(). Here is the code to reproduce the issue:
import tensorflow as tf
import numpy as np
IMG_SHAPE = (160, 160, 3)
# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
# for layer in base_model.layers:
#     layer.trainable=False
bc = []  # before compile
ac = []  # after compile
for layer in base_model.layers:
    bc.append(layer.trainable)
print(np.all(bc))  # True
print(base_model.summary())  # this changes to show no trainable parameters, but that is wrong given the output of the previous np.all(bc)
base_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
for layer in base_model.layers:
    ac.append(layer.trainable)
print(np.all(ac))  # True
print(base_model.summary())  # this changes to show no trainable parameters, but that is wrong given the output of the previous np.all(ac)
In light of this - What is the expected behavior and purpose of model.trainable=False in tensorflow keras?
https://github.com/tensorflow/tensorflow/issues/29535
I think this issue could help.
If you are looking for a way to not update some weights in your model, I would suggest using the var_list parameter of your optimizer's minimize function.
For some reason, when a model is created from Keras, TensorFlow sets all tf.Variables to trainable=True, and since they are all tensors we are not able to change the value to False.
What I do in my code is create scope names for all pretrained models and loop over the trainable variables, collecting every variable that does not belong to my pretrained model.
trainable_variables = []
for layer in tf.trainable_variables():
    if 'vgg_model' not in layer.name:
        trainable_variables.append(layer)
        tf.add_to_collection('learnable_variables', layer)
# Read the collection only after it has been filled (get_collection returns a copy)
variables_collection = tf.get_collection('learnable_variables')
grad = tf.train.GradientDescentOptimizer(lr)
train_step = grad.minimize(tf.reduce_sum([loss]), var_list=trainable_variables)
Watch out for tf.global_variables_initializer as well, since it will overwrite your pretrained weights. You can avoid that by using tf.variables_initializer and passing the list of variables you want to initialize.
sess.run(tf.variables_initializer(variables_collection))
Sources I used when trying to solve this problem:
Is it possible to make a trainable variable not trainable?
TensorFlow: Using tf.global_variables_initializer() after partially loading pre-trained weights
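For comparison, here is a minimal sketch of what I would expect in current TF 2.x releases, using a toy model (toy_base is a made-up name) rather than MobileNetV2. It assumes that setting trainable on the model propagates to its layers, which is what the Keras transfer-learning workflow relies on.

```python
import tensorflow as tf

# Toy stand-in for a pretrained base model
toy_base = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2),
])

weights_before = len(toy_base.trainable_weights)

# Freeze the whole model
toy_base.trainable = False

# Collect the per-layer flags and the remaining trainable weights
layer_flags = [layer.trainable for layer in toy_base.layers]
weights_after = len(toy_base.trainable_weights)
print(weights_before, layer_flags, weights_after)
```

If the flags come out False and no trainable weights remain, the summary showing zero trainable parameters is consistent with the layers, and an optimizer would have nothing to update in the frozen model.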

keras add external trainable variable to graph

I am working on language modelling and the vocabulary is large, so I want to use sampled_softmax_loss from tensorflow. The problem is that the weights and biases, which are arguments of the sampled_softmax_loss function, do not seem to be trainable (their values don't change after training).
So I guess that I should add them to the computation graph built automatically by the keras Model, but I have spent a lot of time and still haven't found a proper way to do so.
So, once again: I want to add external trainable tf.Variables to the keras computation graph. Does anyone know how to do this?
My model (head and tail):
input_sentence = Input(shape=(INPUT_LENGTH,), dtype='int32')
words = Embedding(embedding_matrix.shape[0], embedding_matrix.shape[1],
                  weights=[embedding_matrix], trainable=True)(input_sentence)
...
context = Dense(256, activation='tanh')(context)
model = Model(inputs=input_sentence, outputs=context, name=name)
Loss:
def softmax_fine_loss(labels, logits, transposed_W=None, b=None):
    # Python 3 does not allow tuple parameters in a lambda, so unpack by index
    res = tf.map_fn(lambda pair: tf.nn.sampled_softmax_loss(transposed_W, b, pair[0], pair[1],
                                                            num_sampled=1000, num_classes=OUTPUT_COUNT+1),
                    (labels, logits), dtype=tf.float32)
    return res
loss = lambda labels, logits: softmax_fine_loss(labels, logits, transposed_W=transposed_W, b=b)
model_truncated.compile(optimizer=optimizer, loss=loss, sample_weight_mode='temporal')
I have finally found a workaround.
Let's say we need to train weights W and biases b with our model.
The workaround is to add them to one of the trainable layers of our model:
model.layers[-1].trainable_weights.extend([W, b])
Then we can compile the model:
model.compile(...)
It is extremely important to add the variables to a trainable layer. For example, I experimented with a Sequential model, and adding [W, b] to the Activation layer did not make them actually trainable.
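A different route that may be more robust in newer Keras versions, sketched under my own assumptions: let a small custom layer own the variables through add_weight, so that Keras tracks them as trainable weights of the model. The class name SampledSoftmaxParams, the layer sizes and the toy model are all made up, and the sketch only shows the tracking, not the wiring into sampled_softmax_loss.

```python
import tensorflow as tf


class SampledSoftmaxParams(tf.keras.layers.Layer):
    """Hypothetical pass-through layer that owns the extra W and b variables."""

    def __init__(self, num_classes, dim, **kwargs):
        super().__init__(**kwargs)
        self.num_classes = num_classes
        self.dim = dim

    def build(self, input_shape):
        # Variables created with add_weight are tracked by Keras
        self.W = self.add_weight(
            name="W", shape=(self.num_classes, self.dim),
            initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(
            name="b", shape=(self.num_classes,),
            initializer="zeros", trainable=True)

    def call(self, inputs):
        # Pass the inputs through unchanged; the layer only holds W and b
        return inputs


inputs = tf.keras.Input(shape=(16,))
context = tf.keras.layers.Dense(8, activation="tanh")(inputs)
context = SampledSoftmaxParams(num_classes=50, dim=8, name="softmax_params")(context)
toy_model = tf.keras.Model(inputs, context)

# W and b now appear next to the Dense kernel and bias
shapes = [tuple(v.shape) for v in toy_model.trainable_weights]
print(shapes)
```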