What is the expected behavior and purpose of model.trainable=False in tensorflow keras - tensorflow

It seems setting model.trainable=False in tensorflow keras does nothing except for to print a wrong model.summary(). Here is the code to reproduce the issue:
import tensorflow as tf
import numpy as np
IMG_SHAPE = (160, 160, 3)
# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
include_top=False,
weights='imagenet')
base_model.trainable = False
# for layer in base_model.layers:
# layer.trainable=False
bc=[] #before compile
ac=[] #after compile
for layer in base_model.layers:
bc.append(layer.trainable)
print(np.all(bc)) #True
print(base_model.summary()) ##this changes to show no trainable parameters but that is wrong given the output to previous np.all(bc)
base_model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001),
loss='categorical_crossentropy',
metrics=['accuracy'])
for layer in base_model.layers:
ac.append(layer.trainable)
print(np.all(ac)) #True
print(base_model.summary()) #this changes to show no trainable parameters but that is wrong given the output to previous np.all(ac)
In light of this - What is the expected behavior and purpose of model.trainable=False in tensorflow keras?

https://github.com/tensorflow/tensorflow/issues/29535
I think this issue could help.
If you are looking for a way to not update some weights in your model I would suggest using the parameter var_list in the minimize function from your Optimizer.
For some reason when creating a model from keras Tensorflow switch all tf.Variables to True, and since all are Tensors we are not able to update the value to False.
What I do in my code is create scope names for all pretrained models and loop over it adding all layers that are not from my pretrained model.
trainable_variables = []
variables_collection = tf.get_collection('learnable_variables')
for layer in tf.trainable_variables():
if 'vgg_model' not in layer.name:
trainable_variables.append(layer)
tf.add_to_collection('learnable_variables', layer)
grad = tf.train.GradientDescentOptimizer(lr)
train_step = grad.minimize(tf.reduce_sum([loss]), var_list=trainable_variables)
Watch out for global_initializer as well, since it will overwrite your pretrained Weights as well. You can solve that by using tf.variables_initializer and passing a list of variables you want to add weights.
sess.run(tf.variables_initializer(variables_collection))
Source I used when trying to solve this problem
Is it possible to make a trainable variable not trainable?
TensorFlow: Using tf.global_variables_initializer() after partially loading pre-trained weights

Related

Convert Inception model with include_top=False from Keras to Pytorch

Im try convert old project writen on Keras to PyTorch.
Keras create_model() contains folowing code. This is (129,500,1) grayscale image as input and (None, 2, 14, 2038) as output. Output tensor used in another BiLSTM later.
from tensorflow.python.keras.applications.inception_v3 import InceptionV3
inception_model = InceptionV3(include_top=False, weights=None, input_tensor=input_tensor)
for layer in inception_model.layers:
layer.trainable = False
x = inception_model.output
How I am can convert this code to Pytorch? The main problem is "include_top=False" what do not exist in Pytorch torchvision.inception_v3 model. This flag allow Keras model work with non-standard 1 channel inputs and 4-dim last Conv block outputs.
Actually, InceptionV3 model available in PyTorch.
You can try the below code.
import torchvision
torchvision.models.inception_v3()

Keras remove activation function of last layer

I want to use ResNet50 with Imagenet weights.
The last layer of ResNet50 is (from here)
x = layers.Dense(1000, activation='softmax', name='fc1000')(x)
I need to keep the weights of this layer but remove the softmax function.
I want to manually change it so my last layer looks like this
x = layers.Dense(1000, name='fc1000')(x)
but the weights stay the same.
Currently I call my net like this
resnet = Sequential([
Input(shape(224,224,3)),
ResNet50(weights='imagenet', input_shape(224,224,3))
])
I need the Input layer because otherwise the model.compile says that placeholders aren't filled.
Generally there are two ways of achievieng this:
Quick way - supported functions:
To change the final layer's activation function, you can pass an argument classifier_activation.
So in order to get rid of activation all together, your module can be called like:
import tensorflow as tf
resnet = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
weights='imagenet',
input_shape=(224,224,3),
pooling="avg",
classifier_activation=None
)
])
This however, is not going to work if the you want a different function, that is not supported by Keras classifer_activation parameter (e. g. custom activation function).
To achieve this you can use the workaround solution:
Long way - copy the model's weights
This solution proposes copying the original model's weights onto your custom one. This approach works because apart from the activation function you are not chaning the model's architecture.
You need to:
1. Download original model.
2. Save it's weights.
3. Declare your modified version of the model (in your case, without the activation function).
4. Set the weights of the new model.
Below snippet explains this concept in more detail:
import tensorflow as tf
# 1. Download original resnet
resnet = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
weights='imagenet',
input_shape=(224,224,3),
pooling="avg"
)
])
# 2. Hold weights in memory:
imagenet_weights = resnet.get_weights()
# 3. Declare the model, but without softmax
resnet_no_softmax = tf.keras.Sequential([
tf.keras.layers.Input(shape=(224,224,3)),
tf.keras.applications.ResNet50(
include_top=False,
weights='imagenet',
input_shape=(224,224,3),
pooling="avg"
),
tf.keras.layers.Dense(1000, name='fc1000')
])
# 4. Pass the imagenet weights onto the second resnet
resnet_no_softmax.set_weights(imagenet_weights)
Hope this helps!

How to do Gradient Normalization using Tensorflow LazyAdamOptimizer in functional Keras Model?

I am using a bidirectional RNN in Keras and need to use Tensoflows LazyAdamOptimizer. I need to do Gradient Normalization. How can I implement gradient normalization with tensorflows LazyAdamOptimizer and than use the functional keras model further on?
I am training a unsupervised RNN to predict a input sequence of lenght 10. The Problem is, that i am using a keras functional model. Because of the sparsity of the embedding layer i need to use Tensorflows LazyAdamOptimizer, which is not a default optimizer in keras. When using a default keras optimizer i can do gradient normalization just by setting the argument 'clipnorm=1' in the optimizer function. Because i am using LazyAdam i need to do this with tensorflow and than pass it back to my keras model, but i can't get the code going.
#model architecture
model_input = Input(shape=(seq_len, ))
embedding_a = Embedding(len(port_fwd_dict), 50, input_length=seq_len, mask_zero=True)(model_input)
lstm_a = Bidirectional(GRU(25, return_sequences=True,implementation=2, reset_after=True, recurrent_activation='sigmoid'), merge_mode="concat (embedding_a)
dropout_a = Dropout(0.2)(lstm_a)
lstm_b = Bidirectional(GRU(25, return_sequences=False, activation="relu", implementation=2, reset_after=True, recurrent_activation='sigmoid'), merge_mode="concat")(dropout_a)
dropout_b = Dropout(0.2)(lstm_b)
dense_layer = Dense(100, activation="linear")(dropout_b)
dropout_c = Dropout(0.2)(dense_layer)
model_output = Dense(len(port_fwd_dict)-1, activation="softmax(dropout_c)
# trying to implement gradient normalization
optimizer = tf.contrib.opt.LazyAdamOptimizer()
optimizer = tf.contrib.estimator.clip_gradients_by_norm(optimizer, 1)
loss = tf.reduce_mean(categorical_crossentropy(model_input, model_output))
train_op = optimizer.minimize(loss, tf.train.get_global_step())
model = Model(inputs=model_input, outputs=model_output)
model.compile(optimizer=train_op, loss='categorical_crossentropie', metrics = [ 'categorical_accuracy'])
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size, validation_split=validation_split, class_weight = 'auto')
Blockquote
I get the following Error: NameError: name 'categorical_crossentropy' is not defined
But even if this error is solved, i do not know if this code will work. Because I need to use the keras function model.compile and in this function there need to be a loss specified. but when i do this in the tensorflow part above, it is not working.
I need a way to do gradient normalization and use my normal keras functional model?!
maybe you can try my implement of lazy optimizer:
https://github.com/bojone/keras_lazyoptimizer
It is a pure keras implement, wrapping a existing optimizer to be a lazy version.

Inference / Prediction from frozen model: Obtain value after activation through the graph

I have a simple frozen tensorflow model (frozen in Keras) that I load and then try to use for prediction. I do this first in python (code below), and then using C and libtensorflow (and get the same results). The examples I have found provide the the logits (before activation) as the final output, rather than the class label after activation. Is there a way to obtain the label through the graph itself?
I understand I can operate the sigmoid / softmax operators on the logits, but that's not what I want to do. (I'm porting the code to use the libtensorflow C api, and would prefer to let the graph do the math.)
My understanding is that the session runs the graph to the operation / tensor, and stops before that operation. Is there a way to get the operation after the activation?
Keras Model:
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(21,)))
model.add(Dense(1, activation='sigmoid'))
Tensorflow code to load frozen model and predict:
from tensorflow.python.platform import gfile
with tf.Session() as sess:
with gfile.FastGFile('slopemodel/slopemodel.pb', 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
sess.graph.as_default()
g_in = tf.import_graph_def(graph_def)
tensor_output = sess.graph.get_tensor_by_name('import/dense_2/Sigmoid:0')
tensor_input = sess.graph.get_tensor_by_name('import/dense_1_input:0')
predictions = sess.run(tensor_output, {tensor_input:sample})
print(predictions)
Truncated list of important nodes in the graph:
['import/dense_1_input',
'import/dense_1/kernel',
'import/dense_1/kernel/read',
'import/dense_1/bias',
'import/dense_1/bias/read',
'import/dense_1/MatMul',
'import/dense_1/BiasAdd',
'import/dense_1/Relu',
'import/dense_2/kernel',
'import/dense_2/kernel/read',
'import/dense_2/bias',
'import/dense_2/bias/read',
'import/dense_2/MatMul',
'import/dense_2/BiasAdd',
'import/dense_2/Sigmoid',
'import/Adam/iterations',
.
.
.]
Yes, you simply need to change the tensor_output to the one you want to obtain. Note that you will not receive the labels themselves but a one-hot vector from which you'll need to find the corresponding label yourself.

Keras models in tensorflow

I'm building image processing network in tensorflow and I want to make use of texture loss. Texture loss seems simple to implement if you have pretrained model loaded.
I'm using TF to build the computational graph for my model and I want to incorporate Keras.application.VGG19 model to get output from layer 'block4_conv4'.
The problem is: I have two TF tensors target and result from my main model, how to feed them into keras VGG19 in the same session to compute their diff and use it in main loss for my model?
It seems following code does the trick
with tf.variable_scope("") as scope:
phi_func = VGG19(include_top=False, weights=None, input_shape=(128, 128, 3))
text_1 = phi_func(predicted)
scope.reuse_variables()
text_2 = phi_func(x)
text_loss = tf.reduce_mean((text_1 - text_2)**2)
right after session created I call phi_func.load_weights(path) to initiate weights