I am using a pretrained model like so:
base_model = keras.applications.Xception(
    weights='imagenet',
    input_shape=(150, 150, 3),
    include_top=False
)
Then I freeze all the layers:
base_model.trainable = False
Now I would like to unfreeze only, let's say, one of the lowest layers (the ones at the bottom of the summary).
When I do base_model.summary(), this is what appears at the bottom:
So, let's say I would like to unfreeze the block14_sepconv2 layer.
I do:
my_layer = base_model.get_layer('block14_sepconv2')
my_layer.trainable = True
And summary() still shows Trainable params: 0.
What am I doing wrong? How do I unfreeze only a few of the lowest layers?
What's interesting: when I first do base_model.trainable = True and then freeze layers starting from the top, the number of trainable params actually changes. But that is not intuitive to me, and above all I don't understand why it works that way.
Here is one way to unfreeze specific layers. The reason your attempt still shows Trainable params: 0 is that trainable is recursive in Keras: once you set base_model.trainable = False, the model's own flag is False and its trainable_weights list stays empty, no matter what you set on individual inner layers. So instead of freezing the whole model, freeze the individual layers. We pick the same model and some layers (e.g. block14_sepconv2); the goal is to leave those layers trainable and keep the rest frozen.
from tensorflow import keras
base_model = keras.applications.Xception(
    weights='imagenet',
    input_shape=(150, 150, 3),
    include_top=False
)
# freeze all layers except the desired ones,
# which are listed in [ ... ]
for layer in base_model.layers:
    if layer.name not in ['block14_sepconv2', 'block13_sepconv1']:
        layer.trainable = False
    if layer.trainable:
        print(layer.name)
block14_sepconv2
block13_sepconv1
Now count the trainable and non-trainable parameters:
import tensorflow.keras.backend as K
import numpy as np

trainable_count = np.sum([K.count_params(w)
                          for w in base_model.trainable_weights])
non_trainable_count = np.sum([K.count_params(w)
                              for w in base_model.non_trainable_weights])

print('Total params: {:,}'.format(trainable_count + non_trainable_count))
print('Trainable params: {:,}'.format(trainable_count))
print('Non-trainable params: {:,}'.format(non_trainable_count))
Total params: 20,861,480
Trainable params: 3,696,088
Non-trainable params: 17,165,392
FYI, don't forget to recompile your model (model.compile(...)) each time you freeze or unfreeze the layers.
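For completeness, here is a minimal sketch of how this could fit into a full fine-tuning setup; the classification head (global pooling plus a single Dense unit) and the optimizer/loss are placeholders I am assuming, not part of the original question:

inputs = keras.Input(shape=(150, 150, 3))
x = base_model(inputs)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

# recompile after every change to the trainable flags, otherwise
# fit() keeps using the previous trainable/non-trainable split
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss=keras.losses.BinaryCrossentropy(from_logits=True))
model.summary()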
Related
I want to calculate gradients for both trainable and non-trainable variables, and update only the trainable parameters.
At first, I implemented it as follows:
with tf.GradientTape(persistent=True) as g:
    preds = model(data)
    loss = criterion(labels, preds)

gradients = g.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))

non_train_gradients = g.gradient(loss, model.non_trainable_variables)
However, the above code does backpropagation twice to compute the gradients.
I want to compute the gradients of both trainable and non-trainable variables in a single backward pass,
but update only the trainable parameters.
How can I do it?
We can use the fact that the gradients are just a list and are returned in the same order as the variables we put in:
n_trainable = len(model.trainable_variables)

gradients = g.gradient(
    loss, model.trainable_variables + model.non_trainable_variables
)
trainable_gradients = gradients[:n_trainable]
non_trainable_gradients = gradients[n_trainable:]

optimizer.apply_gradients(
    zip(trainable_gradients, model.trainable_variables)
)
That is, we just put all the non-trainable variables at the end, and then split the gradients at that point.
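As a sanity check, here is a minimal, self-contained sketch of a full training step built around this idea; the toy model, data and loss are made up for illustration. One detail worth noting: by default a GradientTape only watches trainable variables, so the non-trainable ones have to be watched explicitly if you want their gradients.

import tensorflow as tf

# toy model; freezing the first layer turns its kernel/bias into
# non-trainable variables whose gradients we can still inspect
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.layers[0].trainable = False

optimizer = tf.keras.optimizers.Adam(1e-3)
criterion = tf.keras.losses.MeanSquaredError()

data = tf.random.normal((16, 4))
labels = tf.random.normal((16, 1))

with tf.GradientTape() as g:                     # one non-persistent tape is enough
    g.watch(model.non_trainable_variables)       # not watched automatically
    preds = model(data, training=True)
    loss = criterion(labels, preds)

n_trainable = len(model.trainable_variables)
gradients = g.gradient(
    loss, model.trainable_variables + model.non_trainable_variables
)
trainable_gradients = gradients[:n_trainable]
non_trainable_gradients = gradients[n_trainable:]  # inspected only, never applied

optimizer.apply_gradients(zip(trainable_gradients, model.trainable_variables))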
I want to create a GRU model with Keras, but I can't figure out how to specify the number of hidden recurrent layers. I used the following code. As I understand it, this code creates an input layer with 64 neurons, a hidden layer with 64 neurons and an output layer with 4 neurons. Is that correct? I also want to know how I can add other hidden layers.
def create_model(units, m):
    model = Sequential()
    model.add(m(units=units, return_sequences=True,
                input_shape=[X_train.shape[1], X_train.shape[2]]))
    model.add(Dropout(0.2))
    model.add(m(units=units))
    model.add(Dropout(0.2))
    model.add(Dense(4))
    # Compile model
    model.compile(loss='mse', optimizer=MyOptimizer(learning_rate=0.001))
    return model

model_gru = create_model(64, GRU)
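For what it's worth: the input is defined by input_shape, so the first GRU is already the first hidden recurrent layer, the second GRU is another hidden recurrent layer, and Dense(4) is the output layer. To add more hidden layers you stack more GRU layers, and every recurrent layer except the last one needs return_sequences=True so it passes the full sequence on to the next recurrent layer. A minimal sketch with three recurrent layers (the sizes and input shape here are placeholders, not from the question):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dropout, Dense

timesteps, n_features = 10, 5  # placeholder input shape

model = Sequential()
# first hidden recurrent layer: returns the full sequence for the next GRU
model.add(GRU(64, return_sequences=True, input_shape=(timesteps, n_features)))
model.add(Dropout(0.2))
# an additional hidden recurrent layer, also returning sequences
model.add(GRU(64, return_sequences=True))
model.add(Dropout(0.2))
# last recurrent layer: returns only the final output
model.add(GRU(64))
model.add(Dropout(0.2))
model.add(Dense(4))
model.compile(loss='mse', optimizer='adam')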
I would like to clear the memory / network every time I am done with a training run. I tried the alternatives proposed online, but they do not seem to work, if I am interpreting my results correctly. I use tf.compat.v1.reset_default_graph() and tf.keras.backend.clear_session(), since these are the most commonly recommended.
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.keras import backend as K
upper_limit = 2
lower_limit = -2
training_input= np.random.random ([100,5])*(upper_limit - lower_limit) + lower_limit
training_output = np.random.random ([100,1]) *10*(upper_limit - lower_limit) + lower_limit
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(5,)),
    tf.keras.layers.Dense(12, activation='relu'),
    tf.keras.layers.Dense(1)
])
model.compile(loss="mse",optimizer = tf.keras.optimizers.Adam(learning_rate=0.01))
for layer in model.layers:
    print("layer weights before fitting: ", layer.get_weights(), "\n")  # weights

model.fit(training_input, training_output, epochs=5, batch_size=100, verbose=0)

for layer in model.layers:
    print("layer weights after fitting: ", layer.get_weights(), "\n")  # weights
print("\n")

tf.compat.v1.reset_default_graph()
tf.keras.backend.clear_session()

print("after clear", "\n")
for layer in model.layers:
    print(layer.get_weights(), "\n")  # weights
When I print the layer weights after attempting to clear the network, I get the same weight values as before cleaning the session.
I think what you are looking for is resetting the weights of your model, and that is not really related to the session or the graph (with some exceptions).
Resetting the weights is currently a debated topic; you can find how to do it in most cases here, but as you can see, nobody is currently planning to implement this as a built-in function.
For easy access I post the current proposal below.
def reset_weights(model):
    for layer in model.layers:
        if isinstance(layer, tf.keras.Model):  # if you're using a model as a layer
            reset_weights(layer)  # apply function recursively
            continue

        # where are the initializers?
        if hasattr(layer, 'cell'):
            init_container = layer.cell
        else:
            init_container = layer

        for key, initializer in init_container.__dict__.items():
            if "initializer" not in key:  # is this item an initializer?
                continue  # if not, skip it

            # find the corresponding variable, like the kernel or the bias
            if key == 'recurrent_initializer':  # special case check
                var = getattr(init_container, 'recurrent_kernel')
            else:
                var = getattr(init_container, key.replace("_initializer", ""))

            var.assign(initializer(var.shape, var.dtype))
Remember that if you do not set a seed, the weights will be different each time you call the reset.
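Applied to the model from the question above, a quick usage check could look like this (just a sketch; the print label is mine):

reset_weights(model)

# the printed weights should now differ from the fitted ones
for layer in model.layers:
    print("layer weights after reset: ", layer.get_weights(), "\n")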
I am following the official Keras transfer learning and fine-tuning tutorial. It consists of loading the Xception model with include_top=False, and adding a new classifier part on top.
I am then saving the model with model.save() and loading with load_model().
This is what I see when I do model.summary():
My problem is that I would like to iterate through the layers, while now the Xception layers are somehow folded (in the picture: xception (Functional)). Is there a way to unfold it, so I can see all the layers, including those that make up Xception?
For model.summary(), you can unfold that as follows:
from tensorflow import keras

base_model = keras.applications.Xception(
    weights='imagenet',         # Load weights pre-trained on ImageNet.
    input_shape=(150, 150, 3),
    include_top=False)          # Do not include the ImageNet classifier at the top.

inputs = keras.Input(shape=(150, 150, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.summary()            # full model
model.layers[1].summary()  # only the nested Xception model
Note, you can also see the nested layers using the plot_model utility.
keras.utils.plot_model(model, expand_nested=True)
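And since the original goal was to iterate through the layers: the nested Xception model is itself an element of model.layers, so its sublayers are reachable through its own layers attribute. A small sketch (assuming, as above, that the Xception block sits at index 1):

xception = model.layers[1]
for layer in xception.layers:
    print(layer.name, layer.trainable)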
In Tensorflow I would have a placeholder such that I can feed it to the network as required:
self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")
# ...
# Add dropout
with tf.name_scope("dropout"):
    self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)
However, I am not sure how to do this in Keras:
# in_dropout = tf.placeholder(tf.float32, name="dropout_keep_prob")
in_dropout = Input(shape=(1,), name='dropout')
# ...
# Add dropout
dropout = Dropout(in_dropout)(max_pool)  # This does not work, of course
In Keras, Dropout layers behave differently in the training and test phases: dropout is only enabled during training.
To apply dropout in both the training and test phases, you have to replace the Dropout layers with Lambda layers that use the dropout function from the Keras backend.
from keras.layers.core import Lambda
from keras import backend as K
model.add(Lambda(lambda x: K.dropout(x, level=0.5)))
For more reference check: here.
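For context, a minimal sketch of how such a Lambda-based dropout could sit inside a small model; the surrounding layers and the fixed rate of 0.5 are assumptions, not from the question:

from keras.models import Sequential
from keras.layers import Dense
from keras.layers.core import Lambda
from keras import backend as K

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),
    # always-on dropout: K.dropout is applied at both training and test time
    Lambda(lambda x: K.dropout(x, level=0.5)),
    Dense(1),
])
model.compile(loss='mse', optimizer='adam')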