I'm trying to save and load the weights of a model that has a merged layer. Since my model code is a bit long, let me shorten it with a simple example in pseudo-style code.
First, my model looks like this:
def some_model():
    model1 = Sequential()
    model1.add(...)

    model2 = Sequential()
    model2.add(...)

    final_model = Sequential()
    final_model.add(Merge([model1, model2], mode='concat'))
    return final_model
So, after training, I only saved final_model's weights.
final_model.save_weights('w_final_model.h5')
And there was no problem when I loaded the weights for further training or testing.
final_model = some_model()
final_model.load_weights('w_final_model.h5')
So far so good. Yet my curiosity was piqued when I tried to inspect the shapes of final_model's layers.
Obviously, final_model only lists its own layers. In other words, it doesn't appear to carry the weight vectors of model1 and model2. But it still works, and I wonder how that is possible. Or is it only loading the weights for final_model's own layers while the model1 and model2 weights are re-initialized? The network's output is too good to believe that the model1 and model2 weights were freshly initialized. So, do I need to save each model's weights separately?
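For what it's worth, one way to check is to compare what final_model.layers reports with what final_model.get_weights() returns. This is only a sketch against the legacy Keras 1.x Merge API used above; if the merged branches really travel with final_model, the flat weight list should contain their kernels and biases, which would explain why saving and loading final_model alone works.

final_model = some_model()
final_model.load_weights('w_final_model.h5')

# The layer list looks short because the Merge layer wraps both branches...
print([type(layer).__name__ for layer in final_model.layers])

# ...but the flat weight list should still contain every kernel and bias of
# model1 and model2, so save_weights()/load_weights() on final_model is enough.
print(len(final_model.get_weights()))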
Related
I have defined my Functional model like this:
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

base_model = VGG16(include_top=False, input_shape=(224, 224, 3), pooling='avg')

inputs = tf.keras.Input(shape=(224, 224, 3))
x = preprocess_input(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.Dropout(0.2)(x, training=True)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
The problem is that when I call .evaluate() or .predict() I get slightly different results every time on the exact same batch (with shuffle=False in my dataset and all random seeds initialized).
I tried reconstructing the model without some of the layers, and I found the culprit to be the two layers constructed by the line x = preprocess_input(inputs), which seemed to add randomness to the results:
[model summary screenshot]
Note: preprocess_input is the VGG16 preprocessing function, tf.keras.applications.vgg16.preprocess_input.
However, if I rebuild the Functional model as a Sequential one:
new_model = tf.keras.Sequential()
new_model.add(model.layers[0]) #input layer
new_model.add(tf.keras.layers.Lambda(preprocess_input))
new_model.add(model.layers[3]) #vgg16
new_model.add(model.layers[4]) #dropout
new_model.add(model.layers[5]) #dense
The problem is gone and I get consistent results from .evaluate() or .predict().
What could potentially cause the Functional model to behave like this?
EDIT
As xdurch0 pointed out, the dropout layer was at fault for the different results: because it was called with training=True, the functional model kept applying dropout during the .predict() and .evaluate() methods.
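For reference, a minimal sketch of the fix under the same setup as above (not the asker's final code): dropping the explicit training=True lets Keras disable dropout outside of training, so keep it only if Monte Carlo dropout is actually intended.

import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

base_model = VGG16(include_top=False, input_shape=(224, 224, 3), pooling='avg')

inputs = tf.keras.Input(shape=(224, 224, 3))
x = preprocess_input(inputs)
x = base_model(x, training=False)
# No training=True here: dropout is active only during fit(),
# so evaluate()/predict() become deterministic.
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
deterministic_model = tf.keras.Model(inputs=inputs, outputs=outputs)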
I was working on an image recognition problem. After training the model, I saved the architecture as well as the weights. Now I want to use the model to extract features from other images and run an SVM on those features. For this, I want to remove the last two layers of my model and get the values computed by the CNN and fully connected layers up to that point. How can I do that in Keras?
from tensorflow import keras

# a simple model
model = keras.models.Sequential([
    keras.layers.Input((32, 32, 3)),
    keras.layers.Conv2D(16, 3, activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax')
])
# after training
feature_only_model = keras.models.Model(model.inputs, model.layers[-2].output)
feature_only_model takes a (32, 32, 3) image as input and outputs the feature vector.
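To connect this to the SVM part of the question, a minimal usage sketch; scikit-learn's SVC and the random train_images/train_labels arrays are assumptions for illustration, not part of the original answer.

import numpy as np
from sklearn.svm import SVC

# Hypothetical data: a batch of images and their integer labels.
train_images = np.random.rand(100, 32, 32, 3).astype('float32')
train_labels = np.random.randint(0, 10, size=100)

# Extract features with the truncated Keras model, then fit an SVM on them.
features = feature_only_model.predict(train_images, batch_size=32)
svm = SVC(kernel='rbf')
svm.fit(features, train_labels)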
If your model is subclassed, just change the call() method.
If not:
if your model is complicated, wrap it in a subclassed model and change the forward pass in its call() method, or
if your model is simple, create a model without the last layers and load the weights into every layer separately (see the sketch below).
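A minimal sketch of that last option, assuming the simple model defined above and that the trained weights were saved to 'model_weights.h5' (a hypothetical filename):

from tensorflow import keras

# Rebuild only the layers to keep (everything up to the Flatten).
truncated = keras.models.Sequential([
    keras.layers.Input((32, 32, 3)),
    keras.layers.Conv2D(16, 3, activation='relu'),
    keras.layers.Flatten(),
])

# Rebuild the full architecture, load the trained weights into it,
# then copy them layer by layer into the truncated model.
full = keras.models.Sequential([
    keras.layers.Input((32, 32, 3)),
    keras.layers.Conv2D(16, 3, activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax'),
])
full.load_weights('model_weights.h5')  # hypothetical weights file

for new_layer, old_layer in zip(truncated.layers, full.layers):
    new_layer.set_weights(old_layer.get_weights())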
I want to build a fully-connected (dense) layer for a regression task. I usually do it with TF2, using the Keras API, like:
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, activation='sigmoid', input_shape=(1,)))
model.add(tf.keras.layers.Dense(units=2, activation='linear'))
model.compile(optimizer='adam', loss='mae')
model.fit(inp_data, out_data, epochs=1000)
Now I want to build a custom layer. The layer is composed of, say, 10 units, of which 8 units have predefined, fixed, untrainable weights and biases, and 2 units have randomly chosen weights and biases to be trained by the network. Does anyone have an idea how I can define this in TensorFlow?
Keras layers may receive a trainable parameter, True by default, to indicate whether you want them to be trained. Non-trainable layers will just keep the values they are given by the initializer. If I understand correctly, you want to have one layer which is only partially trainable. That is not possible as such with existing layers. Maybe you could do it with a custom layer class, but you can get equivalent behavior by using two simple layers and then concatenating them (as long as your activation works element-wise, and even if it doesn't, like in a softmax layer, you could apply that activation after the concatenation). This is how it could work:
import tensorflow as tf

inputs = tf.keras.Input(shape=(1,))
# This is the trainable part of the layer (the 2 units from the question)
layer_train = tf.keras.layers.Dense(units=2, activation='sigmoid')(inputs)
# This is the non-trainable part (the 8 units with fixed weights and biases)
layer_const = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)(inputs)
# Merge both parts
layer = tf.keras.layers.Concatenate()([layer_train, layer_const])
# Make model
model = tf.keras.Model(inputs=inputs, outputs=layer)
# ...
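To actually plant the predefined values into the non-trainable part, one option is to keep a handle on that Dense layer and call set_weights() once the model is built. This is only a sketch; fixed_kernel and fixed_bias are hypothetical NumPy arrays holding the predefined values.

import numpy as np
import tensorflow as tf

# Hypothetical predefined values for the 8 fixed units (input dimension is 1 here).
fixed_kernel = np.ones((1, 8), dtype='float32')
fixed_bias = np.zeros((8,), dtype='float32')

inputs = tf.keras.Input(shape=(1,))
trainable_part = tf.keras.layers.Dense(units=2, activation='sigmoid')
frozen_part = tf.keras.layers.Dense(units=8, activation='sigmoid', trainable=False)

layer = tf.keras.layers.Concatenate()([trainable_part(inputs), frozen_part(inputs)])
model = tf.keras.Model(inputs=inputs, outputs=layer)

# The frozen layer is built now, so its weights can be overwritten in place.
frozen_part.set_weights([fixed_kernel, fixed_bias])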
I have two Keras models, let's call them model1 and model2. Both models are simple perceptrons. Here is the code for setting up model1; model2 is identical.
import tensorflow as tf
from tensorflow import keras

model1 = keras.Sequential([
    keras.layers.Dense(100, activation=tf.nn.relu),
    keras.layers.Dropout(0.5, noise_shape=None, seed=None),
    keras.layers.Dense(26, activation=tf.nn.softmax)
])
model1.compile(optimizer='sgd',
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])
I want to mix these two models after training them, such that the resulting model is a random sampling of the weights and biases of model1 and model2. So, for example, if the weights are represented by [x1, x2, x3, x4, ...] and [y1, y2, y3, y4, ...], the result will be a random combination of those: [x1, y2, y3, x4, ...].
I've looked into merging layers of Keras, but do not see a clear way of accomplishing this in the API. I am looking for insight on how to build a new model that consists of a random ~50/50 split of the weights and biases of model1 and model2. Any ideas on how to accomplish this?
Aight, after another week of beating my head against a table, I finally realized how much of a doofus I was being. Here's the function I made to solve this problem, and it is so unbelievably simple.
import random

# Initialize and train model1 and model2; they are the inputs to this function.
def mateKerasNN(net1, net2):
    net1weights = net1.get_weights()
    net2weights = net2.get_weights()
    net3weights = net1.get_weights()
    for i in range(len(net1weights)):
        for j in range(len(net1weights[i])):
            # Pick each row/bias entry from either parent at random.
            net3weights[i][j] = random.choice([net1weights[i][j], net2weights[i][j]])
    return net3weights

# model3 must be built with the same architecture as model1/model2 beforehand.
model3weights = mateKerasNN(model1, model2)
model3.set_weights(model3weights)
Note that this actually randomizes each neuron's weights as a group. So neuron 1 with its 40 weights moves as one group into the new model, as do neurons 2 through 784. I will be building a version where all the weights are randomized individually, but this code is a good start (a sketch of that element-wise version is below).
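For reference, a possible element-wise variant of the same idea, drawing a random 50/50 mask per weight entry with NumPy; this is a sketch, not part of the original answer.

import numpy as np

def mate_keras_nn_elementwise(net1, net2):
    # Mix two models' weights entry by entry with a random 50/50 mask.
    mixed = []
    for w1, w2 in zip(net1.get_weights(), net2.get_weights()):
        mask = np.random.rand(*w1.shape) < 0.5
        mixed.append(np.where(mask, w1, w2))
    return mixed

# Usage mirrors the original function:
# model3.set_weights(mate_keras_nn_elementwise(model1, model2))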
I have built an image classifier with 2 classes, say 'A' and 'B'. I have also saved this model, using model.save().
Now, after a certain time, the requirement arose to add one more class, 'C'. Is it possible to load_model() and then add only one class to the previously saved model, so that the final model has 3 classes ('A', 'B' and 'C'), without having to retrain the whole model for classes 'A' and 'B' again?
Can anyone help?
I have tried this:
I used VGG16 as a base model, popped off its last layer, froze the weights, and added one dense layer (DL2), which I trained to predict the 2 classes.
Then I added one more dense layer on top of DL2, say DL3, froze the weights, and trained with class 'C' only, but now it always predicts class 'C'.
I think you should check this tutorial:
https://www.tensorflow.org/tutorials/image_retraining.
In short:
You cannot take a trained model and simply add new classes.
You should do some additional fine-tuning: maybe not retraining the model from scratch, but at least retraining the classifier (and possibly some additional layers).
You can also simply change the number of output classes in the last layer and freeze the weights of the remaining layers, then retrain only the last layer's weights.
Just use transfer learning and create a new model.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base_model = VGG16(weights='imagenet',
                   include_top=False,
                   input_shape=(150, 150, 3))

# include_top=False already drops VGG16's classifier, so there is nothing to pop().
base_model_layers = Flatten()(base_model.output)
pred = Dense(3, activation='softmax')(base_model_layers)  # 3 classes: 'A', 'B', 'C'
model = Model(inputs=base_model.input, outputs=pred)

# Freeze everything except the new head before training it
for layer in model.layers[:-2]:
    layer.trainable = False
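To finish the idea, a short training sketch on top of the model above; the optimizer, the train_ds dataset, and the epoch count are assumptions for illustration.

# Compile and train only the new 3-class head (everything else is frozen).
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# train_ds is a hypothetical tf.data.Dataset yielding (image, one_hot_label) pairs
# with images of shape (150, 150, 3) and one-hot labels over the 3 classes.
model.fit(train_ds, epochs=5)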