How to set custom weights in layers? - tensorflow

I am looking at how to set custom weights in the layers.
Below is the code I work with:
batch_size = 64
input_dim = 12
units = 64
output_size = 1  # single regression output (MSE loss below)

# Build the RNN model
def build_model(allow_cudnn_kernel=True):
    lstm_layer = keras.layers.RNN(
        keras.layers.LSTMCell(units), input_shape=(None, input_dim))
    model = keras.models.Sequential(
        [
            lstm_layer,
            keras.layers.BatchNormalization(),
            keras.layers.Dense(output_size),
        ]
    )
    return model

model = build_model()
model.compile(
    loss=keras.losses.MeanSquaredError(),
    optimizer="Adam",
    metrics=["accuracy"],
)
model.fit(
    x_train, y_train, validation_data=(x_val, y_val), batch_size=batch_size, epochs=15
)
Model summary (image omitted).
Can anyone show me how to use set_weights in the above code?
Thanks in advance.

You can do it using the set_weights method.
For example, the weights of your LSTM layer can be accessed via model.layers[0]. If your custom weights are, say, in an array named my_weights_matrix, you can set them on the first layer (the LSTM) with the code shown below:
model.layers[0].set_weights([my_weights_matrix])
Note that set_weights expects a list of arrays whose shapes exactly match what layer.get_weights() returns; an LSTM layer holds several weight arrays (kernel, recurrent kernel, bias), so the list must contain one array per weight.
If you don't want the weights to be modified during training, freeze that layer with model.layers[0].trainable = False (and recompile for this to take effect).
Please let me know if you face any other issue and I will be happy to help.
Hope this helps. Happy learning!
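A minimal sketch of the whole pattern, using the model above (the replacement values here are placeholders):
import numpy as np

# Mirror the structure that get_weights() returns: an RNN/LSTM layer
# holds several arrays (kernel, recurrent kernel, bias).
current = model.layers[0].get_weights()
print([w.shape for w in current])  # e.g. [(12, 256), (64, 256), (256,)]

# Build replacement arrays with the same shapes (placeholder values here).
new_weights = [np.ones_like(w) for w in current]
model.layers[0].set_weights(new_weights)

# Optionally freeze the layer so training leaves these weights untouched.
model.layers[0].trainable = False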

Related

retrain a pretrained model after adding layers gives broadcastable shapes error

I'm trying to retrain a model that I loaded and whose layers I froze, after adding 3 new layers that I want to train. In the model.fit stage I'm getting InvalidArgumentError: required broadcastable shapes [Op:Sub].
This is the code I'm using:
# Load saved model and freeze layers
# (tfl is assumed to be an alias, e.g. from tensorflow.keras import layers as tfl)
file_path = r'F:\ku.ac.ae\Intelligent Robotic Manufacturing - Documents\codes\Visuotactile sensor\contact_est\final\m3_130x173_512x16_DATASET_3'
loaded_model = tf.keras.models.load_model(file_path)
tf.keras.backend.set_epsilon(1)

model = tf.keras.models.Sequential(loaded_model.layers[:-3])
for layer in model.layers[:]:
    layer.trainable = False
    #print(layer, layer.trainable)

# Add layers
model.add(tfl.Flatten())
model.add(tfl.Dense(64))
model.add(tfl.Dense(66, activation='softmax'))

for layer in model.layers[:]:
    print(layer, layer.trainable)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='mean_absolute_percentage_error',
    metrics=['mean_absolute_error'],
    #metrics=['accuracy'],
    run_eagerly=True)

file_name = 'freezed_m3_130x173_512x32_dataset3'
and then I run this
history = model.fit(
    x_train, y_train,
    epochs = 512,
    batch_size = 32,
    validation_data = (x_valid, y_valid),
    #callbacks = callbacks_list,
    shuffle=True)
I'm getting the error InvalidArgumentError: required broadcastable shapes [Op:Sub].
Any idea about this? Note that x_train and y_train have exactly the shapes the loaded model was trained on; in fact, they are the training dataset used to train the loaded model. I just want to play with the last layers.
Thanks
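The [Op:Sub] in this error typically comes from the loss subtracting predictions from labels, so a quick way to narrow it down is to compare the new head's output shape with the label shape. A debugging sketch (the printed shapes are illustrative):
# The new head ends in Dense(66, softmax), so the loss needs labels
# broadcastable against (batch, 66).
print(model.output_shape)   # e.g. (None, 66)
print(y_train.shape)        # should be compatible, e.g. (num_samples, 66)
preds = model(x_train[:2])  # run a tiny batch to surface shape errors early
print(preds.shape)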

Deep Learning model (LSTM) predicts same class label

I am trying to solve the Spoken Digit Recognition task using an LSTM model, where the audio files are converted into spectrograms and fed into an LSTM model with global average pooling. Here is its architecture:
tf.keras.backend.clear_session()

# input layer
input_ = Input(shape=(64, 35))
lstm = LSTM(100, activation='tanh', return_sequences=True, kernel_regularizer=l2(0.000001),
            recurrent_initializer='glorot_uniform')(input_)
lstm = GlobalAveragePooling1D(data_format='channels_first')(lstm)
dense = Dense(20, activation='relu', kernel_regularizer=l2(0.000001), kernel_initializer='glorot_uniform')(lstm)
drop = Dropout(0.8)(dense)
dense1 = Dense(25, activation='relu', kernel_regularizer=l2(0.000001), kernel_initializer='he_uniform')(drop)
drop = Dropout(0.95)(dense1)
output = Dense(10, activation='softmax', kernel_regularizer=l2(0.000001), kernel_initializer='glorot_uniform')(drop)
model_2 = Model(inputs=[input_], outputs=output)
model_2.summary()
Its summary is as follows (image omitted).
I need to calculate the F1 score to check the performance of the model. I have implemented a custom callback and also used the TensorFlow Addons F1 score, but I don't get a correct result: for every epoch I get a constant F1 score value.
On further digging, I found that my model predicts the same class label for the entire epoch, whereas it is supposed to predict among the 10 class labels present.
Here are my model.compile and model.fit commands; I have used a TensorFlow Addons metric here:
from tensorflow import keras

opt = keras.optimizers.Adam(0.001, clipnorm=0.8)
# `metric` is defined elsewhere in the asker's notebook (the TensorFlow Addons F1 score)
model_2.compile(loss='categorical_crossentropy', optimizer=opt, metrics=metric)
hist = model_2.fit([X_train_spectrogram],
                   [y_train_converted],
                   validation_data=([X_test_spectrogram], [y_test_converted]),
                   epochs=10,
                   verbose=1,
                   callbacks=[tensorBoard_callbk2, ClearMemory()],
                   # steps_per_epoch = 3,
                   batch_size=32)
Here is what I mean by getting the same prediction: the entire prediction array is filled with the same value.
Why is the model predicting the same class label, and how can I rectify it?
I have tried increasing the number of trainable parameters and increasing/decreasing the batch size, but it doesn't help. If anyone knows, can you please help me out?
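For reference, a per-epoch F1 callback along these lines is one way to cross-check the Addons metric; a minimal sketch, assuming one-hot encoded labels as in the compile call above:
import numpy as np
from sklearn.metrics import f1_score
from tensorflow import keras

class F1Callback(keras.callbacks.Callback):
    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val, self.y_val = x_val, y_val

    def on_epoch_end(self, epoch, logs=None):
        # Collapse softmax outputs and one-hot targets to class indices.
        y_pred = np.argmax(self.model.predict(self.x_val), axis=-1)
        y_true = np.argmax(self.y_val, axis=-1)
        print(f" epoch {epoch}: macro F1 = {f1_score(y_true, y_pred, average='macro'):.4f}")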

why am I getting an error in transfer learning?

I am training a model for Optical Character Recognition of Gujarati Language. The input image is a character image. I have taken 37 classes. Total training images are 22200 (600 per class) and testing images are 5920 (160 per class). My input images are 32x32
Below is my code:
model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max')
base_inputs = model.layers[0].input
base_outputs = model.layers[-1].output  # NOTICE -1 not -2
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37)(prefinal_outputs)
new_model = keras.Model(inputs=base_inputs, outputs=base_outputs)

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=False)
test_datagen = ImageDataGenerator(horizontal_flip=False)

training_set = train_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/train',
                                                 target_size=(32, 32),
                                                 batch_size=64,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/test',
                                            target_size=(32, 32),
                                            batch_size=64,
                                            class_mode='categorical')

new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
new_model.fit_generator(training_set,
                        epochs=25,
                        validation_data=test_set, shuffle=True)
new_model.save('alphanumeric.mod')
I am getting the following output (error screenshot omitted).
Thanks in advance!
First of all, very well written code.
These are some of the things I noticed while going through the code and the tf.keras docs.
What kind of labels do you have? categorical_crossentropy expects one-hot encoded labels (check this). So if your labels are integers, use sparse_categorical_crossentropy.
Similar issue: there was a post where someone was trying to classify into 2 classes and used categorical instead of binary crossentropy, if you want to look at it.
Cheers! Let me know how it goes!
PS: #gerry made a very good point: if the labels are one-hot encoded, use categorical_crossentropy!
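To illustrate the distinction, a minimal sketch (not from the original answer):
# Integer labels, shape (batch,): use the sparse variant.
new_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# One-hot labels, shape (batch, 37): use the categorical variant.
new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])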
The code should be:
model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max', input_shape=(32,32,3))
base_outputs = model.layers[-1].output
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37)(prefinal_outputs)
new_model = keras.Model(inputs=model.input, outputs=final_outputs)
new_model.compile(tf.keras.optimizers.Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
Also, you should use model.fit in the future: model.fit can now work with generators, and model.fit_generator will be deprecated in future versions of TensorFlow. I ran this against your dataset and got accurate results in about 10 epochs. Here is some additional advice. It is best to use an adjustable learning rate; the Keras callback ReduceLROnPlateau makes this easy to do. Documentation is here. Set it to monitor the validation loss. My usage is shown below:
lr_adjust = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1, verbose=1, mode="auto",
                                                 min_delta=0.00001, cooldown=0, min_lr=0)
Also, I recommend using the ModelCheckpoint callback. Documentation is here. Set it up to monitor the validation loss and it will save the weights that achieved the lowest validation loss. My implementation is shown below:
save_loc = r'c:\Temp'  # set this to the path where you want to save the weights
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1, save_best_only=True,
                                                save_weights_only=True, mode='auto', save_freq='epoch', options=None)
callbacks = [checkpoint, lr_adjust]
In model.fit, include callbacks=callbacks. When training is complete, load these saved weights into the model, then save the model. You can use the saved model to make predictions. The code is below:
model.load_weights(save_loc)
model.save(save_loc)
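Putting the pieces together, the training call would look something like this (a sketch based on the snippets above; the epoch count is arbitrary):
history = new_model.fit(training_set,
                        epochs=25,
                        validation_data=test_set,
                        shuffle=True,
                        callbacks=callbacks)  # lr_adjust + checkpoint defined above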

AlreadyExistsError while training a network on colab

I'm trying to train an LSTM network on Google Colab. However, this error occurs:
AlreadyExistsError: Resource __per_step_116/training_4/Adam/gradients/bidirectional_4/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
[[{{node training_4/Adam/gradients/bidirectional_4/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var}}]]
I don't know where the issue could be. This is the model of the network:
sl_model = keras.models.Sequential()
sl_model.add(keras.layers.Embedding(max_index+1, hidden_size, mask_zero=True))
sl_model.add(keras.layers.Bidirectional(keras.layers.LSTM(hidden_size,
             activation='tanh', dropout=0.2, recurrent_dropout=0.2, return_sequences=True)))
sl_model.add(keras.layers.Bidirectional(keras.layers.LSTM(hidden_size,
             activation='tanh', dropout=0.2, recurrent_dropout=0.2, return_sequences=False)))
sl_model.add(keras.layers.Dense(max_length, activation='softsign'))

optimizer = keras.optimizers.Adam()
sl_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['acc'])

batch_size = 128
epochs = 3

cbk = keras.callbacks.TensorBoard("logging/keras_model")
print("\nStarting training...")
sl_model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
             shuffle=True, validation_data=(x_dev, y_dev), callbacks=[cbk])
Thank you so much!
You need to restart your runtime; this happens when you have defined multiple graphs in a single Jupyter (Colaboratory) runtime.
Calling tf.reset_default_graph() may also help, but depending on whether you are using eager execution and how you've defined your sessions, this may or may not work.
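For completeness, a minimal sketch of the graph-reset approach; in TensorFlow 2.x the function lives under the compat module:
import tensorflow as tf

# TF 1.x API:
# tf.reset_default_graph()

# Equivalent call in TF 2.x (for graph-mode code run through the compat layer):
tf.compat.v1.reset_default_graph()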

Keras "return_sequences" option returns 2D array instead of 3D

I'm trying to use a simple character-level Keras model to extract key text from a sentence.
I feed it x_train, a padded sequence of dim (n_examples, 500) representing the entire sentence, and y_train, a padded sequence of dim (n_examples, 100) representing the important text to extract.
I try a simple model like this:
vocab_size = 1000
src_txt_length = 500
sum_txt_length = 100
inputs = Input(shape=(src_txt_length,))
encoder1 = Embedding(vocab_size, 128)(inputs)
encoder2 = LSTM(128)(encoder1)
encoder3 = RepeatVector(sum_txt_length)(encoder2)
decoder1 = LSTM(128, return_sequences=True)(encoder3)
outputs = TimeDistributed(Dense(100, activation='softmax'))(decoder1)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
When I try to train it with the following code:
hist = model.fit(x_train, y_train, verbose=1, validation_data=(x_test, y_test), batch_size=batch_size, epochs=5)
I get the error:
ValueError: Error when checking target: expected time_distributed_27 to have 3 dimensions, but got array with shape (28500, 100)
My question is: I have the return_sequences parameter set to True on the last LSTM layer, but the Dense fully-connected layer is telling me that the input is 2-dimensional.
What am I doing wrong here? Any help would be greatly appreciated!
It isn't complaining about the input to TimeDistributed but about the target: y_train.shape == (n_examples, 100), which isn't 3D. You have a mismatch between predicting a sequence and predicting a single point; in other words, outputs is 3D but y_train is 2D.
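One way to resolve the mismatch (a sketch, assuming each element of y_train is an integer index below 100, matching the Dense(100) output):
import numpy as np

# Option 1: switch to the sparse loss, which accepts integer targets,
# and give y_train a trailing axis so it becomes (n_examples, 100, 1).
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
y_train_3d = np.expand_dims(y_train, axis=-1)
hist = model.fit(x_train, y_train_3d,
                 validation_data=(x_test, np.expand_dims(y_test, -1)),
                 batch_size=batch_size, epochs=5)

# Option 2: keep categorical_crossentropy and one-hot encode the targets to
# (n_examples, 100, 100), e.g. with tf.keras.utils.to_categorical(y_train, 100).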