Retraining a pretrained model after adding layers gives a "required broadcastable shapes" error - tensorflow

I'm trying to train a model that I loaded. I froze its layers, then added 3 new layers that I want to train. At the model.fit stage I get InvalidArgumentError: required broadcastable shapes [Op:Sub]
This is the code I'm using:
# Load Saved Model and freeze layers
file_path = r'F:\ku.ac.ae\Intelligent Robotic Manufacturing - Documents\codes\Visuotactile sensor\contact_est\final\m3_130x173_512x16_DATASET_3'
loaded_model = tf.keras.models.load_model(file_path)
tf.keras.backend.set_epsilon(1)
model = tf.keras.models.Sequential(loaded_model.layers[:-3])
for layer in model.layers[:]:
    layer.trainable = False
    #print(layer, layer.trainable)
# Add Layers
model.add(tfl.Flatten())
model.add(tfl.Dense(64))
model.add(tfl.Dense(66, activation='softmax'))
for layer in model.layers[:]:
    print(layer, layer.trainable)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='mean_absolute_percentage_error',
    metrics=['mean_absolute_error'],
    #metrics=['accuracy'],
    run_eagerly=True)
file_name = 'freezed_m3_130x173_512x32_dataset3'
and then I run this
history = model.fit(
    x_train, y_train,
    epochs = 512,
    batch_size = 32,
    validation_data = (x_valid, y_valid),
    #callbacks = callbacks_list,
    shuffle=True)
I'm getting the error InvalidArgumentError: required broadcastable shapes [Op:Sub]
Any idea what causes this? Note that x_train and y_train have exactly the shapes the loaded model was trained on; in fact, they are the training dataset used to train the loaded model. I just want to play with the last layers.
Thanks
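One thing worth checking, offered as a sketch rather than a diagnosis: the [Op:Sub] in the message is consistent with the loss subtracting predictions from targets, so the error may mean that y_train no longer matches the shape of the new Dense(66) output. The helper below is hypothetical (it is not part of TensorFlow) and only mimics the broadcasting rule. The (32, 2) target shape is an assumed example, since the question does not state the real one.

```python
def broadcastable(shape_a, shape_b):
    """Return True if two shapes can be broadcast together (NumPy/TensorFlow rule)."""
    # Compare dimensions from the right; each pair must be equal or contain a 1.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

# Output of the new head for one batch: (batch_size, 66)
pred_shape = (32, 66)

# Assumed target shapes, for illustration only
print(broadcastable(pred_shape, (32, 66)))  # matching targets
print(broadcastable(pred_shape, (32, 2)))   # mismatched targets: y_true - y_pred would fail
```

In the real code, comparing model.output_shape with y_train.shape before calling fit would show whether this is what is happening.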

Related

I am training a deepfake image detection model, but why the validation accuracy is not changing?

I am training a deepfake image detection model using TensorFlow, but the validation accuracy is stuck at 67. I have tried different optimizers, but it does not improve and only hovers around the same score.
Here are my steps for creating the model.
Importing data from the image folder
Create an ImageDataGenerator object to do some augmentation.
datagen = ImageDataGenerator(
    horizontal_flip=True,
    validation_split=0.2,
    rescale=1./255,
)
Creating the model
image dimension: 299, 299, 3
input_layer = Input(shape = (image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']))
base_model = keras.applications.EfficientNetB5(
    weights='imagenet',
    input_shape=(image_dimensions['height'], image_dimensions['width'], image_dimensions['channels']),
    include_top=False)
base_model.trainable = False
x = base_model(input_layer, training=False)
# Add pooling layer or flatten layer
y = GlobalAveragePooling2D()(x)
y = Dense(512, activation='relu')(y)
y = Dropout(0.4)(y)
y = Dense(256)(y)
# Add final dense layer
output_layer = Dense(1, activation='sigmoid')(y)
model = Model(inputs=input_layer, outputs=output_layer)
Training
efficientNet = EfficientNet(learning_rate = 0.001)
efficientNet.summary()
history = efficientNet.fit(datagen.flow(X_train, y_train, batch_size=64, subset='training'),
                           epochs=10,
                           validation_data=datagen.flow(X_train, y_train, batch_size=64, subset='validation'))
Result
Here is the result of the model training
Is there any way I can fix this problem?

How to provide specific training, validation and test sets in StellarGraph PaddedGraphGenerator

I am trying to train a graph convolutional neural network using the StellarGraph library. I would like to run this example https://stellargraph.readthedocs.io/en/stable/demos/graph-classification/gcn-supervised-graph-classification.html
but without the N-Fold Crossvalidation by providing my own training, validation and test sets. This is the code I am using (taken from this post)
generator = PaddedGraphGenerator(graphs=graphs)
train_gen = generator.flow([x for x in range(0, len(graphs_train))],
                           targets=graphs_train_labels,
                           batch_size=35)
test_gen = generator.flow([x for x in range(len(graphs_train),len(graphs_train) + len(graphs_test))],
                          targets=graphs_test_labels,
                          batch_size=35)
# Stopping criterion
es = EarlyStopping(monitor="val_loss",
                   min_delta=0,
                   patience=20,
                   restore_best_weights=True)
# Model definition
gc_model = GCNSupervisedGraphClassification(layer_sizes=[64, 64],
                                            activations=["relu", "relu"],
                                            generator=generator,
                                            dropout=0.5)
x_inp, x_out = gc_model.in_out_tensors()
predictions = Dense(units=32, activation="relu")(x_out)
predictions = Dense(units=16, activation="relu")(predictions)
predictions = Dense(units=1, activation="sigmoid")(predictions)
# Creating Keras model and preparing it for training
model = Model(inputs=x_inp, outputs=predictions)
model.compile(optimizer=Adam(0.001), loss=binary_crossentropy, metrics=["acc"])
# GNN Training
history = model.fit(train_gen, epochs=10, validation_data=test_gen, verbose=1)
model.fit(x=graphs_train,
          y=graphs_train_labels,
          epochs=10,
          verbose=1,
          callbacks=[es])
# Calculate performance on the validation data
test_metrics = model.evaluate(valid_gen, verbose=1)
valid_acc = test_metrics[model.metrics_names.index("acc")]
print(f"Test Accuracy model = {valid_acc}")
But at the end I am getting this error
ValueError: Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'stellargraph.core.graph.StellarGraph'>"}), <class 'numpy.ndarray'>
What am I missing here? Is it because of the way I created the graphs? In my case, graphs is a list containing StellarGraph objects.
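Problem solved. I was calling model.fit a second time, directly on the raw list of graphs:
model.fit(x=graphs_train,
          y=graphs_train_labels,
          epochs=10,
          verbose=1,
          callbacks=[es])
after the line
history = model.fit(train_gen, epochs=10, validation_data=test_gen, verbose=1)
Keras has no data adapter for a list of StellarGraph objects, so that second call raised the ValueError. Removing it fixed the problem.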
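For reference, the integer lists passed to generator.flow in the question are indices into the single list given to PaddedGraphGenerator. That only works if the list holds the training graphs first and the test graphs after them, which is an assumption here because the question does not show how graphs was built. A minimal sketch of that index bookkeeping, with a hypothetical helper name and made-up set sizes:

```python
def split_indices(n_train, n_test):
    """Index lists for a combined list laid out as [train graphs..., test graphs...]."""
    train_idx = list(range(n_train))
    test_idx = list(range(n_train, n_train + n_test))
    return train_idx, test_idx

# Made-up sizes for illustration
train_idx, test_idx = split_indices(n_train=4, n_test=2)
print(train_idx)  # [0, 1, 2, 3]
print(test_idx)   # [4, 5]
```

A separate validation set would follow the same pattern with a third index range.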

Keras functional api input shape error, lstm layer received 2d instead of 3d shape

I am using the Keras functional API, but I'm getting an error about the input shape of the model:
ValueError: Input 0 is incompatible with layer financial_model: expected shape=(None, 1, 62), found shape=(1, 62)
samples = np.array(samples, dtype=np.float64)
labels = np.array(labels, dtype=np.uint8)
x_train, x_test, y_train, y_test = train_test_split(samples, labels, test_size=0.33,
                                                    random_state=42)
min_max = MinMaxScaler()
x_train = min_max.fit_transform(x_train)
lstm_input = np.expand_dims(x_train, axis=1).shape
inputs = keras.Input(shape=(lstm_input[1],lstm_input[2]))
hidden = keras.layers.LSTM(lstm_input[2], activation='tanh')(inputs)
output = keras.layers.Dense(2)(hidden)
model = keras.Model(inputs=inputs, outputs=output, name="financial_model")
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    metrics=["accuracy"],
)
model.summary()
history = model.fit(x_train, y_train, batch_size=1, epochs=5, validation_split=0.2)
I've learnt from similar questions that the batch size is omitted from the input shape. How do I feed a 3-dimensional input into the LSTM layer when the batch size is left out of the Input object?
I'm not sure of this, but as the error says, the shape of the data you pass to fit is wrong, not the Input layer. The model expects (None, 1, 62), but x_train is still 2D: np.expand_dims is only used to read the shape, and its result is never assigned. Try adding the extra dimension to the data itself before calling fit:
x_train = np.expand_dims(x_train, axis=1)
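A short sketch of the shapes involved, using NumPy only and made-up sizes (10 samples, 62 features). It suggests that the array passed to fit in the question is still 2D, because the result of np.expand_dims is only used to read a shape and is never assigned back:

```python
import numpy as np

# Stand-in for the scaled training data: 10 samples, 62 features (sizes assumed)
x_train = np.zeros((10, 62))

# expand_dims returns a new array; the original stays 2D unless reassigned
lstm_input = np.expand_dims(x_train, axis=1).shape
print(lstm_input)         # (10, 1, 62)
print(x_train.shape)      # (10, 62)

# Reassigning gives the (samples, timesteps, features) layout an LSTM expects
x_train = np.expand_dims(x_train, axis=1)
print(x_train.shape)      # (10, 1, 62)
print(x_train[:1].shape)  # (1, 1, 62), one batch with batch_size=1
```

With the data in this layout, keras.Input(shape=(1, 62)) matches each batch, since the Input shape leaves out only the batch dimension.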

My loss is "nan" and accuracy is " 0.0000e+00 " in Transfer learning: InceptionV3

I am working on transfer learning. My use case is to classify images into two categories using InceptionV3. When training my model, I get nan as the loss and 0.0000e+00 as the accuracy in every epoch. I am using 20 epochs because my dataset is small: 1000 images for training and 100 for testing, with a batch size of 5.
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
x = Dense(32, activation='relu')(x)
# and a logistic layer -- we have 2 classes
predictions = Dense(1, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
    layer.trainable = False
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
    'C:/Users/Desktop/Transfer/train/',
    target_size=(64, 64),
    batch_size=5,
    class_mode='binary')
test_set = test_datagen.flow_from_directory(
    'C:/Users/Desktop/Transfer/test/',
    target_size=(64, 64),
    batch_size=5,
    class_mode='binary')
model.fit_generator(
    training_set,
    steps_per_epoch=1000,
    epochs=20,
    validation_data=test_set,
    validation_steps=100)
It sounds like your gradient is exploding. There are a few things to check:
Check that your input is generated correctly. For example, use the save_to_dir parameter of flow_from_directory to inspect the generated images.
Since you have a batch size of 5, change steps_per_epoch from 1000 to 1000/5 = 200.
Use sigmoid activation instead of softmax.
Set a lower learning rate in Adam; to do that, create the optimizer separately, like adam = Adam(0.0001), and pass it in model.compile(..., optimizer=adam).
Try VGG16 instead of InceptionV3.
Let us know once you have tried all of the above.
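The steps_per_epoch arithmetic from the list above, as a small sketch. The helper name is made up; the counts are the ones given in the question (1000 training and 100 test images, batch size 5). Rounding up is an assumption that covers a final partial batch.

```python
import math

def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

steps_per_epoch = steps_for(1000, 5)   # 1000 training images, batch size 5
validation_steps = steps_for(100, 5)   # 100 test images, batch size 5
print(steps_per_epoch, validation_steps)  # 200 20
```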
Using softmax as the activation does not make sense with a single output unit. The purpose of softmax is to make the output values sum to 1, so a single value is normalized by itself and always equals 1. I believe that at some point you got 0 as the predicted value, which resulted in a division by zero and a NaN loss.
You should either change the number of output units to 2 by setting:
predictions = Dense(2, activation='softmax')(x)
class_mode='categorical' in flow_from_directory
loss="categorical_crossentropy"
or use the sigmoid activation function for the last layer.
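A minimal pure-Python sketch of the softmax point above. It is a hand-written softmax, not the Keras implementation, and the logit values are arbitrary:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One output unit: the result is 1.0 regardless of the logit
for logit in (-5.0, 0.0, 3.2):
    print(softmax([logit]))  # [1.0]

# Two output units: the values depend on the logits and sum to 1
print(softmax([0.0, 0.0]))  # [0.5, 0.5]
```

With one unit the network output cannot move, so no weight update can change the prediction.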

How to use batch trained model, to predict on single input?

I have an RNN model that has been trained on a Dataset:
train = tf.data.Dataset.from_tensor_slices((data_x[:train_size],
                                            data_y[:train_size])).batch(batch_size).repeat()
model:
model = tf.keras.Sequential()
model.add(tf.keras.layers.GRU(units=lstm_num_units,
                              return_sequences=True,
                              kernel_initializer='random_uniform',
                              recurrent_initializer='random_uniform',
                              bias_initializer='random_uniform',
                              batch_size=batch_size,
                              input_shape = [seq_len, num_features]))
model.add(tf.keras.layers.LSTM(units=lstm_num_units,
                               batch_size=batch_size,
                               return_sequences=True,
                               input_shape = [seq_len, num_features]))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=dence_units))
model.add(tf.keras.layers.Dropout(drop_flat))
model.add(tf.keras.layers.Dense(units=out_units))
model.add(tf.keras.layers.Softmax())
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.train.RMSPropOptimizer(opt),
              metrics=['accuracy'])
model.fit(train, epochs=EPOCHS,
          steps_per_epoch=repeat_size_train,
          validation_data=validate,
          validation_steps=repeat_size_validate,
          verbose=1,
          shuffle=True,
          callbacks=[tensorboard, cp_callback])
I need to predict on a single input of length seq_len, but it looks like my input has to have the full batch size:
ar = np.random.randint(98, size=[batch_size, seq_len])
ar = np.reshape(ar, [batch_size, seq_len, 1])
prediction = model.m.predict(ar)
Is there a way to make it work on a single input of shape [1, seq_len, 1]?
Yes. Simply rebuild the model without a batch size in the first layer, then copy the weights from the old model:
newModel.set_weights(oldModel.get_weights())
A fixed batch size only matters for stateful=True models, where it keeps the states consistent between batches.
Otherwise, the batch size makes no mathematical difference to the result.
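To illustrate the last point, a minimal NumPy sketch. It uses a plain dense layer as a stand-in for the GRU/LSTM stack (an assumption made to keep it short) and random made-up weights. It shows that the same weights give the same output for a sample whether it is fed alone or inside a larger batch:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # weights: 4 features -> 3 outputs
b = rng.normal(size=3)

def dense(x):
    """Forward pass of a dense layer; the batch dimension can be any size."""
    return x @ W + b

batch = rng.normal(size=(32, 4))   # a batch of 32 inputs
single = batch[:1]                 # the first input as a batch of size 1

# Same weights, different batch sizes, same result for the shared sample
print(np.allclose(dense(batch)[0], dense(single)[0]))  # True
```

The weights have no batch dimension, which is why set_weights can move them between two models that differ only in batch size.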