Heatmap on custom model with transfer learning - tensorflow

While trying to get a Grad-CAM for my custom model, I ran into a problem. I am trying to fine-tune a model for image classification, using resnet50. My model is defined in the following way:
IMG_SHAPE = (img_height,img_width) + (3,)
base_model = tf.keras.applications.ResNet50(input_shape=IMG_SHAPE, include_top=False, weights='imagenet')
preprocess_input = tf.keras.applications.resnet50.preprocess_input
and finally,
input_layer = tf.keras.Input(shape=(img_height, img_width, 3),name="input_layer")
x = preprocess_input(input_layer)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D(name="global_average_layer")(x)
x = tf.keras.layers.Dropout(0.2,name="dropout_layer")(x)
x = tf.keras.layers.Dense(4,name="training_layer")(x)
outputs = tf.keras.layers.Dense(4,name="prediction_layer")(x)
model = tf.keras.Model(input_layer, outputs)
Now, I was following the tutorial at https://keras.io/examples/vision/grad_cam/ in order to get a heatmap. But, while the tutorial recommends using model.summary() to get the last convolutional layer and classifier layers, I am not sure how to do it for my model.
If I run model.summary(), I get:
Layer (type) Output Shape Param # Connected to
input_layer (InputLayer) [(None, 224, 224, 3)] 0
tf.operators.getitem_11 (None, 224, 224, 3) 0
tf.nn.bias_add_11 (TFOpLambd [(None, 224, 224, 3)] 0
resnet50 (Functional) (None, 7, 7, 2048) 23587712
global_average (GlobalAverag (None, 2048) 0
dropout_layer (Dropout) (None, 2048) 0
hidden_layer (Dense) (None, 4) 8196
predict_layer (Dense) (None, 4) 20
However, if I run base_model.summary(), I get:
Layer (type) Output Shape Param # Connected to
input_29 (InputLayer) [(None, 224, 224, 3) 0
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 input_29[0][0]
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 conv1_pad[0][0]
conv1_bn (BatchNormalization) (None, 112, 112, 64) 256 conv1_conv[0][0]
... ... ... ...
conv5_block3_3_bn (BatchNormali (None, 7, 7, 2048) 8192 conv5_block3_3_conv[0][0]
conv5_block3_add (Add) (None, 7, 7, 2048) 0 conv5_block2_out[0][0]
conv5_block3_out (Activation) (None, 7, 7, 2048) 0 conv5_block3_add[0][0]
If I follow the tutorial using 'resnet50' as the last convolutional layer, I get the following error:
Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='input_29'), name='input_29', description="created by layer 'input_29'") at layer "conv1_pad". The following previous layers were accessed without issue: []
but if I use 'conv5_block3_out', the program cannot find that layer in the model. How can I access the layers that seem to be hidden inside the resnet50 layer?

I managed to find a solution to this problem. When defining make_gradcam_heatmap, I added the line
input_layer = model.get_layer('resnet50').get_layer('input_1').input
and changed the next line to
last_conv_layer = model.get_layer(last_conv_layer_name).get_layer("conv5_block3_out")
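For reference, once the activations of conv5_block3_out and the gradients of the class score with respect to them are available, the remaining heatmap arithmetic can be sketched in plain NumPy. This is only an illustration under assumptions: conv_output and grads below are random stand-ins with the (7, 7, 2048) shape shown in base_model.summary(), not real model outputs, and gradcam_heatmap is a hypothetical helper name.

```python
import numpy as np

def gradcam_heatmap(conv_output, grads):
    """Combine feature maps and gradients into a normalized Grad-CAM heatmap.

    conv_output: (H, W, C) activations of the last conv layer for one image.
    grads:       (H, W, C) gradients of the class score w.r.t. conv_output.
    """
    # One importance weight per channel: mean gradient over the spatial axes.
    pooled_grads = grads.mean(axis=(0, 1))            # shape (C,)
    # Weighted sum of the channels gives a single (H, W) map.
    heatmap = conv_output @ pooled_grads              # shape (H, W)
    # Keep only positive evidence and scale to [0, 1].
    heatmap = np.maximum(heatmap, 0.0)
    max_value = heatmap.max()
    if max_value > 0:
        heatmap = heatmap / max_value
    return heatmap

# Random stand-ins for illustration only (not real activations or gradients).
rng = np.random.default_rng(0)
conv_output = rng.random((7, 7, 2048))
grads = rng.standard_normal((7, 7, 2048))
heatmap = gradcam_heatmap(conv_output, grads)
print(heatmap.shape)
```

The resulting 7 x 7 map is what gets resized and overlaid on the input image.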


Keras model shape incompatible / ValueError: Shapes (None, 3) and (None, 3, 3) are incompatible

I'm trying to train my keras model but shapes are incompatible.
The error says
ValueError: Shapes (None, 3) and (None, 3, 3) are incompatible
My training set's shape is (2000, 3, 768) and the labels' shape is (2000, 3).
What is wrong here?
Model definition and fit code
input_shape = x_train.shape[1:]
model = my_dnn(input_shape, 3)
model.fit(x_train, y_train, epochs=25, verbose=1)
Model code
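def my_dnn(input, num_classes):
    model = Sequential()
    # ... (layer definitions omitted in the original post)
    model.compile(loss='categorical_crossentropy',
                  # ... (remaining arguments omitted in the original post)
                  )
    return model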
In addition to what's said, it seems you are carrying the second dimension of the input data until the end of the model. So your model summary is something like this:
Layer (type) Output Shape Param #
dense_1 (Dense) (None, 3, 1024) 787456
activation_1 (Activation) (None, 3, 1024) 0
dropout_1 (Dropout) (None, 3, 1024) 0
dense_2 (Dense) (None, 3, 512) 524800
activation_2 (Activation) (None, 3, 512) 0
dense_3 (Dense) (None, 3, 225) 115425
activation_3 (Activation) (None, 3, 225) 0
dense_4 (Dense) (None, 3, 100) 22600
activation_4 (Activation) (None, 3, 100) 0
dense_5 (Dense) (None, 3, 3) 303
activation_5 (Activation) (None, 3, 3) 0
Total params: 1,450,584
Trainable params: 1,450,584
Non-trainable params: 0
As you can see, the output shape of the model (None, 3, 3) is not compatible with the label's shape (None, 3), and at some point, you need to use a Flatten layer.
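To see why the extra dimension survives, here is a small NumPy sketch of what a Dense layer does to a 3D input. It is an illustration under assumptions: a single matrix multiplication with random weights stands in for the whole Dense stack, and 8 samples are used instead of 2000.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((8, 3, 768))          # 8 samples instead of 2000, same per-sample shape

# A Dense layer multiplies along the last axis only, so axis 1 survives.
w_last_axis = rng.random((768, 3))
out_3d = x @ w_last_axis
print(out_3d.shape)                  # (8, 3, 3)

# Flattening first merges the 3 x 768 values of each sample into one vector.
x_flat = x.reshape(len(x), -1)       # (8, 2304)
w_flat = rng.random((3 * 768, 3))
out_2d = x_flat @ w_flat
print(out_2d.shape)                  # (8, 3)
```

Only the flattened variant produces an output that lines up with labels of shape (None, 3).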
There are two possible reasons:
Your problem is multi-class classification, hence you need softmax instead of sigmoid, with accuracy or CategoricalAccuracy() as the metric.
Your problem is multi-label classification, hence you need binary_crossentropy and tf.keras.metrics.BinaryAccuracy().
Depending on how your dataset is built and the task you are trying to solve, you need to opt for one of those.
For case 1, ensure your labels are one-hot encoded (OHE).
Also, as Marco Cerliani and Amir point out in the comments below, the model output needs to be 2D rather than 3D: either preprocess the data accordingly before feeding it to the network, or add a Flatten() layer at some point (probably before the final Dense()).
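The difference between the two cases can be made concrete with a small NumPy sketch. The logits and labels are made up for illustration, and the helper functions are plain re-implementations, not the Keras ones.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(logits):
    return 1.0 / (1.0 + np.exp(-logits))

logits = np.array([[2.0, 0.5, -1.0]])   # made-up scores for 3 classes

# Case 1, multi-class: exactly one class is true, probabilities sum to 1.
y_one_hot = np.array([[1.0, 0.0, 0.0]])
p_softmax = softmax(logits)
categorical_ce = -np.sum(y_one_hot * np.log(p_softmax), axis=-1)

# Case 2, multi-label: each class is an independent yes/no decision.
y_multi_hot = np.array([[1.0, 1.0, 0.0]])
p_sigmoid = sigmoid(logits)
binary_ce = -np.mean(
    y_multi_hot * np.log(p_sigmoid) + (1.0 - y_multi_hot) * np.log(1.0 - p_sigmoid),
    axis=-1,
)
print(p_softmax.sum(), categorical_ce, binary_ce)
```

In case 1 the three probabilities compete with each other, while in case 2 they are scored separately, which is why the loss and metric have to match the way the labels were built.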

How to feed data into Multiple dense(2,)?

Model :
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras import Model
vgg_base =tf.keras.applications.VGG16(include_top=False, weights='imagenet')
x = vgg_base.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
cl1 = Dense(2, activation = 'softmax',name='cl1')(x)
cl2 = Dense(2, activation = 'softmax',name='cl2')(x)
model = Model(inputs=vgg_base.input, outputs= [cl1,cl2])
model.compile(optimizer='adam', loss='categorical_crossentropy' , metrics=['categorical_accuracy'])
Here I have two Dense(2) output layers (cl1 and cl2). I tried to feed data into model.fit() using the following custom data function:
def func(img_batch, lb_batch):
lbs = tf.one_hot(lb_batch,depth=2)
return img_batch, lbs
train_Data = train_ds.map(func)
But I get the following error:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), for inputs ['cl2', 'cl2'] but instead got the following list of 1 arrays: [<tf.Tensor 'args_1:0' shape=(None, 2, 2) dtype=float32>]...
The model expects two separate arrays, but the custom function returns a single array of shape (2, 2) per sample. So I have to convert the single tf.Tensor of shape (2, 2) into two separate tf.Tensor arrays. How can I do this?
Model.summary() output :
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) [(None, None, None, 0
block1_conv1 (Conv2D) (None, None, None, 6 1792 input_1[0][0]
block1_conv2 (Conv2D) (None, None, None, 6 36928 block1_conv1[0][0]
block1_pool (MaxPooling2D) (None, None, None, 6 0 block1_conv2[0][0]
block2_conv1 (Conv2D) (None, None, None, 1 73856 block1_pool[0][0]
block2_conv2 (Conv2D) (None, None, None, 1 147584 block2_conv1[0][0]
block2_pool (MaxPooling2D) (None, None, None, 1 0 block2_conv2[0][0]
block3_conv1 (Conv2D) (None, None, None, 2 295168 block2_pool[0][0]
block3_conv2 (Conv2D) (None, None, None, 2 590080 block3_conv1[0][0]
block3_conv3 (Conv2D) (None, None, None, 2 590080 block3_conv2[0][0]
block3_pool (MaxPooling2D) (None, None, None, 2 0 block3_conv3[0][0]
block4_conv1 (Conv2D) (None, None, None, 5 1180160 block3_pool[0][0]
block4_conv2 (Conv2D) (None, None, None, 5 2359808 block4_conv1[0][0]
block4_conv3 (Conv2D) (None, None, None, 5 2359808 block4_conv2[0][0]
block4_pool (MaxPooling2D) (None, None, None, 5 0 block4_conv3[0][0]
block5_conv1 (Conv2D) (None, None, None, 5 2359808 block4_pool[0][0]
block5_conv2 (Conv2D) (None, None, None, 5 2359808 block5_conv1[0][0]
block5_conv3 (Conv2D) (None, None, None, 5 2359808 block5_conv2[0][0]
block5_pool (MaxPooling2D) (None, None, None, 5 0 block5_conv3[0][0]
global_average_pooling2d (Globa (None, 512) 0 block5_pool[0][0]
cl1 (Dense) (None, 2) 1026 global_average_pooling2d[0][0]
cl2 (Dense) (None, 2) 1026 global_average_pooling2d[0][0]
Total params: 14,716,740
Trainable params: 14,716,740
Non-trainable params: 0
You can use the tf.split function to split the one label tensor into two separate tensors, one per output (cl1 and cl2). The model has a single image input, so only the labels need splitting; the image batch is passed through unchanged.
lbs_split = tf.split(lbs, 2, axis=1)
This returns a list containing two tensors of shape (None, 1, 2). To get rid of the extra dimension of size 1, you can use tf.squeeze.
lbs_split_squeezed = [tf.squeeze(x, axis=1) for x in lbs_split]
After this, each element in the list has shape (None, 2). Your entire function may look like this:
def func(img_batch, lb_batch):
    lbs = tf.one_hot(lb_batch, depth=2)   # shape (None, 2, 2)
    lbs_split = tf.split(lbs, 2, axis=1)
    lbs_split_squeezed = [tf.squeeze(x, axis=1) for x in lbs_split]
    # Return a tuple, not a list: tf.data packs lists back into a single tensor.
    return img_batch, tuple(lbs_split_squeezed)
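The same split-and-squeeze steps can be tried in NumPy on a label tensor shaped like the (None, 2, 2) one from the error message, which has two slices along axis 1, one per output head. This is a sketch with made-up labels; np.split and np.squeeze stand in for tf.split and tf.squeeze.

```python
import numpy as np

# Made-up integer labels: 4 samples, 2 binary labels each (one per output head).
lb_batch = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])

# One-hot encode with depth 2, giving shape (4, 2, 2) like the tensor in the error.
lbs = np.eye(2, dtype=np.float32)[lb_batch]

# Split along axis 1 into one piece per head, then drop the axis of size 1.
pieces = np.split(lbs, 2, axis=1)                    # two arrays of shape (4, 1, 2)
cl1_labels, cl2_labels = [np.squeeze(p, axis=1) for p in pieces]
print(cl1_labels.shape, cl2_labels.shape)            # (4, 2) (4, 2)
```

Each resulting array matches the (None, 2) output shape of one Dense(2) head.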

ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)

I am trying to train a neural network on a Semantic Role Labeling task (a text classification task). The dataset consists of sentences on which the neural network has to be trained to predict a class for each word. Apart from using the embedding matrix, I am also using other features (meta_data_features). The number of classes in Y_train is 61. The number 3306 is the number of sentences in my dataset (the size of my dataset). MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
The snapshot of model summary is:
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) (None, 67) 0
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
meta_input (InputLayer) (None, 67) 0
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train constitutes [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get the error, "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody can help me in resolving the error. Thanks!
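One way to read the error, offered as a sketch rather than a definitive diagnosis: dense_2 outputs (None, 61), which is one prediction per sentence, while padded_verbs holds one label per word, (3306, 67, 1). The zero-filled NumPy stand-ins below only compare the shapes, using one batch of 32 sentences instead of all 3306. The per-word output shape is an assumption about the intended task.

```python
import numpy as np

batch, max_len, num_classes = 32, 67, 61   # one batch of 32 instead of 3306 sentences

# What the model in the question produces: one class distribution per sentence.
model_output = np.zeros((batch, num_classes))

# What is passed as the target: one integer class id per word.
padded_verbs = np.zeros((batch, max_len, 1), dtype=np.int64)

print(model_output.ndim, padded_verbs.ndim)   # 2 3

# A per-word classifier would instead need one distribution per word.
per_word_output = np.zeros((batch, max_len, num_classes))

# With such an output, integer targets of shape (N, 67, 1) line up on the first two axes.
print(per_word_output.shape[:2] == padded_verbs.shape[:2])   # True
```

In Keras, a per-word output usually means setting return_sequences=True on the recurrent layer and using a sparse categorical loss; the meta features would then also need a per-word shape before concatenation. Treat this as a direction to investigate, not a verified fix.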

Sci-kit Learn Confusion Matrix: Found input variables with inconsistent numbers of samples

I'm trying to plot a confusion matrix between the predicted test labels and the actual ones, but I'm getting this error
ValueError: Found input variables with inconsistent numbers of samples: [1263, 12630]
Dataset: GTSRB
Code used
Image augmentation
train_datagen = ImageDataGenerator(rescale=1./255,
                                   zoom_range=[0.9, 1.25],
                                   brightness_range=[0.5, 1.5])
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator and test_generator
batch_size = 10
train_generator = train_datagen.flow_from_directory(
target_size=(224, 224),
test_generator = test_datagen.flow_from_directory(
target_size=(224, 224),
Output of that code
Found 39209 images belonging to 43 classes.
Found 12630 images belonging to 43 classes.
Then, I used a VGG-16 model and replaced the last Dense layer with a Dense(43, activation='softmax')
Model summary
Layer (type) Output Shape Param #
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
fc1 (Dense) (None, 4096) 102764544
fc2 (Dense) (None, 4096) 16781312
predictions (Dense) (None, 1000) 4097000
dense_1 (Dense) (None, 43) 43043
Total params: 138,400,587
Trainable params: 43,043
Non-trainable params: 138,357,544
Compile the model
my_sgd = SGD(lr=0.01)
Train the model
predictions = model.predict_generator(test_generator, steps=STEP_SIZE_TEST, verbose=1)
1263/1263 [==============================] - 229s 181ms/step
Predictions shape
(12630, 43)
Getting the test_data and test_labels
test_data = []
test_labels = []
batch_index = 0
while batch_index <= test_generator.batch_index:
    data = next(test_generator)
    test_data.append(data[0])     # batch of images
    test_labels.append(data[1])   # batch of one-hot labels
    batch_index = batch_index + 1
test_data_array = np.asarray(test_data)
test_labels_array = np.asarray(test_labels)
Shape of test_data_array and test_labels_array
(1263, 10, 224, 224, 3)
(1263, 10, 43)
Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(test_labels_array, predictions)
I get the output
ValueError: Found input variables with inconsistent numbers of samples: [1263, 12630]
I understand that this error is because the test_labels_array size isn't equal to the predictions; 1263 and 12630 respectively, but I don't really know what I'm doing wrong.
Any help would be much appreciated.
PS: If anyone has any tips on how to increase the training accuracy while we're at it, that would be brilliant.
You should reshape test_data_array and test_labels_array as follows:
data_count, batch_count, w, h, c = test_data_array.shape
test_data_array=np.reshape(test_data_array, (data_count*batch_count, w, h, c))
test_labels_array = np.reshape(test_labels_array , (data_count*batch_count, -1))
The cause is the way you append the results of test_generator. The first call to test_generator yields 10 images of shape (224, 224, 3), and the next call yields another 10. At that point you should have 20 images of shape (224, 224, 3), but appending whole batches leaves you with 2 items of shape (10, 224, 224, 3), which is not what you expect.
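A scaled-down NumPy sketch of the reshape, under stated assumptions: 5 batches stand in for the 1263 real ones, and the labels and predictions are random stand-ins. It also takes an argmax before counting, because scikit-learn's confusion_matrix expects class labels rather than one-hot rows or probability rows. The np.add.at count below is a plain stand-in for sklearn.metrics.confusion_matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
num_batches, batch_size, num_classes = 5, 10, 43   # 5 batches instead of 1263

# Stand-ins: one-hot labels collected batch by batch, and softmax-like predictions.
label_ids = rng.integers(0, num_classes, size=(num_batches, batch_size))
test_labels_array = np.eye(num_classes)[label_ids]            # (5, 10, 43)
predictions = rng.random((num_batches * batch_size, num_classes))

# Merge the batch axis into the sample axis.
flat_labels = test_labels_array.reshape(-1, num_classes)      # (50, 43)

# Convert one-hot rows and probability rows to class indices.
y_true = flat_labels.argmax(axis=1)
y_pred = predictions.argmax(axis=1)

# Count (true, predicted) pairs; rows are true classes, columns are predicted classes.
cm = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)
print(flat_labels.shape, cm.sum())
```

With the real arrays, y_true and y_pred computed this way can be passed to confusion_matrix directly.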

Keras - Freezing A Model And Then Adding Trainable Layers

I am taking a CNN model that is pretrained, and then trying to implement a CNN-LSTM with parallel CNNs all with the same weights from the pretraining.
# load in CNN
weightsfile = 'final_weights.h5'
modelfile = '2dcnn_model.json'
# load model from json
json_file = open(modelfile, 'r')
loaded_model_json = json_file.read()
fixed_cnn_model = keras.models.model_from_json(loaded_model_json)
# remove the last 2 dense FC layers and freeze it
fixed_cnn_model.trainable = False
This will produce the summary:
Layer (type) Output Shape Param #
input_1 (InputLayer) (None, 32, 32, 4) 0
conv2d_1 (Conv2D) (None, 30, 30, 32) 1184
conv2d_2 (Conv2D) (None, 28, 28, 32) 9248
conv2d_3 (Conv2D) (None, 26, 26, 32) 9248
conv2d_4 (Conv2D) (None, 24, 24, 32) 9248
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32) 0
conv2d_5 (Conv2D) (None, 10, 10, 64) 18496
conv2d_6 (Conv2D) (None, 8, 8, 64) 36928
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64) 0
conv2d_7 (Conv2D) (None, 2, 2, 128) 73856
max_pooling2d_3 (MaxPooling2 (None, 1, 1, 128) 0
flatten_1 (Flatten) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 512) 66048
Total params: 224,256
Trainable params: 0
Non-trainable params: 224,256
Now I will add layers to it, compile it, and show that all the non-trainable parameters become trainable.
# create sequential model to get this all before the LSTM
# initialize loss function, SGD optimizer and metrics
loss = 'binary_crossentropy'
optimizer = keras.optimizers.Adam(lr=1e-4,
metrics = ['accuracy']
currmodel = Sequential()
currmodel.add(TimeDistributed(fixed_cnn_model, input_shape=(num_timewins, imsize, imsize, n_colors)))
currmodel.add(Dense(1024, activation='relu'))
currmodel.add(Dense(2, activation='softmax'))
currmodel = Model(inputs=currmodel.input, outputs = currmodel.output)
config = currmodel.compile(optimizer=optimizer, loss=loss, metrics=metrics)
Layer (type) Output Shape Param #
time_distributed_3_input (In (None, 5, 32, 32, 4) 0
time_distributed_3 (TimeDist (None, 5, 512) 224256
lstm_3 (LSTM) (None, 50) 112600
dropout_1 (Dropout) (None, 50) 0
dense_1 (Dense) (None, 1024) 52224
dropout_2 (Dropout) (None, 1024) 0
dense_2 (Dense) (None, 2) 2050
Total params: 391,130
Trainable params: 391,130
Non-trainable params: 0
How am I supposed to freeze the layers in this case? I am almost 100% positive that I had working code in this format in an earlier keras version. It seems like this is the right direction, since you define a model and declare certain layers trainable, or not.
Then you add layers, which are by default trainable. However, this seems to convert all the layers to trainable.
Try adding:
for layer in currmodel.layers[:5]:
    layer.trainable = False
First, print the layer indices in your network:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name)
Now check which layers are trainable and which are not:
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)
Now you can set the 'trainable' attribute on the layers you want. Say you want to train only the last 2 layers out of a total of 6 (numbering starts from 0); then you can write something like this:
for layer in model.layers[:5]:
    layer.trainable = False
for layer in model.layers[5:]:
    layer.trainable = True
To cross-check, print again and you should see the desired settings.
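The slice arithmetic in the last step can be checked without Keras. The sketch below reads "total 6, numbering from 0" as indices 0 to 6 (an assumption), and FakeLayer is a hypothetical stand-in that only tracks a trainable flag. The layer names are copied from the summary in the question.

```python
class FakeLayer:
    """Hypothetical stand-in for a Keras layer; only tracks a trainable flag."""

    def __init__(self, name):
        self.name = name
        self.trainable = True

# Seven layers, indexed 0 to 6, named after the summary in the question.
layers = [FakeLayer(name) for name in (
    "time_distributed_3_input", "time_distributed_3", "lstm_3",
    "dropout_1", "dense_1", "dropout_2", "dense_2",
)]

for layer in layers[:5]:   # indices 0..4 are frozen
    layer.trainable = False
for layer in layers[5:]:   # indices 5..6 stay trainable
    layer.trainable = True

for i, layer in enumerate(layers):
    print(i, layer.name, layer.trainable)
```

Note that this only illustrates which indices each slice touches. For the model in the question, the slice boundary would need to be chosen so that only the pretrained CNN wrapper is frozen.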