I am taking a CNN model that is pretrained, and then trying to implement a CNN-LSTM with parallel CNNs all with the same weights from the pretraining.
# load in CNN
weightsfile = 'final_weights.h5'
modelfile = '2dcnn_model.json'
# load model from json
json_file = open(modelfile, 'r')
loaded_model_json = json_file.read()
fixed_cnn_model = keras.models.model_from_json(loaded_model_json)
# remove the last 2 dense FC layers and freeze it
fixed_cnn_model.trainable = False
This will produce the summary:
Layer (type) Output Shape Param #
input_1 (InputLayer) (None, 32, 32, 4) 0
conv2d_1 (Conv2D) (None, 30, 30, 32) 1184
conv2d_2 (Conv2D) (None, 28, 28, 32) 9248
conv2d_3 (Conv2D) (None, 26, 26, 32) 9248
conv2d_4 (Conv2D) (None, 24, 24, 32) 9248
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32) 0
conv2d_5 (Conv2D) (None, 10, 10, 64) 18496
conv2d_6 (Conv2D) (None, 8, 8, 64) 36928
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64) 0
conv2d_7 (Conv2D) (None, 2, 2, 128) 73856
max_pooling2d_3 (MaxPooling2 (None, 1, 1, 128) 0
flatten_1 (Flatten) (None, 128) 0
dropout_1 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 512) 66048
Total params: 224,256
Trainable params: 0
Non-trainable params: 224,256
Now, I will add to it and compile and show that the non-trainable all become trainable.
# create sequential model to get this all before the LSTM
# initialize loss function, SGD optimizer and metrics
loss = 'binary_crossentropy'
optimizer = keras.optimizers.Adam(lr=1e-4,
metrics = ['accuracy']
currmodel = Sequential()
currmodel.add(TimeDistributed(fixed_cnn_model, input_shape=(num_timewins, imsize, imsize, n_colors)))
currmodel.add(Dense(1024, activation='relu'))
currmodel.add(Dense(2, activation='softmax'))
currmodel = Model(inputs=currmodel.input, outputs = currmodel.output)
config = currmodel.compile(optimizer=optimizer, loss=loss, metrics=metrics)
Layer (type) Output Shape Param #
time_distributed_3_input (In (None, 5, 32, 32, 4) 0
time_distributed_3 (TimeDist (None, 5, 512) 224256
lstm_3 (LSTM) (None, 50) 112600
dropout_1 (Dropout) (None, 50) 0
dense_1 (Dense) (None, 1024) 52224
dropout_2 (Dropout) (None, 1024) 0
dense_2 (Dense) (None, 2) 2050
Total params: 391,130
Trainable params: 391,130
Non-trainable params: 0
How am I supposed to freeze the layers in this case? I am almost 100% positive that I had working code in this format in an earlier keras version. It seems like this is the right direction, since you define a model and declare certain layers trainable, or not.
Then you add layers, which are by default trainable. However, this seems to convert all the layers to trainable.
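As a sanity check on the two summaries above, the parameter counts can be reproduced by hand. This is a minimal sketch using the standard formulas for Conv2D, Dense, and LSTM layers; the 3x3 kernel size is an assumption inferred from the output shapes (each convolution shrinks the feature map by 2), and the helper names are hypothetical.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # kernel*kernel*in_ch weights per filter, plus one bias per filter
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    # one weight per input per unit, plus one bias per unit
    return (n_in + 1) * n_out

def lstm_params(n_in, units):
    # four gates, each with input weights, recurrent weights, and a bias
    return 4 * ((n_in + units) * units + units)

# (in_channels, out_channels) for conv2d_1 .. conv2d_7 in the first summary
conv_channels = [(4, 32), (32, 32), (32, 32), (32, 32), (32, 64), (64, 64), (64, 128)]
cnn_total = sum(conv2d_params(3, i, o) for i, o in conv_channels) + dense_params(128, 512)

# lstm_3, dense_1 and dense_2 in the second summary
head_total = lstm_params(512, 50) + dense_params(50, 1024) + dense_params(1024, 2)

print("pretrained CNN:", cnn_total)
print("LSTM + dense head:", head_total)
print("whole model:", cnn_total + head_total)
```

If the CNN weights had stayed frozen, the combined model would be expected to report 166,874 trainable and 224,256 non-trainable parameters, rather than 391,130 trainable.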

Try adding:
for layer in currmodel.layers[:5]:
    layer.trainable = False

First, print the layer indices in your network:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name)
Now check which layers are trainable and which are not:
for i, layer in enumerate(currmodel.layers):
    print(i, layer.name, layer.trainable)
Now you can set the 'trainable' attribute on whichever layers you want. Say you want to train only the last 2 layers, with the last layer at index 6 (numbering starts from 0); then you can write something like this:
for layer in currmodel.layers[:5]:
    layer.trainable = False
for layer in currmodel.layers[5:]:
    layer.trainable = True
To cross-check, print the layers again and you should see the desired settings.
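To see which layers each slice covers, here is a plain-Python sketch that uses the layer names from the summary in the question as stand-ins for the model's layer list (assuming the list follows the summary order, input layer included). No Keras is needed to check the indexing.

```python
# Layer names copied from the combined model's summary, in order
layer_names = [
    "time_distributed_3_input",
    "time_distributed_3",
    "lstm_3",
    "dropout_1",
    "dense_1",
    "dropout_2",
    "dense_2",
]

frozen = layer_names[:5]      # indices 0..4
trainable = layer_names[5:]   # indices 5..6

for i, name in enumerate(layer_names):
    state = "frozen" if name in frozen else "trainable"
    print(i, name, state)
```

Note that a [:5] slice also freezes lstm_3 and dense_1. To freeze only the pretrained CNN, the slice would be [:2] under the same ordering assumption. In Keras, a change to trainable takes effect once the model is compiled again.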


Employing LeakyReLU as the activation function of my CNN model causes 'nan' loss during training?

When I change my CNN model's activation from ReLU to LeakyReLU, both training and validation losses become nan. How can I resolve this issue?
Here is my model's summary:
Shape of all data: (1889, 10801)
Shape of X_train: (1322, 10800, 1)
Shape of Y_train: (1322, 3)
Shape of X_test: (567, 10800, 1)
Shape of y_test: (567, 3)
Layer (type) Output Shape Param #
conv1d_48 (Conv1D) (None, 10721, 128) 10368
batch_normalization_48 (Batc (None, 10721, 128) 512
activation_48 (Activation) (None, 10721, 128) 0
max_pooling1d_41 (MaxPooling (None, 5360, 128) 0
dropout_29 (Dropout) (None, 5360, 128) 0
conv1d_49 (Conv1D) (None, 5357, 128) 65664
batch_normalization_49 (Batc (None, 5357, 128) 512
activation_49 (Activation) (None, 5357, 128) 0
max_pooling1d_42 (MaxPooling (None, 2678, 128) 0
dropout_30 (Dropout) (None, 2678, 128) 0
conv1d_50 (Conv1D) (None, 2675, 128) 65664
batch_normalization_50 (Batc (None, 2675, 128) 512
activation_50 (Activation) (None, 2675, 128) 0
max_pooling1d_43 (MaxPooling (None, 1337, 128) 0
dropout_31 (Dropout) (None, 1337, 128) 0
conv1d_51 (Conv1D) (None, 1334, 256) 131328
batch_normalization_51 (Batc (None, 1334, 256) 1024
activation_51 (Activation) (None, 1334, 256) 0
max_pooling1d_44 (MaxPooling (None, 667, 256) 0
global_max_pooling1d_6 (Glob (None, 256) 0
dense_20 (Dense) (None, 512) 131584
dense_21 (Dense) (None, 3) 1539
The model was compiled as follows:
n_lr = 1e-5
weight_decay = 1e-4
adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=weight_decay)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])
P.S. I'm aware that this issue has already been reported to Keras on GitHub, but it had not received any responses as of posting this question.
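One detail in the compile call is worth checking by hand. In the legacy Keras optimizers, the decay argument is a learning-rate decay rather than an L2 weight penalty: the rate used at update t is lr / (1 + decay * t). The sketch below evaluates that formula for the values in the question; the batch size of 32 is an assumption (the question does not state it), and effective_lr is a hypothetical helper.

```python
import math

def effective_lr(base_lr, decay, iteration):
    # Time-based decay as applied by the legacy Keras optimizers
    return base_lr / (1.0 + decay * iteration)

n_lr = 1e-5
weight_decay = 1e-4                       # passed as `decay` in the question
steps_per_epoch = math.ceil(1322 / 32)    # 1322 training samples; batch size 32 is assumed

for epoch in (1, 10, 100):
    iteration = epoch * steps_per_epoch
    print(epoch, iteration, effective_lr(n_lr, weight_decay, iteration))
```

So this setting gradually shrinks an already small learning rate; it does not regularize the weights.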

How to find Time Steps in LSTM?

I have a Bi-LSTM model and I want to work out its computational complexity. I have read online that:
"The computational complexity of learning LSTM models per weight and time step with the stochastic gradient descent (SGD) optimization technique is O(1). Therefore, the learning computational complexity per time step is O(W)."
But how do I find the time steps in my model? My model is
model = Sequential()
model.add(Embedding(max_words, 768, input_length=max_len, weights=[embedding]))
model.add(Dense(2, activation='softmax', use_bias=True, kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4), bias_regularizer=regularizers.l2(1e-4),
Model summary is
Model: "sequential_1"
Layer (type) Output Shape Param #
embedding_1 (Embedding) (None, 768, 768) 37147392
batch_normalization_2 (Batch (None, 768, 768) 3072
activation_2 (Activation) (None, 768, 768) 0
bidirectional_1 (Bidirection (None, 32) 100480
batch_normalization_3 (Batch (None, 32) 128
activation_3 (Activation) (None, 32) 0
dropout_1 (Dropout) (None, 32) 0
dense_1 (Dense) (None, 2) 66
Total params: 37,251,138
Trainable params: 37,249,538
Non-trainable params: 1,600
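The number of time steps can be read straight off the summary: the Embedding output shape is (batch, time steps, features), so the Bi-LSTM sees 768 steps of 768 features each (max_len and the embedding size happen to be equal here). The sketch below does that lookup and reproduces the Bi-LSTM parameter count. The 16 units per direction are inferred from the output width of 32, assuming the default concatenation of both directions, and the final product is only an order-of-magnitude estimate based on the O(W) per-step claim quoted above.

```python
embedding_output_shape = (None, 768, 768)   # (batch, time_steps, features) from the summary
_, time_steps, features = embedding_output_shape

units = 32 // 2   # forward and backward outputs are concatenated to width 32

def lstm_params(n_in, n_units):
    # four gates, each with input weights, recurrent weights, and a bias
    return 4 * ((n_in + n_units) * n_units + n_units)

bilstm_weights = 2 * lstm_params(features, units)   # one LSTM per direction
per_sequence_cost = bilstm_weights * time_steps     # O(W) per step, times T steps

print("time steps:", time_steps)
print("Bi-LSTM weights:", bilstm_weights)
print("weight updates per sequence (rough):", per_sequence_cost)
```

The computed weight count matches the 100,480 parameters reported for bidirectional_1, which supports the inferred layer size.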

Problems with visualization classification_report

I am trying to plot a classification report. My problem has only 2 classes (0 and 1), but when I call classification_report, this is the output:
[image: classification report output, not available]
My model is an LSTM with GloVe embeddings for sentiment classification. This is the architecture:
Model: "sequential_6"
Layer (type) Output Shape Param #
embedding_6 (Embedding) (None, 55, 300) 68299200
spatial_dropout1d_12 (Spatia (None, 55, 300) 0
lstm_12 (LSTM) (None, 55, 128) 219648
lstm_13 (LSTM) (None, 55, 64) 49408
spatial_dropout1d_13 (Spatia (None, 55, 64) 0
dense_18 (Dense) (None, 55, 512) 33280
dropout_6 (Dropout) (None, 55, 512) 0
dense_19 (Dense) (None, 55, 64) 32832
dense_20 (Dense) (None, 55, 1) 65
Total params: 68,634,433
Trainable params: 335,233
Non-trainable params: 68,299,200
You can have classification_report return a dict by passing output_dict=True, and then load it into a pandas DataFrame via pandas.DataFrame.from_dict(), like this:
import pandas as pd
from sklearn.metrics import classification_report
display(pd.DataFrame.from_dict(classification_report(y_true, y_pred, output_dict=True)).T)

Regarding Convolutional Neural Network

Hi, I would like some help with neural networks. I am doing a school project in which I am required to build a deepfake detection neural network. I am unsure why this happens when I add more layers to the network: my accuracy during training goes from 0.7 in the first epoch and jumps to 1.0 in the second to fifth epochs, which looks like overfitting, and the loss value goes to a weird number. I would appreciate advice on how I could adjust the neural network to suit deepfake detection.
Thank you all for taking the time to read this.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape = (256,256,3)))
model.add(Conv2D(64, (3,3)))
model.add(Conv2D(64, (3,3)))
model.add(Conv2D(64, (3,3)))
model.add(Conv2D(64, (3,3)))
model.add(Conv2D(64, (3,3)))
# flatten: Conv2D outputs 3D feature maps, Dense expects 1D vectors
model.add(Flatten())  # converts 3D feature maps to 1D feature vectors
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=['accuracy'])
model.fit(X, y, batch_size=32, epochs=5)  # name of the first argument was lost; X is assumed
Model Summary
Model: "sequential"
Layer (type) Output Shape Param #
conv2d (Conv2D) (None, 254, 254, 32) 896
activation (Activation) (None, 254, 254, 32) 0
max_pooling2d (MaxPooling2D) (None, 127, 127, 32) 0
conv2d_1 (Conv2D) (None, 125, 125, 64) 18496
activation_1 (Activation) (None, 125, 125, 64) 0
max_pooling2d_1 (MaxPooling2 (None, 62, 62, 64) 0
conv2d_2 (Conv2D) (None, 60, 60, 64) 36928
activation_2 (Activation) (None, 60, 60, 64) 0
dropout (Dropout) (None, 60, 60, 64) 0
conv2d_3 (Conv2D) (None, 58, 58, 64) 36928
activation_3 (Activation) (None, 58, 58, 64) 0
dropout_1 (Dropout) (None, 58, 58, 64) 0
conv2d_4 (Conv2D) (None, 56, 56, 64) 36928
activation_4 (Activation) (None, 56, 56, 64) 0
dropout_2 (Dropout) (None, 56, 56, 64) 0
conv2d_5 (Conv2D) (None, 54, 54, 64) 36928
activation_5 (Activation) (None, 54, 54, 64) 0
flatten (Flatten) (None, 186624) 0
dense (Dense) (None, 64) 11944000
activation_6 (Activation) (None, 64) 0
dense_1 (Dense) (None, 1) 65
activation_7 (Activation) (None, 1) 0
Total params: 12,111,169
Trainable params: 12,111,169
Non-trainable params: 0
You have to specify more inside each layer than just the number and size of the filters; this will help improve the model's performance.
For example, you could use Adam from keras.optimizers, which can help increase accuracy during training. Also, l2 from keras.regularizers will help reduce overfitting. In other words, you can't improve accuracy just by increasing the number of epochs; you must first build a good model before you start training.
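The summary in the question already hints at where the model's capacity sits. The sketch below replays the shape arithmetic (a 3x3 'valid' convolution shortens each side by 2, and 2x2 max pooling halves it, rounding down) and counts the parameters of the first Dense layer. The pool size of 2 is an assumption inferred from the output shapes, and the helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    # output side length of a 'valid' convolution with stride 1
    return size - kernel + 1

def pool(size, window=2):
    # output side length of non-overlapping max pooling
    return size // window

size = 256
size = pool(conv_valid(size))   # conv2d   -> 254, max_pooling2d   -> 127
size = pool(conv_valid(size))   # conv2d_1 -> 125, max_pooling2d_1 -> 62
for _ in range(4):              # conv2d_2 .. conv2d_5, no pooling in between
    size = conv_valid(size)

flat = size * size * 64             # Flatten over 64 channels
dense_params = (flat + 1) * 64      # weights plus biases of the 64-unit Dense layer
total_params = 12_111_169           # from the summary

print("final feature map:", size, "x", size)
print("flattened features:", flat)
print("first Dense layer params:", dense_params)
print("share of all params: {:.1%}".format(dense_params / total_params))
```

Nearly all of the parameters (about 98.6%) sit in that single Dense layer, which is one reason a model like this can memorize a small training set. Pooling after the later convolutions would shrink the flattened vector.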

Sci-kit Learn Confusion Matrix: Found input variables with inconsistent numbers of samples

I'm trying to plot a confusion matrix between the predicted test labels and the actual ones, but I'm getting this error
ValueError: Found input variables with inconsistent numbers of samples: [1263, 12630]
Dataset: GTSRB
Code used
Image augmentation
train_datagen = ImageDataGenerator(rescale=1./255,
                                   zoom_range=[0.9, 1.25],
                                   brightness_range=[0.5, 1.5])
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator and test_generator
batch_size = 10
train_generator = train_datagen.flow_from_directory(
target_size=(224, 224),
test_generator = test_datagen.flow_from_directory(
target_size=(224, 224),
Output of that code
Found 39209 images belonging to 43 classes.
Found 12630 images belonging to 43 classes.
Then, I used a VGG-16 model and replaced the last Dense layer with a Dense(43, activation='softmax')
Model summary
Layer (type) Output Shape Param #
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
fc1 (Dense) (None, 4096) 102764544
fc2 (Dense) (None, 4096) 16781312
predictions (Dense) (None, 1000) 4097000
dense_1 (Dense) (None, 43) 43043
Total params: 138,400,587
Trainable params: 43,043
Non-trainable params: 138,357,544
Compile the model
my_sgd = SGD(lr=0.01)
Train the model
predictions = model.predict_generator(test_generator, steps=STEP_SIZE_TEST, verbose=1)
1263/1263 [==============================] - 229s 181ms/step
Predictions shape
(12630, 43)
Getting the test_data and test_labels
test_data = []
test_labels = []
batch_index = 0
while batch_index <= test_generator.batch_index:
    data = next(test_generator)
    test_data.append(data[0])    # batch of images
    test_labels.append(data[1])  # batch of one-hot labels
    batch_index = batch_index + 1
test_data_array = np.asarray(test_data)
test_labels_array = np.asarray(test_labels)
Shape of test_data_array and test_labels_array
(1263, 10, 224, 224, 3)
(1263, 10, 43)
Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(test_labels_array, predictions)
I get the output
ValueError: Found input variables with inconsistent numbers of samples: [1263, 12630]
I understand that this error is because the test_labels_array size isn't equal to the predictions; 1263 and 12630 respectively, but I don't really know what I'm doing wrong.
Any help would be much appreciated.
PS: If anyone has any tips on how to increase the training accuracy while we're at it, that would be brilliant.
You should reshape test_data_array and test_labels_array as follows:
n_batches, per_batch, w, h, c = test_data_array.shape
test_data_array = np.reshape(test_data_array, (n_batches * per_batch, w, h, c))
test_labels_array = np.reshape(test_labels_array, (n_batches * per_batch, -1))
The reason is the way you are appending the results of test_generator. The first call to test_generator yields 10 images of shape (224, 224, 3), and the next call yields another 10. At that point you should have 20 images of shape (224, 224, 3), but because you append each batch as a single item, you end up with 2 items of shape (10, 224, 224, 3), which is not what you expect.
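Putting the explanation above into code, here is a small NumPy sketch with toy shapes (3 batches of 10 samples, 8x8 images, and random one-hot labels standing in for the generator output; all of these are made up for illustration). It also converts the one-hot labels and softmax outputs to class indices with argmax, since sklearn's confusion_matrix expects one class label per sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n_batches, batch_size, n_classes = 3, 10, 43

# Stand-ins for what next(test_generator) returns: (images, one-hot labels)
batches = []
for _ in range(n_batches):
    images = rng.random((batch_size, 8, 8, 3))
    labels = np.eye(n_classes)[rng.integers(0, n_classes, size=batch_size)]
    batches.append((images, labels))

# Appending whole batches keeps an extra leading axis ...
stacked_labels = np.asarray([labels for _, labels in batches])
print(stacked_labels.shape)     # (3, 10, 43)

# ... so join along the sample axis instead
test_data_array = np.concatenate([images for images, _ in batches], axis=0)
test_labels_array = np.concatenate([labels for _, labels in batches], axis=0)

# Toy predictions with the same layout as the model's softmax output
predictions = rng.random((n_batches * batch_size, n_classes))

# One class index per sample for both arrays
y_true = test_labels_array.argmax(axis=1)
y_pred = predictions.argmax(axis=1)
print(test_data_array.shape, y_true.shape, y_pred.shape)
```

y_true and y_pred can then be passed to confusion_matrix. The labels only line up with the predictions if the generator yields samples in a fixed order, so it is worth checking that shuffle=False was set on the test generator (flow_from_directory shuffles by default).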