Keras (+ TensorFlow) cannot predict with only part of the Sequential model

I am currently building a stereo matching network using Keras with TensorFlow as the backend. The network consists of a CNN + concatenation part followed by a fully-connected part.
After training the whole network, I need to test it. However, the training and testing phases are quite different, so I have to split the model into two parts. The first part (CNN + concatenation) only needs to be run once, while the fully-connected part (which I actually convert to a fully-convolutional form at test time) needs to be run d times with slightly different input, where d varies from 100 to 228.
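To make the intended test-time workflow concrete, here is a minimal sketch (with hypothetical names such as fc_model and shift_disparity that are not part of my actual code) of running the first part once and the second part d times:
# Hypothetical sketch of the test-time loop described above: the CNN + concatenation
# part runs once, and the fully-convolutional head runs d times on shifted features.
features = cnn.predict([X1, X2])                    # first part, run once
scores = []
for disparity in range(d):                          # d is somewhere between 100 and 228
    shifted = shift_disparity(features, disparity)  # hypothetical helper producing the slightly different input
    scores.append(fc_model.predict(shifted))        # second part, run d times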
The code for the first part of the network:
# imports for the legacy Keras 1.x API used in this snippet
from keras.models import Sequential
from keras.layers import Convolution2D, Activation, Merge

# input image dimensions
img_rows, img_cols = X1.shape[0], X1.shape[1]
input_shape = (img_rows, img_cols, 1)
X1 = X1.reshape(1, img_rows, img_cols, 1)
X2 = X2.reshape(1, img_rows, img_cols, 1)
# number of conv filters to use
nb_filters = 112
# CNN kernel size
kernel_size = (3, 3)
left_branch = Sequential()
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same', input_shape=input_shape))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
right_branch = Sequential()
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same', input_shape=input_shape))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
merged = Merge([left_branch, right_branch], mode='concat')
cnn = Sequential()
cnn.add(merged)
I load the weights obtained during the training phase into the first part of the network and try to get a prediction from it.
import h5py

def load_cnn_weights(filepath):
    f = h5py.File(filepath, mode='r')
    weights = []
    for i in range(1, 9):
        weights.append(f['model_weights/conv2d_{}/conv2d_{}/kernel:0'.format(i, i)][()])
        weights.append(f['model_weights/conv2d_{}/conv2d_{}/bias:0'.format(i, i)][()])
    f.close()
    return weights

weights = load_cnn_weights("/home/users/shixin.li/segment/Lecun_stereo_rebuild/weights.hdf5")
cnn.set_weights(weights)
output_cnn = cnn.predict([X1, X2])
I have already checked, by calling get_weights(), that the weights are read successfully and fit into the network. X1 and X2 are not zero; they are normalized grayscale image matrices. I even tried compiling the network before calling predict, but the result output_cnn is still all zeros.
I haven't seen anyone else with this problem and I have been stuck for two days. The part which really confuses me is that neither the input nor the weights are zero, so why is the result zero? If you could help, I would really appreciate it!

You might want to try using tfdbg to find out exactly what the inputs to the op with all-zero outputs are, in order to understand what is going on.
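For reference, a minimal sketch of attaching tfdbg to a Keras model running on the TF 1.x backend; cnn, X1 and X2 are the objects from the question, the rest is just the standard debugger wrapper:
# Wrap the session Keras uses with the tfdbg CLI wrapper, then re-run the
# failing predict call and inspect each op's inputs and outputs interactively.
import keras.backend as K
from tensorflow.python import debug as tf_debug

sess = tf_debug.LocalCLIDebugWrapperSession(K.get_session())
K.set_session(sess)
output_cnn = cnn.predict([X1, X2])   # the tfdbg CLI opens on the first session.run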

Related

improve CNN model accuracy

We were given train, validation and test data as homework: build a 1D CNN (CNN1D) and compare its results with another model for the exam marks.
With the model below I'm getting 84.18% accuracy vs. 84.58% for the competitor model. My classmates were given the same model as mine and managed to improve it to 85.20% accuracy. I'm only authorized to change the hyperparameters or to add/modify/delete some layers after the fusion = concate() step.
Can anyone please help me improve this?
def CNN1D():
    n_filters = 256
    dropout_rate = 0.4
    conv1 = Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv1_", activation="relu")
    Dropout1 = Dropout(rate=dropout_rate, name="dropOut1_")
    conv2 = Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv2_", activation="relu")
    Dropout2 = Dropout(rate=dropout_rate, name="dropOut2_")
    conv3 = Conv1D(filters=n_filters*2, kernel_size=3, padding='valid', name="conv3_", activation="relu")
    Dropout3 = Dropout(rate=dropout_rate, name="dropOut3_")
    conv4 = Conv1D(filters=n_filters*2, kernel_size=1, padding='valid', name="conv4_", activation="relu")
    Dropout4 = Dropout(rate=dropout_rate, name="dropOut4_")
    globPool = GlobalAveragePooling1D()

def TwoBranchModel():
    num_units = 256
    branch1 = CNN1D()
    branch2 = CNN1D()
    fusion = concate()
    out = tf.keras.Sequential([
        Dense(num_units, activation='relu'),
        BatchNormalization(),
        Dense(n_classes, activation='softmax')
    ])
I would suggest you try playing with the following:
Decrease the dropout percentage.
Try playing with the BatchNormalization hyperparameters (see https://keras.io/api/layers/normalization_layers/batch_normalization/) and adjust the momentum. Also try removing the BN layer and see how the accuracy changes.
I am not sure if you can change the points below (given your constraint):
Change GlobalAveragePooling1D to max pooling.
Your filter count is constant; it is a good idea to increase the number of filters progressively, for example starting with 32, then 64, 128 and so on.
Remove some dropout layers.
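For illustration only, a hedged sketch of what some of these tweaks could look like; the filter counts, dropout rate and momentum are placeholder values, not a known solution:
# Illustrative sketch: lower dropout, progressively larger filter counts,
# and a BatchNormalization layer with a tuned momentum in the fused head.
import tensorflow as tf
from tensorflow.keras.layers import Conv1D, Dropout, Dense, BatchNormalization

dropout_rate = 0.2                        # reduced from 0.4
conv1 = Conv1D(filters=64, kernel_size=3, padding='valid', activation='relu')
conv2 = Conv1D(filters=128, kernel_size=3, padding='valid', activation='relu')
conv3 = Conv1D(filters=256, kernel_size=3, padding='valid', activation='relu')

head = tf.keras.Sequential([
    Dense(256, activation='relu'),
    BatchNormalization(momentum=0.9),     # Keras default is 0.99; worth experimenting
    Dropout(dropout_rate),
    Dense(4, activation='softmax')        # replace 4 with the actual n_classes
])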

'Channels first' training accuracy very low compared to 'channels last'

My issue:
I am trying to train a semantic segmentation model in tf.keras; it works very well when I use channels_last (HWC) mode (it reaches 96%+ validation accuracy). I wanted to train it in channels_first (CHW) mode so the weights are compatible with TensorRT. When I do this, the ~80% training accuracy of the first few epochs dips down to around 0.020% and stays there permanently.
It is useful to know that the base of my model is a tf.keras.applications.MobileNet() model with the pre-trained 'imagenet' weights. (Model architecture at the bottom.)
The transformation process:
I followed the guidelines provided and changed only a few things:
Set tf.keras.backend.set_image_data_format() to 'channels_first'.
I change the channel order in the input tensor from: input_tensor=Input(shape=(376, 672, 3)) to: input_tensor=Input(shape=(3, 376, 672))
In my image preprocessing (using tf.data.Dataset), I use tf.transpose(img, perm=[2, 0, 1]) on both my input image and the one-hot encoded mask to change the channel order. I checked this with an equality assertion to make sure it's correct, and it seems to be fine.
When I make these changes the training starts fine, but as I said the training accuracy goes down to almost zero. When I revert everything back, it's fine again.
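For reference, the preprocessing step described above looks roughly like this (to_channels_first is just an illustrative name for my mapping function):
# Transpose both image and mask from (H, W, C) to (C, H, W) inside the tf.data pipeline.
def to_channels_first(img, mask):
    img = tf.transpose(img, perm=[2, 0, 1])
    mask = tf.transpose(mask, perm=[2, 0, 1])   # keep image and one-hot mask layouts consistent
    return img, mask

dataset = dataset.map(to_channels_first)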
Possible leads:
What am I doing wrong or what could be the problematic part here? My suspicions are around these questions:
Are the pre-trained ImageNet weights also converted to 'channels_first' order when I set the backend? Is this something I should consider at all?
Could it be that the tf.transpose() function messes up the mask's one-hot encoding? (I have 3 classes represented by 3 colors: lane, opposing lane, background)
Maybe I am not seeing something obvious. I can provide further code and answers as needed.
EDIT:
08/17: This is still an ongoing issue; I have tried several things:
I checked with a numpy assertion that the image and the mask are correct after the transpose; they seem correct.
I suspected that the loss function computes along the wrong axis, so I customized the loss function for the first axis (where the channels are). Here it is:
from tensorflow.keras import backend as K

def ReverseAxisLoss(y_true, y_pred):
    return K.categorical_crossentropy(y_true, y_pred, from_logits=True, axis=1)
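(For completeness, a custom loss like this is simply passed to compile; the optimizer here is only a placeholder:)
model.compile(optimizer='adam', loss=ReverseAxisLoss, metrics=['accuracy'])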
My main suspicion is that the 'channels_first' backend setting does nothing to transpose the pre-trained 'imagenet' weights for the MobileNet part. Is there an updated way in TF 2.x / Keras to transpose the pre-trained weights into CHW format?
Here is the architecture that I use (skipNet() is the head network, MobileNet is the base, and they are connected in the create_model() function):
import tensorflow as tf
from tensorflow.keras.layers import (Input, Conv2D, Conv2DTranspose, BatchNormalization,
                                     ZeroPadding2D, add)
from tensorflow.keras.initializers import RandomNormal
from tensorflow.keras.regularizers import l2
from tensorflow.keras.models import Model

def skipNet(encoder_output, feed1, feed2, classes):
    # random initializer and regularizer
    stddev = 0.01
    init = RandomNormal(stddev=stddev)
    weight_decay = 1e-3
    reg = l2(weight_decay)

    score_feed2 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
                         kernel_initializer=init, kernel_regularizer=reg)(feed2)
    score_feed2_bn = BatchNormalization()(score_feed2)
    score_feed1 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
                         kernel_initializer=init, kernel_regularizer=reg)(feed1)
    score_feed1_bn = BatchNormalization()(score_feed1)

    upscore2 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg)(encoder_output)
    height_pad1 = ZeroPadding2D(padding=((1, 0), (0, 0)))(upscore2)
    upscore2_bn = BatchNormalization()(height_pad1)
    fuse_feed1 = add([score_feed1_bn, upscore2_bn])

    upscore4 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg)(fuse_feed1)
    height_pad2 = ZeroPadding2D(padding=((0, 1), (0, 0)))(upscore4)
    upscore4_bn = BatchNormalization()(height_pad2)
    fuse_feed2 = add([score_feed2_bn, upscore4_bn])

    upscore8 = Conv2DTranspose(kernel_size=(16, 16), filters=classes, strides=(8, 8),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg, activation="softmax")(fuse_feed2)
    return upscore8

def create_model(classes):
    base_model = tf.keras.applications.MobileNet(input_tensor=Input(shape=IMG_SHAPE),
                                                 include_top=False,
                                                 weights='imagenet')
    conv4_2_output = base_model.get_layer(index=43).output
    conv3_2_output = base_model.get_layer(index=30).output
    conv_score_output = base_model.output

    head_model = skipNet(conv_score_output, conv4_2_output, conv3_2_output, classes)

    for layer in base_model.layers:
        layer.trainable = False

    model = Model(inputs=base_model.input, outputs=head_model)
    return model

A problem using an LSTM network (neural networks)

I'm trying to create a speaker diarization system using LSTMs (I'm trying to make the network tell the difference between speakers).
This is the model I've created:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(768, batch_input_shape=(39, 40, 1), return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768, return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768, return_sequences=True))
model.add(Dense(4))
There are 4 different speakers.
In my dataset I have the array 'features' (length 256, for 256 speech segments).
For each segment in 'features' I have 39 vectors representing that segment, and each of these vectors has size 40.
Each of these 39 vectors is extracted from a different time window (I used log mel filterbank energies).
I also have the array 'labels', which also has length 256 and contains the label for each segment.
I used 'to_categorical' for it:
labels = tf.keras.utils.to_categorical(labels, num_classes=4)
I tried using a generator to feed it to the network, but it didn't work.
This is the class I used:
class KerasBatchGenerator(object):
    def __init__(self, features, batch_size, labels):
        self.features = features
        self.batch_size = batch_size
        self.labels = labels

    def generate(self):
        while True:
            for i in self.labels:
                for j in self.features:
                    temp = [j, i]
                    # temp = np.expand_dims(temp, axis=1)
                    temp = np.expand_dims(temp, axis=2)
                    yield tuple(temp)
And the code I used to run the network is:
train_data_generator = KerasBatchGenerator(features, batch_size, labels)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(train_data_generator.generate(), 100, 1)
Please help!
If I guessed correctly, you want to classify which input is spoken by which speaker.
In that case your final layer should have shape (batch_size, numOfClasses), i.e. (39, 4).
But if you take a close look at the summary, the output shape of the final layer is (39, 40, 4).
To get the proper shape, remove the argument return_sequences=True from the last LSTM layer.
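A minimal sketch of the corrected stack (same layer sizes as in the question; only the last LSTM changes):
# With the last LSTM no longer returning sequences, the final Dense outputs shape (39, 4).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(768, batch_input_shape=(39, 40, 1), return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768, return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768))   # return_sequences removed here
model.add(Dense(4))    # as in the question; a softmax activation is typically added for categorical_crossentropy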

UnimplementedError: Fused conv implementation does not support grouped convolutions for now

I am trying to build a CNN model to recognise human sketches using the TU-Berlin dataset. I downloaded the png zip file, imported the data into Google Colab and then split the data into train/test folders. Here is the model:
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters = 64, kernel_size = (5,5),padding = 'Same',
activation ='relu', input_shape = target_dims),
tf.keras.layers.Conv2D(filters = 64, kernel_size = (5,5),padding = 'Same',
activation ='relu'),
tf.keras.layers.MaxPool2D(pool_size=(2,2)),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.Conv2D(filters = 128, kernel_size = (3,3),padding = 'Same',
activation ='relu'),
tf.keras.layers.Conv2D(filters = 128, kernel_size = (3,3),padding = 'Same',
activation ='relu'),
tf.keras.layers.MaxPool2D(pool_size=(2,2), strides=(2,2)),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.Conv2D(256, kernel_size=4, strides=1, activation='relu', padding='same'),
tf.keras.layers.Conv2D(256, kernel_size=4, strides=2, activation='relu', padding='same'),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation = "relu"),
tf.keras.layers.Dropout(0.5),
tf.keras.layers.Dense(n_classes, activation= "softmax")
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
model.fit_generator(train_generator, epochs=10, validation_data=val_generator)
And I am getting the following error:
UnimplementedError: Fused conv implementation does not support grouped convolutions for now.
[[node sequential/conv2d/Relu (defined at <ipython-input-9-36d4624b896d>:1) ]] [Op:__inference_train_function_1358]
Function call stack:
train_function
I would be grateful for any kind of help that solves this issue. Thank you.
(PS - I am running Tensorflow 2.2.0 and no GPU)
I had a similar error; the problem was a mismatch between the number of channels in my images and the number of channels I specified in the model. So check the number of channels of your images and the value specified in the input shape, and ensure they are the same.
I had this same error using the facial expression recognition dataset; here's how I solved it.
From what I understand, the dataset is grayscale.
When you use TensorFlow's ImageDataGenerator with flow_from_directory to generate the train and validation sets, you need to specify the color_mode as 'grayscale' or 'rgb' based on the dataset/images; here it should be 'grayscale'.
In the model, the first Conv2D layer's input_shape should be input_shape = (height, width, 1), with 1 because it's grayscale.
In short, set color_mode="grayscale" in flow_from_directory and check that your model input is (height, width, 1).
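A minimal sketch of that fix, assuming a directory-per-class layout; the path and image size are placeholders:
# Keep the generator's color_mode and the model's input shape consistent (1 channel for grayscale).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_height, img_width = 224, 224              # placeholder size
datagen = ImageDataGenerator(rescale=1./255)

train_generator = datagen.flow_from_directory(
    'data/train',                             # placeholder path
    target_size=(img_height, img_width),
    color_mode='grayscale',
    class_mode='categorical')

target_dims = (img_height, img_width, 1)      # passed as input_shape to the first Conv2D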
Just as @grande_cifer said, the issue comes from a mismatch between the number of channels specified and the actual number of channels in the images.
If you are not sure of the exact number of channels, I advise you to specify 1 channel in your target_dims parameter and forcefully convert all images to grayscale when loading them into your net, using the parameter color_mode = "grayscale".
For more info, check the Keras online docs.
You will find this error in 2 cases:
When the number of channels in your images and the number of channels you specified in the model are not the same. Here the solution is to make them equal.
When you use the groups param of Conv2D from tensorflow.keras. The fused conv implementation does not support the groups param, which (for groups equal to the number of input channels) is really a depthwise convolution; use tf.keras.layers.DepthwiseConv2D instead. For me the workaround was pip install tf-nightly==2.10.0.dev20220406, as this package also has some otherwise unimplemented Keras APIs; this was not mentioned anywhere when I encountered the error.
I hope this is useful.
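To illustrate the second case, a hedged sketch of swapping a grouped convolution for DepthwiseConv2D; the two are equivalent only in the depthwise special case where groups equals the number of input channels:
# Conv2D with groups=in_channels may hit the unimplemented fused grouped conv;
# DepthwiseConv2D expresses the same per-channel convolution directly.
import tensorflow as tf

in_channels = 64
grouped = tf.keras.layers.Conv2D(filters=in_channels, kernel_size=3,
                                 padding='same', groups=in_channels)
depthwise = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same',
                                            depth_multiplier=1)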

Are these images too 'noisy' to be correctly classified by a CNN?

I'm attempting to build an image classifier to distinguish between 2 types of images on property sites. I've split my dataset into 2 categories: [Property, Room]. I'm hoping to be able to tell whether an image shows the outside of a property or a room inside the property.
Below are 2 examples of the types of image I am using. My dataset consists of 800 images for each category, plus a test set of an additional 160 images for each category (not present in the training set).
I always seem to get reasonable results during training, but when I test against some real samples it usually ends up classifying all of the images into a single category.
Below you can see the model I am using:
train_datagen = ImageDataGenerator(
rescale=1./255,
width_shift_range=0.1,
height_shift_range=0.1,
rotation_range=10,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest'
) # set validation split
validate_datagen = ImageDataGenerator(rescale=1./255)
IMG_HEIGHT = IMG_WIDTH = 128
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(32, (11,11), activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3), padding='same'),
tf.keras.layers.MaxPooling2D(11, 11),
# tf.keras.layers.Dropout(0.5),
# Second convolutional layer
tf.keras.layers.Conv2D(64, (11, 11), padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(11, 11),
# tf.keras.layers.Dropout(0.5),
# Flattening
tf.keras.layers.Flatten(),
# Full connection
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.5),
tf.keras.layers.Dense(1, activation='sigmoid')
])
from tensorflow.keras.optimizers import RMSprop
model.compile(
optimizer=RMSprop(lr=0.001),
loss='binary_crossentropy',
metrics=['accuracy']
)
# now train the model
history = model.fit_generator(
train_generator,
validation_data=validation_generator,
steps_per_epoch=75, #100
epochs=5, # 15, or 20, and 100 steps per epoch
validation_steps=50,
verbose=1
)
# Predict image
from tensorflow.keras.preprocessing import image
import numpy as np

def load_image(img_path, show=False):
    test_image = image.load_img(img_path, target_size=(IMG_HEIGHT, IMG_WIDTH))
    test_image = image.img_to_array(test_image)
    test_image /= 255.
    test_image = np.expand_dims(test_image, axis=0)
    return test_image

def predict_image(img_path, show=False):
    loaded_img = load_image(img_path, show)
    pred = model.predict(loaded_img)
    return 'property' if pred[0][0] == 0.0 else 'room'

print('Prediction is...')
print(predict_image('path/to/my/img'))
Can anyone suggest possible reasons for this? I've tried using different epochs and batch sizes, augmenting the images further, and changing the Conv2D and pooling layer sizes, but nothing seems to help.
Do I perhaps not have enough data, or are they bad images to begin with? This is my first foray into ML, so apologies if any of these questions seem obvious.
You are not post-processing the output of the classifier correctly: it outputs a probability in [0, 1], with values < 0.5 corresponding to the first class and values >= 0.5 to the second class. You should change the code accordingly.
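In terms of the predict_image function from the question, the fix could look like this; the class order is an assumption, so check train_generator.class_indices for the actual mapping:
# Threshold the sigmoid output at 0.5 instead of comparing it to exactly 0.0.
def predict_image(img_path, show=False):
    loaded_img = load_image(img_path, show)
    pred = model.predict(loaded_img)
    return 'property' if pred[0][0] < 0.5 else 'room'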
Try data augmentation: it applies random transformations to the images, such as random rotation, random zoom, random horizontal flips, and width/height shifts. Also try adding Batch Normalisation.
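For illustration, a hedged sketch of where BatchNormalization layers could be slotted into the model from the question; the pooling sizes here are illustrative, and this alone is not guaranteed to fix the single-category predictions:
# Convolution -> batch normalization -> pooling, repeated per block.
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (11, 11), activation='relu', padding='same',
                           input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (11, 11), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])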