How to give two inputs to YOLO Darknet cfg files? - tensorflow

I have developed an RGB-D model for tiny-YOLOv2. It requires two inputs, RGB and depth: feature extraction happens separately for each stream, and the streams are joined later. Using [route] I cannot get two inputs.
# Two separate inputs (sizes are assumed; adjust to your data)
input_rgb = Input(shape=(416, 416, 3), name='input_rgb')
input_depth = Input(shape=(416, 416, 1), name='input_depth')

# Depth stream
x = Conv2D(16, (3,3), strides=(1,1), padding='same', name='conv_1', use_bias=False)(input_depth)
self.convLayers += 1
x = BatchNormalization(name='norm_1')(x)
x = LeakyReLU(alpha=0.1)(x)
Depthx = MaxPooling2D(pool_size=(2, 2))(x)

# RGB stream (layer names must be unique, so 'norm_1' cannot be reused here)
x = Conv2D(16, (3,3), strides=(1,1), padding='same', name='conv_2', use_bias=False)(input_rgb)
self.convLayers += 1
x = BatchNormalization(name='norm_2')(x)
x = LeakyReLU(alpha=0.1)(x)
Rgbx = MaxPooling2D(pool_size=(2, 2))(x)

# Fuse layer: concatenate both streams, then mix with a 1x1 convolution
x = concatenate([Depthx, Rgbx])
x = Conv2D(16, (1,1), strides=(1,1), padding='same', name='conv_3', use_bias=False)(x)
self.convLayers += 1
x = BatchNormalization(name='norm_3')(x)
x = LeakyReLU(alpha=0.1)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
I need to write a config file for this model. Any pointers on writing Darknet config files are welcome.
Thanks in advance.
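As far as I know, a stock Darknet cfg describes a single-input chain of layers: [route] can only concatenate the outputs of earlier layers by index, so it cannot pull in a second, separate input. True two-stream fusion therefore needs either a framework with multi-input support (like the Keras code above) or a modified Darknet fork. If you must stay within a standard cfg, one common workaround (assuming your RGB and depth frames are pixel-aligned) is to stack depth as a fourth input channel and set channels=4 in the [net] section, letting the early convolutions learn joint RGB-D features. A minimal Python sketch of the stacking:

import numpy as np

# Hypothetical pixel-aligned RGB and depth frames (sizes are placeholders)
rgb = np.random.rand(416, 416, 3).astype(np.float32)
depth = np.random.rand(416, 416, 1).astype(np.float32)

# Stack depth as a 4th channel; in the cfg this corresponds to `channels=4` under [net]
rgbd = np.concatenate([rgb, depth], axis=-1)
print(rgbd.shape)  # (416, 416, 4)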

Related

Loss function and Loss Weight for Multi-Output Keras Classification model

I am trying to understand the loss function setup in the Keras functional API.
I have a sample multi-output model based on the B-CNN model.
img_input = Input(shape=input_shape, name='input')
#--- block 1 ---
x = Conv2D(32, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
#--- coarse 1 branch ---
c_1_bch = Flatten(name='c_flatten')(x)
c_1_bch = Dense(64, activation='relu', name='c_dense')(c_1_bch)
c_1_bch = BatchNormalization()(c_1_bch)
c_1_bch = Dropout(0.5)(c_1_bch)
c_1_pred = Dense(num_c, activation='softmax', name='pred_coarse')(c_1_bch)
#--- block 3 ---
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
#--- fine block ---
x = Flatten(name='flatten')(x)
x = Dense(128, activation='relu', name='fc_1')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
fine_pred = Dense(num_classes, activation='softmax', name='pred_fine')(x)
model = keras.Model(inputs=[img_input],
                    outputs=[c_1_pred, fine_pred],
                    name='B-CNN_Model')
This classification model takes one input and produces two predictions.
According to this post, we first need to compile it with the proper loss functions, metrics, and optimizer, referring to each output layer by its name.
I have done this in the following way:
model.compile(optimizer=optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
              loss={'pred_coarse': 'mse',
                    'pred_fine': 'categorical_crossentropy'},
              loss_weights={'pred_coarse': beta,
                            'pred_fine': gamma},
              metrics={'pred_coarse': 'accuracy',
                       'pred_fine': 'accuracy'})
[Note: here the pred_coarse output layer uses mean squared error and pred_fine uses categorical cross-entropy. The loss weights beta and gamma are variables whose values are updated after certain epochs by a keras.callbacks.Callback.]
Now, my question is: what happens if we compile the model without mentioning the name of each output layer and provide only one loss function instead? For example, we compile the model as follows:
model.compile(optimizer=optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
              loss='categorical_crossentropy',
              loss_weights=[beta, gamma],
              metrics=['accuracy'])
Unlike the previous compile example, this one passes a single categorical cross-entropy loss. The model compiles and runs without any errors. Does the model then use the categorical cross-entropy loss for both the pred_coarse and pred_fine output layers?
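For what it's worth, my understanding of tf.keras (worth verifying against the docs for your version) is: yes. When `loss` is a single string, it is applied to every output, and a `loss_weights` list is matched to the outputs positionally. So the second compile call should be equivalent to this explicit form, reusing the names from the question:

model.compile(optimizer=optimizers.SGD(learning_rate=0.003, momentum=0.9, nesterov=True),
              loss={'pred_coarse': 'categorical_crossentropy',
                    'pred_fine': 'categorical_crossentropy'},
              loss_weights={'pred_coarse': beta, 'pred_fine': gamma},
              metrics=['accuracy'])

In other words, pred_coarse would silently switch from MSE to categorical cross-entropy.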

Problem using a convolutional autoencoder for 2D data

I want to train an autoencoder for the purpose of GPR investigations.
The input data dimension is 149x8. A deep (fully connected) autoencoder works fine:
from tensorflow.keras import optimizers
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

input_img = Input(shape=(8,))
encoded1 = Dense(8, activation='relu')(input_img)
encoded2 = Dense(4, activation='relu')(encoded1)
encoded3 = Dense(2, activation='relu')(encoded2)
decoded1 = Dense(2, activation='relu')(encoded3)
decoded2 = Dense(4, activation='relu')(decoded1)
decoded3 = Dense(8, activation='relu')(decoded2)
decoded = Dense(8, activation='linear')(decoded3)
autoencoder = Model(input_img, decoded)
opt = optimizers.Adam(learning_rate=0.001)  # `lr` is deprecated; was misleadingly named `sgd`
autoencoder.compile(optimizer=opt, loss='mse')
autoencoder.summary()
But when I try a convolutional autoencoder on the same input, it gives the error `ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=2`.
Can anybody suggest how to overcome this problem?
My code is:
from tensorflow.keras import layers, optimizers
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

input_img = Input(shape=(8,))  # the tensor is (batch, 8): ndim=2
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)  # Conv2D expects ndim=4 -> ValueError here
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
opt = optimizers.Adam(learning_rate=0.001)  # `lr` is deprecated
autoencoder.compile(optimizer=opt, loss='mse')
autoencoder.summary()
Wrong input shape:
This is because the model's input is declared as shape (8,), and TensorFlow adds one extra dimension for the batch size, so the tensor reaching Conv2D has ndim=2: (batch_size, 8). Conv2D expects ndim=4: batch size plus height, width, and channels. For image-like data, the training array should look like, e.g.,
X.shape == (number_of_rows, 28, 28, 1)  # with Input(shape=(28, 28, 1))
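If, as the dense model's Input(shape=(8,)) suggests, the data is 149 samples of 8 features each, a 2-D convolution is arguably the wrong tool anyway; a 1-D convolutional autoencoder over the 8-feature axis is one option. A minimal sketch under that assumption (random data stands in for the real GPR samples):

import numpy as np
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Stand-in data: 149 samples of 8 features, reshaped to (samples, length, channels)
X = np.random.rand(149, 8).reshape(-1, 8, 1)

input_sig = Input(shape=(8, 1))
x = Conv1D(16, 3, activation='relu', padding='same')(input_sig)
x = MaxPooling1D(2, padding='same')(x)        # 8 -> 4
x = Conv1D(8, 3, activation='relu', padding='same')(x)
encoded = MaxPooling1D(2, padding='same')(x)  # 4 -> 2
x = Conv1D(8, 3, activation='relu', padding='same')(encoded)
x = UpSampling1D(2)(x)                        # 2 -> 4
x = Conv1D(16, 3, activation='relu', padding='same')(x)
x = UpSampling1D(2)(x)                        # 4 -> 8
decoded = Conv1D(1, 3, activation='linear', padding='same')(x)

autoencoder = Model(input_sig, decoded)
autoencoder.compile(optimizer=Adam(learning_rate=0.001), loss='mse')
autoencoder.fit(X, X, epochs=10, batch_size=16)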

Good Accuracy, Bad prediction

I am trying to do a multi-class classification project with a CNN. My issue is that I get good training accuracy but poor predictions on validation data. I introduced l2 regularization, but it is not generalizing well, even after trying different l2 values (1e-3, 1e-4).
Here are my accuracy and loss graphs.
Topology:
import tensorflow
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2

inputs = keras.Input(shape=(512, 512, 3), name="img")
x = Conv2D(32, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(inputs)
x = BatchNormalization()(x)
x1 = Activation('relu')(x)
x2 = Conv2D(32, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x1)
x = BatchNormalization()(x2)
x = Activation('relu')(x)
x3 = Conv2D(32, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x3)
x = tensorflow.keras.layers.add([x, x1])  # ==> shortcut
x = Activation('relu')(x)
x4 = Conv2D(64, kernel_size=(3,3), strides=(2,2), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x4)
x = Activation('relu')(x)
x5 = Conv2D(64, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x5)
x = Activation('relu')(x)
x6 = Conv2D(64, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x6)
x = tensorflow.keras.layers.add([x, x4])  # ==> shortcut
x = Activation('relu')(x)
x7 = Conv2D(128, kernel_size=(3,3), strides=(2,2), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x7)
x = Activation('relu')(x)
x8 = Conv2D(128, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x8)
x = Activation('relu')(x)
x9 = Conv2D(128, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x9)
x = tensorflow.keras.layers.add([x, x7])  # ==> shortcut
x = Activation('relu')(x)
x10 = Conv2D(256, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x10)
x = Activation('relu')(x)
x11 = Conv2D(256, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x11)
x = Activation('relu')(x)
x12 = Conv2D(256, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x12)
x = tensorflow.keras.layers.add([x, x10])  # ==> shortcut
x = Activation('relu')(x)
x13 = Conv2D(512, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x13)
x = Activation('relu')(x)
x14 = Conv2D(512, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x14)
x = Activation('relu')(x)
x15 = Conv2D(512, kernel_size=(3,3), strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x)
x = BatchNormalization()(x15)
x = tensorflow.keras.layers.add([x, x13])  # ==> shortcut
x = Activation('relu')(x)
x = Flatten()(Conv2D(1, kernel_size=1, strides=(1,1), kernel_regularizer=l2(1e-5), padding='same')(x))
x = layers.Dropout(0.3)(x)
outputs = Dense(4, activation='softmax', kernel_initializer='he_normal')(x)
model = Model(inputs, outputs)
model.summary()
I have tried different filters and decreasing/increasing the number of layers. Is this issue because of overfitting? Any suggestions on what I can improve to get a smoother curve and better predictions?
You could try adding dropout between the Conv2D layers too; that should help with some of the overfitting.
Also decrease the learning rate of the optimizer so it doesn't overshoot the optimum.
Hope this helps :)
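A minimal sketch of both suggestions; the dropout rate and learning rate are guesses to be tuned, and SpatialDropout2D (which drops whole feature maps) is a common choice between convolutional layers:

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     SpatialDropout2D, Flatten, Dense)
from tensorflow.keras.optimizers import SGD

inputs = Input(shape=(512, 512, 3))
x = Conv2D(32, (3, 3), padding='same')(inputs)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = SpatialDropout2D(0.2)(x)  # drops whole feature maps between conv blocks
x = Flatten()(x)
outputs = Dense(4, activation='softmax')(x)

model = Model(inputs, outputs)
model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),  # smaller LR than before
              loss='categorical_crossentropy', metrics=['accuracy'])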

Get the output of a specific layer (auto-encoder latent features) as the result on test data, in place of the last layer, in Keras

I am trying to get the output of the latent (hidden) layer to use it as input for something else. I trained the model to minimize the reconstruction loss, so the latent features it learns represent the image as closely as possible.
My model is
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_img = Input(shape=(28, 28, 1))  # adapt this if using `channels_first` image data format
# Encoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# Decoder
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)  # opposite of pooling
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)  # padding='same' so the output comes back to 28x28
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
I want the output of the encoded layer as the model's output. Is it possible? And if yes, please tell me how.
You can simply do it this way:
autoencoder.fit(...)  # train as usual
latent_model = Model(input_img, encoded)         # stops at the `encoded` tensor
latent_representation = latent_model.predict(X)  # shape (n_samples, 7, 7, 8) here
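This works because latent_model reuses the same layer objects, and therefore the same trained weights, as autoencoder; no second training run is needed. If the `encoded` tensor is no longer in scope, you can also rebuild the sub-model from the trained autoencoder itself; a sketch that assumes the exact architecture from the question (so the layer index below is the second MaxPooling2D):

from tensorflow.keras.models import Model

# autoencoder and X are the objects from the question above
encoder_output = autoencoder.layers[4].output          # the `encoded` tensor
latent_model = Model(autoencoder.input, encoder_output)
latent_representation = latent_model.predict(X)
print(latent_representation.shape)  # (n_samples, 7, 7, 8) for 28x28 inputs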

CNN Overfitting

I have a siamese CNN that is performing very well (96% accuracy, 0.08 loss) on training data but poorly (70% accuracy, 0.1 loss) on testing data.
The architecture is below:
input_main = Input(shape=input_shape, dtype='float32')
x = Conv2D(32, (3, 3), padding='same', activation='relu',
           kernel_regularizer=l2(0.005))(input_main)
x = Conv2D(16, (5, 5), activation='relu',
           kernel_regularizer=l2(0.005))(x)
x = MaxPooling2D(pool_size=(5, 5))(x)
x = Dropout(0.5)(x)
x = Conv2D(32, (3, 3), padding='same', activation='relu',
           kernel_regularizer=l2(0.0005))(x)
x = Conv2D(32, (7, 7), activation='relu',
           kernel_regularizer=l2(0.005))(x)
x = MaxPooling2D(pool_size=(3, 3))(x)
x = Dropout(0.5)(x)
x = Flatten()(x)
#x = Dropout(0.5)(x)
x = Dense(16, activation='relu',
          kernel_regularizer=l2(0.005))(x)
model = Model(inputs=input_main, outputs=x)
Two of these towers are then combined into the siamese architecture (a sketch of the wiring follows below), and the difference between the vectors from the final layer determines the result. I have experimented with dropout and regularization, and neither has solved the problem (the parameters shown are the ones I am testing at the time of posting).
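For reference, a minimal self-contained sketch of that siamese wiring; the tower here is a stand-in for the architecture above, and the Euclidean distance plus loss choice are assumptions:

import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda

def build_tower(input_shape=(256, 128, 1)):
    # Stand-in tower; in practice this is the architecture above
    inp = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), padding='same', activation='relu')(inp)
    x = MaxPooling2D(pool_size=(5, 5))(x)
    x = Flatten()(x)
    x = Dense(16, activation='relu')(x)
    return Model(inp, x)

tower = build_tower()
input_a = Input(shape=(256, 128, 1))
input_b = Input(shape=(256, 128, 1))
vec_a, vec_b = tower(input_a), tower(input_b)  # one Model instance => shared weights

# Euclidean distance between the two embedding vectors
distance = Lambda(lambda t: K.sqrt(K.maximum(
    K.sum(K.square(t[0] - t[1]), axis=1, keepdims=True), K.epsilon())))([vec_a, vec_b])

siamese = Model([input_a, input_b], distance)
siamese.compile(optimizer='adam', loss='mse')  # contrastive loss is the more usual choice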
I have also tried simplifying the architecture to fewer conv layers, and this has not solved the problem.
The data is 256x128x1 images, sent through the network in pairs with a binary label indicating whether the two images match. I also use data augmentation, with some small rotations and translations.
Can anyone suggest anything else to try to solve this overfitting problem?