As homework, we were given train, validation, and test data and asked to build a 1D CNN (CNN1D) and compare its results against another model for our exam marks.
I tried the model below, but I'm getting 84.18% accuracy vs. 84.58% for the competitor model. My classmates started from the same model as mine and managed to improve it to 85.20% accuracy. I'm only authorized to change the hyperparameters or to add/modify/delete layers after the fusion = concate() line.
Can anyone please help me improve this?
def CNN1D():
    n_filters = 256
    dropout_rate = 0.4
    return tf.keras.Sequential([
        Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv1_", activation="relu"),
        Dropout(rate=dropout_rate, name="dropOut1_"),
        Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv2_", activation="relu"),
        Dropout(rate=dropout_rate, name="dropOut2_"),
        Conv1D(filters=n_filters*2, kernel_size=3, padding='valid', name="conv3_", activation="relu"),
        Dropout(rate=dropout_rate, name="dropOut3_"),
        Conv1D(filters=n_filters*2, kernel_size=1, padding='valid', name="conv4_", activation="relu"),
        Dropout(rate=dropout_rate, name="dropOut4_"),
        GlobalAveragePooling1D(name="globPool_"),
    ])
def TwoBranchModel():
    num_units = 256
    branch1 = CNN1D()
    branch2 = CNN1D()
    fusion = concate()  # assignment helper (definition not shown) that concatenates the two branch outputs
    out = tf.keras.Sequential([
        Dense(num_units, activation='relu'),
        BatchNormalization(),
        Dense(n_classes, activation='softmax')
    ])
I would suggest trying the following:
Decrease the dropout rate.
Play with the BatchNormalization hyperparameters (see https://keras.io/api/layers/normalization_layers/batch_normalization/) and adjust the momentum. Also try removing the BN layer entirely and compare the accuracy.
I am not sure if you can change the points below (given your constraint):
Change the GlobalAveragePooling1D to max pooling.
Your number of filters is essentially constant; it is a good idea to increase the filter count progressively, for example starting with 32, then 64, 128, and so on (see the sketch below).
Remove some of the dropout layers.
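For example, a variant of the posted CNN1D with a progressively growing filter count and a lower dropout rate could look like the sketch below. The specific filter counts and the 0.2 rate are guesses rather than tested values, and note that removing dropout layers may conflict with the stated constraint:

def CNN1D():
    dropout_rate = 0.2  # reduced from 0.4
    return tf.keras.Sequential([
        # Filter count grows with depth instead of staying flat.
        Conv1D(filters=64, kernel_size=3, padding='valid', activation="relu", name="conv1_"),
        Conv1D(filters=128, kernel_size=3, padding='valid', activation="relu", name="conv2_"),
        Dropout(rate=dropout_rate, name="dropOut1_"),
        Conv1D(filters=256, kernel_size=3, padding='valid', activation="relu", name="conv3_"),
        Conv1D(filters=512, kernel_size=1, padding='valid', activation="relu", name="conv4_"),
        Dropout(rate=dropout_rate, name="dropOut2_"),
        GlobalAveragePooling1D(name="globPool_"),
    ])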
I am running a TensorFlow/TensorBoard HParams profile with the following setup:
HP_NUM_NODES_ONE = hp.HParam('nodes_one', hp.Discrete([128]))
HP_NUM_NODES_TWO = hp.HParam('nodes_two', hp.Discrete([64, 128, 256]))
HP_NUM_NODES_THR = hp.HParam('nodes_thr', hp.Discrete([64, 128, 256]))
HP_NUM_FILT = hp.HParam('num_filter', hp.Discrete([64, 128, 256]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.1, 0.3))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'RMSprop']))
MSE = 'mean_squared_error'
with tf.summary.create_file_writer('logs/hparam_21-6').as_default():
    hp.hparams_config(
        hparams=[HP_NUM_NODES_ONE, HP_NUM_NODES_TWO, HP_NUM_NODES_THR,
                 HP_NUM_FILT, HP_DROPOUT, HP_OPTIMIZER],
        metrics=[hp.Metric(MSE, display_name='Mean Squared Error')],
    )
with this model:
def train_test_model(hparams):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv1D(filters=hparams[HP_NUM_FILT], kernel_size=6, strides=1,
                               activation='relu', input_shape=(300, 4), use_bias=True),
        tf.keras.layers.MaxPooling1D(pool_size=100),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(hparams[HP_NUM_NODES_ONE], activation=tf.nn.relu),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
        tf.keras.layers.Dense(hparams[HP_NUM_NODES_TWO], activation=tf.nn.relu),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
        tf.keras.layers.Dense(hparams[HP_NUM_NODES_THR], activation="linear"),
    ])
    model.compile(
        optimizer=hparams[HP_OPTIMIZER],
        loss='mean_squared_error',
        metrics=['mean_squared_error'],
    )
    model.fit(feature3, label2[0,], epochs=500)
    _, mean_squared_error = model.evaluate(x_test, y_test[0,])
    return mean_squared_error
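The per-trial logging loop isn't shown above; presumably it follows the standard HParams pattern, roughly like this sketch (the run-directory naming and the pinned values for the remaining hyperparameters are my assumptions):

def run(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)  # record the hyperparameter values used in this trial
        mse = train_test_model(hparams)
        tf.summary.scalar(MSE, mse, step=1)

session_num = 0
for optimizer in HP_OPTIMIZER.domain.values:
    for num_filt in HP_NUM_FILT.domain.values:
        hparams = {
            HP_NUM_NODES_ONE: 128,
            HP_NUM_NODES_TWO: 64,
            HP_NUM_NODES_THR: 64,
            HP_NUM_FILT: num_filt,
            HP_DROPOUT: 0.1,
            HP_OPTIMIZER: optimizer,
        }
        run('logs/hparam_21-6/run-%d' % session_num, hparams)
        session_num += 1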
It all seems to run fine and I do get outputs; however, the TensorBoard output does not show results for the different values of the optimizer, which I also want. Does the optimizer hyperparameter need to be treated/coded differently? Happy to supply more of the code if this is not clear. Thanks, J.
I found the problem, and it was as simple as an unticked box in the TensorBoard results! The standard results table simply did not have the optimizer column enabled; if I tick the optimizer box, the results appear.
I have no idea why TensorBoard would turn this off by default.
My issue:
I am trying to train a semantic segmentation model in tf.keras. It works very well when I use channels_last (HWC) mode (it reaches 96%+ validation accuracy). I wanted to train it in channels_first (CHW) mode so the weights are compatible with TensorRT. When I do this, the ~80% training accuracy of the first few epochs dips down to around 0.020% and stays there permanently.
It is useful to know that the base of my model is a tf.keras.applications.MobileNet() model with the pre-trained 'imagenet' weights. (Model architecture at the bottom.)
The transformation process:
I followed the guidelines provided and changed only a few things:
Set tf.keras.backend.set_image_data_format() to 'channels_first'.
Changed the channel order in the input tensor from input_tensor=Input(shape=(376, 672, 3)) to input_tensor=Input(shape=(3, 376, 672)).
In my image preprocessing (using tf.data.Dataset), I use tf.transpose(img, perm=[2, 0, 1]) on both my input image and one-hot encoded mask to change the channel order (a sketch of this step is below). I checked this with an equality assertion to make sure it's correct, and it seems to be fine.
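A minimal sketch of that preprocessing step; the function and variable names here are mine, not from the original code:

def to_channels_first(img, mask):
    # Both the image and the one-hot mask go from (H, W, C) to (C, H, W).
    return tf.transpose(img, perm=[2, 0, 1]), tf.transpose(mask, perm=[2, 0, 1])

dataset = dataset.map(to_channels_first)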
With these changes the training starts fine, but as I said, the training accuracy drops to almost zero. When I revert everything, it is fine again.
Possible leads:
What am I doing wrong, or what could be the problematic part here? My suspicions revolve around these questions:
Are the pre-trained ImageNet weights also changed to the 'channels_first' order when I set the backend? Is this something I should consider at all?
Could it be that the tf.transpose() function messes up the mask's one-hot encoding? (I have 3 classes represented by 3 colors: lane, opposing lane, background.)
Maybe I am not seeing something obvious. I can provide further code and details as needed.
EDIT:
08/17: This is still an ongoing issue. I have tried several things:
I checked with a numpy assertion that the image and the mask are still correct after the transpose; they seem fine.
I suspected that the loss function computes over the wrong axis, so I customized it to use the first axis (where the channels now are). Here it is:
def ReverseAxisLoss(y_true, y_pred):
    return K.categorical_crossentropy(y_true, y_pred, from_logits=True, axis=1)
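Along the lines of the numpy assertion mentioned above, here is a minimal standalone check (my own sketch) showing that an HWC-to-CHW transpose leaves the one-hot encoding itself intact:

import numpy as np
import tensorflow as tf

# Fake one-hot mask with 3 classes, shape (H, W, C).
labels = np.random.randint(0, 3, size=(376, 672))
mask_hwc = tf.one_hot(labels, depth=3)             # (376, 672, 3)
mask_chw = tf.transpose(mask_hwc, perm=[2, 0, 1])  # (3, 376, 672)

# The class vector at each pixel is unchanged, only indexed differently.
assert np.array_equal(mask_hwc.numpy()[10, 20, :], mask_chw.numpy()[:, 10, 20])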
My main suspicion is that setting the backend to 'channels_first' does nothing to transpose the pre-trained 'imagenet' weights for the MobileNet part. Is there an updated way in TF 2.x / Keras to transpose the pre-trained weights into CHW format?
Here is the architecture I use (skipNet() is the head network and MobileNet is the base; they are connected in the create_model() function):
def skipNet(encoder_output, feed1, feed2, classes):
    # random initializer and regularizer
    stddev = 0.01
    init = RandomNormal(stddev=stddev)
    weight_decay = 1e-3
    reg = l2(weight_decay)

    score_feed2 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
                         kernel_initializer=init, kernel_regularizer=reg)(feed2)
    score_feed2_bn = BatchNormalization()(score_feed2)
    score_feed1 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
                         kernel_initializer=init, kernel_regularizer=reg)(feed1)
    score_feed1_bn = BatchNormalization()(score_feed1)

    upscore2 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg)(encoder_output)
    height_pad1 = ZeroPadding2D(padding=((1, 0), (0, 0)))(upscore2)
    upscore2_bn = BatchNormalization()(height_pad1)

    fuse_feed1 = add([score_feed1_bn, upscore2_bn])

    upscore4 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg)(fuse_feed1)
    height_pad2 = ZeroPadding2D(padding=((0, 1), (0, 0)))(upscore4)
    upscore4_bn = BatchNormalization()(height_pad2)

    fuse_feed2 = add([score_feed2_bn, upscore4_bn])

    upscore8 = Conv2DTranspose(kernel_size=(16, 16), filters=classes, strides=(8, 8),
                               padding="SAME", kernel_initializer=init,
                               kernel_regularizer=reg, activation="softmax")(fuse_feed2)

    return upscore8
def create_model(classes):
    base_model = tf.keras.applications.MobileNet(input_tensor=Input(shape=IMG_SHAPE),
                                                 include_top=False,
                                                 weights='imagenet')

    conv4_2_output = base_model.get_layer(index=43).output
    conv3_2_output = base_model.get_layer(index=30).output
    conv_score_output = base_model.output

    head_model = skipNet(conv_score_output, conv4_2_output, conv3_2_output, classes)

    for layer in base_model.layers:
        layer.trainable = False

    model = Model(inputs=base_model.input, outputs=head_model)
    return model
I'm trying to train a convolutional GAN in Keras with the TensorFlow backend to generate faces. Having read several examples, there seem to be two ways to build the generator: you can either use Conv2DTranspose layers with strides to upsample, like so:
def build_generator(seed_size, channels):
    inputs_rand = Input(shape=(seed_size,))
    inputs_feat = Input(shape=(NUM_FEATS,))
    inputs = Concatenate()([inputs_rand, inputs_feat])

    dense1 = Dense(4*4*64, activation='relu')(inputs)
    reshape1 = Reshape((4, 4, 64))(dense1)

    conv_trans1 = Conv2DTranspose(64, kernel_size=5, strides=2*GENERATE_RES, padding='same')(reshape1)
    batch_norm1 = BatchNormalization(momentum=0.8)(conv_trans1)
    leaky_relu1 = ReLU()(batch_norm1)

    conv_trans2 = Conv2DTranspose(64, kernel_size=5, strides=2, padding='same')(leaky_relu1)
    batch_norm2 = BatchNormalization(momentum=0.8)(conv_trans2)
    leaky_relu2 = ReLU()(batch_norm2)

    conv_trans3 = Conv2DTranspose(64, kernel_size=5, strides=2, padding='same')(leaky_relu2)
    batch_norm3 = BatchNormalization(momentum=0.8)(conv_trans3)
    leaky_relu3 = ReLU()(batch_norm3)

    output = Conv2DTranspose(channels, kernel_size=3, padding='same', activation='tanh')(leaky_relu3)

    generator = Model(inputs=[inputs_rand, inputs_feat], outputs=[output, inputs_feat])
    return generator
or use the UpSampling2D layer together with Conv2D layers, like so:
def build_generator(seed_size, channels):
    inputs_rand = Input(shape=(seed_size,))
    inputs_feat = Input(shape=(NUM_FEATS,))
    inputs = Concatenate()([inputs_rand, inputs_feat])

    dense1 = Dense(4*4*64, activation='relu')(inputs)
    reshape1 = Reshape((4, 4, 64))(dense1)

    upsamp1 = UpSampling2D(2*GENERATE_RES)(reshape1)
    conv_trans1 = Conv2D(64, kernel_size=5, padding='same')(upsamp1)
    batch_norm1 = BatchNormalization(momentum=0.8)(conv_trans1)
    leaky_relu1 = ReLU()(batch_norm1)

    upsamp2 = UpSampling2D()(leaky_relu1)
    conv_trans2 = Conv2D(64, kernel_size=5, padding='same')(upsamp2)
    batch_norm2 = BatchNormalization(momentum=0.8)(conv_trans2)
    leaky_relu2 = ReLU()(batch_norm2)

    upsamp3 = UpSampling2D()(leaky_relu2)
    conv_trans3 = Conv2D(64, kernel_size=5, padding='same')(upsamp3)
    batch_norm3 = BatchNormalization(momentum=0.8)(conv_trans3)
    leaky_relu3 = ReLU()(batch_norm3)

    output = Conv2D(channels, kernel_size=3, padding='same', activation='tanh')(leaky_relu3)

    generator = Model(inputs=[inputs_rand, inputs_feat], outputs=[output, inputs_feat])
    return generator
I have read in a few places that Conv2DTranspose is preferable; however, I can't seem to get it working. It just produces a repeating noise pattern that matches the strides, and no matter how long I leave it to train, it stays the same. Meanwhile, the other method seems to work well enough, but I would like to get both methods working, just to satisfy my own curiosity. I think I must be doing something wrong, but my code looks pretty much the same as the other examples I've found, and I can't find anyone else having this sort of problem.
I have tried a few tweaks to the model, for example adding dropout and removing the batch normalisation in case there was a simple fix, but nothing seems to work. I haven't included the rest of my code to keep things tidy; I can add the rest if it would help.
This is the noise obtained when using the Conv2DTranspose layers.
Meanwhile, the UpSampling2D with Conv2D layers produces these, for example.
Any comments and suggestions on how to improve my results would be welcomed too.
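For what it's worth, a repeating pattern that tracks the stride is the classic "checkerboard artifact" of strided transposed convolutions, and one commonly cited mitigation is choosing a kernel size divisible by the stride. A sketch of that tweak, applied to the stride-2 layers of the first generator above (untested, so treat it as a starting point rather than a confirmed fix):

# Hypothetical tweak: kernel_size=4 divides evenly by strides=2, so the kernel
# overlap is uniform and the uneven-overlap checkerboard pattern is avoided.
conv_trans2 = Conv2DTranspose(64, kernel_size=4, strides=2, padding='same')(leaky_relu1)
conv_trans3 = Conv2DTranspose(64, kernel_size=4, strides=2, padding='same')(leaky_relu2)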
I am new to deep learning. I have learnt the basics through reading and am trying to implement a real network to see how/if it will really work. I chose TensorFlow in DIGITS and the following network because the authors give the exact architecture along with the training material: Steganalysis with DL.
I wrote the following code for the architecture in Steganalysis with DL by looking at existing networks in DIGITS and the TensorFlow documentation.
from model import Tower
from utils import model_property
import tensorflow as tf
import tensorflow.contrib.slim as slim
import utils as digits
class UserModel(Tower):
    @model_property
    def inference(self):
        x = tf.reshape(self.x, shape=[-1, self.input_shape[0], self.input_shape[1], self.input_shape[2]])
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_initializer=tf.contrib.layers.xavier_initializer(),
                            weights_regularizer=slim.l2_regularizer(0.0001)):
            conv1 = tf.layers.conv2d(inputs=x, filters=64, kernel_size=7, padding='same', strides=2, activation=tf.nn.relu)
            rnorm1 = tf.nn.local_response_normalization(input=conv1)
            conv2 = tf.layers.conv2d(inputs=rnorm1, filters=16, kernel_size=5, padding='same', strides=1, activation=tf.nn.relu)
            rnorm2 = tf.nn.local_response_normalization(input=conv2)
            flatten = tf.contrib.layers.flatten(rnorm2)
            fc1 = tf.contrib.layers.fully_connected(inputs=flatten, num_outputs=1000, activation_fn=tf.nn.relu)
            fc2 = tf.contrib.layers.fully_connected(inputs=fc1, num_outputs=1000, activation_fn=tf.nn.relu)
            fc3 = tf.contrib.layers.fully_connected(inputs=fc2, num_outputs=2)
            sm = tf.nn.softmax(fc3)
            return sm
    @model_property
    def loss(self):
        model = self.inference
        loss = digits.classification_loss(model, self.y)
        accuracy = digits.classification_accuracy(model, self.y)
        self.summaries.append(tf.summary.scalar(accuracy.op.name, accuracy))
        return loss
I tried running it, but the accuracy is pretty low. Could someone tell me whether I've done it completely wrong, or what is wrong with it, and how to code it properly?
UPDATE: Thank you Nessuno! With the fix you mentioned I came up with this code:
from model import Tower
from utils import model_property
import tensorflow as tf
import tensorflow.contrib.slim as slim
import utils as digits
class UserModel(Tower):
    @model_property
    def inference(self):
        x = tf.reshape(self.x, shape=[-1, self.input_shape[0], self.input_shape[1], self.input_shape[2]])
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_initializer=tf.contrib.layers.xavier_initializer(),
                            weights_regularizer=slim.l2_regularizer(0.00001)):
            conv1 = tf.layers.conv2d(inputs=x, filters=64, kernel_size=7, padding='Valid', strides=2, activation=tf.nn.relu)
            rnorm1 = tf.nn.local_response_normalization(input=conv1)
            conv2 = tf.layers.conv2d(inputs=rnorm1, filters=16, kernel_size=5, padding='Valid', strides=1, activation=tf.nn.relu)
            rnorm2 = tf.nn.local_response_normalization(input=conv2)
            flatten = tf.contrib.layers.flatten(rnorm2)
            fc1 = tf.contrib.layers.fully_connected(inputs=flatten, num_outputs=1000, activation_fn=tf.nn.relu)
            fc2 = tf.contrib.layers.fully_connected(inputs=fc1, num_outputs=1000, activation_fn=tf.nn.relu)
            fc3 = tf.contrib.layers.fully_connected(inputs=fc2, num_outputs=2, activation_fn=None)
            return fc3

    @model_property
    def loss(self):
        model = self.inference
        loss = digits.classification_loss(model, self.y)
        accuracy = digits.classification_accuracy(model, self.y)
        self.summaries.append(tf.summary.scalar(accuracy.op.name, accuracy))
        return loss
Solver type is SGD, the learning rate is 0.001, and I am shuffling the training data. I have increased the training data to 6000 images (3000 per category, with 20% of that reserved for validation). I downloaded the training data from this link. But I am only getting the following graph, which I think shows overfitting. Do you have any suggestions to improve the validation accuracy?
In NVIDIA DIGITS, classification_loss, exactly like tf.nn.softmax_cross_entropy_with_logits in TensorFlow, expects as input a linear layer of neurons (logits).
Instead, you're passing it sm = tf.nn.softmax(fc3), so you're applying the softmax operation twice, and this is the reason for your low accuracy.
To solve this issue, change the model output layer to
fc3 = slim.fully_connected(fc2, 2, activation_fn=None, scope='fc3')
return fc3
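To see why the double softmax is so damaging: softmax outputs already lie in [0, 1], so applying softmax a second time squashes even a confident prediction toward the uniform distribution. A quick illustration (values rounded):

import tensorflow as tf

logits = tf.constant([[4.0, -2.0]])
once = tf.nn.softmax(logits)   # ~[[0.998, 0.002]] – a confident prediction
twice = tf.nn.softmax(once)    # ~[[0.730, 0.270]] – squashed toward uniform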
I am now working on building a stereo matching network using Keras with TensorFlow as the backend. The network has the following structure:
After training the whole network, I need to test it. However, the training and testing phases are quite different, so I have to split the model into two parts. The first part is CNN + Concatenate, which only needs to be run once, while the fully-connected part (which I actually modify into fully-convolutional form at test time) needs to be run d times with slightly different inputs, where d varies from 100 to 228.
The code for the first part of the network:
# input image dimensions
img_rows, img_cols = X1.shape[0], X1.shape[1]
input_shape = (img_rows, img_cols, 1)
X1 = X1.reshape(1, img_rows, img_cols, 1)
X2 = X2.reshape(1, img_rows, img_cols, 1)
# number of conv filters to use
nb_filters = 112
# CNN kernel size
kernel_size = (3,3)
left_branch = Sequential()
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same', input_shape=input_shape))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
left_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
left_branch.add(Activation('relu'))
right_branch = Sequential()
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same', input_shape=input_shape))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
right_branch.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='same'))
right_branch.add(Activation('relu'))
merged = Merge([left_branch, right_branch], mode='concat')
cnn = Sequential()
cnn.add(merged)
I load the weights obtained in the training phase into the first part of the network and try to get its prediction:
def load_cnn_weights(filepath):
    f = h5py.File(filepath, mode='r')
    weights = []
    for i in range(1, 9):
        weights.append(f['model_weights/conv2d_{}/conv2d_{}/kernel:0'.format(i, i)][()])
        weights.append(f['model_weights/conv2d_{}/conv2d_{}/bias:0'.format(i, i)][()])
    f.close()
    return weights
weights = load_cnn_weights("/home/users/shixin.li/segment/Lecun_stereo_rebuild/weights.hdf5")
cnn.set_weights(weights)
output_cnn = cnn.predict([X1, X2])
I have already checked that the weights are read successfully and fit into the network, verified by calling get_weights(). X1 and X2 are not zero; they are normalized grayscale image matrices. I even tried compiling the network before predicting, but the result output_cnn is still all zeros.
I haven't seen anyone else with this problem and I have been stuck for two days. The part that really confuses me is that the inputs and the weights are all nonzero, so why is the result zero? If you could help, I would really appreciate it!
You might want to try using tfdbg to find out exactly what the inputs to the op with all-zero outputs are, to try to understand what is going on.
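A minimal sketch of hooking tfdbg into a Keras session (this assumes the TF 1.x tensorflow.python.debug API; wrap the backend session before calling predict):

import tensorflow as tf
from tensorflow.python import debug as tf_debug
import keras.backend as K

# Wrap the session Keras uses so every run drops into the tfdbg CLI, where
# each intermediate tensor (including the all-zero one) can be inspected.
sess = tf_debug.LocalCLIDebugWrapperSession(K.get_session())
sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan)
K.set_session(sess)

output_cnn = cnn.predict([X1, X2])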