Keras Conv2DTranspose layers in Convolutional GAN - tensorflow

I'm trying to train a Convolutional GAN in Keras with Tensorflow backend for generating faces. Having read several examples there seem to be two ways to build the generator, you can either use the Conv2DTranspose layer with strides to upsample, like so:
def build_generator(seed_size, channels):
inputs_rand = Input(shape=(seed_size,))
inputs_feat = Input(shape=(NUM_FEATS,))
inputs = Concatenate()([inputs_rand, inputs_feat])
dense1 = Dense(4*4*64, activation='relu')(inputs)
reshape1 = Reshape((4,4,64))(dense1)
conv_trans1 = Conv2DTranspose(64, kernel_size=5, strides=2*GENERATE_RES, padding='same')(reshape1)
batch_norm1 = BatchNormalization(momentum=0.8)(conv_trans1)
leaky_relu1 = ReLU()(batch_norm1)
conv_trans2 = Conv2DTranspose(64, kernel_size=5, strides=2, padding='same')(leaky_relu1)
batch_norm2 = BatchNormalization(momentum=0.8)(conv_trans2)
leaky_relu2 = ReLU()(batch_norm2)
conv_trans3 = Conv2DTranspose(64, kernel_size=5, strides=2, padding='same')(leaky_relu2)
batch_norm3 = BatchNormalization(momentum=0.8)(conv_trans3)
leaky_relu3 = ReLU()(batch_norm3)
output = Conv2DTranspose(channels, kernel_size=3, padding='same', activation='tanh')(leaky_relu3)
generator = Model(inputs=[inputs_rand, inputs_feat], outputs=[output, inputs_feat])
return generator
or use the Upsample2D layer along with Conv2D layers, like so:
def build_generator(seed_size, channels):
inputs_rand = Input(shape=(seed_size,))
inputs_feat = Input(shape=(NUM_FEATS,))
inputs = Concatenate()([inputs_rand, inputs_feat])
dense1 = Dense(4*4*64, activation='relu')(inputs)
reshape1 = Reshape((4,4,64))(dense1)
upsamp1 = UpSampling2D(2*GENERATE_RES)(reshape1)
conv_trans1 = Conv2D(64, kernel_size=5, padding='same')(upsamp1)
batch_norm1 = BatchNormalization(momentum=0.8)(conv_trans1)
leaky_relu1 = ReLU()(batch_norm1)
upsamp2 = UpSampling2D()(leaky_relu1)
conv_trans2 = Conv2D(64, kernel_size=5, padding='same')(upsamp2)
batch_norm2 = BatchNormalization(momentum=0.8)(conv_trans2)
leaky_relu2 = ReLU()(batch_norm2)
upsamp3 = UpSampling2D()(leaky_relu2)
conv_trans3 = Conv2D(64, kernel_size=5, padding='same')(upsamp3)
batch_norm3 = BatchNormalization(momentum=0.8)(conv_trans3)
leaky_relu3 = ReLU()(batch_norm3)
output = Conv2D(channels, kernel_size=3, padding='same', activation='tanh')(leaky_relu3)
generator = Model(inputs=[inputs_rand, inputs_feat], outputs=[output, inputs_feat])
return generator
I have read in a few places that Conv2DTranspose is preferable, however I can't seem to get it working. It just produces a repeating noise pattern according to the strides, and then no matter how long I leave it to train, it stays the same. Meanwhile the other method seems to work well enough, but I would like to get both methods working (just to satisfy my own curiosity). I think I must be doing something wrong, but my code looks pretty much the same as other examples I've found and I can't find anyone else having this sort of problem.
I have tried a few tweaks to the model, for example adding dropout and removing the batch normalisation just in case there was a simple fix, but nothing seems to work. I haven't included the rest of my code to keep things tidy, if it would help, though, I can add the rest.
This is the noise obtained when using the Conv2DTranspose layers.
Meanwhile the Upsampling with Conv2D layers produce these, for example.
Any comments and suggestions on how to improve my results would be welcomed too.

Related

Tensorflow / Tensorboard missing output from one of hparams during profiling

TF / TB running a profile with the following setup :
HP_NUM_NODES_ONE = hp.HParam('nodes_one', hp.Discrete([128]))
HP_NUM_NODES_TWO = hp.HParam('nodes_two', hp.Discrete([64, 128, 256]))
HP_NUM_NODES_THR = hp.HParam('nodes_thr', hp.Discrete([64, 128, 256]))
HP_NUM_FILT = hp.HParam('num_filter', hp.Discrete([64, 128, 256]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.1, 0.3))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd', 'RMSprop']))
MSE = 'mean_squared_error'
with tf.summary.create_file_writer('logs/hparam_21-6').as_default():
hp.hparams_config(
hparams=[HP_NUM_NODES_ONE, HP_NUM_NODES_TWO, HP_NUM_NODES_THR,
HP_NUM_FILT, HP_DROPOUT, HP_OPTIMIZER],
metrics=[hp.Metric(MSE, display_name='Mean Squared Error')],
)
with this model :
def train_test_model(hparams):
model = tf.keras.models.Sequential([
tf.keras.layers.Conv1D(filters = hparams[HP_NUM_FILT], kernel_size=6, strides=1,
activation='relu', input_shape=(300,4), use_bias=True),
tf.keras.layers.MaxPooling1D(pool_size=100),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(hparams[HP_NUM_NODES_ONE], activation=tf.nn.relu),
tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
tf.keras.layers.Dense(hparams[HP_NUM_NODES_TWO], activation=tf.nn.relu),
tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
tf.keras.layers.Dense(hparams[HP_NUM_NODES_THR], activation="linear"),
])
model.compile(
optimizer=hparams[HP_OPTIMIZER],
loss='mean_squared_error',
metrics=['mean_squared_error'],
)
model.fit(feature3, label2[0,], epochs=500)
_, mean_squared_error = model.evaluate(x_test, y_test[0,])
return mean_squared_error
It all seems to run fine and I do get outputs, however the outputs in tensorboard do not show the results for different values of the "optimizer", which I want also. Does it need to be treated/coded differently? Happy to supply more of the code if this is not clear. Thx. J
I found the problem, as simple as an unticked box in the tensorbox results! So the standard results returned are:
however is I simply tick the optimizer box, the results appear.
I have no idea why tensorboard would, by default, turn this off.

improve CNN model accuracy

We have got some train valid and test data to create as homework CNN1D and to compare results with another model to get the exam marks
I tried with this model however I'm getting 84.18 accuracy Vs 84.58 for the competitor model. my classmates got also the same model as mine and they could improve it to get 85.20% as accuracy. Im just authorized to change the hyper parameters or to add/modify/delete some layers v after the fusion = concate()
Can anyone please help me improve this
def CNN1D ()
n_filters=256,
dropout_rate = 0.4
conv1 = Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv1_", activation="relu")
Dropout1 = Dropout(rate=dropout_rate, name="dropOut1_")
conv2 = Conv1D(filters=n_filters, kernel_size=3, padding='valid', name="conv2_", activation="relu")
Dropout2 = Dropout(rate=dropout_rate, name="dropOut2_")
conv3 = Conv1D(filters=n_filters*2, kernel_size=3, padding='valid', name="conv3_", activation="relu")
Dropout3 = Dropout(rate=dropout_rate, name="dropOut3_")
conv4 = Conv1D(filters=n_filters*2, kernel_size=1, padding='valid', name="conv4_", activation="relu")
Dropout4 = Dropout(rate=dropout_rate,name="dropOut4_")
globPool = GlobalAveragePooling1D()
def TwoBranchModel():
num_units=256
branch1 = CNN1D()
branch2 = CNN1D()
fusion = concate()
out = tf.keras.Sequential([
Dense(num_units,activation='relu'),
BatchNormalization(),
Dense(n_classes,activation='softmax')
])
I would suggest you to try playing with the following
decrease the dropout percentage
Try playing with BatchNormalisation hyperparameters. See here: https://keras.io/api/layers/normalization_layers/batch_normalization/ and adjust the momentum. Also remove the BN layer and see the accuracy.
I am not sure if you can change the below (as per your constraint)
change globalpoolaverage to maxpool
your filter size seems to be constant, it is good idea if start increasing your number of filters. For example start with 32, 64, 128 and so on..
remove some dropout layers.

Trying to visualize activations in Tensorflow

I created a simple cnn to detect custom digits and I am trying to visualize the activations of my layers. When I run the following code layer_outputs = [layer.output for layer in model.layers[:9]] I get the error Layer conv2d has no inbound nodes
When I searched online, it said to define input shape of first layer, but I've already done that and I'm not sure why that is happening. Below is my model.
class myModel(Model):
def __init__(self):
super().__init__()
self.conv1 = Conv2D(filters=32, kernel_size=(3,3), activation='relu', padding='same',
input_shape=(image_height, image_width, num_channels))
self.maxPool1 = MaxPool2D(pool_size=(2,2))
self.conv2 = Conv2D(filters=64, kernel_size=(3,3), activation='relu', padding='same')
self.maxPool2 = MaxPool2D(pool_size=(2,2))
self.conv3 = Conv2D(filters=64, kernel_size=(3,3), activation='relu', padding='same')
self.maxPool3 = MaxPool2D(pool_size=(2,2))
self.flatten = Flatten()
self.d1 = Dense(128, activation='relu')
self.d2 = Dense(10, activation='softmax')
def call(self, x):
x = self.conv1(x)
x = self.maxPool1(x)
x = self.conv2(x)
x = self.maxPool2(x)
x = self.conv3(x)
x = self.maxPool3(x)
x = self.flatten(x)
x = self.d1(x)
x = self.d2(x)
return x
Based on your stated goal and what you've posted, I believe the problem here is slightly (and very understandably) misunderstanding the way the TensorFlow APIs work. The model object and its constituent parts only store state for the model, not the evaluation of it, for example the hyperparameters you've set and the parameters the model learns when its fed training data. Even if you worked to fix the problem with what you're trying, the .output of the layer objects as part of the model wouldn't return the activations you want to visualize. It instead returns the part of the TensorFlow graph that represents that part of the computation.
For what you want to do, you'll need to manipulate an object that's the result of calling the .predict function on the model that you've set up and trained. Or you could drop down to below the Keras abstractions and manipulate the tensors directly.
If I gave this more thought, there's probably a reasonably elegant way to get this by only evaluating your graph (i.e., calling .predict) once, but the most obvious naïve way is simply to instantiate several new models (or several subclasses of your model) with each of the layers of interest as the terminal output, which should get you what you want.
For example, you could do something like this for each of the layers whose outputs you're interested in:
my_test_image = # get an image
input = Input(shape=(None, 256, 256, 3)) # input size will need to be set according to the relevant model
outputs_of_interest = Model(input, my_model.layers[-2].output)
outputs_of_interest.predict(my_test_image) # <=== this has the output you want

'Channels first' training accuracy very low compared to 'channels last'

My issue:
I am trying to train a semantic segmentation model in tf.keras, in fact it works very well when I am using channels_last (WHC) mode (it reaches 96%+ val acc). I wanted to train it in channels_first (CHW) mode so the weights are compatible with TensorRT. When I do this, the ~80% training accuracy in the first few epochs dips down to around 0.020% and stays there permanently.
It is useful to know that the base of my model is a tf.keras.applications.MobileNet() model with the pre-trained 'imagenet' weights. (Model architecture at the bottom.)
The transformation process:
I used the guidelines provided and I change only a few things here:
Set tf.keras.backend.set_image_data_format() to 'channels_first'.
I change the channel order in the input tensor from: input_tensor=Input(shape=(376, 672, 3)) to: input_tensor=Input(shape=(3, 376, 672))
In my image preprocessing (using tf.data.Dataset), i use tf.transpose(img, perm=[2, 0, 1]) on both my input image and one-hot encoded mask to change the channel orders. I checked this with equality assertion to make sure its correct and it seems to be fine.
When I change these the training starts fine but as I said the training accuracy goes down to almost zero. When I revert back everything's fine again.
Possible leads:
What am I doing wrong or what could be the problematic part here? My suspicions are around these questions:
Are the pre-trained imageNet weights changed to the 'channels_first' order also when I set the backend? Is this something I should consider at all?
Could it be that the tf.transpose() function messes up the mask's one-hot encoding? (I have 3 classes represented by 3 colors: lane, opposing lane, background)
Maybe I am not seeing something obvious. I can provide further code and answers as needed.
EDIT:
08/17: This is still an ongoing issue, I have tried several things:
I checked if the image and the mask is correct after the transpose with numpy assertion, seems correct.
I suspected that the loss function calculates on the wrong axis, so I customized the loss function for the first axis (where the channels are). Here it is:
def ReverseAxisLoss(y_true, y_pred):
return K.categorical_crossentropy(y_true, y_pred, from_logits=True, axis=1)
My main suspicion is that the 'channels first' backend setting does nothing to transpose the pretrained 'imagenet' weights for the mobilenet part. Is there an updated way for TF2.x / Keras to transpose the pre-trained weights into CHW format?
Here is the architecture that I use (the skipNet() is the head network and the mobilenet is the base, and it is connected in the create_model() function)
def skipNet(encoder_output, feed1, feed2, classes):
# random initializer and regularizer
stddev = 0.01
init = RandomNormal(stddev=stddev)
weight_decay = 1e-3
reg = l2(weight_decay)
score_feed2 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
kernel_initializer=init, kernel_regularizer=reg)(feed2)
score_feed2_bn = BatchNormalization()(score_feed2)
score_feed1 = Conv2D(kernel_size=(1, 1), filters=classes, padding="SAME",
kernel_initializer=init, kernel_regularizer=reg)(feed1)
score_feed1_bn = BatchNormalization()(score_feed1)
upscore2 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
padding="SAME", kernel_initializer=init,
kernel_regularizer=reg)(encoder_output)
height_pad1 = ZeroPadding2D(padding=((1,0),(0,0)))(upscore2)
upscore2_bn = BatchNormalization()(height_pad1)
fuse_feed1 = add([score_feed1_bn, upscore2_bn])
upscore4 = Conv2DTranspose(kernel_size=(4, 4), filters=classes, strides=(2, 2),
padding="SAME", kernel_initializer=init,
kernel_regularizer=reg)(fuse_feed1)
height_pad2 = ZeroPadding2D(padding=((0,1),(0,0)))(upscore4)
upscore4_bn = BatchNormalization()(height_pad2)
fuse_feed2 = add([score_feed2_bn, upscore4_bn])
upscore8 = Conv2DTranspose(kernel_size=(16, 16), filters=classes, strides=(8, 8),
padding="SAME", kernel_initializer=init,
kernel_regularizer=reg, activation="softmax")(fuse_feed2)
return upscore8
def create_model(classes):
base_model = tf.keras.applications.MobileNet(input_tensor=Input(shape=IMG_SHAPE),
include_top=False,
weights='imagenet')
conv4_2_output = base_model.get_layer(index=43).output
conv3_2_output = base_model.get_layer(index=30).output
conv_score_output = base_model.output
head_model = skipNet(conv_score_output, conv4_2_output, conv3_2_output, classes)
for layer in base_model.layers:
layer.trainable = False
model = Model(inputs=base_model.input, outputs=head_model)
return model

How can I change dense layer to conv2D layer in keras?

I want to classification for pictures with different input sizes. I would like to use the following paper ideas.
'Fully Convolutional Networks for Semantic Segmentation'
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf
I did change the dense layer to conv2D layer like this.
def FullyCNN(input_shape, n_classes):
inputs = Input(shape=(None, None, 1))
first_layer = Conv2D(filters=16, kernel_size=(12,16), strides=1, activation='relu', kernel_initializer='he_normal', name='conv1')(inputs)
first_layer = BatchNormalization()(first_layer)
first_layer = MaxPooling2D(pool_size=2)(first_layer)
second_layer = Conv2D(filters=24, kernel_size=(8,12), strides=1, activation='relu', kernel_initializer='he_normal', name='conv2')(first_layer)
second_layer = BatchNormalization()(second_layer)
second_layer = MaxPooling2D(pool_size=2)(second_layer)
third_layer = Conv2D(filters=32, kernel_size=(5,7), strides=1, activation='relu', kernel_initializer='he_normal', name='conv3')(first_layer)
third_layer = BatchNormalization()(third_layer)
third_layer = MaxPooling2D(pool_size=2)(third_layer)
fully_layer = Conv2D(64, kernel_size=8, activation='relu', kernel_initializer='he_normal')(third_layer)
fully_layer = BatchNormalization()(fully_layer)
fully_layer = Dropout(0.5)(fully_layer)
fully_layer = Conv2D(n_classes, kernel_size=1)(fully_layer)
output = Conv2DTranspose(n_classes, kernel_size=1, activation='softmax')(fully_layer)
model = Model(inputs=inputs, outputs=output)
return model
and I made generator for using fit_generator().
def data_generator(x_train, y_train):
while True:
index = np.asscalar(np.random.choice(len(x_train),1))
feature = np.expand_dims(x_train[index],-1)
feature = np.resize(feature,(-1,feature.shape))
feature = np.expand_dims(feature,0) # make (1,input_height,input_width,1)
label = y_train[index]
yield (feature,label)
and These are images about my data.
However, there is some problems about dimension.
Since the output layer must have 4 dimensions unlike the original CNN model, dimensions do not fit in the label.
Model summary:
Original CNN model summary:
How can I handle this problem? I tried to change dimension about label by expanding the dimension.
label = np.expand_dims(label,0)
label = np.expand_dims(label,0)
label = np.expand_dims(label,0)
I think there is a better way and I wonderd that is it necessary to have conv2DTranspose? And should batch size be 1?