How is it possible to encode an input with one 2D convolution and apply the opposite 2D deconvolution / transposed convolution to get the same dimensions back?

I am working on an autoencoder and have trouble reproducing the input at its original size. If I use a transposed convolution (deconvolution) with the same parameters as the convolution, the output size differs from that of the original input. To illustrate the problem, assume the model consists of just one convolution (to encode the input) and one transposed convolution (to decode the encoded input). The output does not have the same size as the input: the second and third dimensions (axis 1 and axis 2) are 16 rather than the expected 15. Here is the code:
import tensorflow as tf

input = tf.keras.Input(shape=(15, 15, 3), name="Input0")
conv2d_layer2 = tf.keras.layers.Conv2D(filters=32, strides=[2, 2], kernel_size=[3, 3],
                                       padding='same',
                                       activation='selu', name="Conv1")
conv2d_trans_layer2 = tf.keras.layers.Conv2DTranspose(filters=32, strides=[2, 2],
                                                      kernel_size=[3, 3], padding='same',
                                                      activation='selu', name="DeConv1")
x_encoded_1 = conv2d_layer2(input)
x_reconstructed = conv2d_trans_layer2(x_encoded_1)
model = tf.keras.Model(inputs=input, outputs=x_reconstructed)
Results in the following model:
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Input0 (InputLayer) [(None, 15, 15, 3)] 0
_________________________________________________________________
Conv1 (Conv2D) (None, 8, 8, 32) 896
_________________________________________________________________
DeConv1 (Conv2DTranspose) (None, 16, 16, 32) 9248
=================================================================
Total params: 10,144
Trainable params: 10,144
How can I reproduce my original input size using just this transposed convolution? Is this possible?

If you remove padding='same' from both layers, so that both fall back to the default padding='valid', the spatial dimensions of the input are reproduced:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv2D, Conv2DTranspose

input = Input(shape=(15, 15, 3), name="Input0")
conv2d_layer2 = Conv2D(filters=32, strides=[2, 2], kernel_size=[3, 3],
                       activation='selu', name="Conv1")(input)
conv2d_trans_layer2 = Conv2DTranspose(filters=32, strides=[2, 2],
                                      kernel_size=[3, 3],
                                      activation='selu', name="DeConv1")(conv2d_layer2)
model = Model(inputs=input, outputs=conv2d_trans_layer2)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Input0 (InputLayer) [(None, 15, 15, 3)] 0
_________________________________________________________________
Conv1 (Conv2D) (None, 7, 7, 32) 896
_________________________________________________________________
DeConv1 (Conv2DTranspose) (None, 15, 15, 32) 9248
=================================================================
In general, to do this in deeper architectures you have to experiment with padding, strides and pooling.
There are many good online resources that explain how these operations work and how to apply them in Keras:
Padding and Stride for Convolutional Neural Networks
Pooling Layers for Convolutional Neural Networks
How to use the UpSampling2D and Conv2DTranspose
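The two summaries above can be checked by hand. The following is a minimal sketch of the size arithmetic along one spatial axis, assuming a dilation rate of 1 and output_padding left at its default of None; the helper names conv_out and conv_transpose_out are made up for this illustration and are not part of Keras.

```python
import math


def conv_out(size, kernel, stride, padding):
    """Output size of a Conv2D along one spatial axis (dilation rate 1)."""
    if padding == "same":
        return math.ceil(size / stride)
    # 'valid': only positions where the kernel fits entirely inside the input
    return (size - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Output size of a Conv2DTranspose along one spatial axis
    (assumes output_padding is left at None)."""
    if padding == "same":
        return size * stride
    return size * stride + max(kernel - stride, 0)


for padding in ("same", "valid"):
    encoded = conv_out(15, kernel=3, stride=2, padding=padding)
    decoded = conv_transpose_out(encoded, kernel=3, stride=2, padding=padding)
    print(padding, 15, "->", encoded, "->", decoded)
# same 15 -> 8 -> 16
# valid 15 -> 7 -> 15
```

Note that the 'valid' round trip only works here because 15 - 3 is divisible by the stride. With an input size of 16 the same pair of layers gives 16 -> 7 -> 15, so the input size, kernel size and stride have to be chosen together.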

Related

Sentence classification: Why does my embedding not reduce the shape of the subsequent layer?

I want to embed sentences that all contain 5 words, and my training set has a total vocabulary of 10000 words. I use this code:
import tensorflow as tf
vocab_size = 10000
inputs = tf.keras.layers.Input(shape=(5, vocab_size), name="input")
embedding = tf.keras.layers.Embedding(10000, 64)(inputs)
conv2d_1 = tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3),
                                  strides=1, padding='same')(embedding)
model = tf.keras.models.Model(inputs=inputs, outputs=conv2d_1)
model.summary()
After running I get:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 5, 10000)] 0
_________________________________________________________________
embedding_105 (Embedding) (None, 5, 10000, 64) 640000
_________________________________________________________________
conv2d_102 (Conv2D) (None, 5, 10000, 32) 18464
=================================================================
I want to use the embedding to convert the sparse 10000x5 tensor into a dense 64x5 tensor. Apparently that doesn't work as intended, so my question is: why is the shape of the next layer (None, 5, 10000, 32) rather than (None, 5, 64, 32)? How can I achieve this compaction?
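A likely explanation, offered as a sketch rather than a confirmed answer: an embedding layer looks up one vector per integer word index and therefore appends a new axis of size output_dim to whatever input shape it receives. It does not contract a one-hot axis. The pure-Python helpers below (embedding_output_shape and embed are hypothetical names, not Keras functions) mimic that behaviour.

```python
def embedding_output_shape(input_shape, output_dim):
    """An embedding lookup appends output_dim to the input shape."""
    return tuple(input_shape) + (output_dim,)


def embed(indices, table):
    """Recursively replace every integer index with its row of `table`."""
    if isinstance(indices, int):
        return table[indices]
    return [embed(item, table) for item in indices]


# One-hot style input, as in the question: the vocabulary axis survives.
print(embedding_output_shape((None, 5, 10000), 64))  # (None, 5, 10000, 64)
# Integer word indices, one per word: the result is dense.
print(embedding_output_shape((None, 5), 64))         # (None, 5, 64)

# Tiny lookup table: vocabulary of 3 words, embedding size 2.
table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
sentence = [2, 0, 1]  # three word indices
print(embed(sentence, table))  # [[2.0, 2.1], [0.0, 0.1], [1.0, 1.1]]
```

Under that reading, the input would be declared with shape (5,) and fed integer word indices. The embedded result (None, 5, 64) is then three-dimensional, so a Conv2D would additionally need a channel axis (or a Conv1D could be used instead).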

Regularization function using weights from multiple layers?

I don't know if it is feasible but I'm asking just in case. Here is the (simplified) architecture of my model.
Layer (type)            Output Shape            Param #    Connected to
=======================================================================
input_1 (InputLayer)    [(None, 7, 7, 1024)]    0
conv (Conv2D)           (None, 7, 7, 10)        10240      input_1[0][0]
where each of the 10 filters in "conv" is a 1x1x1024 convolutional filter (with no bias but it's irrelevant for this particular issue).
I am currently using a custom regularization function on "conv" to make sure that the (1x1)x1024x10 matrix of filter weights has a nice property (basically that all vectors are pairwise orthogonal) and so far, everything is working as expected.
Now I also want the ability to disable training on some of these 10 filters. The only way I know how to do that would be to implement 10 filters independently as follows
Layer (type)            Output Shape            Param #    Connected to
=======================================================================
input_1 (InputLayer)    [(None, 7, 7, 1024)]    0
conv_1 (Conv2D)         (None, 7, 7, 1)         1024       input_1[0][0]
conv_2 (Conv2D)         (None, 7, 7, 1)         1024       input_1[0][0]
conv_3 (Conv2D)         (None, 7, 7, 1)         1024       input_1[0][0]
...
conv_10 (Conv2D)        (None, 7, 7, 1)         1024       input_1[0][0]
followed by a Concatenate layer, then to set the "trainable" parameter to True/False on each conv_i layer as I see fit. However, now I don't know how to implement my regularization function which must be computed on the weights of all layers conv_i simultaneously rather than independently.
Is there a trick that I can use to implement such function? Or conversely, is there a way to freeze only part of the weights of a convolutional layer?
Thanks!
Solution
For those interested, here is the working code for my problem, following the advice provided by @LaplaceRicky.
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

class SpecialRegularization(tf.keras.Model):
    """ In order to avoid a warning message when saving the model,
    I use the solution indicated here
    https://github.com/tensorflow/tensorflow/issues/44541
    and now inherit from tf.keras.Model instead of Layer
    """
    def __init__(self,nfilters,**kwargs):
        super().__init__(**kwargs)
        self.inner_layers=[Conv2D(1,(1,1)) for _ in range(nfilters)]

    def call(self, inputs):
        outputs=[l(inputs) for l in self.inner_layers]
        self.add_loss(self.define_your_regularization_here())
        return tf.concat(outputs,-1)

    def set_trainable_parts(self, trainables):
        """ Set the trainable attribute independently on each filter """
        for l,t in zip(self.inner_layers,trainables):
            l.trainable = t

    def define_your_regularization_here(self):
        #reconstruct the original kernel
        large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
        return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])
One way to achieve this is to have a custom Keras layer that wraps all of the small conv layers and is responsible for computing the regularization loss.
Example code:
import tensorflow as tf

def _get_losses(model,x):
    model(x)
    return model.losses

def _get_grads(model,x):
    with tf.GradientTape() as t:
        model(x)
        reg_loss=tf.math.add_n(model.losses)
    return t.gradient(reg_loss,model.trainable_weights)

class SpecialRegularization(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        #call the base constructor first so that Keras can track the inner layers
        super().__init__(**kwargs)
        self.inner_layers=[tf.keras.layers.Conv2D(1,(1,1)) for i in range(10)]

    def call(self, inputs,training=None):
        outputs=[l(inputs,training=training) for l in self.inner_layers]
        self.add_loss(self.define_your_regularization_here())
        return tf.concat(outputs,-1)

    def define_your_regularization_here(self):
        #reconstruct the original kernel
        large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
        #just giving an example here
        #you should define your own regularization using the entire kernel
        return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])

tf.random.set_seed(123)
inputs = tf.keras.Input(shape=(7,7,1024))
outputs = SpecialRegularization()(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

#get_losses, get_grads are for demonstration purpose
get_losses=tf.function(_get_losses)
get_grads=tf.function(_get_grads)
data=tf.random.normal((64,7,7,1024))
print(get_losses(model,data))
print(get_grads(model,data)[0])
print(model.layers[1].inner_layers[-1].kernel*2)
model.summary()
'''
[<tf.Tensor: shape=(), dtype=float32, numpy=-0.20446025>]
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 7, 7, 1024)] 0
_________________________________________________________________
special_regularization (Spec (None, 7, 7, 10) 10250
=================================================================
Total params: 10,250
Trainable params: 10,250
Non-trainable params: 0
_________________________________________________________________
'''
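The regularization in the example above is only a placeholder. The question mentions wanting the filter vectors to be pairwise orthogonal, but the asker's actual regularizer is not shown, so the following is just one possible penalty, written in plain Python to show the arithmetic. The name pairwise_orthogonality_penalty and the toy 3-dimensional filters are assumptions for illustration; in the layer above the same quantity would be computed from large_kernel, with one 1024-dimensional vector per filter.

```python
def pairwise_orthogonality_penalty(filters):
    """Sum of squared dot products over all distinct pairs of filter vectors.

    The result is zero exactly when the vectors are pairwise orthogonal.
    """
    penalty = 0.0
    for i in range(len(filters)):
        for j in range(i + 1, len(filters)):
            dot = sum(a * b for a, b in zip(filters[i], filters[j]))
            penalty += dot * dot
    return penalty


# Three toy "filters" of dimension 3 (stand-ins for the 1024-dim vectors).
orthogonal = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
correlated = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(pairwise_orthogonality_penalty(orthogonal))  # 0.0
print(pairwise_orthogonality_penalty(correlated))  # 1.0
```

Because the penalty couples every pair of filters, it has to be computed where all the kernels are visible at once, which is what the wrapping layer provides.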

AttributeError: The layer has never been called and thus has no defined output shape

I am trying to define a model happyModel()
# GRADED FUNCTION: happyModel
def happyModel():
    """
    Implements the forward propagation for the binary classification model:
    ZEROPAD2D -> CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> FLATTEN -> DENSE

    Note that for simplicity and grading purposes, you'll hard-code all the values
    such as the stride and kernel (filter) sizes.
    Normally, functions should take these values as function parameters.

    Arguments:
    None

    Returns:
    model -- TF Keras model (object containing the information for the entire training process)
    """
    model = tf.keras.Sequential(
        [
            ## ZeroPadding2D with padding 3, input shape of 64 x 64 x 3
            tf.keras.layers.ZeroPadding2D(padding=(3,3), data_format=(64,64,3)),
            ## Conv2D with 32 7x7 filters and stride of 1
            tf.keras.layers.Conv2D(32, (7, 7), strides = (1, 1), name = 'conv0'),
            ## BatchNormalization for axis 3
            tf.keras.layers.BatchNormalization(axis = 3, name = 'bn0'),
            ## ReLU
            tf.keras.layers.Activation('relu'),
            ## Max Pooling 2D with default parameters
            tf.keras.layers.MaxPooling2D((2, 2), name='max_pool0'),
            ## Flatten layer
            tf.keras.layers.Flatten(),
            ## Dense layer with 1 unit for output & 'sigmoid' activation
            tf.keras.layers.Dense(1, activation='sigmoid', name='fc'),
            # YOUR CODE STARTS HERE
            # YOUR CODE ENDS HERE
        ]
    )
    return model
The following code creates an instance of the model defined above:
happy_model = happyModel()
# Print a summary for each layer
for layer in summary(happy_model):
    print(layer)
output = [['ZeroPadding2D', (None, 70, 70, 3), 0, ((3, 3), (3, 3))],
['Conv2D', (None, 64, 64, 32), 4736, 'valid', 'linear', 'GlorotUniform'],
['BatchNormalization', (None, 64, 64, 32), 128],
['ReLU', (None, 64, 64, 32), 0],
['MaxPooling2D', (None, 32, 32, 32), 0, (2, 2), (2, 2), 'valid'],
['Flatten', (None, 32768), 0],
['Dense', (None, 1), 32769, 'sigmoid']]
comparator(summary(happy_model), output)
I got the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-67-f33284fd82fe> in <module>
1 happy_model = happyModel()
2 # Print a summary for each layer
----> 3 for layer in summary(happy_model):
4 print(layer)
5
~/work/release/W1A2/test_utils.py in summary(model)
30 result = []
31 for layer in model.layers:
---> 32 descriptors = [layer.__class__.__name__, layer.output_shape, layer.count_params()]
33 if (type(layer) == Conv2D):
34 descriptors.append(layer.padding)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in output_shape(self)
2177 """
2178 if not self._inbound_nodes:
-> 2179 raise AttributeError('The layer has never been called '
2180 'and thus has no defined output shape.')
2181 all_output_shapes = set(
AttributeError: The layer has never been called and thus has no defined output shape.
I suspect my call to ZeroPadding2D() is not right. The project seems to require the input shape of ZeroPadding2D() to be 64x64x3. I tried many formats but could not fix the problem. Can anyone give me a pointer? Thanks a lot.
In your model definition, there's an issue with the following layer:
tf.keras.layers.ZeroPadding2D(padding=(3,3), data_format=(64,64,3)),
First, you didn't define any input shape. Second, data_format must be a string, either channels_last (the default) or channels_first, not a shape tuple. The correct way to define the above model is as follows:
def happyModel():
    model = tf.keras.Sequential(
        [
            ## ZeroPadding2D with padding 3, input shape of 64 x 64 x 3
            tf.keras.layers.ZeroPadding2D(padding=(3,3),
                                          input_shape=(64, 64, 3), data_format="channels_last"),
            ....
            ....

happy_model = happyModel()
happy_model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
zero_padding2d_4 (ZeroPaddin (None, 70, 70, 3) 0
_________________________________________________________________
conv0 (Conv2D) (None, 64, 64, 32) 4736
_________________________________________________________________
bn0 (BatchNormalization) (None, 64, 64, 32) 128
_________________________________________________________________
activation_2 (Activation) (None, 64, 64, 32) 0
_________________________________________________________________
max_pool0 (MaxPooling2D) (None, 32, 32, 32) 0
_________________________________________________________________
flatten_16 (Flatten) (None, 32768) 0
_________________________________________________________________
fc (Dense) (None, 1) 32769
=================================================================
Total params: 37,633
Trainable params: 37,569
Non-trainable params: 64
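The shapes and parameter counts in this summary can be reproduced by hand, which is a quick way to confirm that the corrected model matches the expected output in the question. This is only a sketch of the arithmetic in plain Python; the helper conv2d_params is a made-up name, not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


size, channels = 64, 3

padded = size + 2 * 3            # ZeroPadding2D(padding=3): 3 pixels on each side
conv = padded - 7 + 1            # Conv2D 7x7, stride 1, 'valid' padding
conv_params = conv2d_params(7, 7, channels, 32)
channels = 32

bn_params = 4 * channels         # gamma, beta, moving mean, moving variance
bn_non_trainable = 2 * channels  # the two moving statistics are not trained

pooled = conv // 2               # MaxPooling2D with a 2x2 window
flat = pooled * pooled * channels
dense_params = flat * 1 + 1      # one output unit plus its bias

total = conv_params + bn_params + dense_params
print(padded, conv, pooled, flat)            # 70 64 32 32768
print(conv_params, bn_params, dense_params)  # 4736 128 32769
print(total, total - bn_non_trainable)       # 37633 37569
```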
Per the documentation for tf.keras.Sequential() (https://www.tensorflow.org/api_docs/python/tf/keras/Sequential):
"Optionally, the first layer can receive an input_shape argument"
So instead of
tf.keras.layers.ZeroPadding2D(padding=(3,3), data_format=(64,64,3))
if you want to specify input shape it should be
tf.keras.layers.ZeroPadding2D(padding=(3,3), input_shape=(64,64,3))
    model = tf.keras.Sequential([
        # YOUR CODE STARTS HERE
        tf.keras.layers.ZeroPadding2D(padding=(3, 3), input_shape=(64,64,3), data_format="channels_last"),
        tf.keras.layers.Conv2D(32, (7, 7), strides = (1, 1)),
        tf.keras.layers.BatchNormalization(axis=3),
        tf.keras.layers.ReLU(),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, activation='sigmoid'),
        # YOUR CODE ENDS HERE
    ])
    return model
Try this; it works for me.
# assumes `import tensorflow.keras.layers as tfl`
model = tf.keras.Sequential(
    [
        ## ZeroPadding2D with padding 3, input shape of 64 x 64 x 3
        ## Conv2D with 32 7x7 filters and stride of 1
        ## BatchNormalization for axis 3
        ## ReLU
        ## Max Pooling 2D with default parameters
        ## Flatten layer
        ## Dense layer with 1 unit for output & 'sigmoid' activation
        # YOUR CODE STARTS HERE
        tfl.ZeroPadding2D(padding=(3,3), input_shape=(64,64,3),data_format="channels_last"),
        tfl.Conv2D(32, (7, 7), strides = (1, 1), name = 'conv0'),
        tfl.BatchNormalization(axis = 3, name = 'bn0'),
        tfl.ReLU(),
        tfl.MaxPooling2D((2, 2), name='max_pool0'),
        tfl.Flatten(),
        tfl.Dense(1, activation='sigmoid', name='fc'),
        # YOUR CODE ENDS HERE
    ])
This works for me; you can try it.

What is the correct way to upsample a [32x32x6] layer in a CNN

I have a CNN that produces a [32x32] image with 6 channels, but I need to upsample it to 256x256. I'm doing:
def upsample(filters, size):
    initializer = tf.random_normal_initializer(0., 0.02)
    result = tf.keras.Sequential()
    result.add(tf.keras.layers.Conv2DTranspose(filters, size, strides=2,
                                               padding='same',
                                               kernel_initializer=initializer,
                                               use_bias=False))
    return result
Then I pass the layer like this:
up_stack = [
    upsample(6, 3),  # x2
    upsample(6, 3),  # x2
    upsample(6, 3)   # x2
]
for up in up_stack:
    finalLayer = up(finalLayer)
But this setup produces inaccurate results. Is there anything I'm doing wrong?
Your other option would be to use tf.keras.layers.UpSampling2D, but it does not learn a kernel for upsampling (it uses nearest-neighbor interpolation by default, or bilinear interpolation with interpolation='bilinear').
So your approach is correct in principle, but you have used a 3x3 kernel.
I would use a 2x2 kernel instead, and if you are still not satisfied with the results, increase the number of filters to a value in the range 32 to 256.
If you wish to use the up-convolution, I suggest the following. The code works as written; just change the number of filters to suit your needs (the final layer would need 6 filters to keep 6 output channels).
import tensorflow as tf
from tensorflow.keras import layers
# in = 32x32 out 256x256
inputs = layers.Input(shape=(32, 32, 6))
deconc01 = layers.Conv2DTranspose(256, kernel_size=2, strides=(2, 2), activation='relu')(inputs)
deconc02 = layers.Conv2DTranspose(256, kernel_size=2, strides=(2, 2), activation='relu')(deconc01)
outputs = layers.Conv2DTranspose(256, kernel_size=2, strides=(2, 2), activation='relu')(deconc02)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name="up-conv")
Model: "up-conv"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 32, 32, 6)] 0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 64, 64, 256) 6400
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 128, 128, 256) 262400
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 256, 256, 256) 262400
=================================================================
Total params: 531,200
Trainable params: 531,200
Non-trainable params: 0
_________________________________________________________________
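For comparison with the learned up-convolution above, here is a minimal sketch of what non-learned nearest-neighbor upsampling does (the default interpolation mode of UpSampling2D): every pixel is simply repeated along both axes. The function upsample_nearest is a hypothetical pure-Python stand-in for one channel, not the Keras implementation.

```python
def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times along both axes."""
    result = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        result.extend(list(wide_row) for _ in range(factor))
    return result


small = [[1, 2],
         [3, 4]]
for row in upsample_nearest(small, 2):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]

# A 32x32 channel upsampled by 8 (2 * 2 * 2) reaches 256x256.
big = upsample_nearest([[0] * 32 for _ in range(32)], 8)
print(len(big), len(big[0]))  # 256 256
```

Since nothing is learned here, such a layer is usually followed by an ordinary convolution when the upsampled result needs to be refined.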

Keras LSTM with stateful in Reinforcement Learning

I'm implementing a simple DQN RL algorithm with Keras, but using an LSTM in the network. The idea is that a stateful LSTM will remember the relevant information from all prior states and thus predict rewards for different actions better. This is more of a Keras problem than an RL problem: I think I am not handling the stateful LSTM correctly.
MODEL CODE - functional api used:
state_input = Input(batch_shape=(batch_size, look_back, 1, resolution[0], resolution[1]))
conv1 = TimeDistributed(Conv2D(8, 6, strides=3, activation='relu', data_format="channels_first"))(
    state_input)  # filters, kernel_size, strides
conv2 = TimeDistributed(Conv2D(8, 3, strides=2, activation='relu', data_format="channels_first"))(
    conv1)  # filters, kernel_size, strides
flatten = TimeDistributed(Flatten())(conv2)
fc1 = TimeDistributed(Dense(128, activation='relu'))(flatten)
fc2 = TimeDistributed(Dense(64, activation='relu'))(fc1)
lstm_layer = LSTM(4, stateful=True)(fc2)
fc3 = Dense(128, activation='relu')(lstm_layer)
fc4 = Dense(available_actions_count)(fc3)
model = keras.models.Model(inputs=state_input, outputs=fc4)
optimizer = RMSprop(lr=learning_rate)  # 0.001
model.compile(loss="mse", optimizer=optimizer)
model.summary()
This is the model summary:
Layer (type) Output Shape Param
=================================================================
input_1 (InputLayer) (1, 1, 1, 30, 45) 0
_________________________________________________________________
time_distributed_1 (TimeDist (1, 1, 8, 9, 14) 296
_________________________________________________________________
time_distributed_2 (TimeDist (1, 1, 8, 4, 6) 584
_________________________________________________________________
time_distributed_3 (TimeDist (1, 1, 192) 0
_________________________________________________________________
time_distributed_4 (TimeDist (1, 1, 128) 24704
_________________________________________________________________
time_distributed_5 (TimeDist (1, 1, 64) 8256
_________________________________________________________________
lstm_1 (LSTM) (1, 4) 1104
_________________________________________________________________
dense_3 (Dense) (1, 128) 640
_________________________________________________________________
dense_4 (Dense) (1, 8) 1032
=================================================================
Total params: 36,616
Trainable params: 36,616
Non-trainable params: 0
================================================================
I feed in one frame at a time to fit the model. When I need to predict to act, I make sure I save the model state and restore it as follows.
CODE TO FIT/TRAIN THE MODEL:
#save the state (lstm memory) for recovery before fitting.
prev_state = get_model_states(model)
target_q = model.predict(s1, batch_size=batch_size)
#lstm predict updates the state of the lstm modules
q_next = model.predict(s2, batch_size=batch_size)
max_q_next = np.max(q_next, axis=1)
target_q[np.arange(target_q.shape[0]), a] = r + discount_factor * (1 - isterminal) * max_q_next
#now recover states for fitting the model correctly
set_model_states(model,prev_state)#to before s1 prediction
model.fit(s1, target_q,batch_size=batch_size, verbose=0)
#after fitting, the state and weights both get updated !!
#so lstm has already moved forward in the sequence
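The target update in the middle of the training code is the standard one-step Q-learning target. The following plain-Python sketch spells out that single line for a small batch so the indexing is easy to follow; q_targets and the toy numbers are made up for illustration, and the sketch says nothing about the LSTM state handling itself.

```python
def q_targets(q_current, q_next, actions, rewards, terminals, discount):
    """Copy q_current and overwrite the entry of the action taken with
    r + discount * (1 - terminal) * max over a' of Q(s', a')."""
    targets = [list(row) for row in q_current]
    for i, action in enumerate(actions):
        bootstrap = 0.0 if terminals[i] else discount * max(q_next[i])
        targets[i][action] = rewards[i] + bootstrap
    return targets


q_current = [[0.5, 0.2], [0.1, 0.4]]  # Q(s1, .) for two transitions
q_next = [[1.0, 2.0], [3.0, 0.0]]     # Q(s2, .) for the same transitions

targets = q_targets(q_current, q_next,
                    actions=[1, 0], rewards=[1.0, -1.0],
                    terminals=[False, True], discount=0.5)
print(targets)  # [[0.5, 2.0], [-1.0, 0.4]]
```

Only the entry of the action actually taken changes, so the mean squared error on the other outputs is zero; for a terminal transition the bootstrap term is dropped.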
The model does not seem to be working at all. The variance remains very high across epochs. I reset the model after every episode, as one would expect, so statefulness does not affect training between episodes. Each episode is fed in one frame at a time, which is why I need a stateful LSTM.
I've tried different discount factors and learning rates. In theory, this should be a superior model to the vanilla DQN (a CNN with 4 frames).
What am I doing wrong? Any help would be appreciated.