Tensorflow - fixed convolutional kernel

I am trying to create a fixed convolutional kernel that applies a blur filter to each channel separately:
# inputs = <previous layer>
kernel_weights = np.array([[1, 2, 1],
                           [2, 4, 2],
                           [1, 2, 1]])
kernel_weights = kernel_weights / np.sum(kernel_weights)
kernel_weights = np.reshape(kernel_weights, (*kernel_weights.shape, 1, 1))
kernel_weights = np.tile(kernel_weights, (1, 1, inputs.get_shape().as_list()[3], 1))
return tf.nn.depthwise_conv2d_native(inputs, kernel_weights, strides=(1, 2, 2, 1), padding="SAME")
I'm currently under the impression that this convolutional kernel can/will change during training - how can I prevent it from doing so?
Would it be sufficient to wrap it in a tf.constant before passing it to the conv2d layer? Like so:
kernel_weights = tf.constant(kernel_weights)
Thanks!

As GPhilo's comment correctly identifies: just passing the kernel as a tf.constant (or a plain numpy array) works. I verified this by plotting the kernel's histogram in TensorBoard.
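For reference, here is a minimal sketch of the whole layer with the kernel wrapped in a tf.constant (TF 1.x; the function name blur_pool and the use of `inputs` as the previous layer are illustrative):

import numpy as np
import tensorflow as tf

def blur_pool(inputs):
    # normalised 3x3 blur kernel
    kernel = np.array([[1., 2., 1.],
                       [2., 4., 2.],
                       [1., 2., 1.]])
    kernel = kernel / np.sum(kernel)
    in_channels = inputs.get_shape().as_list()[3]
    # reshape/tile to [filter_height, filter_width, in_channels, channel_multiplier]
    kernel = np.tile(kernel[:, :, np.newaxis, np.newaxis], (1, 1, in_channels, 1))
    # a tf.constant has no variable behind it, so the optimizer can never update it
    kernel = tf.constant(kernel, dtype=tf.float32)
    return tf.nn.depthwise_conv2d_native(inputs, kernel, strides=(1, 2, 2, 1), padding="SAME")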

Related

Tensorflow conv2d on RGB image

From the accepted answer in this question, given the following input and kernel matrices, the output of tf.nn.conv2d is
[[14  6]
 [ 6 12]]
which makes sense. However, when I make the input and kernel matrices 3-channel each (by repeating each original matrix) and run the same code:
# the previous input
i_grey = np.array([
    [4, 3, 1, 0],
    [2, 1, 0, 1],
    [1, 2, 4, 1],
    [3, 1, 0, 2]
])
# copy to 3-dimensions
i_rgb = np.repeat(np.expand_dims(i_grey, axis=0), 3, axis=0)
# convert to tensor
i_rgb = tf.constant(i_rgb, dtype=tf.float32)
# make kernel depth match input; same process as input
k = np.array([
    [1, 0, 1],
    [2, 1, 0],
    [0, 0, 1]
])
k_rgb = np.repeat(np.expand_dims(k, axis=0), 3, axis=0)
# convert to tensor
k_rgb = tf.constant(k_rgb, dtype=tf.float32)
Here's what my input and kernel matrices look like at this point:
# reshape input to format: [batch, in_height, in_width, in_channels]
image_rgb = tf.reshape(i_rgb, [1, 4, 4, 3])
# reshape kernel to format: [filter_height, filter_width, in_channels, out_channels]
kernel_rgb = tf.reshape(k_rgb, [3, 3, 3, 1])

conv_rgb = tf.squeeze(tf.nn.conv2d(image_rgb, kernel_rgb, [1, 1, 1, 1], "VALID"))

with tf.Session() as sess:
    conv_result = sess.run(conv_rgb)
    print(conv_result)
I get the final output:
[[35. 15.]
 [35. 26.]]
But I was expecting the original output*3:
[[42. 18.]
 [18. 36.]]
because from my understanding, each channel of the kernel is convolved with each channel of the input, and the resultant matrices are summed to get the final output.
Am I missing something from this process or the tensorflow implementation?
Reshape is a tricky function: it will give you the shape you ask for, but it can easily scramble the data while doing so. In cases like yours it is best to avoid reshape altogether.
In this particular case it is better to duplicate the arrays along a new axis. With the [batch, in_height, in_width, in_channels] format, channels is the last dimension, so that is the axis to pass to repeat(). The following code better reflects the logic:
i_grey = np.expand_dims(i_grey, axis=0) # add batch dim
i_grey = np.expand_dims(i_grey, axis=3) # add channel dim
i_rgb = np.repeat(i_grey, 3, axis=3 ) # duplicate along channels dim
And likewise with filters:
k = np.expand_dims(k, axis=2) # input channels dim
k = np.expand_dims(k, axis=3) # output channels dim
k_rgb = np.repeat(k, 3, axis=2) # duplicate along the input channels dim
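As a quick sanity check (a minimal sketch continuing from the arrays above, TF 1.x), feeding these correctly expanded tensors to tf.nn.conv2d should now give the 3x result the question expected:

i_rgb = tf.constant(i_rgb, dtype=tf.float32)   # shape [1, 4, 4, 3]
k_rgb = tf.constant(k_rgb, dtype=tf.float32)   # shape [3, 3, 3, 1]
conv_rgb = tf.squeeze(tf.nn.conv2d(i_rgb, k_rgb, [1, 1, 1, 1], "VALID"))
with tf.Session() as sess:
    print(sess.run(conv_rgb))   # [[42. 18.]
                                #  [18. 36.]]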

Tensorflow: Get the same tensor after a series of convolution and deconvolution

I am wondering whether it is possible to end up with the same tensor after propagating it through a convolutional and then deconvolutional filter. For example:
random_image = np.random.rand(1, 6, 6, 3)
input_image = tf.placeholder(shape=[1, 6, 6, 3], dtype=tf.float32)
conv = tf.layers.conv2d(input_image, filters=6, kernel_size=[3, 3], strides=(1, 1), data_format="channels_last")
deconv = tf.layers.conv2d_transpose(conv, filters=3, kernel_size=[3, 3], strides=(1, 1), data_format="channels_last")
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(random_image)
# Get an output which will be the same as:
print(sess.run(deconv, feed_dict={input_image: random_image}))
In other words, if the generated random_image is for example [1, 2, 3, 4, 5], I want deconv to also be [1, 2, 3, 4, 5] after the convolution and deconvolution.
However, I am not able to get it to work.
Looking forward to your answers!
It's possible to get some degree of visual similarity, for example by using VarianceScaling initialization, or even a completely custom initializer. But a transposed convolution is not mathematically a deconvolution, so you cannot get exact equality from conv2d_transpose.
Take a look at Why isn't this Conv2d_Transpose / deconv2d returning the original input in tensorflow?
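For example, a rough sketch applying that idea to the code from the question (this only tends to roughly preserve the input's scale, it is not an exact reconstruction; the initializer choice is illustrative):

init = tf.variance_scaling_initializer()
conv = tf.layers.conv2d(input_image, filters=6, kernel_size=[3, 3], strides=(1, 1),
                        kernel_initializer=init, data_format="channels_last")
deconv = tf.layers.conv2d_transpose(conv, filters=3, kernel_size=[3, 3], strides=(1, 1),
                                    kernel_initializer=init, data_format="channels_last")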

Tensorflow, update the Variable to have arbitrary shape

So, according to the documentation, we can use tf.assign with validate_shape=False to change the shape. It does change the shape of the content of the variable, but the shape you can get from get_shape() doesn't get updated. For example:
>>> a = tf.Variable([1, 1, 1, 1])
>>> sess.run(tf.global_variables_initializer())
>>> tf.assign(a, [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]], validate_shape=False).eval()
array([[1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1]], dtype=int32)
>>> a.get_shape()
TensorShape([Dimension(4)])
It's pretty annoying that the later layers of the network base their shapes on the get_shape() value of this variable. So, even though the actual shape is correct, Tensorflow will complain that the dimensions don't match. Any ideas on how to update the "believed" shape of each Variable?
In short: use set_shape to update the static shape of the variable.
You can understand what is going on by reading the TF FAQ:
In TensorFlow, a tensor has both a static (inferred) shape and a dynamic (true) shape. The static shape can be read using the tf.Tensor.get_shape method: this shape is inferred from the operations that were used to create the tensor, and may be partially complete. If the static shape is not fully defined, the dynamic shape of a Tensor t can be determined by evaluating tf.shape(t).
So the static shape was not properly inferred and you should give TF a hint. Luckily the next few lines from the same FAQ tell you what to do:
The tf.Tensor.set_shape method updates the static shape of a Tensor object, and it is typically used to provide additional shape information when this cannot be inferred directly. It does not change the dynamic shape of the tensor.
Since validate_shape is set to False, the static shape of the variable doesn't get updated automatically in the graph. A workaround is to set it manually with the new shape (which is known):
a = tf.Variable([1, 1, 1, 1], validate_shape=False)
sess.run(tf.global_variables_initializer())
new_arr_assign = np.array([[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]])
tf.assign(a, new_arr_assign, validate_shape=False).eval(session=sess)
a.set_shape(new_arr_assign.shape)
a.get_shape()
# results: TensorShape([Dimension(2), Dimension(7)])

How to use tensorflow to implement deconvolution?

I want to use tensorflow to implement a fully convolutional network. There is a function
tf.nn.conv2d_transpose(value, filter, output_shape, strides, padding, name),
which can be used for bilinear upsampling. However, I am confused about how to use it. The input is an image with a single channel, and the output is also a single-channel image whose size is twice that of the input.
I tried to use the function as follows, but got an IndexError: list index out of range:
with tf.name_scope('deconv') as scope:
    deconv = tf.nn.conv2d_transpose(conv6, [3, 3, 1, 1],
                                    [1, 26, 20, 1], 2, padding='SAME', name=None)
Got it! (assuming input_size = [1, 13, 10, 1])
with tf.name_scope('deconv') as scope:
    deconv = tf.nn.conv2d_transpose(input_layer, [3, 3, 1, 1],
                                    [1, 26, 20, 1], [1, 2, 2, 1], padding='SAME', name=None)
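Note that the filter argument of tf.nn.conv2d_transpose should be an actual 4-D filter tensor of shape [height, width, output_channels, in_channels], not just its shape. A fuller sketch with an explicit filter variable (the initializer choice is illustrative, assuming input_layer has shape [1, 13, 10, 1]):

# trainable 3x3 filter, one input and one output channel
filt = tf.get_variable('deconv_filter', shape=[3, 3, 1, 1],
                       initializer=tf.random_normal_initializer(stddev=0.02))
with tf.name_scope('deconv') as scope:
    deconv = tf.nn.conv2d_transpose(input_layer, filt, output_shape=[1, 26, 20, 1],
                                    strides=[1, 2, 2, 1], padding='SAME')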

Confused about conv2d_transpose

I'm getting this error message when using conv2d_transpose:
W tensorflow/core/common_runtime/executor.cc:1102] 0x7fc81f0d6250 Compute status: Invalid argument: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: generator/g_h1/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](generator/g_h1/conv2d_transpose/output_shape, generator/g_h1/w/read, _recv_l_0)]]
However, it occurs after the graph is built while compiling the loss function (Adam). Any ideas on what would cause this? I suspect it's related to the input dimensions but I'm not sure exactly why.
Full error: https://gist.github.com/jimfleming/75d88e888044615dd6e3
Relevant code:
# l shape: [batch_size, 32, 32, 4]

output_shape = [self.batch_size, 8, 8, 128]
filter_shape = [7, 7, 128, l.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h1"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
    h1 = tf.nn.relu(h1)

output_shape = [self.batch_size, 16, 16, 128]
filter_shape = [7, 7, 128, h1.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h2"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h2 = tf.nn.conv2d_transpose(h1, w, output_shape=output_shape, strides=strides, padding='SAME')
    h2 = tf.nn.relu(h2)

output_shape = [self.batch_size, 32, 32, 3]
filter_shape = [5, 5, 3, h2.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h3"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h3 = tf.nn.conv2d_transpose(h2, w, output_shape=output_shape, strides=strides, padding='SAME')
    h3 = tf.nn.tanh(h3)
Thanks for the question! You're exactly right---the problem is that the input and output dimensions being passed to tf.nn.conv2d_transpose don't agree. (The error may be detected when computing gradients, but the gradient computation isn't the problem.)
Let's look at just the first part of your code, and simplify it a little bit:
sess = tf.Session()
batch_size = 3
output_shape = [batch_size, 8, 8, 128]
strides = [1, 2, 2, 1]
l = tf.constant(0.1, shape=[batch_size, 32, 32, 4])
w = tf.constant(0.1, shape=[7, 7, 128, 4])
h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
print(sess.run(h1))
I replaced the variables with constants --- it's easier to see what's going on.
If you try to run this code, you get a similar error:
InvalidArgumentError: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: conv2d_transpose_6 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose_6/output_shape, Const_25, Const_24)]]
Now, the error is a little misleading --- it talks about the 'out_backprop' argument to 'Conv2DCustomBackpropInput'. The key is that tf.nn.conv2d_transpose is actually just the gradient of tf.nn.conv2d, so Tensorflow uses the same code internally (Conv2DCustomBackpropInput) to compute the gradient of tf.nn.conv2d and to compute tf.nn.conv2d_transpose.
The error means that the 'output_shape' you requested is not possible, given the shapes of 'l' and 'w'.
Since tf.nn.conv2d_transpose is the backward (gradient) counterpart of tf.nn.conv2d, one way to see what the correct shapes should be is to use the corresponding forward operation:
output = tf.constant(0.1, shape=output_shape)
expected_l = tf.nn.conv2d(output, w, strides=strides, padding='SAME')
print(expected_l.get_shape())
# Prints (3, 4, 4, 4)
That is, in the forward direction, if you provided a tensor of shape 'output_shape', you would get out a tensor of shape (3, 4, 4, 4).
So one way to fix the problem is to change the shape of 'l' to (3, 4, 4, 4); if you change the code above to:
l = tf.constant(0.1, shape=[batch_size, 4, 4, 4])
everything works fine.
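Alternatively (a sketch continuing the constants example above), you can keep l at its original spatial size and request a larger output_shape instead, since a stride-2 'SAME' convolution in the forward direction maps 64x64 down to 32x32:

l = tf.constant(0.1, shape=[batch_size, 32, 32, 4])
output_shape = [batch_size, 64, 64, 128]
h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
print(sess.run(h1).shape)   # (3, 64, 64, 128)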
In general, try using tf.nn.conv2d to get a feel for what the relationship between the tensor shapes is. Since tf.nn.conv2d_transpose is its backward counterpart, it has the same relationship between input, output and filter shapes (but with the roles of the input and output reversed.)
Hope that helps!
Using padding='SAME' in the tf.nn.conv2d_transpose() function may work too.