Presence of unaccounted-for conditional nodes in TensorBoard - tensorflow

The Problem
When I run my training, the preprocessing examples are created successfully, but training never starts. Stranger still, when I analyze my TensorBoard graph I see extra conditional nodes that do not exist in the code. I want to know where these extra nodes come from, why they appear, and exactly why training does not begin. Below is a systematic description of the situation:
TensorFlow Graph
The following TensorBoard diagram shows my graph:
The code which constructs this graph is below:
def getconv2drelu(inputtensor, kernelsize, strides, padding, convname,
                  imagesummaries=False):
    weights = tf.get_variable("weights", shape=kernelsize, dtype=tf.float32,
                              initializer=tf.truncated_normal_initializer(0, 0.01),
                              regularizer=tf.nn.l2_loss)
    biases = tf.get_variable("biases", shape=kernelsize[3], dtype=tf.float32,
                             initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input=inputtensor, filter=weights, strides=strides,
                        padding=padding, name=convname)
    response = tf.nn.bias_add(conv, biases)
    if imagesummaries:
        filters = (weights - tf.reduce_min(weights)) / (
            tf.reduce_max(weights) - tf.reduce_min(weights))
        filters = tf.transpose(filters, [3, 0, 1, 2])
        tf.summary.image(convname + " filters", filters,
                         max_outputs=kernelsize[3])
    response = tf.nn.relu(response)
    activation_summary(response)
    return response

def getfullyconnected(inputtensor, numinput, numoutput):
    weights = tf.get_variable("weights", shape=[numinput, numoutput],
                              dtype=tf.float32,
                              initializer=tf.truncated_normal_initializer(0, 0.01))
    biases = tf.get_variable("biases", shape=[numoutput], dtype=tf.float32,
                             initializer=tf.truncated_normal_initializer(0, 0.01))
    response = tf.add(tf.matmul(inputtensor, weights), biases)
    response = tf.nn.relu(response)
    activation_summary(response)
    return response

def inference(inputs):
    with tf.variable_scope("layer1"):
        conv = getconv2drelu(inputtensor=inputs, kernelsize=[7, 7, 3, 96],
                             strides=[1, 2, 2, 1], padding="VALID",
                             convname="conv1", imagesummaries=True)
        pool = tf.nn.max_pool(conv, [1, 3, 3, 1], strides=[1, 3, 3, 1],
                              padding="SAME", name="pool1")
    with tf.variable_scope("layer2"):
        conv = getconv2drelu(inputtensor=pool, kernelsize=[7, 7, 96, 256],
                             strides=[1, 1, 1, 1], padding="VALID",
                             convname="conv2", imagesummaries=False)
        pool = tf.nn.max_pool(conv, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                              padding="SAME", name="pool2")
    with tf.variable_scope("layer3"):
        conv = getconv2drelu(inputtensor=pool, kernelsize=[7, 7, 256, 512],
                             strides=[1, 1, 1, 1], padding="SAME",
                             convname="conv3", imagesummaries=False)
    with tf.variable_scope("layer4"):
        conv = getconv2drelu(inputtensor=conv, kernelsize=[3, 3, 512, 512],
                             strides=[1, 1, 1, 1], padding="SAME",
                             convname="conv4", imagesummaries=False)
    with tf.variable_scope("layer5"):
        conv = getconv2drelu(inputtensor=conv, kernelsize=[3, 3, 512, 1024],
                             strides=[1, 1, 1, 1], padding="SAME",
                             convname="conv5", imagesummaries=False)
    with tf.variable_scope("layer6"):
        conv = getconv2drelu(inputtensor=conv, kernelsize=[3, 3, 1024, 1024],
                             strides=[1, 1, 1, 1], padding="SAME",
                             convname="conv6", imagesummaries=False)
        pool = tf.nn.max_pool(conv, [1, 3, 3, 1], strides=[1, 3, 3, 1],
                              padding="SAME", name="pool1")
        pool = tf.contrib.layers.flatten(pool)
    with tf.variable_scope("fc1"):
        fc = getfullyconnected(pool, 5 * 5 * 1024, 4096)
        drop = tf.nn.dropout(fc, keep_prob=0.5)
    with tf.variable_scope("fc2"):
        fc = getfullyconnected(drop, 4096, 4096)
        drop = tf.nn.dropout(fc, keep_prob=0.5)
    with tf.variable_scope("fc3"):
        logits = getfullyconnected(drop, 4096, 1000)
    return logits
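For reference, here is a minimal sketch of how a graph like this might be exported for TensorBoard; the log directory and placeholder shape are assumptions of mine, and it presumes the activation_summary() helper used above exists:

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 221, 221, 3])  # assumed input shape
logits = inference(inputs)
writer = tf.summary.FileWriter("logs", tf.get_default_graph())  # "logs" is an assumed path
writer.close()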
The complete TensorBoard graph is shown below:
The figure is too small, but you can see a series of pink nodes to the left. An expanded version of one such segment is shown below:
An expansion of one of the condition blocks (all blocks are similar) is shown below:
I am unable to understand the presence of these extra condition blocks. All images fed to the graph are of size [221, 221, 3].
You can also see that inside a condition block there is an IsVariableInitialized test. I initialize my variables right after launching the session, so I do not understand why these checks are performed. I have figured out that these condition blocks are created by tf.get_variable(), which checks whether the variable has been initialized. Do they cause any performance difference?
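Incidentally, a hedged way to see exactly where these blocks live is to enumerate the graph ops whose names contain "cond" (or whose type is IsVariableInitialized) and look at the enclosing scopes; this is only a debugging sketch:

import tensorflow as tf

graph = tf.get_default_graph()
for op in graph.get_operations():
    if "cond" in op.name.lower() or op.type == "IsVariableInitialized":
        print(op.name, op.type)  # shows which variable scope each cond block sits under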
Another observation
When I decrease the batch size, the size of my TensorBoard event file also decreases, but the nodes shown on the graph remain the same. Why is this so?
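One consistent data point (a sketch under my own assumptions, using tf.layers.conv2d purely for illustration): the serialized GraphDef does not depend on the batch size, so the node display should indeed stay the same even when the event file shrinks:

import tensorflow as tf

for batch in (8, 64):
    g = tf.Graph()
    with g.as_default():
        x = tf.placeholder(tf.float32, [batch, 221, 221, 3])
        _ = tf.layers.conv2d(x, 96, 7)
    print(batch, len(g.as_graph_def().node))  # same node count for both batch sizes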
My training code is as follows:
with tf.control_dependencies(putops):
    train_op = tf.group(apply_gradient_op, variables_averages_op)
sess.run(train_op)  # tf.Session() has been defined before sess.run()
putops is initialized to [] and, during graph construction, it is populated for each GPU as follows:
# cpu_compute_stage is appended only once, since it corresponds to the
# centralized preprocessing
cpu_compute_stage = data_flow_ops.StagingArea(
    [tf.float32, tf.int32],
    shapes=[images_shape, labels_shape]
)
cpu_compute_stage_op = cpu_compute_stage.put(
    [host_images, host_labels])
putops.append(cpu_compute_stage_op)

# For each device, putops is further appended with a gpu_compute_stage op
# (one per GPU), since a CPU-to-GPU copy has to take place
with tf.device('/gpu:%d' % i):
    with tf.name_scope('%s_%d' % (TOWER_NAME, i)) as scope:
        gpu_compute_stage = data_flow_ops.StagingArea(
            [tf.float32, tf.int32],
            shapes=[images_shape, labels_shape]
        )
        gpu_compute_stage_op = gpu_compute_stage.put(
            [host_images, host_labels]
        )
        putops.append(gpu_compute_stage_op)
However, my code does not run, even though I initialize both global and local variables.
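For what it's worth, StagingArea.get() blocks until something has been put, so pipelines built this way are usually warmed up by running the put ops once before the first step. A minimal sketch, assuming sess, putops, and train_op are as defined above (num_steps is a placeholder name of mine):

sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())
sess.run(putops)            # pre-fill the staging areas once
for step in range(num_steps):
    sess.run(train_op)      # each step consumes one staged batch and stages the next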

Related

Why, if we use tf.make_template() in the training stage, must we use tf.make_template() again in the testing stage?

I defined a model function named "drrn_model". While training my model, I use the model like this:
shared_model = tf.make_template('shared_model', drrn_model)
train_output = shared_model(train_input, is_training=True)
It begins training step by step, and I can restore the .ckpt file when I want to continue training the model from an old checkpoint.
But there is a problem when I test my trained model.
I use the code below directly without using tf.make_template:
train_output = drrn_model(train_input, is_training=False)
Then the terminal gave me a lot of NotFoundError messages like "Key LastLayer/Variable_2 not found in checkpoint".
But when I use
shared_model = tf.make_template('shared_model', drrn_model)
output_tensor = shared_model(input_tensor,is_training=False)
it tests normally.
So why must we use tf.make_template() again in the testing stage? What is the difference between calling drrn_model directly and calling it through make_template when we construct our model?
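A minimal sketch of the naming difference (my own toy model, not drrn_model): tf.make_template wraps every variable the function creates in the template's scope, so the checkpoint keys carry the shared_model/ prefix; calling the function directly creates variables without that prefix, which is why restoring then fails:

import tensorflow as tf

def toy_model(x):
    w = tf.get_variable("w", shape=[1])
    return x * w

shared = tf.make_template("shared_model", toy_model)
y = shared(tf.constant([1.0]))
print([v.name for v in tf.global_variables()])
# ['shared_model/w:0'] -- calling toy_model directly would create a plain 'w:0'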
And there is another question, about the BN layer in TensorFlow.
I have tried many ways, but the outputs are always wrong (always worse than the version without the BN layer).
Here is my newest version of the model with a BN layer:
tensor = None

def drrn_model(input_tensor, is_training):
    with tf.device("/gpu:0"):
        with tf.variable_scope("FirstLayer"):
            conv_0_w = tf.get_variable("conv_w", [3, 3, 1, 128],
                                       initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 9)))
            tensor = tf.nn.conv2d(tf.nn.relu(batchnorm(input_tensor, is_training=is_training)),
                                  conv_0_w, strides=[1, 1, 1, 1], padding="SAME")
        first_layer = tensor

        ### recursion ###
        with tf.variable_scope("recycle", reuse=False):
            tensor = drrnblock(first_layer, tensor, is_training)
        for i in range(1, 10):
            with tf.variable_scope("recycle", reuse=True):
                tensor = drrnblock(first_layer, tensor, is_training)

        ### end layer ###
        with tf.variable_scope("LastLayer"):
            conv_end_w = tf.get_variable("conv_w", [3, 3, 128, 1],
                                         initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 9)))
            conv_end_layer = tf.nn.conv2d(tf.nn.relu(batchnorm(tensor, is_training=is_training)),
                                          conv_end_w, strides=[1, 1, 1, 1], padding='SAME')
        tensor = tf.add(input_tensor, conv_end_layer)
        return tensor

def drrnblock(first_layer, input_layer, is_training):
    conv1_w = tf.get_variable("conv1__w", [3, 3, 128, 128],
                              initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 9)))
    conv1_layer = tf.nn.conv2d(tf.nn.relu(batchnorm(input_layer, is_training=is_training)),
                               conv1_w, strides=[1, 1, 1, 1], padding="SAME")
    conv2_w = tf.get_variable("conv2__w", [3, 3, 128, 128],
                              initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0 / 9)))
    conv2_layer = tf.nn.conv2d(tf.nn.relu(batchnorm(conv1_layer, is_training=is_training)),
                               conv2_w, strides=[1, 1, 1, 1], padding="SAME")
    tensor = tf.add(first_layer, conv2_layer)
    return tensor
def batchnorm(inputs, is_training, decay=0.999):  # here is my BN layer
    scale = tf.Variable(tf.ones([inputs.get_shape()[-1]]))
    beta = tf.Variable(tf.zeros([inputs.get_shape()[-1]]))
    pop_mean = tf.Variable(tf.zeros([inputs.get_shape()[-1]]), trainable=False)
    pop_var = tf.Variable(tf.ones([inputs.get_shape()[-1]]), trainable=False)
    if is_training:
        batch_mean, batch_var = tf.nn.moments(inputs, [0, 1, 2])
        print("batch_mean.shape: ", batch_mean.shape)
        train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay))
        train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay))
        with tf.control_dependencies([train_mean, train_var]):
            return tf.nn.batch_normalization(inputs, batch_mean, batch_var, beta, scale,
                                             variance_epsilon=1e-3)
    else:
        return tf.nn.batch_normalization(inputs, pop_mean, pop_var, beta, scale,
                                         variance_epsilon=1e-3)
Please tell me what is wrong in my code.
Thanks a lot!
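One hedged observation about the batchnorm() above: tf.Variable is not shared by variable_scope(reuse=True), so every call creates fresh BN parameters even inside the reused "recycle" scope (tf.get_variable would be needed for sharing). A minimal sketch of the pitfall:

import tensorflow as tf

with tf.variable_scope("recycle"):
    v1 = tf.Variable(tf.zeros([1]), name="pop_mean")
with tf.variable_scope("recycle", reuse=True):
    v2 = tf.Variable(tf.zeros([1]), name="pop_mean")
print(v1.name, v2.name)  # recycle/pop_mean:0 recycle_1/pop_mean:0 -- not shared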

alexnet_v2/conv1/biases not found in checkpoint

I am trying to save and restore an AlexNet slim model, but I always get this error when I run saver.restore(sess, tf.train.latest_checkpoint('I:/model/mnist/')):
NotFoundError (see above for traceback): Key alexnet_v2/conv1/biases not found in checkpoint
When I run tf.global_variables(), I can only get the weights of the conv2d layers; there are no biases in the result.
I don't understand what the problem is. Here is my code:
Here is my AlexNet model:
def alexnet_v2_arg_scope(weight_decay=0.0005,
                         stddev=0.1,
                         batch_norm_var_collection='moving_vars',
                         use_fused_batchnorm=True):
    batch_norm_params = {
        # Decay for the moving averages.
        'decay': 0.9997,
        # epsilon to prevent 0s in variance.
        'epsilon': 0.001,
        # collection containing update_ops.
        'updates_collections': ops.GraphKeys.UPDATE_OPS,
        # Use fused batch norm if possible.
        'fused': use_fused_batchnorm,
        # collection containing the moving mean and moving variance.
        'variables_collections': {
            'beta': None,
            'gamma': None,
            'moving_mean': [batch_norm_var_collection],
            'moving_variance': [batch_norm_var_collection],
        }
    }
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        biases_initializer=tf.constant_initializer,
                        normalizer_fn=slim.batch_norm,
                        normalizer_params=batch_norm_params,
                        weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope([slim.conv2d], padding='SAME'):
            with slim.arg_scope([slim.max_pool2d], padding='VALID') as arg_sc:
                return arg_sc

def alex_net(inputs,
             num_classes=10,
             is_training=True,
             droupout_keep_prob=0.5,
             spatial_squeze=True,
             scope='alexnet_v2'):
    with tf.variable_scope(scope, 'alexnet_v2', [inputs]) as sc:
        end_points_collection = sc.name + '_end_points'
        with slim.arg_scope([slim.conv2d],
                            weights_initializer=trunc_norm(0.1),
                            biases_initializer=tf.constant_initializer(0.1),
                            outputs_collections=end_points_collection):
            inputs = tf.reshape(inputs, [-1, 28, 28, 1])
            net = slim.conv2d(inputs, 64, [3, 3], 1, padding='VALID', scope='conv1')
            net = slim.max_pool2d(net, [2, 2], 2, scope='pool1')
            net = slim.conv2d(net, 128, [3, 3], scope='conv2')
            net = slim.max_pool2d(net, [3, 3], 2, scope='pool2')
            # net = slim.conv2d(net, 384, [3, 3], scope='conv3')
            # net = slim.conv2d(net, 384, [3, 3], scope='conv4')
            net = slim.conv2d(net, 256, [3, 3], scope='conv3')
            # net = slim.max_pool2d(net, [3, 3], 2, scope='pool5')
            net = slim.conv2d(net, 512, [3, 3], scope='conv4')
            with slim.arg_scope([slim.conv2d],
                                weights_initializer=trunc_norm(0.1),
                                biases_initializer=tf.constant_initializer(0.1)):
                # net = slim.conv2d(net, 1028, [6, 6], padding='VALID', scope='fc6')
                net = slim.avg_pool2d(net, [6, 6], stride=1, padding='VALID', scope='avg_pool5')
                net = slim.dropout(net, droupout_keep_prob, is_training=is_training, scope='dropout6')
                # net = slim.conv2d(net, 512, [1, 1], scope='fc7')
                # net = slim.dropout(net, droupout_keep_prob, is_training=is_training, scope='dropout7')
                net = slim.conv2d(net, num_classes, [1, 1],
                                  activation_fn=None,
                                  normalizer_fn=None,
                                  biases_initializer=tf.zeros_initializer(),
                                  scope='fc7x')
        end_points = slim.utils.convert_collection_to_dict(end_points_collection)
        if spatial_squeze:
            net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
            end_points[sc.name + '/fc8'] = net
        return net, end_points
Save the model:
varialbes_to_restore = slim.get_variables_to_restore()
saver = tf.train.Saver(varialbes_to_restore)
saver.save(sess, 'I:/model/mnist/')
Restore the model:
with tf.Session() as sess:
    logits, _ = _alex_slim.alex_net(teX[:200])
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint('I:/model/mnist/'))
    logits_var = sess.run(logits)
    print(logits_var)
Perhaps TensorFlow expects you to save it in a checkpoint because you put it in a scope.
You will get the same error if you give a name to a tensor which isn't being saved.
If you don't want to save it, don't give it a name or scope.
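A hedged aside that may explain the missing biases specifically: in TF-Slim, when normalizer_fn is set (as it is via alexnet_v2_arg_scope above), slim.conv2d ignores biases_initializer and creates no biases at all, so there is no alexnet_v2/conv1/biases to save in the first place. A minimal sketch:

import tensorflow as tf
slim = tf.contrib.slim

x = tf.placeholder(tf.float32, [1, 28, 28, 1])
net = slim.conv2d(x, 8, [3, 3], normalizer_fn=slim.batch_norm, scope='demo')
print([v.name for v in tf.global_variables()])
# expect demo/weights plus BatchNorm variables, but no demo/biases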

How to add more layers to Convolutional Neural Network text classification TensorFlow example?

According to the documentation, the model presented in this example is similar to the one in the following paper:
"Character-level Convolutional Networks for Text Classification"
I found that the original model (presented in the paper) is 9 layers deep, with 6 convolutional layers and 3 fully-connected layers, but the implemented example contains only two convolutional layers:
with tf.variable_scope('CNN_Layer1'):
    # Apply Convolution filtering on input sequence.
    conv1 = tf.contrib.layers.convolution2d(
        byte_list, N_FILTERS, FILTER_SHAPE1, padding='VALID')
    # Add a RELU for non linearity.
    conv1 = tf.nn.relu(conv1)
    # Max pooling across output of Convolution+Relu.
    pool1 = tf.nn.max_pool(
        conv1,
        ksize=[1, POOLING_WINDOW, 1, 1],
        strides=[1, POOLING_STRIDE, 1, 1],
        padding='SAME')
    # Transpose matrix so that n_filters from convolution becomes width.
    pool1 = tf.transpose(pool1, [0, 1, 3, 2])
with tf.variable_scope('CNN_Layer2'):
    # Second level of convolution filtering.
    conv2 = tf.contrib.layers.convolution2d(
        pool1, N_FILTERS, FILTER_SHAPE2, padding='VALID')
    # Max across each filter to get useful features for classification.
    pool2 = tf.squeeze(tf.reduce_max(conv2, 1), squeeze_dims=[1])
Can anybody help me extend this model with more layers?
Here is something similar to BVLC CaffeNet:
def bvlc_caffenet(imgs, weights, biases):
    # mean subtraction
    mean = tf.constant([123.68, 116.779, 103.939], dtype=tf.float32,
                       shape=[1, 1, 1, 3], name='img_mean')
    images = imgs - mean
    # conv1
    conv1 = tf.nn.conv2d(images, weights['c1'], [1, 3, 3, 1], padding='VALID')
    out1 = tf.nn.relu(tf.nn.bias_add(conv1, biases['b1']))
    pool1 = tf.nn.max_pool(out1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
    # conv2
    conv2 = tf.nn.conv2d(pool1, weights['c2'], [1, 1, 1, 1], padding='VALID')
    out2 = tf.nn.relu(tf.nn.bias_add(conv2, biases['b2']))
    pool2 = tf.nn.max_pool(out2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
    # conv3
    conv3 = tf.nn.conv2d(pool2, weights['c3'], [1, 1, 1, 1], padding='VALID')
    out3 = tf.nn.relu(tf.nn.bias_add(conv3, biases['b3']))
    # conv4
    conv4 = tf.nn.conv2d(out3, weights['c4'], [1, 1, 1, 1], padding='VALID')
    out4 = tf.nn.relu(tf.nn.bias_add(conv4, biases['b4']))
    # conv5
    conv5 = tf.nn.conv2d(out4, weights['c5'], [1, 1, 1, 1], padding='VALID')
    out5 = tf.nn.relu(tf.nn.bias_add(conv5, biases['b5']))
    pool5 = tf.nn.max_pool(out5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
    # flattening
    shape = int(np.prod(pool5.get_shape()[1:]))
    pool5_flat = tf.reshape(pool5, [-1, shape])
    # fc6
    fc6 = tf.matmul(pool5_flat, weights['f6'])
    out6 = tf.nn.relu(tf.nn.bias_add(fc6, biases['b6']))
    out6 = tf.nn.dropout(out6, 0.5)
    # fc7
    fc7 = tf.matmul(out6, weights['f7'])
    out7 = tf.nn.relu(tf.nn.bias_add(fc7, biases['b7']))
    out7 = tf.nn.dropout(out7, 0.5)
    # fc8
    fc8 = tf.matmul(out7, weights['f8'])
    out8 = tf.nn.relu(tf.nn.bias_add(fc8, biases['b8']))
    out8 = tf.nn.dropout(out8, 0.5)
    probs = tf.nn.softmax(out8)
    return probs
Initialized weights and biases for the network:
weights = {
    'c1': tf.Variable(tf.truncated_normal([7, 7, 3, 96], stddev=0.1)),
    'c2': tf.Variable(tf.truncated_normal([5, 5, 96, 256], stddev=0.1)),
    'c3': tf.Variable(tf.truncated_normal([3, 3, 256, 384], stddev=0.1)),
    'c4': tf.Variable(tf.truncated_normal([3, 3, 384, 384], stddev=0.1)),
    'c5': tf.Variable(tf.truncated_normal([3, 3, 384, 256], stddev=0.1)),
    'f6': tf.Variable(tf.truncated_normal([4096, 2048], stddev=0.1)),
    'f7': tf.Variable(tf.truncated_normal([2048, 2048], stddev=0.1)),
    'f8': tf.Variable(tf.truncated_normal([2048, 1000], stddev=0.1))
}

biases = {
    'b1': tf.Variable(tf.constant(0.1, shape=[96])),
    'b2': tf.Variable(tf.constant(0.1, shape=[256])),
    'b3': tf.Variable(tf.constant(0.1, shape=[384])),
    'b4': tf.Variable(tf.constant(0.1, shape=[384])),
    'b5': tf.Variable(tf.constant(0.1, shape=[256])),
    'b6': tf.Variable(tf.constant(0.1, shape=[2048])),
    'b7': tf.Variable(tf.constant(0.1, shape=[2048])),
    'b8': tf.Variable(tf.constant(0.1, shape=[1000]))
}
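A minimal usage sketch (the 224x224 input size is my assumption; with these strides and VALID paddings it makes pool5 come out 4x4x256 = 4096, matching weights['f6']):

imgs = tf.placeholder(tf.float32, [None, 224, 224, 3])  # assumed input size
probs = bvlc_caffenet(imgs, weights, biases)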
Or follow this (another format): https://www.cs.toronto.edu/~frossard/vgg16/vgg16.py
Are these helpful?

how to visualize feature map of CNN for tensorflow? [duplicate]

In the Caffe framework it is possible to watch the learned filters during CNN training, along with their resulting convolutions with the input images. I wonder, is it possible to do the same with TensorFlow?
A Caffe example can be viewed in this link:
http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
Grateful for your help!
To see just a few conv1 filters in TensorBoard, you can use this code (it works for CIFAR-10):
# this should be a part of the inference(images) function in the cifar10.py file
# conv1
with tf.variable_scope('conv1') as scope:
    kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                         stddev=1e-4, wd=0.0)
    conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope.name)
    _activation_summary(conv1)

with tf.variable_scope('visualization'):
    # scale weights to [0 1], type is still float
    x_min = tf.reduce_min(kernel)
    x_max = tf.reduce_max(kernel)
    kernel_0_to_1 = (kernel - x_min) / (x_max - x_min)
    # to tf.image_summary format [batch_size, height, width, channels]
    kernel_transposed = tf.transpose(kernel_0_to_1, [3, 0, 1, 2])
    # this will display 3 random filters from the 64 in conv1
    tf.image_summary('conv1/filters', kernel_transposed, max_images=3)
I also wrote a simple gist to display all 64 conv1 filters in a grid.
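For completeness, here is a hedged sketch (not the gist itself) of tiling all 64 filters into one image so a single summary shows the whole bank; it assumes kernel has shape [5, 5, 3, 64] as above:

import tensorflow as tf

def filter_grid(kernel, rows=8, cols=8, pad=1):
    # normalize to [0, 1]
    x_min = tf.reduce_min(kernel)
    x_max = tf.reduce_max(kernel)
    k = (kernel - x_min) / (x_max - x_min)
    # add a 1-pixel border on the top/left of every filter
    k = tf.pad(k, [[pad, 0], [pad, 0], [0, 0], [0, 0]])
    h, w = 5 + pad, 5 + pad                      # padded filter size (assumes 5x5 filters)
    k = tf.transpose(k, [3, 0, 1, 2])            # [64, h, w, 3]
    k = tf.reshape(k, [rows, cols, h, w, 3])     # rows*cols must equal 64
    k = tf.transpose(k, [0, 2, 1, 3, 4])         # [rows, h, cols, w, 3]
    k = tf.reshape(k, [1, rows * h, cols * w, 3])
    return k

# tf.summary.image('conv1/filter_grid', filter_grid(kernel))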

Confused about conv2d_transpose

I'm getting this error message when using conv2d_transpose:
W tensorflow/core/common_runtime/executor.cc:1102] 0x7fc81f0d6250 Compute status: Invalid argument: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: generator/g_h1/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](generator/g_h1/conv2d_transpose/output_shape, generator/g_h1/w/read, _recv_l_0)]]
However, it occurs after the graph is built, while compiling the loss function (Adam). Any ideas on what would cause this? I suspect it's related to the input dimensions, but I'm not sure exactly why.
Full error: https://gist.github.com/jimfleming/75d88e888044615dd6e3
Relevant code:
# l shape: [batch_size, 32, 32, 4]

output_shape = [self.batch_size, 8, 8, 128]
filter_shape = [7, 7, 128, l.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h1"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
    h1 = tf.nn.relu(h1)

output_shape = [self.batch_size, 16, 16, 128]
filter_shape = [7, 7, 128, h1.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h2"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h2 = tf.nn.conv2d_transpose(h1, w, output_shape=output_shape, strides=strides, padding='SAME')
    h2 = tf.nn.relu(h2)

output_shape = [self.batch_size, 32, 32, 3]
filter_shape = [5, 5, 3, h2.get_shape()[-1]]
strides = [1, 2, 2, 1]
with tf.variable_scope("g_h3"):
    w = tf.get_variable('w', filter_shape, initializer=tf.random_normal_initializer(stddev=0.02))
    h3 = tf.nn.conv2d_transpose(h2, w, output_shape=output_shape, strides=strides, padding='SAME')
    h3 = tf.nn.tanh(h3)
Thanks for the question! You're exactly right: the problem is that the input and output dimensions being passed to tf.nn.conv2d_transpose don't agree. (The error may be detected when computing gradients, but the gradient computation isn't the problem.)
Let's look at just the first part of your code, and simplify it a little bit:
sess = tf.Session()

batch_size = 3
output_shape = [batch_size, 8, 8, 128]
strides = [1, 2, 2, 1]

l = tf.constant(0.1, shape=[batch_size, 32, 32, 4])
w = tf.constant(0.1, shape=[7, 7, 128, 4])

h1 = tf.nn.conv2d_transpose(l, w, output_shape=output_shape, strides=strides, padding='SAME')
print(sess.run(h1))
I replaced the variables with constants; that makes it easier to see what's going on.
If you try to run this code, you get a similar error:
InvalidArgumentError: Conv2DCustomBackpropInput: Size of out_backprop doesn't match computed: actual = 32, computed = 4
[[Node: conv2d_transpose_6 = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose_6/output_shape, Const_25, Const_24)]]
Now, the error is a little misleading: it talks about the 'out_backprop' argument to 'Conv2DCustomBackpropInput'. The key is that tf.nn.conv2d_transpose is actually just the gradient of tf.nn.conv2d, so TensorFlow uses the same code internally (Conv2DCustomBackpropInput) both to compute the gradient of tf.nn.conv2d and to compute tf.nn.conv2d_transpose itself.
The error means that the 'output_shape' you requested is not possible, given the shapes of 'l' and 'w'.
Since tf.nn.conv2d_transpose is the backward (gradient) counterpart of tf.nn.conv2d, one way to see what the correct shapes should be is to use the corresponding forward operation:
output = tf.constant(0.1, shape=output_shape)
expected_l = tf.nn.conv2d(output, w, strides=strides, padding='SAME')
print(expected_l.get_shape())
# Prints (3, 4, 4, 4)
That is, in the forward direction, if you provided a tensor of shape 'output_shape', you would get out a tensor of shape (3, 4, 4, 4).
So one way to fix the problem is to change the shape of 'l' to (3, 4, 4, 4); if you change the code above to:
l = tf.constant(0.1, shape=[batch_size, 4, 4, 4])
everything works fine.
In general, try using tf.nn.conv2d to get a feel for what the relationship between the tensor shapes is. Since tf.nn.conv2d_transpose is its backward counterpart, it has the same relationship between input, output and filter shapes (but with the roles of the input and output reversed.)
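As a quick sanity check you can compute that relationship yourself: for padding='SAME', a forward conv produces out = ceil(in / stride), so the input you hand to conv2d_transpose must satisfy in == ceil(out / stride). A small sketch:

import math

def same_out_size(in_size, stride):
    # SAME-padding output size of a forward conv
    return int(math.ceil(in_size / float(stride)))

print(same_out_size(8, 2))  # 4 -> so 'l' must be 4x4 when requesting an 8x8 output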
Hope that helps!
Using padding='SAME' in the tf.nn.conv2d_transpose() function may work too.