Rebuild a torch tensor from its batchified version

Here is a nice example of how to build a 3D tensor:
import torch
y = torch.rand(100, 1)
batch_size = 10
batched_data = y.contiguous().view(batch_size, -1, y.size(-1)).transpose(0,1)
batched_data.shape
The output is:
torch.Size([10, 10, 1])
OK, now what I want to do is the other way around: starting from batched_data, I want to rebuild y.
Any suggestions for streamlined PyTorch code?
==== Additional input =====
I am using this for an RNN and now I have some doubts, because if you consider the following code:
import torch
y = torch.arange(100).view(100,1)
batch_size = 10
batched_data = y.contiguous().view(batch_size, -1, y.size(-1)).transpose(0,1)
batched_data
The output (truncated) is:
tensor([[[ 0],
[10],
[20],
[30],
[40],
[50],
[60],
[70],
[80],
[90]],
[[ 1],
[11],
[21],
[31],
[41],
[51],
[61],
[71],
[81],
[91]],
This is not what I would expect. I would expect something like:
[[1,2,3,4,5,6,7,8,9,10],[11,12,13,14,15,16,17,18,19,20],....

To rebuild y, you can do something like this (note that .contiguous() is needed because transpose returns a non-contiguous tensor, so .view would otherwise fail):
rebuilded_y = batched_data.transpose(0,1).contiguous().view(*y.shape)
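A minimal round-trip check (my own sketch, not from the original answer):
import torch
y = torch.arange(100).view(100, 1)
batch_size = 10
batched_data = y.contiguous().view(batch_size, -1, y.size(-1)).transpose(0, 1)
rebuilt = batched_data.transpose(0, 1).contiguous().view(*y.shape)
print(torch.equal(y, rebuilt))  # True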
To make the input look the way you expected, remove the transpose and the extra trailing dimension from batched_data:
batched_data = y.contiguous().view(batch_size, -1)

If you want to prepare inputs for an RNN, you need to know that RNN takes 3D tensors of shape (seq_len, batch, input_size). Here, input_size refers to the number of features, and in your scenario it is 1. So the input tensor of shape (10, 10, 1) can still be a valid input for an RNN.
Example:
import torch
import torch.nn as nn
rnn = nn.RNN(input_size=1, hidden_size=20, num_layers=1)
input = torch.randn(10, 10, 1)
output, hn = rnn(input)
print(output.size()) # 10, 10, 20
RNN's output is of shape (seq_len, batch, num_directions * hidden_size).
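As an aside (not in the original answer): if you would rather keep the batch dimension first, nn.RNN also accepts batch_first=True, in which case inputs and outputs are of shape (batch, seq_len, features):
rnn = nn.RNN(input_size=1, hidden_size=20, num_layers=1, batch_first=True)
input = torch.randn(10, 10, 1)  # now interpreted as (batch, seq_len, input_size)
output, hn = rnn(input)
print(output.size())  # 10, 10, 20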

Related

How to use the tf.case api of TensorFlow correctly?

I want to design the following function for expanding any 1D/2D/3D matrix to a 4D matrix.
import tensorflow as tf

def inputs_2_4D(inputs):
    _ranks = tf.rank(inputs)
    return tf.case({tf.equal(_ranks, 3): lambda: tf.expand_dims(inputs, 3),
                    tf.equal(_ranks, 2): lambda: tf.expand_dims(tf.expand_dims(inputs, 0), 3),
                    tf.equal(_ranks, 1): lambda: tf.expand_dims(tf.expand_dims(tf.expand_dims(inputs, 0), 0), 3)},
                   default=lambda: tf.identity(inputs))
def run():
    with tf.Session() as sess:
        mat_1d = tf.constant([1, 1])
        mat_2d = tf.constant([[1, 1]])
        mat_3d = tf.constant([[[1, 1]]])
        mat_4d = tf.constant([[[[1, 1]]]])
        result = inputs_2_4D(mat_1d)
        print(result.eval())
The function, however, does not run correctly. It only outputs a 4D matrix when the mat_3d or mat_4d tensor is passed into it; there are errors if a 1D or 2D matrix is passed to the function.
When passing mat_3d or mat_4d into inputs_2_4D(), they are expanded to a 4D matrix or returned as the original matrix:
mat_3d -----> [[[[1]
[1]]]]
mat_4d -----> [[[[1 1]]]]
When the mat_1d or mat_2d matrix is passed into inputs_2_4D, the error information is:
ValueError: dim 3 not in the interval [-2, 1]. for 'case/cond/ExpandDims' (op: 'ExpandDims') with input shapes: [2], [] and with computed input tensors: input[1] = <3>.
I tested another, similar function before; that function runs correctly.
import tensorflow as tf

def test_2_4D(inputs):
    _ranks = tf.rank(inputs)
    return tf.case({tf.equal(_ranks, 3): lambda: tf.constant(3),
                    tf.equal(_ranks, 2): lambda: tf.constant(2),
                    tf.equal(_ranks, 1): lambda: tf.constant(1)},
                   default=lambda: tf.identity(inputs))
def run():
    with tf.Session() as sess:
        mat_1d = tf.constant([1, 1])
        mat_2d = tf.constant([[1, 1]])
        mat_3d = tf.constant([[[1, 1]]])
        mat_4d = tf.constant([[[[1, 1]]]])
        result = test_2_4D(mat_3d)
        print(result.eval())
This function correctly outputs the corresponding result for each of the matrices.
test_2_4D() RESULTS:
mat_1d -----> 1
mat_2d -----> 2
mat_3d -----> 3
mat_4d -----> [[[[1 1]]]]
I don't understand why the correct branch in inputs_2_4D() cannot be selected while the tf.equal() in each branch is evaluated. It seems the 1st and 2nd branches are still built even if the input matrix is mat_1d or mat_2d, so the program crashes. Please help me analyze this problem!
I think I worked out what the problem is here. It turns out that all condition/function pairs are evaluated. This can be revealed by giving the ops different names. The problem is that if your input is, say, rank 2, Tensorflow still builds tf.equal(_ranks, 3): lambda: tf.expand_dims(inputs, 3). This leads to a crash because it cannot expand dim 3 for a rank-2 tensor (the maximum allowed value is 2).
This actually makes sense since with tf.case you're basically saying "I don't know which of these cases is going to be true at runtime, so check which one is appropriate and execute the corresponding function". However this means that Tensorflow needs to prepare execution paths for all possible cases, which in this case leads to invalid computations (trying to expand invalid dimensions).
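A minimal way to see this (my own sketch, TF 1.x): give the branch ops explicit names and list the graph nodes; every branch shows up in the graph even though only one predicate can be true at runtime.
import tensorflow as tf
g = tf.Graph()
with g.as_default():
    x = tf.constant([[1, 1]])  # rank 2
    r = tf.rank(x)
    out = tf.case({tf.equal(r, 2): lambda: tf.identity(x, name='rank2_branch'),
                   tf.equal(r, 1): lambda: tf.identity(x, name='rank1_branch')},
                  default=lambda: tf.identity(x))
    print([n.name for n in g.as_graph_def().node])  # nodes for both branches appear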
At this point it would be nice to know a little more about your problem, i.e. why exactly you need that function. If you have different inputs and you simply want to bring them all to 4D, but each input always has the same dimensionality, consider simply using Python if-statements. Example:
inputs3d = tf.constant([[[1,1]]])  # this is always 3D
inputs2d = tf.constant([[1,1]])    # this is always 2D
...

def inputs_2_4D(inputs):
    _rank = len(inputs.shape.as_list())
    if _rank == 3:
        return tf.expand_dims(inputs, 3)
    elif _rank == 2:
        return tf.expand_dims(tf.expand_dims(inputs, 0), 3)
    ...
This will check the input rank while the graph is being built (not at runtime like tf.case) and really only prepare those expand_dims ops that are appropriate for the given input.
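For reference, a complete static-rank version might look like this (my own completion of the elided branches above, not the answerer's code):
def inputs_2_4d_static(inputs):
    _rank = len(inputs.shape.as_list())
    if _rank == 4:
        return inputs
    elif _rank == 3:
        return tf.expand_dims(inputs, 3)
    elif _rank == 2:
        return tf.expand_dims(tf.expand_dims(inputs, 0), 3)
    elif _rank == 1:
        return tf.expand_dims(tf.expand_dims(tf.expand_dims(inputs, 0), 0), 3)
    raise ValueError('unsupported rank: %d' % _rank)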
However, if you have a single inputs tensor which could have different ranks at different times in your program, this would require a different solution. Please let us know which problem you're trying to solve!
I have implemented the functionality I want in two ways. Here is my code.
The 1st method based on tf.cond:
def inputs_2_4D(inputs):
    _rank1d = tf.rank(inputs)
    def _1d_2_2d(): return tf.expand_dims(inputs, 0)
    def _greater_than_1d(): return tf.identity(inputs)
    _tmp_2d = tf.cond(_rank1d < 2, _1d_2_2d, _greater_than_1d)

    _rank2d = tf.rank(_tmp_2d)
    def _2d_2_3d(): return tf.expand_dims(_tmp_2d, 0)
    def _greater_than_2d(): return tf.identity(_tmp_2d)
    _tmp_3d = tf.cond(_rank2d < 3, _2d_2_3d, _greater_than_2d)

    _rank3d = tf.rank(_tmp_3d)
    def _3d_2_4d(): return tf.expand_dims(_tmp_3d, 3)
    def _greater_than_3d(): return tf.identity(_tmp_3d)
    return tf.cond(_rank3d < 4, _3d_2_4d, _greater_than_3d)
The 2nd method based on tf.case with tf.cond:
def inputs_2_4D_1(inputs):
    _rank = tf.rank(inputs)
    def _assign_original(): return tf.identity(inputs)
    def _dummy(): return tf.expand_dims(inputs, 0)
    _1d = tf.cond(tf.equal(_rank, 1), _assign_original, _dummy)
    _2d = tf.cond(tf.equal(_rank, 2), _assign_original, _dummy)
    _3d = tf.cond(tf.equal(_rank, 3), _assign_original, _dummy)
    def _1d_2_4d(): return tf.expand_dims(tf.expand_dims(tf.expand_dims(_1d, 0), 0), 3)
    def _2d_2_4d(): return tf.expand_dims(tf.expand_dims(_2d, 0), 3)
    def _3d_2_4d(): return tf.expand_dims(_3d, 3)
    return tf.case({tf.equal(_rank, 1): _1d_2_4d,
                    tf.equal(_rank, 2): _2d_2_4d,
                    tf.equal(_rank, 3): _3d_2_4d},
                   default=_assign_original)
I think the 2nd method should be less efficient than the 1st, because _dummy() always wastes two operations when assigning inputs to _1d, _2d and _3d respectively.
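A quick sanity check (my own sketch) that the two versions agree on all four inputs:
with tf.Session() as sess:
    for m in [tf.constant([1, 1]),          # 1D
              tf.constant([[1, 1]]),        # 2D
              tf.constant([[[1, 1]]]),      # 3D
              tf.constant([[[[1, 1]]]])]:   # 4D
        a, b = sess.run([inputs_2_4D(m), inputs_2_4D_1(m)])
        print(a.shape, b.shape)  # always 4D and identical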

Padding Labels for Tensorflow CTC Loss?

I would like to pad my labels so that they would be of equal length to be passed into the ctc_loss function. Apparently, -1 is not allowed. If I were to apply padding, should the padding value be part of the labels for ctc?
Update
I have this code that converts dense labels into sparse ones to be passed to the ctc_loss function which I think is related to the problem.
def dense_to_sparse(dense_tensor, out_type):
    indices = tf.where(tf.not_equal(dense_tensor, tf.constant(0, dense_tensor.dtype)))
    values = tf.gather_nd(dense_tensor, indices)
    shape = tf.shape(dense_tensor, out_type=out_type)
    return tf.SparseTensor(indices, values, shape)
Actually, -1 values are allowed to be present in the y_true argument of ctc_batch_cost with one limitation: they must not appear within the actual label "content", which is specified by label_length (here the i-th label "content" starts at index 0 and ends at index label_length[i]).
So it is perfectly fine to pad labels with -1 so that they are of equal length, as you intended. The only thing you should take care of is to correctly calculate and pass the corresponding label_length values.
Here is the sample code which is a modified version of the test_ctc unit test from keras:
import numpy as np
from tensorflow.keras import backend as K
number_of_categories = 4
number_of_timesteps = 5
labels = np.asarray([[0, 1, 2, 1, 0], [0, 1, 1, 0, -1]])
label_lens = np.expand_dims(np.asarray([5, 4]), 1)
# dimensions are batch x time x categories
inputs = np.zeros((2, number_of_timesteps, number_of_categories), dtype=np.float32)
input_lens = np.expand_dims(np.asarray([5, 5]), 1)
k_labels = K.variable(labels, dtype="int32")
k_inputs = K.variable(inputs, dtype="float32")
k_input_lens = K.variable(input_lens, dtype="int32")
k_label_lens = K.variable(label_lens, dtype="int32")
res = K.eval(K.ctc_batch_cost(k_labels, k_inputs, k_input_lens, k_label_lens))
It runs perfectly fine even with -1 as the last element of the second labels sequence, because the corresponding (second) label_lens item specifies that its length is 4.
If we change it to 5, or if we change some other label value to -1, then we get the "All labels must be nonnegative integers" exception that you've mentioned. But this just means that our label_lens is invalid.
Here's how I do it. I have a dense tensor labels that includes padding with -1, so that all targets in a batch have the same length. Then I use
labels_sparse = dense_to_sparse(labels, sparse_val=-1)
where
def dense_to_sparse(dense_tensor, sparse_val=0):
    """Inverse of tf.sparse_to_dense.

    Parameters:
        dense_tensor: The dense tensor. Duh.
        sparse_val: The value to "ignore": Occurrences of this value in the
            dense tensor will not be represented in the sparse tensor.
            NOTE: When/if later restoring this to a dense tensor, you
            will probably want to choose this as the default value.

    Returns:
        SparseTensor equivalent to the dense input.
    """
    with tf.name_scope("dense_to_sparse"):
        sparse_inds = tf.where(tf.not_equal(dense_tensor, sparse_val),
                               name="sparse_inds")
        sparse_vals = tf.gather_nd(dense_tensor, sparse_inds,
                                   name="sparse_vals")
        dense_shape = tf.shape(dense_tensor, name="dense_shape",
                               out_type=tf.int64)
        return tf.SparseTensor(sparse_inds, sparse_vals, dense_shape)
This creates a sparse tensor of the labels, which is what you need to put into the CTC loss. That is, you call tf.nn.ctc_loss(labels=labels_sparse, ...). The padding (i.e. all values equal to -1 in the dense tensor) is simply not represented in the sparse tensor.
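Putting it together, a hypothetical call might look like this (logits and logit_lengths are placeholder names of my own, not from the answer):
labels_sparse = dense_to_sparse(labels, sparse_val=-1)
loss = tf.nn.ctc_loss(labels=labels_sparse,
                      inputs=logits,                  # shape (max_time, batch, num_classes)
                      sequence_length=logit_lengths)  # length of each logit sequence
loss = tf.reduce_mean(loss)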

Tensorflow multiply 3D batch tensor with a 2D weight

I've got two tensors with the shape shown below,
batch.shape = [?, 5, 4]
weight.shape = [3, 5]
by multiplying the weight with every element in the batch, I want to get
result.shape = [?, 3, 4]
what is the most efficient way to achieve this?
Try this:
newbatch = tf.transpose(batch,[1,0,2])
newbatch = tf.reshape(newbatch,[5,-1])
result = tf.matmul(weight,newbatch)
result = tf.reshape(result,[3,-1,4])
result = tf.transpose(result, [1,0,2])
Or more compactly:
newbatch = tf.reshape(tf.transpose(batch,[1,0,2]),[5,-1])
result = tf.transpose(tf.reshape(tf.matmul(weight,newbatch),[3,-1,4]), [1,0,2])
Try this:
tf.einsum("ijk,aj->iak", batch, weight)
einsum performs a generalized contraction between tensors of arbitrary dimension; refer to the tf.einsum documentation for more information.
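A quick numerical check (my own sketch, TF 1.x) that the matmul-based recipe and the einsum one-liner agree:
import numpy as np
import tensorflow as tf

batch = tf.constant(np.random.rand(7, 5, 4), dtype=tf.float32)  # a stand-in for [?, 5, 4]
weight = tf.constant(np.random.rand(3, 5), dtype=tf.float32)
newbatch = tf.reshape(tf.transpose(batch, [1, 0, 2]), [5, -1])
via_matmul = tf.transpose(tf.reshape(tf.matmul(weight, newbatch), [3, -1, 4]), [1, 0, 2])
via_einsum = tf.einsum("ijk,aj->iak", batch, weight)
with tf.Session() as sess:
    a, b = sess.run([via_matmul, via_einsum])
print(a.shape, np.allclose(a, b, atol=1e-5))  # (7, 3, 4) True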

Tensorflow convolution layers have strange artefacts

Could anyone explain what I'm doing wrong such that my TensorBoard graphs have additional groups when I use tf.layers.conv1d?
For the sake of simplicity I've created one tf.name_scope 'conv_block1' that contains conv1d -> max_pool -> batch_norm, yet my graph has odd additional blocks (see the attached screenshot). Basically, a superfluous block 'conv1d' was added with the weights for the 'conv_block1/conv1d' layer, and it is placed under groups. This makes networks with multiple convolution blocks completely unreadable. Am I doing something wrong, or is this some kind of bug/performance feature in Tensorflow 1.4? Oddly enough, the dense layers are fine and their weights are properly scoped.
Here is the code if anyone wants to recreate the graph:
def cnn_model(inputs, mode):
    x = tf.placeholder_with_default(inputs['wav'], shape=[None, SAMPLE_RATE, 1], name='input_placeholder')
    with tf.name_scope("conv_block1"):
        x = tf.layers.conv1d(x, filters=80, kernel_size=5, strides=1, padding='same', activation=tf.nn.relu)
        x = tf.layers.max_pooling1d(x, pool_size=3, strides=3)
        x = tf.layers.batch_normalization(x, training=(mode == tf.estimator.ModeKeys.TRAIN))
    x = tf.layers.flatten(x)
    x = tf.layers.dense(x, units=12)
    return x
UPDATE 1
I've added an even simpler example that can be executed directly to reproduce the issue:
g = tf.Graph()
with g.as_default():
    x = tf.placeholder(name='input', dtype=tf.float32, shape=[None, 16000, 1])
    with tf.name_scope('group1'):
        x = tf.layers.conv1d(x, 80, 5, name='conv1')
    x = tf.layers.dense(x, 10, name="dense1")
[n.name for n in g.as_graph_def().node]
outputs:
['input',
'conv1/kernel/Initializer/random_uniform/shape',
'conv1/kernel/Initializer/random_uniform/min',
'conv1/kernel/Initializer/random_uniform/max',
'conv1/kernel/Initializer/random_uniform/RandomUniform',
'conv1/kernel/Initializer/random_uniform/sub',
'conv1/kernel/Initializer/random_uniform/mul',
'conv1/kernel/Initializer/random_uniform',
'conv1/kernel',
'conv1/kernel/Assign',
'conv1/kernel/read',
'conv1/bias/Initializer/zeros',
'conv1/bias',
'conv1/bias/Assign',
'conv1/bias/read',
'group1/conv1/dilation_rate',
'group1/conv1/conv1d/ExpandDims/dim',
'group1/conv1/conv1d/ExpandDims',
'group1/conv1/conv1d/ExpandDims_1/dim',
'group1/conv1/conv1d/ExpandDims_1',
'group1/conv1/conv1d/Conv2D',
'group1/conv1/conv1d/Squeeze',
'group1/conv1/BiasAdd',
'dense1/kernel/Initializer/random_uniform/shape',
'dense1/kernel/Initializer/random_uniform/min',
'dense1/kernel/Initializer/random_uniform/max',
'dense1/kernel/Initializer/random_uniform/RandomUniform',
'dense1/kernel/Initializer/random_uniform/sub',
'dense1/kernel/Initializer/random_uniform/mul',
'dense1/kernel/Initializer/random_uniform',
'dense1/kernel',
'dense1/kernel/Assign',
'dense1/kernel/read',
'dense1/bias/Initializer/zeros',
'dense1/bias',
'dense1/bias/Assign',
'dense1/bias/read',
'dense1/Tensordot/Shape',
'dense1/Tensordot/Rank',
'dense1/Tensordot/axes',
'dense1/Tensordot/GreaterEqual/y',
'dense1/Tensordot/GreaterEqual',
'dense1/Tensordot/Cast',
'dense1/Tensordot/mul',
'dense1/Tensordot/Less/y',
'dense1/Tensordot/Less',
'dense1/Tensordot/Cast_1',
'dense1/Tensordot/add',
'dense1/Tensordot/mul_1',
'dense1/Tensordot/add_1',
'dense1/Tensordot/range/start',
'dense1/Tensordot/range/delta',
'dense1/Tensordot/range',
'dense1/Tensordot/ListDiff',
'dense1/Tensordot/Gather',
'dense1/Tensordot/Gather_1',
'dense1/Tensordot/Const',
'dense1/Tensordot/Prod',
'dense1/Tensordot/Const_1',
'dense1/Tensordot/Prod_1',
'dense1/Tensordot/concat/axis',
'dense1/Tensordot/concat',
'dense1/Tensordot/concat_1/axis',
'dense1/Tensordot/concat_1',
'dense1/Tensordot/stack',
'dense1/Tensordot/transpose',
'dense1/Tensordot/Reshape',
'dense1/Tensordot/transpose_1/perm',
'dense1/Tensordot/transpose_1',
'dense1/Tensordot/Reshape_1/shape',
'dense1/Tensordot/Reshape_1',
'dense1/Tensordot/MatMul',
'dense1/Tensordot/Const_2',
'dense1/Tensordot/concat_2/axis',
'dense1/Tensordot/concat_2',
'dense1/Tensordot',
'dense1/BiasAdd']
OK, I've found the issue: apparently tf.name_scope applies to operations only, while tf.variable_scope works for both operations and variables (as per this tf issue).
Here is a stack overflow question that explains the difference between name_scope and variable_scope:
What's the difference of name scope and a variable scope in tensorflow?
g = tf.Graph()
with g.as_default():
    x = tf.placeholder(name='input', dtype=tf.float32, shape=[None, 16000, 1])
    with tf.variable_scope('v_scope1'):
        x = tf.layers.conv1d(x, 80, 5, name='conv1')
[n.name for n in g.as_graph_def().node]
gives:
['input',
'v_scope1/conv1/kernel/Initializer/random_uniform/shape',
'v_scope1/conv1/kernel/Initializer/random_uniform/min',
'v_scope1/conv1/kernel/Initializer/random_uniform/max',
'v_scope1/conv1/kernel/Initializer/random_uniform/RandomUniform',
'v_scope1/conv1/kernel/Initializer/random_uniform/sub',
'v_scope1/conv1/kernel/Initializer/random_uniform/mul',
'v_scope1/conv1/kernel/Initializer/random_uniform',
'v_scope1/conv1/kernel',
'v_scope1/conv1/kernel/Assign',
'v_scope1/conv1/kernel/read',
'v_scope1/conv1/bias/Initializer/zeros',
'v_scope1/conv1/bias',
'v_scope1/conv1/bias/Assign',
'v_scope1/conv1/bias/read',
'v_scope1/conv1/dilation_rate',
'v_scope1/conv1/conv1d/ExpandDims/dim',
'v_scope1/conv1/conv1d/ExpandDims',
'v_scope1/conv1/conv1d/ExpandDims_1/dim',
'v_scope1/conv1/conv1d/ExpandDims_1',
'v_scope1/conv1/conv1d/Conv2D',
'v_scope1/conv1/conv1d/Squeeze',
'v_scope1/conv1/BiasAdd']
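Applied back to the original cnn_model from the question, the fix is a one-word change (my own sketch of the corrected block):
def cnn_model(inputs, mode):
    x = tf.placeholder_with_default(inputs['wav'], shape=[None, SAMPLE_RATE, 1], name='input_placeholder')
    with tf.variable_scope("conv_block1"):  # variable_scope instead of name_scope
        x = tf.layers.conv1d(x, filters=80, kernel_size=5, strides=1, padding='same', activation=tf.nn.relu)
        x = tf.layers.max_pooling1d(x, pool_size=3, strides=3)
        x = tf.layers.batch_normalization(x, training=(mode == tf.estimator.ModeKeys.TRAIN))
    x = tf.layers.flatten(x)
    x = tf.layers.dense(x, units=12)
    return x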

Visualizing output of convolutional layer in tensorflow

I'm trying to visualize the output of a convolutional layer in tensorflow using the function tf.image_summary. I'm already using it successfully in other instances (e. g. visualizing the input image), but have some difficulties reshaping the output here correctly. I have the following conv layer:
img_size = 256
x_image = tf.reshape(x, [-1,img_size, img_size,1], "sketch_image")
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
So the output of h_conv1 would have the shape [-1, img_size, img_size, 32]. Just using tf.image_summary("first_conv", tf.reshape(h_conv1, [-1, img_size, img_size, 1])) doesn't account for the 32 different kernels, so I'm basically slicing through different feature maps here.
How can I reshape them correctly? Or is there another helper function I could use for including this output in the summary?
I don't know of a helper function but if you want to see all the filters you can pack them into one image with some fancy uses of tf.transpose.
So if you have a tensor that's images x ix x iy x channels
>>> V = ...  # your conv output tensor, e.g. h_conv1
>>> print(V.get_shape())
TensorShape([Dimension(-1), Dimension(256), Dimension(256), Dimension(32)])
So in this example ix = 256, iy=256, channels=32
First slice off one image and remove the image dimension:
V = tf.slice(V, (0, 0, 0, 0), (1, -1, -1, -1))  # V[0,...]
V = tf.reshape(V, (iy, ix, channels))
Next, add a couple of pixels of zero padding around the image:
ix += 4
iy += 4
V = tf.image.resize_image_with_crop_or_pad(V, iy, ix)
Then reshape so that instead of 32 channels you have 4x8 channels; let's call them cy=4 and cx=8:
V = tf.reshape(V, (iy, ix, cy, cx))
Now the tricky part. tf seems to return results in C-order, numpy's default.
The current order, if flattened, would list all the channels for the first pixel (iterating over cx and cy), before listing the channels of the second pixel (incrementing ix). Going across the rows of pixels (ix) before incrementing to the next row (iy).
We want the order that would lay out the images in a grid.
So you go across a row of an image (ix) before stepping along the row of channels (cx); when you hit the end of the row of channels you step to the next row in the image (iy), and when you run out of rows in the image you increment to the next row of channels (cy). So:
V = tf.transpose(V,(2,0,3,1)) #cy,iy,cx,ix
Personally I prefer np.einsum for fancy transposes, for readability, but it's not in tf yet.
newtensor = np.einsum('yxYX->YyXx',oldtensor)
Anyway, now that the pixels are in the right order, we can safely flatten it into a 2D tensor:
# image_summary needs 4d input
V = tf.reshape(V,(1,cy*iy,cx*ix,1))
Try tf.image_summary on that; you should get a grid of little images.
(The original answer included an image of the resulting grid at this point.)
In case someone would like to "jump" to numpy and visualize "there", here is an example of how to display both the weights and the processing result. All transformations are based on the previous answer by mdaoust.
import numpy as np
import matplotlib.pyplot as plt

# to visualize 1st conv layer weights
vv1 = sess.run(W_conv1)
# to visualize 1st conv layer output
vv2 = sess.run(h_conv1, feed_dict={img_ph: x, keep_prob: 1.0})
vv2 = vv2[0, :, :, :]  # in case of a batch output, slice the first image

def vis_conv(v, ix, iy, ch, cy, cx, p=0):
    v = np.reshape(v, (iy, ix, ch))
    ix += 2
    iy += 2
    npad = ((1, 1), (1, 1), (0, 0))
    v = np.pad(v, pad_width=npad, mode='constant', constant_values=p)
    v = np.reshape(v, (iy, ix, cy, cx))
    v = np.transpose(v, (2, 0, 3, 1))  # cy, iy, cx, ix
    v = np.reshape(v, (cy * iy, cx * ix))
    return v
# W_conv1 - weights
ix = 5 # data size
iy = 5
ch = 32
cy = 4 # grid from channels: 32 = 4x8
cx = 8
v = vis_conv(vv1,ix,iy,ch,cy,cx)
plt.figure(figsize = (8,8))
plt.imshow(v,cmap="Greys_r",interpolation='nearest')
# h_conv1 - processed image
ix = 30 # data size
iy = 30
v = vis_conv(vv2,ix,iy,ch,cy,cx)
plt.figure(figsize = (8,8))
plt.imshow(v,cmap="Greys_r",interpolation='nearest')
You may try to get the convolution layer activation image this way:
h_conv1_features = tf.unpack(h_conv1, axis=3)
h_conv1_imgs = tf.expand_dims(tf.concat(1, h_conv1_features), -1)
This gets one vertical stripe with all the images concatenated vertically.
If you want them padded (in my case of ReLU activations, padding with a white line):
h_conv1_features = tf.unpack(h_conv1, axis=3)
h_conv1_max = tf.reduce_max(h_conv1)
h_conv1_features_padded = map(lambda t: tf.pad(t-h_conv1_max, [[0,0],[0,1],[0,0]])+h_conv1_max, h_conv1_features)
h_conv1_imgs = tf.expand_dims(tf.concat(1, h_conv1_features_padded), -1)
I personally try to tile every 2D filter into a single image.
To do this (if I'm not terribly mistaken, since I'm quite new to DL) I found that it could be helpful to exploit the depth_to_space function, since it takes a 4D tensor
[batch, height, width, depth]
and produces an output of shape
[batch, height*block_size, width*block_size, depth/(block_size*block_size)]
where block_size is the number of "tiles" per side in the output image. The only limitation is that the depth should be the square of block_size, an integer; otherwise it cannot "fill" the resulting image correctly.
A possible solution could be to pad the depth of the input tensor up to a depth accepted by the method, but I still haven't tried this.
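A shape-only sketch of that idea (my own; note that depth_to_space interleaves channels within each pixel neighborhood, so the result is a mosaic rather than side-by-side feature maps):
import tensorflow as tf
block_size = 4  # requires depth == block_size**2 = 16
feats = tf.placeholder(tf.float32, [1, 32, 32, block_size * block_size])
tiled = tf.depth_to_space(feats, block_size)
print(tiled.shape)  # (1, 128, 128, 1)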
Another way, which I think is very easy, is to use the get_operation_by_name function. I had a hard time visualizing the layers with other methods, but this helped me.
# first, find out the operations; many of these are micro-operations such as add etc.
graph = tf.get_default_graph()
graph.get_operations()

# choose the relevant operation
op_name = '...'
op = graph.get_operation_by_name(op_name)
out = sess.run([op.outputs[0]], feed_dict={x: img_batch, is_training: False})
# img_batch is a single image whose dimensions are (1,n,n,1).
# out is the output of the layer; do whatever you want with it
# in my case, I wanted to see the output of a convolution layer
out2 = np.array(out)
print(out2.shape)

# determine rows, cols, and fig size etc.
fig = plt.figure()
for each_depth in range(out2.shape[4]):
    fig.add_subplot(rows, cols, each_depth + 1)
    plt.imshow(out2[0, 0, :, :, each_depth], cmap='gray')
For example, the original post showed the input (a colored cat) and the output of the second conv layer of my model at this point.
Note that I am aware this question is old and there are easier methods with Keras, but for people who use an old model from other people (such as me), this may be useful.