How to create a sparse matrix with a given base matrix? - numpy

I have the following 2 x 2 matrix
1 0
1 1
I want to expand this matrix with dimensions in powers of 2. For example the matrix with dimension 4 would look like:
1 0 0 0
1 1 0 0
1 0 1 0
1 1 1 1
Essentially, I want to retain the original matrix wherever a 1 occurs in the base matrix and fill in zeros wherever a 0 occurs. Is there a fast way to do this in numpy or scipy? I want to be able to expand this to any power of 2, say 512 or 1024.

For relatively small powers of 2 (say up to 10), you can recursively replace every 1 with the initial matrix a using numpy block:
import numpy as np
a = np.array([[1, 0], [1, 1]])
def generate(a, k):
    z = np.zeros_like(a)
    result = a.copy()
    for _ in range(1, k):
        # Turn every 1 into the block a and every 0 into the zero block z
        result = eval(f"np.block({str(result.tolist()).replace('1', 'a').replace('0', 'z')})")
    return result
Example for k=3 (8x8 result matrix) generate(a, 3):
array([[1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 1, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1]])

You can combine tile and repeat.
>>> arr = np.array([[1, 0], [1, 1]])
>>> np.tile(arr, (2, 2))
array([[1, 0, 1, 0],
[1, 1, 1, 1],
[1, 0, 1, 0],
[1, 1, 1, 1]])
>>> np.repeat(np.repeat(arr, 2, axis=1), 2, axis=0)
array([[1, 1, 0, 0],
[1, 1, 0, 0],
[1, 1, 1, 1],
[1, 1, 1, 1]])
Then just multiply:
def tile_mask(a):
    tiled = np.tile(a, (2, 2))
    mask = np.repeat(
        np.repeat(a, 2, axis=1),
        2, axis=0
    )
    return tiled * mask
>>> tile_mask(arr)
array([[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 1, 1]])
I don't know of a good way to do this for higher powers other than recursion, though:
def tile_mask(a, n=2):
    if n > 2:
        a = tile_mask(a, n-1)
    tiled = np.tile(a, (2, 2))
    mask = np.repeat(
        np.repeat(a, 2, axis=1),
        2, axis=0
    )
    return tiled * mask
>>> tile_mask(arr, 3)
array([[1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 1, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1]])
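Both answers above effectively build a repeated Kronecker product of the base matrix with itself. As a hedged alternative sketch, np.kron performs one such step directly. The helper name kron_power is only an illustration and does not come from either answer; if a sparse result is wanted for large sizes, scipy.sparse.kron may be worth a look instead.

```python
import numpy as np

def kron_power(a, k):
    """Return the k-fold Kronecker product of a with itself."""
    result = a.copy()
    for _ in range(k - 1):
        # np.kron(result, a) replaces every entry r of result with the block r * a
        result = np.kron(result, a)
    return result

a = np.array([[1, 0], [1, 1]])
m4 = kron_power(a, 2)  # 4 x 4, the example from the question
m8 = kron_power(a, 3)  # 8 x 8
```

Each call to np.kron doubles the side length, so k=10 gives a 1024 x 1024 matrix.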

Related

find numpy rows that are the same

I have a numpy array.
How can I find which of its rows are the same, and how many times each one appears in the matrix?
thanks
dummy example:
A=np.array([[0, 1, 0, 1],[0, 0, 0, 0],[0, 1, 1, 1],[0, 0, 0, 0]])
You can use numpy.unique with axis=0 and return_counts=True:
np.unique(A, axis=0, return_counts=True)
Output:
(array([[0, 0, 0, 0],
[0, 1, 0, 1],
[0, 1, 1, 1]]),
array([2, 1, 1]))
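If you also need to know where the repeated rows sit in A, a minimal sketch is to ask np.unique for the inverse mapping as well. The variable name duplicates is just an illustration, and the np.ravel call is a defensive assumption because the shape of the inverse array has differed between NumPy versions.

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [0, 0, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 0, 0]])

rows, inverse, counts = np.unique(A, axis=0, return_inverse=True, return_counts=True)
inverse = np.ravel(inverse)  # one entry per row of A: index of its unique row

# Map each repeated row to the positions where it occurs in A
duplicates = {
    tuple(int(v) for v in rows[k]): np.flatnonzero(inverse == k).tolist()
    for k in np.flatnonzero(counts > 1)
}
print(duplicates)
```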

rotate diagonal of a 2d numpy array into row

I have a 2d numpy array:
A = array([[1, 7, 5, 0, 5],
[9, 1, 4, 6, 0],
[9, 6, 1, 0, 0],
[2, 5, 0, 0, 0],
[1, 0, 0, 0, 0]])
What I want to achieve is
B = array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
So basically every diagonal of A is a row in B with 0 padded. Is there any efficient way to achieve this?
This is what I could come up with:
B = np.empty_like(A)
for i in range(5):
    pad_width = (0, 5 - len(np.diag(A[::-1], k=i-4)))
    B[i, :] = np.pad(np.diag(A[::-1], k=i-4), pad_width)
Here is the explanation:
You can use np.diag(). It returns the diagonal at offset k, where k selects which diagonal you want: k=0 is the main diagonal and negative values of k select diagonals below it.
However, you need to flip the matrix vertically first: A[::-1]
If you run the code so far:
np.diag(A[::-1], k=0)
np.diag(A[::-1], k=-1)
np.diag(A[::-1], k=-2)
You obtain the following output:
array([1, 5, 1, 6, 5])
array([2, 6, 4, 0])
array([9, 1, 5])
You can see that we are obtaining the desired rows in reversed order, and without padding. This last issue has an easy solution: np.pad(), whose first argument is the vector to be padded, and the second argument is the width of the padding (before, after).
Thus, we have to set this width to:
(0, 5 - len(np.diag(A[::-1], k=n))) # Equivalent to (0, -n) because n <= 0 here, but the explicit form is easier to follow
where n is the level of the diagonal.
And there we have it, just initialize B:
B = np.empty_like(A)
And change each vector of B:
for i in range(5):
    pad_width = (0, 5 - len(np.diag(A[::-1], k=i-4)))
    B[i, :] = np.pad(np.diag(A[::-1], k=i-4), pad_width)
And the output is:
array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
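The loop above hard-codes the size 5. As a minimal sketch under the assumption that A is square, the same idea can be wrapped in a function that works for any size; the name diagonals_to_rows is hypothetical, and it writes each diagonal into a zero-filled array instead of calling np.pad.

```python
import numpy as np

def diagonals_to_rows(A):
    """Put each anti-diagonal of a square matrix A into a zero-padded row."""
    n = A.shape[0]
    flipped = A[::-1]          # flip vertically so anti-diagonals become diagonals
    B = np.zeros_like(A)
    for i in range(n):
        diag = np.diag(flipped, k=i - (n - 1))  # offsets run from -(n-1) up to 0
        B[i, :len(diag)] = diag                 # remaining entries stay 0
    return B

A = np.array([[1, 7, 5, 0, 5],
              [9, 1, 4, 6, 0],
              [9, 6, 1, 0, 0],
              [2, 5, 0, 0, 0],
              [1, 0, 0, 0, 0]])
B = diagonals_to_rows(A)
```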
Here is one vectorized solution using np.meshgrid:
import numpy as np
A = np.array([[1, 7, 5, 0, 5],
[9, 1, 4, 6, 0],
[9, 6, 1, 0, 0],
[2, 5, 0, 0, 0],
[1, 0, 0, 0, 0]])
B = np.array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
n, m = A.shape
ix, iy = np.meshgrid(np.arange(n), np.arange(m))
iy = (iy - np.arange(m)) % m
# array([[0, 4, 3, 2, 1],
# [1, 0, 4, 3, 2],
# [2, 1, 0, 4, 3],
# [3, 2, 1, 0, 4],
# [4, 3, 2, 1, 0]])
B2 = A[iy, ix]
assert (B2 == B).all()

What is the purpose of rotating filters while building convolutions with scipy signal?

I recently came across a bit of Python code (shown below) that does 2D convolution with scipy.signal.
import numpy as np
from scipy import signal
x = np.array([[1, 1, 1, 0, 0],
[0, 1, 1, 1, 0],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 0],
[0, 1, 1, 0, 0]],
dtype='float')
w_k = np.array([[1, 0, 1],
[0, 1, 0],
[1, 0, 1],],
dtype='float')
w_k = np.rot90(w_k, 2)
f = signal.convolve2d(x, w_k, 'valid')
Right before the convolve2d operation, the filter was rotated. What is the purpose of that?
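One likely explanation, offered here as a sketch and not as something stated in the question: convolve2d implements true mathematical convolution, which flips the kernel before sliding it over the input. Rotating the kernel by 180 degrees beforehand cancels that flip, so the result is the plain sliding-window sum of products (cross-correlation) that neural-network "convolution" layers usually compute. The array values below are made up for illustration, and the kernel is deliberately asymmetric so the difference is visible.

```python
import numpy as np
from scipy import signal

x = np.arange(25, dtype=float).reshape(5, 5)
w = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])  # asymmetric on purpose

# Sliding-window sum of products with the kernel exactly as written
manual = np.array([[np.sum(x[i:i + 3, j:j + 3] * w) for j in range(3)]
                   for i in range(3)])

flipped_conv = signal.convolve2d(x, np.rot90(w, 2), 'valid')  # rotate, then convolve
plain_conv = signal.convolve2d(x, w, 'valid')                 # convolve without rotating
corr = signal.correlate2d(x, w, 'valid')                      # cross-correlation

print(np.allclose(flipped_conv, manual), np.allclose(corr, manual))
```

Note that the kernel in the question is unchanged by a 180 degree rotation, so for that particular w_k the rot90 call makes no difference to the output.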

Find 'distance from the edge' of a numpy array

I have a numpy array with 1s & 0s (or bools if that's easier).
I would like to find the distance from each 1 to its closest 'edge' (an edge is where a 1 meets a 0).
Toy example:
Original array:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1]])
Result:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 2, 1],
[0, 1, 1, 1]])
If possible, I'd like to use the 'cityblock' distance, but that's lower priority
Thanks!
Here's a vectorized approach using binary_erosion & cdist(..'cityblock') -
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import cdist
def dist_from_edge(img):
    img = np.asarray(img, dtype=bool)
    I = binary_erosion(img) # Interior mask
    C = img & ~I # Contour mask (bool arrays do not support the `-` operator)
    out = C.astype(int) # Setup o/p: contour pixels get distance 1
    # Interior pixels: cityblock distance to the nearest contour pixel, plus 1
    out[I] = cdist(np.argwhere(C), np.argwhere(I), 'cityblock').min(0) + 1
    return out
Sample run -
In [188]: img.astype(int)
Out[188]:
array([[0, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1],
[0, 0, 1, 1, 1, 1, 1],
[0, 0, 0, 1, 0, 0, 0]])
In [189]: dist_from_edge(img)
Out[189]:
array([[0, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 1, 2, 1, 0],
[0, 1, 2, 2, 3, 2, 1],
[0, 1, 2, 3, 2, 2, 1],
[0, 0, 1, 2, 1, 1, 1],
[0, 0, 0, 1, 0, 0, 0]])
Here's an input, output on a human blob -
Here's one way you can do this with scipy.ndimage.distance_transform_cdt (or scipy.ndimage.distance_transform_bf):
import numpy as np
from scipy.ndimage import distance_transform_cdt
def distance_from_edge(x):
    x = np.pad(x, 1, mode='constant')
    dist = distance_transform_cdt(x, metric='taxicab')
    return dist[1:-1, 1:-1]
For example:
In [327]: a
Out[327]:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1]])
In [328]: distance_from_edge(a)
Out[328]:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 2, 1],
[0, 1, 1, 1]], dtype=int32)
In [329]: x
Out[329]:
array([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]])
In [330]: distance_from_edge(x)
Out[330]:
array([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 2, 2, 2, 2, 1, 0, 0, 0, 1, 0, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 1, 2, 1, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 0, 1, 0, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 2, 1, 0, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 2, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]], dtype=int32)
If you don't pad the array with zeros, you get the distance to the nearest 0 in the array:
In [335]: distance_transform_cdt(a, metric='taxicab')
Out[335]:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 2, 2],
[0, 1, 2, 3]], dtype=int32)
In [336]: distance_transform_cdt(x, metric='taxicab')
Out[336]:
array([[6, 5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0],
[5, 5, 4, 3, 2, 1, 0, 0, 0, 1, 0, 0],
[4, 4, 4, 3, 2, 1, 0, 0, 1, 2, 1, 0],
[3, 3, 4, 3, 2, 1, 0, 0, 0, 1, 0, 0],
[2, 2, 3, 3, 2, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 2, 1, 0, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 2, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2]], dtype=int32)
Here is a different method that uses scipy.ndimage.binary_erosion. I wrote this before I discovered the distance transform function. I'm sure there are much more efficient methods, but this should work reasonably well for images that are not too big.
import numpy as np
from scipy.ndimage import binary_erosion
def distance_from_edge(x):
    dist = np.zeros_like(x, dtype=int)
    while np.count_nonzero(x) > 0:
        dist += x # Assumes x is an array of 0s and 1s, or bools.
        x = binary_erosion(x)
    return dist
For example,
In [291]: a
Out[291]:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1]])
In [292]: distance_from_edge(a)
Out[292]:
array([[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 2, 1],
[0, 1, 1, 1]])
In [293]: x
Out[293]:
array([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]])
In [294]: distance_from_edge(x)
Out[294]:
array([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 2, 2, 2, 2, 1, 0, 0, 0, 1, 0, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 1, 2, 1, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 0, 1, 0, 0],
[1, 2, 3, 3, 2, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 2, 1, 0, 0, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 2, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]])
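The two answers above report the same output on the sample arrays. A small sketch like the following can be used to check that on your own data; the function names edge_distance_cdt and edge_distance_erosion are hypothetical renamings chosen so that both versions can live in one script.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_cdt

def edge_distance_cdt(x):
    # Pad with zeros so the array border also counts as an edge
    padded = np.pad(x, 1, mode='constant')
    return distance_transform_cdt(padded, metric='taxicab')[1:-1, 1:-1]

def edge_distance_erosion(x):
    x = np.asarray(x, dtype=bool)
    dist = np.zeros(x.shape, dtype=int)
    while x.any():
        dist += x              # every surviving pixel is one step deeper
        x = binary_erosion(x)  # peel off the current outer layer
    return dist

a = np.array([[0, 0, 0, 0],
              [0, 1, 1, 1],
              [0, 1, 1, 1],
              [0, 1, 1, 1]])
d_cdt = edge_distance_cdt(a)
d_ero = edge_distance_erosion(a)
print(np.array_equal(d_cdt, d_ero))
```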

TensorFlow: Unpooling

Is there a native TensorFlow function that does unpooling for deconvolutional networks?
I have written this in plain Python, but it gets complicated when I try to translate it to TensorFlow, because its objects do not even support item assignment at the moment, and I think this is a great inconvenience with TF.
I don't think there is an official unpooling layer yet, which is frustrating because you have to use image resize (bilinear interpolation or nearest neighbor), which is like an average unpooling operation and is really slow. Look at the tf api in the section 'image' and you will find it.
TensorFlow has tf.nn.max_pool_with_argmax, which gives you the max-pooled output as well as the argmax positions. That is nice because you could use them in an unpooling layer to preserve the 'lost' spatial information, but there doesn't seem to be an unpooling operation that does this. I guess that they are planning to add it ... soon.
Edit: I found someone on google discuss a week ago who seems to have implemented something like this, but I personally haven't tried it yet.
https://github.com/ppwwyyxx/tensorpack/blob/master/tensorpack/models/pool.py#L66
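To make the idea of unpooling with recorded argmax positions concrete, here is a minimal NumPy sketch for a single 2D array with even side lengths. It is only an illustration of the concept: the function names are hypothetical, and the flat index layout used here is an assumption that does not match the batch and channel layout of tf.nn.max_pool_with_argmax.

```python
import numpy as np

def max_pool_with_argmax_2x2(x):
    """2x2, stride-2 max pooling; also returns flat indices of each maximum in x."""
    h, w = x.shape
    # Gather the non-overlapping 2x2 windows: shape (h//2, w//2, 4)
    windows = (x.reshape(h // 2, 2, w // 2, 2)
                .transpose(0, 2, 1, 3)
                .reshape(h // 2, w // 2, 4))
    local = windows.argmax(axis=-1)  # position 0..3 inside each window
    pooled = windows.max(axis=-1)
    rows = 2 * np.arange(h // 2)[:, None] + local // 2
    cols = 2 * np.arange(w // 2)[None, :] + local % 2
    return pooled, rows * w + cols

def max_unpool_2x2(pooled, argmax, shape):
    """Write each pooled value back to the position it came from; zeros elsewhere."""
    out = np.zeros(shape, dtype=pooled.dtype)
    np.put(out, argmax.ravel(), pooled.ravel())
    return out

x = np.array([[1, 2, 5, 0],
              [3, 4, 0, 1],
              [0, 9, 2, 2],
              [1, 0, 8, 3]])
pooled, argmax = max_pool_with_argmax_2x2(x)
restored = max_unpool_2x2(pooled, argmax, x.shape)
```

Every maximum returns to its original location and all other entries become zero, which is the 'lost' spatial information that image resizing cannot recover.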
There are a couple of TensorFlow implementations in pooling.py.
Namely:
1) An unpool operation (source) that uses the output of tf.nn.max_pool_with_argmax. Note, however, that as of TensorFlow 1.0 tf.nn.max_pool_with_argmax is GPU-only.
2) An upsample operation that mimics the inverse of max-pooling by filling the positions of the unpooled region with either zeros or copies of the max element.
Compared to tensorpack, it allows copies of elements instead of zeros and supports strides other than [2, 2].
No recompile needed, and it is back-prop friendly.
Illustration:
I was searching for a maxunpooling operation and tried implementing it. I came up with some kind of hacky implementation for the gradient, as I was struggling with CUDA.
The code is here, you will need to build it from source with GPU support.
Below is a demo application. No warranties, though!
There also exists an open issue for this operation.
import tensorflow as tf
import numpy as np
def max_pool(inp, k=2):
    return tf.nn.max_pool_with_argmax_and_mask(inp, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")
def max_unpool(inp, argmax, argmax_mask, k=2):
    return tf.nn.max_unpool(inp, argmax, argmax_mask, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")
def conv2d(inp, name):
    w = weights[name]
    b = biases[name]
    var = tf.nn.conv2d(inp, w, [1, 1, 1, 1], padding='SAME')
    var = tf.nn.bias_add(var, b)
    var = tf.nn.relu(var)
    return var
def conv2d_transpose(inp, name, dropout_prob):
    w = weights[name]
    b = biases[name]
    dims = inp.get_shape().dims[:3]
    dims.append(w.get_shape()[-2]) # adopt channels from weights (weight definition for deconv has switched input and output channel!)
    out_shape = tf.TensorShape(dims)
    var = tf.nn.conv2d_transpose(inp, w, out_shape, strides=[1, 1, 1, 1], padding="SAME")
    var = tf.nn.bias_add(var, b)
    if dropout_prob is not None:
        var = tf.nn.relu(var)
        var = tf.nn.dropout(var, dropout_prob)
    return var
weights = {
"conv1": tf.Variable(tf.random_normal([3, 3, 3, 16])),
"conv2": tf.Variable(tf.random_normal([3, 3, 16, 32])),
"conv3": tf.Variable(tf.random_normal([3, 3, 32, 32])),
"deconv2": tf.Variable(tf.random_normal([3, 3, 16, 32])),
"deconv1": tf.Variable(tf.random_normal([3, 3, 1, 16])) }
biases = {
"conv1": tf.Variable(tf.random_normal([16])),
"conv2": tf.Variable(tf.random_normal([32])),
"conv3": tf.Variable(tf.random_normal([32])),
"deconv2": tf.Variable(tf.random_normal([16])),
"deconv1": tf.Variable(tf.random_normal([ 1])) }
## Build Miniature CEDN
x = tf.placeholder(tf.float32, [12, 20, 20, 3])
y = tf.placeholder(tf.float32, [12, 20, 20, 1])
p = tf.placeholder(tf.float32)
conv1 = conv2d(x, "conv1")
maxp1, maxp1_argmax, maxp1_argmax_mask = max_pool(conv1)
conv2 = conv2d(maxp1, "conv2")
maxp2, maxp2_argmax, maxp2_argmax_mask = max_pool(conv2)
conv3 = conv2d(maxp2, "conv3")
maxup2 = max_unpool(conv3, maxp2_argmax, maxp2_argmax_mask)
deconv2 = conv2d_transpose(maxup2, "deconv2", p)
maxup1 = max_unpool(deconv2, maxp1_argmax, maxp1_argmax_mask)
deconv1 = conv2d_transpose(maxup1, "deconv1", None)
## Optimizing Stuff
loss = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(deconv1, y))
optimizer = tf.train.AdamOptimizer(learning_rate=1).minimize(loss)
## Test Data
np.random.seed(123)
batch_x = np.where(np.random.rand(12, 20, 20, 3) > 0.5, 1.0, -1.0)
batch_y = np.where(np.random.rand(12, 20, 20, 1) > 0.5, 1.0, 0.0)
prob = 0.5
with tf.Session() as session:
    tf.set_random_seed(123)
    session.run(tf.initialize_all_variables())
    print("\n\n")
    for i in range(10):
        session.run(optimizer, feed_dict={x: batch_x, y: batch_y, p: prob})
        print("step", i + 1)
        print("loss", session.run(loss, feed_dict={x: batch_x, y: batch_y, p: 1.0}), "\n\n")
Edit 29.11.17
Some time back, I reimplemented it in a clean fashion against TensorFlow 1.0; the forward operations are also available in a CPU version. You can find it in this branch. I recommend looking at the last few commits if you want to use it.
Nowadays there's a Tensorflow Addon MaxUnpooling2D:
Unpool the outputs of a maximum pooling operation.
tfa.layers.MaxUnpooling2D(
pool_size: Union[int, Iterable[int]] = (2, 2),
strides: Union[int, Iterable[int]] = (2, 2),
padding: str = 'SAME',
**kwargs
)
This class can e.g. be used as
import tensorflow as tf
import tensorflow_addons as tfa
pooling, max_index = tf.nn.max_pool_with_argmax(input, 2, 2, padding='SAME')
unpooling = tfa.layers.MaxUnpooling2D()(pooling, max_index)
I checked the implementation that shagas mentioned here, and it works.
x = [[[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]],
[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]],
[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]]]]
x = np.array(x)
inp = tf.convert_to_tensor(x)
out = UnPooling2x2ZeroFilled(inp)
out
Out[19]:
<tf.Tensor: id=36, shape=(1, 6, 12, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]],
[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]],
[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]]])>
out1 = tf.keras.layers.MaxPool2D()(out)
out1
Out[37]:
<tf.Tensor: id=118, shape=(1, 3, 6, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]],
[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]],
[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]]]])>
If you need max unpooling then you can use (though I didn't check it) this one
Here is my implementation. Apply the max-pooling with tf.nn.max_pool_with_argmax, then pass the argmax result it returns to the function below.
import tensorflow as tf
def unpooling(inputs, output_shape, argmax):
    """
    Performs unpooling, as explained in:
    https://www.oreilly.com/library/view/hands-on-convolutional-neural/9781789130331/6476c4d5-19f2-455f-8590-c6f99504b7a5.xhtml
    :param inputs: Input Tensor.
    :param output_shape: Desired output shape. For example, on 2D unpooling, this should be 4D (because of number of samples and channels).
    :param argmax: Result argmax from tf.nn.max_pool_with_argmax
    https://www.tensorflow.org/api_docs/python/tf/nn/max_pool_with_argmax
    """
    flat_output_shape = tf.cast(tf.reduce_prod(output_shape), tf.int64)
    # Flatten values and indices so each value is scattered to its flat position
    updates = tf.reshape(inputs, [-1])
    indices = tf.expand_dims(tf.reshape(argmax, [-1]), axis=-1)
    ret = tf.scatter_nd(indices, updates, shape=[flat_output_shape])
    ret = tf.reshape(ret, output_shape)
    return ret
This has a small bug/feature: if argmax contains a repeated value, the corresponding inputs are added together instead of the value being written just once. Beware of this if the stride is 1. I don't know, however, whether this is desired or not.
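The accumulation on repeated indices can be illustrated without TensorFlow. The sketch below is a NumPy analogy, not the TensorFlow code path itself: np.add.at sums updates that share an index, much like tf.scatter_nd does, while plain fancy-index assignment keeps only the last value written. The sample values are made up for illustration.

```python
import numpy as np

updates = np.array([5.0, 7.0, 2.0])
indices = np.array([1, 1, 3])  # index 1 appears twice

# Accumulating scatter: repeated indices are summed
summed = np.zeros(4)
np.add.at(summed, indices, updates)

# Plain assignment: for a repeated index, the last value wins
assigned = np.zeros(4)
assigned[indices] = updates

print(summed, assigned)
```

If overlapping pooling windows can select the same position, the summed variant counts that maximum once per window, which is the behaviour described above.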