find numpy rows that are the same - numpy

I have a numpy array.
How can I find which of its rows are the same, and how many times each of them appears in the matrix?
thanks
dummy example:
A=np.array([[0, 1, 0, 1],[0, 0, 0, 0],[0, 1, 1, 1],[0, 0, 0, 0]])

You can use numpy.unique with axis=0 and return_counts=True:
np.unique(A, axis=0, return_counts=True)
Output:
(array([[0, 0, 0, 0],
[0, 1, 0, 1],
[0, 1, 1, 1]]),
array([2, 1, 1]))
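If you also need to know which rows of A are the duplicates, np.unique can return the inverse mapping too. A minimal sketch, assuming the same A as in the question (the names groups and duplicated are only illustrative):

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [0, 0, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 0, 0]])

# inverse[i] is the position in unique_rows of the row A[i]
unique_rows, inverse, counts = np.unique(
    A, axis=0, return_inverse=True, return_counts=True
)
inverse = inverse.ravel()  # some NumPy versions return a 2-D inverse when axis is given

# Group the original row indices by the unique row they match
groups = {tuple(int(v) for v in row): np.flatnonzero(inverse == k).tolist()
          for k, row in enumerate(unique_rows)}

# Indices of the rows that occur more than once
duplicated = np.flatnonzero(counts[inverse] > 1)
```

Here groups maps each distinct row to the indices where it occurs, and duplicated lists the indices of rows that have at least one twin.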

Related

How to create a sparse matrix with a given base matrix?

I have the following 2 x 2 matrix
1 0
1 1
I want to expand this matrix with dimensions in powers of 2. For example the matrix with dimension 4 would look like:
1 0 0 0
1 1 0 0
1 0 1 0
1 1 1 1
Essentially, I want to retain the original matrix wherever 1 occurs in the base matrix and fill up zeros where 0 occurs in the base matrix? Is there a fast way to do this in numpy or scipy? I want to be able to expand this to any power of 2, say 512 or 1024.
For relatively small exponents k (say up to 10), you can recursively replace every 1 with the initial matrix a using np.block:
import numpy as np
a = np.array([[1, 0], [1, 1]])
def generate(a, k):
    z = np.zeros_like(a)
    result = a.copy()
    for _ in range(1, k):
        # Rewrite the nested list as text (1 -> a, 0 -> z) and let np.block assemble it
        result = eval(f"np.block({str(result.tolist()).replace('1', 'a').replace('0', 'z')})")
    return result
Example for k=3 (8x8 result matrix) generate(a, 3):
array([[1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 1, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1]])
You can combine tile and repeat (arr is the 2 x 2 base matrix).
>>> np.tile(arr, (2, 2))
array([[1, 0, 1, 0],
[1, 1, 1, 1],
[1, 0, 1, 0],
[1, 1, 1, 1]])
>>> np.repeat(np.repeat(arr, 2, axis=1), 2, axis=0)
array([[1, 1, 0, 0],
[1, 1, 0, 0],
[1, 1, 1, 1],
[1, 1, 1, 1]])
Then just multiply:
def tile_mask(a):
    tiled = np.tile(a, (2, 2))
    mask = np.repeat(
        np.repeat(a, 2, axis=1),
        2, axis=0
    )
    return tiled * mask
>>> tile_mask(arr)
array([[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 1, 1]])
I don't know of a good way to do this for higher powers besides recursion though:
def tile_mask(a, n=2):
    if n > 2:
        a = tile_mask(a, n-1)
    tiled = np.tile(a, (2, 2))
    mask = np.repeat(
        np.repeat(a, 2, axis=1),
        2, axis=0
    )
    return tiled * mask
>>> tile_mask(arr, 3)
array([[1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 1, 0, 0, 0],
[1, 1, 0, 0, 1, 1, 0, 0],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 1, 1, 1, 1, 1, 1, 1]])
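The pattern asked for is the repeated Kronecker product of the base matrix with itself, so np.kron may be a simpler route than either approach above. A minimal sketch, assuming the same 2 x 2 base matrix (kron_power is an illustrative name, not a NumPy function):

```python
from functools import reduce

import numpy as np

def kron_power(a, k):
    """Return the k-fold Kronecker product of a with itself."""
    return reduce(np.kron, [a] * k)

a = np.array([[1, 0], [1, 1]])
m4 = kron_power(a, 2)   # 4 x 4 matrix from the question
m8 = kron_power(a, 3)   # 8 x 8 matrix
```

With k=10 this gives the 1024 x 1024 case. If a sparse result is needed, scipy.sparse.kron computes the same product on sparse matrices, which is probably worth trying at that size.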

rotate diagonal of a 2d numpy array into row

I have a 2d numpy array:
A = array([[1, 7, 5, 0, 5],
[9, 1, 4, 6, 0],
[9, 6, 1, 0, 0],
[2, 5, 0, 0, 0],
[1, 0, 0, 0, 0]])
What I want to achieve is
B = array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
So basically every diagonal of A is a row in B with 0 padded. Is there any efficient way to achieve this?
This is what I could come up with:
B = np.empty_like(A)
for i in range(5):
    n = i - 4  # diagonal level: -4 (single corner element) up to 0 (main diagonal)
    diag = np.diag(A[::-1], k=n)
    B[i, :] = np.pad(diag, (0, 5 - len(diag)))
Here is the explanation:
You can use np.diag(). This returns the diagonal at level k, where k is the position of the diagonal you want. If k is 0, it returns the main diagonal.
However, you need to reverse the matrix first, with A[::-1].
If you run the code so far:
np.diag(A[::-1], k=0)
np.diag(A[::-1], k=-1)
np.diag(A[::-1], k=-2)
You obtain the following output:
array([1, 5, 1, 6, 5])
array([2, 6, 4, 0])
array([9, 1, 5])
You can see that we are obtaining the desired rows in reversed order, and without padding. This last issue has an easy solution: np.pad(), whose first argument is the vector to be padded, and the second argument is the width of the padding (before, after).
Thus, we have to set this width to:
(0, 5 - len(np.diag(A[::-1], k=n))) # For n <= 0 this equals (0, -n), which is more efficient, but this way is more understandable
where n is the level of the diagonal.
And there we have it, just initialize B:
B = np.empty_like(A)
And change each vector of B:
for i in range(5):
    diag = np.diag(A[::-1], k=i-4)  # diagonal at level n = i - 4
    B[i, :] = np.pad(diag, (0, 5 - len(diag)))
And the output is:
array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
Here is one vectorized solution using np.meshgrid:
import numpy as np
A = np.array([[1, 7, 5, 0, 5],
[9, 1, 4, 6, 0],
[9, 6, 1, 0, 0],
[2, 5, 0, 0, 0],
[1, 0, 0, 0, 0]])
B = np.array([[1, 0, 0, 0, 0],
[9, 7, 0, 0, 0],
[9, 1, 5, 0, 0],
[2, 6, 4, 0, 0],
[1, 5, 1, 6, 5]])
n, m = A.shape
ix, iy = np.meshgrid(np.arange(n), np.arange(m))
iy = (iy - np.arange(m)) % m
# array([[0, 4, 3, 2, 1],
# [1, 0, 4, 3, 2],
# [2, 1, 0, 4, 3],
# [3, 2, 1, 0, 4],
# [4, 3, 2, 1, 0]])
B2 = A[iy, ix]
assert (B2 == B).all()
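The index arithmetic above boils down to B[i, j] = A[(i - j) mod n, j]. The same thing can be sketched with broadcasting instead of np.meshgrid, assuming a square A (rows and cols are illustrative names):

```python
import numpy as np

A = np.array([[1, 7, 5, 0, 5],
              [9, 1, 4, 6, 0],
              [9, 6, 1, 0, 0],
              [2, 5, 0, 0, 0],
              [1, 0, 0, 0, 0]])

n = A.shape[0]
cols = np.arange(n)                        # column index j, shape (n,)
rows = (np.arange(n)[:, None] - cols) % n  # rows[i, j] = (i - j) mod n
B = A[rows, cols]                          # B[i, j] = A[(i - j) mod n, j]
```

Note that the modulo wraps around, so the entries above the main diagonal of B are read from below the anti-diagonal of A. They come out as zeros here only because that part of A is zero; for a general A, applying np.tril to the result would zero them out.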

What is the purpose of rotating filters while building convolutions with scipy signal?

I recently came across a bit of python code (shown below) which does 2d convolution with scipy signal.
x = np.array([[1, 1, 1, 0, 0],
[0, 1, 1, 1, 0],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 0],
[0, 1, 1, 0, 0]],
dtype='float')
w_k = np.array([[1, 0, 1],
[0, 1, 0],
[1, 0, 1],],
dtype='float')
w_k = np.rot90(w_k, 2)
f = signal.convolve2d(x, w_k, 'valid')
Right before the convolve2d operation, the filter was rotated. What is the purpose of that?
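A quick check that may help here, offered as a sketch rather than an answer from the thread: signal.convolve2d flips the kernel along both axes before sliding it, so rotating the kernel by 180 degrees beforehand cancels the flip, and the result equals cross-correlation (the operation most deep learning frameworks call "convolution"). The input and the asymmetric kernel w below are made up for illustration; the w_k in the question is unchanged by a 180 degree rotation, so for that particular filter the rotation makes no difference.

```python
import numpy as np
from scipy import signal

x = np.arange(25, dtype=float).reshape(5, 5)
# Asymmetric kernel (illustrative), so that flipping it changes the result
w = np.array([[1., 2., 0.],
              [0., 1., 0.],
              [0., 0., 3.]])

conv = signal.convolve2d(x, w, 'valid')                   # true convolution: kernel is flipped
conv_rot = signal.convolve2d(x, np.rot90(w, 2), 'valid')  # the rotation cancels the flip
corr = signal.correlate2d(x, w, 'valid')                  # cross-correlation: no flip
```

Comparing the arrays with np.allclose shows that conv_rot matches corr, while conv does not.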

How to generate the bernoulli tensor in tensorflow

How can I generate a tensor in tensorflow of Bernoulli distribution with factor p ?
For example:
a = tf.bernoulli(shape=[10,10], p)
generates a matrix 10x10 of 0-1 where each element of matrix is one with probability p and zero with probability 1-p.
I solved my problem! :D
The following code generates what I need, with p=0.7:
import tensorflow as tf
shape = [10, 10]
p = tf.constant([0.7])
r = tf.random.uniform(shape=shape, maxval=1)  # uniform samples in [0, 1)
b = tf.math.greater(p, r)  # True with probability p
f = tf.cast(b, dtype=tf.float32)
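The thresholding idea is not specific to TensorFlow: a uniform sample on [0, 1) falls below p with probability p. A NumPy sketch of the same comparison, only to make that reasoning easy to check (the function name bernoulli and the seed are illustrative):

```python
import numpy as np

def bernoulli(shape, p, seed=None):
    """Sample a 0/1 array where each entry is 1 with probability p."""
    rng = np.random.default_rng(seed)
    r = rng.random(shape)              # uniform samples on [0, 1)
    return (r < p).astype(np.float32)  # 1.0 where the sample falls below p

sample = bernoulli((10, 10), 0.7, seed=0)
big = bernoulli((1000, 1000), 0.7, seed=0)
```

On the large sample, big.mean() should land very close to 0.7.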
You can use Bernoulli distribution from Tensorflow probability library which is an extension built on Tensorflow:
import tensorflow_probability as tfp
x = tfp.distributions.Bernoulli(probs=0.7).sample(sample_shape=(10, 10))
x
This will output
<tf.Tensor: shape=(10, 10), dtype=int32, numpy=
array([[1, 1, 1, 1, 0, 0, 0, 1, 1, 0],
[1, 0, 1, 0, 1, 1, 0, 1, 0, 0],
[1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 0, 1, 1, 1, 1, 1],
[1, 1, 0, 1, 1, 1, 1, 0, 1, 1],
[1, 0, 1, 1, 1, 0, 0, 1, 0, 1],
[0, 1, 0, 1, 0, 1, 1, 1, 1, 1],
[1, 0, 0, 1, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
[1, 0, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int32)>
The other method is to use a similar class from tf.compat:
import tensorflow as tf
x = tf.compat.v1.distributions.Bernoulli(probs=0.7).sample(sample_shape=(10, 10))
x
You will get x as you want, but you will also get a bunch of deprecation warnings. So I would recommend the first variant.
Also keep in mind that when I tested this code, the latest version of the tensorflow_probability library required tensorflow>=2.3.
What about using tf.random.Generator.binomial? It does not require installing tensorflow_probability.
https://www.tensorflow.org/api_docs/python/tf/random/Generator#binomial
rng = tf.random.Generator.from_seed(seed=234)
rng.binomial(shape=[10], counts=1.0, probs=0.7)
With counts = 1.0, the binomial distribution is the same as the Bernoulli distribution.
The output of the above will be as follows:
<tf.Tensor: shape=(10,), dtype=int32, numpy=array([1, 1, 0, 1, 0, 1, 0, 0, 0, 1], dtype=int32)>

TensorFlow: Unpooling

Is there a native TensorFlow function that does unpooling for Deconvolutional Networks?
I have written this in plain Python, but it gets complicated when I try to translate it to TensorFlow, since its objects do not even support item assignment at the moment, and I think this is a great inconvenience with TF.
I don't think there is an official unpooling layer yet, which is frustrating because you have to use image resize (bilinear interpolation or nearest neighbor), which is like an average unpooling operation, and it's really slow. Look at the 'image' section of the tf api and you will find it.
Tensorflow has tf.nn.max_pool_with_argmax, which gives you the maxpooled output as well as the activation map. That is nice because you could use it in an unpooling layer to preserve the 'lost' spatial information, but there doesn't seem to be an unpooling operation that does this. I guess that they are planning to add it ... soon.
Edit: I found some guy on google discuss a week ago who seems to have implemented something like this but I personally haven't tried it yet.
https://github.com/ppwwyyxx/tensorpack/blob/master/tensorpack/models/pool.py#L66
There are a couple of TensorFlow implementations here: pooling.py
Namely:
1) unpool operation (source) that utilizes the output of tf.nn.max_pool_with_argmax. Please note, though, that as of TensorFlow 1.0 tf.nn.max_pool_with_argmax is GPU-only.
2) upsample operation that mimics the inverse of max-pooling by filling the positions of the unpooled region with either zeros or copies of the max element.
Compared to tensorpack, it allows copies of elements instead of zeros and supports strides other than [2, 2].
No recompile, back-prop friendly.
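The two fill modes mentioned above (zeros versus copies of the element) can be illustrated on a single 2-D channel in NumPy. This is only a sketch of the idea with a 2 x 2 window, not the code from pooling.py:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])

# Zero-filled: each value goes to the top-left corner of its 2 x 2 block
zero_filled = np.kron(x, np.array([[1, 0], [0, 0]]))

# Copy-filled: each value is repeated across its whole 2 x 2 block
copy_filled = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
```

Max pooling either result with a 2 x 2 window and stride 2 gives back x when the values are non-negative.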
I was searching for a maxunpooling operation and tried implementing it. I came up with some kind of hacky implementation for the gradient, as I was struggling with CUDA.
The code is here, you will need to build it from source with GPU support.
Below is a demo application. No warranties, though!
There also exists an open issue for this operation.
import tensorflow as tf
import numpy as np
def max_pool(inp, k=2):
    return tf.nn.max_pool_with_argmax_and_mask(inp, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")
def max_unpool(inp, argmax, argmax_mask, k=2):
    return tf.nn.max_unpool(inp, argmax, argmax_mask, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")
def conv2d(inp, name):
    w = weights[name]
    b = biases[name]
    var = tf.nn.conv2d(inp, w, [1, 1, 1, 1], padding='SAME')
    var = tf.nn.bias_add(var, b)
    var = tf.nn.relu(var)
    return var
def conv2d_transpose(inp, name, dropout_prob):
    w = weights[name]
    b = biases[name]
    dims = inp.get_shape().dims[:3]
    dims.append(w.get_shape()[-2]) # adopt channels from weights (weight definition for deconv has switched input and output channel!)
    out_shape = tf.TensorShape(dims)
    var = tf.nn.conv2d_transpose(inp, w, out_shape, strides=[1, 1, 1, 1], padding="SAME")
    var = tf.nn.bias_add(var, b)
    if dropout_prob is not None:
        var = tf.nn.relu(var)
        var = tf.nn.dropout(var, dropout_prob)
    return var
weights = {
"conv1": tf.Variable(tf.random_normal([3, 3, 3, 16])),
"conv2": tf.Variable(tf.random_normal([3, 3, 16, 32])),
"conv3": tf.Variable(tf.random_normal([3, 3, 32, 32])),
"deconv2": tf.Variable(tf.random_normal([3, 3, 16, 32])),
"deconv1": tf.Variable(tf.random_normal([3, 3, 1, 16])) }
biases = {
"conv1": tf.Variable(tf.random_normal([16])),
"conv2": tf.Variable(tf.random_normal([32])),
"conv3": tf.Variable(tf.random_normal([32])),
"deconv2": tf.Variable(tf.random_normal([16])),
"deconv1": tf.Variable(tf.random_normal([ 1])) }
## Build Miniature CEDN
x = tf.placeholder(tf.float32, [12, 20, 20, 3])
y = tf.placeholder(tf.float32, [12, 20, 20, 1])
p = tf.placeholder(tf.float32)
conv1 = conv2d(x, "conv1")
maxp1, maxp1_argmax, maxp1_argmax_mask = max_pool(conv1)
conv2 = conv2d(maxp1, "conv2")
maxp2, maxp2_argmax, maxp2_argmax_mask = max_pool(conv2)
conv3 = conv2d(maxp2, "conv3")
maxup2 = max_unpool(conv3, maxp2_argmax, maxp2_argmax_mask)
deconv2 = conv2d_transpose(maxup2, "deconv2", p)
maxup1 = max_unpool(deconv2, maxp1_argmax, maxp1_argmax_mask)
deconv1 = conv2d_transpose(maxup1, "deconv1", None)
## Optimizing Stuff
loss = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(deconv1, y))
optimizer = tf.train.AdamOptimizer(learning_rate=1).minimize(loss)
## Test Data
np.random.seed(123)
batch_x = np.where(np.random.rand(12, 20, 20, 3) > 0.5, 1.0, -1.0)
batch_y = np.where(np.random.rand(12, 20, 20, 1) > 0.5, 1.0, 0.0)
prob = 0.5
with tf.Session() as session:
    tf.set_random_seed(123)
    session.run(tf.initialize_all_variables())
    print("\n\n")
    for i in range(10):
        session.run(optimizer, feed_dict={x: batch_x, y: batch_y, p: prob})
        print("step", i + 1)
        print("loss", session.run(loss, feed_dict={x: batch_x, y: batch_y, p: 1.0}), "\n\n")
Edit 29.11.17
Some time back, I reimplemented it in a clean fashion against TensorFlow 1.0; the forward operations are also available in a CPU version. You can find it in this branch. I recommend looking at the last few commits if you want to use it.
Nowadays there's a Tensorflow Addon MaxUnpooling2D:
Unpool the outputs of a maximum pooling operation.
tfa.layers.MaxUnpooling2D(
pool_size: Union[int, Iterable[int]] = (2, 2),
strides: Union[int, Iterable[int]] = (2, 2),
padding: str = 'SAME',
**kwargs
)
This class can e.g. be used as
import tensorflow as tf
import tensorflow_addons as tfa
pooling, max_index = tf.nn.max_pool_with_argmax(input, 2, 2, padding='SAME')
unpooling = tfa.layers.MaxUnpooling2D()(pooling, max_index)
I checked the implementation that shagas mentioned here, and it works.
x = [[[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]],
[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]],
[[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3],
[1, 1, 2,2, 3, 3]]]]
x = np.array(x)
inp = tf.convert_to_tensor(x)
out = UnPooling2x2ZeroFilled(inp)
out
Out[19]:
<tf.Tensor: id=36, shape=(1, 6, 12, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]],
[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]],
[[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0],
[1, 1, 2, 2, 3, 3],
[0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]]])>
out1 = tf.keras.layers.MaxPool2D()(out)
out1
Out[37]:
<tf.Tensor: id=118, shape=(1, 3, 6, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]],
[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]],
[[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3],
[1, 1, 2, 2, 3, 3]]]])>
If you need max unpooling then you can use (though I didn't check it) this one
Here is my implementation. Apply max pooling with tf.nn.max_pool_with_argmax, then pass its argmax result to the function below.
def unpooling(inputs, output_shape, argmax):
    """
    Performs unpooling, as explained in:
    https://www.oreilly.com/library/view/hands-on-convolutional-neural/9781789130331/6476c4d5-19f2-455f-8590-c6f99504b7a5.xhtml
    :param inputs: Input Tensor.
    :param output_shape: Desired output shape. For example, on 2D unpooling, this should be 4D (because of number of samples and channels).
    :param argmax: Result argmax from tf.nn.max_pool_with_argmax
        https://www.tensorflow.org/api_docs/python/tf/nn/max_pool_with_argmax
    """
    flat_output_shape = tf.cast(tf.reduce_prod(output_shape), tf.int64)
    updates = tf.reshape(inputs, [-1])
    indices = tf.expand_dims(tf.reshape(argmax, [-1]), axis=-1)
    ret = tf.scatter_nd(indices, updates, shape=[flat_output_shape])
    ret = tf.reshape(ret, output_shape)
    return ret
This has a small bug/feature: if argmax contains a repeated value, the function adds the values together instead of writing the value once. Beware of this if the stride is 1. I don't know, however, whether this is desired.
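The accumulation on repeated indices comes from tf.scatter_nd, which sums the updates that share an index. A NumPy sketch of the same behaviour on a flat array, with np.add.at standing in for tf.scatter_nd (unpool_flat and the sample values are illustrative):

```python
import numpy as np

def unpool_flat(values, argmax, output_size):
    """Scatter values into a flat zero array, summing on duplicate indices."""
    out = np.zeros(output_size, dtype=values.dtype)
    np.add.at(out, argmax, values)  # unbuffered add: duplicates accumulate
    return out

# Non-overlapping windows: every index is unique
unique_case = unpool_flat(np.array([5., 7.]), np.array([1, 6]), 8)

# Overlapping windows (e.g. stride 1): two windows share the max at index 2
overlap_case = unpool_flat(np.array([9., 9.]), np.array([2, 2]), 4)
```

In the overlapping case the shared maximum of 9 shows up as 18. With a batch size above 1, the flat indices also have to be unique across batch items; tf.nn.max_pool_with_argmax has an include_batch_in_index argument that may be worth checking for that case.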