Add a "frame" of zeros around a matrix in numpy python - numpy

I have a matrix of size (456, 456). I would like to make it of size (460, 460) but adding a frame of two zeros all around it.
Here is an example with a smaller matrix. I would like to transform matrixsmall into matrixbig. What is the best way to do it? The original code operates on lots of data to it would be great to have an efficient solution. Thank you in advance for your help!
import numpy as np
matrixsmall = np.array([[1,2],[2,1]])
matrixbig = np.array([[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 1, 2, 0, 0],
[0, 0, 2, 1, 0, 0],
[0 ,0 ,0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])

np.pad(matrixsmall, (2,2), "constant", constant_values=(0,0))
will do the trick

Related

Find the maximum values of a matrix in rows axis and replace other values to zero

A = [[2,2,4,2,2,2]
[2,6,2,2,2,2]
[2,2,2,2,8,2]]
I want matrix B to be equal to:
B = [[0,0,4,0,0,0]
[0,6,0,0,0,0]
[0,0,0,0,8,0]]
So I want to find the maximum value of each row and replace other values with 0. Is there any way to do this without using for loops?
Thanks in advance for your comments.
Instead of looking at the argmax, you could take the max values for each row directly, then mask the elements which are lower and replace them with zeros:
Inplace this would look like (here True stands for keepdims=True):
>>> A[A < A.max(1, True)] = 0
>>> A
array([[0, 0, 4, 0, 0, 0],
[0, 6, 0, 0, 0, 0],
[0, 0, 0, 0, 8, 0]])
An out of place alternative is to use np.where:
>>> np.where(A == A.max(1, True), A, 0)
array([[0, 0, 4, 0, 0, 0],
[0, 6, 0, 0, 0, 0],
[0, 0, 0, 0, 8, 0]])

Standard implementation of vectorize_sequences

In François Chollet's Deep Learning with Python, appears this function:
def vectorize_sequences(sequences, dimension=10000):
results = np.zeros((len(sequences), dimension))
for i, sequence in enumerate(sequences):
results[i, sequence] = 1.
return results
I understand what this function does. This function is asked about in this quesion and in this question as well, also mentioned here, here, here, here, here & here. Despite being so wide-spread, this vectorization is, according to Chollet's book is done "manually for maximum clarity." I am interested whether there is a standard, not "manual" way of doing it.
Is there a standard Keras / Tensorflow / Scikit-learn / Pandas / Numpy implementation of a function which behaves very similarly to the function above?
Solution with MultiLabelBinarizer
Assuming sequences is an array of integers with maximum possible value upto dimension-1, we can use MultiLabelBinarizer from sklearn.preprocessing to replicate the behaviour of the function vectorize_sequences
from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer(classes=range(dimension))
mlb.fit_transform(sequences)
Solution with Numpy broadcasting
Assuming sequences is an array of integers with maximum possible value upto dimension-1
(np.array(sequences)[:, :, None] == range(dimension)).any(1).view('i1')
Worked out example
>>> sequences
[[4, 1, 0],
[4, 0, 3],
[3, 4, 2]]
>>> dimension = 10
>>> mlb = MultiLabelBinarizer(classes=range(dimension))
>>> mlb.fit_transform(sequences)
array([[1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 0]])
>>> (np.array(sequences)[:, :, None] == range(dimension)).any(1).view('i1')
array([[0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 1, 0, 0, 0, 0, 0]])

how to use keras and pandas to duplicate similar arrays

I have a array from my teacher, he gave me an array like below:
array contains 0,1,none
[[1, 1, 0, 0, none, 0, 1], [1, 0, 0, 0, none,0, 1], [1, 1, none,0, 1, 0, none], [1,1,1,0,none, 0, 0], [1, 1,0, none, 0, 0,1]]
and asked me to duplicate the array ten times but however each column must have similar percent distribution say each column have no more than 8% percent compared with the origin array.
how should i achieve the goal?

how to create 0-1 matrix from probability matrix

(In any language) For a research project, I am stuck on how to convert a matrix P of probability values to a matrix A such that A_ij = 1 with probability P_ij and 0 otherwise? I have looked through various random number generator documentations, but have been unable to figure out how to do this.
If I understand correctly:
In [11]: p = np.random.uniform(size=(5,5))
In [12]: p
Out[12]:
array([[ 0.45481883, 0.21242567, 0.3124863 , 0.00485797, 0.31970718],
[ 0.91995847, 0.29907277, 0.59154085, 0.85847147, 0.13227595],
[ 0.91914631, 0.5495813 , 0.58648856, 0.08037582, 0.23005148],
[ 0.12464628, 0.70657028, 0.75975869, 0.77632964, 0.24587041],
[ 0.69259133, 0.183515 , 0.65500547, 0.19526148, 0.26975325]])
In [13]: a = (p.round(1)==0.7).astype(np.int8)
In [14]: a
Out[14]:
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[1, 0, 1, 0, 0]], dtype=int8)

How can I keep the patch which contain all the elements 1

from sklearn.feature_extraction.image import extract_patches
import numpy as np
data = np.array([[1, 1 , 0 , 0 , 0 , 0 , 1 , 0],
[1, 1 , 1 , 0 , 0 , 1 , 1 , 0],
[1, 1 , 0 , 1 , 1 , 0 , 0 , 0],
[0, 0 , 0 , 1 , 1 , 0 , 0 , 0],
[0, 0 , 0 , 1 , 1 , 0 , 0 , 1],
[1, 1 , 0 , 0 , 0 , 0 , 1 , 0],
[1, 1 , 0 , 0 , 0 , 0 , 0 , 0]])
patches = extract_patches(data, patch_shape=(2, 2))
How can I keep the patch which contain all the elements 1?
From the corrections to your post, I believe you might be looking for a way to detect where submatrices of shape (2,2) are all ones. Anywhere where that condition isn't fulfilled should be zero, but priority should be given to the submatrices where that condition is fulfilled, because submatrices can be overlapping.
In that case, you're most likely interested in the staggered grid of that matrix that has a one in the center of each 2x2 submatrix whenever the 4 elements of that submatrix are all ones:
>>> import numpy as np
>>> from sklearn.feature_extraction.image import extract_patches # similar to numpy's stride_tricks
>>>
>>> data = np.array([[1, 1, 0, 0, 0, 0, 1, 0],
... [1, 1, 1, 0, 0, 1, 1, 0],
... [1, 1, 0, 1, 1, 0, 0, 0],
... [0, 0, 0, 1, 1, 0, 0, 0],
... [0, 0, 0, 1, 1, 0, 0, 1],
... [1, 1, 0, 0, 0, 0, 1, 0],
... [1, 1, 0, 0, 0, 0, 0, 0]])
>>>
>>> # to take boundary effects into account, append ones to the right and bottom
... # modify this to `np.zeros` if boundaries are to be set to zero
... data2 = np.ones((data.shape[0]+1, data.shape[1]+1))
>>> data2[:-1,:-1] = data
>>> vert = np.logical_and(data2[:-1,:], data2[1:,])
>>> dual = np.logical_and(vert[:,:-1], vert[:,1:]) # dual is now the "dual" graph/staggered grid of the data2 array
>>> patches = extract_patches(data2, patch_shape=(2, 2)) # could've used numpy stride_tricks too
>>> patches[dual==0] = 0
>>> patches[dual] = 1 # Give precedence to the dual positives
>>> data2[:-1, :-1].astype(np.uint8)
array([[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0]], dtype=uint8)
For completeness, this staggered grid form of the matrix could also be obtained easily with a correlation with a np.ones((2,2)) kernel. However, that is computationally more heavy, because a lot more work has to be done (multiplications and summations) rather than simple bit-operations. The method above will outperform a correlation-based method in terms of speed.
The staggered grid dual above could also be generated in the following way:
patches = extract_patches(data, patch_shape=(2, 2))
dual = patches.all(axis=-1).all(axis=-1)
And you would obtain the final result with:
dual = patches.all(axis=-1).all(axis=-1)
patches[dual==False] = 0
patches[dual] = 1
It differs from the previous method in what happens at the boundaries though.
Here's an alternative method, using minimum_filter and maximum_filter from scipy.ndimage. (The description in the question is still too vague--for me, anyway--so this is based on the result shown in #OliverW.'s answer.)
In [138]: from scipy.ndimage import minimum_filter, maximum_filter
In [139]: data
Out[139]:
array([[1, 1, 0, 0, 0, 0, 1, 0],
[1, 1, 1, 0, 0, 1, 1, 0],
[1, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 1],
[1, 1, 0, 0, 0, 0, 1, 0],
[1, 1, 0, 0, 0, 0, 0, 0]])
In [140]: m = minimum_filter(data, size=(2,2), mode='constant', origin=(-1,-1))
In [141]: result = maximum_filter(m, size=(2,2), mode='constant')
In [142]: result
Out[142]:
array([[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0]])