Create tensors where all elements up to a given index are 1s, the rest are 0s - tensorflow

I have a placeholder lengths = tf.placeholder(tf.int32, [10]). Each of the 10 values assigned to this placeholder is <= 25. I now want to create a 2-dimensional tensor, called masks, of shape [10, 25], where each of the 10 vectors of length 25 has the first n elements set to 1 and the rest set to 0, with n being the corresponding value in lengths.
What is the easiest way to do this using TensorFlow's built-in methods?
For example:
lengths = [4, 6, 7, ...]
-> masks = [[1, 1, 1, 1, 0, 0, 0, 0, ..., 0],
[1, 1, 1, 1, 1, 1, 0, 0, ..., 0],
[1, 1, 1, 1, 1, 1, 1, 0, ..., 0],
...
]

You can reshape lengths to a (10, 1) tensor and compare it with a sequence of indices 0, 1, 2, ..., 24. Thanks to broadcasting, the comparison yields True where the index is smaller than the corresponding length and False otherwise; you can then cast the boolean result to 1s and 0s:
import tensorflow as tf

lengths = tf.constant([4, 6, 7])
n_features = 25

# Compare a row of indices (shape [25]) with a column of lengths (shape [3, 1]).
masks = tf.cast(tf.range(n_features) < tf.reshape(lengths, (-1, 1)), tf.int8)
with tf.Session() as sess:
    print(sess.run(masks))
#[[1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# [1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# [1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
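The same broadcasting comparison can be sketched in NumPy, which makes the intermediate shapes easy to inspect. This is only an illustration that assumes NumPy is available; the names lengths, n_features and masks mirror the answer above, while indices and column are introduced here. TensorFlow also provides tf.sequence_mask(lengths, maxlen), which builds this kind of mask directly (as a boolean tensor by default).

```python
import numpy as np

lengths = np.array([4, 6, 7])
n_features = 25

indices = np.arange(n_features)   # shape (25,)
column = lengths.reshape(-1, 1)   # shape (3, 1)

# Broadcasting (25,) against (3, 1) yields a (3, 25) boolean array.
masks = (indices < column).astype(np.int8)

print(masks.shape)
print(masks.sum(axis=1))  # the number of ones per row equals lengths
```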

Related

Change every n-th element of a row in a 2d numpy array depending on the row number

I have a 2d array:
H = 12
a = np.ones([H, H])
print(a.astype(int))
[[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]]
The goal is, for every row r, to replace every (r+1)-th element of that row (starting with the 0th) with 0.
Namely, in the 0th row replace every 'first' element (i.e. all of them) with 0, in the 1st row replace every 2nd element with 0, and so on.
It can trivially be done in a loop (the printed array is the desired output):
for i in np.arange(H):
    a[i, ::i+1] = 0
print(a.astype(int))
[[0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 1 0 1 0 1 0 1 0 1]
[0 1 1 0 1 1 0 1 1 0 1 1]
[0 1 1 1 0 1 1 1 0 1 1 1]
[0 1 1 1 1 0 1 1 1 1 0 1]
[0 1 1 1 1 1 0 1 1 1 1 1]
[0 1 1 1 1 1 1 0 1 1 1 1]
[0 1 1 1 1 1 1 1 0 1 1 1]
[0 1 1 1 1 1 1 1 1 0 1 1]
[0 1 1 1 1 1 1 1 1 1 0 1]
[0 1 1 1 1 1 1 1 1 1 1 0]
[0 1 1 1 1 1 1 1 1 1 1 1]]
Can I make use of the vectorisation power of numpy here and avoid looping? Or is it not possible?
You can use np.arange and broadcast a modulo of the array over itself:
import numpy as np
H = 12
a = np.arange(H)
((a % (a+1)[:, None]) != 0).astype('int')
Output
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
[0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
[0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
[0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
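The reason the modulo works: the slice a[i, ::i+1] touches exactly the columns j with j % (i+1) == 0, so the zero pattern is a divisibility test between the column index and the row index plus one. The sketch below checks the broadcast version against the loop from the question; the names looped and vectorised are chosen here for illustration.

```python
import numpy as np

H = 12

# Reference result from the explicit loop in the question.
looped = np.ones((H, H), dtype=int)
for i in range(H):
    looped[i, ::i + 1] = 0

# Column index j (row vector) modulo row index + 1 (column vector).
j = np.arange(H)
vectorised = (j % (j + 1)[:, None] != 0).astype(int)

print(np.array_equal(looped, vectorised))
```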

Efficient way to compare values between two cells and assign a value based on a condition in NumPy

The objective is to count how often two nodes have the same value.
Say, for example, we have a vector
pd.DataFrame([0,4,1,1,1],index=['A','B','C','D','E'])
as below
0
A 0
B 4
C 1
D 1
E 1
The element Nij is equal to 1 if nodes i and j have the same value and is equal to zero otherwise.
N is then
A B C D E
A 1 0 0 0 0
B 0 1 0 0 0
C 0 0 1 1 1
D 0 0 1 1 1
E 0 0 1 1 1
This simple example can be extended to 2D. For example, take an array of shape (4,5):
A B C D E
0 0 0 0 0 0
1 0 4 1 1 1
2 0 1 1 2 2
3 0 3 2 2 2
Similarly, we go row by row and set the element Nij to 1 if nodes i and j have the same value in that row and to zero otherwise. The per-row results are then summed cell by cell.
The frequency is then equal to
A B C D E
A 4.0 1.0 1.0 1.0 1.0
B 1.0 4.0 2.0 1.0 1.0
C 1.0 2.0 4.0 3.0 3.0
D 1.0 1.0 3.0 4.0 4.0
E 1.0 1.0 3.0 4.0 4.0
Based on this, the following code is proposed. However, the current implementation uses 3 for-loops and an if-else statement.
I am curious whether the code below can be improved further, or whether there is a built-in method in Pandas or NumPy that achieves the same objective.
import numpy as np
arr=[[ 0,0,0,0,0],
[0,4,1,1,1],
[0,1,1,2,2],
[0,3,2,2,2]]
arr=np.array(arr)
# C=arr
# nrows
npart = len(arr[:,0])
# Ncolumns
m = len(arr[0,:])
X = np.zeros(shape =(m,m), dtype = np.double)
for i in range(npart):
    for k in range(m):
        for p in range(m):
            # Check whether the pair have similar value or not
            if arr[i,k] == arr[i,p]:
                X[k,p] = X[k,p] + 1
            else:
                X[k,p] = X[k,p] + 0
Output:
4.00000,1.00000,1.00000,1.00000,1.00000
1.00000,4.00000,2.00000,1.00000,1.00000
1.00000,2.00000,4.00000,3.00000,3.00000
1.00000,1.00000,3.00000,4.00000,4.00000
1.00000,1.00000,3.00000,4.00000,4.00000
P.S. The index A,B,C,D,E and the use of pandas are only for clarification.
With numpy, you can use broadcasting:
1D
a = np.array([0,4,1,1,1])
(a==a[:, None])*1
output:
array([[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 1]])
2D
a = np.array([[0, 0, 0, 0, 0],
[0, 4, 1, 1, 1],
[0, 1, 1, 2, 2],
[0, 3, 2, 2, 2]])
(a.T == a.T[:,None]).sum(2)
output:
array([[4, 1, 1, 1, 1],
[1, 4, 2, 1, 1],
[1, 2, 4, 3, 3],
[1, 1, 3, 4, 4],
[1, 1, 3, 4, 4]])
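To see what the 2D one-liner does, it can help to look at the intermediate shapes: a.T has shape (5, 4), a.T[:, None] has shape (5, 1, 4), and comparing them broadcasts to (5, 5, 4), one boolean per node pair and row. The following sketch, with illustrative names cols, pairwise, counts and expected, checks the result against the explicit loops.

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0],
              [0, 4, 1, 1, 1],
              [0, 1, 1, 2, 2],
              [0, 3, 2, 2, 2]])

cols = a.T                          # shape (5, 4): one row per node A..E
pairwise = cols == cols[:, None]    # (5, 4) vs (5, 1, 4) -> (5, 5, 4)
counts = pairwise.sum(axis=2)       # number of rows in which each pair matches

# Same computation with explicit loops, for comparison.
m = a.shape[1]
expected = np.zeros((m, m), dtype=int)
for row in a:
    for k in range(m):
        for p in range(m):
            expected[k, p] += int(row[k] == row[p])

print(pairwise.shape)
print(np.array_equal(counts, expected))
```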

How to choose 2D diagonals of a 3D NumPy array

I define an array as :
XRN =np.array([[[0,1,0,1,0,1,0,1,0,1],
[0,1,1,0,0,1,0,1,0,1],
[0,1,0,0,1,1,0,1,0,1],
[0,1,0,1,0,0,1,1,0,1],],
[[0,1,0,1,0,1,1,0,0,1],
[0,1,0,1,0,1,0,1,1,0],
[1,1,1,0,0,0,0,1,0,1],
[0,1,0,1,0,0,1,1,0,1],],
[[0,1,0,1,0,1,1,1,0,0],
[0,1,0,1,1,1,0,1,0,0],
[0,1,0,1,1,0,0,1,0,1],
[0,1,0,1,0,0,1,1,0,1],]])
print(XRN.shape,XRN)
XRN_LEN = XRN.shape[1]
I can obtain the sum of inner matrix with :
XRN_UP = XRN.sum(axis=1)
print("XRN_UP",XRN_UP.shape,XRN_UP)
XRN_UP (3, 10) [[0 4 1 2 1 3 1 4 0 4]
[1 4 1 3 0 2 2 3 1 3]
[0 4 0 4 2 2 2 4 0 2]]
I want to get the sum of all diagonals with the same shape (3,10)
I tested the code :
RIGHT = [XRN.diagonal(i,axis1=0,axis2=1).sum(axis=1) for i in range(XRN_LEN)]
np_RIGHT = np.array(RIGHT)
print("np_RIGHT=",np_RIGHT.shape,np_RIGHT)
but got
np_RIGHT= (4, 10) [[0 3 0 3 1 2 0 3 1 2]
[1 3 2 1 0 1 1 3 0 3]
[0 2 0 1 1 1 1 2 0 2]
[0 1 0 1 0 0 1 1 0 1]]
I checked all values for axis1 and axis2 but never got the shape (3,10). How can I do this?
axis1 axis2 shape
0 1 (4,10)
0 2 (4,4)
1 0 (4,10)
1 2 (4,3)
2 0 (4,4)
2 1 (4,3)
If I understand correctly, you want to sum all possible diagonals on the three elements separately. If that's the case, then you must apply np.diagonal on axis1=1 and axis2=2. This way, you end up with 10 diagonals per element which you sum down to 10 values per element. There are 3 elements, so the resulting shape is (10, 3):
>>> np.array([XRN.diagonal(i, 1, 2).sum(1) for i in range(XRN.shape[-1])])
array([[2, 3, 2],
[2, 1, 2],
[1, 1, 2],
[3, 2, 3],
[2, 2, 2],
[2, 2, 2],
[2, 3, 3],
[2, 2, 2],
[1, 0, 0],
[1, 1, 0]])
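Since the question asks for shape (3, 10), one option is simply to transpose this result. The sketch below does that and cross-checks it with np.trace, which sums a diagonal over a chosen pair of axes. It uses random stand-in data with the same shape as XRN (an assumption made here to keep the example short); the names X, by_diagonal and by_trace are illustrative.

```python
import numpy as np

# Stand-in data with the same shape as XRN in the question (an assumption).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(3, 4, 10))

n_offsets = X.shape[-1]

# One row per diagonal offset, then transpose to get one row per inner matrix.
by_diagonal = np.array(
    [X.diagonal(i, 1, 2).sum(1) for i in range(n_offsets)]
).T

# np.trace sums the diagonal at a given offset over axes 1 and 2.
by_trace = np.stack(
    [np.trace(X, offset=i, axis1=1, axis2=2) for i in range(n_offsets)],
    axis=1,
)

print(by_diagonal.shape)
print(np.array_equal(by_diagonal, by_trace))
```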

Broadcasting multi-dimensional array indices of the same shape

I have a mask array which represents a 2-dimensional binary image. Let's say it's simply:
mask = np.zeros((9, 9), dtype=np.uint8)
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# ------+-------+------
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# ------+-------+------
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
Suppose I want to flip the elements in the middle left ninth:
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# ------+-------+------
# 1 1 1 | 0 0 0 | 0 0 0
# 1 1 1 | 0 0 0 | 0 0 0
# 1 1 1 | 0 0 0 | 0 0 0
# ------+-------+------
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
My incorrect approach was something like this:
x = np.arange(mask.shape[0])
y = np.arange(mask.shape[1])
mask[np.logical_and(y >= 3, y < 6), x < 3] = 1
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# ------+-------+------
# 1 0 0 | 0 0 0 | 0 0 0
# 0 1 0 | 0 0 0 | 0 0 0
# 0 0 1 | 0 0 0 | 0 0 0
# ------+-------+------
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
# 0 0 0 | 0 0 0 | 0 0 0
(This is a simplification of the constraints I'm really dealing with, which would not be easily expressed as something like mask[:3,3:6] = 1 as in this case. Consider the constraints arbitrary, like x % 2 == 0 && y % 3 == 0 if you will.)
Numpy's behavior when the two index arrays are the same shape is to take them pairwise, which ends up only selecting the 3 elements above, rather than 9 I would like.
How would I update the right elements with constraints that apply to different axes? Given that the constraints are independent, can I do this by only evaluating my constraints N+M times, rather than N*M?
You can't broadcast the boolean arrays, but you can construct the equivalent numeric indices with ix_:
In [330]: np.ix_((y>=3)&(y<6), x<3)
Out[330]:
(array([[3],
[4],
[5]]), array([[0, 1, 2]]))
Applying it:
In [331]: arr = np.zeros((9,9),int)
In [332]: arr[_330] = 1
In [333]: arr
Out[333]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]])
Attempting to broadcast the booleans directly raises an error (too many indices):
arr[((y>=3)&(y<6))[:,None], x<3]
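For the arbitrary per-axis constraints mentioned in the question (for example x % 2 == 0 and y % 3 == 0), a sketch along the following lines evaluates each constraint once per axis (N+M evaluations) and then combines them, either with np.ix_ or by broadcasting the two boolean vectors into a full 2-D boolean mask. The names mask_ix, mask_and, row_ok and col_ok are illustrative.

```python
import numpy as np

mask_ix = np.zeros((9, 9), dtype=np.uint8)
mask_and = np.zeros((9, 9), dtype=np.uint8)

y = np.arange(mask_ix.shape[0])   # row indices
x = np.arange(mask_ix.shape[1])   # column indices

row_ok = y % 3 == 0   # one evaluation per row
col_ok = x % 2 == 0   # one evaluation per column

# Option 1: np.ix_ turns the two boolean vectors into an open mesh of indices.
mask_ix[np.ix_(row_ok, col_ok)] = 1

# Option 2: broadcast the vectors into a full (9, 9) boolean mask.
mask_and[row_ok[:, None] & col_ok] = 1

print(mask_ix)
print(np.array_equal(mask_ix, mask_and))
```

The second option works because a single boolean array with the same shape as the target is a valid index, whereas two separate boolean arrays are converted to integer indices and paired element by element.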
Per your comment, let's try this fancier example:
mask = np.zeros((90,90), dtype=np.uint8)
# criteria
def f(x,y): return ((x-20)**2 < 50) & ((y-20)**2 < 50)
# ranges
x,y = np.arange(90), np.arange(90)
# meshgrid
xx,yy = np.meshgrid(x,y)
zz = f(xx,yy)
# mask
mask[zz] = 1
plt.imshow(mask, cmap='gray')  # assumes: import matplotlib.pyplot as plt
Output:

What is a good way to generate a "symmetric ladder" or "adjacent" matrix using TensorFlow?

(Update: I forgot to say that the input is batched.) Given a bool array, e.g. [[false, false, false, true, false, false, true, false, false], [false, true, false, false, false, false, true, false, false]], in which each "true" marks the boundary between separate sequences, I want to generate an adjacency matrix denoting the different groups separated by the boundaries. What is a good way to generate the following "symmetric ladder" matrix using TensorFlow?
[[
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]
]
[
[1 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
]]
Update Jun 15 2018:
Actually, I have made some progress on this problem: if I can convert the input sequence from [false, false, false, true, false, false, true, false, false] to [1, 1, 1, 0, 2, 2, 0, 3, 3], I can get a partial result using the following TensorFlow code. But I am not sure whether there is a vectorized operation that can convert [false, false, false, true, false, false, true, false, false] to [1, 1, 1, 0, 2, 2, 0, 3, 3].
import tensorflow as tf
sess = tf.Session()
x = tf.constant([1, 1, 1, 0, 2, 2, 0, 3, 3], shape=(9, 1), dtype=tf.int32)
y = tf.squeeze(tf.cast(tf.equal(tf.expand_dims(x, 1), x), tf.int32))
print(sess.run(y))
[[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 1 0 0 1 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 1 0 0 1 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]]
Update finally:
I was greatly inspired by @Willem Van Onsem.
The batched version can be solved by slightly modifying @Willem Van Onsem's solution.
import tensorflow as tf
b = tf.constant([[False, False, False, True, False, False, True, False, False], [False, True, False, False, False, False, False, False, False]], shape=(2, 9, 1), dtype=tf.int32)
x = (1 + tf.cumsum(tf.cast(b, tf.int32), axis=1)) * (1-b)
x = tf.cast(tf.equal(x, tf.transpose(x, perm=[0,2,1])),tf.int32) - tf.transpose(b, perm=[0,2,1])*b
with tf.Session() as sess:
    print(sess.run(x))
But I am not sure is there a vector operation can convert [False, False, False, True, False, False, True, False, False] to [1, 1, 1, 0, 2, 2, 0, 3, 3]
There is, consider the following example:
b = tf.constant([False, False, False, True, False, False, True, False, False], shape=(9,), dtype=tf.int32)
then we can use tf.cumsum(..) to generate:
>>> print(sess.run(1+tf.cumsum(b)))
[1 1 1 2 2 2 3 3 3]
If we then multiply the values with the opposite of b, we get:
>>> print(sess.run((1+tf.cumsum(b))*(1-b)))
[1 1 1 0 2 2 0 3 3]
So we can store this expression in a variable, for example x:
x = (1+tf.cumsum(b))*(1-b)
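The cumulative-sum trick does not depend on TensorFlow. As a minimal sketch, the same two steps in NumPy look like this (group_id is a name introduced here for the intermediate result):

```python
import numpy as np

b = np.array(
    [False, False, False, True, False, False, True, False, False]
).astype(int)

group_id = 1 + np.cumsum(b)   # increases by one at every boundary
x = group_id * (1 - b)        # zero out the boundary positions themselves

print(group_id)
print(x)
```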
I want to generate an adjacent matrix denoting the different group separated by the boundary. What is a good way to generate following "symmetric ladder" matrix using Tensorflow?
If we follow your approach, we only have to remove the points where both lists are 0 at the same time. We can do this with:
tf.cast(tf.equal(x, tf.transpose(x)),tf.int32) - tf.transpose(b)*b
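So here we use your approach: we broadcast x against its transpose and check for element-wise equality, and then subtract the element-wise product of b and its transpose, which removes the entries where both positions are boundaries. Note that the transpose only produces a 9×9 matrix if x and b are column vectors of shape (9, 1), as in your snippet; for a tensor of shape (9,), tf.transpose changes nothing. This then yields: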
>>> print(sess.run(tf.cast(tf.equal(x, tf.transpose(x)),tf.int32) - tf.transpose(b)*b))
[[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]]
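Putting the pieces together for batched input, a NumPy sketch of the same idea could look as follows. It uses explicit axis insertion instead of transposes; the names same_group, both_boundary and ladder are illustrative, and the input is the batched example from the question written as 0/1 integers.

```python
import numpy as np

b = np.array([[0, 0, 0, 1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 1, 0, 0]])        # shape (batch, n)

# Group labels per position, with 0 at the boundary positions.
x = (1 + np.cumsum(b, axis=1)) * (1 - b)

same_group = x[:, :, None] == x[:, None, :]        # shape (batch, n, n)
both_boundary = b[:, :, None] * b[:, None, :]      # 1 where both are boundaries
ladder = same_group.astype(int) - both_boundary

print(ladder[0])
```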