turn around sparse matrix - numpy

I have a sparse matrix like this:
>>> import numpy as np
>>> from scipy.sparse import *
>>> A = csr_matrix(np.identity(3))
>>> print(A)
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
For better understanding, A looks like this:
>>> print(A.todense())
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]]
And I would like to have an operator (let us call it op1(n)) that does this:
>>> A.op1(1)
[[ 0. 1. 0.]
[ 0. 0. 1.]
[ 1. 0. 0.]]
That is, it makes the last n columns the first n ones, so
>>> A == A.op1(3)
True
Is there a built-in solution (EDIT:) that returns a sparse matrix again?
The solution with roll:
X = np.roll(X.todense(),-tau, axis = 0)
print X.__class__
returns
<class 'numpy.matrixlib.defmatrix.matrix'>

scipy.sparse doesn't have roll, but you can simulate it with hstack:
from scipy.sparse import *
A = eye(3, 3, format='csr')
hstack((A[:, 1:], A[:, :1]), format='csr') # roll left
hstack((A[:, -1:], A[:, :-1]), format='csr') # roll right
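For an arbitrary shift n, the hstack idea above could be wrapped in a small helper. This is only a sketch: roll_columns is a hypothetical name, not part of scipy.sparse, and it converts to CSC first on the assumption that column slicing is cheaper in that format.

```python
import numpy as np
from scipy import sparse


def roll_columns(mat, n):
    """Cyclically shift the columns of a sparse matrix right by n.

    Hypothetical helper, not a scipy.sparse function.
    """
    n_cols = mat.shape[1]
    if n_cols == 0:
        return mat.copy()
    n %= n_cols  # Python's % is non-negative, so negative n rolls left
    if n == 0:
        return mat.copy()
    mat = mat.tocsc()
    # The last n columns move to the front; the rest follow.
    return sparse.hstack((mat[:, -n:], mat[:, :-n]), format="csr")


A = sparse.identity(3, format="csr")
rolled = roll_columns(A, 1)
print(rolled.toarray())
```

The result stays sparse, and a shift by the full width (n = 3 here) gives back a copy of A.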

>>> a = np.identity(3)
>>> a
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
>>> np.roll(a, -1, axis=0)
array([[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.]])
>>> a == np.roll(a, 3, axis=0)
array([[ True, True, True],
[ True, True, True],
[ True, True, True]], dtype=bool)

Related

what does myarray[0][:,0] mean

This is an excerpt from some documentation:
lambda ind, r: 1.0 + any(np.array(points_2d)[ind][:,0] == 0.0)
But I don't understand np.array(points_2d)[ind][:,0].
It seems equivalent to myarray[0][:,0], which doesn't make sense to me.
Can anyone help to explain?
With points_2d from earlier in the doc:
In [38]: points_2d = [(0., 0.), (0., 1.), (1., 1.), (1., 0.),
...: (0.5, 0.25), (0.5, 0.75), (0.25, 0.5), (0.75, 0.5)]
In [39]: np.array(points_2d)
Out[39]:
array([[0. , 0. ],
[0. , 1. ],
[1. , 1. ],
[1. , 0. ],
[0.5 , 0.25],
[0.5 , 0.75],
[0.25, 0.5 ],
[0.75, 0.5 ]])
Indexing with a scalar gives a 1d array, which can't be further indexed with [:,0].
In [40]: np.array(points_2d)[0]
Out[40]: array([0., 0.])
But with a list or slice:
In [41]: np.array(points_2d)[[0,1,2]]
Out[41]:
array([[0., 0.],
[0., 1.],
[1., 1.]])
In [42]: np.array(points_2d)[[0,1,2]][:,0]
Out[42]: array([0., 0., 1.])
So this selects the first column of a subset of rows.
In [43]: np.array(points_2d)[[0,1,2]][:,0]==0.0
Out[43]: array([ True, True, False])
In [44]: any(np.array(points_2d)[[0,1,2]][:,0]==0.0)
Out[44]: True
I think they could have used:
In [45]: np.array(points_2d)[[0,1,2],0]
Out[45]: array([0., 0., 1.])
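To check that the chained and the combined forms agree, here is a small sketch. The value of ind is an assumption for illustration (the documentation passes it in as a lambda argument), and only the first four points are used.

```python
import numpy as np

points = np.array([(0., 0.), (0., 1.), (1., 1.), (1., 0.)])
ind = [0, 1, 2]  # assumed example value; any list of row indices works

chained = points[ind][:, 0]   # pick rows first, then column 0
combined = points[ind, 0]     # same selection in one indexing step
has_zero_x = bool(np.any(combined == 0.0))

# A scalar index yields a 1d array, so [:, 0] has one index too many.
try:
    points[0][:, 0]
    scalar_error = None
except IndexError as exc:
    scalar_error = exc
```

So the lambda only makes sense when ind is a list, array or slice of row indices, not a single integer.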

`sklearn.preprocessing.normalize` (L2 norm) equivalent in Tensorflow or TFX

How can I apply L2 normalization in TensorFlow? I'm looking for the equivalent of sklearn.preprocessing.normalize in TensorFlow or in TFX.
You can use tensorflow.keras.utils.normalize for L2 normalization as follows.
Using sklearn.preprocessing.normalize:
import sklearn.preprocessing
X = [[ 1., -1., 2.],
[ 2., 0., 0.],
[ 0., 1., -1.]]
X_normalized = sklearn.preprocessing.normalize(X, norm='l2')
X_normalized
Output:
array([[ 0.40824829, -0.40824829, 0.81649658],
[ 1. , 0. , 0. ],
[ 0. , 0.70710678, -0.70710678]])
Using tf.keras.utils.normalize gives the same output as above:
import tensorflow as tf
X = [[ 1., -1., 2.],
[ 2., 0., 0.],
[ 0., 1., -1.]]
tf.keras.utils.normalize(
X, order=2
)
Output:
array([[ 0.40824829, -0.40824829, 0.81649658],
[ 1. , 0. , 0. ],
[ 0. , 0.70710678, -0.70710678]])
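For reference, both calls divide each row by its Euclidean norm. A plain NumPy sketch of that computation follows; l2_normalize_rows is a hypothetical helper name, and leaving all-zero rows unchanged is a choice made for this sketch.

```python
import numpy as np


def l2_normalize_rows(X):
    """Divide each row by its L2 norm (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero; zero rows stay zero
    return X / norms


X = [[1., -1., 2.],
     [2., 0., 0.],
     [0., 1., -1.]]
X_normalized = l2_normalize_rows(X)
```

Each non-zero row of X_normalized then has unit length.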

Removing certain rows from tensor in tensorflow without using tf.RaggedTensor

Given tensor data
[[[ 0., 0.],
[ 1., 1.],
[-1., -1.]],
[[-1., -1.],
[ 4., 4.],
[ 5., 5.]]]
I want to remove the [-1, -1] rows and get
[[[ 0., 0.],
[ 1., 1.]],
[[ 4., 4.],
[ 5., 5.]]]
How can I get the above without using the ragged tensor feature in TensorFlow?
You can try this:
import tensorflow as tf
x = tf.constant(
[[[ 0., 0.],
[ 1., 1.],
[-1., -1.]],
[[-1., -1.],
[ 4., 4.],
[ 5., 5.]]])
# Element-wise comparison against the marker row [-1, -1].
mask = tf.math.not_equal(x, tf.constant([-1., -1.]))
result = tf.boolean_mask(x, mask)
shape = tf.shape(x)
result = tf.reshape(result, (shape[0], -1, shape[2]))
You could do it like this:
import tensorflow as tf
import numpy as np
data = [[[ 0., 0.],
[ 1., 1.],
[-1., -1.]],
[[-1., -1.],
[ 4., 4.],
[ 5., 5.]]]
data = tf.constant(data)
indices = tf.math.not_equal(data, tf.constant([-1., -1.]))
res = data[indices]
shape = tf.shape(data)
total = tf.reduce_sum(
tf.cast(tf.math.logical_and(indices[:, :, 0], indices[:, :, 1])[0], tf.int32))
res = tf.reshape(res, (shape[0], total, shape[-1]))
with tf.Session() as sess:
print(sess.run(res))
# [[[0. 0.]
# [1. 1.]]
# [[4. 4.]
# [5. 5.]]]
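Both answers compare element by element, so a row such as [-1., 3.] would lose only one of its two values and the reshape would fail. A row-wise mask avoids this. The sketch below uses NumPy to show the idea; it assumes every batch entry contains the same number of [-1, -1] rows, since otherwise the result is not rectangular. The same approach should carry over to TensorFlow with tf.reduce_all and tf.boolean_mask.

```python
import numpy as np

data = np.array([[[0., 0.],
                  [1., 1.],
                  [-1., -1.]],
                 [[-1., -1.],
                  [4., 4.],
                  [5., 5.]]])

# True for rows to keep: a row is dropped only if all its entries are -1.
keep = ~np.all(data == -1.0, axis=-1)          # shape (2, 3)

# Assumption: each batch entry keeps the same number of rows.
counts = keep.sum(axis=1)
if not (counts == counts[0]).all():
    raise ValueError("batch entries keep different numbers of rows")

result = data[keep].reshape(data.shape[0], -1, data.shape[-1])
print(result)
```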

numpy where condition output explained

I'm trying to understand the numpy where condition.
>>> import numpy as np
>>> x = np.arange(9.).reshape(3, 3)
>>> x
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
In the above case, what does the output actually mean? I can see array([0, 1, 2]) in the input, but what is array([2, 2, 2])?
The first array holds the row indices and the second array holds the corresponding column indices.
If the array is following:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
Then the following
(array([2, 2, 2]), array([0, 1, 2]))
can be interpreted as
x[2, 0] => 6
x[2, 1] => 7
x[2, 2] => 8
You might also want to see where those values sit in your array. In that case, you can return the array's value where the condition is True and a placeholder value where it is False. In the example below, the value of x is returned wherever x > 5, and -1 everywhere else.
x = np.arange(9.).reshape(3, 3)
np.where(x>5, x, -1)
array([[-1., -1., -1.],
[-1., -1., -1.],
[ 6., 7., 8.]])
Three elements are found, located at (2, 0), (2, 1) and (2, 2).
By the way, try help(np.where); it will help you a lot.
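To make the pairing of the two index arrays explicit, here is a short sketch that zips them into coordinates and uses them to read the matching values back out of x. The variable names are chosen for illustration only.

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)
rows, cols = np.where(x > 5)

# Pair up row and column indices: one (row, col) tuple per match.
coords = list(zip(rows.tolist(), cols.tolist()))   # [(2, 0), (2, 1), (2, 2)]

# The same index arrays select the matching values.
values = x[rows, cols]                             # array([6., 7., 8.])

# np.argwhere returns the coordinates directly, one row per match.
coords_array = np.argwhere(x > 5)
```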

Index variable range in numpy

I have a numpy zero matrix A of the shape (2, 5).
A = [[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]]
I have another array seq of size 2, which matches the length of the first axis of A.
seq = [2, 3]
I want to create another matrix B which looks like this:
B = [[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]]
B is constructed by setting the first seq[i] elements in the ith row of A to 1.
This is a toy example. A and seq can be large so efficiency is required. I would be extra thankful if someone knows how to do this in tensorflow.
You can do this in TensorFlow (and with some analogous code in NumPy) as follows:
seq = [2, 3]
b = tf.expand_dims(tf.range(5), 0) # A 1 x 5 matrix.
seq_matrix = tf.expand_dims(seq, 1) # A 2 x 1 matrix.
b_bool = tf.greater(seq_matrix, b) # A 2 x 5 bool matrix.
B = tf.cast(b_bool, tf.int32) # A 2 x 5 int matrix (tf.to_int32 is deprecated).
Example output:
In [7]: b = tf.expand_dims(tf.range(5), 0)
[[0 1 2 3 4]]
In [21]: b_bool = tf.greater(seq_matrix, b)
In [22]: op = sess.run(b_bool)
In [23]: print(op)
[[ True True False False False]
[ True True True False False]]
In [24]: bint = tf.to_int32(b_bool)
In [25]: op = sess.run(bint)
In [26]: print(op)
[[1 1 0 0 0]
[1 1 1 0 0]]
This is @mrry's solution, expressed a little differently:
In [667]: [[2],[3]]>np.arange(5)
Out[667]:
array([[ True, True, False, False, False],
[ True, True, True, False, False]], dtype=bool)
In [668]: ([[2],[3]]>np.arange(5)).astype(int)
Out[668]:
array([[1, 1, 0, 0, 0],
[1, 1, 1, 0, 0]])
The idea is to compare [2,3] with [0,1,2,3,4] in an 'outer' broadcasting sense. The result is boolean which can be easily changed to 0/1 integers.
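Written with explicit arrays instead of nested lists, the broadcasting step might look like the sketch below. The names width and col_index are introduced here for illustration.

```python
import numpy as np

seq = np.array([2, 3])
width = 5  # number of columns in A (assumed from the question)

col_index = np.arange(width)        # shape (5,)
# seq[:, None] has shape (2, 1); comparing it with shape (5,) broadcasts to (2, 5).
B = (seq[:, None] > col_index).astype(float)
print(B)
```

No Python loop and no intermediate larger than the (2, 5) result is needed, which is why this scales well to large A and seq.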
Another approach would be to use cumsum (or another ufunc.accumulate function):
In [669]: A=np.zeros((2,5))
In [670]: A[range(2),[2,3]]=1
In [671]: A
Out[671]:
array([[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 1., 0.]])
In [672]: A.cumsum(axis=1)
Out[672]:
array([[ 0., 0., 1., 1., 1.],
[ 0., 0., 0., 1., 1.]])
In [673]: 1-A.cumsum(axis=1)
Out[673]:
array([[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]])
Or a variation starting with 1's:
In [681]: A=np.ones((2,5))
In [682]: A[range(2),[2,3]]=0
In [683]: A
Out[683]:
array([[ 1., 1., 0., 1., 1.],
[ 1., 1., 1., 0., 1.]])
In [684]: np.minimum.accumulate(A,axis=1)
Out[684]:
array([[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]])
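One caveat with the two accumulate variants above: A[range(2), seq] raises an IndexError when some seq[i] equals the number of columns (a row that should be all ones). A possible workaround, sketched here with the hypothetical helper prefix_ones_cumsum, is to allocate one extra column and drop it after the cumsum.

```python
import numpy as np


def prefix_ones_cumsum(seq, width):
    """Row i gets seq[i] leading ones, using the cumsum idea (hypothetical helper)."""
    seq = np.asarray(seq)
    # One extra column so that seq[i] == width is still a valid index.
    marks = np.zeros((len(seq), width + 1))
    marks[np.arange(len(seq)), seq] = 1
    # Cumsum is 0 before the mark and 1 from the mark onwards.
    return 1 - marks.cumsum(axis=1)[:, :width]


B = prefix_ones_cumsum([2, 3, 5, 0], 5)
print(B)
```

This assumes 0 <= seq[i] <= width for every row.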