I have a piece of code in Matlab that I want to convert into Python/numpy.
I have a matrix ind which has the dimensions (32768, 24). I have another matrix X which has the dimensions (98304, 6). When I perform the operation
result = X(ind)
the shape of result is (32768, 24),
but in numpy when I perform the same operation
result = X[ind]
I get the shape of the result matrix as (32768, 24, 6).
I would greatly appreciate it if someone could explain why I get these two different results and how I can fix this. I would like to get the shape (32768, 24) for the result matrix in numpy as well.
In Octave, if I define:
>> X=diag([1,2,3,4])
X =
Diagonal Matrix
1 0 0 0
0 2 0 0
0 0 3 0
0 0 0 4
>> idx = [6 7;10 11]
idx =
6 7
10 11
then the indexing selects a block:
>> X(idx)
ans =
2 0
0 3
The numpy equivalent is
In [312]: X=np.diag([1,2,3,4])
In [313]: X
Out[313]:
array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
In [314]: idx = np.array([[5,6],[9,10]]) # shifted for 0 base indexing
In [315]: np.unravel_index(idx,(4,4)) # raveled to unraveled conversion
Out[315]:
(array([[1, 1],
[2, 2]]),
array([[1, 2],
[1, 2]]))
In [316]: X[_] # this indexes with a tuple of arrays
Out[316]:
array([[2, 0],
[0, 3]])
Another way is to index the flattened view of the array:
In [318]: X.flat[idx]
Out[318]:
array([[2, 0],
[0, 3]])
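One caveat when porting the original X(ind): MATLAB and Octave linear indices are 1-based and run down the columns first (column-major), whereas X.flat and np.unravel_index default to row-major order. The diagonal example above is symmetric, so it hides the difference. The following is a minimal sketch, assuming ind holds MATLAB-style linear indices; the small 3x4 matrix and the name ind_matlab are made up for illustration.

```python
import numpy as np

# Small non-symmetric stand-in for X (the real one is 98304 x 6).
X = np.arange(1, 13).reshape(3, 4)

# Hypothetical MATLAB-style linear indices: 1-based, column-major.
ind_matlab = np.array([[1, 2], [4, 12]])
ind0 = ind_matlab - 1  # shift to 0-based

# Option 1: flatten in column-major (Fortran) order, then index.
result_a = X.ravel(order="F")[ind0]

# Option 2: convert to (row, col) pairs using column-major order.
rows, cols = np.unravel_index(ind0, X.shape, order="F")
result_b = X[rows, cols]

# For comparison: a bare X[index_array] picks whole rows.
row_pick = X[np.array([[0, 1], [2, 0]])]

print(result_a)        # same shape as ind_matlab
print(row_pick.shape)  # index shape + X.shape[1:]
```

Plain X[ind] in numpy treats ind as row numbers and returns whole rows, so the result has shape ind.shape + X.shape[1:]. That is (2, 2, 4) in this sketch and (32768, 24, 6) in the question.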
I have the following numpy array (as an example):
my_array = np.array([[3, 7, 0],
[20, 4, 0],
[7, 54, 0]])
I want to replace the 0's in the 3rd column of each row with a value of 5, but only if the first element of that row is odd.
So the expected outcome would be:
my_array = np.array([[3, 7, 5],
[20, 4, 0],
[7, 54, 5]])
I tried numpy.where and numpy.place, but couldn't get the expected results.
Is there an elegant way to do this with numpy functions?
You can do this with boolean indexing:
my_array[my_array[:, 0] % 2 != 0, 2] = 5
# my_array[:, 0] % 2 != 0 is a boolean mask of the rows to modify --> [ True False True]
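Note that the line above assigns 5 to the third column of every row whose first element is odd, whether or not the current value there is 0. If the zero check matters, the two conditions can be combined. A small sketch, where the mask names odd_first and zero_third and the extra fourth row are only illustrative:

```python
import numpy as np

my_array = np.array([[3, 7, 0],
                     [20, 4, 0],
                     [7, 54, 0],
                     [9, 1, 8]])   # extra row: odd first element, non-zero third

odd_first = my_array[:, 0] % 2 != 0   # rows whose first element is odd
zero_third = my_array[:, 2] == 0      # rows whose third element is 0

# Assign only where both conditions hold.
my_array[odd_first & zero_third, 2] = 5
print(my_array)
```

The last row keeps its 8 because it fails the zero check, while the first and third rows are updated.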
We have a Tensor of unknown length N, containing some int32 values.
How can we generate another Tensor that contains N ranges concatenated together, each one running from 0 up to (but not including) the corresponding int32 value from the original tensor?
For example, if we have [4, 4, 5, 3, 1], the output Tensor should look like [0 1 2 3 0 1 2 3 0 1 2 3 4 0 1 2 0].
Thank you for any advice.
You can make this work with a tensor as input by using a tf.RaggedTensor which can contain dimensions of non-uniform length.
# Or any other N length tensor
tf_counts = tf.convert_to_tensor([4, 4, 5, 3, 1])
tf.print(tf_counts)
# [4 4 5 3 1]
# Create a ragged tensor, each row is a sequence of length tf_counts[i]
tf_ragged = tf.ragged.range(tf_counts)
tf.print(tf_ragged)
# <tf.RaggedTensor [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2], [0]]>
# Read values
tf.print(tf_ragged.flat_values, summarize=-1)
# [0 1 2 3 0 1 2 3 0 1 2 3 4 0 1 2 0]
For this 2-dimensional case the ragged tensor tf_ragged is a "matrix" of rows with varying length:
[[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3, 4],
[0, 1, 2],
[0]]
Check tf.ragged.range for more options on how to create the sequences on each row: starts for inclusive lower limits, limits for exclusive upper limit, deltas for increment. Each may vary for each sequence.
Also mind that the dtype of the tf_counts tensor will propagate to the final values.
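If you want to have everything as a TensorFlow object, then use tf.range() along with tf.concat(). (The .eval() calls below assume TensorFlow 1.x with a default session, such as tf.InteractiveSession(); in TensorFlow 2.x eager mode, use .numpy() instead.)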
In [88]: vals = [4, 4, 5, 3, 1]
In [89]: tf_range = [tf.range(0, limit=item, dtype=tf.int32) for item in vals]
# concat all `tf_range` objects into a single tensor
In [90]: concatenated_tensor = tf.concat(tf_range, 0)
In [91]: concatenated_tensor.eval()
Out[91]: array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 0], dtype=int32)
There are other approaches as well. Here, I assume that you want a constant tensor, but you can construct any tensor once you have the full range list.
First, we construct the full range list using a list comprehension, make a flat list out of it, and then construct a tensor.
In [78]: from itertools import chain
In [79]: vals = [4, 4, 5, 3, 1]
In [80]: range_list = list(chain(*[range(item) for item in vals]))
In [81]: range_list
Out[81]: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 0]
In [82]: const_tensor = tf.constant(range_list, dtype=tf.int32)
In [83]: const_tensor.eval()
Out[83]: array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 0], dtype=int32)
On the other hand, we can also use tf.range(), but it returns a NumPy array when you evaluate it. So you'd have to convert the arrays to lists, flatten them into a single list, and finally construct the tensor, as in the following example.
list_of_arr = [tf.range(0, limit=item, dtype=tf.int32).eval() for item in vals]
range_list = list(chain(*[arr.tolist() for arr in list_of_arr]))
# output
[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 0]
const_tensor = tf.constant(range_list, dtype=tf.int32)
const_tensor.eval()
#output tensor as numpy array
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 0], dtype=int32)
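If building the per-element ranges in a Python loop becomes a bottleneck, the same flat sequence can also be produced without one. The sketch below uses NumPy rather than TensorFlow, on the assumption that this is acceptable here; the resulting array could then be passed to tf.constant as above. The helper name concat_ranges is hypothetical.

```python
import numpy as np

def concat_ranges(counts):
    """Return range(c) for each c in counts, concatenated into one array."""
    counts = np.asarray(counts, dtype=np.int64)
    # Offset at which each individual range starts in the flat output.
    starts = np.cumsum(counts) - counts
    # A global counter minus each element's range start gives the local index.
    return np.arange(counts.sum()) - np.repeat(starts, counts)

range_array = concat_ranges([4, 4, 5, 3, 1])
print(range_array.tolist())
```

The idea is that position p in the output belongs to some range starting at offset s, so its value is p - s. Zero-length ranges contribute nothing, since np.repeat simply skips them.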
I have 3d numpy array of the following shape:
(3600L, 7200L, 3L)
If any element in any dimension is 0, how can I convert the elements in the same position in other two dimensions into 0?
If an element is 0, it is 0 in each of the dimensions. I'll illustrate with a small 2d array:
In [1240]: M=np.arange(9).reshape(3,3)
In [1241]: M
Out[1241]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [1242]: M[0,0]
Out[1242]: 0
One element is 0: the one at row 0, column 0. I can set the rest of those 2 dimensions (that row and that column) to 0 with:
In [1243]: M[0,:]=0
In [1244]: M[:,0]=0
In [1245]: M
Out[1245]:
array([[0, 0, 0],
[0, 4, 5],
[0, 7, 8]])
You can generalize this to 3d and larger arrays, as long as you know the coordinates of that element in all dimensions. With a 3d array
M[i,:,:]=0
actually sets all the values in a plane (2d) to 0. Similarly for M[:,j,:] and M[:,:,k].
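A small sketch of those three plane assignments, using a made-up 2x3x4 array of ones and arbitrary coordinates i, j, k:

```python
import numpy as np

M3 = np.ones((2, 3, 4), dtype=int)
i, j, k = 0, 1, 2   # hypothetical coordinates of the zero element

M3[i, :, :] = 0   # zero the whole plane at index i of axis 0
M3[:, j, :] = 0   # zero the whole plane at index j of axis 1
M3[:, :, k] = 0   # zero the whole plane at index k of axis 2

remaining = int(M3.sum())   # number of ones left untouched
print(remaining)
```

Only the elements that avoid all three planes survive, which is (2 - 1) * (3 - 1) * (4 - 1) entries in this example.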
np.where gives the coordinates that match some condition:
In [1248]: I=np.where(M==0)
In [1249]: M[I[0],:]=0
In [1250]: M[:,I[1]]=0
In [1251]: M
Out[1251]:
array([[0, 0, 0],
[0, 4, 5],
[0, 7, 8]])
In [1252]: I
Out[1252]: (array([0], dtype=int32), array([0], dtype=int32))
This works regardless of whether there are zero, one, or more matches. Here there is just one.
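The question can also be read differently: for an array of shape (3600, 7200, 3), if any of the 3 values at a given (row, column) position is 0, set all 3 values at that position to 0. Under that assumption, which the question does not confirm, a boolean mask along the last axis is enough. The small array A here is made up for illustration:

```python
import numpy as np

# Stand-in for the (3600, 7200, 3) array: 2 x 2 positions, 3 values each.
A = np.array([[[1, 2, 3], [4, 0, 6]],
              [[0, 0, 9], [7, 8, 9]]])

# True wherever at least one of the 3 values at that position is 0.
mask = (A == 0).any(axis=2)   # shape (2, 2)

# A 2d boolean mask on a 3d array selects whole length-3 vectors.
A[mask] = 0
print(A)
```

This avoids any Python loop, and the mask has one entry per (row, column) position rather than per element.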
Say I have a sparse matrix c and a numpy array a. I'd like to slice the entries of a based on some condition on c.
import scipy.sparse as sps
import numpy as np
x = np.array([1,0,0,1])
y = np.array([0,0,0,1])
c = sps.csc_matrix( (np.ones((4,)) , (x,y)), shape = (2,2),dtype=int)
a = np.array([ [1,2],[3,3]])
idx = c != 0
The variable idx is now a sparse matrix of booleans (it only stores the True entries). I would like to index the matrix a and select the entries of a at the positions where c != 0.
c[idx]
works fine but the following will not work:
a[idx]
I could use idx.todense(), but I am finding that these .todense() functions are taking up too much memory...
You could index a by getting the indices of the rows and cols where c is nonzero. You can do that by converting c to the COO matrix and using the row and col attributes.
Here's some data for an example:
In [41]: a
Out[41]:
array([[10, 11, 12, 13],
[14, 15, 16, 17],
[18, 19, 20, 21],
[22, 23, 24, 25]])
In [42]: c
Out[42]:
<4x4 sparse matrix of type '<type 'numpy.int64'>'
with 4 stored elements in Compressed Sparse Column format>
In [43]: c.A
Out[43]:
array([[0, 0, 1, 0],
[0, 0, 0, 0],
[1, 0, 1, 0],
[0, 0, 0, 1]])
Convert c to COO format:
In [45]: c2 = c.tocoo()
In [46]: c2
Out[46]:
<4x4 sparse matrix of type '<type 'numpy.int64'>'
with 4 stored elements in COOrdinate format>
In [47]: c2.row
Out[47]: array([2, 0, 2, 3], dtype=int32)
In [48]: c2.col
Out[48]: array([0, 2, 2, 3], dtype=int32)
Now index a with c2.row and c2.col to get the values from a at the positions where c is nonzero:
In [49]: a[c2.row, c2.col]
Out[49]: array([18, 12, 20, 25])
Note, however, that the order of the values is not the same as a[idx.A]:
In [50]: a[(c != 0).A]
Out[50]: array([12, 18, 20, 25])
By the way, this type of indexing of a is not "slicing". Slicing refers to indexing a with a "slice", created using the slice notation start:stop:step (or, less commonly, with a builtin slice object slice(start, stop, step)), e.g. a[1:3, :2]. What you are doing is sometimes called "advanced" indexing (e.g. http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).
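If the row-major order of a[idx.A] matters, the COO coordinates can be sorted before indexing, still without building a dense boolean array. A minimal sketch using the same example data; the names keep and order are illustrative, and dense_c exists only to build the sparse matrix and check the result:

```python
import numpy as np
import scipy.sparse as sps

a = np.arange(10, 26).reshape(4, 4)
dense_c = np.array([[0, 0, 1, 0],
                    [0, 0, 0, 0],
                    [1, 0, 1, 0],
                    [0, 0, 0, 1]])
c = sps.csc_matrix(dense_c)

coo = c.tocoo()
keep = coo.data != 0                 # ignore explicitly stored zeros, if any
rows, cols = coo.row[keep], coo.col[keep]

# np.lexsort uses the LAST key as the primary one: sort by row, then column.
order = np.lexsort((cols, rows))
values = a[rows[order], cols[order]]
print(values)
```

The memory cost is proportional to the number of stored entries in c, not to the full shape of the matrix.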
You have an original sparse matrix X:
>>print type(X)
>>print X.todense()
<class 'scipy.sparse.csr.csr_matrix'>
[[1,4,3]
[3,4,1]
[2,1,1]
[3,6,3]]
You have a second sparse matrix Z, which is derived from some rows of X (say the values are doubled so we can see the difference between the two matrices). In pseudo-code:
>>Z = X[[0,2,3]]
>>print Z.todense()
[[1,4,3]
[2,1,1]
[3,6,3]]
>>Z = Z*2
>>print Z.todense()
[[2, 8, 6]
[4, 2, 2]
[6, 12,6]]
What's the best way of retrieving the rows in Z using the ORIGINAL indices from X? For instance, in pseudo-code:
>>print Z[[0,3]]
[[2,8,6] # row 0 of Z, which was row **0** of X
[6,12,6]] # row 2 of Z, which was row **3** of X
That is, how can you retrieve rows from Z using indices that refer to the original row positions in the original matrix X? To do this, you can't modify X in any way (you can't add an index column to the matrix X), but there are no other limits.
If you have the original indices in an array i, and the values in i are in increasing order (as in your example), you can use numpy.searchsorted(i, [0, 3]) to find the indices in Z that correspond to indices [0, 3] in the original X. Here's a demonstration in an IPython session (it assumes csr_matrix has been imported from scipy.sparse, and array and searchsorted from numpy):
In [39]: X = csr_matrix([[1,4,3],[3,4,1],[2,1,1],[3,6,3]])
In [40]: X.todense()
Out[40]:
matrix([[1, 4, 3],
[3, 4, 1],
[2, 1, 1],
[3, 6, 3]])
In [41]: i = array([0, 2, 3])
In [42]: Z = 2 * X[i]
In [43]: Z.todense()
Out[43]:
matrix([[ 2, 8, 6],
[ 4, 2, 2],
[ 6, 12, 6]])
In [44]: Zsub = Z[searchsorted(i, [0, 3])]
In [45]: Zsub.todense()
Out[45]:
matrix([[ 2, 8, 6],
[ 6, 12, 6]])
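If the original indices in i are not in increasing order, searchsorted can still be used by passing a sorter. The following is a sketch under that assumption; it uses dense NumPy arrays as a stand-in for the sparse matrices (row indexing with an integer array works the same way on a csr_matrix), and the names wanted, order, and pos are illustrative:

```python
import numpy as np

X = np.array([[1, 4, 3],
              [3, 4, 1],
              [2, 1, 1],
              [3, 6, 3]])

i = np.array([3, 0, 2])      # original row indices, deliberately unsorted
Z = 2 * X[i]

wanted = np.array([0, 3])    # rows requested by their ORIGINAL index in X

order = np.argsort(i)        # permutation that sorts i
pos = order[np.searchsorted(i, wanted, sorter=order)]

# Sanity check: every requested index really is one of the rows kept in Z.
assert np.array_equal(i[pos], wanted)

Zsub = Z[pos]
print(Zsub)
```

The sanity check matters because searchsorted returns an insertion point even for values that are not present in i, so a request for a row that was never copied into Z would otherwise return the wrong row or raise an IndexError.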