julia index matrix with vector - indexing

Suppose I have a 20-by-10 matrix m
and a 20-by-1 vector v, where each element is an integer between 1 to 10.
Is there smart indexing command something like m[:,v]
that would give a vector, where each element i is element of m at the index [i,v[i]]?

No, it seems that you cannot do it. Documentation (http://docs.julialang.org/en/stable/manual/arrays/) says:
If all the indices are scalars, then the result X is a single element from the array A. Otherwise, X is an array with the same number of dimensions as the sum of the dimensionalities of all the indices.
So, to get 1d result from indexing operation you need to have one of the indices to have dimensionality 0, i.e. to be just a scalar -- and you won't get what you want then.
Use comprehension, as proposed in the comment to your question.

To be explicit about the comprehension approach:
[m[i,v[i]] for i = 1:length(v)]
This is concise and clear enough that having a special syntax seems unnecessary.

Related

numpy indexing with conditional transposes dimensions

When I mix mask-indices and normal indices, I get a weird result:
a = np.zeros((3,2,5))
cond = [True]*5
print(a[0][:,cond].shape) # prints '(2,5)' as expected
print(a[0,:,cond].shape) # prints '(5,2)' which is surprising
i.e., the resulting array is transposed from what I think should happend
Is this a bug? is this a feature? If someone can point me to a piece of documentation that explains this I would be very glad :)
No, this is not a bug. It happens because you have mixed basic indexing (slice) and advanced indexing (integer and boolean array). It's documented here:
Two cases of index combination need to be distinguished:
The advanced indexes are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].
The advanced indexes are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced
index in this regard.
In the first case, the dimensions resulting from the advanced indexing
operation come first in the result array, and the subspace dimensions
after that. In the second case, the dimensions from the advanced
indexing operations are inserted into the result array at the same
spot as they were in the initial array (the latter logic is what makes
simple advanced indexing behave just like slicing).

Simple question about slicing a Numpy Tensor

I have a Numpy Tensor,
X = np.arange(64).reshape((4,4,4))
I wish to grab the 2,3,4 entries of the first dimension of this tensor, which you can do with,
Y = X[[1,2,3],:,:]
Is this a simpler way of writing this instead of explicitly writing out the indices [1,2,3]? I tried something like [1,:], which gave me an error.
Context: for my real application, the shape of the tensor is something like (30000,100,100). I would like to grab the last (10000, 100,100) to (30000,100,100) of this tensor.
The simplest way in your case is to use X[1:4]. This is the same as X[[1,2,3]], but notice that with X[1:4] you only need one pair of brackets because 1:4 already represent a range of values.
For an N dimensional array in NumPy if you specify indexes for less than N dimensions you get all elements of the remaining dimensions. That is, for N equal to 3, X[1:4] is the same as X[1:4, :, :] or X[1:4, :]. Only if you want to index some dimension while getting all elements in a dimension that comes before it is that you actually need to pass :. Such as X[:, 2:4], for instance.
If you wish to select from some row to the end of array, simply use python slicing notation as below:
X[10000:,:,:]
This will select all rows from 10000 to the end of array and all columns and depths for them.

How to retain indices of a matrix while working on one of its submatrices?

I am trying to implement an algorithm that iteratively removes some rows and columns of a matrix and continues processing the remaining submatrix. However, I would like to know the index of a value in the original matrix rather than the remaining submatrix.
For example, assume that a matrix x is built using
x = np.arange(9).reshape(3, 3)
Now, I would like to find the index of the element that is equal to 8 in the submatrix defined below:
np.where(x[1:, 1:] == 8)
By default, numpy returns (array[1], array[1]) because it is finding the element in the sliced submatrix. What I like to be returned instead is (array[2], array[2]), which is the index of 8 in the original matrix.
What is an efficient solution to this problem?
P.S.
The submatrix may be built arbitrarily. For example, I may need to keep rows, 0 and 1, but columns 0 and 2.
Each submatrix may be sliced in next iterations to make a smaller submatrix. I still would like to have access to the index in the original matrix. In other words, I am looking for a solution that works on submatrices of submatrices as well.
I recently learned about indexing with arrays where submatrices of a matrix can be selected using another numpy array. I think what I can do to solve the problem is to map indices of the submatrix to elements of the indexing array.
For example, in the example above, the submatrix can be defined like this:
row_idx = np.array([1, 2])
col_idx = np.array([1, 2])
np.where(x[row_idx[:, None], col_idx] == 8)
This will still return the same (array[1], array[1]) output, but I can use these indices to lookup the elements of row_idx and col_idx in order to find the corresponding indices in the original matrix, i.e. row_idx[1] and col_idx[1].

Matrices with different row lengths in numpy

Is there a way of defining a matrix (say m) in numpy with rows of different lengths, but such that m stays 2-dimensional (i.e. m.ndim = 2)?
For example, if you define m = numpy.array([[1,2,3], [4,5]]), then m.ndim = 1. I understand why this happens, but I'm interested if there is any way to trick numpy into viewing m as 2D. One idea would be padding with a dummy value so that rows become equally sized, but I have lots of such matrices and it would take up too much space. The reason why I really need m to be 2D is that I am working with Theano, and the tensor which will be given the value of m expects a 2D value.
I'll give here very new information about Theano. We have a new TypedList() type, that allow to have python list with all elements with the same type: like 1d ndarray. All is done, except the documentation.
There is limited functionality you can do with them. But we did it to allow looping over the typed list with scan. It is not yet integrated with scan, but you can use it now like this:
import theano
import theano.typed_list
a = theano.typed_list.TypedListType(theano.tensor.fvector)()
s, _ = theano.scan(fn=lambda i, tl: tl[i].sum(),
non_sequences=[a],
sequences=[theano.tensor.arange(2, dtype='int64')])
f = theano.function([a], s)
f([[1, 2, 3], [4, 5]])
One limitation is that the output of scan must be an ndarray, not a typed list.
No, this is not possible. NumPy arrays need to be rectangular in every pair of dimensions. This is due to the way they map onto memory buffers, as a pointer, itemsize, stride triple.
As for this taking up space: np.array([[1,2,3], [4,5]]) actually takes up more space than a 2×3 array, because it's an array of two pointers to Python lists (and even if the elements were converted to arrays, the memory layout would still be inefficient).

Numpy: convert index in one dimension into many dimensions

Many array methods return a single index despite the fact that the array is multidimensional. For example:
a = rand(2,3)
z = a.argmax()
For two dimensions, it is easy to find the matrix indices of the maximum element:
a[z/3, z%3]
But for more dimensions, it can become annoying. Does Numpy/Scipy have a simple way of returning the indices in multiple dimensions given an index in one (collapsed) dimension? Thanks.
Got it!
a = X.argmax()
(i,j) = unravel_index(a, X.shape)
I don't know of an built-in function that does what you want, but where this
has come up for me, I realized that what I really wanted to do was this:
given 2 arrays a,b with the same shape, find the element of b which is in
the same position (same [i,j,k...] position) as the maximum element of a
For this, the quick numpy-ish solution is:
j = a.flatten().argmax()
corresponding_b_element = b.flatten()[j]
Vince Marchetti