numpy indexing with conditional transposes dimensions

When I mix mask-indices and normal indices, I get a weird result:
a = np.zeros((3,2,5))
cond = [True]*5
print(a[0][:,cond].shape) # prints '(2,5)' as expected
print(a[0,:,cond].shape) # prints '(5,2)' which is surprising
i.e., the resulting array is transposed from what I think it should be.
Is this a bug? Is it a feature? If someone can point me to a piece of documentation that explains this, I would be very glad :)

No, this is not a bug. It happens because you have mixed basic indexing (the slice) with advanced indexing (the integer and the boolean array). It's documented in the NumPy documentation on advanced indexing:
Two cases of index combination need to be distinguished:
The advanced indexes are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].
The advanced indexes are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced index in this regard.
In the first case, the dimensions resulting from the advanced indexing
operation come first in the result array, and the subspace dimensions
after that. In the second case, the dimensions from the advanced
indexing operations are inserted into the result array at the same
spot as they were in the initial array (the latter logic is what makes
simple advanced indexing behave just like slicing).
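To make the rule concrete, here is a minimal sketch using the arrays from the question (the .T at the end is just one way to restore the expected order):
import numpy as np

a = np.zeros((3, 2, 5))
cond = np.array([True] * 5)

# Two steps: a[0] is plain integer indexing, and [:, cond] then contains
# a single advanced index, which behaves just like slicing.
print(a[0][:, cond].shape)    # (2, 5)

# One step: the advanced indexes 0 and cond are separated by the slice ':',
# so the dimensions they produce are moved to the front of the result.
print(a[0, :, cond].shape)    # (5, 2)

# Reorder afterwards if you need the original layout:
print(a[0, :, cond].T.shape)  # (2, 5)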

Related

In tensorflow, how to gather with indices matching params first dimensions?

If I have indices of shape (D_0,...,D_k) and params of shape (D_0,...,D_k,I,F) (with 0 ≤ indices[i_0,...,i_k] < I), what is the fastest/most elegant way to get the array output of shape (D_0,...,D_k,F) with
output[i_0,...,i_k,f]=params[i_0,...,i_k,indices[i_0,...,i_k],f]
If k=0, then we can use gather. So, in the past, I had a solution based on flattening. Is there a nicer solution now that tensorflow has matured?
Most of the time, when I want this type of gathering, indices is obtained by indices = tf.argmax(params[:,...,:,:,0]). For every (i_0,...,i_k), I have I vectors of size (F,) and I want to keep only those with the maximal value for one of the features. A solution which would only work for this special case (a kind of reduce_max only using one feature to decide how to reduce) would satisfy me.
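For reference, newer TensorFlow releases expose a batch_dims argument on tf.gather that expresses exactly this pattern. A minimal sketch for k = 1, with made-up sizes for illustration:
import tensorflow as tf

# Made-up sizes: k = 1, D_0 = 2, D_1 = 3, I = 4, F = 5.
params = tf.random.normal((2, 3, 4, 5))
indices = tf.random.uniform((2, 3), maxval=4, dtype=tf.int64)

# batch_dims=2 tells gather to match the leading (D_0, D_1) axes of
# params and indices elementwise; the result has shape (D_0, D_1, F).
output = tf.gather(params, indices, batch_dims=2)
print(output.shape)  # (2, 3, 5)

# The special case from the question -- keep the vector whose feature 0
# is maximal -- then becomes:
best = tf.gather(params, tf.argmax(params[..., 0], axis=-1), batch_dims=2)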

How to use the backslash operator in Julia?

I am currently trying to invert huge matrices of order 1 million by 1 million, and I figured that the backslash operator would be helpful in doing this. Any idea as to how it's implemented? I did not find any concrete examples, so any help is much appreciated.
Any idea as to how it's implemented?
It's a multialgorithm. This shows how to use it:
julia> A = rand(10,10)
10×10 Array{Float64,2}:
0.330453 0.294142 0.682869 0.991427 … 0.533443 0.876566 0.157157
0.666233 0.47974 0.172657 0.427015 0.501511 0.0978822 0.634164
0.829653 0.380123 0.589555 0.480963 0.606704 0.642441 0.159564
0.709197 0.570496 0.484826 0.17325 0.699379 0.0281233 0.66744
0.478663 0.87298 0.488389 0.188844 0.38193 0.641309 0.448757
0.471705 0.804767 0.420039 0.0528729 … 0.658368 0.911007 0.705696
0.679734 0.542958 0.22658 0.977581 0.197043 0.717683 0.21933
0.771544 0.326557 0.863982 0.641557 0.969889 0.382148 0.508773
0.932684 0.531116 0.838293 0.031451 0.242338 0.663352 0.784813
0.283031 0.754613 0.938358 0.0408097 0.609105 0.325545 0.671151
julia> b = rand(10)
10-element Array{Float64,1}:
0.0795157
0.219318
0.965155
0.896807
0.701626
0.741823
0.954437
0.573683
0.493615
0.0821557
julia> A\b
10-element Array{Float64,1}:
1.47909
2.39816
-0.15789
0.144003
-1.10083
-0.273698
-0.775122
0.590762
-0.0266894
-2.36216
You can use @which to see how it's defined:
julia> @which A\b
\(A::AbstractArray{T,2} where T, B::Union{AbstractArray{T,1}, AbstractArray{T,2}} where T) in Base.LinAlg at linalg\generic.jl:805
Which leads us here: https://github.com/JuliaLang/julia/blob/master/base/linalg/generic.jl#L827 (line numbers change slightly between versions). As you can see, it makes a few quick function calls to determine what type of matrix it is. For instance, istril finds out if it is lower triangular: https://github.com/JuliaLang/julia/blob/master/base/linalg/generic.jl#L987 , etc. Once it determines the matrix type, it specializes the matrix as much as possible so it can be efficient, and then calls \ on that. These specialized matrix types either perform a factorization, after which \ does the back-substitution (a nice way to use \ yourself, BTW, to re-use a factorization), or they "directly know" the answer, as for triangular or diagonal matrices.
Can't get more concrete than the source.
Note that \ is slightly different from just inverting. You usually do not want to invert a matrix, let alone a large one; the factorizations are much more numerically stable. inv will do an actual inversion, which internally is a lot like an LU factorization (lufact in Julia). You may also want to look into pinv, the pseudo-inverse, for cases where the matrix is singular or close to singular, but you should really avoid this and instead factorize and solve the system rather than using the inverse.
For very large sparse matrices, you'll want to use iterative solvers. You'll find a lot of implementations in IterativeSolvers.jl.

julia index matrix with vector

Suppose I have a 20-by-10 matrix m
and a 20-by-1 vector v, where each element is an integer between 1 and 10.
Is there a smart indexing command, something like m[:,v],
that would give a vector whose element i is the element of m at index [i,v[i]]?
No, it seems that you cannot do it. The documentation (http://docs.julialang.org/en/stable/manual/arrays/) says:
If all the indices are scalars, then the result X is a single element from the array A. Otherwise, X is an array with the same number of dimensions as the sum of the dimensionalities of all the indices.
So, to get a 1-d result from an indexing operation, you need one of the indices to have dimensionality 0, i.e. to be just a scalar -- and then you won't get what you want.
Use a comprehension, as proposed in the comment to your question.
To be explicit about the comprehension approach:
[m[i,v[i]] for i = 1:length(v)]
This is concise and clear enough that having a special syntax seems unnecessary.

Differences between X.ravel() and X.reshape(s0*s1*s2) when number of axes known

Seeing this answer, I am wondering if these ways of creating a flattened view of X are essentially the same, as long as I know that the number of axes in X is 3:
A = X.ravel()
s0, s1, s2 = X.shape
B = X.reshape(s0*s1*s2)
C = X.reshape(-1) # thanks to @hpaulj below
I'm not asking if A and B and C are the same.
I'm wondering if these particular uses of ravel and reshape are essentially the same in this situation, or if there are significant differences, advantages, or disadvantages to one or the other, provided that you know the number of axes of X ahead of time.
The second method takes a few microseconds, but that does not seem to be size-dependent.
Look at their __array_interface__ and do some timings. The only difference that I can see is that ravel is faster.
.flatten() has a more significant difference - it returns a copy.
A.reshape(-1)
is a simpler way to use reshape.
You could study the respective docs and see if there is something else; I haven't explored what happens when you specify order.
I would use ravel if I just want the result to be 1d. I use .reshape most often to change a 1d array (e.g. from arange()) to nd.
e.g.
np.arange(10).reshape(2,5).ravel()
Or choose the one that makes your code most readable.
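If you want to check the view-versus-copy behaviour yourself, here is a minimal sketch (X is any contiguous 3-d array):
import numpy as np

X = np.arange(24).reshape(2, 3, 4)

A = X.ravel()       # a view, since X's data is contiguous
C = X.reshape(-1)   # likewise a view here
F = X.flatten()     # always a copy

print(np.shares_memory(X, A))  # True
print(np.shares_memory(X, C))  # True
print(np.shares_memory(X, F))  # False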
reshape and ravel are defined in numpy C code:
In https://github.com/numpy/numpy/blob/0703f55f4db7a87c5a9e02d5165309994b9b13fd/numpy/core/src/multiarray/shape.c
PyArray_Ravel(PyArrayObject *arr, NPY_ORDER order) requires nearly 100 lines of C code. And it punts to PyArray_Flatten if the order changes.
In the same file, reshape punts to newshape. That in turn returns a view if the shape doesn't actually change, tries _attempt_nocopy_reshape, and as a last resort returns a PyArray_NewCopy.
Both make use of PyArray_Newshape and PyArray_NewFromDescr - depending on how shapes and order mix and match.
So identifying where reshape (to 1d) and ravel are different would require careful study.
Another way to do this ravel is to make a new array, with a new shape, but the same data buffer:
np.ndarray((24,),buffer=A.data)
It times the same as reshape. Its __array_interface__ is the same. I don't recommend using this method, but it may clarify what is going on with these reshape/ravel functions. They all make a new array, with a new shape, but sharing the data (if possible). Timing differences are the result of different sequences of function calls - in Python and C - not of different handling of the data.

Matrices with different row lengths in numpy

Is there a way of defining a matrix (say m) in numpy with rows of different lengths, but such that m stays 2-dimensional (i.e. m.ndim = 2)?
For example, if you define m = numpy.array([[1,2,3], [4,5]]), then m.ndim = 1. I understand why this happens, but I'm interested if there is any way to trick numpy into viewing m as 2D. One idea would be padding with a dummy value so that rows become equally sized, but I have lots of such matrices and it would take up too much space. The reason why I really need m to be 2D is that I am working with Theano, and the tensor which will be given the value of m expects a 2D value.
I'll share some very new information about Theano here. We have a new TypedList() type that allows a Python list whose elements all have the same type, e.g. 1-d ndarrays. Everything is done except the documentation.
There is limited functionality available for them so far, but we built them to allow looping over a typed list with scan. They are not yet integrated with scan directly, but you can use them now like this:
import theano
import theano.typed_list

a = theano.typed_list.TypedListType(theano.tensor.fvector)()
s, _ = theano.scan(fn=lambda i, tl: tl[i].sum(),
                   non_sequences=[a],
                   sequences=[theano.tensor.arange(2, dtype='int64')])
f = theano.function([a], s)
f([[1, 2, 3], [4, 5]])
One limitation is that the output of scan must be an ndarray, not a typed list.
No, this is not possible. NumPy arrays need to be rectangular in every pair of dimensions. This is due to the way they map onto memory buffers, as a pointer, itemsize, stride triple.
As for this taking up space: np.array([[1,2,3], [4,5]]) actually takes up more space than a 2×3 array, because it's an array of two pointers to Python lists (and even if the elements were converted to arrays, the memory layout would still be inefficient).
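To make this concrete, a minimal sketch comparing the two layouts (the fill value -1 is an arbitrary choice; modern NumPy requires an explicit dtype=object for ragged input):
import numpy as np

# The ragged version: a 1-d array holding two Python list objects.
m = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(m.ndim, m.dtype)  # 1 object

# The padded, genuinely 2-d alternative:
rows = [[1, 2, 3], [4, 5]]
width = max(len(r) for r in rows)
padded = np.full((len(rows), width), -1)
for i, r in enumerate(rows):
    padded[i, :len(r)] = r
print(padded.ndim, padded.shape)  # 2 (2, 3)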