How to concatenate a numpy element and a one-dimensional array? - numpy

a=np.array([1,2,3])
b=np.array([5,6,7,8])
c=np.array([8])
d=9
I want to compose a new array:
np.array([2,7,8,8,9])
So I typed:
newlist=np.concatenate((a[1],b[2:4],c,d))
But it raises an error:
ValueError: zero-dimensional arrays cannot be concatenated
Is a single value sliced from an np.array regarded as a one-dimensional array or as a number?
In general, how can I combine numbers and one-dimensional arrays into a single one-dimensional array or list?

Just change np.concatenate() into np.hstack():
np.hstack((a[1],b[2:4],c,d))
np.hstack() takes a sequence of arrays and stacks them horizontally to make a single array.
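The error occurs because a[1] and d are zero-dimensional (a scalar and a Python int); np.hstack first passes each input through np.atleast_1d, which is why it succeeds where np.concatenate fails. A short sketch of both routes:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([5, 6, 7, 8])
c = np.array([8])
d = 9

# np.hstack promotes the scalars (a[1] and d) to 1-element arrays first
new = np.hstack((a[1], b[2:4], c, d))
print(new)  # [2 7 8 8 9]

# np.concatenate works too if you wrap the scalars with np.atleast_1d yourself
new2 = np.concatenate((np.atleast_1d(a[1]), b[2:4], c, np.atleast_1d(d)))
```

Note that a[1] (indexing with a single integer) returns a scalar, while a[1:2] (slicing) returns a one-dimensional array, so np.concatenate((a[1:2], b[2:4], c, np.atleast_1d(d))) would also work.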

Related

Typed lists vs ND-arrays in Numba

Could someone please clarify what the benefit of using a Numba typed list over an ND array is? Also, how do the two compare in terms of speed, and in what context would it be recommended to use the typed list?
Typed lists are useful when you need to append a sequence of elements but you do not know the total number of elements and cannot even find a reasonable bound. Such a data structure is significantly more expensive than a 1D array (both in memory space and computation time).
1D arrays cannot be resized efficiently: a new array needs to be created and a copy must be performed. However, indexing a 1D array is very cheap. Numpy also provides many functions that can natively operate on them (lists are implicitly converted to arrays when passed to a Numpy function, and this conversion is expensive). Note that if the number of items can be bounded to a reasonable size (i.e. not much higher than the actual number of elements), you can create a big array, add the elements, and finally work on a sub-view of the array.
ND arrays cannot be directly compared with lists. Note that lists of lists are similar to jagged arrays (they can contain lists of different sizes), while an ND array is like a (fixed-size) N x ... x M table. Lists of lists are very inefficient and often not needed.
As a result, use ND arrays when you can and you do not need to often resize them (or append/remove elements). Otherwise, use typed lists.
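The "big array plus sub-view" pattern mentioned above can be sketched in plain NumPy (the same idea applies inside a Numba-compiled function; the bound of 100 here is an arbitrary assumption for illustration):

```python
import numpy as np

upper_bound = 100                    # assumed bound on the number of elements
buf = np.empty(upper_bound, dtype=np.int64)

n = 0
for x in range(10):                  # elements arrive one by one
    if x % 2 == 0:                   # keep only the ones we want
        buf[n] = x
        n += 1

result = buf[:n]                     # cheap sub-view of the filled part, no copy
print(result)  # [0 2 4 6 8]
```

This avoids both repeated reallocation and the cost of a typed list, at the price of temporarily over-allocating.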

Simple question about slicing a Numpy Tensor

I have a Numpy Tensor,
X = np.arange(64).reshape((4,4,4))
I wish to grab the 2,3,4 entries of the first dimension of this tensor, which you can do with,
Y = X[[1,2,3],:,:]
Is there a simpler way of writing this, instead of explicitly writing out the indices [1,2,3]? I tried something like [1,:], which gave me an error.
Context: for my real application, the shape of the tensor is something like (30000,100,100). I would like to grab the slices from index 10000 to 30000 along the first dimension.
The simplest way in your case is to use X[1:4]. This is the same as X[[1,2,3]], but notice that with X[1:4] you only need one pair of brackets because 1:4 already represent a range of values.
For an N-dimensional array in NumPy, if you specify indexes for fewer than N dimensions, you get all elements of the remaining dimensions. That is, for N equal to 3, X[1:4] is the same as X[1:4, :, :] or X[1:4, :]. You only need to pass : explicitly when you want to index a dimension while taking all elements of a dimension that comes before it, such as X[:, 2:4].
If you wish to select from some row to the end of array, simply use python slicing notation as below:
X[10000:,:,:]
This will select all rows from 10000 to the end of the array, and all columns and depths for them.

How to find matching elements in 2 tensors of different sizes?

I am trying to achieve something very simple in Tensorflow (and not native Python or NumPy or pandas) which can be done in any of the following ways:
Have 2 separate arrays/tensors with different sizes. Each element holds two values: a comparing-value and a weight. We want to compare the comparing-value in both tensors, and multiply their corresponding weights.
Have comparing-value and weights as different arrays. Then compare the comparing-values, get the indices, then use the index to find elements in the weight vectors and then multiply them.
In short I want to find indices of matching elements in both the tensors.
The closest solution I could find is to convert them to sets, but it does not give the exact index of the element.
I was able to achieve what I wanted using Pandas:
matched = pd.Index(v1).intersection(pd.Index(v2))
and native Python:
ind_v1 = [i for i, item in enumerate(v1_1) if item in v2_1]
ind_v2 = [i for i, item in enumerate(v2_1) if item in v1_1]
I wish to have this same in Tensorflow.
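One way to phrase this with tensor operations is to broadcast-compare every element of one tensor against every element of the other and read off the indices where they are equal. Below is a NumPy sketch of that idea; the likely TensorFlow counterparts would be tf.equal, tf.where and tf.gather, but that mapping is an assumption here, not a tested TF solution:

```python
import numpy as np

v1 = np.array([1, 3, 5, 7])          # comparing-values, tensor 1
w1 = np.array([10., 20., 30., 40.])  # weights, tensor 1
v2 = np.array([5, 1, 9])             # comparing-values, tensor 2
w2 = np.array([2., 3., 4.])          # weights, tensor 2

# Broadcast-compare every element of v1 against every element of v2
eq = v1[:, None] == v2[None, :]      # boolean matrix, shape (len(v1), len(v2))
i1, i2 = np.nonzero(eq)              # paired indices of matching elements

products = w1[i1] * w2[i2]           # multiply the corresponding weights
print(i1, i2, products)              # [0 2] [1 0] [30. 60.]
```

This handles different sizes and gives the exact index pairs, unlike the set-based approach.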

2D dot product on two 3D matrices along an axis

Given two matrices A and B with dimensions (x,y,z) and (y,x,z) respectively, how can I take the dot product over the first two dimensions of the two matrices? The result should have dimension (x,x,z).
Thanks!
Use np.einsum with literally the same string expression -
np.einsum('xyz,yiz->xiz',a,b) # a,b are input arrays
Note that we have used yiz as the string notation for the second array and not yxz, as i is supposed to be a new dimension in the output array and is not to be aligned with the first axis of the first array, for which we have already assigned x. Dimensions that are to be aligned are given the same string notation.
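A quick sketch on small random inputs, checking the einsum result against an explicit per-z-slice matrix product:

```python
import numpy as np

x, y, z = 2, 3, 4
a = np.random.rand(x, y, z)
b = np.random.rand(y, x, z)

out = np.einsum('xyz,yiz->xiz', a, b)   # shape (x, x, z)

# Same computation slice by slice: for each k, out[:, :, k] = a[:, :, k] @ b[:, :, k]
ref = np.stack([a[:, :, k] @ b[:, :, k] for k in range(z)], axis=-1)
assert np.allclose(out, ref)
```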

What is the equivalent of numpy.allclose for structured numpy arrays?

Running numpy.allclose(a, b) throws TypeError: invalid type promotion on structured arrays. What would be the correct way of checking whether the contents of two structured arrays are almost equal?
np.allclose does an np.isclose followed by all(). isclose tests abs(x-y) against tolerances, with accommodations for np.nan and np.inf. So it is designed primarily to work with floats, and by extension ints.
The arrays have to work with np.isfinite(a), as well as a-b and np.abs. In short a.astype(float) should work with your arrays.
None of this works with the compound dtype of a structured array. You could, though, iterate over the fields of the array and compare those with isclose (or allclose). But you will have to ensure that the two arrays have matching dtypes, and use some other test on fields that don't work with isclose (e.g. string fields).
So in the simple case
all([np.allclose(a[name], b[name]) for name in a.dtype.names])
should work.
If the fields of the arrays are all the same numeric dtype, you could view the arrays as 2d arrays, and do allclose on those. But usually structured arrays are used when the fields are a mix of string, int and float. And in the most general case, there are compound dtypes within dtypes, requiring some sort of recursive testing.
import numpy.lib.recfunctions as rf
has functions to help with complex structured array operations.
Assuming b is a scalar, you can just iterate over the fields of a:
all(np.allclose(a[field], b) for field in a.dtype.names)
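Putting the field-wise idea together, here is a sketch of a helper that uses allclose on numeric fields and exact equality on everything else (the dtype and the structured_allclose name are illustrative assumptions, not a NumPy API):

```python
import numpy as np

dt = [('x', 'f8'), ('y', 'f8'), ('tag', 'U4')]
a = np.array([(1.0, 2.0, 'ab'), (3.0, 4.0, 'cd')], dtype=dt)
b = a.copy()
b['x'] += 1e-9                       # a tiny float perturbation

def structured_allclose(a, b):
    # Field-wise comparison: isclose-style tolerance for numeric fields,
    # exact equality for anything allclose cannot handle (e.g. strings).
    ok = True
    for name in a.dtype.names:
        if np.issubdtype(a.dtype[name], np.number):
            ok &= np.allclose(a[name], b[name])
        else:
            ok &= bool((a[name] == b[name]).all())
    return ok

print(structured_allclose(a, b))     # True: 1e-9 is within the default tolerances
```

Nested compound dtypes would need a recursive version of the same loop, as noted above.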