why the difference between numpy matrix and numpy array when selecting an element - numpy

I have a calculated matrix:
from numpy import matrix
vec=matrix([[ 4.79263398e-01+0.j , -2.94883960e-14+0.34362808j,
5.91036823e-01+0.j , -2.06730654e-14+0.41959935j,
-3.20298698e-01+0.08635809j, -5.97136351e-02+0.22325523j],
[ 9.45394208e-14+0.34385164j, 4.78941900e-01+0.j ,
1.07732017e-13+0.41891016j, 5.91969770e-01+0.j ,
-6.06877417e-02-0.2250884j , 3.17803028e-01+0.08500215j],
[ 4.63795513e-01-0.00827114j, -1.15263719e-02+0.33287485j,
-2.78282097e-01-0.20137267j, -2.81970922e-01-0.1980647j ,
9.26109539e-02-0.38428445j, 5.12483437e-01+0.j ],
[ -1.15282610e-02+0.33275927j, 4.63961516e-01-0.00826978j,
-2.84077490e-01-0.19723838j, -2.79429184e-01-0.19984041j,
-4.42104809e-01+0.25708681j, -2.71973825e-01+0.28735795j],
[ 4.63795513e-01+0.00827114j, 1.15263719e-02+0.33287485j,
-2.78282097e-01+0.20137267j, 2.81970922e-01-0.1980647j ,
2.73235786e-01+0.28564581j, -4.44053596e-01-0.25584307j],
[ 1.15282610e-02+0.33275927j, 4.63961516e-01+0.00826978j,
2.84077490e-01-0.19723838j, -2.79429184e-01+0.19984041j,
5.11419878e-01+0.j , -9.22028113e-02-0.38476356j]])
I want to get the element in the 2nd row, 3rd column:
vec[1][2]
IndexError: index 1 is out of bounds for axis 0 with size 1
and slicing works well
vec[1,2]
(1.07732017e-13+0.41891015999999998j)
My first question: why doesn't the first way work in this case? It worked before when I used it.
My second question: the result of the slicing is an array; how do I make it a complex value without the brackets? My experience was using
vec[1,2][0]
but again it is not working here.
I tried to do everything on a numpy array at the beginning; the methods that do not work on the numpy matrix do work on a numpy array. Why are there such differences?

The key difference is that a matrix is always 2d, always. (This is supposed to be familiar to MATLAB users.)
In [85]: mat = np.matrix('1,2;3,4')
In [86]: mat
Out[86]:
matrix([[1, 2],
[3, 4]])
In [87]: mat.shape
Out[87]: (2, 2)
In [88]: mat[1]
Out[88]: matrix([[3, 4]])
In [89]: _.shape
Out[89]: (1, 2)
Selecting a row of mat returns a matrix - a one-row one. Since its first axis has size 1, it cannot be indexed again with [1]; that is exactly the IndexError in the question.
Indexing with the tuple returns a scalar:
In [90]: mat[1,1]
Out[90]: 4
In [91]: type(_)
Out[91]: numpy.int32
As a general rule, operations on a np.matrix return a matrix or a scalar, not a np.ndarray.
The other key point is that mat[1][1] is not one numpy operation. It is two, a mat[1] followed by another [1]. Imagine yourself to be a Python interpreter without any special knowledge of numpy. How would you evaluate that expression?
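Spelling the two operations out, a small sketch using the mat from In [85] (step1 here is just a temporary name for the intermediate result):
step1 = mat[1]   # first operation: matrix([[3, 4]]), still 2d, shape (1, 2)
step1[1]         # second operation: asks for row 1 of a 1-row matrix
# IndexError: index 1 is out of bounds for axis 0 with size 1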
Now for the complex question:
In [92]: mat = np.matrix('1+3j, 2;-2, 2+1j')
In [93]: mat
Out[93]:
matrix([[ 1.+3.j, 2.+0.j],
[-2.+0.j, 2.+1.j]])
In [94]: mat[1,1]
Out[94]: (2+1j)
In [95]: type(_)
Out[95]: numpy.complex128
As expected, the tuple index has returned a scalar numpy element. The () is just part of numpy's way of displaying a complex number.
We can use item() to extract the plain Python equivalent, but the display still uses ():
In [96]: __.item()
Out[96]: (2+1j)
In [97]: type(_)
Out[97]: complex
In [98]: 1+3j
Out[98]: (1+3j)
mat has an A property that gives the array equivalent. But notice the shapes:
In [99]: mat.A # a 2d array
Out[99]:
array([[ 1.+3.j, 2.+0.j],
[-2.+0.j, 2.+1.j]])
In [100]: mat.A1 # a 1d array
Out[100]: array([ 1.+3.j, 2.+0.j, -2.+0.j, 2.+1.j])
In [101]: mat[1].A
Out[101]: array([[-2.+0.j, 2.+1.j]])
In [102]: mat[1].A1
Out[102]: array([-2.+0.j, 2.+1.j])
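Applied to the vec from the question, a short sketch: going through A restores the chained indexing the question tried, and item() strips the numpy wrapper:
vec.A[1][2]        # vec.A is a plain 2d ndarray; [1] gives a 1d row, [2] an element
vec[1, 2].item()   # tuple indexing plus item() gives a plain Python complex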
Sometimes this behavior of matrix is handy. For example, np.sum on a matrix acts like the array version with keepdims=True:
In [108]: np.sum(mat,1)
Out[108]:
matrix([[ 3.+3.j],
[ 0.+1.j]])
In [110]: np.sum(mat.A,1, keepdims=True)
Out[110]:
array([[ 3.+3.j],
[ 0.+1.j]])

Related

shape of the result matrix for multiplying 2 arrays in numpy

Matrix A is 3x3 and array B is 3x1, so the shape of A x B should be 3x1. The numpy calculation shows it as 1x3. Where am I wrong? I was expecting the shape to be 3x1.
The documentation for matmul/@ (and dot) is written with ndarray in mind:
With (3,3) and (3,):
In [99]: A= np.arange(9).reshape(3,3); b=np.arange(3)
matmul gives a (3,) result (as would np.dot):
In [100]: A@b
Out[100]: array([ 5, 14, 23])
But with np.matrix, it appears the (3,) result is "promoted" to matrix, which by default adds a leading dimension:
In [101]: np.matrix(A)@b
Out[101]: matrix([[ 5, 14, 23]])
Originally * was defined to be matrix multiplication for np.matrix (as it is in MATLAB):
In [102]: np.matrix(A)*b
ValueError: shapes (3,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
That tried to promote b to np.matrix, resulting in a (3,3) with (1,3) => error.
np.dot of the same behaves like @:
In [103]: np.dot(np.matrix(A),b)
Out[103]: matrix([[ 5, 14, 23]])
To use * we have to give the 2nd argument a (3,1) shape:
In [104]: np.matrix(A)*np.matrix(b).T
Out[104]:
matrix([[ 5],
[14],
[23]])
In [105]: np.matrix(b).T
Out[105]:
matrix([[0],
[1],
[2]])
In short, using np.matrix with matmul/@ complicates things, producing a result that doesn't quite fit the documentation.
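For comparison, a sketch of the plain-ndarray version of the same product, where the shapes stay predictable (same A and b as above):
import numpy as np

A = np.arange(9).reshape(3, 3)
b = np.arange(3)
print((A @ b).shape)           # (3,)   - 1d result, as the matmul docs describe
print((A @ b[:, None]).shape)  # (3, 1) - an explicit column vector gives the expected 3x1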

Possible to use np.where to check a condition in vector, but output rows in a 2D array

I have a series and a dataframe. I want to check if the values in a series pass a condition, and modify the row of the dataframe if they do, otherwise leave as is.
NumPy has a broadcasting issue with this - is there another way to do this?
ser = pd.Series([74, 80, 24], pd.date_range(start='2020-01-01', periods=3, freq='D'))
test = pd.DataFrame([pd.Series([1, 2], index=['a', 'b'])] * len(ser), index=ser.index)
np.where(ser<50, (test*2), test)
ValueError: operands could not be broadcast together with shapes (3,) (3,2) (3,2)
I think a workaround would be to modify ser to be a dataframe with all equivalent columns, but it seems a little bit clunky.
Use broadcasting in NumPy, so the values are not aligned by index; all that is necessary is that the Series and the DataFrame have the same length:
a = np.where(ser.to_numpy()[:, None]<50, (test*2), test)
print (a)
[[1 2]
[1 2]
[2 4]]
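If a DataFrame rather than a bare array is wanted back, a minimal sketch reusing ser and test from the question (assuming test's index and columns should be kept):
res = pd.DataFrame(np.where(ser.to_numpy()[:, None] < 50, test * 2, test),
                   index=test.index, columns=test.columns)
print(res)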

How to compute every sum for every argument a from an array of numbers A

I'd like to compute the following sums for each value of a in A:
D = np.array([1, 2, 3, 4])
A = np.array([0.5, 0.25, -0.5])
beta = 0.5
np.sum(np.square(beta) - np.square(D-a))
and the result is an array of all the sums. To compute it by hand, it would look something like this:
[np.sum(np.square(beta)-np.square(D-0.5)),
np.sum(np.square(beta)-np.square(D-0.25)),
np.sum(np.square(beta)-np.square(D-(-0.5)))]
Use np.sum with broadcasting
np.sum(np.square(beta) - np.square(D[None,:] - A[:,None]), axis=1)
Out[98]: array([-20. , -24.25, -40. ])
Explanation: we need to subtract each element of array A from the whole array D. We can't simply call D - A, because numpy would try an element-wise subtraction between D and A, and their shapes (4,) and (3,) don't match. Therefore, we employ numpy broadcasting: add an extra dimension to D and to A so they satisfy the broadcasting rules, then do the calculation and sum along axis=1.
Step by step:
Increase dimension D from 1D to 2D at axis=0
In [10]: D[None,:]
Out[10]: array([[1, 2, 3, 4]])
In [11]: D.shape
Out[11]: (4,)
In [12]: D[None,:].shape
Out[12]: (1, 4)
Doing the same for A, but at axis=1
In [13]: A[:,None]
Out[13]:
array([[ 0.5 ],
[ 0.25],
[-0.5 ]])
In [14]: A.shape
Out[14]: (3,)
In [15]: A[:,None].shape
Out[15]: (3, 1)
On subtraction, numpy broadcasting kicks in to broadcast each array to a compatible shape and performs the subtraction, creating a 2D result:
In [16]: D[None,:] - A[:,None]
Out[16]:
array([[0.5 , 1.5 , 2.5 , 3.5 ],
[0.75, 1.75, 2.75, 3.75],
[1.5 , 2.5 , 3.5 , 4.5 ]])
Next, it is just an element-wise square and subtraction:
np.square(beta) - np.square(D[None,:] - A[:,None])
Out[17]:
array([[ 0. , -2. , -6. , -12. ],
[ -0.3125, -2.8125, -7.3125, -13.8125],
[ -2. , -6. , -12. , -20. ]])
Lastly, sum along axis=1 to get the final output:
np.sum(np.square(beta) - np.square(D[None,:] - A[:,None]), axis=1)
Out[18]: array([-20. , -24.25, -40. ])
You may read the docs on numpy broadcasting to get more info: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
I'm not too familiar with numpy, so there may be a vectorized way to do this. But with list comprehension, this will do:
[ np.sum(np.square(beta) - np.square(D-a)) for a in A ]
Output:
[-20.0, -24.25, -40.0]
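For completeness, an equivalent vectorized variant, a sketch using np.subtract.outer instead of the manual None indexing (the squares are unchanged, since (D - a)**2 equals (a - D)**2):
diffs = np.subtract.outer(A, D)                     # shape (3, 4): diffs[i, j] = A[i] - D[j]
np.sum(np.square(beta) - np.square(diffs), axis=1)
# array([-20.  , -24.25, -40.  ])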

Elementwise multiplication of NumPy arrays of different shapes

When I use numpy.multiply(a, b) to multiply numpy arrays with shapes (2, 1) and (2,), I get a 2 by 2 matrix. But what I want is element-wise multiplication.
I'm not familiar with numpy's rules. Can anyone explain what's happening here?
When doing an element-wise operation between two arrays that are not of the same dimensionality, NumPy will perform broadcasting. In your case NumPy will broadcast b along the rows of a:
import numpy as np
a = np.array([[1],
[2]])
b = np.array([3, 4])
print(a * b)
Gives:
[[3 4]
[6 8]]
To prevent this, you need to make a and b of the same dimensionality. You can add dimensions to an array by using np.newaxis or None in your indexing, like this:
print(a * b[:, np.newaxis])
Gives:
[[3]
[8]]
Let's say you have two arrays, a and b, with shape (2,3) and (2,) respectively:
a = np.random.randint(10, size=(2,3))
b = np.random.randint(10, size=(2,))
The two arrays, for example, contain:
a = np.array([[8, 0, 3],
[2, 6, 7]])
b = np.array([7, 5])
Now, to get an element-by-element product a*b, you have to specify what numpy should do about the absent axis=1 of array b. You can do so by adding None:
result = a*b[:,None]
With result being:
array([[56, 0, 21],
[10, 30, 35]])
Here are the input arrays a and b of the same shape as you mentioned:
In [136]: a
Out[136]:
array([[0],
[1]])
In [137]: b
Out[137]: array([0, 1])
Now, when we do multiplication using either * or numpy.multiply(a, b), we get:
In [138]: a * b
Out[138]:
array([[0, 0],
[0, 1]])
The result is a (2,2) array because numpy uses broadcasting.
#  a\b |  0    1
# -----+----------
#   0  | 0*0  0*1
#   1  | 1*0  1*1
I just explained the broadcasting rules in broadcasting arrays in numpy
In your case
(2,1) + (2,) => (2,1) + (1,2) => (2,2)
It has to add a dimension to the 2nd argument, and can only add it at the beginning (to avoid ambiguity).
So if you want a (2,1) result, you have to expand the 2nd argument yourself, with reshape or [:, np.newaxis].
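A short sketch of both options with the (2,1) and (2,) shapes from the question:
a = np.array([[1], [2]])      # shape (2, 1)
b = np.array([3, 4])          # shape (2,)
a * b.reshape(2, 1)           # array([[3], [8]])
a * b[:, np.newaxis]          # same result; b[:, np.newaxis] has shape (2, 1)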

Slicing a numpy array and passing the slice to a function

I want to have a function that can operate on either a row or a column of a 2D ndarray. Assume the array has C order. The function changes values in the 2D data.
Inside the function I want to have identical index syntax whether it is called with a row or a column. A row slice is [n, :] and a column slice is [:, n], so they have different shapes; inside the function this seems to require different indexing expressions.
Is there a way to do this that does not require moving or allocating memory? I am under the impression that using reshape will force a copy to make the data contiguous. Is there a way to use nditer in the function?
Do you mean like this:
In [74]: def foo(arr, n):
...: arr += n
...:
In [75]: arr = np.ones((2,3),int)
In [76]: foo(arr[0,:],1)
In [77]: arr
Out[77]:
array([[2, 2, 2],
[1, 1, 1]])
In [78]: foo(arr[:,1],[100,200])
In [79]: arr
Out[79]:
array([[ 2, 102, 2],
[ 1, 201, 1]])
In the first case I'm adding 1 to one row of the array, i.e. a row slice. In the second case I'm adding an array (list) to a column. In that case n has to have the right length.
Usually we don't worry about whether the values are C contiguous. Striding takes care of access either way.
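To make the no-copy point concrete, a small sketch: both a row slice and a column slice of arr are 1d views into the same buffer, so the same one-axis indexing works inside the function and writes go straight back to arr:
row = arr[0, :]    # 1d view of the first row, shape (3,)
col = arr[:, 1]    # 1d view of the second column, shape (2,), strided across rows
col[0] = -5        # writes through to arr[0, 1]; no copy or reshape involved
row[2] = -7        # the same indexing style works for the row view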