Strange behaviour of numpy eigenvector: bug or no bug - numpy

NumPy's eigenvector solution differs from Wolfram Alpha and my personal calculation by hand.
>>> import numpy.linalg
>>> import numpy as np
>>> numpy.linalg.eig(np.array([[-2, 1], [2, -1]]))
(array([-3., 0.]), array([[-0.70710678, -0.4472136 ],
[ 0.70710678, -0.89442719]]))
Wolfram Alpha https://www.wolframalpha.com/input/?i=eigenvectors+%7B%7B-2,1%7D,%7B%2B2,-1%7D%7D and my personal calculation give the eigenvectors (-1, 1) and (2, 1). The NumPy solution however differs.
NumPy's calculated eigenvalues however are confirmed by Wolfram Alpha and my personal calculation.
So, is this a bug in NumPy or is my understanding of math to simple? A similar thread Numpy seems to produce incorrect eigenvectors sees the main difference in rounding/scaling of the eigenvectors but the deviation between the solutions would be massive.
Regards

numpy.linalg.eig normalizes the eigen vectors with the results being the column vectors
eig_vectors = np.linalg.eig(np.array([[-2, 1], [2, -1]]))[1]
vec_1 = eig_vectors[:,0]
vec_2 = eig_vectors[:,1]
now these 2 vectors are just normalized versions of the vectors you calculated ie
print(vec_1 * np.sqrt(2)) # where root 2 is the magnitude of [-1, 1]
print(vec_1 * np.sqrt(5)) # where root 5 is the magnitude of [2, 1]
So bottom line the both sets of calculations are equivalent just Numpy likes to normalze the results.

Related

type hint npt.NDArray number of axis

Given I have the number of axes, can I specify the number of axes to the type hint npt.NDArray (from import numpy.typing as npt)
i.e. if I know it is a 3D array, how can I do npt.NDArray[3, np.float64]
On Python 3.9 and 3.10 the following does the job for me:
data = [[1, 2, 3], [4, 5, 6]]
arr: np.ndarray[Tuple[Literal[2], Literal[3]], np.dtype[np.int_]] = np.array(data)
It is a bit cumbersome, but you might follow numpy issue #16544 for future development on easier specification.
In particular, for now you must declare the full shape and can't only declare the rank of the array.
In the future something like ndarray[Shape[:, :, :], dtype] should be available.

Applying scipy.sparse.linalg.svds returns nan values

I am starting to use the scipy.sparse library, and when I try to apply scipy.sparse.linalg.svds, I get an error if there are zero singular values.
I am doing this because in the end I am going to use very large and very sparse matrices with entries only {+1, -1} which are not square (>1100*1000 size with >0.99 sparsity), and I want to know their rank.
I know approximately what the rank is, it is almost full, so knowing only the last singular values can tell me what is the rank exactly.
This is why I chose to work with scipy.sparse.linalg.svds and set which='LM'. If the rank is not full, there will be singular values which are zero, this is my code:
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as la
a = np.array([[0, 0, 0], [0, 0, 0], [1, 1, -1]], dtype='d')
sp_a = sp.csc_matrix(a)
s = la.svds(sp_a, k=2, return_singular_vectors=False, which='SM')
print(s)
output is
[ nan 9.45667059e-12]
/usr/lib/python3/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py:1849: RuntimeWarning: invalid value encountered in sqrt
s = np.sqrt(eigvals)
Any thoughts on why this happens?
Maybe there is another efficient way to know the rank, knowing that I have a large non-square very sparse matrix with almost full rank?
scipy version 1.1.0
numpy version 1.14.5
Linux platform
Thanks in advance

How to understand this: `db = np.sum(dscores, axis=0, keepdims=True)`

In cs231n 2017 class, when we backpropagate the gradient we update the biases like this:
db = np.sum(dscores, axis=0, keepdims=True)
What's the basic idea behind the sum operation? Thanks
This is the formula of derivative (more precisely gradient) of the loss function with respect to the bias (see this question and this post for derivation details).
The numpy.sum call computes the per-column sums along the 0 axis. Example:
dscores = np.array([[1, 2, 3],[2, 3, 4]]) # a 2D matrix
db = np.sum(dscores, axis=0, keepdims=True) # result: [[3 5 7]]
The result is exactly element-wise sum [1, 2, 3] + [2, 3, 4] = [3 5 7]. In addition, keepdims=True preserves the rank of original matrix, that's why the result is [[3 5 7]] instead of just [3 5 7].
By the way, if we were to compute np.sum(dscores, axis=1, keepdims=True), the result would be [[6] [9]].
[Update]
Apparently, the focus of this question is the formula itself. I'd like not to go too much off-topic here and just try to tell the main idea. The sum appears in the formula because of broadcasting over the mini-batch in the forward pass. If you take just one example at a time, the bias derivative is just the error signal, i.e. dscores (see the links above explain it in detail). But for a batch of examples the gradients are added up due to linearity. That's why we take the sum along the batch axis=0.
Numpy axis visual description:

Simplex algorithm in scipy package python

I am reading the documentation of the Simplex Algorithm provided in the Scipy package of python, but the example shown in the last at this documentation page is solving a minimization problem. Whereas I want to do a maximization. How would you alter the parameters in order to perform a maximization if we can do maximization using this package?
Every maximization problem can be transformed into a minimization problem by multiplying the c-vector by -1: Say you have the 2-variable problem from the documentation, but want to maximize c=[-1,4]
from scipy.optimize import linprog
import numpy
c = numpy.array([-1, 4]) # your original c for maximization
c *= -1 # negate the objective coefficients
A = [[-3, 1], [1, 2]]
b = [6, 4]
x0_bnds = (None, None)
x1_bnds = (-3, None)
res = linprog(c, A, b, bounds=(x0_bnds, x1_bnds))
print("Objective = {}".format(res.get('fun') * -1)) # don't forget to retransform your objective back!
outputs
>>> Objective = 11.4285714286

Specify the spherical covariance in numpy's multivariate_normal random sampling

In numpy manual, it is said:
Instead of specifying the full covariance matrix, popular approximations include:
Spherical covariance (cov is a multiple of the identity matrix)
Has anybody ever specified spherical covariance? I am trying to make it work to avoid building the full covariance matrix, which is too much memory-consuming.
If you just have a diagonal covariance matrix, it is usually easier (and more efficient) to just scale standard normal variates yourself instead of using multivariate_normal().
>>> import numpy as np
>>> stdevs = np.array([3.0, 4.0, 5.0])
>>> x = np.random.standard_normal([100, 3])
>>> x.shape
(100, 3)
>>> x *= stdevs
>>> x.std(axis=0)
array([ 3.23973255, 3.40988788, 4.4843039 ])
While #RobertKern's approach is correct, you can let numpy handle all of that for you, as np.random.normal will do broadcasting on multiple means and standard deviations:
>>> np.random.normal(0, [1,2,3])
array([ 0.83227999, 3.40954682, -0.01883329])
To get more than a single random sample, you have to give it an appropriate size:
>>> x = np.random.normal(0, [1, 2, 3], size=(1000, 3))
>>> np.std(x, axis=0)
array([ 1.00034817, 2.07868385, 3.05475583])