Shape of result: np.random.randn(10,5) @ np.random.randn(5) - numpy

Why is the shape of the result equal to (10,) for this numpy expression: np.random.randn(10,5) @ np.random.randn(5)?

According to the documentation page for numpy.matmul:
If the second argument is 1-D, it is promoted to a matrix by appending
a 1 to its dimensions. After matrix multiplication the appended 1 is
removed.
This means that after appending an extra dimension to the second operand, the operation is performed between two 2D arrays of shapes (10, 5) and (5, 1). Matrix multiplication follows the (i, j) @ (j, k) -> (i, k) rule, so the output has shape (10, 1), and the extra appended dimension is then removed: (10,).
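A quick check of the rule (a minimal sketch; the shapes match the question):
import numpy as np
a = np.random.randn(10, 5)  # 2-D operand
b = np.random.randn(5)      # 1-D operand, promoted to shape (5, 1) for the product
print((a @ b).shape)        # (10,): the appended 1 of the (10, 1) result is removed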

How can I reconstruct the original matrix from SVD components with the following shapes?

I am trying to reconstruct the following matrix of shape (256 x 256 x 2) with SVD components as
U.shape = (256, 256, 256)
s.shape = (256, 2)
vh.shape = (256, 2, 2)
I have already tried the methods from the numpy and scipy documentation to reconstruct the original matrix, but I failed multiple times; I think a 3D matrix may need a different reconstruction approach.
I am using numpy.linalg.svd for the decomposition.
From np.linalg.svd's documentation:
"... If a has more than two dimensions, then broadcasting rules apply, as explained in :ref:routines.linalg-broadcasting. This means that SVD is
working in "stacked" mode: it iterates over all indices of the first
a.ndim - 2 dimensions and for each combination SVD is applied to the
last two indices."
This means that you only need to handle the s matrix (or tensor in the general case) to obtain the right tensor. More precisely, you need to pad s appropriately and then take only the first 2 columns (or, in general, as many columns as vh has rows, which equals the number of columns of the returned s).
Here is working code with an example for your case:
import numpy as np
mat = np.random.randn(256, 256, 2)  # your matrix of shape 256 x 256 x 2
u, s, vh = np.linalg.svd(mat)       # get the decomposition
# Pad each row of singular values, build the diagonal matrix, and keep only the first 2 columns:
s_rep = np.apply_along_axis(
    lambda _s: np.diag(np.pad(_s, (0, u.shape[1] - _s.shape[0])))[:, :_s.shape[0]], 1, s)
mat_reconstructed = u @ s_rep @ vh
mat_reconstructed equals mat up to floating-point precision error.
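To verify the reconstruction:
print(np.allclose(mat, mat_reconstructed))  # True, up to floating-point tolerance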

Tensor multiplication with einsum

I have a tensor phi = np.random.rand(n, n, 3) and a matrix D = np.random.rand(3, 3). I want to multiply the matrix D along the last axis of phi so that the output has shape (n, n, 3). I have tried this
np.einsum("klj,ij->kli", phi, D)
But I am not confident in this notation at all. Basically I want to do
res = np.zeros_like(phi)
for i in range(n):
    for j in range(n):
        res[i, j, :] = D.dot(phi[i, j, :])
You are treating phi as an n, n array of vectors, each of which is to be left-multiplied by D. So you want to keep the n, n portion of the shape exactly as-is. The last (only) dimension of the vectors should be multiplied and summed with the last dimension of the matrix (the vectors are implicitly 3x1):
np.einsum('ijk,lk->ijl', phi, D)
OR
np.einsum('ij,klj->kli', D, phi)
It's likely much simpler to use broadcasting with np.matmul (the @ operator):
np.squeeze(D @ phi[..., None])
You can omit the squeeze if you don't mind the extra unit dimension at the end.
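A minimal sketch checking that both einsum spellings and the matmul version agree with the explicit loop (n = 4 is an arbitrary choice):
import numpy as np
n = 4
phi = np.random.rand(n, n, 3)
D = np.random.rand(3, 3)
# Reference result from the explicit double loop:
res = np.zeros_like(phi)
for i in range(n):
    for j in range(n):
        res[i, j, :] = D.dot(phi[i, j, :])
print(np.allclose(np.einsum('ijk,lk->ijl', phi, D), res))   # True
print(np.allclose(np.einsum('ij,klj->kli', D, phi), res))   # True
print(np.allclose(np.squeeze(D @ phi[..., None], -1), res)) # True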

Vectors vs ndarrays in pandas/numpy

I know that for a 4-D vector the shape should be (4, 1), which actually represents a point in 4-D space even though ndim is 2, while for an ndarray to be 4-dimensional its shape should be something like (2, 3, 4, 5).
So, does the concept of dimension differ between vectors and matrices (or arrays)? I'm trying to understand this from a mathematical perspective and how it carries over to pandas/numpy programming.
The dimensionality of a mathematical object is usually determined by the number of independent parameters in that particular object. For example, a 4-D vector is mathematically 4 dimensional because it contains 4 independent elements (unless some relation between them has been specified). Such a vector, if represented as a column vector in numpy, would have shape (4, 1) because it has 4 rows and 1 column. Its transpose, a row vector, has shape (1, 4) because it has 1 row and 4 columns; a plain 1-D array holding the same 4 elements has shape (4,), with no row/column distinction at all.
Note, however, that the column vector and row vector are dimensionally equivalent mathematically: both have 4 dimensions.
For a 3 x 3 matrix, the most general mathematical dimension is 9, because it has 9 independent elements in general. The shape of the corresponding numpy array would be (3, 3). If you're looking for the total number of elements in a numpy array, ndarray.size is the way to go.
ndarray.ndim, however, yields the number of axes in a numpy array. That is, the number of directions along which values are placed (sloppy terminology!). So for the 3 x 3 matrix, ndim yields 2. For an array of shape (3, 7, 2, 1), ndim would yield 4. But, as we already discussed, the mathematical dimension would generally be 3 x 7 x 2 x 1 = 42 (so this is a point in 42-dimensional space, even though the numpy array has just 4 dimensions). As you might have noticed, ndarray.size is just the product of the numbers in ndarray.shape.
Note that these are not just concepts of programming. We are used to saying "2-D matrices" in mathematics, but that is not to be confused with the space in which the matrices reside.
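A short demonstration of the difference between shape, ndim, and size (a minimal sketch):
import numpy as np
col = np.array([[1], [2], [3], [4]])  # column vector
print(col.shape, col.ndim, col.size)  # (4, 1) 2 4
a = np.zeros((3, 7, 2, 1))
print(a.shape, a.ndim, a.size)        # (3, 7, 2, 1) 4 42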

Why is there a list after the covariance function for numpy?

Whenever I'm finding covariance of 2 arrays, I've always seen it done like
(np.cov(X,Y)[0,1])
What purpose does the [0,1] serve?
For two 1d arrays x and y, np.cov(x, y) returns:
np.array([[variance(x),      covariance(x, y)],
          [covariance(y, x), variance(y)     ]])
Thus for the covariance, you need the [0,1] term.
When called as np.cov(x, y), numpy effectively computes np.cov(X) where X = np.vstack((x, y)), i.e. x and y become the rows of X.
The confusion occurs because np.cov(X) is really designed for many variables at once: with X.shape = (m, n), each of the m rows is one variable, and np.cov(X)[i, j] (i, j < m) is the covariance between rows i and j. The covariance of row i with itself is just the variance of row i.
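A minimal sketch of the equivalence (random data; 100 samples is an arbitrary choice):
import numpy as np
x = np.random.randn(100)
y = np.random.randn(100)
C = np.cov(x, y)
print(C.shape)                                    # (2, 2)
print(np.allclose(C, np.cov(np.vstack((x, y))))) # True: same as stacking rows
print(np.isclose(C[0, 1], C[1, 0]))              # True: the matrix is symmetric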

The representation of a vector in numpy: is the return of np.array() a row vector?

I am new to numpy. I guess the return of np.array() is a row vector, because the dot product between two vectors is commutative. Is my guess right? Any response is appreciated.
vx = np.array([1, 2])
vw = np.array([3, 5])
np.dot(vx, vw)
np.dot(vw, vx)
The arrays are 1d ('vectors', not row/column vectors).
First paragraph from the dot documentation:
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
arrays to inner product of vectors (without complex conjugation). For
N dimensions it is a sum product over the last axis of a and
the second-to-last of b
So you are getting the inner product, which is commutative.
In [118]: vx = np.array([1, 2])
In [119]: vx.shape
Out[119]: (2,)
dot returns a scalar:
In [120]: np.dot(vx,vx)
Out[120]: 5
For a 2d 'row vector', shape matters. dot is matrix multiplication, and the last dim of the first argument has to match the second-to-last dim of the second, e.g. 2 matches with 2.
In [121]: vx2 = np.array([[1,2]])
In [122]: vx2.shape
Out[122]: (1, 2)
In [123]: np.dot(vx2, vx2)
...
ValueError: shapes (1,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)
In [124]: np.dot(vx2, vx2.T)
Out[124]: array([[5]])
In this case the result is 2d (1,1).