svd doesn't return correct dimension - numpy

I have a matrix with dimension (22,2) and I want to decompose it using SVD. SVD in numpy doesn't return the correct dimensions though. I'd expect dimensions like (22,22), (22), (22,2)?

The returned dimensions are correct. The uu and vvh matrices are always square, while, depending on the software, s can be an array holding just the singular values (as in numpy) or a diagonal matrix with the dimensions of the original matrix (as in MATLAB, for instance).
The dimension of the uu matrix is the number of rows of the original matrix, while the dimension of the vvh matrix is the number of columns of the original matrix. This can never change, or you would be computing something other than the SVD.
To reconstruct the original matrix from the decomposition in numpy, we need to turn s into a matrix with the proper dimensions. For square matrices it's easy: np.diag(s) is enough. Since your original matrix is not square and has more rows than columns, we can use something like
S = np.vstack([np.diag(s), np.zeros((20, 2))])
Then we get an S matrix: the diagonal matrix of singular values stacked on top of a zero block. In the end, uu is 22x22, S is 22x2 and vvh is 2x2. Multiplying uu @ S @ vvh gives the original matrix back.
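As a hedged end-to-end sketch (the matrix A and its random contents are illustrative, not from the original question):

import numpy as np

A = np.random.rand(22, 2)            # an example (22, 2) matrix
uu, s, vvh = np.linalg.svd(A)        # uu: (22, 22), s: (2,), vvh: (2, 2)

# Promote s to a (22, 2) diagonal matrix so the shapes line up
S = np.vstack([np.diag(s), np.zeros((20, 2))])

# The triple product recovers A up to floating-point error
print(np.allclose(uu @ S @ vvh, A))  # True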

Related

Calculate the inverse of a non-square matrix using numpy

Is there a way I can calculate the inverse of an mxn non-square matrix using numpy? Using la.inv(S) gives me the error ValueError: expected square matrix.
You are probably looking for np.linalg.pinv.
For a non-square mxn matrix, we can use np.linalg.pinv(S), where S is the matrix you want to invert; it computes the Moore-Penrose pseudoinverse.
For a square matrix we use np.linalg.inv(S). The inverse of a matrix is such that multiplying it by the original matrix results in the identity matrix.
Note: np is numpy.
We can also use np.linalg.inv(S) for a non-square matrix, but in order not to get an error you would need to slice S down to a square submatrix first.
For more details on np.linalg.pinv : https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html
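As a minimal sketch of the difference (the matrix S below is made up for illustration):

import numpy as np

S = np.random.rand(5, 3)                    # non-square: np.linalg.inv(S) would raise ValueError
S_pinv = np.linalg.pinv(S)                  # Moore-Penrose pseudoinverse, shape (3, 5)

# For a matrix with full column rank, pinv(S) @ S gives the identity
print(np.allclose(S_pinv @ S, np.eye(3)))   # True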

How to apply numpy matrix operations on first two dimensions of 3D array

I have a 3D numpy array which I am using to represent a tuple of (square) matrices, and I'd like to perform a matrix operation on each of those matrices, corresponding to the first two dimensions of the array. For instance, if my list of matrices is [A,B,C] I would like to compute [A'A,B'B,C'C] where ' denotes the conjugate transpose.
The following code kinda sorta does what I'm looking for:
foo=np.array([[[1,1],[0,1]],[[0,1],[0,0]],[[3,0],[0,-2]]])
[np.matrix(j).H*np.matrix(j) for j in foo]
But I'd like to do this using vectorized operations instead of list comprehension.
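One vectorized equivalent, offered only as a sketch (it is not an answer from the original thread): conjugate-transpose every matrix with swapaxes and let the @ operator batch over the first axis.

import numpy as np

foo = np.array([[[1, 1], [0, 1]], [[0, 1], [0, 0]], [[3, 0], [0, -2]]])

# swapaxes(1, 2) transposes each 2x2 matrix; @ broadcasts matmul over axis 0
result = np.conj(foo).swapaxes(1, 2) @ foo   # stacked [A'A, B'B, C'C]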

A Pure Pythonic Pairwise Euclidean distance of rows of a numpy ndarray

I have a matrix of size (n_classes, n_features) and I want to compute the pairwise Euclidean distance of each pair of classes, so the output would be a (n_classes, n_classes) matrix where each cell has the value euclidean_distance(class_i, class_j).
I know there are the scipy spatial distances (http://docs.scipy.org/doc/scipy-0.14.0/reference/spatial.distance.html) and sklearn.metrics.pairwise.euclidean_distances (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.euclidean_distances.html), but I want to use this in Theano, so I need a pure mathematical formula rather than functions that compute the results.
For example, I need a series of transformations like A = X * B, D = X.T - X, results = D.T: something that contains just matrix operations, not function calls.
You can do this using numpy broadcasting as shown in this gist. It should be straightforward to convert this to Theano code, or just reference @eickenberg's comment above, since he's the one who showed me how to do this!
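The gist itself is not reproduced here, but the broadcasting trick it presumably relies on is the standard identity ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j, which uses only matrix operations and so translates directly to Theano; a sketch:

import numpy as np

X = np.random.rand(10, 5)                     # (n_classes, n_features)

# ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j
sq = (X ** 2).sum(axis=1)                     # (n_classes,)
D2 = sq[:, None] + sq[None, :] - 2 * X @ X.T  # squared pairwise distances
D = np.sqrt(np.maximum(D2, 0))                # clamp tiny negatives from rounding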

Faster way to perform point-wise interpolation of numpy array?

I have a 3D datacube, with two spatial dimensions and the third being a multi-band spectrum at each point of the 2D image.
H[x, y, bands]
Given a wavelength (or band number), I would like to extract the 2D image corresponding to that wavelength. This would be simply an array slice like H[:,:,bnd]. Similarly, given a spatial location (i,j) the spectrum at that location is H[i,j].
I would also like to 'smooth' the image spectrally, to counter low-light noise in the spectra. That is, for band bnd, I choose a window of size wind and fit an n-degree polynomial to the spectrum in that window. With polyfit and polyval I can find the fitted spectral value at that point for band bnd.
Now, if I want the whole image of bnd from the fitted value, then I have to perform this windowed-fitting at each (i,j) of the image. I also want the 2nd-derivative image of bnd, that is, the value of the 2nd-derivative of the fitted spectrum at each point.
Running over the points, I could polyfit-polyval-polyder each of the x*y spectra. While this works, it is a point-wise operation. Is there some pytho-numponic way to do this faster?
If you do least-squares polynomial fitting to points (x+dx[i],y[i]) for a fixed set of dx and then evaluate the resulting polynomial at x, the result is a (fixed) linear combination of the y[i]. The same is true for the derivatives of the polynomial. So you just need a linear combination of the slices. Look up "Savitzky-Golay filters".
EDITED to add a brief example of how S-G filters work. I haven't checked any of the details and you should therefore not rely on it to be correct.
So, suppose you take a filter of width 5 and degree 2. That is, for each band (ignoring, for the moment, ones at the start and end) we'll take that one and the two on either side, fit a quadratic curve, and look at its value in the middle.
So, if f(x) ~= ax^2+bx+c and f(-2),f(-1),f(0),f(1),f(2) = p,q,r,s,t then we want 4a-2b+c ~= p, a-b+c ~= q, etc. Least-squares fitting means minimizing (4a-2b+c-p)^2 + (a-b+c-q)^2 + (c-r)^2 + (a+b+c-s)^2 + (4a+2b+c-t)^2, which means (taking partial derivatives w.r.t. a,b,c):
4(4a-2b+c-p)+(a-b+c-q)+(a+b+c-s)+4(4a+2b+c-t)=0
-2(4a-2b+c-p)-(a-b+c-q)+(a+b+c-s)+2(4a+2b+c-t)=0
(4a-2b+c-p)+(a-b+c-q)+(c-r)+(a+b+c-s)+(4a+2b+c-t)=0
or, simplifying,
34a+10c = 4p+q+s+4t
10b = -2p-q+s+2t
10a+5c = p+q+r+s+t
Doubling the last equation and subtracting it from the first gives 14a = 2p-q-2r-s+2t, so a,b,c = (2p-q-2r-s+2t)/14, (2(t-p)+(s-q))/10, (-3p+12q+17r+12s-3t)/35.
And of course c is the value of the fitted polynomial at 0, and therefore is the smoothed value we want. So for each spatial position, we have a vector of input spectral data, from which we compute the smoothed spectral data by multiplying by a matrix whose rows (apart from the first and last couple) look like [0 ... 0 -3/35 12/35 17/35 12/35 -3/35 0 ... 0], with the central 17/35 on the main diagonal of the matrix. (These are the classic Savitzky-Golay smoothing coefficients for window 5, degree 2.)
So you could do a matrix multiplication for each spatial position; but since it's the same matrix everywhere you can do it with a single call to tensordot. So if S contains the transpose of the matrix just described and A is your 3-dimensional data cube, your spectrally-smoothed data cube would be numpy.tensordot(A, S, axes=1), which contracts the band axis of A with the first axis of S (equivalent to A @ S here).
This would be a good point at which to repeat my warning: I haven't checked any of the details in the few paragraphs above, which are just meant to give an indication of how it all works and why you can do the whole thing in a single linear-algebra operation.
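For what it's worth, scipy ships both the coefficients and the filter, so the worked example above can be cross-checked and the whole cube processed in one call; a sketch, assuming a hypothetical cube H with the band axis last:

import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter

# Window 5, degree 2 gives the classic (-3, 12, 17, 12, -3)/35 smoothing weights
print(savgol_coeffs(5, 2))

H = np.random.rand(64, 64, 100)   # hypothetical (x, y, bands) datacube
H_smooth = savgol_filter(H, window_length=5, polyorder=2, axis=2)
H_deriv2 = savgol_filter(H, window_length=5, polyorder=2, deriv=2, axis=2)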

Vectorizing multiplication of matrices with different shapes in numpy/tensorflow

I have a 4x4 input matrix and I want to multiply every 2x2 slice with a weight stored in a 3x3 weight matrix. Please see the attached image for an example:
In the image, the colored section of the 4x4 input matrix is multiplied by the same colored section of the 3x3 weight matrix and stored in the 4x4 output matrix. When the slices overlap, the output takes the sum of the overlaps (e.g. the blue+red).
I am trying to perform this operation in Tensorflow 2.0 using eager tensors (which can be treated as numpy arrays). This is what I've written to perform this operation and it produces the expected output.
inputm = np.ones([4,4]) # initialize 4x4 input matrix
weightm = np.ones([3,3]) # initialize 3x3 weight matrix
outputm = np.zeros([4,4]) # initialize blank 4x4 output matrix
# iterate through each weight
for i in range(weightm.shape[0]):
    for j in range(weightm.shape[1]):
        outputm[i:i+2, j:j+2] += weightm[i,j] * inputm[i:i+2, j:j+2]
However, I don't think this is efficient since I am iterating through the weight matrix one-by-one, and this will be extremely slow when I need to perform this on large matrices of 500x500. I am having a hard time identifying a way to vectorize this operation, maybe tiling the weight matrix to be the same shape as the input matrix and performing a single matrix multiplication. I have also thought about flattening the matrix but I'm still not able to see a way to do this more efficiently.
Any advice will be much appreciated. Thanks in advance!
Alright, I think I have a solution, but it involves using both numpy operations (e.g. np.repeat) and TensorFlow operations (i.e. tf.math.segment_sum). And to warn you, this is not the clearest, most elegant solution in the world, but it was the most elegant I could come up with. So here goes.
The main culprit in your problem is the weight matrix. If you manipulate it into a 4x4 matrix (with the correct sum of weights at each position), you get a weight matrix you can element-wise multiply with the input. And that's my solution. Note that this is designed for the 4x4 problem; you should be able to extend it relatively easily to the 500x500 case.
import numpy as np
import tensorflow as tf
a = np.array([[1,2,3,4],[4,3,2,1],[1,2,3,4],[4,3,2,1]])
w = np.array([[5,4,3],[3,4,5],[5,4,3]])
# We make weights to a 6x6 matrix by repeating 2 times on both axis
w_rep = np.repeat(w,2,axis=0)
w_rep = np.repeat(w_rep,2,axis=1)
# Let's now jump in to tensorflow
tf_a = tf.constant(a)
tf_w = tf.constant(w_rep)
tf_segments = tf.constant([0,1,1,2,2,3])
# This is the trickiest bit: here we use segment_sum to achieve what we need.
# segment_sum adds up segments of rows along the very first dimension of a tensor,
# so we apply it to the repeated weight matrix twice: once directly and once on the transpose.
tf_w2 = tf.math.segment_sum(tf_w, tf_segments)
tf_w2 = tf.transpose(tf_w2)
tf_w2 = tf.math.segment_sum(tf_w2, tf_segments)
tf_w2 = tf.transpose(tf_w2)
print(tf_w2*a)
PS: I will try to include an illustration of what's going on here in a future edit. But I reckon that will take some time.
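As an aside (not part of the original answer), the same segment reduction can be done in plain numpy with np.add.reduceat, avoiding the TensorFlow round-trip:

import numpy as np

w = np.array([[5, 4, 3], [3, 4, 5], [5, 4, 3]])
w_rep = np.repeat(np.repeat(w, 2, axis=0), 2, axis=1)   # 6x6 repeated weights

# Sum row segments [0:1], [1:3], [3:5], [5:6], then the same on the columns
w2 = np.add.reduceat(w_rep, [0, 1, 3, 5], axis=0)
w2 = np.add.reduceat(w2, [0, 1, 3, 5], axis=1)          # 4x4 summed weights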
After realising @thushv89's trick, I realised you can get the same result by convolving the weight matrix with a matrix of ones:
import numpy as np
from scipy.signal import convolve2d
a = np.ones([4,4]) # initialize 4x4 input matrix
w = np.ones([3,3]) # initialize 3x3 weight matrix
b = np.multiply(a, convolve2d(w, np.ones((2,2))))
print(b)
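As a quick sanity check (a sketch reusing the loop from the question), the two approaches agree:

import numpy as np
from scipy.signal import convolve2d

inputm = np.arange(16, dtype=float).reshape(4, 4)
weightm = np.arange(9, dtype=float).reshape(3, 3)

# Reference: the original loop from the question
outputm = np.zeros([4, 4])
for i in range(weightm.shape[0]):
    for j in range(weightm.shape[1]):
        outputm[i:i+2, j:j+2] += weightm[i, j] * inputm[i:i+2, j:j+2]

# Vectorized: element-wise multiply by the convolved weight matrix
b = inputm * convolve2d(weightm, np.ones((2, 2)))
print(np.allclose(outputm, b))  # True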