I have a matrix X of size (1875, 77). For each column, I want to compute the covariance matrix, i.e., x_1 @ x_1.T where x_1 has shape (1875, 1). Ideally, I want to do this in one go, without a for loop. Is there an easy way to do this?
I was thinking about padding each column with zeros above and below based on its column index (so x_1 would get 76 zero-column pads, x_2 would get one (77, 1) zero pad on top and 75 below, and so on), but this seems to complicate things more.
You probably want this:
import numpy as np
r, c = 1875, 77
X = np.random.rand(r, c)
covs = X.T[..., None] @ X.T[:, None, :]
covs.shape
# (77, 1875, 1875)
This performs c matrix multiplications, one for each column of X. Here X.T[..., None] has shape (c, r, 1) and X.T[:, None, :] has shape (c, 1, r), which makes the batched matrix multiplication between them compatible.
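As a quick sanity check, here is a minimal sketch (with smaller, made-up dimensions so a loop comparison stays cheap) confirming that each slice of covs is the outer product of the corresponding column:
import numpy as np

r, c = 10, 4  # reduced sizes for the check
X = np.random.rand(r, c)
covs = X.T[..., None] @ X.T[:, None, :]   # shape (c, r, r)

# each slice matches the per-column outer product
for k in range(c):
    assert np.allclose(covs[k], np.outer(X[:, k], X[:, k]))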
I have an idea for a tensor operation that would not be difficult to implement via iteration, with batch size one. However I would like to parallelize it as much as possible.
I have two tensors with shape (n, 5) called X and Y. X is actually supposed to represent 5 one-dimensional tensors with shape (n, 1): (x_1, ..., x_5). Ditto for Y.
I would like to compute a tensor with shape (n, 25) where each column represents the output of the tensor operation f(x_i, y_j), where f is fixed for all 1 <= i, j <= 5. The operation f has output shape (n, 1), just like x_i and y_j.
I feel it is important to clarify that f is essentially a fully-connected layer from the concatenated [...x_i, ...y_j] tensor with shape (1, 10) to an output layer with shape (1, 5).
Again, it is easy to see how to do this manually with iteration and slicing. However, this is probably very slow. Performing this operation in batches, where the tensors X, Y now have shape (n, 5, batch_size), is also desirable, particularly for mini-batch gradient descent.
It is difficult to really articulate here why I desire to create this network; I feel it is suited for my domain of 'itemized tabular data' and cuts down significantly on the number of weights per operation, compared to a fully connected network.
Is this possible using tensorflow? Certainly not using just keras.
Below is an example in numpy, per AloneTogether's request:
import numpy as np

features = 16
batch_size = 256

X_batch = np.random.random((features, 5, batch_size))
Y_batch = np.random.random((features, 5, batch_size))

# one tensor operation to reduce weights in this custom 'layer'
f = np.random.random((features, 2 * features))

for b in range(batch_size):
    X = X_batch[:, :, b]
    Y = Y_batch[:, :, b]
    for i in range(5):
        x_i = X[:, i:i+1]
        for j in range(5):
            y_j = Y[:, j:j+1]
            x_i_y_j = np.concatenate([x_i, y_j], axis=0)
            # f(x_i, y_j), implemented by a fully-connected layer
            f_i_j = np.matmul(f, x_i_y_j)
All operations you need (concatenation and matrix multiplication) can be batched.
The difficult part is that you want to concatenate the features of every item in X with the features of every item in Y (all combinations).
My recommended solution is to expand the dimensions of X to [batch, features, 5, 1] and the dimensions of Y to [batch, features, 1, 5].
Then tf.repeat() both tensors so their shapes become [batch, features, 5, 5].
Now you can concatenate X and Y, giving a tensor of shape [batch, 2*features, 5, 5]. Observe that this way all combinations are built.
The next step is the matrix multiplication. tf.matmul() can also do batched matrix multiplication, but I use tf.einsum() here because I want more control over which dimensions are treated as batch dimensions.
Full code:
import tensorflow as tf
import numpy as np

batch_size = 3
features = 6
items = 5

x = np.random.uniform(size=[batch_size, features, items])
y = np.random.uniform(size=[batch_size, features, items])
f = np.random.uniform(size=[2 * features, features])

# build all (i, j) combinations: both become [batch, features, items, items]
x_reps = tf.repeat(x[:, :, :, tf.newaxis], items, axis=3)
y_reps = tf.repeat(y[:, :, tf.newaxis, :], items, axis=2)

# concatenate along the feature axis: [batch, 2*features, items, items]
xy_conc = tf.concat([x_reps, y_reps], axis=1)

# contract the 2*features axis with the weight matrix: [batch, features, items, items]
f_i_j = tf.einsum("bfij,fg->bgij", xy_conc, f)
f_i_j = tf.reshape(f_i_j, [batch_size, features, items * items])
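As an optional sanity check (run right after the snippet above; the indices b, i, j below are arbitrary picks), a single (i, j) slice should match the loop formulation from the question. Note that f here has shape [2*features, features], i.e. it plays the role of the transpose of the question's weight matrix:
# check one combination against the hand-rolled version
b, i, j = 0, 1, 2
xy = np.concatenate([x[b, :, i], y[b, :, j]])   # shape (2*features,)
expected = f.T @ xy                             # shape (features,)
assert np.allclose(f_i_j[b, :, i * items + j], expected)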
I am trying to reconstruct the following matrix of shape (256 x 256 x 2) from its SVD components:
U.shape = (256, 256, 256)
s.shape = (256, 2)
vh.shape = (256, 2, 2)
I have already tried methods from the documentation of numpy and scipy to reconstruct the original matrix, but failed multiple times; I think a 3D matrix may need a different way of reconstruction.
I am using numpy.linalg.svd for the decomposition.
From np.linalg.svd's documentation:
"... If a has more than two dimensions, then broadcasting rules apply, as explained in :ref:routines.linalg-broadcasting. This means that SVD is
working in "stacked" mode: it iterates over all indices of the first
a.ndim - 2 dimensions and for each combination SVD is applied to the
last two indices."
This means that you only need to handle the s matrix (or tensor, in the general case) to obtain the right result. More precisely, what you need to do is pad s appropriately and then take only the first 2 columns (in general, the number of rows of vh, which equals the number of columns of the returned s).
Here is working code with an example for your case:
import numpy as np

mat = np.random.randn(256, 256, 2)  # your matrix of shape 256 x 256 x 2
u, s, vh = np.linalg.svd(mat)       # stacked SVD: u (256, 256, 256), s (256, 2), vh (256, 2, 2)

# pad each singular-value vector, build the diagonal matrix and take only the first 2 columns:
s_rep = np.apply_along_axis(
    lambda _s: np.diag(np.pad(_s, (0, u.shape[1] - _s.shape[0])))[:, :_s.shape[0]], 1, s)

mat_reconstructed = u @ s_rep @ vh
mat_reconstructed equals mat up to floating-point precision error.
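As a side note, an equivalent and arguably simpler route (a sketch under the same shapes as above, run after the snippet) is the reduced reconstruction, which skips the padding by keeping only the first 2 columns of u:
# reduced reconstruction: scale the first 2 columns of u by s, then multiply by vh
mat_reduced = (u[..., :2] * s[:, None, :]) @ vh
assert np.allclose(mat_reduced, mat)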
I have a tensor phi = np.random.rand(n, n, 3) and a matrix D = np.random.rand(3, 3). I want to multiply the matrix D along the last axis of phi so that the output has shape (n, n, 3). I have tried this
np.einsum("klj,ij->kli", phi, D)
But I am not confident in this notation at all. Basically I want to do
res = np.zeros_like(phi)
for i in range(n):
    for j in range(n):
        res[i, j, :] = D.dot(phi[i, j, :])
You are treating phi as an n, n array of vectors, each of which is to be left-multiplied by D. So you want to keep the n, n portion of the shape exactly as-is. The last (only) dimension of the vectors should be multiplied and summed with the last dimension of the matrix (the vectors are implicitly 3x1):
np.einsum('ijk,lk->ijl', phi, D)
OR
np.einsum('ij,klj->kli', D, phi)
It's likely much simpler to use broadcasting with np.matmul (the @ operator):
np.squeeze(D @ phi[..., None], axis=-1)
You can omit the squeeze if you don't mind the extra unit dimension at the end.
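For completeness, here is a small self-contained sketch (with an arbitrary n = 4) that checks both einsum spellings and the matmul form against the explicit loop:
import numpy as np

n = 4
phi = np.random.rand(n, n, 3)
D = np.random.rand(3, 3)

# explicit loop as the reference
res = np.zeros_like(phi)
for i in range(n):
    for j in range(n):
        res[i, j, :] = D.dot(phi[i, j, :])

assert np.allclose(res, np.einsum('ijk,lk->ijl', phi, D))
assert np.allclose(res, np.einsum('ij,klj->kli', D, phi))
assert np.allclose(res, np.squeeze(D @ phi[..., None], axis=-1))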
Whenever I'm finding the covariance of 2 arrays, I've always seen it done like this:
(np.cov(X,Y)[0,1])
What purpose does the [0,1] serve?
For two 1d arrays x and y, np.cov(x, y) returns:
np.array([[variance(x),      covariance(x, y)],
          [covariance(y, x), variance(y)     ]])
Thus for the covariance, you need the [0,1] term.
When called as np.cov(x, y), numpy effectively computes np.cov(X) where X = np.stack((x, y), axis=0), i.e. x and y become the two rows of X.
The confusion occurs because np.cov(X) is really designed for many vectors at once: with X.shape = (m, n), np.cov(X)[i, j] for i, j < m is the covariance between rows i and j, and the covariance of row i with itself is just the variance of row i.
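A minimal sketch (with made-up data) tying the [0, 1] entry back to the textbook formula, assuming numpy's default normalization by N - 1:
import numpy as np

x = np.random.rand(100)
y = np.random.rand(100)

C = np.cov(x, y)   # 2 x 2 matrix of (co)variances

# manual covariance with the same normalization (ddof=1, np.cov's default)
manual = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)

assert np.isclose(C[0, 1], manual)         # [0, 1] is cov(x, y)
assert np.isclose(C[0, 0], x.var(ddof=1))  # [0, 0] is var(x)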
Suppose I have two numpy arrays x and y of shape N, which I want to represent as size N x 1 each, and I want to multiply them as x y' to get a matrix of size N x N. But if I try:
np.dot(x, y.T) or np.dot(x.T, y)
I always get a scalar (size 1 x 1).
Is it possible to specify to numpy to multiply two arrays along a particular dimension?
To clarify, suppose I have
x = [x1, x2]
y = [y1, y2]
I want
xy' = [[x1*y1, x1*y2], [x2*y1, x2*y2]]
but numpy always seems to return
xy' = x1*y1+x2*y2
You want np.outer(x, y). You can also do it with broadcasting:
x[:, None] * y
which is more flexible.
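A tiny sketch (with made-up 2-element vectors) showing that both forms give the matrix from the question:
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

print(np.outer(x, y))
# [[3. 4.]
#  [6. 8.]]

# broadcasting a column vector against y gives the same outer product
assert np.array_equal(np.outer(x, y), x[:, None] * y)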