Is there a way to avoid multiplying by zeros inside an inner loop? As a laughable test I tried a conditional that skips the multiplication when it encounters a zero, and of course this is slower than just doing the multiplication. My preference is to leave the LU matrix intact, rather than rearranging it to make the zeros disappear (sparse storage). In this instance the language is VBA, prior to conversion to VB.NET.
For k = 1 To i - 1
If LU(j, k) <> 0 and LU(k, i) <> 0 Then temp = temp - LU(j, k) * LU(k, i)
Next k
Thanks.
It is impossible to avoid multiplying by zeroes if you want to preserve the matrix structure.
Furthermore, sparse matrices are not supported in VBA, so you would have to code your own sparse matrix class. The idea is that instead of storing the entire matrix, you store only index/value pairs for the nonzero entries.
A sparse matrix class would include methods to:
create a matrix with given values in index/value form.
create a sparse matrix from given values in array form.
multiply two sparse matrices (including the special case sparse matrix times sparse vector)
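To make the index/value idea concrete, here is a minimal sketch in Python rather than VBA. The class name `SparseMatrix` and its methods are hypothetical and only illustrate the three operations listed above; this is not a tuned implementation.

```python
class SparseMatrix:
    """Hypothetical index/value storage: only nonzero entries are kept."""

    def __init__(self, shape, values=None):
        self.shape = shape
        # Maps (row, col) -> nonzero value.
        self.values = dict(values or {})

    @classmethod
    def from_dense(cls, rows):
        """Build from a list of lists, dropping the zeros."""
        m = cls((len(rows), len(rows[0])))
        for i, row in enumerate(rows):
            for j, v in enumerate(row):
                if v != 0:
                    m.values[(i, j)] = v
        return m

    def matvec(self, x):
        """Multiply by a dense vector, visiting stored entries only."""
        y = [0] * self.shape[0]
        for (i, j), v in self.values.items():
            y[i] += v * x[j]
        return y


dense = [[2, 0, 0],
         [0, 0, 3],
         [0, 4, 0]]
m = SparseMatrix.from_dense(dense)
result = m.matvec([1, 2, 3])
print(result)
```

The multiplication loop never sees a zero because zeros are never stored, which is the only way to skip them without a per-element test.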
Macroman, I mean to skip the calculation if zeros encountered to speed up solution. Thanks.
Titus, I already wrote a fully pivoted LUD solver for VBA which can (slowly) solve sparse matrices. I just wanted to see if it was feasible to convert the solver to sparse techniques. Memory is not an issue, hence my preference to avoid the index/value storage technique, so I just wanted to see if there was a fast way to make the solver skip zeros to speed it up. Thanks.
I am using sparse matrices in python, namely
scipy.sparse.csr_matrix
I am in principle free to choose the exact sparse implementation, as long as the matrices support matrix-vector multiplication and addition/subtraction of matrices with the same sparsity pattern. Currently, at every time step, I construct a new sparse matrix from scratch and add it to the existing matrix. I believe that my code could be unnecessarily losing time due to
Construction time of sparse matrix
Addition of the sparse matrices, assuming that the underlying algorithm inside CSR matrix implementation has to find matching sparse entries before adding them up.
My guess would be that the sparse matrix is internally stored as a numpy array of values + a few index arrays denoting where those values are located. The question is if it is possible to directly add the underlying value arrays without touching the sparsity structure. Is something like this possible?
new_values = np.linspace(0, num_values)
csr_mat.val += new_values
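For reference, a minimal sketch of what this could look like with scipy. The value array of a `csr_matrix` is exposed as the `.data` attribute (there is no `.val`). The sketch assumes that `new_values` has one entry per stored element, in the same order as `csr_mat.data`; that holds only when both matrices share the sparsity pattern and the same ordering of stored entries.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 5.0, 0.0]])
csr_mat = csr_matrix(dense)

# One new value per stored entry, in the order of csr_mat.data.
new_values = np.linspace(0.0, 1.0, csr_mat.nnz)

# In-place update of the value array; indices and indptr are untouched.
csr_mat.data += new_values

print(csr_mat.toarray())
```

If the two matrices were built separately, calling `sum_duplicates()` and `sort_indices()` on both first is one way to make the stored order comparable.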
Quite simply, what I want to do is the following
A = np.ones((3,3)) #arbitrary matrix
B = np.ones((2,2)) #arbitrary matrix
A[1:,1:] = A[1:,1:] + B
except in Tensorflow (where the matrices can be arbitrarily complicated tensor expressions). Neither A nor B is a Tensorflow Variable, but just a run-of-the-mill tensor.
What I have gathered so far: tensors are immutable, so I cannot assign to a submatrix. tf.scatter_nd is the current option for sub-assignment, but does not appear to support sub-matrices, only slices.
Methods that should work, but are perhaps not ideal:
I could pad B with zeros, but I'm sure this leads to instantiation of an unnecessarily large B - can it be made sparse, maybe?
I could use the padding idea, but write it as a low-rank decomposition, e.g. in Numpy: A + U.dot(B).dot(U.T), where U is a zero matrix stacked on top of an identity matrix. I'm not sure this is actually advantageous.
I could split A into submatrices, and stack them back together. Might be the most efficient, but sounds like the code would be convoluted.
Ideally, I want to do this operation N times for progressively smaller matrices, resulting in one large final result, but this is tangential.
I'll use one of the hacks for now, but I'm hoping someone can tell me what the idiomatic version is!
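For concreteness, the padding idea and the low-rank formulation can be checked against the slice assignment in NumPy. This is only a sketch of the arithmetic, not TensorFlow code; in TensorFlow the padding step would presumably be written with `tf.pad`.

```python
import numpy as np

A = np.ones((3, 3))
B = np.ones((2, 2))

# Reference result using in-place slice assignment.
expected = A.copy()
expected[1:, 1:] += B

# Padding idea: add one zero row on top and one zero column on the left of B.
padded = A + np.pad(B, ((1, 0), (1, 0)), mode="constant")

# Low-rank idea: U is a zero row stacked on an identity, so U @ B @ U.T
# equals the zero-padded B.
U = np.vstack([np.zeros((1, 2)), np.eye(2)])
low_rank = A + U @ B @ U.T

print(np.array_equal(padded, expected), np.allclose(low_rank, expected))
```

Both variants build a full-size intermediate, so neither avoids the memory cost of the padded B.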
I have to operate on matrices using an equivalent of scipy's sparse.coo_matrix and sparse.csr_matrix. However, I cannot use scipy (it is incompatible with the image analysis software I want to use this in). I can, however, use numpy.
Is there an easy way to accomplish what scipy.sparse.coo_matrix and scipy.sparse.csr_matrix do, with numpy only?
Thanks!
The attributes of a sparse.coo_matrix are:
dtype : dtype
Data type of the matrix
shape : 2-tuple
Shape of the matrix
ndim : int
Number of dimensions (this is always 2)
nnz
Number of nonzero elements
data
COO format data array of the matrix
row
COO format row index array of the matrix
col
COO format column index array of the matrix
The data, row, col arrays are essentially the data, i, j parameters when defined with coo_matrix((data, (i, j)), [shape=(M, N)]). shape also comes from the definition, and dtype from the data array. nnz, as a first approximation, is the length of data (not accounting for explicit zeros and duplicate coordinates).
So it is easy to construct a coo like object. Similarly a lil matrix has 2 lists of lists. And a dok matrix is a dictionary (see its .__class__.__mro__).
The data structure of a csr matrix is a bit more obscure:
data
CSR format data array of the matrix
indices
CSR format index array of the matrix
indptr
CSR format index pointer array of the matrix
It still has 3 arrays. And they can be derived from the coo arrays. But doing so with pure Python code won't be nearly as fast as the compiled scipy functions.
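As a rough sketch of that derivation with NumPy only: the helper names `coo_to_csr` and `csr_matvec` below are hypothetical, and the conversion assumes there are no duplicate coordinates (scipy sums duplicates, this sketch does not).

```python
import numpy as np


def coo_to_csr(data, row, col, n_rows):
    """Convert COO arrays to CSR arrays (data, indices, indptr)."""
    data, row, col = np.asarray(data), np.asarray(row), np.asarray(col)
    order = np.lexsort((col, row))               # sort by row, then by column
    counts = np.bincount(row, minlength=n_rows)  # stored entries per row
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return data[order], col[order], indptr


def csr_matvec(data, indices, indptr, x):
    """Compute y = A @ x; row i owns data[indptr[i]:indptr[i + 1]]."""
    products = data * np.asarray(x)[indices]
    y = np.zeros(len(indptr) - 1, dtype=products.dtype)
    # Row number of every stored entry, recovered from the row lengths.
    rows = np.repeat(np.arange(len(y)), np.diff(indptr))
    np.add.at(y, rows, products)
    return y


# Entries (2,1)=3, (0,0)=1, (0,2)=2, (1,2)=4 of a 3x3 matrix.
data, indices, indptr = coo_to_csr([3.0, 1.0, 2.0, 4.0],
                                   [2, 0, 0, 1],
                                   [1, 0, 2, 2], n_rows=3)
y = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0])
print(indptr, indices, y)
```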
But these classes have a lot of functionality that would require a lot of work to duplicate. Some is pure Python, but critical pieces are compiled for speed. Particularly important are the mathematical operations that the csr_matrix implements, such as matrix multiplication.
Replicating the data structures for temporary storage is one thing; replicating the functionality is quite another.
I have two large square sparse matrices, A & B, and need to compute the following: A * B^-1 in the most efficient way. I have a feeling that the answer involves using scipy.sparse, but can't for the life of me figure it out.
After extensive searching, I have run across the following thread: Efficient numpy / lapack routine for product of inverse and sparse matrix? but can't figure out what the most efficient way would be.
Someone suggested using LU decomposition, which is built into the sparse module of scipy, but when I try to do LU on a sample matrix it says the result is singular (although when I just compute A * B^-1 I get an answer). I have also heard someone suggest using linalg.spsolve(), but I can't figure out how to apply it, as it requires a vector as the second argument.
If it helps, once I have the solution s.t. A * B^-1 = C, I only need to know the value for one row of the matrix C. The matrices will be roughly 1000x1000 to 1500x1500.
Actually 1000x1000 matrices are not that large. You can compute the inverse of such a matrix using numpy.linalg.inv(B) in less than 1 second on a modern desktop computer.
But you can be much more efficient if you rewrite your problem taking into account the fact that you only need one row of C (this is actually very often the case).
Let us write d_i = [0 0 0 ... 0 1 0 ... 0 ], a vector whose only nonzero entry is a 1 in the i-th position.
You can write, if ^t denotes the transpose :
AB^-1 = C <=> A = CB <=> A^t = B^t C^t
For the i-th row :
A^t d_i = B^t C^t d_i <=> a_i = B^t c_i
So you have a linear inverse problem which can be solved using numpy.linalg.solve
c_i = np.linalg.solve(B.T, A[i])  # A[i] is a_i, the i-th row of A
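A small check of this identity on dense random matrices. The sizes, the seed and the diagonal shift that keeps B well conditioned are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, i = 5, 2
A = rng.standard_normal((n, n))
# Shift the diagonal so that B is comfortably invertible.
B = rng.standard_normal((n, n)) + n * np.eye(n)

# Row i of C = A B^-1 solves B^T c_i = a_i, where a_i is row i of A.
c_i = np.linalg.solve(B.T, A[i])

# Reference: form the full product with an explicit inverse.
C = A @ np.linalg.inv(B)

print(np.allclose(c_i, C[i]))
```

The single solve replaces a full inversion plus a matrix product, which is where the saving comes from when only one row of C is needed.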
I am computing a similarity matrix based on Euclidean distance in MATLAB. My code is as follows:
for i=1:N % M,N is the size of the matrix x for whose elements I am computing similarity matrix
for j=1:N
D(i,j) = sqrt(sum((x(:,i)-x(:,j)).^2)); % D is the similarity matrix
end
end
Can anyone help with optimizing this, i.e. reducing the for loops, as my matrix x is of dimension 256x30000?
Thanks a lot!
--Aditya
The function to do so in MATLAB is called pdist. Unfortunately it is painfully slow and doesn't take advantage of MATLAB's vectorization abilities.
The following is code I wrote for a project. Let me know what kind of speed up you get.
Qx=repmat(dot(x,x,2),1,size(x,1));
D=sqrt(Qx+Qx'-2*x*x');
Note though that this will only work if your data points are in the rows and your dimensions are in the columns. For example, say I have 256 data points and 100000 dimensions: on my Mac, using x=rand(256,100000), the above code produces a 256x256 matrix in about half a second.
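The two lines above rely on the expansion |a - b|^2 = |a|^2 + |b|^2 - 2 a.b. A NumPy sketch of the same trick, checked against an explicit double loop, is given here for illustration; the array sizes are arbitrary, and the clipping step is an addition that guards against tiny negative values caused by rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((6, 4))  # 6 points (rows), 4 dimensions (columns)

# Squared norm of every point, then the expanded squared distance.
sq_norms = np.sum(x * x, axis=1)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)

# Rounding can leave tiny negative values; clip before the square root.
D = np.sqrt(np.maximum(sq_dists, 0.0))

# Reference: explicit double loop over all pairs of points.
D_loop = np.array([[np.linalg.norm(a - b) for b in x] for a in x])

print(np.allclose(D, D_loop, atol=1e-6))
```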
There's probably a better way to do it, but the first thing I noticed was that you could cut the runtime in half by exploiting the symmetry D(i,j)==D(j,i).
You can also use the function norm(x(:,i)-x(:,j),2)
I think this is what you're looking for.
D=zeros(N);
jIndx=repmat(1:N,N,1); iIndx=jIndx';
D(:)=sqrt(sum((x(iIndx(:),:)-x(jIndx(:),:)).^2,2));
Here, I have assumed that x is initialized as an NxM array, where M is the number of dimensions of the system and N is the number of points. So if your ordering is different, you'll have to make changes accordingly.
To start with, you are computing twice as much as you need to here, because D will be symmetric. You don't need to calculate the (i,j) entry and the (j,i) entry separately. Change your inner loop to for j=1:i, and add in the body of that loop D(j,i)=D(i,j);
After that, there's really not much redundancy left in what that code does, so your only room left for improvement is to parallelize it: if you have the Parallel Computing Toolbox, convert your outer loop to a parfor and before you run it, say matlabpool(n), where n is the number of threads to use.