TensorFlow: efficient way to subtract a vector from a matrix - tensorflow

I have an MxN matrix and a length-N vector. What I want to do is subtract this vector from each of the M rows of the matrix. The obvious way to do this is to tf.tile the vector, but this seems highly inefficient because of all the new memory allocation: tiling takes up to 4x as much time as the actual subtraction.
Is there any way to do it more efficiently, or will I have to write my own operation in C++?
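For illustration only: TensorFlow's elementwise operators follow NumPy-style broadcasting, so a plain matrix - vector on shapes (M, N) and (N,) should give the same result as the tiled version without materialising M copies of the vector. The sketch below uses NumPy as a stand-in to show the equivalence. The array names and sizes are made up for the example, and it makes no claim about the relative timings in TensorFlow.

```python
import numpy as np

rng = np.random.default_rng(0)
matrix = rng.standard_normal((4, 3))  # M x N, with M=4 and N=3 (arbitrary sizes)
vector = rng.standard_normal(3)       # length N

# Explicit tiling: builds a full M x N copy of the vector before subtracting.
tiled_result = matrix - np.tile(vector, (matrix.shape[0], 1))

# Broadcasting: the length-N vector is applied to every row, no explicit copy.
broadcast_result = matrix - vector

results_match = np.allclose(tiled_result, broadcast_result)
```

In TensorFlow the corresponding expression would simply be matrix - vector on two tensors with those shapes.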

Related

computational complexity of higher order derivatives with AD in jax

Let f: R -> R be an infinitely differentiable function. What is the computational complexity of calculating the first n derivatives of f in JAX? A naive chain rule would suggest that each multiplication gives a factor of 2 increase, hence the nth derivative would require at least 2^n more operations. I imagine though that clever manipulation of formal series would reduce the number of required calculations and eliminate duplications, especially if the derivatives are JAX jitted? Is there a difference between the JAX, TensorFlow and Torch implementations?
https://openreview.net/forum?id=SkxEF3FNPH discusses this topic, but doesn't provide a computational complexity.
What is the computational complexity of calculating the first n derivatives of f in Jax?
There's not much you can say in general about the computational complexity of Nth derivatives. For example, with a function like jnp.sin, the Nth derivative is O(1), oscillating between negative and positive sin and cos calls as N grows. For an order-k polynomial, the Nth derivative is identically zero for N > k, so its cost is trivially constant. Other functions may have complexity that is linear or polynomial or even exponential in N, depending on the operations they contain.
I imagine though that clever manipulation of formal series would reduce the number of required calculations and eliminate duplications, especially if the derivatives are JAX jitted
You imagine correctly! One implementation of this idea is the jax.experimental.jet module, which is an experimental transform designed for computing higher-order derivatives efficiently and accurately. It doesn't cover all JAX functions, but it may be complete enough to do what you have in mind.
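To make the "formal series" idea concrete, here is a hand-rolled sketch in plain Python. It is not how jax.experimental.jet is implemented and does not use its API. The helper names series_mul and derivatives_of_cube are invented for this example. It propagates a truncated Taylor series through f(x) = x^3, where each multiplication is a Cauchy product costing O(n^2) for n coefficients, rather than doubling the work at every derivative order.

```python
from math import factorial

def series_mul(a, b):
    """Cauchy product of two truncated Taylor series of equal length.

    Costs O(n^2) multiplications for n coefficients.
    """
    n = len(a)
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

def derivatives_of_cube(x0, order):
    """Return [f(x0), f'(x0), ..., f^(order)(x0)] for f(x) = x**3."""
    # Taylor series of the identity around x0 is x0 + 1*t, padded with zeros.
    x = ([x0, 1.0] + [0.0] * order)[: order + 1]
    cube = series_mul(series_mul(x, x), x)
    # The k-th Taylor coefficient is f^(k)(x0) / k!, so multiply back by k!.
    return [c * factorial(k) for k, c in enumerate(cube)]

derivs = derivatives_of_cube(2.0, 3)
```

For x0 = 2 this yields the values of x^3, 3x^2, 6x and 6, all obtained in a single pass instead of by differentiating three times in sequence.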

numpy difference between fft and rfft

I'm trying to understand the difference between numpy fft and rfft. I've read the doc, and it only says rfft is meant for real inputs.
I've tested their performance on a large real array and found that rfft is faster than fft by about a third. My question is: why is rfft faster? Thanks!
An RFFT has half the degrees of freedom on the input, and half the number of complex outputs, compared to an FFT. Thus the FFT computation tree can be pruned to remove the adds and multiplies that are not needed for the non-existent inputs, and/or those that are unnecessary because fewer independent output values need to be computed.
This is because an FFT of a strictly real input (i.e. all the imaginary components of the input values are zero) produces a complex-conjugate mirrored result, where each half can be trivially derived from the other half.
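The symmetry described above can be checked directly. The following is a small sketch, with an arbitrary length n = 16 and random test data chosen only for illustration. It compares np.fft.fft and np.fft.rfft on the same real input.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                       # arbitrary even length for the demo
x = rng.standard_normal(n)   # strictly real input

full = np.fft.fft(x)         # n complex outputs
half = np.fft.rfft(x)        # only n // 2 + 1 complex outputs

# rfft returns exactly the first n // 2 + 1 bins of the full FFT.
same_half = np.allclose(half, full[: n // 2 + 1])

# The remaining bins are conjugate mirrors: full[k] == conj(full[n - k]).
mirrored = np.allclose(full[1:], np.conj(full[1:][::-1]))
```

Both checks come out true for real input, which is why the second half of the full FFT carries no extra information.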

Time complexity (Big-O notation) of Posterior Probability Calculation

I got a basic idea of Big-O notation from Big-O notation's definition.
In my problem, a 2-D surface is divided into M uniform grids. Each grid (m) is assigned a posterior probability based on A features.
The posterior probability of m grid is calculated as follows:
and the marginal likelihood is given as:
Here, the A features are independent of each other, and the sigma and mean symbols represent the standard deviation and mean value of each feature a at each grid. I need to calculate the posterior probability of all M grids.
What will be the time complexity of the above operation in terms of Big-O notation?
My guess is O(M) or O(M+A). Am I correct? I'm hoping for an authoritative answer that I can present in a formal forum.
Also, what will the time complexity be if the M grids are divided into T clusters where every cluster has Q grids (Q << M), i.e. calculating the posterior probability only on Q grids out of M grids?
Thank you very much.
Discrete sums and products can be understood as loops. If you are happy with floating-point approximation, most other operators are typically O(1), and a conditional probability looks like a function call. Just plug the constants and variables into your equation and you'll get the expected Big-O; the details of the formula are irrelevant. Also be aware that these "loops" can often be simplified using mathematical properties.
If the result is not obvious, please convert your mathematical formula above into actual code in a programming language. In computer science, Big-O is never about a formula but about an actual translation of it into programming steps; depending on the implementation, the same formula can lead to very different execution complexities. As different, for instance, as adding integers by actually performing the sum, which is O(n), or applying Gauss's formula, which is O(1).
By the way, why are you doing a discrete sum over a discrete domain N? Shouldn't it be M?
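The integer-sum example can be written out as follows. This is a minimal sketch with made-up function names, showing two implementations of the same formula with different complexities.

```python
def sum_by_loop(n):
    """Add 1 + 2 + ... + n term by term: n additions, so O(n)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_by_formula(n):
    """Gauss's closed form n * (n + 1) / 2: a fixed number of operations, so O(1)."""
    return n * (n + 1) // 2

loop_result = sum_by_loop(100)
formula_result = sum_by_formula(100)
```

Both return the same value; only the number of steps taken to get there differs.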

Most efficient way to add two CSR sparse matrices with the same sparsity pattern in python

I am using sparse matrices in Python, namely scipy.sparse.csr_matrix.
I am in principle free to choose the exact sparse implementation, as long as the matrices support matrix-vector multiplication and addition/subtraction of matrices with the same sparsity pattern. Currently, at every time step, I construct a new sparse matrix from scratch and add it to the existing matrix. I believe that my code could be unnecessarily losing time due to:
Construction time of the sparse matrix.
Addition of the sparse matrices, assuming that the underlying algorithm inside the CSR matrix implementation has to find matching sparse entries before adding them up.
My guess would be that the sparse matrix is internally stored as a numpy array of values + a few index arrays denoting where those values are located. The question is if it is possible to directly add the underlying value arrays without touching the sparsity structure. Is something like this possible?
new_values = np.linspace(0, num_values)
csr_mat.val += new_values
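As a hedged sketch of what the snippet above is reaching for: in SciPy, a csr_matrix exposes its stored values as the .data attribute (not .val), alongside the .indices and .indptr index arrays. Assuming two matrices really do share identical index arrays, the values can be added in place. The helper name add_same_pattern_inplace and the small test matrices are invented for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix

def add_same_pattern_inplace(target, other):
    """Add other's stored values into target, assuming identical CSR structure."""
    # Guard against silently adding mismatched entries.
    if not (np.array_equal(target.indptr, other.indptr)
            and np.array_equal(target.indices, other.indices)):
        raise ValueError("matrices do not share the same sparsity pattern")
    target.data += other.data
    return target

# Two 3x3 matrices built from the same (row, col) positions.
rows = np.array([0, 0, 1, 2])
cols = np.array([0, 2, 1, 2])
a = csr_matrix((np.array([1.0, 2.0, 3.0, 4.0]), (rows, cols)), shape=(3, 3))
b = csr_matrix((np.array([10.0, 20.0, 30.0, 40.0]), (rows, cols)), shape=(3, 3))

expected = (a + b).toarray()   # reference result from the regular sparse addition
add_same_pattern_inplace(a, b)
```

The structure check costs a pass over the index arrays; if the pattern is known to be fixed, one could skip it and keep only the new value array at each time step.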

How to optimize a matrix multiplication in MATLAB?

Can I somehow optimize this formula? I evaluate it many times and it takes a lot of time...
w - 1xN double
phis - NxN double
x - Nx2 double
sum(w(ones([size(x, 1) 1]),:) .* phis, 2)
You're taking the scalar product of each row of phis with w. You can do this easily using linear algebra.
out = phis * w';
This matrix multiplication saves you the calls to sum, ones, and size, which should make your code a lot faster. Furthermore, linear algebra operations are often very fast in MATLAB, since that's what the program was historically optimized for.
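The same identity can be checked outside MATLAB. The sketch below is a NumPy translation, with an arbitrary N = 5 and random data chosen only for illustration. It confirms that the row-wise weighted sum equals the matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                               # arbitrary size for the demo
w = rng.standard_normal(N)          # plays the role of the 1xN vector w
phis = rng.standard_normal((N, N))  # NxN matrix

# Original approach: replicate w along the rows, multiply elementwise, sum each row.
tiled_sum = np.sum(np.tile(w, (N, 1)) * phis, axis=1)

# Linear-algebra approach: one matrix-vector product, like phis * w' in MATLAB.
matvec = phis @ w

results_match = np.allclose(tiled_sum, matvec)
```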