Is it possible to calculate covariance between 2 variables based on their individual covariance with a third variable?

If one knows the covariance between A and B and the covariance between B and C, is it then possible to calculate the covariance between A and C?
Or what would you additionally need?
Thanks

No, it isn't.
If you know Cov(A, B) and Cov(B, C) then Cov(A, C) is constrained, but not uniquely defined.
The constraints can be found by asserting that the covariance matrix (of A, B, and C) is positive semi-definite: i.e. none of its eigenvalues can be negative.
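For a concrete feel of that constraint, here is a minimal NumPy sketch that scans candidate values of Cov(A, C) and keeps those for which the 3×3 covariance matrix of (A, B, C) stays positive semi-definite; the variances and the two known covariances are made-up example numbers.
import numpy as np
# Made-up example numbers: variances of A, B, C and the two known covariances.
var_a, var_b, var_c = 1.0, 1.0, 1.0
cov_ab, cov_bc = 0.6, 0.3
# Scan candidate values for Cov(A, C) and keep those for which the full
# covariance matrix of (A, B, C) is positive semi-definite
# (no eigenvalue below zero, up to a small numerical tolerance).
feasible = []
for cov_ac in np.linspace(-1.0, 1.0, 2001):
    sigma = np.array([[var_a, cov_ab, cov_ac],
                      [cov_ab, var_b, cov_bc],
                      [cov_ac, cov_bc, var_c]])
    if np.linalg.eigvalsh(sigma).min() >= -1e-12:
        feasible.append(cov_ac)
print(min(feasible), max(feasible))  # the admissible interval for Cov(A, C)
With these numbers the admissible interval comes out to roughly [-0.58, 0.94], matching the closed-form bound for unit variances: Cov(A,B)*Cov(B,C) ± sqrt((1 - Cov(A,B)^2) * (1 - Cov(B,C)^2)).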

A positive semidefinite matrix with negative eigenvalues

From what I know, for any square real matrix A, a matrix generated with the following should be a positive semidefinite (PSD) matrix:
Q = A @ A.T
I have this matrix A, which is sparse and not symmetric. However, regardless of the properties of A, I think the matrix Q should be PSD.
However, upon using np.linalg.eigvals, I get the following:
np.sort(np.linalg.eigvals(Q))
>>>array([-1.54781185e+01+0.j, -7.27494242e-04+0.j, 2.09363431e-04+0.j, ...,
3.55351888e+15+0.j, 5.82221014e+17+0.j, 1.78954577e+18+0.j])
I think the complex eigenvalues result from the numerical instability of the operation. Using scipy.linalg.eigh, which takes advantage of the fact that the matrix is symmetric, gives,
np.sort(eigh(Q, eigvals_only=True))
>>>array([-3.10854357e+01, -6.60108485e+00, -7.34059692e-01, ...,
3.55351888e+15, 5.82221014e+17, 1.78954577e+18])
which again, contains negative eigenvalues.
My goal is to perform Cholesky decomposition on the matrix Q, however, I keep getting this error message saying that the matrix Q is not positive definite, which can be again confirmed with the negative eigenvalues shown above.
Does anyone know why the matrix is not PSD? Thank you.
Of course that's a numerical problem, but I would say that Q is probably still PSD.
Notice that the largest eigenvalue is about 1.8e18 while the most negative one is only about -3.1e1, so the ratio of their magnitudes is roughly 6e16, beyond what double precision (about 16 significant digits) can resolve. In fact min(L) + max(L) == max(L) evaluates to True: the negative values are negligible compared to the maximum and are almost certainly round-off error.
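A quick sketch of that absorption effect in double precision:
import numpy as np
L = np.array([-3.1e1, 1.79e18])
print(L.min() + L.max() == L.max())  # True: -3.1e1 vanishes next to 1.79e18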
What I would suggest to you is to compute Cholesky on a slightly shifted version of the matrix.
e.g.
import numpy as np
# Shift the diagonal by a tiny amount relative to the scale of Q so that
# round-off negative eigenvalues become non-negative.
d = np.linalg.norm(Q) * np.finfo(Q.dtype).eps
I = np.eye(len(Q))
L = np.linalg.cholesky(Q + d * I)

What is a Hessian matrix?

I know that the Hessian matrix is a kind of second derivative test of functions involving more than one independent variable. How does one find the maximum or minimum of a function involving more than one variable? Is it found using the eigenvalues of the Hessian matrix or its principal minors?
You should have a look here:
https://en.wikipedia.org/wiki/Second_partial_derivative_test
For an n-dimensional function f, find a point x where the gradient grad f = 0. This is a critical point.
The second derivatives then tell whether x marks a local minimum, a local maximum, or a saddle point.
The Hessian H is the matrix of all second partial derivatives of f.
For the 2D case, the determinant and the principal minors of the Hessian are what matter.
For the nD case, this generalizes to checking whether H is positive (or negative) definite, typically by computing the eigenvalues of H (provided H is non-singular; otherwise the test is inconclusive).
The 2D determinant shortcut is just a special case of that definiteness check.
For numerical calculations, some kind of optimization strategy can be used to find the x where grad f = 0.
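A minimal NumPy sketch of the eigenvalue check, assuming the Hessian has already been evaluated at a critical point (here the toy example f(x, y) = x^2 + y^2 with its critical point at the origin):
import numpy as np
# Hessian of the toy function f(x, y) = x**2 + y**2 at its critical point (0, 0).
# For this f the Hessian is constant; in general it would be evaluated (or
# approximated, e.g. by finite differences) at the point where grad f = 0.
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])
eigenvalues = np.linalg.eigvalsh(H)  # H is symmetric, so eigvalsh is appropriate
if np.all(eigenvalues > 0):
    print("local minimum")        # H positive definite
elif np.all(eigenvalues < 0):
    print("local maximum")        # H negative definite
elif (eigenvalues > 0).any() and (eigenvalues < 0).any():
    print("saddle point")         # eigenvalues of mixed sign
else:
    print("test inconclusive")    # some eigenvalue is zero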

How to best use Numpy/Scipy to find optimal common coefficients for a set different linear equations?

I have n (around 5 million) sets of specific (k,m,v,z)* parameters that describe some linear relationships. I want to find the optimal positive a, b and c coefficients that minimize the sum of their absolute values, as shown below:
I know beforehand the range for each a, b and c and so, I could use it to make things a bit faster. However, I do not know how to properly implement this problem to best take advantage of Numpy (or Scipy/etc).
I was thinking of iteratively making checks using different a, b and c coefficients (based on a step) and in the end keeping the combination that would provide the minimum sum. But properly implementing this in Numpy is another thing.
* k, m, v are either 0 or positive (and are in fact k, m, v, i, j, p); z can be negative too.
Any tips are welcome!
Either I am missing something, or a == b == c == 0 is optimal. So, a positive solution for (a,b,c) does not exist in general. You can verify this explicitly by posing the minimization problem as a quantile regression of 0 on (k, m, v) with the quantile set to 0.5.
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg
# Median (q = 0.5) regression of a zero target on random regressors:
# the fitted coefficients all come out (numerically) zero.
x = np.random.rand(1000, 3)
a, b, c = QuantReg(np.zeros(x.shape[0]), x).fit(0.5).params
assert np.allclose([a, b, c], 0)

Finding Rotation and Translation between three coordinate systems

If we have three coordinate systems, namely A, B, and C, and we know the [R|t] from A to B and from A to C, how can we find the [R|t] between B and C?
From B to C is from B to A to C, so you need to invert the first transformation and combine it with the second.
I assume that by [R|t] you mean the rotation matrix plus translation vector. It might be easier to consider these two as a single square matrix operating on homogeneous coordinates. For planar operations that would be a 3×3 matrix, for 3d operations it would be 4×4. That way you can use regular matrix inversion and multiplication to describe your combined result.
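A minimal NumPy sketch of that composition, assuming each [R|t] maps points expressed in the source frame into the destination frame and is packed into a 4×4 homogeneous matrix (the example rotations and translations are made up):
import numpy as np
def to_homogeneous(R, t):
    # Pack a 3x3 rotation R and a translation vector t into a 4x4 matrix.
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T
# Made-up example transforms: A -> B and A -> C.
T_AB = to_homogeneous(np.eye(3), np.array([1.0, 0.0, 0.0]))
T_AC = to_homogeneous(np.eye(3), np.array([0.0, 2.0, 0.0]))
# B -> C goes B -> A -> C: invert the first transform, then apply the second.
T_BC = T_AC @ np.linalg.inv(T_AB)
R_BC = T_BC[:3, :3]  # rotation from B to C
t_BC = T_BC[:3, 3]   # translation from B to C
print(R_BC, t_BC)
With pure translations as in this toy example, the resulting translation is simply the difference of the two, but the same composition handles arbitrary rotations.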

Bayesian inference in feature-based categorization

Here is my problem which I hope you can help me with:
Let's say we live in a world where there are only two categories, each of which has some features. The objects in this world are different permutations of these features.
cat1: {a, b, c, d, e, f}
cat2: {g, h, i, j}
Now we have an object with these features:
obj: {a, b, c, d, g, h}
What is the probability that this object gets categorized as Cat.1?
p(cat1|a, b, c, d, g, h)?
In general, how can I model an equation for n categories, each with a different number of features, and objects with different permutations of those features?
You can use a Bayesian classifier to calculate these probabilities. To fit a normal distribution, which is the distribution most commonly used for Bayesian classifiers, see this link. However, to use this solution you need to assume that every object holds a value for all of your features, for example:
obj1:{a=1, b=1, c=1, d=1, e=0, f=0, g=1, h=1, i=0, j=0}
Note that feature values must be normalized before calculating parameters of your distribution.
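A rough sketch of that encoding, assuming binary (present/absent) feature values and using scikit-learn's GaussianNB as the normal-distribution-based classifier; the labeled training objects below are invented purely for illustration, since fitting any classifier requires examples whose categories are already known:
import numpy as np
from sklearn.naive_bayes import GaussianNB
# Fixed feature order: a, b, c, d, e, f, g, h, i, j
features = list("abcdefghij")
def encode(obj):
    # Binary (present/absent) encoding of an object over the full feature list.
    return [1.0 if f in obj else 0.0 for f in features]
# Invented labeled training objects (1 = cat1, 2 = cat2), purely for illustration.
train_objs = [{"a", "b", "c"}, {"d", "e", "f"}, {"g", "h"}, {"g", "i", "j"}]
train_labels = [1, 1, 2, 2]
clf = GaussianNB()
clf.fit(np.array([encode(o) for o in train_objs]), np.array(train_labels))
obj = {"a", "b", "c", "d", "g", "h"}
print(clf.predict_proba(np.array([encode(obj)])))  # [p(cat1 | obj), p(cat2 | obj)]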