From linear algebra we know that the eigenvectors of any real symmetric matrix (let's call it A) can be chosen to be orthonormal, meaning that if M is the matrix of all eigenvectors, we should obtain |det(M)| = 1. I had hoped to see this with numpy.linalg.eig, but got the following behaviour:
import numpy as np
def get_det(A, func, decimals=12):
    eigenvectors = func(A)[1]
    return np.round(np.absolute(np.linalg.det(eigenvectors)), decimals=decimals)
n = 100
x = np.meshgrid(* 2 * [np.linspace(0, 2 * np.pi, n)])
A = np.sin(x[0]) + np.sin(x[1])
A += A.T # This step is redundant; just to convince everyone that it's symmetric
print(get_det(A, np.linalg.eigh), get_det(A, np.linalg.eig))
Output
>>> 1.0 0.0
As you can see, numpy.linalg.eigh gives the correct result, while numpy.linalg.eig apparently returns a nearly singular matrix. I presume this is because A has a degenerate eigenvalue (0 in this case): the eigenvectors returned for that eigenspace are not orthonormal, so the determinant is not 1. In the following example, where there is (usually) no degenerate eigenvalue, the results are indeed the same:
import numpy as np
n = 100
A = np.random.randn(n, n)
A += A.T
print(get_det(A, np.linalg.eigh), get_det(A, np.linalg.eig))
Output
>>> 1.0 1.0
Now, regardless of whether my guess was correct (i.e. that the difference between eig and eigh comes from the degeneracy of the eigenvalues), I would like to know if there is a way to get a full-rank eigenvector matrix (i.e. the one that maximises the determinant) using numpy.linalg.eig. I'm now working with a matrix that is nearly, but not exactly, symmetric, and it causes a lot of problems if the eigenvector matrix is not invertible. Thank you for your help in advance!
To find the linearly independent eigenvectors, you could try sympy's Matrix.rref(), which computes the reduced row-echelon form of the matrix formed by the eigenvectors and also returns the indices of its pivot columns.
Consider a matrix which is not diagonalizable, i.e., its eigenvector matrix is not full rank:
array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])
We can find its linearly independent eigenvectors using rref():
import numpy as np
import sympy

A = np.diag(np.ones(2), 1)
v = np.linalg.eig(A)[1]
# rref() returns (reduced matrix, pivot column indices)
result = sympy.Matrix(np.round(v, decimals=100)).rref()
v[:, result[1]]
which returns
array([[1.],
       [0.],
       [0.]])
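As a numerical alternative to the symbolic rref() route, here is a minimal sketch that picks independent columns with a column-pivoted QR factorisation (scipy.linalg.qr with pivoting=True). This is a different technique from the one above; the helper name independent_columns and the tolerance rtol are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import qr

def independent_columns(v, rtol=1e-10):
    """Return sorted indices of a maximal set of numerically independent columns of v."""
    # Column-pivoted QR: v[:, perm] = q @ r, with |diag(r)| non-increasing.
    _, r, perm = qr(v, mode='economic', pivoting=True)
    diag = np.abs(np.diag(r))
    if diag.size == 0 or diag[0] == 0:
        return np.array([], dtype=int)
    # Count the diagonal entries that are not negligible relative to the largest.
    rank = int(np.sum(diag > rtol * diag[0]))
    return np.sort(perm[:rank])

# Columns 0 and 1 are independent; column 2 is their sum.
v = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 0.]])
cols = independent_columns(v)
basis = v[:, cols]  # two columns spanning the same space as all three
```

Applied to the eigenvector matrix from the example above, this would be expected to keep a single column, since the returned eigenvectors are numerically parallel.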
See also: python built-in function to do matrix reduction
Find a linearly independent set of vectors that spans the same substance of R^3 as that spanned
Using the great TensorFlow Hidden Markov Model library, it is straightforward to model the following Dynamic Bayesian Network:
where Hi is the probability variable that represents the HMM and Si is the probability variable that represents observations.
What if I'd like to make H depend on yet another HMM (a hierarchical HMM), or simply on another random variable, like this:
The HiddenMarkovModel definition in TensorFlow looks like the following:
tfp.distributions.HiddenMarkovModel(
    initial_distribution, transition_distribution, observation_distribution,
    num_steps, validate_args=False, allow_nan_stats=True,
    time_varying_transition_distribution=False,
    time_varying_observation_distribution=False, name='HiddenMarkovModel'
)
It only accepts initial, transition and observation distributions.
How could I model the above and pass the distribution of an additional random variable to HiddenMarkovModel? Is that possible by somehow incorporating C into the transition_distribution parameter?
Maybe C should be treated as observation as well? (I'm not sure though, if that would be a full equivalent of the structure I'd like to model)
A simple example / explanation would be great to have.
UPDATE
I've tried building a simple joint distribution of two dependent variables and feeding it as transition_distribution into the HMM:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

def mydist(y):
    samples_length = 1 if tf.rank(y) == 0 else y.shape[0]
    b = tf.ones([samples_length], dtype=tf.int32) - y
    a = tf.reshape(y, [samples_length,1])
    b = tf.reshape(b, [samples_length,1])
    c = tf.concat([a, b], axis=1)
    condprobs = tf.constant([ [0.1, 0.9], [0.5, 0.5] ])
    d = tf.matmul(tf.cast(c, tf.float32), condprobs)
    return tfd.Categorical(d, dtype=tf.int32)

jd = tfd.JointDistributionSequential([
    tfd.Categorical(probs=[0.9, 0.1]),
    lambda y: mydist(y)
], validate_args=True)

initial_distribution = tfd.Categorical(probs=[0.8, 0.2])
transition_distribution = tfd.Categorical(probs=[[0.7, 0.3],
                                                 [0.2, 0.8]])
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])

model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=jd,
    observation_distribution=observation_distribution,
    num_steps=7)

temps = [-2., 0., 2., 4., 6., 8., 10.]
model.posterior_mode(temps)
This gives an error:
ValueError: If the two shapes can not be broadcasted.
AttributeError: 'list' object has no attribute 'ndims'
The HMM manual mentions:
This model assumes that the transition matrices are fixed over time.
And that transition_distribution must be
A Categorical-like instance. The rightmost batch dimension indexes the
probability distribution of each hidden state conditioned on the
previous hidden state.
which tfd.JointDistributionSequential is probably not.
Still looking for ways of building hierarchical HMMs with TensorFlow.
The TFP HiddenMarkovModel implements message passing algorithms for chain-structured graphs, so it can't natively handle the graph in which the Cs are additional latent variables. I can think of a few approaches:
Fold the Cs into the hidden state H, blowing up the state size. (that is, if H took values in 1, ..., N and C took values in 1, ..., M, the new combined state would take values in 1, ..., NM).
Model the chain conditioned on values for the Cs that are set by some approximate inference algorithm. For example, if the Cs are continuous, you could fit them using gradient-based VI or MCMC:
@tfd.JointDistributionCoroutineAutoBatched
def model():
    Cs = yield tfd.Sample(SomePrior, num_timesteps)
    Ss = yield tfd.HiddenMarkovModel(
        ...,
        transition_distribution=SomeDistribution(Cs),
        time_varying_transition_distribution=True)

# Fit Cs using gradient-based VI (could also use HMC).
pinned = tfp.experimental.distributions.JointDistributionPinned(model, Ss=observations)
surrogate_posterior = tfp.experimental.vi.build_factored_surrogate_posterior(
    event_shape=pinned.event_shape,
    bijector=pinned.experimental_default_event_space_bijector())
losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=pinned.unnormalized_log_prob,
    surrogate_posterior=surrogate_posterior,
    optimizer=tf.optimizers.Adam(0.1),
    num_steps=200)
Use a particle filter, which can handle arbitrary joint distributions and dependencies in the transition and observation models:
[
    trajectories,
    incremental_log_marginal_likelihoods
] = tfp.experimental.mcmc.infer_trajectories(
    observations=observations,
    initial_state_prior=tfd.JointDistributionSequential(
        [PriorOnC(),
         lambda c: mydist(c, previous_state=None)]),
    transition_fn=lambda step, state: tfd.JointDistributionSequential(
        [PriorOnC(),
         lambda c: mydist(c, previous_state=state)]),
    observation_fn=lambda step, state: observation_distribution[state[1]],
    num_particles=4096)
This gives up exact inference over the discrete chain, but it's probably the most flexible approach for working with dynamic Bayesian networks in general.
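The first approach (folding C into H) can be sketched in plain NumPy. This is only an illustration under stated assumptions: C is discrete and follows its own Markov transition P(c' | c), H transitions according to P(h' | h, c'), and all names and probabilities below are hypothetical.

```python
import numpy as np

# Hypothetical sizes and probabilities, for illustration only.
M, N = 2, 2  # number of C states, number of H states

# P(c' | c): assumed Markov transition for C, shape (M, M).
trans_c = np.array([[0.9, 0.1],
                    [0.3, 0.7]])

# P(h' | h, c'): one N x N transition matrix per value of c', shape (M, N, N).
trans_h_given_c = np.array([[[0.7, 0.3],
                             [0.2, 0.8]],
                            [[0.5, 0.5],
                             [0.1, 0.9]]])

# joint[c, h, c2, h2] = P(c2 | c) * P(h2 | h, c2)
joint = np.einsum('ab,bij->aibj', trans_c, trans_h_given_c)

# Flatten (c, h) into a single combined state with index s = c * N + h.
combined = joint.reshape(M * N, M * N)
```

Each row of combined sums to 1, so it could serve as the probs of a Categorical transition distribution over the M*N combined states, with the observation distribution repeated for each value of C. If the Cs are independent across time steps, every row of trans_c would simply be the same.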
I would like a method that could turn [1,2,3] into [1,1,2,2,3,3].
My thought is something like
val = tf.constant([1., 2., 3.])  # [1,2,3]
tiled = tf.tile(val, [2])  # [1,2,3,1,2,3]
reshaped = tf.reshape(tiled, (2, 3))  # [[1,2,3], [1,2,3]]
transposed = tf.transpose(reshaped)  # [[1,1], [2,2], [3,3]]
flattened = tf.reshape(transposed, (6,))  # [1,1,2,2,3,3]
I haven't tested the above, but it looks like it should work. But, is there a cleaner way to do it? Mine seems ugly.
The motivation is to make some sort of GMM, where I can get a 20-dim vector that is the concatenation of two 10-dim normal distributions, each multiplied by a different random variable. So if there's a different approach for that, I'm interested as well. Thanks in advance.
One alternative without tf.transpose: add a second dimension to your tensor, tile along the second axis, and then flatten it:
t = tf.expand_dims(val, 1)
t = tf.tile(t, (1, 2))
t = tf.reshape(t, (-1,))
t.eval()
# array([ 1., 1., 2., 2., 3., 3.], dtype=float32)
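The shape logic of these three steps can be checked in NumPy, whose expand_dims, tile and reshape behave the same way for this case. The sketch below is a NumPy stand-in rather than TensorFlow code; np.repeat gives the same result in one call, and recent TensorFlow versions provide tf.repeat as a counterpart.

```python
import numpy as np

val = np.array([1., 2., 3.])

# Step by step, mirroring the TensorFlow calls above.
t = np.expand_dims(val, 1)   # shape (3, 1): one row per element
t = np.tile(t, (1, 2))       # shape (3, 2): each element duplicated along axis 1
flat = t.reshape(-1)         # row-major flatten keeps the duplicates adjacent

# Same result in a single call.
direct = np.repeat(val, 2)
```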
I have a question about the way scipy builds block diagonal matrices. I was expecting that creating a sparse block diagonal matrix would be much quicker and more efficient than creating a dense one (because of sparse compression). But it turns out that this is not the case (or maybe I am using some inefficient method):
from timeit import default_timer as timer
import numpy as np
from scipy.sparse import block_diag as bd_sp
from scipy.linalg import block_diag as bd_la
m = [np.identity(1)] * 10000
before = timer()
res = bd_sp(m)
timer()-before
#takes 33.79 secs
before = timer()
res = bd_la(*m)
timer()-before
#takes 0.069 secs
What am I missing? Thanks in advance for your replies.
In [625]: [np.identity(1)*i for i in range(1,5)]
Out[625]: [array([[1.]]), array([[2.]]), array([[3.]]), array([[4.]])]
In [626]: sparse.block_diag(_)
Out[626]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [627]: _.A
Out[627]:
array([[1., 0., 0., 0.],
[0., 2., 0., 0.],
[0., 0., 3., 0.],
[0., 0., 0., 4.]])
block_diag uses bmat to join the elements. bmat makes coo matrices from all elements, and combines their attributes with offsets, and makes a new coo matrix. The code is readable Python.
It may be more efficient to construct your own data, row, col arrays. block_diag is a convenience, and fine for combining a few large matrices, but not efficient when combining many small ones.
The linalg function is also Python (and pretty short). It creates an out array of the right shape, and inserts the blocks with sliced indexing. That's an efficient dense array solution. Most of the hard work is done in compiled numpy code.
Sparse matrices can be faster when doing matrix multiplication (and related linalg solvers). For most other operations, including initialization, they are slower than equivalent dense code. They are also valuable when the problem is too big to store as a dense array.
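As one way to construct your own data, row, col arrays as suggested above, here is a minimal sketch that builds a single COO matrix directly from many small dense blocks. The helper name block_diag_coo is hypothetical, and no timing is claimed; it simply avoids creating one sparse matrix per block.

```python
import numpy as np
from scipy import sparse

def block_diag_coo(blocks):
    """Build a sparse block-diagonal matrix from a list of small dense blocks."""
    blocks = [np.atleast_2d(b) for b in blocks]
    shapes = np.array([b.shape for b in blocks])
    # Row and column index of the top-left corner of each block in the result.
    row_off = np.concatenate(([0], np.cumsum(shapes[:, 0])[:-1]))
    col_off = np.concatenate(([0], np.cumsum(shapes[:, 1])[:-1]))
    rows, cols, data = [], [], []
    for b, r0, c0 in zip(blocks, row_off, col_off):
        r, c = np.indices(b.shape)
        rows.append(r.ravel() + r0)
        cols.append(c.ravel() + c0)
        data.append(b.ravel())
    shape = (int(shapes[:, 0].sum()), int(shapes[:, 1].sum()))
    # Note: zeros inside a block are stored explicitly in this sketch.
    return sparse.coo_matrix(
        (np.concatenate(data), (np.concatenate(rows), np.concatenate(cols))),
        shape=shape)

m = [np.identity(1) * i for i in range(1, 5)]
res = block_diag_coo(m)
dense = res.toarray()
```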
I have around 200 3D points that I need to multiply into a rather complex 2D projection matrix. I am currently using numpy and a for loop, essentially iterating through 3D point cloud, applying the matrix transformations and getting my data.
This seems rather slow. Is there any way I might be able to vectorize this, or use some other speed-up technique (maps, pools, etc.)?
F = matrix([
    [735.4809, 0., 388.9476, 0.],
    [0., 733.6047, 292.0895, 0.],
    [0., 0., 1.0000, 0.]
])
VehicleRPY = self.GetRT(Roll=Roll, Pitch=Pitch, Yaw=0., X=IMUX, Y=IMUY, Z=IMUZ)
SonarToCamera = self.GetRT(Roll=RTRoll, Pitch=RTPitch, Yaw=RTYaw, X=RTX, Y=RTY, Z=RTZ)
SpaceMatrix = matrix([
    [(sqrt(R**2 - (Y/(cosd(Roll)*cosd(Pitch)))**2)*sind(Theta))],
    [(Y/(cosd(Roll)*cosd(Pitch)))],
    [(sqrt(R**2 - (Y/(cosd(Roll)*cosd(Pitch)))**2)*cosd(Theta))],
    [1]
])
FinalMatrix = F*VehicleRPY*SonarToCamera*SpaceMatrix
UVMatrix = matrix([
    [FinalMatrix.item(0)/FinalMatrix.item(2)],
    [FinalMatrix.item(1)/FinalMatrix.item(2)],
])
Something like the above. I need to repeat this 3*3/4*4 multiplication across 200 points per frame
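One possible way to vectorize this, sketched under assumptions: sind and cosd are taken to be degree-based sine and cosine, the two GetRT results are replaced by identity placeholders, and the function name project is hypothetical. The idea is to multiply the constant matrices once per frame and apply the resulting 3x4 matrix to all points stacked as columns.

```python
import numpy as np

F = np.array([[735.4809, 0., 388.9476, 0.],
              [0., 733.6047, 292.0895, 0.],
              [0., 0., 1.0, 0.]])

# Placeholders for the 4x4 transforms that GetRT returns in the question.
VehicleRPY = np.eye(4)
SonarToCamera = np.eye(4)

# The product of the constant matrices is computed once per frame (3x4).
P = F @ VehicleRPY @ SonarToCamera

def project(P, R, Y, Theta, Roll, Pitch):
    """Project N points at once; R, Y, Theta have shape (N,), angles in degrees."""
    y = Y / (np.cos(np.radians(Roll)) * np.cos(np.radians(Pitch)))
    h = np.sqrt(R**2 - y**2)
    th = np.radians(Theta)
    # Homogeneous points stacked as columns, shape (4, N).
    pts = np.stack([h * np.sin(th), y, h * np.cos(th), np.ones_like(h)])
    proj = P @ pts                 # shape (3, N)
    return proj[:2] / proj[2]      # pixel coordinates (u, v), shape (2, N)

R = np.array([2.0, 2.0])
Y = np.array([0.0, 0.0])
Theta = np.array([0.0, 45.0])
uv = project(P, R, Y, Theta, Roll=0.0, Pitch=0.0)
```

With around 200 points per frame, this replaces 200 small matrix products with a single (3, 4) by (4, 200) product.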
I need to solve linear systems of varying sizes. Sometimes the size may be 0 or 1, in which case errors occur. For example,
import numpy as np
from numpy.linalg import solve
from scipy.sparse.linalg import spsolve
A1 = np.array([[1,2],[2,1]])
b1 = np.array([[1],[1]])
A2 = np.array([[1]])
b2 = np.array([[1]])
Some unexpected results occur when calling spsolve or solve:
sage: solve(A1,b1)
array([[ 0.33333333],
[ 0.33333333]])
sage: solve(A2,b2)
array([[ 1.]])
sage: spsolve(A1,b1)
array([ 0.33333333, 0.33333333])
sage: spsolve(A2,b2)
ValueError: object of too small depth for desired array
Notice that the call "spsolve(A1,b1)" actually yields a row vector. Is there any way to force it to be a column vector? Also, the error from calling "spsolve(A2,b2)" is very strange, since the sizes of A2 and b2 are not zero.
spsolve does not return a 2d array but a 1d vector.
Use numpy.atleast_2d to inflate the vector, e.g., in your example
In [10]: np.atleast_2d(spsolve(A1,b1)).T
Out[10]:
array([[ 0.33333333],
[ 0.33333333]])
and .T to get a column (2d) vector. This probably also solves your second issue, related to the depth of the result vector.
(I don't use sage, so I can't reproduce your error.)
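A minimal sketch of a wrapper along these lines, assuming a single right-hand side and a reasonably recent SciPy; the helper name spsolve_column is hypothetical. It converts A to CSC format explicitly, flattens b, and always returns an (n, 1) column.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

def spsolve_column(A, b):
    """Solve A x = b with spsolve and return x as an (n, 1) column."""
    A = csc_matrix(A, dtype=float)
    # Assumes a single right-hand side; flatten it to shape (n,).
    b = np.asarray(b, dtype=float).reshape(-1)
    x = spsolve(A, b)
    # atleast_2d gives shape (1, n); the transpose makes it a column.
    return np.atleast_2d(x).T

x1 = spsolve_column(np.array([[1, 2], [2, 1]]), np.array([[1], [1]]))
x2 = spsolve_column(np.array([[1]]), np.array([[1]]))
```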