I'm confused about how to calculate feature scaling in the Viola-Jones algorithm. For example, in "An Analysis of the Viola-Jones Face Detection Algorithm" by Yi-Qing Wang, the author proposes the following for feature type "a":
set the original feature support size a ← 2wh
i ← ⟦ie/24⟧, j ← ⟦je/24⟧, h ← ⟦he/24⟧ where ⟦z⟧ denotes the nearest integer to z ∈ R+
w ← max{κ ∈ N : κ ≤ ⟦1 + 2we/24⟧/2, 2κ ≤ e − j + 1}
compute the sum S1 of the pixels in [i, i + h − 1] × [j, j + w − 1]
compute the sum S2 of the pixels in [i, i + h − 1] × [j + w, j + 2w − 1]
return the scaled feature (S1 − S2) · a / (2wh)
In this case, I don't understand how to calculate "w" (line 3). Do you know another way to calculate feature scaling?
On the other hand, we know that a strong classifier is made of weak classifiers, each with a polarity and a threshold. The weak classifiers depend on features. When we scale a feature, does its threshold change?
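For reference, here is a minimal Python sketch of the scaling steps quoted above for feature type "a". It is only one reading of the listing, under these assumptions: ties in the nearest-integer operation round up, the j in the constraint 2κ ≤ e − j + 1 is the already scaled j, indices are 1-based as in the listing, and the image is a plain list of rows. The names nearest_int, scale_params_type_a and scaled_feature_a are made up for this illustration.

```python
import math


def nearest_int(z):
    """Nearest integer to a non-negative real z (assumption: halves round up)."""
    return int(math.floor(z + 0.5))


def scale_params_type_a(i, j, w, h, e):
    """Scale the 24x24 parameters (i, j, w, h) of a type-(a) feature to an e x e window."""
    i_s = nearest_int(i * e / 24)
    j_s = nearest_int(j * e / 24)
    h_s = nearest_int(h * e / 24)
    # Largest integer k with k <= nearest_int(1 + 2we/24) / 2 and 2k <= e - j_s + 1.
    w_s = min(nearest_int(1 + 2 * w * e / 24) // 2, (e - j_s + 1) // 2)
    return i_s, j_s, w_s, h_s


def scaled_feature_a(img, i, j, w, h):
    """Scaled type-(a) feature on a square image given as a list of rows."""
    e = len(img)
    a = 2 * w * h  # original feature support size
    i_s, j_s, w_s, h_s = scale_params_type_a(i, j, w, h, e)

    def box_sum(r0, c0, n_rows, n_cols):
        # r0 and c0 are 1-based, as in the listing.
        return sum(img[r][c]
                   for r in range(r0 - 1, r0 - 1 + n_rows)
                   for c in range(c0 - 1, c0 - 1 + n_cols))

    s1 = box_sum(i_s, j_s, h_s, w_s)
    s2 = box_sum(i_s, j_s + w_s, h_s, w_s)
    return (s1 - s2) * a / (2 * w_s * h_s)
```

Read this way, w is the largest integer that is at most half of the rounded value ⟦1 + 2we/24⟧ and still lets both rectangles of width w fit between column j and the right edge of the window. A real detector would use an integral image for the two sums; the direct double loop here is only for readability.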
I wonder what the time complexity of the following algorithm is.
function SearchInWindow(p)
    for l ← 1 ... L do
        c_l ← CountOccurrences(p, l)
    end for
end function
CountOccurrences returns the number of occurrences of the fragment of length l starting at position p within the window (p + 1 ... p + W − 1), where W is the window length.
CountOccurrences runs in O(l × W). p is a pointer into the data and should not affect the time complexity.
My guess is binom(L+1, 2) × W, but I am not at all sure.
Let's take a look at the time complexity.
CountOccurrences takes a parameter l (and also p, but as it does not affect the time complexity, we will disregard it), and completes in O(l × W).
You're running a loop from 1 to L and calling CountOccurrences with each value.
The time complexity of that is:
O(1 × W) + O(2 × W) + O(3 × W) + ... + O(L × W)
= O(W × (1 + 2 + 3 + ... + L))
= O(W × 1/2 × (L² + L))
Note that we can disregard the constant. Additionally, L² >> L, so O(L²+L) = O(L²).
So we have: = O(W × L²) as the final answer.
Note: The binom(L+1, 2) × W given in the question is equivalent to my answer, since binom(L+1, 2) = L(L+1)/2. Big-O notation just lets us drop the constant factor 1/2 and the lower-order term.
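As a quick sanity check of the sum above, here is a small Python sketch. It assumes each call to CountOccurrences costs exactly l × W abstract steps, which is a simplification of the O(l × W) bound, and total_cost is a name made up for this illustration.

```python
from math import comb


def total_cost(L, W):
    """Sum of l * W for l = 1..L, i.e. the idealized step count of SearchInWindow."""
    return sum(l * W for l in range(1, L + 1))


# The exact sum equals binom(L + 1, 2) * W ...
exact_matches = total_cost(10, 7) == comb(11, 2) * 7

# ... and the ratio to W * L^2 approaches the constant 1/2 as L grows.
ratio = total_cost(1000, 7) / (7 * 1000 ** 2)
```

The ratio settling near 1/2 is the reason both expressions describe the same O(W × L²) growth.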
I'm trying to sample a Gaussian distribution of covariance matrix P that is N by N, with N very large (around 4000 ).
Usually one would proceed like so:
Compute the Cholesky decomposition of P : L, such that L * L.T = P
Sample a standard normal distribution : X ~ N(0, I_N), where I_N is the identity and N = 4000
Obtain the desired sample Y from Y = L * X
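The three steps above can be sketched in numpy as follows. This is only an illustration on a small, well-conditioned test matrix built for the example (not the 4000 by 4000 matrix from the question), and sample_with_cholesky is a hypothetical helper name.

```python
import numpy as np


def sample_with_cholesky(P, rng):
    """Draw one sample Y ~ N(0, P) through the Cholesky factor of P."""
    L = np.linalg.cholesky(P)           # lower triangular, L @ L.T == P
    X = rng.standard_normal(P.shape[0])  # X ~ N(0, I)
    return L @ X, L


rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
P = A @ A.T + 5 * np.eye(5)  # small symmetric positive definite test matrix
Y, L = sample_with_cholesky(P, rng)

# Reconstruction error of the factorization; this is the quantity to inspect.
residual = np.max(np.abs(L @ L.T - P))
```

Checking the residual relative to the size of the entries of P is a simple way to see whether the factorization itself is the problem.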
The snag here is in the computation of L. The algorithm does not seem to be stable for such a large matrix, as the computed Cholesky decomposition does not satisfy L * L.T = P.
I've tried to normalize P before computing its Cholesky decomposition (dividing it by its largest value), to no avail. I'm using the C++ library Eigen, and I've noticed this problem with numpy as well.
Any advice?
Cholesky decomposition should be quite stable if the input matrix is actually positive definite. It can have issues if the matrix is (nearly) semi-definite or indefinite.
In that case you can use the LDLT decomposition instead. For an input A it computes a permutation P, a unit-diagonal triangular L and a diagonal D, such that
A = P.T*L*D*L.T*P
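Then instead of Y = L * X you need Y = P.T * L * sqrt(D) * X, where sqrt(D) is the element-wise square root of the diagonal (np.sqrt in numpy). With that order, the covariance of Y is P.T*L*sqrt(D)*sqrt(D)*L.T*P = A.
Note that the permutation has to stay: it acts on the output of L * sqrt(D) * X, not directly on the vector X of independent and identically distributed random numbers, so dropping it would give a sample with covariance L*D*L.T, which is A with its rows and columns permuted.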
If that still does not work, try using the SelfAdjointEigenSolver-decomposition.
This computes a diagonal matrix D of eigenvalues and a unitary matrix V of eigenvectors, such that
A = V * D * V^{-1}
And you can do essentially the same as above, using Y = V * sqrt(D) * X. (Note that for unitary matrices, V^{-1} is just the adjoint of V, i.e., V^{-1} = V^T in the real-valued case.)
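Here is a minimal numpy sketch of the eigendecomposition route, using np.linalg.eigh as the numpy counterpart of Eigen's SelfAdjointEigenSolver. The clipping of slightly negative eigenvalues to zero is my own assumption about how to handle round-off in a nearly semi-definite matrix, and gaussian_factor and sample_gaussian are names made up for this illustration.

```python
import numpy as np


def gaussian_factor(cov):
    """Return F with F @ F.T ~= cov, built from the symmetric eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # cov = V @ diag(eigvals) @ V.T
    # Assumption: tiny negative eigenvalues are round-off and are clipped to 0.
    eigvals = np.clip(eigvals, 0.0, None)
    return eigvecs * np.sqrt(eigvals)       # scales column k of V by sqrt(eigvals[k])


def sample_gaussian(cov, n_samples, rng):
    """Draw n_samples columns Y = V * sqrt(D) * X with X ~ N(0, I)."""
    X = rng.standard_normal((cov.shape[0], n_samples))
    return gaussian_factor(cov) @ X


# A singular (positive semi-definite) covariance, the kind of input that
# can trip up a plain Cholesky factorization.
cov = np.array([[1.0, 1.0],
                [1.0, 1.0]])
F = gaussian_factor(cov)
Y = sample_gaussian(cov, 1000, np.random.default_rng(0))
```

For N around 4000 the eigendecomposition costs more than a Cholesky factorization, but it only has to be computed once and can be reused for every sample.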
A unit quaternion is equivalent to a rotation matrix, but a 4x4 matrix does more than just rotation. It also does translation and scaling.
A matrix for affine transforms can actually be represented with 12 elements because the last row is constant:
a b c d
e f g h
i j k l
0 0 0 1
x' = a*x + b*y + c*z + d
y' = e*x + f*y + g*z + h
z' = i*x + j*y + k*z + l
A full transform therefore takes 9 multiplies and 9 adds.
For the three affine transforms: rotation, scale, and translation I would like to know if a quaternion-based system is competitive. I have looked all over and have not found one anywhere.
Given a quaternion p = (w,x,y,z)
For rotation, q' = pqp', where p' is the conjugate of p and q is the point written as a pure quaternion (0, x, y, z). I could add a translation vector: t=(tx,ty,tz)
q' = pqp' + t
That is just 7 elements as compared to 12 with matrices, though it is slightly more operations.
That still does not support scaling though. Is there a complete equivalent?
Note: If the only answer is to convert the rotation to a matrix, then that is not really an answer. The question is whether a quaternion system can perform affine transform without matrices.
If there is an equivalence, can anyone point me to a java or c++ class so I can see how this works?
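To make the formula q' = pqp' + t from the question concrete, here is a small Python sketch. It covers only rotation plus translation as written above and does not address the scaling part of the question. It assumes p is a unit quaternion in (w, x, y, z) order, and the function names are invented for this illustration.

```python
import math


def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)


def conjugate(p):
    """Conjugate p' of a quaternion; equals the inverse when p has unit norm."""
    w, x, y, z = p
    return (w, -x, -y, -z)


def rotate_translate(p, t, v):
    """Apply q' = p q p' + t to the point v, with q = (0, v)."""
    q = (0.0, v[0], v[1], v[2])
    _, rx, ry, rz = qmul(qmul(p, q), conjugate(p))
    return (rx + t[0], ry + t[1], rz + t[2])


# 90 degree rotation about the z axis, followed by a translation.
half = math.radians(90) / 2
p = (math.cos(half), 0.0, 0.0, math.sin(half))
moved = rotate_translate(p, (1.0, 2.0, 3.0), (1.0, 0.0, 0.0))
```

The stored state is the 7 numbers mentioned in the question (4 for p, 3 for t). The two full quaternion products used here are the readable form, not the cheapest one in terms of multiplies and adds.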
I am trying to implement a loss function that minimizes the negative log likelihood of obtaining the ground truth values (x, y) from predicted bivariate Gaussian distribution parameters. I am implementing this in TensorFlow.
Here is the code:
def tf_2d_normal(self, x, y, mux, muy, sx, sy, rho):
    '''
    Function that implements the PDF of a 2D normal distribution
    params:
    x : input x points
    y : input y points
    mux : mean of the distribution in x
    muy : mean of the distribution in y
    sx : std dev of the distribution in x
    sy : std dev of the distribution in y
    rho : Correlation factor of the distribution
    '''
    # eq 3 in the paper
    # and eq 24 & 25 in Graves (2013)
    # Calculate (x - mux) and (y-muy)
    normx = tf.sub(x, mux)
    normy = tf.sub(y, muy)
    # Calculate sx*sy
    sxsy = tf.mul(sx, sy)
    # Calculate the exponential factor
    z = tf.square(tf.div(normx, sx)) + tf.square(tf.div(normy, sy)) - 2*tf.div(tf.mul(rho, tf.mul(normx, normy)), sxsy)
    negRho = 1 - tf.square(rho)
    # Numerator
    result = tf.exp(tf.div(-z, 2*negRho))
    # Normalization constant
    denom = 2 * np.pi * tf.mul(sxsy, tf.sqrt(negRho))
    # Final PDF calculation
    result = -tf.log(tf.div(result, denom))
    return result
During training I can see the loss value decreasing, but it goes well below 0. I assume that is because we are minimizing the 'negative' log likelihood. Even though the loss values are decreasing, I can't get accurate results. Can someone help verify whether the code I have written for the loss function is correct?
Also, is this behavior of the loss desirable for training neural nets (specifically RNNs)?
Thanks
I see you've found the sketch-rnn code from magenta, I'm working on something similar. I found this piece of code not to be stable by itself. You'll need to stabilize it using constraints, so the tf_2d_normal code can't be used or interpreted in isolation. NaNs and Infs will start appearing all over the place if your data isn't normalized properly in advance or in your loss function.
Below is a more stable loss function version I'm building with Keras. There may be some redundancy in here, it may not be perfect for your needs but I found it to be working and you can test/adapt it. I included some inline comments on how large negative log values can arise:
import numpy as np
from keras import backend as K


def r3_bivariate_gaussian_loss(true, pred):
    """
    Rank 3 bivariate gaussian loss function
    Returns results of eq # 24 of http://arxiv.org/abs/1308.0850
    :param true: truth values with at least [mu1, mu2, sigma1, sigma2, rho]
    :param pred: values predicted from a model with the same shape requirements as truth values
    :return: the log of the summed max likelihood
    """
    x_coord = true[:, :, 0]
    y_coord = true[:, :, 1]
    mu_x = pred[:, :, 0]
    mu_y = pred[:, :, 1]
    # exponentiate the sigmas and also make correlative rho between -1 and 1.
    # eq. # 21 and 22 of http://arxiv.org/abs/1308.0850
    # analogous to https://github.com/tensorflow/magenta/blob/master/magenta/models/sketch_rnn/model.py#L326
    sigma_x = K.exp(K.abs(pred[:, :, 2]))
    sigma_y = K.exp(K.abs(pred[:, :, 3]))
    rho = K.tanh(pred[:, :, 4]) * 0.1  # avoid drifting to -1 or 1 to prevent NaN, you will have to tweak this multiplier value to suit the shape of your data
    norm1 = K.log(1 + K.abs(x_coord - mu_x))
    norm2 = K.log(1 + K.abs(y_coord - mu_y))
    variance_x = K.softplus(K.square(sigma_x))
    variance_y = K.softplus(K.square(sigma_y))
    s1s2 = K.softplus(sigma_x * sigma_y)  # very large if sigma_x and/or sigma_y are very large
    # eq 25 of http://arxiv.org/abs/1308.0850
    z = ((K.square(norm1) / variance_x) +
         (K.square(norm2) / variance_y) -
         (2 * rho * norm1 * norm2 / s1s2))  # z → -∞ if rho * norm1 * norm2 → ∞ and/or s1s2 → 0
    neg_rho = 1 - K.square(rho)  # → 0 if rho → {1, -1}
    numerator = K.exp(-z / (2 * neg_rho))  # → ∞ if z → -∞ and/or neg_rho → 0
    denominator = (2 * np.pi * s1s2 * K.sqrt(neg_rho)) + K.epsilon()  # → 0 if s1s2 → 0 and/or neg_rho → 0
    pdf = numerator / denominator  # → ∞ if denominator → 0 and/or if numerator → ∞
    return K.log(K.sum(-K.log(pdf + K.epsilon())))  # → -∞ if pdf → ∞
Hope you find this of value.
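For comparison, the quantity that tf_2d_normal in the question returns, the negative log of the bivariate normal density, can be evaluated directly in log space without forming the exp and the division first. The following plain-Python sketch works on scalars only, it is not a drop-in TensorFlow or Keras loss, and bivariate_gaussian_nll is a name made up for this illustration.

```python
import math


def bivariate_gaussian_nll(x, y, mux, muy, sx, sy, rho):
    """Negative log density of a 2D normal, computed without exp/log round trips."""
    dx = (x - mux) / sx
    dy = (y - muy) / sy
    one_minus_rho2 = 1.0 - rho * rho
    z = dx * dx + dy * dy - 2.0 * rho * dx * dy
    # -log( exp(-z / (2 (1 - rho^2))) / (2 pi sx sy sqrt(1 - rho^2)) )
    return (z / (2.0 * one_minus_rho2)
            + math.log(2.0 * math.pi * sx * sy * math.sqrt(one_minus_rho2)))


# Unit standard deviations: the density at the mean is 1 / (2 pi) < 1, so the loss is positive.
nll_wide = bivariate_gaussian_nll(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)

# Small standard deviations: the density at the mean exceeds 1, so the loss is negative.
nll_narrow = bivariate_gaussian_nll(0.0, 0.0, 0.0, 0.0, 0.1, 0.1, 0.0)
```

A probability density can be larger than 1, so a negative value of this loss is not an error in itself. It just means the predicted distribution is sharply peaked around the target.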
I have a solid object that is spinning with a torque W, and I want to calculate the force F applied on a certain point that's D units away from the center of the object. All these values are represented in Vector3 format (x, y, z).
I know until now that W = D x F, where x is the cross product, so by expanding this I get:
Wx = Dy*Fz - Dz*Fy
Wy = Dz*Fx - Dx*Fz
Wz = Dx*Fy - Dy*Fx
So I have this equation, and I need to find (Fx, Fy, Fz), and I'm thinking of using the Simplex method to solve it.
Since the F vector can also have negative values, I split each F variable into 2 (F = G-H), so the new equation looks like this:
Wx = Dy*Gz - Dy*Hz - Dz*Gy + Dz*Hy
Wy = Dz*Gx - Dz*Hx - Dx*Gz + Dx*Hz
Wz = Dx*Gy - Dx*Hy - Dy*Gx + Dy*Hx
Next, I define the simplex table (we need <= inequalities, so I duplicate each equation and multiply it by -1).
Also, I define the objective function as: minimize (Gx - Hx + Gy - Hy + Gz - Hz).
The table looks like this:
 Gx    Hx    Gy    Hy    Gz    Hz   <=   RHS
====================================================
  0     0   -Dz    Dz    Dy   -Dy   <=    Wx   = Gx
  0     0    Dz   -Dz   -Dy    Dy   <=   -Wx   = Hx
 Dz   -Dz     0     0    Dx   -Dx   <=    Wy   = Gy
-Dz    Dz     0     0   -Dx    Dx   <=   -Wy   = Hy
-Dy    Dy    Dx   -Dx     0     0   <=    Wz   = Gz
 Dy   -Dy   -Dx    Dx     0     0   <=   -Wz   = Hz
====================================================
  1    -1     1    -1     1    -1          0   = Z
The problem is that when I run it through an online solver I get Unbounded solution.
Can anyone please point me to what I'm doing wrong ?
Thanks in advance.
edit: I'm sure I messed up some signs somewhere (for example the Z should be defined as a max), but I'm sure I'm wrong when defining something more important.
There exists no unique solution to the problem as posed. You can only solve for the tangential projection of the force. This comes from the properties of the vector (cross) product: it is zero for collinear vectors, and in particular for the vector product of a vector with itself. Therefore, writing r for the lever arm (D in the question), if F is a solution of W = r x F, then F' = F + kr is also a solution for any k:
r x F' = r x (F + kr) = r x F + k (r x r) = r x F
since the r x r term is zero by the definition of vector product. Therefore, there is not a single solution but rather a whole linear space of vectors that are solutions.
If you restrict the solution to forces that have zero projection in the direction of r, then you could simply take the vector product of W and r:
W x r = (r x F) x r = -[r x (r x F)] = -[(r . F)r - (r . r)F] = |r|²F
with the first term of the expansion being zero because the projection of F onto r is zero (the dot denotes scalar (inner) product). Therefore:
F = (W x r) / |r|²
If you are also given the magnitude of F, i.e. |F|, then you can compute the radial component (if any) but there are still two possible solutions with radial components in opposing directions.
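A short Python sketch of the closed-form result, with vectors as plain 3-tuples. The names cross and tangential_force are invented for this illustration, and r plays the role of the question's D. The example also checks the non-uniqueness argument: adding a multiple of r to the force leaves the torque unchanged.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangential_force(W, r):
    """F = (W x r) / |r|^2, the solution of W = r x F with no component along r."""
    r2 = r[0] * r[0] + r[1] * r[1] + r[2] * r[2]
    wxr = cross(W, r)
    return (wxr[0] / r2, wxr[1] / r2, wxr[2] / r2)


r = (1.0, 0.0, 0.0)
F_true = (0.0, 2.0, 3.0)             # perpendicular to r
W = cross(r, F_true)                  # torque produced by F_true
F = tangential_force(W, r)            # recovers F_true

F_shifted = (F[0] + 5.0 * r[0], F[1] + 5.0 * r[1], F[2] + 5.0 * r[2])
W_shifted = cross(r, F_shifted)       # same torque as W
```

Since the system is linear with a one-parameter family of solutions, no linear programming step is needed once the radial component is fixed.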
Quick dirty derivation...
Given D and F, you get W perpendicular to them. That's what a cross product does.
But you have W and D and need to find F. This is a bad assumption, but let's assume F was perpendicular to D. Call it Fp, since it's not necessarily the same as F. Ignoring magnitudes, WxD should give you the direction of Fp.
This ignores magnitudes, so fix that with a little arithmetic. Starting with W=DxF applied to Fp:
mag(W) = mag(D)*mag(Fp) (ignoring geometry; using Fp perp to D)
mag(Fp) = mag(W)/mag(D)
Combining the cross product bit for direction with this stuff for magnitude,
Fp = WxD / mag(WxD) * mag(Fp)
Fp = WxD /mag(W) /mag(D) *mag(W) /mag(D)
= WxD / mag(D)^2.
Note that given any solution Fp to W=DxF, you can add any vector proportional to D to Fp to obtain another solution F. That is a totally free parameter to choose as you like.
Note also that if the torque applies to some sort of axle or object constrained to rotate about some axis, and F is applied to some oddball lever sticking out at a funny angle, then vector D points in some funny direction. You want to replace D with just the part perpendicular to the axle/axis, otherwise the "/mag(D)" part will be wrong.
So from your comment it is clear that all rotations are around the center of gravity
in that case
F=M/r
F force [N]
M torque [N·m]
r scalar distance between the center of rotation and the point [m]
this way you know the scalar size of your force
now you need the direction
it is perpendicular to rotation axis
and it is the tangent of the rotation in that point
dir=r x axis
F_vec = F * dir / |dir|
F_vec, dir, axis and the r in the cross product are vectors, the rest are scalars
x is cross product
dir is force direction
axis is rotation axis direction
now just change the direction according to the rotation direction (sign of the actual omega)
also depending on your coordinate system setup
so either negate F or not
but in 3D free rotation this is a very improbable scenario
the object has to be symmetrical from the mass point of view
or the initial driving forces were applied in a manner to achieve this
also beware that after the first hit with any interaction force this will no longer be true !!!
so if you just want to compute the force it generates on a certain point when a collision occurs, this is fine
but immediately after this your spinning will change
and for non-symmetric objects the spinning will most likely be off the center of gravity !!!
if your object will be disintegrated then you do not need to worry
if not then you have to apply rotation and movement dynamics
Rotation Dynamics
M=alpha*I
M torque [N·m]
alpha angular acceleration [rad/s^2]
I moment of inertia about the actual rotation axis [kg.m^2]
epsilon''=omega'=alpha
' means derivative with respect to time
omega angular speed [rad/s]
epsilon angle [rad]
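the relations above can be stepped in time like this (a minimal sketch, assuming a fixed rotation axis, constant torque and inertia during the step, and a simple semi-implicit Euler update that is my choice for the illustration; step_rotation is a made-up name):

```python
def step_rotation(angle, omega, torque, inertia, dt):
    """Advance angle [rad] and angular speed [rad/s] by one time step dt [s]."""
    alpha = torque / inertia          # M = alpha * I  =>  alpha = M / I
    omega_new = omega + alpha * dt    # omega' = alpha
    angle_new = angle + omega_new * dt  # epsilon' = omega (uses the updated omega)
    return angle_new, omega_new


# Start at rest, apply 2 N·m to a body with I = 4 kg.m^2 for 0.5 s.
angle, omega = step_rotation(0.0, 0.0, 2.0, 4.0, 0.5)
```

for free 3D rotation of a non-symmetric body the inertia is a tensor and this scalar form is not enough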