Posing a quadrating optimization problem for CVXOPT correctly - optimization

I am trying to minimize the function || Cx - d ||_2^2 with constraints Ax <= b. Some information about their sizes is as such:
* C is a (138, 22) matrix
* d is a (138,) vector
* A is a (138, 22) matrix
* b is a (138, ) vector of zeros
So I have 138 equation and 22 free variables that I'd like to optimize. I am currently coding this in Python and am using the transpose C.T*C to form a square matrix. The entire code looks like this
C = matrix(np.matmul(w, b).astype('double'))
b = matrix(np.matmul(w, np.log(dwi)).astype('double').reshape(-1))
P = C.T * C
q = -C.T * b
G = matrix(-constraints)
h = matrix(np.zeros(G.size[0]))
dt = np.array(solvers.qp(P, q, G, h, dims)['x']).reshape(-1)
where np.matmul(w, b) is C and np.matmul(w, np.log(dwi)) is d. Variables P and q are C and b multiplied by the transpose C.T to form a square multiplier matrix and constant vector, respectively. This works perfectly and I can find a solution.
I'd like to know whether this my approach makes mathematical sense. From my limited knowledge of linear algebra I know that a square matrix produces a unique solution, but is there is a way to run the same this to produce an overdetermined solution? I tried this but solver.qp said input Q needs to be a square matrix.
We can also parse in a dims argument to solver.qp, which I tried, but received the error:
use of function valued P, G, A requires a user-provided kktsolver.
How do I correctly setup dims?
Thanks a lot for any help. I'll try to clarify any questions as best as I can.

Related

np.einsum multiplication of matrices with several indices?

Given m x n matrix A and n x r matrix B how to write the following formula in np.einsum notation?
f(i) = \sum_{j,k} a_ij * b_jk
What will change in np.einsum if r x p matrix C will be added?
f(i) = \sum_{j,k,l} a_ij * b_jk * c_kl
#Sengiley Despite you already answered yourself's question. A more general version that allows broadcasting is:
np.einsum('...ij,jk,kl->...i', a, b, c)

Python to fit a linear-plateau curve

I have curve that initially Y increases linearly with X, then reach a plateau at point C.
In other words, the curve can be defined as:
if X < C:
Y = k * X + b
else:
Y = k * C + b
The training data is a list of X ~ Y values. I need to determine k, b and C through a machine learning approach (or similar), since the data is noisy and refection point C changes over time. I want something more robust than get C through observing the current sample data.
How can I do it using sklearn or maybe scipy?
WLOG you can say the second equation is
Y = C
looks like you have a linear regression to fit the line and then a detection point to find the constant.
You know that in the high values of X, as in X > C you are already at the constant. So just check how far back down the values of X you get the same constant.
Then do a linear regression to find the line with value of X, X <= C
Your model is nonlinear
I think the smartest way to solve this is to do these steps:
find the maximum value of Y which is equal to k*C+b
M=max(Y)
drop this maximum value from your dataset
df1 = df[df.Y != M]
and then you have simple dataset to fit your X to Y and you can use sklearn for that

How do I determine the distance between v and PQ when v =[2,1,2] and PQ = [1,0,3]? P = [0,0,0] Q = [1,0,3]

What I have tried already: d = |v||PQ|sin("Theta")
Now, I need to determine what theta is, so I set up a position on a makeshift graph, the graph I made was on the xy plane only as the z plane complicates things needlessly for finding theta. So, I ended up with an acute angle, and if the angle is acute, then I have to find theta which according to dot product facts is greater than 0.
I do not have access to theta, so I used the same princples from cross dots. u * v = |u||v|cos("theta") but in this case, u and v are PQ and v. A vector is a vector, right?
so now I have theta = acos((v*PQ)/(|v||PQ))
with that I get (4sqrt(10))/15 = 32.5125173162 in degrees, so the angle is 32.5125173162 degrees.
So, now that I have theta, I plug it into my distance formula |v||PQ|sin(32.5125173162)
3*sqrt(10)*sin(32.5125173162) = 5.0990195136
or for the sake of simplicity, 5.1
I however want to know if this question is correct.
If it is NOT correct, what can I do to correct it? At what points did I use incorrect information?
This is not a question with a definitive answer in the back of the book, its a question on the side of a page that said: "try this!"
There are a couple of problems with this question.
From the context it looks like you mean for both v and PQ to be vectors. The "distance" between two vectors is an awkward (not well defined) question because vectors are not position bound.
You are using the cross product formula and I have no idea why:
|AxB| = |A||B|Sin(theta)
I think what you are actually trying to do is calculate the distance between the terminal points of the vectors, (2, 1, 2) and (1, 0, 3). Just use the Pythagorean Theorem (extended to 3D) for this.
d = sqrt( (x1 - x2)^2 + (y1 - y2)^2 + (z1 - z2)^2 )
d = sqrt( (2 - 1)^2 + (1 - 2)^2 + (2 - 3)^2 )
d = sqrt( 1^2 + (-1)^2 + (-1)^2 )
d = sqrt(3)
Edit:
If what you need really is the magnitude of the cross product, |AxB| then just find the cross product (using the determinant) and then calculate the magnitude of the result. There is no need for the formula you were using.

Stable sampling of large Gaussian distributions

I'm trying to sample a Gaussian distribution of covariance matrix P that is N by N, with N very large (around 4000 ).
Usually one would proceed like so:
Compute the Cholesky decomposition of P : L, such that L * L.T = P
Sample a normal Gaussian distribution : X ~N(0,I_N), where I_N is the identity and N = 4000
Obtain the desired sample Y from Y = L * X
The snag here is in the computation of L. The algorithm does not seem to be stable for such a large matrix, as the computed Cholesky decomposition does not satisfy L * L.T != P.
I've tried to normalize P before computing its Cholesky decomposition (dividing it by its largest value), to no avail. I'm using the C++ library Eigen, and I've noticed this problem with numpy as well.
Any advice?
Cholesky decomposition should be quite stable, if the input matrix is actually positive definite. It can have issues if the matrix is (near) semi- or in-definite.
In that case you can use the LDLT decomposition instead. For an input A it computes a permutation P, a unit-diagonal triangular L and a diagonal D, such that
A = P.T*L*D*L.T*P
Then instead of multiplying Y = L * X you need of course Y = sqrt(D) * L * X, where sqrt(D) is an element-wise sqrt (I don't know the python syntax for that).
Note that you can ignore the permutation, since permuting a vector of identically independent distributed random numbers, is still a vector of i.i.d. numbers.
If that still does not work, try using the SelfAdjointEigenSolver-decomposition.
This computes a diagonal matrix of Eigenvalues D and a unitarian matrix V of Eigenvectors, such that
A = V * D * V^{-1}
And you can do essentially the same as above. (Note that for unitarian matrices, V^{-1} is just the adjoint of V, i.e., V^{-1} = V^T in the real-valued case).

Plot a third point past the two previously plotted points. Cocos2d

Ok so let me try to explain this the best way that i can.
I have two points plotted 'A' and 'B' and I am trying to plot a third point 'C' so that it is past point 'B' but along the same slope. I have the angle of the line and I would post some code but I really have no idea where to begin.
any help would be awesome!
Just a little code that i do have
CGPoint vector = ccpSub(touchedPoint, fixedPoint);
CGFloat rotateAngle = -ccpToAngle(vector);
Assuming that by this you mean you need a 3rd point C added such that all the points are colinear, all you need to do is calculate the vector that takes you from A to B, and then generate a new point by adding multiples of this vector to the point B. Choose the multiple based on the distance you want C to be from B.
As an example, say A = (2,2), B = (4,3). Then the vector from A to B is given by (2,1).
All you need to do then is work out how far your new point is from B and add a multiple K*(2,1) to your point B where K is chosen to meet the requirements of your distance
I am assuming you are in 2D, but the same method would apply in higher dimensions
My math is rusty, but the linear equation is generally represented as y=m*x+b, where m is the slope, and b is the y-intercept. You can get m, the slope, by taking the difference of the y values and dividing that by the difference in the x values, e.g., if A = (2,2) and B = (4,3), then m is (3-2)/(4-2) or 0.5. Then, you can solve the linear equation for b, the y-intercept, i.e. b=y-m*x and then plug in either of the data points, e.g. if we plug in the x and y values for point A, you get b = 2 - 0.5 * 2 = 1. Now knowing the slope, m (0.5 in this example), and the y-intercept, b (1 in this example), you can calculate the y for any x value using y=m*x+b, in this case y=0.5*x+1.
So, if touchedPoint and fixedPoint are CGPoint, you can calculate the slope and y-intercept from fixedPoint and touchedPoint like so:
double m = (fixedPoint.y - touchedPoint.y) / (fixedPoint.x - touchedPoint.x);
double b = fixedPoint.y - m * fixedPoint.x;
Now, you don't say how you want to determine where this third point, C, is. But if you, for example, knew the x coordinate for this new point C, you can calculate the y coordinate that falls on the same line as follows:
CGPoint pointC;
pointC.x = 400; // or set this to whatever you want
pointC.y = m * pointC.x + b;