Use GP for Solving Symbolic Regression - genetic-programming

I understand how to solve a simple symbolic regression problem such as f(x)=x4+x3+x2+x. However, I'm not quite sure how to use GP to solve the below one.
f(x) = x2 + cosx + 7, for x<=0,
1/x + 2x, for x>0
Can anyone help with this?

It says that you compare against
f = x<=0 ? x2 + cosx + 7 : 1/x + 2x
instead of
f = x4+x3+x2+x

Related

Numerically stable calculation of invariant mass in particle physics?

In particle physics, we have to compute the invariant mass a lot, which is for a two-body decay
When the momenta (p1, p2) are sometimes very large (up to a factor 1000 or more) compared to the masses (m1, m2). In that case, there is large cancellation happening between the last two terms when the calculation is carried out with floating point numbers on a computer.
What kind of numerical tricks can be used to compute this accurately for any inputs?
The question is about suitable numerical tricks to improve the accuracy of the calculation with floating point numbers, so the solution should be language-agnostic. For demonstration purposes, implementations in Python are preferred. Solutions which reformulate the problem and increase the amount of elementary operations are acceptable, but solutions which suggest to use other number types like decimal or multi-precision floating point numbers are not.
Note: The original question presented a simplified 1D dimensional problem in form of a Python expression, but the question is for the general case where the momenta are given in 3D dimensions. The question was reformulated in this way.
With a few tricks listed on Stackoverflow and the transformation described by Jakob Stark in his answer, it is possible to rewrite the equation into a form that does not suffer anymore from catastrophic cancellation.
The original question asked for a solution in 1D, which has a simple solution, but in practice, we need the formula in 3D and then the solution is more complicated. See this notebook for a full derivation.
Example implementation of numerically stable calculation in 3D in Python:
import numpy as np
# numerically stable implementation
#np.vectorize
def msq2(px1, py1, pz1, px2, py2, pz2, m1, m2):
p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
m1_sq = m1 ** 2
m2_sq = m2 ** 2
x1 = m1_sq / p1_sq
x2 = m2_sq / p2_sq
x = x1 + x2 + x1 * x2
a = angle(px1, py1, pz1, px2, py2, pz2)
cos_a = np.cos(a)
if cos_a >= 0:
y1 = (x + np.sin(a) ** 2) / (np.sqrt(x + 1) + cos_a)
else:
y1 = -cos_a + np.sqrt(x + 1)
y2 = 2 * np.sqrt(p1_sq * p2_sq)
return m1_sq + m2_sq + y1 * y2
# numerically stable calculation of angle
def angle(x1, y1, z1, x2, y2, z2):
# cross product
cx = y1 * z2 - y2 * z1
cy = x1 * z2 - x2 * z1
cz = x1 * y2 - x2 * y1
# norm of cross product
c = np.sqrt(cx * cx + cy * cy + cz * cz)
# dot product
d = x1 * x2 + y1 * y2 + z1 * z2
return np.arctan2(c, d)
The numerically stable implementation can never produce a negative result, which is a commonly occurring problem with naive implementations, even in double precision.
Let's compare the numerically stable function with a naive implementation.
# naive implementation
def msq1(px1, py1, pz1, px2, py2, pz2, m1, m2):
p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
m1_sq = m1 ** 2
m2_sq = m2 ** 2
# energies of particles 1 and 2
e1 = np.sqrt(p1_sq + m1_sq)
e2 = np.sqrt(p2_sq + m2_sq)
# dangerous cancelation in third term
return m1_sq + m2_sq + 2 * (e1 * e2 - (px1 * px2 + py1 * py2 + pz1 * pz2))
For the following image, the momenta p1 and p2 are randomly picked from 1 to 1e5, the values m1 and m2 are randomly picked from 1e-5 to 1e5. All implementations get the input values in single precision. The reference in both cases is calculated with mpmath using the naive formula with 100 decimal places.
The naive implementation loses all accuracy for some inputs, while the numerically stable implementation does not.
If you put e.g. m1 = 1e-4, m2 = 1e-4, p1 = 1 and p2 = 1 in the expression, you get about 4e-8 with double precision but 0.0 with single precision calculation. I assume, that your question is about how one can get the 4e-8 as well with single precision calculation.
What you can do is a taylor expansion (around m1 = 0 and m2 = 0) of the expression above.
e ~ e|(m1=0,m2=0) + de/dm1|(m1=0,m2=0) * m1 + de/dm2|(m1=0,m2=0) * m2 + ...
If I calculated correctly, the zeroth and first order terms are 0 and the second order expansion would be
e ~ (p1+p2)/p1 * m1**2 + (p1+p2)/p2 * m2**2
This yields exactly 4e-8 even with single precision calculation. You can of course do more terms in the expansion if you need, until you hit the precision limit of a single float.
Edit
If the mi are not always much smaller than the pi you could further massage the equation to get
The complicated part is now the one in the square brackets. It essentially is sqrt(x+1)-1 for a wide range of x values. If x is very small, we can use the taylor expansion of the square root (e.g. like here). If the x value is larger, the formula works just fine, because the addition and subtraction of 1 are no longer discarding the value of x due to floating point precision. So one threshold for x must be choosen below one switches to the taylor expansion.

How to get multiple BFS using lpsolve

I'm trying to use lpsolve IDE to solve LPs with multiple BFS, however only one solution is generated. What should I do?
min 2x1 + 4x2 + 7x3
st 2x1 + x2 + 6x3 >= 5
4x1 - 6x2 + 5x3 >= 8
x1 >= 0
x2 >= 0
I only get x1 = 2.5, x2 = x3 = 0
But there are other BFS eg (0,0,8.5), (19/8,1/4,0)
LP solvers typically just return one solution. With some solvers, you may be able to find out what solutions were visited (I am not sure that can be done with lpsolve).
Enumerating all BFS is not that easy. Here is a (somewhat complicated) way to enumerate optimal solutions. This can also be used to enumerate all (or many) feasible solutions.

Wolfram Alpha Solution Entry

Using this Wolfram Alpha code either through web or in Mathematica:
(5y^4+3y^2+e^y)y'=cos(x),y(0)=0
My equation seems to be properly parsed:
{(5 y(x)^4 + 3 y(x)^2 + e^(y(x))) y'(x) = cos(x), y(0) = 0}
As a separable DE, the result should be:
y^5 + y^3 + e^y = sin(x) + 1
How do I modify the original Wolfram Alpha code to get the program to evaluate the solution?
Using Wolfram Alpha
{(5y[x]^4+3y[x]^2+E^y[x])y'[x]==Cos[x],y[0]==0} solution

Finding Big O of the Harmonic Series

Prove that
1 + 1/2 + 1/3 + ... + 1/n is O(log n).
Assume n = 2^k
I put the series into the summation, but I have no idea how to tackle this problem. Any help is appreciated
This follows easily from a simple fact in Calculus:
and we have the following inequality:
Here we can conclude that S = 1 + 1/2 + ... + 1/n is both Ω(log(n)) and O(log(n)), thus it is Ɵ(log(n)), the bound is actually tight.
Here's a formulation using Discrete Mathematics:
So, H(n) = O(log n)
If the problem was changed to :
1 + 1/2 + 1/4 + ... + 1/n
series can now be written as:
1/2^0 + 1/2^1 + 1/2^2 + ... + 1/2^(k)
How many times loop will run? 0 to k = k + 1 times.From both series we can see 2^k = n. Hence k = log (n). So, number of times it ran = log(n) + 1 = O(log n).

Find 3D point along the line at given distance

I have a problem and please let me know if my solution is correct.
I have a known point, at location A(x1,y1,z1) and the origin O(0,0,0) and I would like to find the coordinates of a point B(x2,y2,z2) that is located on the line OA, and the distance OB is 1.2 times greater then OA.
So, my idea is to obtain the equation of the line formed by points O and A.
The direction of OA is (-x1, -y1, -z1), so the equation of the line is:
x = -x1*t;
y = -y1*t;
z = -z1*t;
Distance OA is sqrt( (x1-0)^2 + (y1-0)^2 + (z1-0)^2). KNOWN
Distance OB is sqrt( (x2-0)^2 + (y2-0)^2 + (z2-0)^2). UNKNOWN
I can replace the x, y, z points determined for the line equation in the distance OB, and the result should be 1.2 times greater then the distance OA.
So, sqrt( (-x1*t-0)^2 + (-y1*t-0)^2 + (-z1*t-0)^2) = 1.2 * dist(OA).
I find t from here, solving the quadratic equation and I obtain the coordinates of the point by replacing the t in the equation of the line.
Is this correct?
Thank you for your time.
EDIT:
This is my code:
rangeRatio = 1.114;
norm = sqrt((P2(1) - P1(1))^2 + (P2(2) - P1(2))^2 + (P2(3) - P1(3))^2);
P3(1) = P1(1) + ((P2(1,1) - P1(1)) /norm) * rangeRatio;
P3(2) = P1(2) + ((P2(1,2) - P1(2)) /norm) * rangeRatio;
P3(3) = P1(3) + ((P2(1,3) - P1(3)) /norm) * rangeRatio;
I tried also norm = 1, and i get slightly different results but still not always colinear.
Thank you
It is even a lot easier; you can just multiply a, b and c by 1.2. This gives a line that is 1.2 times the size of the original line.