Making Maxima code run faster (radius of convergence)

I am trying to find the radius of convergence of some Taylor series related to Newton's method applied to the Mandelbrot set. I have written some Maxima code to this end; the R computed is incorrect at least for the first commented case (1 / inf instead of 1 / 2), so I am trying to print some values to see how it behaves numerically. However Maxima takes too much time (1 second for the first value, 17 seconds for the second, and I gave up waiting for the third). The performance in the third commented case is even slower. How might I speed it up?
f(c, z) := z^2 + c;
/* h(c) := f(c, f(c, 0)); /* simpler case is much faster */ */
h(c) := ratsimp(f(c, f(c, f(c, 0))) / f(c, 0)); /* polynomial with roots at period 3 components */
/* h(c) := ratsimp(f(c, f(c, f(c, f(c, 0)))) / f(c, f(c, 0))); /* next case is much slower */ */
hh(c) := ratsimp(diff(h(c), c));
N(z) := ratsimp(z - h(z) / hh(z)); /* Newton's method */
M(z) := ratsimp(subst(c = c + z, N(c)) - c); /* perturbed */
for ct in allroots(h(c) = 0) do
(
print(ct),
/* coefficient of Taylor series */
a(n) := at(diff((subst(lhs(ct) = bfloat(rhs(ct)), M(z))), z, n) / factorial(n), z = 0),
R : 1 / limit(abs(a(n+1)/a(n)), n, infinity), /* radius of convergence */
print(R), /* prints 1/inf, but this is incorrect at least for the first commented case */
print(bfloat(1 / abs(a(11)/a(10)))), /* prints after 1 second */
print(bfloat(1 / abs(a(101)/a(100)))), /* prints after 17 seconds */
print(bfloat(1 / abs(a(1001)/a(1000)))) /* gave up waiting */
);
EDIT: Correcting the infinity vs inf confusion, consider the first commented case h(c) = f(c, f(c, 0)) = c^2 + c. Then M(z) = (z^2 - h(c)) / (2 * z + 2 * c + 1), which for the root c = 0 gives M(z) = z^2 / (1 - (-2 * z)), the sum of the geometric series z^2 (1 + (-2) * z + (-2)^2 * z^2 + ... + (-2)^n z^n + ...). The coefficients are therefore a(n + 2) = (-2)^n, so 1 / limit(abs(a(n + 1)/a(n)), n, inf) should be 1/2, but Maxima reports a divide by zero:
f(c, z) := z^2 + c;
h(c) := f(c, f(c, 0));
hh(c) := ratsimp(diff(h(c), c));
N(z) := ratsimp(z - h(z) / hh(z));
M(z) := ratsimp(subst(c = c + z, N(c)) - c);
for ct in solve(h(c) = 0) do
(
print(ct),
a(n) := at(diff((subst(lhs(ct) = bfloat(rhs(ct)), M(z))), z, n) / factorial(n), z = 0),
invR : limit(abs(a(n+1)/a(n)), n, inf),
print(abs(a(10+1)/a(10))),
print(abs(a(100+1)/a(100))),
print(abs(a(1000+1)/a(1000))),
print(invR)
);
which prints
...
c = - 1
`rat' replaced -1.0B0 by -1/1 = -1.0B0
`rat' replaced -1.0B0 by -1/1 = -1.0B0
2.0b0
2.0b0
2.0b0
0
c = 0
`rat' replaced 1.0B0 by 1/1 = 1.0B0
`rat' replaced 1.0B0 by 1/1 = 1.0B0
2.0b0
2.0b0
2.0b0
0
I want to evaluate the high order terms of the other examples numerically to see how it behaves, because the limit does not seem to be calculated properly. But my Maxima code is too slow to make this practical, and while SymPy performs ok for the period 3 (cubic) case (see linked question), for the period 4 (degree 6) it also takes too long.
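As a numerical cross-check of the geometric-series argument above, here is a minimal Python sketch (not Maxima) that obtains the Taylor coefficients of a rational function from the usual power-series division recurrence instead of repeated symbolic differentiation. The helper name series_coeffs and the coefficient-list representation are assumptions made for this illustration; it is only applied to the simple case M(z) = z^2 / (1 + 2 z) at the root c = 0.

```python
from fractions import Fraction

def series_coeffs(num, den, n_terms):
    """Taylor coefficients at z = 0 of num(z) / den(z).

    Hypothetical helper: num and den are coefficient lists, lowest degree
    first, and den[0] must be non-zero. Each new coefficient costs
    O(len(den)) operations.
    """
    coeffs = []
    for n in range(n_terms):
        s = Fraction(num[n]) if n < len(num) else Fraction(0)
        # den(z) * series(z) = num(z), compared at the z^n term
        for k in range(1, min(n, len(den) - 1) + 1):
            s -= den[k] * coeffs[n - k]
        coeffs.append(s / den[0])
    return coeffs

# M(z) = z^2 / (1 + 2 z), the root c = 0 of h(c) = c^2 + c
a = series_coeffs([0, 0, 1], [1, 2], 40)
# ratio estimate of the radius of convergence, skipping the zero coefficients a[0], a[1]
ratios = [abs(a[n] / a[n + 1]) for n in range(2, 39)]
print(float(ratios[-1]))
```

For this case every ratio equals 1/2 exactly, in agreement with the hand derivation. Whether the same recurrence is fast enough for the period 3 and period 4 cases has not been checked here.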

Related

How can I find the R**2 value in a linear regression analysis?

I'm trying to write code for a linear regression analysis, but it raises TypeError: can't multiply sequence by non-int of type 'list'.
I am trying to learn how to compute the linear regression (correlation) coefficient:
import numpy as np

def corr_coef(x, y):
    N = len(x)
    num = (N * (x * y).sum()) - (x.sum() * y.sum())
    den = np.sqrt((N * (x**2).sum() - x.sum()**2) * (N * (y**2).sum() - y.sum()**2))
    R = num / den
    return R
num = (N * (x * y).sum()) - (x.sum() * y.sum())
TypeError: can't multiply sequence by non-int of type 'list'
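The error arises because x and y are plain Python lists, for which `*` is not an element-wise product. A minimal sketch of one possible fix, assuming the inputs can be converted to NumPy arrays (the sample data below is made up for illustration):

```python
import numpy as np

def corr_coef(x, y):
    # Convert to arrays so that * and ** act element-wise and .sum() exists
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    num = n * (x * y).sum() - x.sum() * y.sum()
    den = np.sqrt((n * (x**2).sum() - x.sum()**2) * (n * (y**2).sum() - y.sum()**2))
    return num / den

# Illustrative data only
r = corr_coef([1, 2, 3, 4], [2.0, 4.1, 5.9, 8.2])
print(r, r**2)  # for a simple linear fit, R**2 is the square of R
```

The same value can be cross-checked against np.corrcoef(x, y)[0, 1].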

Octave fminunc "trust region become excessively small"

I am trying to run a linear regression using fminunc to optimize my parameters. However, while the code never fails, the fminunc function seems to run only once and not converge. The exit flag that fminunc returns is -3, which, according to the documentation, means "The trust region radius became excessively small". What does this mean and how can I fix it?
This is my main:
load('data.mat');
% returns matrix X, a matrix of data
% Initialize parameters
[m, n] = size(X);
X = [ones(m, 1), X];
initialTheta = zeros(n + 1, 1);
alpha = 1;
lambda = 0;
costfun = @(t) costFunction(t, X, surv, lambda, alpha);
options = optimset('GradObj', 'on', 'MaxIter', 1000);
[theta, cost, info] = fminunc(costfun, initialTheta, options);
And the cost function:
function [J, grad] = costFunction(theta, X, y, lambda, alpha)
%COSTFUNCTION Implements a logistic regression cost function.
% [J grad] = COSTFUNCTION(initialParameters, X, y, lambda) computes the cost
% and the gradient for the logistic regression.
%
m = size(X, 1);
J = 0;
grad = zeros(size(theta));
% un-regularized
z = X * theta;
J = (-1 / m) * y' * log(sigmoid(z)) + (1 - y)' * log(1 - sigmoid(z));
grad = (alpha / m) * X' * (sigmoid(z) - y);
% regularization
theta(1) = 0;
J = J + (lambda / (2 * m)) * (theta' * theta);
grad = grad + alpha * ((lambda / m) * theta);
endfunction
Any help is much appreciated.
There are a few issues with the code above:
fminunc chooses its own step size, so you don't have to provide a learning rate alpha. Remove all instances of it from the code, and your gradient expressions should look like the following:
grad = (1 / m) * X' * (sigmoid(z) - y);
and
grad = grad + ((lambda / m) * theta); % This isn't quite correct, see below
In the regularization of the gradient, you can't use theta directly, because the term for j = 0 (the intercept) must not be regularized. There are a number of ways to do this, but here is one:
temp = theta;
temp(1) = 0;
grad = grad + ((lambda / m) * temp);
You are missing a set of parentheses in your cost function. The (-1 / m) is being applied only to the first term instead of to the whole sum. It should look like this:
J = (-1 / m) * ( y' * log(sigmoid(z)) + (1 - y)' * log(1 - sigmoid(z)) );
And finally, as a nit, a lambda value of 0 means that your regularization does nothing.
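Putting the corrections above together, here is a minimal sketch in Python with NumPy (not the original Octave). The names sigmoid and cost_function and the toy data are assumptions made for this illustration; the structure mirrors the corrected expressions for J and grad.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y, lam):
    """Regularized logistic regression cost and gradient (no alpha)."""
    m = X.shape[0]
    h = sigmoid(X @ theta)
    # Copy of theta with the intercept zeroed, so it is not regularized
    temp = theta.copy()
    temp[0] = 0.0
    # (-1 / m) multiplies the whole log-likelihood sum
    J = (-1.0 / m) * (y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h))
    J += (lam / (2.0 * m)) * (temp @ temp)
    grad = (1.0 / m) * (X.T @ (h - y)) + (lam / m) * temp
    return J, grad

# Toy data for illustration only; the first column is the intercept term
X = np.array([[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
J0, g0 = cost_function(np.zeros(2), X, y, lam=0.0)
print(J0, g0)
```

At theta = 0 every prediction is 0.5, so the unregularized cost equals log(2), which is a quick sanity check for the parenthesization.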

How can I solve exponential equation in Maxima CAS

I have function in Maxima CAS :
f(t) := (2*exp(2*%i*%pi*t) - exp(4*%pi*t*%i))/4;
here:
t is a real number between 0 and 1
function should give a point on the boundary of main cardioid of Mandelbrot set
How can I solve equation :
eq1:c=f(t);
(where c is a complex number)
?
Solve doesn't work
solve( eq1,t);
result is empty list
[]
Result of this equation should give real number t ( internal angle or rotation number ) from complex point c
EDIT: Thanks to a comment by @JosehDoggie
I can draw initial equation using:
load(draw)$
f(t):=(2*exp(%i*t) - exp(2*t*%i))/4;
draw2d(
key="main cardioid",
nticks=200,
parametric( 0.5*cos(t) - 0.25*cos(2*t), 0.5*sin(t) - 0.25*sin(2*t), t,0,2*%pi),
title="main cardioid of M set "
)$
or
draw2d(polar(abs(exp(t*%i)/2 -exp(2*t*%i)/4),t,0,2*%pi));
Similar image ( cardioid) is here
Edit2:
(%i1) eq1:c = exp(%pi*t*%i)/2 - exp(2*%pi*t*%i)/4;
(%o1) c = %e^(%i*%pi*t)/2 - %e^(2*%i*%pi*t)/4
(%i2) solve(eq1,t);
(%o2) [t = -%i*log(1 - sqrt(1 - 4*c))/%pi, t = -%i*log(sqrt(1 - 4*c) + 1)/%pi]
So :
f1(c):=float(cabs( - %i* log(1 - sqrt(1 - 4* c))/%pi));
f2(c):=float(cabs( - %i* log(1 + sqrt(1 - 4* c))/%pi));
but the results are not good.
Edit 3 :
Maybe I should start from this.
I have:
complex numbers c ( = boundary of cardioid)
real numbers t ( from 0 to 1 or sometimes from 0 to 2*pi )
function f which computes c from t : c= f(t)
I want to find function which computes t from c: t = g(c)
testing values :
t = 0 , c= 1/4
t = 1/2 , c= -3/4
t = 1/3 , c = -0.125 +0.649519052838329*%i
t = 2/5 , c = -0.481762745781211 +0.531656755220025*%i
t = 0.118033988749895 c = 0.346828007859920 +0.088702386914555*%i
t = 0.618033988749895 , c = -0.390540870218399 -0.586787907346969*%i
t = 0.718033988749895 c = 0.130349371041523 -0.587693986342220*%i
load("to_poly_solve") $
e: (2*exp(2*%i*%pi*t) - exp(4*%pi*t*%i))/4 - c $
s: to_poly_solve(e, t) $
s: maplist(lambda([e], rhs(first(e))), s) $ /* unpack arguments of %union */
ratexpand(s);
Outputs
(%o6) [%z7 - %i*log(1 - sqrt(1 - 4*c))/(2*%pi), %z9 - %i*log(sqrt(1 - 4*c) + 1)/(2*%pi)]
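To evaluate this numerically, here is a minimal sketch in Python (not Maxima), assuming c lies on the boundary of the main cardioid. With w = exp(2*%i*%pi*t), the equation reads w^2 - 2 w + 4 c = 0, and since 1 - 4 c = (1 - w)^2 with Re(1 - w) >= 0 on the unit circle, the principal square root selects the first branch above, w = 1 - sqrt(1 - 4 c). The real number t is then the argument of w divided by 2*%pi, reduced to [0, 1), which corresponds to a particular choice of the integer %z7. The function names f and g are taken from the question; everything else is an assumption of this sketch.

```python
import cmath
import math

def f(t):
    """Point on the boundary of the main cardioid for internal angle t."""
    w = cmath.exp(2j * math.pi * t)
    return (2 * w - w * w) / 4

def g(c):
    """Internal angle t in [0, 1) for a point c on the cardioid boundary."""
    w = 1 - cmath.sqrt(1 - 4 * c)  # principal branch of the square root
    return (cmath.phase(w) / (2 * math.pi)) % 1.0

for t in (1 / 3, 2 / 5, 0.118033988749895, 0.618033988749895):
    print(t, g(f(t)))
```

Taking the argument (equivalently the real part of -%i*log(w)) rather than the modulus is the difference from the f1 and f2 attempts with cabs.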

Finding intersection points of line and circle

I'm trying to understand what this function does. It was given by my teacher, and I can't work out the logic behind the formulas for the x and y coordinates. From my math class I know the formulas for finding intersections, but they are confusing once translated into code. In particular, I don't understand how the formulas for a, b, and c are defined, or how the x and y coordinates are computed.
void Intersection::getIntersectionPoints(const Arc& arc, const Line& line) {
double a, b, c, mu, det;
std::pair<double, double> xPoints;
std::pair<double, double> yPoints;
std::pair<double, double> zPoints;
//(m^2+1)x^2 + 2(mc - mq - p)x + (q^2 - r^2 + p^2 - 2cq + c^2) = 0.
//a = m^2 + 1;
//b = 2 * (mc - mq - p);
//c = q^2 - r^2 + p^2 - 2cq + c^2
a = pow((line.end().x - line.start().x), 2) + pow((line.end().y - line.start().y), 2) + pow((line.end().z - line.start().z), 2);
b = 2 * ((line.end().x - line.start().x)*(line.start().x - arc.center().x)
+ (line.end().y - line.start().y)*(line.start().y - arc.center().y)
+ (line.end().z - line.start().z)*(line.start().z - arc.center().z));
c = pow((arc.center().x), 2) + pow((arc.center().y), 2) +
pow((arc.center().z), 2) + pow((line.start().x), 2) +
pow((line.start().y), 2) + pow((line.start().z), 2) -
2 * (arc.center().x * line.start().x + arc.center().y * line.start().y +
arc.center().z * line.start().z) - pow((arc.radius()), 2);
det = pow(b, 2) - 4 * a * c;
/* Tangent to the circle */
if (Math<double>::isEqual(det, 0.0, 0.00001)) {
if (!Math<double>::isEqual(a, 0.0, 0.00001))
mu = -b / (2 * a);
else
mu = 0.0;
// x = h + t * ( p − h )
xPoints.second = xPoints.first = line.start().x + mu * (line.end().x - line.start().x);
yPoints.second = yPoints.first = line.start().y + mu * (line.end().y - line.start().y);
zPoints.second = zPoints.first = line.start().z + mu * (line.end().z - line.start().z);
}
if (Math<double>::isGreater(det, 0.0, 0.00001)) {
// first intersection
mu = (-b - sqrt(pow(b, 2) - 4 * a * c)) / (2 * a);
xPoints.first = line.start().x + mu * (line.end().x - line.start().x);
yPoints.first = line.start().y + mu * (line.end().y - line.start().y);
zPoints.first = line.start().z + mu * (line.end().z - line.start().z);
// second intersection
mu = (-b + sqrt(pow(b, 2) - 4 * a * c)) / (2 * a);
xPoints.second = line.start().x + mu * (line.end().x - line.start().x);
yPoints.second = line.start().y + mu * (line.end().y - line.start().y);
zPoints.second = line.start().z + mu * (line.end().z - line.start().z);
}
Denoting the line's start point as A, end point as B, circle's center as C, circle's radius as r and the intersection point as P, then we can write P as
P=(1-t)*A + t*B = A+t*(B-A) (1)
Point P will also locate on the circle, therefore
|P-C|^2 = r^2 (2)
Plugging equation (1) into equation (2), you will get
|B-A|^2*t^2 + 2*((B-A)·(A-C))*t + (|A-C|^2 - r^2) = 0 (3)
This is how you get the formulas for a, b and c in the program you posted. After solving for t, you obtain the intersection point(s) from equation (1). Since equation (3) is quadratic, you might get 0, 1 or 2 values for t, which correspond to the geometric configurations where the line does not intersect the circle, is exactly tangent to the circle, or passes through the circle at two locations.
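The derivation can be sketched in a few lines of Python (not the original C++). The function name line_sphere_intersections, the tuple representation of points, and the tolerance eps are assumptions made for this illustration; a, b and c are the coefficients of equation (3).

```python
import math

def line_sphere_intersections(A, B, C, r, eps=1e-9):
    """Intersections of the line through A and B with the circle/sphere
    of center C and radius r, returned as a list of 0, 1 or 2 points."""
    d = [bi - ai for ai, bi in zip(A, B)]           # B - A
    f = [ai - ci for ai, ci in zip(A, C)]           # A - C
    a = sum(di * di for di in d)                    # |B-A|^2
    b = 2 * sum(di * fi for di, fi in zip(d, f))    # 2 (B-A).(A-C)
    c = sum(fi * fi for fi in f) - r * r            # |A-C|^2 - r^2
    if a == 0:
        raise ValueError("A and B must be distinct points")
    det = b * b - 4 * a * c
    if det < -eps:
        return []                                   # line misses the circle
    if abs(det) <= eps:
        ts = [-b / (2 * a)]                         # tangent: one value of t
    else:
        root = math.sqrt(det)
        ts = [(-b - root) / (2 * a), (-b + root) / (2 * a)]
    # Equation (1): P = A + t * (B - A)
    return [tuple(ai + t * di for ai, di in zip(A, d)) for t in ts]

print(line_sphere_intersections((-2, 0, 0), (2, 0, 0), (0, 0, 0), 1))
```

For this example the two values of t are 0.25 and 0.75, giving the points (-1, 0, 0) and (1, 0, 0).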

OpenACC red-black Gauss-Seidel slower than CPU

I added OpenACC directives to my red-black Gauss-Seidel solver for the Laplace equation (a simple heated plate problem), but the GPU-accelerated code is no faster than the CPU, even for large problems.
I also wrote a CUDA version, and that is much faster than both (for 512x512, on the order of 2 seconds compared to 25 for CPU and OpenACC).
Can anyone think of a reason for this discrepancy? I realize that CUDA offers the most potential speed, but OpenACC should give something better than the CPU for larger problems (like the Jacobi solver for the same sort of problem demonstrated here).
Here is the relevant code (the full working source is here):
#pragma acc data copyin(aP[0:size], aW[0:size], aE[0:size], aS[0:size], aN[0:size], b[0:size]) copy(temp_red[0:size_temp], temp_black[0:size_temp])
// red-black Gauss-Seidel with SOR iteration loop
for (iter = 1; iter <= it_max; ++iter) {
Real norm_L2 = 0.0;
// update red cells
#pragma omp parallel for shared(aP, aW, aE, aS, aN, temp_black, temp_red) \
reduction(+:norm_L2)
#pragma acc kernels present(aP[0:size], aW[0:size], aE[0:size], aS[0:size], aN[0:size], b[0:size], temp_red[0:size_temp], temp_black[0:size_temp])
#pragma acc loop independent gang vector(4)
for (int col = 1; col < NUM + 1; ++col) {
#pragma acc loop independent gang vector(64)
for (int row = 1; row < (NUM / 2) + 1; ++row) {
int ind_red = col * ((NUM / 2) + 2) + row; // local (red) index
int ind = 2 * row - (col % 2) - 1 + NUM * (col - 1); // global index
#pragma acc cache(aP[ind], b[ind], aW[ind], aE[ind], aS[ind], aN[ind])
Real res = b[ind] + (aW[ind] * temp_black[row + (col - 1) * ((NUM / 2) + 2)]
+ aE[ind] * temp_black[row + (col + 1) * ((NUM / 2) + 2)]
+ aS[ind] * temp_black[row - (col % 2) + col * ((NUM / 2) + 2)]
+ aN[ind] * temp_black[row + ((col + 1) % 2) + col * ((NUM / 2) + 2)]);
Real temp_old = temp_red[ind_red];
temp_red[ind_red] = temp_old * (1.0 - omega) + omega * (res / aP[ind]);
// calculate residual
res = temp_red[ind_red] - temp_old;
norm_L2 += (res * res);
} // end for row
} // end for col
// update black cells
#pragma omp parallel for shared(aP, aW, aE, aS, aN, temp_black, temp_red) \
reduction(+:norm_L2)
#pragma acc kernels present(aP[0:size], aW[0:size], aE[0:size], aS[0:size], aN[0:size], b[0:size], temp_red[0:size_temp], temp_black[0:size_temp])
#pragma acc loop independent gang vector(4)
for (int col = 1; col < NUM + 1; ++col) {
#pragma acc loop independent gang vector(64)
for (int row = 1; row < (NUM / 2) + 1; ++row) {
int ind_black = col * ((NUM / 2) + 2) + row; // local (black) index
int ind = 2 * row - ((col + 1) % 2) - 1 + NUM * (col - 1); // global index
#pragma acc cache(aP[ind], b[ind], aW[ind], aE[ind], aS[ind], aN[ind])
Real res = b[ind] + (aW[ind] * temp_red[row + (col - 1) * ((NUM / 2) + 2)]
+ aE[ind] * temp_red[row + (col + 1) * ((NUM / 2) + 2)]
+ aS[ind] * temp_red[row - ((col + 1) % 2) + col * ((NUM / 2) + 2)]
+ aN[ind] * temp_red[row + (col % 2) + col * ((NUM / 2) + 2)]);
Real temp_old = temp_black[ind_black];
temp_black[ind_black] = temp_old * (1.0 - omega) + omega * (res / aP[ind]);
// calculate residual
res = temp_black[ind_black] - temp_old;
norm_L2 += (res * res);
} // end for row
} // end for col
// calculate residual
norm_L2 = sqrt(norm_L2 / ((Real)size));
if(iter % 100 == 0) printf("%5d, %0.6f\n", iter, norm_L2);
// if tolerance has been reached, end SOR iterations
if (norm_L2 < tol) {
break;
}
}
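For reference, the red-black ordering used by the loops above can be sketched compactly in Python with NumPy. This is only an illustration of the update order and of the residual norm, not a port of the solver: it assumes a uniform grid for the Laplace equation (each interior cell relaxes toward the average of its four neighbors) rather than the general aP, aW, aE, aS, aN coefficients, and the function name red_black_sor is hypothetical.

```python
import numpy as np

def red_black_sor(T, omega=1.5, tol=1e-6, it_max=10000):
    """Red-black Gauss-Seidel with SOR on a 2D float array T, in place.
    The outer ring of T holds fixed boundary values."""
    ny, nx = T.shape
    i, j = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    interior = np.zeros(T.shape, dtype=bool)
    interior[1:-1, 1:-1] = True
    red = interior & ((i + j) % 2 == 0)
    black = interior & ((i + j) % 2 == 1)
    norm = float("inf")
    for it in range(1, it_max + 1):
        sq = 0.0
        # All cells of one color depend only on cells of the other color,
        # so each half-sweep can be updated in parallel.
        for mask in (red, black):
            avg = np.zeros_like(T)
            avg[1:-1, 1:-1] = 0.25 * (T[:-2, 1:-1] + T[2:, 1:-1]
                                      + T[1:-1, :-2] + T[1:-1, 2:])
            delta = omega * (avg[mask] - T[mask])   # new value minus old value
            T[mask] += delta
            sq += float(np.sum(delta ** 2))
        norm = np.sqrt(sq / interior.sum())
        if norm < tol:
            return it, norm
    return it_max, norm

# Small illustrative plate: top edge held at 1, other edges at 0
T = np.zeros((16, 16))
T[0, :] = 1.0
print(red_black_sor(T))
```

The black half-sweep reads the red values updated just before it, which is what distinguishes this from a Jacobi iteration.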
Alright, I found a semi-solution that reduces the time somewhat significantly for smaller problems.
If I insert the lines:
acc_init(acc_device_nvidia);
acc_set_device_num(0, acc_device_nvidia);
before I start my timer, in order to activate and set the GPU, the time for the 512x512 problem drops to 9.8 seconds, and down to 42 for 1024x1024. Increasing the problem size further shows how fast even OpenACC can be compared to running on four CPU cores.
With this change, the OpenACC code is on the order of 2x slower than the CUDA code, and the gap narrows to roughly 1.2x as the problem size grows.
I downloaded your full code, compiled it, and ran it. It did not stop running, and for the statement
if(iter % 100 == 0) printf("%5d, %0.6f\n", iter, norm_L2);
the result was:
100, nan
200, nan
....
I changed all variables with type Real into type float and the result was:
100, 0.000654
200, 0.000370
..., ....
..., ....
8800, 0.000002
8900, 0.000002
9000, 0.000001
9100, 0.000001
9200, 0.000001
9300, 0.000001
9400, 0.000001
9500, 0.000001
9600, 0.000001
9700, 0.000001
CPU
Iterations: 9796
Total time: 5.594017 s
With NUM = 1024 the result was:
Iterations: 27271
Total time: 25.949905 s