I made this simple program to compute the minimum of an objective function using the gradient method. I tested it on a simple 1D function (http://en.wikipedia.org/wiki/Gradient_descent) and it works very well, giving me the exact position of the minimum.
I generalized it to the 2D function x^4+2y^4, which has only one zero, at (0,0), as follows:
real*8 function cubic(xvect_old,n)
  real(8), dimension(n) :: xvect_old
  cubic = 4.d0*(xvect_old(1)**3.d0)+8.d0*(xvect_old(2)**3.d0)
  !cubic = 4.d0*(xvect_old(1)**3.d0)-9.d0*(xvect_old(1)**2.d0)
end function cubic
program findmin
  implicit none
  integer, parameter :: n=2
  integer :: i,j,m
  real(8) :: cubic
  real(8), dimension(n) :: xvect,xvect_old
  real(8) :: eps,max_prec

  m=30
  eps = 0.01d0 ! step size
  xvect_old = 0.d0
  xvect(1) = 2.2d0 ! first guess
  xvect(2) = 3.1d0
  max_prec = 1e-16

  do while ( abs(xvect(1) - xvect_old(1)) > max_prec .and. &
           & abs(xvect(2) - xvect_old(2)) > max_prec)
    xvect_old(1:2) = xvect(1:2)
    xvect(1:2) = xvect_old(1:2) - eps*cubic(xvect_old,n)
  end do

  print*, "Local minimum occurs at : ", xvect(1:2)
end program findmin
But even when I start very near the correct position (say, taking (1.2, 1.1) as the first guess), it gives me solutions that are clearly wrong:
(-0.5017 ; 0.3982)
Is the method implemented wrong, or is there some gap in my understanding of how accurate the method is? I know there are more advanced methods, like genetic algorithms, which are maybe faster, but are they also easy to implement?
Thanks a lot.
cubic is supposed to return the gradient and thus must be a vector.
Try the following:
module functions
  implicit none
contains
  function cubic(x,n) result(g)
    integer, intent(in) :: n
    real(8), dimension(n), intent(in) :: x
    real(8), dimension(n) :: g
    g = (/ 4.d0*(x(1)**3.d0), 8.d0*(x(2)**3.d0) /)
  end function cubic
end module
program SOGradient
  use functions
  implicit none
  integer, parameter :: n=2
  integer :: i,j,m
  real(8), dimension(n) :: xvect,xvect_old
  real(8) :: eps,max_prec

  m=30
  eps = 0.01d0 ! step size
  xvect_old = (/ 0.d0, 0.d0 /)
  ! first guess
  xvect = (/ 2.2d0, 3.1d0 /)
  max_prec = 1e-12

  do while ( MAXVAL(ABS(xvect-xvect_old)) > max_prec )
    xvect_old = xvect
    xvect = xvect_old - eps*cubic(xvect_old,n)
  end do

  print*, "Local minimum occurs at : ", xvect
end program SOGradient
Of course, the closer you get to the minimum, the smaller the gradient (and hence the step), so convergence is really slow. I would suggest using a Newton-Raphson type method to find where the gradient is zero.
So to find the minimum of f(x,y), find the gradient g(x,y) = [gx, gy] = [df/dx, df/dy] and the gradient of the gradient (the Hessian) h(x,y) = [[dgx/dx, dgx/dy],[dgy/dx, dgy/dy]].
Now you iterate with
[x,y] -> [x,y] - h(x,y)^(-1)*g(x,y)
In your case f(x,y) = x^4+2*y^4, g = [4*x^3, 8*y^3] and h = [[12*x^2, 0],[0, 24*y^2]], so the iteration becomes
[x,y] -> [x,y] - [x/3, y/3]
which obviously has its solution at (0,0), and converges there much faster.
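Not the question's Fortran, but a minimal Python sketch of that Newton iteration (variable and function names are mine), just to show the update [x,y] -> [x,y] - h(x,y)^(-1)*g(x,y):

import numpy as np

def grad(p):                      # g(x,y) = [4*x^3, 8*y^3]
    x, y = p
    return np.array([4.0*x**3, 8.0*y**3])

def hess(p):                      # h(x,y) = [[12*x^2, 0], [0, 24*y^2]]
    x, y = p
    return np.array([[12.0*x**2, 0.0], [0.0, 24.0*y**2]])

p = np.array([2.2, 3.1])          # same first guess as above
for it in range(200):
    step = np.linalg.solve(hess(p), grad(p))   # h^(-1)*g, here simply [x/3, y/3]
    p = p - step
    if np.max(np.abs(step)) < 1e-12:
        break
print(it, p)                      # converges to (0,0) in a few dozen iterations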
Extremely new to Julia, so please pardon any obvious oversights on my end.
I am trying to estimate a piecewise likelihood function through optimization. I have the code working in R, but have begun translating it to Julia in the hopes of faster estimation, for eventual bootstrapping.
Here is the current block of code that I am trying (v and x are defined elsewhere as 1000x1 vectors):
function est(a,b)
    function pwll(v,x)
        if v>4
            ILL=pdf(Poisson(exp(a+b*x)), v)
        elseif v==4
            ILL=pdf(Poisson(exp(a+b*x)), 4)+pdf(Poisson(exp(a+b*x)),3)+pdf(Poisson(exp(a+b*x)),2)
        else v==0
            ILL=pdf(Poisson(exp(a+b*x)), 1)+pdf(Poisson(exp(a+b*x)), 0)
        end
        return(ILL)
    end
    ILL=pwll.(v, x)
    function fixILL(x)
        if x==0
            x=0.00000000000000001
        else
            x=x
        end
    end
    ILL=fixILL.(ILL)
    LILL=log10.(ILL)
    LL=-1*LILL
    return(sum(LL))
end
using Optim
params0=[1,1]
optimize(est, params0)
And the error message(s) I am getting are:
ERROR: InexactError: Int64(NaN)
Stacktrace:
[1] Int64(x::Float64)
# Base ./float.jl:788
[2] x_of_nans(x::Vector{Int64}, Tf::Type{Int64}) (repeats 2 times)
# NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/NLSolversBase.jl:60
[3] NonDifferentiable(f::Function, x::Vector{Int64}, F::Int64; inplace::Bool)
# NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/objective_types/nondifferentiable.jl:11
[4] NonDifferentiable(f::Function, x::Vector{Int64}, F::Int64)
# NLSolversBase ~/.julia/packages/NLSolversBase/kavn7/src/objective_types/nondifferentiable.jl:10
[5] promote_objtype(method::NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, x::Vector{Int64}, autodiff::Symbol, inplace::Bool, args::Function)
# Optim ~/.julia/packages/Optim/tP8PJ/src/multivariate/optimize/interface.jl:63
[6] optimize(f::Function, initial_x::Vector{Int64}; inplace::Bool, autodiff::Symbol, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
# Optim ~/.julia/packages/Optim/tP8PJ/src/multivariate/optimize/interface.jl:86
[7] optimize(f::Function, initial_x::Vector{Int64})
# Optim ~/.julia/packages/Optim/tP8PJ/src/multivariate/optimize/interface.jl:83
[8] top-level scope
# ~/Documents/Projects/ki_new/peicewise_ll.jl:120
I understand that the error seems to be coming from the function to be optimized being non-differentiable. A fairly direct translation works well in R, using the built-in optim() function.
Can anyone provide any insight?
I have tried the code displayed above with multiple variations. The function to be optimized works on its own; I am struggling with the optimization (the issues may stem from the function being inefficiently written).
Here's an adapted version of your code which produces a solution (two changes matter: the starting point is now the float vector [1.0, 1.0], since the InexactError in your trace comes from the integer vector [1, 1], and the objective obj takes a single vector argument, as optimize expects):
using Distributions, Optim
function pwll(v, x, a, b)
d = Poisson(exp(a+b*x))
if v > 4
return pdf(d, v)
elseif v == 4
return pdf(d, 4) + pdf(d, 3) + pdf(d, 2)
else
return pdf(d, 1) + pdf(d, 0)
end
end
fixILL(x) = iszero(x) ? 1e-17 : x
est(a, b, v, x) = sum(-1 .* log10.(fixILL.(pwll.(v, x, a, b))))
v = 4; x = 0.5 # Defining these here as they are not given in your post
obj(input; v = v, x = x) = est(input[1], input[2], v, x)
optimize(obj, [1.0, 1.0])
I have no idea whether this is correct, of course; check it against some sort of known result if you can.
I have the following code section (appropriately simplified)
cpdef double func(double[:] x, double[:] y) nogil:
    cdef:
        double[:] _y
    _y = y  # Here's my trouble
    _y[2] = 2. - y[1]
    _y[1] = 1.
    return func2(x, _y)
I'm trying to create a copy of y that I can manipulate in the function. The problem is, any changes made to _y get passed back to y. I don't want to make changes to y, just to this temporary copy of it.
The function is nogil, so I can't use _y = y.copy() (already tried). I also tried _y[:] = y, based on the Cython guidance pages, but apparently I can't do that if _y hasn't been initialized yet.
So... how do I make a copy of a 1d vector without invoking the gil?
I have a problem with my Negamax algorithm and hope someone can help me.
I'm writing it in Cython.
My search method is the following:
cdef _search(self, object game_state, int depth, long alpha, long beta, int max_depth):
    if depth == max_depth or game_state.is_terminated:
        value = self.evaluator.evaluate(game_state)  # evaluates based on the current player
        return value, []

    moves = self.prepare_moves(depth, game_state)  # getting moves and sorting
    max_value = LONG_MIN
    for move in moves:
        new_board = game_state.make_move(move)
        value, pv_moves = self._search(new_board, depth + 1, -beta, -alpha, max_depth)
        value = -value

        if max_value < value:
            max_value = value
            best_move = move
            best_pv_moves = pv_moves

        if alpha < max_value:
            alpha = max_value

        if max_value >= beta:
            return LONG_MAX, []

    best_pv_moves.insert(0, best_move)
    return alpha, best_pv_moves
In many examples you break after a cutoff is detected, but when I do this the algorithm doesn't find the optimal solution. I'm testing against some chess puzzles and I was wondering why this is the case. If I return the maximum number after a cutoff is detected it works fine, but it takes a long time (252 sec for depth 6)...
Speed: nodes per second: 21550.33203125
Or if you have other improvements, let me know (I use a transposition table, PVS and killer heuristics).
It turns out I was using the C limits:
cdef extern from "limits.h":
    cdef long LONG_MAX
    cdef long LONG_MIN
and when you try to negate LONG_MIN with -LONG_MIN, you get LONG_MIN back because of overflow: LONG_MAX is -(LONG_MIN + 1), so -LONG_MIN does not fit in a long and wraps around.
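The same wrap-around can be seen from Python with a fixed-width integer type; a tiny illustration using NumPy's int64 (this is my example, not part of the original code):

import numpy as np

m = np.int64(np.iinfo(np.int64).min)   # -9223372036854775808, the analogue of LONG_MIN
print(-m)                              # typically wraps back to -9223372036854775808
                                       # (NumPy may also emit an overflow RuntimeWarning)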
I am unable to understand the source of this error:
line 327, in function_wrapper
return function(*(wrapper_args + args))
TypeError: SSVOptionPriceObjFunc() missing 1 required positional argument: 'marketVolSurface'
The relevant code is below:
x0 = [1.0, 0.0] # (lambda0, rho)
x0 = np.asarray(x0)
args = (spot, 0.01*r, daysInYear, mktPrices, volSurface)
# constraints: lambd0 >0, -1<= rho <=1
boundsHere = ((0, None), (-1, 1))
res = minimize(SSVOptionPriceObjFunc, x0, args, method='L-BFGS-B', jac=None,
bounds=boundsHere,options={'xtol': 1e-8, 'disp': True})
The function to be minimized is below. The first two arguments are the free variables, while the other five are fixed as parameters.
def SSVOptionPriceObjFunc(lambda0, rho, spot, spotInterestRate, daysInYear, marketPrices,
marketVolSurface):
My intention is to find (lambda0, rho) giving a minimum. From the debugger, it seems that my initial guess x0 is interpreted as a single variable, not as a vector, giving the error about a missing positional argument. I have tried passing x0 as a list, tuple, and ndarray; all fail. Can someone spot an error, or suggest a workaround? Thank you in advance.
Update: I have found a solution: use a wrapper function from the functools package to set the parameters.
import functools as ft
SSVOptionPriceObjFuncWrapper = ft.partial(SSVOptionPriceObjFunc, spot=spot,
spotInterestRate=0.01 * r, daysInYear=daysInYear, marketPrices=mktPrices,
marketVolSurface=volSurface)
Then pass SSVOptionPriceObjFuncWrapper to the minimizer with args = None
Thank you for the replies.
Take the documented minimize inputs seriously. It's your job to write the function to fit what minimize does, not the other way around.
scipy.optimize.minimize(fun, x0, args=(),
fun: callable
The objective function to be minimized.
fun(x, *args) -> float
where x is an 1-D array with shape (n,) and args is a tuple of the fixed
parameters needed to completely specify the function.
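In other words, the objective has to take the whole vector of free variables as its first argument and unpack it itself; the fixed parameters arrive through args. A toy sketch of that calling convention (the objective body here is a stand-in, not the original pricing function):

import numpy as np
from scipy.optimize import minimize

# Stand-in objective: the first argument is the 1-D array of free variables,
# everything passed via `args` follows it.
def objective(x, spot, rate):
    lambda0, rho = x                      # unpack the free variables yourself
    return (lambda0 - spot)**2 + (rho - rate)**2

x0 = np.array([1.0, 0.0])                 # (lambda0, rho) initial guess
res = minimize(objective, x0, args=(2.0, 0.5), method='L-BFGS-B',
               bounds=((0, None), (-1, 1)))
print(res.x)                              # approximately [2.0, 0.5]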
I am trying to code a Java method which returns true if a point (x,y) is on a line segment and false if not.
I tried this:
public static boolean OnDistance(MyLocation a, MyLocation b, MyLocation queryPoint) {
    double value = java.lang.Math.signum((a.mLongitude - b.mLongitude) * (queryPoint.mLatitude - a.mLatitude)
            - (b.mLatitude - a.mLatitude) * (queryPoint.mLongitude - a.mLongitude));
    double compare = 1;
    if (value == compare) {
        return true;
    }
    return false;
}
but it doesn't work.
I am not a Java coder, so I'll stick to the math behind it... For starters, let's assume you are on a plane (not a sphere's surface).
I would use vector math, so let:
a,b - be the line endpoints
q - queried point
c = q - a - vector from a to the queried point
d=b-a - line direction vector
use dot product for parameter extraction
t = dot(c,d) / (|d|*|d|)
t is the line parameter in <0,1>; if it is out of that range, q is not inside the segment
|d| = sqrt(d.x*d.x + d.y*d.y) - size (length) of a vector
dot(c,d) = c.x*d.x + c.y*d.y - scalar (dot) product
now compute corresponding point on line
e=a+(t*d)
e is the closest point to q on the line ab
compute perpendicular distance of q and ab
l=|q-e|;
if (l > threshold) then q is not on line ab, else it is on the line ab. The threshold is the max distance from the line that you still accept as being on the line. There is no need to take the sqrt of l; the threshold constant can be squared instead, for speed.
If you put all of this into a single equation,
then some things simplify (I hope I did not make some silly math mistake):
l=|(q-a)-(b-a)*(dot(q-a,b-a)/|b-a|^2)|;
return (l<=threshold);
or
l=|c-(d*dot(c,d)/|d|^2)|;
return (l<=threshold);
As you can see, we do not even need sqrt for this (just compare the squared values) :)
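For reference, here is the 2D check described above as a small Python sketch (Python rather than Java, and the function name and threshold handling are mine); it combines the t-in-<0,1> test with the squared perpendicular-distance test:

def on_segment(ax, ay, bx, by, qx, qy, threshold=1e-9):
    cx, cy = qx - ax, qy - ay              # c = q - a
    dx, dy = bx - ax, by - ay              # d = b - a
    dd = dx*dx + dy*dy                     # |d|^2
    if dd == 0.0:                          # degenerate segment: a == b
        return cx*cx + cy*cy <= threshold*threshold
    t = (cx*dx + cy*dy) / dd               # t = dot(c,d) / |d|^2
    if t < 0.0 or t > 1.0:                 # projection falls outside the segment
        return False
    rx, ry = cx - t*dx, cy - t*dy          # q - e, the perpendicular component
    return rx*rx + ry*ry <= threshold*threshold   # compare squared distances, no sqrt

print(on_segment(0, 0, 10, 0, 5, 0))       # True:  point lies on the segment
print(on_segment(0, 0, 10, 0, 5, 1))       # False: 1 unit away from the line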
[Notes]
If you need a spherical or ellipsoidal surface instead, then you need to specify which one it is and what the semi-axes are. The line becomes an arc/curve and needs some corrections which depend on the shape of the surface; see
Projecting a point onto a path
but it can also be done by approximation, and maybe also by binary search of point e; see:
my approx class in C++
The vector math used can be found here at the end:
Understanding 4x4 homogenous transform matrices
Here is a 3D C++ implementation (with different names):
double distance_point_axis(double *p,double *p0,double *dp)
{
int i;
double l,d,q[3];
for (i=0;i<3;i++) q[i]=p[i]-p0[i]; // q = p-p0
for (l=0.0,i=0;i<3;i++) l+=dp[i]*dp[i]; // l = |dp|^2
for (d=0.0,i=0;i<3;i++) d+=q[i]*dp[i]; // d = dot(q,dp)
if (l<1e-10) d=0.0; else d/=l; // d = dot(q,dp)/|dp|^2
for (i=0;i<3;i++) q[i]-=dp[i]*d; // q=q-dp*dot(q,dp)/|dp|^2
for (l=0.0,i=0;i<3;i++) l+=q[i]*q[i]; l=sqrt(l); // l = |q|
return l;
}
Where p0[3] is any point on the axis and dp[3] is the direction vector of the axis. p[3] is the queried point you want the distance to the axis for.