Domain error when using Nelder Mead algorithm in Julia - optimization

I am struggling with optimization in Julia.
I used to use Matlab, but I am trying to work in Julia instead.
The following is the code I wrote.
using Optim
V = fill(1.0, (18,14,5))
agrid = range(-2, stop=20, length=18)
dgrid = range(0.01, stop=24, length=14)
#zgrid = [0.5; 0.75; 1.0; 1.25; 1.5]
zgrid = [0.7739832502827438; 0.8797631785217791; 1.0; 1.1366695315439874; 1.2920176239404275]
# function
function adj_utility(V,s_a,s_d,s_z,i_z,c_a,c_d)
    consumption = s_z + 1.0125*s_a + (1-0.018)*s_d - c_a - c_d - 0.05*(1-0.018)*s_d
    if consumption >= 0
        return (1/(1-2)) * (( (consumption^0.88) * (c_d^(1-0.88)) )^(1-2))
    end
    if consumption < 0
        return -99999999
    end
end
# Optimization
i_a = 1
i_d = 3
i_z = 1
utility_adj(x) = -adj_utility(V,agrid[i_a],dgrid[i_d],zgrid[i_z],i_z,x[1],x[2])
result1 = optimize(utility_adj, [1.0, 1.0], NelderMead())
If I use zgrid = [0.5; 0.75; 1.0; 1.25; 1.5], then the code works.
However, if I use zgrid = [0.7739832502827438; 0.8797631785217791; 1.0; 1.1366695315439874; 1.2920176239404275], I get the error message "DomainError with -0.3781249999999996".
In the function, if consumption is less than 0 then the value should be -99999999, so I am not sure why I am getting this message.
Any help would be appreciated.
Thank you.

Raising negative numbers to non-integer powers returns complex numbers, which is where your error is coming from.
julia> (-0.37)^(1-0.88)
ERROR: DomainError with -0.37:
Exponentiation yielding a complex result requires a complex argument.
Replace x^y with (x+0im)^y, Complex(x)^y, or similar.
Stacktrace:
[1] throw_exp_domainerror(::Float64) at ./math.jl:37
[2] ^(::Float64, ::Float64) at ./math.jl:888
[3] top-level scope at REPL[5]:1
Your function guards against negative consumption, but c_d is also raised to a non-integer power (c_d^(1-0.88)), so the utility is only a real number when c_d is non-negative as well. You can either add that check directly to your objective function, as you did for consumption, or you can use one of the constrained optimization algorithms in NLopt, which is available in Julia via the NLopt package.
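The first option, guarding inside the objective, can be sketched as follows. This is a Python translation for illustration, not the asker's Julia code: the function mirrors adj_utility from the question but drops the unused V and i_z arguments, and rejecting zero as well as negative values is an assumption made here so that the final power of -1 never divides by zero.

```python
import math

PENALTY = -99999999.0  # same penalty value the question uses


def adj_utility(s_a, s_d, s_z, c_a, c_d):
    """Utility from the question, with a guard on every base that gets exponentiated."""
    consumption = (s_z + 1.0125 * s_a + (1 - 0.018) * s_d
                   - c_a - c_d - 0.05 * (1 - 0.018) * s_d)
    # Assumption: treat zero like a negative value, because the outer
    # exponent (1 - 2) = -1 would otherwise divide by zero.
    if consumption <= 0 or c_d <= 0:
        return PENALTY
    # math.pow raises ValueError for a negative base with a non-integer
    # exponent, much like Julia's DomainError; the guard above prevents that.
    composite = math.pow(consumption, 0.88) * math.pow(c_d, 1 - 0.88)
    return (1 / (1 - 2)) * math.pow(composite, 1 - 2)
```

With this guard, a trial point such as c_d = -0.378 returns the penalty instead of reaching the exponentiation.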

Related

ORTools CP-Sat Solver Channeling Constraint dependent on x

I am trying to add the following constraints to my model. My problem: the function g() expects x as a binary numpy array, so the result arr_a depends on the current value of x at every step of the optimization!
Afterwards, I want the max of this array times x to be smaller than 50.
How can I add this constraint dynamically so that arr_a is always rightfully calculated with the value of x at each iteration while telling the model to keep the constraint arr_a * x <= 50 ? Currently I am getting an error when adding the constraint to the model because g() expects x as numpy array to calculate arr_a, arr_b, arr_c ( g uses np.where(x == 1) within its calculation).
#Init model
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# Declare the variables
x = []
for i in range(self.ds.n_banks):
    x.append(model.NewIntVar(0, 1, "x[%i]" % (i)))
#add bool vars
a = model.NewBoolVar('a')
arr_a, arr_b, arr_c = g(df1,df2,df3,x)
model.Add((arr_a.astype('int32') * x).max() <= 50).OnlyEnforceIf(a)
model.Add((arr_a.astype('int32') * x).max() > 50).OnlyEnforceIf(a.Not())
Afterwards I add the target function, which naturally also depends on x.
model.Minimize(target(x))
def target(x):
    arr_a, arr_b, arr_c = g(df1,df2,df3,x)
    return (3 * arr_b * x + 2 * arr_c * x).sum()
EDIT:
My problem changed a bit and I managed to get it to work without errors. Nevertheless, I found that the constraint is never actually met! self-defined-function is a highly non-linear function that expects the indices where x == 1 and where x == 0 and returns a numpy array. It is also not possible to rebuild it from the predefined functions of the SAT solver.
#Init model
model = cp_model.CpModel()
# Declare the variables
x = [model.NewIntVar(0, 1, "x[%i]" % (i)) for i in range(66)]
# add hints
[model.AddHint(x[i],np.random.choice(2, 1, p=[0.4, 0.6])[0]) for i in range(66)]
open_elements = [model.NewBoolVar("open_elements[%i]" % (i)) for i in range(66)]
closed_elements = [model.NewBoolVar("closed_elements[%i]" % (i)) for i in range(66)]
# open indices as bool vars
for i in range(66):
    model.Add(x[i] == 1).OnlyEnforceIf(open_elements[i])
    model.Add(x[i] != 1).OnlyEnforceIf(open_elements[i].Not())
    model.Add(x[i] != 1).OnlyEnforceIf(closed_elements[i])
    model.Add(x[i] == 1).OnlyEnforceIf(closed_elements[i].Not())
model.Add((self-defined-function(np.where(open_elements), np.where(closed_elements), some_array).astype('int32') * x - some_vector).all() <= 0)
Even when I apply a simpler function, it will not work properly.
model.Add((self-defined-function(x, some_array).astype('int32') * x - some_vector).all() <= 0)
I also tried the following:
arr_indices_open = []
arr_indices_closed = []
for i in range(66):
    if open_elements[i] == True:
        arr_indices_open.append(i)
    else:
        arr_indices_closed.append(i)
# final Constraint
arr_ = self-defined-function(arr_indices_open, arr_indices_closed, some_array)[0].astype('int32')
for i in range(66):
    model.Add(arr_[i] * x[i] <= some_other_vector[i])
Some minimal example for the self-defined-function, with which I simply try to say that n_closed shall be smaller than 10. Even that condition is not met by the solver:
def self_defined_function(arr_indices_closed):
    return len(arr_indices_closed)
arr_ = self_defined_function(arr_indices_closed)
for i in range(66):
    model.Add(arr_ < 10)
I'm not sure I fully understand the question, but generally, if you want to optimize a function g(x), you'll have to implement it using the solver's primitives (see the docs).
That is easier when your calculation coincides with an existing solver function, e.g. when you're computing a linear expression, and it gets harder when you're computing something more complex. However, I believe that's the only way.

LoadError using approximate bayesian criteria

I am getting an error that is confusing me.
using DifferentialEquations
using RecursiveArrayTools # for VectorOfArray
using DiffEqBayes
f2 = @ode_def_nohes LotkaVolterraTest begin
    dx = x*(1 - x - A*y)
    dy = rho*y*(1 - B*x - y)
end A B rho
u0 = [1.0;1.0]
tspan = (0.0,10.0)
p = [0.2,0.5,0.3]
prob = ODEProblem(f2,u0,tspan,p)
sol = solve(prob,Tsit5())
t = collect(linspace(0,10,200))
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)
priors = [Uniform(0.0, 2.0), Uniform(0.0, 2.0), Uniform(0.0, 2.0)]
bayesian_result_abc = abc_inference(prob, Tsit5(), t, data,
priors;num_samples=500)
Returns the error
ERROR: LoadError: DimensionMismatch("first array has length 400 which does not match the length of the second, 398.")
while loading..., in expression starting on line 20.
I have not been able to locate any array of size 400 or 398.
Thanks for your help.
Take a look at https://github.com/JuliaDiffEq/DiffEqBayes.jl/issues/52. The error was caused by a bug in how t was passed. This has been fixed on master, so you can either use master or wait a little: a new release with the 1.0 upgrades is coming soon and will include the fix.
Thanks!

Julia: Error when trying to minimize a function with optimize

I have the following function with multiple arguments that I would like to minimize with Optim.jl:
function post(parm,y,x,n)
    # Evaluate the log of the marginal posterior for parm at a point
    fgamma=zeros(n,1);
    for ii = 1:2
        fgamma = fgamma + parm[ii+1]*(x[:,ii+1].^parm[4]);
    end
    fgamma = fgamma.^(1/parm[4]);
    fgamma = fgamma + parm[1]*ones(n,1);
    lpost = .5*n*log.((y - fgamma)'*(y-fgamma));
end
However, when I try to use optimize, Julia returns an error.
Old error (with parm):
MethodError: no method matching finite_difference!(::##1#2, ::Array{Float64,2}, ::Array{Float64,2}, ::Symbol)
New error(with parm2):
MethodError: Cannot `convert` an object of type Array{Float64,2} to an object of type Float64
The complete script with data and optimize call I am using is this:
using Distributions
using Optim
n = 200;
k = 3;
x = ones(n,k);
fgamma=zeros(n,1);
gam = [1.01; 0.6; 0.8; 1.5];
x[:,2] = rand(Chisq(10),n);
x[:,3] = rand(Chisq(5),n);
epsl = rand(Normal(0,1),n);
y = zeros(n,1);
for i = 1:n
y[i,1] = gam[1] + (gam[2]*x[i,2]^gam[4] + gam[3]*x[i,3]^gam[4])^(1/gam[4]) + epsl[i];
end
# Sim
bols = inv(x'x)x'y;
s2 = (y-x*bols)'*(y-x*bols)/(n-k);
sse=(n-k)*s2;
bolscov = s2.*inv(x'*x);
bolssd=zeros(k,1);
for i = 1:k
bolssd[i,1]=sqrt(bolscov[i,i]);
end
# Calculate posterior mode and Hessian at mode
nparam=k+1;
parm = ones(nparam,1);
parm[1:k,1]=bols;
parm2 = vec(parm);
opt = Optim.Options(f_tol = 1e-8, iterations = 1000);
Optim.after_while!{T}(d, state::Optim.BFGSState{T}, method::BFGS, options) = global invH = state.invH
res = optimize(p -> post(p,y,x,n), parm2, BFGS(), opt)
Does anyone know what I am doing wrong? I think there is a problem with the type of lpost in the function post, since it is a 1x1 Array{Float64,2}. Unfortunately, I couldn't work out how to handle it.
The error message
MethodError: Cannot `convert` an object of type Array{Float64,2} to an object of type Float64
is caused by an attempt to convert a matrix into a scalar. In general this is not possible, but when the matrix is 1x1 (as the question points out), there is a natural conversion: scalar = matrix[1,1].
optimize expects the objective to return a scalar because it is a scalar nonlinear optimization routine. Optimizing a vector-valued function is hard even to define unambiguously (concepts such as Pareto optimality are attempts to do so).
With that in mind, the fix is simple. It also covers an issue with complex values that @fst (the poster) tackled later: again a real scalar is required, so real(...) is used to turn the complex value into a real one (complex numbers are scalars too, but they are not ordered). The resulting post function is:
function post(parm,y,x,n)
    # Evaluate the log of the marginal posterior for parm at a point
    fgamma=zeros(n,1);
    for ii = 1:2
        fgamma = fgamma + parm[ii+1]*(x[:,ii+1].^parm[4]);
    end
    fgamma = fgamma.^Complex(1/parm[4]);
    fgamma = fgamma + parm[1]*ones(n,1);
    lpost = .5*n*log.((y - fgamma)'*(y-fgamma));
    return real(lpost[1,1])
end
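The same 1x1-matrix pitfall exists in other array libraries. As an illustration only (Python with NumPy, not the Julia code above), the product of a row and a column of residuals is a 1x1 array, and the single element has to be extracted before it is handed to a scalar optimizer. The function name half_n_log_sse and its arguments are hypothetical.

```python
import numpy as np


def half_n_log_sse(y, fitted):
    """Return 0.5 * n * log(sum of squared residuals) as a plain float."""
    residual = y - fitted          # column vector, shape (n, 1)
    sse = residual.T @ residual    # shape (1, 1): a matrix, not a scalar
    n = y.shape[0]
    # Index the single element, the analogue of lpost[1,1] in Julia.
    return float(0.5 * n * np.log(sse[0, 0]))
```

Returning sse[0, 0] rather than sse plays the same role as return real(lpost[1,1]) in the corrected post function.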

Gaussian Log Likelihood loss function in Tensorflow

I need to implement a gaussian log likelihood loss function in Tensorflow, however I am not sure if what I wrote is correct. I think this is the correct definition of the loss function.
I went around implementing it like this:
two_pi = 2*np.pi
def gaussian_density_function(x, mean, stddev):
    stddev2 = tf.pow(stddev, 2)
    z = tf.multiply(two_pi, stddev2)
    z = tf.pow(z, 0.5)
    arg = -0.5*(x-mean)
    arg = tf.pow(arg, 2)
    arg = tf.div(arg, stddev2)
    return tf.divide(tf.exp(arg), z)
mean_x, var_x = tf.nn.moments(dae_output_tensor, [0])
stddev_x = tf.sqrt(var_x)
loss_op_AE = -gaussian_density_function(inputs, mean_x, stddev_x)
loss_op_AE = tf.reduce_mean(loss_op_AE)
I want to use this as the loss function for an autoencoder, however, I am not sure this implementation is correct, since I get a NaN out of loss_op_AE.
EDIT: I also tried using:
mean_x, var_x = tf.nn.moments(autoencoder_output, axes=[1,2])
stddev_x = tf.sqrt(var_x)
dist = tf.contrib.distributions.Normal(mean_x, stddev_x)
loss_op_AE = -dist.pdf(inputs)
and I get the same NaN values.
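Model the log of the variance instead of the standard deviation itself; this should fix the NaN issue. In other words, instead of treating the quantity you estimate as sigma^2, treat it as the natural logarithm of sigma^2 and exponentiate it wherever the variance is needed.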
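A minimal sketch of that parameterization, written in plain Python rather than TensorFlow so that it stays self-contained: the loss is the negative log of the Gaussian density, computed directly in log space, with the log-variance as input. The function names gaussian_nll and mean_gaussian_nll are assumptions for this illustration and do not come from the question or from any library.

```python
import math


def gaussian_nll(x, mean, log_var):
    """Negative log-likelihood of x under a normal with variance exp(log_var)."""
    # -log N(x | mean, sigma^2) = 0.5 * (log(2*pi) + log(sigma^2) + (x - mean)^2 / sigma^2)
    # exp(log_var) is always positive, so no square root or log of a
    # negative or zero variance can occur.
    return 0.5 * (math.log(2.0 * math.pi) + log_var
                  + (x - mean) ** 2 / math.exp(log_var))


def mean_gaussian_nll(xs, means, log_vars):
    """Average the per-element loss, as tf.reduce_mean does in the question."""
    terms = [gaussian_nll(x, m, lv) for x, m, lv in zip(xs, means, log_vars)]
    return sum(terms) / len(terms)
```

Working with the log of the density also avoids taking exp of a large argument and then dividing, which is one of the places where a direct density computation can overflow.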

The math behind Apple's Speak here example

I have a question regarding the math that Apple is using in its Speak Here example.
A little background: I know that average power and peak power returned by the AVAudioRecorder and AVAudioPlayer are in dB. I also understand why the RMS power is in dB and that it needs to be converted into amplitude using pow(10, (0.05 * avgPower)).
My question:
Apple uses this formula to create its "Meter Table":
MeterTable::MeterTable(float inMinDecibels, size_t inTableSize, float inRoot)
    : mMinDecibels(inMinDecibels),
    mDecibelResolution(mMinDecibels / (inTableSize - 1)),
    mScaleFactor(1. / mDecibelResolution)
{
    if (inMinDecibels >= 0.)
    {
        printf("MeterTable inMinDecibels must be negative");
        return;
    }
    mTable = (float*)malloc(inTableSize*sizeof(float));
    double minAmp = DbToAmp(inMinDecibels);
    double ampRange = 1. - minAmp;
    double invAmpRange = 1. / ampRange;
    double rroot = 1. / inRoot;
    for (size_t i = 0; i < inTableSize; ++i) {
        double decibels = i * mDecibelResolution;
        double amp = DbToAmp(decibels);
        double adjAmp = (amp - minAmp) * invAmpRange;
        mTable[i] = pow(adjAmp, rroot);
    }
}
What are all the calculations - or rather, what does each of these steps do? I think that mDecibelResolution and mScaleFactor are used to spread the 80 dB range over 400 values (unless I'm mistaken). However, what is the significance of inRoot, ampRange, invAmpRange and adjAmp? Additionally, why is the i-th entry in the meter table "mTable[i] = pow(adjAmp, rroot);"?
Any help is much appreciated! :)
Thanks in advance and cheers!
It's been a month since I've asked this question, and thanks, Geebs, for your response! :)
So, this is related to a project that I've been working on, and the feature that is based on this was implemented about 2 days after asking that question. Clearly, I've slacked off on posting a closing response (sorry about that). I posted a comment on Jan 7, as well, but circling back, seems like I had a confusion with var names. >_<. Thought I'd give a full, line by line answer to this question (with pictures). :)
So, here goes:
//mDecibelResolution is the "weight" factor of each of the values in the meterTable.
//Here, the table is of size 400, and we're looking at values 0 to 399.
//Thus, the "weight" factor of each value is minValue / 399.
MeterTable::MeterTable(float inMinDecibels, size_t inTableSize, float inRoot)
: mMinDecibels(inMinDecibels),
mDecibelResolution(mMinDecibels / (inTableSize - 1)),
mScaleFactor(1. / mDecibelResolution)
{
if (inMinDecibels >= 0.)
{
printf("MeterTable inMinDecibels must be negative");
return;
}
//Allocate a table to store the 400 values
mTable = (float*)malloc(inTableSize*sizeof(float));
//Remember, "dB" is a logarithmic scale.
//If we have a range of -160dB to 0dB, -80dB is NOT 50% power!!!
//We need to convert it to a linear scale. Thus, we do pow(10, (0.05 * dbValue)), as stated in my question.
double minAmp = DbToAmp(inMinDecibels);
//For the next couple of steps, you need to know linear interpolation.
//Again, remember that all calculations are on a LINEAR scale.
//Attached is an image of the basic linear interpolation formula, and some simple equation solving.
//As per the image, and the following line, (y1 - y0) is the ampRange -
//where y1 = maxAmp and y0 = minAmp.
//In this case, maxAmp = 1amp, as our maxDB is 0dB - FYI: 0dB = 1amp.
//Thus, ampRange = (maxAmp - minAmp) = 1. - minAmp
double ampRange = 1. - minAmp;
//As you can see, invAmpRange is the extreme right hand side fraction on our image's "Step 3"
double invAmpRange = 1. / ampRange;
//Now, if we were looking for different values of x0, x1, y0 or y1, simply substitute it in that equation and you're good to go. :)
//The only reason we were able to get rid of x0 was because our minInterpolatedValue was 0.
//I'll come to this later.
double rroot = 1. / inRoot;
for (size_t i = 0; i < inTableSize; ++i) {
//Thus, for each entry in the table, multiply that entry with its "weight" factor.
double decibels = i * mDecibelResolution;
//Convert the "weighted" value to amplitude using pow(10, (0.05 * decibelValue));
double amp = DbToAmp(decibels);
//This is linear interpolation - based on our image, this is the same as "Step 3" of the image.
double adjAmp = (amp - minAmp) * invAmpRange;
//This is where inRoot and rroot come into picture.
//Linear interpolation gives you a "straight line" between 2 end-points.
//rroot = 0.5
//If I raise a variable, say myValue by 0.5, it is essentially taking the square root of myValue.
//So, instead of getting a "straight line" response, by storing the square root of the value,
//we get a curved response that is similar to the one drawn in the image (note: not to scale).
mTable[i] = pow(adjAmp, rroot);
}
}
Response Curve image: As you can see, the "Linear curve" is not exactly a curve. >_<
Hope this helps the community in some way. :)
No expert, but based on physics and math:
Assume the max amplitude is 1 and minimum is 0.0001 [corresponding to -80db, which is what min db value is set to in the apple example : #define kMinDBvalue -80.0 in AQLevelMeter.h]
minAmp is the minimum amplitude = 0.0001 for this example
Now, all that is being done is that the amplitudes, computed at multiples of the decibel resolution, are rescaled relative to the minimum amplitude:
adjusted amplitude = (amp-minamp)/(1-minamp)
This makes the range of the adjusted amplitude = 0 to 1 instead of 0.0001 to 1 (if that was desired).
inRoot is set to 2 here, so rroot = 1/2, and raising to the power 1/2 is taking the square root. From Apple's file:
// inRoot - this controls the curvature of the response. 2.0 is square root, 3.0 is cube root. But inRoot doesn't have to be integer valued, it could be 1.8 or 2.5, etc.
This again gives you a response between 0 and 1, and its curvature varies with the value you set for inRoot.
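The steps discussed in both answers can be sketched in a few lines of Python. This is an illustrative re-implementation, not Apple's code: db_to_amp uses the 10^(0.05 * dB) conversion mentioned above, the defaults (-80 dB, 400 entries, root 2) are the values quoted in this thread, and the clamp to the range [0, 1] is an addition assumed here to absorb floating-point rounding at the two ends of the table.

```python
import math


def db_to_amp(db):
    """Convert decibels to linear amplitude (0 dB maps to 1.0)."""
    return 10.0 ** (0.05 * db)


def build_meter_table(min_db=-80.0, table_size=400, root=2.0):
    """Map table_size evenly spaced dB steps from 0 dB down to min_db onto [0, 1]."""
    if min_db >= 0.0:
        raise ValueError("min_db must be negative")
    resolution = min_db / (table_size - 1)   # dB covered by one table step
    min_amp = db_to_amp(min_db)
    inv_amp_range = 1.0 / (1.0 - min_amp)    # max amplitude is 1.0 (0 dB)
    table = []
    for i in range(table_size):
        amp = db_to_amp(i * resolution)
        # Linear rescale from [min_amp, 1] to [0, 1].
        adj = (amp - min_amp) * inv_amp_range
        # Assumption: clamp rounding noise so the root below stays real.
        adj = min(1.0, max(0.0, adj))
        # The root bends the straight line into a curve (2.0 gives a square root).
        table.append(adj ** (1.0 / root))
    return table
```

The first entry corresponds to 0 dB and is approximately 1.0, the last corresponds to min_db and is approximately 0.0, and a larger root makes quiet levels occupy more of the range.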