fmincon : impose vector greater than zero constraint - optimization

How do you impose a constraint that all values in a vector you are trying to optimize for are greater than zero, using fmincon()?
According to the documentation, I need some parameters A and b, where A*x ≤ b, but I think if I make A a vector of -1's and b 0, then I will have optimized for the sum of x>0, instead of each value of x greater than 0.
Just in case you need it, here is my code. I am trying to optimize over a vector (x) such that the (componentwise) product of x and a matrix (called multiplierMatrix) makes a matrix for which the sum of the columns is x.
function [sse] = myfun(x) % this is a nested function
bigMatrix = repmat(x,1,120) .* multiplierMatrix;
answer = sum(bigMatrix,1)';
sse = sum((expectedAnswer - answer).^2);
end
xGuess = ones(1:120,1);
[sse xVals] = fmincon(#myfun,xGuess,???);
Let me know if I need to explain my problem better. Thanks for your help in advance!

You can use the lower bound:
xGuess = ones(120,1);
lb = zeros(120,1);
[sse xVals] = fmincon(#myfun,xGuess, [],[],[],[], lb);
note that xVals and sse should probably be swapped (if their name means anything).
The lower bound lb means that elements in your decision variable x will never fall below the corresponding element in lb, which is what you are after here.
The empties ([]) indicate you're not using linear constraints (e.g., A,b, Aeq,beq), only the lower bounds lb.
Some advice: fmincon is a pretty advanced function. You'd better memorize the documentation on it, and play with it for a few hours, using many different example problems.

Related

Using fixed point to show square root

In going through the exercises of SICP, it defines a fixed-point as a function that satisfies the equation F(x)=x. And iterating to find where the function stops changing, for example F(F(F(x))).
The thing I don't understand is how a square root of, say, 9 has anything to do with that.
For example, if I have F(x) = sqrt(9), obviously x=3. Yet, how does that relate to doing:
F(F(F(x))) --> sqrt(sqrt(sqrt(9)))
Which I believe just converges to zero:
>>> math.sqrt(math.sqrt(math.sqrt(math.sqrt(math.sqrt(math.sqrt(9))))))
1.0349277670798647
Since F(x) = sqrt(x) when x=1. In other words, how does finding the square root of a constant have anything to do with finding fixed points of functions?
When calculating the square-root of a number, say a, you essentially have an equation of the form x^2 - a = 0. That is, to find the square-root of a, you have to find an x such that x^2 = a or x^2 - a = 0 -- call the latter equation as (1). The form given in (1) is an equation which is of the form g(x) = 0, where g(x) := x^2 - a.
To use the fixed-point method for calculating the roots of this equation, you have to make some subtle modifications to the existing equation and bring it to the form f(x) = x. One way to do this is to rewrite (1) as x = a/x -- call it (2). Now in (2), you have obtained the form required for solving an equation by the fixed-point method: f(x) is a/x.
Observe that this method requires both sides of the equation to have an 'x' term; an equation of the form sqrt(a) = x doesn't meet the specification and hence can't be solved (iteratively) using the fixed-point method.
The thing I don't understand is how a square root of, say, 9 has anything to do with that.
For example, if I have F(x) = sqrt(9), obviously x=3. Yet, how does that relate to doing: F(F(F(x))) --> sqrt(sqrt(sqrt(9)))
These are standard methods for numerical calculation of roots of non-linear equations, quite a complex topic on its own and one which is usually covered in Engineering courses. So don't worry if you don't get the "hang of it", the authors probably felt it was a good example of iterative problem solving.
You need to convert the problem f(x) = 0 to a fixed point problem g(x) = x that is likely to converge to the root of f(x). In general, the choice of g(x) is tricky.
if f(x) = x² - a = 0, then you should choose g(x) as follows:
g(x) = 1/2*(x + a/x)
(This choice is based on Newton's method, which is a special case of fixed-point iterations).
To find the square root, sqrt(a):
guess an initial value of x0.
Given a tolerance ε, compute xn+1 = 1/2*(xn + a/xn) for n = 0, 1, ... until convergence.

Relaxation of linear constraints?

When we need to optimize a function on the positive real half-line, and we only have non-constraints optimization routines, we use y = exp(x), or y = x^2 to map to the real line and still optimize on the log or the (signed) square root of the variable.
Can we do something similar for linear constraints, of the form Ax = b where, for x a d-dimensional vector, A is a (N,n)-shaped matrix and b is a vector of length N, defining the constraints ?
While, as Ervin Kalvelaglan says this is not always a good idea, here is one way to do it.
Suppose we take the SVD of A, getting
A = U*S*V'
where if A is n x m
U is nxn orthogonal,
S is nxm, zero off the main diagonal,
V is mxm orthogonal
Computing the SVD is not a trivial computation.
We first zero out the elements of S which we think are non-zero just due to noise -- which can be a slightly delicate thing to do.
Then we can find one solution x~ to
A*x = b
as
x~ = V*pinv(S)*U'*b
(where pinv(S) is the pseudo inverse of S, ie replace the non zero elements of the diagonal by their multiplicative inverses)
Note that x~ is a least squares solution to the constraints, so we need to check that it is close enough to being a real solution, ie that Ax~ is close enough to b -- another somewhat delicate thing. If x~ doesn't satisfy the constraints closely enough you should give up: if the constraints have no solution neither does the optimisation.
Any other solution to the constraints can be written
x = x~ + sum c[i]*V[i]
where the V[i] are the columns of V corresponding to entries of S that are (now) zero. Here the c[i] are arbitrary constants. So we can change variables to using the c[] in the optimisation, and the constraints will be automatically satisfied. However this change of variables could be somewhat irksome!

In AMPL, how to refer to part of the result, and use them in multiple places

I'm learning AMPL to speed up a model currently in excel spreadsheet with excel solver. It basically based on the matrix multiplication result of a 1 x m variables and an m x n parameters. And it would find the variables to maximize the minimum of certain values in the result while keeping some other values in the same result satisfying a few constraints. How to do so in AMPL?
Given: P= m x n parameters
Variable: X= 1 x m variable we tried to solve
Calculate: R= X x P , result of matrix multiplication of X and P
Maximize: min(R[1..3]), the minimum value of the first 3 values in the result
Subject to: R[2]<R[4]
min(R[6..8])>20
R[5]-20>R[7]
X are all integers
I read several tutorials and look up the manual but can't find the solution to this seemingly straightforward problem. All I found is maximize a single value, which is the calculation result. And it was used only once and does not appear again in the constraint.
The usual approach for "maximize the minimum" problems in products like AMPL is to define an auxiliary variable and set linear constraints that effectively define it as the minimum, converting a nonlinear function (min) into linear rules.
For instance, suppose I have a bunch of decision variables x[i] with i ranging over an index set S, and I want to maximize the minimum over x[i]. AMPL syntax for that would be:
var x_min;
s.t. DefineMinimum{i in S}: x_min <= x[i];
maximize ObjectiveFunction: x_min;
The constraint only requires that x_min be less than or equal to the minimum of x[i]. However, since you're trying to maximize x_min and there are no other constraints on it, it should always end up exactly equal to that minimum (give or take machine-arithmetic epsilon considerations).
If you have parameters (i.e. values are known before you run the optimisation) and want to refer to their minimum, AMPL lets you do that more directly:
param p_min := min{j in IndexSet_P} p[j];
While AMPL also supports this syntax for variables, not all of the solvers used with AMPL are capable of accepting this type of constraint. For instance:
reset;
option solver gecode;
set S := {1,2,3};
var x{S} integer;
var x_min = min{s in S} x[s];
minimize OF: sum{s in S} x[s];
s.t. c1: x_min >= 5;
solve;
This will run and do what you'd expect it to do, because Gecode is programmed to recognise and deal with min-type constraints. However, if you switch the solver option to gurobi or cplex it will fail, since these only accept linear or quadratic constraints. To apply a minimum constraint with those solvers, you need to use something like the linearization trick I discussed above.

Checking containment using integer programming

For this question, a region is a subset of Zd defined by finitely many linear inequalities with integer coefficients, where Zd is the set of d-tuples of integers. For example, the set of pairs (x, y) of non-negative integers with 2x+3y >= 10 constitutes a region with d=2 (non-negativity just imposes the additional inequalities x>=0 and y>=0).
Question: is there a good way, using integer programming (or something else?), to check if one region is contained in a union of finitely many other regions?
I know one way to check containment, which I describe below, but I'm hoping someone may be able to offer some improvements, as it's not too efficient.
Here's the way I know to check containment. First, integer programming libraries can directly check if a region is empty: in integer programming terminology (as I understand it), emptiness of a region corresponds to infeasibility of a model. I have coded up something using the gurobi library to check emptiness, and it seems to work well in practice for the kind of regions I care about.
Suppose now that we want to check if a region X is contained in another region Y (a special case of the question). Let Z be the intersection of X with the complement of Y. Then X is contained in Y if and only if Z is empty. Now, Z itself is not a region in my sense of the word, but it is a union of regions Z_1, ..., Z_n, where n is the number of inequalities used to define Y. We can check if Z is empty by checking that each of Z_1, ..., Z_n is empty, and we can do this as described above.
The general case can be handled in exactly the same way: if Y is a finite union of regions Y_1, ..., Y_k then Z is still a finite union of regions Z_1, ..., Z_n, and so we just check that each Z_i is empty. If Y_i is defined by m_i inequalities then n = m_1 * m_2 * ... * m_k.
So to summarize, we can reduce the containment problem to the emptiness problem, which the library can solve directly. The issue is that we may have to solve a very large number of emptiness problems to solve containment (e.g., if each Y_i is defined by only two inequalities then n = 2^k grows exponentially with k), and so this may take a lot of time.
You can't really expect a simple answer. Suppose that A is defined by all constraints of the form 0 <= x_i <= 1. A can be thought of as the collection of all possible rows of a truth table. Given any logical expression of the form e.g. x or (not y) or z, you can express it as a linear inequality
such as x + (1-y) + z >= 1 (along with the 0-1 constraints). Using this approach, any Boolean formula in conjunctive normal form (CNF) can be expressed as a region in Z^n. If A is defined as above and B_1, B_2, ...., B_k is a list of regions corresponding to CNFs then A is contained in the union of the B_i if and only if the disjunction of those CNFs is a tautology. But tautology-checking is a canonical example of an NP-complete problem.
None of this is to say that it can't be usefully reduced to ILP (which itself is NP-complete). I don't see of any direct way to do so, though I suspect that some of the techniques used to identify redundant constraints would be relevant.

Normal Distribution function

edit
So based on the answers so far (thanks for taking your time) I'm getting the sense that I'm probably NOT looking for a Normal Distribution function. Perhaps I'll try to re-describe what I'm looking to do.
Lets say I have an object that returns a number of 0 to 10. And that number controls "speed". However instead of 10 being the top speed, I need 5 to be the top speed, and anything lower or higher would slow down accordingly. (with easing, thus the bell curve)
I hope that's clearer ;/
-original question
These are the times I wish I remembered something from math class.
I'm trying to figure out how to write a function in obj-C where I define the boundries, ex (0 - 10) and then if x = foo y = ? .... where x runs something like 0,1,2,3,4,5,6,7,8,9,10 and y runs 0,1,2,3,4,5,4,3,2,1,0 but only on a curve
Something like the attached image.
I tried googling for Normal Distribution but its way over my head. I was hoping to find some site that lists some useful algorithms like these but wasn't very successful.
So can anyone help me out here ? And if there is some good sites which shows useful mathematical functions, I'd love to check them out.
TIA!!!
-added
I'm not looking for a random number, I'm looking for.. ex: if x=0 y should be 0, if x=5 y should be 5, if x=10 y should be 0.... and all those other not so obvious in between numbers
alt text http://dizy.cc/slider.gif
Okay, your edit really clarifies things. You're not looking for anything to do with the normal distribution, just a nice smooth little ramp function. The one Paul provides will do nicely, but is tricky to modify for other values. It can be made a little more flexible (my code examples are in Python, which should be very easy to translate to any other language):
def quarticRamp(x, b=10, peak=5):
if not 0 <= x <= b:
raise ValueError #or return 0
return peak*x*x*(x-b)*(x-b)*16/(b*b*b*b)
Parameter b is the upper bound for the region you want to have a slope on (10, in your example), and peak is how high you want it to go (5, in the example).
Personally I like a quadratic spline approach, which is marginally cheaper computationally and has a different curve to it (this curve is really nice to use in a couple of special applications that don't happen to matter at all for you):
def quadraticSplineRamp(x, a=0, b=10, peak=5):
if not a <= x <= b:
raise ValueError #or return 0
if x > (b+a)/2:
x = a + b - x
z = 2*(x-a)/b
if z > 0.5:
return peak * (1 - 2*(z-1)*(z-1))
else:
return peak * (2*z*z)
This is similar to the other function, but takes a lower bound a (0 in your example). The logic is a little more complex because it's a somewhat-optimized implementation of a piecewise function.
The two curves have slightly different shapes; you probably don't care what the exact shape is, and so could pick either. There are an infinite number of ramp functions meeting your criteria; these are two simple ones, but they can get as baroque as you want.
The thing you want to plot is the probability density function (pdf) of the normal distribution. You can find it on the mighty Wikipedia.
Luckily, the pdf for a normal distribution is not difficult to implement - some of the other related functions are considerably worse because they require the error function.
To get a plot like you showed, you want a mean of 5 and a standard deviation of about 1.5. The median is obviously the centre, and figuring out an appropriate standard deviation given the left & right boundaries isn't particularly difficult.
A function to calculate the y value of the pdf given the x coordinate, standard deviation and mean might look something like:
double normal_pdf(double x, double mean, double std_dev) {
return( 1.0/(sqrt(2*PI)*std_dev) *
exp(-(x-mean)*(x-mean)/(2*std_dev*std_dev)) );
}
A normal distribution is never equal to 0.
Please make sure that what you want to plot is indeed a
normal distribution.
If you're only looking for this bell shape (with the tangent and everything)
you can use the following formula:
x^2*(x-10)^2 for x between 0 and 10
0 elsewhere
(Divide by 125 if you need to have your peek on 5.)
double bell(double x) {
if ((x < 10) && (x>0))
return x*x*(x-10.)*(x-10.)/125.;
else
return 0.;
}
Well, there's good old Wikipedia, of course. And Mathworld.
What you want is a random number generator for "generating normally distributed random deviates". Since Objective C can call regular C libraries, you either need a C-callable library like the GNU Scientific Library, or for this, you can write it yourself following the description here.
Try simulating rolls of dice by generating random numbers between 1 and 6. If you add up the rolls from 5 independent dice rolls, you'll get a surprisingly good approximation to the normal distribution. You can roll more dice if you'd like and you'll get a better approximation.
Here's an article that explains why this works. It's probably more mathematical detail than you want, but you could show it to someone to justify your approach.
If what you want is the value of the probability density function, p(x), of a normal (Gaussian) distribution of mean mu and standard deviation sigma at x, the formula is
p(x) = exp( ((x-mu)^2)/(2*sigma^2) ) / (sigma * 2 * sqrt(pi))
where pi is the area of a circle divided by the square of its radius (approximately 3.14159...). Using the C standard library math.h, this is:
#include <math>
double normal_pdf(double x, double mu, double sigma) {
double n = sigma * 2 * sqrt(M_PI); //normalization factor
p = exp( -pow(x-mu, 2) / (2 * pow(sigma, 2)) ); // unnormalized pdf
return p / n;
}
Of course, you can do the same in Objective-C.
For reference, see the Wikipedia or MathWorld articles.
It sounds like you want to write a function that yields a curve of a specific shape. Something like y = f(x), for x in [0:10]. You have a constraint on the max value of y, and a general idea of what you want the curve to look like (somewhat bell-shaped, y=0 at the edges of the x range, y=5 when x=5). So roughly, you would call your function iteratively with the x range, with a step that gives you enough points to make your curve look nice.
So you really don't need random numbers, and this has nothing to do with probability unless you want it to (as in, you want your curve to look like a the outline of a normal distribution or something along those lines).
If you have a clear idea of what function will yield your desired curve, the code is trivial - a function to compute f(x) and a for loop to call it the desired number of times for the desired values of x. Plot the x,y pairs and you're done. So that's your algorithm - call a function in a for loop.
The contents of the routine implementing the function will depend on the specifics of what you want the curve to look like. If you need help on functions that might return a curve resembling your sample, I would direct you to the reading material in the other answers. :) However, I suspect that this is actually an assignment of some sort, and that you have been given a function already. If you are actually doing this on your own to learn, then I again echo the other reading suggestions.
y=-1*abs(x-5)+5