What is the most efficient set order in GAMS?

The documentation suggests that sets should appear in the same order to increase performance. If most of our variables share a set, is it better to have the common set first or last?
I.e. which is more efficient?
y[i,t] =E= a[t] * x[j,t];
or
y[t,i] =E= a[t] * x[t,j];

The main point of this "same order" is that the sets should be used in the order in which they are controlled. So
Equation1(t,i,j).. y[t,i] =E= a[t] * x[t,j];
should be better than
Equation2(i,j,t).. y[t,i] =E= a[t] * x[t,j];
Other than that, it is not easy to give general rules. If you also have full control over the controlling indices, it is often beneficial to put the largest set last, so if t >> i, then x[i,t] should be better than x[t,i]. In general, the GAMS command line parameter profile (https://www.gams.com/latest/docs/UG_GamsCall.html#GAMSAOprofile) is very useful for checking the influence of different formulations of your model.

In case anyone stumbles upon this, the context is a large dynamic economic model.
We tried switching our time index from always being the last set to always being the first.
The model is a square system of around 1m equations, 7m Jacobian elements, of which 2m are non-linear.
The solution time using CONOPT4 was significantly worse after the change. I.e. we have better performance with the time index last rather than first, despite the time index being among the largest sets in the model. The result is probably not transferable to other models, but confirms that there is no trivial answer.

Related

Multi-objective optimization but the function equation is unknown?

Firstly, I am totally out of my expertise zone so please bear with me.
I developed a fluid dynamics engine with 5 exposed parameters (say A,B,C,D,E). When you give this engine these 5 parameters, it does its magic and gives out a value 'Z'.
I want to write a script which can explore which combinations of A-E give the lowest (or close to lowest) value of Z.
I know optimization algorithms exist, but in all the examples I have found, they use some function.
So I guess my function would simply be minimize Z? But where do A-E go?
Not really an answer, but some questions and ideas that might help you think through the best way to address this. We have no understanding of how big a range of values needs to be explored for those parameters, or how Z behaves, so this is very vague...
If you look at the values of Z for given values of A...E, does the value of Z jump around a lot for small changes on the parameter values, or does the Z value change reasonably smoothly?
If the Z value is not too erratic you could try some kind of gradient descent approach, using calculated values of Z for some values of the parameters to approximate the gradient - suppose changing the value of 'A' from 1 to 2 gives a better change in the Z value than a similar size change in the other parameters, then try other values of A while keeping the other parameters fixed until you find a value of A that gives the best value of Z. Then try changing the other parameter values to see which one gives the steepest descent and try to find some better value for that parameter. Repeat this process until you can't find any improvement and you will have found a (local) minimum. You could then start at a different place in your parameter space and try again - you will probably find several local minima, and may just choose the best of those. Not provably optimal but may be good enough. Of course you can get clever and use things like conjugate gradients, Newton-Raphson or similar if Z is smooth enough.
If the Z values are very erratic, then you might have to just do some sampling of the possible combinations of A...E to get values of Z and choose the best you can find. Again you might do that in some systematic way (e.g. points on a grid in your parameter space) or entirely at random, or a combination of both.
If you find that there are 'clusters' of good solutions with similar values of the parameters then maybe some kind of local search would help - the idea is that there is often a better solution in the local neighbourhood of a known good solution. So maybe try perturbing your parameter values a bit from a known solution to see if that can lead to a better solution - either by some gradient descent method or by random sampling.
Unfortunately, if your Z calculation is complex, then any method using it as a black box will likely be slow as it will need to be re-evaluated many times.
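As a rough illustration of the one-parameter-at-a-time descent with restarts described above, here is a minimal Python sketch. The `simulate` function is a hypothetical stand-in for the real engine (a simple smooth bowl), the parameter range of -5 to 5 is an assumption, and halving the step when nothing improves is one possible choice among many, not something prescribed in this thread.

```python
import random

def simulate(params):
    """Hypothetical stand-in for the engine: returns Z for (A, B, C, D, E)."""
    targets = (1.0, -2.0, 0.5, 3.0, 0.0)
    return sum((p - t) ** 2 for p, t in zip(params, targets))

def coordinate_search(f, start, step=1.0, min_step=1e-3):
    """Change one parameter at a time; halve the step when nothing improves."""
    best = list(start)
    best_z = f(best)
    while step > min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                z = f(trial)  # one (possibly expensive) black-box evaluation
                if z < best_z:
                    best, best_z = trial, z
                    improved = True
        if not improved:
            step /= 2.0
    return best, best_z

def multi_start(f, n_starts=5, low=-5.0, high=5.0, dims=5, seed=0):
    """Restart from several random points and keep the best local minimum."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        start = [rng.uniform(low, high) for _ in range(dims)]
        results.append(coordinate_search(f, start))
    return min(results, key=lambda r: r[1])

best_params, best_z = multi_start(simulate)
```

To use it with the real engine, replace `simulate` with a function that runs the engine for a given (A, B, C, D, E) and returns Z. Every call to `f` is one full run of the engine, so the evaluation count is what dominates the running time.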
You could use a Genetic Algorithm, where your chromosomes are formed with the 5 candidate values of the variables you have to optimize, to minimize Z, and your optimization/fitness "function" is the simulation itself outputting Z.
Other viable alternatives are Particle Swarm Optimization or Ant Colony Optimization. All of those are usable algorithms for that kind of optimization problem.

z3 minimization and timeout

I am trying to use the z3 solver for a minimization problem. I want it to time out and return the best solution found so far. I use the Python API, and the timeout option "smt.timeout" with
set_option("smt.timeout", 1000) # 1s timeout
This actually times out after about 1 second. However a larger timeout does not provide a smaller objective. I ended up turning on the verbosity with
set_option("verbose", 2)
And I think that z3 successively evaluates larger values of my objective, until the problem is satisfiable:
(opt.maxres [0:6117664])
(opt.maxres [175560:6117664])
(opt.maxres [236460:6117664])
(opt.maxres [297360:6117664])
...
(opt.maxres [940415:6117664])
(opt.maxres [945805:6117664])
...
I thus have the two questions:
Can I, on the contrary, tell z3 to start with the upper bound and successively return models with a smaller value for my objective function (just like, for instance, the Minizinc annotation indomain_max http://www.minizinc.org/2.0/doc-lib/doc-annotations-search.html)?
It still looks like the solver returns a satisfiable instance of my problem. How is it found? If it is successively trying to evaluate larger values of my objective, it should not have found a satisfiable instance yet when the timeout occurs...
edit: In the opt.maxres log, the upper bound never shrinks.
For the record, I found a more verbose description of the options in the source here opt_params.pyg
Edit: Sorry to bother, I've been diving into this once again recently. Anyway, I think this might be useful to others. I've found that I actually have to call the Optimize.upper method in order to get the upper bound, and the model is still not the one that corresponds to this upper bound. I've been able to add it as a new constraint and call a solver (without optimization, just SAT), but that's probably not the best idea. By reading this I feel like I should call Optimize.update_upper after the solver times out, but the Python interface has no such method (?). At least I can get the upper bound and the corresponding model now (at the cost of unnecessary computations, I guess).
Z3 finds solutions for the hard constraints and records the current values for the objectives and soft constraints. The last model that was found (the last model with the so-far best value for the objectives) is returned if you ask for a model. The maxres strategy mainly improves the lower bounds on the soft constraints (e.g., any solution must have cost at least xx) and whenever possible improves the upper bound (the optimal solution has cost at most yy). The lower bounds don't tell you too much other than narrowing the range of possible optimal values. The upper bounds are available when you time out.
You could try one of the other strategies, such as the one called "wmax", which performs a branch-and-prune. Typically maxres does significantly better, but you may have a better experience (depending on the problems) with wmax for improving upper bounds.
I don't have a mode where you get a stream of models. It is in principle possible, but it would require some (non-trivial) reorganization. For Pareto fronts you make successive invocations to Optimize.check() to get the successive fronts.

Optimizing Parameters using AI technique

I know that my question is general, but I'm new to AI area.
I have an experiment with some parameters (about 6). Each of them is independent, and I want to find the optimal solution that maximizes or minimizes the output function. However, doing it with a traditional programming technique would take a long time, since I would need six nested loops.
I just want to know which AI technique to use for this problem? Genetic Algorithm? Neural Network? Machine learning?
Update
Actually, the problem could have more than one evaluation function.
It will have one function that we should minimize (Cost)
and another function that we want to maximize (Capacity).
More functions may be added.
Example:
A glass window can be constructed in a million ways. However, we want the strongest window with the lowest cost. There are many parameters that affect the pressure capacity of the window, such as the strength of the glass, height and width, and slope of the window.
Obviously, if we go to extreme cases (strongest glass, with smallest width and height, and zero slope) the window will be extremely strong. However, the cost for that will be very high.
I want to study the interaction between the parameters in a specific range.
Without knowing much about the specific problem it sounds like Genetic Algorithms would be ideal. They've been used a lot for parameter optimisation and have often given good results. Personally, I've used them to narrow parameter ranges for edge detection techniques with about 15 variables and they did a decent job.
Having multiple evaluation functions needn't be a problem if you code this into the Genetic Algorithm's fitness function. I'd look up multi objective optimisation with genetic algorithms.
I'd start here: Multi-Objective optimization using genetic algorithms: A tutorial
First of all, if you have multiple competing targets, the problem is not well defined.
You have to find a single value that you want to maximize... for example:
value = strength - k*cost
or
value = strength / (k1 + k2*cost)
In both, for a fixed strength the lower cost wins and for a fixed cost the higher strength wins, but you have a formula that lets you decide whether a given solution is better or worse than another. If you don't do this, how can you decide whether one solution is better than another that is cheaper but weaker?
In some cases a correctly defined value requires a more complex function... for example for strength the value could increase up to a certain point (i.e. having a result stronger than a prescribed amount is just pointless) or a cost could have a cap (because higher than a certain amount a solution is not interesting because it would place the final price out of the market).
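To make the idea of a single value with a strength ceiling and a cost cap concrete, here is a small Python sketch. The weight `k`, the required strength of 100 and the maximum cost of 400 are made-up numbers for illustration only; the right values depend entirely on the problem.

```python
def window_value(strength, cost, k=0.5, required_strength=100.0, max_cost=400.0):
    """Single score to maximize: capped strength minus weighted cost.

    All default numbers are hypothetical placeholders.
    """
    if cost > max_cost:
        # Too expensive to be interesting at all, whatever the strength.
        return float("-inf")
    # Strength beyond the prescribed amount earns nothing extra.
    useful_strength = min(strength, required_strength)
    return useful_strength - k * cost

# Three hypothetical designs, given as (strength, cost).
designs = {"thin": (80.0, 100.0), "standard": (100.0, 100.0), "overbuilt": (120.0, 100.0)}
scores = {name: window_value(s, c) for name, (s, c) in designs.items()}
```

With these numbers "standard" and "overbuilt" score the same, because the extra strength of "overbuilt" is above the ceiling, while "thin" scores lower.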
Once you have settled on the criterion, and if the parameters are independent, a very simple approach that in my experience still works decently is:
1. pick a random solution by choosing n random values, one for each parameter, within the allowed boundaries
2. compute the target value for this starting point
3. pick a random number 1 <= k <= n and, for each of k parameters randomly chosen from the n, compute a random signed increment and change the parameter by that amount
4. compute the new target value for the translated solution
5. if the new value is better keep the new position, otherwise revert to the original one
6. repeat from 3 until you run out of time
Depending on the target function, some random distributions work better than others; the best choice may also differ from one parameter to another.
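The random search procedure above could be sketched in Python roughly as follows. Several details are assumptions made for the sake of a runnable example: the increments are Gaussian with a spread of 10% of each parameter's range, parameters are clamped to their bounds, a fixed iteration count replaces "until you run out of time", and `example_target` is a made-up function to maximize.

```python
import random

def random_local_search(target, bounds, iterations=20000, scale=0.1, seed=0):
    """Maximize target(params) by randomly perturbing a few parameters at a time."""
    rng = random.Random(seed)
    n = len(bounds)
    # Random starting point inside the allowed boundaries, and its value.
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_value = target(current)
    for _ in range(iterations):
        # Pick k of the n parameters and give each a random signed increment.
        k = rng.randint(1, n)
        candidate = list(current)
        for i in rng.sample(range(n), k):
            lo, hi = bounds[i]
            step = rng.gauss(0.0, scale * (hi - lo))
            candidate[i] = min(hi, max(lo, candidate[i] + step))  # stay in bounds
        # Keep the move only if it is better; otherwise stay where we were.
        candidate_value = target(candidate)
        if candidate_value > current_value:
            current, current_value = candidate, candidate_value
    return current, current_value

def example_target(params):
    """Hypothetical target: best possible value is 0, reached at 0.3 everywhere."""
    return -sum((p - 0.3) ** 2 for p in params)

bounds = [(0.0, 1.0)] * 6
best_params, best_value = random_local_search(example_target, bounds)
```

Swapping `rng.gauss` for another distribution, or using a different `scale` per parameter, is the kind of tuning the last remark refers to.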
Some time ago I wrote some C++ code for solving optimization problems using Genetic Algorithms. Here it is: http://create-technology.blogspot.ro/2015/03/a-genetic-algorithm-for-solving.html
It should be very easy to follow.

Cplex/OPL local search

I have a model implemented in OPL. I want to use this model to implement a local search in Java. I want to initialize solutions with some heuristics and give these initial solutions to CPLEX to find a better solution based on the model, but I also want to limit the search to a specific neighborhood. Any idea how to do this?
Also, how can I limit the range of all variables? And which is best: implementing these heuristics and the local search in OPL itself, in Java, or even in C++?
Thanks in advance!
Just to add some related observations:
Re Ram's point 3: We have had a lot of success with approach b. In particular it is simple to add constraints to fix some of the variables to values from a known solution, and then re-solve for the rest of the variables in the problem. More generally, you can add constraints to limit the values to be similar to a previous solution, like:
var >= previousValue - 1
var <= previousValue + 2
This is no use for binary variables of course, but for general integer or continuous variables can work well. This approach can be generalised for collections of variables:
sum(i in indexSet) var[i] >= (sum(i in indexSet) value[i]) - 2
sum(i in indexSet) var[i] <= (sum(i in indexSet) value[i]) + 2
This can work well for sets of binary variables. For an array of 100 binary variables of which maybe 10 had the value 1, we would be looking for a solution where at least 8 have the value 1, but not more than 12. Another variant is to limit something like the Hamming distance (assume that the vars are all binary here):
dvar int changed[indexSet] in 0..1;
forall(i in indexSet)
  if (previousValue[i] <= 0.5)
    changed[i] == (var[i] >= 0.5); // was zero before
  else
    changed[i] == (var[i] <= 0.5); // was one before
sum(i in indexSet) changed[i] <= 2;
Here we would be saying that out of an array of e.g. 100 binary variables, only a maximum of two would be allowed to have a different value from the previous solution.
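For readers who want to sanity-check these neighbourhood definitions outside the model, here is a small Python sketch that tests whether a candidate 0/1 solution lies inside each neighbourhood of a previous one. It is plain Python for illustration, not OPL, and the function names are made up.

```python
def within_count_window(previous, candidate, slack=2):
    """True if the number of ones differs from the previous solution by at most slack."""
    return abs(sum(candidate) - sum(previous)) <= slack

def within_hamming(previous, candidate, max_changes=2):
    """True if at most max_changes positions differ from the previous solution."""
    changed = sum(1 for p, c in zip(previous, candidate) if p != c)
    return changed <= max_changes

# 100 binary variables, 10 of which were 1 in the previous solution.
previous = [1] * 10 + [0] * 90

# Swap one 1 with one 0: same count, two positions changed.
small_move = list(previous)
small_move[0], small_move[50] = 0, 1

# Swap three 1s with three 0s: same count, six positions changed.
big_move = list(previous)
for off, on in ((0, 50), (1, 51), (2, 52)):
    big_move[off], big_move[on] = 0, 1
```

Here `small_move` is inside both neighbourhoods, while `big_move` passes the count window but fails the Hamming limit, which shows that the count-based constraint alone is the looser of the two.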
Of course you can combine these ideas. For example, add simple constraints to fix a large part of the problem to previous values, while leaving some other variables to be re-solved, and then add constraints on some of the remaining free variables to limit the new solution to be near to the previous one. You will notice of course that these schemes get more complex to implement and maintain as we try to be more clever.
To make the local search work well you will need to think carefully about how you construct your local neighbourhoods - too small and there will be too little opportunity to make the improvements you seek, while if they are too large they take too long to solve, so you don't get to make so many improvement steps.
A related point is that each neighbourhood needs to be reasonably internally connected. We have done some experiments where we fixed the values of maybe 99% of the variables in a model and solved for the remaining 1%. When the 1% was clustered together in the model (e.g. all the allocation variables for a subset of resources) we got good results, while in comparison we got nowhere by just choosing 1% of the variables at random from anywhere in the model.
An often overlooked idea is to invert these same limits on the model, as a way of forcing some changes into the solution to achieve a degree of diversification. So you could add a constraint to force a specific value to be different from a previous solution, or ensure that at least two out of an array of 100 binary variables have a different value from the previous solution. We have used this approach to get a sort-of tabu search with a hybrid matheuristic model.
Finally, we have mainly done this in C++ and C#, but it would work perfectly well from Java. Not tried it much from OPL, but it should be fine too. The key for us was being able to traverse the problem structure and use problem knowledge to choose the sets of variables we freeze or relax - we just found that easier and faster to code in a language like C#, but then the modelling stuff is more difficult to write and maintain. We are maybe a bit "old-school" and like to have detailed fine-grained control of what we are doing, and find we need to create many more arrays and index sets in OPL to achieve what we want, while we can achieve the same effect with more intelligent loops etc without creating so many data structures in a language like C#.
Those are several questions. So here are some pointers and suggestions:
1. In CPLEX, you give your model an initial solution with the use of IloOplCplexVectors()
Here's a good example in IBM's documentation of how to alter CPLEX's solution.
2. Within OPL, you can do the same. You basically set a series of values for your variables, and hand those over to CPLEX. (See this example.)
3. Limiting the search to a specific neighborhood: There is no easy way to respond without knowing the details. But there are two ways that people do this:
a. change the objective to favor that 'neighborhood' and make other areas unattractive.
b. Add constraints that weed out other neighborhoods from the search space.
4. Regarding limiting the range of variables in OPL, you can do it directly:
dvar int supply in minQty..maxQty;
Or for a whole array of decision variables, you can do something along the lines of:
range CreditsAllowed = 3..12;
dvar int credits[student] in CreditsAllowed;
Hope this helps you move forward.

HLSL branch avoidance

I have a shader where I want to move half of the vertices in the vertex shader. I'm trying to decide the best way to do this from a performance standpoint, because we're dealing with well over 100,000 verts, so speed is critical. I've looked at 3 different methods: (pseudo-code, but enough to give you the idea. The <complex formula> I can't give out, but I can say that it involves a sin() function, as well as a function call (just returns a number, but still a function call), as well as a bunch of basic arithmetic on floating point numbers).
if (y < 0.5)
{
    x += <complex formula>;
}
This has the advantage that the <complex formula> is only executed half the time, but the downside is that it definitely causes a branch, which may actually be slower than the formula. It is the most readable, but we care more about speed than readability in this context.
x += step(y, 0.5) * <complex formula>;
Using HLSL's step() function (which returns 0 if the first param is greater and 1 if less), you can eliminate the branch, but now the <complex formula> is being called every time, and its results are being multiplied by 0 (thus wasted effort) half of the time.
x += (y < 0.5) ? <complex formula> : 0;
This I don't know about. Does the ?: cause a branch? And if not, are both sides of the equation evaluated or only the one that is relevant?
The final possibility is that the <complex formula> could be offloaded back to the CPU instead of the GPU, but I worry that it will be slower in calculating sin() and other operations, which might result in a net loss. Also, it means one more number has to be passed to the shader, and that could cause overhead as well. Anyone have any insight as to which would be the best course of action?
Addendum:
According to http://msdn.microsoft.com/en-us/library/windows/desktop/bb509665%28v=vs.85%29.aspx
the step() function uses a ?: internally, so it's probably no better than my 3rd solution, and potentially worse since <complex formula> is definitely called every time, whereas it may be only called half the time with a straight ?:. (Nobody's answered that part of the question yet.) Though avoiding both and using:
x += (1.0 - y) * <complex formula>;
may well be better than any of them, since there's no comparison being made anywhere. (And y is always either 0 or 1.) Still executes the <complex formula> needlessly half the time, but might be worth it to avoid branches altogether.
Perhaps look at this answer.
My guess (this is a performance question: measure it!) is that you are best off keeping the if statement.
Reason number one: The shader compiler, in theory (and if invoked correctly), should be clever enough to make the best choice between a branch instruction, and something similar to the step function, when it compiles your if statement. The only way to improve on it is to profile[1]. Note that it's probably hardware-dependent at this level of granularity.
[1] Or if you have specific knowledge about how your data is laid out, read on...
Reason number two is the way shader units work: If even one fragment or vertex in the unit takes a different branch to the others, then the shader unit must take both branches. But if they all take the same branch - the other branch is ignored. So while it is per-unit, rather than per-vertex - it is still possible for the expensive branch to be skipped.
For fragments, the shader units have on-screen locality - meaning you get best performance with groups of nearby pixels all taking the same branch (see the illustration in my linked answer). To be honest, I don't know how vertices are grouped into units - but if your data is grouped appropriately - you should get the desired performance benefit.
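The effect of grouping can be illustrated with a toy model in Python (used here only as a simulation; this is not shader code). It counts how many shader units would have to run the expensive branch for two orderings of the same vertices. The unit size of 32 is an assumption chosen for illustration, since actual grouping is hardware-dependent.

```python
def units_running_formula(y_values, unit_size=32):
    """Count units in which at least one vertex takes the y < 0.5 branch.

    unit_size is a hypothetical group width; real hardware may differ.
    """
    count = 0
    for start in range(0, len(y_values), unit_size):
        unit = y_values[start:start + unit_size]
        # If even one vertex in the unit needs the formula, the whole unit pays for it.
        if any(y < 0.5 for y in unit):
            count += 1
    return count

n = 1024
interleaved = [i % 2 for i in range(n)]     # 0, 1, 0, 1, ...
grouped = [0] * (n // 2) + [1] * (n // 2)   # all zeros first, then all ones

interleaved_units = units_running_formula(interleaved)
grouped_units = units_running_formula(grouped)
```

In this model all 32 units run the formula when the vertices are interleaved, but only 16 do when they are grouped, even though exactly half of the vertices need it in both cases.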
Finally: It's worth pointing out that if you can hoist your <complex formula> out of your HLSL manually, it may well get hoisted into a CPU-based pre-shader anyway (on PC at least; from memory, Xbox 360 doesn't support this, and I have no idea about PS3). You can check this by decompiling the shader. If it is something that you only need to calculate once per draw (rather than per vertex/fragment), it is probably best for performance to do it on the CPU.
I got tired of my conditionals being ignored, so I just made another kernel and did an override in C execution.
If you need it to be accurate all the time I suggest this fix.