I am using Custom Experiment Optimization in AnyLogic 8.7.2. My model optimizes the usage of a set of resources (a Resource Pool) to meet a certain pre-defined demand, and I define my own decision variables and objective function. This Custom Experiment works fine when my model is small (fewer agents and hence fewer decision variables). However, as the model grows in size, the optimization engine seems to get stuck with no warning or error message.
I have tried a few things, such as increasing the heap size and keeping the upper bounds of the decision variables small (varied from 10 to 15), but none of these seem to work.
Any workaround, or help on what could be causing this, would be really useful. Thanks.
I have been working on a simulation model for battery swapping in AnyLogic. So far I have developed the simulation model, an optimization experiment, and a parameter variation experiment.
There are no errors in the model, but the output values are unsatisfactory. Small changes, such as changing the step size of the decision variables, result in a drastic change in the best value obtained after every experiment. Although the objective does not change much, I am concerned about the other variables that change with each run. Even with multiple optimization runs it is difficult to come to a conclusion.
For reference, I am posting the output of a parameter variation experiment here. I ran the experiment with an optimized value, but I was getting feasible results (percentile > 95%) far off the expected input values. Although the overall trend is correct (decreasing percentile with increasing charging time), it is difficult to understand the variability.
Can anyone help?
When building a model, this is a common problem you will run into when looking only at high-level overall outputs. You could have a model bug, but it is just as likely (if not more likely) that there is some dynamic in your system that was not apparent from simple Excel spreadsheets or mental models. The DES may be telling you something truly interesting about the system behavior, but without additional outputs there is no way to understand what that is.
A few suggestions:
Run this as a simple single scenario, where you manually update the inputs. When you run it with the low range of input values and then the high range of input values, what do you see on the animation or in additional outputs that is different from what you expected, or that could explain the overall output trend? Try running several intermediate points as well.
Add additional output metrics. If you look at queue sizes, resource utilizations, turn-around times, etc., do you see anything at that level that is different from what you expected?
Add a "replication" log. When you run a set of inputs for multiple scenarios, does any single replication stand out as an outlier? If so, re-run the scenario with that set of inputs and that random seed.
There is no substitute for understanding the underlying system behavior; without understanding those dynamics, looking at overall correlations from optimization or parameter variation experiments will often lead companies to make the wrong policy decisions.
I am trying to solve an optimisation problem that consists in finding the global maximum of a high-dimensional (10+) monotonic function (monotonic in every direction). The constraints are such that they cut the search space with planes.
I have coded the whole thing in pyomo and I am using the ipopt solver. In most cases, I am confident it converges successfully to the global optimum. But if I play a bit with the constraints, I see that it sometimes converges to a local optimum.
It looks like an exploration-exploitation trade-off.
I have looked into the options that can be passed to ipopt, and the list is so long that I cannot understand which parameters to play with to help convergence to the global optimum.
edit:
Two hints towards a solution:
My variables used to be defined with infinite bounds, e.g. bounds=(0, None), allowing them to move along an infinite half-line. I have now enforced finite bounds on both sides.
I am now using multiple starts with:
opt = SolverFactory('multistart')
results = opt.solve(self.model, solver='ipopt', strategy='midpoint_guess_and_bound')
So far this has made me happy with the convergence.
Sorry, IPOPT is a local solver. If you really want to find global solutions, you can use a global solver such as Baron, Couenne or Antigone. There is a trade-off: global solvers are slower and may not work for large problems.
Alternatively, you can help local solvers with a good initial point. Be aware that active set methods are often better in this respect than interior point methods. Sometimes multistart algorithms are used to prevent bad local optima: use a bunch of different starting points. Pyomo has some facilities to do this (see the documentation).
Can anyone give me some tips to make a binary integer programming model faster?
I currently have a model that runs well with a very small number of variables, but as soon as I increase the number of variables in my model, SCIP keeps running without giving me an optimal solution. I'm currently using SCIP with SoPlex to find an optimal solution.
You should have a look at the statistics (type display statistics in the interactive shell). Watch out for time-consuming heuristics that don't find a solution and try disabling them. You should also play around with the parameters to find settings better suited to your instances (a different branching rule or node selection, for example). Without further information, though, we won't be able to help you.
I have an iterative computation that involves a Fourier transform in each iteration.
At a high level it looks like this:
// executed on the host, calling functions that run on the device
B = image
L = 100
while (L--) {
    A = FFT_2D(B)
    A = SOME_PER_PIXEL_CALCULATION(A)
    B = INVERSE_FFT_2D(A)
    B = SOME_PER_PIXEL_CALCULATION(B)
}
I am using "cufft" library to do the transforms.
Now the problem is that I am always working with global memory. Basically, if there were a way of doing some of the work in shared memory it would be great, but it seems like using the FFT won't allow me to bypass this, given that the cuFFT functions can only be called from the host and store their input and output in global memory.
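For concreteness, the host-side loop looks roughly like this (this is only a sketch, not my actual code; the image size, launch configuration and kernel names are placeholders):

#include <cufft.h>

// placeholders for my per-pixel kernels, defined elsewhere
__global__ void perPixelFreq(cufftComplex *data, int n);
__global__ void perPixelSpace(cufftComplex *data, int n);

void iterate(cufftComplex *d_A, cufftComplex *d_B)  // d_A, d_B already allocated on the device
{
    const int WIDTH = 1024, HEIGHT = 1024;          // placeholder image size
    const int N = WIDTH * HEIGHT;

    cufftHandle plan;
    cufftPlan2d(&plan, HEIGHT, WIDTH, CUFFT_C2C);

    dim3 threads(256);
    dim3 blocks((N + threads.x - 1) / threads.x);

    for (int l = 0; l < 100; ++l) {
        cufftExecC2C(plan, d_B, d_A, CUFFT_FORWARD);  // reads and writes global memory
        perPixelFreq<<<blocks, threads>>>(d_A, N);    // per-pixel work on the spectrum
        cufftExecC2C(plan, d_A, d_B, CUFFT_INVERSE);  // unnormalized inverse (cuFFT does not scale by 1/N)
        perPixelSpace<<<blocks, threads>>>(d_B, N);   // per-pixel work on the image
    }
    cufftDestroy(plan);
}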
How should I tackle this?
Thanks.
EDIT:
Since there IS a data dependency, it would seem like I can't do much but optimize the 'per pixel' calculations...
The bottleneck is still the fact that the kernels pass the data via global memory, which seems unavoidable in this case.
So basically, the fact that I have to do the transform and its inverse is what keeps me from sharing intermediate computation data.
Currently I am exploring ways of doing most of the calculation in frequency space
(more of a math problem).
So does anyone have a good idea on how to approximate F{max(0, f(x,y))} given F{f(x,y)}?
EDIT:
Note that f(x,y) is in the time domain and is therefore real-valued.
f(x,y) is also processed before calculating the pointwise max(0, f(x,y)), so it is indeed possible for negative values to appear.
Concerning the FFT/IFFT, I think you are wrongly assuming that the cuFFT routines do not internally use shared memory. Typical algorithms for FFT calculation split the entire FFT into smaller ones that fit in one thread block, so they probably already exploit shared memory internally; see for example the paper.
Concerning the PER_PIXEL_CALCULATIONS, shared memory is typically used to make threads within a thread block cooperate with each other. My question is: are the PER_PIXEL_CALCULATIONS independent of each other? If so, perhaps thread cooperation is not needed, you would not need shared memory either, and you could arrange the calculations to use only registers.
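As an illustration (just a sketch; the kernel name is a placeholder and the scaling stands in for whatever per-pixel operation you actually perform), an independent per-pixel operation needs only registers plus the unavoidable global-memory load and store:

#include <cufft.h>

// each thread reads one element into a register, transforms it and writes it back:
// no inter-thread cooperation, hence no need for shared memory
__global__ void perPixelFreq(cufftComplex *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        cufftComplex v = data[i];   // held in registers
        v.x *= 0.5f;                // stand-in for the real per-pixel operation
        v.y *= 0.5f;
        data[i] = v;
    }
}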
Anyway, to be more specific on the latter point, you should provide more information on what you actually need (by editing your original post). Is your code related to an implementation of the Gerchberg-Saxton algorithm?
Hey there,
I'm currently developing a MEX file in MATLAB that includes CUDA computation. I wonder if there's a good way to 'automatically' optimize the program for arbitrary input parameters from the user. E.g. when the input parameters don't exceed a certain size, try to use shared and/or constant memory, which will only work up to certain limits; from there on, global memory has to be used. But such optimizations can only be made at runtime, because that is the point at which I learn the size of the user's input parameters. Any simple solution?
Thanks!
You can simply write different kernels and decide which ones to call at runtime.
You can also use the device query API or do some micro-benchmarking to figure out the sizes of shared/constant memory at runtime. This is probably necessary if you don't want to assume a particular GPU model.
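As a rough sketch (kernel_shared and kernel_global are placeholder names for your two variants, and the launch configuration is up to you), the runtime decision could look like this:

#include <cuda_runtime.h>

// two hypothetical variants of the same computation, defined elsewhere
__global__ void kernel_shared(const float *in, float *out, int n);
__global__ void kernel_global(const float *in, float *out, int n);

// called from the mex entry point once the input size n is known
void launch(const float *d_in, float *d_out, int n, dim3 grid, dim3 block)
{
    int dev = 0;
    cudaDeviceProp prop;
    cudaGetDevice(&dev);
    cudaGetDeviceProperties(&prop, dev);    // prop.totalConstMem gives the constant memory size analogously

    size_t bytesNeeded = (size_t)n * sizeof(float);   // dynamic shared memory the fast path would need
    if (bytesNeeded <= prop.sharedMemPerBlock) {
        // small enough: use the shared-memory version
        kernel_shared<<<grid, block, bytesNeeded>>>(d_in, d_out, n);
    } else {
        // too large for shared memory: fall back to the global-memory version
        kernel_global<<<grid, block>>>(d_in, d_out, n);
    }
}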