Using pymc.potential to prevent evaluation of a function at meaningless parameter values - bayesian

I am building a pymc model which must evaluate a very CPU-expensive function (up to 1 sec per call on very decent hardware). I am trying to limit the explored parameter space to meaningful solutions by means of a potential (the sum of a list of my variables has to stay within a given range). This works, but I noticed that the expensive function still gets evaluated even when my potential returns an infinite value and forbids the parameter choice. Is there a way to prevent that? Can one force the sampler to use a given evaluation sequence (pick up the necessary variables, check whether the potential allows them, and proceed only if it does)?
I thought of moving the check inside the expensive function itself and using it to decide whether to proceed or return immediately, but is there a better way?
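For concreteness, here is a minimal PyMC2-style sketch of the constraint described above; the variable names and bounds are purely illustrative:

import numpy as np
import pymc

LO, HI = 0.0, 10.0  # allowed range for the sum (illustrative)
a = pymc.Uniform('a', 0.0, 5.0)
b = pymc.Uniform('b', 0.0, 5.0)

@pymc.potential
def sum_constraint(a=a, b=b):
    # A -inf log-probability forbids this parameter choice, but the
    # expensive likelihood may still get evaluated, which is the
    # problem described above.
    return 0.0 if LO <= a + b <= HI else -np.inf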
Jean-François

I am not aware of a way of ordering the evaluation of the potentials. This might not be the best way of doing it, but you could check whether the parameters are within reasonable bounds at the beginning of the simulation. If the parameters are not within reasonable bounds, you can return a value that forces your posterior to zero.
Another option is to create a function for your likelihood. At the beginning of this function you could check whether the parameters are within reasonable limits. If they are not, you can return -inf without running your simulation; if they are, you can run your model and calculate the log(p).
This is definitely not an elegant solution but it should work.
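As a minimal sketch of that early-return likelihood, in PyMC2 style (the data, bounds, and expensive_simulation stand-in below are hypothetical):

import numpy as np
import pymc

LO, HI = 0.0, 10.0            # allowed range for a + b (illustrative)
data = np.random.randn(100)   # stand-in for the observed data
a = pymc.Uniform('a', 0.0, 5.0)
b = pymc.Uniform('b', 0.0, 5.0)

def expensive_simulation(a, b):
    # Stand-in for the ~1 s/call function.
    return (a + b) * np.ones_like(data)

@pymc.stochastic(observed=True)
def likelihood(value=data, a=a, b=b):
    # Cheap feasibility check first: bail out before the expensive call.
    if not (LO <= a + b <= HI):
        return -np.inf
    return -0.5 * np.sum((value - expensive_simulation(a, b)) ** 2)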
Full disclosure - I am not by any means a pymc expert.

Related

SciPy Basinhopping not returning lowest-found minimum

I know there is a very similar question, but mine is different. I am running an optimization using basinhopping with the Powell method. Within the function I am optimizing, I also store the parameters and the resulting cost-function value for each evaluation in an external array, so I can check the results afterwards. I've noticed repeatedly that the lowest minimization result which the basinhopping function returns is not actually the set of parameters that resulted in the lowest overall error. I assume this is not a bug, but rather my misunderstanding of how the technique works. For example, in an optimization I just ran, the result which was returned was actually only the 35th-best option when I checked my arrays after completion. The difference in cost is very small (I'm using RMSE as a metric, and the difference is 0.02), but I still don't understand how it selected the minimum.
My first thought was maybe these parameters somehow exceeded the bounds I set, but I checked and that isn't the case.
I don't yet have a shareable reproducible version since I'm using some internal modules in the function call, but I figured I would post my question since it is more about the conceptual aspect of how basinhopping selects its result.
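A minimal sketch of the recording setup described above, with a toy objective standing in for the real cost function:

import numpy as np
from scipy.optimize import basinhopping

history = []  # (parameters, cost) for every single objective evaluation

def objective(x):
    cost = float(np.sum((x - 1.2) ** 2))  # toy stand-in for the real cost
    history.append((x.copy(), cost))
    return cost

result = basinhopping(objective, x0=np.zeros(3),
                      minimizer_kwargs={"method": "Powell"}, niter=50)

best_recorded = min(history, key=lambda pair: pair[1])
# result.fun is the best accepted local-minimization result; the lowest
# entry in history can be an intermediate Powell evaluation that the
# local minimizer never returned as its final point.
print(result.fun, best_recorded[1])

One plausible explanation of the observation: only the final points of the local minimizations are candidates for the returned minimum, so intermediate evaluations recorded in the array can be slightly lower than result.fun.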

Numerical Instability in Optim.jl

I'm currently working on a project in Julia where I start with an input beta which is assumed to be incorrect. I run through a sequence of code that updates this beta to the correct value and check the error. As beta gets larger, I expect this error to approach 100%. The code ultimately minimizes some parameter chi, which is why I've chosen to use the optimize function from Optim.jl. The output I'm getting is shown below.
When I perform this calculation by hand (using 1st and 2nd derivative to update) I get this
I see that this still has some numerical instability, but it holds up longer than the Optim version does. I would expect the opposite. My optimize call is set up as
result = optimize(β -> TEfunc(E, nc, onecut, β, pcutoff, μcutoff, N), β/2, 2.2*β, Brent(), abs_tol=tempcutoff, rel_tol=sqrt(tempcutoff))
βstar=Optim.minimizer(result)
Is there an argument that I'm missing in the optimize call? I just want to figure out why I have numerical instability so quickly.

Determining a program's execution time by its length in bits?

This is a question that popped into my mind while reading about the halting problem, the Collatz conjecture, and Kolmogorov complexity. I tried to search for something similar but was unable to find this particular topic, maybe because it is not of great value or it is just a trivial question.
For the sake of simplicity I will give three examples of programs/functions.
def one(s):
    return s

def two(s):
    while True:
        print(s)

def three(s):
    for i in range(10**10):
        print(s)
So my question is: is there a way to formalize the length of a program (such as the number of bits used to describe it), together with the internal memory the program uses, in order to determine the minimum/maximum number of time steps needed to decide whether the program will terminate or run forever?
For example, in the first function the program doesn't alter its internal memory and halts after some time steps.
In the second example, the program runs forever, but it also doesn't alter its internal memory. If we considered all programs of the same length as program two that do not alter their state, couldn't we determine an upper bound on the number of steps which, if surpassed, would let us conclude that the program will never terminate? (If not, why not?)
In the last example, the program alters its state (the variable i), so at each step the upper bound may change.
[In short]
Kolmogorov complexity suggests a way of finding the (descriptive) complexity of an object such as a piece of text. I would like to know whether, given a formal way of describing the memory used by a program at runtime, we could compute a maximum number of steps which, if surpassed, would let us decide whether the program will terminate or run forever.
Finally, I would appreciate suggestions for any sources that I might find useful to help me figure out what exactly I am looking for.
Thank you. (Sorry for my English; it's not my native language. I hope I was clear.)
If a deterministic Turing machine enters precisely the same configuration twice (which we can detect by keeping a trace of the configurations seen so far), then we immediately know the TM will loop forever.
If it is known in advance that a deterministic Turing machine cannot possibly use more than some fixed constant amount of its input tape, then the TM must either halt explicitly or eventually re-enter some configuration it has already visited. Suppose the TM can use at most k tape cells, the tape alphabet is T, and the set of states is Q. Then there are at most (|T|+1)^k * |Q| * k distinct configurations (the number of strings of length k over T plus the blank symbol, times the number of states, times the possible head positions), and by the pigeonhole principle a TM that runs for more steps than that must have entered some configuration it has already been in before.
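A minimal sketch of that loop-detection argument, assuming a hypothetical deterministic step function that maps a configuration to the next one (or to None when the program halts):

def classify(step, start, max_steps):
    # Record every (hashable) configuration seen; a repeat proves an
    # infinite loop. For a TM limited to k cells with tape alphabet T
    # and states Q, max_steps = (|T| + 1)**k * |Q| * k suffices by the
    # pigeonhole argument above, so "undecided" is then never returned.
    seen = set()
    config = start
    for _ in range(max_steps):
        if config is None:
            return "halts"
        if config in seen:
            return "loops forever"
        seen.add(config)
        config = step(config)
    return "undecided within max_steps"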
one: because we are given that this function does not alter its internal memory, the bound above lets us decide whether it halts or loops forever (here it simply halts).
two: likewise, within that bound this function must revisit a configuration, so we can conclude that it loops forever.
three: because we are given that this function only uses a fixed amount of internal memory (the counter fits in about 34 bits, since 2^34 ≈ 1.7 × 10^10 > 10^10), we can tell within on the order of 2^34 loop iterations whether the TM will halt or not for any given input s, guaranteed.
Now, knowing how much tape a TM is going to use, or how much memory a program is going to use, is not a problem a TM can solve in general. But if you have an oracle (say, a person who has produced a proof) that tells you a correct fixed upper bound on memory, then the halting problem becomes decidable for that machine.

z3 minimization and timeout

I am trying to use the z3 solver for a minimization problem. I want to set a timeout and return the best solution found so far. I use the Python API and the timeout option "smt.timeout" with
set_option("smt.timeout", 1000) # 1s timeout
This does time out after about 1 second, but a larger timeout does not yield a smaller objective. I ended up turning on verbosity with
set_option("verbose", 2)
And I think that z3 successively evaluates larger values of my objective, until the problem is satisfiable:
(opt.maxres [0:6117664])
(opt.maxres [175560:6117664])
(opt.maxres [236460:6117664])
(opt.maxres [297360:6117664])
...
(opt.maxres [940415:6117664])
(opt.maxres [945805:6117664])
...
I thus have two questions:
Can I instead tell z3 to start from the upper bound and successively return models with a smaller value for my objective function (like, for instance, the MiniZinc search annotation indomain_max, http://www.minizinc.org/2.0/doc-lib/doc-annotations-search.html)?
It still looks like the solver returns a satisfiable instance of my problem. How is it found? If it is successively trying larger values of my objective, it should not have found a satisfiable instance yet when the timeout occurs...
Edit: in the opt.maxres log, the upper bound never shrinks.
For the record, I found a more verbose description of the options in the source, here: opt_params.pyg
Edit: Sorry to bother, I've been diving into this again recently. Anyway, I think this might be useful to others. I've found that I actually have to call the Optimize.upper method to get the upper bound, and the model is still not the one that corresponds to this upper bound. I've been able to add the bound as a new constraint and call a solver (without optimization, just SAT), but that's probably not the best idea. From reading this I feel like I should call Optimize.update_upper after the solver times out, but the Python interface has no such method(?). At least I can now get the upper bound and the corresponding model (at the cost of unnecessary computations, I guess).
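A sketch of that workaround on toy constraints (the variables and objective are illustrative, and opt.upper may return an unbounded expression if no upper bound was proven before the timeout):

from z3 import Optimize, Solver, Int, sat, set_option

set_option("smt.timeout", 1000)        # 1 s timeout, as above

x, y = Int('x'), Int('y')
constraints = [x >= 0, y >= 0, x + y >= 10]

opt = Optimize()
opt.add(constraints)
h = opt.minimize(x + 3 * y)            # handle on the objective
opt.check()                            # may stop at the timeout

bound = opt.upper(h)                   # best upper bound found so far

# Pin the objective to that bound and run a plain SAT check to get a
# model that actually attains it.
s = Solver()
s.add(constraints)
s.add(x + 3 * y <= bound)
if s.check() == sat:
    print(s.model())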
Z3 finds solutions for the hard constraints and records the current values for the objectives and soft constraints. The last model that was found (the model with the best value for the objectives so far) is returned if you ask for a model. The maxres strategy mainly improves the lower bounds on the soft constraints (e.g., any solution must have cost at least xx) and, whenever possible, improves the upper bound (the optimal solution has cost at most yy). The lower bounds don't tell you much other than narrowing the range of possible optimal values. The upper bounds are available when you time out.
You could try one of the other strategies, such as the one called "wmax", which performs a branch-and-prune. Typically maxres does significantly better, but you may have a better experience (depending on the problem) with wmax for improving upper bounds.
I don't have a mode where you get a stream of models. It is in principle possible, but it would require some (non-trivial) reorganization. For Pareto fronts, you make successive invocations of Optimize.check() to get the successive fronts.
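For illustration, the Pareto mode mentioned above looks roughly like this in the Python API (toy objectives; names are illustrative):

from z3 import Optimize, Int, sat

x, y = Int('x'), Int('y')
opt = Optimize()
opt.add(x >= 0, y >= 0, x + y <= 15)
opt.set(priority='pareto')   # enumerate Pareto-optimal fronts
opt.maximize(x)
opt.maximize(y)

while opt.check() == sat:    # each check() yields one front
    print(opt.model())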

Does OptaPlanner support optimizations and constraints on continuous variables?

I'm reading contradictory things in the documentation.
On one hand, this passage seems to indicate that continuous planning variables are possible:
A planning value range is the set of possible planning values for a planning variable. This set can be discrete (for example row 1, 2, 3 or 4) or continuous (for example any double between 0.0 and 1.0).
On the other hand, when defining a Planning Variable, you must specify a ValueRangeProvider annotation on a field to use for the value set:
The Solution implementation has a method which returns a Collection. Any value from that Collection is a possible planning value for this planning variable.
Both of these snippets are in the same section of the documentation (http://docs.jboss.org/drools/release/latest/optaplanner-docs/html_single/#d0e2518)
So, which is it? Can I use a full double as my planning variable, or do I need to restrict its range to the values in a specific Collection?
Looking at the actual algorithms that are provided, I don't see any that are suitable for optimizing continuous variables, so I doubt it's possible, but it would be nice to have that clarified and made explicit.
We're working towards fully supporting continuous variables. But currently (in 6.0.0.CR2) it's not decently supported yet.
Value ranges can indeed be continuous ranges, but the plumbing to actually use them isn't there yet. We have made good progress recently, see https://issues.jboss.org/browse/PLANNER-160.
Here's how it will work:
You'll be able to use a @ValueRangeProvider annotation on a method that returns a ValueRange (instead of a Collection), too.
A ValueRange will be an interface that supports selecting a random value, getting the size, ...
Out-of-the-box we will support IntValueRange, DoubleValueRange, BigDecimalValueRange, ...
(Implementation detail: we'll retro-fit those Collection-returning methods into a CollectionValueRange.)
Then the ValueSelector implementations will use that directly.
As for the suitability to optimize continuous variables:
JIT random selection will be blazing fast and very memory-efficient.
If you have an NP-complete/NP-hard problem, then OptaPlanner will be a great match. If you have only continuous variables (and not a single discrete variable), then it's unlikely that your problem is NP-complete (unless your constraints prove otherwise), and in that case you're better off with a custom, handmade, polynomial-time algorithm anyway (because the problem isn't NP-complete, there's an "easy" solution).