Implementing Bayes' theorem in a fitness function - bayesian

In an evolutionary programming project I'm working on, I thought it could be useful to apply the formula from Bayes' theorem, although I'm not totally sure what that would look like.
The programs being evolved attempt to predict the future state of a time series from past data. Given price data for the past n days, a program predicts buy if it expects the price to rise, sell if it expects it to fall, and leave if there is too little movement.
From my understanding, after testing a model on the historical data and recording its correct and incorrect predictions, I can work out the probability of the model being accurate with regard to buying using the following algorithm:
prob-b-given-a = correct-buy-predictions / total
prob-a = actual-buy-count / total
prob-b = prediction-buy-count / total
prob-a-given-b = (prob-b-given-a * prob-a) / prob-b
fitness = prob-a-given-b //last step for clarification
Am I interpreting Bayes' theorem correctly and is this a suitable fitness function?
How would I combine the fitness function for all predictions? (in my example I only show the predictive probability of the buy prediction)
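For concreteness, the algorithm above for the buy case might look like this in Python (the counts are whatever the historical back-test records; the function name is mine):

```python
def buy_fitness(correct_buy, actual_buy, predicted_buy, total):
    # P(B|A): fraction of test days with a correct buy prediction
    p_b_given_a = correct_buy / total
    # P(A): fraction of test days on which the price actually rose
    p_a = actual_buy / total
    # P(B): fraction of test days on which the model predicted buy
    p_b = predicted_buy / total
    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b
```

For example, with 30 correct buy predictions, 50 actual rises, and 40 buy predictions out of 100 test days, this gives (0.3 * 0.5) / 0.4 = 0.375.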

Related

The acceptance rate jumps between 0 and 1 drastically in high-dimensional MH algorithm. How can I tune it?

One part of my MCMC algorithm uses the MH algorithm to update an $n \times 1$ vector of parameters $\boldsymbol{\delta}$. I think it is less computationally intensive to propose a new sample from an $n \times 1$ multivariate proposal distribution ($n$ is large). However, it seems impossible to tune the acceptance rate towards some ideal interval, such as 0.2 to 0.5.
I have tried random walk updates based on multivariate normal and multivariate uniform distributions. No matter how I adjust the step size, the acceptance rate follows a pattern similar to the figure below.
[figure: acceptance rate over iterations]
Has anyone had this experience? Any good suggestions are welcome!
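For what it's worth, a common workaround when a joint $n$-dimensional random-walk proposal cannot be tuned is to update one coordinate at a time (Metropolis-within-Gibbs), so each move's acceptance rate stays controllable regardless of $n$. A minimal sketch on a toy target (the standard-normal `log_target` is a stand-in for the real conditional density of $\boldsymbol{\delta}$):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(delta):
    # Toy stand-in for the real (non-normalized) log-density of delta
    return -0.5 * np.sum(delta ** 2)

def mh_componentwise(delta, step, n_sweeps=1000):
    """Metropolis-within-Gibbs: propose a change to one coordinate
    at a time, accepting each with the usual Metropolis ratio."""
    accepted = 0
    n = delta.size
    for _ in range(n_sweeps):
        for i in range(n):
            prop = delta.copy()
            prop[i] += step * rng.normal()
            if np.log(rng.uniform()) < log_target(prop) - log_target(delta):
                delta = prop
                accepted += 1
    return delta, accepted / (n_sweeps * n)
```

With a joint proposal, theory for random-walk MH suggests scaling the step like $2.38/\sqrt{n}$ and aiming for an acceptance rate near 0.234, which may be why a fixed-size multivariate step seems untunable for large $n$.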

Deep Learning algorithm to calculate output of two parameters based on training data

I have to optimize my parameters in order to get the highest energy consumption. I don't think I need to explain the physical phenomenon I'm studying; the important information is this: I have two variables, the frequency F and the magnitude A. The energy consumption Y is not calculated through an equation but with a complex simulation in Ansys. From Ansys I can get the energy Y for every frequency and magnitude combination I choose. Is there a deep learning technique that allows me to use some parameter combinations and their output energy as training data to create a network that would calculate the output energy for every other parameter combination?
Ideas are welcome …
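What is being described is a surrogate (regression) model of the simulator: train on a sample of (F, A, Y) runs, then query the model for other combinations. A minimal sketch with a tiny feed-forward network in plain NumPy (the training data here is synthetic; in practice `X` and `y` would come from a batch of Ansys runs):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for (frequency, magnitude) -> energy samples
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

# One hidden layer with tanh activation, trained by full-batch gradient descent
H = 16
W1 = rng.normal(0.0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, H);      b2 = 0.0

def predict(X):
    return np.tanh(X @ W1 + b1) @ W2 + b2

mse0 = np.mean((predict(X) - y) ** 2)   # error before training

lr = 0.05
for _ in range(2000):
    A1 = np.tanh(X @ W1 + b1)           # hidden activations
    err = A1 @ W2 + b2 - y              # residuals
    # gradients of the mean squared error
    gW2 = A1.T @ err / len(X); gb2 = err.mean()
    dA1 = np.outer(err, W2) * (1.0 - A1 ** 2)
    gW1 = X.T @ dA1 / len(X); gb1 = dA1.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((predict(X) - y) ** 2)    # error after training
```

With only two inputs, simpler surrogates (polynomial regression, Gaussian processes) are also worth trying before a deep network; the sketch above is just the smallest neural version of the idea.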

Can I use this fitness function?

I am working on a project using a genetic algorithm, and I am trying to formulate a fitness function, my questions are:
What is the effect of fitness formula choice on a GA?
Is it possible to make the fitness function directly equal the number of violations (in case of minimisation)?
What is the effect of fitness formula choice on a GA
The fitness function plays a very important role in guiding a GA.
A good fitness function will help the GA explore the search space effectively and efficiently. A bad fitness function, on the other hand, can easily cause the GA to get trapped in a local optimum and lose its discovery power.
Unfortunately every problem has its own fitness function.
For classification tasks, error measures (Euclidean, Manhattan, ...) are widely adopted. You can also use entropy-based approaches.
For optimization problems, you can use a crude model of the function you are investigating.
Extensive literature is available on the characteristics of fitness function (e.g. {2}, {3}, {5}).
From an implementation point of view, some additional mechanisms have to be taken into consideration: linear scaling, sigma truncation, power scaling... (see {1}, {2}).
Also the fitness function can be dynamic: changing during the evolution to help search space exploration.
Is it possible to make the fitness function directly equal the number of violations (in case of minimisation)?
Yes, it's possible, but you have to consider that it could be too coarse-grained a fitness function.
If the fitness function is too coarse (*), it doesn't have enough expressiveness to guide the search and the genetic algorithm will get stuck in local minima a lot more often and may never converge on a solution.
Ideally a good fitness function should have the capacity to tell you what the best direction to go from a given point is: if the fitness of a point is good, a subset of its neighborhood should be better.
So no large plateau (a broad flat region that doesn't give a search direction and induces a random walk).
(*) On the other hand a perfectly smooth fitness function could be a sign you are using the wrong type of algorithm.
A naive example: you look for parameters a, b, c such that
g(x) = a * x / (b + c * sqrt(x))
is a good approximation of n given data points (x_i, y_i)
You could minimize this fitness function:
E1_i = 0 if g(x_i) == y_i, else 1
f1(a, b, c) = sum_i E1_i
and it could work, but the search isn't aimed. A better choice is:
E2_i = (y_i - g(x_i)) ^ 2
f2(a, b, c) = sum_i E2_i
now you have a "search direction" and greater probability of success.
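The two fitness functions above, sketched in Python (the model g and the parameter triple are from the example; the data would be your n observed points):

```python
import math

def g(x, a, b, c):
    # Candidate model from the example above
    return a * x / (b + c * math.sqrt(x))

def f1(params, data):
    """Coarse 0/1 fitness: counts exact misses, gives no direction."""
    a, b, c = params
    return sum(0 if g(x, a, b, c) == y else 1 for x, y in data)

def f2(params, data):
    """Squared-error fitness: smaller errors score better,
    so the search has a direction to follow."""
    a, b, c = params
    return sum((y - g(x, a, b, c)) ** 2 for x, y in data)
```

With f1, nearly every candidate scores identically (all misses), while f2 distinguishes "close" from "far" and so can guide the search.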
Further details:
Genetic Algorithms: what fitness scaling is optimal? by Vladik Kreinovich, Chris Quintana
Genetic Algorithms in Search, Optimization and Machine Learning by Goldberg, D. (1989, Addison-Wesley)
The Royal Road for Genetic Algorithms: Fitness Landscapes and GA Performance by Melanie Mitchell, Stephanie Forrest, John H Holland.
Avoiding the pitfalls of noisy fitness functions with genetic algorithms by Fiacc Larkin, Conor Ryan (ISBN: 978-1-60558-325-9)
Essentials of Metaheuristics by Sean Luke

Adding two probability density functions

I'm working with an evolutionary algorithm and I'm trying to generate a new population using a probability density function. We have many classical individuals (Xij) and their fitness f(Xij); to get a probability for each of them, I normalize the fitness. For instance, I get Xij with probability 0.10, and from these I can build a pdf.
During the iterations I will have many pdfs, and I want to carry information from the last pdf into the current one. I have tried adding them, but I think that's not the best choice. What would you recommend?
Example of the pdfs:
PS: you can see the peaks of the pdfs in the picture.
To answer this question I will have to make some assumptions:
You have individuals Xij
You have individual fitness or scores f(Xij)
You normalize the score/fitness to a probability density function by applying a function to the fitness. In other words: pdf(Xij) = g(f(Xij))
You require persistence in the pdf to allow selection of specimens that performed well over several iterations.
I would either use a running average of the fitness score and then normalize the averaged score; or use a running average of the normalized score for a new pdf.
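A sketch of the second option under those assumptions (alpha is an illustrative smoothing factor; fitness is assumed non-negative):

```python
import numpy as np

def selection_pdf(fitness, prev_avg=None, alpha=0.3):
    """Exponential running average of fitness scores, then
    normalized into a selection probability distribution."""
    fitness = np.asarray(fitness, dtype=float)
    # Blend the new scores with the previous running average
    avg = fitness if prev_avg is None else alpha * fitness + (1 - alpha) * prev_avg
    pdf = avg / avg.sum()          # normalize (assumes non-negative fitness)
    return pdf, avg
```

Each generation you pass the previous `avg` back in, so regions that performed well over several iterations keep selection weight instead of being wiped out by one noisy generation.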

How to design acceptance probability function for simulated annealing with multiple distinct costs?

I am using simulated annealing to solve an NP-complete resource scheduling problem. For each candidate ordering of the tasks I compute several different costs (or energy values). Some examples are (though the specifics are probably irrelevant to the question):
global_finish_time: The total number of days that the schedule spans.
split_cost: The number of days by which each task is delayed due to interruptions by other tasks (this is meant to discourage interruption of a task once it has started).
deadline_cost: The sum of the squared number of days by which each missed deadline is overdue.
The traditional acceptance probability function looks like this (in Python):
def acceptance_probability(old_cost, new_cost, temperature):
    if new_cost < old_cost:
        return 1.0
    else:
        return math.exp((old_cost - new_cost) / temperature)
So far I have combined my first two costs into one by simply adding them, so that I can feed the result into acceptance_probability. But what I would really want is for deadline_cost to always take precedence over global_finish_time, and for global_finish_time to take precedence over split_cost.
So my question to Stack Overflow is: how can I design an acceptance probability function that takes multiple energies into account but always considers the first energy to be more important than the second energy, and so on? In other words, I would like to pass in old_cost and new_cost as tuples of several costs and return a sensible acceptance probability.
Edit: After a few days of experimenting with the proposed solutions I have concluded that the only way that works well enough for me is Mike Dunlavey's suggestion, even though this creates many other difficulties with cost components that have different units. I am practically forced to compare apples with oranges.
So, I put some effort into "normalizing" the values. First, deadline_cost is a sum of squares, so it grows quadratically while the other components grow linearly. To address this I use the square root to get a similar growth rate. Second, I developed a function that computes a linear combination of the costs, but auto-adjusts the coefficients according to the highest cost component seen so far.
For example, if the tuple of highest costs is (A, B, C) and the input cost vector is (x, y, z), the linear combination is BCx + Cy + z. That way, no matter how high z gets it will never be more important than an x value of 1.
This creates "jaggies" in the cost function as new maximum costs are discovered. For example, if C goes up then BCx and Cy will both be higher for a given (x, y, z) input and so will differences between costs. A higher cost difference means that the acceptance probability will drop, as if the temperature was suddenly lowered an extra step. In practice though this is not a problem because the maximum costs are updated only a few times in the beginning and do not change later. I believe this could even be theoretically proven to converge to a correct result since we know that the cost will converge toward a lower value.
One thing that still has me somewhat confused is what happens when the maximum costs are 1.0 and lower, say 0.5. With a maximum vector of (0.5, 0.5, 0.5) this would give the linear combination 0.5*0.5*x + 0.5*y + z, i.e. the order of precedence is suddenly reversed. I suppose the best way to deal with it is to use the maximum vector to scale all values to given ranges, so that the coefficients can always be the same (say, 100x + 10y + z). But I haven't tried that yet.
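A sketch of that last idea (scaling each component by the running maximum so fixed precedence weights like 100x + 10y + z keep their ordering even when the raw maxima fall below 1.0; the weights are illustrative):

```python
def combined_cost(costs, max_costs, weights=(100.0, 10.0, 1.0)):
    """Scale each cost component into [0, 1] by the largest value
    seen so far, then apply fixed precedence weights, so the
    ordering no longer flips when the maxima drop below 1.0."""
    scaled = [c / m if m > 0 else 0.0 for c, m in zip(costs, max_costs)]
    return sum(w * s for w, s in zip(weights, scaled))
```

With this scaling a maximal first component always outweighs maximal second and third components combined (100 > 10 + 1), regardless of the units of the raw costs.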
mbeckish is right.
Could you make a linear combination of the different energies, and adjust the coefficients?
Possibly log-transforming them in and out?
I've done some MCMC using Metropolis-Hastings. In that case I define the (non-normalized) log-likelihood of a particular state (given its priors), and I find that to be a good way to clarify my thinking about what I want.
I would take a hint from multi-objective evolutionary algorithm (MOEA) and have it transition if all of the objectives simultaneously pass with the acceptance_probability function you gave. This will have the effect of exploring the Pareto front much like the standard simulated annealing explores plateaus of same-energy solutions.
However, this does give up on the idea of having the first one take priority.
You will probably have to tweak your parameters, such as giving it a higher initial temperature.
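A sketch of the simultaneous-pass rule, reusing the acceptance function from the question (repeated here so the snippet is self-contained):

```python
import math
import random

def acceptance_probability(old_cost, new_cost, temperature):
    if new_cost < old_cost:
        return 1.0
    return math.exp((old_cost - new_cost) / temperature)

def accept_all(old_costs, new_costs, temperature):
    """Accept the move only if every objective passes its own
    Metropolis test; strict improvements always pass, and large
    regressions in any single objective veto the move."""
    return all(random.random() < acceptance_probability(o, n, temperature)
               for o, n in zip(old_costs, new_costs))
```

As noted above, this treats the objectives symmetrically, so it gives up the strict precedence of deadline_cost over the others.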
I would consider something along the lines of:
if (new deadline_cost > old deadline_cost)
    return (calculate probability)
else if (new global_finish_time > old global_finish_time)
    return (calculate probability)
else if (new split_cost > old split_cost)
    return (calculate probability)
else
    return (1.0)
Of course each of the three places you calculate the probability could use a different function.
It depends on what you mean by "takes precedence".
For example, what if the deadline_cost goes down by 0.001, but the global_finish_time cost goes up by 10000? Do you return 1.0, because the deadline_cost decreased, and that takes precedence over anything else?
This seems like it is a judgment call that only you can make, unless you can provide enough background information on the project so that others can suggest their own informed judgment call.