Optimizing Portfolio With Bounds on Weights and Costs - optimization

I wish to create efficient frontiers for portfolios with bounds on both weights and costs. The following code provides the frontiers for portfolios in which the underlying assets are bounded with minimum and maximum weights. How do I add to this a secondary constraint in which the combined annual charges of the underlying assets do not exceed a maximum? Assume each asset has an annual cost which is applied as a percentage. As such the combined weights*charges should not exceed x%.
lb=Bounds(:,1);
ub=Bounds(:,2);
P = Portfolio('AssetList', AssetList,'LowerBound', lb, 'UpperBound', ub, 'Budget', 1);
P = P.estimateAssetMoments(AssetReturns);
[Passetmean, Passetcovar] = P.getAssetMoments;
Correlations=corrcoef(AssetReturns);
% Estimate Frontier
pwgt = P.estimateFrontier(20);
[prsk, pret] = P.estimatePortMoments(pwgt);

Mary,
having entered another set of constraint principles into the model, kindly notice, that the modified efficient frontier problem is out of the grounds of a guaranteed convex-optimisation problem.
Thus one may forget about a comfort of all the popular fmicg(), l-bgfs et al solvers.
This will not simply have a SLOC one-liner to get answer(s) out of the box.
Non-linear problems will require ( the wilder, the more ... ) you to assemble another optimisation function, be it either
a brute-force based scanner,
with a fully orthogonal mesh scanned, with "utility function" defined so that, as the given requirement states, it incorporates also the add-on cost-of-beholding a Portfolio item
or
a genetic-algorithm based approach,
in a belief, the brute-force one might become as time-extensive as to cease to be a feasible approach and a GA-evolution may yield acceptable sub-optimal (local optima) outputs

Related

Optimize "1D" bin packing/sheet cutting

Our use case could be described as a variant of 1D bin packing or sheet cutting.
Imagine a drywall with a beam framing.
We want to optimize the number and size of gypsum boards that would be needed to cover the wall.
Boards must start and end on a beam.
Boards must not overlap (hard constraint).
Less (i.e. bigger) boards, the better (soft constraint).
What we currently do:
Pre-generate all possible boards and pass them as problem facts.
Let the solver pick the best subset of those (nullable planning variable).
First Fit Decreasing + Simulated Annealing
Even relatively small walls (~6m, less than 20 possible boards to pick from) take sometimes minutes and while we mostly get a feasible solution, it's rarely optimal.
Is there a better way to model that?
EDIT
Our current domain model looks like the following. Please note that the planning entity only holds the selected/picked material but nothing else. I.e. currently our planning entities are all equal, which kind of prevents any optimization that depends on planning entity difficulty.
data class Assignment(
#PlanningId
private val id: Long? = null,
#PlanningVariable(
valueRangeProviderRefs = ["materials"],
strengthComparatorClass = MaterialStrengthComparator::class,
nullable = true
)
var material: Material? = null
)
data class Material(
val start: Double,
val stop: Double,
)
Active (sub)pillar change and swap move selectors. See optaplanner docs section about move selectors (move neighorhoods). The default moves (single swap and single change) are probably getting stuck in local optima (and even though SA helps them escape those, those escapes are probably not efficient enough).
That should help, but a custom move to swap two subpillars of the almost the same size, might improve efficiency further.
Also, as you're using SA (Simulated Annealing), know that SA is parameter sensitive. Use optaplanner-benchmark to try multiple SA starting temp parameters with different dataset set sizes. Also compare it to a plain LA (Late Acceptance) in benchmarks too. LA isn't fickle like SA can be. (With fickle I don't mean unstable. I mean potential dataset size sensitive parameter tweaking.)

Octave minimization for a many-body Hamiltonian with non-linear constraint

I work in theoretical physics, and I have come upon a problem that requires the minimization of a particular Hamiltonian operator for a system of 8 particles, with one non-linear constraint. Due to the complexity of the system, I cannot define the entire Hamiltonian "in one go", nor the constraint. By this I mean that the quantity I am searching for is defined recurrently, depending on complex summations over quantities calculated for systems of 7 particles, which in turn depend on quantities calculated for systems of 6, and so on, until it reaches a one or two-particle system, for which said quantities are given as initial values, dependent on the elements of a column vector (the argument/minization parameters). The constraint itself is also of this form, requiring the "overlap" between the states of 8 particles to be exactly 1. (I.E. the state be normalized) I have been thinking of a way to use fmincon for this, but I've come up short, since my function has an implicit dependence on the parameters, and I can't write the whole thing explicitly. For a better understanding, here is some of the code:
for m=3:npairs+1
for n=3:npairs+1
for i=1:nsps
for j=1:nsps
overlap(m,n)=overlap(m,n)+x(i)*x(j)*(delta(i,j)*(overlap(m-1,n-1)-N(m-1,n-1,i))+p0p(m-1,n-1,j,i));
p(m,n,i)=(n-1)*x(i)*overlap(m,n-1)-(n-2)*(n-1)*x(i)*x(i)*((m-1)*x(i)*overlap(m-1,n-1)-(m-2)*(m-1)*x(i)*x(i)*p(m-1,n-1,i));
N(m,n,i)=2*(n-1)*x(i)*p(n-1,m,i);
p0p(m,n,i,j)=(m-1)*(n-1)*x(i)*x(j)*overlap(m-1,n-1)-(m-1)*(n-1)*(m-2)*x(i)*x(i)*x(j)*p(m-2,n-1,i)-(m-1)*(n-1)*(n-2)*x(i)*x(j)*x(j)*p0(m-1,n-2,j)-(m-1)*(n-1)*(m-2)*(n-2)*x(i)*x(i)*x(j)*x(j)*(delta(i,j)*(overlap(m-2,n-2)-N(m-2,n-2,i))+p0p(m-2,n-2,j,i));
endfor
endfor
endfor
endfor
function [E]=H(x)
E=summation over all i and j of N and p0p for m=n=8 %not actual code
endfunction
overlap(9,9)=1 %constraint
It's hard to give a specific answer, but I would advise the following to get you started.
First, note that, the inner two steps of the nest loop can be vectorised, since i and j always appear as indices (whereas m and n make backreferences, so they cannot be vectorised). So your 4-level loop can be reduced to a 2-level loop containing 4 functions operating over i-by-j matrices.
Second, note that the whole construct can be expressed as a recursive function. If you have suitable base cases for m = 0, n = 0, you can iteratively obtain all i,j matrices for all cases up to m=9,n=9. In particular, you can try to 'memoize' the early steps, and plug them into higher steps, rather than rely on actual recursion.
Assuming you need to sum with the first two indeces fixed to 8 (if I understood correctly), you can easily do with Anonymous Functions
https://octave.org/doc/v6.1.0/Anonymous-Functions.html#Anonymous-Functions
# creating same data
A=ones(8,8,4,4);
B=2*ones(8,8,4,4);
# defining 2 versions of sums
f = #(A,B) [sum(sum(A(8,8,:,:))), sum(sum(B(8,8,:,:)))];
g = #(A,B) sum(sum(A(8,8,:,:)))+ sum(sum(B(8,8,:,:)));
E1=f(A,B)
E2=g(A,B)
the output will be:
octave:21> E1=f(A,B)
E1 =
16 32
octave:22> E2=g(A,B)
E2 = 48

Setting initial values for non-linear parameters via tabuSearch

I'm trying to fit the lppl model to KLSE index to predict the most probable crash time. Many papers suggested tabuSearch to identify the initial value for non-linear parameters but none of them publish their code. I have tried to fit the mentioned index with the help of NLS And Log-Periodic Power Law (LPPL) in R. But the obtained error and p values are not significant. I believe that the initial values are not accurate. Can anyone help me on how to find the proper initial values?
library(tseries)
library(zoo)
ts<-get.hist.quote(instrument="^KLSE",start="2003-04-18",end="2008-01-30",quote="Close",provider="yahoo",origin="1970-01-01",compression="d",retclass="zoo")
df<-data.frame(ts)
df<-data.frame(Date=as.Date(rownames(df)),Y=df$Close)
df<-df[!is.na(df$Y),]
library(minpack.lm)
library(ggplot2)
df$days<-as.numeric(df$Date-df[1,]$Date)
f<-function(pars,xx){pars$a + (pars$tc - xx)^pars$m *(pars$b+ pars$c * cos(pars$omega*log(pars$tc - xx) + pars$phi))}
resids<-function(p,observed,xx){df$Y-f(p,xx)}
nls.out <- nls.lm(par=list(a=600,b=-266,tc=3000, m=.5,omega=7.8,phi=-4,c=-14),fn = resids, observed = df$Y, xx = df$days, control= nls.lm.control (maxiter =1024, ftol=1e-6, maxfev=1e6))
par<-nls.out$par
nls.final<-nls(Y~(a+(tc-days)^m*(b+c*cos(omega*log(tc-days)+phi))),data=df,start=par,algorithm="plinear",control=nls.control(maxiter=10024,minFactor=1e-8))
summary(nls.final)
I would look at some of the newer research on this topic, there is a good trig modification that will practically guarantee a singular optimization. Additionally, you can use r's built in linear equation solver, to find the linearizable parameters, ergo you will only need to optimize in 3 dimensions. The link below should get you started. I would cite recent literature and personal experience to strongly advise against a tabu search.
https://www.ethz.ch/content/dam/ethz/special-interest/mtec/chair-of-entrepreneurial-risks-dam/documents/dissertation/master%20thesis/MAS_final_Tuncay.pdf

Can I run a GA to optimize wavelet transform?

I am running a wavelet transform (cmor) to estimate damping and frequencies that exists in a signal.cmor has 2 parameters that I can change them to get more accurate results. center frequency(Fc) and bandwidth frequency(Fb). If I construct a signal with few freqs and damping then I can measure the error of my estimation(fig 2). but in actual case I have a signal and I don't know its freqs and dampings so I can't measure the error.so a friend in here suggested me to reconstruct the signal and find error by measuring the difference between the original and reconstructed signal e(t)=|x(t)−x^(t)|.
so my question is:
Does anyone know a better function to find the error between reconstructed and original signal,rather than e(t)=|x(t)−x^(t)|.
can I use GA to search for Fb and Fc? or do you know a better search method?
Hope this picture shows what I mean, the actual case is last one. others are for explanations
Thanks in advance
You say you don't know the error until after running the wavelet transform, but that's fine. You just run a wavelet transform for every individual the GA produces. Those individuals with lower errors are considered fitter and survive with greater probability. This may be very slow, but conceptually at least, that's the idea.
Let's define a Chromosome datatype containing an encoded pair of values, one for the frequency and another for the damping parameter. Don't worry too much about how their encoded for now, just assume it's an array of two doubles if you like. All that's important is that you have a way to get the values out of the chromosome. For now, I'll just refer to them by name, but you could represent them in binary, as an array of doubles, etc. The other member of the Chromosome type is a double storing its fitness.
We can obviously generate random frequency and damping values, so let's create say 100 random Chromosomes. We don't know how to set their fitness yet, but that's fine. Just set it to zero at first. To set the real fitness value, we're going to have to run the wavelet transform once for each of our 100 parameter settings.
for Chromosome chr in population
chr.fitness = run_wavelet_transform(chr.frequency, chr.damping)
end
Now we have 100 possible wavelet transforms, each with a computed error, stored in our set called population. What's left is to select fitter members of the population, breed them, and allow the fitter members of the population and offspring to survive into the next generation.
while not done
offspring = new_population()
while count(offspring) < N
parent1, parent2 = select_parents(population)
child1, child2 = do_crossover(parent1, parent2)
mutate(child1)
mutate(child2)
child1.fitness = run_wavelet_transform(child1.frequency, child1.damping)
child2.fitness = run_wavelet_transform(child2.frequency, child2.damping)
offspring.add(child1)
offspring.add(child2)
end while
population = merge(population, offspring)
end while
There are a bunch of different ways to do the individual steps like select_parents, do_crossover, mutate, and merge here, but the basic structure of the GA stays pretty much the same. You just have to run a brand new wavelet decomposition for every new offspring.

Constrained Single-Objective Optimization

Introduction
I need to split an array filled with a certain type (let's take water buckets for example) with two values set (in this case weight and volume), while keeping the difference between the total of the weight to a minimum (preferred) and the difference between the total of the volumes less than 1000 (required). This doesn't need to be a full-fetched genetic algorithm or something similar, but it should be better than what I currently have...
Current Implementation
Due to not knowing how to do it better, I started by splitting the array in two same-length arrays (the array can be filled with an uneven number of items), replacing a possibly void spot with an item with both values being 0. The sides don't need to have the same amount of items, I just didn't knew how to handle it otherwise.
After having these distributed, I'm trying to optimize them like this:
func (main *Main) Optimize() {
for {
difference := main.Difference(WEIGHT)
for i := 0; i < len(main.left); i++ {
for j := 0; j < len(main.right); j++ {
if main.DifferenceAfter(i, j, WEIGHT) < main.Difference(WEIGHT) {
main.left[i], main.right[j] = main.right[j], main.left[i]
}
}
}
if difference == main.Difference(WEIGHT) {
break
}
}
for main.Difference(CAPACITY) > 1000 {
leftIndex := 0
rightIndex := 0
liters := 0
weight := 100
for i := 0; i < len(main.left); i++ {
for j := 0; j < len(main.right); j++ {
if main.DifferenceAfter(i, j, CAPACITY) < main.Difference(CAPACITY) {
newLiters := main.Difference(CAPACITY) - main.DifferenceAfter(i, j, CAPACITY)
newWeight := main.Difference(WEIGHT) - main.DifferenceAfter(i, j, WEIGHT)
if newLiters > liters && newWeight <= weight || newLiters == liters && newWeight < weight {
leftIndex = i
rightIndex = j
liters = newLiters
weight = newWeight
}
}
}
}
main.left[leftIndex], main.right[rightIndex] = main.right[rightIndex], main.left[leftIndex]
}
}
Functions:
main.Difference(const) calculates the absolute difference between the two sides, the constant taken as an argument decides the value to calculate the difference for
main.DifferenceAfter(i, j, const) simulates a swap between the two buckets, i being the left one and j being the right one, and calculates the resulting absolute difference then, the constant again determines the value to check
Explanation:
Basically this starts by optimizing the weight, which is what the first for-loop does. On every iteration, it tries every possible combination of buckets that can be switched and if the difference after that is less than the current difference (resulting in better distribution) it switches them. If the weight doesn't change anymore, it breaks out of the for-loop. While not perfect, this works quite well, and I consider this acceptable for what I'm trying to accomplish.
Then it's supposed to optimize the distribution based on the volume, so the total difference is less than 1000. Here I tried to be more careful and search for the best combination in a run before switching it. Thus it searches for the bucket switch resulting in the biggest capacity change and is also supposed to search for a tradeoff between this, though I see the flaw that the first bucket combination tried will set the liters and weight variables, resulting in the next possible combinations being reduced by a big a amount.
Conclusion
I think I need to include some more math here, but I'm honestly stuck here and don't know how to continue here, so I'd like to get some help from you, basically that can help me here is welcome.
As previously said, your problem is actually a constrained optimisation problem with a constraint on your difference of volumes.
Mathematically, this would be minimise the difference of volumes under constraint that the difference of volumes is less than 1000. The simplest way to express it as a linear optimisation problem would be:
min weights . x
subject to volumes . x < 1000.0
for all i, x[i] = +1 or -1
Where a . b is the vector dot product. Once this problem is solved, all indices where x = +1 correspond to your first array, all indices where x = -1 correspond to your second array.
Unfortunately, 0-1 integer programming is known to be NP-hard. The simplest way of solving it is to perform exhaustive brute force exploring of the space, but it requires testing all 2^n possible vectors x (where n is the length of your original weights and volumes vectors), which can quickly get out of hands. There is a lot of literature on this topic, with more efficient algorithms, but they are often highly specific to a particular set of problems and/or constraints. You can google "linear integer programming" to see what has been done on this topic.
I think the simplest might be to perform a heuristic-based brute force search, where you prune your search tree early when it would get you out of your volume constraint, and stay close to your constraint (as a general rule, the solution of linear optimisation problems are on the edge of the feasible space).
Here are a couple of articles you might want to read on this kind of optimisations:
UCLA Linear integer programming
MIT course on Integer programming
Carleton course on Binary programming
Articles on combinatorial optimisation & linear integer programming
If you are not familiar with optimisation articles or math in general, the wikipedia articles provides a good introduction, but most articles on this topic quickly show some (pseudo)code you can adapt right away.
If your n is large, I think at some point you will have to make a trade off between how optimal your solution is and how fast it can be computed. Your solution is probably suboptimal, but it is much faster than the exhaustive search. There might be a better trade off, depending on the exact configuration of your problem.
It seems that in your case, difference of weight is objective, while difference of volume is just a constraint, which means that you are seeking for solutions that optimize difference of weight attribute (as small as possible), and satisfy the condition on difference of volume attribute (total < 1000). In this case, it's a single objective constrained optimization problem.
Whereas, if you are interested in multi-objective optimization, maybe you wanna look at the concept of Pareto Frontier: http://en.wikipedia.org/wiki/Pareto_efficiency . It's good for keeping multiple good solutions with advantages in different objective, i.e., not losing diversity.