Can't minimize function - optimization

I just want to minimize a simple function, but none of the examples I've looked at got me anywhere.
import math
import numpy as np
import sympy as sp
from scipy.optimize import minimize
import scipy.optimize as optimize
R=1.5
k_1=2
a=1
n=a
alpha=0.25
beta=0.5
delta=0.9
def f_gob(x, y, z):
    c_1=((1/x-y/x)+R*k_1)/(1+delta*(1+alpha))
    c_2=delta*x*(((1/x-y/x)+R*k_1)/(1+delta*(1+alpha)))
    l=n-(alpha*(delta*x*(((1/x-y/x)+R*k_1)/((1+delta*(1+alpha))))))/(1-y)
    return -1*(math.log(c_1)+delta*(math.log(c_2)+alpha*math.log(n-l)+beta*math.log(z)))
f_gob(0.9996,0.332,0.7765)
x0 = [0.8,0.2,0.6]
res = minimize(f_gob, x0)
Thank you very much.

Better is:
def f_gob(a):
    x = a[0]
    y = a[1]
    z = a[2]
    c_1= ((1/x-y/x)+R*k_1)/(1+delta*(1+alpha))
    c_2=delta*x*c_1
    l=n-(alpha*c_2)/(1-y)
    return -1*(math.log(c_1)+delta*(math.log(c_2)+alpha*math.log(n-l)+beta*math.log(z)))
f_gob([0.9996,0.332,0.7765])
The main issue is that the current levels of the three decision variables x,y,z are passed on as a single array, which I call a. I just unpack the individual members to keep things close to what you had. Passing things on as an array makes sense, especially if you want to allow for large numbers of variables (say hundreds).
For further information see the documentation: the third sentence explains the format of the function to be called. Also check the examples.
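To tie the array-style signature back to the original minimize call, here is a minimal sketch. The bounds and the choice of method="L-BFGS-B" are assumptions added for illustration, not part of the original post: the objective contains a -log(z) term that keeps decreasing as z grows, so some upper limit on z is needed for a finite minimum, and the logarithms require their arguments to stay positive.

```python
import numpy as np
from scipy.optimize import minimize

R, k_1, n = 1.5, 2, 1
alpha, beta, delta = 0.25, 0.5, 0.9

def f_gob(a):
    # minimize passes all decision variables as one array
    x, y, z = a
    c_1 = ((1 / x - y / x) + R * k_1) / (1 + delta * (1 + alpha))
    c_2 = delta * x * c_1
    l = n - (alpha * c_2) / (1 - y)
    return -(np.log(c_1) + delta * (np.log(c_2) + alpha * np.log(n - l) + beta * np.log(z)))

x0 = [0.8, 0.2, 0.6]

# Hypothetical bounds (assumption): keep x > 0, y < 1 and z > 0 so every log is defined
bounds = [(0.1, 1.0), (0.0, 0.9), (0.1, 1.0)]

res = minimize(f_gob, x0, method="L-BFGS-B", bounds=bounds)
print(res.x, res.fun)
```

With different economic restrictions on x, y and z the bounds would change accordingly; the point is only that the function takes a single array and that the search region keeps the logs well defined.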

Related

Equation constraints with scipy least_squares

I'm trying to use least squares to minimize a loss function by changing x, y, z. My problem is nonlinear, which is why I chose scipy least_squares. The general structure is:
from scipy.optimize import least_squares
def loss_func(x, *arguments):
    # plug x's and args into an arbitrary equation and return loss
    return loss # loss here is an array
# x_arr contains x,y,z
res = least_squares(loss_func, x_arr, args=arguments)
I am trying to constrain x, y, z by x - y = some value and z - y = some value. How do I go about doing so? The scipy least_squares documentation only provides bounds. I understand I can create bounds like 0 < x < 5. However, my constraint is an equation, not a constant bound. Thank you in advance!
If anyone ever stumbles on this question, I've figured out how to overcome this issue. Since least_squares does not support constraints, it is best to use a general-purpose solver such as scipy.optimize.minimize instead. Since loss_func returns an array of residuals, we can reduce it to a scalar with the L1 norm (as we want to minimize the absolute differences in this array of residuals).
from scipy.optimize import minimize
import numpy as np
def loss_func(x, *arguments):
    # plug x's and args into an arbitrary equation and return loss (loss is an array)
    return np.linalg.norm(loss, 1)
The constraints can then be passed to scipy.optimize.minimize fairly easily through its constraints argument :)
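As a rough illustration of how the equality constraints could be expressed, here is a sketch using method="SLSQP". The residual function, the target values and the constants 1.0 and 2.0 are made up for the example (the original post does not give them), and the sketch minimizes the sum of squared residuals rather than the L1 norm because SLSQP works best with smooth objectives.

```python
import numpy as np
from scipy.optimize import minimize

def residuals(p, target):
    # Hypothetical nonlinear model (assumption): three residuals in x, y, z
    x, y, z = p
    return np.array([x**2 + y - target[0], y * z - target[1], x + z - target[2]])

def objective(p, target):
    # Scalar objective: sum of squared residuals
    r = residuals(p, target)
    return np.sum(r**2)

# Equality constraints written as fun(p) == 0
constraints = [
    {"type": "eq", "fun": lambda p: p[0] - p[1] - 1.0},  # x - y = 1 (assumed value)
    {"type": "eq", "fun": lambda p: p[2] - p[1] - 2.0},  # z - y = 2 (assumed value)
]

target = np.array([4.0, 3.0, 5.0])
res = minimize(objective, x0=[1.0, 0.0, 2.0], args=(target,),
               method="SLSQP", constraints=constraints)
print(res.x)
```

Because both constraints here are simple linear equations, another option is to substitute x = y + 1 and z = y + 2 into the residuals and keep using least_squares on y alone.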

Black box optimization with Scikit Optimize

I have to optimize a black-box problem that depends on external software (no function definition and no derivatives) and is quite expensive to evaluate. It depends on several variables, some of them real and others integer.
I think Scikit Optimize may be a good choice.
I was wondering whether the following example (from the Scikit Optimize documentation) can be adapted to my actual problem, where "f" is an external function that returns the cost of a given set of parameters. Here it is a dummy function, just to be reproducible. Instead of depending only on "x", I want it to depend also on "y" and "z", with one of them restricted to integer values.
I have seen some other examples of Scikit Optimize oriented to hyperparameter optimization (based on Scikit Learn), but they seem less clear to me.
Here is the minimal reproducible example (which crashes):
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer
from skopt.space import Real
np.random.seed(123)
def f(x,y,z):
    return (np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) *np.random.randn() * 0.1-y[0]**2+z[0]**2)
search_space = list()
search_space.append(Real(-2, 2, name='x'))
search_space.append(Integer(-2, 2, name='y'))
search_space.append(Real(0, 2, name='z'))
res = gp_minimize(f, search_space, n_calls=20)
print("x*=%.2f, y*=%.2f, f(x*,y*)=%.2f" % (res.x[0],res.y[0],res.z[0], res.fun))
Best regards and thank you
You can use the decorator function use_named_args from scikit-optimize to pass your search space with names to your cost function:
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer
from skopt.space import Real
from skopt.utils import use_named_args
np.random.seed(123)
search_space = [
    Real(-2, 2, name='x'),
    Integer(-2, 2, name='y'),
    Real(0, 2, name='z')
]
@use_named_args(search_space)
def f(x, y, z):
    return (np.sin(5 * x) * (1 - np.tanh(x ** 2)) *np.random.randn() * 0.1-y**2+z**2)
res = gp_minimize(f, search_space, n_calls=20)
Note that your OptimizeResult res stores the optimized parameters in the attribute x, which is a list of the best values. That is why your code crashes (i.e. there are no attributes y and z in res). You could get a dictionary mapping names to optimized values as follows:
optimized_params = {p.name: res.x[i] for i, p in enumerate(search_space)}

How can I make scipy's odeint faster?

I am currently solving a system of 559 nonlinear differential equations. I have to fit the solutions obtained to some experimental data by varying the constants c1, c2, b and g.
I am using scipy.integrate.odeint and I would like to know if there is a way to make my program faster, as it takes ages to run.
The code is this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
import random as rd
from numba import jit
L=np.loadtxt('C:/Users/Pablo/Desktop/TFG/Probas/matriz_L_Pablo.txt')
I=np.loadtxt('C:/Users/Pablo/Desktop/TFG/Probas/vector_I_Pablo.txt')
k=np.diag(L)
n=len(k) # count the number of nodes
u=np.zeros(n)
for i in range(n):
    u[i]=rd.random()
M=np.zeros((n,n))
derivs=np.zeros(n)
c1=100 ; c2=10000 ; b=0.01 ; g=1
@jit
def f(y,t,params):
    suma=0
    c1,c2,b,g=params
    for i in range(n):
        for j in range(n):
            if i==j:
                M[i,i]=(1-y[i]/b)+g*(1-y[i])+c2*I[i]*(1/n-1)
            if i!=j:
                M[i,j]=(1/n)*(c1*L[i,j]+c2*I[i])
            out=(M[i,j]*y[j])
            suma=suma+out
        derivs[i]=suma
        suma=0
    return derivs
# initial conditions
y0=u
# list of parameters
params=[c1,c2,b,g]
# integration times
tf=1
deltat=0.001
t=np.arange(0,tf,deltat)
# solution
sol= odeint(f, y0,t, args=(params,))
(Sorry if it is not very clear it's my first time here)
You can try vectorizing your code. The function f does 2 things - first it creates the matrix M, and then does the multiplication $$My$$. The multiplication $$My$$ is easy to vectorize because all we have to do is use numpy's matmul function.
def f(y,t,params):
    suma=0
    c1,c2,b,g=params
    for i in range(n):
        for j in range(n):
            if i==j:
                M[i,i]=(1-y[i]/b)+g*(1-y[i])+c2*I[i]*(1/n-1)
            if i!=j:
                M[i,j]=(1/n)*(c1*L[i,j]+c2*I[i])
    return np.matmul(M,y)
That should help with runtime a bit. But the most time consuming part is the fact that the entire matrix M is formed every time f is called, and that it is formed one element at a time.
The only parts of M that need to be modified when calling f are the parts which depend on y. So all of the off-diagonal entries in M can be filled before the ode solver is called. With 559 equations, M is 559x559, so instead of having to calculate all ~312,000 elements of M every time f is called, you would only have to calculate the 559 elements on the diagonal. The remaining off-diagonal entries of M don't depend on y (only on c1, c2, L and I, which stay fixed during one call to the solver) and can be specified before calling the ode solver. Making this modification should result in a huge speedup, as this seems to be the main bottleneck in your code.
Lastly, you could also vectorize how the diagonal of M is filled by using something like numpy.diag_indices.
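Putting these suggestions together, a vectorized right-hand side might look like the sketch below. The random L and I and the small n are stand-ins (assumptions) for the data files in the question, and make_rhs and rhs_loop are hypothetical helper names; the loop version is included only to check that both give the same derivatives.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                     # small stand-in for the 559 nodes (assumption)
L = rng.random((n, n))     # stand-in for matriz_L_Pablo.txt
I = rng.random(n)          # stand-in for vector_I_Pablo.txt
params = (100.0, 10000.0, 0.01, 1.0)  # c1, c2, b, g

def make_rhs(L, I, params):
    c1, c2, b, g = params
    n = len(I)
    # Off-diagonal entries do not depend on y: build them once
    M_off = (c1 * L + c2 * I[:, None]) / n
    np.fill_diagonal(M_off, 0.0)
    # y-independent part of the diagonal
    diag_const = c2 * I * (1.0 / n - 1.0)

    def rhs(y, t):
        # y-dependent diagonal, computed for all i at once
        diag = (1.0 - y / b) + g * (1.0 - y) + diag_const
        return M_off @ y + diag * y

    return rhs

def rhs_loop(y, t, L, I, params):
    # Reference implementation mirroring the original double loop
    c1, c2, b, g = params
    n = len(I)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                M[i, i] = (1 - y[i] / b) + g * (1 - y[i]) + c2 * I[i] * (1 / n - 1)
            else:
                M[i, j] = (1 / n) * (c1 * L[i, j] + c2 * I[i])
    return M @ y

rhs = make_rhs(L, I, params)
y0 = rng.random(n)
same = np.allclose(rhs(y0, 0.0), rhs_loop(y0, 0.0, L, I, params))
print(same)
```

The resulting rhs can be passed to odeint as odeint(rhs, y0, t). When fitting c1, c2, b and g, make_rhs would be called once per parameter set, so the off-diagonal part is rebuilt once per fit iteration rather than once per solver step.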

Numpy - AttributeError: 'Zero' object has no attribute 'exp'

I'm having trouble solving a discrepancy between something breaking at runtime, but using the exact same data and operations in the python console, having it work fine.
# f_err - currently has value 1.11819388872025
# l_scales - currently a numpy array [1.17840183376334 1.13456764589809]
sq_euc_dists = self.se_term(x1, x2, l_scales) # this is fine. It calls cdists on x1/l_scales, x2/l_scales vectors
return (f_err**2) * np.exp(-0.5 * sq_euc_dists) # <-- errors on this line
The error that I get is
AttributeError: 'Zero' object has no attribute 'exp'
However, calling those exact same lines, with the same f_err, l_scales, and x1, x2 in the console right after it errors out, somehow does not produce errors.
I was not able to find a post referring to the 'Zero' object error specifically, and the non-'Zero' ones I found didn't seem to apply to my case here.
EDIT: It was a bit lacking in info, so here's an actual (extracted) runnable example with sample data I took straight out of a failed run, which when run in isolation works fine/I can't reproduce the error except in runtime.
Note that the sqeucld_dist function below is quite bad and I should be using scipy's cdist instead. However, because I'm using sympy's symbols for matrix elementwise gradients with over 15 partial derivatives in my real data, cdist is not an option as it doesn't deal with arbitrary objects.
import numpy as np
def se_term(x1, x2, l):
    return sqeucl_dist(x1/l, x2/l)
def sqeucl_dist(x, xs):
    return np.sum([(i-j)**2 for i in x for j in xs], axis=1).reshape(x.shape[0], xs.shape[0])
x = np.array([[-0.29932052, 0.40997373], [0.40203481, 2.19895326], [-0.37679417, -1.11028267], [-2.53012051, 1.09819485], [0.59390005, 0.9735], [0.78276777, -1.18787904], [-0.9300892, 1.18802775], [0.44852545, -1.57954101], [1.33285028, -0.58594779], [0.7401607, 2.69842268], [-2.04258086, 0.43581565], [0.17353396, -1.34430191], [0.97214259, -1.29342284], [-0.11103534, -0.15112815], [0.41541759, -1.51803154], [-0.59852383, 0.78442389], [2.01323359, -0.85283772], [-0.14074266, -0.63457529], [-0.49504797, -1.06690869], [-0.18028754, -0.70835799], [-1.3794126, 0.20592016], [-0.49685373, -1.46109525], [-1.41276934, -0.66472598], [-1.44173868, 0.42678815], [0.64623684, 1.19927771], [-0.5945761, -0.10417961]])
f_err = 1.11466725760716
l = [1.18388412685279, 1.02290811104357]
result = (f_err**2) * np.exp(-0.5 * se_term(x, x, l)) # This runs fine, but fails with the exact same calls and data during runtime
Any help greatly appreciated!
Here is how to reproduce the error you are seeing:
import sympy
import numpy
zero = sympy.sympify('0')
numpy.exp(zero)
You will see the same exception you are seeing.
You can fix this (inefficiently) by changing your code to the following to make things floating point.
def sqeucl_dist(x, xs):
    return np.sum([np.vectorize(float)(i-j)**2 for i in x for j in xs],
                  axis=1).reshape(x.shape[0], xs.shape[0])
A better approach is to fix your gradient function using lambdify.
Here's an example of how lambdify can be used on partial derivatives:
from sympy.abc import x, y, z
expression = x**2 + sympy.sin(y) + z
derivatives = [expression.diff(var, 1) for var in [x, y, z]]
derivatives is now [2*x, cos(y), 1], a list of Sympy expressions. To create a function which will evaluate this numerically at a particular set of values, we use lambdify as follows (passing 'numpy' as an argument like that means to use numpy.cos rather than sympy.cos):
derivative_calc = sympy.lambdify((x, y, z), derivatives, 'numpy')
Now derivative_calc(1, 2, 3) will return [2, -0.41614683654714241, 1]. These are ints and numpy.float64s.
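To connect this back to the squared-distance computation in the question, the sketch below shows the two routes side by side: converting an object array of SymPy numbers to floats before calling np.exp, and using lambdify so the values are numeric from the start. The tiny expression and sample points are made up for illustration.

```python
import numpy as np
import sympy

x, y = sympy.symbols("x y")
sq_diff = (x - y) ** 2

# An object array holding SymPy numbers (here Zero and an Integer),
# similar to what symbolic distance code can produce
sym_array = np.array(
    [sq_diff.subs({x: 1, y: 1}), sq_diff.subs({x: 2, y: 0})], dtype=object
)

# Route 1: cast to float first, then NumPy ufuncs work as usual
as_float = np.exp(-0.5 * sym_array.astype(float))

# Route 2: lambdify the expression so it returns plain NumPy values
sq_diff_num = sympy.lambdify((x, y), sq_diff, "numpy")
xs = np.array([1.0, 2.0])
ys = np.array([1.0, 0.0])
via_lambdify = np.exp(-0.5 * sq_diff_num(xs, ys))

print(as_float, via_lambdify)
```

Both routes give the same numbers; lambdify avoids creating SymPy objects inside the numerical loop in the first place, which is usually the faster option.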
A side note: np.exp(M) calculates the element-wise exponential of each of the elements of M. If you are trying to compute a matrix exponential, you need scipy.linalg.expm.

How to simulate random returns with numpy

What is a quick way to simulate random returns? I'm aware of numpy.random. However, that doesn't guide me towards how to model asset returns.
I've tried:
import numpy as np
r = np.random.rand(100)
But this doesn't feel accurate. How are others doing this?
I'd suggest one of two approaches:
One: Assume returns are normally distributed with mean equal to 0.1% and standard deviation of about 1%. This looks like:
import numpy as np
np.random.seed(314)
r = np.random.randn(100) / 100 + 0.001
seed(314) sets the random number generator at a specific point so that if we both use the same seed, we should see the same results.
randn pulls from the normal distribution.
I'd also recommend using pandas. It's a library that implements a DataFrame object similar to R's data frames:
import pandas as pd
df = pd.DataFrame(r)
You can then plot the cumulative returns like this:
df.add(1).cumprod().plot()
Two:
The second way is to assume returns are log-normally distributed. More precisely, this means log(1 + r) is normal. In this scenario, we pull normally distributed random numbers, use those values as the exponent of e, and subtract 1. It looks like this:
r = np.exp(np.random.randn(100) / 100 + 0.001) - 1
You can plot the cumulative returns the same way:
pd.DataFrame(r).add(1).cumprod().plot()
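For comparison, both approaches can be written with NumPy's newer Generator interface, as in the sketch below. The variable names and the use of np.random.default_rng are choices made for this illustration; the mean and standard deviation are the same 0.1% and 1% assumed above.

```python
import numpy as np

rng = np.random.default_rng(314)
mu, sigma, n_periods = 0.001, 0.01, 100

# Approach one: simple returns drawn directly from a normal distribution
normal_returns = rng.normal(mu, sigma, n_periods)

# Approach two: log returns are normal, so simple returns are exp(log return) - 1
lognormal_returns = np.exp(rng.normal(mu, sigma, n_periods)) - 1

# Growth of 1 unit invested, the NumPy equivalent of add(1).cumprod()
growth_normal = np.cumprod(1 + normal_returns)
growth_lognormal = np.cumprod(1 + lognormal_returns)

print(growth_normal[-1], growth_lognormal[-1])
```

One practical difference: returns from the second approach can never fall below -100%, whereas a normal draw can in principle, although with a 1% standard deviation that is vanishingly unlikely.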