I am dealing with an optimization problem where I have to optimize model parameters to minimize errors in the model predictions (y_pred) w.r.t. observations (y_obs). My objective is to minimize Root Mean Square Error (RMSE) and maximize the correlation coefficient (CORR). I came up with following objective function:
minimize(f) = minimize(lambda*RMSE/CORR)
where lambda is some negative large value (e.g., -1e6) if CORR < 0
else lambda = 1
Did I define the objective function correctly, or can it be defined in a better way?
Try ^^
I think you need to put your equation inside the objective, either as a variable or as a function that you pass to the optimizer.
minimize = RMSE/CORR # if it's a variable
# for function need search ^^
study = optuna.create_study(directions=["minimize"],...)
func = lambda trial: objective(trial, X_enc, y_train_full)
study.optimize(func, n_trials=10)
For more information, see this Optuna tutorial.
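For reference, the objective from the question can be written out in a few lines of NumPy. This is only a sketch of the formula as stated (lambda * RMSE / CORR); the function name `rmse_corr_objective` and the `penalty` parameter are made up for illustration and are not part of any library. Note that the ratio is undefined when CORR is exactly 0, which the sketch does not handle.

```python
import numpy as np

def rmse_corr_objective(y_pred, y_obs, penalty=-1e6):
    """Hypothetical helper: lambda * RMSE / CORR as described in the question."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    # Root mean square error between predictions and observations
    rmse = np.sqrt(np.mean((y_pred - y_obs) ** 2))
    # Pearson correlation coefficient between predictions and observations
    corr = np.corrcoef(y_pred, y_obs)[0, 1]
    # lambda is a large negative number when CORR < 0, otherwise 1.
    # A negative CORR times a negative lambda yields a large positive value.
    lam = penalty if corr < 0 else 1.0
    return lam * rmse / corr

y_obs = [1.0, 2.0, 3.0, 4.0]
good = rmse_corr_objective([1.1, 2.1, 3.1, 4.1], y_obs)  # positively correlated
bad = rmse_corr_objective([4.0, 3.0, 2.0, 1.0], y_obs)   # negatively correlated
print(good, bad)
```

A function like this is what an optimizer's objective would return for each candidate parameter set. One possible alternative that avoids the division is a weighted sum such as RMSE + w * (1 - CORR), where the weight w is something you would have to choose.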
I am trying to use Pyomo for a single-objective nonlinear optimization problem.
The objective function is to minimize the variance (or standard deviation) of the input variables following certain constraints (which I was able to do in Excel).
Following is a code example of what I am trying to do
import pyomo.environ as pyo
model = pyo.ConcreteModel()
# declare decision variables
model.x1 = pyo.Var(domain=pyo.NonNegativeReals)
model.x2 = pyo.Var(domain=pyo.NonNegativeReals)
model.x3 = pyo.Var(domain=pyo.NonNegativeReals)
model.x4 = pyo.Var(domain=pyo.NonNegativeReals)
# declare objective
from statistics import stdev
model.variance = pyo.Objective(
    expr = stdev([model.x1, model.x2, model.x3, model.x4]),
    sense = pyo.minimize)
# declare constraints
model.max_charging = pyo.Constraint(expr = model.x1 + model.x2 + model.x3 + model.x4 >= 500)
model.max_x1 = pyo.Constraint(expr = model.x1 <= 300)
model.max_x2 = pyo.Constraint(expr = model.x2 <= 200)
model.max_x3 = pyo.Constraint(expr = model.x3 <= 100)
model.max_x4 = pyo.Constraint(expr = model.x4 <= 200)
# solve
pyo.SolverFactory('glpk').solve(model).write()
#print
print("energy_price = ", model.variance())
print(f'Variables = [{model.x1()},{model.x2()},{model.x3()},{model.x4()}]')
The error I get is TypeError: can't convert type 'ScalarVar' to numerator/denominator
The problem seems to be caused by using the stdev function from statistics.
My assumption is that the model's variables x1-x4 have not been assigned a value yet and that this is the main issue. However, I am not sure how to approach this.
First: stdev is nonlinear. So why even try to solve this with a linear solver?
Pyomo does not know about the statistics package. You'll have to program the standard deviation using elementary operations, use an external function approach, or use an approximation (like minimizing the range).
So I managed to solve this issue and I am including the solution below. But first, there are a couple of points I would like to highlight
As @Erwin Kalvelagen mentioned, 'stdev' is nonlinear so a linear solver like 'glpk' would always result in an error. For my problem, 'ipopt' worked fine but be careful as it can perform poorly in some cases.
Also, as @Erwin Kalvelagen mentioned, Pyomo does not know about the statistics package. So when you try to use a function from that package (e.g., 'stdev', 'variance', etc.), it will try to evaluate the model variables before the solver assigns them any value and that will cause an error.
A pyomo objective function needs an expression. The code sample below shows how to dynamically generate an expression for the variance without using an external package. The code is also agnostic to the number of model variables you have.
Using either the variance or the standard deviation will serve the same purpose for my project. I opted for using the variance to avoid calculating its square root (as the standard deviation is the square root of the variance).
Variability Objective Function
import pyomo.environ as pyo
def variabilityRule(model):
    n = len(model.x)
    # Calculate the mean of all model variables
    mean = sum(model.x[index] for index in model.x) / n
    # Sum the squared differences between each model variable and the mean
    # (iterating over model.x yields its indices, which start from 1 for a VarList)
    obj_exp = sum((model.x[i] - mean) * (model.x[i] - mean) for i in model.x)
    # Divide by the number of items
    obj_exp = obj_exp / n
    return obj_exp
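model = pyo.ConcreteModel()
# model.x must be defined and populated before the objective is attached (see the P.S. below)
model.objective = pyo.Objective(
    rule = variabilityRule,
    sense = pyo.minimize)  # minimize the variance, as stated in the question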
EDIT: Standard Deviation Calculation
As I found out, you can use the same approach to calculate the standard deviation. Just raise the final expression (`obj_exp`) to the power 0.5
obj_exp = obj_exp ** 0.5
...
P.S. If you are interested, this is how I dynamically generated my model variables based on an input array
model.x = pyo.VarList(domain=pyo.NonNegativeReals)
for i in range(0, len(input_array)):
    model.x.add()
I have used CVXPY and some of its LP solvers to determine whether a solution to an A*x <= b problem is feasible, and now I would like to try PySCIPOpt. I could not find an example of this in the docs, and I'm having trouble figuring out the right syntax. With CVXPY the code is simply:
import cvxpy
def do_cvxpy(A, b, solver):
    x = cvxpy.Variable(A.shape[1])
    constraints = [A @ x <= b] # The @ denotes matrix multiplication in CVXPY
    obj = cvxpy.Minimize(0)
    prob = cvxpy.Problem(obj, constraints)
    prob.solve(solver=solver)
    return prob.status
I think with PySCIPOpt one cannot use matrix notation as above, but must treat vectors and matrices as collections of scalar variables, each of which has to be added individually, so I tried this:
def do_scip(A, b):
    model = Model("XYZ")
    x = {}
    for i in range(A.shape[1]):
        x[i] = model.addVar(vtype="C", name="x(%s)" % i)
    model.setObjective(0) #Is this right for a feasibility-only problem?
    model.addCons(A*x <= b) #This is certainly the wrong syntax
    model.optimize()
    return model.getStatus()
Could anyone please help me out with the correct form for the constraint in addCons() for this kind of problem, and confirm that an acceptable way to ask whether a solution is feasible is to simply pass 0 as the objective?
I'm still not positive about the setObjective(0), but at least I can get the code to run without errors by "unpacking" the A matrix and the b vector and adding one constraint per row of A, so that row j reads sum_i A[j,i]*x[i] <= b[j] (quicksum is imported from pyscipopt):
for j in range(nrows):
    model.addCons(quicksum(A[j, i] * x[i] for i in range(ncols)) <= b[j])
I also discovered that CVXPY actually has an interface to SCIP, but it gives me an error when I try to use it:
getSolObjVal cannot only be called in stage SOLVING without a valid solution
which seems to suggest that the interface cannot be used for feasibility-only problems.
I am trying to solve the optimization problem for the two-node system. Both nodes have the same number of variables and constraints but different data. I want to solve the optimization problem for both the nodes in parallel using pyomo and multiprocessing. I have explained the code below to show how I am trying to implement it. Is it the right way to do it?
My Python Code is something like this:
from multiprocessing import Pool
class opt:
    def __init__(self):
        # I have initialized all variables and data.
        pass
    def createmodel(self):
        # I have declared all Pyomo model parameters and Variables.
        pass
    def parallelizemodel(self):
        with Pool(processes=1) as pool:
            s0 = pool.apply_async(self.node1opt, ())
            s1 = pool.apply_async(self.node2opt, ())
    def node1opt(self):
        # I have defined constraints and objective function for pyomo model.
        pass
    def node2opt(self):
        # I have defined constraints and objective function for pyomo model.
        pass
This code does not run properly. It does not give me any errors, but it does not solve the optimization problem either.
I am looking to minimize a nonlinear function of 3 arguments (x1, x2 and x3).
My sources of information are:
the explanation of the minimization function:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
And an example they provide:
https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html
I do not belong to a mathematical area, so first off forgive me if I am using incorrect wording / expressions.
This is my code :
import numpy as np
from scipy.optimize import minimize
def rosen(x1,x2,x3):
    return np.sqrt(((x1**2)*0.002)+((x2**2)*0.0035)+((x3**2)*0.0015)+(2*x1*x2*0.015)+(2*x1*x3*0.01)+(2*x2*x3*0.02))
I think that the first step is okay up to here.
Then it is required to state the:
x0 : ndarray
Initial guess. len(x0) is the dimensionality of the minimization problem.
Given that I am passing 3 arguments to the function being minimized, should I provide a 3-dimensional array, like this?
x0=np.array([1,1,1])
res = minimize(rosen, x0)
print(res.x)
The undesired output is:
rosen() missing 2 required positional arguments: 'x2' and 'x3'
I do not really understand where I should state the positional arguments.
Apart from that, I would like to set some bounds for the output values of x1, x2, x3.
Which I tried
res = minimize(rosen, x0, bounds=([0,None]),options={"disp": False})
which also fails, with:
ValueError: length of x0 != length of bounds
How should I express the bounds in the call to minimize, then?
The desired output is simply an array of x1, x2, x3 that minimizes the function, where each value is at least 0, as stated in the bounds, and the values sum to 1.
Function-definition
Read the docs carefully, e.g. for your function-def:
fun : callable
The objective function to be minimized. Must be in the form f(x, *args). The
optimizing argument, x, is a 1-D array of points, and args is a tuple of any
additional fixed parameters needed to completely specify the function.
Your function should take a 1d-array, while you implement the multi-argument for multi-variables approach!
Changing:
def rosen(x1,x2,x3):
    return np.sqrt(((x1**2)*0.002)+((x2**2)*0.0035)+((x3**2)*0.0015)+(2*x1*x2*0.015)+(2*x1*x3*0.01)+(2*x2*x3*0.02))
to:
def rosen(x):
    x1,x2,x3 = x # unpack vector for your kind of calculations
    return np.sqrt(((x1**2)*0.002)+((x2**2)*0.0035)+((x3**2)*0.0015)+(2*x1*x2*0.015)+(2*x1*x3*0.01)+(2*x2*x3*0.02))
should work. This is a bit of a repair-just-enough-to-keep-my-other-code approach, but it won't hurt much in this example. Usually you write your function definition on the 1d-array-input assumption!
Bounds
Again from the docs:
bounds : sequence, optional
Bounds for variables (only for L-BFGS-B, TNC and SLSQP). (min, max) pairs for each
element in x, defining the bounds on that parameter. Use None for one of min or max
when there is no bound in that direction.
So you need n_vars pairs! Easily achieved by using a list-comprehension, deducing the necessary info from x0.
res = minimize(rosen, x0, bounds=[[0,None] for i in range(len(x0))],options={"disp": False})
Make variables sum up to 1 / Constraints
Your comment implies you want the variables to sum up to 1. You would need to use an equality-constraint then (only 1 solver supporting this and inequality-constraints; one other only inequality-constraints; the rest no constraints; solver will be picked automatically if none explicitly given).
It looks somewhat like:
cons = ({'type': 'eq', 'fun': lambda x: sum(x) - 1}) # read docs to understand!
# to think about:
# sum vs. np.sum
# (not much diff here)
res = minimize(rosen, x0, bounds=[[0,None] for i in range(len(x0))],options={"disp": False}, constraints=cons)
For the case of x nonnegative, the constraint is usually called the probability-simplex.
(untested code; conceptually correct!)
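For convenience, here are the three pieces above (1-D array argument, one bounds pair per variable, equality constraint) combined into one script. It is a sketch rather than a definitive setup: `method="SLSQP"` is named explicitly, and the starting point is my own choice of a point that already satisfies the constraints, not the x0 from the question.

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    x1, x2, x3 = x  # unpack the 1-D array passed in by minimize
    return np.sqrt(((x1**2)*0.002) + ((x2**2)*0.0035) + ((x3**2)*0.0015)
                   + (2*x1*x2*0.015) + (2*x1*x3*0.01) + (2*x2*x3*0.02))

# Assumed starting point: equal weights, already non-negative and summing to 1
x0 = np.full(3, 1.0 / 3.0)

# One (min, max) pair per variable; None means no bound in that direction
bounds = [(0, None) for _ in range(len(x0))]

# Equality constraint: the function must return 0 when the constraint holds
cons = {"type": "eq", "fun": lambda x: np.sum(x) - 1.0}

res = minimize(rosen, x0, method="SLSQP", bounds=bounds, constraints=cons)
print(res.x, res.fun)
```

SLSQP is a local method, so the point it returns may depend on the starting point; trying a few different x0 values is a cheap sanity check.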
Let z be a complex variable and C(z) its conjugate.
In complex analysis, the derivative of C(z) w.r.t. z does not exist. But in TensorFlow, we can calculate dC(z)/dz and the result is just 1.
Here is an example:
import numpy as np
import tensorflow as tf
x = tf.placeholder('complex64',(2,2))
y = tf.reduce_sum(tf.conj(x))
z = tf.gradients(y,x)
sess = tf.Session()
X = np.random.rand(2,2)+1.j*np.random.rand(2,2)
X = X.astype('complex64')
Z = sess.run(z,{x:X})[0]
The input X is
[[0.17014372+0.71475762j 0.57455420+0.00144318j]
[0.57871044+0.61303568j 0.48074263+0.7623235j ]]
and the result Z is
[[1.-0.j 1.-0.j]
[1.-0.j 1.-0.j]]
I don't understand why the gradient is set to be 1?
And I want to know how tensorflow handles the complex gradients in general.
How?
The equation used by TensorFlow for the gradient is:
$$\nabla f = \left(\frac{\partial f}{\partial z} + \frac{\partial f^*}{\partial z}\right)^*$$
Where the '*' means conjugate.
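The partial derivatives wrt z and z* are defined using Wirtinger calculus. Wirtinger calculus makes it possible to calculate the derivative wrt a complex variable for non-holomorphic functions. With z = x + iy, the Wirtinger definition is:
$$\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right), \qquad \frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right)$$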
Why this definition?
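When using, for example, Complex-Valued Neural Networks (CVNN), the gradients are taken of a non-holomorphic, real-valued scalar function of one or several complex variables. For such a real-valued f, the TensorFlow definition of the gradient can be written as:
$$\nabla f = 2\,\frac{\partial f}{\partial z^*} = \frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}$$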
This definition agrees with the CVNN literature, for example chapter 4 section 4.3 of this book or Amin et al. (among countless examples).
Bit late, but I came across this issue recently too.
The key point is that TensorFlow defines the "gradient" of a complex-valued function f(z) of a complex variable as "the gradient of the real map F: (x,y) -> Re(f(x+iy)), expressed as a complex number" (the gradient of that real map is a vector in R^2, so we can express it as a complex number in the obvious way).
Presumably the reason for that definition is that in TF one is usually concerned with gradients for the purpose of running gradient descent on a loss function, and in particular for identifying the direction of maximum increase/decrease of that loss function. Using the above definition of gradient means that a complex-valued function of complex variables can be used as a loss function in a standard gradient descent algorithm, and the result will be that the real part of the function gets minimised (which seems to me a somewhat reasonable interpretation of "optimise this complex-valued function").
Now, to your question, an equivalent way to write that definition of gradient is
gradient(f) := dF/dx + idF/dy = conj(df/dz + dconj(f)/dz)
(you can easily verify that using the definition of d/dz). That's how TensorFlow handles complex gradients. As for the case of f(z):=conj(z), we have df/dz=0 (as you mention) and dconj(f)/dz=1, giving gradient(f)=1.
I wrote up a longer explanation here, if you're interested: https://github.com/tensorflow/tensorflow/issues/3348#issuecomment-512101921
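The definition gradient(f) := dF/dx + i dF/dy with F(x, y) = Re(f(x + iy)) can be checked numerically without TensorFlow. The sketch below uses central finite differences in NumPy; the helper name `tf_style_gradient`, the step size, and the test point are all my own choices for illustration.

```python
import numpy as np

def tf_style_gradient(f, z, h=1e-6):
    """Hypothetical helper: dF/dx + i*dF/dy for F(x, y) = Re(f(x + iy))."""
    def F(x, y):
        return np.real(f(complex(x, y)))
    x, y = z.real, z.imag
    # Central finite differences in the real and imaginary directions
    dF_dx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    dF_dy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return complex(dF_dx, dF_dy)

z0 = 0.17 + 0.71j

# f(z) = conj(z): F = x, so the gradient should be 1
g_conj = tf_style_gradient(np.conj, z0)

# f(z) = z**2: F = x**2 - y**2, so the gradient should be conj(2 * z0)
g_sq = tf_style_gradient(lambda z: z * z, z0)

print(g_conj, g_sq, np.conj(2 * z0))
```

Both results match conj(df/dz + dconj(f)/dz): for conj(z) that is conj(0 + 1) = 1, and for z**2 it is conj(2z + 0).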