I am compiling Drake from source in order to use Gurobi.
I am able to add bilinear, non-convex quadratic costs, since those are supported as of Gurobi 9.0:
https://www.gurobi.com/wp-content/uploads/2019/12/Gurobi-9.0-Overview-Webinar-Slides-1.pdf
However, I cannot add bilinear constraints, even though those are supposed to be supported as well.
def cross_prod_2d(vec1, vec2):
    # Embed the 2D vectors in the x-z plane and return the y-component
    # of the 3D cross product.
    a, b, c = vec1[0], 0., vec1[1]
    d, e, f = vec2[0], 0., vec2[1]
    return -(a*f - c*d)

net_torque += cross_prod_2d(lever_arm, signed_contact_force)

net_torque_slack = prog.NewContinuousVariables(1)
prog.AddConstraint(np.asscalar(net_torque_slack) >= net_torque)
ValueError: GurobiSolver is unable to solve because a GenericConstraint was declared but is not supported.
Is the binding just not available? Can we make it available?
I am trying to solve the following optimization problem, obtaining a set of values x_1, x_2, ..., x_k:
argmin Σx_i * a_i
subject to <x_1, x_2, ..., x_k> ~ Lap(m, b)
The terms a_i are constants, and the values x_i are drawn from a Laplace distribution with mean m and scale parameter b, so the resulting outputs follow a Laplace distribution. What is this kind of constraint called?
This looks like a general form of Lasso regularization. Adding L1 regularization can often be seen as enforcing a Laplace prior on the variables (see the Bayesian interpretation of the Lasso). You could try solving:
argmin Σx_i * a_i + (1/b) Σ|x_i - m|
You could try solving this using (sub)-gradient methods or proximal minimization.
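For instance, here is a minimal proximal-gradient sketch for that surrogate problem; the values of a, m and b below are made up for illustration, and note that the objective is only bounded below when |a_i| <= 1/b for every i.

import numpy as np

# Illustrative values only: coefficients a_i, Laplace location m and scale b.
a = np.array([0.3, -0.2, 0.1])
m, b = 1.0, 2.0

def soft_threshold(z, tau):
    # proximal operator of tau * |.|
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

# Proximal-gradient iterations for  sum(a * x) + (1/b) * sum(|x - m|):
# the smooth part has gradient a, and the prox of the shifted L1 term is a
# soft-threshold around m.
x = np.zeros_like(a)
t = 0.5  # step size
for _ in range(200):
    x = m + soft_threshold(x - t * a - m, t / b)

print(x)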
I am new to Python. In one of my assignment questions, part of the task requires us to compute the average of the elements in each sub-matrix and replace each element with that mean, using only operations available in NumPy.
An example of the matrix could be
M = [[[1,2,3],[2,3,4]],[[3,4,5],[4,5,6]]]
Through some operations, it is expected to get a matrix like the following:
M = [[[2,2,2],[3,3,3]],[[4,4,4],[5,5,5]]]
I have looked at some of the NumPy documentation and still haven't figured it out; I would really appreciate it if someone could help.
You have a few different options here. All of them follow the same general idea: you have an MxNxL array and you want to apply a reduction operation along the last axis, which by default leaves you with an MxN result. However, you want to broadcast that result back across the same MxNxL shape you started with.
NumPy's reduction operations mostly have a parameter that lets you keep the reduced dimension present in the output array, which makes it easy to broadcast the result into a matrix of the correct size. The parameter is called keepdims; you can read more in the documentation for numpy.mean.
Here are a few approaches that all take advantage of this.
Setup
import numpy as np

M = np.array([[[1, 2, 3], [2, 3, 4]], [[3, 4, 5], [4, 5, 6]]])

avg = M.mean(-1, keepdims=True)
# array([[[2.],
# [3.]],
#
# [[4.],
# [5.]]])
Option 1
Assign to a view of the array. Note, however, that if M has an integer dtype this will coerce the float averages to int, so cast your array to float first if you want to keep the precision.
M[:] = avg
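For example, to keep the float precision when M is an integer array (as in the setup above):

M = M.astype(float)  # avoid truncating the float averages to int
M[:] = avg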
Option 2
An efficient read-only view using np.broadcast_to:
np.broadcast_to(avg, M.shape)
Option 3
Broadcasted multiplication, shown more for demonstration than anything else.
avg * np.ones(M.shape)
All of these will produce the following (identical except possibly for the dtype):
array([[[2., 2., 2.],
[3., 3., 3.]],
[[4., 4., 4.],
[5., 5., 5.]]])
In one line of code:
M.mean(-1, keepdims=True) * np.ones(M.shape)
Assuming I have a weight matrix that looks like [[a, b], [c, d]], is it possible in TensorFlow to fix the values of b and c to zero so that they don't change during optimization?
Some sample code:
import tensorflow as tf

A = tf.Variable([[1., 0.], [3., 0.]])
A1 = A[:, 0:1]  # just some slicing of your variable
A2 = A[:, 1:2]
A2_stop = tf.stop_gradient(tf.identity(A2))  # no gradient flows through this part
A = tf.concat((A1, A2_stop), axis=1)
Note that the tf.identity is needed to stop the gradient before A2.
There are three ways to do this; you can:
Break apart your weight matrix into multiple variables, and make only some of them trainable (see the sketch after this list).
Hack the gradient calculation to be zero for the constant elements.
Hack the gradient application to reset the values of the constant elements.
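For instance, here is a minimal sketch of the first approach (the shapes and values are only illustrative): keep the fixed entries in a non-trainable variable and rebuild the matrix from the pieces.

import tensorflow as tf

# Trainable entries a and d.
diag = tf.Variable([1.0, 4.0])
# Entries b and c, frozen at zero and excluded from training.
off_diag = tf.Variable([0.0, 0.0], trainable=False)

# Reassemble [[a, b], [c, d]]; only `diag` receives gradient updates.
# (With eager execution, rebuild A inside each training step.)
A = tf.stack([tf.stack([diag[0], off_diag[0]]),
              tf.stack([off_diag[1], diag[1]])])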
My calculation consists of putting many matrices in one big block matrix. Some of these matrices can be empty in certain cases. These empty matrices give unexpected results.
The problem comes down to this:
b
Out[117]: array([], dtype=int32)
X = A[b, :]
X
Out[118]: array([], shape=(0, 3), dtype=float64)
X is the empty matrix, and because of the surrounding code, the matrix it gets multiplied by is also empty.
Y = array([]).dot(X)
Y
Out[119]: array([ 0., 0., 0.])
I realise that the size of Y is correct according to linear algebra: (1x0).(0x3) = (1x3). But I was expecting an empty matrix as the result, since the inner dimension of the matrices is zero (not one).
I would rather not check whether these matrices are empty, because the code that puts the block matrix together would then have to be rewritten for every combination of possibly empty matrices.
Is there a solution to this problem? I was thinking of wrapping the dot function and only proceeding if the inner dimension is not zero, but I feel like there is a cleaner solution.
Edit:
I should clarify a bit more what I mean by saying that I would rather not check for zero dimensions. The equations that I put into the block matrix consist of hundreds of these dot products, and each dot product represents a component in an electrical network. X being empty means that there is no such component present in the network, but the [ 0., 0., 0.] result then adds an incorrect equation to the system. If I had to compose the final (block) matrix depending on which elements are present, it would mean thousands of lines of code, which I would rather avoid.
The bad news is that the shape of the result is both expected and correct.
The good news is that there is a nearly trivial check to see if a matrix is empty or not for all cases using the total number of elements in the result, provided by the size attribute:
b = ...
X = ...
Y = array([]).dot(X)
if Y.size:
    # You have a non-empty result
EDIT
You can use the same logic to filter your input vectors. Since you want to do calculations only for non-empty vectors, you may want to try something like:
if b.size and X.size:
    Y = b.dot(X)
    # Add Y to your block matrix, knowing that it is of the expected size
I'm solving a geometric constrained optimization problem. The variables in the optimization are the x-y components for a set of vectors. The objective function is quadratic in these variables.
However, I need to constrain the SUM of the magnitudes of a subset of the vectors.
Specifically, suppose this subset consists of {v1,v2,...,vn}
I need the solution to satisfy
||v1|| + ||v2|| + .... + ||vn|| < L
If it was just a single vector I could square both sides to get a quadratic constraint and frame the problem as a QCQP
v1.x * v1.x + v1.y * v1.y < L*L
However, I have multiple vectors. So is there any way to express the constraint in such a way that I could apply a technique more specific than general non-linear constrained optimization? Or given that my objective function can be minimized analytically, would it make sense to solve the problem by
Ignoring the constraint and obtaining an optimum value x* for the objective function
Projecting x* onto the constraint manifold numerically to get a final solution that satisfies the constraints?
I'm not sure what else is in your optimization problem, but the task of constraining your norms alone is nonlinear but convex, which allows efficient solving.
Using external libraries, you can prototype it like this; here cvxpy (Python) is used.
There are many similar libraries following the same ideas, such as cvxopt (Python), picos (Python), yalmip (MATLAB), and convex.jl (Julia). The official solver APIs are usually much more low-level, and there is more work to do. In between, there is also JuMP (Julia).
from cvxpy import *
L = 10.0
V = Variable(3,5) # 3 vectors
constraints = []
constraints.append(sum_entries(norm(V, axis=1)) <= L)
objective = Maximize(sum_entries(V))
prob = Problem(objective, constraints)
prob.solve()
print("status:", prob.status)
print("optimal value", prob.value)
print("optimal var", V.value)
print('constr: ', sum_entries(norm(V, axis=1).value))
Output:
status: optimal
optimal value 22.36067971461066
optimal var [[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]
[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]
[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]]
constr: sum_entries([[ 3.33333332]
[ 3.33333332]
[ 3.33333332]])
The above is automatically converted to SOCP form and can be solved by commercial solvers or by open-source solvers like ECOS and SCS.
This conversion also proves to us that the problem is convex (by construction)! The approach is called Disciplined Convex Programming.
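For instance, assuming ECOS is installed (it usually ships with cvxpy), you can request it explicitly:

prob.solve(solver=ECOS)  # or solver=SCS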
Depending on your choice of library/software, you may have to do this conversion manually. It should not be that hard once you introduce some helper variables to collect the norms of your vectors. Within Gurobi, you would just need to use the basic SOCP constraint (docs).
Remark: ||v1|| + ||v2|| + ... + ||vn|| < L is scary, as numerical optimization usually only handles <=; everything else needs trickery (epsilon values, ...).
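If you really do need strictness, a common sketch is to tighten the bound with a small, arbitrarily chosen margin, e.g. in the cvxpy model above:

eps = 1e-6  # arbitrary margin; pick it to match your tolerance
constraints.append(sum_entries(norm(V, axis=1)) <= L - eps)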
Edit:
Here is a pure Gurobi approach, which can give you some idea of how to achieve this with more low-level libraries that support similar functions to Gurobi's API (I'm thinking about Mosek and CPLEX without knowing their APIs much; I think Mosek is quite different).
from gurobipy import *
import numpy as np
L = 10.0
# Model
m = Model("test")
# Vars
v = np.empty((3, 5), dtype=object)
for i in range(3):
    for j in range(5):
        v[i, j] = m.addVar()  # by default >= 0; it's just an example

norm_v = np.empty(3, dtype=object)
for i in range(3):
    norm_v[i] = m.addVar()  # aux-vars to collect the norms

m.update()  # make vars usable for posting constraints

# Constraints
for i in range(3):
    m.addQConstr(np.dot(v[i, :], v[i, :]),
                 GRB.LESS_EQUAL, norm_v[i] * norm_v[i])  # this is the SOCP-constraint for our norm

m.addConstr(np.sum(norm_v) <= L)  # gurobi-devs would propose using quicksum

# Objective
m.setObjective(np.sum(v), GRB.MAXIMIZE)

# Solve
m.optimize()

def get_val(x):
    return x.X

get_val_func = np.vectorize(get_val)
print('optimal var: ', get_val_func(v))
Output:
Optimal objective 2.23606793e+01
optimal var: [[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]
[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]
[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]]