Formulating a constraint on the SUM of the magnitudes of a set of vectors in a constrained optimization

I'm solving a geometric constrained optimization problem. The variables in the optimization are the x-y components for a set of vectors. The objective function is quadratic in these variables.
However, I need to constrain the SUM of the magnitudes of a subset of the vectors.
Specifically, suppose this subset consists of {v1, v2, ..., vn}.
I need the solution to satisfy
||v1|| + ||v2|| + .... + ||vn|| < L
If it were just a single vector, I could square both sides to get a quadratic constraint and frame the problem as a QCQP:
v1.x * v1.x + v1.y * v1.y < L*L
However, I have multiple vectors. So is there any way to express the constraint such that I could apply a technique more specific than general nonlinear constrained optimization? Or, given that my objective function can be minimized analytically, would it make sense to solve the problem by:
1. Ignoring the constraint and obtaining an optimum value x* for the objective function
2. Projecting x* onto the constraint manifold numerically to get a final solution that satisfies the constraints?

I'm not sure what else is in your optimization problem, but the task of constraining the sum of norms alone is nonlinear yet convex, which allows efficient solving.
Using external libraries, you can prototype it like this. Here cvxpy (Python) is used.
There are many similar libraries following the same ideas, like cvxopt (Python), picos (Python), yalmip (MATLAB), and Convex.jl (Julia). The official solver APIs are usually much more low-level and there is more work to do. In between, there is also JuMP (Julia).
from cvxpy import *  # legacy cvxpy (< 1.0) API; a modern equivalent is sketched below

L = 10.0
V = Variable(3, 5)  # 3 vectors of dimension 5, stored as rows
constraints = []
constraints.append(sum_entries(norm(V, axis=1)) <= L)  # sum of row norms <= L
objective = Maximize(sum_entries(V))  # dummy objective for the example
prob = Problem(objective, constraints)
prob.solve()
print("status:", prob.status)
print("optimal value", prob.value)
print("optimal var", V.value)
print('constr: ', sum_entries(norm(V, axis=1)).value)  # .value on the whole expression
Output:
status: optimal
optimal value 22.36067971461066
optimal var [[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]
[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]
[ 1.49071198 1.49071198 1.49071198 1.49071198 1.49071198]]
constr:  9.99999996
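For reference, here is a sketch of the same model in the modern cvxpy (>= 1.0) API, where sum_entries was replaced by sum and shapes are passed as tuples (the model itself is unchanged):
import cvxpy as cp

L = 10.0
V = cp.Variable((3, 5))  # 3 vectors of dimension 5, stored as rows
constraints = [cp.sum(cp.norm(V, axis=1)) <= L]
prob = cp.Problem(cp.Maximize(cp.sum(V)), constraints)
prob.solve()
print("status:", prob.status)
print("optimal value:", prob.value)
print("constr:", cp.norm(V, axis=1).value.sum())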
The above is automatically converted to SOCP form and can be solved by commercial solvers or open-source solvers like ECOS and SCS.
This conversion also proves to us that this problem is convex (by construction)! The approach is called Disciplined Convex Programming.
Depending on your choice of library/software, you may have to do this conversion manually. It shouldn't be too hard once you introduce some helper variables to collect the norms of your vectors. Within Gurobi, you would just need to use the basic SOCP constraint (see the docs).
Remark: ||v1|| + ||v2|| + .... + ||vn|| < L is scary, as numerical optimization usually only supports <=. Strict inequalities need trickery (epsilon values, ...).
Edit:
Here is a pure Gurobi approach, which can give you some idea of how to achieve this with more low-level libraries supporting similar functions to Gurobi's API (I'm thinking of Mosek and CPLEX without knowing their APIs much; I think Mosek is quite different).
from gurobipy import *
import numpy as np

L = 10.0

# Model
m = Model("test")

# Vars
v = np.empty((3, 5), dtype=object)
for i in range(3):
    for j in range(5):
        v[i, j] = m.addVar()  # by default >= 0; it's just an example

norm_v = np.empty(3, dtype=object)
for i in range(3):
    norm_v[i] = m.addVar()  # aux vars to collect norms

m.update()  # make vars usable for posting constraints

# Constraints
for i in range(3):
    m.addQConstr(np.dot(v[i, :], v[i, :]),
                 GRB.LESS_EQUAL, norm_v[i] * norm_v[i])  # this is the SOCP constraint for our norm

m.addConstr(np.sum(norm_v) <= L)  # Gurobi devs would propose using quicksum

# Objective
m.setObjective(np.sum(v), GRB.MAXIMIZE)

# Solve
m.optimize()

def get_val(x):
    return x.X

get_val_func = np.vectorize(get_val)
print('optimal var: ', get_val_func(v))
Output:
Optimal objective 2.23606793e+01
optimal var: [[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]
[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]
[ 1.49071195 1.49071195 1.49071195 1.49071195 1.49071195]]

Related

Optimization problem with Laplacian constraints

I am trying to solve the following optimization problem to obtain a set of values x_1, x_2, ..., x_k:
argmin Σx_i * a_i
subject to <x_1, x_2, ..., x_k> ~ Lap(m, b)
The terms a_i are constants, and the values x_i are drawn from a Laplace distribution with mean m and scale parameter b. Hence the resulting outputs are generated from a Laplace distribution. What is this kind of constraint called?
This looks like a general form of Lasso regularization. Adding L1 regularization can often be seen as enforcing a Laplace prior on your data (see the Bayesian interpretation). You could try solving:
argmin Σx_i * a_i + 1/b Σ|x_i - m|
You could try solving this using (sub)-gradient methods or proximal minimization.
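For instance, here is a minimal subgradient-descent sketch for that relaxed objective (the values of a, m, and b are illustrative; note the objective is only bounded below when |a_i| <= 1/b for every i):
import numpy as np

# Minimize  sum_i a_i * x_i + (1/b) * sum_i |x_i - m|  by subgradient descent.
a = np.array([0.3, -0.2, 0.1])  # illustrative constants, all |a_i| <= 1/b
m, b = 1.0, 2.0                 # illustrative Laplace location and scale

x = np.zeros_like(a)
for t in range(1, 2001):
    g = a + (1.0 / b) * np.sign(x - m)  # a subgradient of the objective
    x = x - g / np.sqrt(t)              # diminishing step size
print(x)  # each x_i drifts toward m wherever |a_i| < 1/b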

Tensorflow Probability VI: Discrete + Continuous RVs inference: gradient estimation?

See this tensorflow-probability issue
tensorflow==2.7.0
tensorflow-probability==0.14.1
TLDR
To perform VI on discrete RVs, should I use:
A- the REINFORCE gradient estimator
B- the Gumbel-Softmax reparametrization
C- another solution
and how do I implement it?
Problem statement
Sorry in advance for the long issue, but I believe the problem requires some explaining.
I want to implement a Hierarchical Bayesian Model involving both continuous and discrete Random Variables. A minimal example is a Gaussian Mixture model:
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

G = 2
p = tfd.JointDistributionNamed(
    model=dict(
        mu=tfd.Sample(
            tfd.Normal(0., 1.),
            sample_shape=(G,)
        ),
        z=tfd.Categorical(
            probs=tf.ones((G,)) / G
        ),
        x=lambda mu, z: tfd.Normal(
            loc=mu[z],
            scale=1.
        )
    )
)
In this example I don't use the tfd.Mixture API on purpose, to expose the Categorical label. I want to perform Variational Inference in this context and, for instance given an observed x, fit a Categorical distribution with parametric probabilities over the posterior of z:
q_probs = tfp.util.TransformedVariable(
    tf.ones((G,)) / G,
    tfb.SoftmaxCentered(),
    name="q_probs"
)
q_loc = tf.Variable(0., name="q_loc")
q_scale = tfp.util.TransformedVariable(
    1.,
    tfb.Exp(),
    name="q_scale"
)
q = tfd.JointDistributionNamed(
    model=dict(
        mu=tfd.Normal(q_loc, q_scale),
        z=tfd.Categorical(probs=q_probs)
    )
)
The issue is: when computing the ELBO and trying to optimize for the optimal q_probs I cannot use the reparameterization gradient estimators: this is AFAIK because z is a discrete RV:
def log_prob_fn(**kwargs):
    return p.log_prob(
        **kwargs,
        x=tf.constant([2.])
    )

optimizer = tf.optimizers.SGD()

@tf.function
def fit_vi():
    return tfp.vi.fit_surrogate_posterior(
        target_log_prob_fn=log_prob_fn,
        surrogate_posterior=q,
        optimizer=optimizer,
        num_steps=10,
        sample_size=8
    )

_ = fit_vi()
# This last line raises:
# ValueError: Distribution `surrogate_posterior` must be reparameterized, i.e., a diffeomorphic
# transformation of a parameterless distribution. (Otherwise this function has a biased gradient.)
I'm looking for a way to make this work. I've identified at least two ways to circumvent the issue: using the REINFORCE gradient estimator or the Gumbel-Softmax reparameterization.
A- REINFORCE gradient
Cf. this TFP API link: a classical result in VI is that the REINFORCE gradient can deal with a non-differentiable objective function, for instance due to discrete RVs.
Can I use a tfp.vi.GradientEstimators.SCORE_FUNCTION estimator instead of the tfp.vi.GradientEstimators.REPARAMETERIZATION one, via the lower-level tfp.vi.monte_carlo_variational_loss function?
Using the REINFORCE gradient, I only need the log_prob method of q to be differentiable; the sample method needn't be differentiated.
As far as I understand it, the sample method for a Categorical distribution implies a gradient break, but the log_prob method does not. Am I correct to assume that this could help with my issue? Am I missing something here?
Also I wonder: why is this possibility not exposed in the tfp.vi.fit_surrogate_posterior API? Is the performance bad, meaning is the variance of the estimator too large for practical purposes?
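Concretely, I imagine something like this hedged sketch (the gradient_estimator argument and the GradientEstimators enum are the names referenced above; whether they are available depends on the TFP version):
@tf.function
def vi_step():
    with tf.GradientTape() as tape:
        loss = tfp.vi.monte_carlo_variational_loss(
            target_log_prob_fn=log_prob_fn,
            surrogate_posterior=q,
            sample_size=8,
            gradient_estimator=tfp.vi.GradientEstimators.SCORE_FUNCTION,
        )
    grads = tape.gradient(loss, q.trainable_variables)
    optimizer.apply_gradients(zip(grads, q.trainable_variables))
    return loss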
B- Gumbel-Softmax reparameterization
Cf. this TFP API link: I could also reparameterize z as a variable y = tfd.RelaxedOneHotCategorical(...). The issue is: I need a proper categorical label to use in the definition of x, so AFAIK I need to do the following:
p_GS = tfd.JointDistributionNamed(
    model=dict(
        mu=tfd.Sample(
            tfd.Normal(0., 1.),
            sample_shape=(G,)
        ),
        y=tfd.RelaxedOneHotCategorical(
            temperature=1.,
            probs=tf.ones((G,)) / G
        ),
        x=lambda mu, y: tfd.Normal(
            loc=mu[tf.argmax(y)],
            scale=1.
        )
    )
)
...but this would just move the gradient-breaking problem to tf.argmax. This is where I may be missing something. Following the Gumbel-Softmax paper (Jang et al., 2016), I could then use the "STRAIGHT-THROUGH" (ST) strategy and "plug" the gradients of the variable tf.one_hot(tf.argmax(y)) (the "discrete y") onto y (the "continuous y").
But again I wonder: how do I do this properly? I don't want to mix and match the gradients by hand, and I guess an autodiff backend is precisely meant to spare me this issue. How could I create a distribution that differentiates the forward direction (sampling a "discrete y") from the backward direction (gradient computed using the "continuous y")? I guess this is the intended usage of the tfd.RelaxedOneHotCategorical distribution, but I don't see this implemented anywhere in the API.
Should I implement this myself? How? Could I use something along the lines of tf.custom_gradient?
Actual question
Which solution (A, B, or another) is meant to be used in the TFP API, if any? How should I implement said solution efficiently?
The idea was not to make this a self-answered Q&A, but I looked into this issue for a couple of days and here are my conclusions:
solution A (REINFORCE) is a possibility; it doesn't introduce any bias, but as far as I understand it has high variance in its vanilla form, making it prohibitively slow for most real-world tasks. As detailed a bit below, control variates can help tackle the variance issue;
solution B (Gumbel-Softmax) exists as well in the API, but I did not find any native way to make it work for hierarchical tasks. Below is my implementation.
First off, we need to reparameterize the joint distribution p, as the KL between a discrete and a continuous distribution is ill-defined (as explained in the Maddison et al. (2017) paper). To avoid breaking the gradients, I implemented a simple one_hot_straight_through operation that converts the continuous RV y into a discrete RV z:
G = 2

@tf.custom_gradient
def one_hot_straight_through(y):
    depth = y.shape[-1]
    z = tf.one_hot(
        tf.argmax(
            y,
            axis=-1
        ),
        depth=depth
    )

    def grad(upstream):
        return upstream

    return z, grad

p = tfd.JointDistributionNamed(
    model=dict(
        mu=tfd.Sample(
            tfd.Normal(0., 1.),
            sample_shape=(G,)
        ),
        y=tfd.RelaxedOneHotCategorical(
            temperature=1.,
            probs=tf.ones((G,)) / G
        ),
        x=lambda mu, y: tfd.Normal(
            loc=tf.reduce_sum(
                one_hot_straight_through(y)
                * mu
            ),
            scale=1.
        )
    )
)
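As a quick sanity check (my own illustrative snippet, not part of the original model): the forward pass returns a hard one-hot vector, while the backward pass forwards the upstream gradient unchanged:
y_test = tf.constant([0.2, 0.8])
with tf.GradientTape() as tape:
    tape.watch(y_test)
    z_test = one_hot_straight_through(y_test)
    loss = tf.reduce_sum(z_test * tf.constant([1.0, 2.0]))
print(z_test.numpy())                       # [0. 1.]  -> hard one-hot forward pass
print(tape.gradient(loss, y_test).numpy())  # [1. 2.]  -> gradient passed straight through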
The variational distribution q follows the same reparameterization and the following code bit does work:
q_probs = tfp.util.TransformedVariable(
    tf.ones((G,)) / G,
    tfb.SoftmaxCentered(),
    name="q_probs"
)
q_loc = tf.Variable(tf.zeros((2,)), name="q_loc")
q_scale = tfp.util.TransformedVariable(
    1.,
    tfb.Exp(),
    name="q_scale"
)
q = tfd.JointDistributionNamed(
    model=dict(
        mu=tfd.Independent(
            tfd.Normal(q_loc, q_scale),
            reinterpreted_batch_ndims=1
        ),
        y=tfd.RelaxedOneHotCategorical(
            temperature=1.,
            probs=q_probs
        )
    )
)
def log_prob_fn(**kwargs):
    return p.log_prob(
        **kwargs,
        x=tf.constant([2.])
    )

optimizer = tf.optimizers.SGD()

@tf.function
def fit_vi():
    return tfp.vi.fit_surrogate_posterior(
        target_log_prob_fn=log_prob_fn,
        surrogate_posterior=q,
        optimizer=optimizer,
        num_steps=10,
        sample_size=8
    )

_ = fit_vi()
Now there are several issues with that design:
first off, we needed to reparameterize not only q but also p, so we "modify our target model". This results in our models p and q not outputting discrete RVs as originally intended, but continuous RVs. I think that the introduction of a hard option like in the torch implementation could be a nice addition to overcome this issue;
second, we introduce the burden of setting the temperature parameter, which makes the continuous RV y smoothly converge to its discrete counterpart z. An annealing strategy, reducing the temperature to reduce the bias introduced by the relaxation at the cost of a higher variance, can be implemented. Or the temperature can be learned online, akin to an entropy regularization (see Maddison et al. (2017) and Jang et al. (2017));
the gradients obtained with this estimator are biased, which is probably acceptable for most applications but is an issue in theory.
Recent methods like REBAR (Tucker et al. (2017)) or RELAX (Grathwohl et al. (2018)) can instead obtain unbiased estimators with lower variance than the original REINFORCE, but they do so at the cost of introducing learnable control variates with separate losses. Modifications of the one_hot_straight_through function could probably implement this.
In conclusion, my opinion is that TensorFlow Probability's support for optimization over discrete RVs is too scarce at the moment, and that the API lacks native functions and tutorials to make it easier for the user.

Treatment of constraints in SLSQP optimization with openMDAO

With openMDAO, I am using FD derivatives and trying to solve a non-linearly constrained optimization problem with the SLSQP method. Any time the optimizer arrives at a point that violates one of the constraints, it just crashes with the message:
Optimization FAILED. Positive directional derivative for linesearch
For instance, if I intentionally set the initial point to an unfeasible design point, the optimizer performs 1 iteration and exits with the above error (the same happens when I start from a feasible point, but then optimizer arrives at an unfeasible point after a few iterations).
Based on the answer to the question "In OpenMDAO, is there a way to ensure that the constraints are respected before proceeding with a computation?", I'm assuming that raising the AnalysisError exception will not work in my case, is that correct? Is there any other way to prevent the optimizer from entering unfeasible regions, or at least to backtrack on the linesearch and try a different direction/distance? Or should the SLSQP method only be used when analytic derivatives are available?
Reproducible test case:
import numpy as np
import openmdao.api as om

class d1(om.ExplicitComponent):
    def setup(self):
        # Global design variables
        self.add_input('r', val=[3, 3, 3])
        self.add_input('T', val=20)
        # Coupling output
        self.add_output('M', val=0)
        self.add_output('cost', val=0)

    def setup_partials(self):
        # Finite difference all partials.
        self.declare_partials('*', '*', method='fd')

    def compute(self, inputs, outputs):
        # define inputs
        r = inputs['r']
        T = inputs['T'][0]
        cost = 174.42 * T * (r[0]**2 + 2*r[1]**2 + r[2]**2 + r[0]*r[1] + r[1]*r[2])
        M = 456.19 * T * (r[0]**2 + 2*r[1]**2 + r[2]**2 + r[0]*r[1] + r[1]*r[2]) - 599718
        outputs['M'] = M
        outputs['cost'] = cost

class MDA(om.Group):
    class ObjCmp(om.ExplicitComponent):
        def setup(self):
            # Global Design Variable
            self.add_input('cost', val=0)
            # Output
            self.add_output('obj', val=0.0)

        def setup_partials(self):
            # Finite difference all partials.
            self.declare_partials('*', '*', method='fd')

        def compute(self, inputs, outputs):
            outputs['obj'] = inputs['cost']

    class ConCmp(om.ExplicitComponent):
        def setup(self):
            # Global Design Variable
            self.add_input('M', val=0)
            # Output
            self.add_output('con', val=0.0)

        def setup_partials(self):
            # Finite difference all partials.
            self.declare_partials('*', '*', method='fd')

        def compute(self, inputs, outputs):
            # assemble outputs
            outputs['con'] = inputs['M']

    def setup(self):
        self.add_subsystem('d1', d1(), promotes_inputs=['r', 'T'],
                           promotes_outputs=['M', 'cost'])
        self.add_subsystem('con_cmp', self.ConCmp(), promotes_inputs=['M'],
                           promotes_outputs=['con'])
        self.add_subsystem('obj_cmp', self.ObjCmp(), promotes_inputs=['cost'],
                           promotes_outputs=['obj'])

# Build the model
prob = om.Problem(model=MDA())
model = prob.model
model.add_design_var('r', lower=[3, 3, 3], upper=[10, 10, 10])
model.add_design_var('T', lower=20, upper=220)
model.add_objective('obj', scaler=1)
model.add_constraint('con', lower=0)

# Setup the optimization
prob.driver = om.ScipyOptimizeDriver(optimizer='SLSQP', tol=1e-3, disp=True)
prob.setup()
prob.set_solver_print(level=0)
prob.run_driver()

# Printout
print('minimum found at')
print(prob.get_val('T')[0])
print(prob.get_val('r'))
print('constraint')
print(prob.get_val('con')[0])
print('minimum objective')
print(prob.get_val('obj')[0])
Based on your provided test case, the problem here is that you have a really poorly scaled objective and constraint (you also made some very strange coding choices, which I modified).
Running the OpenMDAO scaling report shows that your objective and constraint values are both around 1e6 in magnitude.
This is quite large, and is the source of your problems. A (very rough) rule of thumb is that your objectives and constraints should be around order 1. That's not a hard and fast rule, but it is generally a good starting point. Sometimes other scaling will work better, if you have very large or very small derivatives... but there are parts of SQP methods that are sensitive to the scaling of objective and constraint values directly. So trying to keep them roughly in the range of 1 is a good idea.
Adding ref=1e6 to both objective and constraints gave enough resolution for the numerical methods to converge the problem:
Current function value: [0.229372]
Iterations: 8
Function evaluations: 8
Gradient evaluations: 8
Optimization Complete
-----------------------------------
minimum found at
20.00006826587515
[3.61138704 3. 3.61138704]
constraint
197.20821903413162
minimum objective
229371.99547899762
Here is the code I modified (including removing the extra class definitions inside your group that didn't seem to be doing anything):
import numpy as np
import openmdao.api as om

class d1(om.ExplicitComponent):
    def setup(self):
        # Global design variables
        self.add_input('r', val=[3, 3, 3])
        self.add_input('T', val=20)
        # Coupling output
        self.add_output('M', val=0)
        self.add_output('cost', val=0)

    def setup_partials(self):
        # Finite difference all partials.
        self.declare_partials('*', '*', method='cs')

    def compute(self, inputs, outputs):
        # define inputs
        r = inputs['r']
        T = inputs['T'][0]
        cost = 174.42 * T * (r[0]**2 + 2*r[1]**2 + r[2]**2 + r[0]*r[1] + r[1]*r[2])
        M = 456.19 * T * (r[0]**2 + 2*r[1]**2 + r[2]**2 + r[0]*r[1] + r[1]*r[2]) - 599718
        outputs['M'] = M
        outputs['cost'] = cost

class MDA(om.Group):
    def setup(self):
        self.add_subsystem('d1', d1(), promotes_inputs=['r', 'T'],
                           promotes_outputs=['M', 'cost'])
        # self.add_subsystem('con_cmp', self.ConCmp(), promotes_inputs=['M'],
        #                    promotes_outputs=['con'])
        # self.add_subsystem('obj_cmp', self.ObjCmp(), promotes_inputs=['cost'],
        #                    promotes_outputs=['obj'])

# Build the model
prob = om.Problem(model=MDA())
model = prob.model
model.add_design_var('r', lower=[3, 3, 3], upper=[10, 10, 10])
model.add_design_var('T', lower=20, upper=220)
model.add_objective('cost', ref=1e6)
model.add_constraint('M', lower=0, ref=1e6)

# Setup the optimization
prob.driver = om.ScipyOptimizeDriver(optimizer='SLSQP', tol=1e-3, disp=True)
prob.setup()
prob.set_solver_print(level=0)
prob.set_val('r', 7.65)
prob.run_driver()

# Printout
print('minimum found at')
print(prob.get_val('T')[0])
print(prob.get_val('r'))
print('constraint')
print(prob.get_val('M')[0])
print('minimum objective')
print(prob.get_val('cost')[0])
Which SLSQP method are you using? There is one implementation in pyOptSparse and one in ScipyOptimizer. The one in pyOptSparse is older and doesn't respect bound constraints. The one in SciPy is newer and does. (Yes, it's very confusing that they have the same name and share some lineage... but they are not the same optimizer any more.)
You shouldn't raise an analysis error when you go outside the bounds. If you need strict bound respecting, I suggest using IPOPT from within pyOptSparse (if you can get it to compile) or switching to ScipyOptimizeDriver and its SLSQP implementation.
Based on your question, it's not totally clear to me if you're talking about bound constraints or inequality/equality constraints. If it's the latter, then there isn't any optimizer that can guarantee you remain in a feasible region all the time. Interior point methods like IPOPT will stay inside the region much better, but not 100% of the time.
In general, with gradient-based optimization it's pretty critical that you make your problem smooth and continuous even outside the constraint areas. If there are parts of the space that you absolutely cannot go into, then you need to make those variables into design variables and use bound constraints. This sometimes requires reformulating your problem a little bit, possibly by adding a kind of compatibility constraint that says "design variable = computed_value". Then you can make sure that the design variable is passed into anything that requires the value to be strictly within a bound, and (hopefully) a converged answer will also satisfy your compatibility constraint. A minimal sketch of that idea follows.
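Here is a minimal sketch of that reformulation (the names x, x_computed, and x_resid are hypothetical; the component and calls below are illustrative, not taken from the question):
import openmdao.api as om

class Compat(om.ExplicitComponent):
    """Compatibility residual: x_resid = x - x_computed (hypothetical names)."""
    def setup(self):
        self.add_input('x', val=0.0)           # the promoted design variable
        self.add_input('x_computed', val=0.0)  # the value the model computes
        self.add_output('x_resid', val=0.0)
        self.declare_partials('*', '*', method='fd')

    def compute(self, inputs, outputs):
        outputs['x_resid'] = inputs['x'] - inputs['x_computed']

# In the problem setup, the design variable gets strict bounds and the
# residual is driven to zero:
# model.add_subsystem('compat', Compat(), promotes=['*'])
# model.add_design_var('x', lower=0.0, upper=10.0)  # bounds the optimizer respects
# model.add_constraint('x_resid', equals=0.0)       # compatibility constraint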
If you provide some kind of a test case or example, I can amend my answer with a more specific suggestion.

Using the piecewise function of the IBM CPLEX python API, but the problem cannot be solved

I am trying to use MILP (Mixed Integer Linear Programming) to solve the unit commitment problem (an optimization problem that tries to find the best scheduling of generators).
Because the relationship between generator power and cost is a quadratic function, I use a piecewise function to convert power to cost.
I modify the answer on this page:
unit commitment problem using piecewise-linear approximation become MIQP
The simple program structure is like this:
from docplex.mp.model import Model

mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')

# after 4 buses, additional buses of a given size are cheaper
f1 = mdl.piecewise(0, [(0, 0), (4, 2000), (10, 4400)], 0.8)
f2 = mdl.piecewise(0, [(0, 0), (4, 1600), (10, 3520)], 0.8)

cost1 = f1(nbbus40)
cost2 = f2(nbbus30)

mdl.minimize(cost1 + cost2)
mdl.solve()
mdl.report()

for v in mdl.iter_integer_vars():
    print(v, " = ", v.solution_value)
which gives
* model buses solved with
objective = 3520.000
nbBus40 = 0
nbBus30 = 10.0
The answer works perfectly, but I can't find a way to apply it to my example.
I used a piecewise function to formulate a piecewise linear relationship between power and cost, and got a new object (cost1), and then calculated the minimum value of this object.
The following is my actual code (simplified):
(min1,miny1), (pw1_1,pw1_1y),(pw1_2,pw1_2y), (max1,maxy1) are the breakpoints on the power-cost curve.
pwl_func_1phase = ucpm.piecewise(
    0,
    [(0, 0),
     (min1, miny1),
     (pw1_1, pw1_1y),
     (pw1_2, pw1_2y),
     (max1, maxy1)],
    0
)

# df_decision_vars_spinning is a dataframe storing the optimization variables
df_decision_vars_spinning.at[
    (units, period),
    'variable_cost'
] = pwl_func_1phase(
    df_decision_vars_spinning.at[
        (units, period),
        'production'
    ]
)

total_variable_cost = ucpm.sum(
    df_decision_vars_spinning.variable_cost)
ucpm.minimize(total_variable_cost)
I don't know why this optimization problem can't be solved. Here is my complete code:
https://colab.research.google.com/drive/1JSKfOf0Vzo3E3FywsxcDdOz4sAwCgOHd?usp=sharing
With an unlimited edition of CPLEX, your model solves (though very slowly). Here are two ideas to better control what happens in solve():
use solve(log_output=True) to print the log: you'll see the gap going down
set a MIP gap: setting the MIP gap to 5% stops the solve at 36 s
ucpm.parameters.mip.tolerances.mipgap = 0.05
ucpm.solve(log_output=True)
Not an answer, but to illustrate my comment.
Let's say we have as the cost curve
cost = α + β⋅power^2
Furthermore, we are minimizing cost.
We can approximate this using a few linear curves (tangent lines drawn below the parabola).
Let's say each linear curve has the form
cost = a(i) + b(i)⋅power
for i=1,...,n (n=number of linear curves).
It is easy to see that if we write:
min cost
cost ≥ a(i) + b(i)⋅power ∀i
we get a good approximation of the quadratic cost curve: since we are minimizing, cost settles on the largest of the linear underestimators, i.e. the piecewise-linear envelope. This is exactly what I said in the comment.
No binary variables were used here.
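Here is a docplex sketch of that formulation (the coefficients, bounds, and tangent points are illustrative, not from the asker's model):
from docplex.mp.model import Model

alpha, beta = 10.0, 2.0                # assumed cost curve: alpha + beta * power^2
tangent_points = [1.0, 2.0, 3.0, 4.0]  # power values where we linearize

mdl = Model(name='pwl_epigraph')
power = mdl.continuous_var(lb=0, ub=5, name='power')
cost = mdl.continuous_var(name='cost')

# Tangent line at p0: cost >= (alpha - beta*p0^2) + 2*beta*p0 * power
for p0 in tangent_points:
    mdl.add_constraint(cost >= (alpha - beta * p0**2) + 2 * beta * p0 * power)

mdl.add_constraint(power >= 2.5)  # an assumed demand requirement
mdl.minimize(cost)                # minimizing presses cost onto the envelope
mdl.solve()
print(power.solution_value, cost.solution_value)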

sparse matrix multiplication involving inverted matrix

I have two large square sparse matrices, A & B, and need to compute the following: A * B^-1 in the most efficient way. I have a feeling that the answer involves using scipy.sparse, but can't for the life of me figure it out.
After extensive searching, I have run across the following thread: Efficient numpy / lapack routine for product of inverse and sparse matrix? but can't figure out what the most efficient way would be.
Someone suggested using LU decomposition, which is built into the sparse module of scipy, but when I try LU on a sample matrix it says the matrix is singular (although when I just compute A * B^-1 I get an answer). I have also heard the suggestion to use linalg.spsolve(), but I can't figure out how to implement this, as it requires a vector as the second argument.
If it helps, once I have the solution s.t. A * B^-1 = C, I only need to know the value of one row of the matrix C. The matrices will be roughly 1000x1000 to 1500x1500.
Actually 1000x1000 matrices are not that large. You can compute the inverse of such a matrix using numpy.linalg.inv(B) in less than 1 second on a modern desktop computer.
But you can be much more efficient if you rewrite your problem taking into account the fact that you only need one row of C (this is actually very often the case).
Let us write d_i = [0 0 0 ... 0 1 0 ... 0], the vector whose only nonzero entry is a one in the i-th position.
You can write, if ^t denotes the transpose:
A B^-1 = C <=> A = C B <=> A^t = B^t C^t
For the i-th row:
A^t d_i = B^t C^t d_i <=> a_i = B^t c_i
So you have a linear system which can be solved using numpy.linalg.solve:
ci = np.linalg.solve(B.T, a[i])
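If A and B are scipy.sparse matrices, the same idea works without densifying B (a sketch with illustrative matrices; spsolve takes a dense right-hand-side vector):
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, i = 1000, 42
# Illustrative matrices; adding the identity keeps B comfortably nonsingular.
B = sp.eye(n, format='csc') + sp.random(n, n, density=0.001, format='csc', random_state=0)
A = sp.random(n, n, density=0.001, format='csr', random_state=1)

a_i = A.getrow(i).toarray().ravel()   # i-th row of A as a dense vector
c_i = spla.spsolve(B.T.tocsc(), a_i)  # solves B^t c_i = a_i, giving row i of C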