I am trying to solve the following linear system using scipy.optimize.root:
AX = b
with the following code:
import numpy as np
import scipy.optimize

A = [[0, 1, 0], [2, 1, 0], [1, 4, 1]]

def foo(X):
    b = np.matrix([2, 1, 1])
    out = np.dot(A, X) - b
    return out.tolist()

sol = scipy.optimize.root(foo, [0, 0, 0])
I know that I can simply use numpy.linalg.solve to do this easily. But I am actually trying to solve a nonlinear system that is in matrix form (see my question here), so I need to find a way to make this method work. To do that, I am first trying it on this simple case. But I get the error:
TypeError: fsolve: there is a mismatch between the input and output shape of the 'func' argument 'foo'.Shape should be (3,) but it is (1, 3).
From what I have read in other similar Stack Overflow questions, this happens because the output of the foo function is not compatible with the shape of the initial guess [0, 0, 0].
Surely there is a way to solve this equation using scipy.optimize.root. Can anyone please help?
Try using np.array for b. np.matrix creates a "row vector", i.e. shape (1, 3), whereas your initial guess has shape (3,); the residual returned by foo therefore also has shape (1, 3) instead of the (3,) that the solver expects.
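For example, a minimal sketch of the fix (returning the residual directly as a 1-D array; any other way of producing a flat (3,) result works just as well):

import numpy as np
import scipy.optimize

A = [[0, 1, 0], [2, 1, 0], [1, 4, 1]]

def foo(X):
    b = np.array([2, 1, 1])      # plain 1-D array, shape (3,)
    return np.dot(A, X) - b      # residual also has shape (3,)

sol = scipy.optimize.root(foo, [0, 0, 0])
print(sol.x)  # should agree with np.linalg.solve(np.array(A), [2, 1, 1])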
I would like to use Pyomo to solve a multiple linear regression under constraints.
To do so I have 3 matrices:
X (called tour1 in the code below): my inputs, of size 600x13 (bureaux * t1 in Pyomo)
Y (called tour2 in the code below): the matrix I want to predict, of size 600x3 (bureaux * t2 in Pyomo)
T (called transfer in the code): of size 13x3 (t1 * t2 in Pyomo)
I would like to do the following:
ypred = X T
minimize (ypred - Y)**2
subject to
0 <= T_ij <= 1
and sum_i(T_ij) = 1
To that end, I started with the following code:
import numpy as np
import pandas as pd
from pyomo.environ import *

tour1 = pd.DataFrame(np.random.random(size=(60, 13)),
                     columns=["X" + str(i) for i in range(13)],
                     index=["B" + str(i) for i in range(60)])
tour2 = pd.DataFrame(np.random.random(size=(60, 3)),
                     columns=["Y" + str(i) for i in range(3)],
                     index=["B" + str(i) for i in range(60)])

def gettour1(model, i, j):
    return tour1.loc[i, j]

def gettour2(model, i, j):
    return tour2.loc[i, j]

def cost(model):
    return sum((sum(model.tour1[i, k] * model.transfer[k, j] for k in model.t1)
                - model.tour2[i, j])**2
               for i in model.bureaux for j in model.tour2)

model = ConcreteModel()
model.bureaux = Set(initialize=tour1.index.tolist())
model.t1 = Set(initialize=tour1.columns)
model.t2 = Set(initialize=tour2.columns)
model.tour1 = Param(model.bureaux, model.t1, initialize=gettour1)
model.tour2 = Param(model.bureaux, model.t2, initialize=gettour2)
model.transfer = Var(model.t1, model.t2, bounds=[0, 1])
model.obj = Objective(rule=cost, sense=minimize)
Unfortunately, I get an error at this stage:
KeyError: "Index '('X0', 'B0', 'Y0')' is not valid for indexed component 'transfer'"
Does anyone know how I can calculate the objective? Furthermore, any help with the constraints would be appreciated :-)
A couple of things...
First, the error you are getting: there is information in that error statement that should help identify the problem. The construction appears to be trying to index transfer with a 3-part index (x, b, y), which is clearly outside of t1 x t2. If you look at your sum expression, you are mistakenly iterating over model.tour2 (the Param) instead of model.t2 (the Set).
Also, your bounds parameter needs to be a tuple.
While building the model, you should be pprint()-ing the model very frequently to look for these types of issues. That only works well if you have "small" data. 60 x 13 may be the normal problem size, but it is a total pain to troubleshoot. So, start with something tiny, maybe 3 x 4. Make a Set, pprint(). Make a Constraint, pprint()... Once the model computes/solves with "tiny" data, just pop in the real stuff.
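Concretely, a minimal sketch of the two fixes plus the sum constraint (assuming sum_i(T_ij) = 1 means each column of T, i.e. each j in t2, sums to 1 over t1):

def cost(model):
    return sum((sum(model.tour1[i, k] * model.transfer[k, j] for k in model.t1)
                - model.tour2[i, j])**2
               for i in model.bureaux for j in model.t2)  # t2, not tour2

model.transfer = Var(model.t1, model.t2, bounds=(0, 1))  # tuple, not list

def col_sum_rule(model, j):
    return sum(model.transfer[k, j] for k in model.t1) == 1

model.col_sum = Constraint(model.t2, rule=col_sum_rule)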
I have used CVXPY and some of its LP solvers to determine whether a solution to an A*x <= b problem is feasible, and now I would like to try PySCIPOpt. I could not find an example of this in the docs, and I'm having trouble figuring out the right syntax. With CVXPY the code is simply:
import cvxpy

def do_cvxpy(A, b, solver):
    x = cvxpy.Variable(A.shape[1])
    constraints = [A @ x <= b]  # the @ denotes matrix multiplication in CVXPY
    obj = cvxpy.Minimize(0)
    prob = cvxpy.Problem(obj, constraints)
    prob.solve(solver=solver)
    return prob.status
I think with PySCIPOpt one cannot use matrix notation as above, but must treat vectors and matrices as collections of scalar variables, each of which has to be added individually, so I tried this:
from pyscipopt import Model

def do_scip(A, b):
    model = Model("XYZ")
    x = {}
    for i in range(A.shape[1]):
        x[i] = model.addVar(vtype="C", name="x(%s)" % i)
    model.setObjective(0)    # Is this right for a feasibility-only problem?
    model.addCons(A*x <= b)  # This is certainly the wrong syntax
    model.optimize()
    return model.getStatus()
Could anyone please help me out with the correct form for the constraint in addCons() for this kind of problem, and confirm that an acceptable way to ask whether a solution is feasible is to simply pass 0 as the objective?
I'm still not positive about the setObjective(0), but at least I can get the code to run without errors by "unpacking" the A matrix and the b vector and adding one constraint per row (quicksum, imported from pyscipopt, builds each row's linear expression):

from pyscipopt import quicksum

for j in range(nrows):
    model.addCons(quicksum(A[j, i] * x[i] for i in range(ncols)) <= b[j])
I also discovered that CVXPY actually has an interface to SCIP, but it gives me an error when I try to use it:
getSolObjVal cannot only be called in stage SOLVING without a valid solution
which seems to suggest that the interface cannot be used for feasibility-only problems.
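For reference, a minimal sketch putting the pieces together (a constant objective is a standard way to pose a pure feasibility problem; note that addVar defaults to a lower bound of 0 in PySCIPOpt, so lb=None is used here to make the variables free):

import numpy as np
from pyscipopt import Model, quicksum

def do_scip(A, b):
    model = Model("feasibility")
    nrows, ncols = A.shape
    x = [model.addVar(vtype="C", lb=None, name="x(%s)" % i)
         for i in range(ncols)]
    model.setObjective(0)  # constant objective: feasibility only
    for j in range(nrows):
        model.addCons(quicksum(A[j, i] * x[i] for i in range(ncols)) <= b[j])
    model.optimize()
    return model.getStatus()  # "optimal" here means a feasible point was found

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])
print(do_scip(A, b))  # expect "optimal": x = (0, 0) already satisfies Ax <= b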
Suppose I have a matrix of N users, and each user is associated with a vector of words (translated to integers). So for example for N = 2 I'd have:
user 0 corresponds to words ['20', '56']
user 1 corresponds to words ['58', '10', '105']
So I have a list
user_words = [['20','56'],['58','10','105']]
Suppose further I created a 100-column embedding matrix (word_emb) for these words. I'd like to look up the (mean) embeddings of each of the user vectors and create a new Tensor, whose shape I would expect to be [2,100]. I tried doing this:
word_vec = []
for word_sequence_i in tf.map_fn(lambda x: x, user_words):
    all_word_vecs = tf.nn.embedding_lookup(word_emb, word_sequence_i)
    word_vec.append(tf.reduce_mean(all_word_vecs, 1))
But this gives me an error:
TypeError: `Tensor` objects are not iterable when eager execution is not enabled. To iterate over this tensor use `tf.map_fn`.
I thought I already was using tf.map_fn above! So what is Tensorflow complaining about? Is there even a way to do what I am trying to do?
Thanks so much!
tf.map_fn returns a Tensor object itself, which is a symbolic reference to a value that will be computed at Session.run() time. You can see this with type(tf.map_fn(lambda x: x, user_words)). So, it's the iteration implied in for word_sequence_i in tf.map_fn(...) that is generating the error.
Perhaps what you're looking for is something like:
all_word_vecs = tf.map_fn(lambda x: tf.nn.embedding_lookup(word_emb, x), user_words)
word_vec = tf.reduce_mean(all_word_vecs, axis=1)
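Two caveats worth noting: tf.map_fn needs a rectangular tensor, so a ragged list like user_words has to be padded to a common length first, and when fn returns a different dtype than elems you must pass dtype explicitly. A minimal sketch, assuming integer word ids and 0 as a hypothetical padding id:

import tensorflow as tf

word_emb = tf.get_variable("word_emb", [200, 100])      # vocab x embedding dim
user_words = tf.constant([[20, 56, 0], [58, 10, 105]])  # padded to length 3

all_word_vecs = tf.map_fn(
    lambda x: tf.nn.embedding_lookup(word_emb, x),
    user_words, dtype=tf.float32)                 # shape [2, 3, 100]
word_vec = tf.reduce_mean(all_word_vecs, axis=1)  # shape [2, 100]; note the
                                                  # padding ids affect the mean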
On a related note, if this distinction between graph construction and execution is getting bothersome, you might want to give TensorFlow's eager execution a spin. See getting started and the programmer's guide.
Hope that helps.
Before I continue, please excuse my ignorance. I have some programming experience, but my previous intuition is failing me here.
Essentially, I need to expand a 1-D vector (size M x 1) of numbers ranging from 0...K into a 2-D matrix (or Tensor, size M x K) where each row is a 1-D vector (size 1 x K) in which every element is 0 except for a 1 at the index given by the initial value.
Yes, this is a multiclass classification problem for a ML class.
I had the idea of creating a zeros matrix of the correct shape and then manually assigning a 1 at the index I need, but I cannot seem to change the values of the already created Variable. I get the error:
TypeError: 'Tensor' object does not support item assignment
Can anyone assist with this? If you feel as though my way of going about creating this final Tensor could use a different approach, any advice would be appreciated.
In TensorFlow, the function tf.one_hot() is what you seek. "One-hot encoding" is the term for the operation you are looking to implement. See https://www.tensorflow.org/api_docs/python/tf/one_hot.
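A minimal sketch (the label values here are just placeholders):

import tensorflow as tf

labels = tf.constant([0, 2, 1, 2])     # shape (M,), values in 0..K-1
one_hot = tf.one_hot(labels, depth=3)  # shape (M, K), float32 by default

with tf.Session() as sess:
    print(sess.run(one_hot))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]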
I'm having a tough time using the 'initial_state' argument in the tf.nn.rnn function.
val, _ = tf.nn.rnn(cell1, newBatch, initial_state=stateP, dtype=tf.float32)
newBatch.shape => (1, 1, 11)
stateP.shape => (2, 2, 1, 11)
In general, I've gone through the training for my LSTM neural net and now I want to use the state it learned. How do I do this? I know that the tf.nn.rnn() function will return the state... but I don't know how to plug it back in.
FYI, stateP.shape => (2, 2, 1, 11)... maybe because I used stacked LSTM cells?
I've also tried:
val, _ = tf.nn.dynamic_rnn(stacked_lstm, newBatch, initial_state=stateP, dtype=tf.float32)
but I get the error "AttributeError: 'NoneType' object has no attribute 'op'".
I'm pretty sure that the 'NoneType' object being talked about is the stateP tuple I gave, but I'm not sure what to do here.
EDIT: I finally got this running by using
init_state = cell.zero_state(batch_size, tf.float32)
to determine the exact shape I need to pass into the 'initial_state' argument. In my case, it was a TUPLE of 4 tensors, each with shape (1, 11). I made it like this:
stateP0 = tf.convert_to_tensor(stateP[0][0])
stateP1 = tf.convert_to_tensor(stateP[0][1])
stateP2 = tf.convert_to_tensor(stateP[1][0])
stateP3 = tf.convert_to_tensor(stateP[1][1])
newStateP = stateP0, stateP1, stateP2, stateP3
Alright! Now the tf.nn.dynamic_rnn() function is working, but it gives me different results every time I run it... so what's the point of passing in the initial state? I want to use the state I trained to find, and I don't want it to change. I want to actually use the results of my training!
You are probably using the deprecated (or soon-to-be-deprecated) behavior. stateP in your case represents the concatenation of c (the cell state) and h (the output of the LSTM from the final step of unrolling), so you need to slice the state along dimension 1 to get the actual state.
Or, you can initialize your LSTM cell with state_is_tuple=True, which I would recommend, so that you can easily get the final state (if you want to tinker with it) by indexing it, e.g. stateP[0]. Or you could just pass the state tuple directly to rnn (or dynamic_rnn).
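For instance, a minimal sketch of the state_is_tuple route, assuming the TF 1.x tf.nn.rnn_cell API and a 2-layer stacked LSTM (which would explain your (2, 2, 1, 11) state shape):

import tensorflow as tf

batch_size, num_units = 1, 11  # hypothetical sizes matching the question

cells = [tf.nn.rnn_cell.BasicLSTMCell(num_units, state_is_tuple=True)
         for _ in range(2)]
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)

newBatch = tf.placeholder(tf.float32, [batch_size, 1, num_units])
init_state = stacked_lstm.zero_state(batch_size, tf.float32)

# init_state is a tuple of two LSTMStateTuple(c, h) pairs; a saved state of
# the same structure can be passed as initial_state instead of the zero state.
val, final_state = tf.nn.dynamic_rnn(stacked_lstm, newBatch,
                                     initial_state=init_state,
                                     dtype=tf.float32)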
I can't say anything beyond that, because you have not provided your initialization code; I would be guessing. You can edit your question to provide more details if you still face problems, and I will update the answer.