Warm Starting in SCIP

I am using GAMS/SCIP. I have been trying to warm start a model (Model2) that has only one more nonlinear equality constraint than another model (Model1). After failing to get SCIP to accept the solution from Model1 as feasible for Model2, I ran an experiment showing that, for some reason, SCIP finds its own solution infeasible when I feed it back as the initial solution: I simply solved the same model twice. That way the variables already hold their solution values on the second solve, and the solver is supposed to treat them as the initial solution. Why is this happening, and how can I fix it?

Related

Does increasing the number of iterations affect log-lik, AIC etc.?

Whenever I try to solve a convergence issue in one of my glmer models by switching to a different optimizer, I repeat the entire model-fitting procedure with the new optimizer. That is, I re-run all the models I've fitted so far with the new optimizer and redo the comparisons with anova(). I do this because, as far as I know, different optimizers may lead to differences in AIC and log-likelihood ratios for one and the same model, making comparisons between two models that use different optimizers problematic.
In my most recent analysis, I've increased the number of iterations with optCtrl=list(maxfun=100000) to avoid convergence errors. I'm now wondering whether this can also lead to differences in AIC/log-lik etc. for one and the same model? Is it equally problematic to compare two models that differ with regard to the inclusion of the optCtrl=list(maxfun=100000) argument?
I actually thought that increasing the number of iterations would simply lead to longer computation times (rather than different results), but I was unable to verify this online. Any hint/explanation is appreciated.
As far as I know, you should be fine. As long as the models were fit with the same number of observations you should be able to compare them using the AIC. Hopefully someone else can comment on the nuances of the computations of the AIC itself, but I just fit a bunch of models with the same formula and dataset and different number of max iterations, getting the AIC each time. It didn't change as a function of the iterations. The iterations are just the time the model fitting process can take to maximize the likelihood, which for complex models can be tricky. Once a model is fit, and has converged on an answer, the number of iterations shouldn't change anything about the model itself.
If you look at this question, the top answer explains the AIC quite well: https://stats.stackexchange.com/questions/232465/how-to-compare-models-on-the-basis-of-aic
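The empirical check described in the answer can be reproduced on a toy problem. The following is a minimal Python sketch, not lme4's optimizer: it fits a one-parameter Bernoulli model by plain gradient ascent, and the names `log_lik`, `fit`, and `aic`, the step size, the tolerance, and the data (30 successes out of 100) are all assumptions made for illustration. Once the fit has converged, raising the iteration cap leaves the AIC unchanged; only a cap small enough to stop the optimizer early changes the result.

```python
import math


def log_lik(theta, successes, n):
    """Bernoulli log-likelihood with success probability sigmoid(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return successes * math.log(p) + (n - successes) * math.log(1.0 - p)


def fit(successes, n, max_iter, step=0.01, tol=1e-10):
    """Gradient ascent on theta; returns (theta, converged)."""
    theta = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-theta))
        grad = successes - n * p  # derivative of log_lik with respect to theta
        if abs(grad) < tol:
            return theta, True
        theta += step * grad
    return theta, False  # iteration cap hit before convergence


def aic(theta, successes, n, k=1):
    """AIC = 2k - 2 * log-likelihood, with k estimated parameters."""
    return 2 * k - 2 * log_lik(theta, successes, n)


theta_mid, ok_mid = fit(30, 100, max_iter=1000)
theta_long, ok_long = fit(30, 100, max_iter=100000)
theta_short, ok_short = fit(30, 100, max_iter=5)

print("converged:", ok_mid, ok_long, ok_short)
print("AIC (1000 iters):  ", aic(theta_mid, 30, 100))
print("AIC (100000 iters):", aic(theta_long, 30, 100))
print("AIC (5 iters):     ", aic(theta_short, 30, 100))
```

The two converged fits stop at the same iterate, so their AIC values coincide; the run capped at 5 iterations stops short of the maximum and reports a different (worse) AIC, which is the situation a convergence warning signals.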

Solving an optimization problem bounded by conditional constraints

Basically, I have a dataset that contains 'weights' for 207 variables. Some variables are more important than others for determining the (binary) class variable, so their weights are larger. The weights are summed across all columns, giving a cumulative weight for each observation.
If this cumulative weight is higher than some threshold, the class variable is 1; otherwise it is 0. I have true labels for the class variable, so the problem is to minimize false positives.
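The decision rule described above can be written down in a few lines. This is only an illustrative Python sketch: the function names `predict` and `false_positives`, the toy data with 3 columns instead of 207, and the candidate thresholds are all assumptions, not part of the original setup.

```python
def predict(weights_row, threshold):
    """Class 1 if the cumulative weight of the row exceeds the threshold, else 0."""
    return 1 if sum(weights_row) > threshold else 0


def false_positives(rows, labels, threshold):
    """Count observations predicted as 1 whose true label is 0."""
    return sum(
        1
        for row, y in zip(rows, labels)
        if predict(row, threshold) == 1 and y == 0
    )


# Toy data: each row holds the per-variable weights of one observation.
rows = [
    [0.5, 0.2, 0.1],
    [0.1, 0.1, 0.1],
    [0.9, 0.4, 0.3],
    [0.3, 0.3, 0.3],
]
labels = [1, 0, 1, 0]

for threshold in (0.7, 1.0):
    print(threshold, false_positives(rows, labels, threshold))
```

An optimization method would search over the weights (and possibly the threshold) to drive this count down, usually subject to some limit on missed positives so that the trivial answer "never predict 1" is excluded.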
The thing is, to me it looks like an OR problem, as it's about finding optimal weights. However, I am not sure whether there is an OR method for such a problem; at least I have not heard of one. The question is: does anyone recognize this type of problem and can suggest some keywords for me to research?
Another option, of course, would be to predict the class with machine learning rather than deterministic methods, but I need to do it this way.
Thank you!
Are the variables discrete (integer numbers etc) or continuous (floating point numbers)?
If they are discrete, it sounds like the knapsack problem, which constraint solvers like OptaPlanner (see this training that builds a knapsack solver) excel at.
If they are continuous, look for an LP solver, like CPLEX.
Either way, you'll get much better results than with machine learning approaches, because neural nets et al. are great at pattern recognition use cases (image/voice recognition, prediction, categorization, ...), but consistently inferior for constraint optimization problems (like this one, I presume).

Imbalanced Dataset for Multi Label Classification

So I trained a deep neural network on a multi-label dataset I created (about 20000 samples). I switched softmax for sigmoid and tried to minimize (using the Adam optimizer):
tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=y_pred))
And I end up with this kind of prediction (pretty "constant"):
Prediction for Im1 : [ 0.59275776 0.08751075 0.37567005 0.1636796 0.42361438 0.08701646 0.38991812 0.54468459 0.34593087 0.82790571]
Prediction for Im2 : [ 0.52609032 0.07885984 0.45780018 0.04995904 0.32828355 0.07349177 0.35400775 0.36479294 0.30002621 0.84438241]
Prediction for Im3 : [ 0.58714485 0.03258472 0.3349618 0.03199361 0.54665488 0.02271551 0.43719986 0.54638696 0.20344526 0.88144571]
At first, I thought I just needed to find a threshold value for each class.
But I noticed that, for instance, among my 20000 samples, the 1st class appears about 10800 times, so a ratio of 0.54, and that is the value my prediction hovers around every time. So I think I need to find a way to tackle this "imbalanced dataset" issue.
I thought about reducing my dataset (undersampling) to have about the same number of occurrences for each class, but only 26 samples correspond to one of my classes... That would make me lose a lot of samples...
I read about oversampling and about penalizing the rare classes even more, but did not really understand how these work.
Can someone share some explanations of these methods, please?
In practice, are there TensorFlow functions that help with that?
Any other suggestions?
Thank you :)
PS: Neural Network for Imbalanced Multi-Class Multi-Label Classification This post raises the same problem but had no answer !
Well, having 10000 samples in one class and just 26 in a rare class will indeed be a problem.
However, what you experience, to me, seems more like "outputs don't even see the inputs" and thus the net just learns your output distribution.
To debug this, I would create a reduced set (just for this debugging purpose) with, say, 26 samples per class and then try to heavily overfit. If you get correct predictions, my thought is wrong. But if the net cannot even detect those undersampled, overfit samples, then it is indeed an architecture/implementation problem and not due to the skewed distribution (which you will then still need to fix, but it will not be as bad as your current results).
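Building such a reduced debugging set could look like the following minimal Python sketch. The helper name `balanced_debug_subset`, the fixed seed, and the toy label matrix are assumptions for illustration; `labels` is taken to be a list of 0/1 rows, one row per sample and one column per class.

```python
import random


def balanced_debug_subset(labels, per_class=26, seed=0):
    """Return sorted sample indices holding up to `per_class` positives per class."""
    rng = random.Random(seed)
    n_classes = len(labels[0])
    chosen = set()
    for c in range(n_classes):
        positives = [i for i, row in enumerate(labels) if row[c] == 1]
        rng.shuffle(positives)
        chosen.update(positives[:per_class])
    return sorted(chosen)


# Toy multi-label matrix: 100 samples, 3 classes with very different frequencies.
labels = [
    [1 if i % 2 == 0 else 0, 1 if i % 10 == 0 else 0, 1 if i < 3 else 0]
    for i in range(100)
]
subset = balanced_debug_subset(labels, per_class=5)
print(len(subset), subset)
```

Because samples carry several labels at once, a sample picked for one class can add positives to another, so the per-class counts in the subset are only approximately equal. For an overfitting test that is usually good enough.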
Your problem is not the class imbalance, but rather the lack of data. 26 samples is a very small dataset for practically any real machine learning task. A class imbalance could easily be handled by ensuring that each minibatch has at least one sample from every class (this leads to some samples being used much more frequently than others, but who cares).
However, with only 26 samples, this approach (and any other) will quickly lead to overfitting. This problem could be partly solved with some form of data augmentation, but there are still too few samples to construct something reasonable.
So, my suggestion will be to collect more data.
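On the question's point about penalizing rare classes more heavily: one common form of this is a per-class positive weight inside the sigmoid cross-entropy, so that a missed positive of a rare class costs more than a missed positive of a frequent one. The pure-Python sketch below illustrates the idea; the helper names `pos_weights` and `weighted_sigmoid_ce` and the negatives-to-positives ratio used as the weight are assumptions, not a recommendation from the answers above. TensorFlow provides `tf.nn.weighted_cross_entropy_with_logits`, which takes a `pos_weight` argument of this kind.

```python
import math


def pos_weights(labels):
    """Per-class weight = (#negatives / #positives); 1.0 if a class has no positives."""
    n = len(labels)
    weights = []
    for c in range(len(labels[0])):
        pos = sum(row[c] for row in labels)
        weights.append((n - pos) / pos if pos else 1.0)
    return weights


def weighted_sigmoid_ce(label, logit, pos_weight):
    """Sigmoid cross-entropy where the positive term is scaled by pos_weight."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(pos_weight * label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Toy label matrix: class 0 is balanced, class 1 is rare.
labels = [[1, 0], [1, 0], [0, 1], [0, 0]]
weights = pos_weights(labels)
print(weights)
print(weighted_sigmoid_ce(1, 0.0, weights[1]))
```

With only 26 positives for a class, such a weight becomes very large and makes overfitting to those few samples more likely, which is consistent with the advice to collect more data.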

How to determine which constraints or variable bounds are rendering a GAMS model infeasible?

The solve summary in my GAMS model (NLP) is returning the following:
**** SOLVER STATUS 1 Normal Completion
**** MODEL STATUS 19 Infeasible - No Solution
**** OBJECTIVE VALUE NA
The bounds on one of my variables are:
y.lo = 0, y.up = 0.15
If I change the bounds to:
y.lo = 0, y.up = 0.12
the model then converges and gives the following:
**** SOLVER STATUS 1 Normal Completion
**** MODEL STATUS 2 Locally Optimal
**** OBJECTIVE VALUE 66013164.0000
It turns out that the final variable level is
y.l = 0.12
How can it be that GAMS determined the model to be infeasible in the first case (upper bound = 0.15) even though the solution (0.12) was within the search space? (By the way, I am using the ANTIGONE solver.)
Additionally, are there any methodical ways to identify which constraints/variable bounds are causing the model to be infeasible?
In order to find this (seemingly illogical) error, I had to spend hours guessing and checking arbitrary details within the model with no rhyme or reason. There has to be a better way, right?
That issue is not GAMS's fault; it comes from the solver you are using. Have you tried CONOPT?
You can see the infeasible constraints in the lst file. Some equations should carry the (***INFES) mark.
Also, to solve your problem, I would try to provide the NLP solver an initial solution that is somehow close enough to the optimal one, or at least feasible.
I would also try to check the options of the solvers you are using to start the solution procedure with a feasible starting point.
Non-convex optimization is not easy.
I hope this helps.

Practical solver for convex QCQP?

I am working with a convex QCQP as the following:
Min e'Ie
z'Iz=n
[some linear equalities and inequalities that contain variables w,z, and e]
w>=0, z in [0,1]^n
So, apart from the objective, the problem has only one quadratic constraint, and some variables are nonnegative. The matrices of both quadratic forms are identity matrices and are therefore positive definite.
I can move the quadratic constraint to the objective but it must have the negative sign so the problem will be nonconvex:
min e'Ie-z'Iz
The size of the problem can be up to 10000 linear constraints, with 100 nonnegative variables and almost the same number of other variables.
The problem can be rewritten as an MIQP as well, as z_i can be binary, and z'Iz=n can be removed.
So far, I have been working with CPLEX via AIMMS for MIQP and it is very slow for this problem. Using the QCQP version of the problem with CPLEX, MINOS, SNOPT and CONOPT is hopeless as they either cannot find a solution, or the solution is not even close to an approximation that I know a priori.
Now I have three questions:
Do you know any method or technique to get rid of a quadratic constraint of this form without going to MIQP?
Is there any "good" solver for this QCQP? By good, I mean a solver that efficiently finds the global optimum in a reasonable time.
Do you think an SDP relaxation can be a solution to this problem? I have never solved an SDP problem in practice, so I do not know how efficient the SDP version can be. Any advice?
Thanks.