Alternate Construction Heuristic and Local Search phases - OptaPlanner

I have a use case based on the MeetingScheduling example.
The results are fine.
The scheduling begins with a Construction Heuristic phase.
Then there is a Local Search phase.
The CH phase reduces the hard and medium constraint penalties, while the LS phase seems to reduce the soft constraint penalties.
I found that when I re-run the scheduling, the CH phase reduces the hard and medium constraint penalties again.
So, can we configure the solver to alternate CH and LS phases several times?
The current solver config:
<?xml version="1.0" encoding="UTF-8"?>
<solver>
  <solutionClass>org.optaplanner.examples.meetings.domain.MeetingSchedule</solutionClass>
  <entityClass>org.optaplanner.examples.meetings.domain.Meeting</entityClass>
  <scoreDirectorFactory>
    <scoreDrl>org/optaplanner/examples/meetings/solver/meetingsScoreRules.drl</scoreDrl>
  </scoreDirectorFactory>
  <termination>
    <minutesSpentLimit>20</minutesSpentLimit>
  </termination>
</solver>

This should work:
<solver>
  ...
  <constructionHeuristic/>
  <localSearch>
    <termination>...stepCountLimit or calculateCountLimit?...</termination>
  </localSearch>
  <constructionHeuristic/>
  <localSearch>
    <termination>...stepCountLimit or calculateCountLimit?...</termination>
  </localSearch>
  <constructionHeuristic/>
  <localSearch>
    <termination>...stepCountLimit or calculateCountLimit?...</termination>
  </localSearch>
</solver>
And with the programmatic API you can make the number of phases dynamic, up to n.
That being said, this is probably a suboptimal solution. The right solution would be reheating (not yet supported).

Related

CPLEX warm start with altered objective function coefficient

I am optimizing a MILP model in CPLEX (Python interface) that takes a long time to solve. Sometimes I need to do consecutive runs if my time limit runs out. In order to continue optimizing with a solution from a previous, unfinished run, I usually provide the .sol file as a warm start.
Now I have a change in the objective function coefficients. The model's constraints and variables stay the same. Is it possible to provide a solution from the 'old', already optimized model to the model with the revised coefficients? Will CPLEX find the optimal solution of the new model faster than just starting fresh, regardless of whether it is in the same 'range' as the old solution? And can I provide the .sol file for this as usual, or should I use an .mst file?
On a related note, I am finding that when I use a previous solution as a warm start, CPLEX does use the best integer value found previously, but often starts with a higher best bound. So the initial gap is higher than what had already been reached previously. Is there a method to overcome this, possibly speeding up the run?
You can warm start with a .sol file, but also through the APIs.
An example of a warm start through the API is in https://www.linkedin.com/pulse/making-optimization-simple-python-alex-fleischer/
from docplex.mp.model import Model
mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)
warmstart=mdl.new_solution()
warmstart.add_var_value(nbbus40,8)
warmstart.add_var_value(nbbus30,0)
mdl.add_mip_start(warmstart)
sol=mdl.solve(log_output=True)
for v in mdl.iter_integer_vars():
    print(v, " = ", v.solution_value)
Sometimes a warm start helps, sometimes it does not. A fixed start sometimes helps too (the search space is smaller, so the search can be faster).
You say you are using a .sol file as a MIP start, but there is a dedicated format for MIP starts: the MST format in .mst files. If you are using DOcplex, from a solution produced by Model.solve() you can export an MST file using export_as_mst, which takes a WriteLevel argument.
This enumerated value controls what is written in the MST file: by default, only discrete values are used in the MIP start, which avoids precision issues.
This is one of the reasons you should prefer the MST format for MIP starts.
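For example, continuing the bus model above, a minimal sketch of exporting the solution as an MST file (the exact export_as_mst signature and WriteLevel member names may differ across DOcplex versions, so check the enum in docplex.mp.constants):
from docplex.mp.model import Model
from docplex.mp.constants import WriteLevel

mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)

sol = mdl.solve(log_output=True)
# Write only the discrete variable values into the MST file to avoid precision issues.
sol.export_as_mst(path='buses.mst', write_level=WriteLevel.DiscreteVars)
The resulting buses.mst can then be given to the model with the revised coefficients as its MIP start.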

How to optimize SpaCy pipe for NER only (using an existing model, no training)

I am looking to use SpaCy v3 to extract named entities from a large list of sentences. What I have works, but it seems slower than it should be, and before investing in more machines, I'd like to know if I am doing more work than I need to in the pipe.
I've used nltk to parse everything into sentences as an iterator, then process these using "pipe" to get the named entities. All of this appears to work well, and Python appears to be hitting every CPU core on my machine fairly heavily, which is good.
nlp = spacy.load("en_core_web_trf")
for (doc, context) in nlp.pipe(lines, as_tuples=True, batch_size=1000):
    for ent in doc.ents:
        pass  # handle each entity
I understand that I can use nlp.disable_pipes to disable certain elements. Is there anything I can disable that won't impact accuracy and that isn't required for NER?
For NER only with the transformer model en_core_web_trf, you can disable ["tagger", "parser", "attribute_ruler", "lemmatizer"].
If you want to use a non-transformer model like en_core_web_lg (much faster but slightly lower accuracy), you can disable ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer"] and use nlp.pipe(n_process=-1) for multiprocessing on all CPUs (or n_process=N to restrict to N CPUs).
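For example, a minimal sketch of the transformer setup (the sample sentences are placeholders; check nlp.pipe_names to confirm which components your model actually contains):
import spacy

# Keep only what NER needs: "transformer" and "ner" stay enabled.
nlp = spacy.load("en_core_web_trf",
                 disable=["tagger", "parser", "attribute_ruler", "lemmatizer"])

lines = [("Apple is looking at buying a U.K. startup.", 0),
         ("San Francisco considers banning sidewalk delivery robots.", 1)]

for doc, context in nlp.pipe(lines, as_tuples=True, batch_size=1000):
    for ent in doc.ents:
        print(context, ent.text, ent.label_)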

What is actually meant by parallel_iterations in tfp.mcmc.sample_chain?

I am not able to understand what the parameter parallel_iterations stands for when sampling multiple chains during MCMC.
The documentation for mcmc.sample_chain() doesn't give many details; it just says that
The parallel iterations are the number of iterations allowed to run in parallel. It must be a positive integer.
I am running a NUTS sampler with multiple chains while specifying parallel_iterations=8.
Does it mean that the chains are strictly run in parallel? Is the parallel execution dependent on multi-core support? If so, what is a good value (based on the number of cores) to set parallel_iterations? Should I naively set it to some higher value?
TensorFlow can unroll iterations of while loops to execute in parallel when some parts of the data flow (e.g. the iteration condition) can be computed faster than other parts. If you don't have a special requirement (e.g. reproducibility with legacy stateful samplers), leave it at the default.
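For reference, a minimal sketch of where the argument goes (a toy standard-normal target with 8 chains; the step size, sample counts, and parallel_iterations value are arbitrary):
import tensorflow as tf
import tensorflow_probability as tfp

target = tfp.distributions.Normal(loc=0., scale=1.)

@tf.function
def run_chain():
    return tfp.mcmc.sample_chain(
        num_results=500,
        num_burnin_steps=200,
        current_state=tf.zeros([8]),  # 8 chains run as one batched computation
        kernel=tfp.mcmc.NoUTurnSampler(target.log_prob, step_size=0.1),
        parallel_iterations=8,        # max while-loop iterations TF may dispatch concurrently
        trace_fn=None)

samples = run_chain()
Note that the chains themselves are vectorized through the batch dimension of current_state; parallel_iterations only affects how the underlying while loop is scheduled.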

Can we Ignore unnecessary classes in the Tensorflow object detection API by only omitting labels in pbtxt label map file?

So I am trying to create custom datasets for object detection using the Tensorflow Object Detection API. When working with open-source datasets, the annotation files I have come across are PASCAL VOC XMLs or JSONs. These contain a list of labels for each class, for example:
<annotation>
  <folder>open_images_volume</folder>
  <filename>0d2471ff2d033ccd.jpg</filename>
  <path>/mnt/open_images_volume/0d2471ff2d033ccd.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>1024</width>
    <height>1024</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>Chair</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>8</xmin>
      <ymin>670</ymin>
      <xmax>409</xmax>
      <ymax>1020</ymax>
    </bndbox>
  </object>
  <object>
    <name>Table</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>232</xmin>
      <ymin>654</ymin>
      <xmax>555</xmax>
      <ymax>1020</ymax>
    </bndbox>
  </object>
</annotation>
Here the annotation file describes two classes, Table and Chair. I am only interested in detecting chairs, which is why the pbtxt file I have generated is simply:
item {
  id: 1
  display_name: "Chair"
}
My question is: will the model train only on the annotations of the class "Chair", because that's what I have defined via label_map.pbtxt, or do I need to manually scrape all the annotation files and remove the other bounding-box coordinates through regex or xmltree to make sure the additional bounding boxes do not interfere with training? In other words, is it possible to select only certain classes for training through the TF API even if the annotation files contain additional classes, or is it necessary to clean up the entire dataset and manually remove unnecessary class labels? Will it affect training in any way?
You can use a .pbtxt that only has the classes you need to train on, and you don't have to change the XMLs.
Also, make sure to set num_classes to your number of classes in the training pipeline config.

Finding Optimal Parameters In A "Black Box" System

I'm developing machine learning algorithms which classify images based on training data.
During the image preprocessing stages, there are several parameters which I can modify that affect the data I feed my algorithms (for example, I can change the Hessian Threshold when extracting SURF features). So the flow thus far looks like:
[param1, param2, param3...] => [black box] => accuracy %
My problem is: with so many parameters at my disposal, how can I systematically pick values which give me optimized results/accuracy? A naive approach is to run i nested for-loops (assuming i parameters) and just iterate through all parameter combinations, but if it takes 5 minutes to calculate an accuracy from my "black box" system, this would take a long, long time.
This being said, are there any algorithms or techniques which can search for optimal parameters in a black box system? I was thinking of taking a course in Discrete Optimization but I'm not sure if that would be the best use of my time.
Thank you for your time and help!
Edit (to answer comments):
I have 5-8 parameters. Each parameter has its own range. One parameter can be 0-1000 (integer), while another can be 0 to 1 (real number). Nothing is stopping me from multithreading the black box evaluation.
Also, there are some parts of the black box that have some randomness to them. For example, one stage uses k-means clustering, so the cluster centers may change from one black-box evaluation to the next. I run k-means several times to (hopefully) avoid local optima. In addition, I evaluate the black box multiple times and take the median accuracy in order to further mitigate randomness and outliers.
As a partial solution, a grid search of moderate resolution and range can be recursively repeated in the areas where the n parameters result in the optimal values.
Each n-dimensional result from each step would be used as the starting point for the next iteration.
The key is that for each iteration the resolution in absolute terms is kept constant (i.e. the iteration period stays constant) while the range is decreased, so as to reduce the pitch/granular step size.
I'd call it a 'contracting mesh' :)
Keep in mind that while it avoids full brute-force complexity, it only reaches exhaustive resolution in the final iteration (this is what defines the final iteration).
Also, the outlined process is only exhaustive on a subset of the points that may or may not include the global minimum, i.e. it could end up in a local minimum.
(You can always chase your tail, though, by offsetting the initial grid by some sub-initial-resolution amount and comparing results...)
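A rough illustrative sketch of that contracting-mesh idea (black_box here is a made-up stand-in for your pipeline, and the ranges, grid size, and shrink factor are arbitrary):
import itertools
import numpy as np

def black_box(params):
    # Stand-in for the real evaluation; returns an accuracy-like score to maximize.
    x, y = params
    return -(x - 320.0) ** 2 - (y - 0.7) ** 2

def contracting_mesh(ranges, points_per_dim=5, iterations=4, shrink=0.5):
    best_params, best_score = None, float("-inf")
    for _ in range(iterations):
        # Constant number of grid points per dimension at every iteration.
        axes = [np.linspace(lo, hi, points_per_dim) for lo, hi in ranges]
        for candidate in itertools.product(*axes):
            score = black_box(candidate)
            if score > best_score:
                best_params, best_score = candidate, score
        # Shrink each range around the best point found so far, staying inside the old bounds.
        ranges = [(max(lo, b - shrink * (hi - lo) / 2), min(hi, b + shrink * (hi - lo) / 2))
                  for (lo, hi), b in zip(ranges, best_params)]
    return best_params, best_score

print(contracting_mesh([(0.0, 1000.0), (0.0, 1.0)]))
In practice you would parallelize the inner evaluation loop, since you mentioned nothing stops you from multithreading the black-box evaluation.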
Have fun!