CBC Hangs After Finding Optimal Solution - coin-or-cbc

I am trying to use CBC to solve an MPS file. It frequently hangs for long periods and then exits without any message.
It also hangs even after it has found the optimal solution, and I don't know why it isn't returning to the prompt.
I've reinstalled the latest version (2.10) and tried on multiple Windows 10 machines.
Prompt inputs and outputs are below.
C:\my_directory>cbc
Welcome to the CBC MILP Solver
Version: Trunk (unstable)
Build Date: May 3 2019
CoinSolver takes input from arguments ( - switches to stdin)
Enter ? for list of commands or help
Coin:verbose 15
verbose was changed from 0 to 15
Coin:import SF_STD.mps
At line 8 NAME swolf_co
At line 9 ROWS
At line 179948 COLUMNS
At line 539250 RHS
At line 539271 BOUNDS
At line 539402 ENDATA
Problem swolf_co has 179937 rows, 191943 columns and 619924 elements
Coin0008I swolf_co read with 0 errors
Coin:stat
Presolve 18316 (-161621) rows, 19174 (-172769) columns and 138258 (-481666) elements
Statistics for presolved model
Original problem has 24 integers (24 of which binary)
Problem has 18316 rows, 19174 columns (2 with objective) and 138258 elements
There are 1 singletons with objective
Column breakdown:
19165 of type 0.0->inf, 6 of type 0.0->up, 0 of type lo->inf,
0 of type lo->up, 1 of type free, 0 of type fixed,
0 of type -inf->0.0, 2 of type -inf->up, 0 of type 0.0->1.0
Row breakdown:
9660 of type E 0.0, 0 of type E 1.0, 0 of type E -1.0,
1 of type E other, 6 of type G 0.0, 0 of type G 1.0,
1 of type G other, 8646 of type L 0.0, 0 of type L 1.0,
2 of type L other, 0 of type Range 0.0->1.0, 0 of type Range other,
0 of type Free
Coin:solv
Continuous objective value is 1.25864e+08 - 1.81 seconds
Cgl0002I 6 variables fixed
Cgl0004I processed model has 18298 rows, 19156 columns (0 integer (0 of which binary)) and 137182 elements
Cbc3007W No integer variables - nothing to do
Cuts at root node changed objective from 1.25864e+08 to -1.79769e+308
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
ZeroHalf was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: 125864301.49648696
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 905.83
Time (Wallclock seconds): 905.85
Everything up to the line Cuts at root node changed objective from 1.25864e+08 to -1.79769e+308 runs quickly, but after printing that line CBC runs for a very long time before it reports Result - Optimal solution found. Turning cuts off did not help: that specific line no longer appears, but the objective still goes from 1.25864e+08 to -1.06e15. The model then takes about an hour to get back to the actual objective value of 1.25864e+08, which is the integer optimal solution as well as the solution to the LP relaxation.
It solves in clp in several seconds. The mps file was written by glpk, and solves in several seconds in CPLEX.
Any guidance you could provide would be greatly appreciated.
I've tried multiple versions of cbc and I still get this long hang time for relatively simple problems. I've tried the .lp and .mps formats (clp seems to solve .lp faster), but nothing changes.

Related

Using Pandas and Numpy to search for conditions within binned data in 2 data frames

Python newbie here. Here's a simplified example of my problem. I have 2 pandas dataframes.
One dataframe lightbulb_df has data on whether a light is on or off and looks something like this:
Light_Time   Light On?
5790.76      0
5790.76      0
5790.771     1
5790.779     1
5790.779     1
5790.782     0
5790.783     1
5790.783     1
5790.784     0
Here the time is in seconds since the start of the day; 1 means the lightbulb is on and 0 means it is off.
The second dataframe sensor_df shows whether or not a sensor detected the lightbulb and has different time values and rates.
Sensor_Time   Sensor Detect?
5790.8        0
5790.9        0
5791.0        1
5791.1        1
5791.2        1
5791.3        0
Both dataframes are very large with 100,000s of rows. The lightbulb will turn on for a few minutes and then turn off, then back on, etc.
Using the .diff function, I was able to compare each row to its predecessor and depending on whether the result was 1 or -1 create a truth table with simplified on and off times and append it to lightbulb_df.
import pandas as pd

# use .diff() to compare each row to the previous row (current - previous)
lightbulb_df['light_diff'] = lightbulb_df['Light On?'].diff()
# the light on start times (first times when the light turns on)
# are when .diff is greater than 0 (1 - 0 = 1)
light_start = lightbulb_df.loc[lightbulb_df['light_diff'] > 0]
# the light off start times (first times when the light turns off)
# are when .diff is less than 0 (0 - 1 = -1)
light_off = lightbulb_df.loc[lightbulb_df['light_diff'] < 0]
# concatenate them into a single changed-state df
# that only captures the rows where the lightbulb changes state
lightbulb_changes = pd.concat((light_start, light_off)).sort_values(by=['Light_Time'])
So I end up with a dataframe of on start times, a dataframe of off start times, and a change state dataframe that looks like this.
Light_Time   Light On?   light_diff
5790.771     1           1
5790.782     0           -1
5790.783     1           1
5790.784     0           -1
Now my goal is to search the sensor_df dataframe during each of the changed state times (above 5790.771 to 5790.782 and 5790.783 to 5790.784) by 1 second intervals to see whether or not the sensor detected the lightbulb. So I want to end up with the number of seconds the lightbulb was on and the number of seconds the sensor detected the lightbulb for each of the many light on periods in the change state dataframe. I'm trying to get % correctly detected.
Whenever I try to plan this out, I end up using lots of nested for loops or while loops which I know will be really slow with 100,000s of rows of data. I thought about using the .cut function to divide up the dataframe into 1 second intervals. I made a for loop to cycle through each of the times in the changed state dataframe and then nested a while loop inside to loop through 1 second intervals but that seems like it would be really slow.
I know python has a lot of built in functions that could help but I'm having trouble knowing what to google to find the right one.
Any advice would be appreciated.
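One loop-free way to approach this is sketched below. It is only a sketch under stated assumptions: it swaps the .cut idea for pd.merge_asof, which tags every sensor sample with the most recent light state change, and it assumes the sensor samples at a roughly constant rate, so the fraction of detected samples approximates the fraction of detected seconds. The function name detection_rate_per_on_period and the period_id column are made up for illustration; the other column names come from the tables above.

```python
import pandas as pd


def detection_rate_per_on_period(lightbulb_df, sensor_df):
    """Percent of sensor samples that detected the light, per light-on period."""
    light = lightbulb_df.sort_values("Light_Time").reset_index(drop=True)
    state = light["Light On?"]
    # Keep only rows where the state differs from the previous row
    # (the first row is kept as the initial state).
    changes = light.loc[state.ne(state.shift()), ["Light_Time", "Light On?"]].copy()
    changes["period_id"] = range(len(changes))  # hypothetical helper column

    sensor = sensor_df.sort_values("Sensor_Time")
    # For each sensor sample, look up the latest state change at or before it.
    merged = pd.merge_asof(
        sensor, changes,
        left_on="Sensor_Time", right_on="Light_Time",
        direction="backward",
    )

    # Only samples taken while the light was on matter for the detection rate.
    on = merged[merged["Light On?"] == 1]
    result = on.groupby("period_id").agg(
        on_start=("Light_Time", "first"),
        samples=("Sensor Detect?", "size"),
        detected=("Sensor Detect?", "sum"),
    )
    result["pct_detected"] = 100 * result["detected"] / result["samples"]
    return result
```

If the sensor rate is known (0.1 s in the sample above), multiplying samples and detected by that interval would give approximate seconds rather than sample counts.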

How to deal with the error when using Gurobi with cvxpy :Unable to retrieve attribute 'BarIterCount'

I get the following error when using Gurobi with cvxpy: AttributeError: Unable to retrieve attribute 'BarIterCount'.
I have an integer programming problem, which I model with cvxpy and solve with Gurobi.
When the number of variables is small, the result is fine. Once the number of variables reaches about 43*13*6, the error occurs. I suspect it is caused by the scale of the problem, such that the Gurobi solver cannot estimate BarIterCount, which I take to be the maximum number of iterations needed.
So I wonder: is there any way to set the BarIterCount attribute of Gurobi manually through the cvxpy interface? Or is there another way to solve this problem?
Thanks for any suggestions you may provide.
The trace logs are as follows.
If my model is small, e.g. when I set the number that controls the scale of the model to 3, the program runs fine. The trace is:
Using license file D:\software\lib\site-packages\gurobipy\gurobi.lic
Restricted license - for non-production use only - expires 2022-01-13
Parameter OutputFlag unchanged
Value: 1 Min: 0 Max: 1 Default: 1
D:\software\lib\site-packages\cvxpy\reductions\solvers\solving_chain.py:326: DeprecationWarning: Deprecated, use Model.addMConstr() instead
solver_opts, problem._solver_cache)
Changed value of parameter QCPDual to 1
Prev: 0 Min: 0 Max: 1 Default: 0
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads
Optimize a model with 126 rows, 370 columns and 2689 nonzeros
Model fingerprint: 0x70d49530
Variable types: 0 continuous, 370 integer (369 binary)
Coefficient statistics:
Matrix range [1e+00, 7e+00]
Objective range [1e+00, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 6e+00]
Found heuristic solution: objective 7.0000000
Presolve removed 4 rows and 90 columns
Presolve time: 0.01s
Presolved: 122 rows, 280 columns, 1882 nonzeros
Variable types: 0 continuous, 280 integer (279 binary)
Root relaxation: objective 4.307692e+00, 216 iterations, 0.00 seconds
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 4.30769 0 49 7.00000 4.30769 38.5% - 0s
H 0 0 6.0000000 4.30769 28.2% - 0s
0 0 5.00000 0 35 6.00000 5.00000 16.7% - 0s
0 0 5.00000 0 37 6.00000 5.00000 16.7% - 0s
0 0 5.00000 0 7 6.00000 5.00000 16.7% - 0s
Cutting planes:
Gomory: 4
Cover: 9
MIR: 4
StrongCG: 1
GUB cover: 9
Zero half: 1
RLT: 1
Explored 1 nodes (849 simplex iterations) in 0.12 seconds
Thread count was 32 (of 32 available processors)
Solution count 2: 6 7
Optimal solution found (tolerance 1.00e-04)
Best objective 6.000000000000e+00, best bound 6.000000000000e+00, gap 0.0000%
If the number is 6, the error occurs:
-------------------------------------------------------
Using license file D:\software\lib\site-packages\gurobipy\gurobi.lic
Restricted license - for non-production use only - expires 2022-01-13
Parameter OutputFlag unchanged
Value: 1 Min: 0 Max: 1 Default: 1
D:\software\lib\site-packages\cvxpy\reductions\solvers\solving_chain.py:326: DeprecationWarning: Deprecated, use Model.addMConstr() instead
solver_opts, problem._solver_cache)
Changed value of parameter QCPDual to 1
Prev: 0 Min: 0 Max: 1 Default: 0
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads
Traceback (most recent call last):
File "model.py", line 274, in <module>
problem.solve(solver=cp.GUROBI,verbose=True)
File "D:\software\lib\site-packages\cvxpy\problems\problem.py", line 396, in solve
return solve_func(self, *args, **kwargs)
File "D:\software\lib\site-packages\cvxpy\problems\problem.py", line 754, in _solve
self.unpack_results(solution, solving_chain, inverse_data)
File "D:\software\lib\site-packages\cvxpy\problems\problem.py", line 1058, in unpack_results
solution = chain.invert(solution, inverse_data)
File "D:\software\lib\site-packages\cvxpy\reductions\chain.py", line 79, in invert
solution = r.invert(solution, inv)
File "D:\software\lib\site-packages\cvxpy\reductions\solvers\qp_solvers\gurobi_qpif.py", line 59, in invert
s.NUM_ITERS: model.BarIterCount,
File "src\gurobipy\model.pxi", line 343, in gurobipy.gurobipy.Model.__getattr__
File "src\gurobipy\model.pxi", line 1842, in gurobipy.gurobipy.Model.getAttr
File "src\gurobipy\attrutil.pxi", line 100, in gurobipy.gurobipy.__getattr
AttributeError: Unable to retrieve attribute 'BarIterCount'
Hopefully this provides more hints toward a solution.
BarIterCount is the number of barrier iterations performed to solve an LP. This is not a limit on the number of iterations and it should only be queried when the current optimization process has been finished. You cannot set this attribute either, of course.
To actually limit the number of iterations the barrier algorithm is allowed to take, you can use the parameter BarIterLimit.
Please inspect your log file for further information about the solver's behavior.

How to proceed with my Spark / Scala project

I am new to Spark and Scala. I am working on a Scala project where I access data from SQL Server.
A table in SQL Server holds information about clothes. itemCode is the primary key, and there are several attributes with Boolean values 0/1 (Designer, Exclusive, Handloom) plus several other columns describing the product.
Code Designer Exclusive Handloom
A 1 0 1
B 1 0 0
C 0 0 1
D 0 1 0
E 0 1 0
F 1 0 1
G 0 1 0
H 0 0 0
I 1 1 1
J 1 1 1
K 0 0 1
L 0 1 0
M 0 1 0
N 1 1 0
O 0 1 1
P 1 1 0
and the list continues.
I have to select a collection of 32 items out of 320 items that has AT LEAST:
8 Designer, 8 Exclusive, 8 Handloom, 8 WeddingStyle, 8 PartyStyle,
8 Silk, 8 Georgette
I solved the problem in the MS Excel Solver (it uses a gradient descent algorithm) by adding an extra column and using the SUMPRODUCT function between the added column and the required columns. It took around 1 minute 30 seconds to solve.
The problem can also be solved by writing an SQL query with 32 joins (far too many). For example, to select 6 items out of the 16 above with at least 4 Designer, 4 Exclusive, and 4 Handloom items, the query would look like the one in my post: MYSQL - Select rows fulfilling many count conditions
In production I have to fetch 32 rows this way, so my question is how to proceed with the project.
I am working in Scala IDE for Eclipse and have added Spark MLlib. I have fetched the data via JDBC, stored it in a dataframe, and then created a temporary table:
dataFrame.registerTempTable("Data")
MLlib's optimization package has an optimizer class that uses gradient descent (like Excel Solver does) to solve problems, but it is meant for machine learning and takes training data as input.
I don't understand how to proceed with my project. Can I use MLlib, or should I use a simpler version of the SQL with Spark SQL? I need serious help.
I'd recommend using DataFrames (https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#creating-dataframes) rather than MLlib.
I solved this problem through linear programming. I now use the lpsolver library for Java in my Scala project, and it gives almost the same result as Excel Solver.
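To make the constraint structure concrete, here is a small sketch of the reduced example from the question (6 items out of the 16 listed, with at least 4 Designer, 4 Exclusive, and 4 Handloom). It is written in Python rather than Scala and uses plain exhaustive search, not linear programming: that is only workable because 16-choose-6 is tiny, and it would not scale to 32 out of 320, where an LP/ILP solver as described above is the practical route. The function name find_selection is made up for illustration.

```python
from itertools import combinations

# (Designer, Exclusive, Handloom) flags copied from the table in the question.
ITEMS = {
    "A": (1, 0, 1), "B": (1, 0, 0), "C": (0, 0, 1), "D": (0, 1, 0),
    "E": (0, 1, 0), "F": (1, 0, 1), "G": (0, 1, 0), "H": (0, 0, 0),
    "I": (1, 1, 1), "J": (1, 1, 1), "K": (0, 0, 1), "L": (0, 1, 0),
    "M": (0, 1, 0), "N": (1, 1, 0), "O": (0, 1, 1), "P": (1, 1, 0),
}


def find_selection(items, k, minimums):
    """Return the first k-item combination whose per-attribute totals
    all meet the given minimums, or None if no combination does."""
    for combo in combinations(sorted(items), k):
        totals = [sum(items[code][j] for code in combo)
                  for j in range(len(minimums))]
        if all(t >= m for t, m in zip(totals, minimums)):
            return combo
    return None


selection = find_selection(ITEMS, k=6, minimums=(4, 4, 4))
print(selection)
```

The same conditions map directly onto a 0/1 program: one binary variable per item, the variables summing to the selection size, and one "sum of flagged variables >= minimum" constraint per attribute.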

How to plot a linegraph in SPSS with respect to the data?

Hi all,
Above you'll see a line-graph plotted with SPSS. I want to improve this line-graph according to its data. Meaning that some elements are not presented correctly:
(1) I deliberately adjusted the scaling on the Y-axis from -1 to 10, in order to notice the breaks (i.e. missing values) in the line graph. Otherwise you'll not notice the breaks, as it will overlap with the bottom-line of the graph. Is it possible to notice the breaks, but with a scaling of 0 to 10 (in SPSS)? > SOLVED
(2) On the X-axis, points 14 and 15 are missing, hence the break. However, the line graph shows an upward trend just after point 13 and a downward trend just before point 16. Is it possible to adjust the line graph (in SPSS) so that these interpolated trends are removed?
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Time_Period_Hours
MEAN(MT)[name="MEAN_MT"] MISSING=VARIABLEWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: Time_Period_Hours=col(source(s), name("Time_Period_Hours"), unit.category())
DATA: MEAN_MT=col(source(s), name("MEAN_MT"))
GUIDE: axis(dim(2), delta(1))
SCALE: linear(dim(2), min(-0.5), max(9))
ELEMENT: line(position(Time_Period_Hours*MEAN_MT))
ELEMENT: point(position(Time_Period_Hours*MEAN_MT), color(color.black),
size(size."3px"))
END GPL.
Here is an example. For the line element you need to specify the option missing.gap(). I thought just deleting missing.wings() from the default code would work, but maybe it is an internal default. You may want to consider changing Time_Period_Hours to a scale variable and doing the aggregation outside of GGRAPH. Also, making the Y axis scale in your example go all the way up to 9 seems a bit superfluous.
DATA LIST FREE / Time_Period_Hours MT.
BEGIN DATA
1 1
2 0
3 0
4 0
5 1
6 0
7 0
8 0
9 0
10 0
11 .
12 0
13 0
14 .
15 .
16 1
17 0
18 0
19 0
20 .
21 0
END DATA.
FORMATS Time_Period_Hours MT (F2.0).
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=Time_Period_Hours
MEAN(MT)[name="MEAN_MT"] MISSING=VARIABLEWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: Time_Period_Hours=col(source(s), name("Time_Period_Hours"), unit.category())
DATA: MEAN_MT=col(source(s), name("MEAN_MT"))
GUIDE: axis(dim(2), delta(1))
SCALE: linear(dim(2), min(-0.5), max(9))
ELEMENT: line(position(Time_Period_Hours*MEAN_MT), missing.gap())
ELEMENT: point(position(Time_Period_Hours*MEAN_MT), color(color.black),
size(size."3px"))
END GPL.

Smalltalk dictionary as calculator

I'm working on a homework assignment that asks us to create a type of Units class that can keep track of units and perform basic arithmetic on them. The problem description has this bit, which I don't completely understand:
Probably the easiest way to keep track of the units is to give Units a dictionary that maps symbols to integers. If you are dividing by a unit then it has a negative value in the dictionary. You add two Units together by adding the value together for each symbol in the dictionary. When it is zero, throw the symbol away!
For reference, this is also included in the description:
[...] you could write an expression 3 elephants / (1 sec sec) and it would return the right thing.
Could someone shed some light here? How can I use a dictionary to map these types of units? Am I making this way harder than it needs to be?
It sounds like your teacher is giving you a hint about how to wind up with the proper units at the end of the calculation.
When you're parsing the problem, as you encounter items that are obviously units, enter them into a dictionary. The dictionary would consist of a number and a string (the supposed "unit"). Then you'd use a set of rules to increase or decrease the integer count. The resultant integer value would help you to output the units correctly.
A count of 1 indicates it's a unit in the output.
A count of -1 indicates its inverse is a unit in the output.
A count of 0 indicates that it doesn't appear in the output at all.
Similarly, a count of 2 would indicate that its square appears as a unit in the output.
To wit:
5 Hippo + 10 Hippo = 15 Hippos
Parsing: Dictionary:
-------- -----------
5 Hippo Hippo:1
+
10 Hippo Hippo:1 (previous operation was addition or subtraction, and Hippo is already in the dictionary, so the count is unchanged)
But consider this problem:
5 Hippo * 5 sec/Hippo = 25 sec
Parsing: Dictionary:
5 Hippo Hippo:1
*
5 sec Hippo:1, sec:1
/
Hippo Hippo:0, sec:1 (previous operation was division by Hippo, so decrement Hippo count)
Or perhaps:
10 feet / 5 sec = 2 feet/sec
Parsing: Dictionary:
10 feet feet:1
/
5 sec feet:1, sec:-1 (divided by sec, and sec is not in the dictionary, so sec is implicitly 0; 0 + (-1) = -1)
In the example above, feet will be on the top of the bar because its value is 1, and sec will be below the bar because its value is -1. If its value had been -2, the result would have been feet/(sec*sec), i.e. feet/(sec squared).
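The bookkeeping described above can be sketched in a few lines. This is written in Python rather than Smalltalk, purely to illustrate the dictionary idea; the class name Quantity and its methods are made up for the example and are not taken from the assignment.

```python
class Quantity:
    """A number plus a dictionary mapping unit symbols to integer exponents."""

    def __init__(self, value, units=None):
        self.value = value
        # "When it is zero, throw the symbol away!"
        self.units = {u: e for u, e in (units or {}).items() if e != 0}

    def _combine(self, other, sign):
        # Add (multiplication) or subtract (division) the exponents per symbol.
        merged = dict(self.units)
        for unit, exponent in other.units.items():
            merged[unit] = merged.get(unit, 0) + sign * exponent
        return merged

    def __mul__(self, other):
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, self._combine(other, -1))

    def __add__(self, other):
        # Addition leaves the dictionary unchanged but requires matching units.
        if self.units != other.units:
            raise ValueError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.units)


# 5 Hippo * 5 sec/Hippo: the Hippo exponents cancel and the symbol is dropped.
result = Quantity(5, {"Hippo": 1}) * Quantity(5, {"sec": 1, "Hippo": -1})
print(result.value, result.units)
```

Dropping zero exponents in the constructor keeps every operation from having to clean up after itself, which matches the hint in the problem description.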