Reduce keeps restarting from 0% without any failure - Hive

I am running a MapReduce job but the reduce task always restarts at 0% without any failure. I can't find anything related to this on the web. Any help?
2016-06-02 07:19:33,992 Stage-1 map = 100%, reduce = 21%, Cumulative CPU 36280.53 sec
2016-06-02 07:19:49,351 Stage-1 map = 100%, reduce = 22%, Cumulative CPU 36288.54 sec
2016-06-02 07:20:29,331 Stage-1 map = 100%, reduce = 23%, Cumulative CPU 36309.57 sec
2016-06-02 07:20:41,626 Stage-1 map = 100%, reduce = 24%, Cumulative CPU 36316.05 sec
2016-06-02 07:20:50,847 Stage-1 map = 100%, reduce = 25%, Cumulative CPU 36320.83 sec
2016-06-02 07:21:09,279 Stage-1 map = 100%, reduce = 27%, Cumulative CPU 36330.34 sec
2016-06-02 07:21:21,598 Stage-1 map = 100%, reduce = 28%, Cumulative CPU 36336.66 sec
2016-06-02 07:21:33,898 Stage-1 map = 100%, reduce = 29%, Cumulative CPU 36342.98 sec
2016-06-02 07:21:52,330 Stage-1 map = 100%, reduce = 30%, Cumulative CPU 36352.47 sec
2016-06-02 07:22:10,764 Stage-1 map = 100%, reduce = 31%, Cumulative CPU 36361.88 sec
2016-06-02 07:22:19,996 Stage-1 map = 100%, reduce = 32%, Cumulative CPU 36366.42 sec
2016-06-02 07:22:28,216 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 36053.06 sec
2016-06-02 07:22:40,576 Stage-1 map = 100%, reduce = 1%, Cumulative CPU 36062.98 sec
2016-06-02 07:22:49,795 Stage-1 map = 100%, reduce = 2%, Cumulative CPU 36070.63 sec
2016-06-02 07:23:14,373 Stage-1 map = 100%, reduce = 3%, Cumulative CPU 36086.48 sec
2016-06-02 07:23:54,361 Stage-1 map = 100%, reduce = 4%, Cumulative CPU 36108.05 sec

Related

CPLEX OPL : Overflow occurred, please use oplrun -profile

I am running a fairly large model in OPL: it has 576723 constraints, 1132515 variables (3855 of them binary), and 27150711 nonzero coefficients.
At about 12 minutes the optimisation stops; it reports 1 solution but displays no solution. In the profiler tab I get the message "Overflow occurred, please use oplrun -profile".
The engine log looks as below (updated on 24th Sep):
Found incumbent of value 0.000000 after 0.02 sec. (30.57 ticks)
Presolve has eliminated 65039 rows and 117138 columns...
Presolve has improved bounds 1277962 times...
Aggregator has done 20701 substitutions...
Aggregator has done 42701 substitutions...
Aggregator has done 65901 substitutions...
Aggregator has done 89601 substitutions...
Aggregator has done 114601 substitutions...
Aggregator has done 141901 substitutions...
Aggregator has done 172001 substitutions...
Aggregator has done 205101 substitutions...
Aggregator has done 242201 substitutions...
Aggregator has done 285501 substitutions...
Aggregator has done 339801 substitutions...
Aggregator has done 425001 substitutions...
Tried aggregator 2 times.
MIP Presolve eliminated 65049 rows and 119516 columns.
MIP Presolve modified 3304560 coefficients.
Aggregator did 505533 substitutions.
Reduced MIP has 6138 rows, 507466 columns, and 15507869 nonzeros.
Reduced MIP has 2761 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 52.98 sec. (140577.29 ticks)
Tried aggregator 1 time.
Reduced MIP has 6138 rows, 507466 columns, and 15507869 nonzeros.
Reduced MIP has 2761 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 4.59 sec. (4115.32 ticks)
Probing time = 0.33 sec. (193.08 ticks)
Clique table members: 674.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 16 threads.
Root relaxation solution time = 5983.52 sec. (4525135.08 ticks)
        Nodes                                             Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap
*     0+    0                            0.0000     4585.0158              ---
      0     0     1414.4727   839        0.0000     1414.4727    74713    ---
      0     0        cutoff              0.0000                5409203    ---
Elapsed time = 19950.47 sec. (18809991.19 ticks, tree = 0.01 MB, solutions = 1)
Clique cuts applied: 2
Cover cuts applied: 57
Implied bound cuts applied: 91
Flow cuts applied: 121
Mixed integer rounding cuts applied: 236
Gomory fractional cuts applied: 4
Root node processing (before b&c):
Real time = 19950.63 sec. (18810086.10 ticks)
Parallel b&c, 16 threads:
Real time = 0.00 sec. (0.00 ticks)
Sync time (average) = 0.00 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 19950.63 sec. (18810086.10 ticks)
<<< solve
OBJECTIVE: 0
<<< post process
<<< done
Profiler Report
Time PeakMemory SelfTime LocalMem Count Nodes Description
20,190.282 100% 9.902 G 100% 0.753 0% 879.507 M 9% 1 126 TOTAL
0.000 0% 0 B 0% 0.000 0% 256 B 0% 1 1 READING MODEL DEFINITION Ashes200_data
38.626 0% 840.113 M 8% 0.128 0% 721.418 M 7% 1 97 LOADING MODEL Ashes200_data-0000025C59804DD8
7.277 0% 103.191 M 1% 2.750 0% 84.547 M 1% 1 52 LOADING DATA D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data 3yr.dat
0.005 0% 28 K 0% 0.005 0% 400 B 0% 1 1 INIT TimePeriods at 13:1-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.003 0% 8 K 0% 0.003 0% 54.047 K 0% 1 1 INIT PitBlocks at 14:1-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 16 K 0% 0.001 0% 35.641 K 0% 1 1 INIT DumpBlocks at 15:1-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 576 B 0% 1 1 INIT Stockpiles at 17:1-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 424 B 0% 1 1 INIT Plants at 19:1-21 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.329 0% 22.695 M 0% 0.329 0% 18.362 M 0% 1 1 INIT Pathid at 21:1-22 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 904 B 0% 1 1 INIT AverageGrade at 48:1-37 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 864 B 0% 1 1 INIT DensityGradeBins at 49:1-42 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 8 K 0% 0.002 0% 5.531 K 0% 1 1 INIT grade at 26:1-30 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 5.516 K 0% 1 1 INIT oreTons at 27:1-32 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 5.562 K 0% 1 1 INIT density at 28:1-32 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 5.523 K 0% 1 1 INIT wasteVolume at 29:1-36 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.068 0% 0 B 0% 0.068 0% 5.523 K 0% 1 1 INIT totalVolume at 30:1-36 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 8 K 0% 0.002 0% 3.773 K 0% 1 1 INIT dumpVolume at 32:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 872 B 0% 1 1 INIT resourceMaxCap at 35:1-40 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 840 B 0% 1 1 INIT resourceMinCap at 36:1-40 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.013 0% 0 B 0% 0.013 0% 1.484 K 0% 1 1 INIT processMinCap at 37:1-46 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 1.516 K 0% 1 1 INIT processMaxCap at 38:1-46 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 1.477 K 0% 1 1 INIT GradeMin at 39:1-42 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 936 B 0% 1 1 INIT SellPrice at 41:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 840 B 0% 1 1 INIT wasteMiningCost at 42:1-41 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 840 B 0% 1 1 INIT coalMiningCost at 43:1-40 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 840 B 0% 1 1 INIT washCost at 44:1-34 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 840 B 0% 1 1 INIT HaulageCost at 45:1-37 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 848 B 0% 1 1 INIT StockPileRehandlingCost at 46:1-49 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 128 B 0% 1 1 INIT SwellFactor at 52:1-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 56 K 0% 0.001 0% 2.031 K 0% 1 1 INIT StockPileMaxCap at 56:1-52 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 2.094 K 0% 1 1 INIT StockPileMinCap at 55:1-52 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 128 B 0% 1 1 INIT DisountRate at 58:1-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 128 B 0% 1 1 INIT DumpCapacity at 60:1-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.323 0% 0 B 0% 0.323 0% 130.461 K 0% 1 2 INIT PitBlocksType at 287:1-27 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 976 B 0% 1 1 INIT ijk at 278:1-284:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.003 0% 0 B 0% 0.003 0% 57.203 K 0% 1 2 INIT DumpBlocksType at 273:1-34 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 872 B 0% 1 1 INIT blockType at 263:1-268:3 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.273 0% 48 K 0% 0.273 0% 90.812 K 0% 1 2 INIT PitLagInfoXYB at 79:1-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 872 B 0% 1 1 INIT xyz at 64:1-69:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.172 0% 0 B 0% 0.172 0% 55.266 K 0% 1 1 INIT DumpLagInfoXYB at 78:1-26 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.165 0% 0 B 0% 0.165 0% 20.453 K 0% 1 1 INIT DumpXYZ at 72:1-29 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 2.555 K 0% 1 1 INIT PlantXYZ at 73:1-26 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 2.742 K 0% 1 1 INIT StockpilesXYZ at 74:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.003 0% 40 K 0% 0.003 0% 30.953 K 0% 1 1 INIT PitXYZ at 71:1-27 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
2.463 0% 56.422 M 1% 2.463 0% 45.421 M 0% 1 2 INIT rawPbd at 131:1-20 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1.117 K 0% 1 1 INIT Raw at 121:1-130:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.011 0% 188 K 0% 0.011 0% 174.133 K 0% 1 1 INIT rawPbm at 132:1-20 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.024 0% 652 K 0% 0.024 0% 414.375 K 0% 1 1 INIT rawPbs at 133:1-20 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.625 0% 21.031 M 0% 0.625 0% 19.388 M 0% 1 2 INIT sourceDestD at 108:1-37 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 744 B 0% 1 1 INIT sourceDestination at 103:1-106:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.005 0% 292 K 0% 0.005 0% 61.859 K 0% 1 1 INIT sourceDestM at 109:1-37 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.006 0% 240 K 0% 0.006 0% 177.469 K 0% 1 1 INIT sourceDestS at 110:1-37 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 12.562 K 0% 1 2 INIT NullVariablesSet at 450:1-40 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 744 B 0% 1 1 INIT nullVariables at 445:1-448:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
28.810 0% 463.848 M 5% 0.000 0% 416.204 M 4% 1 29 PRE PROCESSING
0.410 0% 640 K 0% 0.345 0% 649.023 K 0% 1 4 EXECUTE anonymous#1 at 90:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.065 0% 640 K 0% 0.065 0% 647.672 K 0% 1 3 INIT OntopDumpLag at 85:6-87:52 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 8 K 0% 0.000 0% 280 B 0% 1 1 INIT D at 81:11-14 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 296 B 0% 1 1 INIT BottomPitBenNo at 82:24-25 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
5.935 0% 229.957 M 2% 5.935 0% 211.206 M 2% 1 8 EXECUTE anonymous#2 at 158:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 624 B 0% 1 1 INIT emptysetd at 153:22-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 5.641 K 0% 1 2 INIT Pbd at 148:12-14 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1.117 K 0% 1 1 INIT Path at 136:1-145:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 840 B 0% 1 1 INIT emptysetm at 154:22-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 2.875 K 0% 1 1 INIT Pbm at 149:13-15 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 792 B 0% 1 1 INIT emptysets at 155:22-24 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 2.875 K 0% 1 1 INIT Pbs at 150:12-14 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.902 0% 51.145 M 1% 0.788 0% 47.271 M 0% 1 2 EXECUTE anonymous#3 at 237:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.114 0% 51.145 M 1% 0.114 0% 47.271 M 0% 1 1 INIT hc at 233:1-31 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
1.678 0% 2.129 M 0% 1.029 0% 1.958 M 0% 1 2 EXECUTE anonymous#4 at 303:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.649 0% 2.129 M 0% 0.649 0% 1.957 M 0% 1 1 INIT OntopPit at 290:7-299:28 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
5.335 0% 117.746 M 1% 5.163 0% 106.703 M 1% 1 6 EXECUTE anonymous#5 at 367:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 624 B 0% 1 1 INIT MaxS at 364:10-12 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.061 0% 42.777 M 0% 0.061 0% 39.647 M 0% 1 1 INIT splitPitBlocksPath at 353:1-34 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 0 B 0% 0.001 0% 121.359 K 0% 1 1 INIT splitPitBlocksPathM at 354:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.001 0% 536 K 0% 0.001 0% 361.719 K 0% 1 1 INIT splitPitBlocksPathS at 355:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.109 0% 43.75 M 0% 0.109 0% 43.522 M 0% 1 1 INIT splitDumpBlocksPath at 356:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
14.550 0% 62.246 M 1% 14.436 0% 48.431 M 0% 1 6 EXECUTE anonymous#6 at 470:1-8 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 268 K 0% 0.000 0% 263.789 K 0% 1 1 INIT capBMT at 453:1-46 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.112 0% 50.91 M 1% 0.112 0% 47.161 M 0% 1 1 INIT capBDT at 455:1-50 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 536 K 0% 0.002 0% 555.375 K 0% 1 1 INIT capBST at 457:1-50 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 268 K 0% 0.000 0% 143.125 K 0% 1 1 INIT capBT at 459:1-35 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 147.5 K 0% 1 1 INIT capschedulePit at 461:1-44 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
2.101 0% 256.062 M 3% 0.009 0% 218.874 M 2% 1 10 INIT npv at 699:19-703:108 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 5.156 K 0% 1 1 INIT Dfbmt at 684:1-103 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.006 0% 0 B 0% 0.006 0% 419.82 K 0% 1 1 INIT Xbmt at 672:1-89 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 3.914 K 0% 1 1 INIT Dfbdt at 687:1-59 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
1.350 0% 108.723 M 1% 1.350 0% 106.255 M 1% 1 1 INIT Xbdt at 673:1-91 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 3.914 K 0% 1 1 INIT Dfbst at 690:1-58 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.012 0% 1.051 M 0% 0.012 0% 1,020.648 K 0% 1 1 INIT Xbst at 674:1-91 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 4.805 K 0% 1 1 INIT Dfsmt at 694:1-87 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 3.758 K 0% 1 1 INIT Xsmt at 663:1-51 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.724 0% 109.227 M 1% 0.724 0% 106.847 M 1% 1 1 INIT ypt at 677:1-47 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.018 0% 16.035 M 0% 0.018 0% 315.047 K 0% 1 1 INIT schedulePit at 676:1-87 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.287 0% 0 B 0% 0.287 0% 875.367 K 0% 1 1 INIT OnBelowDump at 313:6-323:47 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.002 0% 0 B 0% 0.002 0% 202.047 K 0% 1 1 INIT scheduleDump at 668:1-52 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 2.156 K 0% 1 1 INIT StockPileVol at 54:1-45 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.003 0% 272 K 0% 0.003 0% 280.82 K 0% 1 1 INIT zbt at 675:1-70 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
177.026 1% 9.063 G 92% 10.546 0% 158.001 M 2% 1 2 EXTRACTING Ashes200_data-0000025C59804DD8
166.480 1% 8.179 G 83% 166.480 1% 17.213 M 0% 1 1 OBJECTIVE at 714:1-716:4 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
19,951.910 99% 4.281 G 43% 5,989.816 30% 319.668 M 3% 1 13 CPLEX MIP Optimization
52.990 0% 389.746 M 4% 52.990 0% 389.746 M 4% 1 1 CPLEX Pre Solve
4.589 0% 256.008 M 3% 4.589 0% 256.008 M 3% 1 1 CPLEX Pre Solve
0.000 0% 0 B 0% 0.000 0% 0 B 0% 1 1 CPLEX Solve LP Relaxation
13,904.515 69% 1.05 G 11% 23.882 0% 467.602 M 5% 1 9 CPLEX Generating Cuts for Root Node
13,714.446 68% 52 K 0% 2.292 0% 78.424 M 1% 7 3 CPLEX Solve LP Relaxation
13,711.520 68% 0 B 0% 13,711.520 68% 110.169 M 1% 4 1 CPLEX Solve LP Relaxation
0.634 0% 52 K 0% 0.634 0% 225.727 M 2% 1 1 CPLEX Pre Solve
165.425 1% 604.797 M 6% 0.170 0% 604.797 M 6% 1 3 CPLEX Heuristics
165.255 1% 324.707 M 3% 0.289 0% 81.177 M 1% 4 2 CPLEX Solve LP Relaxation
164.966 1% 309.051 M 3% 164.966 1% 152.584 M 2% 2 1 CPLEX Solve LP Relaxation
0.130 0% 0 B 0% 0.130 0% 0 B 0% 1 1 CPLEX Probing
0.632 0% 225.676 M 2% 0.632 0% 225.676 M 2% 1 1 CPLEX Pre Solve
21.967 0% 8.656 M 0% 0.009 0% 35.43 K 0% 1 12 POST PROCESSING
21.958 0% 8.656 M 0% 17.082 0% 39.516 K 0% 1 11 EXECUTE anonymous#7 at 1300:1-1301:0 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.011 0% 8.656 M 0% 0.011 0% 9.328 K 0% 1 2 INIT solXbmt at 1252:21-112 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1,000 B 0% 1 1 INIT SolXbmt at 1245:1-1250:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
2.327 0% 0 B 0% 2.327 0% 7.258 K 0% 1 2 INIT solXbdt at 1263:24-118 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1,000 B 0% 1 1 INIT SolXbdt at 1255:1-1260:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.036 0% 0 B 0% 0.036 0% 7.352 K 0% 1 2 INIT solXbst at 1275:22-117 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1,000 B 0% 1 1 INIT SolXbst at 1267:1-1272:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 7.258 K 0% 1 2 INIT solXsmt at 1284:21-111 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 1,000 B 0% 1 1 INIT SolXsmt at 1277:1-1282:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
2.502 0% 0 B 0% 2.502 0% 6.18 K 0% 1 2 INIT solPath at 1297:21-84 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
0.000 0% 0 B 0% 0.000 0% 944 B 0% 1 1 INIT SolPath at 1291:1-1295:2 D:\PhD\Minex_Data\FINAL_PAPER2022\AshesPit200\Ashes_Pit200\Ashes200_data.mod
<<< profile
Kindly suggest how to overcome this problem.
Use better units. An objective value of 8.95478e+11 indicates you are using cents instead of billions of dollars. Also, make sure any big-M constants are not larger than needed.
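As a rough illustration of the rescaling (a sketch with assumed numbers, not values taken from the model above): keeping money in millions of dollars rather than cents shrinks coefficient magnitudes by a factor of 1e8, which keeps the solver's numerics in a comfortable range.
# Sketch with assumed numbers: rescale monetary data before building the model.
# A value such as 8.95478e+11 (cents) becomes a modest number in millions of dollars.
value_cents = 8.95478e11
value_millions = value_cents / 1e8   # cents -> millions of dollars
print(value_millions)                # 8954.78 -- small, well-scaled magnitude

# The same idea applies to big-M constants: derive them from actual data bounds
# (max_capacity here is an assumed quantity) instead of a huge blanket value.
max_capacity = 5_000.0
big_m = max_capacity                 # tight big-M, not 1e9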

Why can't ONE Iperf thread fill the CPU?

I'm running iperf2 on Ubuntu 18.04, trying to test the maximum bandwidth achievable with one CPU. My topology is client (with a CX5 100G NIC) -> switch (100G) -> server (with a CX5 100G NIC).
Unfortunately, the speed is only 30 Gbit/s. When I run a loopback test instead, the speed is 60 Gbit/s and the CPU usage is 100%. What should I do?
Binding to local address 10.0.0.2
Write buffer size: 128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.2 port 42749 connected with 10.0.0.1 port 5001
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 0.00-1.00 sec 3.71 GBytes 31.9 Gbits/sec 30379/0 0 6025K/354 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 1.00-2.00 sec 3.59 GBytes 30.8 Gbits/sec 29415/0 0 6684K/271 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 2.00-3.00 sec 3.63 GBytes 31.2 Gbits/sec 29712/0 0 7407K/399 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 3.00-4.00 sec 3.71 GBytes 31.8 Gbits/sec 30359/0 0 8285K/399 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 4.00-5.00 sec 3.73 GBytes 32.0 Gbits/sec 30520/0 0 8285K/328 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 5.00-6.00 sec 3.77 GBytes 32.4 Gbits/sec 30911/0 0 8285K/378 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 6.00-7.00 sec 3.77 GBytes 32.4 Gbits/sec 30920/0 0 8285K/384 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 7.00-8.00 sec 3.74 GBytes 32.1 Gbits/sec 30646/0 0 8285K/325 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 8.00-9.00 sec 3.77 GBytes 32.4 Gbits/sec 30917/0 0 8285K/334 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 9.00-10.00 sec 3.72 GBytes 32.0 Gbits/sec 30503/0 0 8285K/278 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 0.00-10.00 sec 37.1 GBytes 31.9 Gbits/sec 304286/0 0 8285K/278 us
Binding to local address 10.0.0.2
Write buffer size: 128 KByte
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.2 port 37687 connected with 10.0.0.2 port 5001
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 0.00-1.00 sec 3.85 GBytes 33.1 Gbits/sec 31562/0 0 1087K/41 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 1.00-2.00 sec 5.82 GBytes 50.0 Gbits/sec 47638/0 0 1087K/20 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 2.00-3.00 sec 7.65 GBytes 65.7 Gbits/sec 62644/0 0 1087K/19 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 3.00-4.00 sec 7.46 GBytes 64.1 Gbits/sec 61140/0 0 1342K/19 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 4.00-5.00 sec 7.66 GBytes 65.8 Gbits/sec 62716/0 0 1342K/19 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 5.00-6.00 sec 7.60 GBytes 65.3 Gbits/sec 62231/0 0 1342K/21 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 6.00-7.00 sec 7.67 GBytes 65.9 Gbits/sec 62833/0 0 1342K/17 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 7.00-8.00 sec 7.71 GBytes 66.2 Gbits/sec 63134/0 0 1342K/19 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 8.00-9.00 sec 7.73 GBytes 66.4 Gbits/sec 63291/0 0 1342K/19 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 9.00-10.00 sec 7.73 GBytes 66.4 Gbits/sec 63311/0 0 2110K/44 us
[ ID] Interval Transfer Bandwidth Write/Err Rtry Cwnd/RTT
[ 3] 0.00-10.00 sec 70.9 GBytes 60.9 Gbits/sec 580508/0 0 2110K/44 us
For one thing, your loopback test will likely use a much larger effective packet size than your over-the-network test. That means TCP needs to make fewer trips up and down the protocol stack to transfer a given quantity of data.
For another, your loopback test does not include any of the overhead of the driver for your 100G NIC.
Also, one of the many limits on the performance of a TCP connection is:
Throughput <= WindowSize / RoundTripTime
While the round-trip times on your small 100G setup may be short, they are still longer than loopback's, and you are aiming for quite a large throughput, so your stack's default TCP window size limits may be preventing higher throughput.
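As a worked instance of that bound (a sketch: the 354 us RTT comes from the first interval of the network run above, and the 100 Gbit/s target is an assumption), the minimum window needed for a given throughput is bandwidth * RTT:
# Minimum TCP window implied by Throughput <= WindowSize / RoundTripTime.
target_bps = 100e9                        # desired throughput, bits per second
rtt_s = 354e-6                            # round-trip time in seconds
window_bytes = target_bps * rtt_s / 8     # bandwidth-delay product in bytes
print(f"{window_bytes / 2**20:.1f} MiB")  # ~4.2 MiB minimum window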

What does MaxInUse mean for bfc_allocator in TensorFlow?

I encountered an OOM error while running TensorFlow code. I am confused about these indicators, such as Limit, InUse, MaxInUse, NumAllocs, and MaxAllocSize. Can anyone help explain them? Thanks!
2021-12-01 21:08:29.733639: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 14 Chunks of size 221773824 totalling 2.89GiB
2021-12-01 21:08:29.733644: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 4 Chunks of size 325730304 totalling 1.21GiB
2021-12-01 21:08:29.733649: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 325844992 totalling 310.75MiB
2021-12-01 21:08:29.733654: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 2 Chunks of size 591396864 totalling 1.10GiB
2021-12-01 21:08:29.733658: I tensorflow/core/common_runtime/bfc_allocator.cc:645] Sum Total of in-use chunks: 9.26GiB
2021-12-01 21:08:29.733667: I tensorflow/core/common_runtime/bfc_allocator.cc:647] Stats:
Limit: 10255859712
InUse: 9942288128
MaxInUse: 9942288128
NumAllocs: 8892
MaxAllocSize: 591396864
2021-12-01 21:08:29.733905: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *********************************************************_******************************************
2021-12-01 21:08:29.733938: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[4096,141,96] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

OOM when allocating tensor with shape[1,48,48,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

I'm trying to reproduce the training of the Mask R-CNN in the following repository: https://github.com/maxkferg/metal-defect-detection
The code snippet for training is the following:
# Training - Stage 1
print("Training network heads")
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers='heads')

# Training - Stage 2
# Finetune layers from ResNet stage 4 and up
print("Fine tune Resnet stage 4 and up")
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=120,
            layers='4+')

# Training - Stage 3
# Fine tune all layers
print("Fine tune all layers")
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=160,
            layers='all')
Stage 1 runs smoothly, but training fails from Stage 2 onward, giving the following:
2020-08-17 15:53:10.685456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 123 Chunks of size 2048 totalling 246.0KiB
2020-08-17 15:53:10.685456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 2816 totalling 2.8KiB
2020-08-17 15:53:10.686456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 6 Chunks of size 3072 totalling 18.0KiB
2020-08-17 15:53:10.686456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 387 Chunks of size 4096 totalling 1.51MiB
2020-08-17 15:53:10.687456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 6144 totalling 6.0KiB
2020-08-17 15:53:10.687456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 6656 totalling 6.5KiB
2020-08-17 15:53:10.688456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 60 Chunks of size 8192 totalling 480.0KiB
2020-08-17 15:53:10.688456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 2 Chunks of size 9216 totalling 18.0KiB
2020-08-17 15:53:10.689456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 12 Chunks of size 12288 totalling 144.0KiB
2020-08-17 15:53:10.689456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 2 Chunks of size 16384 totalling 32.0KiB
2020-08-17 15:53:10.690456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 21248 totalling 20.8KiB
2020-08-17 15:53:10.691456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 24064 totalling 23.5KiB
2020-08-17 15:53:10.691456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 5 Chunks of size 24576 totalling 120.0KiB
2020-08-17 15:53:10.692456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 37632 totalling 36.8KiB
2020-08-17 15:53:10.692456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 40960 totalling 40.0KiB
2020-08-17 15:53:10.693456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 4 Chunks of size 49152 totalling 192.0KiB
2020-08-17 15:53:10.693456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 6 Chunks of size 65536 totalling 384.0KiB
2020-08-17 15:53:10.694456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 81920 totalling 80.0KiB
2020-08-17 15:53:10.695456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 90624 totalling 88.5KiB
2020-08-17 15:53:10.695456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 131072 totalling 128.0KiB
2020-08-17 15:53:10.695456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 3 Chunks of size 147456 totalling 432.0KiB
2020-08-17 15:53:10.696456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 12 Chunks of size 262144 totalling 3.00MiB
2020-08-17 15:53:10.696456: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 327680 totalling 320.0KiB
2020-08-17 15:53:10.697457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 11 Chunks of size 524288 totalling 5.50MiB
2020-08-17 15:53:10.697457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 4 Chunks of size 589824 totalling 2.25MiB
2020-08-17 15:53:10.698457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 194 Chunks of size 1048576 totalling 194.00MiB
2020-08-17 15:53:10.699457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 17 Chunks of size 2097152 totalling 34.00MiB
2020-08-17 15:53:10.699457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 2211840 totalling 2.11MiB
2020-08-17 15:53:10.700457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 146 Chunks of size 2359296 totalling 328.50MiB
2020-08-17 15:53:10.701457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 2360320 totalling 2.25MiB
2020-08-17 15:53:10.701457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 2621440 totalling 2.50MiB
2020-08-17 15:53:10.702457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 2698496 totalling 2.57MiB
2020-08-17 15:53:10.702457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 3670016 totalling 3.50MiB
2020-08-17 15:53:10.703457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 31 Chunks of size 4194304 totalling 124.00MiB
2020-08-17 15:53:10.703457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 6 Chunks of size 4718592 totalling 27.00MiB
2020-08-17 15:53:10.704457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 5 Chunks of size 8388608 totalling 40.00MiB
2020-08-17 15:53:10.705457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 25 Chunks of size 9437184 totalling 225.00MiB
2020-08-17 15:53:10.705457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 2 Chunks of size 9438208 totalling 18.00MiB
2020-08-17 15:53:10.706457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 9441280 totalling 9.00MiB
2020-08-17 15:53:10.706457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 16138752 totalling 15.39MiB
2020-08-17 15:53:10.707457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 18874368 totalling 18.00MiB
2020-08-17 15:53:10.707457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 37748736 totalling 36.00MiB
2020-08-17 15:53:10.708457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 7 Chunks of size 51380224 totalling 343.00MiB
2020-08-17 15:53:10.708457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:684] Sum Total of in-use chunks: 1.41GiB
2020-08-17 15:53:10.709457: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:686] Stats:
Limit: 1613615104
InUse: 1510723072
MaxInUse: 1510723072
NumAllocs: 3860
MaxAllocSize: 119947776
The training is running on a Quadro K420 with 2 GB of VRAM. Is this only a problem of low memory, or am I missing something?
Is there a way to train with my equipment?
The problem is the GPU memory of your video card.
In the first stage you were able to train smoothly because you trained only the "heads" of the network, which translates to a smaller number of trainable parameters.
In the second stage you ran out of memory because you trained many more layers.
I suggest using a video card with at least 8 GB of VRAM for computer-vision problems.
Indeed, out-of-memory problems can sometimes be solved by reducing the batch size, but in your case the only viable solution is to opt for a bigger/better video card.
It's most probably a GPU memory issue. You can try reducing your batch size to 1 or simplifying your network; see the sketch below. If neither of those methods works, get something with more memory.
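For reference, here is a minimal sketch of how the batch and image sizes can be reduced, assuming the matterport-style Mask R-CNN Config class this repository builds on (the attribute names come from that codebase; the subclass itself is hypothetical):
# Hypothetical low-memory config; attribute names follow the matterport
# Mask R-CNN Config class that this repository appears to be based on.
from mrcnn.config import Config

class LowMemoryConfig(Config):
    NAME = "low_memory"   # identifier required by the Config base class
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1    # effective batch size of 1 to fit in 2 GB of VRAM
    IMAGE_MIN_DIM = 512   # smaller inputs shrink activation memory
    IMAGE_MAX_DIM = 512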
One way to sometimes work around this is to add an upsampling layer to the model: lower the target size in the image generator, then add an upsampling layer. It is a good way to trick it. If that works, then you know the hardware (e.g. Colab) couldn't handle the full-resolution input.

Tensorflow slower on GPU than on CPU

Using Keras with the TensorFlow backend, I am trying to train an LSTM network, and it is taking much longer to run on a GPU than on a CPU.
I am training the LSTM network using the fit_generator function. Each epoch takes ~250 seconds on the CPU but ~900 seconds on the GPU. The packages in my GPU environment include:
keras-applications 1.0.8 py_0 anaconda
keras-base 2.2.4 py36_0 anaconda
keras-gpu 2.2.4 0 anaconda
keras-preprocessing 1.1.0 py_1 anaconda
...
tensorflow 1.13.1 gpu_py36h3991807_0 anaconda
tensorflow-base 1.13.1 gpu_py36h8d69cac_0 anaconda
tensorflow-estimator 1.13.0 py_0 anaconda
tensorflow-gpu 1.13.1 pypi_0 pypi
My CUDA compilation tools are version 9.1.85, and my CUDA and driver versions are:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2080 On | 00000000:0A:00.0 Off | N/A |
| 0% 39C P8 5W / 225W | 7740MiB / 7952MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 2080 On | 00000000:42:00.0 Off | N/A |
| 0% 33C P8 19W / 225W | 142MiB / 7951MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 49251 C .../whsu014/.conda/envs/whsuphd/bin/python 7729MiB |
| 1 1354 G /usr/lib/xorg/Xorg 16MiB |
| 1 49251 C .../whsu014/.conda/envs/whsuphd/bin/python 113MiB |
+-----------------------------------------------------------------------------+
When I insert this line of code
tf.Session(config=tf.ConfigProto(log_device_placement=True))
I see the below in my terminal
...
ining_1/Adam/Const_10: (Const)/job:localhost/replica:0/task:0/device:GPU:0
training_1/Adam/Const_11: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-06-25 11:27:31.720653: I tensorflow/core/common_runtime/placer.cc:1059] training_1/Adam/Const_11: (Const)/job:localhost/replica:0/task:0/device:GPU:0
training_1/Adam/add_15/y: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-06-25 11:27:31.720666: I tensorflow/core/common_runtime/placer.cc:1059] training_1/Adam/add_15/y: (Const)/job:localhost/replica:0/task:0/device:GPU:0
...
So it seems that TensorFlow is using the GPU.
When I profile the code, these are the first 10 lines on the GPU:
10852017 function calls (10524203 primitive calls) in 184.768 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
16200 173.827 0.011 173.827 0.011 {built-in method _pywrap_tensorflow_internal.TF_SessionRunCallable}
6 0.926 0.154 0.926 0.154 {built-in method _pywrap_tensorflow_internal.TF_SessionMakeCallable}
62 0.813 0.013 0.813 0.013 {built-in method _pywrap_tensorflow_internal.TF_SessionRun_wrapper}
156954 0.414 0.000 0.415 0.000 {built-in method numpy.array}
16200 0.379 0.000 1.042 0.000 training.py:643(_standardize_user_data)
24300 0.338 0.000 0.338 0.000 {method 'partition' of 'numpy.ndarray' objects}
68 0.301 0.004 0.301 0.004 {built-in method _pywrap_tensorflow_internal.ExtendSession}
32458 0.223 0.000 2.122 0.000 tensorflow_backend.py:156(get_session)
3206 0.212 0.000 0.238 0.000 tf_stack.py:31(extract_stack)
76024 0.210 0.000 0.702 0.000 ops.py:5246(get_controller)
...
and these are the first 10 lines on the CPU:
22123473 function calls (21647174 primitive calls) in 60.173 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
16269 42.491 0.003 42.491 0.003 {built-in method tensorflow.python._pywrap_tensorflow_internal.TF_Run}
16269 0.568 0.000 48.964 0.003 session.py:1042(_run)
56 0.532 0.010 0.532 0.010 {built-in method time.sleep}
153641 0.458 0.000 0.460 0.000 {built-in method numpy.core.multiarray.array}
183148/125354 0.447 0.000 1.316 0.000 python_message.py:469(init)
1226659 0.362 0.000 0.364 0.000 {built-in method builtins.getattr}
2302110/2301986 0.339 0.000 0.358 0.000 {built-in method builtins.isinstance}
8 0.285 0.036 0.285 0.036 {built-in method tensorflow.python._pywrap_tensorflow_internal.TF_ExtendGraph}
12150 0.267 0.000 0.271 0.000 callbacks.py:211(on_batch_end)
147026/49078 0.264 0.000 1.429 0.000 python_message.py:1008(ByteSize)
...
This is my code.
def train_generator(x_list, y_list):
    # 0.1 validation split
    train_length = (len(x_list)//10)*9
    while True:
        for i in range(train_length):
            train_x = np.array([x_list[i]])
            train_y = np.array([y_list[i]])
            yield train_x, train_y

def val_generator(x_list, y_list):
    # 0.1 validation split
    val_length = len(x_list)//10
    while True:
        for i in range(-val_length, 0, 1):
            val_x = np.array([x_list[i]])
            val_y = np.array([y_list[i]])
            yield val_x, val_y

with tf.Session(config=tf.ConfigProto(log_device_placement=True)):
    model = Sequential()
    model.add(LSTM(64, return_sequences=False,
                   input_shape=(None, 24)))
    model.add(Dense(1))
    model.compile(loss='mae', optimizer='adam')

    checkpointer = ModelCheckpoint(filepath="weights.hdf5",
                                   monitor='val_loss', verbose=1,
                                   save_best_only=True)

    history = model.fit_generator(generator=train_generator(train_x, train_y),
                                  steps_per_epoch=(len(train_x)//10)*9,
                                  epochs=5,
                                  validation_data=val_generator(train_x, train_y),
                                  validation_steps=len(train_x)//10,
                                  callbacks=[checkpointer],
                                  verbose=2, shuffle=False)

    # plot history
    pyplot.plot(history.history['loss'], label='train')
    pyplot.plot(history.history['val_loss'], label='validation')
    pyplot.legend()
    pyplot.show()
I expected a significant speed-up when using the GPU for training. How can I fix this? Can someone help me understand what is causing the slowdown? Thank you.
A couple of observations:
Use CuDNNLSTM instead of LSTM to train on GPU; you will see a considerable increase in speed (see the sketch after these observations).
Sometimes, for very small networks, the overhead of transferring data between CPU and GPU outweighs the parallel computation done on the GPU; in other words, more time is lost transferring the data than is gained by training on the GPU.
GPUs should be used for highly intensive tasks and computations (very big LSTMs/heavy CNNs). Nevertheless, for very small MLPs and even small LSTMs, you might observe that the network trains equally fast on CPU and GPU, or that in some particular cases (with super small networks) the CPU is even faster.
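A minimal sketch of that swap on the asker's model, assuming Keras 2.x with the TensorFlow 1.x backend (CuDNNLSTM runs only on a GPU and supports only the default activations):
from keras.models import Sequential
from keras.layers import CuDNNLSTM, Dense

model = Sequential()
# CuDNNLSTM uses the fused cuDNN kernel instead of the generic graph ops.
model.add(CuDNNLSTM(64, return_sequences=False, input_shape=(None, 24)))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')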
UPDATE FOR TENSORFLOW >= 2.0
The built-in LSTM/GRU layers default to the cuDNN implementation when a compatible GPU is detected (and the default layer arguments are kept), so you no longer need to import CuDNNLSTM/CuDNNGRU explicitly.
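A short TF 2.x sketch of the same model relying on that default (keeping the stock activations so the cuDNN path stays eligible):
import tensorflow as tf

model = tf.keras.Sequential([
    # With default arguments, this layer dispatches to cuDNN on a GPU.
    tf.keras.layers.LSTM(64, input_shape=(None, 24)),
    tf.keras.layers.Dense(1),
])
model.compile(loss='mae', optimizer='adam')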