How to delete rows containing NaN in Python 3.6.3 - pandas

I want to remove rows with "nan" or "-nan":
Reading:
import pandas as pd

excel_file = 'originale_ridotto.xlsx'
df = pd.read_excel(excel_file, na_values="NaN")
print(df)
print("I am here")
df.dropna(axis=0, how="any")
print(df)
Output of dataframe columns (Python 3.6.3):
Data e ora Potenza Teorica Totale CC [kW]
0 01/01/2017 00:05 0
1 01/01/2017 00:10 0
2 01/01/2017 00:15 0
3 01/01/2017 00:20 0
4 01/01/2017 00:25 0
5 01/01/2017 00:30 0
6 01/01/2017 00:35 0
7 01/01/2017 00:40 0
Potenza Attiva Totale AC [kW] Energia totale cumulata al contatore [kWh] \
0 0 7760812.5
1 0 7760812.5
2 0 7760812.5
3 0 7760812.5
4 0 7760812.5
5 0 7760812.5
6 0 7760812.5
7 0 7760812.5
Temperatura modulo [°C] Irraggiamento [W/m2]
0 0 5.0
1 0 6.0
2 0 NaN
3 0 2.0
4 0 3.0
5 0 NaN
6 0 7.0
7 0 9.0
Potenza Attiva Inv.1Blocco1 [kW]
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
Data e ora Potenza Teorica Totale CC [kW]
0 01/01/2017 00:05 0
1 01/01/2017 00:10 0
2 01/01/2017 00:15 0
3 01/01/2017 00:20 0
4 01/01/2017 00:25 0
5 01/01/2017 00:30 0
6 01/01/2017 00:35 0
7 01/01/2017 00:40 0
Potenza Attiva Totale AC [kW] Energia totale cumulata al contatore [kWh]
0 0 7760812.5
1 0 7760812.5
2 0 7760812.5
3 0 7760812.5
4 0 7760812.5
5 0 7760812.5
6 0 7760812.5
7 0 7760812.5
Temperatura modulo [°C] Irraggiamento [W/m2] \
0 0 5.0
1 0 6.0
2 0 NaN
3 0 2.0
4 0 3.0
5 0 NaN
6 0 7.0
7 0 9.0
Potenza Attiva Inv.1Blocco1 [kW]
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
df.dropna(axis=0, how="any") does not remove these rows. Why?
Could you help me?

You are creating a cleaned dataframe, but you are not keeping it. df.dropna(how='any') returns the cleaned dataframe; you need to assign the result and then use it:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0, 1000, size=(10, 10)), columns=list('ABCDEFGHIJ'))
# insert a few NaN values (using .loc avoids chained-assignment warnings)
df.loc[2, 'A'] = np.nan
df.loc[3, 'C'] = np.nan
df.loc[5, 'I'] = np.nan
df.loc[7, 'E'] = np.nan
print(df)
df = df.dropna(how='any')  # this returns a NEW dataframe, it does not modify in place
print(df)
Output:
A B C D E F G H I J
0 314.0 664 855.0 101 764.0 251 503 783 153.0 474
1 903.0 77 546.0 205 113.0 519 115 45 988.0 964
2 NaN 155 481.0 243 165.0 696 255 123 802.0 228
3 406.0 603 NaN 84 390.0 545 651 549 440.0 982
4 796.0 626 139.0 810 474.0 257 407 264 680.0 164
5 443.0 132 545.0 380 420.0 885 704 596 NaN 778
6 285.0 317 238.0 437 508.0 189 501 738 605.0 290
7 144.0 426 220.0 573 NaN 758 581 420 544.0 173
8 864.0 369 541.0 405 863.0 45 522 178 705.0 419
9 936.0 664 547.0 793 68.0 77 364 633 547.0 790
A B C D E F G H I J
0 314.0 664 855.0 101 764.0 251 503 783 153.0 474
1 903.0 77 546.0 205 113.0 519 115 45 988.0 964
4 796.0 626 139.0 810 474.0 257 407 264 680.0 164
6 285.0 317 238.0 437 508.0 189 501 738 605.0 290
8 864.0 369 541.0 405 863.0 45 522 178 705.0 419
9 936.0 664 547.0 793 68.0 77 364 633 547.0 790
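Applied to the question's own snippet, the fix is either to keep the returned frame or to drop in place. Below is a minimal sketch assuming the same Excel file; the extra na_values entries are only an assumption, needed if the cells literally contain the strings "nan" or "-nan":
import pandas as pd

excel_file = 'originale_ridotto.xlsx'
# the extra na_values are an assumption, for files containing literal "nan"/"-nan" strings
df = pd.read_excel(excel_file, na_values=["NaN", "nan", "-nan"])

df = df.dropna(axis=0, how="any")            # keep the returned, cleaned frame
# or, equivalently, modify the frame in place:
# df.dropna(axis=0, how="any", inplace=True)
print(df)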

Related

python reshape every nth column

I have just started with Python and need some help. I have a dataframe which looks like the "Input Data" below. What I want is to stack it by every nth column; in other words, I want a dataframe where every nth block of columns is appended below the first m rows.
Input Data
id  city  Col 1  Col 2  Col 3  Col 4  Col 5  Col 6  Col 7  Col 8  Col 9  Col 10
1   1     51     155    255    355    455    666    777    955    55     553
2   0     52     155    255    355    455    666    777    595    55     553
3   NAN   53     155    255    355    455    666    777    559    55     535
4   1     54     155    255    355    545    666    777    559    55     535
5   7     55     155    255    355    455    666    777    955    55     535
Required Output
id  city  Col 1  Col 2  Col 3  Col 4  Col 5
1   1     51     155    255    355    455
2   0     52     155    255    355    455
3   NAN   53     155    255    355    455
4   1     54     155    255    355    545
5   7     55     155    255    355    455
1   1     666    777    955    55     553
2   0     666    777    595    55     553
3   NAN   666    777    559    55     535
4   1     666    777    559    55     535
5   7     666    777    955    55     535
I am trying to do something opposite of this
In [74]: column_list = [df.columns[k:k+5] for k in range(2, len(df.columns), 5)]
In [75]: column_list
Out[75]:
[Index(['Col 1', 'Col 2', 'Col 3', 'Col 4', 'Col 5'], dtype='object'),
Index(['Col 6', 'Col 7', 'Col 8', 'Col 9', 'Col 10'], dtype='object')]
In [76]: dfs = [df[['id', 'city'] + columns.tolist()].rename(columns=dict(zip(columns, range(5)))) for columns in column_list]
In [77]: dfs
Out[77]:
[ id city 0 1 2 3 4
0 1 1.0 51 155 255 355 455
1 2 0.0 52 155 255 355 455
2 3 NaN 53 155 255 355 455
3 4 1.0 54 155 255 355 545
4 5 7.0 55 155 255 355 455,
id city 0 1 2 3 4
0 1 1.0 666 777 955 55 553
1 2 0.0 666 777 595 55 553
2 3 NaN 666 777 559 55 535
3 4 1.0 666 777 559 55 535
4 5 7.0 666 777 955 55 535]
In [78]: pd.concat(dfs, ignore_index=True)
Out[78]:
id city 0 1 2 3 4
0 1 1.0 51 155 255 355 455
1 2 0.0 52 155 255 355 455
2 3 NaN 53 155 255 355 455
3 4 1.0 54 155 255 355 545
4 5 7.0 55 155 255 355 455
5 1 1.0 666 777 955 55 553
6 2 0.0 666 777 595 55 553
7 3 NaN 666 777 559 55 535
8 4 1.0 666 777 559 55 535
9 5 7.0 666 777 955 55 535
To explain:
First, generate the required columns for each slice.
pd.concat requires the column names of all the dataframes in the list to be the same, hence the rename(columns=dict(zip(columns, range(5)))): we are just renaming the sliced columns to 0, 1, 2, 3, 4.
The last step is to concat everything.
EDIT
Based on the comments by OP:
Sorry @Asish M., but how to add a column for the dataset number in each dataset of dfs? E.g. here we split our dataset into 2, so I need one column which says 'first' (or 1) for the first 1 to 5 ids, then 'second' (or 2) for the next 1 to 5 ids in the output. I hope it makes sense.
dfs = [df[['id', 'city'] + columns.tolist()].assign(split_group=idx).rename(columns=dict(zip(columns, range(5)))) for idx, columns in enumerate(column_list)]
df.assign(split_group=idx) creates a column 'split_group' with value = idx. You get the idx from enumerating the column_list
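Putting the pieces together, here is a minimal, self-contained sketch of the same approach; the sample frame is rebuilt from the "Input Data" table above, and the chunk size of 5 value columns is an assumption:
import numpy as np
import pandas as pd

# rebuild the sample "Input Data" frame
df = pd.DataFrame({
    'id':    [1, 2, 3, 4, 5],
    'city':  [1, 0, np.nan, 1, 7],
    'Col 1': [51, 52, 53, 54, 55],
    'Col 2': [155] * 5,
    'Col 3': [255] * 5,
    'Col 4': [355] * 5,
    'Col 5': [455, 455, 455, 545, 455],
    'Col 6': [666] * 5,
    'Col 7': [777] * 5,
    'Col 8': [955, 595, 559, 559, 955],
    'Col 9': [55] * 5,
    'Col 10': [553, 553, 535, 535, 535],
})

n = 5  # number of value columns per chunk
column_list = [df.columns[k:k + n] for k in range(2, len(df.columns), n)]

# slice each chunk, rename its columns to 0..n-1 so they line up,
# tag the chunk number, and stack everything
dfs = [df[['id', 'city'] + cols.tolist()]
         .rename(columns=dict(zip(cols, range(n))))
         .assign(split_group=idx)
       for idx, cols in enumerate(column_list)]

print(pd.concat(dfs, ignore_index=True))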

create new column from divided columns over iteration

I am working with the following code:
url = 'https://raw.githubusercontent.com/dothemathonthatone/maps/master/fertility.csv'
df = pd.read_csv(url)
year regional_schlüssel Aus15 Deu15 Aus16 Deu16 Aus17 Deu17 Aus18 Deu18 ... aus36 aus37 aus38 aus39 aus40 aus41 aus42 aus43 aus44 aus45
0 2000 5111000 0 4 8 25 20 45 56 89 ... 935 862 746 732 792 660 687 663 623 722
1 2000 5113000 1 1 4 14 13 33 19 48 ... 614 602 498 461 521 470 393 411 397 400
2 2000 5114000 0 11 0 5 2 13 7 20 ... 317 278 265 235 259 228 204 173 213 192
3 2000 5116000 0 2 2 7 3 28 13 26 ... 264 217 206 207 197 177 171 146 181 169
4 2000 5117000 0 0 3 1 2 4 4 7 ... 135 129 118 116 128 148 89 110 124 83
I would like to create a new set of columns fertility_deu15, ..., fertility_deu45 and fertility_aus15, ..., fertility_aus45 such that fertility_aus15 = aus15 / Aus15 and fertility_deu15 = deu15 / Deu15, and so on for every ausi/Ausi and deui/Deui pair with i in [15-45].
I'm not sure what is up with that data, but we need to fix it to make it numeric. I'll end up doing that while filtering:
numerator = df.filter(regex=r'^[a-z]+\d+$')  # lower-case columns (aus15 ... aus45, deu15 ... deu45)
numerator = numerator.apply(pd.to_numeric, errors='coerce')  # fix the numbers
denominator = df.filter(regex=r'^[A-Z][a-z]+\d+$').rename(columns=str.lower)
denominator = denominator.apply(pd.to_numeric, errors='coerce')
numerator.div(denominator).add_prefix('fertility_')
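If you want the new columns attached to the original frame, a small follow-up sketch (assuming the numerator and denominator frames from the code above) could be:
fertility = numerator.div(denominator).add_prefix('fertility_')
df = df.join(fertility)                      # adds fertility_aus15 ... fertility_deu45 alongside the originals
print(df.filter(like='fertility_').head())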

Shifting values to the next day

I have this data frame:
ID Date X 123_Var 456_Var 789_Var
A 16-07-19 3 777 250 810
A 17-07-19 9 637 121 529
A 20-07-19 2 295 272 490
A 21-07-19 3 778 600 544
A 22-07-19 6 741 792 907
B 01-07-19 4 509 690 406
B 03-07-19 2 413 725 414
B 04-07-19 2 170 702 912
B 09-08-19 3 851 616 477
B 10-08-19 9 475 447 555
B 11-08-19 1 412 403 708
B 12-08-19 2 299 537 321
B 13-08-19 4 310 119 125
C 14-08-19 4 912 755 657
C 15-08-19 4 586 771 394
C 17-08-19 2 500 528 764
C 18-08-19 1 982 383 654
C 20-08-19 3 336 691 496
C 21-08-19 3 206 433 263
C 22-08-19 2 373 319 111
D 10-12-18 2 170 702 912
E 10-12-18 2 912 755 657
E 14-12-18 2 373 319 111
I want to shift the values in each of the 123_Var, 456_Var and 789_Var columns.
A value should be shifted only if there is a one-day difference from the previous row; otherwise a NaN value should remain.
The shifting should be applied to each ID separately (using groupby).
Expected result:
ID Date X 123_Var 456_Var 789_Var 123_Var_S 456_Var_S 789_Var_S
A 16-07-19 3 777 250 810 NaN NaN NaN
A 17-07-19 9 637 121 529 777.0 250.0 810.0
A 20-07-19 2 295 272 490 NaN NaN NaN
A 21-07-19 3 778 600 544 295.0 272.0 490.0
A 22-07-19 6 741 792 907 778.0 600.0 544.0
B 01-07-19 4 509 690 406 NaN NaN NaN
B 03-07-19 2 413 725 414 NaN NaN NaN
B 04-07-19 2 170 702 912 413.0 725.0 414.0
B 09-08-19 3 851 616 477 NaN NaN NaN
B 10-08-19 9 475 447 555 851.0 616.0 477.0
B 11-08-19 1 412 403 708 475.0 447.0 555.0
B 12-08-19 2 299 537 321 412.0 403.0 708.0
B 13-08-19 4 310 119 125 299.0 537.0 321.0
C 14-08-19 4 912 755 657 NaN NaN NaN
C 15-08-19 4 586 771 394 912.0 755.0 657.0
C 17-08-19 2 500 528 764 NaN NaN NaN
C 18-08-19 1 982 383 654 500.0 528.0 764.0
C 20-08-19 3 336 691 496 NaN NaN NaN
C 21-08-19 3 206 433 263 336.0 691.0 496.0
C 22-08-19 2 373 319 111 206.0 433.0 263.0
D 10-12-18 2 170 702 912 NaN NaN NaN
E 10-12-18 2 912 755 657 NaN NaN NaN
E 14-12-18 2 373 319 111 NaN NaN NaN
IIUC, we can groupby, apply a filter and use .loc along with shift to assign your values:
import pandas as pd
import numpy as np

df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%y')
# True where the gap to the previous row (within each ID) is exactly one day
s = df.groupby('ID')['Date'].apply(lambda x: (x - x.shift()).eq('1 days'))
cols = df.filter(like='Var').columns.map(lambda x: x + '_S')
df[cols] = df.filter(like='Var').shift()
df.loc[~s, cols] = np.nan
print(df)
ID Date X 123_Var 456_Var 789_Var 123_Var_S 456_Var_S \
0 A 2019-07-16 3 777 250 810 NaN NaN
1 A 2019-07-17 9 637 121 529 777.0 250.0
2 A 2019-07-20 2 295 272 490 NaN NaN
3 A 2019-07-21 3 778 600 544 295.0 272.0
4 A 2019-07-22 6 741 792 907 778.0 600.0
5 B 2019-07-01 4 509 690 406 NaN NaN
6 B 2019-07-03 2 413 725 414 NaN NaN
7 B 2019-07-04 2 170 702 912 413.0 725.0
8 B 2019-08-09 3 851 616 477 NaN NaN
9 B 2019-08-10 9 475 447 555 851.0 616.0
10 B 2019-08-11 1 412 403 708 475.0 447.0
11 B 2019-08-12 2 299 537 321 412.0 403.0
12 B 2019-08-13 4 310 119 125 299.0 537.0
13 C 2019-08-14 4 912 755 657 NaN NaN
14 C 2019-08-15 4 586 771 394 912.0 755.0
15 C 2019-08-17 2 500 528 764 NaN NaN
16 C 2019-08-18 1 982 383 654 500.0 528.0
17 C 2019-08-20 3 336 691 496 NaN NaN
18 C 2019-08-21 3 206 433 263 336.0 691.0
19 C 2019-08-22 2 373 319 111 206.0 433.0
20 D 2018-12-10 2 170 702 912 NaN NaN
21 E 2018-12-10 2 912 755 657 NaN NaN
22 E 2018-12-14 2 373 319 111 NaN NaN
789_Var_S
0 NaN
1 810.0
2 NaN
3 490.0
4 544.0
5 NaN
6 NaN
7 414.0
8 NaN
9 477.0
10 555.0
11 708.0
12 321.0
13 NaN
14 657.0
15 NaN
16 764.0
17 NaN
18 496.0
19 263.0
20 NaN
21 NaN
22 NaN
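For reference, the same one-day mask can also be sketched with groupby.diff instead of the subtraction above. This is an equivalent alternative, not the answer's original code, and it assumes the imports and the parsed Date column from the snippet above:
cols = ['123_Var', '456_Var', '789_Var']
one_day = df.groupby('ID')['Date'].diff().eq(pd.Timedelta(days=1))   # True where the gap is exactly one day
df[[c + '_S' for c in cols]] = df.groupby('ID')[cols].shift()        # previous row within each ID
df.loc[~one_day, [c + '_S' for c in cols]] = np.nan                  # blank out everything else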
You may want to consider this approach with iterrows() (note that it assumes a default integer index, and should skip the first row and rows where the ID changes):
for index, row in df.iterrows():
    if index == 0 or df.loc[index, 'ID'] != df.loc[index - 1, 'ID']:
        continue
    if df.loc[index, 'Date'] == df.loc[index - 1, 'Date'] + pd.Timedelta(days=1):
        df.loc[index, '123_Var_S'] = df.loc[index - 1, '123_Var']
        df.loc[index, '456_Var_S'] = df.loc[index - 1, '456_Var']
        df.loc[index, '789_Var_S'] = df.loc[index - 1, '789_Var']

Taking the last two rows' minimum value

I have this data frame:
ID Date X 123_Var 456_Var 789_Var
A 16-07-19 3 777 250 810
A 17-07-19 9 637 121 529
A 18-07-19 7 878 786 406
A 19-07-19 4 656 140 204
A 20-07-19 2 295 272 490
A 21-07-19 3 778 600 544
A 22-07-19 6 741 792 907
B 01-07-19 4 509 690 406
B 02-07-19 2 732 915 199
B 03-07-19 2 413 725 414
B 04-07-19 2 170 702 912
B 09-08-19 3 851 616 477
B 10-08-19 9 475 447 555
B 11-08-19 1 412 403 708
B 12-08-19 2 299 537 321
B 13-08-19 4 310 119 125
C 01-12-18 4 912 755 657
C 02-12-18 4 586 771 394
C 04-12-18 9 498 122 193
C 05-12-18 2 500 528 764
C 06-12-18 1 982 383 654
C 07-12-18 1 299 496 488
C 08-12-18 3 336 691 496
C 09-12-18 3 206 433 263
C 10-12-18 2 373 319 111
I want to show the minimum of the current row's value and the previous row's value, for each column in the 123_Var, 456_Var and 789_Var set.
That should be applied separately for each ID (using groupby).
The first row of each ID will show the current value, since there is no "previous" value to compare against.
Expected result:
ID Date X 123_Var 456_Var 789_Var 123_Min2 456_Min2 789_Min2
A 16-07-19 3 777 250 810 777 250 810
A 17-07-19 9 637 121 529 637 121 529
A 18-07-19 7 878 786 406 637 121 406
A 19-07-19 4 656 140 204 656 140 204
A 20-07-19 2 295 272 490 295 140 204
A 21-07-19 3 778 600 544 295 272 490
A 22-07-19 6 741 792 907 741 600 544
B 01-07-19 4 509 690 406 509 690 406
B 02-07-19 2 732 915 199 509 690 199
B 03-07-19 2 413 725 414 413 725 199
B 04-07-19 2 170 702 912 170 702 414
B 09-08-19 3 851 616 477 170 616 477
B 10-08-19 9 475 447 555 475 447 477
B 11-08-19 1 412 403 708 412 403 555
B 12-08-19 2 299 537 321 299 403 321
B 13-08-19 4 310 119 125 299 119 125
C 01-12-18 4 912 755 657 912 755 657
C 02-12-18 4 586 771 394 586 755 394
C 04-12-18 9 498 122 193 498 122 193
C 05-12-18 2 500 528 764 498 122 193
C 06-12-18 1 982 383 654 500 383 654
C 07-12-18 1 299 496 488 299 383 488
C 08-12-18 3 336 691 496 299 496 488
C 09-12-18 3 206 433 263 206 433 263
C 10-12-18 2 373 319 111 206 319 111
IIUC, we use groupby.shift to select the previous value for each ID, then we can use DataFrame.where to keep only the cells where the previous value is lower than the current value, filling in the current value for the rest. We use DataFrame.add_suffix to add _Min2 and join back to df with DataFrame.join:
df_vars = df[['123_Var', '456_Var', '789_Var']]
df = df.join(df.groupby('ID')[['123_Var', '456_Var', '789_Var']]
               .shift()
               .fillna(df_vars)
               .where(lambda x: x.le(df_vars), df_vars)
               .add_suffix('_Min2')
             )
print(df)
Output
ID Date X 123_Var 456_Var 789_Var 123_Var_Min2 456_Var_Min2 789_Var_Min2
0 A 16-07-19 3 777 250 810 777.0 250.0 810.0
1 A 17-07-19 9 637 121 529 637.0 121.0 529.0
2 A 18-07-19 7 878 786 406 637.0 121.0 406.0
3 A 19-07-19 4 656 140 204 656.0 140.0 204.0
4 A 20-07-19 2 295 272 490 295.0 140.0 204.0
5 A 21-07-19 3 778 600 544 295.0 272.0 490.0
6 A 22-07-19 6 741 792 907 741.0 600.0 544.0
7 B 01-07-19 4 509 690 406 509.0 690.0 406.0
8 B 02-07-19 2 732 915 199 509.0 690.0 199.0
9 B 03-07-19 2 413 725 414 413.0 725.0 199.0
10 B 04-07-19 2 170 702 912 170.0 702.0 414.0
11 B 09-08-19 3 851 616 477 170.0 616.0 477.0
12 B 10-08-19 9 475 447 555 475.0 447.0 477.0
13 B 11-08-19 1 412 403 708 412.0 403.0 555.0
14 B 12-08-19 2 299 537 321 299.0 403.0 321.0
15 B 13-08-19 4 310 119 125 299.0 119.0 125.0
16 C 01-12-18 4 912 755 657 912.0 755.0 657.0
17 C 02-12-18 4 586 771 394 586.0 755.0 394.0
18 C 04-12-18 9 498 122 193 498.0 122.0 193.0
19 C 05-12-18 2 500 528 764 498.0 122.0 193.0
20 C 06-12-18 1 982 383 654 500.0 383.0 654.0
21 C 07-12-18 1 299 496 488 299.0 383.0 488.0
22 C 08-12-18 3 336 691 496 299.0 496.0 488.0
23 C 09-12-18 3 206 433 263 206.0 433.0 263.0
24 C 10-12-18 2 373 319 111 206.0 319.0 111.0
Case 2: If you want to check the n previous rows, use groupby.rolling:
df_vars = df[['123_Var', '456_Var', '789_Var']]
n = 3
df = df.join(df.groupby('ID')[['123_Var', '456_Var', '789_Var']]
               .rolling(n, min_periods=1).min()
               .reset_index(drop=True)
               .add_suffix(f'_Min{n}')
             )
print(df)
ID Date X 123_Var 456_Var 789_Var 123_Var_Min3 456_Var_Min3 789_Var_Min3
0 A 16-07-19 3 777 250 810 777.0 250.0 810.0
1 A 17-07-19 9 637 121 529 637.0 121.0 529.0
2 A 18-07-19 7 878 786 406 637.0 121.0 406.0
3 A 19-07-19 4 656 140 204 637.0 121.0 204.0
4 A 20-07-19 2 295 272 490 295.0 121.0 204.0
5 A 21-07-19 3 778 600 544 295.0 140.0 204.0
6 A 22-07-19 6 741 792 907 295.0 140.0 204.0
7 B 01-07-19 4 509 690 406 509.0 690.0 406.0
8 B 02-07-19 2 732 915 199 509.0 690.0 199.0
9 B 03-07-19 2 413 725 414 413.0 690.0 199.0
10 B 04-07-19 2 170 702 912 170.0 690.0 199.0
11 B 09-08-19 3 851 616 477 170.0 616.0 199.0
12 B 10-08-19 9 475 447 555 170.0 447.0 414.0
13 B 11-08-19 1 412 403 708 170.0 403.0 477.0
14 B 12-08-19 2 299 537 321 299.0 403.0 321.0
15 B 13-08-19 4 310 119 125 299.0 119.0 125.0
16 C 01-12-18 4 912 755 657 912.0 755.0 657.0
17 C 02-12-18 4 586 771 394 586.0 755.0 394.0
18 C 04-12-18 9 498 122 193 498.0 122.0 193.0
19 C 05-12-18 2 500 528 764 498.0 122.0 193.0
20 C 06-12-18 1 982 383 654 498.0 122.0 193.0
21 C 07-12-18 1 299 496 488 299.0 122.0 193.0
22 C 08-12-18 3 336 691 496 299.0 383.0 488.0
23 C 09-12-18 3 206 433 263 206.0 383.0 263.0
24 C 10-12-18 2 373 319 111 206.0 319.0 111.0
A quite elegant solution is to apply rolling(2).min() to each group, but to avoid the first row of NaN in each group, this first row should be "replicated" from the source group.
To do this, start by defining the following function:
def fnMin2(grp):
    rv = pd.concat([pd.DataFrame([grp.iloc[0, -3:]]),
                    grp[['123_Var', '456_Var', '789_Var']].rolling(2).min().iloc[1:]])\
        .astype('int')
    rv.columns = [it.replace('Var', 'Min2') for it in rv.columns]
    return grp.join(rv)
Then apply it to each group:
df.groupby('ID').apply(fnMin2)
Note that column names assigned to new columns in my solution are
just as you wish, contrary to the solution you accepted.
import numpy as np

# this compares each row to the previous row (True where the current value is greater)
ext = df.iloc[:, 3:].gt(df.iloc[:, 3:].shift(1))

# simply renamed the columns here
ext.columns = ['123_min', '456_min', '789_min']

# join the two dataframes by columns
M = pd.concat([df, ext], axis=1)

# based on the conditions: if False, use the value from the current row,
# else use the value from the previous row
# (note: this variant compares across the whole frame, without grouping by ID)
M['123_min'] = np.where(M['123_min'] == 0, M['123_Var'], M['123_Var'].shift(1))
M['456_min'] = np.where(M['456_min'] == 0, M['456_Var'], M['456_Var'].shift(1))
M['789_min'] = np.where(M['789_min'] == 0, M['789_Var'], M['789_Var'].shift(1))
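For completeness, the pairwise minimum can also be written compactly with groupby.shift and an element-wise minimum. This is an independent sketch (not taken from the answers above) and assumes the frame is already loaded as df:
import numpy as np
import pandas as pd

cols = ['123_Var', '456_Var', '789_Var']
prev = df.groupby('ID')[cols].shift().fillna(df[cols])   # previous row per ID; first row of each ID falls back to itself
mins = np.minimum(df[cols], prev).astype(int)            # element-wise min of current and previous values
df = df.join(mins.add_suffix('_Min2'))
print(df)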

How to interpret the log output of docplex optimisation library

I am having a problem interpreting this log that I get after trying to maximise an objective function using docplex:
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap
0 0 6.3105 0 10.2106 26
0 0 5.9960 8 Cone: 5 34
0 0 5.8464 5 Cone: 8 47
0 0 5.8030 11 Cone: 10 54
0 0 5.7670 12 Cone: 13 64
0 0 5.7441 13 Cone: 16 72
0 0 5.7044 9 Cone: 19 81
0 0 5.6844 14 5.6844 559
* 0+ 0 4.5362 5.6844 25.31%
0 0 5.5546 15 4.5362 Cuts: 322 1014 22.45%
0 0 5.4738 15 4.5362 Cuts: 38 1108 20.67%
* 0+ 0 4.6021 5.4738 18.94%
0 0 5.4296 16 4.6021 Cuts: 100 1155 17.98%
0 0 5.3779 19 4.6021 Cuts: 34 1204 16.86%
0 0 5.3462 17 4.6021 Cuts: 80 1252 16.17%
0 0 5.3396 19 4.6021 Cuts: 42 1276 16.03%
0 0 5.3364 24 4.6021 Cuts: 57 1325 15.96%
0 0 5.3269 17 4.6021 Cuts: 66 1353 15.75%
0 0 5.3188 20 4.6021 Cuts: 42 1369 15.57%
0 0 5.2975 21 4.6021 Cuts: 62 1387 15.11%
0 0 5.2838 24 4.6021 Cuts: 72 1427 14.81%
0 0 5.2796 21 4.6021 Cuts: 70 1457 14.72%
0 0 5.2762 24 4.6021 Cuts: 73 1471 14.65%
0 0 5.2655 24 4.6021 Cuts: 18 1479 14.42%
* 0+ 0 4.6061 5.2655 14.32%
* 0+ 0 4.6613 5.2655 12.96%
0 0 5.2554 26 4.6613 Cuts: 40 1492 12.75%
0 0 5.2425 27 4.6613 Cuts: 11 1511 12.47%
0 0 5.2360 23 4.6613 Cuts: 3 1518 12.33%
0 0 5.2296 19 4.6613 Cuts: 7 1521 12.19%
0 0 5.2213 18 4.6613 Cuts: 8 1543 12.01%
0 0 5.2163 24 4.6613 Cuts: 15 1552 11.91%
0 0 5.2106 21 4.6613 Cuts: 4 1558 11.78%
0 0 5.2106 21 4.6613 Cuts: 3 1559 11.78%
* 0+ 0 4.6706 5.2106 11.56%
0 2 5.2106 21 4.6706 5.2106 1559 11.56%
Elapsed time = 9.12 sec. (7822.43 ticks, tree = 0.01 MB, solutions = 5)
51 29 4.9031 3 4.6706 5.1575 1828 10.42%
260 147 4.9207 1 4.6706 5.1575 2699 10.42%
498 242 infeasible 4.6706 5.0909 3364 9.00%
712 346 4.7470 6 4.6706 5.0591 4400 8.32%
991 497 4.7338 6 4.6706 5.0480 5704 8.08%
1358 566 4.8085 11 4.6706 5.0005 7569 7.06%
1708 708 4.7638 14 4.6706 4.9579 9781 6.15%
1985 817 cutoff 4.6706 4.9265 11661 5.48%
2399 843 infeasible 4.6706 4.9058 15567 5.04%
3619 887 4.7066 4 4.6706 4.7875 23685 2.50%
Elapsed time = 17.75 sec. (10933.85 ticks, tree = 3.05 MB, solutions = 5)
4623 500 4.6863 13 4.6706 4.7274 35862 1.22%
What I don't understand is the following:
What is the difference between the third column (Objective) and the fifth column (Best Integer)?
How come the third column (Objective) has higher values than the actual solution of the problem given by CPLEX, which is 4.6706?
Do the values in the third column take into consideration the constraints of the optimization problem?
This webpage didn't help me understand either; the explanation of Best Integer is really confusing.
Thank you in advance for your feedback.
Regards.
The user manual includes a detailed explanation of this log in section
CPLEX->User's Manual for CPLEX->Discrete Optimization->Solving Mixed Integer Programming Problems (MIP)->Progress Reports: interpreting the node log
(see https://www.ibm.com/support/knowledgecenter/SSSA5P_12.8.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/discr_optim/mip/para/52_node_log.html)
I also suggest having a look at the material at https://fr.slideshare.net/mobile/IBMOptimization/2013-11-informsminingthenodelog
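As a rough check of how the columns relate (my own reading of the node log, not a quote from the linked docs): the Objective column is the objective of the LP relaxation at each node, Best Integer is the best feasible integer solution found so far, Best Bound is the best proven bound, and Gap is their relative difference. In a maximisation the relaxation bound sits above any feasible integer solution, which is why the Objective values exceed the final 4.6706. The Gap on the last root-node line can be reproduced like this:
best_bound, best_integer = 5.2106, 4.6706          # values taken from the "0  2 ..." line of the log above
gap = abs(best_bound - best_integer) / abs(best_integer)
print(f"{gap:.2%}")                                 # 11.56%, matching the Gap column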