How to fix the domain violation error in GAMS

When I run this code it shows a domain violation error for an element. How do I remove the error?
...
table data(i, coef)
          a      b     c
1    0.0016   2      0
2    0.01     2.5    0
3    0.0625   1.0    0
4    0.00834  3.25   0
5    0.025    3      0
6    0.025    3      0;
table Losscoef(i,j)
            1          2          3          4          5          6
1    0.000218   0.000103   0.000009  -0.00001    0.000002   0.000027
2    0.000103   0.000181   0.000004  -0.000015   0.000002   0.00003
3    0.000009   0.000004   0.000417  -0.000131  -0.000153  -0.000107
4   -0.00014   -0.000015  -0.000131   0.000221   0.000094   0.00005
5    0.000002   0.000002  -0.000153   0.000094   0.000243   0
6    0.000027   0.00003   -0.000107   0.00005    0          0.000358;
...

No error now: you must first declare your sets before using them in a table.
set i /1*6/
coef /a,b,c/;
alias(i,j);
table data(i, coef)
          a      b     c
1    0.0016   2      0
2    0.01     2.5    0
3    0.0625   1.0    0
4    0.00834  3.25   0
5    0.025    3      0
6    0.025    3      0;
table Losscoef(i,j)
            1          2          3          4          5          6
1    0.000218   0.000103   0.000009  -0.00001    0.000002   0.000027
2    0.000103   0.000181   0.000004  -0.000015   0.000002   0.00003
3    0.000009   0.000004   0.000417  -0.000131  -0.000153  -0.000107
4   -0.00014   -0.000015  -0.000131   0.000221   0.000094   0.00005
5    0.000002   0.000002  -0.000153   0.000094   0.000243   0
6    0.000027   0.00003   -0.000107   0.00005    0          0.000358;

Related

Pandas: I want to slice the data and shuffle the slices to generate some synthetic data

I want to generate some synthetic data for a data science task, since we don't have enough labelled data. I want to cut the rows at random positions in the y column around the 0s, without cutting through a sequence of 1s.
After cutting, I want to shuffle those slices and generate a new DataFrame.
Ideally there would be parameters to adjust the maximum and minimum sequence length to cut, the number of cuts, and so on.
The raw data
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
4 2 0
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
12 0.5 0
...
Some possible cuts
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
--------------
4 2 0
--------------
5 100 1
6 200 1
7 1234 1
-------------
8 12 0
9 40 0
10 200 1
11 300 1
-------------
12 0.5 0
...
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
4 2 0
-------------
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
------------
12 0.5 0
...
This is NOT CORRECT
ts v1 y
0 100 1
1 120 1
------------
2 80 1
3 5 0
4 2 0
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
12 0.5 0
...
You can use:
import numpy as np

# number of cuts
N = 3
# pick N random index values among the rows where y == 0
idx = np.random.choice(df.index[df['y'].eq(0)], N, replace=False)
# create group labels with a membership check and a cumulative sum
arr = df.index.isin(idx).cumsum()
# shuffle the unique group labels
u = np.unique(arr)
np.random.shuffle(u)
# change the order of the groups in the DataFrame
df = df.set_index(arr).loc[u].reset_index(drop=True)
print(df)
ts v1 y
0 9 40.0 0
1 10 200.0 1
2 11 300.0 1
3 12 0.5 0
4 3 5.0 0
5 4 2.0 0
6 5 100.0 1
7 6 200.0 1
8 7 1234.0 1
9 8 12.0 0
10 0 100.0 1
11 1 120.0 1
12 2 80.0 1
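To get the parameters the question asks for (number of cuts, reproducibility), the steps above can be wrapped in a small helper. shuffle_segments is a hypothetical name, and a max/min segment-length constraint would still need an extra filtering step on the chosen cut points:

```python
import numpy as np
import pandas as pd

def shuffle_segments(df, n_cuts=3, seed=None):
    """Cut df at randomly chosen rows where y == 0, then shuffle the pieces."""
    rng = np.random.default_rng(seed)
    # choose n_cuts cut points among the rows where y == 0
    idx = rng.choice(df.index[df['y'].eq(0)], n_cuts, replace=False)
    # label each row with its segment number (a cut starts a new segment)
    groups = df.index.isin(idx).cumsum()
    # shuffle the segment labels and reassemble in the new order
    order = np.unique(groups)
    rng.shuffle(order)
    return df.set_index(groups).loc[order].reset_index(drop=True)

raw = pd.DataFrame({'ts': range(13),
                    'v1': [100, 120, 80, 5, 2, 100, 200, 1234, 12, 40, 200, 300, 0.5],
                    'y':  [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]})
out = shuffle_segments(raw, n_cuts=3, seed=0)
print(out)
```

Every row survives the shuffle; only the order of the segments changes.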

Backfill and Increment by one?

I have a column of a DataFrame that consists of 0's and NaN's:
Timestamp A B C
1 3 3 NaN
2 5 2 NaN
3 9 1 NaN
4 2 6 NaN
5 3 3 0
6 5 2 NaN
7 3 1 NaN
8 2 8 NaN
9 1 6 0
And I want to backfill it and increment the last value:
Timestamp A B C
1 3 3 4
2 5 2 3
3 9 1 2
4 2 6 1
5 3 3 0
6 5 2 3
7 3 1 2
8 2 8 1
9 1 6 0
You can use iloc[::-1] to reverse the data, and groupby().cumcount() to create the row counter:
s = df['C'].iloc[::-1].notnull()
df['C'] = df['C'].bfill() + s.groupby(s.cumsum()).cumcount()
Output
Timestamp A B C
0 1 3 3 4.0
1 2 5 2 3.0
2 3 9 1 2.0
3 4 2 6 1.0
4 5 3 3 0.0
5 6 5 2 3.0
6 7 3 1 2.0
7 8 2 8 1.0
8 9 1 6 0.0
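For reference, here is the same idea as a self-contained snippet (the frame is rebuilt from the question's table):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Timestamp': range(1, 10),
                   'A': [3, 5, 9, 2, 3, 5, 3, 2, 1],
                   'B': [3, 2, 1, 6, 3, 2, 1, 8, 6],
                   'C': [np.nan, np.nan, np.nan, np.nan, 0,
                         np.nan, np.nan, np.nan, 0]})

# walking backwards, count how far each row is from the next valid value,
# then add that distance to the backfilled 0s
s = df['C'].iloc[::-1].notnull()
df['C'] = df['C'].bfill() + s.groupby(s.cumsum()).cumcount()
print(df)
```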

Low accuracy with ML models

I'm working with the breast cancer dataset with 2 classes (0-1), and training accuracy was great. But after I changed the number of classes to 8 (0-7), I'm getting low accuracy with the ML algorithms, while the ANN still reaches 97%. Maybe I made a mistake, but I don't know where.
y_pred :
[5 0 3 0 3 6 1 0 2 1 7 6 7 3 0 3 6 3 7 0 7 1 5 2 5 0 3 6 5 5 7 2 0 6 6 6 3
6 5 0 0 6 6 5 3 0 5 1 6 4 0 7 6 0 5 5 5 0 0 5 7 1 6 6 7 6 0 1 7 5 6 0 6 0
3 3 6 7 7 1 0 7 0 5 5 0 6 0 0 6 1 6 5 0 0 7 0 1 6 1 0 6 0 7 0 6 0 5 0 6 3
6 7 0 6 6 0 0 0 5 7 4 6 6 2 3 5 6 0 7 7 0 5 6 0 0 0 6 1 5 0 7 4 6 0 7 3 6
5 6 6 0 2 0 1 0 7 0 1 7 0 7 7 6 6 6 7 6 6 0 6 5 1 1 7 6 6 7 0 7 0 1 6 0]
y_test:
[1 0 1 6 4 6 1 0 1 3 0 2 6 3 0 1 0 7 0 0 6 6 5 6 2 6 3 6 5 6 7 6 5 7 0 2 3
6 5 0 7 2 6 4 0 0 2 6 3 7 7 1 3 6 5 0 2 7 0 7 6 0 1 7 6 6 0 4 7 0 0 0 6 0
3 5 0 0 7 6 0 0 7 0 6 7 7 2 7 1 1 5 5 3 7 4 7 2 2 4 0 0 0 7 0 2 0 6 0 6 1
7 6 0 6 0 0 1 0 6 6 7 6 6 7 0 6 1 0 0 7 0 5 7 0 0 7 7 6 5 0 0 1 6 0 7 6 6
5 2 6 0 2 0 6 0 5 0 2 7 0 7 7 6 7 6 6 6 0 6 6 0 1 1 7 6 2 7 6 0 0 6 5 0]
I have replaced multilabel_confusion_matrix with confusion_matrix, but I'm still getting the same results: accuracy between 40% and 50%.
The results come from cv_results.mean() * 100:
K-Nearest Neighbours: 39.62 %
Support Vector Machine: 48.09 %
Naive Bayes: 30.46 %
Decision Tree: 30.46 %
Random Forest: 52.32 %
Logistic Regression: 44.26 %
here is Ml part :
# Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = np.argmax(y_pred, axis=1)
cm = multilabel_confusion_matrix(y_test, y_pred)
models = []
models.append(('K-Nearest Neighbours', KNeighborsClassifier(n_neighbors = 5)))
models.append(('Support Vector Machine', SVC()))
models.append(('Naive Bayes', GaussianNB()))
models.append(('Decision Tree', DecisionTreeClassifier()))
models.append(('Random Forest', RandomForestClassifier(n_estimators=100)))
models.append(('Logistic Regression', LogisticRegression()))
results = []
names = []
for name, model in models:
    # shuffle=True is required when random_state is set in recent scikit-learn
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=8)
    cv_results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring='accuracy')
    results.append(cv_results)
    names.append(name)
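Since the snippet above is cut off, here is a minimal runnable sketch of the same evaluation loop with one model. make_classification is only a stand-in for the real 8-class dataset, which is not shown; StratifiedKFold keeps the class proportions in every fold, which can matter with 8 imbalanced classes:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# synthetic stand-in for the real 8-class data
X_train, y_train = make_classification(n_samples=400, n_features=10,
                                       n_informative=8, n_classes=8,
                                       random_state=8)

# shuffle=True is required when random_state is set in recent scikit-learn
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=8)
cv_results = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                             X_train, y_train, cv=kfold, scoring='accuracy')
print(f'K-Nearest Neighbours: {cv_results.mean() * 100:.2f} %')
```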

Fill consecutive NaNs with cumsum, to increment by one on each consecutive NaN

Given a dataframe with lots of missing values in a certain interval, my desired output dataframe should have every run of consecutive NaNs filled with a cumulative count starting from the preceding valid value, adding 1 for each NaN.
Given:
shop_id calendar_date quantity
0 2018-12-12 1
1 2018-12-13 NaN
2 2018-12-14 NaN
3 2018-12-15 NaN
4 2018-12-16 1
5 2018-12-17 NaN
Desired output:
shop_id calendar_date quantity
0 2018-12-12 1
1 2018-12-13 2
2 2018-12-14 3
3 2018-12-15 4
4 2018-12-16 1
5 2018-12-17 2
Use:
g = (~df.quantity.isnull()).cumsum()
df['quantity'] = df.fillna(1).groupby(g).quantity.cumsum()
shop_id calendar_date quantity
0 0 2018-12-12 1.0
1 1 2018-12-13 2.0
2 2 2018-12-14 3.0
3 3 2018-12-15 4.0
4 4 2018-12-16 1.0
5 5 2018-12-17 2.0
Details
Use ~df.quantity.isnull() to mark where quantity has valid values, and take the cumsum of the boolean Series:
g = (~df.quantity.isnull()).cumsum()
0 1
1 1
2 1
3 1
4 2
5 2
Use fillna(1) so that when you group by g and take the cumsum, the values increase by one starting from each valid value:
df.fillna(1).groupby(g).quantity.cumsum()
0 1.0
1 2.0
2 3.0
3 4.0
4 1.0
5 2.0
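Rebuilt as a runnable snippet (the dates and values are reconstructed from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'shop_id': range(6),
                   'calendar_date': pd.date_range('2018-12-12', periods=6),
                   'quantity': [1, np.nan, np.nan, np.nan, 1, np.nan]})

# the group label increments at each valid value
g = (~df.quantity.isnull()).cumsum()
# fill NaN with 1, then cumsum within each group: 1, 2, 3, ...
df['quantity'] = df.fillna(1).groupby(g).quantity.cumsum()
print(df)
```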
Another approach:
data
shop_id calendar_date quantity
0 0 2018-12-12 1.0
1 1 2018-12-13 NaN
2 2 2018-12-14 NaN
3 3 2018-12-15 NaN
4 4 2018-12-16 1.0
5 5 2018-12-17 NaN
6 6 2018-12-18 NaN
7 7 2018-12-17 NaN
Using np.where:
where = np.where(data['quantity'] >= 1)[0]
r = []
for i in range(len(where)):
    # length of the run from this valid value up to the next one (or the end)
    end = where[i + 1] if i + 1 < len(where) else len(data)
    r.extend(np.arange(1, end - where[i] + 1))
data['quantity'] = r
print(data)
shop_id calendar_date quantity
0 0 2018-12-12 1
1 1 2018-12-13 2
2 2 2018-12-14 3
3 3 2018-12-15 4
4 4 2018-12-16 1
5 5 2018-12-17 2
6 6 2018-12-18 3
7 7 2018-12-17 4

Count how many of every 3 rows fit a condition with pandas rolling

I have a dataframe that looks like this:
import pandas as pd
raw_data = {'col0': [1, 4, 5, 1, 3, 3, 1, 5, 8, 9, 1, 2]}
df = pd.DataFrame(raw_data)
col0
0 1
1 4
2 5
3 1
4 3
5 3
6 1
7 5
8 8
9 9
10 1
11 2
What I want to do is count, in every rolling window of 3 rows, how many values fit the condition (df['col0'] > 3), and make a new column that looks like this:
col0 col_roll_count3
0 1 0
1 4 1
2 5 2 # window [0,1,2]: 4 and 5 fit the condition
3 1 2
4 3 1
5 3 0 # window [3,4,5]: nothing fits the condition
6 1 0
7 5 1
8 8 2
9 9 3
10 1 2
11 2 1
How can I achieve that?
I tried this, but it failed:
df['col_roll_count3'] = df[df['col0']>3].rolling(3).count()
print(df)
col0 col1
0 1 NaN
1 4 1.0
2 5 2.0
3 1 NaN
4 3 NaN
5 3 NaN
6 1 NaN
7 5 3.0
8 8 3.0
9 9 3.0
10 1 NaN
11 2 NaN
You can use a boolean mask with a rolling sum (min_periods=1 fills the first windows instead of leaving NaN):
df['col_roll_count3'] = df['col0'].gt(3).rolling(3, min_periods=1).sum()
Alternatively, use rolling, apply and np.count_nonzero:
df['col_roll_count3'] = df.col0.rolling(3, min_periods=1)\
                          .apply(lambda x: np.count_nonzero(x > 3))
Output:
col0 col_roll_count3
0 1 0.0
1 4 1.0
2 5 2.0
3 1 2.0
4 3 1.0
5 3 0.0
6 1 0.0
7 5 1.0
8 8 2.0
9 9 3.0
10 1 2.0
11 2 1.0