Invalid subscript x[2,1,1] in AMPL - ampl

It is my first time using AMPL to solve a problem.
Could anyone tell me why this program gives me invalid subscript x[2,1,1]?
Thanks!
param K; #number of customers
param T; #number of orders
param J; #number of fabs
param I; #number of items
param A {i in 1..I, t in 1..T}; #quantity of item i requested in order t
param P { t in 1..T}; # price of order t if fully fulfilled
param C {i in 1..I, j in 1..J}; #number of items i that can be produced by fab j per hour
param Cap {j in 1..J}; #capacity hours for fab j
set list {j in 1..J}= {i in 1..I: C[i,j] <>0}; #set of items that can be produced by fab j
set nonlist {j in 1..J}= {i in 1..I: C[i,j] =0}; #set of items that cannot be produced by fab j
var x {i in 1..I, j in 1..J, t in i..T}; #optimal quantity of item i produced by fab j for order t
maximize profit:
#sum {i in 1..I, j in 1..J, t in 1..T} (P[t]*(x[i,j,t] / (sum{ i in 1..I} A[i,t]))); #written like that doesn't work?!
sum{t in 1..T} (P[t]*((sum{i in 1..I, j in 1..J} x[i,j,t])/(sum {i in 1..I} A[i,t])));
subject to limit {i in 1..I, t in 1..T}: sum {j in 1..J} x[i,j,t] <= A[i,t] ; #cannot produce more than ordered for each item i in each order t
subject to capacity {j in 1..J} : sum { t in 1..T, i in list[j]} (x [i,j,t] / C[i,j]) <= Cap [j]; #cannot produce more than maximum capacity for each fab j
subject to realistic {j in 1..J, i in nonlist[j], t in 1..T}: x[i,j,t] =0; # fab j cannot produce item i if C[i,j]=0
subject to nonnegativity {i in 1..I, j in 1..J, t in 1..T}: x[i,j,t] >= 0;
The data file is
param T := 10;
param J:= 8;
param I:= 12;
param A:
1 2 3 4 5 6 7 8 9 10 :=
1 0 1000 0 0 5000 0 0 2000 1500 0
2 0 2000 0 4000 0 1000 1000 2000 0 0
3 0 0 1500 0 0 3500 500 0 3000 0
4 2000 0 0 0 0 1500 0 500 4000 2000
5 3000 0 0 5000 1500 0 0 1000 500 0
6 0 1000 0 0 2500 0 5000 0 1000 0
7 0 0 5000 0 0 0 0 1000 3000 0
8 0 0 4000 0 0 3000 0 0 2000 2000
9 0 0 6000 8000 2500 0 0 0 500 0
10 5000 0 0 0 0 0 0 2000 3000 3000
11 0 3000 0 2000 0 1500 0 3000 500 0
12 0 0 2000 3000 0 0 500 1000 1500 4000 ;
param P :=
1 5500
2 4300
3 9300
4 8600
5 8000
6 6700
7 4700
8 7000
9 9600
10 7200 ;
param Cap :=
1 840
2 750
3 610
4 470
5 560
6 240
7 1250
8 930;
param C:
1 2 3 4 5 6 7 8 :=
1 10 5 0 25 20 40 0 0
2 5 0 20 0 15 0 5 10
3 10 15 30 0 20 40 0 0
4 10 0 5 20 0 50 15 15
5 5 0 0 25 0 50 15 15
6 0 5 10 40 15 0 5 0
7 20 10 0 5 30 0 10 0
8 50 15 10 0 0 30 5 0
9 40 20 30 0 0 0 10 20
10 0 25 15 0 15 45 5 0
11 0 20 0 30 0 20 15 5
12 0 0 30 15 20 0 10 20;
Running the model gives the error invalid subscript x[2,1,1]:

The variable x[2,1,1] doesn't exist because x is indexed over {i in 1..I, j in 1..J, t in i..T} (note the i in t in i..T, almost certainly a typo for 1): when i is 2, t runs from 2 to T, so t = 1 is not a valid subscript. You should either change the declaration of x to something like
var x {i in 1..I, j in 1..J, t in 1..T};
or change the indexing in the declaration of profit (and, where needed, of the constraints) to be consistent with the indexing of x.

Related

Pandas: I want to slice the data and shuffle the slices to generate some synthetic data

I want to generate some synthetic data for a data science task, since we don't have enough labelled data. The idea is to cut the rows at random positions around the 0s in the y column, without cutting through a sequence of 1s.
After cutting, I want to shuffle those slices and assemble a new DataFrame.
Ideally there would be parameters that adjust the maximum and minimum sequence length to cut, the number of cuts, and so on.
The raw data
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
4 2 0
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
12 0.5 0
...
Some possible cuts
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
--------------
4 2 0
--------------
5 100 1
6 200 1
7 1234 1
-------------
8 12 0
9 40 0
10 200 1
11 300 1
-------------
12 0.5 0
...
ts v1 y
0 100 1
1 120 1
2 80 1
3 5 0
4 2 0
-------------
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
------------
12 0.5 0
...
This is NOT a correct cut (it splits a sequence of 1s):
ts v1 y
0 100 1
1 120 1
------------
2 80 1
3 5 0
4 2 0
5 100 1
6 200 1
7 1234 1
8 12 0
9 40 0
10 200 1
11 300 1
12 0.5 0
...
You can use:
import numpy as np

#number of cuts
N = 3
#pick N random index values from rows where y = 0
idx = np.random.choice(df.index[df['y'].eq(0)], N, replace=False)
#create group labels with a membership check and cumulative sum
arr = df.index.isin(idx).cumsum()
#shuffle the unique group labels
u = np.unique(arr)
np.random.shuffle(u)
#reorder the groups in the DataFrame
df = df.set_index(arr).loc[u].reset_index(drop=True)
print(df)
ts v1 y
0 9 40.0 0
1 10 200.0 1
2 11 300.0 1
3 12 0.5 0
4 3 5.0 0
5 4 2.0 0
6 5 100.0 1
7 6 200.0 1
8 7 1234.0 1
9 8 12.0 0
10 0 100.0 1
11 1 120.0 1
12 2 80.0 1
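For reference, here is the same approach as a self-contained script (the frame is rebuilt from the sample data in the question; the fixed seed is only there to make the run reproducible):

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's example frame
df = pd.DataFrame({'ts': range(13),
                   'v1': [100, 120, 80, 5, 2, 100, 200, 1234, 12, 40, 200, 300, 0.5],
                   'y':  [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]})

np.random.seed(0)  # only so the example is reproducible

N = 3  # number of cuts
# sample cut points from rows where y == 0, so no run of 1s is split
cut_at = np.random.choice(df.index[df['y'].eq(0)], N, replace=False)
# label each slice with a group id (a new group starts at every cut row)
grp = df.index.isin(cut_at).cumsum()
# shuffle the group labels and reassemble the frame in that order
order = np.unique(grp)
np.random.shuffle(order)
out = df.set_index(grp).loc[order].reset_index(drop=True)
print(out)
```

The output is a permutation of the original rows: every slice stays intact, only the slice order changes.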

How to do time continuity checks on a pandas DataFrame using Python

I have a data-frame which has columns like:
colA colB colC colD colE flag
A X 2018Q1 500 600 1
A X 2018Q2 200 800 1
A X 2018Q3 100 400 1
A X 2018Q4 500 600 1
A X 2019Q1 400 7000 0
A X 2019Q2 1500 6100 0
A X 2018Q3 5600 600 1
A X 2018Q4 500 6007 1
A Y 2016Q1 900 620 1
A Y 2016Q2 750 850 0
A Y 2017Q1 750 850 1
A Y 2017Q2 750 850 1
A Y 2017Q3 750 850 1
A Y 2018Q1 750 850 1
A Y 2018Q2 750 850 1
A Y 2018Q3 750 850 1
A Y 2018Q4 750 850 1
A row at the (colA, colB) level passes the statistical check if, after sorting, flag==1 holds for 4 continuous quarters of data within one stride.
The strides move like this: 2018Q1-2018Q4, then 2018Q2-2019Q1, and so on; whenever there are 4 continuous quarters with flag==1, we label those rows with 1.
The final output will be like:
colA colB colC colD colE flag check_qtr
A X 2018Q1 500 600 1 1
A X 2018Q2 200 800 1 1
A X 2018Q3 100 400 1 1
A X 2018Q4 500 600 1 1
A X 2019Q1 400 7000 0 0
A X 2019Q2 1500 6100 0 0
A X 2018Q3 5600 600 1 0
A X 2018Q4 500 6007 1 0
A Y 2016Q1 900 620 1 0
A Y 2016Q2 750 850 0 0
A Y 2017Q1 750 850 1 0
A Y 2017Q2 750 850 1 0
A Y 2017Q3 750 850 1 0
A Y 2018Q1 750 850 1 1
A Y 2018Q2 750 850 1 1
A Y 2018Q3 750 850 1 1
A Y 2018Q4 750 850 1 1
How can we do this using pandas and numpy?
Can we implement this using SQL?
Concerning your first question, this can be done with pandas as follows.
First I'll generate your example dataframe:
import pandas as pd
df = pd.DataFrame({'colA':['A']*17,
'colB':['X']*8+['Y']*9,
'flag':[1,1,1,1,0,0,1,1,1,0,1,1,1,1,1,1,1]})
df.set_index(['colA','colB'], inplace=True) # Set index as multilevel with colA and colB
Resulting in your example dataframe. However, to use the following approach, we'll need to go back to a normal index:
df.reset_index(inplace=True)
colA colB flag
0 A X 1
1 A X 1
2 A X 1
3 A X 1
4 A X 0
5 A X 0
6 A X 1
7 A X 1
8 A Y 1
9 A Y 0
10 A Y 1
11 A Y 1
12 A Y 1
13 A Y 1
14 A Y 1
15 A Y 1
16 A Y 1
Then to obtain your result column you can use the groupby function (with some print to understand what's going on):
from scipy.ndimage import shift
import numpy as np

df['check_qtr'] = pd.Series(0, index=df.index)  # initialise the result column
for name, group in df.groupby(['colA','colB','flag']):
    if name[2] == 1:
        print(name)
        # is the index of each value exactly 1 place after the previous one?
        idx = ((group.index.values - shift(group.index.values, 1, cval=-1)) == 1).astype(int)
        print(idx)
        # do the 4 next indexes follow each other?
        bools = [idx[x:x+4].sum() == 4 for x in range(len(idx))]
        print(bools)
        for idx in group.index.values[bools]:  # each index starting a run of 4
            df.loc[idx:idx+3, 'check_qtr'] = 1  # set check_qtr in rows idx to idx+3
('A', 'X', 1)
[1 1 1 1 0 1]
[True, False, False, False, False, False]
('A', 'Y', 1)
[0 0 1 1 1 1 1 1]
[False, False, True, True, True, False, False, False]
Note that we use +4 when doing array indexing, because array[x:x+4] gives the 4 values at indexes x to x+3.
We use +3 with loc because loc follows a different logic: it retrieves rows by label, not by position, so between label idx and idx+3 we get 4 values.
Giving you the result you want:
colA colB flag check_qtr
0 A X 1 1
1 A X 1 1
2 A X 1 1
3 A X 1 1
4 A X 0 0
5 A X 0 0
6 A X 1 0
7 A X 1 0
8 A Y 1 0
9 A Y 0 0
10 A Y 1 0
11 A Y 1 1
12 A Y 1 1
13 A Y 1 1
14 A Y 1 1
15 A Y 1 1
16 A Y 1 1
This may not be the perfect way to do it, but it can give you some hints about how you can use some of those functions !
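One caveat: the approach above checks row adjacency, not the quarters themselves, which is why its output differs slightly from the expected table in the question (2017Q1-2017Q3 get marked because seven flag==1 rows are adjacent, even though 2017Q4 is missing from the data). A quarter-aware sketch is below; the PeriodIndex-based continuity test is my reading of what "4 continuous quarters" means, and the frame reconstructs the question's data (colD/colE omitted, as they play no role):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'colA': ['A'] * 17,
    'colB': ['X'] * 8 + ['Y'] * 9,
    'colC': ['2018Q1', '2018Q2', '2018Q3', '2018Q4', '2019Q1', '2019Q2',
             '2018Q3', '2018Q4', '2016Q1', '2016Q2', '2017Q1', '2017Q2',
             '2017Q3', '2018Q1', '2018Q2', '2018Q3', '2018Q4'],
    'flag': [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]})

def mark_windows(g, w=4):
    # ordinal of each quarter, so consecutive quarters differ by exactly 1
    q = pd.PeriodIndex(g['colC'], freq='Q').asi8
    f = g['flag'].to_numpy()
    out = np.zeros(len(g), dtype=int)
    for i in range(len(g) - w + 1):
        # window of w rows: all flags 1 and quarters strictly consecutive
        if f[i:i + w].all() and (np.diff(q[i:i + w]) == 1).all():
            out[i:i + w] = 1
    return pd.Series(out, index=g.index)

df['check_qtr'] = df.groupby(['colA', 'colB'], group_keys=False).apply(mark_windows)
print(df)
```

On this data it reproduces the expected check_qtr column from the question, including the zeros for 2017Q1-2017Q3.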

Cumulative variable calculation which is reset under a given condition, for each ID - Pandas

I want to create a cumulative variable based on a non-cumulative variable. The cumulative sum should be reset whenever the value of Y equals 1 (with the reset taking effect from the row below).
I want to do that for each ID in the data frame.
Data illustration:
ID X Non_cum Y
A .. 0 0
A .. 20 0
A .. 40 0
B .. 0 0
B .. 100 0
B .. 200 1
B .. 50 0
Expected result:
ID X Non_cum Y Cum
A .. 0 0 0
A .. 20 0 20
A .. 40 0 60
B .. 0 0 0
B .. 100 0 100
B .. 200 1 300
B .. 50 0 50
You can group by ID and use a cumulative sum of Y (shifted one row down) to label the blocks between resets:
groups = df.groupby(['ID'])
df['Y_block'] = groups['Y'].shift(fill_value=0)
df['Y_block'] = groups['Y_block'].cumsum()
df['Cum'] = df.groupby(['ID','Y_block'])['Non_cum'].cumsum()
Output (Cum column):
0 0
1 20
2 60
3 0
4 100
5 300
6 50
Name: Cum, dtype: int64
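Putting it together as a self-contained sketch (the frame below reconstructs the sample data; the X column is omitted since it plays no role):

```python
import pandas as pd

df = pd.DataFrame({'ID': ['A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Non_cum': [0, 20, 40, 0, 100, 200, 50],
                   'Y': [0, 0, 0, 0, 0, 1, 0]})

# Shift Y down one row per ID so the reset takes effect on the row *below*
# the Y == 1 row, then cumsum the shifted flags to get a block id
df['Y_block'] = df.groupby('ID')['Y'].shift(fill_value=0)
df['Y_block'] = df.groupby('ID')['Y_block'].cumsum()
# Cumulative sum restarts within each (ID, block) pair
df['Cum'] = df.groupby(['ID', 'Y_block'])['Non_cum'].cumsum()
print(df[['ID', 'Non_cum', 'Y', 'Cum']])
```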

Pandas: Calculate percentage of column for each class

I have a dataframe like this:
Class Boolean Sum
0 1 0 10
1 1 1 20
2 2 0 15
3 2 1 25
4 3 0 52
5 3 1 48
I want to calculate percentage of 0/1's for each class, so for example the output could be:
Class Boolean Sum %
0 1 0 10 0.333
1 1 1 20 0.666
2 2 0 15 0.375
3 2 1 25 0.625
4 3 0 52 0.520
5 3 1 48 0.480
Divide column Sum by the output of GroupBy.transform, which returns a Series of the same length as the original DataFrame, filled with the aggregated values:
df['%'] = df['Sum'].div(df.groupby('Class')['Sum'].transform('sum'))
print (df)
Class Boolean Sum %
0 1 0 10 0.333333
1 1 1 20 0.666667
2 2 0 15 0.375000
3 2 1 25 0.625000
4 3 0 52 0.520000
5 3 1 48 0.480000
Detail:
print (df.groupby('Class')['Sum'].transform('sum'))
0 30
1 30
2 40
3 40
4 100
5 100
Name: Sum, dtype: int64
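As a runnable recap (the frame reproduces the example data from the question):

```python
import pandas as pd

df = pd.DataFrame({'Class': [1, 1, 2, 2, 3, 3],
                   'Boolean': [0, 1, 0, 1, 0, 1],
                   'Sum': [10, 20, 15, 25, 52, 48]})

# transform('sum') broadcasts each class total back onto its own rows,
# so the division lines up element-wise with the original frame
df['%'] = df['Sum'] / df.groupby('Class')['Sum'].transform('sum')
print(df)
```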

What's the problem with this one-hot encoding?

In [4]: data = pd.read_csv('student_data.csv')
In [5]: data[:10]
Out[5]:
admit gre gpa rank
0 0 380 3.61 3
1 1 660 3.67 3
2 1 800 4.00 1
3 1 640 3.19 4
4 0 520 2.93 4
5 1 760 3.00 2
6 1 560 2.98 1
7 0 400 3.08 2
8 1 540 3.39 3
9 0 700 3.92 2
one_hot_data = pd.get_dummies(data['rank'])
# TODO: Drop the previous rank column
data = data.drop('rank', axis=1)
data = data.join(one_hot_data)
# Print the first 10 rows of our data
data[:10]
It always gives an error:
KeyError: 'rank'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-25-6a749c8f286e> in <module>()
1 # TODO: Make dummy variables for rank
----> 2 one_hot_data = pd.get_dummies(data['rank'])
3
4 # TODO: Drop the previous rank column
5 data = data.drop('rank', axis=1)
If you get:
KeyError: 'rank'
it means there is no column rank at that point. The usual causes are trailing whitespace in the column name, an encoding issue, or the column having already been dropped by an earlier run of the cell. Check with:
print (data.columns.tolist())
['admit', 'gre', 'gpa', 'rank']
Your solution can be simplified with DataFrame.pop - it selects the column and removes it from the original DataFrame:
data = data.join(pd.get_dummies(data.pop('rank')))
# Print the first 10 rows of our data
print(data[:10])
admit gre gpa 1 2 3 4
0 0 380 3.61 0 0 1 0
1 1 660 3.67 0 0 1 0
2 1 800 4.00 1 0 0 0
3 1 640 3.19 0 0 0 1
4 0 520 2.93 0 0 0 1
5 1 760 3.00 0 1 0 0
6 1 560 2.98 1 0 0 0
7 0 400 3.08 0 1 0 0
8 1 540 3.39 0 0 1 0
9 0 700 3.92 0 1 0 0
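A self-contained sketch of the pop approach, using the first few rows of the question's data (dtype=int is an addition so the dummies come out as 0/1 columns rather than the booleans newer pandas versions return by default):

```python
import pandas as pd

# First rows of the question's student data, copied for illustration
data = pd.DataFrame({'admit': [0, 1, 1, 1, 0],
                     'gre': [380, 660, 800, 640, 520],
                     'gpa': [3.61, 3.67, 4.00, 3.19, 2.93],
                     'rank': [3, 3, 1, 4, 4]})

# pop removes 'rank' from data and returns it, so there is no separate
# drop step to re-run out of order
data = data.join(pd.get_dummies(data.pop('rank'), dtype=int))
print(data)
```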
I tried your code and it works fine. You may need to rerun the previous cells, including the one that loads the data - dropping rank a second time without reloading raises exactly this KeyError.