I am relatively new to proc optmodel and have been struggling with syntax/structure. I was able to get help once before and am stuck again.
Here is my dataset:
data have;
input NAME $ TEAM $ LEAD GRADE XXX MIN MAX YYY RATE;
cards;
HAL A 1 1 50 45 55 100 1.1
SAL A 0 2 55 0 9999 200 1
KIM A 0 3 70 0 9999 50 1.4
JIM B 1 2 100 90 110 300 .95
GIO B 0 3 120 0 9999 50 1
CAL B 0 4 130 0 9999 20 .9
TOM C 1 1 2 1 5 20 .7
SUE C 0 3 5 0 9999 10 .5
VAL D 1 7 20 15 25 100 .6
WHO D 0 4 10 0 9999 10 .9
;
run;
Here are the specifics:
1. Only the "team lead" has any meaningful constraints.
2. However, the other members of the team will be adjusted accordingly. The value of XXX will be ten percent lower or higher than the team lead's for each unit of difference in grade from the team lead. So, if HAL's NEW_XXX is 50 (stays the same), then SAL's will be 10% higher than HAL's (2 is 1 unit greater than 1), which is 55. KIM's NEW_XXX is 60, since this is twenty percent higher than HAL's (3 is 2 units greater than 1). Similarly, WHO's NEW_XXX will be 30% lower than VAL's.
Does that make sense?
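To make the rule concrete, here is a minimal Python sketch of the arithmetic only (it is not OPTMODEL code). The function name `member_value` and the `step_pct` parameter are made up for illustration; the inputs are the grades from the dataset above, and the lead values assume each lead stays where it is.

```python
def member_value(lead_value, lead_grade, member_grade, step_pct=10):
    """Scale the lead's value by step_pct percent per grade of difference."""
    diff = member_grade - lead_grade
    # Work in whole percents so the arithmetic stays exact for these inputs.
    return lead_value * (100 + step_pct * diff) / 100

# Team A, assuming HAL (grade 1) stays at 50
sal = member_value(50, 1, 2)   # one grade above the lead
kim = member_value(50, 1, 3)   # two grades above the lead
# Team D, assuming VAL (grade 7) stays at 20
who = member_value(20, 7, 4)   # three grades below the lead
print(sal, kim, who)
```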
Below is what I have so far, which is the skeleton from a similar project.
proc optmodel;
*set variables and inputs;
set<string>NAME;
string TEAM{NAME};
number LEAD{NAME};
number GRADE{NAME};
number XXX{NAME};
number MIN{NAME};
number MAX{NAME};
number YYY{NAME};
number RATE{NAME};
set TEAMS = setof{i in NAME} TEAM[i];
set NAMEperTEAM{gi in TEAMS} = {i in NAME: TEAM[i] = gi};
var NEW_XXX{i in NAME}>=MIN[i]<=MAX[i];
*read data into procedure;
read data have into
NAME=[NAME]
TEAM
LEAD
GRADE
XXX
MIN
MAX
YYY
RATE;
*state function to optimize;
max metric=sum{gi in TEAMS}
sum{i in NAMEperTEAM[gi]}
(NEW_XXX[i])*(1-(NEW_XXX[i]-XXX[i])*RATE[i]/XXX[i])*YYY[i];
expand;
solve;
*write output dataset;
create data results
from [NAME]={NAME}
TEAM
LEAD
GRADE
XXX
NEW_XXX
MIN
MAX
RATE
YYY;
*write results to window;
print NEW_XXX metric;
quit;
If I understand this correctly, you need to set the non-lead members' NEW_XXX variables with an equality constraint. That leaves only the team leads' NEW_XXX variables free for the optimization.
Let me know if this is what you are trying to accomplish.
Here's how I did it:
proc optmodel;
*set variables and inputs;
set<string> NAME;
string TEAM{NAME};
number LEAD{NAME};
number GRADE{NAME};
number XXX{NAME};
number MIN{NAME};
number MAX{NAME};
number YYY{NAME};
number RATE{NAME};
*read data into procedure;
read data have into
NAME=[NAME]
TEAM
LEAD
GRADE
XXX
MIN
MAX
YYY
RATE;
set TEAMS = setof{i in NAME} TEAM[i];
set NAMEperTEAM{gi in TEAMS} = {i in NAME: TEAM[i] = gi};
/*Helper array that gives me the team leader for each team*/
str LEADS{TEAMS};
for {i in NAME: LEAD[i] = 1} do;
LEADS[TEAM[i]] = i;
end;
var NEW_XXX{i in NAME} init XXX[i] >=MIN[i]<=MAX[i];
*state function to optimize;
max metric=sum{gi in TEAMS}(
sum{i in NAMEperTEAM[gi]} (
(NEW_XXX[i])*(1-(NEW_XXX[i]-XXX[i])*RATE[i]/XXX[i])*YYY[i]
)
);
/*Constrain the non-lead members*/
con NonLeads{i in NAME: LEAD[i] = 0}: NEW_XXX[i] = (1 + (GRADE[i] - GRADE[LEADS[TEAM[i]]]) * 0.1) * NEW_XXX[LEADS[TEAM[i]]] ;
expand;
solve;
*write output dataset;
create data results
from [NAME]={NAME}
TEAM
LEAD
GRADE
XXX
NEW_XXX
MIN
MAX
RATE
YYY;
*write results to window;
print new_xxx metric;
quit;
We make phones. We have selling price, production cost, profit.
The goal is to maximize profits.
The following components are required to assemble each phone.
Maximum quantity of components.
Orders (this many phones were ordered from us and sold):
Here is my mod file:
set PHONE;
set COMPONENTS;
param price {PHONE} >= 0;
param cost {PHONE} >= 0;
param maxComponents {COMPONENTS} >= 0;
param ordered {PHONE} >= 0;
param matrix {COMPONENTS, PHONE}; #The amount of components needed to make a particular phone.
var x {PHONE} >= 0; # Number of manufactured telephones.
maximize profit: sum {i in PHONE} ( ordered[i] * price[i] - x[i] * cost[i] );
subject to min_manufacture {i in PHONE}:
x[i] >= ordered[i]; # We must produce a minimum of what is ordered
subject to component {i in COMPONENTS}:
sum {j in PHONE} matrix[i,j] * x[j] <= maxComponents[i]; # The number of components used must not exceed the maximum.
subject to min_quantity {i in COMPONENTS, l in PHONE}:
sum {j in PHONE} matrix[i,j] * x[j] >= matrix[i,l]; # Minimum quantity used per component if we manufacture at least one phone. For example, phone 3 requires at least 2 units of component 5.
and dat file:
set PHONE := 1 2 3 4 5;
set COMPONENTS:= 1 2 3 4 5 6 7;
param price :=
1 450
2 120
3 500
4 390
5 100;
param cost :=
1 370
2 90
3 400
4 320
5 70;
param maxComponents :=
1 28
2 20
3 8
4 30
5 47
6 27
7 15;
param ordered :=
1 3
2 5
3 5
4 0
5 10;
param matrix: 1 2 3 4 5 :=
1 1 1 0 0 0
2 1 1 0 0 0
3 1 0 0 0 0
4 1 0 1 1 0
5 0 0 2 1 1
6 0 0 2 1 0
7 0 0 1 1 0;
The problem is that if, for example, the maximum quantity of component 6 is three and the maximum quantity of component 7 is two, then 1.5 units of phone 3 are produced, which is impossible. The quantities of components 4, 5, 6 and 7 used for phone 3 are then 1.5, 3, 3 and 1.5, which is also impossible.
How do I get only an integer solution?
Because if I declare the variable x as an integer, I get zero for everything.
My run file:
model phone.mod;
data phone.dat;
option presolve 0;
option solver cplex;
solve;
display profit, x;
display {i in COMPONENTS, j in PHONE} matrix[i,j] * x[j];
You need to declare the relevant variables as integer, like so:
var x {PHONE} >= 0 integer;
Some solvers are not able to deal with integer constraints and may ignore that constraint (with a warning message) but CPLEX should be fine.
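As a rough illustration of why the continuous model returns 1.5, here is a small Python sketch that looks at phone 3 alone with the two caps mentioned in the question (three of component 6, two of component 7). It ignores orders and the other phones, and the helper name `max_units` is hypothetical; it only contrasts the fractional bound with the integer one.

```python
def max_units(caps, needs, integer=True):
    """Largest number of phones that fits within every component cap."""
    ratios = []
    for cap, need in zip(caps, needs):
        if need > 0:  # components the phone does not use impose no limit
            ratios.append(cap // need if integer else cap / need)
    return min(ratios)

caps = [3, 2]    # caps for components 6 and 7, as in the question's example
needs = [2, 1]   # phone 3 uses 2 of component 6 and 1 of component 7

print(max_units(caps, needs, integer=False))  # bound without integrality
print(max_units(caps, needs, integer=True))   # bound with integrality
```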
This post follows this one: SAS sum observations not in a group, by group
My minimal example there was sadly a bit too minimal, and I wasn't able to apply the answers to my data.
Here is a complete case example, what I have is :
data have;
input group1 group2 $ group3 $ value;
datalines;
1 A X 2
1 A X 4
1 A Y 1
1 A Y 3
1 B Z 2
1 B Z 1
1 C Y 1
1 C Y 6
1 C Z 7
2 A Z 3
2 A Z 9
2 A Y 2
2 B X 8
2 B X 5
2 B X 5
2 B Z 7
2 C Y 2
2 C X 1
;
run;
For each group, I want a new variable "sum" with the sum of all values in the column for the same sub groups (group1 and group2), except for the group (group3) the observation is in.
data want;
input group1 group2 $ group3 $ value sum;
datalines;
1 A X 2 8
1 A X 4 6
1 A Y 1 9
1 A Y 3 7
1 B Z 2 1
1 B Z 1 2
1 C Y 1 13
1 C Y 6 8
1 C Z 7 7
2 A Z 3 11
2 A Z 9 5
2 A Y 2 12
2 B X 8 17
2 B X 5 20
2 B X 5 20
2 B Z 7 18
2 C Y 2 1
2 C X 1 2
;
run;
My goal is to use either datasteps or proc sql (doing it on around 30 millions observations and proc means and such in SAS seems slower than those on previous similar computations).
My issue with the solutions provided in the linked post is that they use the total value of the column, and I don't know how to change this to use the total within the sub group.
Any idea please?
A SQL solution will join all data to an aggregating select:
proc sql;
create table want as
select have.group1, have.group2, have.group3, have.value
, aggregate.sum - value as sum
from
have
join
(select group1, group2, sum(value) as sum
from have
group by group1, group2
) aggregate
on
aggregate.group1 = have.group1
& aggregate.group2 = have.group2
;
quit;
SQL can be slower than a hash solution, but more people understand SQL than understand a SAS DATA step that uses hashes (which can be faster than SQL).
data want2;
if 0 then set have; * prep pdv;
declare hash sums (suminc:'value');
sums.defineKey('group1', 'group2');
sums.defineDone();
do while (not hash_loaded);
set have end=hash_loaded;
sums.ref(); * adds value to internal sum of hash data record;
end;
do while (not last_have);
set have end=last_have;
sums.sum(sum:sum); * retrieve group sum.;
sum = sum - value; * subtract from group sum;
output;
end;
stop;
run;
The SAS documentation touches on SUMINC and has some examples.
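For readers who do not use SAS hashes, the same two-pass idea can be sketched in Python with a dictionary: the first pass accumulates a total per (group1, group2), the second subtracts each row's own value. This is only an illustration of the logic, not a replacement for the DATA step; the function name and the sample rows (the first six rows of `have`) are chosen for the example.

```python
from collections import defaultdict

def sum_excluding_self(rows):
    """Return rows extended with the (group1, group2) total minus the row's own value."""
    totals = defaultdict(int)
    for g1, g2, _g3, value in rows:          # pass 1: accumulate group totals
        totals[(g1, g2)] += value
    return [
        (g1, g2, g3, value, totals[(g1, g2)] - value)   # pass 2: subtract own value
        for g1, g2, g3, value in rows
    ]

rows = [
    (1, "A", "X", 2), (1, "A", "X", 4), (1, "A", "Y", 1), (1, "A", "Y", 3),
    (1, "B", "Z", 2), (1, "B", "Z", 1),
]
for row in sum_excluding_self(rows):
    print(row)
```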
The question does not address this concept:
For each row compute the tier 2 sum that excludes the tier 3 this row is in
A hash-based solution would require tracking both the two-level and the three-level sums:
data want2;
if 0 then set have; * prep pdv;
declare hash T2 (suminc:'value'); * hash for two (T)iers;
T2.defineKey('group1', 'group2'); * one hash record per combination of group1, group2;
T2.defineDone();
declare hash T3 (suminc:'value'); * hash for three (T)iers;
T3.defineKey('group1', 'group2', 'group3'); * one hash record per combination of group1, group2, group3;
T3.defineDone();
do while (not hash_loaded);
set have end=hash_loaded;
T2.ref(); * adds value to internal sum of hash data record;
T3.ref();
end;
T2_cardinality = T2.num_items;
T3_cardinality = T3.num_items;
put 'NOTE: |T2| = ' T2_cardinality;
put 'NOTE: |T3| = ' T3_cardinality;
do while (not last_have);
set have end=last_have;
T2.sum(sum:t2_sum);
T3.sum(sum:t3_sum);
sum = t2_sum - t3_sum;
output;
end;
stop;
drop t2_: t3:;
run;
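The two-tier variant can be sketched the same way in Python: keep one dictionary keyed by (group1, group2) and another keyed by (group1, group2, group3), then take the difference per row. Again this is only a sketch of the logic with made-up names, run on the first six rows of `have`.

```python
from collections import defaultdict

def sum_excluding_own_group3(rows):
    """Per row: total of (group1, group2) minus the total of the row's own group3."""
    t2 = defaultdict(int)   # sums per (group1, group2)
    t3 = defaultdict(int)   # sums per (group1, group2, group3)
    for g1, g2, g3, value in rows:
        t2[(g1, g2)] += value
        t3[(g1, g2, g3)] += value
    return [
        (g1, g2, g3, value, t2[(g1, g2)] - t3[(g1, g2, g3)])
        for g1, g2, g3, value in rows
    ]

sample = [
    (1, "A", "X", 2), (1, "A", "X", 4), (1, "A", "Y", 1), (1, "A", "Y", 3),
    (1, "B", "Z", 2), (1, "B", "Z", 1),
]
for row in sum_excluding_own_group3(sample):
    print(row)
```

Note that a sub group containing a single group3 value (such as 1/B above) ends up with 0 for every row.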
I would be more than appreciative for some help here, as I have been having some serious problems with this.
Background:
I have a list of unique records. For each record I have a monotonically increasing pattern (either A, B or C), and a development position (1 to 5) assigned to it.
So each of the 3 patterns is set out in five fields representing the development period.
Problem:
I need to retrieve the percentages relating to the relevant development periods, from different fields for each row. It should be in a single column called "Output".
Example:
Apologies, not sure how to attach a table here, but the fields are below, the table is a transpose of these fields.
ID - (1,2,3,4,5)
Pattern - (A, B, C, A, C)
Dev - (1,5,3,4,2)
1 - (20%, 15%, 25%, 20%, 25%)
2 - (40%, 35%, 40%, 40%, 40%)
3 - (60%, 65%, 60%, 60%, 60%)
4 - (80%, 85%, 65%, 80%, 65%)
5 - (100%, 100%, 100%, 100%, 100%)
Output - (20%, 100%, 60%, 80%, 40%)
In MS Excel, I could simply use a HLOOKUP or OFFSET function to do this. But how do I do this in Access? The best I have come up with so far is Output: Eval([Category]) but this doesn't seem to achieve what I want which is to select the "Dev" field, and treat this as a field when building an expression.
In practice, I have more than 100 development periods to play with, and over 800 different patterns, so "switch" methods can't work here I think.
Thanks in advance,
alch84
Assuming that
[ID] is a unique column (primary key), and
the source column for [Output] only depends on the value of [Dev]
then this seems to work:
UPDATE tblAlvo SET Output = DLOOKUP("[" & Dev & "]", "tblAlvo", "ID=" & ID)
Before:
ID Pattern Dev 1 2 3 4 5 Output
-- ------- --- -- -- -- -- --- ------
1 A 1 20 40 60 80 100
2 B 5 15 35 65 85 100
3 C 3 25 40 60 65 100
4 A 4 20 40 60 80 100
5 C 2 25 40 60 65 100
After:
ID Pattern Dev 1 2 3 4 5 Output
-- ------- --- -- -- -- -- --- ------
1 A 1 20 40 60 80 100 20
2 B 5 15 35 65 85 100 100
3 C 3 25 40 60 65 100 60
4 A 4 20 40 60 80 100 80
5 C 2 25 40 60 65 100 40
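The idea behind the DLOOKUP call is that the value of [Dev] is used as the name of the column to read from the same row. A minimal Python sketch of that lookup, using the sample table above held as a list of dictionaries (the function name `pick_output` is made up for the example):

```python
rows = [
    {"ID": 1, "Pattern": "A", "Dev": 1, "1": 20, "2": 40, "3": 60, "4": 80, "5": 100},
    {"ID": 2, "Pattern": "B", "Dev": 5, "1": 15, "2": 35, "3": 65, "4": 85, "5": 100},
    {"ID": 3, "Pattern": "C", "Dev": 3, "1": 25, "2": 40, "3": 60, "4": 65, "5": 100},
    {"ID": 4, "Pattern": "A", "Dev": 4, "1": 20, "2": 40, "3": 60, "4": 80, "5": 100},
    {"ID": 5, "Pattern": "C", "Dev": 2, "1": 25, "2": 40, "3": 60, "4": 65, "5": 100},
]

def pick_output(row):
    """Read the column whose name equals the row's Dev value."""
    return row[str(row["Dev"])]

outputs = [pick_output(row) for row in rows]
print(outputs)
```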
I have an optimization exercise I am trying to work through and am stuck again on the syntax. Below is my attempt, and I'd really like a thorough explanation of the syntax in addition to the solution code. I think it's the specific index piece that I am having trouble with.
The problem:
I have an item that I wish to sell out of within ten weeks. I have a historical trend and wish to alter that trend by lowering price. I want maximum margin dollars. The below works, but I wish to add two constraints and can't sort out the syntax. I have spaces for these two constraints in the code, with my brief explanation of what I think they may look like. Here is a more detailed explanation of what I need each constraint to do.
inv_cap=There is only so much inventory available at each location. I wish to sell it all. For location 1 it is 800, location 2 it is 1200. The sum of the column FRC_UNITS should equal this amount, but cannot exceed it.
price_down_or_same=The price cannot bounce around, so each week's price must be less than or equal to the previous week's price. So, price(i)<=price(i-1) where i=week.
Here is my attempt. Thank you in advance for assistance.
*read in data;
data opt_test_mkdown_raw;
input
ITM_NBR
ITM_DES_TXT $
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
cards;
1 stuff 1 1 300 1.2 6 10 800
1 stuff 1 2 150 1.2 6 10 800
1 stuff 1 3 100 1.2 6 10 800
1 stuff 1 4 60 1.2 6 10 800
1 stuff 1 5 40 1.2 6 10 800
1 stuff 1 6 20 1.2 6 10 800
1 stuff 1 7 10 1.2 6 10 800
1 stuff 1 8 10 1.2 6 10 800
1 stuff 1 9 5 1.2 6 10 800
1 stuff 1 10 1 1.2 6 10 800
1 stuff 2 1 400 1.1 6 9 1200
1 stuff 2 2 200 1.1 6 9 1200
1 stuff 2 3 100 1.1 6 9 1200
1 stuff 2 4 100 1.1 6 9 1200
1 stuff 2 5 100 1.1 6 9 1200
1 stuff 2 6 50 1.1 6 9 1200
1 stuff 2 7 20 1.1 6 9 1200
1 stuff 2 8 20 1.1 6 9 1200
1 stuff 2 9 5 1.1 6 9 1200
1 stuff 2 10 3 1.1 6 9 1200
;
run;
data opt_test_mkdown_raw;
set opt_test_mkdown_raw;
ITM_LCT_WK=cats(ITM_NBR, LCT_NBR, WEEK);
ITM_LCT=cats(ITM_NBR, LCT_NBR);
run;
proc optmodel;
*set variables and inputs;
set<string> ITM_LCT_WK;
number ITM_NBR{ITM_LCT_WK};
string ITM_DES_TXT{ITM_LCT_WK};
string ITM_LCT{ITM_LCT_WK};
number LCT_NBR{ITM_LCT_WK};
number WEEK{ITM_LCT_WK};
number LY_UNITS{ITM_LCT_WK};
number ELAST{ITM_LCT_WK};
number COST{ITM_LCT_WK};
number PRICE{ITM_LCT_WK};
number TOTAL_INV{ITM_LCT_WK};
*read data into procedure;
read data opt_test_mkdown_raw into
ITM_LCT_WK=[ITM_LCT_WK]
ITM_NBR
ITM_DES_TXT
ITM_LCT
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
var NEW_PRICE{i in ITM_LCT_WK};
impvar FRC_UNITS{i in ITM_LCT_WK}=(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
con ceiling_price {i in ITM_LCT_WK}: NEW_PRICE[i]<=PRICE[i];
/*con inv_cap {j in ITM_LCT}: sum{i in ITM_LCT_WK}=I want this to be 800 for location 1 and 1200 for location 2;*/
con supply_last {i in ITM_LCT_WK}: FRC_UNITS[i]>=LY_UNITS[i];
/*con price_down_or_same {j in ITM_LCT} : NEW_PRICE[week]<=NEW_PRICE[week-1];*/
*state function to optimize;
max margin=sum{i in ITM_LCT_WK}
(NEW_PRICE[i]-COST[i])*(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
/*expand;*/
solve;
*write output dataset;
create data results_MKD_maxmargin
from
[ITM_LCT_WK]={ITM_LCT_WK}
ITM_NBR
ITM_DES_TXT
LCT_NBR
WEEK
LY_UNITS
FRC_UNITS
ELAST
COST
PRICE
NEW_PRICE
TOTAL_INV;
*write results to window;
print
/*NEW_PRICE */
margin;
quit;
The main difficulty is that in your application, decisions are indexed by (Item,Location) pairs and Weeks, but in your code you have merged (Item,Location,Week) triplets. I rather like that use of the data step, but the result in this example is that your code is unable to refer to specific weeks and to specific pairs.
The fix that changes your code the least is to add these relationships by using defined sets and inputs that OPTMODEL can compute for you. Then you will know which triplets refer to each combination of (Item,Location) pair and week:
/* This code creates a set version of the Item x Location pairs
that you already have as strings */
set ITM_LCTS = setof{ilw in ITM_LCT_WK} itm_lct[ilw];
/* For each Item x Location pair, define a set of which
Item x Location x Week entries refer to that Item x Location */
set ILWperIL{il in ITM_LCTS} = {ilw in ITM_LCT_WK: itm_lct[ilw] = il};
With this relationship you can add the other two constraints.
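If the set syntax is unfamiliar, the two definitions amount to "collect the distinct pair keys" and "group the triplet keys by their pair key". A rough Python equivalent, using a few hypothetical keys in the same concatenated style as ITM_LCT_WK and ITM_LCT:

```python
from collections import defaultdict

def group_by_pair(itm_lct):
    """Map each Item x Location key to the Item x Location x Week keys that belong to it."""
    groups = defaultdict(list)
    for ilw, il in itm_lct.items():
        groups[il].append(ilw)
    return dict(groups)

# Hypothetical sample: triplet key -> pair key (item 1, locations 1 and 2, weeks 1 and 2)
itm_lct = {"111": "11", "112": "11", "121": "12", "122": "12"}

itm_lcts = set(itm_lct.values())        # plays the role of ITM_LCTS
ilw_per_il = group_by_pair(itm_lct)     # plays the role of ILWperIL
print(sorted(itm_lcts), ilw_per_il)
```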
I left your code as is, but applied to the new code a convention I find useful, especially when there are similar names like itm_lct and ITM_LCTS:
sets as all caps;
input parameters start with lowercase;
output (vars, impvars, and constraints) start with Uppercase
Here is the new OPTMODEL code:
proc optmodel;
*set variables and inputs;
set<string> ITM_LCT_WK;
number ITM_NBR{ITM_LCT_WK};
string ITM_DES_TXT{ITM_LCT_WK};
string ITM_LCT{ITM_LCT_WK};
number LCT_NBR{ITM_LCT_WK};
number WEEK{ITM_LCT_WK};
number LY_UNITS{ITM_LCT_WK};
number ELAST{ITM_LCT_WK};
number COST{ITM_LCT_WK};
number PRICE{ITM_LCT_WK};
number TOTAL_INV{ITM_LCT_WK};
*read data into procedure;
read data opt_test_mkdown_raw into
ITM_LCT_WK=[ITM_LCT_WK]
ITM_NBR
ITM_DES_TXT
ITM_LCT
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
var NEW_PRICE{i in ITM_LCT_WK} <= price[i];
impvar FRC_UNITS{i in ITM_LCT_WK} =
(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i]) * LY_UNITS[i];
/* Moved to bound:
con ceiling_price {i in ITM_LCT_WK}: NEW_PRICE[i] <= PRICE[i];
*/
con supply_last{i in ITM_LCT_WK}: FRC_UNITS[i] >= LY_UNITS[i];
/* This code creates a set version of the Item x Location pairs
that you already have as strings */
set ITM_LCTS = setof{ilw in ITM_LCT_WK} itm_lct[ilw];
/* For each Item x Location pair, define a set of which
Item x Location x Week entries refer to that Item x Location */
set ILWperIL{il in ITM_LCTS} = {ilw in ITM_LCT_WK: itm_lct[ilw] = il};
/* I assume that for each item and location
the inventory is the same for all weeks for convenience,
i.e., that is not a coincidence */
num inventory{il in ITM_LCTS} = max{ilw in ILWperIL[il]} total_inv[ilw];
con inv_cap {il in ITM_LCTS}:
sum{ilw in ILWperIL[il]} Frc_Units[ilw] = inventory[il];
num lastWeek = max{ilw in ITM_LCT_WK} week[ilw];
/* Concatenating indexes is not the prettiest, but gets the job done here*/
con Price_down_or_same {il in ITM_LCTS, w in 2 .. lastWeek}:
New_Price[il || w] <= New_Price[il || (w - 1)];
*state function to optimize;
max margin=sum{i in ITM_LCT_WK}
(NEW_PRICE[i]-COST[i])*(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
expand;
solve;
*write output dataset;
create data results_MKD_maxmargin
from
[ITM_LCT_WK]={ITM_LCT_WK}
ITM_NBR
ITM_DES_TXT
LCT_NBR
WEEK
LY_UNITS
FRC_UNITS
ELAST
COST
PRICE
NEW_PRICE
TOTAL_INV;
*write results to window;
print
NEW_PRICE FRC_UNITS
margin
;
quit;
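To sanity-check a solution outside of OPTMODEL, the forecast formula and the price rule can be restated in a few lines of Python. This is only a checking sketch with hypothetical helper names; the formula is the one used for FRC_UNITS above, and the sample numbers come from week 1 of location 1.

```python
def forecast_units(new_price, price, elast, ly_units):
    """Same formula as FRC_UNITS: last year's units scaled by the relative price change."""
    return (1 - (new_price - price) * elast / price) * ly_units

def price_never_rises(prices):
    """True when each week's price is less than or equal to the previous week's."""
    return all(later <= earlier for earlier, later in zip(prices, prices[1:]))

# Week 1 of location 1: PRICE 10, ELAST 1.2, LY_UNITS 300
print(forecast_units(10, 10, 1.2, 300))   # unchanged price keeps last year's units
print(forecast_units(9, 10, 1.2, 300))    # a 10% cut lifts units by about 12%
print(price_never_rises([10, 9, 9, 8]))
print(price_never_rises([10, 9, 9.5]))
```

Summing `forecast_units` over the weeks of one location and comparing the total with TOTAL_INV gives a quick check of inv_cap as well.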
I have this sort of table :
Cluster Age FR
8 70 153
...
What I want is a table giving, for each Cluster and each Age, the mean of FR within each decile (10% quantile group). It should look like:
Cluster Age Quantile FR
1 1 10% 12
1 1 20% 14
1 1 30% 16
1 1 40% 18
1 1 50% 20
1 1 60% 22
1 1 70% 24
1 1 80% 26
1 1 90% 28
1 1 100% 30
1 2 10% 13
1 2 20% 15
1 2 30% 17
I tried doing this with proc univariate but with no success...
proc univariate data=etude.Presta_cluster_panier noprint;
var FR;
output out=pctls pctlpre=P_ pctlpts=0 to 100 by 10;
run;
This can be accomplished in two steps through the use of proc rank & proc means.
proc rank data=etude.Presta_cluster_panier out=outranks groups=10;
var FR;
ranks Quantile;
by Cluster Age;
run;
proc means data=outranks;
var FR;
ways 3;
class Cluster Age Quantile;
output out=outmean mean=; /* keep only the mean of FR for each Cluster, Age, Quantile */
run;
You will need to first obtain your quantiles by cluster and age. Then remerge with your master dataset, assign groups depending on your quantiles, and finally compute the mean by cluster, age and quantile.
It is not possible in one step.
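For a single Cluster and Age combination, the "rank into ten groups, then average each group" idea can be sketched in Python as below. The group formula floor(rank * groups / (n + 1)) is assumed here to mirror what PROC RANK does with GROUPS=; tie handling is ignored, and the function name and sample values are made up for the example.

```python
from collections import defaultdict
from math import floor

def decile_means(values, groups=10):
    """Assign each value to a rank group (0 .. groups-1) and average each group."""
    ordered = sorted(values)
    n = len(ordered)
    buckets = defaultdict(list)
    for rank, value in enumerate(ordered, start=1):
        # Assumed grouping rule; ties are not averaged in this sketch.
        buckets[floor(rank * groups / (n + 1))].append(value)
    return {group: sum(vals) / len(vals) for group, vals in sorted(buckets.items())}

means = decile_means(range(1, 21))   # hypothetical FR values 1..20
print(means)
```

Running the function once per (Cluster, Age) pair corresponds to the BY statement in the proc rank step.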