SAS table with percentage attached - sql

I am trying to create a matrix that shows both counts and percentages. I was given two tables:
id cc
1 2
1 5
1 40
2 55
2 2
2 130
2 177
3 20
3 55
3 40
4 30
4 100
id Description
1 Dell
1 Lenovo
1 HP
2 Sony
2 Dell
2 Acer
2 Other
3 Fujitsu
3 Sony
3 HP
4 Apple
4 Asus
I have already created a table that looks like the following, using the code shown after it:
CC CC1 CC2… …CC177
1 264 5 0
2 0 132 6
…
…
177 2 1 692
data RESULT;
set id_CC;
by id;
retain CC1-CC177; /*CC range from 1 to 177*/
array CC_List(177) CC1-CC177;
if first.id then do i=1 to 177;
CC_List(i)=0; /* reset the flags of the declared array CC_List for each new id */
end;
CC_List(CC)=1;
if last.id then output;
run;
ods output sscp=coocs;
ods select sscp;
proc corr data=RESULT sscp;
var CC1-CC177;
run;
/*proc print data=coocs;*/
/*run;*/
/**/
In other words, the matrix counts how many ids that have CC1 also have CC2, ..., CC177, and so on. Now I am wondering whether it is possible to add a percentage next to each number. For instance, if CC1*CC1 = 264 (100%), then CC1*CC2 = 5/264 = 1.9%.
The other table I am trying to create puts the description of each CC on the matrix. Each CC number stands for one brand: 2=Dell, 177=Other, etc.
If I want to change CC1, CC2, ... to character labels, how do I modify the arrays? Eventually, I would like my table to look like this:
Description Dell Lenovo HP Sony Acer Other Fujitsu Sony
Dell 264 (100%)
Lenovo
HP 50 (10%)
Sony
Acer
Other
Fujitsu
Sony
In other words, how many people who have Dell also have Acer, Sony, Other, etc.?

Renaming has been asked on here before, so I'll leave that one for now.
For the percentages you'll need to create a character variable. To calculate the percent, use the automatic variable _n_, which is the row number and therefore also the index of the diagonal element that serves as the denominator. Then use a concatenation function such as cats to build the variable in the format N(PP%).
data want;
set have;
array cc(177) cc1-cc177;
array dd(177) $ 20 dd1-dd177; /* the default length of 8 would truncate values such as 264(100.0%) */
do i=1 to 177;
percent=cc(i)/cc(_n_);
dd(i)=cats(cc(i), "(", put(percent, percent8.1), ")");
end;
run;

Following Reeza's answer, I tried:
data RESULT_PRE;
set ID_CC;
by ID;
retain CC1-CC177;
array CC_LIST(177) CC1-CC177;
array DD_LIST(177) $ DD1-DD177;
if first.id then do i=1 to 177;
CC_LIST(i)=0;
end;
CC_LIST(CC)=1;
if last.id then output;
run;
data RESULT;
set RESULT_PRE;
array CC_LIST(177) CC1-CC177;
array DD_LIST(177) $ DD1-DD177;
do i=1 to 177;
percent=CC_LIST(i)/CC_LIST(_n_);
DD_LIST(i)=cats(CC_LIST(i), "(", put(percent, percent8.1), ")");
end;
run;
The log shows "Array subscript out of range at line xx column xx" and "ERROR 68-185: The function CC is unknown, or cannot be accessed."
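One possible reading of the out-of-range error: the second step reads RESULT_PRE, which has one row per id, so _n_ eventually exceeds 177, whereas the percentage step is meant to run on the 177-row co-occurrence matrix (the coocs dataset from ODS OUTPUT). The posted code contains no cc(...) call, so the "function CC" message presumably comes from a different version of the step in which no array named cc was declared. The sketch below uses a hypothetical 3x3 matrix named coocs_demo in place of coocs; its values, the Variable column, and the guard for a zero diagonal are assumptions for illustration.

```sas
/* Hypothetical 3x3 co-occurrence matrix standing in for COOCS */
data coocs_demo;
    input Variable $ CC1 CC2 CC3;
    datalines;
CC1 264 5 0
CC2 5 132 6
CC3 0 6 692
;
run;

data want_demo;
    set coocs_demo;
    array cc_list(3) CC1-CC3;
    /* $ 20 leaves room for strings such as 264(100.0%) */
    array dd_list(3) $ 20 DD1-DD3;
    /* Row _n_ holds the diagonal count of CC<_n_>, used as the denominator */
    denom = cc_list(_n_);
    do i = 1 to dim(cc_list);
        if denom > 0 then
            dd_list(i) = cats(cc_list(i), "(", put(cc_list(i) / denom, percent8.1), ")");
        else
            dd_list(i) = cats(cc_list(i), "(n/a)");
    end;
    drop i denom;
run;

proc print data=want_demo noobs;
    var Variable DD1-DD3;
run;
```

For the real data, the same step would read coocs and use 177 in place of 3, assuming coocs has exactly one row per CC variable in CC1-CC177 order.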

Related

How do I add a key to a row based on its "group"?

I have a data set like this:
a 10
a 13
a 14
b 15
b 44
c 64
c 32
d 12
I want to write a PROC SQL statement or DATA step that will yield this:
a 10 1
a 13 1
a 14 1
b 15 2
b 44 2
c 64 3
c 32 3
d 12 4
How can I do this?
DATA TEST;
INPUT id $ value ;
DATALINES;
a 10
a 13
a 14
b 15
b 44
c 64
c 32
d 12
;
RUN;
Sort your data if needed:
proc sort data=test;
by id;
run;
Then:
data want;
set test;
retain key;
by id;
if _n_ = 1 then key = 0;
if first.id then key = key + 1;
run;
The retain statement keeps the value of key across iterations of the data step instead of resetting it to missing.
Then, whenever a new id appears, we add 1 to key.
Alternatively as stated by Keith, you could use this simplified data step to do the job:
data want;
set test;
by id;
if first.id then key + 1;
run;
I'll leave both versions here for reference: the first one is easier to understand, and the second, from Keith's comment, is a lot cleaner. It works because the sum statement key + 1 implicitly retains key and initializes it to 0.

How do you mark unique occurrences in a pattern given that value are unique when occurring simultaneously and not when they come separately? [closed]

Closed 7 years ago.
Suppose my data looks like this
student article.bought
1 A pen
2 B pencil
3 V book
4 A pen
5 A inkbottle
6 B pen
7 B pencil
8 B pencil
9 V book
10 Z marker
11 A inkbottle
12 V book
13 V pen
14 V book
I need to number each student's distinct runs of articles, preferably in a separate column, like this:
student article.bought Occurences
1 A pen 1
2 B pencil 1
3 V book 1
4 A pen 1 # as A is taking a pen again
5 A inkbottle 2 # 'A' changed from pen to ink bottle
6 B pen 2
7 B pencil 3 # though B took pencil before, this is different as he took a pen in between
8 B pencil 3
9 V book 1
10 Z marker 1
11 A inkbottle 2
12 V book 1
13 V pen 2
14 V book 3
In R, we can find changes in a student's selection by finding the difference, diff, of each subsequent value. When we take the cumulative sum, cumsum, of that logical index we get a running count of occurrences.
In the second line we coerce the factor variable article.bought to numeric and run the function from the first line using ave to group the function f by student.
f <- function(x) cumsum(c(FALSE, diff(x) != 0)) + 1
# factor() keeps this working even if article.bought was read in as character
df$Occurences <- with(df, ave(as.numeric(factor(article.bought)), student, FUN = f))
df
# student article.bought Occurences
# 1 A pen 1
# 2 B pencil 1
# 3 V book 1
# 4 A pen 1
# 5 A inkbottle 2
# 6 B pen 2
# 7 B pencil 3
# 8 B pencil 3
# 9 V book 1
# 10 Z marker 1
# 11 A inkbottle 2
# 12 V book 1
# 13 V pen 2
# 14 V book 3
In Excel:
1. Create an additional column [Original Sort Order] and enumerate from 1 to ...
2. Sort the table by student / original sort order.
3. Enter =IF(A2=A1,IF(B2=B1,D1,D1+1),1) in D2 and copy down.
4. Convert column D to values (copy, paste as ... Values).
5. Restore the original sort order.
If this is more than a one-off, use the same tactic to create a VBA script.
A shot with SAS:
data try00;
length student article $20;
infile datalines dlm=' ';
input student $ article $;
datalines;
A pen
B pencil
V book
A pen
A inkbottle
B pen
B pencil
B pencil
V book
Z marker
A inkbottle
V book
V pen
V book
;
data try01;
set try00;
pos=_n_;
run;
proc sort data=try01 out=try02; by student pos article; run;
proc sort data=try02 out=stud(keep=student) nodupkey; by student; run;
data shell;
length occurrence 8.;
set try02;
if _n_>0 then delete;
run;
%macro loopstudent();
data _null_; set stud end=eof; if eof then call symput("nstu",_n_); run;
%do i=1 %to &nstu;
data _null_; set stud; if _n_=&i then call symput("stud&i",student); run;
data thisstu;
set try02;
where student="&&stud&i";
dummyart=lag(article);
retain occurrence 0;
if dummyart ne article then occurrence=occurrence+1;
else occurrence=occurrence;
drop dummyart;
run;
proc append base=shell data=thisstu; run;
%end;
proc sort data=shell out=final; by pos; run;
%mend loopstudent; %loopstudent();
The dataset "final" holds the result.
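The per-student macro loop can probably be avoided. As a sketch of an alternative (not the original answerer's method), BY-group processing with the NOTSORTED option flags every change of article within a student, and a sum statement counts those changes. The dataset names occ and final_by are made up for this example; the input rows are the ones from the question.

```sas
data try00;
    length student article $20;
    input student $ article $;
    datalines;
A pen
B pencil
V book
A pen
A inkbottle
B pen
B pencil
B pencil
V book
Z marker
A inkbottle
V book
V pen
V book
;
run;

/* Remember the original row order */
data try01;
    set try00;
    pos = _n_;
run;

proc sort data=try01 out=try02;
    by student pos;
run;

data occ;
    set try02;
    /* NOTSORTED: first.article is set whenever the article (or student) changes */
    by student article notsorted;
    if first.student then occurrence = 0;
    /* The sum statement retains occurrence across rows */
    if first.article then occurrence + 1;
run;

/* Restore the original row order */
proc sort data=occ out=final_by;
    by pos;
run;
```

Because the rows are sorted by student and original position first, a student who returns to an earlier article after a different one starts a new run, which is the behaviour asked for in the question.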

Create a variable based on sum of two variables (one lag)

I have a data set like the one below, where the amount has dropped off, but the adjustment remains. For each row amount should be the sum of the previous amount and the adjustment. So, amount for observation 5 is 134 (124+10).
I have an answer which gets me the next value, but I need some sort of recursion to get me the rest of the way there. What am I missing? Thanks.
data have;
input amount adjust;
cards;
100 0
101 1
121 20
124 3
. 10
. 4
. 3
. 0
. 1
;
run;
data attempt;
set have;
x=lag1(amount);
if amount=. then amount=adjust+x;
run;
data want;
input amount adjust;
cards;
100 0
101 1
121 20
124 3
134 10
138 4
141 3
141 0
142 1
;
run;
EDIT:
Also trying something like this now, still not quite what I want.
%macro doodoo;
%do i = 1 %to 5;
data have;
set have;
/* if _n_=i+4 then*/
amount=lag1(amount)+adjust;
run;
%end;
%mend;
%doodoo;
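There is no need for LAG(); use RETAIN instead. In your attempt, LAG() returns the value amount had when the previous observation was read (missing for the rows you are trying to fill), not the value you just computed, so the chain breaks after one row. A retained variable carries the computed amount forward.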
data want ;
set have ;
retain previous ;
if amount = . then amount=sum(previous,adjust);
previous=amount ;
run;

SAS Proc Optmodel Constraint Syntax

I have an optimization exercise I am trying to work through and am stuck again on the syntax. Below is my attempt, and I'd really like a thorough explanation of the syntax in addition to the solution code. I think it's the specific index piece that I am having trouble with.
The problem:
I have an item that I wish to sell out of within ten weeks. I have a historical trend and wish to alter that trend by lowering price. I want maximum margin dollars. The below works, but I wish to add two constraints and can't sort out the syntax. I have spaces for these two constraints in the code, with my brief explanation of what I think they may look like. Here is a more detailed explanation of what I need each constraint to do.
inv_cap=There is only so much inventory available at each location. I wish to sell it all. For location 1 it is 800, location 2 it is 1200. The sum of the column FRC_UNITS should equal this amount, but cannot exceed it.
price_down_or_same=The price cannot bounce around, so each week's price must be less than or equal to the previous week's price. So, price(i)<=price(i-1) where i=week.
Here is my attempt. Thank you in advance for assistance.
*read in data;
data opt_test_mkdown_raw;
input
ITM_NBR
ITM_DES_TXT $
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
cards;
1 stuff 1 1 300 1.2 6 10 800
1 stuff 1 2 150 1.2 6 10 800
1 stuff 1 3 100 1.2 6 10 800
1 stuff 1 4 60 1.2 6 10 800
1 stuff 1 5 40 1.2 6 10 800
1 stuff 1 6 20 1.2 6 10 800
1 stuff 1 7 10 1.2 6 10 800
1 stuff 1 8 10 1.2 6 10 800
1 stuff 1 9 5 1.2 6 10 800
1 stuff 1 10 1 1.2 6 10 800
1 stuff 2 1 400 1.1 6 9 1200
1 stuff 2 2 200 1.1 6 9 1200
1 stuff 2 3 100 1.1 6 9 1200
1 stuff 2 4 100 1.1 6 9 1200
1 stuff 2 5 100 1.1 6 9 1200
1 stuff 2 6 50 1.1 6 9 1200
1 stuff 2 7 20 1.1 6 9 1200
1 stuff 2 8 20 1.1 6 9 1200
1 stuff 2 9 5 1.1 6 9 1200
1 stuff 2 10 3 1.1 6 9 1200
;
run;
data opt_test_mkdown_raw;
set opt_test_mkdown_raw;
ITM_LCT_WK=cats(ITM_NBR, LCT_NBR, WEEK);
ITM_LCT=cats(ITM_NBR, LCT_NBR);
run;
proc optmodel;
*set variables and inputs;
set<string> ITM_LCT_WK;
number ITM_NBR{ITM_LCT_WK};
string ITM_DES_TXT{ITM_LCT_WK};
string ITM_LCT{ITM_LCT_WK};
number LCT_NBR{ITM_LCT_WK};
number WEEK{ITM_LCT_WK};
number LY_UNITS{ITM_LCT_WK};
number ELAST{ITM_LCT_WK};
number COST{ITM_LCT_WK};
number PRICE{ITM_LCT_WK};
number TOTAL_INV{ITM_LCT_WK};
*read data into procedure;
read data opt_test_mkdown_raw into
ITM_LCT_WK=[ITM_LCT_WK]
ITM_NBR
ITM_DES_TXT
ITM_LCT
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
var NEW_PRICE{i in ITM_LCT_WK};
impvar FRC_UNITS{i in ITM_LCT_WK}=(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
con ceiling_price {i in ITM_LCT_WK}: NEW_PRICE[i]<=PRICE[i];
/*con inv_cap {j in ITM_LCT}: sum{i in ITM_LCT_WK}=I want this to be 800 for location 1 and 1200 for location 2;*/
con supply_last {i in ITM_LCT_WK}: FRC_UNITS[i]>=LY_UNITS[i];
/*con price_down_or_same {j in ITM_LCT} : NEW_PRICE[week]<=NEW_PRICE[week-1];*/
*state function to optimize;
max margin=sum{i in ITM_LCT_WK}
(NEW_PRICE[i]-COST[i])*(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
/*expand;*/
solve;
*write output dataset;
create data results_MKD_maxmargin
from
[ITM_LCT_WK]={ITM_LCT_WK}
ITM_NBR
ITM_DES_TXT
LCT_NBR
WEEK
LY_UNITS
FRC_UNITS
ELAST
COST
PRICE
NEW_PRICE
TOTAL_INV;
*write results to window;
print
/*NEW_PRICE */
margin;
quit;
The main difficulty is that in your application, decisions are indexed by (Item,Location) pairs and Weeks, but in your code you have merged (Item,Location,Week) triplets. I rather like that use of the data step, but the result in this example is that your code is unable to refer to specific weeks and to specific pairs.
The fix that changes your code the least is to add these relationships by using defined sets and inputs that OPTMODEL can compute for you. Then you will know which triplets refer to each combination of (Item,Location) pair and week:
/* This code creates a set version of the Item x Location pairs
that you already have as strings */
set ITM_LCTS = setof{ilw in ITM_LCT_WK} itm_lct[ilw];
/* For each Item x Location pair, define a set of which
Item x Location x Week entries refer to that Item x Location */
set ILWperIL{il in ITM_LCTS} = {ilw in ITM_LCT_WK: itm_lct[ilw] = il};
With this relationship you can add the other two constraints.
I left your code as is, but applied to the new code a convention I find useful, especially when there are similar names like itm_lct and ITM_LCTS:
sets are all caps;
input parameters start with lowercase;
outputs (vars, impvars, and constraints) start with Uppercase.
Here is the new OPTMODEL code:
proc optmodel;
*set variables and inputs;
set<string> ITM_LCT_WK;
number ITM_NBR{ITM_LCT_WK};
string ITM_DES_TXT{ITM_LCT_WK};
string ITM_LCT{ITM_LCT_WK};
number LCT_NBR{ITM_LCT_WK};
number WEEK{ITM_LCT_WK};
number LY_UNITS{ITM_LCT_WK};
number ELAST{ITM_LCT_WK};
number COST{ITM_LCT_WK};
number PRICE{ITM_LCT_WK};
number TOTAL_INV{ITM_LCT_WK};
*read data into procedure;
read data opt_test_mkdown_raw into
ITM_LCT_WK=[ITM_LCT_WK]
ITM_NBR
ITM_DES_TXT
ITM_LCT
LCT_NBR
WEEK
LY_UNITS
ELAST
COST
PRICE
TOTAL_INV;
var NEW_PRICE{i in ITM_LCT_WK} <= price[i];
impvar FRC_UNITS{i in ITM_LCT_WK} =
(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i]) * LY_UNITS[i];
/* Moved to the bound on NEW_PRICE above:
con ceiling_price {i in ITM_LCT_WK}: NEW_PRICE[i] <= PRICE[i]; */
con supply_last{i in ITM_LCT_WK}: FRC_UNITS[i] >= LY_UNITS[i];
/* This code creates a set version of the Item x Location pairs
that you already have as strings */
set ITM_LCTS = setof{ilw in ITM_LCT_WK} itm_lct[ilw];
/* For each Item x Location pair, define a set of which
Item x Location x Week entries refer to that Item x Location */
set ILWperIL{il in ITM_LCTS} = {ilw in ITM_LCT_WK: itm_lct[ilw] = il};
/* I assume that for each item and location
the inventory is the same for all weeks for convenience,
i.e., that is not a coincidence */
num inventory{il in ITM_LCTS} = max{ilw in ILWperIL[il]} total_inv[ilw];
con inv_cap {il in ITM_LCTS}:
sum{ilw in ILWperIL[il]} Frc_Units[ilw] = inventory[il];
num lastWeek = max{ilw in ITM_LCT_WK} week[ilw];
/* Concatenating indexes is not the prettiest, but gets the job done here */
con Price_down_or_same {il in ITM_LCTS, w in 2 .. lastWeek}:
New_Price[il || w] <= New_Price[il || (w - 1)];
*state function to optimize;
max margin=sum{i in ITM_LCT_WK}
(NEW_PRICE[i]-COST[i])*(1-(NEW_PRICE[i]-PRICE[i])*ELAST[i]/PRICE[i])*LY_UNITS[i];
expand;
solve;
*write output dataset;
create data results_MKD_maxmargin
from
[ITM_LCT_WK]={ITM_LCT_WK}
ITM_NBR
ITM_DES_TXT
LCT_NBR
WEEK
LY_UNITS
FRC_UNITS
ELAST
COST
PRICE
NEW_PRICE
TOTAL_INV;
*write results to window;
print
NEW_PRICE FRC_UNITS
margin
;
quit;

convert univariate table to multivariate table in sas [duplicate]

This question already has answers here:
SAS PROC Transpose Data
(3 answers)
Closed 8 years ago.
I'm struggling to convert a univariate table to a multivariate table in SAS. What I call a 'univariate table' may itself still be a multivariate table, but here's an example:
id a b c
1001 1 4 8
1001 2 3 7
1002 11 9 6
1002 5 14 15
I want it to be like:
id a1 b1 c1 a2 b2 c2
1001 1 4 8 2 3 7
1002 11 9 6 5 14 15
I have thousands of ids (like 1001-3000). Is there a simple way to flip the table around?
Many Thanks!
There is no simple way, because you're transposing multiple variables.
You can do a transpose 3 times, once for each variable, and merge the results, as in the link from @jaamor. Or you can do a semi-manual data step transpose. This assumes a max count of 2. If you have more than 2 per ID you can calculate the max and put it into a macro variable. Then replace the 2's in the code with the macro variable.
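data want;
set have;
by id;
array _a(2) a1-a2;
array _b(2) b1-b2;
array _c(2) c1-c2;
/* keep the wide values across the rows of one id */
retain a1-a2 b1-b2 c1-c2;
if first.id then do;
count=1;
/* clear values left over from the previous id */
call missing(of _a(*), of _b(*), of _c(*));
end;
else count+1;
_a(count)=a;
_b(count)=b;
_c(count)=c;
/* write one wide row per id */
if last.id then output;
drop a b c count;
run;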
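The macro-variable variant mentioned above could look roughly like this. It is a sketch under assumptions: the macro variable name maxn is made up, the input dataset is the four-row example from the question, and the data is assumed to be sorted by id already.

```sas
data have;
    input id a b c;
    datalines;
1001 1 4 8
1001 2 3 7
1002 11 9 6
1002 5 14 15
;
run;

/* Largest number of rows for any single id, stored in &maxn */
proc sql noprint;
    select max(n) into :maxn trimmed
    from (select count(*) as n from have group by id);
quit;

data want;
    set have;
    by id;
    array _a(&maxn) a1-a&maxn;
    array _b(&maxn) b1-b&maxn;
    array _c(&maxn) c1-c&maxn;
    retain a1-a&maxn b1-b&maxn c1-c&maxn;
    if first.id then do;
        count = 0;
        /* Clear values carried over from the previous id */
        call missing(of _a(*), of _b(*), of _c(*));
    end;
    count + 1;
    _a(count) = a;
    _b(count) = b;
    _c(count) = c;
    if last.id then output;
    drop a b c count;
run;
```

Ids with fewer than &maxn rows end up with missing values in the trailing columns.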