SAS Call a macro variable - variables

I have a program that I want to run on several years. Therefore at some time I have to select my data
data want;
set have(where='2014');
run;
To try on several years, I have a macro variable, that I define as
%let an=14;
/*It is 14 and not 2014, because elsewhere I need it that way.*/
But then when I try to put it in my program it does not work at all
data want;
set have(where="&&20&an.");
run;
I would appreciate some help
First edit : changed ' ' into " ", but still does not work
Second edit and answer
"20&an"

The answer you arrived at (20&an) is correct - you are all set. You don't even need to read the rest of this answer I've posted :-)
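For reference, here is a minimal sketch of what the working step might look like. The column name year and the sample rows are my own assumptions (your post does not show which variable you filter on), and I am assuming it is a character variable.

```sas
/* Hypothetical sample data: the column name YEAR and its values are assumptions */
data have;
  input year $ value;
  datalines;
2013 10
2014 20
2014 30
;
run;

%let an = 14;

data want;
  /* "20&an" resolves to "2014" because macro variables resolve inside double quotes */
  set have(where=(year = "20&an"));
run;
```

With single quotes ('20&an') the macro variable would not resolve, which is why your first edit mattered.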
However I noticed you were a bit confused about & vs. &&. If you'd like to know more about that, I put together some extra info on the difference between & and &&, as well as the purpose of &&, in SAS macro evaluation.
The & is the most common symbol - you simply use that to evaluate/de-reference a variable. So:
%LET an = 14 ;
%PUT ----- an ;
%PUT ----- &an ;
Output:
----- an
----- 14
So as you can see, you must put a & before the variable name in order to de-reference it to its value. Omitting the & simply prints the string an, which happens to be the name of the variable in this case. For most macro coding, & is all you will ever need. & in SAS macro is like $ in shell, * in C, etc.
Now, what is && for? It exists to enable you to have dynamic macro variable names. That is, having a macro variable whose value is the name of another macro variable. If you are familiar with C, you can think of this as a pointer to a pointer.
The way SAS evaluates && is in two passes. In the first pass, it converts && to &. At the same time, any & symbols it sees in that pass will be used to de-reference the variable names they are next to. The idea is for these latter expressions to resolve to a variable name. Then, in the second pass, the remaining & symbols (all originally && symbols) de-reference whatever variable names they now find themselves next to.
Here is an example with sample output:
%LET x = 3;
%LET name_of_variable = x;
%PUT ----- &x;
%PUT ----- &&name_of_variable;
%PUT ----- &&&name_of_variable;
Output:
----- 3
----- x
----- 3
In the first %PUT, we are just using plain old &, and thus we are doing what we did before, reading and printing the value that x holds. In the second %PUT, things get slightly more interesting. Because of the &&, SAS does two passes of evaluation. The first one converts this:
%PUT ----- &&name_of_variable;
To this
%PUT ----- &name_of_variable;
In the second pass, SAS does the standard & evaluation to print the value held in name_of_variable - the string x, which happens to be the name of the other variable we are using. Of course, this example is especially contrived: why would you write &&name_of_variable when you could have just written &name_of_variable?
In the third %PUT, where we now have the &&&, SAS does two passes. Here is where we finally see the true purpose of &&. I will put pieces of the expression in parentheses so you can see how they are evaluated. We go from this:
%PUT ----- (&&)(&name_of_variable);
To this:
%PUT ----- &x;
So in the first pass, the && was converted to &, and &name_of_variable was a simple de-referencing of name_of_variable, evaluating to the content it is holding, which as we said was x.
So in the second pass, we are just left with the simple evaluation:
%PUT ----- &x;
As we had set x equal to 3, this evaluates to 3.
So in a sense, saying &&&name_of_variable is saying "Show me the value of the variable whose name is stored in name_of_variable."
Here is a motivating example for why you would want to do this. Suppose you had a simple macro subroutine that added an arbitrary number to a numerical value stored in a SAS macro variable. To do that, the subroutine would have to know the amount to add, but more importantly, it would need to know the name of the variable to add to. You would accomplish this dynamic variable naming via the && mechanism, like so:
%MACRO increment_by_amount (var_name = , amount = );
%LET &var_name = %EVAL (&&&var_name + &amount) ;
/* Note: this could also begin with %LET &&var_name = .... */
%MEND;
Here we are saying: "Let the variable whose name is held in var_name (i.e. &var_name) equal the value of the variable whose name is held in var_name (i.e. &&&var_name) plus the value held in amount (i.e. &amount)."
When you call a subroutine like this, make sure you are passing the variable name, not the value. That is, say this:
%increment_by_amount (var_name = x , amount = 3 );
Not this:
%increment_by_amount (var_name = &x , amount = 3);
So an example of invocation would be:
%LET x = 3;
%PUT ----- &x;
%increment_by_amount (var_name = x , amount = 3 );
%PUT ----- &x;
Output:
----- 3
----- 6
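The same two-pass rule is what makes indexed names such as &&TABLE&N work. Here is a small sketch of that pattern; the macro name list_tables and the variables table1, table2 and tables_count are made up for illustration.

```sas
%let table1 = class;
%let table2 = cars;
%let tables_count = 2;

%macro list_tables;
  %local n;
  %do n = 1 %to &tables_count;
    /* Pass 1 turns the double ampersand into a single one and resolves n,   */
    /* leaving a reference to table1 or table2; pass 2 resolves that name.   */
    %put ----- table &n is &&table&n;
  %end;
%mend list_tables;

%list_tables
```

Output:
----- table 1 is class
----- table 2 is cars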

Related

sas macro resolving issue

Dummy data:
MEMNAME _var1 var2 var3 var4
XY XYSTART_1 XYSTATT_2 XYSTAET_3 XYSTAWT_4
I want to create a macro variable that holds values such as TEST_XYSTART, TEST_XYSTATT, TEST_XYSTAET, TEST_XYSTAWT, and so on. How can I do this in a data step without using call symput? I want to use the macro variable in the same data step, and call symput will not create the macro variable until the data step ends.
I tried the code below (not working). Please tell me the correct way to write this step.
case = "TEST_"|| strip(reverse(substr(strip(reverse(testcase(i))),3)));
%let var = case; (with/without quotes not getting the desired result).
abc= strip(reverse(substr(strip(reverse(testcase(i))),3)));
%let test = TEST_;
%let var = &test.abc;
I am getting correct data with this statement: strip(reverse(substr(strip(reverse(testcase(i))),3)))
just not able to concatenate this value with TEST_ and assign it to the macro variable in a datastep.
Appreciate your help!
It makes no sense to locate a %LET statement in the middle of a data step. The macro processor evaluates the macro statements first and then passes the resulting code onto SAS to evaluate. So if you had this code:
data want;
set have;
%let var=abc ;
run;
It is the same as if you placed the %LET statement before the DATA statement.
%let var=abc ;
data want;
set have;
run;
If you want to reference a variable dynamically in a data step, you can use an index into an array.
data want;
set have;
array test test_1 - test_3;
new_var = test[ testnum ] ;
run;
Or use the VvalueX() function which will return the value of a variable whose name is a character expression.
data want;
set have;
new_var = vvaluex( cats('test_',testnum) );
run;
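For the question as asked, the prefixed value can simply live in an ordinary data step variable, so no macro variable is needed. This is only a sketch: the array name testcase comes from the question, but the variable lengths, the output layout (one row per value) and the assumption that every value ends in an underscore plus one digit are mine.

```sas
data have;
  length memname $8 _var1 var2 var3 var4 $12;
  input memname $ _var1 $ var2 $ var3 $ var4 $;
  datalines;
XY XYSTART_1 XYSTATT_2 XYSTAET_3 XYSTAWT_4
;
run;

data want;
  set have;
  array testcase[*] _var1 var2 var3 var4;
  length case $40;
  do i = 1 to dim(testcase);
    /* drop the last two characters (the "_<digit>" suffix), then add the prefix */
    case = cats('TEST_', substr(testcase[i], 1, length(testcase[i]) - 2));
    output;
  end;
  keep memname case;
run;
```

The variable case can then be used in later statements of the same data step, which is what the macro variable was meant to achieve.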

Using a dynamic table name with month+year (mmyy) SAS EG

Any help please?
When displaying the variable vMonth, it is working but when concatenating it with the library name, the following issue is obtained.
Program:
%LET lastdaypreviousmonth = put(intnx('month', today(), -1, 'E'),mmyyn4.);
%LET vMonth = cats('RM',&lastdaypreviousmonth);
PROC SQL;
SELECT &vMonth,*
FROM MASU.&vMonth
WHERE nsgr = '040';
QUIT;
Log file :
27 %LET lastdaypreviousmonth = put(intnx('month', today(), -1, 'E'),mmyyn4.);
28 %LET vMonth = cats('RM',&lastdaypreviousmonth);
29
30 PROC SQL;
31
32 SELECT &vMonth,*
33 FROM MASU.&vMonth
34 WHERE nsgr = '040';
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
NOTE: Line generated by the macro variable "VMONTH".
34 MASU.cats('RM',put(intnx('month', today(), -1, 'E'),mmyyn4.))
_ _
79 79
200
ERROR 79-322: Expecting a ).
ERROR 200-322: The symbol is not recognized and will be ignored.
The macro code is just doing what you told it to do. Add some %PUT statements to see what values you have put into your macro variables. The macro processor will not treat strings like put or cats any differently than it would treat the string xyz or 123.
If you want to call SAS functions in macro code you need to wrap each call with the %sysfunc() macro function. Not all functions can be called this way. In particular instead of the type flexible PUT() and INPUT() functions use the type specific versions instead. But in this case you can just use the format parameter of the %SYSFUNC() call instead of the function call. Do not include quotes in your string literals, everything is a string literal to the macro processor.
Use this:
%LET lastdaypreviousmonth=%sysfunc(intnx(month,%sysfunc(today()),-1, E),mmyyn4.);
There is no need to ever use the CAT...() functions in macro code. To concatenate macro variable values, just expand them where you want them to appear.
%LET vMonth = RM&lastdaypreviousmonth.;
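Putting the two statements together, a quick way to check the result is a %PUT followed by a query. In this sketch the table is created in WORK as a stand-in, because I cannot assume the MASU library is assigned; the two sample rows and the column alias source_table are invented for illustration.

```sas
%let lastdaypreviousmonth = %sysfunc(intnx(month,%sysfunc(today()),-1,E),mmyyn4.);
%let vMonth = RM&lastdaypreviousmonth.;
%put NOTE: vMonth resolves to &vMonth;

/* Stand-in table in WORK; in the real program it lives in MASU */
data work.&vMonth;
  nsgr = '040'; output;
  nsgr = '041'; output;
run;

proc sql;
  /* quote the macro variable if you want its value as data, not as a column name */
  select "&vMonth" as source_table, *
  from work.&vMonth
  where nsgr = '040';
quit;
```

Note that the original SELECT &vMonth,* asks for a column named like the table; if the intent was to show the table name in the output, the quoted form above is one way to do it.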

SAS: most efficient method to output first non-missing across multiple columns

The data I have are millions of rows and rather sparse, with anywhere between 3 and 10 variables needing to be processed. My end result needs to be one single row containing the first non-missing value for each column. Take the following test data:
** test data **;
data test;
length ID $5 AID 8 TYPE $5;
input ID $ AID TYPE $;
datalines;
A . .
. 123 .
C . XYZ
;
run;
The end result should look like such:
ID AID TYPE
A 123 XYZ
Using macro lists and loops I can brute force this result with multiple merge statements where the variable is non-missing and obs=1 but this is not efficient when the data are very large (below I'd loop over these variables rather than write multiple merge statements):
** works but takes too long on big data **;
data one_row;
merge
test(keep=ID where=(ID ne "") obs=1) /* character */
test(keep=AID where=(AID ne .) obs=1) /* numeric */
test(keep=TYPE where=(TYPE ne "") obs=1); /* character */
run;
The coalesce function seems very promising, but I believe I need it in combination with array and output to build this single-row result. The function also differs (coalesce and coalescec depending on variable type) whereas it does not matter using proc sql. I get an error using array since all variables in the array list are not the same type.
Exactly what is most efficient will largely depend on the characteristics of your data. In particular, whether the first nonmissing value for the last variable is usually relatively "early" in the dataset, or if you usually will have to trawl through the entire dataset to get to it.
I assume your dataset is not indexed (as that would simplify things greatly).
One option is the standard data step. This isn't necessarily fast, but it's probably not too much slower than most other options given you're going to have to read most/all of the rows no matter what you do. This has a nice advantage that it can stop as soon as every variable has a nonmissing value.
data want;
if 0 then set test; *defines characteristics;
set test(rename=(id=_id aid=_aid type=_type)) end=eof;
id=coalescec(id,_id);
aid=coalesce(aid,_aid);
type=coalescec(type,_type);
if cmiss(of id aid type)=0 then do;
output;
stop;
end;
else if eof then output;
drop _:;
run;
You could populate all of that from macro variables from dictionary.columns, or even might use temporary arrays, though I think that gets too messy.
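As a rough sketch of the dictionary.columns idea, the rename list and the coalesce statements can be generated like this. The macro variable names renames, assigns and varlist are my own, and the sketch assumes the table is WORK.TEST with no variable names longer than 31 characters or already starting with an underscore.

```sas
data test;
  length ID $5 AID 8 TYPE $5;
  input ID $ AID TYPE $;
  datalines;
A . .
. 123 .
C . XYZ
;
run;

proc sql noprint;
  select cats(name, '=_', name),
         /* pick COALESCEC for character and COALESCE for numeric columns */
         case when type = 'char'
              then cats(name, '=coalescec(', name, ',_', name, ');')
              else cats(name, '=coalesce(', name, ',_', name, ');')
         end,
         name
    into :renames separated by ' ',
         :assigns separated by ' ',
         :varlist separated by ' '
  from dictionary.columns
  where libname = 'WORK' and memname = 'TEST';
quit;

data want;
  if 0 then set test;                      /* defines variable attributes */
  set test(rename=(&renames)) end=eof;
  &assigns
  if cmiss(of &varlist) = 0 then do;
    output;
    stop;
  end;
  else if eof then output;
  drop _:;
run;
```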
Another option is the self update, except it needs two changes. One, you need something to join on (as opposed to merge which can have no by variable). Two, it will give you the last nonmissing value, not the first, so you'd have to reverse-sort the dataset.
But assuming you added x to the first dataset, with any value (doesn't matter, but constant for every row), it is this simple:
data want;
update test(obs=0) test;
by x;
run;
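Spelled out with the extra steps, that might look like the sketch below. The helper variable rownum and the dataset name test_x are assumptions of mine; rownum only exists so the original row order can be reversed.

```sas
data test;
  length ID $5 AID 8 TYPE $5;
  input ID $ AID TYPE $;
  datalines;
A . .
. 123 .
C . XYZ
;
run;

data test_x;
  set test;
  x = 1;          /* constant BY variable required by UPDATE */
  rownum = _n_;   /* remembers the original row order */
run;

proc sort data=test_x;
  by x descending rownum;   /* reverse order, so "last nonmissing" means "first" */
run;

data want;
  update test_x(obs=0) test_x;
  by x;
  drop x rownum;
run;
```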
So that has the huge advantage of simplicity of code, exchanged for some cost of time (reverse sorting and adding a new variable).
If your dataset is very sparse, a transpose might be a good compromise. Doesn't require knowing the variable names as you can process them with arrays.
data test_t;
set test;
array numvars _numeric_;
array charvars _character_;
do _i = 1 to dim(numvars);
if not missing(numvars[_i]) then do;
varname = vname(numvars[_i]);
numvalue= numvars[_i];
output;
end;
end;
do _i = 1 to dim(charvars);
if not missing(charvars[_i]) then do;
varname = vname(charvars[_i]);
charvalue= charvars[_i];
output;
end;
end;
keep numvalue charvalue varname;
run;
proc sort data=test_t;
by varname;
run;
data want;
set test_t;
by varname;
if first.varname;
run;
Then you proc transpose this to get the desired want (or maybe this works for you as is). It does lose the formats/etc. on the value, so take that into account, and your character value length probably needs to be set to something appropriately long - and then set back (you can use an if 0 then set to fix it).
A similar hash approach would work roughly the same way; it has the advantage that it would stop much sooner, and doesn't require resorting.
data test_h;
set test end=eof;
array numvars _numeric_;
array charvars _character_;
length varname $32 numvalue 8 charvalue $1024; *or longest charvalue length;
if _n_=1 then do;
declare hash h(ordered:'a');
h.defineKey('varname');
h.defineData('varname','numvalue','charvalue');
h.defineDone();
end;
do _i = 1 to dim(numvars);
if not missing(numvars[_i]) then do;
varname = vname(numvars[_i]);
rc = h.find();
if rc ne 0 then do;
numvalue= numvars[_i];
rc=h.add();
end;
end;
end;
do _i = 1 to dim(charvars);
if not missing(charvars[_i]) then do;
varname = vname(charvars[_i]);
rc = h.find();
if rc ne 0 then do;
charvalue= charvars[_i];
rc=h.add();
end;
end;
end;
if eof or h.num_items = dim(numvars) + dim(charvars) then do;
rc = h.output(dataset:'want');
stop; *every variable has a value (or the data ended), so stop reading;
end;
run;
There are lots of other solutions, just depending on your data which would be most efficient.

Output to a text file

I need to output lots of different datasets to different text files. The datasets share some common variables that need to be output but also have quite a lot of different ones. I have loaded these different ones into a macro variable separated by blanks so that I can macroize this.
So I created a macro which loops over the datasets and outputs each into a different text file.
For this purpose, I used a put statement inside a data step. The PUT statement looks like this:
PUT (all the common variables shared by all the datasets), (macro variable containing all the dataset-specific variables);
E.g.:
%MACRO OUTPUT();
%DO N=1 %TO &TABLES_COUNT;
DATA _NULL_;
SET &&TABLE&N;
FILE 'PATH/&&TABLE&N..txt';
PUT a b c d "&vars";
RUN;
%END;
%MEND OUTPUT;
Where &vars is the macro variable containing all the variables needed for outputting for a dataset in the current loop.
Which gets resolved, for example, to:
PUT a b c d special1 special2 special5 ... special329;
Now the problem is, the quoted string can only be 262 characters long. And some of my datasets I am trying to output have so many variables to be output that this macro variable which is a quoted string and holds all those variables will be much longer than that. Is there any other way how I can do this?
Do not include quotes around the list of variable names.
put a b c d &vars ;
There should not be any limit to the number of variables you can output, but if the length of the output line gets too long SAS will wrap to a new line. The default line length is currently 32,767 (but older versions of SAS use 256 as the default line length). You can actually set that much higher if you want. So you could use 1,000,000 for example. The upper limit probably depends on your operating system.
FILE "PATH/&&TABLE&N..txt" lrecl=1000000 ;
If you just want to make sure that the common variables appear at the front (that is you are not excluding any of the variables) then perhaps you don't need the list of variables for each table at all.
DATA _NULL_;
retain a b c d ;
SET &&TABLE&N;
FILE "&PATH/&&TABLE&N..txt" lrecl=1000000;
put (_all_) (+0) ;
RUN;
I would tackle this by having one PUT statement per variable. Use the trailing @ modifier so that you don't get a new line.
For example:
data test;
a=1;
b=2;
c=3;
output;
output;
run;
data _null_;
set test;
put a @;
put b @;
put c @;
put;
run;
Outputs this to the log:
800 data _null_;
801 set test;
802 put a @;
803 put b @;
804 put c @;
805 put;
806 run;
1 2 3
1 2 3
NOTE: There were 2 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.03 seconds
So modify your macro to loop through the two sets of values using this syntax.
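A minimal sketch of such a loop is below. The macro name put_each is hypothetical, and sashelp.class is used only as convenient sample data; the macro generates one PUT statement per variable, each ending in a trailing @ to hold the line, and a final bare PUT to release it.

```sas
%macro put_each(vars);
  %local i var;
  %do i = 1 %to %sysfunc(countw(&vars, %str( )));
    %let var = %scan(&vars, &i, %str( ));
    put &var @;   /* trailing @ keeps the next PUT on the same line */
  %end;
  put;            /* release the held line */
%mend put_each;

data _null_;
  set sashelp.class(obs=2);
  %put_each(name age height)
run;
```

Because each variable gets its own short statement, no single generated statement grows with the length of the variable list.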
Not sure why you're talking about quoted strings: you would not quote the &vars argument.
put a b c d &vars;
not
put a b c d "&vars";
There's a limit there, but it's much higher (64k).
That said, I would do this in a data driven fashion with CALL EXECUTE. This is pretty simple and does it all in one step, assuming you can easily determine which datasets to output from the dictionary tables in a WHERE statement. This has a limitation of 32kiB total, though if you're actually going to go over that you can work around it very easily (you can separate out various bits into multiple calls, and even structure the call so that if the callstr hits 32000 long you issue a call execute with it and then continue).
This avoids having to manage a bunch of large macro variables (your &VAR will really be &&VAR&N and will be many large macro variables).
data test;
length vars callstr $32767;
do _n_ = 1 by 1 until (last.memname);
set sashelp.vcolumn;
where memname in ('CLASS','CARS');
by libname memname;
vars = catx(' ',vars,name);
end;
callstr = catx(' ',
'data _null_;',
'set',cats(libname,'.',memname),';',
'file',cats('"c:\temp\',memname,'.txt"'),';',
'put',vars,';',
'run;');
call execute(callstr);
run;

Creating and modifying a global statement in SAS

I would like to do something very simple, but it doesn't work
This is a simple example but I intend to use it for some more complex stuff
the output I want is :
obs. dummy newcount
1 3 1
2 5 2
3 2 3
but the output I get is :
obs. dummy newcount
1 3 1
2 5 1
3 2 1
here is my code
data test;
input dummy;
cards;
3
5
2
;
run;
%let count=1;
data test2;
set test;
newcount = &count.;
%let count = &count. + 1;
run;
The variable count doesn't get incremented. How do I do this?
Thanks for your help !
You're mixing macro variables and data step variables in a way that cannot work. The macro processor resolves &count before the data step is compiled, so the compiler only ever sees newcount = 1; and never sees any later value of the macro variable.
Further, %let is not a data step statement but a macro statement. It is executed once, while the step is being compiled, not once per data step iteration. And because macro values are plain text, it sets count to the string 1 + 1 rather than to 2.
You could use
data test2;
set test;
newcount = symget("count");
call symput("count",newcount+1);
put _all_;
run;
and it would work (call symput is how you define a macro variable in a data step, symget is how you retrieve the value of a macro variable that isn't finalized before the data step begins). Note that symget returns a character value, so newcount is a character variable here. It is probably not a good idea, however: you shouldn't generally store data values in macro variables and interact repeatedly with them inside a data step. If you post more details about why you're trying to do this (i.e., what your actual goal is) I'm sure several of us could offer some suggestions for how to approach the problem.
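If all you need is a running counter, a sum statement does this without any macro variable inside the loop. The sketch below is one possible approach, not necessarily what your real problem needs: it seeds the counter from &count and writes the next value back once at the end, and the end=eof flag is my own addition.

```sas
data test;
  input dummy;
  cards;
3
5
2
;
run;

%let count = 1;

data test2;
  set test end=eof;
  retain newcount %eval(&count - 1);   /* starting value, resolved once at compile time */
  newcount + 1;                        /* sum statement: increments on every row */
  if eof then call symputx('count', newcount + 1);  /* hand the next value back */
run;

%put NOTE: count is now &count;
```

This gives newcount values of 1, 2 and 3, and leaves count at 4 for whatever step comes next.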