How to create multiple macro variables using a loop in SAS Enterprise Guide?

I am constantly using SAS datasets in SAS EG to create macro variables that can be used as variables in a query from SAS EG to my internal servers. There is a character limit of 65,534 for a macro variable. When I need to get 100k ids that are 9 to 15 digits in length the number of macro variables required to create really adds up. I am asking the community if there is a way to create a large number of macro variables with a loop instead of doing it manually.
For instance the manual way to create these macro variables would be something like the following:
proc sql; create table alerts as select distinct review_id format best12. from q4_21_alerts order by review_id;quit;
proc sql; create table alerts1 as select review_id, monotonic() as number from alerts order by number; quit;
proc sql; select distinct review_id into:alert_ids1 separated by ',' from alerts1 where number between 1 and 7000; quit;
proc sql; select distinct review_id into:alert_ids2 separated by ',' from alerts1 where number between 7001 and 14000; quit;
proc sql; select distinct review_id into:alert_ids3 separated by ',' from alerts1 where number between 14001 and 21000; quit;
proc sql; select distinct review_id into:alert_ids4 separated by ',' from alerts1 where number between 21001 and 28000; quit;
.
.
.
proc sql; select distinct review_id into:alert_ids21 separated by ',' from alerts1 where number between 140001 and 147000; quit;
I am hoping to find a way to do something like the following:
N = 145417
#total number of review_ids that need to be contained in SAS macro variables
L = 8
#length/number of characters/digits in each review_id
L = L + 1
#length/number of characters/digits in each review_id with 1 added for the comma separation in macro variable
stop = N*L
i = 1
while(i<=stop){
some code to create all 21 macro variables
}
and then be left with the macro variables alert_ids1, alert_ids2,...,alert_ids21, which would contain all 145,417 IDs I need to use in a query against my internal servers.
Any help would be appreciated!
I've used google and sas communities and have code to do this process manually...

I am unsure what your final query is and would advise building a SQL query that specifically filters to the IDs you want. e.g.:
proc sql;
create table want as
select *
from have
where id in(select id from id_table)
;
quit;
But if you need to have a comma-separated list of macro variables that abides by the 65,534 character length, the safest way is to create one ID per macro variable. You can very easily do this with a data step.
data _null_;
set alerts1;
call symputx(cats('alert_id', _N_), review_id);
call symputx('n_ids', _N_);
run;
This will create the macro variables:
alert_id1
alert_id2
alert_id3
...
Now you need to create a loop that makes these all comma-separated.
%macro id_loop;
%do i = 1 %to &n_ids;
&&alert_id&i %if(&i < &n_ids) %then %do;,%end;
%end;
%mend;
Note the code format is a bit strange to keep the output formatted correctly. Now run this macro and you'll see a comma-separated list of every alert ID:
%put %id_loop;
id1, id2, id3, ...
You can put this in a query, such as where alert_id in (%id_loop). Keep in mind that doing this will load up the symbol table with a ton of macro variables. This is not the recommended way to query, but it is one way to achieve what you asked.

Use a data step instead of SQL to create the macro variables.
You can even create a second macro variable that references all of the generated macro variables.
For example say you have determined that you can always fit 1000 values into a single variable (the limit for a data step variable is 32K instead of the 64K limit of a macro variable) then you could use a data step like this:
data _null_;
length string list $32767 ;
retain list; /* keep LIST across iterations; otherwise it is reset to missing for each group */
group+1;
do i=1 to 1000 until(eof);
set alerts end=eof;
string=catx(',',string,review_id);
end;
call symputx(cats('alert_id',group),string);
list = catx(',',list,cats('&alert_id',group));
if eof then call symputx('alerts',list);
run;
Now you can use that single macro variable ALERTS that consists of the string
&alert_id1,&alert_id2,....
in your SQL code:
where review_id in (&alerts)
And filter on all of the values in the ALERTS dataset even if the total string is longer than 64K. Since you put 1000 into each macro variable and you can fit about 3000 references to those macro variables into the ALERTS macro variable you could store up to 3 million values.
Of course you might hit a limit on what the SQL processor can handle.
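If you do want the chunked SELECT INTO approach from the question driven by a loop, a minimal sketch might look like the following. The macro name make_id_lists, its parameters, and the sample data are all hypothetical, and it assumes the input dataset already carries a row-number column called number, as ALERTS1 does in the question. The TRIMMED keyword needs SAS 9.3 or later. Pick chunk so that chunk times (longest id length + 1) stays under the 65,534 character limit.

```sas
/* Hypothetical sample data standing in for ALERTS1 (review_id plus a row number) */
data alerts1;
  do number = 1 to 25;
    review_id = 100000000 + number;
    output;
  end;
run;

%macro make_id_lists(ds=alerts1, var=review_id, chunk=10, prefix=alert_ids);
  %local n_obs n_chunks i lo hi;

  /* Count the rows; TRIMMED strips the blanks SELECT INTO would otherwise keep */
  proc sql noprint;
    select count(*) into :n_obs trimmed from &ds;
  quit;

  /* Number of macro variables needed, rounded up */
  %let n_chunks = %sysevalf(&n_obs / &chunk, ceil);

  %do i = 1 %to &n_chunks;
    %global &prefix&i;   /* make each list visible outside the macro */
    %let lo = %eval((&i - 1) * &chunk + 1);
    %let hi = %eval(&i * &chunk);
    proc sql noprint;
      /* BEST32. avoids the default numeric-to-text conversion shortening long ids */
      select &var format=best32.
        into :&prefix&i separated by ','
        from &ds
        where number between &lo and &hi;
    quit;
  %end;

  /* Record how many lists were created, e.g. in ALERT_IDS_N */
  %global &prefix._n;
  %let &prefix._n = &n_chunks;
%mend make_id_lists;

%make_id_lists(chunk=10)
%put &=alert_ids_n &=alert_ids3;
```

With real data you would call something like %make_id_lists(ds=alerts1, chunk=4000) and then reference &alert_ids1, &alert_ids2, and so on in your pass-through query.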

Related

Changing FROM statement with a variable

I am trying to change the name of the table I am getting my data from
Like this:
COREPOUT.KUNDE_REA_UDL_202112 --> COREPOUT.KUNDE_REA_UDL_202203
I create my variable like this:
PROC SQL NOPRINT;
SELECT DISTINCT
PERIOKVT_PREV_BANKSL_I_YYMMN6
INTO :PERIOKVT_PREV_BANKSL_I_YYMMN6
FROM Datostamp_PREV_Kvartal;
This is the code I want to use the variable for.
%_eg_conditional_dropds(WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
SELECT t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
FROM COREPOUT.KUNDE_REA_UDL_202203 t1;
QUIT;
I have tried things like:
FROM string("COREPOUT.KUNDE_REA_UDL_",PERIOKVT_PREV_BANKSL_I_YYMMN6," t1";
I hope you can point me in the right direction.
Use & to reference and resolve macro variables into strings (e.g. &PERIOKVT_PREV_BANKSL_I_YYMMN6).
proc sql noprint;
select distinct PERIOKVT_PREV_BANKSL_I_YYMMN6
into :PERIOKVT_PREV_BANKSL_I_YYMMN6
from Datostamp_PREV_Kvartal
;
quit;
proc sql;
create table WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
select t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
from COREPOUT.KUNDE_REA_UDL_&PERIOKVT_PREV_BANKSL_I_YYMMN6 t1
;
quit;
You can use CALL SYMPUTX() to move values from a dataset into a macro variable.
data _null_;
set Datostamp_PREV_Kvartal;
call symputx('dataset_name',PERIOKVT_PREV_BANKSL_I_YYMMN6);
stop;
run;
Then use the value of the macro variable to insert the dataset name into the code at the appropriate place. So your posted SQL is equivalent to this simple data step.
data QUERY_FOR_KUNDE_REA_UDL_20_0000;
set &dataset_name. ;
KundeNum = input(cpr_se,32.);
keep Z_ORDINATE KundeNum;
run;
Note: I did not see any definition of a user-defined informat named W in your posted code, so I replaced it with the normal numeric informat, since it looked like you were trying to convert a character value into a number.
The solution I ended up with was inspired by @Stu Sztukowski's response:
I used a data step to concatenate the pieces into one variable and then created a macro variable from it.
data Concat_var;
str_PERIOKVT_PREV_YYMMN6 = CAT("COREPOUT.KUNDE_REA_UDL_",&PERIOKVT_PREV_BANKSL_I_YYMMN6," t1");
run;
PROC SQL NOPRINT;
SELECT DISTINCT
str_PERIOKVT_PREV_YYMMN6
INTO :str_PERIOKVT_PREV_YYMMN6
FROM Concat_var;
Then I used the variable in the FROM statement:
%_eg_conditional_dropds(WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
SELECT t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
FROM &str_PERIOKVT_PREV_YYMMN6;
QUIT;
I hope this helps someone else in the future.
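For comparison, the intermediate data step can probably be skipped by resolving the macro variable directly inside the table name. The sketch below is self-contained and uses hypothetical stand-in tables in WORK (the real ones live in COREPOUT), a hypothetical macro variable named period, and the 32. informat suggested above in place of w. It assumes the period is stored as a character value such as 202203. TRIMMED (SAS 9.3 or later) removes any leading or trailing blanks that would otherwise end up inside the table name.

```sas
/* Hypothetical stand-ins for the real tables, created in WORK for illustration */
data work.KUNDE_REA_UDL_202203;
  Z_ORDINATE = 1;
  cpr_se = '12345';
  output;
run;

data work.Datostamp_PREV_Kvartal;
  PERIOKVT_PREV_BANKSL_I_YYMMN6 = '202203';
run;

/* Read the period into a macro variable without surrounding blanks */
proc sql noprint;
  select distinct PERIOKVT_PREV_BANKSL_I_YYMMN6
    into :period trimmed
    from work.Datostamp_PREV_Kvartal;
quit;

/* The macro variable resolves to text, so it can complete the table name */
proc sql;
  create table work.query_result as
  select t1.Z_ORDINATE,
         input(t1.cpr_se, 32.) as KundeNum
  from work.KUNDE_REA_UDL_&period t1;
quit;
```

The macro reference ends at the blank before t1, so no trailing period is needed here; write &period. only when the reference is followed directly by more name characters.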

Using macro for formula proc sql in SAS

I need some help with macros in SAS. I want to sum variables (for example, v_1 to v_7) to aggregate them, grouping by year. There are plenty of them, so I want to use a macro. However, it doesn't work (I get only v_1). I would really appreciate your help.
%macro my_macro();
%local i;
%do i = 1 %to 7;
proc sql;
create table my_table as select
year,
sum(v_&i.) as v_&i.
from my_table
group by year
;
quit;
%end;
%mend;
/* I don't know how to run this macro - is it ok? */
data run_macro;
set my_table;
%my_macro();
run;
The macro processor just generates SAS code and then passes it on to SAS to run. You are calling a macro that generates a complete SAS step in the middle of your DATA step. So you are trying to run this code:
data run_macro;
set my_table;
proc sql;
create table my_table as select
year,
sum(v_1) as v_1
from my_table
group by year
;
quit;
proc sql;
create table my_table as select
year,
sum(v_2) as v_2
from my_table
group by year
;
quit;
...
So first you make a copy of MY_TABLE as RUN_MACRO. Then you overwrite MY_TABLE with a collapsed version of MY_TABLE that has just two variables and only one observation per year. Then you try to collapse it again, but you are referencing a variable named V_2 that no longer exists.
If you simply move the %DO loop inside the generation of the SQL statement it should work. Also, don't overwrite your input dataset. Here is a version of the macro that will create a new dataset named MY_NEW_TABLE with 8 variables from the existing dataset named MY_TABLE.
%macro my_macro();
%local i;
proc sql;
create table my_NEW_table as
select year
%do i = 1 %to 7;
, sum(v_&i.) as v_&i.
%end;
from my_table
group by year
;
quit;
%mend;
%my_macro;
Note if this is all you are doing then just use PROC SUMMARY. With regular SAS code instead of SQL code you can use variable lists like v_1-v_7. So there is no need for code generation.
proc summary nway data=my_table ;
class year ;
var v_1 - v_7;
output out=my_NEW_table sum=;
run;

Dynamize range of SAS PROC SQL SELECT INTO macro creation

I want to put each of several observations into its own macro variable. I would do this with select into :obs1 - :obs4; however, as the number of observations can differ, I would like to make the range dynamic. My code looks like this:
proc sql;
create table segments as select distinct substr(name,1,6) as segment from dictionary.columns
where libname = 'WORK' and memname = 'ALL_CCFS' and name ne 'MONTH';
run;
proc sql noprint;
select count(*) into: count from segments;
run;
proc sql noprint;
select segment into :segment_1 - :segment_&count. from dictionary.columns;
run;
However, this doesn't seem to work... any suggestions? Thank you!
There are three options:
1. Leave the upper bound blank and SAS will create as many macro variables as there are rows.
2. Set the upper bound to an absurdly large number and SAS will only create the ones it needs.
3. Use a data step, where you can increment the suffix dynamically.
Note that the query should read from your SEGMENTS table, since DICTIONARY.COLUMNS has no SEGMENT column, and that PROC SQL ends with QUIT rather than RUN.
proc sql noprint;
select segment into :segment_1 -
from segments;
quit;
proc sql noprint;
select segment into :segment_1 - :segment_999
from segments;
quit;
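The data step option could look something like this minimal sketch. The sample SEGMENTS data is hypothetical; the macro variable names follow the question. CALL SYMPUTX strips leading and trailing blanks, so the stored count is safe to use as a loop bound.

```sas
/* Hypothetical sample data standing in for the SEGMENTS table */
data segments;
  length segment $6;
  do segment = 'SEG_A', 'SEG_B', 'SEG_C';
    output;
  end;
run;

data _null_;
  set segments end=eof;
  /* One macro variable per row: segment_1, segment_2, ... */
  call symputx(cats('segment_', _n_), segment);
  /* Store the number of rows once, on the last observation */
  if eof then call symputx('count', _n_);
run;

%put &=count &=segment_1 &=segment_3;
```

After this runs, &count holds the number of macro variables created, so a later %do i = 1 %to &count; loop can walk through &&segment_&i.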

Overwriting/Appending sas variable

%let rows = "";
%macro test;
proc sql noprint;
select count(ID)
into: sqlRows
from mytbl;
quit;
%do i = 1 %to &sqlRows; * loop from 1 to sqlRows;
proc sql noprint;
select ID
into: ColumnID
from mytbl(firstobs= &i);
quit;
%if &rows eq "" %then %do
%let rows = "<tr><td>&ColumnID</td></tr>";
%end;
%if &rows ne "" %then %do
%let rows = "&rows<tr><td>&ColumnID</td></tr>";
%end;
%end;*End loop;
%mend;
%test;
%put &rows;
Hi, I want to put all the values of the ID column of mytbl into a variable.
I've created a variable named rows and assigned an empty value to it. Then, using a loop, I get the values of mytbl one by one and save each in the ColumnID variable. If rows is empty, I set it to a tr/td pair containing the ColumnID value. If rows is not empty, I append to it. But it only gives me the last record of my table.
Let's say mytbl has the values 1, 2 and 3 in the ID column.
The rows variable should contain
<tr><td>1</td></tr><tr><td>2</td></tr><tr><td>3</td></tr>
but it only shows the last row:
<tr><td>3</td></tr>
You've got a few different problems, starting with some missing semicolons. More importantly, your code is more complex than it needs to be. You can get what you want with one PROC SQL step using SELECT INTO; you don't need a separate PROC SQL step for each record. Play around with:
data have;
do ID=1 to 3;
output;
end;
run;
proc sql noprint;
select cats('<tr><td>',ID,'</td></tr>')
into :Rows
separated by ""
from have;
quit;
%put &rows;
I think you're severely misunderstanding what macro variables are, as opposed to regular variables, in SAS. You don't say exactly what you're going to eventually do with this, but nonetheless.
First off, macro variables don't take quotation marks; if they contain them, they're treated just as regular characters. So:
%let var = "";
%let var = "&var.123";
%put &=var.;
will return
"""123"
since it doesn't really know much about the quotation marks (it is somewhat aware of them, but it doesn't treat them the way a normal SAS variable does).
Second, as Quentin correctly points out, why on earth are you using SQL to go a row at a time? That's basically the opposite of what you'd use SQL for. SQL is great for doing something to the whole dataset at once; it's absolutely horrible at working one row at a time. That's what the data step is for.
If you actually want a SAS variable, or you want to process things a row at a time, you should just use the data step:
data want;
set mytbl end=eof;
retain rows; *do not need to initialize to missing, that is normal;
length rows $32767;
rows = cats(rows,"<tr><td>",ID,"</td></tr>");
if eof then output;
run;
You'd usually do that if you were going to use call execute, for example if you planned to put this to an HTML page (in a stored proc for example) with some wrapper code that you wanted to execute, in if _n_=1 for the start and if eof for the end.

SAS SQL Macro to join multiple datasets into one

What is the problem with the below SQL Macro?
I have multiple datasets called "C_out1" 2,3,4 and so on and I would like to extract just one number from each dataset into one new table. I have searched and searched for help but without any luck.
I have tested the code without the macro element on just one dataset and that works fine, but when I try to make it dynamic with the below code, it fails.
I'm able to do the job with a simple datastep, but I'm fairly new to the SQL language, and I would really like to be able to do this. There's almost 200.000 datasets I have to join so I'm guessing that SQL is preferable to the data step.
I get the error:
"NOTE: Line generated by the macro variable "I".
1 C_out5
------
78
ERROR 78-322: Expecting a ','.
974 where label2='c';
975 quit;
Code:
%macro loop(prefix2);
%do i=1 %to 5;
&prefix2&i
%let i = %eval(&i+1);
%end;
%mend loop;
PROC SQL;
create table CTOTAL as
select nvalue2
from %loop(C_out)
where label2='c';
quit;
The data step is a significantly better method to do this than SQL.
data CTOTAL;
set C_out:;
where label2='c';
run;
If you're not using all of the c_out datasets, you might need to do some work to improve this, but if you are using all c_out datasets then this will work as is.
Joe's answer is best.
If you must use SQL:
Your SQL is not correct. You need to do this with a UNION ALL. This solution will work for SQL:
%macro loop(prefix2);
%local i;
%do i=1 %to 4;
select nvalue2 from &prefix2&i where label2='c' union all
%end;
select nvalue2 from &prefix2&i where label2='c'
%mend loop;
PROC SQL;
create table CTOTAL as
%loop(C_out)
;
quit;
There is no need to increment &i manually; the %DO loop does that for you.
When the loop finishes, &i is one past the upper bound. So run the loop to (n-1) and write the last SELECT, without a trailing UNION ALL, after the loop, where &i resolves to n.