Using macro variables/language in PROC SQL - sql

I use PROC SQL for Oracle database queries (I'm not a db person though, so I can't be more specific than that), and we often apply formats from a library that is automatically loaded. I was wondering if there's a faster way to program these types of queries, for example let's say I have a variable called prim_disease_cd in a view, and I want to pull that out, apply the format (which has the same name) and also call it prim_disease_cd. Right now I would do
put(a.prim_disease_cd, prim_disease_cd.) as prim_disease_cd
Is there a way I can shorten this using macro language? I have been unsuccessful so far, but we do this often and it seems quite inefficient. Essentially I want a macro that takes in a view/dataset a and a variable X and applies "put (a.X, X.) as X"
Additionally, if there's any way I can implement something like this for dates too, that would be great, i.e. to replace
datepart(a.(var_name)) as (var_name) format mmddyy10.
Thanks for any help you can provide.

You could create simple macros to do those two things. Macros that emit just a portion of a statement like that are often referred to as macro functions or function style macros. Make sure not to emit any semi-colons. For example you might make these two macros.
%macro decode(alias);
%local varname ;
%let varname=%scan(&alias,-1,.);
put(&alias,&varname..) as &varname
%mend;
%macro datepart(alias);
%local varname ;
%let varname=%scan(&alias,-1,.);
datepart(&alias) as &varname format yymmdd10.
%mend;
Then your SQL query might look like:
create table want as
select a.patid
, %decode(a.prim_disease_cd)
, %datepart(a.onset_date)
from oralib.diagnosis a
;
You might find that using these macros makes your SAS code much harder to maintain. It might be easier to find a way to automate the generation of the text in your editor instead, or to run a program that generates the text from the metadata and then copy and paste it into your program.
PS Don't use MDY (or DMY) format for dates. It will just confuse your European (or American) friends.
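A minimal sketch of the metadata-driven approach mentioned above, assuming the table is reachable as oralib.diagnosis (as in the query above), that the coded columns all end in _cd, and that a format with the same name as each column exists. The macro variable name decode_list is hypothetical.

```sas
/* Start with an empty value so the %put below works even if nothing matches. */
%let decode_list = ;

/* Build the "put(a.X, X.) as X" text for every *_cd column from the metadata. */
proc sql noprint;
  select catx(' ', cats('put(a.', name, ',', name, '.)'), 'as', name)
    into :decode_list separated by ', '
    from dictionary.columns
    where libname = 'ORALIB'                  /* librefs are stored in upper case */
      and upcase(memname) = 'DIAGNOSIS'       /* assumed table name */
      and prxmatch('/_cd\s*$/i', name) > 0;   /* column name ends in _cd */
quit;

/* Inspect the generated text before using it. */
%put &=decode_list;
```

The generated text can be pasted into the program, or referenced directly as in select a.patid, &decode_list from oralib.diagnosis a.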

If you ever need to use the <concept>_cd code values in a future query against the Oracle data, I would create a new variable such as <concept>_value or simply <concept> instead of reusing the _cd name.
If the coded data in the Oracle query is named consistently, such as only <concept>_cd, you can have a macro examine the pulled data and create a SAS view that applies the mapping from code to value via SAS format. Since you are pulling the coded values from Oracle, there is likely one or more lookup tables in Oracle that map the code to the value, and possibly your SAS formats are built from that data.
In your use case, transforming code to value is, in essence, performing left joins against the supposed lookup table or tables. I would presume you are performing the code mapping so that it is easier to perform subset selections.
If you are only reporting the data, you may only need to apply the format to the code variable itself. Here is a sample macro that post-processes a query result and performs code-to-value mappings according to the naming convention <concept>_cd:
data code_lookups;
length id 8 fmt $31 desc $50 ;
input id fmt desc; /* list input: the values below are separated by single blanks */
datalines;
1 country_cd US
2 country_cd Canada
10 color_cd Green
11 color_cd Blue
12 color_cd Red
20 footwear_cd Shoes
21 footwear_cd Socks
22 footwear_cd Laces
run;
proc format cntlin=code_lookups(rename=(fmt=fmtname id=start desc=label));
run;
data have(label="Some result from Oracle with unmapped codes");
input item_id country_cd color_cd footwear_cd;
datalines;
1 1 11 22
2 2 11 21
3 1 12 22
3 1 10 20
run;
%macro auto_codemap (data=, out=, out_struct=view, map_func=new_var);
%local dsid i l p q varname;
%if &map_func ne format_only and &map_func ne new_var %then %do;
%put ERROR: &=map_func unknown.;
%return;
%end;
%let dsid = %sysfunc(open(&data));
proc sql;
create &out_struct &out as
select
%do i = 1 %to %sysfunc(attrn(&dsid,nvar));
%if &i > 1 %then %str(,);
%let varname = %sysfunc(varname(&dsid,&i));
&varname
%let l = %length(&varname);
%if &l > 3 %then %do;
%let p = %eval(&l-3);
%let q = %eval(&l-2);
%if %substr(%upcase(&varname),&q) = _CD %then %do;
%if &map_func = format_only %then %do;
format=%str(&varname).
%end;
%else %if &map_func = new_var %then %do;
, put(&varname, %str(&varname).) as %substr(&varname,1,&p)
%end;
%end;
%end;
%end;
from &data
;
quit;
%let dsid = %sysfunc(close(&dsid));
%mend;
options mprint;
%auto_codemap (data=have, out=want)
proc print data=want;
run;
%auto_codemap (data=have, out=want2, map_func=format_only)
proc print data=want2;
run;

Related

Conditional PROC SQL on SAS

I have a SAS script containing several PROC SQL steps. The SQL query needs to be adapted based on the value of a SAS macro variable, for example:
%let COD_COC =(52624, 52568, 52572);
%let COD_BLOC = ();
proc sql;
create table work.abordados as
select t1.cd_acao,
t1.cd_bloc,
t1.cd_tip_cnl,
t1.cd_cmco,
t1.cd_cli,
datepart(t1.ts_ctt) format=date9. as data_abordagem,
intnx('day',datepart(t1.ts_ctt), &Prazo_Compra) format=date9. as data_limite
from db2coc.ctt_cli_cmco t1
where (t1.cd_acao in &COD_COC)
and (t1.cd_bloc in &COD_BLOC) <<<<<<< facultative filter
;quit;
The second filter (t1.cd_bloc in &COD_BLOC) should be applied only if COD_BLOC is set to something other than "()".
I've been reading about "match/case" in SQL, but as far as I know that test applies to the results of queries/values. In my case, what I must test is the SAS macro variable.
How can I handle this?
Since you want to apply the COD_BLOC in-list filter only when it holds one or more values, and a proper in-list has at least 3 source code characters (the two parentheses plus a value), you can test the length of the macro variable to decide whether to use it.
When the %IF is in open code you need a %do %end block as follows:
...
%if %length(&COD_BLOC) > 2 %then %do;
and t1.cd_bloc in &COD_BLOC
%end;
...
When the code is inside a macro, you can use the above or the below
...
%if %length(&COD_BLOC) > 2 %then
and t1.cd_bloc in &COD_BLOC
;
...
Another possible coding solution is to use %sysfunc(IFC(...)) to conditionally generate code
...
%sysfunc(ifc(%length(&COD_BLOC) > 2
, and t1.cd_bloc in &COD_BLOC
, %str()
))
...
Two good ways to do this, I think.
First: the easy hack.
%let COD_COC =(52624, 52568, 52572);
%let COD_BLOC = and t1.cd_bloc in (...);
%let COD_BLOC = ;
proc sql;
create table work.abordados as
select t1.cd_acao,
t1.cd_bloc,
t1.cd_tip_cnl,
t1.cd_cmco,
t1.cd_cli,
datepart(t1.ts_ctt) format=date9. as data_abordagem,
intnx('day',datepart(t1.ts_ctt), &Prazo_Compra) format=date9. as data_limite
from db2coc.ctt_cli_cmco t1
where (t1.cd_acao in &COD_COC)
&COD_BLOC /* optional filter */
;quit;
Tada, now the filter is just ignored. To apply it, comment out or delete the empty %let COD_BLOC = ; line and put values in the (...).
Second, the more proper way is to use the macro language. This is more commonly done inside a macro, but since SAS 9.4M5 you can also use %if ... %then %do ... %end in open code.
%let COD_COC =(52624, 52568, 52572);
%let COD_BLOC = ();
proc sql;
create table work.abordados as
select t1.cd_acao,
t1.cd_bloc,
t1.cd_tip_cnl,
t1.cd_cmco,
t1.cd_cli,
datepart(t1.ts_ctt) format=date9. as data_abordagem,
intnx('day',datepart(t1.ts_ctt), &Prazo_Compra) format=date9. as data_limite
from db2coc.ctt_cli_cmco t1
where (t1.cd_acao in &COD_COC)
%if %sysevalf(%superq(COD_BLOC) ne %nrstr(%(%)),boolean) %then %do;
and (t1.cd_bloc in &COD_BLOC) /* optional filter */
%end;
;quit;
You have to be careful with the ne () bit because () are macro language syntax elements, hence the long %nrstr to make sure they're properly considered characters. (%str would be okay too, I just default to %nrstr.)

SAS sql select variable as change name to a date in MonYY7. format

I am not sure whether this is possible at all, but perhaps someone knows the answer. I need to select variables and rename them to dates in MonYY7. format. My understanding is that SAS stores dates as numbers and that formats only control how they are displayed. Would it still be possible to name the variable itself according to the format?
Here is the code I have written:
%macro try;
%let month_count_back = 12;
%let today = %sysfunc(today());
%let sysmonth = %sysfunc(month("&sysdate"d));
proc sql;
create table try as
select *,
%do i = -&sysmonth. %to -&month_count_back.-&sysmonth.+1 %by -1;
max(month(FP_NDT) = month(intnx('month',&today.,&i.))) as mn%eval(&month_count_back.+&sysmonth.+&i.)
%if &i. = -&month_count_back.-&sysmonth.+1 %then %goto leave_month;
,
%leave_month:
%end;
from work.test
group by var;
quit;
run;
%mend try;
%try;
run;
It returns dummy indicators for each month value of the 'var' variable for the previous year (the intention here is to know which values are null and which are not). However, I would like each dummy variable created be named according to the month and the year it refers to. For example, m12 should be DEC2015, m11 - NOV2015 etc... As a corollary if month_count_back is equal to, say, 36 then m36 should be DEC2015, but M12 should be DEC2013 and M1 should be JAN2013 etc...
Maybe there is way to rename it later in a data step? I have tried to loop through it, but could not control for the changing month_count_back value...
Would appreciate any suggestions, thanks!
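One possible approach, sketched under assumptions: build each column name with %sysfunc and the MONYY7. format while the macro loop generates the select list. The macro name month_flags, its parameters and the sample rows are hypothetical, and FP_NDT is assumed to hold SAS date values. The sketch flags a specific calendar month (year included), which is a slightly different test from the month() comparison in the code above.

```sas
/* Hypothetical sample data: one row per (var, FP_NDT) pair. */
data work.test;
  input var $ FP_NDT :date9.;
  format FP_NDT date9.;
  datalines;
A 15JAN2015
A 03MAR2015
B 20DEC2015
;
run;

%macro month_flags(end_year=2015, end_month=12, count=12);
  %local i last first name;
  /* First day of the most recent month to report on. */
  %let last = %sysfunc(mdy(&end_month, 1, &end_year));
  proc sql;
    create table try as
    select var
    %do i = 0 %to %eval(&count - 1);
      /* First day of the month i months before &last, as a SAS date number. */
      %let first = %sysfunc(intnx(month, &last, -&i));
      /* Text such as DEC2015, which is a valid SAS variable name. */
      %let name = %sysfunc(putn(&first, monyy7.));
      , max(intnx('month', FP_NDT, 0) = &first) as &name
    %end;
    from work.test
    group by var;
  quit;
%mend month_flags;

%month_flags(end_year=2015, end_month=12, count=12)
```

Because the name is derived from the same date value that drives the comparison, changing count (for example to 36) extends the columns further back without any renaming step afterwards.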

Two variables in a macro [SAS]

So, I want a macro that contains other macros.
Here is the code:
proc sql NOPRINT ;
select id into :l_id separated by ' ' from work.AMOSTRACHU;
select count(*) into :nr_reg separated by ' ' from tdata.work.AMOSTRACHU;
quit;
* check;
%put l_id=&l_id nr_reg=&nr_reg;
%macro ciclo_first();
%do n=1 %to &nr_reg;
%let ref=%scan(&l_id,&n);
%put ref=&ref;
proc sql;
select recetor into : lsus&ref separated by ' ' from tdata.5pct_&ref;
select count(*) into :nrsus&ref separated by ' ' from tdata.5pct_&ref;
quit;
%put lsus&ref=&lsus&ref;
%put nrsus&ref=&nrsus&ref;
%MACRO CICLO_PF_SUSref();
%do n=1 %to &nrsus&ref %by 1;
%let sus=%scan(&lsus&ref,&n);
%put sus=&sus;
%LET I = %EVAL(14);
%DO %WHILE (&I<=24);
*my code (depends on &i and &sus)* (works fine alone)
%LET I = %EVAL(&I+1);
%END;
%END;
%MEND;
%CICLO_PF_SUSref;
%MACRO CICLO_PF_SUS_CSRANK();
%do n=1 %to &nrsus&refm %by 1;
%let sus=%scan(&lsus&ref,&n);
%put sus=&sus;
%CICLO_PF_SUSPEITOSrefmsisdn;
%CICLO_PF_SUS_CSRANK;
my code ( just depends on &sus)/
%END;
%MEND;
%CICLO_PF_SUS_CSRANK;
%end;
%mend;
%ciclo_first;
I think the major problem is in this part:
%put lsus&ref=&lsus&ref;
%put nrsus&ref=&nrsus&ref;
And the error about that is:
A character operand was found in the %EVAL function or %IF condition
where a numeric operand is required. The condition was:
&nrsus&ref
How can I change this to make it work? I understand that it may not make much sense to have a reference that depends on two variables, like &nrsus&ref.
The first warnings and errors appear here:
ref=15
WARNING: Apparent symbolic reference LSUS not resolved.
lsus15=&lsus15
WARNING: Apparent symbolic reference NRSUS not resolved.
nrsus15=&nrsus15
ERROR: Expected semicolon not found. The macro will not be compiled.
How can I solve this? I am out of ideas, and it would be really useful to get this macro working so that I don't have to run the code 100 times by hand.
UPDATE [06.08.2015]
I have a table with 100 numbers in 'work.amostrachu'.
I created the macro ciclo_first in order to run the other 2 macros for this list, because if I manually replace &ref with the number I want, it works fine.
Let's suppose 'work.amostrachu' has:
ID 1 2 3 (...) till n=100
Then, with this part:
proc sql;
select recetor into : lsus&ref separated by ' ' from work.5pct_&ref;
select count(*) into :nrsus&ref separated by ' ' from work.5pct_&ref;
quit;
I want to get the elements in the column 'recetor' of work.5pct_&ref.
For ID=1 I would obtain lsus1, made up of, for example, 3 numbers (124, 564, 859).
Then %MACRO CICLO_PF_SUSref() will take these 3 numbers as input (there could be 4 or 5 or some other count).
(Here I might be referencing the list of elements from work.5pct_&ref incorrectly.)
The output of that macro would then be the input of this one: %MACRO CICLO_PF_SUS_CSRANK.
And that would be all.
%MACRO CICLO_PF_SUSref() and %MACRO CICLO_PF_SUS_CSRANK work fine if I just replace &ref with the id. That's why I tried to create a macro that would run these 2 macros for the initial list. If you have better ideas, I would be thankful.
So I want something that lets me run these two macros (%MACRO CICLO_PF_SUSref() and %MACRO CICLO_PF_SUS_CSRANK) for the list I get at the beginning:
proc sql NOPRINT ;
select id into :l_id separated by ' ' from work.AMOSTRACHU;
select count(*) into :nr_reg separated by ' ' from tdata.work.AMOSTRACHU;
quit;
[UPDATE 10.08.2015]
OK, I just read the suggested answers and worked on it.
I have a list with the (numeric) identification of 100 clients; let's call each client ref. It is in WORK.AMOSTRACHU.
I wrote the following code, which works and will help me explain what I want:
proc sql NOPRINT ;
select id into :l_id separated by ' ' from work.AMOSTRACHU;
select count(*) into :nr_reg separated by ' ' from work.AMOSTRACHU;
quit;
* check;
%put l_id=&l_id nr_reg=&nr_reg;
%macro lista_ent();
%do n=1 %to &nr_reg;
%put n=&n;
%let ref=%scan(&l_id,&n);
%put ref=&ref;
proc sql;
select recetor into :listae&ref SEPARATED BY ' ' from work.e5pct_id&ref;
select count(*) into :nre&ref separated by ' ' from work.e5pct_id&ref;
quit;
%end;
%mend;
%lista_ent;
Here is the output for the first 3 cases (of the 100 in the initial list in work.amostrachu), from the Results window in SAS:
Recetor
507
723
955
-page break-
3
-page break-
380
500
675
977
984
-page break-
5
-page break-
200
225
351
488
698
781
927
-page break-
7
So I have the values of the column 'recetor' from work.e5pct_id&ref and how many values there are for each ref. (I have shown the results for the first 3 refs, but I have them for all 100.)
Now, the first macro:
%MACRO CICLO_M_PF_ref();
%local me n i;
%do n=1 %to nre&ref %by 1;
%let me=%scan(listae&ref,&n);
%put me=&me;
%LET I = %EVAL(14);
%DO %WHILE (&I<=24);
proc sql;
create table work.smthng_&I as
select * from
work.wtv&I
WHERE A=&me OR B=&me;RUN;
PROC APPEND
DATA=work.smthng_&I
BASE=work.pf_&me
FORCE;
RUN;
%LET I = %EVAL(&I+1);
%END;
%END;
%MEND;
%CICLO_M_PF_ref;
All my doubts about & and && are around here.
With the data: my first ref has these results in column 'recetor':
Recetor
507
723
955
-page break-
3
I want to run that code for each of these values: first for 507, then for 723 and then for 955, and I want to do this for all the refs.
When the macro finishes running my code for these 3, I want it to move on to the second ref and run my code for the values of column 'recetor' for that ref: 380, 500, 675, 977 and 984.
I used this code:
proc sql;
select recetor into :listae&ref SEPARATED BY ' ' from work.e5pct_id&ref;
select count(*) into :nre&ref separated by ' ' from work.e5pct_id&ref;
quit;
because each ref has different values and the number of values can differ, as I showed above. This was meant to tell the macro to run nre&ref times, over all the values in the list listae&ref.
The error is the following:
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: nre&ref
ERROR: The %TO value of the %DO T loop is invalid.
ERROR: The macro CICLO_M_PF_REF will stop executing.
I can't quite follow your desired output and macro but here are some things I noticed.
None of your macros take parameters. If you change your macros to take parameters, you can call them individually, which may help to streamline your process.
I think you want something like this:
%macro def1(param1);
...
%mend;
%macro def2(param2);
...
%mend;
%macro execute();
%do i=1 %to 100;
%def1(param1);
%def2(param2);
%end;
%mend;
This still seems a bit awkward, so if you can explain your process with your data there may be a better way overall.
I see a number of issues you could address, but without test data it is hard to evaluate.
When trying to show the value for macro variable x&i you need to double up on the prefix &. So if I=1 and X1 = FRED then &&x&i = FRED.
When pushing values into macro variables from SQL use the automatic macro variable SQLOBS to get the record count. No need to run the query again to get the count.
You cannot select COUNT(*) into multiple macro variables. SQL will just return one count.
SAS dataset or variable names cannot start with a digit (tdata.5pct_&ref) or contain periods (tdata.work.AMOSTRACHU).
Do NOT nest macro definitions. You can nest the calls, but nesting the definitions is just going to lead to confusion.
Your actual nested macros do not make much sense. What is this variable I that is introduced? It appears to be a constant.
Why not just code them as part of the outer macro? Not much need to make them separate macros if they are only called at one place.
If you do nest them then make sure to define your local macro variables as local to prevent overwriting the values of macro variables with the same name that might exist in an outer macro scope. The N looping variable for your %DO loops for example.
First define your subroutine macros.
%MACRO CICLO_PF_SUSref(ref_list);
* CICLO_PF_SUSref ;
%local n sus;
%do n=1 %to %sysfunc(countw(&ref_list,%str( )));
%let sus=%scan(&ref_list,&n);
%put NOTE: &sysmacroname N=&n SUS=&sus;
%end;
%MEND CICLO_PF_SUSref;
%MACRO CICLO_PF_SUS_CSRANK(ref_list);
* CICLO_PF_SUS_CSRANK ;
%local n sus ;
%do n=1 %to %sysfunc(countw(&ref_list,%str( )));
%let sus=%scan(&ref_list,&n);
%put NOTE: &sysmacroname N=&n SUS=&sus;
%put NOTE: Call macro named: CICLO_PF_SUSPEITOSrefmsisdn;
%end;
%MEND CICLO_PF_SUS_CSRANK;
Then your main macro.
%macro ciclo_first(id_list);
* Start ciclo_first ;
%local n id ;
%do n=1 %to %sysfunc(countw(&id_list,%str( )));
%let id=%scan(&id_list,&n);
proc sql noprint;
select recetor into : lsus&id separated by ' ' from pct_&id;
%let nrsus&id = &sqlobs ;
quit;
%put NOTE: Current ID=&id ;
%put NOTE: &&nrsus&id records read from PCT_&ID ;
%put NOTE: Value List LSUS&id = &&LSUS&id ;
%CICLO_PF_SUSref(&&lsus&id);
%CICLO_PF_SUS_CSRANK(&&lsus&id);
%end;
* End ciclo_first ;
%mend ciclo_first;
Then setup some data and call the main macro.
* Setup test data ;
data AMOSTRACHU;
do id=1 to 2; output; end;
run;
data PCT_1 ;
do recetor='A','B';
output;
end;
run;
data PCT_2 ;
do recetor='C','D';
output;
end;
run;
options mprint;
%ciclo_first(1 2);
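The double-ampersand rule mentioned above can be checked in isolation. This is a small sketch with made-up values (ref=15 and a count of 3 are assumptions for illustration only):

```sas
%let ref = 15;

/* The name on the left resolves to nrsus15 before the assignment happens. */
%let nrsus&ref = 3;

/* &&nrsus&ref is scanned twice:            */
/*   pass 1: && -> &, &ref -> 15 => &nrsus15 */
/*   pass 2: &nrsus15 -> 3                   */
%put NOTE: nrsus&ref = &&nrsus&ref;
```

Writing &nrsus&ref instead makes the macro processor look for a variable called NRSUS, which is what produces the "Apparent symbolic reference NRSUS not resolved" warning shown in the question.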

SAS If then else Statement with Do loop

I need to get a percentage for 75 values in 75 columns individually. And I want to use a do loop so I don't have to hard code it 75 times. There are some conditions so there will be a where statement.
I am not getting the do loop correctly but I am using the below to get a percentage
case when (SUM(t1.sam)) >0 then
((SUM(t1.sam))/(SUM(t1.sam_Threshold)))*100
else 0
end
I tried the below and its a bit better:
data test;
i_1=4;
i_2=8;
i_3=4;
i_4=8;
V_ANN_V_INSP=24;
run;
%macro loop();
%let numcols=4;
proc sql;
create table test3 as
select V_ANN_V_INSP,
%do i=1 %to &numcols;
(i_&i/V_ANN_V_INSP)*100 as i_&i._perc
%if &i<&numcols %then %do;,
%end;
%end;
from test;
quit;
%mend;
%loop();
CASE WHEN is a SQL expression, not a data step statement, so you can't use a DO loop there. Depending on what exactly you're doing, there are a lot of possible solutions here. Posting additional code would help to get a more precise answer, but I can give you a few suggestions.
First, take it into a data step. Then you can use a do loop.
data want;
set have;
array nums sam1-sam75;
array denoms threshold1-threshold75;
array pct[75];
do _t = 1 to dim(nums);
pct[_t]=nums[_t]/denoms[_t];
end;
run;
Second, if you need to do this in SQL for some reason, you can write out the SQL code either in a macro or in a data step in a pre-processing step.
%macro do_sql_st;
%local _t;
%do _t = 1 %to 75;
%if &_t > 1 %then %str(,);
case when (SUM(t1.sam&_t.)) >0 then
((SUM(t1.sam&_t.))/(SUM(t1.sam_Threshold&_t.)))*100
else 0
end
as pct&_t.
%end;
%mend do_sql_st;
proc sql;
select %do_sql_st from t1 where ... ;
quit;
These are not terribly flexible; unless you have very specifically named variables, they won't work as is. You're more likely to want to do some sort of data step preprocessing I suspect, but that's very hard to explain without more detail as to how the variables are named (ie, if there is a relationship between them).
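As an illustration of the data step preprocessing option, here is a minimal sketch that builds the select list as text and stores it in a macro variable. It assumes the sam1..samN and sam_Threshold1..sam_ThresholdN naming used above; the table have, the macro variables n and pct_list, and the sample values are hypothetical.

```sas
/* Hypothetical sample data with two numerator/threshold pairs. */
data have;
  sam1 = 5; sam_Threshold1 = 10;
  sam2 = 0; sam_Threshold2 = 4;
run;

%let n = 2;   /* number of column pairs; 75 in the question */

/* Generate one CASE expression per pair and join them with commas. */
data _null_;
  length list $32767;
  do _t = 1 to &n;
    list = catx(', ', list,
      cats('case when sum(t1.sam', _t, ') > 0 then sum(t1.sam', _t,
           ')/sum(t1.sam_Threshold', _t, ')*100 else 0 end as pct', _t));
  end;
  call symputx('pct_list', list);
run;

/* Use the generated text as the select list. */
proc sql;
  create table want as
  select &pct_list
  from have t1;
quit;
```

The same data step could read the variable names from a lookup table or from dictionary.columns instead of relying on a numeric suffix, which is where this approach becomes more flexible than the macro loop.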

SAS macros: using macros in proc sql

How can I use macros in SQL, so that a macro is called for every row that is selected?
I mean something like this:
&VarTable is a table with two variables, for example Lib and Table.
Each observation in &VarTable names a table: Lib.Table.
For every table I want to:
1) check whether it exists
2) sort it
One more condition:
each table, if it exists, has a variable &VarField.
%macro mSortedTable(vLib,vTab,vVar);
%if %sysfunc(exist(&vLib..&vTab)) %then %do;
proc sort data = &vLib..&vTab;
by &vVar;
run;
&vLib..&vTab
%end;
%else %do; "" %end;
%mend mSortedTable;
proc sql noprint;
select %mSortedTable(vLib=Lib,vTab=Table,vVar=&VarField)
into: AccumVar separated by " "
from &VarTable;
quit;
How can I do this with SQL and macros?
Do you have to use sql and macros? A simple data step and call execute would do what you need here.
Below is an example that takes a data set that has a list of tables to process, checks to see if the table exists and if it does, sorts it by &VarField. This could be easily extended to sort each table by a custom set of variables if desired.
If the table does not exist, it generates a warning message.
/* create fake data */
data testdat;
length lib $8 table $32;
input lib $ table $;
datalines;
work test1
work test2
work test3
work doesnotexist
;
run;
/* create 3 data sets */
data work.test1 work.test2 work.test3;
input var1 var2 var3;
datalines;
1 34 8
2 54 5
12 5 6
;
run;
/* end create data */
%let VarTable=work.testdat;
%let VarField=var2 var3;
data _null_;
set &VarTable;
dsname=catx('.',lib,table);
if exist(dsname) then do;
call execute("proc sort data=" || strip(dsname) || "; by &VarField; run;");
end;
else do;
put "WARNING: The data set does not exist: " lib= table=;
end;
run;
Call execute is a good solution, however if the data step code being "executed" is complicated (which it is not in this example), I find it hard to debug.
Another method is to put all the values into macro variables and then loop through them in a macro %do loop:
(building on @cmjohns' data)
/* create fake data */
data testdat;
length lib $8 table $32;
input lib $ table $;
datalines;
work test1
work test2
work test3
work doesnotexist
;
run;
/* create 3 data sets */
data work.test1 work.test2 work.test3;
input var1 var2 var3;
datalines;
1 34 8
2 54 5
12 5 6
;
run;
/* end create data */
%let VarTable=work.testdat;
%let VarField=var2 var3;
proc sql noprint;
select count(lib)
into :cnt
from &vartable;
%Let cnt=&cnt;
select strip(lib), strip(table)
into :lib1 - :lib&cnt, :table1 - :table&cnt
from &vartable;
quit;
%Macro test;
%Do i = 1 %to &cnt;
%Let lib=&&lib&i;
%Let table=&&table&i;
%Let dsn=&lib..&table;
%if %sysfunc(exist(&dsn)) %then %do;
Proc sort data=&dsn;
by &varfield;
run;
%end;
%else %do;
%put WARNING: The data set does not exist: &dsn;
%end;
%end;
%Mend;
%test