macro into a table or a macro variable with sas - sql

I have this macro. The aim is to take the variable names stored in the table dicofr and put them into the macro variable nvarname using symput.
However, something is not working correctly, because &nvarname is not seen as a variable.
This is the content of dico&&pays&l
varname descr
var12 aza
var55 ghj
var74 mcy
This is the content of dico&&pays&l..1
varname
var12
var55
var74
Below is my code
%macro testmac;
%let pays1=FR ;
%do l=1 %to 1 ;
data dico&&pays&l..1 ; set dico&&pays&l (keep=varname);
call symput("nvarname",trim(left(_n_))) ;
run ;
data a&&pays&l;
set a&&pays&l;
nouv_date=mdy(substr(date,6,2),01,substr(date,1,4));
format nouv_date monyy5.;
run;
proc sql;
create table toto
(nouv_date date , nomvar varchar (12));
quit;
proc sql;
insert into toto SELECT max(nouv_date),"&nvarname" as nouv_date as varname FROM a&&pays&l WHERE (&nvarname ne .);
%end;
%mend;
%testmac;
A subsidiary question: is it possible to store the varname and the date related to that varname in a macro variable? My manager told me about this, but I have never done that before.
Thanks in advance.
Edited:
I have this table
date col1 col2 col3 ... colx
1999M12 . . . .
1999M11 . 2 . .
1999M10 1 3 . 3
1999M9 0.2 3 2 1
For each column, I'm trying to find the maximum date on which the value in that column is not missing.
For col1, it would be 1999M10. For col2, it would be 1999M11 etc ...

Based on your update, I think the following code does what you want. If you don't mind sorting your input dataset first, you can get all the values you're looking for with a single data step - no macros required!
data have;
length date $7;
input date col1 col2 col3;
format date2 monyy5.;
date2 = mdy(substr(date,6,2),1,substr(date,1,4));
datalines;
1999M12 . . .
1999M11 . 2 .
1999M10 1 3 .
1999M09 0.2 3 2
;
run;
/*Required for the following data step to work*/
/*Doing it this way allows us to potentially skip reading most of the input data set*/
proc sort data = have;
by descending date2;
run;
data want(keep = max_date:);
array max_dates{*} max_date1-max_date3;
array cols{*} col1-col3;
format max_date: monyy5.;
do until(eof); /*Begin DOW loop*/
set have end = eof;
/*Check to see if we've found the max date for each col yet.*/
/*Save the date for that col if applicable*/
j = 0;
do i = 1 to dim(cols);
if missing(max_dates[i]) and not(missing(cols[i])) then max_dates[i] = date2;
j + missing(max_dates[i]);
end;
/*Use j to count how many cols we still need dates for.*/
/* If we've got a full set, we can skip reading the rest of the data set*/
if j = 0 or eof then do; /*Also output at end of file, in case some col is always missing*/
output;
stop;
end;
end; /*End DOW loop*/
run;
EDIT: if you want to output the names alongside the max date for each, that can be done with a slight modification:
data want(keep = col_name max_date);
array max_dates{*} max_date1-max_date3;
array cols{*} col1-col3;
format max_date monyy5.;
do until(eof); /*Begin DOW loop*/
set have end = eof;
/*Check to see if we've found the max date for each col yet.*/
/*If not then save date from current row for that col*/
j = 0;
do i = 1 to dim(cols);
if missing(max_dates[i]) and not(missing(cols[i])) then max_dates[i] = date2;
j + missing(max_dates[i]);
end;
/*Use j to count how many cols we still need dates for.*/
/* If we've got a full set, we can skip reading the rest of the data set*/
if j = 0 or eof then do;
do i = 1 to dim(cols);
col_name = vname(cols[i]);
max_date = max_dates[i];
output;
end;
stop;
end;
end; /*End DOW loop*/
run;

It looks to me like you're trying to use macros to generate INSERT INTO statements to populate your table. It's possible to do this without using macros at all, which is the approach I'd recommend.
You could use a data step to write the INSERT INTO statements out to a file. Then, following the data step, use a %include statement to run the file.
This will be easier to write, maintain and debug, and will also perform better.
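A minimal sketch of that approach, under assumptions that are not in the question: the dictionary table is called dicofr1 and holds a character variable varname, the data table is called afr, and afr already contains the numeric date nouv_date. The fileref ins and the variable stmt are hypothetical names used only for illustration.

```sas
/* Assumed inputs (hypothetical names): DICOFR1 with a character variable  */
/* VARNAME, and AFR with a numeric date NOUV_DATE plus the numeric columns */
/* that VARNAME refers to.                                                 */
proc sql;
  create table toto (nouv_date num format=monyy5., nomvar char(12));
quit;

filename ins temp;

data _null_;
  set dicofr1 end=eof;
  length stmt $300;
  file ins;
  if _n_ = 1 then put 'proc sql;';
  /* Build one INSERT statement per variable name */
  stmt = catx(' ',
              'insert into toto select max(nouv_date),',
              quote(strip(varname)),
              'from afr where',
              varname,
              'is not missing;');
  put stmt;
  if eof then put 'quit;';
run;

/* Run the generated statements; SOURCE2 echoes them to the log */
%include ins / source2;
```

Each generated statement is intended to add one row to toto: the latest date on which that variable is not missing, alongside the variable's name. Because the statements are written to a file, you can open it and check exactly what will run before including it.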

Related

How to select a range of columns in a case statement in proc SQL?

I have around 80 columns named diag1 to diag80. I am wondering how I can pick just 30 of those columns and apply a case statement to them in PROC SQL. The following code produces an error because it doesn't understand the range.
proc sql;
create table data_WANT as
select *,
case
when **diag1:diag30** in ('F00','G30','F01','F02','F03','F051') then 1
else 0
end as p_nervoussystem
from data_HAVE;
quit;
Thank you, any help is appreciated!
You have two problems with that attempted syntax. First, variable lists are not supported by PROC SQL (since they are not supported by SQL syntax). Second, there is no simple syntax to search N variables for a list of M strings.
You will need a loop of some kind. It will be much easier in SAS code than in SQL.
For example, you could make an array to reference your 30 variables, then loop over the variables checking whether each one has a value in the list of values. You can stop checking once one is found.
data want;
set have;
array vars diag1-diag30;
p_nervoussystem=0;
do index=1 to dim(vars) while (not p_nervoussystem);
p_nervoussystem = vars[index] in ('F00','G30','F01','F02','F03','F051');
end;
run;
The inverse pattern to @Tom's approach is to search for each nervous system diagnostic code in turn, either:
via INDEXW over a concatenation of the observed diagnoses, or
via WHICHC over an array of the observed diagnoses
data have;
infile datalines missover;
length id 8;
array dx(30) $5;
input id (dx1-dx50) (50*:$5.);
datalines;
1 A00 B00 A12
2 F00 Z12 T45
3 A01 A02 B12 F00
4 Q12
5 Q13
6 T14
7 F44 F45 F46
8 . . . . . . . . . . . . . . G30
;
data want;
length p_nervoussystem p_ns 4;
set have;
array dx dx:;
array ns(6) $5 _temporary_ ('F00','G30','F01','F02','F03','F051');
dx_catx = catx(' ', of dx(*));* drop dx_catx; * way 1;
do _n_ = 1 to dim(ns) until(p_nervoussystem);
p_nervoussystem = 0 < indexw(dx_catx, trim(ns(_n_))); * way 1;
p_ns = 0 < whichc(ns(_n_), of dx(*)); * way 2;
end;
run;
Try querying sys.tables and sys.columns and filtering for your columns.
SELECT * FROM sys.tables INNER JOIN sys.columns ON columns.object_id = tables.object_id

Use SAS macro variable to create variable name in PROC SQL

I'm trying to create a set of flags based off of a column of character strings in a data set. The string has thousands of unique values, but I want to create flags for only a small subset (say 10). I'd like to use a SAS macro variable to do this. I've tried many different approaches, none of which have worked. Here is the code that seems simplest and most logical to me, although it's still not working:
%let Px1='12345';
PROC SQL;
CREATE TABLE CLAIM1 AS
SELECT
b.MEMBERID
, b.ENROL_MN
, CASE WHEN (a.PROCEDURE = &Px1.) THEN 1 ELSE 0 END AS CPT_+&Px1.
, a.DX1
, a.DX2
, a.DX3
, a.DX4
FROM ENROLLMENT as b
left join CLAIMS as a
on a.MEMBERID = b.MEMBERID;
QUIT;
Obviously there is only one flag in this code, but once I figure it out the idea is that I would add additional macro variables and flags. Here is the error message I get:
8048 , CASE WHEN (PROCEDURE= &Px1.) THEN 1 ELSE 0 END AS CPT_+&Px1.
-
78
ERROR 78-322: Expecting a ','.
It seems that the cause of the problem is related to combining the string CPT_ with the macro variable. As I mentioned, I've tried several approaches to addressing this, but none have worked.
Thanks in advance for your help.
Something like this normally requires dynamic SQL (although I am not sure how that will work with SAS; I believe it may depend on how you have established the connection with the database).
Proc sql;
DECLARE @px1 varchar(20) = '12345'
,@sql nvarchar(max) =
'SELECT b.MEMBERID
, b.ENROL_MN
, CASE WHEN (a.PROCEDURE = ' + @px1 + ') THEN 1 ELSE 0
END AS CPT_' + @px1 + '
, a.DX1
, a.DX2
, a.DX3
, a.DX4
FROM ENROLLMENT as b
left join CLAIMS as a
on a.MEMBERID = b.MEMBERID'
EXEC sp_executesql @sql;
QUIT;
Your issue here is the quotes in the macro variable.
%let Px1='12345';
So now SAS is seeing this:
... THEN 1 ELSE 0 END AS CPT_+'12345'
That's not remotely legal! You need to remove the '.
%let Px1 = 12345;
Then add back on at the right spot.
CASE WHEN a.procedure = "&px1." THEN 1 ELSE 0 END AS CPT_&px1.
Note " not ' as that lets the macro variable resolve.
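Putting those two changes back into the query from the question gives something like the sketch below. It assumes, as the quoted comparison in the question suggests, that PROCEDURE is a character column; the table and column names are taken from the question.

```sas
%let Px1 = 12345;   /* no quotes stored in the macro variable */

proc sql;
  create table claim1 as
  select b.memberid
       , b.enrol_mn
         /* double quotes let &Px1. resolve inside the string literal */
       , case when a.procedure = "&Px1." then 1 else 0 end as CPT_&Px1.
       , a.dx1
       , a.dx2
       , a.dx3
       , a.dx4
  from enrollment as b
       left join claims as a
       on a.memberid = b.memberid;
quit;
```

With Px1 set to 12345, the alias resolves to CPT_12345, which is a valid SAS variable name.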
If you have a list it might help to put the list into a table. Then you can use SAS code to generate the code to make the flag variables instead of macro code.
Say a table with PX code variable.
data pxlist;
input px $10. ;
cards;
12345
4567
;
You could then use PROC SQL query to generate code to make the flag variable into a macro variable.
proc sql noprint;
select catx(' ','PROCEDURE=',quote(trim(px)),'as',cats('CPT_',px))
into :flags separated by ','
from pxlist
;
%put &=flags;
quit;
Code looks like
PROCEDURE= "12345" as CPT_12345,PROCEDURE= "4567" as CPT_4567
So if we make some dummy data.
data enrollment ;
length memberid $8 enrol_mn $6 ;
input memberid enrol_mn;
cards;
1 201612
;
data claims;
length memberid $8 procedure $10 dx1-dx4 $10 ;
input memberid--dx4 ;
cards;
1 12345 1 2 . . .
1 345 1 2 3 . .
;
We can then combine the two tables and create the flag variables.
proc sql noprint;
create table want as
select *,&flags
from ENROLLMENT
natural join CLAIMS
;
quit;
Results
memberid  procedure  dx1  dx2  dx3  dx4  enrol_mn  CPT_12345  CPT_4567
1         12345      1    2              201612    1          0
1         345        1    2    3         201612    0          0

Rename column headers - either after a key database in sas - or after values from first row

I need to rename the column headers of my variables so they match what I have in my key list. I attached a picture below to describe what I have and what I need.
My Data
I don't necessarily need actual code, just an idea of how to make it happen. :)
Thank you so much folks, and so sorry about the changes, I have never posted a question before.
If you have a table like
NEW1 NEW2 NEW3
OLDX OLDY OLDZ
And you want to use it to generate rename statement like
rename oldx=new1 oldy=new2 oldz=new3 ;
Then an easy way to do it is to use PROC TRANSPOSE to convert it into a separate row for each name pair.
proc transpose data=have out=names ;
var _all_;
run;
Which will get you a table like
_NAME_ COL1
NEW1 OLDX
NEW2 OLDY
NEW3 OLDZ
Then you can either use PROC SQL to quickly generate a macro variable with the pairs.
proc sql noprint;
select catx('=',col1,_name_) into :rename separated by ' '
from names;
quit;
data new ;
set old;
rename &rename ;
run;
If the list of names is too long to put into a single macro variable then just use a data step to generate the rename statement to a text file and use %INCLUDE to run it where you want.
filename code temp;
data _null_;
set names end=eof;
file code ;
if _n_=1 then put 'rename' ;
put col1 '=' _name_ ;
if eof then put ';';
run;
data new ;
set old;
%include code ;
run;
EDIT
You could probably do the last step directly from the data set and skip the proc transpose.
filename code temp;
data _null_;
set have ;
array _X _character_ ;
file code ;
put 'rename ' @ ;
do i=1 to dim(_X);
oldname = _x(i);
newname = vname(_x(i));
put oldname '=' newname @;
end;
put / ';' ;
stop;
run;
You can use column aliases to change what's displayed in the results header row.
SELECT A AS 'NewA',
B AS 'OtherB',
C AS 'diffC'
FROM <<Table>>
If you want 'NewA OtherB diffC' as a row in the results, you could do this:
SELECT 'NewA' AS 'A',
'OtherB' AS 'B',
'diffC' AS 'C'
UNION
SELECT A,
B,
C
FROM <<Table>>

SAS PROC SQL NOT CONTAINS multiple values in one statement

In PROC SQL, I need to select all rows where a column called "NAME" does not contain multiple values "abc", "cde" and "fbv" regardless of what comes before or after these values. So I did it like this:
SELECT * FROM A WHERE
NAME NOT CONTAINS "abc"
AND
NAME NOT CONTAINS "cde"
AND
NAME NOT CONTAINS "fbv";
which works just fine, but I imagine it would be a headache if we had a hundred conditions. So my question is: can we accomplish this in a single statement in PROC SQL?
I tried using this:
SELECT * FROM A WHERE
NOT CONTAINS(NAME, '"abc" AND "cde" AND "fbv"');
but this doesn't work in PROC SQL, I am getting the following error:
ERROR: Function CONTAINS could not be located.
I don't want to use LIKE.
You could use regular expressions, I suppose.
data a;
input name $;
datalines;
xyabcde
xyzxyz
xycdeyz
xyzxyzxyz
fbvxyz
;;;;
run;
proc sql;
SELECT * FROM A WHERE
NAME NOT CONTAINS "abc"
AND
NAME NOT CONTAINS "cde"
AND
NAME NOT CONTAINS "fbv";
SELECT * FROM A WHERE
NOT (PRXMATCH('~ABC|CDE|FBV~i',NAME));
quit;
You can't use CONTAINS that way, though.
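You can use NOT IN, but only when NAME must not equal one of the listed values exactly; it does not test for substrings: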
SELECT * FROM A WHERE
NAME NOT IN ('abc','cde','fbv');
If there are too many items to reasonably list inside the code, you can create a table (work.words below) to store the words and iterate over it to check for occurrences:
data work.values;
input name $;
datalines;
xyabcde
xyzxyz
xycdeyz
xyzxyzxyz
fbvxyz
;
run;
data work.words;
length word $50;
input word $;
datalines;
abc
cde
fbv
;
run;
data output;
set values;
/* build a hash of words and an iterator over it */
length word $50;
if _n_ = 1 then do;
/* this runs once only */
call missing(word);
declare hash words (dataset: 'work.words');
words.defineKey('word');
words.defineData('word');
words.defineDone();
declare hiter iter('words');
end;
/* iterate hash of words */
rc = iter.first();
found = 0;
do while (rc=0);
if index(name, trim(word)) gt 0 then do; /* check if word present using INDEX function */
found= 1;
rc = 1;
end;
else rc = iter.next();
end;
if found = 0 then output; /* output only if no word found in name */
drop word rc found;
run;

sum equal zero for variables for sas

I looked at the internet but I could not find anything relevant.
I have a table with thousands of variables.
I'm trying to sum each variable separately and find out which variables have a sum equal to zero.
example
col1 col2 col3
0 0 0
1 0 2
1 0 3
results
col2
0
However, my proc means does not want to take my where clause.
proc sql;
create table toto as select nomvar,monotonic() as num_lig from dicofr
where nomvar <> 'date';
proc sql;
select nomvar into :varnom separated by ' ' from toto
where num_lig between 0 and 1000;
%put varnom: &varnom;
proc means data=afr sum (where=(sum(&varnom)=0) ;
var &varnom;
output out=want;
run;
What am I doing wrong?
Thank you for anything that can lead me to a solution.
This will do it. I believe this requires SAS 9.3+ for the stackodsoutput option.
*Generating some data;
data have;
array x[100];
call streaminit(7);
do i = 1 to 20;
do _t = 1 to dim(x);
if rand('Uniform') < 0.9 then x[_t]=0;
else x[_t]=1;
end;
output;
end;
run;
*ods output grabs what you want from proc means;
ods output summary=want(where=(sum=0));
proc means data=have sum stackodsoutput;
var x:;
run;
ods output close;
Several ways to achieve this, here's a datastep/array method :
%LET NVARS = 1000 ;
data want ;
set have end=eof ;
array n{*} col1-col&NVARS ;
array t{&NVARS} 8 _TEMPORARY_ ;
do i = 1 to dim(n) ;
t{i} + n{i} ;
end ;
if eof then do ;
do i = 1 to dim(t) ;
if missing(t{i}) or t{i} = 0 then do ;
vname = vname(n{i}) ;
put "Sum of " vname "= 0" ; /* write message to log */
output ; /* Write to dataset */
end ;
end ;
end ;
keep vname ;
run ;