Execute Macro inside SQL statement - sql

The Situation:
I have a table mytable with two columns: tablename and tablefield:
|-----------|------------|
| tablename | tablefield |
|-----------|------------|
| table1 | id |
| table2 | date |
| table3 | etc |
|-----------|------------|
My core objective here is basically, make a Select for each of these tablenames, showing the MAX() value of its corresponding tablefield.
Proc SQL;
Select MAX(id) From table1;
Select MAX(date) From table2;
Select MAX(etc) From table3;
Quit;
ps: The solution have to pull the data from the table, so whether the table change its values, the solutions will make its changes also.
What I have tried:
From the most of my attempts, this is the most sofisticated and I believe the nearest from the solution:
proc sql;
create table table_associations (
memname varchar(255), dt_name varchar(255)
);
Insert Into table_associations
values ("table1", "id")
values ("table2", "date")
values ("table3", "etc");
quit;
%Macro Max(field, table);
Select MAX(&field.) From &table.;
%mend;
proc sql;
Select table, field, (%Max(field,table))
From LIB.table_associations
quit;
Creating the Macro, my intend is clear but, for this example, I should solve 2 problems:
Execute a macro inside an SQL Statement; And
Make the macro understand its String value parameter as an SQL command.

In a data step, you can use call execute to do what you're describing.
%Macro Max(field, table);
proc sql;
Select MAX(&field.) From &table.;
quit;
%mend;
data _null_;
set table_associations;
call execute('%MAX('||field||','||table||')');
run;

Macros are not necessary here as you can just generate the code using put statements in the data step:
filename gencode temp;
data _null_;
set table_associations end=eof;
file gencode;
if _n_=1 then put 'proc sql;';
put 'select max(' tablefield ') from ' tablename ';';
if eof then put 'quit;';
run;
%include gencode / source2;
filename gencode clear;
The code is written to the a temporary file named 'gencode'. You could make this file permanent if you want. _n_=1 and end=eof are used to print the statements before and after the queries. Finally, %include gencode runs the code and the source2 option prints the code to the log.

Related

Changing FROM statement with a variable

I am trying to change the name of the table I am getting my data from
Like this:
COREPOUT.KUNDE_REA_UDL_202112 --> COREPOUT.KUNDE_REA_UDL_202203
I create my variable like this:
PROC SQL NOPRINT;
SELECT DISTINCT
PERIOKVT_PREV_BANKSL_I_YYMMN6
INTO :PERIOKVT_PREV_BANKSL_I_YYMMN6
FROM Datostamp_PREV_Kvartal;
This is the code I want to use the variable for.
%_eg_conditional_dropds(WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
SELECT t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
FROM COREPOUT.KUNDE_REA_UDL_202203 t1;
QUIT;
I have tried things like:
FROM string("COREPOUT.KUNDE_REA_UDL_",PERIOKVT_PREV_BANKSL_I_YYMMN6," t1";
I hope you can point me in the right direction.
Use & to reference and resolve macro variables into strings (e.g. &PERIOKVT_PREV_BANKSL_I_YYMMN6).
proc sql noprint;
select distinct PERIOKVT_PREV_BANKSL_I_YYMMN6
into :PERIOKVT_PREV_BANKSL_I_YYMMN6
from Datostamp_PREV_Kvartal
;
quit;
proc sql;
create table WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
select t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
from &PERIOKVT_PREV_BANKSL_I_YYMMN6 t1
;
quit;
You can use CALL SYMPUTX() to move values from a dataset into a macro variable.
data _null_;
set Datostamp_PREV_Kvartal;
call symputx('dataset_name',PERIOKVT_PREV_BANKSL_I_YYMMN6);
stop;
run;
Then use the value of the macro variable to insert the dataset name into the code at the appropriate place. So your posted SQL is equivalent to this simple data step.
data QUERY_FOR_KUNDE_REA_UDL_20_0000;
set &dataset_name. ;
KundeNum = input(cpr_se,32.);
keep Z_ORDINATE KundeNum;
run;
Note: I did not see any definition of a user defined informat named W in your posted code so I just replaced it with the normal numeric informat instead since it looked like you where trying to convert a character value into a number.
The solution I ended up with was inspried by #Stu Sztukowski response:
I made a data step to concat the variable and created a macro variable.
data Concat_var;
str_PERIOKVT_PREV_YYMMN6 = CAT("COREPOUT.KUNDE_REA_UDL_",&PERIOKVT_PREV_BANKSL_I_YYMMN6," t1");
run;
PROC SQL NOPRINT;
SELECT DISTINCT
str_PERIOKVT_PREV_YYMMN6
INTO :str_PERIOKVT_PREV_YYMMN6
FROM Concat_var;
Then I used the variable in the FROM statement:
%_eg_conditional_dropds(WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_KUNDE_REA_UDL_20_0000 AS
SELECT t1.Z_ORDINATE,
(input(t1.cpr_se,w.)) AS KundeNum
FROM &str_PERIOKVT_PREV_YYMMN6;
QUIT;
I hope this helps someone else in the future.

Using macro for formula proc sql in SAS

I need some help with macros in SAS. I want to sum variables (for example, from v_1 to v_7) to aggregate them, grouping by year. There are plenty of them, so I want to use macro. However, it doesn't work (I get only v_1) I would really appreciate Your help.
%macro my_macro();
%local i;
%do i = 1 %to 7;
proc sql;
create table my_table as select
year,
sum(v_&i.) as v_&i.
from my_table
group by year
;
quit;
%end;
%mend;
/* I don't know to run this macro - is it ok? */
data run_macro;
set my_table;
%my_macro();
run;
The macro processor just generates SAS code and then passes onto to SAS to run. You are calling a macro that generates a complete SAS step in the middle of your DATA step. So you are trying to run this code:
data run_macro;
set my_table;
proc sql;
create table my_table as select
year,
sum(v_1) as v_1
from my_table
group by year
;
quit;
proc sql;
create table my_table as select
year,
sum(v_1) as v_1
from my_table
group by year
;
quit;
...
So first you make a copy of MY_TABLE as RUN_MACRO. Then you overwrite MY_TABLE with a collapsed version of MY_TABLE that has just two variables and only one observations per year. Then you try to collapse it again but are referencing a variable named V_2 that no longer exists.
If you simply move the %DO loop inside the generation of the SQL statement it should work. Also don't overwrite your input dataset. Here is version of the macro will create a new dataset name MY_NEW_TABLE with 8 variables from the existing dataset named MY_TABLE.
%macro my_macro();
%local i;
proc sql;
create table my_NEW_table as
select year
%do i = 1 %to 7;
, sum(v_&i.) as v_&i.
%end;
from my_table
group by year
;
quit;
%mend;
%my_macro;
Note if this is all you are doing then just use PROC SUMMARY. With regular SAS code instead of SQL code you can use variable lists like v_1-v_7. So there is no need for code generation.
proc summary nway data=my_table ;
class year ;
var v_1 - v_7;
output out=my_NEW_table sum=;
run;

check whether proc append was successful

I have some code which appends yesterday's data to [large dataset], using proc append. After doing so it changes the value of the variable "latest_date" in another dataset to yesterday's date, thus showing the maximum date value in [large dataset] without a time-consuming data step or proc sql.
How can I check, within the same program in which proc append is used, whether proc append was successful (no errors)? My goal is to change the "latest_date" variable in this secondary dataset only if the append is successful.
Try the automatic macro variable &SYSCC.
data test;
do i=1 to 10;
output;
end;
run;
data t1;
i=11;
run;
data t2;
XXX=12;
run;
proc append base=test data=t1;
run;
%put &syscc;
proc append base=test data=t2;
run;
%put &syscc;
I'm using the %get_table_size macro, which I found here. My steps are
run %get_table_size(large_table, size_preappend)
Create dataset called to_append
run %get_table_size(to_append, append_size)
run proc append
run %get_table_size(large_table, size_postappend)
Check if &size_postappend = &size_preappend + &append_size
Using &syscc isn't exactly what I wanted, because it doesn't check specifically for an error in proc append. It could be thrown off by earlier errors.
You can do this by counting how many records are in the table pre and post appending. This would work with any sas table or database.
The best practice is to always have control table for your process to log run time and number of records read.
Code:
/*Create input data*/
data work.t1;
input row ;
datalines;
1
2
;;
run;
data work.t2;
input row ;
datalines;
3
;;
run;
/*Create Control table, Run this bit only once, otherwise you delete the table everytime*/
data work.cntrl;
length load_dt 8. source 8. delta 8. total 8. ;
format load_dt datetime21.;
run;
proc sql; delete * from work.cntrl; quit;
/*Count Records before append*/
proc sql noprint ; select count(*) into: count_t1 from work.t1; quit;
proc sql noprint; select count(*) into: count_t2 from work.t2; quit;
/*Append data*/
proc append base=work.t1 data=work.t2 ; run;
/*Count Records after append*/
proc sql noprint ; select count(*) into: count_final from work.t1; quit;
/*Insert counts and timestampe into the Control Table*/
proc sql noprint; insert into work.cntrl
/*values(input(datetime(),datetime21.), input(&count_t1.,8.) , input(&count_t2.,8.) , input(&count_final.,8.)) ; */
values(%sysfunc(datetime()), &count_t1. , &count_t2., &count_final.) ;
quit;
Output: Control table is updated

Conditional Insert using SAS Proc SQL

I am trying to add records from one smaller table into a very large table if the primary key value for the rows in the smaller table is not in the larger one:
data test;
Length B C $4;
infile datalines delimiter=',';
input a b $ c $;
datalines;
1000,Test,File
2000,Test,File
3000,Test,File
;
data test2;
Length B C $4;
infile datalines delimiter=',';
input a b $ c $;
datalines;
1000,Test,File
4000,Test,File
;
proc sql;
insert into test
select * from test2
where a not in (select a from test2);
quit;
This however insets no records into the table Test. Can anyone tell me what I am doing wrong? The end result should be that the row where a = 4000 should be added to the table Test.
EDIT:
Using where a not in (select a from test) was what I originally tried and it generated the following error:
WARNING: This DELETE/INSERT statement recursively references the target table. A consequence of this is a possible data integrity problem.
ERROR: You cannot reopen WORK.TEST.DATA for update access with member-level control because WORK.TEST.DATA is in use by you in resource
environment SQL.
ERROR: PROC SQL could not undo this statement if an ERROR were to happen as it could not obtain exclusive access to the data set. This
statement will not execute as the SQL option UNDO_POLICY=REQUIRED is in effect.
224 quit;
Thanks
You can either do the process in two steps. First create the table of records to insert and then insert them.
proc sql ;
create table to_add as
select * from test2
where a not in (select a from test)
;
insert into test select * from to_add ;
quit;
Or you could just change the setting for the UNDO_POLICY option and SAS will let you reference TEST while updating TEST.
proc sql undo_policy=none;
insert into test
select * from test2
where a not in (select a from test)
;
quit;

drop the duplicate columns with sas

Would the following code work?
I'm dropping duplcated columns from my tables. I feel a bit confused after thinking about it. My code looks like working but I'm concerned about unseen mistakes.
proc sql;
create table toto
as select min(nomvar) as nomvar,count(intitule) as compte
from dicoat
group by intitule
having count(intitule) > 1;
data work.toto;
set toto;
do while(cpte>=1);
proc sql;
delete from dicoat where nomvar in (select nomvar from toto);
insert into toto
select min(nomvar) as nomvar,count(intitule) as compte from dicoat
group by intitule
having count(intitule) > 1;
end;
run;
data _null_;
file tempf;
set toto end=lastobs;
if _n_=1 then put "data aat;set aat (drop=";
put var /;
if lastobs then put ");run;";
run;
%inc tempf;
filename tempf clear;
After some thoughts and some questioning (ok - lots of questionning), one of my acquaintance helped me out with this
proc sort data=dicoat;
by title;
run;
data _null_;
set dicoat end=last;
length dropvar $1000;
retain dropvar;
by title;
if not first.title then dropvar = catx(' ',dropvar,nomvar);
if last then call symput('dropvar',trim(dropvar));
run;
data aat;
set aat(drop=&DROPVAR.);
run;
It should do the tricks of removing duplicate columns. And no,proc sql does not work within a data step.
Best.