SAS/SQL - Create SELECT Statement Using Custom Function

UPDATE
Given this new approach using INTNX I think I can just use a loop to simplify things even more. What if I made an array:
data;
array period [4] $ var1-var4 ('day' 'week' 'month' 'year');
run;
And then tried to make a loop for each element:
%MACRO sqlloop;
proc sql;
%DO k = 1 %TO dim(period); /* in case i decide to drop something from array later */
%LET bucket = &period(k)
CREATE TABLE output.t_&bucket AS (
SELECT INTX( "&bucket.", date_field, O, 'E') AS test FROM table);
%END
quit;
%MEND
%sqlloop
This doesn't quite work, but it captures the idea I want. It could just run the query for each of those values in INTNX. Does that make sense?
I have a couple of prior questions that I'm merging into one. I got some really helpful advice on the others and hopefully this can tie it together.
I have the following function that creates a dynamic string to populate a SELECT statement in a SAS proc sql; code block:
proc fcmp outlib = output.funcs.test;
function sqlSelectByDateRange(interval $, date_field $) $;
day = date_field||" AS day, ";
week = "WEEK("||date_field||") AS week, ";
month = "MONTH("||date_field||") AS month, ";
year = "YEAR("||date_field||") AS year, ";
IF interval = "week" THEN
do;
day = '';
end;
IF interval = "month" THEN
do;
day = '';
week = '';
end;
IF interval = "year" THEN
do;
day = '';
week = '';
month = '';
end;
where_string = day||week||month||year;
return(where_string);
endsub;
quit;
I've verified that this creates the kind of string I want:
data _null_;
q = sqlSelectByDateRange('month', 'myDateColumn');
put q =;
run;
This yields:
q=MONTH(myDateColumn) AS month, YEAR(myDateColumn) AS year,
This is exactly what I want the SQL string to be. From prior questions, I believe I need to call this function in a MACRO. Then I want something like this:
%MACRO sqlSelectByDateRange(interval, date_field);
/* Code I can't figure out */
%MEND
PROC SQL;
CREATE TABLE output.t AS (
SELECT
%sqlSelectByDateRange('month', 'myDateColumn')
FROM
output.myTable
);
QUIT;
I am having trouble understanding how to make the code call this macro and interpret as part of the SQL SELECT string. I've tried some of the previous examples in other answers but I just can't make it work. I'm hoping this more specific question can help me fill in this missing step so I can learn how to do it in the future.

Two things:
First, you should be able to use %SYSFUNC to call your custom function.
%MACRO sqlSelectByDateRange(interval, date_field);
%SYSFUNC( sqlSelectByDateRange(&interval., &date_field.) )
%MEND;
Note that you should not use quotation marks when calling a function via SYSFUNC. Also, you cannot use SYSFUNC with FCMP functions until SAS 9.2. If you are using an earlier version, this will not work.
Second, you have a trailing comma in your select clause. You may need a dummy column as in the following:
PROC SQL;
CREATE TABLE output.t AS (
SELECT
%sqlSelectByDateRange(month, myDateColumn)
0 AS dummy
FROM
output.myTable
);
QUIT;
(Notice that there is no comma before dummy, as the comma is already embedded in your macro.)
UPDATE
I read your comment on another answer:
I also need to be able to do it for different date ranges and on a very ad-hoc basis, so it's something where I want to say "by month from june to december" or "weekly for two years" etc when someone makes a request.
I think I can recommend an easier way to accomplish what you are doing. First, I'll create a very simple dataset with dates and values. The dates are spread throughout different days, weeks, months and years:
DATA Work.Accounts;
Format Opened yymmdd10.
Value dollar14.2
;
INPUT Opened yymmdd10.
Value dollar14.2
;
DATALINES;
2012-12-31 $90,000.00
2013-01-01 $100,000.00
2013-01-02 $200,000.00
2013-01-03 $150,000.00
2013-01-15 $250,000.00
2013-02-10 $120,000.00
2013-02-14 $230,000.00
2013-03-01 $900,000.00
RUN;
You can now use the INTNX function to create a third column to round the "Opened" column to some time period, such as a 'WEEK', 'MONTH', or 'YEAR' (see this complete list):
%LET Period = YEAR;
PROC SQL NOPRINT;
CREATE TABLE Work.PeriodSummary AS
SELECT INTNX( "&Period.", Opened, 0, 'E' ) AS Period_End FORMAT=yymmdd10.
, SUM( Value ) AS TotalValue FORMAT=dollar14.
FROM Work.Accounts
GROUP BY Period_End
;
QUIT;
Output for WEEK:
Period_End TotalValue
2013-01-05 $540,000
2013-01-19 $250,000
2013-02-16 $350,000
2013-03-02 $900,000
Output for MONTH:
Period_End TotalValue
2012-12-31 $90,000
2013-01-31 $700,000
2013-02-28 $350,000
2013-03-31 $900,000
Output for YEAR:
Period_End TotalValue
2012-12-31 $90,000
2013-12-31 $1,950,000
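The loop from the question's update can be built on the same query. The macro language cannot read a data step array, so this sketch loops over a space-separated list instead. The macro name period_summaries, its periods= parameter and the work.t_ table prefix are made up for illustration, and the code has not been run against the original data:

```sas
/* Hypothetical macro: one summary table per interval in PERIODS. */
%macro period_summaries(periods=DAY WEEK MONTH YEAR);
  %local k bucket;
  /* COUNTW gives the number of words, so dropping an interval */
  /* from the list needs no other change.                      */
  %do k = 1 %to %sysfunc(countw(&periods., %str( )));
    %let bucket = %scan(&periods., &k., %str( ));
    proc sql noprint;
      create table work.t_&bucket. as
      select intnx("&bucket.", Opened, 0, 'E') as Period_End format=yymmdd10.
           , sum(Value) as TotalValue format=dollar14.
      from Work.Accounts
      group by calculated Period_End
      ;
    quit;
  %end;
%mend period_summaries;

%period_summaries()
```

Calling it with periods=WEEK MONTH would create only work.t_WEEK and work.t_MONTH.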

As Cyborg37 says, you probably should get rid of that trailing comma in your function. But note you do not really need to create a macro to do this, just use the %SYSFUNC function directly:
proc sql;
create table output.t as
select %sysfunc( sqlSelectByDateRange(month, myDateColumn) )
* /* to avoid the trailing comma */
from output.myTable;
quit;
Also, although this is a clever use of user-defined functions, it's not very clear why you want to do this. There are probably better solutions available that will not cause as much potential confusion in your code. User-defined functions, like user-written macros, can make life easier but they can also create an administrative nightmare.

I could make all sorts of guesses as to why you're getting errors, but fundamentally, don't do it this way. You can do exactly what you're trying to do in a data step that is much easier to troubleshoot and much easier to implement than a FCMP function which is really just trying to be a data step anyway.
Steps:
1. Create a dataset that has your possible date pulls. If you're using this a lot, you can put this in a permanent library that is defined in your SAS AUTOEXEC.
2. Create a macro that pulls the needed date strings from it.
3. If you want, use PROC FCMP to make this a function-style macro, using RUN_MACRO.
4. If you do that, use %SYSFUNC to call it.
Here is something that does this:
1:
data pull_list;
infile datalines dlm='|';
length query $50. type $8.;
input type $ typenum query $;
datalines;
day|1|&date_field. as day
week|2|week(&date_field.) as week
month|3|month(&date_field.) as month
year|4|year(&date_field.) as year
;;;;
run;
2:
%macro pull_list(type=,date_field=);
%let date_field = datevar;
%let type = week;
proc sql noprint;
select query into :sellist separated by ','
from pull_list
where typenum >= (select typenum from pull_list where type="&type.");
quit;
%mend pull_list;
3:
proc fcmp outlib = work.functions.funcs;
function pull_list(type $,date_field $) $;
rc = run_macro('pull_list', type,date_field);
if rc eq 0 then return("&sellist.");
else return(' ');
endsub;
run;
4:
data test;
input datevar 5.;
datalines;
18963
19632
18131
19105
;;;;
run;
option cmplib = (work.functions);
proc sql;
select %sysfunc(pull_list(week,datevar)) from test;
quit;
One of the big advantages of this is that you can add additional types without having to worry about the function's code - just add a row to pull_list and it works. If you want to set it up to do that, I recommend using something other than 1,2,3,4 for typenum - use 10,20,30,40 or something so you have gaps (say, if "twoweek" is added, it would be between 2 and 3, and 25 is easier than 2.5 for people to think about). Create that pull_list dataset, put it on a network drive where all of your users can use it (if anybody beyond you uses it, or a personal one if not), and go from there.

Related

How to run a different query if table is empty one month earlier

How to run a different query if the output table is empty.
My current query is:
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD,
t1.DATO,
t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6 AND t1.DATO = '31Aug2022'd;
QUIT;
But I would like to make a conditional statement to run the query again if it is empty but with the filter t1.DATO set to '31Jul2022'd instead of august. So every time the query fails on a given date the query tries again one month earlier.
I hope you can point me in a direction.
Just loop until you get results. You should put an upper bound on the number of times it loops to make sure it will end.
This will require that you create a macro to allow the conditional code generation.
%macro loop(start,months);
%local offset;
PROC SQL;
%do offset=0 %to -&months %by -1;
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD
, t1.DATO
, t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6
AND t1.DATO = %sysfunc(intnx(month,&start,&offset,e))
;
%if &sqlobs %then %goto leave;
%end;
%leave:
QUIT;
%mend;
%loop('31AUG2022'd,6)
You could make SQL work a little harder to get what you want. Pull the data back as many months as you want, but only keep the observations that are for the latest month. Now you don't need any looping.
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD
, t1.DATO
, t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6
AND t1.DATO between %sysfunc(intnx(month,&start,-&offset,e)) and &start
having dato=max(dato)
;
Which method performs better will depend on the data and things like whether or not the data is sorted or indexed.
I assume you always want to query for DATO the last day of the month.
%macro QUERY_FOR_A_KUNDESCORINGRATINGRE(DATO);
** Try with the date given **;
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD,
t1.DATO,
t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6 AND t1.DATO = &DATO;
QUIT;
** Read the result AND any other dataset.
(SASHELP.CLASS is a small dataset that always exists.) **;
data _null_;
set QUERY_FOR_A_KUNDESCORINGRATINGRE(in=real_result) SASHELP.CLASS;
** If there is any result, the first observation(=row) will be from
that result and real_result will be 1(the SAS coding of True)
otherwise real_result will be 0(the SAS coding of False) **;
if not real_result then do;
** find the last day of the previous month **;
month_earlier = intnx("month", &DATO, -1, 'end');
call execute('%QUERY_FOR_A_KUNDESCORINGRATINGRE('
|| put(month_earlier, 8.) ||');');
end;
** We only need one observation(=row), so stop now **;
stop;
run;
%mend;
%QUERY_FOR_A_KUNDESCORINGRATINGRE('31Aug2022'd);
Disclaimer: I did not test this. It might need some debugging.
Try this code. It loops until the query returns records.
%macro query;
%global DATO;
%let DATO = '31Aug2022'd;
%first: PROC SQL;
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD,
t1.DATO,
t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6 AND t1.DATO = &DATO;
QUIT;
%let dsid = %sysfunc(open(QUERY_FOR_A_KUNDESCORINGRATINGRE));
%let obs = %sysfunc(attrn(&dsid., nlobs));
%let dsid = %sysfunc(close(&dsid.));
%if &obs = 0 %then %do;
data _null_;
call symputx("dato",intnx('month',&dato.,-1,'e'));
run;
%goto first;
%end;
%mend;
%query;
Please note: I haven't tested this code. It would be great if it helps you.
If your dataset only contains data for the last day of a month, this solves your problem without iterating at all:
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_A_KUNDESCORINGRATINGRE AS
SELECT t1.PD,
t1.DATO,
t1.KSRUID
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t1
WHERE t1.KSRUID = 6 AND t1.DATO = (
SELECT max(t2.DATO)
FROM DLKAR.A_KUNDESCORINGRATINGRETRO t2
WHERE t2.KSRUID = 6);
QUIT;

Error when using a Macro variable in DO loop in SAS: Required operator not found in expression

I'm using SAS and I need to combine a number of tables, each of which has suffix of month and year in their name. The specific tables to use will be variable depending on user-defined start and end date. To achieve this, I'm trying to use a do loop via a macro to loop through the months/years in the date range and append to the previous table. However, I'm having issues (seemingly to do with it using the macro variable for the start/end year in the loop). I receive the following errors:
ERROR: Required operator not found in expression: &start_year.
ERROR: The %FROM value of the %DO QUOTE_YEAR loop is invalid.
ERROR: Required operator not found in expression: &end_year.
ERROR: The %TO value of the %DO QUOTE_YEAR loop is invalid.
ERROR: The macro GET_PRICES will stop executing.
Here is some example test code I've come up with that replicates the issue which produced the errors above which I am trying to debug. Note for this example, I'm only looping through the years. (I will add the months in once I resolve this issue.)
DATA _NULL_;
FORMAT start_date end_date DATE9.;
start_date = '01JUL2018'd;
end_date = '30JUN2019'd;
CALL SYMPUT('start_date',start_date);
CALL SYMPUT('end_date',end_date);
RUN;
%MACRO get_prices(start_date, end_date);
%LET start_year = year(&start_date.);
%LET end_year = year(&end_date.);
%LET start_month = month(&start_date.);
%LET end_month = month(&end_date.);
DATA test;
t = 0;
RUN;
%DO quote_year = &start_year. %TO &end_year.;
DATA test2;
t = &quote_year.;
RUN;
PROC APPEND BASE= test DATA= test2;
%END;
%MEND;
%get_prices(&start_date.,&end_date.);
The expected output is a table with a single column 't', with 3 rows: (0, 2018, 2019). (The 0 value I just included to initialise a non-empty table on which to append.) The code works when I replace the macro variables to the start/end year in the loop values to their actual value.
Doesn't Work
%DO quote_year = &start_year. %TO &end_year.;
Works
%DO quote_year = 2018 %TO 2019;
I can't work out what is causing this to fail. I believe it must have something to do with the way I've defined the macro variables, but the strange thing is if I remove the do loop completely and have the following data step under the %LET statements, the values appear as expected.
DATA test_macro_values;
s = &start_year.;
t = &end_year.;
u = &start_month.;
v = &end_month.;
RUN;
Can anyone see what's going wrong?
There are no macro functions called year and month. You should use %sysfunc:
%LET start_year = %sysfunc(year(&start_date.));
%LET end_year = %sysfunc(year(&end_date.));
%LET start_month = %sysfunc(month(&start_date.));
%LET end_month = %sysfunc(month(&end_date.));
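The reason the data step check worked while the loop failed: %LET stores only text, so start_year held the literal string year(21366). A data step evaluates that string as a function call, but %DO passes it to the macro expression evaluator, which cannot. A sketch of the whole example with the fix applied follows. Setting the dates with %sysfunc(mdy(...)) rather than CALL SYMPUT, and the RUN; after PROC APPEND, are my own changes, and the code is untested:

```sas
/* Store the two dates as plain SAS date numbers. */
%let start_date = %sysfunc(mdy(7, 1, 2018));
%let end_date   = %sysfunc(mdy(6, 30, 2019));

%macro get_prices(start_date, end_date);
  %local start_year end_year quote_year;
  /* %SYSFUNC runs YEAR() now, so the variables hold 2018 and 2019. */
  %let start_year = %sysfunc(year(&start_date.));
  %let end_year   = %sysfunc(year(&end_date.));

  data test;
    t = 0;
  run;

  %do quote_year = &start_year. %to &end_year.;
    data test2;
      t = &quote_year.;
    run;
    proc append base=test data=test2;
    run;
  %end;
%mend get_prices;

%get_prices(&start_date., &end_date.)
```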

SAS How iterate on date in macro?

I have a macro like this:
%macro loop(report_date);
/* some sql-code with where-statement on report_date:
create table with name small_table */
%mend;
Then I want to write code that creates a table: the union of the tables for which the condition on the variable report_day is true.
But my code doesn't work:
%let days_number = 31;
%let Min_Date = '01Jan2018:00:00:00'dt;
/* create table with name big_table */
/*this macro creates a union table */
%macro doInLoop(report_date);
%loop(&report_date.);
PROC SQL;
CREATE TABLE Big_table AS
SELECT *
FROM big_table
UNION ALL
SELECT *
FROM small_table;
QUIT;
%mend;
%macro createTable;
%local j;
%do j = 1 to &days_number.;
%let rep_date = dhms(datepart(&Min_Date.) + j, 0, 0, 0);
%if day(rep_date) = 1 %then %doInLoop(%rep_date);
%end;
%mend;
%createTable;
I have 31 mistakes with messages:
"ERROR: The following columns were not found in the contributing tables: j"
Or how can I create a macro, that uses a working macro for one day ("loop"), for certain days from the range?
Thank you.
Use INTNX() to increment your date, don't do it manually.
You cannot use functions in a %LET statement without %SYSFUNC(); otherwise the
macro processor cannot tell what is text and what is a function.
Unfortunately you also have more issues than just your date, so I'll
show you how to loop it and leave the rest to you.
%let rep_date = %sysfunc(intnx(DTDAY, &min_date, 1, s));
DTDAY specifies a day interval for a datetime variable. If you had a date variable the interval would be DAY.
Remember to reference rep_date with the ampersand, &rep_date, otherwise it's just text to SAS.
You may find the sample macros in the macro appendix helpful. One illustrates looping with dates.
https://documentation.sas.com/?docsetId=mcrolref&docsetTarget=n01vuhy8h909xgn16p0x6rddpoj9.htm&docsetVersion=9.4&locale=en
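Putting those points together, one possible shape for the loop is sketched below. It assumes %doInLoop from the question is already defined and that big_table exists. The min_date= and days_number= parameters, and starting the offset at 0 so that the first day is included, are my assumptions. The code is untested:

```sas
/* Hypothetical rewrite of createTable from the question. */
%macro createTable(min_date=, days_number=);
  %local j rep_date;
  %do j = 0 %to %eval(&days_number. - 1);
    /* Advance the datetime by &j. days; the result is a number. */
    %let rep_date = %sysfunc(intnx(DTDAY, &min_date., &j., s));
    /* DATEPART and DAY must also go through %SYSFUNC. */
    %if %sysfunc(day(%sysfunc(datepart(&rep_date.)))) = 1 %then %do;
      %doInLoop(&rep_date.)
    %end;
  %end;
%mend createTable;

%createTable(min_date='01Jan2018:00:00:00'dt, days_number=31)
```

Here rep_date is passed on as a raw datetime number, so the WHERE clause inside %loop can compare against it directly.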

SAS data step vs. proc sql dates

I hope someone can help me answer this query: I have two programs, one in proc sql and one in data step. The proc sql works, the data step doesn't. I can't see why?
%let _run_date = '30-jun-2017';
proc sql;
connect to oracle (path='EDRPRD' authdomain='EDRProduction'
buffsize=32767);
create table customer_sets as
select * from connection to oracle (
select *
from customer_set
where start_date <= &_run_date.
and nvl(end_date, &_run_date.) >= &_run_date.
and substr(sets_num,1,2) = 'R9');
quit;
This works fine. However, this doesn't:
libname ora oracle path='EDRPRD' authdomain='EDRProduction' schema='CST';
data customer_sets;
set ora.customer_set;
where start_date le &_run_date. and
coalesce(end_date, &_run_date.) ge &_run_date. and
substr(sets_num,1,2) = "R9";
run;
Can anyone tell me why?
Thanks!
It would have helped to see the error log but, for starters, your date macro variable, as it is used in your data step is interpreted by SAS as a string literal, not a date. In SAS, date literals are enclosed in quotes (single or double) and followed by a d.
You can modify your data step as follows and see if that's any better:
%let _run_date = '30-jun-2017';
data customer_sets;
set ora.customer_set;
where start_date le &_run_date.d and
coalesce(end_date, &_run_date.d) ge &_run_date.d and
substr(sets_num,1,2) = "R9";
run;
If that's not the issue, please post the log containing the error.
EDIT
Here is the above code with a small test data created beforehand:
libname ora (work);
data ora.customer_set;
infile datalines dlm='09'x;
input ID start_date :anydtdte. end_date :anydtdte. sets_num $;
format start_date end_date date.;
datalines;
1 30-may-2017 . R9xxx
2 30-may-2017 31-may-2017 R9xxx
;
run;
%let _run_date = '30-jun-2017';
data customer_sets;
set ora.customer_set;
where start_date le &_run_date.d and
coalesce(end_date, &_run_date.d) ge &_run_date.d and
substr(sets_num,1,2) = "R9";
run;
You can copy paste and run this as-is and you will see that it works fine.

SAS sql select variable as change name to a date in MonYY7. format

I am not sure if it is possible at all, but in case someone knows the answer. I need to select variables and rename them to dates in MonYY7. format. My understanding is that SAS stores dates as numbers, and it is the formats which represent them in the former way. However, would it be possible to somehow rename the variable's name itself according to the format?
Here is the code I have written:
%macro try;
%let month_count_back = 12;
%let today = %sysfunc(today());
%let sysmonth = %sysfunc(month("&sysdate"d));
proc sql;
create table try as
select *,
%do i = -&sysmonth. %to -&month_count_back.-&sysmonth.+1 %by -1;
max(month(FP_NDT) = month(intnx('month',&today.,&i.))) as mn%eval(&month_count_back.+&sysmonth.+&i.)
%if &i. = -&month_count_back.-&sysmonth.+1 %then %goto leave_month;
,
%leave_month:
%end;
from work.test
group by var;
quit;
run;
%mend try;
%try;
run;
It returns dummy indicators for each month value of the 'var' variable for the previous year (the intention here is to know which values are null and which are not). However, I would like each dummy variable created be named according to the month and the year it refers to. For example, m12 should be DEC2015, m11 - NOV2015 etc... As a corollary if month_count_back is equal to, say, 36 then m36 should be DEC2015, but M12 should be DEC2013 and M1 should be JAN2013 etc...
Maybe there is way to rename it later in a data step? I have tried to loop through it, but could not control for the changing month_count_back value...
Would appreciate any suggestions, thanks!