Proc Sql Output - sql

Hello, I am new to SAS. I wrote the SQL code below, and now I need to redirect its output to /tmp/output.txt.
proc sql;
select (COUNT(IDCUENTACLIENTE)) AS COUNT_of_IDCUENTACLIENTE from S1.CUENTACLIENTE where segmentonivel1 = 'Altas Recientes'
and segmentonivel2 = 'Masivo'
GROUP BY SEGMENTONIVEL1,SEGMENTONIVEL2;
quit;
I tried adding:
data _null_;
FILE "/tmp/MyFile.txt";
run;
but it does not create the file.
Can someone help me?

I have a suggestion.
First, create a data set from the query. I have doubts about the GROUP BY you are using: does it run without errors?
Second, export the data set to a text file, as below:
proc sql;
create table work.temp as
select SEGMENTONIVEL1,SEGMENTONIVEL2, (COUNT(IDCUENTACLIENTE)) AS
COUNT_of_IDCUENTACLIENTE from S1.CUENTACLIENTE
where segmentonivel1 = 'Altas Recientes'
and segmentonivel2 = 'Masivo'
GROUP BY SEGMENTONIVEL1,SEGMENTONIVEL2;
quit;
/* code to create TXT file */
data _null_;
FILE "/tmp/MyFile.txt";
set work.temp;
put
SEGMENTONIVEL1
SEGMENTONIVEL2
COUNT_of_IDCUENTACLIENTE;
run;

If you want to use a FILENAME statement rather than writing the file path inside the DATA step:
proc sql;
create table tableName as
select (COUNT(IDCUENTACLIENTE)) AS COUNT_of_IDCUENTACLIENTE from
S1.CUENTACLIENTE where segmentonivel1 = 'Altas Recientes'
and segmentonivel2 = 'Masivo'
GROUP BY SEGMENTONIVEL1,SEGMENTONIVEL2;
quit;
filename x "c:\temp\teszt.txt";
data _null_;
file x;
set work.tableName;
put COUNT_of_IDCUENTACLIENTE;
run;
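
A different technique from the answers above, in case the goal is simply to capture the printed result of the query: PROC PRINTTO can route procedure output to a file. This is only a sketch. It assumes the LISTING destination is available in your session, it uses SASHELP.CLASS as a stand-in for S1.CUENTACLIENTE, and the path is the one from the question.

```sas
/* Assumption: the LISTING destination is what PROC PRINTTO redirects */
ods listing;

/* Send procedure output to the file; NEW overwrites any existing file */
proc printto print="/tmp/output.txt" new;
run;

/* Stand-in query: replace with the query against S1.CUENTACLIENTE */
proc sql;
  select count(name) as count_of_name
  from sashelp.class
  where sex = 'F';
quit;

/* Restore the default output location */
proc printto;
run;
```

The file then contains the query result as it would appear in the listing output, including titles, so the DATA step approach above is the better fit when you need full control over the layout.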

How can I convert this SAS code into an SQL query?

data GC_OUT.ABCD_2;
set GC_OUT.TEST;
index_first_non_zero = verify(ASSIGNED_EMPLOYEE_CD,"0");
ASSIGNED_EMPLOYEE_CD_1 = substr(ASSIGNED_EMPLOYEE_CD, index_first_non_zero);
run;
A direct translation would be something like this:
proc sql;
create table GC_OUT.ABCD_2 as
select *
, verify(ASSIGNED_EMPLOYEE_CD,"0") as index_first_non_zero
, substr(ASSIGNED_EMPLOYEE_CD,calculated index_first_non_zero)
as ASSIGNED_EMPLOYEE_CD_1
from GC_OUT.TEST
;
quit;
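
To see what the translated query produces, here is a small sketch on made-up data. The table work.test and its two rows are assumptions for illustration only; the column names are the ones from the question.

```sas
/* Hypothetical sample data with zero-padded codes */
data work.test;
  length ASSIGNED_EMPLOYEE_CD $6;
  input ASSIGNED_EMPLOYEE_CD $;
  datalines;
000123
012345
;
run;

proc sql;
  select ASSIGNED_EMPLOYEE_CD
       /* position of the first character that is not "0" */
       , verify(ASSIGNED_EMPLOYEE_CD, "0") as index_first_non_zero
       /* everything from that position onward */
       , substr(ASSIGNED_EMPLOYEE_CD, calculated index_first_non_zero)
           as ASSIGNED_EMPLOYEE_CD_1
  from work.test
  ;
quit;
```

For the first row, VERIFY should return 4 and the stripped code is 123. One thing to watch in both the DATA step and the SQL version: if a value consists only of zeros, VERIFY returns 0, which is not a valid start position for SUBSTR, so such rows may need separate handling.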

Delete prefix from multiple variables

I have a dataset "Dairy1" whose variables are labeled '240, 241, 242, ...' but are actually named '_240, _241, _242, ...'.
How can I delete the prefix "_" from the names of all these variables? I tried the following code in SAS 9.4, but it doesn't work.
proc sql noprint;
select cats(name,'=',scan(name, 1, '_'))
into :suffixlist
separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'Dairy1' and '240' = scan(name, 2, '_');
quit;
%put &suffixlist.;
data want;
set Dairy1;
rename &suffixlist.;
run;
The log shows the following:
WARNING: Apparent symbolic reference SUFFIXLIST not resolved.
ERROR 22-322: Syntax error, expecting one of the following: a name, ;.
Thanks in advance
I built a similar solution for the opposite case, adding a suffix to variables.
It creates a single string that contains all the rename pairs.
Note that it is somewhat crude: it replaces every occurrence of the string, not just a leading prefix. Modify the logic in data sjm_tmp2 to suit your needs.
%macro remove_str(Dataset, str);
proc contents noprint
data=work.&dataset out=sjm_tmp(keep=NAME);
run;
data sjm_tmp2;
set sjm_tmp;
help= tranwrd(NAME, "&str.", '');
foobar=cats(name, '=',help);
run;
proc sql noprint;
select foobar into :sjm_list separated by ' ' from sjm_tmp2;
quit;
proc datasets library = work nolist;
modify &dataset;
rename &sjm_list;
quit;
proc datasets library=work noprint;
delete sjm_tmp sjm_tmp2 ;
quit;
%mend;
Tested with code:
data test;
_a=1;
_b=1;
_c=1;
_d=1;
_e=1;
_f=1;
run;
%remove_str(test, _);
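
The approach in the question can also be made to work. One likely cause of the unresolved SUFFIXLIST is that DICTIONARY.COLUMNS stores MEMNAME in uppercase, so memname = 'Dairy1' matches no rows and the macro variable is never created. A minimal sketch, assuming a hypothetical test data set work.test2 and a macro variable name (renamelist) chosen for illustration:

```sas
/* Hypothetical test data: two prefixed variables and one without prefix */
data work.test2;
  _a = 1;
  _b = 2;
  c  = 3;
run;

proc sql noprint;
  /* Build pairs such as _a=a for names starting with an underscore */
  select catx('=', name, substr(name, 2))
    into :renamelist separated by ' '
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'TEST2'            /* uppercase, as stored */
    and substr(name, 1, 1) = '_';
quit;

%put NOTE: &=renamelist;

/* Rename in place; only the metadata is updated */
proc datasets library=work nolist;
  modify test2;
  rename &renamelist;
quit;
```

For names such as _240, stripping the underscore leaves 240, which is not a valid name under the default naming rules. That case would need options validvarname=any and name literals (for example via the NLITERAL function), or a different target name.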

Format the summarised variables from proc summary

I'm using a Proc Summary, as I want to utilise a multilabel format. I've been going round and round trying to apply a format to my summarised outputs, but can't see how to get this without incurring warnings.
Proc Summary Data = Source CompleteTypes Missing NoPrint NWay;
Class Brand / MLF;
Var Id Total;
Output Out = Results
N(ID) = Volume
Sum(Total) = Grand_Total;
Run;
I want to format my Volume as Comma23. and the Grand_Total as Comma23.2. If I put a format statement after the outputs it warns me that the variables don't exist, but the dataset does have the format applied.
I would have thought that formatting a summarised variable would be a common action, but I can't find a way to apply it without getting the warnings. Is there something I'm missing?
Many thanks
Another approach is to use PROC TEMPLATE to apply the format. The format is carried over into the data set created by ODS OUTPUT. Use ods trace on to find (1) the name of the template to alter and (2) the name of the object to output into a data set. In your case, you want to alter the Base.Summary template and output the Summary object. Both appear in the log when you run ods trace on before a proc step. This works with other procedures as well; for instance, PROC FREQ for a single table uses the template Base.Freq.OneWayList.
/* Create Test Data */
data test (drop = num);
do num = 1 to 100;
x = ceil(rand('NORMAL', 100, 10));
output;
end;
run;
/* Check log with ODS Trace On to find template to alter and object to output */
ods trace on;
proc summary data = test sum n mean print;
var x;
run;
ods trace off;
/* Alter the Base.Summary template */
ods path reset;
ods path (PREPEND) WORK.TEMPLATE(UPDATE);
proc template;
edit Base.Summary;
edit N;
label = 'Count';
header = varlabel;
format = Comma10.;
end;
edit Mean;
label = 'Average';
header = varlabel;
format = Comma10.;
end;
edit Sum;
label = "Sum";
header = varlabel;
format = Comma10.;
end;
end;
run;
/* Output Results (formatted) from the Proc */
ods output summary = results;
proc summary data = test sum n mean print stackodsoutput;
var x;
run;
Some statistics, such as SUM, inherit the format of the analysis variable. The N statistic does not inherit the format, but you can format the new variable with the colon (:) trick shown in the example, and no warning is produced.
proc summary data=sashelp.class;
class sex;
output out=test n(age)=Nage sum(weight)=sum_weight;
format nage: comma12. weight comma12.3;
run;
proc contents varnum;
run;
proc print;
run;
Use proc datasets to apply the format to your output dataset after proc summary has created it:
proc datasets lib = work;
modify results;
format Volume comma23. Grand_total comma23.2;
run;
quit;

How to randomly select variables in SAS?

I can find all sorts of information on how to randomly select observations in SAS which is a fairly easy task. This is not what I need though. I need to randomly select variables. What I want to do specifically is randomly choose 20 variables from my list of 159 variables and do this 50 times. I want to ensure diversity too. I have been spending about two days on this and am having no luck.
I'm glad that you asked this question, because I just developed a solution for that! Let's break down exactly what needs to be done, step-by-step.
Step 0: What do we need to do?
We need a way to take all of our variables and randomly select 20 of them while keeping them within the bounds of the SAS language rules.
We'll require:
1. All variables in the dataset
2. A way to re-sort them randomly
3. A limit of 20 variables
4. A way to loop this 50 times
Let's start with 1.
Step 1: Getting all the variables
sashelp.vcolumn provides a list of all variables within a dataset. Let's select them all.
proc sql noprint;
create table all_vars as
select name
from sashelp.vcolumn
where libname = 'LIBRARYHERE' AND memname = 'HAVE'
;
quit;
This gets us a list of all variables within our dataset. Now, we need to sort them randomly.
Step 2: Making them random
SAS provides the rand function, which lets you draw from any distribution you'd like. You can use call streaminit(seedhere) prior to the rand function to set a specific seed, creating reproducible results.
We'll simply modify our original SQL statement and order the dataset with the rand() function.
data _null_;
call streaminit(1234);
run;
proc sql noprint;
create table all_vars as
select name
from sashelp.vcolumn
where libname = 'LIBRARYHERE' AND memname = 'HAVE'
order by rand('uniform');
quit;
Now we've got all of our variables in a random order, distributed evenly by the uniform distribution.
Step 3: Limit to 20 variables
You can do this a few ways. One is the obs= dataset option in a separate step; another is the outobs= PROC SQL option, which the code below uses. Personally, I like the obs= dataset option since it doesn't generate a warning in the log and can be used in other procedures.
data _null_;
call streaminit(1234);
run;
proc sql noprint outobs=20;
create table all_vars as
select name
from sashelp.vcolumn
where libname = 'LIBRARYHERE' AND memname = 'HAVE'
order by rand('uniform');
quit;
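
A variant of Steps 2 and 3 that keeps the seed and the random draw in the same DATA step and uses the obs= dataset option. As far as I know, call streaminit only affects rand calls within the DATA step that issues it, so this layout may be the safer choice when reproducibility matters. It is only a sketch: SASHELP.CARS stands in for LIBRARYHERE.HAVE, the names shuffled, picked and sort_key are made up, and obs=5 is used instead of 20 because the stand-in table has fewer variables.

```sas
/* Assign each variable name a uniform random key, seeded in this step */
data work.shuffled;
  set sashelp.vcolumn(keep=libname memname name);
  where libname = 'SASHELP' and memname = 'CARS';
  call streaminit(1234);          /* only the first call takes effect */
  sort_key = rand('uniform');
  keep name sort_key;
run;

/* Random order = ascending order of the random key */
proc sort data=work.shuffled;
  by sort_key;
run;

/* Keep the first 5 names; obs= does not write a warning to the log */
data work.picked;
  set work.shuffled(obs=5);
run;
```

Changing the seed for each repetition (for example, seed plus loop index) would give a different selection per run while staying reproducible.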
Step 4: Loop it 50 times
We'll use SAS Macro Language to do this part. We can create 50 individual datasets this way, or switch the code up slightly and read them into macro variables.
%macro selectVars(loop=50, seed=1234);
data _null_;
call streaminit(&seed);
run;
%do i = 1 %to &loop;
proc sql noprint outobs=20;
create table all_vars&i as
select name
from sashelp.vcolumn
where libname = 'LIBRARYHERE' AND memname = 'HAVE'
order by rand('uniform')
;
quit;
%end;
%mend;
%selectVars;
Or, option 2:
%macro selectVars(loop=50, seed=1234);
data _null_;
call streaminit(&seed);
run;
%do i = 1 %to &loop;
proc sql noprint outobs=20;
select name
into :varlist separated by ' '
from sashelp.vcolumn
where libname = 'LIBRARYHERE' AND memname = 'HAVE'
order by rand('uniform')
;
quit;
%end;
%mend;
%selectVars;
The second option creates a local macro variable called &varlist that holds the 20 randomly chosen variable names separated by spaces. Each pass through the loop overwrites it, so use it inside the loop. This can be convenient for various modeling procs, and is preferable since it does not create a separate dataset each time.
Hope this helps!
You will need to treat your metadata as data and use SURVEYSELECT to select observations. You could then put these names into macro variables, but you did not mention the exact output you want.
data v;
array rvars[159];
run;
proc transpose data=v(obs=0) out=vars name=name;
var rvars:;
run;
proc surveyselect reps=4 sampsize=20 data=vars out=selection;
run;
proc transpose data=selection out=lists(drop=_:);
by replicate;
var name;
run;
proc print;
run;
data _null_;
set lists;
by replicate;
call symputx(cats('VLIST',_n_),catx(' ',of col:));
run;
%put _global_;

SAS - renaming variables

I am trying to change the names of variables in my table/dataset. I went through several websites and this discussion forum, but I didn't manage to find any code that works properly in my case (I am a newcomer to SAS).
My dataset contains 103 columns and I would like to rename the first 100 of them. The first column is named CFT(1), the second CFT(2), ..., and the 100th CFT(100). The new variables could be called, for example, CFT_n(1), ..., CFT_n(100).
The code I am using is the following:
data vystup_m200_b;
set vystup_m200_a;
rename 'cft(1)'n - 'cft(100)'n='cft(1)_n'n - 'cft(100)_n'n;
run;
But I obtain an error stating:
Alphabetic prefixes for enumerated variables (cft(1)-cft(100)) are different.
Thank you for any suggestions about what I am doing wrong.
Even with validvarname=any, a numbered variable list requires the number to be the last part of each name. You "could" use the features of PROC TRANSPOSE to flip-flop the data and rename the variables. This is only advisable if the data are rather small.
data ren;
array _a[*] 'cft(1)'n 'cft(2)'n 'cft(3)'n ( 1 2 3);
do i = 1 to 10;
output;
end;
drop i;
run;
proc transpose data=ren out=ren2;
run;
proc transpose data=ren2 out=renamed(drop=_name_) suffix=_N;
id _name_;
run;
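
For contrast, a numbered range list in RENAME works when the number is the final part of every name. A minimal sketch on made-up data (work.demo and the names cft1-cft3 and cft_n1-cft_n3 are assumptions for illustration, not the names from the question):

```sas
/* Hypothetical data set whose variable names end in the number */
data work.demo;
  array cft[3] cft1-cft3 (1 2 3);
run;

/* Both sides of the rename are valid numbered range lists */
data work.demo_renamed;
  set work.demo;
  rename cft1-cft3 = cft_n1-cft_n3;
run;
```

Names such as 'cft(1)'n end in a closing parenthesis rather than a digit, which is why the range syntax rejects them.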
If your variables are sequentially named, a simple macro will suffice:
option validvarname = any;
data ren;
array _a[*] 'cft(1)'n 'cft(2)'n 'cft(3)'n ( 1 2 3);
do i = 1 to 10;
output;
end;
drop i;
run;
%macro rename_loop;
%local i;
%do i = 1 %to 3;
"cft(&i)"n = "cft(&i)_n"n
%end;
%mend rename_loop;
proc datasets lib = work nolist nowarn nodetails;
modify ren;
rename %rename_loop;
run;
quit;
This should work more or less instantaneously, regardless of the size of the dataset, as it only needs to update the metadata.
Renaming is fastest. I would look for a more general solution that doesn't require knowing the variable names, how many there are, or whether name literals are needed.
data ren;
array _a[*] 'cft(1)'n 'cft(2)'n 'cft(3)'n (1 2 3);
do i = 1 to 10;
output;
end;
drop i;
run;
proc print;
run;
proc transpose data=ren(obs=0) out=ren2;
run;
proc sql noprint;
select catx('=',nliteral(_name_),nliteral(cats(_name_,'_n')))
into :renamelist separated by ' '
from ren2;
quit;
%put NOTE: &=renamelist;
proc datasets nolist;
modify ren;
rename &renamelist;
run;
contents data=ren varnum short;
quit;
Another solution is to rename the variables after importing the data:
proc import datafile="\\folder\RUN_00.xlsx"
dbms=xlsx out=run_00 replace;
run;
data rename;
length ren $32767;
set run_00(obs= 1);
keep ren delka;
array cfte{*} CFT:;
do i=1 to dim(cfte);
ren=strip(ren)||" 'cft("||strip(i)||")'n='cft_"||strip(i)||"_00'n";
delka=length(ren);
end;
call symputx("renam",ren);
run;
proc datasets library=work;
modify run_00;
rename &renam;
run;
quit;