Is there some way to tell SAS that for any obs ####1, ####2, or ####3 (where # = 1-9), I want them formatted #### Spring, #### Fall, and #### Winter? - formatting

So I have 1000 observations for one variable that look like this:
19962
19943
19972
19951
19951
19912
The first four digits vary a bit, but the last digit is always 1, 2, or 3. Is there a way to only format the last digit, while not having to type out each iteration of the first four digits in a value statement?
That is, I want to avoid doing this:
proc format;
value varfmt
19911 = '1991 Spring'
19912 = '1991 Fall'
19913 = '1991 Winter'
19921 = '
19922 = '
[…]
19991 = '1999 Spring'
19992 = '1999 Fall'
19993 = '
;
run;
Instead, is there some way to tell SAS that for any ####1, ####2, or ####3, I want #### Spring, #### Fall, and #### Winter (which would be three lines under the value statement)?
Thanks in advance for any help.

Since the format applies only to the last digit, you don't need to list every five-digit value in PROC FORMAT. Just extract the last digit, apply the format to it, and concatenate the result with the first four digits.
Creating the sample dataset
data test;
infile datalines;
input year;
datalines;
19962
19943
19972
19951
19951
19912
;
run;
Creating the formats
proc format;
value $varfmt
'1' = 'Spring'
'2' = 'Fall'
'3' = 'Winter'
;
run;
The data step below does the following:
1. Extracts the last digit
2. Applies the format created above to it
3. Extracts the first four digits of the number
4. Concatenates the output of 2 and 3
data final;
set test;
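/* put(year,5.) converts the number to text explicitly, avoiding an implicit-conversion note in the log */
year_new = cat(substr(put(year,5.),1,4)," ",put(substr(put(year,5.),5,1),$varfmt.));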
run;
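If year is stored as a number, as in the test dataset above, the same split can also be done arithmetically, with no character conversion at all. This is only a sketch under a few assumptions: the numeric format name seasonf and the output dataset name final_numeric are made up for illustration, and year_new is given a length of 11 so that 'YYYY Winter' fits.

```sas
/* Sample data, same shape as in the question */
data test;
  input year;
  datalines;
19962
19943
19951
;
run;

/* Numeric format for the last digit only (seasonf is a hypothetical name) */
proc format;
  value seasonf
    1 = 'Spring'
    2 = 'Fall'
    3 = 'Winter'
  ;
run;

data final_numeric;
  set test;
  length year_new $11;
  /* int(year/10) drops the last digit; mod(year,10) keeps only the last digit */
  year_new = catx(' ', int(year/10), put(mod(year,10), seasonf.));
run;
```

CATX converts the numeric year part to text without writing a conversion note to the log, which is why no PUT is needed around int(year/10).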

You also have the option of creating a format from a dataset, if you do want a format for the whole value. You will have to create all possible rows, but it's not particularly hard.
data forfmt;
fmtname='SEASONF';
type='N'; /* numeric format, so it can be applied to the numeric variable */
length start $5 label $8;
do startyr = 1990 to 2015;
start=cats(startyr,'1');
label=catx(' ',startyr,'Spring');
output;
start=cats(startyr,'2');
label=catx(' ',startyr,'Fall');
output;
start=cats(startyr,'3');
label=catx(' ',startyr,'Winter');
output;
end;
run;
proc format cntlin=forfmt;
quit;
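A third possibility, if you are on SAS 9.3 or later, is to let a PROC FCMP function build the label, so the format itself needs only one range. Treat this as a sketch rather than a tested solution: the function name yrseason, the format name yrseasf, and the library location work.functions.fmt are all made up for illustration, and it assumes the values are numeric five-digit codes like those in the question.

```sas
/* Function that turns 19962 into '1996 Fall' (names are hypothetical) */
proc fcmp outlib=work.functions.fmt;
  function yrseason(x) $ 11;
    length season $ 6;
    d = mod(x, 10);                    /* last digit */
    if d = 1 then season = 'Spring';
    else if d = 2 then season = 'Fall';
    else if d = 3 then season = 'Winter';
    else season = '?';                 /* anything outside 1-3 */
    return (catx(' ', int(x / 10), season));
  endsub;
run;

/* Tell SAS where to find the compiled function */
options cmplib=(work.functions);

/* A format whose label is computed by the function */
proc format;
  value yrseasf other=[yrseason()];
run;

data test;
  input year;
  format year yrseasf.;
  datalines;
19962
19943
19951
;
run;

proc print data=test;
run;
```

The stored values stay numeric, and only the displayed text changes, which is closer to what the question asks for than building a new character variable.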

Related

How to subtract second row from first, fourth row from third and so forth

I have a SAS dataset as in the attached picture. What I'm trying to accomplish is to create a new calculated field from the Total column by subtracting the second row from the first, the fourth row from the third, and so on.
What I have tried so far is
DATA WANT2;
SET WANT;
BY APPT_TYPE;
IF FIRST.APPT_TYPE THEN SUPPLY-OPEN; ELSE 'ERROR';
RUN;
This throws an error saying the statement is not valid.
I'm not really sure how to go about this.
My dataset
Here you go. The best I can do with the limited information you provided. Next time please provide sample data and your expected output.
data have;
input APPT_TYPE$ _NAME_$ Quantity;
datalines;
ASON Supply 10
ASON Open 8
ASSN Supply 9
ASSN Open 7
S30 Supply 11
S30 Open 8
;
proc sort data = have;
by APPT_TYPE descending _NAME_ ;
run;
data want;
set have;
by APPT_TYPE descending _NAME_;
lag_N_Order = lag1(Quantity);
N_Order = Quantity;
Difference = lag_N_Order - N_Order;
keep APPT_TYPE _NAME_ N_Order lag_N_Order Difference;
if last.APPT_TYPE & last._NAME_ & Difference>0;
run;
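The same pairing can also be written with RETAIN and FIRST./LAST. flags instead of LAG, which some people find easier to follow. This is a sketch under the same assumptions as the answer above: each APPT_TYPE has one Supply row followed by one Open row after sorting. The dataset name want_alt and the variable name Supply_Qty are made up for illustration.

```sas
data have;
  input APPT_TYPE $ _NAME_ $ Quantity;
  datalines;
ASON Supply 10
ASON Open 8
ASSN Supply 9
ASSN Open 7
S30 Supply 11
S30 Open 8
;
run;

/* Supply sorts before Open when _NAME_ is descending */
proc sort data=have;
  by APPT_TYPE descending _NAME_;
run;

data want_alt;
  set have;
  by APPT_TYPE;
  retain Supply_Qty;                      /* carried from the first row of each group */
  if first.APPT_TYPE then Supply_Qty = Quantity;
  if last.APPT_TYPE and not first.APPT_TYPE then do;
    Difference = Supply_Qty - Quantity;   /* first row minus second row */
    output;                               /* one row per APPT_TYPE */
  end;
  keep APPT_TYPE Supply_Qty Quantity Difference;
run;
```

Unlike the subsetting IF above, this version keeps pairs whose difference is zero or negative; add a condition on Difference if you want to drop them.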

SAS : Convert dollar amount to decimal packed amount

I can convert the packed decimal amounts to numeric amounts, but I'm unable to do the reverse.
data HAVE;
amount = '00000258Q';output;
amount = '000000000';output;
amount = '00002488M';output;
amount = '00002126P';output;
amount = '000007{ ';output;
run;
data WANT;
set HAVE;
amount_dollar = input(cats(amount),zdv10.);
run;
That is -
data HAVE;
amount_dollar = -2588;output;
amount_dollar = .;output;
amount_dollar = -24884;output;
amount_dollar = -21267;output;
amount_dollar = 70;output;
run;
Thanks for your help!
Your last value is shorter than the others, and that is why you needed to add the cats() function (or a trim() or strip() function) to remove the trailing blanks from what you pass to the ZDV. informat. Your other values are actually only 9 characters long, not 10. Your all-zero value is going to get translated to missing by the ZDV. informat, but will be converted to zero by the ZD. informat, since it doesn't mind that the nibble with the sign is 0.
Use the ZD. format to generate zoned decimal strings, but note that it will add leading zeros to the last value and a sign nibble to the all-zero value.
data test;
input original $9. ;
num=input(original,zd9.);
numv=input(original,zdv9.);
numt=input(trim(original),zd9.);
string=put(numt,zd9.);
same = string=original;
cards;
00000258Q
000000000
00002488M
00002126P
000007{
;
SAS didn't make a ZDV format, as it wouldn't make sense, but you still have the ZD format:
data want;
set have;
amount = put(amount_dollar,zd10.);
run;
If it matters, this is not precisely a packed decimal, but a zoned decimal (packed decimal is, unsurprisingly, PDw.d, among others).

SAS proc sgplot with date axis formatted as m/d/yy (i.e. without leading zeros)

I'm trying to make a scatter plot with SAS proc sgplot and format the xaxis to be m/d/yy (for example 1/1/06). I created a custom date format like this:
PICTURE myDateFmt low-high = '%m/%d/%0y' (DATATYPE = date);
Then I formatted my date variable to be this format in a data step, and put this line in my proc sgplot step:
xaxis offsetmin = 0 offsetmax = 0 display=(nolabel) tickvalueformat=data;
However, when I do this, the date axis text all just disappears. Does anyone know of a way to format the date axis in a plot to be m/d/yy format?
Thank you in advance!
I think the TICKVALUEFORMAT option must have a problem with picture formats. When I tried this, my graph displayed "%m/%d/%0y" on the x-axis. But if I print the data, the formatted values are as desired so I think the picture format is created correctly.
I did a work-around where I created a value format for the date range of interest and then used that in the SGPLOT. To do this, I had to generate a dataset with one record for each day in the range of interest, and then converted that dataset to a format. Not ideal, but it works.
Hope this helps.
proc format;
PICTURE myDateFmt
low-high = '%m/%d/%0y' (DATATYPE = date)
;
run;
*** TEST DATA TO EXPERIMENT WITH - SPANS YEAR 1987 ***;
data stocks;
set sashelp.stocks;
where (mdy(1,1,1987) <= date <= mdy(12,31,1987));
format date myDateFmt. ;
run;
title 'USER CREATED PICTURE FORMAT DOES NOT WORK';
proc sgplot data=work.stocks;
scatter x=date y=close;
xaxis offsetmin = 0 offsetmax = 0 display=(nolabel) tickvalueformat=data;
run;
title 'SAS SUPPLIED FORMAT DOES WORK';
proc sgplot data=work.stocks;
scatter x=date y=close;
xaxis offsetmin = 0 offsetmax = 0 display=(nolabel) tickvalueformat=monyy5.;
run;
*** RECREATE FORMAT FOR SPECIFIC DATE RANGE THAT MATCHES DATA AND GRAPH AXIS DESIRED ***;
*** THIS WILL CREATE A FORMAT ENTRY FOR EVERY DAY IN THE RANGE ***;
data cntldate;
fmtname = 'myDateN';
type = 'n';
*** HARD CODE START/END DATES TO MATCH GRAPH AXIS DESIRED ***;
do start = mdy(1,1,1987) to mdy(1,1,1988);
*** FORMAT LABEL WILL BE DATE FORMAT WITHOUT LEADING ZEROS ***;
label = strip (put(start, myDateFmt.) );
output;
end;
run;
*** CONVERT CONTROL DATASET TO A FORMAT ***;
proc format library=work cntlin=cntldate;
run;
title 'USER CREATED VALUE FORMAT WORKS';
title2 'NOTE: HARDCODE OF START/END VALUE FOR XAXIS, OTHERWISE SAS MAY SELECT AXIS ENDPOINT OUTSIDE OF FORMAT RANGE';
title3 'NOTE2: AXIS MAY NOT REPORT EVERY MONTH DUE TO SPACE ISSUES';
proc sgplot data=work.stocks;
scatter x=date y=close;
xaxis offsetmin = 0 offsetmax = 0 display=(nolabel) tickvalueformat=myDateN.
values=('1jan87'd to '1jan88'd by month);
run;

Numeric variable to character variable conversion

I'm trying to plot two sets of calculations and compare them over time. The "cohort" variable, derived from coh_asof_yyyymm in the original table, was stored in the data set in numeric format (201003 for March 2010). Now when I plot them using proc sgplot, 4 quarters of data are crammed together. How do I change the format of this variable in order to produce output where the x-axis is in intervals of quarters?
options nofmterr;
libname backtest "/retail/mortgage/consumer/new";
proc sql;
create table frst_cur as
select
coh_asof_yyyymm as cohort,
sum(annual_default_occurrence* wt)/sum(wt) as dr_ct,
sum(ScorePIT_PD_2013 * wt)/sum(wt) as pd_ct_pit,
sum(ScoreTTC_PD_2013 * wt)/sum(wt) as pd_ct_ttc
from backtest.sample_frst_cur_201312bkts
group by 1;
quit;
proc sgplot data = frst_cur;
series x = cohort y = pd_ct_pit;
series x = cohort y = pd_ct_ttc;
format cohort yyyyqc.;
xaxis label = 'Cohort Date';
yaxis label = 'Defaults';
title 'First Mortgage Current';
run;
If I'm getting it right, I think your date is a number and not a SAS date. That's not unusual: people do store dates as integers in their RDBMS tables, and when SAS imports data from the table, it treats the column as an integer rather than a date. Check out the solution code below for reference.
data testing_date_integer;
infile datalines missover;
input int_date 8.;
/* Create another variable that is a SAS date, based on int_date.
We convert the integer date to character, append the
day (01) to it, and read the result with the YYMMDD8. informat so
that SAS stores it as a date.
*/
sas_date=input(cats(put(int_date,8.),'01'),yymmdd8.);
format sas_date YYQ8.;
datalines4;
200008
200009
200010
200011
200012
200101
200102
200103
200104
200105
200106
;;;;
run;
proc print data=testing_date_integer;run;
If the above code works and solves your problem, then I would recommend updating your PROC SQL code:
proc sql;
create table frst_cur as
select
input(cats(put(coh_asof_yyyymm ,8.),'01'),yymmdd8.) as cohort,
.
.
.
Also, I would recommend updating the format statement for cohort in PROC SGPLOT:
proc sgplot data = frst_cur;
.
.
format cohort yyq8.;
Hope this solves your problem.
Using the YYMMn6. informat
data HAVE;
input DATE YYMMn6.;
format date YYQ8.;
datalines;
200008
200009
200010
200011
200012
200101
200102
200103
200104
200105
200106
;
run;
Proc Print Data=HAVE noobs; Run;
data HAVE2;
input coh_asof_yyyymm 8.;
datalines;
200008
200009
200010
200011
200012
200101
200102
200103
200104
200105
200106
;
run;
proc sql;
create table frst_cur as
select
input(put(coh_asof_yyyymm,6.),YYMMn6.) as cohort format=YYQ8.
From HAVE2;
Quit;
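If you would rather avoid the round trip through a character string, the year and month can also be split off arithmetically and passed to MDY. A minimal sketch, assuming the value is always a six-digit yyyymm number; the dataset names have3 and want3 are made up for illustration.

```sas
data have3;
  input coh_asof_yyyymm;
  datalines;
200008
200012
200101
200106
;
run;

data want3;
  set have3;
  /* mod(...,100) is the month, int(.../100) is the year; the day is fixed at 1 */
  cohort = mdy(mod(coh_asof_yyyymm, 100), 1, int(coh_asof_yyyymm / 100));
  format cohort yyq6.;
run;

proc print data=want3 noobs;
run;
```

The result is a real SAS date, so PROC SGPLOT can treat the x-axis as a time axis instead of spacing the points by the raw yyyymm integers.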

Simplifying the variable input in SAS

I have 90 variables in the data, I want to do the following in SAS.
Here is my SAS code:
data test;
length id class sex $ 30;
input id $ 1 class $ 4-6 sex $ 8 survial $ 10;
cards;
1 3rd F Y
2 2nd F Y
3 2nd F N
4 1st M N
5 3rd F N
6 2nd M Y
;
run;
data items2;
set test;
length tid 8;
length item $8;
tid = _n_;
item = class;
output;
item = sex;
output;
item = survial;
output;
keep tid item;
run;
What if I have 90 variables to input the data like this? There should be a very long list. I want to simplify it.
You could use an ARRAY or alternately a PROC TRANSPOSE.
The following is untested, because you haven't provided an example of your input dataset.
DATA ITEMS;
SET REPLACE;
/* Declare the array after SET so it picks up the variables as they exist in the input dataset */
ARRAY VARS {*} VAR1-VAR90;
DO I = LBOUND(VARS) TO HBOUND(VARS);
ITEM = VARS{I};
OUTPUT;
END;
RUN;
OR
PROC TRANSPOSE DATA = TEST OUT = WANT;
BY ID;
VAR CLASS -- SURVIAL;
RUN;
In the future it would be best if you could supply your input and desired output.
I don't seem to be able to add another comment to the above answer, as such I am adding one here.
You need to extend the VAR statement to include all variables that you want transposed.
CLASS -- SURVIAL means all variables between CLASS and SURVIAL inclusive.
Post your code and the error so that I can help you better.
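For what it's worth, the same name-range shorthand also works in an ARRAY statement, which keeps the data step from the question but removes the long list of assignments. This is only a sketch: it assumes all the variables between CLASS and SURVIAL are character (as they are in the sample data) and sit next to each other in the dataset, and the sample data below is read with list input rather than the column input from the question.

```sas
data test;
  length id class sex survial $ 8;
  input id $ class $ sex $ survial $;
  datalines;
1 3rd F Y
2 2nd F Y
3 2nd F N
4 1st M N
;
run;

data items2;
  set test;
  /* every variable from CLASS through SURVIAL, in dataset order */
  array vars {*} class -- survial;
  length tid 8 item $8;
  tid = _n_;
  do i = 1 to dim(vars);
    item = vars{i};
    output;          /* one output row per variable per input row */
  end;
  keep tid item;
run;
```

With 90 variables you would only change the name range in the ARRAY statement; the loop stays the same.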