SAS: Insert data when counter is 0 for the upcoming rows - sql

I need to find min_week, the week value of the row where the counter is 0, and carry that value forward on the following rows until the counter becomes 0 again.
The following code is my current attempt, but it does not give the output I am looking for:
proc sql;
create table week_min_1 as
select t1.*, t2.week as min_week from emp_table t1
left join (select * from emp_table where counter = 0 group by emp having sequence = min(sequence)) t2
on t1.emp= t2.emp
;
quit;

Let's first create your sample data:
data have;
input Emp $ Sequence Week Counter;
datalines;
a001 1 2 0
a001 2 4 1
a001 3 8 2
a001 4 12 3
a001 5 24 0
a001 6 36 1
a001 7 48 2
a001 8 52 3
;
run;
Now we should sort our data. In this sample it is already sorted, but it's better safe than sorry.
proc sort data=have;
by Emp Sequence;
run;
With a simple if statement we identify those min_week values.
data have2;
set have;
if counter = 0 then do;
min_week = Week;
end;
run;
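And with an update statement we carry those values through every row. Because UPDATE does not overwrite existing values with missing ones, the last non-missing min_week is kept until the next row where counter is 0.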
data want;
update have2 (obs=0) have2;
by Emp;
output;
run;
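The same carry-forward can also be sketched in a single data step with RETAIN. This is an alternative to the UPDATE approach, not part of the original answer; it assumes the have dataset is already sorted by Emp and Sequence, and the output name want_retain is made up for illustration.

```sas
/* Alternative sketch: carry min_week forward with RETAIN.       */
/* Assumes HAVE is sorted by Emp and Sequence; WANT_RETAIN is a  */
/* hypothetical output name.                                     */
data want_retain;
  set have;
  by Emp;
  retain min_week;
  if first.Emp then min_week = .;       /* reset at the start of each employee */
  if Counter = 0 then min_week = Week;  /* a new block starts when Counter is 0 */
run;
```

With the sample data, rows 1 to 4 get min_week 2 and rows 5 to 8 get min_week 24.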

Related

Rolling sum for last 3 hour records of just one column in SAS

Everyone,
For every record (row) I need to calculate the sum of usage over the preceding 3 hours (usage is one of the columns in the dataset), grouped by User and ID_option.
Each row represents one record (one hour has about a million records). As an example, here is a table with just a few records, including the desired column sum_usage_3hr:
User ID_option time usage sum_usage_3hr
1 a1 12OCT2017:11:20:32 3 10
1 a1 12OCT2017:10:23:24 7 14
1 b1 12OCT2017:09:34:55 12 12
2 b1 12OCT2017:08:55:06 4 6
1 a1 12OCT2017:07:59:53 7 7
2 b1 12OCT2017:06:59:12 2 2
I tried something like the code below, but it returns the sum over all time, not just the last 3 hours. I'm not surprised, but I don't have much of an idea how to do this in SAS.
proc sql;
CREATE table my_table AS
SELECT *, SUM(usage) AS sum_usage_3hr
FROM prev_table WHERE time BETWEEN TIME and intnx('second', time, -3*3600)
GROUP BY User, ID_option;
quit;
Any help is welcomed, thanks. It's not necessary to do this in proc sql, data step is also acceptable if it's possible. I just assume that I need some kind of partition by.
Thanks in advance.
Why not just use a correlated sub-query to get the sum?
data have ;
input user id_option $ datetime :datetime. usage expected ;
format datetime datetime20.;
cards;
1 a1 12OCT2017:11:20:32 3 10
1 a1 12OCT2017:10:23:24 7 14
1 b1 12OCT2017:09:34:55 12 12
2 b1 12OCT2017:08:55:06 4 6
1 a1 12OCT2017:07:59:53 7 7
2 b1 12OCT2017:06:59:12 2 2
;
proc print; run;
proc sql ;
create table want as
select a.*
, (select sum(b.usage)
from have b
where a.user=b.user and a.id_option=b.id_option
and b.datetime between intnx('hour',a.datetime,-3,'s') and a.datetime
) as usage_3hr
from have a
;
quit;
Results
Obs user id_option datetime usage expected usage_3hr
1 1 a1 12OCT2017:11:20:32 3 10 10
2 1 a1 12OCT2017:10:23:24 7 14 14
3 1 b1 12OCT2017:09:34:55 12 12 12
4 2 b1 12OCT2017:08:55:06 4 6 6
5 1 a1 12OCT2017:07:59:53 7 7 7
6 2 b1 12OCT2017:06:59:12 2 2 2
The result is not surprising: the condition in the WHERE clause is always true, because time always lies between itself and any other value.
I believe the simplest way is to join the table to itself and select the relevant rows this way:
proc sql;
create table want as
select distinct a.*
,sum(b.USAGE) as sum_usage_3hr
from have as a
left join have as b
on a.USER = b.USER
and a.ID_OPTION = b.ID_OPTION
and b.TIME between intnx('hour', a.TIME, -3, 'same') and a.TIME
group by a.USER, a.ID_OPTION, a.TIME;
quit;
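One detail worth checking in windows like these is the alignment argument of intnx. By default the function aligns its result to the beginning of the interval, so an 'hour' shift lands on the start of an hour; the 'same' (or 's') alignment keeps the minutes and seconds. A minimal sketch, using one of the sample timestamps (the variable names are illustrative):

```sas
/* Compare the default alignment of INTNX with 'same'. */
data _null_;
  t = '12OCT2017:11:20:32'dt;
  start_default = intnx('hour', t, -3);          /* beginning of the hour: 08:00:00 */
  start_same    = intnx('hour', t, -3, 'same');  /* exactly 3 hours earlier: 08:20:32 */
  put start_default= datetime20. start_same= datetime20.;
run;
```

For an exact 3-hour look-back, the 'same' alignment (or a shift in seconds) is the one that matches the requirement.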

How to sum two columns based on another column condition in SAS and store the result in a variable

I have data like this :
ColumnA ColumnB ColumnC
1 10 20
1 10 30
0 50 10
0 50 20
1 10 40
I want to sum values of ColumnB and ColumnC where the values of ColumnA = 1 and store the result in a variable
i.e. I want 120 (sum values of ColumnB and ColumnC where the values of ColumnA = 1) and store this result in a variable in SAS.
With this I also want to (sum values of ColumnB and ColumnC where the values of ColumnA = 0) in another variable i.e. 130 in another variable in SAS
I tried to create a variable in proc print, means,etc. and even thought of doing it in proc sql but was unable to achieve the result.
Easily done with basic SQL:
data have;
infile datalines;
input columnA columnB columnC;
datalines;
1 10 20
1 10 30
0 50 10
0 50 20
1 10 40
;
run;
proc sql;
select sum(ColumnB)+sum(ColumnC)
into: want
from have
where columnA=1
;
quit;
/* the result is stored in macro variable &want */
%put &want.;
EDIT: to answer your follow-up question, this will give you two output variables with the two sums:
data have;
infile datalines;
input columnA columnB columnC;
datalines;
1 10 20
1 10 30
0 50 10
0 50 20
1 10 40
;
run;
proc sql;
select sum(case when columnA=1 then ColumnB+ColumnC end) as A1
,sum(case when columnA=0 then ColumnB+ColumnC end) as A0
into :want1, :want0
from have
;
quit;
/* macro variable &want1 contains the result where ColumnA=1 and &want0 where ColumnA=0 */
%put &want1;
%put &want0;
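If a dataset is acceptable instead of macro variables, the two sums can also be sketched as one grouped query. This is an alternative, not the original answer's method; the table name sums and the column name total are illustrative.

```sas
/* One row per value of columnA, with the combined sum of B and C. */
proc sql;
  create table sums as
  select columnA,
         sum(columnB + columnC) as total
  from have
  group by columnA;
quit;
```

For the sample data this gives total = 130 for columnA = 0 and total = 120 for columnA = 1.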

Create adjacency matrix from dataset in SAS

I have been trying desperately to create an adjacency matrix from a dataset (I have the equivalent in R), but am unable to do so in SAS (beginner proficiency). It would be very helpful if you could help me out with this. Also, kindly suggest if this and sparse matrices are possible in SAS (without SNA) ?
data test;
input id id_o;
cards;
100 300
600 400
200 300
100 200
;
run;
I find the union of all unique id and id_o to create a list
proc sql;
create table test2 as
select distinct id from
(select id as id from test
union all
select id_o as id from test);
quit;
Test2 looks like
100
600
200
300
400
Now I want an adjacency matrix that has a 1 wherever there is a link in the original dataset, for example between id 100 and id_o 300. Consider the ids in Test2 to be the rows (i), with a 1 in each linked column (j).
So, the adjacency matrix will look like
    100 600 200 300 400
100   0   0   1   1   0
600   0   0   0   0   1
200   0   0   0   1   0
300   0   0   0   0   0
400   0   0   0   0   0
Here's one way, expanding on your current code. First you need to create an empty table with all options and then fill in the 1/0's. Then transpose the table to the desired format.
There may be a way to do this with proc distance or some other proc, but I'm not sure.
*get all the 1's for matrix;
proc freq data=test;
table id*id_o/sparse out=dist1;
run;
*Fill into matrix with all options;
proc sql;
create table test3 as
select a.id, b.id as id_o, coalesce(c.count, 0) as count
from test2 as a
cross join test2 as b
left join dist1 as c
on a.id=c.id
and b.id=c.id_o
order by a.id, b.id;
quit;
*Transpose to desired format;
proc transpose data=test3 out=test4 prefix=id_;
by id;
id id_o;
var count;
run;

How to group using two variables

Suppose we've got the following dataset:
DATE VAR1 VAR2
1 A 1
2 A 1
3 B 1
4 C 2
5 D 3
6 E 4
7 F 5
8 B 6
9 B 7
10 D 1
Each record belongs to a person, the problem is that a single person can have more than one record with different values.
To identify a person: If you share the same VAR1, you are the same person, BUT also if you share the same VAR2, you are the same person.
My objective is to create a new variable IDPERSON which uniquely identifies the person for each record. In my example, there are only 4 different people:
DATE VAR1 VAR2 IDPERSON
1 A 1 1
2 A 1 1
3 B 1 1
4 C 2 2
5 D 3 1
6 E 4 3
7 F 5 4
8 B 6 1
9 B 7 1
10 D 1 1
How could I achieve this by using SQL or SAS?
%macro grouper(
inData /*Input dataset*/,
outData /*output dataset*/,
id1 /*First identification variable (must be numeric)*/,
id2 /*Second identification variable*/,
idOut /*Name of variable to contain group ID*/,
maxN = 5 /*Max number of iterations in case of failure*/);
/* Assign an ID to each distinct connected graph in a network */
/* Create first guess for group ID */
data _g_temp;
set &inData.;
&idOut. = &id1.;
run;
/* Loop, improve group ID each time*/
%let i = 1;
%do %while (&i. <= &maxN.);
%put Loop number &i.;
%let i = %eval(&i. + 1);
proc sql noprint;
/* Find the lowest group ID for each group of first variable */
create table _g_map1 as
select
min(&idOut.) as &idOut.,
&id1.
from _g_temp
group by &id1.;
/* Find the lowest group ID for each group of second variable */
create table _g_map2 as
select
min(&idOut.) as &idOut.,
&id2.
from _g_temp
group by &id2.;
/* Find the lowest group ID from both grouping variables */
create table _g_new as
select
a.&id1.,
a.&id2.,
coalesce(min(b.&idOut., c.&idOut.), a.&idOut.) as &idOut.,
a.&idOut. as &idOut._old
from _g_temp as a
full outer join _g_map1 as b
on a.&id1. = b.&id1.
full outer join _g_map2 as c
on a.&id2. = c.&id2.;
/* Put results into temporary dataset ready for next iteration */
create table _g_temp as
select *
from _g_new;
/* Check if the iteration provided any improvement */
select
min(
case when &idOut._old = &idOut. then 1
else 0
end) into :stopFlag
from _g_temp;
quit;
/* End loop if ID unchanged over last iteration */
%if &stopFlag. %then %let i = %eval(&maxN. + 1);
%end;
/* Output lookup table */
proc sql;
create table &outData. as
select
&id1.,
min(&idOut.) as &idOut.
from _g_temp
group by &id1.;
quit;
/* Clean up */
proc datasets nolist;
delete _g_:;
quit;
%mend grouper;
DATA baseData;
INPUT VAR1 VAR2 $;
CARDS;
1 A
1 A
1 B
2 C
3 D
4 E
5 F
6 B
7 B
1 D
1 X
7 G
6 Y
6 D
6 I
8 D
9 Z
9 X
;
RUN;
%grouper(
baseData,
outData,
VAR1,
VAR2,
groupID);
Do you think this will work?
It's written in SAS, but it uses SQL statements.
DATA TEMP3;
INPUT VAR1 VAR2 $ DATE;
CARDS;
1 A 1
1 A 2
1 B 3
2 C 4
3 D 5
4 E 6
5 F 7
6 B 8
7 B 9
1 D 10
;
RUN;
PROC SQL;
CREATE TABLE WORK.TEMP4 AS SELECT DISTINCT VAR2, VAR1 FROM WORK.TEMP3 ORDER BY VAR2, VAR1;
CREATE TABLE WORK.TEMP5 AS SELECT DISTINCT VAR1, VAR2 FROM WORK.TEMP3 ORDER BY VAR1, VAR2;
CREATE TABLE WORK.TEMP6 AS SELECT TEMP4.VAR2, TEMP4.VAR1, TEMP5.VAR2 AS VAR22 FROM WORK.TEMP4 INNER JOIN WORK.TEMP5 ON (TEMP4.VAR1=TEMP5.VAR1);
CREATE TABLE WORK.TEMP7 AS SELECT TEMP6.*, TEMP5.VAR1 AS VAR12 FROM WORK.TEMP6 INNER JOIN WORK.TEMP5 ON (TEMP6.VAR2=TEMP5.VAR2);
CREATE TABLE WORK.TEMP8 AS SELECT DISTINCT VAR22, VAR12 FROM WORK.TEMP7 ORDER BY VAR22, VAR12;
CREATE TABLE WORK.TEMP9 AS SELECT VAR22, MAX(VAR12) AS VAR12 FROM WORK.TEMP8 GROUP BY VAR22;
CREATE TABLE WORK.TEMP10 AS SELECT TEMP8.* FROM WORK.TEMP8 INNER JOIN WORK.TEMP9 ON (TEMP8.VAR22=TEMP9.VAR22 AND TEMP8.VAR12=TEMP9.VAR12);
CREATE TABLE WORK.TEMP11 AS SELECT TEMP3.*, TEMP10.VAR12 AS IDPERSONA FROM WORK.TEMP3 LEFT JOIN WORK.TEMP10 ON (TEMP3.VAR2=TEMP10.VAR22);
QUIT;
I've broken down this problem into a few steps, which works for the data you've supplied. There's probably a way to reduce the number of steps, at the expense of readability. Let me know if this works for your real data.
/* create input dataset */
data have;
input DATE VAR1 $ VAR2;
datalines;
1 A 1
2 A 1
3 B 1
4 C 2
5 D 3
6 E 4
7 F 5
8 B 6
9 B 7
10 D 1
;
run;
/* calculate min VAR2 per VAR1 */
proc summary data=have nway idmin;
class var1;
output out=minvar2 (drop=_:) min(var2)=temp_var;
run;
/* add in min VAR2 data */
proc sql;
create table temp1 as select
a.*,
b.temp_var
from have as a
inner join
minvar2 as b
on a.var1 = b.var1
order by b.temp_var;
quit;
/* create idperson variable */
data want;
set temp1;
by temp_var;
if first.temp_var then idperson+1;
drop temp_var;
run;
/* sort back to original order */
proc sort data=want;
by date var1;
run;
Keith:
Your solution does not work properly; take a look at the following dataset:
DATA TEMP3;
INPUT VAR2 VAR1 $ DATE;
DUMMY=1;
CARDS;
1 A 1
1 A 2
1 B 3
2 C 4
3 D 5
4 E 6
5 F 7
6 B 8
7 B 9
1 D 10
1 X 11
7 G 14
6 Y 15
6 D 16
6 I 18
8 D 20
9 Z 21
9 X 22
;
RUN;
Your program's result is:
VAR2 VAR1 DATE DUMMY idperson
1 A 1 1 1
1 A 2 1 1
1 B 3 1 1
2 C 4 1 2
3 D 5 1 1
4 E 6 1 3
5 F 7 1 4
6 B 8 1 1
7 B 9 1 1
1 D 10 1 1
1 X 11 1 1
7 G 14 1 6
6 Y 15 1 5
6 D 16 1 1
6 I 18 1 5
8 D 20 1 1
9 Z 21 1 7
9 X 22 1 1
These results are not correct, since the VAR2=6 records have two different ids.
This is what I've done; the whole program (not posted here) is more complex (and not so elegant), since it deals with missing data in Var1 and Var2.
PROC SQL;
CREATE TABLE WORK.TEMP4 AS SELECT DISTINCT VAR1, VAR2 FROM WORK.TEMP3 WHERE DUMMY=1 AND VAR2^=. ORDER BY VAR1, VAR2;
CREATE TABLE WORK.TEMP5 AS SELECT DISTINCT VAR2, VAR1 FROM WORK.TEMP3 WHERE DUMMY=1 AND VAR2^=. ORDER BY VAR2, VAR1;
CREATE TABLE WORK.TEMP6 AS SELECT TEMP4.*, TEMP5.VAR1 AS CIP2 FROM WORK.TEMP4 INNER JOIN WORK.TEMP5 ON (TEMP4.VAR2=TEMP5.VAR2);
CREATE TABLE WORK.TEMP7 AS SELECT TEMP6.*, TEMP4.VAR2 AS IDHH2 FROM WORK.TEMP6 INNER JOIN WORK.TEMP4 ON (TEMP6.VAR1=TEMP4.VAR1);
CREATE TABLE WORK.TEMP8 AS SELECT DISTINCT IDHH2, CIP2 FROM WORK.TEMP7;
CREATE TABLE WORK.TEMP9 AS SELECT TEMP7.*, TEMP8.CIP2 AS CIP3 FROM WORK.TEMP7 INNER JOIN WORK.TEMP8 ON (TEMP7.IDHH2=TEMP8.IDHH2);
CREATE TABLE WORK.TEMP10 AS SELECT TEMP9.*, TEMP8.IDHH2 AS IDHH3 FROM WORK.TEMP9 INNER JOIN WORK.TEMP8 ON (TEMP9.CIP3=TEMP8.CIP2);
CREATE TABLE WORK.TEMP11 AS SELECT DISTINCT VAR1, IDHH3 AS VAR2 FROM WORK.TEMP10 ORDER BY VAR1, IDHH3;
CREATE TABLE WORK.TEMP12 AS SELECT VAR1, MAX(VAR2) AS VAR2 FROM WORK.TEMP11 GROUP BY VAR1;
CREATE TABLE WORK.TEMP13 AS SELECT TEMP11.* FROM WORK.TEMP11 INNER JOIN WORK.TEMP12 ON (TEMP11.VAR1=TEMP12.VAR1 AND TEMP11.VAR2=TEMP12.VAR2);
CREATE TABLE WORK.TEMP14 AS SELECT TEMP3.*, TEMP13.VAR2 AS IDPERSONA FROM WORK.TEMP3 LEFT JOIN WORK.TEMP13 ON (TEMP3.VAR1=TEMP13.VAR1);
CREATE TABLE WORK.TEMP15 AS SELECT DISTINCT VAR2, IDPERSONA FROM WORK.TEMP14 WHERE VAR2^=. AND IDPERSONA^=.;
CREATE TABLE WORK.TEMP16 AS SELECT TEMP14.*, TEMP15.IDPERSONA AS IDPERSONA2 FROM WORK.TEMP14 LEFT JOIN WORK.TEMP15 ON (TEMP14.VAR2=TEMP15.VAR2) ORDER BY DATE;
QUIT;
DATA TEMP16;
SET TEMP16;
IF IDPERSONA=. THEN IDPERSONA=IDPERSONA2;
DROP IDPERSONA2;
RUN;
And the right results:
VAR2 VAR1 DATE DUMMY IDPERSONA
1 A 1 1 9
1 A 2 1 9
1 B 3 1 9
2 C 4 1 2
3 D 5 1 9
4 E 6 1 4
5 F 7 1 5
6 B 8 1 9
7 B 9 1 9
1 D 10 1 9
1 X 11 1 9
7 G 14 1 9
6 Y 15 1 9
6 D 16 1 9
6 I 18 1 9
8 D 20 1 9
9 Z 21 1 9
9 X 22 1 9
I forgot to post my final solution, it is a SAS macro. I've made another one for 3 variables.
%MACRO GROUPER2(INDATA,OUTDATA,ID1,ID2,IDOUT,IDN=_N_,MAXN=5);
%PUT ****************************************************************;
%PUT ****************************************************************;
%PUT **** GROUPER MACRO;
%PUT **** PARAMETERS:;
%PUT **** INPUT DATA: &INDATA.;
%PUT **** OUTPUT DATA: &OUTDATA.;
%PUT **** FIRST VARIABLE: &ID1.;
%PUT **** SECOND VARIABLE: &ID2.;
%PUT **** OUTPUT GROUPING VARIABLE: &IDOUT.;
%IF (&IDN.=_N_) %THEN %PUT **** STARTING NUMBER VARIABLE: AUTONUMBER;
%ELSE %PUT **** STARTING NUMBER VARIABLE: &IDN.;
%PUT **** MAX ITERATIONS: &MAXN.;
%PUT ****************************************************************;
%PUT ****************************************************************;
/* CREATE FIRST GUESS FOR GROUP ID */
DATA _G_TEMP1 _G_TEMP2;
SET &INDATA.;
&IDOUT.=&IDN.;
IF &IDOUT.=. THEN OUTPUT _G_TEMP2;
ELSE OUTPUT _G_TEMP1;
RUN;
PROC SQL NOPRINT;
SELECT MAX(&IDOUT.) INTO :MAXIDOUT FROM _G_TEMP1;
QUIT;
DATA _G_TEMP2;
SET _G_TEMP2;
&IDOUT.=_N_+&MAXIDOUT.;
RUN;
DATA _G_TEMP;
SET _G_TEMP1 _G_TEMP2;
RUN;
PROC SQL;
UPDATE _G_TEMP SET &IDOUT.=. WHERE &ID1. IS NULL AND &ID2. IS NULL;
QUIT;
/* LOOP, IMPROVE GROUP ID EACH TIME*/
%LET I = 1;
%DO %WHILE (&I. <= &MAXN.);
%PUT LOOP NUMBER &I.;
%LET I = %EVAL(&I. + 1);
PROC SQL NOPRINT;
/* FIND THE LOWEST GROUP ID FOR EACH GROUP OF FIRST VARIABLE */
CREATE TABLE _G_MAP1 AS SELECT MIN(&IDOUT.) AS &IDOUT., &ID1. FROM _G_TEMP WHERE &ID1. IS NOT NULL GROUP BY &ID1.;
/* FIND THE LOWEST GROUP ID FOR EACH GROUP OF SECOND VARIABLE */
CREATE TABLE _G_MAP2 AS SELECT MIN(&IDOUT.) AS &IDOUT., &ID2. FROM _G_TEMP WHERE &ID2. IS NOT NULL GROUP BY &ID2.;
/* FIND THE LOWEST GROUP ID FROM BOTH GROUPING VARIABLES */
CREATE TABLE _G_NEW AS SELECT A.&ID1., A.&ID2., COALESCE(MIN(B.&IDOUT., C.&IDOUT.), A.&IDOUT.) AS &IDOUT.,
A.&IDOUT. AS &IDOUT._OLD FROM _G_TEMP AS A FULL OUTER JOIN _G_MAP1 AS B ON A.&ID1. = B.&ID1.
FULL OUTER JOIN _G_MAP2 AS C ON A.&ID2. = C.&ID2.;
/* PUT RESULTS INTO TEMPORARY DATASET READY FOR NEXT ITERATION */
CREATE TABLE _G_TEMP AS SELECT * FROM _G_NEW ORDER BY &ID1., &ID2.;
/* CHECK IF THE ITERATION PROVIDED ANY IMPROVEMENT */
SELECT MIN(CASE WHEN &IDOUT._OLD = &IDOUT. THEN 1 ELSE 0 END) INTO :STOPFLAG FROM _G_TEMP;
%PUT NO IMPROVEMENT? &STOPFLAG.;
QUIT;
/* END LOOP IF ID UNCHANGED OVER LAST ITERATION */
%LET ITERATIONS=%EVAL(&I. - 1);
%IF &STOPFLAG. %THEN %LET I = %EVAL(&MAXN. + 1);
%END;
%PUT ****************************************************************;
%PUT ****************************************************************;
%IF &STOPFLAG. %THEN %PUT **** LOOPING ENDED BY NO-IMPROVEMENT CRITERIA. OUTPUT FULLY GROUPED.;
%ELSE %PUT **** WARNING: LOOPING ENDED BY REACHING THE MAXIMUM NUMBER OF ITERATIONS. OUTPUT NOT FULLY GROUPED.;
%PUT **** NUMBER OF ITERATIONS: &ITERATIONS. (MAX: &MAXN.);
%PUT ****************************************************************;
%PUT ****************************************************************;
DATA &OUTDATA.;
SET _G_TEMP;
DROP &IDOUT._OLD;
RUN;
/* OUTPUT LOOKUP TABLE */
PROC SQL;
CREATE TABLE &OUTDATA._1 AS SELECT &ID1., MIN(&IDOUT.) AS &IDOUT. FROM _G_TEMP WHERE &ID1. IS NOT NULL GROUP BY &ID1. ORDER BY &ID1.;
CREATE TABLE &OUTDATA._2 AS SELECT &ID2., MIN(&IDOUT.) AS &IDOUT. FROM _G_TEMP WHERE &ID2. IS NOT NULL GROUP BY &ID2. ORDER BY &ID2.;
QUIT;
/* CLEAN UP */
PROC DATASETS NOLIST;
DELETE _G_:;
QUIT;
%MEND GROUPER2;
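A call to the macro might look like the sketch below. It assumes the TEMP3 dataset from the earlier post exists and that GROUPER2 has already been compiled in the session; the output name GROUPED and the ID variable name IDPERSONA are chosen here only for illustration.

```sas
/* Hypothetical call: group TEMP3 by VAR1 and VAR2, writing the */
/* group ID to IDPERSONA. IDN and MAXN keep their defaults.     */
%GROUPER2(TEMP3, GROUPED, VAR1, VAR2, IDPERSONA);
```

As the macro is written, GROUPED holds only VAR1, VAR2 and IDPERSONA, while GROUPED_1 and GROUPED_2 are lookup tables keyed by VAR1 and VAR2 respectively, so other columns such as DATE would have to be joined back from the input.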

SAS and proc sql

I have to get rid of a subject if it satisfies a condition.
DATA:
Name Value1
A 60
A 30
B 70
B 30
C 60
C 50
D 70
D 40
What I want is: if a name has a row with value = 30, then none of that name's rows should appear in the output.
Desired output is
Name Value1
C 60
C 50
D 70
D 40
I have written this code in proc sql:
proc sql;
create table ck1 as
select * from ip where name in
(select distinct name from ip where value = 30)
order by name, subject, folderseq;
quit;
Change your SQL to be:
proc sql;
create table ck1 as
select * from ip where name not in
(select distinct name from ip where value = 30)
order by name, subject, folderseq;
quit;
Data step method:
data have;
input Name $ Value1;
datalines;
A 60
A 30
B 70
B 30
C 60
C 50
D 70
D 40
;;;;
run;
data want;
do _n_ = 1 by 1 until (last.name);
set have;
by name;
if value1=30 then value1_30=1;
/* no LEAVE here: this loop must read to last.name so that both SET statements stay on the same BY group */
end;
do _n_ = 1 by 1 until (last.name);
set have;
by name;
if value1_30 ne 1 then output;
end;
run;
And an alternate, slightly faster method in some cases that avoids the second set statement when value1_30 is 1 (this is faster in particular if most have a 30 in them, so you're only keeping a small number of records).
data want;
do _n_ = 1 by 1 until (last.name);
set have;
by name;
counter+1;
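if first.name then firstcounter=counter;
if last.name then lastcounter=counter; /* not ELSE IF, so single-row groups are handled too */
if value1=30 then value1_30=1;
/* no LEAVE here: the loop must read to last.name so the next iteration starts on a new BY group */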
end;
if value1_30 ne 1 then
do _n_ = firstcounter to lastcounter ;
set have point=_n_;
output;
end;
run;
Another SQL option...
proc sql number;
select
a.name,
a.value1,
case
when value1 = 30 then 1
else 0
end as flag,
sum(calculated flag) as countflagpername
from have a
group by a.name
having countflagpername = 0
;quit;