6.2 - Identifying Observations

In the previous section, we learned how to suppress the observation column that appears in default reports. Alternatively, we can use the ID statement to replace the observation column with one or more variables.

Example 6.4

Using the ID statement, we can emphasize one or more key variables. The ID statement, which automatically suppresses the printing of the observation number, tells SAS to print the variable(s) it names as the first column(s) of your output. Thus, the ID statement allows you to use the values of those variables to identify observations, rather than the (usually meaningless) observation number. The following SAS code illustrates the use of the ID statement:

PROC PRINT data = basic;
   id name;
   var gender expense;
RUN;

The SAS System

name               gender    expense
Alice Smith             1    1001.98
Maryann White           1    2999.34
Thomas Jones            2    3904.89
Benedict Arnold         2    1450.23
Felicia Ho              1    1209.94
John Smith              2    1763.09
Jane Smiley             1    3567.00

Launch and run the SAS program, and review the output to convince yourself that the name variable, having replaced the observation number, appears in the first column.
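As noted at the start of this section, the ID statement accepts more than one variable. The following is a minimal sketch of that case, not part of the original lesson program. It assumes a hypothetical data set named basic_demo, built here from the values shown in this section's output so that the example can run on its own; if you already have the basic data set, you can skip the DATA step and use data = basic instead.

```sas
/* Hypothetical stand-in for the lesson's basic data set,      */
/* typed in from the values shown in this section's output.    */
DATA basic_demo;
   infile datalines dlm = ',';
   input subj name :$15. clinic $ gender no_vis type_vis expense;
   datalines;
1024,Alice Smith,LEWN,1,7,101,1001.98
1167,Maryann White,LEWN,1,2,101,2999.34
1168,Thomas Jones,ALTO,2,10,190,3904.89
1201,Benedict Arnold,ALTO,2,1,190,1450.23
1302,Felicia Ho,MNMC,1,7,190,1209.94
1471,John Smith,MNMC,2,6,187,1763.09
1980,Jane Smiley,MNMC,1,5,190,3567.00
;
RUN;

/* Two ID variables: clinic and name become the first two      */
/* columns, and the observation number is not printed.         */
PROC PRINT data = basic_demo;
   id clinic name;
   var gender expense;
RUN;
```

The ID variables appear in the order in which they are listed, so clinic comes before name in the report.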

Example 6.5

The ID statement is particularly useful when observations are so long that SAS can't print them on one line. In that case, SAS breaks up the observations and prints them on two (or more) lines. When that happens, it helps to use one or more ID variables so that you can keep track of the observations. The following SAS code illustrates such a situation:

OPTIONS LS = 64 PS = 58 NODATE;
PROC PRINT data = basic;
   id name;
   var subj name clinic gender 
       subj no_vis type_vis expense;
RUN;

The SAS System

name               subj    name               clinic    gender
Alice Smith        1024    Alice Smith        LEWN           1
Maryann White      1167    Maryann White      LEWN           1
Thomas Jones       1168    Thomas Jones       ALTO           2
Benedict Arnold    1201    Benedict Arnold    ALTO           2
Felicia Ho         1302    Felicia Ho         MNMC           1
John Smith         1471    John Smith         MNMC           2
Jane Smiley        1980    Jane Smiley        MNMC           1

The SAS System

name               subj    no_vis    type_vis    expense
Alice Smith        1024         7         101    1001.98
Maryann White      1167         2         101    2999.34
Thomas Jones       1168        10         190    3904.89
Benedict Arnold    1201         1         190    1450.23
Felicia Ho         1302         7         190    1209.94
John Smith         1471         6         187    1763.09
Jane Smiley        1980         5         190    3567.00

Let's take note of a few things about the code. Because we are working with a small data set of just seven variables, I used the OPTIONS statement to shrink the line size to 64 on purpose, so that SAS cannot fit the requested variables on one line. Also, note that the variable name appears in both the ID statement and the VAR statement. When that happens, SAS displays the variable twice in the output. Finally, note that subj is listed twice in the VAR statement, again only to make sure that the requested variables do not fit on one line.
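If the duplicated columns are not wanted, one option is to list name only in the ID statement and to list subj just once in the VAR statement. The following is a sketch of that variation, not part of the original lesson program. It assumes the same basic data set used above. Whether the report still wraps onto more than one line depends on the line size and the column widths, so no particular layout is claimed here.

```sas
OPTIONS LS = 64 PS = 58 NODATE;

/* name appears only in the ID statement, so it is printed     */
/* once, as the identifying first column.                      */
PROC PRINT data = basic;
   id name;
   var subj clinic gender no_vis type_vis expense;
RUN;
```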

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described.