6.2 - Identifying Observations

In the previous section, we learned how to suppress the observation column that appears in default reports. Alternatively, we can use the ID statement to replace the observation column with one or more variables.

Example 6.4

Using the ID statement, we can emphasize one or more key variables. The ID statement automatically suppresses the printing of the observation number and tells SAS to print the variable(s) it names as the first column(s) of your output. Thus, the ID statement allows you to identify observations by the values of those variables rather than by the (usually meaningless) observation number. The following SAS code illustrates the use of the ID statement:

PROC PRINT data = basic;
   id name;
   var gender expense;
RUN;

name               gender    expense
Alice Smith             1    1001.98
Maryann White           1    2999.34
Thomas Jones            2    3904.89
Benedict Arnold         2    1450.23
Felicia Ho              1    1209.94
John Smith              2    1763.09
Jane Smiley             1    3567.00

Launch and run the SAS program, and review the output to convince yourself that name, having replaced the observation number, appears in the first column.
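
Because the ID statement accepts more than one variable, the same report can be keyed on two columns. The following is a minimal sketch rather than part of the original example. It assumes the same basic data set and uses the subj variable that appears in it:

```sas
/* Sketch: identify each row by subject number and name.         */
/* Assumes the basic data set used throughout this lesson.       */
PROC PRINT data = basic;
   id subj name;         /* ID variables print first, in this order */
   var gender expense;   /* remaining columns follow the ID columns */
RUN;
```

Here subj and name should appear as the two leftmost columns, in the order listed in the ID statement, and the observation number is again suppressed.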

Example 6.5

The ID statement is particularly useful when observations are too long for SAS to print on one line. In that case, SAS splits each observation and prints it on two (or more) lines. When that happens, one or more ID variables help you keep track of the observations, because the ID variables are repeated at the start of each block of output. The following SAS code illustrates such a situation:

OPTIONS LS = 64 PS = 58 NODATE;

PROC PRINT data = basic;
   id name;
   var subj name clinic gender 
       subj no_vis type_vis expense;
RUN;

name               subj    name               clinic    gender
Alice Smith        1024    Alice Smith        LEWN           1
Maryann White      1167    Maryann White      LEWN           1
Thomas Jones       1168    Thomas Jones       ALTO           2
Benedict Arnold    1201    Benedict Arnold    ALTO           2
Felicia Ho         1302    Felicia Ho         MNMC           1
John Smith         1471    John Smith         MNMC           2
Jane Smiley        1980    Jane Smiley        MNMC           1

name               subj    no_vis    type_vis    expense
Alice Smith        1024         7         101    1001.98
Maryann White      1167         2         101    2999.34
Thomas Jones       1168        10         190    3904.89
Benedict Arnold    1201         1         190    1450.23
Felicia Ho         1302         7         190    1209.94
John Smith         1471         6         187    1763.09
Jane Smiley        1980         5         190    3567.00

Let's take note of a few things about the code. Because we are working with a small data set with just seven variables, I used the OPTIONS statement to intentionally shrink the line size to 64 (LS = 64) so that SAS is unable to fit the requested variables on one line. Also note that the variable called name appears not only in the ID statement but also in the VAR statement. When a variable is listed in both statements, SAS displays it twice in the output. Finally, note that subj is included twice in the VAR statement, again only to make sure that the requested variables do not fit on one line.

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described.
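
In Example 6.5, name is printed twice because it is listed in both the ID and VAR statements, and subj is duplicated only to force the output to wrap. For an everyday report you would typically list each variable once. The following is a hedged sketch of that variant, assuming the same basic data set and the same line size:

```sas
/* Sketch: same report, but each variable is listed only once.   */
/* Assumes the basic data set used throughout this lesson.       */
OPTIONS LS = 64 PS = 58 NODATE;

PROC PRINT data = basic;
   id name;                        /* name identifies each row  */
   var subj clinic gender
       no_vis type_vis expense;    /* name is omitted from VAR  */
RUN;
```

With the duplicates removed, name should appear only as the identifying column. Whether the remaining variables still wrap onto a second block depends on their formatted widths relative to the 64-character line size, so run the program to see what happens on your system.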