6.3 - Selecting Observations

By default, the PRINT procedure displays all of the observations in a SAS data set. You can control which observations are printed by:

  • using the FIRSTOBS= and OBS= options to tell SAS which range of observation numbers to print
  • using the WHERE statement to print only those observations that meet a certain condition

Example 6.6

The following SAS code uses the FIRSTOBS= and OBS= data set options in the PRINT procedure to print the second, third, fourth and fifth observations of the basic data set:

OPTIONS LS = 75 PS = 58 NODATE;

PROC PRINT data = basic (FIRSTOBS = 2 OBS = 5);
   var subj name no_vis expense;
RUN;

Obs    subj    name               no_vis    expense
  2    1167    Maryann White           2    2999.34
  3    1168    Thomas Jones           10    3904.89
  4    1201    Benedict Arnold         1    1450.23
  5    1302    Felicia Ho              7    1209.94

The FIRSTOBS= option tells SAS the first observation to print, and the OBS= option tells SAS the last observation to print. Note that OBS= is the number of the last observation, not a count of how many observations to print. Both options must be placed in parentheses, and the parentheses must immediately follow the data set name in the DATA= option. You will get a syntax error if you try to use the options without also using the DATA= option. (Incidentally, if you don't use the DATA= option to tell SAS which data set to print, SAS will print the most recently created data set.)
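To make the meaning of the two options concrete, here is a minimal sketch. It assumes a small, hypothetical data set named demo whose variables (id and score) and values are made up purely for illustration; it is not the basic data set used in the examples.

```sas
* Hypothetical six-row data set, used only for illustration;
DATA demo;
   input id score;
   DATALINES;
1 10
2 20
3 30
4 40
5 50
6 60
;
RUN;

* Prints observations 3 and 4, that is two rows and not four;
PROC PRINT data = demo (FIRSTOBS = 3 OBS = 4);
RUN;

* OBS= on its own prints observations 1 through 2;
PROC PRINT data = demo (OBS = 2);
RUN;

* FIRSTOBS= on its own prints observation 5 through the last one;
PROC PRINT data = demo (FIRSTOBS = 5);
RUN;
```

Either option can be used without the other: omitting FIRSTOBS= starts at the first observation, and omitting OBS= continues through the last observation.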

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described.

Example 6.7

The FIRSTOBS= and OBS= options tell SAS to print observations based on their observation numbers, whereas the WHERE statement tells SAS to print observations based on whether or not they meet the specified condition. The following SAS code uses the WHERE statement to tell SAS to print only those observations for which the value of the variable no_vis is greater than 5:

PROC PRINT data = basic;
   var name no_vis type_vis expense;
   where no_vis > 5;
RUN;

Obs    name            no_vis    type_vis    expense
  1    Alice Smith          7         101    1001.98
  3    Thomas Jones        10         190    3904.89
  5    Felicia Ho           7         190    1209.94
  6    John Smith           6         187    1763.09

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described. Then, a few things to note:

  • Only one WHERE statement takes effect in each PRINT procedure. If you include more than one, SAS uses only the last one.
  • You can specify any variable in your SAS data set in a WHERE statement, not just the variables that appear in the VAR statement. (I like to place variables in both places just so I am able to see the effect of the WHERE statement.)
  • The WHERE statement works for both character and numeric variables. To specify a condition based on the value of a character variable, you must:
    • Enclose the value in quotation marks, and
    • Type the value using lowercase and uppercase letters exactly as it appears in the data set.
  • Any of the comparison operators that you learned for if-then-else statements, such as eq, ne, gt, lt, ge, or le, can be used in a WHERE statement.
  • You can also use the logical operators AND and OR to select observations that meet multiple conditions.
  • You can also use the CONTAINS operator to select observations whose value includes the specified substring.
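The points above about character values, comparison operators, and logical operators can be sketched as follows. The data set visits, along with its variable names and values, is hypothetical and made up for illustration only; it is not the basic data set.

```sas
* Hypothetical data set, used only for illustration;
DATA visits;
   input name $ gender $ no_vis expense;
   DATALINES;
Alice F 7 1001.98
Bob M 2 250.00
Carla F 3 780.50
Dev M 9 3120.75
;
RUN;

* Character condition with the value quoted and typed in the stored case;
PROC PRINT data = visits;
   var name gender no_vis expense;
   where gender = 'F';
RUN;

* The same condition with a lowercase f selects no observations;
PROC PRINT data = visits;
   var name gender no_vis expense;
   where gender = 'f';
RUN;

* Two conditions joined with AND, using the operators ge and lt;
PROC PRINT data = visits;
   var name gender no_vis expense;
   where no_vis ge 3 and expense lt 2000;
RUN;
```

In this sketch, the first step should print Alice and Carla, the second should print nothing because the stored values are uppercase, and the third should print Alice and Carla because Dev's expense is not below 2000.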

Example 6.8

The following SAS code uses the CONTAINS operator to select observations in the basic data set for which the name variable contains the substring 'Smi':

PROC PRINT data = basic;
   var name gender no_vis type_vis expense;
   where name contains 'Smi';
RUN;

Obs    name           gender    no_vis    type_vis    expense
  1    Alice Smith         1         7         101    1001.98
  6    John Smith          2         6         187    1763.09
  7    Jane Smiley         1         5         190    3567.00

First, launch and run the SAS program, and review the output to convince yourself that SAS behaves as described. Then, change the word contains in the WHERE statement to a question mark (?), and re-run the SAS program. If you review the output again and note that it is identical to the previous output, you should be able to convince yourself that the ? operator is equivalent to the contains operator.
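If you don't have the basic data set at hand, the following minimal sketch illustrates the same idea. It assumes a hypothetical data set named people with made-up values. It also shows that the substring match is case sensitive and can occur anywhere in the value.

```sas
* Hypothetical data set, used only for illustration;
DATA people;
   input name $ 1-14;
   DATALINES;
Alice Smith
Jane Smiley
Tom Blacksmith
;
RUN;

* Question mark form of the CONTAINS operator;
PROC PRINT data = people;
   where name ? 'Smi';
RUN;

* Lowercase smi matches only the value that holds it in that exact case;
PROC PRINT data = people;
   where name contains 'smi';
RUN;
```

Here the first step should select Alice Smith and Jane Smiley, while the second should select only Tom Blacksmith.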

