Overview
In this lesson, we explore the various DATA step options that are available in SAS to control the structure and contents of a SAS data set when the input is from a SAS data set. For example, we might want to select only those observations in a SAS data set that meet a certain condition. Or, we might want to select only a subset of variables to keep in a working analysis data set.
Options illustrated in this lesson include:
- FIRSTOBS= and OBS=, to reduce the number of observations in the data set by specifying the first and last observations to read.
- DROP= and KEEP=, to reduce the number of variables in the data set.
- IN=, to create an indicator variable (0 or 1) that flags whether the current observation came from a given data set. (This is useful when merging and concatenating data sets, which we'll study in the next two lessons.)
- RENAME=, to change the name of a variable.
- WHERE=, to select observations from a SAS data set that meet a specified condition.
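As a rough preview of the syntax, the options listed above might be written as in the sketch below. It assumes the sample data set sashelp.class that ships with SAS; the output data set names (work.subset, work.teens, work.stacked) and the variable names HeightInches, Source, fromSubset, and fromTeens are hypothetical and chosen only for illustration.

```sas
/* Illustration only: sashelp.class is a sample data set shipped with SAS. */
/* All output data set names and new variable names are hypothetical.      */

/* FIRSTOBS= and OBS= read observations 3 through 10;                      */
/* KEEP= limits the variables read; RENAME= changes a variable's name.     */
data work.subset;
    set sashelp.class (firstobs=3 obs=10
                       keep=Name Sex Age Height
                       rename=(Height=HeightInches));
run;

/* WHERE= reads only the observations that meet the condition. */
data work.teens;
    set sashelp.class (where=(Age >= 13));
run;

/* IN= creates temporary 0/1 variables that flag the source data set. */
/* DROP= excludes the listed variables from the second input.         */
data work.stacked;
    set work.subset (in=fromSubset)
        work.teens  (in=fromTeens drop=Height Weight);
    length Source $ 6;
    if fromSubset then Source = 'subset';
    else if fromTeens then Source = 'teens';
run;
```

Here every option is attached to a data set named on the SET statement, so it takes effect as the data are read. The same options can also be attached to a data set named on the DATA statement, where they take effect as the data are written.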
Objectives
Upon completion of this lesson, you should be able to:
- write a SAS DATA step that correctly uses the FIRSTOBS= and/or OBS= options
- write a SAS DATA step that correctly uses the DROP= and/or KEEP= options
- explain the difference between the DROP= and KEEP= options attached to the SET statement and the DROP= and KEEP= options attached to the DATA statement (and therefore be able to choose which is more appropriate for a given situation)
- explain the difference between the DROP= and KEEP= options attached to the SET statement and the DROP and KEEP statements within a DATA step (and therefore be able to choose which is more appropriate for a given situation)
- write a SAS DATA step that correctly uses the WHERE= option
- write a SAS DATA step that uses the WHERE= option to divide a larger data set into two or more smaller data sets
- explain the difference between the WHERE= option attached to the SET statement and the WHERE= option attached to the DATA statement
- write a SAS DATA step that correctly uses the RENAME= option
- explain the difference between the RENAME= option attached to the SET statement and the RENAME= option attached to the DATA statement
- write a SAS DATA step that correctly uses the IN= option
- write a SAS DATA step that uses more than one DATA step option at a time