The IN= option tells SAS to create an "indicator variable" that takes on the value 0 or 1 depending on whether the current observation comes from the input data set to which the option is attached. If the observation does not come from that data set, the indicator variable takes on the value 0. If it does, the indicator variable takes on the value 1. The IN= option is especially useful when merging and concatenating data sets, which we'll study in the next two lessons. The basic format of the IN= option is:
IN = varname
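where varname is the name you choose for the indicator variable that SAS creates. It is not a variable that already exists in the input data set, so pick a name that the data set does not already use. Because IN= is a data set option, it goes in parentheses immediately after the name of the data set it refers to.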
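As a rough sketch of where the option goes, the following program attaches an IN= option to each of two data sets and writes the indicator values to the log. The data sets first and second, and the indicator names infirst and insecond, are made up for illustration and are not part of this lesson's data.

```sas
/* Hypothetical data sets, created inline so the sketch runs on its own */
DATA first;
    input id;
    datalines;
1
2
;
RUN;

DATA second;
    input id;
    datalines;
3
;
RUN;

/* _null_ means no output data set is created; we only write to the log */
DATA _null_;
    /* each IN= option belongs to the data set name just before it */
    set first (in=infirst) second (in=insecond);
    put id= infirst= insecond=;
RUN;
```

The log should show infirst=1 and insecond=0 for the two observations read from first, and infirst=0 and insecond=1 for the one observation read from second.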
Example 14.14
The following program illustrates using the IN= option when concatenating — that is, appending one data set to another data set. Although we'll take a closer look at concatenating two or more data sets in the next lesson, this example will give you a taste of what's to come:
DATA back9;
    set temple okla (in=okie);
    if okie = 1 then hospital = 31;
    else if okie = 0 then hospital = 23;
RUN;

PROC PRINT data=back9;
    title 'Output Dataset: BACK9';
RUN;
The SET statement, in which both data set names temple and okla appear, tells SAS to concatenate the two data sets. That is, SAS appends the data set okla to the data set temple, so that the temporary data set back9 contains 58 observations: the 6 observations from temple followed by the 52 observations from okla.
The IN= option here tells SAS to create a temporary variable called okie that takes on the value 1 if the observation came from the okla data set and 0 if it did not. Therefore, okie equals 1 for the 52 observations from okla and 0 for the 6 observations from temple. Because the indicator variable created by the IN= option is temporary, it exists only while the DATA step runs and is not written to the output data set. For example, you cannot print the indicator variable with the PRINT procedure. To get around this, you can use the temporary variable to create a permanent variable, that is, one that is kept in the output data set. In this program, the temporary variable okie is used to create the permanent variable hospital.
Now, launch and run the SAS program. Review the output from the PRINT procedure. Convince yourself that the temporary data set back9 does indeed contain 58 observations — 6 observations from temple and 52 observations from okla. Also, verify that the variable hospital was created as expected from the temporary variable okie.
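If you'd like to check the temporary nature of the indicator variable on a smaller scale, a minimal sketch along the following lines may help. The data sets small1 and small2 and the variables insmall2 and fromsmall2 are hypothetical stand-ins, not the lesson's temple and okla data.

```sas
/* Hypothetical stand-in data sets (not the lesson's temple and okla) */
DATA small1;
    input subj;
    datalines;
101
102
;
RUN;

DATA small2;
    input subj;
    datalines;
201
;
RUN;

DATA stacked;
    set small1 small2 (in=insmall2);
    /* insmall2 is temporary and is not written to stacked;   */
    /* fromsmall2 is an ordinary variable, so it is kept      */
    fromsmall2 = insmall2;
RUN;

PROC PRINT data=stacked;
    title 'Output Dataset: STACKED';
RUN;
```

The printed output should contain the columns subj and fromsmall2 but no column named insmall2, with fromsmall2 equal to 0 for the two observations from small1 and 1 for the observation from small2.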
Example 14.15
The following program illustrates a cute programming trick when using the IN= option. In an IF statement, SAS treats a numeric value as true when it is neither 0 nor missing. Because a variable created by an IN= option is always either 0 or 1, saying just "if varname" is equivalent to saying "if varname = 1", where varname is the variable name specified in the IN= option. You can use this fact to create helpful temporary variable names of the form indatasetname, such as inokie. Let's take a look:
DATA back10;
    set temple okla (in=inokie);
    if inokie then hospital = 31;
    else hospital = 23;
RUN;

PROC PRINT data=back10;
    title 'Output Dataset: BACK10';
RUN;
Do you get it? The temporary variable inokie equals 1 for records coming from the okla data set and 0 for records coming from the temple data set. The IF statement does not (or need not) say "if inokie = 1 then hospital = 31." Instead, the IF statement says the much more English-sounding "if inokie then hospital = 31." Likewise, a plain ELSE statement can replace "else if okie = 0" from the previous example, because 0 is the only other value that inokie can take.
Launch and run the SAS program. Review the output from the PRINT procedure, and convince yourself that the back10 data set has the same structure and contents as the back9 data set from the previous example.
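The same shorthand also works with NOT. The sketch below is one way to see this. It assumes two small hypothetical data sets, clinic1 and clinic2, and a made-up character variable called source, none of which appear in the lesson's data.

```sas
/* Hypothetical data sets, created inline so the sketch runs on its own */
DATA clinic1;
    input subj;
    datalines;
1
2
;
RUN;

DATA clinic2;
    input subj;
    datalines;
3
;
RUN;

DATA combined;
    set clinic1 clinic2 (in=inclinic2);
    length source $ 7;
    /* "if inclinic2" behaves like "if inclinic2 = 1" */
    if inclinic2 then source = 'clinic2';
    /* "if not inclinic2" behaves like "if inclinic2 = 0" */
    if not inclinic2 then source = 'clinic1';
RUN;

PROC PRINT data=combined;
    title 'Output Dataset: COMBINED';
RUN;
```

In the printed output, source should read clinic1 for the two observations from clinic1 and clinic2 for the one observation from clinic2. An IF-THEN/ELSE pair, as in the back10 program, does the same job with one fewer test.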