2.5 - Reading Column Input

As mentioned in the introduction to this lesson, there are three different styles of input that are available to us in SAS. They are:

  • column input, which is the most commonly used style, allows you to read data values that are entered in fixed columns.
  • list input, which allows you to read data by simply listing the variable names in the INPUT statement. At least one space (or other delimiter character) must separate consecutive values in the raw data.
  • formatted input, which allows you to read numeric data containing special characters, such as dates and dollar amounts.
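
To make the three styles concrete, here is a minimal sketch of each applied to a couple of made-up records. The data set names (col_style, list_style, fmt_style), the variable names, and the data values are all hypothetical and chosen only for illustration. The formatted-input step assumes the standard MMDDYY10. and COMMA8. informats.

```sas
/* Column input: each value occupies the same fixed columns in every record */
DATA col_style;
  input subj 1-4 height 6-7;
  DATALINES;
1024 65
1167 68
;
RUN;

/* List input: values are separated by at least one blank */
DATA list_style;
  input subj height;
  DATALINES;
1024 65
1167 68
;
RUN;

/* Formatted input: informats interpret dates and dollar amounts */
DATA fmt_style;
  input @1 subj 4. @6 visit mmddyy10. @17 fee comma8.;
  DATALINES;
1024 01/15/2020 $1,250
1167 02/03/2020 $980
;
RUN;
```

Only the first step uses the style covered in this section. The other two are shown purely for contrast.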

In this section, we will take a look at two simple examples of column input. In the next lesson, we will spend some time investigating list input and formatted input.

A few comments. For the sake of the examples that follow, we'll use the DATALINES statement to read in data. We could have just as easily used the INFILE statement to illustrate each point. Additionally, we'll create temporary data sets rather than permanent ones, even though we could have just as easily created permanent data sets to illustrate each point. Finally, after each SAS DATA step, we'll use the SAS print procedure (PROC PRINT) to print the resulting SAS data set for your perusal.
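
For comparison, here is a rough sketch of the alternatives just mentioned, using the record layout from the first example later in this section. The file path, the library name mylib, and the folder are hypothetical, so substitute locations that exist on your own system.

```sas
/* Hypothetical path: point INFILE at your own raw data file */
DATA temp;
  infile 'C:\mydata\temp.dat';
  input subj 1-4 name $ 6-23 gender 25 height 27-28 weight 30-32;
RUN;

/* Hypothetical library: a two-level name creates a permanent data set */
LIBNAME mylib 'C:\mydata';

DATA mylib.temp;
  infile 'C:\mydata\temp.dat';
  input subj 1-4 name $ 6-23 gender 25 height 27-28 weight 30-32;
RUN;
```

In both steps the INPUT statement is unchanged. Only the source of the records (INFILE rather than DATALINES) and the name of the output data set differ.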

Column input

Column input allows you to read variable values that occupy the same columns within each record. To use column input, list the variable names in the INPUT statement, immediately following each variable name with its corresponding column positions in each of the data lines. (For a character variable, place a dollar sign ($) between the variable name and its column positions.) Column input can be used whenever your raw data are in fixed columns and in standard character or numeric format. Column input reads data values until it reaches the last specified column for the field.

The important points to note about column input are:

  • When using column input, you are not required to indicate missing values with a placeholder, such as a period. That is, missing values can be left blank.
  • Column input uses the columns specified to determine the length of character variables, thereby allowing the character values to exceed the default 8 characters and to have embedded spaces.
  • Column input allows fields to be skipped altogether or to be read in any order.
  • Column input allows only part of a value to be read and allows values to be re-read.
  • Spaces are not required between the data values.
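
As a small illustration of the first and last points, the following sketch reads records in which the values run together with no separating spaces and in which one height is left blank. The data set name nospace and the records themselves are made up for this illustration.

```sas
/* Values are packed together: subj in columns 1-4, gender in 5, */
/* height in 6-7, and weight in 8-10                             */
DATA nospace;
  input subj 1-4 gender 5 height 6-7 weight 8-10;
  DATALINES;
1024165125
1167168140
11682  190
;
RUN;

PROC PRINT data=nospace;
  title 'Output dataset: NOSPACE';
RUN;
```

Because columns 6-7 are blank in the third record, height should print as a missing value (.) for that observation, while the remaining values are read from their assigned columns.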

Example 2.5

The following SAS program illustrates the simplest example of column input.

DATA temp;
  input subj 1-4 name $ 6-23 gender 25 height 27-28 weight 30-32;
  DATALINES;
1024 Alice Smith        1 65 125
1167 Maryann White      1 68 140
1168 Thomas Jones       2 68 190
1201 Benedictine Arnold 2 68 190
1302 Felicia Ho         1 63 115
  ;
RUN;


PROC PRINT data=temp;
  title 'Output dataset: TEMP';
RUN;

Output dataset: TEMP

Obs  subj  name                gender  height  weight
  1  1024  Alice Smith              1      65     125
  2  1167  Maryann White            1      68     140
  3  1168  Thomas Jones             2      68     190
  4  1201  Benedictine Arnold       2      68     190
  5  1302  Felicia Ho               1      63     115

First, inspect the SAS code to make sure you understand how to set up the INPUT statement for column input.

Then, launch and run the SAS program.

Finally, review the output from the print procedure to convince yourself that the data are read in properly.

Example 2.6

The following SAS program illustrates some of the key features of column input:

DATA temp;
  input init $ 6 f_name $ 6-16 l_name $ 18-23
        weight 30-32 height 27-28;
  DATALINES;
1024 Alice       Smith  1 65 125
1167 Maryann     White  1 68 140
1168 Thomas      Jones  2    190
1201 Benedictine Arnold 2 68 190
1302 Felicia     Ho     1 63 115
  ;
RUN;


PROC PRINT data=temp;
  title 'Output dataset: TEMP';
RUN;

Output dataset: TEMP

Obs  init  f_name       l_name  weight  height
  1  A     Alice        Smith      125      65
  2  M     Maryann      White      140      68
  3  T     Thomas       Jones      190       .
  4  B     Benedictine  Arnold     190      68
  5  F     Felicia      Ho         115      63

First, inspect the SAS code so that you can become familiar with some of the features of column input. Then, launch and run the SAS program.

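Review the output from the print procedure to convince yourself that the data are read in properly. Note that the position of the variables within the temporary data set temp corresponds to the order in which the variables appear in the INPUT statement, not the order in which the values appear in the raw data lines. Here, weight (columns 30-32) is listed before height (columns 27-28), so weight precedes height in the data set. Note also that column 6 is read twice, once for init and again as the first character of f_name, that columns 1-4 and column 25 are skipped altogether, and that the blank height in the third record is stored as a missing value.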
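
If you would like to check the variable order and the character lengths directly, one option is the CONTENTS procedure. This is a sketch that assumes the data set temp from Example 2.6 still exists in your current SAS session. The VARNUM option asks for the variables to be listed in their order within the data set rather than alphabetically.

```sas
/* List the variables of TEMP in creation order, with types and lengths */
PROC CONTENTS data=temp varnum;
  title 'Contents of dataset: TEMP';
RUN;
```

Since column input takes the length of a character variable from the number of columns specified, init should show a length of 1, f_name a length of 11, and l_name a length of 6, while the numeric variables weight and height have the default length of 8.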