Overview
In Stat 480, we learned how to read only the most basic data files into a SAS data set. In this lesson (and the next), we'll extend that knowledge by learning how to read just about any data file into SAS, no matter how messy or unstructured the input data file is. In most cases, the data files will be raw ASCII files obtained by exporting data from some other PC software.
Objectives
Upon completing this lesson, you should be able to do the following:
- read raw data separated by spaces into a SAS data set (that is, use list input)
- read raw data arranged in columns into a SAS data set (that is, use column input)
- read raw data not in standard format into a SAS data set (that is, use formatted input)
- mix list, column, and formatted input styles to read raw data into a SAS data set
- determine when list input, column input, formatted input, or some combination of the three styles should be used to read in a raw data file
- understand that the lengths of numeric variables are set to 8 by default and therefore do not necessarily coincide with the widths of the numeric informats used in an INPUT statement
- know the difference between fixed-length record data files and variable-length record data files
- know when it is appropriate, and how, to use the INFILE statement's PAD option
- know when it is appropriate, and how, to use the INFILE statement's MISSOVER option
- know when it is appropriate, and how, to use the INFILE statement's DLM= option
- know when it is appropriate, and how, to use the INFILE statement's DSD option
- know when it is appropriate, and how, to use the INFILE statement's FIRSTOBS= option
- know how to read missing values when using list input
- know when it is appropriate, and how, to specify a range of numeric or character variables in the INPUT statement
- know how to use the LENGTH statement to modify the length of a character or numeric variable
- use the ampersand (&) modifier with list input to read character values that contain embedded blanks
- use the colon (:) modifier with list input to read nonstandard data values and character values that are longer than eight characters, but which have no embedded blanks
- know that with formatted input, the informat determines both the length of character variables and the number of columns that are read
- know that the informat in modified list input determines only the length of the modified variable, not the number of columns that are read
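To make the three input styles named in the objectives concrete, here is a minimal sketch that mixes them in a single INPUT statement. The data set name (scores), the variable names, and the two data lines are all made up for illustration. They are assumptions and do not come from the lesson's own examples.

```sas
DATA scores;                       /* hypothetical data set name */
   INPUT name $ 1-8                /* column input: characters in columns 1-8 */
         @10 testdate mmddyy10.    /* formatted input: 10 columns starting at column 10 */
         score;                    /* list input: next blank-delimited value */
   FORMAT testdate mmddyy10.;      /* display the stored SAS date value as mm/dd/yyyy */
   DATALINES;
Smith    01/15/2008 87
Jones    02/03/2008 92
;
RUN;

PROC PRINT DATA = scores;
RUN;
```

Column input suits name because the values sit in fixed columns. Formatted input suits testdate because a date such as 01/15/2008 is not a standard numeric value and needs an informat. List input is enough for score because it is a standard number separated from the previous field by a blank.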