2.4 - Reading From a Raw Data File

Thus far, we've only looked at examples in which we've read instream data into SAS data sets. Now, let's direct our attention to learning how to read data from a raw data file into SAS data sets.

A raw data file is an external text file whose records contain data values that are arranged in fields. Typical filename extensions of raw data files are .dat and .txt. A raw data file (also commonly called an ASCII file) is the kind of data file that you can open and read in a plain text editor such as Notepad or WordPad. Data stored in spreadsheet files, such as Microsoft Excel workbooks, are binary files, not raw ASCII data files.

Example 2.3

The following SAS program illustrates how to create a temporary SAS data set called temp3 by using column input to read in data stored in a raw data file called temp3.dat:

The following SAS program illustrates the simplest example of column input.

DATA temp3;
  infile 'C:\stat480\data\temp3.dat';
  input subj 1-4 gender 6 height 8-9 weight 11-13;
RUN;

PROC PRINT data=temp3;
  title 'Output dataset: TEMP3';
RUN;

Notice that the INFILE statement, which must precede the INPUT statement, replaces the DATALINES statement and the data that appeared in the previous two examples. The INFILE statement tells SAS where the raw data file is stored on your computer. The name and location of the raw data file must appear in quotation marks (single or double) and immediately follow the INFILE keyword. In this case, our raw data file temp3.dat is stored in the directory C:\stat480\data\. If you open the file in a text editor, you will see that the data values are the only items stored in it.
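To make the contrast with instream data concrete, the sketch below reads records first with DATALINES and then with INFILE. The two instream records are made up for illustration (they are not taken from temp3.dat), and the data set names instream_demo and file_demo are hypothetical. The INPUT statement is identical in both DATA steps; only the source of the records changes.

```sas
/* How the column ranges line up with a record (made-up values): */
/*   columns: 1234567890123                                      */
/*   record:  1024 1 65 125                                      */
/*   subj = cols 1-4, gender = col 6,                            */
/*   height = cols 8-9, weight = cols 11-13                      */

/* Instream version: the records live inside the program. */
DATA instream_demo;
  input subj 1-4 gender 6 height 8-9 weight 11-13;
  DATALINES;
1024 1 65 125
1167 1 68 140
;
RUN;

/* Raw-file version: same INPUT statement, but INFILE names the */
/* external file, so no DATALINES statement or data lines appear. */
DATA file_demo;
  infile 'C:\stat480\data\temp3.dat';
  input subj 1-4 gender 6 height 8-9 weight 11-13;
RUN;
```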

In order to run this program, first save the temp3.dat file to a convenient location on your computer. Then, edit the INFILE statement as necessary to reflect the correct location. Finally, launch and run the SAS program. Review the log and output windows to convince yourself that the data were properly read into the temporary data set temp3.
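If the log complains that the file cannot be found, the path in the INFILE statement is the usual culprit. One way to check a path before reading any data is the FILEEXIST function. The following is a minimal sketch that assumes the file was saved to the location used above; substitute your own path. The messages written by the PUT statements are our own wording, not messages generated by SAS.

```sas
/* DATA _NULL_ runs the step without creating a data set.  */
/* FILEEXIST returns 1 if the physical file exists, else 0. */
/* PUT with no FILE statement writes to the SAS log.        */
DATA _NULL_;
  if fileexist('C:\stat480\data\temp3.dat') then
    put 'NOTE: temp3.dat was found.';
  else
    put 'WARNING: temp3.dat was not found; check the path.';
RUN;
```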

Instead of identifying the raw data file by specifying the entire filename and location in the INFILE statement, we can alternatively use what is called a fileref (for file reference). Just as we use a LIBNAME statement to assign a libref, we use a FILENAME statement to assign a fileref. Filerefs perform the same function as librefs. That is, they temporarily point to a storage location for data. However, librefs point to SAS data libraries, whereas filerefs point to external data files.
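Placed side by side, the two statements look very similar. In this short sketch, the libref mylib is a hypothetical name, and the folder is assumed to be the one used throughout this section.

```sas
/* Libref: points to a SAS data library (here, a folder). */
/* mylib is a hypothetical name.                          */
LIBNAME mylib 'C:\stat480\data';

/* Fileref: points to one external raw data file. */
FILENAME patients 'C:\stat480\data\temp3.dat';
```

A libref is then used as the first part of a two-level SAS data set name, such as mylib.temp3, whereas a fileref is used in place of the quoted path in an INFILE statement.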

Example 2.4

The following SAS program illustrates the use of a fileref in the INFILE statement, in conjunction with a FILENAME statement, to read data stored in a raw data file called temp3.dat to create a temporary SAS data set called temp4:


FILENAME patients 'C:\stat480\data\temp3.dat';
DATA temp4;
  infile patients;
  input subj 1-4 gender 6 height 8-9 weight 11-13;
RUN;
PROC PRINT data = temp4;
  title 'Output dataset: TEMP4';
RUN;

The FILENAME statement in our program assigns the fileref patients to our temp3.dat file stored in our C:\stat480\data folder. In general, the fileref can be a nickname of our choosing, as long as it is between 1 and 8 characters long, begins with a letter or underscore, and contains only letters, numbers, or underscores. The specification of the physical name of the file, of course, adheres to the conventions of the Windows operating system.
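The naming rules can be illustrated with a few examples. In the sketch below, _raw1 and the invalid names are hypothetical and chosen only to exercise the rules. The last two statements use the LIST and CLEAR options of the FILENAME statement, which are not part of the example above, to show which filerefs are currently assigned and to release one.

```sas
/* Valid filerefs: 1 to 8 characters, starting with a letter */
/* or underscore, then letters, numbers, or underscores.     */
FILENAME patients 'C:\stat480\data\temp3.dat';
FILENAME _raw1    'C:\stat480\data\temp3.dat';

/* Invalid filerefs, shown only as comments so the program still runs:
   2patient     begins with a digit
   patientdata  longer than 8 characters
   my-file      contains a hyphen                                      */

FILENAME _ALL_ LIST;    /* write the currently assigned filerefs to the log */
FILENAME _raw1 CLEAR;   /* release a fileref that is no longer needed       */
```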

If you haven't already done so for the previous example, save the temp3.dat file to a convenient location on your computer. Then, edit the FILENAME statement as necessary to reflect the correct location. Finally, launch and run the SAS program. Review the log and output windows to convince yourself that the data were properly read into the temporary data set temp4.