This section illustrates the kinds of messages you might see in the log window when your raw input data does not contain placeholders for missing values. The note "SAS went to a new line when INPUT statement reached past the end of a line" looks rather innocent, but its presence can signal a problem with your input data. The note means that, as SAS was reading your data, it reached the end of a data line before it had read values for all of the variables named in your INPUT statement. When this happens, by default, SAS proceeds to the next line of data to get values for the remaining variables. In many cases that is not how you'd like SAS to behave, so you need to compare your input data and output data set carefully to make sure the data were read in properly.
Example 8.10
The following example shows what can happen if you are using list input, and your data file doesn't contain periods as placeholders for numeric missing values:
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
input treeID circ_in hght_ft crown_ft;
DATALINES;
101 222 105 112
102 149 138
103 258 80 70
104 187 91
105 210 99 74
106 229 127 104
;
RUN;
PROC PRINT data = trees;
title 'Tree data';
RUN;
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
input treeID circ_in hght_ft crown_ft;
DATALINES;
NOTE: SAS went to a new line when INPUT statement reached past the end
of a line.
NOTE: The data set WORK.TREES has 4 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
;
RUN;
PROC PRINT data = trees;
title 'Tree data';
RUN;
NOTE: There were 4 observations read from the data set WORK.TREES.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
First, review the data and note that the records for trees 102 and 104 have no value for the crown_ft variable, and no periods serve as placeholders for the missing values. Then, launch and run the SAS program, and review the log window to see the note SAS displays in this situation. If you review the output from the PRINT procedure:
Obs | treeID | circ_in | hght_ft | crown_ft |
---|---|---|---|---|
1 | 101 | 222 | 105 | 112 |
2 | 102 | 149 | 138 | 103 |
3 | 104 | 187 | 91 | 105 |
4 | 106 | 229 | 127 | 104 |
you can see the effect on the output data set of SAS going to a new line to find the missing data. For example, for tree 102, when SAS went to the next line to look for the crown_ft value, it found the value 103, which was supposed to be the ID number for the next tree. Oh, if SAS could only read our mind!
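One way to catch this kind of problem before it reaches the output data set is to scan the raw records for lines that hold fewer values than the INPUT statement expects. The following is a minimal sketch and not part of the original example: it assumes the values are separated by blanks and that four values are expected on every record, and the variable name nvalues is made up for illustration.

```sas
DATA _NULL_;
   INFILE DATALINES;
   INPUT;                             /* load the record into _INFILE_ without reading variables */
   nvalues = COUNTW(_INFILE_, ' ');   /* count the blank-separated values on the record */
   IF nvalues NE 4 THEN DO;           /* assumption: four values expected per record */
      PUT 'Short record on line ' _N_;
      PUT _INFILE_;                   /* echo the offending record to the log */
   END;
   DATALINES;
101 222 105 112
102 149 138
103 258 80 70
104 187 91
105 210 99 74
106 229 127 104
;
RUN;
```

For the tree data, this step should write the records for trees 102 and 104 (lines 2 and 4) to the log, since those are the only records with three values instead of four.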
Example 8.11
One way to keep SAS from going to the next line to look for missing data values is to insert periods (.) as missing-value placeholders. That solution works for this small data set, but it is impractical for a large data set with thousands of records. In that case, the simplest way to prevent SAS from going to a new line looking for data is to use the MISSOVER option of the INFILE statement. The MISSOVER option tells SAS to assign missing values to any variables for which the current line has no data, instead of proceeding to the next line to look for the values. The following example illustrates using the MISSOVER option:
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
INFILE DATALINES MISSOVER;
input treeID circ_in hght_ft crown_ft;
DATALINES;
101 222 105 112
102 149 138
103 258 80 70
104 187 91
105 210 99 74
106 229 127 104
;
RUN;
PROC PRINT data = trees;
title 'Tree data';
RUN;
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
INFILE DATALINES MISSOVER;
input treeID circ_in hght_ft crown_ft;
DATALINES;
NOTE: The data set WORK.TREES has 6 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
;
RUN;
PROC PRINT data = trees;
title 'Tree data';
RUN;
NOTE: There were 6 observations read from the data set WORK.TREES.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
First, note that the only thing that differs between this program and the previous one is the presence of the INFILE statement with the MISSOVER option. Then, launch and run the SAS program, and review the output from the PRINT procedure to confirm that SAS correctly reads in the data when the MISSOVER option is invoked:
Obs | treeID | circ_in | hght_ft | crown_ft |
---|---|---|---|---|
1 | 101 | 222 | 105 | 112 |
2 | 102 | 149 | 138 | . |
3 | 103 | 258 | 80 | 70 |
4 | 104 | 187 | 91 | . |
5 | 105 | 210 | 99 | 74 |
6 | 106 | 229 | 127 | 104 |
You might also want to take a look at the log window to verify that this time SAS did not display a NOTE about going to a new line when it reached past the end of a line.
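For comparison, the placeholder approach mentioned earlier can be sketched as follows. This is an illustrative variant rather than part of the original examples: the data set name trees_dots is hypothetical, and the PROC MEANS step is added only as one possible way to confirm how many values SAS treated as missing.

```sas
DATA trees_dots;
   INPUT treeID circ_in hght_ft crown_ft;
   DATALINES;
101 222 105 112
102 149 138 .
103 258 80 70
104 187 91 .
105 210 99 74
106 229 127 104
;
RUN;

/* N = number of nonmissing values, NMISS = number of missing values */
PROC MEANS DATA = trees_dots N NMISS;
   VAR circ_in hght_ft crown_ft;
RUN;
```

With the periods in place, no INFILE statement is needed, and the PROC MEANS output should report two missing values for crown_ft and none for the other variables. Running the same PROC MEANS step on the data set created with MISSOVER is a quick way to check that both approaches read the data the same way.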