8.7 - Missing Values Generated

This section illustrates the kinds of messages you might see in the log window when you perform calculations using variables that contain missing values for some of the observations in your data set. In such a case, SAS displays a message along the lines of "Missing values were generated as a result of performing an operation on missing values." This behavior is not always a problem. Your data may contain legitimate missing values, in which case setting the new variable to missing is exactly what you want SAS to do. On the other hand, the missing values may result from an error, in which case you need to fix either your program or your data. Therefore, whenever you receive the "missing values generated" note, it is a good idea to take the time to play detective and verify that your program is behaving the way you intend.
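When missing inputs are legitimate, one way to make that intent explicit is to perform the calculation only when every input is present. The following is a minimal sketch rather than part of the examples in this section: the data set name rectangles and the variables len, wid, and area are hypothetical. Because the multiplication is skipped for incomplete rows, SAS should leave area missing for those rows without writing the "missing values generated" note.

```sas
/* Hypothetical data set and variable names, for illustration only */
DATA rectangles;
    input len wid;
    /* NMISS counts the missing numeric arguments;        */
    /* the product is computed only when none are missing */
    if nmiss(len, wid) = 0 then area = len*wid;
    DATALINES;
3 4
. 5
;
RUN;
PROC PRINT data = rectangles;
    title 'Area computed only for complete rows';
RUN;
```

A guard like this is only appropriate once you have confirmed that the missing inputs are expected. If they might signal a data problem, the note is useful and should not be silenced.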

Example 8.12

The following example illustrates how SAS propagates missing values. That is, for some calculations, SAS assigns a variable a missing value if any of the values contributing to the calculation are missing. In this example, SAS generates missing values when attempting to calculate the volume of the tree when either the height or the circumference is missing:

OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    DATALINES;
oak, black        222   . 112
hemlock, eastern  .   138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;
PROC PRINT data = trees;
    title 'Tree data';
RUN;

   OPTIONS PS = 58 LS = 72 NODATE NONUMBER;

   DATA trees;
       input type $ 1-16 circ_in hght_ft crown_ft;
       volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
       DATALINES;

NOTE: Missing values were generated as a result of performing an
    operation on missing values.
    Each place is given by: (Number of times) at (Line):(Column).
    1 at 73:20   1 at 73:40   1 at 73:48
NOTE: The data set WORK.TREES has 6 observations and 5 variables.
NOTE: DATA statement used (Total process time):
    real time           0.00 seconds
    cpu time            0.01 seconds

   ;
   RUN;

   PROC PRINT data = trees;
       title 'Tree data';
   RUN;

NOTE: There were 6 observations read from the data set WORK.TREES.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

First, review the data and note that the first tree is missing a value for hght_ft and the second tree is missing a value for circ_in. Then, launch and run the SAS program, and review the log window to see the "missing values generated" message SAS displays in this situation. Note that the message also reports how many times, and at which line and column of the program, a missing value was generated. If you review the output from the PRINT procedure:

                          Tree data

Obs    type                circ_in    hght_ft    crown_ft     volume

  1    oak, black              222          .         112          .
  2    hemlock, eastern          .        138          52          .
  3    ash, white              258         80          70    27.6890
  4    cherry, black           187         91          75    16.5464
  5    maple, red              210         99          74    22.7014
  6    elm, american           229        127         104    34.6300

you can see that SAS did indeed assign a missing value to the volume variable for the first two observations.
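Propagation applies to arithmetic operators such as +, * and **. Some SAS functions behave differently. The sketch below contrasts the + operator with the SUM function, which ignores missing arguments. The data set name scores and the variables quiz1, quiz2, total_op, and total_fn are hypothetical and are not part of the tree example.

```sas
/* Hypothetical data set and variable names, for illustration only */
DATA scores;
    input quiz1 quiz2;
    total_op = quiz1 + quiz2;      /* missing if either quiz is missing */
    total_fn = sum(quiz1, quiz2);  /* adds only the nonmissing values   */
    DATALINES;
10 20
 . 15
;
RUN;
PROC PRINT data = scores;
    title 'Operator versus SUM function';
RUN;
```

For the second observation, total_op is missing while total_fn is 15. Whether ignoring a missing value is appropriate depends on the calculation. For the tree volume it would not be, because a volume cannot be computed without both the height and the circumference.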

Example 8.13

When you are working with a large data set, it can be difficult to locate every observation for which a calculation generated a missing value. In that case, you'll probably want to use a subsetting IF statement to find the missing values. The following example illustrates using a subsetting IF statement to keep only the observations that are assigned a missing value for the newly calculated variable volume:

OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    if volume = .;
    DATALINES;
oak, black        222   . 112
hemlock, eastern  .   138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;
PROC PRINT data = trees;
    title 'Trees with Missing Volumes';
RUN;

                  Trees with Missing Volumes

Obs    type                circ_in    hght_ft    crown_ft     volume

  1    oak, black              222          .         112          .
  2    hemlock, eastern          .        138          52          .

First, note that the only thing that differs between this program and the previous one is the presence of the IF statement. Because the IF statement is a subsetting IF, the trees data set now contains only the observations for which the condition is true. Then, launch and run the SAS program, and review the output from the PRINT procedure to see the two observations for which volume is missing.
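If you want to list the problem observations without dropping the others from the data set, one alternative is to leave the DATA step as it was in Example 8.12 and filter inside the PRINT procedure instead. The following is a sketch of that approach, not the method used in the example above. It uses a WHERE statement with the MISSING function, which also catches special missing values such as .A that the comparison volume = . would not.

```sas
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    DATALINES;
oak, black        222   . 112
hemlock, eastern  .   138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;

/* trees still holds all six observations; only the listing is filtered */
PROC PRINT data = trees;
    where missing(volume);
    title 'Trees with Missing Volumes';
RUN;
```

With this version the printed listing should contain the same two trees, while the full data set remains available for later steps.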

