7.4 - DATA Step Debugger

In this section, we'll introduce you to a tool called the DATA step debugger, which some SAS programmers use to find errors that occur in the execution phase of their programs. In the end, you too may want to use it to debug your programs. It is important to remember, though, that the DATA step debugger works only at execution time. That means you can't use it to find compile-time errors such as missing semicolons. Our main purpose in investigating the debugger now is to get a real-time, behind-the-scenes illustration of the execution phase when SAS encounters a DATA step that reads in raw data.

Example 7.5

The following DATA step is identical to the one that appears in Example 7.3, except here the DATA step debugger has been invoked by adding the DEBUG option to the end of the DATA statement:

DATA trees / DEBUG;
    input type $ 1-16 circ_in hght_ft crown_ft;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;

Launch and run the SAS program. Upon doing so, you should see two windows, the DEBUGGER SOURCE window and the DEBUGGER LOG window, open up:

If the DEBUGGER LOG window is not stacked on top of the DEBUGGER SOURCE window, as illustrated above, try selecting Window > Resize in the SAS menu to stack them that way. Notice also that once you invoke the DATA step debugger, as we just have, some new menu options appear along the top of your screen. The View, Run, and Breakpoint menus contain debugger commands.

Now that we have the debugger up and running, we are going to step through the program line by line, along the way asking SAS to display the values of all of the variables in the program data vector. To begin, activate the DEBUGGER LOG window by clicking on it once. You'll know that it is activated when its border becomes a bright blue as opposed to a faded blue. Let's start by asking SAS to show us the values of the variables in the program data vector. To do so, type EXAMINE _ALL_ on the command line at the bottom of the DEBUGGER LOG window, and then press your Enter key. (The command line is the area under the dashed line and immediately following the greater than (>) sign.) Upon doing so, the values of the five user-defined variables and the two automatic variables appear in the DEBUGGER LOG window:

It's exactly what we should expect to see at the beginning of the execution phase of the DATA step. The automatic variable _N_ is set to 1, _ERROR_ is set to 0, and the user-defined variables are each set to missing. Now, in order to tell SAS to execute the INPUT statement, either type Step on the command line and press the Enter key once, or simply press the Enter key once. You should see the command in the DEBUGGER LOG window:

and you should see that SAS has advanced its processing one line in the DEBUGGER SOURCE window. Now that SAS has processed the INPUT statement for the first data record, let's see the values of the variables in the program data vector by again typing EXAMINE _ALL_ on the command line and pressing Enter once:

No surprises there, eh? SAS has read in the four data values for type, circ_in, hght_ft, and crown_ft. The value for volume remains missing because SAS hasn't yet executed the assignment statement. Let's tell SAS to do that by advancing its processor one line by pressing the Enter key once. Then type EXAMINE _ALL_ on the command line and press the Enter key again:

SAS advanced the processor one line as we requested, and there's that 26.9075 value that we were expecting volume to be assigned for the first observation. Now, here's the part that really illustrates how the DATA step works like a loop, repetitively executing statements to read data values and create observations one by one. Advance the processor another step by pressing the Enter key once. There you have it ... SAS moves the processor back up to the INPUT statement at the beginning of the DATA step:

so that it is ready to create the next observation. Taking a look at the contents of the program data vector by typing EXAMINE _ALL_ on the command line, and pressing the Enter key:

you should not be surprised to see that SAS increased the value of _N_ to 2, retained the _ERROR_ value of 0, and reset the user-defined variables to missing.
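
The 26.9075 value that the debugger displayed for the first observation can also be checked outside the debugger. The following is only a minimal sketch, not part of the original example: it assumes the first record's values from the DATALINES block (circ_in = 222 and hght_ft = 105), uses DATA _NULL_ so that no data set is created, and uses PUTLOG to write the result to the log.

```sas
DATA _NULL_;
    /* Values copied by hand from the "oak, black" record */
    circ_in = 222;
    hght_ft = 105;
    /* Same assignment statement as in the trees DATA step */
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    /* Write the result to the log with four decimal places */
    putlog volume= 8.4;
RUN;
```

By hand, 0.319*105 = 33.495 and 0.0000163*222**2 = 0.8033292, and their product is about 26.9075, which agrees with the value reported by the debugger.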

Are you getting tired of this? You should be getting the idea by now. We are merely using the DATA step debugger to take a behind-the-scenes look at the execution phase as we described it in the last section. If you find that you are still learning something from this exercise, you can continue to alternate between advancing the processor and examining the variable values. Alternatively, you can move the process along by pressing the Enter key 16 times (I think) until SAS displays a message indicating that the DATA STEP program has completed execution:

Now, if you EXAMINE _ALL_ the variables, you can see that the automatic variable _N_ has been increased to 7. Because there are only 6 records in the input raw data, there are no more records to read. SAS has thus completed the execution phase of the DATA step. Let's have you quit the debugger then by typing Quit on the command line and pressing the Enter key, or by selecting the Run menu and then selecting Quit debugger.

Incidentally, in the future, rather than typing EXAMINE _ALL_ on the command line, you could select the View menu, and then Examine values..., and then type _ALL_ in the first box, and select OK. And, rather than typing Step on the command line, and pressing the Enter key, you could select the Run menu, and then Step.
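
The same loop can also be traced without the interactive debugger. The following is a rough sketch rather than part of the original example: it assumes that writing the program data vector to the log with PUTLOG _ALL_ is an acceptable stand-in for typing EXAMINE _ALL_, and the data set name trees_trace and the message text are made up for illustration.

```sas
DATA trees_trace;   /* hypothetical name, so that trees is not overwritten */
    /* Program data vector at the top of each iteration */
    putlog 'Top of iteration: ' _ALL_;
    input type $ 1-16 circ_in hght_ft crown_ft;
    /* Program data vector after the raw data record has been read */
    putlog 'After INPUT:      ' _ALL_;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    /* Program data vector after the assignment statement */
    putlog 'After assignment: ' _ALL_;
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;
```

After the program runs, the log should contain three lines for each of the six records. On the seventh iteration you should see only the first message, with _N_ equal to 7, since the INPUT statement then finds no more records and the DATA step ends.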

Example 7.6

At first glance, you might look at the following program and think that it is identical to the one in the previous example. If you look closely at the zero that appears in the value 105 in the "oak, black" record, however, you might notice that it is a little rounder than other zeroes that you've seen. That's because it is the letter O and not the number 0. When we ask SAS to execute the program, we should therefore expect an error when SAS tries to read the character value "1O5" in the first record for the numeric variable hght_ft. Let's use the DATA step debugger again to see the behind-the-scenes execution of this program:

DATA trees / DEBUG;
    input type $ 1-16 circ_in hght_ft crown_ft;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    DATALINES;
oak, black        222 1O5 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;

Launch and run the SAS program. Upon doing so, you should see the DEBUGGER SOURCE and DEBUGGER LOG windows open again. Again, if the DEBUGGER LOG window is not stacked on top of the DEBUGGER SOURCE window, select Window > Resize in the SAS menu in order to get the windows stacked.

Typing EXAMINE _ALL_ on the command line and pressing the Enter key, we see:

the initialization of the program data vector that we'd expect. Pressing the Enter key once to tell SAS to execute the INPUT statement, and then asking SAS to display the values in the program data vector by typing EXAMINE _ALL_ and pressing the Enter key, we see:

Ahaaaa! As expected, SAS is unable to read in the (mistaken) character value of "1O5" for the numeric hght_ft variable, and so SAS sets its value to missing. Upon encountering the error, SAS changes the value of the automatic variable _ERROR_ to 1.
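
The _ERROR_ flag can also be put to work in an ordinary (non-debug) run. The following is a minimal sketch, assuming we simply want to have the offending raw record written to the log; the data set name trees_check and the message text are made up for illustration, and only the first two records are included to keep the example short.

```sas
DATA trees_check;   /* hypothetical name */
    input type $ 1-16 circ_in hght_ft crown_ft;
    /* _ERROR_ is 1 if SAS hit invalid data on this iteration */
    if _ERROR_ then do;
        putlog 'Invalid data detected: ' _N_= hght_ft=;
        /* _INFILE_ holds the raw record that was just read */
        putlog _INFILE_;
    end;
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    DATALINES;
oak, black        222 1O5 112
hemlock, eastern  149 138  52
;
RUN;
```

SAS writes its own note about the invalid data to the log as well, so a check like this is mostly useful when you want a custom message or want to take some other action for the problem records.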

Upon advancing the processor one line and examining the values of the variables, we see that the value of volume remains missing, because SAS can't calculate it without the hght_ft value:

Okay, one more time ... upon advancing the processor another step and examining the values of the variables, we see that SAS pops back up to the top of the DATA step to begin processing the second record:

As you can see, SAS increased the automatic variable _N_ to 2, reset _ERROR_ to 0 (as no errors have yet been detected while processing the second observation), and reset the user-defined variables to missing. That's all that I really wanted to illustrate here. If you find it educational, you can continue to alternate between advancing the processor and examining the variable values. Or, you can quit like I'm going to by typing Quit on the command line and pressing the Enter key.
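
The missing volume value seen in this example reflects a general rule: when an operand of an arithmetic operator is missing, the result is missing. The following is a small sketch of that rule, not part of the original example; it assumes the first record's circ_in value of 222 and sets hght_ft to missing by hand to mimic the failed read.

```sas
DATA _NULL_;
    circ_in = 222;
    /* Mimic the value that SAS assigned after the invalid "1O5" */
    hght_ft = .;
    /* Any arithmetic involving a missing operand yields missing */
    volume = (0.319*hght_ft)*(0.0000163*circ_in**2);
    putlog volume=;
RUN;
```

The log should show volume as missing (a period), along with a note from SAS saying that missing values were generated as a result of performing an operation on missing values.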

