7.1 - Processing SAS Programs

To write SAS programs that work well, we need to fully understand how SAS processes a program after you've submitted it by clicking on that little running man icon. By now, you know that SAS first reads the program, from top to bottom and left to right, looking for syntax errors such as missing semicolons, misspelled keywords, and invalid variable names. In reality, SAS processes the program chunk by chunk or, perhaps more accurately, step by step.

DATA and PROC statements signal the beginning of a new step. When SAS encounters:

  • a subsequent DATA or PROC statement,
  • a RUN statement (for DATA steps and most procedures),
  • or a QUIT statement (for some procedures)

SAS stops reading statements looking for errors and executes the step it has just read. Each time a step is executed, SAS:

  • writes messages about the processing activities to the log window, and
  • displays the results of the processing, if any, in the output window.

You can see that SAS works this way by reviewing the log window after submitting a program.
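As a minimal sketch of where those step boundaries fall, the short program below marks each one in comments. The data set name work.demo is a hypothetical name chosen for illustration, and sashelp.class is a sample data set that is assumed to be available in your SAS installation:

```sas
/* Step 1 begins at the DATA statement */
DATA work.demo;
    set sashelp.class;
RUN;    /* RUN ends the DATA step, so SAS executes it here */

/* Step 2 begins at the PROC statement and has no RUN of its own */
PROC SORT data = work.demo;
    by age;

/* This PROC statement ends the SORT step and begins step 3 */
PROC PRINT data = work.demo;
RUN;    /* RUN ends the PRINT step */

/* Step 4: PROC DATASETS is one of the procedures that ends with QUIT */
PROC DATASETS library = work nolist;
    delete demo;
QUIT;   /* QUIT runs any pending statements and ends the procedure */
```

After submitting a program like this one, the log should contain one group of notes per step, in the order in which the steps appear.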

Example 7.1

The following SAS program merely creates a simple data set called trees, sorts the data set by tree type, and then prints the data set:

OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
PROC SORT data = trees;
   by type;
RUN;
PROC PRINT data = trees;
  title 'Tree data';
  id type;
RUN;

Tree data

Obs  type              circ_in  hght_ft  crown_ft
  1  ash, white            258       80        70
  2  cherry, black         187       91        75
  3  elm, american         229      127       104
  4  hemlock, eastern      149      138        52
  5  maple, red            210       99        74
  6  oak, black            222      105       112

You can go ahead and launch and run the SAS program and review the output, but what we ultimately care about in this discussion is what shows up in the log window:

    OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
    DATA trees;
        input type $ 1-16 circ_in hght_ft crown_ft;
        DATALINES;

NOTE: The data set WORK.TREES has 6 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds


   ;
   PROC SORT data = trees;
      by type;
   RUN;

NOTE: There were 6 observations read from the data set WORK.TREES.
NOTE: The data set WORK.TREES has 6 observations and 4 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


   PROC PRINT data = trees;
NOTE: Writing HTML Body file: sashtml.htm
     title 'Tree data';
     id type;
   RUN;

The most important thing to note about the contents of the log window is that SAS first displayed messages about the DATA step, then messages about the SORT procedure step, and finally messages about the PRINT procedure step. That's because SAS first read the DATA step looking for errors and, having deemed the step error-free, executed it when it reached the end of the step. (This DATA step has no RUN statement of its own. The DATALINES statement and the in-stream data that follow it mark its end, which is why the DATA step notes appear right after DATALINES in the log.) Then SAS read the SORT procedure looking for errors and, having deemed the step error-free, executed the SORT procedure when it encountered the RUN statement. Finally, SAS read the PRINT procedure looking for errors and, having deemed the step error-free, executed the PRINT procedure when it encountered the RUN statement. It is in this manner that SAS processes programs step by step.
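Notice that the DATA step in Example 7.1 has no RUN statement of its own. Many programmers add one after the data lines so that every step ends explicitly. A sketch of that variant follows; apart from the added RUN statement, it reuses the program from Example 7.1 unchanged:

```sas
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;    /* added: explicit end of the DATA step */

PROC SORT data = trees;
    by type;
RUN;    /* end of the SORT step */

PROC PRINT data = trees;
    title 'Tree data';
    id type;
RUN;    /* end of the PRINT step */
```

The resulting data set and report should be the same as before. The benefit is readability: each step in the program now has a visible ending.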

There is also something to notice about the output that our program generated. Although we submitted three different steps (the DATA step, the SORT procedure, and the PRINT procedure), SAS generated only one piece of output. That's because some steps don't generate output, but rather just produce messages in the log window. In this case, the DATA step and the SORT procedure each produced messages in the log window, but neither created a report or other output. Only the PRINT procedure both produced messages in the log window and created a report in the output window.

It is the messages contained in the log window that will be the focus of our attention for this lesson and the next. It is those messages that will help us learn how to write programs that work. Let's take a look at one more example.

Example 7.2

The following SAS program merely creates a simple data set called trees, attempts to sort the data set by tree height, and then prints the data set:

OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
PROC SORT data = trees;
   by height;
RUN;
PROC PRINT data = trees;
  title 'Tree data again';
  id type;
RUN;

Tree data again

type              circ_in  hght_ft  crown_ft
oak, black            222      105       112
hemlock, eastern      149      138        52
ash, white            258       80        70
cherry, black         187       91        75
maple, red            210       99        74
elm, american         229      127       104

If you review the program carefully, you'll see that it shouldn't work as we intended, because we attempted to sort the trees data set by the incorrect variable name height rather than by the correct variable name hght_ft. Let's go ahead and launch and run the SAS program and then take a look at the messages displayed in the log window:

   OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
   DATA trees;
       input type $ 1-16 circ_in hght_ft crown_ft;
       DATALINES;

NOTE: The data set WORK.TREES has 6 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


   ;
   PROC SORT data = trees;
      by height;
ERROR: Variable HEIGHT not found.
   RUN;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds



   PROC PRINT data = trees;
     title 'Tree data again';
     id type;
   RUN;

Once more, this example illustrates how SAS processes one step at a time. As you can see from the messages displayed in the log, the DATA step successfully creates a temporary data set named trees containing four variables and six observations. Upon completing the DATA step, SAS moves on to read the SORT procedure looking for errors. When SAS discovers that we are attempting to sort the trees data set by a variable, height, that doesn't exist in the data set, it displays an ERROR message in the log window. Because SAS can't even begin to execute the SORT procedure without knowing the correct variable, it stops processing the SORT step and says so in the log window. SAS then moves on to read the PRINT procedure looking for errors. Because the PRINT procedure is error-free and the previous error doesn't prevent it from succeeding, SAS executes the PRINT procedure. That is why the report above lists the trees in their original, unsorted order.
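For comparison, a sketch of a corrected program follows. It assumes the intent was to sort by tree height, so the BY statement names hght_ft. It also adds a RUN statement after the data lines and a PROC CONTENTS step, neither of which is part of the original example. The PROC CONTENTS step is one way to check the variable names in a data set before sorting:

```sas
OPTIONS PS = 58 LS = 72 NODATE NONUMBER;
DATA trees;
    input type $ 1-16 circ_in hght_ft crown_ft;
    DATALINES;
oak, black        222 105 112
hemlock, eastern  149 138  52
ash, white        258  80  70
cherry, black     187  91  75
maple, red        210  99  74
elm, american     229 127 104
;
RUN;

/* Optional check (an addition): list the variables stored in WORK.TREES */
PROC CONTENTS data = trees;
RUN;

/* Sort by a variable that actually exists in the data set */
PROC SORT data = trees;
    by hght_ft;
RUN;

PROC PRINT data = trees;
    title 'Tree data again';
    id type;
RUN;
```

With the variable name corrected, the SORT step should run without the ERROR message, and the printed report should list the trees from shortest to tallest.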

