10.4 - Using Order Variables

10.4 - Using Order Variables

Thus far, we have used the DEFINE statement to change the appearance of our reports, doing such things as specifiying the format of a variable, changing the width of a column, specifying column labels, and changing the justification of a column. The truth is that the DEFINE statement is even more powerful when it is used to tell SAS how you want SAS to use each variable that appears in your report. There are six possible usages — DISPLAY, ANALYSIS, ORDER, GROUP, ACROSS, and COMPUTED — which we'll investigate in the course of the next five sections of this lesson. By default, the REPORT procedure uses:

  • character variables as display variables
  • numeric variables as analysis variables, which are used to calculate the SUM statistic

Because we haven't explicitly defined any variable usages, the reports we have created thus far have contained only display and analysis variables. For example, here's the most recent report we created:

Some Pennsylvania Golf Courses
Name Year Type of Course Par Total Yardage
Toftrees 1968 Resort 72 7,018
Penn State Blue 1921 Public 72 6,525
Centre Hills 1921 Private 71 6,392
Lewistown CC . Private 72 6,779
State College Elks 1973 SemiPri 71 6,369
Park Hills CC 1966 SemiPri 70 6,004
Sinking Valley CC 1967 SemiPri 72 6,755
Williamsport CC 1909 Private 71 6,489
Standing Stone CC 1973 SemiPri 70 6,593
Bucknell GC 1960 SemiPri 70 6,253
Mount Airy Lodge 1972 Resort 72 7,123

Because the second, fourth and fifth columns, Year, Par, and Yards, are numeric variables, SAS treats them as analysis variables by default. Analysis variables are used to calculate a statistic ... by default a SUM.

Now, because the first and third columns, Name and Type, are character variables, SAS treats them as display variables by default. What impact does that have on the appearance of the report? It means three things: 1) A report that contains one or more display variables necessarily contains a detail row for each observation in the data set, 2) Each detail row contains a value for each display variable, and 3) The order of the rows in the report matches the order of the observations in the data set.

We can alter the order in which the rows appear in our report by defining an order variable (or two or three ...). An order variable orders the detail the rows in a report according to the variable's formatted values. Let's take a look at an example.

Example 10.12

Among other things, the DEFINE statements in the following REPORT procedure defines the Yards and Par variables as being ORDER variables:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Par Name Year Type Yards;
	 define Yards / order format = comma5.0 'Total/Yardage' 
                    width = 7 spacing = 4 center;
	 define Type / 'Type of/Course' spacing = 6 
                    width = 8 center;
	 define Par / order;
RUN;
Some Pennsylvania Golf Courses
Par Name Year Type Yards
70 Park Hills CC 1966 SemiPri 6,004
  Bucknell GC 1960 SemiPri 6,253
  Standing Stone CC 1973 SemiPri 6,593
71 State College Elks 1973 SemiPri 6,369
  Centre Hills 1921 Private 6,392
  Williamsport CC 1909 Private 6,489
72 Penn State Blue 1921 Public 6,525
  Sinking Valley CC 1967 SemiPri 6,755
  Lewistown CC . Private 6,779
  Toftrees 1968 Resort 7,018
  Mount Airy Lodge 1972 Resort 7,123

There are just two differences between this REPORT procedure and the one from the previous example:

  1. The word ORDER has been added to the DEFINE statements for the Yards and Par variables
  2. The order of the columns appearing in the COLUMN statement has been changed so that the Par variable is the first to appear in the report.

Launch and run the SAS program, and review the output. You should note that the report rows are now first in ascending order of the Par variable, and then within in each Par value the rows are in ascending order of the Yards variable. You should also note that SAS displays only the first occurrence of each value of an order variable in a set of rows that have the same value for all order variables. For example, the first three rows of output pertain to golf courses whose Par is 70, the next three rows of output pertain to golf courses whose Par is 71, and so on. One more thing ... we never told SAS to display the rows in ascending order. That's just what SAS does by default. We can change the order by using the DEFINE statement's DESCENDING option.

Example 10.13

The following program tells SAS, when creating the requested report, to display the rows in descending order of the Par variable and in ascending order of the Yards variable:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Par Name Year Type Yards;
	 define Yards / order format = comma5.0 'Total/Yardage' 
                    width = 7 spacing = 4 center;
	 define Type / 'Type of/Course' spacing = 6 
                    width = 8 center;
	 define Par / order descending;
RUN;
Some Pennsylvania Golf Courses
Par Name Year Type Yards
72 Penn State Blue 1921 Public 6,525
  Sinking Valley CC 1967 SemiPri 6,755
  Lewistown CC . Private 6,779
  Toftrees 1968 Resort 7,018
  Mount Airy Lodge 1972 Resort 7,123
71 State College Elks 1973 SemiPri 6,369
  Centre Hills 1921 Private 6,392
  Williamsport CC 1909 Private 6,489
70 Park Hills CC 1966 SemiPri 6,004
  Bucknell GC 1960 SemiPri 6,253
  Standing Stone CC 1973 SemiPri 6,593

Launch and run the SAS program, and review the output to convince yourself that the rows are indeed in the order as described.


Legend
[1]Link
Has Tooltip/Popover
 Toggleable Visibility