10.4 - Using Order Variables

10.4 - Using Order Variables

Thus far, we have used the DEFINE statement to change the appearance of our reports, doing such things as specifying the format of a variable, changing the width of a column, specifying column labels, and changing the justification of a column. The truth is that the DEFINE statement is even more powerful when it is used to tell SAS how you want SAS to use each variable that appears in your report. There are six possible usages — DISPLAY, ANALYSIS, ORDER, GROUP, ACROSS, and COMPUTED — which we'll investigate in the course of the next five sections of this lesson. By default, the REPORT procedure uses:

  • character variables as display variables
  • numeric variables as analysis variables, which are used to calculate the SUM statistic

Because we haven't explicitly defined any variable usages, the reports we have created thus far have contained only display and analysis variables. For example, here's the most recent report we created:

Some Pennsylvania Golf Courses

Name

Year

Type of Course

Par

Total Yardage

Toftrees

1968

Resort

72

7,018

Penn State Blue

1921

Public

72

6,525

Centre Hills

1921

Private

71

6,392

Lewistown CC

.

Private

72

6,779

State College Elks

1973

SemiPri

71

6,369

Park Hills CC

1966

SemiPri

70

6,004

Sinking Valley CC

1967

SemiPri

72

6,755

Williamsport CC

1909

Private

71

6,489

Standing Stone CC

1973

SemiPri

70

6,593

Bucknell GC

1960

SemiPri

70

6,253

Mount Airy Lodge

1972

Resort

72

7,123

Because the second, fourth, and fifth columns, Year, Par, and Yards, are numeric variables, SAS treats them as analysis variables by default. Analysis variables are used to calculate a statistic ... by default a SUM.

Now, because the first and third columns, Name and Type, are character variables, SAS treats them as display variables by default. What impact does that have on the appearance of the report? It means three things: 1) A report that contains one or more display variables necessarily contains a detail row for each observation in the data set, 2) Each detail row contains a value for each display variable, and 3) The order of the rows in the report matches the order of the observations in the data set.

We can alter the order in which the rows appear in our report by defining an order variable (or two or three ...). An order variable orders the detail of the rows in a report according to the variable's formatted values. Let's take a look at an example.

Example 10.12

Among other things, the DEFINE statements in the following REPORT procedure define the Yards and Par variables as being ORDER variables:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
         title 'Some Pennsylvania Golf Courses';
         column Par Name Year Type Yards;
    	 define Yards / order format = comma5.0 'Total/Yardage' 
                        width = 7 spacing = 4 center;
    	 define Type / 'Type of/Course' spacing = 6 
                        width = 8 center;
    	 define Par / order;
    RUN;

Some Pennsylvania Golf Courses

Par

Name

Year

Type

Yards

70

Park Hills CC

1966

SemiPri

6,004

 

Bucknell GC

1960

SemiPri

6,253

 

Standing Stone CC

1973

SemiPri

6,593

71

State College Elks

1973

SemiPri

6,369

 

Centre Hills

1921

Private

6,392

 

Williamsport CC

1909

Private

6,489

72

Penn State Blue

1921

Public

6,525

 

Sinking Valley CC

1967

SemiPri

6,755

 

Lewistown CC

.

Private

6,779

 

Toftrees

1968

Resort

7,018

 

Mount Airy Lodge

1972

Resort

7,123

There are just two differences between this REPORT procedure and the one from the previous example:

  1. The word ORDER has been added to the DEFINE statements for the Yards and Par variables
  2. The order of the columns appearing in the COLUMN statement has been changed so that the Par variable is the first to appear in the report.

Launch and run  the SAS program, and review the output. You should note that the report rows are now first in ascending order of the Par variable, and then within in each Par value the rows are in ascending order of the Yards variable. You should also note that SAS displays only the first occurrence of each value of an order variable in a set of rows that have the same value for all order variables. For example, the first three rows of output pertain to golf courses whose Par is 70, the next three rows of output pertain to golf courses whose Par is 71, and so on. One more thing ... we never told SAS to display the rows in ascending order. That's just what SAS does by default. We can change the order by using the DEFINE statement's DESCENDING option.

Example 10.13

The following program tells SAS, when creating the requested report, to display the rows in descending order of the Par variable and in ascending order of the Yards variable:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Par Name Year Type Yards;
	 define Yards / order format = comma5.0 'Total/Yardage' 
                    width = 7 spacing = 4 center;
	 define Type / 'Type of/Course' spacing = 6 
                    width = 8 center;
	 define Par / order descending;
RUN;

Some Pennsylvania Golf Courses

Par

Name

Year

Type

Yards

72

Penn State Blue

1921

Public

6,525

 

Sinking Valley CC

1967

SemiPri

6,755

 

Lewistown CC

.

Private

6,779

 

Toftrees

1968

Resort

7,018

 

Mount Airy Lodge

1972

Resort

7,123

71

State College Elks

1973

SemiPri

6,369

 

Centre Hills

1921

Private

6,392

 

Williamsport CC

1909

Private

6,489

70

Park Hills CC

1966

SemiPri

6,004

 

Bucknell GC

1960

SemiPri

6,253

 

Standing Stone CC

1973

SemiPri

6,593

Launch and run  the SAS program, and review the output to convince yourself that the rows are indeed in the order as described.


Legend
[1]Link
Has Tooltip/Popover
 Toggleable Visibility