Thus far, we have used the DEFINE statement to change the appearance of our reports, doing such things as specifying the format of a variable, changing the width of a column, specifying column labels, and changing the justification of a column. The truth is that the DEFINE statement is even more powerful when it is used to tell SAS how you want SAS to use each variable that appears in your report. There are six possible usages — DISPLAY, ANALYSIS, ORDER, GROUP, ACROSS, and COMPUTED — which we'll investigate in the course of the next five sections of this lesson. By default, the REPORT procedure uses:
- character variables as display variables
- numeric variables as analysis variables, which are used to calculate the SUM statistic
Because we haven't explicitly defined any variable usages, the reports we have created thus far have contained only display and analysis variables. For example, here's the most recent report we created:
Name | Year | Type of Course | Par | Total Yardage |
---|---|---|---|---|
Toftrees | 1968 | Resort | 72 | 7,018 |
Penn State Blue | 1921 | Public | 72 | 6,525 |
Centre Hills | 1921 | Private | 71 | 6,392 |
Lewistown CC | . | Private | 72 | 6,779 |
State College Elks | 1973 | SemiPri | 71 | 6,369 |
Park Hills CC | 1966 | SemiPri | 70 | 6,004 |
Sinking Valley CC | 1967 | SemiPri | 72 | 6,755 |
Williamsport CC | 1909 | Private | 71 | 6,489 |
Standing Stone CC | 1973 | SemiPri | 70 | 6,593 |
Bucknell GC | 1960 | SemiPri | 70 | 6,253 |
Mount Airy Lodge | 1972 | Resort | 72 | 7,123 |
Because the second, fourth, and fifth columns, Year, Par, and Yards, are numeric variables, SAS treats them as analysis variables by default. Analysis variables are used to calculate a statistic ... by default a SUM.
Now, because the first and third columns, Name and Type, are character variables, SAS treats them as display variables by default. What impact does that have on the appearance of the report? It means three things: 1) A report that contains one or more display variables necessarily contains a detail row for each observation in the data set, 2) Each detail row contains a value for each display variable, and 3) The order of the rows in the report matches the order of the observations in the data set.
We can alter the order in which the rows appear in our report by defining an order variable (or two or three ...). An order variable orders the detail of the rows in a report according to the variable's formatted values. Let's take a look at an example.
Example 10.12 Section
Among other things, the DEFINE statements in the following REPORT procedure define the Yards and Par variables as being ORDER variables:
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
title 'Some Pennsylvania Golf Courses';
column Par Name Year Type Yards;
define Yards / order format = comma5.0 'Total/Yardage'
width = 7 spacing = 4 center;
define Type / 'Type of/Course' spacing = 6
width = 8 center;
define Par / order;
RUN;
Par | Name | Year | Type | Yards |
---|---|---|---|---|
70 | Park Hills CC | 1966 | SemiPri | 6,004 |
Bucknell GC | 1960 | SemiPri | 6,253 | |
Standing Stone CC | 1973 | SemiPri | 6,593 | |
71 | State College Elks | 1973 | SemiPri | 6,369 |
Centre Hills | 1921 | Private | 6,392 | |
Williamsport CC | 1909 | Private | 6,489 | |
72 | Penn State Blue | 1921 | Public | 6,525 |
Sinking Valley CC | 1967 | SemiPri | 6,755 | |
Lewistown CC | . | Private | 6,779 | |
Toftrees | 1968 | Resort | 7,018 | |
Mount Airy Lodge | 1972 | Resort | 7,123 |
There are just two differences between this REPORT procedure and the one from the previous example:
- The word ORDER has been added to the DEFINE statements for the Yards and Par variables
- The order of the columns appearing in the COLUMN statement has been changed so that the Par variable is the first to appear in the report.
Launch and run the SAS program, and review the output. You should note that the report rows are now first in ascending order of the Par variable, and then within in each Par value the rows are in ascending order of the Yards variable. You should also note that SAS displays only the first occurrence of each value of an order variable in a set of rows that have the same value for all order variables. For example, the first three rows of output pertain to golf courses whose Par is 70, the next three rows of output pertain to golf courses whose Par is 71, and so on. One more thing ... we never told SAS to display the rows in ascending order. That's just what SAS does by default. We can change the order by using the DEFINE statement's DESCENDING option.
Example 10.13 Section
The following program tells SAS, when creating the requested report, to display the rows in descending order of the Par variable and in ascending order of the Yards variable:
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
title 'Some Pennsylvania Golf Courses';
column Par Name Year Type Yards;
define Yards / order format = comma5.0 'Total/Yardage'
width = 7 spacing = 4 center;
define Type / 'Type of/Course' spacing = 6
width = 8 center;
define Par / order descending;
RUN;
Par | Name | Year | Type | Yards |
---|---|---|---|---|
72 | Penn State Blue | 1921 | Public | 6,525 |
Sinking Valley CC | 1967 | SemiPri | 6,755 | |
Lewistown CC | . | Private | 6,779 | |
Toftrees | 1968 | Resort | 7,018 | |
Mount Airy Lodge | 1972 | Resort | 7,123 | |
71 | State College Elks | 1973 | SemiPri | 6,369 |
Centre Hills | 1921 | Private | 6,392 | |
Williamsport CC | 1909 | Private | 6,489 | |
70 | Park Hills CC | 1966 | SemiPri | 6,004 |
Bucknell GC | 1960 | SemiPri | 6,253 | |
Standing Stone CC | 1973 | SemiPri | 6,593 |
Launch and run the SAS program, and review the output to convince yourself that the rows are indeed in the order as described.