10.6 - Specifying Statistics

As mentioned earlier, the default statistic for analysis variables is SUM. There may be some instances, however, in which you want to display other statistics in your reports. To do so, you merely specify your desired statistic as an attribute in the DEFINE statement. Here's a list of the statistics you can request:

Statistic  Description
CSS        corrected sum of squares
USS        uncorrected sum of squares
CV         coefficient of variation
MAX        maximum value
MEAN       average value
MIN        minimum value
N          number of observations with nonmissing values
NMISS      number of observations with missing values
RANGE      range of values
STD        standard deviation
STDERR     standard error of the mean
SUM        sum of the values
SUMWGT     sum of the weight variable values
PCTN       percentage of cell or row frequency to total frequency
PCTSUM     percentage of cell or row sum to total sum
VAR        variance of the values
T          Student's t-statistic for testing that the population mean is 0
PRT        probability of a greater absolute value of Student's t

Let's take a look at an example in which the mean statistic is requested.

Example 10.15

The following REPORT procedure creates a report in which the average par and average yardage are reported for each of the four types of golf courses:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Average Size of Some PA Golf Courses by Type';
     column Type Par Yards;
     /* Type is a group variable: one report row per course type */
     define Type / group 'Type of/Course' spacing = 6
                   width = 8;
     /* mean replaces the default SUM statistic for Par and Yards */
     define Par / mean format = 4.1
                  'Average/Par' width = 7 center;
     define Yards / mean format = comma6.0 'Average/Yardage'
                    width = 7 spacing = 4 center;
RUN;
 
Some Pennsylvania Golf Courses

Type of Course   Average Par   Average Yardage
Private          71.3          6,553
Public           72.0          6,525
Resort           72.0          7,071
SemiPri          70.6          6,395

The COLUMN statement tells SAS to display only three columns, namely Type, Par, and Yards, in that order. The first DEFINE statement tells SAS to use Type as a group variable, and specifies the column heading, spacing, and width. The mean keyword in the second DEFINE statement tells SAS to calculate the average Par for each Type of golf course; the rest of that statement tells SAS how to format the result and how to label, justify, and set the width of the Par column. Likewise, the mean keyword in the third DEFINE statement tells SAS to calculate the average Yards for each Type of golf course, and the rest of that statement tells SAS how to format the result and how to label, justify, and set the width and spacing of the Yards column.

Now, launch and run the SAS program, and review the output to convince yourself that SAS collapses the observations, and in so doing, calculates the averages of the Par and Yards variables for each Type of golf course.
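Any other statistic keyword from the list above can be requested in the same way. The following is only a minimal sketch, not part of the original lesson: it assumes the stat480 library is already assigned as in the example above, and it reports the shortest and longest course for each Type. Because Yards is needed twice, the second occurrence is given an alias in the COLUMN statement; the alias name MaxYards, the title, and the column headings are hypothetical choices made for this illustration.

```sas
/* Sketch: request MIN and MAX instead of MEAN for the same variable. */
/* Assumes the stat480 library has already been assigned.             */
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Yardage Range of Some PA Golf Courses by Type';
     /* Yards appears twice; MaxYards is a hypothetical alias */
     column Type Yards Yards=MaxYards;
     define Type / group 'Type of/Course' width = 8;
     /* first use of Yards: smallest value within each Type */
     define Yards / min format = comma6.0
                    'Minimum/Yardage' width = 8 center;
     /* aliased use of Yards: largest value within each Type */
     define MaxYards / max format = comma6.0
                       'Maximum/Yardage' width = 8 center;
RUN;
```

As in the example above, the report should contain one row per Type, since Type is still defined as a group variable.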

Example 10.16

The following example illustrates the type of one-line report you get when the columns of your report contain only (numeric) analysis variables:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Size of Some PA Golf Courses';
     column Par Yards;
     define Par / mean format = 4.1
                  'Average/Par' width = 7 center;
     /* no statistic is named for Yards, so the default SUM applies */
     define Yards / format = comma7.0 'Total/Yardage'
                    width = 7 spacing = 4 center;
RUN;

Size of Some Pennsylvania Golf Courses

Average Par   Total Yardage
71.2          72,300

First, note that the COLUMN statement contains just two variables, Par and Yards, and that both are numeric variables. The first DEFINE statement tells SAS to calculate the average Par, as well as how to format the result and how to label, justify, and set the width of the column. The second DEFINE statement does not name a statistic, so SAS uses the default statistic, SUM, and calculates the total Yards. The second DEFINE statement also tells SAS how to format the result and how to label, justify, and set the width and spacing of the column.

Now, launch and run the SAS program, and review the output to convince yourself that SAS collapses all of the observations, and in so doing, calculates the average Par and the total Yards of all of the golf courses. It is in this way that we end up with just a one-line report.
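One way to convince yourself that the Yards column really is a sum is to write the default statistic out explicitly and compare the result with an independent calculation. The following is a rough sketch under the same assumption as before (the stat480 library is already assigned); the PROC MEANS step is not part of the original example and is added here only as a cross-check.

```sas
/* Sketch: the same one-line report with SUM written out explicitly. */
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Size of Some PA Golf Courses';
     column Par Yards;
     define Par / mean format = 4.1
                  'Average/Par' width = 7 center;
     /* sum is the default for analysis variables; naming it changes nothing */
     define Yards / sum format = comma7.0 'Total/Yardage'
                    width = 7 spacing = 4 center;
RUN;

/* Cross-check: PROC MEANS prints the mean and the sum of both variables. */
PROC MEANS data = stat480.penngolf mean sum maxdec = 1;
     var Par Yards;
RUN;
```

The mean of Par and the sum of Yards in the PROC MEANS output should agree with the two values in the one-line report.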