10.2 - Column Attributes

In the output that you've seen so far, you might have noticed that the REPORT procedure displays:

  • each data value the way that it is stored in the data set,
  • variable names as column headings in the report,
  • a default width for the report columns,
  • left-justified character values,
  • right-justified numeric values, and
  • observations in the order in which they are stored in the data set.

In this section, we'll learn how to use the DEFINE statement to tell SAS to display a variable using a certain format, specify the width of the columns, and set the number of blank spaces that should appear to the left of the columns in a report. To tell SAS to display a variable var1 using the comma7.2 format, say, we must use the FORMAT= attribute of the DEFINE statement as follows:

DEFINE var1 / FORMAT = comma7.2;

The format specified can either be a SAS format or a user-defined format. To tell SAS to set the column width for a variable var2 at 6 spaces, say, we must use the WIDTH= attribute of the DEFINE statement as follows:

DEFINE var2 / WIDTH = 6;

The default column width is set to be just large enough to handle the specified format. To tell SAS to leave 4 blank characters between the column containing var3 and the column immediately to its left, we must use the SPACING= attribute of the DEFINE statement as follows:

DEFINE var3 / SPACING = 4;

By default, SAS leaves 2 blank characters to the left of each column. Let's take a look at some examples!
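Before turning to the examples, here is a minimal sketch of the user-defined case mentioned above. It assumes the stat480 library is already assigned; the format name $typefmt and its labels are hypothetical and are not part of the course data.

```sas
/* Hypothetical user-defined character format; the name $typefmt */
/* and the labels are illustrative assumptions only.             */
PROC FORMAT;
    value $typefmt 'Private' = 'Members only'
                   'SemiPri' = 'Semi-private'
                   'Public'  = 'Open to all'
                   'Resort'  = 'Resort';
RUN;

/* Apply the user-defined format with the FORMAT= attribute */
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
    title 'Some Pennsylvania Golf Courses';
    column Name Type Yards;
    define Type / format = $typefmt12.;
RUN;
```

The FORMAT= attribute is used here exactly as it would be with a SAS-supplied format; only the PROC FORMAT step is new.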

Example 10.5

The following SAS program uses the FORMAT= attribute to tell SAS to display the Yards variable using the SAS format comma5.0 when creating a report from the stat480.penngolf data set:

PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Name Year Type Par Yards;
     define Yards / format = comma5.0;
RUN;

Some Pennsylvania Golf Courses

Name                Year  Type     Par  Yards
Toftrees            1968  Resort    72  7,018
Penn State Blue     1921  Public    72  6,525
Centre Hills        1921  Private   71  6,392
Lewistown CC           .  Private   72  6,779
State College Elks  1973  SemiPri   71  6,369
Park Hills CC       1966  SemiPri   70  6,004
Sinking Valley CC   1967  SemiPri   72  6,755
Williamsport CC     1909  Private   71  6,489
Standing Stone CC   1973  SemiPri   70  6,593
Bucknell GC         1960  SemiPri   70  6,253
Mount Airy Lodge    1972  Resort    72  7,123

Launch and run the SAS program, and review the output to convince yourself that SAS does indeed display the Yards variable using the comma5.0 format. Incidentally, if you do not specify a format for a variable within a REPORT procedure, SAS displays the variable using the format that is stored in the data set. If no format is stored in the data set, then SAS uses the default format for that variable type.

One more thing: you can also use a FORMAT statement within a REPORT procedure to specify a variable's format. As we'll soon see, however, the DEFINE statement allows you to specify more than one column attribute at a time. You can also use the DEFINE statement's FORMAT= attribute to specify the format of report columns that are not variables actually contained in your data set (such as the statistics or computed variables that we'll investigate later in this lesson).
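As a sketch of the FORMAT statement alternative just described, the report from Example 10.5 could be written as follows. It assumes the same stat480.penngolf data set and should display Yards the same way as the DEFINE version.

```sas
/* Same report as Example 10.5, but the format is assigned with a */
/* FORMAT statement instead of the DEFINE statement's FORMAT=.    */
PROC REPORT data = stat480.penngolf NOWINDOWS HEADLINE;
    title 'Some Pennsylvania Golf Courses';
    column Name Year Type Par Yards;
    format Yards comma5.0;
RUN;
```

The FORMAT statement only sets formats, so a width or spacing change for the same column would still need a DEFINE statement.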

Example 10.6

If a variable in the input data set doesn't have a format associated with it, then the default column width in a report is set at the variable's length for character variables and 9 for numeric variables. The following SAS program illustrates what can go wrong with the reports you generate when you allow SAS to use these defaults:

DATA penngolf;
    set stat480.penngolf;
    length CourseType $ 8;   /* new character variable, 8 characters long */
    CourseType = Type;       /* copy the values of Type */
    drop Type;               /* Type is no longer needed */
    format Slope 3.;         /* store the 3. format with Slope */
RUN;
PROC CONTENTS data = penngolf;
RUN;
PROC REPORT data = penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Name Year CourseType Slope Par Yards;
     define Yards / format = comma5.0;
RUN;

The DATA step creates a temporary SAS data set called penngolf using the permanent stat480.penngolf data set. The LENGTH statement tells SAS to create a new character variable called CourseType that is 8 characters long. An assignment statement is then used to assign the values of the Type variable to the new CourseType variable. (Seems like a silly DATA step so far, eh? You'll see why we're doing this in a minute.) The DROP statement tells SAS to then drop the Type variable from the data set as we no longer need it. The FORMAT statement assigns the numeric 3. format to the Slope variable.

Now, if you launch and run the SAS program, you can see first that the output from the CONTENTS procedure:

 

Alphabetic List of Variables and Attributes

#  Variable    Type  Len  Format
3  Architect   Char   18
9  CourseType  Char    8
1  ID          Num     8
2  Name        Char   18
5  Par         Num     8
7  Slope       Num     8  3.
8  USGA        Num     8
6  Yards       Num     8
4  Year        Num     8


confirms that the length of the character variable CourseType is 8, the format of the numeric variable Slope is 3., and the remaining five numeric variables are not associated with a format. Here's what the report that the REPORT procedure generates looks like:

 

Some Pennsylvania Golf Courses

Name                Year  CourseType  Slope  Par  Yards
Toftrees            1968  Resort        134   72  7,018
Penn State Blue     1921  Public        128   72  6,525
Centre Hills        1921  Private       128   71  6,392
Lewistown CC           .  Private       125   72  6,779
State College Elks  1973  SemiPri       123   71  6,369
Park Hills CC       1966  SemiPri       126   70  6,004
Sinking Valley CC   1967  SemiPri       132   72  6,755
Williamsport CC     1909  Private       131   71  6,489
Standing Stone GC   1973  SemiPri       120   70  6,593
Bucknell GC         1960  SemiPri       132   70  6,253
Mount Airy Lodge    1972  Resort        140   72  7,123


We've got a little problem with the CourseType and Slope column headings. What went wrong here? To answer that question, we have to review how SAS sets the default column widths. Slope is a numeric variable with the numeric format 3. stored in the data set. By default, SAS sets the column width of a numeric variable to be just large enough to handle the specified format. Thus, SAS sets the column width for the Slope variable to just 3 spaces, which is not enough room for the 5-character column heading. Now, CourseType is a character variable with a length of 8 characters. By default, SAS sets the column width of a character variable to be the length of the character variable. Thus, SAS sets the column width for the CourseType variable to just 8 spaces, which is again not enough room for the 10-character column heading. (Now you can see why I wanted to illustrate this example with a long variable name like CourseType rather than the shorter name Type.) Fortunately, we can solve our problem by using the DEFINE statement's WIDTH= attribute.
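The arithmetic behind that explanation can be checked with a short DATA step. This is only a sketch: the widths 8 and 3 are the defaults discussed above, and the variable names heading, width, and shortfall are illustrative.

```sas
/* Compare each column heading's length with its default column width. */
/* heading, width, and shortfall are hypothetical names for this sketch. */
DATA _NULL_;
    length heading $ 10;
    heading = 'CourseType';
    width = 8;                            /* length of the character variable */
    shortfall = length(heading) - width;  /* 10 - 8 */
    put heading= width= shortfall=;
    heading = 'Slope';
    width = 3;                            /* width of the 3. format */
    shortfall = length(heading) - width;  /* 5 - 3 */
    put heading= width= shortfall=;
RUN;
```

In both cases the heading is two characters longer than the default column width, which is why the widths chosen in the next example are 10 and 5.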

Example 10.7

The following SAS program modifies the REPORT procedure of the previous example so that the widths of the CourseType and Slope columns are set to 10 and 5, respectively:

PROC REPORT data = penngolf NOWINDOWS HEADLINE;
	title 'Some Pennsylvania Golf Courses';
	column Name Year CourseType Slope Par Yards;
	define Yards / format = comma5.0;
	define CourseType / width = 10;
	define Slope / width = 5;
RUN;

Some Pennsylvania Golf Courses

Name                Year  CourseType  Slope  Par  Yards
Toftrees            1968  Resort        134   72  7,018
Penn State Blue     1921  Public        128   72  6,525
Centre Hills        1921  Private       128   71  6,392
Lewistown CC           .  Private       125   72  6,779
State College Elks  1973  SemiPri       123   71  6,369
Park Hills CC       1966  SemiPri       126   70  6,004
Sinking Valley CC   1967  SemiPri       132   72  6,755
Williamsport CC     1909  Private       131   71  6,489
Standing Stone GC   1973  SemiPri       120   70  6,593
Bucknell GC         1960  SemiPri       132   70  6,253
Mount Airy Lodge    1972  Resort        140   72  7,123

Launch and run the SAS program, and review the output to convince yourself that the widths of the CourseType and Slope columns are now set to be large enough to accommodate the column headings. Incidentally, the WIDTH= attribute can handle any value from 1 to the value of the LINESIZE= system option.
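To see the upper limit that applies to WIDTH= in your own session, you can ask SAS for the current LINESIZE= setting. The following is a small sketch; the value 80 in the OPTIONS statement is an arbitrary illustration, not a recommendation.

```sas
/* Write the current value of the LINESIZE= system option to the log */
PROC OPTIONS option = linesize;
RUN;

/* Optionally change the line size; 80 is an arbitrary illustrative value */
OPTIONS linesize = 80;
```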

If you look at the output from this example, you might notice that some of the columns are scrunched together more than others. For example, the CourseType column is rather close to the Year column to its left, and the Yards column is rather close to the Par column to its left. We can use the DEFINE statement's SPACING= attribute to change how much white space sits before each column.

Example 10.8

The following SAS program uses the DEFINE statement's SPACING= attribute to tell SAS to place 5 blank spaces before the Yards column and 6 blank spaces before the CourseType column:

PROC REPORT data = penngolf NOWINDOWS HEADLINE;
     title 'Some Pennsylvania Golf Courses';
     column Name Year CourseType Slope Par Yards;
     define Yards / format = comma5.0 spacing = 5;
     define CourseType / width = 10 spacing = 6;
     define Slope / width = 5;
RUN;

Some Pennsylvania Golf Courses

Name                Year  CourseType  Slope  Par  Yards
Toftrees            1968  Resort        134   72  7,018
Penn State Blue     1921  Public        128   72  6,525
Centre Hills        1921  Private       128   71  6,392
Lewistown CC           .  Private       125   72  6,779
State College Elks  1973  SemiPri       123   71  6,369
Park Hills CC       1966  SemiPri       126   70  6,004
Sinking Valley CC   1967  SemiPri       132   72  6,755
Williamsport CC     1909  Private       131   71  6,489
Standing Stone GC   1973  SemiPri       120   70  6,593
Bucknell GC         1960  SemiPri       132   70  6,253
Mount Airy Lodge    1972  Resort        140   72  7,123

Note first that you can specify more than one attribute per DEFINE statement. Case in point: the DEFINE statement for Yards modifies the format and spacing for the Yards column, and the DEFINE statement for CourseType modifies the width and spacing for the CourseType column. Then, launch and run the SAS program, and review the output to convince yourself that the columns in the report are now more evenly distributed.
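Nothing prevents all three attributes from appearing on a single DEFINE statement. The following variation is a sketch only: it assumes the temporary penngolf data set from Example 10.6 still exists, and the particular widths and spacings are illustrative choices rather than values taken from the examples.

```sas
/* FORMAT=, WIDTH=, and SPACING= combined on one DEFINE statement. */
/* The specific values below are illustrative assumptions.         */
PROC REPORT data = penngolf NOWINDOWS HEADLINE;
    title 'Some Pennsylvania Golf Courses';
    column Name CourseType Slope Yards;
    define CourseType / width = 10 spacing = 4;
    define Slope      / format = 3. width = 5 spacing = 4;
    define Yards      / format = comma5.0 width = 5 spacing = 4;
RUN;
```

Using the same spacing value on every DEFINE statement is one simple way to get evenly separated columns.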