6.5 - Column Totals

There may be situations in which you want SAS to calculate and present column totals for some of the numeric variables appearing in your reports. In that case, you'll want to take advantage of the SUM statement. We'll investigate the use of the statement here.

Example 6.11 Section

The following SAS code uses the PRINT procedure's SUM statement to generate a report of the total number of visits (no_vis) for patients undergoing physical therapy (type_vis = 190):

PROC PRINT data = basic;
   id name;
   var clinic no_vis;
   where type_vis = 190;
   sum no_vis;
RUN;

The SAS System

name

clinic

no_vis

Thomas Jones

ALTO

10

Benedictine Arnold

ALTO

1

Felicia Ho

MNMC

7

Jane Smiley

MNMC

5

======

  

23

The ID statement tells SAS to suppress the observation number and to place the variable name in the first column of the output. The WHERE statement tells SAS to print only the observations pertaining to a physical therapy appointment (type_vis = 190). The SUM statement tells SAS to provide the total number of visits (no_vis) the patients undergoing physical therapy had.

Launch and run  the SAS program, and review the output to convince yourself that the report is generated as described.

Incidentally, when you specify a variable in a SUM statement, the variable is also implicitly assumed to be present in the VAR statement. This prevents you from having to type the variable names twice, once in the VAR statement and once in the SUM statement. You might want to delete no_vis in the VAR statement and re-run  the SAS program to convince yourself that no_vis is still printed in the report because its name appears in the SUM statement.

Example 6.12 Section

There may be situations in which you want not just column totals, but also column subtotals. Using the PRINT procedure's BY statement, you can tell SAS to print observations in groups based on the values of the different BY variables. When a SUM statement is specified in the presence of a BY statement, SAS produces subtotals each time the value of a BY variable changes.  

The following SAS program illustrates the use of the BY statement in conjunction with the SUM statement to print the data in our basic data set in three groups based on the value of clinic, as well as display the total expense for each of the three groups separately:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

---------------------------------- clinic=ALTO ----------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

1

1168

Thomas Jones

10

190

3904.89

2

1201

Benedictine Arnold

1

190

1450.23

------

------

clinic

5355.12

---------------------------------- clinic=LEWN ----------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

3

1024

Alice Smith

7

101

1001.98

4

1167

Maryann White

2

101

2999.34

------

------

clinic

4001.32

---------------------------------- clinic=MNMC ----------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

5

1302

Felicia Ho

7

190

1209.94

6

1471

John Smith

6

187

1763.09

7

1980

Jane Smiley

5

190

3567.00

------

------

clinic

6540.03

======

15896.47

As you'll see is always the case, whenever a BY statement is used in any DATA step or procedure, the data set must first be sorted in order based on the variables specified in the BY statement. If not, your program will halt the execution, and SAS will print a message in the log indicating that the data set is not properly sorted. The SORT procedure prepares the data for the PRINT procedure's BY statement by sorting the basic data set by clinic and storing the sorted data in a new data set called srtd_basic. The PRINT procedure then tells SAS to print srtd_basic in three groups (by clinic) — one for ALTO, one for LEWN, and one for MNMC. The PRINT procedure also tells SAS to sum the expenses (sum expense) for each clinic separately, as well as provide a grand total of all expenses.

Launch and run  the SAS program, and review the resulting output. Your output should display the observations in three groups (ALTO, LEWN, and MNMC) and the variable expense should be added up across the subjects and then across the three groups.

Example 6.13 Section

If you take a look at the output from the previous example, you should see that the columns don't line up across the three clinics. The UNIFORM option tells SAS to make sure the columns of data line up from one group to the next. Without the UNIFORM statement, the PRINT procedure works to fit as many variables and observations on the page as possible. As a result, printed columns can be shifted from one group to the next. In the PRINT procedure in the previous example, no UNIFORM option was specified. Therefore, since a different number of characters are needed for name for the three groups, the columns are not aligned.

The PRINT procedure in the following SAS program illustrates use of the UNIFORM option to remedy this problem:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

------------------------------ clinic=ALTO ------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

1

1168

Thomas Jones

10

190

3904.89

2

1201

Benedictine Arnold

1

190

1450.23

------

------

clinic

5355.12

------------------------------ clinic=LEWN ------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

3

1024

Alice Smith

7

101

1001.98

4

1167

Maryann White

2

101

2999.34

------

------

clinic

4001.32

------------------------------ clinic=MNMC ------------------------------

The SAS System

Obs

subj

name

no_vis

type_vis

expense

5

1302

Felicia Ho

7

190

1209.94

6

1471

John Smith

6

187

1763.09

7

1980

Jane Smiley

5

190

3567.00

------

------

clinic

6540.03

======

15896.47

The only difference between this PRINT procedure and the previous one is that the UNIFORM option has been included here. Launch and run  the SAS program, and review the resulting output to convince yourself that the columns across the three groups are now properly aligned.

Example 6.14 Section

In the output from the previous two examples, you might have noticed that redundant information is displayed for each group. For example, the BY variable clinic is identified across the top of the data for each group, as well as for the subtotal for each group. To show the BY variable heading only once, you can use an ID statement and a BY statement in conjunction with the SUM statement. When an ID statement specifies the same variable as the BY statement:

  1. The observation number is suppressed
  2. The ID variable is printed as the first column of the report
  3. Each value of the ID variable is printed only at the start of each BY group and on the line that contains that group's subtotal.

The PRINT procedure in the following SAS program illustrates the use of the ID statement to control the headings displayed in the report for each clinic:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
RUN;

The SAS System

clinic

subj

name

no_vis

type_vis

expense

ALTO

1168

Thomas Jones

10

190

3904.89

 

1201

Benedictine Arnold

1

190

1450.23

------

    

------

ALTO

    

5355.12

LEWN

1024

Alice Smith

7

101

1001.98

 

1167

Maryann White

2

101

2999.34

------

    

------

LEWN

    

4001.32

MNMC

1302

Felicia Ho

7

190

1209.94

 

1471

John Smith

6

187

1763.09

 

1980

Jane Smiley

5

190

3567.00

------

    

------

MNMC

    

6540.03

======

15896.47

Launch and run  the SAS program, and review the resulting output to convince yourself that the headings for the three clinics appear as described.

Example 6.15 Section

Rather than having SAS display the output for each group on the same page, you can take advantage of the PAGEBY statement to tell SAS to print each group on a separate page. The following SAS program creates the same output as the previous example, except here we request that the clinics be printed on separate pages:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;

PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
   pageby clinic;
RUN;

The PAGEBY statement tells SAS to print the data for each clinic on a separate page. Note that the variable that is specified in the PAGEBY statement must also be specified in the PRINT procedure's BY statement.

Launch and run  the SAS program. You should see that each clinic is represented on a separate page — the first page:

clinic

subj

name

no_vis

type_vis

expense

ALTO

1168

Thomas Jones

10

190

3904.89

 

1201

Benedictine Arnold

1

190

1450.23

------

    

------

ALTO

    

5355.12

the second page:

clinic

subj

name

no_vis

type_vis

expense

LEWN

1024

Alice Smith

7

101

1001.98

 

1167

Maryann White

2

101

2999.34

------

    

------

LEWN

    

4001.32

and the third page:

clinic

subj

name

no_vis

type_vis

expense

MNMC

1302

Felicia Ho

7

190

1209.94

 

1471

John Smith

6

187

1763.09

 

1980

Jane Smiley

5

190

3567.00

------

    

------

MNMC

    

6540.03

======

15896.47