6.1 - Basic Reports

As you are already well aware, the PRINT procedure is the mechanism by which we can print and therefore see the contents — variables and observations — of a SAS data set. Using the PRINT procedure, we can create quite informative list reports. The PRINT procedure takes the following general form:

PROC PRINT options;
	statement1;
	statement2;
	etc;
RUN;

As usual, throughout this lesson, we'll look at some examples in order to learn about the various options and features of the PRINT procedure.

Example 6.1

The following SAS code merely creates a baseline data set called basic that we can print throughout the lesson:

OPTIONS LS = 75 PS = 58 NODATE;
DATA basic;
  input subj 1-4 name $ 6-23 clinic $ 25-28
        gender 30 no_vis 32-33 type_vis 35-37
        expense 39-45;
  DATALINES;
1024 Alice Smith        LEWN 1  7 101 1001.98
1167 Maryann White      LEWN 1  2 101 2999.34
1168 Thomas Jones       ALTO 2 10 190 3904.89
1201 Benedictine Arnold ALTO 2  1 190 1450.23
1302 Felicia Ho         MNMC 1  7 190 1209.94
1471 John Smith         MNMC 2  6 187 1763.09
1980 Jane Smiley        MNMC 1  5 190 3567.00
  ;
RUN;
PROC PRINT data = basic;
RUN;

The SAS System
Obs	subj	name	clinic	gender	no_vis	type_vis	expense
1	1024	Alice Smith	LEWN	1	7	101	1001.98
2	1167	Maryann White	LEWN	1	2	101	2999.34
3	1168	Thomas Jones	ALTO	2	10	190	3904.89
4	1201	Benedictine Arnold	ALTO	2	1	190	1450.23
5	1302	Felicia Ho	MNMC	1	7	190	1209.94
6	1471	John Smith	MNMC	2	6	187	1763.09
7	1980	Jane Smiley	MNMC	1	5	190	3567.00

For the sake of context, define the variables in the data set as follows:

subj: as usual, the subject ID number
name: patient's name
clinic: where the patient was treated
gender: gender of subject (1: female, 2: male)
no_vis: number of visits to a medical facility (0, 1, 2,...)
type_vis: type of visit (101: gynecology, 190: physical therapy, 187: cardiology)
expense: medical charges in dollars.

The PRINT procedure that tells SAS to print the basic data set is the simplest form of the PRINT procedure and one that we have, of course, already used numerous times throughout the semester. Launch and run the SAS code so that the data set becomes available for use throughout the lesson. You should also review the resulting report to familiarize yourself with its basic characteristics. You should see that by default:

All observations and variables in the data set are printed
A column for observation numbers appears on the far left
Variables appear in the order in which they occur in the data set

Example 6.2

By default, the PRINT procedure lists all of the variables contained in a SAS data set. We can use the PRINT procedure's VAR statement to not only select variables but also to control the order in which the variables appear in our reports. The following SAS program uses the VAR statement to tell SAS to print just a subset of the variables — name, no_vis, and expense — contained in the basic data set:

PROC PRINT data = basic;
   var name no_vis expense;
RUN;

The SAS System
Obs	name	no_vis	expense
1	Alice Smith	7	1001.98
2	Maryann White	2	2999.34
3	Thomas Jones	10	3904.89
4	Benedictine Arnold	1	1450.23
5	Felicia Ho	7	1209.94
6	John Smith	6	1763.09
7	Jane Smiley	5	3567.00

In general, the VAR statement tells SAS to print the variables in the order specified in the VAR statement. Launch and run the SAS program, and review the output to convince yourself that SAS indeed printed just the three variables — name, no_vis, and expense — in that order.
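To see the ordering behavior on its own, here is a minimal sketch that requests the same three variables in a different order. It assumes the basic data set from Example 6.1 is still available in your session; the columns should then appear as expense, name, and no_vis, regardless of their order in the data set:

```sas
* Same variables as Example 6.2, requested in a different order;
PROC PRINT data = basic;
   var expense name no_vis;
RUN;
```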

Example 6.3

Using the NOOBS option, we can suppress the printing of the default observation number. The following SAS program illustrates the PRINT procedure's NOOBS option:

PROC PRINT data = basic noobs;
   var name no_vis expense;
RUN;

The SAS System
name	no_vis	expense
Alice Smith	7	1001.98
Maryann White	2	2999.34
Thomas Jones	10	3904.89
Benedictine Arnold	1	1450.23
Felicia Ho	7	1209.94
John Smith	6	1763.09
Jane Smiley	5	3567.00

Launch and run the SAS program, and review the resulting report to convince yourself that the observation number has been suppressed.

6.2 - Identifying Observations

In the previous section, we learned how to suppress the observation column that appears in default reports. Alternatively, we can use the ID statement to replace the observation column with one or more variables.

Example 6.4

Using the ID statement, we can emphasize one or more key variables. The ID statement, which automatically suppresses the printing of the observation number, tells SAS to print the variable(s) specified in the ID statement as the first column(s) of your output. Thus, the ID statement allows you to use the values of the variables to identify observations, rather than the (usually meaningless) observation number. The following SAS code illustrates the use of the ID statement:

PROC PRINT data = basic;
   id name;
   var gender expense;
RUN;

The SAS System
name	gender	expense
Alice Smith	1	1001.98
Maryann White	1	2999.34
Thomas Jones	2	3904.89
Benedictine Arnold	2	1450.23
Felicia Ho	1	1209.94
John Smith	2	1763.09
Jane Smiley	1	3567.00

Launch and run the SAS program, and review the output to convince yourself that the name, having replaced the observation number, appears in the first column.
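Since the ID statement accepts more than one variable, you might also try something like the following sketch, which again assumes the basic data set from Example 6.1. Both clinic and subj should appear as the leading columns, in that order, in place of the observation number:

```sas
* Two ID variables replace the Obs column, in the order listed;
PROC PRINT data = basic;
   id clinic subj;
   var name expense;
RUN;
```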

Example 6.5

It is particularly useful to use the ID statement when observations are so long that SAS can't print them on one line. In that case, SAS breaks up the observations and prints them on two (or more) lines. When that happens, it is helpful to use an ID variable (or more) so that you can keep track of the observations. The following SAS code illustrates such a situation:

OPTIONS LS = 64 PS = 58 NODATE;
PROC PRINT data = basic;
   id name;
   var subj name clinic gender 
       subj no_vis type_vis expense;
RUN;

The SAS System
name	subj	name	clinic	gender
Alice Smith	1024	Alice Smith	LEWN	1
Maryann White	1167	Maryann White	LEWN	1
Thomas Jones	1168	Thomas Jones	ALTO	2
Benedictine Arnold	1201	Benedictine Arnold	ALTO	2
Felicia Ho	1302	Felicia Ho	MNMC	1
John Smith	1471	John Smith	MNMC	2
Jane Smiley	1980	Jane Smiley	MNMC	1

The SAS System
name	subj	no_vis	type_vis	expense
Alice Smith	1024	7	101	1001.98
Maryann White	1167	2	101	2999.34
Thomas Jones	1168	10	190	3904.89
Benedictine Arnold	1201	1	190	1450.23
Felicia Ho	1302	7	190	1209.94
John Smith	1471	6	187	1763.09
Jane Smiley	1980	5	190	3567.00

Let's take note of a couple of things about the code. Given that we are working with a small data set with just seven variables, I used the OPTIONS statement to intentionally shrink the line size to 64 so that SAS is unable to fit the requested variables on one line. Also, note that the variable name appears not only in the ID statement but also in the VAR statement. When that happens, SAS displays the variable twice in the output. Finally, note that subj is included twice in the VAR statement, again only to make sure that the requested variables do not fit on one line.

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described.

6.3 - Selecting Observations

By default, the PRINT procedure displays all of the observations in a SAS data set. You can control which observations are printed by:

using the FIRSTOBS= and OBS= options to tell SAS which range of observation numbers to print
using the WHERE statement to print only those observations that meet a certain condition

Example 6.6

The following SAS code uses the PRINT procedure's FIRSTOBS= and OBS= options to print the second, third, fourth and fifth observations of the basic data set:

OPTIONS LS = 75 PS = 58 NODATE;
PROC PRINT data = basic (FIRSTOBS = 2 OBS = 5);
   var subj name no_vis expense;
RUN;

The SAS System
Obs	subj	name	no_vis	expense
2	1167	Maryann White	2	2999.34
3	1168	Thomas Jones	10	3904.89
4	1201	Benedictine Arnold	1	1450.23
5	1302	Felicia Ho	7	1209.94

The FIRSTOBS= option tells SAS the first observation to print, and the OBS= option tells SAS the last observation to print. Both options must be placed in parentheses, and the parentheses must immediately follow the data set name in the DATA= option. You will get a syntax error if you try to use the options without also using the DATA= option. (Incidentally, if you don't use the DATA= option to tell SAS which data set to print, SAS will print the most recently created data set.)

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described.
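Keep in mind that OBS= gives the number of the last observation to print, not a count of observations. The following sketch, which assumes the basic data set from Example 6.1 and the default setting of the OBS= system option, uses each option by itself to show the defaults that apply when the other is omitted:

```sas
* FIRSTOBS= defaults to 1, so this prints observations 1 through 3;
PROC PRINT data = basic (OBS = 3);
   var subj name no_vis expense;
RUN;

* With no OBS= option, printing continues to the end, so this prints observations 6 and 7;
PROC PRINT data = basic (FIRSTOBS = 6);
   var subj name no_vis expense;
RUN;
```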

Example 6.7

The FIRSTOBS= and OBS= options tell SAS to print observations based on their observation numbers, whereas the WHERE statement tells SAS to print observations based on whether or not they meet the specified condition. The following SAS code uses the WHERE statement to tell SAS to print only those observations for which the value of the variable no_vis is greater than 5:

PROC PRINT data = basic;
   var name no_vis type_vis expense;
   where no_vis > 5;
RUN;

The SAS System
Obs	name	no_vis	type_vis	expense
1	Alice Smith	7	101	1001.98
3	Thomas Jones	10	190	3904.89
5	Felicia Ho	7	190	1209.94
6	John Smith	6	187	1763.09

Launch and run the SAS program, and review the output to convince yourself that SAS behaves as described. And then, a couple of things to note:

Only one WHERE statement takes effect in each PRINT procedure. If you specify more than one, SAS uses only the last one.
You can specify any variable in your SAS data set in a WHERE statement, not just the variables that appear in the VAR statement. (I like to place variables in both places just so I am able to see the effect of the WHERE statement.)
The WHERE statement works for both character and numeric variables. To specify a condition based on the value of a character variable, you must:
- Enclose the value in quotation marks, and
- Type the value using lowercase and uppercase letters exactly as it appears in the data set.
Any of the comparison operators — such as eq, ne, gt, lt, ge, or le — that you learned for if-then-else statements can be used in a WHERE statement.
You can also use logical operators, such as AND and OR, to select observations that meet multiple conditions.
You can also use a CONTAINS operator to select observations that include the specified substring.
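To make a few of these points concrete, here is a small sketch (assuming the basic data set from Example 6.1) that combines a condition on a character variable with a condition on a numeric variable. Given the data as entered, only John Smith and Jane Smiley should be printed, since Felicia Ho's expense at MNMC is below 1500:

```sas
* Character values are quoted and must match the case stored in the data set;
PROC PRINT data = basic;
   var name clinic expense;
   where clinic = 'MNMC' and expense gt 1500;
RUN;
```

Typing 'mnmc' in lowercase instead should select no observations, because the comparison is case-sensitive.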

Example 6.8

The following SAS code uses the CONTAINS operator to select observations in the basic data set for which the name variable contains the substring 'Smi':

PROC PRINT data = basic;
   var name gender no_vis type_vis expense;
   where name contains 'Smi';
RUN;

The SAS System
Obs	name	gender	no_vis	type_vis	expense
1	Alice Smith	1	7	101	1001.98
6	John Smith	2	6	187	1763.09
7	Jane Smiley	1	5	190	3567.00

First, launch and run the SAS program, and review the output to convince yourself that SAS behaves as described. Then, change the word contains in the WHERE statement to a question mark (?), and re-run the SAS program. If you review the output again and note that it is identical to the previous output, you should be able to convince yourself that the ? operator is equivalent to the contains operator.
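For reference, the substitution described above would look something like this, again assuming the basic data set from Example 6.1:

```sas
* The question mark is shorthand for the CONTAINS operator;
PROC PRINT data = basic;
   var name gender no_vis type_vis expense;
   where name ? 'Smi';
RUN;
```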

6.4 - Sorting Data

By default, the PRINT procedure displays observations in the order in which they appear in your data set. Alternatively, you can use the SORT procedure to first sort your data set based on the values of one or more variables. Then, when you use the PRINT procedure, SAS will display the observations in the order in which you sorted the data.

Example 6.9

The following SAS program uses the SORT procedure to sort the basic data set first by clinic name (clinic) and then by the number of visits (no_vis) before printing the (sorted) data set srtd_basic:

PROC SORT data = basic out = srtd_basic;
   by clinic no_vis;
RUN;
PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;

The SAS System
clinic	no_vis	subj	name	gender	type_vis	expense
ALTO	1	1201	Benedictine Arnold	2	190	1450.23
ALTO	10	1168	Thomas Jones	2	190	3904.89
LEWN	2	1167	Maryann White	1	101	2999.34
LEWN	7	1024	Alice Smith	1	101	1001.98
MNMC	5	1980	Jane Smiley	1	190	3567.00
MNMC	6	1471	John Smith	2	187	1763.09
MNMC	7	1302	Felicia Ho	1	190	1209.94

First, launch and run the SAS program, and review the output to convince yourself that the output from the srtd_basic data set is in order first by clinic and then by no_vis.

Then, note that while the SORT procedure's BY statement is required, its OUT= option is optional. If you don't use it, however, then the SORT procedure replaces the data set that is specified in the DATA= option with its sorted version. Therefore, if you need your data to be sorted just to produce output temporarily, and you want to keep the original order of the data set, then you should use the OUT= option in conjunction with a temporary SAS data set name.

Oh, I guess you should also note that, by default, SAS sorts the values of the variables appearing in the BY statement in ascending order. If you want them sorted in descending order, you need to use the BY statement's DESCENDING option.

Example 6.10

The following SAS program uses the BY statement's DESCENDING option to tell SAS to sort the basic data set first by clinic name (clinic) in descending order, and then by the number of visits (no_vis) in ascending order:

PROC SORT data = basic out = srtd_basic;
   by descending clinic no_vis;
RUN;
PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;

The SAS System
clinic	no_vis	subj	name	gender	type_vis	expense
MNMC	5	1980	Jane Smiley	1	190	3567.00
MNMC	6	1471	John Smith	2	187	1763.09
MNMC	7	1302	Felicia Ho	1	190	1209.94
LEWN	2	1167	Maryann White	1	101	2999.34
LEWN	7	1024	Alice Smith	1	101	1001.98
ALTO	1	1201	Benedictine Arnold	2	190	1450.23
ALTO	10	1168	Thomas Jones	2	190	3904.89

First, launch and run the SAS program, and review the output to convince yourself that the output from the srtd_basic data set is in descending order of clinic and in ascending order of no_vis. That is, if your BY statement contains more than one variable, then the DESCENDING option applies only to the variable that immediately follows it. You might want to sandwich another DESCENDING between clinic and no_vis in the BY statement and then re-run the SAS program to see the effect.
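The suggested variation might look like the following sketch. With a DESCENDING keyword in front of each BY variable, the clinics should still run from MNMC to ALTO, but within each clinic the patients with the most visits should now come first:

```sas
* DESCENDING must be repeated before each variable it applies to;
PROC SORT data = basic out = srtd_basic;
   by descending clinic descending no_vis;
RUN;
PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;
```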

6.5 - Column Totals

There may be situations in which you want SAS to calculate and present column totals for some of the numeric variables appearing in your reports. In that case, you'll want to take advantage of the SUM statement. We'll investigate the use of the statement here.

Example 6.11

The following SAS code uses the PRINT procedure's SUM statement to generate a report of the total number of visits (no_vis) for patients undergoing physical therapy (type_vis = 190):

PROC PRINT data = basic;
   id name;
   var clinic no_vis;
   where type_vis = 190;
   sum no_vis;
RUN;

The SAS System
name	clinic	no_vis
Thomas Jones	ALTO	10
Benedictine Arnold	ALTO	1
Felicia Ho	MNMC	7
Jane Smiley	MNMC	5
======
		23

The ID statement tells SAS to suppress the observation number and to place the variable name in the first column of the output. The WHERE statement tells SAS to print only the observations pertaining to a physical therapy appointment (type_vis = 190). The SUM statement tells SAS to provide the total number of visits (no_vis) the patients undergoing physical therapy had.

Launch and run the SAS program, and review the output to convince yourself that the report is generated as described.

Incidentally, when you specify a variable in a SUM statement, the variable is also implicitly assumed to be present in the VAR statement. This prevents you from having to type the variable names twice, once in the VAR statement and once in the SUM statement. You might want to delete no_vis in the VAR statement and re-run the SAS program to convince yourself that no_vis is still printed in the report because its name appears in the SUM statement.
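The suggested experiment amounts to something like the sketch below, which assumes the basic data set from Example 6.1. Here no_vis is named only in the SUM statement, yet it should still appear as a column (after the variables listed in the VAR statement) along with its total:

```sas
* no_vis is omitted from VAR but is still printed because of SUM;
PROC PRINT data = basic;
   id name;
   var clinic;
   where type_vis = 190;
   sum no_vis;
RUN;
```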

Example 6.12

There may be situations in which you want not just column totals, but also column subtotals. Using the PRINT procedure's BY statement, you can tell SAS to print observations in groups based on the values of the different BY variables. When a SUM statement is specified in the presence of a BY statement, SAS produces subtotals each time the value of a BY variable changes.

The following SAS program illustrates the use of the BY statement in conjunction with the SUM statement to print the data in our basic data set in three groups based on the value of clinic, as well as display the total expense for each of the three groups separately:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

---------------------------------- clinic=ALTO ----------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
1	1168	Thomas Jones	10	190	3904.89
2	1201	Benedictine Arnold	1	190	1450.23
------			------
clinic			5355.12

---------------------------------- clinic=LEWN ----------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
3	1024	Alice Smith	7	101	1001.98
4	1167	Maryann White	2	101	2999.34
------			------
clinic			4001.32

---------------------------------- clinic=MNMC ----------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
5	1302	Felicia Ho	7	190	1209.94
6	1471	John Smith	6	187	1763.09
7	1980	Jane Smiley	5	190	3567.00
------			------
clinic			6540.03
======
15896.47

As you'll see is always the case, whenever a BY statement is used in any DATA step or procedure, the data set must first be sorted in order based on the variables specified in the BY statement. If not, SAS will halt execution of the step and print a message in the log indicating that the data set is not properly sorted. The SORT procedure prepares the data for the PRINT procedure's BY statement by sorting the basic data set by clinic and storing the sorted data in a new data set called srtd_basic. The PRINT procedure then tells SAS to print srtd_basic in three groups (by clinic): one for ALTO, one for LEWN, and one for MNMC. The PRINT procedure also tells SAS to sum the expenses (sum expense) for each clinic separately, as well as provide a grand total of all expenses.

Launch and run the SAS program, and review the resulting output. Your output should display the observations in three groups (ALTO, LEWN, and MNMC) and the variable expense should be added up across the subjects and then across the three groups.

Example 6.13

If you take a look at the output from the previous example, you should see that the columns don't line up across the three clinics. The UNIFORM option tells SAS to make sure the columns of data line up from one group to the next. Without the UNIFORM option, the PRINT procedure works to fit as many variables and observations on the page as possible. As a result, printed columns can be shifted from one group to the next. In the PRINT procedure in the previous example, no UNIFORM option was specified. Therefore, since a different number of characters is needed for name in each of the three groups, the columns are not aligned.

The PRINT procedure in the following SAS program illustrates use of the UNIFORM option to remedy this problem:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

------------------------------ clinic=ALTO ------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
1	1168	Thomas Jones	10	190	3904.89
2	1201	Benedictine Arnold	1	190	1450.23
------			------
clinic			5355.12

------------------------------ clinic=LEWN ------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
3	1024	Alice Smith	7	101	1001.98
4	1167	Maryann White	2	101	2999.34
------			------
clinic			4001.32

------------------------------ clinic=MNMC ------------------------------

The SAS System
Obs	subj	name	no_vis	type_vis	expense
5	1302	Felicia Ho	7	190	1209.94
6	1471	John Smith	6	187	1763.09
7	1980	Jane Smiley	5	190	3567.00
------			------
clinic			6540.03
======
15896.47

The only difference between this PRINT procedure and the previous one is that the UNIFORM option has been included here. Launch and run the SAS program, and review the resulting output to convince yourself that the columns across the three groups are now properly aligned.

Example 6.14

In the output from the previous two examples, you might have noticed that redundant information is displayed for each group. For example, the BY variable clinic is identified across the top of the data for each group, as well as for the subtotal for each group. To show the BY variable heading only once, you can use an ID statement and a BY statement in conjunction with the SUM statement. When an ID statement specifies the same variable as the BY statement:

The observation number is suppressed
The ID variable is printed as the first column of the report
Each value of the ID variable is printed only at the start of each BY group and on the line that contains that group's subtotal.

The PRINT procedure in the following SAS program illustrates the use of the ID statement to control the headings displayed in the report for each clinic:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;
PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
RUN;

The SAS System
clinic	subj	name	no_vis	type_vis	expense
ALTO	1168	Thomas Jones	10	190	3904.89
	1201	Benedictine Arnold	1	190	1450.23
------					------
ALTO					5355.12
LEWN	1024	Alice Smith	7	101	1001.98
	1167	Maryann White	2	101	2999.34
------					------
LEWN					4001.32
MNMC	1302	Felicia Ho	7	190	1209.94
	1471	John Smith	6	187	1763.09
	1980	Jane Smiley	5	190	3567.00
------					------
MNMC					6540.03
======
15896.47

Launch and run the SAS program, and review the resulting output to convince yourself that the headings for the three clinics appear as described.

Example 6.15

Rather than having SAS display the output for each group on the same page, you can take advantage of the PAGEBY statement to tell SAS to print each group on a separate page. The following SAS program creates the same output as the previous example, except here we request that the clinics be printed on separate pages:

PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;

PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
   pageby clinic;
RUN;

The PAGEBY statement tells SAS to print the data for each clinic on a separate page. Note that the variable that is specified in the PAGEBY statement must also be specified in the PRINT procedure's BY statement.

Launch and run the SAS program. You should see that each clinic is represented on a separate page — the first page:

clinic	subj	name	no_vis	type_vis	expense
ALTO	1168	Thomas Jones	10	190	3904.89
	1201	Benedictine Arnold	1	190	1450.23
------					------
ALTO					5355.12

the second page:

clinic	subj	name	no_vis	type_vis	expense
LEWN	1024	Alice Smith	7	101	1001.98
	1167	Maryann White	2	101	2999.34
------					------
LEWN					4001.32

and the third page:

clinic	subj	name	no_vis	type_vis	expense
MNMC	1302	Felicia Ho	7	190	1209.94
	1471	John Smith	6	187	1763.09
	1980	Jane Smiley	5	190	3567.00
------					------
MNMC					6540.03
======
15896.47

6.6 - Output Appearance

So far, we've focused on how to alter the content and structure of our PRINT procedure's output. Now, we'll focus a bit on how to "prettify" our output using TITLE and FOOTNOTE statements and the DOUBLE option.

Example 6.16

The following PRINT procedure merely prints our basic data set, but this time with helpful TITLE and FOOTNOTE statements:

OPTIONS LS = 72 PS = 20 NODATE NONUMBER;
PROC PRINT data = basic;
   title 'Our BASIC Data Set';
   footnote1 'Clinic: ALTO = Altoona,  LEWN = Lewistown,  MNMC = Mount Nittany';
   footnote3 'Type_vis: 101 = Gynecology, 190 = Physical Therapy, 187 = Cardiology';
   footnote5 'Gender: 1 = female,  2 = male';
RUN;
footnote;

Our BASIC Data Set
Obs	subj	name	clinic	gender	no_vis	type_vis	expense
1	1024	Alice Smith	LEWN	1	7	101	1001.98
2	1167	Maryann White	LEWN	1	2	101	2999.34
3	1168	Thomas Jones	ALTO	2	10	190	3904.89
4	1201	Benedictine Arnold	ALTO	2	1	190	1450.23
5	1302	Felicia Ho	MNMC	1	7	190	1209.94
6	1471	John Smith	MNMC	2	6	187	1763.09
7	1980	Jane Smiley	MNMC	1	5	190	3567.00
Clinic: ALTO = Altoona, LEWN = Lewistown, MNMC = Mount Nittany

Type_vis: 101 = Gynecology, 190 = Physical Therapy, 187 = Cardiology

Gender: 1 = female, 2 = male

The TITLE and FOOTNOTE statements contained within the PRINT procedure are fairly self-explanatory. In general, though, the TITLE and FOOTNOTE statements can appear anywhere in your code, as they are global statements. As such, they each work as a "toggle" statement: once you specify a title and footnote, they are used for all of the subsequent output your program generates until you define another title and footnote or cancel them with empty TITLE and FOOTNOTE statements. The last footnote statement in the above code is an empty footnote statement that just "turns off" the previously specified footnotes.

You can have up to ten title lines and ten footnote lines in effect at any one time, each denoted by a number: title1, title2, ..., title10 and footnote1, footnote2, ..., footnote10. The number tells SAS on which lines you'd like the title or footnote printed. The footnotes in the above program tell SAS to print the footnotes on the first, third, and fifth footnote lines. That's why there is a blank line between each of the footnotes.

Launch and run the SAS program, and review the resulting output to convince yourself that the title and footnotes are displayed as described. Note too that titles and footnotes are centered by default.

To make sure you understand the global nature of the TITLE and FOOTNOTE statements, you might want to try submitting another simple PRINT procedure:

PROC PRINT;
RUN;

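to see what happens. In the output window, you should see another printout, with no footnotes but with the same title as the previous output. Because no DATA= option is specified, SAS prints the most recently created data set: basic if you have run only Example 6.1 in this session, or srtd_basic if you have also run the SORT procedures from the earlier examples.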
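As a further illustration of numbered titles, here is a minimal sketch that places text on the first two title lines and then cancels all titles with an empty TITLE statement, just as the empty FOOTNOTE statement did for footnotes. The wording of the second title is made up for this illustration, and the basic data set from Example 6.1 is assumed:

```sas
* title1 and title2 print on the first and second title lines;
PROC PRINT data = basic;
   title1 'Our BASIC Data Set';
   title2 'All Clinics';
   var name clinic expense;
RUN;

* An empty TITLE statement cancels all titles currently in effect;
title;
```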

Example 6.17

If you want to make your output more readable by double-spacing it, you can use the PRINT procedure's DOUBLE option. The following SAS program prints six variables in the basic data set using double-spacing:

OPTIONS PS = 58 LS = 72;
PROC PRINT data = basic NOOBS DOUBLE;
   title 'Our BASIC Data Set';
   var subj name clinic no_vis type_vis expense;
RUN;

Our BASIC Data Set
subj	name	clinic	no_vis	type_vis	expense
1024	Alice Smith	LEWN	7	101	1001.98
1167	Maryann White	LEWN	2	101	2999.34
1168	Thomas Jones	ALTO	10	190	3904.89
1201	Benedictine Arnold	ALTO	1	190	1450.23
1302	Felicia Ho	MNMC	7	190	1209.94
1471	John Smith	MNMC	6	187	1763.09
1980	Jane Smiley	MNMC	5	190	3567.00

Launch and run the SAS program, and review the resulting output to convince yourself that the output is double-spaced as described.

6.7 - Descriptive Labels

There may be some cases in which your variable names would not be particularly meaningful to other people reading your reports. For example, the variables q1_08, q2_08, and q3_08 might represent, respectively, your company's sales in the first, second, and third quarters of 2008. Perhaps, then, you'd want to label q1_08 as "Sales First Quarter 2008", q2_08 as "Sales Second Quarter 2008", and so on. In order to label the columns in your report as such, you need to use:

a LABEL statement to assign a descriptive label to a variable, and
the LABEL option in the PROC PRINT statement to specify that labels, rather than variable names, be displayed.

The LABEL statement can be placed either in a DATA step or directly in the PRINT procedure. When you place the LABEL statement in a DATA step, the label gets permanently affixed to the variable and therefore is available for all subsequent procedures. That is, you permanently change the variable's label attribute. When you place the LABEL statement directly in the PRINT procedure, the label is available for use only in the PRINT procedure in which it is specified.
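As a sketch of the DATA step approach, the following code copies basic into a new data set and attaches labels to two of its variables there. The data set name basic_lbl is a hypothetical name chosen only for this illustration. Because the labels are stored with the data set, the PRINT procedure needs no LABEL statement of its own, although it still needs the LABEL option to display them:

```sas
* Labels assigned in a DATA step are stored with the new data set;
DATA basic_lbl;
   set basic;
   label no_vis = 'Number of Visits'
         expense = 'Expense';
RUN;

* No LABEL statement here, only the LABEL option;
PROC PRINT data = basic_lbl LABEL;
   var name no_vis expense;
RUN;
```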

As a default, SAS does not print labels. You must use the LABEL option to tell it to do so. Let's take a look at a couple of examples.

Example 6.18

The following SAS program illustrates the use of the LABEL option in conjunction with the LABEL statement in the PRINT procedure:

PROC PRINT data = basic LABEL;
   label name = 'Name'
         no_vis = 'Number of Visits'
         type_vis = 'Type of Visit'
         expense = 'Expense';
   id name;
   var no_vis type_vis expense;
RUN;

The SAS System
Name	Number of Visits	Type of Visit	Expense
Alice Smith	7	101	1001.98
Maryann White	2	101	2999.34
Thomas Jones	10	190	3904.89
Benedictine Arnold	1	190	1450.23
Felicia Ho	7	190	1209.94
John Smith	6	187	1763.09
Jane Smiley	5	190	3567.00

As you can see, the LABEL statement assigns a label to four variables — name, no_vis, type_vis, and expense. The syntax of any LABEL statement must match the syntax of the LABEL statement in this program, namely first the LABEL keyword, the variable name, an equals sign (=), and finally a descriptive label (up to 256 characters long) in quotation marks. If you forget to close the label off with a quotation mark, SAS will be sure to let you know that you have a problem with your code. The LABEL option merely tells SAS to use the assigned labels when printing the report.

Launch and run the SAS program, and review the resulting output to convince yourself the labels were printed as expected. Then, you might want to remove the LABEL option and re-run the SAS program to see that the labels that are assigned to the variables within the PRINT procedure are printed only if you also specify the LABEL option.

Example 6.19

If you look at the output from the previous program, you'll see that SAS does what it can to fit longer labels, such as Number of Visits, into the column headings. You can instead control where SAS splits long labels by using the SPLIT= option. By using the option, you tell SAS to split the labels wherever the designated split character appears. The following SAS code illustrates the PRINT procedure's SPLIT= option:

PROC PRINT data = basic SPLIT='/';
   label name = 'Name';
   label no_vis = 'Number of/Visits';
   label type_vis = 'Type of Visit';
   label expense = 'Expense';
   id name;
   var no_vis type_vis expense;
RUN;

The SAS System
Name	Number of Visits	Type of Visit	Expense
Alice Smith	7	101	1001.98
Maryann White	2	101	2999.34
Thomas Jones	10	190	3904.89
Benedictine Arnold	1	190	1450.23
Felicia Ho	7	190	1209.94
John Smith	6	187	1763.09
Jane Smiley	5	190	3567.00

This program also illustrates that you can assign labels in a single LABEL statement, as we did in the previous program ... or in multiple LABEL statements, as we did here. It also illustrates that you need not specify a LABEL option in the presence of the SPLIT= option... the LABEL option is implicit. The SPLIT= option tells SAS to split (no kidding?) the label defined here for the variable no_vis wherever the character '/' appears. That is, instead of printing the variable name no_vis, SAS prints a two-row heading: "Number of" on the first row and "Visits" on the second row.

Launch and run the SAS program, and review the resulting output to convince yourself the labels were printed as expected.

6.8 - Formatting Data Values

You might recall that informats are used to tell SAS how to read special data values into your SAS data sets, and formats are used to tell SAS how to display those special data values in your reports. As you might recall from your prior (but admittedly brief) work with dates, when SAS stores special data values, it doesn't necessarily store numbers that would be meaningful to a casual reader of your reports. As a result, you have to use a FORMAT statement to tell SAS to display the stored numbers in a way that is meaningful to you and your readers. Let's look at an example of the use of the FORMAT statement in the PRINT procedure.

Example 6.20

The following SAS program illustrates the use of the FORMAT statement to tell SAS to display the expense variable using the dollar9.2 format:

PROC PRINT data = basic LABEL;
   label name = 'Name'
         clinic = 'Clinic'
         expense = 'Expense';
   format expense dollar9.2;
   id name;
   var clinic expense;
RUN;

Name	Clinic	Expense
Alice Smith	LEWN	$1,001.98
Maryann White	LEWN	$2,999.34
Thomas Jones	ALTO	$3,904.89
Benedictine Arnold	ALTO	$1,450.23
Felicia Ho	MNMC	$1,209.94
John Smith	MNMC	$1,763.09
Jane Smiley	MNMC	$3,567.00

The FORMAT statement tells SAS to associate, for the duration of the PRINT procedure, the dollar9.2 format with the expense variable. As soon as the PRINT procedure closes, the association no longer holds. The dollar9.2 format tells SAS to display the expense values using dollar signs, commas (when appropriate), and two decimal places. The 9 tells SAS that it will need at most 9 spaces to accommodate each expense value: 1 for the dollar sign, 1 for the comma, 4 for the digits before the decimal point, 1 for the decimal point, and 2 for the digits after the decimal point.

Launch and run the SAS program, and review the resulting output to convince yourself the expense variable values were printed as described. Then, you might want to change the 9 in the dollar9.2 format to an 8 and re-run the SAS program to see that SAS drops the comma in order to fit the values into the eight allocated spaces. Then, if you're still having fun, you might want to change the entire dollar9.2 format to the comma8.2 format and re-run the SAS program to familiarize yourself with the comma format.
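The last of those experiments might look like the following sketch, assuming the basic data set from Example 6.1. With comma8.2, the expense values should be displayed with commas and two decimal places but without a dollar sign:

```sas
* Same report as Example 6.20, with comma8.2 in place of dollar9.2;
PROC PRINT data = basic LABEL;
   label name = 'Name'
         clinic = 'Clinic'
         expense = 'Expense';
   format expense comma8.2;
   id name;
   var clinic expense;
RUN;
```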

In general, you can use a separate FORMAT statement for each variable, or you can format several variables in a single FORMAT statement. The table below illustrates some of the most commonly used SAS formats:

Format	Specifies These Values	Example
COMMAw.d	that contain commas and decimal places	`comma8.2`
DOLLARw.d	that contain dollar signs, commas, and decimal places	`dollar6.2`
MMDDYYw.	as date values of the form 10/03/08 (mmddyy8.) or 10/03/2008 (mmddyy10.)	`mmddyy10.`
w.	rounded to the nearest integer in w spaces	`7.`
w.d	rounded to d decimal places in w spaces	`8.2`
$w.	as character values in w spaces	`$12.`
DATEw.	as date values of the form 02OCT08 (date7.) or 02OCT2008 (date9.)	`date9.`

Of course, you can find the other formats that are available using the SAS Help and Documentation.
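To illustrate formatting several variables in a single FORMAT statement, here is a minimal sketch that assumes the basic data set from Example 6.1. The choice of the 3. format for the two visit variables is only for illustration; a format listed after several variable names applies to all of them:

```sas
* expense gets dollar9.2 and both no_vis and type_vis get the 3. format;
PROC PRINT data = basic;
   format expense dollar9.2 no_vis type_vis 3.;
   id name;
   var no_vis type_vis expense;
RUN;
```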

6.9 - Summary

In this lesson, we learned about the various features and options of the PRINT procedure that enable us to create reports.

The homework assignment for this lesson will give you practice with these techniques.

