11.5 - Interaction Plots

Creating an interaction plot is a good example of a situation in which you first need to create a summarized data set. We'll take a look at such an example in this section. If you haven't taken a course on analysis of variance yet, such as Stat 502, and therefore don't yet know what an interaction plot is, don't fret. You'll get the basic idea here.

Example 11.13

The following program uses data from the ICDB Background data set to illustrate how to create a simple plot to depict whether an interaction exists between two class variables, sex and race, when the analysis variable of interest is education level (ed_level):

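/* Assumes the libref icdb has already been assigned with a LIBNAME statement */

/* Sort a working copy of the Background data for BY-group processing */
PROC SORT data=icdb.back out=back;
   by sex race;
RUN;

/* Compute the mean education level for each sex and race combination */
PROC MEANS data=back noprint;
   by sex race;
   var ed_level;
   output out=meaned mean=mn_edlev;
RUN;

/* Print the summarized data set */
PROC PRINT data=meaned;
   title 'Mean Education Level for Sex and Race combinations';
RUN;

/* Plot the means against race, using sex as the plotting symbol */
PROC PLOT data=meaned;
   title 'Interaction Plot of SEX, RACE, and Mean Education Level';
   plot mn_edlev*race=sex;
RUN;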

Let's review the code. The SORT procedure merely prepares the Background data set for BY-group processing. The MEANS procedure calculates the mean education level ("var ed_level") for each sex and race combination ("by sex race"). The NOPRINT option suppresses the procedure's usual printed output, and the OUTPUT statement tells SAS to write the results to a new data set called meaned, storing each mean in the variable mn_edlev. The PRINT procedure then prints the meaned data set, which, as you'll see when you run the code, looks like this:

Mean Education Level for Sex and Race combinations
Obs    sex    race    _TYPE_    _FREQ_    mn_edlev
  1     1       2        0          3      4.66667
  2     1       3        0          1      5.00000
  3     1       4        0         51      3.47059
  4     1       7        0          1      3.00000
  5     2       1        0          2      4.00000
  6     2       2        0          4      3.75000
  7     2       3        0         28      3.42857
  8     2       4        0        542      3.70849
  9     2       5        0          3      3.33333
 10     2       6        0          2      2.50000
 11     2       8        0          1      3.00000

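As we'd expect, the data set contains one observation for each sex and race combination that appears in the data. The primary variable is mn_edlev, the average education level of the subjects of that sex and race combination. The automatic variable _FREQ_ gives the number of observations in each combination (note that several combinations contain only a handful of subjects), and _TYPE_ is 0 throughout because the groups were formed with a BY statement rather than a CLASS statement.

Once the meaned data set is created, all we need to do is use the means in the data set to create an interaction plot. The PLOT procedure tells SAS to plot the mean education level (mn_edlev) on the y-axis and race (race) on the x-axis. The "=sex" part of the PLOT statement tells SAS to use the first character of the value of sex as the plotting symbol, so each (race, mn_edlev) point appears as a 1 or a 2. To read the plot, compare the pattern of the 1s with the pattern of the 2s as race changes: if the two patterns are roughly parallel, there is little evidence of an interaction, whereas patterns that cross or diverge suggest that the effect of race on mean education level depends on sex.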
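Returning to the summarization step, the same set of means can usually be produced without sorting first, by replacing the BY statement with a CLASS statement. The following is only a sketch of that variation, not the program used in this lesson. The output data set name meaned_nway is made up for illustration, and the NWAY option is there so that only the full sex-by-race combinations are written out.

```sas
/* Sketch of an alternative: CLASS + NWAY instead of PROC SORT + BY.       */
/* meaned_nway is a hypothetical name chosen so meaned is not overwritten. */
PROC MEANS data=icdb.back noprint nway;
   class sex race;          /* grouping variables; no prior sort required   */
   var ed_level;
   output out=meaned_nway mean=mn_edlev;
RUN;

PROC PRINT data=meaned_nway;
   title 'Mean Education Level for Sex and Race combinations (CLASS version)';
RUN;
```

Two differences are worth keeping in mind. With two CLASS variables and NWAY, _TYPE_ is 3 rather than 0. Also, by default a CLASS statement drops observations with a missing value for sex or race, whereas BY-group processing treats missing values as a group of their own, so the two versions agree only when the class variables have no missing values.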

Before you run this program, you'll need to download the Background data set (right-click on its link) and save it to your computer. You should store it in the same location in which you saved the permanent icdb.hem2 data set. Then launch and run the SAS program, and review the resulting plot. You should see the interaction plot as advertised.
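The PLOT procedure produces a line-printer plot, in which it can be hard to follow each sex across the race categories. If ODS Graphics is available in your SAS installation (an assumption, since the lesson itself uses only PROC PLOT), a sketch like the following draws the same means with one connected line per sex. It relies on meaned already being ordered by sex and race, which is the case when it is created with the BY statement above.

```sas
/* Sketch only: the same interaction plot drawn with PROC SGPLOT.        */
/* Assumes the meaned data set from the program above already exists.    */
PROC SGPLOT data=meaned;
   title 'Interaction Plot of SEX, RACE, and Mean Education Level';
   series x=race y=mn_edlev / group=sex markers;   /* one line per sex   */
   xaxis type=discrete label='Race';               /* race codes as categories */
   yaxis label='Mean education level';
RUN;
```

Because not every race code occurs for both sexes, the two lines will not span the same set of categories, so it is safest to compare them only over the race values they have in common.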