Lesson 12: Summarizing Categorical Data

Overview

In this lesson, we'll investigate the FREQ procedure as a tool for summarizing and analyzing categorical data. The FREQ procedure is both a descriptive procedure and a statistical procedure. It produces frequency and crosstabulation tables ranging from one-way to n-way. For two-way tables, the FREQ procedure also computes chi-square tests and measures of association. For n-way tables, it also performs stratified analyses, computing statistics within and across strata. Finally, the FREQ procedure can output summary statistics, such as counts and percentages, to a SAS data set.
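
As a rough preview of the kinds of table requests described above, here is a minimal sketch. The data set name survey and the variables sex and smoker are hypothetical, invented purely for illustration; they do not come from this lesson's data.

```sas
/* Hypothetical toy data (an assumption for illustration): one row per respondent */
DATA survey;
    INPUT sex $ smoker $;
    DATALINES;
F Yes
F No
M No
M Yes
F No
M No
;
RUN;

PROC FREQ DATA=survey;
    TABLES sex;           /* one-way frequency table of sex */
    TABLES sex*smoker;    /* two-way crosstabulation: rows = sex, columns = smoker */
RUN;
```

Each TABLES statement produces its own table, so a single call to the FREQ procedure can generate several summaries of the same data set.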

Objectives

Upon completing this lesson, you should be able to use the FREQ procedure to summarize a data set numerically in a variety of ways. In particular, you should be able to:

  • create simple one-way, two-way, ... and n-way table summaries
  • suppress the printing of cumulative statistics in a table using the NOCUM option
  • print only one table per page using the PAGE option
  • read the values from a two-way table created by the FREQ procedure
  • create two-way (and in general, n-way) tables using the available shortcuts
  • suppress some of the default output in each of the cells of an n-way table using the NOROW, NOCOL, and NOPERCENT options
  • request additional output in each of the cells of an n-way table, such as EXPECTED, DEVIATION, and CELLCHI2
  • print n-way tables in a list format rather than as crosstabulation tables using the LIST and CROSSLIST tables options
  • perform a separate analysis for each BY group using a BY statement
  • treat missing values as a valid (non-missing) level, and therefore include them in the calculation of the statistics, using the MISSING tables option
  • display the frequencies of missing values in the tables, but exclude them from the calculation of the percentages and statistics, using the MISSPRINT tables option
  • create new SAS data sets containing summary statistics of categorical variables using the FREQ procedure
  • suppress printing the n-way crosstabulation using the NOPRINT tables option
  • print information about all possible combinations of levels of the variables in the table request, even when some combinations of levels do not occur in the data, using the SPARSE tables option
  • invoke statistics table options, such as CHISQ, MEASURES, CMH, ALL, and EXACT
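
To give a feel for how several of these options fit together, here is a hedged sketch rather than a definitive recipe. It assumes the SASHELP sample library, with its cars data set and the variables origin and type, is available in your SAS session. The output data set name car_counts is made up for illustration.

```sas
/* Assumption: the sashelp.cars sample data set is available in your session */
PROC FREQ DATA=sashelp.cars;
    /* one-way table without cumulative frequencies and cumulative percentages */
    TABLES origin / NOCUM;

    /* two-way table showing only counts in each cell, plus a chi-square test */
    TABLES origin*type / NOROW NOCOL NOPERCENT CHISQ;

    /* the same two-way request printed in list format */
    TABLES origin*type / LIST;

    /* suppress printing, keep combinations that do not occur in the data,
       and write the counts and percentages to a (hypothetically named) data set */
    TABLES origin*type / NOPRINT SPARSE OUT=car_counts;
RUN;

/* inspect the summary data set created by the OUT= option */
PROC PRINT DATA=car_counts;
RUN;
```

Options always follow a slash (/) at the end of the TABLES statement and apply only to the tables requested in that statement, which is why each request above carries its own option list.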