Overview
In this lesson, we investigate statistical analyses that are typically performed on two or more continuous numeric variables. Specifically, we cover:
- the GPLOT procedure, to create publication-quality x-y scatter plots of any two numeric variables in a SAS data set
- the CORR procedure, to compute various correlation coefficients between two or more numeric variables in a SAS data set
- the REG procedure, to perform a regression analysis on any subset of numeric variables in a SAS data set
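As a rough preview of what these three procedures look like in a program, here is a minimal sketch. The data set work.example and the variables x, y, and z are hypothetical placeholders made up for illustration; they are not taken from the lesson's own examples. The sketch also assumes that SAS/GRAPH is available for the GPLOT procedure.

```sas
/* Hypothetical data set: x, y, and z are placeholder numeric variables */
data work.example;
   input x y z;
   datalines;
1 2.1 5
2 3.9 3
3 6.2 4
4 7.8 1
5 10.1 2
;
run;

/* Scatter plot of y (vertical axis) against x (horizontal axis) */
proc gplot data=work.example;
   plot y*x;
run;
quit;

/* Pearson correlation coefficients among x, y, and z */
proc corr data=work.example;
   var x y z;
run;

/* Simple linear regression of y on x */
proc reg data=work.example;
   model y = x;
run;
quit;
```

GPLOT and REG are interactive procedures, which is why each of those steps ends with a QUIT statement in addition to RUN.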
Objectives
Upon completion of this lesson, you should be able to:
- use the CORR procedure to tell SAS to calculate Pearson correlation coefficients among a set of numeric variables
- use the CORR procedure's SPEARMAN, KENDALL, and HOEFFDING options to tell SAS to calculate alternative coefficients
- read typical CORR procedure output to extract the calculated correlations and their associated P-values
- use the CORR procedure's WITH statement to tell SAS to calculate only the correlation coefficients between the variables in the WITH statement and the variables in the VAR statement
- understand how sample size can affect the significance of a correlation coefficient
- interpret a correlation coefficient
- use the CORR procedure's PARTIAL statement to tell SAS to calculate partial correlations among variables
- use the CORR procedure's BEST=n option to tell SAS to print only the first n of the ordered estimated correlations for each variable
- use the REG procedure to compute a regression equation between two numeric variables
- use the REG procedure's MODEL statement to tell SAS which variable to treat as the response variable and which variable or variables to treat as the predictors
- read the typical SAS output from a regression analysis to extract key information, such as parameter estimates, confidence intervals, and P-values
- use the REG procedure's PLOT statement to request residual diagnostic plots
- use the GPLOT procedure to request plots containing estimated regression equations, 95% confidence intervals about the mean of y, and 95% prediction intervals about the individual y-values
- use the REG procedure to conduct a regression analysis involving quadratic terms
- use the REG procedure to conduct a regression analysis involving transformed variables
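To make the options and statements named in these objectives more concrete, the following sketch shows one way they might appear in a program. It is illustrative only: the data set work.demo and the variables x, y, z, xsq, and logy are hypothetical names chosen here, not the lesson's own examples. The PLOT statement in the REG procedure produces traditional graphics, so depending on your SAS version you may need to turn ODS Graphics off before using it.

```sas
/* Hypothetical data; xsq and logy are created here for the later models */
data work.demo;
   input x y z;
   xsq  = x**2;     /* quadratic term */
   logy = log(y);   /* natural-log transformation of the response */
   datalines;
1 2.1 5
2 3.9 3
3 6.2 4
4 7.8 1
5 10.1 2
6 11.7 6
;
run;

/* Pearson plus alternative coefficients; BEST=2 limits the printed correlations */
proc corr data=work.demo pearson spearman kendall hoeffding best=2;
   var x y z;
run;

/* Only the correlations of y (WITH) with x and z (VAR) */
proc corr data=work.demo;
   var x z;
   with y;
run;

/* Partial correlation between x and y, adjusting for z */
proc corr data=work.demo;
   var x y;
   partial z;
run;

/* y is the response, x the predictor; CLB adds confidence limits for the estimates */
proc reg data=work.demo;
   model y = x / clb;
   plot r.*p.;      /* residuals against predicted values */
run;
quit;

/* Regression with a quadratic term */
proc reg data=work.demo;
   model y = x xsq;
run;
quit;

/* Regression on a transformed response */
proc reg data=work.demo;
   model logy = x;
run;
quit;

/* Fitted line with 95% confidence limits for the mean of y;
   use interpol=rlcli95 instead for 95% prediction limits */
symbol1 value=dot interpol=rlclm95;
proc gplot data=work.demo;
   plot y*x;
run;
quit;
```

The quadratic and log-transformed variables are created in the DATA step because the MODEL statement in the REG procedure takes variable names rather than expressions.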
Textbook Reference
Chapter 5 of the textbook.