12.2 - Correlation

In this course we have been using Pearson's \(r\) as a measure of the correlation between two quantitative variables. In a sample, we use the symbol \(r\). In a population, we use the symbol \(\rho\) ("rho").

Pearson's \(r\) can easily be computed using Minitab Express. However, understanding the conceptual formula may help you to better understand the meaning of a correlation coefficient.

Pearson's \(r\): Conceptual Formula

\(r=\dfrac{\sum{z_x z_y}}{n-1}\)
where \(z_x=\dfrac{x - \overline{x}}{s_x}\) and \(z_y=\dfrac{y - \overline{y}}{s_y}\)

When we replace \(z_x\) and \(z_y\) with the \(z\) score formulas and move the \(n-1\) to a separate fraction, we get the formula in your textbook: \(r=\dfrac{1}{n-1}\sum{\left(\dfrac{x-\overline x}{s_x}\right) \left( \dfrac{y-\overline y}{s_y}\right)}\)
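
To see how the conceptual formula works, the short Python sketch below converts each observation to a \(z\) score, multiplies the paired \(z\) scores, and divides their sum by \(n-1\). This is only an illustration and is not part of the Minitab Express workflow; the function name `pearson_r` and the five data points are made up for this example.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r via the conceptual formula: sum(z_x * z_y) / (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 denominator)
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

Because each \(z\) score already divides by the standard deviation, pairs that fall on the same side of their means contribute positive products and pairs on opposite sides contribute negative products, which is what gives \(r\) its sign.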

In this course you will never need to compute \(r\) by hand; we will always use Minitab Express to perform these calculations.

Minitab Express – Computing Pearson's r

We previously created a scatterplot of quiz averages and final exam scores and observed a linear relationship. Here, we will compute the correlation between these two variables.

  1. Open the data set:
  2. On a PC: Select Statistics > Correlation > Correlation
    On a Mac: Select Statistics > Regression > Correlation
  3. Double-click Quiz_Average and Final in the box on the left to insert them into the Variables box
  4. Click OK

This should result in the following output:

Pearson correlation of Quiz_Average and Final = 0.608630
P-Value = <0.0001

Properties of Pearson's r

  1. \(-1\leq r \leq +1\)
  2. For a positive association, \(r>0\); for a negative association, \(r<0\); if there is no linear relationship, \(r=0\)
  3. The closer \(r\) is to 0 the weaker the relationship and the closer to +1 or -1 the stronger the relationship (e.g., \(r=-.88\) is a stronger relationship than \(r=+.60\)); the sign of the correlation provides direction only
  4. Correlation is unit free; the \(x\) and \(y\) variables do NOT need to be on the same scale (e.g., it is possible to compute the correlation between height in centimeters and weight in pounds)
  5. It does not matter which variable you label as \(x\) and which you label as \(y\). The correlation between \(x\) and \(y\) is equal to the correlation between \(y\) and \(x\). 
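
Several of these properties can be checked numerically. The following Python sketch is a hedged illustration only: the helper `pearson_r` and the height and weight values are hypothetical, invented for this example. It checks that changing units (centimeters to inches) and swapping \(x\) and \(y\) leave \(r\) unchanged, and that reversing the direction of one variable flips only the sign.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum((x - x_bar) / s_x * (y - y_bar) / s_y
               for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical heights (cm) and weights (lb), invented for illustration only
height_cm = [152, 160, 168, 175, 183, 190]
weight_lb = [110, 135, 128, 160, 172, 195]

r_cm = pearson_r(height_cm, weight_lb)

# Property 4: rescaling a variable (cm to inches) does not change r
height_in = [h / 2.54 for h in height_cm]
r_in = pearson_r(height_in, weight_lb)

# Property 5: swapping the roles of x and y does not change r
r_swapped = pearson_r(weight_lb, height_cm)

# Property 3: negating one variable flips the sign but not the strength
r_negated = pearson_r(height_cm, [-w for w in weight_lb])

print(-1 <= r_cm <= 1)               # property 1: r lies between -1 and +1
print(abs(r_cm - r_in) < 1e-9)       # same r in either unit
print(abs(r_cm - r_swapped) < 1e-9)  # same r in either order
print(abs(r_cm + r_negated) < 1e-9)  # same size, opposite sign
```

The comparisons use a small tolerance rather than exact equality because floating-point arithmetic can introduce tiny rounding differences.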

The following table may serve as a guideline when evaluating correlation coefficients:

Absolute Value of \(r\)    Strength of the Relationship
0 - 0.2                    Very weak
0.2 - 0.4                  Weak
0.4 - 0.6                  Moderate
0.6 - 0.8                  Strong
0.8 - 1.0                  Very strong
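
As a rough sketch, the guideline table can be written as a small Python lookup. The function name `strength_label` is hypothetical. Because the ranges in the table share their endpoints, the code makes one assumption that the table itself does not state: a value exactly on a cutoff (for example, 0.4) is placed in the stronger category.

```python
def strength_label(r):
    """Map a correlation coefficient to the guideline label for |r|."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)  # the sign gives direction only, so strength uses |r|
    # Assumption (not stated in the table): a value exactly on a cutoff
    # is assigned to the stronger category.
    if size < 0.2:
        return "Very weak"
    if size < 0.4:
        return "Weak"
    if size < 0.6:
        return "Moderate"
    if size < 0.8:
        return "Strong"
    return "Very strong"


print(strength_label(0.608630))  # Strong (the quiz average and final exam example)
print(strength_label(-0.88))     # Very strong
```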