Case Study: Baseball
Clark is a big Boston Red Sox fan. Noticing that the Red Sox had a slightly better home record than away record, he wanted to see whether there is an association between batting averages at away (y) versus home (x) games in the 2018 regular season. Clark included stats from players who played at least 3 away and 3 home games (one player had a batting average of .000 for home games). He got his data from mlb.com/stats and used Minitab to create a scatterplot of the data.
From his scatterplot, he was surprised to see that players with higher batting averages at home did not always have higher batting averages away. He also noticed that two of the top hitters, Mookie Betts and JD Martinez, appeared in the upper right-hand corner of the plot.
Looking at the data, Clark could not decide what to conclude, so he ran a correlation in Minitab. He got the following output: r = 0.280, p-value = 0.261. What should he conclude about the batting averages home versus away?
Clark is off to a great start. He realizes that he is working with two quantitative variables (home batting average and away batting average). Let’s take a closer look at the statistical methods he used to come up with his results.
- Use a scatterplot to appropriately graph two quantitative variables
- Apply a correlation technique to two quantitative variables
- Define the difference between a correlation and a covariance
- Identify the magnitude, direction, and linearity of a correlation coefficient
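To preview the difference between a covariance and a correlation named in the objectives above, here is a minimal Python sketch. It is not Clark's Minitab analysis: the function names are hypothetical, and the x and y values are made-up illustration numbers, not batting averages from mlb.com/stats. It assumes the usual sample definitions, with n - 1 in the denominator.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)


def pearson_r(x, y):
    """Pearson correlation: covariance divided by both sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


# Made-up illustration values, NOT Clark's data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Same x values expressed in different units (multiplied by 100).
x_rescaled = [100 * v for v in x]

print("original units:", sample_covariance(x, y), pearson_r(x, y))
print("rescaled units:", sample_covariance(x_rescaled, y), pearson_r(x_rescaled, y))
```

With these toy values, rescaling x multiplies the covariance by 100 (from 1.5 to 150), while r stays at about 0.775. That is the practical distinction: a covariance depends on the units of measurement, whereas a correlation is unit-free and always falls between -1 and 1, so its sign gives the direction and its size gives the strength of the linear relationship.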