6 Working with Two Variables

Overview

In this lesson, we’ll start working with more than one variable, following Chapter 6 of the Essential R Course Notes, and it should start to feel more like we are actually doing statistics. We’ll introduce plotting functions a bit more here, but this only scratches the surface; later chapters go into plotting in much greater depth.

Objectives

Upon completion of this lesson, you should be able to:


  1. make a frequency table and carry out a chi-squared test for two factors,
  2. make barplots for two factors,
  3. explore the correlation between two numeric variables,
  4. fit a regression between two numeric variables, and
  5. compare group means for a continuous variable over the levels of a factor.

Data and R Code Files

As always, you can access these files in the “Code Files” folders available from the Essential R Course Notes, or here: Chapter 6 R Script

6.1 Two Factors: Frequency Tables

In this video, we’ll demonstrate frequency tables and proportion tables for analyzing the relationship between two factors (qualitative or categorical variables).

Video - STAT 484 Lesson: 6.1
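
As a rough sketch of what frequency and proportion tables look like in practice, the example below uses R's built-in mtcars data (an assumption made for illustration; it is not the course's Chapter 6 data). The names cars and tab are likewise just illustrative.

```r
# Illustrative data: built-in mtcars, with two columns recoded as factors.
# (Assumption: this stands in for the course data, which is not shown here.)
cars <- transform(
  mtcars,
  cyl = factor(cyl),
  am  = factor(am, labels = c("automatic", "manual"))  # 0 = automatic, 1 = manual
)

# Two-way frequency table: rows are cylinders, columns are transmission type
tab <- table(cyl = cars$cyl, am = cars$am)
tab

# Proportion tables
prop.table(tab)               # each cell as a share of the grand total
prop.table(tab, margin = 1)   # row proportions (each row sums to 1)
prop.table(tab, margin = 2)   # column proportions (each column sums to 1)

# Frequency table with row and column totals added
addmargins(tab)
```

The margin argument is the part most easily mixed up: 1 conditions on rows and 2 on columns.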

6.2 Two Factors: Chi-squared Tests

Here we’ll build on the last video by showing how a frequency table can be used to calculate a test for independence between two variables.

Video - STAT 484 Lesson: 6.2
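
A minimal sketch of a chi-squared test of independence on a two-way table follows. It assumes the built-in HairEyeColor data (collapsed over sex) as a stand-in for the course data; tab and res are illustrative names.

```r
# Illustrative data: built-in HairEyeColor, collapsed over Sex to give
# a Hair x Eye frequency table. (Assumption: not the course's data.)
tab <- margin.table(HairEyeColor, margin = c(1, 2))
tab

# Chi-squared test of independence between hair color and eye color
res <- chisq.test(tab)
res

# Pieces stored in the result
res$statistic   # the chi-squared statistic
res$parameter   # degrees of freedom: (rows - 1) * (columns - 1)
res$p.value     # p-value for the test of independence
res$expected    # expected counts under independence
```

Checking res$expected is a reasonable habit, since the chi-squared approximation is less reliable when expected counts are small.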

6.3 Two Factors: Barplots and Mosaic Plots

Here we will consider a couple of ways to visualize the relationship between two factors.

Video - STAT 484 Lesson: 6.3
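
The two kinds of plot can be sketched with base R graphics as below. The example again assumes the built-in HairEyeColor data rather than the course data, and the axis labels and titles are only illustrative.

```r
# Illustrative data: Hair x Eye frequency table from built-in HairEyeColor.
# (Assumption: stands in for the course data.)
tab <- margin.table(HairEyeColor, margin = c(1, 2))

# Stacked barplot: one bar per column (eye color), segments are hair colors
barplot(tab, legend.text = TRUE,
        xlab = "Eye color", ylab = "Count")

# Side-by-side barplot: one group of bars per eye color
barplot(tab, beside = TRUE, legend.text = TRUE,
        xlab = "Eye color", ylab = "Count")

# Mosaic plot: tile areas are proportional to the cell counts
mosaicplot(tab, main = "Hair color by eye color", color = TRUE)
```

Passing t(tab) instead of tab swaps which factor defines the bars and which defines the segments.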

6.4 Two Numeric Variables: Scatterplot

Here we’ll begin with visualizing the relationship between two continuous variables.

Video - STAT 484 Lesson: 6.4
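
A basic scatterplot of two continuous variables might look like the sketch below, which assumes the built-in mtcars data (weight and fuel economy) in place of the course data. The smoothed line is an optional extra, not something the lesson text specifies.

```r
# Illustrative data: built-in mtcars. (Assumption: not the course's data.)
# Formula interface: response on the left, predictor on the right
plot(mpg ~ wt, data = mtcars,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")

# Optional: add a lowess smooth to help see the overall trend
sm <- lowess(mtcars$wt, mtcars$mpg)
lines(sm, lty = 2)
```

The equivalent non-formula call is plot(mtcars$wt, mtcars$mpg), with x first and y second.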

6.5 Two Numeric Variables: Correlation

Now we’ll introduce the function cor() for correlations, show how to compute both Pearson and Spearman correlations, and show how to specify how missing values should be treated.

Video - STAT 484 Lesson: 6.5
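
The following sketch shows the method and use arguments of cor(). It assumes the built-in mtcars data, with two missing values inserted artificially for illustration; all variable names are hypothetical.

```r
# Illustrative data: built-in mtcars. (Assumption: not the course's data.)
x <- mtcars$wt
y <- mtcars$mpg

# Pearson correlation is the default
r_pearson <- cor(x, y)
r_pearson

# Spearman (rank-based) correlation
r_spearman <- cor(x, y, method = "spearman")
r_spearman

# Introduce two missing values artificially to show the 'use' argument
y_na <- y
y_na[c(3, 10)] <- NA

# With the default use = "everything", any NA makes the result NA
r_default <- cor(x, y_na)
r_default

# use = "complete.obs" drops the pairs that contain an NA
r_complete <- cor(x, y_na, use = "complete.obs")
r_complete
```

For more than two variables, cor() also accepts a data frame or matrix and returns a correlation matrix; in that case use = "pairwise.complete.obs" is another option.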

6.6 Two Numeric Variables: Regression

Now that we’ve explored correlation, we’ll take a brief look at using the “linear model” function lm() to fit regressions. We’ll also look at how we can examine the residuals (stored inside the “lm object” created by lm()) to see if their distribution is approximately normal. Note that we’ll explore regression in much more detail in a later chapter.

Video - STAT 484 Lesson: 6.6
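
A minimal sketch of fitting a simple regression and inspecting its residuals is shown below, assuming the built-in mtcars data rather than the course data. The names fit and res are illustrative.

```r
# Illustrative data: built-in mtcars. (Assumption: not the course's data.)
# Regress fuel economy on weight
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient table, R-squared, and overall F-test
summary(fit)

# Just the intercept and slope
coef(fit)

# Residuals are stored in the lm object; residuals(fit) returns fit$residuals
res <- residuals(fit)

# Informal checks of approximate normality
hist(res, main = "Residuals", xlab = "Residual")
qqnorm(res)
qqline(res)
```

If the residuals are roughly normal, the histogram should look approximately symmetric and the points in the Q-Q plot should fall near the reference line.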

6.7 Two Numeric Variables: Regression Diagnostics

Here we’ll briefly introduce the built-in regression diagnostics available by calling plot() on an lm object. The residuals-versus-fitted plot and the normal Q-Q plot are usually the most helpful.

Video - STAT 484 Lesson: 6.7
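
The diagnostic plots can be produced roughly as follows. This sketch assumes the same illustrative mtcars regression as a stand-in for the course example.

```r
# Illustrative model on built-in mtcars. (Assumption: not the course's data.)
fit <- lm(mpg ~ wt, data = mtcars)

# Show the four default diagnostic plots in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous plotting layout

# Or request individual plots with the 'which' argument
plot(fit, which = 1)  # residuals vs. fitted values
plot(fit, which = 2)  # normal Q-Q plot of standardized residuals
```

Saving and restoring the par() settings keeps the 2 x 2 layout from carrying over to later plots.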

6.8 Comparing Group Means: T-tests and One-way Tests

We’ll wind up this chapter by demonstrating comparisons of group means using t-tests, one-way tests of means, and ANOVA.

Video - STAT 484 Lesson: 6.8
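
The three approaches can be sketched as below, assuming the built-in mtcars data with two columns recoded as factors (not the course data). The object names are illustrative.

```r
# Illustrative data: built-in mtcars with factors for the grouping variables.
# (Assumption: stands in for the course data.)
cars <- transform(
  mtcars,
  cyl = factor(cyl),
  am  = factor(am, labels = c("automatic", "manual"))  # 0 = automatic, 1 = manual
)

# Two groups: Welch two-sample t-test (the default; var.equal = FALSE)
tt <- t.test(mpg ~ am, data = cars)
tt

# More than two groups: one-way test of means,
# which by default does not assume equal variances
ow <- oneway.test(mpg ~ cyl, data = cars)
ow

# Classical one-way ANOVA (assumes equal variances across groups)
fit_aov <- aov(mpg ~ cyl, data = cars)
summary(fit_aov)
```

Setting var.equal = TRUE in oneway.test() gives the same F-test as the one-way aov() fit.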