# 6 Working with Two Variables

## Overview

In this lesson, we’ll start working with more than one variable, using Chapter 6 in Essential R Course Notes, and it will (hopefully) start to feel more like we are actually doing statistics. Note that while we’ll introduce plotting functions a bit more here, we’re just scratching the surface; in later chapters we’ll go into plotting in much greater depth.

## Objectives

Upon completion of this lesson, you should be able to:

- make a frequency table and carry out a chi-squared test for two factors,
- make barplots for two factors,
- explore correlation between two numeric variables,
- fit a regression between two numeric variables, and
- compare group means for a continuous variable over levels of a factor.

## Data and R Code Files

As always, you can access these files in the “Code Files” folders available from the Essential R Course Notes, or here: Chapter 6 R Script

## 6.1 Two Factors: Frequency Tables

In this video, we’ll demonstrate frequency tables and proportion tables for analyzing the relationship between two factors (qualitative or categorical variables).
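As a rough sketch of the idea, the following builds a two-way frequency table and the matching proportion tables. The built-in `mtcars` data and the choice of `cyl` and `am` as the two factors are assumptions made for illustration; they are not necessarily the data used in the video.

```r
# Stand-in data (an assumption): treat two mtcars columns as factors
cars <- transform(mtcars,
                  cyl = factor(cyl),
                  am  = factor(am, labels = c("automatic", "manual")))

# Two-way frequency table: rows are cylinder counts, columns are transmission types
tab <- table(cars$cyl, cars$am, dnn = c("cyl", "am"))
tab

# Proportions of the grand total, then within each row, then within each column
prop.table(tab)
prop.table(tab, margin = 1)
prop.table(tab, margin = 2)

# Frequency table with row and column totals appended
addmargins(tab)
```

With `margin = 1` each row sums to 1, and with `margin = 2` each column does, so the choice of margin determines which conditional distribution you are reading.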

## 6.2 Two Factors: Chi-squared Tests

Here we’ll build on the last video by showing how a frequency table can be used to calculate a test for independence between two variables.
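A minimal sketch of a chi-squared test of independence run on a frequency table follows. It uses the built-in `HairEyeColor` data collapsed over sex as a stand-in, which is an assumption and not necessarily the example from the video.

```r
# Stand-in data (an assumption): hair colour by eye colour, summed over sex
tab <- margin.table(HairEyeColor, c(1, 2))
tab

# Chi-squared test of independence between the row and column factors
res <- chisq.test(tab)
res

# Expected counts under independence; very small values make the
# chi-squared approximation unreliable
res$expected

# Cell-by-cell residuals show which combinations depart most from independence
round(res$residuals, 2)
```

When expected counts are small, `chisq.test()` issues a warning; `fisher.test()` or `chisq.test(tab, simulate.p.value = TRUE)` are possible alternatives in that situation.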

## 6.3 Two Factors: Barplots and Mosaic Plots

Here we will consider a couple of ways to visualize the relationship between two factors.
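The two kinds of plot can be sketched as below. The data are again the built-in `HairEyeColor` table collapsed over sex, chosen here only as an illustrative assumption.

```r
# Stand-in data (an assumption): hair colour by eye colour, summed over sex
tab <- margin.table(HairEyeColor, c(1, 2))

# Grouped barplot: one cluster of bars per eye colour, one bar per hair colour
barplot(tab, beside = TRUE, legend.text = TRUE,
        xlab = "Eye colour", ylab = "Count")

# Stacked barplot: the same counts stacked within each eye colour
barplot(tab, legend.text = TRUE,
        xlab = "Eye colour", ylab = "Count")

# Mosaic plot: tile areas are proportional to the cell counts
mosaicplot(tab, main = "Hair colour by eye colour", color = TRUE)
```

Grouped bars make individual counts easy to compare, while the mosaic plot shows the marginal and conditional proportions at the same time.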

## 6.4 Two Numeric Variables: Scatterplot

Here we’ll begin with visualizing the relationship between two continuous variables.
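A basic scatterplot might look like the sketch below, which uses car weight and fuel economy from the built-in `mtcars` data. That choice of variables is an assumption for illustration only.

```r
# Scatterplot using the formula interface: response ~ predictor
plot(mpg ~ wt, data = mtcars,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy versus weight")

# Add a lowess smoother to help judge whether the trend is roughly linear
lines(lowess(mtcars$wt, mtcars$mpg), col = "blue")
```

The same plot can be drawn with `plot(mtcars$wt, mtcars$mpg)`; note that the x variable comes first in that form, whereas in the formula form the response is on the left of `~`.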

## 6.5 Two Numeric Variables: Correlation

Now we’ll introduce the function `cor()` for correlations, and show how to derive both Pearson and Spearman correlations, and how to specify how missing values should be treated.
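The sketch below illustrates the `method` and `use` arguments of `cor()`. The `mtcars` variables and the artificially inserted missing value are assumptions made for the example.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Pearson correlation is the default method
r_pearson <- cor(x, y)

# Spearman rank correlation
r_spearman <- cor(x, y, method = "spearman")

# Introduce one missing value (for illustration only)
y_na <- y
y_na[3] <- NA

# With the default use = "everything", any NA makes the result NA
r_default <- cor(x, y_na)

# Drop incomplete pairs before computing the correlation
r_complete <- cor(x, y_na, use = "complete.obs")

c(pearson = r_pearson, spearman = r_spearman,
  default_with_NA = r_default, complete_obs = r_complete)
```

When `cor()` is given a data frame or matrix with several columns, `use = "pairwise.complete.obs"` is another option; it keeps as many observations as possible for each pair of columns.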

## 6.6 Two Numeric Variables: Regression

Now that we’ve explored correlation, we’ll take a brief look at using the “linear model” function `lm()` to fit regressions. We’ll also look at how we can examine the residuals (stored inside the “lm object” created by `lm()`) to see if their distribution is approximately normal. Note that we’ll explore regression in much more detail in a later chapter.
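A minimal sketch of fitting a simple regression and inspecting its residuals follows, assuming the built-in `mtcars` data with `mpg` as the response and `wt` as the predictor (an illustrative choice, not necessarily the one in the video).

```r
# Fit a simple linear regression of fuel economy on weight
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficients, standard errors, R-squared, and the overall F-test
summary(fit)

# Residuals are stored in the lm object; residuals(fit) extracts them
res <- residuals(fit)

# Two quick visual checks for approximate normality
hist(res, main = "Residuals", xlab = "Residual")
qqnorm(res)
qqline(res)

# A formal test of normality, best read alongside the plots
shapiro.test(res)
```

Here `residuals(fit)` returns the same values as `fit$residuals`; the extractor function is generally the more robust habit.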

## 6.7 Two Numeric Variables: Regression Diagnostics

Here we’ll briefly introduce the built-in regression diagnostics available by calling `plot()` on an lm object. The residual plot and the normal Q-Q plot are the most helpful.
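The diagnostic plots can be produced roughly as follows. The model is the same illustrative `mtcars` regression assumed purely for this example.

```r
# Illustrative model (an assumption): fuel economy regressed on weight
fit <- lm(mpg ~ wt, data = mtcars)

# Show the four default diagnostic plots in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous plotting layout

# Or draw selected plots one at a time with the 'which' argument
plot(fit, which = 1)  # residuals versus fitted values
plot(fit, which = 2)  # normal Q-Q plot of standardized residuals
```

In the residuals versus fitted plot, look for curvature or a funnel shape; in the Q-Q plot, look for points that stray far from the reference line.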

## 6.8 Comparing Group Means: T-tests and One-way Tests

We’ll wind up this chapter by demonstrating a comparison of group means using t-tests, one-way tests of means, and ANOVA.
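The three approaches can be sketched as below. The built-in `mtcars` data, with `am` and `cyl` converted to factors, is an assumption made for illustration.

```r
# Stand-in data (an assumption): convert grouping variables to factors
cars <- transform(mtcars,
                  am  = factor(am, labels = c("automatic", "manual")),
                  cyl = factor(cyl))

# Two groups: Welch t-test (the default does not assume equal variances)
tt <- t.test(mpg ~ am, data = cars)
tt

# Classical two-sample t-test assuming equal variances
t.test(mpg ~ am, data = cars, var.equal = TRUE)

# More than two groups: Welch one-way test of means
ow <- oneway.test(mpg ~ cyl, data = cars)
ow

# Classical one-way ANOVA, which assumes equal variances across groups
aov_fit <- aov(mpg ~ cyl, data = cars)
summary(aov_fit)
```

Both `t.test()` and `oneway.test()` default to `var.equal = FALSE`, so they are the safer starting point when group variances may differ; `aov()` corresponds to the equal-variance case.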