Before we dig into the methods of simple linear regression, we need to distinguish between two different types of relationships, namely:
- deterministic relationships
- statistical relationships
As we'll soon see, simple linear regression concerns statistical relationships.
Deterministic (or Functional) Relationships
A deterministic (or functional) relationship is an exact relationship between the predictor \(x\) and the response \(y\). Take, for instance, the conversion relationship between temperature in degrees Celsius \((C)\) and temperature in degrees Fahrenheit \((F)\). We know the relationship is:
\(F=\frac{9}{5}C+32\)
Therefore, if we know that it is 10 degrees Celsius, we also know that it is 50 degrees Fahrenheit:
\(F=\frac{9}{5}(10)+32=50\)
This is what the exact (linear) relationship between degrees Celsius and degrees Fahrenheit looks like graphically:
Other examples of deterministic relationships include the relationship between the diameter \((d)\) and the circumference \((C)\) of a circle:
\(C=\pi \times d\)
the relationship between the applied weight \((X)\) and the amount of stretch in a spring \((Y)\) (known as Hooke's Law):
\(Y=kX\)
where \(k\) is a constant specific to the spring,
the relationship between the voltage applied \((V)\), the resistance \((r)\) and the current \((I)\) (known as Ohm's Law):
\(I=\frac{V}{r}\)
and, for a constant temperature, the relationship between pressure \((P)\) and volume of gas \((V)\) (known as Boyle's Law):
\(P=\frac{\alpha}{V}\)
where \(\alpha\) is a known constant for each gas.
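To make the idea of an exact relationship concrete, here is a minimal Python sketch of two of the relationships above. The function names are hypothetical, chosen only for illustration, and the input values are not measured data. It assumes the standard Celsius-to-Fahrenheit conversion. The point is that the same input always produces the same output, and every pair of neighboring points gives the same slope because there is no scatter.

```python
import math


def celsius_to_fahrenheit(celsius):
    """Standard conversion F = (9/5) * C + 32 (hypothetical helper name)."""
    return celsius * 9 / 5 + 32


def circumference(diameter):
    """Circumference of a circle, C = pi * d (hypothetical helper name)."""
    return math.pi * diameter


# Illustrative inputs, not measured data.
temps_c = [0, 10, 20, 30, 40]
temps_f = [celsius_to_fahrenheit(c) for c in temps_c]

# Slope between each pair of neighboring points. In a deterministic linear
# relationship every pair gives the same slope, because there is no scatter.
pairs = list(zip(temps_c, temps_f))
slopes = [(f2 - f1) / (c2 - c1) for (c1, f1), (c2, f2) in zip(pairs, pairs[1:])]

for c, f in pairs:
    print(f"{c} C -> {f} F")
print("slopes between neighboring points:", slopes)
print("circumference for d = 2:", circumference(2))
```

Calling `celsius_to_fahrenheit(10)` twice returns the same value both times, which is exactly what "deterministic" means here.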
Statistical Relationships
A statistical relationship, on the other hand, is not an exact relationship. It is instead a relationship in which a "trend" exists between the predictor \(x\) and the response \(y\), but there is also some "scatter." Here's a graph illustrating how a statistical relationship might look:
In this case, researchers investigated the relationship between the latitude (in degrees) at the center of each of the 50 U.S. states and that state's mortality due to skin cancer (in deaths per 10 million). Perhaps we shouldn't be surprised to see a downward trend, but not an exact relationship, between latitude and skin cancer mortality. That is, as latitude increases toward the northern states, where sun exposure is less prevalent and less intense, mortality due to skin cancer decreases, but not perfectly so.
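The "trend plus scatter" idea can be sketched with a small simulation. This is only an illustration under stated assumptions: the intercept, slope, scatter size, and predictor range below are made-up values, not estimates from the skin cancer data, and the helper names are hypothetical. The sketch generates responses that follow a downward line on average but deviate from it randomly, then computes the sample correlation coefficient as one rough summary of how tightly the points follow the trend.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical values chosen only for illustration; they are NOT estimates
# from the skin cancer data described above.
INTERCEPT = 400.0   # trend value at x = 0
SLOPE = -6.0        # downward trend: the response falls as the predictor rises
SCATTER_SD = 20.0   # standard deviation of the random scatter


def simulate_response(x):
    """Return the trend value at x plus normally distributed scatter."""
    return INTERCEPT + SLOPE * x + random.gauss(0, SCATTER_SD)


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# 50 simulated predictor values over an arbitrary range.
xs = [random.uniform(28, 48) for _ in range(50)]
ys = [simulate_response(x) for x in xs]

# Same predictor value twice: the responses differ because of the scatter.
first = simulate_response(40)
second = simulate_response(40)

r = pearson_r(xs, ys)
print(f"correlation: {r:.2f}")
print(f"two responses at x = 40: {first:.1f} and {second:.1f}")
```

Unlike a deterministic function, `simulate_response(40)` returns a different value on each call. A correlation strictly between -1 and 0 reflects a downward trend that is not exact.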
Other examples of statistical relationships include:
- the positive relationship between height and weight
- the positive relationship between alcohol consumed and blood alcohol content
- the negative relationship between vital lung capacity and pack-years of smoking
- the negative relationship between driving speed and gas mileage
It is these types of less-than-perfect statistical relationships that we are interested in when we investigate the methods of simple linear regression.