Introduction Section
This page collects the formulas and notes you can use to study for Exam 2. A printable version is available in Canvas; you can print it out and bring it to your proctored exams. The printable version also includes the normal table, which you may need for the exam as well.
Outline of Material Covered on Exam 2 Section
Correlation measures strength of linear relationship
Linear prediction: estimated as \(y = a + bx\), where \(a\) is the intercept, the value of \(y\) when \(x=0\), and \(b\) is the slope, the change in \(y\) per unit increase in \(x\).
Correlation/regression issues: non-linearity; outliers; avoiding extrapolation in prediction; correlation is not causation.
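As a quick sketch of how the prediction equation works, assuming a made-up fitted line with intercept \(a = 50\) and slope \(b = 5\) (say, exam score predicted from hours studied):

```python
# Made-up fitted line: predicted exam score from hours studied,
# with intercept a = 50 (score at 0 hours) and slope b = 5
# (points gained per extra hour). Numbers are for illustration only.
def predict(x, a=50.0, b=5.0):
    """Plug x into the prediction equation y = a + b*x."""
    return a + b * x

print(predict(0))  # 50.0 -- the intercept: value of y when x = 0
print(predict(4))  # 70.0 -- each of the 4 hours adds b = 5 points
```

Note that predicting far outside the range of observed \(x\) values (extrapolation) is one of the issues listed above.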
Risk = proportion (or percentage) of the time an adverse outcome occurs.
Increased risk = percentage change from baseline risk to risk of exposed group.
Relative risk = (Risk for exposed group) / (risk for baseline group).
Odds of an event = (Chance of event) / (1 – chance of event).
Odds ratio = (odds of event in one group) / (odds of event in another group).
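The four definitions above can be checked with made-up counts; this sketch assumes the adverse outcome occurs in 30 of 200 exposed people and 10 of 200 baseline people:

```python
# Made-up counts for illustration: adverse outcome occurs in
# 30 of 200 exposed people and 10 of 200 baseline people.
risk_exposed = 30 / 200    # 0.15
risk_baseline = 10 / 200   # 0.05

# Relative risk: risk for exposed group / risk for baseline group.
relative_risk = risk_exposed / risk_baseline                     # 3.0

# Increased risk: percentage change from the baseline risk.
increased_risk = (risk_exposed - risk_baseline) / risk_baseline  # 2.0, i.e. 200%

# Odds of the event in each group, then the odds ratio.
odds_exposed = risk_exposed / (1 - risk_exposed)
odds_baseline = risk_baseline / (1 - risk_baseline)
odds_ratio = odds_exposed / odds_baseline                        # about 3.35
```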
Simpson’s Paradox: An observed association between two variables can change or even reverse direction when there is another variable that interacts strongly with both variables.
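A small numeric sketch of the paradox, using made-up cure counts for two hypothetical hospitals where case severity interacts with both hospital choice and outcome:

```python
# Made-up (cured, treated) counts. Hospital A treats mostly severe
# cases and hospital B mostly mild ones -- severity interacts with
# both the choice of hospital and the outcome.
a_mild, b_mild = (90, 100), (800, 1000)
a_severe, b_severe = (300, 1000), (20, 100)

def rate(counts):
    cured, treated = counts
    return cured / treated

# Within each severity group, A has the higher cure rate...
assert rate(a_mild) > rate(b_mild)      # 0.90 > 0.80
assert rate(a_severe) > rate(b_severe)  # 0.30 > 0.20

# ...but pooling the groups reverses the direction of the association.
a_all = (a_mild[0] + a_severe[0], a_mild[1] + a_severe[1])
b_all = (b_mild[0] + b_severe[0], b_mild[1] + b_severe[1])
assert rate(a_all) < rate(b_all)        # about 0.355 < 0.745
```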
Probabilities (also called "chances") are numbers between 0 and 1.
- If two events are mutually exclusive, then they have no outcomes in common and the probability that one or the other occurs is found by adding the chances.
- If two events are independent, then the chance of one doesn't change when you know how the other turned out, and the probability that both occur is found by multiplying the chances.
- If the ways one event happens are a subset of the ways for another event, then its probability can’t be higher.
- Probabilities of an event can be simulated by repeating the process many times on a computer and keeping track of the relative frequency of the times that the event happens.
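A simulation in the spirit of the last bullet; this sketch estimates the chance of rolling at least one six in four rolls of a fair die:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Repeat the process many times and keep track of the relative
# frequency of the event "at least one six in four rolls".
trials = 100_000
hits = sum(
    any(random.randint(1, 6) == 6 for _ in range(4))
    for _ in range(trials)
)
estimate = hits / trials
# The exact chance is 1 - (5/6)**4, about 0.518; the relative
# frequency should land close to it.
```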
Expectation (long-run average) = sum of (each value × its probability).
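For example, a made-up raffle ticket that pays $100 with probability 0.01 and $0 otherwise has a long-run average payoff of $1:

```python
# Made-up raffle: each outcome's value times its probability, summed.
values = [100, 0]
probs = [0.01, 0.99]
expectation = sum(v * p for v, p in zip(values, probs))  # 1.0 dollars
```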
Law of Large Numbers: Averages or proportions are likely to be more stable when there are more trials, while sums or counts are likely to be more variable. This stability does not come from compensation for a bad run of luck, since independent trials have no memory.
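A sketch of the first half of the statement: the proportion of heads in simulated fair coin flips settles near 0.5 as the number of flips grows.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def prop_heads(n):
    """Proportion of heads in n simulated fair coin flips."""
    return sum(random.random() < 0.5 for _ in range(n)) / n

small = prop_heads(100)        # can easily stray from 0.5
large = prop_heads(1_000_000)  # typically very close to 0.5
```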
Sampling distribution of a summary statistic = the distribution of values you would get if you repeat the basic process over and over again.
Normal Approximation: the sampling distribution of averages or proportions from a large number of independent trials will approximately follow the normal curve.
The standard deviation of a sample proportion is \(\sqrt{p(1 − p)/n}\).
The standard deviation of a sample mean is \(\frac{(\text{population standard deviation})}{\sqrt{n}}= \frac{\sigma}{\sqrt{n}}\).
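Plugging made-up numbers into both formulas (a sample of \(n = 400\), true proportion \(p = 0.5\), population standard deviation \(\sigma = 12\)):

```python
import math

# Made-up inputs for illustration.
n = 400
p = 0.5
sd_proportion = math.sqrt(p * (1 - p) / n)  # sqrt(0.25 / 400) = 0.025

sigma = 12.0
sd_mean = sigma / math.sqrt(n)  # 12 / 20 = 0.6
```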