3.6  Odds Ratio
This is perhaps the most commonly used measure of association. Later on, we will see that it is a natural parameter for many of the loglinear and logistic models.
 Odds
 The odds are the ratio of the probability of "success" to the probability of "failure" for a given row, that is, a ratio of two conditional probabilities from the same conditional distribution.
Odds of getting a cold versus not getting a cold given that a person took a placebo:
\(odds_1=\dfrac{P(Z=1|Y=1)}{P(Z=2|Y=1)}=\dfrac{\pi_{1|1}}{\pi_{2|1}}=\dfrac{\pi_{1|1}}{1-\pi_{1|1}}\)
The second odds (given that ascorbic acid was taken),
\(odds_2=\dfrac{P(Z=1|Y=2)}{P(Z=2|Y=2)}=\dfrac{\pi_{1|2}}{\pi_{2|2}}=\dfrac{\pi_{1|2}}{1-\pi_{1|2}}\)
Properties of odds
 If odds = 1, then "success" and "failure" are equally likely.
 If odds > 1, then "success" is more likely than "failure".
 If odds < 1, then "success" is less likely than "failure".
 Odds Ratio

The odds ratio is the ratio of \(odds_1\) to \(odds_2\) (or vice versa):
\begin{align}
\theta &= \dfrac{P(Z=1|Y=1)/P(Z=2|Y=1)}{P(Z=1|Y=2)/P(Z=2|Y=2)}\\
&= \dfrac{\left(\dfrac{\pi_{11}}{\pi_{1+}}\right)/\left(\dfrac{\pi_{12}}{\pi_{1+}}\right)}{\left(\dfrac{\pi_{21}}{\pi_{2+}}\right)/\left(\dfrac{\pi_{22}}{\pi_{2+}}\right)}\\
&= \dfrac{\pi_{11}\pi_{22}}{\pi_{12}\pi_{21}}
\end{align}
Clearly, \(\theta\) is a function of the parameters of \(P(Z|Y)\), so inferences about it should be the same under Poisson, multinomial, or product-multinomial (\(n_{i+}\)s fixed) sampling. But if we interchange the roles of \(Y\) and \(Z\), we still get
\(\theta=\dfrac{\pi_{11}\pi_{22}}{\pi_{12}\pi_{21}}\)
so \(\theta\) can also be regarded as a function of the parameters of \(P(Y|Z)\). Therefore, the likelihood inferences will be the same if we regard the \(n_{+j}\)s as fixed.
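The symmetry argument above can be checked numerically. The following is a minimal sketch, assuming a made-up table of joint probabilities (the values and the helper names `odds_ratio` and `conditional_odds_ratio` are illustrative and not part of the course materials). It computes \(\theta\) as a cross-product ratio, from the conditional distributions \(P(Z|Y)\), and again after interchanging the roles of \(Y\) and \(Z\).

```python
# Hypothetical joint probabilities pi[i][j] = P(Y = i+1, Z = j+1); not from the text.
pi = [[0.10, 0.30],
      [0.20, 0.40]]

def odds_ratio(p):
    """Cross-product ratio pi11*pi22 / (pi12*pi21) of a 2x2 table."""
    return (p[0][0] * p[1][1]) / (p[0][1] * p[1][0])

def conditional_odds_ratio(p):
    """Odds ratio built from the row-conditional distributions P(column | row)."""
    row_totals = [sum(row) for row in p]
    # Odds of column 1 versus column 2 within each row.
    odds = [(row[0] / tot) / (row[1] / tot) for row, tot in zip(p, row_totals)]
    return odds[0] / odds[1]

# Interchanging the roles of Y and Z amounts to transposing the table.
transposed = [list(col) for col in zip(*pi)]

theta = odds_ratio(pi)
theta_from_rows = conditional_odds_ratio(pi)          # based on P(Z | Y)
theta_from_cols = conditional_odds_ratio(transposed)  # based on P(Y | Z)

print(theta, theta_from_rows, theta_from_cols)
```

All three values should agree up to floating-point rounding, which is the numerical counterpart of the statement that \(\theta\) does not depend on which margin is treated as fixed.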
Point estimate, CI and hypothesis test
The natural estimate of \(\theta\) is the sample cross-product ratio,
\(\hat{\theta}=\dfrac{n_{11}n_{22}}{n_{12}n_{21}}\)
The properties of \(\hat{\theta}\) are easily established under multinomial sampling, but the same properties will hold under Poisson or product-multinomial sampling with either the row totals or column totals (but not both) regarded as fixed.
As with the relative risk, the log-odds ratio \(\log\hat{\theta}\) has a better normal approximation than \(\hat{\theta}\) does. Therefore, we usually obtain a confidence interval on the log scale; note again that log throughout this course is the natural log, that is, log base \(e\). The estimated variance of \(\log\hat{\theta}\) is easy to remember,
\(\hat{V}(\log\hat{\theta})=\dfrac{1}{n_{11}}+\dfrac{1}{n_{12}}+\dfrac{1}{n_{21}}+\dfrac{1}{n_{22}}\)
and we get a 95% confidence interval for \(\theta\) by exponentiating the endpoints of
\(\log\hat{\theta} \pm 1.96\sqrt{\dfrac{1}{n_{11}}+\dfrac{1}{n_{12}}+\dfrac{1}{n_{21}}+\dfrac{1}{n_{22}}}\)
For the Vitamin C example, the odds of "success" (i.e., getting a cold), given that a skier took vitamin C, are \(0.12/0.88 = 0.14\). The odds of "success" (i.e., getting a cold), given that a skier took a placebo pill, are \(0.22/0.78 = 0.28\).
The odds ratio, computed from the unrounded odds, is \(0.1393/0.2844 = 0.49\), and the 95% CI for \(\log\theta\) would be
\(\log(0.490)\pm 1.96 \sqrt{1/17+1/109+1/122+1/31}=(-1.359,-0.068)\)
Finally, exponentiating the limits gives us the 95% CI for \(\theta\): (0.257, 0.934). Notice that we could also have computed the ratio the other way around, \(0.2844/0.1393=2.04=\dfrac{31(122)}{109(17)}\), which is the inverse of the value computed above: \(1/0.49=2.04\). For our example, \(\hat{\theta}=0.49\) means that
 the odds of getting a cold given vitamin C are 0.49 times the odds of getting a cold given a placebo
 the odds of getting a cold given a placebo are \(1/0.49 = 2.04\) times the odds of getting a cold given vitamin C
 getting a cold is less likely given vitamin C than given a placebo.
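The calculation above can be reproduced in a few lines. This is a minimal sketch, not the course's own code: the cell layout (vitamin C row: 17 colds and 122 without; placebo row: 31 colds and 109 without) is inferred from the numbers that appear in the formulas above, and the variable names are illustrative.

```python
import math

# Cell counts inferred from the calculations in the text (layout is an assumption):
n11, n12 = 17, 122   # vitamin C: cold, no cold
n21, n22 = 31, 109   # placebo:   cold, no cold

# Point estimate: sample cross-product ratio.
theta_hat = (n11 * n22) / (n12 * n21)

# Estimated standard error of log(theta_hat).
se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

# 95% Wald interval on the log scale, then exponentiate the endpoints.
z = 1.96
log_theta = math.log(theta_hat)
log_ci = (log_theta - z * se_log, log_theta + z * se_log)
ci = (math.exp(log_ci[0]), math.exp(log_ci[1]))

print(f"odds ratio        = {theta_hat:.4f}")
print(f"CI for log(theta) = ({log_ci[0]:.3f}, {log_ci[1]:.3f})")
print(f"CI for theta      = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval endpoints should agree with the asymptotic confidence limits in the SAS output quoted below; the exact limits reported by SAS come from a different method and are not reproduced by this sketch.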
For computation in SAS, compare the above calculations for the Vitamin C example to the relevant SAS output under the heading "Statistics for Table of treatment and response: Odds Ratio":
Odds Ratio
  Odds Ratio              0.4900
Asymptotic Conf Limits
  95% Lower Conf Limit    0.2569
  95% Upper Conf Limit    0.9343
Exact Conf Limits
  95% Lower Conf Limit    0.2407
  95% Upper Conf Limit    0.9740
In the SAS program, replace the statement
tables treatment*response/ chisq relrisk riskdiff expected;
with
tables treatment*response/ chisq measures expected;
The computation in R is available with the VitaminC.R file.
For more on the interpretation of odds and odds ratios and their properties, see below.
Properties of Odds Ratios
If \(\theta = 3\), the odds of "success" in row 1 are 3 times the odds of "success" in row 2; individuals in row 1 are more likely to have a "success" than those in row 2. If \(\theta = 0.3\), the odds of "success" in row 1 are 0.3 times the odds in row 2; equivalently, the odds of "success" in row 2 are \(1/0.3 = 3.33\) times the odds in row 1.
The relationship between odds and probabilities can be expressed as
\(odds_1=\dfrac{\pi_{1|1}}{1-\pi_{1|1}}\iff\pi_{1|1}=\dfrac{odds_1}{1+odds_1}\)
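As a small illustration of this two-way relationship, the sketch below converts a probability to odds and back. The function names `prob_to_odds` and `odds_to_prob` are illustrative choices, and the input 0.12 is the rounded probability of a cold given vitamin C used earlier in this section.

```python
def prob_to_odds(p):
    """Odds of "success" for a success probability p (requires 0 <= p < 1)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Success probability corresponding to the given odds (odds >= 0)."""
    return odds / (1 + odds)

p = 0.12                      # rounded P(cold | vitamin C) from the example
odds = prob_to_odds(p)        # 0.12 / 0.88
p_back = odds_to_prob(odds)   # recovers the original probability

print(f"odds = {odds:.4f}, probability recovered = {p_back:.4f}")
```

Because the mapping is one-to-one, statements about odds can always be translated back into statements about probabilities within a single row.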
If the variables are independent, then \(\pi_{1|1} = \pi_{1|2}\), \(odds_1 = odds_2\), and
\(\theta=\frac{odds_1}{odds_2}=1\)
If the variables are not independent such that \(\pi_{1|1} > \pi_{1|2}\), then \(odds_1 > odds_2\), and
\( 1<\theta\)
If the variables are not independent such that \(\pi_{1|1} < \pi_{1|2}\), then \(odds_1< odds_2\), and
\( 0 < \theta < 1\)
If both \(\pi_{1|1}\) and \(\pi_{1|2}\) are small in the population, then the odds ratio and the relative risk will be close, since the factor \(\frac{1-\pi_{1|1}}{1-\pi_{1|2}}\) relating them will be close to 1. The odds ratio \(\theta\) does NOT depend on the marginal distribution of either variable. If the categories of both variables are interchanged, the value of \(\theta\) does not change. If the categories of only one variable are switched, the odds ratio in the rearranged table will equal \(1/\theta\).
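These properties are easy to verify numerically. The sketch below is illustrative only: the "rare" and "common" success probabilities are made-up values, while the 2x2 counts are the ones appearing in the Vitamin C calculations earlier in this section. It compares the relative risk with the odds ratio in both settings and then checks what happens when categories are switched.

```python
def odds(p):
    """Odds corresponding to a probability of "success" p."""
    return p / (1 - p)

def odds_ratio(table):
    """Cross-product ratio of a 2x2 table [[n11, n12], [n21, n22]]."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)

# Hypothetical conditional probabilities of "success" in rows 1 and 2.
rare = (0.02, 0.01)
common = (0.60, 0.30)

rr_rare, or_rare = rare[0] / rare[1], odds(rare[0]) / odds(rare[1])
rr_common, or_common = common[0] / common[1], odds(common[0]) / odds(common[1])
print(f"rare outcome:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common outcome: RR = {rr_common:.3f}, OR = {or_common:.3f}")

# Counts taken from the Vitamin C calculations in the text.
table = [[17, 122], [31, 109]]
rows_swapped = [table[1], table[0]]                 # switch one variable's categories
both_swapped = [row[::-1] for row in rows_swapped]  # switch both variables' categories

theta = odds_ratio(table)
theta_rows_swapped = odds_ratio(rows_swapped)
theta_both_swapped = odds_ratio(both_swapped)
print(theta, theta_rows_swapped, theta_both_swapped)
```

With the rare outcome the two measures nearly coincide, while with the common outcome the odds ratio is noticeably farther from 1 than the relative risk. Switching the rows alone inverts the odds ratio, and switching both rows and columns restores it.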
Finally, note that the sample odds ratio will equal zero or \(\infty\) if any \(n_{ij}=0\). Some authors suggest adding \(1/2\) to each cell count and then recalculating the sample odds ratio and its standard error to avoid this issue.
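The adjustment just described can be sketched as follows. The counts are hypothetical (chosen only so that one cell is empty), and the function name `adjusted_odds_ratio` is an illustrative choice rather than a routine from SAS or R.

```python
import math

def adjusted_odds_ratio(n11, n12, n21, n22, add=0.5):
    """Odds ratio and SE of its log after adding `add` to every cell count."""
    a, b, c, d = (n + add for n in (n11, n12, n21, n22))
    theta = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return theta, se_log

# Hypothetical table with an empty cell: the unadjusted estimate would be 0
# and the usual variance formula would involve division by zero.
theta_adj, se_adj = adjusted_odds_ratio(0, 10, 5, 12)

print(f"adjusted odds ratio = {theta_adj:.4f}, SE of log = {se_adj:.4f}")
```

The adjusted estimate is finite and strictly positive, so a log-scale confidence interval can be formed in the usual way, although with such sparse data the normal approximation should be treated with caution.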