Confidence Intervals

Key Concepts:

  • Form of the Confidence Interval
  • Interpretation of the Confidence Interval
  • Z-test intervals

General form of a confidence interval (CI)

A confidence interval is an interval estimate: a range of values within which the parameter is expected to fall, with a stated degree of confidence.

The general form:

  • estimate ± critical value × standard deviation of the estimate
  • estimate ± margin of error

For example:

  • sample mean ± critical value × estimated standard error

The CIs differ based on:

  • The parameter of interest, e.g., population mean, population proportion, difference in population means, etc.
  • Design of the sample: SRS, stratified, experiments
  • Confidence level (or confidence coefficient), (1 − α) × 100%, e.g., 95%, 99%, 90%, 80%, corresponding to α values of 0.05, 0.01, 0.1, and 0.2, respectively

Interpretation of a Confidence Interval

In most general terms, for a 95% CI, we say “we are 95% confident that the true population parameter is between the lower and upper calculated values”.

A 95% CI for a population parameter DOES NOT mean that there is a probability of 0.95 that the true value of the parameter falls in the particular interval we calculated.

The CI either contains the parameter or it does not contain it.

The probability is associated with the process that generated the interval. If we repeat this process many times, about 95% of the resulting intervals should contain the true value of the parameter.
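
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the function name `coverage`, the true mean of 66, and the number of repetitions are assumptions chosen for the demonstration, and σ is treated as known so that the z-interval applies.

```python
import random
from statistics import NormalDist

def coverage(mu=66.0, sigma=3.0, n=54, level=0.95, reps=10_000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu.

    mu, sigma, n, reps and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Critical value: the (1 - alpha/2) quantile of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does this particular interval cover the true mean?
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

print(coverage())            # should be close to 0.95
print(coverage(level=0.99))  # should be close to 0.99
```

Each individual interval either covers μ or it does not; only the long-run fraction of covering intervals is near the confidence level.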

What does a 99% CI say?

Would you choose a 99% or 95% CI, and why?

Tradeoffs

We want the confidence coefficient to be as close to 1 as possible.

We want the sample size to be as small as possible (but not too small). This is a practical issue.

We want the CI to be as narrow as possible.

  • As we increase the sample estimate, the CI …?
  • As we decrease the standard deviation, the CI …?
  • As we decrease the confidence level, (1 − α), the CI …?
  • As we increase the sample size, the CI …?
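
One way to explore these questions is to compute the interval width directly. The sketch below assumes the known-σ z-interval, whose width is 2 × critical value × σ/√n; the helper name `ci_width` and the baseline values (σ = 3, n = 54, 95%) are illustrative choices, not part of the original notes.

```python
from statistics import NormalDist

def ci_width(sigma, n, level):
    """Width of a z-interval for a mean when sigma is known (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value
    return 2 * z * sigma / n ** 0.5

base = ci_width(sigma=3.0, n=54, level=0.95)
print(f"baseline:            {base:.3f}")
print(f"smaller sigma (1.5): {ci_width(1.5, 54, 0.95):.3f}")
print(f"lower level (90%):   {ci_width(3.0, 54, 0.90):.3f}")
print(f"larger n (216):      {ci_width(3.0, 216, 0.95):.3f}")
```

Note that the sample estimate does not appear in `ci_width` at all: for this interval, changing the estimate shifts the center of the interval but leaves its width unchanged.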

z-Tests & Intervals

For an unknown population mean and a known variance:

  • Assumptions of the model:
    • Suppose there is a normally distributed population whose standard deviation σ is known to be (say) 3 but whose mean μ is unknown. How could we estimate μ?
  • Take a random sample of size n = (say) 54.
  • Sample statistic
    • The sample mean, \(\bar{X}\) is a good estimator of the population mean μ.
  • Sampling distribution under the model assumptions:
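    • Via the CLT, \(\bar{X} \sim N(\mu, \sigma^2/n)\) (this holds exactly when the population itself is normal).
  • We are 95% confident that μ is in the interval \(\left(\bar{X}-2\frac{\sigma}{\sqrt{n}},\ \bar{X}+2\frac{\sigma}{\sqrt{n}}\right)\), where 2 is a rounded version of the critical value 1.96.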
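
As a sketch of where this interval comes from, write \(z_{\alpha/2}\) for the standard normal value with upper-tail area α/2 (notation assumed here for the derivation). Standardizing the sample mean and rearranging the inequality for μ gives:

\[
P\left(-z_{\alpha/2} \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1-\alpha
\quad\Longleftrightarrow\quad
P\left(\bar{X}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha
\]

With α = 0.05, \(z_{\alpha/2} = 1.96 \approx 2\). For σ = 3 and n = 54, the standard error is \(3/\sqrt{54} \approx 0.408\), so the half-width of the interval is about 0.8 (0.816 with the rounded multiplier 2).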


More About Confidence Intervals

Simplified Expression for a 95% Confidence Interval

Using ± notation, the 95% interval can be written compactly as \(\bar{X} \pm 2\frac{\sigma}{\sqrt{n}}\).

Generalizing the 95% Confidence Interval

The critical value \(z_{\alpha/2}\) is the multiplier for a (1 − α) × 100% confidence interval.

For a 95% CI, α = 0.05, so an area of α/2 = 0.025 lies in each tail of the standard normal distribution, and the critical value is \(z_{0.025} = 1.96\). For any probability (1 − α) there is a number \(z_{\alpha/2}\) such that any normal distribution has probability (1 − α) within \(z_{\alpha/2}\) standard deviations of its mean. Assuming that σ is known, the multiplier for a (1 − α) × 100% confidence interval is the (1 − ½α) × 100th percentile of the standard normal distribution.
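
The percentile rule can be evaluated numerically. The sketch below uses `NormalDist.inv_cdf` from Python's standard library `statistics` module; the helper name `critical_value` is an illustrative choice.

```python
from statistics import NormalDist

def critical_value(level):
    """Multiplier for a two-sided z-interval at the given confidence level."""
    alpha = 1 - level
    # The (1 - alpha/2) quantile of the standard normal distribution
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z = {critical_value(level):.3f}")
```

This gives multipliers of about 1.282, 1.645, 1.960, and 2.576 for the 80%, 90%, 95%, and 99% levels.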


Height Example

  • Assume that the population standard deviation σ is known and is equal to 3.
  • We want to estimate the unknown true mean height of our population.
  • The point estimate can be the sample mean, 66.463.
  • What is the distribution of the sample mean?
  • What is the 95% confidence interval? What does it mean?
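
A minimal sketch of the calculation follows. The sample size is not stated in this example, so n = 54 is an assumption carried over from the earlier z-interval setup; with a different n the interval would change.

```python
from statistics import NormalDist

xbar = 66.463   # sample mean from the example
sigma = 3.0     # known population standard deviation
n = 54          # ASSUMPTION: sample size borrowed from the earlier setup

z = NormalDist().inv_cdf(0.975)   # critical value for a 95% CI
se = sigma / n ** 0.5             # standard deviation of the sample mean
lower, upper = xbar - z * se, xbar + z * se

print(f"standard error: {se:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval is roughly (65.66, 67.26): we are 95% confident that the true mean height lies in that range.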



Confidence Intervals for Proportions in Newspapers

As reported by CNN in June 2006:

[Figures: CNN survey results for two consecutive questions]

The stated margin of error: ±3%

Therefore, the confidence interval is 62% ± 3%.

We can be highly confident that between 59% and 65% of all U.S. adults disapprove of how President Bush is handling the situation in Iraq. The confidence level is not stated here; news polls conventionally report the margin of error at the 95% level.
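
The arithmetic behind this interval can be sketched as follows. The poll's confidence level and sample size are not given above, so the 95% level is an assumption, and the implied sample size is a back-of-the-envelope figure from the usual large-sample formula for a proportion, margin = z × √(p̂(1 − p̂)/n), not a reported number.

```python
from statistics import NormalDist

p_hat = 0.62   # reported share who disapprove
moe = 0.03     # stated margin of error

lower, upper = p_hat - moe, p_hat + moe
print(f"CI: ({lower:.2f}, {upper:.2f})")

# ASSUMPTION: the margin of error was computed at the 95% level
z = NormalDist().inv_cdf(0.975)
# Solve moe = z * sqrt(p_hat * (1 - p_hat) / n) for n
implied_n = (z / moe) ** 2 * p_hat * (1 - p_hat)
print(f"implied sample size: about {implied_n:.0f}")
```

Under the 95% assumption, a margin of error of 3 percentage points corresponds to a simple random sample of roughly a thousand respondents.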