5.2 - Estimation and Confidence Intervals

Estimation

Two common estimation methods are point and interval estimates.

Point Estimates
An estimate for a parameter that is one numerical value. An example of a point estimate is the sample mean or the sample proportion.
Interval Estimates
Interval estimates give an interval as the estimate for a parameter. This is a new concept and the focus of this lesson. Such intervals are built around point estimates, which is why understanding point estimates is important to understanding interval estimates.

In this course, the interval estimates we find are referred to as confidence intervals.

Confidence Interval
An interval of values computed from sample data that is likely to cover the true parameter of interest.

There are many estimators for population parameters. For example, if we want to know the "center" of a distribution, why use the mean? Could we use the median? How about using the middle value, i.e. (max+min)/2? We choose particular estimators for various reasons with information based on their sampling distributions. Here are some properties of "good" estimators.


Properties of 'Good' Estimators

In determining what makes a good estimator, there are two key features:

  1. The center of the sampling distribution of the estimator is the same as the population parameter being estimated. When this property holds, the estimator is said to be unbiased. The most often-used measure of center is the mean.
  2. The estimate has the smallest standard error when compared to other estimators. For example, in a normal distribution, the population mean and median are equal, so the sample mean and the sample median estimate the same value. However, the standard error of the sample median is about 1.25 times the standard error of the sample mean. We know the standard error of the mean is \(\frac{\sigma}{\sqrt{n}}\). Therefore, in a normal distribution, SE(median) is about \(1.25 \times \frac{\sigma}{\sqrt{n}}\). This is why the mean is a better estimator than the median when the data are normal (or approximately normal).
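
Both properties can be checked by simulation. The following is a minimal sketch using only the Python standard library. The population values (mean 50, standard deviation 10), the sample size of 25, the number of repetitions, and the function name `simulate_estimators` are illustrative assumptions, not values from this lesson. It draws repeated samples from a normal population and compares the sampling distributions of the sample mean and the sample median.

```python
import random
import statistics


def simulate_estimators(mu=50.0, sigma=10.0, n=25, reps=4000, seed=1):
    """Approximate the sampling distributions of the mean and the median.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        # One sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return {
        # Centers of the two sampling distributions (property 1).
        "center_of_means": statistics.mean(means),
        "center_of_medians": statistics.mean(medians),
        # Spread of the two sampling distributions (property 2).
        "se_mean": statistics.stdev(means),
        "se_median": statistics.stdev(medians),
    }


result = simulate_estimators()
theoretical_se_mean = 10.0 / 25 ** 0.5  # sigma / sqrt(n)
se_ratio = result["se_median"] / result["se_mean"]

print(result)
print("theoretical SE(mean):", theoretical_se_mean)
print("SE(median) / SE(mean):", se_ratio)
```

Under these assumptions, both centers should land near 50, and the simulated standard error of the mean should be close to \(\frac{\sigma}{\sqrt{n}}\). The ratio of the two standard errors should come out above 1. The 1.25 figure is a large-sample approximation, so a simulation with a modest sample size will match it only roughly.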

Note!

We should stop here and explain why we use the estimated standard error and not the standard error itself when constructing a confidence interval. The reason is that, typically, the population values are not known. Take, for example, the standard error of the sample proportion. It is...

\(\sqrt{\dfrac{p(1-p)}{n}}\)

If the goal is to estimate \(p\) and \(p\) is unknown, then we must also estimate the standard error. In this case, the estimated standard error is...

\(\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\)

In the case of estimating the population mean, the population standard deviation, \(\sigma\), may also be unknown. When it is unknown, we can estimate it with the sample standard deviation, \(s\). Then the estimated standard error of the sample mean is...

\(\dfrac{s}{\sqrt{n}}\)
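
As a small illustration of the two estimated standard errors above, here is a sketch in Python. The function names `estimated_se_proportion` and `estimated_se_mean` and the data values are hypothetical, made up for this example only.

```python
import math
import statistics


def estimated_se_proportion(successes, n):
    """Estimated standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)


def estimated_se_mean(sample):
    """Estimated standard error of a sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


# Hypothetical data, for illustration only.
se_p = estimated_se_proportion(successes=44, n=100)
se_xbar = estimated_se_mean([12.0, 15.0, 11.0, 14.0, 13.0])

print("estimated SE of the sample proportion:", se_p)
print("estimated SE of the sample mean:", se_xbar)
```

Note that both functions use only quantities computed from the sample (\(\hat{p}\) and \(s\)), which is what makes them usable when \(p\) and \(\sigma\) are unknown.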


General Format of a Confidence Interval

Putting the two properties above together, the center of our interval should be the point estimate for the parameter of interest. With the estimated standard error of the point estimate, we can attach a measure of confidence to our estimate by forming a margin of error.

You may have seen this whenever you have heard or read a sample survey result (e.g. a survey of the current approval rating of the President, or of citizens' attitudes toward some new policy). In such surveys, you may hear a statement such as "44% of those surveyed approved of the President's reaction" (this is the sample proportion), followed by "the survey had a 3.5% margin of error, or ± 3.5%." This latter number is the margin of error.

With the point estimate and the margin of error, we have an interval in which the group conducting the survey is confident the parameter value falls (i.e. the proportion of U.S. citizens who approve of the President's reaction). In this example, that interval would be from 40.5% to 47.5%.

This example provides the general construction of a confidence interval:

General form of a confidence interval
\(\text{sample statistic} \pm \text{margin of error}\)

The margin of error consists of two pieces. One is the standard error of the sample statistic. The other is a multiplier, \(M\), applied to this standard error and chosen according to how confident we want to be in our estimate. This multiplier comes from the same distribution as the sampling distribution of the point estimate; for example, as we will see with the sample proportion, this multiplier will come from the standard normal distribution. The general form of the margin of error is shown below.

General form of the margin of error
\(\text{Margin of error}=M\times \widehat{SE}(\text{estimate})\)

*the multiplier, \(M\), depends on our level of confidence
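
To make the general form concrete, the sketch below builds an interval for a proportion as sample statistic ± \(M\) × estimated standard error, taking \(M\) from the standard normal distribution. The sample proportion 0.44 echoes the survey example above; the sample size of 770 is a hypothetical value chosen for illustration (the survey's actual sample size is not given), and the function names `z_multiplier` and `proportion_interval` are likewise assumptions.

```python
import math
from statistics import NormalDist


def z_multiplier(confidence):
    """Standard normal multiplier M for a two-sided interval at the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper, margin) using sample statistic +/- M * estimated SE."""
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    margin = z_multiplier(confidence) * se_hat   # margin of error
    return p_hat - margin, p_hat + margin, margin


# p_hat = 0.44 echoes the survey example; n = 770 is a hypothetical sample size.
lower, upper, margin = proportion_interval(p_hat=0.44, n=770)

print(f"margin of error: {margin:.3f}")
print(f"interval: {lower:.3f} to {upper:.3f}")
```

With these assumed inputs, the margin of error works out to roughly 3.5 percentage points, in line with the survey example.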


Interpretation of a Confidence Interval

The interpretation of a confidence interval has the basic template: "We are 'some level of percent confident' that the 'population parameter of interest' is from 'lower bound to upper bound'." The phrases in single quotes are replaced with the specific language of the problem. We will discuss more about the interpretation of a confidence interval after we provide a few more examples.

Note!

Some might say, "Why not just be 100% confident?", but that does not make practical sense. For instance, what value comes from saying I am 100% confident that the approval rating for the President is from 0% to 100%? That is the only interval that one can be truly certain will capture the actual proportion. Similarly, if you were to ask your professor what they think your score will be on an exam and they reply, "zero to one hundred", what would you think of that answer?

However, one does want to be as confident as reasonably possible. Most confidence levels in use range from 90% to 99%, with 95% being the most widely used. In fact, when you read a report that includes a margin of error, you can usually assume it corresponds to 95% confidence unless otherwise stated.
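
The trade-off between confidence and precision can be seen numerically. The sketch below assumes a standard normal multiplier and a hypothetical estimated standard error of 0.018 (neither value comes from a real survey), and shows how the margin of error grows as the confidence level rises.

```python
from statistics import NormalDist

# Hypothetical estimated standard error (1.8 percentage points), for illustration only.
se_hat = 0.018

multipliers = {}
margins = {}
for confidence in (0.90, 0.95, 0.99):
    # Two-sided multiplier M from the standard normal distribution.
    multipliers[confidence] = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margins[confidence] = multipliers[confidence] * se_hat
    print(
        f"{confidence:.0%} confidence: "
        f"M = {multipliers[confidence]:.3f}, margin = {margins[confidence]:.3f}"
    )
```

Higher confidence requires a larger multiplier and therefore a wider interval. Pushing the confidence level toward 100% makes the multiplier grow without bound, which is the numerical counterpart of the "0% to 100%" interval described above.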


Moving forward...

We're going to begin exploring confidence intervals for one population proportion. The important issue of determining the required sample size to estimate a population proportion will also be discussed in detail in this lesson.