19.2 - For Any Percentile

The method that we learned for finding a confidence interval for the median of a continuous distribution can be easily extended so that we can find a confidence interval for any percentile \(\pi_p\). The only thing we have to change is the probability of a success, that is, that \(X_i\) is less than \(\pi_p\):

\(p=P(X_i < \pi_p)\)

Then, the exact confidence coefficient is calculated just as before using the binomial distribution with parameters \(n\) and \(p\):

\(1-\alpha=P(Y_i < \pi_p < Y_j)=\sum_{k=i}^{j-1}\binom{n}{k}p^k(1-p)^{n-k}\)
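The sum above can be evaluated directly. The following is a minimal sketch in Python using only the standard library; the function name `exact_confidence_coefficient` and the sample values (\(n=9\), \(p=0.5\), \(i=2\), \(j=8\)) are illustrative assumptions, not part of the lesson.

```python
from math import comb


def exact_confidence_coefficient(n, p, i, j):
    """Return P(Y_i < pi_p < Y_j) = sum_{k=i}^{j-1} C(n,k) p^k (1-p)^(n-k).

    n : sample size
    p : P(X < pi_p), e.g. 0.75 for the 75th percentile
    i, j : 1-based indices of the order statistics, with 1 <= i < j <= n
    """
    if not (1 <= i < j <= n):
        raise ValueError("need 1 <= i < j <= n")
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(i, j))


# Hypothetical illustration: interval (y_2, y_8) for the median with n = 9.
print(exact_confidence_coefficient(9, 0.5, 2, 8))
```

Note that the summation runs from \(k=i\) through \(k=j-1\), which is why the sketch uses `range(i, j)`.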

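For large samples, say \(n\ge 20\), an approximate confidence coefficient can be calculated using the normal approximation to the binomial. If \(W\) denotes the number of observations less than \(\pi_p\), so that \(W\) is binomial with parameters \(n\) and \(p\), then the following random variable is approximately standard normal: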

\(Z=\dfrac{W-np}{\sqrt{np(1-p)}}\)

Once the sample is observed and the order statistics determined, then the known interval \((y_i, y_j)\) serves as a \(100(1-\alpha)\%\) confidence interval for the unknown population percentile \(\pi_p\). Let's revisit an example from the previous page.

Example 19-2 (continued)

A sample of 26 offshore oil workers took part in a simulated escape exercise, resulting in the following data on time (in seconds) to complete the escape:

325 325 334 339 356 356 359 359 363
364 364 366 369 370 373 373 374 375
389 392 393 394 397 402 403 424

Find a confidence interval for the 75th percentile, and calculate its confidence coefficient. (The data are from the journal article "Oxygen Consumption and Ventilation During Escape from an Offshore Platform," Ergonomics 1997: 281-292.)

Answer

Since \((0.75)(26+1)=20.25\), the weighted average of the 20th and 21st order statistics:

\(\tilde{\pi}_{0.75}=y_{20}+0.25(y_{21}-y_{20})=0.75y_{20}+0.25y_{21}=0.75(392)+0.25(393)=392.25\)
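The arithmetic for this point estimate can be sketched in Python as follows. The function name `percentile_estimate` and the variable `escape_times` are hypothetical names introduced for illustration; the rule implemented is the \((n+1)p\) weighted average used above.

```python
import math

# Escape times (in seconds) from the example.
escape_times = [
    325, 325, 334, 339, 356, 356, 359, 359, 363,
    364, 364, 366, 369, 370, 373, 373, 374, 375,
    389, 392, 393, 394, 397, 402, 403, 424,
]


def percentile_estimate(data, p):
    """Point estimate of the 100p-th percentile using the (n + 1)p rule."""
    y = sorted(data)
    n = len(y)
    r = p * (n + 1)       # fractional rank, e.g. 0.75 * 27 = 20.25
    k = math.floor(r)     # lower order statistic (1-based)
    frac = r - k          # weight placed on the next order statistic
    if k < 1 or k >= n:
        raise ValueError("p is too extreme for this sample size")
    # y[k - 1] is the k-th order statistic because Python lists are 0-based.
    return y[k - 1] + frac * (y[k] - y[k - 1])


print(percentile_estimate(escape_times, 0.75))  # 392.25
```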

serves as a good point estimate of \(\pi_{0.75}\). To find a confidence interval for \(\pi_{0.75}\), let's move up and down a few order statistics from \(y_{20}\) to, say, \(y_{16}\) and \(y_{24}\). In that case, our interval is \((y_{16}, y_{24})=(373, 402)\), with an exact confidence coefficient calculated using a binomial distribution with \(n=26\) and \(p=0.75\) as:

\(P(Y_{16}<\pi_{0.75}<Y_{24})=P(16 \le W \le 23)=P(W \le 23)-P(W \le 15)=0.9742-0.0401=0.9341\)

We can be 93.4% confident that the 75th percentile of all escape times is between 373 and 402 seconds.
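As a check on this calculation, the sketch below reads the interval endpoints off the sorted data and evaluates the confidence coefficient as a difference of two binomial cumulative probabilities. The helper `binom_cdf` and the variable names are assumptions made for illustration; only the Python standard library is used.

```python
from math import comb

# Escape times (in seconds) from the example.
escape_times = [
    325, 325, 334, 339, 356, 356, 359, 359, 363,
    364, 364, 366, 369, 370, 373, 373, 374, 375,
    389, 392, 393, 394, 397, 402, 403, 424,
]


def binom_cdf(w, n, p):
    """Return P(W <= w) for W ~ binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(w + 1))


y = sorted(escape_times)
n, p = len(y), 0.75
i, j = 16, 24                        # 1-based order-statistic indices

lower, upper = y[i - 1], y[j - 1]    # the interval (y_16, y_24)
# P(16 <= W <= 23) = P(W <= 23) - P(W <= 15)
coefficient = binom_cdf(j - 1, n, p) - binom_cdf(i - 1, n, p)

print((lower, upper), round(coefficient, 4))
```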

Because \(n=26\ge 20\) here, we could alternatively have used the normal approximation to the binomial. In this case, the mean and variance are:

\(\mu=np=0.75(26)=19.5\) and \(\sigma^2 =np(1-p)=26(0.75)(1-0.75)=4.875\)

respectively. Therefore, the approximate confidence coefficient for the interval \((y_{16}, y_{24})\) is:

\(P(Y_{16}<\pi_{0.75}<Y_{24})=P(16 \le W \le 23)\approx P\left(\dfrac{15.5-19.5}{\sqrt{4.875}} < Z < \dfrac{23.5-19.5}{\sqrt{4.875}} \right)\)

which can be simplified to:

\(P(Y_{16}<\pi_{0.75}<Y_{24})\approx P(-1.81 \le Z \le 1.81)=0.9649-0.0351=0.9298\)

As you can see, the normal approximation does quite well, as the approximate probability is 0.930 compared to the exact probability of 0.934.
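The normal approximation with the continuity correction used above (15.5 and 23.5 in place of 16 and 23) can be sketched as follows. The function names are hypothetical, and the standard normal cumulative probability is computed from `math.erf` rather than read from a table, so the result is not affected by rounding \(z\) to two decimal places.

```python
from math import erf, sqrt


def std_normal_cdf(z):
    """Return P(Z <= z) for a standard normal random variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def approx_confidence_coefficient(n, p, i, j):
    """Normal approximation to P(i <= W <= j - 1) for W ~ binomial(n, p),
    with a continuity correction of 0.5 at each end."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    z_low = (i - 0.5 - mu) / sigma         # (15.5 - 19.5) / sqrt(4.875)
    z_high = (j - 1 + 0.5 - mu) / sigma    # (23.5 - 19.5) / sqrt(4.875)
    return std_normal_cdf(z_high) - std_normal_cdf(z_low)


print(round(approx_confidence_coefficient(26, 0.75, 16, 24), 3))
```

For the interval \((y_{16}, y_{24})\) this gives a value of roughly 0.93, in line with the table-based calculation.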