# 18.2 - The Probability Density Functions

Our work on the previous page with finding the probability density function of a specific order statistic, namely the fifth one of a certain set of six random variables, should help us here when we work on finding the probability density function of any old order statistic, that is, the $$r^{th}$$ one.

## Theorem

Let $$Y_1<Y_2<\cdots<Y_n$$ be the order statistics of $$n$$ independent observations from a continuous distribution with cumulative distribution function $$F(x)$$ and probability density function:

$$f(x)=F'(x)$$

where $$0<F(x)<1$$ over the support $$a<x<b$$. Then, the probability density function of the $$r^{th}$$ order statistic is:

$$g_r(y)=\dfrac{n!}{(r-1)!(n-r)!}\left[F(y)\right]^{r-1}\left[1-F(y)\right]^{n-r}f(y)$$

over the support $$a<y<b$$.
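Although not part of the original lesson, the theorem translates directly into code. The following Python sketch is illustrative: the function name `order_stat_pdf` is ours, and it is checked against the standard uniform case, where $$Y_r$$ is known to follow a Beta$$(r, n-r+1)$$ distribution.

```python
from math import factorial

def order_stat_pdf(y, n, r, F, f):
    """g_r(y) = n! / ((r-1)! (n-r)!) * F(y)^(r-1) * (1 - F(y))^(n-r) * f(y)."""
    c = factorial(n) // (factorial(r - 1) * factorial(n - r))
    return c * F(y) ** (r - 1) * (1 - F(y)) ** (n - r) * f(y)

# Standard uniform on (0, 1): F(x) = x, f(x) = 1, so Y_r ~ Beta(r, n - r + 1).
uniform_F = lambda x: x
uniform_f = lambda x: 1.0

# Sample median of n = 5: the Beta(3, 3) density at 0.5 is 30 * 0.5^2 * 0.5^2.
print(order_stat_pdf(0.5, 5, 3, uniform_F, uniform_f))  # → 1.875
```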

## Proof

We'll again follow the strategy of first finding the cumulative distribution function $$G_r(y)$$ of the $$r^{th}$$ order statistic, and then differentiating it with respect to $$y$$ to get the probability density function $$g_r(y)$$. Now, if the event $$\{X_i\le y\},\;i=1, 2, \cdots, n$$ is considered a "success," and we let $$Z$$ = the number of such successes in $$n$$ mutually independent trials, then $$Z$$ is a binomial random variable with $$n$$ trials and probability of success:

$$F(y)=P(X_i \le y)$$

Now, the $$r^{th}$$ order statistic $$Y_r$$ is less than or equal to $$y$$ if and only if $$r$$ or more of the $$n$$ observations $$X_1, X_2, \cdots, X_n$$ are less than or equal to $$y$$, which implies:

$$G_r(y)=P(Y_r \le y)=P(Z=r)+P(Z=r+1)+\cdots+P(Z=n)$$

which can be written using summation notation as:

$$G_r(y)=\sum_{k=r}^{n} P(Z=k)$$

Now, we can replace $$P(Z=k)$$ with the probability mass function of a binomial random variable with parameters $$n$$ and $$p=F(y)$$. Doing so, we get:

$$G_r(y)=\sum_{k=r}^{n}\binom{n}{k}\left[F(y)\right]^{k}\left[1-F(y)\right]^{n-k}$$

Rewriting that slightly by pulling the $$n^{th}$$ term out of the summation notation, we get:

$$G_r(y)=\sum_{k=r}^{n-1}\binom{n}{k}\left[F(y)\right]^{k}\left[1-F(y)\right]^{n-k}+\left[F(y)\right]^{n}$$
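Before differentiating, it can be reassuring to verify the binomial-tail form of $$G_r(y)$$ empirically. This is an illustrative aside, not part of the original lesson: with a seeded simulation of uniform observations on $$(0,1)$$, where $$F(y)=y$$, the empirical frequency of $$\{Y_r \le y\}$$ should match the binomial sum.

```python
import random
from math import comb

def G_r(p, n, r):
    """Binomial-tail c.d.f. of the r-th order statistic, with p = F(y)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r, n + 1))

random.seed(0)
n, r, y = 6, 4, 0.7          # uniform(0, 1) observations, so F(y) = y
trials = 100_000
hits = sum(sorted(random.random() for _ in range(n))[r - 1] <= y
           for _ in range(trials))
print(hits / trials, G_r(y, n, r))  # empirical vs. exact; agree to ~2 decimals
```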

Now, it's just a matter of taking the derivative of $$G_r(y)$$ with respect to $$y$$. Using the product rule in conjunction with the chain rule on the terms in the summation, and the power rule in conjunction with the chain rule on the final $$\left[F(y)\right]^{n}$$ term, we get:

$$g_r(y)=\sum_{k=r}^{n-1}{n\choose k}(k)[F(y)]^{k-1}f(y)[1-F(y)]^{n-k}\\ +\sum_{k=r}^{n-1}{n\choose k}[F(y)]^k(n-k)[1-F(y)]^{n-k-1}(-f(y))\\+n[F(y)]^{n-1}f(y)$$ (**)

Now, it's just a matter of recognizing that:

$$\binom{n}{k}k=\frac{n!}{k!(n-k)!}\times k=\frac{n!}{(k-1)!(n-k)!}$$

and

$$\binom{n}{k}(n-k)=\frac{n!}{k!(n-k)!}\times(n-k)=\frac{n!}{k!(n-k-1)!}$$

Once we do that, we see that the terms of the two summations cancel in pairs, and the p.d.f. of the $$r^{th}$$ order statistic $$Y_r$$ is just the first ($$k=r$$) term of the first summation in $$g_r(y)$$. That is:

$$g_r(y)=\dfrac{n!}{(r-1)!(n-r)!}\left[F(y)\right]^{r-1}\left[1-F(y)\right]^{n-r}f(y)$$

for $$a<y<b$$. As was to be proved! Simple enough! Well, okay, that's a little unfair to say it's simple, as it's not all that obvious, is it? For homework, you'll be asked to write out, for the case when $$n=6$$ and $$r=3$$, the terms in the starred equation (**). In doing so, you should see that for all but the first of the positive terms in the starred equation, there is a corresponding negative term, so that everything but the first term cancels out. After you get a chance to work through that exercise, then perhaps it would be fair to say simple enough!
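The cancellation in that homework exercise can also be confirmed mechanically. In this illustrative sketch (not part of the original text), we write $$p$$ for $$F(y)$$, take $$f(y)=1$$ for convenience, and evaluate the starred equation (**) with exact rational arithmetic for $$n=6$$, $$r=3$$:

```python
from fractions import Fraction
from math import comb, factorial

n, r = 6, 3
p = Fraction(1, 3)   # stands in for F(y); f(y) is taken to be 1 for convenience

# The starred equation (**): two sums from the product rule, plus the n-th term.
starred = (sum(comb(n, k) * k * p**(k - 1) * (1 - p)**(n - k) for k in range(r, n))
           - sum(comb(n, k) * (n - k) * p**k * (1 - p)**(n - k - 1) for k in range(r, n))
           + n * p**(n - 1))

# The closed form left over after everything else cancels.
closed = (Fraction(factorial(n), factorial(r - 1) * factorial(n - r))
          * p**(r - 1) * (1 - p)**(n - r))

print(starred == closed)  # True: everything but the first term cancels
```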

## Example 18-2 (continued)

Let $$Y_1<Y_2<Y_3<Y_4<Y_5<Y_6$$ be the order statistics associated with $$n=6$$ independent observations each from the distribution with probability density function:

$$f(x)=\dfrac{1}{2}x$$

for $$0<x<2$$. What are the probability density functions of the first, fourth, and sixth order statistics?

### Solution

When we worked with this example on the previous page, we showed that the cumulative distribution function of $$X$$ is:

$$F(x)=\dfrac{x^2}{4}$$

for $$0<x<2$$. Therefore, applying the above theorem with $$n=6$$ and $$r=1$$, the p.d.f. of $$Y_1$$ is:

$$g_1(y)=\dfrac{6!}{0!(6-1)!}\left[\dfrac{y^2}{4}\right]^{1-1}\left[1-\dfrac{y^2}{4}\right]^{6-1}\left(\dfrac{1}{2}y\right)$$

for $$0<y<2$$, which can be simplified to:

$$g_1(y)=3y\left(1-\dfrac{y^2}{4}\right)^{5}$$

Applying the theorem with $$n=6$$ and $$r=4$$, the p.d.f. of $$Y_4$$ is:

$$g_4(y)=\dfrac{6!}{3!(6-4)!}\left[\dfrac{y^2}{4}\right]^{4-1}\left[1-\dfrac{y^2}{4}\right]^{6-4}\left(\dfrac{1}{2}y\right)$$

for $$0<y<2$$, which can be simplified to:

$$g_4(y)=\dfrac{15}{32}y^7\left(1-\dfrac{y^2}{4}\right)^{2}$$

Applying the theorem with $$n=6$$ and $$r=6$$, the p.d.f. of $$Y_6$$ is:

$$g_6(y)=\dfrac{6!}{5!(6-6)!}\left[\dfrac{y^2}{4}\right]^{6-1}\left[1-\dfrac{y^2}{4}\right]^{6-6}\left(\dfrac{1}{2}y\right)$$

for $$0<y<2$$, which can be simplified to:

$$g_6(y)=\dfrac{3}{1024}y^{11}$$
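As a quick numerical sanity check (an aside, not part of the original lesson), each of the three densities just derived should integrate to 1 over the support $$0<y<2$$. A short composite Simpson's rule sketch confirms this:

```python
def simpson(f, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, m, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, m, 2))
    return s * h / 3

g1 = lambda y: 3 * y * (1 - y**2 / 4) ** 5
g4 = lambda y: (15 / 32) * y**7 * (1 - y**2 / 4) ** 2
g6 = lambda y: (3 / 1024) * y**11

for g in (g1, g4, g6):
    print(round(simpson(g, 0.0, 2.0), 6))  # each ≈ 1.0
```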

When we graph the three functions on one plot (the figure is not reproduced here), we see something that makes intuitive sense: as the number of the order statistic increases, the p.d.f. "moves to the right" along the support interval.
