Lesson 14: Continuous Random Variables
Overview
A continuous random variable differs from a discrete random variable in that it takes on an uncountably infinite number of possible outcomes. For example, if we let \(X\) denote the height (in meters) of a randomly selected maple tree, then \(X\) is a continuous random variable. In this lesson, we'll extend much of what we learned about discrete random variables to the case in which a random variable is continuous. Our specific goals include:
 Finding the probability that \(X\) falls in some interval, that is, finding \(P(a<X<b)\), where \(a\) and \(b\) are some constants. We'll do this by using \(f(x)\), the probability density function ("p.d.f.") of \(X\), and \(F(x)\), the cumulative distribution function ("c.d.f.") of \(X\).
 Finding the mean \(\mu\), variance \(\sigma^2\), and standard deviation of \(X\). We'll do this through the definitions \(E(X)\) and \(\text{Var}(X)\) extended for a continuous random variable, as well as through the moment generating function \(M(t)\) extended for a continuous random variable.
Objectives
 To introduce the concept of a probability density function of a continuous random variable.
 To learn the formal definition of a probability density function of a continuous random variable.
 To learn that if \(X\) is continuous, the probability that \(X\) takes on any specific value \(x\) is 0.
 To learn how to find the probability that a continuous random variable \(X\) falls in some interval \((a, b)\).
 To learn the formal definition of a cumulative distribution function of a continuous random variable.
 To learn how to find the cumulative distribution function of a continuous random variable \(X\) from the probability density function of \(X\).
 To learn the formal definition of a \((100p)^{th}\) percentile.
 To learn the formal definition of the median, first quartile, and third quartile.
 To learn how to use the probability density function to find the \((100p)^{th}\) percentile of a continuous random variable \(X\).
 To extend the definitions of the mean, variance, standard deviation, and moment-generating function for a continuous random variable \(X\).
 To be able to apply the methods learned in the lesson to new problems.
 To learn a formal definition of the probability density function of a continuous uniform random variable.
 To learn a formal definition of the cumulative distribution function of a continuous uniform random variable.
 To learn key properties of a continuous uniform random variable, such as the mean, variance, and moment generating function.
 To understand and be able to create a quantile-quantile (q-q) plot.
 To understand how randomly generated uniform (0,1) numbers can be used to randomly assign experimental units to treatment.
 To understand how randomly generated uniform (0,1) numbers can be used to randomly select participants for a survey.
14.1  Probability Density Functions
A continuous random variable takes on an uncountably infinite number of possible values. For a discrete random variable \(X\) that takes on a finite or countably infinite number of possible values, we determined \(P(X=x)\) for all of the possible values of \(X\), and called it the probability mass function ("p.m.f."). For continuous random variables, as we shall soon see, the probability that \(X\) takes on any particular value \(x\) is 0. That is, finding \(P(X=x)\) for a continuous random variable \(X\) is not going to work. Instead, we'll need to find the probability that \(X\) falls in some interval \((a, b)\), that is, we'll need to find \(P(a<X<b)\). We'll do that using a probability density function ("p.d.f."). We'll first motivate a p.d.f. with an example, and then we'll formally define it.
Example 14-1
Even though a fast-food chain might advertise a hamburger as weighing a quarter-pound, you can well imagine that it is not exactly 0.25 pounds. One randomly selected hamburger might weigh 0.23 pounds while another might weigh 0.27 pounds. What is the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds? That is, if we let \(X\) denote the weight of a randomly selected quarter-pound hamburger in pounds, what is \(P(0.20<X<0.30)\)?
Solution
In reality, I'm not particularly interested in using this example just so that you'll know whether or not you've been ripped off the next time you order a hamburger! Instead, I'm interested in using the example to illustrate the idea behind a probability density function.
Now, you could imagine randomly selecting, let's say, 100 hamburgers advertised to weigh a quarterpound. If you weighed the 100 hamburgers, and created a density histogram of the resulting weights, perhaps the histogram might look something like this:
In this case, the histogram illustrates that most of the sampled hamburgers do indeed weigh close to 0.25 pounds, but some are a bit more and some a bit less. Now, what if we decreased the length of the class interval on that density histogram? Then, the density histogram would look something like this:
Now, what if we pushed this further and decreased the intervals even more? You can imagine that the intervals would eventually get so small that we could represent the probability distribution of \(X\), not as a density histogram, but rather as a curve (by connecting the "dots" at the tops of the tiny tiny tiny rectangles) that, in this case, might look like this:
Such a curve is denoted \(f(x)\) and is called a (continuous) probability density function.
Now, you might recall that a density histogram is defined so that the area of each rectangle equals the relative frequency of the corresponding class, and the area of the entire histogram equals 1. That suggests then that finding the probability that a continuous random variable \(X\) falls in some interval of values involves finding the area under the curve \(f(x)\) sandwiched by the endpoints of the interval. In the case of this example, the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds is then this area:
Now that we've motivated the idea behind a probability density function for a continuous random variable, let's now go and formally define it.
 Probability Density Function ("p.d.f.")

The probability density function ("p.d.f.") of a continuous random variable \(X\) with support \(S\) is an integrable function \(f(x)\) satisfying the following:

\(f(x)\) is positive everywhere in the support \(S\), that is, \(f(x)>0\), for all \(x\) in \(S\)

The area under the curve \(f(x)\) in the support \(S\) is 1, that is:
\(\int_S f(x)dx=1\)

If \(f(x)\) is the p.d.f. of \(X\), then the probability that \(X\) belongs to \(A\), where \(A\) is some interval, is given by the integral of \(f(x)\) over that interval, that is:
\(P(X \in A)=\int_A f(x)dx\)

As you can see, the definition for the p.d.f. of a continuous random variable differs from the definition for the p.m.f. of a discrete random variable by simply changing the summations that appeared in the discrete case to integrals in the continuous case. Let's test this definition out on an example.
Example 14-2
Let \(X\) be a continuous random variable whose probability density function is:
\(f(x)=3x^2, \qquad 0<x<1\)
First, note again that \(f(x)\ne P(X=x)\). For example, \(f(0.9)=3(0.9)^2=2.43\), which is clearly not a probability! In the continuous case, \(f(x)\) is instead the height of the curve at \(X=x\), so that the total area under the curve is 1. In the continuous case, it is areas under the curve that define the probabilities.
Now, let's first start by verifying that \(f(x)\) is a valid probability density function.
Solution
What is the probability that \(X\) falls between \(\frac{1}{2}\) and 1? That is, what is \(P\left(\frac{1}{2}<X<1\right)\)?
Solution
What is \(P\left(X=\frac{1}{2}\right)\)?
Solution
It is a straightforward integration to see that the probability is 0:
\(\int^{1/2}_{1/2} 3x^2dx=\left[x^3\right]^{x=1/2}_{x=1/2}=\dfrac{1}{8}-\dfrac{1}{8}=0\)
In fact, in general, if \(X\) is continuous, the probability that \(X\) takes on any specific value \(x\) is 0. That is, when \(X\) is continuous, \(P(X=x)=0\) for all \(x\) in the support.
An implication of the fact that \(P(X=x)=0\) for all \(x\) when \(X\) is continuous is that you can be careless about the endpoints of intervals when finding probabilities of continuous random variables. That is:
\(P(a\le X\le b)=P(a<X\le b)=P(a\le X<b)=P(a<X<b)\)
for any constants \(a\) and \(b\).
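These facts are easy to verify numerically. The sketch below is a minimal check in plain Python (no libraries assumed), approximating the integrals of the p.d.f. \(f(x)=3x^2\) discussed above with a midpoint Riemann sum:

```python
# Minimal numeric sanity check (plain Python) for the p.d.f. f(x) = 3x^2
# on 0 < x < 1, using a midpoint Riemann sum in place of exact integration.

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 3 * x**2

total = integrate(f, 0, 1)      # area under the entire curve
prob = integrate(f, 0.5, 1)     # P(1/2 < X < 1)
point = integrate(f, 0.5, 0.5)  # P(X = 1/2): an interval of width zero

print(round(total, 6))  # ≈ 1.0
print(round(prob, 6))   # ≈ 0.875 = 7/8
print(point)            # exactly 0.0
```

The zero-width interval illustrates the point of this subsection: a single value carries no area, hence no probability.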
Example 14-3
Let \(X\) be a continuous random variable whose probability density function is:
\(f(x)=\dfrac{x^3}{4}\)
for an interval \(0<x<c\). What is the value of the constant \(c\) that makes \(f(x)\) a valid probability density function?
Solution
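Since the worked solution isn't reproduced here, a quick numeric sketch: the area condition \(\int_0^c \frac{x^3}{4}dx=\frac{c^4}{16}=1\) pins down \(c\), and solving it by bisection should land on \(c=2\).

```python
# Sketch: solve c^4/16 = 1 by bisection (the area condition for f(x) = x^3/4).

def area(c):
    return c**4 / 16  # closed form of the integral of x^3/4 from 0 to c

lo, hi = 0.0, 10.0  # area(0) < 1 < area(10), so the root lies in between
for _ in range(200):
    mid = (lo + hi) / 2
    if area(mid) < 1:
        lo = mid
    else:
        hi = mid

c = (lo + hi) / 2
print(round(c, 6))  # → 2.0
```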
14.2  Cumulative Distribution Functions
You might recall that the cumulative distribution function is defined for discrete random variables as:
\(F(x)=P(X\leq x)=\sum\limits_{t \leq x} f(t)\)
Again, \(F(x)\) accumulates all of the probability less than or equal to \(x\). The cumulative distribution function for continuous random variables is just a straightforward extension of that of the discrete case. All we need to do is replace the summation with an integral.
 Cumulative Distribution Function ("c.d.f.")

The cumulative distribution function ("c.d.f.") of a continuous random variable \(X\) is defined as:
\(F(x)=\int_{-\infty}^x f(t)dt\)
for \(-\infty<x<\infty\).
You might recall, for discrete random variables, that \(F(x)\) is, in general, a nondecreasing step function. For continuous random variables, \(F(x)\) is a nondecreasing continuous function.
Example 14-2 Revisited
Let's return to the example in which \(X\) has the following probability density function:
\(f(x)=3x^2, \qquad 0<x<1\)
What is the cumulative distribution function \(F(x)\)?
Example 14-3 Revisited
Let's return to the example in which \(X\) has the following probability density function:
\(f(x)=\dfrac{x^3}{4}\)
for \(0<x<2\). What is the cumulative distribution function of \(X\)?
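For both of these revisited examples, the c.d.f. can be cross-checked numerically: integrating each p.d.f. from the lower endpoint of its support should reproduce the closed forms \(F(x)=x^3\) (for \(f(x)=3x^2\) on \(0<x<1\)) and \(F(x)=\dfrac{x^4}{16}\) (for \(f(x)=\dfrac{x^3}{4}\) on \(0<x<2\)). A sketch in plain Python:

```python
# Sketch: recover F(x) numerically by integrating the p.d.f. from the lower
# support endpoint up to x, and compare against the closed-form c.d.f.s.

def cdf(pdf, lower, x, n=50_000):
    """Midpoint Riemann sum of pdf over [lower, x]."""
    h = (x - lower) / n
    return sum(pdf(lower + (i + 0.5) * h) for i in range(n)) * h

F1 = cdf(lambda t: 3 * t**2, 0, 0.7)   # compare with 0.7**3
F2 = cdf(lambda t: t**3 / 4, 0, 1.2)   # compare with 1.2**4 / 16

print(round(F1, 6), round(0.7**3, 6))
print(round(F2, 6), round(1.2**4 / 16, 6))
```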
Example 14-4
Suppose the p.d.f. of a continuous random variable \(X\) is defined as:
\(f(x)=\begin{cases} x+1, & -1<x<0\\ 1-x, & 0\le x<1 \end{cases} \)
Find and graph the c.d.f. \(F(x)\).
Solution
If we look at a graph of the p.d.f. \(f(x)\):
we see that the cumulative distribution function \(F(x)\) must be defined over four intervals — for \(x\le -1\), for \(-1<x\le 0\), for \(0<x<1\), and for \(x\ge 1\). The definition of \(F(x)\) for \(x\le -1\) is easy. Since no probability accumulates over that interval, \(F(x)=0\) for \(x\le -1\). Similarly, the definition of \(F(x)\) for \(x\ge 1\) is easy. Since all of the probability has been accumulated for \(x\) beyond 1, \(F(x)=1\) for \(x\ge 1\). Now for the other two intervals:
In summary, the cumulative distribution function defined over the four intervals is:
\(\begin{equation}F(x)=\left\{\begin{array}{ll}
0, & \text { for } x \leq -1 \\
\frac{1}{2}(x+1)^{2}, & \text { for } -1<x \leq 0 \\
1-\frac{(1-x)^{2}}{2}, & \text { for } 0<x<1 \\
1, & \text { for } x \geq 1
\end{array}\right.\end{equation}\)
The cumulative distribution function is therefore a concave up parabola over the interval \(-1<x\le 0\) and a concave down parabola over the interval \(0<x<1\). Therefore, the graph of the cumulative distribution function looks something like this:
14.3  Finding Percentiles
At some point in your life, you have most likely been told that you fall in the something-something percentile with regards to some measure. For example, if you are tall, you might have been told that you are in the 95th percentile in height, meaning that you are taller than 95% of the population. When you took the SAT Exams, you might have been told that you are in the 80th percentile in math ability, meaning that you scored better than 80% of the population on the math portion of the SAT Exams. We'll now formally define what a percentile is within the framework of probability theory.
Definition. If \(X\) is a continuous random variable, then the \((100p)^{th}\) percentile is a number \(\pi_p\) such that the area under \(f(x)\) and to the left of \(\pi_p\) is \(p\).
That is, \(p\) is the integral of \(f(x)\) from \(-\infty\) to \(\pi_p\):
\(p=\int_{-\infty}^{\pi_p} f(x)dx=F(\pi_p)\)
Some percentiles are given special names:
 The 25th percentile, \(\pi_{0.25}\), is called the first quartile (denoted \(q_1\)).
 The 50th percentile, \(\pi_{0.50}\), is called the median (denoted \(m\)) or the second quartile (denoted \(q_2\)).
 The 75th percentile, \(\pi_{0.75}\), is called the third quartile (denoted \(q_3\)).
Example 14-5
A prospective college student is told that if her total score on the SAT Exam is in the 99th percentile, then she can most likely attend the college of her choice. It is well-known that the distribution of SAT Exam scores is bell-shaped, and the average total score is typically around 1500. Here is a picture depicting the situation:
The student would like to know what her total score, \(\pi_{0.99}\), needs to be in order to ensure that she falls in the 99th percentile. Data from the 2009 SAT Exam Scores suggests that the student should obtain at least a 2200 on her exam. That is, \(\pi_{0.99}=2200\).
Example 14-6
Let \(X\) be a continuous random variable with the following probability density function:
\(f(x)=\dfrac{1}{2}\)
for \(0<x<2\). What is the first quartile, median, and third quartile of \(X\)?
Solution
Because the p.d.f. is uniform, meaning it remains constant over the support, we can readily find the percentiles in one of two ways. We can use the p.d.f. directly to find the first quartile, median, and third quartile:
Alternatively, we can use the cumulative distribution function:
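Since the worked computations aren't reproduced here, a brief sketch: with \(f(x)=\frac{1}{2}\) on \(0<x<2\), the c.d.f. is \(F(x)=\frac{x}{2}\), so setting \(p=F(\pi_p)\) gives \(\pi_p=2p\):

```python
# Sketch: percentiles of the uniform p.d.f. f(x) = 1/2 on (0, 2).
# Here F(x) = x/2, so the (100p)th percentile is pi_p = 2p.

def percentile(p):
    return 2 * p  # inverse of F(x) = x/2

q1 = percentile(0.25)  # first quartile
m = percentile(0.50)   # median
q3 = percentile(0.75)  # third quartile

print(q1, m, q3)  # → 0.5 1.0 1.5
```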
Example 14-7
Let \(X\) be a continuous random variable with the following probability density function:
\(f(x)=\dfrac{1}{2}(x+1)\)
for \(-1<x<1\). What is the 64th percentile of \(X\)?
Solution
To find the 64th percentile, we first need to find the cumulative distribution function \(F(x)\). It is:
\(F(x)=\dfrac{1}{2}\int_{-1}^x(t+1)dt=\dfrac{1}{2} \left[\dfrac{(t+1)^2}{2}\right]^{t=x}_{t=-1}=\dfrac{1}{4}(x+1)^2\)
for \(-1<x<1\). Now, to find the 64th percentile, we just need to set 0.64 equal to \(F(\pi_{0.64})\) and solve for \(\pi_{0.64}\). That is, we need to solve for \(\pi_{0.64}\) in the following equation:
\(0.64=F(\pi_{0.64})=\dfrac{1}{4}(\pi_{0.64}+1)^2\)
Multiplying both sides by 4, we get:
\(2.56=(\pi_{0.64}+1)^2\)
Taking the square root of both sides, we get:
\(\pi_{0.64}+1=\pm \sqrt{2.56}=\pm 1.6\)
And, subtracting 1 from both sides, we get:
\(\pi_{0.64}=-2.6 \text{ or } 0.60\)
Because the support is \(-1<x<1\), the 64th percentile must be 0.6, not −2.6.
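The root-selection step can be mirrored in code: solve \((\pi+1)^2=2.56\) and keep only the root that falls inside the support. A sketch:

```python
# Sketch: solve 0.64 = (pi + 1)^2 / 4 and discard the root outside (-1, 1).
import math

roots = [-1 + math.sqrt(2.56), -1 - math.sqrt(2.56)]  # pi + 1 = ±1.6
valid = [r for r in roots if -1 < r < 1]

print([round(r, 6) for r in roots])  # one root near 0.6, one near -2.6
print(round(valid[0], 6))            # only 0.6 lies in the support
```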
14.4  Special Expectations
The special expectations, such as the mean, variance, and moment generating function, for continuous random variables are just a straightforward extension of those of the discrete case. Again, all we need to do is replace the summations with integrals.
 Expected Value

The expected value or mean of a continuous random variable \(X\) is:
\(\mu=E(X)=\int^{+\infty}_{-\infty} xf(x)dx\)
 Variance

The variance of a continuous random variable \(X\) is:
\(\sigma^2=Var(X)=E[(X-\mu)^2]=\int^{+\infty}_{-\infty}(x-\mu)^2 f(x)dx\)
Alternatively, you can still use the shortcut formula for the variance, \(\sigma^2=E(X^2)-\mu^2\), with:
\(E(X^2)=\int^{+\infty}_{-\infty} x^2 f(x)dx\)
 Standard Deviation

The standard deviation of a continuous random variable \(X\) is:
\(\sigma=\sqrt{Var(X)}\)
 Moment Generating Function

The moment generating function of a continuous random variable \(X\), if it exists, is:
\(M(t)=\int^{+\infty}_{-\infty} e^{tx}f(x)dx\)
for \(-h<t<h\).
As before, differentiating the moment generating function provides us with a way of finding the mean:
\(E(X)=M'(0)\)
and the variance:
\(\text{Var}(X)=M^{\prime\prime}(0)-\left(M^\prime(0)\right)^2\)
Example 14-2 Revisited Again
Suppose \(X\) is a continuous random variable with the following probability density function:
\(f(x)=3x^2, \qquad 0<x<1\)
What is the mean of \(X\)?
Solution
What is the variance of \(X\)?
Solution
Example 14-8
Suppose \(X\) is a continuous random variable with the following probability density function:
\(f(x)=xe^{-x}\)
for \(0<x<\infty\). Use the moment generating function \(M(t)\) to find the mean of \(X\).
Solution
The moment generating function is found by integrating:
\(M(t)=E(e^{tX})=\int^{+\infty}_0 e^{tx} (xe^{-x})dx=\int^{+\infty}_0 xe^{-x(1-t)}dx\)
Because the upper limit is \(\infty\), we can rewrite the integral using a limit:
\(M(t)=\lim\limits_{b \to \infty} \int_0^b xe^{-x(1-t)}dx\)
Now, you might recall from your study of calculus that integrating this beast is going to require integration by parts. If you need to integrate
\(\int udv\)
integration by parts tells us that the integral is:
\(\int udv=uv\int vdu\)
In our case, let's let:
\(u=x\) and \(dv=e^{-x(1-t)}dx\)
Differentiating \(u\) and integrating \(dv\), we get:
\(du=dx\) and \(v=-\dfrac{1}{1-t}e^{-x(1-t)}\)
Therefore, using the integration by parts formula, we get:
\(M(t)=\lim\limits_{b \to \infty} \left\{\left[-\dfrac{1}{1-t}xe^{-x(1-t)}\right]_{x=0}^{x=b}+\left(\dfrac{1}{1-t}\right)\int_0^be^{-x(1-t)}dx\right\}\)
Evaluating the first term at \(x=0\) and \(x=b\), and integrating the last term, we get:
\(M(t)=\lim\limits_{b \to \infty}\left\{\left[-\dfrac{1}{1-t} be^{-b(1-t)}\right]+\left(\dfrac{1}{1-t}\right) \left[\left(-\dfrac{1}{1-t}\right)e^{-x(1-t)}\right]_{x=0}^{x=b} \right\}\)
which, upon evaluating the last term at \(x=0\) and \(x=b\), as well as simplifying and distributing the limit as \(b\) goes to infinity, we get:
\(M(t)=\lim\limits_{b \to \infty}\left[-\dfrac{1}{1-t} \dfrac{b}{e^{b(1-t)}}\right]-\left(\dfrac{1}{1-t}\right)^2 \lim\limits_{b \to \infty}(e^{-b(1-t)}-1)\)
Now, taking the limit of the second term is straightforward:
\(\lim\limits_{b \to \infty}(e^{-b(1-t)}-1)=-1\)
Therefore:
\(M(t)=\lim\limits_{b \to \infty}\left[-\dfrac{1}{1-t} \dfrac{b}{e^{b(1-t)}}\right]+\left(\dfrac{1}{1-t}\right)^2\)
Now, if you take the limit of the first term as \(b\) goes to infinity, you can see that we get infinity over infinity! You might recall that in this situation we need to use what is called L'Hôpital's Rule. It tells us that we can find the limit of that first term by first differentiating the numerator and denominator separately. Doing just that, we get:
\(M(t)=\lim\limits_{b \to \infty}\left[-\dfrac{1}{1-t} \times \dfrac{1}{(1-t)e^{b(1-t)}}\right]+\left(\dfrac{1}{1-t}\right)^2\)
Now, if you take the limit as \(b\) goes to infinity, you see that the first term approaches 0. Therefore (finally):
\(M(t)=\left(\dfrac{1}{1t}\right)^2\)
as long as \(t<1\). Now, with the hard work behind us, using the m.g.f. to find the mean of \(X\) is a straightforward exercise:
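With the symbolic work done, a numeric cross-check is reassuring. The sketch below approximates \(M(t)=E(e^{tX})\) for \(f(x)=xe^{-x}\) by a midpoint sum (truncating the integral at an arbitrary upper limit of 60, beyond which the tail is negligible), compares it with \((1-t)^{-2}\), and recovers the mean \(E(X)=M'(0)=2\) by a central difference:

```python
# Sketch: numeric check that M(t) ≈ (1 - t)^(-2) for f(x) = x e^{-x}, x > 0,
# and that M'(0) ≈ 2 (the mean), via a central difference.
import math

def M(t, upper=60.0, n=200_000):
    """Midpoint-rule approximation of E[e^{tX}]; the tail past `upper` is tiny."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += math.exp(t * x) * x * math.exp(-x)
    return total * h

print(round(M(0.2), 4), round((1 - 0.2) ** -2, 4))  # both ≈ 1.5625

eps = 1e-4
mean = (M(eps) - M(-eps)) / (2 * eps)  # central-difference estimate of M'(0)
print(round(mean, 3))                  # ≈ 2.0
```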
14.5  Piecewise Distributions and other Examples
Some distributions are split into parts. They are not necessarily continuous, but they are continuous over particular intervals. These types of distributions are known as piecewise distributions. Below is an example of this type of distribution:
\(\begin{align*} f(x)=\begin{cases} 2-4x, & x< 1/2\\ 4x-2, & x\ge 1/2 \end{cases} \end{align*}\)
for \(0<x<1\). The pdf of \(X\) is shown below.
The first step is to show this is a valid pdf. To show it is a valid pdf, we have to show the following:

\(f(x)>0\). We can see that \(f(x)\) is greater than or equal to 0 for all values of \(X\).

\(\int_S f(x)dx=1\).
\(\begin{align*} & \int_0^{1/2}(2-4x)dx+\int_{1/2}^1 (4x-2)dx\\ & = \left[2x-2x^2\right]_0^{1/2}+\left[2x^2-2x\right]_{1/2}^1\\ & = \left(1-\frac{1}{2}\right)+\left[(2-2)-\left(\frac{1}{2}-1\right)\right]=\frac{1}{2}+\frac{1}{2}=1 \end{align*}\)

If \((a, b)\subset S\), then \(P(a<X<b)=\int_a^bf(x)dx\). Let's find the probability that \(X\) is between 0 and \(2/3\).
\(P(X<2/3)=\int_0^{1/2} (2-4x)dx+\int_{1/2}^{2/3} (4x-2)dx=\frac{5}{9}\)
The next step is to know how to find expectations of piecewise distributions. If we know how to do this, we can find the mean, variance, etc., of a random variable with this type of distribution. Suppose we want to find the expected value, \(E(X)\).
\(\begin{align*}& E(X)=\int_0^{1/2} x(2-4x)dx+\int_{1/2}^1 x(4x-2)dx\\& =\left(x^2-\frac{4}{3}x^3\right)_0^{1/2}+\left(\frac{4}{3}x^3-x^2\right)_{1/2}^1=\frac{1}{2}\end{align*}\)
The variance and other expectations can be found similarly.
The final step is to find the cumulative distribution function (cdf). Recall the cdf of \(X\) is \(F_X(t)=P(X\le t)\). Therefore, for \(t<\frac{1}{2}\), we have
\(F_X(t)=\int_0^t (2-4x)dx=\left[2x-2x^2\right]_0^t=2t-2t^2\)
and for \(t\ge\frac{1}{2}\) we have
\(\begin{align*} & F_X(t)=\int_0^{1/2}(2-4x)dx+\int_{1/2}^t (4x-2)dx=\frac{1}{2}+\left[2x^2-2x\right]_{1/2}^t\\ & =2t^2-2t+1 \end{align*}\)
Thus, the cdf of \(X\) is
\(\begin{equation*} F_X(t)=\begin{cases} 2t-2t^2 & 0<t<1/2\\ 2t^2-2t+1 & 1/2\le t<1 \end{cases} \end{equation*}\)
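The piecewise computations above are easy to verify numerically with a simple midpoint sum, again in plain Python; the checks below confirm the total area, \(P(X<2/3)=\frac{5}{9}\), and \(E(X)=\frac{1}{2}\):

```python
# Sketch: numeric checks for the piecewise p.d.f. f(x) = 2 - 4x (x < 1/2),
# f(x) = 4x - 2 (x >= 1/2), on the support 0 < x < 1.

def f(x):
    return 2 - 4 * x if x < 0.5 else 4 * x - 2

def integrate(g, a, b, n=100_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

area = integrate(f, 0, 1)                  # total probability
p = integrate(f, 0, 2 / 3)                 # P(X < 2/3)
ex = integrate(lambda x: x * f(x), 0, 1)   # E(X)

print(round(area, 6))              # ≈ 1.0
print(round(p, 6), round(5/9, 6))  # both ≈ 0.555556
print(round(ex, 6))                # ≈ 0.5
```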
Example 14-9: Mixture Distribution
Let \(f_1(y)\) and \(f_2(y)\) be density functions, \(y\) is a real number, and let \(a\) be a constant such that \(0\le a\le 1\). Consider the function
\(f(y)=af_1(y)+(1a)f_2(y)\)

First, let's show that \(f(y)\) is a density function. A density function of this form is referred to as a mixture density (a mixture of two different density functions). My research is based on mixture densities.
\(\begin{align*} & \int_{-\infty}^{\infty} \left[af_1(y)+(1-a)f_2(y)\right]dy=a\int f_1(y)dy+(1-a)\int f_2(y)dy\\ & = a(1)+(1-a)(1)=a+1-a=1 \end{align*}\)

Suppose that \(Y_1\) is a random variable with density function \(f_1(y)\) and that \(E(Y_1)=\mu_1\) and \(Var(Y_1)=\sigma^2_1\); and similarly suppose that \(Y_2\) is a random variable with density function \(f_2(y)\) and that \(E(Y_2)=\mu_2\) and \(Var(Y_2)=\sigma^2_2\). Assume that \(Y\) is a random variable whose density is a mixture of densities corresponding to \(Y_1\) and \(Y_2\).

We can find the expected value of \(Y\) in terms of \(a, \;\mu_1, \text{ and } \mu_2\).
\(\begin{align*} & E(Y)=\int yf(y)dy=\int y(af_1(y)+(1-a)f_2(y))dy\\ & = a\int yf_1(y)dy + (1-a) \int yf_2(y)dy\\ & =a\mu_1+(1-a)\mu_2 \end{align*}\)

We can also find the variance of \(Y\) similar to the above.
\(\begin{align*} & E(Y^2)=\int \left[ay^2f_1(y)+(1-a)y^2f_2(y)\right]dy=aE(Y_1^2)+(1-a)E(Y_2^2)\\ & =a(\mu_1^2+\sigma^2_1)+(1-a)(\mu_2^2+\sigma_2^2)\\ & Var(Y)=E(Y^2)-E(Y)^2 \end{align*}\)
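To exercise these formulas on concrete numbers, the sketch below mixes two hypothetical uniform components, \(U(0,1)\) and a uniform density on \((2,4)\), with weight \(a=0.3\) (values chosen purely for illustration):

```python
# Sketch (hypothetical components): mixture of U(0,1) and a uniform on (2, 4).
a = 0.3
mu1, var1 = 0.5, 1 / 12   # mean and variance of U(0,1)
mu2, var2 = 3.0, 4 / 12   # mean and variance of a uniform on (2,4): (b-a)^2/12

mean = a * mu1 + (1 - a) * mu2                         # E(Y)
ey2 = a * (mu1**2 + var1) + (1 - a) * (mu2**2 + var2)  # E(Y^2)
var = ey2 - mean**2                                    # Var(Y)

print(round(mean, 4))  # → 2.25
print(round(var, 4))   # ≈ 1.5708
```

Note that the mixture variance exceeds both component variances here, because the two component means are far apart.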

Additional Practice Problems

A random variable \(X\) has the following probability density function:
\(\begin{align*} f(x)=\begin{cases} \frac{1}{8}x & 0\le x\le 2\\ \frac{1}{4} & 4\le x\le 7 \end{cases} \end{align*}\)

Find the cumulative distribution function (CDF) of \(X\).
We should do this in pieces:
\(F(x)=\int_0^x\frac{1}{8}t\,dt=\frac{x^2}{16}, \qquad 0\le x\le 2\)
Between 2 and 4, the cdf remains the same. Therefore,
\(F(x)=\frac{2^2}{16}=\frac{1}{4}, \qquad 2\le x<4\)
After 4, the cdf becomes:
\(F(x)=\frac{1}{4}+\int_4^x\frac{1}{4}dt=\frac{1}{4}+\frac{1}{4}(x-4)=\frac{x-3}{4}, \qquad 4\le x\le 7\)
Therefore, we have:
\(F(x)=\begin{cases}0, & x<0\\ \frac{x^2}{16}, & 0\le x<2\\ \frac{1}{4}, & 2\le x<4\\ \frac{x-3}{4}, & 4\le x\le 7\\ 1, & x>7 \end{cases}\)

Find the median of \(X\). It helps to plot the CDF.
The median is between 4 and 7 and \(P(X<4)=\frac{1}{4}\). Let \(m\) denote the median.
\(0.5=F(m)=\frac{m-3}{4}\qquad \Rightarrow m-3=2 \qquad \Rightarrow m=5\).


Let \(X\) have probability density function \(f_X\) and cdf \(F_X(x)\). Find the probability density function of the random variable \(Y\) in terms of \(f_X\), if \(Y\) is defined by \(Y=aX+b\) with \(a>0\). HINT: Start with the definition of the cdf of \(Y\).
\(F_Y(y)=P(Y\le y)=P(aX+b\le y)=P\left(X\le \frac{y-b}{a}\right)=F_X\left(\frac{y-b}{a}\right)\)
We know \(\frac{\partial }{\partial y}F_Y(y)=f_Y(y)\). Therefore,
\(f_Y(y)=\frac{\partial }{\partial y}F_Y(y)=\frac{\partial }{\partial y}F_X\left(\frac{y-b}{a}\right)=f_X\left(\frac{y-b}{a}\right)\left(\frac{1}{a}\right)\)
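This transformation result can be spot-checked numerically (the derivation assumes \(a>0\), so the inequality direction is preserved). Taking \(X\) with p.d.f. \(3x^2\) on \((0,1)\) and \(Y=2X+1\) as a hypothetical instance, the slope of \(F_Y\) should match \(f_X\left(\frac{y-b}{a}\right)\cdot\frac{1}{a}\):

```python
# Sketch: check f_Y(y) = f_X((y - b)/a) * (1/a) for Y = aX + b with a > 0,
# using X with p.d.f. f_X(x) = 3x^2 and c.d.f. F_X(x) = x^3 on (0, 1).
a, b = 2.0, 1.0
F_X = lambda x: x**3
f_X = lambda x: 3 * x**2

def F_Y(y):
    return F_X((y - b) / a)

y = 2.2  # any point inside the support (1, 3) of Y
eps = 1e-6
numeric = (F_Y(y + eps) - F_Y(y - eps)) / (2 * eps)  # slope of F_Y at y
formula = f_X((y - b) / a) * (1 / a)

print(round(numeric, 6), round(formula, 6))  # the two should agree
```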
14.6  Uniform Distributions
Uniform Distribution

A continuous random variable \(X\) has a uniform distribution, denoted \(U(a,b)\), if its probability density function is:
\(f(x)=\dfrac{1}{b-a}\)
for two constants \(a\) and \(b\), such that \(a<x<b\). A graph of the p.d.f. looks like this:
Note that the length of the base of the rectangle is \((b-a)\), while the length of the height of the rectangle is \(\dfrac{1}{b-a}\). Therefore, as should be expected, the area under \(f(x)\) and between the endpoints \(a\) and \(b\) is 1. Additionally, \(f(x)>0\) over the support \(a<x<b\). Therefore, \(f(x)\) is a valid probability density function.
Because there are an infinite number of possible constants \(a\) and \(b\), there are an infinite number of possible uniform distributions. That's why this page is called Uniform Distributions (with an s!) and not Uniform Distribution (with no s!). That said, the continuous uniform distribution most commonly used is the one in which \(a=0\) and \(b=1\).
 Cumulative Distribution Function of a Uniform Random Variable \(X\)

The cumulative distribution function of a uniform random variable \(X\) is:
\(F(x)=\dfrac{x-a}{b-a}\)
for two constants \(a\) and \(b\) such that \(a<x<b\). A graph of the c.d.f. looks like this:
As the picture illustrates, \(F(x)=0\) when \(x\) is less than the lower endpoint of the support (\(a\), in this case) and \(F(x)=1\) when \(x\) is greater than the upper endpoint of the support (\(b\), in this case). The slope of the line between \(a\) and \(b\) is, of course, \(\dfrac{1}{b-a}\).
14.7  Uniform Properties
Here, we present and prove three key properties of a uniform random variable.
Theorem
The mean of a continuous uniform random variable defined over the support \(a<x<b\) is:
\(\mu=E(X)=\dfrac{a+b}{2}\)
Proof
Theorem
The variance of a continuous uniform random variable defined over the support \(a<x<b\) is:
\(\sigma^2=Var(X)=\dfrac{(b-a)^2}{12}\)
Proof
Because we just found the mean \(\mu=E(X)\) of a continuous random variable, it will probably be easiest to use the shortcut formula:
\(\sigma^2=E(X^2)-\mu^2\)
to find the variance. Let's start by finding \(E(X^2)\):
Now, using the shortcut formula and what we now know about \(E(X^2)\) and \(E(X)\), we have:
\(\sigma^2=E(X^2)-\mu^2=\dfrac{b^2+ab+a^2}{3}-\left(\dfrac{b+a}{2}\right)^2\)
Simplifying a bit:
\(\sigma^2=\dfrac{b^2+ab+a^2}{3}-\dfrac{b^2+2ab+a^2}{4}\)
and getting a common denominator:
\(\sigma^2=\dfrac{4b^2+4ab+4a^2-3b^2-6ab-3a^2}{12}\)
Simplifying a bit more:
\(\sigma^2=\dfrac{b^2-2ab+a^2}{12}\)
and, finally, we have:
\(\sigma^2=\dfrac{(b-a)^2}{12}\)
as was to be proved.
Theorem
The moment generating function of a continuous uniform random variable defined over the support \(a < x < b\) is:
\(M(t)=\dfrac{e^{tb}-e^{ta}}{t(b-a)}\)
for \(t\ne 0\), with \(M(0)=1\).
Proof
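Since the proof isn't reproduced here, a numeric sketch can at least confirm the formula: compare a midpoint-rule approximation of \(E(e^{tX})\) for a particular \(U(a,b)\) (the choices \(a=1\), \(b=4\), \(t=0.3\) are arbitrary) with the closed form above.

```python
# Sketch: numeric check of M(t) = (e^{tb} - e^{ta}) / (t(b - a)) for U(a, b).
import math

a, b, t = 1.0, 4.0, 0.3  # arbitrary illustrative choices

def M_numeric(n=100_000):
    """Midpoint-rule approximation of E[e^{tX}] for X ~ U(a, b)."""
    h = (b - a) / n
    return sum(math.exp(t * (a + (i + 0.5) * h)) for i in range(n)) * h / (b - a)

M_closed = (math.exp(t * b) - math.exp(t * a)) / (t * (b - a))

print(round(M_numeric(), 6), round(M_closed, 6))  # the two should agree
```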
14.8  Uniform Applications
Perhaps not surprisingly, the uniform distribution is not particularly useful in describing much of the randomness we see in the natural world. Its claim to fame is instead its usefulness in random number generation. That is, approximate values of the \(U(0,1)\) distribution can be simulated on most computers using a random number generator. The generated numbers can then be used to randomly assign people to treatments in experimental studies, or to randomly select individuals for participation in a survey.
Before we explore the above-mentioned applications of the \(U(0,1)\) distribution, it should be noted that the random numbers generated from a computer are not technically truly random, because they are generated from some starting value (called the seed). If the same seed is used again and again, the same sequence of random numbers will be generated. It is for this reason that such random number generation is sometimes referred to as pseudorandom number generation. Yet, despite a sequence of random numbers being predetermined by a seed number, the numbers do behave as if they are truly randomly generated, and are therefore very useful in the above-mentioned applications. They would probably not be particularly useful in the applications of cryptography or internet security, however!
Quantile-Quantile (Q-Q) Plots
Before we jump in and use a computer and a \(U(0,1)\) distribution to make random assignments and random selections, it would be useful to discuss how we might evaluate whether a particular set of data follows a particular probability distribution. One possibility is to compare the theoretical mean (\(\mu\)) and variance (\(\sigma^2\)) with the sample mean (\(\bar{x}\)) and sample variance (\(s^2\)). It shouldn't be surprising that such a comparison is hardly sufficient. Another technique used frequently is the creation of what is called a quantile-quantile plot (or q-q plot, for short). The basic idea behind a q-q plot is a two-step process: 1) first determine the theoretical quantiles (from the supposed probability distribution) and the sample quantiles (from the data), and then 2) compare them on a plot. If the theoretical and sample quantiles "match," there is good evidence that the data follow the supposed probability distribution. Here are the specific details of how to create a q-q plot:

Determine the theoretical quantile of order \(p\), that is, the \((100p)^{th}\) percentile \(\pi_p\).

Determine the sample quantile, \(y_r\), of order \(\dfrac{r}{n+1}\), that is, the \(\left(100\dfrac{r}{n+1}\right)^{th}\) percentile. While that might sound complicated, it amounts to just ordering the data \(x_1, x_2, \ldots, x_n\) to get the order statistics \(y_1\le y_2\le \ldots \le y_n\).

Plot the theoretical quantile on the y-axis against the sample quantile on the x-axis. If the sample data follow the theoretical probability distribution, we would expect the points \((y_r, \pi_p)\) to lie close to a line through the origin with slope equal to one.
In the case of the \(U(0,1)\) distribution, the cumulative distribution function is \(F(x)=x\). Now, recall that to find the \((100p)^{th}\) percentile \(\pi_p\), we set \(p\) equal to \(F(\pi_p)\) and solve for \(\pi_p\). That means in the case of the \(U(0,1)\) distribution, we set \(F(\pi_p)=\pi_p\) equal to \(p\) and solve for \(\pi_p\). Ahhhhaaa! In the case of the \(U(0,1)\) distribution, \(\pi_p=p\). That is, \(\pi_{0.05}=0.05\), \(\pi_{0.35}=0.35\), and so on. Let's take a look at an example!
Example 14-9
Consider the following set of 19 numbers generated from Minitab's \(U(0,1)\) random number generator. Do these data appear to have come from the probability model given by \(f(x)=1\) for \(0<x<1\)?
Uniform  

0.790222  0.367893  0.446442  0.889043  0.227839  0.541575 
0.805958  0.156496  0.157753  0.465619  0.805580  0.784926 
0.288771  0.010717  0.511768  0.496895  0.076856  0.254670 
0.752679 
Solution
Here are the original data (the column labeled Uniform) along with their sample quantiles (the column labeled Sorted) and their theoretical quantiles (the column labeled Percent):
r  Uniform  Sorted  Percent 

1  0.790222  0.010717  0.05 
2  0.367893  0.076856  0.10 
3  0.446442  0.156496  0.15 
4  0.889043  0.157753  0.20 
5  0.227839  0.227839  0.25 
6  0.541575  0.254670  0.30 
7  0.805958  0.288771  0.35 
8  0.156496  0.367893  0.40 
9  0.157753  0.446442  0.45 
10  0.465619  0.465619  0.50 
11  0.805580  0.496895  0.55 
12  0.784926  0.511768  0.60 
13  0.288771  0.541575  0.65 
14  0.010717  0.752679  0.70 
15  0.511768  0.784926  0.75 
16  0.496895  0.790222  0.80 
17  0.076856  0.805580  0.85 
18  0.254670  0.805958  0.90 
19  0.752679  0.889043  0.95 
As might be obvious, the Sorted column is just the original data in increasing sorted order. The Percent column is determined from the \(\pi_p=p\) relationship. In a set of 19 data points, we'd expect the 1st of the 19 points to be the 1/20th or fifth percentile, we'd expect the 2nd of the 19 points to be the 2/20th or tenth percentile, and so on. Plotting the Percent column on the vertical axis (labeled Theoretical Quantile) and the Sorted column on the horizontal axis (labeled Sample Quantile), here's the resulting q-q plot:
Now, the key to interpreting qq plots is to do it loosely! If the data points generally follow an (approximate) straight line, then go ahead and conclude that the data follow the tested probability distribution. That's what we'll do here!
Incidentally, the theoretical mean and variance of the \(U(0,1)\) distribution are \(\dfrac{1}{2}=0.5\) and \(\dfrac{1}{12}=0.0833\), respectively. If you calculate the sample mean and sample variance of the 19 data points, you'll find that they are 0.4648 and 0.078, respectively. Not too shabby of an approximation for such a small data set.
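The Sorted and Percent columns, as well as the quoted sample mean and variance, are easy to reproduce by hand. Here's a minimal Python sketch (Python standing in for Minitab; the variable names are my own) that rebuilds both columns and checks the sample statistics against the theoretical values \(1/2\) and \(1/12\):

```python
from statistics import mean, variance

# The 19 values generated by Minitab's U(0,1) random number generator.
uniform = [0.790222, 0.367893, 0.446442, 0.889043, 0.227839, 0.541575,
           0.805958, 0.156496, 0.157753, 0.465619, 0.805580, 0.784926,
           0.288771, 0.010717, 0.511768, 0.496895, 0.076856, 0.254670,
           0.752679]

n = len(uniform)
sample_quantiles = sorted(uniform)                 # the "Sorted" column
percent = [i / (n + 1) for i in range(1, n + 1)]   # the "Percent" column

# Because pi_p = p for the U(0,1) distribution, the i-th sorted value
# is paired with i/20 on the qq plot.
print(sample_quantiles[0], percent[0])   # 0.010717 0.05
print(round(mean(uniform), 4))           # 0.4648
print(round(variance(uniform), 3))       # 0.078
```

Note that `variance` here is the sample variance (dividing by \(n-1\)), which is what is being compared against the theoretical \(1/12\).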
Random Assignment to Treatment
As suggested earlier, the \(U(0,1)\) distribution can be quite useful in randomly assigning experimental units to the treatments in an experiment. First, let's review why randomization is a useful venture when conducting an experiment. Suppose we were interested in measuring how high a person could reach after "taking" an experimental treatment. It would be awfully hard to draw a strong conclusion about the effectiveness of the experimental treatment if the people in one treatment group were, to begin with, significantly taller than the people in the other treatment group. Randomly assigning people to the treatments in an experiment minimizes the chance that such important differences exist between the treatment groups. That way, if the two groups differ at the conclusion of the study with respect to the primary variable of interest, we can feel confident in attributing the difference to the treatment of interest rather than to some other fundamental difference between the groups.
Okay, now let's talk about how the \(U(0,1)\) distribution can help us randomly assign the experimental units to the treatments in a completely randomized experiment. For the sake of concreteness, suppose we wanted to randomly assign 20 students to one group (those who complete a blue data collection form, say) and 20 students to a second group (those who complete a green data collection form, say). This is what the procedure might look like:
1. Assign each of the 40 potential students a number from 1 to 40. It doesn't matter how you assign these numbers.

2. Generate 40 \(U(0,1)\) numbers in one column of a spreadsheet. Enter the numbers 1 to 40 in a second column of the spreadsheet.

3. Sort the 40 \(U(0,1)\) numbers into increasing order, letting the numbers in the second column follow along during the sorting process. For example, if the 13th generated \(U(0,1)\) number was the smallest number generated, then the number 13 should appear, after sorting, in the first row of the second column. If the 24th generated \(U(0,1)\) number was the second smallest number generated, the number 24 should appear, after sorting, in the second row of the second column. And so on.

4. The students whose numbers appear in the first 20 rows of the second column are assigned to complete the blue data collection form. The students whose numbers appear in the second 20 rows of the second column are assigned to complete the green data collection form.
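In a spreadsheet-free setting, the steps above can be sketched in a few lines of Python, with `random.random()` playing the role of the \(U(0,1)\) generator (the seed and variable names are my own, chosen for illustration):

```python
import random

random.seed(414)  # any seed works; fixed here so the sketch is reproducible

n_students = 40
labels = list(range(1, n_students + 1))     # step 1: number the students
keys = [random.random() for _ in labels]    # step 2: one U(0,1) number each

# Step 3: sort the student numbers by their U(0,1) keys.
shuffled = [label for _, label in sorted(zip(keys, labels))]

# Step 4: first 20 rows get the blue form, the remaining 20 the green form.
blue_group, green_group = shuffled[:20], shuffled[20:]
```

Because the \(U(0,1)\) keys are exchangeable, every split of the 40 students into two groups of 20 is equally likely, which is exactly what a completely randomized design requires.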
One semester, I conducted the above experiment exactly as described. Twenty students were randomly assigned to complete a blue version of the following form, and the remaining twenty students were randomly assigned to complete a green version of the form:
Data Collection Form  

1. What is your gender?  Male ___  Female ___ 
2. What is your current cumulative grade point average?  ___.___ ___  
3. What is your height?  ___ ___.___ inches  (or ___ ___ ___.___ centimeters) 
4. What is your weight?  ___ ___ ___.___ pounds  (or ___ ___ ___.___ kilograms) 
5. Excluding class time, how much do you study for this course?  ___ ___ hours/week  
6. What is your current exam average in this class?  ___ ___ ___  
7. Are you a native resident of Pennsylvania?  Yes ___  No ___ 
8. How many credits are you taking this semester?  ___ ___ credits 
After administering the forms to the 40 students, here's what the resulting data looked like:
ROW  COLOR  GENDER  GPA  HEIGHT  WEIGHT  HOURS  EXAM  NATIVE  CREDITS 
1  blue  1  3.44  68.00  140.0  2.0  85.0  1  18.0 
2  blue  1  3.90  71.00  210.0  1.0  98.5  0  18.0 
3  blue  2  *  68.00  200.0  10.0  83.0  0  9.0 
4  blue  2  *  67.00  185.0  4.0  100.0  1  10.0 
5  blue  1  3.82  66.00  143.0  3.0  98.0  0  9.0 
6  blue  2  3.66  66.00  140.0  5.0  91.5  0  17.0 
7  blue  2  2.98  66.00  135.0  4.0  71.0  1  16.0 
8  blue  2  3.67  67.00  118.0  5.0  65.0  1  18.0 
9  blue  1  3.15  69.50  180.5  10.0  61.0  1  13.0 
10  blue  1  3.29  72.00  120.0  2.0  61.0  1  17.0 
11  blue  1  4.00  69.00  175.0  4.0  95.0  0  3.0 
12  blue  2  2.46  69.70  181.0  7.0  59.0  0  16.0 
13  blue  2  2.97  52.00  105.0  6.0  62.0  0  15.0 
14  blue  1  *  68.80  136.5  2.0  95.0  0  12.0 
15  blue  1  3.50  70.00  215.0  1.0  87.5  1  15.0 
16  blue  2  3.61  65.00  135.0  9.0  89.0  0  12.0 
17  blue  2  3.24  64.00  148.0  10.0  65.0  1  16.0 
18  blue  2  3.41  65.50  126.0  5.0  70.0  1  15.0 
19  blue  2  3.40  65.00  115.0  2.0  83.0  1  17.0 
20  blue  1  3.80  68.00  236.0  6.0  87.5  1  18.0 
21  blue  2  2.95  65.00  140.0  6.0  79.0  1  15.0 
22  green  1  3.80  71.50  145.0  3.0  77.5  1  18.0 
23  green  2  4.00  62.00  185.0  6.0  98.0  1  10.0 
24  green  2  *  60.00  100.0  8.0  95.0  0  10.0 
25  green  2  3.65  62.00  150.0  4.0  85.0  1  15.0 
26  green  1  3.54  71.75  160.0  0.5  83.0  1  6.0 
27  green  2  *  65.00  113.0  7.0  97.5  0  18.0 
28  green  1  3.63  69.50  155.0  10.0  80.0  1  18.0 
29  green  2  3.90  65.20  110.0  1.0  96.0  0  9.0 
30  green  1  *  76.00  165.0  3.0  88.0  1  15.0 
31  green  1  3.80  70.40  143.0  1.0  100.0  0  9.0 
32  green  2  3.19  64.50  140.0  10.0  65.0  1  16.0 
33  green  1  2.97  71.00  140.0  5.0  72.0  1  17.5 
34  green  1  2.87  65.50  160.3  7.0  45.0  1  13.0 
35  green  1  3.43  60.70  160.0  12.0  90.0  0  12.0 
36  green  1  3.61  66.00  126.0  18.0  88.0  0  13.0 
37  green  1  *  66.00  120.0  7.0  79.0  1  15.0 
38  green  1  2.97  69.70  185.5  6.0  83.0  1  13.0 
39  green  1  3.26  70.00  152.0  3.0  70.5  1  13.0 
40  green  1  3.30  72.00  190.0  2.5  81.0  1  17.0 
And, here's a portion of a basic descriptive analysis of six of the variables on the form:
VARIABLE  COLOR  N  N*  MEAN  MEDIAN  TrMean  StDev  SE Mean  MINIMUM  MAXIMUM  Q1  Q3 
GPA  blue  18  3  3.4028  3.4250  3.4244  0.3978  0.0938  2.4600  4.0000  3.1075  3.7025 
GPA  green  15  4  3.4613  3.5400  3.4654  0.3564  0.0920  2.8700  4.0000  3.1900  3.8000 
HEIGHT  blue  21  0  66.786  67.000  67.289  4.025  0.878  52.000  72.000  65.250  69.250 
HEIGHT  green  19  0  67.30  66.00  67.22  4.43  1.02  60.00  76.00  64.50  71.00 
WEIGHT  blue  21  0  156.38  140.00  154.89  36.97  8.07  105.00  236.00  130.50  183.00 
WEIGHT  green  19  0  147.36  150.00  147.64  25.53  5.86  100.00  190.00  126.00  160.30 
HOURS  blue  21  0  4.952  5.000  4.895  2.941  0.642  1.000  10.000  2.000  6.500 
HOURS  green  19  0  5.526  6.000  5.441  3.251  0.746  0.500  12.000  3.000  8.000 
EXAM  blue  21  0  80.29  83.00  80.37  14.5  3.09  59.00  100.00  65.00  93.25 
EXAM  green  19  0  82.82  83.00  84.03  13.42  3.08  45.00  100.00  77.50  95.00 
CREDITS  blue  21  0  14.238  15.000  14.632  3.885  0.848  3.000  18.000  12.000  17.000 
CREDITS  green  19  0  13.553  13.000  13.735  3.547  0.814  6.000  18.000  10.000  17.000 
The analysis suggests that my randomization worked quite well. For example, the mean grade-point average of those students completing the blue form was 3.40, while the mean g.p.a. for those students completing the green form was 3.46. And, the mean height of those students completing the blue form was 66.8 inches, while the mean height for those students completing the green form was 67.3 inches. The two groups appear to be similar, on average, with respect to the other collected data as well. It should be noted that there is no guarantee that any particular randomization will be as successful as the one illustrated here. Randomization only ensures that the chance of important differences between the groups is small.
Random Selection for Participation in a Survey
Just as you should always randomly assign experimental units to treatments when conducting an experiment, you should always randomly select your participants when conducting a survey. If you don't, you might very well end up with biased survey results. (The people who choose to take the time to complete a survey in a magazine or on a web site typically have quite strong opinions!) The procedure we can use to randomly select participants for a survey is quite similar to that used for randomly assigning people to treatments in a completely randomized experiment. This is what the procedure would look like if you wanted to randomly select, say, 1000 students to participate in a survey from a potential pool of, say, 40000 students:
1. Assign each of the 40000 potential participants a number from 1 to 40000. It doesn't matter how you assign these numbers.

2. Generate 40000 \(U(0,1)\) numbers in one column of a spreadsheet. Enter the numbers 1 to 40000 in a second column of the spreadsheet.

3. Sort the 40000 \(U(0,1)\) numbers into increasing order, letting the numbers in the second column follow along during the sorting process. For example, if the 23rd generated \(U(0,1)\) number was the smallest number generated, the number 23 should appear, after sorting, in the first row of the second column. If the 102nd generated \(U(0,1)\) number was the second smallest number generated, the number 102 should appear, after sorting, in the second row of the second column. And so on.

4. The students whose numbers appear in the first 1000 rows of the second column are selected to participate in the survey.
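The same sort-key trick can be sketched in Python for the survey selection (sizes as in the text; in practice, Python's built-in `random.sample(pool, 1000)` collapses these steps into a single call):

```python
import random

random.seed(0)  # fixed only so this sketch is reproducible

pool = list(range(1, 40001))             # step 1: number the 40000 students
keys = [random.random() for _ in pool]   # step 2: one U(0,1) number each

# Steps 3-4: sort the student numbers by their U(0,1) keys and keep the
# first 1000 rows -- those students form the random sample.
selected = [i for _, i in sorted(zip(keys, pool))][:1000]
```

Since each student's key is an independent \(U(0,1)\) draw, every subset of 1000 students is equally likely to land in the first 1000 rows.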
Following the procedure as described, the 1000 selected students represent a random sample from the population of 40000 students.