14.1 - Probability Density Functions

A continuous random variable takes on an uncountably infinite number of possible values. For a discrete random variable \(X\) that takes on a finite or countably infinite number of possible values, we determined \(P(X=x)\) for all of the possible values of \(X\), and called it the probability mass function ("p.m.f."). For continuous random variables, as we shall soon see, the probability that \(X\) takes on any particular value \(x\) is 0. That is, finding \(P(X=x)\) for a continuous random variable \(X\) is not going to work. Instead, we'll need to find the probability that \(X\) falls in some interval \((a, b)\), that is, we'll need to find \(P(a<X<b)\). We'll do that using a probability density function ("p.d.f."). We'll first motivate a p.d.f. with an example, and then we'll formally define it.

Example 14-1

Even though a fast-food chain might advertise a hamburger as weighing a quarter-pound, you can well imagine that it is not exactly 0.25 pounds. One randomly selected hamburger might weigh 0.23 pounds while another might weigh 0.27 pounds. What is the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds? That is, if we let \(X\) denote the weight of a randomly selected quarter-pound hamburger in pounds, what is \(P(0.20<X<0.30)\)?


In reality, I'm not particularly interested in using this example just so that you'll know whether or not you've been ripped off the next time you order a hamburger! Instead, I'm interested in using the example to illustrate the idea behind a probability density function.

Now, you could imagine randomly selecting, let's say, 100 hamburgers advertised to weigh a quarter-pound. If you weighed the 100 hamburgers, and created a density histogram of the resulting weights, perhaps the histogram might look something like this:

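[Figure: density histogram of the hamburger weights; horizontal axis \(X\) with 0.25 marked, vertical axis Density]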

In this case, the histogram illustrates that most of the sampled hamburgers do indeed weigh close to 0.25 pounds, but some are a bit more and some a bit less. Now, what if we decreased the length of the class interval on that density histogram? Then, the density histogram would look something like this:

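[Figure: density histogram of the hamburger weights with shorter class intervals; horizontal axis \(X\) with 0.25 marked, vertical axis Density]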

Now, what if we pushed this further and decreased the intervals even more? You can imagine that the intervals would eventually get so small that we could represent the probability distribution of \(X\), not as a density histogram, but rather as a curve (by connecting the "dots" at the tops of the tiny tiny tiny rectangles) that, in this case, might look like this:

[Figure: smooth curve \(f(x)\); horizontal axis \(X\) with 0.25 marked]

Such a curve is denoted \(f(x)\) and is called a (continuous) probability density function.

Now, you might recall that a density histogram is defined so that the area of each rectangle equals the relative frequency of the corresponding class, and the area of the entire histogram equals 1. That suggests then that finding the probability that a continuous random variable \(X\) falls in some interval of values involves finding the area under the curve \(f(x)\) sandwiched by the endpoints of the interval. In the case of this example, the probability that a randomly selected hamburger weighs between 0.20 and 0.30 pounds is then this area:

[Figure: curve \(f(x)\) with the area between 0.20 and 0.30 on the \(X\) axis shaded; Area = Probability \(P(0.20<X<0.30)\)]
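The density histogram idea can be tried out numerically. The following is only a sketch under stated assumptions: the 100 weights are simulated from a normal distribution with mean 0.25 and standard deviation 0.02, which is an invented choice for illustration (the text does not say how hamburger weights are distributed), and `density_histogram` is a hypothetical helper name. It shows that the rectangle areas add up to 1 no matter how narrow the class intervals are, and that the share of weights between 0.20 and 0.30 plays the role of \(P(0.20<X<0.30)\) in the sample.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# ASSUMPTION: weights are simulated from a normal distribution with mean
# 0.25 lb and standard deviation 0.02 lb; the text specifies no distribution.
weights = [random.gauss(0.25, 0.02) for _ in range(100)]

def density_histogram(data, lo, hi, n_bins):
    """Return (bin_width, heights), where height = relative frequency / bin width."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for w in data:
        # Clamp the index so a value equal to hi falls in the last bin
        i = min(int((w - lo) / width), n_bins - 1)
        counts[i] += 1
    heights = [c / (len(data) * width) for c in counts]
    return width, heights

lo, hi = min(weights), max(weights)
for n_bins in (5, 20):
    width, heights = density_histogram(weights, lo, hi, n_bins)
    # Area of each rectangle = height * width = relative frequency of the class
    total_area = sum(h * width for h in heights)
    print(n_bins, "bins: total area =", round(total_area, 10))

# Relative frequency of weights between 0.20 and 0.30 pounds:
# the sample counterpart of P(0.20 < X < 0.30)
share = sum(0.20 < w < 0.30 for w in weights) / len(weights)
print("share between 0.20 and 0.30:", share)
```

Because each height is a relative frequency divided by the class width, shrinking the classes changes the heights but never the total area, which is what lets the histogram settle into a curve whose total area is 1.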

Now that we've motivated the idea behind a probability density function for a continuous random variable, let's formally define it.

Probability Density Function ("p.d.f.")

The probability density function ("p.d.f.") of a continuous random variable \(X\) with support \(S\) is an integrable function \(f(x)\) satisfying the following:

  1. \(f(x)\) is positive everywhere in the support \(S\), that is, \(f(x)>0\), for all \(x\) in \(S\)

  2. The area under the curve \(f(x)\) in the support \(S\) is 1, that is:

    \(\int_S f(x)dx=1\)

  3. If \(f(x)\) is the p.d.f. of \(X\), then the probability that \(X\) belongs to \(A\), where \(A\) is some interval, is given by the integral of \(f(x)\) over that interval, that is:

    \(P(X \in A)=\int_A f(x)dx\)
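The three conditions can be checked numerically for a candidate function. What follows is a rough sketch, not a proof: `integrate` and `looks_like_pdf` are hypothetical helper names, the density \(f(x)=2x\) on \(0<x<1\) is an invented example, and the checks only look at finitely many sample points, so they can suggest that a function is a valid p.d.f. but cannot guarantee it.

```python
def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over (a, b) using n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def looks_like_pdf(f, a, b, n=100_000, tol=1e-6):
    """Numerically check conditions 1 and 2 on the support (a, b)."""
    h = (b - a) / n
    # Condition 1: f is positive at every sampled point of the support
    positive = all(f(a + (i + 0.5) * h) > 0 for i in range(n))
    # Condition 2: the area under f over the support is (approximately) 1
    area = integrate(f, a, b, n)
    return positive and abs(area - 1) < tol

# ASSUMPTION: a hypothetical density chosen for illustration, f(x) = 2x on 0 < x < 1
def f(x):
    return 2 * x

print(looks_like_pdf(f, 0.0, 1.0))
# Condition 3: P(0.25 < X < 0.75) is the integral of f over that interval
print(integrate(f, 0.25, 0.75))
```

For this particular density the interval probability can be confirmed by hand: \(\int_{0.25}^{0.75} 2x\,dx = 0.75^2-0.25^2 = 0.5\).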

As you can see, the definition for the p.d.f. of a continuous random variable differs from the definition for the p.m.f. of a discrete random variable by simply changing the summations that appeared in the discrete case to integrals in the continuous case. Let's test this definition out on an example.

Example 14-2

Let \(X\) be a continuous random variable whose probability density function is:

\(f(x)=3x^2, \qquad 0<x<1\)

First, note again that \(f(x)\ne P(X=x)\). For example, \(f(0.9)=3(0.9)^2=2.43\), which is clearly not a probability! In the continuous case, \(f(x)\) is instead the height of the curve at \(X=x\), so that the total area under the curve is 1. In the continuous case, it is areas under the curve that define the probabilities.

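Now, let's start by verifying that \(f(x)\) is a valid probability density function.

Because \(f(x)=3x^2>0\) for all \(0<x<1\), the first condition holds. The area under the curve over the support is 1:

\(\int^1_0 3x^2dx=\left[x^3\right]^{x=1}_{x=0}=1^3-0^3=1\)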

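What is the probability that \(X\) falls between \(\frac{1}{2}\) and 1? That is, what is \(P\left(\frac{1}{2}<X<1\right)\)?

\(P\left(\frac{1}{2}<X<1\right)=\int^1_{1/2} 3x^2dx=\left[x^3\right]^{x=1}_{x=1/2}=1-\dfrac{1}{8}=\dfrac{7}{8}\)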

What is \(P\left(X=\frac{1}{2}\right)\)?


It is a straightforward integration to see that the probability is 0:

\(\int^{1/2}_{1/2} 3x^2dx=\left[x^3\right]^{x=1/2}_{x=1/2}=\dfrac{1}{8}-\dfrac{1}{8}=0\)

In fact, in general, if \(X\) is continuous, the probability that \(X\) takes on any specific value \(x\) is 0. That is, when \(X\) is continuous, \(P(X=x)=0\) for all \(x\) in the support.

An implication of the fact that \(P(X=x)=0\) for all \(x\) when \(X\) is continuous is that you can be careless about the endpoints of intervals when finding probabilities of continuous random variables. That is:

\(P(a\le X\le b)=P(a<X\le b)=P(a\le X<b)=P(a<X<b)\)

for any constants \(a\) and \(b\).
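The quantities in Example 14-2 can also be approximated numerically as a sanity check. This is a minimal sketch using a midpoint Riemann sum; `integrate` is a hypothetical helper name, and the results are approximations of the exact integrals rather than exact values.

```python
def f(x):
    """Density from Example 14-2: f(x) = 3x^2 on 0 < x < 1."""
    return 3 * x**2

def integrate(g, a, b, n=100_000):
    """Midpoint Riemann sum of g over (a, b); gives 0.0 when a == b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

total = integrate(f, 0.0, 1.0)       # area over the whole support, close to 1
upper_half = integrate(f, 0.5, 1.0)  # P(1/2 < X < 1), close to 7/8
point = integrate(f, 0.5, 0.5)       # zero-width interval, so P(X = 1/2) = 0

print(total, upper_half, point)
print(f(0.9))  # a height greater than 1, so f(x) itself is not a probability
```

Since the interval from \(\frac{1}{2}\) to \(\frac{1}{2}\) has zero width, it contributes no area, which is why including or excluding an endpoint leaves the probability of an interval unchanged.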

Example 14-3

Let \(X\) be a continuous random variable whose probability density function is:


for an interval \(0<x<c\). What is the value of the constant \(c\) that makes \(f(x)\) a valid probability density function?