7.3 - The Cumulative Distribution Function (CDF)

The cumulative distribution function (CDF or cdf) of the random variable \(X\) has the following definition:

\(F_X(t)=P(X\le t)\)

The cdf is discussed in the text as well as in the notes, but I want to point out a few things about this function. The cdf is not discussed in detail until section 2.4, but I feel that introducing it earlier is better. The notation sometimes confuses students: \(F_X(t)\) means that \(F\) is the cdf of the random variable \(X\), evaluated at the number \(t\). The subscript names the random variable; \(t\) is the argument of the function.

We do not focus much on the cdf for discrete random variables, but we will use it very often when we study continuous random variables. This does not mean that the cdf is unimportant for discrete random variables. It is just not always used, since there are tables and software that help us find these probabilities for common distributions.

The cdf of random variable \(X\) has the following properties:

\(F_X(t)\) is a nondecreasing function of \(t\), for \(-\infty<t<\infty\).
The cdf takes values between 0 and 1, that is, \(0\le F_X(t)\le 1\). This makes sense since \(F_X(t)\) is a probability.
If \(X\) is a discrete random variable whose minimum value is \(a\), then \(F_X(a)=P(X\le a)=P(X=a)=f_X(a)\). If \(c\) is less than \(a\), then \(F_X(c)=0\).
If the maximum value of \(X\) is \(b\), then \(F_X(b)=1\).
The cdf is also called the distribution function.
All probabilities concerning \(X\) can be stated in terms of \(F\).

I have provided a few very brief examples using the cdf. We will be looking at these functions in more detail in the future.

Suppose \(X\) is a discrete random variable. Let the pmf of \(X\) be equal to

\(f(x)=\dfrac{5-x}{10}, \;\; x=1,2,3,4.\)

Suppose we want to find the cdf of \(X\). The cdf is \(F_X(t)=P(X\le t)\).

For \(t=1\), \(P(X\le 1)=P(X=1)=f(1)=\dfrac{5-1}{10}=\dfrac{4}{10}\).
For \(t=2\), \(P(X\le 2)=P(X=1 \text{ or } X=2)=P(X=1)+P(X=2)=\dfrac{5-1}{10}+\dfrac{5-2}{10}=\dfrac{4+3}{10}=\dfrac{7}{10}\).
For \(t=3\), \(P(X\le 3)=\dfrac{5-1}{10}+\dfrac{5-2}{10}+\dfrac{5-3}{10}=\dfrac{4+3+2}{10}=\dfrac{9}{10}\).
For \(t=4\), \(P(X\le 4)=\dfrac{5-1}{10}+\dfrac{5-2}{10}+\dfrac{5-3}{10}+\dfrac{5-4}{10}=\dfrac{10}{10}=1\).

It is worth noting that \(P(X\le 2)\) does not equal \(P(X<2)\): \(P(X\le 2)=P(X=1)+P(X=2)=\dfrac{7}{10}\), whereas \(P(X<2)=P(X=1)=\dfrac{4}{10}\). It is very important to read each problem carefully in order to set up the probabilities correctly. You should also look carefully at the notation if a problem provides it.
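The running sums in this example can be checked with a short Python sketch. The names `support`, `pmf`, `cdf`, and `F` are my own and are only for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import accumulate

# pmf from the example: f(x) = (5 - x)/10 for x = 1, 2, 3, 4
support = [1, 2, 3, 4]
pmf = {x: Fraction(5 - x, 10) for x in support}

# Running sums of the pmf give the cdf at each support point
cdf = dict(zip(support, accumulate(pmf[x] for x in support)))

def F(t):
    """P(X <= t): add the pmf over all support points not exceeding t."""
    return sum((pmf[x] for x in support if x <= t), Fraction(0))

for x in support:
    print(f"F({x}) = {cdf[x]}")

# P(X <= 2) includes the point 2; P(X < 2) does not
p_le_2 = F(2)
p_lt_2 = F(2) - pmf[2]
print("P(X <= 2) =", p_le_2, " P(X < 2) =", p_lt_2)
```

Note that `Fraction` reduces to lowest terms, so \(\frac{4}{10}\) is displayed as `2/5`.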

Consider \(X\) to be a random variable (a binomial random variable) with the following pmf

\(f(x)=P(X=x)={n\choose x}p^x(1-p)^{n-x}, \;\; \text{for } x=0, 1, \cdots , n.\)

The cdf of \(X\) evaluated at \(t\), denoted \(F_X(t)\), is

\(F_X(t)=\sum_{x=0}^t {n\choose x}p^x(1-p)^{n-x}, \;\; \text{for } t=0, 1, \cdots , n.\)

When \(t=0\), we have \(F_X(0)={n\choose 0}p^0(1-p)^{n-0}\).
When \(t=1\), we have \(F_X(1)={n\choose 0}p^0(1-p)^{n-0}+{n\choose 1}p^1(1-p)^{n-1}\).
When \(t=2\), we have \(F_X(2)={n\choose 0}p^0(1-p)^{n-0}+{n\choose 1}p^1(1-p)^{n-1}+ {n\choose 2}p^2(1-p)^{n-2}\).

And so on and so forth.
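As a rough sketch, the binomial cdf can be computed by adding pmf terms exactly as written above. The function names `binom_pmf` and `binom_cdf` and the values \(n=5\), \(p=0.3\) are assumptions chosen only for illustration.

```python
from math import comb, floor

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(t, n, p):
    """P(X <= t): sum the pmf over x = 0, 1, ..., up to t (capped at n)."""
    if t < 0:
        return 0.0
    upper = min(floor(t), n)
    return sum(binom_pmf(x, n, p) for x in range(upper + 1))

# Hypothetical parameters, for illustration only
n, p = 5, 0.3
for t in range(n + 1):
    print(t, round(binom_cdf(t, n, p), 5))
```

In practice, tables or statistical software usually supply these values, but writing out the sum makes the definition concrete.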

One last example. Suppose we have a family with three children. The sample space for this situation is

\(\mathbf{S}= \left \{ BBB, BBG, BGB, GBB, GGG, GGB, GBG, BGG \right \} \)

where \(B\) = boy and \(G\) = girl and suppose the probability of having a boy is the same as the probability of having a girl. Let the random variable \(X\) be the number of boys. Then \(X\) will have the following pmf:

t	0	1	2	3
\(P(X=t)\)	\(\dfrac{1}{8}\)	\(\dfrac{3}{8}\)	\(\dfrac{3}{8}\)	\(\dfrac{1}{8}\)

Then, we can use the pmf to find the cdf.

t	0	1	2	3
\(F_X(t)=P(X\le t)\)	\(\dfrac{1}{8}\)	\(\dfrac{1}{8}+\dfrac{3}{8}=\dfrac{4}{8}\)	\(\dfrac{4}{8}+\dfrac{3}{8}=\dfrac{7}{8}\)	\(\dfrac{7}{8}+\dfrac{1}{8}=1\)
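The two tables can be reproduced by listing the eight equally likely outcomes and counting boys. This is only a sketch; the variable names are mine.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely birth orders, e.g. "BBG"
outcomes = ["".join(s) for s in product("BG", repeat=3)]

# X = number of boys in each outcome
counts = Counter(o.count("B") for o in outcomes)
pmf = {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

# Accumulate the pmf to obtain the cdf
cdf = {}
running = Fraction(0)
for k in pmf:
    running += pmf[k]
    cdf[k] = running

for k in pmf:
    print(k, pmf[k], cdf[k])
```

Fractions are shown in lowest terms, so \(\frac{4}{8}\) appears as `1/2`.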

Additional Practice Problem Section

These are some theoretical problems on the cdf and on expectations. Work each problem out on your own before reading the solution that follows it.

Express the following probabilities in terms of the cdf, \(F_X(t)\), if \(X\) is a discrete random variable whose support is the set of integers from 0 to \(b\), and \(a\) is an integer with \(0\le a\le b\):
1. \(P(X\le a)\)
  
  \(P(X\le a)=F_X(a)\) by definition of cdf
2. \(f_X(a)=P(X=a)\), where \(f_X(x)\) is the pmf of \(X\)
  
  \(P(X=a)=P(X\le a)-P(X\le a-1)=F_X(a)-F_X(a-1)\)
3. \(P(X<a)\)
  
  \(P(X<a)=P(X\le a)-P(X=a)=P(X\le a-1)=F_X(a-1)\)
4. \(P(X\ge a)\)
  
  \(P(X\ge a)=1-P(X\le a-1)=1-F_X(a-1)\)
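The identities in solutions 2 through 4 can be spot-checked numerically. The sketch below assumes, purely for illustration, the number-of-boys distribution from the three-children example (so \(b=3\)); any pmf on the integers 0 to \(b\) would do.

```python
from fractions import Fraction
from math import comb

# Hypothetical choice of distribution: number of boys among b = 3 children
b = 3
pmf = {x: Fraction(comb(b, x), 2**b) for x in range(b + 1)}

def F(t):
    """cdf: P(X <= t); equals 0 for t below the minimum value 0."""
    return sum((p for x, p in pmf.items() if x <= t), Fraction(0))

results = []
for a in range(b + 1):
    p_eq = pmf[a]
    p_lt = sum((p for x, p in pmf.items() if x < a), Fraction(0))
    p_ge = sum((p for x, p in pmf.items() if x >= a), Fraction(0))
    checks = (
        F(a) - F(a - 1) == p_eq,   # P(X = a)
        F(a - 1) == p_lt,          # P(X < a)
        1 - F(a - 1) == p_ge,      # P(X >= a)
    )
    results.append(checks)
    print(a, checks)

all_hold = all(all(c) for c in results)
```

The case \(a=0\) works because \(F_X(-1)=0\).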
Let \(X\) have distribution function \(F\). What is the distribution function and expectation of \(\dfrac{X - \mu}{\sigma}\)? In other words, find the distribution function in terms of \(F_X\) and the expectation in terms of \(E(X)\).

Let \(Y=\dfrac{X-\mu}{\sigma}\). We want \(F_Y(y)\) and \(E(Y)\).

\begin{align*} & F_Y(y)=P(Y\le y), \text{ by definition of the cdf of } Y\\ & F_Y(y)=P\left(\dfrac{X-\mu}{\sigma}\le y\right)=P\left(X\le y\sigma+\mu\right), \text{ since } \sigma>0\\ & F_Y(y)= F_X(y\sigma+\mu) \end{align*}

Now, we can find the expectation in two ways. One way is to use the definition of expectation, \(E(Y)=\sum_y yf_Y(y)\). To do this, though, we would need \(f_Y(y)\), which we could obtain from the cdf if \(F_X\) were given.

The other way to approach this is to use the properties of expectation.

\(E(Y)=E\left(\dfrac{X-\mu}{\sigma}\right)=\dfrac{1}{\sigma}E(X-\mu)=\dfrac{E(X)-\mu}{\sigma}\)
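Both results can be checked on a small example. The sketch below reuses the pmf \(f(x)=\frac{5-x}{10}\) from earlier in this section and assumes the constants \(\mu=1\) and \(\sigma=2\); these values are hypothetical and were picked only to keep the arithmetic exact.

```python
from fractions import Fraction

# pmf of X from the earlier example: f(x) = (5 - x)/10, x = 1, 2, 3, 4
pmf_x = {x: Fraction(5 - x, 10) for x in (1, 2, 3, 4)}

# Hypothetical constants for illustration; sigma must be positive
mu, sigma = Fraction(1), Fraction(2)

def F_X(t):
    """cdf of X."""
    return sum((p for x, p in pmf_x.items() if x <= t), Fraction(0))

# pmf of Y = (X - mu)/sigma: same probabilities, transformed values
pmf_y = {(x - mu) / sigma: p for x, p in pmf_x.items()}

def F_Y(y):
    """cdf of Y, computed directly from the pmf of Y."""
    return sum((p for v, p in pmf_y.items() if v <= y), Fraction(0))

e_x = sum(x * p for x, p in pmf_x.items())
e_y = sum(v * p for v, p in pmf_y.items())

test_points = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
for y in test_points:
    print(y, F_Y(y), F_X(y * sigma + mu))

print("E(Y) =", e_y, " (E(X) - mu)/sigma =", (e_x - mu) / sigma)
```

The two columns of cdf values agree at every test point, and the expectation computed from the definition matches the one obtained from the properties of expectation.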