22.2 - Change-of-Variable Technique

On the last page, we used the distribution function technique in two different examples. In the first example, the transformation of \(X\) involved an increasing function, while in the second example, the transformation of \(X\) involved a decreasing function. On this page, we'll generalize what we did there first for an increasing function and then for a decreasing function. The generalizations lead to what is called the change-of-variable technique.

Generalization for an Increasing Function

Let \(X\) be a continuous random variable with a generic p.d.f. \(f(x)\) defined over the support \(c_1<x<c_2\). And, let \(Y=u(X)\) be a continuous, increasing function of \(X\) with inverse function \(X=v(Y)\). Here's a picture of what the continuous, increasing function might look like:

The blue curve, of course, represents the continuous and increasing function \(Y=u(X)\). If you put an \(x\)-value, such as \(c_1\) or \(c_2\), into the function \(Y=u(X)\), you get a \(y\)-value, such as \(u(c_1)\) or \(u(c_2)\). And, because the function is continuous and increasing, an inverse function \(X=v(Y)\) exists. That is, if you put a \(y\)-value into the function \(X=v(Y)\), you get an \(x\)-value, such as \(v(y)\).

Okay, now that we have described the scenario, let's derive the distribution function of \(Y\). It is:

\(F_Y(y)=P(Y\leq y)=P(u(X)\leq y)=P(X\leq v(y))=\int_{c_1}^{v(y)} f(x)dx\)

for \(d_1=u(c_1)<y<u(c_2)=d_2\). The first equality holds from the definition of the cumulative distribution function of \(Y\). The second equality holds because \(Y=u(X)\). The third equality holds because, as shown in red on the following graph, for the portion of the function for which \(u(X)\le y\), it is also true that \(X\le v(y)\):

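[Figure: the increasing curve \(Y=u(X)\), with inverse \(X=v(Y)\). The horizontal axis marks \(c_1\), \(v(y)\), and \(c_2\); the vertical axis marks \(u(c_1)\), \(y\), and \(u(c_2)\). The portion of the curve for which \(u(X)\le y\) is shown in red.]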

And, the last equality holds from the definition of probability for a continuous random variable \(X\). Now, we just have to take the derivative of \(F_Y(y)\), the cumulative distribution function of \(Y\), to get \(f_Y(y)\), the probability density function of \(Y\). The Fundamental Theorem of Calculus, in conjunction with the Chain Rule, tells us that the derivative is:

\(f_Y(y)=F'_Y(y)=f_X(v(y))\cdot v'(y)\)

for \(d_1=u(c_1)<y<u(c_2)=d_2\).
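As a quick sanity check on the increasing case, here is a minimal sketch that compares the formula \(f_X(v(y))\cdot v'(y)\) with a numerical derivative of \(F_Y(y)\). The density \(f(x)=2x\) on \(0<x<1\) and the transformation \(Y=e^X\) are hypothetical choices made only for illustration; they do not come from the examples in this lesson.

```python
import math

# Hypothetical example (an assumption, not from the lesson):
# f_X(x) = 2x on 0 < x < 1, and Y = u(X) = exp(X), which is increasing.

def f_X(x):
    """Density of X: 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def v(y):
    """Inverse of u(x) = exp(x)."""
    return math.log(y)

def v_prime(y):
    """Derivative of the inverse function."""
    return 1.0 / y

def F_Y(y):
    """P(X <= v(y)) = integral of 2x from 0 to v(y) = v(y)^2."""
    return v(y) ** 2

def pdf_increasing(y):
    """Change-of-variable formula for an increasing u."""
    return f_X(v(y)) * v_prime(y)

def numeric_derivative(F, y, h=1e-6):
    """Central-difference approximation of F'(y)."""
    return (F(y + h) - F(y - h)) / (2.0 * h)

# The support of Y here is u(0) = 1 < y < e = u(1).
for y in (1.2, 1.8, 2.5):
    print(y, pdf_increasing(y), numeric_derivative(F_Y, y))
```

The two printed columns should agree closely at each \(y\), which is just the Fundamental Theorem of Calculus and the Chain Rule at work.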

Generalization for a Decreasing Function

Let \(X\) be a continuous random variable with a generic p.d.f. \(f(x)\) defined over the support \(c_1<x<c_2\). And, let \(Y=u(X)\) be a continuous, decreasing function of \(X\) with inverse function \(X=v(Y)\). Here's a picture of what the continuous, decreasing function might look like:

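[Figure: the decreasing curve \(Y=u(X)\), with inverse \(X=v(Y)\). The horizontal axis marks \(c_1\), \(v(y)\), and \(c_2\); the vertical axis marks \(u(c_2)\), \(y\), and \(u(c_1)\).]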

The blue curve, of course, represents the continuous and decreasing function \(Y=u(X)\). Again, if you put an \(x\)-value, such as \(c_1\) or \(c_2\), into the function \(Y=u(X)\), you get a \(y\)-value, such as \(u(c_1)\) or \(u(c_2)\). And, because the function is continuous and decreasing, an inverse function \(X=v(Y)\) exists. That is, if you put a \(y\)-value into the function \(X=v(Y)\), you get an \(x\)-value, such as \(v(y)\).

That said, the distribution function of \(Y\) is then:

\(F_Y(y)=P(Y\leq y)=P(u(X)\leq y)=P(X\geq v(y))=1-P(X\leq v(y))=1-\int_{c_1}^{v(y)} f(x)dx\)

for \(d_2=u(c_2)<y<u(c_1)=d_1\). The first equality holds from the definition of the cumulative distribution function of \(Y\). The second equality holds because \(Y=u(X)\). The third equality holds because, as shown in red on the following graph, for the portion of the function for which \(u(X)\le y\), it is also true that \(X\ge v(y)\):

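[Figure: the decreasing curve \(Y=u(X)\), with inverse \(X=v(Y)\). The horizontal axis marks \(c_1\), \(v(y)\), and \(c_2\); the vertical axis marks \(u(c_2)\), \(y\), and \(u(c_1)\). The portion of the curve for which \(u(X)\le y\) is shown in red.]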

The fourth equality holds from the rule of complementary events. And, the last equality holds from the definition of probability for a continuous random variable \(X\). Now, we just have to take the derivative of \(F_Y(y)\), the cumulative distribution function of \(Y\), to get \(f_Y(y)\), the probability density function of \(Y\). Again, the Fundamental Theorem of Calculus, in conjunction with the Chain Rule, tells us that the derivative is:

\(f_Y(y)=F'_Y(y)=-f_X(v(y))\cdot v'(y)\)

for \(d_2=u(c_2)<y<u(c_1)=d_1\). You might be alarmed that the p.d.f. \(f_Y(y)\) appears to be negative, but note that the derivative \(v'(y)\) is negative, because \(X=v(Y)\) is a decreasing function of \(Y\). Therefore, the two negatives cancel each other out, making \(f_Y(y)\) positive.
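To see the two negatives cancel in practice, here is a small sketch for the decreasing case. Again, the density \(f(x)=2x\) on \(0<x<1\) and the transformation \(Y=e^{-X}\) are hypothetical choices made only for illustration.

```python
import math

# Hypothetical example (an assumption, not from the lesson):
# f_X(x) = 2x on 0 < x < 1, and Y = u(X) = exp(-X), which is decreasing.

def f_X(x):
    """Density of X: 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def v(y):
    """Inverse of u(x) = exp(-x)."""
    return -math.log(y)

def v_prime(y):
    """Derivative of the inverse; negative because v is decreasing."""
    return -1.0 / y

def F_Y(y):
    """P(X >= v(y)) = 1 - P(X <= v(y)) = 1 - v(y)^2."""
    return 1.0 - v(y) ** 2

def pdf_decreasing(y):
    """Formula for a decreasing u: the leading minus sign offsets v'(y) < 0."""
    return -f_X(v(y)) * v_prime(y)

def numeric_derivative(F, y, h=1e-6):
    """Central-difference approximation of F'(y)."""
    return (F(y + h) - F(y - h)) / (2.0 * h)

# The support of Y here is u(1) = exp(-1) < y < 1 = u(0).
for y in (0.4, 0.6, 0.9):
    print(y, v_prime(y), pdf_decreasing(y), numeric_derivative(F_Y, y))
```

At each \(y\), the printed \(v'(y)\) is negative while the density is positive and matches the numerical derivative of \(F_Y(y)\).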

Phew! We have now derived what is called the change-of-variable technique first for an increasing function and then for a decreasing function. But, continuous, increasing functions and continuous, decreasing functions, by their one-to-one nature, are both invertible functions. Let's, once and for all, then write the change-of-variable technique for any generic invertible function.

Definition. Let \(X\) be a continuous random variable with generic probability density function \(f(x)\) defined over the support \(c_1<x<c_2\). And, let \(Y=u(X)\) be an invertible function of \(X\) with differentiable inverse function \(X=v(Y)\). Then, using the change-of-variable technique, the probability density function of \(Y\) is:

\(f_Y(y)=f_X(v(y))\times |v'(y)|\)

defined over the support \(u(c_1)<y<u(c_2)\) if \(u\) is increasing, or \(u(c_2)<y<u(c_1)\) if \(u\) is decreasing.
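The definition translates almost directly into code. The following is a minimal sketch, not a definitive implementation: the helper names `change_of_variable_pdf` and `midpoint_integral` are made up for illustration, \(v'(y)\) is approximated by a central difference, and the density \(f(x)=2x\) with \(Y=1/(1+X)\) is a hypothetical example.

```python
def change_of_variable_pdf(f_X, v, y, h=1e-6):
    """Approximate f_Y(y) = f_X(v(y)) * |v'(y)|.

    v'(y) is estimated with a central difference, so y should sit
    at least h inside the support of Y.
    """
    v_prime = (v(y + h) - v(y - h)) / (2.0 * h)
    return f_X(v(y)) * abs(v_prime)

def midpoint_integral(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    width = (b - a) / n
    return sum(g(a + (i + 0.5) * width) for i in range(n)) * width

# Hypothetical example (an assumption, not from the lesson):
# f_X(x) = 2x on 0 < x < 1, and Y = 1 / (1 + X), which is decreasing.
def f_X(x):
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def v(y):
    """Inverse of u(x) = 1 / (1 + x)."""
    return 1.0 / y - 1.0

def f_Y(y):
    return change_of_variable_pdf(f_X, v, y)

# u is decreasing, so the support of Y is u(1) = 1/2 < y < 1 = u(0).
total = midpoint_integral(f_Y, 0.5, 1.0)
print(total)
```

Because the absolute value handles the sign of \(v'(y)\), the same function works for increasing and decreasing transformations alike. The integral over the support should come out very close to 1, as it must for any valid density.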

Having summarized the change-of-variable technique, once and for all, let's revisit an example.

Example 22-1 Continued

Let's return to our example in which \(X\) is a continuous random variable with the following probability density function:

\(f(x)=3x^2\)

for \(0<x<1\). Use the change-of-variable technique to find the probability density function of \(Y=X^2\).

Solution

Note that the function:

\(Y=X^2\)

defined over the interval \(0<x<1\) is an invertible function. The inverse function is:

\(x=v(y)=\sqrt{y}=y^{1/2}\)

for \(0<y<1\). (That range follows because \(y=0\) when \(x=0\), and \(y=1\) when \(x=1\).) Now, taking the derivative of \(v(y)\), we get:

\(v'(y)=\dfrac{1}{2} y^{-1/2}\)

Therefore, the change-of-variable technique:

\(f_Y(y)=f_X(v(y))\times |v'(y)|\)

tells us that the probability density function of \(Y\) is:

\(f_Y(y)=3[y^{1/2}]^2\cdot \left|\dfrac{1}{2} y^{-1/2}\right|=3y\cdot \dfrac{1}{2} y^{-1/2}\)

And, simplifying we get that the probability density function of \(Y\) is:

\(f_Y(y)=\dfrac{3}{2} y^{1/2}\)

for \(0<y<1\). We shouldn't be surprised by this result, as it is the same result that we obtained using the distribution function technique.
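If you'd like to check this answer empirically, here is a small simulation sketch. It assumes we can sample \(X\) by the inverse-c.d.f. method: since \(F_X(x)=x^3\) for \(0<x<1\), setting \(X=U^{1/3}\) with \(U\) uniform on \((0,1)\) gives the right density. The sample size and seed are arbitrary choices.

```python
import random

def sample_X(rng):
    """Inverse-CDF sampling: F_X(x) = x^3 on (0, 1), so X = U^(1/3)."""
    return rng.random() ** (1.0 / 3.0)

def F_Y(y):
    """CDF implied by f_Y(y) = (3/2) y^(1/2): its integral from 0 to y is y^(3/2)."""
    return y ** 1.5

def empirical_cdf_of_Y(y, n=200_000, seed=0):
    """Fraction of simulated values of Y = X^2 that are at most y."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if sample_X(rng) ** 2 <= y)
    return hits / n

for y in (0.25, 0.5, 0.75):
    print(y, empirical_cdf_of_Y(y), F_Y(y))
```

The empirical and theoretical columns should differ only by simulation noise.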

Example 22-2 Continued

Let's return to our example in which \(X\) is a continuous random variable with the following probability density function:

\(f(x)=3(1-x)^2\)

for \(0<x<1\). Use the change-of-variable technique to find the probability density function of \(Y=(1-X)^3\).

Solution

Note that the function:

\(Y=(1-X)^3\)

defined over the interval \(0<x<1\) is an invertible function. The inverse function is:

\(x=v(y)=1-y^{1/3}\)

for \(0<y<1\). (That range follows because \(y=1\) when \(x=0\), and \(y=0\) when \(x=1\).) Now, taking the derivative of \(v(y)\), we get:

\(v'(y)=-\dfrac{1}{3} y^{-2/3}\)

Therefore, the change-of-variable technique:

\(f_Y(y)=f_X(v(y))\times |v'(y)|\)

tells us that the probability density function of \(Y\) is:

\(f_Y(y)=3[1-(1-y^{1/3})]^2\cdot \left|-\dfrac{1}{3} y^{-2/3}\right|=3y^{2/3}\cdot \dfrac{1}{3} y^{-2/3} \)

And, simplifying we get that the probability density function of \(Y\) is:

\(f_Y(y)=1\)

for \(0<y<1\). Again, we shouldn't be surprised by this result, as it is the same result that we obtained using the distribution function technique.
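A density of 1 on \(0<y<1\) means the cumulative distribution function should be \(F_Y(y)=y\). Here is a minimal numerical sketch of that check, which integrates \(f(x)=3(1-x)^2\) over the region \(x\ge v(y)\) with a simple midpoint rule. The helper name `midpoint_integral` and the number of subintervals are illustrative assumptions.

```python
def f_X(x):
    """Density of X from the example: 3(1 - x)^2 on (0, 1)."""
    return 3.0 * (1.0 - x) ** 2 if 0.0 < x < 1.0 else 0.0

def midpoint_integral(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    width = (b - a) / n
    return sum(g(a + (i + 0.5) * width) for i in range(n)) * width

def F_Y(y):
    """P(Y <= y) for Y = (1 - X)^3.

    Y is decreasing in X, so P(Y <= y) = P(X >= v(y)) with v(y) = 1 - y^(1/3).
    """
    lower = 1.0 - y ** (1.0 / 3.0)
    return midpoint_integral(f_X, lower, 1.0)

for y in (0.1, 0.5, 0.9):
    print(y, F_Y(y))
```

Each computed probability should essentially equal the \(y\) that was plugged in, which is what a constant density of 1 on \(0<y<1\) implies.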