We have just one more topic to tackle in this lesson, namely, Student's t distribution. Let's just jump right in and define it!

Definition. If Z ~ N(0, 1) and U ~ χ²(r) are independent, then the random variable:

\(T=\dfrac{Z}{\sqrt{U/r}}\)

follows a t-distribution with r degrees of freedom.  We write T ~ t(r). The p.d.f. of T is:

\(f(t)=\dfrac{\Gamma((r+1)/2)}{\sqrt{\pi r} \Gamma(r/2)} \cdot \dfrac{1}{(1+t^2/r)^{(r+1)/2}}\)

for −∞ < t < ∞.
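To make the p.d.f. concrete, here is a minimal Python sketch of the formula above. The function name `t_pdf` is an assumption chosen for illustration; the code uses log-gamma rather than the gamma function directly, purely so that large values of r do not overflow.

```python
import math

def t_pdf(t: float, r: float) -> float:
    """Density of the t distribution with r degrees of freedom, evaluated at t."""
    # log of Gamma((r+1)/2) / (sqrt(pi * r) * Gamma(r/2))
    log_const = (math.lgamma((r + 1) / 2) - math.lgamma(r / 2)
                 - 0.5 * math.log(math.pi * r))
    # log of 1 / (1 + t^2/r)^((r+1)/2)
    log_kernel = -(r + 1) / 2 * math.log1p(t * t / r)
    return math.exp(log_const + log_kernel)

print(t_pdf(0.0, 1))  # about 0.3183, which is 1/pi
print(t_pdf(0.0, 3))  # about 0.3676, which is 2/(pi*sqrt(3))
```

With r = 1 the formula reduces to 1/(π(1 + t²)), which gives a quick sanity check on the constants.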

By the way, the t distribution was first discovered by a man named W.S. Gosset. He discovered the distribution when working for an Irish brewery. Because he published under the pseudonym Student, the t distribution is often called Student's t distribution.

History aside, the above definition is probably not particularly enlightening. Let's try to get a feel for the t distribution by way of simulation. Let's randomly generate 1000 standard normal values (Z) and 1000 chi-square(3) values (U). Then, the above definition tells us that, if we take those randomly generated values, calculate:

\(T=\dfrac{Z}{\sqrt{U/3}}\)

and create a histogram of the 1000 resulting T values, we should get a histogram that looks like a t distribution with 3 degrees of freedom. Well, here's a subset of the resulting values from one such simulation:

[Table: a subset of the simulated Z, U, and T(3) values]

Note, for example, in the first row:

\(T(3)=\dfrac{-2.60481}{\sqrt{10.2497/3}}=-1.4092\)
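The simulation can be sketched in Python with the standard library alone. The seed and the helper name `simulate_t` are assumptions for illustration, and the chi-square draws rely on the fact that a chi-square(r) random variable is a gamma random variable with shape r/2 and scale 2. The generated values will not match the ones shown in the text, which came from a different run.

```python
import math
import random

random.seed(414)  # arbitrary seed, used only to make the run repeatable

r = 3      # degrees of freedom
n = 1000   # number of simulated values

def simulate_t(r: int) -> float:
    """Draw one T = Z / sqrt(U / r) with Z ~ N(0, 1) and U ~ chi-square(r)."""
    z = random.gauss(0.0, 1.0)
    u = random.gammavariate(r / 2, 2.0)  # gamma with shape r/2 and scale 2
    return z / math.sqrt(u / r)

t_values = [simulate_t(r) for _ in range(n)]

# Reproduce the arithmetic for the first row quoted in the text.
first_row = -2.60481 / math.sqrt(10.2497 / 3)
print(round(first_row, 4))  # -1.4092

# Share of simulated values beyond 3 in absolute value; for a standard
# normal random variable this share would be only about 0.0027.
tail_fraction = sum(abs(t) > 3 for t in t_values) / n
print(tail_fraction)
```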

Here's what the resulting histogram of the 1000 randomly generated T(3) values looks like, with a standard N(0,1) curve superimposed:

[Figure: histogram of the 1000 simulated T(3) values with the standard normal curve superimposed]

Hmmm. The t-distribution seems to be quite similar to the standard normal distribution. Using the formula given above for the p.d.f. of T, we can plot the density curves of various t random variables, say when r = 1, r = 4, and r = 7, to confirm that this is indeed the case:

[Figure: t density curves for r = 1, r = 4, and r = 7]

In fact, it looks as if, as the degrees of freedom r increases, the t density curve gets closer and closer to the standard normal curve. Let's summarize what we've learned in our little investigation about the characteristics of the t distribution:

(1) The support appears to be −∞ < t < ∞. (It is!)

(2) The probability distribution appears to be symmetric about t = 0. (It is!)

(3) The probability distribution appears to be bell-shaped. (It is!)

(4) The density curve looks like a standard normal curve, but the tails of the t-distribution are "heavier" than the tails of the normal distribution.  That is, we are more likely to get extreme t-values than extreme z-values. 

(5) As the degrees of freedom r increases, the t-distribution appears to approach the standard normal z-distribution. (It does!)
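Characteristics (2), (4), and (5) can be checked numerically. The sketch below is only an illustration: the names `t_pdf`, `normal_pdf`, and `dfs` are assumptions, and the extra degrees of freedom 30 and 1000 are added here just to show the trend beyond the values plotted above.

```python
import math

def t_pdf(t: float, r: float) -> float:
    """Density of the t distribution with r degrees of freedom."""
    log_const = (math.lgamma((r + 1) / 2) - math.lgamma(r / 2)
                 - 0.5 * math.log(math.pi * r))
    return math.exp(log_const - (r + 1) / 2 * math.log1p(t * t / r))

def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

dfs = (1, 4, 7, 30, 1000)

print(f"{'r':>6} {'f(0)':>10} {'f(3)':>10}")
for r in dfs:
    print(f"{r:>6} {t_pdf(0, r):>10.5f} {t_pdf(3, r):>10.5f}")
print(f"{'N(0,1)':>6} {normal_pdf(0):>10.5f} {normal_pdf(3):>10.5f}")
```

In the printed table, the height at 0 climbs toward the normal height as r grows, while the height at 3 stays above the normal height, which is the "heavier tails" behavior in numbers.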

As you'll soon see, we'll need to look up t-values, as well as probabilities concerning T random variables, quite often in Stat 415. Therefore, we had better make sure we know how to read a t table.

The t Table

If you take a look at Table VI in the back of your textbook, you'll find what looks like a typical t table. Here's what the top of Table VI looks like (well, minus the shading that I've added):

[Figure: the top of Table VI, the t table]

The t-table is similar to the chi-square table in that the inside of the t-table (shaded in purple) contains the t-values for various cumulative probabilities (shaded in red), such as 0.60, 0.75, 0.90, 0.95, 0.975, 0.99, and 0.995, and for various t distributions with degrees of freedom (shaded in blue). The row shaded in green indicates the upper α probability that corresponds to the 1−α cumulative probability. For example, if you're interested in either a cumulative probability of 0.60, or an upper probability of 0.40, you'll want to look for the t-value in the first column. 

Let's use the t-table to find a few probabilities and t-values.

Example

Let T follow a t-distribution with r = 8 df.  What is the probability that the absolute value of T is less than 2.306? 

Solution. The probability calculation is quite similar to a calculation we'd have to make for a normal random variable. First, rewriting the probability in terms of T instead of the absolute value of T, we get:

P(|T| < 2.306) = P(−2.306 < T < 2.306)

Then, we have to rewrite the probability in terms of cumulative probabilities that we can actually find, that is:

P(|T| < 2.306) = P(T < 2.306) − P(T < −2.306)

Pictorially, the probability we are looking for looks something like this:

[Figure: the t distribution with 8 degrees of freedom and the probability P(−2.306 < T < 2.306)]

But the t-table doesn't contain negative t-values, so we'll have to take advantage of the symmetry of the T distribution. That is:

P(|T| < 2.306) = P(T < 2.306) − P(T > 2.306)

Can you find the necessary t-values on the t-table?

[Figure: the t table]

The t-table tells us that P(T < 2.306) = 0.975 and P(T  > 2.306) = 0.025. Therefore:

P(|T| < 2.306) = 0.975 − 0.025 = 0.95
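The table lookup can be cross-checked by integrating the p.d.f. numerically. This is a rough sketch, not how the table was produced: `t_cdf` is a hypothetical helper that applies the composite Simpson's rule from 0 to x and uses the symmetry of the distribution for negative x.

```python
import math

def t_pdf(t: float, r: float) -> float:
    """Density of the t distribution with r degrees of freedom."""
    log_const = (math.lgamma((r + 1) / 2) - math.lgamma(r / 2)
                 - 0.5 * math.log(math.pi * r))
    return math.exp(log_const - (r + 1) / 2 * math.log1p(t * t / r))

def t_cdf(x: float, r: float, steps: int = 2000) -> float:
    """P(T < x), computed as 0.5 plus the area under the density from 0 to x."""
    if x < 0:
        return 1.0 - t_cdf(-x, r, steps)  # symmetry about t = 0
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, r) + t_pdf(x, r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, r)
    return 0.5 + total * h / 3

upper = t_cdf(2.306, 8)    # P(T < 2.306)
lower = t_cdf(-2.306, 8)   # P(T < -2.306), equal to P(T > 2.306)
prob = upper - lower
print(f"{upper:.3f} - {lower:.3f} = {prob:.3f}")  # 0.975 - 0.025 = 0.950
```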

What is \(t_{0.05}(8)\)?

Solution. The value \(t_{0.05}(8)\) is the value \(t_{0.05}\) such that the probability that a T random variable with 8 degrees of freedom is greater than \(t_{0.05}\) is 0.05. That is:

[Figure: the t distribution with 8 degrees of freedom and an upper-tail probability of 0.05]

Can you find the value \(t_{0.05}\) on the t-table?

[Figure: the t table]

The t-table tells us that \(t_{0.05}(8)=1.860\). That is, the probability that a T random variable with 8 degrees of freedom is greater than 1.860 is 0.05.
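Going the other way, from an upper-tail probability to a t-value, amounts to inverting the cumulative probability. One way to sketch this, assuming a simple bisection search over a numerically integrated p.d.f., is shown below; the names `t_cdf` and `upper_quantile`, the search interval, and the tolerance are all illustrative choices.

```python
import math

def t_pdf(t: float, r: float) -> float:
    """Density of the t distribution with r degrees of freedom."""
    log_const = (math.lgamma((r + 1) / 2) - math.lgamma(r / 2)
                 - 0.5 * math.log(math.pi * r))
    return math.exp(log_const - (r + 1) / 2 * math.log1p(t * t / r))

def t_cdf(x: float, r: float, steps: int = 2000) -> float:
    """P(T < x) for x >= 0, using the composite Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, r) + t_pdf(x, r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, r)
    return 0.5 + total * h / 3

def upper_quantile(alpha: float, r: float,
                   lo: float = 0.0, hi: float = 100.0, tol: float = 1e-8) -> float:
    """Find the value with upper-tail probability alpha, for 0 < alpha < 0.5."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1.0 - t_cdf(mid, r) > alpha:
            lo = mid  # too much area to the right of mid, so move right
        else:
            hi = mid
    return (lo + hi) / 2

q = upper_quantile(0.05, 8)
print(f"{q:.3f}")  # 1.860
```

Bisection works here because the upper-tail probability decreases steadily as the t-value increases, so each comparison halves the interval that must contain the answer.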

Why will we encounter a T random variable?

Given a random sample \(X_1, X_2, \ldots, X_n\) from a normal distribution with mean μ and variance σ², we know that:

\(Z=\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim N(0,1)\)

Earlier in this lesson, we learned that:

\(U=\dfrac{(n-1)S^2}{\sigma^2}\)

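follows a chi-square distribution with n−1 degrees of freedom. We also learned that Z and U are independent. Therefore, using the definition of a T random variable with r = n−1, we get:

\(T=\dfrac{Z}{\sqrt{U/(n-1)}}=\dfrac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2}\Big/(n-1)}}=\dfrac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1)\)

The unknown σ cancels out. It is the resulting quantity, that is:

\(T=\dfrac{\bar{X}-\mu}{S/\sqrt{n}}\)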

that will help us, in Stat 415, to use a mean from a random sample, that is \(\bar{X}\), to learn, with confidence, something about the population mean μ.
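This result can also be checked by simulation. The sketch below repeatedly draws samples of size n = 9 from a normal population and computes the T statistic for each; the population values `mu` and `sigma`, the seed, and the number of repetitions are hypothetical choices made only for illustration. If T really follows a t distribution with n−1 = 8 degrees of freedom, about 95% of the simulated values should fall between −2.306 and 2.306, in line with the earlier example.

```python
import math
import random

random.seed(415)  # arbitrary seed, used only to make the run repeatable

mu, sigma = 10.0, 2.0   # hypothetical population mean and standard deviation
n = 9                   # sample size, so T has n - 1 = 8 degrees of freedom
reps = 20000            # number of simulated samples

def t_statistic() -> float:
    """Draw one normal sample and return (xbar - mu) / (s / sqrt(n))."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # sample variance
    return (xbar - mu) / (math.sqrt(s2) / math.sqrt(n))

t_stats = [t_statistic() for _ in range(reps)]

# Proportion of simulated T values with |T| < 2.306; expected to be near 0.95.
inside = sum(abs(t) < 2.306 for t in t_stats) / reps
print(inside)
```

Changing `mu` or `sigma` should leave the proportion essentially unchanged, since the distribution of T depends only on the degrees of freedom.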