## Univariate Normal Distributions

Before defining the multivariate normal distribution, we revisit the univariate normal distribution. A random variable *X* is normally distributed with mean \(\mu\) and variance \(\sigma^{2}\) if it has the probability density function:

\(\phi(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\{-\frac{1}{2\sigma^2}(x-\mu)^2\}\)

This density is the usual bell-shaped curve that you see throughout statistics. The expression contains the squared difference between the variable *x* and its mean, \(\mu\), which is minimized when *x* is equal to \(\mu\). The quantity \(-\frac{1}{2\sigma^{2}}(x - \mu)^{2}\) therefore takes its largest value when *x* is equal to \(\mu\) and, because the exponential function is monotone increasing, the normal density also attains its maximum when *x* is equal to \(\mu\).

The variance \(\sigma^{2}\) defines the spread of the distribution about that maximum. If \(\sigma^{2}\) is large, the spread is large; if \(\sigma^{2}\) is small, the spread is small.
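The behaviour of the density described above can be checked numerically. The following is a minimal sketch, not part of the original lesson: the function name `normal_pdf` and the parameter values are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Univariate normal density with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

mu = 0.0  # illustrative mean (assumption)

# Height of the curve at the mean for a small and a large variance.
peak_small = normal_pdf(mu, mu, 1.0)
peak_large = normal_pdf(mu, mu, 4.0)

# Height of the curve one unit away from the mean (variance 1).
off_center = normal_pdf(mu + 1.0, mu, 1.0)

print(peak_small, peak_large, off_center)
```

The density is highest at the mean, and a larger variance lowers the peak because the same total probability is spread over a wider range.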

As shorthand notation we may use the expression below:

\(X \sim N(\mu, \sigma^2)\)

indicating that *X* is distributed according to (denoted by the wavy symbol 'tilde') a normal distribution (denoted by *N*), with mean \(\mu\) and variance \(\sigma^{2}\).

## Multivariate Normal Distributions

If a \(p \times 1\) random vector \(\mathbf{X}\) is distributed according to a multivariate normal distribution with population mean vector \(\mu\) and population variance-covariance matrix \(\Sigma\), then \(\mathbf{X}\) has the joint density function shown in the expression below:

\(\phi(\textbf{x})=\left(\frac{1}{2\pi}\right)^{p/2}|\Sigma|^{-1/2}\exp\{-\frac{1}{2}(\textbf{x}-\mathbf{\mu})'\Sigma^{-1}(\textbf{x}-\mathbf{\mu})\}\)

\(| \Sigma |\) denotes the determinant of the variance-covariance matrix \(\Sigma\) and \(\Sigma^{-1}\) is the inverse of the variance-covariance matrix \(\Sigma\). Again, the density takes its maximum value when the vector \(\textbf{x}\) is equal to the mean vector \(\mu\), and decreases away from that maximum.

If *p* is equal to 2, then we have a bivariate normal distribution, and the density is a bell-shaped surface in three dimensions.
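The joint density above can be evaluated directly from its formula. The sketch below uses NumPy; the function name `mvn_pdf` and the bivariate parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Multivariate normal density evaluated at the point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), computed with a linear
    # solve rather than an explicit matrix inverse.
    quad_form = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (-p / 2.0) * np.linalg.det(sigma) ** (-0.5)
    return norm_const * np.exp(-0.5 * quad_form)

# Hypothetical bivariate example (p = 2).
mu = np.array([1.0, 2.0])
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

at_mean = mvn_pdf(mu, mu, sigma)                         # height at the mean vector
away = mvn_pdf(mu + np.array([1.0, -1.0]), mu, sigma)    # height away from the mean

print(at_mean, away)
```

At the mean vector the quadratic form is zero, so the density reduces to the normalizing constant, which is its largest value.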

The shorthand notation, similar to the univariate version above, is

\(\mathbf{X} \sim N(\mathbf{\mu},\Sigma)\)

We say that the vector \(\mathbf{X} \) 'is distributed as' multivariate normal with mean vector \(\mu\) and variance-covariance matrix \(\Sigma\).

Some things to note about the multivariate normal distribution:

- The following term appearing inside the exponent of the multivariate normal distribution is a quadratic form:

  \((\textbf{x}-\mathbf{\mu})'\Sigma^{-1}(\textbf{x}-\mathbf{\mu})\)

  This particular quadratic form is also called the squared *Mahalanobis distance* between the random vector **x** and the mean vector \(\mu\).

- If the variables are uncorrelated then the variance-covariance matrix will be a diagonal matrix with variances of the individual variables appearing on the main diagonal of the matrix and zeros everywhere else:

  \(\Sigma = \left(\begin{array}{cccc}\sigma^2_1 & 0 & \dots & 0\\ 0 & \sigma^2_2 & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \dots & \sigma^2_p \end{array}\right)\)

  In this case the multivariate normal density function simplifies to the expression below:

  \(\phi(\textbf{x}) = \prod_{j=1}^{p}\frac{1}{\sqrt{2\pi\sigma^2_j}}\exp\{-\frac{1}{2\sigma^2_j}(x_j-\mu_j)^2\}\)

  **Note!** The product symbol, capital pi (\(\Pi\)), acts very much like the summation sign, but instead of adding we multiply over the elements ranging from *j* = 1 to *j* = *p*. Inside this product is the familiar univariate normal density, where the random variables are subscripted by *j*. In this case, the elements of the random vector, \(X_1, X_2, \dots, X_p\), are independent random variables.

- We could also consider linear combinations of the elements of a multivariate normal random variable, as shown in the expression below:

  \(Y = \sum_{j=1}^{p}c_jX_j =\textbf{c}'\textbf{X}\)

  **Note!** To define a linear combination, the random variables \(X_{j}\) need not be uncorrelated. The coefficients \(c_{j}\) are arbitrary; specific values are selected according to the problem of interest and so are influenced very much by subject matter knowledge. Looking back at the Women's Nutrition Survey Data, for example, we selected the coefficients to obtain the total intake of vitamins A and C.

Now suppose that the random vector **X** is multivariate normal with mean \(\mu\) and variance-covariance matrix \(\Sigma\):

\(\textbf{X} \sim N(\mathbf{\mu},\Sigma)\)

Then *Y* is normally distributed with mean:

\(\textbf{c}'\mathbf{\mu} = \sum_{j=1}^{p}c_j\mu_j\)

and variance:

\(\textbf{c}'\Sigma \textbf{c} =\sum_{j=1}^{p}\sum_{k=1}^{p}c_jc_k\sigma_{jk}\)

See the previous lesson to review the computation of the population mean of a linear combination of random variables.

In summary,

*Y* is normally distributed with mean **c** transposed times \(\mu\) and variance **c** transposed times \(\Sigma\) times **c**:

\(Y \sim N(\textbf{c}'\mathbf{\mu},\textbf{c}'\Sigma\textbf{c})\)

As we have seen before, these quantities may be estimated using sample estimates of the population parameters.
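The mean and variance of a linear combination can be checked with a short simulation. This is only a sketch: the values of `mu`, `sigma`, and `c` are hypothetical, and the simulation illustrates the formulas rather than proving them.

```python
import numpy as np

# Hypothetical population parameters and coefficients (assumptions).
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
c = np.array([1.0, 1.0, -2.0])
p = len(c)

# Theoretical mean c'mu and variance c'Sigma c of Y = c'X.
mean_y = c @ mu
var_y = c @ sigma @ c

# The same variance written as the double sum over c_j c_k sigma_jk.
var_y_sum = sum(c[j] * c[k] * sigma[j, k] for j in range(p) for k in range(p))

# Simulate X ~ N(mu, Sigma) and form Y = c'X for each draw.
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mu, sigma, size=200_000)
y = x @ c

print(mean_y, var_y, var_y_sum)
print(y.mean(), y.var())
```

The sample mean and variance of the simulated values of *Y* should be close to the theoretical values, with the gap shrinking as the number of draws grows.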

## Other Useful Results for the Multivariate Normal

For variables with a multivariate normal distribution with mean vector \(\mu\) and covariance matrix \(\Sigma\), some useful facts are:

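- Every single variable has a univariate normal distribution. Thus we can look at univariate tests of normality for each variable when assessing multivariate normality, although univariate normality of every variable does not by itself guarantee multivariate normality.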
- Any subset of the variables also has a multivariate normal distribution.
- Any linear combination of the variables has a univariate normal distribution.
- Any conditional distribution for a subset of the variables conditional on known values for another subset of variables is a multivariate normal distribution. The full meaning of this statement will be clear after Lesson 6.
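For the subset result above, the parameters of the subset's distribution are simply the corresponding entries of \(\mu\) and the corresponding rows and columns of \(\Sigma\). The sketch below illustrates this with a hypothetical helper `marginal_params` and made-up parameter values; the simulation checks only the mean and covariance, not normality itself.

```python
import numpy as np

def marginal_params(mu, sigma, idx):
    """Mean vector and covariance matrix of the variables listed in idx."""
    idx = np.asarray(idx)
    return mu[idx], sigma[np.ix_(idx, idx)]

# Hypothetical population parameters (assumptions).
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])

# Keep the first and third variables.
sub_mu, sub_sigma = marginal_params(mu, sigma, [0, 2])

# Compare with the sample covariance of the same two columns of simulated data.
rng = np.random.default_rng(1)
x = rng.multivariate_normal(mu, sigma, size=200_000)
sample_cov = np.cov(x[:, [0, 2]], rowvar=False)

print(sub_mu)
print(sub_sigma)
print(sample_cov)
```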

## Example 4-1 - Linear Combination of the Cholesterol Measurements

Cholesterol levels were measured on *n* heart-attack patients. For each patient, measurements were taken 0, 2, and 4 days following the attack. Treatment was given to reduce cholesterol levels. The sample mean vector is:

| Variable | Mean |
|---|---|
| \(X_{1}\) = 0-Day | 259.5 |
| \(X_{2}\) = 2-Day | 230.8 |
| \(X_{3}\) = 4-Day | 221.5 |

The covariance matrix is

| | 0-Day | 2-Day | 4-Day |
|---|---|---|---|
| 0-Day | 2276 | 1508 | 813 |
| 2-Day | 1508 | 2206 | 1349 |
| 4-Day | 813 | 1349 | 1865 |

Suppose that we are interested in the difference \(X _ { 1 } - X _ { 2 }\), the difference between the 0-day and the 2-day measurements. We can write the linear combination of interest as

\(\textbf{a}'\textbf{x}= \left(\begin{array}{ccc}1 & -1 & 0 \end{array}\right) \left(\begin{array}{c}x_1\\ x_2\\ x_3 \end{array}\right)\)

The estimated mean of the difference, with \(\bar{\textbf{x}}\) denoting the sample mean vector, is

\begin{align} \textbf{a}'\bar{\textbf{x}} &=\left(\begin{array}{ccc}1 & -1 & 0 \end{array}\right)\left(\begin{array}{c}259.5\\230.8\\221.5 \end{array}\right)\\ & = 28.7 \end{align}

The estimated variance, with \(\textbf{S}\) denoting the sample covariance matrix, is

\begin{align} \textbf{a}'\textbf{S}\textbf{a} &=\left(\begin{array}{ccc}1 & -1 & 0 \end{array}\right) \left(\begin{array}{ccc}2276 & 1508 & 813\\ 1508 & 2206 & 1349\\ 813 & 1349 & 1865 \end{array}\right) \left(\begin{array}{c}1\\-1\\0 \end{array}\right)\\&= \left(\begin{array}{ccc}768 & -698 & -536 \end{array}\right)\left(\begin{array}{c}1\\-1\\0 \end{array}\right) \\ &= 1466 \end{align}

If we assume the three measurements have a multivariate normal distribution, then the difference \(X _ { 1 } - X _ { 2 }\) has a univariate normal distribution, with estimated mean 28.7 and estimated variance 1466.
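The arithmetic in this example can be reproduced in a few lines. The sketch below uses the sample mean vector and covariance matrix given above; the variable names `x_bar`, `S`, and `a` are illustrative choices.

```python
import numpy as np

# Sample mean vector and covariance matrix from the example.
x_bar = np.array([259.5, 230.8, 221.5])
S = np.array([[2276.0, 1508.0,  813.0],
              [1508.0, 2206.0, 1349.0],
              [ 813.0, 1349.0, 1865.0]])

# Coefficients for the difference X1 - X2.
a = np.array([1.0, -1.0, 0.0])

mean_diff = a @ x_bar    # approximately 28.7
var_diff = a @ S @ a     # 1466.0
sd_diff = np.sqrt(var_diff)

print(mean_diff, var_diff, sd_diff)
```

Changing `a` to, say, `[1.0, 0.0, -1.0]` would give the corresponding quantities for the difference between the 0-day and 4-day measurements.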