# 12.1 - Notations and Terminology

## Notation

Collect all of the variables *X* into a vector \(\mathbf{X}\) for each individual subject. Let \(X_i\) denote observable trait *i*. These are the data from each subject, collected into a vector of traits:

\(\textbf{X} = \left(\begin{array}{c}X_1\\X_2\\\vdots\\X_p\end{array}\right) = \text{vector of traits}\)

This is a random vector. Assume that the vector of traits \(\mathbf{X}\) is sampled from a population with population mean vector:

\(\boldsymbol{\mu} = \left(\begin{array}{c}\mu_1\\\mu_2\\\vdots\\\mu_p\end{array}\right) = \text{population mean vector}\)

Here, \(\mathrm { E } \left( X _ { i } \right) = \mu _ { i }\) denotes the population mean of variable *i*.

Consider *m* unobservable common factors \(f _ { 1 } , f _ { 2 } , \dots , f _ { m }\). The \(i^{th}\) common factor is \(f _ { i } \). Generally, *m* is going to be substantially less than *p*.

The common factors are also collected into a vector,

\(\mathbf{f} = \left(\begin{array}{c}f_1\\f_2\\\vdots\\f_m\end{array}\right) = \text{vector of common factors}\)

## Model

Our factor model can be thought of as a series of multiple regressions, predicting each of the observable variables \(X_{i}\) from the values of the unobservable common factors \(f_{1}, f_{2}, \dots, f_{m}\):

\begin{align} X_1 & = \mu_1 + l_{11}f_1 + l_{12}f_2 + \dots + l_{1m}f_m + \epsilon_1\\ X_2 & = \mu_2 + l_{21}f_1 + l_{22}f_2 + \dots + l_{2m}f_m + \epsilon_2 \\ & \vdots \\ X_p & = \mu_p + l_{p1}f_1 + l_{p2}f_2 + \dots + l_{pm}f_m + \epsilon_p \end{align}

Here, the variable means \(\mu_{1}\) through \(\mu_{p}\) can be regarded as the intercept terms for the multiple regression models.

The regression coefficients \(l_{ij}\) (the partial slopes) for all of these multiple regressions are called factor loadings. Here, \(l_{ij}\) = *loading* of the \(i^{th}\) variable on the \(j^{th}\) factor. These are collected into a matrix as shown here:

\(\mathbf{L} = \left(\begin{array}{cccc}l_{11}& l_{12}& \dots & l_{1m}\\l_{21} & l_{22} & \dots & l_{2m}\\ \vdots & \vdots & & \vdots \\l_{p1} & l_{p2} & \dots & l_{pm}\end{array}\right) = \text{matrix of factor loadings}\)

And finally, the errors \(\epsilon_{i}\) are called the specific factors. Here, \(\epsilon_{i}\) = *specific factor* for variable *i*. The specific factors are also collected into a vector:

\(\boldsymbol{\epsilon} = \left(\begin{array}{c}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_p\end{array}\right) = \text{vector of specific factors}\)

In summary, the basic model is like a regression model. Each of our response variables *X* is predicted as a linear function of the unobserved common factors \(f_{1}\), \(f_{2}\) through \(f_{m}\). Thus, our explanatory variables are \(f_{1}\) , \(f_{2}\) through \(f_{m}\). We have *m* unobserved factors that control the variation in our data.

We will generally write this model more compactly in matrix notation:

\(\textbf{X} = \boldsymbol{\mu} + \textbf{Lf}+\boldsymbol{\epsilon}\)
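
As a rough illustration of the matrix form, the sketch below simulates data from \(\textbf{X} = \boldsymbol{\mu} + \textbf{Lf}+\boldsymbol{\epsilon}\) using NumPy. Everything numeric here is an assumption made for the example: the dimensions (*p* = 5, *m* = 2), the entries of `mu` and `L`, the specific-factor standard deviation, and the choice to draw the factors and specific factors from normal distributions, which this section does not specify. The helper name `simulate_factor_model` is hypothetical.

```python
import numpy as np


def simulate_factor_model(mu, L, n, rng, specific_sd=0.5):
    """Draw n subjects from X = mu + L f + eps (one subject per row).

    Assumption for illustration only: common factors f and specific
    factors eps are drawn from independent normal distributions.
    """
    p, m = L.shape
    f = rng.standard_normal((n, m))                 # common factors, n x m
    eps = specific_sd * rng.standard_normal((n, p)) # specific factors, n x p
    X = mu + f @ L.T + eps                          # row k is mu + L f_k + eps_k
    return X, f, eps


rng = np.random.default_rng(0)

# Illustrative values only: p = 5 observed traits, m = 2 common factors.
mu = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
L = np.array([
    [0.9, 0.0],
    [0.8, 0.1],
    [0.7, 0.3],
    [0.1, 0.8],
    [0.0, 0.9],
])

X, f, eps = simulate_factor_model(mu, L, n=1000, rng=rng)

# The matrix form agrees with the first scalar regression equation,
# X_1 = mu_1 + l_11 f_1 + l_12 f_2 + eps_1, checked for the first subject.
x1_first_subject = mu[0] + L[0, 0] * f[0, 0] + L[0, 1] * f[0, 1] + eps[0, 0]
assert np.isclose(X[0, 0], x1_first_subject)

print(X.shape)
```

Each row of `X` is one subject's vector of traits, so the product is written as `f @ L.T` rather than `L @ f`; for a single subject the two forms give the same numbers.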

**Note!** In general, we want \(m \ll p\).