# 1.1 - Measures of Central Tendency


## Central Tendency: The Mean Vector

Throughout this course, we’ll use the usual notation for the mean of a variable. That is, the symbol $$\mu$$ represents a (theoretical) population mean and the symbol $$\bar{x}$$ represents a sample mean computed from observed data. In the multivariate setting, we add subscripts to these symbols to indicate the variable to which the mean refers. For instance, $$\mu_1$$ represents the population mean for variable $$X_1$$ and $$\bar{x}_{1}$$ denotes a sample mean based on observed data for variable $$X_{1}$$.

The population mean is the measure of central tendency for the population. Here, the population mean for variable $$j$$ is

$\mu_j = E(X_{j})$

The notation $$E$$ stands for statistical expectation; here $$E(X_{j})$$ is the mean of $$X_{j}$$ over all members of the population or, equivalently, over all random draws from a stochastic model. For example, $$\mu_j = E(X_{j})$$ may be the mean of a normal variable.

The population mean $$\mu_j$$ for variable $$j$$ can be estimated by the sample mean

$\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n}X_{ij}$

Note! Because the sample mean $$\bar{x}_{j}$$ is a function of our random data, it is itself a random variable and therefore has a mean of its own. In fact, the population mean of the sample mean is equal to the population mean $$\mu_j$$; i.e.,

$E(\bar{x}_j) = \mu_j$

Therefore, $$\bar{x}_{j}$$ is an unbiased estimator of $$\mu_j$$.

Another way of saying this is that the mean of the $$\bar{x}_{j}$$’s over all possible samples of size $$n$$ is equal to $$\mu_j$$.
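
A short derivation may help show why this holds. It is a sketch that assumes each observation $$X_{ij}$$, $$i = 1, \dots, n$$, is drawn from the population so that $$E(X_{ij}) = \mu_j$$ (an assumption the text implies but does not state explicitly). Linearity of expectation then gives

$$E(\bar{x}_j) = E\left(\frac{1}{n}\sum_{i=1}^{n}X_{ij}\right) = \frac{1}{n}\sum_{i=1}^{n}E(X_{ij}) = \frac{1}{n}\cdot n\mu_j = \mu_j$$

This argument uses only the common mean of the observations; it does not require them to be independent.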

Recall that the population mean vector $$\boldsymbol{\mu}$$ collects the population means of the $$p$$ variables into a single vector:

$$\boldsymbol{\mu} = \left(\begin{array}{c} \mu_1 \\ \mu_2\\ \vdots\\ \mu_p \end{array}\right)$$

We can estimate the population mean vector $$\boldsymbol{\mu}$$ by the sample mean vector $$\mathbf{\bar{x}}$$, which is obtained by collecting the sample means of the individual variables into a single vector:

$$\mathbf{\bar{x}} = \left(\begin{array}{c}\bar{x}_1\\ \bar{x}_2\\ \vdots \\ \bar{x}_p\end{array}\right) = \left(\begin{array}{c}\frac{1}{n}\sum_{i=1}^{n}X_{i1}\\ \frac{1}{n}\sum_{i=1}^{n}X_{i2}\\ \vdots \\ \frac{1}{n}\sum_{i=1}^{n}X_{ip}\end{array}\right) = \frac{1}{n}\sum_{i=1}^{n}\textbf{X}_i$$
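
As a rough illustration of the two equivalent forms of the sample mean vector, the sketch below uses NumPy on a small made-up data matrix. The values in `X` and the names `x_bar_columns` and `x_bar_rows` are assumptions chosen for this example only; rows play the role of the observation vectors $$\mathbf{X}_i$$ and columns the role of the variables.

```python
import numpy as np

# Hypothetical data: n = 4 observations (rows) on p = 3 variables (columns).
X = np.array([
    [2.0, 10.0, 5.0],
    [4.0, 12.0, 7.0],
    [6.0, 14.0, 6.0],
    [8.0, 16.0, 2.0],
])
n, p = X.shape

# Component-wise form: average each column (variable) separately.
x_bar_columns = X.mean(axis=0)

# Vector form: add up the n observation vectors X_i, then divide by n.
x_bar_rows = sum(X[i] for i in range(n)) / n

print(x_bar_columns)                           # sample means 5, 13 and 5
print(np.allclose(x_bar_columns, x_bar_rows))  # True: both forms agree
```

In practice the column-wise call `X.mean(axis=0)` is usually all that is needed; the explicit sum over rows is included only to mirror the vector form of the formula.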

Just as the sample means $$\bar{x}_j$$ for the individual variables are unbiased for their respective population means, the sample mean vector is unbiased for the population mean vector:

$$E(\mathbf{\bar{x}}) = E\left(\begin{array}{c}\bar{x}_1\\\bar{x}_2\\ \vdots \\\bar{x}_p\end{array}\right) = \left(\begin{array}{c}E(\bar{x}_1)\\E(\bar{x}_2)\\ \vdots \\E(\bar{x}_p)\end{array}\right)=\left(\begin{array}{c}\mu_1\\\mu_2\\\vdots\\\mu_p\end{array}\right)=\boldsymbol{\mu}$$
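
Unbiasedness can also be checked informally by simulation. The sketch below is not a proof; it assumes a trivariate normal population with a made-up mean vector `mu` and covariance matrix `sigma`, draws many samples of size `n`, and averages the resulting sample mean vectors. All names and numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Assumed population: trivariate normal with known mean and covariance.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([
    [1.0, 0.3, 0.0],
    [0.3, 2.0, 0.4],
    [0.0, 0.4, 1.5],
])

n = 10             # size of each sample
n_samples = 20000  # number of repeated samples

# Array of shape (n_samples, n, p): n_samples independent samples of size n.
samples = rng.multivariate_normal(mu, sigma, size=(n_samples, n))

# One sample mean vector per sample, shape (n_samples, p).
x_bars = samples.mean(axis=1)

# Averaging the sample mean vectors over the repeated samples
# approximates E(x_bar), which should lie close to mu.
mean_of_x_bars = x_bars.mean(axis=0)

print(mean_of_x_bars)
print(mu)
```

With this many repetitions the two printed vectors should agree to roughly two decimal places, and the gap should shrink further as `n_samples` grows, whatever the value of `n`.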
