Measures of Central Tendency

Measures of central tendency are among the most basic and useful summary statistics. Each one summarizes a sample or population with a single typical value.

The two most commonly used measures of central tendency for numerical data are the mean and the median; the mode is a third measure.

Mean: The average of all data points

Median: The middle value, with half of the data lying above it and half below it

Mode: The most common value in the data
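As a quick illustration of the three definitions above, here is a minimal sketch using Python's standard `statistics` module. The sample values are hypothetical, chosen only for illustration, and are not taken from the apartment data.

```python
import statistics

# Hypothetical sample, for illustration only (not the apartment data).
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle of the sorted values (average of 3 and 5 here)
mode_value = statistics.mode(data)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Note that the median here falls between two observations, so it need not be a value that actually appears in the data.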

Mean

Synonyms: average, arithmetic mean.

Loosely speaking, gives an “expected value” for an observation (an informal reading, not the formal probabilistic definition).

The sample mean, written as \( \bar{X}\), equals the sum of observations divided by the size of the sample.

\(\bar{X}=\frac{\sum_{i=1}^{n}X_i}{n}\)

The population mean, written as μ, is analogous to the sample mean, but for the whole population.

So for our one-bedroom apartment data, where the 15 observations sum to \$5505: \( \bar{X}\) = \$5505 / 15 = \$367
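The sample mean formula can be sketched in a few lines of Python. The function name `sample_mean` and the list `x` are hypothetical, introduced only for this illustration. The last lines check the apartment arithmetic using only the total and sample size quoted above, since the individual rents are not listed here.

```python
def sample_mean(observations):
    """Return the sum of the observations divided by the sample size n."""
    n = len(observations)
    if n == 0:
        raise ValueError("the sample mean is undefined for an empty sample")
    return sum(observations) / n


# Hypothetical observations, for illustration only.
x = [4, 8, 15, 16, 23, 42]
x_bar = sample_mean(x)
print(x_bar)

# Check of the apartment figure from the total and sample size given in the text.
apartment_mean = 5505 / 15
print(apartment_mean)
```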

Median

Also known as the 50th percentile, the sample median is the middle number (or the arithmetic mean of the two middle numbers in the case of an even number of observations) when the observations are written out in order.

The population median is the 50th percentile in the whole population.

To find the median, order the values from smallest to largest and take the middle one. For our one-bedroom apartment example, \$375 is the median: seven values lie above it and seven below.
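The procedure just described (sort, then take the middle value or the mean of the two middle values) might be sketched as follows. The function name `sample_median` and both samples are hypothetical, used only to show the odd and even cases.

```python
def sample_median(observations):
    """Return the middle value of the sorted observations.

    For an even number of observations, return the arithmetic mean
    of the two middle values.
    """
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("the sample median is undefined for an empty sample")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: exactly one middle value.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical samples, for illustration only.
odd_median = sample_median([9, 1, 5])       # sorted: 1, 5, 9
even_median = sample_median([9, 1, 5, 7])   # sorted: 1, 5, 7, 9
print(odd_median, even_median)
```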

Mean vs. Median

DIFFERENCES

The mean is somewhat more “mathematically tractable” (works better with some statistical procedures).

The median is more resistant to outliers.

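The median and mean have slightly different interpretations: the mean is the balance point of the data, while the median is the value that splits the ordered data in half.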

SIMILARITIES

Both tell about where the “typical” or “central” value in a distribution is found.

For a symmetric distribution such as the normal distribution, the mean and the median are the same number.

Example of Resistance to Outliers

The mean of 3, 4, 6, 7, 8, 10, 15 is about 7.57.

The mean of 3, 4, 6, 7, 8, 10, 150 is about 26.86.

The median of either data set is 7.

Most statisticians would say that in this situation the median is the better measure of central tendency to use. (Of course, it is best to report both.)
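The outlier example above can be reproduced with a short Python sketch using the standard `statistics` module. The two lists are the data sets from the example; the variable names are hypothetical.

```python
import statistics

# The two data sets from the example; they differ only in the largest value.
original = [3, 4, 6, 7, 8, 10, 15]
with_outlier = [3, 4, 6, 7, 8, 10, 150]

for label, values in (("original", original), ("with outlier", with_outlier)):
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    print(f"{label}: mean = {mean_value:.2f}, median = {median_value}")
```

Replacing 15 with 150 roughly triples the mean but leaves the median unchanged, because the median depends only on the ordering of the values and not on how extreme the largest one is.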