Measures of Spread

Measures of Spread: Variability, Deviation, Variance, Standard Deviation

Suppose I tell you the average rent is \$360. Does that mean you should expect every 1-bedroom apartment to rent for close to \$360?

Variability

Synonyms: Dispersion, Spread

Variability is one of the basic themes of statistics. In general, not every member of a population or every outcome of a process has the same score on a variable of interest. Measures of central tendency like the mean are incomplete in themselves because they don’t tell us how spread out the data are around the center. All statistical inference must take variability into account.

[Figure panels: low variability, high variability]
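
To make the contrast concrete, here is a small sketch in Python. Both lists are made-up rents (hypothetical values, not the apartment data used in this section), chosen so that they share a mean of \$360 but differ in how far the values sit from it.

```python
import statistics

# Hypothetical rents in dollars; both lists are invented for illustration.
low_spread = [350, 355, 360, 365, 370]
high_spread = [260, 310, 360, 410, 460]

# The two samples have the same center...
mean_low = statistics.mean(low_spread)
mean_high = statistics.mean(high_spread)

# ...but very different distances between their smallest and largest values.
width_low = max(low_spread) - min(low_spread)
width_high = max(high_spread) - min(high_spread)

print(mean_low, mean_high)
print(width_low, width_high)
```

The mean alone cannot tell these two samples apart, which is why a measure of spread is reported alongside it.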

Deviation

The (signed) deviation for an observation is the difference between that observation and the mean. That is, the deviation for the ith observation is \( X_i - \bar{X} \).

The arithmetic mean of all of the deviations must be zero because they sum to zero, so it cannot serve as a measure of spread. However, the arithmetic mean of the absolute values of the deviations (the Mean Absolute Deviation) or their median (the Median Absolute Deviation) can be used instead.

For mathematical and historical reasons, however, we usually use the average of the squares of the deviations rather than their absolute values. Recall that a square is never negative.

For instance, in our 1-bedroom apartment example, the deviation of the 2nd value from the mean is: \$320 - \$360 = -\$40. The sign is negative because that rent is below the mean.
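
The following is a minimal sketch of these calculations in Python. The `rents` list is hypothetical (it is not the apartment data set from this section); it was chosen only so that the mean is 360 and the second value is 320.

```python
# Hypothetical rents in dollars, invented for illustration.
rents = [300, 320, 350, 370, 400, 420]

mean = sum(rents) / len(rents)            # arithmetic mean
deviations = [x - mean for x in rents]    # signed deviations X_i - mean

# Signed deviations cancel out, so their sum (and therefore their mean) is zero.
total = sum(deviations)

# Mean absolute deviation: the average of |X_i - mean|.
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)

print(deviations)
print(total)
print(mean_abs_dev)
```

Taking absolute values before averaging keeps the positive and negative deviations from cancelling.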


Variance

The sample variance, abbreviated \( s^2 \), is a commonly used measure of variability. It is approximately the mean of the squares of the deviations: the sum of the squared deviations is divided either by \( n \) or, more commonly for a sample, by \( n - 1 \).

\( s^2 = \text{Variance} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n} \text{ or } \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1} \)

The population variance, abbreviated \( \sigma^2 \), is similar but measures the variability of the whole population. So, \( \sigma^2 \) would be the mean of the squared deviations of all the members of the whole population.

The sample variance for our 1-bedroom apartment example is 1770.714. Because the deviations are squared, this value is in squared dollars.
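
As a rough sketch, both divisors can be compared in Python. The `rents` list is hypothetical (not the apartment data from this section), and the results are cross-checked against the standard library's `statistics.pvariance` and `statistics.variance`.

```python
import statistics

# Hypothetical rents in dollars, invented for illustration.
rents = [300, 320, 350, 370, 400, 420]

n = len(rents)
mean = sum(rents) / n
sum_sq = sum((x - mean) ** 2 for x in rents)  # sum of squared deviations

pop_var = sum_sq / n           # divide by n (population formula)
sample_var = sum_sq / (n - 1)  # divide by n - 1 (usual sample formula)

print(pop_var, statistics.pvariance(rents))
print(sample_var, statistics.variance(rents))
```

Dividing by \( n - 1 \) always gives a slightly larger value than dividing by \( n \); the difference shrinks as the sample grows.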


Standard Deviation

The standard deviation is the square root of the variance. Taking the square root returns the measure to the original scale of measurement. The standard deviation of a sample is:

\( s = \sqrt{\text{Variance}} = \sqrt{\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}} \)

The standard deviation of a population is written as \( \sigma \). A standard deviation can be thought of as roughly the average distance of the observed values from the mean.

The standard deviation for our 1-bedroom apartment example is \$42.07986.
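
A minimal sketch of the sample standard deviation in Python follows. The `rents` list is hypothetical (not the apartment data from this section); the hand calculation with the \( n - 1 \) divisor is compared against the standard library's `statistics.stdev`.

```python
import math
import statistics

# Hypothetical rents in dollars, invented for illustration.
rents = [300, 320, 350, 370, 400, 420]

n = len(rents)
mean = sum(rents) / n

# Sample variance with the n - 1 divisor, in squared dollars.
sample_var = sum((x - mean) ** 2 for x in rents) / (n - 1)

# The square root brings the measure back to dollars.
sample_sd = math.sqrt(sample_var)

print(sample_sd)
print(statistics.stdev(rents))
```

Because the result is back in dollars, it can be read directly against the rents themselves, unlike the variance.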


Other Measures of Spread

Range:

The lowest and highest values, often summarized as the difference between them.

1 bedroom apartment example: \$280, \$340

IQR: Inter-Quartile Range

The inter-quartile range is the distance between the 75th and 25th percentiles. It is the width of the interval that contains the middle 50% of the data.

1 bedroom apartment example:

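  • 25th percentile: position 0.25 * (15 + 1) = 4, so the 4th ordered value, \$330
  • 75th percentile: position 0.75 * (15 + 1) = 12, so the 12th ordered value, \$390
  • IQR: \$390 - \$330 = \$60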
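
The position rule used above, p * (n + 1), can be sketched in Python as follows. The 15 rents are hypothetical (not the apartment data from this section), and `value_at_position` is an illustrative helper name, not a library function. When the position is not a whole number, this sketch interpolates linearly between the two neighboring values, which is one common convention among several.

```python
import statistics

# Hypothetical rents in dollars (15 values), invented for illustration.
rents = sorted([210, 225, 240, 250, 260, 275, 280, 300,
                310, 325, 340, 350, 365, 380, 400])

def value_at_position(sorted_values, p):
    """Value at 1-based position p * (n + 1), with linear interpolation."""
    n = len(sorted_values)
    position = p * (n + 1)
    lower = int(position)        # whole part of the position
    frac = position - lower      # fractional part of the position
    if lower < 1:                # position falls before the first value
        return sorted_values[0]
    if lower >= n:               # position falls at or after the last value
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)

lowest, highest = rents[0], rents[-1]   # range endpoints
q1 = value_at_position(rents, 0.25)     # position 4
q3 = value_at_position(rents, 0.75)     # position 12
iqr = q3 - q1

print(lowest, highest)
print(q1, q3, iqr)

# Cross-check against the standard library's default ("exclusive") method.
print(statistics.quantiles(rents, n=4))
```

Unlike the range, the IQR ignores the lowest and highest quarters of the data, so a single extreme rent does not change it.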