4.1.2 - Population is Not Normal

Central Limit Theorem

What happens when the sample comes from a population that is not normally distributed? This is where the Central Limit Theorem comes in.

Central Limit Theorem

For a large sample size (we will explain this later), \(\bar{x}\) is approximately normally distributed, regardless of the distribution of the population one samples from. If the population has mean \(\mu\) and standard deviation \(\sigma\), then \(\bar{x}\) has mean \(\mu\) and standard deviation \(\dfrac{\sigma}{\sqrt{n}}\).

We should stop here to break down what this theorem is saying because the Central Limit Theorem is very powerful!

The Central Limit Theorem applies to the sample mean from any population distribution that has a finite mean and standard deviation. The population could be left-skewed or right-skewed. As long as the sample size is large, the distribution of the sample means will be approximately Normal.

For the purposes of this course, a sample size of \(n>30\) is considered a large sample.
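
The claim about the mean and standard deviation of \(\bar{x}\) can be checked with a short simulation. The following is a minimal sketch, not part of the original lesson: it assumes a right-skewed exponential population with \(\mu = 1\) and \(\sigma = 1\) (an illustrative choice), a sample size of 40, and 10,000 replications, and uses only the Python standard library.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed population for illustration: exponential with rate 1,
# which is right-skewed and has mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0
n = 40                 # sample size (n > 30, "large" by this course's rule)
replications = 10_000  # number of samples drawn

# Draw many samples of size n and record each sample mean.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = sigma / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
```

With these settings, the two simulated values should land close to \(\mu\) and \(\dfrac{\sigma}{\sqrt{n}}\), even though the population itself is far from Normal.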

CLT Demonstration

Before we begin the demonstration, let's talk about what we should be looking for…

Notes on the CLT for this demonstration:

  • If the population is skewed and the sample size is small, then the distribution of the sample mean will not be approximately normal.
  • When doing a simulation, one replicates the process many times. Using 10,000 replications is a good idea.
  • If the population is normal, then the distribution of the sample mean looks normal even if \(n = 2\). Note that the app in the video uses a capital \(N\) for the sample size.
  • If the population is skewed, then the distribution of the sample mean looks more and more normal as \(n\) gets larger.
  • Note that in all cases, the mean of the sample mean is close to the population mean and the standard error of the sample mean is close to \(\dfrac{\sigma}{\sqrt{n}}\).

To work through this demonstration of the Central Limit Theorem yourself, click on the link to the website, Rice Virtual Lab in Statistics > Sampling Distributions, and then click Begin.
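
The notes above can also be explored offline. The sketch below is an illustration under stated assumptions, not a reproduction of the Rice Virtual Lab app: it assumes an exponential (right-skewed) population, uses 10,000 replications, and measures how lopsided the distribution of sample means is with a simple skewness statistic (the average cubed z-score, which is near 0 for a symmetric shape). The function names `skewness` and `simulate_sample_means` are hypothetical helpers.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def simulate_sample_means(n, replications=10_000):
    """Draw `replications` samples of size n from an exponential(1)
    population and return the list of their sample means."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]

# Compare the shape of the sampling distribution for several sample sizes.
skew_by_n = {}
for n in (2, 5, 30):
    means = simulate_sample_means(n)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: skewness of sample means = {skew_by_n[n]:.2f}")
```

The skewness should shrink toward 0 as \(n\) grows, which is the numerical counterpart of the histogram of sample means looking more and more normal.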

Sampling Distribution of the Sample Mean

With the Central Limit Theorem, we can finally define the sampling distribution of the sample mean.

Sampling Distribution of the Sample Mean

The sampling distribution of the sample mean will have:

  • The same mean as the population mean, \(\mu\).
  • A standard deviation, called the standard error, of \(\dfrac{\sigma}{\sqrt{n}}\).

It will be Normal (or approximately Normal) if either of these conditions is satisfied:

  • The population distribution is Normal.
  • The sample size is large (\(n \gt 30\)).
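
The definition above can be written as a small helper. This is a sketch only: the function name `describe_sampling_distribution` is hypothetical, the \(n > 30\) cutoff is this course's rule of thumb, and the numbers in the usage lines are made up for illustration.

```python
import math

def describe_sampling_distribution(mu, sigma, n, population_is_normal=False):
    """Return (mean, standard_error, is_approx_normal) for the sample mean.

    Hypothetical helper; uses the course's n > 30 rule for "large" samples.
    """
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    mean = mu                              # same mean as the population
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)
    is_approx_normal = population_is_normal or n > 30
    return mean, standard_error, is_approx_normal

# Illustrative (made-up) population: mu = 100, sigma = 20.
large = describe_sampling_distribution(100, 20, 64)
small = describe_sampling_distribution(100, 20, 10)
print(large)  # large sample: Normal approximation applies
print(small)  # small sample, population shape unknown: it does not
```

The mean and standard error are returned in every case; only the Normal approximation depends on the two conditions.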

Example 4-2: Weights of Baby Giraffes

The weights of baby giraffes are known to have a mean of 125 pounds and a standard deviation of 15 pounds.

If we obtained a random sample of 40 baby giraffes,

  1. what is the probability that the sample mean will be between 120 and 130 pounds?
  2. what is the 75th percentile of the sample means of size \(n=40\)?

Answer

Does the problem indicate that the distribution of weights is normal? No, it does not. In order to apply the Central Limit Theorem, we need a large sample. Since \(n=40>30\), we can use the theorem. The sampling distribution of the sample mean is approximately Normal with mean \(\mu=125\) and standard error \(\dfrac{\sigma}{\sqrt{n}}=\dfrac{15}{\sqrt{40}}\).

  1. We want \(P(120<\bar{X}<130)\).

    \begin{align} P(120<\bar{X}<130) &=P\left(\dfrac{120-125}{\dfrac{15}{\sqrt{40}}}<\dfrac{\bar{X}-\mu}{\dfrac{\sigma}{\sqrt{n}}}<\frac{130-125}{\dfrac{15}{\sqrt{40}}}\right)\\ &=P(-2.108<Z<2.108)\\&=P(Z<2.108)-P(Z<-2.108)\\ &=0.9826-0.0174\\ &=0.9652 \end{align}

    The probability that the sample mean of the 40 giraffes is between 120 and 130 lbs is 96.52%.

  2. To find the 75th percentile, we need the value \(a\) such that \(P(Z<a)=0.75\). Using the Z-table or software, we get \(a=0.6745\). The z-score for the sample mean is

    \(z=\dfrac{\bar{X}-\mu}{\dfrac{\sigma}{\sqrt{n}}}=\dfrac{\bar{X}-125}{\dfrac{15}{\sqrt{40}}}\)

    Since we know the \(z\) value is 0.6745, we can use algebra to solve for \(\bar{X}\).

    \begin{align} 0.6745&=\dfrac{\bar{X}-125}{\frac{15}{\sqrt{40}}}\\
    0.6745\left(\frac{15}{\sqrt{40}}\right) &=\bar{X}-125\\
    \bar{X}&=0.6745\left(\frac{15}{\sqrt{40}}\right)+125\\&=126.6 \end{align}

    The 75th percentile of all the sample means of size \(n=40\) is \(126.6\) pounds.
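
Both parts of this example can be checked with software. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative, and the Normal model for the sample mean is the Central Limit Theorem approximation used in the answer above.

```python
import math
from statistics import NormalDist

# Values from Example 4-2
mu = 125     # population mean weight (pounds)
sigma = 15   # population standard deviation (pounds)
n = 40       # sample size; n > 30, so the CLT approximation is used

standard_error = sigma / math.sqrt(n)

# Approximate sampling distribution of the sample mean
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# Part 1: P(120 < sample mean < 130)
prob_between = sampling_dist.cdf(130) - sampling_dist.cdf(120)

# Part 2: 75th percentile of the sample means
percentile_75 = sampling_dist.inv_cdf(0.75)

print(f"standard error:      {standard_error:.4f}")
print(f"P(120 < mean < 130): {prob_between:.4f}")
print(f"75th percentile:     {percentile_75:.1f}")
```

Software carries more decimal places in \(z\) than the Z-table does, so the computed probability may differ from the table-based 0.9652 in the fourth decimal place.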