Central Limit Theorem
What happens when the sample comes from a population that is not normally distributed? This is where the Central Limit Theorem comes in.
For a large sample size (we will explain this later), \(\bar{x}\) is approximately normally distributed, regardless of the distribution of the population one samples from. If the population has mean \(\mu\) and standard deviation \(\sigma\), then \(\bar{x}\) has mean \(\mu\) and standard deviation \(\dfrac{\sigma}{\sqrt{n}}\).
We should stop here to break down what this theorem is saying because the Central Limit Theorem is very powerful!
The Central Limit Theorem applies to a sample mean from any population distribution with a finite mean and standard deviation. The population could be left-skewed or right-skewed. As long as the sample size is large, the distribution of the sample means will be approximately Normal.
For the purposes of this course, a sample size of \(n>30\) is considered a large sample.
CLT Demonstration
Before we begin the demonstration, let's talk about what we should be looking for…
Notes on the CLT for this demonstration:
- If the population is skewed and the sample size is small, then the distribution of the sample mean may not look normal.
- When doing a simulation, one replicates the process many times. Using 10,000 replications is a good idea.
- If the population is normal, then the distribution of the sample mean looks normal even if \(n = 2\). Note that the app in the video used capital N for the sample size.
- If the population is skewed, then the distribution of the sample mean looks more and more normal as \(n\) gets larger.
- Note that in all cases, the mean of the sample means is close to the population mean, and the standard deviation of the sample means (the standard error) is close to \(\dfrac{\sigma}{\sqrt{n}}\).
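The notes above can be checked with a small simulation. The sketch below is only an illustration, not the app from the video: it assumes an Exponential(1) population (right-skewed, with \(\mu = 1\) and \(\sigma = 1\)), and the names `simulate_sample_means` and `skewness` are hypothetical helpers introduced here. It uses 10,000 replications and compares \(n = 2\) with \(n = 40\).

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n, replications=10_000, seed=0):
    """Return sample means of `replications` samples of size n.

    Assumption for illustration: the population is Exponential(1),
    which is right-skewed with mean 1 and standard deviation 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]


def skewness(values):
    """Moment-based skewness; values near 0 suggest a symmetric shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


for n in (2, 40):
    means = simulate_sample_means(n)
    print(
        f"n={n}: mean of sample means = {statistics.fmean(means):.3f}, "
        f"sd of sample means = {statistics.stdev(means):.3f}, "
        f"sigma/sqrt(n) = {1 / sqrt(n):.3f}, "
        f"skewness = {skewness(means):.2f}"
    )
```

For both sample sizes, the mean of the sample means should land near 1 and their standard deviation near \(\dfrac{\sigma}{\sqrt{n}}\). The skewness should be noticeably closer to 0 for \(n = 40\) than for \(n = 2\).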
Sampling Distribution of the Sample Mean
With the Central Limit Theorem, we can finally define the sampling distribution of the sample mean.
Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean will have:
- The same mean as the population mean, \(\mu\).
- A standard deviation (called the standard error) of \(\dfrac{\sigma}{\sqrt{n}}\).
It will be Normal (or approximately Normal) if either of these conditions is satisfied:
- The population distribution is Normal.
- The sample size is large (\(n \gt 30\)).
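As a rough sketch, the definition can be written as a small helper. The function name `sampling_distribution` and its `population_is_normal` flag are hypothetical, introduced only for illustration, and the \(n > 30\) cutoff is this course's rule of thumb rather than a universal threshold.

```python
from math import sqrt


def sampling_distribution(mu, sigma, n, population_is_normal=False):
    """Summarize the sampling distribution of the sample mean.

    Returns (mean, standard_error, normal_ok), where normal_ok is True
    when the course's conditions for (approximate) normality are met.
    """
    standard_error = sigma / sqrt(n)
    # Course rule of thumb: Normal population, or a large sample (n > 30).
    normal_ok = population_is_normal or n > 30
    return mu, standard_error, normal_ok


# Example values chosen for illustration only.
print(sampling_distribution(mu=50, sigma=10, n=25))
print(sampling_distribution(mu=50, sigma=10, n=25, population_is_normal=True))
```

With \(\sigma = 10\) and \(n = 25\), the standard error is \(10/\sqrt{25} = 2\) in both calls. Only the second call reports that the Normal model is justified, because the sample is small and only there is the population assumed Normal.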
Example 4-2: Weights of Baby Giraffes
The weights of baby giraffes are known to have a mean of 125 pounds and a standard deviation of 15 pounds.
If we obtained a random sample of 40 baby giraffes,
- what is the probability that the sample mean will be between 120 and 130 pounds?
- what is the 75th percentile of the sample means of size \(n=40\)?
Answer
Does the problem indicate that the distribution of weights is normal? No, it does not. In order to apply the Central Limit Theorem, we need a large sample. Since \(n=40>30\), we can use the theorem. The sampling distribution of the sample mean is approximately Normal with mean \(\mu=125\) and standard error \(\dfrac{\sigma}{\sqrt{n}}=\dfrac{15}{\sqrt{40}}\).
- We want \(P(120<\bar{X}<130)\).
\begin{align} P(120<\bar{X}<130) &=P\left(\dfrac{120-125}{\dfrac{15}{\sqrt{40}}}<\dfrac{\bar{X}-\mu}{\dfrac{\sigma}{\sqrt{n}}}<\frac{130-125}{\dfrac{15}{\sqrt{40}}}\right)\\ &=P(-2.108<Z<2.108)\\&=P(Z<2.108)-P(Z<-2.108)\\ &=0.9826-0.0174\\ &=0.9652 \end{align}
The probability that the sample mean of the 40 giraffes is between 120 and 130 lbs is 96.52%.
- To find the 75th percentile, we need the value \(a\) such that \(P(Z<a)=0.75\). Using the Z-table or software, we get \(a=0.6745\). The formula for the z-score is:
\(z=\dfrac{\bar{X}-\mu}{\dfrac{\sigma}{\sqrt{n}}}=\dfrac{\bar{X}-125}{\dfrac{15}{\sqrt{40}}}\)
Since we know the \(z\) value is 0.6745, we can use algebra to solve for \(\bar{X}\).
\begin{align} 0.6745&=\dfrac{\bar{X}-125}{\frac{15}{\sqrt{40}}}\\
0.6745\left(\frac{15}{\sqrt{40}}\right) &=\bar{X}-125\\
\bar{X}&=0.6745\left(\frac{15}{\sqrt{40}}\right)+125\\&=126.6 \end{align}
The 75th percentile of all the sample means of size \(n=40\) is \(126.6\) pounds.
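Both parts of this example can also be checked with software. The sketch below is one possible way to do so, using `NormalDist` from Python's standard `statistics` module; the variable names are chosen here for illustration. It models \(\bar{X}\) directly as Normal with mean 125 and standard error \(15/\sqrt{40}\), so no manual z-score conversion is needed.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 125, 15, 40

# Approximate sampling distribution of the sample mean (CLT, since n > 30).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# Part 1: P(120 < X-bar < 130) as a difference of two cumulative probabilities.
prob = xbar_dist.cdf(130) - xbar_dist.cdf(120)

# Part 2: the 75th percentile, using the inverse of the cumulative distribution.
p75 = xbar_dist.inv_cdf(0.75)

print(f"P(120 < X-bar < 130) = {prob:.4f}")
print(f"75th percentile = {p75:.1f} pounds")
```

The probability comes out at about 0.965. It can differ from the table-based 0.9652 in the fourth decimal place because the software does not round the z-score before looking it up. The percentile agrees with the 126.6 pounds found by hand.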