# 4.1.2 - Population is Not Normal

What happens when the sample comes from a population that is not normally distributed? This is where the Central Limit Theorem (CLT) comes in.

#### Central Limit Theorem

For a large sample size (we will explain this later), \(\bar{x}\) is approximately normally distributed, regardless of the distribution of the population one samples from. If the population has mean \(\mu\) and standard deviation \(\sigma\), then the distribution of \(\bar{x}\) has mean \(\mu\) and standard deviation \(\dfrac{\sigma}{\sqrt{n}}\).

We should stop here to break down what this theorem is saying because the Central Limit Theorem is very powerful!

The Central Limit Theorem applies to a sample mean from any distribution. The population could be left-skewed or right-skewed. As long as the sample size is large, the distribution of the sample means will be approximately Normal.

For the purposes of this course, a sample size of \(n>30\) is considered a large sample.
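The theorem can be checked with a small simulation. The sketch below is illustrative only: it assumes a hypothetical right-skewed population (an exponential distribution with mean 2, for which the standard deviation is also 2), draws many samples of size \(n = 40\), and compares the mean and standard deviation of the resulting sample means with \(\mu\) and \(\dfrac{\sigma}{\sqrt{n}}\). The variable names and the choice of population are assumptions made for this example.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population: exponential with mean 2.
# For an exponential distribution the standard deviation equals the mean.
mu = 2.0
sigma = 2.0
n = 40              # sample size (n > 30, "large" by this course's rule)
num_samples = 5000  # number of repeated samples drawn from the population

# Draw num_samples samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean(random.expovariate(1 / mu) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
standard_error = sigma / math.sqrt(n)

print("mean of sample means:", round(mean_of_means, 3), "vs mu =", mu)
print("SD of sample means:  ", round(sd_of_means, 3),
      "vs sigma/sqrt(n) =", round(standard_error, 3))
```

Even though individual observations from this population are strongly skewed, the simulated sample means should center near \(\mu\) with a spread near \(\dfrac{\sigma}{\sqrt{n}}\), and a histogram of `sample_means` should look roughly bell-shaped.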

For many people just learning statistics, the CLT prompts a "so what?" reaction. Why is this important, and why should I care? Recall that when we introduced z-scores, we did so with the caveat that the distribution was normal. We took observed data that were normally distributed and converted them to z-scores, creating a standard normal distribution. We then used this distribution to find percentiles (and in future units we will use it to find probabilities).

The CLT allows us to treat the distribution of the sample mean as approximately normal as long as the sample size is greater than 30 observations, even when the population itself is not normal. With this, we can apply most of our inferential statistics without having to compensate for non-normal distributions. This will take on greater relevance as we move through the course.

## Sampling Distribution of the Sample Mean

With the Central Limit Theorem, we can finally define the sampling distribution of the sample mean.

**Sampling Distribution of the Sample Mean**

The sampling distribution of the sample mean will have:

- the same mean as the population mean, \(\mu\)
- a standard deviation (the standard error) of \(\dfrac{\sigma}{\sqrt{n}}\)

It will be Normal (or approximately Normal) if either of these conditions is satisfied:

- The population distribution is Normal
- The sample size is large (greater than 30).
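As a worked illustration of how these properties are used, the sketch below assumes a hypothetical population with \(\mu = 50\) and \(\sigma = 12\) and a sample of size \(n = 36\). These numbers are made up for the example. Because \(n > 30\), the sample mean is treated as approximately Normal with mean 50 and standard error \(\dfrac{12}{\sqrt{36}} = 2\), so the probability that \(\bar{x}\) exceeds 53 corresponds to a z-score of \(\dfrac{53 - 50}{2} = 1.5\).

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population parameters and sample size (assumed for illustration).
mu = 50.0
sigma = 12.0
n = 36

# Standard error of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean (justified by n > 30).
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# z-score for a sample mean of 53, and the probability of exceeding it.
z = (53 - mu) / standard_error
prob_above_53 = 1 - sampling_dist.cdf(53)

print("standard error:", standard_error)
print("z-score:", z)
print("P(sample mean > 53):", round(prob_above_53, 4))  # approximately 0.0668
```

Note that the standard error, not \(\sigma\), is used as the spread here, because the question is about a sample mean rather than a single observation.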