# 12.1 - Summary of Statistical Techniques

## Summary Table for Statistical Techniques Section

#### Estimating a Mean

##### Parameter

One population mean, $$\mu$$

##### Statistic

Sample mean, $$\bar{x}$$

##### Type of Data

Numerical

##### Analysis

1-sample t-interval

$$\bar{x}\pm t_{\alpha /2}\cdot \frac{s}{\sqrt{n}}$$
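As a rough illustration of the interval formula, the sketch below computes $$\bar{x}\pm t_{\alpha /2}\cdot s/\sqrt{n}$$ in Python using only the standard library. The function name, the sample values, and the idea of passing the t multiplier in by hand (for example, read from a t-table) are assumptions made for this example; Minitab looks up the multiplier itself.

```python
import math
import statistics

def one_sample_t_interval(data, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n).

    t_crit is the t multiplier for n - 1 degrees of freedom, supplied by
    the caller (assumption: taken from a t-table).
    """
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample; 2.776 is the 95% multiplier for 4 degrees of freedom.
lower, upper = one_sample_t_interval([1, 2, 3, 4, 5], t_crit=2.776)
print(round(lower, 3), round(upper, 3))
```

With only five observations, the interval is trustworthy only if the data are approximately normal, as the conditions for this procedure require.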

##### Minitab Command

Stat > Basic statistics > 1-sample t

##### Conditions

data approximately normal OR

have a large sample size (n ≥ 30)

##### Examples
• What is the average weight of adults?
• What is the average cholesterol level of adult females?

#### Test About a Mean

##### Parameter

One population mean, $$\mu$$

##### Statistic

Sample mean, $$\bar{x}$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \mu = \mu_0$$

$$H_a\colon \mu \ne \mu_0$$ OR

$$H_a\colon \mu > \mu_0$$ OR

$$H_a\colon \mu < \mu_0$$

1-sample t-test:

$$t=\frac{\bar{x}-\mu_{0}}{\frac{s}{\sqrt{n}}}$$
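The test statistic can be checked by hand with a few lines of Python. This is a minimal sketch, assuming a small made-up sample and a hypothetical function name; it returns only the t value, which would then be compared with a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t_statistic(data, mu_0):
    """Return t = (x_bar - mu_0) / (s / sqrt(n))."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - mu_0) / standard_error

# Hypothetical data, testing H0: mu = 2.
t = one_sample_t_statistic([1, 2, 3, 4, 5], mu_0=2)
print(round(t, 4))
```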

##### Minitab Command

Stat > Basic statistics > 1-sample t

##### Conditions

data approximately normal

OR

have a large sample size (n ≥ 30)

##### Examples
• Is the average GPA of juniors at Penn State higher than 3.0?
• Is the average winter temperature in State College less than 42°F?

#### Estimating a Proportion

##### Parameter

One population proportion $$p$$

##### Statistic

Sample proportion, $$\hat{p}$$

##### Type of Data

Categorical (Binary)

##### Analysis

1-proportion Z-interval:

$$\hat{p}\pm z_{\alpha /2}\sqrt{\frac{\hat{p}\cdot \left ( 1-\hat{p} \right )}{n}}$$
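The interval above might be computed as in the following sketch. The function name and the counts are made up for illustration; the z multiplier comes from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    # z_{alpha/2}: the standard normal quantile with alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 50 of 100 respondents are in the category of interest.
lower, upper = one_proportion_z_interval(50, 100)
print(round(lower, 3), round(upper, 3))
```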

##### Minitab Command

Stat > Basic statistics > 1-sample proportion

##### Conditions

Have at least 5 observations in each category

##### Examples
• What is the proportion of males in the world?
• What is the proportion of students that smoke?

#### Test about a Proportion

##### Parameter

One population proportion, $$p$$

##### Statistic

Sample proportion, $$\hat{p}$$

##### Type of Data

Categorical (Binary)

##### Analysis

$$H_0\colon p = p_0$$

$$H_a\colon p \ne p_0$$ OR

$$H_a\colon p > p_0$$ OR

$$H_a\colon p < p_0$$

1-proportion Z-test:

$$z=\frac{\hat{p}-p _{0}}{\sqrt{\frac{p _{0}\left ( 1- p _{0}\right )}{n}}}$$
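A sketch of the z-test in Python follows. The function name, the `alternative` argument, and the counts are assumptions for this example. Note that the standard error uses the hypothesized value $$p_0$$, not $$\hat{p}$$.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p_0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p_0."""
    p_hat = successes / n
    z = (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # Ha: p > p_0
        p_value = 1 - cdf
    elif alternative == "less":       # Ha: p < p_0
        p_value = cdf
    else:                             # Ha: p != p_0
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(60, 100, p_0=0.5)
print(round(z, 3), round(p_value, 4))
```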

##### Minitab Command

Stat > Basic statistics > 1-sample proportion

##### Conditions

$$np_0 \geq 5$$ and

$$n (1 - p_0) \geq 5$$

##### Examples
• Is the proportion of females different from 0.5?
• Is the proportion of students who fail STAT 500 less than 0.1?

#### Estimating the Difference of Two Means*

##### Parameter

Difference in two population means,

$$\mu_1 - \mu_2$$

##### Statistic

Difference in two sample means,

$$\bar{x}_{1} - \bar{x}_{2}$$

##### Type of Data

Numerical

##### Analysis

2-sample t-interval:

$$\left (\bar{x}_{1}-\bar{x}_{2} \right )\pm t_{\alpha /2}\cdot \hat{s.e.}\left (\bar{x}_{1}-\bar{x}_{2} \right )$$

##### Minitab Command

Stat > Basic statistics > 2-sample t

##### Conditions

Independent samples from the two populations

Data in each sample are about normal or large samples

##### Examples
• How different are the mean GPAs of males and females?
• How many fewer colds do vitamin C takers get, on average, than non-vitamin takers?

#### Test to Compare Two Means*

##### Parameter

Difference in two population means,

$$\mu_1 - \mu_2$$

##### Statistic

Difference in two sample means,

$$\bar{x}_{1} - \bar{x}_{2}$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \mu_1 = \mu_2$$

$$H_a\colon \mu_1 \ne \mu_2$$ OR

$$H_a\colon \mu_1 > \mu_2$$ OR

$$H_a\colon \mu_1 < \mu_2$$

2-sample t-test:

$$t=\frac{\left (\bar{x}_{1}-\bar{x}_{2} \right )-0}{\hat{s.e.}\left (\bar{x}_{1}-\bar{x}_{2} \right )}$$

##### Minitab Command

Stat > Basic statistics > 2-sample t

##### Conditions

Independent samples from the two populations

Data in each sample are about normal or large samples

##### Examples
• Do the mean pulse rates of exercisers and non-exercisers differ?
• Is the mean EDS score for dropouts greater than the mean EDS score for graduates?

*The estimated standard error (s.e.) used in the two-sample interval and test depends on whether the pooled or the unpooled (separate variances) method is used.
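To make the pooled versus unpooled distinction concrete, the sketch below computes both standard errors from summary statistics and plugs one into the 2-sample t statistic. The formulas are the standard textbook ones, which this summary does not spell out; the function names and numbers are made up for illustration.

```python
import math

def se_unpooled(s1, n1, s2, n2):
    """Separate-variances standard error: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_pooled(s1, n1, s2, n2):
    """Pooled standard error: s_p * sqrt(1/n1 + 1/n2)."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

def two_sample_t(x_bar1, x_bar2, standard_error):
    """t statistic for H0: mu_1 = mu_2."""
    return (x_bar1 - x_bar2 - 0) / standard_error

# Hypothetical summaries: s1 = 2 with n1 = 8, s2 = 3 with n2 = 10.
print(round(se_unpooled(2, 8, 3, 10), 4))
print(round(se_pooled(2, 8, 3, 10), 4))
print(round(two_sample_t(12.0, 10.0, se_unpooled(2, 8, 3, 10)), 4))
```

When the two sample sizes and the two sample standard deviations are equal, both functions return the same value; they diverge as the samples become less alike.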

#### Estimating a Mean with Paired Data

##### Parameter

Mean of paired difference,

$$\mu_D$$

##### Statistic

Sample mean of difference,

$$\bar{d}$$

##### Type of Data

Numerical

##### Analysis

paired t-interval:

$$\bar{d}\pm t_{\alpha /2}\cdot \frac{s_{d}}{\sqrt{n}}$$

##### Minitab Command

Stat > Basic statistics > Paired t

##### Conditions

Differences approximately normal OR

Have a large number of pairs (n ≥ 30)

##### Examples
• What is the difference in pulse rates, on the average, before and after exercise?

#### Test about a Mean with Paired Data

##### Parameter

Mean of paired difference,

$$\mu_D$$

##### Statistic

Sample mean of difference,

$$\bar{d}$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \mu_D = 0$$

$$H_a\colon \mu_D \ne 0$$ OR

$$H_a\colon \mu_D > 0$$ OR

$$H_a\colon \mu_D < 0$$

t-test statistic:

$$t=\frac{\bar{d}-0}{\frac{s_d}{\sqrt{n}}}$$
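The paired procedure is a one-sample t procedure applied to the differences, which the following sketch tries to make explicit. The function name and the pulse-rate values are hypothetical.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return t = d_bar / (s_d / sqrt(n)), with d = after - before per pair."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of observations")
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n))

# Hypothetical pulse rates for four people, before and after exercise.
t = paired_t_statistic([70, 72, 68, 75], [74, 75, 70, 80])
print(round(t, 4))
```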

##### Minitab Command

Stat > Basic statistics > Paired t

##### Conditions

Differences approximately normal OR

Have a large number of pairs (n ≥ 30)

##### Examples
• Is the difference in IQ of pairs of twins zero?
• Are the pulse rates of people higher after exercise?

#### Estimating the Difference of Two Proportions

##### Parameter

Difference in two population proportions,

$$p_1 - p_2$$

##### Statistic

Difference in two sample proportions,

$$\hat{p}_{1} - \hat{p}_{2}$$

##### Type of Data

Categorical (Binary)

##### Analysis

2-proportions Z-interval:

$$\left ( \hat{p} _{1}-\hat{p} _{2} \right )\pm z_{\alpha /2}\cdot \hat{s.e.}\left ( \hat{p} _{1}-\hat{p} _{2} \right )$$
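The summary does not write out the standard error for this interval. The sketch below assumes the usual unpooled form, $$\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1+\hat{p}_2(1-\hat{p}_2)/n_2}$$; the function name and counts are made up for illustration.

```python
import math
from statistics import NormalDist

def two_proportion_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Assumption: unpooled standard error, the usual choice for an interval.
    standard_error = math.sqrt(
        p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    )
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    difference = p1_hat - p2_hat
    return difference - z * standard_error, difference + z * standard_error

# Hypothetical counts: 60 of 100 in group 1 and 40 of 100 in group 2.
lower, upper = two_proportion_z_interval(60, 100, 40, 100)
print(round(lower, 3), round(upper, 3))
```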

##### Minitab Command

Stat > Basic statistics > 2 proportions

##### Conditions

Independent samples from the two populations

Have at least 5 in each category for both populations

##### Examples
• How different are the percentages of male and female smokers?
• How different are the percentages of upper- and lower-class binge drinkers?

#### Test to Compare Two Proportions

##### Parameter

Difference in two population proportions,

$$p_1 - p_2$$

##### Statistic

Difference in two sample proportions,

$$\hat{p}_{1} - \hat{p}_{2}$$

##### Type of Data

Categorical (Binary)

##### Analysis

$$H_0\colon p_1 = p_2$$

$$H_a\colon p_1 \ne p_2$$ OR

$$H_a\colon p_1 > p_2$$ OR

$$H_a\colon p_1 < p_2$$

2-proportion Z-test:

$$z^*=\frac{\hat{p}_{1}-\hat{p}_{2}}{\sqrt{\hat{p}^*\left ( 1-\hat{p}^* \right )\left ( \frac{1}{n_{1}}+ \frac{1}{n_{2}}\right )}}$$

$$\hat{p}^*=\dfrac{x_{1}+x_{2}}{n_{1}+n_{2}}$$
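The two formulas above can be combined as in this sketch, where the pooled estimate $$\hat{p}^*$$ feeds the standard error of the test statistic. The function name and counts are hypothetical.

```python
import math

def two_proportion_z_statistic(x1, n1, x2, n2):
    """Return z* for H0: p1 = p2, using the pooled estimate p_hat*."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # p_hat* in the formula above
    standard_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# Hypothetical counts: 60 of 100 in group 1 and 40 of 100 in group 2.
z = two_proportion_z_statistic(60, 100, 40, 100)
print(round(z, 4))
```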

##### Minitab Command

Stat > Basic statistics > 2 proportions

##### Conditions

Independent samples from the two populations

Have at least 5 in each category for both populations

##### Examples
• Is the percentage of males with lung cancer higher than the percentage of females with lung cancer?

• Are the percentages of upper- and lower-class binge drinkers different?

#### Relationship in a 2-Way Table

##### Parameter

Relationship between two categorical variables, OR

difference in two or more population proportions

##### Statistic

The observed counts in a two-way table

##### Type of Data

Categorical

##### Analysis

$$H_0\colon\text{The two variables are not related}$$

$$H_a\colon\text{The two variables are related}$$

Chi-square test statistic:

$$X^2=\sum_{\text{all cells}}\frac{(\text{Observed-Expected})^2}{\text{Expected}}$$
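The statistic might be computed from a table of observed counts as sketched below. The expected count for a cell is taken as (row total × column total) / grand total, the standard definition under "not related", which this summary does not state. The function names and the 2×2 table are made up; the second function mirrors the two conditions listed for this test.

```python
def chi_square_statistic(table):
    """Return (X^2, expected counts) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    x_squared = 0.0
    expected_counts = []
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            expected_counts.append(expected)
            x_squared += (observed - expected) ** 2 / expected
    return x_squared, expected_counts

def conditions_met(expected_counts):
    """All expected counts > 1 and at least 80% of cells with expected > 5."""
    all_above_one = all(e > 1 for e in expected_counts)
    share_above_five = sum(e > 5 for e in expected_counts) / len(expected_counts)
    return all_above_one and share_above_five >= 0.8

# Hypothetical 2x2 table of observed counts.
x_squared, expected = chi_square_statistic([[10, 20], [20, 10]])
print(round(x_squared, 4), conditions_met(expected))
```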

##### Minitab Command

Stat > Tables > Chi square Test for Association

##### Conditions

All expected counts should be greater than 1

At least 80% of the cells should have an expected count greater than 5

##### Examples
• Is there a relationship between smoking and lung cancer?
• Do the proportions of students in each class who smoke differ?

#### Test About a Slope

##### Parameter

Slope of the population regression line,

$$\beta_1$$

##### Statistic

Sample estimate of the slope,

$$b_1$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \beta_1 = 0$$

$$H_a\colon \beta_1 \ne 0$$ OR

$$H_a\colon \beta_1 > 0$$ OR

$$H_a\colon \beta_1 < 0$$

t-test with n - 2 degrees of freedom:

$$t=\dfrac{b_{1}-0}{\hat{s.e.}\left ( b_{1} \right )}$$
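As an illustration, the sketch below fits the least-squares line and forms the t statistic. It assumes the usual simple linear regression formulas, $$b_1=S_{xy}/S_{xx}$$ and $$\hat{s.e.}(b_1)=\sqrt{MSE/S_{xx}}$$ with $$MSE=SSE/(n-2)$$, which are not written out in this summary. The function name and data are hypothetical.

```python
import math

def slope_t_test(x, y):
    """Return (b1, t) for H0: beta_1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx                 # sample slope
    b0 = y_bar - b1 * x_bar          # sample intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)              # n - 2 degrees of freedom
    se_b1 = math.sqrt(mse / s_xx)
    return b1, (b1 - 0) / se_b1

# Hypothetical data.
b1, t = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b1, 3), round(t, 4))
```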

##### Minitab Command

Stat > Regression > Regression

##### Conditions

The form of the equation that links the two variables must be correct

The error terms are normally distributed

The error terms have equal variances

The error terms are independent of each other

##### Examples
• Is there a linear relationship between height and weight of a person?

#### Test to Compare Several Means

##### Parameter

Population means of the t populations,

$$\mu_1, \mu_2, \cdots , \mu_t$$

##### Statistic

Sample means from the t populations,

$$\bar{x}_1, \bar{x}_2, \cdots , \bar{x}_t$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_t$$

$$H_a\colon \text{not all the means are equal}$$

F-test for one-way ANOVA:

$$F=\dfrac{MST}{MSE}$$
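The F ratio can be built up from the group data as in the sketch below. It assumes the standard definitions: MST is the between-group sum of squares divided by t - 1, and MSE is the within-group sum of squares divided by N - t, where N is the total number of observations. The function name and the three groups are made up for illustration.

```python
import statistics

def one_way_anova_f(groups):
    """Return F = MST / MSE for a list of samples, one per population."""
    t = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    # Between-group (treatment) sum of squares.
    ss_treatment = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-group (error) sum of squares.
    ss_error = sum(
        (value - statistics.mean(group)) ** 2
        for group in groups
        for value in group
    )
    mst = ss_treatment / (t - 1)
    mse = ss_error / (n_total - t)
    return mst / mse

# Hypothetical samples from three populations.
f_statistic = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(round(f_statistic, 4))
```

In this example the statistic would be compared with an F distribution with 2 and 6 degrees of freedom.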

##### Minitab Command

Stat > ANOVA > One-Way

##### Conditions

Each population is normally distributed

Independent samples from the t populations

Equal population standard deviations

##### Examples
• Is there a difference between the mean GPA of freshman, sophomore, junior, and senior classes?

#### Test of Strength & Direction of Linear Relationship of 2 Quantitative Variables

##### Parameter

Population correlation,

$$\rho$$

"rho"

##### Statistic

Sample correlation,

$$r$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \rho = 0$$

$$H_a\colon \rho \ne 0$$

t-test statistic:

$$t=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
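A short sketch of this calculation: compute the sample correlation $$r$$ from its definition, then convert it to the t statistic, which has n - 2 degrees of freedom. The function names and data are hypothetical.

```python
import math

def sample_correlation(x, y):
    """Return Pearson's r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

def correlation_t_statistic(r, n):
    """Return t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = sample_correlation(x, y)
print(round(r, 4), round(correlation_t_statistic(r, len(x)), 4))
```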

##### Minitab Command

Stat > Basic Statistics > Correlation

##### Conditions

2 variables are continuous

Related pairs

No significant outliers

Normality of both variables

Linear relationship between the variables

##### Examples
• Is there a linear relationship between height and weight?

#### Test to Compare Two Population Variances

##### Parameter

Population variances of two populations,

$$\sigma_{1}^{2}, \sigma_{2}^{2}$$

##### Statistic

Sample variances of the two samples,

$$s_{1}^{2}, s_{2}^{2}$$

##### Type of Data

Numerical

##### Analysis

$$H_0\colon \sigma_{1}^{2} = \sigma_{2}^{2}$$

$$H_a\colon \sigma_{1}^{2} \ne \sigma_{2}^{2}$$

F-test statistic:

$$F=\frac{s_{1}^{2}}{s_{2}^{2}}$$
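The ratio is simple to compute directly, as the sketch below suggests. The function name and the two samples are made up; `statistics.variance` returns the sample variance, which divides by n - 1.

```python
import statistics

def variance_ratio_f(sample_1, sample_2):
    """Return F = s_1^2 / s_2^2 and its (numerator, denominator) df."""
    f_statistic = statistics.variance(sample_1) / statistics.variance(sample_2)
    return f_statistic, (len(sample_1) - 1, len(sample_2) - 1)

# Hypothetical lumber lengths from two companies.
f_statistic, degrees_of_freedom = variance_ratio_f(
    [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
)
print(f_statistic, degrees_of_freedom)
```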

##### Minitab Command

Stat > Basic statistics > 2 variances

##### Conditions

Each population is normally distributed

Independent samples from the 2 populations

##### Examples
• Is the variance of the length of lumber produced by Company A different from that of lumber produced by Company B?