12.1 - Summary of Statistical Techniques

Summary Table for Statistical Techniques Section

Estimating a Mean

Parameter

One population mean, $$\mu$$

Statistic

Sample mean, $$\bar{x}$$

Type of Data

Numerical

Analysis

1-sample t-interval

$$\bar{x}\pm t_{\alpha /2}\cdot \frac{s}{\sqrt{n}}$$

Minitab Command

Stat > Basic statistics > 1-sample t

Conditions

data approximately normal OR

have a large sample size (n ≥ 30)

Examples
• What is the average weight of adults?
• What is the average cholesterol level of adult females?
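As a rough illustration of the 1-sample t-interval formula above, the following sketch uses only the Python standard library. The function name, the sample weights, and the choice to pass the critical value `t_crit` in from a t-table are all assumptions made for this example.

```python
import math
import statistics

def one_sample_t_interval(data, t_crit):
    """Return (lower, upper) for x_bar ± t_crit * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical weights; 2.776 is the 95% t-table value for df = n - 1 = 4.
weights = [150, 160, 170, 180, 190]
low, high = one_sample_t_interval(weights, t_crit=2.776)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

With only five observations, the interval is trustworthy only if the data are approximately normal, as the conditions above require.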

Test about a Mean

Parameter

One population mean, $$\mu$$

Statistic

Sample mean, $$\bar{x}$$

Type of Data

Numerical

Analysis

$$H_0\colon \mu = \mu_0$$

$$H_a\colon \mu \ne \mu_0$$ OR

$$H_a\colon \mu > \mu_0$$ OR

$$H_a\colon \mu < \mu_0$$

1-sample t-test:

$$t=\frac{\bar{x}-\mu_{0}}{\frac{s}{\sqrt{n}}}$$

Minitab Command

Stat > Basic statistics > 1-sample t

Conditions

data approximately normal

OR

have a large sample size (n ≥ 30)

Examples
• Is the average GPA of juniors at Penn State higher than 3.0?
• Is the average winter temperature in State College less than 42°F?
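The 1-sample t statistic above can be computed by hand in a few lines. This is a minimal sketch with made-up GPA values; the function name and data are hypothetical, and the p-value would still come from a t-table or software with n - 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t_statistic(data, mu_0):
    """t = (x_bar - mu_0) / (s / sqrt(n))."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    return (x_bar - mu_0) / (s / math.sqrt(n))

# Hypothetical GPAs, testing H0: mu = 3.0 against Ha: mu > 3.0.
gpas = [3.1, 3.4, 2.9, 3.6, 3.0]
t_stat = one_sample_t_statistic(gpas, mu_0=3.0)
print(f"t = {t_stat:.3f} with {len(gpas) - 1} degrees of freedom")
```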

Estimating a Proportion

Parameter

One population proportion $$p$$

Statistic

Sample proportion, $$\hat{p}$$

Type of Data

Categorical (Binary)

Analysis

1-proportion Z-interval:

$$\hat{p}\pm z_{\alpha /2}\sqrt{\frac{\hat{p}\cdot \left ( 1-\hat{p} \right )}{n}}$$

Minitab Command

Stat > Basic statistics > 1-sample proportion

Conditions

Have at least 5 in each category

Examples
• What is the proportion of males in the world?
• What is the proportion of students that smoke?
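To make the 1-proportion Z-interval concrete, here is a small sketch using `statistics.NormalDist` from the standard library for the critical value. The function name and the counts (40 smokers out of 100 students) are invented for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(x, n, confidence=0.95):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 40 of 100 students smoke (both categories have >= 5).
p_low, p_high = one_proportion_z_interval(x=40, n=100)
print(f"95% CI for p: ({p_low:.3f}, {p_high:.3f})")
```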

Test about a Proportion

Parameter

One population proportion, $$p$$

Statistic

Sample proportion, $$\hat{p}$$

Type of Data

Categorical (Binary)

Analysis

$$H_0\colon p = p_0$$

$$H_a\colon p \ne p_0$$ OR

$$H_a\colon p > p_0$$ OR

$$H_a\colon p < p_0$$

1-proportion Z-test:

$$z=\frac{\hat{p}-p _{0}}{\sqrt{\frac{p _{0}\left ( 1- p _{0}\right )}{n}}}$$

Minitab Command

Stat > Basic statistics > 1-sample proportion

Conditions

$$np_0 \geq 5$$ and

$$n (1 - p_0) \geq 5$$

Examples
• Is the proportion of females different from 0.5?
• Is the proportion of students who fail STAT 500 less than 0.1?
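The 1-proportion Z-test, including its conditions, might be sketched as follows. The function name and the counts are hypothetical, and only the left-tailed alternative $$H_a\colon p < p_0$$ is shown.

```python
import math
from statistics import NormalDist

def one_proportion_z_test_less(x, n, p_0):
    """Return (z, p_value) for H0: p = p_0 versus Ha: p < p_0."""
    # Conditions from the summary: n*p_0 >= 5 and n*(1 - p_0) >= 5.
    if n * p_0 < 5 or n * (1 - p_0) < 5:
        raise ValueError("sample too small for the normal approximation")
    p_hat = x / n
    z = (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)
    p_value = NormalDist().cdf(z)  # left-tail area
    return z, p_value

# Hypothetical data: 8 of 120 students failed; test H0: p = 0.1.
z_stat, p_value = one_proportion_z_test_less(x=8, n=120, p_0=0.1)
print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")
```

Note that the standard error uses $$p_0$$ rather than $$\hat{p}$$, because the test statistic is computed assuming the null hypothesis is true.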

Estimating the Difference of Two Means*

Parameter

Difference in two population means,

$$\mu_1 - \mu_2$$

Statistic

Difference in two sample means,

$$\bar{x}_{1} - \bar{x}_{2}$$

Type of Data

Numerical

Analysis

2-sample t-interval:

$$\bar{x}_{1}-\bar{x}_{2}\pm t_{\alpha /2}\cdot \hat{s.e.}\left (\bar{x}_{1}-\bar{x}_{2} \right )$$

Minitab Command

Stat > Basic statistics > 2-sample t

Conditions

Independent samples from the two populations

Data in each sample are about normal or large samples

Examples
• How different are the mean GPAs of males and females?
• How many fewer colds do vitamin C takers get, on average, than non-vitamin takers?

Test to Compare Two Means*

Parameter

Difference in two population means,

$$\mu_1 - \mu_2$$

Statistic

Difference in two sample means,

$$\bar{x}_{1} - \bar{x}_{2}$$

Type of Data

Numerical

Analysis

$$H_0\colon \mu_1 = \mu_2$$

$$H_a\colon \mu_1 \ne \mu_2$$ OR

$$H_a\colon \mu_1 > \mu_2$$ OR

$$H_a\colon \mu_1 < \mu_2$$

2-sample t-test: $$t=\frac{\left (\bar{x}_{1}-\bar{x}_{2} \right )-0}{\hat{s.e.}\left (\bar{x}_{1}-\bar{x}_{2} \right )}$$

Minitab Command

Stat > Basic statistics > 2-sample t

Conditions

Independent samples from the two populations

Data in each sample are about normal or large samples

Examples
• Do the mean pulse rates of exercisers and non-exercisers differ?
• Is the mean EDS score for dropouts greater than the mean EDS score for graduates?

*The standard error (S.E.) depends on whether the pooled or unpooled procedure is used.
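The summary does not spell out the two standard errors, so the sketch below assumes the usual textbook forms: $$\sqrt{s_1^2/n_1 + s_2^2/n_2}$$ for the unpooled case and $$s_p\sqrt{1/n_1 + 1/n_2}$$ for the pooled case, with $$s_p^2$$ the weighted average of the two sample variances. The function names and summary statistics are hypothetical.

```python
import math

def se_unpooled(s1, n1, s2, n2):
    """Standard error without assuming equal population variances."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_pooled(s1, n1, s2, n2):
    """Standard error assuming equal population variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

def two_sample_t(x_bar1, x_bar2, se):
    """t = ((x_bar1 - x_bar2) - 0) / s.e."""
    return ((x_bar1 - x_bar2) - 0) / se

# Hypothetical pulse-rate summaries for two independent samples.
se_u = se_unpooled(s1=3, n1=12, s2=4, n2=15)
se_p = se_pooled(s1=3, n1=12, s2=4, n2=15)
t_unpooled = two_sample_t(72, 69, se_u)
print(f"unpooled s.e. = {se_u:.4f}, pooled s.e. = {se_p:.4f}, t = {t_unpooled:.3f}")
```

The degrees of freedom also differ between the two procedures, which this sketch does not compute.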

Estimating a Mean with Paired Data

Parameter

Mean of paired difference,

$$\mu_D$$

Statistic

Sample mean of difference,

$$\bar{d}$$

Type of Data

Numerical

Analysis

paired t-interval:

$$\bar{d}\pm t_{\alpha /2}\cdot \frac{s_{d}}{\sqrt{n}}$$

Minitab Command

Stat > Basic statistics > Paired t

Conditions

Differences approximately normal OR

Have a large number of pairs (n ≥ 30)

Examples
• What is the difference in pulse rates, on the average, before and after exercise?

Test about a Mean with Paired Data

Parameter

Mean of paired difference,

$$\mu_D$$

Statistic

Sample mean of difference,

$$\bar{d}$$

Type of Data

Numerical

Analysis

$$H_0\colon \mu_D = 0$$

$$H_a\colon \mu_D \ne 0$$ OR

$$H_a\colon \mu_D > 0$$ OR

$$H_a\colon \mu_D < 0$$

t-test statistic:

$$t=\frac{\bar{d}-0}{\frac{s_d}{\sqrt{n}}}$$

Minitab Command

Stat > Basic statistics > Paired t

Conditions

Differences approximately normal OR

Have a large number of pairs (n ≥ 30)

Examples
• Is the difference in IQ of pairs of twins zero?
• Are the pulse rates of people higher after exercise?
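For paired data, the analysis reduces to a 1-sample procedure on the differences. A minimal sketch, with invented before and after pulse rates and a hypothetical function name:

```python
import math
import statistics

def paired_t_statistic(first, second):
    """t = (d_bar - 0) / (s_d / sqrt(n)), with d = second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation of differences
    return (d_bar - 0) / (s_d / math.sqrt(n))

# Hypothetical pulse rates for the same five people.
before = [70, 72, 68, 75, 71]
after = [78, 80, 74, 85, 79]
t_paired = paired_t_statistic(before, after)
print(f"t = {t_paired:.3f} with {len(before) - 1} degrees of freedom")
```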

Estimating the Difference of Two Proportions

Parameter

Difference in two population proportions,

$$p_1 - p_2$$

Statistic

Difference in two sample proportions,

$$\hat{p}_{1} - \hat{p}_{2}$$

Type of Data

Categorical (Binary)

Analysis

2-proportions Z-interval:

$$\hat{p}_{1}-\hat{p}_{2}\pm z_{\alpha /2}\cdot \hat{s.e.}\left ( \hat{p}_{1}-\hat{p}_{2} \right )$$

Minitab Command

Stat > Basic statistics > 2 proportions

Conditions

Independent samples from the two populations

Have at least 5 in each category for both populations

Examples
• How different are the percentages of male and female smokers?
• How different are the percentages of upper- and lower-class binge drinkers?

Test to Compare Two Proportions

Parameter

Difference in two population proportions,

$$p_1 - p_2$$

Statistic

Difference in two sample proportions,

$$\hat{p}_{1} - \hat{p}_{2}$$

Type of Data

Categorical (Binary)

Analysis

$$H_0\colon p_1 = p_2$$

$$H_a\colon p_1 \ne p_2$$ OR

$$H_a\colon p_1 > p_2$$ OR

$$H_a\colon p_1 < p_2$$

2-proportion Z-test:

$$z^*=\frac{\hat{p}_{1}-\hat{p}_{2}}{\sqrt{\hat{p}^*\left ( 1-\hat{p}^* \right )\left ( \frac{1}{n_{1}}+ \frac{1}{n_{2}}\right )}}$$

$$\hat{p}^*=\dfrac{x_{1}+x_{2}}{n_{1}+n_{2}}$$

Minitab Command

Stat > Basic statistics > 2 proportions

Conditions

Independent samples from the two populations

Have at least 5 in each category for both populations

Examples
• Is the percentage of males with lung cancer higher than the percentage of females with lung cancer?

• Are the percentages of upper- and lower-class binge drinkers different?
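The 2-proportion Z-test statistic with the pooled estimate $$\hat{p}^*$$ can be sketched directly from the two formulas above. The function name and counts are made up for the example.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z* using the pooled proportion p_star = (x1 + x2) / (n1 + n2)."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_star = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_star * (1 - p_star) * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / se

# Hypothetical counts: 30 of 100 in group 1 and 20 of 100 in group 2.
z_two = two_proportion_z(x1=30, n1=100, x2=20, n2=100)
print(f"z* = {z_two:.3f}")
```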

Relationship in a 2-Way Table

Parameter

Relationship between two categorical variables, OR

difference in two or more population proportions

Statistic

The observed counts in a two-way table

Type of Data

Categorical

Analysis

$$H_0\colon\text{The two variables are not related}$$

$$H_a\colon\text{The two variables are related}$$

Chi-square test statistic:

$$X^2=\sum_{\text{all cells}}\frac{(\text{Observed-Expected})^2}{\text{Expected}}$$

Minitab Command

Stat > Tables > Chi-Square Test for Association

Conditions

All expected counts should be greater than 1

At least 80% of the cells should have an expected count greater than 5

Examples
• Is there a relationship between smoking and lung cancer?
• Do the proportions of students in each class who smoke differ?
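The chi-square statistic sums over every cell of the two-way table. The sketch below assumes the usual expected count, (row total × column total) / grand total, which the summary does not state explicitly. The function name and the table of counts are hypothetical.

```python
def chi_square_statistic(observed):
    """Sum of (Observed - Expected)^2 / Expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables are not related.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: rows = smoker / non-smoker, columns = disease / none.
table = [[20, 30],
         [30, 20]]
chi2 = chi_square_statistic(table)
print(f"X^2 = {chi2:.2f}")
```

Here every expected count is 25, so both conditions on the expected counts are met.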

Test about a Slope

Parameter

Slope of the population regression line,

$$\beta_1$$

Statistic

Sample estimate of the slope,

$$b_1$$

Type of Data

Numerical

Analysis

$$H_0\colon \beta_1 = 0$$

$$H_a\colon \beta_1 \ne 0$$ OR

$$H_a\colon \beta_1 > 0$$ OR

$$H_a\colon \beta_1 < 0$$

t-test with n - 2 degrees of freedom:

$$t=\dfrac{b_{1}-0}{\hat{s.e.}\left ( b_{1} \right )}$$

Minitab Command

Stat > Regression > Regression

Conditions

The form of the equation that links the two variables must be correct

The error terms are normally distributed

The error terms have equal variances

The error terms are independent of each other

Examples
• Is there a linear relationship between height and weight of a person?
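The slope test can be worked out by hand for a small data set. This sketch assumes the standard least-squares formulas, $$b_1 = S_{xy}/S_{xx}$$ and $$\hat{s.e.}(b_1) = \sqrt{MSE/S_{xx}}$$ with $$MSE = SSE/(n-2)$$, none of which are given in the summary. The function name and data are invented.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, t) for H0: beta_1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b1, (b1 - 0) / se_b1

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, t_slope = slope_t_statistic(xs, ys)
print(f"b1 = {b1:.2f}, t = {t_slope:.3f} with {len(xs) - 2} degrees of freedom")
```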

Test to Compare Several Means

Parameter

Population means of the t populations,

$$\mu_1, \mu_2, \cdots , \mu_t$$

Statistic

Sample means of the t populations,

$$\bar{x}_1, \bar{x}_2, \cdots , \bar{x}_t$$

Type of Data

Numerical

Analysis

$$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_t$$

$$H_a\colon \text{not all the means are equal}$$

F-test for one-way ANOVA:

$$F=\dfrac{MST}{MSE}$$

Minitab Command

Stat > ANOVA > One-Way

Conditions

Each population is normally distributed

Independent samples from the t populations

Equal population standard deviations

Examples
• Is there a difference between the mean GPA of freshman, sophomore, junior, and senior classes?
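The F statistic for one-way ANOVA is the ratio of the treatment mean square to the error mean square. The sketch below assumes the usual definitions, MST = SST / (t - 1) and MSE = SSE / (N - t), where N is the total sample size. The function name and the three small groups are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """F = MST / MSE for a list of samples, one per population."""
    t = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group (treatment) sum of squares.
    ss_treat = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (error) sum of squares.
    ss_error = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    mst = ss_treat / (t - 1)
    mse = ss_error / (n_total - t)
    return mst / mse

# Hypothetical scores for three groups.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_anova = one_way_anova_f(samples)
print(f"F = {f_anova:.2f}")
```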

Test of Strength & Direction of Linear Relationship of 2 Quantitative Variables

Parameter

Population correlation,

$$\rho$$

"rho"

Statistic

Sample correlation,

$$r$$

Type of Data

Numerical

Analysis

$$H_0\colon \rho = 0$$

$$H_a\colon \rho \ne 0$$

t-test statistic:

$$t=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

Minitab Command

Stat > Basic Statistics > Correlation

Conditions

2 variables are continuous

Related pairs

No significant outliers

Normality of both variables

Linear relationship between the variables

Examples
• Is there a linear relationship between height and weight?
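Given a sample correlation, the t statistic above is a one-line calculation. The function name and the values r = 0.6 and n = 27 are assumed purely for illustration.

```python
import math

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Hypothetical sample correlation from 27 pairs of heights and weights.
t_corr = correlation_t_statistic(r=0.6, n=27)
print(f"t = {t_corr:.2f} with {27 - 2} degrees of freedom")
```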

Test to Compare Two Population Variances

Parameter

Population variances of two populations,

$$\sigma_{1}^{2}, \sigma_{2}^{2}$$

Statistic

Sample variances of two populations,

$$s_{1}^{2}, s_{2}^{2}$$

Type of Data

Numerical

Analysis

$$H_0\colon \sigma_{1}^{2} = \sigma_{2}^{2}$$

$$H_a\colon \sigma_{1}^{2} \ne \sigma_{2}^{2}$$

F-test statistic:

$$F=\frac{s_{1}^{2}}{s_{2}^{2}}$$

Minitab Command

Stat > Basic statistics > 2 variances

Conditions

Each population is normally distributed

Independent samples from the 2 populations

Examples
• Are the variances of length of lumber produced by Company A different from those produced by Company B?
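The F statistic for comparing two variances is simply the ratio of the sample variances. A minimal sketch with made-up lumber lengths and a hypothetical function name:

```python
import statistics

def variance_ratio_f(sample1, sample2):
    """F = s1^2 / s2^2 using sample variances (n - 1 divisor)."""
    return statistics.variance(sample1) / statistics.variance(sample2)

# Hypothetical lumber lengths from two companies.
company_a = [10, 12, 14, 16, 18]
company_b = [11, 12, 13, 14, 15]
f_var = variance_ratio_f(company_a, company_b)
print(f"F = {f_var:.2f}")
```

The resulting value would be compared with an F distribution having n1 - 1 and n2 - 1 degrees of freedom, which is valid only if both populations are normally distributed.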