Formulas

95% Confidence Interval
\(sample\;statistic \pm 2 (standard\;error)\)
Between Groups (Numerator) Degrees of Freedom

\(df_{between}=k-1\)

\(k\) = number of groups

Binomial Random Variable Probability

\(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}\)

\(n\) = number of trials
\(k\) = number of successes
\(p\) = probability event of interest occurs on any one trial
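
A quick numeric check of this formula, using made-up values for \(n\), \(k\), and \(p\) and assuming SciPy is available:

```python
from math import comb

from scipy import stats

n, k, p = 10, 3, 0.2   # hypothetical: 10 trials, 3 successes, p = 0.2

# Direct use of the formula: C(n, k) * p^k * (1 - p)^(n - k)
direct = comb(n, k) * p**k * (1 - p) ** (n - k)

# The same probability from SciPy's binomial distribution
from_scipy = stats.binom.pmf(k, n, p)

print(direct, from_scipy)   # both are approximately 0.2013
```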

Chi-Square (\(\chi^2\)) Test Statistic

\(\chi^2=\Sigma \frac{(Observed-Expected)^2}{Expected}\)
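
A minimal sketch of this computation on invented observed and expected counts (assumes NumPy and SciPy); scipy.stats.chisquare returns the same statistic plus a p-value:

```python
import numpy as np
from scipy import stats

observed = np.array([30, 14, 34, 45, 57, 20])   # hypothetical observed counts
expected = np.array([20, 20, 30, 40, 60, 30])   # hypothetical expected counts

# Sum of (Observed - Expected)^2 / Expected over all cells
chi_sq = np.sum((observed - expected) ** 2 / expected)

stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(chi_sq, stat)   # identical statistics
```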

Complement of A
\(P(A^{C})=1-P(A)\)
Conditional Probability of A Given B

\(P(A\mid B)=\frac{P(A \: \cap\: B)}{P(B)}\)

Conditional Probability of B Given A

\(P(B\mid A)=\frac{P(A \: \cap\: B)}{P(A)}\)

Confidence Interval for a Population Mean

\(\overline{x} \pm t^{*} \frac{s}{\sqrt{n}}\)
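
A sketch of this interval on a small made-up sample, assuming SciPy is available for the \(t^*\) multiplier:

```python
import numpy as np
from scipy import stats

x = np.array([12.1, 9.8, 11.4, 10.3, 12.7, 9.5, 11.0, 10.9])  # hypothetical sample

n = len(x)
xbar = x.mean()
s = x.std(ddof=1)                       # sample standard deviation
t_star = stats.t.ppf(0.975, df=n - 1)   # multiplier for 95% confidence

margin = t_star * s / np.sqrt(n)
print(xbar - margin, xbar + margin)     # lower and upper bounds
```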

Confidence Interval for the Difference Between Two Paired Means

\(\overline{x}_d \pm t^* \left(\frac{s_d}{\sqrt{n}}\right)\)

\(t^*\) is the multiplier with \(df = n-1\)

Confidence Interval for the Difference Between Two Proportions
\((\widehat{p}_1-\widehat{p}_2) \pm z^\ast {\sqrt{\frac{\widehat{p}_1 (1-\widehat{p}_1)}{n_1}+\frac{\widehat{p}_2 (1-\widehat{p}_2)}{n_2}}}\)
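
A numeric sketch of this interval with invented counts, using scipy.stats.norm for the \(z^*\) multiplier:

```python
import numpy as np
from scipy import stats

x1, n1 = 56, 100   # hypothetical successes and sample size, group 1
x2, n2 = 42, 120   # hypothetical successes and sample size, group 2

p1, p2 = x1 / n1, x2 / n2
z_star = stats.norm.ppf(0.975)   # multiplier for 95% confidence

se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
print(diff - z_star * se, diff + z_star * se)
```
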
Confidence Interval for Two Independent Means

\((\bar{x}_1-\bar{x}_2) \pm t^\ast{ \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\)

Confidence Interval of \(p\): Normal Approximation Method

\(\widehat{p} \pm z^{*} \left ( \sqrt{\frac{\widehat{p} (1-\widehat{p})}{n}} \right)\)

Confidence Interval of \(\beta_1\)

\(b_1 \pm t^\ast (SE_{b_1})\)

\(b_1\) = sample slope
\(t^\ast\) = value from the \(t\) distribution with \(df=n-2\)
\(SE_{b_1}\) = standard error of \(b_1\)

Degrees of Freedom: Chi-Square Test of Independence

\(df=(number\;of\;rows-1)(number\;of\;columns-1)\)

Estimated Degrees of Freedom

\(df=smallest\;n-1\)

Expected Cell Value

\(E=\frac{row\;total \; \times \; column\;total}{n}\)
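
A sketch with a made-up 2×3 table (assumes NumPy and SciPy): one expected count is computed by hand and checked against scipy.stats.chi2_contingency, which also reports the degrees of freedom \((rows-1)(columns-1)\):

```python
import numpy as np
from scipy import stats

table = np.array([[20, 30, 25],    # hypothetical observed counts
                  [30, 45, 50]])

n = table.sum()
row_totals = table.sum(axis=1)
col_totals = table.sum(axis=0)

# Expected count for row 0, column 0: (row total * column total) / n
e_00 = row_totals[0] * col_totals[0] / n

chi2, p, dof, expected = stats.chi2_contingency(table)
print(e_00, expected[0, 0])   # same expected count
print(dof)                    # (2 - 1)(3 - 1) = 2
```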

Expected Count

\(Expected\;count=n (p_i)\)

\(n\) is the total sample size
\(p_i\) is the hypothesized proportion of the \(i^{th}\) group

Finding Sample Size for Estimating a Population Proportion

\(n=\left ( \frac{z^*}{M} \right )^2 \tilde{p}(1-\tilde{p})\)

\(M\) is the margin of error
\(\tilde p\) is an estimated value of the proportion
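
A small sketch of this calculation; the margin of error and the guess for \(\tilde{p}\) are invented, and the result is rounded up to the next whole observation:

```python
import math

from scipy import stats

M = 0.03        # hypothetical desired margin of error
p_tilde = 0.5   # conservative guess when no prior estimate is available
z_star = stats.norm.ppf(0.975)   # 95% confidence

n = (z_star / M) ** 2 * p_tilde * (1 - p_tilde)
print(math.ceil(n))   # about 1068
```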

Finding the Sample Size for Estimating a Population Mean

\(n=\frac{z^{2}\widetilde{\sigma}^{2}}{M^{2}}=\left ( \frac{z\widetilde{\sigma}}{M} \right )^2\)

\(z\) = z multiplier for given confidence level
\(\widetilde{\sigma}\) = estimated population standard deviation
\(M\) = margin of error

Five Number Summary

(Minimum, \(Q_1\), Median, \(Q_3\), Maximum)
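
A quick sketch on made-up data (assumes NumPy; note that software quartile conventions can differ slightly from by-hand methods):

```python
import numpy as np

data = np.array([3, 7, 8, 5, 12, 14, 21, 13, 18])   # hypothetical values

# Minimum, Q1, Median, Q3, Maximum
print(np.percentile(data, [0, 25, 50, 75, 100]))
```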

General Form of 95% Confidence Interval

\(sample\ statistic\pm2\ (standard\ error)\)

General Form of a Test Statistic

\(test\;statistic=\frac{sample\;statistic-null\;parameter}{standard\;error}\)

Interquartile Range

\(IQR = Q_3 - Q_1\)

Intersection

\(P(A\cap B) =P(A)\times P(B\mid A)\)

Mean of a Binomial Random Variable

\(\mu=np\)
Also known as \(E(X)\)

Observed Sample Mean Difference

\(\overline{x}_d=\frac{\Sigma{x}_d}{n}\)

\(x_d\) = observed difference

Odds

\(odds = \frac {number \;with \;the\; outcome}{number \;without \;the \;outcome}\)

or, equivalently,

\(odds=\frac{risk}{1-risk}\)

Pearson's r: Conceptual Formula

\(r=\frac{\sum{z_x z_y}}{n-1}\)
where \(z_x=\frac{x - \overline{x}}{s_x}\) and \(z_y=\frac{y - \overline{y}}{s_y}\)
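
A numeric check of the conceptual formula on invented paired data (assumes NumPy; the z-scores use the sample standard deviation, ddof=1):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical x values
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # hypothetical y values
n = len(x)

z_x = (x - x.mean()) / x.std(ddof=1)
z_y = (y - y.mean()) / y.std(ddof=1)

r_formula = np.sum(z_x * z_y) / (n - 1)
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_formula, r_numpy)   # the two agree
```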

Pooled Estimate of \(p\)

\(\widehat{p}=\frac{\widehat{p}_1n_1+\widehat{p}_2n_2}{n_1+n_2}\)

Population Mean

\(\mu=\frac{\Sigma x}{N}\)

Power

\(Power = 1-\beta\)

\(\beta\) = probability of committing a Type II Error.

Probability of Event A
\(P(A)=\frac{Number\;in\;group\;A}{Total\;number}\)
Proportion

\(Proportion=\frac{Number\;in\;the\;category}{Total\;number}\)

Range

\(Range = Maximum - Minimum\)

Relative Risk
\(Relative\ Risk=\frac{Risk\ in\ Group\ 1}{Risk\ in\ Group\ 2}\)
Residual

\(e_i =y_i -\widehat{y}_i\)

\(y_i\) = actual value of y for the ith observation
\(\widehat{y}_i\) = predicted value of y for the ith observation

Risk

\(Risk= \frac{number \;with \;the\; outcome}{total\;number}\)

Sample Standard Deviation

\(s=\sqrt{\frac{\sum (x-\overline{x})^{2}}{n-1}}\)

Sample Variance

\(s^{2}=\frac{\sum (x-\overline{x})^{2}}{n-1}\)

Simple Linear Regression Line: Population

\(\widehat{y}=\alpha+\beta x\)

Simple Linear Regression Line: Sample

\(\widehat{y}=a+bx\)

\(\widehat{y}\) = predicted value of \(y\) for a given value of \(x\)
\(a\) = \(y\)-intercept
\(b\) = slope

Slope

\(b_1 =r \frac{s_y}{s_x}\)

\(r\) = Pearson’s correlation coefficient between \(x\) and \(y\)
\(s_y\) = standard deviation of \(y\)
\(s_x\) = standard deviation of \(x\)
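
A sketch checking this formula against a least-squares fit on made-up data (assumes NumPy):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # hypothetical predictor values
y = np.array([3.1, 5.4, 6.8, 9.9, 11.2])   # hypothetical response values

r = np.corrcoef(x, y)[0, 1]
b1 = r * y.std(ddof=1) / x.std(ddof=1)       # slope from the formula above

slope, intercept = np.polyfit(x, y, deg=1)   # least-squares fit for comparison
print(b1, slope)   # the two slopes agree
```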

Standard Deviation of a Binomial Random Variable

\(\sigma=\sqrt {np(1-p)}\)

Standard Deviation of the Differences
\(s_d=\sqrt{\frac{\sum (x_d-\overline{x}_d)^{2}}{n-1}}\)
Standard Error

\(\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\)

Sum of Squared Residuals

Also known as Sum of Squared Errors (SSE)
\(SSE=\sum (y-\widehat{y})^2\)

Sum of Squares

\(SS={\sum (x-\overline{x})^{2}}\)

Test Statistic

\(test\; statistic = \frac{sample \; statistic - null\;parameter}{standard \;error}\)

Test Statistic for Dependent Means

\(t=\frac{\bar{x}_d-\mu_0}{\frac{s_d}{\sqrt{n}}}\)

\(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)
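
A sketch of this test on invented before/after measurements, assuming SciPy; the by-hand statistic matches scipy.stats.ttest_rel:

```python
import numpy as np
from scipy import stats

before = np.array([200, 185, 190, 210, 195, 188])   # hypothetical paired data
after = np.array([195, 183, 186, 202, 196, 180])

d = before - after
t_manual = (d.mean() - 0) / (d.std(ddof=1) / np.sqrt(len(d)))   # mu_0 = 0

t_scipy, p_value = stats.ttest_rel(before, after)
print(t_manual, t_scipy)   # same t statistic
```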

Test Statistic for Two Independent Proportions

\(z=\frac{\widehat{p}_1-\widehat{p}_2}{SE_0}\)

\(SE_0=\sqrt{\widehat{p}(1-\widehat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\), where \(\widehat{p}\) is the pooled estimate of \(p\)
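
A minimal sketch with invented counts, using the pooled estimate of \(p\) (defined above) in \(SE_0\) and SciPy for a two-sided p-value:

```python
import numpy as np
from scipy import stats

x1, n1 = 60, 150   # hypothetical successes and sample size, group 1
x2, n2 = 45, 160   # hypothetical successes and sample size, group 2

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled estimate of p

se0 = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se0

p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
print(z, p_value)
```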

Test statistic: One Group Proportion

\(z=\frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}\)

\(\widehat{p}\) = sample proportion
\(p_{0}\) = hypothesized population proportion
\(n\) = sample size
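
A sketch of this statistic on made-up counts (assumes SciPy for the two-sided p-value):

```python
import numpy as np
from scipy import stats

x, n = 130, 200   # hypothetical number of successes and sample size
p0 = 0.60         # hypothesized population proportion

p_hat = x / n
z = (p_hat - p0) / np.sqrt(p0 * (1 - p0) / n)

p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
print(z, p_value)
```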

Union
\(P(A\cup B) = P(A)+P(B)-P(A\cap B)\)
Within Groups (Denominator, Error) Degrees of Freedom

\(df_{within}=n-k\)

\(n\) = total sample size with all groups combined

\(k\) = number of groups

y-intercept

\(b_0=\overline {y} - b_1 \overline {x}\)

\(\overline {y}\) = mean of \(y\)
\(\overline {x}\) = mean of \(x\)
\(b_1\) = slope

z Test Statistic: One Group Mean

\(z=\frac{\overline{x}-\mu_0}{\frac{\sigma}{\sqrt{n}}}\)

\(\overline{x}\) = sample mean
\(\mu_{0}\) = hypothesized population mean
\(\sigma\) = population standard deviation
\(n\) = sample size

z-score

\(z=\frac{x - \overline{x}}{s}\)

\(x\) = original data value
\(\overline{x}\) = mean of the original distribution
\(s\) = standard deviation of the original distribution
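
A final numeric sketch on invented data (assumes NumPy and SciPy; scipy.stats.zscore with ddof=1 uses this same definition):

```python
import numpy as np
from scipy import stats

data = np.array([4.0, 8.0, 6.0, 5.0, 3.0, 7.0])   # hypothetical values
x = data[0]

z_manual = (x - data.mean()) / data.std(ddof=1)
z_scipy = stats.zscore(data, ddof=1)[0]

print(z_manual, z_scipy)   # identical
```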