Formulas

95% Confidence Interval: \(sample\;statistic \pm 2 (standard\;error)\)

Between Groups (Numerator) Degrees of Freedom: \(df_{between}=k-1\); \(k\) = number of groups

Binomial Random Variable Probability: \(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}\); \(n\) = number of trials
\(k\) = number of successes
\(p\) = probability event of interest occurs on any one trial
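
As a minimal sketch of the binomial formula above, the probability can be computed with Python's standard library. The function name `binomial_pmf` is a hypothetical helper chosen for illustration, not part of the original formula sheet.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for n trials with success probability p (hypothetical helper)."""
    # comb(n, k) is the binomial coefficient "n choose k"
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: probability of exactly 2 successes in 4 trials with p = 0.5
print(binomial_pmf(2, 4, 0.5))  # 0.375
```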

Chi-Square (\(\chi^2\)) Test Statistic: \(\chi^2=\Sigma \frac{(Observed-Expected)^2}{Expected}\)
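
The chi-square sum can be sketched in a few lines of Python. The function name `chi_square_statistic` and the example counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all cells (hypothetical helper)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative counts only: three categories, each expected to hold 20 observations
print(chi_square_statistic([10, 20, 30], [20, 20, 20]))  # 10.0
```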

Complement of A: \(P(A^{C})=1-P(A)\)

Conditional Probability of A Given B: \(P(A\mid B)=\frac{P(A \: \cap\: B)}{P(B)}\)

Conditional Probability of B Given A: \(P(B\mid A)=\frac{P(A \: \cap\: B)}{P(A)}\)

Confidence Interval for a Population Mean: \(\overline{x} \pm t^{*} \frac{s}{\sqrt{n}}\)

Confidence Interval for the Difference Between Two Paired Means: \(\overline{x}_d \pm t^* \left(\frac{s_d}{\sqrt{n}}\right)\); \(t^*\) is the multiplier with \(df = n-1\)

Confidence Interval for the Difference Between Two Proportions: \((\widehat{p}_1-\widehat{p}_2) \pm z^\ast {\sqrt{\frac{\widehat{p}_1 (1-\widehat{p}_1)}{n_1}+\frac{\widehat{p}_2 (1-\widehat{p}_2)}{n_2}}}\)

Confidence Interval for Two Independent Means: \((\bar{x}_1-\bar{x}_2) \pm t^\ast{ \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\)

Confidence Interval of \(p\): Normal Approximation Method: \(\widehat{p} \pm z^{*} \left ( \sqrt{\frac{\hat{p} (1-\hat{p})}{n}} \right)\)
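
One way to evaluate the normal approximation interval is sketched below. It assumes \(z^*\) is taken from the standard normal distribution via `statistics.NormalDist`; the function name `proportion_ci` and the `confidence` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat: float, n: int, confidence: float = 0.95):
    """Normal approximation interval for a proportion (hypothetical helper)."""
    # z* is the multiplier that leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Illustrative values: sample proportion 0.5 from n = 100
low, high = proportion_ci(0.5, 100)
print(round(low, 3), round(high, 3))  # 0.402 0.598
```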

Confidence Interval of \(\beta_1\): \(b_1 \pm t^\ast (SE_{b_1})\); \(b_1\) = sample slope
\(t^\ast\) = value from the \(t\) distribution with \(df=n-2\)
\(SE_{b_1}\) = standard error of \(b_1\)

Degrees of Freedom: Chi-Square Test of Independence: \(df=(number\;of\;rows-1)(number\;of\;columns-1)\)

Estimated Degrees of Freedom: \(df=smallest\;n-1\)

Expected Cell Value: \(E=\frac{row\;total \; \times \; column\;total}{n}\)
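
The expected cell values for a whole contingency table can be sketched as follows, applying the formula above to every cell. The function name `expected_counts` and the 2×2 table are illustrative assumptions.

```python
def expected_counts(table):
    """Expected value of each cell: row total * column total / n (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Made-up observed counts: rows are groups, columns are outcomes
print(expected_counts([[10, 20], [30, 40]]))  # [[12.0, 18.0], [28.0, 42.0]]
```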

Expected Count: \(Expected\;count=n (p_i)\); \(n\) is the total sample size
\(p_i\) is the hypothesized proportion of the "ith" group

Finding Sample Size for Estimating a Population Proportion: \(n=\left ( \frac{z^*}{M} \right )^2 \tilde{p}(1-\tilde{p})\); \(M\) is the margin of error
\(\tilde p\) is an estimated value of the proportion
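
A sketch of the sample size calculation for a proportion follows. Two assumptions are not part of the formula itself: the result is rounded up to the next whole number, and \(\tilde{p}\) defaults to 0.5 when no prior estimate is available. The function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin: float, p_tilde: float = 0.5,
                               confidence: float = 0.95) -> int:
    """n = (z*/M)^2 * p~(1 - p~), rounded up (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_tilde * (1 - p_tilde)
    return ceil(n)  # assumption: round up so the margin of error is not exceeded

# Illustrative target: margin of error 0.03 at 95% confidence
print(sample_size_for_proportion(0.03))  # 1068
```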

Finding the Sample Size for Estimating a Population Mean: \(n=\frac{z^{2}\widetilde{\sigma}^{2}}{M^{2}}=\left ( \frac{z\widetilde{\sigma}}{M} \right )^2\); \(z\) = z multiplier for given confidence level
\(\widetilde{\sigma}\) = estimated population standard deviation
\(M\) = margin of error

Five Number Summary: (Minimum, \(Q_1\), Median, \(Q_3\), Maximum)

General Form of 95% Confidence Interval: \(sample\ statistic\pm2\ (standard\ error)\)

General Form of a Test Statistic: \(test\;statistic=\frac{sample\;statistic-null\;parameter}{standard\;error}\)

Interquartile Range: \(IQR = Q_3 - Q_1\)
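
The five number summary and the interquartile range can be sketched with `statistics.quantiles`. Quartile conventions differ between textbooks and software, so this example assumes the module's default "exclusive" method, which may not match a hand calculation exactly. The function name `five_number_summary` is hypothetical.

```python
from statistics import quantiles

def five_number_summary(data):
    """(Minimum, Q1, Median, Q3, Maximum) using the default quantile method."""
    q1, median, q3 = quantiles(data, n=4)  # assumption: default "exclusive" method
    return min(data), q1, median, q3, max(data)

# Made-up data values
summary = five_number_summary([1, 2, 3, 4, 5, 6, 7])
iqr = summary[3] - summary[1]  # IQR = Q3 - Q1
print(summary, iqr)  # (1, 2.0, 4.0, 6.0, 7) 4.0
```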

Intersection: \(P(A\cap B) =P(A)\times P(B\mid A)\)

Mean of a Binomial Random Variable: \(\mu=np\)
Also known as \(E(X)\)

Observed Sample Mean Difference: \(\overline{x}_d=\frac{\Sigma{x}_d}{n}\); \(x_d\) = observed difference

Odds

\(odds = \frac {number \;with \;the\; outcome}{number \;without \;the \;outcome}\)

OR

\(odds=\frac{risk}{1-risk}\)
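
The two forms of the odds formula give the same value, as this short sketch shows. The function names and the counts are illustrative assumptions.

```python
def odds_from_counts(with_outcome: int, without_outcome: int) -> float:
    """Odds = number with the outcome / number without the outcome."""
    return with_outcome / without_outcome

def odds_from_risk(risk: float) -> float:
    """Odds = risk / (1 - risk)."""
    return risk / (1 - risk)

# Made-up group: 20 with the outcome, 80 without, so risk = 20 / 100
print(odds_from_counts(20, 80))       # 0.25
print(round(odds_from_risk(0.2), 4))  # 0.25
```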

Pearson's r: Conceptual Formula: \(r=\frac{\sum{z_x z_y}}{n-1}\)
where \(z_x=\frac{x - \overline{x}}{s_x}\) and \(z_y=\frac{y - \overline{y}}{s_y}\)
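
The conceptual formula translates directly into code: convert each value to a z-score, multiply the paired z-scores, and divide the sum by \(n-1\). This is a minimal sketch; the function name `pearson_r` and the data are hypothetical.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Made-up, perfectly linear data, so r should be very close to 1
print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]), 6))  # 1.0
```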

Pooled Estimate of \(p\): \(\widehat{p}=\frac{\widehat{p}_1n_1+\widehat{p}_2n_2}{n_1+n_2}\)

Population Mean: \(\mu=\frac{\Sigma x}{N}\)

Power: \(Power = 1-\beta\); \(\beta\) = probability of committing a Type II Error.

Probability of Event A: \(P(A)=\frac{Number\;in\;group\;A}{Total\;number}\)

Proportion: \(Proportion=\frac{Number\;in\;the\;category}{Total\;number}\)

Range: \(Range = Maximum - Minimum\)

Relative Risk: \(Relative\ Risk=\frac{Risk\ in\ Group\ 1}{Risk\ in\ Group\ 2}\)

Residual: \(e_i =y_i -\widehat{y}_i\); \(y_i\) = actual value of y for the ith observation
\(\widehat{y}_i\) = predicted value of y for the ith observation

Risk: \(Risk= \frac{number \;with \;the\; outcome}{total\;number\;of\;outcomes}\)

Sample Standard Deviation: \(s=\sqrt{\frac{\sum (x-\overline{x})^{2}}{n-1}}\)

Sample Variance: \(s^{2}=\frac{\sum (x-\overline{x})^{2}}{n-1}\)

Simple Linear Regression Line: Population: \(\widehat{y}=\alpha+\beta x\)

Simple Linear Regression Line: Sample: \(\widehat{y}=a+bx\); \(\widehat{y}\) = predicted value of \(y\) for a given value of \(x\)
\(a\) = \(y\)-intercept
\(b\) = slope

Slope: \(b_1 =r \frac{s_y}{s_x}\); \(r\) = Pearson’s correlation coefficient between \(x\) and \(y\)
\(s_y\) = standard deviation of \(y\)
\(s_x\) = standard deviation of \(x\)

Standard Deviation of a Binomial Random Variable: \(\sigma=\sqrt {np(1-p)}\)

Standard Deviation of the Differences: \(s_d=\sqrt{\frac{\sum (x_d-\overline{x}_d)^{2}}{n-1}}\)

Standard Error of the Difference Between Two Independent Means: \(\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\)

Sum of Squared Residuals: Also known as Sum of Squared Errors (SSE)
\(SSE=\sum (y-\widehat{y})^2\)

Sum of Squares: \(SS={\sum (x-\overline{x})^{2}}\)

Test Statistic: \(test\; statistic = \frac{sample \; statistic - null\;parameter}{standard \;error}\)

Test Statistic for Dependent Means: \(t=\frac{\bar{x}_d-\mu_0}{\frac{s_d}{\sqrt{n}}}\); \(\overline{x}_d\) = observed sample mean difference
\(\mu_0\) = mean difference specified in the null hypothesis
\(s_d\) = standard deviation of the differences
\(n\) = sample size (i.e., number of unique individuals)
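
A sketch of the dependent means test statistic follows. It assumes each difference is computed as the second measurement minus the first; the function name `paired_t_statistic` and the measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second, mu_0: float = 0.0) -> float:
    """t = (mean difference - mu_0) / (s_d / sqrt(n)) for paired data."""
    # Assumption: difference = second measurement - first measurement
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)  # number of unique individuals
    return (mean(differences) - mu_0) / (stdev(differences) / sqrt(n))

# Made-up paired measurements for three individuals (differences 1, 2, 3)
print(round(paired_t_statistic([10, 12, 9], [11, 14, 12]), 4))  # 3.4641
```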

Test Statistic for Two Independent Proportions: \(z=\frac{\widehat{p}_1-\widehat{p}_2}{SE_0}\)

Test statistic: One Group Proportion: \(z=\frac{\widehat{p}- p_0 }{\sqrt{\frac{p_0 (1- p_0)}{n}}}\); \(\widehat{p}\) = sample proportion
\(p_{0}\) = hypothesized population proportion
\(n\) = sample size
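
The one group proportion test statistic can be sketched as below. Note that the standard error uses the hypothesized proportion \(p_0\), not \(\widehat{p}\). The function name `one_proportion_z` and the example values are hypothetical.

```python
from math import sqrt

def one_proportion_z(p_hat: float, p_0: float, n: int) -> float:
    """z = (p_hat - p_0) / sqrt(p_0 (1 - p_0) / n)."""
    standard_error = sqrt(p_0 * (1 - p_0) / n)  # computed under the null hypothesis
    return (p_hat - p_0) / standard_error

# Illustrative values: sample proportion 0.6, null value 0.5, n = 100
print(round(one_proportion_z(0.6, 0.5, 100), 4))  # 2.0
```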

Union: \(P(A\cup B) = P(A)+P(B)-P(A\cap B)\)

Within Groups (Denominator, Error) Degrees of Freedom

\(df_{within}=n-k\)

\(n\) = total sample size with all groups combined

\(k\) = number of groups

y-intercept: \(b_0=\overline {y} - b_1 \overline {x}\); \(\overline {y}\) = mean of \(y\)
\(\overline {x}\) = mean of \(x\)
\(b_1\) = slope
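
The slope formula \(b_1 = r \frac{s_y}{s_x}\) and the y-intercept formula can be combined to fit a sample regression line. This is a minimal sketch under those two formulas; the function name `regression_line` and the data are hypothetical.

```python
from statistics import mean, stdev

def regression_line(xs, ys):
    """Return (b_0, b_1) using b_1 = r * s_y / s_x and b_0 = y_bar - b_1 * x_bar."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Pearson's r from the sum of z-score products divided by n - 1
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (len(xs) - 1)
    b_1 = r * s_y / s_x
    b_0 = y_bar - b_1 * x_bar
    return b_0, b_1

# Made-up data lying exactly on y = 1 + 2x
b_0, b_1 = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(round(b_0, 6), round(b_1, 6))  # 1.0 2.0
```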

z Test Statistic: One Group Mean: \(z=\frac{\overline{x}-\mu_0}{\frac{\sigma}{\sqrt{n}}}\); \(\overline{x}\) = sample mean
\(\mu_{0}\) = hypothesized population mean
\(\sigma\) = population standard deviation
\(n\) = sample size

z-score: \(z=\frac{x - \overline{x}}{s}\); \(x\) = original data value
\(\overline{x}\) = mean of the original distribution
\(s\) = standard deviation of the original distribution