7.1.8 - Multivariate Paired Hotelling's T-Square

Example 7-8: Spouse Data (Multivariate Case)

Now let's consider the multivariate case. All scalar observations will be replaced by vectors of observations. As a result, we will use the notation that follows:

Husband:

\(\mathbf{X}_{1i} = \left(\begin{array}{c}X_{1i1}\\ X_{1i2}\\\vdots\\X_{1ip}\end{array}\right)\) = vector of observations for the \(i ^{ th } \text{ husband}\)

\(X _{ 1i1 }\) will denote the response of the \(i ^{ th }\) husband to the first question. \(X _{ 1i2 }\) will denote the response of the \(i ^{ th }\) husband to the second question and so on...

Wife:

\(\mathbf{X}_{2i} = \left(\begin{array}{c}X_{2i1}\\ X_{2i2}\\\vdots\\X_{2ip}\end{array}\right)\) = vector of observations for the \(i ^{ th } \text{ wife}\)

\(X _{ 2i1 }\) will denote the response of the \(i ^{ th }\) wife to the first question. \(X _{ 2i2 }\) will denote the response of the \(i ^{ th }\) wife to the second question and so on...

The scalar population means are replaced by the population mean vectors so that \(\mu _{ 1 }\) = population mean vector for husbands and \(\mu _{ 2 }\) = population mean vector for the wives.

Here we are interested in testing the null hypothesis that the population mean vectors are equal against the general alternative that these mean vectors are not equal.

\(H_0\colon \boldsymbol{\mu_1} = \boldsymbol{\mu_2}\) against \(H_a\colon \boldsymbol{\mu_1} \ne \boldsymbol{\mu_2}\)

Under the null hypothesis, the two mean vectors are equal element by element. As in the univariate paired case, we are going to look at the differences between the observations. We will define the vector \(\mathbf{Y}_i\) for the \(i ^{ th }\) couple to be the vector \(\mathbf{X}_{1i}\) for the \(i ^{ th }\) husband minus the vector \(\mathbf{X}_{2i}\) for the \(i ^{ th }\) wife. Likewise, we will define the vector \(\boldsymbol{\mu}_Y\) to be the difference between the vector \(\boldsymbol{\mu}_1\) and the vector \(\boldsymbol{\mu}_2\).

\(\mathbf{Y}_i = \mathbf{X}_{1i}-\mathbf{X}_{2i}\) and \(\boldsymbol{\mu}_Y = \boldsymbol{\mu}_1-\boldsymbol{\mu}_2\)
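The differencing step above is straightforward to carry out in software. A minimal sketch in Python with NumPy, using hypothetical response data (the array values below are made up for illustration, not taken from the spouse study):

```python
import numpy as np

# Hypothetical data: n = 4 couples, p = 2 questions per spouse.
# Each row is one person's vector of responses.
husbands = np.array([[4, 5], [3, 4], [5, 5], [4, 3]], dtype=float)
wives    = np.array([[4, 4], [4, 5], [5, 4], [3, 3]], dtype=float)

# Paired differences Y_i = X_1i - X_2i, one row per couple.
Y = husbands - wives
```

Because the data are paired, the subtraction is done row by row, so each row of `Y` is one couple's difference vector.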

Testing the above null hypothesis is equivalent to testing the null hypothesis that the population mean vector \(\boldsymbol{\mu}_Y\) is equal to \(\mathbf{0}\); that is, all of its elements are equal to 0. This is tested against the alternative that \(\boldsymbol{\mu}_Y\) is not equal to \(\mathbf{0}\); that is, at least one of the elements is not equal to 0.

\(H_0\colon \boldsymbol{\mu}_Y = \mathbf{0}\) against \(H_a\colon \boldsymbol{\mu}_Y \ne \mathbf{0}\)

This hypothesis is tested using the paired Hotelling's \(T^{ 2 }\) test.

As before, we will define \(\mathbf{\bar{Y}}\) to denote the sample mean vector of the vectors \(Y_{ i }\).

\(\mathbf{\bar{Y}} = \dfrac{1}{n}\sum_{i=1}^{n}\mathbf{Y}_i\)

And, we will define \(S _{ Y }\) to denote the sample variance-covariance matrix of the vectors \(Y_{ i }\).

\(\mathbf{S}_Y = \dfrac{1}{n-1}\sum_{i=1}^{n}\mathbf{(Y_i-\bar{Y})(Y_i-\bar{Y})'}\)
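The two summary statistics above can be computed directly from the matrix of difference vectors. A minimal sketch in Python with NumPy, using hypothetical difference vectors:

```python
import numpy as np

# Hypothetical difference vectors Y_i (n = 4 couples, p = 2 questions).
Y = np.array([[0.0, 1.0], [-1.0, -1.0], [0.0, 1.0], [1.0, 0.0]])
n = Y.shape[0]

# Sample mean vector: Ybar = (1/n) * sum_i Y_i
Ybar = Y.mean(axis=0)

# Sample variance-covariance matrix with the 1/(n-1) divisor, matching
# the formula above. np.cov treats rows as variables by default, so
# pass rowvar=False to treat each row as one observation vector.
S_Y = np.cov(Y, rowvar=False)
```

Note that `np.cov` uses the \(n-1\) divisor by default, so it agrees term by term with the definition of \(\mathbf{S}_Y\) given above.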

The assumptions are similar to the assumptions made for the one-sample Hotelling's T-square test:

  1. The vectors \(\mathbf{Y}_i\) have a common population mean vector \(\boldsymbol{\mu}_Y\), which essentially means that there are no sub-populations with different mean vectors.
  2. The vectors \(\mathbf{Y}_i\) have a common variance-covariance matrix \(\Sigma_Y\).
  3. Independence. The \(\mathbf{Y}_i\)'s are independently sampled; in this case, independence among the couples in the study.
  4. Normality. The \(\mathbf{Y}_i\)'s are multivariate normally distributed.

The paired Hotelling's \(T^2\) test statistic is given by the expression below:

\(T^2 = n\bar{\mathbf{Y}}'\mathbf{S}_Y^{-1}\mathbf{\bar{Y}}\)

It is a function of the sample size \(n\), the sample mean vector \(\mathbf{\bar{Y}}\), and the inverse of the sample variance-covariance matrix \(\mathbf{S} _{ Y }\).

Then we will define an F-statistic as given in the expression below:

\(F = \dfrac{n-p}{p(n-1)}T^2 \sim F_{p, n-p}\)

Under the null hypothesis, \(H_0 \colon \boldsymbol{\mu}_Y = \mathbf{0}\), this statistic has an F-distribution with \(p\) and \(n-p\) degrees of freedom. We will reject \(H_0\) at level \(\alpha\) if the F-statistic exceeds the critical value of the F-distribution with \(p\) and \(n-p\) degrees of freedom, evaluated at level \(\alpha\):

\(F > F_{p,n-p,\alpha}\)
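Putting the pieces together, the full test can be sketched in a few lines of Python using NumPy and SciPy. The difference vectors below are hypothetical, chosen only to make the sketch self-contained:

```python
import numpy as np
from scipy import stats

# Hypothetical difference vectors Y_i (n = 6 couples, p = 2 questions).
Y = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [-1.0, 0.0], [2.0, 1.0], [1.0, -1.0]])
n, p = Y.shape

Ybar = Y.mean(axis=0)               # sample mean vector
S_Y = np.cov(Y, rowvar=False)       # sample variance-covariance matrix

# T^2 = n * Ybar' S_Y^{-1} Ybar; solve() avoids forming the inverse.
T2 = n * Ybar @ np.linalg.solve(S_Y, Ybar)

# Convert to an F-statistic with p and n - p degrees of freedom.
F = (n - p) / (p * (n - 1)) * T2
p_value = stats.f.sf(F, p, n - p)

# Reject H0 at level alpha if F exceeds the critical value.
alpha = 0.05
reject = F > stats.f.ppf(1 - alpha, p, n - p)
```

Equivalently, one can compare `p_value` to \(\alpha\) directly; the two decision rules always agree.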

Let us return to our example...
