Let us revisit the original hypotheses of interest:
\(H_0\colon \boldsymbol{\mu} = \boldsymbol{\mu_0}\) against \(H_a\colon \boldsymbol{\mu} \ne \boldsymbol{\mu_0}\)
Note! It is equivalent to testing this null hypothesis:
\(H_0\colon \dfrac{\mu_1}{\mu^0_1} = \dfrac{\mu_2}{\mu^0_2} = \dots = \dfrac{\mu_p}{\mu^0_p} = 1\)
against the alternative that at least one of these ratios is not equal to 1:
\(H_a\colon \dfrac{\mu_j}{\mu^0_j} \ne 1\) for at least one \(j \in \{1,2,\dots, p\}\)
Instead of testing the null hypothesis that the ratios of the means over their hypothesized values are all equal to 1, profile analysis tests the null hypothesis that all of these ratios are equal to one another, but not necessarily equal to 1.
After rejecting
\(H_0\colon \dfrac{\mu_1}{\mu^0_1} = \dfrac{\mu_2}{\mu^0_2} = \dots = \dfrac{\mu_p}{\mu^0_p} = 1\)
we may wish to test
\(H_0\colon \dfrac{\mu_1}{\mu^0_1} = \dfrac{\mu_2}{\mu^0_2} = \dots = \dfrac{\mu_p}{\mu^0_p}\)
Profile analysis can be carried out using the following procedure.
Step 1: Compute the differences between successive ratios. That is, take the ratio of the (j+1)-th variable over its hypothesized mean and subtract from it the ratio of the j-th variable over its hypothesized mean, as shown below:
\(D_{ij} = \dfrac{X_{i,j+1}}{\mu^0_{j+1}}-\dfrac{X_{ij}}{\mu^0_j}\)
We call this difference \(D_{ij}\) for observation \(i\), where \(j = 1, 2, \dots, p-1\).
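Step 1 can be sketched in a few lines of Python (used here only for illustration; the lesson itself works in SAS and Minitab). The function name `successive_ratio_differences` and the sample observation are assumptions made up for this sketch, while the hypothesized means are the ones used later in this lesson.

```python
def successive_ratio_differences(row, mu0):
    """Return D_j = x_{j+1}/mu0_{j+1} - x_j/mu0_j for j = 1, ..., p-1."""
    if len(row) != len(mu0):
        raise ValueError("row and mu0 must have the same length")
    # Scale each variable by its hypothesized mean to get unit-free ratios.
    ratios = [x / m for x, m in zip(row, mu0)]
    # Difference each ratio from the one that follows it.
    return [ratios[j + 1] - ratios[j] for j in range(len(ratios) - 1)]


# Hypothesized means: calcium, iron, protein, vitamin A, vitamin C.
mu0 = [1000, 15, 60, 800, 75]

# Hypothetical observation (not from the data set): every intake is exactly
# half of its hypothesized mean, so all five ratios are 0.5.
row = [500, 7.5, 30, 400, 37.5]
diffs = successive_ratio_differences(row, mu0)
print(diffs)
```

Because all five ratios coincide for this made-up observation, each of the four differences is zero, even though none of the ratios equals 1.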
Note! Testing the null hypothesis that all of the ratios are equal to one another
\(H_0\colon \dfrac{\mu_1}{\mu^0_1} = \dfrac{\mu_2}{\mu^0_2} = \dots = \dfrac{\mu_p}{\mu^0_p}\)
is equivalent to testing the null hypothesis that all of the mean differences are equal to 0:
\(H_0\colon \boldsymbol{\mu}_D = \mathbf{0}\)
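To see why the two statements agree, take the expectation of each difference defined in Step 1 (the symbol \(\mu_{D_j}\) for the mean of \(D_{ij}\) is notation assumed here for illustration):

\(\mu_{D_j} = E(D_{ij}) = \dfrac{\mu_{j+1}}{\mu^0_{j+1}} - \dfrac{\mu_j}{\mu^0_j}, \quad j = 1, 2, \dots, p-1\)

Every \(\mu_{D_j}\) is 0 exactly when each ratio equals the one after it, and chaining these equalities gives equality of all \(p\) ratios. Note that \(\boldsymbol{\mu}_D\) has \(p-1\) components, one fewer than the number of original variables.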
Step 2: Apply Hotelling’s \(T^{2}\) test to the data \(D_{ij}\) to test the null hypothesis that the mean vector of these differences is equal to \(\mathbf{0}\):
\(H_0\colon \boldsymbol{\mu}_D = \mathbf{0}\)
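For reference, the statistic in question is the usual one-sample Hotelling's \(T^2\) with null value \(\mathbf{0}\). In the sketch below, \(\bar{\mathbf{D}}\) and \(\mathbf{S}_D\) stand for the sample mean vector and sample covariance matrix of the \(p-1\) differences, and \(n\) for the sample size; these symbols are assumed here and do not appear elsewhere in this lesson.

\(T^2 = n\,\bar{\mathbf{D}}'\,\mathbf{S}_D^{-1}\,\bar{\mathbf{D}}\)

\(F = \dfrac{n-(p-1)}{(p-1)(n-1)}\,T^2 \sim F_{p-1,\,n-p+1}\) under \(H_0\)

The \(F\) distribution is the standard result assuming the differences are multivariate normal. With four differences, \(p-1 = 4\) is the number of columns entering the test.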
The test is carried out using the SAS program shown below.
Download the SAS program here: nutrient8.sas
options ls=78;
title "Profile Analysis - Women's Nutrition Data";
/* After reading in the data, each of the original
* variables is divided by its null value to convert
* to a common scale without specific units.
* The differences are then defined for each successive
* pair of variables.
*/
data nutrient;
infile "D:\Statistics\STAT 505\data\nutrient.csv" firstobs=2 delimiter=',';
input id calcium iron protein a c;
calcium=calcium/1000;
iron=iron/15;
protein=protein/60;
a=a/800;
c=c/75;
diff1=iron-calcium;
diff2=protein-iron;
diff3=a-protein;
diff4=c-a;
run;
/* The IML code to compute the Hotelling T2 statistic
* is similar to that for the one-sample T2 statistic
* except that by working with differences of the scaled
* variables, the null values are all 0s, and the
* corresponding null hypothesis is that all scaled
* variable means are equal to each other.
*/
proc iml;
start hotel;
mu0={0,0,0,0};
one=j(nrow(x),1,1);
ident=i(nrow(x));
ybar=x`*one/nrow(x);
s=x`*(ident-one*one`/nrow(x))*x/(nrow(x)-1.0);
print mu0 ybar;
print s;
t2=nrow(x)*(ybar-mu0)`*inv(s)*(ybar-mu0);
f=(nrow(x)-ncol(x))*t2/ncol(x)/(nrow(x)-1);
df1=ncol(x);
df2=nrow(x)-ncol(x);
p=1-probf(f,df1,df2);
print t2 f df1 df2 p;
finish;
use nutrient;
read all var{diff1 diff2 diff3 diff4} into x;
run hotel;
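For readers without SAS, the computation in the `hotel` module can be sketched in plain Python. This is a minimal illustration, not a translation supplied with the course: the function name `hotelling_t2_zero_mean` and the toy data are assumptions, and the linear system is solved by Gauss-Jordan elimination rather than by forming `inv(s)` as the IML code does.

```python
def hotelling_t2_zero_mean(rows):
    """One-sample Hotelling's T^2 for H0: mean vector = 0.

    rows: list of observations, each a list of q numbers (here the
    q = p - 1 successive ratio differences).
    Returns (t2, f, df1, df2).
    """
    n, q = len(rows), len(rows[0])
    if n <= q:
        raise ValueError("need more observations than variables")

    # Sample mean vector (ybar in the SAS code).
    mean = [sum(r[j] for r in rows) / n for j in range(q)]

    # Sample covariance matrix with divisor n - 1 (s in the SAS code).
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
            for b in range(q)] for a in range(q)]

    # Solve cov * w = mean by Gauss-Jordan elimination with partial pivoting.
    aug = [cov[i][:] + [mean[i]] for i in range(q)]
    for col in range(q):
        pivot = max(range(col, q), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(q):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                for k in range(col, q + 1):
                    aug[i][k] -= factor * aug[col][k]
    w = [aug[i][q] / aug[i][i] for i in range(q)]

    # T^2 = n * mean' * inv(cov) * mean, then the same F conversion as in SAS.
    t2 = n * sum(mean[i] * w[i] for i in range(q))
    f = (n - q) * t2 / (q * (n - 1))
    return t2, f, q, n - q


# Toy data (made up, not the nutrient data): three observations, two differences.
toy = [[1, 2], [2, 1], [3, 3]]
t2, f, df1, df2 = hotelling_t2_zero_mean(toy)
print(t2, f, df1, df2)
```

For the toy data, the calculation works out by hand to \(T^2 = 16\) and \(F = 4\) on 2 and 1 degrees of freedom, up to floating-point rounding. The p-value is omitted because the standard library has no F distribution; if SciPy is available, `scipy.stats.f.sf(f, df1, df2)` gives the upper-tail probability.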
Profile analysis
To test the hypothesis of equal mean ratios:
1. Open the 'nutrient' data set in a new worksheet.
2. Name the columns id, calcium, iron, protein, vit_A, and vit_C, from left to right.
3. Name new columns diff1, diff2, diff3, and diff4.
4. Calc > Calculator
    a. Highlight and select 'diff1' to move it to the 'Store result' window.
    b. In the 'Expression' window, enter 'iron' / 15 - 'calcium' / 1000, where the values 15 and 1000 come from the null values of interest for iron and calcium, respectively.
    c. Choose 'OK'. The difference between the iron and calcium ratios is displayed in the worksheet variable diff1.
5. Calc > Calculator
    a. Highlight and select 'diff2' to move it to the 'Store result' window.
    b. In the 'Expression' window, enter 'protein' / 60 - 'iron' / 15, where the values 60 and 15 come from the null values of interest for protein and iron, respectively.
    c. Choose 'OK'. The difference between the protein and iron ratios is displayed in the worksheet variable diff2.
6. Repeat step 5 for the difference between the vitamin A ratio and the protein ratio, where each ratio is obtained by dividing the original variable by its null value.
7. Repeat step 5 again for the last difference, which is between the vitamin C ratio and the vitamin A ratio.
8. Stat > Basic Statistics > Store Descriptive Statistics
    a. Highlight and select all 4 difference variables (diff1 through diff4) to move them to the 'Variables' window.
    b. Under Statistics, choose 'Mean', and then 'OK'.
    c. Choose 'OK' again. The 4 difference means are displayed in four new columns in the worksheet.
9. Data > Transpose Columns
    a. Highlight and select the 4 column names with the means from the step above.
    b. Choose 'After last column in use' then 'OK'. The means are displayed in a single column in the worksheet.
10. Name the new column of the numeric means 'means' for the remaining steps.
11. Data > Copy > Columns to Matrix
    a. Highlight and select 'means' for the 'Copy from columns' window.
    b. Enter 'M1' in the 'In current worksheet' window. Then choose 'OK'.
12. Calc > Matrices > Transpose
    a. Highlight and select 'M1' in the 'Transpose from' window and enter 'M2' in the 'Store result' window.
    b. Choose 'OK'.
13. Stat > Basic Statistics > Covariance
    a. Highlight and select all 4 difference variables (diff1 through diff4) to move them to the 'Variables' window.
    b. Check the box to Store matrix and then choose 'OK'. This will store the covariance matrix in a new variable 'M3'.
14. Calc > Matrices > Invert
    a. Highlight and select 'M3' to move it to the 'Invert from' window.
    b. Enter 'M4' in the 'Store result in' window and choose 'OK'. This will store the inverted covariance matrix in a new variable 'M4'.
15. Calc > Matrices > Arithmetic
    a. Choose Multiply and enter M2 and M4, respectively, in the two windows.
    b. Enter 'M5' as the name in the 'Store result' window and then 'OK'.
16. Calc > Matrices > Arithmetic
    a. Choose Multiply and enter M5 and M1, respectively, in the two windows.
    b. Enter 'M6' as the name in the 'Store result' window and then 'OK'. The answer 1.39864 is displayed in the results window.
17. Calc > Calculator
    a. Enter C16 (or any unused column name) in the 'Store result' window.
    b. In the 'Expression' window, enter 1.39864 * 737 (this is the answer from 16b times the sample size). Choose 'OK'. The value of the \(T^{2}\) statistic is displayed in the worksheet under 'C16'.
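The arithmetic of the last Minitab step can be checked with a short Python sketch. The Minitab steps stop at \(T^2\); the conversion to an F statistic below borrows the formula from the SAS program above, and the variable names are assumptions chosen for this illustration.

```python
n = 737              # sample size used in the final Calculator step
q = 4                # number of difference variables (diff1 through diff4)
quad_form = 1.39864  # value of M6 reported in the results window

t2 = n * quad_form                  # Hotelling's T^2
f = (n - q) * t2 / (q * (n - 1))    # F statistic, same formula as the SAS code
df1, df2 = q, n - q                 # numerator and denominator degrees of freedom

print(round(t2, 2), round(f, 2), df1, df2)
```

Because 1.39864 is a rounded value, the result agrees with the SAS output only to about the second decimal place.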
Analysis
The results yield a Hotelling's \(T^{2}\) of 1030.7953 and an F value of 256.64843 with 4 and 733 degrees of freedom, and a p-value very close to 0.
MU0 | YBAR |
---|---|
0 | 0.1179441 |
0 | 0.3547307 |
0 | -0.04718 |
0 | 0.0028351 |
S | | | |
---|---|---|---|
0.191621 | -0.071018 | 0.0451147 | -0.037562 |
-0.071018 | 0.1653837 | -0.178844 | 0.029556 |
0.0451147 | -0.178844 | 4.123726 | -3.755071 |
-0.037562 | 0.029556 | -3.755071 | -3.755071 |
T2 | F | DF1 | DF2 | P |
---|---|---|---|---|
1030.7953 | 256.64843 | 4 | 733 | 0 |
Example 7-14: Women’s Health Survey
Here we can reject the null hypothesis that the ratio of mean intake to recommended intake is the same for each nutrient, based on the following evidence:
\(T^{2} = 1030.80;\ F = 256.65;\ \mathrm{d.f.} = 4, 733;\ p < 0.0001\)
This null hypothesis could have been true if, for example, all the women were taking in nutrients in their required ratios but were eating either too little or too much.
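As an illustration (a hypothetical case, not a finding from the survey), suppose every mean intake were 80% of its recommended value:

\(\mu_j = 0.8\,\mu^0_j \text{ for all } j \quad\Longrightarrow\quad \dfrac{\mu_1}{\mu^0_1} = \dots = \dfrac{\mu_p}{\mu^0_p} = 0.8 \quad\Longrightarrow\quad \boldsymbol{\mu}_D = \mathbf{0}\)

The profile null hypothesis would then hold even though the original null hypothesis, that all ratios equal 1, would be false.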