options ls=78;
title "Profile Plot - Spouse Data";

/* The %let statement defines the macro variable p (the number of
 * variables), which is referenced as &p throughout the code below.
 * After reading in the spouse data, where each variable is
 * originally in its own column, the data step computes the
 * husband-minus-wife difference for each variable and stacks the
 * data so that all group labels (1 through 4) are in one column
 * called 'variable' and all differences are in another column
 * called 'diff'. This format is used for the calculations that follow.
 */

%let p=4;
data spouse;
  infile "D:\Statistics\STAT 505\data\spouse.csv" firstobs=2 delimiter=',';
  input h1 h2 h3 h4 w1 w2 w3 w4;
  variable=1; diff=h1-w1; output;
  variable=2; diff=h2-w2; output;
  variable=3; diff=h3-w3; output;
  variable=4; diff=h4-w4; output;
  drop h1 h2 h3 h4 w1 w2 w3 w4;
  run;

proc sort data=spouse;
  by variable;
  run;

/* The means procedure calculates the sample size, mean, and
 * variance of the differences for each variable. It saves these
 * results in a new data set 'a' for use in the steps below.
 */

proc means data=spouse;
  by variable;
  var diff;
  output out=a n=n mean=xbar var=s2;
  run;

/* This data step calculates the simultaneous 95% confidence
 * intervals based on the F-multiplier, using the statistics
 * saved in the data set 'a'. Each variable contributes three
 * rows to 'b': the mean, the lower limit, and the upper limit.
 */

data b;
  set a;
  f=finv(0.95,&p,n-&p);
  diff=xbar; output;
  diff=xbar-sqrt(&p*(n-1)*f*s2/(n-&p)/n); output;
  diff=xbar+sqrt(&p*(n-1)*f*s2/(n-&p)/n); output;
  run;

/* The axis statements define the size of the plotting window.
 * The horizontal axis shows the variables, and the vertical
 * axis shows the confidence limits.
 * The reference line at 0 corresponds to the null value of the
 * difference for each variable.
 */

proc gplot data=b;
  axis1 length=4 in;
  axis2 length=6 in;
  plot diff*variable / vaxis=axis1 haxis=axis2 vref=0 lvref=21;
  symbol v=none i=hilot color=black;
  run;
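
/* A minimal sketch, not part of the original program: it lists the
 * same simultaneous 95% limits numerically, one row per variable,
 * as a check on the plot. It assumes the data set 'a' and the macro
 * variable p defined above; the names 'c', 'halfwidth', 'lower',
 * and 'upper' are hypothetical and chosen only for illustration.
 * An interval that excludes 0 corresponds to a plotted bar that
 * does not cross the reference line.
 */

data c;
  set a;
  f=finv(0.95,&p,n-&p);                    /* same F critical value as in 'b' */
  halfwidth=sqrt(&p*(n-1)*f*s2/(n-&p)/n);  /* half-width of each interval */
  lower=xbar-halfwidth;
  upper=xbar+halfwidth;
  keep variable n xbar lower upper;
  run;

proc print data=c noobs;
  var variable n xbar lower upper;
  run;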