options ls=78;
title "Confidence Intervals - Spouse Data";

/* %let defines the macro variable p (the number of difference
 * variables), which is referenced as &p throughout the code below.
 * After reading in the spouse data, where each variable is
 * originally in its own column, the next statements define difference
 * variables between husbands and wives, and they stack the data
 * so that all group labels (1 through 4) are in one column called 'variable',
 * and all differences are in another column called 'diff'.
 * This format is used for the calculations that follow.
 */

%let p=4;

data spouse;
  infile "D:\Statistics\STAT 505\data\spouse.csv" firstobs=2 delimiter=',';
  input h1 h2 h3 h4 w1 w2 w3 w4;
  variable=1; diff=h1-w1; output;
  variable=2; diff=h2-w2; output;
  variable=3; diff=h3-w3; output;
  variable=4; diff=h4-w4; output;
  drop h1 h2 h3 h4 w1 w2 w3 w4;
run;

/* Sort by 'variable' so that the by-group processing below works. */

proc sort data=spouse;
  by variable;
run;

/* The means procedure calculates the sample size, mean, and
 * variance of 'diff' for each variable. It saves these results
 * in a new data set 'a' for use in the final step below.
 */

proc means data=spouse noprint;
  by variable;
  var diff;
  output out=a n=n mean=xbar var=s2;
run;

/* The data step here is used to calculate the confidence interval
 * limits from the statistics saved in the data set 'a'.
 * The values 't' and 'f' are the critical values used in the
 * Bonferroni and F (simultaneous) intervals, respectively:
 * 'f' is the 0.95 quantile of the F distribution with p and n-p
 * degrees of freedom, and 't' is the 1-0.025/p quantile of the
 * t distribution with n-1 degrees of freedom.
 * 'losim' and 'upsim' are the simultaneous limits;
 * 'lobon' and 'upbon' are the Bonferroni limits.
 */

data b;
  set a;
  f=finv(0.95,&p,n-&p);
  t=tinv(1-0.025/&p,n-1);
  losim=xbar-sqrt(&p*(n-1)*f*s2/(n-&p)/n);
  upsim=xbar+sqrt(&p*(n-1)*f*s2/(n-&p)/n);
  lobon=xbar-t*sqrt(s2/n);
  upbon=xbar+t*sqrt(s2/n);
run;

proc print data=b;
run;
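
/* Optional follow-up, offered as a sketch and not part of the
 * original program: one way to compare the two interval types is to
 * look at their half-widths side by side. This assumes the data set
 * 'b' created above. The data set name 'c' and the variables
 * 'halfsim', 'halfbon', and 'ratio' are hypothetical names
 * introduced here only for illustration.
 * A 'ratio' below 1 means the Bonferroni interval is the narrower
 * of the two for that variable.
 */

data c;
  set b;
  halfsim=(upsim-losim)/2;   /* half-width of the simultaneous (F) interval */
  halfbon=(upbon-lobon)/2;   /* half-width of the Bonferroni interval */
  ratio=halfbon/halfsim;     /* relative width, Bonferroni vs. simultaneous */
  keep variable halfsim halfbon ratio;
run;

proc print data=c;
run;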