19.5 - Array Bounds

Each of the arrays we've considered thus far has been defined, by default, to have a lower bound of 1 and an upper bound equal to the number of elements in the array's dimension. For example, the array pennstate:

ARRAY pennstate(4) nittany lions happy valley;

has a lower bound of 1 and an upper bound of 4. In this section, we'll look at three examples that concern the bounds of an array. In the first example, we'll use the DIM function to set the upper bound of a DO loop's index variable dynamically (rather than stating it in advance). In the second example, we'll specify the lower and upper bounds of a one-dimensional array explicitly to create a bounded array. In the third example, we'll use the LBOUND and HBOUND functions to set both the lower and upper bounds of a DO loop's index variable dynamically.

Example 19.12

The following program reads the yes/no responses of five subjects to six survey questions (q1, q2, ..., q6) into a temporary SAS data set called survey. A yes response is coded and entered as a 2, while a no response is coded and entered as a 1. Just four of the variables (q3, q4, q5, and q6) are stored in a one-dimensional array called qxs. Then, a DO loop, in conjunction with the DIM function, is used to recode the responses to the four variables so that a 2 is changed to a 1, and a 1 is changed to a 0:

DATA survey (DROP = i);
	INPUT subj q1 q2 q3 q4 q5 q6;
	ARRAY qxs(4) q3-q6;
	DO i = 1 to dim(qxs);
		qxs(i) = qxs(i) - 1;
	END;
	DATALINES;
	1001 1 2 1 2 1 1
	1002 2 1 2 2 2 1
	1003 2 2 2 1 . 2
	1004 1 . 1 1 1 2
	1005 2 1 2 2 2 1
	;
RUN;
 
PROC PRINT data = survey;
	TITLE 'The survey data using dim function';
RUN;

The survey data using dim function
Obs	subj	q1	q2	q3	q4	q5	q6
1	1001	1	2	0	1	0	0
2	1002	2	1	1	1	1	0
3	1003	2	2	1	0	.	1
4	1004	1	.	0	0	0	1
5	1005	2	1	1	1	1	0

First, note that although all of the survey variables (q1, ..., q6) are read into the survey data set, the ARRAY statement groups only four of the variables (q3, q4, q5, q6) into the one-dimensional array qxs. For example, qxs(1) corresponds to the q3 variable, qxs(2) corresponds to the q4 variable, and so on. Then, rather than telling SAS to process the array from element 1 to element 4, the DO loop tells SAS to process the array from element 1 to the more general DIM(qxs). The DIM function returns the number of elements in a dimension of the array, which in this case is 4. Inside the loop, the values are recoded by simply subtracting 1 from each value. Finally, because the index variable i would otherwise be written to the survey data set by default, the DROP= data set option is used to exclude it.

Now, launch and run the SAS program. Then, review the output from the PRINT procedure to convince yourself that the program does indeed recode the four variables q3, q4, q5, and q6 as described.
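
To see why DIM is handy, here is a minimal sketch (not part of the original lesson) in which the array is widened to hold all six survey variables. The data set name survey_all is our own placeholder, and only two of the data lines are repeated to keep the sketch short. Because the DO loop stops at DIM(qxs), which is now 6, the loop itself needs no editing:

DATA survey_all (DROP = i);
	INPUT subj q1 q2 q3 q4 q5 q6;
	/* the array now groups six variables instead of four */
	ARRAY qxs(6) q1-q6;
	/* dim(qxs) returns 6 here, so the loop bound adjusts on its own */
	DO i = 1 to dim(qxs);
		qxs(i) = qxs(i) - 1;
	END;
	DATALINES;
1001 1 2 1 2 1 1
1002 2 1 2 2 2 1
;
RUN;

PROC PRINT data = survey_all;
	TITLE 'All six survey variables recoded';
RUN;

Had the loop instead been written as DO i = 1 to 4, the q5 and q6 variables would have been left unrecoded without any warning.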

Example 19.13

As previously discussed and illustrated, if you do not specifically tell SAS the lower bound of an array, SAS assumes that the lower bound is 1. For most arrays, 1 is a convenient lower bound and the number of elements is a convenient upper bound, so you usually don't need to specify both the lower and upper bounds. However, in cases where it is more convenient, you can modify both bounds for any array dimension.

In the previous example, perhaps you find it a little awkward that the array element qxs(1) corresponds to the q3 variable, the array element qxs(2) corresponds to the q4 variable, and so on. Perhaps you would find it clearer for the array element qxs(3) to correspond to the q3 variable, the array element qxs(4) to correspond to the q4 variable, ..., and the array element qxs(6) to correspond to the q6 variable. The following program is similar in function to the previous program, except here the task of recoding is accomplished by defining the lower bound of the qxs array to be 3 and the upper bound to be 6:

DATA survey2 (DROP = i);
	INPUT subj q1 q2 q3 q4 q5 q6;
	ARRAY qxs(3:6) q3-q6;
	DO i = 3 to 6;
		qxs(i) = qxs(i) - 1;
	END;
	DATALINES;
	1001 1 2 1 2 1 1
	1002 2 1 2 2 2 1
	1003 2 2 2 1 . 2
	1004 1 . 1 1 1 2
	1005 2 1 2 2 2 1
	;
RUN;
 
PROC PRINT data = survey2;
	TITLE 'The survey data using bounded arrays';
RUN;

The survey data using bounded arrays
Obs	subj	q1	q2	q3	q4	q5	q6
1	1001	1	2	0	1	0	0
2	1002	2	1	1	1	1	0
3	1003	2	2	1	0	.	1
4	1004	1	.	0	0	0	1
5	1005	2	1	1	1	1	0

If you compare this program with the previous program, you'll see that only two things differ. The first difference is that the ARRAY statement here defines the lower bound of the qxs array to be 3 and the upper bound to be 6. In general, you can always define the lower and upper bounds of any array dimension in this way, namely by specifying the lower bound, then a colon (:), and then the upper bound. The second difference is that, for the DO loop, the index variable i is explicitly told to run from 3 to 6 rather than from 1 to DIM(qxs) (which in this case is still 4).

Now, launch and run the SAS program. Then, review the output from the PRINT procedure to convince yourself that the program does indeed recode the four variables q3, q4, q5, and q6 just as in the previous program.
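
It may help to see what the bound-related functions report for a bounded array. The following is a small sketch of our own (the variable names n, lo, and hi are placeholders) that writes the values to the log rather than creating a data set:

DATA _NULL_;
	ARRAY qxs(3:6) q3-q6;
	n  = dim(qxs);     /* number of elements in the array */
	lo = lbound(qxs);  /* lower bound of the array */
	hi = hbound(qxs);  /* upper bound of the array */
	PUT n= lo= hi=;    /* write the three values to the log */
RUN;

The log should show n=4 lo=3 hi=6. This is why the DO loop in this example cannot simply run from 1 to DIM(qxs): the references qxs(1) and qxs(2) would fall outside the declared bounds of 3 to 6, and SAS would stop with an array subscript out of range error.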

Example 19.14

Now, there's still a little bit more that we can do to automate the handling of the bounds of an array dimension. The following program again uses a one-dimensional array qxs to recode four survey variables as did the previous two programs. Here, though, an asterisk (*) is used to tell SAS to determine the dimension of the qxs array, and the LBOUND and HBOUND functions are used to tell SAS to determine, respectively, the lower and upper bounds of the DO loop's index variable dynamically:

DATA survey3 (DROP = i);
	INPUT subj q1 q2 q3 q4 q5 q6;
	ARRAY qxs(*) q3-q6;
	DO i = lbound(qxs) to hbound(qxs);
		qxs(i) = qxs(i) - 1;
	END;
	DATALINES;
	1001 1 2 1 2 1 1
	1002 2 1 2 2 2 1
	1003 2 2 2 1 . 2
	1004 1 . 1 1 1 2
	1005 2 1 2 2 2 1
	;
RUN;
 
PROC PRINT data = survey3;
	TITLE 'The survey data by changing upper and lower bounds automatically';
RUN;

The survey data by changing upper and lower bounds automatically
Obs	subj	q1	q2	q3	q4	q5	q6
1	1001	1	2	0	1	0	0
2	1002	2	1	1	1	1	0
3	1003	2	2	1	0	.	1
4	1004	1	.	0	0	0	1
5	1005	2	1	1	1	1	0

If you compare this program with the previous program, you'll see that only two things differ. The first difference is that the asterisk (*) that appears in the ARRAY statement tells SAS to determine the bounds of the array's dimension itself when qxs is declared. SAS counts the number of elements in the array and determines that the dimension of qxs is 4, so the lower bound is the default of 1 and the upper bound is 4. The second difference is that, for the DO loop, the bounds on the index variable i are determined dynamically to run from LBOUND(qxs), which is 1 here, to HBOUND(qxs), which is 4 here.

Now, launch and run the SAS program. Then, review the output from the PRINT procedure to convince yourself that the program does indeed recode the four variables q3, q4, q5, and q6 just as in the previous two programs.
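
The LBOUND and HBOUND functions are most useful when the lower bound is something other than 1. As a hedged sketch of our own, the following program combines the bounded array of Example 19.13 with the dynamic loop bounds of this example. The data set name survey4 is a placeholder that does not appear elsewhere in the lesson:

DATA survey4 (DROP = i);
	INPUT subj q1 q2 q3 q4 q5 q6;
	/* explicit bounds, so that qxs(3) is q3, ..., qxs(6) is q6 */
	ARRAY qxs(3:6) q3-q6;
	/* lbound(qxs) returns 3 and hbound(qxs) returns 6 */
	DO i = lbound(qxs) to hbound(qxs);
		qxs(i) = qxs(i) - 1;
	END;
	DATALINES;
1001 1 2 1 2 1 1
1002 2 1 2 2 2 1
1003 2 2 2 1 . 2
1004 1 . 1 1 1 2
1005 2 1 2 2 2 1
;
RUN;

PROC PRINT data = survey4;
	TITLE 'The survey data using a bounded array with dynamic loop bounds';
RUN;

With this arrangement, changing the bounds in the ARRAY statement is the only edit needed if the set of variables changes, since the DO loop picks up whatever bounds the array declares.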