# 34.3 - Stratified Random Sampling

In the two previous sections, we were concerned with taking a random sample from a data set without regard to whether an observation comes from a particular subgroup. When you are conducting a survey, it often behooves you to make sure that your sample contains a certain number of observations from each particular subgroup. We'll concern ourselves with such a restriction here. That is, in this section, we'll focus on ways of using SAS to obtain a **stratified random sample**, in which a subset of observations is selected randomly from each subgroup of observations as determined by the value of a stratifying variable. We'll also go back to sampling without replacement, in which once an observation is selected it cannot be selected again.

## Selecting a Stratified Sample of Equal-Sized Groups

We'll first focus on the situation in which an equal number of observations are selected from each subgroup of observations as determined by the value of a stratifying variable.

## Example 34.9

The following code illustrates how to select a **stratified random sample of equal-sized groups**. Specifically, the code tells SAS to randomly select 5 observations from each of the three subgroups — State College, Port Matilda, Bellefonte — as determined by the value of the variable *city*:

```
PROC FREQ data=stat482.mailing;
table city/out=bycount noprint;
RUN;
PROC SORT data=stat482.mailing;
by city;
RUN;
DATA sample5;
merge stat482.mailing bycount (drop = percent);
by city;
retain k;
if first.city then k=5;
random = ranuni(109);
propn = k/count;
if random le propn then
do;
output;
k=k-1;
end;
count=count-1;
RUN;
PROC PRINT data=bycount;
title 'Count by CITY';
RUN;
PROC PRINT data=sample5;
title 'Sample5: Stratified Random Sample with Equal-Sized Strata';
by city;
RUN;
```

Obs | City | COUNT | PERCENT |
---|---|---|---|
1 | Bellefonte | 15 | 30 |
2 | Port Matilda | 13 | 26 |
3 | State College | 22 | 44 |

First, launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, five observations from Bellefonte, five observations from Port Matilda, and five observations from State College.
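One quick way to do that check, offered here only as a sketch, is to tabulate the sample by *city*. It assumes the *sample5* data set from the program above already exists; the title text is made up for illustration.

```sas
/* Count the sampled observations in each stratum.          */
/* Assumes sample5 was created by the program above.        */
PROC FREQ data=sample5;
    table city;
    title 'Sample5: Number of Sampled Observations by CITY';
RUN;
```

If the sampling worked as intended, each of the three cities should show a frequency of 5.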

Now, how does the program work? In order to select a stratified random sample in SAS, we basically use code similar to selecting equal-sized random samples without replacement, except now we process within each subgroup. More specifically, here's how the program works step-by-step:

- The sole purpose of the FREQ procedure is to determine the number of observations in the *stat482.mailing* data set that correspond to each level of the stratification variable *city* (hence, "table *city*"). The OUT= option tells SAS to create a data set called *bycount* that contains the variable *city* and two variables that contain the number (*count*) and percentage (*percent*) of records for each level of *city*.
- The SORT procedure merely sorts the *stat482.mailing* data set by *city* so that it can be processed by *city* in the next DATA step. Because no OUT= option is specified, the sorted observations replace the original data set.
- Merge, by *city*, the sorted *stat482.mailing* data set with the *bycount* data set, so that the number of observations per subgroup is available. Since the percentage of observations is not needed, drop it from the data set on input.
- The rest of the code in the DATA step should look very familiar. That is, once the number of observations per subgroup in the original *stat482.mailing* data set is available, you can randomly select records from the subgroup as you would select equal-sized random samples without replacement, except you select within *city* (hence, "by *city*"). Every time SAS reads in a new *city* (hence, "if *first.city*"), the number of observations still needed in the subgroup's sample (*k*) is set to the number of observations desired in each of the subgroups (5, here).

Note that, again, the *random* = **ranuni**( ) and *propn* = *k*/*count* assignments are made here only so their values can be printed. In another situation, these values would be incorporated directly in the IF statement: if **ranuni**( ) le *k*/*count* then do; Additionally, *k* and *count* could be dropped from the output data set, but are kept here so their values can be printed for educational purposes.
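A minimal sketch of that leaner version follows. It assumes that *bycount* has already been created and that *stat482.mailing* has already been sorted by *city*, as in the program above; the data set name *sample5b* is made up for illustration.

```sas
/* Same selection logic as sample5, without the helper variables. */
/* sample5b is a hypothetical name; bycount comes from PROC FREQ. */
DATA sample5b (drop = k count);
    merge stat482.mailing bycount (drop = percent);
    by city;
    retain k;
    if first.city then k = 5;        /* observations still needed in this city */
    if ranuni(109) le k/count then   /* select with probability k/count        */
        do;
            output;
            k = k - 1;
        end;
    count = count - 1;               /* observations left to consider          */
RUN;
```

Here the random number and the selection proportion are never stored, and the DROP= data set option keeps the bookkeeping variables out of the output data set.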

## Example 34.10

The following code illustrates an alternative way of randomly selecting a **stratified random sample of equal-sized groups**. The code, while less efficient — because it requires that the data set be processed twice and sorted once — may feel more natural and intuitive to you:

```
DATA scollege pmatilda bellefnt;
set stat482.mailing;
if city = 'State College' then output scollege;
else if city = 'Port Matilda' then output pmatilda;
else if city = 'Bellefonte' then output bellefnt;
RUN;
%MACRO select (dsn, num);
DATA &dsn;
set &dsn;
random=ranuni(85329);
RUN;
PROC SORT data=&dsn;
by random;
RUN;
DATA &dsn;
set &dsn;
if _N_ le &num;
RUN;
%MEND select;
%SELECT(scollege, 5); %SELECT(pmatilda, 5); %SELECT(bellefnt, 5);
DATA sample6A;
set bellefnt pmatilda scollege;
RUN;
PROC PRINT data=sample6A;
title 'Sample6A: Stratified Random Sample with Equal-Sized Strata';
by city;
RUN;
```

## Sample6A: Stratified Random Sample with Equal-Sized Strata

Obs | Num | Name | Street | State | random |
---|---|---|---|---|---|
1 | 10 | Laura Mills | 704 Hill Street | PA | 0.05728 |
2 | 4 | Mark Adams | 312 Oak Lane | PA | 0.22701 |
3 | 13 | James Whitney | 104 Pine Hill Drive | PA | 0.28315 |
4 | 12 | Fran Cipolla | 912 Cardinal Drive | PA | 0.34773 |
5 | 5 | Lisa Brothers | 89 Elm Street | PA | 0.42637 |
6 | 6 | Delilah Fequa | 2094 Acorn Street | PA | 0.46698 |
7 | 3 | Jim Jefferson | 10101 Allegheny Street | PA | 0.60821 |
8 | 11 | Linda Bentlager | 1010 Tricia Lane | PA | 0.63431 |
9 | 1 | Jonathon Smothers | 103 Oak Lane | PA | 0.67112 |
10 | 2 | Jane Doe | 845 Main Street | PA | 0.70002 |
11 | 8 | Mamie Davison | 102 Cherry Avenue | PA | 0.72302 |
12 | 14 | William Edwards | 79 Oak Lane | PA | 0.79275 |
13 | 7 | John Doe | 812 Main Street | PA | 0.86987 |
14 | 9 | Ernest Smith | 492 Main Street | PA | 0.87446 |
15 | 15 | Harold Harvey | 480 Main Street | PA | 0.88875 |

Obs | Num | Name | Street | State | random |
---|---|---|---|---|---|
16 | 47 | Barb Wyse | 21 Cleveland Drive | PA | 0.05728 |
17 | 41 | Lou Barr | 219 Eagle Street | PA | 0.22701 |
18 | 50 | George Matre | 75 Ashwind Drive | PA | 0.28315 |
19 | 49 | Tim Winters | 95 Dove Street | PA | 0.34773 |
20 | 42 | Casey Spears | 123 Main Street | PA | 0.42637 |
21 | 43 | Leslie Olin | 487 Bluebird Haven | PA | 0.46698 |
22 | 40 | Jane Smiley | 298 Cardinal Drive | PA | 0.60821 |
23 | 48 | Coach Pierce | 74 Main Street | PA | 0.63431 |
24 | 38 | Miriam Denders | 2348 Robin Avenue | PA | 0.67112 |
25 | 39 | Scott Fitzgerald | 43 Blue Jay Drive | PA | 0.70002 |
26 | 45 | Ann Draper | 72 Lake Road | PA | 0.72302 |
27 | 44 | Edwin Hoch | 389 Dolphin Drive | PA | 0.86987 |
28 | 46 | Linda Nicolson | 71 Liberty Terrace | PA | 0.87446 |

Obs | Num | Name | Street | State | random |
---|---|---|---|---|---|
29 | 33 | Steve Ignella | 367 Whitehall Road | PA | 0.00548 |
30 | 25 | Steve Lindhoff | 130 E. College Avenue | PA | 0.05728 |
31 | 19 | Frank Smith | 238 Waupelani Drive | PA | 0.22701 |
32 | 31 | Robert Williams | 156 Straford Drive | PA | 0.26377 |
33 | 28 | Srabashi Kundu | 112 E. Beaver Avenue | PA | 0.28315 |
34 | 27 | Lucy Arnets | 345 E. College Avenue | PA | 0.34773 |
35 | 34 | Mike Dahlberg | 1201 No. Atherton | PA | 0.36894 |
36 | 20 | Kristin Jones | 120 Stratford Drive | PA | 0.42637 |
37 | 21 | Amy Kuntz | 357 Park Avenue | PA | 0.46698 |
38 | 36 | Daniel Fremgen | 103 W. College Avenue | PA | 0.53660 |
39 | 18 | Ade Fequa | 803 Allen Street | PA | 0.60821 |
40 | 26 | Jan Davison | 201 E. Beaver Avenue | PA | 0.63431 |
41 | 35 | Doris Alcorn | 453 Fraser Street | PA | 0.66005 |
42 | 16 | Linda Edmonds | 410 College Avenue | PA | 0.67112 |
43 | 32 | George Ball | 888 Park Avenue | PA | 0.69135 |
44 | 17 | Rigna Patel | 101 Beaver Avenue | PA | 0.70002 |
45 | 23 | Greg Pope | 5100 No. Atherton | PA | 0.72302 |
46 | 37 | Scott Henderson | 245 W. Beaver Avenue | PA | 0.72795 |
47 | 29 | Joe White | 678 S. Allen Street | PA | 0.79275 |
48 | 22 | Roberta Kudla | 312 Whitehall Road | PA | 0.86987 |
49 | 24 | Mark Mendel | 256 Fraser Street | PA | 0.87446 |
50 | 30 | Daniel Peterson | 328 Waupelani Drive | PA | 0.88875 |

First, launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, five observations from Bellefonte, five observations from Port Matilda, and five observations from State College.

Now, how does the program work? In summary, here's how this approach works:

- The first DATA step uses an IF-THEN-ELSE statement in conjunction with OUTPUT statements to divide the original *stat482.mailing* data set up into several data sets based on the value of *city*. (Here, we create three data sets, one for each *city*, namely *scollege*, *pmatilda*, and *bellefnt*.)
- Then, the macro *select* exactly mimics the creation of the *sample3A* data set in Example 34.5 on the Random Sampling Without Replacement page. That is, the macro generates a random number for each observation, the data set is sorted by the random number, and then the first *num* observations are selected.
- Then, call the macro *select* three times, once for each of the *city* data sets (*scollege*, *pmatilda*, and *bellefnt*), selecting five observations from each.
- Finally, the final DATA step concatenates the three data sets, *bellefnt*, *pmatilda*, and *scollege*, with 5 observations each, back into one data set called *sample6A* with the 15 randomly selected observations.

Lo and behold, when all is said and done, we have another stratified random sample of equal-sized groups! Approach #2 checked off. Now, onto one last approach!

## Example 34.11

The following code illustrates yet another alternative way of randomly selecting a **stratified random sample of equal-sized groups**. Specifically, the program uses the SURVEYSELECT procedure to tell SAS to randomly sample exactly five observations from each of the three *city* subgroups in the permanent SAS data set *stat482.mailing*:

```
PROC SURVEYSELECT data = stat482.mailing
out = sample6B
method = SRS
seed = 12345678
sampsize = (5 5 5);
strata city notsorted;
title;
RUN;
PROC PRINT data = sample6B;
title1 'Sample6B: Stratified Random Sample';
title2 'with Equal-Sized Strata (using PROC SURVEYSELECT)';
RUN;
```

Selection Method | Simple Random Sampling |
---|---|
Strata Variable | City |

Input Data Set | MAILING |
---|---|
Random Number Seed | 12345678 |
Number of Strata | 3 |
Total Sample Size | 15 |
Output Data Set | SAMPLE6B |

Obs | City | Num | Name | Street | State | SelectionProb | SamplingWeight |
---|---|---|---|---|---|---|---|
1 | Bellefonte | 5 | Lisa Brothers | 89 Elm Street | PA | 0.33333 | 3.0 |
2 | Bellefonte | 7 | John Doe | 812 Main Street | PA | 0.33333 | 3.0 |
3 | Bellefonte | 8 | Mamie Davison | 102 Cherry Avenue | PA | 0.33333 | 3.0 |
4 | Bellefonte | 11 | Linda Bentlager | 1010 Tricia Lane | PA | 0.33333 | 3.0 |
5 | Bellefonte | 15 | Harold Harvey | 480 Main Street | PA | 0.33333 | 3.0 |
6 | Port Matilda | 41 | Lou Barr | 219 Eagle Street | PA | 0.38462 | 2.6 |
7 | Port Matilda | 42 | Casey Spears | 123 Main Street | PA | 0.38462 | 2.6 |
8 | Port Matilda | 44 | Edwin Hoch | 389 Dolphin Drive | PA | 0.38462 | 2.6 |
9 | Port Matilda | 48 | Coach Pierce | 74 Main Street | PA | 0.38462 | 2.6 |
10 | Port Matilda | 50 | George Matre | 75 Ashwind Drive | PA | 0.38462 | 2.6 |
11 | State College | 20 | Kristin Jones | 120 Stratford Drive | PA | 0.22727 | 4.4 |
12 | State College | 30 | Daniel Peterson | 328 Waupelani Drive | PA | 0.22727 | 4.4 |
13 | State College | 32 | George Ball | 888 Park Avenue | PA | 0.22727 | 4.4 |
14 | State College | 35 | Doris Alcorn | 453 Fraser Street | PA | 0.22727 | 4.4 |
15 | State College | 37 | Scott Henderson | 245 W. Beaver Avenue | PA | 0.22727 | 4.4 |

First, launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, five observations from Bellefonte, five observations from Port Matilda, and five observations from State College.

Now, the specifics about the code. The only things that should look new here are the STRATA statement and the form of the SAMPSIZE= option. The STRATA statement tells SAS to partition the input data set *stat482.mailing* into nonoverlapping groups defined by the variable *city*. The NOTSORTED option does not tell SAS that the data set is unsorted. Instead, the NOTSORTED option tells SAS that the observations in the data set are arranged in *city* groups, but the groups are not necessarily in alphabetical order. The SAMPSIZE= option tells SAS that we are interested in sampling five observations from each of the *city* groups.
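When every stratum gets the same sample size, a single SAMPSIZE= value should be enough, since SURVEYSELECT applies one value to each stratum. The sketch below also sorts a temporary copy of the data by *city* first, so the NOTSORTED option is not needed. The names *mailing_sorted* and *sample6C* are hypothetical and are used only for this illustration.

```sas
/* Sort a temporary copy so the strata appear in sorted order. */
/* mailing_sorted and sample6C are hypothetical names.         */
PROC SORT data=stat482.mailing out=mailing_sorted;
    by city;
RUN;

PROC SURVEYSELECT data = mailing_sorted
                  out = sample6C
                  method = SRS
                  seed = 12345678
                  sampsize = 5;   /* one value, applied to every stratum */
    strata city;                  /* input is sorted by city             */
RUN;
```

Listing the sizes explicitly, as in sampsize = (5 5 5), says the same thing but makes the per-stratum sizes visible, which is the form needed once the sizes differ.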

## Selecting a Stratified Sample of Unequal-Sized Groups

Now, we'll focus on the situation in which an unequal number of observations are selected from each subgroup of observations as determined by the value of a stratifying variable. If there are an unequal number of observations for each subgroup in the original data set, this sampling scheme may be accomplished by selecting the same proportion of observations from each subgroup. Again, we'll sample without replacement, in which once an observation is selected it cannot be re-selected.

To select a **stratified random sample of unequal-sized groups**, we could use the code from Example 34.10 by passing the different group sample sizes into the macro *select*. Alternatively, we could create a data set containing two count variables: one that contains the number of observations in each subgroup (*n*), and the other that contains the number of observations that need to be selected from each subgroup (*k*). Once the data set is created, we could merge it with the original data set, and select observations randomly as we have done previously for random samples without replacement. That's the strategy that the following example uses.
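As a rough sketch of the first option, the macro approach from Example 34.10 could be called with a different sample size for each city. The macro is restated here under the hypothetical name *select2* so that the block is self-contained and so that the *num* parameter is what drives the subsetting IF; the output data set name *sample7A* is likewise just illustrative.

```sas
/* Split the mailing list into one data set per city. */
DATA scollege pmatilda bellefnt;
    set stat482.mailing;
    if city = 'State College' then output scollege;
    else if city = 'Port Matilda' then output pmatilda;
    else if city = 'Bellefonte' then output bellefnt;
RUN;

/* select2 is a hypothetical variant of the select macro. */
%MACRO select2 (dsn, num);
    DATA &dsn;
        set &dsn;
        random = ranuni(85329);   /* attach a random number to each record */
    RUN;
    PROC SORT data=&dsn;
        by random;                /* put the records in random order       */
    RUN;
    DATA &dsn;
        set &dsn;
        if _N_ le &num;           /* keep the first &num records           */
    RUN;
%MEND select2;

%SELECT2(bellefnt, 5);
%SELECT2(pmatilda, 6);
%SELECT2(scollege, 8);

/* Stack the three samples back into one data set. */
DATA sample7A;
    set bellefnt pmatilda scollege;
RUN;
```

This keeps the intuitive "shuffle, then take the first few" logic, at the cost of extra passes through the data for each city.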

## Example 34.12

The following code illustrates how to select a **stratified random sample of unequal-sized groups**. Specifically, the code tells SAS to randomly select 5, 6, and 8 observations, respectively, from each of the three subgroups — Bellefonte, Port Matilda, and State College — as determined by the value of the variable *city*:

```
DATA nselect;
set stat482.mailing (keep = city);
by city;
n+1;
if last.city;
input k;
output;
n=0;
DATALINES;
5
6
8
;
RUN;
DATA sample7 (drop = k n);
merge stat482.mailing nselect;
by city;
if ranuni(7841) le k/n then
do;
output;
k=k-1;
end;
n=n-1;
RUN;
PROC PRINT data=nselect;
title 'NSELECT: Count by CITY';
RUN;
PROC PRINT data=sample7;
title 'Sample7: Stratified Random Sample of Unequal-Sized Groups';
by city;
RUN;
```

## NSELECT: Count by CITY

Obs | City | n | k |
---|---|---|---|
1 | Bellefonte | 15 | 5 |
2 | Port Matilda | 13 | 6 |
3 | State College | 22 | 8 |

## Sample7: Stratified Random Sample of Unequal-Sized Groups

Obs | Num | Name | Street | State |
---|---|---|---|---|
1 | 1 | Jonathon Smothers | 103 Oak Lane | PA |
2 | 3 | Jim Jefferson | 10101 Allegheny Street | PA |
3 | 6 | Delilah Fequa | 2094 Acorn Street | PA |
4 | 11 | Linda Bentlager | 1010 Tricia Lane | PA |
5 | 15 | Harold Harvey | 480 Main Street | PA |

Obs | Num | Name | Street | State |
---|---|---|---|---|
6 | 38 | Miriam Denders | 2348 Robin Avenue | PA |
7 | 42 | Casey Spears | 123 Main Street | PA |
8 | 43 | Leslie Olin | 487 Bluebird Haven | PA |
9 | 46 | Linda Nicolson | 71 Liberty Terrace | PA |
10 | 48 | Coach Pierce | 74 Main Street | PA |
11 | 49 | Tim Winters | 95 Dove Street | PA |

Obs | Num | Name | Street | State |
---|---|---|---|---|
12 | 16 | Linda Edmonds | 410 College Avenue | PA |
13 | 17 | Rigna Patel | 101 Beaver Avenue | PA |
14 | 18 | Ade Fequa | 803 Allen Street | PA |
15 | 21 | Amy Kuntz | 357 Park Avenue | PA |
16 | 24 | Mark Mendel | 256 Fraser Street | PA |
17 | 26 | Jan Davison | 201 E. Beaver Avenue | PA |
18 | 34 | Mike Dahlberg | 1201 No. Atherton | PA |
19 | 35 | Doris Alcorn | 453 Fraser Street | PA |

First, launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, five observations from Bellefonte, six observations from Port Matilda, and eight observations from State College.

Now, how does the program work? The key to understanding the program is to understand the first DATA step. The remainder of the program is much like code we've seen before, such as that in Example 34.4, in which a random sample is selected without replacement. Now, the first DATA step creates a temporary data set called *nselect* that contains three variables, *city*, *n*, and *k*:

- To count the number of observations *n* from each *city*, we use a counter variable *n* in conjunction with the *last.city* variable. Because *n* appears in a sum statement (n+1), SAS sets *n* to 0 before the first iteration of the DATA step, retains its value from one iteration to the next, and increases *n* by 1 on each iteration until it has counted the observations for one of the levels of *city*.
- To tell SAS the number of observations to select from each *city*, we use an INPUT statement in conjunction with a DATALINES statement. The numbers *k* are listed in the order of *city*, so here we tell SAS we want to randomly select 5 observations from Bellefonte, 6 observations from Port Matilda, and 8 observations from State College.
- To write the numbers *n* and *k* to the new data set *nselect*, we use the *last.city* variable in a subsetting IF statement. So here, when SAS finds the last record within a *city* subgroup, *n* and *k* are written to the *nselect* data set, and *n* is reset to 0 in preparation for counting the number of observations for the next *city* in the data set.

The second DATA step creates a temporary data set called *sample7* by merging the *stat482.mailing* data set with the *nselect* data set. After merging, the code then randomly selects the specified number of observations from each *city* just as we did previously for random samples without replacement.
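Because the DATALINES values are matched to cities purely by position, one possible variation is to build the same kind of lookup data set from a PROC FREQ count and assign *k* by name. This is only a sketch of that idea, not the approach used above; the data set name *nselect_alt* is hypothetical, and the city spellings are assumed to match those in the data exactly.

```sas
/* Count the observations in each city. */
PROC FREQ data=stat482.mailing;
    table city / out=bycount noprint;
RUN;

/* nselect_alt is a hypothetical name; n is the stratum size, */
/* k is the number of observations to select from it.         */
DATA nselect_alt (keep = city n k);
    set bycount (rename = (count = n));
    if city = 'Bellefonte' then k = 5;
    else if city = 'Port Matilda' then k = 6;
    else if city = 'State College' then k = 8;
RUN;
```

The resulting data set could then be merged with *stat482.mailing* by *city* in place of *nselect*. Tying each *k* to a city name means the assignment does not depend on the order in which the cities happen to be processed.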

## Example 34.13

The following code illustrates an alternative way of randomly selecting a **stratified random sample of unequal-sized groups**. In selecting such a sample, rather than specifying the desired number sampled from each group, we could tell SAS to select an equal proportion of observations from each group. The following code does just that. Specifically, the code tells SAS to randomly select 25% of the observations from each of the three subgroups — Bellefonte, Port Matilda, and State College:

```
DATA nselect2;
set stat482.mailing (keep=city);
by city;
n+1;
if last.city;
k=ceil(0.25*n);
output;
n=0;
RUN;
DATA sample8 (drop = k n);
merge stat482.mailing nselect2;
by city;
if ranuni(7841) le k/n then
do;
output;
k=k-1;
end;
n=n-1;
RUN;
PROC PRINT data=nselect2;
title 'NSELECT2: Count by CITY';
RUN;
PROC PRINT data=sample8;
title 'Sample8: Stratified Random Sample of Unequal-Sized Groups';
RUN;
```

Obs | City | n | k |
---|---|---|---|
1 | Bellefonte | 15 | 4 |
2 | Port Matilda | 13 | 4 |
3 | State College | 22 | 6 |

Obs | Num | Name | Street | City | State |
---|---|---|---|---|---|
1 | 3 | Jim Jefferson | 10101 Allegheny Street | Bellefonte | PA |
2 | 6 | Delilah Fequa | 2094 Acorn Street | Bellefonte | PA |
3 | 11 | Linda Bentlager | 1010 Tricia Lane | Bellefonte | PA |
4 | 15 | Harold Harvey | 480 Main Street | Bellefonte | PA |
5 | 38 | Miriam Denders | 2348 Robin Avenue | Port Matilda | PA |
6 | 42 | Casey Spears | 123 Main Street | Port Matilda | PA |
7 | 46 | Linda Nicolson | 71 Liberty Terrace | Port Matilda | PA |
8 | 49 | Tim Winters | 95 Dove Street | Port Matilda | PA |
9 | 16 | Linda Edmonds | 410 College Avenue | State College | PA |
10 | 17 | Rigna Patel | 101 Beaver Avenue | State College | PA |
11 | 18 | Ade Fequa | 803 Allen Street | State College | PA |
12 | 24 | Mark Mendel | 256 Fraser Street | State College | PA |
13 | 26 | Jan Davison | 201 E. Beaver Avenue | State College | PA |
14 | 35 | Doris Alcorn | 453 Fraser Street | State College | PA |

In this case, it probably makes most sense to first compare the code here with the code from the previous example. The only difference you should see is that rather than using an INPUT and DATALINES statement to read in the number of observations, *k*, to be selected from each of the subgroups, here we use the **ceiling function**, *ceil*( ), to determine *k*. Specifically, *k* is calculated using:

k=ceil(0.25*n);

Now, if you think about it, if I tell you to select 25% of the *n* = 16 observations in a subgroup, you'd tell me that we need to select 4 observations. But what if I tell you to select 25% of the *n* = 15 observations in a subgroup? If you calculate 25% of 15, you get 3.75. Hmmm.... how can you select 3.75 observations? That's where the ceiling function comes into play. The ceiling function, **ceil**(argument), returns the smallest integer that is greater than or equal to the argument. So, for example, **ceil**(3.75) equals 4 ... as does of course ceil(3.1), ceil(3.25), and ceil(3.87) ...you get the idea.

That's it ... that's all there is to it. Once *k* is determined using the ceiling function, the rest of the program is identical to the program in the previous example.
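To see the ceiling function at work on the three stratum sizes in this example, a small DATA _NULL_ step, offered only as an illustration, can write the values to the log. The variable *raw* is a name introduced here for the unrounded product.

```sas
/* Write 25% of each stratum size, before and after rounding up, to the log. */
DATA _NULL_;
    do n = 15, 13, 22;      /* stratum sizes from the NSELECT2 listing */
        raw = 0.25*n;       /* raw is a hypothetical helper variable   */
        k = ceil(raw);      /* smallest integer >= raw                 */
        put n= raw= k=;
    end;
RUN;
```

The log should show *k* = 4, 4, and 6 for *n* = 15, 13, and 22, matching the *k* column printed for *nselect2* above.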

Now, try it out: launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, 25% of the observations from Bellefonte, Port Matilda, and State College. In this case, that translates to 4, 4, and 6 observations, respectively.

## Example 34.14

The following code illustrates yet another alternative way of randomly selecting a **stratified random sample of unequal-sized groups**. Specifically, the program uses the SURVEYSELECT procedure to tell SAS to randomly sample exactly 5, 6, and 8 observations, respectively, from each of the three *city* subgroups in the permanent SAS data set *stat482.mailing*:

```
PROC SURVEYSELECT data = stat482.mailing
out = sample9
method = SRS
seed = 12345678
sampsize = (5 6 8);
strata city notsorted;
title;
RUN;
PROC PRINT data = sample9;
title1 'Sample9: Stratified Random Sample';
title2 'with Unequal-Sized Strata (using PROC SURVEYSELECT)';
RUN;
```

Selection Method | Simple Random Sampling |
---|---|
Strata Variable | City |

Input Data Set | MAILING |
---|---|
Random Number Seed | 12345678 |
Number of Strata | 3 |
Total Sample Size | 19 |
Output Data Set | SAMPLE9 |

Obs | City | Num | Name | Street | State | SelectionProb | SamplingWeight |
---|---|---|---|---|---|---|---|
1 | Bellefonte | 5 | Lisa Brothers | 89 Elm Street | PA | 0.33333 | 3.00000 |
2 | Bellefonte | 7 | John Doe | 812 Main Street | PA | 0.33333 | 3.00000 |
3 | Bellefonte | 8 | Mamie Davison | 102 Cherry Avenue | PA | 0.33333 | 3.00000 |
4 | Bellefonte | 11 | Linda Bentlager | 1010 Tricia Lane | PA | 0.33333 | 3.00000 |
5 | Bellefonte | 15 | Harold Harvey | 480 Main Street | PA | 0.33333 | 3.00000 |
6 | Port Matilda | 40 | Jane Smiley | 298 Cardinal Drive | PA | 0.46154 | 2.16667 |
7 | Port Matilda | 41 | Lou Barr | 219 Eagle Street | PA | 0.46154 | 2.16667 |
8 | Port Matilda | 42 | Casey Spears | 123 Main Street | PA | 0.46154 | 2.16667 |
9 | Port Matilda | 43 | Leslie Olin | 487 Bluebird Haven | PA | 0.46154 | 2.16667 |
10 | Port Matilda | 49 | Tim Winters | 95 Dove Street | PA | 0.46154 | 2.16667 |
11 | Port Matilda | 50 | George Matre | 75 Ashwind Drive | PA | 0.46154 | 2.16667 |
12 | State College | 20 | Kristin Jones | 120 Stratford Drive | PA | 0.36364 | 2.75000 |
13 | State College | 24 | Mark Mendel | 256 Fraser Street | PA | 0.36364 | 2.75000 |
14 | State College | 25 | Steve Lindhoff | 130 E. College Avenue | PA | 0.36364 | 2.75000 |
15 | State College | 28 | Srabashi Kundu | 112 E. Beaver Avenue | PA | 0.36364 | 2.75000 |
16 | State College | 30 | Daniel Peterson | 328 Waupelani Drive | PA | 0.36364 | 2.75000 |
17 | State College | 32 | George Ball | 888 Park Avenue | PA | 0.36364 | 2.75000 |
18 | State College | 34 | Mike Dahlberg | 1201 No. Atherton | PA | 0.36364 | 2.75000 |
19 | State College | 37 | Scott Henderson | 245 W. Beaver Avenue | PA | 0.36364 | 2.75000 |

Straightforward enough! The only difference between this code and the code in Example 34.11 is that here the sample sizes are specified as 5, 6, and 8 rather than 5, 5, and 5. Note that you must list the stratum sample size values in the order in which the strata appear in the input data set.

Launch and run the SAS program. Then, review the resulting output to convince yourself that the code did indeed select, from the *mailing* data set, five observations from Bellefonte, six observations from Port Matilda, and eight observations from State College.
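If the goal is instead an equal proportion from each stratum, as in Example 34.13, SURVEYSELECT also has a SAMPRATE= option that takes a sampling rate rather than a count. The following is only a sketch under that assumption; the output data set name *sample9B* is hypothetical. As far as we understand the procedure's documentation, the rate is converted to a stratum sample size by rounding up, which would parallel the ceiling function used earlier, but it is worth confirming that behavior for your SAS release.

```sas
/* Request 25% of each city's observations instead of fixed counts. */
/* sample9B is a hypothetical output data set name.                 */
PROC SURVEYSELECT data = stat482.mailing
                  out = sample9B
                  method = SRS
                  seed = 12345678
                  samprate = 0.25;   /* one rate, applied to every stratum */
    strata city notsorted;
RUN;
```

A PROC PRINT of the output data set, or the procedure's own summary table, would show how many observations were actually drawn from each *city*.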