# 7.1 - Introduction to Cluster and Systematic Sampling

On the surface, systematic and cluster sampling are very different. In fact, the two designs share the same structure: the population is partitioned into primary units, each primary unit being composed of secondary units. Whenever a primary unit is included in the sample, the *y*-values of every secondary unit within it are observed.

**Example**: A one-in-three systematic sample, in which we randomly pick one of the first three units and then take every third unit from there on.

Randomly pick a value from {1, 2, 3}. For example, if 2 is chosen, then we will pick the units {2, 5, 8, 11, 14}. The set {2, 5, 8, 11, 14} is an example of a primary unit.

It is not uncommon for a systematic sample to have size 1 in terms of primary units, as in the 1-in-3 systematic sample above: we sample just one primary unit, even though it contains several secondary units.
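The 1-in-3 example can be sketched in a few lines of Python. This is only an illustration: the population size of 15 and the helper name `systematic_primary_units` are assumptions made here, not part of the lesson.

```python
import random

def systematic_primary_units(population_size, k):
    """Partition units 1..population_size into the k primary units
    of a 1-in-k systematic design (hypothetical helper)."""
    return {
        start: list(range(start, population_size + 1, k))
        for start in range(1, k + 1)
    }

# Assumed population of 15 units, labelled 1..15.
units = systematic_primary_units(population_size=15, k=3)

# Selecting the sample means selecting ONE primary unit at random.
start = random.randint(1, 3)
sample = units[start]

print("primary units:", units)
print("random start:", start, "-> sample:", sample)
```

Under these assumptions a random start of 2 yields the primary unit {2, 5, 8, 11, 14} from the example, and the three primary units together cover every unit in the population exactly once.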

In the following two graphs, we provide examples for two configurations of primary units:

Figure 1 has 50 primary units (PSUs); the colored rectangle is an example of a primary unit in cluster sampling.

Figure 2 has 25 primary units (PSUs); the colored units, taken together, are an example of a primary unit in systematic sampling.

Primary units (PSUs) may be different from observation units. One can view systematic sampling as a sampling of primary units: once the primary units are selected, the cluster of secondary units within each of them is selected as well.

#### Advantages of Systematic Sampling

- Easier to perform in the field, especially if a good frame is not available.
- Frequently provides more information per unit cost than simple random sampling, in the sense of smaller variances.

For example, suppose a systematic sample is drawn from a batch of manufactured computer chips. The first 400 chips are fine, but because of a machine fault the last 300 chips are defective. A systematic sample is spread evenly over the defective and non-defective items, so it gives a very accurate estimate of the fraction of defective items.
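The chip example can be checked numerically. The sketch below assumes a 1-in-7 systematic design (the sampling interval is not given in the text, so `K = 7` is an illustrative choice). It enumerates every possible systematic sample and compares the variance of the estimated defective fraction with the usual variance formula for a simple random sample of the same size drawn without replacement.

```python
# Batch of 700 chips in production order: 0 = fine, 1 = defective.
N_CHIPS = 700
chips = [0] * 400 + [1] * 300
true_p = sum(chips) / N_CHIPS

K = 7             # assumed sampling interval (not stated in the text)
n = N_CHIPS // K  # secondary units in each systematic sample

# One estimate per possible random start, i.e. per primary unit.
sys_estimates = [sum(chips[start::K]) / n for start in range(K)]

# Each start is equally likely, so the design variance is the
# average squared deviation of these estimates from the true value.
sys_var = sum((p - true_p) ** 2 for p in sys_estimates) / K

# Variance of a sample proportion under simple random sampling
# without replacement, with the same number of chips inspected.
srs_var = true_p * (1 - true_p) / n * (N_CHIPS - n) / (N_CHIPS - 1)

print("true fraction defective:", round(true_p, 4))
print("systematic estimates:", sys_estimates)
print("systematic variance:", sys_var)
print("SRS variance:", srs_var)
```

With this ordering of the batch, every systematic sample lands within about one percentage point of the true fraction, which is why its variance comes out far below the simple random sampling figure.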

A cluster/systematic sample is a probability sample in which each sampling unit is a collection, or cluster, of elements.

##### Notation for cluster and systematic sampling:

- *N*: the number of primary units in the population
- *n*: the number of primary units in the sample
- \(M_i\): the number of secondary units in the *i*th primary unit
- \(M=\sum\limits_{i=1}^N M_i\): the total number of secondary units in the population
- \(y_{ij}\): the value of the variable of interest for the *j*th secondary unit in the *i*th primary unit
- \(y_i=\sum\limits_{j=1}^{M_i}y_{ij}\): the total (i.e., sum) of the *y*-values in the *i*th primary unit

For Fig. 1, *N* = 50, *n* = 10, \(M_i\) = 8

For Fig. 2, *N* = 25, *n* = 2, \(M_i\) = 16

Figure 1 shows an example of cluster sampling and Figure 2 shows an example of systematic sampling. The secondary units of a primary unit in cluster sampling are close together, whereas the secondary units of a primary unit in systematic sampling are spread apart from one another.
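As a quick check of the notation: when every primary unit has the same size, the total number of secondary units is simply *N* times that common size. Using the values quoted for the two figures:

\(M=\sum\limits_{i=1}^N M_i = 50 \times 8 = 400\) for Fig. 1, and \(M=\sum\limits_{i=1}^N M_i = 25 \times 16 = 400\) for Fig. 2.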

#### More Notation:

With this notation, the population total is:

\(\tau=\sum\limits_{i=1}^N \sum\limits_{j=1}^{M_i}y_{ij}=\sum\limits_{i=1}^N y_i\)

The population mean per primary unit is:

\(\mu_1=\tau/N\)

The population mean per secondary unit is:

\(\mu=\tau/M\).
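To make the three quantities concrete, here is a minimal sketch that computes them for a toy population. The *y*-values are invented purely for illustration, and the variable names simply mirror the notation above.

```python
# Toy population (made-up values): each inner list holds the
# y-values y_ij of the secondary units in one primary unit.
population = [
    [3, 5, 2],     # primary unit 1, M_1 = 3
    [4, 4],        # primary unit 2, M_2 = 2
    [1, 0, 6, 2],  # primary unit 3, M_3 = 4
]

N = len(population)                           # number of primary units
M_sizes = [len(unit) for unit in population]  # M_i for each primary unit
M = sum(M_sizes)                              # total number of secondary units

y_totals = [sum(unit) for unit in population]  # y_i for each primary unit
tau = sum(y_totals)                            # population total

mu_1 = tau / N  # population mean per primary unit
mu = tau / M    # population mean per secondary unit

print(f"N = {N}, M = {M}, tau = {tau}")
print(f"mu_1 = {mu_1}, mu = {mu}")
```

For this toy population \(\tau = 27\), so \(\mu_1 = 27/3 = 9\) and \(\mu = 27/9 = 3\). The two means differ because one divides by the number of primary units and the other by the number of secondary units.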