7.1 - Introduction to Cluster and Systematic Sampling

On the surface, systematic and cluster sampling look very different. However, the two designs share the same structure: the population is partitioned into primary units, and each primary unit is composed of secondary units. Whenever a primary unit is included in the sample, the y-values of every secondary unit within it are observed.

Example: A one-in-three systematic sample, in which we randomly pick one of the first three units and then select every third unit from there on.


figure 12.1

Randomly pick a value from {1, 2, 3}. For example, if 2 is chosen, then we pick units {2, 5, 8, 11, 14}, the units marked in the figure. The set {2, 5, 8, 11, 14} is an example of a primary unit.

It is common to have a systematic sample of size 1, as in the 1-in-3 systematic sample above. The size counts primary units, so here we sample just one primary unit.
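The selection rule in the example can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and the population size of 15 are assumptions (15 is inferred from the set {2, 5, 8, 11, 14}, not stated in the text).

```python
import random

def systematic_sample(population_size, k, start=None, rng=random):
    """Return the 1-based labels of a 1-in-k systematic sample.

    population_size and this function name are illustrative assumptions.
    """
    if start is None:
        start = rng.randint(1, k)  # random start among the first k units
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    # Take the starting unit and then every k-th unit after it.
    return list(range(start, population_size + 1, k))

# With a start of 2, this reproduces the primary unit from the example.
print(systematic_sample(15, 3, start=2))  # [2, 5, 8, 11, 14]
```

Each of the three possible starts yields a different primary unit, and together the three primary units cover every unit in the population exactly once.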

In the following two graphs, we provide examples of two configurations of primary units:

figure 12.1

The above figure has 50 primary units (PSU)
(the colored rectangle is an example of a primary unit of cluster sampling)

figure 12.2

The above figure has 25 primary units (PSU)
(the colored units (collectively) are an example of a primary unit of systematic sampling)

Primary units (PSU) may be different from observation units. One can view systematic sampling as a sampling of primary units: once a primary unit is selected, the whole cluster of secondary units it contains is selected as well.

Advantages of Systematic Sampling

  1. Easier to perform in the field, especially if a good frame is not available.
  2. Frequently provides more information per unit cost than simple random sampling, in the sense of smaller variances.

For example, suppose a systematic sample is drawn from a batch of produced computer chips. The first 400 chips are fine, but due to a fault in the machine the last 300 chips are defective. A systematic sample is spread evenly over the defective and non-defective items and therefore gives a very accurate estimate of the fraction of defective items.
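A quick way to check the chip example is to enumerate every possible systematic sample and compare its estimate with the true defective fraction. The sketch below assumes the batch consists of exactly those 700 chips in production order and uses a 1-in-7 sampling interval; the interval and the function name are illustrative choices, not part of the example.

```python
def defective_fraction_estimates(n_good=400, n_bad=300, k=7):
    """Estimate the defective fraction from each possible 1-in-k systematic sample.

    Chips are coded 0 (fine) or 1 (defective), in production order.
    The interval k = 7 is an assumed value for illustration.
    """
    chips = [0] * n_good + [1] * n_bad
    estimates = []
    for start in range(k):          # every possible random start (0-based)
        sample = chips[start::k]    # the start unit, then every k-th chip
        estimates.append(sum(sample) / len(sample))
    return estimates

true_fraction = 300 / 700
estimates = defective_fraction_estimates()
print(f"true fraction: {true_fraction:.4f}")
print(f"estimates range from {min(estimates):.4f} to {max(estimates):.4f}")
```

Under these assumptions every possible sample contains 100 chips, and each estimate lands within 0.01 of the true fraction regardless of the random start.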

Cluster Sampling and Systematic Sampling

A cluster/systematic sample is a probability sample in which each sampling unit is a collection, or cluster, of elements.

Notation for cluster and systematic sampling:
  • N: the number of primary units in the population
  • n: the number of primary units in the sample
  • \(M_i\): the number of secondary units in the ith primary unit
  • \(M=\sum\limits_{i=1}^N M_i\): the total number of secondary units in the population
  • \(y_{ij}\): the value of the variable of interest for the jth secondary unit in the ith primary unit
  • \(y_i=\sum\limits_{j=1}^{M_i}y_{ij}\): the total (i.e. sum) of y-values in the ith primary unit
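To make the notation concrete, the sketch below stores a small population as a nested list and computes each quantity. The y-values are made up for illustration, and Python indexes from 0 whereas the notation above indexes from 1.

```python
# Hypothetical population: y[i][j] is the y-value of secondary unit j
# in primary unit i (0-based here; the notation in the text is 1-based).
y = [
    [3, 5, 2],       # primary unit 1
    [4, 1],          # primary unit 2
    [6, 2, 2, 1],    # primary unit 3
]

N = len(y)                        # number of primary units in the population
M_i = [len(unit) for unit in y]   # number of secondary units in each primary unit
M = sum(M_i)                      # total number of secondary units
y_i = [sum(unit) for unit in y]   # total of the y-values in each primary unit

print(N, M_i, M, y_i)  # 3 [3, 2, 4] 9 [10, 5, 11]
```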

For Fig. 1 below, N = 50, n = 10, \(M_i\) = 8

figure 12.1
Fig. 1

For Fig. 2 below, N = 25, n = 2, \(M_i\) = 16

figure 12.2
Fig. 2

Fig. 1 shows an example of cluster sampling and Fig. 2 shows an example of systematic sampling. The secondary units within a primary unit are close together in cluster sampling, whereas in systematic sampling they are spread apart across the population.
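The contrast between the two configurations can be illustrated on a simple one-dimensional population. The sketch below is not the layout used in the figures: the population of 12 units and both function names are assumptions chosen to keep the output short.

```python
def cluster_primary_units(num_units, size):
    """Partition units 1..num_units into contiguous blocks of the given size."""
    labels = list(range(1, num_units + 1))
    return [labels[i:i + size] for i in range(0, num_units, size)]

def systematic_primary_units(num_units, k):
    """Partition units 1..num_units into k interleaved primary units."""
    labels = list(range(1, num_units + 1))
    return [labels[start::k] for start in range(k)]

# Cluster configuration: neighbours stay together.
print(cluster_primary_units(12, 4))     # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
# Systematic configuration: members are spaced k units apart.
print(systematic_primary_units(12, 3))  # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

Both calls partition the same 12 units into 3 primary units of 4 secondary units each; only the arrangement of the secondary units differs.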

More Notation:

Thus, the population total is:

\(\tau=\sum\limits_{i=1}^N \sum\limits_{j=1}^{M_i}y_{ij}=\sum\limits_{i=1}^N y_i\)

The population mean per primary unit is:

\(\mu_1=\dfrac{\tau}{N}\)

The population mean per secondary unit is:

\(\mu=\dfrac{\tau}{M}\)
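As a small numerical check, the sketch below computes the population total and both means for a made-up population. It takes the mean per primary unit to be the total divided by N and the mean per secondary unit to be the total divided by M, which is the usual reading of these two quantities; the data and variable names are assumptions for illustration.

```python
# Hypothetical population: each inner list holds the y-values of one primary unit.
y = [
    [3, 5, 2],
    [4, 1],
    [6, 2, 2, 1],
]

N = len(y)                        # number of primary units
M = sum(len(unit) for unit in y)  # total number of secondary units

# Population total: the sum over all secondary units,
# which equals the sum of the primary-unit totals.
tau = sum(sum(unit) for unit in y)

mean_per_primary = tau / N        # total divided by the number of primary units
mean_per_secondary = tau / M      # total divided by the number of secondary units

print(tau)                           # 26
print(round(mean_per_primary, 3))    # 8.667
print(round(mean_per_secondary, 3))  # 2.889
```

The two means differ whenever primary units contain more than one secondary unit, so it matters which one an estimator is targeting.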