# 7.1 - Introduction to Cluster and Systematic Sampling

On the surface, systematic and cluster sampling look very different, but the two designs share the same structure: the population is partitioned into primary units, and each primary unit is composed of secondary units. Whenever a primary unit is included in the sample, the y-values of every secondary unit within it are observed.

Example: Consider a 1-in-3 systematic sample, in which we randomly pick one of the first three units and then take every third unit from there on. Randomly pick a starting value from {1, 2, 3}. If 2 is chosen, for example, we pick units {2, 5, 8, 11, 14} and observe their y-values. The set {2, 5, 8, 11, 14} is an example of a primary unit.

It is not uncommon for a systematic sample to consist of a single primary unit (n = 1), as in the 1-in-3 systematic sample above: we sample just one primary unit.
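The 1-in-3 example can be sketched in a few lines of Python. This is only an illustration: the population size of 15 is an assumption (the text lists units up to 14 but does not state the size), and the function name `systematic_primary_units` is hypothetical.

```python
import random

def systematic_primary_units(population_size, k):
    """Partition units 1..population_size into the k primary units
    of a 1-in-k systematic design, keyed by their starting unit."""
    return {start: list(range(start, population_size + 1, k))
            for start in range(1, k + 1)}

# Assumed population of 15 units, labeled 1..15.
primary_units = systematic_primary_units(15, 3)

# Selecting the sample means selecting ONE primary unit at random.
start = random.randint(1, 3)
sample = primary_units[start]

print(primary_units[2])  # [2, 5, 8, 11, 14]
print(f"random start = {start}, sample = {sample}")
```

Whatever the random start, the three primary units together cover every unit exactly once, which is what makes them a partition of the population.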

The following two figures give examples of two configurations of primary units:

Figure 1: A population with 50 primary units (PSUs). The colored rectangle is an example of a primary unit in cluster sampling.

Figure 2: A population with 25 primary units (PSUs). The colored units, taken together, are an example of a primary unit in systematic sampling.

Primary units (PSUs) may be different from observation units. One can view systematic sampling as a sampling of primary units: once a primary unit is selected, its entire cluster of secondary units is selected with it.

Systematic sampling is a useful alternative to simple random sampling for two reasons:

1. It is easier to perform in the field, especially if a good frame is not available.
2. It frequently provides more information per unit cost than simple random sampling, in the sense of smaller variances.

For example, suppose a systematic sample is drawn from a batch of computer chips in production order. The first 400 chips are fine, but because of a machine fault the last 300 chips are defective. A systematic sample spreads evenly over the defective and non-defective runs, so it gives a very accurate estimate of the fraction of defective items.
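A small sketch can make the chip example concrete. It assumes the batch consists of exactly these 700 chips in production order, and the sampling interval `k = 7` is a hypothetical choice that the text does not specify. The code evaluates the estimate for every possible random start.

```python
def systematic_sample(items, k, start):
    """Return every k-th item, beginning at 0-based index `start` (0 <= start < k)."""
    return items[start::k]

# 0 = fine, 1 = defective; 400 good chips followed by 300 defective ones.
chips = [0] * 400 + [1] * 300
true_fraction = sum(chips) / len(chips)  # 300 / 700

k = 7  # assumed sampling interval (1-in-7), not taken from the text
estimates = []
for start in range(k):
    sample = systematic_sample(chips, k, start)
    estimates.append(sum(sample) / len(sample))

max_error = max(abs(e - true_fraction) for e in estimates)
print(f"true fraction = {true_fraction:.4f}")
print(f"estimates range from {min(estimates):.4f} to {max(estimates):.4f}")
print(f"largest absolute error = {max_error:.4f}")
```

Under these assumptions each of the seven possible samples contains 100 chips, and every estimate is 0.42 or 0.43 against a true fraction of 300/700 (about 0.429), so the estimator barely varies from one start to another.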

#### Cluster Sampling and Systematic Sampling

A cluster/systematic sample is a probability sample in which each sampling unit is a collection, or cluster, of elements.

##### Notation for cluster and systematic sampling:
• N: the number of primary units in the population
• n: the number of primary units in the sample
• $$M_i$$: the number of secondary units in the ith primary unit
• $$M=\sum\limits_{i=1}^N M_i$$: the total number of secondary units in the population
• $$y_{ij}$$: the value of the variable of interest for the jth secondary unit in the ith primary unit
• $$y_i=\sum\limits_{j=1}^{M_i}y_{ij}$$: the total (i.e. sum) of y-values in the ith primary unit

For Figure 1, N = 50, n = 10, and $$M_i$$ = 8 for every primary unit.

For Figure 2, N = 25, n = 2, and $$M_i$$ = 16 for every primary unit.

Figure 1 shows an example of cluster sampling and Figure 2 shows an example of systematic sampling. In cluster sampling, the secondary units of a primary unit lie close together, whereas in systematic sampling the secondary units of a primary unit are spread apart from each other.
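The contrast between the two configurations can be sketched with the counts quoted for the figures (50 primary units of 8, and 25 primary units of 16, both covering 400 secondary units). The layout below is an assumption made for illustration: the units are placed on a single line labeled 0 to 399, which need not match the arrangement in the figures, and the helper `spread` is hypothetical.

```python
M = 400  # total secondary units: 50 * 8 = 25 * 16 = 400

# Cluster design: 50 primary units, each a block of 8 adjacent units.
cluster_psus = [list(range(i * 8, (i + 1) * 8)) for i in range(50)]

# Systematic design: 25 primary units, each taking every 25th unit.
systematic_psus = [list(range(i, M, 25)) for i in range(25)]

def spread(psu):
    """Distance between the first and last secondary unit of a primary unit."""
    return max(psu) - min(psu)

print(len(cluster_psus), len(cluster_psus[0]), spread(cluster_psus[0]))           # 50 8 7
print(len(systematic_psus), len(systematic_psus[0]), spread(systematic_psus[0]))  # 25 16 375
```

Both constructions partition the same 400 units; only the grouping differs. A cluster primary unit spans 8 neighboring positions, while a systematic primary unit stretches across almost the whole population.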

#### More Notation:

Thus, the population total is:

$$\tau=\sum\limits_{i=1}^N \sum\limits_{j=1}^{M_i}y_{ij}=\sum\limits_{i=1}^N y_i$$

The population mean per primary unit is:

$$\mu_1=\tau/N$$

The population mean per secondary unit is:

$$\mu=\tau/M$$
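As a quick check on the notation, the quantities above can be computed for a toy population. The y-values below are made up purely for illustration; the variable names simply mirror the symbols in the text.

```python
# Hypothetical y-values: each inner list is one primary unit,
# and each number is the y-value of one secondary unit.
population = [
    [3, 5, 2],     # primary unit 1, M_1 = 3
    [4, 4],        # primary unit 2, M_2 = 2
    [1, 0, 6, 2],  # primary unit 3, M_3 = 4
]

N = len(population)                       # number of primary units
M_i = [len(unit) for unit in population]  # secondary units per primary unit
M = sum(M_i)                              # total number of secondary units
y_i = [sum(unit) for unit in population]  # primary unit totals

tau = sum(y_i)   # population total
mu_1 = tau / N   # mean per primary unit
mu = tau / M     # mean per secondary unit

print(N, M_i, M)     # 3 [3, 2, 4] 9
print(y_i, tau)      # [10, 8, 9] 27
print(mu_1, mu)      # 9.0 3.0
```

Note that the two means differ only in the divisor: $$\mu_1$$ divides the total by the number of primary units, while $$\mu$$ divides it by the number of secondary units.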
