1.4 - Types of data

Example 1-1

Your instructor asked a random sample of 20 college students "do you consider yourself to be sleep-deprived?" Their replies were:

yes yes yes no no no yes yes yes yes
yes no no yes yes no yes yes yes yes

Of course, it would be good to summarize the students' responses. What we do with the data, though, depends on the type of data collected. For our purposes, we will primarily be concerned with three types of data:

  • discrete
  • continuous
  • categorical

Now, for their definitions!

Discrete Data
Quantitative data are called discrete if the sample space contains a finite or countably infinite number of values.

Recall that a set is countably infinite if its elements can be put into one-to-one correspondence with the positive integers. My third research question yields discrete data, because its sample space:

\(\mathbf{S} = \{0, 1, 2, ..., 31\}\)

contains a finite number of values. My fourth research question also yields discrete data, because its sample space:

\(\mathbf{S} = \{0, 1, 2, ...\}\)

contains a countably infinite number of values.
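
To see why this sample space is countably infinite, here is one possible pairing with the positive integers. The function name \(f\) is introduced only for illustration:

\(f(n) = n - 1, \quad n = 1, 2, 3, \ldots\)

Under this pairing, \(1 \mapsto 0\), \(2 \mapsto 1\), \(3 \mapsto 2\), and so on. Every positive integer is matched with exactly one element of \(\mathbf{S}\), and every element of \(\mathbf{S}\) is matched with exactly one positive integer, which is the one-to-one correspondence the definition requires.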

Continuous Data
Quantitative data are called continuous if the sample space contains an interval or continuous span of real numbers.

My second research question yields continuous data, because its sample space:

\(\mathbf{S} = \{h: h \ge 0 \text{ hours}\}\)

is the entire nonnegative real line. For continuous data, there are, in theory, uncountably many possible outcomes; the measurement tool is the restricting factor. For example, if I were to ask how much each student in the class weighed (in pounds), I would most likely get responses such as 126, 172, and 210. The responses are seemingly discrete. But are they? If I report that I weigh 118 pounds, am I exactly 118 pounds? Probably not; I'm perhaps 118.0120980335927... pounds. It's just that the scale that I get on in the morning tells me that I weigh 118 pounds. Again, the measurement tool is the restricting factor, and that is something you always have to think about when trying to distinguish between discrete and continuous data.
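
The effect of the measurement tool can be sketched in a few lines of Python. This is only an illustration: the function `scale_reading` and its `resolution_lb` parameter are hypothetical names, and the sketch assumes the scale simply rounds to the nearest multiple of its resolution.

```python
def scale_reading(true_weight_lb, resolution_lb=1.0):
    """Return the weight a scale would display.

    Assumption: the scale rounds the true (continuous) weight to the
    nearest multiple of its resolution.
    """
    return round(true_weight_lb / resolution_lb) * resolution_lb


# The underlying weight is continuous, but the reported value is not.
true_weight = 118.0120980335927
print(scale_reading(true_weight))                     # scale with 1 lb resolution
print(scale_reading(true_weight, resolution_lb=0.5))  # scale with 0.5 lb resolution
```

Both calls report 118.0 even though the underlying weight is not exactly 118 pounds. The reported values look discrete only because the scale has finite resolution.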

Categorical Data
Qualitative data are called categorical if the sample space contains objects that are grouped or categorized based on some qualitative trait. When there are only two such groups or categories, the data are considered binary.

My first research question yields binary data because its sample space is:

\(\mathbf{S} = \{\text{yes}, \text{ no}\}\)

Two other examples of categorical data are eye color (brown, blue, hazel, and so on) and semester standing (freshman, sophomore, junior and senior).
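
Categorical data are typically summarized by counting how often each category occurs. As a minimal sketch, the replies from Example 1-1 can be tallied in Python; the variable names here are chosen only for illustration.

```python
from collections import Counter

# The 20 replies from Example 1-1, in the order they were given.
replies = (
    "yes yes yes no no no yes yes yes yes "
    "yes no no yes yes no yes yes yes yes"
).split()

counts = Counter(replies)   # number of replies in each category
n = len(replies)            # sample size
proportions = {category: count / n for category, count in counts.items()}

for category in ("yes", "no"):
    print(f"{category}: {counts[category]} of {n} ({proportions[category]:.0%})")
```

In this sample, 14 of the 20 students (70%) answered "yes" and 6 (30%) answered "no".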