Once we determine that a variable is qualitative (or categorical), we need tools to summarize the data. Two such tools are frequencies and graphs.
Let’s start with an example. In a class of 30 students, a survey question asked the students to indicate their eye color. The responses are shown in the table.
Hazel | Brown | Brown | Brown | Blue | Brown |
Brown | Brown | Brown | Brown | Brown | Green |
Brown | Brown | Brown | Brown | Brown | Brown |
Blue | Brown | Brown | Brown | Hazel | Blue |
Brown | Brown | Brown | Brown | Brown | Brown |
From this list, we can clearly see that brown is the most common eye color. Which is more frequent, hazel or green? It may take only a few seconds to answer that question, but what if there were 100 students? Or 1000? The best way to summarize categorical data is to use frequencies and percentages (or proportions).
- Proportion: A proportion is the fraction (or part) of the total that possesses a certain characteristic. Multiplying a proportion by 100 gives the corresponding percentage.
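As a worked instance of this definition, consider the brown-eyed students in the example: 24 of the 30 responses are Brown. The symbols here are introduced only for illustration and are not part of the original text: $f$ is the count in a category, $n$ is the total number of observations, and $\hat{p}$ is the proportion.

$$\hat{p} = \frac{f}{n} = \frac{24}{30} = 0.80 = 80\%$$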
For the eye color data, the frequencies and percentages are shown in the table.
Eye Color | Frequency | Percentage |
---|---|---|
Brown | 24 | 80% |
Blue | 3 | 10% |
Hazel | 2 | 6.67% |
Green | 1 | 3.33% |
The table is much easier to read than the raw data. It is easy to see that more students in the class have hazel eyes than green eyes.
As the saying goes, “A picture is worth 1000 words,” so it is also helpful to visualize the data in a graph.
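For larger data sets, counting by hand quickly becomes impractical. The following is a minimal sketch, in Python, of how a frequency table like the one above could be built from the raw responses. The names `responses` and `frequency_table` are hypothetical, chosen for this illustration, and the percentages are rounded to two decimal places for display.

```python
from collections import Counter

# Eye colors of the 30 students, entered row by row from the survey responses.
responses = (
    ["Hazel", "Brown", "Brown", "Brown", "Blue", "Brown"]
    + ["Brown", "Brown", "Brown", "Brown", "Brown", "Green"]
    + ["Brown"] * 6
    + ["Blue", "Brown", "Brown", "Brown", "Hazel", "Blue"]
    + ["Brown"] * 6
)


def frequency_table(values):
    """Return (category, frequency, percentage) rows, most frequent first."""
    total = len(values)
    counts = Counter(values)  # tallies how often each category occurs
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]


table = frequency_table(responses)

# Print the table with a header row and aligned columns.
print(f"{'Eye Color':<10}{'Frequency':>10}{'Percentage':>12}")
for category, frequency, percentage in table:
    print(f"{category:<10}{frequency:>10}{percentage:>11.2f}%")
```

The same function works unchanged for 100 or 1000 responses, which is where a frequency table pays off most.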