21.2 - Creating Multiple Observations From a Single Record

In the next three sections, we'll pull in a number of different tools that we've learned throughout the course — as well as add a few new ones — in order to read raw data files that contain data values for multiple observations in just one record. We'll just introduce the situations here and then investigate each more fully over the next three sections.

Reading Repeating Blocks of Data Section

First, we'll learn how to read raw data files in which each record contains a repeating block of values in which each block in the record represents a separate observation. For example, we'll learn how to read this data file:

Month	Av High	Av Low	Month	Av High	Av Low	Month	Av High	Av Low
Jan	32	16	Feb	35	18	Mar	46	26
Apr	58	37	May	68	47	Jun	78	56
Jul	82	60	Aug	80	58	Sep	72	51
Oct	61	40	Nov	48	32	Dec	37	22

in which each record contains three blocks of data values — the month and the average high and low temperature for that month in State College, PA. When all is said and done, we will have transformed the input raw data file into a SAS data set that looks like this:

Month	Avg High	Avg Low
Jan	32	16
Feb	35	18
Mar	46	26
Apr	58	37
May	68	47
Jun	78	56
Jul	82	60
Aug	80	58
Sep	72	51
Oct	61	40
Nov	48	32
Dec	37	22

Reading the Same Number of Repeating Fields Section

Then, we'll learn how to read raw data files in which each record contains an ID field followed by an equal number of repeating fields that contribute values to separate observations. For example, we'll learn how to read this data file:

id	exam 1	exam 2	exam 3
111000234	79	82	100
922232573	87	89	95
252359873	65	72	73
205804679	92	95	99

in which each record contains a nine-digit student ID number followed by three exam scores. When all is said and done, we will have transformed the input raw data file into a SAS data set that looks like this:

id	exam	score
111000234	1	79
111000234	2	82
111000234	3	100
922232573	1	87
922232573	2	89
922232573	3	95
252359873	1	65
252359873	2	72
252359873	3	73
205804679	1	92
205804679	2	95
205804679	3	99

Reading a Varying Number of Repeating Fields Section

Finally, we'll learn how to read raw data files in which each record contains an ID field followed by a varying number of repeating fields that contribute values to separate observations. For example, we'll learn how to read this raw data file:

id	weight 1	weight 2	weight 3	weight 4	weight 5
1001	179	172	169
1002	250	249
1003	190	196	195	164	158
1004	232	224	219	212	208
1005	211	208	204	202

in which each record contains a four-digit subject ID number followed by the monthly weights (in pounds) of the subjects. Because some of the subjects dropped out of the diet program in which they were participating, the data file does not contain an equal number of weights in each record. When all is said and done, we will have transformed the input raw data file into a SAS data set that looks like this:

id	weigh-in	weight
1001	1	179
1001	2	172
1001	3	169
1002	1	250
1002	2	249
1003	1	190
1003	2	196
1003	3	195
1004	1	232
1004	2	224
1004	3	219
1004	4	212
1004	5	208
1005	1	211
1005	2	208
1005	3	204
1005	4	202