At first glance, **one-to-one merging** appears to be the same as
one-to-one reading, since it too combines two or more SAS data sets, one "to
the right" of the other into a single "fat" data set. That is, just like
one-to-one reading, one-to-one merging combines observations from two or more
data sets into a single observation in a new data set. There is just one
primary difference though — SAS continues to merge observations until it has
read *all* of the observations from *all* of the data sets. For
example, suppose again that our *patients* data set contains three
variables: patient ID number (`ID`), gender (`Sex`), and age of
the patient (`Age`):

ID | Sex | Age |
---|---|---|

1157 | F | 33 |

2395 | F | 48 |

1098 | M | 39 |

4829 | F | 24 |

3456 | M | 30 |

5920 | M | 41 |

1493 | F | 42 |

and our *scale* data set contains three variables:
`ID` (number),` Height` (in inches) and `Weight` (in
pounds):

ID | Height | Weight |
---|---|---|

1157 | 65 | 122 |

2395 | 64 | 130 |

1098 | 70 | 178 |

4829 | 67 | 142 |

3456 | 72 | 190 |

5920 | 71 | 188 |

Then, when we one-to-one merge the two data sets, we get a data set, called
say *one2onemerge*, that looks like this:

ID | Sex | Age | Height | Weight |
---|---|---|---|---|

157 | F | 33 | 65 | 122 |

2395 | F | 48 | 64 | 130 |

1098 | M | 39 | 70 | 178 |

4829 | F | 24 | 67 | 142 |

3456 | M | 30 | 72 | 190 |

5920 | M | 41 | 71 | 188 |

1493 | F | 42 | . | . |

in which the second data set gets placed to the "right" of the first data set.
Again, the observations are
**combined based on their relative position** in the data set.
The first observation of *patient* is combined with the first
observation of *scale* to create the first observation in
*one2onemerge*; the second observation of *patient* is combined
with the second observation of *scale* to create the second observation
in *one2onemerge*; and so on. When SAS performs a one-to-one merge, the
DATA step continues to read observations until the last observation is read
from the largest data set. That's why the *one2onemerge* data set has
one more observation than the *one2oneread* data set. In general, the
number of observations in a data set created by a one-to-one merge always
equals the numbers of observations in the largest data set named for
one-to-one merging.

##
Example 15.4
Section* *

The following program uses one-to-one merging to combine the
*patients* data set with the *scale* data set:

```
DATA one2onemerge;
merge patients scale;
RUN;
PROC PRINT NOOBS;
title 'The one2onemerge data set';
RUN;
```

The MERGE statement tells SAS to create a new data set by combining
*patients* and *scale*. Launch and run
* * the SAS program and
review the output from the PRINT procedure to see the results of the
one-to-one merge. You should see that the first observation in
*one2onemerge* contains the first observation of *patients* and
*scale*, the second observation in *one2onemerge* contains the
second observation of *patients* and *scale*, and so on. Since
there are seven observations in *patients* and six observations in
*scale*, the new *one2onemerge *data set contains seven
observations, with missing values for the `Height` and
`Weight `variables in the seventh observation. Note that although
this example only combined two data sets, the MERGE statement can contain any
number of input data sets.

Just as is true for one-to-one reading, if data sets that are being one-to-one merged contain variables that have the same names, the values that are read in from the last data set overwrite the values that were read in from earlier data sets. Let's go back to our contrived example to illustrate this point.

##
Example 15.5
Section* *

The following program uses one-to-one merging to combine the *one* data
set with the *two* data set to create a new data set called
*onetwomerged*:

```
DATA one;
input ID VarA $ VarB $;
DATALINES;
10 A1 B1
20 A2 B2
30 A3 B3
;
RUN;
DATA two;
input ID VarB $ VarC $;
DATALINES;
40 B4 C1
50 B5 C2
;
RUN;
DATA onetwomerged;
merge one two;
RUN;
PROC PRINT data = onetwomerged NOOBS;
title 'The onetwomerged data set';
RUN;
```

Note again that the *one* and *two* data sets share two
variables, namely `ID` and `VarB`. The third (and last) DATA
step tells SAS to combine the two data sets using the one-to-one merging
method. Let's walk our way through how SAS processes the DATA step. At the end
of the compile phase, SAS will have created a program data vector containing
the variables from the *one* and *two* data sets in the order in
which they appear in the DATA step:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

1 | 0 | . | . |

During the first iteration of the DATA step, the MERGE statement reads first
one observation from data set *one*:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

1 | 0 | 10 | A1 | B1 | . |

and then one observation from data set *two*. The values for
`ID` and `VarB` in data set *two* overwrites the values
for `ID` and `VarB` in data set *one*:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

1 | 0 | 40 | A1 | B4 | C1 |

Being at the end of the first iteration of the DATA step, SAS writes the
contents of the program data vector as the first observation in the
*onetwomerged* data set. Upon returning to the top of the DATA step,
the program data vector looks like this at the beginning of the second
iteration of the DATA step:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

2 | 0 | 40 | A1 | B4 | C1 |

Just as is true for a SET statement, SAS retains the values of variables that
were read from a SAS data set with the MERGE statement. Now, the MERGE
statement reads the second observation from the *one* data set:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

2 | 0 | 20 | A2 | B2 | C1 |

and the second observation from the* two* data set. Again, the values
for `ID` and `VarB` in data set *two* overwrites the
values for `ID` and `VarB` in data set *one*:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

2 | 0 | 50 | A2 | B5 | C2 |

Being at the end of the second iteration of the DATA step, SAS writes the
contents of the program data vector as the second observation in the
*onetwomerged* data set. Now this is where things get different!
Because an observation still exists in the *one* data set, SAS returns
to the top of the DATA step. The program data vector looks like this at the
beginning of the third iteration of the DATA step:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

3 | 0 | 50 | A2 | B5 | C2 |

And, the MERGE statement reads the third observation from the
*one* data set:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

3 | 0 | 30 | A3 | B3 | C2 |

SAS attempts to read a third observation from the *two* data set, but
instead encounters an end-of-data set marker. Therefore,
*as is always the case in this kind of situation*, SAS sets the values
of all of that data set's variables in the program data vector to missing:

_N_ | _ERROR_ | Num | VarA | VarB | VarC |
---|---|---|---|---|---|

3 | 0 | 30 | A3 | B |

Being at the end of the third iteration of the DATA step, SAS writes the
contents of the program data vector as the third observation in the
*onetwomerged* SAS data set. Because there are no more observations in
either the *one* or the *two* data set, processing stops. Thank
goodness!

Now, launch and run
* * the SAS program, and
review the output to convince yourself that the *one* and
*two* data sets are combined as described.

**One more closing comment.** One-to-one reading and one-to-one
merging require users to exercise extreme caution when combining two (or more)
data sets based on relative position only. It would just take one of the data
sets to be "shifted" ever so slightly to get really messed up results. It's
for this reason that I personally don't find the one-to-one read or the
one-to-one merge all that practical. The more useful and therefore much more
common merge performed in SAS is what is called
**match-merging**. We'll learn about it in the next lesson.