14.4 - The RENAME= option

There may be occasions in which you want to change some of the variable names in your SAS data set. To do so, you'll want to use the RENAME= option. As its name suggests, the RENAME= option allows you to change the variable names within a SAS data set.

The format of the RENAME= option is:

RENAME = (old1=new1 old2=new2 ... oldk=newk)

where old1, old2, ... oldk are the variable names as they appear in the data set that precedes the RENAME= option, and new1, new2, ..., newk are the corresponding new variable names.

The effect of the RENAME= option depends on where it appears:

  • If the RENAME= option appears in the SET statement, then the new variable name takes effect when the program data vector is created. Therefore, all programming statements within the DATA step must refer to the new variable name.
  • If the RENAME= option appears in the DATA statement, then the new variable name takes effect only when the data are written to the SAS data set. Therefore, all programming statements within the DATA step must refer to the old variable name.
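The two rules can be seen together in one small sketch. It assumes the back3 data set used throughout this lesson, with variables v_date and b_date; the output data set name demo and the new name age_yrs are made up for illustration only:

DATA demo (rename=(age=age_yrs));
	set back3 (rename=(b_date=birth));
	age = (v_date - birth)/365;   *birth is the NEW input name and age is the OLD output name;
RUN;

Inside the DATA step, the date of birth is known only as birth (renamed on the way in), while the computed variable is known only as age (renamed on the way out). The data set demo ends up containing age_yrs rather than age.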

Example 14.12

The following program illustrates the use of the RENAME= option in the SET statement. Specifically, the variable sex is changed to gender, and b_date is changed to birth when the program data vector is created:

DATA back7 (keep = subj gender v_date birth age);
	set back3 (rename=(sex=gender b_date=birth));
	age = (v_date - birth)/365;   *MUST use NEW name for date of birth;
RUN;
 
PROC PRINT data=back7;
	title 'Output Dataset: BACK7';
RUN;

 

Output Dataset: BACK7

Obs     subj     v_date      birth     gender      age
  1    110051   01/25/94    12/02/42      2      51.1836
  2    110052   01/27/94    01/04/25      2      69.1096
  3    110053   02/22/94    03/15/22      2      71.9918
  4    110055   03/15/94    03/31/41      2      52.9918
  5    110057   03/15/94    07/10/44      2      49.7123
  6    110058   03/18/94    09/09/50      2      43.5507
  7    110059   03/18/94    07/25/34      2      59.6877
  8    110060   06/14/94    05/29/36      2      58.0822
  9    110062   03/31/94    04/21/36      2      57.9808
 10    110065   04/04/94    10/12/52      2      41.5041
 11    110066   04/12/94    08/28/62      2      31.6438
 12    110067   04/26/94    02/22/72      2      22.1890
 13    110068   06/13/94    09/10/55      2      38.7836
 14    110069   05/31/94    08/17/38      2      55.8247

Because the RENAME= option appears in the SET statement, the names sex and b_date no longer exist once the program data vector is created. SAS knows those variables only as gender and birth. Hence, when we subsequently calculate the subjects' ages (age) in the DATA step, we must refer to the new variable name birth. For the same reason, the KEEP= option in the DATA statement lists the new names gender and birth.

Again, pay particular attention to the syntax of the RENAME= option ... it too can be tricky. The entire RENAME= option must be contained in parentheses immediately following the data set to which you want the name changes to apply. The variable names must also be placed in parentheses. So, in general, the syntax, when applied to a DATA statement, should look like this:

DATA dsname (RENAME = (o1=n1 o2=n2 ...));

where dsname is the data set name and o1 and o2 are the old variable names, and n1 and n2 are the new variable names.
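As a quick check of the parenthesis rule, here is a small self-contained sketch. The data set names old and new and the variable names a, b, x and y are placeholders invented for this illustration, not part of the course data:

DATA old;
	a = 1;
	b = 2;
RUN;

DATA new (rename = (a=x b=y));   /* outer ( ) hold the option, inner ( ) hold the old=new pairs */
	set old;
RUN;

Dropping the inner pair of parentheses, as in (rename = a=x b=y), is a common slip and should produce a syntax error in the log rather than renamed variables.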

Launch and run the SAS program. Review the output from the PRINT procedure. Convince yourself that the variable names sex and b_date have been changed as advertised to gender and birth, respectively. Also, verify that the ages of the subjects have been calculated appropriately. Then, in the SAS program, change the variable name birth back to the variable name b_date, and re-run the program. Does SAS indeed hiccup?
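For reference, the modified step might look like the sketch below, assuming the change is made in the assignment statement (back7b is a hypothetical output name chosen so that back7 is not overwritten):

DATA back7b (keep = subj gender v_date birth age);
	set back3 (rename=(sex=gender b_date=birth));
	age = (v_date - b_date)/365;   *b_date is no longer in the program data vector;
RUN;

Under the usual DATA step rules, one would expect SAS not to stop with an error here. Instead, it should treat b_date as a brand-new numeric variable with missing values, so that age is missing for every subject and the log carries a note that the variable b_date is uninitialized. Check your own log to confirm.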

Example 14.13

The following program illustrates the use of the RENAME= option when it appears in the DATA statement. Specifically, the variable sex is changed to gender, and b_date is changed to birth, when SAS writes the data to the output data set:

DATA back8 (rename=(sex=gender b_date=birth)
		keep = subj sex v_date b_date age);
	set back3;
	age = (v_date - b_date)/365;      *MUST use OLD name for date of birth;
RUN;
 
PROC PRINT data=back8;
	title 'Output Dataset: BACK8';
RUN;

Output Dataset: BACK8

Obs     subj     v_date      birth     gender      age
  1    110051   01/25/94    12/02/42      2      51.1836
  2    110052   01/27/94    01/04/25      2      69.1096
  3    110053   02/22/94    03/15/22      2      71.9918
  4    110055   03/15/94    03/31/41      2      52.9918
  5    110057   03/15/94    07/10/44      2      49.7123
  6    110058   03/18/94    09/09/50      2      43.5507
  7    110059   03/18/94    07/25/34      2      59.6877
  8    110060   06/14/94    05/29/36      2      58.0822
  9    110062   03/31/94    04/21/36      2      57.9808
 10    110065   04/04/94    10/12/52      2      41.5041
 11    110066   04/12/94    08/28/62      2      31.6438
 12    110067   04/26/94    02/22/72      2      22.1890
 13    110068   06/13/94    09/10/55      2      38.7836
 14    110069   05/31/94    08/17/38      2      55.8247

Because the RENAME= option appears in the DATA statement, SAS recognizes only the variable names as they appear in the input data set back3 while the DATA step executes. That is, for example, SAS recognizes the variable name b_date as the birth date of the subjects. Hence, when we subsequently calculate the subjects' ages in the DATA step, we must refer to the old variable name b_date. Also, note that the KEEP= option in the DATA statement must refer to the original (pre-rename) variable names sex and b_date, because SAS applies the KEEP= option before the RENAME= option.

This program also illustrates how to use more than one data set option at a time. Specifically, the RENAME= and KEEP= options are used to modify the back8 data set. As such, both options are placed within one set of parentheses immediately following the data set to which you want the changes to apply. Then, within those parentheses, the basic syntax for each option is followed.
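The same pattern works on the input side. The sketch below, which assumes the back3 data set from the examples above and uses back9 as a hypothetical output name, places both KEEP= and RENAME= in one set of parentheses in the SET statement:

DATA back9;
	set back3 (keep = subj v_date b_date
	           rename = (b_date=birth));
	age = (v_date - birth)/365;   *NEW name here because the rename happened in the SET statement;
RUN;

Note that KEEP= still lists the old name b_date, since SAS applies KEEP= before RENAME=, while the assignment statement uses the new name birth.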

Launch and run the SAS program, and review the output from the PRINT procedure. Convince yourself that the variable names sex and b_date have been changed as advertised to gender and birth, respectively. Also, verify that the ages of the subjects have been calculated appropriately.