9.2 - The INVALUE Statement

The INVALUE statement in the FORMAT procedure allows you to create your own customized informats for character variables. That is, it allows you to tell SAS how you'd like the program to read in special character values. In doing so, SAS effectively translates the values of a character variable into different, typically more meaningful character or numeric values. For example, the following INVALUE statement:

INVALUE $french 'OUI'= 'YES' 
                'NON'= 'NO'; 

prepares SAS to translate a character variable in French to a character variable in English.

Restrictions on the INVALUE statement include:

  • You can only translate a character variable to another variable. You cannot translate a numeric variable using the INVALUE statement.
  • The name of an informat that creates character values, like the ones in this section, must begin with a $ sign.
  • The name of the informat (for example, french) must be a valid SAS name with no more than 30 additional characters following the required $ sign. The name cannot end in a number, nor can it be the name of a standard SAS informat.
  • When you refer to the informat later, you must follow the name with a period.

The INVALUE statement in the FORMAT procedure merely defines an informat so that it is available for use. In order for the informat to take effect, you must associate the character variable with the informat either explicitly in the INPUT statement:

INPUT resp $french.; 

or in an INFORMAT statement:

INFORMAT resp $french.; 
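
To see the whole sequence in one place, here is a minimal sketch that defines $french and then applies it while reading a few inline records. The data set name survey, the LENGTH statement, and the DATALINES records are assumptions made up for illustration; only the informat itself comes from the text above.

```sas
PROC FORMAT;
  invalue $french 'OUI' = 'YES'
                  'NON' = 'NO';
RUN;

DATA survey;                 /* hypothetical data set name */
  length resp $ 3;           /* room for the longest translated value, 'YES' */
  input resp $french3.;      /* read 3 columns and translate them on input */
  datalines;
OUI
NON
OUI
;
RUN;

PROC PRINT data=survey;
RUN;
```

If the sketch behaves as intended, resp is stored as YES, NO, and YES rather than as the French values.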

Let's take a look at an example!

Example 9.3

The following SAS code illustrates the use of the FORMAT procedure to define how SAS should translate the two character variables sex and race during input:

PROC FORMAT;
  invalue $insex '1' = 'M'
                 '2' = 'F';

  invalue $inrace '1' = 'Indian'
                  '2' = 'Asian'
                  '3' = 'Black'
                  '4' = 'White';
RUN;

Because the INVALUE statement is used, the translation is restricted to taking place on input. As a result of this code, providing the character variable sex is later associated with the informat $insex, whenever SAS encounters the character value '1' for the variable sex it will instead store the character value 'M'. Similarly, whenever SAS encounters the character value '2' for the variable sex it will instead store the character value 'F'.
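
One quick way to confirm the translation, without reading any file, is to pass a literal value through each informat with the INPUT function. This is only a sketch: it assumes the PROC FORMAT step above has already been run in the current session, and the test values '2' and '3' are chosen arbitrarily.

```sas
DATA _null_;
  length sex $ 1 race $ 6;
  sex  = input('2', $insex1.);    /* look up the raw value '2' */
  race = input('3', $inrace1.);   /* look up the raw value '3' */
  put sex= race=;                 /* write both results to the log */
RUN;
```

With the definitions above, the log should show sex=F and race=Black.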

Launch and run the SAS program. The only way you'll know whether anything happened is by checking the log window, where you should see a note for each of the two informats reporting that it has been created.

As we'll learn later in this lesson, to keep the definitions for reading in sex and race beyond our current work session, we'd need to add a LIBRARY= option to the PROC FORMAT statement. Since none appears here, the informats defined in this FORMAT procedure are temporary. That is, they are not stored beyond your current SAS session.

All we've done so far is define the informats so that they are available for use. Now let's use them!

Example 9.4

The following data step uses the informats that we defined in the previous example to read in a subset of the data from the input raw data file back.dat:

DATA temp1;
  infile 'C:\simon\icdb\data\back.dat';
  length sex $ 1 race $ 6;
  input subj 1-6 @17 sex $insex1. @19 race $inrace1.;
RUN;

PROC CONTENTS data=temp1;
  title 'Output Dataset: TEMP1';
RUN;

PROC PRINT data=temp1;
  var subj sex race;
RUN;

Output Dataset: TEMP1

The CONTENTS Procedure

Data Set Name        WORK.TEMP1                      Observations          10
Member Type          DATA                            Variables             3
Engine               V9                              Indexes               0
Created              Wed, Nov 05, 2008 11:06:38 AM   Observation Length    16
Last Modified        Wed, Nov 05, 2008 11:06:38 AM   Deleted Observations  0
Protection                                           Compressed            NO
Data Set Type                                        Sorted                NO
Label
Data Representation  WINDOWS_32
Encoding             wlatin1 Western (Windows)

Engine/Host Dependent Information

Data Set Page Size          4096
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            252
Obs in First Data Page      10
Number of Data Set Repairs  0
File Name                   C:\DOCUME~1\LAURAJ~1\LOCALS~1\TEMP\SAS TEMPORARY FILES\_TD3812\temp1.sas7bdat
Release Created             9.010M3
Host Created                XP_PRO

Alphabetic List of Variables and Attributes

#  Variable  Type  Len
2  race      Char  6
1  sex       Char  1
3  subj      Num   8

Output Dataset: TEMP1

Obs  subj    sex  race
  1  110051  F    White
  2  110088  F    White
  3  210012  F    White
  4  220004  F    White
  5  230006  F    White
  6  310083  M    Asian
  7  410012  F    White
  8  420037  F    White
  9  510027  F    White
 10  520017  F    White

Only a subset of the variables in the back.dat data file is read. Column numbers ("1-6") are used to read the variable subj, and absolute pointer controls are used to read the variables sex ("@17") and race ("@19") from the file. Note that:

  • Because we want to translate the variables, we must read sex and race as character variables, even though their raw values are digits.
  • On input, we have the option of specifying the width of the field being read, that is, how many columns SAS reads. The width goes in the informat reference, between the informat name and the period. For example, $inrace1. tells SAS to read one column for the variable race.
  • The LENGTH statement defines the length of sex and race after translation.

Launch the SAS program. Then, edit the INFILE statement so that it reflects the location of your stored back.dat file. Then, run the SAS program and review the output from the CONTENTS and PRINT procedures. In particular, note that the variables sex and race are both character variables, as indicated by "Char" appearing under the Type column in the output from the CONTENTS procedure. Also, note that the CONTENTS procedure gives no indication that the variables sex and race are formatted in any particular way for output. We'd have to take care of that by using a VALUE statement (as opposed to an INVALUE statement)!
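
To make the contrast concrete, here is a hedged sketch of what an output format for sex might look like. The format name $sexfmt and the labels 'Male' and 'Female' are hypothetical, chosen only for illustration, and the code assumes that the data set temp1 from this example already exists.

```sas
PROC FORMAT;
  value $sexfmt 'M' = 'Male'      /* $sexfmt is a hypothetical format name */
                'F' = 'Female';
RUN;

PROC PRINT data=temp1;
  var subj sex race;
  format sex $sexfmt.;            /* changes the display only; stored values stay M and F */
RUN;
```

The informat changes what is stored when the data are read, while a format like this one changes only how the stored values are displayed.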

Finally, as a little sidebar, recall that the TITLE statement is a toggle statement. That is, its value remains in effect until it is changed with another TITLE statement. Therefore, the title in the PRINT procedure is the same as the one used in the CONTENTS procedure.
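
If you'd rather give the PRINT output its own title, or no title at all, a sketch along these lines should work. The title text is made up for illustration, and the code assumes that temp1 already exists.

```sas
PROC PRINT data=temp1;
  title 'Listing of TEMP1';   /* hypothetical text; replaces the earlier title */
  var subj sex race;
RUN;

title;                        /* a TITLE statement with no text cancels all titles */
```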