9.4 - Permanent Formats

All of the customized informat and format definitions in this lesson thus far have been stored only temporarily. That is, the informats and formats are valid only for the duration of the SAS session in which they are defined. If you wanted to use the informats or formats again in a different SAS program, you would have to create them again using another FORMAT procedure. If you plan to use a customized informat or format repeatedly, you can store it permanently in a "formats catalog" by using the LIBRARY= option in the PROC FORMAT statement. Basically, the LIBRARY= option tells SAS where the formats catalog is (to be) stored. You tell SAS the library (which again you can think of as a directory or location) by using a LIBNAME statement:

LIBNAME libref 'c:\directory\where\formats\stored';

where libref is technically a name of your choosing. Note, though, that when a user-defined informat or format is called by a DATA or PROC step, SAS first looks in a temporary catalog named work.formats. (Recall that "work" is what SAS always treats as your temporary working library that goes away at the end of your SAS session.) If SAS does not find the format or informat in the temporary catalog, it then by default looks in a permanent catalog called library.formats. So, while, yes, libref is technically a name of your choosing, it behooves you to call it library, since that is where SAS looks next by default. That's why SAS recommends, but does not require, that you use the word library as the libref when creating permanent formats.

To make this blather a bit more concrete, suppose we have the following LIBNAME statement in our SAS program:

LIBNAME library 'C:\Simon\Stat480WCDEV\08format\sasndata\';

and have a format procedure that starts with:

PROC FORMAT library=library;

Then, upon running the program, SAS creates a permanent catalog containing all of the formats and informats that are defined in the FORMAT procedure and stores it in the folder referenced above. On disk, the catalog appears in that folder as a file named formats.sas7bcat.

A formats catalog, regardless of whether it is temporary (work.formats) or permanent (library.formats), contains one entry for each format or informat defined in a FORMAT procedure. Because library.formats is the reserved name for permanent formats catalogs, you can create only one catalog called formats per SAS library (directory). There are ways around this restriction, but let's not get into that now. Let's jump to an example instead.

Example 9.9

The following SAS program illustrates a FORMAT procedure that creates a permanent formats catalog in the directory referenced by library, that is, in C:\simon\icdb\data:

LIBNAME library 'C:\simon\icdb\data';

/* LIBRARY= stores the formats permanently in library.formats */
PROC FORMAT library=library;
   value sex2fmt 1 = 'Male'
                 2 = 'Female';

   value race2fmt 3 = 'Black'
                  4 = 'White'
                  OTHER = 'Other';
RUN;

DATA temp4;
   infile 'c:\simon\icdb\data\back.dat';
   input subj 1-6 sex 17 race 19;
   /* Associate the variables with the permanent formats */
   format sex sex2fmt. race race2fmt.;
RUN;

PROC CONTENTS data=temp4;
   title 'Output Dataset: TEMP4';
RUN;

PROC PRINT data=temp4;
RUN;

Output Dataset: TEMP4

The CONTENTS Procedure

Data Set Name        WORK.TEMP4                      Observations          10
Member Type          DATA                            Variables             3
Engine               V9                              Indexes               0
Created              Wed, Nov 05, 2008 12:05:29 PM   Observation Length    24
Last Modified        Wed, Nov 05, 2008 12:05:29 PM   Deleted Observations  0
Protection                                           Compressed            NO
Data Set Type                                        Sorted                NO
Data Representation  WINDOWS_32
Encoding             wlatin1 Western (Windows)

Engine/Host Dependent Information

Data Set Page Size          4096
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            168
Obs in First Data Page      10
Number of Data Set Repairs  0
Release Created             9.010M3
Host Created                XP_PRO

Alphabetic List of Variables and Attributes

#  Variable  Type  Len  Format
3  race      Num   8    RACE2FMT.
2  sex       Num   8    SEX2FMT.
1  subj      Num   8

Output Dataset: TEMP4

Obs   subj    sex     race
  1  110051  Female  White
  2  110088  Female  White
  3  210012  Female  White
  4  220004  Female  White
  5  230006  Female  White
  6  310083  Male    Other
  7  410012  Female  White
  8  420037  Female  White
  9  510027  Female  White
 10  520017  Female  White

The DATA step creates a temporary data set called temp4 by reading in the variables subj, sex, and race from the raw data file back.dat, and associates the variables sex and race, respectively, with the formats sex2fmt and race2fmt that are defined in the FORMAT procedure. SAS first looks for these two formats in the temporary catalog work.formats. When it doesn't find them there, it looks for them in the permanent catalog library.formats, which is stored in the c:\simon\icdb\data directory.

Launch the SAS program, and edit the INFILE statement so it reflects the location of your back.dat file. Also edit the LIBNAME statement so it reflects the location where you want the permanent formats catalog to be stored. Then, run the program and review the output from the CONTENTS and PRINT procedures to convince yourself that the variables sex and race are associated with the permanent formats sex2fmt and race2fmt, not the temporary formats sexfmt and racefmt previously associated with f_sex and f_race. Also, view the directory referenced in your LIBNAME statement to convince yourself that SAS created and stored a permanent formats catalog there.
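
If you would rather check from within SAS than by browsing the folder, one option is to list the entries in the catalog. The following is only a minimal sketch. It assumes that the LIBNAME statement from Example 9.9 has already been run in the current session and that the program has created library.formats; the title text is made up for illustration.

```sas
/* Assumes the libref library is already assigned as in Example 9.9 */
PROC CATALOG catalog=library.formats;
   title 'Entries in the permanent formats catalog';   /* illustrative title */
   contents;   /* list each entry in the catalog with its name and type */
RUN;
QUIT;          /* PROC CATALOG remains active until QUIT is submitted */
```

If the catalog was created as described above, the listing should include entries for SEX2FMT and RACE2FMT.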

Just a few more comments on this permanent formats stuff. One of the problems with permanent informats and formats is that once a variable has been associated permanently with an informat or format, SAS must be able to refer to the library to access the formats catalog. As long as the formats catalog exists, and you have permission to read the file, you just have to specify the appropriate LIBNAME statement:

LIBNAME library 'c:\stat480\formats\'; 

to access the catalog. If, for some reason, you do not have access to the formats catalog, SAS will stop with an error in the log saying that the format could not be found or loaded.

If you specify the NOFMTERR option in an OPTIONS statement, you can use the SAS data sets without getting errors. SAS will just write a note (not a program-halting error!) to the log file.

You will be able to run SAS programs that use the data sets containing the permanent formats. You will just not have access to the formats.
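
As a minimal sketch of the option in use, the statements below turn off format errors before printing a data set whose variables carry permanent formats. The libref stat480, its path, the data set name back, and the title are all hypothetical, chosen only for illustration.

```sas
OPTIONS NOFMTERR;   /* a format that cannot be found yields a note, not an error */

/* Hypothetical location of a permanent data set that uses permanent formats */
LIBNAME stat480 'c:\stat480\data\';

PROC PRINT data=stat480.back;   /* hypothetical data set name */
   title 'Printed without access to the formats catalog';
RUN;
```

With the option in effect, the values should simply print unformatted. Submitting OPTIONS FMTERR; restores the default behavior.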

Example 9.10

Rather than creating a permanent formats catalog, you can create a SAS program file which contains only a FORMAT procedure with the desired value and invalue statements. Then you need merely include this secondary program file in your main SAS program using the %INCLUDE statement, as illustrated here:

/* Run the FORMAT procedure stored in the secondary program file */
%INCLUDE 'C:\simon\icdb\formats\backfmt.sas';

PROC FREQ data=back;
   title 'Frequency Count of STATE (statefmt)';
   format state statefmt.;
   table state/missing;
RUN;

Frequency Count of STATE (statefmt)

The FREQ Procedure

state    Frequency  Percent  Cumulative Frequency  Cumulative Percent
Missing          2    20.00                     2               20.00
Ind              1    10.00                     3               30.00
Mass             1    10.00                     4               40.00
Mich             2    20.00                     6               60.00
Minn             1    10.00                     7               70.00
Other            1    10.00                     8               80.00
Tenn             1    10.00                     9               90.00
Wisc             1    10.00                    10              100.00

To make it clear, here's the only thing contained in the backfmt.sas file:

PROC FORMAT;
    value statefmt 14 = 'Ind'
                   21 = 'Mass'
                   22 = 'Mich'
                   23 = 'Minn'
                   42 = 'Tenn'
                   49 = 'Wisc'
                   .  = 'Missing'
                Other = 'Other';
RUN;

Since the FORMAT procedure in the backfmt.sas file does not refer to a permanent library, the format statefmt is stored in the temporary work.formats catalog.

To run this program, first download and save the backfmt.sas file to a convenient location on your computer. Then, launch the SAS program and edit the %INCLUDE statement so it reflects the location of your backfmt.sas file. Finally, run the program and review the output from the FREQ procedure. Convince yourself that the format statement in the FREQ procedure appropriately associates the state variable with the statefmt format created by the FORMAT procedure in backfmt.sas. You may as well also take note of the effect of the MISSING option in the FREQ procedure. Basically, it tells SAS to include missing values as a countable category.
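
One way to see the effect of the MISSING option is to rerun the FREQ procedure without it and compare the two tables. This is just a sketch, assuming that backfmt.sas has already been included in the current session and that the data set back exists, as in the program above; the title text is made up for illustration.

```sas
/* Assumes statefmt is already defined and the data set back exists */
PROC FREQ data=back;
   title 'Frequency Count of STATE (statefmt), without MISSING';
   format state statefmt.;
   table state;   /* same request as before, but no MISSING option */
RUN;
```

Without the option, SAS should leave the missing values out of the table rows and the percentages, and report their count separately beneath the table.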

The technique illustrated in this example is particularly useful when you work in an open environment, in which data sets are shared. Different users may not have access to the format file, or different users may prefer different formats.