All of the customized informat and format definitions in this lesson thus far have been stored only temporarily. That is, the informats and formats are valid only for the duration of the SAS session in which they are defined. If you wanted to use the informats or formats again in a different SAS program, you would have to create them again using another FORMAT procedure. If you plan to use a customized informat or format repeatedly, you can store it permanently in a "formats catalog" by using the LIBRARY= option in the PROC FORMAT statement. Basically, the LIBRARY= option tells SAS where the formats catalog is (to be) stored. You tell SAS the library (which again you can think of as a directory or location) by using a LIBNAME statement:
LIBNAME libref 'c:\directory\where\formats\stored';
where libref is technically a name of your choosing. Note, though, that when a user-defined informat or format is called by a DATA or PROC step, SAS first looks in a temporary catalog named work.formats. (Recall that "work" is what SAS always treats as your temporary working library that goes away at the end of your SAS session.) If SAS does not find the format or informat in the temporary catalog, it then by default looks in a permanent catalog called library.formats. So, while, yes, libref is technically a name of your choosing, it behooves you to call it library, since that is the only permanent library SAS searches by default. That's why SAS recommends, but does not require, that you use the word library as the libref when creating permanent formats.
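If you want to check the search order in your own session, one possibility is to ask SAS to report the FMTSEARCH= system option, which holds the list of catalogs searched for formats and informats. This is only a small sketch; the option is not otherwise covered in this lesson.

```sas
/* Write the current value of the FMTSEARCH= system option to the log.
   The value lists the libraries or catalogs SAS searches, in order,
   for user-defined formats and informats. */
PROC OPTIONS option=fmtsearch;
RUN;
```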
To make this blather a bit more concrete, suppose we have the following LIBNAME statement in our SAS program:
LIBNAME library 'C:\YourDriveName\Stat480WCDEV\format\sasndata\';
and have a format procedure that starts with:
PROC FORMAT library=library;
Then, upon running the program, SAS creates a permanent catalog containing all of the formats and informats that are defined in the FORMAT procedure and stores it in the folder referenced above, as a file named formats.sas7bcat.
A formats catalog, regardless of whether it is temporary (work.formats) or permanent (library.formats), contains one entry for each format or informat defined in a FORMAT procedure. Because library.formats is the reserved name for permanent formats catalogs, you can create only one catalog called formats per SAS library (directory). There are ways around this restriction, but let's not get into that now. Let's jump to an example instead.
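To see the entries that a formats catalog holds, two common tools are the FMTLIB option of the FORMAT procedure and the CONTENTS statement of the CATALOG procedure. The following is a minimal sketch, assuming that the libref library has already been assigned with a LIBNAME statement and that a formats catalog already exists there:

```sas
/* Assumes a LIBNAME statement for the libref library has already been
   submitted and that the catalog library.formats exists. */

/* Print the definition (ranges and labels) of every format and
   informat stored in library.formats. */
PROC FORMAT library=library fmtlib;
RUN;

/* List the entries in the catalog, one per format or informat. */
PROC CATALOG catalog=library.formats;
   contents;
RUN;
QUIT;
```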
Example 9.9
The following SAS program illustrates a FORMAT procedure that creates a permanent formats catalog in the directory referenced by library, that is, in C:\yourdrivename\icdb\data:
LIBNAME library 'C:\yourdrivename\icdb\data';
PROC FORMAT library=library;
   value sex2fmt  1 = 'Male'
                  2 = 'Female';
   value race2fmt 3 = 'Black'
                  4 = 'White'
                  OTHER = 'Other';
RUN;
DATA temp4;
   infile 'c:\yourdrivename\icdb\data\back.dat';
   input subj 1-6 sex 17 race 19;
   format sex sex2fmt. race race2fmt.;
RUN;
PROC CONTENTS data=temp4;
   title 'Output Dataset: TEMP4';
RUN;
PROC PRINT data=temp4;
RUN;
Data Set Name | WORK.TEMP4 | Observations | 10 |
---|---|---|---|
Member Type | DATA | Variables | 3 |
Engine | V9 | Indexes | 0 |
Created | Wed, Nov 05, 2023 12:05:29 PM | Observation Length | 24 |
Last Modified | Wed, Nov 05, 2023 12:05:29 PM | Deleted Observations | 0 |
Protection | | Compressed | NO |
Data Set Type | | Sorted | NO |
Label | | | |
Data Representation | WINDOWS_32 | | |
Encoding | wlatin1 Western (Windows) | | |

Data Set Page Size | 4096 |
---|---|
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 168 |
Obs in First Data Page | 10 |
Number of Data Set Repairs | 0 |
File Name | C:\DOCUME~1\Yourdrivename~1\LOCALS~1\TEMP\SAS TEMPORARY FILES\_TD3812\temp4.sas7bdat |
Release Created | 9.010M3 |
Host Created | XP_PRO |

# | Variable | Type | Len | Format |
---|---|---|---|---|
3 | race | Num | 8 | RACE2FMT. |
2 | sex | Num | 8 | SEX2FMT. |
1 | subj | Num | 8 | |

Obs | subj | sex | race |
---|---|---|---|
1 | 110051 | Female | White |
2 | 110088 | Female | White |
3 | 210012 | Female | White |
4 | 220004 | Female | White |
5 | 230006 | Female | White |
6 | 310083 | Male | Other |
7 | 410012 | Female | White |
8 | 420037 | Female | White |
9 | 510027 | Female | White |
10 | 520017 | Female | White |
The DATA step creates a temporary data set called temp4 by reading in the variables subj, sex, and race from the raw data file back.dat, and associates the variables sex and race with the formats sex2fmt and race2fmt, respectively, that are defined in the FORMAT procedure. SAS first looks for these two formats in the temporary catalog work.formats. When it doesn't find them there, it looks in the permanent catalog library.formats, which is stored in the c:\yourdrivename\icdb\data directory.
Launch the SAS program, and edit the INFILE statement so it reflects the location of your back.dat file. Also edit the LIBNAME statement so it reflects your desired location for the permanent formats catalog. Then, run the program and review the output from the CONTENTS and PRINT procedures to convince yourself that the variables sex and race are associated with the permanent formats sex2fmt and race2fmt, not the temporary formats sexfmt and racefmt previously associated with f_sex and f_race. Also, view the directory referenced in your LIBNAME statement to convince yourself that SAS created and stored a permanent formats catalog there.
Just a few more comments on this permanent formats stuff. One of the problems with permanent informats and formats is that once a variable has been permanently associated with an informat or format, SAS must be able to reach the library that holds the formats catalog. As long as the formats catalog exists, and you have permission to read the file, you just have to specify the appropriate LIBNAME statement:
LIBNAME library 'c:\stat480\formats\';
to access the catalog. If for some reason, you do not have access to the formats catalog, SAS will give you an error that looks something like this:
ERROR: Format SEX2FMT not found or couldn't be loaded for variable sex.
ERROR: Format RACE2FMT not found or couldn't be loaded for variable race.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
If you specify the NOFMTERR option in an OPTIONS statement:
OPTIONS NOFMTERR;
you can use the SAS data sets without getting errors. SAS will just display a note (not a program-halting error!) in the log file:
8870 format sex sex2fmt. race race2fmt. ;
-------
484
NOTE: 484-185: Format SEX2FMT was not found or could not be loaded.
8870! format sex sex2fmt. race race2fmt. ;
--------
484
NOTE: 484-185: Format RACE2FMT was not found or could not be loaded.
You will be able to run SAS programs that use the data sets containing the permanent formats. You will just not have access to the formats.
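Putting these pieces together, a later session that cannot reach the formats catalog might look something like the following sketch. The libref mydata, its path, and the data set name back4 are hypothetical; they stand in for wherever a permanent data set carrying the sex2fmt and race2fmt associations happens to be stored.

```sas
/* Hypothetical location of a permanent data set whose variables
   carry the sex2fmt. and race2fmt. format associations. */
LIBNAME mydata 'C:\yourdrivename\icdb\data';

/* Tell SAS to carry on with a default format, rather than stop
   with an error, when a format cannot be found. */
OPTIONS NOFMTERR;

/* The stored values of sex and race are printed as plain numbers. */
PROC PRINT data=mydata.back4;
RUN;

/* A FORMAT statement that names variables but no format removes
   the format associations for the duration of this step. */
PROC PRINT data=mydata.back4;
   format sex race;
RUN;

/* Restore the default behavior (FMTERR) afterwards. */
OPTIONS FMTERR;
```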
Example 9.10
Rather than creating a permanent formats catalog, you can create a SAS program file that contains only a FORMAT procedure with the desired value and invalue statements. Then you need merely include this secondary program file in your main SAS program using the %INCLUDE statement, as illustrated here:
%INCLUDE 'C:\yourdrivename\icdb\formats\backfmt.sas';
PROC FREQ data=back;
   title 'Frequency Count of STATE (statefmt)';
   format state statefmt.;
   table state/missing;
RUN;
state | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Missing | 2 | 20.00 | 2 | 20.00 |
Ind | 1 | 10.00 | 3 | 30.00 |
Mass | 1 | 10.00 | 4 | 40.00 |
Mich | 2 | 20.00 | 6 | 60.00 |
Minn | 1 | 10.00 | 7 | 70.00 |
Other | 1 | 10.00 | 8 | 80.00 |
Tenn | 1 | 10.00 | 9 | 90.00 |
Wisc | 1 | 10.00 | 10 | 100.00 |
To clarify, here's the only thing contained in the backfmt.sas file:
PROC FORMAT;
   value statefmt 14 = 'Ind'
                  21 = 'Mass'
                  22 = 'Mich'
                  23 = 'Minn'
                  42 = 'Tenn'
                  49 = 'Wisc'
                  . = 'Missing'
                  Other = 'Other';
Since the FORMAT procedure in the backfmt.sas file does not refer to a permanent library, the format statefmt is stored in the temporary work.formats catalog.
To run this program, first download and save the backfmt.sas file to a convenient location on your computer. Then, launch the SAS program and edit the %INCLUDE statement so it reflects the location of your backfmt.sas file. Finally, run the program and review the output from the FREQ procedure. Convince yourself that the FORMAT statement in the FREQ procedure appropriately associates the state variable with the statefmt format created by the FORMAT procedure in backfmt.sas. You may also note the effect of the MISSING option in the TABLE statement of the FREQ procedure: it tells SAS to treat missing values as a countable category, so they are included in the percentages and cumulative statistics.
The technique illustrated in this example is particularly useful when you work in an open environment in which data sets are shared. Some users may not have access to a permanent formats catalog, and others may prefer to use different formats.
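As a small variation on the %INCLUDE technique, you can point a fileref at the format definitions file and include it by name. In the sketch below, the fileref fmtdefs is a hypothetical name of our choosing, and the SOURCE2 option simply asks SAS to echo the included statements to the log so you can see exactly which definitions were run.

```sas
/* Hypothetical fileref pointing at the format definitions file. */
FILENAME fmtdefs 'C:\yourdrivename\icdb\formats\backfmt.sas';

/* Include the file; SOURCE2 writes the included statements to the log. */
%INCLUDE fmtdefs / source2;

/* The statefmt format is now available in work.formats for this session. */
PROC FREQ data=back;
   format state statefmt.;
   table state/missing;
RUN;
```

Keeping the path in one FILENAME statement means that each user needs to edit only that one line to point at their own copy of the file.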