I've always thought the FORMAT procedure's PICTURE statement is one of the procedure's coolest features. In short, the PICTURE statement allows you to create templates for printing numbers by defining a format that allows for special characters, such as leading zeros, decimal and comma punctuation, fill characters, prefixes, and negative number representation. Only numeric variables can have picture formats. For a quick example, the following PICTURE statement:
PICTURE phonepix OTHER = '(999)999-9999';
tells SAS to print phone numbers in the specified format.
Restrictions on the PICTURE statement include:
- The name of the picture format (e.g., phonepix) must be a valid SAS name.
- When you use the format later, you must follow the name with a period.
- Picture format options, such as FILL, MULT, PREFIX, and NOEDIT, should be specified in parentheses after each picture in a picture format.
- The maximum length for a picture is 40 characters.
As is true for the INVALUE and VALUE statements, the PICTURE statement in the FORMAT procedure merely defines a picture. In order for the picture to take effect, you must use a FORMAT statement, in either a DATA step or a PROC step, to associate the variable with the picture.
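To make the DATA step versus PROC step distinction concrete, here is a minimal sketch. The data set name demo and its two subject numbers are made up for illustration; only the subjpix picture mirrors one used later in this lesson.

```sas
/* Define the picture; this alone changes nothing in any data set */
PROC FORMAT;
   picture subjpix LOW-HIGH = '00-0000';
RUN;

/* A FORMAT statement in a DATA step stores the association with the data set */
DATA demo;                      /* hypothetical data set name */
   input subj;
   format subj subjpix.;
   datalines;
110051
220004
;
RUN;

/* This step needs no FORMAT statement; it uses the stored association */
PROC PRINT data=demo;
RUN;

/* A FORMAT statement in a PROC step applies to that step only */
PROC PRINT data=demo;
   format subj 6.;
RUN;
```

The last step should print subj as a plain six-digit number, while the association stored with demo remains subjpix.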
The FILL option specifies a fill character, which replaces the leading blanks of the picture until a significant digit is encountered. The default fill character is a blank. The PREFIX option specifies a one- or two-character prefix placed in front of the value's first significant digit. The PREFIX option is often used for leading dollar signs and minus signs. For example, the following PICTURE statement:
PICTURE dolpix OTHER='00,000,000.00' (fill='*' prefix='$');
tells SAS to print dollar amounts, such as 19999 as ***$19,999.00.
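A quick way to try out a picture without printing a whole data set is to apply it with the PUT function in a DATA _NULL_ step and write the result to the log. The following is only a sketch; the variable names amount and shown are hypothetical.

```sas
PROC FORMAT;
   picture dolpix OTHER = '00,000,000.00' (fill='*' prefix='$');
RUN;

DATA _NULL_;
   amount = 19999;
   shown = put(amount, dolpix.);   /* character result, as wide as the picture */
   put amount= shown=;             /* writes both values to the SAS log */
RUN;
```

If the picture behaves as described above, the log should show shown=***$19,999.00.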
See SAS Help for information about the MULT and NOEDIT options of the PICTURE statement.
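As a rough orientation before you consult SAS Help, here is a small sketch of both options. The format names million and miles, the variable names, and the test values are illustrative assumptions, not part of this lesson's data. MULT gives the number by which a value is multiplied before it is placed into the picture, and NOEDIT tells SAS to treat the digits in a picture as literal text rather than as digit selectors.

```sas
PROC FORMAT;
   /* MULT: scale the value before it is edited into the picture */
   picture million LOW-HIGH = '00.0M' (prefix='$' mult=.00001);
   /* NOEDIT: the digits in '>1000 miles' are printed as is */
   picture miles 1-1000 = '0000'
                 1000<-HIGH = '>1000 miles' (noedit);
RUN;

DATA _NULL_;
   sales = 1600000;   /* hypothetical value, meant to display in millions */
   dist = 2500;       /* hypothetical value above the 1000 cutoff */
   put sales= million. dist= miles.;
RUN;
```

Run it and check the log to see how each option changes what is displayed.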
Example 9.12
The FORMAT procedure in the following SAS program defines a picture format for the ICDB variable subj:
PROC FORMAT;
picture subjpix LOW-HIGH = '00-0000';
RUN;
PROC PRINT data=back;
title 'BACK dataset with SUBJ pictured as 00-0000';
format subj subjpix.;
var subj v_date sex;
RUN;
Obs | subj | v_date | sex |
---|---|---|---|
1 | 11-0051 | 01/25/94 | 2 |
2 | 11-0088 | 02/28/95 | 2 |
3 | 21-0012 | 07/16/93 | 2 |
4 | 22-0004 | 07/27/93 | 2 |
5 | 23-0006 | 01/06/94 | 2 |
6 | 31-0083 | 01/20/95 | 1 |
7 | 41-0012 | 09/16/93 | 2 |
8 | 42-0037 | 02/02/94 | 2 |
9 | 51-0027 | 02/15/94 | 2 |
10 | 52-0017 | 11/17/93 | 2 |
The picture allows the first two digits of subj, which happen to identify the hospital, to be separated from the remaining four digits by a dash. The range LOW-HIGH tells SAS that all values should be printed in this format. Since the "digit selector" used to define the template is 0, leading zeroes are not printed. (That's a moot point here, since all of the ICDB subject numbers begin with a non-zero digit.) In general, non-zero digit selectors, such as 9, tell SAS to print leading zeroes.
Launch and run the SAS program. Review the contents of the PRINT procedure to convince yourself that, as requested, SAS associated the variable subj with the defined picture format subjpix, and then printed the subj variable accordingly.
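Because every subject number in the back data set starts with a non-zero digit, that example cannot show the difference between the 0 and 9 digit selectors. The following sketch applies both kinds of picture to a deliberately small value; the format names zeropix and ninepix and the value 51 are made up for illustration.

```sas
PROC FORMAT;
   picture zeropix LOW-HIGH = '00-0000';   /* 0 selectors: leading zeroes suppressed */
   picture ninepix LOW-HIGH = '99-9999';   /* 9 selectors: leading zeroes printed */
RUN;

DATA _NULL_;
   subj = 51;                       /* far fewer than six significant digits */
   with0 = put(subj, zeropix.);
   with9 = put(subj, ninepix.);
   put subj= with0= with9=;         /* compare the two results in the log */
RUN;
```

Compare with0 and with9 in the log: the 9 version should be padded with leading zeroes, while the 0 version should not be.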
Example 9.13
The following SAS program illustrates two more picture formats:
DATA temp5;
input subj ssn expens;
cards;
110051 001111111 1099.99
110088 022234567 10876.34
210012 123345567 9567.21
220004 120451207 5640.12
230006 125398710 344.46
310083 237982019 3235.09
410012 323432429 1343.03
420037 340234839 11348.29
510027 928373402 7362.79
520017 433492349 3295.09
;
RUN;
PROC FORMAT;
picture ssnpix LOW-HIGH = '999-99-9999';
picture dolpix LOW-HIGH = '000,000.00' (prefix='$' fill='*');
RUN;
PROC PRINT data=temp5;
title 'Output Dataset: TEMP5. Examples of Picture Formats.';
format ssn ssnpix. expens dolpix.;
var subj ssn expens;
RUN;
Obs | subj | ssn | expens |
---|---|---|---|
1 | 110051 | 001-11-1111 | *$1,099.99 |
2 | 110088 | 022-23-4567 | $10,876.34 |
3 | 210012 | 123-34-5567 | *$9,567.21 |
4 | 220004 | 120-45-1207 | *$5,640.12 |
5 | 230006 | 125-39-8710 | ***$344.46 |
6 | 310083 | 237-98-2019 | *$3,235.09 |
7 | 410012 | 323-43-2429 | *$1,343.03 |
8 | 420037 | 340-23-4839 | $11,348.29 |
9 | 510027 | 928-37-3402 | *$7,362.79 |
10 | 520017 | 433-49-2349 | *$3,295.09 |
The first DATA step merely creates a temporary data set called temp5 by reading in three variables: subj (matching the ID numbers with those in the back data set), ssn (social security number), and expens (hospital expenses, a dollar amount). Then, the FORMAT procedure defines two picture formats, ssnpix and dolpix.
- The ssnpix format tells SAS to print social security numbers with the stated format. Since a non-zero digit selector ("9") is used, SAS prints leading zeroes. That is, a social security number such as 001-11-1111 (the first observation) is printed with its leading zeroes.
- The dolpix format tells SAS to print expense amounts dropping any leading zeroes. The PREFIX option tells SAS to place a dollar sign ("$") before the expense amount, while the FILL option tells SAS to fill the remaining positions with asterisks ("*").
The PRINT procedure associates the variable ssn with the format ssnpix and the variable expens with the format dolpix, and then prints the data set. Launch and run the SAS program and review the output of the PRINT procedure to convince yourself that the picture formats did indeed have the desired effect.
Note that although the largest expense in the data set is 11,348.29, the dolpix picture seemingly allows for numbers in the hundreds of thousands, 100,000.00 say. If we did not provide that extra digit, then SAS would not have room to place a dollar sign before an expense amount of 11,348.29. This is a common mistake, so please keep it in mind! If this is unclear to you, you might want to remove one of the three 0s that appear before the comma in the dolpix picture, re-run the SAS program, and review the PRINT procedure to see that the dollar sign is not displayed for the 11,348.29 or 10,876.34 amounts.
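If you would rather not edit the original program, the experiment can be sketched side by side in the log. The format names dolwide and dolnarr are hypothetical stand-ins for the dolpix picture with and without the extra digit selector; the three expense amounts are taken from temp5.

```sas
PROC FORMAT;
   picture dolwide LOW-HIGH = '000,000.00' (prefix='$' fill='*');   /* as in dolpix */
   picture dolnarr LOW-HIGH = '00,000.00' (prefix='$' fill='*');    /* one 0 removed */
RUN;

DATA _NULL_;
   do expens = 344.46, 9567.21, 11348.29;
      wide = put(expens, dolwide.);
      narrow = put(expens, dolnarr.);
      put expens= wide= narrow=;      /* one log line per amount */
   end;
RUN;
```

For the largest amount, look for whether narrow still has room for the dollar sign.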