22.1 - SAS Date Basics

In this section, we'll get a quick and broad overview of the fundamental things you need to know about working with dates in SAS. We'll learn how SAS defines a date value, how to use an informat to read a date into a SAS data set, how to use a format to display a SAS date, how to perform simple date calculations, and how to define a SAS date constant.

The Definition of a SAS Date

SAS stores dates as single, unique numbers, so that they can be used in your programs like any other numeric value. Specifically, SAS stores dates as numeric values equal to the number of days from January 1, 1960. That is, dates prior to January 1, 1960 are stored as unique negative integers, and dates after January 1, 1960 are stored as unique positive integers. So, for example, SAS stores:

  • a 0 for January 1, 1960
  • a 1 for January 2, 1960
  • a 2 for January 3, 1960
  • and so on ...

And, SAS stores:

  • a -1 for December 31, 1959
  • a -2 for December 30, 1959
  • a -3 for December 29, 1959
  • and so on ...

No matter what method is used in creating a SAS date, SAS always converts the date to an integer as just defined.
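
If you'd like to see these integers for yourself, here is a minimal sketch that writes a few of them to the SAS log. It borrows the date constant notation that is explained at the end of this section, and the variable names day0, next1, and prev1 are made up purely for illustration:

```sas
DATA _NULL_;
    /* Date constants (explained later in this section) produce SAS date values */
    day0  = '01jan1960'd;
    next1 = '02jan1960'd;
    prev1 = '31dec1959'd;
    /* Named output writes each unformatted number to the log */
    put day0= next1= prev1=;
RUN;
```

The log should show day0=0 next1=1 prev1=-1, which matches the lists above.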

Using an Informat to Read in a SAS Date

As you already know, in order to read variables that are dates, we need to tell SAS what form the date takes. For example, is the date in the form Dec 1, 2005? Or is it 12/01/05? Or 01 December 2005? The form that a date takes on input is known as a date informat. There seem to be as many SAS date informats as there are ways that you could imagine writing a date. Well, okay, maybe not quite that many. We'll take a look at several of the informats that are available in SAS later in this lesson. For now, we'll just refresh our memory of how to write the formatted-style INPUT statement that is necessary to read in dates.

Example 22.1.

The following SAS program reads five observations into a SAS data set called diet. Two of the variables — weight date (wt_date) and birth date (b_date) — are in mm/dd/yy format, and therefore SAS is told to read the dates using the mmddyy8. informat:

DATA diet;
    /* Column input for subj, l_name, and weight; formatted input for the two dates */
    input subj 1-4 l_name $ 18-23 weight 30-32
          +1 wt_date mmddyy8. @43 b_date mmddyy8.;
    DATALINES;
1024 Alice       Smith  1 65 125 12/1/05  01/01/60
1167 Maryann     White  1 68 140 12/01/05 01/01/59
1168 Thomas      Jones  2    190 12/2/05  06/15/60
1201 Benedictine Arnold 2 68 190 11/30/05 12/31/60
1302 Felicia     Ho     1 63 115 1/1/06   06/15/58
	;
RUN;
 
PROC PRINT data=diet;
	TITLE 'The unformatted diet data set';
RUN;

The unformatted diet data set

Obs    subj    l_name    weight    wt_date    b_date
  1    1024    Smith        125      16771         0
  2    1167    White        140      16771      -365
  3    1168    Jones        190      16772       166
  4    1201    Arnold       190      16770       365
  5    1302    Ho           115      16802      -565

First, note that the mmddyy8. informat must immediately follow the date's variable name. Here, it immediately follows wt_date, and then again follows b_date. Incidentally, the 8 in mmddyy8. defines the width of the informat. It tells SAS that the dates to be read contain as many as 8 positions. Here, two of the positions are taken up by forward slashes (/). You could alternatively use hyphens (-) or blank spaces between the mm, dd, and yy. Also, note that the period is a very important part of the informat name. Without it, SAS may attempt to interpret the informat as a variable name instead.
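
To convince yourself that the choice of separator does not matter, here is a minimal sketch that reads the same date three ways. It relies on the INPUT function, which is not otherwise used in this section, and it assumes that your SAS session's YEARCUTOFF= option is set so that the two-digit year 05 is read as 2005 (the usual default behavior). The variable names slash, hyphen, and blank are made up for illustration:

```sas
DATA _NULL_;
    /* The same date written with three different separators */
    slash  = input('12/01/05', mmddyy8.);
    hyphen = input('12-01-05', mmddyy8.);
    blank  = input('12 01 05', mmddyy8.);
    /* Under the YEARCUTOFF= assumption above, all three should equal 16771 */
    put slash= hyphen= blank=;
RUN;
```

The value 16771 is the same number that SAS stored for the 12/01/05 weight date in the diet data set.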

Then, launch and run the SAS program, and review the resulting output to familiarize yourself with the contents of the diet data set. Note, in particular, the numeric values that are stored for the wt_date and b_date variables. As expected, the 01/01/60 birth date is stored as 0, the 01/01/59 birth date is stored as -365, and the 12/31/60 birth date is stored as 365 (1960 was a leap year, so December 31 falls 365 days after January 1). The other thing that the output illustrates is that it is not enough just to tell SAS what informat to use to read in a date's value; you also have to tell SAS what format to use to display a date's value. If you don't, as you see here, the dates that are displayed are not particularly user-friendly!

Using a Format to Display a SAS Date

As the preceding example illustrates, we have to tell SAS in what form we would like our dates displayed. The form that a date takes in output is known as a date format. Do we want the date displayed in the form Dec 1, 2005? Or 12/01/05? Or 01 December 2005? Again, there seem to be as many SAS date formats as there are ways that you could imagine writing a date. To tell SAS in which form we want our dates displayed, we use a FORMAT statement.

Example 22.2.

The following SAS program is identical to the previous program, except a FORMAT statement has been added to tell SAS to display the wt_date and b_date variables in date7. format:

DATA diet;
    input subj 1-4 l_name $ 18-23 weight 30-32
          +1 wt_date mmddyy8. @43 b_date mmddyy8.;
    /* Display both date variables in ddMONyy form */
    format wt_date b_date date7.;
    DATALINES;
1024 Alice       Smith  1 65 125 12/1/05  01/01/60
1167 Maryann     White  1 68 140 12/01/05 01/01/59
1168 Thomas      Jones  2    190 12/2/05  06/15/60
1201 Benedictine Arnold 2 68 190 11/30/05 12/31/60
1302 Felicia     Ho     1 63 115 1/1/06   06/15/58
    ;
RUN;
 
PROC PRINT data=diet;
    title 'The formatted diet data set';
RUN;

The formatted diet data set

Obs    subj    l_name    weight    wt_date    b_date
  1    1024    Smith        125    01DEC05    01JAN60
  2    1167    White        140    01DEC05    01JAN59
  3    1168    Jones        190    02DEC05    15JUN60
  4    1201    Arnold       190    30NOV05    31DEC60
  5    1302    Ho           115    01JAN06    15JUN58

First, take note of the FORMAT statement, in which the selected format date7. follows the two variables, wt_date and b_date, whose values we want to display in ddMONyy form. Then, launch and run the SAS program, and review the resulting output to convince yourself of the effect of the FORMAT statement.
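
A format changes only how a date is displayed, not the number that is stored. As a minimal sketch of that idea, the following step writes one stored value to the log under several formats. The value 16771 is the number stored for the 12/01/05 weight date; the mmddyy10. format is not otherwise used in this section and is included here only for comparison:

```sas
DATA _NULL_;
    wt_date = 16771;        /* the value stored for 12/01/05 */
    put wt_date;            /* no format:  16771      */
    put wt_date date7.;     /* date7.:     01DEC05    */
    put wt_date date9.;     /* date9.:     01DEC2005  */
    put wt_date mmddyy10.;  /* mmddyy10.:  12/01/2005 */
RUN;
```

In every case the variable wt_date still holds 16771; only the text written to the log differs.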

Using SAS Dates in Calculations

The best thing about SAS dates is that, because SAS date values are numeric values, you can easily sort them, subtract them, and add them. You can also compare dates. Or, you can use them in many of the available numeric functions.

Example 22.3.

The following SAS program illustrates how you can treat date variables as any other numeric variable, and therefore can use the dates in numeric calculations. Assuming that individuals in the diet data set need to be weighed every 14 days, a new variable nxt_date, the anticipated date of the individual's next visit, is determined by merely adding 14 to the individual's current weight date (wt_date). Then, a crude estimate of each individual's age is also calculated by subtracting b_date from wt_date and dividing the resulting number of days by 365.25 to get an approximate age in years. And, the MEAN function is used to calculate avg_date, the average of each individual's birth and weight dates:

DATA diet;
    input  subj 1-4 l_name $ 18-23 weight 30-32
           +1 wt_date mmddyy8. @43 b_date mmddyy8.;
    nxt_date = wt_date + 14;
    age_wt = (wt_date - b_date)/365.25;
    avg_date = MEAN(wt_date, b_date);
    format wt_date b_date nxt_date avg_date date7. 
           age_wt 4.1; 
    DATALINES;
1024 Alice       Smith  1 65 125 12/1/05  01/01/60
1167 Maryann     White  1 68 140 12/01/05 01/01/59
1168 Thomas      Jones  2    190 12/2/05  06/15/60
1201 Benedictine Arnold 2 68 190 11/30/05 12/31/60
1302 Felicia     Ho     1 63 115 1/1/06   06/15/58
    ;
RUN;
 
PROC PRINT data=diet;
    title 'The diet data set with three new variables';
RUN;

The diet data set with three new variables

Obs    subj    l_name    weight    wt_date    b_date     nxt_date    age_wt    avg_date
  1    1024    Smith        125    01DEC05    01JAN60    15DEC05       45.9    16DEC82
  2    1167    White        140    01DEC05    01JAN59    15DEC05       46.9    17JUN82
  3    1168    Jones        190    02DEC05    15JUN60    16DEC05       45.5    10MAR83
  4    1201    Arnold       190    30NOV05    31DEC60    14DEC05       44.9    16JUN83
  5    1302    Ho           115    01JAN06    15JUN58    15JAN06       47.5    24MAR82

First, review the code to see how the three new variables (nxt_date, age_wt, and avg_date) are calculated using standard numeric expressions. You should also acknowledge that the calculation of avg_date is just a desperate attempt by a desperate instructor to illustrate the use of dates in a standard numeric function, and is otherwise probably fairly useless. Then, launch and run the SAS program, and review the resulting output to convince yourself that the results of the calculations seem reasonable.
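
To check one row of these calculations by hand, here is a minimal sketch that repeats the arithmetic for the first subject (Smith) and writes the results to the log. It uses the date constant notation explained at the end of this section in place of reading the dates from data lines:

```sas
DATA _NULL_;
    wt_date = '01dec2005'd;                /* stored as 16771 */
    b_date  = '01jan1960'd;                /* stored as 0     */
    nxt_date = wt_date + 14;               /* 16771 + 14 = 16785 */
    age_wt = (wt_date - b_date) / 365.25;  /* 16771 / 365.25, about 45.9 */
    put 'Raw next date:       ' nxt_date;
    put 'Formatted next date: ' nxt_date date9.;
    put 'Approximate age:     ' age_wt 4.1;
RUN;
```

The log should show 16785, 15DEC2005, and 45.9, in agreement with the nxt_date and age_wt values printed for Smith.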

Example 22.4.

The following SAS program illustrates again how you can treat date variables as any other numeric variable, and therefore can sort dates. The diet data set is sorted by nxt_date in ascending order so that the individuals whose next weigh-in date is closest in time appear first:

PROC SORT data = diet out = sorteddiet;
    by nxt_date;
RUN;
             
PROC PRINT data = sorteddiet;
    TITLE 'The diet data set sorted by nxt_date';
RUN;

The diet data set sorted by nxt_date

Obs    subj    l_name    weight    wt_date    b_date     nxt_date    age_wt    avg_date
  1    1201    Arnold       190    30NOV05    31DEC60    14DEC05       44.9    16JUN83
  2    1024    Smith        125    01DEC05    01JAN60    15DEC05       45.9    16DEC82
  3    1167    White        140    01DEC05    01JAN59    15DEC05       46.9    17JUN82
  4    1168    Jones        190    02DEC05    15JUN60    16DEC05       45.5    10MAR83
  5    1302    Ho           115    01JAN06    15JUN58    15JAN06       47.5    24MAR82

First, review the code, and then launch and run the SAS program. Then, review the resulting output to convince yourself that the variable nxt_date is sorted as claimed.
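
Because the dates are just numbers, sorting in the opposite direction works the same way. The following sketch assumes the diet data set containing nxt_date from Example 22.3 is still available, and the output data set name latestfirst is made up for illustration:

```sas
/* Sort so that the latest next weigh-in date appears first */
PROC SORT data = diet out = latestfirst;
    by descending nxt_date;
RUN;

PROC PRINT data = latestfirst;
    TITLE 'The diet data set sorted by descending nxt_date';
RUN;
```

Here the DESCENDING keyword applies to the variable that follows it, so the largest stored date value is listed first.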

Comparing Dates

Again, because SAS date values are numeric values, you can easily compare two or more dates. The comparisons are made just as the comparisons between any two numbers would take place. For example, because the date 01/03/60 is stored as a 2 in SAS, it is considered smaller than the date 01/10/60, which is stored as a 9 in SAS.
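
As a minimal sketch of that comparison, the following step compares the two dates just mentioned and writes a message to the log. The variable names early and late are made up for illustration, and the dates are written using the date constant notation explained at the end of this section:

```sas
DATA _NULL_;
    early = '03jan1960'd;   /* stored as 2 */
    late  = '10jan1960'd;   /* stored as 9 */
    /* An ordinary numeric comparison: 2 < 9 is true */
    if early < late then put 'early precedes late: ' early= late=;
RUN;
```

The log should show the message followed by early=2 late=9.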

Example 22.5.

The following SAS program illustrates how to compare the values of a date variable, not to the values of some other date variable, but rather to a date constant. Specifically, the WHERE= option that appears on the DATA statement tells SAS to output to the diet data set only those individuals whose b_date is before January 1, 1960:

/* WHERE= keeps only observations whose birth date is before January 1, 1960 */
DATA diet (where = (b_date < '01jan1960'd));
    input subj 1-4 l_name $ 18-23 weight 30-32
          +1 wt_date mmddyy8. @43 b_date mmddyy8.;
    format wt_date b_date date9.;
    DATALINES;
1024 Alice       Smith  1 65 125 12/1/05  01/01/60
1167 Maryann     White  1 68 140 12/01/05 01/01/59
1168 Thomas      Jones  2    190 12/2/05  06/15/60
1201 Benedictine Arnold 2 68 190 11/30/05 12/31/60
1302 Felicia     Ho     1 63 115 1/1/06   06/15/58
	;
RUN;
 
PROC PRINT data=diet;
	title 'Birthdays in the diet data set before 01/01/1960';
RUN;

Birthdays in the diet data set before 01/01/1960

Obs    subj    l_name    weight    wt_date      b_date
  1    1167    White        140    01DEC2005    01JAN1959
  2    1302    Ho           115    01JAN2006    15JUN1958

First, note the form of the SAS date constant:

'01jan1960'd

used in the WHERE= option. In general, a SAS date constant takes the form 'ddMONyyyy'd, where dd denotes the day of the month (01, ..., 31), MON denotes the first three letters of the month, and yyyy denotes the four-digit year. The letter d that follows the date in single quotes tells SAS to convert the quoted date string to its corresponding SAS date value. Note that regardless of how you have informatted or formatted your SAS dates, the SAS date constant always takes the above form.
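
To illustrate that last point, here is a minimal sketch in which a birth date is displayed in mm/dd/yyyy form, yet the comparison is still written with a date constant in ddMONyyyy form. The mmddyy10. format is used here only as an example of a display format that differs from the constant's form:

```sas
DATA _NULL_;
    b_date = '15jun1958'd;   /* stored as -565 */
    /* The display format has no bearing on how the constant is written */
    if b_date < '01jan1960'd then
        put b_date mmddyy10. ' is before January 1, 1960';
RUN;
```

The log should show 06/15/1958 followed by the message, because -565 is less than 0.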

Now, launch and run the SAS program. Then, review the resulting output to convince yourself that only those individuals whose birth date is before January 1, 1960, are included in the output diet data set. You might also want to note the difference between the date7. and date9. formats. Previously, we saw that when you use the date7. format, your dates are displayed in ddMONyy form. Here, you can see that when you use the date9. format, your dates are displayed in ddMONyyyy form. (Incidentally, I think it is a good practice to use four-digit years wherever possible to avoid any ambiguity.) We'll take a look at some of the other informats and formats available later in this lesson. Now, we'll go take a look at some of the available functions that work specifically with SAS dates.