5.5 - Logical Operators

In addition to the comparison operators that we learned previously, we can also use the following logical operators:

Operation                            SAS syntax    Alternative SAS syntax
are both conditions true?            &             AND
is either condition true?            |             OR
reverse the logic of a comparison    ^ or ~        NOT

You will want to use the AND operator to execute the THEN statement if both expressions that are linked by AND are true, such as here:

IF (p1 GT 90) AND (f1 GT 90) THEN performance = 'excellent';

You will want to use the OR operator to execute the THEN statement if either expression that is linked by the OR is true, such as here:

IF (p1 GT 90) OR (f1 GT 90) THEN performance = 'very good';

And, you will want to use the NOT operator in conjunction with other operators to reverse the logic of a comparison:

IF p1 NOT IN (98, 99, 100) THEN performance = 'not excellent';
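
As a quick check that the symbols and the words are interchangeable, here is a minimal sketch. The data set name (demo), the variable names, and the three data lines are made up for illustration. It relies on the fact that SAS evaluates a logical expression to 1 (true) or 0 (false), so each word/symbol pair should hold the same value in every observation. Note that the ^ symbol may not be available on every keyboard, in which case ~ or NOT can be used instead.

```sas
DATA demo;
    input p1 f1;
    * AND versus its symbol;
    both_word     = (p1 GT 90) AND (f1 GT 90);
    both_symbol   = (p1 GT 90) &   (f1 GT 90);
    * OR versus its symbol;
    either_word   = (p1 GT 90) OR  (f1 GT 90);
    either_symbol = (p1 GT 90) |   (f1 GT 90);
    * NOT versus its symbol;
    not_word      = NOT (p1 GT 90);
    not_symbol    = ^(p1 GT 90);
    DATALINES;
97 80
99 93
82 69
;
RUN;

PROC PRINT data = demo;
RUN;
```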

Now when we look at examples using these logical operators, why stop at just two ELSE statements? Let's go crazy and program a bunch of them! One thing though: when we do, we have to be extra careful to make sure that our conditions are mutually exclusive and that, together, they cover every case. That is, we have to make sure that, for each observation in the data set, one and only one of the conditions holds. This most often means that we have to make sure that the endpoints of our intervals neither overlap nor leave a gap.
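
To see what can go wrong at an endpoint, consider the following sketch. The data set name (gapcheck), the deliberately flawed cutoffs, and the three averages are made up for illustration. The first condition uses > rather than >=, so an average of exactly 90 satisfies neither condition. The final ELSE is one possible way to flag such observations rather than leaving overall blank.

```sas
DATA gapcheck;
    length overall $ 10;
    input avg;
    * Flawed on purpose, because 90 is neither > 90 nor < 90;
         if (avg > 90)                 then overall = 'A';
    else if (avg >= 80) and (avg < 90) then overall = 'B';
    * Catch-all that flags anything the conditions above missed;
    else                                    overall = 'UNASSIGNED';
    DATALINES;
95
90
85
;
RUN;

PROC PRINT data = gapcheck;
RUN;
```

In this sketch the observation with avg equal to 90 should come out flagged as UNASSIGNED, which is the signal to go back and fix the endpoints.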

Example 5.6

The following SAS program illustrates the use of several mutually exclusive conditions within an if-then-else statement. The program uses the AND operator to define the conditions. Again, when comparisons are connected by AND, all of the comparisons must be true in order for the condition to be true.

DATA grades;
    length overall $ 10;
   	input name $ 1-15 e1 e2 e3 e4 p1 f1;
    avg = round((e1+e2+e3+e4)/4,0.1); *Calculate the average and round it to one decimal place. The average for John Simon will be missing because his e4 score is missing;

    *Program for missing values first, then make sure each student falls into exactly one of the categories;
         if (avg = .)                   then overall = 'Incomplete';
    else if (avg >= 90)                 then overall = 'A';
    else if (avg >= 80) and (avg < 90)  then overall = 'B';
    else if (avg >= 70) and (avg < 80)  then overall = 'C';
    else if (avg >= 65) and (avg < 70)  then overall = 'D';
    else if (avg < 65)                  then overall = 'F';
	DATALINES;
Alexander Smith  78 82 86 69  97 80
John Simon       88 72 86  . 100 85
Patricia Jones   98 92 92 99  99 93
Jack Benedict    54 63 71 49  82 69
Rene Porter     100 62 88 74  98 92
;
RUN;

PROC PRINT data = grades;
	var name avg overall;
RUN;

First, inspect the program to make sure you understand the code. Then, launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the letter grades have been assigned correctly. Also note how the program in general, and the if-then-else statement in particular, is formatted in order to make the program easy to read. The conditions and assignment statements are aligned nicely in columns, and parentheses are used to help set off the conditions. Whenever possible ... okay, make that always ... format (and comment) your programs. After all, you may actually need to use them again in a few years. Trust me ... you'll appreciate it then!

Oh, one more point. You may have noticed, after the condition that takes care of missing values, that the conditions appear in order from A, B, ... down to F. Is the instructor treating the glass as being half-full as opposed to half-empty? Hmmm ... actually, the order has to do with the efficiency of the statements. When SAS encounters the condition that is true for a particular observation, it jumps out of the if-then-else statement to the next statement in the DATA step. SAS thereby avoids having to needlessly evaluate all of the remaining conditions. Hence, we have another good programming habit ... arrange the order of your conditions (roughly speaking, of course!) in an if-then-else statement so that the most common one appears first, the next most common one appears second, and so on. You'll also need to make sure that the condition concerning missing values appears first in the if-then-else statement. SAS treats a missing numeric value as smaller than any number, so a condition such as (avg < 65) is true for a missing average; if that condition came earlier, the student would be assigned an F and the missing-value condition would never be reached.
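
To see why the missing-value condition belongs first, here is a sketch of the same grading logic with that condition moved, on purpose, to the end. The data set name (wrongorder) is made up for illustration, and only two of the original data lines are kept. Because SAS compares a missing numeric value as smaller than any number, the missing average for John Simon satisfies (avg < 65) before the last condition is ever considered.

```sas
DATA wrongorder;
    length overall $ 10;
    input name $ 1-15 e1 e2 e3 e4 p1 f1;
    avg = round((e1+e2+e3+e4)/4,0.1);
    * Missing-value check placed last on purpose to show the problem;
         if (avg >= 90)                 then overall = 'A';
    else if (avg >= 80) and (avg < 90)  then overall = 'B';
    else if (avg >= 70) and (avg < 80)  then overall = 'C';
    else if (avg >= 65) and (avg < 70)  then overall = 'D';
    else if (avg < 65)                  then overall = 'F';
    else if (avg = .)                   then overall = 'Incomplete';
    DATALINES;
John Simon       88 72 86  . 100 85
Patricia Jones   98 92 92 99  99 93
;
RUN;

PROC PRINT data = wrongorder;
    var name avg overall;
RUN;
```

In this sketch John Simon should be assigned an F rather than Incomplete, which is exactly the outcome the original ordering avoids.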

Example 5.7

In the previous program, the conditions were written using the AND operator. Alternatively, we could have just used straightforward numerical intervals. The following SAS program illustrates the use of alternative intervals as well as the alternative syntax for the comparison operators:

DATA grades;
    length overall $ 10;
   	input name $ 1-15 e1 e2 e3 e4 p1 f1;
	avg = round((e1+e2+e3+e4)/4,0.1);
		 if (avg EQ .)         then overall = 'Incomplete';
	else if (90 LE avg LE 100) then overall = 'A';
	else if (80 LE avg LT  90) then overall = 'B';
	else if (70 LE avg LT  80) then overall = 'C';
	else if (65 LE avg LT  70) then overall = 'D';
	else if (0  LE avg LT  65) then overall = 'F';
	DATALINES;
Alexander Smith  78 82 86 69  97 80
John Simon       88 72 86  . 100 85
Patricia Jones   98 92 92 99  99 93
Jack Benedict    54 63 71 49  82 69
Rene Porter     100 62 88 74  98 92
;
RUN;

PROC PRINT data = grades;
	var name avg overall;
RUN;

Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the letter grades have again been assigned correctly.
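
If you would like to convince yourself that an interval such as (80 LE avg LT 90) means the same thing as two comparisons joined by AND, the following sketch may help. The data set name (intervals), the variable names (chained and spelled), and the four test values are made up for illustration. Each variable is set to 1 when its condition is true and 0 when it is false, so the two columns should agree for every observation, including the endpoints 80 and 90.

```sas
DATA intervals;
    input avg;
    * Interval written as a single chained comparison;
    chained = (80 LE avg LT 90);
    * The same interval written as two comparisons joined by AND;
    spelled = (avg GE 80) AND (avg LT 90);
    DATALINES;
79.9
80
89.9
90
;
RUN;

PROC PRINT data = intervals;
RUN;
```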

Example 5.8

Now, suppose an instructor wants to give bonus points to students who show some sign of improvement from the beginning of the course to the end of the course. Suppose she wants to add two points to a student's overall average if either her first exam grade is less than both her third and fourth exam grades or her second exam grade is less than both her third and fourth exam grades. (Don't ask why! I'm just trying to motivate something here.) The operative words here are "either" and "or". In order to accommodate the instructor's wishes, we need to take advantage of the OR logical operator. When comparisons are connected by OR, only one of the comparisons needs to be true in order for the condition to be true. The following SAS program illustrates the use of the OR operator, the AND operator, and the use of the OR and AND operators together:

DATA grades;
   	input name $ 1-15 e1 e2 e3 e4 p1 f1;
	avg = round((e1+e2+e3+e4)/4,0.1);
		 if    ((e1 < e3) and (e1 < e4)) 
            or ((e2 < e3) and (e2 < e4)) then adjavg = avg + 2;
    else adjavg = avg;
	DATALINES;
Alexander Smith  78 82 86 69  97 80
John Simon       88 72 86  . 100 85
Patricia Jones   98 92 92 99  99 93
Jack Benedict    54 63 71 49  82 69
Rene Porter     100 62 88 74  98 92
;
RUN;

PROC PRINT data = grades;
	var name e1 e2 e3 e4 avg adjavg;
RUN;

First, inspect the program to make sure you understand the code. In particular, note that logical comparisons that are enclosed in parentheses are evaluated as true or false before they are compared to other expressions. In this example:

  • SAS first determines if e1 is less than e3 AND if e1 is less than e4
  • SAS then determines if e2 is less than e3 AND if e2 is less than e4
  • SAS then determines if the first bullet is true OR if the second bullet is true

Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that, where appropriate, two points were added to the student's average (avg) to get an adjusted average (adjavg). Also, note that we didn't have to worry about programming for missing values here, because the student's adjusted average (adjavg) is automatically assigned missing if his or her average (avg) is missing: adding 2 to a missing value yields a missing value. SAS calls this "propagation of missing values."
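
Propagation of missing values can be seen in isolation with a sketch like the following. The data set name (propagate) and the variable meanfunc are made up for illustration, and only two of the original data lines are kept. Arithmetic operators such as + and / return missing whenever an operand is missing, whereas the MEAN function skips missing arguments; it is included here only for contrast and is not something the example above uses.

```sas
DATA propagate;
    input name $ 1-15 e1 e2 e3 e4;
    * Arithmetic operators return missing if any operand is missing;
    avg      = (e1 + e2 + e3 + e4) / 4;
    adjavg   = avg + 2;
    * The MEAN function averages only the nonmissing arguments;
    meanfunc = mean(e1, e2, e3, e4);
    DATALINES;
John Simon       88 72 86  .
Patricia Jones   98 92 92 99
;
RUN;

PROC PRINT data = propagate;
RUN;
```

For John Simon, avg and adjavg should both print as missing, while meanfunc should be 82, the average of his three nonmissing exam scores. Which behavior you want depends on whether a student with a missing exam should receive an average at all.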