32.1 - Proc SQL Basics

PROC SQL is a procedure that SAS developed for the implementation of Structured Query Language. You can use this procedure to modify, retrieve and report data in tables and views (created on tables). Just as with other SAS procedures, PROC SQL also has basic syntax structures. It takes the following general form:


PROC SQL;
SELECT column-1<,…column-n>
FROM table-1|view-1<,…table-n|view-n>
<WHERE expression>
<GROUP BY column-1<,…column-n>>
<HAVING expression>
<ORDER BY column-1<,…column-n>>;
QUIT;

First of all, you may see differences in terminology between SQL and other SAS steps. For example, the data file is called data set in other SAS steps, but the table is in SQL. Correspondingly, records are called observations in the previous lessons, but rows in SQL tables, while we call a field of data set a variable, but a column in this lesson.

Other SAS steps SQL Procedure
data set table
observation row
variable column

Another thing that needs your attention is that, unlike other SAS procedures, there could be one or a few SELECT statements inside PROC SQL. One SELECT statement is called a query, which is composed of many clauses, like SELECT, FROM, WHERE, GROUP BY, HAVING, and ORDER BY. The order of these clauses is important. They must appear in the order as shown above.

We will use the whole lesson to work our way through all these keywords in PROC SQL. Let’s start with the most basic one.

Example 32.1 Section

The following SAS SQL code is just query that retrieves data from a single table:

libname stat482 "X:\MG\stat480~483\proc sql";
PROC SQL;
	select ID,
		SATM,
		SATV
	from stat482.survey;
QUIT;

The SAS System
id SATM SATV
1001 700 700
1002 700 500
1003 470 470
1004 710 560
1005 600 520
1006 610 720
1007 710 670
1008 610 580
1009 690 620
1010 580 540
1011 690 690
1012 . .
1013 450 580
1014 700 700
1015 400 600
1016 640 600
1017 720 580
1018 600 750
1019 600 720
1020 600 630
1021 600 300
1022 670 630
1023 670 680
1024 600 730
1025 560 680
1026 560 560
1027 550 600
1028 610 600
1029 680 680
1030 450 450
1031 640 620
1032 300 400
1033 650 600
1034 670 630
1035 500 550
1036 530 760
1037 500 400
1038 680 650
1039 560 560
1040 . .
1041 620 760
1042 640 640
1043 590 640
1044 570 560
1045 680 710
1046 660 580
1047 700 570
1048 650 590
1049 620 640
1050 450 450
1051 650 550
1052 600 620
1053 420 490
1054 630 580
1055 760 600
1056 600 500
1057 600 600
1058 585 590
1059 800 750
1060 540 550
1061 680 550
1062 670 670
1063 560 590
1064 550 700
1065 575 600
1066 450 400
1067 640 560
1068 550 610
1069 600 490
1070 600 610
1071 650 660
1072 570 570
1073 600 550
1074 500 450
1075 650 550
1076 590 570
1077 560 610
1078 570 490
1079 700 640
1080 660 660
1081 620 600
1082 550 690
1083 560 580
1084 690 690
1085 740 680
1086 540 570
1087 580 570
1088 . .
1089 500 600
1090 560 330
1091 800 650
1092 620 530
1093 560 560
1094 640 710
1095 690 650
1096 750 550
1097 640 430
1098 540 520
1099 700 600
1100 490 520
1101 570 490
1102 620 630
1103 450 500
1104 . .
1105 510 680
1106 620 500
1107 630 530
1108 500 550
1109 500 490
1110 700 600
1111 540 570
1112 . .
1113 650 550
1114 530 530
1115 580 620
1116 560 550
1117 570 480
1118 650 700
1119 640 630
1120 680 670
1121 450 500
1122 670 510
1123 560 530
1124 430 440
1125 470 510
1126 720 570
1127 600 650
1128 570 650
1129 500 540
1130 660 620
1131 730 800
1132 420 420
1133 450 550
1134 650 600
1135 520 650
1136 650 600
1137 580 640
1138 560 580
1139 530 600
1140 540 600
1141 . .
1142 720 500
1143 550 500
1144 640 600
1145 500 450
1146 580 600
1147 580 620
1148 620 640
1149 640 700
1150 630 650
1151 590 590
1152 600 500
1153 550 630
1154 520 450
1155 . .
1156 780 540
1157 400 400
1158 500 580
1159 640 440
1160 700 700
1161 600 630
1162 710 710
1163 600 590
1164 790 460
1165 560 500
1166 700 520
1167 520 550
1168 650 560
1169 650 670
1170 610 590
1171 480 520
1172 780 580
1173 580 590
1174 680 600
1175 640 600
1176 600 590
1177 520 500
1178 700 600
1179 660 660
1180 720 500
1181 600 550
1182 640 460
1183 490 520
1184 . .
1185 600 600
1186 480 580
1187 780 660
1188 670 550
1189 580 600
1190 560 520
1191 580 480
1192 . .
1193 500 500
1194 640 600
1195 570 540
1196 480 560
1197 600 700
1198 450 600
1199 620 640
1200 650 590
1201 760 660
1202 460 540
1203 500 610
1204 650 630
1205 550 550
1206 620 630
1207 640 670
1208 550 400
1209 580 580
1210 660 590
1211 650 570
1212 570 480
1213 425 .
1214 590 500
1215 650 630
1216 680 600
1217 620 400
1218 540 560
1219 630 590
1220 480 480
1221 590 610
1222 510 680
1223 . .
1224 650 630
1225 640 630
1226 630 520

To run the program above, you will need to save the SAS data file (survey.sas7bdat) to your computer first. Right-click the link and select the “Save link as…”. A save dialog box will appear and allow you to save the file to the location you choose on your computer. Edit the LIBNAME statement to reflect the directory in which you saved the survey data set. Then run the program and check the output.

The SQL procedure in this code represents the most basic form of the procedure. Like other SAS procedures, you need to run PROC SQL at the beginning to invoke it. Inside the procedure, there is only one statement starting with SELECT, which chooses the columns you want. You can list as many columns as needed, separated by commas. Another clause is FROM, which is used to specify the table(s). PROC SQL follows the same protocol of SAS file names. Here we used a two-level name to reference the permanent file. Just as you read the code, this program is used to select three columns (student id, SAT Math score and SAS Verbal score) from the table.

If you use the SAS list output as default, the results of a query will be displayed in the SAS output window. For SAS9.2 and above, the default output will be html. You can also change the open ODS destinations to pdf, rtf or others for the output.

Example 32.2 Section

The following SAS program changes the ODS destination from listing to pdf. So the output will be saved under the predefined directory as a pdf file.

ods listing close;
ods pdf file="X:\MG\stat480~483\proc sql\sat_scores.pdf";

PROC SQL;
	select ID,
		SATM,
		SATV
	from stat482.survey;
QUIT;

ods pdf close;
ods listing;

The PROC SQL part in the above program is the same as in the previous example. The added code is to close the current listing output and open the pdf destination and save the pdf output generated by the SQL procedure that follows the specified file name. The end part of ODS statements tells SAS to close the pdf destination and open the listing again.

In some cases, you may want to create a new table from a query. To do so, you need the CREATE TABLE statement with the keyword AS and along with other clauses used in PROC SQL, such as SELECT, or FROM. Below is the basic syntax of CREATE TABLE statement from a query result.

CREATE TABLE table-name AS
	SELECT column-1<,…column-n>
	FROM table-1|view-1<,…table-n|view-n>
	<optional query clauses>;

Example 32.3 Section

The following SAS program uses the CREATE TABLE statement to create a new table named SAT_scores, which contains student ID, SAT math scores, and verbal scores.


PROC SQL;
	CREATE TABLE SAT_Scores as
	select ID,
		SATM,
		SATV
	from stat482.survey;
QUIT;

Launch and run the SAS program. You may notice that there is no output displayed in the SAS output window or any open ODS destination. That’s because the CREATE TABLE statement suppresses the printed output of the query. However, if you check the SAS log window, it shows a message that indicates that the table has been created, and the number of rows and columns in the table.

167  PROC SQL;
168      CREATE TABLE SAT_Scores as
169      select ID,
170          SATM,
171          SATV
172      from stat482.survey;
ERROR: Libref STAT482 is not assigned.
173  QUIT;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

In this example, table SAT_scores has 226 rows and 3 columns. The new table’s columns have the same attributes (type, length, format, label) as those of the selected source columns.

From the previous three examples, you now have some idea about what PROC SQL is like to work with. Let’s summarize what makes it so unique from other SAS procedures.

  1. Unlike other SAS procedures which contain many statements, the SQL procedure may consist of one or more than one SELECT statement. Each SELECT statement contains several clauses, like SELECT, FROM, WHERE, and ORDER BY. But the SELECT and FROM clauses are essential and indispensable. Other clauses are optional. All clauses have to be written in the order listed in the syntax. For each SELECT statement, only one semicolon is needed at the end of the statement.
  2. No RUN statement is required for PROC SQL to execute. SQL procedure will continue to run after you submit the program. To end it, you have to submit another PROC step, a DATA step, or a QUIT statement.