PROC SQL is a procedure that SAS developed for the implementation of Structured Query Language. You can use this procedure to modify, retrieve and report data in tables and views (created on tables). Just as with other SAS procedures, PROC SQL also has basic syntax structures. It takes the following general form:
<GROUP BY column-1<,…column-n>>
<ORDER BY column-1<,…column-n>>;
First of all, you may see differences in terminology between SQL and other SAS steps. For example, the data file is called data set in other SAS steps, but the table is in SQL. Correspondingly, records are called observations in the previous lessons, but rows in SQL tables, while we call a field of data set a variable, but a column in this lesson.
|Other SAS steps
Another thing that needs your attention is that, unlike other SAS procedures, there could be one or a few SELECT statements inside PROC SQL. One SELECT statement is called a query, which is composed of many clauses, like SELECT, FROM, WHERE, GROUP BY, HAVING, and ORDER BY. The order of these clauses is important. They must appear in the order as shown above.
We will use the whole lesson to work our way through all these keywords in PROC SQL. Let’s start with the most basic one.
Example 32.1 Section
The following SAS SQL code is just query that retrieves data from a single table:
libname stat482 "X:\MG\stat480~483\proc sql";
To run the program above, you will need to save the SAS data file (survey.sas7bdat) to your computer first. Right-click the link and select the “Save link as…”. A save dialog box will appear and allow you to save the file to the location you choose on your computer. Edit the LIBNAME statement to reflect the directory in which you saved the survey data set. Then run the program and check the output.
The SQL procedure in this code represents the most basic form of the procedure. Like other SAS procedures, you need to run PROC SQL at the beginning to invoke it. Inside the procedure, there is only one statement starting with SELECT, which chooses the columns you want. You can list as many columns as needed, separated by commas. Another clause is FROM, which is used to specify the table(s). PROC SQL follows the same protocol of SAS file names. Here we used a two-level name to reference the permanent file. Just as you read the code, this program is used to select three columns (student id, SAT Math score and SAS Verbal score) from the table.
If you use the SAS list output as default, the results of a query will be displayed in the SAS output window. For SAS9.2 and above, the default output will be html. You can also change the open ODS destinations to pdf, rtf or others for the output.
Example 32.2 Section
The following SAS program changes the ODS destination from listing to pdf. So the output will be saved under the predefined directory as a pdf file.
ods listing close;
ods pdf file="X:\MG\stat480~483\proc sql\sat_scores.pdf";
ods pdf close;
The PROC SQL part in the above program is the same as in the previous example. The added code is to close the current listing output and open the pdf destination and save the pdf output generated by the SQL procedure that follows the specified file name. The end part of ODS statements tells SAS to close the pdf destination and open the listing again.
In some cases, you may want to create a new table from a query. To do so, you need the CREATE TABLE statement with the keyword AS and along with other clauses used in PROC SQL, such as SELECT, or FROM. Below is the basic syntax of CREATE TABLE statement from a query result.
CREATE TABLE table-name AS
<optional query clauses>;
Example 32.3 Section
The following SAS program uses the CREATE TABLE statement to create a new table named SAT_scores, which contains student ID, SAT math scores, and verbal scores.
CREATE TABLE SAT_Scores as
Launch and run the SAS program. You may notice that there is no output displayed in the SAS output window or any open ODS destination. That’s because the CREATE TABLE statement suppresses the printed output of the query. However, if you check the SAS log window, it shows a message that indicates that the table has been created, and the number of rows and columns in the table.
167 PROC SQL;
168 CREATE TABLE SAT_Scores as
169 select ID,
172 from stat482.survey;
ERROR: Libref STAT482 is not assigned.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
In this example, table SAT_scores has 226 rows and 3 columns. The new table’s columns have the same attributes (type, length, format, label) as those of the selected source columns.
From the previous three examples, you now have some idea about what PROC SQL is like to work with. Let’s summarize what makes it so unique from other SAS procedures.
- Unlike other SAS procedures which contain many statements, the SQL procedure may consist of one or more than one SELECT statement. Each SELECT statement contains several clauses, like SELECT, FROM, WHERE, and ORDER BY. But the SELECT and FROM clauses are essential and indispensable. Other clauses are optional. All clauses have to be written in the order listed in the syntax. For each SELECT statement, only one semicolon is needed at the end of the statement.
- No RUN statement is required for PROC SQL to execute. SQL procedure will continue to run after you submit the program. To end it, you have to submit another PROC step, a DATA step, or a QUIT statement.