Lesson 23: SAS Character Functions

Overview

In this lesson, we'll investigate some of the functions available in SAS that can be applied only to character variables. For example, if you want to remove blanks from a character string, you might consider using the compress function. Or, if you want to select a smaller substring, say a first name, from a larger string containing a person's full name, you might want to take advantage of the substr function. Some of the functions and operators that we will learn about are old standbys, such as: length, substr, compbl, compress, verify, input, put, tranwrd, scan, trim, upcase, lowcase, || (concatenation), index, indexc, and spedis. And some of the functions that we will learn about are new as of SAS Version 9. They include: anyalpha, anydigit, catx, cats, lengthc, propcase, strip, count, and countc.
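As a quick taste of the two functions just mentioned, here is a minimal sketch. The variable names and values (phone, fullname, and so on) are made up for illustration and do not come from the paper. It assumes the first name is whatever precedes the first blank in the full name.

```sas
data _null_;
   length fullname $ 20;
   phone    = '(814) 555 1234';   /* hypothetical value */
   fullname = 'Jane Smith';       /* hypothetical value */

   /* COMPRESS with one argument removes every blank from the string */
   nospace = compress(phone);

   /* INDEX finds the first blank; SUBSTR(string, start, length) then
      keeps everything before it */
   firstname = substr(fullname, 1, index(fullname, ' ') - 1);

   /* Write the results to the log */
   put nospace= firstname=;
run;
```

Running the step should write the compressed phone number and the first name (Jane) to the SAS log.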

Your ingenious instructor is going to take this opportunity to introduce you to a couple of great resources for finding information about a variety of SAS topics. One resource is sasCommunity.org. You can learn more about sasCommunity.org just by mucking around for a bit on the site. Another resource is the SAS Global Forum, the annual conference for SAS users. The really helpful place to end up, however, is a search engine that lets you look up papers and presentations from previous SAS Global Forum conferences. It is through this search engine that you'll find the material for this lesson!

Rather than having to read your instructor's lesson notes, you should read the tutorial paper written and presented by Ron Cody called:

An Introduction to SAS Character Functions

Download the paper (you'll need a PDF reader such as Adobe Reader to open it) and enjoy! Oh, and if you are interested, you can download the SAS program below:

roncody.sas

and run all of the examples that are contained in the paper yourself. The page numbers in the program refer to the paper's page numbers.

Acknowledgment: It goes without saying that your instructor is very appreciative of Ron Cody's work on the paper. It presents an excellent and concise introduction to the numerous character functions that are available in SAS.

Objectives

Upon completion of this lesson, you should be able to:

  • use a LENGTH statement to set the desired length of a character variable
  • use the concatenation operator (||) to join two or more character strings
  • use the COMPBL function to convert multiple blanks in a character string to a single blank
  • use the COMPRESS function to remove characters from a string
  • use the VERIFY function to check that a character variable contains only certain valid characters
  • use the TRIM function to remove the trailing blanks from a character string
  • use the SUBSTR function to select a subset of consecutive characters from a larger string
  • use the SUBSTR function on the left-hand side of an equal sign
  • use the SUBSTR function to unpack a string of characters into its individual characters
  • use the INPUT function to convert a character variable to a numeric variable
  • use the PUT function to convert a numeric variable to a character variable
  • use the SCAN function to parse a string and/or extract part of a string
  • use the INDEX function to locate the position of one string within another string, and use the INDEXC function to locate the first occurrence of any of several characters within a string
  • use the UPCASE function to change lowercase letters to uppercase letters, and use the LOWCASE function to change uppercase letters to lowercase letters
  • use the PROPCASE function to capitalize the first letter of each word
  • use the TRANWRD function to translate a word, that is, to replace every occurrence of one word or substring with another
  • use the CATS function to strip leading and trailing blanks before joining two or more strings
  • use the CATX function to strip leading and trailing blanks, and then join two or more strings with a specified character inserted between the strings
  • use the LENGTHC function to return the storage length of a character variable
  • use the LENGTH and/or LENGTHN functions to determine the length of a character variable not counting trailing blanks
  • use the COUNT function to count the number of times a particular substring appears in a string
  • use the COUNTC function to count the number of times one or more characters appear in a string
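
To make the objectives above a little more concrete, here is a hedged sketch that exercises many of the listed functions in a single DATA step. All variable names and values are hypothetical and are not taken from Ron Cody's paper or from roncody.sas. The comments show the values each call is expected to produce.

```sas
data _null_;
   /* LENGTH statement: set the storage length before the first assignment */
   length name $ 20;
   name = '  ron   cody';                  /* hypothetical value with extra blanks */

   stored  = lengthc(name);                /* storage length: 20 */
   visible = length(name);                 /* position of last nonblank character: 12 */

   /* STRIP drops leading and trailing blanks, COMPBL squeezes runs of
      blanks to one, PROPCASE capitalizes each word: 'Ron Cody' */
   clean = propcase(compbl(strip(name)));

   first = scan(clean, 1, ' ');            /* first word:  'Ron'  */
   last  = scan(clean, 2, ' ');            /* second word: 'Cody' */

   glued  = cats(last, first);             /* 'CodyRon'  */
   listed = catx(',', last, first);        /* 'Cody,Ron' */
   piped  = trim(last) || '/' || first;    /* concatenation operator */

   pos  = index(clean, 'Cody');            /* 5 */
   n_o  = countc(clean, 'o');              /* 2 */
   n_on = count(clean, 'on');              /* 1 */
   bad  = verify('12a45', '0123456789');   /* 3, the first non-digit */
   id   = compress('A-12 34', '- ');       /* 'A1234' */
   addr = tranwrd('Main Street', 'Street', 'St.');   /* 'Main St.' */

   num = input('123', 8.);                 /* character to numeric: 123 */
   chr = put(num, 3.);                     /* numeric to character: '123' */

   code = 'ABC123';
   substr(code, 1, 3) = 'XYZ';             /* SUBSTR on the left: 'XYZ123' */

   /* Write everything to the log for inspection */
   put stored= visible= clean= first= last= glued= listed= piped=
       pos= n_o= n_on= bad= id= addr= num= chr= code=;
run;
```

Because the step uses DATA _NULL_, it creates no data set. The PUT statement simply writes each result to the log, which makes it easy to compare against the comments.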