UMASS Department of:
Biostatistics and Epidemiology

SAS: The Statistical Analysis System

Reading Data into SAS: (List Input)

Introduction

We illustrate how to read data into SAS using data from the Intensive Care Study (icu.dat) as an example. These data are stored as an ASCII data file in a rectangular format, with columns representing variables and rows representing subjects. Columns are separated by one or more blanks. A code book (icu.txt) describes the variable in each column. To follow this example, you should:

  • i. Have a copy of SAS installed on your computer.
  • ii. Print a copy of the code book (icu.txt).
  • iii. Save a copy of the file icu.dat on your computer. If you are using OIT or SPH&HS computers, save this file in the "c:\temp" directory, which is the location the program reads from. Inspect the data in the file icu.dat, and record the number of the first line that contains data.
  • iv. Open the SAS program.

Reading Data Into SAS

The data file is stored as an ASCII file, with rows corresponding to subjects and columns corresponding to variables. The values in the columns are separated by one or more blanks, which allows the data to be read using list input. A batch program that reads the file c:\temp\icu.dat into SAS is given by LISTP1.SAS. The names of all SAS batch programs should end with the suffix .sas. Each statement in the program ends with a semi-colon (;).

  • a. Copy the program LISTP1.SAS from the web into the PROGRAM window in SAS.
  • b. Run the program by clicking on the icon of the person running.
  • c. Move to the LOG window, and check for errors. If the data file icu.dat is not found, make sure you have copied it to the c:\temp directory.
  • d. Move to the OUTPUT window, and inspect the output. When the program runs correctly, you should view a list of the data.
  • e. Move to the PROGRAM window (which is now empty). Recall your program by clicking on the LOCAL pull down menu, and then clicking on RECALL TEXT. The most recent version of your SAS program is brought back into the PROGRAM window.
  • f. Use the text editor to change the name of the program given in the program text (alter a54p1.sas). We suggest using simple names indexed by numbers for consecutive programs. Make sure your program name ends with .sas.
  • g. Change the name of the programmer to your name. Be careful not to delete the quotation marks.
  • h. Save the program using the SAVE AS command and the new program name.
  • i. Run the program and review the annotated discussion below.

Annotated Discussion of Program LISTP1.SAS


OPTIONS LINESIZE=72 PAGESIZE=55 NODATE NONUMBER NOCENTER;

  • The OPTIONS statement sets values for the program environment. We will use the same settings for all batch programs. The settings correspond to parameters for the number of characters on a line, the number of lines on the page, whether output will include a date or page number, or be centered.
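
To confirm what an OPTIONS statement has set, the current value of an option can be written to the LOG window. The following is a minimal sketch, not part of LISTP1.SAS, and it assumes a SAS release in which PROC OPTIONS accepts the OPTION= parameter:

```sas
* Set the same environment options used in LISTP1.SAS;
OPTIONS LINESIZE=72 PAGESIZE=55 NODATE NONUMBER NOCENTER;

* Write the current value of one option to the LOG window;
PROC OPTIONS OPTION=LINESIZE;
RUN;
```

Options set this way stay in effect for the rest of the SAS session, or until another OPTIONS statement changes them.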


***************************************************************************;

  • Comments in SAS help the user keep track of what the program is doing. Any statement that begins with an asterisk (*) and ends with a semi-colon is a comment. We start each program with a comment indicating what the program is doing.


TITLE1 "Source: LISTP1.SAS 9/24/98 Ed Stanek" ;

  • Titles can be specified that appear at the top of each page in all results. The first line in a title is indicated by TITLE1. A second title line can be indicated by the SAS keyword TITLE2. Titles must be enclosed in quotation marks. It is important that the quotation marks be balanced.
  • Since results are produced by programs, we include in all programs a title that indicates the name of the program that created the output, and the location where the program is stored.
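
As an illustration of the title rules above, the sketch below sets two title lines and then clears them. The text of the second title is a made-up example and does not come from LISTP1.SAS:

```sas
* First and second title lines, each in balanced quotation marks;
TITLE1 "Source: LISTP1.SAS 9/24/98 Ed Stanek";
TITLE2 "Listing of the ICU data (hypothetical second title)";

* Later in a program, a TITLE statement with no text clears all title lines;
TITLE;
```

Titles remain in effect for every later page of output until they are replaced or cleared.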


* DESCRIPTION: Read in ICU Data and create SAS system data set ;
***************************************************************************;

  • Additional comments indicate what the program is doing. Each comment begins with an * and ends with a ; .
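
SAS also accepts a second comment style, delimited by /* and */, which is not used in LISTP1.SAS. A short sketch showing both styles side by side (the DATA _NULL_ step is only there so that the example runs and writes one line to the LOG window):

```sas
* A statement-style comment starts with an asterisk and ends with a semicolon;

/* A delimited comment can span several lines
   and may contain semicolons; like this one. */

DATA _NULL_;   /* a delimited comment can also follow a statement */
  PUT "Comments are ignored when the program runs";
RUN;
```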


DATA d;
  INFILE 'c:\temp\icu.dat' FIRSTOBS=11;
  INPUT id sta age sex race ser can crn inf cpr sys hra pre typ
        fra po2 ph pco bic cre loc;

  • The SAS statements given above constitute a SAS DATA step. A DATA step is used to read data into the SAS system. We will always indent statements in a SAS DATA step to emphasize that these statements go together. A DATA step always begins with the keyword DATA, followed by the name given to the data set that is created. In this example, the SAS system data set is named d, and SAS stores it in a file named "d.sd2". SAS automatically adds the suffix .sd2 to SAS system data sets (the suffix depends on the SAS release; later releases use .sas7bdat). The DATA statement ends with a semi-colon.
  • To identify the data that will be placed in the data set, an INFILE statement gives the location of the file that contains the data. The file location is specified in quotation marks. Quotation marks must be used in matching pairs (two single quotation marks, or two double quotation marks). Don't mix single and double quotation marks. There is one optional parameter specified in the INFILE statement: FIRSTOBS=11 indicates that the data begin on line 11 of the file, so the first 10 lines are skipped. The INFILE statement ends with a semi-colon.
  • The third statement in the SAS data step begins with the SAS keyword INPUT. Following this keyword, the names are listed in order corresponding to values in columns in the data file. These names match the code book icu.txt. Note that the INPUT statement can continue for more than one line. The end of the statement is indicated by the semi-colon.
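
The same three statements can be tried without an external file by placing a few lines of data in the program itself. The sketch below is an illustration under stated assumptions: the data set name demo, the two header lines, and all values are made up, and only three of the ICU variable names are reused. It uses INFILE DATALINES so that FIRSTOBS can skip the header lines, just as FIRSTOBS=11 skips the first 10 lines of icu.dat.

```sas
* Hypothetical data set name and values, for illustration only;
DATA demo;
  * FIRSTOBS=3 skips the two header lines at the top of the data;
  INFILE DATALINES FIRSTOBS=3;
  * Names are assigned to the values on each line in order, left to right;
  INPUT id age sys;
  DATALINES;
made-up header line 1
made-up header line 2
1 27 142
2   59   112
3 77 .
;
RUN;

* List the data set that was just created;
PROC PRINT DATA=demo;
RUN;
```

The second data line shows that several blanks between values are read the same way as one blank. With list input, a missing numeric value has to be marked with a period, as in the last data line, so that the values that follow are not shifted into the wrong variables.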


PROC PRINT;

  • The statement PROC PRINT invokes a SAS procedure that prints a listing of a SAS data set. PROC is short for procedure. Because no data set is named, the most recently created SAS data set (d) is used. A SAS program is sequential, meaning that the order of statements in the program determines the order of processing. The listing of the data appears in the OUTPUT window. The PROC PRINT statement ends with a semi-colon.
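
PROC PRINT can also be told which data set, which variables, and how many observations to list. The sketch below is not part of LISTP1.SAS. The data set demo and its values are hypothetical and are created in the program so that the example runs on its own:

```sas
* Hypothetical data set, for illustration only;
DATA demo;
  INPUT id age sys;
  DATALINES;
1 27 142
2 59 112
3 77 100
;
RUN;

* DATA= names the data set instead of relying on the most recent one;
* OBS=2 limits the listing to the first two observations;
PROC PRINT DATA=demo (OBS=2);
  * VAR selects the variables to list and their order;
  VAR id age;
RUN;
```

Naming the data set with DATA= helps in longer programs, where the most recently created data set may not be the one you intend to print.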


PROC CONTENTS;

  • The statement PROC CONTENTS is a procedure in SAS to list the SAS system information about a data set. The most recently created SAS data set is used. The results are given in the OUTPUT window.


RUN;

  • The final SAS statement is RUN. This statement requests that the previous statements be executed by the SAS program.
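
LISTP1.SAS uses a single RUN statement at the end of the program. A RUN statement may also be placed at the end of each step, which marks where each step ends. A sketch of that layout, using a hypothetical data set named demo with made-up values:

```sas
DATA demo;
  INPUT id age;
  DATALINES;
1 27
2 59
;
RUN;   /* marks the end of the DATA step */

PROC PRINT DATA=demo;
RUN;   /* marks the end of the PROC PRINT step */
```

Both layouts produce the same listing. Ending every step with RUN can make it easier to match messages in the LOG window to the step that produced them.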




Last Update: 9/2/98
Comments: Ed Stanek
Email: stanek@schoolph.umass.edu