UMASS Department of:
Biostatistics and Epidemiology

Example #1. Intensive Care Unit Data [uses PROC FORMAT]


Obtaining Frequency Distributions


Introduction

We illustrate how to obtain frequency distributions of a variable for all subjects, and for subgroups of subjects, using the SAS procedures:

  • PROC FREQ;
  • PROC FORMAT;
  • PROC SORT;

We use as an example data from the Intensive Care Study (icu.dat). A code book (icu.txt) describes the variables stored in the columns of the data file, and is needed to interpret the results. We assume that you:

  • i. have a copy of SAS installed on your computer.
  • ii. have a printed copy of the code book (icu.txt).
  • iii. have previously saved a copy of the file icu.dat in the "c:\temp\" directory on your computer.
  • iv. have opened the SAS program.
  • v. have copied the program freqp1.sas into the SAS program window.

Obtaining Frequency Distributions.

Frequency distributions are obtained using the procedure PROC FREQ. A TABLES statement identifies the variable(s) for which frequency distributions will be tabulated. If frequency distributions are desired for subsets of the subjects, cross-classifications of variables can be used. Descriptive names can be associated with the values in the tables using PROC FORMAT.


Annotated Discussion of Program freqp1.sas


OPTIONS LS=72 PS=55 NODATE NONUMBER NOCENTER;
******************************************************;
*** Program Date Disk Programmer ;
TITLE1 "Source: FREQP1.SAS 9/24/98 Ed Stanek ";
* DESCRIPTION: ;
* a. Obtaining simple frequency tables ;
* b. Cross-classified frequency tables for 2 vars ;
* c. Attaching formats to discrete values ;
* using ICU.SD2 created from icu.dat ;
* Examples of PROC FREQ, PROC FORMAT ;
******************************************************;
*****************************;
*** Read ICU Study data ***;
*****************************;
DATA icu;

INFILE 'c:\temp\icu.dat' FIRSTOBS=11;
INPUT id sta age sex race ser can crn inf cpr sys hra
pre typ fra po2 ph pco bic cre loc;


PROC FREQ;

TABLES sex ;
TITLE2 "Table 1a. Frequency Distribution of Gender";
 
  • These statements illustrate a simple use of PROC FREQ. The TABLES statement specifies the variables for which a frequency distribution is to be tabulated. Because no data set is named on the PROC FREQ statement, SAS uses the most recently created data set (here, icu).
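
The TABLES statement can list more than one variable, in which case a separate one-way table is printed for each. The following is a minimal sketch of that, not part of freqp1.sas. It assumes a small made-up data set (called toy here, with invented values) so that it can be run without icu.dat.

```sas
* Hypothetical toy data with invented values, not the ICU study data;
DATA toy;
  INPUT id sex sta age;
  DATALINES;
1 0 0 27
2 1 0 59
3 0 1 77
4 1 0 54
5 0 0 87
6 1 1 69
;
RUN;

* One-way tables for two variables, naming the data set explicitly;
PROC FREQ DATA=toy;
  TABLES sex sta;
  TITLE "One-way tables for sex and sta (toy data)";
RUN;
```

Naming the data set with DATA= is optional, but it guards against tabulating the wrong data when several data sets have been created in the same session.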

 



PROC FORMAT;

VALUE sexf
0="Male"
1="Female";

VALUE staf

0="Live"
1="Die";

PROC FREQ;

TABLES sex ;
FORMAT sex sexf. ;
TITLE2 "Table 1b. Frequency Distribution of Gender with Formats"; 
  • It is often useful to know what the rows in a frequency table stand for. We substitute descriptive labels for the coded values using PROC FORMAT. Each VALUE statement defines one format: it gives the format name, followed by a list equating each value to a label. For the variable sex, the format name is sexf. The VALUE statement ends with a semicolon after the last value. More than one VALUE statement can appear in a single PROC FORMAT step.
  • Once PROC FORMAT has been run, other SAS procedures can use the formats it defined by including an optional FORMAT statement. The FORMAT statement tells SAS to display the labels from a format (e.g., sexf) in place of the values of a variable (e.g., sex). A period must follow the format name in a FORMAT statement (sexf.) so that SAS can distinguish the format name from a variable name.
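
Because a format name is separate from any variable name, one format can be attached to several variables that share the same coding. The sketch below illustrates this under stated assumptions: the data set toy, the indicator variables flag_a and flag_b, and the format yesnof are all made up for the illustration and do not come from the ICU code book.

```sas
* Hypothetical toy data: flag_a and flag_b are made-up 0/1 indicators;
DATA toy;
  INPUT id flag_a flag_b;
  DATALINES;
1 0 1
2 1 1
3 0 0
4 1 0
5 0 1
;
RUN;

* Define one format for the shared 0/1 coding;
PROC FORMAT;
  VALUE yesnof
    0 = "No"
    1 = "Yes";
RUN;

* Attach the same format to both variables in one FORMAT statement;
PROC FREQ DATA=toy;
  TABLES flag_a flag_b;
  FORMAT flag_a flag_b yesnof.;
  TITLE "One format shared by two variables (toy data)";
RUN;
```

A FORMAT statement inside a procedure applies only to that procedure step, so the underlying values in the data set are left unchanged.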

 


 

PROC FREQ;

TABLES age;
TITLE2 "Table 1c. Frequency Distribution of age";

PROC FREQ;

TABLES sex*sta /MISSPRINT;
TITLE2 "Table 1d. Freq Dist of Gender by Survival cross-classification";

PROC FREQ;

TABLES sex*sta /MISSPRINT NOROW NOCOL NOPERCENT;
FORMAT
sex sexf.
sta staf. ;

TITLE2 "Table 1e. Freq Dist of Gender by Survival cross-classification";
TITLE3 " with no percentages given ";

  • Frequency distributions can be constructed for continuous variables (such as age), but may not be very useful, since most values occur only once or twice. Frequencies for a cross-classification of two variables are requested by joining the variables with an asterisk: sex*sta. The first variable forms the rows of the table and the second forms the columns. When one or more values of a variable are missing, adding the MISSPRINT option after a slash displays the frequencies of the missing values in the table, although they are not counted in the percentages. Note that the output of a cross-classification includes three percentages in each cell: first the percentage of the overall total, next the percentage of the row total, and finally the percentage of the column total. These percentages can be suppressed with the options NOPERCENT, NOROW, and NOCOL, respectively.
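
One way to make a frequency table of a continuous variable more readable is to group its values with a format that maps ranges to labels, since PROC FREQ tabulates the formatted values. This is a sketch of that approach and is not part of freqp1.sas. The data set toy, the format name agegrp, and the cut points 40 and 65 are assumptions chosen only for illustration.

```sas
* Hypothetical toy data with invented ages;
DATA toy;
  INPUT id sex age;
  DATALINES;
1 0 27
2 1 59
3 0 77
4 1 54
5 0 87
6 1 69
7 0 35
;
RUN;

* Map ranges of age to labels, where -< excludes the upper end point;
PROC FORMAT;
  VALUE agegrp
    LOW -< 40 = "Under 40"
    40 -< 65  = "40 to 64"
    65 - HIGH = "65 and over";
RUN;

* PROC FREQ counts subjects within each formatted age group;
PROC FREQ DATA=toy;
  TABLES age;
  FORMAT age agegrp.;
  TITLE "Frequency distribution of age in groups (toy data)";
RUN;
```

The grouping exists only in the output. The variable age keeps its original values, so different cut points can be tried by changing the VALUE statement alone.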

 


PROC SORT;

BY sta;

PROC FREQ;

BY sta;

TABLES sex /LIST;

FORMAT

sex sexf.

sta staf. ;

TITLE2 "Table 2. Freq Dist. of Gender by Survival Status";

RUN; 

  • Separate frequency tables for each survival group can be obtained by first sorting the data by survival status (sta) using PROC SORT, and then forming a frequency distribution separately for each level of survival status. To obtain the separate frequency tables, we include a BY statement in PROC FREQ. The data must already be sorted by the BY variable, which is why PROC SORT comes first. The example also illustrates the LIST option, which changes the way a frequency table is printed: cross-classifications are printed as a list, with one line per combination of values, rather than as a grid.
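
The sort-then-BY pattern above can be compared with a single cross-classification printed in list form, which reports the same counts without sorting. The sketch below shows both under stated assumptions: the data sets toy and toy_sorted and their values are made up, and the OUT= option is used so that the original data set is left in its original order.

```sas
* Hypothetical toy data with invented values, not the ICU study data;
DATA toy;
  INPUT id sex sta;
  DATALINES;
1 0 0
2 1 0
3 0 1
4 1 0
5 0 0
6 1 1
;
RUN;

* Write the sorted copy to a new data set instead of overwriting toy;
PROC SORT DATA=toy OUT=toy_sorted;
  BY sta;
RUN;

* One frequency table of sex for each level of sta;
PROC FREQ DATA=toy_sorted;
  BY sta;
  TABLES sex;
  TITLE "Sex within each level of sta (toy data)";
RUN;

* The same counts as one list-style table, with no sort required;
PROC FREQ DATA=toy;
  TABLES sta*sex / LIST;
  TITLE "sta by sex in list form (toy data)";
RUN;
```

The two outputs differ in their percentages: with a BY statement, percentages are computed within each level of sta, whereas the list-style table reports percentages of the overall total.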



Last Update: 9/24/98
Comments: Ed Stanek
Email: stanek@schoolph.umass.edu