Example #1. Intensive Care Unit Data [uses PROC FORMAT]
Obtaining Frequency Distributions
Introduction
We illustrate how to obtain frequency distributions of a
variable for all subjects, and for subgroups of subjects,
using the SAS procedures:
- PROC FREQ;
- PROC FORMAT;
- PROC SORT;
We use as an example data from the Intensive Care Study
(icu.dat). A code book (icu.txt) describes the variables
stored in the columns of the data file, and is needed to
interpret the results. We assume that you:
- i. have a copy of SAS installed on your computer
- ii. have a printed copy of the code book (icu.txt)
- iii. have previously saved a copy of the file icu.dat
in the "c:\temp\" directory on your computer.
- iv. have opened the SAS program.
- v. have copied the program freqp1.sas
into the SAS program window.
Obtaining Frequency Distributions
Frequency distributions are obtained using the procedure
PROC FREQ. A TABLES statement identifies the variable(s)
for which frequency distributions will be tabulated.
If frequency distributions are desired for subgroups of the
subjects, cross-classifications of variables can be used.
Descriptive names can be associated with the values in the
tables using PROC FORMAT.
Annotated Discussion of Program freqp1.sas
OPTIONS LS=72 PS=55 NODATE NONUMBER NOCENTER;
******************************************************;
*** Program Date Disk Programmer ;
TITLE1 "Source: FREQP1.SAS 9/24/98 Ed Stanek ";
* DESCRIPTION: ;
* a. Obtaining simple frequency tables ;
* b. Cross-classified frequency tables for 2 vars ;
* c. Attaching formats to discrete values ;
* using ICU.SD2 created from icu.dat ;
* Examples of PROC FREQ, PROC FORMAT ;
******************************************************;
*****************************;
*** Read ICU Study data ***;
*****************************;
DATA icu;
INFILE 'c:\temp\icu.dat' FIRSTOBS=11;
INPUT id sta age sex race ser can crn inf cpr sys hra
pre typ fra po2 ph pco bic cre loc;
PROC FREQ;
TABLES sex ;
TITLE2 "Table 1a. Frequency Distribution of Gender";
- These statements illustrate a simple use of PROC FREQ.
The TABLES statement specifies the variables for which a
frequency distribution is to be tabulated. Because no
DATA= option is given, PROC FREQ uses the most recently
created data set (here, icu).
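As a minimal sketch of the same idea, several variables can be listed in one TABLES statement, and each is given its own one-way table. The sketch assumes that the DATA step above has already created the data set icu; the choice of race and ser and the title text are only for illustration.

```sas
* Sketch: one-way frequency tables for several variables at once;
* Assumes the data set icu was created by the DATA step above;
PROC FREQ DATA=icu;
  TABLES sex race ser;
  TITLE2 "One-way frequency tables for sex, race, and ser";
RUN;
```

Naming the data set with DATA=icu is optional here, but it makes the step independent of which data set was created last.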
PROC FORMAT;
VALUE sexf
0="Male"
1="Female";
VALUE staf
0="Live"
1="Die";
PROC FREQ;
TABLES sex ;
FORMAT sex sexf. ;
TITLE2 "Table 1b. Frequency Distribution of Gender with Formats";
- It is often valuable to know what the rows in the
frequency table stand for. We substitute descriptive names
for the values using PROC FORMAT. Each set of names is
assigned a format name, given in a VALUE statement. For the
variable sex, the format name assigned is sexf. The names
are defined by equating each value to a descriptive name.
Once all values have been listed, the VALUE statement ends
with a semicolon. More than one VALUE statement can be used
in a single PROC FORMAT step.
- Once PROC FORMAT has been run, other procedures in SAS
can use the formats that have been defined by including an
optional FORMAT statement in the procedure. The FORMAT
statement tells SAS to display the descriptive names from a
format (e.g., sexf) in place of the values of a variable
(e.g., sex). A period must follow the format name in a
FORMAT statement (sexf.) so that SAS can distinguish the
format name from a variable name (sex).
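The same pattern applies to any variable with a defined format. As a hedged sketch, the staf format from the program can be attached to sta in a one-way table. The sketch repeats the VALUE statement so that it stands alone, and assumes the data set icu already exists; the title text is illustrative.

```sas
* Sketch: define the staf format, then use it in PROC FREQ;
PROC FORMAT;
  VALUE staf
    0="Live"
    1="Die";
RUN;

* Assumes the data set icu was created by the DATA step above;
PROC FREQ DATA=icu;
  TABLES sta;
  FORMAT sta staf.;  * the trailing period marks staf as a format name;
  TITLE2 "Frequency distribution of survival status with formats";
RUN;
```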
PROC FREQ;
TABLES age;
TITLE2 "Table 1c. Frequency Distribution of age";
PROC FREQ;
TABLES sex*sta
/MISSPRINT;
TITLE2 "Table 1d. Freq Dist of Gender by Survival cross-classification";
PROC FREQ;
TABLES sex*sta
/MISSPRINT NOROW NOCOL NOPERCENT;
FORMAT
sex sexf.
sta staf. ;
TITLE2 "Table 1e. Freq Dist of Gender by Survival cross-classification";
TITLE3 " with no percentages given ";
- Frequency distributions can be constructed for
continuous variables (such as age), but may not be very
valuable, since many distinct values may each occur only a
few times. Frequencies of cross-classifications are
requested by joining two variables with an asterisk:
sex*sta. When one or more values of a variable are missing,
the optional /MISSPRINT displays the missing values in the
table, although they are not included in the percentages.
Note that the output of a cross-classification includes
percentages formed by dividing first by the overall total,
next by the row totals, and finally by the column totals.
These percentages can be suppressed with the options
NOPERCENT, NOROW, or NOCOL, respectively.
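One way to make a frequency table of a continuous variable more useful is to group its values with a format, since PROC FREQ tabulates by formatted value when a format is attached. The following is only a sketch: the format name agef, the cutoffs 40 and 65, and the labels are hypothetical and do not come from freqp1.sas or the code book. It assumes the data set icu already exists.

```sas
* Sketch: group a continuous variable with a format before tabulating;
* The format name agef and the cutoffs 40 and 65 are hypothetical;
PROC FORMAT;
  VALUE agef
    LOW-<40 = "Under 40"
    40-<65  = "40 to under 65"
    65-HIGH = "65 and over";
RUN;

* Assumes the data set icu was created by the DATA step above;
PROC FREQ DATA=icu;
  TABLES age;
  FORMAT age agef.;
  TITLE2 "Frequency distribution of age in groups";
RUN;
```

The range operator -< excludes the upper endpoint, so each age falls in exactly one group.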
PROC SORT;
BY sta;
PROC FREQ;
BY sta;
TABLES sex
/LIST;
FORMAT
sex sexf.
sta staf. ;
TITLE2 "Table 2. Freq Dist. of Gender by Survival Status";
RUN;
- Separate frequency tables for each survival group can
be obtained by first sorting the data by survival status
(sta) using PROC SORT, and then including a BY statement in
PROC FREQ, which forms a frequency distribution separately
for each level of survival status. The data must be sorted
by the BY variable before the BY statement is used. The
example also illustrates the LIST option, which changes the
way in which the frequency table is printed; its effect is
most visible for cross-classified tables, which it prints
as one line per combination of values.
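For comparison, a similar breakdown can be sketched without sorting by requesting the cross-classification directly and printing it in list form. This is an alternative layout, not part of freqp1.sas; it assumes that the data set icu exists and that the formats sexf and staf have already been defined by PROC FORMAT. The title text is illustrative.

```sas
* Sketch: cross-classification printed in list form, no PROC SORT needed;
* Assumes icu exists and the formats sexf and staf are already defined;
PROC FREQ DATA=icu;
  TABLES sta*sex / LIST;
  FORMAT sex sexf. sta staf.;
  TITLE2 "Gender by survival status in list form";
RUN;
```

Unlike the BY-group tables, the percentages in this list are taken over all subjects rather than within each survival group.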