Practical Data Management
and Statistical Computing (BioEp691F)
Lecture 4
1. Review
- Saving an ASCII data set from the WEB and reading it into
SAS (Example: lec1sm1.sas, Data:
lec1sm2.dta)
- When saving ASCII data, formatting characters can cause
problems. Possible solutions include:
  - Open MS Word. Then, in Netscape, choose SELECT ALL from the
  Edit menu, choose COPY, and paste the data into MS Word. Save
  the file as a .txt file with carriage returns
  (i.e., lec1sm2.txt).
  - Open Notepad (from Start, select Programs, and then
  Accessories). Then, in Netscape, choose SELECT ALL from the
  Edit menu, choose COPY, and paste the data into Notepad. Save
  the file as a .dat file with carriage returns
  (i.e., lec1sm2.dat).
- Printing SAS OUTPUT in a word processor: use a
non-proportional font (COURIER or SAS MONOSPACE).
- Changing ffffff to ------- in output from PROC FREQ: examine
the form character option (FORMCHAR) in the CONFIG.SAS file
in the C:\SAS directory.
- Changing the Explorer View Options so that the full name
(including extensions) can be seen.
- Using a header and titles effectively (use your own
names).
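
The header, title, and form character items above can be sketched in one short program. This is only an illustration, not one of the course programs: the program name, the author placeholder, the title text, and the small DEMO data set are all made up. The FORMCHAR setting shown replaces the special table-border characters with ordinary keyboard characters, so that PROC FREQ tables print with dashes and bars in any monospace font.

```sas
/*------------------------------------------------------------
  Program : lec4demo.sas   (hypothetical name)
  Author  : Your Name      (replace with your own)
  Purpose : Show a program header, titles, and plain-text
            table borders in PROC FREQ output
------------------------------------------------------------*/

* Use ordinary keyboard characters for table borders;
options formchar='|----|+|---+=|-/\<>*';

title1 'BioEp691F Lecture 4: header and title demonstration';
title2 'Prepared by Your Name';

* Tiny made-up data set so that PROC FREQ has something to tabulate;
data demo;
  input group $ outcome $;
  datalines;
a yes
a no
b yes
b yes
;
run;

proc freq data=demo;
  tables group*outcome;
run;
```

Setting FORMCHAR in an OPTIONS statement affects only the current session; changing it in CONFIG.SAS, as noted above, would make the setting the default each time SAS starts.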
2. Identifying invisible ASCII characters in an ASCII data set
Review:
- When saving data from the WEB, formatting characters can cause
problems. This was illustrated in Lecture 3 as follows:
- Save lec1sm2.dta in the C:\DATA
directory as follows:
  - Open the data set by clicking on it in your WEB
  browser.
  - Click on the FILE pull-down menu in your WEB browser,
  and then select SAVE AS, placing the file in the C:\DATA
  directory with the name lec1sm2.dta.
- Copy the program dmes99p4.sas
into SAS, and run the program.
- Examine the program log (dmes99p4.log)
- Discussion: The log file indicates errors in the data that
prevent SAS from reading the last variable. Inspecting the data
in NOTEPAD reveals no problem. What's wrong? One possibility is
that the data set contains hidden characters (ASCII characters
that do not appear on the screen in programs like NETSCAPE or
NOTEPAD), and these confuse SAS. Passing the data through other
software, as in the MS Word and Notepad steps in the Review,
removes these special characters. We can identify some of these
characters by reading the saved data into the SAS program
window.
- In SAS, open the data set lec1sm2.dta in the program
window. You should see some characters that are not
alphanumeric.
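
As a complement to viewing the file in the program window, SAS can also write each raw record to the log. The following is a minimal sketch, not one of the course programs, and it assumes the file was saved as C:\DATA\lec1sm2.dta as described above. The LIST statement writes the current input record to the log; when a record contains unprintable characters, SAS also shows their hexadecimal codes beneath the text.

```sas
* Sketch only: print each raw record of the saved file to the log;
data _null_;
  infile 'C:\DATA\lec1sm2.dta';
  input;    * read the record into the input buffer without creating variables;
  list;     * write the record to the log, with hex codes if unprintable;
run;
```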
Identifying ASCII characters:
- We use the BYTE and RANK functions in SAS to see the
ASCII codes that correspond to the characters in an ASCII file
(dmes99p7.sas). The program uses these new features:
- INFILE with the option MISSOVER (so that SAS does not go to
a new record when a line is too short, and instead sets the
remaining variables to missing)
- INPUT (v1-v5) ($1.) ; as a shorthand to input a set of
variables
- ARRAY v{5} ; to define an array of variables
- DO i=1 TO 5; ...... END; to repeat an operation for each of
the five variables in the array
- RANK(v1) to determine the ASCII code equivalent of an
ASCII character variable
- FILE PRINT; to route output from PUT statements to the
OUTPUT window
- PUT ..... to write out results directly as data lines are
processed.
- BYTE(n) to convert a numeric ASCII code n (such as the value
returned by RANK) back to the corresponding ASCII character.
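
The features listed above can be combined roughly as follows. This is a sketch under stated assumptions, not the course program dmes99p7.sas: it assumes the file is C:\DATA\lec1sm2.dta, it looks only at the first five characters of each record, and the variable names CODE and BACK are made up for illustration.

```sas
* Sketch only: not the course program dmes99p7.sas;
data _null_;
  * MISSOVER keeps SAS on the current record when a line is short;
  infile 'C:\DATA\lec1sm2.dta' missover;

  * Read the first five columns as five one-character variables;
  input (v1-v5) ($1.);

  * Group the five variables so they can be processed in a loop;
  array v{5} $ v1-v5;

  * Send PUT output to the OUTPUT window instead of the log;
  file print;

  do i = 1 to 5;
    code = rank(v{i});   * ASCII code of the character;
    back = byte(code);   * character that corresponds to the code;
    put _n_= i= code= back=;
  end;
run;
```

Codes below 32 in the printed lines point to control characters such as tabs (9) or carriage returns (13), which are the usual candidates for characters that are invisible in a browser or in Notepad.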
3. Describing Data
Once a SAS data set has been created, you can easily do
exploratory analysis using ANALYZE. This is illustrated with the SAS
data set ICUA3.sd2 that was created in
Assignment #3. To do so:
- Make sure there is a SAS data set open via the Library Icon.
- Click on the Global pull down menu.
- Select the Analyze option.
- Select Interactive Data Analysis.
- Move to the WORK library, and select the SAS data set you want
to use, clicking on the data set.
- Use the Analyze pull down menu to select the analysis.
You can copy results from the interactive analysis to a word
processor, and use these results in a report.
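
The menu steps above expect the data set to be available in the WORK library. A minimal sketch of one way to put it there from a program, followed by a quick programmatic look at the data, is shown here. It is an illustration only: the folder C:\DATA and the libref MYDATA are assumptions, and PROC CONTENTS and PROC MEANS are swapped in as simple stand-ins for a first look, not as a replacement for the interactive analysis.

```sas
* Sketch only: the folder C:\DATA and the libref MYDATA are assumptions;
libname mydata 'C:\DATA';

* Copy the permanent data set into WORK so it appears in the WORK library;
data work.icua3;
  set mydata.icua3;
run;

* List the variables and their attributes;
proc contents data=work.icua3;
run;

* Basic summary statistics for all numeric variables;
proc means data=work.icua3 n mean std min max;
run;
```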