Introduction to SAS
History of SAS
- SAS means Statistical Analysis System.
- SAS was created by Anthony Barr, a graduate student at North Carolina State University, starting in 1966.
- By 1971, SAS was popular in the academic community and also for pharmaceutical and agricultural research.
- The syntax of SAS source code was inspired by the PL/I programming language, which was popular in the 1960s and 70s.
- In 1976, SAS Institute, Inc. was incorporated by Barr, Goodnight, Sall, and Helwig. It is headquartered in Cary, North Carolina.
- In addition to statistical analysis, SAS is a popular database management and data mining tool at many companies.
- See this blog for information on companies that use SAS.
Invoking SAS
- To invoke the SAS software, select Start >> All Programs >> R >> SAS 9.2 English.
This will bring up the SAS Display Manager Window, which is a multiple document interface (MDI) application
that contains several child windows.
- Here are some of the important child windows:
Explorer: Lets the user access resources associated with SAS, such as the Windows file system,
SAS libraries, and file shortcuts.
Editor: The SAS source code editor. The last statement in your source code should be run;
To submit your SAS source code to be run, there are three options:
- Click the Submit button (icon of a running person) in the toolbar,
- Select Run >> Submit in the main menu,
- Press the F8 key.
A SAS program consists of data steps, which create SAS datasets for analysis, and proc steps,
which analyze those datasets.
Log: Reports which SAS data steps and procs were run, how long they took, and which errors occurred.
Output: Shows the output produced by the proc steps. It may contain duplicate output from repeated
submissions of similar source code. Use the Results window to delete duplicate items.
Results: A tree view that helps you locate items in the Output window and delete duplicate items.
Help: A valuable source of help for writing your SAS source code.
Invoke the Help window by selecting Help >> SAS Help and Documentation in the main menu, or by pressing the F1 key.
SAS Data Step
- Use a data step to create a SAS dataset.
- Suppose you want to create a SAS dataset that contains the following data:
StudentID | Name | Gender | Year | Midterm | Final |
1138 | William | M | 3 | 82 | 84 |
1422 | Chloe | F | 1 | 72 | 75 |
2293 | Michael | M | 4 | 93 | 95 |
2483 | Anthony | M | 4 | 97 | 98 |
2568 | Andrew | M | 1 | 74 | 55 |
3320 | Sophia | F | 4 | 99 | 98 |
3484 | Emily | F | 3 | 91 | 98 |
3610 | Alexander | M | 2 | 78 | 79 |
3859 | Joshua | M | 1 | 19 | 0 |
4738 | Natalie | F | 2 | 85 | 79 |
5187 | Elizabeth | F | 3 | 90 | 91 |
6247 | Isabella | F | 1 | 89 | 88 |
6992 | Daniel | M | 3 | 82 | 93 |
7123 | Jacob | M | 1 | 75 | 73 |
7156 | Samantha | F | 2 | 82 | 80 |
8471 | Olivia | F | 3 | 89 | 87 |
- See the ExamSco1 Example for the SAS source code that creates the SAS dataset examsco.
- The name of the SAS dataset is specified by the first line
data examsco;
Every statement in a SAS program must end with a semicolon (;).
Use an input statement to name the SAS variables and read their values:
input studentid name $ gender $ year midterm final;
Use a dollar sign ($) after a variable name to designate a character (text) variable, such as a name or a category.
Use a datalines statement to designate data that is embedded in the
SAS source code:
datalines;
1138 William M 3 82 84
1422 Chloe F 1 72 75
;
A semicolon on a line by itself marks the end of the embedded data.
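Putting the pieces above together, a complete data step might look like the following sketch. It uses only the first three rows of the table; the variable names studentid, midterm, and final are assumed from the table headings and may differ from those in the ExamSco1 Example.

```sas
* Create the dataset examsco from embedded data lines;
data examsco;
input studentid name $ gender $ year midterm final;
datalines;
1138 William M 3 82 84
1422 Chloe F 1 72 75
2293 Michael M 4 93 95
;
run;

* Display the dataset to check that it was read correctly;
proc print data=examsco;
run;
```

With this simple form of input, character variables have a default length of 8, so longer names such as Alexander would be truncated unless a length statement (for example, length name $ 12;) appears before the input statement.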
Use @@ at the end of an input statement to hold a
data line for reading additional observations:
data nist;
input weight @@;
datalines;
9.999591 9.999600 9.999594 9.999601 9.999598
9.999594 9.999599 9.999597 9.999599 9.999597
;
Instead of including the input data as embedded data lines in the source
code, the data can be read from an input file like the
ExamSco2 Example, where
an infile line is used:
infile "c:/sas-examples/students.txt";
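As a rough sketch of how the infile statement fits into a whole data step, assuming students.txt holds one space-separated observation per line in the same order as the table above (the path is only an example and should be changed to wherever the file is stored):

```sas
data examsco;
* Assumed location of the input file;
infile "c:/sas-examples/students.txt";
* Read one observation from each line of the file;
input studentid name $ gender $ year midterm final;
run;
```

The infile statement takes the place of the datalines statement and the embedded data; the input statement is unchanged.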
Sometimes an input file is formatted without spaces to conserve space, for example in the
input file examsco3.txt. In this case,
column numbers are needed in the input statement to properly read the data, as shown in the
source code for the ExamSco3 Example.
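The following sketch illustrates column input. The column positions are hypothetical and are not taken from examsco3.txt: the student ID is assumed to occupy columns 1-4, the name columns 5-13, gender column 14, year column 15, and the two scores columns 16-18 and 19-21. Embedded data lines are used here so the sketch is self-contained.

```sas
data examsco;
* Each variable is followed by the columns it occupies;
input studentid 1-4 name $ 5-13 gender $ 14 year 15
      midterm 16-18 final 19-21;
datalines;
1138William  M3 82 84
1422Chloe    F1 72 75
3610AlexanderM2 78 79
;
run;
```

Because the column range gives name a width of nine characters, the full name Alexander is stored even though it runs directly into the gender code.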
Explicit loops are usually not required in SAS. The code in the data step is repeated, once per observation, until
the end of the data lines or input file is reached.
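A small sketch of this implicit loop: the assignment statements below run once for every data line, with no loop written in the code. The dataset name scores and the definition of examave as the simple mean of the two scores are assumptions made for illustration; the ExamSco examples may define examave differently.

```sas
data scores;
input name $ midterm final;
* Computed once per observation;
examave = (midterm + final) / 2;
* The automatic variable _n_ counts data step iterations;
obsnum = _n_;
datalines;
William 82 84
Chloe 72 75
Michael 93 95
;
run;
```

The resulting dataset has three observations, with examave equal to 83, 73.5, and 94 and obsnum equal to 1, 2, and 3.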
There are two ways to place comments in SAS source code:
* This is a SAS traditional style comment;
* Each line begins with a star and ends;
* with a semicolon;
/* This is a C style SAS comment. Begin a
(possibly multiline) comment with slash
star; End the comment with star slash. */
SAS Proc Step
- A SAS proc step is used to display, process, or analyze the data that was created in the data step.
A proc step can also produce graphical output or create new datasets.
- Here are some examples of SAS Proc statements:
* Print SAS dataset;
proc print data=examsco;
var name gender year;
* Sort SAS dataset;
* If dataset name is omitted, use the most;
* recently created dataset;
proc sort;
by name;
* Show descriptive statistics: sample size;
* mean, and standard deviation;
proc means n mean std;
var midterm final;
* Show univariate statistics;
* Include stem plot, boxplot, and normal plot;
proc univariate plots normal;
var examave;
* Compute correlation of midterm and final scores;
proc corr;
var midterm final;
* Compute regression equation and output dataset;
* that includes residuals and predicted values;
proc reg;
model final = midterm;
output out=diagnostic residual=resid predicted=predict;
run;
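As an illustration of a proc step that creates a new dataset, the sketch below sorts examsco into a second dataset and prints it. The dataset name byfinal is made up for this example, and the variable names are assumed from the table headings.

```sas
* Sort by final score, highest first, into a new dataset;
proc sort data=examsco out=byfinal;
by descending final;
run;

* Print selected variables from the sorted dataset;
proc print data=byfinal;
var name midterm final;
run;
```

Naming the input dataset with data= and the result with out= leaves examsco in its original order and avoids relying on the most recently created dataset.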