IT 223 -- Apr 17, 2024

Review Exercises

  1. Use R to create a data vector with entries from 1 to 100. Answer:
    > v <- 1:100
    
  2. Use R to create a data vector with entries that start at 1, end at 3, and increase by 0.01 from one entry to the next. Answer:
    > v <- seq(1, 3, 0.01)
    
  3. What is the difference between SD and SD+? Answer:
    SD = square root of (sum of squared deviations divided by n)
    SD+ = square root of (sum of squared deviations divided by n-1)
    SD+ is slightly larger than SD.
  4. Which R functions do you use to compute the following for the data vector x:
    1. Sample mean: mean(x)
    2. Sample standard deviation (SD+): sd(x)
    3. Sample median: quantile(x, 0.5) or median(x)
    4. Histogram: hist(x)
    5. Boxplot: boxplot(x)
    6. Scatterplot of data vector vs. observation number: plot(1:length(x), x)
  5. Use the R function plot to draw a standard normal density histogram, for which μ=0 and σ=1. Use the xlim argument to set the range of x to [-4, 4]. Use the ylim argument to set the range of y to [0, 1]. Use type="l" to connect the heights of the histogram with line segments. Use the main argument to set the title of the plot. Answer:
    > x <- seq(-4, 4, 0.05)
    > y <- dnorm(x)
    > plot(x, y, type="l", xlim=c(-4, 4), ylim=c(0, 1), main="Standard Normal Density")
    
  6. What is a z-score? How do you use R to compute the z-scores for a data vector?
    Answer: a z-score for an observation tells you how many standard deviations the observation is above (positive) or below (negative) the sample mean. You compute the z-scores of a data vector x like this:
    > z <- (x - mean(x)) / sd(x)
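
The formulas from exercises 3 and 6 can be checked directly in R. The sketch below uses a small made-up data vector (an assumption for illustration, not course data) and computes SD by hand with n, SD+ by hand with n-1, and then the z-scores.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Sum of squared deviations from the sample mean
ss <- sum((x - mean(x))^2)

sd_n    <- sqrt(ss / n)        # SD: divide by n
sd_plus <- sqrt(ss / (n - 1))  # SD+: divide by n - 1

# R's sd() computes SD+, so this comparison should be TRUE
all.equal(sd_plus, sd(x))

# z-scores: distance from the sample mean in units of SD+
z <- (x - mean(x)) / sd(x)
mean(z)  # essentially 0, up to rounding error
sd(z)    # 1
```

Because the z-scores are computed with the sample mean and SD+, they always have mean 0 and SD+ equal to 1, whatever the original data vector was.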
    

The Normal Distribution

Some R functions:
dnorm, pnorm, qnorm, rnorm
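
A short sketch of what each of the four functions returns for the standard normal distribution. The seed value and the sample size of 1000 are arbitrary choices for this illustration.

```r
# mean = 0 and sd = 1 are the defaults, so these use the standard normal
dnorm(0)      # height of the density curve at 0, about 0.3989
pnorm(1.96)   # area under the curve to the left of 1.96, about 0.975
qnorm(0.975)  # value with area 0.975 to its left, about 1.96

# rnorm draws random values; setting a seed makes the run repeatable
set.seed(223)  # arbitrary seed, assumed only for this sketch
r <- rnorm(1000, mean = 0, sd = 1)
mean(r)  # should be near 0
sd(r)    # should be near 1
```

Note that pnorm and qnorm are inverses of each other: pnorm turns a value into an area, and qnorm turns an area back into a value.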

Biased vs. Unbiased; Heteroscedastic vs. Homoscedastic for Graphs

Practice Problems

  1. Work problems from the Practice Problems on the Area under the Normal Curve.
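
Areas under the normal curve can be checked with pnorm. The sketch below is one possible approach; the mean of 100, the SD of 15, and the endpoints 85 and 130 are made-up values for illustration, not taken from the practice problems.

```r
# Area under the standard normal curve between -1 and 1
pnorm(1) - pnorm(-1)          # about 0.6827

# Area to the right of 2, computed two equivalent ways
1 - pnorm(2)                  # about 0.0228
pnorm(2, lower.tail = FALSE)

# Hypothetical non-standard normal curve
mu <- 100
sigma <- 15
a <- 85
b <- 130

# Directly, by passing mean and sd to pnorm
area_direct <- pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)

# By converting the endpoints to z-scores first
area_z <- pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma)

area_direct  # about 0.8186
area_z       # same value
```

The z-score route matches what you would do by hand with a standard normal table.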

Normal Plots

Practice Problems

  1. Compute normal scores (Van der Waerden's method) for a dataset of size 9.
  2. Construct the normal plot by hand for this dataset:
     
           81   95   97   101   112   125   129   167   220
  3. Create the normal plot for this dataset with R.
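
One way to check the hand computations in R. This sketch assumes Van der Waerden's normal scores are the standard normal values at the proportions i/(n+1), and it puts the normal scores on the horizontal axis; swap the axes if the course convention differs.

```r
# Dataset from problem 2
x <- c(81, 95, 97, 101, 112, 125, 129, 167, 220)
n <- length(x)

# Normal scores: standard normal values at proportions i / (n + 1)
p <- (1:n) / (n + 1)
scores <- qnorm(p)
round(scores, 2)

# Normal plot: sorted data against the normal scores
plot(scores, sort(x),
     xlab = "Normal score", ylab = "Sorted data",
     main = "Normal Plot")

# Built-in alternative; qqnorm uses slightly different proportions,
# so its points will not match the hand-computed scores exactly
qqnorm(x)
qqline(x)
```

If the data were close to normally distributed, the points would fall roughly along a straight line.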

Random Variable Simulation

Project 2BC