Midterm Review Guide
Format of Exam Questions
- Short answer, multiple choice, short essay, problems, interpret R output
Items to Know for Midterm
- A univariate dataset can be summarized by a histogram or boxplot.
- A univariate normal dataset
- can be described by a normal histogram with center = μ and spread = σ.
- can be parsimoniously described by its sample mean (x̄), which approximates μ, and its
sample standard deviation (SD+), which approximates σ.
- A bivariate normal dataset
- forms an ellipse-shaped cloud. It can be parsimoniously described by
x̄, ȳ, SDx+, SDy+, and r.
- The ideal measurement model is
xi = μ + ei, where
the random errors ei are unbiased, homoscedastic, and normally distributed.
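As a rough illustration of the ideal measurement model, the following R sketch simulates measurements and checks that the sample mean and SD+ land close to μ and σ. The values of mu, sigma, and n are made up for this example and are not course data.

```r
# Simulate n measurements under the ideal measurement model x_i = mu + e_i.
# mu, sigma, and n are hypothetical illustration values.
set.seed(1)            # fixed seed so the run is reproducible
mu    <- 10            # hypothetical true value being measured
sigma <- 0.5           # hypothetical SD of the random errors
n     <- 1000

e <- rnorm(n, mean = 0, sd = sigma)  # unbiased, homoscedastic, normal errors
x <- mu + e                          # the measurements

xbar    <- mean(x)   # sample mean, approximates mu
sd_plus <- sd(x)     # R's sd() divides by n - 1, so this is SD+

print(c(xbar = xbar, sd_plus = sd_plus))

# Plot of x_i vs. i: under the model, the points should form a level band
# of roughly constant width around mu (no drift, no funnel shape).
plot(seq_along(x), x, xlab = "i", ylab = "x_i")
abline(h = mu)
```

A drifting band in the last plot would suggest bias, and a band that widens or narrows would suggest heteroscedastic errors.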
Persons
- Pascal, Graunt, de Moivre, Cotes, Gauss, Fisher, Tukey
Symbols
- μ: Population mean
- σ: Population standard deviation
- σ²: Population variance
- x̄: Sample mean
- SD: Sample standard deviation, divide by n.
- sx = SD+: Sample standard deviation, divide by n - 1.
- Q0: Minimum value in sample
- Q1: 1st Quartile = 25th Percentile
- Q2: 2nd Quartile = 50th Percentile
- Q3: 3rd Quartile = 75th Percentile
- Q4: Maximum value in sample
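To make the difference between SD and SD+ concrete, here is a small R sketch on a made-up sample. The data values are hypothetical. Note also that R's quantile() uses its own default interpolation rule (type = 7), which may not match the by-hand quartile rule used in class.

```r
# Small made-up sample (hypothetical values for illustration only)
x <- c(2, 4, 4, 5, 7, 9, 11)
n <- length(x)

xbar    <- mean(x)                             # sample mean
sd_n    <- sqrt(sum((x - xbar)^2) / n)         # SD:  divide by n
sd_plus <- sqrt(sum((x - xbar)^2) / (n - 1))   # SD+: divide by n - 1

# R's built-in sd() uses the n - 1 divisor, so it matches SD+, not SD
print(all.equal(sd_plus, sd(x)))

# Q0..Q4 via quantile(); the default rule (type = 7) is an assumption here
# and can give slightly different quartiles than a by-hand method.
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
names(q) <- c("Q0", "Q1", "Q2", "Q3", "Q4")
print(q)
```

SD is always a little smaller than SD+, and the gap shrinks as n grows.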
Formulas
- Interquartile Range: IQR = Q3 - Q1
- Inner fences for boxplot: Q1 - 1.5 × IQR; Q3 + 1.5 × IQR
- Outer fences for boxplot: Q1 - 3.0 × IQR; Q3 + 3.0 × IQR
- z-score for individual observations: z = (x - x̄) / SD+
- Standard error of the average: SEave = SD+ / √n
- z-score for sample average: z = (x̄ - μ) / SEave
- Ideal measurement model: xi = μ + ei
- Linear regression model: yi = axi + b + ei
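The formulas above can be tried out in R. The sketch below uses a made-up sample with one planted large value. It assumes the usual convention that mild outliers lie between the inner and outer fences and extreme outliers lie beyond the outer fences, and it uses R's default quantile() rule, which may differ from the by-hand quartiles. The value mu0 is a hypothetical μ chosen only to illustrate the z-score of an average.

```r
# Hypothetical sample; 30 is planted as a possible outlier
x <- c(12, 15, 14, 10, 13, 16, 14, 30, 11, 15)
n <- length(x)

q1  <- unname(quantile(x, 0.25))   # R's default quartile rule (type = 7)
q3  <- unname(quantile(x, 0.75))
iqr <- q3 - q1                     # IQR = Q3 - Q1

inner <- c(q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # inner fences
outer <- c(q1 - 3.0 * iqr, q3 + 3.0 * iqr)   # outer fences

# Mild outliers: outside the inner fences but not outside the outer fences
mild    <- x[(x < inner[1] | x > inner[2]) & x >= outer[1] & x <= outer[2]]
# Extreme outliers: outside the outer fences
extreme <- x[x < outer[1] | x > outer[2]]

xbar    <- mean(x)
sd_plus <- sd(x)                   # sd() divides by n - 1, i.e. SD+
z       <- (x - xbar) / sd_plus    # z-score of each observation

se_ave <- sd_plus / sqrt(n)        # standard error of the average
mu0    <- 14                       # hypothetical value of mu
z_ave  <- (xbar - mu0) / se_ave    # z-score of the sample average

print(list(iqr = iqr, inner = inner, outer = outer,
           mild = mild, extreme = extreme,
           se_ave = se_ave, z_ave = z_ave))
```

Here the planted value 30 falls beyond the upper outer fence, so it is flagged as an extreme outlier.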
Definitions
- Controlled experiment, double blind, randomization, observational
study, lurking variables (also called confounding factors),
univariate dataset, histogram, density histogram, bin, variance (=SD squared), parsimonious,
Q0, Q1, Q2, Q3, Q4, IQR, stem-plot, boxplot, normal plot, mild outliers,
extreme outliers, normal histogram, ideal measurement model, bias,
center, spread, plot of xi vs. i (unbiased, biased,
homoscedastic, heteroscedastic), standard normal curve,
critical point, inflection point, standard units, standard error of the mean,
normal score, bivariate dataset, bivariate normal, correlation,
causation.
- Know both the defining formula and the
intuitive idea behind the concept.
Know How To
- Construct a stem-plot, also called a stem-and-leaf display.
- Determine the number or percentage of observations in an
interval of a histogram (assuming the data in each bin are distributed
uniformly).
- Compute the mean, SD, SD+, Q0, Q1, Q2, Q3, Q4, IQR, by hand.
- Draw a histogram with possibly unequal class widths.
- Compute the median, Q1, Q3, IQR, and mean of a histogram.
- Estimate the proportion of observations in an interval for a
histogram.
- Find the proportion of observations in a given interval of
a normal histogram using the standard normal table.
- Write down or discuss the ideal measurement model.
- Compute the standard error of the average and a 95% confidence
interval for the true measurement in the ideal measurement model.
- Draw the boxplot and use it to detect outliers.
- Given an x-value, the sample mean x̄, and SD+,
compute the z-score.
- Use the normal table to determine the proportion of observations in a bin of
the form [a, b], (-∞, a], or [a, ∞), or (-∞, ∞)
- Obtain percentiles using a normal table: work backwards by looking up the proportion
in the body of the table to find the corresponding z-score, then use
x = μ + z × σ, if necessary.
- Use the R function pnorm to compute proportions under the normal curve.
- Use the R function qnorm to compute percentiles under the normal curve.
- Obtain normal scores using the standard normal table.
- Interpret a normal plot (normal, skewed to the left or right,
thin tails, thick tails).
- Discuss a plot of xi vs. i.
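Several of the normal-curve tasks in this list can be checked in R with pnorm and qnorm. The sketch below is only an illustration: mu, sigma, the interval endpoints, and the measurement values are all made up. The confidence interval assumes the normal multiplier qnorm(0.975), about 1.96; the course may instead use 2 or a t-based multiplier.

```r
mu    <- 100   # hypothetical center of a normal histogram
sigma <- 15    # hypothetical spread

# Proportion in [a, b]: area to the left of b minus area to the left of a
a <- 85
b <- 115
p_ab <- pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)

# Same proportion via z-scores and the standard normal curve,
# which mirrors a lookup in the standard normal table
z_a    <- (a - mu) / sigma
z_b    <- (b - mu) / sigma
p_ab_z <- pnorm(z_b) - pnorm(z_a)

# Intervals (-Inf, a] and [a, Inf)
p_left  <- pnorm(a, mean = mu, sd = sigma)
p_right <- 1 - p_left

# 90th percentile: get z from qnorm, then convert back with x = mu + z * sigma
z90 <- qnorm(0.90)
x90 <- mu + z90 * sigma
# qnorm can also do the conversion directly
x90_direct <- qnorm(0.90, mean = mu, sd = sigma)

# 95% confidence interval for the true value in the ideal measurement model
x      <- c(9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 9.7, 10.0)  # hypothetical data
se_ave <- sd(x) / sqrt(length(x))                 # SD+ / sqrt(n)
ci     <- mean(x) + c(-1, 1) * qnorm(0.975) * se_ave

print(c(p_ab = p_ab, p_left = p_left, p_right = p_right, x90 = x90))
print(ci)
```

Since a and b sit one σ on either side of μ here, p_ab comes out near the familiar 68%.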
Explain
- Be able to explain in terms that someone not familiar with
statistics will understand:
- What do the sample mean and SD tell you about a dataset? What are other ways
to estimate the center and spread of a histogram?
- What does a histogram tell you and what must you watch out for
if the bin widths are not all equal?
- What is the ideal measurement model?
- The original and current definitions of the meter, second, and kilogram.
- What is correlation and how does it relate to causation?
- Why is correlation not always the same as causation?
- What is a regression equation? What is required to have a good linear regression model?
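For the regression and correlation questions, it may help to experiment with simulated data. The following R sketch generates data from the linear regression model yi = axi + b + ei with made-up values of a, b, and the error SD, then fits a line with lm(). All numbers are hypothetical. The final check uses the standard fact that the least-squares slope equals r × SDy / SDx.

```r
# Simulate from y_i = a * x_i + b + e_i (all parameter values hypothetical)
set.seed(2)
n     <- 200
a     <- 2      # hypothetical slope
b     <- 1      # hypothetical intercept
sigma <- 1      # hypothetical SD of the random errors

x <- runif(n, min = 0, max = 10)
e <- rnorm(n, mean = 0, sd = sigma)
y <- a * x + b + e

r   <- cor(x, y)   # correlation between x and y
fit <- lm(y ~ x)   # least-squares regression of y on x
print(coef(fit))   # "(Intercept)" estimates b, "x" estimates a

# The fitted slope equals r * SDy / SDx
slope_from_r <- r * sd(y) / sd(x)
print(c(r = r, slope_from_r = slope_from_r))

# Residuals vs. x: for a good linear model these should scatter evenly
# around 0 with no curve and no change in spread.
plot(x, resid(fit), xlab = "x", ylab = "residual")
abline(h = 0)
```

A strong correlation in data like these says nothing about whether x causes y; in an observational study a lurking variable could produce the same pattern.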