Final Exam Practice Problems 1

Multiple Choice Questions

For each question, show your work or give a reason explaining your answer. 4 points for the reason, 1 point for the correct answer.

  1. Roger Cotes was the first to publish a study on
    a. how well IQ scores fit the normal distribution.
    b. the applications of probability to economics.
    c. the theory of errors in astronomy.
    d. the uses of statistics in genetics.
    Ans: c.
  2. Which of these is another name for categorical variables?
      a. Continuous    b. Nominal    c. Ordinal    d. Scale

    Ans: b. Nominal means that the values are names or labels, not numeric measurements. This is the term used in SPSS.
  3. If Q1 = 1,230 and Q3 = 5,238, using the boxplot, the observation at 11,389 is
      a. a mild outlier   b. an extreme outlier   c. below the 75th percentile   d. the median 

    Ans: a. IQR = Q3 - Q1 = 5,238 - 1,230 = 4,008. The inner fence to the right is at Q3 + 1.5 * IQR = 11,250. The outer fence to the right is at Q3 + 3.0 * IQR = 17,262. 11,389 is between the inner and outer fences, so it is a mild outlier.
  4. What percentage of IQ scores are over 160, assuming that IQ scores are normally distributed?
      a. 0.32%    b. 0.032%    c. 0.0032%    d. 0.00032%

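    Ans: c: 0.0032%. Assuming the usual IQ scale (mean 100, SD 15), z = (160 - 100) / 15 = 4, and the area under the normal curve to the right of z = 4 is about 0.0032%.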
  5. For the curve shown below, the points shown in red are
      a. asymptotes   b. critical points   c. inflection points   d. outliers

    Ans: c: inflection points.
  6. The residual plot that we use in it223 consists of
    a. residuals plotted vs. normal scores
    b. residuals plotted vs. observation number
    c. residuals plotted vs. predicted values
    d. y-values plotted vs. x-values

    Ans: c. Definition of residual plot.
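
The arithmetic behind questions 3 and 4 can be checked with a short script. This is only a sketch in Python, not part of the course materials. The function names are made up for illustration, and the IQ scale (mean 100, SD 15) is an assumption, since the question does not state it.

```python
from statistics import NormalDist


def upper_fences(q1, q3):
    """Return the upper inner and outer boxplot fences (1.5 and 3.0 IQRs above Q3)."""
    iqr = q3 - q1
    return q3 + 1.5 * iqr, q3 + 3.0 * iqr


def classify_high_value(x, q1, q3):
    """Label an observation on the high side of the boxplot."""
    inner, outer = upper_fences(q1, q3)
    if x > outer:
        return "extreme outlier"
    if x > inner:
        return "mild outlier"
    return "not an outlier"


# Question 3: Q1 = 1,230 and Q3 = 5,238, observation at 11,389
inner, outer = upper_fences(1230, 5238)
label = classify_high_value(11389, 1230, 5238)
print(inner, outer, label)

# Question 4: percentage of IQ scores over 160
# ASSUMPTION: mean 100 and SD 15 (not stated in the question)
z = (160 - 100) / 15
pct_over_160 = (1 - NormalDist().cdf(z)) * 100
print(round(pct_over_160, 4))
```

With these quartiles the fences come out at 11,250 and 17,262, so 11,389 is a mild outlier. The normal tail beyond z = 4 is about 0.0032%.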

Problems

Show all of your work. You may use a calculator.

  1. Given the following data, draw the box plot.
    Q0 = 0.001     Q1 = 0.035     Q2 = 0.057     Q3 = 0.089     Q4 = 0.311
    Additional outliers are at 0.141, 0.189, 0.217, 0.240. You decide whether they are mild or extreme.
  2. Compute the correlation of this dataset:
    x: 1    2    3    4    5
    y: 2    4    7    3    4

    Here are the z-scores:
    zx: -1.414   -0.707   0.000   0.707   1.414
    zy: -1.195   0.000   1.793   -0.598   0.000

    Ans: r = 0.2535 (the average of the products zx * zy)
  3. Here are the summary statistics for the midterm and final scores in a large class:
    average midterm score = 50; SD for midterm = 25;
    average final score = 55; SD for final = 15; r = 0.60


    Assume that the data are bivariate normal.
    1. About what percentage of students scored over 85 on the midterm?
    2. About what percentage of students obtained a score over 85 on the final?
    3. Of the students that scored 85 on the midterm, what percentage scored over 85 on the final?
    4. Of the students that scored 25 on the midterm, what percentage scored over 85 on the final?
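
The three problems above can be checked numerically. The following is a Python sketch under stated assumptions, not an official solution. It uses the 1.5 and 3.0 IQR fence rule for problem 1, the population SD (divide by n) for problem 2 because that is what reproduces zx = 1.414, and the regression method with RMS error sqrt(1 - r^2) * SD for problem 3. All variable and function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

# Problem 1: classify the high values against the upper fences
q1, q3 = 0.035, 0.089
iqr = q3 - q1
inner, outer = q3 + 1.5 * iqr, q3 + 3.0 * iqr


def classify(value):
    """Label a high-side value using the inner and outer fences."""
    if value > outer:
        return "extreme"
    if value > inner:
        return "mild"
    return "inside fences"


labels = {v: classify(v) for v in (0.141, 0.189, 0.217, 0.240, 0.311)}
print(round(inner, 3), round(outer, 3), labels)

# Problem 2: correlation as the average product of z-scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 7, 3, 4]


def z_scores(values):
    """Standardize with the population SD (ASSUMPTION: matches zx in the problem)."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]


zx, zy = z_scores(x), z_scores(y)
r = fmean([a * b for a, b in zip(zx, zy)])
print([round(v, 3) for v in zy], round(r, 4))

# Problem 3: normal and bivariate normal percentages
mid_mean, mid_sd = 50, 25
fin_mean, fin_sd = 55, 15
r_exam = 0.60
std_normal = NormalDist()


def pct_above(z):
    """Percentage of the standard normal curve to the right of z."""
    return (1 - std_normal.cdf(z)) * 100


def pct_final_over_85_given(midterm):
    """Percentage scoring over 85 on the final among students with this midterm score."""
    predicted = fin_mean + r_exam * fin_sd * (midterm - mid_mean) / mid_sd
    rms_error = sqrt(1 - r_exam ** 2) * fin_sd
    return pct_above((85 - predicted) / rms_error)


pct_mid_over_85 = pct_above((85 - mid_mean) / mid_sd)
pct_fin_over_85 = pct_above((85 - fin_mean) / fin_sd)
print(round(pct_mid_over_85, 2), round(pct_fin_over_85, 2))
print(round(pct_final_over_85_given(85), 2), round(pct_final_over_85_given(25), 4))
```

Under the 1.5 IQR rule the upper inner fence is 0.170 and the outer fence is 0.251. That makes 0.189, 0.217 and 0.240 mild and 0.311 extreme, while 0.141 falls inside the inner fence. For problem 3, the z-scores are 1.4 and 2.0 for parts 1 and 2, and 1.45 and 3.25 for parts 3 and 4.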

R Analysis

Perform the following analyses with R. Save your output file as a Word .doc file. Type any interpretation of the output into the output file itself.

  1. Read the data file tv-gpa.txt into R.
  2. Determine the following for Hours and HsGpa:
    Q0    Q1    Q2    Q3    Q4    mean    SD+
    Ans: Using Tukey's Hinges for Percentiles
    Hours:  Q0=2  Q1=5  Q2=9  Q3=14  Q4=14  mean=9.71  SD+=5.425
    HsGpa:  Q0=1.9  Q1=2.5  Q2=2.9  Q3=3.3  Q4=3.7  mean=2.871  SD+=0.5130
    1. Create the boxplot for Hours and HsGpa.  Ans: Draw the boxplot by hand or use R.
    2. Determine the correlation between Hours and HsGpa. Ans: -0.0626
    3. Which is the independent variable? Which is the dependent variable? Ans: Hours, HsGpa
    4. Compute and interpret the r-squared value. Ans: 0.00392. 
      R2 is the proportion of variation in the dependent variable that can be attributed to the independent variable.
    5. Find the regression equation for predicting the dependent variable from the independent variable.
      Ans: HsGpa = -0.006 * Hours + 2.931
    6. What is the predicted high school GPA for someone who watches TV 40 hours per week? Why is this prediction not likely to be very accurate? Ans: 2.691. Because R2 is very small, Hours explains almost none of the variation in HsGpa.
    7. Create and interpret the residual plot. Use R.
    8. Create and interpret the normal plot of the residuals. Use R.
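
The reported answers can be cross-checked against each other with a few lines of arithmetic. This is a Python sketch for checking only, not the R workflow the assignment asks for. It takes the slope, intercept and r-squared from the answers above as given, and the function name is made up for illustration.

```python
from math import sqrt

# Values copied from the answers above
slope, intercept = -0.006, 2.931
r_squared = 0.00392


def predict_hs_gpa(hours):
    """Predicted HsGpa from the reported regression equation."""
    return slope * hours + intercept


# Prediction for 40 hours of TV per week
predicted_gpa = predict_hs_gpa(40)
print(round(predicted_gpa, 3))

# The correlation has the same sign as the slope, so r = -sqrt(r_squared)
r = -sqrt(r_squared)
print(round(r, 4))

# Share of the variation in HsGpa attributed to Hours, as a percentage
print(round(r_squared * 100, 3))
```

The prediction comes out at 2.691, and the correlation implied by r-squared = 0.00392 is about -0.0626. Hours therefore accounts for under half a percent of the variation in HsGpa.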